Rudolf Rosa

office: N 235
office hours: My office is N 235 "octopus" in Troja, building N "Impakt".
Please contact me by email to arrange a meeting.
Also see my calendar in the other tab for my (un)availability.
email: rosa@ufal.mff.cuni.cz
address: IMPAKT – „N“
V Holešovičkách 747/2
180 00 Praha 8
Czech Republic

Main Research Interests

Natural language generation, generative art

Automatically or semi-automatically generating literary texts, such as theatre scripts, short stories, poems...

Plně či částečně automatické generování literárních textů, jako jsou divadelní scénáře, povídky, básně...

Robopsychologist

Looking for linguistic structures in Deep Neural Networks

Hledání jazykových struktur v hlubokých neuronových sítích

Popularization of science

Computational linguistics, text generation, language models, theatre script generation, life with artificial intelligence...

Lectures, interactive seminars, workshops, live system demonstrations, consultations...

For schools, for the public, for seniors, for businesses, for specialists...

Popularizace vědy

Počítačová lingvistika, generování textu, jazykové modely, generování scénářů divadelních her, život s umělou inteligencí...

Přednášky, interaktivní semináře, workshopy, živé ukázky systémů, konzultace...

Pro školy, pro veřejnost, pro seniory, pro firmy, pro specialisty...

In the past:

Automatic post-editing of machine translation

Morphology, derivations

Dependency parsing

Unsupervised and semi-supervised methods, especially cross-lingual and multilingual

Projects

EduPo: Generování české poezie v edukačním a multimediálním prostředí (Generating Czech poetry in an educative and multimedia environment)

In the EduPo project, we focus on automated generation and analysis of Czech poetry, with the goal of building an interactive educational application for teaching poetry.

V projektu EduPo se věnujeme automatickému generování a analýze poezie, s cílem vyvinout interaktivní vzdělávací aplikaci pro výuku poezie.

AI: Authorship and Interpretation (AI: autorství a interpretace)

The AIAI project is a theoretical research project, looking for new ways of viewing authorship and interpretation of works (co-)created with artificial intelligence tools.

Projekt teoretického výzkumu AIAI hledá odpovědi na otázky autorství a interpretace děl (spolu)vytvořených nástroji umělé inteligence.

Interdisciplinary research on theology and technology (interdisciplinární výzkum propojující teologii a technologie)

As an external member of the Theology and Contemporary Culture Research Group (TCC RG) at ETF UK, I take part in research connecting theology and technology (especially artificial intelligence), within the project The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature and subsequently the Interdisciplinary institute for theology and new technologies in Prague.

AI v kontextu (AI in Context)

We are a multidisciplinary working group of experts studying and using AI in various contexts. We also organize lectures and discussions. Most of what we do is presented in Czech only.

Our activities include:

We organize a series of invited lectures and follow-up discussion seminars called AI v kontextu (NAIL127)

We run the exhibit Život s umělou inteligencí (Life with Artificial Intelligence) at Didaktikon, the educational centre of Charles University

At Didaktikon we also offer lectures and workshops on artificial intelligence for primary and secondary school pupils, as part of DVPP teacher training, and in the form of a University of the Third Age

For Charles University, we oversee the online introductory course on artificial intelligence, Elements of AI+

As expert guarantors, we take part in the Aignos workshops for primary and secondary schools, Tvoříme s umělou inteligencí (Creating with Artificial Intelligence)

For an overview of our activities, follow our website aivk.cz

Past

THEAITRE: Umělá inteligence autorem divadelní hry? (automated generation of theatre play scripts). In cooperation with Švandovo theatre, DAMU, and Tomáš Studeník, we created THEaiTRobot, a system for automatic generation of theatre play scripts. Within the project, we managed to create and put on stage the first full-length (60 minutes) theatre play, AI: Když robot píše hru (AI: When a Robot Writes a Play), which has 90% of the script generated automatically.

Linguistic Structure Representation in Neural Networks (LSD). I worked on a GAČR grant of David Mareček, called LSD, where we were trying to look at what linguistic structures can be found hidden inside of neural networks. We published a book at the end of the grant. David Mareček's group still continues with research in this direction, especially Tomasz Limisiewicz.

Unsupervised morphology induction. Together with Zdeněk Žabokrtský, we were trying to handle morphology in an unsupervised way, e.g. to find lemmas for word forms, to separate derivation from inflection, etc. I am no longer active there, but Zdeněk eventually built the DeriNet team that focuses on derivational morphology.

Cross-lingual Syntactic Parsing, i.e. training a parser on one language and applying it to another language. This was my dissertation, and I also had a GAUK grant for that.

Pohádkové dítě / Fairytale Child chatbot. A simple console chatbot that wants to hear a fairy tale from you! / Jednoduchý konzolový chatbot, který si od vás chce nechat vyprávět pohádku!

HimL focused on semantically sane translation of medical texts from English to Czech, German, Romanian and Polish.

QTLeap was a project aimed at significantly improving the quality of machine translation using deep language processing approaches (also see TectoMT).

I was a member of the HamleDT group, a project for harmonizing dependency treebanks for various languages, which later evolved and merged into the Universal Dependencies project (of which I am also an official member).

Depfix is a system for automatic post-editing of machine translation outputs. It was developed as a part of the Faust project. It was later succeeded by MLFix, by Dušan Variš.

MSTperl is a reimplementation of the Maximum Spanning Tree dependency parser (McDonald et al., 2005) in Perl. It is tuned for Czech and has several advanced features that are useful for parsing machine-translated sentences in Depfix. It also has some features for delexicalized parser transfer. I do not use it anymore; I switched to Parsito and UDPipe.

Curriculum Vitae

You can download my CV in English.

Můžete si stáhnout můj životopis v češtině.

Teaching

List of classes
NAIL127 AI v kontextu
NAIL130 Elements of AI+
NPFL092 NLP Technology
NPFL118 Natural language processing on computational cluster
NPFL120 Multilingual Natural Language Processing
NPFL125 Introduction to Language Technologies
NPFL140 Large Language Models
NPRG045 Ročníkový projekt

Selected Bibliography

Google Scholar
ORCID: 0000-0003-4908-6127
Scopus ID: 55345284700
Researcher ID: D-4427-2017
All of my publications and talks can be found in Biblio, but the tool is currently somewhat broken, so you may not want to use it.

You can use Google Scholar or Semantic Scholar, and I also have here an automated static listing of my publications.

Students

I am happy to supervise NLP projects (bachelor theses, master theses, etc.); have a look at Project Ideas.
Warning: Reading scientific literature is my weak point, so it will be mostly your responsibility to review existing literature relevant to the topic!

Rád povedu projekty v oblasti zpracování přirozeného jazyka (Bc. a Mgr. práce apod.), mrkněte na Náměty na projekty.
Varování: Čtení odborné literatury není mou silnou stránkou, takže rešerše relevantních článků budou především Vaší zodpovědností!

Bachelor students

Yuliya Yamalutdinova: Detection of contradictions in pairs of texts in Kazakh (Detekce kontradikce mezi dvěma texty v kazaštině) — defended 2019

Zuzana Svobodová: Generating text descriptions of journeys in a map (Generování textového popisu trasy v mapě) — defended 2020

Jan Matějka: Generator of computer descriptions (Generátor popisků počítačových sestav a notebooků) — defended 2020

Lukáš Chaloupský: Automatic generation of images and their usage as training data (Automatické generování obrázků a jejich využití jako trénovacích dat) — defended 2020

Ondřej Michálek: Biblical paraphrasing (Biblické parafrázování) — defended 2020

František Trebuňa: Generating text from structured data (Generování textu ze strukturovaných dat) — defended 2021

Peter Grajcar: Generating a drawing according to a textual description (Generování kresby dle slovního popisu) — defended 2021

Daniela Jurášová: Automatické generovanie hrebeňoviek (Automatic generation of crosswords) — defended 2021

Zuzana Urbanová: Quote Attribution and Character Networks in Novels (Přiřazování mluvčích a vztahy mezi postavami v knihách) — defended 2021

Dominik Prokop: Generování výsledků tenisových dvouher (Generation of tennis singles results) — defended 2022

Viktor Bujko: Extrakcia informácií z reportov o leteckých incidentoch (Information extraction from aviation incident reports) — defended 2022

Tomáš Sourada: Automatic inflection in Czech language — defended 2023, presented at LREC 2024 as OOVs in the Spotlight: How to Inflect them?, 3rd place at SVOČ 2024 competition

Jan Pavelka: Object layout in a 2D room based on text description — defended 2024

Barbora Štěpánková: Generation of Czech Lyrics to Cover Songs — defended 2024, presented at NLP4DH 2025 as Song Lyrics Adaptations: Computational Interpretation of the Pentathlon Principle

X Y: Automatic Identification of Poetic Forms (in progress)

Master students

Abhishek Agrawal: Eye-tracking features in syntactic parsing (Rysy z eye-trackeru v syntaktickém parsingu) — defended 2020 (paper on Lantern 2020)

Lukáš Chaloupský: Automatic generation of medical reports from chest X-rays in Czech — defended 2022

Goutham Venkatesh: Modelling character personalities within THEaiTRE project (research project) — defended 2023

Rishu Kumar: Summarization of theatre scripts within THEaiTRE project (research project)

Michal Chudoba: Generation of Czech poetic strophes and their evaluation — defended 2024, published on arXiv as GPT Czech Poet: Generation of Czech Poetic Strophes with Language Models

Antonia Claésia Da Costa Souza: Multilingual multidomain generation of school tests that are hard to solve automatically — defended 2025

Jose Emilio Maldonado Rodríguez: Automatic Detection of Creativity in Translation — defended 2026

X Y: Metaphor detection in both prose and poetry for the EduPo project (research project, in progress)

Interns

Tomasz Limisiewicz: Analyzing syntactic features of BERT self-attentions — completed in 2019 (paper in findings of EMNLP 2020)

List of all defended theses supervised by me

Other

I was one of the main organizers of the Slovakoczech NLP workshop for students and early-stage researchers -- see SloNLP 2015, SloNLP 2016, SloNLP 2017, SloNLP 2018, SloNLP 2019.

My Erdős number is 4 (me - Jaroslava Hlaváčová - Petr Savický - Zsolt Tuza - Paul Erdős)

My game index is 73 (current as of Navíc 2021); our puzzle-hunt team is called Divize nulou.

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
