Main Research Interests
Originally a theoretical linguist, I switched to more computational tasks during my Ph.D. studies at ÚFAL.
- Corpus search engine. Currently, I am working on the KonText project within LINDAT/CLARIN. My task is to make all the corpora (monolingual, parallel, speech) from the LINDAT repository available via the KonText user interface. One of the challenges is to fit linguistic annotation, such as syntax or discourse, into a linear representation.
- Machine Translation. In my Ph.D. research, I investigated Machine Translation (both Rule-Based and Statistical) between two closely related Slavic languages, Czech and Russian. I worked with the TectoMT system (RBMT for Czech and Russian) and the Moses SMT system, and compared the output of the two systems from a linguist's point of view. The principal question for me was whether the relatedness of languages helps in MT.
- Surface valency. I am especially interested in the differences between Czech and Russian surface valency frames (e.g. cases like rušit+Gen vs. мешать+Dat).
- Multiword Expressions. I participated in the Lexemann project as an annotator. I am now a member of a working group of PARSEME, a COST Action.