Daniel Zeman – Curriculum Vitae


Citizenship: Czechia, European Union.
Male, married, three children.

Education


2005, Univerzita Karlova (Charles University), Praha
Obtained the RNDr. title.
January 13, 2005, Univerzita Karlova (Charles University), Praha
Obtained a Ph.D. in mathematical linguistics.
1999, May-July: University of Pennsylvania, Philadelphia (Pennsylvania, USA)
Visiting scholar at the Institute for Research in Cognitive Science. Invited by Aravind Joshi to work with Anoop Sarkar on the automatic acquisition of subcategorization frames from the Prague Dependency Treebank.
1998, July-August: Johns Hopkins University, Baltimore (Maryland, USA)
Participated in the summer workshop "Core NLP Technology Applicable to Multiple Languages" at the Center for Language and Speech Processing.
1997 to 2005: Univerzita Karlova (Charles University), Praha
Graduate student in mathematical linguistics at the Faculty of Mathematics and Physics. Ph.D. thesis topic: Parsing with a Statistical Dependency Model. Special interest in syntax.
1990 to 1997: Univerzita Karlova (Charles University), Praha
Undergraduate studies in Computer Science at the Faculty of Mathematics and Physics. Regular coursework ended with the fifth year on October 10, 1995. Thesis and final examination in the field of Computational and Formal Linguistics; the examination was passed in June 1997. Obtained the title “magistr” (“Mgr.”, equivalent to an MSc).
1986 to 1990: Akademické gymnázium (Academic Grammar School), Praha
Regular study with a specialization in programming. Finished in 1990 with the school-leaving examination in Mathematics, Programming, Czech, and German.

Teaching


since 2010: Morfologická a syntaktická analýza (Morphological and Syntactic Analysis)
lecturer, Faculty of Mathematics and Physics, Charles University
since 2000: Počítače a přirozený jazyk (Computers and Natural Language)
lecturer, Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University
2013: Lingvistické softwarové nástroje (Linguistic Software Tools)
lecturer, Faculty of Arts, Palacký University in Olomouc
2001-2002: Programování (Programming)
workshop leader, Faculty of Mathematics and Physics, Charles University
1999-2010: Počítačové zpracování češtiny (Automatic Processing of Czech); renamed in winter 2003/2004 to Počítačové zpracování přirozeného jazyka (Automatic Processing of Natural Language)
lecturer and workshop leader, Faculty of Mathematics and Physics, Charles University

Supervised master's theses

  • Bushra Jawaid, English-to-Urdu machine translation, defended 2010
  • Pranava Swaroop Madhyastha, higher order dependency parsing, defended 2011
  • Angelina Ivanova, acquisition of bilingual dictionaries from Wikipedia, defended 2011
  • Ke Tran, unsupervised morphemic segmentation, defended 2012
  • Joachim Daiber, parsing of Twitter data, defended 2013
  • Sibel Ciddi, processing of Turkish, defended 2014

Supervised bachelor's theses

  • David Mareček (law amendment tool), Martin Žember (spam detection), Ondřej Hálek (Czech Technical University, machine translation and named entities)

Professional positions

since 2000: Univerzita Karlova, Praha
Researcher, Center for Computational Linguistics, since 2004 Institute of Formal and Applied Linguistics. Research interests: statistical parsing of Czech, dependency modeling, morphological analysis, machine translation, resource-poor languages.
2006: University of Maryland, College Park
Awarded a Fulbright-Masaryk Fellowship (January to July), followed by a postdoc position (July to December). I worked with Philip Resnik at the University of Maryland Institute for Advanced Computer Studies, Computational Linguistics & Information Processing.
1995 to 1999: Olt s.r.o., Praha
After finishing the regular MSc. studies at Charles University, I began working with the Prague software firm Olt s.r.o. I developed parts of their programs for Windows NT, e.g. a built-in text editor (in C++).
1994: SSaG s.r.o., Praha
During my studies, from April to November 1994, I worked as a programmer.

Grant projects (principal investigator or co-PI)

  • Morphologically and Syntactically Annotated Corpora of Many Languages (MANYLA) (GAČR, 2015-2017)
  • Czech in the Machine Translation Era (CZECHMATE) (GAČR, 2011-2013)
  • MUSSLAP (sign language processing) (GAAV, 2004-2008)



Since 1999: regular reviewer of submissions to international conferences, workshops and journals.

2014-2016: member of the scientific council of the Czech National Corpus project.

Languages


  • Czech (native)
  • English (fluent)
  • German, Russian (sufficient knowledge for communication, although not fluent)
  • Spanish, French (basic knowledge)
  • Able to understand Slovak (near native) and Polish (a little)

Programming environments

Programming languages:
Perl, Java, C++, Visual Basic
Operating systems:
Windows, Linux

Other interests

travel, geocaching, alpine tourism, canoeing