Milan Straka

office: Room 420
office hours: Monday 9-17, Tuesday 10-16
email: straka@ufal.mff.cuni.cz
phone: (+420) 95155 4361
address: Malostranské náměstí 25, 118 00 Praha 1, Czech Republic

Main Research Interests

  • Machine Learning
    • Artificial Neural Networks
    • Deep Learning
    • Structured Prediction
    • Bayesian Nonparametric Modelling and Unsupervised Learning
  • NLP Tools
    • POS Tagging
    • Dependency Parsing
    • Named Entity Recognition and Linking

Projects

Curriculum Vitae

Teaching

Selected Bibliography

Papers

  • Milan Straka, Jana Straková and Jan Hajič: Prague at EPE 2017: The UDPipe System. In Proceedings of the 2017 Shared Task on Extrinsic Parser Evaluation at the Fourth International Conference on Dependency Linguistics and the 15th International Conference on Parsing Technologies, Pisa, Italy, September 2017.
  • Milan Straka and Jana Straková. Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Vancouver, Canada, August 2017.
  • Daniel Zeman, Martin Popel, Milan Straka, Jan Hajič, Joakim Nivre, Filip Ginter, Juhani Luotolahti, Sampo Pyysalo, Slav Petrov, Martin Potthast, Francis Tyers, Elena Badmaeva, Memduh Gokirmak, Anna Nedoluzhko, Silvie Cinková, Jan Hajič jr., Jaroslava Hlaváčová, Václava Kettnerová, Zdeňka Urešová, Jenna Kanerva, Stina Ojala, Anna Missilä, Christopher D. Manning, Sebastian Schuster, Siva Reddy, Dima Taji, Nizar Habash, Herman Leung, Marie-Catherine de Marneffe, Manuela Sanguinetti, Maria Simi, Hiroshi Kanayama, Valeria de Paiva, Kira Droganova, Héctor Martínez Alonso, Çağrı Çöltekin, Umut Sulubacak, Hans Uszkoreit, Vivien Macketanz, Aljoscha Burchardt, Kim Harris, Katrin Marheinecke, Georg Rehm, Tolga Kayadelen, Mohammed Attia, Ali Elkahky, Zhuoran Yu, Emily Pitler, Saran Lertpradit, Michael Mandl, Jesse Kirchner, Hector Fernandez Alcalde, Jana Strnadová, Esha Banerjee, Ruli Manurung, Antonio Stella, Atsuko Shimada, Sookyoung Kwak, Gustavo Mendonca, Tatiana Lando, Rattima Nitisaroj and Josie Li. CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, Vancouver, Canada, August 2017.
  • Natalia Klyueva, Antoine Doucet and Milan Straka. Neural Networks for Multi-Word Expression Detection. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017, Workshop on Multiword Expressions), Valencia, Spain, April 2017.
  • Jana Straková, Milan Straka, Magda Ševčíková, and Zdeněk Žabokrtský. Czech Named Entity Corpus. In The Handbook of Linguistic Annotation, editors Nancy Ide and James Pustejovsky, ISBN 9402408797, Springer, Netherlands, 2017.
  • Magda Ševčíková, Zdeněk Žabokrtský, Jonáš Vidra and Milan Straka. Lexikální síť DeriNet: elektronický zdroj pro výzkum derivace v češtině. In Časopis pro moderní filologii, Vol. 98, No. 1, Charles University, Prague, Czech Republic, ISSN 0008-7386, pp. 62-76, Sep 2016.
  • Jana Straková, Milan Straka and Jan Hajič. Neural Networks for Featureless Named Entity Recognition in Czech. In Proceedings of the 19th International Conference on Text, Speech and Dialogue (TSD 2016), Brno, Czech Republic, September 2016.
  • Milan Straka, Jan Hajič and Jana Straková. UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May 2016.
  • Zdeněk Žabokrtský, Magda Ševčíková, Milan Straka, Jonáš Vidra and Adéla Limburská. Merging Data Resources for Inflectional and Derivational Morphology in Czech. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May 2016.
  • Milan Straka, Jan Hajič, Jana Straková and Jan Hajič jr. Parsing Universal Dependency Treebanks using Neural Networks and Search-Based Oracle. In Proceedings of the Fourteenth International Workshop on Treebanks and Linguistic Theories (TLT 14), December 2015.
  • Jana Straková, Milan Straka and Jan Hajič: Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 13-18, Baltimore, Maryland, June 2014. Association for Computational Linguistics.
  • Jana Straková, Milan Straka and Jan Hajič: A New State-Of-The-Art Czech Named Entity Recognizer. In TSD 2013, Text, Speech and Dialogue, Pilsen, Czech Republic, 2013.
  • David Mareček and Milan Straka: Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing. In ACL 2013, Annual Meeting of the Association for Computational Linguistics, Sofia, Bulgaria, August 2013.
  • Milan Straka: Adams’ Trees Revisited – Correct and Efficient Implementation. In Proceedings of TFP 2011, Symposium on Trends in Functional Programming, Madrid, Spain, May 2011.
  • Milan Straka: The performance of the Haskell containers package. In Proceedings of Haskell 2010, 3rd ACM Haskell symposium on Haskell, Baltimore, Maryland, September 2010.
  • Milan Straka: Optimal worst-case fully persistent arrays. In TFP 2009, Symposium on Trends in Functional Programming, Komarno, Slovakia, June 2009.
  • Martin Mareš and Milan Straka: Linear-Time Ranking of Permutations. In Proceedings of ESA 2007, 15th Annual European Symposium, Eilat, Israel, October 2007.

Theses

ORCID