Jana Straková Jana Kravalová

Main Research Interests

  • deep learning
  • multilingual text processing: POS tagging, lemmatization, NER, dependency parsing, semantic parsing

Projects

Curriculum Vitae

Short CV

Selected Bibliography

ORCID

Papers

  1. Jana Straková, Eva Fučíková, Jan Hajič, Zdeňka Urešová (2023): Extending an Event-type Ontology: Adding Verbs and Classes using Fine-tuned LLMs Suggestions. In: Proceedings of the 17th Linguistic Annotation Workshop, pp. 85-95, Association for Computational Linguistics, Stroudsburg, PA, USA (url, bibtex)
  2. Jakub Náplava, Milan Straka, Jana Straková, Alexandr Rosen (2022): Czech Grammar Error Correction with a Large and Diverse Corpus. In: Transactions of the Association for Computational Linguistics, ISSN 2307-387X, 10, pp. 452-467 (url, local PDF, bibtex)
  3. Milan Straka, Jana Straková (2022): ÚFAL CorPipe at CRAC 2022: Effectivity of Multilingual Models for Coreference Resolution. In: Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution, pp. 28-37, Association for Computational Linguistics, Gyeongju, Korea (url, local PDF, bibtex)
  4. Jakub Náplava, Martin Popel, Milan Straka, Jana Straková (2021): Understanding Model Robustness to User-generated Noisy Texts. In: Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021), pp. 340-350, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-90-9 (url, local PDF, bibtex)
  5. Jakub Náplava, Milan Straka, Jana Straková (2021): Diacritics Restoration using BERT with Analysis on Czech language. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 116, pp. 27-42 (pdf, local PDF, bibtex)
  6. Milan Straka, Jakub Náplava, Jana Straková (2021): Character Transformations for Non-Autoregressive GEC Tagging. In: Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021), pp. 417-422, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-90-9 (url, local PDF, bibtex)
  7. Milan Straka, Jakub Náplava, Jana Straková, David Samuel (2021): RobeCzech: Czech RoBERTa, a Monolingual Contextualized Language Representation Model. In: 24th International Conference on Text, Speech and Dialogue, pp. 197-209, Springer, Cham, Switzerland, ISBN 978-3-030-83526-2 (url, local PDF, bibtex)
  8. Milan Straka, Jana Straková (2020): UDPipe at EvaLatin 2020: Contextualized Embeddings and Treebank Embeddings. In: Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, pp. 124-129, European Language Resources Association (ELRA), Marseille, France, ISBN 979-10-95546-53-5 (url, local PDF, bibtex)
  9. Milan Straka, Jana Straková (2019): ÚFAL MRPipe at MRP 2019: UDPipe Goes Semantic in the Meaning Representation Parsing Shared Task. In: Proceedings of the CoNLL 2019 Shared Task: Cross-Framework Meaning Representation Parsing, pp. 127-137, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-60-4 (url, local PDF, bibtex)
  10. Milan Straka, Jana Straková, Jan Hajič (2019): Czech Text Processing with Contextual Embeddings: POS Tagging, Lemmatization, Parsing and NER. In: Proceedings of the 22nd International Conference on Text, Speech and Dialogue - TSD 2019, Lecture Notes in Computer Science, ISSN 0302-9743, 11697, pp. 137-150, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-030-27946-2 (url, local PDF, bibtex)
  11. Milan Straka, Jana Straková, Jan Hajič (2019): Evaluating Contextualized Embeddings on 54 Languages in POS Tagging, Lemmatization and Dependency Parsing (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, 1904.02099 (url, local PDF)
  12. Milan Straka, Jana Straková, Jan Hajič (2019): UDPipe at SIGMORPHON 2019: Contextualized Embeddings, Regularization with Morphological Categories, Corpora Merging. In: Proceedings of the 16th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pp. 95-103, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-36-9 (pdf, local PDF, bibtex)
  13. Jana Straková, Milan Straka, Jan Hajič (2019): Neural Architectures for Nested NER through Linearization. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5326-5331, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-48-2 (pdf, local PDF, bibtex)
  14. Jana Straková, Milan Straka, Jan Hajič, Martin Popel (2019): Hluboké učení v automatické analýze českého textu. In: Slovo a slovesnost, ISSN 0037-7031, vol. 80, no. 4, pp. 306-327 (bibtex)
  15. Milan Straka, Jana Straková (2017): Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 88-99, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-70-8 (pdf, local PDF, bibtex)
  16. Milan Straka, Jana Straková, Jan Hajič (2017): Prague at EPE 2017: The UDPipe System. In: Proceedings of the 2017 Shared Task on Extrinsic Parser Evaluation at the Fourth International Conference on Dependency Linguistics and the 15th International Conference on Parsing Technologies, pp. 65-74, Association for Computational Linguistics (ACL), Stroudsburg, PA, USA, ISBN 978-1-945626-74-6 (pdf, local PDF, bibtex)
  17. Jana Straková (2017): Neural Network Based Named Entity Recognition (PhD thesis). (pdf, local PDF, local PDF, bibtex)
  18. Jana Straková, Milan Straka, Magda Ševčíková, Zdeněk Žabokrtský (2017): Czech Named Entity Corpus. In: Handbook of Linguistic Annotation, pp. 855-873, Springer Netherlands, Netherlands, ISBN 978-94-024-0879-9 (bibtex)
  19. Gabriel Altman, Jan Andres, Johan van der Auwera, Jarmila Bachmannová, Jan Balhar, Aleš Bičan, Lenka Bičanová, Jana Bílková, Petr Biskup, Ondřej Bláha, Izabela Blaszczyk, Ondřej Bojar, Tomáš Bořil, Máša Bořkovcová, Ivana Bozděchová, Pavel Caha, Václav Cvrček, Radek Čech, Marie Čechová, František Čermák, David S. Danaher, František Daneš, Jaroslav David, Mojmír Dočekal, Jakub Dotlačil, Vít Dovalil, Věra Dvořák, Eva Eckertová, Viktor Elšík, Joseph Emonds, Adolf Erhart, François Esvan, Dan Faltýnek, Masako Fidler, Alena Andrlová Fidlerová, Zbyněk Fišer, Eva Flanderková, Mirjam Fried, Markus Giger, Miroslav Grepl, Jan Hajič, Eva Hajičová, Ernst Hansack, Björn Hansen, Radoslav Harman, Milan Harvalík, Martin Havlík, Eva Havlová, Elke Hentschel, Milada Hirschová, Zdeňka Hladká, Jana Hoffmannová, Jiří Homoláč, Milada Homolková, Tomáš Hoskovec, Jan Hric, Jaroslav Hubáček, Jan Chloupek, Leonid L. Iomdin, Pavel Ircing, Laura Janda, Ilona Janyšková, Milan Jelínek, Tomáš Jelínek, Lucie Jílková, Filip Jurčíček, Michal Jurka, Petr Karlík, Petr Karlík mladší, Helena Karlíková, Stanislava Kloferová, Martina Kloudová, Miroslava Knappová, Robert Kolár, Ivana Kolářová, Marie Kopřivová, Jan Kořenský, Pavel Kosek, Peter Kosta, Michaela Koščová, Jiří Koten, Ondřej Koupil, Michal Kovář, Michala Králíková, Marie Krappmann, Jiří Kraus, Marie Krčmová, Susan Kresin, Michal Křen, Michal Křístek, Pavel Kubaník, Miroslav Kubát, Tomáš Kubík, Vladislav Kuboň, Ivona Kučerová, Natalia Levshina, Alena Macurová, Ján Mačutek, Jarosław Malicki, Petr Mareš, Olga Martincová, Jiří Marvan, Jindřich Matoušek, Barbara Mertins, Roland Meyer, Krzysztof Migdalski, Eva Minářová, Kamila Mrázková, Iveta Mrázová, Richard Müller, Olga Müllerová, Mira Nábělková, Olga Navrátilová, Iva Nebeská, Anna Nedoluzhko, Marek Nekula, Zuzana Nevěřilová, Stefan Michael Newerkla, Mark Newson, Pavel Novák, Renata Novotná, Norbert Nübler, Radek Ocelák, Karel Oliva, Ivo Osolsobě, Klára Osolsobě, Ludmila Pacnerová, Karel Pala, Zdena Palková, Jarmila Panevová, Pavel Pecina, Jaroslav Peregrin, Anna Maria Perissutti, Ondřej Pešek, Vladimír Petkevič, Petr Plecháč, Jana Pleskalová, Jan Radimský, Paul Rastall, Alexandr Rosen, Zdenka Rusínová, Lucie Saicová Římalová, Tamah Sherman, Tobias Scheer, Boris Skalka, Radek Skarnitzl, Marián Sloboda, Olga Stehlíková, Hana Strachoňová, Jana Straková, Roman Sukač, Zbyněk Sviták, Aleš Svoboda, Josef Syka, Ondřej Šefčík, Radek Šimík, Hana Gruet Škrabalová, Dušan Šlosar, Rudolf Šrámek, Jan Štěpán, František Štícha, Michaela Tabakovičová, Knut Tarald Taraldsen, Lucie Taraldsen Medová, Jiří Trávníček, Vladimír Trpka, Jana Marie Tušková, Ludmila Uhlířová, Lenka Uličná, Oldřich Uličný, Jana Valdrová, Irena Vaňková, Ivo Vasiljev, Radoslav Večerka, Jarmil Vepřek, Ljuba Veselinova, Kateřina Veselovská, Ludmila Veselovská, Jan Volín, Taťána Vykypělová, Roland Wagner, James Wilson, Uliana Yazhinova, Daniel Zeman, Jiří Zeman, Šárka Zikánová, Markéta Ziková, Petr Zima, Ilse Zimmermann, Zdeněk Žabokrtský, Stanislav Žaža (2016): Nový encyklopedický slovník češtiny. ISBN 978-80-7422-480-5 (url, bibtex)
  20. Milan Straka, Jan Hajič, Jana Straková (2016): UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pp. 4290-4297, European Language Resources Association, Paris, France, ISBN 978-2-9517408-9-1 (pdf, local PDF, bibtex)
  21. Jana Straková, Milan Straka, Jan Hajič (2016): Neural Networks for Featureless Named Entity Recognition in Czech. In: Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Lecture Notes in Computer Science, ISSN 0302-9743, 9924, pp. 173-181, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-319-45509-9 (url, local PDF, bibtex)
  22. Milan Straka, Jan Hajič, Jana Straková, Jan Hajič, jr. (2015): Parsing Universal Dependency Treebanks using Neural Networks and Search-Based Oracle. In: 14th International Workshop on Treebanks and Linguistic Theories (TLT 2015), pp. 208-220, IPIPAN, Warszawa, Poland, ISBN 978-83-63159-18-4 (pdf, local PDF, bibtex)
  23. Jana Straková, Milan Straka, Jan Hajič (2014): Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 13-18, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-00-6 (pdf, local PDF, bibtex)
  24. Jana Straková, Milan Straka, Jan Hajič (2013): A New State-of-The-Art Czech Named Entity Recognizer. In: Text, Speech and Dialogue: 16th International Conference, TSD 2013. Proceedings, Lecture Notes in Computer Science, ISSN 0302-9743, 8082, pp. 68-75, Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6 (url, local PDF, bibtex)
  25. Albert Kim, Jana Straková (2012): Concurrent effects of lexical status and letter-rotation during early stages of visual word recognition: evidence from ERPs. In: Brain Research, ISSN 0006-8993, 1468, pp. 52-62 (bibtex)
  26. Jana Straková (2010): When Informatics Meets Neuroscience: Software and Statistics for Human Brain Imaging. In: WDS 2010 Proceedings of Contributed Papers, pp. 94-96, Matfyzpress, Charles University, Praha, Czechia, ISBN 978-80-7378-139-2 (local PDF, bibtex)
  27. Jana Straková, Pavel Pecina (2010): Czech Information Retrieval with Syntax-based Language Models. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 1359-1362, European Language Resources Association, Valletta, Malta, ISBN 2-9517408-6-7 (pdf, local PDF, bibtex)
  28. Eneko Agirre, Enrique Alfonseca, Keith Brendan Hall, Jana Kravalová, Marius Pasca, Aitor Soroa (2009): A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches. In: Proceedings of NAACL-HLT 09, pp. 19-27, Association for Computational Linguistics, Boulder, CO, USA, ISBN 978-1-932432-41-1 (pdf, bibtex)
  29. Jana Kravalová (2009): Využití syntaxe v metodách pro vyhledávání informací (master's thesis). (local PDF, bibtex)
  30. Jana Kravalová, Zdeněk Žabokrtský (2009): Czech Named Entity Corpus and SVM-based Recognizer. In: Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009), pp. 194-201, Association for Computational Linguistics, Suntec, Singapore, ISBN 978-1-932432-57-2 (url, bibtex)