Jana Straková Jana Kravalová
Main Research Interests
- deep learning
- multilingual text processing: POS tagging, lemmatization, NER, dependency parsing, semantic parsing
Projects
Curriculum Vitae
Selected Bibliography
- Google Scholar
- ORCID: 0000-0003-0075-2408
- Scopus ID: 57193758664
- Researcher ID: L-5805-2017
Papers
- OOVs in the Spotlight: How to Inflect them? In: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 12455-12466, European Language Resources Association, Torino, Italy, ISBN 978-2-493814-10-4 (pdf, bibtex)
- ÚFAL LatinPipe at EvaLatin 2024: Morphosyntactic Analysis of Latin. In: Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024, pp. 207-214, ELRA and ICCL, Torino, Italy, ISBN 978-2-493814-46-3 (pdf, bibtex)
- CWRCzech: 100M Query-Document Czech Click Dataset and Its Application to Web Relevance Ranking. In: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1221-1231, Association for Computing Machinery, New York, NY, USA, ISBN 9798400704314 (url, bibtex)
- Extending an Event-type Ontology: Adding Verbs and Classes using Fine-tuned LLMs Suggestions. In: Proceedings of the 17th Linguistic Annotation Workshop, pp. 85-95, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-83-8 (url, bibtex)
- Czech Grammar Error Correction with a Large and Diverse Corpus. In: Transactions of the Association for Computational Linguistics, ISSN 2307-387X, 10, pp. 452-467 (url, local PDF, bibtex)
- ÚFAL CorPipe at CRAC 2022: Effectivity of Multilingual Models for Coreference Resolution. In: Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution, pp. 28-37, Association for Computational Linguistics, Gyeongju, Korea (url, local PDF, bibtex)
- Understanding Model Robustness to User-generated Noisy Texts. In: Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021), pp. 340-350, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-90-9 (url, local PDF, bibtex)
- Diacritics Restoration using BERT with Analysis on Czech language. In: The Prague Bulletin of Mathematical Linguistics, ISSN 0032-6585, 116, pp. 27-42 (pdf, local PDF, bibtex)
- Character Transformations for Non-Autoregressive GEC Tagging. In: Proceedings of the 7th Workshop on Noisy User-generated Text (W-NUT 2021), pp. 417-422, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-90-9 (url, local PDF, bibtex)
- RobeCzech: Czech RoBERTa, a Monolingual Contextualized Language Representation Model. In: 24th International Conference on Text, Speech and Dialogue, pp. 197-209, Springer, Cham, Switzerland, ISBN 978-3-030-83526-2 (url, local PDF, bibtex)
- UDPipe at EvaLatin 2020: Contextualized Embeddings and Treebank Embeddings. In: Proceedings of LT4HALA 2020 - 1st Workshop on Language Technologies for Historical and Ancient Languages, pp. 124-129, European Language Resources Association (ELRA), Marseille, France, ISBN 979-10-95546-53-5 (url, local PDF, bibtex)
- ÚFAL MRPipe at MRP 2019: UDPipe Goes Semantic in the Meaning Representation Parsing Shared Task. In: Proceedings of the CoNLL 2019 Shared Task: Cross-Framework Meaning Representation Parsing, pp. 127-137, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-60-4 (url, local PDF, bibtex)
- Czech Text Processing with Contextual Embeddings: POS Tagging, Lemmatization, Parsing and NER. In: Proceedings of the 22nd International Conference on Text, Speech and Dialogue - TSD 2019, Lecture Notes in Computer Science, ISSN 0302-9743, 11697, pp. 137-150, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-030-27946-2 (url, local PDF, bibtex)
- Evaluating Contextualized Embeddings on 54 Languages in POS Tagging, Lemmatization and Dependency Parsing (Electronic). In: ArXiv.org Computing Research Repository, ISSN 2331-8422, 1904.02099 (url, local PDF)
- UDPipe at SIGMORPHON 2019: Contextualized Embeddings, Regularization with Morphological Categories, Corpora Merging. In: Proceedings of the 16th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology, pp. 95-103, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-36-9 (pdf, local PDF, bibtex)
- Neural Architectures for Nested NER through Linearization. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5326-5331, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-48-2 (pdf, local PDF, bibtex)
- Hluboké učení v automatické analýze českého textu. In: Slovo a slovesnost, ISSN 0037-7031, vol. 80, no. 4, pp. 306-327 (bibtex)
- Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe. In: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pp. 88-99, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-945626-70-8 (pdf, local PDF, bibtex)
- Prague at EPE 2017: The UDPipe System. In: Proceedings of the 2017 Shared Task on Extrinsic Parser Evaluation at the Fourth International Conference on Dependency Linguistics and the 15th International Conference on Parsing Technologies, pp. 65-74, Association for Computational Linguistics (ACL), Stroudsburg, PA, USA, ISBN 978-1-945626-74-6 (pdf, local PDF, bibtex)
- Neural Network Based Named Entity Recognition (PhD thesis). (pdf, local PDF, bibtex)
- Czech Named Entity Corpus. In: Handbook of Linguistic Annotation, pp. 855-873, Springer Netherlands, Netherlands, ISBN 978-94-024-0879-9 (bibtex)
- Nový encyklopedický slovník češtiny. ISBN 978-80-7422-480-5 (url, bibtex)
- UDPipe: Trainable Pipeline for Processing CoNLL-U Files Performing Tokenization, Morphological Analysis, POS Tagging and Parsing. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC 2016), pp. 4290-4297, European Language Resources Association, Paris, France, ISBN 978-2-9517408-9-1 (pdf, local PDF, bibtex)
- Neural Networks for Featureless Named Entity Recognition in Czech. In: Text, Speech, and Dialogue: 19th International Conference, TSD 2016, Lecture Notes in Computer Science, ISSN 0302-9743, 9924, pp. 173-181, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-319-45509-9 (url, local PDF, bibtex)
- Parsing Universal Dependency Treebanks using Neural Networks and Search-Based Oracle. In: 14th International Workshop on Treebanks and Linguistic Theories (TLT 2015), pp. 208-220, IPIPAN, Warszawa, Poland, ISBN 978-83-63159-18-4 (pdf, local PDF, bibtex)
- Open-Source Tools for Morphology, Lemmatization, POS Tagging and Named Entity Recognition. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 13-18, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-941643-00-6 (pdf, local PDF, bibtex)
- A New State-of-The-Art Czech Named Entity Recognizer. In: Text, Speech and Dialogue: 16th International Conference, TSD 2013. Proceedings, Lecture Notes in Computer Science, ISSN 0302-9743, 8082, pp. 68-75, Springer Verlag, Berlin / Heidelberg, ISBN 978-3-642-40584-6 (url, local PDF, bibtex)
- Concurrent effects of lexical status and letter-rotation during early stages of visual word recognition: evidence from ERPs. In: Brain Research, ISSN 0006-8993, 1468, pp. 52-62 (bibtex)
- When Informatics Meets Neuroscience: Software and Statistics for Human Brain Imaging. In: WDS 2010 Proceedings of Contributed Papers, pp. 94-96, Matfyzpress, Charles University, Praha, Czechia, ISBN 978-80-7378-139-2 (local PDF, bibtex)
- Czech Information Retrieval with Syntax-based Language Models. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), pp. 1359-1362, European Language Resources Association, Valletta, Malta, ISBN 2-9517408-6-7 (pdf, local PDF, bibtex)
- A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches. In: Proceedings of NAACL-HLT 09, pp. 19-27, Association for Computational Linguistics, Boulder, CO, USA, ISBN 978-1-932432-41-1 (pdf, bibtex)
- Využití syntaxe v metodách pro vyhledávání informací (master's thesis). (local PDF, bibtex)
- Czech Named Entity Corpus and SVM-based Recognizer. In: Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009), pp. 194-201, Association for Computational Linguistics, Suntec, Singapore, ISBN 978-1-932432-57-2 (url, bibtex)