Pavel Pecina

Associate Professor, Institute of Formal and Applied Linguistics, Charles University, Prague

Khresmoi/KConnect data for medical Machine Translation

  1. Khresmoi Summary Translation Test Data 2.0, LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University.
  2. Khresmoi Query Translation Test Data 2.0, LINDAT/CLARIN digital library at the Institute of Formal and Applied Linguistics, Charles University.

Panacea data for domain adaptation of Machine Translation

  1. Parallel training, development, and test data for English-French and English-Greek: Environment, Labour Legislation.
  2. Monolingual data sets for English, French, and Greek, each covering two domains: Environment and Labour Legislation.

Reference data for Collocation Extraction

  1. Reference Data for Collocation Extraction: Czech Dependency Bigrams from the Prague Dependency Treebank. Multiword Expressions Workshop (MWE 2008), Marrakech, Morocco, 2008. Frequency data available here.
  2. Reference Data for Collocation Extraction: Czech Surface Bigrams from the Prague Dependency Treebank. Multiword Expressions Workshop (MWE 2008), Marrakech, Morocco, 2008. Frequency data available here.
  3. Reference Data for Collocation Extraction: Czech Surface Bigrams from the Czech National Corpus. Multiword Expressions Workshop (MWE 2008), Marrakech, Morocco, 2008. Frequency data available here.

Presentations

  1. Habilitation presentation, Faculty of Mathematics and Physics, Charles University, Prague, 2017.
  2. Malach: zpracování audiovizuálního archívu svědectví přeživších holocaustu [Malach: Processing an Audiovisual Archive of Holocaust Survivor Testimonies], New Media Inspiration, Prague, 2015.
  3. Simple and Effective Parameter Tuning for Domain Adaptation of Statistical Machine Translation. The 24th International Conference on Computational Linguistics (Coling 2012), Mumbai, India, Dec 14, 2012.
  4. Lexical Association Measures: Collocation Extraction. Invited talk, LOEWE Digital Humanities, Goethe University, Frankfurt am Main, Germany, Jul 12, 2012.
  5. Cross-Language Speech Retrieval and its Evaluation in the Malach Project. Invited talk, European Masters Program in Language and Communication Technologies workshop, Prague, May 29, 2012.
  6. Lexical Association Measures: Collocation Extraction. Invited talk, Institute of the Czech National Corpus, Prague, Czech Republic, Feb 7, 2012.
  7. Towards Using Web-Crawled Data for Domain Adaptation in Statistical Machine Translation. The 15th Annual Conference of the European Association for Machine Translation (EAMT 2011), Leuven, Belgium, May 31, 2011. (presented by Antonio Toral)
  8. Lexical Association Measures: Collocation Extraction. Invited talk, Knowledge Engineering Group seminar, Prague, Czech Republic, Nov 26, 2009.
  9. Lexical Association Measures: Collocation Extraction. CNGL seminar, Dublin City University, Dublin, Ireland, Sep 21, 2009.
  10. Jak psát a nepsat vědecké články [How to Write and Not to Write Scientific Papers]. Winter Seminar, UFAL, Horní Mísečky, Czech Republic, Feb 9, 2009.
  11. Lexical Association Measures: Collocation Extraction. MFF UK, Ph.D. thesis defense, Prague, Czech Republic, Sep 24, 2008.
  12. A Machine Learning Approach to Multiword Expression Extraction, Towards a Shared Task for Multiword Expressions Workshop (MWE 2008), LREC 2008, Marrakech, Morocco, Jun 1, 2008.
  13. Reference Data for Czech Collocation Extraction, Towards a Shared Task for Multiword Expressions Workshop (MWE 2008), LREC 2008, Marrakech, Morocco, Jun 1, 2008.
  14. Úklid a čištění jako věda [Tidying and Cleaning as a Science]. Mixer, Prague, Oct 21, 2007.
  15. Cross-Language Speech Retrieval and its Evaluation in the Malach Project, UFAL Seminar, Prague, Nov 20, 2006.
  16. Vyhledávání informací v projektu Malach [Information Retrieval in the Malach Project], Mixer, Prague, Apr 12, 2006.
  17. An Extensive Empirical Study of Collocation Extraction Methods, ACL 2005 Student Research Workshop, Ann Arbor, USA, Jun 27, 2005.
  18. Collocation Extraction: The Statistical Approach, Invited talk, Institute of Czech National Corpus, Prague, Apr 12, 2005.
  19. Validating and Improving the Czech WordNet via Lexico-Semantic Annotation of the Prague Dependency Treebank, LREC workshop: Building Lexical Resources from Semantically Annotated Corpora, Lisbon, Portugal, Jun 8, 2004.
  20. Automatic Collocation Extraction from Text Corpora, UFAL Seminar, MFF, Prague, May 17, 2004.

Posters

  1. Domain Adaptation of Statistical Machine Translation using Web-Crawled Resources: A Case Study. EAMT, Trento, Italy, 2012. (presented by Antonio Toral)
  2. Combining Association Measures for Collocation Extraction. COLING/ACL, Sydney, Australia, 2006.
  3. Language Modeling for Czech ASR. MALACH, NSF site visit, Washington, USA, 2004.

Contact

UFAL MFF UK
Room 422, 4th floor
Malostranské nám. 25
118 00 Prague 1
Czech Republic

+420 951 554 332