Principal investigator (ÚFAL): 
Project Manager (ÚFAL): 
Provider: 
Grant id: START/HUM/010
Duration: 2021-2023
Projects: 

A data-based approach to competition in word-formation: selected semantic categories across seven languages

The project carries out data-based research into competition in word-formation. It aims to compare the word-formation processes and strategies that speakers employ to express the semantic concepts of diminutiveness and femaleness in seven European languages (two Slavic, three Germanic, and two Romance languages). Derivatives, compounds and syntactic phrases used for these concepts in the analysed languages (cf. 'Polizistin' in German, 'policewoman' in English, and 'mujer policía' in Spanish) will be identified either by exploiting available language resources and tools (some of which have been developed by the project team members) or by using tools and methods designed specifically for the project. A team of four PhD students of computational linguistics will develop machine learning models that simulate how these semantic concepts are expressed in the languages studied and that reveal which linguistic properties influence native speakers' choices among the competing alternatives. The results are expected to be relevant both for the linguistic discussion on competition in word-formation and for modelling word-formation in Natural Language Processing.

Reg. n. CZ.02.2.69/0.0/0.0/19_073/0016935.

Publications

  • Ševčíková, M.; Kyjánek, L.; Vidová Hladká, B. Agent noun formation in Czech: An empirical study on suffix rivalry. In Second Workshop on Paradigmatic Word Formation Modelling, 2021, pp. 65-68.
  • Svoboda, E.; Ševčíková, M. Splitting and Identifying Czech Compounds: A Pilot Study. In Proceedings of the Third Workshop on Resources and Tools for Derivational Morphology (DeriMo 2021). France, 2021, pp. 125-134.
  • Vidra, J.; Žabokrtský, Z.; Kyjánek, L.; Ševčíková, M.; Dohnalová, Š.; Svoboda, E.; Bodnár, J. DeriNet 2.1, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, 2021, http://hdl.handle.net/11234/1-3765.
  • Kyjánek, L.; Žabokrtský, Z.; Vidra, J.; Ševčíková, M. Universal Derivations v1.1, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), Faculty of Mathematics and Physics, Charles University, 2021, http://hdl.handle.net/11234/1-3247.