Principal investigator (ÚFAL): 
Project Manager (ÚFAL): 
Provider: 
Grant id: 246723
Duration: 2023-2025

The aim of the project is a data-based description of the morphematic structure of verbs in four languages (Czech, English, German and Spanish) across the whole frequency spectrum, from the most frequent verbs to verbs with a minimum number of occurrences. The research seeks to answer the question of whether and how the morphematic complexity of verbs varies across frequency bands. The initial hypothesis is that the top frequency band will be made up of verbs with a simpler morphematic structure and a limited root inventory, and that the number of verbs with higher morphematic complexity will grow in lower frequency bands. This hypothesis will be investigated on data from comparable text corpora, using additional types of language resources (e.g. derivational networks such as DeriNet and Universal Derivations, data containing morphological segmentation such as Universal Segmentations, and valency dictionaries such as Vallex, PDT-Vallex or EngVallex). Available dictionaries, grammatical descriptions and case studies take quantitative characteristics into account when describing verbs, but they are based on limited data samples. In contrast, the planned analysis opens up the possibility of linking information on verb characteristics with frequency information on other, non-verbal parts of the vocabulary of the analyzed languages.