Trees and paradigms: How to model derivation in natural languages?

Monday, 2 June, 2025 - 13:30

Room:

Magda Ševčíková (ÚFAL MFF UK)

While morphology is defined as a broad discipline encompassing both the formation of word forms (inflection; journal – journals) and the creation of new words (derivation; journal > journalist), computational linguistics has long focused almost exclusively on inflectional morphology. Language data resources specifically dedicated to derivation have been developed only over the past two decades, and formal linguistic models of derivation still lag behind those of inflection.

Derivation, understood in linguistics as the creation of a new word from a single ancestor, has been modeled using structures that conform to the definition of rooted trees: the simplest word appears as the root, and more complex words are arranged around it, each linked to a single predecessor. However, such structures are not well suited to capturing more intricate yet common cases in which identifying a single ancestor is not straightforward. These include derivatives that bear comparable formal and semantic relationships to more than one word (e.g. unprofessionally, related both to professionally and to unprofessional) and words that share a substring (such as altruist and altruism) but for which no simpler common source is available.
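The single-predecessor constraint can be made concrete with a small sketch. The edge list below is hypothetical toy data built from the examples in this abstract, and the function names are illustrative; it is not drawn from any actual derivational resource. The check simply reports words for which more than one base is a plausible predecessor, which a rooted tree cannot represent.

```python
from collections import defaultdict

# Hypothetical toy data: (derivative, candidate base) pairs, for illustration only.
EDGES = [
    ("journalist", "journal"),
    ("professionally", "professional"),
    ("unprofessional", "professional"),
    ("unprofessionally", "professionally"),
    ("unprofessionally", "unprofessional"),
]

def parents_of(edges):
    """Collect every candidate base recorded for each derivative."""
    parents = defaultdict(set)
    for derivative, base in edges:
        parents[derivative].add(base)
    return dict(parents)

def tree_violations(edges):
    """Return the words with more than one candidate base.

    In a rooted tree every non-root node has exactly one predecessor,
    so each word returned here breaks the tree model.
    """
    return {
        word: sorted(bases)
        for word, bases in parents_of(edges).items()
        if len(bases) > 1
    }

print(tree_violations(EDGES))
# {'unprofessionally': ['professionally', 'unprofessional']}
```

A tree-based resource has to pick one of the two bases for unprofessionally and discard the other relation; the pair altruist and altruism poses the opposite problem, since no recorded base exists to serve as the root.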

As an alternative to derivational trees, I will present the option of modeling derivation through paradigms. The central concept, the derivational paradigm, is defined by analogy with the inflectional paradigm. Just as an inflectional paradigm is the set of forms of a single word that convey grammatical meanings such as number or case, a derivational paradigm is understood as an unordered set of words that share a common root and encode derivational meanings like action or agent. In this talk, I will illustrate how, beyond addressing cases that violate tree-based structures, paradigms open up new possibilities for modeling phenomena such as competition in derivation and the distinctions between native words and loanwords.
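One possible way to picture a derivational paradigm as a data structure is a set of cells, each pairing a derivational meaning with a word, grouped under a shared root. The sketch below is only an illustration under stated assumptions: the class name, the meaning labels ("base", "agent", "abstract") and the root strings are hypothetical choices, not the model presented in the talk.

```python
from dataclasses import dataclass

@dataclass
class Paradigm:
    """Words sharing a root, indexed by an assumed derivational meaning label."""
    root: str
    cells: dict  # meaning label -> word; no cell is the parent of another

    def members(self):
        """Return the paradigm as an unordered set of words."""
        return frozenset(self.cells.values())

def shared_meanings(a, b):
    """List the meaning labels filled in both paradigms."""
    return sorted(set(a.cells) & set(b.cells))

# Hypothetical labels; "altru" has no free-standing base word, so no "base" cell.
ALTRU = Paradigm("altru", {"agent": "altruist", "abstract": "altruism"})
JOURNAL = Paradigm(
    "journal",
    {"base": "journal", "agent": "journalist", "abstract": "journalism"},
)

print(sorted(ALTRU.members()))          # ['altruism', 'altruist']
print(shared_meanings(ALTRU, JOURNAL))  # ['abstract', 'agent']
```

Because the cells are unordered and none is designated as the source of another, altruist and altruism can sit side by side without a common ancestor, and cells can be compared across paradigms regardless of whether a base word exists.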

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
