Prague Database of Forms and Functions

ForFun is a tool for linguistic research, particularly for describing syntactic functions and their formal realizations in Czech sentences. It draws on the rich linguistic annotation of the Prague Dependency Treebanks (PDTs) and arranges their morphological and syntactic annotation into a new tool that lets the user quickly and conveniently look up all forms (almost 1,500 items) used in the PDTs for a particular function and, vice versa, all functions (66 items) expressed by a particular form.

1. “From function to form” and “from form to function”

The ForFun database is split into two interconnected, mutually inverse sets. The “from function to form” set contains a list of all syntactic functions (see Section 2 below).

After choosing a function, the user can search for examples of all forms that may represent that function according to several criteria: the form itself, the word class of the parent node, and the source of the text data.

For each specified 4‑combination (form, functor, word class, and source), the number of examples available in the database is always shown. On demand, either the first ten examples or all of them are displayed.
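The counting and display behaviour described above can be illustrated with a minimal Python sketch. It is not the ForFun implementation: the `Combo` type, the function names, and the placeholder values (`FORM_A`, `FUNC_X`, `written`, `spoken`) are assumptions made purely for illustration and do not reflect the actual notation used in the database.

```python
from collections import defaultdict
from typing import NamedTuple


class Combo(NamedTuple):
    """A hypothetical 4-combination; field values below are placeholders."""
    form: str        # formal realization of the sentence unit
    functor: str     # syntactic function label
    word_class: str  # word class of the parent node
    source: str      # corpus the example comes from


def group_examples(rows):
    """Group example sentences by their 4-combination."""
    groups = defaultdict(list)
    for combo, sentence in rows:
        groups[combo].append(sentence)
    return groups


def show(groups, combo, all_examples=False, limit=10):
    """Return the total example count and the examples to display.

    By default only the first `limit` examples are returned;
    with all_examples=True every example is returned.
    """
    examples = groups.get(combo, [])
    shown = examples if all_examples else examples[:limit]
    return len(examples), shown


# Made-up data: twelve examples for one combination, one for another.
target = Combo("FORM_A", "FUNC_X", "verb", "written")
rows = [(target, f"example sentence {i}") for i in range(12)]
rows.append((Combo("FORM_A", "FUNC_Y", "noun", "spoken"), "example sentence 12"))

groups = group_examples(rows)
count, first_ten = show(groups, target)
_, everything = show(groups, target, all_examples=True)
```

Here `count` is the frequency reported for the combination, while `first_ten` and `everything` correspond to the two display modes.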

The “from form to function” set contains a long list (almost 1,500 items) of all formal realizations of sentence units that occur in the PDTs (see Section 3 below). For each form, examples are again available, sorted by function, word class of the parent node, and source of the text data, always with the frequency of the 4‑combination in the data.

In both sets, examples can also be filtered by their source first, which allows the user to hide, for example, all forms used only in spoken language.
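The relation between the two sets, and the effect of filtering by source, can be sketched in Python as two indexes built from the same records. This is only an illustration under stated assumptions: the record layout, the `build_view` helper, and all values (`FORM_A`, `FUNC_X`, `written`, `spoken`) are hypothetical and are not taken from the ForFun data.

```python
from collections import Counter, defaultdict

# Hypothetical records: (form, functor, parent word class, source).
RECORDS = [
    ("FORM_A", "FUNC_X", "verb", "written"),
    ("FORM_A", "FUNC_X", "verb", "written"),
    ("FORM_A", "FUNC_Y", "noun", "written"),
    ("FORM_B", "FUNC_X", "verb", "spoken"),
]


def build_view(records, key, hidden_sources=frozenset()):
    """Index records by 'functor' or by 'form', skipping hidden sources.

    Each entry maps the remaining three criteria to their frequency.
    """
    if key not in ("functor", "form"):
        raise ValueError("key must be 'functor' or 'form'")
    view = defaultdict(Counter)
    for form, functor, word_class, source in records:
        if source in hidden_sources:
            continue  # the user chose to hide this source
        if key == "functor":
            view[functor][(form, word_class, source)] += 1
        else:
            view[form][(functor, word_class, source)] += 1
    return dict(view)


function_to_form = build_view(RECORDS, "functor")
form_to_function = build_view(RECORDS, "form")

# Hiding the spoken source removes forms attested only in spoken data.
written_only = build_view(RECORDS, "functor", hidden_sources={"spoken"})
```

Both views are derived from the same records, so the frequency of any 4‑combination is identical whichever direction it is looked up from.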

2. Functions

Since the database is extracted from the PDTs, it adopts both their list of syntactic functions and their terminology, in which the functions are called functors. Functors (66 items) represent the semantic values of syntactic dependency relations; they express the functions of individual modifications in the sentence.

For more details, see the full list of all dependency functions with their descriptions and labels.

3. Forms

The forms (almost 1,500 items) that occur in PDTs are of the following types (with the following notations):

4. Governing nodes

We distinguish the following types of governing nodes:

---   no parent node
0     parent node has no word class assigned

5. Sources

The ForFun database is built on four PDT corpora:

6. How to cite

Mikulová Marie, Bejček Eduard: ForFun 1.0: Prague Database of Syntactic Forms and Functions — An Invaluable Resource for Linguistic Research. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), Copyright © European Language Resources Association, Paris, France, 2018. (in press)

Mikulová Marie, Bejček Eduard: ForFun 1.0. Data/software, Charles University, Prague, Czech Republic, Dec 2017.

7. References

You can read more in these papers: