The Valency Lexicon of Czech Verbs with Complex Syntactic-Semantic Annotation

The main goal of the project is to create a consistent electronic dictionary that captures the underlying structure of Czech verbs, together with additional syntactico-semantic information useful for the analysis and synthesis of Czech texts and for other applied NLP tasks.

The Valency Lexicon of Czech Verbs is a collection of linguistically annotated data and documentation, resulting from an attempt at formal description of valency frames of Czech verbs.

The lexicon provides:

  • valency frames with basic syntactico-semantic characterization of the most frequent verbs in their particular senses (number of complementations, their morphological forms and obligatoriness);
  • glosses, examples;
  • additional characteristics – idiom, control, reflexivity, reciprocity, syntactico-semantic class.
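The kinds of information listed above can be pictured as a small record per lexical unit. The following is only a minimal sketch, not the lexicon's actual data model: the class names, field names, and the sample entry (including its functor labels, forms, and class value) are illustrative assumptions and are not copied from VALLEX.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Complementation:
    """One slot of a valency frame (hypothetical representation)."""
    functor: str          # label of the complementation, e.g. "ACT" (assumed)
    forms: List[str]      # possible morphological forms, e.g. case numbers
    obligatory: bool      # obligatoriness of the slot


@dataclass
class LexicalUnit:
    """A verb in one particular sense, with its frame and extras."""
    lemma: str
    gloss: str
    example: str
    frame: List[Complementation] = field(default_factory=list)
    # Additional characteristics such as control, reflexivity, or class.
    attributes: Dict[str, str] = field(default_factory=dict)

    def obligatory_slots(self) -> List[str]:
        """Return the functors of all obligatory complementations."""
        return [c.functor for c in self.frame if c.obligatory]


# Illustrative entry only; the values are assumptions for this sketch.
give = LexicalUnit(
    lemma="dát",
    gloss="to give",
    example="Dal mu knihu.",
    frame=[
        Complementation("ACT", ["1"], True),
        Complementation("ADDR", ["3"], True),
        Complementation("PAT", ["4"], True),
        Complementation("BEN", ["pro+4"], False),
    ],
    attributes={"class": "exchange"},
)

print(len(give.frame), give.obligatory_slots())
```

Keeping obligatoriness as a per-slot flag means the number of complementations and the set of obligatory ones can both be derived from the same list.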

The lexicon is available in three formats:

  • HTML version for comfortable browsing and sorting according to various criteria;
  • PDF version for printing;
  • XML data for further applications.

 

VALLEX, version 3

VALLEX 3.0 is an enhanced, cleaned and corrected successor of VALLEX 2.5. In addition to the information stored in VALLEX 2.5, it contains:

  • annotation of grammaticalized alternations (diatheses and reciprocity) and lexicalized alternations,
  • links to real-world sentences annotated by the lexicon entries for more than one hundred Czech verbs, and
  • links to PDT-Vallex, a lexicon connected with the Prague Dependency Treebank.

VALLEX 3.0 has been developed within the project Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs, supported by the Grant Agency of the Czech Republic, grant No. GA P406/12/0557.

 

How to cite

If you make use of VALLEX, please cite (at least one of) the following papers:

 

@book{2008-book,
       title = {{Valen{\v{c}}n{\'{i}} slovn{\'{i}}k {\v{c}}esk{\'{y}}ch sloves}},
       author = {Mark{\'{e}}ta Lopatkov{\'{a}} and Zden{\v{e}}k {\v{Z}}abokrtsk{\'{y}} and V{\'{a}}clava Ketnerov{\'{a}}},
       year = {2008},
       publisher = {Karolinum},
       address = {Praha},
}

@article{2007-vallex-pbml,
       journal = {The Prague Bulletin of Mathematical Linguistics},
       title = {Valency Information in {VALLEX} 2.0: Logical Structure of the Lexicon},
       author = {Zden{\v{e}}k {\v{Z}}abokrtsk{\'{y}} and Mark{\'{e}}ta Lopatkov{\'{a}}},
       year = {2007},
       number = {87},
       pages = {41--60},
}

 


VALLEX Archive

VALLEX 2.7

VALLEX 2.7 is an enhanced, cleaned and corrected successor of VALLEX 2.5. In addition to the information stored in VALLEX 2.5, it contains:

  • annotation of grammaticalized alternations (diatheses and reciprocity) and lexicalized alternations,
  • links to real-world sentences annotated by the lexicon entries for more than one hundred Czech verbs, and
  • links to PDT-Vallex, a lexicon connected with the Prague Dependency Treebank.

VALLEX 2.7 is a beta version of the VALLEX lexicon, version 3, which is being developed within the project Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs, supported by the Grant Agency of the Czech Republic, grant No. GA P406/12/0557.

VALLEX 2.5

VALLEX 2.5 is a cleaned and corrected successor of VALLEX 2.0. It was released electronically at the end of 2007, and since spring 2008 it has also been available as a book issued by Karolinum Press, the publishing house of Charles University in Prague.

VALLEX 2.0

VALLEX 2.0 contains roughly 2,730 lexeme entries comprising around 6,460 lexical units ("senses"). Unlike traditional dictionaries and unlike VALLEX 1.0, VALLEX 2.0 treats a pair of perfective and imperfective aspectual counterparts as a single lexeme; if perfective and imperfective verbs were counted separately, VALLEX 2.0 would contain about 4,250 verb entries.
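The difference between the two ways of counting can be illustrated with a toy example. This is only a sketch of the counting convention: the three lexemes below are sample data chosen for illustration, not an excerpt from the lexicon, and the tuple representation is an assumption.

```python
# Each tuple stands for one hypothetical lexeme; verbs listed together
# are treated as aspectual counterparts of each other.
lexemes = [
    ("dát", "dávat"),        # perfective / imperfective pair
    ("koupit", "kupovat"),   # perfective / imperfective pair
    ("spát",),               # verb listed without a counterpart
]

# Counting in the VALLEX 2.0 style: one entry per lexeme.
lexeme_count = len(lexemes)

# Counting each aspectual counterpart as a separate verb entry.
verb_count = sum(len(verbs) for verbs in lexemes)

print(lexeme_count, verb_count)
```

With this sample data the lexeme count is 3 and the separate verb count is 5, which mirrors on a small scale how the same content is reported as about 2,730 lexemes or about 4,250 verbs.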

VALLEX 1.0

VALLEX 1.0 contains roughly 1,400 verbs (counting perfective and imperfective verbs, but not their iterative counterparts). The 1,000 most frequent Czech verbs were selected according to their number of occurrences in a part of the Czech National Corpus (only 'být' (to be) was excluded); their perfective or imperfective aspectual counterparts were then added if they were missing.

Licence

VALLEX can be used free of charge by any academic, educational or research institution, or by any other organization or individual using VALLEX for non-commercial research and/or educational purposes. Legal use of VALLEX requires filling in the registration form.