Valency Lexicon of Czech Verbs

Markéta Lopatková, Václava Kettnerová, Eduard Bejček, Karolína Skwarska, Zdeněk Žabokrtský







Almost 10,000 sentences were manually annotated with VALLEX valency frames within the VALEVAL project. The annotated data are also available in machine-readable form. The XML is kept as simple as possible: each verb occurrence contains the sentence with the verb marked, plus three context sentences, and the correct valency frame assigned. It is ready to use, e.g., in machine learning. The data were created in cooperation with Ondřej Bojar and Jiří Semecký.
For more information, see:
  • paper: Lopatková Markéta, Bojar Ondřej, Semecký Jiří, Benešová Václava, Žabokrtský Zdeněk: Valency Lexicon of Czech Verbs VALLEX: Recent Experiments with Frame Disambiguation. In: Lecture Notes in Computer Science, Vol. 3658, Proceedings of the 8th International Conference, TSD 2005, Springer, Berlin / Heidelberg, ISBN 3-540-28789-2, ISSN 0302-9743, pp. 99-106, 2005.
  • and article: Bojar Ondřej, Semecký Jiří, Benešová Václava: VALEVAL: Testing VALLEX Consistency and Experimenting with Word-Frame Disambiguation. In: Prague Bulletin of Mathematical Linguistics, No. 83, pp. 5-17, 2005.
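
As a rough illustration of how the annotated data might be read for machine learning, the sketch below parses one verb entry with Python's standard xml.etree.ElementTree module. The element and attribute names (verb, occurence, sentence, word, lemma, frames, number, frame, is_here) are taken from the example on this page. The closing tags in SAMPLE, the function name read_occurrences and the layout of the returned dictionaries are assumptions made for illustration only, not part of the distributed data or tools.

```python
import xml.etree.ElementTree as ET

# One verb entry modelled on the example on this page.
# The closing tags are an assumption; the real file may nest entries differently.
SAMPLE = """<verb lemma='brát' frames='10'>
  <occurence number='54' frame='2'>
    <sentence>Nebraňte se tolik svému osudu!"</sentence>
    <sentence>Aidan se odvrátila.</sentence>
    <sentence>Jeho řeči ji dráždily.</sentence>
    <sentence is_here='1'><word>Braly</word> jí naději, a toho se děsila.</sentence>
  </occurence>
</verb>"""


def read_occurrences(xml_text):
    """Return one dict per annotated verb occurrence (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter() also yields the root itself when the root is a <verb> element.
    for verb in root.iter("verb"):
        for occ in verb.iter("occurence"):  # tag spelling as in the data
            row = {
                "lemma": verb.get("lemma"),
                "n_frames": int(verb.get("frames")),
                "number": int(occ.get("number")),
                "frame": int(occ.get("frame")),
                "word": None,
                "sentence": None,
                "context": [],
            }
            for sent in occ.findall("sentence"):
                # itertext() joins the text of <word> with the rest of the sentence.
                text = "".join(sent.itertext()).strip()
                if sent.get("is_here") == "1":
                    word = sent.find("word")
                    row["word"] = word.text if word is not None else None
                    row["sentence"] = text
                else:
                    row["context"].append(text)
            rows.append(row)
    return rows


rows = read_occurrences(SAMPLE)
```

Each returned dictionary pairs the marked verb form and its sentence with the assigned frame number, which is the label a frame disambiguation classifier would be trained to predict; the context sentences are kept separately so they can be used or ignored as features.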

An example: the 54th occurrence of the verb "brát", which uses frame #2 out of 10:
    <verb lemma='brát' frames='10'>
      <occurence number='54' frame='2'>
        <sentence>Nebraňte se tolik svému osudu!"</sentence>
        <sentence>Aidan se odvrátila.</sentence>
        <sentence>Jeho řeči ji dráždily.</sentence>
        <sentence is_here='1'><word>Braly</word> jí naději, a toho se děsila.</sentence>