Almost 10,000 sentences were manually annotated with VALLEX valency frames in the course of the VALEVAL project.
The annotated data are also available in machine-readable form. The XML is kept as simple as possible: each verb occurrence contains the sentence with the verb marked, three context sentences, and the correct valency frame assigned. It is ready to use, e.g., in machine learning.
The data were created in cooperation with Ondřej Bojar and Jiří Semecký.
For more information, see:
- paper:
Lopatková Markéta, Bojar Ondřej, Semecký Jiří, Benešová Václava, Žabokrtský Zdeněk: Valency Lexicon of Czech Verbs VALLEX: Recent Experiments with Frame Disambiguation. In: Lecture Notes in Computer Science, Vol. 3658, Proceedings of the 8th International Conference, TSD 2005, Springer, Berlin / Heidelberg, ISBN 3-540-28789-2, ISSN 0302-9743, pp. 99-106, 2005.
- and article:
Bojar Ondřej, Semecký Jiří, Benešová Václava: VALEVAL: Testing VALLEX Consistency and Experimenting with Word-Frame Disambiguation. In: Prague Bulletin of Mathematical Linguistics, No. 83, pp. 5-17, 2005.
An example: the 54th occurrence of the verb "brát", which uses frame #2 out of 10:
<body>
  <verb lemma='brát' frames='10'>
    ...
    <occurence number='54' frame='2'>
      <sentence>Nebraňte se tolik svému osudu!"</sentence>
      <sentence>Aidan se odvrátila.</sentence>
      <sentence>Jeho řeči ji dráždily.</sentence>
      <sentence is_here='1'><word>Braly</word> jí naději, a toho se děsila.</sentence>
    </occurence>
    ...
  </verb>
  ...
</body>
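For readers who want to feed the data to a machine learning tool, the following is a minimal Python sketch of how such a file might be read. It relies only on the element and attribute names visible in the example above (`verb`, `occurence`, `sentence`, `word`, `lemma`, `frames`, `number`, `frame`, `is_here`). The function name `read_occurrences` and the embedded `SAMPLE` string are hypothetical and serve only as illustration; the real file may contain details not shown here, so attribute values are kept as strings rather than converted.

```python
import xml.etree.ElementTree as ET

# A cut-down copy of the example above, with the "..." placeholders omitted.
SAMPLE = """<body>
<verb lemma='brát' frames='10'>
<occurence number='54' frame='2'>
<sentence>Nebraňte se tolik svému osudu!"</sentence>
<sentence>Aidan se odvrátila.</sentence>
<sentence>Jeho řeči ji dráždily.</sentence>
<sentence is_here='1'><word>Braly</word> jí naději, a toho se děsila.</sentence>
</occurence>
</verb>
</body>"""


def read_occurrences(xml_text):
    """Yield one dict per <occurence> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    for verb in root.iter("verb"):
        for occ in verb.iter("occurence"):
            # The sentence containing the annotated verb is flagged with is_here='1'.
            target = occ.find("sentence[@is_here='1']")
            if target is None:
                continue  # skip occurrences without a marked sentence
            context = [
                "".join(s.itertext())
                for s in occ.findall("sentence")
                if s.get("is_here") != "1"
            ]
            yield {
                "lemma": verb.get("lemma"),
                "frames": verb.get("frames"),   # number of frames of the verb
                "number": occ.get("number"),    # index of this occurrence
                "frame": occ.get("frame"),      # the assigned (correct) frame
                "word": target.findtext("word"),
                "sentence": "".join(target.itertext()),
                "context": context,
            }


examples = list(read_occurrences(SAMPLE))
for ex in examples:
    print(ex["lemma"], ex["frame"], ex["word"], sep="\t")
```

Each resulting dict pairs the marked sentence and its context with the assigned frame, which is the shape a frame-disambiguation classifier would typically take as a training example.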