
Institute of Formal and Applied Linguistics

at Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic



Publication


Year 2006
Type in proceedings
Status published
Language English
Author(s) Cinková, Silvie
Title From PropBank to EngValLex: Adapting the PropBank-Lexicon to the Valency Theory of the Functional Generative Description
Czech title Od PropBanku k EngValLexu: Adaptace PropBank-lexiconu na valenční teorii Funkčního generativního popisu
Proceedings 2006: Genova, Italy: LREC 2006: Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006)
Pages range 2170-2175
Supported by 2006-2008 GA405/06/0589 (Tektogramatický popis jazyka pro rozpoznávání mluvené řeči a strojový překlad)
Czech abstract EngValLex je valenční slovník anglických sloves používaných v korpusu Penn Treebank, který jsme získali poloautomatickou konverzí již existujícího valenčního slovníku PropBank-Lexicon do formátu použitelného pro tektogramatickou anotaci podle Funkčního generativního popisu. Tento článek se věnuje automatické konverzi dat a lingvistické problematice jejich následné manuální korektury.
English abstract EngValLex is the name of an FGD-compliant valency lexicon of English verbs, built from the PropBank-Lexicon and following the structure of Vallex, the FGD-based lexicon of Czech verbs. EngValLex is interlinked with the PropBank-Lexicon, thus preserving the original links between the PropBank-Lexicon and the PropBank-Corpus. It is therefore also intended to be part of corpus annotation. This paper describes the automatic conversion of the PropBank-Lexicon into Pre-EngValLex, as well as the progress of its subsequent manual refinement (EngValLex). At the start, the PropBank-arguments were automatically re-labeled with functors (semantic labels of FGD) and the PropBank-rolesets were split into the respective example sentences, which became FGD-valency frames of Pre-EngValLex. Human annotators check and correct the labels and make the preliminary valency frames FGD-compliant. The most essential theoretical difference between the original and EngValLex lies in the syntactic alternations used by the PropBank-Lexicon, which are not yet employed within the Czech framework. The alternation-based approach substantially affects the conception of the frame, making it very different from the one applied within the FGD-framework. Preserving the valuable alternation information required special linguistic rules for keeping, altering and re-merging the automatically generated preliminary valency frames.
Specialization linguistics ("jazykověda")
Confidentiality default – not confidential
Open access no
ISBN* 2-9517408-2-4
Address* Genova, Italy
Month* May
Institution* ELRA
Creator: Common Account
Created: 12/8/06 7:03 PM
Modifier: Almighty Admin
Modified: 2/3/11 10:58 AM

Attachment: cin_lrec_def.pdf (public, application/pdf)
