Institute of Formal and Applied Linguistics

at Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic


Publication


Year 2013
Type in proceedings
Status published
Language English
Author(s) Mareček, David; Straka, Milan
Title Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing
Czech title Odhady STOP-pravděpodobností počítané na velkých datech vylepšují neřízený závislostní analýzu
Proceedings 2013: Sofija, Bulgaria: ACL 2013: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics
Pages range 281-290
How published online
URL http://aclweb.org/anthology/P/P13/P13-1028.pdf
Supported by 2012-2015 DF12P01OVV022 (Making a large cultural-heritage video archive accessible using automatic speech recognition and machine translation methods (AMALACH)); 2012-2016 PRVOUK P46 (Informatics)
English abstract Although the quality of unsupervised dependency parsers keeps improving, they often fail to recognize very basic dependencies. In this paper, we exploit prior knowledge of STOP-probabilities (whether a given word has any children in a given direction), obtained from a large raw corpus using the reducibility principle. By incorporating this knowledge into the Dependency Model with Valence, we considerably outperform state-of-the-art results in terms of average attachment score over 20 treebanks from the CoNLL 2006 and 2007 shared tasks.
Specialization linguistics ("jazykověda")
Confidentiality default – not confidential
Open access no
ISBN* 978-1-937284-50-3
Address* Sofija, Bulgaria
Month* August
Publisher* Association for Computational Linguistics
Institution* Bălgarska akademija na naukite
Creator: Common Account
Created: 8/13/13 12:10 PM
Modifier: Almighty Admin
Modified: 2/26/14 12:23 PM
Paper (public): P13-1028.pdf (application/pdf)