Institute of Formal and Applied Linguistics

at Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic


Publication


Year 2009
Type in proceedings
Status published
Language English
Author(s) Kravalová, Jana; Žabokrtský, Zdeněk
Title Czech Named Entity Corpus and SVM-based Recognizer
Czech title Český korpus pojmenovaných entit a jejich rozpoznávač
Proceedings 2009: Suntec, Singapore: ACL-IJCNLP 2009 workshop: Proceedings of the 2009 Named Entities Workshop: Shared Task on Transliteration (NEWS 2009)
Pages range 194-201
URL http://www.aclweb.org/anthology/W/W09/W09-3538
Supported by 2005-2010 MSM 0021620838 (Moderní metody, struktury a systémy informatiky); 2005-2009 1ET101120503 (Integrace jazykových zdrojů za účelem extrakce informací z přirozených textů); 2005-2009 LC536 (Centrum komputační lingvistiky)
Czech abstract (English translation) This paper deals with the recognition of named entities in Czech texts. It describes a new corpus with manually annotated entities in a two-level annotation scheme. The data were used to train a named entity recognizer based on an SVM classifier.
English abstract This paper deals with recognition of named entities in Czech texts. We present a recently released corpus of Czech sentences with manually annotated named entities, in which a rich two-level classification scheme was used. There are around 6000 sentences in the corpus with roughly 33000 marked named entity instances. We use the data for training and evaluating a named entity recognizer based on the Support Vector Machine classification technique. The presented recognizer outperforms the results previously reported for NE recognition in Czech.
Specialization linguistics ("jazykověda")
Confidentiality default – not confidential
Open access no
ISBN* 978-1-932432-57-2
Address* Suntec, Singapore
Month* August
Venue* Singapore
Publisher* Association for Computational Linguistics
Institution* Association for Computational Linguistics
Creator: Common Account
Created: 9/15/09 1:49 PM
Modifier: Almighty Admin
Modified: 2/17/10 9:04 AM