Monday, March 10, 2014 - 13:30

Open-Source Taggers for (Czech) POS Tagging and NE Recognition

We present two recently released open-source tagging tools. NameTag is free
software for named entity recognition (NER) that achieves state-of-the-art
performance in Czech. MorphoDiTa (Morphological Dictionary and Tagger) performs
morphological analysis, morphological generation, tagging, and tokenization with
state-of-the-art results for Czech and a throughput of around 10-200K words per
second. Both tools are free software under the LGPL license and are distributed
along with trained linguistic models, which are free for non-commercial use
under the CC BY-NC-SA license. We will also briefly discuss the recent release
of the Czech Named Entity Corpus 2.0, which was used as training material for
the named entity recognition tool.