OS: Linux

NameTag 3

1. Introduction

NameTag 3 is an open-source tool for both flat and nested named entity recognition (NER). NameTag 3 identifies proper names in text and classifies them into a set of predefined categories, such as names of persons, locations, organizations, etc.

NameTag 3 offers state-of-the-art or near state-of-the-art performance in English, German, Spanish, Dutch, Czech and Ukrainian.

NameTag is available in the following versions:

NameTag 3 is free software under the Mozilla Public License 2.0, and the linguistic models are free for non-commercial use and distributed under the CC BY-NC-SA license, although for some models the original data used to create the model may impose additional licensing conditions. NameTag is versioned using Semantic Versioning.

Copyright 2024 Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University, Czech Republic.

2. Current Release

NameTag 3 can be used either as a command-line tool or by sending requests to the NameTag web service.

The NameTag 3 source code can be found on GitHub.
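
As a rough illustration of the web service route, the following Python sketch builds a request URL and extracts the recognized output from a response. The endpoint URL, the parameter names (`data`, `model`, `output`) and the `result` field of the JSON response are assumptions based on the publicly hosted LINDAT service and should be checked against the web service documentation; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the LINDAT-hosted NameTag REST API (not stated on this page).
NAMETAG_URL = "https://lindat.mff.cuni.cz/services/nametag/api/recognize"


def build_request_url(text, model=None, output="xml", base_url=NAMETAG_URL):
    """Build a GET URL for the recognize method.

    The parameter names 'data', 'model' and 'output' are assumptions.
    """
    params = {"data": text, "output": output}
    if model is not None:
        params["model"] = model
    return base_url + "?" + urlencode(params)


def extract_result(response_body):
    """Return the annotated text, assuming a JSON body with a 'result' field."""
    return json.loads(response_body)["result"]


def recognize(text, model=None):
    """Send the request over the network and return the annotated text."""
    with urlopen(build_request_url(text, model=model)) as response:
        return extract_result(response.read().decode("utf-8"))
```

Calling `recognize("Václav Havel was born in Prague.", model="nametag3-czech-cnec2.0-240830")` would then return the annotated text, provided the service is reachable and the assumptions above hold.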

3. Models

The individual models are described on the NameTag 3 Models webpage and are distributed via the LINDAT repository. The latest version is 240830.

4. Results at a Glance

Corpus                             NameTag 2  NameTag 3  NameTag 3 Model
CNEC 2.0 fine-grained (nested)     83.44      86.39      nametag3-czech-cnec2.0-240830
CNEC 2.0 coarse (nested)           87.04      89.29      nametag3-czech-cnec2.0-240830
English CoNLL-2003 (flat)          91.68      93.85      nametag3-multilingual-conll-240830
German CoNLL-2003 (flat)           82.65      87.07      nametag3-multilingual-conll-240830
Dutch CoNLL-2002 (flat)            91.17      94.42      nametag3-multilingual-conll-240830
Spanish CoNLL-2002 (flat)          88.55      89.90      nametag3-multilingual-conll-240830
Ukrainian Lang-uk (flat)           88.73      91.73      nametag3-multilingual-conll-240830
CNEC 2.0 CoNLL (4 labels, flat)    N/A        86.35      nametag3-multilingual-conll-240830
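
To make the comparison easier to read, the short Python sketch below computes the absolute difference between the NameTag 3 and NameTag 2 scores for each corpus. The numbers are copied from the table above; the function name `score_gains` is hypothetical, and the corpus without a NameTag 2 score is skipped.

```python
# (NameTag 2 score, NameTag 3 score) per corpus, copied from the results table.
# None marks the entry reported as N/A.
SCORES = {
    "CNEC 2.0 fine-grained (nested)": (83.44, 86.39),
    "CNEC 2.0 coarse (nested)": (87.04, 89.29),
    "English CoNLL-2003 (flat)": (91.68, 93.85),
    "German CoNLL-2003 (flat)": (82.65, 87.07),
    "Dutch CoNLL-2002 (flat)": (91.17, 94.42),
    "Spanish CoNLL-2002 (flat)": (88.55, 89.90),
    "Ukrainian Lang-uk (flat)": (88.73, 91.73),
    "CNEC 2.0 CoNLL (4 labels, flat)": (None, 86.35),
}


def score_gains(scores):
    """Return NameTag 3 minus NameTag 2 per corpus, rounded to two decimals."""
    return {
        corpus: round(new - old, 2)
        for corpus, (old, new) in scores.items()
        if old is not None  # skip corpora with no NameTag 2 result
    }


if __name__ == "__main__":
    for corpus, gain in score_gains(SCORES).items():
        print(f"{corpus}: +{gain:.2f}")
```

On these figures the largest difference is on German CoNLL-2003 and the smallest on Spanish CoNLL-2002.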

5. License

NameTag 3 is free software under the Mozilla Public License 2.0.

The associated models and data are free for non-commercial use and are licensed under CC BY-NC-SA, although for some models the original data used to create the model may impose additional licensing conditions.

If you use this tool for scientific work, please give us credit by referencing Straková et al. (2019) (see BibTeX for referencing).

6. Acknowledgements

Acknowledgements for the individual language models are listed on the NameTag 3 Models page.

This work has been supported by the Grant Agency of the Czech Republic under the EXPRO program as project “LUSyD” (project No. GX20-16819X). The work described herein has also used data provided by the LINDAT/CLARIAH-CZ Research Infrastructure, supported by the Ministry of Education, Youth and Sports of the Czech Republic (Project No. LM2023062).

6.1. Publications

Straková Jana, Straka Milan, Hajič Jan: Neural Architectures for Nested NER through Linearization. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Copyright © Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-950737-48-2, pp. 5326-5331, 2019.

@inproceedings{strakova-etal-2019-neural,
    title = "Neural Architectures for Nested {NER} through Linearization",
    author = "Strakov{\'a}, Jana  and
      Straka, Milan  and
      Haji{\v{c}}, Jan",
    editor = "Korhonen, Anna  and
      Traum, David  and
      M{\`a}rquez, Llu{\'\i}s",
    booktitle = "Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2019",
    address = "Florence, Italy",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/P19-1527",
    doi = "10.18653/v1/P19-1527",
    pages = "5326--5331",
}

7. Contact

Authors:

NameTag website.
