Principal investigator (ÚFAL):
Named entity linking is the task of recognizing names of persons, organizations, geographic locations, and other real-world objects, each consisting of one or more words, and linking them to entries in an existing knowledge base. The first step is to find the occurrences of named entities in the text (named entity recognition, NER). The second step is to determine whether an entry in the knowledge base can be attached to each recognized entity (entity linking, EL). Good linking systems exist for prominent languages such as English, Spanish, and Chinese, mainly because large annotated datasets are available for these languages. No advanced linking system exists for Czech yet. A linking system can be used to extract named entities from any kind of text. It enables advanced queries over the text that are based on factual information rather than only on the occurrences of the words themselves. Examples of such queries include searching for a specified period in a historical document (for example, poets who worked after the year 1867) or searching for one specific person rather than all persons with the same name. Linking systems can also be used in dialogue systems, to track named entities across a conversation, and in translation tasks. The goal of this project is to create an advanced named entity linking system that works in as many languages as possible, with the Czech language as the main focus.
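The two steps described above (NER followed by EL) can be illustrated with a minimal Python sketch. It is a toy illustration under stated assumptions, not the project's method: the knowledge base, its identifiers (KB:1 and so on), the alias table, and the function names are all hypothetical. Recognition is a simple dictionary lookup, and linking picks the candidate whose description shares the most words with the text. The sketch also shows the "specific person, not all persons with the same name" case, since two entries share the name "Neruda".

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    kb_id: str        # hypothetical identifier, not taken from a real knowledge base
    name: str
    description: str


# Toy knowledge base (illustrative only); two entries share the surname "Neruda".
KNOWLEDGE_BASE = {
    "KB:1": Entry("KB:1", "Jan Neruda", "Czech poet and journalist from Prague"),
    "KB:2": Entry("KB:2", "Pablo Neruda", "Chilean poet and diplomat"),
    "KB:3": Entry("KB:3", "Prague", "capital city of the Czech Republic"),
}

# Surface form -> candidate knowledge base entries (hypothetical alias table).
ALIASES = {
    "Pablo Neruda": ["KB:2"],
    "Jan Neruda": ["KB:1"],
    "Neruda": ["KB:1", "KB:2"],
    "Prague": ["KB:3"],
}


def recognize(text):
    """Step 1 (NER): find known names in the text, preferring longer matches."""
    spans = []
    for alias in sorted(ALIASES, key=len, reverse=True):
        for m in re.finditer(r"\b" + re.escape(alias) + r"\b", text):
            # Skip matches that overlap a span already taken by a longer alias.
            if all(m.end() <= s or m.start() >= e for s, e, _ in spans):
                spans.append((m.start(), m.end(), alias))
    return sorted(spans)


def link(mention, text):
    """Step 2 (EL): pick the candidate whose description best overlaps the text."""
    candidates = ALIASES.get(mention, [])
    if not candidates:
        return None  # no knowledge base entry can be attached
    context = set(re.findall(r"\w+", text.lower()))

    def overlap(kb_id):
        words = set(re.findall(r"\w+", KNOWLEDGE_BASE[kb_id].description.lower()))
        return len(words & context)

    return max(candidates, key=overlap)


def link_entities(text):
    """Run both steps and return (mention, kb_id) pairs in text order."""
    return [(surface, link(surface, text)) for _, _, surface in recognize(text)]


if __name__ == "__main__":
    print(link_entities("Neruda wrote about Prague."))
    print(link_entities("Neruda was a Chilean diplomat."))
```

In this sketch the ambiguous mention "Neruda" resolves to different entries depending on the surrounding words. A real system would presumably replace both the dictionary lookup and the word-overlap score with trained models.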