In this grant project, we explore similarities among natural languages and apply our findings to two types of computational linguistics tasks that address current problems of natural language processing at the syntactic level.
The first task type is cross-lingual projection technologies, where a model of one language is used to approximately model a similar language for which sufficient language resources are not available.
The second task type is the portability of monolingual technologies, where tools and procedures developed for one or a few languages are generalized so that they can process any, or nearly any, language for which sufficient data are available.
Although vast language resources exist for a number of languages, practice often shows that these tasks are hard to solve successfully. The available resources are usually very heterogeneous: they use different annotation schemes and are built on different linguistic traditions and conventions. A necessary intermediate step toward the main goals of the project is therefore to collect and harmonize existing syntactically annotated language corpora (see HamleDT).