In this grant project we will investigate the behavior of non-trivial syntactic trees. In doing so, we will address several computational linguistics tasks.
The first task is to collect and prepare data from which elliptical constructions can be extracted. We plan to start with two or three languages and then extend our experiments to as many languages as possible. However, this is not an easy task, because syntactically annotated resources in which ellipsis is also marked are scarce.
Second, we will design and explore different types of representations of elliptical constructions from a cross-linguistic perspective. It is important to create a universal approach that can be applied to the majority of languages.
Next, we will explore parsing and learning tools and algorithms applied to the prepared data. This task may require not only tuning existing tools but also developing a novel method.
Despite the considerable attention paid to universal morphosyntactic annotation and the efforts to develop cross-linguistically consistent treebank annotation for many languages, elliptical constructions remain an open issue.
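As an illustration of the data-collection task, the sketch below scans CoNLL-U text for two signals that Universal Dependencies (version 2) uses for some kinds of ellipsis: the `orphan` relation in the basic tree and empty nodes (decimal IDs such as `5.1`) in the enhanced representation. It is a minimal sketch and not the project's actual pipeline; the function name `find_ellipsis_candidates` and the hand-made sample sentence are assumptions introduced here for illustration, and the sample leaves the enhanced dependency column unfilled for brevity.

```python
def find_ellipsis_candidates(conllu_text):
    """Return (sentence_index, token_id, form, reason) for ellipsis candidates.

    Hypothetical helper: flags tokens attached with the `orphan` relation
    and empty nodes (IDs containing a dot) in CoNLL-U formatted text.
    """
    hits = []
    sent_index = 0
    in_sentence = False
    for line in conllu_text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if in_sentence:
                sent_index += 1
                in_sentence = False
            continue
        if line.startswith("#"):
            continue  # sentence-level comment
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed token line
        in_sentence = True
        tok_id, form, deprel = cols[0], cols[1], cols[7]
        if "." in tok_id:
            hits.append((sent_index, tok_id, form, "empty_node"))
        elif deprel.split(":")[0] == "orphan":
            hits.append((sent_index, tok_id, form, "orphan"))
    return hits


# Hand-made example (an assumption, not taken from a real treebank):
# "Mary won gold and Peter bronze."
ROWS = [
    ["1", "Mary", "Mary", "PROPN", "_", "_", "2", "nsubj", "_", "_"],
    ["2", "won", "win", "VERB", "_", "_", "0", "root", "_", "_"],
    ["3", "gold", "gold", "NOUN", "_", "_", "2", "obj", "_", "_"],
    ["4", "and", "and", "CCONJ", "_", "_", "5", "cc", "_", "_"],
    ["5", "Peter", "Peter", "PROPN", "_", "_", "2", "conj", "_", "_"],
    ["5.1", "won", "win", "VERB", "_", "_", "_", "_", "_", "_"],
    ["6", "bronze", "bronze", "NOUN", "_", "_", "5", "orphan", "_", "_"],
    ["7", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"],
]
SAMPLE = "# text = Mary won gold and Peter bronze.\n"
SAMPLE += "\n".join("\t".join(row) for row in ROWS) + "\n\n"

for hit in find_ellipsis_candidates(SAMPLE):
    print(hit)
```

Such a scan only finds ellipsis that the treebank annotators marked explicitly, which is why the scarcity of annotated resources noted above limits what can be extracted this way.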
Kira Droganova, Daniel Zeman: "Elliptic Constructions: Spotting Patterns in UD Treebanks". In the online proceedings of the NoDaLiDa Workshop on Universal Dependencies (UDW 2017), NoDaLiDa 2017, Gothenburg, May 22.