Monday, February 22, 2016 - 13:30

Exploiting KonText for querying corpora from the Lindat repository


In this presentation, I will describe how KonText, a corpus query interface developed for the Czech National Corpus, has been adapted to handle various corpora from the Lindat repository. The repository contains corpora with different types of annotation, including syntactic, shallow semantic, and sentiment annotation. I will show how to search for this information within the KonText environment using CQL (Corpus Query Language), taking two corpora as examples. First, I will focus on Universal Dependencies, a collection of syntactically annotated treebanks for several languages. Second, I will demonstrate queries over the Prague Dependency Treebank that target both the syntactic (analytical) and the deep syntactic (tectogrammatical) layers. I will also show several query examples from other Lindat corpora.

The talk will be given in English.