Syntactically annotated corpora are usually displayed as trees. However,
users do not always need to view trees; a linear representation of the text enriched with a few features (such as dependency relations and information about the parent node) can be sufficient.
We annotated CWC, a large corpus of Czech that was automatically downloaded from the web. The corpus was tagged with the Featurama tagger and parsed with the MST parser (note that the version of the corpus in LINDAT is in plain-text format only). The attributes are:
node: form, lemma, tag, afun
parent: p_form, p_lemma, p_tag, p_afun, parent (distance in tokens to the parent, e.g. -1 = one token to the left, +5 = five tokens to the right)
effective parent: ep_form, ep_lemma, ep_tag, ep_afun, eparent (distance in tokens to the effective parent)
Earlier versions had the attributes p_distance (whether the parent is immediate or distant with respect to the node) and p_position (whether the parent is to the left or to the right of the node). These have been replaced by parent/eparent, in line with the Czech National Corpus style.
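To make the meaning of the parent/eparent values concrete, here is a minimal Python sketch of how such a signed distance could be derived from the head indices of a parsed sentence. The function name parent_offsets, the 1-based head convention with 0 for the root, and the "0" value emitted for the root are all assumptions made for illustration; the actual conversion used for the corpus, and its treatment of the root node, may differ.

```python
def parent_offsets(heads):
    """Return signed token distances from each node to its parent.

    heads[i] is the 1-based position of the parent of token i+1,
    or 0 if the token is the root (an assumed input convention).
    """
    offsets = []
    for position, head in enumerate(heads, start=1):
        if head == 0:
            # Assumption: the root has no parent, so we emit "0".
            offsets.append("0")
        else:
            # Negative = parent is to the left, positive = to the right.
            offsets.append(f"{head - position:+d}")
    return offsets


# Hypothetical three-token sentence whose second token is the root.
forms = ["Petr", "čte", "knihu"]
heads = [2, 0, 2]
for form, offset in zip(forms, parent_offsets(heads)):
    print(form, offset)
```

A value such as -1 then means that the parent is the immediately preceding token, which covers what the earlier p_distance and p_position attributes expressed separately.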
Examples of queries: