The Linguistic Annotation Workshop 2007

Working Groups (Webpage)

In addition to paper presentations and software demos, there will be a few invited "working group" presentations. Each will lay out the dimensions of a crucial problem facing the field of corpus annotation, particularly problems that involve merging annotations and extending annotation to new languages, genres, and modalities. The final list of working group topics will appear on the workshop website by February 15, 2007.

Our preliminary topics include:
(a) selection of diverse or balanced corpora with few licensing restrictions for common annotation by the community. Possible corpora include the "open" portion of the American National Corpus and Wikipedia XML, a freely available, cleaned-up corpus derived from Wikipedia;
(b) approaches to discourse coherence, especially as it arises from the interaction of different annotation layers, and its applications to computational linguistics;
(c) annotation systems/frameworks and interoperability, including the feasibility of applying a common annotation framework to various annotation types, language processing tasks, modalities, and languages, especially as it could enable the merging of annotations of diverse phenomena produced by different systems.

We will attempt to lay out clearly and precisely the assumptions that members of the annotation community hold on these topics. In doing so, we hope both to (1) lay the foundations for the meaningful integration of annotation resources and (2) assess the limitations of integrated approaches.
