The Vilem Mathesius Center

Vilem Mathesius Lecture Series 21

Prague Treebanking for Everyone: A two-day tutorial IS OVER

Slides and video recordings of lectures are available - see below.

  • DATE November 28-29, 2006
  • VENUE Krystal Hotel, Jose Martiho 407/2, Prague 6 (on the map), 2nd floor, room No. 249
  • AUDIENCE 90 participants
  • ABSTRACT The tutorial will introduce the Prague Dependency Treebank project, which aims at complex manual annotation of a substantial number of naturally occurring sentences in continuous Czech texts. The Prague Dependency Treebank has three levels of annotation: morphological, analytical (describing surface syntax in a dependency fashion), and tectogrammatical, which combines syntax and sentence semantics into a representation of language meaning, keeping the dependency structure as the core of the annotation but adding basic coreferential links, topic/focus annotation, and detailed semantic labeling of every sentence unit. More Prague treebanks (the Czech Academic Corpus, the Prague Czech-English Dependency Treebank, and the Prague Arabic Dependency Treebank) will be introduced as well. In addition to the data, all the treebank and data processing tools will be discussed in detail. This tutorial is intended for students, researchers, and practitioners in natural language processing who want to see how the broadly annotated data and the annotation and data processing tools have been built in the Prague treebanking projects. The general applicability of the annotations and tools could be a strong motivation for all attendees.
  • OUTLINE (When playing the flash-video files, we strongly recommend having the corresponding slides open.)
    SESSION   TITLE (speaker)
    DAY 1   
    Part 1
    DATA: The Prague Dependency Treebank Introduction, Morphology (Jan Hajič)
    Part 2
    DATA (cntd): PDT Surface Dependency Syntax, "Deep" (Tectogrammatical) Syntax (Jan Hajič)
    Part 3
    DATA (cntd): PDT
    Part 4
    DATA (cntd): PDT Valency (Jan Hajič)
    DAY 2   
    Part 5
    • Annotation editors
    • Browsers and viewers
    Part 6
    DATA: The Prague Mark-up Language (Petr Pajas)
    Part 7
    TOOLS (cntd)
    Part 8
    DATA: More Prague Treebanks
     Good bye ... (Jan Hajič)
     Complete tutorial notes (pdf) [4.1 MB]


  • HOW TO GET info
  • PREREQUISITES Acquaintance with basic issues in corpus and computational linguistics will be useful, but is not mandatory.

    There is ***NO*** participation fee.

    Please send a message to Barbora Hladka at hladka@ufal.mff.cuni.cz telling us whether
    • you plan to come (so that we know how large a lecture room to reserve)
    • you wish us to book a room for you in the Krystal Hotel (where the tutorial will be held) - if yes, please contact Anna Kotesovcova by November 15, 2006; after that date, the special prices at the Krystal Hotel (30 EUR for a single room, 20 EUR for a double room) will no longer be available. If you prefer to stay in the very historic center of Prague, see the special offer.
    • you wish to have lunches served in the hotel restaurant. Lunches will be at your own expense (approx. 7 EUR).


               9:30-11:00  11:30-13:00  14:30-16:00  16:30-18:00
    Tuesday    Part 1      Part 2       Part 3       Part 4
    Wednesday  Part 5      Part 6       Part 7       Part 8


    Jan Hajic, Barbora Hladka
    Institute of Formal and Applied Linguistics
    Charles University
    Malostranske nam. 25
    118 00 Prague
    Czech Republic
    tel.: +420-221 914 223
    fax: +420-221 914 304
    e-mail: {hajicova, hajic, hladka}@ufal.mff.cuni.cz

Content: Petr Homola. Webmasters: Zlatka Šubrová and Juraj Šimlovič.
2007 © Institute of Formal and Applied Linguistics. All Rights Reserved.
