Vilem Mathesius Lecture Series 21
Prague Treebanking for Everyone: A two-day tutorial
The tutorial is over. Slides and video recordings of the lectures are available; see below.
- DATE November 28-29, 2006
- VENUE Krystal Hotel, Jose Martiho 407/2, Prague 6 (on the map), 2nd floor, room No. 249
- AUDIENCE 90 participants
- ABSTRACT The tutorial will introduce the Prague Dependency Treebank project, which aims at a complex manual annotation of a substantial number of naturally occurring sentences in continuous Czech texts. The Prague Dependency Treebank has three levels of annotation: morphological, analytical (describing surface syntax in a dependency fashion), and tectogrammatical, which combines syntax and sentence semantics into a representation of language meaning, keeping the dependency structure as the core of the annotation but adding basic coreferential links, topic/focus annotation, and detailed semantic labeling of every sentence unit. More Prague treebanks (the Czech Academic Corpus, the Prague Czech-English Dependency Treebank, and the Prague Arabic Dependency Treebank) will be introduced as well. In addition to the data, all the treebank and data processing tools will be discussed in detail. This tutorial is intended for students, researchers, and practitioners in natural language processing who want to see how the broadly annotated data and the annotation and data processing tools have been built in the Prague treebanking projects. The fact that the annotations and tools can be used in a general way could be a strong motivation for all attendees.
- OUTLINE (When playing the Flash video files, we strongly recommend having the corresponding slides open.)
SESSION TITLE (speaker)
DAY 1
Part 1
9:30-11:00 DATA: The Prague Dependency Treebank Introduction, Morphology (Jan Hajič)
Part 2
11:30-13:00 DATA (cont.): PDT Surface Dependency Syntax, "Deep" (Tectogrammatical) Syntax (Jan Hajič)
Part 3
14:30-16:00 DATA (cont.): PDT
- Grammatemes (Zdeněk Žabokrtský)
- "Deep" Syntax: topic/focus and deep word order (Eva Hajičová)
- Coreference (Eva Hajičová)
Part 4
16:30-18:00 DATA (cont.): PDT Valency (Jan Hajič)
DAY 2
Part 5
9:30-11:00 TOOLS
- Annotation editors
  - m-layer: LAW (Jaroslava Hlaváčová)
  - [at]-layer, valency lexicon: TrEd (Jan Štěpánek)
- Browsers and viewers
  - m-layer: Bonito (Jaroslava Hlaváčová)
  - [at]-layer: Netgraph (Jiří Mírovský)
Part 6
11:30-12:30 DATA: The Prague Mark-up Language (Petr Pajas)
Part 7
14:30-16:00 TOOLS (cont.)
- Automatic processing of data (Jan Štěpánek)
- STYX - an electronic exercise book of Czech (Ondřej Kučera)
Part 8
16:30-18:00 DATA: More Prague Treebanks
- Prague Czech-English Dependency Treebank (Jan Hajič)
- Prague Arabic Dependency Treebank (Otakar Smrž)
Good bye ... (Jan Hajič)
Complete tutorial notes pdf [4.1 MB]
OTHER INFO
- HOW TO GET info
- PREREQUISITES Acquaintance with basic issues in corpus and computational linguistics will be useful, but is not mandatory.
- REGISTRATION
There is ***NO*** participation fee.
Please send a message to Barbora Hladka at hladka@ufal.mff.cuni.cz telling us whether
- you plan to come (so that we know how large a lecture room to reserve)
- you wish us to book a room for you in the Krystal Hotel (where the tutorial will be held) - if yes, please contact Anna Kotesovcova by November 15, 2006; after that date, the special prices (30 EUR for a single room, 20 EUR for a double room) will no longer be available at the Krystal Hotel. If you prefer to stay in the historic center of Prague, see the special offer.
- you wish to have lunches served in the hotel restaurant. Lunches are at your own expense (approx. 7 EUR).
- TIME SCHEDULE
           9:30-11:00  11:30-13:00  14:30-16:00  16:30-18:00
Tuesday    Part 1      Part 2       Part 3       Part 4
Wednesday  Part 5      Part 6       Part 7       Part 8
- CO-LOCATED EVENTS
- Vilem Mathesius Series of Lectures
- "Treebanks and Linguistic Theories" Conference
- "Treebanking & Advanced Processing of Arabic" Workshop
- DOWNLOAD POSTER (pdf)
- ORGANIZERS
Jan Hajic, Barbora Hladka
Institute of Formal and Applied Linguistics
Charles University
Malostranske nam. 25
118 00 Prague
Czech Republic
tel.: +420-221 914 223
fax: +420-221 914 304
e-mail: {hajicova, hajic, hladka}@ufal.mff.cuni.cz