Year 3 - 2021

The work continues in accordance with the plan, in four work packages:

WP8: Remaining manual checks of and additions to CzeDLex; we expect to process the remaining third of the entries.

WP9: Manual discourse annotation of (a sample of approx. 1 thousand) non-journalistic texts (TED talks, fiction); checking the possibility of annotation projection for parallel non-journalistic texts with English discourse annotation available.

WP10: Checking/extending coverage of CzeDLex with respect to the non-journalistic texts from WP9.

WP11: Third version of a discourse parser for Czech, using the newest version of CzeDLex and also tested on newly annotated non-journalistic texts.


The final version of CzeDLex will be published, reflecting all checks, corrections and additions from WP8 and WP10. It will be used in the development of the third version of a discourse parser for Czech, which will also be tested on newly annotated non-journalistic texts (WP9, WP11). An article with results from the second year of the project will be presented at an international conference and published in its proceedings. Theoretical and practical results of the third year and of the whole project will be used to prepare an article that will be submitted for publication in a journal (The Prague Bulletin of Mathematical Linguistics or similar).