The project aims to develop an automatic method for shallow discourse parsing of Czech. It will draw on several key resources. Some of them already exist (most importantly the Prague Dependency Treebank, the Penn Discourse Treebank, and the Prague Czech-English Dependency Treebank); others will be developed within the project using cost-effective methods (an electronic lexicon of discourse connectives and additional discourse-annotated data).

The main goals of the project are:

  • to develop an electronic lexicon of Czech discourse connectives
  • to develop a shallow discourse parser for Czech that makes use of the lexicon
  • to obtain and use additional annotated data through annotation projection, complementing the existing discourse-annotated data
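
The annotation projection named in the last goal can be illustrated with a minimal sketch. It assumes a sentence-aligned English-Czech pair with word alignments already available, and it copies a discourse connective label from each English token to the Czech token aligned with it. The function name `project_connectives`, the hand-written alignment, and the sense label are hypothetical illustrations, not the project's actual method or data.

```python
from typing import Dict, List, Tuple


def project_connectives(
    src_labels: Dict[int, str],
    alignment: List[Tuple[int, int]],
) -> Dict[int, str]:
    """Copy each labeled source token's label to the target tokens aligned with it.

    src_labels maps a source token index to a label.
    alignment is a list of (source index, target index) pairs.
    """
    projected: Dict[int, str] = {}
    for src_idx, tgt_idx in alignment:
        if src_idx in src_labels:
            projected[tgt_idx] = src_labels[src_idx]
    return projected


# Hypothetical sentence pair; the alignment is written by hand for illustration.
english = ["He", "stayed", "home", "because", "he", "was", "ill", "."]
czech = ["Zůstal", "doma", ",", "protože", "byl", "nemocný", "."]

# Assumed source-side annotation: token 3 ("because") is a connective.
src_labels = {3: "Contingency.Cause"}

# (English index, Czech index) pairs, as a word aligner might produce them.
alignment = [(1, 0), (2, 1), (3, 3), (5, 4), (6, 5), (7, 6)]

projected = project_connectives(src_labels, alignment)
for idx, label in projected.items():
    print(czech[idx], label)
```

In this toy example the label on "because" lands on "protože". A realistic pipeline would also have to handle unaligned connectives, one-to-many alignments, and connectives that are left implicit in the translation.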