Implicit relations in text coherence

The project (supported by GAČR grant GA 17-03461S) deals with discourse relations and textual coherence, specifically with describing and explaining how discourse relations between sentences are understood when the semantics of the relation cannot be inferred from the meaning of a discourse connective (a conjunction, etc.). In these cases, the discourse connective is either not expressed in the text (a so-called implicit discourse relation) or its semantics is underspecified.

Implicit discourse relations in Czech were subjected to a comprehensive analysis. For this research, we created the annotated corpus PDiT-EDA 1.0, on which we investigated the distribution of implicit discourse relations in comparison with explicit ones and determined how a number of factors (relation semantics, sentence realization, negation, text genre, etc.) influence whether a relation is expressed explicitly or implicitly. We then used psycholinguistic experiments to verify whether some discourse relations can be expressed implicitly.

Related publications:

  • Zikánová Šárka: Implicitní diskurzní vztahy v češtině. Charles University, Faculty of Mathematics and Physics, Prague, Czech Republic, ISBN 9788088132127, 192 pp., 2021
  • Zikánová Šárka, Mírovský Jiří, Synková Pavlína: Explicit and Implicit Discourse Relations in the Prague Discourse Treebank. In: Lecture Notes in Computer Science, Vol. 11697, Proceedings of the 22nd International Conference on Text, Speech and Dialogue - TSD 2019, Springer International Publishing, Cham / Heidelberg / New York / Dordrecht / London, ISBN 978-3-030-27946-2, ISSN 0302-9743, pp. 236-248, 2019

Underspecified discourse connectives were examined in a cross-linguistic comparison of Czech, Hungarian, Lithuanian, French and English. For this research, subtitle translations of TED talks in the individual languages were annotated in parallel. We investigated the extent to which translators accept underspecification in the original language and how they handle it during translation. At the same time, we tracked the same processes (semantic shifts, implicitation) across translations into different languages.

Related publications:

 

Partial analyses then focused on specific issues that arose during the project, for example discourse structures with external arguments, automatic evaluation of text coherence, and features of textual coherence in various text genres.

Related publications:

  • Poláková Lucie, Mírovský Jiří: Connectives with both Arguments External: A Survey on Czech. CICLing: International Conference on Computational Linguistics and Intelligent Text Processing, La Rochelle. 2019. Accepted for publication in: Lecture Notes in Computer Science, Springer Verlag Heidelberg, Heidelberg, Germany, ISSN 0302-9743, pp. 1-12.
  • Bojar Ondřej, Mírovský Jiří, Rysová Kateřina, Rysová Magdaléna: EvalD Reference-Less Discourse Evaluation for WMT18. In: Proceedings of the Third Conference on Machine Translation, Volume 2: Shared Tasks, Association for Computational Linguistics, Stroudsburg, ISBN 978-1-948087-81-0, pp. 545-549, 2018
  • Hajičová Eva, Mírovský Jiří: Discourse Coherence Through the Lens of an Annotated Text Corpus: A Case Study. In: Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC 2018), European Language Resources Association, Paris, France, ISBN 979-10-95546-00-9, pp. 1637-1642, 2018