Coreference in Universal Dependencies (CorefUD) is an initiative to collect coreference corpora in various languages and harmonize them to the same scheme and data format (CoNLL-U).

CorefUD 1.0, the current version of the collection, can be downloaded from http://hdl.handle.net/11234/1-4698.

We are organizing the CRAC 2022 Shared Task on Multilingual Coreference Resolution.

If you want to learn more about the collection, please have a look at

  • an ÚFAL technical report
  • slides presented at the Universal Anaphora Workshop (April 9, 2021)
  • slides presented at the ÚFAL Monday Seminar (April 19, 2021)
  • slides presented at the Universal Anaphora panel at CRAC 2021 (November 11, 2021)
  • CorefUD 1.0 file format description: corefud-1.0-format.pdf (February 18, 2022)
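
Since the collection is distributed in CoNLL-U, a minimal Python sketch of reading one token line may help as orientation. It relies only on the general CoNLL-U layout (ten tab-separated columns, with the last column, MISC, holding `|`-separated `Key=Value` pairs). The sample line, the helper names `parse_token_line` and `parse_misc`, and the `Entity=(e1)` value are assumptions made up for illustration; the actual coreference attributes and their syntax are defined in the file format description listed above.

```python
from typing import Dict

# The ten standard CoNLL-U columns, in order.
CONLLU_COLUMNS = [
    "ID", "FORM", "LEMMA", "UPOS", "XPOS",
    "FEATS", "HEAD", "DEPREL", "DEPS", "MISC",
]


def parse_token_line(line: str) -> Dict[str, str]:
    """Split one CoNLL-U token line into a column-name -> value mapping."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(CONLLU_COLUMNS):
        raise ValueError(f"expected 10 tab-separated fields, got {len(fields)}")
    return dict(zip(CONLLU_COLUMNS, fields))


def parse_misc(misc: str) -> Dict[str, str]:
    """Parse the MISC column ('_' means empty) into a dictionary."""
    if misc == "_":
        return {}
    attrs: Dict[str, str] = {}
    for item in misc.split("|"):
        # partition splits on the first '=' only, so values may contain '='.
        key, sep, value = item.partition("=")
        attrs[key] = value if sep else ""
    return attrs


# Hypothetical token line; the Entity value is an invented placeholder,
# not taken from any CorefUD corpus.
SAMPLE = "1\tJohn\tJohn\tPROPN\t_\t_\t2\tnsubj\t_\tEntity=(e1)|SpaceAfter=No"

token = parse_token_line(SAMPLE)
misc = parse_misc(token["MISC"])
print(token["FORM"], misc)
```

Comment lines (starting with `#`) and blank lines between sentences are not handled here; a real reader would skip or collect them before calling `parse_token_line`.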

Feel free to write to us if you have any questions: Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, and Daniel Zeman.