Coreference in Universal Dependencies (CorefUD) is an initiative to collect coreference corpora in various languages and harmonize them to the same scheme and data format (CoNLL-U).

CorefUD 0.2, the current version of the collection, can be downloaded from http://hdl.handle.net/11234/1-4598.

The next version, CorefUD 1.0, is currently being released at LINDAT/CLARIAH-CZ. In the meantime, you can download it from here.

If you want to learn more about the collection, please have a look at

  • an ÚFAL technical report,
  • slides presented at the Universal Anaphora Workshop (April 9, 2021),
  • slides presented at the ÚFAL Monday Seminar (April 19, 2021),
  • slides presented at the Universal Anaphora panel at CRAC 2021 (November 11, 2021).

Feel free to write to us if you have any questions: Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, and Daniel Zeman.