Coreference in Universal Dependencies (CorefUD) is an initiative to collect coreference corpora in various languages and harmonize them to the same scheme and data format (CoNLL-U).

CorefUD 1.1, the current version of the collection, can be downloaded from http://hdl.handle.net/11234/1-5053.
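Since the collection is distributed in CoNLL-U, a downloaded file can be inspected with a few lines of Python. The following is only a minimal sketch, not part of the CorefUD tooling: it relies on the standard ten tab-separated CoNLL-U columns and assumes that coreference annotation is stored in an `Entity` attribute of the MISC column. The helper names (`parse_misc`, `read_tokens`) and the two-token sample are hypothetical and invented for illustration; please check the CorefUD documentation for the exact syntax of the attribute values.

```python
# Minimal sketch: read CoNLL-U text and expose the MISC column as a dict.
# Assumption: coreference annotation lives in an "Entity" attribute of MISC.

CONLLU_COLUMNS = ["id", "form", "lemma", "upos", "xpos",
                  "feats", "head", "deprel", "deps", "misc"]


def parse_misc(misc):
    """Turn 'A=1|B=2' into {'A': '1', 'B': '2'}; '_' means empty."""
    if misc == "_":
        return {}
    attrs = {}
    for item in misc.split("|"):
        key, _, value = item.partition("=")
        attrs[key] = value
    return attrs


def read_tokens(lines):
    """Yield one dict per token line, skipping comments and blank lines."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(CONLLU_COLUMNS):
            continue  # not a well-formed token line
        token = dict(zip(CONLLU_COLUMNS, fields))
        token["misc"] = parse_misc(token["misc"])
        yield token


# Hypothetical two-token sample, invented for illustration only;
# real CorefUD entity values carry more information than "(e1)".
SAMPLE = (
    "# sent_id = demo-1\n"
    "1\tMary\tMary\tPROPN\t_\t_\t2\tnsubj\t_\tEntity=(e1)\n"
    "2\tsleeps\tsleep\tVERB\t_\t_\t0\troot\t_\t_\n"
)

tokens = list(read_tokens(SAMPLE.splitlines()))
mentions = [(t["form"], t["misc"]["Entity"])
            for t in tokens if "Entity" in t["misc"]]
```

To process a real file, pass an open file handle instead of the sample, e.g. `read_tokens(open(path, encoding="utf-8"))`, where `path` points to one of the downloaded `.conllu` files.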

We are organizing the CRAC 2023 Shared Task on Multilingual Coreference Resolution, which follows the 2022 edition of the shared task.

If you want to learn more about the collection, please have a look at the publications and presentations listed below.

Feel free to write to us if you have any questions: Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, and Daniel Zeman.

CorefUD releases

  • CorefUD 1.1 (February 24, 2023) [URL]
  • CorefUD 1.0 (April 4, 2022) [URL]
  • CorefUD 0.2 (December 12, 2021) [URL]
  • CorefUD 0.1 (March 11, 2021) [URL]

Publications

  • Žabokrtský Zdeněk, Konopík Miloslav, Nedoluzhko Anna, Novák Michal, Ogrodniczuk Maciej, Popel Martin, Pražák Ondřej, Sido Jakub, Zeman Daniel, Zhu Yilun: Findings of the Shared Task on Multilingual Coreference Resolution. In: Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution, Copyright © Association for Computational Linguistics, Gyeongju, Korea, ISSN 2951-2093, pp. 1-17, 2022 [URL]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeldes Amir, Zeman Daniel: CorefUD 1.0: Coreference Meets Universal Dependencies. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), Copyright © European Language Resources Association, Marseille, France, ISBN 979-10-95546-72-6, pp. 4859-4872, 2022 [URL]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Daniel: Is one head enough? Mention heads in coreference annotations compared with UD-style heads. In: Proceedings of the Sixth International Conference on Dependency Linguistics (Depling, SyntaxFest 2021), Copyright © Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-955917-14-8, pp. 101-114, 2021 [URL]
  • Popel Martin, Žabokrtský Zdeněk, Nedoluzhko Anna, Novák Michal, Zeman Daniel: Do UD Trees Match Mention Spans in Coreference Annotations? In: Findings of the Association for Computational Linguistics: EMNLP 2021, Copyright © Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-955917-10-0, pp. 3570-3576, 2021 [URL]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Daniel: Coreference meets Universal Dependencies – a pilot experiment on harmonizing coreference datasets for 11 languages. Technical report no. 2021/66, Copyright © ÚFAL MFF UK, Praha, Czechia, ISSN 1214-5521, 65 pp., Apr 2021 [PDF]

Presentations

  • Yu Juntao, Novák Michal: The recent developments in Universal Anaphora Scorer. Invited talk at CRAC 2022 (October 17, 2022) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeldes Amir, Zeman Dan: CorefUD 1.0: Coreference Meets Universal Dependencies. LREC 2022 oral presentation (June 2022) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Dan, Zeldes Amir: The Universal Anaphora Extension of the CONLL-U Markup Scheme. Universal Anaphora panel at CRAC 2021 (November 11, 2021) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Dan: CorefUD 0.1 – a pilot experiment on harmonizing coreference datasets for 11 languages. ÚFAL Monday Seminar (April 19, 2021) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Dan: CorefUD 0.1 – a pilot experiment on harmonizing coreference datasets for 11 languages. Universal Anaphora Workshop (April 9, 2021) [PDF]

How to cite

When using CorefUD, please cite the following LREC paper:

@inproceedings{nedoluzhko-etal-2022-corefud,
    title = "{C}oref{UD} 1.0: Coreference Meets {U}niversal {D}ependencies",
    author = "Nedoluzhko, Anna and Nov{\'a}k, Michal and Popel, Martin and {\v{Z}}abokrtsk{\'y}, Zden{\v{e}}k and Zeldes, Amir and Zeman, Daniel",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.520",
    pages = "4859--4872",
}