Coreference in Universal Dependencies (CorefUD) is an initiative to collect coreference corpora in various languages and harmonize them to the same scheme and data format (CoNLL-U).

We organized the CRAC 2024 Shared Task on Multilingual Coreference Resolution, which followed the previous editions of the shared task in 2023 and 2022.

If you want to learn more about the collection, please have a look at

Feel free to write to us if you have any questions: Anna Nedoluzhko, Michal Novák, Martin Popel, Zdeněk Žabokrtský, and Daniel Zeman.

CorefUD releases

  • CorefUD 1.2 (March 28, 2024) [URL]
  • CorefUD 1.1 (February 24, 2023) [URL]
  • CorefUD 1.0 (April 4, 2022) [URL]
  • CorefUD 0.2 (December 12, 2021) [URL]
  • CorefUD 0.1 (March 11, 2021) [URL]

Publications

  • Žabokrtský Zdeněk, Konopík Miloslav, Nedoluzhko Anna, Novák Michal, Ogrodniczuk Maciej, Popel Martin, Pražák Ondřej, Sido Jakub, Zeman Daniel: Findings of the Second Shared Task on Multilingual Coreference Resolution. In: Proceedings of the CRAC 2023 Shared Task on Multilingual Coreference Resolution, Copyright © Association for Computational Linguistics, Singapore, DOI 10.18653/v1/2023.crac-sharedtask.1, pp. 1-18, 2023 [URL]
  • Žabokrtský Zdeněk, Konopík Miloslav, Nedoluzhko Anna, Novák Michal, Ogrodniczuk Maciej, Popel Martin, Pražák Ondřej, Sido Jakub, Zeman Daniel, Zhu Yilun: Findings of the Shared Task on Multilingual Coreference Resolution. In: Proceedings of the CRAC 2022 Shared Task on Multilingual Coreference Resolution, Copyright © Association for Computational Linguistics, Gyeongju, Korea, ISSN 2951-2093, pp. 1-17, 2022 [URL]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeldes Amir, Zeman Daniel: CorefUD 1.0: Coreference Meets Universal Dependencies. In: Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022), Copyright © European Language Resources Association, Marseille, France, ISBN 979-10-95546-72-6, pp. 4859-4872, 2022 [URL]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Daniel: Is one head enough? Mention heads in coreference annotations compared with UD-style heads. In: Proceedings of the Sixth International Conference on Dependency Linguistics (Depling, SyntaxFest 2021), Copyright © Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-955917-14-8, pp. 101-114, 2021 [URL]
  • Popel Martin, Žabokrtský Zdeněk, Nedoluzhko Anna, Novák Michal, Zeman Daniel: Do UD Trees Match Mention Spans in Coreference Annotations? In: Findings of the Association for Computational Linguistics: EMNLP 2021, Copyright © Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-955917-10-0, pp. 3570-3576, 2021 [URL]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Daniel: Coreference meets Universal Dependencies – a pilot experiment on harmonizing coreference datasets for 11 languages. Technical report no. 2021/66, Copyright © ÚFAL MFF UK, Praha, Czechia, ISSN 1214-5521, 65 pp., Apr 2021 [PDF]

Presentations

  • Yu Juntao, Novák Michal: The recent developments in Universal Anaphora Scorer. Invited talk at CRAC 2022 (October 17, 2022) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeldes Amir, Zeman Dan: CorefUD 1.0: Coreference Meets Universal Dependencies. LREC 2022 oral presentation (June 2022) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Dan, Zeldes Amir: The Universal Anaphora Extension of the CONLL-U Markup Scheme. Universal Anaphora panel at CRAC 2021 (November 11, 2021) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Dan: CorefUD 0.1 – a pilot experiment on harmonizing coreference datasets for 11 languages. ÚFAL Monday Seminar (April 19, 2021) [PDF]
  • Nedoluzhko Anna, Novák Michal, Popel Martin, Žabokrtský Zdeněk, Zeman Dan: CorefUD 0.1 – a pilot experiment on harmonizing coreference datasets for 11 languages. Universal Anaphora Workshop (April 9, 2021) [PDF]

How to cite

When using CorefUD, please cite the following LREC paper:

@inproceedings{nedoluzhko-etal-2022-corefud,
    title = "{C}oref{UD} 1.0: Coreference Meets {U}niversal {D}ependencies",
    author = "Nedoluzhko, Anna and Nov{\'a}k, Michal and Popel, Martin and {\v{Z}}abokrtsk{\'y}, Zden{\v{e}}k and Zeldes, Amir and Zeman, Daniel",
    booktitle = "Proceedings of the Thirteenth Language Resources and Evaluation Conference",
    month = jun,
    year = "2022",
    address = "Marseille, France",
    publisher = "European Language Resources Association",
    url = "https://aclanthology.org/2022.lrec-1.520",
    pages = "4859--4872",
}