We present the Czech Court Decisions Dataset (CCDD), a dataset of 300 court decisions published by the Supreme Court of the Czech Republic (SC) and the Constitutional Court of the Czech Republic (CC). In these decisions, selected entities are manually detected and classified.
CCDD contains 150 court decisions published by the Supreme Court of the Czech Republic in 2012, selected randomly with respect to their distribution over the senates. It also contains 150 court decisions published by the Constitutional Court of the Czech Republic between 2004 and 2012.
The following entities are recognized in CCDD: institutions, court decision references, act references, and applicabilities.
In addition, court decision references are linked to the institutions that issued them. Each applicability entity follows an act reference. For manual annotation, we used the web-based annotation tool Brat. The annotators marked entity occurrences and labeled them with the appropriate tag. They then marked relations between court decisions and institutions.
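The source does not state the file format in which the annotations are released. Assuming they follow Brat's standoff format (tab-separated `T` lines for text-bound entities and `R` lines for binary relations), a minimal reading sketch could look like the one below. The tag names `Decision`, `Institution`, and `IssuedBy`, and the sample text, are invented for illustration and are not taken from CCDD.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    id: str
    label: str
    spans: list  # list of (start, end) character offsets
    text: str


@dataclass
class Relation:
    id: str
    label: str
    arg1: str  # id of the first entity
    arg2: str  # id of the second entity


def parse_standoff(ann_text):
    """Parse Brat standoff lines into entities (T*) and binary relations (R*)."""
    entities, relations = {}, []
    for line in ann_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # "T1<TAB>Label start end[;start end ...]<TAB>covered text"
            label, offsets = fields[1].split(" ", 1)
            spans = [tuple(int(x) for x in part.split())
                     for part in offsets.split(";")]
            text = fields[2] if len(fields) > 2 else ""
            entities[ann_id] = Entity(ann_id, label, spans, text)
        elif ann_id.startswith("R"):
            # "R1<TAB>Label Arg1:T1 Arg2:T2"
            label, a1, a2 = fields[1].split()
            relations.append(
                Relation(ann_id, label, a1.split(":", 1)[1], a2.split(":", 1)[1])
            )
    return entities, relations


# Hypothetical sample: tag names and the case number are made up.
sample = (
    "T1\tDecision 0 16\t21 Cdo 1234/2010\n"
    "T2\tInstitution 17 30\tNejvyšší soud\n"
    "R1\tIssuedBy Arg1:T1 Arg2:T2\n"
)

entities, relations = parse_standoff(sample)
for rel in relations:
    print(entities[rel.arg1].text, "->", entities[rel.arg2].text)
```

Other Brat annotation types (events, attributes, notes) are skipped by this sketch, since the text above mentions only entities and relations.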
Each of the 300 court decisions was annotated once. To measure inter-annotator agreement, however, we randomly selected 15 documents from the dataset and had each of them annotated by three annotators. On average, the annotators marked 551 institutions, 258 court decision references, 402 act references, and 42 applicabilities. We used Fleiss' kappa to calculate the agreement. We report
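For readers unfamiliar with the measure, the following is a minimal sketch of the standard Fleiss' kappa computation. The source does not say what unit the agreement was computed over, so the sketch assumes each item (for example a token) receives exactly one label from each annotator. The function name `fleiss_kappa` and the toy counts are illustrative only and are not CCDD data.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table counts[i][j] = number of raters who
    assigned item i to category j. Every row must sum to the same
    number of raters."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")
    n_cats = len(counts[0])

    # Proportion of all assignments that fall into each category.
    total = n_items * n_raters
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_cats)]

    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        # All assignments fall into one category; kappa is undefined here.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy example with three annotators and two categories
# (say, "inside an entity" vs. "outside"); the numbers are invented.
toy_counts = [
    [3, 0],  # all three annotators chose category 0
    [0, 3],  # all three chose category 1
    [2, 1],  # two chose category 0, one chose category 1
    [3, 0],
]
print(round(fleiss_kappa(toy_counts), 3))
```

A value of 1 indicates perfect agreement, while values near 0 indicate agreement no better than chance.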
The table below presents the CCDD statistics:
Entity type | SC: # of entities | SC: # of tokens | SC: average entity length | CC: # of entities | CC: # of tokens | CC: average entity length |
Institution | 4,891 | 13,714 | 2.8 | 6,318 | 15,798 | 2.5 |
Decision references | 1,449 | 6,967 | 4.8 | 1,644 | 8,146 | 5.0 |
Act references | 4,387 | 33,628 | 7.7 | 2,597 | 18,774 | 7.2 |
Applicability | 247 | 1,179 | 4.8 | 233 | 938 | 4.0 |
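The average entity lengths in the table appear to be the number of tokens divided by the number of entities, rounded to one decimal place. The short check below recomputes them from the counts above under that assumption; the variable and function names are illustrative.

```python
# (entities, tokens) per entity type, copied from the table above.
ccdd_counts = {
    "SC": {
        "Institution": (4891, 13714),
        "Decision references": (1449, 6967),
        "Act references": (4387, 33628),
        "Applicability": (247, 1179),
    },
    "CC": {
        "Institution": (6318, 15798),
        "Decision references": (1644, 8146),
        "Act references": (2597, 18774),
        "Applicability": (233, 938),
    },
}


def average_entity_length(n_entities, n_tokens):
    """Average number of tokens per entity, rounded to one decimal place."""
    return round(n_tokens / n_entities, 1)


for court, rows in ccdd_counts.items():
    for entity_type, (n_entities, n_tokens) in rows.items():
        avg = average_entity_length(n_entities, n_tokens)
        print(f"{court} | {entity_type} | {avg}")
```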
Distributed under CC BY-NC-SA 4.0 licence.
We gratefully acknowledge support from the Technology Agency of the Czech Republic (grant no. TA02010182).