KUKY - Documentation

Authors

Silvie Cinková

Barbora Kubíková

Michal Kuk

Tereza Novotná

Jana Šamánková

Přemysl Pospíšil

Jiří Mírovský

Barbora Hladká

Published

December 21, 2024

1 Introduction

KUKY is a curated selection of 224 Czech administrative and legal documents for readability research.

The documents are manually enriched with a two-level annotation. This annotation mimics the way a plain-language expert scrutinizes a document before redesigning it for better readability: first, they closely read the entire document and detect problematic passages, classifying them as either

incomprehensible or confusing beyond repair

or
superfluous, irrelevant, obscuring the actual message.

The remaining text is considered relevant. At this point, the editor does not indicate missing pieces of information or argumentation.

In a second step, the editor works with the relevant text according to a genre-specific template. These templates determine the types of information or argumentation that convey the message most efficiently. The classification of genres in this corpus is based on a set of features (more details in Section 3).

3 Text genres represented in KUKY

The documents in this corpus come from a limited number of sources and thus do not aspire to cover all legal writing. The main resources are the publicly available databases of the Ombudsman’s Office, the Supreme Administrative Court, and the Frank Bold free legal advice database, as well as various other documents collected from private sources.

Some documents come in two versions. The pairs are either originals and revisions, or drafts and their final versions. Many of the redesigned versions were created exclusively for this corpus as examples of documents optimized for legal clarity as well as readability.

Typical documents in this corpus are:

free legal advice - both generic advice and individualized legal assistance
findings of the Office of the Ombudsman
court findings (including a few automatic translations of foreign court rulings widely praised for their clarity)
letters of citizens to authorities
letters of authorities to citizens
public local administration announcements

4 Annotation schemes

4.1 Relevance Stoplight

The Relevance Stoplight scheme contains three content labels and one floating comment label:

1_Nesrozumitelné (Incomprehensible, confusing beyond repair)
2_Zbytečné (Superfluous, irrelevant)
3_Relevantní (Relevant)
komentář anotátora (Note)

This annotation scheme is common for all KUKY documents. Figure 1 shows the annotation in the Gloss annotation tool.
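
As a rough illustration of how the three content labels might be used in readability research, the following sketch computes the share of annotated characters per label. The span representation as (start, end, label) tuples and the function name label_shares are assumptions made for this example; the actual structure of annotation spans is specified in Section 8.4 and Section 8.5.

```python
from collections import Counter

# Hypothetical span representation: (start, end, label), end exclusive.
# The offsets below are invented for illustration only.
spans = [
    (0, 40, "3_Relevantní"),
    (40, 60, "2_Zbytečné"),
    (60, 100, "3_Relevantní"),
]


def label_shares(spans):
    """Return the fraction of annotated characters carried by each label."""
    lengths = Counter()
    for start, end, label in spans:
        lengths[label] += end - start
    total = sum(lengths.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in lengths.items()}


shares = label_shares(spans)
for label, share in sorted(shares.items()):
    print(f"{label}: {share:.0%}")
```

A document with a large share of 1_Nesrozumitelné or 2_Zbytečné text would be a natural candidate for closer inspection.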

Figure 1: Relevance Stoplight annotation in Gloss

4.2 Speech Acts

This annotation uses two schemes: one designed specifically for normative documents (generic legal advice by Frank Bold) and one for all other documents, which all relate to individual cases and mostly contain argumentation (see Figure 2).

4.2.1 Speech Acts scheme for normative documents

01_Situace (Situation)

Snippets of text indicating what situation (and goal) the advice applies to.
02_Kontext (Context)

Snippets of text giving the broader picture, for instance precedent cases or typical procedures and their outcomes.
03_Postup (Procedure)

Snippets of text describing what the recipient is advised to do.
04_Proces (Process)

Snippets of text describing the expected responses of authorities or other parties to steps taken by the recipient.
05_Podmínky (Conditions, options)

Snippets of text specifying circumstances under which an action can or cannot be taken.
06_Doporučení (Recommendations)

Snippets of text that recommend additional actions or compare the individual options with respect to their desired impact.
07_Odkazy (Links)

Explicit textual links to other documents in Frank Bold’s knowledge base of legal advice.
08_Prameny (References)

References to external documents, particularly laws and regulations.
09_Nezařaditelné (Not classified)

Any other text.
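
For readers who do not speak Czech, a lookup table such as the one sketched below may be convenient. The English glosses are taken from the list above; the names NORMATIVE_LABELS and split_label are hypothetical helpers, not part of the corpus.

```python
# Czech label -> English gloss, as listed in Section 4.2.1.
NORMATIVE_LABELS = {
    "01_Situace": "Situation",
    "02_Kontext": "Context",
    "03_Postup": "Procedure",
    "04_Proces": "Process",
    "05_Podmínky": "Conditions, options",
    "06_Doporučení": "Recommendations",
    "07_Odkazy": "Links",
    "08_Prameny": "References",
    "09_Nezařaditelné": "Not classified",
}


def split_label(label):
    """Split a label such as '03_Postup' into its number and Czech name."""
    number, _, name = label.partition("_")
    return int(number), name


for label, gloss in NORMATIVE_LABELS.items():
    number, name = split_label(label)
    print(number, name, "=", gloss)
```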

4.2.2 Speech Acts scheme for documents relating to individual cases

This scheme is defined by syllogistic deductive reasoning, which is based on a particularly good correspondence between the real-world aspects of the given case (the narratives) and the extracts of law applied to the narratives. Matching couples of narrative and law support conclusions. These are the most important labels of the scheme:

Příběh (Narrative, or minor premise in syllogistic terms)

Text snippets describing what actually happened to whom.
Pravidlo (Law, or major premise in syllogistic terms)

Text snippets applying a piece of law - that is, quoting, paraphrasing or summarizing it.
Závěr (Conclusion/Consequent in syllogistic terms)

Text snippets establishing an explicit link between a narrative snippet and a law snippet.

Other labels are

Rada (Advice)

Text snippets containing optional information that is meant to aid the recipient.
Výzva (Command)

Text snippets stating what the recipient is obliged or supposed to do next.
Právní otázka (Legal issue)

Usually one text snippet summarizing the matter of dispute in legal terms, typically formulated as a yes-no question.
Metatext (Processing matters)

Information about possible previous legal processing of the matter: which authorities have been involved so far and with what outcome (e.g., lower courts, local administration).

Figure 2: Speech acts annotation in the Gloss annotation tool

The Narrative, Legal Issue, and Conclusion labels can contain references to their corresponding snippets.

In well-structured argumentative documents, each snippet of narrative is matched to a corresponding law snippet, and each conclusion is supported by at least one such pair.

Whenever Law and Narrative form a couple to support a Conclusion, the link between the Law and the Conclusion is not explicitly annotated.

The Legal Issue also ought to be connected to a Conclusion (see Figure 3).

Figure 3: Possible links between text snippets

Conceptually, it would naturally have made more sense to draw all arrows from the Conclusion and make the link between Conclusion and Law. In the current corpus release, the link direction is determined solely by the ergonomics of the annotation: the annotator determines both the snippet boundaries and links between snippets at the same time. The conceptually easiest way for the annotator is to start from the Narratives and chunk the Narratives so as to make the best possible pairs with the Law snippets. These pairs are then linked to Conclusions (Figure 3, top left).

When a Narrative is not matched with a Law but it backs up a Conclusion anyway, the link goes from the Narrative to Conclusion (Figure 3, bottom left).

When the document contains a Legal Issue, it is supposed to be linked to a Conclusion (typically a summarizing Conclusion, Figure 3, top right).

Finally, the annotator examines loose Law snippets and links them to corresponding Conclusions, the link going from the Conclusions to the Laws (Figure 3, bottom right).
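
The link directions described above can be summarized as a small set of (source label, target label) pairs. This is a sketch of one reading of the text, not an official specification: in particular, the direction Narrative to Law and Legal Issue to Conclusion is inferred from the statement that the Narrative, Legal Issue, and Conclusion labels hold the references. The names ALLOWED_LINKS and is_allowed are hypothetical.

```python
# Directed links as (source label, target label), using the Czech label names.
ALLOWED_LINKS = {
    ("Příběh", "Pravidlo"),       # Narrative paired with a Law
    ("Příběh", "Závěr"),          # Narrative (or Narrative-Law pair) backing a Conclusion
    ("Právní otázka", "Závěr"),   # Legal Issue connected to a Conclusion
    ("Závěr", "Pravidlo"),        # Conclusion pointing to a loose Law snippet
}


def is_allowed(source_label, target_label):
    """Check whether a link in this direction matches the conventions above."""
    return (source_label, target_label) in ALLOWED_LINKS


print(is_allowed("Příběh", "Závěr"))    # Narrative -> Conclusion
print(is_allowed("Pravidlo", "Závěr"))  # Law -> Conclusion is never annotated directly
```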

Figure 4 presents the Syllogism annotation.

Figure 4: Syllogism annotation in the Gloss annotation editor

This annotation is flat. That is, there is no hierarchy in the Conclusions. However, a Narrative-Law pair can point to several Conclusions. This indicates that these Conclusions are related. Ideally, a document ought to form clusters of Conclusions that connect to the same Law-Narrative pairs.
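
To make the idea of related Conclusions concrete, the sketch below groups Conclusions by the Narrative-Law pair that points to them. The record layout (an id plus pravidla and závěry lists on each Narrative) is a simplified assumption loosely modeled on the attribute names in Section 8.2; the IDs are invented, and the function name conclusions_by_pair is hypothetical.

```python
from collections import defaultdict

# Simplified, invented Narrative records. Each Narrative lists the Laws
# ("pravidla") and Conclusions ("závěry") it links to.
narratives = [
    {"id": "n1", "pravidla": ["l1"], "závěry": ["c1", "c2"]},
    {"id": "n2", "pravidla": ["l2"], "závěry": ["c3"]},
    {"id": "n3", "pravidla": [], "závěry": ["c3"]},  # Narrative without a Law
]


def conclusions_by_pair(narratives):
    """Map each (narrative id, law id) pair to the set of Conclusions it supports."""
    clusters = defaultdict(set)
    for narrative in narratives:
        for law_id in narrative["pravidla"]:
            clusters[(narrative["id"], law_id)].update(narrative["závěry"])
    return dict(clusters)


clusters = conclusions_by_pair(narratives)
# Conclusions sharing a pair are considered related.
related = {pair: sorted(ids) for pair, ids in clusters.items() if len(ids) > 1}
print(related)
```

Narratives without a Law contribute no pair and therefore no cluster in this sketch, which mirrors the flat nature of the annotation.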

The less a document adheres to the syllogistic structure, the fewer links are found and the more randomly they occur.

5 Data structure

The KUKY corpus comes in two JSON files, one for the normative documents and one for the argumentative ones.

Both files contain three JSON objects:

documents
labels
annotations.
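
A minimal sketch of reading such a file follows. It assumes that the three objects are top-level keys of a single JSON object and that each holds an array; the inline sample stands in for a real corpus file, whose name is not given here. The function name summarize is hypothetical.

```python
import json

# Tiny stand-in for a corpus file; a real file would be read with
# open(path, encoding="utf-8") and json.load instead.
sample = """
{
  "documents": [{"doc_id": "671918e2c6537d54ff0626db"}],
  "labels": [{"label": "Pravidlo"}, {"label": "Závěr"}],
  "annotations": []
}
"""

corpus = json.loads(sample)


def summarize(corpus):
    """Count the entries under each of the three expected top-level keys."""
    return {key: len(corpus[key]) for key in ("documents", "labels", "annotations")}


print(summarize(corpus))
```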

5.1 Documents

The structure of the Documents object is identical for both files and is elaborated in Section 6, as well as in Section 8.1.

5.2 Labels

Each file has its own set of labels (see Section 4, and more specifically Section 8.2 and Section 8.3). In the JSON structure, these objects are used in the annotations object as properties of the individual annotation spans represented as JSON objects.

5.3 Annotations

Most properties of the annotations objects are shared by both files (that is, their specifications in the labels object are largely identical), but annotations in argumentative documents have a more complex structure of attributes due to the syllogism annotation. For details see Section 8.4 and Section 8.5.
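
The documents schema in Section 8.1 notes that annotation offsets refer to plainText with its markdown tags intact. The sketch below illustrates why stripping the tags breaks the mapping. The text, the offsets, and the field names start and end are invented for this example; the sketch also assumes that offsets count characters in the way Python string indices do.

```python
# Invented document text with a markdown heading tag.
plain_text = "## Důchody\nVyplňte žádost."

# Hypothetical span; "start" and "end" are assumed field names.
span = {"start": 3, "end": 10}


def span_text(text, span):
    """Return the text covered by a span, with an exclusive end offset."""
    return text[span["start"]:span["end"]]


original = span_text(plain_text, span)

# Removing the markdown tag shifts every later character to the left,
# so the same offsets now select the wrong text.
stripped = plain_text.replace("## ", "")
shifted = span_text(stripped, span)

print(repr(original))
print(repr(shifted))
```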

6 Metadata

The metadata about the individual documents is included as attributes of the individual documents. In the JSON structure, they are properties of the JSON objects representing the individual documents, nested in an array in the umbrella object called documents. The metadata attributes are:

doc_id - unique ID of the document generated by Gloss.
doc_name - the name of the file as uploaded to Gloss.
plainText - the full text of the document, including markdown tags.
Readability - (high, medium, low) by intuitive assessment, relative to other documents in the corpus.
SyllogismBased - does this document systematically use syllogism? (true, false)
DocumentVersion - (Original, Partial Redesign, Redesign).
ParentDocumentID - when the document is a redesigned document and the corpus also contains the original, then this item stores the ID of the original document.
LegalActType - the type of the legal act (“individual”, “normative”). Individual legal acts concern concrete cases and already existing applications of a norm, while normative acts are laws and guidelines, as well as generic legal advice.
Objectivity - objectivity of the text (“quasiobjective”, “persuasive”).
Bindingness - is the document legally binding? (true, false)
AuthorType - the source type of the document (“authority”, “individual”).
RecipientType - the type of the recipient the document was designated to (“natural person”, “legal person”, “combined”).
RecipientIndividuation - expected familiarity of the recipient with the matter (“individual”, “bulk”, “public”). Court decisions are usually treated as bulk, since they concern the parties to the proceedings as well as authorities. Although these documents are often public, their primary goal is not to inform the public but to resolve a situation for the involved parties with judicially reviewable arguments.
Anonymized - is the document anonymized? (“Anonymized by source”, “On-site anonymization”, “No”). Normative documents do not need to be anonymized.
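
The metadata can be used, for instance, to pair redesigned documents with their originals through ParentDocumentID. The following sketch works on a shortened list of document records; the IDs are taken from the schema examples in Section 8.1, while the Readability value of the parent document and the function name version_pairs are assumptions made for illustration.

```python
# Shortened document records; only the keys needed for pairing are kept.
documents = [
    {"doc_id": "673b7a37c6537d54ff062b96", "DocumentVersion": "Original",
     "ParentDocumentID": None, "Readability": "low"},  # Readability is illustrative
    {"doc_id": "673b7a37c6537d54ff062b91", "DocumentVersion": "Partial Redesign",
     "ParentDocumentID": "673b7a37c6537d54ff062b96", "Readability": "high"},
    {"doc_id": "671918e2c6537d54ff0626db", "DocumentVersion": "Original",
     "ParentDocumentID": None, "Readability": "low"},
]


def version_pairs(documents):
    """Return (original, redesign) pairs for redesigns whose original is in the list."""
    by_id = {doc["doc_id"]: doc for doc in documents}
    pairs = []
    for doc in documents:
        parent_id = doc.get("ParentDocumentID")
        if parent_id is not None and parent_id in by_id:
            pairs.append((by_id[parent_id], doc))
    return pairs


pairs = version_pairs(documents)
for original, redesign in pairs:
    print(original["doc_id"], "->", redesign["doc_id"], redesign["DocumentVersion"])
```

Redesigns whose original is not part of the corpus have a null ParentDocumentID and are skipped by this sketch.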

Citation

Please cite the data when using the corpus for your research:

Silvie Cinková, Michal Kuk, Jana Šamánková, Barbora Kubíková, Přemysl Pospíšil, Jiří Mírovský, Barbora Hladká, Tereza Novotná: KUKY 1.0. Data/software, ÚFAL MFF UK, Prague, Czech Republic, 2024, LINDAT http://hdl.handle.net/11234/1-5812.

7 Acknowledgments

This work was funded by the Technology Agency of the Czech Republic, Grant No. TQ01000526, 2023-2025.

License

The corpus KUKY 1.0 is distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) license.

8 Appendix: JSON schemas

8.1 JSON schema for documents

 {
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "http://example.com/example.json",
    "type": "array",
    "title": "Description of individual documents in KUKY. Applicable to documents in all data files. ",
    "items": {
        "type": "object",
        "title": "Structure of a document from a collection of normative documents (e.g. Frank Bold legal advice)",
        "required": [
            "doc_id",
            "doc_name",
            "plainText",
            "Readability",
            "SyllogismBased",
            "DocumentVersion",
            "ParentDocumentID",
            "LegalActType",
            "Objectivity",
            "Bindingness",
            "AuthorType",
            "RecipientType",
            "RecipientIndividuation",
            "Anonymized"
        ],
        "properties": {
            "doc_id": {
                "type": "string",
                "title": "Document ID generated by the annotation tool",
                "examples": [
                    "671918e2c6537d54ff0626db",
                    "671918e2c6537d54ff0626dc",
                    "671918e2c6537d54ff0626de",
                    "673b7a37c6537d54ff062b8f",
                    "673b7a37c6537d54ff062b90",
                    "673b7a37c6537d54ff062b91"
                ]
            },
            "doc_name": {
                "type": "string",
                "title": "the name of the file as uploaded in the annotation tool. Typical format suffixes are txt and md.",
                "examples": [
                    "orig_Certifikáty autorizovaných inspektorů.txt",
                    "red_Co je to územní plánování_final_přidat odkaz na manuál o RP až bude.txt",
                    "Duchody.txt",
                    "006_Chci ZMĚNIT DŮCHOD - V. 21.txt",
                    "008_Policie.txt",
                    "020_red_Jak chránit vody a správně s nimi nakládat_revKZ.txt"
                ]
            },
            "plainText": {
                "type": "string",
                "title": "Document plain texts including markdown tags when the source contained them. Note that if you strip the documents of the markdown tags, annotation offsets will not map onto the text correctly.",
                "examples": [
                    "##...",
                    "Dů...",
                    "Vy...",
                    "Po..."
                ]
            },
            "Readability": {
                "type": "string",
                "title": "This document was selected to represent one of three levels of readability.",
                "enum": [
                    "low",
                    "high",
                    "medium"
                ]
            },
            "SyllogismBased": {
                "type": "boolean",
                "title": "Whether or not the argumentation of the text is substantially based on syllogism. Normative texts do not argue, and therefore they all have the value false."
            },
            "DocumentVersion": {
                "type": "string",
                "title": "Most documents are originals; that is, the corpus does not contain their revised version. Revised versions are classified as either Partial Redesign or Redesign depending on the extent of the revision. Revisions were chosen to be better than originals but even originals can have high readability" ,
                "enum": [
                    "Original",
                    "Redesign",
                    "Partial Redesign"
                ]
            },
            "ParentDocumentID": {
                "type": [
                    "null",
                    "string"
                ],
                "title": "If the document is a (Partial) Redesign, this is the ID of its corresponding Original document.",
                "examples": [
                    null,
                    "673b7a37c6537d54ff062b96"
                ]
            },
            "LegalActType": {
                "type": "string",
                "title": "Individual texts are dealing with a concrete case. Normative texts are setting norms or guidelines for situations with generic participants.",
                "enum": [
                    "normative",
                    "individual"
                ]
            },
            "Objectivity": {
                "type": "string",
                "title": "Most documents are quasiobjective, no matter whether they require or entitle the client to do something. Persuasive documents are usually documents that seek to persuade an authority to act in a certain way, such as defense or suit.",
                "enum": [
                    "quasiobjective",
                    "persuasive"
                ]
            },
            "Bindingness": {
                "type": "boolean",
                "title": "Is the document legally binding? Typical binding documents are authority decisions (courts, local administration) and laws."
            },
            "AuthorType": {
                "type": "string",
                "title": "Is the author a citizen or a representative of an authority?",
                "enum": [
                    "individual",
                    "authority"
                ]
            },
            "RecipientType": {
                "type": "string",
                "title": "Is the intended recipient a citizen (or many citizens), typically lacking legal proficiency or assistance, or a company or administration, who typically have these? Or both?" ,
                "enum": [
                    "natural person", 
                    "legal person", 
                    "combined"
                ]
            },
            "RecipientIndividuation": {
                "type": "string",
                "title": "How much context must be delivered for the intended recipient? The public must get a full description of the case. An individual recipient, who is involved in the case, has an extensive pre-knowledge and thus it could be counterproductive to repeat all details of the case before coming to the actual message that is to be conveyed. A bulk recipient is typically several involved parties, each with their own pre-knowledge, and other authorities. Typically, court findings and decisions are written for a bulk of recipients (party and counterparty, as well as potentially higher courts in case of appeals).",
                "enum": [
                    "public", 
                    "bulk", 
                    "individual"
                ]
            },
            "Anonymized": {
                "type": "string",
                "title": "Some documents need not be anonymized at all, for instance normative documents with generic recipients. Documents from public sources have usually been anonymized to some extent. Typically, personal details of parties involved in a case are anonymized. Anonymized text passages are usually replaced by random initials (X.Y., A.B., etc.) or class descriptors (place, surname, etc.). It is not always clear whether or not file reference numbers have been altered as well, and authorities are not anonymized at all, not even names of persons acting in the capacity of an authority, such as judges or administration clerks. When a document is anonymized on-site, the anonymization goes beyond what we mostly see in the source-anonymized documents. Since this corpus is meant to mimic authentic unedited texts, we manually replaced initials and placeholders with random and fictitious proper names (that is, pseudonymized rather than anonymized the text). We also randomly swap digits in phone numbers, file references, dates, etc. Documents originally anonymized by source keep this classification, even though we mostly edited them as well.",
                "enum": [
                    "No", 
                    "Anonymized by source", 
                    "On-site anonymization"
                ]
            }
        },
        "examples": [{
            "doc_id": "671918e2c6537d54ff0626db",
            "doc_name": "orig_Certifikáty autorizovaných inspektorů.txt",
            "plainText": "##...",
            "Readability": "low",
            "SyllogismBased": false,
            "DocumentVersion": "Original",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "individual",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "671918e2c6537d54ff0626dc",
            "doc_name": "red_Co je to územní plánování_final_přidat odkaz na manuál o RP až bude.txt",
            "plainText": "##...",
            "Readability": "high",
            "SyllogismBased": false,
            "DocumentVersion": "Redesign",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "individual",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "671918e2c6537d54ff0626de",
            "doc_name": "Duchody.txt",
            "plainText": "Dů...",
            "Readability": "low",
            "SyllogismBased": false,
            "DocumentVersion": "Original",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "authority",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "673b7a37c6537d54ff062b8f",
            "doc_name": "006_Chci ZMĚNIT DŮCHOD - V. 21.txt",
            "plainText": "Vy...",
            "Readability": "high",
            "SyllogismBased": false,
            "DocumentVersion": "Redesign",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "authority",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "673b7a37c6537d54ff062b90",
            "doc_name": "008_Policie.txt",
            "plainText": "Po...",
            "Readability": "medium",
            "SyllogismBased": false,
            "DocumentVersion": "Original",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "authority",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "673b7a37c6537d54ff062b91",
            "doc_name": "020_red_Jak chránit vody a správně s nimi nakládat_revKZ.txt",
            "plainText": "##...",
            "Readability": "high",
            "SyllogismBased": false,
            "DocumentVersion": "Partial Redesign",
            "ParentDocumentID": "673b7a37c6537d54ff062b96",
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "individual",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        }]
    },
    "examples": [
        [{
            "doc_id": "671918e2c6537d54ff0626db",
            "doc_name": "orig_Certifikáty autorizovaných inspektorů.txt",
            "plainText": "##...",
            "Readability": "low",
            "SyllogismBased": false,
            "DocumentVersion": "Original",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "individual",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "671918e2c6537d54ff0626dc",
            "doc_name": "red_Co je to územní plánování_final_přidat odkaz na manuál o RP až bude.txt",
            "plainText": "##...",
            "Readability": "high",
            "SyllogismBased": false,
            "DocumentVersion": "Redesign",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "individual",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "671918e2c6537d54ff0626de",
            "doc_name": "Duchody.txt",
            "plainText": "Dů...",
            "Readability": "low",
            "SyllogismBased": false,
            "DocumentVersion": "Original",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "authority",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "673b7a37c6537d54ff062b8f",
            "doc_name": "006_Chci ZMĚNIT DŮCHOD - V. 21.txt",
            "plainText": "Vy...",
            "Readability": "high",
            "SyllogismBased": false,
            "DocumentVersion": "Redesign",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "authority",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "673b7a37c6537d54ff062b90",
            "doc_name": "008_Policie.txt",
            "plainText": "Po...",
            "Readability": "medium",
            "SyllogismBased": false,
            "DocumentVersion": "Original",
            "ParentDocumentID": null,
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "authority",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        },
        {
            "doc_id": "673b7a37c6537d54ff062b91",
            "doc_name": "020_red_Jak chránit vody a správně s nimi nakládat_revKZ.txt",
            "plainText": "##...",
            "Readability": "high",
            "SyllogismBased": false,
            "DocumentVersion": "Partial Redesign",
            "ParentDocumentID": "673b7a37c6537d54ff062b96",
            "LegalActType": "normative",
            "Objectivity": "quasiobjective",
            "Bindingness": false,
            "AuthorType": "individual",
            "RecipientType": "natural person",
            "RecipientIndividuation": "public",
            "Anonymized": "No"
        }]
    ]
}
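
The constraints of the schema above can also be checked without a JSON Schema library. The following is a minimal sketch using only the Python standard library; the allowed values are copied from the enum lists in the schema, the sample record mirrors one of the schema examples, and the names ENUMS, STRING_KEYS, BOOLEAN_KEYS, and check_document are hypothetical. It is not a full JSON Schema validator.

```python
# Allowed values copied from the enum lists in the documents schema.
ENUMS = {
    "Readability": ("low", "medium", "high"),
    "DocumentVersion": ("Original", "Partial Redesign", "Redesign"),
    "LegalActType": ("normative", "individual"),
    "Objectivity": ("quasiobjective", "persuasive"),
    "AuthorType": ("individual", "authority"),
    "RecipientType": ("natural person", "legal person", "combined"),
    "RecipientIndividuation": ("public", "bulk", "individual"),
    "Anonymized": ("No", "Anonymized by source", "On-site anonymization"),
}
STRING_KEYS = ("doc_id", "doc_name", "plainText")
BOOLEAN_KEYS = ("SyllogismBased", "Bindingness")


def check_document(doc):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for key in STRING_KEYS:
        if not isinstance(doc.get(key), str):
            problems.append(f"{key}: expected a string")
    for key in BOOLEAN_KEYS:
        if not isinstance(doc.get(key), bool):
            problems.append(f"{key}: expected true or false")
    for key, allowed in ENUMS.items():
        if doc.get(key) not in allowed:
            problems.append(f"{key}: expected one of {allowed}")
    parent = doc.get("ParentDocumentID", 0)  # 0 marks a missing key
    if parent is not None and not isinstance(parent, str):
        problems.append("ParentDocumentID: expected null or a string")
    return problems


record = {
    "doc_id": "673b7a37c6537d54ff062b91",
    "doc_name": "020_red_Jak chránit vody a správně s nimi nakládat_revKZ.txt",
    "plainText": "##...",
    "Readability": "high",
    "SyllogismBased": False,
    "DocumentVersion": "Partial Redesign",
    "ParentDocumentID": "673b7a37c6537d54ff062b96",
    "LegalActType": "normative",
    "Objectivity": "quasiobjective",
    "Bindingness": False,
    "AuthorType": "individual",
    "RecipientType": "natural person",
    "RecipientIndividuation": "public",
    "Anonymized": "No",
}

print(check_document(record))
```

Note that a boolean stored as the string "false" is reported as a problem, because the schema declares SyllogismBased and Bindingness as booleans.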

8.2 JSON schema of annotation categories in argumentative texts

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "title": "JSON schema of labels used in the argumentative legal texts",
    "required": [
        "labels"
    ],
    "properties": {
        "labels": {
            "type": "array",
            "title": "This is the inventory of all annotation categories",
            "items": {
                "type": "object",
                "title": "One individual annotation category",
                "required": [
                    "label_id",
                    "label",
                    "isA",
                    "attributes"
                ],
                "properties": {
                    "label_id": {
                        "type": "string",
                        "title": "id generated by the annotation tool. Use only if you want to rename them without having to find and replace the Czech labels",
                        "enum": [
                            "660d7ac3c6537d54ff05e5a8",
                            "660d7ad5c6537d54ff05e5a9",
                            "660d7adec6537d54ff05e5aa",
                            "660d7badc6537d54ff05e5ab",
                            "660d7be5c6537d54ff05e5ac",
                            "660d7c52c6537d54ff05e5ad",
                            "660d7c68c6537d54ff05e5ae",
                            "66608523c6537d54ff05f0f2",
                            "65f82b2ac6537d54ff05de63",
                            "65f9724dc6537d54ff05df41",
                            "65f97278c6537d54ff05df42",
                            "664dfd77c6537d54ff05f0b3",
                            "664dfe2dc6537d54ff05f0b5"
                        ]
                    },
                    "label": {
                        "type": "string",
                        "title": "the Czech labels, Stoplight and Speech Acts merged together",
                        "enum": [
                            "Pravidlo",
                            "Příběh",
                            "Závěr",
                            "Komentář o čemkoli",
                            "Metatext",
                            "Rada",
                            "Výzva",
                            "Právní otázka",
                            "1_Nesrozumitelné",
                            "3_Relevantní",
                            "2_Zbytečné",
                            "komentář anotátora",
                            "nálepky"
                        ]
                    },
                    "isA": {
                        "type": "string",
                        "title": "generated by the Gloss annotation tool, describes Gloss-internal structure",
                        "examples": [
                            "58781cf945f90f3bfc5cba7d",
                            "587c172845f90f3bfc5cba7f"
                        ]
                    },
                    "attributes": {
                        "type": "array",
                        "title": "Gloss-generated, but substantial to understand the data",
                        "items": {
                            "type": "object",
                            "title": "A Schema",
                            "required": [
                                "isColl",
                                "type",
                                "name"
                            ],
                            "properties": {
                                "isColl": {
                                    "type": "boolean",
                                    "title": "Is it a collection attribute, that is, can this attribute take more than one value? ",
                                    "enum": [
                                        false,
                                        true
                                    ]
                                },
                                "type": {
                                    "type": "string",
                                    "title": "Reference to the Gloss-generated ID of annotation category (synonymous with label, convenience for non-Czech speakers). This schema slightly underspecifies the constraints. These are better elaborated in the KUKY documentation. ",
                                    "enum": [
                                        "5854ca16e2bc651508f8536f",
                                        "660d7ac3c6537d54ff05e5a8",
                                        "660d7adec6537d54ff05e5aa",
                                        "664dfe2dc6537d54ff05f0b5"
                                    ]
                                },
                                "name": {
                                    "type": "string",
                                    "title": "These attributes store IDs of referenced annotation categories. For easier orientation, their names are the property names, e.g. the pravidla property can only contain IDs of spans annotated as Pravidlo.",
                                    "enum": [
                                        "komentář",
                                        "pravidla",
                                        "závěry",
                                        "Závěr",
                                        "piš sem",
                                        "nálepky"
                                    ]
                                }
                            }
                        },
                        "examples": [
                            [{
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "komentář"
                            }],
                            [{
                                "isColl": true,
                                "type": "660d7ac3c6537d54ff05e5a8",
                                "name": "pravidla"
                            },
                            {
                                "isColl": true,
                                "type": "660d7adec6537d54ff05e5aa",
                                "name": "závěry"
                            },
                            {
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "komentář"
                            }],
                            [{
                                "isColl": true,
                                "type": "660d7ac3c6537d54ff05e5a8",
                                "name": "pravidla"
                            }],
                            [{
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "komentář"
                            }],
                            [{
                                "name": "komentář",
                                "type": "5854ca16e2bc651508f8536f",
                                "isColl": false
                            }],
                            [{
                                "name": "komentář",
                                "type": "5854ca16e2bc651508f8536f",
                                "isColl": false
                            }],
                            [{
                                "name": "komentář",
                                "type": "5854ca16e2bc651508f8536f",
                                "isColl": false
                            }],
                            [{
                                "isColl": true,
                                "type": "660d7adec6537d54ff05e5aa",
                                "name": "Závěr"
                            },
                            {
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "komentář"
                            }],
                            [],
                            [],
                            [],
                            [{
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "piš sem"
                            },
                            {
                                "isColl": false,
                                "type": "664dfe2dc6537d54ff05f0b5",
                                "name": "nálepky"
                            }],
                            []
                        ]
                    }
                }
            }
        }
    },
    "examples": [{
        "labels": [{
            "label_id": "660d7ac3c6537d54ff05e5a8",
            "label": "Pravidlo",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": false,
                "type": "5854ca16e2bc651508f8536f",
                "name": "komentář"
            }]
        },
        {
            "label_id": "660d7ad5c6537d54ff05e5a9",
            "label": "Příběh",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": true,
                "type": "660d7ac3c6537d54ff05e5a8",
                "name": "pravidla"
            },
            {
                "isColl": true,
                "type": "660d7adec6537d54ff05e5aa",
                "name": "závěry"
            },
            {
                "isColl": false,
                "type": "5854ca16e2bc651508f8536f",
                "name": "komentář"
            }]
        },
        {
            "label_id": "660d7adec6537d54ff05e5aa",
            "label": "Závěr",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": true,
                "type": "660d7ac3c6537d54ff05e5a8",
                "name": "pravidla"
            }]
        },
        {
            "label_id": "660d7badc6537d54ff05e5ab",
            "label": "Komentář o čemkoli",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": false,
                "type": "5854ca16e2bc651508f8536f",
                "name": "komentář"
            }]
        },
        {
            "label_id": "660d7be5c6537d54ff05e5ac",
            "label": "Metatext",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "name": "komentář",
                "type": "5854ca16e2bc651508f8536f",
                "isColl": false
            }]
        },
        {
            "label_id": "660d7c52c6537d54ff05e5ad",
            "label": "Rada",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "name": "komentář",
                "type": "5854ca16e2bc651508f8536f",
                "isColl": false
            }]
        },
        {
            "label_id": "660d7c68c6537d54ff05e5ae",
            "label": "Výzva",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "name": "komentář",
                "type": "5854ca16e2bc651508f8536f",
                "isColl": false
            }]
        },
        {
            "label_id": "66608523c6537d54ff05f0f2",
            "label": "Právní otázka",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": true,
                "type": "660d7adec6537d54ff05e5aa",
                "name": "Závěr"
            },
            {
                "isColl": false,
                "type": "5854ca16e2bc651508f8536f",
                "name": "komentář"
            }]
        },
        {
            "label_id": "65f82b2ac6537d54ff05de63",
            "label": "1_Nesrozumitelné",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "65f9724dc6537d54ff05df41",
            "label": "3_Relevantní",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "65f97278c6537d54ff05df42",
            "label": "2_Zbytečné",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "664dfd77c6537d54ff05f0b3",
            "label": "komentář anotátora",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": false,
                "type": "5854ca16e2bc651508f8536f",
                "name": "piš sem"
            },
            {
                "isColl": false,
                "type": "664dfe2dc6537d54ff05f0b5",
                "name": "nálepky"
            }]
        },
        {
            "label_id": "664dfe2dc6537d54ff05f0b5",
            "label": "nálepky",
            "isA": "587c172845f90f3bfc5cba7f",
            "attributes": []
        }]
    }]
}
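
As a rough illustration of how the label inventory above might be consumed, the following Python sketch builds a lookup from Gloss-generated IDs to Czech labels and lists which attributes of a category point to another category. The `labels` literal is a hand-copied subset of the example in the schema; the helper names `id_to_label` and `link_targets` are hypothetical and not part of the KUKY distribution.

```python
# Subset of the "examples" entry of the schema above, copied by hand.
labels = [
    {"label_id": "660d7ac3c6537d54ff05e5a8", "label": "Pravidlo",
     "attributes": [
         {"isColl": False, "type": "5854ca16e2bc651508f8536f", "name": "komentář"}]},
    {"label_id": "660d7ad5c6537d54ff05e5a9", "label": "Příběh",
     "attributes": [
         {"isColl": True, "type": "660d7ac3c6537d54ff05e5a8", "name": "pravidla"},
         {"isColl": True, "type": "660d7adec6537d54ff05e5aa", "name": "závěry"},
         {"isColl": False, "type": "5854ca16e2bc651508f8536f", "name": "komentář"}]},
    {"label_id": "660d7adec6537d54ff05e5aa", "label": "Závěr",
     "attributes": [
         {"isColl": True, "type": "660d7ac3c6537d54ff05e5a8", "name": "pravidla"}]},
]


def id_to_label(labels):
    """Map each Gloss-generated label ID to its Czech label (hypothetical helper)."""
    return {entry["label_id"]: entry["label"] for entry in labels}


def link_targets(labels):
    """For each label, list (attribute name, target label) pairs.

    Only attributes whose "type" is itself a category in the inventory are
    kept, so free-text comment attributes are skipped.
    """
    names = id_to_label(labels)
    result = {}
    for entry in labels:
        links = [(attr["name"], names[attr["type"]])
                 for attr in entry.get("attributes", [])
                 if attr["type"] in names]
        if links:
            result[entry["label"]] = links
    return result


for label, links in link_targets(labels).items():
    print(label, links)
```

With this subset, only Příběh and Závěr are reported as carrying links; the comment attribute of Pravidlo is skipped because its type ID is not a category in the list.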

8.3 JSON schema of annotation categories in normative texts (Frank Bold data set)

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "default": {},
    "title": "JSON schema of labels used in the normative legal texts (Frank Bold dataset)",
    "required": [
        "labels"
    ],
    "properties": {
        "labels": {
            "type": "array",
            "default": [],
            "title": "This is the inventory of all annotation categories",
            "items": {
                "type": "object",
                "title": "One individual annotation category",
                "required": [
                    "label_id",
                    "label",
                    "isA"
                ],
                "properties": {
                    "label_id": {
                        "type": "string",
                        "title": "ID generated by the Gloss annotation tool. Use it only if you want to rename the labels without having to find and replace the Czech label strings.",
                        "enum": [
                            "66f3b441c6537d54ff0624cd",
                            "66f3b451c6537d54ff0624cf",
                            "66f3b45ec6537d54ff0624d0",
                            "66f3b468c6537d54ff0624d1",
                            "66f3b46fc6537d54ff0624d2",
                            "66f3d323c6537d54ff0624f4",
                            "66f3d34ac6537d54ff0624f5",
                            "6721f844c6537d54ff062705",
                            "6721f96dc6537d54ff062708",
                            "661f7f4bc6537d54ff05ef6a",
                            "661f7f5ac6537d54ff05ef6b",
                            "661f7f66c6537d54ff05ef6c"
                        ]
                    },
                    "label": {
                        "type": "string",
                        "title": "The Czech labels, Stoplight and Speech Acts merged together",
                        "enum": [
                            "01_Situace",
                            "03_Postup",
                            "05_Podmínky",
                            "08_Prameny",
                            "07_Odkazy",
                            "04_Proces",
                            "02_Kontext",
                            "09_Nezařaditelné",
                            "06_ Doporučení",
                            "1_Nesrozumitelné",
                            "2_Zbytečné",
                            "3_Relevantní"
                        ]
                    },
                    "isA": {
                        "type": "string",
                        "title": "generated by the Gloss annotation tool, describes Gloss-internal structure",
                        "examples": [
                            "58781cf945f90f3bfc5cba7d"
                        ]
                    },
                    "attributes": {
                        "type": "array",
                        "title": "The annotation of normative documents only allows for one free-text comment per annotation span, and it is represented as an attribute with the following properties. Unlike the annotation scheme of argumentative texts, no links to other annotation spans occur.",
                        "items": {
                            "type": "object",
                            "default": {},
                            "title": "Individual attribute for annotator's comment, mostly empty.",
                            "required": [
                                "isColl",
                                "type",
                                "name"
                            ],
                            "properties": {
                                "isColl": {
                                    "type": "boolean",
                                    "default": false,
                                    "title": "Only one comment is allowed, so it is not a Collection in Gloss terms. ",
                                    "enum": [
                                        false
                                    ]
                                },
                                "type": {
                                    "type": "string",
                                    "default": "",
                                    "title": "Gloss-generated label type ID of Comment",
                                    "examples": [
                                        "5854ca16e2bc651508f8536f"
                                    ]
                                },
                                "name": {
                                    "type": "string",
                                    "default": "",
                                    "title": "This is the Czech label for Comment",
                                    "enum": [
                                        "Komentář"
                                    ]
                                }
                            },
                            "examples": [{
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "Komentář"
                            }]
                        },
                        "examples": [
                            [],
                            [],
                            [],
                            [],
                            [],
                            [],
                            [],
                            [{
                                "isColl": false,
                                "type": "5854ca16e2bc651508f8536f",
                                "name": "Komentář"
                            }],
                            [],
                            [],
                            [],
                            []
                        ]
                    }
                }
            }
        }
    },
    "examples": [{
        "labels": [{
            "label_id": "66f3b441c6537d54ff0624cd",
            "label": "01_Situace",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "66f3b451c6537d54ff0624cf",
            "label": "03_Postup",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "66f3b45ec6537d54ff0624d0",
            "label": "05_Podmínky",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "66f3b468c6537d54ff0624d1",
            "label": "08_Prameny",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "66f3b46fc6537d54ff0624d2",
            "label": "07_Odkazy",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "66f3d323c6537d54ff0624f4",
            "label": "04_Proces",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "66f3d34ac6537d54ff0624f5",
            "label": "02_Kontext",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "6721f844c6537d54ff062705",
            "label": "09_Nezařaditelné",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": [{
                "isColl": false,
                "type": "5854ca16e2bc651508f8536f",
                "name": "Komentář"
            }]
        },
        {
            "label_id": "6721f96dc6537d54ff062708",
            "label": "06_ Doporučení",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "661f7f4bc6537d54ff05ef6a",
            "label": "1_Nesrozumitelné",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "661f7f5ac6537d54ff05ef6b",
            "label": "2_Zbytečné",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        },
        {
            "label_id": "661f7f66c6537d54ff05ef6c",
            "label": "3_Relevantní",
            "isA": "58781cf945f90f3bfc5cba7d",
            "attributes": []
        }]
    }]
}
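
Because the `label` enumeration above merges both annotation schemes, a consumer may want to separate them again. The Python sketch below does so under the assumption that the three labels `1_Nesrozumitelné`, `2_Zbytečné` and `3_Relevantní` form the Stoplight scheme and all remaining labels belong to the Speech Acts scheme; the constant `STOPLIGHT_LABELS` and the function `split_by_scheme` are hypothetical names introduced only for this illustration.

```python
# Assumption: these three labels make up the Stoplight scheme.
STOPLIGHT_LABELS = {"1_Nesrozumitelné", "2_Zbytečné", "3_Relevantní"}

# The "label" enumeration of the schema above, in its original order.
label_enum = [
    "01_Situace", "03_Postup", "05_Podmínky", "08_Prameny", "07_Odkazy",
    "04_Proces", "02_Kontext", "09_Nezařaditelné", "06_ Doporučení",
    "1_Nesrozumitelné", "2_Zbytečné", "3_Relevantní",
]


def split_by_scheme(labels):
    """Return (stoplight, speech_acts), each sorted by its numeric prefix."""
    stoplight = sorted(l for l in labels if l in STOPLIGHT_LABELS)
    speech_acts = sorted(l for l in labels if l not in STOPLIGHT_LABELS)
    return stoplight, speech_acts


stoplight, speech_acts = split_by_scheme(label_enum)
print("Stoplight:", stoplight)
print("Speech Acts:", speech_acts)
```

Plain string sorting is sufficient here because every label starts with a distinct numeric prefix within its scheme.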

8.4 JSON schema of annotation spans in normative texts (Frank Bold data set)

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "default": {},
    "title": "Schema of individual annotation spans in normative texts (Frank Bold data set)",
    "required": [
        "annotations"
    ],
    "properties": {
        "annotations": {
            "type": "array",
            "default": [],
            "title": "Array of annotation spans in all documents, defined by character offsets including possible markdown tags",
            "items": {
                "type": "object",
                "title": "One individual annotation span in a document, defined by character offsets including possible markdown tags",
                "required": [
                    "doc_id",
                    "doc_name",
                    "task_type",
                    "label",
                    "start",
                    "end",
                    "span_id",
                    "type",
                    "owner"
                ],
                "properties": {
                    "doc_id": {
                        "type": "string",
                        "title": "Gloss-generated document ID, cf. the documents JSON schema",
                        "examples": [
                            "673b7a38c6537d54ff062bad",
                            "671918e2c6537d54ff0626de",
                            "671918e2c6537d54ff0626dc"
                        ]
                    },
                    "doc_name": {
                        "type": "string",
                        "title": "File name of the document as it was uploaded to the Gloss annotation tool",
                        "examples": [
                            "025_red_GDPR Jak právo chrání osobní údaje_final.txt",
                            "Duchody.txt",
                            "red_Co je to územní plánování_final_přidat odkaz na manuál o RP až bude.txt"
                        ]
                    },
                    "task_type": {
                        "type": "string",
                        "title": "Which annotation scheme, Stoplight or Speech Acts? Speech acts are marked as ArgM.",
                        "enum": [
                            "Stoplight",
                            "ArgM"
                        ]
                    },
                    "label": {
                        "type": "string",
                        "title": "The Czech label of the annotation category",
                        "examples": [
                            "3_Relevantní",
                            "04_Proces",
                            "2_Zbytečné"
                        ]
                    },
                    "start": {
                        "type": "integer",
                        "title": "The character offset of the start of the annotation span, in the given document, including markdown tag characters",
                        "examples": [
                            12559,
                            5730,
                            800,
                            739
                        ]
                    },
                    "end": {
                        "type": "integer",
                        "title": "The character offset of the end of the annotation span, in the given document, including markdown tag characters",
                        "examples": [
                            12728,
                            5737,
                            1284,
                            799
                        ]
                    },
                    "span_id": {
                        "type": "string",
                        "title": "Gloss-generated ID of the given annotation span",
                        "examples": [
                            "675aeb59c6537d54ff063d11",
                            "672b3620c6537d54ff0627fe",
                            "672b4c1fc6537d54ff062884",
                            "672b4c22c6537d54ff062885"
                        ]
                    },
                    "type": {
                        "type": "string",
                        "title": "Gloss-generated ID of the annotation category, synonymous with the Czech labels",
                        "examples": [
                            "661f7f66c6537d54ff05ef6c",
                            "66f3d323c6537d54ff0624f4",
                            "661f7f5ac6537d54ff05ef6b"
                        ]
                    },
                    "owner": {
                        "type": "string",
                        "title": "Gloss-generated ID of the annotator",
                        "examples": [
                            "65e9f0d2c6537d54ff05dc27"
                        ]
                    }
                }
            }
        }
    },
    "examples": [{
        "annotations": [{
            "doc_id": "673b7a38c6537d54ff062bad",
            "doc_name": "025_red_GDPR Jak právo chrání osobní údaje_final.txt",
            "task_type": "Stoplight",
            "label": "3_Relevantní",
            "start": 12559,
            "end": 12728,
            "span_id": "675aeb59c6537d54ff063d11",
            "type": "661f7f66c6537d54ff05ef6c",
            "owner": "65e9f0d2c6537d54ff05dc27"
        },
        {
            "doc_id": "671918e2c6537d54ff0626de",
            "doc_name": "Duchody.txt",
            "task_type": "ArgM",
            "label": "04_Proces",
            "start": 5730,
            "end": 5737,
            "span_id": "672b3620c6537d54ff0627fe",
            "type": "66f3d323c6537d54ff0624f4",
            "owner": "65e9f0d2c6537d54ff05dc27"
        },
        {
            "doc_id": "671918e2c6537d54ff0626dc",
            "doc_name": "red_Co je to územní plánování_final_přidat odkaz na manuál o RP až bude.txt",
            "task_type": "Stoplight",
            "label": "2_Zbytečné",
            "start": 800,
            "end": 1284,
            "span_id": "672b4c1fc6537d54ff062884",
            "type": "661f7f5ac6537d54ff05ef6b",
            "owner": "65e9f0d2c6537d54ff05dc27"
        },
        {
            "doc_id": "671918e2c6537d54ff0626dc",
            "doc_name": "red_Co je to územní plánování_final_přidat odkaz na manuál o RP až bude.txt",
            "task_type": "Stoplight",
            "label": "3_Relevantní",
            "start": 739,
            "end": 799,
            "span_id": "672b4c22c6537d54ff062885",
            "type": "661f7f66c6537d54ff05ef6c",
            "owner": "65e9f0d2c6537d54ff05dc27"
        }]
    }]
}
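
One possible way to work with such span records is to group them by document and annotation scheme and order them by their start offset. The Python sketch below does this for the four example spans from the schema, with the fields reduced to those needed here; the function name `group_spans` is hypothetical. The sketch deliberately does not slice any document text, since the schema does not state whether `end` is inclusive.

```python
from collections import defaultdict

# The example spans from the schema above, reduced to the fields used here.
annotations = [
    {"doc_id": "673b7a38c6537d54ff062bad", "task_type": "Stoplight",
     "label": "3_Relevantní", "start": 12559, "end": 12728},
    {"doc_id": "671918e2c6537d54ff0626de", "task_type": "ArgM",
     "label": "04_Proces", "start": 5730, "end": 5737},
    {"doc_id": "671918e2c6537d54ff0626dc", "task_type": "Stoplight",
     "label": "2_Zbytečné", "start": 800, "end": 1284},
    {"doc_id": "671918e2c6537d54ff0626dc", "task_type": "Stoplight",
     "label": "3_Relevantní", "start": 739, "end": 799},
]


def group_spans(annotations):
    """Group spans by (doc_id, task_type) and sort each group by start offset."""
    grouped = defaultdict(list)
    for span in annotations:
        grouped[(span["doc_id"], span["task_type"])].append(span)
    return {key: sorted(spans, key=lambda s: s["start"])
            for key, spans in grouped.items()}


for (doc_id, scheme), spans in group_spans(annotations).items():
    print(doc_id, scheme, [(s["label"], s["start"], s["end"]) for s in spans])
```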

8.5 JSON schema of annotation spans in argumentative texts

{
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$id": "http://example.com/example.json",
    "type": "object",
    "default": {},
    "title": "Schema of individual annotation spans in argumentative texts",
    "required": [
        "annotations"
    ],
    "properties": {
        "annotations": {
            "type": "array",
            "default": [],
            "title": "Array of annotation spans in all documents, defined by character offsets including possible markdown tags",
            "items": {
                "type": "object",
                "title": "One individual annotation span in a document, defined by character offsets including possible markdown tags",
                "required": [
                    "doc_id",
                    "doc_name",
                    "task_type",
                    "label",
                    "start",
                    "end",
                    "span_id",
                    "linked_rules",
                    "linked_conclusions",
                    "type",
                    "owner"
                ],
                "properties": {
                    "doc_id": {
                        "type": "string",
                        "title": "Gloss-generated document ID. Closed list, defined in the documents JSON schema.",
                        "examples": [
                            "6611a64cc6537d54ff05e668",
                            "678563edc6537d54ff0651c6",
                            "66f19554c6537d54ff06244d",
                            "66ab4dccc6537d54ff05fd1d",
                            "66082017c6537d54ff05e15a"
                        ]
                    },
                    "doc_name": {
                        "type": "string",
                        "title": "File name of the document as it was uploaded to the Gloss annotation tool",
                        "examples": [
                            "Dopis ID_dluh.md",
                            "27 cdo 544_2023.md",
                            "Mestsky_urad_Vyzva_k_zaplaceni_nakladu_rizeni_pred.md",
                            "KVOP_19_Stavarska_zprava_JSm.md",
                            "stavarska-1_kusv.md"
                        ]
                    },
                    "task_type": {
                        "type": "string",
                        "title": "Which annotation scheme, Stoplight or Speech Acts? Speech acts are marked as ArgM.",
                        "enum": [
                            "ArgM",
                            "Stoplight"
                        ]
                    },
                    "label": {
                        "type": "string",
                        "title": "The Czech label of the annotation category. Closed list, defined in the labels scheme. ",
                        "examples": [
                            "Závěr",
                            "Pravidlo",
                            "Metatext",
                            "3_Relevantní",
                            "2_Zbytečné",
                            "Příběh",
                            "Právní otázka"
                        ]
                    },
                    "start": {
                        "type": "integer",
                        "title": "The character offset of the start of the annotation span, in the given document, including markdown tag characters",
                        "examples": [
                            1907,
                            7872,
                            5668,
                            1617,
                            1049,
                            3460,
                            6441,
                            728
                        ]
                    },
                    "end": {
                        "type": "integer",
                        "title": "The character offset of the end of the annotation span, in the given document, including markdown tag characters",
                        "examples": [
                            2038,
                            7893,
                            5678,
                            2578,
                            1210,
                            3692,
                            6664,
                            940
                        ]
                    },
                    "span_id": {
                        "type": "string",
                        "title": "Gloss-generated ID of the given annotation span",
                        "examples": [
                            "667294c5c6537d54ff05f170",
                            "6788e6b5c6537d54ff06547d",
                            "6788e6d3c6537d54ff06547f",
                            "674458e2c6537d54ff063158",
                            "674458b1c6537d54ff063155",
                            "6672963dc6537d54ff05f177",
                            "66b3281fc6537d54ff05ff1c",
                            "66b32b04c6537d54ff05ff3b"
                        ]
                    },
                    "linked_rules": {
                        "type": [
                            "array",
                            "null"
                        ],
                        "title": "Span IDs of annotation spans having the label Pravidlo (Law). This attribute occurs only with the labels Příběh (Narrative) and Závěr (Conclusion).",
                        "items": {
                            "type": "string",
                            "title": "Span ID of one linked annotation span labelled Pravidlo",
                            "examples": [
                                "6672988cc6537d54ff05f18a",
                                "66b327fbc6537d54ff05ff19",
                                "66b3280ac6537d54ff05ff1a",
                                "66b32812c6537d54ff05ff1b",
                                "66b32838c6537d54ff05ff1e",
                                "66b32699c6537d54ff05ff09"
                            ]
                        },
                        "examples": [
                            [],
                            null,
                            [
                                "6672988cc6537d54ff05f18a"
                            ],
                            [
                                "66b327fbc6537d54ff05ff19",
                                "66b3280ac6537d54ff05ff1a",
                                "66b32812c6537d54ff05ff1b",
                                "66b32838c6537d54ff05ff1e",
                                "66b32699c6537d54ff05ff09"
                            ]
                        ]
                    },
                    "linked_conclusions": {
                        "type": [
                            "null",
                            "array"
                        ],
                        "title": "Span IDs of annotation spans having the label Závěr (Conclusion). This attribute occurs only with the labels Příběh (Narrative) and Právní otázka (Legal Issue).",
                        "items": {
                            "type": "string",
                            "title": "Span ID of one linked annotation span labelled Závěr",
                            "examples": [
                                "6672967ec6537d54ff05f17b",
                                "66729683c6537d54ff05f17c",
                                "66b326b5c6537d54ff05ff0c",
                                "66b326c5c6537d54ff05ff0d",
                                "66b326cbc6537d54ff05ff0e",
                                "66b32f71c6537d54ff05ff69"
                            ]
                        },
                        "examples": [
                            null,
                            [
                                "6672967ec6537d54ff05f17b",
                                "66729683c6537d54ff05f17c"
                            ],
                            [
                                "66b326b5c6537d54ff05ff0c",
                                "66b326c5c6537d54ff05ff0d",
                                "66b326cbc6537d54ff05ff0e"
                            ],
                            [
                                "66b32f71c6537d54ff05ff69"
                            ]
                        ]
                    },
                    "type": {
                        "type": "string",
                        "title": "Gloss-generated ID of the annotation category, synonymous with the Czech labels",
                        "examples": [
                            "660d7adec6537d54ff05e5aa",
                            "660d7ac3c6537d54ff05e5a8",
                            "660d7be5c6537d54ff05e5ac",
                            "65f9724dc6537d54ff05df41",
                            "65f97278c6537d54ff05df42",
                            "660d7ad5c6537d54ff05e5a9",
                            "66608523c6537d54ff05f0f2"
                        ]
                    },
                    "owner": {
                        "type": "string",
                        "title": "Gloss-generated ID of the annotator",
                        "examples": [
                            "65ec336dc6537d54ff05dd0f",
                            "677e86e4c6537d54ff065092"
                        ]
                    }
                }
            }
        }
    },
    "examples": [{
        "annotations": [{
            "doc_id": "6611a64cc6537d54ff05e668",
            "doc_name": "Dopis ID_dluh.md",
            "task_type": "ArgM",
            "label": "Závěr",
            "start": 1907,
            "end": 2038,
            "span_id": "667294c5c6537d54ff05f170",
            "linked_rules": [],
            "linked_conclusions": null,
            "type": "660d7adec6537d54ff05e5aa",
            "owner": "65ec336dc6537d54ff05dd0f"
        },
        {
            "doc_id": "678563edc6537d54ff0651c6",
            "doc_name": "27 cdo 544_2023.md",
            "task_type": "ArgM",
            "label": "Pravidlo",
            "start": 7872,
            "end": 7893,
            "span_id": "6788e6b5c6537d54ff06547d",
            "linked_rules": null,
            "linked_conclusions": null,
            "type": "660d7ac3c6537d54ff05e5a8",
            "owner": "677e86e4c6537d54ff065092"
        },
        {
            "doc_id": "678563edc6537d54ff0651c6",
            "doc_name": "27 cdo 544_2023.md",
            "task_type": "ArgM",
            "label": "Metatext",
            "start": 5668,
            "end": 5678,
            "span_id": "6788e6d3c6537d54ff06547f",
            "linked_rules": null,
            "linked_conclusions": null,
            "type": "660d7be5c6537d54ff05e5ac",
            "owner": "677e86e4c6537d54ff065092"
        },
        {
            "doc_id": "66f19554c6537d54ff06244d",
            "doc_name": "Mestsky_urad_Vyzva_k_zaplaceni_nakladu_rizeni_pred.md",
            "task_type": "Stoplight",
            "label": "3_Relevantní",
            "start": 1617,
            "end": 2578,
            "span_id": "674458e2c6537d54ff063158",
            "linked_rules": null,
            "linked_conclusions": null,
            "type": "65f9724dc6537d54ff05df41",
            "owner": "65ec336dc6537d54ff05dd0f"
        },
        {
            "doc_id": "66f19554c6537d54ff06244d",
            "doc_name": "Mestsky_urad_Vyzva_k_zaplaceni_nakladu_rizeni_pred.md",
            "task_type": "Stoplight",
            "label": "2_Zbytečné",
            "start": 1049,
            "end": 1210,
            "span_id": "674458b1c6537d54ff063155",
            "linked_rules": null,
            "linked_conclusions": null,
            "type": "65f97278c6537d54ff05df42",
            "owner": "65ec336dc6537d54ff05dd0f"
        },
        {
            "doc_id": "6611a64cc6537d54ff05e668",
            "doc_name": "Dopis ID_dluh.md",
            "task_type": "ArgM",
            "label": "Příběh",
            "start": 3460,
            "end": 3692,
            "span_id": "6672963dc6537d54ff05f177",
            "linked_rules": [
                "6672988cc6537d54ff05f18a"
            ],
            "linked_conclusions": [
                "6672967ec6537d54ff05f17b",
                "66729683c6537d54ff05f17c"
            ],
            "type": "660d7ad5c6537d54ff05e5a9",
            "owner": "65ec336dc6537d54ff05dd0f"
        },
        {
            "doc_id": "66ab4dccc6537d54ff05fd1d",
            "doc_name": "KVOP_19_Stavarska_zprava_JSm.md",
            "task_type": "ArgM",
            "label": "Příběh",
            "start": 6441,
            "end": 6664,
            "span_id": "66b3281fc6537d54ff05ff1c",
            "linked_rules": [
                "66b327fbc6537d54ff05ff19",
                "66b3280ac6537d54ff05ff1a",
                "66b32812c6537d54ff05ff1b",
                "66b32838c6537d54ff05ff1e",
                "66b32699c6537d54ff05ff09"
            ],
            "linked_conclusions": [
                "66b326b5c6537d54ff05ff0c",
                "66b326c5c6537d54ff05ff0d",
                "66b326cbc6537d54ff05ff0e"
            ],
            "type": "660d7ad5c6537d54ff05e5a9",
            "owner": "65ec336dc6537d54ff05dd0f"
        },
        {
            "doc_id": "66082017c6537d54ff05e15a",
            "doc_name": "stavarska-1_kusv.md",
            "task_type": "ArgM",
            "label": "Právní otázka",
            "start": 728,
            "end": 940,
            "span_id": "66b32b04c6537d54ff05ff3b",
            "linked_rules": null,
            "linked_conclusions": [
                "66b32f71c6537d54ff05ff69"
            ],
            "type": "66608523c6537d54ff05f0f2",
            "owner": "65ec336dc6537d54ff05dd0f"
        }]
    }]
}
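
The example records above show that `linked_rules` and `linked_conclusions` may be either `null` or a (possibly empty) list of IDs. The following Python sketch illustrates one way a consumer of the annotation files might normalize these fields and group spans by document. It is only an illustration: the helper names `link_ids` and `spans_by_document` and the `SAMPLE` variable are hypothetical and not part of KUKY, and the two records are copied from the schema examples above.

```python
import json
from collections import defaultdict

# Two records copied from the "examples" array of the schema above.
SAMPLE = json.loads("""
{
  "annotations": [
    {
      "doc_id": "6611a64cc6537d54ff05e668",
      "doc_name": "Dopis ID_dluh.md",
      "task_type": "ArgM",
      "label": "Příběh",
      "start": 3460,
      "end": 3692,
      "span_id": "6672963dc6537d54ff05f177",
      "linked_rules": ["6672988cc6537d54ff05f18a"],
      "linked_conclusions": [
        "6672967ec6537d54ff05f17b",
        "66729683c6537d54ff05f17c"
      ],
      "type": "660d7ad5c6537d54ff05e5a9",
      "owner": "65ec336dc6537d54ff05dd0f"
    },
    {
      "doc_id": "6611a64cc6537d54ff05e668",
      "doc_name": "Dopis ID_dluh.md",
      "task_type": "ArgM",
      "label": "Závěr",
      "start": 1907,
      "end": 2038,
      "span_id": "667294c5c6537d54ff05f170",
      "linked_rules": [],
      "linked_conclusions": null,
      "type": "660d7adec6537d54ff05e5aa",
      "owner": "65ec336dc6537d54ff05dd0f"
    }
  ]
}
""")


def link_ids(annotation, field):
    """Return the linked IDs stored in `field`, treating null and [] alike."""
    return annotation.get(field) or []


def spans_by_document(annotations):
    """Group annotation records by doc_id, each group sorted by start offset."""
    grouped = defaultdict(list)
    for ann in annotations:
        grouped[ann["doc_id"]].append(ann)
    for spans in grouped.values():
        spans.sort(key=lambda a: a["start"])
    return dict(grouped)


if __name__ == "__main__":
    for doc_id, spans in spans_by_document(SAMPLE["annotations"]).items():
        for ann in spans:
            print(doc_id, ann["label"], ann["start"], ann["end"],
                  link_ids(ann, "linked_rules"),
                  link_ids(ann, "linked_conclusions"))
```

Normalizing `null` to an empty list keeps downstream loops free of special cases. Whether the linked IDs always refer to `span_id` values within the same document is an assumption that should be checked against the data before relying on it.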

Reuse

CC BY-NC-SA 4.0

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
