The Czech Legal Text Treebank (CLTT) is a manually annotated corpus of dependency trees. The treebank consists of 1,128 sentences from the legal domain.
The sentences were taken from the Accounting Act (563/1991 Coll., as amended) and the Decree on Double-entry Accounting for undertakers (500/2002 Coll., as amended). The selection follows the goals of the INTLIB project, which focuses on the accounting subdomain.
The annotations in CLTT follow the framework originally formulated in the Prague Dependency Treebank (PDT) project. They apply the dependency approach to syntactic analysis, in which the verb plays the main role. Technically, this is the analytical (a-) layer of annotation: each token in the sentence has one corresponding node, and each dependency is labeled with a syntactic dependency function stored in the afun attribute.
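The a-layer structure described above can be illustrated with a minimal Python sketch. It is not the CLTT file format or any TrEd API: the class name ANode, its fields, and the helper functions are hypothetical, and the four-token Czech sentence is invented for illustration rather than taken from the treebank. The afun labels used (Pred, Sb, Obj, AuxK) are standard PDT analytical functions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ANode:
    """One a-layer node; exactly one node exists per token (hypothetical model)."""
    ord: int      # 1-based position of the token in the sentence
    form: str     # surface word form
    parent: int   # ord of the governing node; 0 stands for the technical root
    afun: str     # analytical (syntactic dependency) function


def children(nodes: List[ANode], parent_ord: int) -> List[ANode]:
    """Return the nodes that depend directly on the node with the given ord."""
    return [n for n in nodes if n.parent == parent_ord]


def find_predicate(nodes: List[ANode]) -> Optional[ANode]:
    """Return the main predicate: a Pred node attached to the technical root."""
    for n in nodes:
        if n.parent == 0 and n.afun == "Pred":
            return n
    return None


# Invented example sentence: "Jednotka vede účetnictví ."
# (roughly "The unit keeps accounts.")
sentence = [
    ANode(1, "Jednotka", 2, "Sb"),     # subject of the verb
    ANode(2, "vede", 0, "Pred"),       # main verb governs the clause
    ANode(3, "účetnictví", 2, "Obj"),  # object of the verb
    ANode(4, ".", 0, "AuxK"),          # sentence-final punctuation
]

pred = find_predicate(sentence)
dependents = [(n.form, n.afun) for n in children(sentence, pred.ord)]
```

In this sketch the verb is the only Pred node under the technical root, and its subject and object are found by following the parent links, which mirrors the verb-centered analysis described above.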
To make manual annotation as easy as possible, we developed a special annotation strategy:
CLTT is available in the LINDAT/CLARIN repository: http://hdl.handle.net/11234/1-1516
To browse CLTT, you need to run the open-source application TrEd with the INTLIB extension. This extension can be installed directly from TrEd using Setup >> Manage Extensions >> Get New Extensions. Make sure that the repository http://ufal.mff.cuni.cz/tred/extensions/core/ is enabled in Setup >> Manage Extensions >> Edit Repositories.
Please use the following text to cite CLTT:
Kríž, Vincent; Hladká, Barbora and Urešová, Zdeňka, 2015, Czech Legal Text Treebank, LINDAT/CLARIN digital library at Institute of Formal and Applied Linguistics, Charles University in Prague, http://hdl.handle.net/11234/1-1516.
Distributed under the CC BY-NC-SA licence.