Corpora

Title Type
Bengali Visual Genome Project
CorefUD Project
Czech Academic Corpus Project
Czech Legal Text Treebank Project
Czech Malach Cross-lingual Speech Retrieval Test Collection Project
Czech Named Entity Corpus Project
Czech RST Discourse Treebank 1.0 Project
CzeDLex - A Lexicon of Czech Discourse Connectives Project
CzEng Project
CzEngVallex - Czech and English verbal valency Project
CzeSL Project
Deep Universal Dependencies Project
Deltacorpus Project
ELITR Minuting Corpus Project
EngVallex - English valency lexicon linked to corpora Project
European Language Grid Project
EUROSAI Corpus Project
EVALD 3.0 (Evaluator of Discourse) Project
HamleDT Project
Hausa Visual Question Answering Dataset Project
HindEnCorp Project
Hindi Visual Genome Project
Implicit relations in text coherence Project
Interset Project
Lindat KonText Project
Malayalam Visual Genome Project
Medieval Charter Sections Corpus Project
Methods for rapid discourse annotation in selected corpora Project
Modeling of Complexity in Czech Literary Texts Project
MorfFlex CZ Project
Multilingual Corpus Annotation as a Support for Language Technologies Project
NomVallex: Valency Lexicon of Czech Nouns and Adjectives Project
OdiEnCorp Project
ParCzech Project
PARSEME Project
PAWS (Parallel Anaphoric Wall Street Journal) Project
PDT-C Project
PDT-Vallex: Valency Lexicon Linked to Czech Corpora Project
PDTSC 2.0 Project
PML-Tree Query Project
Prague Czech-English Dependency Treebank Project
Prague Czech-English Dependency Treebank 2.0 Coref Project
Prague Czech-English Dependency Treebank 3.0 Project
Prague Database of Spoken Language 1.0 Project
Prague Dependency Treebank Project
Prague Dependency Treebank 3.0 Project
Prague Dependency Treebank 3.5 Project
Prague Discourse Treebank 1.0 Project
Prague Discourse Treebank 2.0 Project
Prague Discourse Treebank 3.0 Project
Prague English Dependency Treebank Project
Prague Markup Language (PML) Project
QT21 Project
ROMi 1.0 Project
Semantic Pattern Recognition Project
Sentiment Analysis in Czech Project
Shallow discourse parsing in Czech Project
Slovakoczech NLP workshop Project
SumeCzech Project
SynSemClass (formerly CzEngClass) Project
UFAL Medical Corpus Project
UFAL Parallel Corpus of North Levantine Project
UniDive Project
Universal Dependencies Project
UrMonoCorp Project
VPS-30-En: Verb Pattern Sample - 30 English Project
VPS-GradeUp Project
W2C Project
Working with the Penn Discourse Treebank Project
Working with the RST-DT and the RST-SC Project
A comparison of Czech and English verbal valency based on corpus material (theory and practice) Grant
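Asistent přístupné úřední komunikace [Assistant for Accessible Official Communication] Grant
Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] Grant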
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] Grant
Centre for Language Research Infrastructure in the Czech Republic Grant
Čeština ve věku strojového překladu [Czech in the Age of Machine Translation] Grant
Common Language Resources and their Applications - a Marie Curie ITN Grant
Computational Literary Studies Infrastructure Grant
Contextually-based synonymy and valency of verbs in a bilingual setting Grant
Coreference, Discourse Relations and Information Structure in a Contrastive Perspective Grant
Corpus-based Valency Lexicon of Czech Nouns Grant
Cross-lingual approaches to coreference resolution Grant
Deep Syntactic Representation across Languages Grant
Development of statistical methods for spoken dialogue systems Grant
Epistemic and Evidential Markers in Czech Grant
Establishing and operating the Czech node of pan-European infrastructure for research (Vybudování a provoz českého uzlu pan-evropské infrastruktury pro výzkum) Grant
EuroMatrix Grant
European Language Grid Grant
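Explicitní popis jazyka a anotovaná data se zřetelem na češtinu [Explicit Description of Language and Annotated Data with Regard to Czech] Grant
Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in an Educational and Multimedia Environment] Grant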
Global Coherence of Czech Texts in the Corpus-Based Perspective Grant
High Performance Language Technologies Grant
Implicit Relations in Text Coherence Grant
LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure Grant
LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power Grant
Linguistic Factors of Readability in Czech Administrative and Educational Texts Grant
Merlin Grant
Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for rapid discourse annotation in selected corpora] Grant
Modelling dependency syntax across languages Grant
Modelování komplexity českých literárních textů [Modeling of Complexity in Czech Literary Texts] Grant
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data Grant
Morphologically and Syntactically Annotated Corpora of Many Languages Grant
Multilingual Corpus Annotation as a Support for Language Technologies Grant
On Linguistic Structure of Evaluative Meaning in Czech Grant
Reviving Zellig S. Harris: More linguistic information for distributional lexical analysis of English and Czech Grant
Sentence-Level Polarity Detection in a Computer Corpus Grant
Strojový překlad se sémantickou informací [Machine Translation with Semantic Information] Grant
Structure of coreferential chains in parallel language data Grant
Subcategorization of adverbial meanings based on corpus data Grant
TextLink: Skladba diskurzu v evropských jazycích [TextLink: The Structure of Discourse in European Languages] Grant
TextLink: Structuring Discourse in Multilingual Europe Grant
Tools and data for Machine Translation between Related Languages Grant
Towards a Computational Analysis of Text Structure Grant
Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services Grant
Uniform Meaning Representation (UMR) Grant
Universal morphosyntactic annotation of language data Grant
Valency of Non-verbal Predicates. An Extension of Valency Studies to Adjectives and Deadjectival Nouns. Grant
Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns Grant
ForFun 1.0 Tool
Netgraph Tool
PML-TQ Tool