Grants

Dialog
Duration Provider
EDU-AI: AI asistent pro žáky a učitele [AI assistant for pupils and teachers] 04/2021-12/2023 TAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial multi-agent AI avatar in a small democratic society] 09/2024-12/2029 TAČR
Duration Provider
EDU-AI: AI asistent pro žáky a učitele [AI assistant for pupils and teachers] 04/2021-12/2023 TAČR
AIAI: AI: Authorship and Interpretation 2025-2027 GAČR
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial multi-agent AI avatar in a small democratic society] 09/2024-12/2029 TAČR
The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature 2023-2024 ETF UK
Information Retrieval
Duration Provider
EDU-AI: AI asistent pro žáky a učitele [AI assistant for pupils and teachers] 04/2021-12/2023 TAČR
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial multi-agent AI avatar in a small democratic society] 09/2024-12/2029 TAČR
Annotations
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
PONK: Asistent přístupné úřední komunikace [Assistant for accessible official communication] 9/2023-12/2025 TAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ForFun2: ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech poetry in educational and multimedia environments] 09/2023 - 11/2026 TAČR
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for rapid discourse annotation in selected corpora] 2022-2024 GAČR
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Data
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
Adapting Uniform Meaning Representation (UMR) for the Italic/Romance languages 2024-2026 GAUK
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
HPLT: High Performance Language Technologies 2022-2025 HE
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for rapid discourse annotation in selected corpora] 2022-2024 GAČR
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Lexicons
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Modeling Morpheme Flow among Languages Jan 2024 - Dec 2026 GAUK
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Morphology
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Modeling Morpheme Flow among Languages Jan 2024 - Dec 2026 GAUK
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK
Multilingual
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START
Better Tokenization for Multilingual Language Models and Machine Translation 3 years GAČR
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
HPLT: High Performance Language Technologies 2022-2025 HE
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Modeling Morpheme Flow among Languages Jan 2024 - Dec 2026 GAUK
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Mathematical Linguistics field of study] MŠMT - OP VVV
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Semantics
Duration Provider
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START
Adapting Uniform Meaning Representation (UMR) for the Italic/Romance languages 2024-2026 GAUK
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
ForFun2: ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Duration Provider
ATRIUM: Advancing FronTier Research In the Arts and hUManities 2024 - 2027 HE
HumanAId: AI zaměřená na člověka pro udržitelnou a adaptabilní společnost [Human-centred AI for a sustainable and adaptable society] 1. 3. 2025 - 31. 12. 2028 MŠMT - OP JAK
CEDMO 2.0 EU: Central European Digital Media Observatory 2.0 1.1.2024-31.10.2026 EC Digital Europe Programme (DIGITAL)
RES-Q Plus: Comprehensive solutions of healthcare improvement based on the global Registry of Stroke Care Quality 2022-2026 HE
ELE 2: European Language Equality 2 2022-2023 PPPA (EU)
EVERSE: European Virtual Institute for Research Software Excellence 2024-2027 HE
HumanE-AI-Net: HumanE AI Network 1. 9. 2020 - 31. 8. 2024 H2020
Identification and Prevention of Unwanted Gender Bias in Neural Language Models 2023-2024 GAČR
Improving stomach examinations with Artificial Intelligence: A deep learning approach for assisted gastroscopy 1. 7. 2024 - 31. 12. 2026 MŠMT
InCroMin: Interactive Crosslingual Minutes 2024 HE
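Jazykověda, umělá inteligence a jazykové a řečové technologie: od výzkumu k aplikacím [Linguistics, artificial intelligence, and language and speech technologies: from research to applications] 1. 1. 2025 - 31. 12. 2028 MŠMT - OP JAK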
Methods for improving neural machine translation of diverse texts 2023-2025 GAUK
OpenEuroLLM: Open European Family of Large Language Models 36 months Digital Europe Programme
MEMORISE: Virtualisation and Multimodal Exploration of Heritage on Nazi Persecution 2022-2026 HE
Corpora
Duration Provider
PONK: Asistent přístupné úřední komunikace [Assistant for accessible official communication] 9/2023-12/2025 TAČR
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech poetry in educational and multimedia environments] 09/2023 - 11/2026 TAČR
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
HPLT: High Performance Language Technologies 2022-2025 HE
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for rapid discourse annotation in selected corpora] 2022-2024 GAČR
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Machine Learning
Duration Provider
PONK: Asistent přístupné úřední komunikace [Assistant for accessible official communication] 9/2023-12/2025 TAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK
HPLT: High Performance Language Technologies 2022-2025 HE
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Mathematical Linguistics field of study] MŠMT - OP VVV
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
OmniOMR: OmniOMR - optical music recognition using machine learning for digital libraries 2023-2027 NAKI
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial multi-agent AI avatar in a small democratic society] 09/2024-12/2029 TAČR
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Discourse
Duration Provider
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for rapid discourse annotation in selected corpora] 2022-2024 GAČR
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
Monolingual
Duration Provider
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech poetry in educational and multimedia environments] 09/2023 - 11/2026 TAČR
HPLT: High Performance Language Technologies 2022-2025 HE
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial multi-agent AI avatar in a small democratic society] 09/2024-12/2029 TAČR
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Tools
Duration Provider
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech poetry in educational and multimedia environments] 09/2023 - 11/2026 TAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK
The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature 2023-2024 ETF UK
Machine Translation
Duration Provider
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START
Better Tokenization for Multilingual Language Models and Machine Translation 3 years GAČR
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
HPLT: High Performance Language Technologies 2022-2025 HE
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Speech Recognition
Duration Provider
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
Information Structure
Duration Provider
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK
Multi-modality
Duration Provider
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial multi-agent AI avatar in a small democratic society] 09/2024-12/2029 TAČR
Coreference
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK
Linked data
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Parsers
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for rapid discourse annotation in selected corpora] 2022-2024 GAČR
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020
Publications
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Taggers
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Valency
Duration Provider
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Teaching
Duration Provider
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020
LCT: European Masters Program Language and Communication Technologies IX.2007-VIII.2013, IX.2013-VIII.2019, IX.2019-VIII.2025 EU ERASMUS MUNDUS
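EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech poetry in educational and multimedia environments] 09/2023 - 11/2026 TAČR
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Mathematical Linguistics field of study] MŠMT - OP VVV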
Syntax
Duration Provider
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR
ForFun2: ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR
Psycholinguistics
Duration Provider
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR
Multiword Expressions
Duration Provider
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT
Speech Retrieval
Duration Provider
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Spellcheckers
Duration Provider
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury
Provider: Digital Europe Programme
Duration Provider Grant ID PI Area
OpenEuroLLM: Open European Family of Large Language Models 36 months Digital Europe Programme 101195233 Jan Hajič
Provider: HE
Duration Provider Grant ID PI Area
EVERSE: European Virtual Institute for Research Software Excellence 2024-2027 HE 101129744 Pavel Straňák
ATRIUM: Advancing FronTier Research In the Arts and hUManities 2024 - 2027 HE 101132163 Pavel Straňák
InCroMin: Interactive Crosslingual Minutes 2024 HE 101070631 Ondřej Bojar
RES-Q Plus: Comprehensive solutions of healthcare improvement based on the global Registry of Stroke Care Quality 2022-2026 HE 101057603 Pavel Pecina
MEMORISE: Virtualisation and Multimodal Exploration of Heritage on Nazi Persecution 2022-2026 HE 101061016 Pavel Pecina
HPLT: High Performance Language Technologies 2022-2025 HE 101070350 Jan Hajič Corpora, Data, Machine Learning, Machine Translation, Monolingual, Multilingual
Provider: Social Sciences and Humanities Research Council of Canada
Duration Provider Grant ID PI Area
DACT: Digital Analysis of Chant Transmission 2023-2029 Social Sciences and Humanities Research Council of Canada 895-2023-1002 Jan Hajič jr. Corpora, Data, Information Retrieval, Linked data, Machine Learning, Multi-modality, Tools
Provider: ETF UK
Duration Provider Grant ID PI Area
The Anthropology of Artificial Intelligence: Ethics, Understanding, Human Nature 2023-2024 ETF UK 247002 Rudolf Rosa Tools
Provider: Horizon Europe, ERC
Duration Provider Grant ID PI Area
NG-NLG: Next-Generation Natural Language Generation 2022-2027 Horizon Europe, ERC 101039303 Ondřej Dušek Dialog, Linked data, Machine Learning, Semantics
Provider: PPPA (EU)
Duration Provider Grant ID PI Area
ELE 2: European Language Equality 2 2022-2023 PPPA (EU) LC-01884166 (Project 101075356) Jan Hajič

MŠMT - velké infrastruktury

Duration Provider Grant ID PI Area
LINDAT/CLARIN: Centre for Language Research Infrastructure in the Czech Republic 2016 - 2019 MŠMT - velké infrastruktury LM2015071 Jan Hajič Annotations, Coreference, Corpora, Data, Dialog, Discourse, Lexicons, Linked data, Machine Learning, Machine Translation, Morphology, Multi-modality, Parsers, Publications, Semantics, Speech Recognition, Taggers, Tools, Valency
LINDAT/CLARIAH-CZ: LINDAT/CLARIAH-CZ Language Resources and Digital Arts and Humanities Research Infrastructure (2016-)2023-2026 MŠMT - velké infrastruktury LM2023062 Jan Hajič Annotations, Coreference, Corpora, Data, Dialog, Discourse, Information Structure, Lexicons, Linked data, Machine Learning, Machine Translation, Monolingual, Morphology, Multi-modality, Multilingual, Multiword Expressions, Parsers, Publications, Semantics, Speech Recognition, Speech Retrieval, Spellcheckers, Syntax, Taggers, Tools, Valency
Provider: MPO
Duration Provider Grant ID PI Area
CEDMO 2.0 NPO 1.9. 2024 - 30. 4. 2026 MPO MPO 60273/24/21300/21000 Ondřej Bojar Data, Information Retrieval, Information Structure, Multi-modality
Provider: EC Digital Europe Programme (DIGITAL)
Duration Provider Grant ID PI Area
CEDMO 2.0 EU: Central European Digital Media Observatory 2.0 1.1.2024-31.10.2026 EC Digital Europe Programme (DIGITAL) 101158609 Václav Moravec
Provider: MŠMT - OP JAK
Duration Provider Grant ID PI Area
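HumanAId: AI zaměřená na člověka pro udržitelnou a adaptabilní společnost [Human-centred AI for a sustainable and adaptable society] 1. 3. 2025 - 31. 12. 2028 MŠMT - OP JAK CZ.02.01.01/00/23_025/0008691 Barbora Vidová Hladká
Jazykověda, umělá inteligence a jazykové a řečové technologie: od výzkumu k aplikacím [Linguistics, artificial intelligence, and language and speech technologies: from research to applications] 1. 1. 2025 - 31. 12. 2028 MŠMT - OP JAK CZ.02.01.01/00/23_020/0008518 Jan Hajič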

Institutional support for research at the Charles University

Duration Provider Grant ID PI Area
Multilingual Lens: Investigating Large Text Corpora from Different Methodological Perspectives 2024 - 2029 UK UNCE/24/SSH/009 Zdeněk Žabokrtský Annotations, Corpora, Data, Discourse, Information Structure, Multilingual
Language Neutral and Culturally Aware Multilingual Neural Sentence Representations 2023-2026 UK PRIMUS/23/SCI/023 Jindřich Libovický Machine Learning, Multi-modality, Multilingual

Horizon 2020 - European Commission

Duration Provider Grant ID PI Area
CLS Infra: Computational Literary Studies Infrastructure 2021-2025 H2020 101004984 Silvie Cinková Annotations, Corpora, Data, Multilingual, Parsers, Semantics, Taggers, Teaching, Tools
WELCOME: Multiple Intelligent Conversation Agent Services for Reception, Management and Integration of Third Country Nationals. 2020-2023 H2020 870930 Pavel Pecina Annotations, Data, Dialog, Linked data, Machine Translation, Multi-modality, Multilingual, Parsers, Semantics, Speech Recognition
HumanE-AI-Net: HumanE AI Network 1. 9. 2020 - 31. 8. 2024 H2020 952026 Jan Hajič

EU ERASMUS MUNDUS

Duration Provider Grant ID PI Area
LCT: European Masters Program Language and Communication Technologies IX.2007-VIII.2013, IX.2013-VIII.2019, IX.2019-VIII.2025 EU ERASMUS MUNDUS 610622-EPP-1-2019-1-DE-EPPKA1-JMD-MOB Vladislav Kuboň Teaching

MŠMT - OP VVV

Duration Provider Grant ID PI Area
OP VVV LINDAT: LINDAT/CLARIN - Research infrastructure for language technologies – extension of the repository and its computational power 2017–2019 MŠMT - OP VVV CZ.02.1.01/0.0/0.0/16_013/0001781 Jan Hajič Annotations, Corpora, Data, Tools
LangTech: Modernizace oboru Matematická lingvistika [Modernization of the Mathematical Linguistics field of study] MŠMT - OP VVV CZ.02.2.69/0.0/0.0/16_018/0002373 Zdeněk Žabokrtský Machine Learning, Multilingual, Teaching

Technology Agency (Czech Republic)

Duration Provider Grant ID PI Area
PONK: Asistent přístupné úřední komunikace [Assistant for Accessible Official Communication] 9/2023-12/2025 TAČR TQ01000526 Barbora Vidová Hladká Annotations, Corpora, Machine Learning
EdUKate: Promoting digital education of foreign-language children through machine translation 2023-2026 TAČR TQ01000458 Lucie Poláková Data, Machine Translation, Multi-modality, Multilingual
MASAPI: Multilingual assistant for searching, analysing and processing information and decision support 2021-2024 TAČR FW03010656 Pavel Pecina Information Retrieval, Information Structure, Machine Learning, Machine Translation, Semantics
CZDEMOS4AI: Prospěšný multiagentní AI avatar v malé demokratické společnosti [Beneficial Multi-agent AI Avatar in a Small Democratic Society] 09/2024-12/2029 TAČR TQ12000040 Martin Popel Dialog, Information Retrieval, Machine Learning, Monolingual, Multi-modality
EduPo: Generování české poezie v edukačním a multimediálním prostředí [Generating Czech Poetry in an Educational and Multimedia Environment] 09/2023 - 11/2026 TAČR TQ01000153 Rudolf Rosa Annotations, Corpora, Monolingual, Teaching, Tools
EDU-AI: AI asistent pro žáky a učitele [AI Assistant for Pupils and Teachers] 04/2021-12/2023 TAČR TL05000236 Ondřej Dušek Dialog, Information Retrieval

Czech Science Foundation

Duration Provider Grant ID PI Area
Better Tokenization for Multilingual Language Models and Machine Translation 3 years GAČR 25-16242S Jindřich Libovický Machine Translation, Multilingual
NomVallex-Denom: Czech non-verbal predicates motivated by nouns and their syntactic behavior 2025-2027 GAČR 25-16716S Veronika Kolářová Annotations, Data, Lexicons, Linked data, Monolingual, Semantics, Syntax, Valency
AIAI: AI: Authorship and Interpretation 2025-2027 GAČR 25-14501L Rudolf Rosa
HVar: Disagreement in corpus annotation and variation of human understanding of text 2024-2026 GAČR 24-11132S Šárka Zikánová Annotations, Data, Psycholinguistics, Semantics
SEEM-CZ: Epistemic and Evidential Markers in Czech 2023-2025 GAČR 23-05240S Barbora Štěpánková Annotations, Corpora, Data, Lexicons, Semantics
ForFun2: Functions and Forms of Circumstantial Modifications 2023-2025 GAČR 23-05238S Marie Mikulová Annotations, Semantics, Syntax
Identification and Prevention of Unwanted Gender Bias in Neural Language Models 2023-2024 GAČR 23-06912S David Mareček
RapiDisc: Metody pro rychlou diskurzní anotaci ve vybraných korpusech [Methods for Rapid Discourse Annotation in Selected Corpora] 2022-2024 GAČR 22-03269S Jiří Mírovský Annotations, Corpora, Data, Discourse, Parsers
NomVallexDer: Word-formation Relations Reflected in Noun Valency: The Case of Czech Deverbal and Deadjectival Nouns 2022-2024 GAČR 22-20927S Veronika Kolářová Annotations, Corpora, Lexicons, Monolingual, Syntax, Valency
LUSyD: Language Understanding: from Syntax to Discourse 2020–2024 GAČR GX20-16819X Jan Hajič Coreference, Machine Learning, Machine Translation, Parsers, Semantics, Syntax, Valency
Global Coherence: Global Coherence of Czech Texts in the Corpus-Based Perspective 2020 - 2023 GAČR 20-09853S Lucie Poláková Annotations, Corpora, Data, Discourse, Semantics
NEUREM3: Neuronové reprezentace v multimodálním a mnohojazyčném modelování (Neural Representations in Multi-modal and Multi-lingual Modelling) 2019-2023 GAČR 19-26934X Ondřej Bojar Machine Learning, Multi-modality, Multilingual

Ministry of Education, Youth and Sport (Czech Republic)

Duration Provider Grant ID PI Area
Uniform Meaning Representation (UMR) 1.3.2023 - 30.9.2027 MŠMT LUAUS23283 Jan Hajič Corpora, Data, Lexicons, Linked data, Multilingual, Multiword Expressions, Semantics, Syntax, Valency
Improving stomach examinations with Artificial Intelligence: A deep learning approach for assisted gastroscopy 1. 7. 2024 - 31. 12. 2026 MŠMT LUABA24136 Pavel Pecina

Ministry of Culture

Duration Provider Grant ID PI Area
Automatické hodnocení mluveného projevu v češtině [Automated Speech Scoring in Czech] 2023–2027 NAKI DH23P03OVV037 Kateřina Rysová Corpora, Data, Discourse, Monolingual, Tools
OmniOMR: Optical music recognition using machine learning for digital libraries 2023-2027 NAKI DH23P03OVV008 Jan Hajič jr. Annotations, Data, Machine Learning

Program START (UK - OP VVV)

Duration Provider Grant ID PI Area
A data-based approach to competition in word-formation: selected semantic categories across seven languages 2021-2023 START START/HUM/010 Annotations, Data, Lexicons, Morphology, Multilingual, Semantics
Babel Octopus: Robust Multi-Source Speech Translation 2021-2023 START START/SCI/089 Peter Polák Machine Translation, Multilingual, Speech Recognition

Grant Agency of the Charles University

Duration Provider Grant ID PI Area
Modeling Morpheme Flow among Languages Jan 2024- Dec 2026 GAUK 101924 Abishek Stephen Lexicons, Morphology, Multilingual
Adapting Uniform Meaning Representation (UMR) for the Italic/Romance languages 2024-2026 GAUK 104924 Federica Gamba Data, Semantics
Mashcima: Synthetic training data generation and other methods for handwritten music recognition 2023-2025 GAUK 289623 Jiří Mayer Data, Machine Learning, Tools
Methods for improving neural machine translation of diverse texts 2023-2025 GAUK 244523 Josef Jon
Using Auxiliary Subtasks for Learning Constraints in NLP 2023-2025 GAUK 272323 Dávid Javorský Coreference, Machine Learning, Machine Translation, Semantics
Morphological complexity of the verbal lexicon in four languages: Quantitative research based on corpus data 2023-2025 GAUK 246723 Hana Hledíková Corpora, Morphology, Multilingual
ECSS: Evaluation of conversational speech synthesis 2022-2024 GAUK 40222 Ondřej Plátek Data, Dialog, Machine Learning
Compound Identification and Splitting in Four Languages: A Deep Learning Approach 2022-2024 GAUK 128122 Emil Svoboda Machine Learning, Morphology, Multilingual, Tools

Ministry of Education, Youth and Sport (Czech Republic)

Duration Provider Area
INTERCOST-Readability: Modelování komplexity českých literárních textů [Modelling the Complexity of Czech Literary Texts] VI 2018 - X 2021 MŠMT Annotations, Corpora, Data, Discourse, Information Structure, Semantics, Syntax, Teaching
Multilingual Corpus Annotation as a Support for Language Technologies 2014-2016 MŠMT Annotations, Coreference, Corpora, Data, Discourse
MOBAme: Modern Bayesian methods in machine learning 2013-2013 MŠMT Teaching
VYSTADIAL: Development of statistical methods for spoken dialogue systems 2012-2016 MŠMT Corpora, Dialog, Speech Recognition, Tools
KontaktII: Strojový překlad se sémantickou informací [Machine Translation with Semantic Information] 2012-2014 MŠMT Annotations, Corpora, Data, Lexicons, Machine Translation, Semantics, Valency
LINDAT/Clarin: Establishing and operating the Czech node of pan-European infrastructure for research (Vybudování a provoz českého uzlu pan-evropské infrastruktury pro výzkum) 2010-2015 MŠMT Annotations, Coreference, Corpora, Data, Dialog, Discourse, Lexicons, Linked data, Machine Learning, Machine Translation, Morphology, Multi-modality, Parsers, Publications, Semantics, Speech Recognition, Taggers, Tools, Valency
Kontakt: Towards a Computational Analysis of Text Structure 2010 - 2012 MŠMT Annotations, Coreference, Corpora, Data, Discourse
TextLink-cz: TextLink: Skladba diskurzu v evropských jazycích [Discourse Structure in European Languages] 1.11.2015 - 31.12.2017 MŠMT Annotations, Corpora, Data, Discourse, Lexicons, Linked data, Monolingual
LD-Parseme: PARSEME: Parsing a víceslovné výrazy – k jazykovědné přesnosti a výpočetní efektivitě ve zpracování přirozeného jazyka [Parsing and Multiword Expressions: Towards Linguistic Precision and Computational Efficiency in Natural Language Processing] 04-2014 – 03-2017 MŠMT Lexicons, Multiword Expressions, Semantics, Valency

National Science Foundation

Duration Provider Area
PIRE: Partnership for International Research and Education till 2014 NSF Machine Translation, Semantics, Speech Recognition, Teaching

Horizon 2020 - European Commission

Duration Provider Area
CLARIN-PLUS September 2015 – August 2017 H2020
QT21: Quality Translation 21 II.2015-I.2018 H2020 Data, Lexicons, Linked data, Machine Learning, Machine Translation, Tools
SSHOC: Social Sciences & Humanities Open Cloud 2019-30/04/2022 H2020
ELG: European Language Grid 2019-2021 H2020 Annotations, Corpora, Data, Linked data, Machine Translation, Multilingual, Parsers, Semantics, Speech Recognition, Syntax, Taggers, Tools
Bergamot: Browser-based Multilingual Translation 2019-2021 H2020 Machine Translation
ELITR: European Live Translator 2019-2021 H2020 Machine Translation, Speech Recognition
KConnect: Khresmoi Multilingual Medical Text Analysis, Search and Machine Translation Connected in a Thriving Data-Value Chain 2015-2017 H2020 Information Retrieval, Machine Translation, Semantics
HimL: Health in my Language 2.2015–1.2018 H2020 Data, Lexicons, Machine Translation, Morphology
CRACKER: Cracking the Language Barrier: Coordination, Evaluation and Resources for European MT Research 1.2015-12.2017 H2020 Data, Machine Translation

FP6: Research - European Commission

Duration Provider Area
EuroMatrix IX.2006-II.2009 FP6 Annotations, Corpora, Machine Translation, Tools, Valency

Technology Agency (Czech Republic)

Duration Provider Area
THEaiTRE: Umělá inteligence autorem divadelní hry? [Artificial Intelligence as the Author of a Theatre Play?] April 2020 - September 2022 TAČR Dialog, Machine Learning, Tools
INTLIB: Intelligent library 2012-2015 TAČR Data, Linked data, Tools

Institutional support for research at the Charles University

Duration Provider Area
AIvK Exponát Didaktikon: Život s umělou inteligencí: upgrade [Didaktikon Exhibit: Life with Artificial Intelligence: Upgrade] 2023-09-01 - 2023-12-31 UK Teaching, Tools
NaMuDDiS: Natural multi-domain dialogue systems 2019-2021 UK Dialog, Discourse, Teaching
UNCE VITRI: Center for the Transdisciplinary Research of Violence, Trauma and Justice 2018-2023 UK Data, Discourse, Multi-modality, Semantics
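PROGRES Q18 - Společenské vědy: Programy progres [Social Sciences: Progres Programmes] 2017-2021 UK
PROGRES Q48 - Informatika: Programy progres [Computer Science: Progres Programmes] 2017-2021 UK
PRVOUK: Programy rozvoje vědních oblastí na Univerzitě Karlově - Informatika [Programmes for the Development of Fields of Science at Charles University - Computer Science] 2012-2016 UK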

Grant Agency of the Charles University

Duration Provider Area
Arithmetic Properties in the space of Language Model Prompts 2023 GAUK Machine Learning
Independent component analysis of continuous word representations 2021–2022 GAUK Annotations, Machine Learning, Semantics
Dialogue systems focused on combining tasks and chit-chat 2021-2023 GAUK Dialog, Machine Learning
Controllable NLG: Controllable Natural Language Generation 2021-2023 GAUK
Exploring Multilingual Representations of Language Units in Neural Networks 2021 - 2023 GAUK Information Structure, Machine Learning, Multilingual
Named Entity Linking 2020-2022 GAUK Data, Machine Learning, Multilingual, Taggers
Machine Translation of Interpreted Speech 2020-2022 GAUK Machine Translation, Multi-modality, Speech Recognition
Domain Adaptation for Natural Language Generation 2020-2022 GAUK Data, Machine Learning
Low resource methods for dialogue systems applications 2020 - 2022 GAUK Dialog, Discourse, Machine Learning
Neural machine translation for low-resource languages 2019-2021 GAUK Machine Translation, Monolingual
Developing derivational networks for multiple languages 2019-2021 GAUK Data, Morphology, Multilingual
Vektorová reprezentace textu založená na neuronových sítích [Neural-Network-Based Vector Representation of Text] 2019 - 2021 GAUK Information Retrieval, Machine Learning, Machine Translation
Research of Methods of Neural Machine Translation Evaluation 2018-2020 GAUK Machine Translation
Utilising Linguistic Knowledge in Neural Machine Translation 2018 - 2020 GAUK Machine Translation
Multimodal Optical Music Recognition using Deep Learning 2017-2019 GAUK Machine Learning, Multi-modality
Universal morphosyntactic annotation of language data 2017-2019 GAUK Annotations, Corpora, Machine Learning, Multilingual, Parsers
DeepSynt: Deep Syntactic Representation across Languages 2017-2018 GAUK Corpora, Data, Multilingual
Open domain dialog management with knowledge graphs 2016-2018 GAUK Data, Dialog, Machine Learning
open-domain SLU: Spoken Language Understanding in open-domain environment 2016-2018 GAUK Dialog, Information Retrieval, Linked data, Machine Learning, Semantics
ANNMT: Utilization of artificial neural networks in machine translation 2016-2018 GAUK Machine Translation
Using Language Knowledge in Scene Text Recognition 2015-2017 GAUK Multi-modality
cross-coref: Cross-lingual approaches to coreference resolution 2015-2017 GAUK Annotations, Coreference, Corpora, Data, Machine Learning, Machine Translation, Multilingual
DiaMine: Information mining from spoken dialogue 2015-2017 GAUK Data, Dialog, Machine Learning, Speech Recognition
Čapek GAUK: An alternative way of getting more annotated linguistic data 2014-2016 GAUK Annotations, Tools
AdaNLG: An adaptive natural language generator 2014-2016 GAUK Dialog, Multilingual, Semantics
croSSSynt: Modelling dependency syntax across languages 2014-2016 GAUK Annotations, Corpora, Data, Multilingual, Parsers
MSDS: Modern Spoken Dialog Systems 2014, 2015, 2016 GAUK Data, Dialog, Machine Learning, Speech Recognition
DepRefSet: Utilizing a Multitude of References in Machine Translation 2013-2015 GAUK Data, Machine Translation
Interactive information retrieval in audiovisual dialogue corpora 2013-2015 GAUK Information Retrieval, Speech Retrieval
Tools and data for Machine Translation between Related Languages 2012-2013 GAUK Corpora, Data, Machine Translation, Tools, Valency
Utilization of coreference in MT: Utilization of coreference in Machine Translation 2011-2013 GAUK Linked data, Machine Translation
Sentence-Level Polarity Detection in a Computer Corpus 2011-2013 GAUK Annotations, Corpora, Data, Lexicons, Tools

Ministry of Culture

Duration Provider Area
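Prameny Krkonoš: Prameny Krkonoš. Vývoj systému evidence, zpracování a prezentace pramenů k historii a kultuře Krkonoš a jeho využití ve výzkumu a edukaci [Sources of the Krkonoše Mountains. Development of a System for Recording, Processing and Presenting Sources on the History and Culture of the Krkonoše Mountains and Its Use in Research and Education] 2020-2022 NAKI
ÚSTR: Systém pro trvalé uchování dokumentace a prezentaci historických pramenů z období totalitních režimů [System for the Permanent Preservation of Documentation and Presentation of Historical Sources from the Period of Totalitarian Regimes] 2016-2019 NAKI
VIADAT: Virtuální asistent pro zpřístupnění historických audiovizuálních dat [Virtual Assistant for Accessing Historical Audiovisual Data] 2016-2019 NAKI Annotations, Speech Recognition, Tools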
AMALACH 2012-2015 NAKI Information Retrieval, Machine Translation, Multi-modality, Speech Recognition, Speech Retrieval, Teaching
EVALD (Evaluator of Discourse): Automatic Evaluation of Text Coherence in Czech 1. 3. 2016 – 31. 12. 2019 NAKI Coreference, Discourse, Information Structure

CELSA

Duration Provider Area
CELL: Contextual Machine Learning of Language Translations 2020-2022 CELSA Machine Learning, Machine Translation, Multi-modality, Multilingual

Czech Science Foundation

Duration Provider Area
Word-formation structure of Czech words: a data-based research 2019-2021 GAČR Data, Morphology
NomVallex II.: Valency of Non-verbal Predicates. An Extension of Valency Studies to Adjectives and Deadjectival Nouns. 2019-2021 GAČR Corpora, Lexicons, Valency
LiFR: Linguistic Factors of Readability in Czech Administrative and Educational Texts 2019-2021 GAČR Annotations, Corpora, Data, Discourse, Information Structure, Monolingual, Semantics, Syntax
CzeDParse: Automatická analýza diskurzních vztahů v češtině [Automatic Analysis of Discourse Relations in Czech] 2019-2021 GAČR Annotations, Corpora, Data, Discourse, Lexicons, Parsers
Mnohojazyčný strojový překlad [Multilingual Machine Translation] 2018-2020 GAČR Machine Learning, Machine Translation, Multilingual
LSD: Linguistic Structure Representation in Neural Networks 2018-2020 GAČR Machine Learning, Machine Translation, Morphology, Multilingual, Parsers, Syntax, Taggers
VALLEX - Between Reciprocity and Reflexivity: The Case of Czech Reciprocal Constructions 2018-2020 GAČR Data, Lexicons, Monolingual, Semantics, Syntax, Valency
AnaConn: Anaphoricity in Connectives: Lexical Description and Bilingual Corpus Analysis 2017–2019 GAČR Discourse, Lexicons, Multilingual
ForFun: Subcategorization of adverbial meanings based on corpus data 2017-2019 GAČR Annotations, Corpora, Data, Monolingual, Semantics
IRTC: Implicit Relations in Text Coherence 2017-2019 GAČR Annotations, Corpora, Data, Discourse, Psycholinguistics
CzEngClass: Contextually-based synonymy and valency of verbs in a bilingual setting 2017-2019 GAČR Annotations, Corpora, Data, Lexicons, Semantics, Valency
CorefChains: Structure of coreferential chains in parallel language data 2016-2018 GAČR Annotations, Coreference, Corpora, Data
NomVallex: Corpus-based Valency Lexicon of Czech Nouns 2016-2018 GAČR Corpora, Lexicons, Valency
DerInfMorph: An Integrated Approach to Derivational and Inflectional Morphology of Czech 2016-2018 GAČR Data, Monolingual, Morphology
Manyla: Morphologically and Syntactically Annotated Corpora of Many Languages 2015–2017 GAČR Annotations, Corpora, Data, Morphology, Multilingual, Parsers, Taggers
zelligharris: Reviving Zellig S. Harris: More linguistic information for distributional lexical analysis of English and Czech 2015-2017 GAČR Annotations, Corpora, Data, Semantics, Taggers
On Linguistic Structure of Evaluative Meaning in Czech 2015-2017 GAČR Annotations, Corpora, Data, Lexicons, Semantics
Combining Words: Syntactic Properties of Czech Multiword Expressions with Light Verbs 2015-2017 GAČR Annotations, Data, Lexicons, Multiword Expressions, Valency
LiStr: Sentence structure induction without annotated corpora 2014 - 2016 GAČR Machine Learning, Multilingual, Parsers
CzEngVallex: A comparison of Czech and English verbal valency based on corpus material (theory and practice) 2013-2015 GAČR Annotations, Corpora, Data, Lexicons
Vybrané derivační vztahy pro automatické zpracování češtiny [Selected Derivational Relations for Automatic Processing of Czech] 2012–2014 GAČR Morphology
VALLEX: Delving Deeper: Lexicographic Description of Syntactic and Semantic Properties of Czech Verbs 2012-2015 GAČR Annotations, Data, Lexicons, Semantics, Syntax, Valency
Systematic, economical and corpus-based description of valency properties of Czech deverbal nouns (theory and practice) 2012-2014 GAČR Lexicons, Valency
CEMI: Center for large-scale multi-modal data interpretation 2012 - 2019 GAČR Multi-modality
CorefDisk: Coreference, Discourse Relations and Information Structure in a Contrastive Perspective 2012 - 2015 GAČR Annotations, Coreference, Corpora, Data, Discourse, Information Structure
CZECHMATE: Čeština ve věku strojového překladu [Czech in the Age of Machine Translation] 2011 – 2013 GAČR Annotations, Corpora, Data, Machine Translation, Morphology, Parsers
NoSCoM: Non-Standard Computational Models and Their Applications in Complexity, Linguistics, and Learning 2010-2014 GAČR
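Komputační lingvistika: Explicitní popis jazyka a anotovaná data se zřetelem na češtinu [Computational Linguistics: Explicit Description of Language and Annotated Data with a Focus on Czech] 2010-2013 GAČR Annotations, Coreference, Corpora, Data, Discourse, Information Structure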

OP Praha – Pól růstu ČR

Duration Provider Area
MTviet: Machine Translation from Vietnamese into Czech for the Purposes of the Police of the Czech Republic 2017-2018 Praha OP PPR Machine Translation

Mellon Foundation (USA)

Duration Provider Area
LAPPS-CLARIN: Transatlantic Collaboration between LAPPS and CLARIN: Semantic, Technical and Infrastructural Interoperability of Services 2016-2018, 2019-2021 Mellon Foundation (USA) Annotations, Corpora, Data, Tools

FP7: Research - European Commission

Duration Provider Area
TextLink: Structuring Discourse in Multilingual Europe 2014 - 2017 FP7 Coreference, Corpora, Discourse, Linked data, Multilingual
QTLeap: Quality Translation by Deep Language Engineering Approaches 2013–2016 FP7 Linked data, Machine Translation
PARSEME: Parsing and Multiword Expressions 2013-2017 FP7 Lexicons, Multiword Expressions, Semantics, Valency
MosesCore 2012-2015 FP7 Data, Machine Translation, Teaching, Tools
EUDAT: European Data Infrastructure 2011–2014 FP7 Data
FAUST: Feedback Analysis for User adaptive Statistical Translation 2010–2013 FP7 Machine Translation
KHRESMOI: Medical information analysis and retrieval 2010-2014 FP7 Information Retrieval, Machine Translation
CLARA: Common Language Resources and their Applications - a Marie Curie ITN 2009-2013 FP7 Annotations, Corpora, Data, Machine Translation, Teaching
EuroMatrixPlus 2009-2012 FP7 Machine Translation

EU Lifelong Learning Programme

Duration Provider Area
Merlin 2012-2014 LLP Annotations, Corpora, Data

MVČR

Duration Provider Area
PoliSys: Systém pro analýzu policejních dat pro potřeby Policie ČR [System for the Analysis of Police Data for the Needs of the Police of the Czech Republic] 03/2017-03/2018 MVČR Data, Information Retrieval, Machine Learning, Morphology

Inspire

Duration Provider Area
INSPIRE: INSPIRE in Pocket Inspire Machine Translation