Principal investigator (ÚFAL): 
Project Manager (ÚFAL): 
Provider: 
Grant id: PRIMUS/23/SCI/023
ÚFAL budget: 9158000
Duration: 2023–2026

Recently, multilingual sentence representations have made it possible to represent many languages in a single model and thus to transfer task-specific models between languages in a zero-shot fashion. These methods could revolutionize computational linguistics and natural language processing (NLP) by unifying the processing of all languages in a single framework. However, the language neutrality of current models is not yet sufficient for that.

We believe two points were neglected in previous work. First, theoretical work suggests that physical perception might help ground meaning and thereby improve the language neutrality of multilingual representations. Second, language meaning is socially constructed and inseparable from culture, which sets inherent limits on language neutrality. Multilingual representations must therefore be aware of the cultural dimension of meaning, which should be interpretable and controllable.

In this project, we tackle these two issues of multilingual representation. As a result, we aim to make NLP models available in many languages without the need for explicit translation or for task-specific training data in multiple languages.