Principal investigator (ÚFAL): 
Project Manager (ÚFAL): 
Provider: 
Grant id: 458326
Duration: 2026-2028
People: 

Reliable out-of-domain reasoning is a holy grail of current AI research: it is essential for real-world deployment of autonomous AI systems, yet it remains unsolved. Large language models (LLMs) combined with chain-of-thought reasoning and trained using reinforcement learning from verifiable feedback have achieved unprecedented performance in math and coding competitions, but they still struggle with seemingly trivial tasks. This suggests that their reasoning capabilities do not generalize the way human reasoning does, and it leads to the phenomenon commonly known as hallucination. This makes it difficult for humans to trust the outputs of AI systems, hindering their use as fully fledged collaborators. It also limits their applicability to unattended long-horizon tasks, where multiple steps must be performed without error to achieve a desired outcome.

The overarching goal of our project is to develop methods for reliable out-of-domain reasoning in a small, self-contained, and clearly defined domain, and subsequently to integrate these advances into general-purpose systems such as LLMs or vision-language-action (VLA) models. Improvements in the reliability and depth of reasoning directly broaden the applicability of AI in virtually every field, including mathematics, physics, medicine, cryptography, and software verification.

In the first year, 2026, we will focus on developing a solver for the Abstraction and Reasoning Corpus (ARC) and on research into continual learning methods and autoformalization.