The following topics are directly relevant to the SPRINT and PONK projects, and each is suitable as a student thesis in Computational Linguistics, Natural Language Processing (NLP), or a related program. Each topic produces artifacts (models, datasets, evaluation results) that would be integrated into the running system.

Fine-Tuning Czech Language Models for Legal Rule Detection

Summary

Train a local classifier to detect specific stylistic/linguistic rule violations in Czech legal text, replacing or complementing the current LLM-based approach.

Motivation

The current system sends each text unit to a general-purpose LLM (e.g., Llama 3.3 or recent GPT models) with a detailed prompt per rule. This is slow (on the order of seconds for each rule-unit pair), non-deterministic, and expensive. A fine-tuned local model could provide faster, more consistent, and potentially more precise detection.

Approach

Frame each rule as a binary classification task (violation / no violation) or a multi-label task across all rules
Fine-tune a Czech encoder model (e.g., RobeCzech, Czert, or a multilingual model like XLM-RoBERTa) on annotated examples
Create training data from: (a) existing manual annotations, (b) LLM-generated synthetic examples, (c) rule-based heuristics
Evaluate precision, recall, F1 per rule; compare with the LLM baseline
Investigate few-shot and data augmentation strategies for rules with sparse examples
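
The per-rule evaluation step above could be sketched as follows. This is a minimal illustration, assuming the multi-label framing in which each text unit carries a set of violated rule IDs; the function name `per_rule_scores`, the rule IDs `R1` and `R2`, and the toy data are hypothetical and not part of the existing system.

```python
from collections import defaultdict


def per_rule_scores(gold, predicted):
    """Compute precision, recall and F1 for each rule.

    `gold` and `predicted` are parallel lists with one set of rule IDs
    per text unit (the multi-label view of the task).
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for gold_rules, pred_rules in zip(gold, predicted):
        for rule in pred_rules & gold_rules:
            counts[rule]["tp"] += 1  # correctly detected violation
        for rule in pred_rules - gold_rules:
            counts[rule]["fp"] += 1  # predicted, but not annotated
        for rule in gold_rules - pred_rules:
            counts[rule]["fn"] += 1  # annotated, but missed

    scores = {}
    for rule, c in counts.items():
        predicted_total = c["tp"] + c["fp"]
        gold_total = c["tp"] + c["fn"]
        precision = c["tp"] / predicted_total if predicted_total else 0.0
        recall = c["tp"] / gold_total if gold_total else 0.0
        denominator = precision + recall
        f1 = 2 * precision * recall / denominator if denominator else 0.0
        scores[rule] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy data with hypothetical rule IDs, for illustration only.
gold = [{"R1"}, {"R1", "R2"}, set(), {"R2"}]
predicted = [{"R1"}, {"R2"}, {"R1"}, {"R2"}]
scores = per_rule_scores(gold, predicted)
for rule in sorted(scores):
    print(rule, scores[rule])
```

Running the same function on the outputs of the local classifier and of the LLM baseline over one shared test set would give directly comparable per-rule numbers.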

Expected outcomes

A fine-tuned model (or ensemble) deployable as a local service; a benchmark dataset; a comparative analysis of local vs. LLM-based detection.

Automatic Discovery of Stylistic Rules from Legal Corpora

Summary

Use unsupervised or semi-supervised NLP methods to discover new candidate stylistic rules from large collections of Czech legal text.

Motivation

The current rule set was defined manually by legal linguists. Additional rules already exist in related projects (e.g., PONK) and will be integrated as the application matures, but there may be many more recurring stylistic issues that could be systematically identified and proposed as new rules.

Approach

Work with existing corpora of Czech legal/administrative texts available within the project
Use anomaly detection, clustering, or contrastive analysis (legal text vs. standard Czech) to identify recurring unusual constructions
Apply dependency parsing and morphological analysis (via UDPipe/MorphoDiTa) to extract syntactic patterns
Rank candidate rules by frequency, severity, and distinctiveness
Validate with domain experts (legal linguists)
Compare discovered candidates with rules already formalized in PONK and SPRINT
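
One simple way to approach the contrastive analysis and ranking steps above is a smoothed log-odds ratio over pattern counts. The sketch below assumes that syntactic patterns have already been extracted and counted for a legal corpus and a reference corpus; the pattern labels and counts are invented for illustration, and the function name `rank_distinctive_patterns` is hypothetical. Log-odds is only one possible distinctiveness measure among several.

```python
import math
from collections import Counter


def rank_distinctive_patterns(legal_counts, reference_counts, min_count=2):
    """Rank patterns by how much more frequent they are in legal text.

    Uses an add-one smoothed log-odds ratio of relative frequencies.
    Patterns seen fewer than `min_count` times in the legal corpus are skipped.
    """
    legal_total = sum(legal_counts.values())
    reference_total = sum(reference_counts.values())
    vocabulary = set(legal_counts) | set(reference_counts)
    vocabulary_size = len(vocabulary)

    ranked = []
    for pattern in vocabulary:
        legal_count = legal_counts.get(pattern, 0)
        if legal_count < min_count:
            continue
        legal_rate = (legal_count + 1) / (legal_total + vocabulary_size)
        reference_rate = (reference_counts.get(pattern, 0) + 1) / (
            reference_total + vocabulary_size
        )
        ranked.append((pattern, math.log(legal_rate / reference_rate)))

    # Highest score first: most characteristic of legal text.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Invented counts for three illustrative pattern labels.
legal = Counter({"nominal-chain": 40, "verb-object": 30, "passive": 30})
reference = Counter({"nominal-chain": 5, "verb-object": 80, "passive": 15})
for pattern, score in rank_distinctive_patterns(legal, reference):
    print(f"{pattern}\t{score:.2f}")
```

A positive score marks a pattern as over-represented in legal text and a negative score as under-represented. Frequency alone does not capture severity, so expert validation remains necessary.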

Expected outcomes

A pipeline for rule discovery; a ranked list of candidate rules with examples; analysis of Czech legal writing patterns.

Integrating PONK with LLM-Based Detection

Summary

Design and evaluate a hybrid detection architecture that combines the existing classical rule-based NLP system (PONK) with LLM-based and fine-tuned model approaches, routing each rule to the optimal method.

Motivation

PONK is an existing rule-based NLP system that already implements many of the same stylistic rules studied in SPRINT using classical methods (morphological analysis, dependency parsing). Some rules have deterministic linguistic signatures well-suited to PONK, while others require the semantic understanding of an LLM. Understanding where each approach excels is key to building an optimal production system.

Approach

Benchmark PONK’s existing rule implementations against the LLM-based evaluator (and optionally a fine-tuned classifier from Topic 1) on the same test set
Evaluate per-rule: precision, recall, F1, latency, cost
Analyze error patterns: where does the LLM succeed and PONK fail, and vice versa?
Propose a hybrid architecture that routes each rule to the optimal method (PONK, fine-tuned model, or LLM)
Implement a routing/orchestration layer and evaluate end-to-end performance
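
A rule-level routing table could be derived from the benchmark roughly as sketched below. This is an assumption about one possible policy, not the project's design: for each rule, take the methods whose F1 is within a small tolerance of the best one, then pick the fastest of them. The rule names, method labels, and all numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodResult:
    method: str        # e.g. "ponk", "fine_tuned", "llm" (illustrative labels)
    f1: float          # F1 on the shared test set
    latency_ms: float  # mean latency per text unit


def build_routing_table(benchmark, f1_tolerance=0.02):
    """Map each rule ID to the method that should handle it.

    Among methods whose F1 is within `f1_tolerance` of the best F1 for a
    rule, the one with the lowest latency wins.
    """
    routing = {}
    for rule_id, results in benchmark.items():
        best_f1 = max(result.f1 for result in results)
        candidates = [r for r in results if best_f1 - r.f1 <= f1_tolerance]
        routing[rule_id] = min(candidates, key=lambda r: r.latency_ms).method
    return routing


# Hypothetical benchmark results, not measured values.
benchmark = {
    "passive_overuse": [
        MethodResult("ponk", 0.91, 5.0),
        MethodResult("llm", 0.92, 2000.0),
    ],
    "vague_reference": [
        MethodResult("ponk", 0.40, 5.0),
        MethodResult("fine_tuned", 0.78, 40.0),
        MethodResult("llm", 0.85, 2000.0),
    ],
}
print(build_routing_table(benchmark))
```

The tolerance makes the trade-off explicit: a rule goes to the slower method only when that method is clearly more accurate. Cost per call could be added as a further criterion in the same way.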

Expected outcomes

A comparative benchmark across methods; a hybrid detection architecture with rule-level routing; practical recommendations for Czech legal NLP.

Prompt Optimization and Structured Output for Legal Text Evaluation

Summary

Systematically evaluate and optimize prompt strategies for rule-based text evaluation using LLMs, with a focus on structured and reliable output.

Motivation

Prompt design significantly affects LLM accuracy, consistency, and output format compliance. The current system uses a fixed prompt template per rule. There is room to improve detection quality through better prompting without changing the model.

Approach

Compare prompt strategies: zero-shot, few-shot, chain-of-thought, rule decomposition
Evaluate the effect of example selection, ordering, and negative examples
Investigate constrained decoding / structured generation (e.g., JSON mode, grammar-constrained sampling) for reliable output parsing
Measure per-rule accuracy, false positive rate, and output format compliance across prompt variants
Explore batching strategies (multiple sentences per prompt, multiple rules per prompt)
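
Output format compliance can be measured with a strict parser applied to the raw model responses. The sketch below assumes a hypothetical response schema with a boolean `violation` field and a string `evidence` field; the actual schema used by the system is not specified here, and the sample responses are invented.

```python
import json

# Assumed response schema (hypothetical): field name -> required type.
REQUIRED_FIELDS = {"violation": bool, "evidence": str}


def parse_verdict(raw_output):
    """Return the parsed verdict, or None if the output breaks the schema."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # not valid JSON, e.g. extra prose around the object
    if not isinstance(data, dict):
        return None
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            return None  # field missing or of the wrong type
    return data


def format_compliance(raw_outputs):
    """Fraction of raw outputs that parse into a schema-conforming verdict."""
    if not raw_outputs:
        return 0.0
    valid = sum(parse_verdict(raw) is not None for raw in raw_outputs)
    return valid / len(raw_outputs)


# Invented responses illustrating typical failure modes.
sample_outputs = [
    '{"violation": true, "evidence": "long nominal chain"}',
    'Sure! Here is the JSON: {"violation": false}',
    '{"violation": "yes", "evidence": ""}',
    '{"violation": false, "evidence": ""}',
]
print(format_compliance(sample_outputs))
```

Computing this rate separately for each prompt variant, with and without constrained decoding, would show how much of the parsing burden structured generation removes.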

Expected outcomes

An optimized prompt library per rule; guidelines for prompt design in legal NLP; quantitative comparison of strategies.

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
