Vojtěch Lanz

Office: 424
Email: lanz@ufal.mff.cuni.cz
Address: Malostranské náměstí 25, 118 00 Praha 1, Czech Republic

Main Research Interests

  • Clinical NLP
    • Question-Answering and Information Extraction from long, multilingual clinical documents
    • Domain-specific tokenization and pretraining for clinical language models
  • Efficiency and Optimization of Large Language Models
    • Post-training alignment of hybrid models (Transformers + Mamba)
    • KV cache compression
  • Computational Musicology
    • Gregorian chant analysis using Bayesian nonparametrics and bioinformatic methods

Projects

  • PhD Thesis
    Topic: Document-level information extraction
    Supervisor: doc. RNDr. Pavel Pecina, Ph.D.
  • RES-Q+: Comprehensive solutions of healthcare improvement based on the global Registry of Stroke Care Quality.
  • GAUK: Empowering Healthcare with Large Language Models: Reducing Clinicians' Workload and Improving Stroke Patient Care
  • DACT: Digital Analysis of Chant Transmission, advancing the global study of plainchant transmission through digital analysis and computational resources.
  • GI-Insight: New methods for stomach examination using artificial intelligence: Utilization of deep learning for assisted gastroscopy.

Curriculum Vitae

My CV

Selected Bibliography

Papers

Vojtěch Lanz and Pavel Pecina (2025): When Multilingual Models Compete with Monolingual Domain-Specific Models in Clinical Question Answering. In Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health), pages 69–82, Albuquerque, New Mexico. Association for Computational Linguistics. (url)

Vojtěch Lanz and Jan Hajič jr. (2025): Gregorian melody, modality, and memory: Segmenting chant with Bayesian nonparametrics. In Proceedings of the 26th International Society for Music Information Retrieval Conference (ISMIR 2025), Daejeon, Korea.

Vojtěch Lanz and Pavel Pecina (2025): CUNI-a at ArchEHR-QA 2025: Do We Need Giant LLMs for Clinical QA? In Proceedings of the 24th Workshop on Biomedical Language Processing (Shared Tasks), pages 27–40, Vienna, Austria. Association for Computational Linguistics. (url)

Vojtěch Lanz, Kristýna Szabová, and Jan Hajič jr. (2025): Making computational study of Gregorian melody accessible with ChantLab. In Proceedings of the Music Encoding Conference 2025 (MEC 2025), London. (https://works.hcommons.org/records/z50gm-qf714)

Vojtěch Lanz and Pavel Pecina (2024): Paragraph Retrieval for Enhanced Question Answering in Clinical Documents. In Proceedings of the 23rd Workshop on Biomedical Natural Language Processing, pages 580–590, Bangkok, Thailand. Association for Computational Linguistics. (url)

Vojtěch Lanz and Jan Hajič jr. (2023): Text boundaries do not provide a better segmentation of Gregorian antiphons. In Proceedings of the 10th International Conference on Digital Libraries for Musicology (DLfM '23), pages 72–76, New York, NY, USA. Association for Computing Machinery. (url)
 

Theses