NPFL128 Language Technologies in Practice
2026 Summer

Instructor: Jirka Hana
E-mail: Jirka.<my last name>@gmail.com (start the email's subject with NPFL128)
Time & Place: Thursday 15:40 - 17:10 S11

1. Description and Objectives of the Course

The course surveys solutions to common NLP tasks ranging from entity recognition to text generation. It evaluates various approaches (machine learning, rules, larger resources, ...) and their combinations.

Most of the course consists of students presenting and discussing papers relevant to a given topic. Part of the course also involves implementing a prototype system, typically replicating one described in one of the papers.

2. Discussion

This course is organized as a discussion of important papers. Everybody reads all the papers to be able to participate in the discussion. For each paper, one student will be responsible for presenting the paper and leading the discussion.

Not every detail of each paper is important to us now. Do not present aspects that are no longer relevant (e.g., word embeddings have developed a lot since then, so there is typically no reason to discuss how a paper from 2001 handled them).

  • Choose a paper and tell me which one.
  • Read all the papers.
  • Create a Google Doc named "NPFL128 <your_name>" and share it with me. Use the same document for all papers.
  • For each paper that you did not choose, write 4-8 bullets in that document before the class for which the paper is scheduled, and notify me when you have done so. The bullets should summarize:
    • the basic idea of the paper
    • aspects of the paper you find interesting, useful in other areas, etc.
    • aspects of the paper you think could be improved
  • For the selected paper, create a presentation and send it to me by Monday noon before the class.

3. Project

There is one programming project, due on July 31 (talk to me if you cannot meet the deadline). Note that using git and pull requests is required.

4. Grading

Project                      0-50
Active class participation   0-50
Total                        0-100

Grade   Points
1       90-100
2       76-89
3       60-75
4       0-59
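
For illustration only, the scale above can be written as a short function. This is a sketch that assumes whole-number points and that the total is simply the sum of the two components; the function name final_grade is made up for this example and is not part of any course tooling.

```python
def final_grade(project: int, participation: int) -> int:
    """Map the two component scores (each 0-50) to a final grade (1-4).

    Assumption: points are whole numbers and the total is their sum.
    """
    if not (0 <= project <= 50 and 0 <= participation <= 50):
        raise ValueError("each component must be between 0 and 50")

    total = project + participation  # 0-100

    # Thresholds taken from the grade table: 90-100, 76-89, 60-75, 0-59.
    if total >= 90:
        return 1
    if total >= 76:
        return 2
    if total >= 60:
        return 3
    return 4


if __name__ == "__main__":
    # Hypothetical scores: 42 for the project, 40 for participation.
    print(final_grade(42, 40))
```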

5. Schedule

Discussion on Presented by Topic Slides
Feb 19 me Introduction Zipf's law, morphological analysis
Feb 26 me Morphology: Yarowsky & Wicentowski (2000): Minimally Supervised Morphological Analysis by Multimodal Alignment
Mar 5 Morphology: Cucerzan & Yarowsky (2002): Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day and Cucerzan & Yarowsky (2003): Minimally Supervised Induction of Grammatical Gender (presented together)
Mar 12 Fedor RAG: Gao et al. (2023): Retrieval-Augmented Generation for Large Language Models: A Survey
Mar 19
Mar 26
Apr 2
Apr 9
Apr 16
Apr 23 NO CLASS
Apr 30
May 7
May 14
May 21
Candidate topics and papers