NPFL128 Language Technologies in Practice
2025 Summer

Instructor:	Jirka Hana
e-mail:	Jirka.<my last name>@gmail.com (start the email's subject with NPFL128)
Time & Place:	Monday 15:40 - 17:10 S1

1. Description and Objectives of the Course

The course surveys solutions to common NLP tasks ranging from entity recognition to text generation. It evaluates various approaches (machine learning, rules, larger resources, ...) and their combinations.

Most of the course consists of students presenting and discussing papers relevant to a given topic. Part of the course also involves implementing a prototype system, typically replicating one described in one of the papers.

2. Discussion

This course is organized as a discussion of important papers. Everybody reads all the papers to be able to participate in the discussion. For each paper, one student will be responsible for presenting the paper and leading the discussion.

Not every detail of each paper is important to us now. Do not present aspects that are no longer relevant (e.g., there has been a lot of development in word embeddings, so there is typically no reason to discuss how a paper from 2001 handled them).

Choose a paper and tell me which one.
Read all papers.
Create a Google Doc named (NPFL128 <your_name>) and share it with me. Use the same document for all papers.
For each paper that you did not choose, write 4-8 bullets summarizing:
- the basic idea of the paper
- aspects of the paper you find interesting, useful in other areas, etc.
- aspects of the paper you think could be improved
Do this before the paper is scheduled, and send me a notification about it.
For the selected paper, create a presentation and send it to me by noon on the Monday before the class.

3. Project

There is one programming project due on July 31 (talk to me if you cannot meet the deadline). Note that using git and pull requests is required.

4. Grading

Project	0-50
Active class participation	0-50
Total:	0-100

Grade	Points
1	90-100
2	76-89
3	60-75
4	0-59

5. Schedule

Discussion on	Presented by	Topic	Slides
Feb 17	me	Introduction	Zipf's law, morphological analysis
Feb 24	me	Morphology: Yarowsky & Wicentowski (2000) Minimally Supervised Morphological Analysis by Multimodal Alignment	slides
Mar 3	Adriana	Morphology: Cucerzan & Yarowsky (2002) Bootstrapping a Multilingual Part-of-speech Tagger in One Person-day and Cucerzan & Yarowsky (2003) Minimally Supervised Induction of Grammatical Gender (presented together)	slides
Mar 10	me	Entities: Nadeau & Sekine (2007) A survey of named entity recognition and classification
Mar 17	Jan	Entity linking: Cucerzan (2007) Large-Scale Named Entity Disambiguation Based on Wikipedia Data
Mar 24	Ori	Sentiment analysis: Mohammad (2017) Challenges in Sentiment Analysis
Mar 31		PCA, RAG 1
Apr 7	Matouš	Evaluation: Ribeiro et al. (2020) Beyond Accuracy: Behavioral Testing of NLP Models with CheckList
Apr 14	Oliver	Prompting: Prompt Engineering Guide by Anthropic	slides, jupyter
Apr 21		NO CLASS
Apr 28	Yauheni	GenAI Patterns: Subramaniam & Fowler (2025) Emerging Patterns in Building GenAI Products	slides
May 5	Petr	Agents: Agents Guide by Anthropic
May 12	Anna	DeepSeek: paper, video
May 19		NO CLASS