Large Language Models


Goals of the course:

  1. Explain how the models work
  2. Teach basic usage of the models
  3. Help students critically assess what they read about the models
  4. Encourage thinking about the broader context of using the models

Syllabus from SIS:

  • Basics of neural networks for language modeling
  • Language model typology
  • Data acquisition and curation, downstream tasks
  • Training (self-supervised learning, reinforcement learning with human feedback)
  • Finetuning & Inference
  • Multilinguality and cross-lingual transfer
  • Large Language Model Applications (e.g., conversational systems, robotics, code generation)
  • Multimodality (CLIP, diffusion models)
  • Societal impacts
  • Interpretability

The course is part of the inter-university programme prg.ai Minor.

About

SIS code: NPFL140
Semester: summer
E-credits: 3
Examination: 0/2 C
Guarantors: Jindřich Helcl, Jindřich Libovický

Timespace Coordinates

The course is held on Thursdays at 12:20 in S5.

Lectures

1. Introductory notes and discussion on large language models Slides

2. The Transformer architecture Slides Notes Recording

3. Data & Evaluation Slides

4. LLM Inference Slides Code Recording

5. Hands-on session

6. LLM Training

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

1. Introductory notes and discussion on large language models

 Feb 19 Slides

Instructor: Zdeněk Kasner

Covered topics: aims of the course, passing requirements. We informally discussed what (large) language models are, what they are used for, and what their benefits and downsides are. We also gathered ideas on how to train the models, how to use them, and how to evaluate them.

2. The Transformer architecture

 Feb 26 Slides Notes Recording

Instructor: Jindřich Libovický

Learning objectives. After the lecture you should be able to...

  • Explain the building blocks of the Transformer architecture to a non-technical person.

  • Describe the Transformer architecture using equations, especially the self-attention block.

  • Implement the Transformer architecture (in PyTorch or another framework that does automated differentiation).
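The self-attention block from the objectives above can be sketched in a few lines. This is a toy single-head version with random weights in NumPy; the full architecture adds multiple heads, masking, residual connections, and layer normalization:

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention (single head, no masking)."""
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)  # (seq, seq) attention logits
    # Softmax over keys, shifted by the row max for numerical stability.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v  # each position is a weighted mix of values

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

A PyTorch version looks nearly identical; the framework's job is mainly to differentiate through these operations during training.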

Additional materials.

3. Data & Evaluation

 Mar 5 Slides

Instructors: Ondřej Dušek & Patrícia Schmidtová

Learning objectives:

  • Understand what kinds of data LLMs need for training
  • Know about viable tasks for evaluating LLMs
  • Know where to find data for evaluation tasks
  • Have an idea of how to evaluate: what approaches there are and what their pros and cons are
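As a minimal, concrete example of one evaluation approach: for tasks with a single correct answer, exact-match accuracy simply counts how often the model's output equals the reference. The function below is an illustrative sketch (not a standard library routine), with lowercasing and whitespace stripping as an assumed normalization:

```python
def exact_match_accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference
    answer after lowercasing and stripping whitespace."""
    hits = sum(
        p.strip().lower() == r.strip().lower()
        for p, r in zip(predictions, references)
    )
    return hits / len(references)

preds = ["Paris", "42 ", "blue"]
refs = ["paris", "42", "red"]
print(exact_match_accuracy(preds, refs))  # 2 of 3 match
```

Exact match is cheap and unambiguous, but it penalizes paraphrases, which is one reason softer metrics and human or LLM-based evaluation exist.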

Additional materials:

4. LLM Inference

 Mar 12 Slides Code Recording

Instructor: Zdeněk Kasner

Learning objectives. After the class you should be able to...

  • Understand how to generate text with a Transformer-based language model.
  • Explain differences between decoding algorithms and the role of decoding parameters.
  • Choose a suitable LLM for your task.
  • Run an LLM locally on your computer or on a computing cluster.
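The difference between decoding algorithms can be illustrated with a small, framework-free sketch. `sample_next` is a hypothetical helper (not part of any library) that turns one vector of next-token logits into a token id; temperature approaching 0 recovers greedy decoding, and `top_k` truncates the distribution before sampling:

```python
import math
import random

def sample_next(logits, temperature=1.0, top_k=None, seed=None):
    """Choose a next-token id from raw logits (toy illustration)."""
    if top_k is not None:
        # Keep only the top_k highest-scoring tokens.
        cutoff = sorted(logits, reverse=True)[top_k - 1]
        logits = [l if l >= cutoff else float("-inf") for l in logits]
    if temperature == 0.0:
        # Greedy decoding: always pick the argmax.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Temperature-scaled softmax (shifted by the max for stability).
    m = max(logits)
    weights = [math.exp((l - m) / temperature) for l in logits]
    r = random.Random(seed).random() * sum(weights)
    acc = 0.0
    for i, w in enumerate(weights):
        acc += w
        if acc >= r:
            return i
    return len(weights) - 1

logits = [2.0, 1.0, 0.1]
print(sample_next(logits, temperature=0.0))                   # 0: greedy argmax
print(sample_next(logits, temperature=1.0, top_k=1, seed=7))  # 0: only token left
```

A real LLM repeats this step token by token, feeding each chosen token back into the model; libraries expose the same knobs (temperature, top-k, and others) in their generation APIs.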

Additional materials.

5. Hands-on session

 Mar 19

Instructor: Zdeněk Kasner

6. LLM Training

 Mar 26

Instructor: Ondřej Dušek

Project work

You will work on a team project during the semester. Teams of 4-6 students will work on the following topics.

Project Timeline

  • 2 March: project assignment
  • Week of 16–20 March: kick-off meetings with your supervisor
  • 31 March: Experiment plan submitted
  • 21 April: Self-assessment form
  • Project presentations
    • Early bird: 14 May
    • Standard: 21 May
  • Project reports by the end of the semester (Hard deadline: 5 working days before you need the credit)

In addition, write a weekly log from the start of the project until you submit (the earlier you submit, the fewer logs you need to write).

Project Reports

Each team will submit a report, consisting of:

  • Brief method overview
  • Summary of related work
  • Experimental design
  • Results
  • Conclusions
  • Overview of individual contributions of team members
  • References

The report should be at most 4 pages long, plus references and the contributions overview. You might want to use the ACL paper template.

Reading assignments

You will be asked at least once to read a paper before the class.

Final written test

You are required to take a final written test; it will not be graded.