Large Language Models

Goals of the course:

  1. Explain how the models work
  2. Teach basic usage of the models
  3. Help students critically assess what they read about them
  4. Encourage thinking about the broader context of using the models

Syllabus from SIS:

  • Basics of neural networks for language modeling
  • Language model typology
  • Data acquisition and curation, downstream tasks
  • Training (self-supervised learning, reinforcement learning from human feedback)
  • Finetuning & Inference
  • Multilinguality and cross-lingual transfer
  • Large Language Model Applications (e.g., conversational systems, robotics, code generation)
  • Multimodality (CLIP, diffusion models)
  • Societal impacts
  • Interpretability

The course is part of the inter-university programme prg.ai Minor.

About

SIS code: NPFL140
Semester: summer
E-credits: 3
Examination: 0/2 C
Guarantors: Jindřich Helcl, Jindřich Libovický

Timespace Coordinates

The course is held on Thursdays at 12:20 in S5.

Lectures

1. Introductory notes and discussion on large language models Slides

2. The Transformer architecture Slides Notes Recording

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

1. Introductory notes and discussion on large language models

 Feb 19 Slides

Instructor: Zdeněk Kasner

Covered topics: aims of the course, passing requirements. We informally discussed what (large) language models are, what they are for, and what their benefits and downsides are. We also gathered ideas on how to train the models, how to use them, and how to evaluate them.

2. The Transformer architecture

 Feb 26 Slides Notes Recording

Instructor: Jindřich Libovický

Learning objectives. After the lecture you should be able to...

  • Explain the building blocks of the Transformer architecture to a non-technical person.

  • Describe the Transformer architecture using equations, especially the self-attention block.

  • Implement the Transformer architecture (in PyTorch or another framework that does automated differentiation).
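
As a rough illustration of the second and third objectives, the self-attention block is commonly written as Attention(Q, K, V) = softmax(Q Kᵀ / √d_k) V, where Q, K and V are linear projections of the input. The sketch below is not taken from the course materials: it is a minimal single-head version without masking, written in plain Python lists so that it runs without any dependencies. The helper names (`softmax`, `matmul`, `self_attention`) and the toy inputs are assumptions made for this example; a real implementation would use PyTorch tensors or a similar framework.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention without masking.

    x is an (n x d_model) input; w_q, w_k, w_v are (d_model x d_k)
    projection matrices. Returns (output, attention_weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    # Scaled similarity of every query with every key.
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    # Each row becomes a probability distribution over positions.
    weights = [softmax(row) for row in scores]
    # Output is a weighted average of the value vectors.
    return matmul(weights, v), weights


# Toy input: three positions, model dimension 2 (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # identity projections, so Q = K = V = x
out, weights = self_attention(x, identity, identity, identity)
```

With identity projections, each position attends most strongly to the positions whose vectors have the largest dot product with its own, and every row of `weights` sums to one.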

Additional materials.

Project work

You will work on a team project during the semester. The project will be presented during the final class. Each team will submit a report, consisting of:

  • Brief method overview
  • Summary of related work
  • Experimental design
  • Results
  • Conclusions
  • Overview of individual contributions of team members
  • References

The report should be at most 4 pages long, plus references and contributions. You might want to use the ACL paper template.

Reading assignments

You will be asked at least once to read a paper before the class.

Final written test

You need to take part in a final written test that will not be graded.