The seminar focuses on deeper understanding of selected unsupervised machine learning methods for students who already have basic knowledge of machine learning and probability models. The first half of the semester is devoted to methods of unsupervised learning using Bayesian inference (Dirichlet-Categorical models, Mixture of Categoricals, Mixture of Gaussians, Expectation Maximization, Gibbs sampling) and implementation of these methods on selected tasks. Other lectures will be devoted to clustering methods, componet analysis and unsupervised inspecting deep neural networks.
SIS code: NPFL097
Semester: winter
E-credits: 3
Examination: 1/1 C
Guarantor: David Mareček
The course will be taught online over Zoom, given the current pandemic. All lectures will be recorded so you can catch up later.
All enrolled students will get a Zoom link via email. If you want to take part and have not officialy enrolled, email me.
Students are expected to be familiar with basic probabilistic concepts, roughly in the extent of:
In the second half of the course, it will be an advantage for you if you know the basics of deep-learning methods. I recommend to attend
1. Introduction Slides Warm-Up test
2. Beta-Bernoulli probabilistic model Beta-Bernoulli Beta distribution
3. Dirichlet-Categorical probabilistic model, Modeling document collections Dirichlet-Categorical Document collections Categorical Mixture Models Expectation-Maximization
4. Bayesian Mixture Models, Gibbs Sampling, Latent Dirichlet Allocation Gibbs Sampling Gibbs Sampling for Bayesian mixture Latent Dirichlet allocation Algorithms for LDA and Mixture of Categoricals Gibbs Sampling
5. Programming session: Latent Dirichlet Allocation Latent Dirichlet Allocation
6. Chinese Restaurant Process Chinese Restaurant Process Bayessian inference with Tears
7. Programming session: Text Segmentation Chinese Segmentation
8. Unsupervised POS tagging, Word-Alignment, and Dependency Parsing Tagging, Alignment, Parsing
9. Mixture of Gaussians and other clustering methods K-Means and Gaussian Mixture Models Clustering Methods
10. Dimesionality Reduction Dimensionality Reduction
11. Programming session: Clustering and Component Analysis Clustering and Component Analysis on Word Vectors
12. Interpretation of Neural Networks Interpretation of Neural Networks Hidden in the Layers
Oct 05 Oct 09 (in English)
Oct 19 Oct 16 (in English)
Oct 26 Oct 30 (in English)
Nov 02 Nov 06 (in English)
Nov 09 Nov 13 (in English)
Nov 16 ###e Date: Nov 20 (in English)
Nov 23 Nov 27 (in English)
Nov 30 Dec 04 (in English)
Dec 07 Dec 11 (in English)
Dec 14 Dec 18 (in English)
Dec 21 Dec 18 (in English)
Jan 04 Jan 08 (in English)
Deadline: Nov 30 23:59 Deadline: Dec 04 23:59 (for English students) 10 points
Deadline: Dec 14 23:59 Deadline: Dec 18 23:59 (for English students) 10 points
Deadline: Jan 18 23:59 Deadline: Jan 15 23:59 (for English students) 10 points
Christopher Bishop: Pattern Recognition and Machine Learning, Springer-Verlag New York, 2006 (read here)
Kevin P. Murphy: Machine Learning: A Probabilistic Perspective, The MIT Press, Cambridge, Massachusetts, 2012 (read here)
David Mareček, Jindřich Libovický, Tomáš Musil, Rudolf Rosa, Tomasz Limisiewicz: HIDDEN IN THE LAYERS: Interpretation of Neural Networks for Natural Language Processing. Institute of Formal and Applied Linguistics, 2020 (read_here)