The seminar focuses on a deeper understanding of selected unsupervised machine learning methods for students who already have basic knowledge of machine learning and probability models. The first half of the semester is devoted to unsupervised learning methods based on Bayesian inference (Dirichlet-Categorical models, Mixture of Categoricals, Mixture of Gaussians, Expectation Maximization, Gibbs sampling) and to implementing these methods on selected tasks. The remaining lectures are devoted to clustering methods, component analysis, and inspecting deep neural networks.
SIS code: NPFL097
Semester: winter
E-credits: 3
Examination: 1/1 C
Guarantor: David Mareček
The lectures in Czech are given on Tuesdays, 12:20 - 13:50 in S1 (fourth floor)
The lectures in English are given on Wednesdays, 12:20 - 13:50 (write me an e-mail if interested)
Students are expected to be familiar with basic probabilistic and ML concepts, roughly to the extent of:
For the second half of the course, you should also be familiar with the basics of deep-learning methods. I recommend attending
1. Introduction (Slides, Warm-Up test)
2. Beta-Bernoulli probabilistic model (Beta-Bernoulli, Beta distribution)
3. Dirichlet-Categorical probabilistic model (Dirichlet-Categorical, Document collections, Categorical Mixture Models)
4. Latent Dirichlet Allocation (Beta and Dirichlet distributions, Topic Models - Introduction, Topic Models - Evaluation, Topic Models - Gibbs Sampling, Latent Dirichlet Allocation)
5. Gibbs Sampling (Bayesian Inference with Tears)
6. Chinese Segmentation (Bayesian Inference with Tears, Chinese Restaurant Process, Chinese Segmentation)
7. Clustering (Clustering - Basics, Clustering - Hierarchical, Clustering - K-means, Clustering - Gaussian Mixture Models, K-Means and Gaussian Mixture Models, Gaussian Mixture Models)
8. Principal Component Analysis (Principal Component Analysis, SVD)
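As a small preview of the last topic above, a PCA projection can be obtained directly from the SVD of the centered data matrix. A minimal NumPy sketch (the function and variable names are illustrative, not taken from the course slides):

```python
import numpy as np

def pca(X, k):
    """Project rows of X onto the top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                      # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # principal directions, shape (k, d)
    explained_var = S[:k] ** 2 / (len(X) - 1)    # variance captured by each direction
    return Xc @ components.T, components, explained_var

# toy usage: 2-D data lying mostly along the direction (2, 1)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1)) @ np.array([[2.0, 1.0]]) + 0.1 * rng.normal(size=(100, 2))
Z, comps, var = pca(X, k=1)
```

The rows of `Vt` are already unit-length and orthogonal, so no extra normalization is needed; this is the main reason to prefer SVD over forming the covariance matrix explicitly.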
Feb 25
Mar 3
Mar 10
Mar 24
Beta and Dirichlet distributions, Topic Models - Introduction, Topic Models - Evaluation, Topic Models - Gibbs Sampling
Apr 7
Bayesian Inference with Tears
Apr 14
Unsupervised segmentation of texts in languages that do not use spaces between words.
All the necessary information for the second assignment is covered by the tutorial from the last lecture:
Bayesian Inference with Tears
The unsupervised segmentation is described in sections 17 and 29; however, you will need many other hints from the whole text, so please read it all.
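To get a feel for what the sampler does, here is a minimal, heavily simplified sketch of unigram Gibbs word segmentation with a Chinese Restaurant Process word model. All names, hyperparameters, and the base distribution are illustrative choices for this sketch, not the assignment's reference solution (in particular, `p_split` below ignores the count increment between the two word draws):

```python
import random

def base_prob(word, n_chars=26, p_end=0.5):
    """Base distribution P0: geometric word length, uniform characters."""
    return p_end * (1 - p_end) ** (len(word) - 1) * (1 / n_chars) ** len(word)

class CRP:
    """Chinese Restaurant Process over word tokens, backed by P0."""
    def __init__(self, alpha=1.0):
        self.alpha, self.counts, self.total = alpha, {}, 0
    def prob(self, w):
        return (self.counts.get(w, 0) + self.alpha * base_prob(w)) / (self.total + self.alpha)
    def add(self, w):
        self.counts[w] = self.counts.get(w, 0) + 1
        self.total += 1
    def remove(self, w):
        self.counts[w] -= 1
        if not self.counts[w]:
            del self.counts[w]
        self.total -= 1

def gibbs_segment(text, iters=200, alpha=1.0, seed=0):
    rng = random.Random(seed)
    n = len(text)
    bounds = [False] * (n - 1)       # bounds[i]: is there a word boundary after text[i]?
    crp = CRP(alpha)
    crp.add(text)                    # start from one unsegmented word
    for _ in range(iters):
        for i in range(n - 1):
            # find the span of the word(s) touching position i
            left = i
            while left > 0 and not bounds[left - 1]:
                left -= 1
            right = i + 1
            while right < n - 1 and not bounds[right]:
                right += 1
            right += 1               # exclusive end index into text
            w1, w2 = text[left:i + 1], text[i + 1:right]
            whole = text[left:right]
            # remove the affected word(s) before resampling this boundary
            if bounds[i]:
                crp.remove(w1); crp.remove(w2)
            else:
                crp.remove(whole)
            p_join = crp.prob(whole)
            p_split = crp.prob(w1) * crp.prob(w2)
            bounds[i] = rng.random() < p_split / (p_join + p_split)
            if bounds[i]:
                crp.add(w1); crp.add(w2)
            else:
                crp.add(whole)
    # read the segmentation off the boundary variables
    words, start = [], 0
    for i, b in enumerate(bounds):
        if b:
            words.append(text[start:i + 1])
            start = i + 1
    words.append(text[start:])
    return words

seg = gibbs_segment("dogcatdogcatdogcat", iters=50)
```

The real assignment follows the tutorial's setup, which differs in details (annealing, hyperparameters, the exact base distribution), but the remove-resample-add pattern around each boundary variable is the core idea.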
You can also go through the following slides:
May 12
Clustering - Basics, Clustering - Hierarchical, Clustering - K-means, Clustering - Gaussian Mixture Models
K-Means and Gaussian Mixture Models
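The K-means part above boils down to Lloyd's algorithm, alternating an assignment step and a centroid-update step. A minimal NumPy sketch (illustrative only, not the course reference implementation):

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and mean update."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # init from data points
    for _ in range(iters):
        # assignment step: distance of every point to every centroid
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # update step: mean of each cluster (keep the old center if a cluster is empty)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

# two well-separated blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
labels, centers = kmeans(X, k=2)
```

Replacing the hard argmin assignment with soft responsibilities and the plain mean with a responsibility-weighted mean (plus covariance updates) turns this into EM for a Gaussian mixture, which is exactly the connection the lecture draws.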
May 19
Deadline: Apr 14, 23:59 (10 points)
Deadline: May 5, 23:59 (10 points)
Deadline: Sep 15, 23:59