The seminar focuses on a deeper understanding of selected machine learning methods for students who already have basic knowledge of machine learning and probabilistic models. The first half of the semester is devoted to methods of unsupervised learning using Bayesian inference (Chinese restaurant process, Pitman-Yor process, Gibbs sampling) and to implementing these methods on selected tasks. Further topics are selected according to students' interests.
SIS code: NPFL097
Semester: winter
E-credits: 3
Examination: 0/2 C
Guarantor: David Mareček
The seminar is held on Thursdays, 9:00 - 10:30, in S1.
Students are expected to be familiar with basic probabilistic and machine learning concepts, roughly to the extent of NPFL067/068 - Statistical Methods in NLP I/II and NPFL054 - Introduction to Machine Learning (in NLP).
1. Oct 4: Introduction to probabilistic machine learning
2. Oct 11: Beta-Bernoulli and Dirichlet-Categorical models
3. Oct 18: Modeling document collections, Categorical mixture models, Expectation-Maximization (Reading)
4. Oct 25: Gibbs sampling, Latent Dirichlet Allocation (Assignment 1)
5. Nov 1: Pitman-Yor process, Word alignment, Word clustering
6. Nov 8: Sampling methods: Rejection sampling, Importance sampling, Metropolis-Hastings sampling
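As a taste of the Bayesian nonparametric methods covered in the seminar, the Chinese restaurant process can be simulated in a few lines. This is an illustrative sketch only; the function name and parameters below are not part of any course material:

```python
import random

def crp_seating(n_customers, alpha, seed=0):
    """Simulate table assignments under a Chinese restaurant process.

    Customer i joins an existing table with probability proportional to
    its current size, or opens a new table with probability
    proportional to the concentration parameter alpha.
    """
    rng = random.Random(seed)
    tables = []       # tables[k] = number of customers at table k
    assignments = []  # assignments[i] = table of customer i
    for i in range(n_customers):
        # Unnormalized probabilities: existing tables plus one new table.
        weights = tables + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(tables):
            tables.append(0)  # customer opens a new table
        tables[k] += 1
        assignments.append(k)
    return assignments, tables

assignments, tables = crp_seating(100, alpha=1.0)
# The total seat count always equals n_customers; the number of
# occupied tables grows roughly logarithmically with n for fixed alpha.
print(len(tables), sum(tables))
```

Larger values of `alpha` make new tables more likely, which is the "rich get richer" dial the seminar discusses when moving from the CRP to the Pitman-Yor process.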
Assignment 1: Latent Dirichlet Allocation
Deadline: Nov 14, 23:59 | Points: 5 | Duration: 2h
Materials: lda-assignment.pdf, lda-data.zip; evaluation metric: document perplexity
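For orientation, the core of collapsed Gibbs sampling for LDA can be sketched as below. This is a minimal illustrative sketch, not the assignment's required interface; the toy corpus, hyperparameter values, and variable names are assumptions:

```python
import random

def lda_gibbs(docs, V, K, alpha=0.1, beta=0.1, iters=50, seed=0):
    """Collapsed Gibbs sampler for LDA.

    docs: list of documents, each a list of word ids in [0, V).
    Each token's topic is resampled from
    p(z = k) proportional to (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
    """
    rng = random.Random(seed)
    n_dk = [[0] * K for _ in docs]      # document-topic counts
    n_kw = [[0] * V for _ in range(K)]  # topic-word counts
    n_k = [0] * K                       # tokens per topic
    z = []                              # current topic of each token
    # Random initialization of topic assignments.
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = rng.randrange(K)
            n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
            zd.append(k)
        z.append(zd)
    # Gibbs sweeps: remove a token's counts, resample its topic, re-add.
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                n_dk[d][k] -= 1; n_kw[k][w] -= 1; n_k[k] -= 1
                weights = [(n_dk[d][j] + alpha) * (n_kw[j][w] + beta)
                           / (n_k[j] + V * beta) for j in range(K)]
                k = rng.choices(range(K), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1; n_kw[k][w] += 1; n_k[k] += 1
    return z, n_dk, n_kw

# Toy corpus: word ids only, three short documents over a 4-word vocabulary.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 2, 1, 3]]
z, n_dk, n_kw = lda_gibbs(docs, V=4, K=2)
```

From the final counts, topic and document distributions are recovered by normalizing `n_kw` and `n_dk` with the same smoothing terms, which is also what document-perplexity evaluation is computed from.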
Christopher Bishop: Pattern Recognition and Machine Learning, Springer-Verlag New York, 2006
Kevin P. Murphy: Machine Learning: A Probabilistic Perspective, The MIT Press, Cambridge, Massachusetts, 2012