Selected Problems in Machine Learning

The seminar focuses on a deeper understanding of selected machine learning methods for students who already have basic knowledge of machine learning and probabilistic models. The first half of the semester is devoted to methods of unsupervised learning using Bayesian inference (Chinese restaurant process, Pitman-Yor process, Gibbs sampling) and to implementing these methods on selected tasks. Further topics are selected according to the students' interests.

About

SIS code: NPFL097
Semester: winter
E-credits: 3
Examination: 0/2 C
Guarantor: David Mareček

Timespace Coordinates

The seminar is held on Thursdays, 9:00–10:30, in S1.

Course prerequisites

Students are expected to be familiar with basic probabilistic and ML concepts, roughly to the extent of NPFL067/068 - Statistical Methods in NLP I/II and NPFL054 - Introduction to Machine Learning (in NLP).

Course passing requirements

  • All students are required to actively participate in the classes.
  • Completion of ~2 homework assignments (programming).

Lectures

1. Introduction

2. Beta-Bernoulli and Dirichlet-Categorical models

3. Modeling document collections, Categorical Mixture models, Expectation-Maximization

4. Gibbs Sampling, Latent Dirichlet allocation

5. Pitman-Yor process, Word alignment, Word clustering

6. Sampling Methods: Rejection Sampling, Importance Sampling, Metropolis-Hastings Sampling

1. Introduction

 Oct 4

Introduction to probabilistic machine learning

2. Beta-Bernoulli and Dirichlet-Categorical models

 Oct 11

  • slides for Beta-Bernoulli and Dirichlet-Categorical models by Carl Edward Rasmussen from the University of Cambridge
  • a derivation of the expected value of the Beta distribution can be found on YouTube
  • a web application showing the Beta-Bernoulli distribution and many other distributions can be found at RandomServices.com
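
As a concrete illustration of the Beta-Bernoulli model, here is a minimal sketch (my own, not part of the course materials) of the conjugate posterior update and the posterior mean, using only NumPy:

    import numpy as np

    # Beta(alpha, beta) prior over the Bernoulli parameter theta;
    # the hyperparameters act as pseudo-counts of heads and tails.
    alpha, beta = 2.0, 2.0

    # Observed coin flips: 1 = heads, 0 = tails.
    data = np.array([1, 0, 1, 1, 0, 1, 1, 1])
    heads = int(data.sum())
    tails = len(data) - heads

    # Conjugacy: Beta prior + Bernoulli likelihood gives a
    # Beta(alpha + heads, beta + tails) posterior.
    post_alpha, post_beta = alpha + heads, beta + tails

    # The expected value of Beta(a, b) is a / (a + b); under this model
    # it is also the predictive probability that the next flip is heads.
    print("posterior mean:", post_alpha / (post_alpha + post_beta))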

3. Modeling document collections, Categorical Mixture models, Expectation-Maximization

 Oct 18 Reading

  • slides for the introduction and for categorical and mixture models by Carl Edward Rasmussen from the University of Cambridge
  • Expectation-Maximization is also very well described in Chapter 9 of Bishop's book Pattern Recognition and Machine Learning
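
The E- and M-steps for a mixture of categoricals over bag-of-words counts are compact enough to sketch directly. The following is a minimal NumPy implementation under my own naming and smoothing choices, not code from the slides:

    import numpy as np

    def em_categorical_mixture(counts, K, n_iter=50, seed=0):
        """EM for a K-component mixture of categoricals.

        counts: (D, V) matrix, counts[d, v] = frequency of word v in document d.
        Returns mixing weights pi (K,) and word distributions theta (K, V).
        """
        rng = np.random.default_rng(seed)
        D, V = counts.shape
        pi = np.full(K, 1.0 / K)
        theta = rng.dirichlet(np.ones(V), size=K)          # (K, V)

        for _ in range(n_iter):
            # E-step: responsibilities r[d, k] proportional to
            # pi_k * prod_v theta_kv^counts[d, v], in log space for stability.
            log_r = np.log(pi) + counts @ np.log(theta).T  # (D, K)
            log_r -= log_r.max(axis=1, keepdims=True)
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)

            # M-step: re-estimate parameters from expected counts.
            pi = r.mean(axis=0)
            theta = r.T @ counts + 1e-9                    # avoid log(0)
            theta /= theta.sum(axis=1, keepdims=True)

        return pi, theta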

4. Gibbs Sampling, Latent Dirichlet allocation

 Oct 25 assignment1

  • slides for Gibbs sampling and Latent Dirichlet allocation by Carl Edward Rasmussen from the University of Cambridge
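
For reference, a collapsed Gibbs sampler for LDA fits in a short function. This is a minimal sketch in the spirit of the lecture topic; the function name and hyperparameter defaults are illustrative assumptions, not course code:

    import numpy as np

    def lda_gibbs(docs, K, V, alpha=0.1, beta=0.01, n_iter=200, seed=0):
        """Collapsed Gibbs sampling for LDA.

        docs: list of lists of word ids; K topics; V vocabulary size.
        Each z_i is resampled from p(z_i = k | z_-i, w), proportional to
        (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta).
        """
        rng = np.random.default_rng(seed)
        n_dk = np.zeros((len(docs), K))   # topic counts per document
        n_kw = np.zeros((K, V))           # word counts per topic
        n_k = np.zeros(K)                 # total words per topic
        z = []

        for d, doc in enumerate(docs):    # random initialization
            zd = rng.integers(K, size=len(doc))
            z.append(zd)
            for w, k in zip(doc, zd):
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

        for _ in range(n_iter):
            for d, doc in enumerate(docs):
                for i, w in enumerate(doc):
                    k = z[d][i]           # remove the current assignment
                    n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                    p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                    k = rng.choice(K, p=p / p.sum())
                    z[d][i] = k           # add the new assignment back
                    n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
        return z, n_dk, n_kw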

5. Pitman-Yor process, Word alignment, Word clustering

 Nov 1
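
The Pitman-Yor process is easiest to picture through its Chinese-restaurant seating scheme: customer i sits at an occupied table k with probability proportional to n_k - d, or opens a new table with probability proportional to theta + d*K. Below is a minimal sketch of drawing a random partition this way (my own illustration; with discount d = 0 it reduces to the plain Chinese restaurant process):

    import numpy as np

    def pitman_yor_crp(n, d=0.5, theta=1.0, seed=0):
        """Sample a seating arrangement of n customers from the
        Pitman-Yor Chinese restaurant process (0 <= d < 1, theta > -d)."""
        rng = np.random.default_rng(seed)
        tables = []        # tables[k] = number of customers at table k
        seating = []
        for _ in range(n):
            K = len(tables)
            # Occupied tables get weight n_k - d; a new table gets theta + d*K.
            weights = np.array([n_k - d for n_k in tables] + [theta + d * K])
            k = rng.choice(K + 1, p=weights / weights.sum())
            if k == K:
                tables.append(1)          # open a new table
            else:
                tables[k] += 1
            seating.append(k)
        return seating, tables

The discount d produces the power-law distribution of table sizes that makes the Pitman-Yor process a good fit for word-frequency modeling.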

6. Sampling Methods: Rejection Sampling, Importance Sampling, Metropolis-Hastings Sampling

 Nov 8

  • see Chapter 11 of Christopher Bishop: Pattern Recognition and Machine Learning
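
A random-walk Metropolis-Hastings sampler takes only a few lines. The sketch below (illustrative; the Gaussian proposal width is an arbitrary choice) draws samples given any unnormalized log-density:

    import numpy as np

    def metropolis_hastings(log_p, x0, n_samples, step=0.5, seed=0):
        """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal,
        so the acceptance ratio reduces to p(x_new) / p(x)."""
        rng = np.random.default_rng(seed)
        x, samples = x0, []
        for _ in range(n_samples):
            x_new = x + rng.normal(0.0, step)   # propose a local move
            # Accept with probability min(1, p(x_new) / p(x)), in log space.
            if np.log(rng.random()) < log_p(x_new) - log_p(x):
                x = x_new
            samples.append(x)                   # on rejection, repeat x
        return np.array(samples)

    # Example: sampling from a standard normal via its unnormalized log-density.
    samples = metropolis_hastings(lambda x: -0.5 * x ** 2, x0=0.0, n_samples=10000)
    print(samples.mean(), samples.std())        # roughly 0 and 1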

assignment1

 Deadline: Nov 14 23:59  5 points  Duration: 2h

Latent Dirichlet Allocation: lda-assignment.pdf, lda-data.zip, evaluation, document perplexity
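
One common way to compute document perplexity from an LDA point estimate is sketched below; the assignment's exact evaluation protocol is defined in lda-assignment.pdf, so treat this only as an illustration (theta and phi are assumed parameter estimates, not names from the assignment):

    import numpy as np

    def document_perplexity(docs, theta, phi):
        """Perplexity = exp(-(sum of word log-likelihoods) / (total word count)).

        docs: list of lists of word ids; theta: (D, K) document-topic
        proportions; phi: (K, V) topic-word distributions.
        """
        log_lik, n_words = 0.0, 0
        for d, doc in enumerate(docs):
            p_w = theta[d] @ phi      # p(w | d) = sum_k theta_dk * phi_kw
            log_lik += np.sum(np.log(p_w[doc]))
            n_words += len(doc)
        return np.exp(-log_lik / n_words)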

  • Christopher Bishop: Pattern Recognition and Machine Learning, Springer-Verlag New York, 2006

  • Kevin P. Murphy: Machine Learning: A Probabilistic Perspective, The MIT Press, Cambridge, Massachusetts, 2012