Selected Problems in Machine Learning

Course focus

The course has been designed especially for PhD students with a deep interest in Machine Learning. The course is a flexible combination of lectures, discussions, exercises and literature reading, aimed at the following three topics:
  1. refreshing (and deepening the understanding of) basic notions of Machine Learning
  2. introduction to Bayesian inference
  3. practising unsupervised ML (especially methods based on sampling)

Course prerequisites

Students are expected to be familiar with basic probabilistic and ML concepts, roughly to the extent covered by NPFL067/068 - Statistical Methods in NLP I/II and NPFL054 - Introduction to Machine Learning (in NLP).

Course schedule

  1. "Calibration" test - let me know what you already know
  2. Patching the holes revealed by the calibration test.
  3. Patching the holes revealed by the calibration test, continued.
  4. A thorough exercise on the Beta distribution.
    • let us admire two mighty parameters generating a broad family of different shapes
    • generalization to n dimensions - the Dirichlet distribution
    • supplementary materials - mathematicalmonk's videos
  5. Derivation of some simple Bayesian models - let's enjoy conjugacy!
  6. Assignment 1 - Word-alignment using Gibbs sampling
  7. Reading - Bayesian Inference
  8. Kernel methods
  9. Assignment 2 - Segmentation of dependency trees
  10. Gibbs sampling in NLP - two case studies
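To see conjugacy (topics 4 and 5 above) in action, here is a minimal sketch of the classic Beta-Binomial case: a Beta prior on a coin's head probability stays a Beta after observing flips, with the parameters simply incremented by the counts. The function name and numbers are my own illustrative choices, not part of the course materials.

```python
# Beta-Binomial conjugacy sketch (illustrative example, not course code).
# Prior: theta ~ Beta(alpha, beta). Observing `heads` successes and
# `tails` failures (Binomial likelihood) yields the posterior
# Beta(alpha + heads, beta + tails) -- the prior and posterior share a family.

def beta_binomial_posterior(alpha, beta, heads, tails):
    """Return the posterior Beta parameters after observing coin flips."""
    return alpha + heads, beta + tails

# Start from a uniform prior Beta(1, 1) and observe 7 heads, 3 tails.
a, b = beta_binomial_posterior(1.0, 1.0, heads=7, tails=3)
print(a, b)         # posterior is Beta(8, 4)
print(a / (a + b))  # posterior mean = 8/12 ≈ 0.667
```

The same bookkeeping generalizes to the Dirichlet-Multinomial pair in n dimensions: each observed category count is added to the corresponding Dirichlet parameter.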
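As a warm-up for the sampling-based assignments, the mechanics of Gibbs sampling can be sketched on a toy target: a bivariate normal with correlation rho, sampled by alternately drawing each coordinate from its full conditional. This is my own minimal example, not the assignments' word-alignment or tree-segmentation models.

```python
import math
import random

# Gibbs sampler sketch (illustrative toy example, not course code).
# Target: bivariate standard normal with correlation rho. Full conditionals:
#   x | y ~ N(rho * y, 1 - rho^2),   y | x ~ N(rho * x, 1 - rho^2).

def gibbs_bivariate_normal(rho, n_samples, seed=0):
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)   # conditional standard deviation
    x = y = 0.0
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, sd)    # resample x from p(x | y)
        y = rng.gauss(rho * x, sd)    # resample y from p(y | x)
        samples.append((x, y))
    return samples

samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
n = len(samples)
corr = sum(a * b for a, b in samples) / n  # means ~0, variances ~1
print(round(corr, 1))                      # empirical correlation, close to 0.8
```

The point of the exercise: each conditional is easy to sample even when the joint is awkward, which is exactly the situation exploited by the collapsed Gibbs samplers used in the NLP case studies.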

Other useful links

Course passing requirements

All students are required to actively participate in the classes.