Unsupervised Machine Learning in NLP

The seminar provides a deeper understanding of selected unsupervised machine learning methods for students who already have basic knowledge of machine learning and probability models. The first half of the semester is devoted to unsupervised learning using Bayesian inference (Dirichlet-Categorical models, Mixture of Categoricals, Mixture of Gaussians, Expectation-Maximization, Gibbs sampling) and to implementing these methods on selected tasks. The remaining lectures cover clustering methods, component analysis, and unsupervised methods for inspecting deep neural networks.

About

SIS code: NPFL097
Semester: winter
E-credits: 3
Examination: 1/1 C
Guarantor: David Mareček

Timespace Coordinates

The course will be taught online over Zoom, given the current pandemic. All lectures will be recorded so you can catch up later.

  • The lectures in Czech are given on Mondays 10:40 - 12:10, the first lecture is on Oct 5.
  • The lectures in English are given on Fridays 10:40 - 12:10, the first lecture is on Oct 9.

All enrolled students will get a Zoom link via email. If you want to take part and have not officially enrolled, email me.

Course prerequisites

Students are expected to be familiar with basic probabilistic concepts, roughly to the extent of:

  • NPFL067 - Statistical methods in NLP I

In the second half of the course, it will be an advantage if you know the basics of deep-learning methods; I recommend attending a course on deep learning beforehand.

Course passing requirements

  • There are three programming assignments during the term. You can obtain 10 points for each. If submitted after the deadline, you can obtain at most half of the points.
  • You can obtain 10 points for an individual 30-minute presentation on a selected machine-learning method or task, or on a novel approach in the field.
  • You pass the course if you obtain at least 20 points.

Lectures

1. Introduction (Slides, Warm-Up test)

2. Beta-Bernoulli probabilistic model (Beta-Bernoulli, Beta distribution)

3. Dirichlet-Categorical probabilistic model, Modeling document collections (Dirichlet-Categorical, Document collections, Categorical Mixture Models, Expectation-Maximization)

4. Bayesian Mixture Models, Gibbs Sampling, Latent Dirichlet Allocation (Gibbs Sampling, Gibbs Sampling for Bayesian mixture, Latent Dirichlet allocation, Algorithms for LDA and Mixture of Categoricals, Gibbs Sampling)

5. Programming session: Latent Dirichlet Allocation (Latent Dirichlet Allocation)

6. Chinese Restaurant Process (Chinese Restaurant Process, Bayesian Inference with Tears)

7. Programming session: Text Segmentation (Chinese Segmentation)

8. Unsupervised POS tagging, Word-Alignment, and Dependency Parsing (Tagging, Alignment, Parsing)

9. Mixture of Gaussians and other clustering methods (K-Means and Gaussian Mixture Models, Clustering Methods)

10. Dimensionality Reduction (Dimensionality Reduction)

11. Programming session: Clustering and Component Analysis (Clustering and Component Analysis on Word Vectors)

12. Interpretation of Neural Networks (Interpretation of Neural Networks, Hidden in the Layers)

1. Introduction

 Oct 05  Oct 09 (in English)

  • Course overview (Slides)
  • Revision of the basics of probability and machine-learning theory (Warm-Up test)

2. Beta-Bernoulli probabilistic model

 Oct 19  Oct 16 (in English)
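The Beta-Bernoulli model covered in this lecture has a closed-form conjugate update. A minimal sketch (function name and toy data are my own, not part of the course materials): with a Beta(a, b) prior on the success probability θ and k successes in n trials, the posterior is Beta(a + k, b + n − k).

```python
def beta_bernoulli_posterior(a, b, flips):
    """Conjugate Beta-Bernoulli update: return the posterior Beta
    parameters after observing a list of 0/1 outcomes."""
    k = sum(flips)          # number of successes
    n = len(flips)          # number of trials
    return a + k, b + n - k

# Uniform Beta(1, 1) prior, four coin flips with three heads.
a_post, b_post = beta_bernoulli_posterior(1, 1, [1, 1, 0, 1])
posterior_mean = a_post / (a_post + b_post)
print(a_post, b_post, posterior_mean)  # 4 2 0.666...
```

The posterior mean a′/(a′ + b′) smoothly interpolates between the prior mean and the empirical frequency as data accumulates.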

3. Dirichlet-Categorical probabilistic model, Modeling document collections

 Oct 26  Oct 30 (in English)
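The Dirichlet-Categorical model generalizes the Beta-Bernoulli case to a vocabulary of V outcomes. A small sketch of its posterior predictive distribution (function and example data are illustrative, not from the course materials): under a symmetric Dirichlet(α) prior, the predictive probability of word w is (n_w + α) / (n + Vα).

```python
from collections import Counter

def dirichlet_categorical_predictive(counts, alpha, vocab_size):
    """Posterior predictive p(w | data) for each word id w under a
    symmetric Dirichlet(alpha) prior: (n_w + alpha) / (n + V * alpha)."""
    n = sum(counts.values())
    denom = n + vocab_size * alpha
    return [(counts.get(w, 0) + alpha) / denom for w in range(vocab_size)]

# Toy corpus over a vocabulary of 3 word ids.
data = [0, 0, 1, 0]
probs = dirichlet_categorical_predictive(Counter(data), alpha=1.0, vocab_size=3)
print(probs)  # [4/7, 2/7, 1/7]
```

Note that unseen word 2 still gets nonzero probability: the Dirichlet prior acts as add-α smoothing.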

4. Bayesian Mixture Models, Gibbs Sampling, Latent Dirichlet Allocation

 Nov 02  Nov 06 (in English)
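To illustrate the collapsed Gibbs sampling idea from this lecture, here is a deliberately minimal sketch (my own simplification, not the course reference implementation) for a Bayesian mixture of categoricals in which each data item is a single token id. With the mixture weights and component distributions integrated out, each item's cluster is resampled from p(z_i = k | rest) ∝ (n_k + α)(c_{k,w} + β)/(n_k + Vβ).

```python
import random

def collapsed_gibbs(tokens, K, V, alpha=1.0, beta=0.5, iters=50, seed=0):
    """Collapsed Gibbs sampler for a Bayesian mixture of categoricals
    where each data item is a single token id in [0, V)."""
    rng = random.Random(seed)
    z = [rng.randrange(K) for _ in tokens]          # cluster of each item
    n_k = [0] * K                                   # items per cluster
    c_kw = [[0] * V for _ in range(K)]              # token counts per cluster
    for i, w in enumerate(tokens):
        n_k[z[i]] += 1
        c_kw[z[i]][w] += 1
    for _ in range(iters):
        for i, w in enumerate(tokens):
            n_k[z[i]] -= 1                          # remove item i from counts
            c_kw[z[i]][w] -= 1
            # p(z_i = k | rest) ∝ (n_k + alpha) * (c_kw + beta) / (n_k + V*beta)
            weights = [(n_k[k] + alpha) * (c_kw[k][w] + beta) / (n_k[k] + V * beta)
                       for k in range(K)]
            z[i] = rng.choices(range(K), weights=weights)[0]
            n_k[z[i]] += 1
            c_kw[z[i]][w] += 1
    return z

labels = collapsed_gibbs([0, 0, 0, 1, 1, 1], K=2, V=2)
print(labels)
```

The LDA sampler in the assignment follows the same remove-resample-add pattern, only with per-document topic counts in place of the global cluster counts.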

5. Programming session: Latent Dirichlet Allocation

 Nov 09  Nov 13 (in English)

6. Chinese Restaurant Process

 Nov 16  Nov 20 (in English)
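The Chinese Restaurant Process is easy to simulate directly; a short sketch (the function is illustrative, not from the course materials): customer i joins an existing table t with probability proportional to its occupancy n_t, or opens a new table with probability proportional to α.

```python
import random

def crp_sample(n, alpha, seed=0):
    """Draw one seating arrangement for n customers from CRP(alpha)."""
    rng = random.Random(seed)
    tables = []                      # tables[t] = number of customers at t
    assignment = []                  # table index chosen by each customer
    for _ in range(n):
        weights = tables + [alpha]   # existing tables, then a new one
        t = rng.choices(range(len(weights)), weights=weights)[0]
        if t == len(tables):         # customer opened a new table
            tables.append(0)
        tables[t] += 1
        assignment.append(t)
    return assignment, tables

assignment, tables = crp_sample(100, alpha=1.0)
print(len(tables), sum(tables))  # occupied tables (grows ~ alpha*log n), 100
```

The rich-get-richer dynamics keep the number of occupied tables logarithmic in n, which is what makes the CRP useful as a nonparametric prior over cluster counts.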

7. Programming session: Text Segmentation

 Nov 23  Nov 27 (in English)

8. Unsupervised POS tagging, Word-Alignment, and Dependency Parsing

 Nov 30  Dec 04 (in English)

9. Mixture of Gaussians and other clustering methods

 Dec 07  Dec 11 (in English)
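K-Means, discussed in this lecture as the hard-assignment limit of a Gaussian mixture, alternates two steps: assign each point to its nearest center, then move each center to the mean of its points. A minimal 1-D sketch (my own toy implementation, not the course code):

```python
def kmeans_1d(points, centers, iters=20):
    """Plain 1-D k-means: alternate assignment and mean-update steps."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for x in points:
            k = min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
            clusters[k].append(x)
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    labels = [min(range(len(centers)), key=lambda j: (x - centers[j]) ** 2)
              for x in points]
    return centers, labels

data = [0.1, 0.2, -0.1, 9.8, 10.1, 10.3]
centers, labels = kmeans_1d(data, centers=[min(data), max(data)])
print(sorted(round(c, 2) for c in centers))  # [0.07, 10.07]
```

EM for a Gaussian mixture replaces the hard nearest-center assignment with soft posterior responsibilities and additionally updates variances and mixing weights.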

10. Dimensionality Reduction

 Dec 14  Dec 18 (in English)

  • Principal Component Analysis, Independent Component Analysis, Canonical Correlation Analysis
  • Slides: Dimensionality Reduction
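The first principal component can be found with power iteration on the centered covariance matrix. A small 2-D sketch (my own illustration, not the course code): repeatedly multiplying a vector by the covariance matrix and renormalizing converges to the direction of maximal variance.

```python
def pca_first_component(data, iters=100):
    """Leading principal component of 2-D points via power iteration
    on the centered 2x2 covariance matrix."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    centered = [(x - mx, y - my) for x, y in data]
    # Entries of the 2x2 covariance matrix.
    cxx = sum(x * x for x, _ in centered) / n
    cxy = sum(x * y for x, y in centered) / n
    cyy = sum(y * y for _, y in centered) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

# Points spread mainly along the diagonal y = x.
pts = [(i + 0.1 * (-1) ** i, i) for i in range(10)]
v = pca_first_component(pts)
print(v)  # close to (1/sqrt(2), 1/sqrt(2))
```

ICA and CCA, also covered here, optimize different criteria (statistical independence and cross-view correlation, respectively) rather than variance.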

11. Programming session: Clustering and Component Analysis

 Dec 21  Dec 18 (in English)

12. Interpretation of Neural Networks

 Jan 04  Jan 08 (in English)

Latent Dirichlet Allocation

 Deadline: Nov 30 23:59 (Dec 04 23:59 for English students)  10 points

Chinese Segmentation

 Deadline: Dec 14 23:59 (Dec 18 23:59 for English students)  10 points

Clustering and Component Analysis on Word Vectors

 Deadline: Jan 18 23:59 (Jan 15 23:59 for English students)  10 points

  • Christopher Bishop: Pattern Recognition and Machine Learning, Springer-Verlag New York, 2006 (read here)

  • Kevin P. Murphy: Machine Learning: A Probabilistic Perspective, The MIT Press, Cambridge, Massachusetts, 2012 (read here)

  • David Mareček, Jindřich Libovický, Tomáš Musil, Rudolf Rosa, Tomasz Limisiewicz: Hidden in the Layers: Interpretation of Neural Networks for Natural Language Processing. Institute of Formal and Applied Linguistics, 2020 (read here)