Introduction to Natural Language Processing (Úvod do zpracování přirozeného jazyka)

week lecture lab homework
1: 4/10/2017 JH: Motivation for NLP. Basic notions from probability and information theory.
ZŽ: Using basic bash command line tools for text processing. Collecting counts for a bigram language model in bash.
Optional reading:
2: 11/10/2017 JH: Language models. The noisy channel model.
ZŽ: Character encoding.
[slides] (provisional!)
optional reading: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
3: 18/10/2017 JH/PP(?): Markov Models.
ZŽ: Language Model exercises.
Optional reading:
  • Philipp Koehn's slides about Language Models in Statistical Machine Translation
  • a chapter on N-gram models in a book by Jurafsky & Martin
HW01: word coloring by a trigram model,
deadline 16/11/2016
4: 25/10/2017 ZŽ: Language data resources.
ZŽ: Evaluation measures in NLP.
[slides] (provisional!)
Register as a user of the Czech National Corpus (you will need it in the following week).
5: 1/11/2017 DZ: Morphological analysis.
DZ: Czech National Corpus.
6: 8/11/2017 DZ: Syntactic analysis.
DZ: Syntactically annotated corpora.
HW02: valency dictionary of verbs,
deadline 7/12/2016
7: 15/11/2017 PP: Introduction to information retrieval, Boolean model, Inverted index. [slides]
PP: Vector space model, TF-IDF weighting, Evaluation.
8: 22/11/2017 PP: Probabilistic models. Language models for information retrieval. [slides]
PP: Experimental vector space model.
HW03: Experiments with an open-source IR toolkit. [slides], deadline 8/1/2017
9: 29/11/2017 OB: Machine Translation (overview, evaluation) and word alignment. [slides]
OB: Word alignment.
10: 6/12/2017 OB: Statistical Machine Translation: PBMT, Hiero, Syntax. [main slides, decoding (P. Koehn), syntax (D. Chiang), TectoMT (M. Popel)]
OB: MT system Moses.
11: 13/12/2017 OB: Linguistic features or Neurons in MT. [main slides, factored PBMT (P. Koehn), Neural MT (R. Sennrich), ACL 2016 tutorial on Neural MT (T. Luong, K. Cho, C. Manning)]
OB: Moses, cont.
HW04: Empirical comparison of AER vs. BLEU
12: 20/12/2017 Reserve
13: 3/01/2018 JL: Deep Learning in NLP [slides]
JL: Recurrent Neural Networks for checking y/i spelling in Czech [slides]
14: 10/01/2018 Written final exam


Homework tasks

Lab tasks

Homework rules

Requirements for passing the course