Time and location
Our one-semester introductory course consists of lecture sessions and practical sessions. The lecture sessions provide the theoretical background and key algorithms of machine learning (ML) and are application-independent. The practical sessions (a.k.a. practice, seminar, or lab sessions) are application-dependent and accompany the lecture sessions; their aim is the acquisition of practical experience with applying ML approaches to problems from the field of natural language processing. The students run the experiments using the R system for statistical computing and graphics.
The course is intended for students of master's and doctoral study programmes.
The following schedule is provided as a general guide for the course. The instructors may elect to adjust the outline to meet the unique needs of the class:
- Extended introduction: what is machine learning, motivating examples, interdisciplinarity of ML, supervised vs. unsupervised learning, selected ML topics, ML for natural language processing.
- Concept learning: concepts, hypothesis ordering, Find-S algorithm, Candidate-Elimination algorithm.
- Decision tree learning: decision tree structure, ID3 algorithm, splitting criteria, avoiding over-fitting, incorporating continuous-valued attributes, handling missing attribute values.
- Bayesian learning: Bayes' theorem in ML, posterior probability, maximum likelihood hypothesis, Bayes optimal classification, Naive Bayes classifier, Bayesian belief networks, K2 algorithm, curse of dimensionality.
- Instance-based learning: distance criterion, k-NN, a discrete/continuous-valued case.
- Experiment evaluation: accuracy, cross-validation, biased error estimation, bootstrapping, ROC curve, area under the ROC curve (AROC), statistical significance, confidence intervals.
- Support vector machines: classifier margin, finding the separating hyperplane, linear/non-linear separation, learning the maximum margin classifier via quadratic programming, the kernel trick.
- Probably Approximately Correct framework: PAC learnability, sample complexity, Vapnik-Chervonenkis dimension.
- Ensemble methods: combination of classifiers, bagging, boosting, AdaBoost, bootstrapping vs. cross-validation.
- Logistic regression.
- Clustering: dendrograms, (non)hierarchical clustering, k-means algorithm.
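As a taste of the instance-based learning and experiment evaluation topics above, here is a minimal sketch in R (assuming only base R and the built-in iris data, which are not part of the course materials): a hand-rolled k-NN classifier using the Euclidean distance criterion, evaluated with leave-one-out cross-validation.

```r
## k-NN with leave-one-out cross-validation, base R only.

knn_predict <- function(train_x, train_y, query, k = 3) {
  # Euclidean distances from the query point to every training instance
  d <- sqrt(rowSums(sweep(train_x, 2, query)^2))
  # majority vote among the k nearest neighbours
  votes <- table(train_y[order(d)[1:k]])
  names(votes)[which.max(votes)]
}

x <- as.matrix(iris[, 1:4])
y <- iris$Species

# leave-one-out cross-validation: each instance is classified by a model
# "trained" on the remaining n - 1 instances, giving a nearly unbiased
# estimate of the true error
pred <- vapply(seq_len(nrow(x)), function(i) {
  knn_predict(x[-i, , drop = FALSE], y[-i], x[i, ], k = 3)
}, character(1))

accuracy <- mean(pred == as.character(y))
cat(sprintf("LOO accuracy of 3-NN on iris: %.3f\n", accuracy))
```

Estimating accuracy on the training data itself would be optimistically biased; cross-validation is the standard remedy discussed in the experiment evaluation lecture.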
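The clustering topics can likewise be previewed in a few lines of base R (again a sketch on the built-in iris data, not course material): non-hierarchical clustering with the k-means algorithm, and agglomerative hierarchical clustering, whose result can be drawn as a dendrogram.

```r
set.seed(42)                              # k-means starts from random centroids
x <- as.matrix(iris[, c("Petal.Length", "Petal.Width")])

# non-hierarchical clustering: the k-means algorithm
fit <- kmeans(x, centers = 3, nstart = 10)
print(fit$centers)                        # the three learned cluster centroids
# compare clusters against the held-out species labels
print(table(cluster = fit$cluster, species = iris$Species))

# hierarchical (agglomerative) clustering; plot(hc) would draw the dendrogram
hc <- hclust(dist(x), method = "average")
three <- cutree(hc, k = 3)                # cut the dendrogram into 3 clusters
```

Because k-means depends on its random initial centroids, `nstart = 10` restarts it ten times and keeps the solution with the lowest within-cluster sum of squares.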
The final project addresses a selected issue from the field of natural language processing and is completed individually, not in teams. More detailed information is provided here. The final project is assigned in the middle of the term (usually after the lecture on experiment evaluation), and students must submit their solution by the end of the examination period. Students present their preliminary results before the Christmas break.
The final exam consists of three parts: (i) a statistical calculation by hand, (ii) experiments on the computer, and (iii) a discussion. See the sample from Feb 8, 2010. With regard to grading, the final exam and the final project are two independent requirements. Students may take the final exam before submitting their final project; however, they do not receive the 'exam signature' until they have finished the final project.
Recommended teaching materials
- Alpaydin, E. Introduction to Machine Learning. The MIT Press. 2004.
- Breiman, L. et al. Classification and regression trees. Wadsworth International Group, Belmont California, A Division of Wadsworth, Inc. 1984.
- Cooper, G. F., Herskovits, E. A Bayesian Method for the Induction of Probabilistic Networks from Data. Machine Learning, 9, pp. 309–347, 1992. (pdf)
- Gonick, L., Smith, W.: Cartoon Guide to Statistics. (Available on request.)
- Lee, L. "I'm sorry Dave, I'm afraid I can't do that": Linguistics, statistics, and natural language processing circa 2001. Computer Science: Reflections on the Field, Reflections from the Field, National Academies Press, pp. 111–118, 2004. (pdf)
- Mitchell, T. M. Machine Learning. McGraw-Hill. 1997.
- Paradis, E.: R for Beginners (pdf)
- R Reference Card
- Scientific Video Lectures.
- Khan Academy, Statistics part
- Where are the flying cars ...
- At the very beginning, we put together a reader (printed version only); see the CONTENTS. If interested, write to us.
- Example data: Forbes2000.csv, Forbes2000.xls, semantic-types.3000.csv, weather1.csv, weather2.csv
You can discuss your opinion on teaching machine learning here.