This webpage provides supplementary material for the paper Hladká, Barbora and Martin Holub. A Gentle Introduction to Machine Learning for Natural Language Processing: How to start in 16 practical steps. In: Language and Linguistics Compass, Vol. 9, No. 2, Copyright © John Wiley & Sons Inc, ISSN 1749-818X, pp. 55-76, 2015.

Here is the abstract of the paper:

We present a gentle introduction to machine learning in natural language processing. Our goal is to navigate readers through basic machine learning concepts and experimental techniques. As an illustrative example we practically address the task of word sense disambiguation using the R software system. We focus especially on students and junior researchers who are not trained in experimenting with machine learning yet and who want to start. To some extent, machine learning process is independent on both addressed task and software system used. Therefore readers who deal with tasks from different research areas or who prefer different software systems will gain useful knowledge as well.

Supplementary data and other material for
"A Gentle Introduction to Machine Learning for Natural Language Processing"


Introductory reading

  • Alpaydin, Ethem. Introduction to Machine Learning. The MIT Press. 2004, 2010 (url).

  • Domingos, Pedro. A Few Useful Things to Know about Machine Learning. Communications of the ACM, vol. 55, issue 10, October 2012, pp. 78--87, ACM, New York, USA (pdf). [a nice non-technical reading]

  • Gonick, Larry and Woollcott Smith. The Cartoon Guide to Statistics. Harper Resource. 2005. 

  • Hladka, Barbora and Martin Holub. The course proposal esslli-proposal.2013.pdf, 2013.

  • Kononenko, Igor and Matjaz Kukar. Machine Learning and Data Mining: Introduction to Principles and Algorithms. Horwood Publishing, 2007 (url). [a light survey of the whole field]

Serious textbooks and tutorials (they require deeper mathematical background)

  • Bishop, Christopher M. Pattern Recognition and Machine Learning. Springer, 2006 (url).

  • Burges, Christopher J. C. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery, 2(2):121–167, 1998. http://research.microsoft.com/pubs/67119/svmtutorial.pdf

  • Cristianini, Nello and John Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-based Learning Methods. Cambridge University Press, 2000.

  • Duda, Richard O., Peter R. Hart and David G. Stork. Pattern Classification. Second Edition. Wiley, 2001.

  • Hsu, Chih-Wei, Chih-Chung Chang and Chih-Jen Lin. A Practical Guide to Support Vector Classification. 2010 (pdf).

  • Hastie, Trevor, Robert Tibshirani and Jerome Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, 2009 (url).

About R

About data used

  • Leacock, C., Towell, G. and Voorhees, E. Corpus-Based Statistical Sense Resolution. In Proceedings of the ARPA Workshop on Human Language Technology, pp. 260--265. 1993. [WSD task]


Barbora Hladka and Martin Holub

Institute of Formal and Applied Linguistics

Faculty of Mathematics and Physics

Charles University in Prague


We gratefully acknowledge that this work was supported by the Grant Agency of the Czech Republic, grant project no. P103/12/G084. We would like to thank Jirka Hana for his English corrections. Also, we would like to thank the students who attended our course at ESSLLI 2013. Last but not least, we would like to thank the anonymous reviewers for their valuable comments.