Written tests

There will be two scored, closed-book written tests. They test practical knowledge of the material explained in lectures and lab sessions. Neither programming nor mathematical proofs are required. Test questions focus on data analysis, learning algorithms, and evaluation.

Test #1 is shorter (45 minutes) and takes place in the middle of the term. Test #2, the final test, is more complex and takes 80 minutes. Computers and calculators are neither needed nor allowed. All questions are answered using only a pen and your head.

The topics covered by the tests are listed below.

Test #1 

  • probability, conditional probability, statistical independence
  • simple statistical data analysis: expected values, variation, correlation, median, quantiles
  • confusion matrices, inter-annotator agreement
  • classifier evaluation
  • entropy, conditional entropy
  • majority voting for ensemble classifiers

Test #2 – Final written test

Topics of the exercises span almost the whole course. You should be ready to answer questions related to data analysis, learning algorithms, and evaluation methods including statistical tests. Mathematical proofs and neural networks will not be required.

Requirements for obtaining the exam credit

Obtaining the course credit is a prerequisite for taking the examination in the course.

Questions for the oral examination:

  • Machine learning – basic concepts. What machine learning is, motivating examples of practical applications, theoretical foundations of machine learning. Supervised and unsupervised learning. Classification and regression tasks. Training and test examples. Feature vectors. Target variable and prediction function. Machine learning development cycle. Curse of dimensionality. Bayes classifier and Bayes error.

  • Clustering algorithms. Hierarchical clustering, k-Means algorithm.

  • Decision tree learning. Decision tree learning algorithm, splitting criteria, pruning.

  • Linear regression. Least squares cost function.

  • Instance-based learning. k-NN algorithm.

  • Logistic regression. Discriminative classifier.

  • Naive Bayes learning. Naive Bayes classifier. Bayesian belief networks.

  • Support Vector Machines. Large margin classifier, soft margin classifier. Kernel functions. Multiclass classification.

  • Ensemble methods. Bagging and boosting. Unstable learning. AdaBoost algorithm. Random Forests.

  • Parameters in ML. Tuning of learning parameters. Grid search. Gradient descent algorithm. Maximum likelihood estimation.

  • Predictor evaluation. Working with development and test data. Sample error, generalization error. Cross-validation, leave-one-out method. Bootstrap methods. Performance measures. Coefficient of determination. Evaluation of binary classifiers. ROC curve.

  • Statistical tests. Statistical hypotheses, one-sample and two-sample t-tests, chi-square test of independence and goodness-of-fit test. Significance level, p-value. Using statistical tests for classifier evaluation. Confidence level, confidence intervals.

  • Overfitting. How to recognize and avoid it. Decision tree pruning. Regularization.

  • Dimensionality reduction. General principles of feature selection. Filters, wrappers, embedded methods. Feature selection using information gain. Forward selection and backward elimination. Principal Component Analysis.

  • Foundations of Neural Networks. Single Perceptron and Single Layer Perceptron – learning algorithms and mathematical interpretations. The architecture of multi-layer feed-forward models and the idea of back-propagation training.

 

Grading

Key requirements and contributions to the grade

  • 50% written tests
  • 20% homework
  • 30% oral examination
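
As an illustration of how the three contributions combine, here is a minimal sketch in Python. It assumes that each component is scored on a 0-100 scale and that the overall score is a plain weighted sum; neither the scale nor the mapping from an overall score to a final mark is stated here, so treat both as hypothetical. The name overall_score and the example scores are made up for illustration.

```python
# Weights taken from the grading list above.
# ASSUMPTION: every component is scored on a 0-100 scale.
WEIGHTS = {
    "written_tests": 0.50,
    "homework": 0.20,
    "oral_exam": 0.30,
}


def overall_score(scores):
    """Return the weighted sum of the component scores (hypothetical helper)."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must contain exactly: " + ", ".join(sorted(WEIGHTS)))
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Made-up example scores, not real data.
    example = {"written_tests": 80, "homework": 90, "oral_exam": 70}
    print(round(overall_score(example), 1))
```

With these made-up scores, the written tests contribute 40 points, homework 18, and the oral examination 21.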