SIS code: 
0/2 C

NPFL116 - Compendium of Neural Machine Translation

This seminar should make students familiar with current research trends in machine translation using deep neural networks. Most importantly, students should learn how to deal with the ever-growing body of literature on empirical research in machine translation and critically assess its content. The semester consists of a few lectures summarizing the state of the art, discussions of reading assignments, and student presentations of selected papers.

Topic overview

  1. Introductory notes on machine translation and deep learning
  2. Neural architectures for NLP
  3. Attentive sequence-to-sequence learning using RNNs
  4. Sequence-to-sequence learning with self-attention, a.k.a. Transformer
  5. Tricks for improving NMT performance

Passing requirements

  • Homework tasks: There will be a reading assignment after every class. You will be given a few questions about the reading, and you should submit your answers before the next lecture.
  • Student presentations: Every student should participate in a team presenting a group of papers to the class.
  • Final written test: There will be a final written test that will not be graded.


Introductory notes on machine translation and deep learning

Neural architectures for NLP

Attentive sequence-to-sequence learning using RNNs

Sequence-to-sequence learning with self-attention, a.k.a. Transformer

Tricks for improving NMT performance

Student Presentations

Students will form 3 teams of 3 students and 1 team of 2 students, and each team will present one of the following groups of papers to their fellow students. The teams will prepare not only a presentation of the papers but also questions for the discussion that follows.

The other students should also familiarize themselves with the papers so that they can participate in the discussion.

It is recommended, though not required, to arrange a consultation with the teachers at least one day before the presentation.

Unsupervised NMT (10.4. 2018)

Unsupervised machine translation is an active research topic whose goal is to create a machine translation system without requiring huge parallel corpora to train the models.

So far, there have been two papers on this topic:

Generative Adversarial Networks for NMT (17.4. 2018)

Two years ago, many papers attempted to use reinforcement learning for machine translation and to optimize the model directly towards the sentence-level BLEU score instead of cross-entropy, which appears to be clearly sub-optimal. These methods have not been very successful, mainly because of the inherent limitations of the BLEU score.
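
To make the optimization target concrete, here is a minimal sketch of a sentence-level BLEU score for a single reference. It is a simplified illustration, not the exact metric used in any particular paper: the add-one smoothing and the function names (`ngrams`, `sentence_bleu`) are assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hypothesis, reference, max_n=4):
    """Simplified sentence-level BLEU (single reference, add-one smoothing)."""
    if not hypothesis:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        # Clipped matches: a hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        # Add-one smoothing (an assumption of this sketch) keeps the
        # logarithm defined when there are no matches.
        log_precision_sum += math.log((matches + 1) / (total + 1))
    # The brevity penalty punishes hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(hypothesis))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


reference = "the cat sat on the mat".split()
perfect = sentence_bleu(reference, reference)
partial = sentence_bleu("the cat sat on a mat".split(), reference)
```

The score is computed from discrete n-gram counts, so it is not differentiable with respect to the model parameters. This is one reason why reinforcement learning, rather than plain gradient descent, is needed to optimize it directly.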

Generative Adversarial Networks with the generator-discriminator setup are a follow-up to this research. A trained discriminator plays the role of the optimization metric: its goal is to discriminate between generated and human translations. The generator, on the other hand, tries to fool the discriminator and generate translations that are as close to the human reference as possible.

The following papers will be presented:

Unassigned topics

Convolutional Sequence-to-sequence Learning

Facebook recently came up with a sequence-to-sequence architecture that is based entirely on convolutional networks. This allows parallel processing of the input sentence. The autoregressive nature of the decoder does not allow parallel decoding at inference time; however, it is still possible at training time, when the target sentence is known.
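
The following toy sketch illustrates why training can be parallel even though the decoder is autoregressive. It assumes the decoder uses causal (left-padded) convolutions, so that each output position only sees the current and earlier inputs. The function `causal_conv1d` and the scalar inputs are simplifications for illustration, not the architecture from the papers.

```python
def causal_conv1d(inputs, kernel):
    """Causal 1D convolution over a list of floats.

    Output position t only sees inputs[t - k + 1 .. t]; positions before
    the start of the sequence are treated as zero padding.
    """
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(inputs)
    # Each output position is computed independently of the other outputs,
    # so a real implementation could compute all of them in parallel.
    return [
        sum(w * x for w, x in zip(kernel, padded[t:t + k]))
        for t in range(len(inputs))
    ]


kernel = [1.0, 2.0]
outputs = causal_conv1d([1.0, 2.0, 3.0], kernel)

# Changing a later input leaves all earlier outputs untouched.
changed = causal_conv1d([1.0, 2.0, 99.0], kernel)
```

When the whole target sentence is known, as it is during training, all positions can be fed in at once. At inference time, the input at position t is the word generated at step t - 1, so the positions have to be produced one after another.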

The architecture was introduced in a series of two papers:

Non-autoregressive MT

Non-autoregressive models can generate the whole output sequence in parallel and do not need to wait until the previous word is generated to update the hidden state.
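
The difference in dependency structure can be sketched with two toy decoding loops. Everything here is hypothetical: the "models" are stand-in functions that copy and upper-case a source sentence, and the target length is assumed to be given, whereas a real non-autoregressive model has to predict it.

```python
def decode_autoregressive(next_token, max_len, eos="</s>"):
    """Generate left to right; every step conditions on the prefix so far."""
    output = []
    for _ in range(max_len):
        token = next_token(output)  # has to wait for all previous tokens
        if token == eos:
            break
        output.append(token)
    return output


def decode_non_autoregressive(token_at, target_len):
    """Generate each position independently of the other output positions."""
    # No call depends on the result of another call, so a real model
    # could evaluate all positions in a single parallel pass.
    return [token_at(i) for i in range(target_len)]


# Hypothetical toy "models" that copy and upper-case the source.
source = ["ein", "kleiner", "test"]


def toy_next_token(prefix):
    return source[len(prefix)].upper() if len(prefix) < len(source) else "</s>"


def toy_token_at(position):
    return source[position].upper()


autoregressive_output = decode_autoregressive(toy_next_token, max_len=10)
parallel_output = decode_non_autoregressive(toy_token_at, target_len=len(source))
```

Both loops produce the same output here. The difference is that the first needs one sequential step per word, while the second has no dependencies between positions.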

Other interesting papers

A group that chooses this topic will select two papers from the following list: