# Summer Semester 2017

## Student Presentations

## Seminars

**March 1**

**Course logistics** ∍ *prerequisites* ⚫ *syllabus* ⚫ *how to get credits*

**Notes on deep learning** ∍ *deep learning* ⚫ *network building blocks* ⚫ *network components as functional programming* ⚫ *deep learning alchemy* ⚫ *reading the learning curves*

**Recurrent Neural Networks** ∍ *definition* ⚫ *RNN as a program* ⚫ *exercise with Euclid's algorithm*
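A minimal sketch of the "RNN as a program" view, assuming NumPy and illustrative weight names (`W_h`, `W_x`, `b` are not from the lecture): both a vanilla RNN and Euclid's algorithm apply one fixed update rule to a carried state.

```python
import numpy as np

def rnn_step(h, x, W_h, W_x, b):
    """One vanilla RNN step: the same update applied at every time
    step, threading a hidden state through the sequence."""
    return np.tanh(W_h @ h + W_x @ x + b)

def gcd(a, b):
    """Euclid's algorithm follows the same pattern: a fixed update
    rule applied repeatedly to a carried state (a, b)."""
    while b:
        a, b = b, a % b
    return a
```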

Reading: Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in Neural Information Processing Systems. 2014.

Question: What are the problems of the presented architecture? How do you think neural MT developed after this paper was published?

Project proposals for NPFL087 Statistical Machine Translation.

**March 8**

**Recurrent Neural Networks** ∍ *vanilla RNNs* ⚫ *vanishing gradient problem* ⚫ *understanding LSTMs* ⚫ *Gated Recurrent Units* ⚫ *neural language models* ⚫ *word embeddings* ⚫ *sampling from a language model*
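As a concrete reference for the gating discussion, here is one GRU step in NumPy; the weight names `W_*` (input) and `U_*` (state) are illustrative, not the lecture's notation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h, x, params):
    """One GRU step: gates decide how much of the previous state to
    keep versus overwrite, which mitigates the vanishing gradient."""
    z = sigmoid(params["W_z"] @ x + params["U_z"] @ h)      # update gate
    r = sigmoid(params["W_r"] @ x + params["U_r"] @ h)      # reset gate
    h_tilde = np.tanh(params["W_h"] @ x + params["U_h"] @ (r * h))
    return (1 - z) * h + z * h_tilde                        # interpolate
```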

Reading: Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).

Question: What do you think is the main difference between Bahdanau's attention model and the concept of alignment in statistical MT?

**March 15**

**Attentive sequence-to-sequence learning** ∍ *RNN as a probabilistic model* ⚫ *encoder-decoder architecture* ⚫ *training vs. runtime decoding* ⚫ *Neural Turing Machines as motivation for attention* ⚫ *attention model* ⚫ *attention vs. alignment*
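A minimal sketch of the additive attention model from the Bahdanau et al. reading; the parameter names `W_a`, `U_a`, `v_a` are illustrative. Each encoder state is scored against the previous decoder state, and the scores become a distribution over source positions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def bahdanau_attention(s_prev, H, W_a, U_a, v_a):
    """Additive attention: score every encoder state H[i] against the
    previous decoder state s_prev, then return the weighted average of
    encoder states (the context vector) and the weights themselves."""
    scores = np.array([v_a @ np.tanh(W_a @ s_prev + U_a @ h_i) for h_i in H])
    alpha = softmax(scores)   # soft "alignment" over source positions
    context = alpha @ H       # context vector fed to the decoder
    return context, alpha
```

Unlike a hard alignment in statistical MT, `alpha` is a soft, differentiable distribution, so it can be trained end-to-end with the rest of the network.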

**Implementation and performance** ∍ *computational graph & backpropagation* ⚫ *memory consumption*

Reading: Chung, Junyoung, Kyunghyun Cho, and Yoshua Bengio. "A character-level decoder without explicit segmentation for neural machine translation." arXiv preprint arXiv:1603.06147 (2016).

Question: Why do the authors not use a character-level encoder? How would you modify the architecture so that it allows character-level encoding?

**March 23**

**Model Ensembling and Beam Search** ∍ *beam search* ⚫ *ensembles* ⚫ *computing in the log domain*
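One beam-search step can be sketched as follows; the data shapes are an assumption. Hypothesis scores are summed log-probabilities: working in the log domain turns products of probabilities into sums and avoids floating-point underflow.

```python
def beam_step(beams, log_probs, beam_size):
    """One step of beam search. `beams` is a list of (tokens, score)
    hypotheses; `log_probs[i]` is the model's log-distribution over the
    vocabulary given hypothesis i. Expand every hypothesis with every
    word and keep the `beam_size` best by total log-probability."""
    candidates = [
        (tokens + [w], score + log_probs[i][w])
        for i, (tokens, score) in enumerate(beams)
        for w in range(len(log_probs[i]))
    ]
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:beam_size]
```

For an ensemble, `log_probs[i]` would be the (log of the) average of the member models' distributions rather than a single model's output.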

**Big vocabulary problem** ∍ *copying from source* ⚫ *subword units* ⚫ *character-level methods*
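The subword-unit idea (byte-pair encoding, as popularized for NMT by Sennrich et al.) can be sketched as repeatedly merging the most frequent adjacent symbol pair, so frequent words stay whole while rare words split into smaller units. The vocabulary representation below is an assumption for illustration.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs over a {symbol-tuple: frequency}
    vocabulary and return the most frequent pair."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Apply one BPE merge: replace every occurrence of `pair` with
    the concatenated symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged
```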

Reading: Sennrich, Rico, et al. "Nematus: a Toolkit for Neural Machine Translation." arXiv preprint arXiv:1703.04357 (2017).

Question: Compare the Nematus models with the models from Bahdanau et al., 2014. How do they differ? Think of at least three differences.

**March 29**

**Implementation in TensorFlow**

Reading: Shen, Shiqi, et al. "Minimum Risk Training for Neural Machine Translation." Proceedings of ACL 2016 (2016).

Question: ???

**April 5**

**Advanced Optimization** ∍ *reinforcement learning* ⚫ *minimum risk training*
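A sketch of the minimum risk training objective from the March 29 reading (Shen et al., 2016): the expected cost of a set of candidate translations under the model's smoothed, renormalized distribution. The smoothing value used here is illustrative, not the paper's tuned setting.

```python
import numpy as np

def expected_risk(log_probs, costs, alpha=0.005):
    """Expected risk over a sampled candidate set: sharpen the model's
    log-probabilities by `alpha`, renormalize over the candidates, and
    take the weighted average of their costs (e.g. 1 - sentence BLEU).
    Minimizing this pushes probability mass toward low-cost outputs."""
    q = np.exp(alpha * np.asarray(log_probs))
    q /= q.sum()
    return float(q @ np.asarray(costs))
```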