Summer Semester 2017

Student Presentations


March 1

Course logistics: prerequisites ⚫ syllabus ⚫ how to get credits

Notes on deep learning: deep learning ⚫ network building blocks ⚫ network components as functional programming ⚫ deep learning alchemy ⚫ reading the learning curves

Recurrent Neural Networks: definition ⚫ RNN as a program ⚫ exercise with Euclid's algorithm
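The "RNN as a program" view can be made concrete with a toy vanilla RNN in plain Python (hypothetical dimensions and randomly initialised weights, purely for illustration): the hidden state is an ordinary variable updated once per input symbol.

```python
import math
import random

random.seed(0)
INPUT_DIM, HIDDEN_DIM = 3, 4

# Hypothetical randomly initialised parameters (illustration only).
W_x = [[random.uniform(-0.1, 0.1) for _ in range(INPUT_DIM)] for _ in range(HIDDEN_DIM)]
W_h = [[random.uniform(-0.1, 0.1) for _ in range(HIDDEN_DIM)] for _ in range(HIDDEN_DIM)]
b = [0.0] * HIDDEN_DIM

def rnn_step(x, h):
    """One step of a vanilla RNN: h' = tanh(W_x x + W_h h + b)."""
    return [math.tanh(sum(W_x[i][j] * x[j] for j in range(INPUT_DIM))
                      + sum(W_h[i][j] * h[j] for j in range(HIDDEN_DIM))
                      + b[i])
            for i in range(HIDDEN_DIM)]

def rnn_forward(inputs):
    """Run the cell over a sequence: a loop over a single state variable."""
    h = [0.0] * HIDDEN_DIM
    states = []
    for x in inputs:
        h = rnn_step(x, h)
        states.append(h)
    return states

sequence = [[random.gauss(0, 1) for _ in range(INPUT_DIM)] for _ in range(5)]
states = rnn_forward(sequence)
print(len(states), len(states[0]))  # 5 4
```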

Reading: Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014.

Question: What are the problems of the presented architecture? How do you think neural MT developed after this paper was published?


Project proposals for NPFL087 Statistical Machine Translation.

March 8

Recurrent Neural Networks: vanilla RNNs ⚫ vanishing gradient problem ⚫ understanding LSTMs ⚫ Gated Recurrent Units ⚫ neural language models ⚫ word embeddings ⚫ sampling from a language model
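Sampling from a language model amounts to drawing the next token from the softmax over the model's output logits; a minimal sketch with made-up logits (the optional temperature parameter is a common extension, not something the lecture necessarily covers):

```python
import math
import random

def sample_token(logits, temperature=1.0, rng=random):
    """Sample one token id from output logits via the softmax distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Inverse-CDF sampling over the categorical distribution.
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

random.seed(1)
counts = [0, 0, 0]
for _ in range(1000):
    counts[sample_token([2.0, 1.0, 0.1])] += 1
print(counts)  # the highest logit is sampled most often
```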

Reading: Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
Question: What do you think is the main difference between Bahdanau's attention model and the concept of alignment in statistical MT?

March 15

Attentive sequence-to-sequence learning: RNN as a probabilistic model ⚫ encoder-decoder architecture ⚫ training vs. runtime decoding ⚫ Neural Turing Machines as motivation for attention ⚫ attention model ⚫ attention vs. alignment

Implementation and performance: computational graph & backpropagation ⚫ memory consumption
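The attention step can be sketched in a few lines. Note that this uses dot-product scoring for brevity, whereas Bahdanau et al. score with a small feed-forward network; the vectors below are made up:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(decoder_state, encoder_states):
    """One attention step: score each encoder state against the decoder
    state, normalise with softmax, and return the weighted context vector."""
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states)) for i in range(dim)]
    return weights, context

# Toy encoder states and decoder state (illustration only).
enc = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w, c = attend([1.0, 0.0], enc)
print([round(x, 3) for x in w])  # weights sum to 1; the most similar state gets the most mass
```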

Reading: Chung, Junyoung, Kyunghyun Cho, and Yoshua Bengio. "A character-level decoder without explicit segmentation for neural machine translation." arXiv preprint arXiv:1603.06147 (2016).
Question: What are the reasons the authors do not use a character-level encoder? How would you improve the architecture so that it allows character-level encoding?

March 23

Model Ensembling and Beam Search: beam search ⚫ ensembles ⚫ computing in the log domain
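Computing in the log domain means sums of probabilities become log-sum-exp. For example, averaging the probabilities of ensemble members (one common ensembling choice; the numbers below are made up) can be done entirely in log space:

```python
import math

def log_sum_exp(log_vals):
    """Stable log(sum(exp(x))): how probabilities are summed in the log domain."""
    m = max(log_vals)
    return m + math.log(sum(math.exp(v - m) for v in log_vals))

def ensemble_log_prob(member_log_probs):
    """Log of the arithmetic mean of N members' probabilities, in log space."""
    n = len(member_log_probs)
    return log_sum_exp(member_log_probs) - math.log(n)

# Two models assign probabilities 0.5 and 0.25 to the same token:
lp = ensemble_log_prob([math.log(0.5), math.log(0.25)])
print(round(math.exp(lp), 4))  # 0.375, the mean of the two probabilities
```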

Big vocabulary problem: copy from source ⚫ subword units ⚫ character-level methods
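A common subword-unit method, byte-pair encoding (BPE), repeatedly merges the most frequent adjacent symbol pair in the training corpus. One such iteration on a toy corpus (hypothetical words and frequencies) might look like this:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs over a {symbol-tuple: frequency} corpus
    and return the most frequent one -- the core counting step of BPE."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: words split into characters, with frequencies.
corpus = {("l", "o", "w"): 5, ("l", "o", "w", "e", "r"): 2, ("l", "o", "n", "g"): 3}
pair = most_frequent_pair(corpus)
print(pair)  # ('l', 'o'), occurring 10 times
corpus = merge_pair(corpus, pair)
print(corpus)
```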

Reading: Sennrich, Rico, et al. "Nematus: a Toolkit for Neural Machine Translation." arXiv preprint arXiv:1703.04357 (2017).
Question: Compare the Nematus models with the models from Bahdanau et al., 2014. How do they differ? Think of at least three differences.

March 29

Implementation in TensorFlow

Reading: Shen, Shiqi, et al. "Minimum Risk Training for Neural Machine Translation." Proceedings of ACL 2016 (2016).
Question: ???

April 5

Advanced Optimization: reinforcement learning ⚫ minimum risk training
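Minimum risk training minimises the expected translation cost under the model. The expectation itself is simple to sketch; the (log-probability, cost) pairs below are made up, with the probabilities renormalised over the sampled candidates:

```python
import math

def expected_risk(candidates):
    """Expected cost over sampled candidates, given (log_prob, cost) pairs.
    Probabilities are renormalised over the sample, staying numerically stable."""
    m = max(lp for lp, _ in candidates)
    weights = [math.exp(lp - m) for lp, _ in candidates]
    total = sum(weights)
    return sum(w / total * cost for w, (_, cost) in zip(weights, candidates))

# Hypothetical sampled translations: (log-probability, cost) pairs,
# where the cost could be e.g. 1 - sentence-level BLEU.
risk = expected_risk([(math.log(0.6), 0.2), (math.log(0.3), 0.5), (math.log(0.1), 0.9)])
print(round(risk, 3))  # 0.36
```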