Deep Learning Seminar, Summer 2016/17
In recent years, deep neural networks have been used to solve complex machine-learning problems, achieving state-of-the-art results in many areas. The field of deep learning is developing rapidly, with new methods and techniques emerging steadily.
The goal of the seminar is to follow the newest advancements in the deep learning field. The course takes the form of a reading group: each week one of the students presents a paper. The paper is announced in advance, so all participants can read it beforehand and take part in the discussion.
If you want to receive announcements about the chosen papers, sign up for our mailing list ufal-rg@googlegroups.com.
About
SIS code: NPFL117
Semester: summer
E-credits: 3
Examination: 0/2 C
Guarantor: Milan Straka
Timespace Coordinates
The Deep Learning Seminar takes place every Tuesday at 12:20 in S1. The first meeting is on Tuesday Feb 28.
Requirements
To pass the course, you need to present a research paper and regularly attend the presentations.
License
Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.
Date | Who | Paper(s) |
---|---|---|
28 Feb 2017 | Mirek Olšák | C. Kaliszyk, F. Chollet, C. Szegedy: HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving<br>A TreeRNN-based implementation by Mirek Olšák, improving the accuracy of the above paper from 83% to 88% |
07 Mar 2017 | Dušan Variš | Jason Lee, Kyunghyun Cho, Thomas Hofmann: Fully Character-Level Neural Machine Translation without Explicit Segmentation |
14 Mar 2017 | Karel Král | Geoffrey Hinton, Oriol Vinyals, Jeff Dean: Distilling the Knowledge in a Neural Network<br>Lei Jimmy Ba, Rich Caruana: Do Deep Nets Really Need to be Deep? |
21 Mar 2017 | Milan Straka | Moshe Looks, Marcello Herreshoff, DeLesley Hutchins, Peter Norvig: Deep Learning with Dynamic Computation Graphs<br>Lingpeng Kong, Chris Alberti, Daniel Andor, Ivan Bogatyy, David Weiss: DRAGNN: A Transition-Based Framework for Dynamically Connected Neural Networks |
28 Mar 2017 | Lukáš Jendele | Kaiming He, Georgia Gkioxari, Piotr Dollár, Ross Girshick: Mask R-CNN<br>Yi Li, Haozhi Qi, Jifeng Dai, Xiangyang Ji, Yichen Wei: Fully Convolutional Instance-aware Semantic Segmentation |
04 Apr 2017 | Ondrej Škopek | Irwan Bello, Hieu Pham, Quoc V. Le, Mohammad Norouzi, Samy Bengio: Neural Combinatorial Optimization with Reinforcement Learning<br>Oriol Vinyals, Meire Fortunato, Navdeep Jaitly: Pointer Networks |
11 Apr 2017 | Jan Hajič jr. | Diederik P Kingma, Max Welling: Auto-Encoding Variational Bayes<br>Francisco J. R. Ruiz, Michalis K. Titsias, David M. Blei: The Generalized Reparameterization Gradient |
18 Apr 2017 | Jindřich Libovický | Holger Schwenk, Ke Tran, Orhan Firat, Matthijs Douze: Learning Joint Multilingual Sentence Representations with Neural Machine Translation |
25 Apr 2017 | Milan Straka | Mevlana Gemici et al.: Generative Temporal Models with Memory<br>Alex Graves et al.: Hybrid computing using a neural network with dynamic external memory |
02 May 2017 | David Mareček | Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling: Learning to Compose Words into Sentences with Reinforcement Learning |
09 May 2017 | Rudolf Rosa | Michael Sejr Schlichtkrull, Anders Søgaard: Cross-Lingual Dependency Parsing with Late Decoding for Truly Low-Resource Languages |
16 May 2017 | Jindřich Helcl | Noam Shazeer et al.: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer |
23 May 2017 | Peter Zborovský | Luca Bertinetto et al.: Fully-Convolutional Siamese Networks for Object Tracking |
You can choose any paper you find interesting, but if you would like some inspiration, you can look at the following list.
Collections of Deep Learning Papers
- https://github.com/songrotek/Deep-Learning-Papers-Reading-Roadmap
- https://github.com/terryum/awesome-deep-learning-papers
Word Embeddings
- Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai: Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. https://arxiv.org/abs/1607.06520
Parsing
- Eliyahu Kiperwasser, Yoav Goldberg: Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations. https://arxiv.org/abs/1603.04351
- Yuan Zhang, David Weiss: Stack-propagation: Improved Representation Learning for Syntax. https://arxiv.org/abs/1603.06598
- Bernd Bohnet, Ryan McDonald, Emily Pitler, Ji Ma: Generalized Transition-based Dependency Parsing via Control Parameters. https://www.aclweb.org/anthology/P/P16/P16-1015.pdf
- Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher: A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. https://arxiv.org/abs/1611.01587
- Timothy Dozat, Christopher D. Manning: Deep Biaffine Attention for Neural Dependency Parsing. https://arxiv.org/abs/1611.01734
Neural Machine Translation
- Yonghui Wu et al.: Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. https://arxiv.org/abs/1609.08144
- Jason Lee, Kyunghyun Cho, Thomas Hofmann: Fully Character-Level Neural Machine Translation without Explicit Segmentation. https://arxiv.org/abs/1610.03017
- Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu: Neural Machine Translation in Linear Time. https://arxiv.org/abs/1610.10099
- Melvin Johnson et al.: Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. https://arxiv.org/abs/1611.04558
- Thanh-Le Ha, Jan Niehues, Alexander Waibel: Toward Multilingual Neural Machine Translation with Universal Encoder and Decoder. https://arxiv.org/abs/1611.04798
Language Correction
- Ziang Xie, Anand Avati, Naveen Arivazhagan, Dan Jurafsky, Andrew Y. Ng: Neural Language Correction with Character-Based Attention. https://arxiv.org/abs/1603.09727
Language Modelling
- Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, Yonghui Wu: Exploring the Limits of Language Modeling. https://arxiv.org/abs/1602.02410
Reinforcement Learning
- Frank S. He, Yang Liu, Alexander G. Schwing, Jian Peng: Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening. https://arxiv.org/abs/1611.01606
- Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck: Tuning Recurrent Neural Networks with Reinforcement Learning. https://arxiv.org/abs/1611.02796
- Piotr Mirowski, Razvan Pascanu, Fabio Viola, Hubert Soyer, Andrew J. Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran, Raia Hadsell: Learning to Navigate in Complex Environments. https://arxiv.org/abs/1611.03673
- Dani Yogatama, Phil Blunsom, Chris Dyer, Edward Grefenstette, Wang Ling: Learning to Compose Words into Sentences with Reinforcement Learning. https://arxiv.org/abs/1611.09100
- Chelsea Finn, Tianhe Yu, Justin Fu, Pieter Abbeel, Sergey Levine: Generalizing Skills with Semi-Supervised Reinforcement Learning. https://arxiv.org/abs/1612.00429
Program Generation
- Matej Balog, Alexander L. Gaunt, Marc Brockschmidt, Sebastian Nowozin, Daniel Tarlow: DeepCoder: Learning to Write Programs. https://openreview.net/pdf?id=ByldLrqlx
Adversarial Networks
- Lantao Yu, Weinan Zhang, Jun Wang, Yong Yu: SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient. https://arxiv.org/abs/1609.05473
- Leon Sixt, Benjamin Wild, Tim Landgraf: RenderGAN: Generating Realistic Labeled Data. https://arxiv.org/abs/1611.01331
- Jianwei Yang, Anitha Kannan, Dhruv Batra, Devi Parikh: LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation. https://openreview.net/pdf?id=HJ1kmv9xx
Network Architectures
- Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas: Learning to learn by gradient descent by gradient descent. https://arxiv.org/abs/1606.04474
- Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber: Recurrent Highway Networks. https://arxiv.org/abs/1607.03474
- Lingpeng Kong, Chris Alberti, Daniel Andor, Ivan Bogatyy, David Weiss: DRAGNN: A Transition-Based Framework for Dynamically Connected Neural Networks. https://openreview.net/pdf?id=BycCx8qex
- Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar: Designing Neural Network Architectures using Reinforcement Learning. https://arxiv.org/abs/1611.02167
Structured Prediction
- Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins: Globally Normalized Transition-Based Neural Networks. https://arxiv.org/abs/1603.06042
Image Labeling
- Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan: Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge. https://arxiv.org/abs/1609.06647
Image Recognition
- Yuntian Deng, Anssi Kanervisto, Alexander M. Rush: What You Get Is What You See: A Visual Markup Decompiler. https://arxiv.org/pdf/1609.04938v1.pdf
Image Enhancement
- Justin Johnson, Alexandre Alahi, Li Fei-Fei: Perceptual Losses for Real-Time Style Transfer and Super-Resolution. https://arxiv.org/abs/1603.08155
- Richard Zhang, Phillip Isola, Alexei A. Efros: Colorful Image Colorization. https://arxiv.org/abs/1603.08511
- Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi: Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. https://arxiv.org/abs/1609.04802
- Ryan Dahl, Mohammad Norouzi, Jonathon Shlens: Pixel Recursive Super Resolution. https://arxiv.org/pdf/1702.00783.pdf
Speech Synthesis
- Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu: WaveNet: A Generative Model for Raw Audio. https://arxiv.org/abs/1609.03499