Deep Learning Seminar, Summer 2018/19

In recent years, deep neural networks have been used to solve complex machine-learning problems, achieving state-of-the-art results in many areas. The field of deep learning is developing rapidly, with new methods and techniques emerging steadily.

The goal of the seminar is to follow the newest advancements in the deep learning field. The course takes the form of a reading group: each week, one of the students presents a paper. The paper is announced in advance, so all participants can read it beforehand and take part in the discussion.

If you want to receive announcements about the chosen papers, sign up for our mailing list ufal-rg@googlegroups.com.

About

SIS code: NPFL117
Semester: winter + summer
E-credits: 3
Examination: 0/2 C
Guarantor: Milan Straka

Timespace Coordinates

The Deep Learning Seminar takes place on Tuesday at 10:40 in S8. We will first meet on Tuesday Mar 05.

Requirements

To pass the course, you need to present a research paper and attend the presentations regularly.

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

To add your name to a paper in the table below, edit the source code on GitHub and send a pull request.

Date | Who | Topic | Paper(s)

05 Mar 2019 | Milan Straka | Optimization (see the weight-decay sketch below the table)
  • Noam Shazeer, Mitchell Stern: Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
  • Ilya Loshchilov, Frank Hutter: Decoupled Weight Decay Regularization
  • Sashank J. Reddi, Satyen Kale, Sanjiv Kumar: On the Convergence of Adam and Beyond
  • Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun: Adaptive Gradient Methods with Dynamic Bound of Learning Rate

12 Mar 2019 | No DL Seminar

19 Mar 2019 | Milan Straka | Optimization
  • James Martens, Roger Grosse: Optimizing Neural Networks with Kronecker-factored Approximate Curvature
  • Roger Grosse, James Martens: A Kronecker-factored approximate Fisher matrix for convolution layers
  • Jimmy Ba, Roger Grosse, James Martens: Distributed Second-Order Optimization using Kronecker-Factored Approximations
  • James Martens, Jimmy Ba: Kronecker-Factored Curvature Approximations for Recurrent Neural Networks
  • Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent: Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis

26 Mar 2019 | Milan Straka | AutoML
  • Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean: Efficient Neural Architecture Search via Parameter Sharing
  • Hanxiao Liu, Karen Simonyan, Yiming Yang: DARTS: Differentiable Architecture Search
  • Han Cai, Ligeng Zhu, Song Han: ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
  • David R. So, Chen Liang, Quoc V. Le: The Evolved Transformer

02 Apr 2019 | Martin Víta | NLP
  • Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes: Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
  • Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme: Hypothesis Only Baselines in Natural Language Inference
  • Amit Gajbhiye, Sardar Jaf, Noura Al Moubayed, A. Stephen McGough, Steven Bradley: An Exploration of Dropout with RNNs for Natural Language Inference

09 Apr 2019 | Tomas Soucek | GANs (see the spectral normalization sketch below the table)
  • Zhiming Zhou, Yuxuan Song, Lantao Yu, Hongwei Wang, Jiadong Liang, Weinan Zhang, Zhihua Zhang, Yong Yu: Understanding the Effectiveness of Lipschitz-Continuity in Generative Adversarial Nets
  • Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida: Spectral Normalization for Generative Adversarial Networks

16 Apr 2019 | Jakub Arnold | Glow (see the affine coupling sketch below the table)
  • Laurent Dinh, David Krueger, Yoshua Bengio: NICE: Non-linear Independent Components Estimation
  • Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio: Density estimation using Real NVP
  • Diederik P. Kingma, Prafulla Dhariwal: Glow: Generative Flow with Invertible 1x1 Convolutions

23 Apr 2019 | Tomáš Gavenčiak | Value learning
  • Joel Lehman et al.: The Surprising Creativity of Digital Evolution
  • P. Abbeel, A. Y. Ng: Apprenticeship learning via inverse reinforcement learning
  • Paul Christiano et al.: Deep reinforcement learning from human preferences
  • Possibly other IRL variants (MaxEnt, Bayesian) and notes on Cooperative IRL, Corrupt Reward MDPs, and Inverse Game Theory.

30 Apr 2019 | Štěpán Hojdar | Computer vision (see the focal loss sketch below the table)
  • Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár: Focal Loss for Dense Object Detection
  • Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan: FOTS: Fast Oriented Text Spotting with a Unified Network

07 May 2019 | Petra Doubravová | RL as planning
  • Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson: Learning Latent Dynamics for Planning from Pixels (https://planetrl.github.io/)

07 May 2019 | Felipe Vianna | RL credit assignment
  • Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter: RUDDER: Return Decomposition for Delayed Rewards

14 May 2019 | No DL Seminar (Rector's Day)

21 May 2019 | Surya Prakash | AutoRL
  • Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis: Learning Navigation Behaviors End-to-End with AutoRL
  • Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee: Long-Range Indoor Navigation with PRM-RL
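
Illustrative Code Sketches

The sketches below illustrate the core trick behind a few of the papers above. All of them are our own toy NumPy code: the function names, hyperparameter defaults, and usage examples are invented for illustration and are not reference implementations from the papers.

For the 05 Mar optimization session, the key idea of Loshchilov, Hutter: Decoupled Weight Decay Regularization (AdamW) is to apply weight decay directly to the weights instead of folding it into the adaptive gradient step:

    import numpy as np

    def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                   eps=1e-8, weight_decay=1e-2):
        # Standard Adam moment estimates with bias correction.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Adaptive gradient step...
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        # ...followed by weight decay applied directly to the weights,
        # decoupled from the adaptive step (the difference from plain Adam).
        theta = theta - lr * weight_decay * theta
        return theta, m, v

    # Toy usage: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta.
    theta, m, v = np.ones(3), np.zeros(3), np.zeros(3)
    for t in range(1, 101):
        theta, m, v = adamw_step(theta, theta, m, v, t)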
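
For the 09 Apr GAN session, Miyato et al.: Spectral Normalization for Generative Adversarial Networks divides each weight matrix by its largest singular value, estimated cheaply with power iteration; in a real implementation, u is kept as a persistent per-layer buffer across training steps:

    import numpy as np

    def spectral_normalize(W, u, power_iterations=1):
        # Power iteration: u and v approach the leading singular vectors of W.
        for _ in range(power_iterations):
            v = W.T @ u
            v /= np.linalg.norm(v) + 1e-12
            u = W @ v
            u /= np.linalg.norm(u) + 1e-12
        sigma = u @ W @ v      # estimate of the largest singular value of W
        return W / sigma, u    # the returned matrix has spectral norm ~ 1

    W = np.random.randn(64, 128)
    u = np.random.randn(64)
    W_sn, u = spectral_normalize(W, u, power_iterations=5)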
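
For the 16 Apr session on flows, the affine coupling layer shared by NICE, Real NVP, and Glow leaves half of the dimensions unchanged and uses them to parameterize an affine transform of the other half, making both the inverse and the Jacobian log-determinant cheap. The lambdas below are stand-ins for small learned networks:

    import numpy as np

    def coupling_forward(x, scale_net, shift_net):
        x1, x2 = np.split(x, 2, axis=-1)
        s, t = scale_net(x1), shift_net(x1)
        y2 = x2 * np.exp(s) + t
        log_det = s.sum(axis=-1)    # log|det J|, needed for the log-likelihood
        return np.concatenate([x1, y2], axis=-1), log_det

    def coupling_inverse(y, scale_net, shift_net):
        y1, y2 = np.split(y, 2, axis=-1)
        s, t = scale_net(y1), shift_net(y1)
        x2 = (y2 - t) * np.exp(-s)  # exactly invertible in closed form
        return np.concatenate([y1, x2], axis=-1)

    scale_net = lambda h: 0.1 * h   # stand-ins for learned networks
    shift_net = lambda h: h - 1.0
    x = np.random.randn(4, 8)
    y, log_det = coupling_forward(x, scale_net, shift_net)
    assert np.allclose(coupling_inverse(y, scale_net, shift_net), x)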
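
For the 30 Apr computer vision session, the focal loss of Lin et al. rescales the cross-entropy by (1 - p_t)^gamma, so easy, already well-classified examples contribute almost nothing and training focuses on the hard ones:

    import numpy as np

    def focal_loss(p, y, gamma=2.0, alpha=0.25):
        # p: predicted probability of the positive class, y: binary labels.
        p_t = np.where(y == 1, p, 1.0 - p)             # probability of the true class
        alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
        # The (1 - p_t)**gamma factor down-weights easy examples.
        return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + 1e-12)

    p = np.array([0.95, 0.30, 0.60])
    y = np.array([1, 1, 0])
    print(focal_loss(p, y))   # the easy positive (p=0.95) gets a tiny loss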

AI Safety and Inverse Reinforcement Learning Materials

Even though the talk by Tomáš Gavenčiak was cancelled, you can at least study the following materials he kindly sent us:

Intro and motivation:

  • General intro video (5m) by Stuart Russell (co-author of AIMA)
  • A nice video example of reward hacking in RL (OpenAI blog)
  • A paper with more value mis-specification examples (PDF with some pictures).

Why is it hard, why are the hard parts important:

  • Really nice talk (90m) by Yudkowsky on the AI alignment problem, with concrete math and simple models. One part (here in the video) deals with counterexamples to even simple problem specifications (e.g. giving the AI an off-switch and making it not want to interfere with it).

Inverse reinforcement learning and some variants:

  • A nice short video introduction with a basic SGD-based algorithm (3m).

  • A complete lecture from CVPR 2018, covering both an overview and the maths. It has parts on Maximum Entropy IRL (here) and later covers GAIL and other advanced techniques.

Example of success with simpler model:

  • Learning from human preferences (OpenAI blog with videos): teaching a "Hopper" figure to do a backflip, learning the reward function purely from humans comparing pairs of short video clips (fewer than 1,000 comparisons); a sketch of the comparison loss follows below.
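
To make the setup concrete, here is a minimal sketch of the loss such a reward model is trained with: the probability that one clip is preferred follows a Bradley-Terry model over the summed predicted rewards, fit with cross-entropy against the human's choice. This is our own toy NumPy code under that reading of the paper, not OpenAI's implementation:

    import numpy as np

    def preference_loss(r_a, r_b, human_prefers_a):
        # r_a, r_b: the model's predicted per-step rewards for the two clips.
        logit = r_a.sum() - r_b.sum()        # Bradley-Terry preference model
        p_a = 1.0 / (1.0 + np.exp(-logit))   # P(clip a is preferred)
        label = 1.0 if human_prefers_a else 0.0
        return -(label * np.log(p_a + 1e-12)
                 + (1 - label) * np.log(1 - p_a + 1e-12))

    # Toy usage: two 30-step clips with hypothetical reward predictions.
    r_a, r_b = np.random.randn(30), np.random.randn(30)
    loss = preference_loss(r_a, r_b, human_prefers_a=True)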