Deep Learning Seminar, Summer 2018/19

In recent years, deep neural networks have been used to solve complex machine-learning problems, achieving state-of-the-art results in many areas. The field of deep learning is developing rapidly, with new methods and techniques emerging steadily.

The goal of the seminar is to follow the newest advancements in the deep learning field. The course takes the form of a reading group: each week, one of the students presents a paper. The paper is announced in advance, so all participants can read it beforehand and take part in the discussion.

If you want to receive announcements about the chosen papers, sign up for our mailing list ufal-rg@googlegroups.com.

About

SIS code: NPFL117
Semester: winter + summer
E-credits: 3
Examination: 0/2 C
Guarantor: Milan Straka

Timespace Coordinates

The Deep Learning Seminar takes place on Tuesday at 10:40 in S8. We will first meet on Tuesday Mar 05.

Requirements

To pass the course, you need to present a research paper and attend the presentations regularly.

License

Unless otherwise stated, teaching materials for this course are available under CC BY-SA 4.0.

To add your name to a paper in the table below, edit the source code on GitHub and send a pull request.

Date | Who | Topic | Paper(s)

05 Mar 2019 | Milan Straka | Optimization (see the weight-decay sketch below the table)
  • Noam Shazeer, Mitchell Stern: Adafactor: Adaptive Learning Rates with Sublinear Memory Cost
  • Ilya Loshchilov, Frank Hutter: Decoupled Weight Decay Regularization
  • Sashank J. Reddi, Satyen Kale, Sanjiv Kumar: On the Convergence of Adam and Beyond
  • Liangchen Luo, Yuanhao Xiong, Yan Liu, Xu Sun: Adaptive Gradient Methods with Dynamic Bound of Learning Rate

12 Mar 2019 | No DL Seminar

19 Mar 2019 | Milan Straka | Optimization
  • James Martens, Roger Grosse: Optimizing Neural Networks with Kronecker-factored Approximate Curvature
  • Roger Grosse, James Martens: A Kronecker-factored approximate Fisher matrix for convolution layers
  • Jimmy Ba, Roger Grosse, James Martens: Distributed Second-Order Optimization using Kronecker-Factored Approximations
  • James Martens, Jimmy Ba: Kronecker-Factored Curvature Approximations for Recurrent Neural Networks
  • Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent: Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis

26 Mar 2019 | Milan Straka | AutoML
  • Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean: Efficient Neural Architecture Search via Parameter Sharing
  • Hanxiao Liu, Karen Simonyan, Yiming Yang: DARTS: Differentiable Architecture Search
  • Han Cai, Ligeng Zhu, Song Han: ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
  • David R. So, Chen Liang, Quoc V. Le: The Evolved Transformer

02 Apr 2019 | Martin Víta | NLP
  • Alexis Conneau, Douwe Kiela, Holger Schwenk, Loic Barrault, Antoine Bordes: Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
  • Adam Poliak, Jason Naradowsky, Aparajita Haldar, Rachel Rudinger, Benjamin Van Durme: Hypothesis Only Baselines in Natural Language Inference
  • Amit Gajbhiye, Sardar Jaf, Noura Al Moubayed, A. Stephen McGough, Steven Bradley: An Exploration of Dropout with RNNs for Natural Language Inference

09 Apr 2019 | Tomas Soucek | GANs (see the spectral normalization sketch below the table)
  • Zhiming Zhou, Yuxuan Song, Lantao Yu, Hongwei Wang, Jiadong Liang, Weinan Zhang, Zhihua Zhang, Yong Yu: Understanding the Effectiveness of Lipschitz-Continuity in Generative Adversarial Nets
  • Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida: Spectral Normalization for Generative Adversarial Networks

16 Apr 2019 | Jakub Arnold | Glow (see the affine coupling sketch below the table)
  • Laurent Dinh, David Krueger, Yoshua Bengio: NICE: Non-linear Independent Components Estimation
  • Laurent Dinh, Jascha Sohl-Dickstein, Samy Bengio: Density estimation using Real NVP
  • Diederik P. Kingma, Prafulla Dhariwal: Glow: Generative Flow with Invertible 1x1 Convolutions

23 Apr 2019 | Tomáš Gavenčiak | Value learning
  • Joel Lehman et al.: The Surprising Creativity of Digital Evolution
  • P. Abbeel, A. Y. Ng: Apprenticeship learning via inverse reinforcement learning
  • Paul Christiano et al.: Deep reinforcement learning from human preferences
  • Possibly other IRL variants (MaxEnt, Bayesian) and notes on Cooperative IRL, Corrupt Reward MDPs, and Inverse Game Theory.

30 Apr 2019 | Štěpán Hojdar | Computer vision (see the focal loss sketch below the table)
  • Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár: Focal Loss for Dense Object Detection
  • Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan: FOTS: Fast Oriented Text Spotting with a Unified Network

07 May 2019 | Petra Doubravová | RL as planning
  • Danijar Hafner, Timothy Lillicrap, Ian Fischer, Ruben Villegas, David Ha, Honglak Lee, James Davidson: Learning Latent Dynamics for Planning from Pixels (https://planetrl.github.io/)

07 May 2019 | Felipe Vianna | RL credit assignment
  • Jose A. Arjona-Medina, Michael Gillhofer, Michael Widrich, Thomas Unterthiner, Johannes Brandstetter, Sepp Hochreiter: RUDDER: Return Decomposition for Delayed Rewards

14 May 2019 | No DL Seminar (Rector's Day)

21 May 2019 | Surya Prakash | AutoRL
  • Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis: Learning Navigation Behaviors End-to-End with AutoRL
  • Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee: Long-Range Indoor Navigation with PRM-RL
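
Illustrative Code Sketches

The sketches below illustrate the core trick behind a few of the papers above. All of them are our own toy NumPy code: the function names, hyperparameter defaults, and usage examples are invented for illustration and are not reference implementations from the papers.

For the 05 Mar optimization session, the key idea of Loshchilov, Hutter: Decoupled Weight Decay Regularization (AdamW) is to apply weight decay directly to the weights instead of folding it into the adaptive gradient step:

    import numpy as np

    def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                   eps=1e-8, weight_decay=1e-2):
        # Standard Adam moment estimates with bias correction.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        # Adaptive gradient step...
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        # ...followed by weight decay applied directly to the weights,
        # decoupled from the adaptive step (the difference from plain Adam).
        theta = theta - lr * weight_decay * theta
        return theta, m, v

    # Toy usage: minimize f(theta) = ||theta||^2 / 2, whose gradient is theta.
    theta, m, v = np.ones(3), np.zeros(3), np.zeros(3)
    for t in range(1, 101):
        theta, m, v = adamw_step(theta, theta, m, v, t)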
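
For the 09 Apr GAN session, Miyato et al.: Spectral Normalization for Generative Adversarial Networks divides each weight matrix by its largest singular value, estimated cheaply with power iteration; in a real implementation, u is kept as a persistent per-layer buffer across training steps:

    import numpy as np

    def spectral_normalize(W, u, power_iterations=1):
        # Power iteration: u and v approach the leading singular vectors of W.
        for _ in range(power_iterations):
            v = W.T @ u
            v /= np.linalg.norm(v) + 1e-12
            u = W @ v
            u /= np.linalg.norm(u) + 1e-12
        sigma = u @ W @ v      # estimate of the largest singular value of W
        return W / sigma, u    # the returned matrix has spectral norm ~ 1

    W = np.random.randn(64, 128)
    u = np.random.randn(64)
    W_sn, u = spectral_normalize(W, u, power_iterations=5)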
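
For the 16 Apr session on flows, the affine coupling layer shared by NICE, Real NVP, and Glow leaves half of the dimensions unchanged and uses them to parameterize an affine transform of the other half, making both the inverse and the Jacobian log-determinant cheap. The lambdas below are stand-ins for small learned networks:

    import numpy as np

    def coupling_forward(x, scale_net, shift_net):
        x1, x2 = np.split(x, 2, axis=-1)
        s, t = scale_net(x1), shift_net(x1)
        y2 = x2 * np.exp(s) + t
        log_det = s.sum(axis=-1)    # log|det J|, needed for the log-likelihood
        return np.concatenate([x1, y2], axis=-1), log_det

    def coupling_inverse(y, scale_net, shift_net):
        y1, y2 = np.split(y, 2, axis=-1)
        s, t = scale_net(y1), shift_net(y1)
        x2 = (y2 - t) * np.exp(-s)  # exactly invertible in closed form
        return np.concatenate([y1, x2], axis=-1)

    scale_net = lambda h: 0.1 * h   # stand-ins for learned networks
    shift_net = lambda h: h - 1.0
    x = np.random.randn(4, 8)
    y, log_det = coupling_forward(x, scale_net, shift_net)
    assert np.allclose(coupling_inverse(y, scale_net, shift_net), x)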
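
For the 30 Apr computer vision session, the focal loss of Lin et al. rescales the cross-entropy by (1 - p_t)^gamma, so easy, already well-classified examples contribute almost nothing and training focuses on the hard ones:

    import numpy as np

    def focal_loss(p, y, gamma=2.0, alpha=0.25):
        # p: predicted probability of the positive class, y: binary labels.
        p_t = np.where(y == 1, p, 1.0 - p)             # probability of the true class
        alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
        # The (1 - p_t)**gamma factor down-weights easy examples.
        return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t + 1e-12)

    p = np.array([0.95, 0.30, 0.60])
    y = np.array([1, 1, 0])
    print(focal_loss(p, y))   # the easy positive (p=0.95) gets a tiny loss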

AI Safety and Inverse Reinforcement Learning Materials

Even though the talk by Tomáš Gavenčiak was cancelled, you can at least study the following materials he kindly sent us:

Intro and motivation:

  • General intro video (5m) by Stuart Russell (co-author of AIMA)
  • A nice video example of reward hacking in RL (OpenAI blog)
  • A paper with more value mis-specification examples (PDF with some pictures).

Why is it hard, why are the hard parts important:

  • Really nice talk (90m) by Yudkowsky on the AI alignment problem, with concrete math and simple models. One part (here in the video) deals with counterexamples to even simple problem specifications (e.g. giving the AI an off-switch and making it not want to interfere with it).

Inverse reinforcement learning and some variants:

  • A nice short video introduction with a basic SGD-based algorithm (3m).

  • A complete lecture from CVPR 2018, covering both an overview and the maths. It has parts on Maximum Entropy IRL (here) and later covers GAIL and other advanced techniques.

Example of success with simpler model:

  • Learning from human preferences (OpenAI blog with videos): teaching a "Hopper" figure to do a backflip, learning the reward function purely from humans comparing pairs of short video clips (fewer than 1,000 comparisons); a sketch of the comparison loss follows below.
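
To make the setup concrete, here is a minimal sketch of the loss such a reward model is trained with: the probability that one clip is preferred follows a Bradley-Terry model over the summed predicted rewards, fit with cross-entropy against the human's choice. This is our own toy NumPy code under that reading of the paper, not OpenAI's implementation:

    import numpy as np

    def preference_loss(r_a, r_b, human_prefers_a):
        # r_a, r_b: the model's predicted per-step rewards for the two clips.
        logit = r_a.sum() - r_b.sum()        # Bradley-Terry preference model
        p_a = 1.0 / (1.0 + np.exp(-logit))   # P(clip a is preferred)
        label = 1.0 if human_prefers_a else 0.0
        return -(label * np.log(p_a + 1e-12)
                 + (1 - label) * np.log(1 - p_a + 1e-12))

    # Toy usage: two 30-step clips with hypothetical reward predictions.
    r_a, r_b = np.random.randn(30), np.random.randn(30)
    loss = preference_loss(r_a, r_b, human_prefers_a=True)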