NLP

Word Embeddings

POS Tagging

  • Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher: A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. https://arxiv.org/abs/1611.01587
  • Barbara Plank, Anders Søgaard, Yoav Goldberg: Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss. https://arxiv.org/abs/1604.05529
  • Wang Ling, Tiago Luís, Luís Marujo, Ramón Fernandez Astudillo, Silvio Amir, Chris Dyer, Alan W. Black, Isabel Trancoso: Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation. https://arxiv.org/abs/1508.02096

Parsing

Coreference

  • Ali Emami, Paul Trichelair, Adam Trischler, Kaheer Suleman, Hannes Schulz, Jackie Chi Kit Cheung: The KnowRef Coreference Corpus: Removing Gender and Number Cues for Difficult Pronominal Anaphora Resolution. https://arxiv.org/abs/1811.01747

NER, NEL

Knowledge Graphs

  • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum: Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning. https://arxiv.org/abs/1711.05851
  • Xi Victoria Lin, Richard Socher, Caiming Xiong: Multi-Hop Knowledge Graph Reasoning with Reward Shaping. https://www.aclweb.org/anthology/D18-1362.pdf

Q&A

  • Dayiheng Liu, Yeyun Gong, Jie Fu, Yu Yan, Jiusheng Chen, Daxin Jiang, Jiancheng Lv, Nan Duan: RikiNet: Reading Wikipedia Pages for Natural Question Answering. https://arxiv.org/abs/2004.14560
  • Adam Roberts, Colin Raffel, Noam Shazeer: How Much Knowledge Can You Pack Into the Parameters of a Language Model? https://arxiv.org/abs/2002.08910
  • Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Ming-Wei Chang: REALM: Retrieval-Augmented Language Model Pre-Training. https://arxiv.org/abs/2002.08909
  • Patrick Lewis, Barlas Oğuz, Ruty Rinott, Sebastian Riedel, Holger Schwenk: MLQA: Evaluating Cross-lingual Extractive Question Answering. https://arxiv.org/abs/1910.07475
  • Tsung-yuan Hsu, Chi-liang Liu, Hung-yi Lee: Zero-shot Reading Comprehension by Cross-lingual Transfer Learning with Multi-lingual Language Representation Model. https://arxiv.org/abs/1909.09587
  • Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil: Frustratingly Easy Natural Question Answering. https://arxiv.org/abs/1909.05286
  • Zhuosheng Zhang, Yuwei Wu, Junru Zhou, Sufeng Duan, Hai Zhao, Rui Wang: SG-Net: Syntax-Guided Machine Reading Comprehension. https://arxiv.org/abs/1908.05147
  • Chris Alberti, Kenton Lee, Michael Collins: A BERT Baseline for the Natural Questions. https://arxiv.org/abs/1901.08634

Contextualized Embeddings, BERT

  • Prakhar Ganesh, Yao Chen, Xin Lou, Mohammad Ali Khan, Yin Yang, Deming Chen, Marianne Winslett, Hassan Sajjad, Preslav Nakov: Compressing Large-Scale Transformer-Based Models: A Case Study on BERT. https://arxiv.org/abs/2002.11985
  • Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning: ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. https://openreview.net/pdf?id=r1xMH1BtvB
  • Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut: ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. https://arxiv.org/abs/1909.11942
  • Matthew E. Peters, Mark Neumann, Robert L. Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, Noah A. Smith: Knowledge Enhanced Contextual Word Representations. https://arxiv.org/abs/1909.04164
  • Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou: Semantics-aware BERT for Language Understanding. https://arxiv.org/abs/1909.02209
  • Kawin Ethayarajh: How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings. https://arxiv.org/abs/1909.00512
  • Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, Veselin Stoyanov: RoBERTa: A Robustly Optimized BERT Pretraining Approach. https://arxiv.org/abs/1907.11692
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https://arxiv.org/abs/1810.04805

Cross-lingual Embeddings

  • Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov: Unsupervised Cross-lingual Representation Learning at Scale. https://arxiv.org/abs/1911.02116
  • Shijie Wu, Mark Dredze: Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT. https://arxiv.org/abs/1904.09077
  • Mikel Artetxe, Holger Schwenk: Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond. https://arxiv.org/abs/1812.10464
  • Omer Levy, Anders Søgaard, Yoav Goldberg: Reconsidering Cross-lingual Word Embeddings. https://arxiv.org/abs/1608.05426

Transformers

  • Alessandro Raganato, Yves Scherrer, Jörg Tiedemann: Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation. https://arxiv.org/abs/2002.10260
  • Nikita Kitaev, Łukasz Kaiser, Anselm Levskaya: Reformer: The Efficient Transformer. https://arxiv.org/abs/2001.04451
  • Stephen Merity: Single Headed Attention RNN: Stop Thinking With Your Head. https://arxiv.org/abs/1911.11423
  • Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. https://arxiv.org/abs/1910.10683
  • Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit: Insertion Transformer: Flexible Sequence Generation via Insertion Operations. https://arxiv.org/abs/1902.03249
  • Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser: Universal Transformers. https://arxiv.org/abs/1807.03819
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin: Attention Is All You Need. https://arxiv.org/abs/1706.03762

NMT

LM

  • Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Tom Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei: Language Models are Few-Shot Learners. https://arxiv.org/abs/2005.14165 GPT-3
  • Nitish Shirish Keskar, Bryan McCann, Lav R. Varshney, Caiming Xiong, Richard Socher: CTRL: A Conditional Transformer Language Model for Controllable Generation. https://arxiv.org/abs/1909.05858
  • Julian Eisenschlos, Sebastian Ruder, Piotr Czapla, Marcin Kardas, Sylvain Gugger, Jeremy Howard: MultiFiT: Efficient Multi-lingual Language Model Fine-tuning. https://arxiv.org/abs/1909.04761
  • Alec Radford et al.: Language Models are Unsupervised Multitask Learners. https://d4mucfpksywv.cloudfront.net/better-language-models/language-models.pdf
  • Guillaume Lample, Alexis Conneau: Cross-lingual Language Model Pretraining. https://arxiv.org/abs/1901.07291
  • Jeremy Howard, Sebastian Ruder: Universal Language Model Fine-tuning for Text Classification. https://arxiv.org/abs/1801.06146
  • Anirudh Goyal, Nan Rosemary Ke, Alex Lamb, R Devon Hjelm, Chris Pal, Joelle Pineau, Yoshua Bengio: ACtuAL: Actor-Critic Under Adversarial Learning. https://arxiv.org/abs/1711.04755
  • Gábor Melis, Chris Dyer, Phil Blunsom: On the State of the Art of Evaluation in Neural Language Models. https://arxiv.org/abs/1707.05589
  • Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, Yonghui Wu: Exploring the Limits of Language Modeling. https://arxiv.org/abs/1602.02410

GEC

Summarization

Paraphrasing

NLG

Speech Recognition

  • Yanzhang He, Tara N. Sainath, Rohit Prabhavalkar, Ian McGraw, Raziel Alvarez, Ding Zhao, David Rybach, Anjuli Kannan, Yonghui Wu, Ruoming Pang, Qiao Liang, Deepti Bhatia, Yuan Shangguan, Bo Li, Golan Pundak, Khe Chai Sim, Tom Bagby, Shuo-yiin Chang, Kanishka Rao, Alexander Gruenstein: Streaming End-to-end Speech Recognition For Mobile Devices. https://arxiv.org/abs/1811.06621

Speech Synthesis

  • Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu: Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. https://arxiv.org/abs/1712.05884
  • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis: Parallel WaveNet: Fast High-Fidelity Speech Synthesis. https://arxiv.org/abs/1711.10433
  • Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous: Tacotron: Towards End-to-End Speech Synthesis. https://arxiv.org/abs/1703.10135
  • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu: WaveNet: A Generative Model for Raw Audio. https://arxiv.org/abs/1609.03499

Differential Privacy

  • Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, Dawn Song: The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets. https://arxiv.org/abs/1802.08232
  • H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang: Learning Differentially Private Recurrent Language Models. https://arxiv.org/abs/1710.06963
  • Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth: Practical Secure Aggregation for Federated Learning on User-Held Data. https://arxiv.org/abs/1611.04482
  • Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov: Membership Inference Attacks against Machine Learning Models. https://arxiv.org/abs/1610.05820
  • H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas: Communication-Efficient Learning of Deep Networks from Decentralized Data. https://arxiv.org/abs/1602.05629

Adversarial Text

Adversarial Speech

  • Xuejing Yuan, Yuxuan Chen, Yue Zhao, Yunhui Long, Xiaokang Liu, Kai Chen, Shengzhi Zhang, Heqing Huang, Xiaofeng Wang, Carl A. Gunter: CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition. https://arxiv.org/abs/1801.08535
  • Nicholas Carlini, David Wagner: Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. https://arxiv.org/abs/1801.01944

Fake News

  • Rowan Zellers, Ari Holtzman, Hannah Rashkin, Yonatan Bisk, Ali Farhadi, Franziska Roesner, Yejin Choi: Defending Against Neural Fake News. https://arxiv.org/abs/1905.12616

Images

Image Classification

Object Detection and Image Segmentation

Image Labeling

  • Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord: Finding beans in burgers: Deep semantic-visual embedding with localization. https://arxiv.org/abs/1804.01720
  • Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan: Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge. https://arxiv.org/abs/1609.06647

Image Data Augmentation

  • Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan Yuille, Quoc V. Le: Adversarial Examples Improve Image Recognition. https://arxiv.org/abs/1911.09665
  • Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le: Self-training with Noisy Student improves ImageNet classification. https://arxiv.org/abs/1911.04252
  • Ekin D. Cubuk, Barret Zoph, Jonathon Shlens, Quoc V. Le: RandAugment: Practical automated data augmentation with a reduced search space. https://arxiv.org/abs/1909.13719
  • Ekin D. Cubuk, Barret Zoph, Dandelion Mane, Vijay Vasudevan, Quoc V. Le: AutoAugment: Learning Augmentation Policies from Data. https://arxiv.org/abs/1805.09501

Generative Adversarial Networks

Adversarial Images

OCR

Image Enhancement

3D Objects

  • Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, William T Freeman, Joshua B Tenenbaum: MarrNet: 3D Shape Reconstruction via 2.5D Sketches. https://arxiv.org/abs/1711.03129

Deep Learning

Optimization

Activation Functions

Regularization

  • Takashi Ishida, Ikko Yamane, Tomoya Sakai, Gang Niu, Masashi Sugiyama: Do We Need Zero Training Loss After Achieving Zero Training Error? https://arxiv.org/abs/2002.08709 Flooding
  • Deren Lei, Zichen Sun, Yijun Xiao, William Yang Wang: Implicit Regularization of Stochastic Gradient Descent in Natural Language Processing: Observations and Implications. https://arxiv.org/abs/1811.00659
  • Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz: mixup: Beyond Empirical Risk Minimization. https://arxiv.org/abs/1710.09412
  • Sergey Ioffe: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models. https://arxiv.org/abs/1702.03275
  • Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton: Layer Normalization. https://arxiv.org/abs/1607.06450
  • David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Aaron Courville, Chris Pal: Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations. https://arxiv.org/abs/1606.01305
  • Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville: Recurrent Batch Normalization. https://arxiv.org/abs/1603.09025
  • Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth: Recurrent Dropout without Memory Loss. https://arxiv.org/abs/1603.05118
  • Yarin Gal, Zoubin Ghahramani: A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. https://arxiv.org/abs/1512.05287
  • César Laurent, Gabriel Pereyra, Philémon Brakel, Ying Zhang, Yoshua Bengio: Batch Normalized Recurrent Neural Networks. https://arxiv.org/abs/1510.01378
  • Sergey Ioffe, Christian Szegedy: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
  • Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov: Improving neural networks by preventing co-adaptation of feature detectors. https://arxiv.org/abs/1207.0580

Generalization

  • Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals: Understanding deep learning requires rethinking generalization. https://arxiv.org/abs/1611.03530

Architectures

  • Jason Liang, Elliot Meyerson, Risto Miikkulainen: Evolutionary Architecture Search For Deep Multitask Networks. https://arxiv.org/abs/1803.03745
  • Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le: Regularized Evolution for Image Classifier Architecture Search. https://arxiv.org/abs/1802.01548
  • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy: Progressive Neural Architecture Search. https://arxiv.org/abs/1712.00559
  • Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le: Neural Optimizer Search with Reinforcement Learning. https://arxiv.org/abs/1709.07417
  • Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le: Learning Transferable Architectures for Scalable Image Recognition. https://arxiv.org/abs/1707.07012
  • Danijar Hafner, Alex Irpan, James Davidson, Nicolas Heess: Learning Hierarchical Information Flow with Recurrent Neural Modules. https://arxiv.org/abs/1706.05744
  • Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. https://arxiv.org/abs/1701.06538
  • Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar: Designing Neural Network Architectures using Reinforcement Learning. https://arxiv.org/abs/1611.02167
  • Barret Zoph, Quoc V. Le: Neural Architecture Search with Reinforcement Learning. https://arxiv.org/abs/1611.01578
  • Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber: Recurrent Highway Networks. https://arxiv.org/abs/1607.03474
  • Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas: Learning to learn by gradient descent by gradient descent. https://arxiv.org/abs/1606.04474
  • Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber: Training Very Deep Networks. https://arxiv.org/abs/1507.06228

Recurrent Cells

Model Interpretation

Structured Prediction

  • Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio: An Actor-Critic Algorithm for Sequence Prediction. https://arxiv.org/abs/1607.07086
  • Sam Wiseman, Alexander M. Rush: Sequence-to-Sequence Learning as Beam-Search Optimization. https://arxiv.org/abs/1606.02960
  • Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins: Globally Normalized Transition-Based Neural Networks. https://arxiv.org/abs/1603.06042
  • Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu: Minimum Risk Training for Neural Machine Translation. https://arxiv.org/abs/1512.02433
  • Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba: Sequence Level Training with Recurrent Neural Networks. https://arxiv.org/abs/1511.06732

Variational Autoencoders

Double Descent

  • Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever: Deep Double Descent: Where Bigger Models and More Data Hurt. https://arxiv.org/abs/1912.02292
  • Mikhail Belkin, Daniel Hsu, Siyuan Ma, Soumik Mandal: Reconciling modern machine learning practice and the bias-variance trade-off. https://arxiv.org/abs/1812.11118
  • Hartmut Maennel, Olivier Bousquet, Sylvain Gelly: Gradient Descent Quantizes ReLU Network Features. https://arxiv.org/abs/1803.08367

Neural ODEs

Evaluation

Artificial Intelligence

RL

RL – Actor Critic

  • Sriram Srinivasan, Marc Lanctot, Vinicius Zambaldi, Julien Perolat, Karl Tuyls, Remi Munos, Michael Bowling: Actor-Critic Policy Optimization in Partially Observable Multiagent Environments. https://arxiv.org/abs/1810.09026
  • Matteo Hessel, Hubert Soyer, Lasse Espeholt, Wojciech Czarnecki, Simon Schmitt, Hado van Hasselt: Multi-task Deep Reinforcement Learning with PopArt. https://arxiv.org/abs/1809.04474
  • Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu: IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures. https://arxiv.org/abs/1802.01561
  • Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu: Population Based Training of Neural Networks. https://arxiv.org/abs/1711.09846
  • Alfredo V. Clemente, Humberto N. Castejón, Arjun Chandra: Efficient Parallel Methods for Deep Reinforcement Learning. https://arxiv.org/abs/1705.04862
  • Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu: Asynchronous Methods for Deep Reinforcement Learning. https://arxiv.org/abs/1602.01783 A3C

RL – DQN

Continuous RL

  • Rui Wang, Joel Lehman, Jeff Clune, Kenneth O. Stanley: Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions. https://arxiv.org/abs/1901.01753
  • Scott Fujimoto, Herke van Hoof, David Meger: Addressing Function Approximation Error in Actor-Critic Methods. https://arxiv.org/abs/1802.09477
  • Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine: Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. https://arxiv.org/abs/1801.01290
  • Piotr Mirowski, Razvan Pascanu, Fabio Viola, Hubert Soyer, Andrew J. Ballard, Andrea Banino, Misha Denil, Ross Goroshin, Laurent Sifre, Koray Kavukcuoglu, Dharshan Kumaran, Raia Hadsell: Learning to Navigate in Complex Environments. https://arxiv.org/abs/1611.03673
  • Yan Duan, Xi Chen, Rein Houthooft, John Schulman, Pieter Abbeel: Benchmarking Deep Reinforcement Learning for Continuous Control. https://arxiv.org/abs/1604.06778
  • Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra: Continuous control with deep reinforcement learning. https://arxiv.org/abs/1509.02971
  • David Silver, Guy Lever, Nicolas Heess, Thomas Degris, Daan Wierstra, Martin Riedmiller: Deterministic policy gradient algorithms. http://jmlr.org/proceedings/papers/v32/silver14.pdf

Model-based RL

Multi-agent RL

AutoML, AutoRL

  • Mingxing Tan, Ruoming Pang, Quoc V. Le: EfficientDet: Scalable and Efficient Object Detection. https://arxiv.org/abs/1911.09070
  • Mingxing Tan, Quoc V. Le: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. https://arxiv.org/abs/1905.11946
  • Anthony Francis, Aleksandra Faust, Hao-Tien Lewis Chiang, Jasmine Hsu, J. Chase Kew, Marek Fiser, Tsang-Wei Edward Lee: Long-Range Indoor Navigation with PRM-RL. https://arxiv.org/abs/1902.09458
  • Hao-Tien Lewis Chiang, Aleksandra Faust, Marek Fiser, Anthony Francis: Learning Navigation Behaviors End-to-End with AutoRL. https://arxiv.org/abs/1809.10124
  • Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Mark Sandler, Andrew Howard, Quoc V. Le: MnasNet: Platform-Aware Neural Architecture Search for Mobile. https://arxiv.org/abs/1807.11626
  • Hanxiao Liu, Karen Simonyan, Yiming Yang: DARTS: Differentiable Architecture Search. https://arxiv.org/abs/1806.09055
  • Tien-Ju Yang, Andrew Howard, Bo Chen, Xiao Zhang, Alec Go, Mark Sandler, Vivienne Sze, Hartwig Adam: NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications. https://arxiv.org/abs/1804.03230
  • Hieu Pham, Melody Y. Guan, Barret Zoph, Quoc V. Le, Jeff Dean: Efficient Neural Architecture Search via Parameter Sharing. https://arxiv.org/abs/1802.03268
  • Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le: Regularized Evolution for Image Classifier Architecture Search. https://arxiv.org/abs/1802.01548
  • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy: Progressive Neural Architecture Search. https://arxiv.org/abs/1712.00559
  • Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le: Learning Transferable Architectures for Scalable Image Recognition. https://arxiv.org/abs/1707.07012
  • Natasha Jaques, Shixiang Gu, Richard E. Turner, Douglas Eck: Tuning Recurrent Neural Networks with Reinforcement Learning. https://arxiv.org/abs/1611.02796

Meta Learning

Discrete Latent Variables

Explicit Memory

  • Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, Luke Marris, Guy Lever, Antonio Garcia Castaneda, Charles Beattie, Neil C. Rabinowitz, Ari S. Morcos, Avraham Ruderman, Nicolas Sonnerat, Tim Green, Louise Deason, Joel Z. Leibo, David Silver, Demis Hassabis, Koray Kavukcuoglu, Thore Graepel: Human-level performance in first-person multiplayer games with population-based deep reinforcement learning. https://arxiv.org/abs/1807.01281
  • Greg Wayne, Chia-Chun Hung, David Amos, Mehdi Mirza, Arun Ahuja, Agnieszka Grabska-Barwinska, Jack Rae, Piotr Mirowski, Joel Z. Leibo, Adam Santoro, Mevlana Gemici, Malcolm Reynolds, Tim Harley, Josh Abramson, Shakir Mohamed, Danilo Rezende, David Saxton, Adam Cain, Chloe Hillier, David Silver, Koray Kavukcuoglu, Matt Botvinick, Demis Hassabis, Timothy Lillicrap: Unsupervised Predictive Memory in a Goal-Directed Agent. https://arxiv.org/abs/1803.10760
  • Mevlana Gemici, Chia-Chun Hung, Adam Santoro, Greg Wayne, Shakir Mohamed, Danilo J. Rezende, David Amos, Timothy Lillicrap: Generative Temporal Models with Memory. https://arxiv.org/abs/1702.04649
  • Caglar Gulcehre, Sarath Chandar, Yoshua Bengio: Memory Augmented Neural Networks with Wormhole Connections. https://arxiv.org/abs/1701.08718
  • Alex Graves et al.: Hybrid computing using a neural network with dynamic external memory. https://www.gwern.net/docs/2016-graves.pdf
  • Alex Graves, Greg Wayne, Ivo Danihelka: Neural Turing Machines. https://arxiv.org/abs/1410.5401

Hyperparameter Optimization

Evolution

  • Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, Ilya Sutskever: Evolution Strategies as a Scalable Alternative to Reinforcement Learning. https://arxiv.org/abs/1703.03864

Misc

Books

Blogs