NLP

Word Embeddings

POS Tagging

  • Wang Ling, Tiago Luís, Luís Marujo, Ramón Fernandez Astudillo, Silvio Amir, Chris Dyer, Alan W. Black, Isabel Trancoso: Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation. https://arxiv.org/abs/1508.02096
  • Barbara Plank, Anders Søgaard, Yoav Goldberg: Multilingual Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Models and Auxiliary Loss. https://arxiv.org/abs/1604.05529
  • Kazuma Hashimoto, Caiming Xiong, Yoshimasa Tsuruoka, Richard Socher: A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks. https://arxiv.org/abs/1611.01587

NER

Parsing

Neural Machine Translation

Language Modelling

Language Correction

Summarization

Paraphrasing

Natural Language Generation

Speech Synthesis

  • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu: WaveNet: A Generative Model for Raw Audio. https://arxiv.org/abs/1609.03499
  • Yuxuan Wang, RJ Skerry-Ryan, Daisy Stanton, Yonghui Wu, Ron J. Weiss, Navdeep Jaitly, Zongheng Yang, Ying Xiao, Zhifeng Chen, Samy Bengio, Quoc Le, Yannis Agiomyrgiannakis, Rob Clark, Rif A. Saurous: Tacotron: Towards End-to-End Speech Synthesis. https://arxiv.org/abs/1703.10135
  • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis: Parallel WaveNet: Fast High-Fidelity Speech Synthesis. https://arxiv.org/abs/1711.10433
  • Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis, Yonghui Wu: Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions. https://arxiv.org/abs/1712.05884

Differential Privacy

  • H. Brendan McMahan, Eider Moore, Daniel Ramage, Seth Hampson, Blaise Agüera y Arcas: Communication-Efficient Learning of Deep Networks from Decentralized Data. https://arxiv.org/abs/1602.05629
  • Reza Shokri, Marco Stronati, Congzheng Song, Vitaly Shmatikov: Membership Inference Attacks against Machine Learning Models. https://arxiv.org/abs/1610.05820
  • Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth: Practical Secure Aggregation for Federated Learning on User-Held Data. https://arxiv.org/abs/1611.04482
  • H. Brendan McMahan, Daniel Ramage, Kunal Talwar, Li Zhang: Learning Differentially Private Recurrent Language Models. https://arxiv.org/abs/1710.06963
  • Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, Dawn Song: The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets. https://arxiv.org/abs/1802.08232
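
The federated-learning papers above all build on the federated averaging step of McMahan et al. (1602.05629): the server replaces its model with an average of the clients' parameter vectors, weighted by local dataset size. A minimal sketch, with parameters represented as flat Python lists purely for illustration:

```python
def federated_average(client_params, client_sizes):
    """FedAvg aggregation: weighted average of client parameter
    vectors, weight proportional to each client's dataset size."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    avg = [0.0] * dim
    for params, n in zip(client_params, client_sizes):
        w = n / total
        for i, p in enumerate(params):
            avg[i] += w * p
    return avg
```

The real algorithm also runs several local SGD epochs on each client before aggregating; only the aggregation step is shown here.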

Image Processing

Image Classification

Image Segmentation

Image Labeling

  • Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan: Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge. https://arxiv.org/abs/1609.06647
  • Martin Engilberge, Louis Chevallier, Patrick Pérez, Matthieu Cord: Finding beans in burgers: Deep semantic-visual embedding with localization. https://arxiv.org/abs/1804.01720

Image Recognition

Image Enhancement

Image 3D Reconstruction

  • Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, William T Freeman, Joshua B Tenenbaum: MarrNet: 3D Shape Reconstruction via 2.5D Sketches. https://arxiv.org/abs/1711.03129

Deep Learning

Training Methods

Activation Functions

Regularization

  • Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov: Improving neural networks by preventing co-adaptation of feature detectors. https://arxiv.org/abs/1207.0580
  • Sergey Ioffe, Christian Szegedy: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. https://arxiv.org/abs/1502.03167
  • César Laurent, Gabriel Pereyra, Philémon Brakel, Ying Zhang, Yoshua Bengio: Batch Normalized Recurrent Neural Networks. https://arxiv.org/abs/1510.01378
  • Yarin Gal, Zoubin Ghahramani: A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. https://arxiv.org/abs/1512.05287
  • Stanislau Semeniuta, Aliaksei Severyn, Erhardt Barth: Recurrent Dropout without Memory Loss. https://arxiv.org/abs/1603.05118
  • Tim Cooijmans, Nicolas Ballas, César Laurent, Çağlar Gülçehre, Aaron Courville: Recurrent Batch Normalization. https://arxiv.org/abs/1603.09025
  • David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Aaron Courville, Chris Pal: Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations. https://arxiv.org/abs/1606.01305
  • Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton: Layer Normalization. https://arxiv.org/abs/1607.06450
  • Sergey Ioffe: Batch Renormalization: Towards Reducing Minibatch Dependence in Batch-Normalized Models. https://arxiv.org/abs/1702.03275
  • Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz: mixup: Beyond Empirical Risk Minimization. https://arxiv.org/abs/1710.09412
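
Of the techniques above, mixup (1710.09412) is simple enough to sketch in a few lines: each training pair is a convex combination of two examples and their one-hot labels, with the mixing weight drawn from a Beta distribution. The list-based representation is an illustrative simplification:

```python
import random

def mixup(x1, y1, x2, y2, alpha=0.2):
    """mixup: blend two examples and their one-hot labels with a
    weight lam ~ Beta(alpha, alpha); both are mixed identically."""
    lam = random.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam
```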

Network Architectures

  • Rupesh Kumar Srivastava, Klaus Greff, Jürgen Schmidhuber: Training Very Deep Networks. https://arxiv.org/abs/1507.06228
  • Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas: Learning to learn by gradient descent by gradient descent. https://arxiv.org/abs/1606.04474
  • Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber: Recurrent Highway Networks. https://arxiv.org/abs/1607.03474
  • Barret Zoph, Quoc V. Le: Neural Architecture Search with Reinforcement Learning. https://arxiv.org/abs/1611.01578
  • Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar: Designing Neural Network Architectures using Reinforcement Learning. https://arxiv.org/abs/1611.02167
  • Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton, Jeff Dean: Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer. https://arxiv.org/abs/1701.06538
  • Danijar Hafner, Alex Irpan, James Davidson, Nicolas Heess: Learning Hierarchical Information Flow with Recurrent Neural Modules. https://arxiv.org/abs/1706.05744
  • Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le: Learning Transferable Architectures for Scalable Image Recognition. https://arxiv.org/abs/1707.07012
  • Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le: Neural Optimizer Search with Reinforcement Learning. https://arxiv.org/abs/1709.07417
  • Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy: Progressive Neural Architecture Search. https://arxiv.org/abs/1712.00559
  • Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le: Regularized Evolution for Image Classifier Architecture Search. https://arxiv.org/abs/1802.01548
  • Jason Liang, Elliot Meyerson, Risto Miikkulainen: Evolutionary Architecture Search For Deep Multitask Networks. https://arxiv.org/abs/1803.03745

Recurrent Units

  • Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio: Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. https://arxiv.org/abs/1406.1078
  • Tao Lei, Yu Zhang, Yoav Artzi: Training RNNs as Fast as CNNs. https://arxiv.org/abs/1709.02755
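
For reference, the GRU introduced by Cho et al. (1406.1078) can be sketched as a single recurrence step. This follows the paper's formulation (update gate interpolates toward the previous state); biases are omitted for brevity, and the pure-Python matrix helpers are illustrative only:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step:
         z  = sigmoid(Wz x + Uz h)        update gate
         r  = sigmoid(Wr x + Ur h)        reset gate
         h~ = tanh(Wh x + Uh (r * h))     candidate state
         h' = z * h + (1 - z) * h~"""
    z = [sigmoid(a + b) for a, b in zip(matvec(Wz, x), matvec(Uz, h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(Wr, x), matvec(Ur, h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    h_tilde = [math.tanh(a + b) for a, b in zip(matvec(Wh, x), matvec(Uh, rh))]
    return [zi * hi + (1 - zi) * hti for zi, hi, hti in zip(z, h, h_tilde)]
```

With all-zero weights, z = 0.5 and h~ = 0, so the state simply decays by half each step.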

Network Interpretation

Non-differentiable Loss Functions

  • Marc'Aurelio Ranzato, Sumit Chopra, Michael Auli, Wojciech Zaremba: Sequence Level Training with Recurrent Neural Networks. https://arxiv.org/abs/1511.06732
  • Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, Yang Liu: Minimum Risk Training for Neural Machine Translation. https://arxiv.org/abs/1512.02433
  • Dzmitry Bahdanau, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, Yoshua Bengio: An Actor-Critic Algorithm for Sequence Prediction. https://arxiv.org/abs/1607.07086
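
The papers above all optimize non-differentiable sequence-level rewards with policy-gradient-style estimators. The common core is the REINFORCE estimator, sketched here for a single categorical decision (the reward function and sampling setup are illustrative, not taken from any of the papers):

```python
import math, random

def softmax(logits):
    m = max(logits)
    e = [math.exp(l - m) for l in logits]
    s = sum(e)
    return [x / s for x in e]

def reinforce_grad(logits, reward_fn, rng, samples=100):
    """REINFORCE estimate of d E[reward] / d logits for a softmax policy:
    accumulate reward(a) * d log p(a)/d logits over sampled actions a,
    where d log p(a)/d logit_i = 1[i == a] - p_i."""
    p = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(samples):
        a = rng.choices(range(len(p)), weights=p)[0]
        r = reward_fn(a)
        for i in range(len(logits)):
            grad[i] += r * ((1.0 if i == a else 0.0) - p[i])
    return [g / samples for g in grad]
```

The papers differ mainly in how they reduce the variance of this estimator (curriculum schedules in MIXER, expected risk in MRT, a learned critic in the actor-critic approach).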

Structured Prediction

  • Daniel Andor, Chris Alberti, David Weiss, Aliaksei Severyn, Alessandro Presta, Kuzman Ganchev, Slav Petrov, Michael Collins: Globally Normalized Transition-Based Neural Networks. https://arxiv.org/abs/1603.06042
  • Sam Wiseman, Alexander M. Rush: Sequence-to-Sequence Learning as Beam-Search Optimization. https://arxiv.org/abs/1606.02960

Reinforcement Learning

Variational Autoencoders

Discrete Latent Variables

  • Eric Jang, Shixiang Gu, Ben Poole: Categorical Reparameterization with Gumbel-Softmax. https://arxiv.org/abs/1611.01144
  • Chris J. Maddison, Andriy Mnih, Yee Whye Teh: The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables. https://arxiv.org/abs/1611.00712
  • George Tucker, Andriy Mnih, Chris J. Maddison, Dieterich Lawson, Jascha Sohl-Dickstein: REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models. https://arxiv.org/abs/1703.07370
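
The Gumbel-Softmax / Concrete relaxation from the first two papers amounts to one short sampling rule: add Gumbel noise to the logits and take a temperature-scaled softmax, yielding a differentiable, approximately one-hot sample. A minimal sketch (the clamping of the uniform draw is a numerical safeguard, not part of the papers):

```python
import math, random

def gumbel_softmax_sample(logits, temperature=1.0):
    """Relaxed categorical sample: softmax((logits + Gumbel noise) / tau).
    As temperature -> 0 the output approaches a one-hot vector."""
    gumbels = [-math.log(-math.log(max(random.random(), 1e-12)))
               for _ in logits]
    scores = [(l + g) / temperature for l, g in zip(logits, gumbels)]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]
```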

Explicit Memory

Hyperparameter Optimization

Non-Gradient Methods

  • Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, Ilya Sutskever: Evolution Strategies as a Scalable Alternative to Reinforcement Learning. https://arxiv.org/abs/1703.03864
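
The evolution-strategies update of Salimans et al. fits in a few lines: perturb the parameters with Gaussian noise, evaluate fitness for each perturbation, and move along the noise directions weighted by fitness. This sketch subtracts the mean fitness as a baseline for variance reduction, a simplification of the rank transformation and antithetic sampling used in the paper:

```python
import random

def es_step(params, fitness, rng, sigma=0.1, lr=0.02, pop=50):
    """One evolution-strategies step: estimate the gradient of expected
    fitness from pop Gaussian perturbations and take an ascent step."""
    samples = []
    for _ in range(pop):
        eps = [rng.gauss(0.0, 1.0) for _ in params]
        f = fitness([p + sigma * e for p, e in zip(params, eps)])
        samples.append((f, eps))
    baseline = sum(f for f, _ in samples) / pop
    grad = [0.0] * len(params)
    for f, eps in samples:
        for i, e in enumerate(eps):
            grad[i] += (f - baseline) * e
    return [p + lr * g / (pop * sigma) for p, g in zip(params, grad)]
```

No backpropagation is needed, which is what makes the method attractive for non-differentiable reinforcement-learning objectives.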

Adversarial Networks

Generative Adversarial Networks

Adversarial Images

Adversarial Text

Adversarial Speech

  • Nicholas Carlini, David Wagner: Audio Adversarial Examples: Targeted Attacks on Speech-to-Text. https://arxiv.org/abs/1801.01944
  • Xuejing Yuan, Yuxuan Chen, Yue Zhao, Yunhui Long, Xiaokang Liu, Kai Chen, Shengzhi Zhang, Heqing Huang, Xiaofeng Wang, Carl A. Gunter: CommanderSong: A Systematic Approach for Practical Adversarial Voice Recognition. https://arxiv.org/abs/1801.08535

Artificial Intelligence

  • Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, Igor Mordatch: Emergent Complexity via Multi-Agent Competition. https://arxiv.org/abs/1710.03748
  • David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis: Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. https://arxiv.org/abs/1712.01815

Books

Blogs

Paper Lists