Replacing Linguists with Dummies: A Serious Need for Trivial Baselines in Multi-Task Neural Machine Translation

Daniel Kondratyuk, Ronald Cardenas, Ondřej Bojar

References:

  1. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
  2. Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, and Marcello Federico. Neural versus Phrase-Based Machine Translation Quality: a Case Study. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 257–267, 2016. (http://doi.org/10.18653/v1/D16-1025)
  3. James Bradbury and Richard Socher. Towards Neural Machine Translation with Latent Tree Attention. In Proceedings of the 2nd Workshop on Structured Prediction for Natural Language Processing, pages 12–16, 2017. (http://doi.org/10.18653/v1/W17-4303)
  4. Chris Dyer, Adhiguna Kuncoro, Miguel Ballesteros, and Noah A Smith. Recurrent Neural Network Grammars. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 199–209, 2016. (http://doi.org/10.18653/v1/N16-1024)
  5. Akiko Eriguchi, Kazuma Hashimoto, and Yoshimasa Tsuruoka. Tree-to-Sequence Attentional Neural Machine Translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 823–833, 2016. (http://doi.org/10.18653/v1/P16-1078)
  6. Akiko Eriguchi, Yoshimasa Tsuruoka, and Kyunghyun Cho. Learning to Parse and Translate Improves Neural Machine Translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 72–78, 2017. (http://doi.org/10.18653/v1/P17-2012)
  7. Julia Hockenmaier and Mark Steedman. CCGbank: a corpus of CCG derivations and dependency structures extracted from the Penn Treebank. Computational Linguistics 33, pages 355–396, MIT Press, 2007. (http://doi.org/10.1162/coli.2007.33.3.355)
  8. Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2017.
  9. Philipp Koehn, Hieu Hoang, Alexandra Birch, Chris Callison-Burch, Marcello Federico, Nicola Bertoldi, Brooke Cowan, Wade Shen, Christine Moran, Richard Zens, Chris Dyer, Ondřej Bojar, Alexandra Constantin, and Evan Herbst. Moses: Open Source Toolkit for Statistical Machine Translation. In ACL 2007, Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, pages 177–180, Association for Computational Linguistics, Prague, Czech Republic, 2007. (http://doi.org/10.3115/1557769.1557821)
  10. Mike Lewis, Luheng He, and Luke Zettlemoyer. Joint A* CCG parsing and semantic role labelling. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1444–1454, 2015. (http://doi.org/10.18653/v1/D15-1169)
  11. Maria Nadejde, Siva Reddy, Rico Sennrich, Tomasz Dwojak, Marcin Junczys-Dowmunt, Philipp Koehn, and Alexandra Birch. Predicting Target Language CCG Supertags Improves Neural Machine Translation. In Proceedings of the Second Conference on Machine Translation, pages 68–79, 2017. (http://doi.org/10.18653/v1/W17-4707)
  12. Jindřich Helcl and Jindřich Libovický. Neural Monkey: An Open-source Tool for Sequence Learning. The Prague Bulletin of Mathematical Linguistics, pages 5–17, Prague, Czech Republic, 2017. (http://doi.org/10.1515/pralin-2017-0001)
  13. Jan Niehues and Eunah Cho. Exploiting Linguistic Resources for Neural Machine Translation Using Multi-task Learning. In Proceedings of the Second Conference on Machine Translation, pages 80–89, 2017. (http://doi.org/10.18653/v1/W17-4708)
  14. Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pages 311–318, 2002. (http://doi.org/10.3115/1073083.1073135)
  15. Martin Popel and Ondřej Bojar. Training Tips for the Transformer Model. The Prague Bulletin of Mathematical Linguistics 110, pages 43–70, 2018. (http://doi.org/10.2478/pralin-2018-0002)
  16. Rico Sennrich, Barry Haddow, and Alexandra Birch. Neural Machine Translation of Rare Words with Subword Units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1715–1725, 2016. (http://doi.org/10.18653/v1/P16-1162)
  17. Xing Shi, Inkit Padhi, and Kevin Knight. Does string-based neural MT learn source syntax? In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1526–1534, 2016. (http://doi.org/10.18653/v1/D16-1159)
  18. Mark Steedman. The syntactic process. MIT Press, 2000. (http://doi.org/10.7551/mitpress/6591.001.0001)
  19. Kai Sheng Tai, Richard Socher, and Christopher D Manning. Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075, 2015. (http://doi.org/10.3115/v1/P15-1150)
  20. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, pages 6000–6010, 2017.