Are Multilingual Neural Machine Translation Models Better at Capturing Linguistic Features?

David Mareček, Hande Celikkanat, Miikka Silfverberg, Vinit Ravishankar, Jörg Tiedemann

References:

  1. Roee Aharoni and Yoav Goldberg. Towards String-To-Tree Neural Machine Translation. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 132–140, Association for Computational Linguistics, Vancouver, Canada, 2017. (http://doi.org/10.18653/v1/P17-2021)
  2. Mikel Artetxe and Holger Schwenk. Massively multilingual sentence embeddings for zero-shot cross-lingual transfer and beyond. Transactions of the Association for Computational Linguistics 7, pages 597–610, MIT Press, 2019. (http://doi.org/10.1162/tacl_a_00288)
  3. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural Machine Translation by Jointly Learning to Align and Translate. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015.
  4. Grzegorz Chrupała and Afra Alishahi. Correlating neural and symbolic representations of language. arXiv preprint arXiv:1905.06401, 2019.
  5. Y. J. Chu and T. H. Liu. On the Shortest Arborescence of a Directed Graph. Scientia Sinica 14, pages 1396–1400, 1965.
  6. Ondřej Cífka and Ondřej Bojar. Are BLEU and Meaning Representation in Opposition? In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1362–1371, Association for Computational Linguistics, Melbourne, Australia, 2018. (http://doi.org/10.18653/v1/P18-1126)
  7. Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, and Marco Baroni. What you can cram into a single vector: Probing sentence embeddings for linguistic properties. arXiv preprint arXiv:1805.01070, 2018. (http://doi.org/10.18653/v1/P18-1198)
  8. Alexis Conneau and Douwe Kiela. SentEval: An Evaluation Toolkit for Universal Sentence Representations. arXiv preprint arXiv:1803.05449, 2018.
  9. Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, and Veselin Stoyanov. XNLI: Evaluating cross-lingual sentence representations. arXiv preprint arXiv:1809.05053, 2018. (http://doi.org/10.18653/v1/D18-1269)
  10. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805, 2018.
  11. John Hewitt and Christopher D. Manning. A structural probe for finding syntax in word representations. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 4129–4138, 2019.
  12. Melvin Johnson, Mike Schuster, Quoc Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation. Transactions of the Association for Computational Linguistics 5, 2017. (http://doi.org/10.1162/tacl_a_00065)
  13. Philipp Koehn. Europarl: A parallel corpus for statistical machine translation. In MT Summit 5, pages 79–86, 2005.
  14. Sneha Reddy Kudugunta, Ankur Bapna, Isaac Caswell, Naveen Arivazhagan, and Orhan Firat. Investigating multilingual NMT representations at scale. arXiv preprint arXiv:1909.02197, 2019. (http://doi.org/10.18653/v1/D19-1167)
  15. Lucian Vlad Lita, Abe Ittycheriah, Salim Roukos, and Nanda Kambhatla. Truecasing. In Proceedings of the 41st Annual Meeting on Association for Computational Linguistics - Volume 1, pages 152–159, 2003. (http://doi.org/10.3115/1075096.1075116)
  16. Yichao Lu, Phillip Keung, Faisal Ladhak, Vikas Bhardwaj, Shaonan Zhang, and Jason Sun. A neural interlingua for multilingual machine translation. In Proceedings of the Third Conference on Machine Translation: Research Papers, pages 84–92, Association for Computational Linguistics, Brussels, Belgium, 2018. (http://doi.org/10.18653/v1/W18-6309)
  17. David Mareček and Rudolf Rosa. From Balustrades to Pierre Vinken: Looking for Syntax in Transformer Self-Attentions. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 263–275, Association for Computational Linguistics, Florence, Italy, 2019. (http://doi.org/10.18653/v1/W19-4827)
  18. Ryan McDonald, Fernando Pereira, Kiril Ribarov, and Jan Hajič. Non-Projective Dependency Parsing using Spanning Tree Algorithms. In HLT-EMNLP, pages 523–530, 2005. (http://doi.org/10.3115/1220575.1220641)
  19. Maria Nadejde, Siva Reddy, Rico Sennrich, Tomasz Dwojak, Marcin Junczys-Dowmunt, Philipp Koehn, and Alexandra Birch. Predicting target language CCG supertags improves neural machine translation. arXiv preprint arXiv:1702.01147, 2017. (http://doi.org/10.18653/v1/W17-4707)
  20. Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. Deep contextualized word representations. arXiv preprint arXiv:1802.05365, 2018. (http://doi.org/10.18653/v1/N18-1202)
  21. Alessandro Raganato, Raúl Vázquez, Mathias Creutz, and Jörg Tiedemann. An Evaluation of Language-Agnostic Inner-Attention-Based Representations in Machine Translation. In Proceedings of the 4th Workshop on Representation Learning for NLP (RepL4NLP-2019), pages 27–32, Association for Computational Linguistics, Florence, Italy, 2019. (http://doi.org/10.18653/v1/W19-4304)
  22. Alessandro Raganato and Jörg Tiedemann. An Analysis of Encoder Representations in Transformer-Based Machine Translation. In Proceedings of the 2018 EMNLP Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 287–297, Association for Computational Linguistics, Brussels, Belgium, 2018. (http://doi.org/10.18653/v1/W18-5431)
  23. Vinit Ravishankar, Lilja Øvrelid, and Erik Velldal. Probing Multilingual Sentence Representations With X-Probe. arXiv preprint arXiv:1906.05061, 2019. (http://doi.org/10.18653/v1/W19-4318)
  24. Holger Schwenk and Matthijs Douze. Learning Joint Multilingual Sentence Representations with Neural Machine Translation. In Rep4NLP@ACL, 2017. (http://doi.org/10.18653/v1/W17-2619)
  25. Dan Klein and Christopher D. Manning. Fast exact inference with a factored model for natural language parsing. In Advances in Neural Information Processing Systems, pages 3–10, 2003.
  26. Ke Tran, Arianna Bisazza, and Christof Monz. The Importance of Being Recurrent for Modeling Hierarchical Structure. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4731–4736, Association for Computational Linguistics, Brussels, Belgium, 2018. (http://doi.org/10.18653/v1/D18-1503)
  27. Milan Straka and Jana Straková. Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe. In Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, pages 88–99, Association for Computational Linguistics, Vancouver, Canada, 2017. (http://doi.org/10.18653/v1/K17-3009)
  28. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is All you Need. In Advances in Neural Information Processing Systems 30, pages 5998–6008, Curran Associates, Inc., 2017.