LAMPLE Guillaume
Supervision : Ludovic DENOYER
Unsupervised Machine Translation
Recent advances in machine translation have obtained very promising results: the quality of translations provided by deep learning models is now close to human performance. However, these models still rely on large amounts of bilingual resources, which are available only for a few language pairs such as English-French. For the majority of languages, parallel resources are scarce, and the quality of translations provided by traditional systems is not satisfactory. Monolingual resources, however, are widely available. Several studies have shown that these resources can improve the performance of standard supervised systems, but these approaches still rely on the existence of large bilingual corpora.
In this thesis, we investigate the problem of unsupervised machine translation, where the goal is to build a machine translation system from monolingual corpora exclusively. We show that fully unsupervised machine translation is not only possible, but that it can also have a significant impact in real-world applications, sometimes outperforming supervised models in low-resource language pairs such as English-Urdu or English-Nep
Defence : 10/17/2019
Jury members :
François Yvon (LIMSI-CNRS)
Kevin Knight (Department of Computer Science of the University of Southern California)
Rico Sennrich (University of Edinburgh)
Alexander Rush (Harvard School of Engineering and Applied Sciences)
Patrick Gallinari (Sorbonne University)
Mikaela Keller (Centre de Recherche en Informatique, Signal et Automatique de Lille)
Marc'Aurelio Ranzato (Facebook AI Research)
Ludovic Denoyer (Sorbonne University)
2017-2019 Publications
2019
- G. Lample : “Unsupervised Machine Translation”, thesis, PhD defence 10/17/2019, supervision Ludovic Denoyer (2019)
2018
- A. Conneau, G. Kruszewski, G. Lample, L. Barrault, M. Baroni : “What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties”, ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, vol. 1, Melbourne, Australia, pp. 2126–2136, (Association for Computational Linguistics) (2018)
2017
- G. Lample, N. Zeghidour, N. Usunier, A. Bordes, L. Denoyer, M. Ranzato : “Fader Networks: Generating Image Variations by Sliding Attribute Values”, 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, United States, pp. 5969-5978 (2017)