SCIALOM Thomas

PhD student at Sorbonne Université
Team: MLIA
https://lip6.fr/Thomas.Scialom

Thesis supervisor: Patrick GALLINARI

Co-supervision: PIWOWARSKI Benjamin, LAMPRIER Sylvain

Natural Language Generation with Reinforcement Learning

Natural Language Generation (NLG) is the subfield of Natural Language Processing whose task is to produce natural language outputs. Despite the important progress fostered by the application of Deep Learning, generated texts are still inconsistent and contain factual errors. We argue in this thesis that the root cause lies in inherent flaws in the algorithms used by deep learning models in NLG, which limit their efficiency. The standard training strategy, Teacher Forcing, induces the so-called exposure bias: a mismatch between training and inference time, where errors accumulate. Moreover, NLG suffers from a second flaw: its automatic evaluation does not reflect human judgement well.
In this thesis, we explore how to improve both evaluation and training in NLG toward more reliable systems. In particular, we propose a Question Answering based metric. We show how this metric can be used as a reward in a Reinforcement Learning setup to improve NLG models. Toward this objective, we also explore learned rewards, namely discriminators, and introduce several new algorithms that benefit NLG at training and decoding time. In particular, we propose to combine Monte Carlo Tree Search with Generative Adversarial Networks, resulting in state-of-the-art models.

Defence: 06/07/2022

Jury members:

Sara Tonelli, head of the Digital Humanities research group at FBK [Reviewer]
Benoît Favre, full professor at Polytech Marseille [Reviewer]
Catherine Pelachaud, Director of Research CNRS at ISIR, UPMC
Marc Aurelio Ranzato, Research Scientist at Google DeepMind
Oriol Vinyals, Principal Scientist at Google DeepMind
Jacopo Staiano (corporate supervisor), Head of research at reciTAL
Benjamin Piwowarski (academic supervisor), Researcher at CNRS
Sylvain Lamprier (academic supervisor), Associate Professor, ISIR

Departure date: 31/12/2021

Publications 2019-2022