Staff directory

SIMON Etienne

PhD Student at Sorbonne University
Team: MLIA
http://esimon.eu

Supervision: Vincent GUIGUE
Co-supervision: Benjamin PIWOWARSKI

Deep Learning for Natural Language Understanding

Capturing the interrelations between concepts is fundamental to natural language understanding. It constitutes a bridge between two historically separate approaches to artificial intelligence: the use of symbolic and distributed representations. However, tackling this problem without human supervision poses several issues, and unsupervised models have difficulty matching the expressive breakthroughs of supervised ones. This thesis addresses two supervision gaps we identified: the problem of regularizing sentence-level discriminative models and the problem of leveraging relational information from dataset-level structures. The first gap arises from the increased use of discriminative approaches, such as deep neural network classifiers, in the supervised setting. These models tend to collapse without supervision. To overcome this limitation, we introduce two relation distribution losses that constrain the relation classifier into a trainable state. The second gap arises from the development of dataset-level (aggregate) approaches. We show that unsupervised models can leverage a large amount of additional information from the structure of the dataset, even more so than supervised models. We close this gap by adapting existing unsupervised methods to capture topological information using graph convolutional networks. Furthermore, we show that we can exploit the mutual information between topological (dataset-level) and linguistic (sentence-level) information to design a new training paradigm for unsupervised relation extraction.

PhD defence: 07/05/2022

Jury members:

Alexandre Allauzen, Full Professor, Université Paris-Dauphine PSL, ESPCI [reviewer]
Benoît Favre, Associate Professor, Aix-Marseille Université [reviewer]
Pascale Sébillot, Full Professor, IRISA, INSA Rennes
Xavier Tannier, Full Professor, Sorbonne Université
Benjamin Piwowarski, Research Scientist, CNRS, Sorbonne Université
Vincent Guigue, Associate Professor, Sorbonne Université

Departure date: 12/31/2021

2019-2022 Publications


2022
- É. Simon: “Apprentissage de réseaux profonds pour l’indexation conceptuelle de texte”, thesis, PhD defence 07/05/2022, supervision: Vincent Guigue, co-supervision: Benjamin Piwowarski (2022)
2019
- É. Simon, V. Guigue, B. Piwowarski: “Unsupervised Information Extraction: Regularizing Discriminative Approaches with Relation Distribution Losses”, Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp. 1378-1387 (Association for Computational Linguistics) (2019)
- É. Simon, V. Guigue, B. Piwowarski: “Extraction d’information non supervisée avec des modèles discriminants”, CAp 2019 - 21e Conférence sur l'Apprentissage automatique, Toulouse, France (2019)