TRINH Anh Phuc

PhD student at Sorbonne University
Team : MALIRE
https://lip6.fr/Anh-Phuc.Trinh

Supervision : Patrick GALLINARI

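Classifieur probabiliste et Séparateur à Vaste Marge. Application à la classification de texte et à l'étiquetage d'image (Probabilistic Classifier and Large Margin Separator. Application to Text Classification and Image Labeling)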

This thesis proposes estimators of posterior probabilities for large margin classifiers (support vector machines, SVMs). It includes a theoretical part and an experimental part.

The first contribution is a probabilistic classifier based on SVMs for multi-class classification. We use the one-against-one approach, in which k(k - 1)/2 binary classifiers are trained for a problem with k classes. The binary outputs of these classifiers form voting features from which a class decision is computed. We introduce a new voting space that gives a richer representation of the classifier decisions and takes the relations between classes into account. We propose a method to learn, from this binary space, an estimate of the posterior probabilities of the classes.
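The one-against-one scheme described above can be sketched as follows. This is only a minimal illustration of the standard voting baseline, not the voting space or the probability estimator proposed in the thesis. The names `pairwise_votes`, `majority_class` and `nearest_centroid_decide`, and the toy one-dimensional centroids, are hypothetical and stand in for trained binary SVMs.

```python
from itertools import combinations

def pairwise_votes(x, classes, pairwise_decide):
    """Build the binary voting vector for x.

    One entry per class pair (i, j): 1 if the (i, j) classifier
    prefers class i, 0 if it prefers class j.
    """
    pairs = list(combinations(classes, 2))  # k(k - 1)/2 pairs
    votes = [1 if pairwise_decide(i, j, x) == i else 0 for i, j in pairs]
    return pairs, votes

def majority_class(pairs, votes):
    """Plain majority vote over the pairwise decisions (baseline only)."""
    counts = {}
    for (i, j), v in zip(pairs, votes):
        winner = i if v == 1 else j
        counts[winner] = counts.get(winner, 0) + 1
    return max(counts, key=counts.get)

# Hypothetical stand-in for trained binary SVMs: nearest 1-D centroid.
CENTROIDS = {"a": 0.0, "b": 5.0, "c": 10.0}

def nearest_centroid_decide(i, j, x):
    return i if abs(x - CENTROIDS[i]) <= abs(x - CENTROIDS[j]) else j

pairs, votes = pairwise_votes(4.0, ["a", "b", "c"], nearest_centroid_decide)
predicted = majority_class(pairs, votes)
```

With k = 3 classes the voting vector has 3 entries, one per pair. A learned estimator of posterior probabilities would take this binary vector as its input instead of simply counting votes.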

The second contribution concerns multi-label classification and the dependencies between labels. The prediction of structured outputs has been an extremely active area in recent years, and many models based on extensions of SVMs or on graphical models have been proposed. Many of these models have a complexity that prevents their application to real data. We introduce a multi-label classifier based on an undirected graphical model formalism, and we propose approximate learning and inference methods of limited complexity. These methods make use of the probabilistic binary classifiers developed earlier.
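To make the idea of low-complexity approximate inference over dependent labels concrete, here is a minimal sketch. It assumes a pairwise undirected model whose unary terms are probabilities from independent binary classifiers, and it uses iterated conditional modes (ICM) as the approximate inference step. ICM is a swapped-in, generic technique chosen for illustration; the abstract does not say which inference method the thesis uses. The function name, label names and weights are all hypothetical.

```python
import math

def icm_multilabel(unary, pairwise, n_sweeps=10):
    """Approximate MAP labeling with iterated conditional modes.

    unary:    dict label -> P(label = 1 | x) from a binary classifier.
    pairwise: dict (a, b) -> weight added to the score when labels
              a and b take the same value (assumed form of the edge term).
    Returns a dict label -> 0 or 1.
    """
    labels = sorted(unary)
    # Start from the independent decisions of the binary classifiers.
    y = {l: int(unary[l] >= 0.5) for l in labels}
    for _ in range(n_sweeps):
        changed = False
        for l in labels:
            scores = []
            for v in (0, 1):
                p = unary[l] if v == 1 else 1.0 - unary[l]
                s = math.log(max(p, 1e-12))  # unary log-score
                for (a, b), w in pairwise.items():
                    if a == l:
                        other = y[b]
                    elif b == l:
                        other = y[a]
                    else:
                        continue
                    if other == v:
                        s += w  # agreement bonus on this edge
                scores.append(s)
            best = int(scores[1] > scores[0])
            if best != y[l]:
                y[l] = best
                changed = True
        if not changed:
            break  # local optimum reached
    return y

# Hypothetical image labels: "sea" alone is below threshold,
# but its dependency on "beach" switches it on.
unary = {"beach": 0.9, "sea": 0.45, "indoor": 0.1}
with_edges = icm_multilabel(unary, {("beach", "sea"): 1.0})
without_edges = icm_multilabel(unary, {})
```

Each sweep costs time proportional to the number of labels times the number of edges, which illustrates why a local update scheme of this kind stays tractable where exact inference may not.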

The third contribution is the experimental validation of these ideas and algorithms. A first application tests our multi-class probabilistic classifiers on the DEFT challenge, a competition on the classification of French texts. The data we worked on concern the classification of journalistic corpora by genre and theme. The second application concerns the labeling of images using information about the dependencies between labels. It corresponds to a task proposed in the international competition ImageCLEF08. We propose a graphical model suited to this task, which allows us to validate the model on a multi-label problem.

Defence : 02/17/2012

Jury members :

Thierry Paquet, Professeur, Université de Rouen [Rapporteur]
Sylvie Thiria, Professeur, Université Versailles Saint Quentin en Yvelines [Rapporteur]
Patrick Gallinari, Professeur, Université Pierre et Marie Curie
Thierry Artières, Professeur, Université Pierre et Marie Curie




Departure date : 09/30/2012

2008-2012 Publications