FAKERI TABRIZI Ali

PhD student at Sorbonne University
Team : MLIA
https://perso.lip6.fr/Ali.Fakeri-Tabrizi

Supervision : Patrick GALLINARI

Co-supervision : AMINI Massih-Reza

Semi-supervised Multi-view Learning: An Application to Image Annotation and Multi-lingual Document Classification

In this thesis, we introduce two multiview learning approaches. In the first approach, we describe a multiview self-training strategy which trains different voting classifiers on different views. The margin distributions over the unlabeled training data, obtained with each view-specific classifier, are then used to estimate an upper bound on the transductive Bayes error of that classifier. Minimizing this upper bound provides an automatic margin threshold which is used to assign pseudo-labels to unlabeled examples. Final class labels are then assigned to these examples by taking a vote over the pool of pseudo-labels. New view-specific classifiers are then trained using the original labeled data together with the pseudo-labeled training data. We consider applications to image-text classification and to multilingual document classification.
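As a rough illustration of the pseudo-labeling and voting step described above, the following minimal sketch assumes binary labels in {-1, +1} and takes the per-view margin thresholds as given inputs; in the thesis these thresholds are obtained by minimizing the bound on the transductive Bayes error, which is not reproduced here. The function name `pseudo_label`, its parameters, and the choice to skip tied votes are hypothetical.

```python
def pseudo_label(margins_per_view, thresholds):
    """Assign pseudo-labels to unlabeled examples by thresholded voting.

    margins_per_view: one list of signed margins per view (same length each).
    thresholds: one non-negative margin threshold per view (assumed given).
    Returns a dict mapping example index -> pseudo-label (+1 or -1).
    """
    n_examples = len(margins_per_view[0])
    labels = {}
    for i in range(n_examples):
        # A view casts a vote only when its margin reaches its threshold.
        votes = [
            1 if margins[i] > 0 else -1
            for margins, t in zip(margins_per_view, thresholds)
            if abs(margins[i]) >= t
        ]
        total = sum(votes)
        # Assumption: examples with no votes or a tied vote stay unlabeled.
        if total != 0:
            labels[i] = 1 if total > 0 else -1
    return labels


# Two views, four unlabeled examples (made-up margins for illustration).
margins = [[0.9, -0.1, 0.8, -0.7],
           [0.6, 0.2, -0.9, -0.8]]
result = pseudo_label(margins, thresholds=[0.5, 0.5])
```

The examples that receive a label would then be added to the labeled set before retraining the view-specific classifiers.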
In the second approach, we propose a multiview semi-supervised bipartite ranking model which leverages the information contained in unlabeled sets of images to improve prediction performance, using multiple descriptions, or views, of the images. For each topic class, our approach first learns as many view-specific rankers as there are available views, using the labeled data only. These rankers are then improved iteratively by adding pseudo-labeled pairs of examples for which all view-specific rankers agree on the relative ranking of the two examples.
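The pair-selection step can be sketched as follows, under the assumption that each view-specific ranker outputs a real-valued score per unlabeled example and that "agreement" means every ranker orders the pair the same way. The function name `agreed_pairs` and the exhaustive enumeration of pairs are illustrative choices, not the procedure used in the thesis.

```python
from itertools import combinations


def agreed_pairs(scores_per_view):
    """Return ordered pairs (hi, lo) on which all view-specific rankers agree.

    scores_per_view: one list of ranking scores per view (same length each).
    A pair is kept only if every view scores `hi` strictly above `lo`.
    """
    n_examples = len(scores_per_view[0])
    pairs = []
    for i, j in combinations(range(n_examples), 2):
        diffs = [scores[i] - scores[j] for scores in scores_per_view]
        if all(d > 0 for d in diffs):
            pairs.append((i, j))  # every view ranks i above j
        elif all(d < 0 for d in diffs):
            pairs.append((j, i))  # every view ranks j above i
        # Otherwise the views disagree (or tie) and the pair is discarded.
    return pairs


# Two views scoring three unlabeled images (made-up scores for illustration).
scores = [[0.9, 0.2, 0.5],
          [0.8, 0.1, 0.05]]
pairs = agreed_pairs(scores)
```

The retained pairs would serve as pseudo-labeled training pairs for the next round of ranker training.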
We report on experiments carried out on the NUS-WIDE dataset, which show that the multiview ranking process improves predictive performance when only a small number of labeled examples is available, especially for unbalanced topic classes. We also show that our approach achieves significant improvements over a state-of-the-art semi-supervised multiview classification model. We present experimental results on the NUS-WIDE collection and on Reuters RCV1-RCV2 which show that, despite its simplicity, our approach is competitive with other state-of-the-art techniques.

Defence : 09/30/2013

Jury members :

Mr. Glotin, Hervé. Univ. Sud Toulon Var [Rapporteur]
Mr. Quenot, Georges. Office B-109 Campus Scientifique [Rapporteur]
Mr. Artières, Thierry. LIP6, Université Pierre et Marie Curie
Mr. Gallinari, Patrick. LIP6, Université Pierre et Marie Curie
Mr. Amini, Massih-Reza. LIG/AMA

Departure date : 01/13/2014

2008-2015 Publications