CISSE Mouhamadou Moustapha
Supervision : Thierry ARTIÈRES
Co-supervision : Patrick GALLINARI
Efficient Extreme Classification
Humans naturally and instantly recognize relevant objects in images despite the large number of potential visual concepts. They can also tell at once which topics are relevant to a given text document, even though these topics are chosen among thousands of semantic concepts. This ability to categorize information quickly is an important aspect of high-level intelligence, and endowing machines with it is an important step towards artificial intelligence.
In this thesis, we propose new methods to tackle classification problems with a large number of labels, also called extreme classification. The proposed approaches aim to reduce inference complexity compared with classical methods (such as one-versus-rest), so that learning machines become usable in real-life scenarios. We propose two types of methods, designed for single-label and multilabel classification respectively.
The first method uses existing hierarchical information among the categories to learn low-dimensional binary representations of the categories. The second type of approach, dedicated to multilabel problems, adapts the framework of Bloom filters to represent subsets of labels with sparse, low-dimensional binary vectors. For both methods, binary classifiers are learned to predict the new low-dimensional representation of the categories, and several algorithms are proposed to recover the set of relevant labels. Large-scale experiments validate the methods.
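To make the Bloom filter idea concrete, here is a minimal Python sketch of a standard Bloom filter applied to label sets. It is an illustration under stated assumptions, not the algorithm developed in the thesis: the salted SHA-256 hashing, the helper names (`label_bits`, `encode`, `decode`) and the sizes used in the example are all hypothetical choices. In the setting described above, the bits of the code would be predicted by learned binary classifiers; here the exact code is decoded directly.

```python
import hashlib


def label_bits(label, num_bits, num_hashes):
    """Bit positions assigned to one label by num_hashes salted hash functions.

    Hypothetical hashing scheme: SHA-256 of "salt:label", reduced modulo num_bits.
    """
    positions = set()
    for salt in range(num_hashes):
        digest = hashlib.sha256(f"{salt}:{label}".encode()).digest()
        positions.add(int.from_bytes(digest[:8], "big") % num_bits)
    return positions


def encode(labels, num_bits, num_hashes):
    """Encode a set of labels as a binary vector: the OR of the label codes."""
    code = [0] * num_bits
    for label in labels:
        for pos in label_bits(label, num_bits, num_hashes):
            code[pos] = 1
    return code


def decode(code, num_labels, num_bits, num_hashes):
    """Return every label whose assigned bits are all set in the code.

    True labels are always recovered from an exact code; extra labels
    (false positives) may appear when hash positions collide.
    """
    return {
        label
        for label in range(num_labels)
        if all(code[pos] for pos in label_bits(label, num_bits, num_hashes))
    }


if __name__ == "__main__":
    # Illustrative sizes only: 1000 labels represented with 64 bits.
    NUM_LABELS, NUM_BITS, NUM_HASHES = 1000, 64, 3
    relevant = {3, 42, 977}
    code = encode(relevant, NUM_BITS, NUM_HASHES)
    recovered = decode(code, NUM_LABELS, NUM_BITS, NUM_HASHES)
    print(sum(code), "bits set;", len(recovered), "labels recovered")
```

With this encoding, only `NUM_BITS` binary predictions are needed at inference time instead of one per label, at the cost of possible false positives during decoding.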
Defence : 07/25/2014
Jury members :
Eric Gaussier, LIG (Grenoble-France) [Rapporteur]
Georges Paliouras, Demokritos (Athens-Greece) [Rapporteur]
Christophe Marsala, LIP6 (Paris-France)
Nicolas Usunier, UTC/CNRS (Compiègne-France)
Thierry Artières, LIP6 (Paris-France)
Patrick Gallinari, LIP6 (Paris-France)
2011-2014 Publications
2014
- M. Cisse : “Efficient Extreme Classification”, PhD thesis, defence 07/25/2014, supervision : Thierry Artières, co-supervision : Patrick Gallinari (2014)
2013
- M. Cisse, N. Usunier, Th. Artières, P. Gallinari : “Robust Bloom Filters for Large MultiLabel Classification Tasks”, Advances in Neural Information Processing Systems 26, Lake Tahoe, United States, pp. 1851-1859 (2013)
2012
- M. Cisse, Th. Artières, P. Gallinari : “Learning compact class codes for fast inference in large multi class classification”, European Conference on Machine Learning, vol. 7523, Lecture Notes in Computer Science, Bristol, United Kingdom, pp. 506-520 (Springer) (2012)
2011
- M. Cisse, Th. Artières, P. Gallinari : “Learning efficient error correcting output codes for large hierarchical multi-class problems”, Workshop on Large Scale Hierarchical Classification (at ECML), Athens, Greece, pp. 37-48 (2011)