LIP6 1998/016: Doctoral thesis, Université Paris 6
164 pages - April 1998
French document.
PostScript: 491 KB
Contact: by e-mail
Team: Apprentissage et Acquisition de Connaissances (Machine Learning and Knowledge Acquisition)
French title: Combinaison de Classifieurs Statistiques, Application à la Prédiction de la Structure Secondaire des Protéines
English title: Statistical Classifier Combination, Application to Protein Secondary Structure Prediction
Abstract: Model combination has recently driven significant improvements in statistical learning, for both regression and pattern recognition tasks. However, fundamental questions remain largely unaddressed: few criteria have been developed to motivate the choice of a specific method, and no independent result has been derived in the field of discrimination.
This dissertation deals with one of the most commonly used combination techniques: linear regression. We first characterize the regularizing effect of the "stacked regression" method introduced by Breiman. We then study the application of the multivariate linear regression model to the combination of discriminant experts whose outputs are estimates of the class posterior probabilities. This question is considered in turn from the standpoints of optimization and complexity control; the latter involves computing generalized Vapnik-Chervonenkis dimensions.
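To make the idea of linearly combining experts concrete, here is a minimal sketch, not the dissertation's method. It assumes a simplified model with a single weight per expert, shared across classes, fitted by ordinary least squares against one-hot class targets. The function names, the toy posterior values, and the three-class setup are illustrative assumptions. Weights are unconstrained here, so they may be negative and the combined scores need not sum to one.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix a is square and nonsingular.
    """
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_combination_weights(expert_outputs, labels, n_classes):
    """Least-squares weights, one per expert (hypothetical simplification).

    expert_outputs[k][i] is expert k's estimated posterior vector for
    example i; labels[i] is the true class index of example i.
    """
    n_experts = len(expert_outputs)
    gram = [[0.0] * n_experts for _ in range(n_experts)]
    rhs = [0.0] * n_experts
    for i, label in enumerate(labels):
        for c in range(n_classes):
            target = 1.0 if c == label else 0.0  # one-hot target
            for k in range(n_experts):
                pk = expert_outputs[k][i][c]
                rhs[k] += pk * target
                for l in range(n_experts):
                    gram[k][l] += pk * expert_outputs[l][i][c]
    # Normal equations of the least-squares problem.
    return solve_linear_system(gram, rhs)


def combine(weights, posteriors):
    """Weighted sum of the experts' posterior vectors for one example."""
    n_classes = len(posteriors[0])
    return [sum(w * p[c] for w, p in zip(weights, posteriors))
            for c in range(n_classes)]


def predict(weights, posteriors):
    """Index of the class with the highest combined score."""
    scores = combine(weights, posteriors)
    return max(range(len(scores)), key=scores.__getitem__)


# Toy data (made up for illustration): two experts, three classes.
expert_a = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8], [0.7, 0.2, 0.1]]
expert_b = [[0.5, 0.3, 0.2], [0.3, 0.4, 0.3], [0.3, 0.3, 0.4], [0.2, 0.6, 0.2]]
labels = [0, 1, 2, 0]

weights = fit_combination_weights([expert_a, expert_b], labels, n_classes=3)
print(weights)
print(predict(weights, [expert_a[3], expert_b[3]]))
```

In practice the weights would be fitted on predictions the experts made for held-out data, as in stacked regression, rather than on the data the experts were trained on.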
The study is followed by a description of a nonparametric method for estimating the Bayes error rate.
Our ensemble method is assessed on an open problem in biological sequence processing: predicting the secondary structure of globular proteins. To perform this discrimination task, we introduce a hierarchical and modular approach in which combination is used at an intermediate level.
Keywords: Ensemble methods, complexity control, VC dimension, discrimination, Bayes error rate estimation, hierarchical models, protein secondary structure prediction, stacked regression, hybrid systems