MINETTO Rodrigo
Supervision : Matthieu CORD
Co-supervision : Neucimar J. LEITE
Text Recognition and 2D/3D Object Tracking
In this thesis we address three computer vision problems: (1) the detection and recognition of flat text objects in images of real scenes; (2) the tracking of such text objects in a digital video; and (3) the tracking of an arbitrary three-dimensional rigid object with known markings in a digital video. For each problem we developed innovative algorithms that are at least as accurate and robust as other state-of-the-art algorithms. Specifically, for text recognition we developed (and extensively evaluated) a new HOG-based descriptor specialized for Roman script, which we call T-HOG, and showed its value as a post-filter for an existing text detector (SnooperText). We also improved the SnooperText algorithm by using a multi-scale technique to handle widely different letter sizes while limiting the sensitivity of the algorithm to various artifacts. For text tracking, we described four basic ways of combining a text detector and a text tracker, and we developed a specific tracker based on a particle filter that exploits the T-HOG recognizer. For rigid object tracking we developed a new accurate and robust algorithm (AffTrack) that combines the KLT feature tracker with an improved camera calibration procedure. We extensively tested our algorithms on several well-known benchmarks from the literature. We also created publicly available benchmarks for the evaluation of text detection and tracking algorithms and of rigid object tracking algorithms.
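As a rough illustration of the building block behind HOG-style descriptors such as the one mentioned above, the sketch below computes a single normalized histogram of gradient orientations for a small grayscale image. It is a generic, minimal example and not the T-HOG descriptor itself: the function name `orientation_histogram`, the choice of 8 bins, the central-difference gradient, and the L2 normalization are all assumptions made for illustration.

```python
import math


def orientation_histogram(image, bins=8):
    """Return an L2-normalized histogram of unsigned gradient orientations.

    `image` is a list of equal-length rows of grayscale values.
    Hypothetical helper for illustration; not taken from the thesis.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat pixels carry no orientation information
            # Unsigned orientation in [0, pi): opposite directions share a bin.
            angle = math.atan2(gy, gx) % math.pi
            index = min(int(angle / math.pi * bins), bins - 1)
            # Each pixel votes with its gradient magnitude.
            hist[index] += magnitude
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# A vertical edge (dark left, bright right) produces horizontal gradients.
vertical_edge = [[0, 0, 10, 10] for _ in range(4)]
# A horizontal edge (dark top, bright bottom) produces vertical gradients.
horizontal_edge = [[0] * 4, [0] * 4, [10] * 4, [10] * 4]

vertical_hist = orientation_histogram(vertical_edge)
horizontal_hist = orientation_histogram(horizontal_edge)
```

In this toy case the two edge images put all of their gradient energy into different orientation bins, which is the kind of cue a gradient-based descriptor can use to separate stroke-like patterns from other image content. A full descriptor would typically compute such histograms over several sub-regions and concatenate them.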
Defence : 03/19/2012
Jury members :
Patrick PÉREZ - Senior Researcher at Technicolor Research Innovation [Reviewer]
Arnaldo de A. ARAÚJO - Professor at the Universidade Federal de Minas Gerais (UFMG) [Reviewer]
Nicolas THOME - Associate Professor (Maître de Conférences) at the Université Pierre et Marie Curie (Paris 6)
Neucimar J. LEITE - Professor at the Universidade Estadual de Campinas (UNICAMP)
Matthieu CORD - Professor at the Université Pierre et Marie Curie (Paris 6)
Jorge STOLFI - Professor at the Universidade Estadual de Campinas (UNICAMP)
Hélio PEDRINI - Professor at the Universidade Estadual de Campinas (UNICAMP)
Marcin DETYNIECKI - CNRS Research Scientist (CR) at the Université Pierre et Marie Curie (Paris 6)
2010-2014 Publications
2014
- R. Minetto, N. Thome, M. Cord, Neucimar J. Leite, J. Stolfi : “SnooperText: A Text Detection System for Automatic Indexing of Urban Scenes”, Computer Vision and Image Understanding, vol. 122, pp. 92-104, (Elsevier) (2014)
2013
- R. Minetto, N. Thome, M. Cord, Neucimar J. Leite, J. Stolfi : “T-HOG: an Effective Gradient-Based Descriptor for Single Line Text Regions”, Pattern Recognition, vol. 46 (3), pp. 1078-1090, (Elsevier) (2013)
2012
- R. Minetto : “Reconnaissance de Zones de Texte et Suivi d’Objets dans les Images et les Vidéos”, thesis, PhD defence 03/19/2012, supervision: Matthieu Cord, co-supervision: Neucimar J. Leite (2012)
2011
- R. Minetto, N. Thome, M. Cord, J. Stolfi, F. Precioso, J. Guyomard, Neucimar J. Leite : “Text Detection and Recognition in Urban Scenes”, International Conference on Computer Vision (ICCV): Workshop on Computer Vision for Remote Sensing of the Environment, Barcelona, Spain, pp. 227-234, (IEEE) (2011)
- R. Minetto, N. Thome, M. Cord, Neucimar J. Leite, J. Stolfi : “SnooperTrack: Text Detection and Tracking for Outdoor Videos”, IEEE International Conference on Image Processing (ICIP), Brussels, Belgium, pp. 505-508, (IEEE) (2011)
2010
- R. Minetto, N. Thome, M. Cord, J. Fabrizio, B. Marcotegui : “Snoopertext: A multiresolution system for text detection in complex visual scenes”, ICIP 2010 - 17th IEEE International Conference on Image Processing, Hong Kong, pp. 3861-3864, (IEEE) (2010)