SARR Idrissa
Supervision : Anne DOUCET
Co-supervision : NAACKE Hubert
Transaction Routing in a Large-Scale Database
Database replication has been widely studied for decades. Replication increases data availability and improves data access performance. A major challenge of database replication is replica control when data are updated concurrently. Some replication solutions target medium-scale systems (e.g. PC clusters) with intrinsic reliability and are not suitable for large-scale systems. Moreover, because of the well-known tension between consistency, which requires synchronization between replicas, and scalability, which requires asynchronous mechanisms, the approach taken in this thesis is to relax consistency in order to improve data access performance. Furthermore, many Web 2.0 applications do not always need to access the most up-to-date data. This opens the way to new solutions that offer better performance in terms of throughput, latency and availability.
In this PhD thesis, we consider transactional applications in which data are widely distributed over a dynamic infrastructure such as a P2P system. We aim to improve performance by monitoring data consistency, load balancing, and availability. We propose a middleware solution that makes resources and their temporary unavailability completely transparent. Our solution preserves application and database autonomy; both applications and the DBMS therefore remain unchanged. Applications specify their consistency requirements, and the middleware enforces them by controlling transaction routing and resource status.
We define two protocols for maintaining global consistency, both based on knowledge of the data accessed by transactions. The first protocol ensures global consistency in a pessimistic way: each transaction is associated with its conflict classes, which are used to order transactions according to their arrival time in the system. The second protocol processes transactions optimistically: actual conflicts are checked later, at commit time. This allows more parallelism and opens new opportunities for load balancing. Moreover, we propose a large-scale distributed directory for metadata, which makes it possible to guarantee data consistency. All our solutions take fail-stop failures into account through a failure management mechanism well suited to a large-scale system.
Finally, we implemented our solutions in order to validate them experimentally. Performance tests show that metadata management is effective and improves transactional throughput. We also show that the redundancy of the middleware reduces the response time in case of failures.
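The difference between the two protocols can be illustrated with a minimal Python sketch. It is not the thesis implementation: the names `Transaction`, `pessimistic_schedule` and `OptimisticStore` are hypothetical, conflict classes are modelled as plain sets of item names, and the optimistic check is a simple version comparison chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical model: a transaction declares its conflict classes up front.
    name: str
    arrival: int                 # arrival order in the system
    conflict_classes: frozenset  # items the transaction may access


def pessimistic_schedule(transactions):
    """Order transactions by arrival and group them into waves.

    Two transactions that share a conflict class never sit in the same
    wave, and the one that arrived first is always in an earlier wave.
    """
    waves = []
    for txn in sorted(transactions, key=lambda t: t.arrival):
        level = 0
        for i, wave in enumerate(waves):
            if any(txn.conflict_classes & other.conflict_classes for other in wave):
                level = i + 1  # must run after the last conflicting wave
        if level == len(waves):
            waves.append([])
        waves[level].append(txn)
    return waves


class OptimisticStore:
    """Assumed version-based validation: conflicts are detected at commit."""

    def __init__(self):
        self.versions = {}  # item -> last committed version number

    def begin(self, items):
        # Remember the versions seen when the transaction starts.
        return {item: self.versions.get(item, 0) for item in items}

    def try_commit(self, snapshot, writes):
        # Abort if any item read has been committed by someone else since begin.
        if any(self.versions.get(item, 0) != v for item, v in snapshot.items()):
            return False
        for item in writes:
            self.versions[item] = self.versions.get(item, 0) + 1
        return True
```

In this sketch the pessimistic scheduler serializes every pair of potentially conflicting transactions in advance, whereas the optimistic store lets them run concurrently and aborts only the one that actually collides at commit time.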
Defence : 10/07/2010
Jury members :
Mme. Esther Pacitti, Professeur, Université Montpellier 2, LIRMM [Rapporteur]
M. Rachid Guerraoui, Professeur, EPFL CH 1015 Lausanne, Switzerland [Rapporteur]
M. Gabriel Antoniu, Chargé de recherche (HDR), INRIA Rennes
M. Samba Ndiaye, Maître Assistant, UCAD Direction Informatique, Dakar Sénégal
M. Stéphane Gançarski, Maître de Conférences (HDR), UPMC
M. Hubert Naacke, Maître de Conférences, UPMC
Mme. Anne Doucet, Professeur, UPMC
M. Pierre Sens, Professeur, UPMC
2008-2022 Publications
2022
- I. Gueye, H. Naacke, I. Sarr, L. Bouzid Khiri, S. Gançarski : “Malaria Control: Epidemic Progression Calculation Based on Individual Mobility Data”, chapter in Simulation and Modeling Methodologies, Technologies and Applications. Revised Selected Papers., vol. 306, Lecture Notes in Networks and Systems, pp. 156-183, (Springer International Publishing), (ISBN: 978-3-030-84811-8) (2022)
2020
- L. Khiri, I. Gueye, H. Naacke, I. Sarr, S. Gançarski : “A Malaria Control Model using Mobility Data: An Early Explanation of Kedougou Case in Senegal”, 10th International Conference on Simulation and Modeling Methodologies, Technologies and Applications, on-line, France, pp. 35-46, (SciTePress) (2020)
2015
- N. Bame, H. Naacke, I. Sarr, S. Ndiaye : “Optimisation de requêtes dynamiques pour l’analyse de la biodiversité”, Revue Africaine de Recherche en Informatique et Mathématiques Appliquées, vol. Volume 21 - 2015 - Special issue - CARI 2014, pp. 21-47, (African Society in Digital Science) (2015)
- I. Gueye, H. Naacke, I. Sarr : “Supporting Fluctuating Transactional Workload”, 26th International Conference on Database and Expert Systems Applications (DEXA), vol. 9262, Lecture Notes in Computer Science, Valencia, Spain, pp. 295-303, (Springer) (2015)
- I. Sarr, H. Naacke, N. Bame, I. Gueye, S. Ndiaye : “Green and Distributed Architecture for Managing Big Data of Biodiversity”, chapter in Computing in Research and Development in Africa. Benefits, Trends, Challenges and Solutions., pp. 21-39, (Springer), (ISBN: 978-3-319-08239-4) (2015)
2014
- I. Gueye, I. Sarr, H. Naacke : “Gestion d’un workload transitoire via les graphes sociaux”, 12th African Conference on Research in Computer Science and Applied Mathematics, CARI'2014, Saint-Louis, Senegal, pp. 201-212 (2014)
- N. Bame, H. Naacke, I. Sarr, S. Ndiaye : “Algorithmes de traitement de requêtes de biodiversité dans un environnement distribué”, Revue Africaine de Recherche en Informatique et Mathématiques Appliquées, vol. Volume 18, 2014, pp. 1-18, (African Society in Digital Science) (2014)
- I. Gueye, I. Sarr, H. Naacke : “Exploiting the social structure of online media to face transient heavy workload”, The Sixth International Conference on Advances in Databases, Knowledge, and Data Applications, DBKDA 2014, Chamonix, France, pp. 51-58 (2014)
2013
- I. Sarr, H. Naacke, Abderrahmane O. M. Moctar : “STRING: Social-Transaction Routing over a Ring”, International Conference on Database and Expert Systems Applications (DEXA), vol. 8056, Lecture Notes in Computer Science, Prague, Czechia, pp. 319-333, (Springer) (2013)
- N. Bame, H. Naacke, I. Sarr, S. Ndiaye : “Traitement décentralisé de requêtes de biodiversité”, Colloque National sur la Recherche en Informatique et ses Applications, CNRIA 2013, Ziguinchor, Senegal, pp. 8 (2013)
2012
- N. Bame, H. Naacke, I. Sarr, S. Ndiaye : “Architecture répartie à large échelle pour le traitement parallèle de requêtes de biodiversité”, 11th African Conference on Research in Computer Science and Applied Mathematics (CARI'12), Algiers, Algeria, pp. 143-150 (2012)
- I. Gueye, I. Sarr, H. Naacke : “TransElas : Elastic Transaction Monitoring for Web2.0 applications”, 5th International Conference on Data Management in Cloud, Grid and P2P Systems (GLOBE'12), vol. 7450, Lecture Notes in Computer Science, Vienna, Austria, pp. 1-12, (Springer) (2012)
2010
- I. Sarr : “Routage des Transactions dans une Base de Données à Large Echelle”, PhD thesis, defence 10/07/2010, supervision : Doucet, Anne, co-supervision : Naacke, Hubert (2010)
- I. Sarr, H. Naacke, S. Gançarski : “Failure-Tolerant Transaction Routing at Large Scale”, International Conference on Advances in Databases, Knowledge, and Data Applications (DBKDA), Menuires, France, pp. 165-172, (IEEE) (2010)
- I. Sarr, H. Naacke, S. Gançarski : “TransPeer: Adaptive Distributed Transaction Monitoring for Web2.0 applications”, ACM Symposium on Applied Computing: Track on Dependable and Adaptive Distributed Systems (SAC DADS), Sierre, Switzerland, pp. 423-430, (ACM) (2010)
- I. Sarr, H. Naacke, S. Gançarski : “Routage décentralisé de transactions avec gestion des pannes dans un réseau à large échelle”, Revue des Sciences et Technologies de l'Information - Série ISI : Ingénierie des Systèmes d'Information, vol. 15 (1), pp. 87-111, (Lavoisier) (2010)
2009
- M. Gueye, I. Sarr, S. Ndiaye : “Database Replication in Large Scale Systems: Optimizing the Number of Replicas”, EDBT09 International Workshop on Data Management in Peer-to-peer systems (DAMAP), Saint Petersburg, Russian Federation, pp. 3-9, (ACM) (2009)
2008
- I. Sarr, H. Naacke, S. Gançarski : “Routage Décentralisé de Transactions avec Gestion des Pannes dans un Réseau à Large Echelle”, Journées de Bases de Données Avancées (BDA), Guilherand Granges, France, pp. 1-20 (2008)
- I. Sarr, H. Naacke, S. Gançarski : “DTR: Distributed Transaction Routing in a Large Scale Network”, VECPAR International Workshop on High-Performance Data Management in Grid Environments (HPDGrid), vol. 5336, Lecture Notes in Computer Science, Toulouse, France, pp. 521-531, (Springer) (2008)