SARR Idrissa

PhD student at Sorbonne University
Team : BD
https://lip6.fr/Idrissa.Sarr

Supervision : DOUCET Anne

Co-supervision : NAACKE Hubert

Transaction Routing in a Large-Scale Database

Database replication has been widely studied for several decades. Replication is a way to increase data availability and to improve data access performance. A major challenge of database replication is replica control when data are concurrently updated. Some replication solutions target medium-scale systems (e.g. PC clusters) with intrinsic reliability, and are not suitable for large-scale systems. Moreover, because of the well-known tension between consistency, which requires synchronization between replicas, and scalability, which requires asynchronous mechanisms, the approach used in this thesis is to relax consistency in order to improve data access performance. Furthermore, many Web 2.0 applications do not always need to access the most up-to-date data. This leads to new solutions that offer better performance in terms of throughput, latency and availability. In this PhD thesis, we consider transactional applications in which data are widely distributed over a dynamic infrastructure such as P2P systems. We aim to improve performance by monitoring data consistency, load balancing, and availability. We propose a middleware solution in which resources and their temporary unavailability are completely transparent. Our solution preserves application and database autonomy: both applications and DBMS remain unchanged. Applications specify their consistency requirements, and the middleware guarantees them by controlling transaction routing and resource status. We define two protocols for maintaining global consistency based on the knowledge of data accessed by transactions. The first protocol ensures global consistency in a pessimistic way. Each transaction is associated with its conflict classes, which are used for ordering transactions based on their arrival time in the system. The second protocol processes transactions in an optimistic way: real conflicts are checked later, at commit time.
This allows for more parallelism and raises new opportunities for load balancing. Moreover, we propose a large-scale distributed directory for metadata, which makes it possible to guarantee data consistency. All our solutions take into account fail-stop failures through a failure management mechanism well suited to a large-scale system. Finally, we have implemented our solutions in order to validate them experimentally. Performance tests show that metadata management is effective and improves transactional throughput. We also show that the redundancy of the middleware reduces the response time in case of failures.
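
The difference between the two protocols can be illustrated with a minimal Python sketch. It is not the thesis implementation: the `Txn` structure, the function names, and the item identifiers are all hypothetical, and it assumes that conflict classes are declared before execution while the actual read and write sets are only known at commit time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Txn:
    """Hypothetical transaction descriptor, for illustration only."""
    txn_id: str
    arrival: int                        # logical arrival order in the system
    conflict_classes: frozenset         # data the transaction may touch (declared up front)
    read_set: frozenset = frozenset()   # items actually read (known at commit time)
    write_set: frozenset = frozenset()  # items actually written (known at commit time)


def pessimistic_precedence(txns):
    """Pessimistic ordering: a transaction must run after every earlier
    transaction that shares at least one conflict class with it."""
    ordered = sorted(txns, key=lambda t: t.arrival)
    deps = {}
    for i, t in enumerate(ordered):
        deps[t.txn_id] = [
            p.txn_id for p in ordered[:i]
            if p.conflict_classes & t.conflict_classes
        ]
    return deps


def optimistic_validate(txn, committed_concurrently):
    """Optimistic check at commit time: the transaction may commit only if
    no concurrently committed transaction wrote an item it read or wrote."""
    accessed = txn.read_set | txn.write_set
    return all(not (c.write_set & accessed) for c in committed_concurrently)


# T1 and T2 share the "accounts" conflict class but touch different items.
t1 = Txn("T1", 1, frozenset({"accounts"}), write_set=frozenset({"accounts:1"}))
t2 = Txn("T2", 2, frozenset({"accounts", "orders"}), write_set=frozenset({"accounts:2"}))
t3 = Txn("T3", 3, frozenset({"stock"}), write_set=frozenset({"stock:7"}))
t4 = Txn("T4", 4, frozenset({"accounts"}), read_set=frozenset({"accounts:1"}))

deps = pessimistic_precedence([t2, t1, t3])
t2_can_commit = optimistic_validate(t2, [t1])  # no real conflict
t4_can_commit = optimistic_validate(t4, [t1])  # T4 read an item T1 wrote
```

In this sketch, the pessimistic variant orders T2 after T1 because they share a conflict class, whereas the optimistic variant lets both commit because they touch different items. This is the kind of extra parallelism the abstract refers to.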

Defence : 10/07/2010

Jury members :

Ms. Esther Pacitti, Professor, Université Montpellier 2, LIRMM [Reviewer]
Mr. Rachid Guerraoui, Professor, EPFL, CH-1015 Lausanne, Switzerland [Reviewer]
Mr. Gabriel Antoniu, Research Scientist (HDR), INRIA Rennes
Mr. Samba Ndiaye, Assistant Professor, UCAD Direction Informatique, Dakar, Senegal
Mr. Stéphane Gançarski, Associate Professor (HDR), UPMC
Mr. Hubert Naacke, Associate Professor, UPMC
Ms. Anne Doucet, Professor, UPMC
Mr. Pierre Sens, Professor, UPMC

Departure date : 09/30/2011

2008-2022 Publications