VASILAS Dimitrios
Supervision : Marc SHAPIRO
A flexible and decentralised approach to query processing for geo-distributed data systems
This thesis studies the design of query processing systems, across a diversity of geo-distributed settings. Optimising performance metrics such as response time, freshness, or operational cost involves design decisions, such as what derived state (e.g., indexes, materialised views, or caches) to maintain, and how to distribute and where to place the corresponding computation and state. These metrics are often in tension, and the trade-offs depend on the specific application and/or environment. This requires the ability to adapt the query engine's topology and architecture, and the placement of its components.
This thesis makes the following contributions:
- A flexible architecture for geo-distributed query engines, based on components connected in a bidirectional acyclic graph.
- A common microservice abstraction and API for these components, the Query Processing Unit (QPU). A QPU encapsulates some primitive query processing task. Multiple QPU types exist, which can be instantiated and composed into complex graphs.
- A model for constructing modular query engine architectures as a distributed topology of QPUs, enabling flexible design and trade-offs between performance metrics.
- Proteus, a QPU-based framework for constructing and deploying query engines.
- Representative deployments of Proteus and experimental evaluation thereof.
Defence : 02/19/2021
Jury members :
PREGUIÇA Nuno (Associate Professor/ Universidade Nova de Lisboa) [Rapporteur]
MONNET Sébastien (Professor/ Université Savoie Mont Blanc) [Rapporteur]
AMANN Bernd (Professor/ Sorbonne Université)
KEMME Bettina (Associate Professor/ McGill University)
KING Bradley (Co-founder & Field CTO/ Scality)
PALPANAS Themis (Professor/ Université de Paris)
SAEIDA ARDEKANI Masoud (Software Engineer/ Google)
SHAPIRO Marc (Distinguished Research Scholar (Emeritus)/ Sorbonne Université-Inria)
2018-2021 Publications
2021
- D. Vasilas : “A flexible and decentralised approach to query processing for geo-distributed data systems”, thesis, PhD defence 02/19/2021, supervision Shapiro, Marc (2021)
- R. Vaillant, D. Vasilas, M. Shapiro, Th. Nguyen : “CRDTs for truly concurrent file systems”, HotStorage '21 - 13th ACM Workshop on Hot Topics in Storage and File Systems, Virtual, France (2021)
2020
- D. Vasilas, M. Shapiro, B. King, S. Hamouda : “Towards application-specific query processing systems”, BDA 2020 - 36e Conférence sur la Gestion de Données – Principes, Technologies et Applications, Paris / Virtual, France (2020)
2018
- D. Vasilas, M. Shapiro, B. King : “A Modular Design for Geo-Distributed Querying: Work in Progress Report”, PaPoC 2018 - 5th Workshop on Principles and Practice of Consistency for Distributed Data, Porto, Portugal, pp. 1-8 (2018)