DREBES Andi
Supervision : Nathalie DRACH-TEMAM
Co-supervision : Karine HEYDEMANN
Dynamic optimization of data-flow task-parallel applications for large-scale NUMA systems
Over the last decade, microprocessor development reached a point at which higher clock rates and more complex micro-architectures became less energy-efficient, pushing power consumption and energy density beyond reasonable limits. As a consequence, the industry has shifted to more energy-efficient multi-core designs that integrate multiple processing units (cores) on a single chip. The number of cores is expected to grow exponentially, and future systems are expected to integrate thousands of processing units. To provide sufficient memory bandwidth in these systems, main memory is physically distributed over multiple memory controllers, which results in non-uniform memory access (NUMA).
Past research has identified programming models based on fine-grained, dependent tasks as a key technique to unleash the parallel processing power of massively parallel general-purpose computing architectures. However, the execution of task-parallel programs on architectures with non-uniform memory access and the dynamic optimizations needed to mitigate NUMA effects have received little attention. In this thesis, we explore the main factors affecting the performance and data locality of task-parallel programs and propose a set of transparent, portable and fully automatic on-line mechanisms that map tasks to cores and data to memory controllers in order to improve data locality and performance. Placement decisions are based on information about point-to-point data dependences, readily available in the run-time systems of modern task-parallel programming frameworks. These techniques are evaluated experimentally using our implementation in the run-time of the OpenStream language and a set of high-performance scientific benchmarks. Finally, we designed and implemented Aftermath, a tool for performance analysis and debugging of task-parallel applications and run-times.
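As a rough illustration of how placement decisions can be derived from point-to-point data dependences, the sketch below picks, for a task, the NUMA node that already holds most of its input data. It is a simplified assumption made for illustration, not the heuristic implemented in the OpenStream run-time; the function name `pick_node` and the `(node, n_bytes)` representation of a dependence are hypothetical.

```python
from collections import defaultdict


def pick_node(input_deps, default_node=0):
    """Return the NUMA node holding the most input bytes of a task.

    input_deps is a list of (node, n_bytes) pairs, one per incoming data
    dependence (hypothetical representation, chosen for this sketch).
    """
    bytes_per_node = defaultdict(int)
    for node, n_bytes in input_deps:
        bytes_per_node[node] += n_bytes

    # A task without inputs has no locality preference.
    if not bytes_per_node:
        return default_node

    # Prefer the node with the most input bytes; break ties by lowest node id.
    return min(bytes_per_node, key=lambda n: (-bytes_per_node[n], n))


if __name__ == "__main__":
    # Three producers wrote data on nodes 0, 1 and 2; node 1 holds the most.
    print(pick_node([(0, 64), (1, 128), (1, 32), (2, 100)]))
```

Running the task on the selected node keeps most of its reads local; the same dependence information could in principle guide on which node the task's output buffers are allocated.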
Defence : 06/25/2015
Jury members :
M. Jean-François MÉHAUT, Professeur, Université Joseph Fourier / CEA, [Rapporteur]
M. Nacho NAVARRO, Associate Professor, Universitat Politècnica de Catalunya / Barcelona Supercomputing Center, [Rapporteur]
M. Albert COHEN, Directeur de Recherche, INRIA
M. Benoît DUPONT DE DINECHIN, CTO Kalray S.A.
Mme. Nathalie DRACH-TÉMAM, Professeur, Université Pierre et Marie Curie
Mme. Karine HEYDEMANN, Maître de Conférences, Université Pierre et Marie Curie
M. Raymond NAMYST, Professeur, Université de Bordeaux
M. Antoniu POP, Lecturer, The University of Manchester
M. Pierre SENS, Professeur, Université Pierre et Marie Curie
M. Marc SHAPIRO, Directeur de Recherche, INRIA / LIP6
2014-2016 Publications
2016
- A. Drebes, J.‑B. Bréjon, A. Pop, K. Heydemann, A. Cohen : “Language-Centric Performance Analysis of OpenMP Programs with Aftermath”, IWOMP 2016 - 12th International Workshop on OpenMP, vol. 9903, Lecture Notes in Computer Science, Nara, Japan, pp. 237-250, (Springer) (2016)
- A. Drebes, A. Pop, K. Heydemann, A. Cohen, N. Drach : “Scalable Task Parallelism for NUMA: A Uniform Abstraction for Coordinated Scheduling and Memory Management”, PACT'16 - ACM/IEEE Conference on Parallel Architectures and Compilation Techniques, Haifa, Israel, pp. 125-137 (2016)
- A. Drebes, A. Pop, K. Heydemann, N. Drach, A. Cohen : “NUMA-aware scheduling and memory allocation for data-flow task-parallel applications”, ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Barcelona, Spain, pp. 44:1-44:2, (ACM New York, NY, USA) (2016)
- A. Drebes, A. Pop, K. Heydemann, A. Cohen : “Interactive visualization of cross-layer performance anomalies in dynamic task-parallel applications and systems”, IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), Uppsala, Sweden, pp. 274-283 (2016)
2015
- A. Drebes : “Parallélisation adaptative pour les applications embarquées haute-performance”, thesis, PhD defence 06/25/2015, supervision : Drach-Temam, Nathalie, co-supervision : Heydemann, Karine (2015)
- A. Drebes, K. Heydemann, A. Pop, A. Cohen, N. Drach : “Automatic Detection of Performance Anomalies in Task-Parallel Programs”, 1st Workshop on Resource Awareness and Adaptivity in Multi-Core Computing (Racing 2014), Paderborn, Germany (2015)
2014
- A. Drebes, K. Heydemann, N. Drach, A. Pop, A. Cohen : “Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages”, ACM Transactions on Architecture and Code Optimization, vol. 11 (3), pp. 30, (Association for Computing Machinery) (2014)
- A. Drebes, K. Heydemann, N. Drach, A. Pop, A. Cohen : “Aftermath: Performance analysis of task-parallel applications on many-core NUMA systems”, Tenth International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems, Fiuggi, Italy (2014)
- A. Drebes, A. Pop, K. Heydemann, A. Cohen, N. Drach : “Aftermath: A graphical tool for performance analysis and debugging of fine-grained task-parallel programs and run-time systems”, Seventh Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG-2014), Vienna, Austria (2014)