Nowadays, image processing is ubiquitous, and can be found in fields such as medical imaging, autonomous driving, or space exploration. Although new algorithms provide more and more accurate results, they can be resource-intensive (execution time, power consumption, memory footprint). In this case, these algorithms are ill suited for real-time image processing, especially on embedded platforms. This is even worse when their execution time depends on image characteristics, and we refer to these as «~irregular~» algorithms.
One goal of this thesis is the design of new efficient (faster and more resistant to image characteristics) irregular algorithms using an algorithm-hardware-system approach, and develop expertise on this type of algorithm.
We focus on three types of irregular problems: Split & Merge segmentation, Connected Component Labeling and Analysis, and hysteresis thresholding.
The investigation of each problem is not limited to execution time but also relies on a study of hardware performance metrics. These metrics help us determine whether the algorithms are effectively exploiting system and hardware mechanisms. Using this fine-grained analysis, we transform step-by-step, the execution flow and data structures of each algorithm. In addition to this, we also exploit multiple levels of parallelism that can be found on modern hardware. We particularly focus on data-level parallelism, which can be leveraged using SIMD instructions.
We propose faster algorithms, which are also more resistant to image characteristics: a Split & Merge algorithm whose execution time is nearly constant with regard to segmentation criteria; an efficient Connected Component Labeling algorithm for 3D images; an efficient hysteresis thresholding algorithm for CPU.
To ensure the portability of our contributions with respect to the hardware, we conduct our performance evaluation on a wide array of hardware (server, desktops, and embedded systems).