PAROLINI Francesco

PhD graduated
Team : APR
Departure date : 06/30/2024
https://lip6.fr/Francesco.Parolini

Supervision : Antoine MINÉ

Static Analysis for Security Properties of Software by Abstract Interpretation

This thesis aims at developing and implementing formal techniques that can prove the absence of security-related vulnerabilities in software systems. We focus our attention on two notable cases: Regular Expression Denial of Service attacks (ReDoS), and exploitable runtime errors. For each case, we first study the theoretical framework to precisely characterize the vulnerability that we are considering. Then, we develop sound, automatic analyses, which can prove the absence of the vulnerabilities without relying on user interaction or annotations. We pair our theoretical results with practical implementations, which we consistently test on real-world examples.
Modern programming languages often provide functions to manipulate regular expressions in standard libraries. If they offer support for advanced features, the matching algorithm has an exponential worst-case time complexity: for some so-called vulnerable regular expressions, an attacker can craft ad hoc strings to force the matcher to exhibit the exponential behaviour and perform a ReDoS attack. In the first part of this thesis, we put forward a novel tree semantics for the regular expression matching procedure that precisely characterizes the behaviour of real-world matching engines. By leveraging such a characterization, we formally define ReDoS vulnerabilities in terms of the size of the matching trees. Then, we propose a sound analysis which extracts an overapproximation of the set of words that can make the matching engine exhibit the exponential behaviour. We implemented our analysis in a tool called RAT, and by comparing it to seven other detectors on a large set of 74,669 regular expressions, we found that RAT is the only detector that does not raise false negatives. Furthermore, RAT is faster, often by orders of magnitude, than most other tools.
Runtime errors that can be triggered by an attacker are sensibly more dangerous than others, as they not only result in program failure, but can also be exploited and lead to security breaches. In the second part of this thesis, we focus our attention on developing a technique able to rule out the existence of runtime errors that can be triggered by an attacker. First, we introduce a novel property called safety-nonexploitability, which precisely characterizes the set of programs whose correctness cannot be altered by an external user. Then, we give an alternative characterization of safety-nonexploitability in terms of tainted (i.e., user-controlled) variables. By relying on this characterization, we propose an analysis by abstract interpretation which combines a semantic taint analysis with a numeric analysis. The numeric invariants inferred by the numeric domain enhance the precision of the taint analysis. To assess the usefulness of our technique, we implemented our analysis for a large subset of C. We compared the regular analyzer with the modified nonexploitability version on a large set of 77 real-world programs taken from the Coreutils package, to which we added 13,261 test cases taken from the Juliet test suite. In our experiments, we found that our framework can consistently prove that more than 70% of the alarms raised by the regular analyzer are not exploitable.

Defence : 06/26/2024

Jury members :

JENSEN Thomas (IRISA, France)[Rapporteur]
MASTROENI Isabella (Università di Verona, Italie) [Rapporteur]
BARDIN Sébastien (CEA List, France)
GANTY Pierre (IMDEA Software Institute, Espagne)
GENITRINI Antoine (Sorbonne Université, France)
MINÉ Antoine (Sorbonne Université, France)
SZNAJDER Nathalie (Sorbonne Université, France)

Departure date : 06/30/2024

2022-2024 Publications