VERGER Mélina
Team : MOCAH
Arrival date : 11/01/2021
- Sorbonne Université - LIP6
Boîte courrier 169
Couloir 26-00, Étage 5, Bureau 525
4 place Jussieu
75252 PARIS CEDEX 05
FRANCE
Tel: +33 1 44 27 84 23, Melina.Verger (at) lip6.fr
https://melinaverger.github.io/
Supervision : Vanda LUENGO
Co-supervision : François BOUCHET, Sébastien LALLÉ
Algorithmic fairness analyses of supervised machine learning in education
This thesis aims to evaluate and reduce the algorithmic unfairness of machine learning models widely used in education. Such predictive models, built on ever-growing amounts of educational data and learning traces, are intended to improve the human learning experience. They can be used, for example, to predict dropout or to personalize learning by adapting educational content to the needs of each learner. However, it has been shown repeatedly that these models can produce biased and discriminatory predictions, most often consistently worse predictions for Black people than for White people, and for women than for men. It has therefore become crucial to evaluate the fairness of predictive models' results with respect to the different groups present in the data.

State-of-the-art research has focused on comparing the predictive performance of models between groups. For example, for a binary classifier and male/female groups, the rate of correct predictions is computed for each group, and the difference between these rates indicates unfairness. Although this approach is predominant in the literature, it only captures unfairness in terms of predictive performance, whereas unfairness can manifest in other ways and in more nuanced forms than simple score differences; this calls for further exploration. The main objective of this thesis is thus to deepen the understanding and evaluation of algorithmic unfairness, and then to identify its potential presence in under-studied contexts, namely sensitive attributes and learner populations that have received little or no attention so far.

To this end, we designed a new algorithmic fairness metric, MADD for short, which is based on the distributions of the results of supervised learning models. This distribution-based approach additionally allows graphical analyses to better understand the unfairness quantified by MADD. We demonstrated, both theoretically and experimentally, the validity of this metric and found that potential unfairness observed in the data is not always reflected in the model outcomes, as was the case with gender bias in our experiments. Moreover, we developed a technique to mitigate unfairness using MADD, along with new methods to evaluate fairness with multiple sensitive attributes simultaneously. Indeed, the literature typically considers each attribute separately, whereas Crenshaw's (1989, 1991) theory of intersectionality argues that combined influences produce unique and distinct forms of discrimination for certain groups. Our experimental results show that some combinations of attributes increase, reduce, or maintain the level of unfairness initially observed.

Finally, we conducted fairness analyses for new sensitive attributes, whether demographic or related to the learning context, and with new learner populations from African countries, the Philippines, Haiti, and France, thanks to data collected from a MOOC (massive open online course) and the Canvas LMS platform. These experiments revealed unfairness that had not been discovered before, thus shedding light on potentially real unfairness in these contexts. To facilitate the replication of our work and the application of our methods in other contexts, we created an open-source Python library named maddlib. The data (except those from the Philippines) and our documented source code are also available online.
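The abstract above contrasts the usual performance-based comparison (e.g., an accuracy gap between two groups) with the distribution-based view taken by MADD. The following Python sketch, built on synthetic data and scikit-learn, illustrates both views; the distribution distance it computes is only an approximation in the spirit of MADD, not the metric's exact definition as implemented in maddlib, and all data and variable names are hypothetical.

# Illustrative sketch (synthetic data, not the exact MADD definition):
# compare a binary classifier across two groups, first by accuracy gap,
# then by the distance between the groups' predicted-probability distributions.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Hypothetical educational data: features, pass/fail labels, binary sensitive attribute.
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)
group = rng.integers(0, 2, size=1000)  # e.g., 0 = male, 1 = female

model = LogisticRegression().fit(X, y)

# Performance-based fairness: difference in accuracy between the two groups.
acc_gap = abs(
    accuracy_score(y[group == 0], model.predict(X[group == 0]))
    - accuracy_score(y[group == 1], model.predict(X[group == 1]))
)

# Distribution-based view: distance between the normalized histograms of
# predicted probabilities per group (the spirit of MADD, not its exact formula).
proba = model.predict_proba(X)[:, 1]
bins = np.linspace(0, 1, 21)
h0, _ = np.histogram(proba[group == 0], bins=bins)
h1, _ = np.histogram(proba[group == 1], bins=bins)
dist_gap = np.abs(h0 / h0.sum() - h1 / h1.sum()).sum()

print(f"accuracy gap: {acc_gap:.3f}, distribution distance: {dist_gap:.3f}")

A model can show a near-zero accuracy gap while still scoring the two groups from visibly different probability distributions, which is the kind of unfairness a distribution-based metric is designed to surface.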
2022-2024 Publications
2024
- V. Švábenský, M. Verger, M. M. T. Rodrigo, C. J. G. Monterozo, R. Baker, M. Saavedra, S. Lallé, A. Shimada : “Evaluating Algorithmic Bias in Models for Predicting Academic Performance of Filipino Students”, Proceedings of the 17th International Conference on Educational Data Mining (EDM 2024), Atlanta, GA, United States (2024)
- M. Verger, Ch. Fan, S. Lallé, F. Bouchet, V. Luengo : “A Comprehensive Study on Evaluating and Mitigating Algorithmic Unfairness with the MADD Metric”, Journal of Educational Data Mining, vol. 16 (1), pp. 365–409, (International Educational Data Mining Society) (2024)
- S. Lallé, F. Bouchet, M. Verger, V. Luengo : “Fairness of MOOC Completion Predictions Across Demographics and Contextual Variables”, Proceedings of the 25th International Conference on Artificial Intelligence in Education, vol. 14829, Lecture Notes in Computer Science, Recife, Brazil, pp. 379-393, (Springer Nature Switzerland) (2024)
2023
- M. Verger, Ch. Fan, S. Lallé, F. Bouchet, V. Luengo : “A Fair Post-Processing Method based on the MADD Metric for Predictive Student Models”, 1st International Tutorial and Workshop on Responsible Knowledge Discovery in Education (RKDE 2023) at ECML PKDD 2023, Turin, Italy (2023)
- M. Verger, S. Lallé, F. Bouchet, V. Luengo : “Is Your Model "MADD"? A Novel Metric to Evaluate Algorithmic Fairness for Predictive Student Models”, Proceedings of the 16th International Conference on Educational Data Mining, Bengaluru, India, (ISBN: 978-1-7336736-4-8) (2023)
- M. Verger, F. Bouchet, S. Lallé, V. Luengo : “Caractérisation et mesure des discriminations algorithmiques dans la prédiction de la réussite à des cours en ligne”, EIAH2023 : 11e Conférence sur les Environnements Informatiques pour l'Apprentissage Humain, Brest, France (2023)
2022
- M. Verger : “Investiguer la notion d’équité algorithmique dans les environnements informatiques pour l’apprentissage humain”, Actes des neuvièmes rencontres jeunes chercheur·e·s en EIAH, Lille, France, pp. 44-51 (2022)