POUPART Yoann

PhD student at Sorbonne University (Teaching assistant)
Team: SMA
Arrival date: 09/01/2024
    Sorbonne Université - LIP6
    Boîte courrier 169
    Couloir 25-26, Étage 4, Bureau 416
    4 place Jussieu
    75252 PARIS CEDEX 05
    FRANCE

Tel: +33 1 44 27 36 67, Yoann.Poupart (at) lip6.fr
https://perso.lip6.fr/Yoann.Poupart

Supervision: Nicolas MAUDET

Interpretability for Deep Multi-Agent Systems

Multi-agent systems (MAS) have become widely accessible in recent years thanks to the natural-language interfaces made possible by large language models (LLMs). While their ability to solve complex tasks is undeniable, the dynamics that emerge from these systems can be hard to predict, and guarantees are needed. Jailbreaking, adversarial behaviour, and power-seeking are concerning failure modes of MAS, and evaluating these capabilities remains a difficult problem. In this respect, interpretability could be one of the best tools for monitoring and controlling several agents simultaneously and automatically. Indeed, a model's internals carry the information used for its predictions and can be used symbolically to gain understanding or control.