GISSELBRECHT Thibault
Supervision : Patrick GALLINARI
Co-supervision : Sylvain LAMPRIER
Bandit algorithms for real time data capture on social media
In this thesis, we study the problem of real-time data capture on social media. Because of the limitations imposed by these media, and because of the very large amount of information involved, it is not possible to collect all the data produced by social networks such as Twitter. To gather enough relevant information related to a predefined need, it is therefore necessary to focus on a subset of the information sources. In this work, we focus on user-centered data capture and consider each account of a social network as a source that can be listened to at each iteration of a data capture process, in order to collect the content it produces. This process, whose aim is to maximize the quality of the information gathered, is constrained at each time step by the number of users that can be monitored simultaneously. Selecting a subset of accounts to listen to over time is a sequential decision problem under constraints, which we formalize as a bandit problem with multiple selections. We propose several bandit models to identify the most relevant users in real time. First, we study the case of the so-called stochastic bandit, in which each user corresponds to a stationary distribution. Then, we introduce two contextual bandit models, one stationary and the other non-stationary, in which the utility of each user can be estimated more efficiently by assuming some underlying structure in the reward space. In particular, the first approach introduces the notion of profile, which corresponds to the average behavior of each user, while the second approach takes into account the activity of a user at a given instant in order to predict that user's future behavior. Finally, we consider models that are able to take into account complex temporal dependencies between users, using a latent space within which information transits from one iteration to the next.
Moreover, each of the proposed approaches is validated on both artificial and real datasets.
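As an illustration of the stochastic setting with multiple selections described above, the following is a minimal sketch and not the thesis's own algorithm: it assumes Bernoulli rewards and uses a standard UCB1-style index, listening at each step to the k accounts with the highest index. The function names `ucb_select` and `simulate`, the reward model, and the exploration constant are all assumptions made for this example.

```python
import math
import random


def ucb_select(counts, means, t, k):
    """Return the indices of the k users with the highest UCB1-style score.

    Users that have never been listened to get an infinite score so that
    each one is tried at least once (hypothetical initialization choice).
    """
    scores = []
    for n, m in zip(counts, means):
        if n == 0:
            scores.append(float("inf"))
        else:
            scores.append(m + math.sqrt(2.0 * math.log(t) / n))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def simulate(true_means, k, horizon, seed=0):
    """Simulate the capture process with Bernoulli rewards (an assumption).

    true_means[i] is the hidden probability that user i produces relevant
    content when listened to. Returns per-user listen counts and the
    empirical mean reward observed for each user.
    """
    rng = random.Random(seed)
    n_users = len(true_means)
    counts = [0] * n_users
    means = [0.0] * n_users
    for t in range(1, horizon + 1):
        for i in ucb_select(counts, means, t, k):
            reward = 1.0 if rng.random() < true_means[i] else 0.0
            counts[i] += 1
            # Incremental update of the empirical mean reward of user i.
            means[i] += (reward - means[i]) / counts[i]
    return counts, means


if __name__ == "__main__":
    counts, means = simulate([0.9, 0.8, 0.1, 0.1, 0.1], k=2, horizon=2000)
    print(counts)
```

In this sketch only the selected users reveal a reward at each step, which mirrors the constraint that a limited number of accounts can be monitored simultaneously. The contextual and latent-space models mentioned in the abstract would replace the empirical mean with a learned predictor.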
Defence : 03/24/2017
Jury members :
M. Philippe Preux - Université de Lille 3 [Rapporteur]
M. Liva Ralaivola - Laboratoire d'Informatique de Marseille [Rapporteur]
Mme Michèle Sebag - CNRS
M. Olivier Sigaud - Université Pierre et Marie Curie
M. Sylvain Lamprier - Université Pierre et Marie Curie
M. Patrick Gallinari - Université Pierre et Marie Curie
2015-2019 Publications
2019
- S. Lamprier, Th. Gisselbrecht, P. Gallinari : “Contextual Bandits with Hidden Contexts: a Focused Data Capture From Social Media Streams”, Data Mining and Knowledge Discovery, 33, pp. 1853-1893, (Springer) (2019)
2018
- S. Lamprier, Th. Gisselbrecht, P. Gallinari : “Profile-Based Bandit with Unknown Profiles”, Journal of Machine Learning Research, vol. 19 (53), pp. 53:1-53:40, (Microtome Publishing) (2018)
2017
- Th. Gisselbrecht : “Diffusion d’informations dans les réseaux sociaux”, thesis, phd defence 03/24/2017, supervision Gallinari, Patrick, co-supervision : Lamprier, Sylvain (2017)
- S. Lamprier, Th. Gisselbrecht, P. Gallinari : “Variational Thompson Sampling for Relational Recurrent Bandits”, Joint European Conference on Machine Learning and Knowledge Discovery in Databases - ECML/PKDD 2017, vol. 10535, Lecture Notes in Computer Science, Skopje, North Macedonia, pp. 405-421, (Springer) (2017)
2016
- Th. Gisselbrecht, S. Lamprier, P. Gallinari : “Dynamic Data Capture from Social Media Streams: A Contextual Bandit Approach.”, Tenth International Conference on Web and Social Media, ICWSM 2016, Cologne, Germany, pp. 130-139 (2016)
- Th. Gisselbrecht, S. Lamprier, P. Gallinari : “Bandit Contextuel pour la Capture de Données Temps Réel sur les Médias Sociaux”, Semaine du Document Numérique et de la Recherche d'Information (SDNRI 2016), Toulouse, France, pp. 57-72 (2016)
- Th. Gisselbrecht, S. Lamprier, P. Gallinari : “Linear Bandits in Unknown Environments”, ECML PKDD 2016 - European Conference on Machine Learning and Knowledge Discovery in Databases, vol. 9852, Lecture Notes in Computer Science, Riva Del Garda, Italy, pp. 282-298, (Springer) (2016)
2015
- Th. Gisselbrecht, S. Lamprier, P. Gallinari : “Policies for Contextual Bandit Problems with Count Payoffs.”, 27th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2015, Vietri Sul Mare, Italy, pp. 542-549, (IEEE) (2015)
- Th. Gisselbrecht, P. Gallinari, S. Lamprier, L. Denoyer : “WhichStreams: A Dynamic Approach for Focused Data Capture from Large Social Media”, Ninth International Conference on Web and Social Media, ICWSM 2015, Oxford, United Kingdom, pp. 130-139 (2015)
- Th. Gisselbrecht, L. Denoyer, P. Gallinari, S. Lamprier : “Apprentissage en temps réel pour la collecte d’information dans les réseaux sociaux.”, CORIA 2015 - Conférence en Recherche d'Informations et Applications, Paris, France, pp. 7-22 (2015)
- Th. Gisselbrecht, L. Denoyer, P. Gallinari, S. Lamprier : “Apprentissage en temps réel pour la collecte d’information dans les réseaux sociaux”, Document numérique - Revue des sciences et technologies de l'information. Série Document numérique, vol. 18 (2-3), pp. 39-58, (Hermès) (2015)