LI Ke
Supervision : Bernd AMANN
Co-supervision : Hubert NAACKE
Exploring Topic Evolution in Large Scientific Archives with Pivot Graphs
There is an increasing demand for practical tools to explore the evolution of scientific research published in bibliographic archives such as the Web of Science (WoS), arXiv, PubMed or ISTEX. Revealing meaningful evolution patterns from these document archives has many applications and can be extended to synthesize narratives from datasets across multiple domains, including news archives, legal document archives and works of literature.
In this thesis, we propose a data model and query language for the visualization and exploration of topic evolution graphs. Our model is independent of any particular topic extraction and alignment method and defines a set of semantic and structural metrics for characterizing and filtering meaningful topic evolution patterns. These metrics are particularly useful for the visualization and exploration of large topic evolution graphs. We also present a prototype implementation of our model on top of Apache Spark and experimental results obtained for four real-world document archives.
Defence : 06/22/2021
Jury members :
Mirian Halfeld Ferrari, Professeure, Université d’Orléans, LIFO [Rapporteur]
Nicolas Travers, Maître de conférences HDR, ESILV, De Vinci Research Center [Rapporteur]
Nathalie Aussenac-Gilles, Directrice de Recherche CNRS, Université de Toulouse, IRIT
Clémence Magnien, Directrice de Recherche CNRS, Sorbonne Université, LIP6
Bernd Amann, Professeur, Sorbonne Université, LIP6
Hubert Naacke, Maître de conférences, Sorbonne Université, LIP6
2017-2021 Publications
2021
- K. Li : “Exploring Topic Evolution in Large Scientific Archives with Pivot Graphs”, thesis, PhD defence 06/22/2021, supervision : Amann, Bernd, co-supervision : Naacke, Hubert (2021)
- K. Li, H. Naacke, B. Amann : “An Analytic Graph Data Model and Query Language for Exploring the Evolution of Science”, Big Data Research, vol. 26, pp. 100247, (Elsevier) (2021)
2020
- K. Li, H. Naacke, B. Amann : “EPIQUE: A Graph Data Model and Query Language for Exploring the Evolution of Science”, BDA 2020 : 36e Conférence sur la Gestion de Données – Principes, Technologies et Applications, Paris (virtual), France (2020)
- K. Li, H. Naacke, B. Amann : “EPIQUE: Extracting Meaningful Science Evolution Patterns from Large Document Archives”, International Conference on Extending Database Technology (EDBT), Copenhagen, Denmark (2020)
- K. Li, H. Naacke, B. Amann : “Exploring the Evolution of Science with Pivot Topic Graphs”, International Workshop on Big Data Visual Exploration and Analytics BigVis at EDBT 2020, Copenhagen, Denmark (2020)
2019
- H. Naacke, K. Li, B. Amann, O. Curé : “Efficient similarity-based alignment of temporally-situated graph nodes with Apache Spark”, IEEE International Conference on Big Data, High Performance Big Graph Data Management, Analysis, and Mining, Los Angeles, CA, United States, pp. 4793-4798, (IEEE), (ISBN: 978-1-7281-0858-2) (2019)
2017
- X. Ren, O. Curé, H. Naacke, J. Lhez, K. Li : “Strider R: Massive and Distributed RDF Graph Stream Reasoning”, IEEE International Conference on Big Data, Boston, United States, pp. 3358-3367, (IEEE) (2017)