MEDEM KUATSE Amélie
Supervision : Serge FDIDA
Conception de Mécanismes d'Amélioration de la Gestion d'Incidents dans les Réseaux IP
Operator IP networks carry most of the world's data traffic every day and must therefore be increasingly reliable. However, these networks are often subject to incidents that arise from maintenance work or unexpected failures. Many of these incidents are unavoidable, mainly because their causes are external to the network operator, and when they happen the network can suffer considerable damage. It is therefore important to develop tools that prevent network incidents or at least limit their impact. In this context, automated procedures can speed up troubleshooting and maintenance work and so reduce the overall downtime of the network.
The main focus of this thesis is the automatic detection of IP network incidents. Reaching this goal requires a deep understanding of these incidents and of their effects on the network. Network operators use trouble tickets to track every step of troubleshooting and maintenance activities, so the ticket history carries valuable information for network management. Tickets are text documents that record the description (and the cause) of incidents that required operator intervention. The effects of these incidents are observable through alarm messages from different sources (for instance SNMP, router syslogs, or routing protocols); we focus on routing alarm messages. Our key observation is that operators already use trouble ticketing systems to record all events that require their intervention. Hence, we can combine the history of trouble tickets with intradomain routing messages to train a classifier, and then apply this classifier online to intradomain routing messages to automatically single out the critical events.
As a first step, we propose Troubleminer, a mechanism based on document clustering techniques that (1) automatically extracts the causes of network incidents from tickets and (2) organizes a collection of trouble tickets into a hierarchy that network operators can easily use. Then, we develop a heuristic to correlate trouble tickets with routing instability events in two operational networks: a VPN provider and the Internet2 backbone network. We find that 4% (VPN operator) and 23% (Internet2) of routing events in these networks are critical, meaning that they coincide with trouble tickets. Finally, we show the feasibility of detecting critical routing events by means of the k-NN and Random Forest algorithms. Our results show that we can accurately pinpoint approximately 70% of critical events in both networks.
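The notion of a critical routing event (one that coincides with a trouble ticket) can be illustrated with a minimal sketch. The abstract does not specify the thesis's actual correlation heuristic, so the time-window overlap rule below, the `slack` parameter, and the names `Interval`, `is_critical`, and `critical_fraction` are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Interval:
    """A closed time interval [start, end] (hypothetical record type)."""
    start: datetime
    end: datetime


def is_critical(event: Interval, tickets: list[Interval], slack: timedelta) -> bool:
    """Return True if the event overlaps any ticket widened by `slack` on both sides.

    Assumed rule: the abstract only says critical events coincide with tickets.
    """
    return any(
        event.start <= ticket.end + slack and event.end >= ticket.start - slack
        for ticket in tickets
    )


def critical_fraction(events: list[Interval], tickets: list[Interval],
                      slack: timedelta) -> float:
    """Fraction of routing events labeled critical under the assumed overlap rule."""
    if not events:
        return 0.0
    flagged = sum(is_critical(event, tickets, slack) for event in events)
    return flagged / len(events)
```

Labels produced this way could then serve as training targets for a classifier such as k-NN or Random Forest. The choice of `slack` is a tuning decision that this sketch leaves open.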
Defence : 02/02/2011
Jury members :
Damien MAGONI, Professor, Université de Bordeaux [Reviewer]
Philippe OWEZARSKI, Researcher, CNRS [Reviewer]
Patrick GALLINARI, Professor, UPMC Sorbonne Universités
Noëmie SIMONI, Professor, ENST Paris
Olivier FESTOR, Researcher, INRIA
Mickael MEULLE, Researcher, Orange Labs R&D (France Telecom R&D)
Serge FDIDA, Professor, UPMC Sorbonne Universités
2007-2013 Publications
2013
- C. Magnien, A. Medem Kuatse, S. Kirgizov, F. Tarissan : “Towards realistic modeling of IP-level routing topology dynamics”, Networking science, vol. 3 (1-4), pp. 24-33 (2013)
2012
- C. Magnien, A. Medem Kuatse, F. Tarissan : “Towards realistic modeling of IP-level routing topology dynamics”, 14es Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications (AlgoTel), La Grande Motte, France, pp. 1-4 (2012)
- A. Medem Kuatse, C. Magnien, F. Tarissan : “Impact of power-law topology on IP-level routing dynamics: Simulation results”, IEEE International Workshop on Network Science For Communication Networks (NetSciCom'12), Orlando, United States, pp. 220-225, (IEEE) (2012)
2011
- A. Medem Kuatse : “Conception de Mécanismes d’Amélioration de la Gestion d’Incidents dans les Réseaux IP”, PhD thesis, defended 02/02/2011, supervised by Serge Fdida (2011)
2010
- A. Medem Kuatse, R. Teixeira, N. Usunier : “Predicting Critical Intradomain Routing Events”, Proceedings of the Global Communications Conference (GLOBECOM 2010), Miami, Florida, United States, pp. 1-5, (IEEE) (2010)
- A. Medem Kuatse, R. Teixeira, N. Feamster, M. Meulle : “Joint analysis of network incidents and intradomain routing changes”, Proceedings of the 6th International Conference on Network and Service Management (CNSM 2010), Niagara Falls, Canada, pp. 198-205, (IEEE) (2010)
2009
- A. Medem Kuatse, M.‑I. Akodjenou, R. Teixeira : “Troubleminer: Mining network trouble tickets”, In Proc. of the 1st IFIP/IEEE international workshop on Management of the Future Internet (Manfi2009), New York, United States, pp. 113-119, (IEEE) (2009)
2007
- A. Medem Kuatse, R. Teixeira, M. Meulle : “Characterizing network events and their impact on routing”, ACM CoNEXT Student Workshop, New York, United States, p. 59, (ACM) (2007)