Alarm Prioritization and Diagnosis for Cellular Networks

How can the prioritization of alarm events in ICT be optimized?

Gabriela F. Ciocarlie of SRI International set out to answer this question in the research that won the Best Paper award at MONAMI 2015, the Seventh EAI International Conference on Mobile Networks and Management.

The state of the art

In today’s telecommunication networks, managing alarm events is a challenge for operators, mainly because of two factors: high volume and lack of descriptiveness. First, discrete alarm events can easily reach tens of thousands per day. Second, the alarms themselves frequently do not describe the actual cause of a failure, and that cause may even lie outside the network itself. Current techniques for correlating alarms to find their cause fall into two categories:

  • Data-driven methods adapt readily to different networks and may help uncover the causes of a failure. On the other hand, they offer little in the way of concrete diagnostics or prioritization, because they operate on derived statistics and do not incorporate knowledge about the network itself.
  • Rule-based methods offer strong diagnostic capabilities because they leverage background knowledge about a given network, but they adapt poorly to new networks or to updates of the underlying network.

The team led by Dr. Ciocarlie proposes a novel technique that can flexibly adapt to new networks while retaining as much background knowledge and diagnostic capability as possible. The new framework aims to prioritize and diagnose faults in broadband networks based on a priori information about the managed network structure, relationships, and fault management practices. The study shows that the system significantly reduces the number of network objects that need to be analyzed by combining the alarming objects into sub-graphs and prioritizing them. It can also derive the most probable cause of the observed alarms.
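
To give a feel for what combining alarming objects into sub-graphs and prioritizing them could look like, here is a minimal Python sketch. It is not the method from the paper: the topology, the object names (rnc1, bts1, and so on), the alarm counts, and the scoring rule (sum of alarm counts per sub-graph) are all assumptions made for illustration only.

```python
from collections import defaultdict, deque


def alarm_subgraphs(edges, alarms):
    """Group alarming objects into connected sub-graphs of the topology.

    edges:  iterable of (a, b) links between network objects (assumed input)
    alarms: dict mapping an alarming object to its alarm count (assumed input)
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        # Keep only links whose two endpoints are both alarming.
        if a in alarms and b in alarms:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen = set()
    groups = []
    for start in sorted(alarms):
        if start in seen:
            continue
        # Breadth-first search collects one connected group of alarming objects.
        seen.add(start)
        queue = deque([start])
        group = []
        while queue:
            node = queue.popleft()
            group.append(node)
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        groups.append(sorted(group))
    return groups


def prioritize(groups, alarms):
    """Rank sub-graphs by a hypothetical score: total alarm count, highest first."""
    return sorted(groups, key=lambda g: sum(alarms[n] for n in g), reverse=True)


# Hypothetical topology: two controllers, each linked to two base stations.
edges = [("rnc1", "bts1"), ("rnc1", "bts2"), ("rnc2", "bts3"), ("rnc2", "bts4")]
# Hypothetical alarm counts; rnc2 and bts3 raise no alarms.
alarms = {"rnc1": 5, "bts1": 2, "bts2": 1, "bts4": 4}

groups = alarm_subgraphs(edges, alarms)
ranked = prioritize(groups, alarms)
print(ranked)
```

In this toy example, four alarming objects collapse into two sub-graphs, so an operator would look at two prioritized groups instead of four separate objects. A real system would replace the simple score with one informed by the network structure and fault management practices mentioned above.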

Next steps

In the next phases of the research, the framework will be tested in a real-time setting. In addition, other types of data, including configuration management information, will be considered in the diagnosis process.

Are you eager to read the full paper? It will be available on EUDL.