This article is the corollary to "The Anatomy of APM” which
outlines four foundational elements of a successful APM strategy: Top Down
Monitoring, Bottom Up Monitoring, Reporting, and Incident Management. Here I provide a deeper context on how the event-to-incident
flow is structured.
It is the
correlation of events and the amalgamation of metrics that bring value to the
business by way of dashboards and trending reports, and it’s in the way the
business interprets the accuracy of those metrics that determines the success
of the implementation. If an event
occurs and no one sees it, believes it, or takes action on it, APM’s value can
be severely diminished and you run the risk of owning "shelfware.”
Overall, as
events are detected and consumed by the system, it is the automation that is
the lifeblood of an APM solution, ensuring the pulse of the incident flow is a
steady one. The goal is to show a
conceptual view of how events flow through the environment and eventually
become incidents. At a high level, the
Trouble Ticket Interface (TTI) will correlate the events into alerts, and
alerts into incidents which then become tickets, enabling the Operations team
to begin working toward resolution.

The event
flow moves from the outside in, and then from the center to the right. Here is how it’s managed:
• The outside blue circles represent
the monitoring toolsets that collect information directly from the
Infrastructure and the critical applications.
• The inner green (teal) circles
represent the toolsets the Enterprise Systems Management (ESM) team manages,
and is where most of the critical application thresholds are set.
• The dark brown circles are logical
connection points depicting how the events are collected as they flow through
the system – Once the events hit this connection point they go to 3 output
queues.
• The Red circles on the right are the
Incident Output queues for each event after it has been tracked and
correlated.
The transformation between event-to-incident is the critical
junction where APM and ITIL come together to provide tangible value back to the
business. So if you only take one thing
away from this picture, it would be the importance of managing the strategic
intent of the output queues, because this is the key for managing action, going
red to green, and trending.
Conclusion
I’m
suggesting that it is not necessarily the number of features or technical
stamina of each monitoring tool to process large volumes of data that will make
an APM implementation successful; it’s the choices you make in how you put them
together to manage the event-to-incident flow that determines your success. Timeliness and accuracy in this area will
help you gain credibility and confidence with each of your constituents and
business partners you support.
Related Links:
For a high-level view of a
much broader technology space, refer to slide show on BrightTALK.com which
describes "The Anatomy of APM- Webcast” in more context.
APM and MoM – Symbiotic Solution Sets
The Anatomy of APM
Prioritizing Gartner's APM Model
You can contact me on
LinkedIn.
Posted Friday, July 20, 2012