We were recently discussing management by exception at Sunrise Café [1].
On a big network, managing in any way other than by exception can be overwhelming. We did a lot of work in NNM i-series to make management by exception easier.
We regularly talk to folks who receive over 100,000 traps per day; we talked to one customer with 9 million traps per day. If you had to deal with every one of those individually, it could be a full-time job for a whole team.
So in NNM i-series we changed the Root Cause Analysis (RCA) model to promote to incidents only those items that we know for a fact are problems you should look at. Of course, you can still look at everything that comes in if you need to do some investigating.
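To make the idea concrete, here is a minimal sketch of management by exception as a filter. It is not the NNM i-series RCA code: the `Event` class, its fields, and the `is_root_cause` flag are all hypothetical names invented for illustration, and the sketch assumes some earlier analysis step has already decided which events are real problems.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str          # hypothetical: device that sent the trap
    name: str            # hypothetical: trap or event name
    is_root_cause: bool  # hypothetical: verdict from an earlier analysis step


def promote_to_incidents(events):
    """Keep only the events judged to be problems worth an operator's time."""
    return [e for e in events if e.is_root_cause]


events = [
    Event("router1", "LinkDown", True),
    Event("switch7", "LinkDown", False),   # treated here as a symptom, not a cause
    Event("switch7", "ColdStart", False),
]

incidents = promote_to_incidents(events)
print(len(events), "events in,", len(incidents), "incident(s) promoted")
```

Nothing is thrown away in this sketch: the full `events` list is still there when you need to investigate.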
We also designed in user roles and the ability to assign an incident to a user. An operator can then view only the incidents she has assigned to herself or that others have assigned to her. This further reduces the number of things that can distract her from fixing the incidents with the most impact.
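The "my incidents" view described above can be sketched the same way. Again, this is only an illustration under assumptions: `Incident`, `assigned_to`, `incidents_for`, and the operator names are made up for the example and are not the product's data model or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    summary: str
    assigned_to: Optional[str] = None  # hypothetical: None means unassigned


def incidents_for(user, incidents):
    """Return only the incidents assigned to the given operator."""
    return [i for i in incidents if i.assigned_to == user]


queue = [
    Incident("Core router interface down", assigned_to="alice"),
    Incident("Branch switch unreachable", assigned_to="bob"),
    Incident("Fan failure on edge router"),  # not yet assigned
]

alice_view = incidents_for("alice", queue)
print([i.summary for i in alice_view])
```

Whether an incident was self-assigned or assigned by someone else makes no difference to the view, so the sketch stores only the current assignee.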
ITIL (Information Technology Infrastructure Library) discusses incident lifecycle management. Because we wanted to move more toward that model, the redesigned GUI includes five incident states. (HP is one of the authors of ITIL.)
With 20/20 hindsight, some might call ATM and FDDI fads, while Ethernet has proven to be a real trend. Does your shop use management by exception? Do you think it is a fad or a trend?
I think it is a trend. Of course, that's just my opinion. I could be wrong [2].
[1] The Sunrise Café is a little table with a coffee maker near a window next to the Network Management team cubes.
[2] Borrowed from Dennis Miller; I hope he doesn't mind.
For the Network Management Center Team - Michael Procopio

Posted 10-28-2008 4:14 AM by Michael_Procopio