Management by exception, current fad or real trend? - Network Management Center Blog
Management by exception, current fad or real trend?

We were recently discussing management by exception at Sunrise Café [1].

On a big network, managing in any way other than by exception can be overwhelming. We did a lot of work in NNM i-series to make management by exception easier.

We regularly talk to folks who receive more than 100,000 traps per day; we talked to one customer who receives 9 million traps per day. If you had to deal with every one of those individually, it could be a full-time job for a whole team.

So in NNM i-series we changed the Root Cause Analysis (RCA) model to promote to incidents only those items that we know for a fact are problems you should look at. Of course, you can still look at everything that comes in if you need to do some investigating.
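To make the idea concrete, here is a minimal Python sketch of management by exception. It is not how NNM i-series implements RCA; the `Trap` class, the `triage` function, the trap kinds, and the `upstream` topology map are all hypothetical names made up for illustration. The assumed rule is deliberately simple: a "node down" trap is treated as a symptom when the node it depends on is also down, and only the remaining "node down" traps are promoted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trap:
    node: str
    kind: str  # hypothetical trap kinds, e.g. "nodeDown", "linkFlap"


def triage(traps, upstream):
    """Split traps into promoted incidents and a browsable backlog.

    upstream maps a node to the node it depends on (hypothetical topology).
    """
    down = {t.node for t in traps if t.kind == "nodeDown"}
    incidents, backlog = [], []
    for t in traps:
        # Promote only nodeDown traps whose upstream node is not also down.
        is_root = t.kind == "nodeDown" and upstream.get(t.node) not in down
        (incidents if is_root else backlog).append(t)
    return incidents, backlog


traps = [
    Trap("router1", "nodeDown"),
    Trap("switch7", "nodeDown"),  # sits behind router1
    Trap("switch9", "linkFlap"),
]
incidents, backlog = triage(traps, upstream={"switch7": "router1"})
print([t.node for t in incidents])  # ['router1']
print(len(backlog))                 # 2
```

Nothing is thrown away in this sketch: the backlog keeps every trap that was not promoted, so it is still there when you need to investigate.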

We also designed in user roles and the ability to assign an incident to a user. An operator can then view only the incidents she assigned to herself or that were assigned to her. This further reduces the number of distractions that keep her from fixing the incidents with the most impact.
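A per-operator view like that can be sketched in a few lines of Python. This is only an illustration under assumed names: the `Incident` class, its `impact` scale, and the `my_queue` function are hypothetical and are not part of any NNM i-series interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    summary: str
    impact: int                      # hypothetical scale: higher = more impact
    assignee: Optional[str] = None   # None means unassigned


def my_queue(incidents, operator):
    """Return only the operator's incidents, highest impact first."""
    mine = [i for i in incidents if i.assignee == operator]
    return sorted(mine, key=lambda i: i.impact, reverse=True)


incidents = [
    Incident("core router down", impact=9),
    Incident("access switch fan failure", impact=3),
    Incident("WAN link degraded", impact=6),
]
incidents[0].assignee = "alice"  # assigned to her by someone else
incidents[1].assignee = "alice"  # self-assigned
incidents[2].assignee = "bob"

for inc in my_queue(incidents, "alice"):
    print(inc.impact, inc.summary)
```

The sketch does not distinguish self-assigned incidents from those assigned by someone else; both simply end up with the same assignee, which is all the filtered view needs.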

ITIL (Information Technology Infrastructure Library) discusses incident lifecycle management. Because we wanted to move more toward that model, the redesigned GUI includes five incident states. (HP is one of the authors of ITIL.)

With 20/20 hindsight, some might call ATM and FDDI fads, while Ethernet has proven to be a real trend. Does your shop use management by exception? Do you think it is a fad or a trend?

I think it is a trend. Of course, that's just my opinion. I could be wrong [2].

  1. The Sunrise Café is a little table with a coffee maker near a window next to the Network Management team cubes.
  2. Borrowed from Dennis Miller; I hope he doesn't mind.

 For the Network Management Center Team - Michael Procopio



Posted 10-28-2008 4:14 AM by Michael_Procopio

Comments

Stephen Smith wrote re: Management by exception, current fad or real trend?
on 10-30-2008 6:18 AM

Hi,

I think this is a nice idea and hope we get participants on it. I know of some customers in my country who are getting huge numbers of traps coming in too. One in particular is swamped and only looks at the critical-severity items. As a result, any warnings, majors, etc. get missed, and so they lose their potential/ability to be proactive in managing the network. They are currently using 7.51.

So I was wondering what sort of practice other customers might use in this situation and whether there is a better practice. In other words, how do other customers go about handling such an overwhelming load?

Stephen Smith - TC, MEMA-SA
