Application Failover Support Feature – 8.11 - Network Management Center Blog
Application Failover Support Feature – 8.11

NNMi Application Failover is a new feature introduced with NNMi/NNMi Advanced 8.11. It lets you set up two NNMi management stations: a Primary/Active node and a Standby/Backup node. The two NNMi systems monitor each other using a “heartbeat” signal over the network. If the Active node fails (loss of heartbeat), the Standby node automatically starts NNMi and resumes NNMi functions (trap handling, analysis, discovery, polling, etc.). Both the Active and the Standby server instances use the embedded Postgres database by default.

Note: The Application Failover feature monitors system availability, not the availability of the NNMi instance itself. That is, if the NNMi instance crashes or is killed but the system remains online, Application Failover will not detect this.

How does this work?

  • The Active node performs periodic database backups.
  • These backups are sent to the Standby node.
  • Between backups, database transaction logs are created periodically and sent to the Standby node immediately.
  • If the Active node crashes or is shut down, the Standby becomes the new Active node.
  • The new Active node starts NNMi, including the database server.
  • The database server imports all the transaction logs and then becomes available to NNMi for requests.
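The sequence above can be illustrated with a small toy model. This is only a sketch of the idea (full backup, then shipped transaction logs, then replay when the heartbeat is lost). It is not NNMi code: the class name, the dictionary "database", and the 30-second timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyNode:
    """Toy model of the Standby side of a failover pair (hypothetical, not NNMi)."""
    heartbeat_timeout: float                      # seconds of silence tolerated (assumed value)
    last_heartbeat: float = 0.0
    backup: dict = field(default_factory=dict)    # latest full backup received
    pending_logs: list = field(default_factory=list)  # logs shipped since that backup
    database: dict = field(default_factory=dict)  # only populated after failover
    active: bool = False

    def heartbeat(self, now: float) -> None:
        # Record that the Active node was heard from at time `now`.
        self.last_heartbeat = now

    def receive_backup(self, snapshot: dict) -> None:
        # A new full backup supersedes the logs shipped before it.
        self.backup = dict(snapshot)
        self.pending_logs.clear()

    def receive_log(self, changes: dict) -> None:
        # Transaction logs are stored in arrival order until they are needed.
        self.pending_logs.append(dict(changes))

    def check(self, now: float) -> bool:
        """Become the new Active node if the heartbeat has been lost."""
        if not self.active and now - self.last_heartbeat > self.heartbeat_timeout:
            # Restore the backup, then replay the logs in order.
            self.database = dict(self.backup)
            for changes in self.pending_logs:
                self.database.update(changes)
            self.active = True
        return self.active


standby = StandbyNode(heartbeat_timeout=30.0)
standby.heartbeat(0.0)
standby.receive_backup({"router1": "up"})
standby.receive_log({"switch7": "up"})
standby.receive_log({"router1": "down"})
standby.heartbeat(10.0)

print(standby.check(20.0))   # heartbeat is recent, Standby stays passive
print(standby.check(45.0))   # 35 seconds of silence, Standby takes over
print(standby.database)
```

In this model the time spent in the replay loop grows with the number of logs received since the last backup.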

Does Application Failover support WAN?

The 8.11 release only supports LAN (same subnet). Going across a router is not supported at this time. Future releases will look into adding support for WAN.

What is the compatibility matrix for the active server and the standby server?

  • OS consistency across the active and standby servers is enforced. Both servers need to run the same OS: HP-UX to HP-UX, Linux to Linux, etc. A mixed-mode OS configuration such as Windows to Linux is not supported.
  • The NNMi version and patch level need to be the same on both servers.
  • Licensing needs to be consistent. If the license (capacity or features) on the Standby is less than on the Active, nodes will become unmanaged on failover. For example, if the Standby node has a lower node count and/or a subset of features (iSPI-NET, iAdvance, etc.), then when that node becomes Active, those features will be disabled and/or nodes will become unmanaged.
  • Both servers need to have the same “system” password.
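As a rough illustration, the first three rules above could be checked before pairing two servers. The sketch below is hypothetical: the `ServerProfile` fields and the feature names are made up for the example and do not correspond to any NNMi API, and the “system” password rule is left out because a password should not be compared in plain text.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class ServerProfile:
    """Hypothetical description of one server; field names are assumptions."""
    os_family: str
    nnmi_version: str
    patch_level: int
    licensed_nodes: int
    features: FrozenSet[str]


def compatibility_problems(active: ServerProfile, standby: ServerProfile) -> List[str]:
    """Return a list of human-readable problems; an empty list means the pair looks OK."""
    problems = []
    if active.os_family != standby.os_family:
        problems.append("operating systems differ")
    if (active.nnmi_version, active.patch_level) != (standby.nnmi_version, standby.patch_level):
        problems.append("NNMi version or patch level differs")
    if standby.licensed_nodes < active.licensed_nodes:
        problems.append("standby node count is lower; nodes would become unmanaged")
    missing = active.features - standby.features
    if missing:
        problems.append("standby lacks features: " + ", ".join(sorted(missing)))
    return problems


active = ServerProfile("Linux", "8.11", 0, 5000, frozenset({"iSPI-NET", "iAdvance"}))
standby = ServerProfile("Linux", "8.11", 0, 1000, frozenset({"iAdvance"}))
for problem in compatibility_problems(active, standby):
    print(problem)
```

Returning a list of problems rather than a single true/false value makes it easier to see every mismatch in one pass.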

How are the two servers licensed?

The main Active server needs a production license equivalent to the number of nodes that it needs to manage. The Standby uses a non-production license. The node counts for the Active server and Stand

What if I want to use another external database such as Oracle instead of Postgres?

Application Failover is supported only with the embedded Postgres database, not with an external database such as Oracle. If you need to use an external database such as Oracle, you can use other supported clustering technologies instead, such as MC ServiceGuard for HP-UX, Microsoft Clusters (MSCS) for Windows, or Veritas for Linux.

Is the application level failover supported for NNM iSPI products?

All the iSPIs (except iSPI Performance) get “free” support for the Application Failover feature as long as they meet the following two conditions:

  • They use NNM’s Postgres database (then their data is replicated along with NNM’s data)
  • They use ovstart/ovstop to start/stop their services

iSPI NET as well as the iSPI for Performance do not support the Application Failover feature.

To use NNMi Application Failover together with iSPI Performance, the following can be done:

  • iSPI Performance must be installed on a third node (same subnet), without NNMi (the two cannot co-exist on the same server).
  • Both NNM nodes run the script “nnmenableiSPI Performance.ovpl” to point to the iSPI Performance station.
  • iSPI Performance is (initially) configured to point to the currently Active node.
  • iSPI Performance detects an NNM cluster environment and periodically polls “who-is-current-Active?” using a special command.
  • When the Active node changes, iSPI Performance reconfigures itself to point to the new Active node.
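The polling behavior in the last two steps can be sketched as follows. This is an illustrative model only: the “special command” is not named in this post, so a plain Python callable named `who_is_active` stands in for it, and the host names are invented.

```python
from typing import Callable, List


class PerformanceClient:
    """Toy stand-in for a third node that follows whichever NNM node is Active."""

    def __init__(self, who_is_active: Callable[[], str], initial_active: str) -> None:
        self._who_is_active = who_is_active   # hypothetical stand-in for the real query
        self.target = initial_active          # node this client currently points to
        self.history: List[str] = [initial_active]

    def poll_once(self) -> bool:
        """Ask who is Active; re-point if the answer changed. Returns True on a change."""
        current = self._who_is_active()
        if current != self.target:
            self.target = current
            self.history.append(current)
            return True
        return False


# Simulated answers from the cluster: nnm-a is Active twice, then nnm-b takes over.
answers = iter(["nnm-a", "nnm-a", "nnm-b"])
client = PerformanceClient(lambda: next(answers), initial_active="nnm-a")

for _ in range(3):
    changed = client.poll_once()
    print(client.target, "(re-pointed)" if changed else "(unchanged)")
```

In a real deployment the poll would run on a timer, so a change of Active node would be noticed only at the next poll.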

Do we support Oracle RAC?

Oracle RAC support is not available at this time.

I am Aruna Ravichandran, the Product Marketing Manager for NNM/NNMi/iSPI products within the Network Management Center for HP Software. I have been with HP for 13 years. I started my career as an engineer in the HP-UX kernel lab, moved on to application development, and was an architect in the High Availability/Clustering lab for a couple of years. I then wanted to experiment with the “darker” side of the business and moved to product management/product marketing 5 years ago. I marketed Storage products (the high-end XP disk arrays), followed by Security Marketing, where I created a secure appliance solution for enterprise log management and took it to market. I recently joined the Business Service Management (BSM) organization of HP Software. I have to say that I am still a “techy” at heart, though I totally love the “darker” side of the business.

We have two other conversations you can join: ITOpsBlog for Operations, and BSMblog for Business Service Management.


Posted 02-16-2009 3:41 AM by Michael_Procopio

Comments

Mike Ashauer wrote re: Application Failover Support Feature – 8.11
on 02-19-2009 1:52 AM

How long does the failover take? Also, do I lose my console session when the failover occurs? What is the method to convert back to the primary server?

aruna13 wrote re: Application Failover Support Feature – 8.11
on 02-20-2009 8:08 PM

Hello,

Here are the answers to what you asked:

How long does the failover take?

The time for failover depends a lot on the size of the environment, but it is usually 10 minutes or so. It can be longer on a really large system that has a lot of transaction logs to process.

Also, do I lose my console session when the failover occurs?

Yes. The console session will be lost.

What is the method to convert back to the primary server?

To go back to the primary, run the nnmcluster -acquire command on the primary system.

Thanks,

Aruna

_________________

Aruna Ravichandran

Sr. Product Marketing Manager, Network Management Center, HPSW

3kassociates wrote re: Application Failover Support Feature – 8.11
on 03-03-2009 8:10 PM

Just to be clear - this approaches the failover capabilities on NNM 7.x (pre-ET) though with more restrictions. Any plans to also support the load-balancing capabilities in those earlier versions? It was very nice being able to use the active-active load-balance + failover capabilities available before extended topology. In fact this is what stopped us from implementing ET and going beyond 7.50.

martin.haack wrote re: Application Failover Support Feature – 8.11
on 03-05-2009 4:15 PM

About going back to the primary NNMi server:

After the failover, when the primary NNMi server comes up again, does the secondary NNMi server write the transaction logs to the primary NNMi server?

So that the primary NNMi server can be activated again?

aruna13 wrote re: Application Failover Support Feature – 8.11
on 03-05-2009 7:01 PM

Hello,

Answering the earlier comment on Application Failover capabilities:

NNMi Application Failover does not support active-active monitoring. However, NNMi Application Failover is less restrictive than the old MS/CS failover.

Here is a rough tally of what comes with the Application Failover capability in NNMi 8.11 compared to NNM 7.x:

NNMi 8.11

1. Fault polling failover

2. Performance polling failover

3. Custom polling failover

4. Event data failover

5. Topology data failover

6. Configuration data failover

NNM 7.x

1. Fault polling failover (layer 3 only)

Based on the above, NNMi 8.x Application Failover has more capabilities than what NNM 7.x provided.

Thanks,

Aruna Ravichandran

Sr. Product Marketing Manager,

Network Management Center,

HP Software

aruna13 wrote re: Application Failover Support Feature – 8.11
on 03-05-2009 7:40 PM

Hello,

Answering the other comment, on what happens when the main active server comes back up:

When the primary server comes back up, it will start receiving transaction logs from the secondary server, and you can run the nnmcluster -acquire command on the primary to have it take over as the active node.

Thanks,

Aruna Ravichandran

Sr. Product Marketing Manager,

Network Management Center,

HP Software

john wrote re: Application Failover Support Feature – 8.11
on 05-15-2009 4:15 AM

Does Application Failover support WAN now? Has it been released?

Dave wrote re: Application Failover Support Feature – 8.11
on 06-22-2009 9:40 PM

I see a reference to an application failover whitepaper but cannot find it.

Thx

aruna13 wrote re: Application Failover Support Feature – 8.11
on 06-24-2009 6:19 PM

Hello,

Application Failover support is now available with NNMi over the WAN.

We don't have a separate whitepaper, but we have a complete description of this feature and how to configure it in the latest version of the NNMi Deployment Guide (version 8.11, patch 3). Here is the link to the guide on HP's manuals website (HP Passport sign-in required):

support.openview.hp.com/.../nnmi_deployment_guide_81x_p3.pdf

In the above guide, go to the chapter Administration --> Configuring NNMi Application Failover --> Application Failover and Multi-Subnets.

Thanks,

Aruna Ravichandran

Sr. Product Marketing Manager

Network Management Center, HP Software

aru@hp.com

hpxwyuan wrote re: Application Failover Support Feature – 8.11
on 01-12-2010 7:39 PM

Does failover still need two servers on the same subnet? Thanks!

ken_gott wrote re: Application Failover Support Feature – 8.11
on 01-12-2010 8:06 PM

Yes, as of NNMi 8.12 (NNMi 8.1x Patch 4) there is support for application failover across subnets. Please refer to the NNMi 8.13 Deployment Guide for details.
