Monitor the Availability of the STANDBY Cluster

The STANDBY cluster responds to Diameter Device-Watchdog-Request (DWR) messages so that the network application knows the cluster is running and ready to take over processing if a failover occurs. You can monitor the exchange of Device-Watchdog (DW) messages by watching SNMP notifications and by running the print_blade_stats.py script.

About this task

This procedure describes how to verify that DWR messages are being received and answered, which indicates that the standby cluster is available.

Note: If the redundant link between the HA cluster pair is unavailable and communication between the two MATRIXX Engine sites is broken, an SNMP sysClusterPeerDisconnected trap is triggered. This can happen, for example, when the standby engine is stopped because it cannot keep up with transaction replay operations from the active cluster (the configured maximum number of retry attempts has been reached). Network operators should monitor the system for this notification. When the connection is restored, a sysClusterPeerConnected trap is triggered.

Procedure

Open a terminal and type the following command to display the DW statistics on the standby cluster. Replace eID with the ID of the secondary engine.
print_blade_stats.py -e eID -D

For example:

print_blade_stats.py -e 2 -D

Results

Information similar to the following is displayed. The Diameter Device-Watchdog statistics are shown in the second row of the Diameter PDU Stats table (Cmd 280, common:DW). If the Total Packets Read and Total Packets Sent values in that row are 0 or are not equal to each other, problems might exist on the standby cluster.
topology: Parsing config file: /opt/mtx/conf/mtx_config.xml

Diameter PDU Stats
------------------
                                    Response Time in micros   |    Total      Total 
                                     Current   |     Total    |  Packets    Packets    
AppId     Cmd Description          Avg    Max  |  Avg     Max |     Read       Sent
=======================================================================================================
     0     257 common:CE             0      0       0     581         35      22560         
     0     280 common:DW             0      0       0       0     301028     301028         
     0     282 common:DP             0      0       0       0          0          0         
     1     265 nasreq:AA             0      0       0       0          0          0        
     3     271 accounting:AC         0      0       0       0          0          0         
     4     258 credit-control:RA     0      0       0       0      13334          0         
     4     272 credit-control:CC     0      0    6961  126512      21313      21313  
368618     430 private:mdc           0      0       0       0          0          0
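
The check described above can also be automated. The following is a minimal sketch, not part of the product: it assumes the output of print_blade_stats.py keeps the column layout shown in the sample (AppId, Cmd, Description, four response-time columns, then Total Packets Read and Total Packets Sent). The function names find_dw_counters and standby_looks_healthy are hypothetical and chosen only for illustration.

```python
import re

# Assumption: each data row has nine whitespace-separated fields, as in the
# sample output: AppId, Cmd, Description, Avg, Max, Avg, Max, Read, Sent.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)

DW_COMMAND_CODE = 280  # Diameter Device-Watchdog command code


def find_dw_counters(output):
    """Return (packets_read, packets_sent) for the DW row, or None if absent."""
    for line in output.splitlines():
        match = ROW_RE.match(line)
        if match and int(match.group(2)) == DW_COMMAND_CODE:
            return int(match.group(8)), int(match.group(9))
    return None


def standby_looks_healthy(output):
    """Apply the rule from this topic: both counters non-zero and equal."""
    counters = find_dw_counters(output)
    if counters is None:
        # No DW row found; treat as a problem rather than guessing.
        return False
    packets_read, packets_sent = counters
    return packets_read > 0 and packets_read == packets_sent
```

To use the sketch, capture the text printed by print_blade_stats.py -e eID -D (for example, by redirecting it to a file) and pass it to standby_looks_healthy. A False result only means that the counters warrant a closer look; it does not by itself identify the cause.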