MATRIXX System Monitor

The MATRIXX System Monitor is an optional service that you can use to monitor the status of MATRIXX Engine servers (nodes) to protect them from an overload condition. System Monitor is enabled by default.

System Monitor works with the services stack of a MATRIXX Engine node (server). It monitors critical MATRIXX Engine resources and calculates the node usage level as a whole percentage value. This percentage reflects how much of a monitored resource, such as CPU, is in use and therefore how much remains. For example, a node usage level of 50% indicates that the node has half of its processing capability left. The usage level is also used by the optional TRA-PROC health-based features and is exposed publicly through SNMP.

System Monitor monitors resources that are considered critical in determining the operational state of MATRIXX Engine services. A general node usage level is derived from the different monitored resources.
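As an illustration of how a whole-percentage node usage level might be derived from several monitored resources, the following Python sketch clamps each resource to a value from 0 to 100 and reports the busiest resource as the node level. The aggregation rule (taking the maximum), the function names, and the sample figures are all assumptions made for this example. The actual formula used by System Monitor is not described here.

```python
def usage_percent(used: float, capacity: float) -> int:
    """Return usage as a whole percentage, clamped to the range 0..100."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    pct = round(100 * used / capacity)
    return max(0, min(100, pct))


def node_usage_level(resource_levels: dict[str, int]) -> int:
    """Hypothetical aggregation: the busiest resource sets the node level."""
    return max(resource_levels.values(), default=0)


# Illustrative sample values only; resource names are placeholders.
levels = {
    "inter_process_queues": usage_percent(120, 400),
    "shared_memory_buffers": usage_percent(512, 1024),
    "swap_activity": usage_percent(5, 100),
}

node_level = node_usage_level(levels)
remaining = 100 - node_level  # share of capacity still available
print(f"node usage level: {node_level}%, remaining: {remaining}%")
```

With these sample values the shared memory buffers are the busiest resource, so the node is reported as half used, which matches the 50% case described above.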

Example resources that the service monitors include:
  • Inter-process queues — Monitors the usage level and health of the inter-process queues used by the different MATRIXX Engine services. Queue states are periodically sampled and maintained over a sampling window to determine overall queue health as a load indicator.
  • Shared memory buffer pools — Monitors the usage state of the shared memory buffer pools. Periodically assesses the availability of buffers as a system load indicator.
  • System memory swap activity — Monitors the OS swap activity. Swap activity is periodically sampled and maintained over a sampling window to determine system swap impact.
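The sampling-window idea mentioned for queues and swap activity can be sketched as follows. This is a minimal illustration, assuming a fixed-size window and a simple average. The class name, window size, and sample values are hypothetical, and the real sampling interval and smoothing method used by System Monitor are not specified here.

```python
from collections import deque


class SamplingWindow:
    """Keep the most recent samples and report their average."""

    def __init__(self, size: int) -> None:
        if size <= 0:
            raise ValueError("size must be positive")
        # A deque with maxlen discards the oldest sample automatically.
        self._samples = deque(maxlen=size)

    def add(self, value: float) -> None:
        self._samples.append(value)

    def average(self) -> float:
        if not self._samples:
            return 0.0
        return sum(self._samples) / len(self._samples)


# Hypothetical periodic samples of queue fill level, in percent.
window = SamplingWindow(size=3)
for queue_fill_percent in (10, 20, 90, 100):
    window.add(queue_fill_percent)

# Only the last three samples (20, 90, 100) remain in the window.
windowed_level = window.average()
print(windowed_level)
```

Averaging over a window keeps a single momentary spike from being read as sustained load, while a run of high samples still raises the indicator.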

System Monitor maintains a node serviceability state based on conditions reported by other services. For example, Transaction Server reports critical slow disk I/O conditions to System Monitor. The serviceability state might be used by TRA-PROC advanced routing functions and is available through SNMP and print_blade_stats.py.

You can configure System Monitor by adding SNMP trap thresholds and elements to the sysmon_config.xml file. For more information about the configuration options, see the discussion about the sysmon_config.xml reference file in MATRIXX Monitoring and Logging.

System Monitor also lets you configure your own SNMP traps, based on usage thresholds, for the following:
  • General usage traps
  • Resource-specific general-usage traps
You configure these SNMP traps for monitored resources by specifying them in the appropriate <monitored-object> sections in the sysmon_config.xml file.
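To make the idea of threshold-based traps concrete, the sketch below reports which usage thresholds are crossed in the upward direction between two consecutive usage readings. It is an assumption-laden illustration only: the function name, the threshold values, and the rule that a trap fires on an upward crossing are hypothetical, and it does not reproduce the sysmon_config.xml schema or the actual trap behavior of System Monitor.

```python
def crossed_thresholds(previous: int, current: int, thresholds: list[int]) -> list[int]:
    """Return the thresholds crossed upward when usage moves from previous to current."""
    return [t for t in sorted(thresholds) if previous < t <= current]


# Hypothetical thresholds (percent) for one monitored resource.
traps = crossed_thresholds(previous=55, current=85, thresholds=[90, 60, 80])
print(traps)
```

In this example the usage rises from 55% to 85%, so the 60% and 80% thresholds are reported and the 90% threshold is not. A falling usage level reports nothing under this rule.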

For more information about configuring and starting System Monitor, see the discussion about configuring and enabling System Monitor in MATRIXX Monitoring and Logging.