Configuring SNMP Alarms for Log Files

When configuring SNMP alarms you must specify which log files to monitor, what patterns to look for, and what types of notifications to send when an event matches a log entry.

Table 1 (Log File Monitoring) shows the properties you can set to determine which log files in the configuration are monitored.

Table 1. Log File Monitoring
snmp-adapter.alarms.logFiles[0].fileName
    The full path and name of the file to monitor. (No default.)
snmp-adapter.alarms.logFiles[0].pollingFrequency
    The interval between reads of the file. Suffix the value with s to indicate seconds, m for minutes, or ms for milliseconds. Default: 5s
snmp-adapter.alarms.logFiles[0].startFromEndOfFile
    If the file already exists with content and this value is true, the SNMP adapter starts reading from the end of the file and ignores all previously written content. If this value is false, the file is always read from the beginning. Default: false

The following shows a sample alarm log file configuration in the YAML configuration file.

snmp-adapter:
  alarms:
    logFiles:
      - fileName: /var/log/mtx/mylog.log
        pollingFrequency: 5s

Notice that the properties include an array-style index ([0]). This allows multiple file monitoring configurations to be added.
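For example, a second file can be monitored by adding an entry at the next index. The file paths and settings below are illustrative, not part of a shipped configuration:

```yaml
snmp-adapter:
  alarms:
    logFiles:
      # First monitored file (index 0)
      - fileName: /var/log/mtx/mylog.log
        pollingFrequency: 5s
      # Second monitored file (index 1); path and values are illustrative
      - fileName: /var/log/mtx/another.log
        pollingFrequency: 10s
        startFromEndOfFile: true
```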

Matching Rules

The matching rules define regular expressions that are compared against each line of the monitored files. If a match is found, an SNMP notification is sent to the SNMP NMS. The matching rules determine which parameters are sent and how they are formed. Table 2 (Matching Rules) shows the properties you can set.

Table 2. Matching Rules
snmp-adapter.alarms.matchingRules[0].pattern
    The regular expression used to identify whether a line in the log file is a match. The regular expression can define groups that are extracted for use in the variables sent in the SNMP notification (trap). (No default.)
snmp-adapter.alarms.matchingRules[0].trapOid
    The object identifier of the SNMP notification. (No default.)
snmp-adapter.alarms.matchingRules[0].order
    A number representing the order in which the matching rules are processed. The lower the number, the earlier the rule is evaluated. The SNMP adapter sends an SNMP notification only for the first match. Default: 100
snmp-adapter.alarms.matchingRules[0].variables[0].oid
    The object identifier of the variable being sent in the SNMP notification. (No default.)
snmp-adapter.alarms.matchingRules[0].variables[0].type
    The data type of this variable. String and integer are supported. Default: string
snmp-adapter.alarms.matchingRules[0].variables[0].template
    The template used to construct the value for this variable. The template can reference the groups captured by the regular expression by using the group index prefixed with the $ symbol. Default: $1

The following shows sample matching rules in the YAML configuration file.

snmp-adapter:
  alarms:
    matchingRules:

      # Specific Error Trap
      - pattern: '^\[ERROR\s*\]\s+([0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{0,6})\s+\[.*\]\s+.*\s-\sA very specific situation has arisen because (.*)'
        trapOid: '1.3.6.1.4.1.35838.1.999.1.1'
        variables:

          # Extract the timestamp out of the log message which is captured as the first group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.1'
            type: 'string'
            template: '$1'

          # Extract the description out of the log message which is captured as the second group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.2'
            type: 'string'
            template: 'Error Message was: $2'

      # Generic Catch All Error Trap (ordered to be last)
      - pattern: '^\[ERROR\s*\]\s+([0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{0,6})\s+\[.*\]\s+.*\s-\s(.*)'
        trapOid: '1.3.6.1.4.1.35838.1.999.1.1'
        order: 999
        variables:

          # Extract the timestamp out of the log message which is captured as the first group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.1'
            type: 'string'
            template: '$1'

          # Extract the description out of the log message which is captured as the second group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.2'
            type: 'string'
            template: '$2'

Notice that the properties include an array-style index at two points. This allows multiple matching rules to be added, with each matching rule defining multiple variables.
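To illustrate how the capture groups feed the variable templates, the following Python sketch applies the first rule's pattern to a hypothetical log line (the log line itself is an assumption about the log format, not taken from a real system):

```python
import re

# The first matching rule's pattern, copied from the YAML sample above.
pattern = (r'^\[ERROR\s*\]\s+'
           r'([0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{0,6})'
           r'\s+\[.*\]\s+.*\s-\s'
           r'A very specific situation has arisen because (.*)')

# A hypothetical log line in the format the pattern expects.
line = ('[ERROR  ] 2024-01-15 12:34:56.123456 [worker-1] '
        'com.example.Service - A very specific situation has arisen '
        'because the disk is full')

m = re.match(pattern, line)
# Group 1 supplies the value for template '$1' (the timestamp variable);
# group 2 fills the template 'Error Message was: $2' (the description).
timestamp = m.group(1)
description = 'Error Message was: ' + m.group(2)
print(timestamp)     # → 2024-01-15 12:34:56.123456
print(description)   # → Error Message was: the disk is full
```

These two strings are what the SNMP adapter would place in the notification variables with OIDs ...2.1 and ...2.2 in the sample above.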

Configuring SNMP Alarms for Queues and Buffers

You can configure alarm thresholds on all key buffers and queues, including MtxBuf, write buffers, the CAMEL Gateway input queue, and the queue from the CAMEL Gateway. Threshold breaches are reported through SNMP and to Prometheus. This feature is enabled by configuration: an event monitoring system such as Prometheus can be configured to raise alarms based on the value of SNMP statistics.

Table 3 (Queues) lists the queues and their descriptions.

Table 3. Queues
Service Queue Description
camel Analyzer:analyzer_task The second queue in the CAMEL Gateway.
camel Logging:camel_logging_task The third queue in the CAMEL Gateway.
camel Sccp:from_sccp_task The first queue into the CAMEL Gateway from network_enabler using the network.
camel UnitData:unitdata_sender_task A queue in the CAMEL Gateway. Messages are queued here before being sent to network_enabler.
camel blade_1_1_1_chrg.1.5.output The queue from the CAMEL Gateway into the Charging Server.
camel blade_1_1_1_sysctrl.1.3.output Inbound queue (shared memory) to CAMEL Gateway stats_task, which handles service state and cluster control messages. Messages typically sent from the Cluster Manager.
charging CHRG-Replay:replay_task Inbound queue to replay_task of Charging Server, which throttles replay messages.
charging CHRG-Upgrade:upgrade_task The priority queue into the Charging Server. Messages requiring database retry are queued here.
charging blade_1_1_1_chrg.1.1.input The queue from the MDC Gateway to the Charging Server.
charging blade_1_1_1_chrg.1.2.input The queue from Diameter Gateway to the Charging Server.
charging blade_1_1_1_chrg.1.3.input The queue from subscriber_db_scan_task to the Charging Server.
charging blade_1_1_1_chrg.1.4.input The queue from task_manager_handler_task back into the top of the Charging Server.
charging blade_1_1_1_chrg.1.5.input The queue from the CAMEL Gateway to the Charging Server.
charging blade_1_1_1_chrg.1.6.input The queue from the Cluster Manager to the Charging Server.
charging blade_1_1_1_chrg.1.7.input The queue from replay_task to the top of the Charging Server.
charging blade_1_1_1_chrg.1.priority.taskMgrHandlerQueue The queue from response_task to task_manager_handler_task.
charging blade_1_1_1_chrg.1.responseQueue The queue from the Transaction Server to the Charging Server.
charging blade_1_1_1_chrg.1.retryQueue The queue into retry_task.
charging blade_1_1_1_chrg.1.taskMgrHandlerQueue The queue from the Task Manager to the Charging Server.
diameter Analyzer:analyzer_task The second queue in the Diameter Gateway.
diameter Diameter-Worker:from_diameter_task The first queue in Diameter Gateway, after incoming DIAMETER messages are decoded.
diameter DiameterIO:diameter_sender_task:01 A queue in Diameter Gateway. Messages are queued here before being sent to network_enabler.
diameter Logging:diameter_logging_task The third queue in Diameter Gateway.
diameter blade_1_1_1_chrg.1.2.output The queue from Diameter Gateway into the Charging Server.
diameter blade_1_1_1_sysctrl.1.2.output Inbound queue (shared memory) to diameter stats_task, which handles service state and cluster control messages. Messages typically sent from the Cluster Manager.
mdc Network-Worker:from_network_task Inbound queue (local) to from_network_task of the MDC Gateway.
mdc NetworkIO:network_sender_task:01 Inbound queue (local) to network_sender_task, each instance serializes to configured endpoint. See configuration file for task number to destination details.
mdc NetworkIO:network_sender_task:02 Inbound queue (local) to network_sender_task, each instance serializes to configured endpoint. See configuration file for task number to destination details.
mdc NetworkIO:network_sender_task:03 Inbound queue (local) to network_sender_task, each instance serializes to configured endpoint. See configuration file for task number to destination details.
mdc blade_1_1_1_chrg.1.1.output Inbound queue (shared memory) for to_network_task. This is the main queue that feeds outbound MDCs from the Charging Server.
mdc blade_1_1_1_gw.1.input Inbound queue (shared memory) to the MDC Gateway stats task, which handles services state and cluster control messages. Messages typically sent from Cluster Manager.
task blade_1_1_1_TaskMgrResponse.1.input The queue from task_manager_handler_task in Charging Server to response_task in the Task Manager.
task blade_1_1_1_chrg.1.3.output The queue from Charging Server to response_task in the Task Manager.
task blade_1_1_1_task.1.input The queue to the Task Manager.
transaction blade_1_1_1_txn.1.1.input The queue from Charging Server to the Transaction Server.
transaction blade_1_1_1_txn.1.4.input The queue from checkpoint_manager_task to transaction_manager_task in the Transaction Server.
transaction blade_1_1_1_txn.1.5.input The queue from transaction_log_reader_task to transaction_manager_task in the Transaction Server.
transaction blade_1_1_1_txn.1.6.input The queue from p2p_sender_receiver_task to transaction_manager_task in the Transaction Server.
transaction blade_1_1_1_txn.1.controlChannelQueue The queue to control_channel_task in the Transaction Server.
transaction blade_1_1_1_txn.1.indexOrganizerQueue The queue from index_organizer_driver_task to index_organizer_task in the Transaction Server.
transaction blade_1_1_1_txn.1.priority.input The priority queue to the Transaction Server.
Note: All queues prefixed above by blade_1_1_1 use the syntax blade_<engine ID>_<cluster ID>_<blade ID>.
All SNMP statistic OIDs for service queue statistics take the form MATRIXX-COMMON-MIB::sysQueueStats<Stat Name>.<Service Name>.<Queue ID>. For example:
MATRIXX-COMMON-MIB::sysQueueStatsQueueName.charging.7 = blade_1_1_1_chrg.1.5.input

This means that the queue with service name charging and queue ID 7 has the name blade_1_1_1_chrg.1.5.input, which is the queue from the CAMEL Gateway to the Charging Server.

To find the number of times a queue has become full, see MATRIXX-COMMON-MIB::sysQueueStatsFullCount.<Service Name>.<Queue ID>. For example:
MATRIXX-COMMON-MIB::sysQueueStatsFullCount.charging.7 = 57
For each queue, the system monitor can be configured to watch the full count statistic and log an alarm if this count increases.
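As a sketch of such monitoring, assuming the SNMP statistics are scraped into Prometheus (for example via snmp_exporter) and exposed as a metric named sysQueueStatsFullCount with service and queue labels (all metric and label names here are illustrative assumptions, not a documented integration), a Prometheus alerting rule might look like this:

```yaml
groups:
  - name: matrixx-queue-alarms
    rules:
      # Fire when the full count for any charging queue increased in the
      # last 5 minutes; metric and label names are assumptions.
      - alert: ChargingQueueFull
        expr: increase(sysQueueStatsFullCount{service="charging"}[5m]) > 0
        labels:
          severity: warning
        annotations:
          summary: 'Queue {{ $labels.queue }} reported full events'
```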
The SNMP statistics for buffers are all prefixed with MATRIXX-MIB::sysBufferPoolStats. The name of each buffer pool is given by MATRIXX-MIB::sysBufferPoolStatsPoolName.<pool ID>. For example:
MATRIXX-MIB::sysBufferPoolStatsPoolName.6 = blade_1_1_1_shared_buffer_pool_large_6_freequeue
Here, pool ID 6 corresponds to the free queue of the large shared buffer pool.

The number of buffers in the pool is given by MATRIXX-MIB::sysBufferPoolStatsCurrentCount.<pool ID>. For example, MATRIXX-MIB::sysBufferPoolStatsCurrentCount.6 = 32683 means that there are 32683 free large buffers. You can configure the system monitoring tool to log an alarm when the available large buffers are nearly exhausted, for example when the value of this statistic drops below 5000.
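A low-free-buffers alarm can be sketched the same way, under the assumption that the SNMP statistics are scraped into Prometheus and exposed as a metric named sysBufferPoolStatsCurrentCount with a pool_id label (names are illustrative, not a documented integration):

```yaml
groups:
  - name: matrixx-buffer-alarms
    rules:
      # Fire when the free buffer count for pool 6 stays below 5000 for
      # five minutes; metric and label names are assumptions.
      - alert: LargeBufferPoolLow
        expr: sysBufferPoolStatsCurrentCount{pool_id="6"} < 5000
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Large buffer pool nearly exhausted ({{ $value }} free)'
```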