Configuring SNMP Alarms for Log Files
When configuring SNMP alarms, you must specify which log files to monitor, which patterns to look for, and which notifications to send when a log entry matches a pattern.
Log File Monitoring shows the properties you can set to determine which log files are monitored.
Property Name | Description | Default Value |
---|---|---|
snmp-adapter.alarms.logFiles[0].fileName | The full path and name of the file to monitor. | |
snmp-adapter.alarms.logFiles[0].pollingFrequency | The interval between reads of the file. Postfix the value with an s to indicate seconds, m for minutes, or ms for milliseconds. | 5s |
snmp-adapter.alarms.logFiles[0].startFromEndOfFile | If the file already exists (with content) and if this value is true, then the SNMP adapter starts reading from the end of the file and ignores all previously written content. If this value is false, then the file is always read from the beginning. | false |
The following shows a sample log file alarm configuration in the YAML configuration file.
snmp-adapter:
  alarms:
    logFiles:
      - fileName: /var/log/mtx/mylog.log
        pollingFrequency: 5s
Notice that the properties include an array-style index. This allows multiple file monitoring configurations to be added, as in the sketch below.
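For example, the following sketch monitors two log files; the second file name is hypothetical and is shown only to illustrate the indexing:

snmp-adapter:
  alarms:
    logFiles:
      - fileName: /var/log/mtx/mylog.log
        pollingFrequency: 5s
      # Hypothetical second file, read only from the point at which monitoring starts
      - fileName: /var/log/mtx/another.log
        pollingFrequency: 10s
        startFromEndOfFile: true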
Matching Rules
The matching rules define regular expressions that are compared against each line of the monitored files. If a match is found, an SNMP notification is sent to the SNMP NMS. The matching rules determine which parameters are sent and how their values are formed. Matching Rules shows the properties you can set.
Property Name | Description | Default Value |
---|---|---|
snmp-adapter.alarms.matchingRules[0].pattern | The regular expression used to identify if a line in the log file is a match. This regular expression can define groups that are extracted for use in the variables sent in the SNMP notification (trap). | |
snmp-adapter.alarms.matchingRules[0].trapOid | The object identifier of the SNMP notification. | |
snmp-adapter.alarms.matchingRules[0].order | A number representing the order in which the matching rules are processed. Rules with lower numbers are evaluated first. The SNMP adapter sends an SNMP notification only for the first matching rule. | 100 |
snmp-adapter.alarms.matchingRules[0].variables[0].oid | The object identifier of the variable being sent in the SNMP notification. | |
snmp-adapter.alarms.matchingRules[0].variables[0].type | The data type of this variable. String and integer are supported. | string |
snmp-adapter.alarms.matchingRules[0].variables[0].template | The template used to construct the value for this variable. The template can reference the groups captured by the regular expression by using the group index prefixed with the $ symbol. | $1 |
The following shows sample matching rules in the YAML configuration file.
snmp-adapter:
  alarms:
    matchingRules:
      # Specific Error Trap
      - pattern: '^\[ERROR\s*\]\s+([0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{0,6})\s+\[.*\]\s+.*\s-\sA very specific situation has arisen because (.*)'
        trapOid: '1.3.6.1.4.1.35838.1.999.1.1'
        variables:
          # Extract the timestamp out of the log message which is captured as the first group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.1'
            type: 'string'
            template: '$1'
          # Extract the description out of the log message which is captured as the second group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.2'
            type: 'string'
            template: 'Error Message was: $2'
      # Generic Catch All Error Trap (ordered to be last)
      - pattern: '^\[ERROR\s*\]\s+([0-9]{4}-[0-9]{2}-[0-9]{2}\s[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{0,6})\s+\[.*\]\s+.*\s-\s(.*)'
        trapOid: '1.3.6.1.4.1.35838.1.999.1.1'
        order: 999
        variables:
          # Extract the timestamp out of the log message which is captured as the first group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.1'
            type: 'string'
            template: '$1'
          # Extract the description out of the log message which is captured as the second group of the regular expression
          - oid: '1.3.6.1.4.1.35838.1.999.2.2'
            type: 'string'
            template: '$2'
Notice that the properties include an array-style index at two levels. This allows multiple matching rules to be added, with each matching rule defining multiple variables.
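To see how the captured groups feed the variable templates, consider this hypothetical log line (invented for illustration):

[ERROR ] 2024-06-01 12:34:56.123456 [worker-1] com.example.Service - A very specific situation has arisen because the disk is full

The first rule matches, so the catch-all rule (order 999) is never evaluated for this line. Group $1 captures 2024-06-01 12:34:56.123456 and group $2 captures "the disk is full", so the notification with trap OID 1.3.6.1.4.1.35838.1.999.1.1 carries two string variables: 1.3.6.1.4.1.35838.1.999.2.1 = 2024-06-01 12:34:56.123456 and 1.3.6.1.4.1.35838.1.999.2.2 = Error Message was: the disk is full.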
Configuring SNMP Alarms for Queues and Buffers
You can configure alarm thresholds on all key buffers and queues, including MtxBuf, write buffers, the CAMEL Gateway input queue, and the queue from the CAMEL Gateway. Threshold breaches are reported through SNMP and to Prometheus. This feature is enabled by configuration. Specifically, an event monitoring system such as Prometheus can be configured to raise alarms based on the values of SNMP statistics, as illustrated later in this section.
Queues lists the queues and their descriptions.
Service | Queue | Description |
---|---|---|
camel | Analyzer:analyzer_task | The second queue in the CAMEL Gateway. |
camel | Logging:camel_logging_task | The third queue in the CAMEL Gateway. |
camel | Sccp:from_sccp_task | The first queue into the CAMEL Gateway from network_enabler using the network. |
camel | UnitData:unitdata_sender_task | A queue in the CAMEL Gateway. Messages are queued here before being sent to network_enabler. |
camel | blade_1_1_1_chrg.1.5.output | The queue from the CAMEL Gateway into the Charging Server. |
camel | blade_1_1_1_sysctrl.1.3.output | Inbound queue (shared memory) to CAMEL Gateway stats_task, which handles service state and cluster control messages. Messages typically sent from the Cluster Manager. |
charging | CHRG-Replay:replay_task | Inbound queue to replay_task of Charging Server, which throttles replay messages. |
charging | CHRG-Upgrade:upgrade_task | The priority queue into the Charging Server. Messages requiring database retry are queued here. |
charging | blade_1_1_1_chrg.1.1.input | The queue from the MDC Gateway to the Charging Server. |
charging | blade_1_1_1_chrg.1.2.input | The queue from Diameter Gateway to the Charging Server. |
charging | blade_1_1_1_chrg.1.3.input | The queue from subscriber_db_scan_task to the Charging Server. |
charging | blade_1_1_1_chrg.1.4.input | The queue from task_manager_handler_task back into the top of the Charging Server. |
charging | blade_1_1_1_chrg.1.5.input | The queue from the CAMEL Gateway to the Charging Server. |
charging | blade_1_1_1_chrg.1.6.input | The queue from the Cluster Manager to the Charging Server. |
charging | blade_1_1_1_chrg.1.7.input | The queue from replay_task to the top of the Charging Server. |
charging | blade_1_1_1_chrg.1.priority.taskMgrHandlerQueue | The queue from response_task to task_manager_handler_task. |
charging | blade_1_1_1_chrg.1.responseQueue | The queue from the Transaction Server to the Charging Server. |
charging | blade_1_1_1_chrg.1.retryQueue | The queue into retry_task. |
charging | blade_1_1_1_chrg.1.taskMgrHandlerQueue | The queue from the Task Manager to the Charging Server. |
diameter | Analyzer:analyzer_task | The second queue in the Diameter Gateway. |
diameter | Diameter-Worker:from_diameter_task | The first queue in Diameter Gateway, after incoming DIAMETER messages are decoded. |
diameter | DiameterIO:diameter_sender_task:01 | A queue in Diameter Gateway. Messages are queued here before being sent to network_enabler. |
diameter | Logging:diameter_logging_task | The third queue in Diameter Gateway. |
diameter | blade_1_1_1_chrg.1.2.output | The queue from Diameter Gateway into the Charging Server. |
diameter | blade_1_1_1_sysctrl.1.2.output | Inbound queue (shared memory) to diameter stats_task, which handles service state and cluster control messages. Messages typically sent from the Cluster Manager. |
mdc | Network-Worker:from_network_task | Inbound queue (local) to from_network_task of the MDC Gateway. |
mdc | NetworkIO:network_sender_task:01 | Inbound queue (local) to network_sender_task; each instance serializes to a configured endpoint. See the configuration file for task number to destination details. |
mdc | NetworkIO:network_sender_task:02 | Inbound queue (local) to network_sender_task; each instance serializes to a configured endpoint. See the configuration file for task number to destination details. |
mdc | NetworkIO:network_sender_task:03 | Inbound queue (local) to network_sender_task; each instance serializes to a configured endpoint. See the configuration file for task number to destination details. |
mdc | blade_1_1_1_chrg.1.1.output | Inbound queue (shared memory) for to_network_task. This is the main queue that feeds outbound MDCs from the Charging Server. |
mdc | blade_1_1_1_gw.1.input | Inbound queue (shared memory) to the MDC Gateway stats task, which handles service state and cluster control messages. Messages typically sent from the Cluster Manager. |
task | blade_1_1_1_TaskMgrResponse.1.input | The queue from task_manager_handler_task in Charging Server to response_task in the Task Manager. |
task | blade_1_1_1_chrg.1.3.output | The queue from Charging Server to response_task in the Task Manager. |
task | blade_1_1_1_task.1.input | The queue to the Task Manager. |
transaction | blade_1_1_1_txn.1.1.input | The queue from Charging Server to the Transaction Server. |
transaction | blade_1_1_1_txn.1.4.input | The queue from checkpoint_manager_task to transaction_manager_task in the Transaction Server. |
transaction | blade_1_1_1_txn.1.5.input | The queue from transaction_log_reader_task to transaction_manager_task in the Transaction Server. |
transaction | blade_1_1_1_txn.1.6.input | The queue from p2p_sender_receiver_task to transaction_manager_task in the Transaction Server. |
transaction | blade_1_1_1_txn.1.controlChannelQueue | The queue to control_channel_task in the Transaction Server. |
transaction | blade_1_1_1_txn.1.indexOrganizerQueue | The queue from index_organizer_driver_task to index_organizer_task in the Transaction Server. |
transaction | blade_1_1_1_txn.1.priority.input | The priority queue to the Transaction Server. |
Queue names such as blade_1_1_1_chrg.1.5.input use the syntax blade_<engine ID>_<cluster ID>_<blade ID>. Queue statistics are reported as MATRIXX-COMMON-MIB::sysQueueStats<Stat Name>.<Service Name>.<Queue ID>. For example:

MATRIXX-COMMON-MIB::sysQueueStatsQueueName.charging.7 = blade_1_1_1_chrg.1.5.input

This means that the queue with service name charging and queue ID 7 has the name blade_1_1_1_chrg.1.5.input, the queue from the CAMEL Gateway to the Charging Server. The full count of each queue is given by MATRIXX-COMMON-MIB::sysQueueStatsFullCount.<Service Name>.<Queue ID>. For example:

MATRIXX-COMMON-MIB::sysQueueStatsFullCount.charging.7 = 57

For each queue, the system monitor can be configured to monitor the stats full count and log an alarm if this value increases.
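As one possible sketch, if these SNMP statistics are scraped into Prometheus (for example, through an SNMP exporter that names metrics after the MIB objects), an alerting rule along the following lines could flag a growing full count. The metric and label names below are assumptions about the exporter mapping, not part of the product:

groups:
  - name: mtx-queue-alarms
    rules:
      - alert: QueueFullCountIncreasing
        # Assumed metric name mapped from MATRIXX-COMMON-MIB::sysQueueStatsFullCount;
        # adjust the name and labels to match your exporter configuration.
        expr: increase(sysQueueStatsFullCount[10m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: 'Queue full count is increasing on {{ $labels.instance }}'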
Buffer pool statistics are reported under MATRIXX-MIB::sysBufferPoolStats. The name of each buffer pool is given by MATRIXX-MIB::sysBufferPoolStatsPoolName.<pool ID>. For example:

MATRIXX-MIB::sysBufferPoolStatsPoolName.6 = blade_1_1_1_shared_buffer_pool_large_6_freequeue

So pool ID 6 is the free queue of large shared buffers. The number of buffers in the pool is given by MATRIXX-MIB::sysBufferPoolStatsCurrentCount.<pool ID>. For example:

MATRIXX-MIB::sysBufferPoolStatsCurrentCount.6 = 32683

means that there are 32683 free large buffers. You can configure the system monitoring tool to log an alarm when the available large buffers are nearly exhausted, for example when the value of this statistic drops below 5000.
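Under the same assumptions as the queue sketch above (SNMP statistics exported to Prometheus under metric names matching the MIB objects; the pool index label name is also an assumption), a threshold alarm for this could look like:

groups:
  - name: mtx-buffer-alarms
    rules:
      - alert: LargeBufferPoolNearlyExhausted
        # Assumed metric and label names mapped from
        # MATRIXX-MIB::sysBufferPoolStatsCurrentCount; pool ID 6 is the
        # free queue of large buffers in the example above.
        expr: sysBufferPoolStatsCurrentCount{poolId="6"} < 5000
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: 'Large buffer pool nearly exhausted: {{ $value }} free buffers'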