Using a Generic SNMP Trap

Any process can send a generic SNMP trap, not only the standard Process Controller, Cluster Manager, SNMP agent, or TRA. The trap carries error information, with specific alerts that relate back to specific error numbers. Use the sysGenericErrorMessage SNMP trap to send a system-level alert with the message text in the payload.

Generic Trap Locations

Configure the trap generation period in mtx_config.xml:

<snmp_agent> <trap_generate_period_msec>10000</trap_generate_period_msec>

Change this value to control how frequently a trap can be generated; the default is 10000 milliseconds (10 seconds). A trap of the same type is generated at most once in this period.
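The once-per-period behavior can be pictured with a small Python sketch. This is an illustration only, not the MATRIXX implementation: the class name `TrapThrottle`, the use of a string key to identify the trap type, and the choice to measure the period from the last trap that was actually sent are all assumptions.

```python
import time


class TrapThrottle:
    """Hypothetical helper: allow at most one trap per trap type per period."""

    def __init__(self, period_msec=10000, clock=time.monotonic):
        # period_msec mirrors trap_generate_period_msec (default 10000 ms).
        self.period_sec = period_msec / 1000.0
        self._clock = clock
        self._last_sent = {}  # trap type -> time the last trap of that type was sent

    def should_send(self, trap_type):
        """Return True if a trap of this type may be generated now."""
        now = self._clock()
        last = self._last_sent.get(trap_type)
        if last is not None and now - last < self.period_sec:
            return False  # a trap of the same type was already sent in this period
        self._last_sent[trap_type] = now
        return True
```

With the default period, a second call to `should_send` for the same trap type within 10 seconds returns `False`, while a different trap type is not affected.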

For more information about mtx_config.xml, see the discussion about MATRIXX configuration specification (mtx_config.xml) in MATRIXX Installation and Upgrade.
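To check which period is in effect, the value can be read from the configuration with a few lines of Python. This is a minimal sketch under stated assumptions: the sample XML wraps the setting in a closed `snmp_agent` element, and the function name `trap_period_msec` is hypothetical. The real mtx_config.xml may nest the element differently, so the lookup searches for the element name anywhere below the root.

```python
import xml.etree.ElementTree as ET

# Assumed, simplified fragment; the real file layout may differ.
SAMPLE_CONFIG = """
<snmp_agent>
  <trap_generate_period_msec>10000</trap_generate_period_msec>
</snmp_agent>
"""


def trap_period_msec(xml_text, default=10000):
    """Return trap_generate_period_msec from the XML text, or the default."""
    root = ET.fromstring(xml_text)
    # Search all descendants so the nesting depth does not matter.
    node = root.find(".//trap_generate_period_msec")
    if node is None or node.text is None or not node.text.strip():
        return default  # setting is absent or empty: fall back to 10 seconds
    return int(node.text.strip())


if __name__ == "__main__":
    print(trap_period_msec(SAMPLE_CONFIG))
```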

SNMP uses the generic trap locations listed in Table 1.

Table 1. Generic Trap Locations
Component/Module	Task::Function	Message	Action
MtxChrg	AbortThreadData.onThreadTimeout	Thread `linuxThreadId_` has exceeded `threadQuarantineTimeoutInMillis_` ms timeout while processing message. Placing server into quarantine. Example: `2022-06-10 23:46:51.213401 FROM localhost: ------------------------------------------ DISMAN-EVENT-MIB::sysUpTimeInstance = 0:0:01:28.69 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage MATRIXX-COMMON-MIB::sysGenericErrorText = b'Thread 23953 has exceeded 2000ms timeout while processing message. Placing blade into quarantine.'`	Check messages/OIDs in the log on the pod for latency issues.
MtxChrg	AbortThreadData.onThreadTimeout	Quarantining thread `linuxThreadId_` would exceed server limit of `threadQuarantineLimit_` quarantined threads. Terminating server.	Check messages/OIDs in the log on the pod for latency issues and check system health.
MtxEventLoader	EventLoaderDispatcherTask::checkIdleGtcTimeouts	Have not received a GTC in the last `idleGtcErrorTimeout_.count()` minutes. Example: 2022-04-21 13:00:09.655332 FROM localhost: ------------------------------------------ DISMAN-EVENT-MIB::sysUpTimeInstance = 0:23:36:24.02 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage MATRIXX-COMMON-MIB::sysGenericErrorText = b'Have not received a GTC in the last 5 minutes.' 2022-04-21 13:00:10.808405 FROM localhost: ------------------------------------------ DISMAN-EVENT-MIB::sysUpTimeInstance = 0:23:36:25.18 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysProcessingErrorAlert MATRIXX-COMMON-MIB::sysProcessingErrors = 283	Check system health.
MtxEventLoader	EventLoaderDispatcherTask::dispatcherLoop	Failed to read Event Repository for missing GTC ranges. This can happen when a publishing pod becomes active. The Dispatcher reads the LoaderTraceCollection for any gaps to fill.	Check MongoDB.
MtxStream	MefV2GeneratorTask::publishMefv2FilesToTarget	Could not publish event files: `::strerror(savedErrno)` `savedErrno` and Could not publish event files. Exit status= `publishCommand.getExitStatus()`. Example: `DISMAN-EVENT-MIB::sysUpTimeInstance = 0:0:02:22.45 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage MATRIXX-COMMON-MIB::sysGenericErrorText = b'Could not publish event files. Exit status=255'`	Check the publishing target.
	MefV2GeneratorTask::createPublishedMefList	MEFv2 event recovery. Could not execute create_published_mef_list.py on publish target: `::strerror(savedErrno)` `savedErrno` and MEFv2 event recovery. Could not execute create_published_mef_list.py on publish target `publishTargetHostName_`. Error due to `errString`.	Requires manual MEFv2 recovery.
	MefV2GeneratorTask::pubTriggerCallbackHandler	Mef V2 Publisher did not make any progress for `kPubMonitorTimeoutMillis` milliseconds.	Check system health.
MtxTrafficMgr	CmpLeaderNodePool::getNextSvcStateOnNodeUp	duplicate CMP `str` nodes, count=`count`; fqn=`FQN`. Example: `DISMAN-EVENT-MIB::sysUpTimeInstance = 0:0:04:01.45 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage MATRIXX-COMMON-MIB::sysGenericErrorText = b'duplicate CMP leader nodes, count=2; fqn=poolLeader'`	Restart the previous active publishing pod.
MtxTxn	CheckpointWriterTask::writeCheckpoint	The checkpointing server is out-of-sync with the last system snapshot. Please check for other errors to determine why. A duplicate Checkpoint was created for GTC= `prevCkptGtc_`. Example: `2022-06-03 19:46:49.667491 FROM localhost: ------------------------------------------ DISMAN-EVENT-MIB::sysUpTimeInstance = 0:4:30:19.70 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage MATRIXX-COMMON-MIB::sysGenericErrorText = b'The checkpointing server is out-of-sync with the last system snapshot. Please check for other errors (in this log?) to determine why. A duplicate Checkpoint was created for GTC=7059770'`	Check system health.
	TransactionManagerTask::resolvePendingTransactionIfAny	Number of retries to resolve transaction ID `txnID`, GTC=`txnCtxP->getGlobalTxnCounter()` reaches maximum value `resolveTxnMaxRetries_`.	Restart the pod.
	TransactionManagerTask::handleSharedStorageEvent	Failed to execute nfs unmount from Standby server= `myBladeId`. Note: Please unmount nfs and mount shared storage manually.	Unmount NFS and mount shared storage.
	TransactionManagerTask::handleSharedStorageEvent	Failed to mount the shared storage even after fsck on Active publishing server= `myBladeId`. Note: Please manually mount the shared storage.	Mount shared storage.
	TransactionSortedLoggingTask::logWriteBufferAbrtCbHandler	TransactionSortedLoggingTask::logWriteBufferAbrtCbHandler:`atPtr->getStepString()`, Step: `atPtr->getStepString()`. Timeout: `timeoutMs` msec.	Restart the publishing cluster.
	TransactionSortedLoggingTask::diskWriteAbrtCbHandler	TransactionSortedLoggingTask::diskWriteAbrtCbHandler: `atPtr->getStepString()`, Step: `atPtr->getStepString()`. Timeout: `timeoutMs` msec.	Restart the engine.
	TransactionStreamTask::peerClusterHaStateUpdated	Got Publishing cluster `cl::name(toClusterHaState)` state, aborting transaction stream. To start transaction stream need to restart the publishing cluster. Note: This can happen during high load when LogWriteBuffer is not available. Example: `2022-06-03 13:36:20.172439 FROM localhost: ------------------------------------------ DISMAN-EVENT-MIB::sysUpTimeInstance = 0:20:57:18.36 SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage MATRIXX-COMMON-MIB::sysGenericErrorText = b'Got Publishing cluster FAILED state, aborting transaction stream. Note: To start transaction stream need to restart the publishing cluster.'`
	TransactionStreamTask::handleTxnStreamClusterStateMsg	Got HA peer engine= `haPeerEcbId` `cl::name(clusterState)` state, aborting transaction stream. To start transaction stream need to restart the engine. Note: This means the sorted transaction log writing to the local disk is slow. Verify if any non-MATRIXX processes are writing to disk.
	TransactionManagerTask::coordinatorCommit	Fatal error in committing transaction, NACK this transaction. Note: Only when the server is not shut down.	Start the other server, engine, or cluster before restarting this server.
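When triaging received traps, it can help to pull the message text out of the trap receiver output and match it against the table above. The following Python sketch assumes the output format shown in the examples (a `sysGenericErrorText = b'...'` field). The function names and the prefix-to-action mapping are hypothetical, and the mapping covers only three rows of the table.

```python
import re

# Matches the payload of sysGenericErrorText as it appears in the examples above.
# The closing quote must be followed by whitespace or the end of the text, so an
# apostrophe inside a message does not end the match early.
_ERROR_TEXT = re.compile(r"sysGenericErrorText\s*=\s*b'(.*?)'(?=\s|$)", re.DOTALL)

# Illustrative subset of Table 1: message prefix -> documented action.
_ACTIONS = {
    "Have not received a GTC": "Check system health.",
    "Could not publish event files": "Check the publishing target.",
    "duplicate CMP": "Restart the previous active publishing pod.",
}


def generic_error_texts(trap_output):
    """Return every sysGenericErrorText payload found in the trap output."""
    return _ERROR_TEXT.findall(trap_output)


def suggested_action(message):
    """Return the documented action for a known message prefix, else None."""
    for prefix, action in _ACTIONS.items():
        if message.startswith(prefix):
            return action
    return None


if __name__ == "__main__":
    sample = (
        "SNMPv2-MIB::snmpTrapOID = MATRIXX-COMMON-MIB::sysGenericErrorMessage "
        "MATRIXX-COMMON-MIB::sysGenericErrorText = "
        "b'Could not publish event files. Exit status=255'"
    )
    for text in generic_error_texts(sample):
        print(text, "->", suggested_action(text))
```

Traps other than sysGenericErrorMessage, such as sysProcessingErrorAlert, carry no sysGenericErrorText field and are skipped by this sketch.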