Handling Event Repository Loading Issues
This topic provides information about troubleshooting and handling issues when loading event objects into the Event Repository.
You can query the number of event objects that failed to load into the Event Repository.
The Event Loader logs are contained in the mtx_debug.log file. The following example shows Event Loader messages for events that loaded successfully and for the associated MEF files being renamed and removed:
LM_INFO 21450|21463 2018-10-03 15:57:13.495272 [event_loader_1:2:1:1(5100.55831)] | EventLoaderWorkerTask::loadEventFromMefToSingleCollection: successfully loaded 183 events from transaction_1_1_2_1538607406_3.mef.gz into event repository
LM_INFO 21450|21463 2018-10-03 15:57:13.495379 [event_loader_1:2:1:1(5100.55831)] | EventLoaderWorkerTask::loadEventFromMefToSingleCollection: successfully renamed file: ../local_1_2_1/event_store_meta/transaction_1_1_2_1538607406_3.mef.gz to ../local_1_2_1/event_store_processed/transaction_1_1_2_1538607406_3.mef.gz.remove.1538607433
LM_INFO 21450|21467 2018-10-03 15:57:35.867956 [event_loader_1:2:1:1(5100.55831)] | EventLoaderWorkerTask::processEventFiles: successfully removed file: ../local_1_2_1/event_store_processed/transaction_1_1_2_1538607426_5.mef.gz.remove.1538607433
If a MEF file is not in compact MDC format, Event Loader logs an error similar to the following example:
LM_ERROR 3032|3043 2016-02-26 15:39:15.247628 [event_loader_1:2:4:1(4710.36950)] | EventLoaderWorkerTask::loadEventFromMefToSingleCollection: /mnt/mtx/shared_01/event_store_meta/transaction_1_1_2_1456529894_1.mef.gz file not in compact MDC format
A network issue can interrupt access to the shared storage device. If access to the shared disk hangs because of a network issue, after five minutes Event Loader logs the error shown in the following example. Check for network issues and correct them.
LM_ERROR 13555|13557 2016-08-31 15:52:01.663249 [event_loader_1:2:4:1(4751.39532)] | EventLoaderDispatcherTask::warningTriggerCallbackHandler: MtxEventLoader::EventLoaderDispatcherTask abort trigger, Step: EventLoaderDispatcherTask::processEventFiles::scandir, Timeout: 300083 msec
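If you suspect that access to the shared disk is hanging, one check is to time a directory scan of the shared event store from the host that runs Event Loader. The following Python sketch is illustrative only; the directory path is taken from the error examples above and might differ in your deployment.

import os
import time

# Example path from the log messages above; adjust for your deployment.
EVENT_STORE_DIR = "/mnt/mtx/shared_01/event_store_meta"

start = time.monotonic()
try:
    entries = list(os.scandir(EVENT_STORE_DIR))
    elapsed = time.monotonic() - start
    print(f"scandir returned {len(entries)} entries in {elapsed:.2f} seconds")
except OSError as exc:
    elapsed = time.monotonic() - start
    print(f"scandir failed after {elapsed:.2f} seconds: {exc}")

A scan that takes many seconds, or hangs outright, points to a storage or network problem rather than an Event Loader problem.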
When troubleshooting issues with loading events, you can view the log file for the publishing server and run print_blade_stats.py -E for the SNMP statistics.
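In addition to the SNMP statistics, you can scan mtx_debug.log directly for Event Loader errors. The following Python sketch is a rough helper, not a MATRIXX tool; it assumes the log file name and the single-line message layout shown in the examples above.

import re
from collections import Counter

# Example log file name; point this at the mtx_debug.log for your deployment.
LOG_FILE = "mtx_debug.log"

# LM_ERROR lines written by Event Loader tasks, as in the examples above.
error_re = re.compile(r"LM_ERROR .*EventLoader\w*Task::(\w+):")
# MEF file names mentioned anywhere in an error line.
mef_re = re.compile(r"(\S+\.mef\.gz)")

errors_by_step = Counter()
failed_files = set()

with open(LOG_FILE, encoding="utf-8", errors="replace") as log:
    for line in log:
        match = error_re.search(line)
        if not match:
            continue
        errors_by_step[match.group(1)] += 1
        failed_files.update(mef_re.findall(line))

print("Event Loader errors by step:", dict(errors_by_step))
print("MEF files mentioned in errors:", sorted(failed_files))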
MongoDB Server Failover Handling
If you configured a MongoDB server replica set for the Event Repository, failover handling is transparent.
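For example, a replica-set-aware client driver discovers the new primary automatically after an election, so no application-level failover logic is needed. The following Python (PyMongo) sketch illustrates this with placeholder host names, replica set name, and database and collection names; it is not part of the MATRIXX Event Loader, which uses the C++ driver (mongocxx).

from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

# Placeholder hosts, replica set name, and namespace; substitute your own values.
client = MongoClient(
    "mongodb://mongo1:27017,mongo2:27017,mongo3:27017/"
    "?replicaSet=rs0&retryWrites=true",
    serverSelectionTimeoutMS=30000,
)
events = client["event_repository"]["events"]

try:
    # Operations are routed to whichever member is currently primary; after a
    # failover, the driver reconnects to the new primary transparently.
    print("Event count:", events.estimated_document_count())
except ConnectionFailure as exc:
    print("No primary reachable yet, retry later:", exc)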
If a MongoDB primary (mongod) fails and has not propagated some event objects to its secondaries at the time its processing is transferred to the new primary, those event objects must be reloaded into the MongoDB database from MATRIXX Engine.
When a primary fails to propagate event objects to its secondaries during such a failover, Event Loader logs an error message and generates an SNMP trap. The following is an example of the Event Loader error message:
LM_ERROR 32022|32034 2016-07-27 12:26:24.223874 [event_loader_1:2:4:1(4750.39117)] | EventLoaderWorkerTask::handleInput: mongocxx::operation_exception.
After this error is logged, you must restart the publishing pod before the Event Loader removes the processed MEF files of the events that were not propagated.
The default delay time that Event Loader waits before removing processed MEF files is one hour. You can increase this delay time if you want more time to respond to such failovers by answering the create_config.info question: EventLoader:How long in seconds do you want to delay before event_loader can remove the processed MEF files?
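Until the delay expires, the processed MEF files remain in the event_store_processed directory with a .remove.<timestamp> suffix, as shown in the log examples earlier in this topic. If you want to confirm which processed MEF files are still available for reloading before you restart the publishing pod, a sketch such as the following can list them; the directory path is an example and depends on your deployment.

import glob
import os

# Example path from the log messages above; adjust for your deployment.
PROCESSED_DIR = "../local_1_2_1/event_store_processed"

pending = sorted(glob.glob(os.path.join(PROCESSED_DIR, "*.mef.gz.remove.*")))
for path in pending:
    print(path)
print(f"{len(pending)} processed MEF files are still pending removal")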
When you restart the publishing server, Event Loader reloads the processed MEF files into MongoDB at start-up. As a result, the event objects that were not propagated during the MongoDB failover are loaded into the Event Repository by the new primary.