stream_events Directory and File Format

The /stream_events directory contains the directories and files that the Event Stream Server uses for event file processing. Event files pass through several states as they are processed and end up in the Streamed Event File (SEF) format. SEF files are streamed as events and used for MATRIXX Event File (MEF) publishing and event loading.

The /stream_events directory exists in MATRIXX Engine shared storage in /mnt/mtx/shared.

The /stream_events directory contains a directory tree organized by ranges of Global Transaction Counters (GTCs). It includes a /cur_events sub-directory, which holds the file that the Event Stream Server is currently writing to. The names of the other sub-directories and of the files follow the naming syntax shown below: each consists of a prefix and three numbers. Filenames also have a .gz suffix.

The naming syntax is:
prefix_MinTxnLogFileTime_MinGtc_MaxGtc
Where:
  • prefix — The event file directory and filename prefixes reflect the state of the event file and change as events are collected into files, processed, and archived:
    • /cur_events — Events are first streamed to this directory in a file with the mtx_ prefix. When the mtx_ file reaches one million GTCs, the file is renamed with an evt_ prefix and moved to a directory with the dir_ prefix.
    • /dir_ — The evt_ files are moved to this directory as they are created. When this directory reaches 500 evt_ files, it is renamed with the rdy_ prefix.
    • /rdy_ — Files in this directory are copied to shared storage in MATRIXX Engine. As each evt_ file is archived, its prefix changes to sef_. After all files in the rdy_ directory are copied to shared storage, the directory is renamed with a sef_ prefix.
    • /sef_ — This directory has sef_ files that have been archived to shared storage. After all files in a /sef_ directory are copied, the directory is renamed with this syntax: sef_MinTxnLogFileTime_MinGtc_MaxGtc.
  • MinTxnLogFileTime — A system-defined time used by the Event Streaming Framework for failure recovery.
  • MinGtc — The minimum GTC of the included events.
  • MaxGtc — The maximum GTC of the included events.
Example directory name:
sef_15623452525_1001_4000
Example filename:
sef_15623452525_1001_4000.gz
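
The naming syntax above can be split into its four fields with a short script. The following is a minimal sketch, not part of the MATRIXX tooling: the function name parse_event_name, the EventName type, and the regular expression are all hypothetical, and the sketch assumes that prefixes are lowercase letters and that the three numbers are plain decimal digits.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern for prefix_MinTxnLogFileTime_MinGtc_MaxGtc with an optional .gz suffix.
NAME_RE = re.compile(
    r"^(?P<prefix>[a-z]+)_(?P<time>\d+)_(?P<min_gtc>\d+)_(?P<max_gtc>\d+)(?P<gz>\.gz)?$"
)


class EventName(NamedTuple):
    """Fields of a directory name or filename (hypothetical helper type)."""
    prefix: str
    min_txn_log_file_time: int
    min_gtc: int
    max_gtc: int
    is_file: bool  # True when the name ends in .gz


def parse_event_name(name: str) -> Optional[EventName]:
    """Return the parsed fields, or None if the name does not match the syntax."""
    match = NAME_RE.match(name)
    if match is None:
        return None
    return EventName(
        prefix=match.group("prefix"),
        min_txn_log_file_time=int(match.group("time")),
        min_gtc=int(match.group("min_gtc")),
        max_gtc=int(match.group("max_gtc")),
        is_file=match.group("gz") is not None,
    )


if __name__ == "__main__":
    print(parse_event_name("sef_15623452525_1001_4000.gz"))
    print(parse_event_name("sef_15623452525_1001_4000"))
```

Names that do not follow the syntax, such as cur_events, return None instead of raising an error, so the function can be applied to every entry in a directory listing.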

The following is an example of a set of files in the /stream_events directory:

[user@hostname streamServerInstance]$ ls local_1_2_1/stream_events/*
local_1_2_1/stream_events/cur_events:
evt_1537400637_12001_13618.gz

local_1_2_1/stream_events/sef_1537400533_1_12000:
sef_1537400533_1_3000.gz  sef_1537400533_3001_6000.gz  sef_1537400533_6001_9000.gz  sef_1537400533_9001_12000.gz
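
In the listing above, the range in the sef_ directory name (1 to 12000) matches the lowest MinGtc and highest MaxGtc of the files it contains. The following sketch checks that relationship for a list of filenames. The function name covered_range is hypothetical, and the sketch assumes that every name has exactly four underscore-separated fields.

```python
from typing import Iterable, Tuple


def covered_range(filenames: Iterable[str]) -> Tuple[int, int]:
    """Return (lowest MinGtc, highest MaxGtc) across the given names.

    Raises ValueError if the list is empty or a name does not have
    exactly four underscore-separated fields.
    """
    mins, maxs = [], []
    for name in filenames:
        # Drop the .gz suffix if present, then split prefix_time_min_max.
        stem = name[:-3] if name.endswith(".gz") else name
        _prefix, _time, min_gtc, max_gtc = stem.split("_")
        mins.append(int(min_gtc))
        maxs.append(int(max_gtc))
    return min(mins), max(maxs)


if __name__ == "__main__":
    files = [
        "sef_1537400533_1_3000.gz",
        "sef_1537400533_3001_6000.gz",
        "sef_1537400533_6001_9000.gz",
        "sef_1537400533_9001_12000.gz",
    ]
    # Compare against the range in the directory name sef_1537400533_1_12000.
    print(covered_range(files) == covered_range(["sef_1537400533_1_12000"]))
```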

By default, the Event Streaming Framework removes internal event files from shared storage after 24 hours. During those 24 hours, if the Event Streaming Framework finds small gaps of missing events in SEF files, it tries to find those events and fill the gaps. After 24 hours, you must manually fill any gaps in events using create_event_files_from_txn_logs.py. For details, see the discussion about recovering SEF files.
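
To see whether a set of SEF files has gaps before the 24 hours expire, you can compare the GTC ranges taken from the filenames. The following is a minimal sketch under stated assumptions: it is not how the Event Streaming Framework detects gaps, the function name find_gtc_gaps is hypothetical, and it assumes that consecutive files should cover contiguous GTC ranges, as they do in the example listing.

```python
from typing import Iterable, List, Tuple


def find_gtc_gaps(ranges: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return the (first missing GTC, last missing GTC) pairs between ranges.

    Each input item is a (MinGtc, MaxGtc) pair taken from a filename.
    Overlapping or duplicate ranges are tolerated.
    """
    gaps: List[Tuple[int, int]] = []
    expected_next = None  # GTC that the next range should start at
    for min_gtc, max_gtc in sorted(ranges):
        if expected_next is not None and min_gtc > expected_next:
            gaps.append((expected_next, min_gtc - 1))
        # Advance past this range without moving backwards on overlaps.
        expected_next = max(expected_next or 0, max_gtc + 1)
    return gaps


if __name__ == "__main__":
    # The file covering GTCs 6001 to 9000 is missing in this made-up input.
    print(find_gtc_gaps([(1, 3000), (3001, 6000), (9001, 12000)]))
```

The function only reports where ranges are missing. Filling a gap after the retention period still requires create_event_files_from_txn_logs.py, as described above.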