stream_events Directory and File Format
The /stream_events directory has directories and files used by the Event Stream Server for event file processing. The event files go through several states as they are processed and end up in the Streamed Event File (SEF) format. The SEF files are streamed as events and used for MATRIXX Event File (MEF) publishing and event loading.
The /stream_events directory exists in MATRIXX Engine shared storage in /mnt/mtx/shared.
The /stream_events directory has a directory tree organized by a range of Global Transaction Counters (GTCs). It includes a /cur_events sub-directory that has the file that is written to by the Event Stream Server. The other sub-directories and filenames have the syntax in the following code example. Both include a prefix and three numbers. In addition, the filenames have a .gz suffix.
prefix_MinTxnLogFileTime_MinGtc_MaxGtc
- prefix — The event file directory and filename prefixes reflect the state of the event file and change as events are collected into files, processed, and archived:
- /cur_events — Events are first streamed to this directory in a file with the mtx_ prefix. When the mtx_ file reaches one million GTCs, the file is renamed with an evt_ prefix and moved to a directory with the dir_ prefix.
- /dir_ — The evt_ files are moved to this directory as they are created. When this directory reaches 500 evt_ files, it is renamed with the rdy_ prefix.
- /rdy_ — Files are copied from this directory to shared storage in MATRIXX Engine as evt_ files are archived, and the prefix changes to /sef_. After all files in the rdy_ directory are copied to the shared storage, the directory is renamed with a sef_ prefix.
- /sef_ — This directory has sef_ files that have been archived to shared storage. After all files in a /sef_ directory are copied, the directory is renamed with this syntax: sef_MinTxnLogFileTime_MinGtc_MaxGtc.
- MinTxnLogFileTime — A system-defined time used by the Event Streaming Framework for failure recovery.
- MinGtc — The minimum GTC of the included events.
- MaxGtc — The maximum GTC of the included events.
For example, a directory name followed by a filename:
sef_15623452525_1001_4000
sef_15623452525_1001_4000.gz
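The naming syntax above can be split into its parts with a regular expression. The following is a minimal Python sketch, not part of any MATRIXX tooling. The names NAME_RE, StreamEventName, and parse_stream_event_name are hypothetical, and the sketch assumes that only the evt_, dir_, rdy_, and sef_ prefixes use this syntax and that a .gz suffix always marks a file.

```python
import re
from typing import NamedTuple

# Assumed layout: prefix_MinTxnLogFileTime_MinGtc_MaxGtc, with .gz on files only.
NAME_RE = re.compile(
    r"^(?P<prefix>evt|dir|rdy|sef)_"
    r"(?P<min_txn_log_file_time>\d+)_"
    r"(?P<min_gtc>\d+)_"
    r"(?P<max_gtc>\d+)"
    r"(?P<gz>\.gz)?$"
)


class StreamEventName(NamedTuple):
    prefix: str
    min_txn_log_file_time: int
    min_gtc: int
    max_gtc: int
    is_file: bool  # True when the name ends in .gz


def parse_stream_event_name(name: str) -> StreamEventName:
    """Split a directory name or filename into its prefix and three numbers."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow the expected syntax: {name!r}")
    return StreamEventName(
        prefix=match.group("prefix"),
        min_txn_log_file_time=int(match.group("min_txn_log_file_time")),
        min_gtc=int(match.group("min_gtc")),
        max_gtc=int(match.group("max_gtc")),
        is_file=match.group("gz") is not None,
    )
```

Calling parse_stream_event_name("sef_15623452525_1001_4000.gz") yields the sef prefix, a minimum GTC of 1001, a maximum GTC of 4000, and is_file set to True. The same call without the .gz suffix reports a directory.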
The following is an example of a set of files in the /stream_events directory:
[user@hostname streamServerInstance]$ ls local_1_2_1/stream_events/*
local_1_2_1/stream_events/cur_events:
evt_1537400637_12001_13618.gz
local_1_2_1/stream_events/sef_1537400533_1_12000:
sef_1537400533_1_3000.gz sef_1537400533_3001_6000.gz sef_1537400533_6001_9000.gz sef_1537400533_9001_12000.gz
By default, the Event Streaming Framework removes internal event files from shared storage after 24 hours. During those 24 hours, if the Event Streaming Framework finds small gaps of missing events in SEF files, it tries to find those events and fill the gaps. After 24 hours, you must manually fill any gaps in events using create_event_files_from_txn_logs.py. For details, see the discussion about recovering SEF files.
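Before a manual recovery, it can help to see which GTC ranges are missing from a sef_ directory. The following Python sketch is illustrative only and does not reproduce how the Event Streaming Framework detects gaps. It assumes that the files in a sef_ directory are expected to cover the directory's MinGtc to MaxGtc range without holes, as they do in the example listing above. The names RANGE_RE, gtc_range, and find_gtc_gaps are hypothetical.

```python
import re

# Assumed layout: prefix_MinTxnLogFileTime_MinGtc_MaxGtc, optional .gz suffix.
RANGE_RE = re.compile(r"^[a-z]+_\d+_(\d+)_(\d+)(?:\.gz)?$")


def gtc_range(name: str) -> tuple[int, int]:
    """Return (MinGtc, MaxGtc) taken from a directory name or filename."""
    match = RANGE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def find_gtc_gaps(directory_name: str, file_names: list[str]) -> list[tuple[int, int]]:
    """List inclusive GTC ranges inside the directory's range that no file covers."""
    dir_min, dir_max = gtc_range(directory_name)
    gaps = []
    next_expected = dir_min
    for low, high in sorted(gtc_range(name) for name in file_names):
        # A file that starts after the next expected GTC leaves a hole before it.
        if low > next_expected and next_expected <= dir_max:
            gaps.append((next_expected, min(low - 1, dir_max)))
        next_expected = max(next_expected, high + 1)
    # Anything left between the last file and the directory's MaxGtc is also a gap.
    if next_expected <= dir_max:
        gaps.append((next_expected, dir_max))
    return gaps
```

Given the directory sef_1537400533_1_12000 and its four files from the example listing, find_gtc_gaps returns an empty list. If sef_1537400533_3001_6000.gz were absent, it would return the single range (3001, 6000).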