File Systems
File systems should be monitored for details such as local SSD free space and gaps in the file sequence.
- Local storage (local SSD) free space. For example:
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/system-root  4.0G  933M  2.9G  25% /
tmpfs                    115G  113G  2.7G  98% /dev/shm
/dev/sda1                194M  150M   34M  82% /boot
/dev/mapper/system-home  4.0G  3.0G  822M  79% /home
/dev/mapper/system-opt   4.0G  1.8G  2.0G  48% /opt
/dev/mapper/system-tmp   4.0G  181M  3.6G   5% /tmp
/dev/mapper/system-usr   4.0G  1.7G  2.1G  44% /usr
/dev/mapper/system-var   158G   63G   87G  42% /var
You can configure the low disk space threshold for monitoring. The threshold can be a percentage of the total disk space or it can be an absolute value in megabytes. When the disk space falls below the threshold, an error is logged. For more information, see the discussion about file system configuration in MATRIXX Installation and Upgrade.
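As a rough illustration of a percentage-based check, the following sketch flags file systems at or above a usage limit by reading df output. It is an external script and not the built-in MATRIXX threshold configuration. The function name check_usage and the default limit of 90 are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: check_usage and the default limit of 90 are illustrative,
# not MATRIXX settings.

# Reads df-style output on stdin (header line first, Use% in column 5,
# mount point in column 6) and prints "<mount> <use%>" for every file
# system at or above the limit passed as $1.
check_usage() {
    awk -v limit="${1:-90}" '
        NR > 1 {
            pct = $5
            sub(/%$/, "", pct)    # strip the trailing percent sign
            if (pct + 0 >= limit + 0) print $6, pct "%"
        }'
}

# Example: df -P | check_usage 80
```

With the sample output shown earlier and a limit of 80, this would report /dev/shm and /boot. Mount points that contain spaces are not handled by this sketch.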
- Shared storage free space:
- $MTX_SHARED_DIR
- $MTX_RECOVERY_DIR
- Local SSD and shared storage file quantity:
- Local SSD
- /var/log/mtx/local/blade_x_y_z
There should be a maximum of two files in this directory. If there are more than two files, an alarm should trigger.
- /var/log/mtx/staging
There should be a maximum of two files in this directory. If there are more than two files, an alarm should trigger.
Note: There is a maximum of two files in the Active publishing directory. For Standby publishing, there may be up to eight files in this directory. If these limits are exceeded, an alarm should trigger. To create a file in /staging/temp during publishing, gtc_sorted_txn_logging_enabled must be enabled. Otherwise, the behavior is the same as processing. By default, gtc_sorted_txn_logging_enabled is enabled in publishing.
- Shared storage file quantity:
- $MTX_SHARED_DIR/event_files
With event publishing, there should be a maximum of two files in this directory. If there are more than two files, an alarm should trigger.
Without event publishing, the number of files should be monitored but is subject to the downstream billing interface behavior.
- $MTX_SHARED_DIR/txnlogs/blade_x_y_z
There should be a maximum of n files, where n is determined by the maximum throughput, purging configuration, and file frequency.
- $MTX_SHARED_DIR/staging
The temp directory should have a maximum of [number of servers x 2] files. For example, if there are four servers, there should be no more than eight files.
The mef_temp directory should have a maximum of 25 files (can be adjusted based on project implementation specifics).
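The file quantity limits above could be checked with a small script along the following lines. This is a minimal sketch: count_files and check_file_count are hypothetical helper names, the alarm is just a message on standard output, and the -maxdepth option assumes GNU find as found on typical Linux systems.

```shell
#!/bin/sh
# Sketch only: helper names and the ALARM message format are illustrative.

# Count regular files directly under directory $1 (no recursion).
count_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Print an alarm line and return 1 if directory $1 holds more than $2 files.
check_file_count() {
    dir=$1
    max=$2
    n=$(count_files "$dir")
    if [ "$n" -gt "$max" ]; then
        echo "ALARM: $dir has $n files (limit $max)"
        return 1
    fi
    return 0
}

# Examples (the staging/temp path is inferred from the text above):
#   check_file_count /var/log/mtx/staging 2
#   check_file_count "$MTX_SHARED_DIR/staging/temp" $((4 * 2))   # four servers
```

Passing the limit as an argument keeps one helper usable for the fixed limits (two files) and for the derived ones, such as the number of servers multiplied by two.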
- File size in $MTX_SHARED_DIR/checkpoints:
- The timestamp of the newest sub-directory is determined by the checkpoint creation frequency. For example, a timestamp should not be older than four hours 15 minutes if the configured frequency is four hours.
- The size of a sub-directory should be within +/- 10% of the value calculated from the current database size.
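The two checkpoint checks can be sketched as follows. Several assumptions apply: the timestamp is taken to be the modification time of the sub-directory, the 15-minute grace period mirrors the four-hour example above, and the expected size is supplied by the caller because the calculation from the database size is not described here. The function names are hypothetical, and -mmin assumes GNU find.

```shell
#!/bin/sh
# Sketch only: function names, the default grace period, and the use of
# directory modification time are assumptions for illustration.

# Succeeds if directory $1 contains at least one sub-directory modified
# within ($2 + grace) minutes; $2 is the checkpoint frequency in minutes
# and $3 is the grace period (default 15).
checkpoint_is_fresh() {
    dir=$1
    freq_min=$2
    grace_min=${3:-15}
    limit=$((freq_min + grace_min))
    # -mmin -N matches entries modified less than N minutes ago.
    recent=$(find "$dir" -mindepth 1 -maxdepth 1 -type d -mmin "-$limit" | head -n 1)
    [ -n "$recent" ]
}

# Succeeds if actual size $1 is within +/- $3 percent (default 10) of the
# expected size $2. Both sizes are integers in the same unit.
size_within_tolerance() {
    actual=$1
    expected=$2
    pct=${3:-10}
    diff=$((actual - expected))
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ $((diff * 100)) -le $((expected * pct)) ]
}

# Examples:
#   checkpoint_is_fresh "$MTX_SHARED_DIR/checkpoints" 240 || echo "checkpoint is stale"
#   size_within_tolerance "$(du -sk "$newest" | cut -f1)" "$expected_kb"
```

In the second example, newest and expected_kb are placeholders for the newest sub-directory and for the size derived from the current database size.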
- Gaps in file sequence: Missing files should be identified by checking the file naming sequences of the following files:
- MATRIXX Transaction Log files
- MATRIXX Event Files (MEFs)
Note: The sequence number is part of the transaction log filename, which is also part of the MEF filename. There is a direct 1:1 mapping from the transaction log filename to the MEF filename.
The following is a sample Linux command used to search for gaps in the filename sequence number:
ls transaction_2_1_4* | cut -d_ -f6 | sed 's/[^0-9]//g' | sort -n | awk '{while(++x<$1)print x}'
All files for a given server are listed and any missing numbers in the sequence are printed to standard output.
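The sample command counts up from 1, so once older files have been purged it also reports every number below the first remaining file. A possible variant, shown here as a sketch rather than a supported tool, reports only the numbers missing between the lowest and highest sequence numbers present. The function name find_sequence_gaps is hypothetical.

```shell
#!/bin/sh
# Sketch only: find_sequence_gaps is an illustrative helper name.

# Reads one sequence number per line on stdin and prints each number
# missing between the smallest and largest values seen.
find_sequence_gaps() {
    sort -n | awk '
        NR == 1 { prev = $1; next }    # start from the first observed number
        {
            for (i = prev + 1; i < $1; i++) print i
            prev = $1
        }'
}

# Example, reusing the extraction steps from the sample command:
#   ls transaction_2_1_4* | cut -d_ -f6 | sed 's/[^0-9]//g' | find_sequence_gaps
```

For the input 3, 4, 7, 8, 10 the function prints 5, 6, and 9.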
For more information about the MATRIXX environment variables, see the discussion about container directory and environment variables in MATRIXX Installation and Upgrade.