Database Checkpoints
Database checkpoints contain an exact snapshot of the in-memory databases (IMDBs) and configuration files at a specific time. Periodic database checkpoints are used to recover data from whole cluster failures and to reinstate an active processing cluster from a standby processing cluster after a disaster recovery operation. Use on-demand checkpoints for data analysis, setting up test environments, and troubleshooting. You validate checkpoints using the validateCheckpoint.jar utility.
Database checkpoints are created by the checkpointing pod from its local IMDB and configuration files from /opt/mtx/conf and /opt/mtx/custom. Database checkpoints are not created by the Parallel-MATRIXX™ protocol, so creating checkpoints does not affect transaction processing.
A cluster restores its data by using the following:
- The latest database checkpoint on the shared storage.
- Replaying input transaction log files on its shared storage that were not included in the database checkpoint.
The checkpointing pod replays input transaction log files from its local SSD until it creates a database checkpoint. During checkpoint creation, the checkpointing pod suspends transaction replay so that the database checkpoint is consistent, and it resumes transaction replay immediately after the database checkpoint is created on its local SSD. The database checkpoint is then moved to shared storage. The checkpointing pod continuously replays transactions and creates database checkpoints while it runs.
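The suspend-snapshot-resume behavior described above can be sketched as follows (illustrative Python, not MATRIXX code; the class and method names are ours):

```python
import copy
import threading

class CheckpointingPod:
    """Illustrative sketch only: replay is suspended while a checkpoint
    snapshot is taken, then resumed immediately afterwards."""

    def __init__(self):
        self.db = {}                       # stands in for the in-memory database
        self._replay_lock = threading.Lock()

    def replay(self, txn_id, value):
        # Each replayed transaction holds the lock, so a checkpoint never
        # observes a half-applied transaction.
        with self._replay_lock:
            self.db[txn_id] = value

    def create_checkpoint(self):
        # Suspend replay just long enough to copy the database state.
        # The copy can then be written to local SSD and moved to shared
        # storage while replay continues.
        with self._replay_lock:
            snapshot = copy.deepcopy(self.db)
        return snapshot
```

Because the snapshot is taken under the same lock that guards replay, transactions applied after `create_checkpoint` returns do not appear in the snapshot.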
If the checkpointing pod detects that pricing is being replayed while the pod creates a database checkpoint, the pod delays checkpoint creation and tries again later after pricing completes replaying.
Checkpoint Replay File Count
If the checkpoint replay file count is less than 3, create an on-demand database checkpoint (most of the time, the value is 0).
A database checkpoint includes configuration files that match the following patterns:
- asn1_dictionary*
- cdr_dictionary*
- create_config*
- diameter_dictionary*
- mdc_config*
- mtx_config*
- mtx_pricing*
- process_control*
- sysmon_config*
- topology*
- version*
Understanding Database Checkpoint Files
MATRIXX Engine writes database checkpoint files to the ${MTX_SHARED_DIR}/checkpoints directory and appends the filename with the software version and the time the checkpoint finished writing, for example, mtx_ckpt_v5050.1.1358493520.
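For illustration, the version and timestamp can be split out of such a name. The `mtx_ckpt_v<version>.<epoch-seconds>` pattern is inferred from the example above, not a documented contract:

```python
import re
from datetime import datetime, timezone

def parse_checkpoint_name(name):
    """Split a checkpoint directory name such as
    mtx_ckpt_v5050.1.1358493520 into (version, completion time).
    The naming pattern is inferred from the example, so treat this
    as a sketch rather than a parser for all releases."""
    m = re.fullmatch(r"mtx_ckpt_v(.+)\.(\d+)", name)
    if not m:
        raise ValueError(f"unexpected checkpoint name: {name}")
    version, epoch = m.group(1), int(m.group(2))
    # The trailing component is interpreted as seconds since the epoch.
    return version, datetime.fromtimestamp(epoch, tz=timezone.utc)
```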
The contents.gz file in the ${MTX_SHARED_DIR}/checkpoints directory records the transaction count of each database in the IMDB and the total transaction count for all files of the database checkpoint.
Each database might have many checkpoint files that use a database-name prefix; for example, subscriber_db_xxx.log.gz.
The following shows example content of a contents.gz file in MDC format:
DataContainer:
containerId=MtxCheckpointContent(393,5050,1)
idx name type L A M P value
0 CheckpointFileCount UINT32 0 0 0 1 7
1 TotalCheckpointTxnCount UINT64 0 0 0 1 38484
2 CheckpointFileList STRUCT 1 0 0 1 {
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 subscriber_db
1 TxnCount UINT64 0 0 0 1 20000
,
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 balance_set_db
1 TxnCount UINT64 0 0 0 1 5000
,
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 activity_db
1 TxnCount UINT64 0 0 0 1 10002
,
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 sched_db
1 TxnCount UINT64 0 0 0 1 0
,
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 event_db
1 TxnCount UINT64 0 0 0 1 2
,
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 alert_db
1 TxnCount UINT64 0 0 0 1 0
,
DataContainer:
containerId=MtxTxnLogFileStats(392,5050,1)
idx name type L A M P value
0 FileName STRING 0 0 0 1 pricing_db
1 TxnCount UINT64 0 0 0 1 3480
}
6 GlobalTxnCounter UINT64 0 0 0 1 43045
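As a quick consistency check (illustrative Python; the counts are copied from the example output above), the per-file TxnCount values sum to the TotalCheckpointTxnCount field:

```python
# Per-database transaction counts from the contents.gz example above.
per_db_counts = {
    "subscriber_db": 20000,
    "balance_set_db": 5000,
    "activity_db": 10002,
    "sched_db": 0,
    "event_db": 2,
    "alert_db": 0,
    "pricing_db": 3480,
}

# CheckpointFileCount is 7 and TotalCheckpointTxnCount is 38484
# in the example, matching the entries listed here.
assert len(per_db_counts) == 7
assert sum(per_db_counts.values()) == 38484
```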
Using Binary Database Checkpoints
Pods can restore from binary checkpoints in the following ways:
- All pods restore from a binary checkpoint from shared storage.
- Some pods use a binary checkpoint to restore in parallel. Other pods start using DBSync, which ensures that all pods that come up late are restored.
A binary checkpoint file name includes the parts 52, 6, and transaction_server.1.database.event.storage, where:
- 52 is the memory pool ID.
- 6 is the number of the memory segment (and files for this pool).
- transaction_server.1.database.event.storage is the shared memory name needed to match the database memory pool with files on the disk.
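For illustration only, if the three parts were joined with underscores, splitting on the first two underscores would recover them. The real on-disk naming scheme may differ, so treat this as a sketch:

```python
def split_binary_checkpoint_name(file_name):
    """Illustrative only: assumes the pool ID, segment number, and shared
    memory name are joined with underscores, for example
    '52_6_transaction_server.1.database.event.storage'.
    Splitting on the first two underscores keeps the shared memory
    name intact even though it contains underscores itself."""
    pool_id, segment, shm_name = file_name.split("_", 2)
    return int(pool_id), int(segment), shm_name
```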
You can enable and configure binary checkpointing by answering create_config.info file questions. For information about configuring binary checkpointing, see the discussion about checkpoint and transaction replay configuration in MATRIXX Configuration.
Using Parallel Checkpointing
You can use MDC checkpointing and binary checkpointing in parallel by configuring the create_config.info questions shown in the following example, per engine, per cluster:
Engine 1:Cluster 3:Do you want to enable binary checkpoint creation (y/n)? y
Engine 1:Cluster 3:How many writing threads do you want to use for binary checkpoint creation? 8
Engine 1:Cluster 3:Do you want to enable binary checkpoint restore (y/n)? y
Engine 1:Cluster 3:How many reading threads do you want to use for binary checkpoint restore? 8
Analyzing Database Checkpoints
You use the validateCheckpoint.jar utility to analyze a MATRIXX checkpoint, find any errors in the database, and produce a validation report listing internal database statistics and any errors. For MATRIXX Engine in production, MATRIXX Support recommends that you run the checkpoint validation process daily to verify that the output is free of errors, which ensures database integrity. If validateCheckpoint.jar reports any errors, contact a MATRIXX Support representative to help troubleshoot them.
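For a scheduled daily run, a thin wrapper can assemble the validation command. The jar path and the bare checkpoint-directory argument are assumptions, not documented usage; confirm the utility's actual arguments before scheduling it:

```python
def build_validation_command(checkpoint_dir,
                             jar_path="validateCheckpoint.jar"):
    """Assemble a java invocation for the checkpoint validation utility.
    Both the jar path and the positional checkpoint-directory argument
    are assumptions for illustration."""
    return ["java", "-jar", jar_path, checkpoint_dir]

# Example: the command a daily cron wrapper might run via subprocess.
cmd = build_validation_command("/shared/checkpoints/mtx_ckpt_v5050.1.1358493520")
```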
Configuring Checkpoint Intervals
You configure periodic checkpoints by setting the create_config.info file parameters for configuring database checkpoints. The questions in this file specify the interval for automatic checkpoint creation and the number of checkpoints to save for rerating and disaster recovery scenarios. For more information about configuring checkpoints, see the discussion about checkpoint and transaction replay configuration in MATRIXX Installation and Upgrade.
If automatic database checkpoint creation fails, MATRIXX Engine waits one fourth the time specified for the database checkpoint interval before trying another database checkpoint. For example, if the database checkpoint interval is set to 60 minutes (the default), when a database checkpoint fails, MATRIXX Engine waits 15 minutes and then tries again to create a database checkpoint.
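The retry rule above is simple enough to express directly (a sketch; the function name is ours, not an engine API):

```python
def checkpoint_retry_delay(interval_minutes):
    """Return the wait before retrying a failed checkpoint:
    one fourth of the configured checkpoint interval."""
    return interval_minutes / 4

# With the default 60-minute interval, a failed checkpoint is
# retried after 15 minutes.
assert checkpoint_retry_delay(60) == 15
```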
You create on-demand checkpoints by running the create_checkpoint.py script. For details, see the discussion about creating a database checkpoint manually.
Exporting Data to Other Formats
Use the MATRIXX data_export.jar utility to export checkpoint and MATRIXX Event File (MEF) data to another data format. Once exported, that data is available for post-processing operations and analytics. The data_export.jar utility transforms the MDC data to comma-separated value (CSV) files. It also generates files that create SQL RDBMS tables and that can load the data from the CSV files into the RDBMS tables.
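A rough sketch of the transformation shape follows. This is not the data_export.jar implementation; the field names and the MySQL-style LOAD syntax are illustrative assumptions:

```python
import csv
import io

def export_to_csv(records, field_names):
    """Write MDC-like records to CSV text, roughly mirroring the shape
    of an exported file (field names here are illustrative)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=field_names)
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def load_table_sql(table, csv_file):
    """Generate a statement that loads the exported CSV into an RDBMS
    table. LOAD syntax varies by RDBMS; this uses a generic
    MySQL-style LOAD DATA purely as an illustration."""
    return (f"LOAD DATA INFILE '{csv_file}' INTO TABLE {table} "
            "FIELDS TERMINATED BY ',' IGNORE 1 LINES;")
```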
For more information about exporting data, see the discussion about exporting subscription data in MATRIXX Integration.
For more information about the MATRIXX environment variables, see the discussion about container directories and environment variables in MATRIXX Installation and Upgrade.