Change Checkpointing Behavior

You can change the interval at which checkpoints are created and the number of checkpoints to store before they are recycled.

About this task

This task assumes you are changing a MATRIXX Engine installation that is online, in production, and has multiple engines. In that case, you must perform this procedure on an engine that is running in standby mode. If you are changing an offline engine, stop after you run the configure_engine.py script, then run the script again on the other engines. If you are changing a single engine, stop after you run the configure_engine.py script.

Perform this task on the publishing server in the standby cluster.
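
Before you choose a new interval, it can help to estimate whether it is long enough. The procedure notes that the interval must cover the time the publishing server takes to replay the transaction log files, sync its databases, and create the checkpoint. The following Python sketch adds those three durations and compares the total with a proposed interval. It is illustrative only: the function names, the sample timings, and the 25 percent safety margin are assumptions, not MATRIXX values, and you would need to measure the three durations on your own system.

```python
import math


def minimum_checkpoint_interval(replay_minutes, sync_minutes,
                                checkpoint_minutes, margin=0.25):
    """Return the smallest whole-minute interval that covers all three phases.

    The margin (an assumed 25 percent by default) leaves headroom for
    variation between checkpoint cycles.
    """
    busy_minutes = replay_minutes + sync_minutes + checkpoint_minutes
    return math.ceil(busy_minutes * (1 + margin))


def interval_is_sufficient(interval_minutes, replay_minutes, sync_minutes,
                           checkpoint_minutes, margin=0.25):
    """Return True if the proposed interval covers the estimated work."""
    needed = minimum_checkpoint_interval(
        replay_minutes, sync_minutes, checkpoint_minutes, margin)
    return interval_minutes >= needed


if __name__ == "__main__":
    # Hypothetical timings in minutes: replay 20, sync 10, checkpoint 10.
    print(minimum_checkpoint_interval(20, 10, 10))
    print(interval_is_sufficient(60, 20, 10, 10))
```

With these hypothetical timings, the default interval of 60 minutes passes the check, while an interval of 45 minutes would not.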

Procedure

  1. Go to the ${MTX_CUSTOM_DIR} directory:
    cd_custom_dir
  2. Copy the create_config.info file to a backup file:
    cp create_config.info create_config.info.bak
  3. Open the create_config.info file with a text editor, such as vi, and delete the following questions so that the create_config.py script asks them again:
    Engine 1:Cluster 1:What is the maximum number of checkpoints to maintain?

    Engine 1:Cluster 1:What is the interval between database checkpoints (in minutes)?

  4. If the engine is running, stop it. Replace engineId with the ID of the engine.
    Note: This causes processing to fail over to the engine running in standby mode.
    stop_engine.py -e engineId
  5. Configure the local server:
    create_config.py
  6. When the script prompts you, answer the questions that you deleted from the create_config.info file.
    Note: The checkpoint interval must be long enough to cover the time it takes the publishing server to replay the transaction log files, sync its databases, and create the checkpoint. By default, the interval is set to 60 minutes.
  7. When the script finishes, copy the local configuration to all servers in the engine:
    configure_engine.py
    After the configuration changes are complete, the script updates the configuration on each server by synchronizing the ${MTX_CUSTOM_DIR} directories and then running the configuration script locally on each server.
  8. Open a second terminal on the server and run the following command to view the state of the cluster as it enters a standby state:
    print_blade_stats.py -C
    If the state is UNKNOWN, the engine is likely in the process of switching states. In such cases, wait a few minutes and run the command again.
  9. When the cluster state is STANDBY, stop the active engine. Replace engineId with the ID of the active engine:
    stop_engine.py -e engineId
  10. Propagate the configuration changes to the peer engine. Replace engineId with the ID of the engine that is now stopped:
    configure_engine.py -e engineId
    After the configuration changes are complete, the script updates the configuration on each server by synchronizing the ${MTX_CUSTOM_DIR} directories and then running the configuration script locally on each server.
  11. Start the engine. Replace engineId with the ID of the engine:
    start_engine.py -e engineId

    After all engines are started, the processing cluster enters a standby state so that it can sync its databases with the active cluster.

  12. Open a second terminal on the server and run the following command to view the HA state of the cluster that is starting:
    print_blade_stats.py -C

    The HA state becomes STANDBY. If it is UNKNOWN, the cluster is likely in the process of switching HA states. In such cases, wait a few minutes and run the command again.

  13. When the cluster state is STANDBY, run the following command to switch the HA states of both engines. Replace engineId with the ID of the engine you want to make active (the engine running in standby mode).
    Note: You cannot switch the HA state of two engines if your installation is running three engines.
    activate_engine.py -e engineId
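
Steps 8 and 12 repeat the same manual check: run print_blade_stats.py -C and, if the state is still UNKNOWN, wait a few minutes and run it again. The following Python sketch shows one way that loop could be automated. It is illustrative only: the helper names, the retry count, and the two-minute delay are assumptions. It also assumes that the state name appears verbatim in the command output, which this procedure does not specify, so confirm the output format on your system before relying on a text match.

```python
import subprocess
import time


def read_cluster_stats():
    """Run the command from the procedure and return its raw output."""
    result = subprocess.run(
        ["print_blade_stats.py", "-C"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def wait_for_state(target="STANDBY", read_stats=read_cluster_stats,
                   attempts=10, delay_seconds=120, sleep=time.sleep):
    """Poll until the stats output mentions the target state.

    Returns True if the target state was seen, or False if every attempt
    was used up first. Assumption: a plain substring match is enough to
    recognize the state in the output.
    """
    for attempt in range(attempts):
        if target in read_stats():
            return True
        # Do not sleep after the final attempt.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


if __name__ == "__main__":
    if wait_for_state():
        print("Cluster reports STANDBY; continue with the next step.")
    else:
        print("STANDBY not seen; check the cluster manually.")
```

The read_stats and sleep parameters exist so that the loop can be exercised with canned output instead of a live engine. If the command reports more than one cluster, a substring match cannot tell them apart, so the output would need to be parsed per cluster.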

What to do next

If the engine is not in production and is not online, follow the procedure for configuring the primary engine. If the engine is in production and is online, follow the procedure for changing the configuration of an online system.