Start the Standby Engine

Determine whether the standby engine must be started before the traffic bypass is lifted. Standby engine start-up can take an extended amount of time for large installations. If the steps in this procedure do not have to be completed before bypass lift, return to them when time and resources allow, so that they do not interfere with service restoration.

About this task

Most of these tasks can be performed in parallel to optimize recovery time.
Important: The time it takes to start a standby engine is specific to your installation and varies according to database size. If you determine that the standby engine must be started before bypass lift, you must accept the risk of temporarily reduced high availability (HA).

Procedure

  1. Verify that the standby engine Traffic Routing Agents (TRA-PROCs and TRA-PUBs) are running by using print_tra_cluster_status.py.
  2. Run commands to stop the standby engine to ensure clean initialization of MATRIXX processes.
  3. Prepare to monitor standby engine start-up with a command similar to the following:
    tail -F mtx_debug.log | grep state-related-prefix-suffix-string
  4. Prepare to monitor critical errors during standby engine start-up with a command similar to the following:
    tail -F mtx_debug.log | grep -E "LM_CRITI|LM_ERROR"
  5. Confirm the identity of the primary engine, then start the standby engine, accounting for any procedures specific to your installation.
  6. Monitor terminal output and take action as needed, such as checking check_engine_prerequisites.log on referenced nodes for guidance on next steps.
  7. Check that the debug logs report that the engine reached the active state, using the monitoring set up in step 3.
  8. Verify close alignment of objects and GTCs across clusters with commands similar to the following:
    print_blade_stats.py -R
    print_blade_stats.py -B
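
The two monitoring commands in steps 3 and 4 can also be combined into a single filter. The following is a minimal sketch, not a MATRIXX-supplied tool: the function name classify_lines and the state_marker parameter are hypothetical, the state string is installation-specific and must be supplied by you, and the only patterns taken from this procedure are LM_CRITI and LM_ERROR. No particular log-line layout is assumed; lines are matched as plain text.

```python
import re
from typing import Iterable, Iterator, Tuple

# Severity markers taken from the pattern used in step 4.
ERROR_PATTERN = re.compile(r"LM_CRITI|LM_ERROR")


def classify_lines(lines: Iterable[str], state_marker: str) -> Iterator[Tuple[str, str]]:
    """Yield (kind, line) for each line that is an error or a state message.

    state_marker is a placeholder for the installation-specific string used
    in step 3; it is matched as a plain substring, not as a regular expression.
    Error matches take precedence over state matches.
    """
    for line in lines:
        text = line.rstrip("\n")
        if ERROR_PATTERN.search(text):
            yield ("error", text)
        elif state_marker in text:
            yield ("state", text)
```

One way to use the sketch is to save it in a script that loops over classify_lines(sys.stdin, "your-state-string"), prints each result, and receives the output of tail -F mtx_debug.log through a pipe. This lets one terminal show both state transitions and critical errors in the order they are logged.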