Health Checks

The subdomain-health-checker-sX and engine-health-checker-sXeY pods run multiple containers, each of which performs a separate health check.

For every MtxSubdomain custom resource (CR), the topology-operator pod creates a subdomain-health-checker-sX deployment at the same time as it creates the subdomain-operator-sX deployment. For every MtxEngine CR, the topology-operator pod creates an engine-health-checker-sXeY deployment at the same time as it creates the engine-operator-sXeY deployment.
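The pairing of deployments described above can be sketched in Python. This is only an illustration of the naming pattern, not the topology-operator's implementation: the function names are hypothetical, and it assumes that X and Y in the deployment names stand for integer sub-domain and engine identifiers.

```python
# Illustrative sketch only. Function names are hypothetical, and the
# assumption that X and Y are integer IDs is not confirmed by this topic.

def deployments_for_subdomain(subdomain_id: int) -> tuple[str, str]:
    """Names of the two deployments created for one MtxSubdomain CR."""
    suffix = f"s{subdomain_id}"
    return (f"subdomain-operator-{suffix}", f"subdomain-health-checker-{suffix}")


def deployments_for_engine(subdomain_id: int, engine_id: int) -> tuple[str, str]:
    """Names of the two deployments created for one MtxEngine CR."""
    suffix = f"s{subdomain_id}e{engine_id}"
    return (f"engine-operator-{suffix}", f"engine-health-checker-{suffix}")


if __name__ == "__main__":
    # One sub-domain with two engines yields six deployments in total.
    print(deployments_for_subdomain(1))
    for engine in (1, 2):
        print(deployments_for_engine(1, engine))
```

Under these assumptions, every health checker deployment has an operator deployment with the same sX or sXeY suffix.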

Two types of health checks are available, one at the engine level and the other at the sub-domain level:

  • Engine – Cluster Monitor.
  • Sub-domain – Inter-engine communication, Global Transaction Counter (GTC) out-of-sync monitoring.

The sub-domain health checker pod also runs a separate brain container. The brain container monitors the results of GTC out-of-sync monitoring and recovers from two conditions:

  • Engine standby GTC out-of-sync – This may result in an engine restart to resolve the GTC out-of-sync condition.
  • Processing cluster-to-cluster publishing GTC out-of-sync – This may result in a publishing cluster restart to resolve the GTC out-of-sync condition.

Engine restarts occur only if the engine has not reached the maximum number of retries.
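The recovery behavior above can be read as a small decision function. The following Python sketch is hypothetical: the type names, the function name, and the parameters are assumptions made for illustration and do not reflect the brain container's actual code or configuration properties. It applies the retry limit only to engine restarts, because that is the only case this topic states; whether publishing cluster restarts are also limited is not covered here. The conditions are described as ones that "may" lead to a restart, so the real logic may consider more than this sketch does.

```python
from enum import Enum
from typing import Optional


class Condition(Enum):
    """The two GTC out-of-sync conditions named in this topic."""
    ENGINE_STANDBY_OUT_OF_SYNC = "engine standby GTC out-of-sync"
    PUBLISHING_OUT_OF_SYNC = "processing cluster-to-cluster publishing GTC out-of-sync"


class Action(Enum):
    NONE = "no action"
    RESTART_ENGINE = "restart engine"
    RESTART_PUBLISHING_CLUSTER = "restart publishing cluster"


def decide_recovery(condition: Optional[Condition],
                    engine_retries: int,
                    max_engine_retries: int) -> Action:
    """Pick a recovery action for a monitoring result (hypothetical logic).

    engine_retries and max_engine_retries are assumed names for the number
    of restarts already attempted and the configured maximum.
    """
    if condition is None:
        return Action.NONE
    if condition is Condition.ENGINE_STANDBY_OUT_OF_SYNC:
        # Engine restarts occur only while the retry limit is not reached.
        if engine_retries < max_engine_retries:
            return Action.RESTART_ENGINE
        return Action.NONE
    # Publishing out-of-sync: this sketch assumes no retry limit applies.
    return Action.RESTART_PUBLISHING_CLUSTER


if __name__ == "__main__":
    print(decide_recovery(Condition.ENGINE_STANDBY_OUT_OF_SYNC, 0, 3))
    print(decide_recovery(Condition.ENGINE_STANDBY_OUT_OF_SYNC, 3, 3))
```

Once the retry count reaches the maximum, the sketch returns no action for the engine standby condition, which matches the retry rule stated above.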

For information about brain and GTC sync configuration properties, see the discussion about sub-domain health checker configuration.