Subdomain Health Checker Configuration

Table 1 describes the properties available for configuring the subdomain health checker in a Topology Operator-based deployment.

Table 1. Subdomain Health Checker Configuration
Property Description
engine.operatorV2.subdomainHealthChecker.affinity The affinity configuration used by the subdomain-health-checker pods. The value is of type Affinity.
engine.operatorV2.subdomainHealthChecker.brain.dryRun When set to true, checks are performed and results are logged without triggering engine stop and restart. The default value is false.
engine.operatorV2.subdomainHealthChecker.brain.engineCheckInterval The interval in seconds between checks that all MATRIXX Engines have started. The default value is 5.
engine.operatorV2.subdomainHealthChecker.brain.engineMaxRestarts The total number of times the brain health check container attempts to restart an engine, after which the saved engine state expires. The default value is 1.
engine.operatorV2.subdomainHealthChecker.containers.brain.engineStateExpiry When set to a nonzero value, the brain health check container expires a saved engine state after the specified number of minutes. The saved engine state is updated whenever the brain restarts an engine. If the state has not expired, an engine can be restarted only if the saved restart count is less than the value of the engineMaxRestarts property. The default value is 30.
engine.operatorV2.subdomainHealthChecker.containers.brain.image.nameOverride When specified, overrides the default image name for the brain health check container with the specified name.
engine.operatorV2.subdomainHealthChecker.containers.brain.image.versionOverride When specified, overrides the default image version for the brain health check container with the specified version.
engine.operatorV2.subdomainHealthChecker.containers.brain.oosPubRecovery When set to true, the brain health check container attempts to recover from an out-of-sync publishing cluster. The default value is true.
engine.operatorV2.subdomainHealthChecker.containers.brain.oosStandbyRecovery When set to true, the brain health check container attempts to recover from an out-of-sync standby engine. The default value is true.
engine.operatorV2.subdomainHealthChecker.containers.brain.pubClusterFailureRecovery When set to true, the brain health check container attempts to recover from a failed publishing cluster. The default value is true.
engine.operatorV2.subdomainHealthChecker.containers.brain.resources The resource configuration used by the brain health check containers. The value is of type ResourceRequirements.
engine.operatorV2.subdomainHealthChecker.containers.brain.securityContext The container security context configuration used by the brain health check containers. The value is of type SecurityContext. The default value is:
allowPrivilegeEscalation: false
privileged: false
runAsNonRoot: true
runAsUser: 1000
engine.operatorV2.subdomainHealthChecker.containers.brain.snmpTimeout The time in seconds to wait for an SNMP response. The default value is 5.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.image.nameOverride The name of the image used by the engine communication health check containers.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.image.versionOverride The tag value of the image used by the engine communication health check containers.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.interval.connection The interval in milliseconds between an engine communication health check master closing a connection to one engine communication health check agent and opening a connection to the next agent. The default value is 1000.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.interval.request The interval in milliseconds between an engine communication health check master receiving a response from an engine communication health check agent and sending the next request to the agent. The default value is 1000.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.interval.test The interval in milliseconds between the engine communication health check master finishing one round of testing and starting the next round. The default value is 10000.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.livenessProbe.failureThreshold The failure threshold value to use for the engine communication health check containers' liveness probe. The default value is 3.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.livenessProbe.initialDelaySeconds The initial delay value to use for the engine communication health check containers' liveness probe. The default value is 0.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.livenessProbe.periodSeconds The period value to use for the engine communication health check containers' liveness probe. The default value is 10.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.livenessProbe.timeoutSeconds The timeout value to use for the engine communication health check containers' liveness probe. The default value is 1.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.readinessProbe.failureThreshold The failure threshold value to use for the engine communication health check containers' readiness probe. The default value is 3.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.readinessProbe.initialDelaySeconds The initial delay value to use for the engine communication health check containers' readiness probe. The default value is 0.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.readinessProbe.periodSeconds The period value to use for the engine communication health check containers' readiness probe. The default value is 10.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.readinessProbe.timeoutSeconds The timeout value to use for the engine communication health check containers' readiness probe. The default value is 1.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.resources The resource configuration used by the engine communication health check containers. The value is of type ResourceRequirements.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.securityContext The container security context configuration used by the engine communication health check containers. The value is of type SecurityContext. The default value is:
allowPrivilegeEscalation: false
privileged: false
runAsNonRoot: true
runAsUser: 1000
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.timeout.connection The maximum amount of time in milliseconds for an engine communication health check master to attempt to connect to an engine communication health check agent or for an agent to connect to another agent. The default value is 1000.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.timeout.request The maximum amount of time in milliseconds for an engine communication health check master to attempt to send a request to an engine communication health check agent or for an agent to send a request to another agent. The default value is 1000.
engine.operatorV2.subdomainHealthChecker.containers.engineCommunication.timeout.response The maximum amount of time in milliseconds for an engine communication health check master to wait for a response from an engine communication health check agent or for an agent to wait for a response from another agent. The default value is 1000.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.detectInterval The maximum time in seconds to wait before requesting the GTC values from the engines. The default value is 10.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.detectIntervalMax The maximum time in seconds the gtcSync health check waits for the GTC out-of-sync condition to be detected. The default value is 300.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.engineCheckInterval The time in seconds the gtcSync health check waits between checking that all engines have started. The default value is 5.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.gapMax The GTC gap value that triggers GTC sync failure. The default value is 1000000.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.image.nameOverride When specified, overrides the default image name for the gtcSync health check container with the specified name.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.image.versionOverride When specified, overrides the default image version for the gtcSync health check container with the specified version.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.resources The resource configuration used by the gtcSync health check container. The value is of type ResourceRequirements.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.securityContext The container security context configuration used by the gtcSync health check containers. The value is of type SecurityContext. The default value is:
allowPrivilegeEscalation: false
privileged: false
runAsNonRoot: true
runAsUser: 1000
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.snmpTimeout The time in seconds to wait for an SNMP response. The default value is 5.
engine.operatorV2.subdomainHealthChecker.containers.gtcSync.tpsMinDetect The minimum TPS required before starting GTC detection. The default value is 1.
engine.operatorV2.subdomainHealthChecker.labels.podLabels Labels added to the subdomain-health-checker pods. These are in addition to the labels specified with the global.labels.podLabels property, which are applied to all pods.
engine.operatorV2.subdomainHealthChecker.nodeSelector The node selector configuration used by the subdomain-health-checker pods.
engine.operatorV2.subdomainHealthChecker.podSecurityContext The pod security context configuration used by the subdomain-health-checker pods. The value is of type PodSecurityContext. The default value is fsGroup: 1000.
engine.operatorV2.subdomainHealthChecker.serviceAccount.annotations Annotations added to the service account if engine.operatorV2.subdomainHealthChecker.serviceAccount.create is set to true.
engine.operatorV2.subdomainHealthChecker.serviceAccount.create When set to true, a service account is created with the specified name and annotations in the master namespace. Otherwise, a service account with the specified name must be manually created in the master namespace. The default value is true.
engine.operatorV2.subdomainHealthChecker.serviceAccount.name The name of a service account created in the master namespace if engine.operatorV2.subdomainHealthChecker.serviceAccount.create is set to true. Otherwise, a service account with this name must be manually created in the master namespace. The default value is subdomain-health-checker.
engine.operatorV2.subdomainHealthChecker.tolerations The tolerations configuration used by the subdomain-health-checker pods. The value is a list of type Toleration.
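
As an illustration of how the dotted property names in Table 1 might map onto a values file, the following is a minimal sketch only. It assumes the usual Helm convention that each dot-separated segment becomes one level of YAML nesting, and it assumes the operatorV2 spelling of the key. The override values are arbitrary examples, not recommendations; the defaults quoted in the comments are taken from the table.

```yaml
# Hypothetical values override; nesting is inferred from the dotted
# property names in Table 1, not copied from a shipped values file.
engine:
  operatorV2:
    subdomainHealthChecker:
      containers:
        brain:
          engineStateExpiry: 60       # minutes; table default is 30
          oosStandbyRecovery: false   # table default is true
          snmpTimeout: 10             # seconds; table default is 5
        engineCommunication:
          interval:
            test: 30000               # milliseconds; table default is 10000
          timeout:
            connection: 2000          # milliseconds; table default is 1000
            request: 2000             # milliseconds; table default is 1000
            response: 2000            # milliseconds; table default is 1000
          livenessProbe:
            failureThreshold: 5       # table default is 3
        gtcSync:
          detectIntervalMax: 600      # seconds; table default is 300
          securityContext:            # same as the documented default
            allowPrivilegeEscalation: false
            privileged: false
            runAsNonRoot: true
            runAsUser: 1000
      podSecurityContext:
        fsGroup: 1000                 # documented default
      serviceAccount:
        create: true                  # documented default
        name: subdomain-health-checker
```

Properties that are left out of such a file would be expected to keep the defaults listed in the table.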