MATRIXX Cloud Native Reference Architecture

A MATRIXX Digital Commerce reference configuration runs in Kubernetes clusters in a private or public cloud.

The reference configuration multi-cluster deployment has the following high-level characteristics:

  • A Traffic Routing Agent in sub-domain routing and disaster recovery mode (TRA-RT/DR) and the Network Enabler route MDCs to the correct sub-domain. The TRA-RT/DR and Network Enabler can route between clusters to the active MATRIXX Engine.
  • MATRIXX Digital Commerce, Prometheus, and Fluentd are installed in separate Kubernetes clusters. An aggregated view of metrics and log data across clusters can be achieved by setting up federation in Prometheus and Fluentd.
  • ActiveMQ is used to send MDCs from the engine to the MATRIXX Gateways and Web Apps, and to route responses back to the engine.
    • Queues are used to separate messages per sub-domain.
    • Broker networks are highly available within the cluster, with at least 2 instances running.
    • ActiveMQ broker networks are not directly connected across clusters. Instead, the engine is configured to connect to all instances of ActiveMQ serving that sub-domain across all clusters. Gateway components only connect to the ActiveMQ broker network in the local cluster.
  • MongoDB is installed using the Kubernetes Operator for the reference architecture; a minimal replica set sketch is shown below. For more information, see the discussion about MongoDB.
    • The engine assumes MongoDB is deployed as a Highly Available (HA) cluster.
    • The Kubernetes Operator only works within a single cluster/namespace. For Geo-redundant setups, the Kubernetes operator must be installed on a Kubernetes distribution that can span multiple physical locations, such as Amazon Elastic Kubernetes Service (EKS) over multiple availability zones. If the Kubernetes distribution does not support this, MongoDB must be installed outside the cluster or a hosted solution (Atlas) can be used.
Important: Deploy MATRIXX Engine and Traffic Routing Agent pods in Kubernetes clusters running the Docker or CRI-O container runtime. Avoid the containerd runtime for these components.
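
As an illustration only, the following minimal sketch shows how such an HA replica set might be defined if the MongoDB Community Kubernetes Operator is used; the resource name, MongoDB version, user, and Secret names are placeholder assumptions, and the exact fields depend on the operator and version in use.

    # Hypothetical sketch: a three-member MongoDB replica set managed by the
    # MongoDB Community Kubernetes Operator. Names, version, user, and Secret
    # references are illustrative only.
    apiVersion: mongodbcommunity.mongodb.com/v1
    kind: MongoDBCommunity
    metadata:
      name: matrixx-mongodb                    # illustrative resource name
    spec:
      type: ReplicaSet
      members: 3                               # HA: one primary, two secondaries
      version: "6.0.5"                         # illustrative MongoDB version
      security:
        authentication:
          modes: ["SCRAM"]
      users:
        - name: matrixx-user                   # illustrative application user
          db: admin
          passwordSecretRef:
            name: matrixx-user-password        # Secret that holds the user password
          roles:
            - name: readWriteAnyDatabase
              db: admin
          scramCredentialsSecretName: matrixx-user-scram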

Geographical Redundancy

Cloud providers such as Amazon Web Services (AWS) make it possible to run a single Kubernetes cluster across multiple geographical locations. In AWS, these geographies are known as availability zones. Availability zones are data centers that balance being physically close enough to minimize latency with being far enough apart to reduce the risk that a fire, earthquake, or other disaster would affect every site hosting cluster components. This is not always available to customers, however, and it does not eliminate the risk of a Kubernetes cluster outage or a regional outage. (An AWS region is a collection of availability zones in a close geographical location.)

The reference architecture describes a deployment that spans multiple clusters, which are likely to be in different geographical locations, as in an AWS multi-region setup.

Ingress Load Balancing

A Kubernetes Ingress is configured for upstream communication with MATRIXX Gateways and Web Apps.

A hardware or software load balancer is required:

  • An IP range is defined that is independent of the node IP addresses.
  • Individual IP addresses are assigned to applications and components such as MATRIXX Engine, TRA RT/DR, Gateway Proxy, Network Enabler, and SBA Gateway.
  • For Proxy components, an Ingress Controller such as NGINX is required to route requests on HTTP ports to the correct application depending on the URL (see the example after this list).
  • For SBA Gateway, an Ingress Controller is required for HTTP/2 load-balancing.
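
As an illustration of the URL-based routing described above, the following is a minimal NGINX Ingress sketch; the host name, paths, Service names, and ports are assumptions, not MATRIXX defaults. For the SBA Gateway, a separate Ingress would be configured for HTTP/2 load balancing (for example, by setting the NGINX backend protocol to GRPC).

    # Hypothetical sketch: route HTTP requests to different MATRIXX applications
    # based on the request URL. Host, paths, Service names, and ports are
    # illustrative only.
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: mtx-proxy-ingress                  # illustrative name
    spec:
      ingressClassName: nginx                  # assumes the NGINX Ingress Controller
      rules:
        - host: gateway.cluster1.example.com   # illustrative host name
          http:
            paths:
              - path: /rsgateway               # illustrative path
                pathType: Prefix
                backend:
                  service:
                    name: rs-gateway           # illustrative Service name
                    port:
                      number: 8080             # illustrative port
              - path: /paymentgateway          # illustrative path
                pathType: Prefix
                backend:
                  service:
                    name: payment-gateway      # illustrative Service name
                    port:
                      number: 8080             # illustrative port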

MATRIXX Kubernetes Components

See the MATRIXX Digital Commerce online (HTML) documentation for diagrams that show the MATRIXX components required for a Kubernetes environment.

Figure 1 shows a low level component view of the MATRIXX Kubernetes Access Group.

Figure 1. MATRIXX Kubernetes Low Level Access Group Component View
Figure 2 shows a low level component view of the MATRIXX Network Enabler, Traffic Routing Agent (TRA), and Third-Party group.

Figure 2. MATRIXX Kubernetes Low Level NE, TRA, and Third-Party Component View

Figure 3 shows a low level component view of the MATRIXX Kubernetes Engine group.

Figure 3. MATRIXX Kubernetes Low Level Engine Component View

Figure 4 shows the Kubernetes components that work with, but are not part of, the MATRIXX Engine group.

Figure 4. MATRIXX Kubernetes Ancillary Component View

Communication Between Clusters

When deployed in a single cluster, the Engine Controller creates internal services that are front ends for the MATRIXX Engine clusters, for example proc-cls-s1e1, publ-cls-s1e1, and ckpt-cls-s1e1. The TRA RT/DR then connects to these processing and publishing services. When deployed in a multi-cluster setup, these services require external connectivity.

In Cluster 1:

  • Helm creates internal ClusterIP Services for engines within the local cluster: proc-cls-s1e1, publ-cls-s1e1.
  • Helm creates ExternalName-type Services for engines in the remote cluster: proc-cls-s1e2, publ-cls-s1e2.

The ExternalName service is configured with the domain name of the remote engine cluster, for example proc-s1e2.cluster2.example.com. This allows the TRA RT/DR to connect using Kubernetes Services, leaving the routing detail to be managed by Kubernetes configuration rather than TRA application configuration.
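
For example, the ExternalName Service that Helm creates in Cluster 1 for the remote processing engine might look like the following minimal sketch (the Service name and external domain name are taken from the examples above; any ports or labels added by the Helm chart are omitted).

    # Sketch of the ExternalName Service in Cluster 1 for the remote engine in
    # Cluster 2. Kubernetes returns a CNAME for the external name, so no
    # selector or cluster IP is involved.
    apiVersion: v1
    kind: Service
    metadata:
      name: proc-cls-s1e2                          # remote engine Service name
    spec:
      type: ExternalName
      externalName: proc-s1e2.cluster2.example.com # remote engine domain name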

In Cluster 2:

  • The Engine Controller creates services of type LoadBalancer. Due to Kubernetes limitations, two services are required for each of the Processing/Publishing engine clusters: one for TCP and one for UDP (see the sketch after this list).
  • The LoadBalancer-type Service links to an external IP address from the pool of IP addresses assigned to the load balancer.
  • A Central DNS service contains domain names for these IP addresses, such as proc-s1e2.cluster2.example.com.
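
The following minimal sketch shows roughly what such a TCP/UDP pair of LoadBalancer Services might look like; the Service names, selector labels, and port numbers are illustrative assumptions, as the actual definitions are generated by the Engine Controller.

    # Hypothetical sketch: many Kubernetes versions and cloud load balancers do
    # not allow TCP and UDP ports in a single LoadBalancer Service, so one
    # Service is created per protocol. Names, labels, and ports are illustrative.
    apiVersion: v1
    kind: Service
    metadata:
      name: proc-s1e2-tcp                      # illustrative name
    spec:
      type: LoadBalancer                       # external IP comes from the load balancer pool
      selector:
        app: proc-s1e2                         # illustrative pod label
      ports:
        - name: proc-tcp
          protocol: TCP
          port: 4800                           # illustrative port
          targetPort: 4800
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: proc-s1e2-udp                      # illustrative name
    spec:
      type: LoadBalancer
      selector:
        app: proc-s1e2                         # illustrative pod label
      ports:
        - name: proc-udp
          protocol: UDP
          port: 4800                           # illustrative port
          targetPort: 4800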

The same pattern is replicated for the Network Enabler, with ClusterIP and ExternalName services in Cluster 1 and LoadBalancer services in Cluster 2.

If an Access Group is also deployed in Cluster 2, Helm is used to create ClusterIP and ExternalName services in Cluster 2 that link to the LoadBalancer services in Cluster 1.

Figure 5 shows a high level view of the MATRIXX and third-party components required for a multi-cluster MATRIXX Digital Commerce environment.

Figure 5. Multi Cluster View