inspect_engine_metrics.py
The inspect_engine_metrics.py script analyzes engine metrics against user-defined boundaries. It can be run inside a MATRIXX pod or in a Kubernetes cluster outside of a MATRIXX pod.
Syntax
The inspect_engine_metrics.py script has the following options, in addition to the standard engine command line options.
inspect_engine_metrics.py [--first-run] | [--get-labels] | [--calculate] --create-pod
Where --create-pod is only used if the script is being run in a Kubernetes cluster (outside a MATRIXX pod).
Options
Note: One of these three options is required: --first-run, --get-labels, or --calculate.
- --first-run
- Creates a default metrics_config.csv file with commonly used metrics and services. It collects the available labels for the specified pod and metric combinations from Prometheus and prompts you to enter the minimum, maximum, and upper limit percentage values you want for each metric.
- --get-labels
- Prepares a CSV file with all available labels for the specified pod and metric combinations from Prometheus. It allows you to update the metrics_config.csv file with the labels you want.
- --calculate
- Performs the following steps:
  - Fetches metric values from Prometheus based on the metrics_config.csv file.
  - Calculates the status of each metric (ABOVE_MAX, BELOW_MIN, or IN_BOUNDS) based on user-defined boundaries.
  - Outputs the metrics that are outside the defined boundaries (use the --verbose flag to print all metrics).
  - Updates the metrics_config.csv file with metric values and status.
  - Creates a separate CSV file, all_remaining_metrics.csv, with all available metrics not included in the metrics_config.csv file.
- --dir
- Defines the directory where metrics_config.csv is stored. The default is the current working directory.
- --config-csv
- Specifies the name of the metrics CSV file. The default is metrics_config.csv.
- --verbose
- When set to true, prints metrics of all statuses, not just those outside the defined boundaries. The default is false (disabled).
- --prom-svc-dns-name
- The fully resolved Prometheus server domain name. The default is prometheus-kube-prometheus-prometheus.monitoring.svc.cluster.local.
- --prom-svc-port
- The Prometheus server port. The default is 9090.
- --create-pod
- When set to true, creates a new pod that downloads from the Prometheus server, saving the data outside of any pod. The default value is false.
- --temp-pod-name
- Name of the pod to create for downloading from the Prometheus server. The default is temp-pod.
- --namespace
- The Kubernetes namespace to use for creating the pod. The default is matrixx.
- --kube-config-path
- The path to the Kubernetes configuration file. The default is ~/.kube/config.
- --image-name
- The name of the image to use for the pod.
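The fetch and status steps described for --calculate can be sketched as follows. This is a minimal illustration, not the script's actual code: the function names `prometheus_query_url` and `classify` are hypothetical, the use of plain HTTP and the Prometheus instant-query endpoint is an assumption about how the script talks to Prometheus, and treating the minimum and maximum as inclusive is also an assumption. The upper limit percentage is not modeled here because its exact meaning is not stated above.

```python
from urllib.parse import urlencode

# Defaults copied from the --prom-svc-dns-name and --prom-svc-port descriptions.
DEFAULT_PROM_DNS = "prometheus-kube-prometheus-prometheus.monitoring.svc.cluster.local"
DEFAULT_PROM_PORT = 9090


def prometheus_query_url(promql, host=DEFAULT_PROM_DNS, port=DEFAULT_PROM_PORT):
    """Build a Prometheus HTTP API instant-query URL.

    Assumption: the server is reached over plain HTTP.
    """
    return f"http://{host}:{port}/api/v1/query?{urlencode({'query': promql})}"


def classify(value, minimum, maximum):
    """Return ABOVE_MAX, BELOW_MIN, or IN_BOUNDS for a metric value.

    Assumption: values equal to a boundary count as IN_BOUNDS.
    """
    if value > maximum:
        return "ABOVE_MAX"
    if value < minimum:
        return "BELOW_MIN"
    return "IN_BOUNDS"


if __name__ == "__main__":
    # "up" is a standard Prometheus metric, used here only as a placeholder.
    print(prometheus_query_url("up"))
    for sample in (5.0, 50.0, 500.0):
        print(sample, classify(sample, minimum=10.0, maximum=100.0))
```

With boundaries of 10 and 100, the three sample values fall below, inside, and above the range respectively, which mirrors the three statuses the script reports.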
Usage
The inspect_engine_metrics.py script is included in the mtx-debug-kit image and can be attached to a running pod.
Attach the image to a running pod (publ-s1e1-0 in the example below):
``` bash
kubectl debug -it -c debugger \
--target ctr-1 \
--image your-image-repository/mtx-debug-kit:1.0.0 \
publ-s1e1-0
```
Where your-image-repository is the URL of your image repository.
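If you attach the debug image to several pods, it may help to generate the command rather than edit it by hand. The sketch below only assembles the same kubectl debug command shown above and does not run it. The helper name `build_debug_command` is hypothetical; the container name, target, and image tag defaults are taken from the example and may differ in your deployment.

```python
import shlex


def build_debug_command(pod, image_repository, target="ctr-1",
                        container="debugger", tag="1.0.0"):
    """Return the kubectl debug command as an argument list.

    Defaults mirror the example above and are assumptions for other setups.
    """
    image = f"{image_repository}/mtx-debug-kit:{tag}"
    return [
        "kubectl", "debug", "-it",
        "-c", container,
        "--target", target,
        "--image", image,
        pod,
    ]


if __name__ == "__main__":
    # Print the command for review; pass the list to subprocess.run to execute it.
    print(shlex.join(build_debug_command("publ-s1e1-0", "your-image-repository")))
```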