Analyze Queue Latencies

Use the analyze_mtx_debug_log.py script to analyze the queue statistics in the mtx_debug.log file and aid in troubleshooting issues with transaction performance. Queue statistics are collected for Diameter Gateway, MDC Gateway, Charging Server, and Transaction Server.

About this task

Queue statistics are collected for shared memory, local memory, and database memory queues. A time stamp is recorded in an incoming message as it enters each MATRIXX process queue, so latencies can be tracked and issues identified more easily as the message flows through the server. For messages that time out during processing, the time stamps, along with the time each server took to process the message, are written to the system log when the server shuts down. You can analyze all queues, only the queues that have experienced timeouts, or only those that have errors written to a specified log file. You can also run the analyze_mtx_debug_log.py script in debug mode to return more details about the queue statistics.
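As an illustration, the latency contributed by each hop can be derived by subtracting consecutive queue-entry time stamps. The following Python sketch shows the general idea only; the queue names, time stamp format, and values are hypothetical and do not reflect the actual mtx_debug.log format or the internals of analyze_mtx_debug_log.py.

    from datetime import datetime

    # Hypothetical queue-entry time stamps for one message as it passes through
    # MATRIXX process queues (not the real mtx_debug.log format).
    queue_entries = [
        ("diameter_gateway", "2024-01-01 10:00:00.000100"),
        ("transaction_server", "2024-01-01 10:00:00.000450"),
        ("charging_server", "2024-01-01 10:00:00.001200"),
    ]

    FMT = "%Y-%m-%d %H:%M:%S.%f"

    # The latency for each hop is the difference between consecutive entry times.
    for (prev_q, prev_ts), (next_q, next_ts) in zip(queue_entries, queue_entries[1:]):
        delta = datetime.strptime(next_ts, FMT) - datetime.strptime(prev_ts, FMT)
        print(f"{prev_q} -> {next_q}: {delta.total_seconds() * 1000:.3f} ms")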

For more information about the MATRIXX environment variables, see the discussion about container directories and environment variables in MATRIXX Installation and Upgrade.

Procedure

Type one of the following commands to run the analyze_mtx_debug_log.py script. You can run each command with the -d option to return more details about the queues analyzed; to do so, append -d 1 to the analyze_mtx_debug_log.py command, as shown in the example after this list.
  • To analyze all queues for all MATRIXX processes on a server: kubectl exec -it engine_pod_name -n matrixx -- bash --login -c "run_cmd_on_blade.py -b serverId "analyze_mtx_debug_log.py -a""
  • To analyze all queues for all MATRIXX servers in a cluster: kubectl exec -it engine_pod_name -n matrixx -- bash --login -c "run_cmd_on_blade.py -c clusterId "analyze_mtx_debug_log.py -a""
  • To analyze all queues in a specified log file: kubectl exec -it engine_pod_name -n matrixx -- bash --login -c "analyze_mtx_debug_log.py -f ${MTX_LOG_DIR}/log_filename"
  • To analyze all queues on a server that have experienced a timeout equal to or less than the specified number of seconds: kubectl exec -it engine_pod_name -n matrixx -- bash --login -c "run_cmd_on_blade.py -b serverId "analyze_mtx_debug_log.py -t number_of_seconds""
  • To analyze all queues in a cluster that have experienced a timeout equal to or less than the specified number of seconds: kubectl exec -it engine_pod_name -n matrixx -- bash --login -c "run_cmd_on_blade.py -c clusterId "analyze_mtx_debug_log.py -t number_of_seconds""
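For example, to analyze all queues on a server and return debug-level detail, the first command with -d 1 appended might look like this (engine_pod_name and serverId are placeholders for your engine pod name and server ID):

  kubectl exec -it engine_pod_name -n matrixx -- bash --login -c "run_cmd_on_blade.py -b serverId "analyze_mtx_debug_log.py -a -d 1""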

Results

Depending on the queue status, the script might suggest possible solutions to issues it discovers. For example, if messages are timing out in a Diameter Gateway queue, the script suggests that you check the Diameter client's configuration, socket configuration, or incoming message rate. The message rate might need throttling so Diameter Gateway can handle the load.