performance_analysis_event_server.py
This script analyzes mtx_event_streamer-nn.log files for Event Stream Server performance, estimating processing latencies and first delivery times for events. This script produces CSV files and corresponding graphs (PNG files) from the CSV data to aid in analysis.
Note: The default log pattern used by the script is mtx_event_streamer-nn.log, so both compressed and uncompressed log files (for example, mtx_event_streamer-92.log.gz) are read as input to the script.
The CSV files produced by this script are:
- Aggregated_EPS_events_data.csv: Provides first delivery times and counts for events delivered from Apache Kafka to Event Stream Server. Data includes:
  - Timestamp: Timestamp for each collection of data.
  - retrieved: Global Transaction Counter (GTC) at each timestamp retrieved from Kafka.
  - delivered: GTC at each timestamp delivered to Event Stream Server.
  - num_of_events: Number of events processed at each timestamp.
  - EPS_Retrieved: Events per second (EPS) at each timestamp retrieved from Kafka.
  - EPS_Delivered: EPS at each timestamp delivered to Event Stream Server.
- Events_latency_estimation.csv: Provides Event Stream Server performance data. Data includes:
  - Timestamp: Timestamp for each collection of data.
  - count: GTC at each timestamp retrieved from Kafka.
  - cum_waiting_time_sec: Cumulative waiting time in seconds at each timestamp between delivered and retrieved events.
  - avg_time_sec: Average time in seconds at each timestamp between delivered and retrieved events. This is the cumulative waiting time (cum_waiting_time_sec) divided by the count (GTC) at each timestamp.
- Kafka_delivery_times_start time-->end time.csv: Provides performance data for events delivered from Kafka to Event Stream Server throughout a rolling time window. Data includes:
  - Timestamp: Timestamp for each collection of data.
  - avg_ms_to_first_delivery: Average number of milliseconds at each timestamp until the first delivery of events. This is the total milliseconds (total_ms) divided by the count.
  - total_ms: Total time delay in milliseconds between each delivery and retrieval timestamp.
  - count: Count of events delivered during each interval.
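The two derived columns described above (avg_time_sec and avg_ms_to_first_delivery) are plain ratios. The following is a minimal sketch of that arithmetic, not the script's actual code. The function names are hypothetical, and returning 0 when the count is 0 is an assumption, although it is consistent with the zero-count rows in the sample output later in this topic.

```python
def avg_time_sec(cum_waiting_time_sec: float, count: int) -> float:
    """cum_waiting_time_sec divided by count (GTC) for one timestamp."""
    # Assumption: an interval with no events reports 0 instead of failing.
    return cum_waiting_time_sec / count if count else 0.0


def avg_ms_to_first_delivery(total_ms: float, count: int) -> float:
    """total_ms divided by count for one timestamp."""
    # Same zero-count assumption as above.
    return total_ms / count if count else 0.0


if __name__ == "__main__":
    # total_ms and count values taken from the Kafka delivery times sample rows.
    for total_ms, count in [(195, 1), (30, 3), (1, 3)]:
        print(total_ms, count, avg_ms_to_first_delivery(total_ms, count))
```

Applied to the sample rows, a total_ms of 30 with a count of 3 gives an average of 10 ms.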
Syntax
performance_analysis_event_server.py [-f] [-s] [-o] [-e] [-d] [-i] [-r]
Options
The performance_analysis_event_server.py script has the following command line options:
- -f, --file
- The input file pattern. The default value is mtx_event_streamer-nn.log.
- -d, --dir
- The directory in which to find the logs and to which to save CSV and PNG graph output files. The default value is the current directory.
- -o, --out
- Saves output to CSV and PNG files.
- -s, --start
- The start time (in the local time zone) of the window for which to analyze log files. If this option is not specified, all log files are parsed. The accepted time formats are: YYYY-MM-DD HH:MM:SS.SSSSSS, YYYY-MM-DD HH:MM:SS, and YYYY-MM-DD HH:MM. Example: 2021-06-16T03:15.
- -e, --end
- The end time (in the local time zone) of the window for which to analyze log files. If this option is not specified, all log files are parsed. The accepted time formats are: YYYY-MM-DD HH:MM:SS.SSSSSS, YYYY-MM-DD HH:MM:SS, and YYYY-MM-DD HH:MM. Example: 2021-06-16T03:15.
- -i, --interval
- The time interval, in seconds, for which to aggregate each collection of data. The default value is 60.
- -r, --rolling_avg_window
- The window size, in periods, of the rolling average used when plotting the CSV data. The default value is 5, which plots a five-period rolling average of the data.
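To illustrate the --start/--end time formats and the --rolling_avg_window option, here is a minimal sketch in Python. It is not taken from the script: the names ACCEPTED_FORMATS, parse_window_time, and rolling_average are hypothetical, and the trailing window that shortens at the start of the series is an assumption about how the rolling average is computed. Only the three space-separated formats listed above are tried, so a value written with a T separator is rejected by this sketch.

```python
from datetime import datetime

# The three formats listed for -s/--start and -e/--end, most specific first.
ACCEPTED_FORMATS = (
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d %H:%M",
)


def parse_window_time(text: str) -> datetime:
    """Return a naive (local time) datetime for a start or end value."""
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next, less specific format
    raise ValueError(f"unsupported time format: {text!r}")


def rolling_average(values, window=5):
    """Trailing rolling mean; early points average over fewer periods."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


if __name__ == "__main__":
    print(parse_window_time("2021-06-16 03:15"))
    print(rolling_average([1, 2, 3, 4, 5, 6], window=5))
```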
Aggregated EPS Event Data
The following is an example of the Aggregated_EPS_events_data.csv file. In this example, the specified interval between timestamps for each collection of data is 60 seconds:
Timestamp        retrieved  delivered  num_of_events  EPS_Retrieved  EPS_Delivered
8/29/2023 21:08  3667       0          1              61.117         0
8/29/2023 21:09  0          3667       1              0              61.117
8/29/2023 21:10  0          0          0              0              0
8/29/2023 21:11  5144       661        4              85.733         11.017
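The EPS columns in this sample are consistent with dividing the retrieved or delivered value by the interval length (60 seconds here) and rounding to three decimal places. The sketch below reproduces that arithmetic. The function name is hypothetical, and the formula is inferred from the sample values rather than taken from the script.

```python
def events_per_second(gtc_value: int, interval_sec: int = 60) -> float:
    """Events per second for one interval, rounded to three decimals.

    Assumption: EPS = retrieved (or delivered) value / interval in seconds.
    """
    return round(gtc_value / interval_sec, 3)


if __name__ == "__main__":
    # retrieved and delivered values from the sample rows above
    for value in (3667, 5144, 661, 0):
        print(value, events_per_second(value))
```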
Figure 1 shows an example graph that the script generates from the Aggregated_EPS_events_data.csv file.
Events Latency Estimation
The following is an example of the Events_latency_estimation.csv file:
Timestamp        count  cum_waiting_time_sec  avg_time_sec
8/29/2023 21:05  0      00:02.0               0
8/29/2023 21:06  0      00:07.0               0
8/29/2023 21:07  3667   00:07.4               0.002005181
8/29/2023 21:08  0      00:04.0               0
Next is an example graph (PNG image) that the script generates from the Events_latency_estimation.csv file.
Kafka Delivery Times
The following is an example of the Kafka_delivery_times_start time-->end time.csv file:
Timestamp        avg_ms_to_first_delivery  total_ms  count
8/29/2023 21:08  195                       195       1
8/29/2023 21:11  10                        30        3
8/29/2023 21:12  0.333333333               1         3
Next is an example graph (PNG image) that the script generates from this file: