CAF Metrics
Prometheus can be used to monitor the CDR Aggregation Function (CAF) application and the Kafka cluster.
All metrics exposed by Kafka can be accessed through the Java Management Extensions (JMX) interface, and the Prometheus JMX Exporter makes them available to Prometheus. Both Kafka and CAF metrics can be retrieved from their JVMs.
To expose Kafka metrics, KAFKA_OPTS must be set to include the JMX Exporter as a javaagent. For example:
# Set KAFKA_OPTS
export KAFKA_OPTS="$KAFKA_OPTS -javaagent:${KAFKA_HOME}/libs/jmx_prometheus_javaagent-0.15.0.jar=localhost:9091:${KAFKA_HOME}/config/kafka-jmx-beans.yml"
# Start Kafka broker
${KAFKA_HOME}/bin/kafka-server-start ${KAFKA_HOME}/config/server.properties
In the example, kafka-jmx-beans.yml is used to configure the metrics exposed from the Kafka cluster. For more details about usage, see the discussion about Prometheus JMX Exporter at the Prometheus website.
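As a rough illustration of what kafka-jmx-beans.yml might contain, the sketch below writes a minimal JMX Exporter configuration file. The catch-all rule and the JMX_CONFIG_DIR variable are assumptions made for this example, not the configuration shipped with the product; a production file would normally list narrower patterns so that only the required MBeans are exported.

```shell
# Directory for the sketch config; JMX_CONFIG_DIR is a hypothetical variable.
# In the example above the file lives in ${KAFKA_HOME}/config.
JMX_CONFIG_DIR="${JMX_CONFIG_DIR:-$(mktemp -d)}"
JMX_CONFIG_FILE="${JMX_CONFIG_DIR}/kafka-jmx-beans.yml"

# Minimal JMX Exporter configuration (assumed content):
# a single catch-all rule that exports every MBean attribute it can read.
cat > "${JMX_CONFIG_FILE}" <<'EOF'
rules:
  - pattern: ".*"
EOF

echo "Wrote ${JMX_CONFIG_FILE}"
```

A catch-all rule is convenient for a first look at what the broker exposes, but it can produce a large number of series, so it is usually replaced by explicit patterns once the required metrics are known.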
Once the metrics have been reported to Prometheus, they can then be made available to Grafana dashboards by adding Prometheus as a data source. For more information about metrics exposed by Kafka using JMX, see Kafka metrics documentation on the Apache Kafka website.
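For the metrics to reach Prometheus, the server needs a scrape job that points at the exporter endpoint. The sketch below writes a minimal prometheus.yml for that purpose. The job name, scrape interval, and PROM_CONFIG_DIR variable are assumptions; only the localhost:9091 target comes from the KAFKA_OPTS example above.

```shell
# Directory for the sketch config; PROM_CONFIG_DIR is a hypothetical variable.
PROM_CONFIG_DIR="${PROM_CONFIG_DIR:-$(mktemp -d)}"
PROM_CONFIG_FILE="${PROM_CONFIG_DIR}/prometheus.yml"

# Minimal scrape configuration for the broker's JMX Exporter endpoint.
cat > "${PROM_CONFIG_FILE}" <<'EOF'
scrape_configs:
  - job_name: "kafka-broker"          # hypothetical job name
    scrape_interval: 15s              # assumed interval
    static_configs:
      - targets: ["localhost:9091"]   # address from the KAFKA_OPTS example
EOF

echo "Wrote ${PROM_CONFIG_FILE}"
```

Before adding the job, the endpoint can be checked by hand on the broker host with `curl -s http://localhost:9091/metrics`, which should return metrics in the Prometheus text format if the javaagent loaded correctly.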
CAF Custom Metrics describes available CAF Prometheus metrics.
Prometheus Metric Name | Description |
---|---|
kafka_stream_CDR_Aggregation_processedRequests_total{CDR_Aggregation_id="ChargingDataRequests"} | Total number of charging data requests processed by the CAF instance. |
kafka_stream_CDR_Aggregation_processedRequests_rate{CDR_Aggregation_id="ChargingDataRequests"} | Rate of charging data requests processed by the CAF instance per second. |
kafka_stream_CDR_Aggregation_chargingNotifyRequests_processed_total{CDR_Aggregation_id="ChargingNotifyRequests"} | Total number of charging notify requests processed by the CAF instance. |
kafka_stream_CDR_Aggregation_chargingNotifyRequests_processed_rate{CDR_Aggregation_id="ChargingNotifyRequests"} | Rate of charging notify requests processed by the CAF instance per second. |
kafka_stream_CDR_Aggregation_chargingDataRequests_duplicateCount_total{CDR_Aggregation_id="ChargingDataRequests"} | Total number of duplicate charging data request messages handled by the CAF instance. |
kafka_stream_CDR_Aggregation_chargingDataRequests_duplicateCount_rate{CDR_Aggregation_id="ChargingDataRequests"} | Rate of duplicate charging data request messages handled by the CAF instance per second. |
kafka_stream_CDR_Aggregation_sessionReleased_total | The total number of aggregate CDRs released by the CAF instance. This can be filtered using CDR_Aggregation_id for specific closure reasons. |
kafka_stream_CDR_Aggregation_sessionReleased_rate | Rate of aggregate CDRs released by the CAF instance per second. This can be filtered using CDR_Aggregation_id for specific closure reasons. |
kafka_stream_CDR_Aggregation_state_store_size | The size of the aggregated store in-memory, per task. Tags include: spring_id, task_id, thread_id, instance, and job. |
kafka_stream_CDR_Aggregation_state_store_deleted | The total number of records deleted from the state store, per task. Tags include: spring_id, task_id, thread_id, instance, and job. |
kafka_stream_CDR_Aggregation_state_store_duration | Time taken for iteration through the store when deleting old records or closing idle records, differentiated by process tag. Tags include: spring_id, task_id, thread_id, instance, job, and process. |
kafka_stream_CDR_Aggregation_state_store_thread_sleep_total | The total count of times the stream thread has been paused, differentiated by process tag. Tags include: spring_id, task_id, thread_id, instance, job, and process. |
While these metrics are available in the JMX console, to expose them to Prometheus, update the MTX_CAF_OPTS variable to include the Prometheus JMX Exporter.
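By analogy with the KAFKA_OPTS example, the update to MTX_CAF_OPTS might look like the sketch below. The JAR path, the configuration file name caf-jmx-beans.yml, and port 9404 are all hypothetical values chosen for illustration; the helper variables JMX_EXPORTER_JAR, CAF_JMX_CONFIG, and CAF_JMX_PORT are not part of the product.

```shell
# Hypothetical locations of the exporter JAR and its CAF configuration file;
# adjust both to match the installation.
JMX_EXPORTER_JAR="${JMX_EXPORTER_JAR:-/opt/jmx/jmx_prometheus_javaagent-0.15.0.jar}"
CAF_JMX_CONFIG="${CAF_JMX_CONFIG:-/opt/jmx/caf-jmx-beans.yml}"

# Assumed port for the CAF exporter, chosen to differ from the broker's 9091.
CAF_JMX_PORT="${CAF_JMX_PORT:-9404}"

# Append the javaagent to any options already present in MTX_CAF_OPTS.
export MTX_CAF_OPTS="${MTX_CAF_OPTS:-} -javaagent:${JMX_EXPORTER_JAR}=${CAF_JMX_PORT}:${CAF_JMX_CONFIG}"

echo "MTX_CAF_OPTS=${MTX_CAF_OPTS}"
```

Using a separate port for CAF keeps its endpoint distinct from the broker's when both JVMs run on the same host, so each can be scraped as its own target.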
Recommended Metrics for Grafana
CAF Dashboard Metrics describes the recommended metrics for CAF dashboards:
Source | Metric |
---|---|
CAF | kafka_stream_CDR_Aggregation_processedRequests_total, kafka_stream_CDR_Aggregation_processedRequests_rate, kafka_stream_CDR_Aggregation_chargingNotifyRequests_processed_total, kafka_stream_CDR_Aggregation_chargingNotifyRequests_processed_rate, kafka_stream_CDR_Aggregation_chargingDataRequests_duplicateCount_total, kafka_stream_CDR_Aggregation_chargingDataRequests_duplicateCount_rate, kafka_stream_CDR_Aggregation_sessionReleased_total, kafka_stream_CDR_Aggregation_sessionReleased_rate, kafka_stream_CDR_Aggregation_state_store_size, kafka_stream_CDR_Aggregation_state_store_deleted, kafka_stream_CDR_Aggregation_state_store_duration, kafka_stream_CDR_Aggregation_state_store_thread_sleep_total |
Kafka Client (Consumer) | kafka_consumer_fetch_manager_records_lag_max, kafka_consumer_fetch_manager_records_lag, kafka_consumer_fetch_manager_bytes_consumed_rate, kafka_consumer_fetch_manager_records_consumed_rate, kafka_consumer_fetch_manager_fetch_rate |
Kafka Client (Producer) | kafka_producer_record_send_total, kafka_producer_record_error_total, kafka_producer_record_retry_total, kafka_producer_request_latency_avg, kafka_producer_buffer_available_bytes |
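Before building a dashboard, it can help to confirm that the recommended metrics actually appear in an exporter's output. The sketch below defines a hypothetical helper, missing_metrics, that reads Prometheus text-format output on standard input and prints any of the named metrics it does not find. The sample input and its values are made up for the example.

```shell
# Print each metric name given as an argument that does not appear in the
# Prometheus text exposition read from stdin.
missing_metrics() {
  local scrape metric
  scrape="$(cat)"
  for metric in "$@"; do
    # A sample line starts with the metric name followed by '{' or a space.
    if ! printf '%s\n' "$scrape" | grep -Eq "^${metric}([{ ]|\$)"; then
      printf '%s\n' "$metric"
    fi
  done
}

# Sample input standing in for exporter output (values are made up).
sample='kafka_stream_CDR_Aggregation_processedRequests_total{CDR_Aggregation_id="ChargingDataRequests"} 42
kafka_consumer_fetch_manager_records_lag_max{client_id="example"} 0'

printf '%s\n' "$sample" | missing_metrics \
  kafka_stream_CDR_Aggregation_processedRequests_total \
  kafka_consumer_fetch_manager_records_lag_max \
  kafka_producer_record_send_total
# prints: kafka_producer_record_send_total
```

Against a live endpoint the same helper could be fed from curl, for example `curl -s http://localhost:9091/metrics | missing_metrics kafka_producer_record_send_total`, assuming the exporter address from the KAFKA_OPTS example.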