sFTP Sink Microservice Deployment

MATRIXX uses the sFTP Sink microservice to upload charging data records (CDRs) to sFTP servers. These CDRs are first converted from JSON to ASN.1 (ASCII) format. sFTP Sink reads these ASN.1 encoded messages from an Apache Kafka topic. Records are aggregated in a local cache, and each record is pre-pended with a header. Once a certain file size is reached, a file header is pre-pended to the aggregated records, and the file is uploaded to an sFTP server for processing.
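The aggregate-then-upload behavior described above can be pictured with a minimal sketch. This is an illustration only, not the microservice's implementation: the class name `FileAggregator`, the `max_file_bytes` threshold, and the `upload` callback are all hypothetical, and the header handling is left out.

```python
from typing import Callable, List


class FileAggregator:
    """Buffers encoded records and flushes them as one file at a size threshold."""

    def __init__(self, max_file_bytes: int, upload: Callable[[bytes], None]) -> None:
        self.max_file_bytes = max_file_bytes  # hypothetical threshold setting
        self.upload = upload                  # stand-in for the sFTP upload step
        self.records: List[bytes] = []
        self.buffered_bytes = 0

    def add(self, record: bytes) -> None:
        """Cache one record and flush once the buffered size reaches the threshold."""
        self.records.append(record)
        self.buffered_bytes += len(record)
        if self.buffered_bytes >= self.max_file_bytes:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered records to the upload step as a single file."""
        if not self.records:
            return
        self.upload(b"".join(self.records))
        self.records = []
        self.buffered_bytes = 0


# Usage: collect "uploaded" files in a list instead of contacting a server.
uploaded: List[bytes] = []
aggregator = FileAggregator(max_file_bytes=10, upload=uploaded.append)
for record in (b"aaaa", b"bbbb", b"cccc", b"dd"):
    aggregator.add(record)
```

In this run the third record pushes the buffer past the threshold, so one file is handed to the upload step and the fourth record stays cached for the next file.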

The sFTP Sink microservice can also be deployed as n+k, but this depends on your requirements for the CDR file. If the output must be one set of files containing all sessions, then one instance of sFTP Sink is run and vertically scaled. Because this service consumes from Apache Kafka, it does not matter if a single process stops and is replaced a short time later. If multiple instances of the microservice are run in parallel, the sessions are distributed across more than one set of files. Each session is still processed by a single Sink instance, because the session is used as the topic key. This means that the records for a session are aggregated together. Each Sink instance acts independently, however, and uploads the records for its particular allocated partitions.
Note: The number of active Sink instances depends on the number of topic partitions. If there are more Sink instances than partitions, some Sink instances remain idle. Also, in a cloud native environment you cannot configure more pods than Apache Kafka topics.
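The relationship between session keys, partitions, and Sink instances can be sketched as follows. The sketch makes two labeled simplifications: it uses CRC32 as a stand-in hash (Kafka's default partitioner uses murmur2), and it assigns partitions round-robin, whereas the real assignment is made by the Kafka consumer group. The function and instance names are hypothetical.

```python
import zlib
from typing import Dict, List


def partition_for(session_id: str, num_partitions: int) -> int:
    """Map a session key to a partition; the same key always maps to the same partition."""
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(session_id.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, instances: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over instances; each partition has exactly one owner."""
    # Simplified round-robin; the real assignment is decided by the consumer group.
    assignment: Dict[str, List[int]] = {name: [] for name in instances}
    for partition in range(num_partitions):
        assignment[instances[partition % len(instances)]].append(partition)
    return assignment


# Three partitions shared by four hypothetical Sink instances.
assignment = assign_partitions(3, ["sink-0", "sink-1", "sink-2", "sink-3"])
idle = [name for name, partitions in assignment.items() if not partitions]
print(assignment)
print(idle)
```

With three partitions and four instances, one instance receives no partitions and stays idle, which is the situation the note describes.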

Figure 1 shows the typical sFTP Sink flow.

Figure 1. sFTP Sink Flow
ASN.1 encoded messages are pre-pended with a header and uploaded from Kafka topics to sFTP servers for processing.

The microservice is provided in the asn1-kafka-sftp image. The configuration filename is asn1-sftp-sink.yaml. For information about sFTP Sink configuration, see the discussion about ASN.1 sFTP Sink application configuration.

CDR File Format Specification

The files produced by this solution contain binary records. Each record has a header conforming to section 6.1.2, CDR header format, of the 3GPP TS 32.297 V15.4.0 (2019-09) technical specification. Each file has a header conforming to section 6.1.1 of the same specification.

Figure 2 shows how sFTP Sink prepends a header to each ASN.1 encoded message to prepare them for uploading to sFTP servers.

Figure 2. Formatting of ASN.1 Encoded Messages
Sink file headers prepended to ASN.1 encoded messages for uploads to sFTP servers.
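The overall layout, one file header followed by header-prefixed records, can be sketched as below. The headers here are placeholders that carry only length and count fields. They do not reproduce the field layouts defined in sections 6.1.1 and 6.1.2 of 3GPP TS 32.297, and the function names are hypothetical.

```python
import struct
from typing import Iterable

FILE_HEADER_SIZE = 8  # placeholder size, not the size defined by the specification


def frame_record(payload: bytes) -> bytes:
    """Prefix one encoded record with a placeholder record header."""
    # Placeholder record header: 2-byte big-endian payload length only.
    return struct.pack(">H", len(payload)) + payload


def build_file(payloads: Iterable[bytes]) -> bytes:
    """Concatenate framed records behind a placeholder file header."""
    framed = [frame_record(p) for p in payloads]
    body = b"".join(framed)
    # Placeholder file header: 4-byte total file length, 4-byte record count.
    header = struct.pack(">II", FILE_HEADER_SIZE + len(body), len(framed))
    return header + body


data = build_file([b"abc", b"de"])
```

A reader of such a file would first parse the file header, then repeatedly read a record header to learn how many payload bytes follow.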

Topic Routing

Topic routing allows records received for one Apache Kafka topic to be processed apart from records received for another topic.

Figure 3 shows this routing flow.

Figure 3. Sink Topic Routing
Records from Kafka topics are routed to the sFTP server.

Each set of records can be uploaded to a distinct directory on the sFTP server. Each route corresponds to one topic, and file aggregation and file statistics are kept separate from the other routes. This means that the File Sequence Number and Running Count are distinct for each route.
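Per-route state can be pictured as one record of counters per topic. The sketch below is illustrative only: the class names, topic names, and directories are hypothetical, and the rule that both counters advance by one per closed file is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RouteState:
    remote_dir: str                # hypothetical upload directory for this route
    file_sequence_number: int = 0
    running_count: int = 0


class RouteTable:
    """Keeps counters for each route separately, keyed by Kafka topic name."""

    def __init__(self, routes: Dict[str, RouteState]) -> None:
        self.routes = routes

    def close_file(self, topic: str) -> RouteState:
        """Advance only the counters of the route that owns this topic."""
        state = self.routes[topic]
        # Assumption: both counters increase by one for every closed file.
        state.file_sequence_number += 1
        state.running_count += 1
        return state


table = RouteTable({
    "cdr-voice": RouteState(remote_dir="/upload/voice"),
    "cdr-data": RouteState(remote_dir="/upload/data"),
})
table.close_file("cdr-voice")
table.close_file("cdr-voice")
table.close_file("cdr-data")
```

After these calls the first route has closed two files and the second route one, and neither route's counters are affected by the other.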

Routing requires one or more routes to be defined. One of those routes must be the default route. The default route is used when upgrading from prior versions of MATRIXX that do not have routing. If records are in the file buffer at the time of an upgrade, the current statistics are transferred to the default route, and the buffer is closed. The closed file is then uploaded to the remote path defined for the default route.
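The default-route rule and the upgrade hand-over can be sketched as follows. This is a hedged illustration, not the product's upgrade logic: the `Route` fields and function names are hypothetical, and treating "exactly one default route" as the validation rule is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Route:
    topic: str
    remote_path: str
    is_default: bool = False
    file_sequence_number: int = 0
    running_count: int = 0


def find_default_route(routes: List[Route]) -> Route:
    """Return the default route, rejecting configurations without a single default."""
    defaults = [route for route in routes if route.is_default]
    # Assumption: exactly one route may be marked as the default.
    if len(defaults) != 1:
        raise ValueError("exactly one route must be marked as default")
    return defaults[0]


def transfer_legacy_stats(routes: List[Route], sequence_number: int, running_count: int) -> Route:
    """Move pre-routing statistics onto the default route during an upgrade."""
    target = find_default_route(routes)
    target.file_sequence_number = sequence_number
    target.running_count = running_count
    return target


routes = [
    Route(topic="cdr-main", remote_path="/upload/main", is_default=True),
    Route(topic="cdr-extra", remote_path="/upload/extra"),
]
migrated = transfer_legacy_stats(routes, sequence_number=41, running_count=41)
```

Here the statistics carried over from the pre-routing deployment land on the default route, and the other route starts from zero.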

For information about routing and sFTP server configuration, see the discussion about routing and sFTP server configuration.