MOD_RANGE_GLOBAL Record Recovery Mode

MOD_RANGE_GLOBAL mode accepts input of global start and end timestamps for all partitions and generates recovery files for all records of each partition. Configure the file number by specifying the node ID, file sequence number (FSN), and running count/file issued number (RC).

MOD_RANGE_GLOBAL Configuration Properties describes the configuration properties used to specify the global timestamp range and the file number (node ID, FSN, and RC).

Table 1. MOD_RANGE_GLOBAL Configuration Properties
Property Description
sftp.internal.recovery.modRangeGlobal.startTimestamp Starting timestamp used for each Kafka topic/partition.
sftp.internal.recovery.modRangeGlobal.endTimestamp Ending timestamp used for each Kafka topic/partition.
sftp.internal.recovery.modRangeGlobal.nodeID The node or pod ID. Can be overridden.
sftp.internal.recovery.modRangeGlobal.fsn The file sequence number for a specific route.
sftp.internal.recovery.modRangeGlobal.rc The running count of files issued by the container/pod, starting with 1.
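
The timestamp values in the example configuration are 13-digit integers, which is consistent with milliseconds since the Unix epoch, the unit that the Kafka offsetsForTimes function expects. Assuming that unit, the following Python sketch converts between a UTC date and time and a value suitable for startTimestamp or endTimestamp. The helper names to_epoch_millis and from_epoch_millis are hypothetical and are not part of the product.

```python
from datetime import datetime, timedelta, timezone

# Reference point for epoch-based timestamps (assumed unit: milliseconds).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(dt):
    """Convert a timezone-aware datetime to whole milliseconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    # Integer division of timedeltas avoids floating-point rounding.
    return (dt - EPOCH) // timedelta(milliseconds=1)


def from_epoch_millis(ms):
    """Convert milliseconds since the epoch back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


if __name__ == "__main__":
    start = datetime(2022, 7, 26, 12, 43, 55, 466000, tzinfo=timezone.utc)
    print(to_epoch_millis(start))            # value for startTimestamp
    print(from_epoch_millis(1658839435466))  # check an existing value
```

Converting a configured value back to a date in this way is a quick check that the start and end timestamps bracket the intended recovery window.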

This mode runs as a Kubernetes job specified in asn1-kafka-sftp-recovery.yaml. Configuration is specified in the ConfigMap section. The following asn1-kafka-sftp-recovery.yaml example is configured for MOD_RANGE_GLOBAL mode:

kind: ConfigMap
apiVersion: v1
metadata:
  name: sink-recovery-config
  namespace: matrixx
data:
  asn1-sftp-sink.yaml: |-
    kafka:
      configuration:
        bootstrap.servers: mtx-kafka-cp-kafka.confluent.svc.cluster.local:9092
    sftp:
      remote:
        user: mtxsftpuser
        password: mtx123
        host: sftp.sftp.svc.cluster.local
        port: 22
      internal:
        buffer:
          path: "/opt/mtx/recovery/buffer"
        uploader:
          path: "/opt/mtx/recovery/internal_recovery"
             
        recovery:
          enabled: true
          path: "/opt/mtx/recovery"
          modRangeGlobal:
            startTimestamp: 1658839435466
            endTimestamp: 1658839435466
            nodeID: 1
            fsn: 0
            rc: 1
        condition:
          always:
            enabled: true
        routes:
        - topic: cdr-asn1-1
          path: "upload/uploads-1"
          isdefault: true
        - topic: cdr-asn1-2
          path: "upload/uploads-2"

MOD_RANGE_GLOBAL mode does the following:

  1. Collects information about each topic and partition from Kafka using the partitionsFor function.
  2. Retrieves the start and end offsets for each topic and partition from Kafka by passing the provided timestamps to the offsetsForTimes function. Missing offsets are ignored.
  3. Reprocesses the records from the collected partitions, between the start and end offsets, using the seek function. Recovery files containing the records from one or more topics are generated according to the configured file closure conditions. For more information, see the discussion about file closure control properties and conditions.
  4. When all records between the start and end timestamps are reprocessed, the job stops.
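
The steps above can be sketched with a small in-memory model in Python. This is an illustration only, not the product's implementation. Each partition is a list of (timestamp, value) pairs in offset order, and offset_for_time mimics the documented behavior of Kafka's offsetsForTimes, which returns the earliest offset whose timestamp is greater than or equal to the requested one, or nothing if no such record exists. How missing offsets are handled here is an assumption: a partition with no start offset is skipped, and a missing end offset means reading to the end of the partition. The function names and sample data are hypothetical.

```python
from bisect import bisect_left


def offset_for_time(timestamps, target_ms):
    """Earliest offset whose timestamp is >= target_ms, or None if there is none.

    Assumes offsets start at 0 and timestamps are in ascending order.
    """
    idx = bisect_left(timestamps, target_ms)
    return idx if idx < len(timestamps) else None


def select_records(partitions, start_ms, end_ms):
    """Return the values to reprocess for each (topic, partition) key."""
    selected = {}
    for tp, records in partitions.items():              # step 1: every partition
        timestamps = [ts for ts, _ in records]
        start = offset_for_time(timestamps, start_ms)   # step 2: offset lookup
        end = offset_for_time(timestamps, end_ms)
        if start is None:
            continue            # assumption: no start offset, skip the partition
        if end is None:
            end = len(records)  # assumption: no end offset, read to the end
        # step 3: "seek" to start and read up to, but not including, end
        selected[tp] = [value for _, value in records[start:end]]
    return selected             # step 4: nothing left to reprocess


if __name__ == "__main__":
    partitions = {
        ("cdr-asn1-1", 0): [(100, "a"), (200, "b"), (300, "c")],
        ("cdr-asn1-2", 0): [(50, "x")],
    }
    print(select_records(partitions, 150, 300))
```

In this model the end offset is exclusive, so a record whose timestamp equals the end timestamp is not reprocessed. Whether the recovery job treats the end timestamp as inclusive is not stated here and should be verified before relying on boundary records.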