kafka retention period setting - unix1998/technical_notes GitHub Wiki

A Kafka topic's retention period is usually set as a topic-level configuration rather than directly in the server.properties file. The server.properties file can, however, define broker-wide defaults that apply to every topic that does not have its own override.

Setting Retention Period

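  1. Topic-Level Configuration: The retention period can be set for each topic individually using the kafka-configs.sh command or through the Kafka Admin API. (When connecting with --bootstrap-server, kafka-topics.sh accepts --config only together with --create, not with --alter, so an existing topic is changed with kafka-configs.sh.) For example:

    kafka-configs.sh --bootstrap-server localhost:9092 --alter --entity-type topics --entity-name my_topic --add-config retention.ms=604800000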
    

    In this example, the retention period is set to 7 days (604,800,000 milliseconds) for the topic my_topic.

  2. Broker-Level Default Configuration (server.properties): You can define default retention settings in the server.properties file, which will apply to all topics unless explicitly overridden. The relevant configuration parameters include:

    • log.retention.hours: The default retention period for all topics in hours. For example, setting log.retention.hours=168 means that the logs will be retained for 7 days by default; 168 is also Kafka's built-in default.
    • log.retention.minutes: The default retention period for all topics in minutes. If set, it takes precedence over log.retention.hours.
    • log.retention.ms: The default retention period for all topics in milliseconds. If set, it takes precedence over both log.retention.minutes and log.retention.hours.

    Here is an example configuration in server.properties:

    log.retention.hours=168
    

    This setting will retain logs for 7 days by default.
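
The unit conversion and the precedence among the three broker properties can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `days_to_ms` and `broker_default_retention_ms` are hypothetical (they are not part of Kafka), and the properties are assumed to be already parsed into a plain dictionary.

```python
MS_PER_MINUTE = 60 * 1000
MS_PER_HOUR = 60 * MS_PER_MINUTE
MS_PER_DAY = 24 * MS_PER_HOUR


def days_to_ms(days):
    """Convert days to the millisecond value that retention.ms expects."""
    return int(days * MS_PER_DAY)


def broker_default_retention_ms(props):
    """Resolve the broker-wide default retention from server.properties-style keys.

    Precedence: log.retention.ms, then log.retention.minutes, then
    log.retention.hours. If none is present, fall back to 168 hours,
    the default value of log.retention.hours.
    """
    if "log.retention.ms" in props:
        return int(props["log.retention.ms"])
    if "log.retention.minutes" in props:
        return int(props["log.retention.minutes"]) * MS_PER_MINUTE
    return int(props.get("log.retention.hours", 168)) * MS_PER_HOUR


print(days_to_ms(7))  # 604800000
print(broker_default_retention_ms({"log.retention.hours": "168"}))  # 604800000
print(broker_default_retention_ms(
    {"log.retention.hours": "168", "log.retention.minutes": "60"}
))  # 3600000
```

The last call shows the minutes value winning over the hours value when both are present.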

Retention Policy

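The retention policy controls how long Kafka retains records before they are eligible for deletion. It applies to topics whose cleanup.policy includes delete (the default), and it can be based on:

  • Time-Based Retention (retention.ms): Records are retained for at least the configured amount of time.
  • Size-Based Retention (retention.bytes): Once a partition grows beyond the configured size, its oldest log segments are deleted. The limit applies per partition, not per topic, and the default of -1 means no size limit.

Both limits can be set together, in which case old segments are deleted as soon as either limit is exceeded. Deletion works on whole log segments, so individual records can outlive the configured limits until their segment is closed and cleaned up.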
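
To make the two rules concrete, here is a simplified Python model of how a delete-policy cleanup picks segments in one partition. It is a sketch, not the broker's actual implementation: `Segment` and `segments_to_delete` are hypothetical names, the last segment is assumed to be the active one and is never removed, and -1 is used to mean "no limit".

```python
from dataclasses import dataclass


@dataclass
class Segment:
    size_bytes: int
    newest_record_ms: int  # timestamp of the newest record in the segment


def segments_to_delete(segments, now_ms, retention_ms=-1, retention_bytes=-1):
    """Return the closed segments a cleanup pass would remove, oldest first.

    `segments` is ordered oldest to newest; the last entry is treated as the
    active segment and is kept.
    """
    if not segments:
        return []
    closed = list(segments[:-1])
    doomed = []

    # Time rule: drop leading segments whose newest record is past retention_ms.
    if retention_ms >= 0:
        while closed and now_ms - closed[0].newest_record_ms > retention_ms:
            doomed.append(closed.pop(0))

    # Size rule: drop the oldest segment only while the partition would still
    # hold at least retention_bytes afterwards.
    if retention_bytes >= 0:
        total = sum(s.size_bytes for s in closed) + segments[-1].size_bytes
        while closed and total - closed[0].size_bytes >= retention_bytes:
            total -= closed[0].size_bytes
            doomed.append(closed.pop(0))

    return doomed


log = [Segment(100, 1_000), Segment(100, 5_000), Segment(100, 9_000), Segment(50, 9_500)]
print(len(segments_to_delete(log, now_ms=10_000, retention_ms=6_000)))   # 1
print(len(segments_to_delete(log, now_ms=10_000, retention_bytes=100)))  # 2
```

In the second call the partition holds 350 bytes; removing the two oldest segments leaves 150 bytes, and removing a third would drop it below the 100-byte limit, so the pass stops there.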

Overriding Defaults

When creating or altering a topic, you can override the default retention settings specified in server.properties. This flexibility allows for fine-tuning data retention policies based on the specific needs of each topic.
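
The override behaviour amounts to a lookup with a fallback. The following Python sketch assumes both configurations are available as plain dictionaries; `BROKER_DEFAULT_FOR` and `effective_topic_setting` are hypothetical names used only for illustration, and the fallback to log.retention.minutes or log.retention.hours is left out for brevity.

```python
# Topic-level key -> broker-level property that supplies its default.
BROKER_DEFAULT_FOR = {
    "retention.ms": "log.retention.ms",
    "retention.bytes": "log.retention.bytes",
}


def effective_topic_setting(key, topic_overrides, broker_props):
    """Return the topic's own value if set, else the broker default (or None)."""
    if key in topic_overrides:
        return topic_overrides[key]
    return broker_props.get(BROKER_DEFAULT_FOR[key])


broker = {"log.retention.ms": "604800000", "log.retention.bytes": "-1"}
topic = {"retention.ms": "86400000"}  # this topic keeps data for 1 day only

print(effective_topic_setting("retention.ms", topic, broker))     # 86400000
print(effective_topic_setting("retention.bytes", topic, broker))  # -1
```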

Note: Setting the retention period too short might lead to data loss if the data is needed beyond the retention window. Conversely, setting it too long might lead to excessive storage usage. Therefore, it's essential to balance retention policies based on use case requirements and resource constraints.