kafka retention period setting - unix1998/technical_notes GitHub Wiki
The retention period for a Kafka topic is usually set as a topic-level configuration rather than directly in the server.properties file. You can, however, specify broker-wide defaults in server.properties; these apply to every topic that does not override them with its own configuration.
Setting Retention Period
- Topic-Level Configuration: The retention period can be set for each topic individually with Kafka's command-line tools or through the Kafka Admin API. For a topic that already exists, use kafka-configs.sh; in current Kafka releases, kafka-topics.sh --alter rejects --config when it is combined with --bootstrap-server. For example:
    kafka-configs.sh --bootstrap-server localhost:9092 --entity-type topics --entity-name my_topic --alter --add-config retention.ms=604800000
  In this example, the retention period is set to 7 days (604,800,000 milliseconds) for the topic my_topic.
- Broker-Level Default Configuration (server.properties): You can define default retention settings in the server.properties file, which apply to all topics unless explicitly overridden. The relevant configuration parameters are:
  - log.retention.hours: The default retention period for all topics, in hours. For example, setting log.retention.hours=168 means that logs are retained for 7 days by default.
  - log.retention.minutes: The default retention period for all topics, in minutes. If set, it takes precedence over log.retention.hours.
  - log.retention.ms: The default retention period for all topics, in milliseconds. If set, it takes precedence over both log.retention.minutes and log.retention.hours.
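The values above (604,800,000 milliseconds and 168 hours) describe the same 7-day window. As a quick sanity check when choosing a value, the conversion can be sketched in Python; the helper name `retention_ms` is hypothetical and not part of any Kafka tooling.

```python
from datetime import timedelta


def retention_ms(**duration) -> int:
    """Convert a duration (days=, hours=, minutes=, ...) to whole milliseconds."""
    # Dividing one timedelta by another with // yields an exact integer.
    return timedelta(**duration) // timedelta(milliseconds=1)


print(retention_ms(days=7))     # 604800000
print(retention_ms(hours=168))  # 604800000
```

The result is the number to pass as retention.ms or log.retention.ms.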
Here is an example configuration in server.properties:
    log.retention.hours=168
This setting will retain logs for 7 days by default.
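The precedence among the three broker properties can be sketched as a small lookup. This is an illustrative model rather than Kafka's own code: the function name is hypothetical, and the input is assumed to be server.properties already parsed into a dictionary of strings.

```python
from typing import Mapping

# Fallback used when no retention property is present; 168 hours is the
# value used throughout this page and Kafka's documented default.
DEFAULT_RETENTION_HOURS = 168


def broker_default_retention_ms(props: Mapping[str, str]) -> int:
    """Resolve the broker-wide default retention, in milliseconds."""
    # The most specific unit wins: ms, then minutes, then hours.
    if "log.retention.ms" in props:
        return int(props["log.retention.ms"])
    if "log.retention.minutes" in props:
        return int(props["log.retention.minutes"]) * 60 * 1000
    hours = int(props.get("log.retention.hours", DEFAULT_RETENTION_HOURS))
    return hours * 60 * 60 * 1000


print(broker_default_retention_ms({"log.retention.hours": "168"}))  # 604800000
print(broker_default_retention_ms({
    "log.retention.hours": "168",
    "log.retention.minutes": "30",
}))  # 1800000, because minutes take precedence over hours
```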
Retention Policy
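The retention period controls how long Kafka retains records before they become eligible for deletion. Retention can be based on:
- Time-Based Retention (retention.ms): Records are retained for a set amount of time; once they are older than that, they become eligible for deletion.
- Size-Based Retention (retention.bytes): Records are retained until the partition's log grows past a set size, at which point the oldest data is deleted first. The limit applies to each partition, not to the topic as a whole.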
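To make the two policies concrete, here is a simplified model of how time-based and size-based limits could select data for deletion. It is a sketch under stated assumptions, not Kafka's implementation: the `Segment` class and `segments_to_delete` function are hypothetical, data is assumed to be removed in whole log segments from oldest to newest, the newest (active) segment is always kept, and a value of -1 is treated as "no limit".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    newest_record_ms: int  # timestamp of the newest record in the segment
    size_bytes: int


def segments_to_delete(segments: List[Segment], now_ms: int,
                       retention_ms: int = -1,
                       retention_bytes: int = -1) -> List[Segment]:
    """Return the oldest segments that break a limit (input is oldest first)."""
    closed = segments[:-1]  # the active segment is never deleted in this model
    n = 0
    # Time-based: drop leading segments whose newest record is too old.
    if retention_ms >= 0:
        while n < len(closed) and now_ms - closed[n].newest_record_ms > retention_ms:
            n += 1
    # Size-based: drop more segments while the remainder stays at or above the limit.
    if retention_bytes >= 0:
        excess = sum(s.size_bytes for s in segments[n:]) - retention_bytes
        while n < len(closed) and excess - closed[n].size_bytes >= 0:
            excess -= closed[n].size_bytes
            n += 1
    return segments[:n]


log = [Segment(0, 100), Segment(1000, 100), Segment(2000, 100), Segment(3000, 100)]
print(len(segments_to_delete(log, now_ms=3500, retention_ms=3000)))    # 1
print(len(segments_to_delete(log, now_ms=3500, retention_bytes=200)))  # 2
```

When both limits are set in this model, whichever one is exceeded first triggers deletion.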
Overriding Defaults
When creating or altering a topic, you can override the default retention settings specified in server.properties. This flexibility allows for fine-tuning data retention policies based on the specific needs of each topic.
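As a sketch of how an override might be scripted, the helpers below build kafka-configs.sh argument lists for setting and removing a topic-level retention.ms. The function names are hypothetical, and the topic name and broker address are placeholders; the flags are the standard kafka-configs.sh options for altering topic configuration.

```python
import shlex
from typing import List


def _base_cmd(topic: str, bootstrap_server: str) -> List[str]:
    # Arguments shared by both operations.
    return ["kafka-configs.sh", "--bootstrap-server", bootstrap_server,
            "--entity-type", "topics", "--entity-name", topic, "--alter"]


def set_retention_override_cmd(topic: str, retention_ms: int,
                               bootstrap_server: str = "localhost:9092") -> List[str]:
    """Command that sets a topic-level retention.ms override."""
    return _base_cmd(topic, bootstrap_server) + [
        "--add-config", f"retention.ms={retention_ms}"]


def remove_retention_override_cmd(topic: str,
                                  bootstrap_server: str = "localhost:9092") -> List[str]:
    """Command that removes the override so the broker default applies again."""
    return _base_cmd(topic, bootstrap_server) + ["--delete-config", "retention.ms"]


print(shlex.join(set_retention_override_cmd("my_topic", 604800000)))
print(shlex.join(remove_retention_override_cmd("my_topic")))
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run(cmd, check=True)` without shell quoting concerns.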
Note: Setting the retention period too short might lead to data loss if the data is needed beyond the retention window. Conversely, setting it too long might lead to excessive storage usage. Therefore, it's essential to balance retention policies based on use case requirements and resource constraints.
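A rough back-of-the-envelope estimate can help with that balance. The sketch below multiplies an assumed steady ingest rate by the retention window and the replication factor; all figures are hypothetical, and the estimate ignores compression, index files, and the fact that data is removed in whole segments.

```python
def estimated_retained_bytes(ingest_bytes_per_sec: float, retention_ms: int,
                             replication_factor: int = 1) -> float:
    """Approximate cluster-wide disk usage for one topic at steady state."""
    retention_seconds = retention_ms / 1000
    return ingest_bytes_per_sec * retention_seconds * replication_factor


# Hypothetical topic: 1 MB/s of writes, 7 days of retention, 3 replicas.
total = estimated_retained_bytes(1_000_000, 604_800_000, replication_factor=3)
print(f"{total / 1e12:.2f} TB")  # 1.81 TB
```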