ilm and snapshots in elk - juancamilocc/virtual_resources GitHub Wiki
ILM and Snapshot strategy for logs ingest in ELK stack
In this guide, you will learn how to configure an ILM (Index Lifecycle Management) policy and a snapshot strategy that rotate logs and free up disk space in your ELK stack. This helps prevent the ELK stack from becoming blocked when its disk fills up and reduces the overhead of log management.
NOTE: This guide assumes that you already have an ELK stack running and receiving logs from Filebeat. If not, see ELK configuration to get Kubernetes logs using Filebeat to learn how to initialize an ELK stack and send logs from Filebeat correctly.
Table of Contents
- General Diagram
- Snapshot Configuration
- ILM Configuration
- Restore a Snapshot
- Delete restored logs
- Conclusion
General Diagram

Snapshot configuration
Associate S3 bucket as repository
Go to Snapshot and Restore > Repositories. There, we will associate an S3 bucket in which to store all snapshots.
First, create the S3 bucket and enable versioning on it as follows.
aws s3 mb s3://eks-logs-test-$(aws sts get-caller-identity --query "Account" --output text)
aws s3api put-bucket-versioning --bucket eks-logs-test-<YOUR_AWS_ID> --versioning-configuration Status=Enabled
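The two commands above can be combined so that the bucket name is computed only once and the versioning status is checked at the end. This is a minimal sketch, not part of the original setup: `bucket_name` and `create_versioned_bucket` are hypothetical helper names, and the script assumes the AWS CLI is already configured with credentials.

```shell
# Sketch only: bucket_name and create_versioned_bucket are hypothetical helpers.

# Build the bucket name used in this guide from an AWS account ID.
bucket_name() {
  printf 'eks-logs-test-%s\n' "$1"
}

# Create the bucket, enable versioning, then print the versioning status.
create_versioned_bucket() {
  account_id="$(aws sts get-caller-identity --query "Account" --output text)"
  bucket="$(bucket_name "$account_id")"
  aws s3 mb "s3://${bucket}"
  aws s3api put-bucket-versioning --bucket "$bucket" \
    --versioning-configuration Status=Enabled
  aws s3api get-bucket-versioning --bucket "$bucket" --query "Status" --output text
}

# Nothing is created until you call create_versioned_bucket yourself.
```

Computing the name in one place avoids a mismatch between the generated account ID and the `<YOUR_AWS_ID>` placeholder.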
Click Register repository and give it a name.

Configure it as follows.

Click Register and verify the repository's connection status.
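For reference, the same registration can be sketched with the Elasticsearch snapshot repository API instead of the Kibana UI. Everything here is an assumption for illustration: the `ES_URL` and `ES_API_KEY` variables, the repository name `s3-logs-repository`, and the helper function names. Only the bucket name comes from this guide, and the cluster must already have S3 credentials available.

```shell
# Assumed values: ES_URL, ES_API_KEY and REPO_NAME are placeholders, not from the guide.
ES_URL="${ES_URL:-https://localhost:9200}"
REPO_NAME="${REPO_NAME:-s3-logs-repository}"
BUCKET="${BUCKET:-eks-logs-test-<YOUR_AWS_ID>}"

# Print the JSON body for PUT _snapshot/<repository>.
repo_body() {
  cat <<EOF
{
  "type": "s3",
  "settings": {
    "bucket": "${BUCKET}"
  }
}
EOF
}

# Register the repository, then ask the cluster to verify that it can reach it.
register_repo() {
  curl -sS -X PUT "${ES_URL}/_snapshot/${REPO_NAME}" \
    -H "Authorization: ApiKey ${ES_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(repo_body)"
  curl -sS -X POST "${ES_URL}/_snapshot/${REPO_NAME}/_verify" \
    -H "Authorization: ApiKey ${ES_API_KEY}"
}

# Preview the payload; call register_repo to actually send it.
repo_body
```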

Create Snapshot policy
Go to Snapshot and Restore > Policies > Create policy and configure it, as follows.

Save the policy. It will run daily at 00:30 local time and retain snapshots in the repository for 45 days.
NOTE: Snapshot policy schedules are evaluated in UTC, so we set 0 30 5 * * ? (05:30 UTC), which corresponds to 00:30 in our UTC-5 time zone. Adjust the hour to match your location.
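The time zone arithmetic and the resulting policy can be sketched as follows, using the snapshot lifecycle management (SLM) API. The policy ID, repository name, snapshot name pattern, and the UTC-5 offset are illustrative assumptions. Only the schedule and the 45-day retention come from this guide. The sketch omits the optional `config` section, so what each snapshot includes is left at the API defaults.

```shell
# Assumed values: policy ID, repository name and UTC offset are illustrative.
ES_URL="${ES_URL:-https://localhost:9200}"
SLM_POLICY_ID="daily-logs-snapshots"   # hypothetical policy ID
REPO_NAME="s3-logs-repository"         # hypothetical repository name

# Convert the desired local run time (00:30 at UTC-5) to the UTC hour SLM expects.
LOCAL_HOUR=0
LOCAL_MINUTE=30
UTC_OFFSET_HOURS=-5
UTC_HOUR=$(( (LOCAL_HOUR - UTC_OFFSET_HOURS + 24) % 24 ))
SCHEDULE="0 ${LOCAL_MINUTE} ${UTC_HOUR} * * ?"

# Print the JSON body for PUT _slm/policy/<policy_id>.
slm_policy_body() {
  cat <<EOF
{
  "schedule": "${SCHEDULE}",
  "name": "<daily-snap-{now/d}>",
  "repository": "${REPO_NAME}",
  "retention": {
    "expire_after": "45d"
  }
}
EOF
}

# Create or update the policy on the cluster.
put_slm_policy() {
  curl -sS -X PUT "${ES_URL}/_slm/policy/${SLM_POLICY_ID}" \
    -H "Authorization: ApiKey ${ES_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(slm_policy_body)"
}

# Preview the payload; call put_slm_policy to actually send it.
slm_policy_body
```

Changing `UTC_OFFSET_HOURS` is enough to adapt the schedule to another time zone with a whole-hour offset.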
ILM configuration
Index Lifecycle Policy
Go to Index Lifecycle Policies, search for logs@lifecycle, click on it, and select Manage > Edit.

Configure it, as follows.

Save the policy.
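Since the exact values live in the screenshot, the following is only a rough sketch of what an equivalent policy might look like through the ILM API. The rollover thresholds, the `wait_for_snapshot` step, and the SLM policy ID `daily-logs-snapshots` are assumptions. Only the policy name and the 15-day threshold come from this guide. `wait_for_snapshot` is one way to make sure an index has been snapshotted before ILM deletes it.

```shell
# Assumed values: rollover thresholds and SLM policy ID are illustrative only.
ES_URL="${ES_URL:-https://localhost:9200}"
ILM_POLICY="logs@lifecycle"
SLM_POLICY_ID="daily-logs-snapshots"   # hypothetical SLM policy ID

# Print the JSON body for PUT _ilm/policy/<policy>.
ilm_policy_body() {
  cat <<EOF
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "1d",
            "max_primary_shard_size": "50gb"
          }
        }
      },
      "delete": {
        "min_age": "15d",
        "actions": {
          "wait_for_snapshot": {
            "policy": "${SLM_POLICY_ID}"
          },
          "delete": {}
        }
      }
    }
  }
}
EOF
}

# Create or overwrite the policy on the cluster.
put_ilm_policy() {
  curl -sS -X PUT "${ES_URL}/_ilm/policy/${ILM_POLICY}" \
    -H "Authorization: ApiKey ${ES_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(ilm_policy_body)"
}

# Preview the payload; call put_ilm_policy to actually send it.
ilm_policy_body
```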
Associate ILM policy to the Index template
Go to Index Management > Index Templates, search for logs, click on it, and select Manage > Edit.

Navigate to the Index settings step and enter the following settings.
{
  "index": {
    "lifecycle": {
      "name": "logs@lifecycle"
    }
  }
}

With the above configuration, only 'hot' logs (fewer than 15 days old) remain on the ELK instance, which keeps disk space free and prevents blocks.
Logs older than 15 days are available only as snapshots in the S3 repository, where they are retained for 45 days.
Restore a Snapshot
To restore a snapshot, navigate to Snapshot and Restore > Snapshots, select the snapshot, identify the index, and click Restore, as shown below.

Disable All data streams and indices, then click Use index patterns to select a specific index. As an example, we will use .ds-logs-from-logstash-testing-2025.07.19-000003. Enable Rename data streams and indices, set Capture pattern to (.+) and Replacement pattern to $1-restored-2025-07-19, and disable Restore aliases, as follows.

Finally, click Restore snapshot.
NOTE: We rename the restored indices to avoid losing the current logs.
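The same restore can be sketched with the restore snapshot API. The index name and the rename patterns are taken from the steps above. The repository name, the `<SNAPSHOT_NAME>` placeholder, the `ES_URL` and `ES_API_KEY` variables, and the helper names are assumptions.

```shell
# Assumed values: repository name, snapshot name and ES_* variables are placeholders.
ES_URL="${ES_URL:-https://localhost:9200}"
REPO_NAME="s3-logs-repository"     # hypothetical repository name
SNAPSHOT_NAME="<SNAPSHOT_NAME>"    # replace with the snapshot you selected
INDEX_NAME=".ds-logs-from-logstash-testing-2025.07.19-000003"
RESTORE_SUFFIX="restored-2025-07-19"

# Print the JSON body for POST _snapshot/<repo>/<snapshot>/_restore.
# \$1 keeps the capture-group reference literal inside the heredoc.
restore_body() {
  cat <<EOF
{
  "indices": "${INDEX_NAME}",
  "rename_pattern": "(.+)",
  "rename_replacement": "\$1-${RESTORE_SUFFIX}",
  "include_aliases": false
}
EOF
}

# Start the restore on the cluster.
restore_snapshot() {
  curl -sS -X POST "${ES_URL}/_snapshot/${REPO_NAME}/${SNAPSHOT_NAME}/_restore" \
    -H "Authorization: ApiKey ${ES_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(restore_body)"
}

# Preview the payload; call restore_snapshot to actually send it.
restore_body
```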
Verify and visualize restored logs
Navigate to Stack Management > Index Management, where you will find the restored index. Don't worry if no size is shown at first; the restore process takes some time.
NOTE: This process can take a long time depending on the index size.
Once the index has been restored correctly, it will be shown as follows.

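We must remove its ILM policy: click on the index, then select Manage index > Remove lifecycle policy. Otherwise, ILM may delete the restored index again, since its data is already older than 15 days.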
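As a sketch, the same step can be done with the ILM remove policy API. The restored index name below is derived from the rename pattern used in the restore example. `ES_URL`, `ES_API_KEY`, and the helper name are assumptions.

```shell
# Assumed values: ES_URL and ES_API_KEY are placeholders for your cluster.
ES_URL="${ES_URL:-https://localhost:9200}"
ORIGINAL_INDEX=".ds-logs-from-logstash-testing-2025.07.19-000003"
RESTORED_INDEX="${ORIGINAL_INDEX}-restored-2025-07-19"

# Detach the lifecycle policy from the restored index so ILM no longer manages it.
remove_ilm_policy() {
  curl -sS -X POST "${ES_URL}/${RESTORED_INDEX}/_ilm/remove" \
    -H "Authorization: ApiKey ${ES_API_KEY}"
}

# Show which index would be affected; call remove_ilm_policy to send the request.
echo "${RESTORED_INDEX}"
```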

To view the restored logs, go to Stack Management > Data Views > Create data view, click Show advanced settings, and enable Allow hidden and system indices. Give the data view a name and the index pattern .ds-logs-from-logstash-testing-*.*.*-restored*. This data view will also match future snapshot restores that follow the same naming scheme.

Click Save data view to Kibana.
Now, go to Discover and select the data view created earlier to view all records.

Delete restored logs
To delete the restored logs, go to Stack Management > Index Management, search for the restored index by name, open its Index details, and click Delete.
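The deletion can also be sketched with the delete index API. Because deleting the wrong index would lose live logs, this sketch adds a guard that only accepts names containing the `-restored-` marker introduced during the restore. The guard, the helper names, and the `ES_URL` and `ES_API_KEY` variables are assumptions, not part of the original procedure.

```shell
# Assumed values: ES_URL and ES_API_KEY are placeholders for your cluster.
ES_URL="${ES_URL:-https://localhost:9200}"

# Succeed only for names that carry the -restored- marker added during the restore.
is_restored_index() {
  case "$1" in
    *-restored-*) return 0 ;;
    *) return 1 ;;
  esac
}

# Delete an index, but refuse anything that does not look like a restored copy.
delete_restored_index() {
  if is_restored_index "$1"; then
    curl -sS -X DELETE "${ES_URL}/$1" \
      -H "Authorization: ApiKey ${ES_API_KEY}"
  else
    echo "Refusing to delete '$1': not a restored index" >&2
    return 1
  fi
}

# Example call (not executed here):
# delete_restored_index ".ds-logs-from-logstash-testing-2025.07.19-000003-restored-2025-07-19"
```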

Conclusion
Configuring an effective ILM and Snapshot strategy is crucial for maintaining a healthy and efficient ELK stack. This guide demonstrates how to automate the lifecycle of your logs, moving them from active storage to a more cost-effective S3 repository as they age. This not only ensures that you retain important data for your desired period but also prevents your primary storage from filling up, which could lead to system instability. By following these steps, you can optimize your log management processes, reduce operational overhead, and maintain control over your data, ensuring your ELK stack remains performant and reliable.