Elasticsearch restore index - digitalepidemiologylab/crowdbreaks-streamer-v1 GitHub Wiki
Setting up manual snapshots
This is a summarized and simplified version of the official AWS documentation. Prerequisites: a running AWS ES instance and an IAM user that has been granted access to it (in this case arn:aws:iam::874942657130:user/crowdbreaks-prd).
- Create an S3 bucket crowdbreaks-prd-es-snapshots
- Create a policy crowdbreaks-prd-es-snapshots
- Attach the policy crowdbreaks-prd-es-snapshots to a role crowdbreaks-prd-es-snapshots
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:ListBucket"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::crowdbreaks-prd-es-snapshots"
      ]
    },
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "iam:PassRole"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::crowdbreaks-prd-es-snapshots/*"
      ]
    }
  ]
}
Afterwards edit the trust relationship and change it to:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com",
        "AWS": "arn:aws:iam::874942657130:user/crowdbreaks-prd"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Make a note of the ARN of the role (in this case arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots).
- Create a new policy crowdbreaks-prd-es-createsnapshots to allow the IAM user to create snapshot repositories:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1",
      "Effect": "Allow",
      "Action": [
        "iam:PassRole"
      ],
      "Resource": [
        "arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots"
      ]
    }
  ]
}
- Attach the crowdbreaks-prd-es-createsnapshots policy to the IAM user crowdbreaks-prd
- Create a snapshot repository using a one-time signed request with the IAM user. Use the following Python script (requires pip install boto) to create a new snapshot repository called manual.
from boto.connection import AWSAuthConnection
import os


class ESConnection(AWSAuthConnection):
    """Connection that signs requests for the Amazon ES service."""

    def __init__(self, region, **kwargs):
        super(ESConnection, self).__init__(**kwargs)
        self._set_auth_region_name(region)
        self._set_auth_service_name("es")

    def _required_auth_capability(self):
        # Sign requests with AWS Signature Version 4
        return ['hmac-v4']


if __name__ == "__main__":
    # The IAM user's credentials are read from the environment
    client = ESConnection(
        region='eu-central-1',
        host='search-crowdbreaks-prd-orwuhh6oylaqcwcwq456ksdecq.eu-central-1.es.amazonaws.com',
        aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
        aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),
        is_secure=False)
    # print(client.make_request(path='/_cluster/health', method='GET').read())
    print('Registering Snapshot Repository')
    # Register an S3-backed snapshot repository named "manual"
    resp = client.make_request(method='PUT',
                               path='/_snapshot/manual',
                               headers={'Content-Type': 'application/json'},
                               data='{"type": "s3","settings": {"bucket": "crowdbreaks-prd-es-snapshots", "region": "eu-central-1", "role_arn": "arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots"}}')
    body = resp.read()
    print(body)
- Optional: Run the same script for the new ES instance if you want to move the data to the new instance.
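The registration payload in the script above is a hand-written JSON string, which is easy to break when the bucket or role changes (for example when pointing the script at a second ES instance). A minimal sketch of building the same payload with the standard json module follows; build_repository_payload is a hypothetical helper name, not part of boto or the AWS documentation.

```python
import json


def build_repository_payload(bucket, region, role_arn):
    """Return the JSON body for PUT /_snapshot/<repository> (hypothetical helper)."""
    settings = {"bucket": bucket, "region": region, "role_arn": role_arn}
    return json.dumps({"type": "s3", "settings": settings})


# Values taken from the steps above
payload = build_repository_payload(
    bucket="crowdbreaks-prd-es-snapshots",
    region="eu-central-1",
    role_arn="arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots",
)
print(payload)
```

The resulting string could be passed as the data argument of the make_request call in place of the literal.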
Create new snapshot of index
The easiest way to do this is with Kibana's Dev Tools. Create a new snapshot named snapshot_1 of an individual index original_index using:
PUT /_snapshot/manual/snapshot_1?wait_for_completion=true
{
  "indices": "original_index",
  "ignore_unavailable": true,
  "include_global_state": false
}
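If Kibana is not available, the same request can be assembled in Python and sent with any client that signs requests for the domain. The sketch below only builds the path, query parameters, and body; build_snapshot_request is a hypothetical helper, and how the query parameters are attached depends on the HTTP client used.

```python
import json


def build_snapshot_request(repository, snapshot, indices):
    """Return (path, params, body) for a snapshot of the given indices (hypothetical helper)."""
    path = "/_snapshot/{}/{}".format(repository, snapshot)
    params = {"wait_for_completion": "true"}
    body = json.dumps({
        "indices": ",".join(indices),   # comma-separated list of index names
        "ignore_unavailable": True,
        "include_global_state": False,
    })
    return path, params, body


path, params, body = build_snapshot_request("manual", "snapshot_1", ["original_index"])
print("PUT", path, params)
print(body)
```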
Restoring indices
To view all snapshot repositories, type:
GET _snapshot
View all snapshots in repository manual:
GET _snapshot/manual/_all
Restore snapshot snapshot_1 in repository manual:
POST _snapshot/manual/snapshot_1/_restore
Restore snapshot snapshot_1 in repository manual and rename the snapshotted index from original_name to original_name_restored:
POST _snapshot/manual/snapshot_1/_restore
{
  "rename_pattern": "original_name",
  "rename_replacement": "original_name_restored"
}
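Because rename_pattern is a regular expression, it can match more index names than intended (any restored index whose name contains original_name). The sketch below previews the effect locally using Python's re module as an approximation; Elasticsearch uses Java regular expressions, so behaviour may differ for more complex patterns or group references. preview_rename and the sample index names are hypothetical.

```python
import re


def preview_rename(index_names, rename_pattern, rename_replacement):
    """Map each index name to the name it would get after the rename (approximation)."""
    return {
        name: re.sub(rename_pattern, rename_replacement, name)
        for name in index_names
    }


# Hypothetical index names contained in the snapshot
preview = preview_rename(
    ["original_name", "original_name_v2", "other_index"],
    "original_name",
    "original_name_restored",
)
for old, new in preview.items():
    print(old, "->", new)
```

If only one index should be renamed, anchoring the pattern (for example ^original_name$) would avoid touching similarly named indices.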
This can be useful if you also need to change the mapping of the index. In that case, create a new index original_name with the new mapping, then reindex the data into this new index:
POST _reindex
{
  "source": {
    "index": "original_name_restored"
  },
  "dest": {
    "index": "original_name"
  }
}
Afterwards, original_name_restored can be deleted.
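The restore, re-create, reindex, and delete steps can be summarized as an ordered list of requests. This is a minimal sketch that only builds the request descriptions; it assumes no index named original_name exists when the new one is created, build_remap_plan is a hypothetical helper, and the example mapping is a placeholder whose exact shape depends on the Elasticsearch version.

```python
def build_remap_plan(index, repository, snapshot, new_mapping):
    """Return (method, path, body) tuples for changing an index mapping via restore + reindex."""
    restored = index + "_restored"
    return [
        # 1. Restore the snapshot under a temporary name
        ("POST", "/_snapshot/{}/{}/_restore".format(repository, snapshot),
         {"rename_pattern": index, "rename_replacement": restored}),
        # 2. Create the target index with the new mapping
        ("PUT", "/" + index, {"mappings": new_mapping}),
        # 3. Copy the documents into the new index
        ("POST", "/_reindex",
         {"source": {"index": restored}, "dest": {"index": index}}),
        # 4. Remove the temporary index
        ("DELETE", "/" + restored, None),
    ]


# Placeholder mapping, for illustration only
plan = build_remap_plan(
    "original_name", "manual", "snapshot_1",
    {"properties": {"text": {"type": "text"}}},
)
for method, path, body in plan:
    print(method, path, body)
```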