Elasticsearch restore index - digitalepidemiologylab/crowdbreaks-streamer-v1 GitHub Wiki

Setting up manual snapshots

This is a summarized and simplified version of the official AWS documentation. Prerequisites: a running AWS ES instance that grants access to an IAM user (in this case arn:aws:iam::874942657130:user/crowdbreaks-prd).

  1. Create S3 bucket crowdbreaks-prd-es-snapshots
  2. Create a policy crowdbreaks-prd-es-snapshots with the JSON document shown below this list
  3. Attach the policy crowdbreaks-prd-es-snapshots to a role crowdbreaks-prd-es-snapshots
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:ListBucket"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::crowdbreaks-prd-es-snapshots"
            ]
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "iam:PassRole"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::crowdbreaks-prd-es-snapshots/*"
            ]
        }
    ]
}

Afterwards, edit the trust relationship of the role and change it to:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "es.amazonaws.com",
        "AWS": "arn:aws:iam::874942657130:user/crowdbreaks-prd"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
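
If the same setup has to be repeated for another bucket or IAM user, the two JSON documents above can be generated instead of edited by hand. The following is a minimal sketch that mirrors them field by field. The function names `s3_snapshot_policy` and `trust_policy` are illustrative and not part of any AWS tooling, and the code only builds the documents without calling AWS.

```python
import json


def s3_snapshot_policy(bucket):
    """Build the permissions policy shown above for a given bucket name."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Action": ["s3:ListBucket"],
                "Effect": "Allow",
                "Resource": ["arn:aws:s3:::" + bucket],
            },
            {
                # Object-level actions are granted on everything inside the bucket.
                "Action": ["s3:GetObject", "s3:PutObject",
                           "s3:DeleteObject", "iam:PassRole"],
                "Effect": "Allow",
                "Resource": ["arn:aws:s3:::" + bucket + "/*"],
            },
        ],
    }


def trust_policy(user_arn):
    """Build the trust relationship shown above for a given IAM user ARN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "es.amazonaws.com", "AWS": user_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_snapshot_policy("crowdbreaks-prd-es-snapshots"), indent=4))
    print(json.dumps(trust_policy("arn:aws:iam::874942657130:user/crowdbreaks-prd"), indent=2))
```

The printed output can be pasted into the IAM console in steps 2 and 3.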

Make note of the ARN of the role (in this case arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots).

  4. Create a new policy crowdbreaks-prd-es-createsnapshots to allow the IAM user to create snapshot repositories:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole"
            ],
            "Resource": [
                "arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots"
            ]
        }
    ]
}
  5. Attach the crowdbreaks-prd-es-createsnapshots policy to the IAM user crowdbreaks-prd
  6. Create a snapshot repository using a one-time signed request with the IAM user. Use the following Python script (requires pip install boto) to create a new snapshot repository called manual:
import json
import os

from boto.connection import AWSAuthConnection


class ESConnection(AWSAuthConnection):
    """Connection that signs requests to the ES domain with AWS Signature Version 4."""

    def __init__(self, region, **kwargs):
        super(ESConnection, self).__init__(**kwargs)
        self._set_auth_region_name(region)
        self._set_auth_service_name("es")

    def _required_auth_capability(self):
        # Select boto's Signature Version 4 auth handler.
        return ['hmac-v4']


if __name__ == "__main__":
    # The credentials of the IAM user are read from the environment.
    # is_secure=False sends the request over plain HTTP.
    client = ESConnection(
        region='eu-central-1',
        host='search-crowdbreaks-prd-orwuhh6oylaqcwcwq456ksdecq.eu-central-1.es.amazonaws.com',
        aws_access_key_id=os.environ.get('AWS_ACCESS_KEY_ID'),
        aws_secret_access_key=os.environ.get('AWS_SECRET_ACCESS_KEY'),
        is_secure=False)

    # Uncomment to check connectivity first:
    # print(client.make_request(path='/_cluster/health', method='GET').read())

    # Repository definition: the target bucket, its region, and the role ES assumes.
    repository = {
        "type": "s3",
        "settings": {
            "bucket": "crowdbreaks-prd-es-snapshots",
            "region": "eu-central-1",
            "role_arn": "arn:aws:iam::874942657130:role/crowdbreaks-prd-es-snapshots",
        },
    }

    print('Registering Snapshot Repository')
    resp = client.make_request(method='PUT',
                               path='/_snapshot/manual',
                               headers={'Content-Type': 'application/json'},
                               data=json.dumps(repository))
    body = resp.read()
    print(body)
  7. Optional: To move the data to a new ES instance, run the same script against that instance as well (with host changed accordingly).
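
The script only prints the raw response body. Elasticsearch answers a successful repository registration with `{"acknowledged":true}`, and any other body (for example an error message) means the repository was not created. Below is a small sketch of how that check could be automated. `is_acknowledged` is a hypothetical helper, and `body` stands for whatever `resp.read()` returned.

```python
import json


def is_acknowledged(body):
    """Return True if a response body is a JSON object with "acknowledged": true."""
    if isinstance(body, bytes):
        body = body.decode("utf-8")
    try:
        parsed = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. an HTML error page.
        return False
    return isinstance(parsed, dict) and parsed.get("acknowledged") is True
```

In the script above, this could be used as `if not is_acknowledged(body): raise SystemExit(1)` after the final print.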

Create new snapshot of index

The easiest way to do this is with Kibana's Dev Tools. Create a new snapshot named snapshot_1 of an individual index original_name using:

PUT /_snapshot/manual/snapshot_1?wait_for_completion=true
{
  "indices": "original_name",
  "ignore_unavailable": true,
  "include_global_state": false
}
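
With wait_for_completion=true the request returns only once the snapshot has finished, and the response describes the result under a top-level "snapshot" key, including a "state" and a "shards" summary. If the request is issued from a script rather than from Dev Tools, the result can be checked along these lines. This is a sketch: `snapshot_succeeded` is a hypothetical helper, and it assumes the response has already been parsed into a dict.

```python
def snapshot_succeeded(response):
    """Return True if a parsed create-snapshot response reports full success."""
    info = response.get("snapshot", {})
    shards = info.get("shards", {})
    # A PARTIAL or FAILED state, or any failed shard, counts as not successful.
    return info.get("state") == "SUCCESS" and shards.get("failed", 0) == 0
```

A state other than SUCCESS (for example PARTIAL or FAILED) is a reason to inspect the snapshot before relying on it for a restore.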

Restoring indices

To view all snapshot repositories, run:

GET _snapshot

View all snapshots in repository manual (without the trailing _all, the request returns only the repository definition):

GET _snapshot/manual/_all

Restore all indices contained in snapshot snapshot_1 in repository manual:

POST _snapshot/manual/snapshot_1/_restore

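A restore fails if an open index with the same name already exists in the cluster. To avoid the conflict, restore snapshot snapshot_1 in repository manual and rename the snapshotted index from original_name to original_name_restored: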

POST _snapshot/manual/snapshot_1/_restore
{
    "rename_pattern":"original_name",
    "rename_replacement":"original_name_restored"
}
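
Note that rename_pattern is a regular expression and is applied to the name of every index being restored, so an unanchored pattern also rewrites index names that merely contain original_name. The sketch below imitates the renaming with Python's `re.sub` to preview what a pattern would do. It is only an approximation: Elasticsearch uses Java regular expressions (capture groups are referenced as $1 in rename_replacement, not \1), `preview_rename` is a hypothetical helper, and the index names are made up.

```python
import re


def preview_rename(index_names, rename_pattern, rename_replacement):
    """Map each index name to the name it would approximately get on restore."""
    return {name: re.sub(rename_pattern, rename_replacement, name)
            for name in index_names}


indices = ["original_name", "original_name_v2", "other_index"]

# Unanchored pattern: original_name_v2 is rewritten as well.
print(preview_rename(indices, "original_name", "original_name_restored"))

# Anchored pattern: only the index named exactly original_name is renamed.
print(preview_rename(indices, "^original_name$", "original_name_restored"))
```

If the snapshot holds several indices with similar names, anchoring the pattern with ^ and $ or adding an "indices" field to the restore request keeps the rename limited to the intended index.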

This is useful if you also need to change the mapping of the index. In that case, create a new index original_name with the new mapping, then reindex the data from the restored index into it:

POST _reindex
{
  "source": {
    "index": "original_name_restored"
  },
  "dest": {
    "index": "original_name"
  }
}

Afterwards, original_name_restored can be deleted.
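
Put together, the remapping workflow is a fixed sequence of four requests. The sketch below only builds that sequence so it can be reviewed and then run step by step in Dev Tools; it sends nothing. `remap_plan` is a hypothetical helper, the empty "mappings" object is a placeholder for the real new mapping, and the plan assumes that no index named original_name exists in the cluster when step 2 runs.

```python
import json


def remap_plan(snapshot, index, new_index_body, repository="manual"):
    """Return the requests of the restore/reindex workflow as (method, path, body)."""
    restored = index + "_restored"
    return [
        # 1. Restore the snapshot under a temporary name.
        ("POST", "/_snapshot/%s/%s/_restore" % (repository, snapshot),
         {"rename_pattern": index, "rename_replacement": restored}),
        # 2. Create the target index with the new mapping.
        ("PUT", "/" + index, new_index_body),
        # 3. Copy the documents from the restored index into the target index.
        ("POST", "/_reindex",
         {"source": {"index": restored}, "dest": {"index": index}}),
        # 4. Remove the temporary index.
        ("DELETE", "/" + restored, None),
    ]


if __name__ == "__main__":
    # Placeholder body: put the new mapping inside "mappings".
    plan = remap_plan("snapshot_1", "original_name", {"mappings": {}})
    for method, path, body in plan:
        print(method, path)
        if body is not None:
            print(json.dumps(body, indent=2))
```

It is worth checking the response of each step (in particular that the reindex response lists no failures) before running the final DELETE.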