Geode on AWS EC2

◀️ Geode Workspaces on VMs 🔗 Reactivating Geode Workspaces on AWS EC2 ▶️


Since PadoGrid v0.9.20


Geode/GemFire on AWS EC2 Instances

This article provides step-by-step instructions for creating and running a VM workspace on AWS EC2 instances.

Launch EC2 Instances

From the EC2 Dashboard, launch six (6) EC2 instances of type t2.micro in two (2) different availability zones and collect their public IP addresses. For our example, we launched three (3) instances in each of the us-east-2a and us-east-2b availability zones, named the instances, and collected their public IP addresses as follows:

| Name | IP Address | Availability Zone |
| ---- | ---------- | ----------------- |
| locator1 | 3.135.221.186 | us-east-2a |
| member1 | 18.191.168.36 | us-east-2a |
| member2 | 3.135.186.150 | us-east-2a |
| locator2 | 3.135.232.83 | us-east-2b |
| member3 | 18.218.40.90 | us-east-2b |
| member4 | 18.217.250.90 | us-east-2b |
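
If you prefer the AWS CLI over the EC2 Dashboard, the following is a minimal sketch of launching and listing the instances. The AMI ID, key pair name, security group ID, and subnet ID shown are placeholders that you must replace with your own values.

# Launch three (3) t2.micro instances in the us-east-2a subnet (repeat with the
# us-east-2b subnet for the remaining three). All IDs below are placeholders.
aws ec2 run-instances \
--image-id ami-xxxxxxxxxxxxxxxxx \
--count 3 \
--instance-type t2.micro \
--key-name mykey \
--security-group-ids sg-xxxxxxxxxxxxxxxxx \
--subnet-id subnet-xxxxxxxxxxxxxxxxx

# List the name and public IP address of the running instances.
aws ec2 describe-instances \
--filters "Name=instance-state-name,Values=running" \
--query "Reservations[].Instances[].[Tags[?Key=='Name']|[0].Value,PublicIpAddress]" \
--output table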

Create VM Workspace

To create a workspace, run create_workspace, which by default runs in the interactive mode. For our example, let's run it in the non-interactive mode by specifying the -quiet option as follows:

create_workspace -quiet \
-name ws-aws-gemfire \
-java /Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home \
-product /Users/dpark/Padogrid/products/vmware-gemfire-9.15.1 \
-vm 3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90 \
-vm-user ec2-user \
-vm-key mykey.pem

The above creates the workspace named ws-aws-gemfire and places all the installations in the /home/ec2-user/Padogrid directory on each EC2 instance. When you have completed the installation steps later in this article, each EC2 instance will have the following folder contents:

/home/ec2-user/Padogrid
├── downloads
│   ├── jdk-8u333-linux-x64.tar.gz
│   └── vmware-gemfire-9.15.1.tgz
├── products
│   ├── padogrid_0.9.20
│   ├── jdk1.8.0_333
│   └── vmware-gemfire-9.15.1
└── workspaces
    └── ws-aws-gemfire
        └── clusters
            └── mygemfire

The non-VM options, -java and -product, must be set to the Java and GemFire home paths in your local file system.

The following table shows the breakdown of the options used in the example.

| Option | Value | Description |
| ------ | ----- | ----------- |
| -name | ws-aws-gemfire | Workspace name |
| -java | /Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home | JAVA_HOME, local file system |
| -product | /Users/dpark/Padogrid/products/vmware-gemfire-9.15.1 | GEMFIRE_HOME, local file system |
| -vm | 3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90 | EC2 instance public IP addresses. Must be comma-separated with no spaces. |
| -vm-user | ec2-user | User name on the EC2 instances |
| -vm-key | mykey.pem | Private key file, local file system |

The following table shows the rest of the options not included in the example. These options take default values if not specified.

| Option | Default Value | Description |
| ------ | ------------- | ----------- |
| -cluster | mygemfire | Cluster name |
| -vm-padogrid | ~/Padogrid | PadoGrid base directory on the EC2 instances |
| -vm-java | JDK installation path | JAVA_HOME on the EC2 instances. Use vm_install to install the JDK. |
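
For reference, the following is a sketch of the same create_workspace command with the default values above spelled out explicitly; the result is identical to the command shown earlier.

create_workspace -quiet \
-name ws-aws-gemfire \
-cluster mygemfire \
-java /Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home \
-product /Users/dpark/Padogrid/products/vmware-gemfire-9.15.1 \
-vm 3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90 \
-vm-user ec2-user \
-vm-key mykey.pem \
-vm-padogrid ~/Padogrid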

Configure Cluster

We launched t2.micro instances, which have only 1 GB of memory each, so we need to lower the GemFire member heap size to fit within this limit. Edit the cluster.properties file as follows:

switch_workspace ws-aws-gemfire
switch_cluster mygemfire
vi etc/cluster.properties

Change the heap and host properties in that file as shown below. You may not be able to run GemFire with a max heap size greater than 384 MB on t2.micro EC2 instances. You can, however, create a swap space to overcome this limitation. See the Swap Space section below for instructions.

# Lower the heap size from 1g to 256m.
heap.min=256m
heap.max=256m

We have launched two (2) EC2 instances for locators. Let's set vm.locator.hosts to their IP addresses.

# Set the locator hosts (one locator per availability zone)
vm.locator.hosts=3.135.221.186,3.135.232.83

Make sure to remove the locator IP addresses from the member host list set by the vm.hosts property.

# The remaining four (4) VMs are members
vm.hosts=18.191.168.36,3.135.186.150,18.218.40.90,18.217.250.90

Lastly, set the redundancy zone for each VM. You can use any zone names as long as they are unique per zone. For our example, let's use the availability zone names.

vm.3.135.221.186.redundancyZone=us-east-2a
vm.18.191.168.36.redundancyZone=us-east-2a
vm.3.135.186.150.redundancyZone=us-east-2a

vm.3.135.232.83.redundancyZone=us-east-2b
vm.18.218.40.90.redundancyZone=us-east-2b
vm.18.217.250.90.redundancyZone=us-east-2b

Note that setting the redundancy zone for the locators is not required since they are not data nodes; the locator entries above are included only for completeness.

Swap Space

The t2.micro EC2 instances do not include a swap space. The following example shows how to add a 1-GB swap space on all the VMs to enable virtual memory. This allows you to increase the max heap size.

# Create a swap space of 1 GB (128 MB * 8). To change the swap space size, change the count.
# For example, to increase it to 2 GB, set count=16 (128 MB * 16).
vm_exec "sudo dd if=/dev/zero of=/swapfile bs=128M count=8 && \
sudo chmod 600 /swapfile && \
sudo mkswap /swapfile && \
sudo swapon /swapfile && \
sudo swapon -s"

You can enable the swap file at boot time by editing the /etc/fstab file. You can skip this step if you won't be rebooting the VMs.

# ssh into VM
ssh -i mykey.pem ec2-user@<vm-public-ip>
sudo vi /etc/fstab

Add the following new line at the end of the file, save the file, and then exit:

/swapfile swap swap defaults 0 0

You must carry out the above steps for each VM.
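
Alternatively, instead of logging in to each VM, you can append the fstab entry on all the VMs at once with vm_exec. The following is a minimal sketch using the same vm_exec command introduced above.

# Append the swap file entry to /etc/fstab on all VMs in the workspace
vm_exec "echo '/swapfile swap swap defaults 0 0' | sudo tee -a /etc/fstab"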

Sync VMs

Run vm_sync to synchronize the workspace.

vm_sync

The above command reports the following:

Scanning VMs... Please wait.
Deploying padogrid_0.9.20 to 3.135.221.186...
Deploying padogrid_0.9.20 to 18.191.168.36...
Deploying padogrid_0.9.20 to 3.135.186.150...
Deploying padogrid_0.9.20 to 3.135.232.83...
Deploying padogrid_0.9.20 to 18.218.40.90...
Deploying padogrid_0.9.20 to 18.217.250.90...

Workspace sync: ws-aws-gemfire
   Synchronizing 3.135.221.186...
   Synchronizing 18.191.168.36...
   Synchronizing 3.135.186.150...
   Synchronizing 3.135.232.83...
   Synchronizing 18.218.40.90...
   Synchronizing 18.217.250.90...

Updating remote (VM) '.bashrc' if needed...
...
Workspace sync complete.

Install Software

vm_sync will display warning messages indicating the new EC2 instances do not have the required software installed. Download the required software and install them by running the vm_install command as shown in the following example.

vm_install -product ~/Downloads/jdk-8u333-linux-x64.tar.gz
vm_install -product ~/Downloads/vmware-gemfire-9.15.1.tgz

Start Cluster

Start the cluster.

start_cluster
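
If your current cluster context is not mygemfire, you can also specify the cluster explicitly, as sketched below. The -cluster option follows the same convention as the other PadoGrid commands used in this article.

# Start the mygemfire cluster explicitly
start_cluster -cluster mygemfire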

Monitor Cluster

To monitor the cluster:

show_cluster

View Log

To view logs:

# member1
show_log

# member2
show_log -num 2

# locator1
show_log -log locator

# locator2
show_log -log locator -num 2

Pulse

You can get the Pulse URL by running the show_cluster -long command. For our example, both locators serve Pulse.
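
For example:

# Display detailed cluster status, including the Pulse URLs
show_cluster -long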

You can view the members running in each redundancy zone from Pulse.

Pulse Screenshot

Test Cluster

You can run the perf_test app to ingest data into the cluster and watch the region sizes increase in Pulse.

First, create the perf_test app and edit its configuration file to point to the locator running on EC2.

create_app
cd_app perf_test
vi etc/client-cache.xml

Set the locator hosts in etc/client-cache.xml to the locator IP addresses.

<pool name="serverPool">
   <locator host="3.135.221.186" port="10334" />
   <locator host="3.135.232.83" port="10334" />
</pool>

Ingest data into the cluster.

cd bin_sh
./test_ingestion -run

Including Additional EC2 Instances

You can add EC2 instances to the workspace by entering the new instance IP addresses in the workspace vmenv.sh file. Let's launch a new EC2 instance for running client apps. In the previous section, we ran the perf_test app locally from our machine; we'll now use this new EC2 instance to run the same app. For our example, the new EC2 instance has the following IP address.

| Name | IP Address |
| ---- | ---------- |
| client | 18.219.75.145 |

Let's add the IP address to the workspace vmenv.sh file.

cd_workspace
vi vmenv.sh

Append the new IP address to the VM_HOSTS environment variable as follows:

VM_HOSTS="3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90,18.219.75.145"

After saving the vmenv.sh file, run vm_sync to synchronize the VMs so that the new instance will have PadoGrid installed.

vm_sync

The vm_sync command output should be similar to what we have seen before.

Scanning VMs... Please wait.
Deploying padogrid_0.9.20 to 18.219.75.145...

Workspace sync: ws-aws-gemfire
   Synchronizing 3.135.221.186...
   Synchronizing 18.191.168.36...
   Synchronizing 3.135.186.150...
   Synchronizing 3.135.232.83...
   Synchronizing 18.218.40.90...
   Synchronizing 18.217.250.90...
   Synchronizing 18.219.75.145...

Updating remote (VM) '.bashrc' if needed...
...
Workspace sync complete.

Let's install Java and GemFire as before.

vm_install -product ~/Downloads/jdk-8u333-linux-x64.tar.gz
vm_install -product ~/Downloads/vmware-gemfire-9.15.1.tgz

Running Apps from EC2 Instances

In the previous section, we added a new EC2 instance for running client apps.

Log in to the client instance.

# First change directory to the workspace where the key file (mykey.pem) is located.
cd_workspace

# ssh into the client VM
ssh -i mykey.pem ec2-user@18.219.75.145

When we ran vm_sync earlier, it also deployed the perf_test app to all the VMs. We can simply change directory to perf_test and run test_ingestion as before.

cd_app perf_test; cd bin_sh
./test_ingestion -run

Preserving Workspace

If you terminate the EC2 instances without removing the workspace, then your workspace will be preserved on your local machine. This means you can later reactivate the workspace by simply launching new EC2 instances and configuring the workspace with the new public IP addresses. The following link provides step-by-step instructions describing how to reactivate VM workspaces.

Reactivating Workspaces on AWS EC2

Teardown

  1. Stop the cluster

If you want to remove the cluster from all the VMs, you must first stop the cluster and then execute the remove_cluster command as shown below.

# Stop cluster including members and locator
stop_cluster -all
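
Once the cluster has stopped, remove it with remove_cluster. The command below is a sketch that assumes the cluster name used in this article.

# Remove the mygemfire cluster from all VMs
remove_cluster -cluster mygemfire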

  2. Remove the workspace

If you want to preserve the workspace so that you can later reactivate it then skip this step and jump to the next step; otherwise, run the remove_workspace command which will also remove the cluster.

# Simulate removing workspace from all VMs. Displays removal steps but does not
# actually remove the workspace.
remove_workspace -workspace ws-aws-gemfire -simulate

# Remove workspace from all VMs. Runs in interactive mode.
remove_workspace -workspace ws-aws-gemfire

  3. Terminate the EC2 instances

From the EC2 Dashboard, terminate the EC2 instances.
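
If you prefer the AWS CLI, the following is a sketch of terminating the instances by ID; the instance IDs shown are placeholders for your own IDs.

# Terminate the EC2 instances (replace the IDs with your own)
aws ec2 terminate-instances --instance-ids i-0123456789abcdef0 i-0fedcba9876543210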


◀️ Geode Workspaces on VMs 🔗 Reactivating Geode Workspaces on AWS EC2 ▶️
