Geode on AWS EC2 - padogrid/padogrid GitHub Wiki
Since PadoGrid v0.9.20
This article provides step-by-step instructions for creating and running a VM workspace on AWS EC2 instances.
From the EC2 Dashboard, launch six (6) EC2 instances of type `t2.micro` across two (2) availability zones and collect their public IP addresses. For our example, we launched three (3) instances in each of the `us-east-2a` and `us-east-2b` availability zones, named the instances, and collected their public IP addresses as follows:
Name | IP Address | Availability Zone |
---|---|---|
locator1 | 3.135.221.186 | us-east-2a |
member1 | 18.191.168.36 | us-east-2a |
member2 | 3.135.186.150 | us-east-2a |
locator2 | 3.135.232.83 | us-east-2b |
member3 | 18.218.40.90 | us-east-2b |
member4 | 18.217.250.90 | us-east-2b |
To create a workspace, run `create_workspace`, which by default runs in interactive mode. For our example, let's run it in non-interactive mode by specifying the `-quiet` option as follows:
create_workspace -quiet \
-name ws-aws-gemfire \
-java /Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home \
-product /Users/dpark/Padogrid/products/vmware-gemfire-9.15.1 \
-vm 3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90 \
-vm-user ec2-user \
-vm-key mykey.pem
The above creates the workspace named `ws-aws-gemfire` and places all the installations in the `/home/ec2-user/Padogrid` directory. When you are done with the installation later, each EC2 instance will have the following folder contents:
/home/ec2-user/Padogrid
├── downloads
│ ├── jdk-8u333-linux-x64.tar.gz
│ └── vmware-gemfire-9.15.1.tgz
├── products
│ ├── padogrid_0.9.20
│ ├── jdk1.8.0_333
│ └── vmware-gemfire-9.15.1
└── workspaces
└── ws-aws-gemfire
└── clusters
└── mygemfire
The non-VM options `-java` and `-product` must be set to the Java and GemFire home paths in your local file system.
The following table shows the breakdown of the options used in the example.
Option | Value | Description |
---|---|---|
-name | ws-aws-gemfire | Workspace name |
-java | /Library/Java/JavaVirtualMachines/jdk1.8.0_333.jdk/Contents/Home | JAVA_HOME, local file system |
-product | /Users/dpark/Padogrid/products/vmware-gemfire-9.15.1 | GEMFIRE_HOME, local file system |
-vm | 3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90 | EC2 instance public IP addresses. Must be comma-separated with no spaces. |
-vm-user | ec2-user | User name, EC2 instances |
-vm-key | mykey.pem | Private key file, local file system |
The following table shows the rest of the options not included in the example. These options take default values if not specified.
Option | Default Value | Description |
---|---|---|
-cluster | mygemfire | Cluster name |
-vm-padogrid | ~/Padogrid | PadoGrid base directory on EC2 instances |
-vm-java | JDK installation path | JAVA_HOME on EC2 instances. Use vm_install to install JDK. |
We launched `t2.micro` instances, which have only 1 GB of memory, so we need to lower the GemFire member heap size accordingly. Edit the `cluster.properties` file as follows:
switch_workspace ws-aws-gemfire
switch_cluster mygemfire
vi etc/cluster.properties
Change the heap and host properties in that file as shown below. You may not be able to run GemFire with a max heap size greater than 384 MB on `t2.micro` EC2 instances. You can, however, create a swap space to overcome this limitation; see the next section for instructions.
# Lower the heap size from 1g to 256m.
heap.min=256m
heap.max=256m
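The 384 MB ceiling mentioned above can be sanity-checked with some back-of-the-envelope arithmetic. The reserve figures below are assumptions for illustration, not measured values:

```shell
# Rough heap sizing for a 1 GB t2.micro (reserve figures are assumptions).
total_mb=1024
os_reserved_mb=512        # OS, sshd, and other resident processes
jvm_overhead_mb=128       # metaspace, thread stacks, GC structures
heap_mb=$(( total_mb - os_reserved_mb - jvm_overhead_mb ))
echo "largest safe heap: ${heap_mb}m"
```

The 256m setting above stays comfortably below this estimate.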
We have launched two (2) EC2 instances for locators. Let's set `vm.locator.hosts` to their IP addresses.
# Set the two locator VMs
vm.locator.hosts=3.135.221.186,3.135.232.83
Make sure to remove the locator IP addresses from the member host list set by the `vm.hosts` property.
# The remaining four (4) VMs serve as members
vm.hosts=18.191.168.36,3.135.186.150,18.218.40.90,18.217.250.90
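To guard against accidentally leaving a locator address in `vm.hosts`, a quick shell check like the following can help. This is a convenience sketch using the example host lists, not a PadoGrid command:

```shell
# Check that no locator IP is also listed in vm.hosts (example values).
locator_hosts="3.135.221.186,3.135.232.83"
vm_hosts="18.191.168.36,3.135.186.150,18.218.40.90,18.217.250.90"
ok=true
IFS=',' read -ra locators <<< "$locator_hosts"
for l in "${locators[@]}"; do
  if [[ ",$vm_hosts," == *",$l,"* ]]; then
    echo "ERROR: locator $l also appears in vm.hosts"
    ok=false
  fi
done
$ok && echo "OK: vm.hosts contains no locator addresses"
```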
Lastly, set the redundancy zone for each VM. Any names can be used as long as they are unique per zone. For our example, let's use the availability zone names.
vm.3.135.221.186.redundancyZone=us-east-2a
vm.18.191.168.36.redundancyZone=us-east-2a
vm.3.135.186.150.redundancyZone=us-east-2a
vm.3.135.232.83.redundancyZone=us-east-2b
vm.18.218.40.90.redundancyZone=us-east-2b
vm.18.217.250.90.redundancyZone=us-east-2b
Note that the redundancy zone settings matter only for the members; the locator entries are harmless but have no effect since locators are not data nodes.
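Rather than typing each property by hand, the lines above can be generated with a small loop. This is a convenience sketch, not a PadoGrid feature:

```shell
# Generate the redundancyZone properties from per-zone host lists (sketch).
zone_a_hosts="3.135.221.186 18.191.168.36 3.135.186.150"
zone_b_hosts="3.135.232.83 18.218.40.90 18.217.250.90"
props=""
for h in $zone_a_hosts; do props+="vm.$h.redundancyZone=us-east-2a"$'\n'; done
for h in $zone_b_hosts; do props+="vm.$h.redundancyZone=us-east-2b"$'\n'; done
# Print the generated lines; paste them into etc/cluster.properties.
printf '%s' "$props"
```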
The `t2.micro` EC2 instances do not include a swap space. The following example shows how to add a 1 GB swap space to enable virtual memory on all VMs. This allows you to increase the max heap size.
# Create swap space of 1 GB. To change the swap space size, change the count.
# For example, to increase to 2 GB, set count=16 (128 MB * 16).
vm_exec "sudo dd if=/dev/zero of=/swapfile bs=128M count=8 && \
sudo chmod 600 /swapfile && \
sudo mkswap /swapfile && \
sudo swapon /swapfile && \
sudo swapon -s"
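The `count` arithmetic in the comment works out as simple 128 MB block math. For example, for a 2 GB swap file:

```shell
# Swap size = bs * count, with bs=128 MB. For a 2 GB (2048 MB) swap file:
desired_mb=2048
bs_mb=128
count=$(( desired_mb / bs_mb ))
echo "count=$count"
```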
You can enable the swap file at boot time by editing the `/etc/fstab` file. You can skip this step if you won't be rebooting the VMs.
# ssh into VM
ssh -i mykey.pem [email protected]
sudo vi /etc/fstab
Add the following new line at the end of the file, save the file, and then exit:
/swapfile swap swap defaults 0 0
You must carry out the above steps for each VM.
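Since the steps must be repeated on every VM, it helps to make the `/etc/fstab` edit idempotent so that re-running it cannot add duplicate entries. The sketch below demonstrates the pattern on a temporary file with a placeholder entry; on a VM you would target `/etc/fstab` with `sudo tee -a` instead of `>>`:

```shell
# Idempotent append demonstrated on a temporary stand-in for /etc/fstab.
fstab=$(mktemp)
printf 'UUID=example / xfs defaults,noatime 1 1\n' > "$fstab"  # placeholder entry
entry='/swapfile swap swap defaults 0 0'
# Append only if the exact line is not already present; run twice to prove it.
grep -qxF "$entry" "$fstab" || echo "$entry" >> "$fstab"
grep -qxF "$entry" "$fstab" || echo "$entry" >> "$fstab"
n=$(grep -c '^/swapfile' "$fstab")
echo "swap entries in fstab: $n"
rm -f "$fstab"
```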
Run `vm_sync` to synchronize the workspace.
vm_sync
The above command reports the following:
Scanning VMs... Please wait.
Deploying padogrid_0.9.20 to 3.135.221.186...
Deploying padogrid_0.9.20 to 18.191.168.36...
Deploying padogrid_0.9.20 to 3.135.186.150...
Deploying padogrid_0.9.20 to 3.135.232.83...
Deploying padogrid_0.9.20 to 18.218.40.90...
Deploying padogrid_0.9.20 to 18.217.250.90...
Workspace sync: ws-aws-gemfire
Synchronizing 3.135.221.186...
Synchronizing 18.191.168.36...
Synchronizing 3.135.186.150...
Synchronizing 3.135.232.83...
Synchronizing 18.218.40.90...
Synchronizing 18.217.250.90...
Updating remote (VM) '.bashrc' if needed...
...
Workspace sync complete.
`vm_sync` will display warning messages indicating that the new EC2 instances do not have the required software installed. Download the required software and install it by running the `vm_install` command as shown in the following example.
vm_install -product ~/Downloads/jdk-8u333-linux-x64.tar.gz
vm_install -product ~/Downloads/vmware-gemfire-9.15.1.tgz
Start the cluster.
start_cluster
To monitor the cluster:
show_cluster
To view logs:
# member1
show_log
# member2
show_log -num 2
# locator1
show_log -log locator
# locator2
show_log -log locator -num 2
You can get the Pulse URL by running the `show_cluster -long` command. For our example, both locators serve Pulse.
You can view the members running in each redundancy zone from Pulse.
You can run the `perf_test` app to ingest data into the cluster and watch the region sizes grow from Pulse.
First, create the `perf_test` app and edit its configuration file to point to the locators running on EC2.
create_app
cd_app perf_test
vi etc/client-cache.xml
Set the locator hosts in `etc/client-cache.xml` to the locator IP addresses.
<pool name="serverPool">
<locator host="3.135.221.186" port="10334" />
<locator host="3.135.232.83" port="10334" />
</pool>
Ingest data into the cluster.
cd bin_sh
./test_ingestion -run
You can add EC2 instances to the workspace by entering the new instance IP addresses in the workspace `vmenv.sh` file. Let's launch a new EC2 instance for running client apps. We ran the `perf_test` app locally from our laptop in the previous section; this time, we'll run the same app from the new EC2 instance. For our example, the new EC2 instance has the following IP address.
Name | IP Address |
---|---|
client | 18.219.75.145 |
Let's add the IP address to the workspace `vmenv.sh` file.
cd_workspace
vi vmenv.sh
Append the new IP address to the VM_HOSTS environment variable as follows:
VM_HOSTS="3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90,18.219.75.145"
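The list must remain comma-separated with no spaces. A quick check like the following, using the example addresses, can catch a bad edit before you run `vm_sync`:

```shell
# Validate the VM_HOSTS list: no spaces, and the expected host count.
VM_HOSTS="3.135.221.186,18.191.168.36,3.135.186.150,3.135.232.83,18.218.40.90,18.217.250.90,18.219.75.145"
if [[ "$VM_HOSTS" == *" "* ]]; then
  echo "ERROR: VM_HOSTS must not contain spaces"
fi
IFS=',' read -ra hosts <<< "$VM_HOSTS"
echo "VM_HOSTS lists ${#hosts[@]} hosts"
```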
After saving the `vmenv.sh` file, run `vm_sync` to synchronize the VMs so that the new instance will have PadoGrid installed.
vm_sync
The `vm_sync` command output should be similar to what we saw before.
Scanning VMs... Please wait.
Deploying padogrid_0.9.20 to 18.219.75.145...
Workspace sync: ws-aws-gemfire
Synchronizing 3.135.221.186...
Synchronizing 18.191.168.36...
Synchronizing 3.135.186.150...
Synchronizing 3.135.232.83...
Synchronizing 18.218.40.90...
Synchronizing 18.217.250.90...
Synchronizing 18.219.75.145...
Updating remote (VM) '.bashrc' if needed...
...
Workspace sync complete.
Let's install Java and GemFire as before.
vm_install -product ~/Downloads/jdk-8u333-linux-x64.tar.gz
vm_install -product ~/Downloads/vmware-gemfire-9.15.1.tgz
In the previous section, we added a new EC2 instance for running client apps. Log in to the client instance.
# First change directory to the workspace where the key file (mykey.pem) is located.
cd_workspace
# ssh into the client VM
ssh -i mykey.pem [email protected]
When we ran `vm_sync` earlier, it also deployed the `perf_test` app to all the VMs. We can simply change directory to `perf_test` and run `test_ingestion` as before.
cd_app perf_test; cd bin_sh
./test_ingestion -run
If you terminate the EC2 instances without removing the workspace, then your workspace will be preserved on your local machine. This means you can later reactivate the workspace by simply launching new EC2 instances and configuring the workspace with the new public IP addresses. The following link provides step-by-step instructions describing how to reactivate VM workspaces.
Reactivating Workspaces on AWS EC2
- Stop the cluster
If you want to remove the cluster from all the VMs, then you must first stop the cluster and then execute the `remove_cluster` command.
# Stop cluster including members and locator
stop_cluster -all
- Remove the workspace
If you want to preserve the workspace so that you can later reactivate it, skip this step and jump to the next step; otherwise, run the `remove_workspace` command, which also removes the cluster.
# Simulate removing workspace from all VMs. Displays removal steps but does not
# actually remove the workspace.
remove_workspace -workspace ws-aws-gemfire -simulate
# Remove workspace from all VMs. Runs in interactive mode.
remove_workspace -workspace ws-aws-gemfire
- Terminate the EC2 instances
From the EC2 Dashboard, terminate the EC2 instances.