Running in the Cloud with AWS - synthetichealth/synthea GitHub Wiki

Running Synthea™ on AWS

Important

The following page was contributed by AWS. Use of cloud services can result in charges. -- Synthea Team

Amazon Elastic Compute Cloud (Amazon EC2) is a versatile computing platform that offers over 700 different instance types and lets you choose the processors, storage options, networking configurations, operating systems, and payment plans that best suit your needs. An Amazon Machine Image (AMI) helps you set up EC2 faster by providing pre-configured settings and software.

CloudFormation is a tool that automates deployments, making it easier to set up the necessary resources and install any prerequisites, like Amazon Corretto (a free, cross-platform, production-ready version of OpenJDK). When you pair this with Amazon S3 storage, you can store large amounts of data at a lower cost.

Running Synthea on Amazon EC2 with this deployment offers several benefits:

  1. Simplified Setup: You won't have to worry about downloading and installing Java JDK on your local machine. This hassle-free approach allows you to get started quickly without any technical hurdles.
  2. Tailored Resources: You have a range of options at your fingertips, allowing you to customize your computing and memory resources to perfectly match your dataset generation needs. Whether you're working with small-scale projects or data-intensive tasks, there's a solution that suits you.
  3. Data Storage: With a deployment option that offers persistent storage, you can securely and conveniently store your generated datasets. This ensures your data is readily available and safe, ready to be used whenever you need it.
  4. Enhanced Analytics: AWS provides a wide array of data analytics services that go beyond dataset generation. Discover the world of data transformation and advanced analytics on your datasets by exploring the offerings available on AWS. For more information, visit Analytics on AWS to learn how you can take your data analysis to the next level.

These instructions are intended for those just wishing to run Synthea on Amazon EC2. Those wishing to examine the Synthea source code, extend it or build the code locally should follow the Developer Setup and Running instructions instead.

Prerequisites

  • AWS Account: You must have an active AWS account. If you don't have one, you can sign up for an AWS account on the AWS website.
  • IAM User or Role: You should have an IAM (Identity and Access Management) user or role with the necessary permissions to create and manage AWS resources. CloudFormation creates the following AWS resources: an Amazon S3 bucket to store generated output, an Amazon EC2 instance that provides the required computing power to run Synthea, and an IAM role with permissions to write data to the Amazon S3 bucket.

Deployment

To choose a Region

  1. Sign in to the AWS Management Console.
  2. In the navigation bar, choose the name of the currently displayed Region. Then choose the Region to which you want to switch.
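
If you also script against AWS from a terminal, the Region can be pinned for the current shell session. The following is a minimal sketch, assuming the AWS CLI is installed and configured on your machine; `use_region` is a hypothetical helper name, not part of the AWS CLI.

```shell
# Hypothetical helper: pin the Region for every AWS CLI call in this shell.
# AWS_DEFAULT_REGION is the standard environment variable read by the AWS CLI.
use_region() {
  if [ -z "${1:-}" ]; then
    echo "usage: use_region <region-code>" >&2
    return 1
  fi
  export AWS_DEFAULT_REGION="$1"
  echo "AWS CLI calls in this shell now target ${AWS_DEFAULT_REGION}"
}
```

For example, `use_region us-east-1` targets the US East (N. Virginia) Region. The console Region selector and this variable are independent, so set both to the same Region.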

Deploying CloudFormation Template

  1. Download the CloudFormation template for this solution.
  2. Open the AWS CloudFormation console.
  3. Choose Create stack, then select With new resources (standard) from the menu.
  4. Under the Prerequisite - Prepare template section:
    • For Prepare template, choose Template is ready.
  5. Under the Specify template section:
    • For Template source, choose Upload a template file.
    • For Upload a template file, select Choose file, then select the template that you downloaded in step 1.
    • Select Next.
  6. Under the Specify stack details section, provide the following required parameters for the CloudFormation template.
    • Enter synthea-on-aws as the stack name.
  7. Under the Parameters section:
    • For Instance Type, select an instance type based on your resource needs. Visit the Instance Types page to learn more about the available configurations.
    • For KeyPair, select an existing key pair. If you don't have one in your AWS account, follow these directions to create one.
    • For MyIp, enter your public IP address followed by /32. You can find your public IP address by searching for "What is my IP".
    • For SubnetId, select the first subnet from the available list of subnets.
    • For VpcId, select the default VPC from the available list of VPCs. You should only see one VPC (the default VPC) in the drop-down list.
  8. Click on Next.
  9. Retain all selected default values on the Configure stack options screen and click on Next.
  10. In the Review screen, select the checkbox for I acknowledge that AWS CloudFormation might create IAM resources.
  11. Click on Submit.
  12. Once the stack deployment is complete, you will see a status of "CREATE_COMPLETE".
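
The console steps above can also be approximated from a terminal. The sketch below assumes the AWS CLI is installed and configured locally, and that the template's parameter keys match the console labels (`InstanceType`, `KeyPair`, `MyIp`, `SubnetId`, `VpcId`); check the Parameters section of the downloaded template before relying on them. The helper names are hypothetical. `CAPABILITY_IAM` corresponds to the IAM acknowledgement checkbox; a template that assigns custom names to IAM resources would need `CAPABILITY_NAMED_IAM` instead.

```shell
# Turn a bare IPv4 address into the single-host CIDR that MyIp expects.
to_host_cidr() {
  printf '%s/32\n' "$1"
}

# Look up the public IP this machine presents to the internet.
# checkip.amazonaws.com is an AWS-run endpoint that returns it as plain text.
lookup_public_ip() {
  curl -s https://checkip.amazonaws.com
}

# Create the stack, then block until CloudFormation reports CREATE_COMPLETE.
# Arguments: template file, instance type, key pair name, subnet ID, VPC ID, MyIp CIDR.
# ASSUMPTION: the ParameterKey names mirror the console labels; verify them
# against the template you downloaded.
deploy_synthea_stack() {
  local template="$1" instance_type="$2" key_pair="$3" subnet_id="$4" vpc_id="$5" my_ip="$6"
  aws cloudformation create-stack \
    --stack-name synthea-on-aws \
    --template-body "file://${template}" \
    --capabilities CAPABILITY_IAM \
    --parameters \
      "ParameterKey=InstanceType,ParameterValue=${instance_type}" \
      "ParameterKey=KeyPair,ParameterValue=${key_pair}" \
      "ParameterKey=MyIp,ParameterValue=${my_ip}" \
      "ParameterKey=SubnetId,ParameterValue=${subnet_id}" \
      "ParameterKey=VpcId,ParameterValue=${vpc_id}" &&
  aws cloudformation wait stack-create-complete --stack-name synthea-on-aws
}
```

A call would pass your own values in the argument order listed in the comment, with the last argument produced by `to_host_cidr "$(lookup_public_ip)"`. The `wait` subcommand returns once the stack reaches CREATE_COMPLETE and exits with an error if creation fails.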

Generate synthetic data using Synthea™

Connect to EC2 instance

  • Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
  • In the navigation pane, choose Instances.
  • Select the instance that starts with the name Synthea and choose Connect.
  • Choose EC2 Instance Connect.
  • Verify the user name and choose Connect to open a terminal window.
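
As a possible terminal alternative to the console steps above, the instance can be located by its Name tag and opened through EC2 Instance Connect. This is a sketch under several assumptions: a recent AWS CLI v2 that includes the `ec2-instance-connect ssh` subcommand, exactly one running instance whose Name tag starts with Synthea, and a machine whose IP address matches the MyIp value given at deployment. The function names are hypothetical.

```shell
# Print the ID of the running instance whose Name tag starts with "Synthea".
# ASSUMPTION: exactly one such instance exists in the current Region.
find_synthea_instance() {
  aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=Synthea*" "Name=instance-state-name,Values=running" \
    --query "Reservations[].Instances[].InstanceId" \
    --output text
}

# Open a terminal session on that instance through EC2 Instance Connect.
connect_to_synthea() {
  local instance_id
  instance_id="$(find_synthea_instance)"
  if [ -z "$instance_id" ]; then
    echo "no running Synthea instance found" >&2
    return 1
  fi
  aws ec2-instance-connect ssh --instance-id "$instance_id"
}
```

Running `connect_to_synthea` opens the session, or prints an error if no matching instance is running.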

Generate synthetic dataset

  • The instance already has Amazon Corretto and synthea-with-dependencies.jar installed.
  • Run a command to generate a synthetic dataset. For example, the following generates a population of 1000 patients: java -jar synthea-with-dependencies.jar -p 1000
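
The command above can be wrapped so that repeated runs stay consistent. The sketch below assumes it is run on the instance from the directory that contains synthea-with-dependencies.jar; `generate_patients` is a hypothetical helper. It uses Synthea's `-p` (population size) and `-s` (random seed) options and the optional trailing state name.

```shell
# Generate a synthetic population with the jar preinstalled on the instance.
# $1 = population size (-p)
# $2 = optional random seed (-s), so that a run can be repeated
# $3 = optional US state name; when omitted, Synthea uses its default location
generate_patients() {
  local population="$1" seed="${2:-}" state="${3:-}"
  # Build the argument list step by step in the function's positional parameters.
  set -- -jar synthea-with-dependencies.jar -p "$population"
  if [ -n "$seed" ]; then
    set -- "$@" -s "$seed"
  fi
  if [ -n "$state" ]; then
    set -- "$@" "$state"
  fi
  java "$@"
}
```

`generate_patients 1000` is equivalent to the command shown above, and `generate_patients 50 12345 Massachusetts` generates a small, seeded population for one state. Results are written to the output directory under the current working directory.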

Copying output to S3 for persistent storage

  • In the terminal window, run the following command to copy the output to S3, replacing xxxxxx with your AWS account number: aws s3 sync output s3://synthea-output-xxxxxx
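
To avoid typing the account number by hand, the bucket name can be derived from the credentials in use. This is a minimal sketch, assuming the bucket follows the synthea-output-&lt;account number&gt; pattern described above and that the command runs from the directory containing the output folder; both function names are hypothetical.

```shell
# Build the bucket name from the account ID of the current credentials.
# "aws sts get-caller-identity" needs no special IAM permissions.
output_bucket_name() {
  local account_id
  account_id="$(aws sts get-caller-identity --query Account --output text)"
  printf 'synthea-output-%s\n' "$account_id"
}

# Copy new or changed files from ./output to the bucket.
upload_output() {
  local bucket
  bucket="$(output_bucket_name)"
  aws s3 sync output "s3://${bucket}"
}
```

Calling `upload_output` after each generation run uploads only files that are new or changed since the previous sync.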

Clean Up

  • Before you delete the S3 bucket, you must first empty the bucket.
    • Navigate to the Amazon S3 console.
    • Click the Bucket name link of the bucket you created (synthea-output-xxxxxx).
    • Check the top box to select all objects in the bucket. In the Actions menu, select Delete. Confirm by choosing Delete in the confirmation modal.
  • Go to the CloudFormation console.
    • Select the stack that you created earlier (synthea-on-aws) and choose Delete.
    • Select Delete on the prompt titled Delete stack?.
    • The stack status will change to DELETE_IN_PROGRESS.
    • After a few moments, all the resources created by the stack will be deleted.
  • If AWS CloudFormation fails to delete your stack, follow the instructions below to delete the resources manually.
    • To terminate an instance
      • Open the Amazon EC2 console.
      • In the navigation pane, choose Instances.
      • Select the instance that starts with the name Synthea, and choose Instance state, Terminate instance.
      • Choose Terminate when prompted for confirmation.
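
The clean-up sequence can also be sketched as terminal commands, assuming the AWS CLI is configured locally and the stack was named synthea-on-aws as in the deployment steps. `teardown_synthea` is a hypothetical helper that takes the bucket name as its first argument. Note that `aws s3 rm --recursive` deletes current objects only; if versioning were enabled on the bucket, older versions would need to be removed separately.

```shell
# Empty the output bucket, delete the stack, and wait for deletion to finish.
# $1 = bucket name (for example synthea-output-<account number>)
# $2 = optional stack name, defaulting to synthea-on-aws
teardown_synthea() {
  local bucket="$1" stack="${2:-synthea-on-aws}"
  aws s3 rm "s3://${bucket}" --recursive &&
  aws cloudformation delete-stack --stack-name "$stack" &&
  aws cloudformation wait stack-delete-complete --stack-name "$stack"
}
```

Each step runs only if the previous one succeeded, so the stack is not deleted while the bucket still holds objects.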

Next: Common Configuration
