Import JSON File into DynamoDB

There has been a lot of discussion about how to import JSON data (with multiple items) into DynamoDB. A common claim is that no generic solution exists, or that it is very hard to do. However, things become pretty easy once you add jq to the picture. Here is how:

  • Create a test DynamoDB table named "test", with a hash key (hash, S) and a range key (range, S). One way to do this with the AWS CLI is shown below.

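For reference, here is a sketch of creating such a table with the AWS CLI. The on-demand billing mode is my assumption; you may prefer provisioned capacity instead.

# Create the "test" table with string hash and range keys (billing mode is an assumption).
aws dynamodb create-table \
    --table-name test \
    --attribute-definitions AttributeName=hash,AttributeType=S AttributeName=range,AttributeType=S \
    --key-schema AttributeName=hash,KeyType=HASH AttributeName=range,KeyType=RANGE \
    --billing-mode PAY_PER_REQUEST
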
  • I have the following data in test.json for the demo. In short, the test file contains three items. Apart from the hash key and the range key, each item has different extra attributes. This simulates a use case where the item schema varies within the table.

[{
    "hash": {"S": "ABC"},
    "range": {"S": "123"},
    "val_1": {"S": "ABCD"}
},
{
    "hash": {"S": "BCD"},
    "range": {"S": "234"},
    "val_2": {"S": "ABCD"}
},
{
    "hash": {"S": "XYZ"},
    "range": {"S": "567"},
    "val_1": {"S": "ABCD"},
    "val_2": {"S": "XYZA"}
}]

For each attribute of each item, you specify the attribute's data type. For example, "S" means the attribute is a string. For more information on this topic, please refer to the following AWS documentation:

https://docs.aws.amazon.com/cli/latest/reference/dynamodb/put-item.html

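As a side note (these attributes are hypothetical, not part of test.json), the same pattern covers other DynamoDB data types. For instance, "N" marks a number (the value is still written as a quoted string) and "BOOL" marks a boolean:

{
    "hash": {"S": "DEF"},
    "range": {"S": "890"},
    "count": {"N": "42"},
    "active": {"BOOL": true}
}
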
  • I have the following bash script, import.sh, to do the import. In short, we read the content of test.json with jq, one record at a time. For each record, we use the AWS CLI to write the item into DynamoDB. You will therefore need both jq and the AWS CLI in the run-time environment.

#!/bin/bash
# Emit each element of the top-level array as one compact JSON object per line,
# then write that object to DynamoDB with put-item. "read -r" preserves
# backslashes, and quoting "$i" keeps each record as a single argument.
jq -c '.[]' test.json | while read -r i; do
    aws dynamodb put-item --table-name test --item "$i"
done
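
For example, the first line produced by jq -c '.[]' test.json is the first item in compact form, which is exactly the value passed to --item:

{"hash":{"S":"ABC"},"range":{"S":"123"},"val_1":{"S":"ABCD"}}
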
  • Let's assume that import.sh and test.json are in the same folder. Do the import with the following commands:

chmod +x import.sh
./import.sh
  • After the script finishes, the three items show up in the DynamoDB table. You can verify this with a quick scan, as shown below.
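
One quick way to check the result is a full table scan, which is fine for a small test table like this one (though not something you would run against a large production table):

# List every item in the table, along with an item count.
aws dynamodb scan --table-name test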

It should be noted that I have tested this approach with a small data set only. The limiting factor is data size: jq parses the entire input file into memory before emitting records, so for a 50 GB data file you would want to run the import on an EC2 instance with well over 50 GB of memory, or switch to jq's streaming parser, sketched below.
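
As a hedged alternative for large files (untested here, but based on the --stream mode documented in the jq manual), the following variant reassembles each array element incrementally instead of loading the whole file at once:

#!/bin/bash
# --stream makes jq parse the input incrementally as a stream of events;
# fromstream(1|truncate_stream(inputs)) rebuilds each top-level array element
# from those events, so memory use stays roughly constant per item.
jq -cn --stream 'fromstream(1|truncate_stream(inputs))' test.json | while read -r i; do
    aws dynamodb put-item --table-name test --item "$i"
done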