Relational Database


Data is stored in a relational structure: tables that can be related to one another
Data is stored in tables as rows (records) and columns (attributes that describe each row)
SQL is used to access and query the data

Data Integrity

Completeness - constraints (for example, firstname and lastname cannot be null) ensure that consumers of the data get complete information
Consistency - consumers can rely on the data following the same rules everywhere
Accuracy - achieved by defining the data model and its relationships properly
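
As a minimal sketch of the completeness constraint described above, the following uses Python's built-in `sqlite3` module as a stand-in for any relational engine. The `customer` table, its column names, and the `insert_customer` helper are assumptions made for illustration only.

```python
import sqlite3


def insert_customer(conn, first_name, last_name):
    """Try to insert a row; return False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customer (first_name, last_name) VALUES (?, ?)",
                (first_name, last_name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
# NOT NULL is the constraint that enforces completeness here.
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " first_name TEXT NOT NULL,"
    " last_name TEXT NOT NULL)"
)

ok = insert_customer(conn, "Ada", "Lovelace")   # complete row
rejected = insert_customer(conn, "Grace", None)  # missing last name
row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

Only the complete row is stored, so anyone reading the table can rely on both name columns being present.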

Database Transactions

A transaction is a collection of SQL statements processed in sequence as a single unit
All-or-nothing behavior - if every statement succeeds, the changes are applied to the database; if any statement fails, no changes are applied
Database transactions must be ACID
- Atomic - the entire set of statements succeeds or none of it is applied, never just part of it
- Consistent - data written to the database must adhere to all defined rules and constraints
- Isolated - transactions are independent; one does not rely on or interfere with another that is in progress
- Durable - once a transaction is committed, its changes are permanent
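
The all-or-nothing behavior can be sketched with Python's built-in `sqlite3` module, again only as a stand-in for a managed relational engine. The `account` table, the `transfer` and `balances` helpers, and the sample balances are hypothetical.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move money between accounts as one transaction."""
    try:
        with conn:  # one transaction: commit if both succeed, else roll back
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {name: balance} for all accounts."""
    return dict(conn.execute("SELECT name, balance FROM account"))


conn = sqlite3.connect(":memory:")
# The CHECK constraint is the rule the data must stay consistent with.
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

first = transfer(conn, "alice", "bob", 30)    # both updates succeed
after_first = balances(conn)
second = transfer(conn, "alice", "bob", 500)  # debit violates the CHECK
after_second = balances(conn)
```

In the second transfer the credit to `bob` runs first and succeeds, but because the debit fails, the rollback undoes the credit as well, leaving the balances exactly as they were.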

Amazon Relational Databases supported

Amazon Aurora
MariaDB
Microsoft SQL Server
MySQL
Oracle
PostgreSQL

NoSQL (Non-Relational) Databases


Supports varied and flexible data models (unstructured or loosely structured schema)
Used to store large amounts of data with fewer constraints
Low latency - fewer constraints and rules to validate than in a relational database
Scalability and performance - less processing and validation, and data can be compressed and stored efficiently
Flexibility - stores different types of data

Types of NoSQL Database

Key-value
Document
Graph
Search
In-memory

Comparison between SQL and NoSQL databases

Database Consistency Models


DynamoDB

It is a fully managed NoSQL database built for performance and scalability
It is a serverless database that scales automatically, with no infrastructure to manage

It supports two read consistency models
Strongly consistent reads
- the read reflects every write that completed successfully before it, so the latest data is guaranteed to be returned
- data is always up to date
- more resource intensive, hence lower read performance
- must be requested explicitly on the read operation
Eventually consistent reads (default)
- maximize read performance
- may not reflect recent writes: a read issued right after a write, update, or delete can return stale data, but the change becomes visible after a short delay (typically within a few seconds)
- normal wherever data is replicated, for example across multiple Availability Zones
It can be deployed in a single Region or in multiple Regions (with replication across Regions)
Data is replicated across multiple Availability Zones within a Region
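
The difference between the two read modes can be illustrated with a toy model. This is not how DynamoDB is implemented; `ToyReplicatedStore` and its methods are hypothetical, and the `consistent_read` flag is only named after DynamoDB's `ConsistentRead` request parameter.

```python
class ToyReplicatedStore:
    """Toy model: one up-to-date leader copy plus one lagging replica."""

    def __init__(self):
        self.leader = {}
        self.replica = {}
        self.pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged once the leader has it.
        self.leader[key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Simulate the replication delay elapsing."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def read(self, key, consistent_read=False):
        # Strongly consistent reads go to the leader; eventually
        # consistent reads may be served by the lagging replica.
        source = self.leader if consistent_read else self.replica
        return source.get(key)


store = ToyReplicatedStore()
store.write("user#1", "v1")
store.replicate()
store.write("user#1", "v2")  # not yet replicated

stale = store.read("user#1")                        # eventually consistent
fresh = store.read("user#1", consistent_read=True)  # strongly consistent
store.replicate()
converged = store.read("user#1")                    # replica has caught up
```

The eventually consistent read returns the old value until replication catches up, while the strongly consistent read always sees the latest acknowledged write.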

Relational Database Service (RDS)


It is a fully managed web service for running relational databases
It is cost-efficient and scalable

Database Instances

A DB instance is the basic building block of RDS: an isolated database deployment, and a single instance can host multiple databases
Each instance has a DB instance identifier that clients and tooling use to refer to it
Limits (default quotas)
- up to 40 DB instances per account, depending on the database engine and license model
- SQL Server (license included) - max 10 instances
- Oracle (license included) - max 10 instances; with bring your own license (BYOL), up to 40 instances
- MySQL, MariaDB, PostgreSQL - up to 40 instances
Storage types
- General Purpose SSD - balanced price and performance, suits most workloads
- Provisioned IOPS SSD - higher cost, for I/O-intensive workloads
- Magnetic storage - low-cost option, intended mainly for backward compatibility
High Availability
- use a Multi-AZ deployment
- automatic failover - if the primary goes down in one zone, RDS fails over to a standby in another zone
Pricing
- On-Demand - pay for what you use
- Reserved - commit to a fixed term in exchange for a discounted hourly rate, paid partially or entirely upfront
Billing is by DB instance hours, with separate rates for Multi-AZ DB instances

Working with DB instance read replicas

A read replica is a read-only copy of a DB instance. You can reduce the load on your primary DB instance by routing queries from your applications to the read replica. In this way, you can elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads.

To create a read replica from a source DB instance, Amazon RDS uses the built-in replication features of the DB engine. For information about using read replicas with a specific engine, see the following sections:

Working with MariaDB read replicas

Working with read replicas for Microsoft SQL Server in Amazon RDS

Working with MySQL read replicas

Working with read replicas for Amazon RDS for Oracle

Working with read replicas for Amazon RDS for PostgreSQL

After you create a read replica from a source DB instance, the source becomes the primary DB instance. When you make updates to the primary DB instance, Amazon RDS copies them asynchronously to the read replica. The following diagram shows a source DB instance replicating to a read replica in a different Availability Zone (AZ). Clients have read/write access to the primary DB instance and read-only access to the replica.

Use cases for read replicas

Deploying one or more read replicas for a given source DB instance might make sense in a variety of scenarios, including the following:
Scaling beyond the compute or I/O capacity of a single DB instance for read-heavy database workloads. You can direct this excess read traffic to one or more read replicas.
Serving read traffic while the source DB instance is unavailable. In some cases, your source DB instance might not be able to take I/O requests, for example due to I/O suspension for backups or scheduled maintenance. In these cases, you can direct read traffic to your read replicas. For this use case, keep in mind that the data on the read replica might be "stale" because the source DB instance is unavailable.
Business reporting or data warehousing scenarios where you might want business reporting queries to run against a read replica, rather than your production DB instance.
Implementing disaster recovery. You can promote a read replica to a standalone instance as a disaster recovery solution if the primary DB instance fails.
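
A minimal sketch of the first use case (offloading read traffic) is shown below. `EndpointRouter` is a hypothetical application-side helper, not an RDS feature, the hostnames are placeholders, and the "starts with SELECT" check is a deliberate simplification.

```python
from itertools import cycle


class EndpointRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if none exist.
        self._replicas = cycle(replicas) if replicas else None

    def endpoint_for(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary


router = EndpointRouter(
    primary="primary.example.internal",
    replicas=["replica-1.example.internal", "replica-2.example.internal"],
)

queries = [
    "SELECT * FROM orders",
    "UPDATE orders SET status = 'shipped'",
    "select count(*) from orders",
    "SELECT 1",
]
targets = [router.endpoint_for(q) for q in queries]
```

Because replication is asynchronous, a read that must see a write made moments earlier is safer sent to the primary; a router like this suits reads that tolerate slightly stale data.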

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/blue-green-deployments-overview.html

Amazon DocumentDB


It is a NoSQL document database
Compatible with MongoDB
Storage volume scales automatically, up to 64 TB, with up to 15 replicas of the database
It can run inside a VPC
Health monitoring
Automated failover - restart and recover
Point-in-time cluster recovery
Supports encryption with AWS KMS

Considerations

It is organized into clusters (groups of instances); a cluster supports up to 16 DB instances
The primary instance handles writes; replica instances serve reads
Billing includes I/O consumption
Monitoring support

Interfaces

AWS Management Console
AWS CLI
MongoDB shell, tools, and drivers

Endpoints

To read and write data - use the cluster endpoint
To only read data - use the reader endpoint
To communicate with a specific instance - use the instance endpoint
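
The endpoint choice above can be expressed as a small lookup. This is only a sketch: `ClusterEndpoints`, its `pick` method, and every hostname below are hypothetical placeholders, not real DocumentDB addresses or an AWS API.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterEndpoints:
    """Holds the three kinds of endpoint a cluster exposes."""

    cluster: str                      # read and write
    reader: str                       # read only, spread over replicas
    instances: dict = field(default_factory=dict)  # instance id -> host

    def pick(self, operation, instance_id=None):
        """Return the endpoint to use for 'read' or 'write'."""
        if instance_id is not None:
            return self.instances[instance_id]  # talk to one instance
        if operation == "read":
            return self.reader
        return self.cluster


endpoints = ClusterEndpoints(
    cluster="mycluster.cluster.example.internal",
    reader="mycluster.cluster-ro.example.internal",
    instances={"replica-2": "replica-2.example.internal"},
)

write_host = endpoints.pick("write")
read_host = endpoints.pick("read")
pinned_host = endpoints.pick("read", instance_id="replica-2")
```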

Common Use Cases for DocumentDB

User profiles
Real-time big data
Content management

ElastiCache for Memcached


In-Memory database
High performance, scalable and cost effective
Reduce complexity of distributed cache deployment
Failure detection and recovery
Automatic node discovery - no need to reconfigure
Flexible availability zone placement

Considerations

Speed and cost
Data and access patterns (ex: look up tables, static data)
Manage staleness of data
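
One common way to manage staleness is to give each cached entry a time to live (TTL) and reload it from the source of truth once it expires. The sketch below uses an in-process dictionary in place of a Memcached cluster; `TtlCache`, `load_country`, and the injected clock are assumptions made so the example runs on its own.

```python
import time


class TtlCache:
    """Cache-aside helper: serve from cache until the entry expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._items = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._items.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]          # fresh hit
        value = loader(key)          # miss or expired: go to the source
        self._items[key] = (value, now + self.ttl)
        return value


# A fake clock keeps the demo deterministic.
fake_now = [0.0]
calls = []


def load_country(code):
    """Stand-in for a slow database lookup of static data."""
    calls.append(code)
    return {"DE": "Germany"}.get(code)


cache = TtlCache(ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get_or_load("DE", load_country)   # miss, loads from source
second = cache.get_or_load("DE", load_country)  # hit, no load
fake_now[0] = 61.0
third = cache.get_or_load("DE", load_country)   # expired, reloads
```

A shorter TTL reduces staleness at the cost of more trips to the database, which is the speed and cost trade-off listed above.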

Components

Nodes - a fixed-size allocation of caching memory; a cluster scales by adding or removing nodes
Cluster - a group of nodes of the same type
Regions and Availability Zones
Endpoints - each node has its own endpoint; the cluster's configuration endpoint lets clients discover the current nodes
Security - IAM policies, VPC, security groups, subnet groups
Event notifications

Interactions

AWS Management Console
AWS CLI
AWS SDK
ElastiCache API

Amazon Keyspaces (for Apache Cassandra)


It is a managed, Apache Cassandra-compatible database service
It is a serverless solution
Pay-per-use pricing
Virtually unlimited throughput and storage

Reasons to use Keyspaces

For Low latency apps
Open source development
Move cassandra workloads to cloud with ease

CQL - Cassandra Query Language

Similar to SQL; queries can be run in the CQL editor in the AWS Management Console

Amazon Neptune


It is a managed graph database
Suited to complex, highly connected application datasets
It stores the relationships between data items as part of the data
It uses a graph database engine that can process billions of relationships
It supports the Apache TinkerPop Gremlin and SPARQL query languages

Components

The database is organized into clusters; data is typically served by at least 2 instances (1. Primary - read and write operations, 2. Replica - read-only operations)

Cluster - contains the primary database instance, which lets users read and write data in the cluster
Clusters provide high availability and reliability
Neptune replica - up to 15 replicas per cluster
Cluster volume - where the data is stored

Amazon Quantum Ledger Database (QLDB)


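It is a ledger (journal) database whose key property is immutability
Users can read and write data, but once an entry is written to the journal it cannot be changed or deleted - useful in banking use cases
A typical use case is change tracking (a complete, verifiable history of changes)
It resembles a blockchain database in that the history is cryptographically verifiable, but the ledger is owned by a central authority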
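
The immutability idea can be sketched as an append-only journal in which each entry stores a hash that also covers the previous entry, so altering old data breaks the chain. This is a simplified illustration, not QLDB's internal design; `ToyJournal` and the sample documents are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class ToyJournal:
    """Append-only log where every entry is chained to the one before."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, payload):
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, document):
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        payload = json.dumps(document, sort_keys=True)
        entry_hash = self._digest(prev_hash, payload)
        self._entries.append(
            {"prev": prev_hash, "payload": payload, "hash": entry_hash}
        )
        return entry_hash

    def verify(self):
        """Recompute every hash; any edit to past data is detected."""
        prev_hash = GENESIS
        for entry in self._entries:
            expected = self._digest(prev_hash, entry["payload"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


journal = ToyJournal()
journal.append({"account": "A-1", "balance": 100})
journal.append({"account": "A-1", "balance": 70})
valid_before = journal.verify()

# Tamper with history by reaching into the private list.
journal._entries[0]["payload"] = json.dumps(
    {"account": "A-1", "balance": 999}, sort_keys=True
)
valid_after = journal.verify()
```

The change to the balance is recorded as a new entry rather than an overwrite, which is what makes the history usable for change tracking.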

Comparison QLDB vs Relational db

Concepts

Data object model
Journal-first transactions - every change is committed to the journal first, and the data you query is derived from the journal

Amazon Aurora Database


- It is a relational database
- It offers high availability and performance
- It is deployed as high-performance clusters with 1 primary and multiple replica instances

Storage and Reliability

Cluster volume - the data is stored in a single isolated volume that is decoupled from the instances; the primary and the replicas all reference the same volume, so storage grows with the data rather than with the number of instances
Storage is resized automatically, up to 128 TB
Billing is based on the storage used
Aurora replicates the data automatically so that replicas stay in sync, which results in high availability and reliability

Amazon Redshift


- It is a managed, petabyte-scale data warehouse
- It makes use of Redshift clusters
- A cluster is a collection of nodes; each node is one of 3 node types: RA3, DC2, or DS2

Amazon Timestream


- It is a managed time series database; a time series is a collection of data points received over time from a single source device (for example, temperature readings from a thermostat)
- Time is the key index in the data records
- It can store trillions of time series data points
- It can scale quickly
- It has extensive integration support for collecting time series data (for example, from IoT devices)

Architecture

Writes Architecture

Supported data types for stored values - BIGINT, BOOLEAN, DOUBLE, VARCHAR

Storage Architecture

Data is optimized as it is stored, which reduces storage cost
Data is retrieved using optimized queries
Supports retention policies (how long data is kept in each storage tier)

Query Architecture Model

Flat model - stores all data in a table, with the timestamp as a column
Time series model - each series is an ordered sequence of (timestamp, value) pairs, with the timestamp as the key
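
The relationship between the two models can be sketched in plain Python: flat rows are grouped by source and turned into time-ordered (timestamp, value) pairs. The row layout, the device names, and the `to_time_series` helper are assumptions for illustration and do not use the Timestream query API.

```python
from collections import defaultdict

# Flat model: one row per reading, with the timestamp as a column.
flat_rows = [
    {"device": "thermostat-1", "time": 3, "temperature": 21.5},
    {"device": "thermostat-2", "time": 1, "temperature": 19.0},
    {"device": "thermostat-1", "time": 1, "temperature": 20.5},
    {"device": "thermostat-1", "time": 2, "temperature": 21.0},
]


def to_time_series(rows, key, measure):
    """Group flat rows into one time-ordered series per source."""
    series = defaultdict(list)
    for row in rows:
        series[row[key]].append((row["time"], row[measure]))
    # Sorting the pairs orders each series by timestamp.
    return {source: sorted(points) for source, points in series.items()}


series = to_time_series(flat_rows, key="device", measure="temperature")
```

Each key in `series` is one source device, and its value is the ordered sequence of readings that the time series model describes.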

Quiz


AWS ‐ Database Services ‐ RDS | DocumentDB | DynamoDB | ElastiCache | Keyspaces | Neptune | Quantum Ledger | Aurora | Redshift | Timestream - FullstackCodingGuy/Developer-Fundamentals GitHub Wiki


What element corresponds to a data entry in an Amazon Timestream database?

In a collection named “books,” what command would be used to retrieve all documents?

How many Microsoft SQL Server databases can be deployed via the Amazon Relational Database Service per AWS account?

How many replicas can an Amazon Aurora cluster include?

What command is required when adding clauses to Amazon Keyspaces read operations?

What element of a relational database corresponds to an individual data value?

What Amazon DynamoDB element corresponds to an individual data value?

What data node type is only recommended for legacy applications?

What consistency model ensures that data is guaranteed to be included in read operations immediately after a write operation? Answer: Strongly consistent

What database capability type does Amazon Neptune provide? Answer: Graph

What is the format type of a data object stored within Amazon DocumentDB? Answer: JSON document

What is an Amazon Quantum Ledger Database (QLDB) journal?

What database service does Amazon Keyspaces wrap? Answer: Apache Cassandra

What tool is used for programmatic access to Amazon ElastiCache for Memcached?
