# AWS ‐ Database Services ‐ RDS | DocumentDB | DynamoDB | ElastiCache | Keyspaces | Neptune | Quantum Ledger | Aurora | Redshift | Timestream

## Relational Databases
- Data is stored in a relational structure
- Data is stored in tables as rows and columns (columns hold the property information for each row)
- Use SQL to access/query the data
- Completeness of the data (you can define constraints such as firstname and lastname cannot be null, so the consumer of the data always has complete information)
- Consistency - consumers can rely on the data
- Accuracy - achieved by defining the data model and relationships properly
- A transaction is a collection of SQL statements processed in sequence
- All-or-none behavior - if every statement execution succeeds, the changes are applied to the database; if any statement fails, no changes are applied
- DB transactions must be ACID
  - Atomic - the entire set of statements succeeds or fails as a unit, never just part of it
  - Consistent - data written to the DB must adhere to all the rules and constraints defined
  - Isolated - transactions are independent; one does not rely on any other transaction to succeed
  - Durable - once committed, all changes made to the DB are permanent
- Amazon Aurora
- MariaDB
- MS SQL Server
- MySQL
- Oracle
- PostgreSQL
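The all-or-none transaction behavior described above can be demonstrated with Python's built-in `sqlite3` module; the table and values here are hypothetical, chosen only to show a constraint violation rolling back an entire transaction.

```python
import sqlite3

# In-memory database; a minimal sketch of all-or-none behavior.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE accounts SET balance = balance - 30 WHERE name = 'alice'")
        # This statement violates the NOT NULL constraint, so the whole
        # transaction is rolled back -- the debit above is undone too.
        conn.execute("UPDATE accounts SET balance = NULL WHERE name = 'bob'")
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # both rows unchanged: {'alice': 100, 'bob': 50}
```

Note how atomicity and consistency interact here: the constraint (consistency) is what triggers the failure, and atomicity guarantees the earlier, individually valid update is discarded along with it.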
## NoSQL Databases
- Support varied data models (unstructured/flexible schema)
- Used to store large amounts of data with fewer constraints
- Support flexible data models
- Provide low latency, because there is less validation of constraints and rules compared to a relational database
- Scalability & performance - less processing and validation; efficient to compress and store data
- Flexibility - store different types of data
- Types
  - Key-Value
  - Document DB
  - Graph DB
  - Search DB
  - In-Memory DB
## DynamoDB
- A fully managed NoSQL DB that boasts performance and scalability
- Serverless; it scales automatically, with no infrastructure to worry about
- Supports a strong consistency model
  - all writes are completed before any read operation is performed, so the latest data is guaranteed to be returned
  - data is always up to date
  - this is more resource intensive, hence lower performance
- Also supports eventual consistency, where you may not read the same data immediately after a write because there is a short replication delay
- Default: eventual consistency
  - maximizes the performance of read operations
  - may not capture recent writes: with eventual consistency enabled, a read immediately after a write is not guaranteed to see that write, but the data will eventually be available (within a few seconds), not immediately. A read may therefore return stale data while a write/update/delete operation propagates
- Eventual consistency is normal in scenarios where replication happens, such as across multiple Availability Zones
- Strongly consistent reads can be requested per operation
- Can be deployed in a single region or across multiple regions (replication happens across regions)
- Supports multiple Availability Zones
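The consistency choice above surfaces as a single flag on DynamoDB's GetItem API, `ConsistentRead`. A minimal sketch that builds the request parameters without making a live call; the table and key names are hypothetical, and the helper function is for illustration only.

```python
# Request parameters for a DynamoDB GetItem call (e.g. via boto3's get_item).
# Table and key names here are hypothetical.
def get_item_params(table, key, strong=False):
    """Build GetItem parameters; ConsistentRead=False (eventual) is the default."""
    return {
        "TableName": table,
        "Key": key,
        # True = strongly consistent read; costs more read capacity and
        # may have higher latency, but never returns stale data.
        "ConsistentRead": strong,
    }

eventual = get_item_params("Users", {"UserId": {"S": "u-123"}})
strong = get_item_params("Users", {"UserId": {"S": "u-123"}}, strong=True)
print(eventual["ConsistentRead"], strong["ConsistentRead"])  # False True
```

With boto3 these dicts would be passed as keyword arguments: `dynamodb.get_item(**strong)`.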
## RDS (Relational Database Service)
- A fully managed, web-based relational database service
- Cost efficient and scalable
- A DB instance is the basic unit; an instance is a deployment, and a single instance can host multiple databases
- The DB instance identifier is used when connecting to and managing the instance
- Limitations
  - max 40 DB instances per account, depending on the type of database
  - SQL Server editions - max 10 instances
  - Oracle - max 10 instances, but if you bring your own license, up to 40 instances can be deployed
  - MySQL, MariaDB, PostgreSQL - up to 40 instances
- Storage options
  - General Purpose SSD - balanced cost and performance
  - Provisioned IOPS - higher cost, for I/O-intensive workloads
  - Magnetic storage - the cost-effective option
- High availability
  - use a multi-AZ deployment
  - automatic failover - if the DB goes down in one zone, another DB is automatically spun up in another zone
- Pricing
  - On-demand - pay as you use
  - Reserved - a fixed time frame at a discounted hourly rate, paid partially upfront or all upfront
  - Billing is by DB instance, including multi-AZ DB instances
A read replica is a read-only copy of a DB instance. You can reduce the load on your primary DB instance by routing queries from your applications to the read replica. In this way, you can elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads.
To create a read replica from a source DB instance, Amazon RDS uses the built-in replication features of the DB engine. For information about using read replicas with a specific engine, see the following sections:
- Working with MariaDB read replicas
- Working with read replicas for Microsoft SQL Server in Amazon RDS
- Working with MySQL read replicas
- Working with read replicas for Amazon RDS for Oracle
- Working with read replicas for Amazon RDS for PostgreSQL
After you create a read replica from a source DB instance, the source becomes the primary DB instance. When you make updates to the primary DB instance, Amazon RDS copies them asynchronously to the read replica. In a typical setup, a source DB instance replicates to a read replica in a different Availability Zone (AZ): clients have read/write access to the primary DB instance and read-only access to the replica.
Deploying one or more read replicas for a given source DB instance might make sense in a variety of scenarios, including the following:
- Scaling beyond the compute or I/O capacity of a single DB instance for read-heavy database workloads. You can direct this excess read traffic to one or more read replicas.
- Serving read traffic while the source DB instance is unavailable. In some cases, your source DB instance might not be able to take I/O requests, for example due to I/O suspension for backups or scheduled maintenance. In these cases, you can direct read traffic to your read replicas. For this use case, keep in mind that the data on the read replica might be "stale" because the source DB instance is unavailable.
- Business reporting or data warehousing scenarios where you might want business reporting queries to run against a read replica, rather than your production DB instance.
- Implementing disaster recovery. You can promote a read replica to a standalone instance as a disaster recovery solution if the primary DB instance fails.
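Creating a read replica is a single API call, `CreateDBInstanceReadReplica`. A hedged sketch of the parameters, assuming hypothetical instance identifiers; no live call is made here.

```python
# Parameters for RDS's CreateDBInstanceReadReplica API
# (boto3: client.create_db_instance_read_replica).
# All identifiers below are hypothetical.
replica_params = {
    "DBInstanceIdentifier": "myapp-replica-1",      # name for the new read replica
    "SourceDBInstanceIdentifier": "myapp-primary",  # existing source DB instance
    "AvailabilityZone": "us-east-1b",               # place the replica in a different AZ
    "PubliclyAccessible": False,
}

# With boto3 (assuming AWS credentials are configured):
# import boto3
# rds = boto3.client("rds")
# rds.create_db_instance_read_replica(**replica_params)
```

Promotion for disaster recovery is a separate call (`promote_read_replica`), after which the replica stops replicating and accepts writes.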
## DocumentDB

- A NoSQL DB
- Compatible with MongoDB
- Automatic volume scaling - storage grows up to 64 TB, with up to 15 replicas of the database
- Can be deployed within VPCs
- Health monitoring
- Automated failover - restart and recover
- Point-in-time cluster recovery
- Supports KMS encryption
- Supports clusters (groups of instances) of up to 16 database instances
- The primary instance is used for writing; secondary instances are used for reading
- Invoiced by I/O consumption
- Monitoring support
- Access via the AWS Management Console, the AWS CLI, or the MongoDB shell, tools, and drivers
- Endpoints
  - to read and write data - use the cluster endpoint
  - to only read data - use the reader endpoint
  - to communicate with a specific replica - use the instance endpoint
- Use cases
  - user profiles
  - real-time big data
  - content management
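Because DocumentDB is MongoDB-compatible, the endpoint types above are just different hostnames in an ordinary MongoDB connection string. A sketch with hypothetical hostnames and credentials:

```python
# DocumentDB clients connect with MongoDB connection strings. The general shapes:
#   cluster endpoint  -> read/write (routed to the primary)
#   reader endpoint   -> read-only (spread across replicas); note "cluster-ro"
#   instance endpoint -> one specific instance
# All hostnames and credentials below are hypothetical.
cluster_uri = (
    "mongodb://user:pass@mycluster.cluster-abc123."
    "us-east-1.docdb.amazonaws.com:27017/?tls=true"
)
reader_uri = (
    "mongodb://user:pass@mycluster.cluster-ro-abc123."
    "us-east-1.docdb.amazonaws.com:27017/?tls=true"
)

# With pymongo (if installed): client = MongoClient(cluster_uri)
```

Pointing read-heavy code at the reader endpoint keeps reporting traffic off the write primary, the same pattern as RDS read replicas.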
## ElastiCache
- An in-memory database service
- High performance, scalable, and cost effective
- Reduces the complexity of deploying a distributed cache
- Failure detection and recovery
- Automatic node discovery - no need to reconfigure clients
- Flexible Availability Zone placement
- Caching considerations
  - speed and cost
  - data and access patterns (e.g. lookup tables, static data)
  - managing staleness of the data
- Key concepts
  - Nodes - a static allocation of caching memory, with scaling capability
  - Cluster - a group of nodes of the same type
  - Regions and Availability Zones
  - Endpoints - used to connect to and manage the configuration of clusters
  - Security - IAM policies, VPC, security groups, subnet groups
  - Event notifications
- Access via the AWS Management Console, AWS CLI, AWS SDK, or the ElastiCache API
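The caching considerations above (access patterns, staleness) come together in the common cache-aside pattern. A runnable sketch in which a plain dict with per-key expiry stands in for an ElastiCache (Redis/Memcached) node; the lookup function and TTL are hypothetical.

```python
import time

# Cache-aside sketch: a dict with per-key expiry stands in for a cache node.
cache = {}  # key -> (value, expires_at)
TTL = 60    # manage staleness: entries expire after 60 seconds

def slow_lookup(key):
    """Stand-in for a query against the primary database."""
    return f"row-for-{key}"

def get(key):
    entry = cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                      # cache hit: skip the database
    value = slow_lookup(key)                 # cache miss: go to the database
    cache[key] = (value, time.time() + TTL)  # populate the cache with a TTL
    return value

print(get("user:1"))  # miss -> fetched from the "database"
print(get("user:1"))  # hit  -> served from the cache
```

The TTL is the staleness knob: static lookup tables tolerate long TTLs, while fast-changing data needs short TTLs or explicit invalidation on write.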
## Keyspaces (for Apache Cassandra)
- A managed Apache Cassandra database solution
- Serverless
- Pay-per-use service
- Virtually unlimited throughput and storage
- For low-latency apps
- Compatible with open-source Cassandra development, tools, and drivers
- Move Cassandra workloads to the cloud with ease
- Queried with CQL, which is similar to SQL; you can run CQL in the CQL editor in the AWS Management Console
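A sketch of how closely CQL resembles SQL, shown as plain statement strings; the keyspace, table, and columns are hypothetical.

```python
# CQL statements like these can be run in the AWS console's CQL editor
# or issued through a Cassandra driver. Names below are hypothetical.
create_stmt = """
CREATE TABLE IF NOT EXISTS shop.orders (
    order_id uuid PRIMARY KEY,
    customer text,
    total decimal
)
"""

# Reads address rows by partition key, much like a SQL WHERE clause.
select_stmt = "SELECT customer, total FROM shop.orders WHERE order_id = ?"

# With a Cassandra driver (if installed), roughly:
# session.execute(create_stmt)
# session.execute(session.prepare(select_stmt), [some_uuid])
```

The main departure from SQL is the data model: queries are expected to filter on the partition key, so tables are designed around access patterns rather than normalized relations.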
## Neptune
- A managed graph database
- For complex application datasets
- Preserves the relationships within the data
- Uses a graph database engine that can process billions of object relationships
- Supports the TinkerPop Gremlin and SPARQL query languages
- A Neptune database is organized into clusters; data is stored on at least 2 instances (1. primary - read + write operations, 2. replica - read-only operations)
- Cluster - contains the primary database instance, which allows users to read and write data in the cluster
- Clusters provide high availability and reliability
- Neptune replicas - up to 15 replicas per cluster
- Cluster volume - where the data is stored
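To illustrate the two supported query languages, here is the same kind of relationship question in Gremlin and SPARQL, shown as plain query strings; the labels, properties, and IRIs are hypothetical.

```python
# "Which names does the person 'alice' follow?" in both query languages.
# Labels, property names, and IRIs below are hypothetical.

# TinkerPop Gremlin: a traversal over a property graph.
gremlin_query = (
    "g.V().hasLabel('person').has('name', 'alice')"
    ".out('follows').values('name')"
)

# SPARQL: pattern matching over RDF triples.
sparql_query = """
SELECT ?name WHERE {
  ?alice <http://example.org/name> "alice" .
  ?alice <http://example.org/follows> ?friend .
  ?friend <http://example.org/name> ?name .
}
"""
```

Gremlin targets Neptune's property-graph data, while SPARQL targets RDF data; a given dataset is loaded as one model or the other.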
## Quantum Ledger Database (QLDB)
- A ledger (journal) database built around immutability
- Users can read and write data in the database; the difference is that once data is written, it cannot be changed - applicable in banking use cases
- Primary use case is change tracking
- Works like a blockchain DB
- Document-style data object model
- Journal-first transactions - the data shown is materialized from the journal
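The journal-first, immutable behavior can be sketched as a hash-chained append-only log. QLDB implements and cryptographically verifies this for you; this toy version only illustrates the idea that rewriting history breaks the chain.

```python
import hashlib
import json

# Append-only journal: each entry's hash covers its data plus the
# previous entry's hash, so tampering with any past entry is detectable.
journal = []

def append(data):
    prev_hash = journal[-1]["hash"] if journal else "0" * 64
    payload = json.dumps(data, sort_keys=True) + prev_hash
    journal.append({"data": data,
                    "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify():
    """Recompute the chain from the start; False means history was altered."""
    prev_hash = "0" * 64
    for entry in journal:
        payload = json.dumps(entry["data"], sort_keys=True) + prev_hash
        if hashlib.sha256(payload.encode()).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

append({"account": "a-1", "balance": 100})
append({"account": "a-1", "balance": 70})
print(verify())                       # True: chain intact
journal[0]["data"]["balance"] = 999   # tamper with history
print(verify())                       # False: tampering detected
```

Note that corrections in such a model are made by appending a new revision, never by editing an old entry, which is what makes the journal a complete change history.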
## Aurora
- A relational DB offering high availability and performance
- Offers high-performance clusters - 1 primary and multiple replica instances
- Cluster volume contents - the data is stored in a single isolated volume, decoupled from the instances; the primary and the replicas all reference the same volume, so as you add more and more data, you need more and more storage
- Handles automatic storage resizing - supports up to 128 TB
- Billing is based on the storage used
- Aurora automatically replicates the data to the replicas, which results in high availability and reliability
## Redshift
- A managed, petabyte-scale data warehouse implementation
- Makes use of Redshift clusters - a cluster is a collection of nodes, and each node is one of 3 node types: RA3, DC2, or DS2

## Timestream

- A managed time-series database, collecting data received over time from a source device (e.g. temperature readings from thermostats)
- Time is the key index in the data records
- Can store trillions of time-series data points
- Scales quickly
- Extensive integration support for collecting time-series data (e.g. IoT devices)
- Stores the data types BIGINT, BOOLEAN, DOUBLE, and VARCHAR
- Data is stored in an optimized form
- Data is retrieved using optimized queries, and the optimized storage reduces storage cost
- Supports retention policies
- Flat model - stores all data in a table and uses a timestamp column
- Time-series model - every value in the table is a key-value pair, with the timestamp as the key
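The flat and time-series shapes above can be sketched side by side, along with a record in the dimensions + measure + time layout that Timestream's WriteRecords API uses; the device name and values are hypothetical.

```python
import time

# One thermostat reading expressed in the two shapes described above.
# Device name and values are hypothetical.
now_ms = str(int(time.time() * 1000))

# Flat model: one row per reading, with the timestamp as a column.
flat_row = {"device_id": "thermostat-7", "temperature": 21.5, "time": now_ms}

# Time-series model: the timestamp itself is the key of a key-value pair.
series_point = {now_ms: 21.5}

# The same reading in a Timestream-style record
# (dimensions identify the source; the measure carries the value).
record = {
    "Dimensions": [{"Name": "device_id", "Value": "thermostat-7"}],
    "MeasureName": "temperature",
    "MeasureValue": "21.5",
    "MeasureValueType": "DOUBLE",  # one of the supported data types
    "Time": now_ms,
}
```

Batches of such records would be written via the WriteRecords API (boto3: `timestream_write.write_records`), with retention policies then controlling how long each record stays in memory versus magnetic storage.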