AWS_Data - kamialie/knowledge_corner GitHub Wiki
Glue
Managed and serverless Extract, Transform, and Load (ETL) service. Supports
RDS, DynamoDB, Redshift, and S3.
Used to prepare and transform data for analytics.
Elastic MapReduce (EMR)
Big data processing in the cloud. Runs big data workloads on EC2 and S3.
Supports popular open-source frameworks and tools like Apache Spark, Apache
Flink, Apache Hive, Apache Hudi, Apache HBase, Presto, and Hadoop. Operates in a
clustered environment.
Data catalog
Holds metadata catalogs of data. A crawler scans compatible data stores
(S3, RDS, DynamoDB, or any JDBC-compatible source) and writes metadata to the
catalog. The resulting tables can be analyzed by Athena, Redshift Spectrum, and
EMR.
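The crawl-then-query flow described above can be sketched with boto3-style calls. This is a minimal sketch under stated assumptions, not a definitive setup: the crawler name, IAM role ARN, database name, table name, and bucket paths are all hypothetical. The functions take an already-created client (for example `boto3.client("glue")`), so the shape of each request is visible without AWS credentials.

```python
def create_s3_crawler(glue_client, name, role_arn, database, s3_path):
    """Register and start a crawler that scans an S3 prefix and writes
    table metadata into the given Data Catalog database."""
    glue_client.create_crawler(
        Name=name,
        Role=role_arn,  # IAM role the crawler assumes (hypothetical in the usage below)
        DatabaseName=database,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    glue_client.start_crawler(Name=name)


def query_catalog_table(athena_client, database, sql, output_location):
    """Start an Athena query against a catalog table and return the query ID."""
    response = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Usage (requires boto3 and AWS credentials; every name here is hypothetical):
#   import boto3
#   create_s3_crawler(boto3.client("glue"), "sales-crawler",
#                     "arn:aws:iam::123456789012:role/GlueCrawlerRole",
#                     "sales_db", "s3://example-bucket/sales/")
#   query_catalog_table(boto3.client("athena"), "sales_db",
#                       "SELECT * FROM sales LIMIT 10",
#                       "s3://example-bucket/athena-results/")
```

Athena queries run asynchronously, so the returned ID would then be polled for status before results are read from the output location.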
Other services
Data Pipeline - data workflow orchestration service across AWS. Also a managed ETL service; supports S3, EMR, Redshift, DynamoDB, and RDS. Can integrate on-prem data stores.
QuickSight - business intelligence service enabling dynamic data dashboards.
CloudSearch - managed search service for custom apps. Charges per hour and instance type for underlying infrastructure.
Artificial intelligence:
Transcribe - speech recognition service that converts speech into text; includes a dedicated sub-service for medical use, works both in batch and real-time, and supports around 31 languages
Comprehend - discovers patterns in text
Fraud Detector - detects potentially fraudulent online activities
Lex - builds voice and text chatbots
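As one illustration of "discovering patterns in text", the sketch below combines two Comprehend operations, sentiment detection and entity detection, into a single summary. It is a hedged example rather than a recommended pipeline: `summarize_text` is a hypothetical helper, and the client is passed in (for example `boto3.client("comprehend")`) so the request and response shapes can be read without AWS access.

```python
def summarize_text(comprehend_client, text, language_code="en"):
    """Return the overall sentiment and the named entities found in text."""
    sentiment = comprehend_client.detect_sentiment(
        Text=text, LanguageCode=language_code
    )
    entities = comprehend_client.detect_entities(
        Text=text, LanguageCode=language_code
    )
    return {
        "sentiment": sentiment["Sentiment"],
        # Keep only the entity type and the matched text span.
        "entities": [(e["Type"], e["Text"]) for e in entities["Entities"]],
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   summarize_text(boto3.client("comprehend"), "Amazon opened an office in Seattle.")
```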
Machine learning:
SageMaker - quickly build, train, and deploy machine learning models at scale
Textract - extracts text and data from scanned documents
Augmented AI (Amazon A2I) - provides built-in human review workflows for common machine learning use cases, such as content moderation and text extraction from documents; you can also create your own workflows for machine learning models built on SageMaker or any other tools
Rekognition - image and video recognition deep learning service; can identify objects in images (and actions in video) and detect specific people using facial analysis
Translate - supports around 54 languages, can perform language identification, works both in batch and real-time
DeepRacer - autonomous 1/18 scale race car that you can use to test reinforcement learning models
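The language identification mentioned for Translate can be sketched as follows. This is a minimal example assuming a boto3 Translate client is passed in (for example `boto3.client("translate")`); `translate_to` is a hypothetical helper name. Setting the source language to `"auto"` asks the service to detect it, and the detected code comes back in the response.

```python
def translate_to(translate_client, text, target_language="en"):
    """Translate text in real time, letting the service detect the source language.

    Returns a (detected_source_language, translated_text) tuple.
    """
    response = translate_client.translate_text(
        Text=text,
        SourceLanguageCode="auto",  # request automatic language identification
        TargetLanguageCode=target_language,
    )
    return response["SourceLanguageCode"], response["TranslatedText"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   translate_to(boto3.client("translate"), "Bonjour le monde", "en")
```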