AWS_Data - kamialie/knowledge_corner GitHub Wiki

Glue

Managed and serverless Extract, Transform, and Load (ETL) service. Supports RDS, DynamoDB, Redshift, and S3.

Used to prepare and transform data for analytics.
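
Starting an existing Glue ETL job from code can be sketched as below. This is a minimal illustration, not a definitive implementation: it assumes a boto3 Glue client (for example `boto3.client("glue")`) is passed in, and the job name and parameter names in the usage note are hypothetical. The Glue calls used are `start_job_run` and `get_job_run`.

```python
from typing import Any, Dict


def to_glue_arguments(params: Dict[str, Any]) -> Dict[str, str]:
    """Glue job arguments are strings whose keys carry a leading '--'."""
    return {f"--{key}": str(value) for key, value in params.items()}


def start_etl_job(glue_client: Any, job_name: str, params: Dict[str, Any]) -> str:
    """Start one run of an existing Glue job and return its run id."""
    response = glue_client.start_job_run(
        JobName=job_name,
        Arguments=to_glue_arguments(params),
    )
    return response["JobRunId"]


def job_run_state(glue_client: Any, job_name: str, run_id: str) -> str:
    """Return the state of a run, e.g. RUNNING, SUCCEEDED, or FAILED."""
    response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
    return response["JobRun"]["JobRunState"]
```

With credentials configured, `start_etl_job(boto3.client("glue"), "nightly-sales-etl", {"target_date": "2024-01-31"})` would start the (hypothetical) job `nightly-sales-etl`, and `job_run_state` can be polled with the returned run id.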

Elastic MapReduce (EMR)

Big data processing in the cloud. Enables big data processing on EC2 and S3. Supports popular open-source frameworks and tools such as Apache Spark, Apache Flink, Apache Hive, Apache Hudi, Apache HBase, Presto, and Hadoop. Operates in a clustered environment.
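
To make the clustered model concrete, here is a rough sketch of launching a transient Spark cluster that runs one step and then terminates. It assumes a boto3 EMR client (`boto3.client("emr")`) is passed in and uses its `run_job_flow` call. The release label, instance types, node counts, and S3 locations are illustrative assumptions, and the two role names are the EMR default roles, which must already exist in the account.

```python
from typing import Any, Dict


def build_spark_cluster_request(
    name: str, script_s3_uri: str, log_s3_uri: str, core_nodes: int = 2
) -> Dict[str, Any]:
    """Assemble keyword arguments for the EMR RunJobFlow call."""
    return {
        "Name": name,
        "ReleaseLabel": "emr-6.15.0",  # assumption: pick a release available to you
        "LogUri": log_s3_uri,
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # Transient cluster: shut down once the steps have finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [
            {
                "Name": "spark-job",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": ["spark-submit", "--deploy-mode", "cluster", script_s3_uri],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumption: default EC2 instance profile
        "ServiceRole": "EMR_DefaultRole",      # assumption: default service role
    }


def launch_cluster(emr_client: Any, request: Dict[str, Any]) -> str:
    """Submit the request and return the new cluster (job flow) id."""
    return emr_client.run_job_flow(**request)["JobFlowId"]
```

Keeping the request builder separate from the call makes the cluster definition easy to inspect or version before anything is launched.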

Data catalog

Part of Glue. Holds metadata catalogs of data. A crawler scans compatible data stores (S3, RDS, DynamoDB, or any JDBC-compatible source) and writes metadata to the catalog. The resulting tables can be queried by Athena, Redshift Spectrum, and EMR.
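
The crawler workflow can be sketched as follows. Crawlers are managed through the Glue API, so the sketch assumes a boto3 Glue client (`boto3.client("glue")`) is passed in and uses its `create_crawler` and `start_crawler` calls. The crawler name, role ARN, database name, and S3 path in the usage note are hypothetical.

```python
from typing import Any, Dict


def build_crawler_request(
    name: str, role_arn: str, database: str, s3_path: str
) -> Dict[str, Any]:
    """Assemble keyword arguments for the Glue CreateCrawler call."""
    return {
        "Name": name,
        "Role": role_arn,            # IAM role the crawler assumes to read the data
        "DatabaseName": database,    # catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(glue_client: Any, request: Dict[str, Any]) -> str:
    """Create the crawler, start its first run, and return the crawler name."""
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])
    return request["Name"]
```

For example, `build_crawler_request("sales-crawler", "arn:aws:iam::123456789012:role/GlueCrawlerRole", "sales_db", "s3://example-bucket/sales/")` describes a crawler over one S3 prefix; once its run finishes, the discovered tables appear in the `sales_db` catalog database.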

Other services

Data Pipeline - data workflow orchestration service across AWS. Also a managed ETL service; supports S3, EMR, Redshift, DynamoDB, and RDS. Can integrate on-premises data stores.

QuickSight - business intelligence service enabling dynamic data dashboards.

CloudSearch - managed search service for custom apps. Charges per hour and instance type for underlying infrastructure.

Artificial intelligence:

  • Transcribe - speech recognition service that converts speech into text; includes a dedicated sub-service for medical use, works in both batch and real time, and supports around 31 languages
  • Comprehend - discover patterns in text
  • Fraud Detector - discover potentially fraudulent online activities
  • Lex - build voice and text chatbots
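
As an illustration of the kind of patterns Comprehend finds in text, the sketch below combines sentiment and entity detection. It assumes a boto3 Comprehend client (`boto3.client("comprehend")`) is passed in and uses its `detect_sentiment` and `detect_entities` calls; the helper name `analyze_text` and the shape of its return value are choices made for this example.

```python
from typing import Any, Dict


def analyze_text(comprehend_client: Any, text: str, language_code: str = "en") -> Dict[str, Any]:
    """Return the overall sentiment and the (text, type) pairs of detected entities."""
    sentiment = comprehend_client.detect_sentiment(Text=text, LanguageCode=language_code)
    entities = comprehend_client.detect_entities(Text=text, LanguageCode=language_code)
    return {
        "sentiment": sentiment["Sentiment"],  # e.g. POSITIVE, NEGATIVE, NEUTRAL, MIXED
        "entities": [(e["Text"], e["Type"]) for e in entities["Entities"]],
    }
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, without AWS credentials.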

Machine learning:

  • SageMaker - quickly build, train, and deploy machine learning models at scale
  • Textract - extracts text and data from scanned documents
  • Augmented AI (Amazon A2I) - provides built-in human review workflows for common machine learning use cases, such as content moderation and text extraction from documents; you can also create your own workflows for machine learning models built on SageMaker or any other tools
  • Rekognition - image and video recognition deep learning service; can identify objects in images (and actions in video) and detect specific people using facial analysis
  • Translate - text translation service; supports around 54 languages, can perform language identification, works in both batch and real time
  • DeepRacer - autonomous 1/18 scale race car that you can use to test reinforcement learning models
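
Real-time translation with automatic language identification can be sketched as below. It assumes a boto3 Translate client (`boto3.client("translate")`) is passed in and uses its `translate_text` call, where a source language code of `auto` asks the service to detect the language. The helper name and return shape are choices made for this example, and batch translation is not covered.

```python
from typing import Any, Dict


def translate(
    translate_client: Any, text: str, target_language: str, source_language: str = "auto"
) -> Dict[str, str]:
    """Translate text and report which source language the service used."""
    response = translate_client.translate_text(
        Text=text,
        SourceLanguageCode=source_language,  # "auto" requests language identification
        TargetLanguageCode=target_language,
    )
    return {
        "text": response["TranslatedText"],
        "detected_source": response["SourceLanguageCode"],
    }
```

Calling `translate(boto3.client("translate"), "Hello", "es")` with credentials configured would return the Spanish text together with the detected source language code.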