GCP Service to Service Integration - vidyasekaran/GCP GitHub Wiki
Which GCP services integrate with which other services, and how? https://cloud.google.com/architecture/data-lifecycle-cloud-platform
Source | Destination | Integration Method
Amazon S3 | Cloud Storage | Storage Transfer Service
If you are transferring large numbers of files on a
daily basis from Amazon S3 to Cloud Storage, you
can use the Storage Transfer Service to transfer
files from sources including Amazon S3 and HTTP/HTTPS
services. You can set up regularly recurring transfers,
and Storage Transfer Service supports several advanced
options. The service takes advantage of the large network
bandwidth between major cloud providers and uses advanced
bandwidth-optimization techniques to achieve very high
transfer speeds.
https://cloud.google.com/storage-transfer-service
The following topics are covered:
- Scenario 1: transferring files from on-premises servers
- Scenario 2: transferring files from other cloud providers
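A minimal sketch of a recurring S3-to-Cloud Storage transfer with the `gcloud transfer` CLI (bucket names and the credentials file are placeholders; assumes AWS credentials are supplied in the JSON format the Storage Transfer Service docs describe):

```bash
# Create a transfer job from an S3 bucket to a Cloud Storage bucket
# that repeats every day. Requires the Storage Transfer API enabled
# and AWS credentials for the source bucket.
gcloud transfer jobs create \
  s3://my-source-bucket gs://my-destination-bucket \
  --source-creds-file=aws-creds.json \
  --schedule-repeats-every=1d
```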
App Engine | Cloud Logging | Apps running on App Engine automatically
log the details of each request and response to
Cloud Logging. You can also write custom logging
messages to stdout and stderr, which Cloud Logging
automatically collects and displays in the Logs Viewer.
Compute Engine / GKE | Cloud Logging | Cloud Logging provides a logging agent, based on **fluentd**,
that you can run on virtual machine (VM) instances hosted
on Compute Engine as well as on container clusters managed
by GKE. The agent streams log data from common
third-party apps and system software to Cloud Logging.
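A sketch of installing the fluentd-based Logging agent on a Debian/Ubuntu Compute Engine VM, using Google's documented installer script (on GKE the agent is typically managed for you, so this applies to VMs):

```bash
# Download and run Google's agent installation script on the VM,
# then verify the fluentd-based agent is running.
curl -sSO https://dl.google.com/cloudagents/add-logging-agent-repo.sh
sudo bash add-logging-agent-repo.sh --also-install
sudo service google-fluentd status
```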
Cloud Storage | BigQuery | An app outputs batch CSV files
to Cloud Storage object storage.
From there, the load (import) function
of BigQuery, an analytics data warehouse,
can pull the data in for analysis and querying
(see the `bq load` sketch after the links below).
- https://cloud.google.com/architecture/data-lifecycle-cloud-platform
- https://cloud.google.com/architecture/mobile-gaming-analysis-telemetry#streaming_pipeline
- https://cloud.google.com/bigquery
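A minimal sketch of the Cloud Storage to BigQuery load with the `bq` CLI (dataset, table, and bucket names are placeholders):

```bash
# Load CSV files from Cloud Storage into a BigQuery table,
# skipping the header row and letting BigQuery infer the schema.
bq load \
  --source_format=CSV \
  --skip_leading_rows=1 \
  --autodetect \
  my_dataset.my_table \
  gs://my-bucket/batch-export-*.csv
```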
Cloud Storage | Pub/Sub | Create a storage bucket and a Pub/Sub
notification topic, then set
a Pub/Sub notification for that bucket
so that when you store an image or file
in Cloud Storage, a message is sent to the notification topic.
In Cloud Shell, set the environment variables:
```bash
export REGION=us-central1
export GCS_NOTIFICATION_TOPIC="gcs-notification-topic"
export GCS_NOTIFICATION_SUBSCRIPTION="gcs-notification-subscription"
export PROJECT=$(gcloud config get-value project)
export VIDEO_CLIPS_BUCKET=${PROJECT}_videos
```
Create a Pub/Sub topic:
```bash
gcloud pubsub topics create ${GCS_NOTIFICATION_TOPIC}
```
Create a Pub/Sub subscription for the topic:
```bash
gcloud pubsub subscriptions create ${GCS_NOTIFICATION_SUBSCRIPTION} --topic=${GCS_NOTIFICATION_TOPIC}
```
Create a bucket to store the input video clips:
```bash
gsutil mb -c standard -l ${REGION} gs://${VIDEO_CLIPS_BUCKET}
```
Create a Pub/Sub notification for the bucket:
```bash
gsutil notification create -t ${GCS_NOTIFICATION_TOPIC} -f json gs://${VIDEO_CLIPS_BUCKET}
```
Now that you have configured notifications, the system sends
a Pub/Sub message to the topic that you created every
time you upload a file to the bucket.
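To sanity-check the wiring, you can upload a test file and pull the resulting notification message (the file name is a placeholder):

```bash
# Upload any file to the bucket; this should trigger a notification.
gsutil cp sample-clip.mp4 gs://${VIDEO_CLIPS_BUCKET}

# Pull one message from the subscription to inspect the JSON payload.
gcloud pubsub subscriptions pull ${GCS_NOTIFICATION_SUBSCRIPTION} --auto-ack --limit=1
```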
API reference: https://cloud.google.com/storage/docs/reference/libraries#client-libraries-usage-java
See also: https://github.com/vidyasekaran/GCP/wiki/GCP-Solutions
Cloud Storage | Dataflow | You can set up a Dataflow pipeline that polls every 10
seconds for new text files stored in Cloud Storage
and outputs each line to a Pub/Sub topic
(using Dataflow's "Create job from template"; see the sketch below).
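A sketch using the Google-provided "Text Files on Cloud Storage to Pub/Sub (Stream)" classic template (job name, bucket, project, and topic are placeholders; the template path is assumed from the public Dataflow templates bucket):

```bash
# Run the streaming GCS-text-to-Pub/Sub template; it watches the
# input file pattern and publishes each new line to the topic.
gcloud dataflow jobs run gcs-text-to-pubsub \
  --region=us-central1 \
  --gcs-location=gs://dataflow-templates/latest/Stream_GCS_Text_to_Cloud_PubSub \
  --parameters=inputFilePattern=gs://my-input-bucket/*.txt,outputTopic=projects/my-project/topics/my-topic
```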
Cloud Storage | Pub/Sub / Cloud SQL | Write a Cloud Function by selecting the
bucket's three-dot action menu, choosing an event such as file
creation, and adding code to push the object to Pub/Sub or any other service.
Cloud Storage | Cloud Scheduler / Cloud Function | Same pattern: trigger a Cloud Function
on a bucket event such as file creation, and write function
code that pushes the data to Pub/Sub or another service (deploy sketch below).
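A hedged sketch of deploying such a function with the gcloud CLI instead of the console (function name, bucket, and runtime are placeholders; assumes a `main.py` in the current directory defines an `on_upload(event, context)` handler that forwards the object metadata to Pub/Sub):

```bash
# Deploy a 1st-gen Cloud Function that fires on every object
# finalize event (file creation/overwrite) in the bucket.
gcloud functions deploy gcs-upload-handler \
  --runtime=python310 \
  --entry-point=on_upload \
  --trigger-resource=my-input-bucket \
  --trigger-event=google.storage.object.finalize \
  --source=.
```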
Pub/Sub | BigQuery / Bigtable / Cloud Storage | Ingesting user interaction and server events. To make use
of user interaction events from end-user apps or server
events from your system, you can forward them to Pub/Sub
and then use a stream processing tool such as **Dataflow**,
which delivers them to BigQuery, Bigtable, Cloud Storage,
and other databases. Pub/Sub allows you to gather events
from many clients simultaneously.
https://cloud.google.com/pubsub/docs/overview
NOTE: You can also use **Cloud Data Fusion** to ingest data from Pub/Sub to BigQuery.
Reference: https://codelabs.developers.google.com/codelabs/real-time-csv-cdf-bq#0
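As a sketch of the Pub/Sub to BigQuery leg using the Google-provided "Pub/Sub Topic to BigQuery" classic Dataflow template (project, topic, and table names are placeholders; the template path is assumed from the public Dataflow templates bucket):

```bash
# Stream messages from a Pub/Sub topic into a BigQuery table.
# The table schema must match the JSON messages being published.
gcloud dataflow jobs run pubsub-to-bq \
  --region=us-central1 \
  --gcs-location=gs://dataflow-templates/latest/PubSub_to_BigQuery \
  --parameters=inputTopic=projects/my-project/topics/my-topic,outputTableSpec=my-project:my_dataset.events
```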
Pub/Sub | Cloud Function | Create a Cloud Function and subscribe it to a Pub/Sub topic; when
a message is published to the topic, the Cloud Function is invoked,
so you can ingest the data into any service from there (deploy sketch below).
NOTE: You can also use a **Cloud Function** to ingest data from Pub/Sub into BigQuery.
https://medium.com/@milosevic81/copy-data-from-pub-sub-to-bigquery-496e003228a1
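A minimal deploy sketch (function name, topic, and runtime are placeholders; assumes a `main.py` defines `on_message(event, context)`, which decodes the base64 payload and writes it onward, e.g. to BigQuery):

```bash
# Deploy a 1st-gen function that is invoked for every message
# published to the topic.
gcloud functions deploy pubsub-handler \
  --runtime=python310 \
  --entry-point=on_message \
  --trigger-topic=my-topic \
  --source=.
```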
Cloud Scheduler | Pub/Sub | Create a job scheduled to run every minute that writes a message to a topic (sketch below).
https://cloud.google.com/scheduler/docs/quickstart
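A sketch of the scheduler job (job name, topic, message body, and location are placeholders):

```bash
# Publish a message to the topic once a minute (cron syntax).
gcloud scheduler jobs create pubsub minute-publisher \
  --schedule="* * * * *" \
  --topic=my-topic \
  --message-body="tick" \
  --location=us-central1
```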
Cloud Scheduler | Cloud Functions | Create a job scheduled to run every minute that invokes a Cloud Function (sketch below).
https://rominirani.com/google-cloud-functions-tutorial-using-the-cloud-scheduler-to-trigger-your-functions-756160a95c43
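For an HTTP-triggered function, the scheduler job calls the function's URL; a sketch (the URL and service account are placeholders; OIDC auth is assumed for a non-public function):

```bash
# Invoke an HTTP-triggered Cloud Function once a minute,
# authenticating with an OIDC token for the given service account.
gcloud scheduler jobs create http minute-invoker \
  --schedule="* * * * *" \
  --uri="https://us-central1-my-project.cloudfunctions.net/my-function" \
  --http-method=POST \
  --oidc-service-account-email=scheduler-sa@my-project.iam.gserviceaccount.com \
  --location=us-central1
```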
Ingesting to/from BigQuery - loading efficiency, tools, and methods explained:
https://cloud.google.com/blog/topics/developers-practitioners/bigquery-explained-data-ingestion
Building a Mobile Gaming Analytics Platform - a Reference Architecture
Covers:
- Real-time processing of individual events using a streaming processing pattern
- Bulk processing of aggregated events using a batch processing pattern
https://cloud.google.com/architecture/mobile-gaming-analysis-telemetry#streaming_pipeline
Pub/Sub has global endpoints and leverages Google’s global front-end load balancer to support data ingestion across all Google Cloud regions, with minimal latency.
**Migrating from on-prem to GCP: storage (Cloud Storage), databases, and VMs (Velostrata)**
https://bluemedora.com/migrating-from-on-prem-to-gcp-storage/
Deciding which data storage service to use, with details about each service (e.g., Bigtable):
https://cloud.google.com/architecture/data-lifecycle-cloud-platform