Coursera Preparing for GCP : Cloud Architect

GCP Computing Architecture

Virtualized data centers brought you Infrastructure as a Service (IaaS) and Platform as a Service (PaaS) offerings.

IaaS offerings provide raw compute, storage, and network organized in ways that are familiar from data centers.

PaaS offerings, on the other hand, bind application code you write to libraries that give access to the infrastructure your application needs. That way, you can just focus on your application logic.

In the IaaS model, you pay for what you allocate.

In the PaaS model, you pay for what you use.

Both sure beat the old way where you bought everything in advance based on lots of risky forecasting.

As Cloud Computing has evolved, the momentum has shifted towards managed infrastructure and managed services. GCP offers many services in which you need not worry about any resource provisioning at all. We'll discuss many in this course.

What about SaaS? Of course, Google's popular applications like Search, Gmail, Docs and Drive are Software as a Service applications in that they're consumed directly over the internet by end users.

GCP Network

According to some estimates out there publicly, Google's network carries as much as 40 percent of the world's Internet traffic every day. Google's network is the largest of its kind on earth and the company has invested billions of dollars over the years to build it. It's designed to give its users the highest possible throughput and the lowest possible latencies for their applications. The network interconnects at more than 90 Internet exchanges and more than 100 points of presence worldwide. When an Internet user sends traffic to a Google resource, Google responds to the user's request from an edge network location that will provide the lowest latency. Google's edge-caching network places content close to end users to minimize latency.

GCP Regions and Zones

Zone

A zone is a deployment area for Google Cloud Platform Resources. For example, when you launch a virtual machine in GCP using Compute Engine, which we'll discuss later, it runs in a zone you specify. Although people think of a zone as being like a GCP Data Center, that's not strictly accurate because a zone doesn't always correspond to a single physical building. You can still visualize the zone that way, though.

Region

Zones are grouped into regions, independent geographic areas, and you can choose what regions your GCP resources are in. All the zones within a region have fast network connectivity among them. Locations within regions usually have round trip network latencies of under five milliseconds. Think of a zone as a single failure domain within a region. As part of building a fault tolerant application, you can spread their resources across multiple zones in a region. That helps protect against unexpected failures. You can run resources in different regions too. Lots of GCP customers do that, both to bring their applications closer to users around the world, and also to protect against the loss of an entire region, say, due to a natural disaster.
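As a minimal sketch (the instance and zone names here are just placeholders, not from the course), you can list the available locations and spread Compute Engine instances across zones of the same region:

```bash
# List the regions and zones available to your project
gcloud compute regions list
gcloud compute zones list

# Launch two instances in different zones of the same region,
# so a single-zone failure doesn't take the application down
gcloud compute instances create app-server-1 --zone=europe-west1-b
gcloud compute instances create app-server-2 --zone=europe-west1-c
```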

Multi-Region

A few Google Cloud Platform Services support placing resources in what we call a Multi-Region. For example, Google Cloud Storage, which we'll discuss later, lets you place data within the Europe Multi-Region. That means, it's stored redundantly in at least two geographic locations, separated by at least 160 kilometers within Europe. As of the time of this video's production, GCP had 15 regions. Visit cloud.google.com to see what the total is up to today.
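For instance (a hedged sketch; the bucket name is hypothetical), creating a Cloud Storage bucket in the Europe multi-region with gsutil looks like this:

```bash
# Create a bucket whose location is the EU multi-region; its objects
# are stored redundantly in at least two geographically separated
# locations within Europe
gsutil mb -l EU gs://my-example-bucket-eu
```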

Compute Pricing

Google was the first major Cloud provider to deliver per second billing for its Infrastructure as a Service Compute offering, Google Compute Engine. Fine-grained billing is a big cost savings for workloads that are bursty, which is a lot of them. Many of the best-known GCP services bill by the second, including Compute Engine and Kubernetes Engine, and you'll learn about them and others in this course.

Compute Engine offers automatically applied sustained use discounts, which are automatic discounts that you get for running a virtual machine instance for a significant portion of the billing month. Specifically, when you run an instance for more than 25 percent of a month, Compute Engine automatically gives you a discount for every incremental minute you use it. Compute Engine's custom machine types let you fine-tune virtual machines for your applications, which in turn lets you tailor your pricing to your workloads. Try the online pricing calculator to help estimate your costs.
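As a rough sketch (the names, zone and sizes are placeholders, not from the course), a custom machine shape can be requested directly when creating an instance:

```bash
# Create a VM with a custom shape of 4 vCPUs and 8 GB of memory,
# instead of picking a predefined machine type
gcloud compute instances create custom-vm \
    --zone=us-central1-a \
    --custom-cpu=4 \
    --custom-memory=8GB
```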

Open APIs and open source mean customers can leave

Google gives customers the ability to run their applications elsewhere, if Google becomes no longer the best provider for their needs.

Here are some examples of how Google helps its customers avoid feeling locked in. GCP services are compatible with open source products.

For example, take Cloud Bigtable, a database we'll discuss later. Bigtable uses the interface of the open source database Apache HBase, which gives customers the benefit of code portability.

Another example: Cloud Dataproc offers the open source big data environment Hadoop as a managed service.

Google publishes key elements of technology using open source licenses to create ecosystems that provide customers with options other than Google. For example, TensorFlow, an open source software library for machine learning developed inside Google, is at the heart of a strong open source ecosystem.

Many GCP technologies provide interoperability. Kubernetes gives customers the ability to mix and match microservices running across different clouds, and Google Stackdriver lets customers monitor workloads across multiple cloud providers.

MultiLayered Security Approach

Because Google has seven services with more than a billion users, you can bet security is always on the minds of Google's employees.

Design for security is pervasive throughout the infrastructure that GCP and Google services run on. Let's talk about a few ways Google works to keep customers' data safe, starting at the bottom and working up.

Both the server boards and the networking equipment in Google data centers are custom designed by Google. Google also designs custom chips, including a hardware security chip called Titan that's currently being deployed on both servers and peripherals.

Google server machines use cryptographic signatures to make sure they are booting the correct software. Google designs and builds its own data centers which incorporate multiple layers of physical security protections. Access to these data centers is limited to only a very small fraction of Google employees, not including me.

Google's infrastructure provides cryptographic privacy and integrity for remote procedure call (RPC) data on the network, which is how Google services communicate with each other. The infrastructure automatically encrypts RPC traffic in transit between data centers.

Google's central identity service, which usually manifests to end users as the Google log-in page, goes beyond asking for a simple username and password.

It also intelligently challenges users for additional information based on risk factors such as whether they have logged in from the same device or a similar location in the past. Users can also use second factors when signing in, including devices based on the universal second factor U2F open standard. Here's mine.

Most applications at Google access physical storage indirectly via storage services and encryption is built into those services.

Google also enables hardware encryption support in hard drives and SSDs. That's how Google achieves encryption at rest of customer data.

Google services that want to make themselves available on the Internet register themselves with an infrastructure service called the Google Front End, which checks incoming network connections for correct certificates and best practices. The GFE also applies protections against denial of service attacks. The sheer scale of its infrastructure enables Google to simply absorb many denial of service attacks, even behind the GFEs. Google also has multi-tier, multi-layer denial of service protections that further reduce the risk of any denial of service impact.

Inside Google's infrastructure, machine intelligence and rules warn of possible incidents. Google conducts Red Team exercises, simulated attacks to improve the effectiveness of its responses. Google aggressively limits and actively monitors the activities of employees who have been granted administrative access to the infrastructure. To guard against phishing attacks against Google employees, employee accounts including mine require use of U2F compatible security keys. I don't forget my keys as much as I used to. To help ensure that code is as secure as possible Google stores its source code centrally and requires two-party review of new code. Google also gives its developers libraries that keep them from introducing certain classes of security bugs. Externally, Google also runs a vulnerability rewards program, where we pay anyone who is able to discover and inform us of bugs in our infrastructure or applications.

Budgets and Billing

How do you ensure you don't overrun your costs in GCP? GCP provides several ways to keep spending under control:

a. Budgets and Alerts - budgets can be configured at the billing-account level or per GCP project. A budget can be a fixed limit or tied to another metric, for example a percentage of the previous month's spend. To be notified when costs approach the limit, create an alert. For example, if a budget is 20K and an alert is set at 90%, we get an alert when spend reaches 18K.

NOTE: **Search for Budgets and Alerts in GCP and set it up.**
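A budget with a 90% alert threshold can also be created from the command line. This is a hedged sketch: the billing account ID is a placeholder, and depending on your SDK version the command may live under `gcloud beta billing` instead.

```bash
# Create a 20,000 USD budget on a billing account and
# alert when spend reaches 90% of it (i.e. 18,000)
gcloud billing budgets create \
    --billing-account=XXXXXX-XXXXXX-XXXXXX \
    --display-name="monthly-budget" \
    --budget-amount=20000USD \
    --threshold-rule=percent=0.9
```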

b. Billing Export - allows us to store detailed billing information in BigQuery for detailed analysis.

c. Reports - a visual tool in the GCP console, under Billing, for monitoring expenditure.

d. Quotas - to prevent over-consumption. Two types of quotas, applied at the GCP project level:

        Rate quota - e.g., the GKE API allows 1,000 requests per 100 seconds.
        Allocation quota - e.g., 5 networks per project.
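To see a project's current quotas and usage, one option, sketched here with a placeholder project ID, is:

```bash
# Show project-wide quotas (NETWORKS, CPUS, etc.) with current usage
gcloud compute project-info describe --project=my-project-id
```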

Interacting with GCP

Cloud Platform Console (web UI)

Cloud Shell and Cloud SDK (also available as a Docker image) - gcloud, gsutil (Cloud Storage), bq (BigQuery). A short illustration of these tools appears at the end of this section.

Cloud Console Mobile App - we can use GCP through this app too.

REST-based APIs - use the API Explorer to try them out. We can use 2 libraries to invoke services in GCP:

i) Cloud Client libraries

ii) Google API Client Library

If newer services or features are not yet available in the Cloud Client Libraries, we can use the Google API Client Library instead.
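As promised above, here is a short, hedged illustration of the Cloud SDK command-line tools (resource names are placeholders):

```bash
# gcloud: manage most GCP resources, e.g. list Compute Engine VMs
gcloud compute instances list

# gsutil: work with Cloud Storage buckets and objects
gsutil ls gs://my-example-bucket

# bq: run BigQuery commands, e.g. list datasets in the current project
bq ls
```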

Virtual Private Cloud (VPC) Network

VPCs have global scope. You can create subnets in any region worldwide, and within a subnet you can place resources in any zone of that region. You can build resilient applications by running different instances of an app in different zones.
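As a minimal sketch (network, subnet and IP range are placeholders), creating a custom-mode VPC with a subnet in a chosen region might look like:

```bash
# Create a custom-mode VPC (no automatically created subnets)
gcloud compute networks create my-vpc --subnet-mode=custom

# Add a subnet to that VPC in a specific region; resources placed
# in this subnet can live in any zone of the region
gcloud compute networks subnets create my-subnet \
    --network=my-vpc \
    --region=us-central1 \
    --range=10.0.0.0/24
```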

Compute Engine - you can attach GPUs for performance (e.g., machine learning) and persistent disks (Standard or SSD) as well as local SSDs (high performance); note that data on a local SSD is lost if the VM terminates, while persistent disks outlive the VM. You can boot an instance from a boot image and define a custom startup script to install software. You can take snapshots of a disk as backups, even while the VM is running, and use them to migrate data to a different region. A single instance can have up to 96 vCPUs and 624 GB of memory. You can scale up a Compute Engine instance, or autoscale a group of instances for a resilient application, and use a load balancer to distribute traffic across instances.
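For illustration (a hedged sketch; the machine type, zone and script contents are placeholders), a startup script can be supplied at creation time, and a running disk can be snapshotted:

```bash
# Create a VM that installs nginx via a startup script on first boot
gcloud compute instances create web-vm \
    --zone=us-central1-a \
    --machine-type=n1-standard-1 \
    --metadata=startup-script='#! /bin/bash
apt-get update && apt-get install -y nginx'

# Snapshot the VM's boot disk (works while it is running); the snapshot
# can serve as a backup or be used to recreate the disk elsewhere
gcloud compute disks snapshot web-vm \
    --zone=us-central1-a \
    --snapshot-names=web-vm-backup
```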

VPCs have a built-in routing table that you don't have to manage; GCP uses it to route traffic between instances within a subnet, across zones, and across subnets. GCP also provides a global distributed firewall that you don't manage as a separate appliance; you use firewall rules to control incoming and outgoing traffic. You can expand a subnet's IP range to get more addresses, and doing so does not affect already-created VMs.

You can also tie firewall rules to instance metadata tags. Say you have created 10 web servers and tagged them "web": you can write a firewall rule that allows traffic on port 80 for the "web" tag, and incoming traffic on port 80 is then allowed for all of those servers.
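A hedged sketch of that pattern (all names are placeholders):

```bash
# Tag an instance as "web" when you create it
gcloud compute instances create web-1 \
    --zone=us-central1-a \
    --network=my-vpc \
    --subnet=my-subnet \
    --tags=web

# Allow incoming HTTP (port 80) traffic from anywhere,
# but only to instances carrying the "web" tag
gcloud compute firewall-rules create allow-http \
    --network=my-vpc \
    --allow=tcp:80 \
    --source-ranges=0.0.0.0/0 \
    --target-tags=web
```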

How do you allow VPCs to talk to each other?

VPC Peering - to interconnect VPC networks across GCP projects (see the sketch below).

Shared VPC - share a VPC network from a host project with other projects, using IAM to control who can do what in the shared network.
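A rough sketch of setting up VPC Peering (project and network names are placeholders; the peer project must create the matching peering on its side as well):

```bash
# In project A: peer my-vpc with peer-vpc in project-b
gcloud compute networks peerings create a-to-b \
    --network=my-vpc \
    --peer-project=project-b \
    --peer-network=peer-vpc
```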

Load Balancers

Global HTTP(S) - layer 7 load balancing, based on load; can route different URLs to different backends.

Global SSL Proxy - layer 4 load balancing of non-HTTP SSL traffic, based on load; supported on specific port numbers.

Global TCP Proxy - layer 4 load balancing of non-SSL TCP traffic.

Regional - load balancing of any traffic (TCP, UDP), within a region.

Regional Internal - load balancing of traffic inside a VPC; used for the internal tiers of multi-tier applications.

Cloud DNS - managed, highly available DNS service running on Google's infrastructure (not to be confused with 8.8.8.8, Google's free public DNS resolver, which is a separate service).

Cloud CDN - uses Google's globally distributed edge caches to serve content close to users and reduce latency.

Interconnect Options

VPN - secure, multi-Gbps connection over VPN tunnels across the public Internet.

Direct Peering - a private connection between your network and Google at a Google point of presence, so your hybrid-cloud traffic doesn't traverse the public Internet.

Carrier Peering - connect to Google through a partner service provider's network.

Dedicated Interconnect - one or more private 10 Gbps transport circuits carrying your traffic directly to Google's network at Google points of presence.

Deployment Manager - lets you create resources declaratively. You write a template (.yaml or Python), hand the template to Deployment Manager to create the resources, and update the .yaml when you want to change them.
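For example, a minimal, hypothetical mydeploy.yml describing a single VM might look like the following sketch (the resource names, zone and image are placeholders; exact properties depend on the resource type):

```bash
# Write a minimal Deployment Manager template describing one VM
cat > mydeploy.yml <<'EOF'
resources:
- name: my-first-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/f1-micro
    disks:
    - type: PERSISTENT
      boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-11
    networkInterfaces:
    - network: global/networks/default
EOF
```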

gcloud deployment-manager deployments create my-first-depl --config mydeploy.yml (Create deployment)

You then modify the resources in mydeploy.yml and update the deployment like below:

gcloud deployment-manager deployments update my-first-depl --config mydeploy.yml (Update deployment)

Monitoring (Stackdriver) - use agents to collect more information from your VM instances, including metrics and logs from third-party applications.

Cloud Storage

Let's start with Google Cloud Storage. What's object storage? It's not the same as file storage, in which you manage your data as a hierarchy of folders. It's not the same as block storage, in which your operating system manages your data as chunks of disk. Instead, with object storage you say to the storage service, "keep this arbitrary bunch of bytes I give you," and the storage lets you address it with a unique key. That's it. Often these unique keys are in the form of URLs, which means object storage interacts nicely with Web technologies.

Cloud Storage works just like that, except better. It's a fully managed, scalable service. That means you don't need to provision capacity ahead of time. Just make objects and the service stores them with high durability and high availability. You can use Cloud Storage for lots of things: serving website content, storing data for archival and disaster recovery, or distributing large data objects to your end users via direct download.

Cloud Storage is not a file system, because each of your objects in Cloud Storage has a URL. Each feels like a file in a lot of ways, and it's okay to use the word "file" informally to describe your objects, but still, it's not a file system. You would not use Cloud Storage as the root file system of your Linux box. Instead, Cloud Storage consists of buckets that you create, configure and use to hold your storage objects. The storage objects are immutable, which means that you do not edit them in place; instead you create new versions.

Cloud Storage always encrypts your data on the server side before it is written to disk, and you don't pay extra for that. Also, by default, data in transit is encrypted using HTTPS. Speaking of transferring data, there are services you can use to get large amounts of data into Cloud Storage conveniently. We'll discuss them later in this module. Once your data is in Cloud Storage, you can move it onwards to other GCP storage services.

Your Cloud Storage objects are organized into buckets. When you create a bucket, you give it a globally unique name, you specify a geographic location where the bucket and its contents are stored, and you choose a default storage class. Pick a location that minimizes latency for your users. In other words, if most of your users are in Europe, you probably want to pick a European location.

There are several ways to control access to your objects and buckets. For most purposes, Cloud IAM is sufficient; roles are inherited from project to bucket to object. If you need finer control, you can create access control lists (ACLs). ACLs define who has access to your buckets and objects, as well as what level of access they have. Each ACL consists of two pieces of information: a scope, which defines who can perform the specified actions (for example, a specific user or group of users), and a permission, which defines what actions can be performed (for example, read or write).

Remember that Cloud Storage objects are immutable. You can turn on object versioning on your buckets if you want. If you do, Cloud Storage keeps a history of modifications, that is, of overwrites and deletes, of all the objects in the bucket. You can list the archived versions of an object, restore an object to an older state, or permanently delete a version as needed. If you don't turn on object versioning, new always overwrites old.

What if versioning sounds good to you, but you're worried about junk accumulating? Cloud Storage also offers lifecycle management policies. For example, you could tell Cloud Storage to delete objects older than 365 days, or to delete objects created before January 1, 2013, or to keep only the three most recent versions of each object in a bucket that has versioning enabled.
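As a hedged sketch (the bucket name is a placeholder), a delete-after-365-days lifecycle rule can be applied with gsutil:

```bash
# Define a lifecycle policy that deletes objects older than 365 days
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365}
    }
  ]
}
EOF

# Apply the policy to a bucket
gsutil lifecycle set lifecycle.json gs://my-example-bucket
```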

Storage classes: Multi-Regional, Regional (high performance), Nearline (infrequent access / backup), Coldline (archival).

Comparing Storage

Now that we've covered GCP's core storage options, let's compare them to help you choose the right service for your application or workflow. This comparison focuses on the technical differentiators of the storage services.

Consider using Cloud Datastore if you need to store semi-structured objects, or if you require support for transactions and SQL-like queries. This storage service provides terabytes of capacity with a maximum unit size of one megabyte per entity.

Consider using Cloud Bigtable if you need to store a large amount of structured objects. Cloud Bigtable does not support SQL queries, nor does it support multi-row transactions. This storage service provides petabytes of capacity with a maximum unit size of 10 megabytes per cell and 100 megabytes per row.

Consider using Cloud Storage if you need to store immutable blobs larger than 10 megabytes, such as large images or movies. This storage service provides petabytes of capacity with a maximum unit size of five terabytes per object.

Consider using Cloud SQL or Cloud Spanner if you need full SQL support for an online transaction processing system. Cloud SQL provides terabytes of capacity, while Cloud Spanner provides petabytes. If Cloud SQL does not fit your requirements because you need horizontal scalability, not just through read replicas, consider using Cloud Spanner.

We didn't cover BigQuery in this module, as it sits on the edge between data storage and data processing, but you will learn more about it in the "Big Data and Machine Learning in the Cloud" module. The usual reason to store data in BigQuery is to use its big data analysis and interactive query capabilities. You would not want to use BigQuery, for example, as the backing store for an online application.

Considering the technical differentiators of the different storage services helps some people decide which storage service to choose. Others like to consider use cases. Let me go through each service one more time.

Cloud Datastore is best for semi-structured application data that is used in App Engine applications. Bigtable is best for analytical data with heavy read/write events, like AdTech, financial or IoT data. Cloud Storage is best for structured and unstructured, binary or object data like images, large media files and backups. Cloud SQL is best for web frameworks and existing applications, like storing user credentials and customer orders. Cloud Spanner is best for large-scale database applications that are larger than two terabytes, for example for financial trading and e-commerce use cases. As I mentioned at the beginning of the module, depending on your application, you might use one or several of these services to get the job done.