GCP

Zones, Regions, Dual-Regions, and Multi-Regions

Subnets are regional resources

– Because subnets are regional objects, the region you select for a resource determines the subnets it can use.

–  multi-regions and dual-regions are geo-redundant

  • Regions are independent geographic areas that consist of zones
  • A dual-region is a specific pair of regions

-Cloud KMS resources can be created in dual-regional locations
-Objects stored in a multi-region or dual-region are geo-redundant

– Data that is geo-redundant is stored redundantly in at least two separate geographic places separated by at least 100 miles
-Geo-redundancy occurs asynchronously

Currently, nam4 and eur4 are the only Dual-Regions available.
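As a sketch, a dual-region bucket can be created by passing one of these locations to gsutil (the bucket name below is hypothetical; requires an authenticated gsutil):

```shell
# Create a geo-redundant bucket in the nam4 dual-region
# (bucket name is a hypothetical example)
gsutil mb -l nam4 gs://my-dual-region-bucket
```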


A GCP organization’s combined IAM policy at any level of the Cloud Resource Hierarchy is a combination of the policies at that level, plus any policies inherited from higher levels.

 

Cloud Spanner – Global replication of relational data

For a BigQuery dataset, the available Location Types are: Regional and Multi-Regional

 

Billing

Billing accounts can contain billing subaccounts

Billing Accounts are connected to a Payments Profile

Billing Account user – Link projects to billing accounts

Export billing options

Export Cloud Billing to :

  • BigQuery
  • Cloud Storage

Billing for resources that participate in a Shared VPC network is attributed to the service project where the resource is located

 

Cloud IAM 

An Organization contains one or more Folders. A Folder contains one or more Projects. A Project contains one or more Resources.

-A Role is a collection of permissions

-An IAM Policy object consists of a list of bindings

  • Projects can contain resources in different regions
  • Projects are configured with a default region and zone
  • You don’t assign permissions to users directly. Instead, you assign them a Role which contains one or more permissions
  • Members can be of the following types: Google account, Service account, Google group, G Suite domain, Cloud Identity domain
  • A Binding binds a list of members to a role.
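As a rough illustration of the bullets above, a binding ties a list of members to one role; all identities and the role in this sketch are hypothetical, and in practice a binding is usually added with `gcloud projects add-iam-policy-binding`:

```shell
# A policy object is a list of bindings; each binding ties members to one role.
# All identities and the role below are hypothetical examples.
cat > policy-snippet.json <<'EOF'
{
  "bindings": [
    {
      "role": "roles/storage.objectViewer",
      "members": [
        "user:jane@example.com",
        "group:devs@example.com"
      ]
    }
  ]
}
EOF
# Sanity-check that the snippet is well-formed JSON
python3 -m json.tool policy-snippet.json > /dev/null && echo "valid JSON"
```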

Each GCP project can contain only a single App Engine application, and once created you cannot change the location of your App Engine application

MFA stands for Multi-Factor Authentication, and it is a best practice to use this to secure accounts.

IAM roles can be assigned per bucket.

Predefined roles are granular and are granted at the service level for much more fine-grained access.

Primitive roles are broad, project-wide roles granted at the project level.

 

Cloud Identity and G Suite are the two ways to centrally manage Google accounts.

KMS stands for Key Management Service (as in Cloud KMS).

 

Service Accounts

Resources not hosted on GCP should use a service account key for authentication.

 

Cloud SDK 

The gcloud alpha and gcloud beta command groups are additional Cloud SDK release tracks that you can install as gcloud components.

 

 

-Maximum Size of a Cloud Storage Bucket – unlimited

-Cloud Storage offers unlimited object storage, and individual objects can be as large as 5 TB

-Versioning can be enabled on a Cloud Storage Bucket.

 

gsutil – This is a Cloud SDK component used to interact with Cloud Storage.

To view which project is the default, run the gcloud config list command. This lists the properties of the active configuration, including the default project.

 

$ gcloud compute instances list

$ gcloud compute ssh

$ gcloud compute ssh ovi@server --dry-run

 

*** Snapshots ***

$ gcloud compute snapshots list

$ gcloud compute disks list

$ gcloud compute disks snapshot development-server

$ gcloud compute images list

$ gcloud container clusters list

$ gcloud config list

$ gcloud app versions list

 

gcloud config configurations create

gcloud config configurations activate

gcloud config set project [PROJECT_ID]

gcloud logging read "login_name"

gcloud logging read "login_name" --limit 15

 

DISK

Create disk:

gcloud compute disks create (DISK_NAME) --type=(DISK_TYPE) --size=(SIZE) --zone=(ZONE)

gcloud compute disks create disk-1 --size=50GB --zone=us-east1-b

Resize disk:

gcloud compute disks resize (DISK_NAME) --size=(SIZE) --zone=(ZONE)

gcloud compute disks resize disk-1 --size=150 --zone=us-east1-b

Attach disk:

gcloud compute instances attach-disk (INSTANCE_NAME) --disk=(DISK_NAME) --zone=(ZONE)

 

snapshot

gcloud compute disks snapshot web1 --snapshot-names web1-backup-v1 --zone us-central1-a
gcloud compute snapshots list
gcloud compute snapshots describe web1-backup-v1

 

-Persistent disks are not deleted when an instance is stopped.

– Persistent disk performance is based on the total persistent disk capacity attached to an instance and the number of vCPUs that the instance has. Increasing the persistent disk capacity increases its throughput and IOPS

 

Video for reference: Installing the Cloud SDK

View default cloud configuration

gcloud config list

gcloud container clusters get-credentials —> to authenticate and configure kubectl  

 

Preemptible Virtual Machines

Affordable, short-lived compute instances suitable for batch jobs and fault-tolerant workloads.
Go to console

Preemptible VMs are highly affordable, short-lived compute instances suitable for batch jobs and fault-tolerant workloads. Preemptible VMs offer the same machine types and options as regular compute instances and last for up to 24 hours

# enable the preemptible option at instance creation
gcloud compute instances create my-vm --zone us-central1-b --preemptible

 

App Engine 

  • web based workloads, high availability, no ops

Flexible environments are able to use a Dockerfile to create custom runtimes

-App Engine is regional

-App Engine traffic can be split by cookie, by IP address, and at random. We cannot split traffic by zone.

App Engine Standard Environment.

  • The default timeout setting for a Service Instance deployed to the App Engine Standard Environment is 60 s
  • The App Engine Standard environment does not allow Instance Runtimes to be modified
  • The App Engine Standard Environment does scale down to zero when not in use

 

App Engine Flexible Environment

  • Runtime modifications are allowed for instances running in the App Engine Flexible environment.

In App Engine Flex the connection to Stackdriver (i.e. agent installation and configuration) is handled automatically for you

App Engine Flexible Environment does not scale down to zero

 

Deploying and Manipulating Multiple App Engine Versions

gcloud app deploy --version 1

canary test 

gcloud app deploy --no-promote --version 2
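After a --no-promote deploy, traffic can be shifted gradually between versions; a sketch, where the service name "default", the version IDs, and the percentages are all illustrative:

```shell
# Send 10% of traffic to version 2, splitting by cookie
# (service name and version IDs are hypothetical examples)
gcloud app services set-traffic default --splits v1=0.9,v2=0.1 --split-by cookie
```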

 

 

Compute Engine 

Managed Instance Group

Unmanaged Instance Group

Unmanaged instance groups do not offer multi-zonal support.

Maximum total size of Compute Engine local SSD storage per instance – 3 TB

 

Cloud Functions

-Billing interval is 100 ms

-Horizontal Scaling

– Microservices Architecture

-Cloud Functions does scale down to zero when not in use

Cloud Run

  • Uses Stateless HTTP containers
  • Scalability
  • Built on Knative

 

Cloud Storage 

-Cloud Storage allows Organizations to use CSEKs (Customer Supplied Encryption Keys).

-Data in a regional location operates in a multi-zone replicated configuration

 

*** create a bucket

$ gsutil mb -c regional -l us-east1 gs://ovi

$ gsutil versioning get gs://ovi

$ gsutil versioning set on gs://ovi

$ gsutil ls -a gs://ovi

$ gsutil cp <file> gs://ovi

 

ovi_p_eb632cd8@cloudshell:~ (ovi-24-565a3874)$ gsutil ls gs://ovi11
gs://ovi11/IMG_2759.jpg
gs://ovi11/IMG_2770.jpg

 

ovi@cloudshell:~ (ovi-24-565a3874)$ touch ovi_file
ovi@cloudshell:~ (ovi-24-565a3874)$ gsutil cp ovi_file gs://ovi11
Copying file://ovi_file [Content-Type=application/octet-stream]...
/ [1 files][    0.0 B/    0.0 B]
Operation completed over 1 objects.

ovi@cloudshell:~ (ovi-24-565a3874)$ gsutil ls gs://ovi11
gs://ovi11/IMG_2759.jpg
gs://ovi11/IMG_2770.jpg
gs://ovi11/ovi_file

 

Pub/Sub is a messaging service for exchanging event data among applications and services. A producer of data publishes messages to a Pub/Sub topic. A consumer creates a subscription to that topic. Subscribers either pull messages from a subscription or are configured as webhooks for push subscriptions. Every subscriber must acknowledge each message within a configurable window of time.

Cloud Pub/Sub as the messaging service to capture real time data ( ex:  IoT )
– is designed to provide reliable, many-to-many, asynchronous messaging between applications (real time IoT data capture)

-Cloud Pub/Sub is designed to handle infinitely-scalable streaming data ingest

Pub/Sub

1. Create a topic.

2. Subscribe to the topic.

3. Publish a message to the topic.

4. Receive the message.

gcloud init
gcloud pubsub topics create ovi-topic
gcloud pubsub subscriptions create ovi-sub --topic ovi-topic
gcloud pubsub topics publish ovi-topic --message "hello"
gcloud pubsub subscriptions pull --auto-ack ovi-sub

 

 

gcloud config configurations activate — activate an existing configuration

gcloud config list — list the settings for the active configuration

App Engine is a Platform as a Service – It is a fully managed solution.

gcloud container clusters resize — this command is used to resize a Kubernetes cluster

ex:

gcloud container clusters resize oviproject --node-pool 'primary-node-pool' --num-nodes 25

 

gcloud config configurations create — create and activate a new configuration

 

 

Log sinks can be exported to Cloud Pub/Sub.

 

 

Storage Option

  • Multi-Regional – Data accessed frequently with highest availability / Geo-redundant
  • Regional – Data accessed frequently within a region / Redundant across availability zones
  • Nearline – Data accessed less than once per month / Regional / Store infrequently accessed content
  • Coldline – Data accessed less than once per year / Regional / Archive storage, backup, disaster recovery

 

Coldline Storage is the best choice for data that you plan to access at most once a year, due to its slightly lower availability, 90-day minimum storage duration, costs for data access, and higher per-operation costs

-Lifecycle management policies can be submitted via JSON format.
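A minimal sketch of such a JSON policy, assuming a single rule that deletes objects once they are older than 365 days (the bucket name is hypothetical):

```shell
# Lifecycle rule: delete objects older than 365 days
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {
      "action": {"type": "Delete"},
      "condition": {"age": 365}
    }
  ]
}
EOF
# Sanity-check the JSON, then apply it to a (hypothetical) bucket:
python3 -m json.tool lifecycle.json > /dev/null && echo "valid JSON"
# gsutil lifecycle set lifecycle.json gs://my-bucket
```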

 

Cloud SQL

– Read replicas and failover replicas are charged at the same rate as stand-alone instances

-Cloud SQL for PostgreSQL does not yet support replication from an external master or external replicas for Cloud SQL instances (“This functionality is not yet supported for PostgreSQL instances”)

-GCP Cloud SQL provides the following backup types: automated backups, on-demand backups

-Cloud SQL read replicas and failover replicas must be in the same region as the primary. The failover replica must be in a different zone in that region.
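Creating a read replica can be sketched with one gcloud command; the instance names below are hypothetical, and the replica inherits the primary's region:

```shell
# Create a read replica of an existing primary instance
# (instance names are hypothetical examples)
gcloud sql instances create replica-1 --master-instance-name=primary-1
```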

-Cloud SQL is a relational database and not the best fit for time-series log data formats

 

Cloud Spanner

Cloud Spanner scales horizontally and serves data with low latency while maintaining transactional consistency

After you create an instance, you cannot change the configuration of that instance later

Cloud Spanner Instance Configuration can be set to which of the following Location: regional, multi-regional

Cloud Spanner is a SQL/relational database.

Cloud Spanner is a SQL database that is horizontally scalable for cross-region support and can host large datasets.

 

BigQuery – Calculating cost 

UI: query validator

CLI: --dry_run

REST: dryRun Property

-BigQuery is the only one of these Google products that supports an SQL interface

-BigQuery is billed based on the amount of data read. The --dry_run flag is used to determine how many bytes would be read.
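A dry run from the CLI can be sketched like this; the project, dataset, and table names are hypothetical:

```shell
# Report how many bytes the query would read, without running it
# (project, dataset, and table names are hypothetical examples)
bq query --use_legacy_sql=false --dry_run \
  'SELECT name FROM `my-project.my_dataset.my_table` LIMIT 10'
```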

-Analytics data warehouse

-Use a BigQuery with table partitioning

-BigQuery is the best choice for data warehousing

-BigQuery does not offer low latency and millisecond response time

-The BigQuery instance Labels and Display Name can be modified without any downtime

BigQuery is a serverless warehouse for analytics and supports the volume and analytics requirement

– To move large datasets directly to BigQuery, consider the BigQuery Data Transfer Service, which automates data movement from SaaS applications to BigQuery on a scheduled, managed basis

 

Cloud Bigtable

A petabyte-scale, fully managed NoSQL database service for large analytical and operational workloads.

  • Bigtable is priced by provisioned node
  • Bigtable does not autoscale
  • Bigtable does not store data in GCS
  • Bigtable is not made for storing large objects

Use Cloud Bigtable as the storage engine for large-scale, low-latency applications as well as throughput-intensive data processing and analytics.

Apache HBase is the open-source version of Bigtable

-Each cluster is located in a single zone

-Maximum number of Clusters for a Cloud Bigtable Instance is – 4

After creating a Cloud Bigtable instance, any of the following settings can be updated without any downtime:

– The application profiles for the instance, which contain replication settings
– Upgrade a development instance to a production instance

– The number of nodes in each cluster
– The number of clusters in the instance
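Scaling the node count, for example, can be sketched with a single gcloud command; the instance and cluster names are hypothetical:

```shell
# Scale a cluster to 5 nodes with no downtime
# (instance and cluster names are hypothetical examples)
gcloud bigtable clusters update my-cluster --instance=my-instance --num-nodes=5
```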

Cloud BigTable

– Service is ideal for Time-Series data

– ideal for applications requiring very high read/write throughput and can store Petabytes of unstructured data

– can be deployed zonal

-Bigtable is not a relational database.

-Cloud Bigtable provides the ability to isolate workloads by allowing applications to connect to specific Clusters

-Cloud Bigtable is optimized for time-series data. It is cost-efficient, highly available, and low-latency

 

Cloud Datastore

-Datastore can be queried, it’s fully managed, and is a great option for catalog based applications. Datastore also supports a basic query/filter syntax.

-Datastore is a managed NoSQL database well suited to mobile applications

– Cloud Datastore queries can deliver their results at either of two consistency levels:

-Strongly consistent queries guarantee the freshest results, but may take longer to complete.
-Eventually consistent queries generally run faster, but may occasionally return stale results.

-You can store your Datastore mode data in either a multi-region location or a regional location

 

Cloud Firestore is the next generation of Cloud Datastore

Firestore
Easily develop rich applications using a fully managed, scalable, and serverless document database

Cloud Dataflow – service for processing large volumes of data

  • Cloud Dataflow provides you with a place to run Apache Beam based jobs on GCP
  • Cloud Dataflow provides for both streaming and batch pipelines
  • use cases

( Serverless ETL, processing data from IoT Devices, processing Data from POS systems)

– a fully managed ETL/ELT service for transforming, transporting, and enriching data

– Dataflow is built on top of Apache Beam and is ideal for new, cloud-native batch and streaming data processing

 

Cloud Dataproc – to handle existing Hadoop/Spark jobs (use it to replace existing Hadoop infrastructure)

Dataproc should be used if the processing has any dependencies to tools in the Hadoop ecosystem.

Cloud Dataproc can leverage preemptible Compute Engine VMs

Dataproc is a fast, easy-to-use, fully managed cloud service for running Apache Spark and Apache Hadoop clusters in a simpler, more cost-efficient way.

Dataproc is for managed Hadoop/Spark workflows

Cloud Dataproc has built-in integration with other Google Cloud Platform services, such as BigQuery, Cloud Storage, Cloud Bigtable, Stackdriver Logging, and Stackdriver Monitoring, so you have more than just a Spark or Hadoop cluster, you have a complete data platform

Cloud Dataproc and Cloud Dataflow can both be used for data processing, and there’s overlap in their batch and streaming capabilities
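Creating a Dataproc cluster with preemptible capacity can be sketched as follows; the cluster name, region, and worker counts are illustrative, and on older SDK versions the flag was --num-preemptible-workers:

```shell
# A cluster with 2 regular workers plus 2 preemptible (secondary) workers
# (cluster name, region, and sizes are illustrative)
gcloud dataproc clusters create my-cluster \
  --region=us-central1 \
  --num-workers=2 \
  --num-secondary-workers=2
```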

 


 

Cloud Composer

Cloud Composer is a fully managed workflow orchestration service that empowers you to author, schedule, and monitor pipelines that span across clouds and on-premises data centers

A fully managed workflow orchestration service built on Apache Airflow.

 

Preemptible instances are short-lived instances ( 24 hours maximum )

-A static website can be hosted with cloud storage for very little money.

 

 

Cloud Functions 

billing interval for Cloud Functions is 100 ms

Apigee – Design, Secure, Publish, Analyze, Monitor, and Monetize APIs

Cloud Functions supports: Go, Node.js, Python

 

Data Studio ( similar to Tableau, Power BI )

Data Studio is able to easily create useful charts from live BigQuery data to get insight.

Security 

Cloud Audit Log

GCP Service that maintains audit logs for each GCP Project, Folder, and Organization

 

Cloud Security Scanner

 

Cloud Armor

works with Global HTTP(S) Load Balancers to deliver defense against DDoS (Distributed Denial of Service) attacks

Data Loss Prevention API

-Use the Data Loss Prevention API to automatically detect and redact sensitive data

-Fully managed service designed to help you discover, classify, and protect your most sensitive data

Trusted Platform Module (TPM)

 

Cloud Code 

– provides everything you need to write, debug, and deploy Kubernetes applications

Cloud Source Repositories

– is the GCP Service used for Code Version Control

Cloud TPU 

GCP Service that provides a custom-designed family of ASIC (Application-Specific Integrated Circuit) hardware accelerators, built specifically for machine learning

 

Cloud Data Fusion ( similar to Cloud Dataflow )

 

Cloud Data Catalog 

provides Organizations with a central location to discover, manage, and understand all their data in the Google Cloud

Cloud Memorystore

 

Cloud IoT Core 

provides the ability to securely connect, manage, and ingest data from globally dispersed devices

MQTT stands for MQ Telemetry Transport. It is a publish/subscribe, extremely simple and lightweight messaging protocol, designed for constrained devices and low-bandwidth, high-latency or unreliable networks.


Cloud Build 

Cloud Source Repositories

Cloud Dataprep 

  • provides features to visually explore, scrub, clean, and prepare structured and unstructured data
  • Dataprep cleans data in a web interface using data from Cloud Storage or BigQuery.
  • Dataprep is a UI-driven data preparation service that runs on top of Cloud Dataflow

 

Cloud Datalab

-is a data exploration tool which provides an intuitive notebook format to combine code, results, and visualizations

– is most useful for Data Scientists

 

StackDriver 

-Once logs are past their retention period and are deleted, they are permanently gone. Export logs to Cloud Storage or BigQuery for long-term retention

-Performance statistics would be best served viewing in Stackdriver Monitoring using custom metrics.

Stackdriver has an integrated service to export logs for Analysis to: BigQuery, Pub/Sub, Storage
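Creating such an export sink can be sketched with gcloud; the sink, project, and dataset names below are hypothetical:

```shell
# Route Compute Engine instance logs to a BigQuery dataset via a sink
# (sink, project, and dataset names are hypothetical examples)
gcloud logging sinks create my-sink \
  bigquery.googleapis.com/projects/my-project/datasets/my_dataset \
  --log-filter='resource.type="gce_instance"'
```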

 

Cloud Endpoints

-GCP Service provides API Management by using either Frameworks for App Engine, OAS (OpenAPI Specification), or gRPC

-Develop, deploy, protect, and monitor your APIs with Cloud Endpoints

Apigee

provides the ability to Design, Secure, Publish, Analyze, Monitor, and Monetize APIs?

 

Deployment manager 

gsutil -m cp -r gs://ovi/deployment-manager/* .

gcloud deployment-manager deployments create my-vm --config vm-web.yaml

gcloud deployment-manager deployments create vpcs --config vpc-dependencies.yaml

gcloud deployment-manager deployments describe vpcs

gcloud deployment-manager deployments delete vpcs
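The config files passed via --config are YAML resource descriptions; a minimal single-VM sketch of the kind a vm-web.yaml might contain (all names, the zone, and the image are illustrative assumptions):

```shell
# Write a minimal Deployment Manager config for one VM
# (resource name, zone, machine type, and image are hypothetical examples)
cat > vm-web.yaml <<'EOF'
resources:
- name: web-vm
  type: compute.v1.instance
  properties:
    zone: us-central1-a
    machineType: zones/us-central1-a/machineTypes/f1-micro
    disks:
    - boot: true
      autoDelete: true
      initializeParams:
        sourceImage: projects/debian-cloud/global/images/family/debian-11
    networkInterfaces:
    - network: global/networks/default
EOF
```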

 

Machine Types:
General-purpose: n1
n1-standard
n1-highcpu
n1-highmem

Compute-optimized: c2
c2-standard

Memory-optimized: n1, m2
n1-ultramem
n1-megamem
m2-ultramem

Shared-core:
f1-micro
g1-small

To initialize gcloud, simply run `gcloud init` and follow the prompts. This also configures `gsutil` and `bq`.

 

Networking 

VPC

GCP VPCs are global

-GCP Resources within a single VPC Subnet must be within same region (Subnets are regional resources)

-VPC network peering provides cross-project VPC communication within the same or different organizations

-VPC Network Peering and Shared VPC are methods for connecting two GCP VPC, not for connecting an On-Prem network to GCP Cloud Services

Shared VPC ( two main components )

  • Host Project
  • Service Project

Billing for resources that participate in a Shared VPC network is attributed to the service project where the resource is located

– VPC Network Peering is only between two Google Cloud

  • Each Cloud VPN tunnel can support up to 3 Gbps. Actual bandwidth depends on several factors

 

Direct Peering exists outside of Google Cloud Platform

-Direct Peering can be used with GCP, but GCP does not require it.

-Direct Peering can be used for G Suite Platform, existing outside of GCP

-You can’t use Google Cloud VPN in combination with Dedicated Interconnect, but you can use your own VPN solution.

-You can’t use Google Cloud VPN in combination with Partner Interconnect, but you can use your own VPN solution.

 

Dedicated Interconnect

  • Find a colocation facility
  • Connect your on-premises network to the colocation facility
  • Order a LOA-CFA ( Letter of Authorization and Connecting Facility Assignment )

 

Partner Interconnect 

 

Cloud VPN 

 

 

Cloud load balancer  

Global HTTP(S) – Cloud Load Balancer offers cookie-based Session Affinity

Global HTTP(S) can be configured for use as a CDN (Content Delivery Network)

Global SSL Proxy – type of Cloud Load Balancer intended for global SSL-encrypted traffic that is not HTTP(S)

Global TCP Proxy – type of Cloud Load Balancer intended for global traffic that is not HTTP(S) and not SSL-encrypted

Global HTTP(S) – type of Cloud Load Balancer intended to provide global URL routing

The HTTP(S) load balancer in GCP handles WebSocket traffic natively. Backends that use WebSocket to communicate with clients can use the HTTP(S) load balancer as a front end for scale and availability.

Network load balancers only distribute traffic within a single region. For global load balancing to multiple regions, use an HTTP(S) load balancer or a TCP/SSL Proxy load balancer.

– Network Load Balancers are not proxies
– Responses go directly to clients – direct server return
– Source IP address not modified – the LB preserves the source IP addresses of packets

-Network tags allow more granular access based on individually tagged instances.

LOGS/Monitoring

GCP projects store logs in:

-Default bucket
-Required bucket

The default retention period for logs stored in the default bucket is 30 days.

Storage Transfer Service

Transfer Appliance

Transfer Appliance is a high-capacity storage device that enables you to transfer and securely ship your data to a Google upload facility, where the data is uploaded to Google Cloud Storage. For Transfer Appliance capacities and requirements, see Specifications.

Other data transfer options

Case Study

TerramEarth

  • Cloud IoT Core
  • Cloud Dataflow
  • BigQuery
  • Cloud ML Engine
  • Cloud Datalab
  • Datastudio

 

signed URL 

  • Allows timed access with a URL link.
  • Allows someone to access an object without requiring them to have a GCP account.
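Generating a signed URL can be sketched with gsutil; the key file, bucket, and object names below are hypothetical, and the key must belong to a service account with access to the object:

```shell
# URL valid for 10 minutes, signed with a service-account key file
# (key file, bucket, and object names are hypothetical examples)
gsutil signurl -d 10m sa-key.json gs://my-bucket/report.pdf
```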

Security

-Forseti security

https://forsetisecurity.org/

 

 
