GCP Lesson 49 of 98

Google Kubernetes Engine, In Depth: Autopilot vs Standard, Node Pools, Networking & Security

Google Kubernetes Engine (GKE) is Google Cloud’s managed Kubernetes — the same Kubernetes you would run yourself, but with Google operating the control plane, patching it, scaling it, and (in Autopilot) running the nodes too. Kubernetes itself came out of Google’s internal Borg system, and GKE is the most opinionated, most automated managed Kubernetes of the big three: it ships secure defaults, two genuinely different operating modes, an eBPF dataplane, and keyless identity for pods, all wired together out of the box.

This lesson is the exhaustive version. By the end you will know every meaningful knob you set when you create a cluster and a node pool, the load-bearing concepts an interviewer probes (Autopilot vs Standard, zonal vs regional, VPC-native, Workload Identity), what you can and cannot change after the fact, and the cost mechanics that quietly drive the bill. It is beginner-accessible — every term is defined — but complete enough to operate GKE in production and to answer the hard questions on the Associate Cloud Engineer and Professional Cloud Architect exams.

Learning objectives

Prerequisites & where this fits

You should be comfortable with the GCP resource hierarchy and IAM, with a VPC and subnets (this lesson leans on alias IP ranges and Cloud NAT), and with the absolute basics of Kubernetes — what a Pod, Deployment and Service are, and that kubectl talks to an API server. If those are fuzzy, skim the VPC and IAM fundamentals first. In the Zero-to-Hero programme this is the Containers lesson of the Intermediate tier: it follows Cloud Load Balancing (GKE leans on the same load-balancer building blocks) and precedes BigQuery. The two advanced follow-ons — Autopilot production hardening and Workload Identity in depth — assume you have the fundamentals on this page.

Core concepts

A cluster is a control plane plus a set of worker nodes. The control plane runs the Kubernetes API server, scheduler, controller-manager and etcd; in GKE it is fully managed by Google: you never SSH into it, and Google patches and scales it. You talk to it with kubectl, which hits the cluster’s API endpoint.

A node is a Compute Engine VM that runs your pods via the kubelet and a container runtime (containerd). In Standard mode you group nodes into node pools — sets of identical nodes (same machine type, image, disk, settings) that scale and upgrade together. In Autopilot mode there are no visible node pools: Google provisions, sizes, patches and scales nodes for you, and you pay for the resources your pods request, not for nodes.

VPC-native is the modern networking model: pods get real IP addresses from a secondary (alias) IP range on the subnet, so pod traffic is first-class in your VPC and routable to on-prem and peered networks without extra routes. Workload Identity Federation for GKE lets a Kubernetes ServiceAccount act as (federate to) a Google service account, so pods authenticate to Google APIs using short-lived tokens and no downloaded keys. Release channels (Rapid, Regular, Stable, plus Extended) control how quickly your cluster auto-upgrades through Kubernetes versions. Keep these terms in mind — every section below is one of these boxes or the wiring between two of them.

Autopilot vs Standard: the whole decision

GKE has two modes of operation, chosen at cluster creation and (with rare exceptions) fixed for the cluster’s life — you cannot flip a Standard cluster to Autopilot in place; you create a new cluster and migrate workloads.

| Dimension | Autopilot | Standard |
|---|---|---|
| Who runs nodes | Google provisions, sizes, scales, repairs | You size node pools and scale them (autoscaler optional) |
| Billing unit | Pod resource requests (vCPU/mem/ephemeral) + cluster fee | Node VM-hours (provisioned) + cluster fee |
| Bin-packing | Google packs and scales for you | Your responsibility |
| Node access | No SSH; privileged/host-namespace pods blocked | SSH, privileged pods, host DaemonSets allowed |
| Security defaults | Shielded nodes, Workload Identity, hardened OS (enforced) | Opt-in (most are recommended defaults but changeable) |
| GPUs / TPUs | Supported via requests (managed) | Full control incl. custom drivers |
| Windows nodes | Not supported | Supported |
| DaemonSets | Restricted (no host access) | Full |
| SLA | Pod-level + control-plane SLA | Control-plane (+ node) SLA |
| Best for | Teams wanting platform discipline without node ops | Teams needing host access, special hardware, extreme packing |

The mental shift: on Standard you optimise node utilisation; on Autopilot you optimise request accuracy, because over-requesting is the cost. A useful middle ground exists on Standard — compute classes and node auto-provisioning bring much of Autopilot’s automation while keeping node access — but the billing model and the node-access contract are the things that actually differ. Default to Autopilot unless you have a concrete reason (host-level DaemonSets, custom GPU drivers, Windows, or bin-packing you must control) to take node ownership.

The control plane: zonal vs regional

When you create a cluster you choose its location type, which determines where the control plane (and, for Standard, the default nodes) live. This is one of the highest-leverage availability decisions, and it is immutable — you cannot convert zonal to regional after creation.

| Location type | Control plane | Default node placement | Control-plane HA | Use when |
|---|---|---|---|---|
| Zonal (single-zone) | One replica in one zone | Nodes in that one zone | None: control plane down during that zone’s outage/upgrade | Dev/test, cost-sensitive, non-critical |
| Zonal (multi-zonal) | One replica in one zone | Nodes spread across several zones | None (plane still single-zone) | You want node spread but accept a single-zone plane |
| Regional | 3 replicas across 3 zones in the region | Nodes replicated across zones (×3 by default on Standard) | Yes: survives a zone failure, zero-downtime control-plane upgrades | Production |

Two consequences people miss. First, a regional control plane gives you a highly available API endpoint and no control-plane downtime during upgrades — worth it for anything production. Second, on Standard regional clusters, a node pool’s node count is per-zone: ask for 1 node in a 3-zone region and you get 3 nodes. Autopilot clusters are regional by design (the node concept is hidden, and Google spreads pods across zones). The cluster management fee is the same flat hourly fee for zonal and regional (and for Autopilot and Standard) — what differs is the node/pod compute you pay underneath.
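The per-zone multiplication is easy to overlook when estimating node count and cost, so here is a rough sketch of the arithmetic. The function name total_nodes is made up for illustration, and it assumes the node pool spans every zone you count.

```shell
# Hypothetical helper: total nodes created in a regional Standard cluster.
# Usage: total_nodes <num-nodes-per-zone> <zone-count>
# The body runs in a subshell ( ... ) so its variables do not leak.
total_nodes() (
  per_zone=$1
  zones=$2
  echo $(( per_zone * zones ))
)

total_nodes 1 3   # --num-nodes=1 across 3 zones
total_nodes 2 3   # --num-nodes=2 across 3 zones
```

The first call prints 3 and the second prints 6, which is the number of VMs you would actually be billed for on Standard.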

Creating a cluster: every setting that matters

Whether you use the Console wizard or gcloud, these are the fields you set. The Console groups them under Cluster basics, Fleet, Networking, Security, Metadata and Features; the flags below mirror them.

| Setting | What it is / choices | Default | When / trade-off / gotcha |
|---|---|---|---|
| Mode | Autopilot vs Standard | Console offers Autopilot first | Immutable; drives the whole billing and ops model |
| Name / location | Cluster name; zone (zonal) or region (regional) | | Location type is immutable; pick region for prod |
| Release channel | Rapid / Regular / Stable / Extended (or static, Standard only) | Regular | Channel = auto-upgrade cadence; static versions on Standard reach end-of-life and force upgrades |
| Control-plane version | Specific minor/patch within the channel | Channel default | You can set a target, but the channel keeps it current |
| Network / subnet | Which VPC and subnet the cluster uses | default | Choose a real subnet with sized secondary ranges |
| Network policy / dataplane | Legacy (Calico) vs Dataplane V2 (eBPF) | Dataplane V2 on new clusters | DPv2 also gives network-policy logging and better observability |
| Cluster IP allocation | VPC-native (alias IP) vs routes-based (legacy) | VPC-native | Routes-based is legacy; always pick VPC-native |
| Pod range / Service range | Secondary ranges (or auto-created) sizing pod and ClusterIP space | Auto | These cap pods-per-cluster and services; immutable, so size up front (see networking) |
| Private cluster | Nodes get internal IPs only; control-plane access via private endpoint | Off (public nodes) | Recommended for prod; pair with Cloud NAT for egress |
| Control-plane authorized networks | CIDR allow-list for the public API endpoint | Off | Strongly recommended even on public clusters |
| Workload Identity | Federate K8s SAs to Google SAs | On (enforced on Autopilot) | Enable on Standard at create; the keyless standard |
| Shielded GKE nodes | Secure/measured boot + integrity monitoring for nodes | On (enforced on Autopilot) | Keep on; cheap integrity guarantee |
| Binary Authorization | Admission policy: only signed/attested images run | Off | Turn on for supply-chain control in prod |
| Security posture / vuln scanning | Built-in misconfig + workload vuln dashboard | Basic on | Standard tier adds deeper scanning |
| Cluster autoscaler / NAP | Per-pool autoscale; node auto-provisioning creates pools on demand | Off (Standard) | Autopilot does this implicitly; enable on Standard for elasticity |
| Maintenance window / exclusions | When auto-upgrades may run; blackout windows | Any time | Set a window + freeze around peak events |
| Fleet registration | Join the cluster to a fleet (multi-cluster mgmt) | Optional | Needed for Config Management, multi-cluster services, Gateway |

A minimal Autopilot cluster:

gcloud container clusters create-auto demo-auto \
  --region=us-central1 \
  --release-channel=regular \
  --enable-private-nodes

A minimal production-shaped Standard regional cluster (VPC-native, private nodes, Dataplane V2, Workload Identity, Shielded nodes):

gcloud container clusters create demo-std \
  --region=us-central1 \
  --release-channel=regular \
  --enable-ip-alias \
  --enable-dataplane-v2 \
  --enable-private-nodes \
  --master-ipv4-cidr=172.16.0.0/28 \
  --workload-pool=$(gcloud config get-value project).svc.id.goog \
  --shielded-secure-boot --shielded-integrity-monitoring \
  --num-nodes=1 \
  --machine-type=e2-standard-4

(--num-nodes=1 on a 3-zone region creates 3 nodes — one per zone.)

Node pools, in depth (Standard)

A node pool is a group of identical nodes managed as a unit. A cluster always has at least one (the default pool created with the cluster); you add more to mix machine types, attach GPUs, use Spot, or isolate workloads with taints. Autopilot has no user-visible node pools — skip this section if you run Autopilot. Every setting below is per-pool.

| Node-pool setting | What it is / choices | Default | When / trade-off / gotcha |
|---|---|---|---|
| Machine type / family | E2, N2, N2D, C3, C3D, T2D (general/compute/memory/Arm), plus A-series/accelerator for GPU/TPU; custom types allowed | e2-medium | Match CPU:RAM to the workload; reserve some capacity for system pods |
| Image type | Container-Optimized OS (cos_containerd, default & recommended), Ubuntu (ubuntu_containerd), Windows | COS | COS is minimal/auto-updating; Ubuntu only if you need its packages/kernel modules |
| Boot disk | pd-standard / pd-balanced / pd-ssd; size; CMEK | pd-balanced, 100 GB | SSD for I/O-heavy nodes; size for image cache + ephemeral storage |
| Node count | Fixed size of the pool (per-zone on regional) | 3 | With autoscaler this is the starting size |
| Autoscaling (min/max) | Cluster autoscaler grows/shrinks the pool by pending-pod pressure | Off | Set sane max; scale-down respects PodDisruptionBudgets and safe-to-evict |
| Location policy | BALANCED vs ANY (where the autoscaler adds nodes) | BALANCED | ANY helps land scarce capacity (e.g. Spot/GPU) |
| Auto-upgrade | Nodes auto-upgraded toward the control-plane version | On (required on channels) | Keeps nodes patched; control the timing with maintenance windows |
| Auto-repair | Unhealthy nodes auto-recreated | On | Repairs failed/NotReady nodes; expect occasional node churn |
| Surge upgrade | max-surge (extra temp nodes) + max-unavailable during upgrades | surge 1 / unavailable 0 | Higher surge = faster, more cost during upgrade; or use blue-green node-pool upgrades |
| Spot VMs | Preemptible, deeply discounted, can be reclaimed any time (no 24h cap) | Off | Up to ~60–91% cheaper; only for fault-tolerant/stateless work; combine with on-demand pool |
| Taints | key=value:EFFECT, where EFFECT is NoSchedule, PreferNoSchedule or NoExecute, to repel pods lacking the matching toleration | | |
| Labels | Node labels for nodeSelector/affinity | GKE adds some | Use for scheduling and cost attribution |
| Node metadata / SA | The node’s Compute Engine service account & scopes; metadata concealment | default CE SA | Give nodes a least-privilege SA; pods get Google access via Workload Identity, not node scopes |
| node-system-config | sysctls and kubelet config (e.g. --max-pods-per-node) | Platform defaults | Max-pods-per-node is set at pool creation and constrains the alias range |
| Confidential nodes | Memory encrypted in use (AMD SEV) | Off | For sensitive workloads; small overhead |

Add an autoscaling on-demand pool and a tainted Spot pool:

# Autoscaling on-demand pool
gcloud container node-pools create web-pool \
  --cluster=demo-std --region=us-central1 \
  --machine-type=e2-standard-4 \
  --enable-autoscaling --min-nodes=1 --max-nodes=6 \
  --enable-autoupgrade --enable-autorepair \
  --max-surge-upgrade=2 --max-unavailable-upgrade=0

# Spot pool, tainted so only tolerant pods land here
gcloud container node-pools create spot-pool \
  --cluster=demo-std --region=us-central1 \
  --machine-type=e2-standard-4 --spot \
  --enable-autoscaling --min-nodes=0 --max-nodes=10 \
  --node-taints=cloud.google.com/gke-spot=true:NoSchedule
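A taint only keeps pods out; a workload still has to opt in with a matching toleration. As a minimal sketch, the snippet below writes an example Deployment that tolerates the taint used on spot-pool and selects Spot nodes by label. The name batch-worker, the temp-file approach and the reuse of the hello-app sample image are illustrative assumptions, not part of the lesson's lab.

```shell
# Write an example Deployment that opts in to the Spot pool.
# "batch-worker" and the image are placeholders for your own workload.
manifest=$(mktemp)
cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: batch-worker
spec:
  replicas: 2
  selector:
    matchLabels:
      app: batch-worker
  template:
    metadata:
      labels:
        app: batch-worker
    spec:
      # Only schedule onto nodes carrying the Spot label.
      nodeSelector:
        cloud.google.com/gke-spot: "true"
      # Tolerate the taint set with --node-taints on spot-pool.
      tolerations:
      - key: cloud.google.com/gke-spot
        operator: Equal
        value: "true"
        effect: NoSchedule
      containers:
      - name: worker
        image: us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
EOF
echo "Wrote $manifest"
# Apply it to the cluster with: kubectl apply -f "$manifest"
```

The toleration alone would allow the pods onto Spot nodes without requiring it; the nodeSelector is what pins them there, so drop it if on-demand fallback is acceptable.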

Cluster autoscaler vs node auto-provisioning

The cluster autoscaler (CA) scales node counts within pools you already defined, up to each pool’s max, based on pending (unschedulable) pods, and scales down underused nodes when their pods can move elsewhere. Node auto-provisioning (NAP) goes further: it creates and deletes whole node pools automatically with machine shapes that fit pending pods — closer to the Autopilot experience while keeping node access. Enable NAP with resource limits:

gcloud container clusters update demo-std --region=us-central1 \
  --enable-autoprovisioning --min-cpu=1 --max-cpu=64 --min-memory=1 --max-memory=256

Networking: VPC-native, ranges, Dataplane V2, private clusters

VPC-native and the three IP ranges

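A VPC-native cluster draws from three address pools: the subnet’s primary range for node IPs, a secondary (alias) range for pod IPs, and a secondary range for Service (ClusterIP) addresses. You must size them up front, because the pod and service ranges are effectively immutable for the cluster’s life.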

Size the pod range with headroom: with the default 110 max-pods-per-node, GKE reserves a /24 from the pod range for each node, so a /16 pod range supports up to 256 nodes; shrinking max-pods-per-node packs more nodes into the same range. Because you cannot grow the pod or service range later, over-provision deliberately. Pods getting real VPC IPs is what makes GKE traffic routable to on-prem, peered VPCs and Private Google Access without per-pod routes, which is the big advantage over the legacy routes-based model.
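The sizing trade-off between the pod range and max-pods-per-node can be sketched as plain arithmetic. This assumes GKE reserves, for each node, the smallest power-of-two block holding at least twice max-pods-per-node addresses (so 110 pods maps to a /24); both function names are hypothetical.

```shell
# Sketch of the pod-range arithmetic; function names are made up.
# Assumption: each node gets the smallest power-of-two block with at
# least 2 x max-pods-per-node addresses.

# node_prefix_for_max_pods <max-pods-per-node> -> per-node prefix length
node_prefix_for_max_pods() (
  need=$(( $1 * 2 ))
  size=1
  bits=0
  while [ "$size" -lt "$need" ]; do
    size=$(( size * 2 ))
    bits=$(( bits + 1 ))
  done
  echo $(( 32 - bits ))
)

# max_nodes_for_pod_range <pod-range-prefix> <max-pods-per-node>
# -> how many per-node blocks fit in the pod range
max_nodes_for_pod_range() (
  node_prefix=$(node_prefix_for_max_pods "$2")
  if [ "$node_prefix" -lt "$1" ]; then
    echo 0   # the range is smaller than a single node's block
  else
    echo $(( 1 << (node_prefix - $1) ))
  fi
)

max_nodes_for_pod_range 16 110   # /16 pod range, default max pods
max_nodes_for_pod_range 16 32    # same range, 32 pods per node
```

Under that assumption the two calls print 256 and 1024: lowering max-pods-per-node from 110 to 32 quadruples the node ceiling of the same /16.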

Dataplane V2 (eBPF)

Dataplane V2 replaces the old kube-proxy/iptables and Calico stack with an eBPF dataplane built on Cilium. It is the default on new clusters and brings: scalable Service/load-balancing handling, built-in NetworkPolicy enforcement (no separate Calico add-on), network policy logging, and better visibility. NetworkPolicy is how you implement pod-to-pod firewalling (default-deny, then allow specific flows). To request Dataplane V2 explicitly at cluster creation:

gcloud container clusters create demo-dpv2 --region=us-central1 \
  --enable-dataplane-v2 --enable-ip-alias
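The default-deny-then-allow pattern mentioned above might look like the following pair of NetworkPolicy objects. This is a sketch only: the namespace my-ns, the labels app=web and app=frontend, and port 8080 are placeholders, and the snippet just writes the manifest to a temp file rather than touching a cluster.

```shell
# Write a default-deny policy plus one allow rule to a temp file.
# Namespace, labels and port are illustrative placeholders.
policy=$(mktemp)
cat > "$policy" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-ns
spec:
  podSelector: {}          # empty selector = every pod in the namespace
  policyTypes:
  - Ingress
  - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-web
  namespace: my-ns
spec:
  podSelector:
    matchLabels:
      app: web
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 8080
EOF
echo "Wrote $policy"
# Apply it to the cluster with: kubectl apply -f "$policy"
```

Note that denying all egress also blocks DNS lookups from those pods, so a real rollout usually adds an egress rule for the cluster DNS service (and one letting frontend pods reach web) before switching default-deny on.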

Private clusters

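In a private cluster, nodes have internal IPs only (no public IPs), shrinking the attack surface. You then control how you reach the control plane: either through the public endpoint restricted by control-plane authorized networks, or through the private endpoint only, from inside the VPC.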

Because private nodes have no public IP, outbound internet (pulling images from non-Artifact-Registry registries, calling third-party APIs) needs Cloud NAT, and access to Google APIs/Artifact Registry needs Private Google Access (on by default for GKE subnets). This is the standard production shape: private nodes + authorized networks (or private endpoint) + Cloud NAT.

Exposing workloads: Service, Ingress, Gateway

| Mechanism | Layer | What it creates | Use when |
|---|---|---|---|
| Service type LoadBalancer | L4 | A passthrough/internal Network LB to the pods | Simple TCP/UDP exposure of one Service |
| Ingress (GKE Ingress) | L7 | A Google Application Load Balancer via HTTP(S) | HTTP routing, managed certs, Cloud CDN/Armor; the long-standing default |
| Gateway API (GKE Gateway) | L7/L4 | LBs driven by the standard Gateway/HTTPRoute CRDs | Modern, role-oriented, multi-cluster traffic, finer control |
| Container-native LB (NEGs) | | LB sends traffic directly to pod IPs (skips node hop) | Default with VPC-native; better latency, accurate health checks |

GKE Ingress and Gateway both lean on the same Cloud Load Balancing building blocks (forwarding rule → target proxy → URL map → backend service → NEG of pod IPs). Container-native load balancing via Network Endpoint Groups (NEGs) — automatic in VPC-native clusters — routes the LB straight to pod IPs, which is why VPC-native matters for ingress too. Gateway is the forward-looking choice for new platforms; Ingress remains fully supported.

Workload Identity Federation for GKE

The single most important security feature: Workload Identity Federation for GKE lets a Kubernetes ServiceAccount impersonate a Google service account, so pods call Google APIs (Cloud Storage, Pub/Sub, BigQuery…) with short-lived, automatically-rotated credentials and no downloaded JSON keys. It replaces the bad old pattern of mounting a service-account key into a pod (a long-lived secret that leaks and never rotates) and the blunt pattern of granting the node’s service account broad scopes (which gives every pod on the node the same access).

It is on and enforced in Autopilot; on Standard you enable it on the cluster (--workload-pool=PROJECT.svc.id.goog) and on each node pool. The wiring is a three-step bind:

# 1. Allow the K8s SA (namespace/ksa) to impersonate the Google SA
gcloud iam service-accounts add-iam-policy-binding \
  app-gsa@$PROJECT.iam.gserviceaccount.com \
  --role=roles/iam.workloadIdentityUser \
  --member="serviceAccount:$PROJECT.svc.id.goog[my-ns/app-ksa]"

# 2. Annotate the Kubernetes ServiceAccount to point at the Google SA
kubectl annotate serviceaccount app-ksa -n my-ns \
  iam.gke.io/gcp-service-account=app-gsa@$PROJECT.iam.gserviceaccount.com

# 3. Grant the Google SA only the roles the workload needs (least privilege)
gcloud projects add-iam-policy-binding $PROJECT \
  --member="serviceAccount:app-gsa@$PROJECT.iam.gserviceaccount.com" \
  --role=roles/storage.objectViewer

Pods running under app-ksa then authenticate to Google APIs as app-gsa automatically — no keys anywhere. (The depth lesson on Workload Identity walks through troubleshooting, the metadata server, and migration; this is the fundamentals.)
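Most Workload Identity failures come from a typo in one of the two identity strings used in steps 1 and 2. As a small sketch, the helpers below (both names are hypothetical) build those strings from their parts so they can be printed and compared before running the gcloud and kubectl commands.

```shell
# Hypothetical helper: IAM member string for a Kubernetes ServiceAccount.
# Usage: wi_member <project-id> <namespace> <ksa-name>
wi_member() (
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
)

# Hypothetical helper: Google SA email used in the KSA annotation.
# Usage: gsa_email <gsa-name> <project-id>
gsa_email() (
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
)

# "my-project" is a placeholder project ID.
wi_member my-project my-ns app-ksa
gsa_email app-gsa my-project
```

The first line of output is the value for --member in step 1, and the second is the value of the iam.gke.io/gcp-service-account annotation in step 2.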

Security: the controls worth enabling

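GKE ships more secure defaults than self-managed Kubernetes, but production warrants going further. The major controls are Workload Identity, private nodes with a restricted API endpoint, Shielded GKE nodes, node auto-upgrade, NetworkPolicy, Binary Authorization and the Security Posture dashboard.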

Release channels and upgrades

A release channel subscribes the cluster to an upgrade cadence; Google then auto-upgrades the control plane (and, with auto-upgrade, the nodes) along that channel, having validated each version.

| Channel | Cadence / maturity | Use when |
|---|---|---|
| Rapid | Newest versions soonest (incl. latest minor) | Test/staging, early access to features |
| Regular | Balanced: proven a few weeks after Rapid | Default for most production |
| Stable | Most conservative, longest soak | Risk-averse production |
| Extended | Longer support window for a version (extra cost) | Workloads that need to pin a version longer |
| Static (no channel) | Standard only; you pin a version manually | Avoid: versions reach end-of-life and force upgrades |

Control upgrade timing with maintenance windows (when upgrades may run) and maintenance exclusions (blackout periods — e.g. freeze during a launch or peak shopping week). For nodes, choose the upgrade strategy per pool: surge upgrades (extra temporary nodes drain-and-replace gradually) or blue-green (stand up a parallel set, shift, then tear down the old) for safer, reversible rollouts. Regional control planes upgrade with zero API downtime; zonal planes are briefly unavailable during the control-plane upgrade.

Architecture at a glance

The diagram below contrasts the two operating modes and shows the shared building blocks — the Google-managed control plane on top, then Autopilot (Google-run nodes, pay-per-pod-request) beside Standard (your node pools on Compute Engine VMs, pay-per-node), with VPC-native pod/service ranges, Dataplane V2, a private-cluster boundary, and Workload Identity linking a pod’s Kubernetes ServiceAccount to a Google service account.

[Figure: Google Kubernetes Engine: Autopilot vs Standard]

Keep this picture in mind: almost every setting on this page configures one of these boxes — the control plane, a node pool, an IP range, the dataplane, the cluster boundary, or the identity link — or the wiring between two of them.

Hands-on lab

Create an Autopilot cluster (lowest-friction, pay-per-pod), deploy and expose an app, demonstrate Workload Identity, then clean up. Run this in Cloud Shell, where gcloud and kubectl are pre-installed and you are already authenticated. Autopilot bills per pod request, so a tiny deployment running for an hour costs a rupee or two, and the GKE free tier offsets one cluster’s management fee per month. New accounts also get the $300 free-trial credit. We delete everything at the end.

Step 1 — Set the project and a region.

gcloud config set project "$(gcloud config get-value project)"
REGION=us-central1

Expected: Updated property [core/project].

Step 2 — Create an Autopilot cluster (a few minutes).

gcloud container clusters create-auto demo-auto \
  --region=$REGION --release-channel=regular

Expected: progress lines ending in Created and a RUNNING status with an endpoint IP.

Step 3 — Get credentials so kubectl targets the cluster.

gcloud container clusters get-credentials demo-auto --region=$REGION
kubectl get nodes

Expected: kubeconfig entry generated, then one or more nodes Ready (Autopilot provisions them as workloads land).

Step 4 — Deploy a sample app and expose it with an external L4 Service.

kubectl create deployment hello --image=us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0
kubectl set resources deployment hello --requests=cpu=250m,memory=256Mi
kubectl expose deployment hello --type=LoadBalancer --port=80 --target-port=8080
kubectl get service hello -w

Expected: the Service shows <pending> then an EXTERNAL-IP. Press Ctrl-C, then curl http://EXTERNAL_IP returns Hello, world! with the version and hostname.

Step 5 — Prove Workload Identity (pod authenticates as a Google SA).

# Create a Google SA and allow the default KSA in 'default' to impersonate it
PROJECT=$(gcloud config get-value project)
gcloud iam service-accounts create wi-demo
gcloud iam service-accounts add-iam-policy-binding \
  wi-demo@$PROJECT.iam.gserviceaccount.com \
  --role=roles/iam.workloadIdentityUser \
  --member="serviceAccount:$PROJECT.svc.id.goog[default/default]"
kubectl annotate serviceaccount default \
  iam.gke.io/gcp-service-account=wi-demo@$PROJECT.iam.gserviceaccount.com

# Run a Cloud SDK pod and check which identity it is
kubectl run wi-test --rm -it --restart=Never \
  --image=google/cloud-sdk:slim -- \
  gcloud auth list

Expected: the active account is wi-demo@$PROJECT.iam.gserviceaccount.com — the pod authenticated with no key file.

Validation. kubectl get deploy,svc shows hello available with an external IP; kubectl get nodes shows Autopilot-managed nodes; the gcloud auth list output inside the pod shows the federated Google SA.

Cleanup.

kubectl delete service hello
kubectl delete deployment hello
gcloud iam service-accounts delete wi-demo@$PROJECT.iam.gserviceaccount.com --quiet
gcloud container clusters delete demo-auto --region=$REGION --quiet

Cost note. Deleting the cluster stops the management fee and all pod billing. The external load balancer and its forwarding rule bill while they exist, so delete the Service before (or with) the cluster. With Autopilot you were charged only for the pod’s small CPU/memory request for the time it ran — typically a rupee or two for this lab.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| Pods stuck Pending, “Insufficient cpu/memory” | Standard pool too small / autoscaler off or at max | Enable/raise pool autoscaling or NAP; on Autopilot check requests vs quotas |
| Cannot grow pod IP space / nodes capped | Pod (alias) range or max-pods-per-node sized too small, and immutable | Plan ranges with headroom at creation; rebuild with a larger range |
| Private-cluster nodes can’t pull public images | No Cloud NAT / image not in Artifact Registry | Add Cloud NAT for egress; mirror images to Artifact Registry (Private Google Access covers Google) |
| Pod gets 403 calling a Google API | Workload Identity binding/annotation missing or wrong SA roles | Re-check the workloadIdentityUser binding, the KSA annotation, and the Google SA’s roles |
| kubectl times out connecting to API | Private endpoint + your IP not in authorized networks | Add your CIDR to control-plane authorized networks, or use the private endpoint from inside the VPC |
| Cluster auto-upgraded during peak traffic | No/loose maintenance window | Set a maintenance window and an exclusion around peak periods |
| Spot-pool pods evicted constantly | Spot reclamation under capacity pressure | Tolerate disruption (PDBs, replicas) and add an on-demand pool as fallback; set min-nodes appropriately |
| Tried to switch Standard → Autopilot in place | Mode is fixed at creation | Create a new Autopilot cluster and migrate workloads |

Best practices

Security notes

GKE’s defaults do a lot, but the responsibility line is real: Google secures and patches the control plane; you secure your workloads, IAM, network policy and image supply chain. The non-negotiables: Workload Identity (no keys), private nodes + restricted API access, node auto-upgrade (CVE patching), and Shielded nodes. Layer on NetworkPolicy for east-west segmentation, Binary Authorization to admit only trusted images, and the Security Posture dashboard to catch misconfigurations and vulnerable workloads. Audit access with Cloud Audit Logs, and keep the node service account and pod-level Google access least-privilege and separate — node scopes are for the kubelet, Workload Identity is for your pods.

Cost & sizing

The levers that move a GKE bill:

Interview & exam questions

  1. What is the core difference between GKE Autopilot and Standard? Autopilot is pod-centric — Google runs and scales the nodes and you pay for pod resource requests; Standard is node-centric — you manage node pools and pay for node VMs. Autopilot enforces secure defaults and removes node access; Standard gives full node control and responsibility. Mode is fixed at creation.

  2. Zonal vs regional cluster — what changes? A regional cluster runs three control-plane replicas across three zones, surviving a zone failure and upgrading with zero API downtime; a zonal cluster has a single-zone control plane that is unavailable during its zone’s outage or control-plane upgrade. On Standard regional, node counts are per-zone (1 → 3 nodes across 3 zones).

  3. What does “VPC-native” mean and why does it matter? Pods get real IPs from a secondary (alias) range on the subnet, making pod traffic routable across the VPC, peered networks and on-prem with no extra routes, and enabling container-native load balancing (NEGs). The legacy alternative is routes-based; always choose VPC-native. Pod and service ranges are immutable, so size them with headroom.

  4. Why is mounting a service-account key into a pod bad, and what replaces it? A mounted key is a long-lived secret that doesn’t rotate and leaks easily; it also can’t be scoped per pod easily. Workload Identity Federation for GKE replaces it: a Kubernetes SA federates to a Google SA and pods get short-lived, auto-rotated tokens with no keys.

  5. Cluster autoscaler vs node auto-provisioning? The cluster autoscaler scales node counts within existing pools based on pending pods. Node auto-provisioning additionally creates and removes whole pools with shapes that fit pending pods — closer to Autopilot while keeping node access.

  6. What is Dataplane V2 and what do you get from it? An eBPF/Cilium dataplane (default on new clusters) replacing kube-proxy/iptables and Calico. It brings scalable service handling, built-in NetworkPolicy enforcement and network-policy logging, and better observability.

  7. How do you upgrade nodes safely with minimal disruption? Use surge upgrades (max-surge/max-unavailable) to add temporary nodes and drain-and-replace gradually, or blue-green node-pool upgrades for a parallel, reversible rollout. Pair with PodDisruptionBudgets, maintenance windows and exclusions. Regional control-plane upgrades have no API downtime.

  8. What does a private cluster need to reach the internet and Google APIs? Private nodes have no public IP, so outbound internet needs Cloud NAT; access to Google APIs/Artifact Registry uses Private Google Access (on for GKE subnets). Restrict the control plane with authorized networks or a private endpoint.

  9. What are taints and tolerations used for in node pools? A taint on a pool repels pods that lack a matching toleration, letting you dedicate pools (Spot, GPU, Windows) so only opted-in workloads schedule there.

  10. Name three security controls you’d enable on a production GKE cluster and why. Workload Identity (keyless API access), private nodes + restricted API endpoint (smaller attack surface), and node auto-upgrade (CVE patching) — plus Shielded nodes, NetworkPolicy and Binary Authorization for defence in depth.

  11. What do release channels do, and which would you pick? They subscribe the cluster to an auto-upgrade cadence (Rapid/Regular/Stable, plus Extended). Regular is the balanced default for most production; Stable for risk-averse; Rapid for test/early access. Static (no channel) is discouraged — versions reach end-of-life.

  12. How is GKE billed, and what’s the cluster fee? A flat per-cluster management fee (same across zonal/regional, Autopilot/Standard) plus, underneath, pod requests (Autopilot) or node VM-hours (Standard). The free tier offsets one cluster’s fee per account per month.

Quick check

  1. You need a production cluster that survives a single zone failing and upgrades without API downtime. What location type do you choose, and how many control-plane replicas does it have?
  2. True or false: you can convert an existing Standard cluster to Autopilot in place.
  3. A pod returns 403 calling Cloud Storage. You’re using Workload Identity. Name the three things to verify.
  4. Why must you size the pod (alias) IP range carefully at creation rather than later?
  5. You want batch jobs to run on cheap, interruptible capacity but never have your API pods evicted. How do you structure node pools?

Answers

  1. Regional, with three control-plane replicas across three zones. It survives a zone outage and upgrades the control plane with zero API downtime.
  2. False. Mode is fixed at creation; you create a new Autopilot cluster and migrate workloads.
  3. The workloadIdentityUser IAM binding on the Google SA that names the K8s SA as member, the iam.gke.io/gcp-service-account annotation on the Kubernetes ServiceAccount, and the Google SA’s IAM roles (e.g. storage.objectViewer).
  4. The pod and service ranges are immutable for the cluster’s life; the pod range plus max-pods-per-node cap how many pods and nodes you can ever run, so over-provision with headroom up front.
  5. Run a Spot node pool (tainted, e.g. min-nodes=0) for the batch jobs (which tolerate the taint and disruption) and a separate on-demand pool for the API pods, so eviction of Spot capacity never touches them.

Exercise

In Cloud Shell, create a Standard regional cluster with --enable-ip-alias, --enable-dataplane-v2, --enable-private-nodes, --workload-pool, Shielded-node flags and --num-nodes=1 in us-central1 (note you get 3 nodes). Then: (a) add a second, autoscaling Spot node pool (--spot --enable-autoscaling --min-nodes=0 --max-nodes=4) tainted cloud.google.com/gke-spot=true:NoSchedule; (b) deploy an app with a matching toleration and confirm it lands on the Spot pool with kubectl get pods -o wide and kubectl get nodes -L cloud.google.com/gke-spot; (c) apply a default-deny NetworkPolicy in its namespace and verify cross-pod traffic is blocked; (d) bind a Workload Identity Google SA to the app’s KSA and prove access with gcloud auth list inside a pod; (e) clean up with gcloud container clusters delete. Bonus: enable node auto-provisioning with CPU/memory limits and observe the autoscaler create a pool for a pending pod.

Certification mapping

Glossary

Next steps

You now know GKE end to end: both modes, the control-plane topology, node pools, networking, identity, security and cost. The natural follow-ons go deeper on running it well and on locking down identity: Autopilot production hardening and Workload Identity in depth.
