GCP Lesson 11 of 98

Google Cloud Functions, In Depth: 1st vs 2nd Gen, Triggers, Runtimes, Concurrency & Scaling

Google Cloud Functions is Google’s functions-as-a-service (FaaS) platform: you write a single function — a handler that responds to an HTTP request or to an event — push the source, and Google builds it into a container, deploys it, scales it from zero to many copies as load arrives, scales it back to zero when idle, and bills you only while it runs. There are no servers to provision, no containers to write, no autoscaler to tune. It sits one rung above Cloud Run on the abstraction ladder: with Cloud Run you bring a whole container; with Cloud Functions you bring just a function and Google builds the container for you. It is the fastest way on Google Cloud to turn a snippet of code into a deployed, scalable, event-driven endpoint.

This lesson is deliberately exhaustive. The single most important thing to understand first is that there are two generations, and 2nd gen is a completely different machine underneath — it is Cloud Run plus Eventarc with a function-shaped front door, while 1st gen is the original, Google-managed FaaS platform. We cover that split in full with a comparison table, then walk every trigger type (HTTP, Pub/Sub, Cloud Storage, Firestore, generic Eventarc/CloudEvents, and scheduled invocation via Cloud Scheduler), every language runtime (Node.js, Python, Go, Java, .NET, Ruby, PHP) with its exact function signature, source layout, and the buildpacks that turn source into an image, and then the entire scaling model: minimum and maximum instances, per-instance concurrency (a 2nd-gen superpower), scale-to-zero, cold starts, CPU and memory sizing, and the request timeout. We finish with networking (VPC connector vs Direct VPC egress, ingress controls), environment variables and Secret Manager, service identity and the invoker IAM role, and the decision interviewers and the ACE and Professional Cloud Architect exams love: Cloud Functions vs Cloud Run vs App Engine. Every option gets the same treatment — what it is · the choices · the default · when to pick which · the trade-off · the limit · the cost impact · the gotcha — and every core operation comes with a real gcloud command. Everything below reflects the current (2026) surface, where 2nd gen is the default and the recommended choice for almost all new work.

Learning objectives

By the end of this lesson you can:

Prerequisites & where this fits

You should already understand Google Cloud’s resource hierarchy — organisation → folder → project → resource — what a region is, how to run gcloud from Cloud Shell or a local SDK install (covered in the Fundamentals module), and the idea of an event (a message describing that something happened). It helps to have seen a container image conceptually, but you do not need to write one — that is precisely the point of Cloud Functions. This is the serverless functions lesson of the Compute module in the GCP Zero-to-Hero course. It sits directly above Cloud Run on the abstraction ladder and shares its engine in 2nd gen, so reading the Cloud Run deep dive first will make every scaling and networking concept here click instantly. Once you can deploy and tune a function fluently, pair this with the architecture-focused Event-Driven Architecture with Cloud Functions 2nd Gen and Eventarc to design whole event-driven systems rather than single handlers.

Core concepts

Before the options, fix six mental models. They explain why every setting is shaped the way it is.

A function is your code; Google builds and runs everything around it. You provide one entry-point function plus a manifest of dependencies. Google’s build system (Cloud Build) runs a buildpack that wraps your code in a tiny web server (the Functions Framework for your language), installs your dependencies, and produces an OCI container image stored in Artifact Registry — all without you writing a Dockerfile. You are responsible for the function body and the dependency list; Google is responsible for the base image, the server, the build, the host, request routing, TLS, and scaling.

There are two generations, and 2nd gen is a different platform. 1st gen is the original Google-managed FaaS runtime with its own event plumbing and tight per-function limits. 2nd gen deploys your function as a Cloud Run service and delivers events to it through Eventarc. That one architectural decision drives every meaningful difference: 2nd gen inherits Cloud Run’s longer timeouts, bigger instances, request concurrency (many requests per instance), traffic splitting/revisions, and Eventarc’s huge catalogue of event sources. Internalise this and the platform stops being magic — your 2nd-gen function is a Cloud Run service that you can even see and manage in the Cloud Run console.

Functions come in two shapes: HTTP and event-driven. An HTTP function is invoked by a web request and returns a response (a webhook, an API endpoint). An event-driven (background/CloudEvent) function is invoked by an event — a message published to Pub/Sub, an object finalised in a bucket, a Firestore document written — and returns nothing to a caller; it just does work. The function signature you write differs between the two, and between generations.

Instances are ephemeral and stateless. Each running copy of your function is an instance. The autoscaler creates and destroys them freely; there is no durable local disk beyond an in-memory /tmp (which counts against your memory limit and vanishes with the instance). Anything that must persist belongs in an external store. Design so any request or event can land on any instance.
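
As a minimal sketch of treating local disk purely as scratch space, the helper below writes a payload to a temporary file, uses it, and always deletes it. The name `process_upload` is hypothetical, and the step that would copy a result to an external store is deliberately left out.

```python
import os
import tempfile


def process_upload(payload: bytes) -> int:
    """Write payload to scratch space, measure it, and clean up.

    Hypothetical helper: on Cloud Functions the temp directory is held in
    memory, so the file counts against the memory limit until it is removed.
    """
    fd, path = tempfile.mkstemp(suffix=".bin")
    try:
        with os.fdopen(fd, "wb") as scratch:
            scratch.write(payload)
        size = os.path.getsize(path)
    finally:
        # Always remove scratch files so a reused instance does not leak memory.
        os.remove(path)
    return size
```

Calling `process_upload(b"abc")` returns 3 and leaves nothing behind, which is the property that matters when the next request may land on the same instance or a brand-new one.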

Scaling is automatic and bounded by instances (and, in 2nd gen, concurrency). You never set an instance count. You set bounds, min-instances (default 0, i.e. scale to zero) and max-instances, and the platform sizes the fleet to load. In 1st gen, each instance handles exactly one request at a time (concurrency is effectively 1). In 2nd gen, you set per-instance concurrency (up to 1000), so one instance can serve many requests at once: fewer instances, fewer cold starts, lower cost.

Billing is pay-per-use with a generous free tier. You pay for invocations, for the compute (vCPU-seconds and GiB-seconds) consumed while your function runs, and for networking egress. When nothing is running (and min-instances is 0), you pay nothing for compute. Both generations include a monthly free allotment of invocations and compute.

Key terms used throughout: generation (1st vs 2nd), trigger (what invokes the function), runtime (the language and version), Functions Framework (the per-language server that adapts your function to HTTP), CloudEvent (the standard event envelope 2nd gen uses), instance (a running copy), concurrency (simultaneous requests per instance), cold start (latency to spin up a fresh instance), and service account (the function’s identity).

1st gen vs 2nd gen: the defining split

This is the first and most consequential decision, and a guaranteed interview question. 2nd gen is built on Cloud Run (for execution) and Eventarc (for events); 1st gen is the legacy, self-contained platform. As of 2026, gcloud functions deploy defaults to 2nd gen, and Google recommends 2nd gen for all new functions. Choose 1st gen only for a narrow set of legacy needs (e.g. certain direct event types not yet fronted by Eventarc, or to match an existing 1st-gen deployment).

Capability | 1st gen | 2nd gen (Cloud Run + Eventarc)
Execution engine | Google-managed FaaS runtime | Cloud Run service (visible in Cloud Run)
Event delivery | Built-in, function-specific plumbing | Eventarc (CloudEvents) + native Pub/Sub/HTTP
Concurrency per instance | 1 (one request at a time) | Up to 1000 (configurable; default 1 for safety)
Max request timeout | 9 minutes (540 s) | 60 minutes (3600 s) for HTTP; 9 minutes for event-driven functions
Max memory | 8 GiB | 32 GiB
Max vCPU | tied to memory (up to ~2 vCPU) | up to 8 vCPU (independently selectable)
Max instances | 3,000 | 1,000 per function (can request higher via Cloud Run)
Min instances (warm) | Supported | Supported
Traffic splitting / revisions | No | Yes (inherited from Cloud Run)
Eventarc event sources | Limited direct sources | Eventarc full catalogue (90+ Google sources via Audit Logs, plus Pub/Sub, Storage, Firestore)
CloudEvents format | No (legacy event formats) | Yes (industry-standard CloudEvents)
VPC egress | VPC connector only | VPC connector or Direct VPC egress
Networking ingress controls | Basic | Full Cloud Run ingress (all / internal / internal+LB)
Deploy command | gcloud functions deploy --no-gen2 | gcloud functions deploy --gen2 (default)
Pricing model | Per-invocation + GB/GHz-seconds | Cloud Run pricing (request- or instance-based)

How to read this table. Almost every row favours 2nd gen, and the reasons all trace back to the engine: because 2nd gen is Cloud Run, it gets Cloud Run’s concurrency, big instances, long timeouts, revisions, and networking; because it routes events through Eventarc, it gets Eventarc’s enormous source catalogue and the standard CloudEvents envelope. The one place 1st gen “wins” — a higher max-instances ceiling and a couple of niche direct event types — rarely matters. The gotcha: a 1st-gen and a 2nd-gen function are different resources even with the same name; you cannot “upgrade” in place — you redeploy as 2nd gen (Google provides a migration path/tool, but treat it as a new deployment and re-test). Cost note: 2nd-gen concurrency is the biggest lever — serving 10 requests per instance instead of 1 can cut compute cost ~10× for I/O-bound workloads.

Select the generation explicitly so there are no surprises:

# 2nd gen (the default and recommended)
gcloud functions deploy myfn --gen2 --region=us-central1 ...

# 1st gen (legacy)
gcloud functions deploy myfn --no-gen2 --region=us-central1 ...

Everything from here uses 2nd gen unless a row explicitly contrasts the generations.

Triggers: every way to invoke a function

A trigger is what causes your function to run. There are two families: HTTP (a web request) and event-driven (something happened). In 2nd gen, all event-driven triggers are ultimately Eventarc triggers delivering CloudEvents, but gcloud gives you convenient shorthands for the common sources.

Trigger | Flag(s) | Delivery | Function shape | Retries | Typical use
HTTP | --trigger-http | Synchronous request/response over HTTPS | HTTP handler | Client/LB retries | Webhooks, APIs, manual invoke
Pub/Sub | --trigger-topic=TOPIC | Push from a Pub/Sub subscription (managed for you) | CloudEvent handler | Yes, until ack/expiry | Async fan-out, decoupling
Cloud Storage | --trigger-bucket=BUCKET --trigger-event-filters="type=..." | Eventarc (via Cloud Storage events) | CloudEvent handler | Yes | React to object create/delete/finalize
Firestore | --trigger-event-filters="type=google.cloud.firestore.document.v1.written" ... | Eventarc | CloudEvent handler | Yes | React to document writes
Eventarc (generic) | --trigger-event-filters=... (+ --trigger-event-filters-path-pattern) | Eventarc (any supported source) | CloudEvent handler | Yes | Audit-Log events from 90+ Google services
Cloud Scheduler | (deploy HTTP, then create a scheduler job) | Scheduler calls the HTTP URL on a cron | HTTP handler | Scheduler retry policy | Cron / periodic jobs

HTTP triggers

An HTTP function is reachable at an HTTPS URL and returns a response. This is the shape for webhooks, REST/JSON APIs, and anything you invoke directly.

gcloud functions deploy http-hello \
  --gen2 --region=us-central1 --runtime=python312 \
  --source=. --entry-point=hello --trigger-http --allow-unauthenticated

Pub/Sub triggers

gcloud functions deploy on-message \
  --gen2 --region=us-central1 --runtime=nodejs20 \
  --source=. --entry-point=onMessage --trigger-topic=orders

Cloud Storage triggers

gcloud functions deploy on-upload \
  --gen2 --region=us-central1 --runtime=python312 \
  --source=. --entry-point=on_upload \
  --trigger-bucket=my-uploads-bucket
# or explicitly with event filters:
#   --trigger-event-filters="type=google.cloud.storage.object.v1.finalized" \
#   --trigger-event-filters="bucket=my-uploads-bucket"
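
A storage-triggered handler usually starts by deciding whether the object is one it should act on. The sketch below assumes the event payload is a dict carrying the object `name`, and that the function writes its own output under a `thumbs/` prefix; both the helper name `should_process` and that prefix are hypothetical choices for illustration.

```python
def should_process(event_data: dict, output_prefix: str = "thumbs/") -> bool:
    """Return True only for objects the function did not write itself."""
    name = event_data.get("name", "")
    # Ignore events without an object name and folder placeholder objects.
    if not name or name.endswith("/"):
        return False
    # Skipping our own output prefix stops the function re-triggering itself.
    return not name.startswith(output_prefix)
```

A guard like this is a second line of defence; writing output to a separate bucket remains the simpler way to avoid a trigger loop.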

Firestore triggers

gcloud functions deploy on-user-write \
  --gen2 --region=us-central1 --runtime=nodejs20 \
  --source=. --entry-point=onUserWrite \
  --trigger-location=us-central1 \
  --trigger-event-filters="type=google.cloud.firestore.document.v1.written" \
  --trigger-event-filters="database=(default)" \
  --trigger-event-filters-path-pattern="document=users/{userId}"
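
Inside the handler it is often useful to recover the wildcard value from the document path. This is a minimal sketch, assuming the path is available as a plain string such as `users/alice`; the regular expression only illustrates a single-segment wildcard and is not Eventarc's own matcher. The helper name `extract_user_id` is hypothetical.

```python
import re
from typing import Optional

# Illustrative equivalent of the single-segment pattern users/{userId}.
USER_DOC = re.compile(r"^users/(?P<userId>[^/]+)$")


def extract_user_id(document_path: str) -> Optional[str]:
    """Return the userId segment, or None when the path is not a users/ document."""
    match = USER_DOC.match(document_path)
    return match.group("userId") if match else None
```

Paths with extra segments, such as a subcollection document, return `None` here because the pattern covers exactly one segment after `users/`.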

Eventarc (generic / CloudEvents) triggers

gcloud functions deploy on-bq-job \
  --gen2 --region=us-central1 --runtime=python312 \
  --source=. --entry-point=on_bq_job \
  --trigger-event-filters="type=google.cloud.audit.log.v1.written" \
  --trigger-event-filters="serviceName=bigquery.googleapis.com" \
  --trigger-event-filters="methodName=jobservice.jobcompleted"

Cloud Scheduler (cron) triggers

# 1) deploy an authenticated HTTP function (note: no --allow-unauthenticated)
gcloud functions deploy nightly-job \
  --gen2 --region=us-central1 --runtime=python312 \
  --source=. --entry-point=nightly --trigger-http

URL=$(gcloud functions describe nightly-job --gen2 --region=us-central1 \
      --format='value(serviceConfig.uri)')

# 2) schedule it (Scheduler authenticates with an OIDC token)
gcloud scheduler jobs create http nightly-trigger \
  --location=us-central1 --schedule="0 2 * * *" --time-zone="Asia/Kolkata" \
  --uri="$URL" --http-method=POST \
  --oidc-service-account-email=scheduler-sa@PROJECT_ID.iam.gserviceaccount.com

Runtimes: every language, signature, and source layout

A runtime is the language and version your function runs on. Google provides managed runtimes for seven languages; each has a Functions Framework library that adapts your function to HTTP and CloudEvents. List what is available with gcloud functions runtimes list --region=us-central1.

Language | Recent runtime IDs (2026) | Source entry file | Dependency manifest
Node.js | nodejs22, nodejs20, nodejs18 | index.js (or main in package.json) | package.json
Python | python312, python311, python310 | main.py | requirements.txt
Go | go122, go121 | *.go in package, exported func | go.mod
Java | java21, java17, java11 | class implementing a framework interface | pom.xml / build.gradle
.NET | dotnet8, dotnet6 | class implementing IHttpFunction/ICloudEventFunction | *.csproj
Ruby | ruby33, ruby32 | app.rb | Gemfile
PHP | php83, php82 | index.php | composer.json

Pick the version explicitly with --runtime=<id>. The gotcha: runtimes reach end of support on the language’s schedule; deploying to a deprecated runtime is eventually blocked, so pin to a current major and plan upgrades.

Buildpacks: how source becomes a container

You never write a Dockerfile. Google’s buildpacks (the open-source GCP buildpacks / Cloud Native Buildpacks) detect your language from the manifest, install dependencies, inject the Functions Framework, and produce an OCI image pushed to Artifact Registry, all via Cloud Build. You influence the build with:

The gotcha: the --entry-point must exactly match the symbol your code exports, and your dependency manifest must be present at the source root — a missing requirements.txt/package.json is the most common build failure.

HTTP function signatures (per language)

An HTTP function receives a request and writes a response.

// Node.js (index.js): uses @google-cloud/functions-framework
const functions = require('@google-cloud/functions-framework');
functions.http('hello', (req, res) => {
  res.status(200).send(`Hello ${req.query.name || 'world'}`);
});

# Python (main.py): uses functions-framework (Flask request)
import functions_framework

@functions_framework.http
def hello(request):
    name = request.args.get("name", "world")
    return f"Hello {name}", 200

// Go (function.go): package + init registration
package p

import (
  "fmt"
  "net/http"

  "github.com/GoogleCloudPlatform/functions-framework-go/functions"
)

// init registers Hello under the name passed to --entry-point.
func init() { functions.HTTP("Hello", Hello) }

func Hello(w http.ResponseWriter, r *http.Request) {
  fmt.Fprint(w, "Hello world")
}

// Java: implements HttpFunction
import com.google.cloud.functions.*;
import java.io.*;

public class Hello implements HttpFunction {
  public void service(HttpRequest req, HttpResponse res) throws IOException {
    res.getWriter().write("Hello world");
  }
}

// .NET (C#): implements IHttpFunction
using System.Threading.Tasks;
using Google.Cloud.Functions.Framework;
using Microsoft.AspNetCore.Http;

public class Hello : IHttpFunction {
  public async Task HandleAsync(HttpContext context) =>
    await context.Response.WriteAsync("Hello world");
}

# Ruby (app.rb): Functions Framework
require "functions_framework"
FunctionsFramework.http("hello") do |request|
  "Hello world"
end

<?php
// PHP (index.php)
use Psr\Http\Message\ServerRequestInterface;

function hello(ServerRequestInterface $request): string {
  return 'Hello world';
}

CloudEvent (event-driven) function signatures

An event-driven function receives a CloudEvent and returns nothing to a caller. Use the framework’s CloudEvent registration:

# Python: CloudEvent handler (e.g. Pub/Sub or Storage)
import base64
import functions_framework

@functions_framework.cloud_event
def on_event(cloud_event):
    data = cloud_event.data
    print("event id:", cloud_event["id"], "type:", cloud_event["type"])
    # Pub/Sub only: the payload is base64-encoded in data["message"]["data"]
    if isinstance(data, dict) and "message" in data:
        print("payload:", base64.b64decode(data["message"].get("data", "")).decode())

// Node.js: CloudEvent handler
const functions = require('@google-cloud/functions-framework');
functions.cloudEvent('onEvent', (cloudEvent) => {
  console.log('type', cloudEvent.type, 'subject', cloudEvent.subject);
});

The gotcha: in 2nd gen, all event functions receive the CloudEvents format — if you are porting 1st-gen background functions (which used the legacy (data, context) signature), you must switch to the CloudEvent signature.
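
One way to ease that port is to keep the business logic in a function that knows nothing about either signature and call it from a thin wrapper. The sketch below is an assumption-laden illustration: `process_message` is a hypothetical name, the wrappers are shown without their framework decorators, and the payload shape inside `data` can also differ between generations, which this sketch does not handle.

```python
def process_message(data: dict, event_id: str, event_type: str) -> tuple:
    """Shared logic, independent of how the platform delivers the event."""
    return (event_id, event_type, len(data))


def legacy_background(data, context):
    """1st-gen style: payload and metadata arrive as two arguments."""
    return process_message(data, context.event_id, context.event_type)


def on_cloud_event(cloud_event):
    """2nd-gen style: one CloudEvent carries the payload and its attributes.

    In a deployed function this would be registered with
    @functions_framework.cloud_event.
    """
    return process_message(cloud_event.data, cloud_event["id"], cloud_event["type"])
```

With this split, moving to 2nd gen means changing the wrapper and the deploy flags while the tested logic stays put.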

Scaling: instances, concurrency, cold starts, and sizing

This is where 2nd gen earns its keep. You never set an instance count; you set bounds and per-instance behaviour, and the platform sizes the fleet.

Lever | Flag | Default | Range / choices | Effect
Min instances | --min-instances | 0 (scale to zero) | 0 … max | Keep N warm to avoid cold starts; you pay to keep them alive
Max instances | --max-instances | platform default (e.g. 100) | up to 1,000 (2nd gen) | Cap the fan-out (protect downstreams, bound cost)
Concurrency (2nd gen) | --concurrency | 1 (safe default) | 1 … 1000 | Requests handled simultaneously per instance
CPU | --cpu | derived from memory | up to 8 vCPU | Compute per instance
Memory | --memory | 256 MiB | up to 32 GiB (2nd gen) | RAM per instance (includes /tmp)
Timeout | --timeout | 60 s | up to 3600 s HTTP (2nd gen) | Max wall-clock per request
CPU boost | --cpu-boost (inherited from Cloud Run) | off | on/off | Extra CPU during startup to cut cold-start latency

Min and max instances, and scale-to-zero

Per-instance concurrency (the 2nd-gen superpower)

In 1st gen, one instance = one request at a time, full stop. In 2nd gen, --concurrency=N lets a single instance serve up to N requests simultaneously (default 1, max 1000). For I/O-bound work (calling APIs, waiting on a DB), raising concurrency to, say, 80 means one instance does the work of dozens — far fewer instances, far fewer cold starts, far lower cost.

gcloud functions deploy api \
  --gen2 --region=us-central1 --runtime=nodejs20 \
  --source=. --entry-point=api --trigger-http \
  --concurrency=80 --cpu=1 --memory=512Mi \
  --min-instances=1 --max-instances=100
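
The effect of concurrency on fleet size can be estimated with simple arithmetic: the number of requests in flight is roughly arrival rate times average latency, and each instance offers `concurrency` slots. The sketch below is a back-of-envelope model only; it ignores CPU saturation, bursts, and how the autoscaler actually targets utilisation, and the function name is hypothetical.

```python
import math


def instances_needed(requests_per_second: float, avg_latency_s: float, concurrency: int) -> int:
    """Estimate steady-state instances as in-flight requests per available slot."""
    in_flight = requests_per_second * avg_latency_s  # Little's law: L = lambda * W
    return math.ceil(in_flight / concurrency)
```

For example, 320 requests per second at 0.25 s each means 80 requests in flight: about 80 instances at concurrency 1, 8 at concurrency 10, and 1 at concurrency 80.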

Cold starts

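A cold start is the latency of spinning up a fresh instance: pull the image, boot the runtime, run your initialisation code, then handle the request. Reduce it by keeping instances warm with --min-instances, enabling startup CPU boost, and initialising heavy clients lazily.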
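
Initialisation code is the part of a cold start you control directly. A minimal sketch of lazy initialisation follows, assuming a hypothetical expensive client that is simulated here with a short sleep: it is built on first use and then cached for the lifetime of the instance.

```python
import functools
import time


@functools.lru_cache(maxsize=None)
def get_client() -> dict:
    """Build the expensive client once, then reuse it on the warm instance."""
    time.sleep(0.01)  # stand-in for a slow constructor (hypothetical)
    return {"created_at": time.monotonic()}


def handler(request_path: str) -> str:
    # Paths that do not need the client never pay its start-up cost.
    if request_path == "/healthz":
        return "ok"
    client = get_client()
    return f"served with client created at {client['created_at']:.3f}"
```

The trade-off is that the first request needing the client absorbs the delay instead of the instance start, so this suits clients that only some code paths use.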

CPU, memory, and timeout sizing

Networking: ingress, VPC connector, and Direct VPC egress

By default a function reaches the public internet for egress and is reachable per its trigger. To talk to private resources (a VM, Cloud SQL via private IP, an internal load balancer) or to lock down who can reach it, configure networking.

Egress to your VPC: two options

Option | Flag | How it works | When to use | Trade-off
Serverless VPC Access connector | --vpc-connector=NAME | Routes egress through a managed connector (a small managed instance group) in a /28 subnet | Mature, works in both gens; cross-project/shared-VPC patterns | You provision and pay for the connector; it can be a throughput bottleneck
Direct VPC egress | --network=NET --subnet=SUBNET | Assigns the function instances IPs directly in your subnet (no connector) | 2nd gen, lower latency, higher throughput, lower cost | Newer; consumes subnet IPs; some constraints vs connector

Ingress: who can reach the function

--ingress-settings (2nd gen, inherited from Cloud Run) accepts all, internal-only, or internal-and-gclb:

gcloud functions deploy private-fn \
  --gen2 --region=us-central1 --runtime=python312 \
  --source=. --entry-point=handler --trigger-http \
  --network=my-vpc --subnet=my-subnet \
  --egress-settings=private-ranges-only \
  --ingress-settings=internal-only
# Flag availability varies by gcloud release; confirm with: gcloud functions deploy --help

Environment variables and Secret Manager

gcloud functions deploy with-secret \
  --gen2 --region=us-central1 --runtime=python312 \
  --source=. --entry-point=handler --trigger-http \
  --set-env-vars=LOG_LEVEL=info \
  --set-secrets=DB_PASSWORD=projects/PROJECT_ID/secrets/db-pass:latest
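
On the code side, a function deployed this way can read both values from its environment. A minimal sketch, assuming the secret is exposed as the `DB_PASSWORD` environment variable as in the command above; `load_config` is a hypothetical helper name.

```python
import os


def load_config(env=os.environ) -> dict:
    """Read plain settings and the injected secret from the environment."""
    password = env.get("DB_PASSWORD")
    if not password:
        # Fail fast at start-up rather than on the first database call.
        raise RuntimeError("DB_PASSWORD is not set; check --set-secrets")
    return {"log_level": env.get("LOG_LEVEL", "info"), "db_password": password}
```

Passing `env` as a parameter keeps the helper easy to test locally with a plain dict, without touching the real process environment.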

Service identity and the invoker role

Every function runs as a service account — its identity for calling other Google Cloud APIs — and access to invoke it is controlled separately by IAM.

gcloud run services add-iam-policy-binding api \
  --region=us-central1 \
  --member="serviceAccount:caller@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/run.invoker"
# (or the gcloud functions add-invoker-policy-binding shorthand)
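
Once a caller holds the invoker role, it presents an identity token as a Bearer credential. The sketch below only builds the request and does not send it; obtaining the token (for example from the metadata server or a client library) is out of scope, and the token's audience is assumed to be the function URL. `build_invoke_request` is a hypothetical helper name.

```python
import urllib.request


def build_invoke_request(url: str, id_token: str) -> urllib.request.Request:
    """Attach an identity token as a Bearer credential for a private function."""
    return urllib.request.Request(
        url,
        data=b"{}",
        method="POST",
        headers={
            "Authorization": f"Bearer {id_token}",
            "Content-Type": "application/json",
        },
    )
```

Sending the request with `urllib.request.urlopen` against a private function returns 403 if the token is missing or the caller lacks the invoker role.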

Google Cloud Functions architecture: triggers, the 2nd-gen Cloud Run plus Eventarc engine, runtimes and buildpacks, scaling, networking, and identity

The diagram traces a request or event from its trigger (HTTP, Pub/Sub, Storage, Firestore, Eventarc, Scheduler) through the 2nd-gen engine (Cloud Run service + Eventarc delivering CloudEvents), into your runtime (built by buildpacks), where the autoscaler sizes instances by min/max and concurrency, while VPC egress reaches private resources and the service account governs identity.

Hands-on lab

Deploy an HTTP function and an event-driven function on the Free Tier, exercise scaling, and clean up. Use Cloud Shell (no local setup) and a project where you can create functions. Cloud Functions includes a generous monthly free allotment, so this lab should cost effectively ₹0.

1. Set defaults and enable APIs.

gcloud config set project YOUR_PROJECT_ID
gcloud config set functions/region us-central1
gcloud services enable cloudfunctions.googleapis.com run.googleapis.com \
  cloudbuild.googleapis.com eventarc.googleapis.com artifactregistry.googleapis.com \
  pubsub.googleapis.com

2. Create an HTTP function (Python).

mkdir cf-lab && cd cf-lab
cat > main.py <<'PY'
import functions_framework

@functions_framework.http
def hello(request):
    name = request.args.get("name", "world")
    return f"Hello {name} from Cloud Functions 2nd gen\n", 200
PY
cat > requirements.txt <<'TXT'
functions-framework==3.*
TXT

gcloud functions deploy http-hello \
  --gen2 --runtime=python312 --source=. --entry-point=hello \
  --trigger-http --allow-unauthenticated \
  --concurrency=80 --cpu=1 --memory=256Mi \
  --min-instances=0 --max-instances=5

3. Validate the HTTP function.

URL=$(gcloud functions describe http-hello --gen2 --region=us-central1 \
      --format='value(serviceConfig.uri)')
curl "$URL?name=Vinod"
# Expected: Hello Vinod from Cloud Functions 2nd gen

4. Confirm it is really a Cloud Run service (the 2nd-gen proof).

gcloud run services list --region=us-central1 --filter="metadata.name=http-hello"
# The function appears as a Cloud Run service — that is the engine.

5. Create a Pub/Sub-triggered function.

gcloud pubsub topics create demo-topic

cat > main.py <<'PY'
import base64, functions_framework

@functions_framework.cloud_event
def on_message(cloud_event):
    msg = cloud_event.data["message"]
    data = base64.b64decode(msg.get("data", "")).decode() if msg.get("data") else ""
    print(f"Received message: {data!r}")
PY

gcloud functions deploy on-message \
  --gen2 --runtime=python312 --source=. --entry-point=on_message \
  --trigger-topic=demo-topic --min-instances=0 --max-instances=3

6. Validate the event function.

gcloud pubsub topics publish demo-topic --message="hello events"
# Read logs after a few seconds:
gcloud functions logs read on-message --gen2 --region=us-central1 --limit=20
# Expected: a line like  Received message: 'hello events'

7. Cleanup (do this to avoid lingering resources).

gcloud functions delete http-hello --gen2 --region=us-central1 --quiet
gcloud functions delete on-message --gen2 --region=us-central1 --quiet
gcloud pubsub topics delete demo-topic --quiet
# Optional: remove the Eventarc trigger if it lingers
gcloud eventarc triggers list --location=us-central1

Cost note. Both functions scale to zero (--min-instances=0), so they cost nothing when idle, and a handful of test invocations sits well within the monthly free tier. The only thing that would cost money is leaving --min-instances ≥ 1 running, or a runaway retry loop on the Pub/Sub function — which is why we cap --max-instances and clean up.

Common mistakes & troubleshooting

Symptom | Likely cause | Fix
Build fails: “entry point not found” | --entry-point doesn’t match the exported function/class name | Make them identical; check the file is at the source root
Build fails: missing dependencies | No requirements.txt/package.json at source root, or wrong name | Add the manifest at the root; verify .gcloudignore isn’t excluding it
HTTP function returns 403 on invoke | Function requires auth; caller lacks run.invoker (or no token) | Grant roles/run.invoker, or --allow-unauthenticated for public
Event trigger never fires | Audit Data Access logs not enabled, or wrong event-type/filter | Enable the logs; verify type/serviceName/methodName filters
Pub/Sub function loops / re-runs | Handler throws → message redelivered (at-least-once) | Make idempotent; add a dead-letter topic; return success on success
Storage function loops forever | Function writes back into the same bucket it triggers on | Write to a different bucket/prefix or filter precisely
2nd gen costs as much as 1st gen | Concurrency left at default 1 | Raise --concurrency for I/O-bound workloads
Cold starts hurt latency | Scale-to-zero + heavy init | Set --min-instances≥1, enable --cpu-boost, lazy-init clients
Can’t reach Cloud SQL / private IP | No VPC egress configured | Add --vpc-connector or Direct VPC egress; fix firewall rules
Permission denied inside the function | Runtime service account lacks the API role | Grant the needed role to the function’s --service-account
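
The "make idempotent" fix for redelivered messages can be sketched as a check on the message ID before doing the work. The class below is an in-memory stand-in for illustration only: because instances are ephemeral, a real deployment would keep the record of processed IDs in an external store. `Deduplicator` and `run_once` are hypothetical names.

```python
class Deduplicator:
    """Remember processed message IDs so a redelivery becomes a no-op."""

    def __init__(self):
        self._seen = set()

    def run_once(self, message_id: str, work) -> bool:
        """Run work() unless message_id was already handled; return True if it ran."""
        if message_id in self._seen:
            return False
        work()
        # Record the ID only after work() succeeds, so a failure is still retried.
        self._seen.add(message_id)
        return True
```

Recording the ID after the work, rather than before, means a crash mid-way leads to a retry instead of a silently dropped message, at the price of the work itself needing to tolerate a partial earlier attempt.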

Best practices

Security notes

Interview & exam questions

1. Why is 2nd-gen Cloud Functions “Cloud Run plus Eventarc”, and why does it matter? 2nd gen deploys your function as a Cloud Run service and delivers events through Eventarc. It matters because the function inherits Cloud Run’s request concurrency, larger instances (up to 8 vCPU / 32 GiB), longer timeouts (up to 60 min HTTP), revisions/traffic splitting, and Cloud Run networking — and Eventarc’s 90+ event sources delivered as standard CloudEvents.

2. What is the single biggest cost/scaling difference between 1st and 2nd gen? Per-instance concurrency. 1st gen serves one request per instance; 2nd gen can serve up to 1000 per instance (--concurrency). For I/O-bound workloads, higher concurrency means far fewer instances, fewer cold starts, and dramatically lower cost.

3. A function must run only when a Firestore document under users/{id} is updated. How do you wire it? Deploy a 2nd-gen function with an Eventarc Firestore trigger: --trigger-event-filters="type=google.cloud.firestore.document.v1.updated", --trigger-event-filters="database=(default)", and --trigger-event-filters-path-pattern="document=users/{userId}".

4. Your Pub/Sub-triggered function keeps re-processing the same message. Why, and what do you do? Pub/Sub delivery is at-least-once; if the handler throws (or doesn’t finish), the message is redelivered. Make the handler idempotent, ensure it returns success on success, and configure a dead-letter topic to stop infinite retries.

5. How do you eliminate cold starts for a latency-sensitive function, and what’s the cost? Set --min-instances ≥ 1 to keep warm instances ready (and optionally --cpu-boost). The cost is that those warm instances bill even when idle, so pick the smallest count that meets the latency target.

6. How does a function reach a Cloud SQL instance over private IP? Give it VPC egress — either a Serverless VPC Access connector (--vpc-connector) or Direct VPC egress (--network/--subnet) — with appropriate egress settings and firewall rules. Without VPC egress the function can only reach public endpoints.

7. What’s the difference between the invoker role and the function’s service account? roles/run.invoker controls who may call the function. The function’s runtime service account controls what the function can do (which Google APIs it can call). They are independent; confusing them yields either 403-on-invoke or permission-denied-inside.

8. How do you implement a nightly cron job with Cloud Functions? Deploy an HTTP function (authenticated), then create a Cloud Scheduler job that calls its URL on a unix-cron schedule using an OIDC token from a service account with run.invoker. Cloud Functions has no built-in scheduler.

9. When would you choose Cloud Run over Cloud Functions? When you need a full container (multiple endpoints/routes, a web framework, custom runtime, system libraries, gRPC), more than one function’s worth of code, or behaviours like fine-grained traffic management — i.e. when “one function” no longer fits. (2nd-gen Functions is Cloud Run, so it’s really “function-shaped vs container-shaped”.)

10. How do you keep an HTTP function private? Don’t use --allow-unauthenticated; require auth and grant run.invoker only to specific principals. Optionally set --ingress-settings=internal-only (or internal-and-gclb) so it isn’t reachable from the public internet at all.

11. What are the max timeout values, and how do they differ by generation/trigger? 1st gen: 9 minutes (540 s) for all functions. 2nd gen: 60 minutes (3600 s) for HTTP functions, 9 minutes for event-driven functions.

12. Why might a Cloud Storage trigger loop infinitely? Because the function writes a new object back into the same bucket it is triggered on, which fires the trigger again. Write outputs to a different bucket/prefix, or filter the event type/path precisely.

Quick check

  1. What two Google products are the engine of 2nd-gen Cloud Functions?
  2. What is the default per-instance concurrency in 2nd gen, and what’s the max?
  3. Which flag keeps warm instances to avoid cold starts?
  4. Which IAM role lets a caller invoke an HTTP function?
  5. Name the two ways a function can send egress traffic into your VPC.

Answers

  1. Cloud Run (execution) and Eventarc (event delivery).
  2. Default 1; maximum 1000 (--concurrency).
  3. --min-instances (set to ≥ 1).
  4. roles/run.invoker (on the underlying Cloud Run service).
  5. Serverless VPC Access connector (--vpc-connector) and Direct VPC egress (--network/--subnet).

Exercise

Build a small image-thumbnail pipeline, exhausting several options at once:

  1. Create two buckets: SRC-uploads and SRC-thumbs (replace SRC with a unique prefix).
  2. Deploy a 2nd-gen Cloud Storage function on SRC-uploads for google.cloud.storage.object.v1.finalized that reads the uploaded image and writes a resized copy into SRC-thumbs. Set --memory=512Mi, --cpu=1, --concurrency=10, --max-instances=5, and a dedicated least-privilege service account with object read on the source and write on the thumbs bucket.
  3. Use Secret Manager to store a dummy “watermark key” and mount it with --set-secrets; grant the SA secretAccessor.
  4. Upload an image to SRC-uploads, confirm a thumbnail appears in SRC-thumbs, and read the logs.
  5. Deliberately demonstrate the loop gotcha: explain (in a comment) why writing the thumbnail back into SRC-uploads would re-trigger the function, and confirm your design writes to SRC-thumbs instead.
  6. Tear everything down: delete the function, the buckets, and the secret.

Success criteria: a thumbnail is produced for each upload, the function runs as a least-privilege SA, the secret is mounted (not in env vars), and you can articulate why the source/destination split prevents an infinite loop.

Certification mapping

Glossary

Next steps
