Azure Functions and Serverless Patterns: Event-Driven Compute

A team ran a webhook handler on a dedicated VM. The VM cost money every hour of every day even though the endpoint received a few thousand calls between 9am and 6pm and nothing overnight. Worse, the VM needed patching, the disk filled with logs nobody read, and a kernel update once took the endpoint down for forty minutes. Moving the handler to Azure Functions cut the compute bill by roughly 90%, removed the patching entirely, and — because the platform scaled the handler from zero to dozens of instances on its own — survived a traffic spike that would have toppled the single VM. That is the serverless trade in one story: you stop renting a machine and start renting executions, and you hand the platform the jobs (provisioning, scaling, patching, load-balancing) you used to do by hand.

Azure Functions is Azure’s Functions-as-a-Service offering: you write a function — a small piece of code with a single entry point — and declare the event that triggers it (an HTTP request, a queue message, a blob upload, a timer, a Cosmos DB change) plus the inputs and outputs it binds to (read a document, write to a queue, push to Event Hubs) declaratively, so you write logic, not client boilerplate. The platform runs your function only when its trigger fires, scales the number of concurrent instances to match the event rate, and — on the serverless plans — bills you per execution and per gigabyte-second of memory, dropping to zero when nothing is happening. This is the natural home for glue code, automation, event processing, lightweight APIs and scheduled jobs.

This article is the working reference a senior engineer keeps open. We go plan by plan (Consumption, Flex Consumption, Premium, Dedicated and Container Apps), trigger by trigger and binding by binding, through the scale controller that decides how many instances you get and the cold start that makes the first request slow, into concurrency and partitioning (the knobs that decide throughput and ordering), and across the Durable Functions patterns — function chaining, fan-out/fan-in, async HTTP, monitor, human interaction and aggregator — that let stateless functions run stateful, long-running workflows. Every concept carries the real limits (timeouts, payload sizes, instance caps), an az/Bicep snippet where it applies, and — because half of all Functions incidents are the same dozen mistakes — a symptom→cause→confirm→fix playbook. Read the prose once; keep the tables open when you are building or on call.

By the end you will know which plan to pick and what each one actually fixes, why your function fired twice and how to make that safe, why messages piled up in a poison queue at 2am, why a Premium plan still cold-started, and how to wire identity, networking and observability so the thing is production-grade rather than a demo that happened to ship.

What problem this solves

Most real work in a cloud system is event-shaped, not request-shaped. A file lands in storage and needs a thumbnail. An order message arrives on a queue and needs validating. A timer fires at 02:00 and a cleanup must run. A row changes in Cosmos DB and a downstream cache must update. A webhook calls in and a record must be written. None of these need a server sitting idle waiting; they need code that runs when the event happens and then stops. Running that code on always-on infrastructure (a VM, an always-warm App Service, a Kubernetes deployment) means paying 24/7 for capacity used a fraction of the time, plus owning the patching, scaling rules and load-balancing yourself.

Without serverless, the pain is concrete: you over-provision for the peak (a flash sale, a nightly batch) and waste money the other 23 hours; or you under-provision and the spike takes you down. You write the same connection-management, retry and dead-letter plumbing for every integration. You patch OS and runtime on a schedule that competes with shipping features. You build autoscaling rules and hope they react fast enough. And when traffic genuinely goes to zero overnight, you keep paying anyway.

Who hits this: anyone building integrations and automation (the classic “glue” between SaaS, queues, storage and databases), event processors (image/file pipelines, IoT and telemetry, change-feed reactors), lightweight or spiky APIs (webhooks, back-office endpoints, bursty public APIs), and scheduled jobs (reports, cleanups, syncs). Azure Functions removes the server from all of them — you provide the handler and the trigger, Azure provides everything else. But it is not a universal hammer: long-running compute, very low-latency APIs that cannot tolerate any cold start, and workloads that need persistent local state fit other models better, and a big part of using Functions well is knowing where its edges are.

The whole field, framed before the deep dive — the event source, the question it forces, and where Functions fits:

Workload shape	What triggers it	The serverless win	When Functions is wrong
Webhook / lightweight API	HTTP request	Scale-to-zero; pay per call; no VM	Strict sub-100 ms p99 with no cold-start tolerance
Event/stream processing	Queue, Event Hubs, Service Bus, Event Grid	Auto-scale to the backlog; built-in checkpointing	Heavy stateful stream joins (use Stream Analytics/Flink)
File / blob pipeline	Blob trigger / Event Grid on Storage	Runs per file; fans out automatically	Very high-rate blob events (prefer Event Grid source)
Scheduled job	Timer (CRON)	No always-on host for a nightly task	Sub-second scheduling precision
Change reactor	Cosmos DB / SQL change feed	Reacts to data changes without polling	Need transactional consistency across writes
Long workflow / orchestration	Durable Functions	Stateful, long-running, checkpointed	Single sub-second synchronous call

Learning objectives

By the end of this article you can:

Choose the right hosting plan (Consumption, Flex Consumption, Premium/Elastic Premium, Dedicated/App Service, Container Apps) for a given latency, scale, networking and cost profile — and explain what each one fixes.
Wire any of the core triggers (HTTP, Timer, Queue Storage, Service Bus, Event Hubs, Event Grid, Blob, Cosmos DB) and use input/output bindings to read and write Azure services without client boilerplate.
Explain how the scale controller decides instance count per trigger type, why cold starts happen, and how Premium pre-warmed instances, Flex Consumption alwaysReady instances and concurrency tuning reduce them.
Tune concurrency, batching and partitioning (host.json batchSize, maxConcurrentCalls, Event Hubs partitions, sessions) to trade throughput against ordering and downstream pressure.
Implement the Durable Functions patterns — function chaining, fan-out/fan-in, async HTTP API, monitor, human interaction and aggregator (entities) — and reason about replay, determinism and the task hub.
Make event handlers idempotent and poison-safe: handle at-least-once delivery, retries, dead-letter/poison queues and out-of-order events.
Secure and isolate a function app with managed identity, Key Vault references, VNet integration and private endpoints, and observe it with Application Insights end to end.
Read the limits and error reference (timeouts, payload sizes, instance caps, host errors) and right-size cost on the serverless and Premium plans.

Prerequisites & where this fits

You should be comfortable with the Azure basics: a resource group, an App Service plan vs a serverless plan, running az in Cloud Shell, reading JSON output, and the idea of a managed identity. Familiarity with HTTP, queues and JSON helps; you do not need deep Kubernetes or messaging-broker knowledge — we build it up. A function app always needs a backing storage account (it stores triggers’ state, the Durable task hub, and runtime metadata there), so a passing familiarity with Azure Storage account fundamentals is useful.

This sits in the Compute / Serverless track and is the event-driven sibling of the request-driven PaaS world. The decision of whether serverless functions are the right compute at all lives upstream in Azure App Service vs Container Apps vs AKS; read that first if you are still choosing a model. Once you are running Functions in production, the operational reflexes transfer directly from Troubleshooting Azure App Service: 502/503, Cold Starts & Restart Loops — the front-end/worker mental model and Application Insights workflow are the same. Functions almost always read secrets via Azure Key Vault: Secrets, Keys & Certificates and config via Azure App Configuration: Feature Flags, Dynamic Config & Key Vault References, and you observe them through Azure Monitor & Application Insights for Observability. When a function needs private outbound to a database, Azure Private Endpoint vs Service Endpoint is the networking decision it forces.

A quick map of who owns what when a function misbehaves, so you escalate to the right place fast:

Layer	What lives here	Who usually owns it	Failure classes it causes
Event source (queue/hub/blob)	Messages, partitions, backlog	App / platform team	Backlog growth, duplicate delivery, ordering
Trigger + scale controller	Instance count decision, polling	Microsoft (platform)	Slow scale-out, no scale (host down), cold start
Function host (runtime)	Your code, bindings, concurrency	App / dev team	Crash, timeout, throttled downstream, OOM
Backing storage account	Trigger state, Durable task hub	App + platform	Host won’t start, Durable stalls, throttling
Identity & config	Managed identity, KV refs, settings	App + platform	Boot failure, 403 to dependencies
Network (VNet / PE)	Outbound to DB/PaaS, DNS	Platform + network	Timeouts, name-resolution failures

Core concepts

Six mental models make every later section obvious.

A function is a handler plus a trigger. The unit of work is a function: one entry point with exactly one trigger (the event that starts it) and zero or more bindings (declarative inputs and outputs). One or more functions live inside a function app, which is the deployment, scaling and configuration boundary — the function app is what you create, scale, give an identity, and put on a plan. All functions in an app share the app’s plan, settings, identity and storage account.

The trigger defines the contract; bindings remove the boilerplate. A trigger delivers an event payload and starts execution (an HTTP request body, a queue message, a blob stream). An input binding hands your function data pulled from a service before it runs (a Cosmos DB document keyed off the trigger); an output binding writes your function’s return value to a service after it runs (append to a queue, upsert a document). Bindings are declared in attributes/decorators or function.json, so the SDK manages the client, connection and serialization for you — you read and write parameters, not SDK objects.

The platform decides scale; you decide concurrency. Azure does not run your function on a server you manage. A component called the scale controller watches each trigger’s signal (HTTP request rate, queue length, Event Hubs lag) and adds or removes instances (worker sandboxes) to keep up — from zero to the plan’s maximum. Within each instance, concurrency settings decide how many invocations run at once. Scale (instances) is the platform’s job; concurrency (per-instance parallelism) is yours, and the two multiply into throughput.
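
To make "the two multiply into throughput" concrete, here is a minimal back-of-envelope sketch. The function name and the numbers are hypothetical, and it assumes every invocation takes the same time and the downstream never throttles, so treat the result as an upper bound rather than a forecast.

```python
def max_throughput_per_second(instances: int,
                              concurrency_per_instance: int,
                              avg_seconds_per_invocation: float) -> float:
    """Upper bound on invocations per second.

    instances                 -> decided by the platform (scale controller)
    concurrency_per_instance  -> decided by you (host.json / settings)
    """
    if avg_seconds_per_invocation <= 0:
        raise ValueError("avg_seconds_per_invocation must be positive")
    in_flight = instances * concurrency_per_instance  # invocations running at once
    return in_flight / avg_seconds_per_invocation


# Hypothetical workload: 10 instances, 16 concurrent invocations each, 250 ms per call.
print(max_throughput_per_second(10, 16, 0.25))  # 640.0
```

Halving concurrency to protect a fragile downstream halves this ceiling unless the platform doubles the instance count, which is why the two knobs are tuned together.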

Stateless by default; stateful on purpose. A plain function is stateless — it must not rely on in-memory state surviving between invocations, because the next invocation may run on a different instance (or the instance may have been recycled). State lives outside the function (a database, a queue, a cache). When you genuinely need stateful, long-running coordination — “call A, then B, wait for approval, then C” running for minutes, hours or days — Durable Functions provides it via an orchestrator that checkpoints its progress to storage and replays deterministically.
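
The checkpoint-and-replay idea can be shown with a toy model. This is not the Durable Functions API: the orchestrator, the activity table and `run_with_replay` are all hypothetical stand-ins, and the "history" is a plain list rather than a task hub. It only illustrates why the orchestrator must be deterministic: it is re-run from the top, and recorded results are fed back in instead of calling the activities again.

```python
def orchestrator():
    """Deterministic workflow: call A, then feed its result to B."""
    a = yield ("step_a", 1)
    b = yield ("step_b", a)
    return a + b


# Hypothetical activities; in a real app these would do I/O.
ACTIVITIES = {
    "step_a": lambda x: x + 1,
    "step_b": lambda x: x * 10,
}


def run_with_replay(history: list, calls: list):
    """Run the orchestrator from the start.

    Results already in `history` are replayed without executing the activity;
    new steps are executed, recorded in `calls`, and checkpointed to `history`.
    """
    gen = orchestrator()
    step = 0
    try:
        request = next(gen)
        while True:
            if step < len(history):
                result = history[step]          # replay: reuse the checkpoint
            else:
                name, arg = request
                result = ACTIVITIES[name](arg)  # first execution of this step
                calls.append(name)
                history.append(result)          # checkpoint the outcome
            step += 1
            request = gen.send(result)
    except StopIteration as done:
        return done.value


# Simulate a restart after step_a had already completed and been checkpointed.
history, calls = [2], []
print(run_with_replay(history, calls), calls, history)  # 22 ['step_b'] [2, 20]
```

If the orchestrator body read the clock or a random number directly, the replayed run could take a different path than the recorded history, which is the failure the determinism rules exist to prevent.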

Delivery is at-least-once; design for it. Queue, Service Bus, Event Hubs and Event Grid triggers deliver at least once — under retries, redelivery or scale events, your function can see the same message more than once and events can arrive out of order. This is not a bug to fix; it is a property to design around with idempotency (processing the same event twice has the same effect as once) and poison/dead-letter handling (a message that keeps failing is set aside, not retried forever).
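
A minimal sketch of idempotent handling, assuming each event carries a stable id. `OrderStore` is a hypothetical in-memory stand-in; in a real system the "seen before?" check and the write must be one atomic operation (a unique key, a conditional write or a transaction), otherwise two instances can both pass the check.

```python
from dataclasses import dataclass, field


@dataclass
class OrderStore:
    """In-memory stand-in for a durable store keyed by event id."""
    processed_ids: set = field(default_factory=set)
    balance: int = 0

    def apply_once(self, event_id: str, amount: int) -> bool:
        """Apply the event unless it was already applied. Returns True if applied."""
        if event_id in self.processed_ids:
            return False                  # duplicate delivery: no effect
        self.processed_ids.add(event_id)
        self.balance += amount
        return True


store = OrderStore()
# evt-1 is delivered twice, as an at-least-once trigger is allowed to do.
deliveries = [("evt-1", 100), ("evt-2", 50), ("evt-1", 100)]
results = [store.apply_once(event_id, amount) for event_id, amount in deliveries]
print(results, store.balance)  # [True, True, False] 150
```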

No server, but always a storage account. “Serverless” means you do not manage servers — it does not mean there is no state. Every function app is bound to a storage account (the AzureWebJobsStorage connection) that holds runtime metadata, trigger leases/checkpoints, the Durable task hub, and (for some plans) the deployment package. If that storage account is unreachable, throttled, or its keys rotate without updating the setting, the host fails to start — a surprising amount of “Functions is down” is really “the storage account is unhappy.”

The vocabulary in one table

Pin down every moving part before the deep sections. The glossary repeats these for lookup; this is the mental model side by side:

Term	One-line definition	Where it lives	Why it matters
Function	One handler with one trigger + bindings	Inside a function app	The unit of execution and billing
Function app	Deployment/scaling/config boundary	On a plan	What you create, scale, give identity
Trigger	The event that starts a function	Per function	Defines payload + scaling signal
Binding	Declarative input/output to a service	Per function	Removes client boilerplate
Hosting plan	Where/how the app runs and is billed	Per function app	Decides scale, cold start, cost, networking
Scale controller	Platform component that adds/removes instances	Microsoft-managed	Decides how fast you scale to load
Instance	A worker sandbox running your app	On the plan	Cold start happens when a new one spins up
Concurrency	Invocations running at once per instance	host.json / settings	Throughput vs downstream pressure
Cold start	First-request latency on a fresh instance	Instance lifecycle	Slow first call; mitigated, not eliminated
`AzureWebJobsStorage`	The app’s backing storage connection	App setting	Host won’t start if it’s broken
Durable Functions	Stateful orchestration on top of Functions	Extension + task hub	Long-running, checkpointed workflows
Task hub	Durable’s state store (queues + tables)	In the storage account	Where orchestration progress is persisted
Poison / dead-letter	Where repeatedly-failing messages go	Queue/Service Bus	Stops infinite retry of a bad message
Managed identity	The app’s Entra identity for auth	On the function app	Passwordless access to KV/DB/Storage

Hosting plans: pick the one that fits, not the cheapest by default

The single highest-leverage decision is the hosting plan. It determines how the app scales, whether it ever scales to zero, how cold starts behave, what networking it can do, the maximum timeout, and how you are billed. Picking “Consumption because it’s cheapest” and then fighting cold starts and VNet limits for a month is the most common early mistake.

There are five plans in practice. Consumption is the original serverless plan: scale-to-zero, pay per execution, modest cold starts, and a 5-minute default timeout with a hard 10-minute maximum. Flex Consumption is the modern serverless plan: scale-to-zero, per-instance concurrency control, VNet integration, alwaysReady instances to remove cold starts on the paths that need it, and per-instance memory you choose. It is the default recommendation for new builds. Premium (Elastic Premium, EP) gives pre-warmed instances (no cold start), VNet integration, longer/unbounded timeouts and more memory, billed per vCPU/GB allocated. Dedicated (App Service plan) runs Functions on a plan you already pay for (good for steady load or co-locating with web apps), with no scale-to-zero. Container Apps hosts a containerized function app on the Container Apps/KEDA platform when you want microservices, Dapr, or container parity.

Lay the five plans side by side on the axes that actually decide the choice:

Plan	Scale-to-zero	Cold starts	Max timeout	VNet integration	Billing model	Best for
Consumption	Yes	Yes (modest)	5 min default, 10 min max	No (legacy: limited)	Per-execution + GB-s	Spiky/low-traffic glue, demos
Flex Consumption	Yes	Yes — killed with `alwaysReady`	Configurable, long	Yes (built-in)	Per-execution + GB-s + alwaysReady	New serverless builds (default)
Premium (EP1–EP3)	No (min 1)	None (pre-warmed)	Unbounded (default 30 min)	Yes	Per vCPU/GB allocated (always-on)	Steady + need warm + VNet + long runs
Dedicated (App Service)	No	Per App Service rules	Unbounded (Always On)	Yes	App Service plan (instance-hours)	Co-locate with web apps; predictable load
Container Apps	Yes (to 0 via KEDA)	Yes (scale-from-zero)	Long (revision-based)	Yes	vCPU/GB per second	Containers, Dapr, microservices parity

The same plans as a capability grid against the features people actually need:

Capability	Consumption	Flex Consumption	Premium (EP)	Dedicated	Container Apps
Scale to zero	Yes	Yes	No	No	Yes
Pre-warmed / always-ready	No	Yes (`alwaysReady`)	Yes (pre-warmed count)	n/a (Always On)	No (min replicas)
VNet integration	No	Yes	Yes	Yes	Yes
Per-instance concurrency control	Limited	Yes	Yes	Yes	Yes (KEDA)
Choose instance memory	No	Yes	Yes (EP SKU)	Yes (SKU)	Yes
Unbounded execution time	No (10 min)	Long	Yes	Yes	Long
Deployment slots	No	(evolving)	Yes	Yes	Revisions
Linux + Windows	Both	Linux	Both	Both	Linux (containers)

And the decision as a table — match what you’re feeling to the plan that fixes it:

If you need…	Because…	Pick
Cheapest possible for bursty/low traffic	You pay nothing at idle	Consumption (or Flex)
Scale-to-zero plus no cold start plus VNet	Modern serverless, private deps	Flex Consumption
Zero cold start with steady load and long runs	Latency-sensitive, > 10 min jobs	Premium (EP)
To run on a plan you already pay for	Co-located web apps, steady load	Dedicated
Container image, Dapr, or K8s-style ops	Microservice parity	Container Apps
Strict isolation / dedicated tenancy	Compliance, ASE-style	Dedicated on an App Service Environment (Isolated tier)

Create a Flex Consumption app (the modern default) with az:

RG=rg-fn-prod
LOC=centralindia
STG=stfnprod$RANDOM            # storage account (globally unique)
APP=fn-orders-prod-$RANDOM     # function app (globally unique)

az group create -n $RG -l $LOC -o table
az storage account create -n $STG -g $RG -l $LOC --sku Standard_LRS -o table

# Flex Consumption: choose runtime, version, instance memory, and region
az functionapp create -n $APP -g $RG \
  --storage-account $STG \
  --flexconsumption-location $LOC \
  --runtime dotnet-isolated --runtime-version 8.0 \
  --instance-memory 2048 \
  -o table

The equivalent in Bicep, with system-assigned identity and an alwaysReady instance to remove cold start on the HTTP path:

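param location string = resourceGroup().location
// Assumption: the storage account already exists and has a 'deployments' blob container
param storageAccountName string

resource stg 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: storageAccountName
}

resource plan 'Microsoft.Web/serverfarms@2023-12-01' = {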
  name: 'flex-orders'
  location: location
  sku: { tier: 'FlexConsumption', name: 'FC1' }
  properties: { reserved: true } // Linux
}

resource fnApp 'Microsoft.Web/sites@2023-12-01' = {
  name: 'fn-orders-prod'
  location: location
  kind: 'functionapp,linux'
  identity: { type: 'SystemAssigned' }
  properties: {
    serverFarmId: plan.id
    functionAppConfig: {
      runtime: { name: 'dotnet-isolated', version: '8.0' }
      scaleAndConcurrency: {
        instanceMemoryMB: 2048
        maximumInstanceCount: 100
        alwaysReady: [ { name: 'http', instanceCount: 1 } ] // warm pool for HTTP
      }
      deployment: {
        storage: {
          type: 'blobContainer'
          value: '${stg.properties.primaryEndpoints.blob}deployments'
          authentication: { type: 'SystemAssignedIdentity' }
        }
      }
    }
  }
}

Runtime, language and worker model

Independent of plan, you pick a runtime stack and version. .NET has two models: the isolated worker model (your function runs in its own process, separate from the host and decoupled from the host’s .NET version; this is the recommended model) and the legacy in-process model (being retired). The other stacks (Node.js, Python, Java, PowerShell) always run out-of-process via a language worker. Pick the version deliberately: an unsupported runtime version blocks deploys and security updates.

Stack	Models / notes	Trigger style	When to pick
.NET (isolated)	Out-of-proc; decoupled from host	Attributes	New .NET builds (recommended)
.NET (in-process)	Legacy; tied to host version	Attributes	Existing apps only; migrate off
Node.js (v4 model)	Code-first programming model	`app.http(...)` etc.	JS/TS teams, fast iteration
Python (v2 model)	Decorator-based	`@app.route` etc.	Data/ML glue, scripting
Java	Annotations	`@FunctionName`	JVM shops, Spring-adjacent
PowerShell	Scripting	`function.json`	Ops automation, Azure mgmt
Custom handler	Any language over HTTP	Custom handler contract	Go/Rust/other; in practice usually shipped as a container

Triggers: the event that starts a function

Every function has exactly one trigger. The trigger decides the payload shape, the scaling signal the controller watches, the delivery guarantee, and the failure/retry behaviour. Knowing each trigger’s real limits is the difference between a pipeline that holds under load and one that silently drops or duplicates.

The full trigger catalogue, with the property that bites:

Trigger	Fires on	Delivery guarantee	Scaling signal	Key limit / gotcha
HTTP	Inbound HTTP request	Synchronous (caller-driven)	Request rate	Response within timeout; large bodies via stream
Timer	CRON schedule (NCRONTAB)	Singleton (one instance)	Time	Missed runs on restart unless `RunOnStartup`; 6-field CRON incl. seconds
Queue Storage	New message in a queue	At-least-once	Queue length	64 KB message; 5 dequeues → poison queue
Service Bus	Message in queue/subscription	At-least-once	Active message count	Lock duration; sessions for ordering; 256 KB/1 MB (Premium)
Event Hubs	Event batch on a partition	At-least-once	Partition lag (lease)	One instance per partition; checkpointing; ordering per partition
Event Grid	Discrete event (HTTP push)	At-least-once	Event push	Handshake validation; retries with backoff; dead-letter to blob
Blob (polling)	New/updated blob	At-least-once (eventual)	Scan / receipts	High latency at scale → use Event Grid source
Blob (Event Grid)	Blob event via Event Grid	At-least-once	Event push	Near-real-time; the production choice for blobs
Cosmos DB	Change feed (inserts/updates)	At-least-once	Lease lag	Needs a lease container; no deletes in feed
Durable orchestration	Orchestrator/activity/entity	Internal (replay)	Control queue	Determinism rules; managed by the extension

HTTP trigger

The HTTP trigger turns a function into a web endpoint. It is synchronous: the caller waits for your response, so the request must complete within the front-end timeout (about 230 seconds at the load balancer, which applies regardless of the function timeout you configure). Configure the route, methods, and authorization level (the function-key model): anonymous (no key), function (a function key or host key), admin (the master key). For real auth, put Easy Auth/Entra ID or API Management/Application Gateway in front rather than relying on function keys alone.

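# Read a function's invoke URL
az functionapp function show -g $RG -n $APP --function-name HttpOrders \
  --query "invokeUrlTemplate" -o tsv

# List that function's keys (the "default" key is the one callers pass as ?code=)
az functionapp function keys list -g $RG -n $APP --function-name HttpOrders -o json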

Setting	Values	Default	When to change	Gotcha
`authLevel`	anonymous / function / admin	function	`anonymous` behind APIM/Entra	Keys are not real auth; rotate them
`methods`	GET/POST/PUT/…	GET, POST	Restrict to what you accept	Over-permissive methods = attack surface
`route`	template e.g. `orders/{id}`	function name	Clean REST routing	Route collisions return 404
Response timeout	bounded by LB ~230 s	—	Long work → return 202 + async	Don’t block; use Durable async pattern
Max request body	streamable; ~100 MB practical	—	Large uploads	Buffer vs stream; memory pressure

Timer trigger

A timer fires on an NCRONTAB schedule: a six-field CRON expression that includes seconds ({second} {minute} {hour} {day} {month} {day-of-week}). It is a singleton: only one instance runs the timer (coordinated via a storage lock), so a scaled-out app does not fire the timer N times. Missed occurrences (host was down) are not back-filled unless you opt in. Set RunOnStartup only for development, because it fires on every restart and scale-out event, which produces runs you did not schedule.

// .NET isolated: every day at 02:00:00 (note the leading seconds field)
[Function("NightlyCleanup")]
public void Run([TimerTrigger("0 0 2 * * *")] TimerInfo timer) { /* ... */ }

CRON example	Meaning
`0 */5 * * * *`	Every 5 minutes
`0 0 * * * *`	Every hour, on the hour
`0 0 2 * * *`	Every day at 02:00
`0 30 9 * * 1-5`	09:30, Monday–Friday
`*/30 * * * * *`	Every 30 seconds
`0 0 0 1 * *`	Midnight on the 1st of each month
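
The most common timer mistake is pasting a five-field crontab expression where six fields are expected. A small sketch that labels each field makes the leading seconds field hard to miss. `describe_ncrontab` is a hypothetical helper, not part of any Azure SDK, and it only splits and names the fields; it does not validate ranges or step syntax.

```python
FIELDS = ("second", "minute", "hour", "day", "month", "day-of-week")


def describe_ncrontab(expr: str) -> dict:
    """Label the six whitespace-separated fields of a timer schedule expression."""
    parts = expr.split()
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields (seconds first), got {len(parts)}: {expr!r}"
        )
    return dict(zip(FIELDS, parts))


print(describe_ncrontab("0 30 9 * * 1-5"))
# {'second': '0', 'minute': '30', 'hour': '9', 'day': '*', 'month': '*', 'day-of-week': '1-5'}
```

Running the same helper on a classic five-field expression such as `30 9 * * 1-5` raises a `ValueError`, which is the check worth adding to a deployment pipeline.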

Queue Storage trigger

Fires when a message lands in an Azure Storage queue. Delivery is at-least-once; a message that fails processing is attempted up to 5 times in total (the default maxDequeueCount), then moved to a poison queue named <queue>-poison. Messages are capped at 64 KB (about 48 KB of payload once Base64-encoded). For larger payloads, store the payload in a blob and queue a pointer to it. Tune batch size and concurrency in host.json.

{
  "extensions": {
    "queues": {
      "batchSize": 16,
      "newBatchThreshold": 8,
      "maxDequeueCount": 5,
      "visibilityTimeout": "00:00:30",
      "maxPollingInterval": "00:00:02"
    }
  }
}

Setting	What it does	Default	Trade-off
`batchSize`	Messages fetched per instance at once	16	Higher = throughput, more memory/downstream load
`newBatchThreshold`	Refill trigger (fetch more when below)	batchSize/2	Controls steady-state concurrency
`maxDequeueCount`	Retries before poison queue	5	Lower = fail fast; higher = ride transient errors
`visibilityTimeout`	Delay before a failed message becomes visible again for retry	00:00:00	Zero = immediate retry; raise it to space out retries on transient failures
`maxPollingInterval`	Backoff when the queue is empty	1 min	Lower = faster pickup, more storage transactions
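
The dequeue-count rule is easy to see in a simulation. The sketch below is a simplified in-memory model, not the Storage Queues SDK: `drain` and the tuple layout are hypothetical, and it ignores visibility timeouts and parallelism. It shows only the accounting: each failed attempt increments the count, and the message moves to the poison list once the count reaches the limit.

```python
from collections import deque

MAX_DEQUEUE_COUNT = 5  # mirrors the default described above


def drain(queue: deque, poison: list, handler, max_dequeue_count: int = MAX_DEQUEUE_COUNT):
    """Process (body, dequeue_count) items until the queue is empty."""
    while queue:
        body, dequeue_count = queue.popleft()
        dequeue_count += 1                      # every delivery counts as a dequeue
        try:
            handler(body)
        except Exception:
            if dequeue_count >= max_dequeue_count:
                poison.append(body)             # give up: set the message aside
            else:
                queue.append((body, dequeue_count))  # make it visible again for retry


attempts = {}


def handler(body: str) -> None:
    attempts[body] = attempts.get(body, 0) + 1
    if body == "bad":
        raise ValueError("cannot parse message")


queue = deque([("good", 0), ("bad", 0)])
poison = []
drain(queue, poison, handler)
print(attempts, poison)  # {'good': 1, 'bad': 5} ['bad']
```

A message that fails for a transient reason benefits from the retries; a message that is malformed burns all five attempts and then needs a human or a separate function watching the poison queue.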

Service Bus trigger

For enterprise messaging — ordering (sessions), dead-lettering, transactions, topics/subscriptions — use Service Bus rather than Storage queues. Delivery is at-least-once with a lock (PeekLock): the message is locked while you process it, and you must finish before the lock duration expires or it’s redelivered. Use sessions for FIFO ordering within a key. Failed messages go to the built-in dead-letter sub-queue after maxDeliveryCount. Standard tier caps messages at 256 KB, Premium at 1 MB (or 100 MB with large-message support).

{
  "extensions": {
    "serviceBus": {
      "maxConcurrentCalls": 16,
      "maxConcurrentSessions": 8,
      "prefetchCount": 0,
      "autoCompleteMessages": true,
      "maxAutoLockRenewalDuration": "00:05:00"
    }
  }
}

Setting	What it does	Default	When to change
`maxConcurrentCalls`	Parallel non-session messages per instance	16	Lower to protect a fragile downstream
`maxConcurrentSessions`	Parallel sessions per instance	8	Tune for ordered-stream fan-out
`prefetchCount`	Messages cached locally ahead of processing	0	Higher = throughput, risk of lock expiry
`autoCompleteMessages`	Auto-complete on success	true	Set false for manual settlement control
`maxAutoLockRenewalDuration`	Auto-renew the lock for long work	5 min	Raise for long handlers; cap to avoid stuck locks

Event Hubs trigger

For high-throughput telemetry/streaming, Event Hubs partitions the stream; the trigger assigns one instance per partition (via leases) and processes events in batches, checkpointing progress so a restart resumes where it left off. Events processed after the last checkpoint are delivered again after a crash, so delivery is at-least-once and handlers must tolerate reprocessing. Ordering is per-partition only. Max parallelism equals the partition count, so partitions are your scale ceiling; size them up front, because they are hard to change later.

{
  "extensions": {
    "eventHubs": {
      "maxEventBatchSize": 100,
      "batchCheckpointFrequency": 1,
      "prefetchCount": 300
    }
  }
}

Concept	What it controls	Limit / note
Partition count	Max concurrent instances	Set at creation; 1–32 (more on Premium/Dedicated)
`maxEventBatchSize`	Events per invocation	Bigger batch = throughput, larger memory
`batchCheckpointFrequency`	Batches between checkpoints	Higher = fewer storage writes, more reprocessing on crash
Throughput units / PUs	Ingress/egress capacity	TUs on Standard; PUs on Premium; CUs on Dedicated
Ordering	FIFO per partition only	No global ordering across partitions
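
Per-partition ordering follows from how events are routed: events that share a partition key land on the same partition, and each partition is read in order by one consumer. The sketch below illustrates that with a stand-in hash. `partition_for` is hypothetical and uses SHA-256 purely for illustration; Event Hubs applies its own internal hashing, so the partition numbers here will not match the service.

```python
import hashlib


def partition_for(key: str, partition_count: int) -> int:
    """Map a partition key to a partition index (illustrative hash, not the service's)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


PARTITION_COUNT = 4  # also the ceiling on concurrent consumers
events = [
    ("device-a", 1), ("device-b", 1), ("device-a", 2),
    ("device-c", 1), ("device-b", 2),
]

partitions = {p: [] for p in range(PARTITION_COUNT)}
for key, seq in events:
    partitions[partition_for(key, PARTITION_COUNT)].append((key, seq))

for p, batch in partitions.items():
    print(p, batch)
```

Each device's events stay in sequence inside its partition, but nothing orders `device-a` relative to a device on another partition. If you need ordering for an entity, make that entity's id the partition key.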

Event Grid, Blob and Cosmos DB triggers

Event Grid delivers discrete events over HTTP push (Storage events, custom events, system topics). It validates the endpoint with a handshake, retries with exponential backoff on failure, and, when dead-lettering is configured on the subscription, writes undeliverable events to a blob container after the retry window. It is the right way to react to blob events at scale.

Blob trigger has two modes. The legacy polling mode scans the container and tracks receipts — simple but with high latency at scale (minutes) and a risk of missing events on very high churn. The production choice is Event Grid-based blob events, which push near-real-time and don’t degrade with container size.

Cosmos DB trigger consumes the change feed (inserts and updates, not deletes) using a lease container to track progress across partitions; like Event Hubs it scales with the source’s physical partitions and delivers at-least-once.

Trigger	Latency	Scaling unit	Critical gotcha
Event Grid	Near-real-time	Event push (parallel)	Must answer validation handshake (200)
Blob (polling)	Minutes at scale	Container scan	Misses/lags on high churn — avoid in prod
Blob (Event Grid)	Seconds	Event push	Requires Event Grid + storage event subscription
Cosmos DB change feed	Seconds	Source partitions	Needs a lease container; no deletes; not transactional

Bindings: read and write services without the client code

A binding connects your function to a service declaratively. An input binding supplies data before your function runs; an output binding writes your return value after. The trigger is itself a special binding (direction in, trigger). Bindings cover most Azure data services and remove the connect/auth/serialize/dispose boilerplate — but they trade flexibility for convenience, and for anything fancy (transactions, custom retry, streaming) you still use the SDK directly.

// .NET isolated: triggered by a queue message, read a Cosmos doc, write to another queue
[Function("EnrichOrder")]
[QueueOutput("orders-enriched")]                                   // output binding
public string Run(
    [QueueTrigger("orders-in")] string orderId,                    // trigger: the message body is the order id
    // {queueTrigger} resolves to the raw string message; a {orderId} expression
    // only resolves when the message is JSON with an orderId property
    [CosmosDBInput("shop","orders", Id="{queueTrigger}", PartitionKey="{queueTrigger}")] Order order) // input
{
    order.Enriched = true;
    return JsonSerializer.Serialize(order);
}

The bindings you reach for, and the direction(s) each supports:

Binding	In	Out	Trigger	Typical use
HTTP	—	—	Yes	Web endpoints
Timer	—	—	Yes	Schedules
Queue Storage	Yes	Yes	Yes	Lightweight work queues
Service Bus	—	Yes	Yes	Enterprise messaging
Event Hubs	—	Yes	Yes	Streaming / telemetry out
Event Grid	—	Yes	Yes	Event publishing
Blob Storage	Yes	Yes	Yes	File read/write
Table Storage	Yes	Yes	—	Cheap key-value state
Cosmos DB	Yes	Yes	Yes	Documents + change feed
SQL (Azure SQL)	Yes	Yes	Yes	Relational read/write/feed
SignalR Service	Yes	Yes	—	Real-time push to clients
Durable client/entity	Yes	Yes	Yes	Start/query orchestrations

Four binding pitfalls worth knowing before you ship:

Pitfall	What happens	Fix
Binding expression typo (`{orderId}` vs `{OrderId}`)	Binding resolves empty → null arg → crash	Match the trigger property name exactly (case-sensitive)
Output binding never written	Silent no-op (you returned but didn’t set it)	Return the bound value, or use `IAsyncCollector.AddAsync`
Connection setting missing	Binding can’t auth → host error at load	Set `<Name>__serviceUri`/connection app setting (identity-based preferred)
Large payload through a binding	Memory pressure, timeout	Stream via SDK; pass a pointer, not the blob

Scaling and cold starts: the part everyone underestimates

On the serverless plans the scale controller is a platform component that watches each trigger’s signal and decides how many instances to run — from zero to the plan maximum. It reacts differently per trigger: HTTP scales on request rate/latency, queues scale on queue length, Event Hubs/Cosmos scale on partition lag, and the controller adds instances in steps (it won’t go from 0 to 200 in one tick). This is why a sudden burst sees a brief ramp, and why a queue that suddenly gets 100k messages drains over a minute or two rather than instantly.
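To see why a stepwise ramp stretches the drain time, here is a toy simulation. It is a sketch under stated assumptions, not the scale controller's actual algorithm: the function name, the one-instance-per-tick ramp and the rate of 50 messages per second per instance are all hypothetical.

```python
def drain_seconds(backlog: int, per_instance_rate: int,
                  instances_added_per_tick: int, max_instances: int) -> int:
    """Seconds to empty a backlog when instances are added in steps (one tick = 1 s)."""
    instances = 0
    seconds = 0
    while backlog > 0:
        # The ramp: add a few instances per tick, never past the cap.
        instances = min(max_instances, instances + instances_added_per_tick)
        backlog -= instances * per_instance_rate
        seconds += 1
    return seconds


# Hypothetical numbers: 100k messages, 50 msg/s per instance, cap of 200 instances.
ramped = drain_seconds(100_000, 50, instances_added_per_tick=1, max_instances=200)
instant = drain_seconds(100_000, 50, instances_added_per_tick=200, max_instances=200)
print(ramped, instant)
```

With these made-up inputs the ramped case takes several times longer than the all-at-once case, which is the shape of the behaviour described above even though the real step size and interval differ by plan and trigger.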

A cold start is the latency the first request on a fresh instance pays: the platform allocates a sandbox, mounts your app, starts the language worker, JITs/loads your code, and primes connections — typically 1–10+ seconds depending on stack, package size and dependencies. It bites whenever an instance is created: scaling out, scaling back up from zero, or after a recycle. On Consumption you cannot avoid it entirely; Flex Consumption offers alwaysReady instances (a warm pool that’s always running for a given group); Premium keeps pre-warmed instances so scale-out never exposes a cold worker; Dedicated stays warm because it never scales to zero.

How each plan handles scale and cold start:

Plan	Scales from 0	Cold-start exposure	Warm mechanism	Max instances (typical)
Consumption	Yes	On every new instance	None	~200
Flex Consumption	Yes	Only when above `alwaysReady` count	`alwaysReady` warm pool	High (configurable cap)
Premium (EP)	No (min 1)	None	Pre-warmed instance count	20–100, depending on OS and region
Dedicated	No	None once Always On is enabled	Always On	Plan instance cap
Container Apps	Yes (KEDA)	On scale-from-zero	Min replicas > 0	Replica cap

What actually eats the cold-start budget, and how to cut each:

Cold-start cost	Typical magnitude	Reduce it by	Trade-off
Sandbox + mount	0.5–2 s	`alwaysReady`/pre-warmed; smaller package	Costs warm capacity
Language worker start	0.3–3 s	Lighter runtime; .NET isolated trimming	Build complexity
Dependency load / DI	0.5–5 s	Fewer/lighter packages; lazy init	First real call still primes
First connection (DB/KV)	0.2–3 s	Reuse clients (static); pooled drivers	Must be singleton-safe
Package pull (large zip/container)	1–30 s	Run-from-package; small image; same-region	Build discipline

Set Flex alwaysReady and Premium pre-warmed counts:

# Flex Consumption: keep 2 instances of the 'http' group always warm
az functionapp scale config always-ready set -g $RG -n $APP \
  --settings http=2

# Premium (EP): pre-warm 3 instances + raise the elastic maximum
az functionapp plan update -g $RG -n premium-plan \
  --min-instances 1 --max-burst 20
az resource update --resource-group $RG \
  --name $APP/config/web --resource-type "Microsoft.Web/sites" \
  --set properties.preWarmedInstanceCount=3

The scale knobs by plan, and what each caps:

Knob	Plan	What it sets	Default	Why change
`maximumInstanceCount`	Flex	Upper bound on instances	plan default	Protect a downstream; cap cost
`alwaysReady`	Flex	Warm instances per group	0	Kill cold start on hot paths
`preWarmedInstanceCount`	Premium	Buffer instances before traffic	1	Cover scale-out latency
`minimumElasticInstanceCount`	Premium	Always-running floor	1	Steady warm baseline
`functionAppScaleLimit`	Consumption/Premium	Hard instance cap	none	Stop runaway scale to a fragile dep
`WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT`	Consumption	Per-app scale cap	platform	Limit a single app’s footprint

Concurrency, batching and partitioning: throughput vs ordering

Scale (instances) multiplies with concurrency (parallel invocations per instance) to give throughput. Push concurrency too high and you overwhelm a downstream (a database hits connection limits, an API throttles); too low and you under-use each instance and pay for more instances than you need. Each trigger family has its own concurrency model, and a few share the dynamic concurrency feature (the host auto-tunes concurrency from observed success/latency).
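As a rough illustration of that multiplication, the sketch below estimates throughput from instance count, per-instance concurrency and handler latency, then derives a per-instance concurrency that keeps total in-flight calls under a downstream connection limit. The function names and numbers are hypothetical, and it assumes each in-flight invocation holds exactly one downstream connection.

```python
def throughput_per_sec(instances: int, concurrency: int, handler_millis: int) -> float:
    """Steady-state invocations per second if every slot stays busy."""
    return instances * concurrency * 1000 / handler_millis


def safe_concurrency(downstream_max_connections: int, max_instances: int) -> int:
    """Largest per-instance concurrency that cannot exceed the downstream limit,
    assuming one connection per in-flight invocation (an assumption of this sketch)."""
    return max(1, downstream_max_connections // max_instances)


# Hypothetical: a database that accepts 400 connections, an app capped at 20 instances.
per_instance = safe_concurrency(downstream_max_connections=400, max_instances=20)
print(per_instance)
print(throughput_per_sec(instances=20, concurrency=per_instance, handler_millis=250))
```

The same arithmetic run in reverse tells you whether to lower concurrency or lower the instance cap when a downstream starts refusing connections.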

The concurrency model per trigger, and the lever:

Trigger	Concurrency lever	Where set	Ordering implication
HTTP	Instances × in-process parallelism	platform / host	None (stateless)
Queue Storage	`batchSize` + `newBatchThreshold`	host.json	None (no ordering)
Service Bus (no session)	`maxConcurrentCalls`	host.json	None
Service Bus (sessions)	`maxConcurrentSessions`	host.json	FIFO within a session
Event Hubs	partitions × batch size	hub + host.json	FIFO within a partition
Cosmos DB	source partitions × lease	Cosmos + lease	Per-partition
Durable	`maxConcurrentActivityFunctions` / orchestrations	host.json	Orchestrator-controlled

Enable dynamic concurrency to let the host find the sweet spot under variable load:

{
  "version": "2.0",
  "concurrency": {
    "dynamicConcurrencyEnabled": true,
    "snapshotPersistenceEnabled": true
  }
}

The ordering-vs-throughput trade, stated plainly:

You want…	Mechanism	Cost
Maximum throughput, order irrelevant	High concurrency, many partitions/instances	Downstream pressure; must be idempotent
Strict ordering within a key	Service Bus sessions or Event Hubs partition key	Throughput capped by key/partition count
Even fan-out, no hot key	Good partition-key design (high cardinality)	Lose per-key ordering
Protect a fragile downstream	Cap `maxConcurrentCalls` / `functionAppScaleLimit`	Slower drain; possible backlog

A worked sizing example: an Event Hub with 8 partitions gives at most 8 concurrent instances for that trigger, regardless of how many messages pile up — if each instance processes a batch of 100 in 200 ms, your ceiling is ~4,000 events/sec. Need 40,000/sec? You need ~80 partitions (or fewer with bigger batches and faster handlers). The partition count, chosen at creation, is your scale ceiling — this is the single most common Event Hubs capacity mistake.
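The arithmetic in that example can be written down as a small calculator. This is only a sketch of the reasoning above: the function names are made up, and it assumes one instance per partition with every batch full and handled at a constant speed.

```python
import math


def partition_ceiling(partitions: int, batch_size: int, batch_millis: int) -> float:
    """Max events per second when each partition is drained by one instance."""
    return partitions * batch_size * 1000 / batch_millis


def partitions_needed(target_events_per_sec: int, batch_size: int, batch_millis: int) -> int:
    """Smallest partition count whose ceiling meets the target rate."""
    per_partition = batch_size * 1000 / batch_millis
    return math.ceil(target_events_per_sec / per_partition)


print(partition_ceiling(partitions=8, batch_size=100, batch_millis=200))
print(partitions_needed(target_events_per_sec=40_000, batch_size=100, batch_millis=200))
```

Doubling the batch size or halving the handler time halves the partitions needed, which is the "fewer with bigger batches and faster handlers" option in the text.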

Durable Functions: stateful workflows on stateless compute

Plain functions are stateless and short-lived; many real processes are stateful and long-running — “validate, charge, ship, notify, and if anything fails, compensate,” running over minutes to days, surviving restarts. Durable Functions is an extension that adds this without a separate workflow engine. You write an orchestrator function (which coordinates) that calls activity functions (which do the work), and the framework checkpoints the orchestrator’s progress to the task hub in storage. When the orchestrator awaits, the platform can unload it entirely (you pay nothing while it waits hours for an approval) and later replay the orchestrator function from the start, using the checkpointed history to skip already-completed steps — which is why orchestrator code must be deterministic (no DateTime.Now, no random, no direct I/O; use the context’s APIs).

The four function types in Durable, and the rules each obeys:

Type	Role	Constraints	Example
Orchestrator	Coordinates the workflow	Deterministic: no I/O, no clocks/random, no `await` except on durable APIs	“call A → B → wait → C”
Activity	Does the actual work	Any code, side effects allowed	Charge a card, send email
Entity	Stateful actor (small state)	Single-threaded per entity key	A per-user counter, a cart
Client	Starts/queries orchestrations	Triggered by HTTP/queue/etc.	Webhook that kicks off a flow

The orchestration patterns

The patterns are the reason Durable exists. Each solves a class of coordination problem cleanly:

Pattern	Problem it solves	Mechanism
Function chaining	Run steps in strict sequence (output → input)	`await ctx.CallActivityAsync` in order
Fan-out / fan-in	Parallelize N items, then aggregate	Start N activities, `await Task.WhenAll`
Async HTTP API	Long job behind a quick HTTP 202 + status URL	Client starts orchestration, returns status endpoint
Monitor	Poll a resource until a condition, with timeout	Orchestrator loops with `CreateTimer`
Human interaction	Wait for approval/input (minutes–days)	`WaitForExternalEvent` + timeout
Aggregator (entities)	Accumulate state from many events, single-threaded	Durable Entities
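The async HTTP API row is the one people most often hand-roll, so here is its shape as a toy model. This is not the Durable Functions client API: the in-memory `JOBS` dict, the function names and the `/status/<id>` path are hypothetical stand-ins for what the framework provides.

```python
import uuid

JOBS = {}  # instance id -> {"status": ..., "output": ...}; stands in for the task hub


def start_job(payload: dict):
    """Accept the request, record the work as running, return 202 plus a status URL."""
    instance_id = str(uuid.uuid4())
    JOBS[instance_id] = {"status": "Running", "output": None, "input": payload}
    return 202, {"Location": f"/status/{instance_id}"}, instance_id


def complete_job(instance_id: str, output) -> None:
    """Called by the background workflow when it finishes."""
    JOBS[instance_id]["status"] = "Completed"
    JOBS[instance_id]["output"] = output


def get_status(instance_id: str):
    """202 while still running, 200 with the output once done, 404 if unknown."""
    job = JOBS.get(instance_id)
    if job is None:
        return 404, None
    if job["status"] == "Running":
        return 202, None
    return 200, job["output"]


code, headers, job_id = start_job({"orderId": "o-1001"})
print(code, get_status(job_id)[0])
complete_job(job_id, {"orderId": "o-1001", "state": "shipped"})
print(get_status(job_id))
```

The caller never holds a connection open for the length of the workflow; it polls the status URL (or is notified) instead.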

Fan-out/fan-in — process every line of an order in parallel, then reconcile:

[Function("ProcessOrder")]
public static async Task<OrderResult> Run(
    [OrchestrationTrigger] TaskOrchestrationContext ctx)
{
    var order = ctx.GetInput<Order>();

    // Fan out: one activity per line item, all in parallel
    var tasks = order.Lines.Select(line =>
        ctx.CallActivityAsync<LineResult>("ProcessLine", line)).ToList();

    // Fan in: wait for all, then aggregate
    LineResult[] results = await Task.WhenAll(tasks);
    return new OrderResult(results);
}

Human-interaction with a timeout (approve within 72 hours or escalate):

using var cts = new CancellationTokenSource();
DateTime deadline = ctx.CurrentUtcDateTime.AddHours(72);       // deterministic clock
Task timeout = ctx.CreateTimer(deadline, cts.Token);
Task<bool> approval = ctx.WaitForExternalEvent<bool>("ApprovalEvent");

if (approval == await Task.WhenAny(approval, timeout)) {
    cts.Cancel();
    if (approval.Result) await ctx.CallActivityAsync("Ship", order);
} else {
    await ctx.CallActivityAsync("Escalate", order);             // timed out
}

The determinism rules — break one and you get non-deterministic replay (the classic Durable bug):

Don’t (in an orchestrator)	Why	Do instead
`DateTime.Now` / `DateTimeOffset.UtcNow`	Different value on replay	`ctx.CurrentUtcDateTime`
`Guid.NewGuid()` / random	Non-deterministic	`ctx.NewGuid()`
Direct HTTP / DB / file I/O	Side effects re-run on replay	Call an activity that does the I/O
`Task.Delay` / `Thread.Sleep`	Not durable; blocks	`ctx.CreateTimer(...)`
`await` non-durable tasks	Breaks the replay model	Only `await` durable APIs
Static mutable state	Leaks across replays/instances	Pass state through the orchestrator
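Why these rules exist is easier to see in a toy replay model. The sketch below is not the Durable Functions SDK: `ReplayContext`, `call_activity` and the activity names are hypothetical, and it replays only once after completion, whereas the real framework replays after each awaited step.

```python
class ReplayContext:
    """Toy stand-in for an orchestration context backed by a checkpointed history."""

    def __init__(self, history, activities):
        self.history = history        # checkpointed (name, result) pairs, in call order
        self.activities = activities  # activity name -> callable
        self.position = 0             # next history slot this pass will read
        self.executed = []            # activities actually run during this pass

    def call_activity(self, name, arg):
        if self.position < len(self.history):
            # Replaying: reuse the recorded result instead of re-running the side effect.
            recorded_name, result = self.history[self.position]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic replay: history has {recorded_name!r}, "
                    f"code asked for {name!r}")
        else:
            # New work: run the activity and checkpoint its result.
            result = self.activities[name](arg)
            self.history.append((name, result))
            self.executed.append(name)
        self.position += 1
        return result


def orchestrator(ctx, order_id):
    reserved = ctx.call_activity("reserve", order_id)
    return ctx.call_activity("charge", reserved)


ACTIVITIES = {
    "reserve": lambda order_id: f"{order_id}:reserved",
    "charge": lambda reserved: f"{reserved}:charged",
}

history = []
first = ReplayContext(history, ACTIVITIES)
result_first = orchestrator(first, "o-1001")
second = ReplayContext(history, ACTIVITIES)      # simulates an unload followed by a replay
result_second = orchestrator(second, "o-1001")
print(first.executed, second.executed, result_first == result_second)
```

The replay pass runs no activities because the history answers every call. An orchestrator that branches on a clock or a random value can ask for a different activity than the history recorded, and that mismatch is the failure the table guards against.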

Durable behaviours and limits you should size for:

Aspect	Behaviour	Limit / note
Task hub storage	Queues + tables in the storage account	Throttling here stalls all orchestrations
History growth	Each step appends to history	Use `ContinueAsNew` for eternal/long loops
Concurrency	`maxConcurrentActivityFunctions` etc.	Tune in host.json to protect downstreams
Backend choice	Azure Storage (default), Netherite, MSSQL	Netherite/MSSQL for high throughput
Versioning	In-flight orchestrations pin to old code	Don’t break history shape on deploy
Sub-orchestrations	Orchestrators calling orchestrators	Compose large workflows; mind history size

The reference architecture for a serverless order workflow that combines several of these patterns is in Reference Architecture: Serverless API on Azure.

Idempotency, retries and poison messages: designing for at-least-once

Because every messaging trigger delivers at least once, a correct function must produce the same result whether it sees a message once or five times — that’s idempotency. The realistic failure flow: your function pulls a message, does half the work, then crashes (or its lock/visibility expires); the message becomes visible again and is redelivered; without idempotency you double-charge a card or write a duplicate row.

The idempotency techniques, and when each fits:

Technique	How it works	Best for
Idempotency key (dedup store)	Record a unique message id; skip if seen	Side-effecting writes (charge, email)
Upsert by natural key	Write is “set to X” not “add X”	Database records
Conditional write (ETag/If-Match)	Reject if state changed underneath	Optimistic concurrency
Idempotent downstream	The API itself dedups on a key	Payment providers with idem keys
Exactly-once via transaction	Settle message + write atomically	Service Bus + DB (sessions/Tx)
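The first two techniques fit in a few lines. This is a minimal in-memory sketch: `charge_once`, `upsert_status` and the message fields are hypothetical, and a Python set stands in for a real dedup store.

```python
def charge_once(processed_ids: set, ledger: list, message: dict) -> str:
    """Idempotency key: skip the side effect if this message id was already handled."""
    if message["id"] in processed_ids:
        return "skipped"
    ledger.append((message["id"], message["amount"]))   # the side effect
    processed_ids.add(message["id"])                    # mark only after success
    return "charged"


def upsert_status(orders: dict, message: dict) -> None:
    """Upsert by natural key: the write is 'set to X', so repeating it is harmless."""
    orders[message["id"]] = message["status"]


processed_ids, ledger, orders = set(), [], {}
charge = {"id": "o-1001", "amount": 499}
results = [charge_once(processed_ids, ledger, charge) for _ in range(3)]  # delivered 3 times
for _ in range(3):
    upsert_status(orders, {"id": "o-1001", "status": "paid"})
print(results, ledger, orders)
```

In a real system the check and the mark are separate calls, so a crash between the side effect and the mark can still produce a duplicate. That gap is why an atomic insert in the store, or an idempotency key enforced by the downstream, is the stronger form.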

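Retries: there are two layers. The host retry policy (fixed or exponential delay) re-runs a failed invocation in place, but it applies only to a few trigger types (Timer, Event Hubs, Cosmos DB and Kafka at the time of writing). Queue Storage and Service Bus rely on the source’s own redelivery instead (queue dequeue count, Service Bus delivery count). After the source’s retries are exhausted, the message is poisoned or dead-lettered, meaning it is moved aside so it stops blocking the queue. You must monitor and drain these, or failures pile up silently.

[Function("RecordOrderEvent")]
[FixedDelayRetry(5, "00:00:10")]              // host retry: up to 5 retries, 10 s apart
public async Task Run(
    [EventHubTrigger("orders", Connection = "EventHubConnection")] string[] events) // connection setting name is illustrative
{ /* idempotent handling */ }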

The delivery/retry mechanics per source, and where the failed message ends up:

Source	Redelivery counter	Default before set-aside	Set-aside destination
Queue Storage	`dequeueCount`	5	`<queue>-poison`
Service Bus	`DeliveryCount`	10 (`maxDeliveryCount`)	Built-in dead-letter sub-queue
Event Hubs	(no per-message DLQ)	n/a — checkpoint advances	None — handle in code or sideline
Event Grid	retry schedule	~24 h window	Dead-letter blob container
Cosmos DB	lease retry	per host policy	None — handle in code
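The Queue Storage row can be modelled in a few lines to show why one bad message does not block the rest. This is a toy model rather than the Functions host: `drain` and `handler` are hypothetical names, and the default of 5 attempts mirrors the table.

```python
import json


def drain(messages, handler, max_dequeue_count=5):
    """Retry failed messages, then set them aside once the dequeue count is reached."""
    pending = [(body, 0) for body in messages]   # (body, dequeue count so far)
    processed, poison, attempts = [], [], {}
    while pending:
        body, dequeue_count = pending.pop(0)
        dequeue_count += 1
        attempts[body] = dequeue_count
        try:
            handler(body)
            processed.append(body)
        except Exception:
            if dequeue_count >= max_dequeue_count:
                poison.append(body)                     # moved aside, stops blocking
            else:
                pending.append((body, dequeue_count))   # becomes visible again
    return processed, poison, attempts


def handler(body):
    json.loads(body)   # raises on a malformed payload


processed, poison, attempts = drain(['{"Id": "o-1"}', "not-json", '{"Id": "o-2"}'], handler)
print(processed, poison, attempts["not-json"])
```

The good messages complete on their first attempt while the malformed one is retried up to the limit and then parked, which is what you should expect to see in the `<queue>-poison` queue.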

A symptom→cause→confirm→fix table for the messaging failure classes, because this is where production bites:

#	Symptom	Likely cause	Confirm (exact path/cmd)	Fix
1	Messages in `<q>-poison` growing	Handler throws every time on a bad message	Check the poison queue depth in Storage; read a message	Make handler tolerant; fix data; reprocess after fix
2	Same record processed twice	At-least-once + no idempotency	App Insights shows duplicate operation ids	Add idempotency key / upsert
3	Service Bus messages re-appear after ~30s	Lock expired before processing finished	`maxAutoLockRenewalDuration` too low; long handler	Raise lock renewal; shorten work; checkpoint
4	Out-of-order processing	No sessions / multiple partitions	Events on different partitions/instances	Use sessions or partition key for ordering
5	Backlog never drains	Scale ceiling hit (partitions, scale limit)	Partition count = max instances; `functionAppScaleLimit`	Add partitions; raise/relax the cap
6	Dead-letter on Service Bus filling	`maxDeliveryCount` exceeded	DLQ depth in the portal/CLI	Inspect DLQ, fix root cause, resubmit
7	Event Grid events lost	Endpoint failed validation or 5xx’d	Event Grid metrics: delivery failures	Return 200 on validation; fix handler; check DLQ blob

Networking and identity for production functions

Demos run on default networking and connection strings; production needs private outbound and passwordless identity. On Flex Consumption, Premium, Dedicated and Container Apps you can VNet-integrate the function app so its outbound traffic flows through your virtual network, then reach databases and PaaS via private endpoints — keeping traffic off the public internet. (Plain Consumption cannot VNet-integrate — a frequent reason to choose Flex.) For identity, give the app a managed identity and use identity-based connections for triggers/bindings (<Name>__serviceUri + an RBAC role) instead of connection strings, and Key Vault references for any remaining secrets.

The networking/identity options and what each requires:

Capability	Mechanism	Plans that support it	Why
Private outbound to VNet	VNet integration	Flex, Premium, Dedicated, Container Apps	Reach private DB/PaaS; egress control
Private inbound	Private endpoint on the app	Premium, Dedicated, Flex (evolving)	No public ingress
Reach PaaS privately	Private endpoint on target + DNS	Any VNet-integrated plan	Storage/SQL/Cosmos off the internet
Passwordless to PaaS	Managed identity + RBAC	All	No secrets to leak/rotate
Identity-based trigger/binding conn	`<Name>__serviceUri` + role	All (binding-dependent)	Remove connection strings
Secrets when unavoidable	Key Vault reference	All	Secret out of app settings plaintext
Restrict who can call HTTP	Access restrictions / Easy Auth / APIM	All	Lock the endpoint

Wire identity-based access end to end — give the app an identity and grant it queue + blob roles, no keys:

# 1) System-assigned identity
az functionapp identity assign -g $RG -n $APP
PID=$(az functionapp identity show -g $RG -n $APP --query principalId -o tsv)

# 2) Grant it data-plane roles on the storage account (queues + blobs)
SID=$(az storage account show -n $STG -g $RG --query id -o tsv)
az role assignment create --assignee $PID --role "Storage Queue Data Contributor" --scope $SID
az role assignment create --assignee $PID --role "Storage Blob Data Owner"       --scope $SID

# 3) Point the trigger/binding at the account by URI (identity-based), not a key.
#    The trigger must name this prefix, e.g. [QueueTrigger("orders-in", Connection = "Orders")].
az functionapp config appsettings set -g $RG -n $APP --settings \
  "Orders__queueServiceUri=https://$STG.queue.core.windows.net/" \
  "Orders__credential=managedidentity"

// Reference a Key Vault secret from an app setting (the app's MI must have 'Key Vault Secrets User')
appSettings: [
  {
    name: 'PaymentApiKey'
    value: '@Microsoft.KeyVault(SecretUri=https://kv-shop.vault.azure.net/secrets/payment-key/)'
  }
]

The identity roles a function commonly needs, by what it touches:

The function…	Needs role	On
Reads/writes Storage queues	Storage Queue Data Contributor	The storage account
Reads/writes blobs	Storage Blob Data Contributor/Owner	The storage account
Reads/writes Cosmos DB	Cosmos DB Built-in Data Contributor	The Cosmos account
Reads Key Vault secrets	Key Vault Secrets User	The key vault
Sends to Service Bus	Azure Service Bus Data Sender	The namespace/queue
Receives from Service Bus	Azure Service Bus Data Receiver	The namespace/queue
Sends to Event Hubs	Azure Event Hubs Data Sender	The namespace/hub

Limits and the error reference

Keep this open. First, the platform limits that shape design decisions:

Limit	Consumption	Flex / Premium	Note
Max execution time	5 min default, 10 min hard	Long / unbounded (EP default 30 min)	The classic reason to leave Consumption
Max instances	~200	High (configurable)	Per-app scale cap available
Memory per instance	~1.5 GB	Choose (e.g. 512 MB–4 GB+)	Flex/Premium let you size it
HTTP response timeout	~230 s (front end)	~230 s	Return 202 + async for long work
Queue message size	64 KB	64 KB	Pointer pattern for larger
Service Bus message	256 KB (Std) / 1 MB (Prem)	same	Large-message support on Premium
App settings size	~32 KB total	same	Don’t stuff payloads into settings
Storage dependency	Required	Required	Host won’t start without it
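The "pointer pattern" in the queue row (often called claim check) looks roughly like this. It is a sketch with in-memory stand-ins: the `blobs` dict plays the part of Blob Storage, the list plays the queue, and the names `enqueue` and `resolve` are hypothetical.

```python
import uuid

QUEUE_LIMIT_BYTES = 64 * 1024   # Queue Storage message limit from the table above


def enqueue(payload: bytes, queue: list, blobs: dict) -> dict:
    """Send small payloads inline; park large ones in blob storage and send a pointer."""
    if len(payload) <= QUEUE_LIMIT_BYTES:
        message = {"kind": "inline", "body": payload}
    else:
        blob_name = f"payloads/{uuid.uuid4()}"
        blobs[blob_name] = payload
        message = {"kind": "pointer", "blob": blob_name}
    queue.append(message)
    return message


def resolve(message: dict, blobs: dict) -> bytes:
    """Consumer side: return the payload whether it travelled inline or by pointer."""
    if message["kind"] == "inline":
        return message["body"]
    return blobs[message["blob"]]


queue, blobs = [], {}
small = enqueue(b"x" * 1024, queue, blobs)
large = enqueue(b"y" * (QUEUE_LIMIT_BYTES + 1), queue, blobs)
print(small["kind"], large["kind"], len(resolve(large, blobs)))
```

In practice you would set the threshold below the hard limit to leave room for encoding overhead, and clean up the blob once the message has been processed.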

The host/runtime errors you’ll actually see, and what each means:

Error / symptom	Meaning	Likely cause	First fix
“Azure Functions runtime is unreachable”	Host can’t start	`AzureWebJobsStorage` broken (key rotated, firewall, deleted)	Fix the storage connection/identity/firewall
HTTP 503 on the function URL	No healthy host/instance	Host crash-looping; cold start mid-deploy	Check App Insights traces; redeploy
HTTP 429 from your function	Throttled	Daily quota (Consumption) or downstream throttling	Check `functionAppScaleLimit`/quota; back off
Function timeout (504-like)	Exceeded `functionTimeout`	Long work on Consumption (10 min cap)	Move to Premium/Flex or use Durable async
Binding error at load	Function not indexed	Missing connection setting / bad binding	Set the connection app setting; fix binding
Messages stuck, none processed	Trigger not firing	Host down, or storage/lease unreachable	Check host status + storage health
Duplicate executions	At-least-once + retries	Crash mid-process; lock expiry	Add idempotency
“Did not find functions with language…”	Wrong runtime/worker	`FUNCTIONS_WORKER_RUNTIME` mismatch	Match runtime to your code
Cold start spikes	Fresh instance latency	Scale-out / scale-from-zero	`alwaysReady`/pre-warmed; smaller package
Durable orchestration stuck	Replay/determinism or task-hub issue	Non-deterministic orchestrator; storage throttled	Fix determinism; check task-hub storage

The critical app settings for a function app, beyond the bindings:

Setting	Controls	Typical value	Note
`AzureWebJobsStorage`	Backing storage connection	(account/identity)	Required; identity-based preferred
`FUNCTIONS_EXTENSION_VERSION`	Runtime major version	`~4`	Pin to a supported major
`FUNCTIONS_WORKER_RUNTIME`	Language worker	`dotnet-isolated`/`node`/`python`	Must match your code
`WEBSITE_RUN_FROM_PACKAGE`	Run from immutable package	`1`	Atomic deploys, faster cold start
`APPLICATIONINSIGHTS_CONNECTION_STRING`	Telemetry target	(connection string)	Always set in prod
`functionTimeout` (host.json)	Per-function max duration	plan-dependent	Bounded by plan’s hard cap
`WEBSITE_CONTENTAZUREFILECONNECTIONSTRING`	Content share (some plans)	(account)	Keep consistent with storage
`functionAppScaleLimit` (site config)	Max instances	none/number	A site property rather than an app setting; protects downstreams

Architecture at a glance

The diagram traces a real serverless order pipeline left to right, and marks the five hops where things break. Producers — a public client over HTTPS and an upstream system dropping messages — enter on the left. The ingress/trigger zone holds the two front doors: an HTTP-triggered function (behind Application Gateway/Easy Auth, scaling on request rate) and a Service Bus queue that buffers order messages and absorbs spikes so the back end never has to. From there the compute zone is the heart: the scale controller decides how many function instances run (zero to the cap), and a Durable orchestrator coordinates the multi-step workflow — fanning out line-item activities in parallel and waiting for an approval — checkpointing its state to the task hub. The state/dependencies zone is everything the functions read and write through bindings and identity: the backing storage account (runtime state + task hub), Cosmos DB for orders, Key Vault for the payment secret, all reached privately. Finally the observability plane (Application Insights) sees every invocation, dependency call and failure across the whole path.

Read the numbered badges as the failure map. Badge 1 sits on the trigger: a cold start on a fresh instance makes the first call slow — confirm with App Insights request duration after a gap, fix with alwaysReady/pre-warmed instances. Badge 2 is on the queue→instance hop: at-least-once delivery means duplicate processing — confirm with duplicate operation ids, fix with idempotency. Badge 3 is the scale ceiling: a backlog that won’t drain because partitions or the scale limit cap concurrency — confirm by comparing partition count to instance count, fix by adding partitions or raising the cap. Badge 4 is the Durable orchestrator: a non-deterministic orchestrator stalls or misbehaves on replay — confirm with the orchestration history, fix by removing clocks/random/I/O. Badge 5 is the backing storage: if AzureWebJobsStorage is throttled or unreachable, the whole host won’t start — confirm with “runtime unreachable,” fix the storage connection/firewall/identity. The lesson the picture teaches: the function code is the small part; the event source, the scale ceiling, the state store and the delivery guarantee are where serverless systems actually live or die.

Real-world scenario

Saffron Mart, a mid-size Indian grocery e-commerce company, ran its order pipeline on a pair of always-on App Service instances and a couple of VMs for background jobs. Order processing — validate, reserve stock, charge, generate invoice, notify — was a synchronous chain inside the web app, so a slow payment provider made checkout itself slow, and the nightly invoice batch needed its own VM that sat idle 23 hours a day. Monthly spend on this machinery was about ₹62,000, and during festival sales the synchronous chain buckled: checkout p95 climbed past 8 seconds and stock oversold because two requests reserved the same item.

The platform team (three engineers) moved the pipeline to Azure Functions on Flex Consumption. Checkout became a thin HTTP-triggered function that did one thing — validate the cart and drop an order message on a Service Bus queue — then returned 202 Accepted with a status URL. A Durable Functions orchestrator, started from the queue, ran the real workflow: a fan-out to reserve each line item in parallel, then an activity to charge (idempotent, keyed on the order id, against a payment provider that supports idempotency keys), then invoice generation, then notification. The nightly invoice job became a timer-triggered function — no VM. Everything authenticated with a managed identity: the queue, Cosmos DB and Key Vault (for the payment key) were all reached without a single connection string, and Cosmos was behind a private endpoint.

The first festival sale on the new system exposed three lessons. First, cold starts: at the very start of the flash sale, the first wave of checkout calls saw 4–6 second latencies because the app had scaled to zero overnight and the burst hit cold instances. They set Flex alwaysReady=2 on the HTTP group and the cold spikes vanished. Second, duplicate charges: an early bug (a non-idempotent charge activity) meant a redelivered message double-charged a handful of customers during a transient Service Bus lock expiry. The fix was a dedup store keyed on the order id, checked before charging; the lock-renewal duration was also raised because the charge call occasionally took longer than the default lock. Third, backlog: at peak the order queue briefly grew to ~40,000 messages and drained slower than expected, because the team had capped the app’s instance limit (maximumInstanceCount on Flex; functionAppScaleLimit is the Consumption/Premium equivalent) too conservatively at 20 while protecting Cosmos; raising it to 60 and bumping Cosmos throughput cleared it in under two minutes.

The outcome: checkout p95 dropped from 8 s to 310 ms (because checkout no longer waited for the workflow), stock oversell went to zero (line-item reservation became an idempotent, ordered-per-item operation via session-keyed messaging), the nightly VM was deleted, and the monthly bill fell to about ₹28,000 — the serverless pipeline cost nothing at 3am and scaled itself during the sale. The architecture lesson on the team wall: “Make checkout drop a message and walk away. The workflow is Durable’s problem, the scale is the platform’s problem, and at-least-once is your problem — so be idempotent.”

The migration as a before/after, because the shape of the change is the lesson:

Concern	Before (App Service + VMs)	After (Functions + Durable)	Effect
Checkout latency (p95)	~8 s (synchronous chain)	310 ms (drop message, return 202)	25× faster perceived
Order workflow	Inline, blocking	Durable orchestration (fan-out + chained steps)	Resilient, checkpointed
Nightly invoices	Dedicated VM, idle 23h	Timer-triggered function	VM deleted
Stock oversell at peak	Race on shared item	Idempotent, session-ordered reserve	Zero oversell
Secrets	Connection strings in config	Managed identity + KV references	No secrets to leak
Cold start at sale start	n/a (always-on, expensive)	Killed with Flex `alwaysReady=2`	No first-wave spikes
Monthly cost	~₹62,000	~₹28,000	~55% lower

Advantages and disadvantages

The event-driven, pay-per-execution model both enables the wins above and introduces a class of problems you don’t have with always-on compute. Weigh it honestly:

Advantages (why serverless helps)	Disadvantages (why it bites)
Scale-to-zero: pay nothing at idle; ideal for spiky/low traffic	Cold starts add first-request latency on fresh instances
Automatic scale to the event rate — no autoscale rules to write	The platform decides scale; bursty ramps and ceilings can surprise you
Bindings remove client boilerplate for dozens of services	Bindings hide details; complex needs (Tx, streaming) still need the SDK
Durable Functions gives stateful workflows without a workflow engine	Orchestrator determinism rules are subtle; non-deterministic bugs are nasty
No servers to patch, scale or load-balance	You can’t `ssh` to “the server”; you debug through logs/App Insights
Per-execution billing tracks real usage closely	At-least-once delivery forces idempotency on you (duplicate executions)
Tight integration with the Azure event ecosystem	Vendor lock-in: triggers/bindings/Durable are Azure-specific
Strong fit for glue, automation, event processing, schedules	Wrong for long-running compute, ultra-low-latency APIs, persistent local state

Where each matters: serverless is right when work is event-shaped and intermittent, when you want to ship logic not operate hosts, and when occasional cold starts are tolerable (or killable with warm pools). It’s wrong for steady high-CPU compute (you’d pay more than a reserved VM and fight timeouts), for APIs with a hard sub-100 ms p99 and zero cold-start tolerance (use Premium-warmed or a different model), and for anything needing durable local disk or in-memory state across calls. The disadvantages are all manageable — but only if you design for them up front, which is the entire point of this article.

Hands-on lab

Build a queue-triggered, idempotent function on the free-friendly Consumption plan, watch it process and poison a bad message, then tear it down. Run in Cloud Shell (Bash); the runtime + a small storage account stay inside or near the free tier.

Step 1 — Variables and resource group.

RG=rg-fn-lab
LOC=centralindia
STG=stfnlab$RANDOM        # globally-unique, 3-24 lowercase
APP=fn-lab-$RANDOM        # globally-unique
az group create -n $RG -l $LOC -o table

Step 2 — Storage account + Consumption function app (.NET isolated).

az storage account create -n $STG -g $RG -l $LOC --sku Standard_LRS -o table
az functionapp create -n $APP -g $RG \
  --consumption-plan-location $LOC \
  --runtime dotnet-isolated --runtime-version 8.0 \
  --functions-version 4 \
  --storage-account $STG -o table

Expected: a function app row, state = Running.

Step 3 — Create the work queue and the (auto-created) poison queue.

KEY=$(az storage account keys list -n $STG -g $RG --query "[0].value" -o tsv)
az storage queue create -n orders-in --account-name $STG --account-key "$KEY" -o table
# The 'orders-in-poison' queue is created automatically on first poison event.

Step 4 — Deploy a queue-triggered function. (Author locally with func init/func new and func azure functionapp publish $APP, or deploy a zip.) The handler is idempotent — it records processed ids in Table storage and skips duplicates, and it throws on a deliberately bad payload so you can watch poisoning:

[Function("ProcessOrder")]
public async Task Run([QueueTrigger("orders-in")] string body)
{
    var msg = JsonSerializer.Deserialize<OrderMsg>(body)
              ?? throw new InvalidOperationException("bad payload"); // -> retried -> poison
    if (await AlreadyProcessed(msg.Id)) return;                     // idempotent skip
    await DoWork(msg);
    await MarkProcessed(msg.Id);
}

Step 5 — Send a good message and watch it process.

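GOOD='{"Id":"o-1001","Item":"rice-5kg"}'
# The queue trigger expects Base64-encoded messages by default; the CLI sends --content as-is, so encode it first.
az storage message put -q orders-in --content "$(printf '%s' "$GOOD" | base64 -w0)" \
  --account-name $STG --account-key "$KEY" -o table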
# Stream logs and watch the invocation succeed:
az webapp log tail -n $APP -g $RG

Expected: one successful invocation in the log; the message disappears from orders-in.

Step 6 — Send a bad message and watch it poison.

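# Base64-encode it so the host can decode the message and the handler itself fails on the payload:
az storage message put -q orders-in --content "$(printf '%s' 'not-json' | base64 -w0)" \
  --account-name $STG --account-key "$KEY" -o table
# After 5 failed dequeue attempts it lands in orders-in-poison; allow about a minute:
sleep 60
az storage message peek -q orders-in-poison --account-name $STG --account-key "$KEY" -o table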

Expected: after the retries, the bad message appears in orders-in-poison — proof that one bad message doesn’t block the queue forever.

Step 7 — Confirm idempotency. Re-send the same good id (o-1001); the handler runs but the AlreadyProcessed check skips the work — no duplicate side effect. Verify in the log: the invocation completes without doing work twice.

Step 8 — Teardown.

az group delete -n $RG --yes --no-wait

You’ve now seen the three things that define serverless event processing in production: it scales to the queue, it sets bad messages aside instead of looping, and it stays correct under duplicate delivery because you made it idempotent.

Common mistakes & troubleshooting

The same dozen mistakes account for most Functions incidents. Each is symptom → root cause → confirm (exact path/command) → fix.

1. “Azure Functions runtime is unreachable” — the whole app is down. Root cause: The backing storage account (AzureWebJobsStorage) is broken — its access key rotated without updating the setting, a firewall now blocks the app, or the account/container was deleted. Confirm: Portal banner on the function app; az functionapp config appsettings list -n $APP -g $RG --query "[?name=='AzureWebJobsStorage']"; check the storage account’s networking/firewall and that the key matches. Fix: Repair the connection (new key or, better, switch to identity-based AzureWebJobsStorage__accountName + role); allow the app through the storage firewall; never let the host’s storage be unreachable.

2. Function runs twice (or N times) for one event. Root cause: At-least-once delivery plus a crash/lock-expiry mid-process; the message is redelivered. Not a platform bug — expected behaviour. Confirm: App Insights requests/traces show the same operation/message id processed more than once; Service Bus DeliveryCount > 1. Fix: Make the handler idempotent — dedup store keyed on message id, upsert by natural key, or use a downstream that dedups. Never assume exactly-once.

3. Messages pile up in the poison/dead-letter queue. Root cause: The handler throws on certain messages every time (bad data, a downstream that’s down), so they exhaust retries and are set aside — and nobody is draining them. Confirm: Storage <queue>-poison depth, or Service Bus DLQ depth (az servicebus queue show ... --query countDetails.deadLetterMessageCount). Fix: Alert on poison/DLQ depth; read a poisoned message to find the cause; fix the data/downstream; reprocess (move messages back). Make handlers tolerant of expected-bad input rather than throwing.

4. Long-running function times out. Root cause: The work exceeds the plan’s max execution time — 10 minutes hard on Consumption. Confirm: App Insights shows the invocation cut at the timeout; functionTimeout in host.json vs the plan cap. Fix: Move to Premium/Flex (long/unbounded timeout) for genuinely long work, or refactor to Durable Functions (the async pattern: return immediately, run the long workflow in pieces).

5. HTTP function returns 502/503 intermittently. Root cause: Cold start mid-deploy, host crash-looping, or the response exceeded the ~230 s front-end timeout. Confirm: App Insights requests with failures; correlate to deploys/scale events; check function duration against 230 s. Fix: Use alwaysReady/pre-warmed instances; fix the crash (see traces); for long work return 202 + status URL instead of blocking. (Same front-end mechanics as App Service 502/503 troubleshooting.)

6. The app won’t scale out — backlog grows. Root cause: A scale ceiling: Event Hubs/Cosmos partition count caps instances, or functionAppScaleLimit/WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT is set low, or you’re on Dedicated (no elastic scale). Confirm: Compare partition count to instance count (App Insights cloud_RoleInstance cardinality); read the scale-limit settings. Fix: Add partitions (at the source — can’t change later cheaply), raise/remove the scale cap, or move to a plan that scales elastically.

7. Cold-start latency on a “warm” Premium plan. Root cause: Pre-warmed count is 1 (default) and a burst scaled out faster than the buffer covered; or you scaled past maxBurst. Confirm: App Insights shows latency spikes correlated with new cloud_RoleInstance values during a burst. Fix: Raise preWarmedInstanceCount and maxBurst; keep the deployment package small; reuse clients so per-instance warm-up is cheap.

8. Timer fired multiple times / didn’t fire after a restart. Root cause: For multiples — a misconfiguration broke the singleton lock (rare) or you confused it with RunOnStartup firing on every restart. For misses — the host was down during the schedule and you didn’t opt into catch-up. Confirm: Function execution log timestamps; check for RunOnStartup=true; verify the storage lock container. Fix: Remove RunOnStartup in production; rely on the storage-backed singleton; for critical schedules, make the job idempotent and tolerant of a missed/duplicate run.

9. Durable orchestration is stuck or behaves nondeterministically. Root cause: The orchestrator violates determinism (used DateTime.Now, Guid.NewGuid(), direct I/O, or awaited a non-durable task), so replay diverges from history; or the task-hub storage is throttled. Confirm: Query the orchestration status/history (az rest to the Durable status endpoint, or the Durable Functions monitor); look for replay errors; check the storage account metrics for throttling. Fix: Remove all non-deterministic calls from the orchestrator (move I/O to activities, use ctx.CurrentUtcDateTime/ctx.NewGuid()); if storage is the bottleneck, scale it or switch the Durable backend (Netherite/MSSQL).

10. Binding resolves to null / function isn’t found. Root cause: A binding expression name doesn’t match the trigger property (case-sensitive), or a required connection app setting is missing, so the function fails to index. Confirm: Startup logs show “no functions found” or an indexing error; the binding parameter is null at runtime. Fix: Match {property} exactly to the trigger’s field; set the binding’s connection app setting (<Name>__serviceUri or connection string); redeploy and re-check the function list.

11. 403 / auth failures calling a dependency (DB, Key Vault, Storage). Root cause: The managed identity isn’t enabled, or lacks the data-plane RBAC role on the target, or the target’s firewall blocks the app’s outbound. Confirm: az functionapp identity show; az role assignment list --assignee <principalId> --scope <targetId>; the target’s networking blade. Fix: Assign the identity; grant the data role (e.g. Storage Blob Data Contributor, Key Vault Secrets User), not just control-plane Reader; allow the app’s subnet/outbound through the target firewall (private endpoint preferred).

12. Costs higher than expected on Consumption. Root cause: A chatty trigger (a queue that’s never empty, an aggressively-polling timer) or a function that runs far more often/longer than assumed; or a runaway retry loop reprocessing poison messages. Confirm: App Insights execution count × duration; the cost analysis blade filtered to the function app; check poison-queue churn. Fix: Reduce invocation frequency (batch, raise polling interval), shorten execution, fix retry loops, and consider Premium if steady load makes per-execution pricing lose to a flat plan.
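Mistake 6 is easier to reason about as arithmetic: the instance count you can actually reach is the lowest of several ceilings. The sketch below assumes hypothetical numbers throughout (plan maximum, scale limit, partition count, per-instance throughput); the function names are illustrative and not part of any Azure SDK.

```python
import math


def effective_max_instances(plan_max, scale_limit=None, partition_count=None):
    """Return (ceiling, reason): the lowest cap wins."""
    ceilings = {"plan maximum": plan_max}
    if scale_limit is not None:
        ceilings["scale limit setting"] = scale_limit
    if partition_count is not None:
        ceilings["partition count"] = partition_count
    reason = min(ceilings, key=ceilings.get)
    return ceilings[reason], reason


def minutes_to_drain(backlog, per_instance_per_minute, instances):
    """Rough drain time, ignoring new arrivals and scale-out ramp-up."""
    return math.ceil(backlog / (per_instance_per_minute * instances))


# Hypothetical Event Hubs consumer: 8 partitions on a plan that allows 100 instances.
ceiling, reason = effective_max_instances(plan_max=100, partition_count=8)
print(ceiling, reason, minutes_to_drain(240_000, 1_000, ceiling))
```

If the reason that comes back is the partition count, raising the scale limit changes nothing; only more partitions or faster per-instance processing shortens the drain.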

Best practices

Pick the plan for the workload, not the price tag. Flex Consumption for most new builds (scale-to-zero + VNet + alwaysReady); Premium for steady, latency-sensitive or long-running work; Dedicated to co-locate with web apps; Consumption only for truly spiky, low-volume glue code.
Keep functions small and single-purpose. One trigger, one job. Compose with queues and Durable orchestration, not with giant multi-responsibility handlers.
Stay stateless; put state outside. Never rely on in-memory or local-disk state surviving between invocations or across instances. Use a database, queue or cache.
Be idempotent by default. Every messaging trigger is at-least-once. Design handlers so processing the same event twice is safe (dedup key, upsert, conditional write).
Reuse clients. Create HttpClient, DB and SDK clients once (static/singleton), not per invocation; per-call clients exhaust outbound connections and add setup latency to every invocation.
Use managed identity and Key Vault references. No connection strings in app settings where an identity-based connection works; grant least-privilege data-plane roles.
Monitor poison/dead-letter depth and host health. Alert on poison-queue/DLQ growth, host “unreachable,” 5xx rate, and execution duration — not just “is it up.”
Tune concurrency to protect downstreams. Cap maxConcurrentCalls/functionAppScaleLimit so a scale-out doesn’t DDoS your own database; enable dynamic concurrency where it helps.
Size partitions up front. Event Hubs/Cosmos partition count is your scale ceiling and is painful to change later — provision for the peak you’ll plausibly hit.
Keep orchestrators deterministic. No clocks, randomness or I/O in orchestrator code; all side effects go in activities. This is the #1 Durable correctness rule.
Deploy from package, wire Application Insights from day one. WEBSITE_RUN_FROM_PACKAGE=1 gives atomic deploys and faster cold start; App Insights turns a two-hour mystery into a two-minute lookup.
Right-size the backing storage and keep it healthy. It’s a hard dependency for the host and the Durable task hub — don’t share it with a noisy workload, and watch it for throttling.
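The "reuse clients" practice above amounts to a lazily created, instance-wide singleton. A minimal sketch, assuming a hypothetical DownstreamClient class in place of a real HTTP, database or SDK client; the creation counter exists only to make the effect visible.

```python
import threading


class DownstreamClient:
    """Hypothetical expensive client (connection pool, TLS handshake, ...)."""
    created = 0

    def __init__(self):
        DownstreamClient.created += 1

    def send(self, payload):
        return f"sent:{payload}"


_client = None
_lock = threading.Lock()


def get_client():
    """Create the client on first use, then hand back the same object."""
    global _client
    if _client is None:
        with _lock:                 # guard against concurrent first calls
            if _client is None:
                _client = DownstreamClient()
    return _client


def handler(payload):
    # Every invocation on this instance shares one client.
    return get_client().send(payload)


for item in ("a", "b", "c"):
    handler(item)
print(DownstreamClient.created)
```

The client lives for as long as the instance does, so a cold start pays the setup cost once and later invocations reuse the open connections.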

Security notes

Managed identity over secrets. Use the function app’s system- or user-assigned managed identity for triggers, bindings and dependency calls; reserve Key Vault references for secrets that have no identity-based path. Grant least privilege — the specific data-plane role, scoped to the resource.
Lock down the HTTP surface. Function keys (authLevel) are not authentication. Put Easy Auth/Entra ID or API Management/Application Gateway + WAF in front of HTTP-triggered functions, and use access restrictions to limit who can reach the endpoint.
Private networking for outbound. On Flex/Premium/Dedicated, VNet-integrate and reach databases/PaaS via private endpoints so traffic never traverses the public internet; force outbound through the VNet where egress control matters.
Protect the backing storage account. It holds runtime state and the Durable task hub. Use identity-based access, restrict its firewall to the app, disable shared-key access where possible, and don’t expose it publicly.
Don’t leak in errors or health endpoints. Keep stack traces and internal topology out of HTTP responses and any health/diagnostic endpoint; send detail to App Insights, not the caller.
Secure the deployment supply chain. Use WEBSITE_RUN_FROM_PACKAGE from a trusted, access-controlled source; for container-based functions, pull from a private registry via managed identity and scan/pin images.
Rotate and scope what’s left. Any remaining keys (function keys, leftover connection strings) should be rotated and least-scoped; prefer to eliminate them.

The security controls and what each prevents:

Control	Mechanism	Secures against	Also prevents
Managed identity + RBAC	`identity` + data role	Secrets in plaintext settings	Rotation breaking the app
Key Vault references	`@Microsoft.KeyVault(...)`	Secret values in config	Hand-rolled secret handling
Easy Auth / APIM in front	Entra ID / APIM policy	Anonymous abuse of HTTP funcs	Key-only “auth” being bypassed
VNet integration + PE	Private outbound/inbound	Public-internet exposure of deps	Data exfil over public paths
Storage hardening	Firewall + identity, no shared key	Tampering with runtime/task hub	Host-takeover via storage
Run-from-package + scanning	Immutable, scanned artifact	Tampered/unknown code	Surprise breaking deploys

Cost & sizing

What drives the bill, by plan:

Consumption / Flex Consumption bill per execution count and resource consumption (GB-seconds: memory × duration), with a monthly free grant (roughly 1 million executions and 400,000 GB-s on classic Consumption; Flex Consumption has its own, smaller grant), so genuinely spiky/low workloads can cost near zero. Flex adds a charge for any alwaysReady instances you keep warm.
Premium (Elastic Premium) bills per vCPU-second and GB-second of allocated instances, always-on (minimum 1) — you pay for warm capacity whether or not it’s busy. It wins over Consumption when load is steady enough that per-execution pricing would exceed a flat warm plan, or when you need no-cold-start/VNet/long-timeouts.
Dedicated (App Service plan) is just the plan instance-hours you already pay for — marginal cost of adding functions is near zero if the plan has headroom.
Container Apps bills vCPU/GB per second of active replicas (scale-to-zero supported), plus request charges.

The cost levers and what each buys:

Cost driver	What you pay for	Rough INR/month (illustrative)	When it dominates
Consumption executions + GB-s	Per-run + memory×time (free grant first)	₹0–3,000 for spiky/low traffic	Bursty, low-to-moderate volume
Flex `alwaysReady` instances	Warm pool (per instance)	~₹3,000–6,000 per warm instance	Killing cold start on hot paths
Premium EP1 (1 instance)	Always-on vCPU/GB	~₹12,000–18,000	Steady load, warm + VNet + long runs
Dedicated (shared plan)	Plan instance-hours	marginal if plan exists	Co-located with web apps
Backing storage	Transactions + capacity	~₹200–1,500	High trigger/Durable churn
App Insights ingestion	Per-GB telemetry	~₹1,000–3,000	High-volume tracing (sample it)

Sizing guidance: start on Consumption/Flex and measure; if your monthly execution × duration cost approaches the price of an EP1, or you keep fighting cold starts/VNet, move to Premium. Keep functions short (GB-s is memory × duration, so halving duration halves that part of the bill), batch where it cuts invocation count, and enable Application Insights adaptive sampling so a traffic spike doesn’t spike the telemetry bill. The biggest hidden cost is a retry/poison loop silently reprocessing bad messages forever, so alert on poison depth before it runs up the meter. For the broader cost-control workflow, see Azure FinOps & Cost Management at Scale.
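The break-even reasoning above can be checked with a small estimator. This is a rough sketch: the free-grant figures are the approximate Consumption numbers quoted earlier, while the two unit prices are placeholder assumptions to be replaced with the current rates for your region and currency. It also ignores billing granularity such as memory rounding and minimum duration.

```python
import math

FREE_EXECUTIONS = 1_000_000  # approximate monthly grant quoted above
FREE_GB_SECONDS = 400_000


def consumption_monthly_cost(executions, avg_duration_s, memory_gb,
                             price_per_million_exec, price_per_gb_s):
    """Estimate a month's Consumption bill after the free grant."""
    gb_seconds = executions * avg_duration_s * memory_gb
    billable_exec = max(0, executions - FREE_EXECUTIONS)
    billable_gb_s = max(0.0, gb_seconds - FREE_GB_SECONDS)
    return (billable_exec / 1_000_000) * price_per_million_exec \
        + billable_gb_s * price_per_gb_s


# Placeholder prices (assumptions, not a price sheet).
PRICE_PER_MILLION_EXEC = 0.20
PRICE_PER_GB_S = 0.000016

cost = consumption_monthly_cost(
    executions=3_000_000, avg_duration_s=0.5, memory_gb=0.5,
    price_per_million_exec=PRICE_PER_MILLION_EXEC,
    price_per_gb_s=PRICE_PER_GB_S,
)
print(round(cost, 2), math.isclose(cost, 6.0))
```

Run it with measured execution counts and durations from App Insights, then compare the result with the flat monthly price of the Premium instance you would otherwise buy.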

Interview & exam questions

1. What is the difference between a trigger and a binding? A trigger is the single event that starts a function and supplies its payload (and is the scaling signal); a binding is a declarative input or output connection to a service. Every function has exactly one trigger and zero or more input/output bindings; the trigger is technically a special binding with direction trigger.

2. When would you choose Flex Consumption over Consumption? When you need scale-to-zero and features Consumption lacks — chiefly VNet integration (private outbound to databases/PaaS), alwaysReady warm instances to eliminate cold start, per-instance memory sizing, and per-instance concurrency control. Flex is the modern serverless default; plain Consumption is for the simplest spiky glue.

3. Why might a function execute the same message twice, and how do you make that safe? Queue/Service Bus/Event Hubs/Event Grid triggers deliver at least once; a crash or lock/visibility expiry mid-processing causes redelivery. You make it safe with idempotency — a dedup store keyed on the message id, an upsert by natural key, a conditional (ETag) write, or an idempotent downstream — so processing twice has the same effect as once.

4. What is a cold start and which plans eliminate it? Cold start is the latency the first request on a freshly created instance pays (sandbox allocation, worker start, code load, connection priming). Premium eliminates it with pre-warmed instances; Flex Consumption removes it for the hot path with alwaysReady instances; Dedicated stays warm (Always On). Plain Consumption cannot fully avoid it.

5. Why must Durable orchestrator functions be deterministic? The platform checkpoints an orchestrator’s progress and replays the function from the start to rebuild state after an await or restart. If the code uses non-deterministic operations (DateTime.Now, Guid.NewGuid(), direct I/O, non-durable awaits), replay diverges from the recorded history and the orchestration breaks. Use ctx.CurrentUtcDateTime, ctx.NewGuid(), and put all side effects in activity functions.

6. Describe the fan-out/fan-in pattern in Durable Functions. The orchestrator starts many activity functions in parallel (e.g. one per item), collecting their tasks, then awaits all of them (Task.WhenAll) and aggregates the results. It’s the pattern for parallelizing independent work and then reconciling — far simpler than hand-rolling parallel queue workers plus a join.

7. What is the maximum execution time on the Consumption plan, and what do you do about a longer job? 10 minutes (5-minute default, 10-minute hard cap). For longer work, move to Premium/Flex (long or unbounded timeout) or refactor to Durable Functions using the async HTTP pattern — return 202 immediately and run the long workflow as checkpointed orchestrator/activity steps that aren’t bound by a single function’s timeout.

8. How does the scale controller decide how many instances to run, and what caps it? It watches each trigger’s scaling signal — HTTP request rate, queue length, Event Hubs/Cosmos partition lag — and adds/removes instances in steps from zero to the plan max. The cap is the plan’s maximum plus trigger-specific ceilings: Event Hubs/Cosmos give one instance per partition, and you can set functionAppScaleLimit/WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT to bound it.

9. What happens to a message that keeps failing, on Storage queues vs Service Bus? On Storage queues, after maxDequeueCount (default 5) the message is moved to a <queue>-poison queue. On Service Bus, after maxDeliveryCount (default 10) it goes to the built-in dead-letter sub-queue. In both cases you must monitor and drain these or failures accumulate silently.

10. Why is the backing storage account so important to a function app? It holds runtime metadata, trigger leases/checkpoints, the Durable task hub, and (some plans) the deployment package via the AzureWebJobsStorage connection. If it’s unreachable, throttled, or its key rotates without updating the setting, the host fails to start (“runtime unreachable”) — a large share of “Functions is down” is really a storage problem.

11. How do you give a function passwordless access to Cosmos DB or Key Vault? Enable a managed identity on the function app and grant it the data-plane RBAC role on the target (e.g. Cosmos DB Built-in Data Contributor, Key Vault Secrets User), then use an identity-based connection (for the Cosmos DB bindings, <Name>__accountEndpoint; for Storage bindings, <Name>__serviceUri; add __credential=managedidentity and __clientId when the identity is user-assigned) or a Key Vault reference, with no connection strings or keys in app settings.

12. When is Azure Functions the wrong choice? For long-running, steady high-CPU compute (you’d pay more than a reserved VM and fight timeouts), ultra-low-latency APIs with a hard sub-100 ms p99 and zero cold-start tolerance, and workloads needing persistent local state or disk across invocations. Those fit App Service, AKS, Container Apps with min replicas, or VMs better.

These map to AZ-204 (Developer Associate) — implement Azure Functions (triggers, bindings, Durable Functions) and develop event-based and message-based solutions; AZ-104 touches the hosting/scaling/monitoring angle; and the networking/identity content (VNet integration, managed identity, private endpoints) reaches AZ-500/AZ-700. A compact cert mapping:

Question theme	Primary cert	Objective area
Triggers, bindings, Durable patterns	AZ-204	Implement Azure Functions; event/message solutions
Plans, scaling, cold start	AZ-204 / AZ-104	Implement & configure compute
Idempotency, poison/dead-letter	AZ-204	Message-based solutions
Managed identity, Key Vault refs	AZ-204 / AZ-500	Secure solutions; manage identity
VNet integration, private endpoints	AZ-700	Design & implement network connectivity
Monitoring with App Insights	AZ-204	Instrument, monitor & troubleshoot

Quick check

1. Your HTTP-triggered function needs to reach a private Azure SQL database and you want scale-to-zero. Which plan, and why not plain Consumption?
2. A queue-triggered function occasionally charges a customer twice. What property of the trigger explains this, and what’s the fix?
3. True or false: adding more instances will fix an Event Hubs trigger that can’t keep up with its backlog.
4. Your Durable orchestrator works on first run but behaves erratically after the host restarts mid-workflow. Name two things in the orchestrator code to check.
5. The function app shows “Azure Functions runtime is unreachable” and nothing runs. What’s the most likely root cause?

Answers

1. Flex Consumption: it offers scale-to-zero and VNet integration (so it can reach the private SQL endpoint), plus alwaysReady to kill cold start. Plain Consumption can’t VNet-integrate, so it can’t reach the private database.
2. The trigger delivers at least once; a crash or lock/visibility expiry mid-process causes redelivery, so the same message is processed twice. Fix with idempotency, e.g. a dedup store keyed on the order id checked before charging, or an idempotency key on the payment call.
3. False. Event Hubs scales to at most one instance per partition, so the partition count is the ceiling regardless of instance settings. Add partitions (at the source) or process larger batches faster; more instances alone won’t help.
4. Check that the orchestrator (a) uses ctx.CurrentUtcDateTime/ctx.NewGuid() instead of DateTime.Now/Guid.NewGuid(), and (b) performs no direct I/O and only awaits durable APIs (all side effects moved into activity functions). Non-determinism breaks replay.
5. The backing storage account (AzureWebJobsStorage) is broken or unreachable: a rotated access key not updated in the setting, a firewall now blocking the app, or a deleted account/container. Repair the connection (prefer identity-based) and the host starts.
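The determinism answer is easier to see with a toy replay engine. The sketch below is not the Durable Functions API: ToyContext, ReplayMismatch and both orchestrators are hypothetical, and the wall clock is injected as a function so the divergence can be reproduced on demand.

```python
class ReplayMismatch(Exception):
    pass


class ToyContext:
    """Toy stand-in for an orchestration context with recorded history."""

    def __init__(self, start_hour, history=None):
        self.current_hour = start_hour       # recorded once, same on every replay
        self.history = list(history or [])   # activity names in call order
        self.position = 0

    def call_activity(self, name):
        if self.position < len(self.history):
            recorded = self.history[self.position]
            if recorded != name:             # code took a different path than history
                raise ReplayMismatch(f"history has {recorded!r}, code asked for {name!r}")
        else:
            self.history.append(name)        # first execution: record the call
        self.position += 1
        return f"{name}-result"


def bad_orchestrator(ctx, wall_clock):
    # Reads the real clock, so the branch can differ between run and replay.
    step = "day_rate" if wall_clock() < 12 else "night_rate"
    return ctx.call_activity(step)


def good_orchestrator(ctx):
    # Uses the replay-stable time from the context instead.
    step = "day_rate" if ctx.current_hour < 12 else "night_rate"
    return ctx.call_activity(step)


first = ToyContext(start_hour=9)
bad_orchestrator(first, wall_clock=lambda: 9)       # original run at 09:00
replay = ToyContext(start_hour=9, history=first.history)
try:
    bad_orchestrator(replay, wall_clock=lambda: 18)  # replay after a restart at 18:00
except ReplayMismatch as err:
    print("replay diverged:", err)
```

good_orchestrator replays cleanly at any hour because its only time source is the value stored with the history, which is the role ctx.CurrentUtcDateTime plays in the answer above.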

Glossary

Azure Functions — Azure’s Functions-as-a-Service: run event-triggered code, scale automatically, and (on serverless plans) pay per execution with scale-to-zero.
Function — one handler with a single trigger and zero or more bindings; the unit of execution and billing.
Function app — the deployment, scaling, configuration and identity boundary that hosts one or more functions on a plan.
Trigger — the event that starts a function (HTTP, Timer, Queue, Service Bus, Event Hubs, Event Grid, Blob, Cosmos DB, Durable); also the scaling signal.
Binding — a declarative input or output connection to a service that removes client boilerplate; directions are in, out, trigger.
Hosting plan — Consumption, Flex Consumption, Premium (Elastic Premium), Dedicated (App Service), or Container Apps; decides scale, cold start, networking, timeout and billing.
Scale controller — the platform component that adds/removes instances based on each trigger’s signal, from zero to the plan maximum.
Instance — a worker sandbox running your function app; a new one incurs a cold start.
Cold start — first-request latency on a freshly created instance (sandbox + worker start + code load + connection priming).
Concurrency — how many invocations run at once per instance (e.g. batchSize, maxConcurrentCalls); multiplies with instances for throughput.
AzureWebJobsStorage — the app’s backing storage connection holding runtime state, trigger leases and the Durable task hub; the host won’t start without it.
Durable Functions — an extension providing stateful, long-running orchestration via orchestrator, activity and entity functions, checkpointed to a task hub and replayed deterministically.
Orchestrator function — coordinates a workflow; must be deterministic (no clocks, randomness or I/O — only durable APIs).
Activity function — does the actual work in a Durable workflow; side effects are allowed here.
Durable entity — a stateful, single-threaded actor keyed by id (aggregator pattern).
Task hub — the queues and tables in the storage account where Durable persists orchestration state.
At-least-once delivery — messaging triggers may deliver an event more than once and out of order; design for it with idempotency.
Idempotency — processing the same event twice has the same effect as once (dedup key, upsert, conditional write).
Poison / dead-letter queue — where a repeatedly-failing message is set aside (<queue>-poison for Storage; the DLQ sub-queue for Service Bus) so it stops blocking processing.
NCRONTAB — the six-field CRON (including a seconds field) used by the Timer trigger.
alwaysReady / pre-warmed instances — warm instances kept running (Flex / Premium) so the hot path never pays a cold start.

Next steps

You can now choose a plan, wire triggers and bindings, reason about scale and cold starts, orchestrate with Durable Functions, and make handlers idempotent and observable. Build outward:

Next: Reference Architecture: Serverless API on Azure — see these patterns assembled into a complete, production serverless system.
Related: Azure App Service vs Container Apps vs AKS — the upstream decision of whether Functions (vs containers) is the right compute model.
Related: Troubleshooting Azure App Service: 502/503, Cold Starts & Restart Loops — the same front-end/worker diagnostic reflexes apply to HTTP functions.
Related: Azure Monitor & Application Insights for Observability — instrument every invocation, dependency and failure across the pipeline.
Related: Azure Key Vault: Secrets, Keys & Certificates — get Key Vault references right so a missing secret never crash-loops the app.
Related: Deploy KEDA: Event-Driven Autoscaling with Kafka & Service Bus — the same event-driven scaling model on Kubernetes/Container Apps.