
Azure Standard Load Balancer Deep Dive: Outbound Rules, HA Ports, and Cross-Region Load Balancing

Standard Load Balancer is the Layer-4 plumbing almost every Azure network design sits on, and it is the layer people understand least. They reach for it as “a TCP load balancer,” wire up one rule, and never touch the parts that decide whether the system survives load: outbound rules that give you deterministic SNAT instead of a 2 a.m. port-exhaustion incident, HA Ports that make a firewall sandwich genuinely highly available, health-probe thresholds that decide whether a deploy drains gracefully or black-holes connections, and a global cross-region front end that fails a whole region over without a DNS change. The Azure Standard Load Balancer is a software-defined, zero-latency-added, pass-through L4 device: it does not terminate connections, it rewrites the destination (and optionally the source) of a 5-tuple and forwards the packet. That “pass-through” nature is exactly why its failure modes are subtle — there is no access log, no TLS to inspect, no request to trace, just flows that either complete or quietly die.

This is the engineering-grade walkthrough of every moving part, ending with a global anycast front end whose backend pool is other load balancers. Everything here is the Standard SKU. Basic Load Balancer retires 30 September 2025 — no SLA, no availability zones, no outbound rules, no HA Ports, no cross-region — so if you are still on Basic, migration is the first task, not an optimization. Because this is a reference you will return to mid-incident, the rule types, the SNAT maths, the probe knobs, the metrics, and the failure playbook are all laid out as scannable tables: read the prose once, then keep the tables open when SnatConnectionCount (Failed) starts climbing.

By the end you will stop guessing. When egress fails under a flash sale you will know whether you starved SNAT ports on one destination, whether implicit SNAT is shadowing your explicit rule, whether a “healthy” backend is lying because its probe is TCP, or whether your stateful firewall is resetting long flows because HA Ports does not guarantee path symmetry. Knowing which in ninety seconds is what separates a five-minute incident from a two-hour one.

What problem this solves

Standard LB exists to spread Layer-4 traffic across a pool of backends inside a region, to provide controlled, deterministic outbound internet access for those backends, and, with the cross-region (Global tier) load balancer, to give a TCP/UDP service a single global IP with automatic regional failover. The pain it removes is the pain of doing any of that by hand: round-robin DNS that ignores health, a single NAT box that becomes a bottleneck and a single point of failure, or a firewall pair with no safe way to balance across both nodes.

What breaks without engineering it properly is specific and recurring. An app that opens a new outbound connection per request exhausts the shared SNAT pool and throws intermittent dependency timeouts that pass in test and fail in production — the single most common Standard LB incident. A “TCP load balancer” with a TCP probe keeps routing traffic to a worker that returns 500 to every user, because the socket is still open. A firewall sandwich that passed its lab test resets every long-lived database and gRPC connection in production because return packets traverse a different stateful appliance than the forward packets. A team migrates inbound but forgets that Standard is secure by default — no implicit outbound — and every backend silently loses internet access the moment they remove the public IP from the NIC.

Who hits this: anyone running VMs or VM Scale Sets behind L4 in Azure, anyone inserting a network virtual appliance (NGFW, IDS/IPS, proxy) into the data path, anyone with chatty outbound calls to a small number of upstreams (payment APIs, partner endpoints, a shared database), and anyone designing multi-region active-active for a non-HTTP protocol that Azure Front Door (HTTP-only) cannot serve. The fix is almost never “make the LB bigger” — Standard LB has no instance size. It is “allocate ports explicitly, probe at Layer 7, engineer flow symmetry, and watch the SNAT metrics.”

To frame the whole field before the deep dive, here is every failure class this article covers, the question it forces, and where to look first:

| Failure class | What you actually see | First question to ask | First place to look | Most common single cause |
|---|---|---|---|---|
| SNAT port exhaustion | Intermittent outbound timeouts under load, fine at rest | Does it fail to one destination or all? | SnatConnectionCount (Failed) metric | New connection per request to one upstream |
| Implicit SNAT shadowing | Unpredictable port use; exhaustion despite a rule | Is disableOutboundSnat set on inbound rules? | LB rule config / ARM disableOutboundSnat | Inbound rule silently providing default SNAT |
| Backend “healthy” but failing | Users get 500/502 from a node the LB calls healthy | Is the probe TCP or HTTP? | DipAvailability vs real 5xx rate | TCP probe on an app that 500s |
| Asymmetric NVA reset | Long flows reset, short flows fine | Forward and return path same appliance? | Firewall session log (“no state”) | HA Ports without Floating IP / state sync |
| No regional failover | Region down, global IP keeps sending traffic | Is the regional probe honest? | Cross-region LB backend health | Regional LB still reports healthy |
| No outbound at all | New backends cannot reach the internet | Is there an explicit egress path? | Effective routes; outbound rule presence | Standard is secure-by-default (egress opt-in) |

Learning objectives

By the end of this article you can:

Prerequisites & where this fits

You should already understand the basics: a frontend IP configuration (the VIP — public or internal/private), a backend pool (the targets), a load-balancing rule (which frontend port maps to which backend port), a health probe (what “healthy” means), and an outbound rule (how the pool reaches the internet). You should know how SNAT (Source Network Address Translation) works in principle — a private IP:port rewritten to a public IP:port so return traffic finds its way home — and be comfortable running az in Cloud Shell, reading JSON output, and applying a Bicep or Terraform file. Familiarity with availability zones, NSGs, and UDRs helps; the Azure Virtual Network, Subnets and NSGs fundamentals are assumed.

This sits in the Networking track and is the L4 layer beneath almost everything else. It is the floor under Azure Multi-Region Active-Active Architecture, the SNAT-aware sibling of Diagnosing and Killing SNAT Port Exhaustion on Cloud NAT Gateways (NAT Gateway is the other egress path and often the better one), and the HA mechanism behind Deploying HA Third-Party NVAs in Azure: The Load Balancer Sandwich Pattern. Where you need Layer-7 features — path routing, WAF, edge TLS — you want Application Gateway instead; this article is strictly L4.

A quick map of who owns which layer during an incident, so you call the right person fast:

| Layer | What lives here | Who usually owns it | Failure classes it can cause |
|---|---|---|---|
| Client / DNS | Name resolution, retries | Frontend / SRE | Rarely the LB; usually a red herring |
| Cross-region (Global) LB | Anycast VIP, regional health | Network / platform | No failover (regional probe lies), wrong region steer |
| Regional Standard LB | Inbound rules, probes, outbound rules | Network team | SNAT exhaustion, probe false-healthy, no egress |
| Backend pool (VM/VMSS) | The app, the NIC, the zone | App / compute team | False-healthy app, zone imbalance |
| NVA subnet (firewall sandwich) | UDRs, NSGs, stateful FW | Security / network | Asymmetric reset, total-blast-radius NSG mistake |
| Outbound (SNAT / NAT GW) | Egress to APIs/DB | Platform + network | Port exhaustion under load |

Core concepts

Six mental models make every later diagnosis obvious.

Standard LB is pass-through, not a proxy. It rewrites the 5-tuple (destination — and for outbound, source) and forwards the packet; it never terminates the TCP connection. That is why it adds no measurable latency, sees no application data, and produces no access log. Visibility comes from metrics and VNet flow logs, not from the LB itself. Every troubleshooting instinct you have from an L7 proxy (read the access log, inspect the request) does not apply.

A SNAT port is keyed on the full 5-tuple, including the destination IP and port. You are not limited to ~64,000 total outbound connections; you are limited to ~64,000 simultaneous flows to the same destination IP and port per frontend public IP. Exhaustion almost always means many flows to one upstream behind a single VIP. This is the single most misunderstood fact about the device, and it is why “we only have 5,000 connections” still exhausts when they all go to one payment API.

Standard is secure by default — outbound is opt-in. Being in a backend pool does not grant a backend internet access. You must provide egress explicitly: an outbound rule on the LB, a NAT Gateway on the subnet, or an instance-level public IP. Remove a public IP from a NIC without adding one of these and the backend goes dark to the internet — a classic migration surprise.

Zone-redundant frontend, zone-spread backends — both or neither. A Standard LB frontend is zone-redundant by default (its VIP is served from all zones). But HA is only real if the backends span zones too. A zone-redundant frontend in front of a single-zone VMSS dies with that zone. Spread instances across zones 1/2/3 and the frontend keeps serving from the survivors.

The probe defines truth, and the wrong probe lies. The LB sends traffic only to instances the probe says are healthy. A TCP probe proves a socket is open; an HTTP/HTTPS probe proves the app answered 200. A wedged-but-listening process passes a TCP probe and fails every real request. Detection time is roughly interval × probe_threshold, and graceful drain (stop new flows, let established flows finish) is a property you sequence into deploys, not a setting.

HA Ports balances everything; it does not guarantee symmetry. An HA Ports rule (protocol All, ports 0/0, internal LB only) load-balances all ports and protocols at once — the only sane way to front a firewall whose port set you cannot enumerate. But a stateful NVA needs the return packet on the same appliance as the forward packet, and HA Ports hashes flows independently per direction. Symmetry is your job (Floating IP / DSR, vendor state sync, symmetric UDRs) — and getting it wrong is the most common HA-Ports incident.

The vocabulary in one table

Before the deep sections, pin down every moving part. The glossary repeats these for lookup; this is the model side by side:

| Concept | One-line definition | Where it lives | Why it matters |
|---|---|---|---|
| Frontend IP config | The VIP (public or internal) traffic enters on | On the LB | Each public IP = +64k SNAT ports |
| Backend pool | The set of targets (NIC-based or IP-based) | On the LB | IP-based pools cannot do outbound rules |
| Load-balancing rule | Frontend port → backend port mapping | On the LB | Can silently provide implicit SNAT |
| Inbound NAT rule | One frontend port → one backend instance | On the LB | Per-instance reach (SSH/RDP) |
| Health probe | What “healthy” means (TCP/HTTP/HTTPS) | On the LB | Wrong type → false-healthy backend |
| Outbound rule | Explicit SNAT egress + port allocation | On the LB | The deterministic-egress control |
| HA Ports rule | Protocol All, ports 0/0 (internal only) | On internal LB | Balances every port for NVAs |
| SNAT port | One outbound 5-tuple translation entry | Per frontend public IP (~64k) | Exhaustion → outbound failures |
| Floating IP (DSR) | Backend sees original VIP; return symmetric | On the rule | Required for stateful NVA symmetry |
| Cross-region LB | Global anycast VIP over regional LBs | Global tier | One static IP, DNS-free failover |
| DipAvailability | % of probes succeeding per backend | Metric | Your health/drain signal |
| VipAvailability | Whether the frontend datapath is up | Metric | The “is the VIP alive” signal |

Standard vs Gateway vs cross-region, and when each fits

Azure ships three load balancer “shapes.” They are not interchangeable, and picking the wrong one shows up as a missing feature or a redesign weeks later.

| SKU / type | Scope | Primary job | Outbound SNAT | HA Ports | Frontend |
|---|---|---|---|---|---|
| Standard (regional) | One region, zone-aware | General L4 load balancing for VMs/VMSS | Yes, via outbound rules | Yes (internal) | Public or internal |
| Gateway | One region | Transparent insertion of NVAs via service chaining | No (bump-in-the-wire) | N/A | Internal (chained) |
| Cross-region (Global) | Multi-region | Anycast global front end over regional LBs | No | No | Public (global) |

The mental model:

The decision distilled — match the requirement to the shape:

| If you need… | Use | Why not the others |
|---|---|---|
| Balance VMs/VMSS in one region | Standard regional | Gateway has no general rules; cross-region needs regional LBs underneath |
| Insert a firewall/IDS transparently in the path | Gateway LB | Standard needs HA Ports + UDRs; cross-region is global only |
| One static IP for a TCP/UDP service across regions | Cross-region LB | Front Door is HTTP-only; Traffic Manager is DNS/TTL-bound |
| L7 path routing / WAF / edge TLS | Application Gateway / Front Door | Standard LB is L4 only, with no HTTP awareness |
| Deterministic, allow-listable egress at scale | NAT Gateway (with or without LB) | LB outbound pre-carves ports; NAT GW allocates on demand |

The rest of this article uses the regional Standard LB for the core sections, layers HA Ports for the NVA case, then puts the cross-region LB on top.

Basic is retiring — what changes on the way to Standard

If you are still on Basic LB (retires 30 September 2025), migration is the first task. The features do not map 1:1, and several Standard behaviors are secure-by-default where Basic was permissive — so a lift-and-shift that ignores these breaks egress or HA:

| Aspect | Basic LB | Standard LB | Migration action |
|---|---|---|---|
| SLA | None | 99.99% (with ≥2 healthy backends) | Gain SLA; ensure 2+ instances |
| Availability zones | Not supported | Zone-redundant / zonal | Re-pin frontend + spread backends |
| Outbound rules | Not supported | Supported (explicit SNAT) | Add an explicit outbound rule |
| Default outbound access | Implicit, on | Secure by default (opt-in) | Add egress or backends go dark |
| HA Ports | Not supported | Internal LB only | Build the HA Ports rule on an internal LB |
| Public IP SKU | Basic | Standard (required) | Upgrade the PIP to Standard |
| Backend pool size | ~300 | ~1,000 | Re-architect large fleets if needed |
| Cross-region | Not supported | Supported (Global tier) | Layer a cross-region LB if multi-region |
| NSG requirement | Optional | Recommended/expected | Add NSGs (Standard assumes them) |

The one that silently bites: default outbound access. On Basic, a backend reached the internet implicitly; on Standard it does not until you add an outbound rule or NAT Gateway. Migrate the egress path in the same change as the LB, or every backend loses internet the moment the public IP leaves the NIC.

Frontend IPs and the rule types

A Standard LB is a collection of frontend IP configurations and the rules that bind them to a backend pool. Get the rule taxonomy straight — each does one job, and mixing them up is where implicit SNAT bites.

| Rule type | What it does | Frontend → backend | SNAT behaviour | Typical use |
|---|---|---|---|---|
| Load-balancing rule | Distributes a port across the whole pool | e.g. 443 → 8443 (all instances) | Provides implicit outbound SNAT unless disabled | Web/API traffic to a VMSS |
| HA Ports rule | Balances all ports/protocols at once | 0 → 0, protocol All (internal LB) | Implicit SNAT off by design (internal) | Active-active NVA sandwich |
| Inbound NAT rule | Maps one port to one specific instance | e.g. 50001 → VM3:22 | None | Per-instance SSH/RDP/jump |
| Inbound NAT pool (VMSS) | Range of ports → instances | 50000-50100 → VMSS:22 | None | SSH into VMSS instances |
| Outbound rule | Explicit egress + manual port allocation | pool → frontend public IP | Explicit SNAT (what you want) | Deterministic internet egress |

Each rule type has hard requirements — what it needs and where it’s allowed. Mismatch any of these and the create is rejected or the rule silently does nothing:

| Rule type | Needs a probe? | Pool type | Public or internal LB | Floating IP option | Key constraint |
|---|---|---|---|---|---|
| Load-balancing rule | Yes | NIC or IP-based | Either | Yes (DSR) | Implicit SNAT on unless disabled |
| HA Ports rule | Yes | NIC or IP-based | Internal only | Yes (needed for NVA) | One per LB; ports 0/0, protocol All |
| Inbound NAT rule | Optional | NIC-based | Either | Yes | One frontend port → one instance |
| Inbound NAT pool (VMSS) | Optional | VMSS NICs | Either | n/a | Port range mapped to instances |
| Outbound rule | No | NIC-based only | Public (egress IP) | n/a | IP-based pools unsupported |

The two constraints that trip people most: HA Ports is internal-LB-only, and outbound rules require a NIC-based pool. If you find yourself trying to put HA Ports on a public LB or an outbound rule on an IP-based pool, the design is wrong, not the syntax.
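As a rough illustration of those two constraints, here is a minimal Python sketch of a pre-deployment check. The function name `rule_problems` and the string labels for rule types are hypothetical; only the two constraints themselves come from the tables above, and Azure's own validation remains the authority.

```python
from typing import List


def rule_problems(rule_type: str, lb_is_public: bool, pool_is_nic_based: bool) -> List[str]:
    """Return human-readable problems for a planned rule (empty list = looks fine).

    rule_type is a hypothetical label: "load_balancing", "ha_ports",
    "inbound_nat", or "outbound". Only the two constraints called out in
    the article are checked here.
    """
    problems: List[str] = []
    if rule_type == "ha_ports" and lb_is_public:
        problems.append("HA Ports is allowed on internal load balancers only")
    if rule_type == "outbound" and not pool_is_nic_based:
        problems.append("outbound rules require a NIC-based backend pool")
    return problems


if __name__ == "__main__":
    # An HA Ports rule planned on a public LB is a design error, not a syntax error.
    print(rule_problems("ha_ports", lb_is_public=True, pool_is_nic_based=True))
```

A check like this is only useful as an early design review aid; it does not replace the create-time errors the platform returns.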

The frontend itself is public or internal, and zonal or zone-redundant. The defaults and the decision:

| Frontend property | Options | Default | When to change | Gotcha |
|---|---|---|---|---|
| Address type | Public / Internal (private) | (you choose) | Internal for east-west, NVA, internal services | Internal LBs can do HA Ports; public cannot |
| Zone behaviour | Zone-redundant / Zonal / No-zone | Zone-redundant (Standard) | Zonal only for latency/co-location pins | A zonal frontend dies with its zone |
| IP allocation | Static / Dynamic | Static (Standard PIP) | Always static for a stable VIP | Dynamic VIPs change on dealloc |
| Public IP SKU | Standard / Basic | Standard | Must be Standard with a Standard LB | Basic PIP + Standard LB is rejected |
| Inbound/outbound IP sharing | Same IP / separate IP | Often shared | Separate outbound IP keeps SNAT budget clean | Sharing mixes inbound + SNAT on one budget |
LOC=eastus
RG=rg-lb-prod

# Zone-redundant public frontend IP (Standard SKU, served from all zones).
az network public-ip create \
  --resource-group $RG --name pip-lb-fe \
  --location $LOC \
  --sku Standard --tier Regional \
  --allocation-method Static --zone 1 2 3

# Standard public LB with one frontend and an (initially empty) backend pool.
az network lb create \
  --resource-group $RG --name lb-app-prod \
  --location $LOC \
  --sku Standard \
  --public-ip-address pip-lb-fe \
  --frontend-ip-name fe-public \
  --backend-pool-name bep-app

Zonal vs zone-redundant is a real decision. A zone-redundant frontend survives a single zone loss transparently. A zonal frontend (pinned with a single --zone) is occasionally required for latency-sensitive or co-location designs, but it dies with its zone. Default to zone-redundant unless you have a specific, written reason not to.

Backend pool design: NIC-based vs IP-based, and zone alignment

A Standard LB backend pool can be defined two ways, and the choice constrains the entire design — especially outbound.

The trade-off in full:

| Aspect | NIC-based pool | IP-based pool |
|---|---|---|
| Membership unit | VM/VMSS NIC ipConfiguration | Raw private IP in the VNet |
| Outbound rules (SNAT) | Supported | Not supported |
| Lifecycle coupling | Tied to the compute resource | Decoupled (you manage IPs) |
| Best for | Standard VM/VMSS workloads | Pre-provisioned IPs, mixed/unmanaged backends |
| Auto-membership (VMSS) | Yes, via the scale set | Manual IP management |
| Cross-resource-group targets | Constrained | More flexible |

Zone alignment is the part that gets skipped. A Standard LB frontend is zone-redundant by default, but HA is only real if the backends span zones too. Spread VMSS instances across zones 1/2/3 and the frontend keeps serving from surviving zones when one fails. The zone model side by side:

| Backend zoning | Survives single-zone loss? | When to use | Watch-out |
|---|---|---|---|
| Zone-spread (1/2/3) | Yes; survivors keep serving | Default for HA | Cross-zone bandwidth has a (tiny) cost |
| Zonal (pinned to one zone) | No; dies with the zone | Latency/co-location pin only | Pair with a zonal frontend deliberately |
| No-zone (regional) | Best-effort (no zone guarantee) | Legacy / regions without zones | No explicit zone resilience |
| Mixed zonal + zone-redundant FE | Partial | Migration states | Easy to think you’re HA when you’re not |

See Azure Regions and Availability Zones for the zone model in depth; the rule here is simply both ends or neither.
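The "both ends or neither" rule can be sketched as a small check. This is a simplified model, not an Azure API: it assumes each end is described by the set of zones it occupies (a zone-redundant frontend is modelled as zones 1, 2 and 3), and the function name is hypothetical.

```python
def survives_any_single_zone_loss(frontend_zones: set, backend_zones: set) -> bool:
    """True if, whichever single zone fails, the frontend is still served
    from some zone AND at least one backend zone is still standing."""
    all_zones = set(frontend_zones) | set(backend_zones)
    if not all_zones:
        return False  # nothing deployed, nothing survives
    for lost in all_zones:
        frontend_left = set(frontend_zones) - {lost}
        backend_left = set(backend_zones) - {lost}
        if not frontend_left or not backend_left:
            return False
    return True


if __name__ == "__main__":
    zone_redundant = {"1", "2", "3"}
    print(survives_any_single_zone_loss(zone_redundant, {"1", "2", "3"}))  # both ends spread
    print(survives_any_single_zone_loss(zone_redundant, {"2"}))            # single-zone VMSS
```

The second call models the case the text warns about: a zone-redundant frontend over a single-zone scale set.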

Outbound rules and explicit SNAT port allocation

This is the part that prevents incidents. By default a Standard LB does not give backends outbound internet access just for being in a pool — Standard is secure by default, and egress is opt-in. The clean ways to provide it:

| Egress method | How it allocates ports | Best for | Cost | Limit / gotcha |
|---|---|---|---|---|
| Outbound rule (LB) | Pre-carved, manual per instance | LB already present; egress must be the LB VIP | PIP only | Pre-divides 64k; caps pool size if over-allocated |
| NAT Gateway | On-demand from a shared pool | Pure egress at scale; many destinations | Hourly + per-GB | Zonal (one per zone); separate article |
| Instance-level public IP | Per-instance dedicated | A handful of VMs needing own IP | PIP per VM | Doesn’t scale; management overhead |
| Default outbound (legacy) | Implicit, Microsoft-managed | Nothing (being retired) | None | Non-deterministic; do not rely on it |

A SNAT port is one entry in a translation table keyed on the full 5-tuple, including the destination IP and port. You are not limited to 64K total connections — you are limited to ~64K simultaneous flows to the same destination IP:port per frontend public IP. Exhaustion almost always means many flows to one upstream behind a single VIP.
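To make the per-destination keying concrete, here is a toy Python model of a translation table for one frontend IP. It is a teaching sketch under simplifying assumptions (one frontend IP, no per-instance carving, no port reuse timers) and is not how the platform is implemented; `ToySnatTable` and its methods are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple

PORTS_PER_FRONTEND_IP = 64_000  # the per-IP figure used throughout this article


class ToySnatTable:
    """Tracks which source ports are in use toward each destination (IP, port)."""

    def __init__(self, ports: int = PORTS_PER_FRONTEND_IP) -> None:
        self.ports = ports
        self.in_use: Dict[Tuple[str, int], Set[int]] = defaultdict(set)

    def open_flow(self, dest_ip: str, dest_port: int) -> Optional[int]:
        """Return a source port for a new flow, or None if this destination is exhausted."""
        used = self.in_use[(dest_ip, dest_port)]
        if len(used) >= self.ports:
            return None
        port = next(p for p in range(self.ports) if p not in used)
        used.add(port)
        return port

    def close_flow(self, dest_ip: str, dest_port: int, port: int) -> None:
        """Release a source port so it can be reused toward the same destination."""
        self.in_use[(dest_ip, dest_port)].discard(port)


if __name__ == "__main__":
    table = ToySnatTable(ports=3)  # tiny budget so exhaustion is easy to see
    for _ in range(4):
        print("payment API:", table.open_flow("203.0.113.10", 443))
    # The same source ports are still usable toward a different destination.
    print("other upstream:", table.open_flow("203.0.113.20", 443))
```

The point of the model is the last line: once one destination is exhausted, a flow to a different destination still gets a port, because the table is keyed per destination.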

With an outbound rule you allocate ports explicitly, pre-dividing the 64,000-port budget per frontend IP across the pool. The maths is unforgiving:

ports_per_instance = floor( (64,000 × frontend_IP_count) / backend_instance_count )

64,000 ports, 1 frontend IP, 50 instances  -> 1,280 ports each
64,000 ports, 1 frontend IP, 100 instances ->   640 ports each
64,000 ports, 2 frontend IPs, 100 instances -> 1,280 ports each
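The same arithmetic as a small Python helper, useful when planning against a maximum pool size. The 64,000 figure and the floor division come from the formula above; rounding down to a multiple of 8 is an assumption about how the platform accepts port counts and should be verified against current Azure documentation. Both function names are hypothetical.

```python
PORTS_PER_FRONTEND_IP = 64_000  # per frontend public IP, as used in this article


def ports_per_instance(frontend_ips: int, max_instances: int) -> int:
    """Largest per-instance allocation that still fits the whole pool."""
    if frontend_ips < 1 or max_instances < 1:
        raise ValueError("frontend_ips and max_instances must be >= 1")
    raw = (PORTS_PER_FRONTEND_IP * frontend_ips) // max_instances
    # Assumption: allocations are expected in multiples of 8, so round down.
    return raw - (raw % 8)


def max_instances_for(frontend_ips: int, ports_each: int) -> int:
    """How many instances a given per-instance allocation can serve."""
    if frontend_ips < 1 or ports_each < 1:
        raise ValueError("frontend_ips and ports_each must be >= 1")
    return (PORTS_PER_FRONTEND_IP * frontend_ips) // ports_each


if __name__ == "__main__":
    print(ports_per_instance(1, 50))    # matches the first worked line above
    print(max_instances_for(1, 1280))   # the pool-size cap that allocation implies
```

Running both directions is the useful habit: the first tells you what to allocate, the second tells you where scale-out will stop working with that allocation.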

The allocation table you actually plan against — note how adding frontend IPs (or a public IP prefix) is the lever that grows the budget:

| Frontend public IPs | Total SNAT ports | 50 instances | 100 instances | 200 instances |
|---|---|---|---|---|
| 1 IP | 64,000 | 1,280 / inst | 640 / inst | 320 / inst |
| 2 IPs | 128,000 | 2,560 / inst | 1,280 / inst | 640 / inst |
| 4 IPs | 256,000 | 5,120 / inst | 2,560 / inst | 1,280 / inst |
| /28 prefix (16 IPs) | 1,024,000 | 20,480 / inst | 10,240 / inst | 5,120 / inst |

Set it too high and you cap pool size (you can run out of ports to hand new instances); too low (the default auto-allocation is famously stingy) and busy instances exhaust ports while the pool looks half-idle. Always allocate manually, against your maximum intended pool size.

# Dedicated outbound frontend IP — do NOT share the inbound VIP for outbound
# if you can avoid it; a separate IP keeps the SNAT budget clean.
az network public-ip create \
  --resource-group $RG --name pip-lb-outbound \
  --sku Standard --allocation-method Static --zone 1 2 3

az network lb frontend-ip create \
  --resource-group $RG --lb-name lb-app-prod \
  --name fe-outbound --public-ip-address pip-lb-outbound

# Explicit outbound rule: manual port allocation, generous idle timeout,
# and TCP reset on idle so clients learn the flow is gone.
az network lb outbound-rule create \
  --resource-group $RG --lb-name lb-app-prod \
  --name obr-app \
  --frontend-ip-configs fe-outbound \
  --address-pool bep-app \
  --protocol All \
  --idle-timeout 15 \
  --enable-tcp-reset true \
  --outbound-ports 1280

The flags that matter, each with its default and the failure if you get it wrong:

| Flag (CLI) | ARM / Bicep | Default | Set it to | Failure if wrong |
|---|---|---|---|---|
| --outbound-ports | allocatedOutboundPorts | Auto (stingy) | floor(64k×IPs / max-instances) | Too low → exhaustion; too high → can’t add instances |
| --enable-tcp-reset | enableTcpReset | false | true | Idle flows dropped silently; clients hang |
| --idle-timeout | idleTimeoutInMinutes | 4 | 15-30 (or app keepalives) | Mid-idle drops on long-lived flows |
| --protocol | protocol | | All (TCP+UDP) | UDP egress missing if set to Tcp only |
| --frontend-ip-configs | frontendIPConfigurations | | A dedicated outbound IP | Sharing inbound VIP muddies the budget |

The durable fix for mid-idle drops is application keepalives, not a giant idle timeout. Each extra frontend IP (or a public IP prefix) adds another 64,000 ports. If you are fighting this maths at scale, that is the signal to move egress to NAT Gateway, which allocates ports on demand instead of pre-carving them.
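As one possible shape for application keepalives, here is a minimal Python sketch that turns on TCP keepalive for a client socket. The timing values are illustrative assumptions chosen to sit well below a multi-minute idle timeout, not recommendations from Azure; the per-socket tuning options are platform-dependent, so they are applied only where the `socket` module exposes them.

```python
import socket


def enable_keepalive(sock: socket.socket, idle_s: int = 60,
                     interval_s: int = 15, probes: int = 4) -> None:
    """Ask the OS to send TCP keepalives so an idle flow keeps its SNAT entry warm."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These tuning knobs exist on Linux and some other platforms; skip any that
    # this Python build does not expose rather than failing.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Most HTTP and database client libraries expose an equivalent setting; the important design choice is that the first keepalive fires well before the load balancer's idle timeout.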

Worked sizing against real workloads — pick the row closest to yours and read the verdict. The key variable is concurrent flows to the busiest single destination, not total throughput:

| Workload | Instances | Frontend IPs | Ports/instance | Peak flows to busiest dest | Verdict |
|---|---|---|---|---|---|
| Internal API, few egress calls | 10 | 1 | 6,400 | ~500 | Huge headroom; fine |
| Web tier → one DB VIP, pooled | 20 | 1 | 3,200 | ~1,500 | Comfortable with reuse |
| Payment fan-out, per-request conns | 50 | 1 | 1,280 | ~30,000 | Exhausts; reuse or add IPs |
| Payment fan-out, pooled clients | 50 | 1 | 1,280 | ~1,200 | Fine once connections are reused |
| Batch webhook fan-out | 100 | 1 | 640 | ~50,000 | Exhausts; needs NAT Gateway |
| Batch webhook fan-out | 100 | 4 (/30 prefix) | 2,560 | ~50,000 | Borderline; NAT Gateway better |
| Large fleet, many destinations | 200 | 2 | 640 | ~3,000 (spread) | Fine; load spread over dests |
| Large fleet, one hot destination | 200 | 2 | 640 | ~80,000 | Exhausts; shard or NAT GW |

The pattern is unmissable: the rows that exhaust all have many concurrent flows to one destination with no connection reuse. Fix reuse first (it collapses the flow count), then add IPs or move to NAT Gateway for genuine high-fan-out to a single upstream.
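Why reuse collapses the flow count can be estimated with Little's law (items in flight = arrival rate × time each item stays in the system). The sketch below is a back-of-the-envelope model only: the request rate and the time a port stays occupied are made-up inputs, not Azure figures, and the function names are hypothetical.

```python
def flows_with_connection_per_request(requests_per_s: float, port_hold_s: float) -> float:
    """Little's law: concurrent flows = request rate x seconds each flow occupies a port."""
    return requests_per_s * port_hold_s


def flows_with_pooled_client(pool_size: int) -> int:
    """With a reused connection pool, concurrent flows are capped by the pool size."""
    return pool_size


if __name__ == "__main__":
    allocated = 1280  # ports per instance, as in the 50-instance rows above
    # Assumed workload: 400 requests/s to one upstream, each port occupied ~5 s.
    per_request = flows_with_connection_per_request(400, 5.0)
    pooled = flows_with_pooled_client(50)
    print(per_request, "flows without reuse; over budget:", per_request > allocated)
    print(pooled, "flows with a pool; over budget:", pooled > allocated)
```

Under these assumed inputs the per-request pattern needs more ports than the instance has, while the pooled client stays far below the allocation, which is the same ordering of fixes the paragraph above recommends.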

The implicit-SNAT trap

A load-balancing rule silently provides implicit, unmanaged SNAT alongside any explicit outbound rule unless you turn it off. Two overlapping SNAT behaviors give you unpredictable port use and exhaustion you cannot reason about. The fix is one flag on the load-balancing rule, --disable-outbound-snat true in the CLI (disableOutboundSnat: true in ARM/Bicep), so that egress is governed only by your explicit outbound rule.

| Configuration | Outbound SNAT source | Determinism | Verdict |
|---|---|---|---|
| LB rule only, disableOutboundSnat=false | Implicit (auto, ~stingy) | Low | Default; exhausts early |
| LB rule + outbound rule, disableOutboundSnat=false | Both (overlapping) | Very low | The trap: unpredictable |
| LB rule (disableOutboundSnat=true) + outbound rule | Explicit rule only | High | Correct |
| NAT Gateway on subnet | NAT GW (on demand) | Highest | Best for pure egress |
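The trap row in this table is easy to detect mechanically. Below is a minimal Python sketch that scans an ARM-style load balancer document, already parsed into a dict, and lists load-balancing rules that still have implicit SNAT on while an outbound rule exists. The property names `loadBalancingRules`, `outboundRules`, and `disableOutboundSnat` are the ones used in this article; the function name and the sample input are hypothetical.

```python
from typing import Any, Dict, List


def implicit_snat_findings(lb: Dict[str, Any]) -> List[str]:
    """Names of load-balancing rules that overlap with an explicit outbound rule."""
    props = lb.get("properties", {})
    has_outbound_rule = bool(props.get("outboundRules"))
    findings: List[str] = []
    for rule in props.get("loadBalancingRules", []):
        # Treat a missing flag as false, matching the default row in the table.
        disabled = rule.get("properties", {}).get("disableOutboundSnat", False)
        if has_outbound_rule and not disabled:
            findings.append(rule.get("name", "<unnamed>"))
    return findings


if __name__ == "__main__":
    sample = {
        "properties": {
            "loadBalancingRules": [
                {"name": "rule-https", "properties": {"disableOutboundSnat": False}},
                {"name": "rule-api", "properties": {"disableOutboundSnat": True}},
            ],
            "outboundRules": [{"name": "obr-app"}],
        }
    }
    print(implicit_snat_findings(sample))
```

Feeding it the JSON from an exported template or a show command is one way to use it; the exact export shape may differ, so treat the dict layout here as an assumption.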

HA Ports for active-active NVAs and firewall sandwiches

HA Ports makes an internal Standard LB load-balance all ports and all protocols with one rule. It exists for the network virtual appliance case: you cannot enumerate every port a firewall must pass, so you balance the whole flow space at once. An HA Ports rule is just a load-balancing rule with protocol All and both frontendPort and backendPort set to 0. It is available on internal Standard LBs only (not public).

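# Internal LB in front of the active-active NVA pool.
az network lb create \
  --resource-group $RG --name lb-nva-internal \
  --sku Standard \
  --vnet-name vnet-hub --subnet snet-nva-frontend \
  --frontend-ip-name fe-nva --private-ip-address 10.0.10.4 \
  --backend-pool-name bep-nva

# Liveness probe for the NVAs. The rule below references it, so create it first.
# TCP on port 443 is a placeholder assumption; use the health endpoint your
# NVA vendor documents.
az network lb probe create \
  --resource-group $RG --lb-name lb-nva-internal \
  --name probe-nva \
  --protocol Tcp --port 443 \
  --interval 5 --probe-threshold 2

# HA Ports: protocol All, ports 0/0 (every port, every protocol).
az network lb rule create \
  --resource-group $RG --lb-name lb-nva-internal \
  --name rule-haports \
  --protocol All --frontend-port 0 --backend-port 0 \
  --frontend-ip-name fe-nva \
  --backend-pool-name bep-nva \
  --probe-name probe-nva \
  --enable-tcp-reset true \
  --idle-timeout 15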

The classic topology is the firewall sandwich: an external/internal LB pair around an active-active NVA pool, HA Ports on the internal side. The design rules that decide whether it actually works:

| Design rule | Why it matters | If you skip it |
|---|---|---|
| Symmetric routing | Stateful NVA needs return packet on the same appliance | Mid-stream resets (“no matching state”) on long flows |
| Floating IP (DSR) | Backend sees original VIP; keeps routing symmetric | Return path diverges; asymmetric drops |
| Vendor session-state sync | Any appliance can handle any packet of a flow | Rebalance/probe event drops in-flight sessions |
| Per-NVA liveness probe | Pulls a wedged appliance out of rotation | Black-holes traffic to a hung-but-listening FW |
| Treat the NVA subnet as prod-critical | HA Ports = no per-port blast radius | One bad NSG/UDR breaks all protocols at once |

The two non-negotiables, spelled out:

HA Ports balances everything, so a misconfigured NSG or UDR on the NVA subnet now affects all protocols at once. There is no per-port blast radius anymore — treat that subnet as production-critical and test failover explicitly.
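The symmetry problem described in this section can be illustrated with a toy hashing model in Python. This is not Azure's distribution algorithm; it only shows the general mechanism the text describes: if each direction of a flow is hashed on its own 5-tuple, forward and return packets can land on different appliances, whereas a direction-agnostic key always picks the same one. All names and addresses are hypothetical.

```python
import hashlib
from typing import List, Tuple


def _pick(appliances: List[str], key: Tuple) -> str:
    """Deterministically map a key to one appliance."""
    digest = hashlib.sha256(repr(key).encode()).digest()
    return appliances[int.from_bytes(digest[:8], "big") % len(appliances)]


def direction_sensitive(appliances: List[str], src: str, sport: int,
                        dst: str, dport: int, proto: str = "tcp") -> str:
    """Hash the 5-tuple exactly as seen on the wire (order matters)."""
    return _pick(appliances, (src, sport, dst, dport, proto))


def direction_agnostic(appliances: List[str], src: str, sport: int,
                       dst: str, dport: int, proto: str = "tcp") -> str:
    """Hash a canonical form of the flow so both directions share one key."""
    first, second = sorted([(src, sport), (dst, dport)])
    return _pick(appliances, (first, second, proto))


if __name__ == "__main__":
    fws = ["fw-a", "fw-b"]
    flows = [("10.0.1.5", 40000 + i, "10.0.2.9", 5432) for i in range(200)]
    split = sum(
        direction_sensitive(fws, s, sp, d, dp) != direction_sensitive(fws, d, dp, s, sp)
        for s, sp, d, dp in flows
    )
    print("flows whose two directions hit different appliances:", split)
```

In this model, any flow counted in `split` is one a stateful appliance would see only half of, which is the "no matching state" reset from the design-rule table.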

Health probe protocols, thresholds, and graceful drain

Probes decide what “healthy” means, and the defaults are rarely what you want for a zero-downtime deploy. Standard LB supports TCP, HTTP, and HTTPS probes.

| Probe type | Healthy when | Proves | Use it for | Limit |
|---|---|---|---|---|
| TCP | 3-way handshake completes on the port | The port is open | Non-HTTP backends; cheapest | A wedged app that still listens passes |
| HTTP | GET on the path returns HTTP 200 | The app answered | Web backends | Slightly more overhead than TCP |
| HTTPS | GET over TLS returns HTTP 200 | The app answered over TLS | Encrypted-probe requirements | Cert/TLS handling on the backend |

Prefer an HTTP/HTTPS probe against a real /healthz over TCP wherever the backend speaks HTTP. A TCP probe stays “healthy” while the app returns 500s to every user, because the socket is still open. Only an L7 probe catches a wedged-but-listening process.

az network lb probe create \
  --resource-group $RG --lb-name lb-app-prod \
  --name probe-app \
  --protocol Http --port 8080 --path /healthz \
  --interval 5 --probe-threshold 2

The probe knobs, their ranges, and the trade-off each controls:

| Setting (CLI) | ARM / Bicep | Default | Range | Trade-off |
|---|---|---|---|---|
| --protocol | protocol | | Tcp / Http / Https | TCP = cheap but blind; HTTP = true health |
| --port | port | | 1-65535 | Must match the listening/health port |
| --path | requestPath | none (HTTP/S only) | any path returning 200 | Keep it shallow and honest |
| --interval | intervalInSeconds | 15 (min 5) | 5-2147483646 | Tighter = faster detect, more flap risk |
| --probe-threshold | numberOfProbes / probeThreshold | 1-2 | ≥1 | Higher rides blips; lower evicts fast |

Detection time is roughly interval × probe_threshold (about 10 s with a 5 s interval and a threshold of 2). A tighter setting flaps on a merely slow backend; a looser one keeps sending traffic to a dead node. A sizing guide:

| interval × threshold | Detect time | Good for | Risk |
|---|---|---|---|
| 5s × 2 | ~10s | Fast eviction of dead nodes | Flaps a momentarily-slow node |
| 5s × 3 | ~15s | Balanced default | Slightly slower eviction |
| 15s × 2 | ~30s | Stable, flap-averse | Dead node serves up to ~30s |
| 30s × 3 | ~90s | Very stable backends | Slow to pull a failed node |
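The rows above follow directly from the rule of thumb in the text, which can be written as a one-line helper. This is a planning approximation only (real detection also depends on probe timing and response latency); the function name is hypothetical and the minimum interval of 5 seconds is taken from the knob table above.

```python
MIN_INTERVAL_S = 5  # minimum probe interval listed in the knob table


def detection_seconds(interval_s: int, probe_threshold: int) -> int:
    """Approximate time to mark a dead backend unhealthy: interval x threshold."""
    if interval_s < MIN_INTERVAL_S:
        raise ValueError(f"interval must be >= {MIN_INTERVAL_S} seconds")
    if probe_threshold < 1:
        raise ValueError("probe_threshold must be >= 1")
    return interval_s * probe_threshold


if __name__ == "__main__":
    for interval, threshold in [(5, 2), (5, 3), (15, 2), (30, 3)]:
        print(f"{interval}s x {threshold} -> ~{detection_seconds(interval, threshold)}s")
```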

Graceful drain is the other half. When a probe starts failing (or you pull an instance from the pool), Standard LB stops new flows to it but does not kill established TCP connections — existing flows continue until they close or hit the idle timeout. So the clean deploy sequence is:

| Step | Action | What the LB does | Why |
|---|---|---|---|
| 1 | Flip the instance’s /healthz to non-200 (or stop the app gracefully) | Probe begins failing | Signal intent to drain |
| 2 | Wait interval × threshold | Marks instance unhealthy; stops new flows | No new traffic lands on it |
| 3 | Wait out the drain window | Established flows finish naturally | In-flight requests complete |
| 4 | Recycle, bring /healthz back | Probe succeeds; rejoins rotation | Instance returns warm |

This is the orchestration that VMSS rolling upgrades and App Service slot swaps lean on under the hood. Deploys that black-hole requests almost always skipped one of the waits: the detection wait in step 2 or the drain window in step 3.
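On the application side, step 1 of that sequence needs a health endpoint the deploy tooling can flip. Here is a minimal Python sketch using only the standard library. The `draining` flag, the choice of 503 as the non-200 status, and the helper names are illustrative assumptions; a real service would wire the flag to its own shutdown signal.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

draining = threading.Event()  # set this to start failing the probe (step 1)


def health_status() -> int:
    """200 while serving normally, 503 once the instance is draining."""
    return 503 if draining.is_set() else 200


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status = health_status() if self.path == "/healthz" else 404
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format: str, *args) -> None:
        pass  # keep probe traffic out of the logs


def total_drain_wait_s(interval_s: int, probe_threshold: int, drain_window_s: int) -> int:
    """Steps 2 + 3: detection wait plus the window for in-flight requests."""
    return interval_s * probe_threshold + drain_window_s


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Build the health server; call serve_forever() on the result to run it."""
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

A deploy script would set `draining`, sleep for `total_drain_wait_s(5, 2, 30)` seconds (the 30-second drain window is an example value), recycle the process, and let the fresh instance answer 200 again.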

A reference deployment in Bicep and Terraform

Here is the regional public LB, NIC-style pool, explicit outbound rule, HTTP probe, and load-balancing rule as one coherent reference — the shape you want in the repo, not a pile of CLI commands. First Bicep:

param location string = resourceGroup().location

resource pip 'Microsoft.Network/publicIPAddresses@2023-11-01' = {
  name: 'pip-lb-fe'
  location: location
  sku: { name: 'Standard' }
  zones: [ '1', '2', '3' ]
  properties: { publicIPAllocationMethod: 'Static' }
}

resource lb 'Microsoft.Network/loadBalancers@2023-11-01' = {
  name: 'lb-app-prod'
  location: location
  sku: { name: 'Standard' }
  properties: {
    frontendIPConfigurations: [ {
      name: 'fe-public'
      properties: { publicIPAddress: { id: pip.id } }
    } ]
    backendAddressPools: [ { name: 'bep-app' } ]
    probes: [ {
      name: 'probe-app'
      properties: { protocol: 'Http', port: 8080, requestPath: '/healthz', intervalInSeconds: 5, probeThreshold: 2 }   // probeThreshold is the honored threshold property
    } ]
    loadBalancingRules: [ {
      name: 'rule-https'
      properties: {
        protocol: 'Tcp'
        frontendPort: 443
        backendPort: 8443
        idleTimeoutInMinutes: 15
        enableTcpReset: true
        disableOutboundSnat: true   // outbound handled by the explicit rule below
        frontendIPConfiguration: { id: resourceId('Microsoft.Network/loadBalancers/frontendIPConfigurations', 'lb-app-prod', 'fe-public') }
        backendAddressPool: { id: resourceId('Microsoft.Network/loadBalancers/backendAddressPools', 'lb-app-prod', 'bep-app') }
        probe: { id: resourceId('Microsoft.Network/loadBalancers/probes', 'lb-app-prod', 'probe-app') }
      }
    } ]
    outboundRules: [ {
      name: 'obr-app'
      properties: {
        protocol: 'All'
        allocatedOutboundPorts: 1280
        idleTimeoutInMinutes: 15
        enableTcpReset: true
        frontendIPConfigurations: [ { id: resourceId('Microsoft.Network/loadBalancers/frontendIPConfigurations', 'lb-app-prod', 'fe-public') } ]
        backendAddressPool: { id: resourceId('Microsoft.Network/loadBalancers/backendAddressPools', 'lb-app-prod', 'bep-app') }
      }
    } ]
  }
}

The same shape in Terraform, which is what many teams keep in the repo:

resource "azurerm_public_ip" "lb_fe" {
  name                = "pip-lb-fe"
  resource_group_name = var.rg
  location            = var.location
  allocation_method   = "Static"
  sku                 = "Standard"
  zones               = ["1", "2", "3"]
}

resource "azurerm_lb" "app" {
  name                = "lb-app-prod"
  resource_group_name = var.rg
  location            = var.location
  sku                 = "Standard"
  frontend_ip_configuration {
    name                 = "fe-public"
    public_ip_address_id = azurerm_public_ip.lb_fe.id
  }
}

resource "azurerm_lb_backend_address_pool" "app" {
  name            = "bep-app"
  loadbalancer_id = azurerm_lb.app.id
}

resource "azurerm_lb_probe" "app" {
  name                = "probe-app"
  loadbalancer_id     = azurerm_lb.app.id
  protocol            = "Http"
  port                = 8080
  request_path        = "/healthz"
  interval_in_seconds = 5
  probe_threshold     = 2 # consecutive probe results needed to change health state
}

resource "azurerm_lb_rule" "app" {
  name                           = "rule-https"
  loadbalancer_id                = azurerm_lb.app.id
  protocol                       = "Tcp"
  frontend_port                  = 443
  backend_port                   = 8443
  frontend_ip_configuration_name = "fe-public"
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.app.id]
  probe_id                       = azurerm_lb_probe.app.id
  idle_timeout_in_minutes        = 15
  enable_tcp_reset               = true
  disable_outbound_snat          = true # outbound handled by the explicit rule below
}

resource "azurerm_lb_outbound_rule" "app" {
  name                     = "obr-app"
  loadbalancer_id          = azurerm_lb.app.id
  protocol                 = "All"
  backend_address_pool_id  = azurerm_lb_backend_address_pool.app.id
  allocated_outbound_ports = 1280
  idle_timeout_in_minutes  = 15
  enable_tcp_reset         = true
  frontend_ip_configuration {
    name = "fe-public"
  }
}

The detail that bites people: set disable_outbound_snat = true on the load-balancing rule (disableOutboundSnat in ARM/Bicep) so the inbound rule does not silently provide implicit, unmanaged SNAT alongside your explicit outbound rule. Without it you get two overlapping SNAT behaviors and unpredictable port use. The CLI, Bicep/ARM, and Terraform names you will reach for:

| Concept | CLI flag | Bicep / ARM property | Terraform argument |
|---|---|---|---|
| Disable implicit SNAT | --disable-outbound-snat | disableOutboundSnat | disable_outbound_snat |
| Allocated SNAT ports | --outbound-ports | allocatedOutboundPorts | allocated_outbound_ports |
| Idle timeout | --idle-timeout | idleTimeoutInMinutes | idle_timeout_in_minutes |
| TCP reset on idle | --enable-tcp-reset | enableTcpReset | enable_tcp_reset |
| Floating IP (DSR) | --floating-ip | enableFloatingIP | enable_floating_ip |
| Probe threshold | --probe-threshold | probeThreshold | probe_threshold |

Cross-region load balancer: global front end, regional pools, failover

The cross-region (Global) LB gives you a single static anycast IP from Microsoft’s edge, with a backend pool of regional Standard load balancers. Traffic enters at the closest edge and steers to the closest healthy region; if a region’s LB goes unhealthy, flows shift to the next automatically — no DNS TTL to wait out, because the IP never changes.

# Global LB lives in a supported "home region" but serves globally.
az network public-ip create \
  --resource-group rg-global --name pip-global \
  --sku Standard --tier Global --allocation-method Static

az network cross-region-lb create \
  --resource-group rg-global --name lb-global \
  --frontend-ip-name fe-global \
  --public-ip-address pip-global \
  --backend-pool-name bep-regions

# Backend members are the *frontend IP configs of regional Standard LBs*.
az network cross-region-lb address-pool address add \
  --resource-group rg-global --lb-name lb-global \
  --pool-name bep-regions --name eastus-lb \
  --frontend-ip-address "$EASTUS_LB_FE_ID"

az network cross-region-lb address-pool address add \
  --resource-group rg-global --lb-name lb-global \
  --pool-name bep-regions --name westeurope-lb \
  --frontend-ip-address "$WESTEUROPE_LB_FE_ID"

What to internalize about the global LB starts with where it sits among the global-routing options, so you pick the right global layer:

| Global option | Layer | Routing basis | Failover speed | Static IP | Protocols |
|---|---|---|---|---|---|
| Cross-region LB | L4 | Geo-proximity (latency) | Seconds, no DNS | Yes (anycast) | Any TCP/UDP |
| Front Door | L7 | Latency / priority / weighted | Seconds (edge) | No (anycast hostname) | HTTP/S only |
| Traffic Manager | DNS | Performance / priority / geo / weighted | DNS TTL-bound (minutes) | No (DNS) | Any (DNS-level) |
| Anycast accelerator | L4 | Edge anycast | Seconds | Yes | TCP/UDP |

This is the cleanest way to give a TCP/UDP service (not just HTTP) one global IP with regional failover, something neither Traffic Manager (DNS/TTL-bound) nor Front Door (HTTP-only) can do on its own. For the edge-anycast variant and latency engineering, see Anycast at the Edge; for the broader pattern, see Azure Multi-Region Active-Active Architecture.

How the global LB behaves in each failure and routing case — what actually happens to a flow, and what holds constant:

| Event | What the cross-region LB does | Client impact | What stays constant |
|---|---|---|---|
| Normal steady state | Routes to closest healthy region by latency | Lowest-latency region | Global static IP |
| One region’s LB goes unhealthy | Stops sending to it; shifts to next-closest | Brief reconnect, no DNS wait | Global static IP |
| Failed region recovers | Re-includes it once probes pass | Gradual return of nearby traffic | Global static IP |
| New region added to pool | Starts steering nearby clients to it | More local routing | Global static IP |
| All regions unhealthy | No healthy backend; connections fail | Outage (by definition) | IP still answers, no target |
| Client moves geographically | Re-steered to new closest region | Lower latency from new location | Global static IP |
| Regional probe lies “healthy” | Keeps sending to a degraded region | Errors with no failover | (the bug — fix the probe) |

The single design dependency to burn in: the global LB only fails over as well as your regional probes report. A dishonest regional probe (TCP, or / that always 200s) is the difference between automatic failover and a global outage that points at a region that “looks” up.
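
To make that dependency concrete, here is a toy model of the steering decision. It is only a sketch: the function name, the input format, and the latency figures are invented for illustration, and it is not Azure's actual steering algorithm. It picks the lowest-latency region whose regional LB reports healthy.

```shell
# Toy model only: one line per region on stdin,
# "<region> <latency_ms> <healthy|unhealthy>" (hypothetical format).
# Prints the lowest-latency region reported healthy;
# exits non-zero when no region is healthy.
pick_region() {
  awk '$3 == "healthy" && (best == "" || $2 + 0 < min) { best = $1; min = $2 + 0 }
       END { if (best == "") exit 1; print best }'
}

# Both regions healthy: the closer one wins.
printf '%s\n' "eastus 20 healthy" "westeurope 95 healthy" | pick_region      # prints eastus

# eastus honestly reports unhealthy: traffic shifts.
printf '%s\n' "eastus 20 unhealthy" "westeurope 95 healthy" | pick_region    # prints westeurope
```

The decision only ever sees the reported health column. If eastus is degraded but its probe still says healthy, the first case applies and traffic keeps landing there.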

Diagnostics: metrics, SNAT counts, and the queries that matter

Before the metrics, the hard numbers — the limits and quotas you design against. Most “why did it fall over” moments are one of these ceilings, and knowing the real figure (not a guess) is half the diagnosis:

| Limit / quota | Standard LB value | What hits it | Symptom at the ceiling | Lever to raise it |
|---|---|---|---|---|
| SNAT ports per frontend public IP | ~64,000 | Flows to one destination IP:port | SnatConnectionCount Failed > 0 | Add public IPs / a prefix; NAT Gateway |
| Backend pool size (NIC-based) | up to ~1,000 instances | Very large VMSS fleets | Can’t add members | Split pools / multiple LBs |
| Frontend IP configurations | up to ~600 per LB | Many VIPs on one LB | Create fails at the cap | Use additional LBs |
| Load-balancing + outbound + NAT rules | up to ~1,500 per LB | Rule-heavy designs | Rule create fails | Consolidate; multiple LBs |
| Probe interval (minimum) | 5 seconds | Fast detection needs | Can’t go tighter | Tune threshold instead |
| Idle timeout range | 4-100 minutes | Long-lived idle flows | Mid-idle drop below your value | App keepalives + raise timeout |
| Public IP prefix size | /28 to /31 (16 down to 2 IPs) | Allow-listable egress block | Prefix too small for budget | Allocate a larger prefix |
| Cross-region LB backend members | regional LB frontends | Multi-region fan-out | | Add regional LBs to the pool |
| HA Ports rules per internal LB | 1 (it’s “all ports”) | NVA sandwich | | N/A (one rule covers all) |
| TCP reset on idle | off by default | Silent idle drops | Clients hang, don’t retry | enableTcpReset=true |

These are the figures that matter in practice; Azure publishes the authoritative current limits per subscription/region, and a few are soft (raisable via support). The mechanism — per-destination SNAT, per-IP 64k — never changes even as the published caps shift, so design to the mechanism.
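
Designing to the mechanism mostly comes down to one division. The sketch below sizes the per-instance port allocation from the number of frontend IPs and the maximum pool size. The helper names are hypothetical, 64000 is the approximate per-IP figure used in this article, and rounding down to a multiple of 8 is an assumption about the allocation granularity that you should check against the current documentation.

```shell
# Hypothetical helpers; 64000 is the approximate per-IP figure from this article.
SNAT_PORTS_PER_IP=64000

# Ports each backend instance can be given, sized for the MAXIMUM pool size.
# Rounded down to a multiple of 8 (assumed allocation granularity).
ports_per_instance() {
  local frontend_ips=$1 max_pool_size=$2
  local raw=$(( frontend_ips * SNAT_PORTS_PER_IP / max_pool_size ))
  echo $(( raw / 8 * 8 ))
}

# Largest pool a given per-instance allocation can support.
max_pool_for() {
  local frontend_ips=$1 ports=$2
  echo $(( frontend_ips * SNAT_PORTS_PER_IP / ports ))
}

ports_per_instance 1 50   # prints 1280: one IP shared by 50 instances
ports_per_instance 2 50   # prints 2560: a second IP doubles the budget
max_pool_for 1 1280       # prints 50: the pool ceiling at 1280 ports each
```

The last line is the trade-off to remember: pre-carving 1280 ports per instance on a single IP caps the pool at 50 instances.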

Standard LB emits multi-dimensional metrics under Microsoft.Network/loadBalancers. Because the LB has no access log, these metrics plus VNet flow logs are your only visibility. The ones worth alerting on:

| Metric | What it measures | Split by | Watch for | What it confirms |
|---|---|---|---|---|
| SnatConnectionCount | Established SNAT flows | ConnectionState (Pending/Failed) | Rising Failed | Port exhaustion (the canary) |
| AllocatedSnatPorts | Ports budgeted per backend | backend | Baseline | Your configured ceiling |
| UsedSnatPorts | Ports actually consumed | backend | Used → Allocated | How close to the ceiling you are |
| DipAvailability | % probes succeeding per backend (Health Probe Status) | backend | Drops below 100% | Backend health / drain signal |
| VipAvailability | Datapath availability of the frontend | frontend | Drops below 100% | Whether the VIP itself is up |
| ByteCount / PacketCount / SYNCount | Throughput and new-connection rate | direction | Sudden spikes | Load / SYN-flood patterns |

A KQL query to catch SNAT pressure before users do:

AzureMetrics
| where ResourceProvider == "MICROSOFT.NETWORK"
| where ResourceId has "/LOADBALANCERS/LB-APP-PROD"
| where MetricName in ("UsedSnatPorts", "AllocatedSnatPorts")
| summarize Used = sumif(Total, MetricName == "UsedSnatPorts"),
            Allocated = sumif(Total, MetricName == "AllocatedSnatPorts")
            by bin(TimeGenerated, 5m)
| where Allocated > 0   // skip empty bins so the ratio never divides by zero
| extend UtilizationPct = round(100.0 * Used / Allocated, 1)
| order by TimeGenerated desc

Alert on SnatConnectionCount with ConnectionState == Failed greater than 0 over 5 minutes — sustained failed SNAT means you are at the ceiling, and the fix is more frontend IPs, higher per-instance ports, or NAT Gateway. The alerts worth wiring before the next incident — leading indicators, not the lagging “VIP down”:

| Alert on | Metric / dimension | Threshold (starting point) | Why it’s leading |
|---|---|---|---|
| SNAT failures | SnatConnectionCount (Failed) | > 0 sustained 5 min | First sign of exhaustion before timeouts spike |
| SNAT utilization | UsedSnatPorts / AllocatedSnatPorts | > 80% for 10 min | Predicts exhaustion with headroom to act |
| Backend health | DipAvailability | < 100% for 5 min | Catches probe failures / drain issues |
| Datapath | VipAvailability | < 100% for 5 min | The VIP itself is degraded |
| Connection rate | SYNCount | unusual spike | Load surge or SYN-flood pattern |
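
The two SNAT rows reduce to a small decision that is easy to sketch. The classifier below is hypothetical (the function name and the level names are invented here) and only mirrors the starting thresholds above. It ignores the "sustained for N minutes" part, which the alerting system would normally handle.

```shell
# Hypothetical classifier; thresholds mirror the starting points above.
# Args: used ports, allocated ports, failed SNAT connections in the window.
snat_alert_level() {
  local used=$1 allocated=$2 failed=$3
  if [ "$failed" -gt 0 ]; then
    echo critical     # exhaustion is already happening
  elif [ $(( used * 100 )) -gt $(( allocated * 80 )) ]; then
    echo warning      # above 80% utilization: act while there is headroom
  else
    echo ok
  fi
}

snat_alert_level 500 1280 0     # prints ok
snat_alert_level 1100 1280 0    # prints warning (about 86% used)
snat_alert_level 1280 1280 3    # prints critical
```

Integer arithmetic keeps the comparison exact, so exactly 80% utilization still reports ok and only values above it warn.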

An L4 LB has no access logs like an L7 proxy; flow-level visibility comes from VNet flow logs on the backend subnet, fed into Traffic Analytics for top-talker and drop analysis — see Network Flow Logs to Insight. Wire the LB metrics into a workspace and dashboards via Azure Monitor.

Architecture at a glance

The diagram traces an L4 flow as it actually moves and maps each failure class onto the exact hop where it bites. Read it left to right. Clients hit a single static anycast IP on the cross-region (Global) LB, which steers them to the closest healthy region — badge 1 marks the failover decision, which works only because the global LB consumes each regional LB’s honest health signal (not your VMs directly). Inside the region, the regional Standard LB applies an inbound rule (443 to 8443), runs a health probe (badge 2 — a TCP probe here would lie “healthy” while the app 500s), and governs egress through an explicit outbound rule (badge 3 — where implicit SNAT shadowing and stingy auto-allocation cause exhaustion). The rule hashes each flow by 5-tuple onto the NIC-based backend pool, a VMSS spread across zones 1/2/3; outbound flows leave via the egress VIP (badge 4 — the ~64,000-ports-per-IP ceiling counts simultaneous flows to one destination, not total connections).

Branching off the backend path is the NVA sandwich: spoke traffic is forced by UDR through an internal LB with an HA Ports rule (protocol All, ports 0/0) in front of an active-active firewall pool. Badge 5 sits on the stateful appliance — HA Ports balances every port but not flow symmetry, so without Floating IP (DSR) and vendor session-state sync, return packets land on a different firewall and long flows reset mid-stream. Finally, every hop reports into Azure Monitor and VNet flow logs — the only visibility an access-log-less L4 device gives you. The five numbered legend entries narrate each badge as symptom · confirm · fix; that is the whole diagnostic method: localise the symptom to a hop, read the cause, run the named metric/command, apply the fix.

Figure: Azure Standard Load Balancer L4 architecture. Clients reach a single static anycast IP on the cross-region Global LB (badge 1: regional failover with no DNS TTL), which steers to the closest healthy region. The regional zone-redundant Standard LB applies an inbound rule (443 to 8443), an HTTP /healthz probe every 5s (badge 2: the TCP-probe false-healthy trap), and an explicit outbound rule allocating 1280 ports per instance (badge 3: SNAT exhaustion and implicit-SNAT shadowing). Flows hash by 5-tuple onto a NIC-based VMSS backend pool of 50 instances spread across zones 1/2/3 and egress through a 64k-ports-per-IP VIP (badge 4: the per-destination port ceiling). A parallel firewall sandwich routes spoke traffic via UDR through an internal LB HA Ports rule (protocol All, ports 0/0) to an active-active NVA pool that needs Floating IP and session-state sync (badge 5: asymmetric stateful reset). All hops report SnatConnectionCount (Failed) and DipAvailability into Azure Monitor and VNet flow logs.

Real-world scenario

Meridian Pay, a fictional but representative payments platform, ran an active-active NGFW firewall sandwich in their hub VNet: an internal Standard LB in front of three firewall VMs, an HA Ports rule, and all spoke traffic forced through it via UDRs. The fleet was a 50-instance VMSS of payment workers behind a separate public Standard LB, fronted by a single outbound public IP, in Central India. It passed every lab and functional test. Monthly LB-and-egress spend was about ₹14,000. Two separate incidents, two weeks apart, taught the team the two hardest lessons of this device.

Incident one — the firewall sandwich resets. In production, long-lived database and gRPC connections reset randomly after a few minutes while short HTTP calls were fine, and the firewall logs showed sessions with “no matching state.” The constraint was classic stateful-inspection asymmetry. HA Ports hashes flows across the three firewalls by 5-tuple, but the return-path UDRs sent reply packets back through a different firewall than the forward path. The second appliance saw a mid-stream packet for a session it never created and dropped it. Short flows finished inside one hash window; long flows lived long enough to hit a state mismatch on a reconvergence or probe-driven rebalance. The fix had two parts: enable the vendor’s session-state synchronization across the cluster so any appliance can handle any packet of a flow, and enable Floating IP (Direct Server Return) on the HA Ports rule so appliances see the original VIP and routing stays symmetric per the vendor design. They also pointed the probe at a real data-plane liveness URL, not just a listening port.

# HA Ports rule with Floating IP enabled for the stateful NVA sandwich.
az network lb rule create \
  --resource-group rg-hub --lb-name lb-nva-internal \
  --name rule-haports \
  --protocol All --frontend-port 0 --backend-port 0 \
  --frontend-ip-name fe-nva --backend-pool-name bep-nva \
  --probe-name probe-nva-dataplane \
  --floating-ip true \
  --enable-tcp-reset true --idle-timeout 30

The mid-stream resets stopped on the first cutover.

Incident two — SNAT exhaustion during a sale. Two weeks later, a flash sale drove the 50-instance fleet to peak, and the payment-provider callout (a single upstream VIP) started timing out intermittently — ~9% of charges failing. The on-call reflex was to scale the VMSS out, which helped marginally and cost money. The real read came from the metric: SnatConnectionCount with a non-zero Failed dimension, and UsedSnatPorts pinned at AllocatedSnatPorts on the busiest instances. With one frontend IP across 50 instances the outbound rule had auto-allocated a stingy port count, and every flow targeted the same payment VIP, so the ~64,000-ports-per-IP-per-destination ceiling was the wall. Two coupled bugs again: a per-request connection pattern in the worker, and a single outbound IP with no headroom. The night-of fix: set the outbound rule to an explicit --outbound-ports 1280, set disableOutboundSnat=true on the inbound rule to stop implicit shadowing, and add a second outbound public IP to double the budget. The following week they fixed the worker to reuse connections and moved egress to a NAT Gateway for on-demand ports independent of instance count.

The next sale ran at full load with zero failed SNAT, charge success returned to 100%, and they moved the VMSS back down to its baseline size at ₹13,500 — lower than before. The two lessons on the wall: “HA Ports gives you all-port load balancing, not flow symmetry — that is your routing’s job,” and “SNAT is per-destination; one busy upstream exhausts you no matter how few total connections you think you have.” The incidents as a timeline, because the order of moves is the lesson:

| Time | Symptom | Action taken | Effect | What it should have been |
|---|---|---|---|---|
| Wk1 | Long flows reset, short ones fine | Restart firewalls | Brief relief, recurs | Ask: is the path symmetric? |
| Wk1 | “no matching state” in FW log | Read FW session log | Asymmetry identified | The breakthrough |
| Wk1 | Root cause found | Floating IP + vendor state sync + dataplane probe | Resets stop | Correct fix |
| Wk3 | 9% charge timeouts at peak | Scale VMSS out | Marginal, costs money | Don’t scale to mask |
| Wk3 | Still failing | Read SnatConnectionCount (Failed) | Exhaustion confirmed | This was the read |
| Wk3 | Mitigated | Explicit ports + disableOutboundSnat + 2nd IP | Failures clear | Correct night-of fix |
| +1wk | Fixed | Connection reuse + NAT Gateway; scale back down | 0 SNAT fails, ₹13,500 | The actual fix is code + egress design |

Advantages and disadvantages

The pass-through L4 model both enables these designs and creates their failure modes. Weigh it honestly:

| Advantages (why this model helps you) | Disadvantages (why it bites) |
|---|---|
| Zero added latency — pure 5-tuple rewrite, no termination | No access log; you diagnose from metrics + flow logs only |
| Protocol-agnostic — balances any TCP/UDP, not just HTTP | No L7 features — no path routing, WAF, or TLS termination |
| Outbound rules give deterministic, allow-listable SNAT | Pre-carved ports cap pool size; the maths is unforgiving |
| HA Ports balances every port for NVAs with one rule | HA Ports gives no flow symmetry — stateful NVAs need extra engineering |
| Zone-redundant frontend survives single-zone loss | Only real if backends are zone-spread too — easy to fake HA |
| Cross-region LB = one static IP, DNS-free regional failover | L4 only; HTTP global routing still needs Front Door |
| Secure by default — no implicit internet exposure | Egress is opt-in; forget it and backends go dark |
| First-class SNAT/health metrics you can alert on | Finite SNAT (~64k/IP/destination) is invisible until you hit it under load |

The model is right when you need a fast, protocol-agnostic L4 front end, controlled egress, or NVA HA. It bites hardest on chatty outbound workloads to few destinations (SNAT), stateful NVA sandwiches (symmetry), and anyone who assumes a zone-redundant frontend alone means HA. The disadvantages are all manageable — but only if you know they exist, which is the point of this article. Where you need L7, reach for Application Gateway instead.

Hands-on lab

Stand up a regional Standard LB with a zone-redundant frontend, a NIC-based pool, an explicit outbound rule, and an HTTP probe — then prove the egress IP is deterministic and the drain works. Free-tier-friendly except the two B1s VMs and the public IPs (a few rupees an hour; delete at the end). Run in Cloud Shell (Bash).

Step 1 — Variables and resource group.

RG=rg-lb-lab
LOC=centralindia
az group create -n $RG -l $LOC -o table

Step 2 — VNet, subnet, and two zone-spread backend VMs.

az network vnet create -g $RG -n vnet-lab --address-prefix 10.0.0.0/16 \
  --subnet-name snet-app --subnet-prefix 10.0.1.0/24 -o table

for i in 1 2; do
  az vm create -g $RG -n vm-app-$i --image Ubuntu2204 --size Standard_B1s \
    --vnet-name vnet-lab --subnet snet-app --zone $i \
    --public-ip-address "" --admin-username azureuser --generate-ssh-keys -o table
done

Expected: two VMs, vm-app-1 in zone 1 and vm-app-2 in zone 2, neither with a public IP (egress will come from the LB).

Step 3 — Zone-redundant public frontend and the Standard LB.

az network public-ip create -g $RG -n pip-lb-fe \
  --sku Standard --allocation-method Static --zone 1 2 3 -o table

az network lb create -g $RG -n lb-lab --sku Standard \
  --public-ip-address pip-lb-fe --frontend-ip-name fe-public \
  --backend-pool-name bep-app -o table

Expected: a Standard LB with frontend fe-public and an empty pool bep-app.

Step 4 — HTTP probe, load-balancing rule (implicit SNAT disabled), and explicit outbound rule.

az network lb probe create -g $RG --lb-name lb-lab -n probe-app \
  --protocol Http --port 80 --path / --interval 5 --probe-threshold 2

az network lb rule create -g $RG --lb-name lb-lab -n rule-http \
  --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name fe-public --backend-pool-name bep-app \
  --probe-name probe-app --idle-timeout 15 --enable-tcp-reset true \
  --disable-outbound-snat true

az network lb outbound-rule create -g $RG --lb-name lb-lab -n obr-app \
  --frontend-ip-configs fe-public --address-pool bep-app \
  --protocol All --idle-timeout 15 --enable-tcp-reset true --outbound-ports 1280

Expected: disableOutboundSnat: true on the LB rule and an outbound rule allocating 1280 ports.

Step 5 — Add the NICs to the pool and install a tiny web server on each VM.

for i in 1 2; do
  NIC=$(az vm show -g $RG -n vm-app-$i --query "networkProfile.networkInterfaces[0].id" -o tsv)
  IPCFG=$(az network nic show --ids $NIC --query "ipConfigurations[0].name" -o tsv)
  az network nic ip-config address-pool add --nic-name $(basename $NIC) -g $RG \
    --ip-config-name $IPCFG --lb-name lb-lab --address-pool bep-app
  az vm run-command invoke -g $RG -n vm-app-$i --command-id RunShellScript \
    --scripts "sudo apt-get update -y && sudo apt-get install -y nginx && echo vm-app-$i | sudo tee /var/www/html/index.html"
done

Step 6 — Verify inbound balancing and the deterministic egress IP.

LBIP=$(az network public-ip show -g $RG -n pip-lb-fe --query ipAddress -o tsv)
for i in $(seq 1 10); do curl -s http://$LBIP/; done   # mix of vm-app-1 / vm-app-2 (5-tuple hash, not strict round-robin)

# Egress determinism: from inside a backend, the source IP must be pip-lb-fe.
az vm run-command invoke -g $RG -n vm-app-1 --command-id RunShellScript \
  --scripts "curl -s https://api.ipify.org"
echo "Compare the returned IP to:"; echo $LBIP

Expected: the curl loop returns a mix of vm-app-1 and vm-app-2 (the 5-tuple hash spreads flows across both but does not strictly alternate); the egress check returns the LB’s frontend IP — proof the outbound rule is the egress path.
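
Because the 5-tuple hash spreads flows without guaranteeing any particular order, a per-backend tally is an easier check than eyeballing the responses. The helper below is a small sketch with a hypothetical name; it only counts lines, so it can be tried on sample input first.

```shell
# Hypothetical helper: counts how many responses came from each backend.
# Reads one backend name per line on stdin, prints "<name>=<count>" lines.
tally_backends() {
  sort | uniq -c | awk '{ print $2 "=" $1 }'
}

# Sample input standing in for the curl loop's output.
printf '%s\n' vm-app-1 vm-app-2 vm-app-1 | tally_backends
# prints:
# vm-app-1=2
# vm-app-2=1
```

In the lab, pipe the loop into it: for i in $(seq 1 10); do curl -s http://$LBIP/; done | tally_backends. Both backends should show a non-zero count.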

Step 7 — Prove graceful drain. Stop nginx on one VM, watch it leave rotation after interval × threshold (~10s), confirm the other keeps serving:

az vm run-command invoke -g $RG -n vm-app-1 --command-id RunShellScript \
  --scripts "sudo systemctl stop nginx"
sleep 15
for i in $(seq 1 10); do curl -s http://$LBIP/; done   # now only vm-app-2
az network lb show -g $RG -n lb-lab --query "probes[0].{proto:protocol,interval:intervalInSeconds,threshold:probeThreshold}" -o jsonc

Expected: after ~10-15s, every response is vm-app-2 — the probe pulled the stopped instance without killing the survivor.

Validation checklist — what each step proved:

| Step | What you did | What it proves | Real-world analogue |
|---|---|---|---|
| 2 | Zone-spread VMs, no public IP | Backends span zones; egress is LB-provided | The HA + secure-by-default model |
| 4 | disable-outbound-snat true + outbound rule | Explicit SNAT, no implicit shadowing | The incident-proof egress config |
| 6 | curl loop + ipify from inside | Inbound balances; egress is deterministic | “Which IP do partners allow-list?” |
| 7 | Stop nginx, watch drain | Probe pulls dead nodes, keeps survivors | Zero-downtime deploy drain |

Cleanup (avoid lingering charges):

az group delete -n $RG --yes --no-wait

Cost note. Two B1s VMs plus the one Standard public IP and the LB run a few rupees per hour; an hour of this lab is well under ₹60, and deleting the resource group stops everything. Standard public IPs and the LB carry a small hourly charge even idle, so do not leave the lab running.

Common mistakes & troubleshooting

This is the playbook — the part you bookmark. An L4 LB emits no HTTP status codes, so the “error reference” is the set of connection-level outcomes and metric/health states you read instead. Learn to map each to what the LB is actually doing:

| Observed outcome | What it means at L4 | Likely cause | How to confirm | First move |
|---|---|---|---|---|
| Connection refused (RST on connect) | No healthy backend on that port | All instances unhealthy / wrong rule port | DipAvailability 0%; rule port vs listener | Fix probe/listener; check rule mapping |
| Connection times out (no SYN-ACK) | VIP/datapath issue or NSG block | VipAvailability drop, or NSG denies | VipAvailability metric; NSG effective rules | Allow AzureLoadBalancer tag; check region health |
| Outbound connect fails under load | SNAT port exhaustion | Per-destination 5-tuple ceiling | SnatConnectionCount Failed > 0 | Add IP / NAT Gateway; reuse connections |
| Mid-stream RST after minutes (idle) | Idle timeout reclaimed the flow | Idle timeout < flow idle gap | Flow dies at the timeout boundary | App keepalives; raise idle timeout |
| Mid-stream RST after minutes (NVA) | Stateful asymmetry dropped it | Return path on a different firewall | FW log “no matching state” | Floating IP + state sync |
| Backend “Up” but app errors | Probe proves socket, not app | TCP probe on a 500-ing app | DipAvailability 100% vs 5xx | HTTP/HTTPS /healthz probe |
| New flows stop, old ones continue | Graceful drain in progress | Probe failed / instance removed | DipAvailability dropped on that instance | Expected; finish the drain sequence |
| Global VIP serves a dead region | Regional probe reports healthy | Dishonest regional probe | Cross-region backend health “Up” | Make the regional probe real |

Now the symptom → cause → confirm → fix table you read mid-incident, then the entries that bite hardest in detail.

| # | Symptom | Root cause | Confirm (exact cmd / portal path) | Fix |
|---|---|---|---|---|
| 1 | Intermittent outbound timeouts under load, fine at rest | SNAT port exhaustion to one destination | SnatConnectionCount (Failed) > 0; UsedSnatPorts ≈ AllocatedSnatPorts | Explicit outbound rule; more frontend IPs; NAT Gateway; reuse connections |
| 2 | Exhaustion despite an outbound rule; ports unpredictable | Implicit SNAT from the inbound rule shadowing it | LB rule shows disableOutboundSnat: false | Set disableOutboundSnat=true on every LB rule |
| 3 | Backends “healthy” but users get 500/502 | TCP probe on an app that 500s (wedged-but-listening) | Probe protocol: Tcp; DipAvailability 100% while 5xx high | Switch to HTTP/HTTPS probe on /healthz |
| 4 | Long-lived flows reset after minutes; short ones fine | Asymmetric routing through stateful NVAs | Firewall log “no matching state”; pattern is long-only | Floating IP (DSR) + vendor state sync; symmetric UDRs |
| 5 | New backends can’t reach the internet | Standard is secure-by-default; no egress configured | No outbound rule / NAT GW; effective routes lack default | Add an outbound rule or NAT Gateway |
| 6 | Region goes down but the global IP keeps sending traffic | Regional LB still reports healthy (probe lies) | Cross-region LB backend health “Up”; regional DipAvailability not 0 | Make the regional probe honest (HTTP /healthz) |
| 7 | Mid-idle drops on long-lived connections | Idle timeout too short, no keepalives | Flows die at the idle-timeout boundary | App keepalives; raise idleTimeoutInMinutes; enableTcpReset |
| 8 | Scaling out the pool silently starves ports | outbound-ports computed for today, not max | New instances get fewer ports than needed | Compute ports from maximum pool size, not current |
| 9 | An NSG/UDR change breaks all protocols at once | HA Ports = no per-port blast radius | One subnet change; everything fails together | Treat NVA subnet as prod-critical; test failover |
| 10 | IP-based pool: outbound rule won’t apply | IP-based pools don’t support outbound rules | Pool is IP-based; rule rejected/ineffective | Use a NIC-based pool, or NAT Gateway for egress |
| 11 | HA Ports rule rejected on a public LB | HA Ports is internal-LB only | LB frontend is public | Use an internal LB for the HA Ports rule |
| 12 | Basic→Standard migration breaks egress/zones | Basic features don’t map 1:1 | Still on Basic; retires 30 Sep 2025 | Plan a Standard migration (PIP SKU, outbound, zones) |

The expanded form for the entries that cost the most time:

1. Intermittent outbound timeouts under load, fine at rest. Root cause: SNAT port exhaustion, almost always many flows to one destination IP:port (the per-destination 5-tuple ceiling, not total connections). Confirm: SnatConnectionCount with a non-zero Failed dimension under load; UsedSnatPorts pinned at AllocatedSnatPorts on the busy instances.

az monitor metrics list \
  --resource $(az network lb show -g $RG -n lb-app-prod --query id -o tsv) \
  --metric SnatConnectionCount --filter "ConnectionState eq 'Failed'" \
  --interval PT1M --aggregation Total -o table

Fix: Reuse outbound connections (shared client, keepalives); allocate ports explicitly against max pool size; add frontend IPs or a public IP prefix (+64k each); or move egress to a NAT Gateway. Scaling out is a band-aid.

2. Exhaustion despite an outbound rule; port use is unpredictable. Root cause: The load-balancing rule is providing implicit SNAT alongside your explicit outbound rule — two overlapping behaviors. Confirm: az network lb rule show ... --query disableOutboundSnat returns false. Fix: Set disableOutboundSnat=true on every load-balancing rule so egress is governed only by the outbound rule.

3. Backends report healthy but users get 500/502. Root cause: A TCP probe keeps a wedged-but-listening process in rotation; the socket is open, the app is broken. Confirm: Probe protocol is Tcp; DipAvailability shows 100% while your app’s 5xx rate is high. Fix: Switch to an HTTP/HTTPS probe against a real /healthz that exercises the app, not just the socket.

4. Long-lived flows reset after a few minutes; short flows are fine. Root cause: Asymmetric routing through a stateful NVA sandwich — the return packet traverses a different firewall than the forward packet. Confirm: Firewall session log shows “no matching state”; the failure is exclusively long flows (DB, gRPC), never short HTTP. Fix: Enable Floating IP (DSR) on the HA Ports rule, enable the vendor’s session-state sync, and keep UDRs symmetric. Point the probe at a data-plane liveness URL.

5. New backends can’t reach the internet. Root cause: Standard is secure by default — being in a pool grants no egress; nobody added an explicit path. Confirm: No outbound rule and no NAT Gateway on the subnet; effective routes lack an internet default via a managed egress. Fix: Add an outbound rule (NIC-based pool) or a NAT Gateway on the subnet.
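For the outbound-rule option, a hedged sketch of the call follows. It assumes the frontend IP configuration and a NIC-based backend pool already exist on the load balancer; the rule name `outbound-all` and every argument in the usage line are placeholders, and the port count should come from your own per-instance budget.

```shell
# add_outbound_rule RG LB_NAME FRONTEND_CONFIG BACKEND_POOL PORTS_PER_INSTANCE
# Creates an explicit outbound rule covering all protocols.
# Assumptions: az CLI logged in; frontend config and NIC-based pool exist;
# "outbound-all" is a placeholder rule name.
add_outbound_rule() {
  rg="$1"
  lb="$2"
  frontend="$3"
  pool="$4"
  ports="$5"
  az network lb outbound-rule create -g "$rg" --lb-name "$lb" -n outbound-all \
    --frontend-ip-configs "$frontend" --address-pool "$pool" \
    --protocol All --outbound-ports "$ports" --enable-tcp-reset true
}

# Usage (all names are placeholders):
#   add_outbound_rule "$RG" lb-app-prod fe-outbound pool-app 1280
```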

6. A region is down but the global IP keeps sending traffic there. Root cause: The regional probe is dishonest — a TCP probe (or / that always 200s) keeps the regional LB “healthy,” so the cross-region LB never fails it over. Confirm: Cross-region LB backend health shows the region “Up” while it’s clearly degraded; regional DipAvailability isn’t dropping. Fix: Make the regional probe a true health check; global failover quality is exactly your regional probe quality.

And the fast triage table — match the signal you have to the likely cause and the immediate move, before you even open the playbook:

If you see… | It’s probably… | Do this
SnatConnectionCount Failed climbing under load | Per-destination SNAT exhaustion | Add a frontend IP now; plan NAT Gateway + connection reuse
UsedSnatPorts ≈ AllocatedSnatPorts, Failed still 0 | About to exhaust | Raise outbound-ports / add an IP before it fails
Exhaustion with an outbound rule present | Implicit SNAT shadowing | Set disableOutboundSnat=true on the LB rule
DipAvailability 100% but users get 5xx | TCP probe lying healthy | Switch probe to HTTP/HTTPS /healthz
DipAvailability flapping on a slow node | Probe too tight | Raise interval × threshold
Only long flows reset, short ones fine | Asymmetric stateful NVA | Floating IP (DSR) + vendor state sync
New VMs have no internet | Secure-by-default, no egress | Add outbound rule or NAT Gateway
Global IP won’t fail a dead region over | Regional probe dishonest | Make regional /healthz real
VipAvailability < 100% | Datapath/frontend degraded | Check region health; open a support case
Outbound rule “won’t apply” | IP-based pool | Convert to NIC-based pool
HA Ports rule rejected | Public LB (internal-only feature) | Move HA Ports to an internal LB
Mid-idle drops at a fixed interval | Idle timeout / no keepalives | App keepalives; raise timeout; enableTcpReset

Best practices

Security notes

Cost & sizing

Standard LB has no instance size to choose — the cost model is rule-count and processed data, plus the public IPs and any NAT Gateway you attach for egress. The drivers and how they interact with the design:

A rough monthly picture for a mid-size regional deployment in INR:

Cost driver | What you pay for | Rough INR / month | What it buys | Watch-out
Standard LB (rules + base) | Hourly + first rules | ~₹1,500-2,500 | The LB itself | Per-rule charge beyond the base set
Data processed | Per-GB through the LB | ~₹0.4-0.5 / GB | Throughput | Scales with traffic; can dominate at high GB
Standard public IP (each) | Hourly per IP | ~₹300-400 / IP | +64k SNAT ports each | Don’t over-provision idle IPs
NAT Gateway | Hourly + per-GB | ~₹1,500-3,000 | On-demand SNAT, deterministic egress | Zonal; one per zone for AZ coverage
Cross-region LB | Global data processing | ~₹1,000-2,500 | One static IP + DNS-free failover | On top of the regional LBs
VNet flow logs + Traffic Analytics | Storage + ingestion | ~₹1,000-3,000 | The visibility L4 lacks | Sample/retain sensibly
Public IP prefix (/28) | Hourly per IP in the block | ~₹4,500-6,000 (16 IPs) | Allow-listable, stable egress CIDR | Pay for the whole block even if idle
Additional LB rules (beyond base) | Per-rule hourly | ~₹100-200 / rule | Extra VIPs/ports | Adds up on rule-heavy LBs
Cross-zone data transfer | Per-GB inter-zone | ~₹0.1 / GB | Zone resilience | Negligible vs the resilience

The sizing rule in one line: pick the minimum outbound IPs (or a NAT Gateway) that keeps UsedSnatPorts comfortably below AllocatedSnatPorts at peak, run zone-spread backends behind a zone-redundant frontend, and only add the cross-region tier when you genuinely need a single global IP. Meridian Pay landed at ₹13,500/month after fixing connection reuse and moving to NAT Gateway — lower than the ₹14,000 they paid while broken, proof the fix is usually design, not a bigger bill.

Interview & exam questions

1. What does an outbound rule do that implicit SNAT does not, and why does it matter? An outbound rule lets you explicitly allocate SNAT ports per instance, pick the outbound frontend IP(s), set the idle timeout, and enable TCP reset — deterministic, plannable egress. Implicit SNAT (from a load-balancing rule) auto-allocates a stingy port count and is non-deterministic. You also set disableOutboundSnat=true on the LB rule so the two don’t overlap. It matters because deterministic egress is the difference between a planned 64k budget and a 2 a.m. exhaustion incident.

2. Why can an app with only 5,000 outbound connections still exhaust SNAT? Because SNAT ports are consumed per 5-tuple: the ceiling is ~64,000 simultaneous flows to the same destination IP:port per frontend public IP, not 64,000 total, and that budget is split across the backend instances, so each instance holds only its allocated slice. If all 5,000 connections target one upstream and the app opens a fresh connection per request without reuse, no port can be shared across destinations and recently closed ports are not immediately available again, so the per-destination pressure builds far past what the raw count suggests.

3. What is an HA Ports rule, and what does it deliberately not solve? An HA Ports rule (protocol All, frontend/backend port 0, internal LB only) load-balances every port and protocol at once — built for NVAs whose port set you can’t enumerate. It does not guarantee flow symmetry: a stateful firewall needs the return packet on the same appliance, and HA Ports hashes directions independently. You add Floating IP (DSR) and vendor session-state sync to get symmetry.

4. Your backends show 100% healthy but users get 502s. Most likely cause? A TCP health probe on an app that is wedged-but-listening — the socket is open so the probe passes, but the app returns 500/502 to real requests. Confirm via DipAvailability at 100% while the 5xx rate is high; fix by switching to an HTTP/HTTPS probe against a real /healthz.

5. Difference between NIC-based and IP-based backend pools, and the constraint that decides it? NIC-based pools attach VM/VMSS NIC ipConfigurations; IP-based pools list raw private IPs. The decisive constraint: IP-based pools cannot use outbound rules. If you need LB-provided SNAT, you must use a NIC-based pool (or provide egress via NAT Gateway).

6. How does the cross-region (Global) LB decide health and where does its failover quality come from? It health-checks the regional load balancers, not your VMs directly — consuming each regional LB’s own probe signal. So global failover quality is exactly your regional probe quality: an honest regional /healthz probe means clean failover; a TCP-or-/ probe that always passes means the region never fails over even when it’s down.

7. Why is a zone-redundant frontend not enough for HA on its own? Because a zone-redundant frontend only guarantees the VIP survives a zone loss. If the backends are pinned to a single zone, losing that zone takes the app down regardless. HA requires both a zone-redundant (or appropriately zonal) frontend and zone-spread backends.

8. What is the implicit-SNAT trap and how do you avoid it? A load-balancing rule silently provides implicit outbound SNAT alongside any explicit outbound rule unless disabled, producing two overlapping behaviors and unpredictable port use. Avoid it by setting disableOutboundSnat=true on every load-balancing rule, so egress is governed solely by the explicit outbound rule.

9. A firewall sandwich resets long-lived connections but not short ones. Diagnose and fix. Classic asymmetric routing through stateful NVAs — return packets traverse a different appliance than the forward path, which has no session state, so it drops mid-stream packets. Short flows finish inside one hash window; long flows hit a rebalance. Fix with Floating IP (DSR) + vendor session-state sync and symmetric UDRs; point the probe at a data-plane liveness URL.

10. Which metric is the canary for SNAT exhaustion, and what do you alert on? SnatConnectionCount split by ConnectionState — alert on the Failed dimension > 0 sustained over ~5 minutes. Dashboard UsedSnatPorts against AllocatedSnatPorts per backend (alert at ~80% utilization) so you act with headroom before failures begin.

11. When do you pick cross-region LB over Front Door or Traffic Manager? Cross-region LB when you need a single static anycast IP for any TCP/UDP protocol with DNS-free, seconds-fast regional failover. Front Door is HTTP-only (no static IP, but L7 + WAF + edge TLS); Traffic Manager is DNS/TTL-bound (minutes to fail over, any protocol at the DNS level but no single IP).

12. How do you grow the SNAT budget without code changes, and what’s the better long-term fix? Add frontend public IPs (or a public IP prefix) — each adds ~64,000 ports — and re-compute outbound-ports against max pool size. The better long-term fix is connection reuse in the app (cuts outbound connections drastically) and, at scale, a NAT Gateway that allocates ports on demand independent of instance count.
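The headroom check from question 10 is simple enough to script against whatever values your dashboard exports. The sketch below assumes you already have the UsedSnatPorts and AllocatedSnatPorts numbers for one backend instance; the function names and the 80% default are illustrative.

```shell
# snat_utilization_pct USED ALLOCATED
# Prints utilization as an integer percentage, rounded down.
snat_utilization_pct() {
  echo $(( $1 * 100 / $2 ))
}

# snat_alert_level USED ALLOCATED [WARN_PCT]
# Prints ok, warn, or exhausted. WARN_PCT defaults to 80 (an assumption
# matching the ~80% alert threshold suggested in question 10).
snat_alert_level() {
  pct=$(snat_utilization_pct "$1" "$2")
  warn=${3:-80}
  if [ "$pct" -ge 100 ]; then
    echo "exhausted"
  elif [ "$pct" -ge "$warn" ]; then
    echo "warn"
  else
    echo "ok"
  fi
}

snat_alert_level 900 1280    # 70% used: prints ok
snat_alert_level 1100 1280   # 85% used: prints warn
```

A "warn" here is the cue to add a frontend IP or raise the per-instance allocation while the Failed dimension is still zero.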

These map most directly to AZ-700 (Network Engineer: design and implement load balancing and network connectivity), with the egress/SNAT and NVA topics squarely in scope; to AZ-104 (Administrator: configure load balancing, probes, and rules); and, for the resilience and active-active design angle, to AZ-305 (Solutions Architect). A compact cert mapping for revision:

Question theme | Primary cert | Exam objective area
Outbound rules, SNAT maths, NAT Gateway | AZ-700 | Design & implement network connectivity / load balancing
HA Ports, firewall sandwich, symmetry | AZ-700 | Implement load balancing; secure connectivity
Probes, rules, NIC vs IP pools | AZ-104 | Configure load balancing
Zone-redundant vs zonal frontend/backends | AZ-104 / AZ-305 | Resilience & availability
Cross-region LB vs Front Door vs Traffic Manager | AZ-700 / AZ-305 | Global routing & multi-region design
SNAT/DipAvailability metrics & alerting | AZ-104 / AZ-700 | Monitor & troubleshoot networking

Quick check

  1. An app opens ~3,000 outbound connections, all to a single payment API, with a new connection per request, and you start seeing timeouts under load. Which limit are you hitting and what’s the metric that proves it?
  2. You configured an explicit outbound rule but still see unpredictable port use and exhaustion. What single property did you forget, and on which rule?
  3. True or false: a zone-redundant frontend in front of a single-zone VMSS gives you high availability.
  4. Your firewall sandwich resets long-lived gRPC and DB connections but short HTTP calls are fine. What’s the root cause and the two fixes?
  5. You need one static IP for a UDP service with automatic failover across two regions. Which Azure load balancer, and why not Front Door?

Answers

  1. SNAT port exhaustion against the per-destination 5-tuple ceiling (~64,000 simultaneous flows to the same destination IP:port per frontend public IP). The proof is SnatConnectionCount with a non-zero Failed dimension (and UsedSnatPorts ≈ AllocatedSnatPorts). The total connection count being modest is irrelevant: they all target one destination.
  2. disableOutboundSnat=true on the load-balancing rule. Without it, the inbound rule silently provides implicit SNAT alongside your explicit outbound rule, giving two overlapping behaviors and unpredictable port use.
  3. False. Zone-redundancy on the frontend only protects the VIP. If the backends are in one zone, losing that zone takes the app down. HA needs zone-spread backends too.
  4. Asymmetric routing through the stateful NVAs — return packets land on a different appliance with no session state and get dropped mid-stream. Fixes: Floating IP (Direct Server Return) on the HA Ports rule and the vendor’s session-state synchronization (plus symmetric UDRs).
  5. The cross-region (Global) Load Balancer — it gives a single static anycast IP for any TCP/UDP protocol with DNS-free regional failover. Front Door is HTTP-only and provides no static IP, so it cannot serve a UDP service.

Glossary

Next steps

You can now engineer Standard LB end to end — deterministic SNAT, HA Ports symmetry, honest probes, and a global front end — and diagnose any of its failure modes. Build outward:

Vinod is a Senior Cloud Architect (22+ yrs) — available for Azure / AWS / GCP architecture, landing zones, and migrations.
