Hardening Azure App Service: VNet Integration, Private Endpoints, and Zero-Downtime Slots

A default Azure App Service is reachable from the entire internet, talks to your database over a shared public path, and stores connection strings as plaintext in app settings. This guide walks through closing all three gaps — inbound, outbound, and secrets — and then layering on deployment slots so you ship to production with zero downtime and a real rollback.

The default threat model

When you az webapp create and walk away, you inherit three liabilities:

Surface	Default behavior	Risk
Inbound	Public `*.azurewebsites.net` hostname, open to the internet	Anyone can reach the app; access control is app-layer only
Outbound	Egress from a shared, rotating pool of Azure IPs	Backends must allow broad ranges; no per-app firewalling
Config	Connection strings and keys as plaintext app settings	Secrets readable by anyone whose role can list app settings (Contributor, Website Contributor)

These are independent problems with independent fixes. Regional VNet integration governs outbound. Private endpoints govern inbound. Key Vault references + managed identity kill the secrets. You can adopt them in any order; this guide does outbound, then secrets, then inbound, then slots.

Plan tier matters. VNet integration and private endpoints require Basic or higher; the production scenarios below assume Standard (S1) or Premium v3 (P1v3). Deployment slots and autoscale need Standard or higher, while Always On is available from Basic. Free and Shared tiers support none of this.

The examples assume an existing hub-and-spoke topology (see the companion landing-zone guide) with the app deployed into a spoke VNet vnet-spoke-app.

Step 1 — Regional VNet integration for outbound

Regional VNet integration injects the app’s outbound calls into a dedicated, delegated subnet in your VNet. The subnet must be delegated to Microsoft.Web/serverFarms and should be sized generously — the platform consumes addresses as the plan scales out (a /27 is a sane minimum; /26 for larger plans).
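
As a rough sizing aid, the sketch below computes how many addresses a given prefix length leaves for the plan. It relies on Azure reserving five addresses in every subnet; the function name usable_ips is illustrative and not part of any Azure tooling.

```shell
#!/usr/bin/env bash
# usable_ips PREFIX_LENGTH
# Prints the number of assignable addresses in an Azure subnet of that size.
# Azure reserves 5 addresses per subnet, so the result is 2^(32-prefix) - 5.
usable_ips() {
  local prefix=$1
  echo $(( (1 << (32 - prefix)) - 5 ))
}

usable_ips 27   # prints 27
usable_ips 26   # prints 59
```

Each plan instance takes an address from the subnet, and scale operations can temporarily need more, so treat the result as an upper bound on instance count rather than a target.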

# Delegated subnet for outbound integration
az network vnet subnet create \
  --resource-group rg-app-prod \
  --vnet-name vnet-spoke-app \
  --name snet-appsvc-integration \
  --address-prefixes 10.20.1.0/27 \
  --delegations Microsoft.Web/serverFarms

# Wire the app to it
az webapp vnet-integration add \
  --resource-group rg-app-prod \
  --name app-orders-prod \
  --vnet vnet-spoke-app \
  --subnet snet-appsvc-integration

At this point outbound calls to RFC 1918 destinations route through the VNet, but internet-bound traffic still leaves via Azure’s shared egress. To force all egress through the VNet — so it can be inspected by a firewall and leave from a predictable IP — set WEBSITE_VNET_ROUTE_ALL:

az webapp config appsettings set \
  --resource-group rg-app-prod \
  --name app-orders-prod \
  --settings WEBSITE_VNET_ROUTE_ALL=1

Once WEBSITE_VNET_ROUTE_ALL=1 is set, the integration subnet’s NSGs and effective routes apply to internet traffic too. If the subnet has a route table sending 0.0.0.0/0 to an Azure Firewall, or a NAT gateway attached, every outbound packet now follows that path. A default route that lands on a firewall with no matching allow rule drops internet calls, so configure the next step before flipping this in production.

Pinning egress with a NAT gateway or firewall

For a stable outbound IP (what most SaaS allowlists and partner firewalls demand), attach a NAT gateway to the integration subnet:

az network public-ip create -g rg-app-prod -n pip-nat-app --sku Standard --allocation-method Static
az network nat gateway create -g rg-app-prod -n nat-app --public-ip-addresses pip-nat-app --idle-timeout 10
az network vnet subnet update \
  -g rg-app-prod --vnet-name vnet-spoke-app \
  --name snet-appsvc-integration --nat-gateway nat-app

For inspection and FQDN filtering, point the subnet’s default route at your hub Azure Firewall instead (UDR with 0.0.0.0/0 -> VirtualAppliance -> firewall private IP). Use the firewall when you need egress logging and rules; use the NAT gateway when you only need a fixed source IP. They can be combined, but the firewall must then own the route.
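
The firewall route described above could be scripted roughly as follows. This is a sketch under assumptions: the route table name rt-appsvc-egress and the route name default-via-firewall are hypothetical, and the firewall's private IP is passed in because it depends on your hub.

```shell
#!/usr/bin/env bash
# route_egress_via_firewall FIREWALL_PRIVATE_IP
# Creates a route table with a 0.0.0.0/0 route to the firewall and attaches it
# to the integration subnet from Step 1.
# rt-appsvc-egress and default-via-firewall are hypothetical names.
route_egress_via_firewall() {
  local fw_ip=$1
  az network route-table create -g rg-app-prod -n rt-appsvc-egress &&
  az network route-table route create -g rg-app-prod \
    --route-table-name rt-appsvc-egress -n default-via-firewall \
    --address-prefix 0.0.0.0/0 \
    --next-hop-type VirtualAppliance --next-hop-ip-address "$fw_ip" &&
  az network vnet subnet update -g rg-app-prod --vnet-name vnet-spoke-app \
    --name snet-appsvc-integration --route-table rt-appsvc-egress
}
```

Call it as route_egress_via_firewall 10.0.0.4, where 10.0.0.4 is a placeholder for your firewall's private IP. The firewall needs allow rules for the app's dependencies before the route goes live.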

Step 2 — Kill plaintext secrets with Key Vault references

Never store a connection string or key as a literal app setting. Instead, store the secret in Key Vault and reference it. App Service resolves the reference at startup using the app’s managed identity and, for versionless references, picks up a rotated secret within about 24 hours; the secret value never appears in configuration.

First, enable a system-assigned identity and grant it read access to the vault. Use RBAC vaults (the modern default) with the Key Vault Secrets User role:

az webapp identity assign -g rg-app-prod -n app-orders-prod

PRINCIPAL_ID=$(az webapp identity show -g rg-app-prod -n app-orders-prod --query principalId -o tsv)
VAULT_ID=$(az keyvault show -g rg-app-prod -n kv-orders-prod --query id -o tsv)

az role assignment create \
  --assignee-object-id "$PRINCIPAL_ID" \
  --assignee-principal-type ServicePrincipal \
  --role "Key Vault Secrets User" \
  --scope "$VAULT_ID"

Then reference the secret. The reference uses a versionless URI so rotation flows through without a redeploy:

az webapp config appsettings set -g rg-app-prod -n app-orders-prod --settings \
  "ServiceBusKey=@Microsoft.KeyVault(SecretUri=https://kv-orders-prod.vault.azure.net/secrets/sb-key/)"

Confirm resolution in the portal: Configuration -> Application settings shows a green “Key Vault Reference” badge. A red badge means the identity can’t read the secret (usually a missing role assignment or a vault firewall blocking the app).
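
When several settings need references, a small helper keeps the syntax consistent. The sketch below only builds the versionless reference string shown above; kv_ref is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# kv_ref VAULT_NAME SECRET_NAME
# Prints a versionless Key Vault reference suitable as an app setting value.
kv_ref() {
  local vault=$1 secret=$2
  printf '@Microsoft.KeyVault(SecretUri=https://%s.vault.azure.net/secrets/%s/)\n' \
    "$vault" "$secret"
}

kv_ref kv-orders-prod sb-key
# prints @Microsoft.KeyVault(SecretUri=https://kv-orders-prod.vault.azure.net/secrets/sb-key/)
```

It can then be used inline, for example --settings "ServiceBusKey=$(kv_ref kv-orders-prod sb-key)".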

Prefer passwordless over even references

Key Vault references remove plaintext, but the best secret is no secret. For Azure backends that support Entra auth — Azure SQL, Storage, Service Bus, Event Hubs — use the managed identity directly and drop the credential entirely.

Azure SQL: assign the app’s identity as a contained DB user and connect with Authentication=Active Directory Default (via Azure.Identity / Microsoft.Data.SqlClient); no password in the connection string.
Storage / Service Bus: grant a data-plane RBAC role (e.g. Storage Blob Data Contributor, Azure Service Bus Data Sender) to the identity and authenticate with DefaultAzureCredential.

-- Run against the target database as an Entra admin
CREATE USER [app-orders-prod] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [app-orders-prod];
ALTER ROLE db_datawriter ADD MEMBER [app-orders-prod];
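
On the app side, the matching connection string carries no credential at all. A minimal sketch, assuming a hypothetical server sql-orders-prod and database orders; passwordless_sql_conn is an illustrative helper, not an Azure CLI feature.

```shell
#!/usr/bin/env bash
# passwordless_sql_conn SERVER_NAME DATABASE_NAME
# Prints a SqlClient connection string that authenticates with the app's
# managed identity. Note that there is no User ID or Password.
passwordless_sql_conn() {
  local server=$1 database=$2
  printf 'Server=tcp:%s.database.windows.net,1433;Database=%s;Authentication=Active Directory Default;Encrypt=True;\n' \
    "$server" "$database"
}

# Example use (sql-orders-prod, orders and OrdersDb are hypothetical names):
#   az webapp config connection-string set -g rg-app-prod -n app-orders-prod \
#     --connection-string-type SQLAzure \
#     --settings "OrdersDb=$(passwordless_sql_conn sql-orders-prod orders)"
```

Because nothing in the string is secret, it does not need a Key Vault reference, though it may still need to be a slot setting if staging and production use different databases.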

Step 3 — Lock down inbound with a private endpoint

A private endpoint gives the app a private IP inside your VNet; once public network access is also disabled, that private IP is the only way in. Inbound traffic then needs a route into the VNet: from on-prem over ExpressRoute/VPN, from a peered spoke, or via Application Gateway/Front Door as the public front door.

The private endpoint needs its own subnet, separate from the VNet-integration subnet in Step 1. One handles inbound (private endpoint), the other handles outbound (delegated integration). Do not reuse one subnet for both.

# Dedicated subnet for the private endpoint (no delegation)
az network vnet subnet create \
  -g rg-app-prod --vnet-name vnet-spoke-app \
  --name snet-privateendpoints --address-prefixes 10.20.2.0/27

WEBAPP_ID=$(az webapp show -g rg-app-prod -n app-orders-prod --query id -o tsv)

az network private-endpoint create \
  -g rg-app-prod -n pe-app-orders \
  --vnet-name vnet-spoke-app --subnet snet-privateendpoints \
  --private-connection-resource-id "$WEBAPP_ID" \
  --group-id sites \
  --connection-name pe-app-orders-conn

# Turn off public network access so the app is reachable only via the PE
az webapp update -g rg-app-prod -n app-orders-prod --set publicNetworkAccess=Disabled

Private endpoints are useless without Private DNS. The app’s hostname must resolve to the private IP from inside the VNet. Create the privatelink.azurewebsites.net zone, link it to the VNet, and register the record (a private-endpoint DNS zone group automates the A record):

az network private-dns zone create -g rg-app-prod -n privatelink.azurewebsites.net

az network private-dns link vnet create \
  -g rg-app-prod -n link-spoke-app \
  --zone-name privatelink.azurewebsites.net \
  --virtual-network vnet-spoke-app --registration-enabled false

az network private-endpoint dns-zone-group create \
  -g rg-app-prod --endpoint-name pe-app-orders \
  -n default --private-dns-zone privatelink.azurewebsites.net --zone-name privatelink_azurewebsites_net
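
To confirm the zone is doing its job, check that the name resolves to a private address from a host inside the VNet. The sketch below is a plain RFC 1918 test on whatever address your resolver returns; is_private_ip is a hypothetical helper, and the commented usage assumes a Linux host with getent available.

```shell
#!/usr/bin/env bash
# is_private_ip IPV4_ADDRESS
# Succeeds (exit 0) when the address falls in an RFC 1918 range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16); fails otherwise.
is_private_ip() {
  local o1 o2 o3 o4 octet
  IFS=. read -r o1 o2 o3 o4 <<< "$1"
  # Reject anything that is not four numeric octets.
  for octet in "$o1" "$o2" "$o3" "$o4"; do
    [[ $octet =~ ^[0-9]{1,3}$ ]] || return 1
  done
  # Force base 10 so a leading zero is not read as octal.
  o1=$((10#$o1)); o2=$((10#$o2))
  if (( o1 == 10 )); then return 0; fi
  if (( o1 == 172 && o2 >= 16 && o2 <= 31 )); then return 0; fi
  if (( o1 == 192 && o2 == 168 )); then return 0; fi
  return 1
}

# Example use from a Linux host inside the VNet:
#   ip=$(getent hosts app-orders-prod.azurewebsites.net | awk '{print $1; exit}')
#   is_private_ip "$ip" || echo "resolves publicly: check the zone link" >&2
```

A public answer from inside the VNet usually points at a missing VNet link or a custom DNS server that does not forward the privatelink zone.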

The SCM/Kudu site (app-orders-prod.scm.azurewebsites.net) sits behind the same private endpoint. Once public access is disabled, your CI/CD agent must reach the app over the private network (self-hosted runner in the VNet, or a build that pushes an artifact a VNet-attached deploy step consumes). A public-hosted pipeline doing zip deploy will start failing, so plan the deploy path before you flip publicNetworkAccess.

Step 4 — Deployment slots done right

A staging slot is a full, addressable copy of the app on the same plan. You deploy to staging, warm it, validate it, then swap — App Service redirects production traffic to the warmed instances with no cold start.

az webapp deployment slot create -g rg-app-prod -n app-orders-prod --slot staging

The subtlety that bites everyone is which settings travel during a swap. By default, app settings and connection strings follow the code: they move to the other slot along with it. That is wrong for anything environment-specific (a staging DB connection string must NOT become production’s). Mark those as slot settings (“deployment slot setting” / sticky) so they stay pinned to the slot:

az webapp config appsettings set -g rg-app-prod -n app-orders-prod --slot staging \
  --slot-settings ASPNETCORE_ENVIRONMENT=Staging "SqlConnection=@Microsoft.KeyVault(SecretUri=https://kv-orders-prod.vault.azure.net/secrets/sql-staging/)"

Setting type	Behavior on swap	Use for
Regular app setting	Travels with the code	Feature flags, shared tuning that should promote
Slot setting (sticky)	Stays pinned to the slot	Environment name, env-specific connection strings, slot-scoped instrumentation keys
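
One way to catch a non-sticky setting before it is promoted is a pre-release audit. The sketch below filters name and slotSetting pairs and prints the names that look environment-specific but are not pinned. The name patterns are an assumption about naming conventions, and find_unpinned_env_settings is a hypothetical helper.

```shell
#!/usr/bin/env bash
# find_unpinned_env_settings
# Reads "name<TAB>slotSetting" lines on stdin and prints the names that look
# environment-specific but are NOT sticky. Patterns are assumptions; adjust
# them to your own naming convention.
find_unpinned_env_settings() {
  local name sticky
  while IFS=$'\t' read -r name sticky; do
    # Skip anything already marked as a slot setting (either letter case).
    case "$sticky" in
      [Tt]rue) continue ;;
    esac
    case "$name" in
      *ENVIRONMENT*|*Connection*|*CONNECTION*) echo "$name" ;;
    esac
  done
}

# Example use against the staging slot:
#   az webapp config appsettings list -g rg-app-prod -n app-orders-prod --slot staging \
#     --query "[].[name, slotSetting]" -o tsv | find_unpinned_env_settings
```

Any name it prints is a candidate for --slot-settings; an empty result is what you want before a swap.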

Warm-up so the swap is actually zero-downtime

A swap is only seamless if the staging instances are already warm. Tell App Service to ping a path on every instance and wait for healthy responses before completing the swap:

az webapp config appsettings set -g rg-app-prod -n app-orders-prod --slot staging --slot-settings \
  WEBSITE_SWAP_WARMUP_PING_PATH=/health/ready \
  WEBSITE_SWAP_WARMUP_PING_STATUSES=200,202 \
  WEBSITE_WARMUP_PATH=/health/ready

WEBSITE_SWAP_WARMUP_PING_PATH and WEBSITE_SWAP_WARMUP_PING_STATUSES gate the swap on your readiness endpoint returning an acceptable status on each instance; WEBSITE_WARMUP_PATH applies the same ping to ordinary restarts, not only swaps. /health/ready should check real dependencies (DB reachable, Key Vault references resolved, cache primed) rather than return 200 unconditionally. Pair this with Always On so the slot never idles out before a swap:

az webapp config set -g rg-app-prod -n app-orders-prod --slot staging --always-on true

Swap with preview and manual rollback

The robust pattern is a swap with preview (two-phase swap). Phase 1 applies production’s slot settings to staging and restarts it under production config — without moving traffic. You validate against the previewed slot, then complete:

# Phase 1: apply target (production) config to staging, no traffic moved yet
az webapp deployment slot swap -g rg-app-prod -n app-orders-prod \
  --slot staging --target-slot production --action preview

# ... run smoke tests against the staging slot now running prod config ...

# Phase 2: complete the swap (traffic moves)
az webapp deployment slot swap -g rg-app-prod -n app-orders-prod \
  --slot staging --target-slot production --action swap

If smoke tests fail during preview, abort with --action reset and nothing reaches users. If a regression surfaces after completion, swap back — the previous production bits are sitting in the staging slot, so rollback is another swap, not a redeploy:

az webapp deployment slot swap -g rg-app-prod -n app-orders-prod --slot staging --target-slot production
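
The preview, test, and complete-or-reset sequence can be wrapped so that a failed smoke test never moves traffic. This is a sketch, not a hardened pipeline step: it assumes the default staging hostname pattern, a runner that can reach it over the private network, and that a single 200 from /health/ready is an adequate smoke test. The function names and the READY_ATTEMPTS and READY_DELAY variables are hypothetical.

```shell
#!/usr/bin/env bash
# wait_ready PROBE_CMD [ATTEMPTS] [DELAY_SECONDS]
# Runs PROBE_CMD (expected to print an HTTP status code) until it prints 200.
# Returns 0 on success, 1 once ATTEMPTS tries have failed.
wait_ready() {
  local probe=$1 attempts=${2:-10} delay=${3:-5} i status
  for (( i = 1; i <= attempts; i++ )); do
    status=$("$probe") || status=000
    if [[ $status == 200 ]]; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Hypothetical probe: assumes the default <app>-<slot>.azurewebsites.net
# hostname and a caller that can reach it (for example, a runner in the VNet).
probe_staging() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    https://app-orders-prod-staging.azurewebsites.net/health/ready
}

# gated_swap: preview, probe, then complete the swap or reset it.
gated_swap() {
  local swap=(az webapp deployment slot swap -g rg-app-prod -n app-orders-prod
              --slot staging --target-slot production)
  "${swap[@]}" --action preview || return 1
  if wait_ready probe_staging "${READY_ATTEMPTS:-10}" "${READY_DELAY:-5}"; then
    "${swap[@]}" --action swap
  else
    "${swap[@]}" --action reset
    return 1
  fi
}
```

Running gated_swap from the deploy job gives a non-zero exit whenever the swap was reset, which most CI systems treat as a failed release.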

Step 5 — Health checks, scaling, and resilience

Enable Health Check so the platform pulls unhealthy instances out of rotation and recycles them. App Service polls the path across instances and stops routing to any that fail consistently:

az webapp config set -g rg-app-prod -n app-orders-prod --generic-configurations '{"healthCheckPath": "/health/live"}'

Use a liveness path (/health/live — is the process up?) for Health Check and a readiness path (/health/ready — are dependencies good?) for warm-up. Conflating them recycles healthy instances during a transient dependency blip.

Add autoscale on the plan. Scale on a signal that reflects load (CPU here; queue depth or HTTP queue length are often better):

az monitor autoscale create -g rg-app-prod \
  --resource "$(az appservice plan show -g rg-app-prod -n plan-orders-prod --query id -o tsv)" \
  --name autoscale-orders --min-count 2 --max-count 10 --count 2

az monitor autoscale rule create -g rg-app-prod --autoscale-name autoscale-orders \
  --condition "CpuPercentage > 70 avg 10m" --scale out 2

az monitor autoscale rule create -g rg-app-prod --autoscale-name autoscale-orders \
  --condition "CpuPercentage < 30 avg 10m" --scale in 1

Autoscale operates on the plan, which both slots share. Staging instances consume the same plan capacity, so size max-count with headroom for a slot running warm during a deploy. Keep min-count at 2+ so production survives an instance recycle.

Step 6 — Observability and slot-aware alerting

Wire Application Insights for distributed tracing and live metrics, and ship platform logs to Log Analytics via diagnostic settings:

APPI_CONN=$(az monitor app-insights component show -g rg-app-prod --app appi-orders --query connectionString -o tsv)
az webapp config appsettings set -g rg-app-prod -n app-orders-prod \
  --settings APPLICATIONINSIGHTS_CONNECTION_STRING="$APPI_CONN"

az monitor diagnostic-settings create \
  --name diag-to-law \
  --resource "$(az webapp show -g rg-app-prod -n app-orders-prod --query id -o tsv)" \
  --workspace "$(az monitor log-analytics workspace show -g rg-app-prod -n law-orders --query id -o tsv)" \
  --logs    '[{"category":"AppServiceHTTPLogs","enabled":true},{"category":"AppServiceConsoleLogs","enabled":true},{"category":"AppServiceAppLogs","enabled":true}]' \
  --metrics '[{"category":"AllMetrics","enabled":true}]'

Apply diagnostic settings to the staging slot too — a slot is a distinct resource and won’t inherit them. Scope production alerts (5xx rate, response-time P95, health-check failures) to the production slot resource ID so a noisy staging deploy doesn’t page on-call.
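
Because a slot is a child resource of the app, its resource ID is the app's ID with /slots/<name> appended. The helper below is a small sketch for building that ID when scoping diagnostic settings or alerts; slot_resource_id is a hypothetical name.

```shell
#!/usr/bin/env bash
# slot_resource_id APP_RESOURCE_ID SLOT_NAME
# Prints the ARM resource ID of a deployment slot. The production slot is the
# app resource itself, so its ID is returned unchanged.
slot_resource_id() {
  local app_id=$1 slot=$2
  if [[ $slot == production ]]; then
    echo "$app_id"
  else
    echo "$app_id/slots/$slot"
  fi
}

# Example use (the remaining diagnostic-settings arguments are as shown earlier):
#   APP_ID=$(az webapp show -g rg-app-prod -n app-orders-prod --query id -o tsv)
#   az monitor diagnostic-settings create --name diag-to-law \
#     --resource "$(slot_resource_id "$APP_ID" staging)" ...
```

Querying the ID with az webapp show --slot staging --query id works just as well; the helper is mainly useful where only the app ID is at hand, such as in alert-rule templates.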

Enterprise scenario

A payments team locked down app-orders-prod exactly as above — private endpoint, publicNetworkAccess=Disabled, swap-with-preview from a self-hosted VNet runner. The first private-network release passed every smoke test, then 500ed in production seconds after the swap completed. Staging under preview was healthy; production was not.

The cause was Key Vault references and the private endpoint colliding at swap time. The vault had its own private endpoint, but the freshly-restarted production instances re-resolved every @Microsoft.KeyVault(...) reference on startup, and WEBSITE_VNET_ROUTE_ALL=1 forced that DNS lookup through the VNet — where the privatelink.vaultcore.azure.net zone was linked to the spoke but the conditional-forwarder rule on the hub DNS server hadn’t been updated. Staging had cached resolved secrets from before the DNS change; production started cold and couldn’t reach the vault. The warm-up ping on /health/ready should have caught it, but the readiness probe only checked SQL, not secret resolution.

Two fixes. First, make readiness actually prove the dependency chain — resolve a sentinel secret, not just open a DB connection:

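// Packages assumed: AspNetCore.HealthChecks.AzureKeyVault, AspNetCore.HealthChecks.SqlServer
using Azure.Identity;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;

// Register checks before builder.Build(); vaultUri and sqlConn come from configuration.
builder.Services.AddHealthChecks()
    .AddAzureKeyVault(new Uri(vaultUri), new DefaultAzureCredential(),
        o => o.AddSecret("health-canary"), tags: new[] { "ready" })
    .AddSqlServer(sqlConn, tags: new[] { "ready" });
var app = builder.Build();
// Only checks tagged "ready" run on the readiness endpoint.
app.MapHealthChecks("/health/ready", new HealthCheckOptions {
    Predicate = c => c.Tags.Contains("ready")
});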

Second, gate the swap on it explicitly and confirm the vault is reachable from the integration subnet before releasing:

az webapp config appsettings set -g rg-app-prod -n app-orders-prod --slot staging \
  --slot-settings WEBSITE_SWAP_WARMUP_PING_PATH=/health/ready WEBSITE_SWAP_WARMUP_PING_STATUSES=200
# From Kudu/SSH: must return 10.20.2.x, not a public IP (on Windows plans, nameresolver is the more reliable tool)
nslookup kv-orders-prod.vault.azure.net

The lesson: a private endpoint you add for one resource changes the DNS blast radius for every service that resolves a name at startup. Readiness checks have to exercise the secrets path, or warm-up gating is theater.

Verify

# Outbound egresses from the NAT gateway's public IP (run from Kudu console / SSH in the app)
curl -s https://ifconfig.me        # should return pip-nat-app's IP

# Public access is off — this must fail from the internet
curl -I https://app-orders-prod.azurewebsites.net    # expect HTTP 403

# From inside the VNet, the hostname resolves to the private IP
nslookup app-orders-prod.azurewebsites.net           # expect 10.20.2.x

# Settings that use Key Vault references (names only, so no secret value is printed)
az webapp config appsettings list -g rg-app-prod -n app-orders-prod \
  --query "[?contains(value, 'KeyVault')].name" -o tsv

# Slot settings are sticky (slotSetting: true on env-specific keys)
az webapp config appsettings list -g rg-app-prod -n app-orders-prod --slot staging \
  --query "[?slotSetting].name" -o tsv

# Health endpoints
curl -s https://<vnet-reachable-host>/health/live && echo OK

Production-readiness checklist

Pitfalls

Flipping WEBSITE_VNET_ROUTE_ALL before egress exists. Once the setting is on, the subnet’s NSGs and routes govern internet traffic; if they block it or point at a firewall that is not ready, internet calls break. Configure egress first.
Forgetting Private DNS. A private endpoint without the linked privatelink.azurewebsites.net zone leaves the hostname resolving to the public IP — connections still fail. The DNS zone group is not optional.
Non-sticky environment settings. If a staging DB connection string isn’t a slot setting, a swap promotes it to production. Audit slotSetting before every release.
A trivial health endpoint. /health/ready returning 200 unconditionally defeats warm-up gating and ships broken instances. Make it check real dependencies.
Public-hosted CI/CD after lockdown. Disabling public access also blocks Kudu/SCM from the public internet — move the deploy step into the VNet first.

With outbound pinned, inbound private, secrets out of config, and swaps gated on warm health checks, the app is no longer a default deployment — it is a network-isolated production service you can ship to safely, on demand, with a one-command rollback.