Most Azure networks segment at the subnet boundary: a web subnet, an app subnet, a data subnet, and NSGs that allow 10.0.1.0/24 -> 10.0.2.0/24 on a port. It works until it doesn’t. The moment two workloads share a subnet, your NSG can’t tell them apart – they’re the same CIDR, so a rule that lets the payments API talk to the database also lets the marketing site talk to it. And every IP you hardcode becomes a maintenance liability the day autoscale hands out a new address. Micro-segmentation fixes both problems by moving the unit of policy from a subnet to a workload identity. This walkthrough builds that model on Azure with Application Security Groups (ASGs), a default-deny baseline, a priority architecture that keeps platform and app-team rules from colliding, and the Azure Policy guardrails that keep it intact fleet-wide.
1. Why subnet-level NSGs stop scaling
Subnet-coarse rules fail along three axes, and they fail quietly:
- They can’t distinguish same-subnet workloads. An NSG rule keyed on a CIDR treats every NIC in that range identically. Lateral movement – the attacker technique of pivoting from one compromised host to its neighbors – happens inside the subnet, exactly where a CIDR rule is blind.
- They hardcode IPs into rules. Scale-out, re-imaging, and ephemeral addressing churn the IP space. Every churn either breaks a rule or forces someone to widen it to a /24 “to be safe,” which erodes the boundary you were trying to enforce.
- They explode combinatorially. With N tiers across M spokes, explicit IP rules trend toward N x M and beyond. Nobody can read a 200-rule NSG and tell you what it permits.
Micro-segmentation’s actual payoff is not “more firewalls.” It’s a default-deny posture where every allowed flow is named, intentional, and tied to a workload role rather than an address. You stop describing the network and start describing the application.
The mental shift: an NSG rule should read like a sentence about your application – “web may reach app on 443” – not like a routing table entry. ASGs are what make that sentence possible.
2. Application Security Groups as workload identity
An ASG (Microsoft.Network/applicationSecurityGroups) is a logical label you attach to NICs. Instead of writing source = 10.0.1.0/24, you write source = asg-web, and Azure resolves the membership to whatever NICs currently carry that label – including ones autoscale created thirty seconds ago.
Create the ASGs that mirror your application’s roles, not its topology:
RG=rg-app-prod
LOC=eastus2
for tier in web app data; do
az network asg create \
--resource-group "$RG" --name "asg-${tier}" --location "$LOC"
done
Attach a NIC to one or more ASGs at the IP configuration level. A NIC can belong to several ASGs simultaneously, which is how you express overlapping roles (e.g., a host that is both asg-app and asg-patch-managed):
# Associate an existing NIC's ipconfig with the web ASG
az network nic ip-config update \
--resource-group "$RG" \
--nic-name nic-web-vm01 --name ipconfig1 \
--application-security-groups asg-web
For VM Scale Sets, set the ASG in the network profile so every instance is born into the right group:
az vmss update \
--resource-group "$RG" --name vmss-web \
--set virtualMachineProfile.networkProfile.networkInterfaceConfigurations[0]\
.ipConfigurations[0].applicationSecurityGroups='[{"id":"/subscriptions/<sub>/resourceGroups/'"$RG"'/providers/Microsoft.Network/applicationSecurityGroups/asg-web"}]'
# Then roll the instances so the new model takes effect
az vmss update-instances --resource-group "$RG" --name vmss-web --instance-ids '*'
Two constraints to internalize before you design around ASGs:
- Same region. An ASG and every NIC referencing it must live in the same Azure region.
- Same VNet per rule. Within a single NSG rule, all NICs referenced by the source ASG(s) and destination ASG(s) must belong to the same virtual network. ASGs are an intra-VNet construct; for cross-VNet flows you fall back to service tags or IP ranges, or you inspect at a firewall in the hub.
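Both constraints can be checked before a rule is ever submitted. The following is a minimal sketch, assuming you can export a NIC inventory (name, region, VNet, ASG labels) from your IaC state or from the CLI; the `Nic` class, the function names, and the sample VNet names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nic:
    name: str
    region: str
    vnet: str
    asgs: frozenset  # names of the ASGs this NIC's ipconfig carries


def region_mismatches(nics, asg_regions):
    """Return (nic, asg) pairs where the NIC's region differs from the ASG's."""
    return [
        (nic.name, asg)
        for nic in nics
        for asg in sorted(nic.asgs)
        if asg_regions[asg] != nic.region
    ]


def vnets_spanned(nics, asg_names):
    """Return the set of VNets holding NICs that carry any of the named ASGs."""
    wanted = set(asg_names)
    return {nic.vnet for nic in nics if nic.asgs & wanted}


# Hypothetical inventory, for illustration only.
NICS = [
    Nic("nic-web-vm01", "eastus2", "vnet-spoke-a", frozenset({"asg-web"})),
    Nic("nic-app-vm01", "eastus2", "vnet-spoke-a", frozenset({"asg-app"})),
    Nic("nic-data-vm01", "eastus2", "vnet-spoke-b", frozenset({"asg-data"})),
]
ASG_REGIONS = {"asg-web": "eastus2", "asg-app": "eastus2", "asg-data": "eastus2"}

if __name__ == "__main__":
    print(region_mismatches(NICS, ASG_REGIONS))
    # A rule is only valid if its source and destination ASGs span one VNet.
    print(vnets_spanned(NICS, ["asg-web", "asg-app"]))
    print(vnets_spanned(NICS, ["asg-app", "asg-data"]))
```

If `vnets_spanned` returns more than one VNet for the ASGs named in a single rule, that flow needs the cross-VNet fallback rather than an ASG rule.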
3. Designing a default-deny baseline
Every NSG already ships with non-deletable default rules. The ones that matter:
| Direction | Priority | Name | Effect |
|---|---|---|---|
| Inbound | 65000 | AllowVnetInBound | Allow VirtualNetwork -> VirtualNetwork |
| Inbound | 65001 | AllowAzureLoadBalancerInBound | Allow AzureLoadBalancer -> Any |
| Inbound | 65500 | DenyAllInBound | Deny Any -> Any |
| Outbound | 65000 | AllowVnetOutBound | Allow VirtualNetwork -> VirtualNetwork |
| Outbound | 65001 | AllowInternetOutBound | Allow Any -> Internet |
| Outbound | 65500 | DenyAllOutBound | Deny Any -> Any |
The trap is AllowVnetInBound at 65000: by default, any VM in the VNet can reach any other VM on any port. That is the opposite of micro-segmentation. The whole game is to shadow those permissive 65000 defaults with your own lower-numbered (higher-priority) rules: explicit allows for the flows you want, then an explicit deny that catches everything else before the default allow can fire.
Usable priorities run 100 to 4096 (lower number = evaluated first; first match wins and stops evaluation). Build the baseline bottom-up. First, the explicit allows between named ASGs:
NSG=nsg-app-tier
# web -> app on 443
az network nsg rule create \
--resource-group "$RG" --nsg-name "$NSG" \
--name Allow-Web-To-App-443 --priority 1000 \
--direction Inbound --access Allow --protocol Tcp \
--source-asgs asg-web --destination-asgs asg-app \
--destination-port-ranges 443 --source-port-ranges '*'
# app -> data on 1433 (SQL)
az network nsg rule create \
--resource-group "$RG" --nsg-name "$NSG" \
--name Allow-App-To-Data-1433 --priority 1010 \
--direction Inbound --access Allow --protocol Tcp \
--source-asgs asg-app --destination-asgs asg-data \
--destination-port-ranges 1433 --source-port-ranges '*'
Then the rule that does the actual segmentation work – an intra-VNet deny that is evaluated after your allows (it has a higher number) and before the 65000 default (it has a lower number):
# Catch-all: anything VNet-internal not explicitly allowed above is denied
az network nsg rule create \
--resource-group "$RG" --nsg-name "$NSG" \
--name Deny-All-VNet-Inbound --priority 4000 \
--direction Inbound --access Deny --protocol '*' \
--source-address-prefixes VirtualNetwork \
--destination-address-prefixes VirtualNetwork \
--destination-port-ranges '*' --source-port-ranges '*'
Now the evaluation order is: your allows (1000-1010) win for sanctioned flows, your deny (4000) blocks every other east-west attempt, and the permissive AllowVnetInBound at 65000 never gets reached. That single deny rule converts “flat VNet” into “default-deny VNet.”
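The evaluation order can be modelled in a few lines. This is a simplified sketch, not Azure's implementation: it assumes single-port rules, treats `VirtualNetwork` as "any workload in the VNet", and represents each workload by the set of ASG names it carries. The `Rule` class and the `evaluate` function are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int
    access: str       # "Allow" or "Deny"
    source: str       # an ASG name, or "VirtualNetwork" for any VNet member
    destination: str  # same convention as source
    port: str         # a single port as text, or "*" for any port


def matches(rule, src_roles, dst_roles, port):
    src_ok = rule.source == "VirtualNetwork" or rule.source in src_roles
    dst_ok = rule.destination == "VirtualNetwork" or rule.destination in dst_roles
    port_ok = rule.port == "*" or rule.port == str(port)
    return src_ok and dst_ok and port_ok


def evaluate(rules, src_roles, dst_roles, port):
    """First match wins: walk rules by ascending priority number."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if matches(rule, src_roles, dst_roles, port):
            return rule.access, rule.name
    return "Deny", "DenyAllInBound"  # the 65500 default catches the rest


RULES = [
    Rule("Allow-Web-To-App-443", 1000, "Allow", "asg-web", "asg-app", "443"),
    Rule("Allow-App-To-Data-1433", 1010, "Allow", "asg-app", "asg-data", "1433"),
    Rule("Deny-All-VNet-Inbound", 4000, "Deny", "VirtualNetwork", "VirtualNetwork", "*"),
    Rule("AllowVnetInBound", 65000, "Allow", "VirtualNetwork", "VirtualNetwork", "*"),
]

if __name__ == "__main__":
    print(evaluate(RULES, {"asg-web"}, {"asg-app"}, 443))
    print(evaluate(RULES, {"asg-web"}, {"asg-data"}, 1433))
    # Remove the 4000 deny and the same flow falls through to the default allow.
    flat = [r for r in RULES if r.priority != 4000]
    print(evaluate(flat, {"asg-web"}, {"asg-data"}, 1433))
```

The last call is the point of the exercise: without the explicit deny, web reaching data on 1433 is allowed by `AllowVnetInBound`.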
Do not delete or try to reorder the 65000 defaults – you can’t. You override them. A lower priority number on your own rule is the only mechanism. Treat the 65500 DenyAllInBound as your safety net for non-VNet sources, and your 4000 Deny-All-VNet-Inbound as the one that disciplines lateral traffic.
One field rule worth memorizing: within a single rule’s source (or destination), you can specify ASGs or address prefixes, but not both. If you need an IP-based source and an ASG-based source for the same flow, that’s two rules.
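A lint step can enforce this field rule before deployment. The sketch below assumes rules are plain dictionaries with `source_asgs`, `source_address_prefixes`, and the matching destination keys. That shape, the `-asg`/`-ip` name suffixes, and the `priority + 1` choice are assumptions for illustration, not an Azure schema.

```python
def endpoint_problems(rule):
    """Report every side of the rule that mixes ASGs with address prefixes."""
    problems = []
    for side in ("source", "destination"):
        asgs = rule.get(f"{side}_asgs") or []
        prefixes = rule.get(f"{side}_address_prefixes") or []
        if asgs and prefixes:
            problems.append(f"{side}: ASGs and address prefixes cannot share one rule")
    return problems


def split_mixed_source(rule):
    """Turn one mixed-source rule into two valid rules; leave valid rules alone."""
    asgs = rule.get("source_asgs") or []
    prefixes = rule.get("source_address_prefixes") or []
    if not (asgs and prefixes):
        return [dict(rule)]
    by_asg = {**rule, "name": rule["name"] + "-asg", "source_address_prefixes": []}
    # Priorities must be unique per direction, so the second rule takes the next slot.
    by_ip = {**rule, "name": rule["name"] + "-ip", "source_asgs": [],
             "priority": rule["priority"] + 1}
    return [by_asg, by_ip]


# Hypothetical rule that mixes an ASG source with an IP source.
MIXED = {
    "name": "Allow-To-App-443",
    "priority": 1020,
    "source_asgs": ["asg-web"],
    "source_address_prefixes": ["10.50.0.0/24"],
    "destination_asgs": ["asg-app"],
    "destination_address_prefixes": [],
}

if __name__ == "__main__":
    print(endpoint_problems(MIXED))
    for fixed in split_mixed_source(MIXED):
        print(fixed["name"], fixed["priority"], endpoint_problems(fixed))
```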
4. Rule priority architecture: platform vs app-team
The moment more than one team writes into NSGs, priority numbers become a shared namespace – and an unmanaged shared namespace is a collision waiting to happen. Carve the 100-4096 range into bands with documented ownership:
| Priority band | Owner | Purpose |
|---|---|---|
| 100-499 | Platform | Mandatory denies (e.g., block RDP/SSH from Internet, block known-bad) |
| 500-899 | Platform | Mandatory allows (management plane, AzureLoadBalancer health probes, monitoring) |
| 1000-3499 | App team | Application allow flows between ASGs |
| 3500-3999 | App team | App-specific denies |
| 4000-4096 | Platform | The default-deny backstop (Deny-All-VNet-Inbound) |
This is more than etiquette. Because first match wins, a platform deny at priority 200 will always beat an app-team allow at 1000 – which is exactly what you want for a non-negotiable control like “no inbound SSH from the internet, ever.” The platform’s mandatory rules carry lower priority numbers than anything in the app-team band, so as long as app teams stay inside their band they cannot punch a hole through them:
# Platform-mandated, low-number, app teams cannot override it
az network nsg rule create \
--resource-group "$RG" --nsg-name "$NSG" \
--name Deny-Internet-SSH-RDP --priority 200 \
--direction Inbound --access Deny --protocol Tcp \
--source-address-prefixes Internet \
--destination-address-prefixes '*' \
--destination-port-ranges 22 3389 --source-port-ranges '*'
The banding gives you a contract: platform owns the floor and the ceiling, app teams own the middle. Encode the bands in your IaC module as input validation so a bad priority fails at plan time, not in production.
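The band contract is small enough to express as a validation function. This is a sketch of the idea in Python rather than in any particular IaC tool's validation syntax; the band boundaries come from the table above, while the `validate_priority` name and the `"platform"`/`"app"` team labels are assumptions.

```python
from typing import Optional

# (low, high, owner) taken from the priority band table.
BANDS = [
    (100, 499, "platform"),
    (500, 899, "platform"),
    (1000, 3499, "app"),
    (3500, 3999, "app"),
    (4000, 4096, "platform"),
]


def band_owner(priority: int) -> Optional[str]:
    """Return the owning team for a priority, or None if no band covers it."""
    for low, high, owner in BANDS:
        if low <= priority <= high:
            return owner
    return None


def validate_priority(priority: int, team: str) -> None:
    """Raise ValueError when a team uses a priority outside its own bands."""
    owner = band_owner(priority)
    if owner is None:
        raise ValueError(f"priority {priority} is outside every documented band")
    if owner != team:
        raise ValueError(f"priority {priority} belongs to {owner}, not {team}")


if __name__ == "__main__":
    validate_priority(1100, "app")  # passes silently
    try:
        validate_priority(200, "app")
    except ValueError as err:
        print(err)
```

Note that 900-999 is assigned to nobody in the table, so the sketch rejects it; whether that gap is a reserve or an oversight is a decision for whoever owns the contract.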
5. East-west isolation: blocking lateral movement in one subnet
Here is the payoff that subnet CIDR rules can never deliver. Put vm-app01 and vm-app02 in the same subnet, both labeled asg-app. With a plain subnet rule you cannot stop them from talking to each other – they’re the same CIDR. With ASGs plus the default-deny baseline, app-to-app traffic is simply never on the allow list, so the 4000 deny catches it.
But sometimes peers legitimately need to talk (a clustered cache, a leader election port) and you want to scope that, not open it wide. Express the intra-tier flow as ASG-to-the-same-ASG on exactly the port required:
# Allow app peers to gossip on 7000 ONLY; everything else app<->app stays denied
az network nsg rule create \
--resource-group "$RG" --nsg-name "$NSG" \
--name Allow-App-Peer-Gossip-7000 --priority 1200 \
--direction Inbound --access Allow --protocol Tcp \
--source-asgs asg-app --destination-asgs asg-app \
--destination-port-ranges 7000 --source-port-ranges '*'
Now vm-app01 can reach vm-app02 on 7000 and nothing else. If vm-app01 is compromised, the attacker cannot SSH to vm-app02, cannot hit its management agent, cannot scan its open ports – the baseline deny stands for every port except the one gossip flow. That is lateral-movement containment at the NIC, inside a shared subnet, which is the precise gap an IP-based NSG leaves open.
6. Enforcing the model fleet-wide with Azure Policy
A baseline you apply by hand erodes within a quarter. Two enforcement jobs matter: every subnet must have an NSG, and no NSG may contain a rule that allows inbound SSH/RDP from the internet. Azure Policy does both – the first with a Modify/DeployIfNotExists effect, the second with Deny so the bad rule never lands.
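There is a built-in for the dangerous-management-port case. Assign the policy RDP access from the Internet should be blocked (and its SSH counterpart) to audit existing rules; check each definition’s allowed effects before counting on deny, because built-in definitions often expose audit only. For a custom deny keyed on NSG rules, target the security-rule aliases: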
{
"policyRule": {
"if": {
"allOf": [
{ "field": "type", "equals": "Microsoft.Network/networkSecurityGroups/securityRules" },
{ "field": "Microsoft.Network/networkSecurityGroups/securityRules/access", "equals": "Allow" },
{ "field": "Microsoft.Network/networkSecurityGroups/securityRules/direction", "equals": "Inbound" },
{ "field": "Microsoft.Network/networkSecurityGroups/securityRules/sourceAddressPrefix", "in": ["Internet", "*", "0.0.0.0/0"] },
{ "anyOf": [
{ "field": "Microsoft.Network/networkSecurityGroups/securityRules/destinationPortRange", "in": ["22", "3389", "*"] }
]}
]
},
"then": { "effect": "deny" }
}
}
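One limitation to be aware of: the `in` conditions above compare exact strings, so a rule whose port field is written as a range such as `20-25` does not match `"22"`. A pre-merge check in the pipeline can expand ranges before the rule reaches Azure. The sketch below assumes rules are dictionaries using the ARM property names (`destinationPortRange`, `destinationPortRanges`, and so on); the function names are hypothetical, and this complements the policy rather than replacing it.

```python
MANAGEMENT_PORTS = (22, 3389)
OPEN_SOURCES = {"Internet", "*", "0.0.0.0/0"}


def range_covers(port_range: str, port: int) -> bool:
    """True if a port expression ('*', '22', or '20-25') includes the port."""
    text = port_range.strip()
    if text == "*":
        return True
    if "-" in text:
        low, high = text.split("-", 1)
        return int(low) <= port <= int(high)
    return int(text) == port


def _values(rule: dict, singular: str, plural: str) -> list:
    """Collect a property from both its single-value and list-valued forms."""
    values = list(rule.get(plural) or [])
    if rule.get(singular):
        values.append(rule[singular])
    return values


def exposes_management_port(rule: dict) -> bool:
    """True if an inbound allow from an open source reaches port 22 or 3389."""
    if rule.get("access") != "Allow" or rule.get("direction") != "Inbound":
        return False
    sources = _values(rule, "sourceAddressPrefix", "sourceAddressPrefixes")
    if not OPEN_SOURCES.intersection(sources):
        return False
    ranges = _values(rule, "destinationPortRange", "destinationPortRanges")
    return any(range_covers(r, p) for r in ranges for p in MANAGEMENT_PORTS)


if __name__ == "__main__":
    sneaky = {"access": "Allow", "direction": "Inbound",
              "sourceAddressPrefix": "Internet", "destinationPortRange": "20-25"}
    print(exposes_management_port(sneaky))
```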
For “every subnet has an NSG,” use the built-in Subnets should be associated with a Network Security Group in audit mode first to size the gap, then escalate to a DeployIfNotExists that attaches a baseline NSG to any subnet missing one. The subnet alias to key off is Microsoft.Network/virtualNetworks/subnets[*].networkSecurityGroup.id.
Assign at the management-group scope so new subscriptions inherit it automatically:
az policy assignment create \
--name deny-internet-mgmt-ports \
--display-name "Deny inbound SSH/RDP from Internet" \
--policy "<policy-definition-id>" \
--scope "/providers/Microsoft.Management/managementGroups/mg-landingzones" \
--enforcement-mode Default
To catch drift – someone editing an NSG out-of-band – lean on Policy’s continuous compliance scan (non-compliant resources surface in the compliance blade and via az policy state) and wire an Activity Log alert on Microsoft.Network/networkSecurityGroups/securityRules/write so an unexpected rule change pages someone in near real time.
7. Verify
A segmentation model you haven’t tested is a hypothesis. Validate it two ways: with flow telemetry and with a deliberate attempt to move laterally.
Flow logs. Enable NSG flow logs (or, going forward, VNet flow logs) with traffic analytics into Log Analytics. The deny you care about shows up as flow tuples with a deny decision. Query the analytics table for blocked east-west flows – this is your proof the baseline is firing, and your early-warning for a missing allow:
AzureNetworkAnalytics_CL
| where SubType_s == "FlowLog" and FlowType_s == "IntraVNet"
| where FlowStatus_s == "D" // D = Denied
| summarize DeniedFlows = count() by SrcIP_s, DestIP_s, DestPort_d, NSGRule_s
| order by DeniedFlows desc
A spike of denied intra-VNet flows on an unexpected port is either an attacker probing or a legitimate flow you forgot to allow. Both are exactly what you want surfaced.
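If you want to sanity-check raw flow logs without going through traffic analytics, the tuples themselves are easy to parse. This sketch assumes the version 2 NSG flow-log tuple layout (timestamp, source IP, destination IP, source port, destination port, protocol, direction, decision, then flow state and counters); the sample tuples are made up for illustration, so verify the field order against your own logs before relying on it.

```python
from collections import Counter


def parse_tuple(text: str) -> dict:
    """Split one comma-separated flow tuple into the fields used below."""
    parts = text.split(",")
    return {
        "src_ip": parts[1],
        "dst_ip": parts[2],
        "dst_port": int(parts[4]),
        "protocol": parts[5],   # T = TCP, U = UDP
        "direction": parts[6],  # I = inbound, O = outbound
        "decision": parts[7],   # A = allowed, D = denied
    }


def denied_by_port(tuples) -> Counter:
    """Count denied flows per destination port."""
    counts = Counter()
    for text in tuples:
        flow = parse_tuple(text)
        if flow["decision"] == "D":
            counts[flow["dst_port"]] += 1
    return counts


# Made-up sample tuples, for illustration only.
SAMPLE = [
    "1700000000,10.0.2.4,10.0.2.5,50000,22,T,I,D,B,,,,",
    "1700000005,10.0.2.4,10.0.2.5,50001,22,T,I,D,B,,,,",
    "1700000009,10.0.2.4,10.0.2.5,50002,7000,T,I,A,B,,,,",
    "1700000012,10.0.1.7,10.0.2.5,50100,3389,T,I,D,B,,,,",
]

if __name__ == "__main__":
    for port, count in denied_by_port(SAMPLE).most_common():
        print(port, count)
```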
Lateral-movement test. From a host in asg-app, prove you cannot reach a peer on a non-sanctioned port, then confirm the sanctioned one works:
# From vm-app01, target vm-app02's private IP
nc -vz -w 3 10.0.2.5 22 # expect: timeout (the 4000 deny drops the packet silently; "connection refused" would mean it reached the host)
nc -vz -w 3 10.0.2.5 7000 # expect: succeeded (allowed by Allow-App-Peer-Gossip-7000)
Cross-check the decision without sending a packet using IP Flow Verify, which evaluates the effective NSG rules for a hypothetical flow and names the rule that decided it:
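# The deny is an inbound rule, so evaluate the flow at the receiving NIC (vm-app02).
# nic-app-vm02 is an assumed NIC name following the nic-app-vm01 pattern.
az network watcher test-ip-flow \
--resource-group "$RG" \
--vm vm-app02 --nic nic-app-vm02 \
--direction Inbound --protocol TCP \
--local 10.0.2.5:22 --remote 10.0.2.4:50000
# Output: Access=Deny, RuleName=Deny-All-VNet-Inbound (or the relevant rule)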
If IP Flow Verify says Allow for a flow you intended to block, you have a higher-priority allow shadowing your deny – go re-read your priority bands.
8. Operational hygiene: scale-out, ephemeral IPs, lifecycle
The model only stays correct if membership and rules are managed, not hand-tended:
- Scale-out is automatic if you bake ASGs into the template. Set the ASG in the VMSS network profile and in your VM IaC module’s NIC block. A scaled-out instance inherits the label and is governed the instant it boots – no rule edit, no IP to chase. This is the entire reason ASGs beat IP rules at scale.
- Ephemeral IPs become a non-issue. Because rules reference ASGs, a new DHCP lease or a re-imaged host changes nothing in the NSG. Stop putting addresses in rules; the only place an IP should appear is a cross-VNet flow that genuinely can’t use an ASG.
- Rule lifecycle = source control. NSG rules live in Bicep/Terraform, reviewed by PR, with the priority bands enforced as module validation. Out-of-band portal edits are drift, and your Policy compliance scan plus Activity Log alert exist to catch them. Periodically reconcile: any rule that flow logs show zero hits on over 90 days is a candidate for removal – dead allows are attack surface.
- Name rules like sentences. Allow-Web-To-App-443 is auditable at a glance; rule-7 is not. The name is documentation that travels with the resource.
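The 90-day reconciliation in the rule-lifecycle item reduces to a comparison between the rules you have defined and the rules that flow logs show being hit. A minimal sketch, assuming you have already aggregated hit counts per rule name over the window (for example from the flow-log query in the Verify section); the rule list, the counts, and the `Allow-Legacy-Batch-8080` rule are hypothetical.

```python
def removal_candidates(rules, hits_90d):
    """Return names of allow rules with zero recorded hits in the window."""
    return sorted(
        rule["name"]
        for rule in rules
        if rule["access"] == "Allow" and hits_90d.get(rule["name"], 0) == 0
    )


# Hypothetical rule set and hit counts, for illustration only.
RULES = [
    {"name": "Allow-Web-To-App-443", "access": "Allow"},
    {"name": "Allow-App-To-Data-1433", "access": "Allow"},
    {"name": "Allow-Legacy-Batch-8080", "access": "Allow"},
    {"name": "Deny-All-VNet-Inbound", "access": "Deny"},
]
HITS_90D = {
    "Allow-Web-To-App-443": 1200,
    "Allow-App-To-Data-1433": 5400,
    # Allow-Legacy-Batch-8080 never appears in the logs.
}

if __name__ == "__main__":
    print(removal_candidates(RULES, HITS_90D))
```

Deny rules are deliberately excluded: a deny with zero hits is still doing its job, whereas an allow with zero hits is unused attack surface. Treat the output as a review list, not an automatic delete.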
Enterprise scenario
A fintech platform team ran a regulated workload where the audit requirement was explicit: the payments-processing tier must be reachable only from the API tier, on one port, and no other workload in the environment may initiate a connection to it. The catch was density. To control cost, payments pods and several unrelated internal services shared a single /23 subnet – a decision finance had already signed off on and would not reverse. Their existing NSG allowed 10.40.0.0/23 -> 10.40.0.0/23 on the payments port, which technically passed the port check but failed the isolation requirement: every service in that /23, payments or not, could open the payments port. The auditor flagged it.
They could not re-subnet without a migration window they didn’t have. The fix was to make identity, not address, the boundary. They labeled only the payments NICs asg-payments and only the API NICs asg-api, then replaced the CIDR rule with an ASG rule and let the default-deny baseline handle everyone else:
az network nsg rule create \
--resource-group rg-pay-prod --nsg-name nsg-pay \
--name Allow-Api-To-Payments-8443 --priority 1100 \
--direction Inbound --access Allow --protocol Tcp \
--source-asgs asg-api --destination-asgs asg-payments \
--destination-port-ranges 8443 --source-port-ranges '*'
With the Deny-All-VNet-Inbound backstop at priority 4000 already in place, the other services in the /23 – same subnet, same CIDR – lost all reachability to the payments tier the moment this rule shipped, because they were never granted an explicit allow. They proved it with IP Flow Verify from a non-API host (Access=Deny, RuleName=Deny-All-VNet-Inbound) and exported the flow-log denies as the audit evidence. No migration, no new subnet, isolation enforced at the NIC by workload identity. The auditor closed the finding.