Two tools dominate declarative Windows configuration, and most teams pick one for the wrong reasons. PowerShell DSC gives you an agent (the Local Configuration Manager) that re-applies state on a schedule with no controller required - excellent for fleets that must self-heal even when the orchestrator is unreachable. Ansible gives you agentless push from a control node, a vast win_ module library, and orchestration across mixed Linux/Windows estates. They are not competitors; the mature pattern is Ansible as the orchestration plane that ships and triggers DSC for the resources where DSC is genuinely better. This walkthrough builds that combined model end to end - authoring configurations, tuning the LCM, writing idempotent roles, remediating drift, and gating the whole thing behind Pester in CI.
Scope: Windows Server 2019/2022/2025, PowerShell DSC 1.1 (the in-box PSDesiredStateConfiguration that ships with Windows PowerShell 5.1). This is the version the WMI/CIM-based LCM understands. The cross-platform rewrite (DSC 3.0, a standalone Rust executable) is a different runtime with no LCM and no MOF pull server, so do not mix its docs into a 5.1 deployment. Ansible examples assume a recent ansible-core with the ansible.windows and community.windows collections installed.
1. DSC concepts: resources, MOF, and the LCM
Three pieces. Resources are PowerShell modules that know how to test and set one kind of state - a Windows feature, a service, a registry value. Each exposes Get, Test, and Set; the contract is that Test returns a boolean and Set only runs when Test reports drift. That Test-before-Set gate is where idempotency actually lives.
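The Test-before-Set gate can be sketched outside DSC in a few lines. This is a minimal illustration, not a real DSC resource: the function names Test-MarkerFile, Set-MarkerFile, and Invoke-TestThenSet and the temp-file target are hypothetical stand-ins for a resource's Test and Set methods and for the engine that calls them.

```powershell
# Hypothetical stand-in for a resource's Test method:
# returns $true only when the file already holds the desired content.
function Test-MarkerFile {
    param([string]$Path, [string]$Content)
    (Test-Path -LiteralPath $Path) -and
        ((Get-Content -LiteralPath $Path -Raw) -eq $Content)
}

# Hypothetical stand-in for a resource's Set method: writes the desired content.
function Set-MarkerFile {
    param([string]$Path, [string]$Content)
    Set-Content -LiteralPath $Path -Value $Content -NoNewline
}

# Stand-in for the engine: Set runs only when Test reports drift.
function Invoke-TestThenSet {
    param([string]$Path, [string]$Content)
    if (Test-MarkerFile -Path $Path -Content $Content) {
        return 'unchanged'
    }
    Set-MarkerFile -Path $Path -Content $Content
    return 'changed'
}

$target = Join-Path ([System.IO.Path]::GetTempPath()) 'dsc-contract-demo.txt'
Remove-Item -LiteralPath $target -ErrorAction SilentlyContinue

$first  = Invoke-TestThenSet -Path $target -Content 'SMB1=0'
$second = Invoke-TestThenSet -Path $target -Content 'SMB1=0'
"$first, $second"   # changed, unchanged
```

The second call is a no-op because Test already passes. Every well-behaved resource has the same property, and it is what makes repeated consistency checks safe.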
A configuration is a PowerShell function (keyword Configuration) that declares the resources and their target values. Compiling it produces a MOF (Managed Object Format) document per node - a flat, declarative artifact with no PowerShell logic in it. The MOF is what gets applied; the script that generated it is build-time only.
The LCM (Local Configuration Manager) is the agent baked into every Windows box. It ingests a MOF, calls each resource’s Test, applies Set where needed, and - critically - re-checks on a timer, correcting drift when its ConfigurationMode is ApplyAndAutoCorrect. Inspect it:
Get-DscLocalConfigurationManager
# Key fields: RefreshMode (Push|Pull), ConfigurationMode,
# ConfigurationModeFrequencyMins, RebootNodeIfNeeded
Mental model: the configuration is your source of truth, the MOF is the compiled binary, and the LCM is the runtime that enforces it. Drift correction is the LCM’s job, not yours.
Confirm the resources available on the box before you author against them:
Get-DscResource | Select-Object Name, ModuleName, Version
Install-Module -Name PSDscResources, NetworkingDsc -Scope AllUsers
PSDscResources is the supported successor to the old in-box PSDesiredStateConfiguration resources - prefer it for Service, Registry, WindowsFeature, and friends.
2. Authoring a configuration for roles, services, and registry
Here is a real baseline: install IIS, guarantee the W3SVC service is running and automatic, and pin a registry value (disable the legacy SMBv1 server). Parameterize the node list so the same configuration compiles for every host.
Configuration WebBaseline {
    param([string[]]$NodeName = 'localhost')
    Import-DscResource -ModuleName PSDscResources -ModuleVersion 2.12.0.0
    Node $NodeName {
        WindowsFeature IIS {
            Name   = 'Web-Server'
            Ensure = 'Present'
        }
        Service W3SVC {
            Name        = 'W3SVC'
            State       = 'Running'
            StartupType = 'Automatic'
            DependsOn   = '[WindowsFeature]IIS'
        }
        Registry DisableSmb1Server {
            Key       = 'HKLM:\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters'
            ValueName = 'SMB1'
            ValueData = '0'
            ValueType = 'Dword'
            Ensure    = 'Present'
        }
    }
}
# Compile -> emits .\WebBaseline\<NodeName>.mof
WebBaseline -NodeName 'WEB01','WEB02' -OutputPath .\WebBaseline
DependsOn enforces ordering: the service resource will not run until the feature is present. The MOF lands as <NodeName>.mof per node. Apply it locally to validate before you wire up any delivery model:
Start-DscConfiguration -Path .\WebBaseline -Wait -Verbose -Force
Test-DscConfiguration -Detailed # InDesiredState:$true means no drift
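Before applying a MOF, it can help to confirm which resources it actually declares. The sketch below assumes each resource instance in a compiled MOF carries a ResourceID = "[Type]Name" line; the helper name Get-MofResourceId is hypothetical, and the sample text is an abbreviated, hand-written stand-in (its class names are illustrative) rather than real compiler output.

```powershell
# Hypothetical helper: list the ResourceID of every resource instance in MOF text.
function Get-MofResourceId {
    param([Parameter(Mandatory)][string]$MofText)
    # Capture the quoted value of each ResourceID assignment.
    [regex]::Matches($MofText, 'ResourceID\s*=\s*"([^"]+)"') |
        ForEach-Object { $_.Groups[1].Value }
}

# Abbreviated, hand-written sample in the shape of a compiled MOF (assumption).
$sampleMof = @'
instance of MSFT_WindowsFeature as $MSFT_WindowsFeature1ref
{
 ResourceID = "[WindowsFeature]IIS";
 Name = "Web-Server";
 Ensure = "Present";
};
instance of MSFT_ServiceResource as $MSFT_ServiceResource1ref
{
 ResourceID = "[Service]W3SVC";
 DependsOn = {"[WindowsFeature]IIS"};
 Name = "W3SVC";
};
'@

$ids = @(Get-MofResourceId -MofText $sampleMof)
$ids
```

Against a real build the input would be something like Get-Content .\WebBaseline\WEB01.mof -Raw. The DependsOn line is ignored on purpose, because only the ResourceID assignments are matched.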
Keep secrets out of plaintext MOF. DSC supports certificate-encrypted credentials via a configuration-data block referencing a CertificateFile and Thumbprint; never embed a PSCredential in a MOF without it, because the MOF is stored on disk in clear text otherwise.
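A configuration-data block for encrypted credentials might look like the sketch below. The certificate path and thumbprint are placeholders, and Get-NodeWithoutCertificate is a hypothetical pre-compile guard, not part of DSC; the AllNodes keys CertificateFile and Thumbprint are the ones the paragraph above refers to.

```powershell
# Placeholder values throughout (assumptions): swap in your real certificate details.
$configData = @{
    AllNodes = @(
        @{
            NodeName        = 'WEB01'
            CertificateFile = 'C:\certs\WEB01.cer'   # public key used when compiling
            Thumbprint      = '0000000000000000000000000000000000000000'
        },
        @{
            NodeName = 'WEB02'   # deliberately missing certificate details
        }
    )
}

# Hypothetical guard: name every node whose credentials would not be encrypted.
function Get-NodeWithoutCertificate {
    param([Parameter(Mandatory)][hashtable]$ConfigurationData)
    foreach ($node in $ConfigurationData.AllNodes) {
        if (-not $node.CertificateFile -or -not $node.Thumbprint) {
            $node.NodeName
        }
    }
}

$unprotected = @(Get-NodeWithoutCertificate -ConfigurationData $configData)
$unprotected   # WEB02
```

Once the guard returns nothing, the hashtable is passed to the configuration call through its -ConfigurationData parameter. Failing the build while any node is listed keeps an unencrypted credential from ever reaching disk.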
3. Push versus pull, and configuring the LCM
The LCM runs in one of two RefreshMode values, and the choice drives your whole operational model.
| | Push | Pull |
|---|---|---|
| Delivery | Start-DscConfiguration sends the MOF | LCM fetches MOF + modules from a server |
| Scale | Imperative, one-to-many from a controller | Self-service, fleet polls independently |
| Drift correction | Only on next push, or LCM consistency check | LCM re-pulls and re-applies on its own timer |
| Failure mode | Controller down = no new config | Pull server down = nodes keep last-known-good |
Configure the LCM with a meta-configuration - a separate Configuration decorated with [DSCLocalConfigurationManager()]. This sets ApplyAndAutoCorrect, which is what makes the agent remediate drift on every consistency check rather than merely report it:
[DSCLocalConfigurationManager()]
Configuration LcmAutoCorrect {
    Node localhost {
        Settings {
            RefreshMode                    = 'Push'
            ConfigurationMode              = 'ApplyAndAutoCorrect'
            ConfigurationModeFrequencyMins = 30
            RebootNodeIfNeeded             = $false
            ActionAfterReboot              = 'ContinueConfiguration'
        }
    }
}
LcmAutoCorrect -OutputPath .\LcmMeta
Set-DscLocalConfigurationManager -Path .\LcmMeta -Verbose
The three ConfigurationMode values matter:
- ApplyOnly - apply once, never touch again. Drift is yours to find.
- ApplyAndMonitor - apply, then log drift in event logs but do not fix it. Good for audit-only fleets where change must be human-approved.
- ApplyAndAutoCorrect - apply and re-apply every ConfigurationModeFrequencyMins. This is the self-healing mode.
ConfigurationModeFrequencyMins has a documented minimum of 15 minutes, which is also its default, so do not plan around a tighter interval. Thirty is a sane production default - frequent enough to close drift windows, infrequent enough to avoid churn.
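Across a fleet, these settings are worth auditing rather than trusting. The sketch below runs over objects shaped like Get-DscLocalConfigurationManager output; the function name Get-LcmFinding, the 30-minute threshold, and the sample fleet are assumptions made for illustration.

```powershell
# Hypothetical audit: flag nodes that will not self-heal, or do so too slowly.
function Get-LcmFinding {
    param(
        [Parameter(Mandatory)][object[]]$Lcm,
        [int]$MaxFrequencyMins = 30   # assumed policy threshold, not a DSC limit
    )
    foreach ($node in $Lcm) {
        if ($node.ConfigurationMode -ne 'ApplyAndAutoCorrect') {
            [pscustomobject]@{
                Node   = $node.PSComputerName
                Reason = "mode is $($node.ConfigurationMode)"
            }
        }
        elseif ($node.ConfigurationModeFrequencyMins -gt $MaxFrequencyMins) {
            [pscustomobject]@{
                Node   = $node.PSComputerName
                Reason = "checks every $($node.ConfigurationModeFrequencyMins) minutes"
            }
        }
    }
}

# Sample data standing in for results gathered from three hosts.
$fleet = @(
    [pscustomobject]@{ PSComputerName = 'WEB01'; ConfigurationMode = 'ApplyAndAutoCorrect'; ConfigurationModeFrequencyMins = 30 }
    [pscustomobject]@{ PSComputerName = 'WEB02'; ConfigurationMode = 'ApplyAndMonitor'; ConfigurationModeFrequencyMins = 15 }
    [pscustomobject]@{ PSComputerName = 'WEB03'; ConfigurationMode = 'ApplyAndAutoCorrect'; ConfigurationModeFrequencyMins = 240 }
)

$findings = @(Get-LcmFinding -Lcm $fleet)
$findings | Format-Table -AutoSize
```

With the sample data, WEB02 is flagged for its mode and WEB03 for its interval, while WEB01 passes.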
For a pull server, set RefreshMode = 'Pull' and add a ConfigurationRepositoryWeb block with the server URL and a registration key. Note that Microsoft has deprecated the Windows-hosted DSC Pull Server feature - for greenfield pull at scale, Azure Automation State Configuration or a community pull endpoint is the supported direction, but the on-box push and ApplyAndAutoCorrect model below is fully supported and is what I lean on when pairing DSC with Ansible.
4. Managing Windows from Ansible over WinRM and PSRP
Ansible reaches Windows two ways. The default winrm connection plugin speaks WS-Management; the newer psrp plugin uses PowerShell Remoting Protocol over WinRM and is faster for multi-task plays because it reuses a single runspace. Both ride on the same WinRM listener.
First, enable WinRM on the target. The canonical bootstrap is the Ansible project’s ConfigureRemotingForAnsible.ps1, but for production you want an HTTPS listener with a real certificate, not the script’s self-signed default:
# On the Windows host - create an HTTPS listener bound to a known cert
$cert = Get-ChildItem Cert:\LocalMachine\My |
    Where-Object Subject -eq 'CN=web01.corp.example.com'
New-Item -Path WSMan:\localhost\Listener -Transport HTTPS `
    -Address * -CertificateThumbPrint $cert.Thumbprint -Force
Set-Item WSMan:\localhost\Service\Auth\Basic $false
Set-Item WSMan:\localhost\Service\Auth\CredSSP $false
New-NetFirewallRule -DisplayName 'WinRM HTTPS' -Direction Inbound `
    -LocalPort 5986 -Protocol TCP -Action Allow
Then the inventory. Prefer Kerberos auth in a domain; for non-domain hosts, NTLM over HTTPS is acceptable. Avoid Basic and CredSSP unless you have a hard requirement and understand the credential-exposure trade-off.
[web]
web01.corp.example.com
web02.corp.example.com
[web:vars]
ansible_connection=psrp
ansible_port=5986
ansible_psrp_auth=kerberos
ansible_psrp_cert_validation=validate
Validate connectivity with win_ping before any real play - it confirms the listener, auth, and certificate chain in one shot:
ansible web -m ansible.windows.win_ping
5. Writing idempotent Ansible roles with win_ modules
The win_ modules are written to be idempotent: they check current state and report changed only when they actually mutate something. Your job is to use the declarative modules (win_feature, win_service, win_regedit) and resist dropping to raw win_shell, which Ansible cannot reason about for change detection.
Here is the IIS baseline from section 2, expressed as an Ansible role task file:
# roles/web_baseline/tasks/main.yml
- name: Ensure IIS web server role is present
  ansible.windows.win_feature:
    name: Web-Server
    state: present
    include_management_tools: true
  register: iis_feature
- name: Ensure W3SVC is running and automatic
  ansible.windows.win_service:
    name: W3SVC
    state: started
    start_mode: auto
- name: Disable SMBv1 server via registry
  ansible.windows.win_regedit:
    path: HKLM:\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters
    name: SMB1
    data: 0
    type: dword
- name: Reboot only if the feature install demands it
  ansible.windows.win_reboot:
  when: iis_feature.reboot_required
Two idempotency disciplines worth internalizing. First, guard reboots on the module’s own signal - win_feature returns reboot_required, so reboot conditionally rather than unconditionally. Second, when you are forced into win_command or win_shell, add creates/removes guards or a changed_when expression so a re-run does not falsely report changed:
- name: Run a one-shot installer only once
  ansible.windows.win_command: C:\setup\app-install.exe /quiet
  args:
    creates: C:\Program Files\App\app.exe  # skip if already installed
Run with --check to verify a role is a no-op against already-converged hosts. A correctly authored role reports ok=... with changed=0 on the second pass - that is your idempotency proof.
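The changed=0 proof is easy to check mechanically. The sketch below parses play-recap lines captured from a run. The helper name ConvertFrom-PlayRecap and the sample lines are hypothetical, and it assumes the usual recap layout of a host name, a colon, then ok= and changed= counters.

```powershell
# Hypothetical helper: turn play-recap lines into objects with per-host counts.
function ConvertFrom-PlayRecap {
    param([Parameter(Mandatory)][string[]]$Line)
    foreach ($text in $Line) {
        # Lines that are not recap rows simply do not match and are skipped.
        if ($text -match '^(?<node>\S+)\s*:\s+ok=(?<ok>\d+)\s+changed=(?<changed>\d+)') {
            [pscustomobject]@{
                Node    = $Matches['node']
                Ok      = [int]$Matches['ok']
                Changed = [int]$Matches['changed']
            }
        }
    }
}

# Sample second-pass output (assumption): one host converged, one did not.
$recap = @(
    'PLAY RECAP *********************************************************'
    'web01.corp.example.com     : ok=4    changed=0    unreachable=0    failed=0'
    'web02.corp.example.com     : ok=4    changed=1    unreachable=0    failed=0'
)

$rows = @(ConvertFrom-PlayRecap -Line $recap)
$nonIdempotent = @($rows | Where-Object { $_.Changed -gt 0 })
$nonIdempotent.Node   # web02.corp.example.com
```

Any host listed after a second pass points at a task that needs a guard.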
6. Combining DSC resources inside Ansible workflows
This is the payoff. Ansible’s win_dsc module invokes any installed DSC resource directly - no MOF compilation, no LCM scheduling. You get DSC’s mature resource implementations (especially for niche state like xWebsite bindings or security settings) driven by Ansible’s orchestration, inventory, and check mode. The module maps DSC resource properties to task parameters one-to-one.
- name: Ensure W3SVC is running via the DSC Service resource through Ansible
  ansible.windows.win_dsc:
    resource_name: Service
    Name: W3SVC
    State: Running
    StartupType: Automatic
- name: Configure a registry value via DSC Registry resource
  ansible.windows.win_dsc:
    resource_name: Registry
    Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters
    ValueName: SMB1
    ValueData: '0'
    ValueType: Dword
    Ensure: Present
win_dsc honors check mode by calling the resource’s Test method and reporting what would change, and it surfaces the resource version so you can pin it. The decision rule I use: reach for win_dsc when a high-quality DSC resource already encapsulates complex logic you would otherwise hand-roll in win_shell; use native win_ modules for the common cases because they are faster and need no DSC module installed on the target.
One caveat: win_dsc requires the resource module present on the target, and the resource must be compatible with the WMF 5.1 DSC engine. Resources that depend on the cross-platform DSC 3.0 runtime will not load here.
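A pre-flight check on the target can catch a missing resource module before a play fails partway through. This is a minimal sketch: Test-ModuleAvailable is a hypothetical helper built on Get-Module -ListAvailable, and the module name and version in the example call are taken from section 2.

```powershell
# Hypothetical pre-flight helper: is a module installed at or above a minimum version?
function Test-ModuleAvailable {
    param(
        [Parameter(Mandatory)][string]$Name,
        [version]$MinimumVersion = '0.0'
    )
    $found = Get-Module -ListAvailable -Name $Name |
        Where-Object { $_.Version -ge $MinimumVersion }
    # Any surviving entry means a suitable version is installed.
    [bool]$found
}

# Run on the target before relying on win_dsc with this module.
Test-ModuleAvailable -Name 'PSDscResources' -MinimumVersion '2.12.0.0'
```

The result only says the module can be discovered on the module path. It does not prove the resource loads under the WMF 5.1 engine, so treat it as a first filter.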
7. Detecting and auto-remediating configuration drift
Two layers of remediation, and you want both.
Layer one - the LCM’s own loop. With ApplyAndAutoCorrect from section 3, every Windows host with an applied MOF self-corrects on its consistency-check timer with no controller involved. This is the safety net that keeps a box compliant even if your Ansible control node is offline for a week. Audit what the LCM did:
# Did anything drift and get corrected on this node?
Get-WinEvent -LogName 'Microsoft-Windows-DSC/Operational' -MaxEvents 50 |
Where-Object Message -match 'not in the desired state|in desired state'
Layer two - Ansible-driven detection and report. Run the converging play on a schedule (cron on the control node, or AWX/AAP). Because every task is idempotent, a scheduled run is drift remediation: anything that drifted gets pulled back, and the run’s change count tells you how much drifted.
# Detect only - check mode plus diff, never mutates
ansible-playbook site.yml --check --diff
# Remediate - drop --check; idempotent tasks correct drift in place
ansible-playbook site.yml --diff
Wire the change count into alerting. A clean nightly run should be changed=0; a non-zero count means something mutated state out-of-band between runs and is worth a ticket. Parse the play recap, or use the ansible.posix.json callback to emit machine-readable results into your observability pipeline.
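If you capture the JSON callback's output to a file, the drift signal can be extracted with a few lines. The sketch below assumes the document has a top-level stats object keyed by host, each with a changed counter. Verify that against a real run, since the sample here is hand-written and Get-DriftedNode is a hypothetical name.

```powershell
# Hypothetical helper: list hosts whose run changed anything.
function Get-DriftedNode {
    param([Parameter(Mandatory)][string]$Json)
    $doc = $Json | ConvertFrom-Json
    # stats is an object with one property per host, so walk its properties.
    foreach ($entry in $doc.stats.PSObject.Properties) {
        if ($entry.Value.changed -gt 0) {
            [pscustomobject]@{
                Node    = $entry.Name
                Changed = [int]$entry.Value.changed
            }
        }
    }
}

# Hand-written sample in the assumed shape of the callback output.
$sample = @'
{
  "stats": {
    "web01.corp.example.com": { "changed": 0, "failures": 0, "ok": 4, "unreachable": 0 },
    "web02.corp.example.com": { "changed": 2, "failures": 0, "ok": 4, "unreachable": 0 }
  }
}
'@

$drifted = @(Get-DriftedNode -Json $sample)
$drifted | Format-Table -AutoSize
```

An empty result is the clean nightly run. Anything else can be forwarded as an alert payload, with the per-host count indicating how much state moved.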
8. Testing with Pester and integrating into CI
Never ship a DSC configuration or Ansible role that has not been compiled and structurally validated in CI. Pester is the PowerShell test framework; use it to assert that a configuration compiles to a MOF and that the MOF contains the resources you expect. Pester v5 syntax:
# WebBaseline.Tests.ps1
Describe 'WebBaseline configuration' {
    BeforeAll {
        . $PSScriptRoot\WebBaseline.ps1
        $out = Join-Path $TestDrive 'mof'
        WebBaseline -NodeName 'TEST01' -OutputPath $out
    }
    It 'compiles a MOF for the node' {
        Join-Path $TestDrive 'mof\TEST01.mof' | Should -Exist
    }
    It 'declares the IIS WindowsFeature resource' {
        # Match the ResourceID rather than a CIM class name: the class name
        # depends on which module supplies WindowsFeature.
        Get-Content (Join-Path $TestDrive 'mof\TEST01.mof') -Raw |
            Should -Match '\[WindowsFeature\]IIS'
    }
}
Run it and fail the build on any red:
$result = Invoke-Pester -Path .\tests -PassThru
if ($result.FailedCount -gt 0) { throw "Pester failed: $($result.FailedCount)" }
For the Ansible side, lint and syntax-check on every push, and lean on --check against a throwaway test host in a later stage:
ansible-lint roles/ playbooks/
ansible-playbook site.yml --syntax-check
A minimal GitHub Actions pipeline ties both together. The PowerShell job runs on a Windows runner (DSC compilation needs WMF 5.1); the lint job runs on Linux:
name: config-mgmt-ci
on: [push, pull_request]
jobs:
  dsc-pester:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install DSC resources
        shell: powershell
        run: Install-Module PSDscResources -Force -Scope CurrentUser
      - name: Run Pester
        shell: powershell
        run: |
          $r = Invoke-Pester -Path .\tests -PassThru
          if ($r.FailedCount -gt 0) { exit 1 }
  ansible-lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install ansible-lint
      - run: ansible-lint roles/ playbooks/
Verify
Walk this checklist on a fresh target to confirm the whole chain works end to end:
# 1. LCM is in auto-correct mode at a sane frequency
(Get-DscLocalConfigurationManager).ConfigurationMode # ApplyAndAutoCorrect
# 2. The node is currently in desired state
Test-DscConfiguration -Detailed # InDesiredState : True
# 3. Force drift, then re-apply the stored MOF on demand instead of waiting for the timer
Stop-Service W3SVC
Start-DscConfiguration -UseExisting -Wait -Verbose # re-applies stored MOF
(Get-Service W3SVC).Status # Running
# 4. Ansible reachability and idempotency
ansible web -m ansible.windows.win_ping # pong
ansible-playbook site.yml # first run: changed>0
ansible-playbook site.yml --check --diff # second run: changed=0
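If step 3 leaves the service stopped, either no MOF is stored on the node or the stored MOF does not manage W3SVC. Start-DscConfiguration -UseExisting re-applies the stored configuration regardless of ConfigurationMode, so step 1 is what confirms the LCM will auto-correct on its own timer. If step 4’s second run reports changed, a task is non-idempotent - find it with --diff and add proper guards.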
Enterprise scenario
A retail platform team ran roughly 400 Windows IIS/middleware hosts across three regions, configured by a single Ansible control node in their primary datacenter. The constraint surfaced during a regional network partition: when the control node was unreachable for eleven hours, application teams pushed manual registry “hotfixes” directly on production boxes to work around an incident. Nothing pulled those changes back, because Ansible only converges when it runs - and it could not run. Two days later, a subset of hosts silently diverged from the security baseline (SMBv1 quietly re-enabled by one of the manual edits), and it went unnoticed until an audit scan flagged it.
The fix was to stop treating Ansible as the only enforcement plane and let DSC’s LCM be the always-on backstop for the security-critical subset. They compiled a small, high-value MOF (SMBv1 disabled, TLS registry settings, a handful of services) and set every host’s LCM to ApplyAndAutoCorrect at 30 minutes. Ansible still owned application deployment and the broader config surface, but the non-negotiable security state now self-healed independent of the control node. The pairing was deliberate: Ansible for orchestration and breadth, the LCM for autonomous drift correction on the settings that must never drift.
[DSCLocalConfigurationManager()]
Configuration SecurityBackstop {
    Node localhost {
        Settings {
            RefreshMode                    = 'Push'
            ConfigurationMode              = 'ApplyAndAutoCorrect'
            ConfigurationModeFrequencyMins = 30
        }
    }
}
SecurityBackstop -OutputPath .\Backstop
Set-DscLocalConfigurationManager -Path .\Backstop
After rollout, the next partition was a non-event: hosts kept correcting the security MOF every half hour regardless of control-node reachability, and the audit scan stayed green. The lesson the team wrote into their runbook: an agentless tool cannot remediate when it cannot reach the host, so anything that must stay converged needs an on-box agent behind it.