
Sentinel Detection-as-Code: Content Hub, Repositories, and CI/CD Pipelines

A SOC that clicks “Create analytics rule” in the portal is building a detection estate it cannot reason about. Six months in, nobody can tell you which rules changed, who tuned the threshold on the impossible-travel detection, or whether the rule running in production matches the one in your DR workspace. There is no diff, no peer review, no rollback. When an analyst “fixes” a noisy rule at 2 a.m. by widening a filter, that change is invisible until it misses the next real intrusion.

Detection-as-code fixes this the same way infrastructure-as-code fixed configuration drift: the Git repository becomes the source of truth, every change is a reviewed pull request, and a pipeline deploys identical content to every workspace. This guide builds that system on Microsoft Sentinel - repository structure, KQL validation, a CI/CD pipeline, and multi-workspace and MSSP fan-out - and is opinionated about where Microsoft’s built-in repositories connector ends and where you need your own pipeline.

1. Why detection-as-code, concretely

Three failure modes drive the decision, and naming them keeps the design honest:

| Problem | Portal-managed reality | Detection-as-code |
| --- | --- | --- |
| Drift | Dev, staging, prod rules diverge silently | One repo deploys identical content everywhere |
| No peer review | A tuning change ships with zero eyes on it | Every KQL change is a reviewed PR with a diff |
| Not reproducible | Rebuilding a workspace means re-clicking | git clone + pipeline rebuilds the SOC content |

The unit of work shifts from “a rule in a workspace” to “a content file in a branch.” That reframing is the whole point. A detection is now testable, reviewable, and versioned, and the workspace is a deployment target, not a database you edit by hand.

Detection-as-code does not replace the analyst. It replaces the undocumented, unreviewed mutation. The analyst still writes the KQL - they just do it in a branch, with a colleague reviewing the logic before it touches production.

2. Structure the content repository

Lay the repo out so a human and a pipeline can both navigate it. Separate content by type (because deployment tooling and validation differ per type) and keep environment-specific values in parameter files, never in the rule body.

sentinel-content/
  analytics-rules/
    impossible-travel.bicep
    impossible-travel.kql                      # query body, loaded via loadTextContent()
    impossible-travel.parameters.json          # default params
    impossible-travel.parameters-<prodWsId>.json
    aadsts-brute-force.bicep
  hunting-queries/
    rare-process-by-host.json
  workbooks/
    identity-overview.json
  parsers/
    asim-auth-custom.kql
  shared/
    rule.bicep                                  # one reusable module
  tests/
    kql-validate.ps1
    schema-validate.ps1
  sentinel-deployment.config                    # repositories-connector config
  .github/workflows/deploy.yml

Author rules as Bicep rather than hand-written ARM JSON. Bicep gives you parameters, type checking, and loadTextContent() so the KQL lives in a separate .kql file that linters and your editor understand - instead of being trapped as an escaped one-line string inside JSON.

A single reusable module keeps every rule consistent. This is the schema Sentinel actually expects for a scheduled rule:

// shared/rule.bicep
@description('Log Analytics / Sentinel workspace name')
param workspaceName string

@description('Stable GUID for this rule - keep it constant across edits')
param ruleId string

param displayName string
param description string = ''
param query string

@allowed(['High', 'Medium', 'Low', 'Informational'])
param severity string

@description('ISO 8601 duration, e.g. PT1H')
param queryFrequency string = 'PT1H'

@description('ISO 8601 duration, e.g. PT1H')
param queryPeriod string = 'PT1H'

@allowed(['GreaterThan', 'LessThan', 'Equal', 'NotEqual'])
param triggerOperator string = 'GreaterThan'

param triggerThreshold int = 0
param enabled bool = true
param tactics array = []
param techniques array = []

resource workspace 'Microsoft.OperationalInsights/workspaces@2023-09-01' existing = {
  name: workspaceName
}

resource rule 'Microsoft.SecurityInsights/alertRules@2025-09-01' = {
  name: ruleId
  scope: workspace
  kind: 'Scheduled'
  properties: {
    displayName: displayName
    description: description
    severity: severity
    enabled: enabled
    query: query
    queryFrequency: queryFrequency
    queryPeriod: queryPeriod
    triggerOperator: triggerOperator
    triggerThreshold: triggerThreshold
    suppressionDuration: 'PT1H'
    suppressionEnabled: false
    tactics: tactics
    techniques: techniques
    incidentConfiguration: {
      createIncident: true
      groupingConfiguration: {
        enabled: true
        reopenClosedIncident: false
        lookbackDuration: 'PT5H'
        matchingMethod: 'AllEntities'
      }
    }
  }
}

The ruleId deserves emphasis. The resource name is the rule’s identity in Azure. Generate a GUID once, store it in the file, and never change it - then edits are idempotent updates, not delete-and-recreate (which loses incident history and resets the rule’s alert lineage). A deleted-and-recreated rule is a new rule as far as your incidents are concerned.
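One way to hold that line in CI is a check that every rule file declares a well-formed GUID and that no two files share one. The following is a minimal sketch in Python, assuming each per-rule Bicep file sets ruleId as a quoted literal the way the example in this guide does. The function name check_rule_ids and its dictionary input are illustrative, not part of any Sentinel tooling.

```python
import re
import uuid
from collections import defaultdict

# Matches a literal assignment such as:  ruleId: 'b2f1c7a4-...'
# (assumption: rule IDs are written inline as single-quoted strings)
RULE_ID_RE = re.compile(r"ruleId\s*:\s*'([^']*)'")


def check_rule_ids(sources: dict[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means every ID is usable.

    `sources` maps a file path to that file's Bicep text (hypothetical input shape).
    """
    problems: list[str] = []
    seen: dict[str, list[str]] = defaultdict(list)

    for path, text in sorted(sources.items()):
        match = RULE_ID_RE.search(text)
        if match is None:
            problems.append(f"{path}: no literal ruleId found")
            continue
        raw = match.group(1)
        try:
            # Normalise case so the same GUID in upper and lower case collides.
            canonical = str(uuid.UUID(raw))
        except ValueError:
            problems.append(f"{path}: ruleId {raw!r} is not a GUID")
            continue
        seen[canonical].append(path)

    for rule_id, paths in sorted(seen.items()):
        if len(paths) > 1:
            problems.append(f"{rule_id}: reused by {', '.join(paths)}")
    return problems
```

A duplicate GUID matters as much as a changed one: two files that share an ID overwrite each other on every deploy, so the check treats reuse as a failure.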

The per-rule file just supplies values and imports the module:

// analytics-rules/impossible-travel.bicep
param workspaceName string

module r '../shared/rule.bicep' = {
  name: 'impossible-travel'
  params: {
    workspaceName: workspaceName
    ruleId: 'b2f1c7a4-9d3e-4a8b-bb21-7e5d4c0a1f93'  // stable GUID
    displayName: 'Impossible travel - successful sign-in'
    description: 'Successful sign-ins from geographically distant locations within an implausible window.'
    severity: 'Medium'
    query: loadTextContent('./impossible-travel.kql')
    queryFrequency: 'PT1H'
    queryPeriod: 'PT6H'
    triggerThreshold: 0
    tactics: ['InitialAccess', 'CredentialAccess']
    techniques: ['T1078']
  }
}

3. Author analytics rules with KQL validation

The KQL lives on its own so it is readable, diffable, and lintable:

// analytics-rules/impossible-travel.kql
let lookback = 6h;
let threshold_kmh = 800.0;   // roughly airliner cruise speed; a higher average between sign-ins is implausible
SigninLogs
| where TimeGenerated > ago(lookback)
| where ResultType == "0"   // ResultType is a string column in SigninLogs
| extend lat = toreal(LocationDetails.geoCoordinates.latitude),
         lon = toreal(LocationDetails.geoCoordinates.longitude)
| where isnotnull(lat) and isnotnull(lon)
| order by UserPrincipalName asc, TimeGenerated asc
| serialize
| extend prevLat = prev(lat), prevLon = prev(lon),
         prevTime = prev(TimeGenerated), prevUser = prev(UserPrincipalName)
| where UserPrincipalName == prevUser
| extend distKm = geo_distance_2points(lon, lat, prevLon, prevLat) / 1000.0
| extend hours = datetime_diff('second', TimeGenerated, prevTime) / 3600.0
| where hours > 0
| extend speedKmh = distKm / hours
| where speedKmh > threshold_kmh
| project TimeGenerated, UserPrincipalName, distKm, speedKmh, IPAddress
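To make the arithmetic in that query concrete, here is a small worked sketch in Python of the same distance-over-time calculation. It is an approximation under stated assumptions: it uses the haversine formula on a spherical Earth, whereas the query relies on geo_distance_2points, so the two can differ slightly. The function names and the sample coordinates are illustrative.

```python
from datetime import datetime, timedelta, timezone
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def implied_speed_kmh(prev: tuple, curr: tuple):
    """prev/curr are (timestamp, lat, lon); None when curr is not after prev."""
    hours = (curr[0] - prev[0]).total_seconds() / 3600.0
    if hours <= 0:
        return None  # mirrors the query's `where hours > 0`
    return haversine_km(prev[1], prev[2], curr[1], curr[2]) / hours


def is_impossible(prev: tuple, curr: tuple, threshold_kmh: float = 800.0):
    """True when the implied speed exceeds the threshold, None if undefined."""
    speed = implied_speed_kmh(prev, curr)
    return None if speed is None else speed > threshold_kmh


if __name__ == "__main__":
    t0 = datetime(2026, 1, 1, 8, 0, tzinfo=timezone.utc)
    london = (t0, 51.5074, -0.1278)
    new_york_2h = (t0 + timedelta(hours=2), 40.7128, -74.0060)
    print(is_impossible(london, new_york_2h))
```

London to New York is roughly 5,570 km, so two sign-ins two hours apart imply a speed well above the threshold, while the same pair twelve hours apart falls below it and would not alert.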

There is no separate “KQL compiler” you can shell out to, so validation is layered. The cheapest, most reliable gate is to submit the query to the workspace as a query job and fail on a parse or semantic error - the Log Analytics query engine is the authoritative validator:

# tests/kql-validate.ps1 - fail the build if any KQL is syntactically invalid
param([string]$WorkspaceId, [string]$RulesPath = './analytics-rules')

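# Reuse the session created by azure/login when there is one; fall back to a
# managed identity (for example on a self-hosted runner) only when there is not
if (-not (Get-AzContext)) { Connect-AzAccount -Identity | Out-Null }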
$failed = @()

Get-ChildItem -Path $RulesPath -Filter '*.kql' -Recurse | ForEach-Object {
    $file = $_   # capture the file now: inside catch, $_ is the error record
    $kql = Get-Content $file.FullName -Raw
    # Append "| take 0" so the engine parses and binds the query but returns no rows
    $probe = "$kql`n| take 0"
    try {
        Invoke-AzOperationalInsightsQuery -WorkspaceId $WorkspaceId -Query $probe -ErrorAction Stop | Out-Null
        Write-Host "PASS  $($file.Name)"
    } catch {
        Write-Host "FAIL  $($file.Name) :: $($_.Exception.Message)"
        $failed += $file.Name
    }
}

if ($failed.Count -gt 0) { throw "KQL validation failed: $($failed -join ', ')" }

Appending | take 0 means the engine parses, binds column references, and resolves functions - catching the real bugs (a renamed column, a typo’d operator, a function that does not exist) - without scanning data or costing query volume. That is the unit test for a detection: does this query bind against the schema it claims to read?

Add a second, schema-level gate that lints the rule structure itself. bicep build already type-checks the Bicep, so wire policy assertions on top - reject any rule missing MITRE tactics, any severity of High without createIncident: true, any queryPeriod shorter than queryFrequency. These are organizational invariants, not Azure’s, so they live in your test script.
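Those three invariants can be sketched as a small policy check. This is a minimal Python example, assuming the rule is available as a dictionary shaped like the properties object of a Scheduled rule (for instance read from the JSON that bicep build emits). The function names are illustrative, and the duration parser covers only the day and time subset of ISO 8601 that the fields in this guide use.

```python
import re

# Day/time subset of ISO 8601 durations, e.g. PT5M, PT1H, P1D, P1DT12H.
_DURATION_RE = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?$")


def duration_seconds(value: str) -> int:
    """Convert a supported ISO 8601 duration to seconds."""
    match = _DURATION_RE.match(value)
    if match is None or value == "P" or value.endswith("T"):
        raise ValueError(f"unsupported duration: {value!r}")
    days, hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def policy_violations(rule: dict) -> list[str]:
    """Return organizational policy violations for one rule's properties."""
    problems: list[str] = []
    name = rule.get("displayName", "<unnamed>")

    if not rule.get("tactics"):
        problems.append(f"{name}: no MITRE tactics")

    create_incident = rule.get("incidentConfiguration", {}).get("createIncident", False)
    if rule.get("severity") == "High" and not create_incident:
        problems.append(f"{name}: severity High requires createIncident: true")

    period = duration_seconds(rule["queryPeriod"])
    frequency = duration_seconds(rule["queryFrequency"])
    if period < frequency:
        problems.append(f"{name}: queryPeriod is shorter than queryFrequency")

    return problems
```

Returning a list rather than raising on the first problem lets the pipeline report every violation in a pull request at once.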

4. Repositories connector versus a custom pipeline

Sentinel ships a first-party repositories feature (Content management -> Repositories) that connects a GitHub or Azure DevOps repo and auto-generates a workflow. Know exactly what it does before you decide.

When you create a connection, Sentinel registers an Entra app called Azure Sentinel Content Deployment App (suffixed with the repository ID), grants it access to the workspace’s resource group, and drops a workflow into your repo. On every push it deploys changed content. Its smart deployments feature tracks a CSV in the .sentinel folder to avoid redeploying files that did not change since the last commit.

| Dimension | Repositories connector | Custom pipeline |
| --- | --- | --- |
| Setup | Minutes, portal-driven | You build it |
| Auth | Auto-created service principal | Your federated credential / OIDC |
| Testing gates | None - it deploys on push | Whatever you wire in |
| Multi-workspace | One connection per workspace | One pipeline, fan-out loop |
| Promotion (dev -> prod) | Not built in | Branch/environment gates |
| Approvals | None | Native (environments / approvals) |

The honest read: the connector is a deployer, not a CI system. It has no test stage and no promotion gates - it pushes whatever is on the branch. Use it for a single workspace where “merge to main = live in prod” is acceptable. For anything with environments, approvals, or pre-deployment validation, build your own pipeline. You can also do both: let the connector own the final deploy step while your pipeline owns testing and promotion, configured through sentinel-deployment.config.

That config file (root of the repo, lowercase keys) controls prioritization, exclusion, and parameter-file mapping for the connector:

{
  "prioritizedcontentfiles": [
    "analytics-rules/impossible-travel.bicep"
  ],
  "excludecontentfiles": [
    "analytics-rules/experimental-draft.bicep",
    "tests"
  ],
  "parameterfilemappings": {
    "11111111-1111-1111-1111-111111111111": {
      "analytics-rules/impossible-travel.bicep": "analytics-rules/impossible-travel.parameters-prod.json"
    }
  }
}

The GUID key is the workspace ID, which is how one repo maps different parameter files to different target workspaces. Note the casing - prioritizedcontentfiles, not camelCase - and forward slashes only; backslashes break the deployment script.
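Because a casing or path-separator slip only surfaces when the connector runs, it can be worth linting the file in CI first. A minimal Python sketch follows, assuming the three keys shown above are the only ones in use (any other key is reported, which may be too strict for your setup). The function name is illustrative.

```python
import json
import uuid

# Assumption: only the keys used in this guide are expected in the file.
KNOWN_KEYS = {"prioritizedcontentfiles", "excludecontentfiles", "parameterfilemappings"}


def lint_deployment_config(text: str) -> list[str]:
    """Return problems found in sentinel-deployment.config content."""
    config = json.loads(text)
    problems: list[str] = []

    for key in config:
        if key not in KNOWN_KEYS:
            hint = " (keys must be lowercase)" if key.lower() in KNOWN_KEYS else ""
            problems.append(f"unknown key {key!r}{hint}")

    # Collect every path mentioned anywhere in the file.
    paths = list(config.get("prioritizedcontentfiles", []))
    paths += config.get("excludecontentfiles", [])
    for workspace_id, mapping in config.get("parameterfilemappings", {}).items():
        try:
            uuid.UUID(workspace_id)
        except ValueError:
            problems.append(f"parameterfilemappings key {workspace_id!r} is not a workspace GUID")
        for template, parameters in mapping.items():
            paths += [template, parameters]

    for path in paths:
        if "\\" in path:
            problems.append(f"backslash in path {path!r}; use forward slashes")
    return problems
```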

5. Build the CI/CD pipeline

Here is a GitHub Actions pipeline that does what the connector will not: lint, validate KQL, then deploy with OIDC (no stored secret). It authenticates via a federated credential, so there is no service-principal secret to rotate or leak.

# .github/workflows/deploy.yml
name: Sentinel Detection-as-Code

on:
  pull_request:
    branches: [ main ]
  push:
    branches: [ main ]

permissions:
  id-token: write      # required for OIDC federated login
  contents: read

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Azure login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
          enable-AzPSSession: true   # lets the azure/powershell step reuse this login

      - name: Bicep build (type + schema check)
        run: |
          for f in analytics-rules/*.bicep; do
            echo "Building $f"
            az bicep build --file "$f"
          done

      - name: KQL syntax validation
        uses: azure/powershell@v2
        with:
          inlineScript: ./tests/kql-validate.ps1 -WorkspaceId '${{ vars.DEV_WORKSPACE_ID }}'
          azPSVersion: latest

  deploy:
    needs: validate
    if: github.ref == 'refs/heads/main' && github.event_name == 'push'
    runs-on: ubuntu-latest
    environment: production        # gate with required reviewers
    steps:
      - uses: actions/checkout@v4

      - name: Azure login (OIDC)
        uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}

      - name: Deploy analytics rules
        run: |
          for f in analytics-rules/*.bicep; do
            az deployment group create \
              --resource-group rg-sec-sentinel-prod \
              --template-file "$f" \
              --parameters workspaceName='law-sentinel-prod' \
              --name "sentinel-$(basename "$f" .bicep)-${{ github.run_id }}"
          done

Two design choices matter. First, validation runs on pull_request so a reviewer sees a green check (or a real failure) before merge - the test gate the connector lacks. Second, environment: production ties the deploy job to GitHub’s environment protection, so a human approves before anything reaches prod, and the approval is logged.

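The federated credential is configured once on the app registration, scoped to this repo and (critically) to the production environment so a fork or an arbitrary branch cannot mint a token. Entra matches the subject claim exactly, so that credential covers only the deploy job. The validate job runs without an environment, which gives it a different subject (repo:contoso/sentinel-content:pull_request on a PR, repo:contoso/sentinel-content:ref:refs/heads/main on a push), so it needs its own federated credentials, ideally on a separate identity that can only query the dev workspace. The production credential looks like this: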

az ad app federated-credential create \
  --id "$APP_OBJECT_ID" \
  --parameters '{
    "name": "github-sentinel-prod",
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:contoso/sentinel-content:environment:production",
    "audiences": ["api://AzureADTokenExchange"]
  }'

6. Multi-workspace and Lighthouse MSSP fan-out

One repo, many workspaces. Drive the fan-out from a manifest rather than hard-coding targets, so onboarding a workspace is a one-line edit:

// workspaces.json
[
  { "rg": "rg-sec-sentinel-prod", "ws": "law-sentinel-prod", "subId": "1111-...-1111" },
  { "rg": "rg-sec-sentinel-eu",   "ws": "law-sentinel-eu",   "subId": "2222-...-2222" }
]

# deploy every rule to every workspace in the manifest
jq -c '.[]' workspaces.json | while read -r ws; do
  rg=$(echo "$ws" | jq -r '.rg')
  name=$(echo "$ws" | jq -r '.ws')
  subId=$(echo "$ws" | jq -r '.subId')
  az account set --subscription "$subId"
  for f in analytics-rules/*.bicep; do
    az deployment group create \
      --resource-group "$rg" --template-file "$f" \
      --parameters workspaceName="$name" \
      --name "sentinel-$(basename "$f" .bicep)-$(date +%s)"
  done
done
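Before running a fan-out for real, it can help to print the full deployment plan so a reviewer sees exactly which rule goes to which workspace. The following Python sketch builds that plan from the same manifest without executing anything. It assumes the manifest is plain JSON (no leading comment line) with the rg, ws, and subId fields shown above, and it passes the subscription through az's global --subscription argument instead of calling az account set. The function name and the run_id parameter are illustrative.

```python
import json
from pathlib import PurePosixPath


def build_plan(manifest_json: str, rule_files: list[str], run_id: str) -> list[list[str]]:
    """Return one az command (as an argument list) per workspace and rule."""
    plan: list[list[str]] = []
    for workspace in json.loads(manifest_json):
        for rule_file in sorted(rule_files):
            stem = PurePosixPath(rule_file).stem  # e.g. "impossible-travel"
            plan.append([
                "az", "deployment", "group", "create",
                "--subscription", workspace["subId"],
                "--resource-group", workspace["rg"],
                "--template-file", rule_file,
                "--parameters", f"workspaceName={workspace['ws']}",
                "--name", f"sentinel-{stem}-{run_id}",
            ])
    return plan


if __name__ == "__main__":
    manifest = json.dumps([
        {"rg": "rg-sec-sentinel-prod", "ws": "law-sentinel-prod", "subId": "sub-a"},
        {"rg": "rg-sec-sentinel-eu", "ws": "law-sentinel-eu", "subId": "sub-b"},
    ])
    for command in build_plan(manifest, ["analytics-rules/impossible-travel.bicep"], "123"):
        print(" ".join(command))
```

Keeping each command as an argument list means it can later be handed to subprocess.run unchanged, with no shell quoting to get wrong.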

For an MSSP, customer workspaces live in customer tenants. Do not store customer logs in your tenant - use Azure Lighthouse delegated access. Once each customer delegates the right scope to your SOC’s Entra group, your deployment principal can target their workspace cross-tenant with no per-tenant credential. The delegation is granted via an ARM offer deployed in the customer subscription:

// lighthouse-offer.bicep - deployed in the CUSTOMER subscription (subscription scope)
targetScope = 'subscription'

param mspTenantId string
param sentinelDeployGroupId string   // your SOC's Entra group object ID

resource assignment 'Microsoft.ManagedServices/registrationAssignments@2022-10-01' = {
  name: guid(subscription().id, sentinelDeployGroupId)
  properties: {
    registrationDefinitionId: definition.id
  }
}

resource definition 'Microsoft.ManagedServices/registrationDefinitions@2022-10-01' = {
  name: guid(subscription().id, 'sentinel-mssp')
  properties: {
    registrationDefinitionName: 'Sentinel content deployment'
    managedByTenantId: mspTenantId
    authorizations: [
      {
        principalId: sentinelDeployGroupId
        // Microsoft Sentinel Contributor
        roleDefinitionId: 'ab8e14d6-4a74-4a29-9ba8-549422addade'
      }
    ]
  }
}

Grant the least privilege that still works: Microsoft Sentinel Contributor lets the principal manage analytics rules without broad subscription rights. After delegation, the same fan-out loop deploys to delegated workspaces - the customer’s subscription ID goes in the manifest and Lighthouse handles the cross-tenant authorization transparently. No customer secret ever lands in your pipeline.

7. Content Hub solutions and safe upgrades

Most production Sentinel content does not start as your custom code - it starts as a Content Hub solution (Microsoft Defender XDR, Threat Intelligence, a vendor connector). These are packaged, versioned ARM templates installed from the gallery, and they cut across detection-as-code in a way that bites teams who ignore it.

The trap: you install a solution, then tune one of its rules in the portal. Later the solution ships v2.1, you upgrade and update the rule to the new template version, and your tuning is silently overwritten - the update reapplies the template’s version of that rule. Treat Content Hub solutions as an upstream dependency, exactly like a third-party library:

// solutions.json - pin your upstream content versions
{
  "Microsoft Entra ID": "3.0.7",
  "Threat Intelligence": "3.0.3",
  "Microsoft Defender XDR": "3.1.2"
}

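This is the same discipline as pinning a package lockfile: an upgrade happens through a reviewed change to solutions.json, not an ad hoc click. When a solution rule needs tuning, fork it: copy the rule into the repo as your own content with a new stable GUID, and disable the solution’s original. The Content Hub gives you breadth fast; the repo gives you control. Forking the rules you tune is what reconciles the two - your customizations survive every upstream upgrade because they are yours, with identities the solution does not own.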
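A pin file is only useful if something compares it with reality. Here is a minimal Python sketch of that comparison, assuming you can obtain the installed solution versions by some other means and pass them in as a dictionary; how to fetch them is deliberately left out. The function names and the report shape are illustrative.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '3.0.7' into (3, 0, 7) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def compare_pins(pinned: dict[str, str], installed: dict[str, str]) -> dict[str, list[str]]:
    """Classify each pinned solution against what is installed.

    missing: pinned but not installed
    behind:  installed version is older than the pin
    ahead:   installed version is newer than the pin (upgraded outside the repo)
    """
    report: dict[str, list[str]] = {"missing": [], "behind": [], "ahead": []}
    for name, pin in sorted(pinned.items()):
        actual = installed.get(name)
        if actual is None:
            report["missing"].append(name)
        elif parse_version(actual) < parse_version(pin):
            report["behind"].append(name)
        elif parse_version(actual) > parse_version(pin):
            report["ahead"].append(name)
    return report
```

An entry under ahead is the interesting case: it means someone upgraded a solution in the portal without a matching change to the pin file.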

8. Promotion, rollback, and change auditing

Wire a real promotion path. Map environments to branches and let merges be the promotion event:

feature/*  --PR-->  develop  --(auto-deploy)-->  DEV workspace
develop    --PR-->  main     --(approval)----->  PROD workspace(s)
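The branch-to-environment mapping above is small enough to write down as a function, which makes the promotion rules easy to review and test. This is a minimal Python sketch; the environment names dev and production and the function name are assumptions taken from this guide's examples.

```python
def deploy_target(event_name: str, ref: str):
    """Return (environment, needs_approval), or None when nothing should deploy."""
    if event_name != "push":
        return None  # pull requests only run validation
    if ref == "refs/heads/develop":
        return ("dev", False)        # auto-deploy to the DEV workspace
    if ref == "refs/heads/main":
        return ("production", True)  # gated by required reviewers
    return None                      # feature branches never deploy
```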

A merge to develop deploys to the dev workspace automatically; a tuner sees the rule fire on real-but-non-prod telemetry. A reviewed PR into main, gated by the production environment’s required reviewers, promotes it. Tag every production release so each deploy maps to an immutable commit:

git tag -a "release-2026.06.08" -m "Add impossible-travel rule; tune brute-force threshold"
git push origin "release-2026.06.08"

Rollback is git revert plus a re-run of the pipeline. Because rules carry stable GUIDs and Bicep deployment is idempotent, reverting the commit and redeploying restores the previous rule definition exactly - same identity, same incident lineage. There is no “undo” button in the portal that does this; the repo is the undo.

Auditing comes from three layers that together answer “who changed this detection, when, and why”:

| Question | Source of truth |
| --- | --- |
| Who changed the KQL and why? | Git history + PR review thread |
| When did it deploy to prod? | Pipeline run + git tag |
| What did Azure actually apply? | AzureActivity deployment events / resource deployment history |

Enterprise scenario

A global financial-services platform team ran Sentinel for the group SOC plus a managed offering for fourteen subsidiary banks, each in its own tenant for regulatory isolation. They had ~180 analytics rules and were managing them by hand. The breaking point came during an audit: a regulator asked them to prove that a specific anti-fraud detection had been running, unmodified, in a particular subsidiary’s workspace for the prior quarter. They could not. There was no diff, no deploy record, no way to show the rule in tenant 9 matched the approved baseline. Worse, a spot check found three subsidiaries running subtly different versions of that rule because someone had “quickly tuned” each one in the portal months earlier.

The constraint was hard: they could not centralize logs (data-residency law per subsidiary), and they could not put a credential for each customer tenant in a pipeline (the security team vetoed standing cross-tenant secrets outright).

They solved it with detection-as-code over Lighthouse. All 180 rules moved into one repo as Bicep modules with stable GUIDs. Each subsidiary delegated Microsoft Sentinel Contributor on its workspace resource group to a single SOC Entra group via a Lighthouse offer - so zero customer secrets entered the pipeline. The pipeline authenticated to the MSP tenant with OIDC, then fanned out across all fifteen workspaces (group + fourteen subsidiaries) from a manifest, Lighthouse handling each cross-tenant hop. The audit answer became trivial: git log for the rule, the tagged release, and the AzureActivity deployment event in that tenant’s workspace, all reconciling to one approved commit.

The detail that made the regulator happy was a drift-detection job. A scheduled run passed every rule in the repo through az deployment group what-if against every workspace and alerted on any difference - catching the moment any rule diverged from the baseline:

# nightly drift check - flags any rule that no longer matches the repo
az deployment group what-if \
  --resource-group "$rg" \
  --template-file analytics-rules/anti-fraud-velocity.bicep \
  --parameters workspaceName="$ws" \
  --no-pretty-print | tee whatif.json

# any change type other than NoChange/Ignore means the live rule drifted
if jq -e '.changes[] | select(.changeType != "NoChange" and .changeType != "Ignore")' whatif.json >/dev/null; then
  echo "DRIFT DETECTED in $ws" && exit 1
fi
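When the alert needs to say which rule drifted rather than only that something did, the same what-if output can be parsed into a list. A minimal Python sketch follows, assuming the JSON has a top-level changes array whose entries carry changeType (as the jq filter above expects) and a resourceId field; treat the second field name as an assumption to verify against your own output.

```python
import json

# Change types that mean "the live resource already matches the template".
IGNORED = {"NoChange", "Ignore"}


def drifted_resources(whatif_json: str) -> list[tuple[str, str]]:
    """Return (resourceId, changeType) for every change that indicates drift."""
    result = json.loads(whatif_json)
    return [
        (change.get("resourceId", "<unknown>"), change["changeType"])
        for change in result.get("changes", [])
        if change["changeType"] not in IGNORED
    ]
```

An empty list means the workspace matches the repo; anything else can be written into the alert so the on-call analyst knows which rule to inspect.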

Within a quarter, “prove this detection ran unmodified” went from a multi-day forensic exercise to a one-line query, and portal-side tuning effectively stopped because the pipeline made it pointless - any manual change was reverted by the next deploy and flagged by drift detection.

Verify

Confirm the system end to end, not just that the pipeline went green:

  1. Rule exists and matches the repo. In the target workspace, open the analytics rule and confirm its query matches the committed .kql byte for byte.

    az sentinel alert-rule show \
      --resource-group rg-sec-sentinel-prod \
      --workspace-name law-sentinel-prod \
      --rule-id b2f1c7a4-9d3e-4a8b-bb21-7e5d4c0a1f93 \
      --query "{name:displayName, enabled:enabled, freq:queryFrequency, kql:query}"
    
  2. Identity is stable. Edit the rule’s description, push, redeploy, and confirm the rule’s resource ID is unchanged - proving it was updated, not recreated.

  3. KQL gate actually fails. Introduce a deliberate typo (SiginLogs) in a branch and confirm the PR check goes red before merge.

  4. Promotion gate holds. Confirm a merge to main pauses for approval rather than deploying immediately.

  5. Rollback restores exactly. git revert the last change, redeploy, and confirm the rule returns to its prior definition with the same GUID.

  6. Drift is detected. Hand-edit the rule in the portal, run the what-if drift job, and confirm it flags the divergence.

