
Secretless CI/CD: Workload Identity Federation for GitHub Actions and AKS

The single most common credential in a breach post-mortem is a service principal client secret that lived in a CI variable, never rotated, and granted Contributor at the subscription scope. Workload identity federation removes the secret entirely: your pipeline and your pods present a short-lived OIDC token that Entra ID validates against a trust you declared in advance. Nothing to store, nothing to rotate, nothing to leak. This walkthrough takes both GitHub Actions and AKS from secret-based auth to fully federated.

1. Why the client secret is the weak link

A traditional pipeline authenticates with three values - ARM_CLIENT_ID, ARM_TENANT_ID, and ARM_CLIENT_SECRET. The first two are not sensitive. The third is a bearer credential: anyone who reads it can impersonate the service principal from anywhere on the internet until it expires, which is usually one or two years away. It sits in a secrets store you hope is locked down, gets copied into new repositories and pipelines, and shows up in printenv debugging that someone forgot to remove.

Federation flips the model. Instead of the workload proving “I know the secret,” it proves “I am running where you said I would run.” The proof is an OIDC token minted by a trusted issuer - GitHub’s token service, or your AKS cluster’s OIDC issuer - and Entra ID exchanges that token for an Azure access token only if the token’s claims match a federated identity credential (FIC) you registered on the app. No symmetric secret ever exists.

Federation is not a convenience feature. It is the difference between a stolen credential being useful for two years versus useful for the ten-minute lifetime of a single CI job, only from the exact branch you authorized.

2. How a federated identity credential works

A FIC is a small object you attach to an Entra application (or user-assigned managed identity) that says: trust tokens from this issuer, with this subject, for this audience. Three claims do all the work:

| Claim | Meaning | Example |
|---|---|---|
| issuer | Who minted the token (the OIDC provider’s URL) | https://token.actions.githubusercontent.com |
| subject | Which specific workload | repo:kloudvin/platform:ref:refs/heads/main |
| audience | Who the token is intended for | api://AzureADTokenExchange |

At runtime the exchange is: the workload requests an OIDC token from its issuer, sends it to Entra’s token endpoint with client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer, and Entra validates the signature against the issuer’s published JWKS, then checks that iss, sub, and aud exactly match a registered FIC. If they do, you get an access token. The match on subject is an exact string match - there are no wildcards in the subject for GitHub credentials, which is precisely what keeps the trust tight.
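The matching rule can be pictured as three case-sensitive string comparisons. The sketch below only illustrates the exact-match behaviour described above; it is not Entra's implementation, and the function name fic_matches and its argument order are made up for this example.

```shell
# Illustrative only: mimic the exact-match rule applied to a token's
# iss, sub and aud claims. fic_matches is a hypothetical helper.
# Usage: fic_matches TOKEN_ISS TOKEN_SUB TOKEN_AUD FIC_ISS FIC_SUB FIC_AUD
fic_matches() {
  [ "$1" = "$4" ] && [ "$2" = "$5" ] && [ "$3" = "$6" ]
}

ISS="https://token.actions.githubusercontent.com"
AUD="api://AzureADTokenExchange"
FIC_SUB="repo:kloudvin/platform:ref:refs/heads/main"

# A push to main matches; a push to any other branch does not.
fic_matches "$ISS" "repo:kloudvin/platform:ref:refs/heads/main" "$AUD" "$ISS" "$FIC_SUB" "$AUD" \
  && echo "main: match"
fic_matches "$ISS" "repo:kloudvin/platform:ref:refs/heads/feature-x" "$AUD" "$ISS" "$FIC_SUB" "$AUD" \
  || echo "feature-x: no match"
```

Because every comparison is a plain equality test, a single differing character in the subject (a different branch, a different case) is enough for the exchange to be refused.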

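You can register up to 20 FICs per application or user-assigned managed identity, so you scope one per branch, environment, or repo as needed rather than reusing a single broad trust. When you approach that limit, split the workloads across several identities instead of loosening a subject.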

3. GitHub Actions to Azure with no secret

First, create (or reuse) an app registration and its service principal, then grant it only the roles it needs at the narrowest scope.

# App registration + service principal
APP_ID=$(az ad app create --display-name "gha-platform-deploy" --query appId -o tsv)
az ad sp create --id "$APP_ID"

TENANT_ID=$(az account show --query tenantId -o tsv)
SUB_ID=$(az account show --query id -o tsv)

# Scope the role to a resource group, not the subscription
az role assignment create \
  --assignee "$APP_ID" \
  --role "Contributor" \
  --scope "/subscriptions/$SUB_ID/resourceGroups/rg-platform-prod"

Now add the federated credential. The subject must match the token GitHub will send. For a push to main, that subject is repo:OWNER/REPO:ref:refs/heads/main.

az ad app federated-credential create \
  --id "$APP_ID" \
  --parameters '{
    "name": "gha-main-branch",
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:kloudvin/platform:ref:refs/heads/main",
    "audiences": ["api://AzureADTokenExchange"]
  }'

In the repository, store the three non-secret values as variables (or secrets if you prefer - they are not sensitive, but secrets keep them out of logs): AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_SUBSCRIPTION_ID. The workflow needs the id-token: write permission so the runner can request an OIDC token, and it uses azure/login with no creds block:

name: deploy
on:
  push:
    branches: [main]

permissions:
  id-token: write   # required to mint the OIDC token
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
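    # No "environment:" key on this job. A job that targets an environment
    # presents the subject repo:OWNER/REPO:environment:NAME, which would not
    # match the branch-scoped credential created above.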
    steps:
      - uses: actions/checkout@v4

      - name: Azure login (OIDC, no secret)
        uses: azure/login@v2
        with:
          client-id: ${{ vars.AZURE_CLIENT_ID }}
          tenant-id: ${{ vars.AZURE_TENANT_ID }}
          subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}

      - name: Prove we are authenticated
        run: az account show -o table

That is the whole change. azure/login@v2 detects the absence of a client secret, requests an ID token from the runner’s token service, and performs the assertion exchange. No AZURE_CREDENTIALS JSON blob, no client secret anywhere.
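The three repository variables the workflow reads can be set from the same shell session that created the app. This is a minimal sketch that assumes the GitHub CLI (gh) is installed and authenticated, and that APP_ID, TENANT_ID and SUB_ID are still set from the commands earlier in this section; set_oidc_variables is a hypothetical wrapper name.

```shell
# Assumption: gh is installed and logged in with rights on the repository.
# Usage: set_oidc_variables OWNER/REPO
set_oidc_variables() {
  gh variable set AZURE_CLIENT_ID       --repo "$1" --body "$APP_ID" &&
  gh variable set AZURE_TENANT_ID       --repo "$1" --body "$TENANT_ID" &&
  gh variable set AZURE_SUBSCRIPTION_ID --repo "$1" --body "$SUB_ID"
}

# Example call (not executed here):
#   set_oidc_variables kloudvin/platform
```

Variables rather than secrets are used here so the values stay readable in the repository settings; none of the three grants access on its own.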

4. Scoping trust precisely with subject claims

The subject is where you turn a broad trust into a surgical one. GitHub composes it from the trigger context, and the exact format matters because Entra does an exact match. The common shapes:

| Trigger | Subject string |
|---|---|
| Push to a branch | repo:OWNER/REPO:ref:refs/heads/main |
| A tag | repo:OWNER/REPO:ref:refs/tags/v1.2.3 |
| A GitHub Environment | repo:OWNER/REPO:environment:production |
| A pull request | repo:OWNER/REPO:pull_request |
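Since a typo in the subject only surfaces as a failed login, it can help to generate the string rather than type it. The helper below is a small sketch of the four shapes listed above; gha_subject and its KIND keywords are names invented for this example.

```shell
# Hypothetical helper: build the subject string for a given trigger.
# Usage: gha_subject OWNER/REPO KIND [VALUE]
#   KIND is one of: branch, tag, environment, pull_request
gha_subject() {
  case "$2" in
    branch)       printf 'repo:%s:ref:refs/heads/%s\n' "$1" "$3" ;;
    tag)          printf 'repo:%s:ref:refs/tags/%s\n' "$1" "$3" ;;
    environment)  printf 'repo:%s:environment:%s\n' "$1" "$3" ;;
    pull_request) printf 'repo:%s:pull_request\n' "$1" ;;
    *)            echo "unknown kind: $2" >&2; return 1 ;;
  esac
}

gha_subject kloudvin/platform branch main
gha_subject kloudvin/platform environment production
```

Note that a job which declares an environment presents the environment form of the subject even when the run was triggered by a branch push, so the credential has to be registered for whichever form the job will actually send.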

Mapping these to identities is a design decision. The pattern I deploy:

# Production deploys: gated by a GitHub Environment with reviewers
az ad app federated-credential create \
  --id "$PROD_APP_ID" \
  --parameters '{
    "name": "gha-env-production",
    "issuer": "https://token.actions.githubusercontent.com",
    "subject": "repo:kloudvin/platform:environment:production",
    "audiences": ["api://AzureADTokenExchange"]
  }'

If you genuinely need to match many subjects (for example, every branch under a prefix), Entra supports a flexible FIC using a claimsMatchingExpression instead of a literal subject. Reach for it deliberately - a wildcard is a wider trust by definition, and the default exact-match subject is the safer baseline.

5. AKS workload identity for your pods

The same federation primitive secures workloads inside the cluster. AKS Workload Identity gives each cluster an OIDC issuer; a Kubernetes service account is annotated with an Entra client ID; and a mutating admission webhook projects a service account token into the pod and sets the environment variables the Azure SDKs read. The pod authenticates to Entra with that projected token - no secret mounted, no node identity shared across every pod.

Enable the two features on the cluster (OIDC issuer and the workload identity webhook):

az aks update \
  --resource-group rg-platform-prod \
  --name aks-platform-prod \
  --enable-oidc-issuer \
  --enable-workload-identity

# Capture the cluster's OIDC issuer URL - you need it for the FIC
ISSUER_URL=$(az aks show \
  --resource-group rg-platform-prod \
  --name aks-platform-prod \
  --query oidcIssuerProfile.issuerUrl -o tsv)

Create a user-assigned managed identity for the workload and federate it against the service account’s subject, which has the fixed form system:serviceaccount:NAMESPACE:SERVICEACCOUNT:

az identity create \
  --resource-group rg-platform-prod \
  --name id-orders-api

UAMI_CLIENT_ID=$(az identity show \
  --resource-group rg-platform-prod \
  --name id-orders-api \
  --query clientId -o tsv)

az identity federated-credential create \
  --name "fic-orders-api" \
  --identity-name id-orders-api \
  --resource-group rg-platform-prod \
  --issuer "$ISSUER_URL" \
  --subject "system:serviceaccount:orders:orders-api" \
  --audience "api://AzureADTokenExchange"

Grant that managed identity its Azure roles (for example, Key Vault Secrets User on a specific vault), then wire it into Kubernetes. The service account carries the client ID annotation, and the pod template must carry the azure.workload.identity/use: "true" label or the webhook will not inject anything:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: orders-api
  namespace: orders
  annotations:
    azure.workload.identity/client-id: "<UAMI_CLIENT_ID>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api
  namespace: orders
spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders-api
  template:
    metadata:
      labels:
        app: orders-api
        azure.workload.identity/use: "true"   # webhook trigger
    spec:
      serviceAccountName: orders-api
      containers:
        - name: orders-api
          image: acrkloudvin.azurecr.io/orders-api:1.4.0

The webhook injects AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_FEDERATED_TOKEN_FILE, and AZURE_AUTHORITY_HOST, and projects the token at the file path. Any Azure SDK using DefaultAzureCredential (or WorkloadIdentityCredential) picks these up automatically - your application code does not change.
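To see what the SDK does with those variables, the exchange can be reproduced by hand. The following is a sketch under the assumption that curl is present in the container image; token_endpoint and exchange_projected_token are hypothetical names, and the scope in the usage comment is only an example.

```shell
# Build the v2.0 token endpoint from the injected authority host and tenant.
token_endpoint() {
  # AZURE_AUTHORITY_HOST usually ends in "/"; strip it so the URL has no "//".
  printf '%s/%s/oauth2/v2.0/token\n' "${1%/}" "$2"
}

# Exchange the projected service account token for an Azure access token.
# Usage: exchange_projected_token SCOPE   (e.g. https://vault.azure.net/.default)
exchange_projected_token() {
  curl -sS -X POST "$(token_endpoint "$AZURE_AUTHORITY_HOST" "$AZURE_TENANT_ID")" \
    --data-urlencode "grant_type=client_credentials" \
    --data-urlencode "client_id=$AZURE_CLIENT_ID" \
    --data-urlencode "scope=$1" \
    --data-urlencode "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \
    --data-urlencode "client_assertion=$(cat "$AZURE_FEDERATED_TOKEN_FILE")"
}
```

The response is JSON containing an access_token on success or an AADSTS error on failure, which makes this a handy way to separate federation problems from application bugs. Treat the output as sensitive: unlike the IDs, the access token is a live credential.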

6. Terraform and Bicep without ARM_CLIENT_SECRET

The Terraform azurerm provider speaks OIDC natively. It does not depend on the az session that azure/login created: with ARM_USE_OIDC=true the provider requests its own ID token from the runner (so the job still needs id-token: write) and reads the standard client/tenant/subscription IDs from the environment. Tell Terraform to use OIDC and drop the secret.

      - name: Terraform apply (OIDC backend + provider)
        env:
          ARM_USE_OIDC: "true"
          ARM_CLIENT_ID: ${{ vars.AZURE_CLIENT_ID }}
          ARM_TENANT_ID: ${{ vars.AZURE_TENANT_ID }}
          ARM_SUBSCRIPTION_ID: ${{ vars.AZURE_SUBSCRIPTION_ID }}
        run: |
          terraform init
          terraform apply -auto-approve

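Set use_oidc = true in both the provider and the backend so that resource management and the remote state in Azure Storage authenticate via federation. With use_azuread_auth = true the backend reads the state blob with Entra ID instead of a storage access key, so the identity also needs a blob data role (for example, Storage Blob Data Contributor) on the state container: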

terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstatekloudvin"
    container_name       = "tfstate"
    key                  = "platform.tfstate"
    use_oidc             = true
    use_azuread_auth     = true
  }
}

provider "azurerm" {
  features {}
  use_oidc = true
}

You can also manage the federated credentials themselves in Terraform, which is how you keep the trust under review:

resource "azuread_application_federated_identity_credential" "gha_main" {
  application_id = azuread_application.deploy.id
  display_name   = "gha-main-branch"
  issuer         = "https://token.actions.githubusercontent.com"
  subject        = "repo:kloudvin/platform:ref:refs/heads/main"
  audiences      = ["api://AzureADTokenExchange"]
}

For Bicep, there is no secret-specific change at all - az deployment group create runs under the already-authenticated az context from azure/login, so a federated job deploys Bicep exactly as it would have with a secret, minus the secret.
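As an illustration, a deploy step might preview and then apply a template like this. It is a sketch that relies on the az session azure/login established earlier in the job; deploy_bicep is a hypothetical wrapper and main.bicep is a placeholder file name.

```shell
# Hypothetical wrapper: preview, then deploy, a Bicep file into a resource group.
# Usage: deploy_bicep RESOURCE_GROUP TEMPLATE_FILE
deploy_bicep() {
  az deployment group what-if --resource-group "$1" --template-file "$2" &&
  az deployment group create  --resource-group "$1" --template-file "$2"
}

# Example call (not executed here):
#   deploy_bicep rg-platform-prod main.bicep
```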

7. Hardening the federated identities

Removing the secret is necessary, not sufficient. Tighten the surface that remains:

Federated credentials do not expire on their own, but the tokens they accept are short-lived by design. The control you must add is a constraint on where the credential can be used and by which subject - that is what Conditional Access and tight subjects give you.

Enterprise scenario

A retail platform team migrated 40+ pipelines to OIDC and hit a wall on one repo: a self-hosted runner pool in a separate AKS cluster started failing every deploy with AADSTS700024: Client assertion is not within its valid time range. The federation config was correct - the same FIC worked from GitHub-hosted runners. The cause was clock skew. Their self-hosted runner nodes had drifted ~6 minutes because the node pool blocked outbound UDP/123 to public NTP, and the OIDC token’s nbf/exp window is only minutes wide. Entra rejected an assertion that was, from its clock, issued in the future.

The fix was two-pronged. First, point the nodes at an internal time source reachable from the locked-down subnet rather than relying on blocked public NTP:

# On the self-hosted runner nodes (chrony)
cat >/etc/chrony/conf.d/internal.conf <<'EOF'
server ntp.corp.internal iburst
makestep 1.0 3
EOF
systemctl restart chrony && chronyc tracking | grep "System time"

Second, they added a CI guard so a future drift fails loud instead of producing a confusing AADSTS error mid-deploy:

      - name: Assert clock sane before Azure login
        run: |
          skew=$(chronyc tracking | awk '/System time/{print $4}')
          awk -v s="$skew" 'BEGIN{ if (s+0 > 2) { print "clock skew "s"s"; exit 1 } }'

The lesson the team wrote into their runbook: with federation, a “credential” failure is often not about trust at all. Subject, audience, and issuer get the blame, but nbf/exp validation makes token-based auth quietly dependent on time sync - something secret-based auth never cared about.
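The time dependence in this scenario comes down to a window check on nbf and exp. The sketch below is illustrative only: the function name, the five-minute lifetime, and the 60-second leeway are assumptions chosen for the example, not values Entra documents.

```shell
# Illustrative check: is an assertion usable at time NOW (epoch seconds)?
# LEEWAY models whatever tolerance the validator allows (assumed, not known).
# Usage: assertion_time_valid NOW NBF EXP [LEEWAY]
assertion_time_valid() {
  local leeway="${4:-0}"
  [ "$1" -ge $(( $2 - leeway )) ] && [ "$1" -le $(( $3 + leeway )) ]
}

issued=1700000000                 # timestamp stamped into the token
nbf=$issued
exp=$(( issued + 300 ))           # a five-minute token, chosen for the example
validator_now=$(( issued - 360 )) # validator's clock is ~6 minutes behind

assertion_time_valid "$validator_now" "$nbf" "$exp" 60 || echo "rejected: not yet valid"
assertion_time_valid "$issued"        "$nbf" "$exp" 60 && echo "accepted once clocks agree"
```

With a window only minutes wide, a drift of six minutes falls outside any plausible leeway, which is why the failure looked like a credential problem while the trust configuration was fine.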

Verify

Confirm each leg of the chain actually works.

# 1. GitHub Actions: the workflow run logs should show a successful
#    "Azure login" step and `az account show` returning your sub.

# 2. AKS: confirm the webhook injected the environment into the pod
kubectl exec -n orders deploy/orders-api -- env | grep AZURE_
# Expect: AZURE_CLIENT_ID, AZURE_TENANT_ID,
#         AZURE_FEDERATED_TOKEN_FILE, AZURE_AUTHORITY_HOST

# 3. AKS: confirm a token is actually projected
kubectl exec -n orders deploy/orders-api -- ls -l /var/run/secrets/azure/tokens/

# 4. List the federated credentials on the app (review the trust)
az ad app federated-credential list --id "$APP_ID" -o table

# 5. Prove no client secrets remain on the app
az ad app credential list --id "$APP_ID" -o table   # expect: empty

For end-to-end proof inside the cluster, run a one-off pod with the annotated service account and call a protected resource (for example, az login --service-principal with --federated-token pointing at the projected token, followed by az keyvault secret show), or let the application’s own health check exercise its first Key Vault read.
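A one-off check along those lines could look like the following. It assumes the Azure CLI is available in the debug pod and that the pod runs under the annotated service account, so the AZURE_* variables are present; the function name and the vault and secret names in the usage comment are placeholders.

```shell
# Log the Azure CLI in with the token the webhook projected into the pod.
# login_with_projected_token is a hypothetical helper name.
login_with_projected_token() {
  az login --service-principal \
    --username "$AZURE_CLIENT_ID" \
    --tenant "$AZURE_TENANT_ID" \
    --federated-token "$(cat "$AZURE_FEDERATED_TOKEN_FILE")" \
    --output none
}

# Usage inside the pod (kv-orders-prod and db-password are hypothetical):
#   login_with_projected_token &&
#     az keyvault secret show --vault-name kv-orders-prod --name db-password --query id -o tsv
```

Querying only the secret's id keeps the value itself out of the terminal while still proving that both the federation and the role assignment work.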

Checklist

Pitfalls and next steps

The failure I see most is a subject mismatch - the workflow runs on a feature branch but the FIC is registered for refs/heads/main, and the exchange fails with AADSTS70021: No matching federated identity record found. Read the error: it echoes the subject Entra received, so paste that exact string into the FIC. The second most common: forgetting id-token: write, which leaves the runner unable to mint a token at all. In AKS, a pod that authenticates but gets 403 usually means the webhook injected correctly but the managed identity lacks the Azure role - federation and authorization are separate steps.
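When the error message is not at hand, the subject can also be read straight from the token. The helper below is a sketch that decodes the claims segment of a JWT without verifying the signature; it assumes a base64 that accepts -d (GNU coreutils, as on Linux runners), and jwt_payload is a hypothetical name. The sample token is built locally purely to demonstrate the decoding.

```shell
# Print the (unverified) claims JSON of a JWT. For debugging only.
jwt_payload() {
  local seg pad
  # Take the middle segment and convert base64url to standard base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the "=" padding that JWTs strip.
  pad=$(( (4 - ${#seg} % 4) % 4 ))
  while [ "$pad" -gt 0 ]; do seg="${seg}="; pad=$((pad - 1)); done
  printf '%s' "$seg" | base64 -d
}

# Demo with a locally built token (header and signature are dummies).
sample_claims='{"sub":"repo:kloudvin/platform:ref:refs/heads/feature-x"}'
sample_token="header.$(printf '%s' "$sample_claims" | base64 | tr -d '=\n' | tr '/+' '_-').signature"
jwt_payload "$sample_token"; echo

# On a runner with id-token: write, a real token can be fetched like this
# (jq is assumed to be installed):
#   curl -sS -H "Authorization: bearer $ACTIONS_ID_TOKEN_REQUEST_TOKEN" \
#     "$ACTIONS_ID_TOKEN_REQUEST_URL&audience=api://AzureADTokenExchange" | jq -r '.value'
```

Print only the decoded claims in logs, never the raw token, since the token itself is usable until it expires.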

To finish the migration, inventory every existing service principal with a password credential (az ad app credential list across your apps, or an Entra sign-in log query for credential-based service principal sign-ins), federate each one, cut its pipeline over, verify, then delete the secret. Once the estate is clean, set an Entra policy or a scheduled report that flags any new application secret so the weak link never grows back.
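The inventory step can be scripted so it doubles as the recurring report. This is a minimal sketch, assuming the signed-in account can read all app registrations in the tenant; both function names are hypothetical, and the JMESPath filter relies on the passwordCredentials array that az ad app list returns for each app.

```shell
# Print "appId<TAB>displayName" for every app registration that still has
# at least one client secret (password credential).
apps_with_secrets() {
  az ad app list --all \
    --query '[?passwordCredentials && length(passwordCredentials) > `0`].[appId, displayName]' \
    -o tsv
}

# Count them, so a scheduled job can alert when the number is not zero.
count_apps_with_secrets() {
  apps_with_secrets | grep -c . || true
}

# Example use in a scheduled job (not executed here):
#   [ "$(count_apps_with_secrets)" -eq 0 ] || { apps_with_secrets; exit 1; }
```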
