Mastering Entra ID Tokens: App Roles, Group Claims, and the OAuth2 On-Behalf-Of Flow for APIs

Every Entra ID authorization bug I have ever debugged traced back to one of two things: someone trusted a claim that wasn’t in the token, or they trusted a token that was never meant for their API. This is the model I use for a real multi-tier app — a SPA or web frontend calling an API, and that API calling Microsoft Graph or another downstream API on the user’s behalf. We will design authorization with app roles and group claims, wire up the on-behalf-of (OBO) flow, and validate tokens the way they actually need to be validated.

1. Anatomy of access and ID tokens

Two token types, two purposes. The ID token is proof of authentication for the client — never send it to an API. The access token is the credential an API consumes. Your API must validate the access token and nothing else.

A v2.0 access token issued by Entra ID carries claims like these:

{
  "aud": "api://kv-orders-api",
  "iss": "https://login.microsoftonline.com/<tenant-id>/v2.0",
  "azp": "11111111-2222-3333-4444-555555555555",
  "scp": "Orders.Read Orders.Write",
  "roles": ["Orders.Admin"],
  "oid": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "sub": "subject-pairwise-id",
  "tid": "<tenant-id>",
  "ver": "2.0",
  "exp": 1717329600
}

The claims that drive authorization decisions:

Claim | Meaning | Authorization use
aud | Audience: who the token is for | Must equal your API’s identifier
iss | Issuer (token-issuing authority) | Must match your tenant’s issuer
scp | Delegated scopes (space-delimited) | Present in delegated (user-present) tokens
roles | App roles granted | Present for app roles (user or app)
oid | Immutable object ID of the principal | Stable user/app identity key
azp | Authorized party (client app ID) | Which client requested the token
idtyp | app when present, marks an app-only token | Distinguish app vs user calls

The single most important rule: scp means a user is present and has delegated authority to the client; roles on an app-only token (idtyp: app) means there is no user. Treat them differently. An app-only token carries no scp, and its oid identifies the calling service principal, not a user.
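As a rough sketch of that rule, the helper below classifies a token from claim values that were already extracted from a validated payload. The name token_kind and its output labels are my own illustration, not part of any Entra tooling, and the sketch assumes the idtyp claim is actually emitted for app-only tokens.

```shell
#!/usr/bin/env bash
# Sketch: classify a token from already-extracted claim values.
# token_kind is a hypothetical helper; pass the scp and idtyp claim values,
# using an empty string when a claim is absent.
token_kind() {
  local scp="$1" idtyp="$2"
  if [ "$idtyp" = "app" ]; then
    # App-only: no user is present, so a scp claim here would be contradictory.
    if [ -z "$scp" ]; then echo "app-only"; else echo "reject"; fi
  elif [ -n "$scp" ]; then
    # A user delegated authority to the client: authorize on scp.
    echo "delegated"
  else
    # Neither marker is present: do not guess which model applies.
    echo "reject"
  fi
}

token_kind "Orders.Read Orders.Write" ""   # prints: delegated
token_kind "" "app"                        # prints: app-only
token_kind "" ""                           # prints: reject
```

Rejecting the ambiguous cases keeps the two authorization paths from bleeding into each other; a real API would make the same decision inside its policy code.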

2. Exposing an API: scopes vs app roles, and the right model

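Entra ID gives you two authorization primitives when you expose an API, and they answer different questions. Delegated scopes (scp) describe what a client application may do on behalf of a signed-in user. App roles (roles) describe what the principal itself, whether a user, a group member, or an application, is entitled to do.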

Pick the model deliberately:

Scenario | Use
User-facing API where the user’s own rights matter | Delegated scopes
Coarse role-based access (Admin / Reader) for users | App roles assigned to users/groups
Daemon or service calling with no user | App roles assigned to applications
Need both: a user-context API with role tiers | Scopes for the action + app roles for the tier

Expose a delegated scope with an az rest call against Microsoft Graph v1.0 (the az ad app surface does not cleanly edit api.oauth2PermissionScopes):

APP_OBJECT_ID=$(az ad app list --display-name "kv-orders-api" --query "[0].id" -o tsv)

az rest --method PATCH \
  --uri "https://graph.microsoft.com/v1.0/applications/$APP_OBJECT_ID" \
  --headers "Content-Type=application/json" \
  --body '{
    "identifierUris": ["api://kv-orders-api"],
    "api": {
      "oauth2PermissionScopes": [{
        "id": "0f0f0f0f-1111-2222-3333-444444444444",
        "value": "Orders.Read",
        "type": "User",
        "adminConsentDisplayName": "Read orders",
        "adminConsentDescription": "Allows the app to read orders on behalf of the signed-in user.",
        "userConsentDisplayName": "Read your orders",
        "userConsentDescription": "Allows the app to read your orders.",
        "isEnabled": true
      }]
    }
  }'

Generate scope/role id values as fresh GUIDs (uuidgen / [guid]::NewGuid()). They must be unique within the app and stable once issued — assignments reference them by ID.

3. Defining and assigning app roles

Define app roles on the application object. allowedMemberTypes decides where the role can land: User (users and groups), Application (service principals), or both.

az rest --method PATCH \
  --uri "https://graph.microsoft.com/v1.0/applications/$APP_OBJECT_ID" \
  --headers "Content-Type=application/json" \
  --body '{
    "appRoles": [
      {
        "id": "55555555-aaaa-bbbb-cccc-666666666666",
        "displayName": "Orders Administrator",
        "value": "Orders.Admin",
        "description": "Full administrative access to orders.",
        "allowedMemberTypes": ["User"],
        "isEnabled": true
      },
      {
        "id": "77777777-aaaa-bbbb-cccc-888888888888",
        "displayName": "Orders Processor (daemon)",
        "value": "Orders.Process",
        "description": "Allows a service to process orders without a user.",
        "allowedMemberTypes": ["Application"],
        "isEnabled": true
      }
    ]
  }'

The value is what shows up in the roles claim — keep it stable, because your API authorizes against it.

Assigning a role to a user or group

App role assignments attach to the service principal of the resource API, not the application object. Find the resource SP, then create the assignment.

# Object ID of the API's service principal (the resource)
RESOURCE_SP=$(az ad sp list --display-name "kv-orders-api" --query "[0].id" -o tsv)
ROLE_ID="55555555-aaaa-bbbb-cccc-666666666666"   # Orders.Admin

# principalId can be a user objectId or a group objectId
PRINCIPAL_ID=$(az ad user show --id "alice@contoso.com" --query id -o tsv)

az rest --method POST \
  --uri "https://graph.microsoft.com/v1.0/servicePrincipals/$RESOURCE_SP/appRoleAssignedTo" \
  --headers "Content-Type=application/json" \
  --body "{
    \"principalId\": \"$PRINCIPAL_ID\",
    \"resourceId\": \"$RESOURCE_SP\",
    \"appRoleId\": \"$ROLE_ID\"
  }"

When you assign a role to a group, only the group’s direct members receive it. Entra does not flatten nested membership for app role assignments, so nested groups do not transitively grant app roles. For a daemon, set principalId to the client app’s service principal object ID and use an Application role.

Set the enterprise application (the service principal, not the app registration) to Assignment required (appRoleAssignmentRequired: true on the SP) when you want app roles to be the gate. Otherwise unassigned users still authenticate and simply receive no roles claim.

4. Group claims without token bloat

You can emit the user’s group memberships directly in the token via groupMembershipClaims. This is convenient and a footgun at scale.

az rest --method PATCH \
  --uri "https://graph.microsoft.com/v1.0/applications/$APP_OBJECT_ID" \
  --headers "Content-Type=application/json" \
  --body '{ "groupMembershipClaims": "SecurityGroup" }'

The problem: tokens are size-limited. When a user belongs to more groups than the cap (150 for SAML tokens, 200 for JWTs), Entra ID omits the groups claim and emits an overage indicator instead, a _claim_names / _claim_sources pair pointing at a Graph URL:

{
  "_claim_names": { "groups": "src1" },
  "_claim_sources": {
    "src1": {
      "endpoint": "https://graph.microsoft.com/v1.0/users/<oid>/getMemberObjects"
    }
  }
}

Your API must detect overage and call Graph to retrieve the full set, or your authorization silently breaks for power users. The robust patterns, in order of preference:

  1. Prefer app roles over raw group claims. A roles claim does not overflow the same way and is the right abstraction for authorization.
  2. If you need groups, scope them: configure the token to emit only groups assigned to the application (groupMembershipClaims: "ApplicationGroup") so you never carry irrelevant memberships.
  3. Handle the overage path explicitly: call getMemberObjects with securityEnabledOnly: true, or page through transitiveMemberOf.

// Sketch: resolve groups, honoring overage.
// Assumes graph is a GraphServiceClient and oid is the caller's oid claim.
IReadOnlyList<string> groups;
if (principal.HasClaim(c => c.Type == "_claim_names"))
{
    // Overage: the groups claim was omitted, so ask Graph for the full set
    var ids = await graph.Users[oid]
        .GetMemberObjects
        .PostAsGetMemberObjectsPostResponseAsync(
            new() { SecurityEnabledOnly = true });
    groups = ids?.Value ?? new List<string>();
}
else
{
    groups = principal.FindAll("groups").Select(c => c.Value).ToList();
}

5. The on-behalf-of flow to call a downstream API

OBO solves the multi-hop problem: a user calls your API with access token A (audience = your API); your API needs to call Microsoft Graph or another downstream API as that user. You exchange token A for a new token B (audience = Graph) at the token endpoint.

The wire request — note grant_type=jwt-bearer and requested_token_use=on_behalf_of:

curl -X POST \
  "https://login.microsoftonline.com/$TENANT_ID/oauth2/v2.0/token" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer" \
  -d "client_id=$API_CLIENT_ID" \
  -d "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \
  -d "client_assertion=$SIGNED_CLIENT_ASSERTION" \
  -d "assertion=$INCOMING_USER_ACCESS_TOKEN" \
  -d "scope=https://graph.microsoft.com/User.Read" \
  -d "requested_token_use=on_behalf_of"

In production, use MSAL rather than hand-rolling this. MSAL caches the OBO result keyed by the incoming token, so you are not hitting the token endpoint on every request:

var result = await app.AcquireTokenOnBehalfOf(
        new[] { "https://graph.microsoft.com/.default" },
        new UserAssertion(incomingAccessToken))
    .ExecuteAsync();
// result.AccessToken now has aud = Microsoft Graph

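Three preconditions cause most OBO failures: the incoming assertion must be a delegated user token whose aud is the middle-tier API itself; the middle tier must already have consent for the downstream permission (otherwise AADSTS65001); and Conditional Access or MFA on the downstream resource can demand interaction the API cannot perform, which surfaces as AADSTS50076 and has to be passed back to the client.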

Use client_assertion (a certificate or federated credential) instead of a client secret. OBO with workload identity federation removes the long-lived secret on the middle tier entirely.

6. Validating tokens correctly

This is where APIs get compromised. A correct validator checks all of:

  1. Signature against the tenant’s published JWKS.
  2. Issuer (iss) matches your tenant’s expected issuer string exactly (mind v1.0 sts.windows.net vs v2.0 login.microsoftonline.com/.../v2.0).
  3. Audience (aud) equals your API’s app ID URI or client ID — not 00000003-0000-0000-c000-000000000000 (that is Graph; a token for Graph is not for you).
  4. Expiry / not-before (exp, nbf) with small clock skew.
  5. Authorization: a required scp value for delegated calls, or a required roles value for app-only calls.
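For smoke tests, checks 2 through 4 can be sketched in shell against claim values you have already decoded. This is only a sketch under stated assumptions: claims_ok, EXPECTED_ISS, ALLOWED_AUD, and the placeholder GUIDs are hypothetical, and the function does not verify the signature, so it is no substitute for a JWT library in the API itself.

```shell
#!/usr/bin/env bash
# Sketch: issuer, audience, and time-window checks on already-extracted claims.
# It does NOT verify the signature. All names and IDs below are hypothetical.
EXPECTED_ISS="https://login.microsoftonline.com/00000000-0000-0000-0000-000000000000/v2.0"
ALLOWED_AUD="api://kv-orders-api 11111111-aaaa-bbbb-cccc-222222222222"  # app ID URI and client ID
CLOCK_SKEW=300   # seconds of tolerated clock drift

claims_ok() {
  local iss="$1" aud="$2" nbf="$3" exp="$4" now="${5:-$(date +%s)}"
  # Issuer must match exactly; prefix or substring matches are not enough.
  [ "$iss" = "$EXPECTED_ISS" ] || { echo "bad issuer"; return 1; }
  # Audience must be one of this API's own identifiers.
  local a match=1
  for a in $ALLOWED_AUD; do
    [ "$aud" = "$a" ] && match=0
  done
  [ "$match" -eq 0 ] || { echo "bad audience"; return 1; }
  # Time window, widened by the clock skew on both ends.
  [ "$now" -ge $((nbf - CLOCK_SKEW)) ] || { echo "not yet valid"; return 1; }
  [ "$now" -lt $((exp + CLOCK_SKEW)) ] || { echo "expired"; return 1; }
  echo "ok"
}

# A token for this API inside its validity window
claims_ok "$EXPECTED_ISS" "api://kv-orders-api" 1717326000 1717329600 1717327000   # prints: ok
# A token minted for Microsoft Graph is rejected on audience
claims_ok "$EXPECTED_ISS" "00000003-0000-0000-c000-000000000000" \
  1717326000 1717329600 1717327000 || true                                         # prints: bad audience
```

Passing the current time as an optional fifth argument keeps the function deterministic in tests, which makes the wrong-audience and expired-token cases cheap to cover.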

The signing keys rotate, and you must not pin a key or cache JWKS forever. Pull keys from OIDC discovery and let the library refresh on rollover. In ASP.NET Core, Microsoft.Identity.Web handles discovery, JWKS caching, and kid-based key selection for you:

builder.Services
    .AddAuthentication(JwtBearerDefaults.AuthenticationScheme)
    .AddMicrosoftIdentityWebApi(builder.Configuration.GetSection("AzureAd"));

// appsettings.json -> "AzureAd": {
//   "Instance": "https://login.microsoftonline.com/",
//   "TenantId": "<tenant-id>",
//   "ClientId": "<api-client-id>",   // the application (client) ID GUID, not the app ID URI
//   "Audience": "api://kv-orders-api"
// }

For a manual JwtBearer setup, never hardcode IssuerSigningKeys. Let the handler fetch the OIDC metadata document (/.well-known/openid-configuration under your tenant authority); it discovers the jwks_uri, caches keys, and re-fetches when an unknown kid appears — which is exactly what handles key rollover gracefully.

Enforce authorization with policies, not ad hoc string checks:

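builder.Services.AddAuthorization(o =>
{
    o.AddPolicy("OrdersAdmin", p => p.RequireRole("Orders.Admin"));
    o.AddPolicy("CanReadOrders", p =>
        p.RequireAssertion(ctx =>
            // scp is a single space-delimited claim, so split it before comparing.
            // Assumes inbound claim mapping leaves the claim type as "scp".
            ctx.User.FindAll("scp")
                .SelectMany(c => c.Value.Split(' ', StringSplitOptions.RemoveEmptyEntries))
                .Contains("Orders.Read") ||
            ctx.User.IsInRole("Orders.Admin")));
});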

RequireRole reads the roles claim. scp is space-delimited in one claim, so a substring check is wrong — RequireScope from Microsoft.Identity.Web parses it correctly. Roll your own only if you split on spaces.
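The split-on-spaces rule is easy to sketch in shell for use in test scripts. The helper name has_scope is hypothetical; it takes the raw scp claim value and the scope you require, and compares whole tokens only.

```shell
#!/usr/bin/env bash
# Sketch: exact matching against a space-delimited scp claim.
# has_scope is a hypothetical helper: has_scope "<scp claim value>" "<required scope>"
has_scope() {
  local scp="$1" required="$2" s found=1
  set -f                       # disable globbing so a stray '*' is never expanded
  for s in $scp; do            # unquoted on purpose: split the claim on whitespace
    [ "$s" = "$required" ] && found=0
  done
  set +f
  return "$found"
}

# A substring test would wrongly grant the first case; whole-token comparison does not.
has_scope "Orders.ReadBasic Orders.Write" "Orders.Read" && echo "granted" || echo "denied"   # prints: denied
has_scope "Orders.Read Orders.Write" "Orders.Read" && echo "granted" || echo "denied"        # prints: granted
```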

7. Customizing tokens: optional claims and claims mapping

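Two distinct mechanisms, often confused: optional claims are declared on the app registration and add claims from a predefined set to the token, while a claims-mapping policy is assigned to the service principal and controls which directory attributes are emitted and how they are transformed.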

Add an optional email claim to the access token:

az rest --method PATCH \
  --uri "https://graph.microsoft.com/v1.0/applications/$APP_OBJECT_ID" \
  --headers "Content-Type=application/json" \
  --body '{
    "optionalClaims": {
      "accessToken": [{ "name": "email" }],
      "idToken": [{ "name": "email" }]
    }
  }'

Optional claims are best-effort: if the source attribute is empty for a user, the claim simply will not appear. Never make a claim’s presence a security gate unless you have verified it is always populated.

Verify

Pull a real token and decode it. For a quick delegated check, the Azure CLI can mint a token against your API’s scope:

TOKEN=$(az account get-access-token \
  --scope "api://kv-orders-api/Orders.Read" \
  --query accessToken -o tsv)

# Decode the payload locally (JWT segments are unpadded base64url, so restore '=' padding first)
echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' \
  | awk '{ while (length($0) % 4) $0 = $0 "="; print }' | base64 -d | jq .

Confirm in the decoded payload:

PAYLOAD=$(echo "$TOKEN" | cut -d. -f2 | tr '_-' '/+' \
  | awk '{ while (length($0) % 4) $0 = $0 "="; print }' | base64 -d)
echo "$PAYLOAD" | jq '{aud, iss, ver, scp, roles, azp, idtyp}'

You want aud = your API, iss ending in /v2.0, and either scp or roles populated as designed. List the app role assignments that should produce those roles:

az rest --method GET \
  --uri "https://graph.microsoft.com/v1.0/servicePrincipals/$RESOURCE_SP/appRoleAssignedTo" \
  --query "value[].{principal:principalDisplayName, role:appRoleId}" -o table

Then make a real call to the protected endpoint and assert a 401 without a token, 403 with a token lacking the role, and 200 with the right role:

curl -s -o /dev/null -w "%{http_code}\n" https://api.contoso.com/orders            # 401
curl -s -o /dev/null -w "%{http_code}\n" -H "Authorization: Bearer $WRONG" \
  https://api.contoso.com/orders                                                   # 403
curl -s -o /dev/null -w "%{http_code}\n" -H "Authorization: Bearer $TOKEN" \
  https://api.contoso.com/orders                                                   # 200

8. Debugging authorization failures

When a call fails, decode first and correlate second. The fastest triage path:

  1. Decode the token (the jq snippet above, or jwt.ms). Wrong aud is the number-one cause — the client requested a token for the wrong resource.
  2. Check scp vs roles. A 403 on a delegated endpoint with an empty scp means the client never requested your scope. A daemon getting 403 with empty roles means no Application app role was assigned (and consented).
  3. Correlate with sign-in logs. Every interactive and non-interactive token request lands in Entra sign-in logs with an error code. Pull them by app:
az rest --method GET \
  --uri "https://graph.microsoft.com/v1.0/auditLogs/signIns?\$filter=appId eq '$API_CLIENT_ID'&\$top=20" \
  --query "value[].{time:createdDateTime, user:userPrincipalName, code:status.errorCode, reason:status.failureReason}" \
  -o table

Map the common codes:

Code | Meaning | Fix
AADSTS65001 | Consent not granted for the requested permission | Grant admin consent for the downstream/API permission
AADSTS70011 | Invalid/unknown scope value | Scope string typo or wrong resource identifier
AADSTS500011 | Resource principal not found in tenant | API’s SP not provisioned in the tenant
AADSTS50105 | User not assigned a required app role | Create the appRoleAssignedTo assignment
AADSTS50076 | Interaction (MFA/CA) required for downstream | Surface to client to re-auth interactively

Sign-in logs only show what reached Entra. A 403 your API returned will not appear there — that is your code’s policy decision, so check your authorization logic and the decoded roles/scp, not the directory.

Enterprise scenario

A payments platform team shipped an Orders API that called Microsoft Graph via OBO to read the caller’s group memberships for fine-grained authorization. It passed every test in dev and broke in production for exactly the regional managers — the people with the broadest access. Root cause: those users sat in 300+ security groups (entitlement-management churn), so Entra dropped the groups claim and emitted an overage _claim_names/_claim_sources pointer. The validator treated “no groups claim” as “no groups” and returned 403. Worse, the OBO call they added to resolve overage requested GroupMember.Read.All — an admin-consent-only scope the security team refused to grant tenant-wide.

The fix had two parts. First, they stopped emitting raw group claims and switched to app roles plus groupMembershipClaims: "ApplicationGroup", so only groups assigned to the app land in the token and the role tier never overflows. Second, for the residual groups they still needed, they handled overage by calling transitiveMemberOf with the user’s own OBO token under the already-consented User.Read — filtering server-side instead of demanding GroupMember.Read.All.

var policy = new AuthorizationPolicyBuilder()
    .RequireAuthenticatedUser()
    .AddRequirements(new RolesOrOverageGroupsRequirement("Orders.RegionalApprover"))
    .Build();

if (principal.HasClaim(c => c.Type == "_claim_names"))
{
    var groups = await graph.Me.TransitiveMemberOf.GraphGroup
        .GetAsync(r => r.QueryParameters.Select = new[] { "id" });
    // resolve required group from groups, not from the absent claim
}

The lesson: never treat an absent groups claim as “this user has no groups” without checking for the overage indicator first.

Pitfalls and next steps

The traps that bite teams repeatedly: trusting a groups claim that silently went into overage; substring-matching scp and granting Orders.Read to a token that only had Orders.ReadBasic; accepting a token whose aud is Microsoft Graph because the validator never pinned the audience; and pinning a signing key that breaks on the next rollover. Each is a one-line fix and a real incident if you miss it.

From here, push the middle tier to federated credentials so OBO carries no stored secret, move revocation into Continuous Access Evaluation so killed sessions die in near real time instead of at token expiry, and add wrong-audience and expired-token cases to your test suite — the cheapest security tests you will ever write.
