Most supply-chain writing is about the producer: sign your artifacts, generate provenance, attach an SBOM. That is the easy half. The hard half is being a consumer at scale — pulling thousands of third-party images and packages you did not build, and deciding, fast and defensibly, which ones you are willing to run. An SBOM nobody queries is a JSON file. A scanner that fires on every transitive CVE is a pager that gets muted. A signature you never verify at the boundary is decorative.
This article builds the consumer pipeline end to end: ingest and normalize SBOMs into a queryable inventory, rank vulnerabilities by reachability, suppress non-exploitable findings with VEX, verify provenance before promotion with policy-as-code, and enforce all of it at the Kubernetes admission boundary. It assumes you are the downstream party — the platform team that has to trust other people’s code.
1. The consumer problem: trusting and triaging at scale
When you operate a platform, your dependency graph is mostly other people’s decisions. A single base image drags in hundreds of OS packages; an application image adds language ecosystems on top. Multiply by every team and release and you track tens of thousands of component-version tuples, each a potential CVE.
Three failure modes dominate, and they are all about volume, not capability:
| Failure mode | What it looks like | Root cause |
|---|---|---|
| Alert fatigue | Scanner reports 800 CVEs on one image; team mutes it | No exploitability filter; raw NVD severity treated as priority |
| No inventory | “Are we exposed to the new OpenSSL CVE?” takes two days to answer | SBOMs generated but never ingested or queried |
| Trust by tag | Cluster runs whatever :latest resolves to | No provenance check; mutable references; no admission gate |
The fix is a pipeline with four stages: inventory (know what you run), triage (rank what actually matters), verify (trust before promote), enforce (block at the boundary). Everything below is one of those four.
Producing signatures and provenance is covered elsewhere on this blog. Here we assume artifacts arrive with — or without — that metadata, and our job is to consume it correctly. The two halves meet at exactly one place: the admission controller.
2. Ingesting and normalizing SBOMs into a queryable inventory
You will receive SBOMs in two formats: SPDX (Linux Foundation, ISO/IEC 5962) and CycloneDX (OWASP). Both describe components and relationships; they disagree on field names, identifiers, and structure. Do not let two schemas leak into every downstream query — normalize on ingest.
First, get an SBOM. If a vendor ships one as an attestation, pull it from the registry; if not, generate one yourself from the pulled image with syft:
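# Pull a vendor-attached CycloneDX SBOM attestation, if present.
# Note: download does not verify the signature; treat the result as untrusted until verified (section 5).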
cosign download attestation \
--predicate-type https://cyclonedx.org/bom \
ghcr.io/vendor/app@sha256:abc123... \
| jq -r '.payload | @base64d | fromjson | .predicate' > app.cdx.json
# Or generate one yourself from the image you actually pulled
syft ghcr.io/vendor/app@sha256:abc123... -o spdx-json=app.spdx.json
syft ghcr.io/vendor/app@sha256:abc123... -o cyclonedx-json=app.cdx.json
The normalization key that matters is the package URL (purl) — pkg:npm/lodash@4.17.21, pkg:deb/debian/openssl@3.0.11. Both formats can carry it; purl is what lets you join across SBOMs, scanners, and advisories without guessing. Flatten every ingested SBOM to a uniform record keyed by purl:
# CycloneDX -> uniform (purl, name, version)
jq -r '.components[] | [.purl, .name, .version] | @tsv' app.cdx.json
# SPDX -> uniform; purl lives in externalRefs of type PACKAGE-MANAGER/purl
jq -r '.packages[]
| . as $p
| ($p.externalRefs // [] | map(select(.referenceType=="purl")) | .[0].referenceLocator) as $purl
| [$purl, $p.name, $p.versionInfo] | @tsv' app.spdx.json
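The same flattening can be sketched in Python for an ingest service. This is a minimal sketch, not a full parser: the `Component` record and the function names are hypothetical, format detection relies on the `bomFormat` and `spdxVersion` marker fields, and nested CycloneDX components are not walked.

```python
from typing import Iterator, NamedTuple, Optional


class Component(NamedTuple):
    """Uniform record; hypothetical shape mirroring (purl, name, version)."""
    purl: Optional[str]
    name: str
    version: Optional[str]


def from_cyclonedx(doc: dict) -> Iterator[Component]:
    """Yield one record per entry in the top-level `components` array."""
    for c in doc.get("components", []):
        yield Component(c.get("purl"), c.get("name", ""), c.get("version"))


def from_spdx(doc: dict) -> Iterator[Component]:
    """Yield one record per SPDX package; the purl comes from externalRefs."""
    for p in doc.get("packages", []):
        purl = next(
            (r.get("referenceLocator")
             for r in p.get("externalRefs", [])
             if r.get("referenceType") == "purl"),
            None,  # package carries no purl reference
        )
        yield Component(purl, p.get("name", ""), p.get("versionInfo"))


def normalize(doc: dict) -> list[Component]:
    """Detect the format from its marker field and flatten to uniform records."""
    if doc.get("bomFormat") == "CycloneDX":
        return list(from_cyclonedx(doc))
    if "spdxVersion" in doc:
        return list(from_spdx(doc))
    raise ValueError("unrecognized SBOM format")
```

Records with a missing purl are kept rather than dropped, so the ingest job can report how much of an SBOM cannot be joined against advisories.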
Load those rows into a real store so “what runs where” is a query, not an investigation. A minimal relational schema is enough to start:
CREATE TABLE artifact (
digest TEXT PRIMARY KEY, -- sha256:... immutable identity
repo TEXT NOT NULL,
ingested_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE component (
artifact_digest TEXT NOT NULL REFERENCES artifact(digest),
purl TEXT NOT NULL, -- normalized identity
name TEXT NOT NULL,
version TEXT,
PRIMARY KEY (artifact_digest, purl)
);
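-- text_pattern_ops lets PostgreSQL use the index for LIKE 'prefix%' under non-C collations
CREATE INDEX idx_component_purl ON component (purl text_pattern_ops);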
Now the OpenSSL question is a one-liner against the index:
SELECT DISTINCT a.repo, c.version
FROM component c JOIN artifact a ON a.digest = c.artifact_digest
WHERE c.purl LIKE 'pkg:deb/debian/openssl@%';
That single index is the difference between answering “are we exposed?” in seconds versus days. Graph databases (the GUAC project models exactly this) buy you transitive queries — “what depends on the thing that depends on the vulnerable thing” — but a purl-indexed table covers the 90% case and ships this week.
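Prefix matching on the raw purl string works, but real purls often carry qualifiers (for example `?arch=amd64`), so it can help to split the identity into parts at ingest time. The following is a simplified sketch under stated assumptions: `parse_purl` and the `Purl` tuple are hypothetical helpers, qualifiers and subpath are discarded, and the full purl specification has normalization rules this does not implement.

```python
from typing import NamedTuple, Optional
from urllib.parse import unquote


class Purl(NamedTuple):
    type: str
    namespace: Optional[str]
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> Purl:
    """Split pkg:type/namespace/name@version, dropping qualifiers and subpath."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    # Strip '#subpath' first, then '?qualifiers'.
    rest = purl[4:].split("#", 1)[0].split("?", 1)[0]
    # A literal '@' in a namespace is percent-encoded, so the last '@' marks the version.
    if "@" in rest:
        path, version = rest.rsplit("@", 1)
    else:
        path, version = rest, None
    segments = [unquote(s) for s in path.strip("/").split("/")]
    if len(segments) < 2 or not all(segments):
        raise ValueError(f"purl needs at least type and name: {purl!r}")
    ptype, *namespace, name = segments
    return Purl(
        ptype.lower(),
        "/".join(namespace) or None,
        name,
        unquote(version) if version else None,
    )
```

Storing `type`, `namespace`, and `name` as separate columns would turn the OpenSSL lookup into an equality match, at the cost of a slightly wider table.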
3. Mapping components to vulnerabilities and prioritizing with reachability
With an inventory keyed by purl, matching to CVEs is mechanical. grype consumes the SBOM directly — scan the SBOM, not the image, so the bill of materials and the vulnerability match come from the same source of truth:
grype sbom:app.cdx.json -o json > app.vulns.json
# Triage view: drop the noise floor, sort by severity
jq -r '.matches[]
| select(.vulnerability.severity != "Negligible"
and .vulnerability.severity != "Low")
| [.vulnerability.severity,
.artifact.name, .artifact.version,
.vulnerability.id]
| @tsv' app.vulns.json | sort
This is where most programs stop and drown. Raw CVE counts are not a priority queue — severity is not exploitability. A critical CVE in a library you never call is lower risk than a medium in code on a request path. Two signals reorder the list:
- Fix availability. A vuln with a released fix is actionable today; one without is a risk-acceptance decision. grype --only-fixed surfaces what you can act on now.
- Reachability. Is the vulnerable symbol actually called from your code, or merely present on disk? This is the highest-leverage filter in the entire pipeline: most CVEs in a deep dependency tree are unreachable.
Reachability needs call-graph analysis, not package matching. Several tools approximate it; the precise approach is consuming the VEX vulnerable_code_not_in_execute_path justification (next section) when upstream publishes one, and a static reachability tool when they do not. The practical rule a platform team can enforce today:
Page on: fixed AND (reachable OR (severity >= High AND internet-exposed)). Everything else is a backlog item with an SLA, not an interrupt.
Encode that as a fail condition so CI is honest about what blocks versus what is tracked:
# Block the build only on fixable High/Critical; track the rest
grype sbom:app.cdx.json --only-fixed --fail-on high
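The paging rule above can also be written as a small predicate, which is useful when reachability and exposure come from sources the scanner cannot see. This is a sketch under assumptions: the `Finding` fields are hypothetical names for data you would join in yourself, and an unrecognized severity is treated as the lowest rank.

```python
from dataclasses import dataclass

# Ordering of severity labels; anything not listed ranks lowest (an assumption).
SEVERITY_RANK = {"Negligible": 0, "Low": 1, "Medium": 2, "High": 3, "Critical": 4}


@dataclass(frozen=True)
class Finding:
    cve: str
    severity: str
    fix_available: bool
    reachable: bool
    internet_exposed: bool


def should_page(f: Finding) -> bool:
    """fixed AND (reachable OR (severity >= High AND internet-exposed))."""
    high_or_worse = SEVERITY_RANK.get(f.severity, 0) >= SEVERITY_RANK["High"]
    return f.fix_available and (f.reachable or (high_or_worse and f.internet_exposed))
```

Anything for which `should_page` returns false goes to the backlog with an SLA rather than to the on-call.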
4. Using VEX to suppress non-exploitable findings
VEX (Vulnerability Exploitability eXchange) is the mechanism that lets a finding be answered with a statement: “yes, that component is present; no, you are not exploitable, and here is why.” It is the single most effective lever for cutting alert noise without going blind, because it suppresses with an auditable reason instead of an allowlist of CVE IDs.
A VEX statement carries a status and, for not_affected, a justification. The four statuses:
| Status | Meaning | Consumer action |
|---|---|---|
| not_affected | Present but not exploitable in this product | Suppress; record justification |
| affected | Exploitable; action required | Triage and remediate |
| fixed | Was affected, now patched in this version | Suppress for this version |
| under_investigation | Triage in progress | Keep visible; do not suppress |
The not_affected justifications are a closed set — use the standard values, not prose:
- component_not_present
- vulnerable_code_not_present
- vulnerable_code_not_in_execute_path
- vulnerable_code_cannot_be_controlled_by_adversary
- inline_mitigations_already_exist
Author OpenVEX with vexctl. Say your scanner flags CVE-2023-44487 (HTTP/2 Rapid Reset) in a vendored library that your service compiles in, but the service never exposes an HTTP/2 server:
vexctl create \
--product "pkg:oci/app@sha256:abc123..." \
--vuln "CVE-2023-44487" \
--status "not_affected" \
--justification "vulnerable_code_not_in_execute_path" \
--file app.openvex.json
The resulting document is small and machine-checkable:
{
"@context": "https://openvex.dev/ns/v0.2.0",
"author": "platform-security@example.com",
"timestamp": "2026-06-08T10:00:00Z",
"statements": [
{
"vulnerability": { "name": "CVE-2023-44487" },
"products": [
{ "@id": "pkg:oci/app@sha256:abc123..." }
],
"status": "not_affected",
"justification": "vulnerable_code_not_in_execute_path"
}
]
}
Feed the VEX back into the scanner. grype moves any not_affected or fixed statement into the ignore set, so a re-scan with the VEX attached returns a clean priority queue — while keeping the suppressed item auditable:
# Re-scan with VEX applied; suppressed matches are filtered out of the main set
grype sbom:app.cdx.json --vex app.openvex.json --fail-on high
# Audit what VEX suppressed (table shows a "(suppressed)" label; JSON -> ignoredMatches)
grype sbom:app.cdx.json --vex app.openvex.json --show-suppressed
The governance rule that keeps VEX honest: a not_affected statement is a claim that must be defensible. Store every VEX document in version control with author and justification, require review for anything an internal team authors (vendor-published VEX is a different trust tier), and treat vulnerable_code_not_in_execute_path as a statement someone is on the hook for. VEX cuts noise, not accountability.
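One way to back that review requirement with a CI check is a small linter over the VEX documents in the repository. This is a sketch of a house rule, not of the OpenVEX specification itself: `lint_openvex` is a hypothetical helper, and requiring a standard justification on every not_affected statement is stricter than the format strictly demands.

```python
STATUSES = {"not_affected", "affected", "fixed", "under_investigation"}
JUSTIFICATIONS = {
    "component_not_present",
    "vulnerable_code_not_present",
    "vulnerable_code_not_in_execute_path",
    "vulnerable_code_cannot_be_controlled_by_adversary",
    "inline_mitigations_already_exist",
}


def lint_openvex(doc: dict) -> list[str]:
    """Return human-readable problems; an empty list means the document passes."""
    problems = []
    if not doc.get("author"):
        problems.append("document has no author")
    for i, statement in enumerate(doc.get("statements", [])):
        status = statement.get("status")
        if status not in STATUSES:
            problems.append(f"statement {i}: unknown status {status!r}")
        elif status == "not_affected" and statement.get("justification") not in JUSTIFICATIONS:
            problems.append(f"statement {i}: not_affected without a standard justification")
    return problems
```

Failing the pull request when the list is non-empty keeps free-text or missing justifications out of the system of record.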
5. Verifying provenance and signatures before promotion
Inventory and triage tell you what you are running and whether it is exploitable. Provenance tells you whether you can trust where it came from. Before an artifact is promoted from a quarantine registry into the namespace teams pull from, verify two things: the signature (who attests to this digest) and the provenance predicate (how it was built).
Verification is keyless against the digest — never the tag, since tags are mutable and you would be verifying a moving target:
DIGEST=ghcr.io/vendor/app@sha256:abc123...
# 1. Signature + signer identity
cosign verify "$DIGEST" \
--certificate-identity-regexp "^https://github.com/vendor/app/.github/workflows/.+@refs/tags/.+" \
--certificate-oidc-issuer "https://token.actions.githubusercontent.com"
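# 2. SLSA provenance attestation, gated by a policy on its contents.
# slsaprovenance1 selects the SLSA v1.0 predicate, whose buildDefinition/runDetails
# fields the Rego policy below reads; plain slsaprovenance selects v0.2.
cosign verify-attestation "$DIGEST" \
--type slsaprovenance1 \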
--certificate-identity-regexp "^https://github.com/vendor/app/.github/workflows/.+@refs/tags/.+" \
--certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
--policy provenance.rego
The two identity strings are the whole trust anchor: get --certificate-identity or --certificate-oidc-issuer wrong and you have verified that something signed it, not the right party. Pin them precisely.
The --policy flag turns “it has a provenance document” into “the provenance meets our bar” — a Rego policy asserting the build ran from the expected source repository and a trusted builder:
package signature
import rego.v1
# Provenance must name our expected source repo and a trusted builder
default allow := false
allow if {
some statement in input.predicate.buildDefinition.resolvedDependencies
startswith(statement.uri, "git+https://github.com/vendor/app")
input.predicate.runDetails.builder.id == "https://github.com/vendor/app/.github/workflows/release.yml@refs/tags/v1.4.0"
}
Run all of this as a promotion gate: untrusted artifacts land in a quarantine registry, the gate verifies signature + provenance + (VEX-filtered) scan, and only on success copies the immutable digest to the registry production namespaces may pull from. cosign copy moves the artifact and its signatures/attestations together:
cosign copy ghcr.io/quarantine/app@sha256:abc123... \
registry.internal/prod/app@sha256:abc123...
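The gate's control flow can be sketched as a function that runs each check in order and copies only when all succeed. This is an illustration under assumptions: `promote` and its parameters are hypothetical, the commands are the ones shown in this article, and the `run` argument exists so the logic can be exercised without the real CLIs installed.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def promote(src: str, dst: str, *, identity_regexp: str, issuer: str,
            provenance_type: str, policy: str, sbom: str, vex: str,
            run: Runner = _run) -> bool:
    """Verify signature, provenance, and VEX-filtered scan; copy only if all pass."""
    identity = ["--certificate-identity-regexp", identity_regexp,
                "--certificate-oidc-issuer", issuer]
    checks = [
        ["cosign", "verify", src, *identity],
        ["cosign", "verify-attestation", src, "--type", provenance_type,
         *identity, "--policy", policy],
        ["grype", f"sbom:{sbom}", "--vex", vex, "--only-fixed", "--fail-on", "high"],
    ]
    # all() stops at the first failing check, so later steps never run on a bad artifact.
    if not all(run(cmd) == 0 for cmd in checks):
        return False
    return run(["cosign", "copy", src, dst]) == 0
```

Both `src` and `dst` should be digest references, so the artifact that was verified is the artifact that gets copied.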
6. Enforcing admission control to block unsigned or vulnerable images
Promotion gates are policy you run. Admission control is policy the cluster runs on every pod, including ones that route around your gate — the backstop, and the only place producer-side metadata and consumer-side policy converge. We’ll use Kyverno because its verifyImages rule does signature verification, attestation checks, and digest mutation in one policy.
This policy enforces three things at once: images must come from the trusted internal registry, must carry a valid signature from the expected identity, and tags are rewritten to digests so what is admitted is immutable:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
name: require-signed-trusted-images
spec:
validationFailureAction: Enforce # block, do not just audit
background: false
webhookTimeoutSeconds: 30
rules:
- name: only-internal-registry
match:
any:
- resources:
kinds: ["Pod"]
validate:
message: "Images must be pulled from registry.internal/prod"
pattern:
spec:
containers:
- image: "registry.internal/prod/*"
- name: verify-signature-and-pin-digest
match:
any:
- resources:
kinds: ["Pod"]
verifyImages:
- imageReferences:
- "registry.internal/prod/*"
mutateDigest: true # rewrite tag -> digest on admit
verifyDigest: true
required: true
attestors:
- count: 1
entries:
- keyless:
subject: "https://github.com/vendor/app/.github/workflows/release.yml@refs/tags/*"
issuer: "https://token.actions.githubusercontent.com"
rekor:
url: https://rekor.sigstore.dev
mutateDigest: true is the quiet hero: even if a manifest references a mutable tag, Kyverno resolves it to the digest it verified and writes that into the pod spec, closing the time-of-check/time-of-use gap where a tag could be repointed between verification and pull.
You can go further and gate on attestation content — block any image whose attached SBOM scan still carries an un-VEXed Critical — using a conditions block against an attestation predicate. Roll out in stages regardless: Audit to measure the blast radius, scope by namespace label, then flip to Enforce. A content gate that silently blocks every legacy workload on day one is how admission control gets disabled by an incident bridge at 2am.
Always carve a break-glass path: an exempt namespace (or a policy exception object) gated by RBAC so a P1 can ship while the controller stays on for everything else. A gate with no escape hatch gets turned off entirely the first time it blocks a real incident — and then you have no gate at all.
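Teams usually prefer to learn about a rejection in CI rather than at admission time. A pre-flight lint over image references is one way to do that; the sketch below is an assumption-laden illustration, not a substitute for the cluster policy. `preflight` is a hypothetical helper, the registry prefix is the one used in this article, and requiring a digest is deliberately stricter than the policy, which would rewrite a tag on admit.

```python
import re

TRUSTED_PREFIX = "registry.internal/prod/"
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def preflight(image: str) -> list[str]:
    """Return reasons this image reference should be fixed before deploy."""
    problems = []
    if not image.startswith(TRUSTED_PREFIX):
        problems.append(f"{image}: not from {TRUSTED_PREFIX}")
    if not DIGEST_RE.search(image):
        problems.append(f"{image}: not pinned by sha256 digest")
    return problems
```

This checks only the reference string; signature and attestation verification still happen in the promotion gate and the admission controller.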
7. Continuous monitoring for newly disclosed CVEs
The scan that passed at admission is point-in-time. Tomorrow a new CVE drops against a component you already admitted, and that artifact is now vulnerable without changing one byte. This is precisely why section 2’s inventory exists: you re-evaluate standing artifacts against new advisories instead of waiting to rebuild.
Re-scan stored SBOMs on a schedule against a freshened vulnerability database — no rebuild, no re-pull, because the SBOM is the source of truth:
grype db update # refresh advisory data first
for sbom in /inventory/sboms/*.cdx.json; do
  name=$(basename "$sbom" .cdx.json)
  vex="$(dirname "$sbom")/vex/$name.openvex.json"
  # Apply VEX only when a document exists for this artifact
  vex_args=()
  [ -f "$vex" ] && vex_args=(--vex "$vex")
  grype "sbom:$sbom" "${vex_args[@]}" \
    -o json > "/inventory/results/$name.vulns.json"
done
Join fresh results back to the inventory to answer the only question that matters during a zero-day: which running workloads contain the newly vulnerable component? Because everything is keyed by purl, finding the affected artifacts is one query the moment a CVE is announced; intersect the returned digests with the digests your clusters are actually running to get the workload list:
-- "A new CVE hits pkg:npm/loader-utils. Which of our artifacts contain it?"
SELECT a.repo, a.digest, c.version
FROM component c
JOIN artifact a ON a.digest = c.artifact_digest
WHERE c.purl LIKE 'pkg:npm/loader-utils@%';
Note the VEX document is re-applied on every re-scan, so a finding you have already justified does not re-page the on-call when the database refreshes. Triage memory persists.
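To alert only on what changed between two scheduled scans, the previous and current reports can be diffed. The sketch below assumes the grype JSON layout used earlier in this article (a `matches` array with `vulnerability.id` and `artifact` fields); the function names are hypothetical, and the artifact name is used as a fallback key when a match carries no purl.

```python
def finding_keys(report: dict) -> set[tuple[str, str]]:
    """Reduce a scan report to a set of (vulnerability id, component) pairs."""
    return {
        (m["vulnerability"]["id"], m["artifact"].get("purl") or m["artifact"]["name"])
        for m in report.get("matches", [])
    }


def newly_disclosed(previous: dict, current: dict) -> set[tuple[str, str]]:
    """Findings present in the current scan that the previous scan did not have."""
    return finding_keys(current) - finding_keys(previous)
```

Because VEX-suppressed matches are filtered out of the main set on every re-scan, already-justified findings never show up in this difference.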
8. Operational workflow: upgrade SLAs and exception governance
Tooling without policy is a dashboard nobody owns. The operating model is two artifacts: time-bound remediation SLAs clocked from the moment a finding becomes actionable (a fix exists), and an exception process with an expiry.
| Severity (post-VEX, fix available) | Internet-exposed | Internal only |
|---|---|---|
| Critical | 24 hours | 7 days |
| High | 7 days | 30 days |
| Medium | 30 days | 90 days |
| Low | Best effort | Best effort |
Automate the upgrade itself — Renovate or Dependabot opens the PR, your CI gate (sections 3-5) proves the new digest is clean and signed before merge. The SLA governs the exception, not the upgrade: anything that cannot meet its window needs a tracked, expiring risk acceptance.
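The SLA table translates directly into a lookup that stamps each finding with a deadline. This is a minimal sketch: `remediation_deadline` is a hypothetical helper, the windows are copied from the table above, and the clock starts at the time a fix became available.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# (severity, internet_exposed) -> remediation window, copied from the SLA table.
SLA = {
    ("Critical", True): timedelta(hours=24),
    ("Critical", False): timedelta(days=7),
    ("High", True): timedelta(days=7),
    ("High", False): timedelta(days=30),
    ("Medium", True): timedelta(days=30),
    ("Medium", False): timedelta(days=90),
}


def remediation_deadline(severity: str, internet_exposed: bool,
                         fix_available_at: datetime) -> Optional[datetime]:
    """Return the SLA deadline, or None for best-effort severities such as Low."""
    window = SLA.get((severity, internet_exposed))
    return fix_available_at + window if window is not None else None


# Example: a Critical on an internet-exposed service, fix released at 10:00 UTC.
deadline = remediation_deadline(
    "Critical", True, datetime(2026, 6, 8, 10, 0, tzinfo=timezone.utc)
)
```

A finding whose deadline has passed without an upgrade is exactly the case that needs a tracked, expiring risk acceptance.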
The non-negotiable property of an exception is an expiry date. A permanent exception is a silent policy change. Every suppression (a VEX not_affected, an admission exception, a deferred upgrade) carries an owner and an end date, and lapses loudly.
Encode exceptions as data so they are auditable and CI-checkable, not tribal knowledge in a chat thread:
# exceptions/CVE-2024-XXXX.yaml
cve: CVE-2024-XXXX
component: "pkg:npm/some-lib@2.3.1"
status: accepted-risk
reason: "No upstream fix; mitigated by WAF rule WAF-1021 (see ticket SEC-4412)"
owner: platform-security@example.com
expires: "2026-09-08" # CI fails the build the day this lapses
A nightly job that fails when an exception is past expires turns governance from a quarterly audit into a daily, self-enforcing control.
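The core of such a job is small. The sketch below assumes the exception files have already been loaded into dictionaries by whatever YAML parser you use; `lapsed` is a hypothetical helper, an exception is treated as valid through its `expires` date, and a record with no `expires` field counts as lapsed.

```python
from datetime import date


def lapsed(exceptions: list[dict], today: date) -> list[dict]:
    """Return exceptions that are past their expiry or have no expiry at all."""
    out = []
    for exc in exceptions:
        expires = exc.get("expires")
        # str() also covers loaders that parse unquoted dates into date objects.
        if expires is None or date.fromisoformat(str(expires)) < today:
            out.append(exc)
    return out
```

A wrapper script would exit non-zero whenever the returned list is non-empty, which is what makes the lapse loud.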
Enterprise scenario
A fintech platform team ran a managed Kubernetes fleet for ~40 product teams. Their container scanner reported north of 12,000 findings across the registry; the security channel was muted within a week of go-live, and an internal auditor flagged that “critical vulnerability” tickets were being closed as wontfix with no recorded justification — an audit finding in a regulated shop.
The constraint was specific: they could not simply mandate “zero criticals.” A large share of findings were unreachable transitive dependencies in Java and Node images they consumed from upstream vendors, and forcing 40 teams to patch unreachable CVEs would have stalled every roadmap. They needed to suppress with evidence, and prove to the auditor that each suppression was a defensible decision, not a shortcut.
They built the consumer pipeline above. SBOMs (CycloneDX) for every image landed in a Postgres inventory keyed by purl, and grype re-scanned those SBOMs nightly against a refreshed vulnerability database. The key move was treating VEX as the system of record for triage: every suppression became a reviewed, version-controlled OpenVEX statement with a justification, replacing the un-auditable wontfix.
The result was a single CI step that re-scanned the SBOM with the VEX applied and failed only on fixable, un-suppressed findings above the SLA threshold:
# Per-image gate: only un-VEXed, fixable High+ blocks the promotion
grype "sbom:${IMAGE_SBOM}" \
--vex "vex/${IMAGE_NAME}.openvex.json" \
--only-fixed --fail-on high
Findings dropped from ~12,000 raw to ~140 actionable, with every one of the suppressed ~11,800 carrying a named justification an auditor could read. The audit finding was closed, because each suppression was now an OpenVEX statement with an owner, a justification from the standard set, and a git history. Kyverno’s verifyImages policy (in Audit for six weeks, scoped by namespace label, then Enforce) became the backstop that kept an un-gated kubectl run from bypassing the pipeline. The lesson: at consumer scale, the win is not finding more vulnerabilities; it is defensibly ignoring the ones that do not matter while provably blocking the ones that do.
Verify
Confirm each stage actually does its job, not just that the tools ran:
# Inventory: a known component resolves from the SBOM store
psql -c "SELECT count(*) FROM component WHERE purl LIKE 'pkg:deb/debian/openssl@%';"
# Triage: VEX moves a known non-exploitable CVE out of the failing set
grype sbom:app.cdx.json --vex app.openvex.json --show-suppressed \
| grep -i "CVE-2023-44487" # should appear with a (suppressed) label
# Provenance: verification FAILS for a wrong identity (negative test)
cosign verify "$DIGEST" \
--certificate-identity-regexp "^https://github.com/ATTACKER/.+" \
--certificate-oidc-issuer "https://token.actions.githubusercontent.com" \
&& echo "BAD: accepted wrong signer" || echo "OK: rejected wrong signer"
# Admission: an unsigned/wrong-registry image is blocked
kubectl run rogue --image=docker.io/library/nginx:latest \
&& echo "BAD: rogue admitted" || echo "OK: admission blocked it"
# Mutation: an admitted pod's image was rewritten tag -> digest
kubectl get pod <name> -o jsonpath='{.spec.containers[0].image}' | grep '@sha256:'
The negative tests are the ones that matter. A gate you have only ever seen pass is a gate you have not tested.