Authoring Production-Grade Helm Charts: Library Charts, Values Schemas & CI Testing

helm create gets you a chart in five seconds and a maintenance liability in five weeks. This guide walks through the practices that separate a throwaway scaffold from a chart you can hand to twenty teams: shared template libraries, fail-fast input validation, deterministic dependency handling, and a CI pipeline that catches breakage before it reaches a cluster.

1. A chart layout that scales

The default scaffold is fine for one service. Once you have a platform, structure the chart so that intent is obvious and overrides are predictable.

myapp/
  Chart.yaml
  values.yaml            # documented defaults, every key present
  values.schema.json     # contract for what callers may pass
  templates/
    _helpers.tpl         # named templates (fullname, labels, selectors)
    deployment.yaml
    service.yaml
    serviceaccount.yaml
    NOTES.txt
  charts/                # vendored dependencies (helm dependency build)
  ci/                    # values files used only by chart-testing
    default-values.yaml
    ha-values.yaml

Two rules carry most of the weight. First, every value your templates read must appear in values.yaml with a sane default and a comment — even if the default is {} or "". An undocumented value is a bug waiting for a 2 a.m. page. Second, keep templates/ free of business logic that belongs in helpers; a template should read like a manifest, not a program.
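
The first rule can be checked mechanically. The following is a minimal sketch, not part of Helm: it scans template text for .Values references and reports any path missing from the defaults. The function names and sample inputs are hypothetical, and the regex only understands plain dotted references (it does not follow with or range scopes, or index calls). Paths under a default of {} will be reported, which is usually the prompt you want to add a commented example.

```python
import re

# Matches plain references such as .Values.image.repository in template text.
VALUES_REF = re.compile(r"\.Values\.([A-Za-z_][\w]*(?:\.[A-Za-z_][\w]*)*)")


def referenced_paths(template_text):
    """Return the set of dotted value paths a template reads."""
    return set(VALUES_REF.findall(template_text))


def has_path(values, dotted):
    """True if every segment of the dotted path exists in the nested dicts."""
    node = values
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def undocumented(template_text, values):
    """Paths the template reads that the defaults do not define."""
    return sorted(p for p in referenced_paths(template_text) if not has_path(values, p))


# Hypothetical inputs; a real script would parse values.yaml with a YAML library.
template = (
    'image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"\n'
    "replicas: {{ .Values.replicaCount }}"
)
defaults = {"image": {"repository": "nginx"}, "replicaCount": 1}

print(undocumented(template, defaults))  # ['image.tag']
```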

Use helm create once to remember the layout, then delete the generated boilerplate. The scaffolded values.yaml ships opinions (a specific autoscaling block, a sample ingress) you almost certainly do not want as your defaults.

2. DRY templating with named templates and library charts

Named templates (defined with define in _helpers.tpl) are your first lever against duplication. The canonical pair is a name helper and a labels helper:

{{/* templates/_helpers.tpl */}}
{{- define "myapp.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}

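{{- define "myapp.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{- define "myapp.labels" -}}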
helm.sh/chart: {{ printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
app.kubernetes.io/name: {{ include "myapp.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}

The trunc 63 is not cosmetic: Kubernetes label values and many resource names are capped at 63 characters, and a long release name will otherwise produce an invalid object that the API server rejects.
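
To see why trimSuffix follows trunc, it helps to trace the helper outside of Helm. This is a rough Python emulation of the myapp.fullname logic above, written only for illustration; the release name is a made-up 62-character string chosen so that truncation lands exactly on the joining hyphen.

```python
def cap(text, limit=63):
    """Emulate `trunc 63 | trimSuffix "-"`: cut to the limit, drop one trailing hyphen."""
    text = text[:limit]
    return text[:-1] if text.endswith("-") else text


def fullname(release_name, chart_name, name_override="", fullname_override=""):
    """Follow the same branches as the myapp.fullname named template."""
    if fullname_override:
        return cap(fullname_override)
    name = name_override or chart_name
    return cap(f"{release_name}-{name}")


print(fullname("web", "myapp"))          # web-myapp

release = "r" * 62                       # hypothetical 62-character release name
name = fullname(release, "myapp")
print(len(name), name.endswith("-"))     # 62 False
```

Without the trimSuffix step the second name would end in a hyphen, which is not a valid ending for a DNS label.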

When the same helpers need to be shared across many charts, promote them into a library chart. A library chart sets type: library in Chart.yaml, ships only templates/ with define blocks (no rendered manifests), and is consumed as a dependency. The key behavioral difference: Helm does not render a library chart’s templates directly, so it never emits objects on its own — it only exposes named templates.

# common/Chart.yaml
apiVersion: v2
name: common
type: library
version: 1.4.0

# myapp/Chart.yaml
dependencies:
  - name: common
    version: "1.4.0"
    repository: "oci://ghcr.io/myorg/charts"

A widely used pattern is to have the library define a full resource (say, a Deployment) as a named template, and let each application chart include it and pass overrides. Even at a smaller scale, centralizing just your labels, selectorLabels, and image-reference helpers in a library chart eliminates the most common source of drift across a fleet.

3. Validate inputs with values.schema.json

A values.schema.json file at the chart root is validated by Helm automatically on install, upgrade, lint, and template. It is plain JSON Schema (Draft 7 era), and it is the single highest-leverage reliability improvement you can make to a chart: bad config fails at render time with a clear message instead of producing a broken Deployment.

{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image", "replicaCount"],
  "properties": {
    "replicaCount": {
      "type": "integer",
      "minimum": 1
    },
    "image": {
      "type": "object",
      "required": ["repository"],
      "properties": {
        "repository": { "type": "string", "minLength": 1 },
        "tag": { "type": "string" },
        "pullPolicy": {
          "type": "string",
          "enum": ["Always", "IfNotPresent", "Never"]
        }
      },
      "additionalProperties": false
    },
    "service": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": ["ClusterIP", "NodePort", "LoadBalancer"]
        },
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 }
      }
    }
  }
}

Two things worth internalizing. JSON Schema validates structure and types, not cross-field business rules — it cannot express “if autoscaling.enabled then replicaCount is ignored.” For those, fail explicitly inside templates with required and fail:

{{- if and .Values.ingress.enabled (not .Values.ingress.className) }}
{{- fail "ingress.enabled=true requires ingress.className" }}
{{- end }}
{{- $repo := required "image.repository is required" .Values.image.repository }}

Also note that additionalProperties: false is strict — it will reject a typo’d key like imagePullPolcy, which is exactly what you want, but it means a caller cannot smuggle in extra keys. Apply it deliberately at the leaf objects you fully control, and be more permissive at the top level if your chart intentionally accepts pass-through blocks.
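
The difference between a strict leaf and a permissive top level is easier to see in a toy validator. The sketch below is an assumption-laden illustration, not how Helm validates (Helm uses a full JSON Schema implementation): it handles only type, enum, minimum, required, properties, and additionalProperties, and the schema is a trimmed copy of the one above.

```python
def validate(value, schema, path="values"):
    """Return human-readable violations for a small subset of JSON Schema."""
    checks = {
        "object": lambda v: isinstance(v, dict),
        "string": lambda v: isinstance(v, str),
        # bool is a subclass of int in Python, so exclude it explicitly.
        "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    }
    expected = schema.get("type")
    if expected in checks and not checks[expected](value):
        return [f"{path}: expected {expected}"]

    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: must be one of {schema['enum']}")
    if "minimum" in schema and isinstance(value, (int, float)) and value < schema["minimum"]:
        errors.append(f"{path}: must be >= {schema['minimum']}")
    if isinstance(value, dict):
        properties = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required key '{key}'")
        for key, child in value.items():
            if key in properties:
                errors.extend(validate(child, properties[key], f"{path}.{key}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}: unknown key '{key}'")
    return errors


SCHEMA = {
    "type": "object",
    "required": ["image", "replicaCount"],
    "properties": {
        "replicaCount": {"type": "integer", "minimum": 1},
        "image": {
            "type": "object",
            "required": ["repository"],
            "properties": {
                "repository": {"type": "string"},
                "pullPolicy": {"type": "string", "enum": ["Always", "IfNotPresent", "Never"]},
            },
            "additionalProperties": False,  # strict leaf
        },
    },
    # No additionalProperties at the top level, so unknown top-level keys pass.
}

bad = {
    "replicaCount": 0,
    "image": {"repository": "nginx", "imagePullPolcy": "Always"},  # typo'd key
    "extra": True,  # hypothetical pass-through key
}
problems = validate(bad, SCHEMA)
for line in problems:
    print(line)
```

The typo under image is rejected because that object is closed, while the unknown top-level key passes because the root object is left open.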

4. Dependencies, subcharts, and global values

Declare dependencies in Chart.yaml and lock them. helm dependency update resolves versions and writes Chart.lock; commit that lock file so CI and production resolve byte-identical charts.

helm dependency update ./myapp     # resolves + writes Chart.lock + populates charts/
helm dependency build ./myapp      # rebuilds charts/ from an existing Chart.lock

Use condition and tags to make optional dependencies toggleable without editing Chart.yaml:

dependencies:
  - name: postgresql
    version: "15.5.x"
    repository: "oci://registry-1.docker.io/bitnamicharts"
    condition: postgresql.enabled
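
When a dependency carries both a condition and tags, the precedence is worth spelling out: a condition path that resolves to a boolean wins, and tags only matter when no condition applies. The sketch below models that rule as it is described in the Helm documentation; it is a simplification for illustration, and the tag name database is hypothetical.

```python
def lookup(values, dotted_path):
    """Resolve a dotted path in nested dicts, or return None if it is absent."""
    node = values
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def dependency_enabled(values, condition="", tags=()):
    """Decide whether a dependency is enabled for the given parent values."""
    # Conditions override tags: the first path that resolves to a boolean decides.
    for path in (p.strip() for p in condition.split(",")):
        found = lookup(values, path) if path else None
        if isinstance(found, bool):
            return found
    # Tags: disabled only if at least one tag is set and none of them is true.
    flags = values.get("tags", {})
    decided = [flags[t] for t in tags if isinstance(flags.get(t), bool)]
    if decided and not any(decided):
        return False
    return True  # dependencies are enabled by default


print(dependency_enabled({"postgresql": {"enabled": False}}, "postgresql.enabled"))  # False
print(dependency_enabled({"tags": {"database": False}}, tags=["database"]))          # False
print(dependency_enabled(
    {"tags": {"database": False}, "postgresql": {"enabled": True}},
    "postgresql.enabled",
    ["database"],
))  # True: the condition overrides the tag
```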

The subtlety that bites people is the global scope. Values under .Values.global are visible to the parent chart and every subchart, which makes globals perfect for cross-cutting settings (image registry mirror, image pull secrets, environment name) and dangerous for anything else. A parent can also override a subchart’s values by nesting them under the subchart’s name:

# parent values.yaml
global:
  imageRegistry: registry.internal.example.com
postgresql:            # overrides into the postgresql subchart
  primary:
    persistence:
      size: 50Gi

Resist the urge to push everything into global “just in case.” Globals are an implicit API across all subcharts; once a subchart starts reading one, removing it is a breaking change you cannot see from the parent.
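
A small model of the scoping rules makes the implicit API visible. The sketch below approximates what a subchart sees as .Values: its own defaults, overlaid with the parent block named after it, plus the shared global block. It is a simplification under stated assumptions (for example, it ignores Helm's handling of null to delete a key), and the subchart defaults shown are invented for the example.

```python
import copy


def deep_merge(base, override):
    """Return base overlaid with override; nested dicts merge, other values replace."""
    merged = copy.deepcopy(base)
    for key, val in override.items():
        if isinstance(val, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], val)
        else:
            merged[key] = copy.deepcopy(val)
    return merged


def subchart_values(parent_values, subchart_name, subchart_defaults):
    """Approximate the .Values a subchart receives from its parent."""
    effective = deep_merge(subchart_defaults, parent_values.get(subchart_name, {}))
    effective["global"] = deep_merge(
        subchart_defaults.get("global", {}), parent_values.get("global", {})
    )
    return effective


parent = {
    "global": {"imageRegistry": "registry.internal.example.com"},
    "replicaCount": 2,  # parent-only key, invisible to the subchart
    "postgresql": {"primary": {"persistence": {"size": "50Gi"}}},
}
# Hypothetical subchart defaults, for illustration only.
postgresql_defaults = {"primary": {"persistence": {"enabled": True, "size": "8Gi"}}}

seen = subchart_values(parent, "postgresql", postgresql_defaults)
print(seen["primary"]["persistence"])   # {'enabled': True, 'size': '50Gi'}
print(seen["global"]["imageRegistry"])  # registry.internal.example.com
print("replicaCount" in seen)           # False
```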

5. Hooks, ordering, and when not to use them

Helm hooks let you run resources at lifecycle points (pre-install, post-install, pre-upgrade, post-delete, and so on), ordered within a phase by helm.sh/hook-weight (lower runs first). The classic use is a schema migration Job before an upgrade.

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "myapp.fullname" . }}-migrate
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          command: ["/app/migrate", "up"]

The critical caveat: hook resources are not tracked as part of the release. Helm creates them out-of-band and does not manage their lifecycle the way it does normal manifests, which is why you set an explicit hook-delete-policy. A failed hook also does not auto-rollback unless you pass --atomic. Reach for hooks when you genuinely need lifecycle ordering — migrations, one-shot setup — and avoid them for anything that should be a first-class, reconciled part of the release. If a Job needs to keep existing, model it as a normal resource, not a hook.
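
Hook weights are annotation strings that Helm reads as integers, with a missing weight treated as 0. The following sketch shows how a phase's hooks could be selected and ordered under that rule. The resource dicts are a simplified, hypothetical shape rather than Helm's internal types, and tie-breaking between equal weights is left as input order here, which may differ from Helm.

```python
def hooks_for(phase, resources):
    """Return the names of hook resources for a phase, lowest weight first."""
    selected = []
    for res in resources:
        annotations = res.get("annotations", {})
        phases = [p.strip() for p in annotations.get("helm.sh/hook", "").split(",")]
        if phase in phases:
            # The annotation value is a string such as "-5"; absent means 0.
            weight = int(annotations.get("helm.sh/hook-weight", "0"))
            selected.append((weight, res["name"]))
    # sorted() is stable, so equal weights keep their input order.
    return [name for _, name in sorted(selected, key=lambda pair: pair[0])]


resources = [
    {"name": "notify", "annotations": {"helm.sh/hook": "post-upgrade"}},
    {"name": "seed", "annotations": {"helm.sh/hook": "pre-upgrade"}},
    {
        "name": "migrate",
        "annotations": {
            "helm.sh/hook": "pre-upgrade,pre-install",
            "helm.sh/hook-weight": "-5",
        },
    },
    {"name": "web", "annotations": {}},  # ordinary resource, not a hook
]

print(hooks_for("pre-upgrade", resources))  # ['migrate', 'seed']
print(hooks_for("pre-install", resources))  # ['migrate']
```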

6. Testing: lint, unit tests, and chart-testing

Layer three independent checks; each catches a different class of failure.

helm lint validates chart structure, runs schema validation, and surfaces obvious template errors. Pass --strict to turn warnings into failures in CI:

helm lint ./myapp --strict --values ./myapp/ci/ha-values.yaml

Unit snapshots with the helm-unittest plugin assert that specific rendered output matches expectations, so a careless template edit that shifts a label or drops a probe fails loudly. Tests live in tests/ and run against the rendered templates:

# myapp/tests/deployment_test.yaml
suite: deployment
templates:
  - deployment.yaml
tests:
  - it: sets the replica count from values
    set:
      replicaCount: 3
    asserts:
      - equal:
          path: spec.replicas
          value: 3
  - it: renders a probe on the main container
    asserts:
      - isNotNull:
          path: spec.template.spec.containers[0].livenessProbe

helm plugin install https://github.com/helm-unittest/helm-unittest
helm unittest ./myapp

chart-testing (the ct tool) is what ties it together in CI: it lints changed charts, validates that the chart version was bumped, and can install each changed chart into an ephemeral cluster (kind works well) to confirm it actually deploys. The ci/*-values.yaml files give ct multiple realistic configurations to exercise.

ct lint --target-branch main --chart-dirs charts
ct install --target-branch main --chart-dirs charts

A minimal GitHub Actions job wiring this up against a kind cluster:

name: chart-ci
on: pull_request
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0          # ct needs history to diff against the base
      - uses: azure/setup-helm@v4
      - uses: helm/chart-testing-action@v2
      - name: Lint changed charts
        run: ct lint --target-branch ${{ github.event.repository.default_branch }}
      - uses: helm/kind-action@v1
      - name: Install changed charts
        run: ct install --target-branch ${{ github.event.repository.default_branch }}

7. Packaging and distribution via OCI

Helm 3 treats OCI registries as a first-class distribution channel, so you can store charts next to your images. Package, push, and pull use the registry directly — no separate chart repo index to maintain.

helm package ./myapp                       # produces myapp-1.2.0.tgz
helm push myapp-1.2.0.tgz oci://ghcr.io/myorg/charts
helm pull oci://ghcr.io/myorg/charts/myapp --version 1.2.0
helm install myapp oci://ghcr.io/myorg/charts/myapp --version 1.2.0

For supply-chain integrity, Helm supports provenance files. helm package --sign produces a .prov file alongside the .tgz, and helm verify (or helm install --verify) checks the signature against your keyring.

helm package ./myapp --sign --key 'platform-team' --keyring ~/.gnupg/secring.gpg
helm verify myapp-1.2.0.tgz                 # validates the .prov signature

Many teams now also sign the pushed OCI artifact with cosign in addition to Helm’s PGP provenance. The two are complementary: PGP provenance proves the chart contents, cosign attaches a signature to the registry artifact and integrates with admission policy. Pick at least one and enforce it.
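
Part of what a provenance file records is a SHA-256 digest of the packaged chart, listed under a files: key. The sketch below covers only that digest comparison, as a way to see what "proves the chart contents" means in practice; it does not verify the PGP signature that protects the .prov file itself, so it is no substitute for helm verify. The sample bytes and provenance text are fabricated for the example.

```python
import hashlib


def recorded_digest(prov_text, package_name):
    """Extract the hex digest recorded for package_name, or None if absent."""
    for line in prov_text.splitlines():
        name, sep, value = line.strip().partition(": ")
        if sep and name == package_name and value.startswith("sha256:"):
            return value[len("sha256:"):]
    return None


def digest_matches(package_bytes, prov_text, package_name):
    """True if the package hashes to the digest the provenance text records."""
    expected = recorded_digest(prov_text, package_name)
    return expected is not None and hashlib.sha256(package_bytes).hexdigest() == expected


# Fabricated stand-ins for a real .tgz and the body of its .prov file.
package = b"pretend this is myapp-1.2.0.tgz"
prov = "name: myapp\nfiles:\n  myapp-1.2.0.tgz: sha256:" + hashlib.sha256(package).hexdigest() + "\n"

print(digest_matches(package, prov, "myapp-1.2.0.tgz"))              # True
print(digest_matches(package + b"tampered", prov, "myapp-1.2.0.tgz"))  # False
```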

8. Upgrade safety: diff, atomic, and CRDs

Before any production upgrade, render the change, not just the new state. The helm diff plugin shows exactly what will mutate:

helm plugin install https://github.com/databus23/helm-diff
helm diff upgrade myapp oci://ghcr.io/myorg/charts/myapp --version 1.2.0 -f prod-values.yaml

Run upgrades with --atomic --timeout. With --atomic, a failed upgrade automatically rolls back to the prior revision instead of leaving the release wedged half-applied:

helm upgrade myapp oci://ghcr.io/myorg/charts/myapp \
  --version 1.2.0 -f prod-values.yaml \
  --atomic --timeout 5m

CRDs are the sharpest edge in Helm. Files in a chart’s special crds/ directory are installed before the rest of the chart, but Helm never upgrades or deletes them — this is deliberate, to avoid destroying custom resources cluster-wide. The practical consequence: shipping a new CRD version inside crds/ will not update an existing CRD. Manage CRD lifecycle explicitly, typically by applying CRD updates with kubectl apply as a separate, deliberate step outside the normal chart upgrade.

Enterprise scenario

A platform team running ~40 service charts off a shared common library shipped a “harmless” fix: renaming the selector helper from common.selectorLabels to common.matchLabels and bumping the library to 2.0.0. Lint passed, unit snapshots passed, ct install into kind passed — every check was green. The first production helm upgrade failed with Deployment.apps "checkout" is invalid: spec.selector: field is immutable. The new helper emitted a different spec.selector.matchLabels, and Kubernetes forbids mutating a Deployment’s selector after creation. Their CI only ever ran ct install on a clean cluster, so it never exercised the upgrade path where the immutability rule lives.

The fix had two parts. First, they froze selector labels as a contract: the output of the library’s common.selectorLabels may no longer change at all (adding a key mutates the selector just as renaming one does), and a unit test fails if the rendered key set changes.

# common/tests/selector_test.yaml
suite: selectors
templates:
  - deployment.yaml
tests:
  - it: selector labels are frozen (immutable contract)
    asserts:
      - equal:
          path: spec.selector.matchLabels
          value:
            app.kubernetes.io/name: checkout
            app.kubernetes.io/instance: RELEASE-NAME

Second, they added an upgrade gate to ct so CI installs the chart, then upgrades over it before tearing down:

# ct.yaml
upgrade: true

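ct install --upgrade installs the chart’s previous revision from the target branch first, then upgrades in place to the PR’s version, catching exactly the immutable-field class of break that a from-scratch install hides. One caveat: ct skips the in-place upgrade when the chart’s version bump signals a breaking change under SemVer, so a major bump needs its own upgrade check. The lesson: green local renders prove a chart installs; only an upgrade-over-previous test proves it upgrades.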
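
A cheaper, offline complement to the cluster-level gate is to compare selectors between two renders of the same chart. The sketch below assumes the previous and current manifests have already been rendered and parsed into dicts (for example from helm template output); the function name and sample data are hypothetical.

```python
def selector_changes(previous, current):
    """Map Deployment name to (old, new) matchLabels wherever the selector differs."""

    def selectors(manifests):
        return {
            m["metadata"]["name"]: m["spec"]["selector"].get("matchLabels", {})
            for m in manifests
            if m.get("kind") == "Deployment"
        }

    old, new = selectors(previous), selectors(current)
    # Only Deployments present in both renders can hit the immutability rule.
    return {
        name: (old[name], new[name])
        for name in sorted(old.keys() & new.keys())
        if old[name] != new[name]
    }


def deployment(name, match_labels):
    """Build a minimal Deployment-shaped dict for the example."""
    return {
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {"selector": {"matchLabels": match_labels}},
    }


base = {"app.kubernetes.io/name": "checkout", "app.kubernetes.io/instance": "prod"}
previous = [deployment("checkout", base)]
# Appending a label is still a selector change, not a safe addition.
current = [deployment("checkout", {**base, "app.kubernetes.io/component": "api"})]

changed = selector_changes(previous, current)
print(list(changed))  # ['checkout']
```

A CI step could fail the build whenever the returned mapping is non-empty, before any cluster is involved.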

Verify

Run these against a chart before you trust it:

# 1. Schema + lint pass cleanly, strictly
helm lint ./myapp --strict

# 2. Templates render with defaults AND with a real prod values file
helm template myapp ./myapp > /tmp/rendered-defaults.yaml
helm template myapp ./myapp -f prod-values.yaml > /tmp/rendered.yaml
test -s /tmp/rendered-defaults.yaml && test -s /tmp/rendered.yaml && echo "rendered OK"

# 3. Bad input is rejected by the schema (expect a non-zero exit)
helm template myapp ./myapp --set replicaCount=0 ; echo "exit=$?"

# 4. Unit snapshots pass
helm unittest ./myapp

# 5. The rendered output is valid against the live API (dry run)
helm install myapp ./myapp --dry-run=server -f prod-values.yaml

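--dry-run=server is meaningfully stronger than the default client dry run: Helm connects to the cluster, so lookup calls resolve against live state and the rendered manifests are checked against the API server’s resource schemas, catching errors a purely local render misses. Helm does not submit the objects during a dry run, so admission webhooks are not exercised; to cover those as well, pipe helm template output into kubectl apply --dry-run=server -f -.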

Checklist

Pitfalls

Treating global as a convenience. Every global is an implicit contract with every subchart; add them sparingly and document them.
Forgetting hooks are untracked. Without a hook-delete-policy you accumulate orphaned Jobs; without --atomic a failed hook leaves the release inconsistent.
Assuming crds/ upgrades CRDs. It does not. Plan CRD versioning as a first-class, manual operation.
additionalProperties: false everywhere. Strictness is great at leaves you own and painful on blocks meant to pass through to a subchart — apply it with intent.
Skipping the server dry run. A chart that renders locally can still be rejected by the cluster; --dry-run=server is your last cheap gate before a real install, and kubectl apply --dry-run=server on the rendered output additionally exercises admission controllers.

Next step: pull your label, selector, and image helpers into a type: library chart, version it, and publish it to your OCI registry. Once every service chart depends on the same library, fixing a labeling bug is one release instead of twenty pull requests.

Written by Vinod
