Authoring Production-Grade Helm Charts: Library Charts, Values Schemas & CI Testing

helm create gets you a chart in five seconds and a maintenance liability in five weeks. This guide walks through the practices that separate a throwaway scaffold from a chart you can hand to twenty teams: shared template libraries, fail-fast input validation, deterministic dependency handling, and a CI pipeline that catches breakage before it reaches a cluster.

1. A chart layout that scales

The default scaffold is fine for one service. Once you have a platform, structure the chart so that intent is obvious and overrides are predictable.

myapp/
  Chart.yaml
  values.yaml            # documented defaults, every key present
  values.schema.json     # contract for what callers may pass
  templates/
    _helpers.tpl         # named templates (fullname, labels, selectors)
    deployment.yaml
    service.yaml
    serviceaccount.yaml
    NOTES.txt
  charts/                # vendored dependencies (helm dependency build)
  ci/                    # values files used only by chart-testing
    default-values.yaml
    ha-values.yaml

Two rules carry most of the weight. First, every value your templates read must appear in values.yaml with a sane default and a comment — even if the default is {} or "". An undocumented value is a bug waiting for a 2 a.m. page. Second, keep templates/ free of business logic that belongs in helpers; a template should read like a manifest, not a program.
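The first rule can be checked mechanically. The following is a minimal sketch in Python, not a Helm feature: it scans template text for .Values references with a regular expression and reports any path missing from the defaults. The function names and sample inputs are hypothetical, the defaults are passed as an already-parsed dict, and the regex is a heuristic that will miss values reached through index, with, or range scopes.

```python
import re

# Matches references such as .Values.image.repository inside template text.
VALUES_REF = re.compile(r"\.Values\.([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")


def has_path(defaults: dict, dotted: str) -> bool:
    """Return True if every segment of a dotted path exists in the nested defaults."""
    node = defaults
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


def undocumented_values(template_text: str, defaults: dict) -> list:
    """List .Values paths that the template reads but the defaults do not declare."""
    refs = sorted(set(VALUES_REF.findall(template_text)))
    return [ref for ref in refs if not has_path(defaults, ref)]


# Hypothetical inputs: a template fragment and the parsed contents of values.yaml.
template = """
replicas: {{ .Values.replicaCount }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
nodeSelector: {{ toYaml .Values.nodeSelector | nindent 2 }}
"""
defaults = {"replicaCount": 1, "image": {"repository": "nginx", "tag": ""}}

missing = undocumented_values(template, defaults)
print(missing)
```

Here nodeSelector is reported because the template reads it while the defaults never declare it; an empty-string default such as image.tag still counts as documented.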

Use helm create once to remember the layout, then delete the generated boilerplate. The scaffolded values.yaml ships opinions (a specific autoscaling block, a sample ingress) you almost certainly do not want as your defaults.

2. DRY templating with named templates and library charts

Named templates (defined with define in _helpers.tpl) are your first lever against duplication. The canonical pair is a name helper and a labels helper:

{{/* templates/_helpers.tpl */}}
{{/* Chart name, overridable via nameOverride; the labels helper below includes it. */}}
{{- define "myapp.name" -}}
{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" -}}
{{- end -}}

{{- define "myapp.fullname" -}}
{{- $name := default .Chart.Name .Values.nameOverride -}}
{{- if .Values.fullnameOverride -}}
{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" -}}
{{- else -}}
{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" -}}
{{- end -}}
{{- end -}}

{{- define "myapp.labels" -}}
helm.sh/chart: {{ printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }}
app.kubernetes.io/name: {{ include "myapp.name" . }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- if .Chart.AppVersion }}
app.kubernetes.io/version: {{ .Chart.AppVersion | quote }}
{{- end }}
app.kubernetes.io/managed-by: {{ .Release.Service }}
{{- end -}}

The trunc 63 is not cosmetic: Kubernetes label values and many resource names are capped at 63 characters, and a long release name will otherwise produce an invalid object that the API server rejects.
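To see why trunc 63 is paired with trimSuffix "-", here is a small Python model of the myapp.fullname helper above. It is an illustration only: the function names are hypothetical, and it assumes trimSuffix drops a single trailing occurrence of its argument.

```python
MAX_NAME_LEN = 63  # DNS label limit that Kubernetes applies to many names and label values


def trim_suffix(value: str, suffix: str) -> str:
    """Drop one trailing occurrence of suffix, if present."""
    return value[: -len(suffix)] if suffix and value.endswith(suffix) else value


def fullname(release: str, chart: str, name_override: str = "", fullname_override: str = "") -> str:
    """Python model of the myapp.fullname helper: override, else release-name."""
    if fullname_override:
        candidate = fullname_override
    else:
        candidate = f"{release}-{name_override or chart}"
    # Cut to the limit first, then remove a hyphen the cut may have left dangling.
    return trim_suffix(candidate[:MAX_NAME_LEN], "-")


# A release name long enough that the cut lands right after a hyphen.
long_release = "payments-" * 7  # exactly 63 characters, ends with "-"
name = fullname(long_release, "myapp")
print(len(name), name.endswith("-"))
print(fullname("prod", "myapp"))
```

Without the trimSuffix step the truncated name would end in a hyphen, which is not a valid final character for a DNS label.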

When the same helpers need to be shared across many charts, promote them into a library chart. A library chart sets type: library in Chart.yaml, ships only templates/ with define blocks (no rendered manifests), and is consumed as a dependency. The key behavioral difference: Helm does not render a library chart’s templates directly, so it never emits objects on its own — it only exposes named templates.

# common/Chart.yaml
apiVersion: v2
name: common
type: library
version: 1.4.0

# myapp/Chart.yaml
dependencies:
  - name: common
    version: "1.4.0"
    repository: "oci://ghcr.io/myorg/charts"

A widely used pattern is to have the library define a full resource (say, a Deployment) as a named template and let each application chart merge its own overrides on top. Even at a smaller scale, centralizing just your labels, selectorLabels, and image-reference helpers in a library chart eliminates the most common source of drift across a fleet.

3. Validate inputs with values.schema.json

A values.schema.json file at the chart root is validated by Helm automatically on install, upgrade, lint, and template. It is plain JSON Schema (Draft 7 era), and it is the single highest-leverage reliability improvement you can make to a chart: bad config fails at render time with a clear message instead of producing a broken Deployment.

{
  "$schema": "https://json-schema.org/draft-07/schema#",
  "type": "object",
  "required": ["image", "replicaCount"],
  "properties": {
    "replicaCount": {
      "type": "integer",
      "minimum": 1
    },
    "image": {
      "type": "object",
      "required": ["repository"],
      "properties": {
        "repository": { "type": "string", "minLength": 1 },
        "tag": { "type": "string" },
        "pullPolicy": {
          "type": "string",
          "enum": ["Always", "IfNotPresent", "Never"]
        }
      },
      "additionalProperties": false
    },
    "service": {
      "type": "object",
      "properties": {
        "type": {
          "type": "string",
          "enum": ["ClusterIP", "NodePort", "LoadBalancer"]
        },
        "port": { "type": "integer", "minimum": 1, "maximum": 65535 }
      }
    }
  }
}

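Two things are worth internalizing. JSON Schema is strong on structure and types but a poor fit for cross-field business rules such as “if autoscaling.enabled then replicaCount is ignored”: draft-07’s if/then/else can encode some of them, but the resulting schema tends to be hard to read and its error messages unhelpful. For those rules, fail explicitly inside templates with required and fail: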

{{- if and .Values.ingress.enabled (not .Values.ingress.className) }}
{{- fail "ingress.enabled=true requires ingress.className" }}
{{- end }}
{{- $repo := required "image.repository is required" .Values.image.repository }}

Also note that additionalProperties: false is strict — it will reject a typo’d key like imagePullPolcy, which is exactly what you want, but it means a caller cannot smuggle in extra keys. Apply it deliberately at the leaf objects you fully control, and be more permissive at the top level if your chart intentionally accepts pass-through blocks.
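The effect of the image sub-schema can be approximated in a few lines of Python. This is a hand-rolled model of the rules shown earlier in this section (required, minLength, enum, additionalProperties: false), not a JSON Schema engine and not what Helm runs internally; the function name and error strings are invented for illustration.

```python
ALLOWED_PULL_POLICIES = {"Always", "IfNotPresent", "Never"}
IMAGE_KEYS = {"repository", "tag", "pullPolicy"}


def image_errors(image: dict) -> list:
    """Return the violations of the modelled `image` rules, empty if it passes."""
    errors = []
    # "required": ["repository"] together with "minLength": 1
    repo = image.get("repository")
    if not isinstance(repo, str) or not repo:
        errors.append("image.repository: required non-empty string")
    # "enum" on pullPolicy, checked only when the key is present
    if "pullPolicy" in image and image["pullPolicy"] not in ALLOWED_PULL_POLICIES:
        errors.append(f"image.pullPolicy: {image['pullPolicy']!r} is not an allowed value")
    # "additionalProperties": false rejects any key the schema does not name
    for key in sorted(set(image) - IMAGE_KEYS):
        errors.append(f"image.{key}: additional property not allowed")
    return errors


good = {"repository": "ghcr.io/myorg/myapp", "pullPolicy": "IfNotPresent"}
typo = {"repository": "ghcr.io/myorg/myapp", "imagePullPolcy": "Always"}

print(image_errors(good))
print(image_errors(typo))
```

The misspelled key is caught only because unknown keys are rejected; without that rule it would be silently ignored and the chart would fall back to the default pull policy.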

4. Dependencies, subcharts, and global values

Declare dependencies in Chart.yaml and lock them. helm dependency update resolves version ranges and writes Chart.lock; commit that lock file so CI and production resolve the same pinned dependency versions.

helm dependency update ./myapp     # resolves + writes Chart.lock + populates charts/
helm dependency build ./myapp      # rebuilds charts/ from an existing Chart.lock

Use condition and tags to make optional dependencies toggleable without editing Chart.yaml:

dependencies:
  - name: postgresql
    version: "15.5.x"
    repository: "oci://registry-1.docker.io/bitnamicharts"
    condition: postgresql.enabled

The subtlety that bites people is the global scope. Values under .Values.global are visible to the parent chart and every subchart, which makes globals perfect for cross-cutting settings (image registry mirror, image pull secrets, environment name) and dangerous for anything else. A parent can also override a subchart’s values by nesting them under the subchart’s name:

# parent values.yaml
global:
  imageRegistry: registry.internal.example.com
postgresql:            # overrides into the postgresql subchart
  primary:
    persistence:
      size: 50Gi

Resist the urge to push everything into global “just in case.” Globals are an implicit API across all subcharts; once a subchart starts reading one, removing it is a breaking change you cannot see from the parent.
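The scoping rules above can be modelled as a merge. The sketch below is a simplified Python approximation, not Helm's implementation: the helper names are hypothetical, the subchart defaults are made up, and details such as null values deleting keys are left out.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override applied; nested maps merge, other values replace."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def subchart_view(parent_values: dict, subchart: str, subchart_defaults: dict) -> dict:
    """Approximate the values a subchart sees: its own defaults, the parent's
    overrides nested under its name, and the shared global block."""
    view = deep_merge(subchart_defaults, parent_values.get(subchart, {}))
    view["global"] = deep_merge(
        subchart_defaults.get("global", {}), parent_values.get("global", {})
    )
    return view


parent = {
    "global": {"imageRegistry": "registry.internal.example.com"},
    "replicaCount": 2,  # parent-only key, not visible to the subchart
    "postgresql": {"primary": {"persistence": {"size": "50Gi"}}},
}
# Hypothetical defaults shipped in the subchart's own values.yaml.
pg_defaults = {"primary": {"persistence": {"size": "8Gi", "enabled": True}}}

view = subchart_view(parent, "postgresql", pg_defaults)
print(view)
```

The subchart sees the overridden size, keeps its own enabled default, receives the global block, and never sees the parent's replicaCount. That asymmetry is why a global behaves like a shared API while ordinary keys stay private.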

5. Hooks, ordering, and when not to use them

Helm hooks let you run resources at lifecycle points (pre-install, post-install, pre-upgrade, post-delete, and so on), ordered within a phase by helm.sh/hook-weight (lower runs first). The classic use is a schema migration Job before an upgrade.

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "myapp.fullname" . }}-migrate
  annotations:
    "helm.sh/hook": pre-upgrade,pre-install
    "helm.sh/hook-weight": "-5"
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          command: ["/app/migrate", "up"]

The critical caveat: hook resources are not tracked as part of the release. Helm creates them out-of-band and does not manage their lifecycle the way it does normal manifests, which is why you set an explicit hook-delete-policy. A failed hook also does not auto-rollback unless you pass --atomic. Reach for hooks when you genuinely need lifecycle ordering — migrations, one-shot setup — and avoid them for anything that should be a first-class, reconciled part of the release. If a Job needs to keep existing, model it as a normal resource, not a hook.
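The weight ordering described at the start of this section is easy to get wrong because weights are written as quoted strings but compared as numbers. A minimal Python model follows, assuming a default weight of 0 and a tie-break by name; it ignores the resource-kind ordering Helm also applies, and the hook names are hypothetical.

```python
def hook_order(hooks: list) -> list:
    """Order hooks for one lifecycle phase: numeric weight ascending, then name.

    Weights arrive as annotation strings, so they are converted to int first;
    a missing weight counts as 0.
    """
    return sorted(hooks, key=lambda h: (int(h.get("weight", "0")), h["name"]))


# Hypothetical pre-upgrade hooks for one release.
hooks = [
    {"name": "warm-cache", "weight": "10"},
    {"name": "migrate", "weight": "-5"},
    {"name": "backup", "weight": "-5"},
    {"name": "notify"},  # no weight annotation
]

ordered = [h["name"] for h in hook_order(hooks)]
print(ordered)

# Sorting the raw strings instead would place "10" before "2".
print(sorted(["2", "10", "-5"]))
```

Negative weights are the usual way to push a migration ahead of hooks that rely on the default of 0.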

6. Testing: lint, unit tests, and chart-testing

Layer three independent checks; each catches a different class of failure.

helm lint validates chart structure, runs schema validation, and surfaces obvious template errors. Pass --strict to turn warnings into failures in CI:

helm lint ./myapp --strict --values ./myapp/ci/ha-values.yaml

Unit snapshots with the helm-unittest plugin assert that specific rendered output matches expectations, so a careless template edit that shifts a label or drops a probe fails loudly. Tests live in tests/ and run against the rendered templates:

# myapp/tests/deployment_test.yaml
suite: deployment
templates:
  - deployment.yaml
tests:
  - it: sets the replica count from values
    set:
      replicaCount: 3
    asserts:
      - equal:
          path: spec.replicas
          value: 3
  - it: renders a probe on the main container
    asserts:
      - isNotNull:
          path: spec.template.spec.containers[0].livenessProbe

helm plugin install https://github.com/helm-unittest/helm-unittest
helm unittest ./myapp

chart-testing (the ct tool) is what ties it together in CI: it lints changed charts, validates that the chart version was bumped, and can install each changed chart into an ephemeral cluster (kind works well) to confirm it actually deploys. The ci/*-values.yaml files give ct multiple realistic configurations to exercise.

ct lint --target-branch main --chart-dirs charts
ct install --target-branch main --chart-dirs charts
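The version-bump check amounts to comparing the chart version on the target branch with the one in the pull request. The Python sketch below illustrates that comparison under simplifying assumptions: it looks only at the MAJOR.MINOR.PATCH core and ignores pre-release and build metadata, and the function names are hypothetical rather than part of ct.

```python
import re

SEMVER_CORE = re.compile(r"^(\d+)\.(\d+)\.(\d+)")


def parse(version: str) -> tuple:
    """Parse the MAJOR.MINOR.PATCH core of a version into a tuple of ints."""
    match = SEMVER_CORE.match(version)
    if not match:
        raise ValueError(f"not a semantic version: {version!r}")
    return tuple(int(part) for part in match.groups())


def version_bumped(base_version: str, pr_version: str) -> bool:
    """True when the PR's chart version is strictly greater than the base branch's."""
    return parse(pr_version) > parse(base_version)


print(version_bumped("1.2.0", "1.2.1"))
print(version_bumped("1.2.0", "1.2.0"))
print(version_bumped("1.9.0", "1.10.0"))
```

Comparing integer tuples rather than strings matters for the last case: as plain text, "1.10.0" sorts before "1.9.0".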

A minimal GitHub Actions job wiring this up against a kind cluster:

name: chart-ci
on: pull_request
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0          # ct needs history to diff against the base
      - uses: azure/setup-helm@v4
      - uses: helm/chart-testing-action@v2
      - name: Lint changed charts
        run: ct lint --target-branch ${{ github.event.repository.default_branch }}
      - uses: helm/kind-action@v1
      - name: Install changed charts
        run: ct install --target-branch ${{ github.event.repository.default_branch }}

7. Packaging and distribution via OCI

Helm 3 treats OCI registries as a first-class distribution channel, so you can store charts next to your images. Package, push, and pull use the registry directly — no separate chart repo index to maintain.

helm package ./myapp                       # produces myapp-1.2.0.tgz
helm push myapp-1.2.0.tgz oci://ghcr.io/myorg/charts
helm pull oci://ghcr.io/myorg/charts/myapp --version 1.2.0
helm install myapp oci://ghcr.io/myorg/charts/myapp --version 1.2.0

For supply-chain integrity, Helm supports provenance files. helm package --sign produces a .prov file alongside the .tgz, and helm verify (or helm install --verify) checks the signature against your keyring.

helm package ./myapp --sign --key 'platform-team' --keyring ~/.gnupg/secring.gpg
helm verify myapp-1.2.0.tgz                 # validates the .prov signature

Many teams now also sign the pushed OCI artifact with cosign in addition to Helm’s PGP provenance. The two are complementary: PGP provenance proves the chart contents, cosign attaches a signature to the registry artifact and integrates with admission policy. Pick at least one and enforce it.
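As an illustration of what "proves the chart contents" means: a provenance file records a SHA-256 digest of the packaged archive, so any change to the .tgz after signing no longer matches. The Python sketch below computes such a digest over stand-in bytes rather than a real chart package; it does not perform the PGP signature check, and the helper name is hypothetical.

```python
import hashlib
import os
import tempfile


def chart_digest(path: str) -> str:
    """Return the archive's SHA-256 in 'sha256:<hex>' form."""
    sha = hashlib.sha256()
    with open(path, "rb") as archive:
        for chunk in iter(lambda: archive.read(65536), b""):
            sha.update(chunk)
    return "sha256:" + sha.hexdigest()


# Stand-in bytes instead of a real chart package, so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as workdir:
    package = os.path.join(workdir, "myapp-1.2.0.tgz")
    with open(package, "wb") as handle:
        handle.write(b"not a real chart archive")
    recorded = chart_digest(package)  # the value a signing step would record

    with open(package, "ab") as handle:
        handle.write(b"tampered")  # simulate modification after signing
    tampered = chart_digest(package) != recorded

print(recorded)
print(tampered)
```

A digest alone only detects change; the signature over the provenance file is what ties that digest to a trusted key.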

8. Upgrade safety: diff, atomic, and CRDs

Before any production upgrade, render the change, not just the new state. The helm diff plugin shows exactly what will mutate:

helm plugin install https://github.com/databus23/helm-diff
helm diff upgrade myapp oci://ghcr.io/myorg/charts/myapp --version 1.2.0 -f prod-values.yaml

Run upgrades with --atomic --timeout. With --atomic, a failed upgrade automatically rolls back to the prior revision instead of leaving the release wedged half-applied:

helm upgrade myapp oci://ghcr.io/myorg/charts/myapp \
  --version 1.2.0 -f prod-values.yaml \
  --atomic --timeout 5m

CRDs are the sharpest edge in Helm. Files in a chart’s special crds/ directory are installed before the rest of the chart, but Helm never upgrades or deletes them — this is deliberate, to avoid destroying custom resources cluster-wide. The practical consequence: shipping a new CRD version inside crds/ will not update an existing CRD. Manage CRD lifecycle explicitly, typically by applying CRD updates with kubectl apply as a separate, deliberate step outside the normal chart upgrade.

Enterprise scenario

A platform team running ~40 service charts off a shared common library shipped a “harmless” fix: renaming the selector helper from common.selectorLabels to common.matchLabels and bumping the library to 2.0.0. Lint passed, unit snapshots passed, ct install into kind passed — every check was green. The first production helm upgrade failed with Deployment.apps "checkout" is invalid: spec.selector: field is immutable. The new helper emitted a different spec.selector.matchLabels, and Kubernetes forbids mutating a Deployment’s selector after creation. Their CI only ever ran ct install on a clean cluster, so it never exercised the upgrade path where the immutability rule lives.

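The fix had two parts. First, they froze selector labels as a contract: the set of labels emitted by the library’s common.selectorLabels may no longer change at all, since adding a key mutates the selector just as renaming one does. A unit test asserts this and fails if the rendered key set changes.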

# common/tests/selector_test.yaml
suite: selector contract
tests:
  - it: selector labels are frozen (immutable contract)
    template: deployment.yaml
    asserts:
      - equal:
          path: spec.selector.matchLabels
          value:
            app.kubernetes.io/name: checkout
            app.kubernetes.io/instance: RELEASE-NAME
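The same contract can be expressed as a comparison between the selector already in the cluster and the one a pull request renders. This Python sketch is illustrative only, with hypothetical names and sample labels; it shows that an added key counts as a change just like a removed or edited one.

```python
def selector_changes(deployed: dict, rendered: dict) -> list:
    """Describe every difference between two matchLabels maps.

    Any non-empty result means the upgrade would try to mutate an immutable field.
    """
    changes = []
    for key in sorted(set(deployed) | set(rendered)):
        if key not in rendered:
            changes.append(f"removed {key}")
        elif key not in deployed:
            changes.append(f"added {key}")
        elif deployed[key] != rendered[key]:
            changes.append(f"changed {key}: {deployed[key]!r} -> {rendered[key]!r}")
    return changes


# Hypothetical renders: the selector already deployed vs. the PR's output.
deployed = {"app.kubernetes.io/name": "checkout", "app.kubernetes.io/instance": "prod"}
appended = dict(deployed, **{"app.kubernetes.io/component": "api"})

print(selector_changes(deployed, deployed))
print(selector_changes(deployed, appended))
```

An empty list is the only safe result for an in-place upgrade of an existing Deployment.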

Second, they added an upgrade gate to ct so CI installs the chart, then upgrades over it before tearing down:

# ct.yaml
upgrade: true

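ct install --upgrade deploys the chart’s previous revision from the target branch first, then upgrades to the PR’s version, catching exactly the immutable-field class of break that a from-scratch install hides. One caveat: ct skips the upgrade test when the chart’s own version bump is SemVer-breaking (a major bump), so a major release still needs a deliberate upgrade rehearsal. The lesson: green local renders prove a chart installs; only an upgrade-over-previous test proves it upgrades.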

Verify

Run these against a chart before you trust it:

# 1. Schema + lint pass cleanly, strictly
helm lint ./myapp --strict

# 2. Templates render with defaults AND with a real prod values file
helm template myapp ./myapp > /tmp/rendered-defaults.yaml
helm template myapp ./myapp -f prod-values.yaml > /tmp/rendered.yaml
test -s /tmp/rendered-defaults.yaml && test -s /tmp/rendered.yaml && echo "rendered OK"

# 3. Bad input is rejected by the schema (expect a non-zero exit)
helm template myapp ./myapp --set replicaCount=0 ; echo "exit=$?"

# 4. Unit snapshots pass
helm unittest ./myapp

# 5. The rendered output is valid against the live API (dry run)
helm install myapp ./myapp --dry-run=server -f prod-values.yaml

--dry-run=server is meaningfully stronger than the default client dry run: it sends the manifests to the API server for validation (including admission), catching errors a purely local render misses.

Next step: pull your label, selector, and image helpers into a type: library chart, version it, and publish it to your OCI registry. Once every service chart depends on the same library, fixing a labeling bug is one release instead of twenty pull requests.
