Most Terraform “tests” are a plan someone eyeballed before clicking apply. That catches syntax errors and nothing else. Real confidence comes from a layered suite: cheap checks that run on every save, integration tests that stand up throwaway infrastructure and assert it actually works, and policy gates that block non-compliant changes before they reach an environment. This guide builds that suite and wires it into CI.
The Terraform test pyramid
Think in layers, cheapest and fastest at the bottom. Each layer catches a different class of failure, and you run more of the cheap ones.
| Layer | Tool | Speed | What it catches |
|---|---|---|---|
| Validate | terraform validate, fmt, tflint | < 1s | Syntax, style, deprecated usage |
| Plan assertions | terraform test (command: plan) | seconds | Wrong attributes, bad conditionals, var wiring |
| Unit | terraform test + mocked providers | sub-second | Module logic without touching a cloud |
| Integration | Terratest (Go) | minutes | Real provisioning, real behavior |
| Policy | OPA/Conftest or Sentinel on plan JSON | seconds | Compliance, security, cost guardrails |
The discipline that matters: anything testable without a cloud account belongs in the bottom three layers. Reserve the slow, money-spending integration layer for the handful of behaviors you genuinely cannot verify any other way (a load balancer actually serves traffic, an IAM policy actually denies an action).
Native testing with the terraform test framework
Since Terraform 1.6, terraform test is built into the CLI. Tests live in .tftest.hcl files. Each file contains run blocks; each run executes a plan or apply against your configuration and evaluates assert blocks. By default Terraform looks in the working directory and a tests/ subdirectory.
Here is a plan-only test for a module that derives a storage account name and tags. command = plan means nothing is created, so it is fast and free.
# tests/naming.tftest.hcl
variables {
  project     = "kloudvin"
  environment = "dev"
  location    = "eastus"
}

run "derives_storage_account_name" {
  command = plan

  assert {
    condition     = output.storage_account_name == "stkloudvindev"
    error_message = "Storage account name not derived correctly"
  }
}

run "applies_required_tags" {
  command = plan

  assert {
    condition     = output.tags["environment"] == "dev"
    error_message = "environment tag missing or wrong"
  }
}
Run it:
terraform init
terraform test
You can override variables per run block, and you can validate that bad input is rejected. The expect_failures argument asserts that a specific resource or variable validation check fails — useful for proving your validation blocks and preconditions work.
run "rejects_invalid_environment" {
  command = plan

  variables {
    environment = "not-a-real-env"
  }

  expect_failures = [
    var.environment,
  ]
}
Plan-time assertions only see known values. Attributes computed by the provider at apply time (an assigned IP, a generated ID) show up as unknown during plan, so you cannot assert on them with command = plan. Use a mocked provider or an apply run for those.
Module unit tests with mocked providers
Mock providers (Terraform 1.7+) let a run block execute as if real infrastructure were created, but every provider call returns generated fake data. No credentials, no network, sub-second runs. This is how you unit-test module logic — count math, for_each keys, conditional resource creation — at the speed of a plan.
Declare the mock in the test file with mock_provider. You can pin specific attributes with override_resource (or override_data for data sources) so assertions are deterministic instead of relying on randomly generated mock values.
# tests/subnets.tftest.hcl
mock_provider "azurerm" {}

variables {
  vnet_cidr    = "10.10.0.0/16"
  subnet_count = 3
}

run "creates_expected_subnet_count" {
  command = plan

  assert {
    condition     = length(azurerm_subnet.this) == 3
    error_message = "Expected 3 subnets to be planned"
  }
}

run "overridden_id_is_stable" {
  command = apply

  override_resource {
    target = azurerm_virtual_network.this
    values = {
      id = "/subscriptions/0000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/vnet"
    }
  }

  assert {
    condition     = output.vnet_id == "/subscriptions/0000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/vnet"
    error_message = "vnet_id output did not match overridden value"
  }
}
Because the provider is mocked, command = apply here never calls Azure — it walks the graph and produces planned/overridden values. That gives you apply-time outputs (which expose computed attributes) without the cost or latency. Mock everything you can; it keeps the feedback loop tight enough to run on every file save.
Integration testing with Terratest
Some things only a real cloud can tell you. Terratest is a Go library that runs your actual Terraform, then queries the live resources and asserts on them, then tears everything down. The pattern is always the same: InitAndApply, assert, defer Destroy.
Put tests in a test/ directory as a Go module. A minimal integration test against an example fixture:
// test/network_test.go
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestHubSpokeNetwork(t *testing.T) {
	t.Parallel()

	opts := &terraform.Options{
		TerraformDir: "../examples/hub-spoke",
		Vars: map[string]interface{}{
			"environment": "test",
			"location":    "eastus",
		},
	}

	defer terraform.Destroy(t, opts)
	terraform.InitAndApply(t, opts)

	vnetID := terraform.Output(t, opts, "vnet_id")
	assert.NotEmpty(t, vnetID)

	subnetIDs := terraform.OutputList(t, opts, "subnet_ids")
	assert.Len(t, subnetIDs, 3)
}
The defer terraform.Destroy runs even if an assertion fails, which is what keeps you from leaking resources. Always isolate state — give each run a unique workspace or backend key, and randomize resource names so parallel runs never collide:
import "github.com/gruntwork-io/terratest/modules/random"
uniqueID := random.UniqueId()
opts.Vars["name_suffix"] = uniqueID
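To make the isolation idea concrete without depending on Terratest, here is a minimal standard-library sketch of generating a per-run suffix and a per-run state key. The function names and the `terratest/<test>/<suffix>.tfstate` key layout are assumptions for illustration, not a convention of Terraform or Terratest.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// Lowercase letters and digits only, so the suffix is safe in names that
// forbid uppercase or punctuation (storage account names, for example).
const alphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// uniqueSuffix returns n random characters drawn from alphabet.
// The modulo introduces a slight bias, which is acceptable for test names.
func uniqueSuffix(n int) (string, error) {
	buf := make([]byte, n)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	for i, b := range buf {
		buf[i] = alphabet[int(b)%len(alphabet)]
	}
	return string(buf), nil
}

// stateKey builds a backend key that is unique per test and per run.
// The layout is hypothetical; adapt it to your backend.
func stateKey(testName, suffix string) string {
	return fmt.Sprintf("terratest/%s/%s.tfstate", testName, suffix)
}

func main() {
	suffix, err := uniqueSuffix(6)
	if err != nil {
		panic(err)
	}
	fmt.Println("name_suffix:", suffix)
	fmt.Println("backend key:", stateKey("hub-spoke", suffix))
}
```

In a Terratest suite the suffix would go into `opts.Vars` as shown above, and the key would typically be passed through the `BackendConfig` map on `terraform.Options`, assuming your fixture uses a remote backend that accepts a `key` setting.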
Run the suite with the standard Go test runner. Integration tests are slow, so give them a generous timeout — Go’s default is 10 minutes and will kill a half-finished apply mid-flight, orphaning resources.
cd test
go test -v -timeout 45m -run TestHubSpokeNetwork
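The timeout arithmetic is simple enough to check in code. The sketch below is an assumption-laden illustration: `enoughTimeLeft` is a hypothetical helper, and the 20-minute apply-plus-destroy figure is made up. In a real test the deadline would come from `t.Deadline()` (available in the standard `testing` package), and you would skip or fail before calling apply if the budget is too small.

```go
package main

import (
	"fmt"
	"time"
)

const (
	goDefaultTimeout = 10 * time.Minute // go test's default when -timeout is not set
	suiteTimeout     = 45 * time.Minute // the value passed with -timeout in this guide
	applyPlusDestroy = 20 * time.Minute // assumed worst case for one fixture
)

// demoStart is a fixed instant so the example is deterministic.
var demoStart = time.Date(2024, 1, 1, 9, 0, 0, 0, time.UTC)

// enoughTimeLeft reports whether a step needing `need` can finish
// before the test binary's deadline.
func enoughTimeLeft(deadline, now time.Time, need time.Duration) bool {
	return deadline.Sub(now) >= need
}

func main() {
	// With the default timeout, a 20-minute apply+destroy cannot fit.
	fmt.Println("default timeout ok:",
		enoughTimeLeft(demoStart.Add(goDefaultTimeout), demoStart, applyPlusDestroy))
	// With -timeout 45m it can.
	fmt.Println("45m timeout ok:",
		enoughTimeLeft(demoStart.Add(suiteTimeout), demoStart, applyPlusDestroy))
}
```

Checking the budget up front turns "killed mid-apply with orphaned resources" into a clean early failure that creates nothing.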
Asserting real behavior
Checking that an output is non-empty is weak. The point of paying for real infrastructure is to verify it behaves. Terratest ships helpers for exactly this.
For an HTTP endpoint, retry until it is healthy — freshly provisioned resources are rarely ready the instant apply returns:
import (
	"time"

	// The package is named http_helper; the alias makes that explicit.
	http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
)

url := terraform.Output(t, opts, "app_url")

http_helper.HttpGetWithRetry(
	t, url, nil,
	200, "OK",
	30,             // retries
	10*time.Second, // sleep between retries
)
For anything past a simple GET, drop to the cloud SDK and assert on the resource directly. Terratest has provider modules (for example modules/azure, modules/aws), or you can call the vendor SDK yourself. The retry/backoff helpers generalize to any flaky check:
import "github.com/gruntwork-io/terratest/modules/retry"

retry.DoWithRetry(t, "wait for blob to be readable", 20, 15*time.Second,
	func() (string, error) {
		// call the storage SDK; return an error to trigger a retry
		return checkBlobExists(storageAccount, container, blob)
	},
)
The rule: never time.Sleep and hope. Always poll with a bounded retry so a slow-but-eventually-correct resource passes and a genuinely broken one fails fast.
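For readers who want to see what "bounded retry" means mechanically, here is a small standard-library sketch of the idea. It is not Terratest's implementation; `pollUntil` and `errNotReady` are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady stands in for "the resource exists but is not serving yet".
var errNotReady = errors.New("not ready yet")

// pollUntil calls check up to maxAttempts times, sleeping between failed
// attempts. It returns nil on the first success, or an error wrapping the
// last failure once the attempts are used up.
func pollUntil(maxAttempts int, sleep time.Duration, check func() error) error {
	if maxAttempts < 1 {
		return errors.New("maxAttempts must be at least 1")
	}
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if lastErr = check(); lastErr == nil {
			return nil
		}
		if attempt < maxAttempts {
			time.Sleep(sleep) // no sleep after the final attempt
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	calls := 0
	// Simulated check: fails twice, then succeeds.
	err := pollUntil(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errNotReady
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

The upper bound on total wait is roughly `maxAttempts` times the sleep plus the time each check takes, which is the number to keep comfortably below the Go test timeout.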
Policy-as-code gates on the plan JSON
Tests verify behavior; policy verifies intent — “no public storage,” “every resource is tagged with a cost center,” “no VM larger than this SKU.” Both OPA/Conftest and HashiCorp Sentinel evaluate a machine-readable plan, so you gate the change before it is applied.
Generate the plan JSON once and feed it to your policy engine:
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
A Conftest policy in Rego that fails the plan if any storage account allows public access. resource_changes is the stable, documented surface of the plan JSON — walk it rather than the internal planned_values tree.
# policy/storage.rego
package main

# rego.v1 opts in to the `contains`/`if` rule syntax that OPA 1.0 made the default.
import rego.v1

deny contains msg if {
	resource := input.resource_changes[_]
	resource.type == "azurerm_storage_account"
	resource.change.after.public_network_access_enabled == true
	msg := sprintf("Storage account '%s' must not allow public network access", [resource.address])
}
Run it as a gate — a non-zero exit code fails the pipeline:
conftest test tfplan.json --policy policy/
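To show the shape of the data the policy walks, here is a standard-library Go sketch that applies the same rule to a plan document. It is an illustration of the `resource_changes` structure, not a replacement for Conftest; the sample JSON is hand-written and heavily trimmed, and the resource addresses in it are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan models only the fields this check needs from the plan JSON.
type plan struct {
	ResourceChanges []resourceChange `json:"resource_changes"`
}

type resourceChange struct {
	Address string `json:"address"`
	Type    string `json:"type"`
	Change  struct {
		// After is null for deletions, which leaves the map nil.
		After map[string]any `json:"after"`
	} `json:"change"`
}

// publicStorageViolations returns the addresses of storage accounts whose
// planned state has public_network_access_enabled set to true.
func publicStorageViolations(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, err
	}
	var bad []string
	for _, rc := range p.ResourceChanges {
		if rc.Type != "azurerm_storage_account" {
			continue
		}
		enabled, ok := rc.Change.After["public_network_access_enabled"].(bool)
		if ok && enabled {
			bad = append(bad, rc.Address)
		}
	}
	return bad, nil
}

// samplePlan is a hand-written, trimmed stand-in for tfplan.json.
const samplePlan = `{
  "resource_changes": [
    {"address": "azurerm_storage_account.open", "type": "azurerm_storage_account",
     "change": {"actions": ["create"], "after": {"public_network_access_enabled": true}}},
    {"address": "azurerm_storage_account.closed", "type": "azurerm_storage_account",
     "change": {"actions": ["create"], "after": {"public_network_access_enabled": false}}},
    {"address": "azurerm_storage_account.old", "type": "azurerm_storage_account",
     "change": {"actions": ["delete"], "after": null}}
  ]
}`

func main() {
	bad, err := publicStorageViolations([]byte(samplePlan))
	if err != nil {
		panic(err)
	}
	for _, addr := range bad {
		fmt.Printf("Storage account '%s' must not allow public network access\n", addr)
	}
}
```

A fixture like `samplePlan` is also a cheap way to unit-test a policy: feed known-good and known-bad plans to the policy engine and assert on the verdict.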
If you are on Terraform Cloud/Enterprise, Sentinel does the same job with tfplan/v2 imports and is enforced by the platform rather than a CLI step. Pick one per organization; running both is rarely worth the maintenance.
Wiring the suite into CI
Run the layers in order of cost. Fail fast on the cheap ones so you never spend cloud minutes on a change that a linter could have rejected. Here is a GitHub Actions workflow that stages validate -> native test -> policy -> integration.
# .github/workflows/terraform-test.yml
name: terraform-test
on:
  pull_request:

jobs:
  validate-and-unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform init -backend=false
      - run: terraform validate
      - run: terraform test   # native unit/plan tests with mocks

  policy:
    needs: validate-and-unit
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: |
          terraform init -backend=false
          terraform plan -out=tfplan.binary
          terraform show -json tfplan.binary > tfplan.json
      - uses: open-policy-agent/setup-conftest@v2
      - run: conftest test tfplan.json --policy policy/

  integration:
    needs: policy
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # OIDC: short-lived cloud credentials, no static secrets
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
      - run: cd test && go test -v -timeout 45m -parallel 4 ./...
Three things worth calling out:
- Ephemeral credentials. Use OIDC federation (id-token: write plus azure/login, or aws-actions/configure-aws-credentials) so the runner gets short-lived tokens scoped to a sandbox subscription. Never put long-lived cloud keys in CI secrets.
- Parallelism. t.Parallel() in Go plus -parallel N lets independent tests run concurrently; combined with randomized names and isolated state, this cuts wall-clock time dramatically. Make sure your sandbox has the quota and that names truly cannot collide.
- Cost estimation. Add an Infracost step on the plan JSON to comment the price delta on the PR. It is a guardrail against an integration test that quietly spins up something expensive, and a useful review signal in its own right.
Verify
Confirm each layer actually runs and fails when it should:
# Native tests pass on good input
terraform test
# A deliberately broken assertion should make `terraform test` exit non-zero
terraform test || echo "native tests failed (expected for the broken case)"
# Policy catches a violation: temporarily set public access true, then:
conftest test tfplan.json --policy policy/ # expect a deny + non-zero exit
# Integration test stands up and tears down cleanly
cd test && go test -v -timeout 45m -run TestHubSpokeNetwork
After an integration run, the cloud account should contain zero leftover resources. List by your test tag or naming prefix and confirm the set is empty — that is the real proof your defer Destroy worked.
Pitfalls and next steps
The failure mode that erodes trust fastest is flakiness. Most of it traces to one of three causes: a test that sleeps instead of polling, parallel runs sharing state or names, and the Go test timeout killing an apply partway through (which both fails the test and orphans resources). Bound every wait with a retry, isolate state per run, and set the timeout above your slowest realistic apply-plus-destroy.
Even with disciplined defer Destroy, resources leak — a runner gets killed, a destroy hits a dependency error, a developer Ctrl-C’s a local run. Treat cleanup as a system, not a hope: tag every test resource (managed-by=terratest, a run ID, a timestamp) and run a scheduled sweeper that deletes anything matching the tag past a TTL. For very expensive fixtures, consider sharing a long-lived base environment across tests and only creating the cheap, fast-changing pieces per run — at the cost of weaker isolation, so weigh it carefully.
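The selection logic of such a sweeper is small. The sketch below shows only that part, under stated assumptions: `resource` is a hypothetical provider-neutral struct, the creation time is assumed to be available (from a timestamp tag or the cloud API), and listing and deleting resources through a real SDK is left out.

```go
package main

import (
	"fmt"
	"time"
)

// resource is a hypothetical, provider-neutral view of one cloud resource.
type resource struct {
	ID      string
	Tags    map[string]string
	Created time.Time
}

// demoTTL and demoNow are fixed values so the example is deterministic.
const demoTTL = 2 * time.Hour

var demoNow = time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)

// expired returns the IDs of resources tagged managed-by=terratest whose
// age exceeds ttl. Anything without that tag is never selected.
func expired(all []resource, now time.Time, ttl time.Duration) []string {
	var ids []string
	for _, r := range all {
		if r.Tags["managed-by"] != "terratest" {
			continue
		}
		if now.Sub(r.Created) > ttl {
			ids = append(ids, r.ID)
		}
	}
	return ids
}

// demoInventory is made-up data: one stale test resource, one fresh test
// resource, and one resource owned by someone else.
func demoInventory() []resource {
	return []resource{
		{ID: "rg-test-old", Tags: map[string]string{"managed-by": "terratest"}, Created: demoNow.Add(-3 * time.Hour)},
		{ID: "rg-test-new", Tags: map[string]string{"managed-by": "terratest"}, Created: demoNow.Add(-30 * time.Minute)},
		{ID: "rg-prod", Tags: map[string]string{"managed-by": "platform"}, Created: demoNow.Add(-72 * time.Hour)},
	}
}

func main() {
	for _, id := range expired(demoInventory(), demoNow, demoTTL) {
		fmt.Println("would delete:", id)
	}
}
```

Matching on the tag first, and only then on age, is the safety property that matters: an old resource that the test suite did not create is never a deletion candidate. Set the TTL above the test timeout so the sweeper cannot race a run that is still in progress.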
From here, the high-value extensions are contract tests for published modules (so a breaking change to an input variable fails before consumers find out), and promoting your policy bundle to a versioned, separately tested artifact rather than a folder of loose Rego files.