Testing Terraform for Real: Native terraform test, Terratest, and Policy Checks in CI

Most Terraform “tests” are a plan someone eyeballed before clicking apply. That catches syntax errors and nothing else. Real confidence comes from a layered suite: cheap checks that run on every save, integration tests that stand up throwaway infrastructure and assert it actually works, and policy gates that block non-compliant changes before they reach an environment. This guide builds that suite and wires it into CI.

The Terraform test pyramid

Think in layers, cheapest and fastest at the bottom. Each layer catches a different class of failure, and you run more of the cheap ones.

Layer	Tool	Speed	What it catches
Validate	`terraform validate`, `fmt`, `tflint`	< 1s	Syntax, style, deprecated usage
Plan assertions	`terraform test` (command: plan)	seconds	Wrong attributes, bad conditionals, var wiring
Unit	`terraform test` + mocked providers	sub-second	Module logic without touching a cloud
Integration	Terratest (Go)	minutes	Real provisioning, real behavior
Policy	OPA/Conftest or Sentinel on plan JSON	seconds	Compliance, security, cost guardrails

The discipline that matters: anything testable without a cloud account belongs in the three cheapest layers (validate, plan assertions, unit). Reserve the slow, money-spending integration layer for the handful of behaviors you genuinely cannot verify any other way (a load balancer actually serves traffic, an IAM policy actually denies an action).

Native testing with the `terraform test` framework

Since Terraform 1.6, terraform test is built into the CLI. Tests live in .tftest.hcl files. Each file contains run blocks; each run executes a plan or apply against your configuration and evaluates assert blocks. By default Terraform looks in the working directory and a tests/ subdirectory.

Here is a plan-only test for a module that derives a storage account name and tags. command = plan means nothing is created, so it is fast and free.

# tests/naming.tftest.hcl

variables {
  project     = "kloudvin"
  environment = "dev"
  location    = "eastus"
}

run "derives_storage_account_name" {
  command = plan

  assert {
    condition     = output.storage_account_name == "stkloudvindev"
    error_message = "Storage account name not derived correctly"
  }
}

run "applies_required_tags" {
  command = plan

  assert {
    condition     = output.tags["environment"] == "dev"
    error_message = "environment tag missing or wrong"
  }
}

Run it:

terraform init
terraform test

You can override variables per run block, and you can validate that bad input is rejected. The expect_failures argument asserts that a specific resource or variable validation check fails — useful for proving your validation blocks and preconditions work.

run "rejects_invalid_environment" {
  command = plan

  variables {
    environment = "not-a-real-env"
  }

  expect_failures = [
    var.environment,
  ]
}

Plan-time assertions only see known values. Attributes computed by the provider at apply time (an assigned IP, a generated ID) show up as unknown during plan, so you cannot assert on them with command = plan. Use a mocked provider or an apply run for those.

Module unit tests with mocked providers

Mock providers (Terraform 1.7+) let a run block execute as if real infrastructure were created, but every provider call returns generated fake data. No credentials, no network, sub-second runs. This is how you unit-test module logic — count math, for_each keys, conditional resource creation — at the speed of a plan.

Declare the mock in the test file with mock_provider. You can pin specific attributes with override_resource (or override_data for data sources) so assertions are deterministic instead of relying on randomly generated mock values.

# tests/subnets.tftest.hcl

mock_provider "azurerm" {}

variables {
  vnet_cidr    = "10.10.0.0/16"
  subnet_count = 3
}

run "creates_expected_subnet_count" {
  command = plan

  assert {
    condition     = length(azurerm_subnet.this) == 3
    error_message = "Expected 3 subnets to be planned"
  }
}

run "overridden_id_is_stable" {
  command = apply

  override_resource {
    target = azurerm_virtual_network.this
    values = {
      id = "/subscriptions/0000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/vnet"
    }
  }

  assert {
    condition     = output.vnet_id == "/subscriptions/0000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/vnet"
    error_message = "vnet_id output did not match overridden value"
  }
}

Because the provider is mocked, command = apply here never calls Azure — it walks the graph and produces planned/overridden values. That gives you apply-time outputs (which expose computed attributes) without the cost or latency. Mock everything you can; it keeps the feedback loop tight enough to run on every file save.

Integration testing with Terratest

Some things only a real cloud can tell you. Terratest is a Go library that runs your actual Terraform, queries the live resources, asserts on them, and tears everything down. The pattern is always the same: defer Destroy, InitAndApply, assert. The deferred destroy is registered first so that it still runs when the apply itself fails partway.

Put tests in a test/ directory as a Go module. A minimal integration test against an example fixture:

// test/network_test.go
package test

import (
	"testing"

	"github.com/gruntwork-io/terratest/modules/terraform"
	"github.com/stretchr/testify/assert"
)

func TestHubSpokeNetwork(t *testing.T) {
	t.Parallel()

	opts := &terraform.Options{
		TerraformDir: "../examples/hub-spoke",
		Vars: map[string]interface{}{
			"environment": "test",
			"location":    "eastus",
		},
	}

	defer terraform.Destroy(t, opts)

	terraform.InitAndApply(t, opts)

	vnetID := terraform.Output(t, opts, "vnet_id")
	assert.NotEmpty(t, vnetID)

	subnetIDs := terraform.OutputList(t, opts, "subnet_ids")
	assert.Len(t, subnetIDs, 3)
}

The defer terraform.Destroy runs even if an assertion fails, which is what keeps you from leaking resources. Always isolate state — give each run a unique workspace or backend key, and randomize resource names so parallel runs never collide:

import "github.com/gruntwork-io/terratest/modules/random"

uniqueID := random.UniqueId()
opts.Vars["name_suffix"] = uniqueID

Run the suite with the standard Go test runner. Integration tests are slow, so give them a generous timeout — Go’s default is 10 minutes and will kill a half-finished apply mid-flight, orphaning resources.

cd test
go test -v -timeout 45m -run TestHubSpokeNetwork

Asserting real behavior

Checking that an output is non-empty is weak. The point of paying for real infrastructure is to verify it behaves. Terratest ships helpers for exactly this.

For an HTTP endpoint, retry until it is healthy — freshly provisioned resources are rarely ready the instant apply returns:

import (
	"time"

	// The package is named http_helper, so alias the hyphenated import path.
	http_helper "github.com/gruntwork-io/terratest/modules/http-helper"
)

url := terraform.Output(t, opts, "app_url")

http_helper.HttpGetWithRetry(
	t, url, nil,
	200, "OK",
	30,             // retries
	10*time.Second, // sleep between retries
)

For anything past a simple GET, drop to the cloud SDK and assert on the resource directly. Terratest has provider modules (for example modules/azure, modules/aws), or you can call the vendor SDK yourself. The retry/backoff helpers generalize to any flaky check:

import "github.com/gruntwork-io/terratest/modules/retry"

retry.DoWithRetry(t, "wait for blob to be readable", 20, 15*time.Second,
	func() (string, error) {
		// checkBlobExists is your own helper around the storage SDK;
		// return an error from it to trigger a retry
		return checkBlobExists(storageAccount, container, blob)
	},
)

The rule: never time.Sleep and hope. Always poll with a bounded retry so a slow-but-eventually-correct resource passes and a genuinely broken one fails after a known, finite wait.

Policy-as-code gates on the plan JSON

Tests verify behavior; policy verifies intent — “no public storage,” “every resource is tagged with a cost center,” “no VM larger than this SKU.” Both OPA/Conftest and HashiCorp Sentinel evaluate a machine-readable plan, so you gate the change before it is applied.

Generate the plan JSON once and feed it to your policy engine:

terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json

A Conftest policy in Rego that fails the plan if any storage account allows public access. resource_changes is a flat, documented list in the plan JSON, so walk it rather than recursing through the module-nested planned_values tree.

# policy/storage.rego
package main

# Rego v1 syntax (contains/if), which OPA 1.x-based Conftest releases require.
import rego.v1

deny contains msg if {
	resource := input.resource_changes[_]
	resource.type == "azurerm_storage_account"
	resource.change.after.public_network_access_enabled == true
	msg := sprintf("Storage account '%s' must not allow public network access", [resource.address])
}

Run it as a gate — a non-zero exit code fails the pipeline:

conftest test tfplan.json --policy policy/
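If the shape of resource_changes is unfamiliar, it can help to see the same check written against the raw JSON. The following Go sketch is illustrative only, not a replacement for Conftest. The function name publicStorageAccounts and the sample document are hypothetical, and the struct models only the fields this one check reads.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// plan mirrors only the parts of the plan JSON that this check needs.
type plan struct {
	ResourceChanges []struct {
		Address string `json:"address"`
		Type    string `json:"type"`
		Change  struct {
			After map[string]any `json:"after"`
		} `json:"change"`
	} `json:"resource_changes"`
}

// publicStorageAccounts returns the addresses of storage accounts whose
// planned state has public network access enabled.
func publicStorageAccounts(planJSON []byte) ([]string, error) {
	var p plan
	if err := json.Unmarshal(planJSON, &p); err != nil {
		return nil, err
	}
	var offenders []string
	for _, rc := range p.ResourceChanges {
		if rc.Type != "azurerm_storage_account" {
			continue
		}
		// After is null for a deletion; reading a nil map is safe and yields no match.
		if enabled, ok := rc.Change.After["public_network_access_enabled"].(bool); ok && enabled {
			offenders = append(offenders, rc.Address)
		}
	}
	return offenders, nil
}

func main() {
	sample := []byte(`{"resource_changes":[
	  {"address":"azurerm_storage_account.open","type":"azurerm_storage_account",
	   "change":{"after":{"public_network_access_enabled":true}}},
	  {"address":"azurerm_storage_account.closed","type":"azurerm_storage_account",
	   "change":{"after":{"public_network_access_enabled":false}}},
	  {"address":"azurerm_storage_account.gone","type":"azurerm_storage_account",
	   "change":{"after":null}}]}`)
	offenders, err := publicStorageAccounts(sample)
	fmt.Println(offenders, err) // prints "[azurerm_storage_account.open] <nil>"
}
```

A small sample document like this one also works as a fixture for testing the policy itself: keep one plan that should pass and one that should be denied, and run both through the gate.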

If you are on HCP Terraform (formerly Terraform Cloud) or Terraform Enterprise, Sentinel does the same job with tfplan/v2 imports and is enforced by the platform rather than a CLI step. Pick one per organization; running both is rarely worth the maintenance.

Wiring the suite into CI

Run the layers in order of cost. Fail fast on the cheap ones so you never spend cloud minutes on a change that a linter could have rejected. Here is a GitHub Actions workflow that stages validate -> native test -> policy -> integration.

# .github/workflows/terraform-test.yml
name: terraform-test

on:
  pull_request:

jobs:
  validate-and-unit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform init -backend=false
      - run: terraform validate
      - run: terraform test   # native unit/plan tests with mocks

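  policy:
    needs: validate-and-unit
    runs-on: ubuntu-latest
    permissions:
      id-token: write     # plan configures the provider, so it needs credentials too
      contents: read
    env:
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_wrapper: false   # keep wrapper output out of the redirected JSON
      - run: |
          terraform init -backend=false
          terraform plan -out=tfplan.binary
          terraform show -json tfplan.binary > tfplan.json
      - uses: open-policy-agent/setup-conftest@v2
      - run: conftest test tfplan.json --policy policy/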

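  integration:
    needs: policy
    runs-on: ubuntu-latest
    permissions:
      id-token: write     # OIDC: short-lived cloud credentials, no static secrets
      contents: read
    env:                  # lets the azurerm provider use the same OIDC identity
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ secrets.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ secrets.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: azure/login@v2
        with:
          client-id: ${{ secrets.AZURE_CLIENT_ID }}
          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
      - uses: hashicorp/setup-terraform@v3   # do not rely on the runner image shipping terraform
        with:
          terraform_wrapper: false           # Terratest shells out to terraform and parses its output
      - uses: actions/setup-go@v5
        with:
          go-version: "1.22"
      - run: cd test && go test -v -timeout 45m -parallel 4 ./...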

Three things worth calling out:

Ephemeral credentials. Use OIDC federation (id-token: write plus azure/login, or aws-actions/configure-aws-credentials) so the runner gets short-lived tokens scoped to a sandbox subscription. Never put long-lived cloud keys in CI secrets.
Parallelism. t.Parallel() in Go plus -parallel N lets independent tests run concurrently; combined with randomized names and isolated state, this cuts wall-clock time dramatically. Make sure your sandbox has the quota and that names truly cannot collide.
Cost estimation. Add an Infracost step on the plan JSON to comment the price delta on the PR. It is a guardrail against an integration test that quietly spins up something expensive, and a useful review signal in its own right.

Verify

Confirm each layer actually runs and fails when it should:

# Native tests pass on good input
terraform test

# A deliberately broken assertion should make `terraform test` exit non-zero
terraform test || echo "native tests failed (expected for the broken case)"

# Policy catches a violation: temporarily set public access true, then:
conftest test tfplan.json --policy policy/   # expect a deny + non-zero exit

# Integration test stands up and tears down cleanly
cd test && go test -v -timeout 45m -run TestHubSpokeNetwork

After an integration run, the cloud account should contain zero leftover resources. List by your test tag or naming prefix and confirm the set is empty — that is the real proof your defer Destroy worked.

Checklist

Pitfalls and next steps

The failure mode that erodes trust fastest is flakiness. Most of it traces to one of three causes: a test that sleeps instead of polling, parallel runs sharing state or names, and the Go test timeout killing an apply partway through (which both fails the test and orphans resources). Bound every wait with a retry, isolate state per run, and set the timeout above your slowest realistic apply-plus-destroy.

Even with disciplined defer Destroy, resources leak — a runner gets killed, a destroy hits a dependency error, a developer Ctrl-C’s a local run. Treat cleanup as a system, not a hope: tag every test resource (managed-by=terratest, a run ID, a timestamp) and run a scheduled sweeper that deletes anything matching the tag past a TTL. For very expensive fixtures, consider sharing a long-lived base environment across tests and only creating the cheap, fast-changing pieces per run — at the cost of weaker isolation, so weigh it carefully.

From here, the high-value extensions are contract tests for published modules (so a breaking change to an input variable fails before consumers find out), and promoting your policy bundle to a versioned, separately tested artifact rather than a folder of loose Rego files.

Written by Vinod
