Quick take — A reusable hashicorp/aws ~> 5.0 Terraform module for aws_dax_cluster covering encryption at rest, TLS endpoint encryption, a multi-node replication factor, an IAM role for DynamoDB access, and a private subnet group — production defaults baked in. New here? Jump to the Quickstart below to deploy it in minutes; read on for how it works and when to reach for it.
Quickstart (copy-paste)
Minimal, runnable configuration — drop this in a .tf file and fill in the "..." placeholders (each required input is commented):
provider "aws" {
region = "us-east-1"
}
module "dax" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"
cluster_name = "..." # DAX cluster name (lowercased by AWS).
subnet_ids = ["...", "..."] # >= 2 private subnets across >= 2 AZs.
security_group_ids = ["..."] # Security groups allowing port 8111/9111 from clients.
dynamodb_table_arns = ["..."] # Tables DAX is allowed to read/write on your behalf.
}
Then terraform init && terraform apply. Every other input has a sensible default — see Inputs below to override behaviour.
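The security_group_ids input expects a group that already admits client traffic. A minimal sketch of such a group follows, assuming the cluster keeps the module's TLS endpoint (port 9111) and that clients share one application security group. The variable names vpc_id and app_security_group_id are hypothetical and not part of the module; egress rules are deliberately left to the caller.

```hcl
# Hypothetical inputs supplied by the calling configuration.
variable "vpc_id" {
  type = string
}

variable "app_security_group_id" {
  type = string
}

# Security group attached to the DAX nodes.
resource "aws_security_group" "dax" {
  name_prefix = "dax-cluster-"
  description = "Client access to the DAX cluster"
  vpc_id      = var.vpc_id
}

# Admit only the application security group, and only on the TLS port.
resource "aws_vpc_security_group_ingress_rule" "dax_tls" {
  security_group_id            = aws_security_group.dax.id
  referenced_security_group_id = var.app_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 9111
  to_port                      = 9111
  description                  = "DAX TLS endpoint from application tasks"
}
```

The group's ID can then be passed as security_group_ids = [aws_security_group.dax.id]. Port 8111 is left closed here because the module fixes the endpoint to TLS.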
What this module is
DynamoDB Accelerator (DAX) (aws_dax_cluster) is a fully managed, in-memory write-through cache that sits in front of DynamoDB and turns single-digit-millisecond reads into microsecond reads for cache hits. Applications talk to DAX using a drop-in DAX client instead of the DynamoDB SDK, and DAX transparently serves cached item and query results while forwarding writes through to the underlying tables. It runs as a cluster of nodes spread across Availability Zones, and at runtime it assumes an IAM role to access DynamoDB on your behalf — so the cache, not your application, holds the DynamoDB permissions.
The resource is small but its defaults are the wrong ones for production. replication_factor can be set to 1, which gives you a single-node cluster with no failover — a node loss means a cold cache and an instant load spike on DynamoDB. server_side_encryption defaults to off, so cached items sit unencrypted at rest, and cluster_endpoint_encryption_type defaults to NONE, so the client→DAX connection is plaintext. The IAM role is mandatory and easy to over-scope: a lazy dynamodb:* on * hands the cache far more than it needs. Finally, DAX must live in a subnet group of private subnets, which is a separate resource people forget, and tuning the cache TTLs requires a parameter group, a third resource.
Wrapping all of this in a module encodes the correct posture once: a multi-node replication_factor (defaulting to 3 for AZ-resilient production clusters), server_side_encryption { enabled = true }, cluster_endpoint_encryption_type = "TLS", a private subnet group, a tunable parameter group for item/query TTLs, and a least-privilege IAM role scoped to exactly the DynamoDB tables you name. App teams hand the module a name, subnets, security groups, and a list of table ARNs, and they get an encrypted, highly available cache that passes a security review.
When to use it
- You run read-heavy DynamoDB workloads — leaderboards, product catalogs, session stores, real-time bidding — where the same items are read far more often than written and microsecond latency matters.
- You want to absorb hot-key and burst read traffic in front of DynamoDB so a spike does not blow through provisioned capacity or rack up on-demand read charges.
- You need encryption everywhere — at rest for cached items and TLS in transit to the cache — and a least-privilege role scoped to specific tables rather than a blanket DynamoDB grant.
- You operate many caching layers across teams and want a paved-road module so nobody ships a single-node, unencrypted DAX cluster with an over-permissive role.
Reach for DynamoDB on-demand with no cache when your read pattern is unpredictable and uncacheable, or when strong read-after-write consistency on every read is required — DAX serves eventually consistent reads from cache, and strongly consistent reads bypass the cache entirely, so a workload that is overwhelmingly strongly-consistent gets little benefit. DAX shines specifically for repeated eventually-consistent reads of a hot working set.
Module structure
terraform-module-aws-dax/
├── versions.tf # provider + Terraform version pins
├── main.tf # IAM role, subnet group, parameter group, cluster
├── variables.tf # var-driven inputs with validations
└── outputs.tf # cluster ARN, endpoints, role ARN, and key attributes
versions.tf
terraform {
required_version = ">= 1.5.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
main.tf
locals {
tags = merge(
{
"Name" = var.cluster_name
"ManagedBy" = "terraform"
"Module" = "terraform-module-aws-dax"
},
var.tags,
)
# Use the caller-supplied role if given, otherwise create a least-privilege
# role scoped to exactly the named DynamoDB table ARNs.
create_role = var.iam_role_arn == null
role_arn = local.create_role ? aws_iam_role.this[0].arn : var.iam_role_arn
}
# Trust policy: only the DAX service may assume this role.
data "aws_iam_policy_document" "assume" {
count = local.create_role ? 1 : 0
statement {
actions = ["sts:AssumeRole"]
principals {
type = "Service"
identifiers = ["dax.amazonaws.com"]
}
}
}
resource "aws_iam_role" "this" {
count = local.create_role ? 1 : 0
name_prefix = "${var.cluster_name}-dax-"
assume_role_policy = data.aws_iam_policy_document.assume[0].json
tags = local.tags
}
# Least-privilege DynamoDB access, scoped to the named tables and their indexes.
data "aws_iam_policy_document" "dynamodb" {
count = local.create_role ? 1 : 0
statement {
sid = "DaxDynamoDBAccess"
effect = "Allow"
actions = [
"dynamodb:GetItem",
"dynamodb:BatchGetItem",
"dynamodb:Query",
"dynamodb:Scan",
"dynamodb:PutItem",
"dynamodb:UpdateItem",
"dynamodb:DeleteItem",
"dynamodb:BatchWriteItem",
"dynamodb:ConditionCheckItem",
"dynamodb:DescribeTable",
]
resources = concat(
var.dynamodb_table_arns,
[for arn in var.dynamodb_table_arns : "${arn}/index/*"],
)
}
}
resource "aws_iam_role_policy" "dynamodb" {
count = local.create_role ? 1 : 0
name = "${var.cluster_name}-dynamodb-access"
role = aws_iam_role.this[0].id
policy = data.aws_iam_policy_document.dynamodb[0].json
}
# DAX must live in private subnets across multiple AZs.
resource "aws_dax_subnet_group" "this" {
name = "${var.cluster_name}-subnet-group"
description = "Subnet group for ${var.cluster_name} managed by Terraform"
subnet_ids = var.subnet_ids
}
# Parameter group controls item/query TTLs for the cache.
resource "aws_dax_parameter_group" "this" {
name = "${var.cluster_name}-params"
description = "Parameter group for ${var.cluster_name} managed by Terraform"
dynamic "parameters" {
for_each = var.parameters
content {
name = parameters.value.name
value = parameters.value.value
}
}
}
resource "aws_dax_cluster" "this" {
cluster_name = var.cluster_name
node_type = var.node_type
replication_factor = var.replication_factor
iam_role_arn = local.role_arn
subnet_group_name = aws_dax_subnet_group.this.name
security_group_ids = var.security_group_ids
parameter_group_name = aws_dax_parameter_group.this.name
availability_zones = var.availability_zones
# Encryption on by default: at rest and in transit (TLS) to the endpoint.
server_side_encryption {
enabled = true
}
cluster_endpoint_encryption_type = "TLS"
maintenance_window = var.maintenance_window
notification_topic_arn = var.notification_topic_arn
description = var.description
tags = local.tags
}
variables.tf
variable "cluster_name" {
description = "DAX cluster name (AWS stores it lowercased): 1-20 lowercase letters, digits, or hyphens, starting with a letter."
type = string
validation {
condition = can(regex("^[a-z][a-z0-9-]{0,19}$", var.cluster_name)) && !can(regex("--|-$", var.cluster_name))
error_message = "cluster_name must start with a letter, be 1-20 lowercase alphanumeric/hyphen characters, and must not end with a hyphen or contain consecutive hyphens."
}
}
variable "node_type" {
description = "Compute/memory capacity per node, e.g. dax.t3.small, dax.r5.large."
type = string
default = "dax.t3.small"
validation {
condition = can(regex("^dax\\.", var.node_type))
error_message = "node_type must start with 'dax.' (e.g. dax.t3.small)."
}
}
variable "replication_factor" {
description = "Number of nodes in the cluster. Use >= 3 in production for AZ-resilient failover."
type = number
default = 3
validation {
condition = var.replication_factor >= 1 && var.replication_factor <= 10
error_message = "replication_factor must be between 1 and 10; use >= 3 for production."
}
}
variable "subnet_ids" {
description = "At least two private subnet IDs across two AZs for the DAX subnet group."
type = list(string)
validation {
condition = length(var.subnet_ids) >= 2
error_message = "subnet_ids must contain at least two subnets across two availability zones."
}
}
variable "security_group_ids" {
description = "Security group IDs for the cluster (allow client ingress on 8111 plaintext / 9111 TLS)."
type = list(string)
validation {
condition = length(var.security_group_ids) > 0
error_message = "At least one security group ID is required."
}
}
variable "dynamodb_table_arns" {
description = "DynamoDB table ARNs DAX is allowed to access. Used to build the least-privilege role."
type = list(string)
validation {
condition = length(var.dynamodb_table_arns) > 0
error_message = "At least one DynamoDB table ARN is required to scope the DAX role."
}
}
variable "iam_role_arn" {
description = "Existing IAM role ARN for DynamoDB access. Null lets the module create a least-privilege role."
type = string
default = null
}
variable "availability_zones" {
description = "AZs to place nodes in. Empty lets DAX spread nodes automatically."
type = list(string)
default = []
}
variable "parameters" {
description = "DAX parameter group parameters, e.g. record-ttl-millis and query-ttl-millis."
type = list(object({
name = string
value = string
}))
default = [
{ name = "record-ttl-millis", value = "300000" },
{ name = "query-ttl-millis", value = "300000" },
]
}
variable "maintenance_window" {
description = "Weekly UTC maintenance window, format ddd:hh24:mi-ddd:hh24:mi (>= 60 min)."
type = string
default = "sun:05:00-sun:06:00"
}
variable "notification_topic_arn" {
description = "Optional SNS topic ARN for DAX cluster notifications."
type = string
default = null
}
variable "description" {
description = "Free-text description for the DAX cluster."
type = string
default = "Managed by Terraform (kloudvin terraform-module-aws-dax)."
}
variable "tags" {
description = "Additional tags merged onto the cluster."
type = map(string)
default = {}
}
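Two inputs above are only lightly checked: dynamodb_table_arns is validated for length, and maintenance_window has no validation at all. The sketch below shows one way the declarations could be tightened. It is an assumption about what stricter checks might look like, not part of the published module. The regular expressions are illustrative, and the window check covers format only, not the 60-minute minimum.

```hcl
# Possible stricter replacement for the dynamodb_table_arns declaration.
variable "dynamodb_table_arns" {
  description = "DynamoDB table ARNs DAX is allowed to access."
  type        = list(string)

  validation {
    condition     = length(var.dynamodb_table_arns) > 0
    error_message = "At least one DynamoDB table ARN is required to scope the DAX role."
  }

  # Reject values that are not table ARNs (for example index or stream ARNs).
  validation {
    condition = alltrue([
      for arn in var.dynamodb_table_arns :
      can(regex("^arn:aws[a-z-]*:dynamodb:[a-z0-9-]+:[0-9]{12}:table/[A-Za-z0-9_.-]+$", arn))
    ])
    error_message = "Each entry must be a DynamoDB table ARN such as arn:aws:dynamodb:REGION:ACCOUNT_ID:table/NAME."
  }
}

# Possible stricter replacement for the maintenance_window declaration.
variable "maintenance_window" {
  description = "Weekly UTC maintenance window, format ddd:hh24:mi-ddd:hh24:mi (>= 60 min)."
  type        = string
  default     = "sun:05:00-sun:06:00"

  # Format check only; the minimum duration is not verified here.
  validation {
    condition     = can(regex("^(mon|tue|wed|thu|fri|sat|sun):([01][0-9]|2[0-3]):[0-5][0-9]-(mon|tue|wed|thu|fri|sat|sun):([01][0-9]|2[0-3]):[0-5][0-9]$", var.maintenance_window))
    error_message = "maintenance_window must look like sun:05:00-sun:06:00."
  }
}
```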
outputs.tf
output "cluster_arn" {
description = "ARN of the DAX cluster."
value = aws_dax_cluster.this.arn
}
output "cluster_name" {
description = "Name of the DAX cluster."
value = aws_dax_cluster.this.cluster_name
}
output "configuration_endpoint" {
description = "Configuration endpoint (DNS:port) the DAX client connects to."
value = aws_dax_cluster.this.configuration_endpoint
}
output "cluster_address" {
description = "DNS name of the cluster without the port appended."
value = aws_dax_cluster.this.cluster_address
}
output "port" {
description = "Port used by the configuration endpoint."
value = aws_dax_cluster.this.port
}
output "nodes" {
description = "List of node objects (id, address, port, availability_zone)."
value = aws_dax_cluster.this.nodes
}
output "iam_role_arn" {
description = "ARN of the IAM role DAX assumes to access DynamoDB."
value = local.role_arn
}
output "subnet_group_name" {
description = "Name of the DAX subnet group."
value = aws_dax_subnet_group.this.name
}
output "parameter_group_name" {
description = "Name of the associated DAX parameter group."
value = aws_dax_parameter_group.this.name
}
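The configuration_endpoint output is a bare DNS:port pair, while DAX clients that take a URL generally expect a scheme in front of the host. As a hedged sketch, an extra output like the one below could expose a ready-made URL. The output name cluster_endpoint_url is hypothetical and not part of the module. The daxs:// scheme is the one AWS documents for clusters with in-transit encryption; whether your client SDK wants the port appended should be checked against that SDK's documentation.

```hcl
# Hypothetical extra output; not part of the module as published.
output "cluster_endpoint_url" {
  description = "Cluster discovery URL using the TLS scheme, for DAX clients that accept a URL."
  value       = "daxs://${aws_dax_cluster.this.cluster_address}"
}
```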
How to use it
module "dax" {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"
cluster_name = "catalog-prod"
node_type = "dax.r5.large"
replication_factor = 3 # one node per AZ for failover
subnet_ids = aws_subnet.private[*].id # private subnets in >= 2 AZs
security_group_ids = [aws_security_group.dax.id]
# The module builds a least-privilege role scoped to exactly these tables.
dynamodb_table_arns = [
aws_dynamodb_table.products.arn,
aws_dynamodb_table.categories.arn,
]
# Cache items for five minutes and query results for one minute.
parameters = [
{ name = "record-ttl-millis", value = "300000" },
{ name = "query-ttl-millis", value = "60000" },
]
maintenance_window = "sun:05:00-sun:06:00"
notification_topic_arn = aws_sns_topic.dax_alerts.arn
tags = {
Environment = "prod"
Team = "catalog"
CostCenter = "CAT-3310"
}
}
# Downstream: hand the configuration endpoint to an ECS service so the DAX
# client connects to the cache instead of DynamoDB directly.
resource "aws_ssm_parameter" "dax_endpoint" {
name = "/catalog/prod/dax/endpoint"
type = "String"
value = module.dax.configuration_endpoint
}
# Application tasks need their OWN permission to talk to the DAX cluster,
# separate from the role DAX uses to reach DynamoDB.
resource "aws_iam_role_policy" "app_dax_access" {
name = "catalog-app-dax-access"
role = aws_iam_role.catalog_task.id
policy = jsonencode({
Version = "2012-10-17"
Statement = [{
Effect = "Allow"
Action = [
"dax:GetItem",
"dax:BatchGetItem",
"dax:Query",
"dax:Scan",
"dax:PutItem",
"dax:UpdateItem",
]
Resource = module.dax.cluster_arn
}]
})
}
Pin the module with ?ref=<tag> so a cluster never silently picks up a breaking module change; changing replication_factor or encryption settings can force node-level changes.
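The example above lets the module create the DynamoDB role. The other path, passing iam_role_arn, might look like the sketch below. The role name, cluster name, and variable names are hypothetical. Because the module derives a count from whether iam_role_arn is null, the ARN needs to be known at plan time; a data source lookup usually satisfies that, whereas the ARN of a role created in the same apply may not.

```hcl
# Hypothetical inputs supplied by the calling configuration.
variable "private_subnet_ids" {
  type = list(string)
}

variable "dax_security_group_ids" {
  type = list(string)
}

variable "session_table_arns" {
  type = list(string)
}

# Look up a pre-existing role (hypothetical name). It must already trust
# dax.amazonaws.com and carry its own DynamoDB permissions.
data "aws_iam_role" "dax_shared" {
  name = "shared-dax-dynamodb-access"
}

module "dax_byo_role" {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"

  cluster_name       = "sessions-prod"
  subnet_ids         = var.private_subnet_ids
  security_group_ids = var.dax_security_group_ids

  # Skip role creation and use the existing role instead.
  iam_role_arn = data.aws_iam_role.dax_shared.arn

  # Still required by the variable validation, although no policy is
  # generated from it when iam_role_arn is set.
  dynamodb_table_arns = var.session_table_arns
}
```

With this path the least-privilege scoping is the caller's responsibility, since the module no longer writes the policy.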
With Terragrunt
Terragrunt keeps this module DRY across environments — define the backend and provider once in a root config, then a thin terragrunt.hcl per environment supplies only the inputs that differ.
1. Root config — live/terragrunt.hcl (inherited by every module):
remote_state {
backend = "s3"
generate = { path = "backend.tf", if_exists = "overwrite" }
config = {
# ...S3 state bucket + key per path...
}
}
2. Module config — live/prod/dax/terragrunt.hcl:
include "root" {
path = find_in_parent_folders()
}
terraform {
source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"
}
inputs = {
cluster_name = "..."
subnet_ids = ["...", "..."]
security_group_ids = ["..."]
dynamodb_table_arns = ["..."]
}
3. Deploy one environment, or roll out all modules together:
cd live/prod/dax && terragrunt apply # this module
terragrunt run-all apply # every module under live/prod
Why Terragrunt here: the backend and provider live in one place instead of being copy-pasted into every module; inputs is overridden per environment (dev / stage / prod) without forking the module; and run-all orchestrates dependencies across modules. Reach for it once you have more than one environment or more than a handful of modules — for a single stack, the plain Quickstart above is enough.
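To illustrate the per-environment override, a dev configuration might differ from prod only in its inputs. The path live/dev/dax/terragrunt.hcl, the cluster name, and the chosen values below are assumptions for the sake of the sketch; the "..." placeholders are left for you to fill in, as in the prod example.

```hcl
# live/dev/dax/terragrunt.hcl (hypothetical path and values)
include "root" {
  path = find_in_parent_folders()
}

terraform {
  source = "git::https://dev.azure.com/teknohut/kloudvin/_git/terraform-modules//terraform-module-aws-dax?ref=v1.0.0"
}

inputs = {
  cluster_name        = "catalog-dev"
  subnet_ids          = ["...", "..."]
  security_group_ids  = ["..."]
  dynamodb_table_arns = ["..."]

  # Dev-only overrides: a single small node is cheaper but has no failover.
  node_type          = "dax.t3.small"
  replication_factor = 1
}
```

Encryption stays on in dev as well, because the module hard-codes it rather than exposing it as an input.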
Inputs
| Name | Type | Default | Required | Description |
|---|---|---|---|---|
| cluster_name | string | — | Yes | DAX cluster name (lowercased by AWS). |
| subnet_ids | list(string) | — | Yes | >= 2 private subnets across >= 2 AZs. |
| security_group_ids | list(string) | — | Yes | Security groups for the cluster nodes. |
| dynamodb_table_arns | list(string) | — | Yes | Table ARNs the DAX role may access. |
| node_type | string | dax.t3.small | No | Compute/memory per node (dax.*). |
| replication_factor | number | 3 | No | Number of nodes; use >= 3 in prod. |
| iam_role_arn | string | null | No | Existing role; null creates a least-privilege role. |
| availability_zones | list(string) | [] | No | AZs for nodes; empty spreads automatically. |
| parameters | list(object) | record/query TTL 300000 | No | Parameter group entries (name/value). |
| maintenance_window | string | sun:05:00-sun:06:00 | No | Weekly UTC maintenance window. |
| notification_topic_arn | string | null | No | SNS topic for cluster notifications. |
| description | string | “Managed by Terraform…” | No | Free-text cluster description. |
| tags | map(string) | {} | No | Additional tags merged onto the cluster. |
Outputs
| Name | Description |
|---|---|
| cluster_arn | ARN of the DAX cluster. |
| cluster_name | Name of the DAX cluster. |
| configuration_endpoint | Configuration endpoint (DNS:port) for the DAX client. |
| cluster_address | DNS name of the cluster without the port. |
| port | Port used by the configuration endpoint. |
| nodes | List of node objects (id, address, port, AZ). |
| iam_role_arn | ARN of the role DAX assumes to access DynamoDB. |
| subnet_group_name | Name of the DAX subnet group. |
| parameter_group_name | Name of the associated parameter group. |
Enterprise scenario
A retail platform serves a product catalog whose items are read tens of thousands of times per second during peak sale events but updated only a few times an hour. Hot keys on a handful of best-sellers kept pushing DynamoDB into throttling and inflating on-demand read costs, so the catalog team published this module at v1.0.0 and put a three-node dax.r5.large cluster in front of the products and categories tables across three AZs. The module enforces server_side_encryption and cluster_endpoint_encryption_type = "TLS", so cached items are encrypted at rest and every client connection is TLS; the generated role can touch only those two tables and their indexes, while the application’s task role separately holds the dax:* permissions to reach the cluster. With a five-minute record TTL, cache hits return in microseconds, DynamoDB read costs dropped sharply, and a security review confirmed no single-node or unencrypted DAX clusters and no over-scoped DynamoDB roles across the estate.
Best practices
- Run at least three nodes in production. This module defaults replication_factor = 3 so the cluster survives an AZ loss without a cold cache; a single-node cluster is a single point of failure that dumps full read load back on DynamoDB the moment it fails.
- Keep encryption on, both at rest and in transit. The module hard-codes server_side_encryption { enabled = true } and cluster_endpoint_encryption_type = "TLS"; never downgrade the endpoint to NONE, and make sure clients connect on the TLS port (9111) with the encrypting DAX client.
- Scope the DAX role to specific tables. Let the module build the least-privilege role (iam_role_arn = null) so DAX can touch only the named dynamodb_table_arns and their index/*; never grant dynamodb:* on * to the cache.
- Separate the two IAM grants. The role DAX assumes to reach DynamoDB is distinct from the dax:* permissions your application needs to reach the cluster; wire the application task/role with dax:* on the cluster_arn output, and don’t conflate the two.
- Tune TTLs to your read/write ratio. Set record-ttl-millis and query-ttl-millis to match how stale a read can be; a write-through cache serves eventually consistent reads, so a longer TTL on rarely-changing reference data maximizes hit rate, while volatile data needs a shorter window.
- Right-size nodes and place them across AZs. Use the memory-optimized dax.r5/dax.r6g families for large working sets, let availability_zones spread nodes for resilience, and tag with Environment, Team, and CostCenter so the cache fleet is attributable in Cost Explorer.