Tuning Block and File Storage on AWS: EBS gp3/io2, EFS Throughput Modes, and Workload-Driven Sizing

Storage is where most “the database is slow” tickets actually end. Teams provision a volume by capacity, pick a type from muscle memory, and never look at the throughput ceiling the instance imposes underneath it. The result is a volume provisioned for 16,000 IOPS and 1,000 MiB/s bolted to an instance that can only push 4,750 Mbps (about 594 MiB/s): money spent on numbers the kernel can never reach. This is the most expensive misunderstanding in AWS storage, and it is invisible: the volume reports its full provisioned numbers, CloudWatch shows you well under them, and nobody connects the two because the limit that bit you lives on the instance, not the volume.

This guide is the mental model and the concrete knobs I use to size and tune block and file storage on AWS: what each EBS type is actually for, how gp3 and io2 Block Express decouple IOPS, throughput, and capacity, where the instance becomes the bottleneck, and how EFS throughput modes change the calculus for shared file workloads. The governing equation is one line — achieved performance = min(volume limit, instance limit, filesystem/app limit) — and everything in this article is an elaboration of where each of those three terms comes from and how to read it off a real system. Everything here is verifiable with fio and CloudWatch, and I show both. Because this is a reference you will return to while sizing a fleet or chasing a latency spike, the volume types, the limits, the throughput modes, and the failure modes are all laid out as scannable tables — read the prose once, then keep the tables open.

By the end you will stop sizing storage by capacity alone. When a workload is slow you will know whether you face a volume ceiling, an instance EBS-bandwidth cap, a near-empty EFS Bursting filesystem out of credits, a too-shallow queue depth hiding real headroom, or a snapshot that reads slow only because nobody enabled Fast Snapshot Restore. Knowing which in five minutes — from two CloudWatch metrics and one fio run — is what separates a right-sized fleet from a bill full of numbers the hardware can never deliver.

What problem this solves

EBS and EFS hide enormous machinery so you can attach a disk and run. That abstraction is a gift until performance matters; then the defaults and the muscle-memory choices cost you twice: once in latency users feel, once in spend on capacity and IOPS the instance can never consume. The pain is concrete: a reconciliation batch that flatlines at 600 MiB/s no matter how high you push the volume, a shared filesystem that crawls on a near-empty Bursting EFS, a restored DR volume that reads at a tenth of its rated speed for the first hour, a gp2 boot volume silently throttled because someone never migrated it to gp3.

What breaks without this knowledge: engineers “buy more IOPS” (no effect, because the instance was the cap), oversize volumes to chase performance the old gp2-era way (3 IOPS/GiB coupling that no longer applies), pick io2 Block Express for a workload that gp3 would serve at a quarter of the price, or mount EFS with the wrong throughput mode and watch a filesystem starve. Meanwhile the actual constraint — the instance’s published EBS baseline, an exhausted burst-credit bucket, a single-threaded I/O pattern that a deeper queue would saturate — sits there, perfectly measurable, ignored.

Who hits this: anyone running databases (random small-block, IOPS-bound), analytics and log pipelines (large sequential, throughput-bound), container fleets sharing EFS, and DR/golden-image workflows that restore from snapshots. It bites hardest on right-sizing reviews (where over-provisioned volumes hide in plain sight), on latency-sensitive OLTP under concurrency (where gp3’s ceiling or queueing shows up), and on cost audits (where the gap between provisioned and achieved is real money). The fix is almost never “a bigger volume” — it’s “find the term in min(volume, instance, app) that’s actually binding and move that one.”

To frame the whole field before the deep dive, here is every performance-limit class this article covers, the question it forces, and the one place to look first:

Limit class	What it caps	First question to ask	First place to look	Most common single cause
Volume per-volume ceiling	One volume’s max IOPS / throughput	Am I at the volume’s rated max?	`describe-volumes` (Iops, Throughput)	gp3 left at 3,000/125 defaults
Instance EBS bandwidth	All EBS traffic from the instance	Does the instance cap below the volume?	`describe-instance-types` EbsOptimizedInfo	Big volume on a small instance
gp3 throughput-per-IOPS ratio	Throughput you can buy vs IOPS	Did I provision enough IOPS to buy the MiB/s?	Provisioned iops vs throughput	1,000 MiB/s needs ≥4,000 IOPS
EFS throughput mode	Filesystem aggregate throughput	Bursting on a near-empty filesystem?	`describe-file-systems` ThroughputMode	Bursting starves below ~1 TiB
Queue depth / parallelism	Achievable IOPS at the app	Is iodepth/numjobs deep enough?	`fio` iodepth vs achieved IOPS	Single-threaded I/O, iodepth=1
Snapshot lazy-load	First-touch read speed	Is this a fresh restore without FSR?	First-read latency vs steady state	Restore without Fast Snapshot Restore

Learning objectives

By the end of this article you can:

Choose the right EBS volume type (gp3, io2 Block Express, st1, sc1) by access pattern — random small-block vs large sequential — and name what each one is actually for.
Provision IOPS, throughput, and capacity as three independent decisions on gp3 and io2, and respect the ratios that bound them (gp3’s 0.25 MiB/s per IOPS; io2’s 1,000 IOPS/GiB).
Look up an instance’s EBS-optimized baseline and burst limits and never provision a volume past the number the instance can actually consume for a sustained workload.
Use Elastic Volumes to modify type/IOPS/throughput online, and work around the optimizing state and the 6-hour modification cooldown.
Pick the correct EFS performance mode (General Purpose vs Max I/O) and throughput mode (Elastic, Provisioned, Bursting), and explain why a near-empty Bursting filesystem is the most common EFS complaint.
Benchmark the real path with fio (O_DIRECT, workload-matched block sizes and queue depth) and read IOPS, bandwidth, and latency percentiles against min(volume, instance).
Diagnose a storage-performance incident — instance-bound, volume-bound, credit-starved, queue-starved, or lazy-load — from CloudWatch metrics and confirm the fix.

Prerequisites & where this fits

You should already understand the basics: an EBS volume is network-attached block storage bound to one Availability Zone and (normally) one EC2 instance; EFS is an NFSv4.1 file system reachable from many instances across AZs; and an EC2 instance is the compute that mounts them. You should be comfortable running the aws CLI with --query, reading JSON output, and reading a Terraform resource block. Familiarity with Linux filesystems (mkfs, mount, /dev/nvme*), the page cache, and basic IOPS-vs-throughput-vs-latency vocabulary helps.

This sits in the Compute & Storage track. It assumes the EC2 fundamentals from Amazon EC2, In Depth: Instance Types, AMIs, EBS, User Data, IMDS & Every Launch Option, and it is the performance-and-tuning companion to the breadth survey in AWS Block & File Storage, In Depth: EBS, EFS, FSx & Instance Store. It pairs with AWS Observability, In Depth: CloudWatch, CloudTrail, Config & EventBridge because every limit here is read off a CloudWatch metric, and with Amazon RDS & Aurora, In Depth: Engines, Multi-AZ, Read Replicas, Backups & Every Option, whose managed storage abstracts the same physics you tune by hand here.

A quick map of who owns which limit during a sizing review or an incident, so you reason about the right layer:

Layer	What lives here	Who usually owns it	Performance class it can cause
Application / DB engine	Block size, queue depth, fsync pattern	App / DBA	Queue-starved IOPS; fsync-bound latency
Filesystem / RAID	xfs/ext4, mdadm stripe, mount opts	Platform / SRE	Single-volume ceiling when stripe absent
EBS volume	Type, provisioned IOPS/throughput	Platform	Volume per-volume ceiling
EC2 instance	EBS-optimized baseline/burst	Platform	Instance EBS-bandwidth cap (the silent one)
EFS file system	Performance + throughput mode	Platform	Credit starvation; mode mismatch
Snapshot / DLM	FSR, lifecycle, incremental chain	Platform / Backup	Lazy-load slow first touch

Core concepts

Five mental models make every later decision obvious.

Achieved performance is the minimum of several ceilings, not the volume’s number. The volume’s provisioned IOPS and throughput are a maximum the volume can do. The instance imposes its own EBS-optimized bandwidth and IOPS limit, and that is usually lower. The filesystem and application impose a third (block size, queue depth, fsync). What you actually get is min(volume, instance, app). Almost every “we paid for performance we don’t see” story is the volume number being the largest of the three while the instance or the app is the one binding.
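The min() rule above can be sketched in a few lines of Python. This is an illustration, not a measurement tool: the function names are made up for this example, the input numbers are hypothetical, and the application ceiling is estimated with Little's law (outstanding I/Os divided by per-I/O latency), which is an assumption about how the app term is derived.

```python
def binding_limit(volume, instance, app):
    """Return (achieved, binding_term) for one metric, e.g. IOPS or MiB/s."""
    limits = {"volume": volume, "instance": instance, "app": app}
    name = min(limits, key=limits.get)  # the smallest ceiling is the one that binds
    return limits[name], name


def queue_bound_iops(iodepth, latency_ms):
    """Little's law: IOPS the app can drive = outstanding I/Os / latency (s)."""
    return iodepth / (latency_ms / 1000.0)


# Hypothetical inputs: a 16,000-IOPS volume, an instance that allows 10,000 IOPS,
# and an application that keeps one I/O in flight at 0.5 ms per operation.
app_iops = queue_bound_iops(iodepth=1, latency_ms=0.5)
achieved, binding = binding_limit(volume=16000, instance=10000, app=app_iops)
print(f"achieved ~{achieved:.0f} IOPS, bound by the {binding}")

# With a deeper queue the app term stops binding and the instance takes over.
print(binding_limit(volume=16000, instance=10000, app=queue_bound_iops(32, 0.5)))
```

In the first call the volume and the instance both have headroom and the shallow queue is what binds, which is why raising provisioned IOPS would change nothing.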

IOPS, throughput, and capacity are three separate purchases on modern volumes. On the legacy gp2, IOPS scaled with size (3 IOPS/GiB), so you oversized a volume just to buy performance. gp3 and io2 break that coupling: you set capacity for how much data you store, IOPS for how many small operations per second, and throughput (MiB/s) for how much sequential bandwidth — independently, within ratios. Sizing storage is now three decisions, and conflating them is how you both overspend and under-provision at once.

Random-small is an IOPS problem; large-sequential is a throughput problem. Databases and busy filesystems do many tiny (4–16 KiB) random operations — that is an IOPS workload, and it wants SSD (gp3/io2). Log ingestion and analytics scans move large blocks sequentially — that is a throughput workload, where HDD st1 can be cost-effective, though a well-provisioned gp3 at 1,000 MiB/s often wins on latency. Naming the workload (random-small vs large-sequential) is the first fork in choosing a type.
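The split between IOPS-bound and throughput-bound workloads falls out of one multiplication: throughput equals IOPS times block size. A minimal sketch, with helper names chosen for this example:

```python
KIB_PER_MIB = 1024


def throughput_mib_s(iops, block_kib):
    """Bandwidth moved by a given IOPS rate at a given block size."""
    return iops * block_kib / KIB_PER_MIB


def iops_for_throughput(mib_s, block_kib):
    """IOPS needed to move a given bandwidth at a given block size."""
    return mib_s * KIB_PER_MIB / block_kib


# Random 16 KiB I/O at gp3's 16,000 IOPS ceiling moves only 250 MiB/s,
# so the IOPS limit binds long before the 1,000 MiB/s throughput limit.
print(throughput_mib_s(16000, 16))     # 250.0

# Sequential 128 KiB I/O reaches 1,000 MiB/s with just 8,000 IOPS,
# so the throughput limit binds first.
print(iops_for_throughput(1000, 128))  # 8000.0
```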

EFS performance is governed by two orthogonal settings people routinely confuse. Performance mode (General Purpose vs Max I/O, immutable after creation) trades latency against aggregate ceiling. Throughput mode (Elastic, Provisioned, Bursting, changeable with a cooldown) governs how much aggregate throughput you get and how you pay. The classic EFS failure is a near-empty Bursting filesystem: throughput scales with stored data (50 KiB/s per GiB baseline), so a 100 GiB filesystem has a tiny baseline and starves once its burst credits run out.

Snapshots are incremental, and a fresh restore is lazy-loaded. EBS snapshots store only changed blocks since the last snapshot, so frequent snapshots are cheap and deleting an old one never breaks a newer one. But a volume restored from a snapshot loads each block from S3 on first touch, so the first read of every block is slow — that is lazy loading, not the steady-state number. Fast Snapshot Restore (FSR) pre-initializes the volume so it delivers full performance immediately. Benchmark a fresh restore without FSR and you measure S3 fetch latency, not the volume.

The vocabulary in one table

Before the deep sections, pin down every moving part. The glossary at the end repeats these for lookup; this table is the mental model side by side:

Concept	One-line definition	Where it lives	Why it matters to performance
gp3	General-purpose SSD; IOPS/throughput decoupled from size	EBS volume type	The default; cheaper than gp2, tunable
io2 Block Express	High-IOPS, sub-ms, durable SSD	EBS volume type	Only when gp3’s ceiling isn’t enough
st1 / sc1	Throughput-optimized / cold HDD	EBS volume type	Large sequential, never random/boot
Provisioned IOPS	Small ops/sec you buy for the volume	Volume setting	Caps random-small performance
Provisioned throughput	MiB/s you buy for the volume	Volume setting	Caps sequential bandwidth
Instance EBS baseline	Sustained EBS bandwidth the instance allows	Instance attribute	The real cap nobody checks first
EBS-optimized burst	30-min higher bandwidth on smaller sizes	Instance attribute	Misleads you if workload is sustained
Elastic Volumes	Online modify of type/IOPS/throughput	EBS feature	Change without downtime; 6 h cooldown
RAID 0 stripe	Aggregate N volumes’ ceilings	Filesystem (mdadm)	Beat single-volume ceiling; no redundancy
EFS performance mode	General Purpose vs Max I/O	EFS (immutable)	Latency vs aggregate ceiling
EFS throughput mode	Elastic / Provisioned / Bursting	EFS (cooldown)	How much throughput + how you pay
Burst credits	Earned headroom on st1 / EFS Bursting	Volume/FS state	Starve when exhausted → slow
FSR	Fast Snapshot Restore (pre-init)	Snapshot feature	Full speed on first touch after restore

EBS volume types by workload

There are four types worth provisioning in 2026. Pick by access pattern, not by habit. The per-volume ceilings are the volume’s maximum — the instance ceiling (next section) is often what actually binds.

Type	Media	Best for	Max IOPS / vol	Max throughput / vol	Boot?
`gp3`	SSD	General purpose; boot, most apps, mid-tier DBs	16,000	1,000 MiB/s	Yes
`io2` Block Express	SSD	Latency-sensitive, high-IOPS DBs; sub-ms, durable	256,000	4,000 MiB/s	Yes
`st1`	HDD	Large sequential, throughput-bound (logs, big-data scans)	500	500 MiB/s	No
`sc1`	HDD	Cold, infrequently accessed, lowest cost	250	250 MiB/s	No

The full attribute grid — every type side by side on the dimensions that decide a pick, including the legacy gp2/io1 you’ll meet on existing fleets:

Attribute	`gp3`	`io2` Block Express	`gp2` (legacy)	`io1` (legacy)	`st1`	`sc1`
Media	SSD	SSD	SSD	SSD	HDD	HDD
Min / max size	1 GiB – 16 TiB	4 GiB – 64 TiB	1 GiB – 16 TiB	4 GiB – 16 TiB	125 GiB – 16 TiB	125 GiB – 16 TiB
Max IOPS / volume	16,000	256,000	16,000	64,000	500	250
Max throughput / volume	1,000 MiB/s	4,000 MiB/s	250 MiB/s	1,000 MiB/s	500 MiB/s	250 MiB/s
Baseline	3,000 / 125 MiB/s	(you provision)	3 IOPS/GiB (coupled)	(you provision)	credit-based	credit-based
IOPS:capacity ratio	≤ 500 IOPS/GiB	≤ 1,000 IOPS/GiB	3 IOPS/GiB	≤ 50 IOPS/GiB	n/a	n/a
Durability	99.8–99.9%	99.999%	99.8–99.9%	99.8–99.9%	99.8–99.9%	99.8–99.9%
Bootable	Yes	Yes	Yes	Yes	No	No
Multi-Attach	No	Yes (≤16)	No	Yes (≤16)	No	No
Best for	Default; most apps	Sub-ms / > 16k IOPS	(migrate to gp3)	(migrate to io2)	Sequential	Cold

The decision rules I apply:

Default to gp3. It is cheaper than the legacy gp2 for the same baseline and lets you buy IOPS and throughput independently of size. There is almost no reason to provision gp2 on a new system.
Reach for io2 Block Express only when you need it: sustained IOPS above 16,000, sub-millisecond latency under load, durability of 99.999%, or volumes larger than 16 TiB. Block Express is the substrate that unlocks the high ceilings and is available on Nitro instances.
st1/sc1 are HDD and throughput-optimized, not IOPS devices. They are excellent for streaming reads of large files and terrible for random small I/O or as a boot volume — you cannot boot from them. st1 uses a throughput burst-credit model; sc1 is the cold, cheapest tier.

Rule of thumb: if the workload is random and small-block (databases, busy filesystems), it is an IOPS problem -> SSD (gp3/io2). If it is large and sequential (log ingestion, analytics scans), it is a throughput problem -> consider st1, but measure, because a well-provisioned gp3 at 1,000 MiB/s often wins on latency.

Picking by the numbers — a decision table

When the workload is described in plain terms, this maps it to a type without debate:

If the workload is…	It’s probably…	Provision	Why
Boot/root volume, mixed app I/O	General-purpose	`gp3` (3,000/125 default)	Cheapest sane default; bootable
OLTP DB, random 4–16 KiB, < 16,000 IOPS	IOPS-bound, moderate	`gp3` with raised IOPS	Decoupled IOPS, fraction of io2 cost
OLTP DB needing > 16,000 IOPS or sub-ms p99	IOPS-bound, extreme	`io2` Block Express	Only type that exceeds gp3’s ceiling
Volume > 16 TiB with high IOPS	Large + high-IOPS	`io2` Block Express	gp3 caps at 16 TiB / 16,000 IOPS
Log/stream ingestion, large sequential writes	Throughput-bound	`st1` (or gp3 @ 1,000)	HDD cheap for sequential; measure latency
Cold archive on a block device, rare reads	Cost-floor	`sc1`	Lowest $/GiB block tier
Shared across many instances / AZs	File, not block	EFS (not EBS)	EBS is single-AZ, single-attach by default

What each type costs you to get wrong

The mis-picks I see most, and what they cost:

Mistake	Looks like	Actual cost	Correct move
`gp2` on a new system	“It’s always worked”	Pays more for less; IOPS coupled to size	Migrate to `gp3` (online)
`io2` for a `gp3` workload	Over-engineered DB volume	3–5× the price for unused ceiling	Right-size to `gp3` with provisioned IOPS
`st1`/`sc1` for random I/O	Terrible DB latency	HDD seeks kill small random ops	SSD (`gp3`/`io2`)
`gp3` left at 3,000/125 default	“Why is it slow?”	Throttled to baseline despite headroom	Raise provisioned IOPS/throughput
HDD as a boot volume	Won’t boot	Hard failure	`gp3` for root

Decoupling IOPS, throughput, and capacity

The single most useful property of gp3 and io2 is that the three dimensions are separately provisionable. On gp2, IOPS scaled with size (3 IOPS/GiB), so you used to oversize a volume just to buy performance. That coupling is gone.

gp3 baseline is 3,000 IOPS and 125 MiB/s at any size, and you provision above that up to 16,000 IOPS and 1,000 MiB/s. The throughput ceiling you can buy also scales with provisioned IOPS — you get up to 0.25 MiB/s per IOPS, so 1,000 MiB/s requires at least 4,000 provisioned IOPS.

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 200    # GiB, sized for capacity only
  type              = "gp3"
  iops              = 8000   # decoupled from size
  throughput        = 500    # MiB/s, decoupled from size
  encrypted         = true
  kms_key_id        = aws_kms_key.ebs.arn
}

For io2, you provision IOPS directly, bounded by a ratio of IOPS to capacity (up to 1,000 IOPS/GiB), and Block Express raises the per-volume ceiling to 256,000 IOPS and 4,000 MiB/s:

resource "aws_ebs_volume" "oltp" {
  availability_zone = "us-east-1a"
  size              = 500
  type              = "io2"      # Block Express on supported Nitro instances
  iops              = 64000      # within the 1,000 IOPS/GiB ratio (500 GiB allows 500k, but the per-volume cap is 256,000)
  encrypted         = true
}
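For io2 the ratio and the per-volume cap interact, and the smaller of the two wins. A small sketch using the figures quoted in this section (1,000 IOPS/GiB, a 256,000 IOPS Block Express ceiling, a 4 GiB minimum size); the function names are illustrative only:

```python
import math

IO2_IOPS_PER_GIB = 1000
IO2_BLOCK_EXPRESS_MAX_IOPS = 256_000
IO2_MIN_SIZE_GIB = 4


def io2_max_iops(size_gib):
    """Highest IOPS you can provision at this size: ratio, then per-volume cap."""
    return min(size_gib * IO2_IOPS_PER_GIB, IO2_BLOCK_EXPRESS_MAX_IOPS)


def io2_min_size_gib(iops):
    """Smallest volume that satisfies the IOPS:capacity ratio for a target."""
    return max(IO2_MIN_SIZE_GIB, math.ceil(iops / IO2_IOPS_PER_GIB))


print(io2_max_iops(500))        # 256000: the ratio would allow 500,000, the cap wins
print(io2_max_iops(64))         # 64000: here the ratio is the binding term
print(io2_min_size_gib(64000))  # 64
```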

The dimensions and their ratios, side by side

Every provisionable dimension, its range, its default, and the ratio that bounds it:

Dimension	gp3 range	gp3 default	io2 range	Ratio / bound	Gotcha
Capacity (size)	1 GiB – 16 TiB	(you set)	4 GiB – 64 TiB	io2: IOPS ≤ 1,000 × GiB	Shrinking size is not supported online
Provisioned IOPS	3,000 – 16,000	3,000	100 – 256,000 (Block Express)	gp3: ≤ 500 IOPS/GiB	Above 16,000 needs io2, not gp3
Provisioned throughput	125 – 1,000 MiB/s	125	up to 4,000 MiB/s (Block Express)	gp3: ≤ 0.25 MiB/s per IOPS	1,000 MiB/s needs ≥ 4,000 IOPS
Throughput-per-IOPS	derived	derived	derived	gp3 hard rule	Buying MiB/s without IOPS is rejected
Durability	99.8–99.9%	—	99.999%	—	io2 is the durability tier

The gp3 throughput-per-IOPS trap

This catches people who raise throughput without raising IOPS. To buy a given throughput on gp3, you need at least throughput / 0.25 provisioned IOPS:

Target throughput	Minimum gp3 IOPS required	Why
125 MiB/s (baseline)	500 (covered by 3,000 baseline)	Free with baseline
250 MiB/s	1,000 (covered by 3,000 baseline)	Within baseline IOPS
500 MiB/s	2,000 (covered by 3,000 baseline)	Within baseline IOPS
750 MiB/s	3,000	Exactly at baseline IOPS
1,000 MiB/s	4,000	Must raise IOPS above baseline
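The rows above follow from one division, and the same rule can be used to pre-check a gp3 configuration before submitting it. This sketch encodes only the limits stated in this article (3,000 to 16,000 IOPS, 125 to 1,000 MiB/s, 0.25 MiB/s per IOPS, 500 IOPS/GiB above baseline); the function names are hypothetical and the service remains the authority on what it accepts.

```python
import math

GP3_BASE_IOPS, GP3_MAX_IOPS = 3000, 16000
GP3_BASE_MIBPS, GP3_MAX_MIBPS = 125, 1000
MIBPS_PER_IOPS = 0.25
MAX_IOPS_PER_GIB = 500


def min_iops_for_throughput(mibps):
    """IOPS you must provision to be allowed to buy this throughput."""
    return max(GP3_BASE_IOPS, math.ceil(mibps / MIBPS_PER_IOPS))


def check_gp3(size_gib, iops, mibps):
    """Return a list of problems; an empty list means the combination is plausible."""
    problems = []
    if not GP3_BASE_IOPS <= iops <= GP3_MAX_IOPS:
        problems.append("iops outside 3,000-16,000")
    if iops > GP3_BASE_IOPS and iops > MAX_IOPS_PER_GIB * size_gib:
        problems.append("iops exceeds 500 per GiB")
    if not GP3_BASE_MIBPS <= mibps <= GP3_MAX_MIBPS:
        problems.append("throughput outside 125-1,000 MiB/s")
    if mibps > MIBPS_PER_IOPS * iops:
        problems.append(f"throughput needs >= {min_iops_for_throughput(mibps)} IOPS")
    return problems


print(min_iops_for_throughput(1000))  # 4000
print(check_gp3(200, 8000, 500))      # []
print(check_gp3(200, 3000, 1000))     # ['throughput needs >= 4000 IOPS']
```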

Modifying a volume online with Elastic Volumes

Elastic Volumes lets you modify a volume in place while it stays attached, with no detach and no downtime:

aws ec2 modify-volume \
  --volume-id vol-0abc123 \
  --volume-type gp3 \
  --iops 10000 \
  --throughput 700

# Watch the modification progress; the volume stays attached and usable
aws ec2 describe-volumes-modifications --volume-id vol-0abc123 \
  --query 'VolumesModifications[0].[ModificationState,Progress]' --output text

Two operational caveats that bite people: a modification passes through an optimizing state before it completes, during which performance sits somewhere between the old and new settings, and a given volume can only be modified once every 6 hours. Plan changes; don’t thrash them. After a size increase you must also grow the partition and filesystem inside the OS. The block device is bigger, but the filesystem doesn’t know until you tell it:

sudo growpart /dev/nvme0n1 1      # extend the partition to fill the device
sudo xfs_growfs -d /             # xfs: grow to the partition
# (ext4 equivalent: sudo resize2fs /dev/nvme0n1p1)

The Elastic Volumes operations, what is online, and the constraints:

Operation	Online?	Reversible?	Constraint	After-step required
Change type (gp2→gp3, gp3→io2)	Yes	Yes (with cooldown)	6 h between modifications	None
Raise IOPS	Yes	Yes	6 h cooldown; `optimizing` state	None
Raise throughput (gp3)	Yes	Yes	Needs IOPS to back it	None
Grow size	Yes	No (cannot shrink)	6 h cooldown	`growpart` + `xfs_growfs`/`resize2fs`
Shrink size	Not supported	—	Must create new + copy	Migrate data
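When changes are scripted, the cooldown in the table above is easy to trip over. A minimal sketch of a guard, under two labeled assumptions: the 6-hour clock is counted from the time of the previous modification request, and a modification that has not yet reached the completed state blocks a new one. The function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MODIFICATION_COOLDOWN = timedelta(hours=6)  # per-volume cooldown cited above


def next_modification_time(last_modified_at):
    """Earliest time the same volume can be modified again (assumed clock start)."""
    return last_modified_at + MODIFICATION_COOLDOWN


def can_modify(last_modified_at, now, modification_state="completed"):
    # Assumption: "modifying" or "optimizing" means the previous change is still running.
    if modification_state != "completed":
        return False
    return now >= next_modification_time(last_modified_at)


last = datetime(2026, 1, 10, 9, 0, tzinfo=timezone.utc)  # hypothetical timestamp
print(next_modification_time(last).isoformat())
print(can_modify(last, datetime(2026, 1, 10, 12, 0, tzinfo=timezone.utc)))
print(can_modify(last, datetime(2026, 1, 10, 15, 0, tzinfo=timezone.utc)))
```

A deployment pipeline could call a check like this before issuing modify-volume, so that a second change is queued rather than rejected.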

The instance bandwidth ceiling

This is the section that saves the most money. A volume’s provisioned numbers are a maximum the volume can do — the instance imposes its own EBS bandwidth and IOPS limits, and those are usually lower. AWS publishes per-instance “EBS-optimized” limits: a baseline and a 30-minute burst (on smaller sizes), measured at a 16 KiB block size.

Concretely: an m6i.large tops out around 10,000 IOPS and 4,750 Mbps (~594 MiB/s) of dedicated EBS bandwidth. Attaching a single gp3 provisioned for 16,000 IOPS and 1,000 MiB/s to that instance is wasted spend — the instance caps you at roughly 60% of the throughput and 62% of the IOPS you paid for. The fix is to size the instance to the storage need, or aggregate volumes when the instance has headroom.
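The percentages in the paragraph above can be reproduced directly. The sketch below uses the figures quoted there as inputs and follows this article's convention of dividing Mbps by 8 to get the MiB/s figure; the helper names are illustrative.

```python
def mbps_to_mib_s(mbps):
    """Article convention: megabits per second divided by 8."""
    return mbps / 8


def delivered(volume_limit, instance_limit):
    """Return (achieved, fraction of the provisioned volume number you can use)."""
    achieved = min(volume_limit, instance_limit)
    return achieved, achieved / volume_limit


instance_mib_s = mbps_to_mib_s(4750)  # 593.75
throughput, throughput_share = delivered(1000, instance_mib_s)
iops, iops_share = delivered(16000, 10000)

print(f"throughput: {throughput:.0f} MiB/s ({throughput_share:.0%} of provisioned)")
print(f"IOPS: {iops} ({iops_share:.1%} of provisioned)")
```

Whatever share is left over is provisioned capacity that the attached instance cannot consume.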

Check the limits before you provision the volume:

# No backslashes inside the single-quoted query: the shell would pass them through literally
aws ec2 describe-instance-types \
  --instance-types m6i.large m6i.4xlarge \
  --query 'InstanceTypes[].{type:InstanceType,
     baseIOPS:EbsInfo.EbsOptimizedInfo.BaselineIops,
     burstIOPS:EbsInfo.EbsOptimizedInfo.MaximumIops,
     baseMBps:EbsInfo.EbsOptimizedInfo.BaselineThroughputInMBps,
     burstMBps:EbsInfo.EbsOptimizedInfo.MaximumThroughputInMBps}' \
  --output table

Smaller instances get an unlimited-duration baseline plus a burst bucket; the larger sizes in a family deliver their maximum continuously. If your workload is sustained (a busy database), size against the baseline, not the burst, or you will fall off a cliff after 30 minutes. On modern instances EBS optimization is on by default and not billable; on older types you may still need --ebs-optimized.

Representative instance EBS limits (general-purpose families)

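Treat these rows as illustrative of the shape (baseline scales with size; smaller sizes burst), not as authoritative values. Published limits differ by family and generation, and on smaller sizes the baseline can sit well below the burst maximum, so confirm every row with describe-instance-types for your Region and family before you size against it: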

Instance	EBS baseline (Mbps)	EBS baseline (MiB/s)	Baseline IOPS	Bursts?	What a 1,000 MiB/s volume gets
`m6i.large`	4,750	~594	10,000	Yes (30 min)	~594 MiB/s sustained (capped)
`m6i.xlarge`	6,000	~750	20,000	Yes (30 min)	~750 MiB/s sustained (capped)
`m6i.2xlarge`	10,000	~1,250	40,000	Yes (30 min)	Full 1,000 MiB/s (headroom)
`m6i.4xlarge`	10,000	~1,250	40,000	No (sustained)	Full 1,000 MiB/s (headroom)
`m6i.8xlarge`	10,000	~1,250	40,000	No	Full 1,000 MiB/s
`m6i.16xlarge`	20,000	~2,500	80,000	No	Full + room to stripe
`r6i.2xlarge`	10,000	~1,250	40,000	No	Full 1,000 MiB/s
`r5.2xlarge`	4,750	~594	up to 18,750	Yes (30 min)	~594 MiB/s (capped — the scenario)
`c6i.4xlarge`	10,000	~1,250	40,000	No	Full 1,000 MiB/s
`m6i.metal` / `.32xlarge`	40,000	~5,000	100,000	No	Stripe many volumes
`r6id.32xlarge`	80,000	~10,000	260,000	No	io2 Block Express headroom

A second reading of that table: the family sets the per-vCPU ratio, but the size sets whether you burst or run at the maximum continuously. Map a storage demand to the smallest instance that meets it on the baseline:

Sustained storage demand	Smallest instance that meets baseline	Don’t pick	Why
≤ 600 MiB/s, ≤ 10k IOPS	`m6i.large` (~594) — or one size up for margin	smaller bursting size for a 24/7 DB	Burst ends after 30 min
~750 MiB/s	`m6i.xlarge` (~750)	`m6i.large` (caps ~594)	Volume would be throttled
~1,000 MiB/s	`m6i.2xlarge` / `.4xlarge` (~1,250)	anything ≤ `m6i.xlarge`	Need headroom above 1,000
~2,000+ MiB/s (striped)	`m6i.16xlarge` (~2,500)	mid sizes	Stripe needs instance headroom
> 4,000 MiB/s, > 100k IOPS	`.metal` / `r6id.32xlarge`	general sizes	Only big sizes + io2 reach this

Baseline vs burst — the cliff that bites sustained workloads

The distinction that turns a passing benchmark into a 3am incident:

Aspect	Baseline	Burst
Duration	Unlimited	~30 minutes per 24 h (credit-based)
Which sizes get it	Every size	Smaller sizes only; larger sizes deliver their maximum continuously
Sustained DB workload	Size against this	Ignore; you’ll fall off after 30 min
Short batch / spiky	Always available	Fine to lean on within the credit window
Symptom of relying on it	None; performance is steady	Fast for 30 min, then throttled, with a latency spike at the half-hour mark
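The sizing rule in the table can be applied mechanically to the EbsOptimizedInfo structure that describe-instance-types returns. The field names below are the ones used in the CLI query earlier; the sample values are hypothetical and exist only to exercise the function.

```python
def sizing_limits(ebs_optimized_info, sustained):
    """Pick the limits to size against: baseline if sustained, maximum if spiky."""
    prefix = "Baseline" if sustained else "Maximum"
    return {
        "iops": ebs_optimized_info[prefix + "Iops"],
        "mb_per_s": ebs_optimized_info[prefix + "ThroughputInMBps"],
    }


def has_burst(ebs_optimized_info):
    """True when the maximum exceeds the baseline, i.e. the size relies on burst."""
    return (
        ebs_optimized_info["MaximumIops"] > ebs_optimized_info["BaselineIops"]
        or ebs_optimized_info["MaximumThroughputInMBps"]
        > ebs_optimized_info["BaselineThroughputInMBps"]
    )


# Hypothetical values for a small instance size, not taken from any real type.
info = {
    "BaselineIops": 6000,
    "MaximumIops": 20000,
    "BaselineThroughputInMBps": 156.25,
    "MaximumThroughputInMBps": 1250.0,
}

print(has_burst(info))
print(sizing_limits(info, sustained=True))   # what a 24/7 database should assume
print(sizing_limits(info, sustained=False))  # what a short batch can lean on
```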

Striping to beat the single-volume ceiling

When one instance has bandwidth headroom but a single volume’s per-volume ceiling is the limit, stripe. A RAID 0 across N gp3 volumes multiplies the volume ceilings — up to the instance limit:

# Two gp3 volumes, each provisioned for high throughput, striped
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/nvme1n1 /dev/nvme2n1
sudo mkfs.xfs /dev/md0
sudo mount /dev/md0 /data

RAID 0 gives no redundancy, so rely on EBS’s own durability and snapshots. Know also that per-volume snapshots of a striped set are not crash-consistent across members unless you freeze the filesystem first or take a multi-volume snapshot of the instance (aws ec2 create-snapshots). When striping helps and when it doesn’t:

Situation	Stripe?	Why
Need > 1,000 MiB/s, instance allows > that	Yes	Aggregate N gp3 volumes to instance limit
Need > 16,000 IOPS, but io2 too costly	Sometimes	N gp3 volumes can exceed one gp3’s IOPS
Instance baseline already the cap	No	Striping can’t exceed the instance ceiling
Single volume already meets demand	No	Adds complexity + risk for nothing
Need redundancy at the volume layer	No (not RAID 0)	RAID 0 = zero redundancy; rely on snapshots
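The stripe arithmetic behind that table is another min(): N volumes add their ceilings until the instance limit takes over. A short sketch with illustrative function names and the per-volume and instance figures used in this section:

```python
import math


def striped_ceiling(n_volumes, per_volume_limit, instance_limit):
    """Aggregate ceiling of a RAID 0 set, capped by the instance."""
    return min(n_volumes * per_volume_limit, instance_limit)


def volumes_needed(target, per_volume_limit, instance_limit):
    """Volumes required to reach a target, or None if the instance cannot get there."""
    if target > instance_limit:
        return None  # striping cannot exceed the instance ceiling
    return math.ceil(target / per_volume_limit)


# gp3 volumes at 1,000 MiB/s each on an instance that allows ~2,500 MiB/s.
print(striped_ceiling(2, 1000, 2500))    # 2000
print(striped_ceiling(3, 1000, 2500))    # 2500: the third volume adds only 500
print(volumes_needed(2000, 1000, 2500))  # 2
print(volumes_needed(3000, 1000, 2500))  # None
```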

Multi-Attach, Fast Snapshot Restore, and snapshot lifecycle

Multi-Attach lets a single io2 (or io1) volume attach to up to 16 Nitro instances in the same AZ concurrently. It is not a magic shared disk — it provides no coordination. You must run a cluster-aware filesystem (GFS2, OCFS2) or an application that arbitrates writes; mounting xfs/ext4 read-write on two instances corrupts the volume. Use it for clustered, fence-aware software, not as a poor man’s EFS.

Fast Snapshot Restore (FSR) removes the lazy-load penalty. Normally a volume restored from a snapshot loads blocks from S3 on first touch, so the first read of each block is slow. FSR pre-initializes the volume so it delivers full provisioned performance immediately — essential for golden-image boot volumes and for restoring large data volumes into service quickly. It is billed per AZ per hour while enabled.

aws ec2 enable-fast-snapshot-restores \
  --availability-zones us-east-1a us-east-1b \
  --source-snapshot-ids snap-0abc123

When to reach for each of these features

The three features here solve different problems and are mutually independent:

Feature	Solves	Use when	Hard rule / limit	Cost shape
Multi-Attach (io2/io1)	One volume, many readers/writers	Clustered, fence-aware software	Up to 16 Nitro instances, same AZ; cluster FS only	Volume cost only
Fast Snapshot Restore	Slow first-touch after restore	Golden images, time-critical DR restores	Billed per AZ per hour while enabled	Hourly per AZ + per snapshot
Data Lifecycle Manager	Manual snapshot scripts	Any scheduled backup + retention	Policy-driven; tag-targeted	No charge for DLM itself

Automating snapshots with Data Lifecycle Manager

Automate retention with Data Lifecycle Manager rather than cron jobs and Lambda glue. A policy that snapshots nightly, keeps 14, and copies to a DR Region:

{
  "ResourceTypes": ["VOLUME"],
  "TargetTags": [{ "Key": "Backup", "Value": "daily" }],
  "Schedules": [
    {
      "Name": "daily-14d",
      "CreateRule": { "Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"] },
      "RetainRule": { "Count": 14 },
      "CopyTags": true,
      "CrossRegionCopyRules": [
        {
          "TargetRegion": "us-west-2",
          "Encrypted": true,
          "CmkArn": "arn:aws:kms:us-west-2:111122223333:key/abcd-1234",
          "RetainRule": { "Interval": 14, "IntervalUnit": "DAYS" }
        }
      ]
    }
  ]
}

EBS snapshots are incremental and block-level: only changed blocks since the last snapshot are stored, so frequent snapshots are cheap. Deleting an old snapshot never breaks a newer one — AWS re-references the blocks the newer snapshot still needs. The snapshot facts that govern cost and recovery:

Property	Behaviour	Implication
Incremental	Only changed blocks since last snapshot stored	Frequent snapshots are cheap
Deletion safety	Newer snapshots keep blocks they need	Deleting an old snapshot never breaks a newer one
First restore (no FSR)	Blocks lazy-loaded from S3 on first touch	First read is slow; not the steady-state number
FSR enabled	Volume pre-initialized	Full performance on first touch
Cross-Region copy	Re-encrypts with target-Region CMK	DR copies need a key in the target Region
Crash consistency (striped set)	Not consistent across members unless frozen	Freeze the filesystem before snapshotting a RAID set

EFS performance modes, throughput modes, and elastic throughput

EFS is NFSv4.1, multi-AZ, and grows automatically. Its performance is governed by two orthogonal settings that people routinely confuse.

Performance mode (set at creation, immutable):

General Purpose — lowest per-operation latency. The right default; required for latency-sensitive and most interactive workloads. Use this unless proven otherwise.
Max I/O — higher aggregate throughput and IOPS by trading away latency. AWS now steers nearly everyone to General Purpose with Elastic throughput; Max I/O is a legacy choice for massively parallel, latency-tolerant jobs.

Throughput mode (changeable, subject to a cooldown):

Elastic — throughput scales automatically with demand, up to high regional limits (on the order of GiB/s for reads), and you pay only for the data transferred. This is the default I recommend for spiky or unpredictable workloads; no provisioning, no cliffs.
Provisioned — you set a fixed throughput independent of stored size. Use it when you have a steady, known high throughput need on a small filesystem, where Elastic’s per-GB transfer pricing would cost more.
Bursting — throughput scales with stored data (baseline 50 KiB/s per GiB) and earns burst credits. Cheap, but a small filesystem starves; this is why so many EFS performance complaints trace back to a near-empty Bursting filesystem that ran out of credits.
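The Bursting baseline is a straight multiplication, which makes the starvation easy to see in numbers. This sketch uses only the 50 KiB/s per GiB rate stated above; the function names are made up for the example.

```python
BASELINE_KIB_S_PER_GIB = 50
KIB_PER_MIB = 1024


def bursting_baseline_mib_s(stored_gib):
    """Baseline throughput a Bursting filesystem earns from its stored data."""
    return stored_gib * BASELINE_KIB_S_PER_GIB / KIB_PER_MIB


def gib_needed_for_baseline(target_mib_s):
    """Stored data required before the baseline alone covers a target throughput."""
    return target_mib_s * KIB_PER_MIB / BASELINE_KIB_S_PER_GIB


print(bursting_baseline_mib_s(100))   # 4.8828125, i.e. roughly 5 MiB/s
print(bursting_baseline_mib_s(1024))  # 50.0 for 1 TiB
print(gib_needed_for_baseline(50))    # 1024.0 GiB
```

A 100 GiB filesystem that needs a steady 50 MiB/s would have to store about ten times more data to earn it, which is the case for Elastic or Provisioned instead.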

resource "aws_efs_file_system" "shared" {
  encrypted        = true
  performance_mode = "generalPurpose"
  throughput_mode  = "elastic"     # scales automatically, pay-per-use

  # One transition per lifecycle_policy block
  lifecycle_policy {
    transition_to_ia = "AFTER_30_DAYS"
  }

  lifecycle_policy {
    transition_to_primary_storage_class = "AFTER_1_ACCESS"
  }
}

Performance mode — the immutable choice

You set this once at creation and cannot change it later; choose deliberately:

Performance mode	Latency	Aggregate ceiling	Choose when	Cannot change later
General Purpose	Lowest per-op	High (paired with Elastic)	Default; interactive, latency-sensitive, most workloads	Correct
Max I/O	Higher per-op	Highest aggregate IOPS	Legacy: massively parallel, latency-tolerant batch	Correct

Throughput mode — the changeable choice

This you can change, but decreases and mode switches are rate-limited (roughly a day cooldown):

Throughput mode	How throughput is set	You pay for	Best for	Failure mode
Elastic	Auto-scales with demand	Data transferred (per GB)	Spiky / unpredictable; the default	Per-GB cost adds up on steady very-high load
Provisioned	Fixed MiB/s you set	Provisioned MiB/s (whether used or not)	Steady, known high throughput on a small FS	Paying for headroom you don’t use
Bursting	Scales with stored data (50 KiB/s/GiB) + credits	Storage only	Large filesystems with bursty access	Near-empty FS starves when credits run out

Choosing between the three modes is a function of size, access shape, and steadiness. This decision table resolves it:

If the filesystem is…	And access is…	Choose	Why
Small (< 1 TiB)	Spiky / unpredictable	Elastic	No baseline cliff; pay per GB
Small (< 1 TiB)	Steady, known high throughput	Provisioned	Fixed MiB/s cheaper than per-GB at steady high load
Large (> 5 TiB)	Bursty	Bursting	Baseline (50 KiB/s/GiB) is already large; cheapest
Any size	Unknown / changing	Elastic	Safe default; auto-scales, no provisioning
Near-empty	Anything	Elastic (never Bursting)	Bursting starves with almost no baseline
Very large, steady max	Sustained ceiling	Provisioned (if cheaper than Elastic per-GB)	Compare metered Elastic cost vs flat Provisioned
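The decision table can also be read as a small function. This is a sketch in Python, not an AWS API: the function name, the access-shape labels, and the 10 GiB cutoff for "near-empty" are assumptions made for illustration, and sizes the table leaves unspecified (between 1 and 5 TiB) fall through to Elastic as the safe default.

```python
GIB_PER_TIB = 1024.0
NEAR_EMPTY_GIB = 10.0  # assumed cutoff; the table only says "near-empty"


def choose_efs_throughput_mode(stored_gib: float, access: str) -> str:
    """Pick a throughput mode following the decision table.

    access is one of "spiky", "steady", "bursty", "unknown" (hypothetical labels).
    """
    if access not in {"spiky", "steady", "bursty", "unknown"}:
        raise ValueError(f"unknown access shape: {access}")
    if stored_gib < NEAR_EMPTY_GIB:
        return "elastic"  # never Bursting on a near-empty filesystem
    if access == "steady":
        return "provisioned"  # still compare against metered Elastic cost
    if access == "bursty" and stored_gib > 5 * GIB_PER_TIB:
        return "bursting"  # baseline is already large at this size
    return "elastic"  # safe default for everything else


print(choose_efs_throughput_mode(500, "steady"))
```

Treat the result as a starting point: the Provisioned rows in particular only hold once you have compared the flat Provisioned price against the metered Elastic bill for your own traffic.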

Switching to Provisioned for a steady high-throughput job:

aws efs update-file-system \
  --file-system-id fs-0abc123 \
  --throughput-mode provisioned \
  --provisioned-throughput-in-mibps 256

Throughput-mode changes and decreases in provisioned throughput are rate-limited (you can raise it, but reducing it or switching modes has a cooldown of roughly a day), so don’t treat it as an autoscaling knob.

Why Bursting starves — the math

Bursting baseline is 50 KiB/s per GiB stored. A small filesystem has a tiny baseline and survives only on credits; once they’re gone it crawls. This table is the single most useful EFS diagnostic:

Stored data	Baseline throughput	Burst throughput (while credits last)	Verdict on Bursting
100 GiB	~5 MiB/s	~100 MiB/s	Starves fast; use Elastic
500 GiB	~25 MiB/s	~100 MiB/s	Marginal; Elastic safer
1 TiB	~50 MiB/s	~100 MiB/s	Workable if access is bursty
10 TiB	~500 MiB/s	higher	Bursting genuinely cheap and adequate
Empty / near-empty	near zero	drains immediately	The classic “EFS is slow” ticket
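The baseline column is plain arithmetic, and so is the time a credit balance lasts. The Python sketch below assumes a simplified model in which credits drain at the rate by which demand exceeds baseline; the function names are hypothetical, and EFS metering details (for example, how reads are counted) are ignored.

```python
BASELINE_KIB_PER_GIB = 50  # Bursting baseline quoted in this section
BYTES_PER_MIB = 1024 * 1024


def bursting_baseline_mibps(stored_gib: float) -> float:
    """Baseline throughput in MiB/s for a Bursting filesystem of this size."""
    return stored_gib * BASELINE_KIB_PER_GIB / 1024


def seconds_until_credits_exhausted(credit_bytes: float,
                                    demand_mibps: float,
                                    stored_gib: float):
    """Time until the credit balance reaches zero, or None if it never drains.

    Simplified model: credits drain at (demand - baseline).
    """
    drain_mibps = demand_mibps - bursting_baseline_mibps(stored_gib)
    if drain_mibps <= 0:
        return None
    return credit_bytes / (drain_mibps * BYTES_PER_MIB)


for gib in (100, 500, 1024, 10240):
    print(gib, "GiB ->", round(bursting_baseline_mibps(gib), 1), "MiB/s baseline")
```

Feed `seconds_until_credits_exhausted` the current `BurstCreditBalance` value from CloudWatch and your expected demand to estimate how long a small filesystem survives before it drops to baseline.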

EFS storage classes, lifecycle, and access points

EFS has Standard and Infrequent Access (IA) classes (plus One Zone variants for single-AZ cost savings). Lifecycle management moves files between Standard and IA based on access age; the transition_to_primary_storage_class = "AFTER_1_ACCESS" rule above promotes a file back to Standard the moment it is read again, which avoids the IA per-access read charge punishing hot files that aged out. For most shared filesystems IA cuts storage cost substantially with negligible behavioral change, because access is Pareto-distributed.

The EFS storage classes side by side

Storage class	Durability scope	$/GiB (relative)	Access charge	Use for
Standard	Multi-AZ	Baseline	None	Hot, frequently-read files
Standard-IA	Multi-AZ	Much lower	Per-GB read fee	Cold files in a multi-AZ FS
One Zone	Single-AZ	Lower than Standard	None	Reproducible / non-critical data
One Zone-IA	Single-AZ	Lowest	Per-GB read fee	Cold + reproducible

Lifecycle transition rules

The transition knobs and what each does:

Lifecycle setting	Values	Effect	When to use
`transition_to_ia`	AFTER_1/7/14/30/60/90_DAYS	Demote untouched files to IA after N days	Almost always; big storage savings
`transition_to_primary_storage_class`	AFTER_1_ACCESS	Promote a file back to Standard on read	Avoid repeated IA read fees on re-hot files
(no lifecycle)	—	Everything stays Standard	Only if all data is uniformly hot

Access points are the right way to hand EFS to multiple applications or containers. Each enforces a POSIX identity and a root directory, so an app physically cannot see another tenant’s files:

resource "aws_efs_access_point" "app_a" {
  file_system_id = aws_efs_file_system.shared.id

  posix_user {
    uid = 1000
    gid = 1000
  }

  root_directory {
    path = "/app-a"
    creation_info {
      owner_uid   = 1000
      owner_gid   = 1000
      permissions = "0750"
    }
  }
}

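Pair access points with a filesystem policy. The statement below enforces the TLS half by denying every action that arrives over an unencrypted connection; IAM authorization is layered on with Allow statements scoped to specific principals, so a leaked mount target is useless without credentials: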

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Deny",
    "Principal": { "AWS": "*" },
    "Action": "*",
    "Resource": "*",
    "Condition": { "Bool": { "aws:SecureTransport": "false" } }
  }]
}

Mount with the EFS helper so encryption-in-transit and the access point are wired correctly:

sudo mount -t efs -o tls,accesspoint=fsap-0abc123 fs-0abc123:/ /mnt/app-a

EFS mount options that affect performance and safety

The mount flags that matter, and what each buys:

Mount option	What it does	Default	When to set
`tls`	Encryption in transit via stunnel	Off (helper adds it)	Always in production
`accesspoint=fsap-...`	Enforce POSIX root + identity	None	Multi-tenant / per-app isolation
`iam`	Authenticate the mount with IAM	Off	When filesystem policy requires IAM
`nconnect=N`	Multiple TCP connections per mount (generic NFS client option)	Not set	Leave unset: EFS does not support `nconnect`; add parallelism with more threads or clients instead
`noresvport`	Reconnect on a new port after a blip	On (helper)	Resilience across network events
`_netdev` (fstab)	Wait for network before mount	—	Boot-time mounts in `/etc/fstab`

Benchmarking with fio and interpreting results

Never trust the spec sheet — measure the path you actually run. fio is the tool. Match the block size and pattern to your workload: 16 KiB random for database-like I/O, large sequential for streaming.

Random read IOPS (database-style), with O_DIRECT to bypass the page cache so you measure the device, not RAM:

sudo fio --name=randread --filename=/data/fiotest --direct=1 \
  --rw=randread --bs=16k --iodepth=64 --numjobs=4 --group_reporting \
  --size=10G --runtime=120 --time_based --ioengine=libaio

Sequential throughput (analytics/log-streaming style):

sudo fio --name=seqread --filename=/data/fiotest --direct=1 \
  --rw=read --bs=1M --iodepth=32 --numjobs=2 --group_reporting \
  --size=20G --runtime=120 --time_based --ioengine=libaio

Match the fio profile to your real workload

The number one benchmarking error is testing a pattern the application never runs. Map the workload to the right block size, pattern, and queue depth before you draw any conclusion:

Workload	Pattern (`--rw`)	Block size (`--bs`)	iodepth	numjobs	Limit it stresses
OLTP database (random reads)	`randread`	8k–16k	32–64	4–8	IOPS
OLTP database (mixed)	`randrw` (70/30)	8k–16k	32–64	4–8	IOPS + fsync latency
Log / stream ingestion (writes)	`write`	1M	16–32	2–4	Throughput
Analytics scan (sequential reads)	`read`	1M	32	2–4	Throughput
Boot / small mixed	`randrw`	4k	16	1–2	Latency
Latency probe (single op)	`randread`	4k	1	1	p50/p99 latency
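The "Limit it stresses" column follows from one multiplication: IOPS times block size is the bandwidth a profile can reach before the IOPS ceiling stops it. A minimal Python sketch, with a hypothetical function name, that says which of the two volume ceilings a given block size reaches first (it ignores any splitting of large I/Os by the storage layer, so treat large-block results as approximate):

```python
def binding_ceiling(block_kib: float, vol_iops: float, vol_mibps: float):
    """Return which volume ceiling a block size hits first, and the bandwidth there."""
    iops_bound_mibps = vol_iops * block_kib / 1024
    if iops_bound_mibps <= vol_mibps:
        return ("iops", iops_bound_mibps)
    return ("throughput", vol_mibps)


# gp3 defaults from this article: 3,000 IOPS and 125 MiB/s.
print(binding_ceiling(16, 3000, 125))    # 16 KiB random I/O
print(binding_ceiling(1024, 3000, 125))  # 1 MiB sequential I/O
```

On the gp3 defaults, a 16 KiB random profile runs out of IOPS at about 47 MiB/s, well short of the 125 MiB/s throughput ceiling, while a 1 MiB sequential profile reaches the throughput ceiling long before it runs out of IOPS. That is why one fio run cannot show both limits.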

The fio knobs and why each matters

Getting these wrong is how you “prove” a volume is slow when your test was the bottleneck:

fio flag	What it controls	Set it to	If wrong you measure
`--direct=1`	Bypass the OS page cache (O_DIRECT)	Always 1 for device tests	RAM, not the volume
`--bs`	Block size	4–16k random (DB); 1M sequential	The wrong workload’s profile
`--rw`	Pattern	`randread`/`randwrite`/`read`/`write`/`randrw`	A pattern your app never runs
`--iodepth`	Outstanding I/Os per job	Deep (32–64) to saturate	Under-driven device (looks slow)
`--numjobs`	Parallel worker threads	Match cores / concurrency	Single-threaded ceiling, not the volume’s
`--runtime` + `--time_based`	Duration	≥ 120 s to ride past burst	A burst window, not steady state
`--ioengine`	I/O submission path	`libaio` on Linux	A slower engine’s overhead

Reading the output

IOPS — compare against min(volume provisioned IOPS, instance IOPS limit). If you fall short of both, the bottleneck is elsewhere (filesystem, single-threaded I/O, too-shallow queue depth).
bw (bandwidth) — compare against min(volume throughput, instance EBS bandwidth). Hitting the instance number and not the volume’s confirms you are instance-bound; that’s your signal to resize the instance, not the volume.
clat / lat percentiles — gp3 typically lands around single-digit-millisecond latency; io2 Block Express is sub-millisecond. A p99 far above the median under load means queueing — usually iodepth or numjobs higher than the device can absorb. Latency is the metric users feel; watch the percentiles, not the average.

What each fio number tells you and what to do next:

fio metric	Compare against	If you hit the volume number	If you hit the instance number	If you hit neither
`IOPS`	`min(vol IOPS, instance IOPS)`	Raise volume IOPS or stripe	Resize the instance	Deeper iodepth/numjobs; check FS/app
`bw` (MiB/s)	`min(vol throughput, instance EBS bw)`	Raise volume throughput or stripe	Resize the instance	Larger block size; more parallel jobs
`clat`/`lat` p50	gp3 ~single-digit ms; io2 sub-ms	Expected; healthy	n/a	Investigate FS / fsync / network
`clat`/`lat` p99	Should track p50 under healthy load	Queueing — lower iodepth	Queueing at the instance cap	Outliers — noisy neighbour / GC
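The three "if you hit" columns amount to comparing one measurement against two limits. A sketch of that comparison in Python; the function name and the 5% tolerance are assumptions, chosen only so that a run landing slightly under a ceiling still counts as hitting it.

```python
def binding_term(measured: float, volume_limit: float, instance_limit: float,
                 tolerance: float = 0.05) -> str:
    """Classify a fio result as volume-bound, instance-bound, or neither.

    All three arguments use the same unit (IOPS or MiB/s).
    """
    ceiling = min(volume_limit, instance_limit)
    if measured < ceiling * (1 - tolerance):
        return "neither"  # under-driven test, filesystem, or application
    return "instance" if instance_limit < volume_limit else "volume"


# Numbers from this article's scenario: 1,000 MiB/s volume, ~594 MiB/s instance.
print(binding_term(594, 1000, 593.75))
```

A result of "neither" sends you back to the test itself (queue depth, job count, block size) before it sends you to the console to buy anything.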

A fresh volume restored from snapshot without FSR will read slow on first touch — that is lazy loading, not the steady-state number. Either enable FSR or pre-warm by reading every block before you benchmark.

Confirming the real limit end-to-end (CloudWatch)

Run these checks in order: what the volume is provisioned for, what the instance allows, what was actually achieved, and whether the volume is delivering its provisioned rate.

# 1. Confirm provisioned volume settings took effect
aws ec2 describe-volumes --volume-ids vol-0abc123 \
  --query 'Volumes[0].{type:VolumeType,size:Size,iops:Iops,throughput:Throughput,state:State}'

# 2. Confirm the instance's EBS ceiling (the real cap)
aws ec2 describe-instance-types --instance-types m6i.large \
  --query 'InstanceTypes[0].EbsInfo.EbsOptimizedInfo'

# 3. Measure actual achieved performance against CloudWatch
#    (Sum over each 300 s period; divide by 300 for average IOPS. GNU date syntax shown.)
aws cloudwatch get-metric-statistics --namespace AWS/EBS \
  --metric-name VolumeReadOps --dimensions Name=VolumeId,Value=vol-0abc123 \
  --start-time "$(date -u -d '1 hour ago' '+%Y-%m-%dT%H:%M:%SZ')" \
  --end-time "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" \
  --period 300 --statistics Sum

# 4. Check whether the instance is throttling EBS (Nitro burst-balance / throughput)
#    A persistently low VolumeThroughputPercentage or exhausted BurstBalance == bottleneck found
aws cloudwatch get-metric-statistics --namespace AWS/EBS \
  --metric-name VolumeThroughputPercentage --dimensions Name=VolumeId,Value=vol-0abc123 \
  --start-time "$(date -u -d '1 hour ago' '+%Y-%m-%dT%H:%M:%SZ')" \
  --end-time "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" --period 300 --statistics Average
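The `Sum` statistic is a count per period, not a rate, so it has to be divided by the period before it can be compared with provisioned IOPS or throughput. A small Python sketch of that conversion; the function names are hypothetical and the sample values are invented, shaped like the `Datapoints` array in the CLI's JSON output.

```python
def per_second(datapoints, period_s):
    """Turn CloudWatch Sum datapoints into per-second averages, oldest first."""
    ordered = sorted(datapoints, key=lambda dp: dp["Timestamp"])
    return [dp["Sum"] / period_s for dp in ordered]


def bytes_to_mib(value):
    """Convert a byte count (or bytes/s) to MiB (or MiB/s)."""
    return value / (1024 * 1024)


# Invented sample: two 300-second periods of VolumeReadOps.
read_ops = [
    {"Timestamp": "2024-01-01T00:05:00Z", "Sum": 900000.0},
    {"Timestamp": "2024-01-01T00:00:00Z", "Sum": 450000.0},
]
print(per_second(read_ops, 300))  # [1500.0, 3000.0]
```

These are averages over each period, so a short spike that hit the ceiling can hide inside a 300-second datapoint; shorten the period when chasing a brief stall.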

For EFS, confirm throughput mode and watch the burst/IO limit percentage:

aws efs describe-file-systems --file-system-id fs-0abc123 \
  --query 'FileSystems[0].{mode:ThroughputMode,prov:ProvisionedThroughputInMibps,perf:PerformanceMode}'

# PercentIOLimit near 100 on General Purpose means you should consider Elastic/Max I/O
aws cloudwatch get-metric-statistics --namespace AWS/EFS \
  --metric-name PercentIOLimit --dimensions Name=FileSystemId,Value=fs-0abc123 \
  --start-time "$(date -u -d '1 hour ago' '+%Y-%m-%dT%H:%M:%SZ')" \
  --end-time "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" --period 300 --statistics Maximum

The CloudWatch metrics that reveal each ceiling

This is the reference you keep open while diagnosing. Each metric points at exactly one limit:

Metric	Namespace	Near its limit means	Confirms
`VolumeReadOps` / `VolumeWriteOps`	AWS/EBS	(rate) approaching provisioned IOPS	Volume IOPS ceiling
`VolumeThroughputPercentage`	AWS/EBS	Low % despite load = throttled	Instance EBS bandwidth cap
`VolumeQueueLength`	AWS/EBS	Persistently high = saturated/queued	Device saturation or shallow concurrency
`BurstBalance`	AWS/EBS	Draining toward 0 (st1/gp2)	Burst-credit starvation
`VolumeReadBytes` / `VolumeWriteBytes`	AWS/EBS	(rate) approaching provisioned throughput	Volume throughput ceiling
`PercentIOLimit`	AWS/EFS	Near 100 on General Purpose	EFS perf-mode ceiling → consider Elastic/Max I/O
`BurstCreditBalance`	AWS/EFS	Draining toward 0	EFS Bursting starvation
`MeteredIOBytes`	AWS/EFS	Tracks billed throughput	EFS cost driver

Architecture at a glance

The diagram below traces a single I/O request from the application down to durable storage and shows where each ceiling sits. Read it left to right as the data path: the application issues reads and writes with a particular block size and queue depth; those land on the EC2 instance, whose Nitro EBS-optimized link has a published baseline and burst — the first ceiling, and the one nobody checks first. From the instance, block traffic crosses to the EBS volume (gp3 or io2 Block Express), which has its own per-volume IOPS and throughput ceiling, and file traffic goes to the EFS mount target over NFSv4.1/TLS, governed by the chosen throughput mode. Underneath, EBS snapshots in S3 and EFS lifecycle to IA form the durability and cost tier — and the snapshot path is where lazy-load latency hides on a fresh restore.

The badges mark the five places performance actually dies. Badge 1 sits on the instance link (instance EBS baseline caps you below the volume’s rated number); badge 2 on the gp3 volume (left at 3,000/125, or throughput bought without the IOPS to back it); badge 3 on io2 Block Express (the right call only above gp3’s ceiling); badge 4 on the EFS mount (Bursting on a near-empty filesystem starves); badge 5 on the snapshot restore (no FSR means the first read of every block fetches from S3). Follow the numbered legend to turn each badge into a symptom you can confirm with one CloudWatch metric and a fix you can apply with one CLI call. The governing rule the whole diagram teaches: achieved performance is min(instance, volume, filesystem), so the only move that helps is to raise the term that is actually binding.

Real-world scenario

A fintech platform team — call them Aarna Pay — ran a PostgreSQL fleet on r5.2xlarge instances, each with a single 4 TiB gp3 volume provisioned to the full 16,000 IOPS and 1,000 MiB/s. Their batch reconciliation job — a heavy nightly read-write pass over the day’s settlement data — consistently flatlined at roughly 600 MiB/s no matter how high they pushed the volume’s provisioned throughput, and p99 query latency spiked into the seconds during the window. The on-call instinct was “buy more IOPS,” and they had, twice, with no effect except the spend going up. The reconciliation window kept growing past its SLA, threatening the morning settlement cut-off.

The constraint was the instance, not the volume. An r5.2xlarge delivers a baseline of about 593.75 MiB/s (4,750 Mbps) of EBS throughput — almost exactly the ceiling they kept hitting. VolumeThroughputPercentage sat low even at peak, the tell-tale of an instance-side throttle rather than a volume that’s maxed. The volume was provisioned 68% beyond anything the instance could ever consume; they were paying for 1,000 MiB/s and physically capped at ~594. A two-minute describe-instance-types would have shown it on day one.
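The two figures in that paragraph are quick to reproduce. A sketch in Python, with hypothetical helper names, that converts the published megabit figure to the per-second byte figure used here and computes how far past the instance the volume was provisioned; it follows this article's convention of treating Mbps / 8 as the MiB/s number.

```python
def mbps_to_mibps(megabits_per_s: float) -> float:
    """Convert a published EBS bandwidth in Mbps to the per-second figure used here."""
    return megabits_per_s / 8


def overprovision_pct(provisioned: float, ceiling: float) -> float:
    """Percentage by which a provisioned number exceeds the ceiling that binds."""
    return (provisioned / ceiling - 1) * 100


instance_cap = mbps_to_mibps(4750)
print(instance_cap)                                  # 593.75
print(round(overprovision_pct(1000, instance_cap)))  # 68
```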

Two changes fixed it. They moved the database to r6i.4xlarge, which delivers a sustained ~1,187.5 MiB/s baseline (and, being a larger size, no 30-minute burst cliff), and they migrated the hottest volumes to io2 Block Express for the latency floor under concurrent load. They also right-sized the volume’s provisioned throughput down to match the new instance baseline, recovering the over-provisioning spend. They codified the rule so it can’t regress: provisioned volume throughput must never exceed the instance’s published EBS baseline.

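# Guardrail: cap provisioned throughput at the instance's EBS baseline.
# Fetch the instance EBS baseline at plan time and clamp the volume to it.
data "aws_ec2_instance_type" "db" {
  instance_type = "r6i.4xlarge"
}

locals {
  # Baseline EBS throughput for the instance type, in MB/s (may be fractional).
  instance_ebs_baseline_mbps = data.aws_ec2_instance_type.db.ebs_performance_baseline_throughput
}

# gp3: throughput is its own argument, so this is where the clamp applies.
resource "aws_ebs_volume" "pg_data_gp3" {
  availability_zone = "us-east-1a"
  size              = 4096
  type              = "gp3"
  iops              = 16000
  # Provisioning beyond the instance baseline is wasted money; clamp it.
  throughput        = min(1000, floor(local.instance_ebs_baseline_mbps))
  encrypted         = true
}

# io2: the throughput argument is gp3-only, so the hot volumes set IOPS
# and let throughput follow from it.
resource "aws_ebs_volume" "pg_data" {
  availability_zone = "us-east-1a"
  size              = 4096
  type              = "io2"
  iops              = 64000
  encrypted         = true
}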

The reconciliation window dropped from 50 minutes to 22, p99 latency fell back under 10 ms, and the monthly storage bill went down because the over-provisioned throughput was trimmed. The lesson the team internalized: storage performance is min(volume, instance), and the instance limit is the one nobody checks first. The before/after, with the metric that proved each step:

Phase	Instance	Volume config	Achieved throughput	p99 latency	Proof metric
Before	`r5.2xlarge`	gp3 16,000/1,000	~600 MiB/s (capped)	seconds	`VolumeThroughputPercentage` low
“Buy more IOPS”	`r5.2xlarge`	gp3 16,000/1,000 (again)	~600 MiB/s (unchanged)	seconds	No change — wrong knob
Resize instance	`r6i.4xlarge`	gp3 16,000/1,000	~1,000 MiB/s	< 50 ms	`VolumeThroughputPercentage` healthy
Migrate + right-size	`r6i.4xlarge`	io2 64,000, throughput clamped	~1,000 MiB/s	< 10 ms	p99 under SLA; bill down

Advantages and disadvantages

The decoupled, software-defined storage model both enables precise tuning and invites the over-provisioning mistakes this article exists to prevent. Weigh it honestly:

Advantages (why this model helps you)	Disadvantages (why it bites)
IOPS, throughput, and capacity are independent purchases — pay for exactly the shape you need	Three knobs means three ways to mis-size; conflating them overspends and under-provisions at once
Elastic Volumes change type/IOPS/throughput online — no downtime to tune	The 6-hour cooldown and `optimizing` state mean you can’t thrash changes during an incident
The instance EBS limit is published and queryable (`describe-instance-types`)	It’s invisible by default — the volume reports its full number while the instance silently caps you
EFS Elastic throughput removes provisioning and cliffs entirely	Bursting (the cheap mode) starves a near-empty filesystem — the #1 EFS complaint
Snapshots are incremental and cheap; DLM automates retention + DR copy	A fresh restore is lazy-loaded — slow first touch unless you pay for FSR
RAID 0 striping beats the single-volume ceiling up to the instance limit	RAID 0 has zero redundancy and breaks crash-consistency of snapshots unless you freeze the FS
io2 Block Express delivers sub-ms latency and huge ceilings	Easy to over-reach for — many gp3 workloads land on io2 at 3–5× the cost for unused headroom
Everything is measurable with `fio` + CloudWatch	A bad `fio` config (shallow iodepth, page cache on) “proves” a volume is slow when the test was the cap

The model is right for any workload where you want to size storage to measured demand rather than buy a fixed appliance. It rewards teams who measure (fio + CloudWatch) and codify guardrails (clamp throughput to the instance baseline); it punishes muscle-memory sizing — picking io2 by reflex, leaving gp3 at defaults, mounting EFS on Bursting, or benchmarking a lazy-loaded restore. The disadvantages are all knowable and measurable — which is the entire point of treating storage as min(volume, instance, app) and finding the binding term before you spend.

Hands-on lab

Provision a gp3 volume, prove it’s throttled at the default, measure it with fio, raise the knobs online, and confirm the gain — all on one small instance you delete at the end. Run from a session on an Amazon Linux 2023 EC2 instance (a t3.large or m6i.large is fine and cheap).

Step 1 — Variables and a gp3 volume at the default (3,000 / 125).

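# Amazon Linux 2023 enforces IMDSv2 by default, so fetch a session token first.
TOKEN=$(curl -s -X PUT http://169.254.169.254/latest/api/token \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
AZ=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/placement/availability-zone)
IID=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-id)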
VOL=$(aws ec2 create-volume --availability-zone "$AZ" --size 100 \
  --volume-type gp3 --encrypted \
  --query VolumeId --output text)
echo "Volume: $VOL in $AZ on $IID"

Expected: a vol-... id. At this point IOPS=3,000 and throughput=125 MiB/s (the defaults).

Step 2 — Attach, format, mount.

aws ec2 wait volume-available --volume-ids "$VOL"
aws ec2 attach-volume --volume-id "$VOL" --instance-id "$IID" --device /dev/sdf
sleep 5
DEV=$(lsblk -o NAME,SERIAL | grep "${VOL#vol-}" | awk '{print "/dev/"$1}')
sudo mkfs.xfs "$DEV" && sudo mkdir -p /data && sudo mount "$DEV" /data

Expected: /data mounted on the new device. (On Nitro the device appears as /dev/nvme*, hence the serial lookup.)

Step 3 — Benchmark at the default and record the ceiling.

sudo fio --name=base --filename=/data/fiotest --direct=1 --rw=randread \
  --bs=16k --iodepth=64 --numjobs=4 --group_reporting \
  --size=5G --runtime=60 --time_based --ioengine=libaio | grep -E 'IOPS|BW'

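Expected: IOPS pinned near 3,000, regardless of how deep you drive it. At 16 KiB per I/O that is roughly 47 MiB/s, so this profile hits the IOPS default well before the 125 MiB/s throughput default; rerun with --rw=read --bs=1M if you also want to see the 125 MiB/s ceiling. This is the throttle the default imposes.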

Step 4 — Raise IOPS and throughput online with Elastic Volumes.

aws ec2 modify-volume --volume-id "$VOL" --iops 8000 --throughput 500
# Re-run this until the state reaches 'completed' (it passes through 'modifying' and 'optimizing')
aws ec2 describe-volumes-modifications --volume-id "$VOL" \
  --query 'VolumesModifications[0].[ModificationState,Progress]' --output text

Expected: state progresses modifying → optimizing → completed. The volume stays mounted and usable throughout.

Step 5 — Re-benchmark and confirm the gain.

sudo fio --name=tuned --filename=/data/fiotest --direct=1 --rw=randread \
  --bs=16k --iodepth=64 --numjobs=4 --group_reporting \
  --size=5G --runtime=60 --time_based --ioengine=libaio | grep -E 'IOPS|BW'

Expected: IOPS now climbs toward 8,000, which at 16 KiB is about 125 MiB/s. To exercise the new 500 MiB/s throughput ceiling, rerun with --rw=read --bs=1M. Both results hold only if the instance's EBS limit allows them. Small sizes such as t3.large or m6i.large publish a low EBS baseline with a higher burst, so a short run may ride the burst; if you hit the instance cap first, that is exactly the lesson.

Step 6 — Prove the instance ceiling is real.

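# IMDSv2 token first (Amazon Linux 2023 enforces it by default).
TOKEN=$(curl -s -X PUT http://169.254.169.254/latest/api/token \
  -H "X-aws-ec2-metadata-token-ttl-seconds: 300")
TYPE=$(curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/instance-type)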
aws ec2 describe-instance-types --instance-types "$TYPE" \
  --query 'InstanceTypes[0].EbsInfo.EbsOptimizedInfo.{baseMBps:BaselineThroughputInMBps,baseIOPS:BaselineIops}' \
  --output table

Expected: the baseline throughput (MB/s) and IOPS the instance allows. Compare to your fio results: if IOPS or bandwidth stalled at these numbers rather than at the volume's 8,000 / 500, you just observed min(volume, instance) with your own eyes.

Validation checklist. You provisioned gp3 at the default and saw it throttle at 3,000/125; raised IOPS/throughput online with zero downtime; re-measured a real gain; and confirmed the instance EBS baseline is a separate, often-lower ceiling. The steps mapped to what each proves:

Step	What you did	What it proves	Real-world analogue
3	Benchmark gp3 at default	The 3,000/125 default is a real throttle	“Why is my new volume slow?”
4	`modify-volume` online	Tuning needs no detach/downtime	Right-sizing a live production volume
5	Re-benchmark tuned	Raising the knobs actually helps	The fix after the diagnosis
6	`describe-instance-types`	The instance is a separate ceiling	The bill full of unreachable numbers

Cleanup (avoid lingering volume charges).

sudo umount /data
aws ec2 detach-volume --volume-id "$VOL"
aws ec2 wait volume-available --volume-ids "$VOL"
aws ec2 delete-volume --volume-id "$VOL"

Cost note. A 100 GiB gp3 volume for an hour is a few rupees; the provisioned IOPS/throughput above baseline add a little more while modified. Deleting the volume stops all of it. (There is no free-tier gp3 with provisioned IOPS, but an hour of this lab is well under ₹50.)

Common mistakes & troubleshooting

Before the playbook, the error and status reference — the exact strings, states, and API errors you’ll see, what each means, and the immediate move. These are the messages that surface from the CLI, the volume state machine, and the OS when storage tuning goes wrong:

String / state / error	Where it appears	Meaning	Immediate move
`VolumeModificationRateExceeded`	`modify-volume` API	Modified within the last 6 hours	Wait for the 6-hour cooldown
Volume state `optimizing`	`describe-volumes-modifications`	Modify applied; perf between old/new	Wait it out; do not re-modify
`InvalidParameterValue: throughput too high for iops`	`modify-volume` / `create-volume`	gp3 0.25 MiB/s-per-IOPS rule violated	Raise IOPS first (1,000 MiB/s ⇒ ≥ 4,000)
`iops ... exceeds the ratio`	`create-volume` (io2)	IOPS > 1,000 × GiB	Increase size or lower IOPS
`VolumeInUse`	`delete-volume` / `attach-volume`	Still attached (or attaching elsewhere)	Detach first; check Multi-Attach
`IncorrectState: available`	`detach-volume`	Already detached	No action; it’s free
`xfs ... corruption` / `EXT4-fs error` (dmesg)	OS kernel log	Single-writer FS on Multi-Attach, or bad RAID	Use a cluster FS; fsck offline
`No space left on device` after grow	OS	Grew volume, not the filesystem	`growpart` + `xfs_growfs`/`resize2fs`
`mount.nfs4: Connection timed out` (EFS)	OS mount	Security group / mount target / no `tls` helper	Open 2049; use `amazon-efs-utils`
`BurstBalance` at 0 (alarm)	CloudWatch (EBS)	st1/gp2 burst credits exhausted	Size up; or provisioned gp3
`BurstCreditBalance` at 0 (alarm)	CloudWatch (EFS)	EFS Bursting starved	Switch to Elastic throughput
`PercentIOLimit` ≈ 100 (alarm)	CloudWatch (EFS)	General Purpose IOPS ceiling hit	Move to Elastic (or Max I/O legacy)

This is the playbook — the part you bookmark. First as a scannable table you can read mid-incident, then the same entries with the full confirm-command detail underneath.

#	Symptom	Root cause	Confirm (exact cmd / metric)	Fix
1	Throughput flatlines well below the volume’s number	Instance EBS baseline is the cap	`describe-instance-types ... EbsOptimizedInfo`; `VolumeThroughputPercentage` low	Resize instance to a larger size/family
2	New gp3 volume “slow” at 3,000 IOPS / 125 MiB/s	Left at the default; never provisioned up	`describe-volumes` Iops=3000, Throughput=125	`modify-volume --iops --throughput`
3	Raised throughput but it didn’t increase	Not enough provisioned IOPS to back it (gp3 0.25 MiB/s/IOPS)	Provisioned IOPS < throughput/0.25	Raise IOPS first (1,000 MiB/s needs ≥ 4,000)
4	EFS crawls; small filesystem	Bursting mode + near-empty FS out of credits	`BurstCreditBalance` → 0; `ThroughputMode`=bursting	Switch to Elastic throughput mode
5	Restored DR volume reads at a fraction of rated speed	Snapshot lazy-load (no FSR)	First-read latency >> steady state	Enable FSR or pre-warm by reading all blocks
6	`fio` shows low IOPS despite headroom	iodepth/numjobs too shallow; single-threaded	Raise iodepth/numjobs → IOPS rises	Deepen queue; parallelize the workload
7	`fio` numbers absurdly high, then production slow	Page cache not bypassed (no O_DIRECT)	`--direct=1` collapses the number to real	Always benchmark with `--direct=1`
8	Modification “stuck”; performance between old/new	`optimizing` state after modify	`describe-volumes-modifications` = optimizing	Wait it out; don’t re-modify (6 h cooldown)
9	“Modify failed: too soon”	Modified within the last 6 hours	Last modification < 6 h ago	Wait for the 6-hour cooldown
10	Grew the volume but the filesystem is still small	Didn’t grow partition/FS inside the OS	`lsblk` device big, `df -h` FS small	`growpart` + `xfs_growfs`/`resize2fs`
11	Two instances mounted one volume; corruption	Plain xfs/ext4 RW on a Multi-Attach volume	Filesystem errors in `dmesg`	Use a cluster FS (GFS2/OCFS2) or don’t multi-attach
12	st1 fast then slow under sustained reads	Throughput burst credits exhausted	`BurstBalance` draining to 0	Size up; or move to provisioned gp3
13	EFS `PercentIOLimit` pegged at ~100%	General Purpose perf-mode IOPS ceiling	`PercentIOLimit` near 100	Move to Elastic throughput (or Max I/O legacy)
14	Latency p99 spikes at the 30-minute mark	Relied on instance EBS burst, not baseline	Throttle begins exactly after ~30 min	Size against the baseline, larger instance

The expanded form, with the full reasoning for the entries that bite hardest:

1. Throughput flatlines well below the volume’s provisioned number. Root cause: The instance EBS baseline is lower than the volume’s ceiling — the classic, most expensive mistake. Confirm: aws ec2 describe-instance-types --instance-types <type> --query 'InstanceTypes[0].EbsInfo.EbsOptimizedInfo'; CloudWatch VolumeThroughputPercentage sits low even at peak (a volume that’s truly maxed reads ~100%). Fix: Resize the instance to a larger size or family whose baseline ≥ your target; never provision volume throughput past the instance baseline for sustained work.

2. A brand-new gp3 volume is “slow” — capped at 3,000 IOPS / 125 MiB/s. Root cause: gp3 ships at the baseline default; provisioning above it is opt-in and was never done. Confirm: aws ec2 describe-volumes --volume-ids <vol> --query 'Volumes[0].{iops:Iops,tput:Throughput}' returns 3000 / 125. Fix: aws ec2 modify-volume --volume-id <vol> --iops <n> --throughput <m> (online).

3. You raised throughput but achieved bandwidth didn’t move. Root cause: gp3 enforces ≤ 0.25 MiB/s per provisioned IOPS — you bought MiB/s without the IOPS to back it. Confirm: provisioned IOPS < target throughput / 0.25 (e.g. asking 1,000 MiB/s with only 3,000 IOPS). Fix: Raise IOPS first — 1,000 MiB/s requires ≥ 4,000 provisioned IOPS — then the throughput is allowed.

4. EFS crawls and it’s a small filesystem. Root cause: Bursting throughput mode on a near-empty filesystem (50 KiB/s per GiB baseline) that has exhausted its burst credits. Confirm: CloudWatch BurstCreditBalance trending to zero; aws efs describe-file-systems --query 'FileSystems[0].ThroughputMode' returns bursting. Fix: aws efs update-file-system --throughput-mode elastic — throughput then scales with demand, no credit cliff.

5. A volume restored from a snapshot reads at a fraction of its rated speed. Root cause: Lazy loading — blocks fetch from S3 on first touch; you’re measuring S3 latency, not the volume. Confirm: the first read of each region is slow and the second is fast; steady-state matches the spec after a full pass. Fix: Enable Fast Snapshot Restore on the snapshot in the target AZs, or pre-warm by reading every block (dd if=/dev/nvmeXn1 of=/dev/null bs=1M).

6. fio reports low IOPS even though the volume and instance have headroom. Root cause: Too-shallow queue depth or single-threaded I/O — the device is under-driven, not slow. Confirm: raising --iodepth and --numjobs increases IOPS; at iodepth=1 you measure latency-bound, not the device ceiling. Fix: Drive a deeper queue (32–64) and more jobs that match real concurrency; fix single-threaded application I/O.

7. fio shows impossibly high numbers, but production is slow. Root cause: The benchmark hit the page cache (RAM), not the device — --direct=1 was missing. Confirm: adding --direct=1 drops the number to a believable device figure. Fix: Always benchmark device performance with --direct=1 (O_DIRECT).

8 & 9. Modification seems stuck, or “modify failed: too soon.” Root cause: After a modify the volume enters optimizing (performance between old and new); and a volume can be modified only once per 6 hours. Confirm: aws ec2 describe-volumes-modifications --volume-id <vol> shows optimizing; a second modify inside 6 h is rejected. Fix: Wait out optimizing; plan changes so you don’t need a second modify inside the 6-hour window.

10. You grew the volume but the filesystem is still the old size. Root cause: Growing the EBS volume enlarges the block device, not the partition/filesystem inside the OS. Confirm: lsblk shows the larger device; df -h shows the old filesystem size. Fix: sudo growpart /dev/nvme0n1 1 then sudo xfs_growfs -d /mount (xfs) or sudo resize2fs /dev/nvme0n1p1 (ext4).

11. Two instances mounted one volume and it corrupted. Root cause: A Multi-Attach io2 volume mounted xfs/ext4 read-write on more than one instance — those filesystems assume single-writer. Confirm: filesystem inconsistency errors in dmesg/journal on both nodes. Fix: Use a cluster-aware filesystem (GFS2/OCFS2) with proper fencing, or don’t multi-attach a single-writer filesystem.

12. st1 is fast initially, then slows under sustained reads. Root cause: st1’s throughput burst credits are exhausted; you’ve dropped to the baseline. Confirm: CloudWatch BurstBalance draining toward 0. Fix: Size the st1 volume larger (baseline scales with size), or switch to a provisioned gp3 if latency matters.

13. EFS PercentIOLimit is pegged near 100%. Root cause: You’ve hit the General Purpose performance-mode IOPS ceiling. Confirm: CloudWatch PercentIOLimit at ~100 sustained. Fix: Move to Elastic throughput (raises the effective ceiling for most workloads); Max I/O is the legacy alternative but costs latency and is immutable.

14. Latency p99 spikes right at the 30-minute mark. Root cause: The workload leaned on the instance’s EBS burst rather than the baseline; the burst window ended. Confirm: throttling begins ~30 minutes into sustained load; the instance is a smaller size that bursts. Fix: Size against the baseline — choose a larger instance whose baseline meets sustained demand.

Best practices

Default new volumes to gp3; reserve io2 Block Express for sub-ms latency or > 16,000 IOPS needs. Picking io2 by reflex pays 3–5× for headroom most workloads never touch.
Size capacity, IOPS, and throughput as three independent decisions, not one. Capacity for data stored, IOPS for random-small, throughput for sequential.
Look up the target instance’s EBS baseline/burst limits before provisioning the volume; never provision past the instance baseline for sustained workloads. Codify it as a Terraform guardrail that clamps throughput to the instance baseline.
Size sustained workloads against the instance baseline, not the 30-minute burst. A workload that bursts fine in test falls off a cliff at the half-hour in production.
Stripe (RAID 0) across volumes only when the instance has bandwidth headroom a single volume can’t fill; accept zero redundancy. Striping cannot exceed the instance ceiling.
Use Elastic Volumes for online changes; respect the 6-hour modification cooldown and optimizing state. Plan changes; never thrash them during an incident.
After growing a volume, grow the partition and filesystem too (growpart + xfs_growfs/resize2fs) — the bigger block device is invisible until you do.
Enable Fast Snapshot Restore for golden images and time-critical restores, or pre-warm; never measure a fresh restore and call it the steady-state number.
Automate snapshot retention and cross-Region copy with Data Lifecycle Manager, not bespoke Lambdas — tag-target volumes and let DLM run.
Treat Multi-Attach as cluster-filesystem-only; never mount xfs/ext4 RW on two instances. It corrupts the volume.
Default EFS to General Purpose + Elastic throughput; avoid Bursting on near-empty filesystems. Bursting starves below ~1 TiB.
Enable EFS lifecycle to IA with AFTER_1_ACCESS promotion back to Standard so cold data is cheap without punishing files that go hot again.
Front EFS with access points (POSIX root + identity) and a TLS-required filesystem policy so a leaked mount target is useless without credentials.
Benchmark with fio using --direct=1 and workload-matched block sizes and queue depth; judge against min(volume, instance). A bad config “proves” the wrong thing.
Alarm on the leading indicators — EBS VolumeThroughputPercentage/BurstBalance, EFS PercentIOLimit/BurstCreditBalance — to catch a ceiling before users feel it.
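
The "judge against min(volume, instance)" practice can be sketched as a tiny helper that names which term binds. The figures are the article's illustrative MiB/s numbers; the application limit is a made-up placeholder, and `binding_limit` is a hypothetical helper, not part of any AWS tooling:

```python
def binding_limit(limits):
    """Return (name, value) of the smallest limit: the one that actually binds."""
    if not limits:
        raise ValueError("at least one limit is required")
    name = min(limits, key=limits.get)
    return name, limits[name]


# Illustrative MiB/s figures; "application" is an invented placeholder value.
name, value = binding_limit({"volume": 1000, "instance": 594, "application": 2000})
print(f"bound by {name} at {value} MiB/s")
```

Anything provisioned above the returned value is capacity the workload cannot reach until that particular term is raised.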

The alarms worth wiring before the next incident, and why each is leading rather than lagging:

Alarm on	Namespace / metric	Threshold (starting point)	Why it’s leading
Instance EBS throttle	AWS/EBS `VolumeThroughputPercentage`	< 100% while load is high, 10 min	Catches instance-bound before “it’s slow” tickets
EBS burst starvation	AWS/EBS `BurstBalance`	< 20% and falling	Predicts the st1/gp2 throttle cliff
Volume saturation	AWS/EBS `VolumeQueueLength`	Sustained high (> 1 per provisioned 500 IOPS)	I/O queuing before latency blows up
EFS credit starvation	AWS/EFS `BurstCreditBalance`	Trending to 0	The near-empty-Bursting failure, pre-emptively
EFS IOPS ceiling	AWS/EFS `PercentIOLimit`	> 90% sustained	Perf-mode ceiling before throughput collapses
EFS cost creep	AWS/EFS `MeteredIOBytes`	Above budget baseline	Elastic per-GB charges climbing
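
As one way to wire the burst-starvation row, the sketch below builds the parameters for a CloudWatch alarm on `BurstBalance`. It only constructs a dictionary; the volume ID, alarm name, 5-minute period, and two evaluation periods are assumptions for illustration, and the "and falling" part of the threshold is not modelled:

```python
def burst_balance_alarm(volume_id: str, threshold_pct: float = 20.0) -> dict:
    """Build CloudWatch alarm parameters for a low EBS BurstBalance (no API call made)."""
    return {
        "AlarmName": f"ebs-burst-balance-low-{volume_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EBS",
        "MetricName": "BurstBalance",
        "Dimensions": [{"Name": "VolumeId", "Value": volume_id}],
        "Statistic": "Average",
        "Period": 300,            # seconds; assumed evaluation window
        "EvaluationPeriods": 2,   # assumed: two consecutive breaches
        "Threshold": threshold_pct,
        "ComparisonOperator": "LessThanThreshold",
    }


params = burst_balance_alarm("vol-0123456789abcdef0")  # placeholder volume ID
print(params["AlarmName"], params["Threshold"])
```

With boto3 installed and credentials configured, a dictionary of this shape could be passed as keyword arguments to the CloudWatch client's `put_metric_alarm`; tune the period and threshold to your own burst profile.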

Security notes

Encrypt every volume and filesystem at rest with KMS. Set encrypted = true on EBS volumes and EFS filesystems; use a customer-managed KMS key (CMK) where you need key-policy control, audit, and rotation — see AWS KMS & Encryption, In Depth: Keys, Key Policies, Envelope Encryption, Grants & Rotation. Encryption is effectively free and there is no reason to leave a volume unencrypted in 2026.
Encrypt EFS in transit with TLS. Mount with the tls option (the EFS helper wires stunnel) and back it with a filesystem policy that denies any access where aws:SecureTransport is false, so a plaintext mount is rejected outright.
Isolate EFS with access points and least-privilege IAM. Each access point enforces a POSIX user and root directory; an app physically cannot traverse to another tenant’s files. Require IAM authorization (iam mount option) so a leaked mount target without credentials is inert.
Restrict EBS snapshot sharing deliberately. A snapshot shared publicly or with the wrong account leaks your data wholesale. Audit CreateVolumePermission; for cross-account DR, share with specific account IDs and re-encrypt with a CMK the target account is granted use of.
Lock down who can detach/modify/delete volumes. ec2:DetachVolume, ec2:ModifyVolume, and ec2:DeleteVolume are destructive; scope them with IAM conditions (e.g. resource tags) so a broad EC2 role can’t wipe a production data volume.
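Use KMS key policies and grants for cross-Region snapshot copies. A DR copy re-encrypts with a target-Region key; the copy fails (or the data is unreadable) if the copying role lacks the KMS permissions the operation needs on the source and target keys (such as kms:Decrypt, kms:ReEncrypt*, kms:GenerateDataKey*, kms:CreateGrant, and kms:DescribeKey).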
Keep the security and performance fixes aligned. A TLS-required EFS policy, an encrypted volume, and an access point cost essentially nothing in throughput — there is no performance excuse to skip them.

The controls that secure storage, what each defends against, and the performance cost:

Control	Mechanism	Secures against	Performance cost
EBS encryption at rest	`encrypted=true` + KMS CMK	Disk/snapshot data theft	Negligible (Nitro offload)
EFS encryption in transit	`tls` mount + deny non-TLS policy	Network sniffing of NFS	Minimal (stunnel)
EFS access points	POSIX root + identity per app	Cross-tenant file access	None
EFS IAM auth	`iam` mount + filesystem policy	Leaked mount target without creds	None
Snapshot sharing controls	`CreateVolumePermission` audit	Public/wrong-account data leak	None
Destructive-action IAM scoping	Tag-conditioned `Detach/Modify/Delete`	Accidental/malicious volume wipe	None
Cross-Region copy key grants	KMS key policy / grants	Unreadable or failed DR copies	None
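
The "deny non-TLS" filesystem policy from the table can be sketched as a JSON document built in Python. This is a minimal illustration, not a complete production policy: the ARN, account ID, and `Sid` are placeholders, and a real policy would normally also carry the allow statements for your access points and IAM principals:

```python
import json


def deny_plaintext_policy(file_system_arn: str) -> str:
    """Return an EFS filesystem policy (JSON) that denies any request not sent over TLS."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyNonTLS",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": {"AWS": "*"},
                "Action": "*",
                "Resource": file_system_arn,
                # Matches requests where the transport was not encrypted.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Placeholder ARN for illustration only.
print(deny_plaintext_policy(
    "arn:aws:elasticfilesystem:ap-south-1:111122223333:file-system/fs-0123456789abcdef0"
))
```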

Cost & sizing

The bill drivers and how they interact with the tuning decisions:

EBS capacity is billed per GiB-month regardless of how much you use. Provision for the data you store, not “round up for headroom” — a 4 TiB volume holding 800 GiB is paying for 3.2 TiB of air.
gp3 provisioned IOPS and throughput above the free baseline (3,000 IOPS / 125 MiB/s) are billed separately. This is where over-provisioning hides: buying 16,000 IOPS / 1,000 MiB/s on an instance that caps at 594 MiB/s pays for numbers the hardware can’t deliver. Clamp to the instance baseline.
io2 costs more per GiB and per provisioned IOPS than gp3 — justified only when you genuinely need its ceiling or sub-ms latency. Most “we put it on io2 to be safe” volumes are gp3 workloads paying a premium.
EFS is billed per GiB-month by storage class, plus throughput. Bursting bills storage only (cheapest for large filesystems); Elastic bills per GB transferred (best for spiky); Provisioned bills the MiB/s you reserve whether or not you use it. Lifecycle to IA cuts storage cost sharply with negligible behavioural change.
Snapshots bill for incremental stored blocks, so frequent snapshots are cheap; FSR bills per AZ per hour while enabled, so turn it on for golden images and DR rehearsals and off when idle. Cross-Region copy adds inter-Region transfer.

A rough monthly picture for a mid-tier production database volume and a shared filesystem: a 4 TiB gp3 at the default baseline is roughly ₹30,000 (4,096 GiB at ~₹7–8 per GiB); raising it to 8,000 IOPS / 500 MiB/s adds a modest IOPS+throughput charge; the same workload on io2 at 64,000 IOPS is several times that. A 1 TiB EFS on Standard with lifecycle to IA can cut storage cost by more than half versus all-Standard. The cost drivers and what each one buys you:

Cost driver	What you pay for	Rough INR / month (illustrative)	What it fixes	Watch-out
gp3 capacity (per GiB)	Storage, baseline 3,000/125 included	~₹7–8 per GiB → 4 TiB ≈ ₹30,000	Baseline performance for free	Capacity ≠ performance; size data, not air
gp3 provisioned IOPS	IOPS above 3,000	Small per-IOPS-month above baseline	Random-small headroom	Buying IOPS the instance can’t consume
gp3 provisioned throughput	MiB/s above 125	Small per-MiB/s-month above baseline	Sequential headroom	Needs IOPS to back it (0.25 rule)
io2 capacity + IOPS	Higher per-GiB + per-IOPS	Several× gp3 for the same shape	Sub-ms latency, > 16,000 IOPS	Over-reached for gp3 workloads
EFS Standard storage	Per-GiB-month, multi-AZ	Higher than EBS per-GiB	Shared, multi-AZ file access	All-Standard when IA would do
EFS lifecycle to IA	Cheaper per-GiB on cold files	Cuts storage cost > 50% typically	Cold-data cost	IA read fee on files that go hot
EFS Elastic throughput	Per-GB transferred	Scales with use	Spiky workloads, no cliffs	Steady very-high load can cost more
FSR	Per AZ per hour while enabled	Hourly per AZ	Fast first-touch restore	Leaving it on idle burns money
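
The "paying for air" arithmetic is simple enough to script. A rough sketch, assuming a rate of ₹7.5 per GiB-month (the midpoint of the illustrative ₹7–8 range above, not a quoted AWS price) and the 4 TiB volume holding 800 GiB from the earlier example:

```python
GIB_PER_TIB = 1024


def capacity_cost(provisioned_gib: float, used_gib: float, rate_per_gib: float):
    """Return (total monthly cost, cost of unused capacity) for a volume billed per GiB."""
    if used_gib > provisioned_gib:
        raise ValueError("used_gib cannot exceed provisioned_gib")
    total = provisioned_gib * rate_per_gib
    idle = (provisioned_gib - used_gib) * rate_per_gib
    return total, idle


# Assumed rate: INR 7.5 per GiB-month (illustrative only).
total, idle = capacity_cost(4 * GIB_PER_TIB, 800, 7.5)
print(f"total ~ INR {total:,.0f}/month, unused capacity ~ INR {idle:,.0f}/month")
```

At that assumed rate roughly four fifths of the capacity bill is for space holding no data, which is the case for sizing capacity to the data and buying performance separately.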

Interview & exam questions

1. A volume is provisioned for 1,000 MiB/s but the workload flatlines at ~594 MiB/s. What’s happening and how do you confirm? The instance EBS baseline is the cap, not the volume. An m6i.large/r5.2xlarge delivers ~4,750 Mbps (~594 MiB/s) of EBS bandwidth; the volume’s number is unreachable on that instance. Confirm with describe-instance-types ... EbsOptimizedInfo and a low VolumeThroughputPercentage. Fix by resizing the instance, not buying more volume.
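
The ~594 figure is just the instance's EBS bandwidth in megabits per second divided by eight. A small sketch of that conversion and of how much of a 1,000-per-second volume such an instance can carry (function names are hypothetical; treat the unit label as approximate, since the instance figure is published in megabits):

```python
def mbps_to_megabytes_per_s(megabits_per_second: float) -> float:
    """Convert a bandwidth in megabits/s to megabytes/s (8 bits per byte)."""
    return megabits_per_second / 8.0


def reachable_fraction(volume_throughput: float, instance_throughput: float) -> float:
    """Fraction of the volume's provisioned throughput the instance can actually push."""
    return min(1.0, instance_throughput / volume_throughput)


instance = mbps_to_megabytes_per_s(4750)
print(instance)                                        # 593.75
print(f"{reachable_fraction(1000, instance):.0%} of the provisioned throughput is reachable")
```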

2. How do gp3 and io2 differ from gp2 in how you provision performance? On gp2, IOPS were coupled to size (3 IOPS/GiB), so you oversized to buy performance. gp3 and io2 decouple capacity, IOPS, and throughput into independent purchases. gp3 baseline is 3,000 IOPS / 125 MiB/s, tunable to 16,000 / 1,000; io2 Block Express reaches 256,000 IOPS / 4,000 MiB/s. You size three dimensions separately.
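
To see why gp2 pushed people into oversizing, the 3 IOPS/GiB coupling can be worked through directly. A sketch, assuming gp2's 100-IOPS floor and 16,000-IOPS cap (both are my additions for completeness; the article only states the 3 IOPS/GiB ratio):

```python
import math

GP2_IOPS_PER_GIB = 3       # ratio quoted in the article
GP2_MIN_IOPS = 100         # assumed floor for small volumes
GP2_MAX_IOPS = 16_000      # assumed per-volume cap


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS a gp2 volume of this size receives."""
    return min(GP2_MAX_IOPS, max(GP2_MIN_IOPS, GP2_IOPS_PER_GIB * size_gib))


def gp2_size_for_iops(target_iops: int) -> int:
    """Smallest gp2 size (GiB) whose baseline reaches target_iops."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("gp2 cannot reach this IOPS target at any size")
    return math.ceil(target_iops / GP2_IOPS_PER_GIB)


print(gp2_baseline_iops(1000))     # 3000
print(gp2_size_for_iops(16_000))   # 5334
```

Under these assumptions, reaching 16,000 IOPS on gp2 means provisioning over 5 TiB whether or not the data needs it; on gp3 the same IOPS is a separate purchase on a volume sized for the data.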

3. On gp3, you raise throughput to 1,000 MiB/s but it won’t take effect. Why? gp3 enforces a maximum of 0.25 MiB/s per provisioned IOPS, so 1,000 MiB/s requires at least 4,000 provisioned IOPS. At the 3,000-IOPS baseline the ceiling is 3,000 × 0.25 = 750 MiB/s, so a request for 1,000 MiB/s is rejected. Raise IOPS to ≥ 4,000 in the same modification as the throughput change; splitting it into two separate modifications runs into the 6-hour cooldown.
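
The 0.25 rule is easy to check before calling the API. A minimal sketch using only the gp3 figures quoted in this article (3,000 IOPS baseline, 1,000 MiB/s maximum, 0.25 MiB/s per IOPS); the function names are hypothetical:

```python
import math

GP3_BASELINE_IOPS = 3_000
GP3_MAX_THROUGHPUT_MIBPS = 1_000.0
GP3_MIBPS_PER_IOPS = 0.25


def min_iops_for_throughput(throughput_mibps: float) -> int:
    """Fewest provisioned IOPS that permit the requested gp3 throughput."""
    return max(GP3_BASELINE_IOPS, math.ceil(throughput_mibps / GP3_MIBPS_PER_IOPS))


def max_throughput_for_iops(iops: int) -> float:
    """Highest gp3 throughput (MiB/s) permitted at the given provisioned IOPS."""
    return min(GP3_MAX_THROUGHPUT_MIBPS, iops * GP3_MIBPS_PER_IOPS)


print(min_iops_for_throughput(1000))   # 4000
print(max_throughput_for_iops(3000))   # 750.0
```

A pre-flight check like this in a provisioning script catches the invalid IOPS/throughput pair before the modify call does.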

4. When do you choose io2 Block Express over gp3? Only when you need what gp3 can’t give: sustained IOPS above 16,000, sub-millisecond p99 latency under concurrency, 99.999% durability, or volumes larger than 16 TiB. Otherwise gp3 serves the same workload at a fraction of the cost; picking io2 by reflex pays 3–5× for unused headroom.

5. Why does an EFS filesystem with little data crawl, and how do you fix it? It’s on Bursting throughput mode, whose baseline is 50 KiB/s per GiB stored — a near-empty filesystem has almost no baseline and survives only on burst credits, which then run out. Confirm with BurstCreditBalance draining to zero. Fix by switching to Elastic throughput, which scales with demand and has no credit cliff.
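
The "almost no baseline" claim is worth seeing in numbers. The sketch below models only the 50 KiB/s-per-GiB ratio quoted above; any floor or burst allowance AWS applies on top is deliberately left out, and the sizes are arbitrary examples:

```python
EFS_BURSTING_KIBPS_PER_GIB = 50.0  # ratio quoted in the article


def bursting_baseline_kibps(stored_gib: float) -> float:
    """Baseline throughput (KiB/s) of a Bursting-mode filesystem, from stored data alone."""
    if stored_gib < 0:
        raise ValueError("stored_gib must be non-negative")
    return EFS_BURSTING_KIBPS_PER_GIB * stored_gib


for gib in (1, 10, 1024):
    kibps = bursting_baseline_kibps(gib)
    print(f"{gib:>5} GiB stored -> {kibps:>8,.0f} KiB/s ({kibps / 1024:.2f} MiB/s)")
```

By this ratio a 10 GiB filesystem earns about half a MiB/s, and it takes a full TiB of stored data to reach 50 MiB/s, which is why small filesystems live or die on their burst credits.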

6. What is Fast Snapshot Restore and when is it essential? Normally a volume restored from a snapshot lazy-loads blocks from S3 on first touch, so the first read of each block is slow. FSR pre-initializes the volume so it delivers full provisioned performance immediately. It’s essential for golden-image boot volumes and time-critical DR restores, and it’s billed per AZ per hour while enabled.

7. Difference between EFS performance mode and throughput mode? Performance mode (General Purpose vs Max I/O, set at creation, immutable) trades per-operation latency against aggregate IOPS ceiling. Throughput mode (Elastic, Provisioned, Bursting, changeable with a cooldown) governs how much aggregate throughput you get and how you pay. People confuse them; they’re orthogonal — one is latency-vs-ceiling, the other is throughput-vs-cost.

8. You restored a DR volume and it benchmarks at a tenth of its rated speed. Is the volume broken? No — you’re measuring lazy loading (S3 fetch on first touch), not steady state. The second read of each block is fast. Either enable FSR before relying on the volume, or pre-warm by reading every block (dd ... of=/dev/null) so the benchmark reflects the device, not S3 latency.
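
Pre-warming is nothing more than one sequential read of every block. For cases where you want that inside a script rather than a `dd` one-liner, here is a rough Python equivalent; it assumes read access to the device (root on a real block device), and the device path in the comment is a placeholder:

```python
def prewarm(path: str, block_size: int = 1024 * 1024) -> int:
    """Read the whole file or block device once, discarding the data; return bytes read."""
    total = 0
    # buffering=0 avoids an extra Python-level buffer; the reads are still sequential.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total


# Example (placeholder device name, requires root):
#   bytes_read = prewarm("/dev/nvme1n1")
```

A single-threaded pass like this is slow on large volumes; it illustrates the mechanism and is not a tuned initialization tool.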

9. When does RAID 0 striping help EBS performance, and what’s the catch? Striping aggregates N volumes’ per-volume ceilings, useful when a single volume’s IOPS/throughput ceiling is the limit and the instance has bandwidth headroom above it. The catch: RAID 0 has zero redundancy (rely on EBS durability + snapshots), and striping cannot exceed the instance EBS limit — if the instance is already the cap, striping buys nothing.
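
The striping trade-off reduces to one line of arithmetic: N volumes add up, but only until the instance limit. A sketch with illustrative MiB/s figures; the 5,000 MiB/s instance is a hypothetical larger instance, not a specific type:

```python
def striped_throughput(volume_count: int, per_volume_mibps: float, instance_mibps: float) -> float:
    """Aggregate RAID 0 throughput: sum of the volume ceilings, capped by the instance."""
    if volume_count < 1:
        raise ValueError("volume_count must be >= 1")
    return min(volume_count * per_volume_mibps, instance_mibps)


# Instance already the cap: a second volume buys nothing.
print(striped_throughput(1, 1000, 594), striped_throughput(2, 1000, 594))
# Hypothetical larger instance with headroom: striping scales until the instance binds.
print(striped_throughput(4, 1000, 5000), striped_throughput(8, 1000, 5000))
```

The first pair stays at 594 regardless of stripe width, while the second grows to 4,000 and then stops at the instance's 5,000.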

10. Your fio test shows great numbers but production is slow. What’s the likely test error? The benchmark probably hit the page cache (RAM) instead of the device — --direct=1 (O_DIRECT) was missing. Or iodepth/numjobs were too shallow and under-drove the device. Re-run with --direct=1, a workload-matched block size, and a deep enough queue, then compare against min(volume, instance).

11. Why size a sustained workload against the instance baseline rather than the burst? Smaller instances get a higher EBS bandwidth for a 30-minute burst, then fall back to the baseline. A sustained database that leaned on burst is fast in a short test and throttles exactly at the half-hour mark in production. Size against the baseline; choose a larger size/family if the baseline doesn’t meet sustained demand.

12. You can only attach an io2 volume to one instance — except when? And what’s the constraint? Multi-Attach lets an io2/io1 volume attach to up to 16 Nitro instances in the same AZ. The hard constraint: it provides no write coordination, so you must run a cluster-aware filesystem (GFS2/OCFS2) or an application that arbitrates writes. Mounting plain xfs/ext4 read-write on two instances corrupts the volume.

These map to AWS Certified Solutions Architect – Associate (SAA-C03) — design cost-optimized and high-performing storage — and AWS Certified SysOps Administrator – Associate (SOA-C02) — monitor and tune EBS/EFS, CloudWatch storage metrics. The deep performance-tuning angle (instance limits, io2 Block Express, striping) also appears on the Solutions Architect – Professional (SAP-C02). A compact cert-mapping for revision:

Question theme	Primary cert	Exam objective area
Volume type selection by workload	SAA-C03	Design high-performing & cost-optimized storage
Instance EBS limit vs volume limit	SAP-C02 / SOA-C02	Performance tuning; monitoring
gp3 decoupling + 0.25 ratio	SAA-C03	Storage performance fundamentals
EFS performance/throughput modes	SAA-C03	Design file storage solutions
Snapshots, FSR, DLM, DR copy	SOA-C02	Backup, recovery, automation
CloudWatch storage metrics	SOA-C02	Monitor, log, and remediate

Quick check

A gp3 volume is provisioned for 16,000 IOPS / 1,000 MiB/s, but the workload never exceeds ~594 MiB/s. Where is the bottleneck, and what one command confirms it?
You raise a gp3 volume’s throughput to 1,000 MiB/s but it won’t apply while IOPS sits at 3,000. What rule are you hitting, and what do you change?
True or false: switching an EFS filesystem from Bursting to Elastic throughput is the right fix for a small, near-empty filesystem that keeps running out of throughput.
A volume restored from a snapshot benchmarks at a fraction of its rated speed. Name the cause and two ways to fix it.
Your fio random-read test reports numbers far above the volume’s provisioned IOPS. What single flag is almost certainly missing, and what were you actually measuring?

Answers

The instance EBS baseline is the cap — a volume’s provisioned number is unreachable if the instance can’t push it (e.g. ~594 MiB/s on an m6i.large/r5.2xlarge). Confirm with aws ec2 describe-instance-types --instance-types <type> --query 'InstanceTypes[0].EbsInfo.EbsOptimizedInfo', and note VolumeThroughputPercentage sitting low. Fix by resizing the instance, not the volume.
The gp3 throughput-per-IOPS ratio — you can buy at most 0.25 MiB/s per provisioned IOPS, so 1,000 MiB/s needs ≥ 4,000 IOPS. Raise IOPS to at least 4,000 first; then the throughput change is allowed.
True. Bursting’s baseline is 50 KiB/s per GiB stored, so a near-empty filesystem starves once burst credits run out. Elastic throughput scales with demand and removes the credit cliff — the correct fix.
The cause is snapshot lazy loading — blocks fetch from S3 on first touch, so you’re measuring S3 latency, not the device. Fix by (a) enabling Fast Snapshot Restore on the snapshot in the target AZs, or (b) pre-warming by reading every block (dd if=/dev/nvmeXn1 of=/dev/null bs=1M) before benchmarking.
--direct=1 (O_DIRECT) is missing — you were measuring the page cache (RAM), not the EBS device. Re-run with --direct=1 (and a deep enough iodepth/numjobs) to measure the real device, then compare against min(volume, instance).

Glossary

gp3 — general-purpose SSD EBS volume; baseline 3,000 IOPS / 125 MiB/s, tunable to 16,000 / 1,000, with IOPS and throughput decoupled from capacity. The sensible default.
io2 / io2 Block Express — high-performance SSD EBS volume; up to 256,000 IOPS / 4,000 MiB/s, sub-ms latency, 99.999% durability. For workloads gp3 can’t serve.
st1 / sc1 — throughput-optimized and cold HDD EBS volume types; for large sequential reads (st1) and cold storage (sc1); not bootable, terrible at random I/O.
Provisioned IOPS — the number of small (16 KiB) I/O operations per second you buy for a volume; caps random-small performance.
Provisioned throughput — the MiB/s of sequential bandwidth you buy for a volume; on gp3, bounded to 0.25 MiB/s per provisioned IOPS.
Instance EBS baseline — the sustained EBS bandwidth and IOPS an instance type allows, published in EbsOptimizedInfo; usually lower than a big volume’s ceiling, and the cap nobody checks first.
EBS-optimized burst — a 30-minute higher EBS bandwidth that smaller instance sizes can sustain on credit; misleads sizing of sustained workloads.
Elastic Volumes — the EBS feature to change a volume’s type, IOPS, throughput, or size online; constrained by a 6-hour modification cooldown and an optimizing state.
optimizing state — the period after a volume modification during which performance is between the old and new values.
RAID 0 / striping — combining N volumes at the OS (mdadm) to aggregate their ceilings up to the instance limit; zero redundancy.
Multi-Attach — attaching one io2/io1 volume to up to 16 Nitro instances in an AZ; requires a cluster-aware filesystem because it has no write coordination.
Fast Snapshot Restore (FSR) — pre-initializes a snapshot-restored volume so it delivers full performance on first touch instead of lazy-loading from S3; billed per AZ per hour.
Lazy loading — the default behaviour where a snapshot-restored volume fetches each block from S3 on first read, making the first touch slow.
Data Lifecycle Manager (DLM) — AWS-native, tag-targeted policies that create, retain, and cross-Region-copy EBS snapshots without custom scripts.
EFS performance mode — General Purpose (lowest latency) vs Max I/O (highest aggregate, higher latency); set at creation and immutable.
EFS throughput mode — Elastic (auto-scales, pay per GB), Provisioned (fixed MiB/s), or Bursting (scales with stored data + credits); changeable with a ~1-day cooldown.
EFS burst credits / BurstCreditBalance — headroom a Bursting filesystem earns; when it drains, a near-empty filesystem throttles to its tiny baseline.
EFS access point — an application-specific entry point enforcing a POSIX identity and root directory for multi-tenant isolation.
VolumeThroughputPercentage — the CloudWatch metric that, when low under load, reveals an instance-side EBS throttle (vs a volume that’s genuinely maxed).
PercentIOLimit — the CloudWatch EFS metric that, near 100%, signals you’ve hit the General Purpose performance-mode ceiling.

Next steps

You can now size block and file storage to the limit that actually binds, and confirm it with fio and CloudWatch. Build outward:

Next: AWS Block & File Storage, In Depth: EBS, EFS, FSx & Instance Store — the breadth survey of every storage service, including FSx and instance store, that this tuning guide drills into.
Related: Amazon EC2, In Depth: Instance Types, AMIs, EBS, User Data, IMDS & Every Launch Option — where the instance EBS-optimized limits come from and how to read them across families.
Related: AWS Observability, In Depth: CloudWatch, CloudTrail, Config & EventBridge — build the dashboards and alarms on the storage metrics this article relies on.
Related: Amazon RDS & Aurora, In Depth: Engines, Multi-AZ, Read Replicas, Backups & Every Option — how managed databases abstract the same storage physics you tune by hand here.
Related: AWS KMS & Encryption, In Depth: Keys, Key Policies, Envelope Encryption, Grants & Rotation — the encryption layer for every volume, filesystem, and snapshot above.