Taming BigQuery Cost and Performance: Partitioning, Clustering, and Reservations

A BigQuery bill blows up for two reasons: queries read far more bytes than they need to, and you are paying for compute on the wrong model. Both are fixable without rewriting your warehouse. This guide works from the physical layer up — partitioning and clustering to cut bytes scanned — then up to the capacity layer — editions, reservations, and autoscaling to cap and isolate compute — and closes with hard guardrails so a single bad query can never page you about the invoice again.

1. Pick the right pricing model before you tune anything

BigQuery bills compute two ways, and choosing wrong dwarfs every other optimization.

On-demand: you pay per TiB of data scanned by queries (storage is billed separately). Zero capacity to manage, but cost is a direct function of bytes read, and a single SELECT * over a fat table can cost real money. There is a per-project concurrency ceiling on slots but no reservation to manage.
Capacity (Editions): you buy slots (units of compute) under Standard, Enterprise, or Enterprise Plus editions. You pay for slot-time, not bytes. Slots can be on-demand-style autoscaling, or purchased as 1-year / 3-year commitments at a discount. Cost is a function of compute consumed over time, decoupled from bytes.

The crossover is about predictability and volume, not a magic TiB number. A rough decision frame:

Signal	Lean on-demand	Lean capacity (Editions)
Monthly query volume	Low / spiky / unpredictable	High and sustained
Spend pattern	A few analysts, bursty	Steady pipelines + BI dashboards
Need for cost ceiling	Per-query byte limits suffice	Want a hard slot cap on total compute
Workload isolation	Not required	ETL must not starve BI, etc.

The honest test: if your on-demand bytes-scanned bill is large and steady month over month, model it against a baseline commitment plus autoscaling. If it is small or wildly spiky, on-demand with aggressive byte limits is usually cheaper and far less operational overhead. You can mix: keep some projects on-demand and assign others to a reservation.

Check what a project is using and inspect recent spend by job before you commit:

-- Bytes billed by user over the last 30 days (on-demand cost driver)
SELECT
  user_email,
  ROUND(SUM(total_bytes_billed) / POW(1024, 4), 2) AS tib_billed,
  COUNT(*) AS jobs
FROM `region-us`.INFORMATION_SCHEMA.JOBS
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
  AND job_type = 'QUERY'
  AND statement_type != 'SCRIPT'
GROUP BY user_email
ORDER BY tib_billed DESC;

2. Read the query plan to find the real bottleneck

Do not optimize by guessing. Every query exposes a stage-by-stage execution plan, and the plan tells you whether you are I/O-bound (reading too much), shuffle-bound (repartitioning too much), or compute-bound.

In the console, the Execution Details tab shows stages with wait/read/compute/write timing and rows in/out. The same data is queryable. The single most important number for cost is total_bytes_processed — that is what you pay for on-demand and what partitioning/clustering attacks directly.

-- Most expensive queries by bytes processed, with cache + slot signal
SELECT
  job_id,
  user_email,
  ROUND(total_bytes_processed / POW(1024, 3), 2) AS gib_processed,
  cache_hit,
  total_slot_ms,
  TIMESTAMP_DIFF(end_time, start_time, SECOND) AS runtime_s,
  SUBSTR(query, 0, 120) AS query_preview
FROM `region-us`.INFORMATION_SCHEMA.JOBS
WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND job_type = 'QUERY'
  AND state = 'DONE'
ORDER BY total_bytes_processed DESC
LIMIT 25;

Interpretation cheatsheet:

High bytes processed, low slot-ms -> classic full-scan problem. Fix with partitioning, clustering, and SELECT discipline (steps 3-5).
High slot-ms, modest bytes -> compute/shuffle heavy. Look for skewed joins, exploding CROSS JOIN/UNNEST, repeated subqueries, or ORDER BY on huge result sets. Materialized views (step 6) often help.
cache_hit = true -> free and instant; lean into result caching for repeated reads (step 6).

Always preview cost before running an exploratory query. The dry run returns the byte estimate without executing:

bq query --use_legacy_sql=false --dry_run \
  'SELECT * FROM `proj.ds.events` WHERE event_date = "2026-05-01"'
# Prints: Query successfully validated. ... will process N bytes

3. Partitioning: cut the table into prunable slices

Partitioning splits a table into segments so the engine can skip entire segments that a query’s filter cannot match (partition pruning). It is the biggest single lever on bytes scanned. BigQuery supports three kinds.

Time-unit partitioning on a DATE, TIMESTAMP, or DATETIME column — by far the most common. Choose the granularity that matches your query filters: DAY for most event/log data, HOUR only for very high-volume short-window queries, MONTH/YEAR for sparse historical data.

CREATE TABLE proj.ds.events (
  event_id   STRING,
  event_ts   TIMESTAMP,
  user_id    STRING,
  country     STRING,
  payload     JSON
)
PARTITION BY DATE(event_ts)
OPTIONS (
  partition_expiration_days = 540,
  require_partition_filter  = TRUE
);

Integer-range partitioning when you filter on a bounded integer (customer ID bucket, tenant ID):

CREATE TABLE proj.ds.txn (
  tenant_id  INT64,
  amount      NUMERIC,
  created      TIMESTAMP
)
PARTITION BY RANGE_BUCKET(tenant_id, GENERATE_ARRAY(0, 4000, 10));

Ingestion-time partitioning when rows have no natural date column — BigQuery partitions by load time and exposes the pseudo-column _PARTITIONTIME (and _PARTITIONDATE):

CREATE TABLE proj.ds.raw_logs (line STRING)
PARTITION BY _PARTITIONDATE;

Critical constraints to internalize:

A table has one partitioning column. Pick the dimension you filter on most.
The hard ceiling is large (tens of thousands of partitions per table), but cardinality still matters: HOUR granularity on years of data burns through it fast. Match granularity to retention and query window.
partition_expiration_days auto-deletes old partitions — the cleanest, cheapest retention mechanism there is.

Convert an existing unpartitioned table by CREATE TABLE ... PARTITION BY ... AS SELECT ... then swapping names. You cannot retrofit partitioning onto a table in place.

4. Clustering: sort within partitions so blocks prune too

Clustering physically co-locates rows that share values in up to four columns, in priority order. Within each partition (or across the whole table if unpartitioned), data is sorted into blocks; a filter on a leading clustering column lets BigQuery skip blocks it cannot match. Clustering compounds with partitioning: partitioning skips partitions, clustering then skips blocks inside the surviving partitions.

CREATE TABLE proj.ds.events (
  event_id STRING,
  event_ts TIMESTAMP,
  user_id  STRING,
  country   STRING,
  payload   JSON
)
PARTITION BY DATE(event_ts)
CLUSTER BY country, user_id;

Rules that actually matter in practice:

Order is significant. List clustering columns from most-frequently-filtered to least. A filter on country (the first key) prunes well; a filter on user_id alone (second key) prunes far less, because the data is sorted by country first.
Best for high-cardinality columns you filter or aggregate on, and for join keys. It also speeds GROUP BY on the leading keys.
Unlike partitioning, you can add or change clustering on an existing table with ALTER TABLE ... SET OPTIONS, but only newly written/modified data is reclustered until BigQuery’s automatic background re-clustering catches up. Re-clustering is free and automatic.
Clustered tables give estimated (not exact) dry-run byte counts, because pruning is decided at run time. The real bill reflects the pruned scan.

-- Add or change clustering keys on an existing table
ALTER TABLE proj.ds.events
SET OPTIONS (clustering_fields = ['country', 'user_id']);

5. Kill full scans with SELECT discipline and forced filters

Even perfectly laid-out tables get hammered by careless SQL. Three habits remove most waste.

Never SELECT * on wide tables. BigQuery is columnar; cost is the sum of the columns you touch across the scanned rows. Selecting five columns instead of fifty can cut bytes ~10x with zero layout changes. Need almost everything except a couple of huge columns? Use SELECT * EXCEPT (payload, raw_blob).

Always filter on the partition column so pruning can engage. The filter must be on the partitioning column itself (or its pseudo-column), with a literal or a constant expression BigQuery can evaluate at planning time:

-- Prunes: filter is on the partition column with a static range
SELECT user_id, country
FROM proj.ds.events
WHERE DATE(event_ts) BETWEEN '2026-05-01' AND '2026-05-07'
  AND country = 'IN';   -- leading clustering key prunes blocks too

Wrapping the partition column in a non-trivial function, or comparing it to a subquery/volatile value, can defeat pruning and silently trigger a full scan. Keep partition predicates simple and static.

Make the filter mandatory. Setting require_partition_filter = TRUE rejects any query that omits a partition filter, turning a class of accidental full scans into an immediate error instead of a four-figure line item:

ALTER TABLE proj.ds.events
SET OPTIONS (require_partition_filter = TRUE);

Now a naked SELECT * FROM proj.ds.events fails fast with a “Cannot query over table … without a filter” error — exactly the guardrail you want on a hot, expensive table.

6. Materialized views, BI Engine, and result caching for hot queries

Once layout is right, attack repeated reads.

Result caching is free and automatic: identical query text against unchanged tables returns from cache in milliseconds at zero cost (cache_hit = true). It is defeated by non-determinism (CURRENT_TIMESTAMP(), RAND()), and by any change to the underlying tables. Parameterize and stabilize query text in dashboards so they actually hit cache.

Materialized views precompute and incrementally maintain an aggregation. BigQuery transparently rewrites qualifying queries against the base table to read the smaller MV — even queries that do not name the view — and refreshes it as base data changes.

CREATE MATERIALIZED VIEW proj.ds.daily_country_counts
PARTITION BY day
CLUSTER BY country
AS
SELECT
  DATE(event_ts) AS day,
  country,
  COUNT(*)        AS events,
  COUNT(DISTINCT user_id) AS uniques
FROM proj.ds.events
GROUP BY day, country;

MVs suit high-frequency aggregations over slowly-appending data. They have real limits — a restricted SQL surface (no arbitrary joins in older forms; check current support before relying on a complex shape) and refresh cost — so reserve them for genuinely hot rollups, not one-off reports.

BI Engine is an in-memory acceleration layer. You buy a small amount of reservation capacity, and BigQuery caches hot data in memory to serve sub-second dashboard queries (Looker Studio, Looker, and connecting BI tools) without re-scanning storage:

# Reserve 4 GiB of BI Engine memory in a location (adjust size/location)
bq update --reservation --project_id=PROJECT_ID \
  --location=US --bi_reservation_size=4294967296

Layer these: result cache for identical reads, MVs for common aggregations, BI Engine for interactive dashboards. Each removes load before it ever reaches a full table scan.

7. Slot reservations, assignments, and autoscaling for isolation

On Editions, you manage compute as reservations (pools of slots) and assignments (which projects/folders/orgs use which pool). This is how you stop ETL from starving BI, and how you put a hard ceiling on total compute spend.

The model has three objects:

Capacity commitment — an optional baseline of slots bought for 1 or 3 years at a discount.
Reservation — a named pool with a baseline (always-on slots) and autoscale max (additional slots that scale on demand, billed only while active).
Assignment — binds a project/folder/org to a reservation for a job type (QUERY, PIPELINE for load/ETL, etc.).

# 1. Optional baseline commitment (Enterprise edition, 1 year)
bq mk --capacity_commitment --project_id=ADMIN_PROJECT \
  --location=US --edition=ENTERPRISE \
  --slots=500 --plan=ANNUAL

# 2. A reservation with a baseline + autoscaling headroom
bq mk --reservation --project_id=ADMIN_PROJECT \
  --location=US --edition=ENTERPRISE \
  --slots=500 --autoscale_max_slots=1000 \
  bi_reservation

# 3. Assign the BI/analytics project to that reservation for queries
bq mk --reservation_assignment --project_id=ADMIN_PROJECT \
  --location=US \
  --reservation_id=bi_reservation \
  --assignee_type=PROJECT \
  --assignee_id=analytics-prod \
  --job_type=QUERY

Create a separate reservation for ETL so a heavy nightly load cannot consume the slots your dashboards need:

bq mk --reservation --project_id=ADMIN_PROJECT \
  --location=US --edition=ENTERPRISE \
  --slots=0 --autoscale_max_slots=2000 \
  etl_reservation

bq mk --reservation_assignment --project_id=ADMIN_PROJECT \
  --location=US \
  --reservation_id=etl_reservation \
  --assignee_type=PROJECT \
  --assignee_id=etl-prod \
  --job_type=QUERY

Patterns that work:

Baseline = your steady floor, autoscale = your spiky peak. A baseline of 0 with autoscaling gives pure pay-as-you-go slots with a hard autoscale_max_slots ceiling — a clean compute cost cap.
Idle slot sharing: by default, idle slots in one reservation can be borrowed by others in the same admin project. Disable ignore_idle_slots behavior per reservation when you need strict isolation and predictable performance.
Assignments inherit down the resource hierarchy: assign a folder and every project beneath it uses that reservation unless overridden.

Verify

Prove that layout changes actually reduce bytes and that capacity controls hold.

Confirm pruning works. Dry-run the same logical query with and without the partition filter and compare the byte estimate — the difference is your pruning win:

# With partition filter (should be small)
bq query --use_legacy_sql=false --dry_run \
  'SELECT user_id FROM `proj.ds.events`
   WHERE DATE(event_ts) = "2026-05-01" AND country = "IN"'

# Without it (full scan — should be dramatically larger, or rejected
# outright if require_partition_filter = TRUE)
bq query --use_legacy_sql=false --dry_run \
  'SELECT user_id FROM `proj.ds.events`'

Inspect partition health — row counts and sizes per partition catch skew and runaway cardinality:

SELECT
  partition_id,
  total_rows,
  ROUND(total_logical_bytes / POW(1024, 3), 2) AS gib
FROM `proj.ds`.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = 'events'
ORDER BY partition_id DESC
LIMIT 20;

Check slot utilization and autoscaling against your reservation, so you can right-size baseline vs autoscale:

SELECT
  reservation_name,
  job_type,
  ROUND(SUM(period_slot_ms) / 1000, 1) AS slot_seconds
FROM `region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE
WHERE period_start >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND reservation_name IS NOT NULL
GROUP BY reservation_name, job_type
ORDER BY slot_seconds DESC;

8. Guardrails: quotas, cost controls, and per-query byte limits

Tuning lowers the floor; guardrails cap the ceiling so a single mistake cannot ruin the month.

Per-query maximum bytes billed kills runaway scans before they run. Set it at session, job, or — best — as a project default:

# Reject any single query that would bill more than 100 GiB
bq query --use_legacy_sql=false \
  --maximum_bytes_billed=107374182400 \
  'SELECT ... FROM `proj.ds.events` WHERE ...'

In SQL you can pin it per statement, and you can set a project-level default so every query inherits the cap:

SET @@query.maximum_bytes_billed = 107374182400;  -- 100 GiB this session

Custom query quotas cap daily bytes per user or per project, enforced by Cloud Quotas / IAM admin quotas — the backstop when someone forgets the byte limit. Set a per-user-per-day and per-project-per-day quota on the BigQuery API’s query-usage metric so spend cannot exceed a known maximum even under abuse.

Budgets and alerts in Cloud Billing notify (or trigger automation via Pub/Sub) at thresholds. Budgets do not stop spend by themselves, so pair them with the byte limits and quotas above; together they form a real cost ceiling rather than a smoke alarm.

Defense in depth: require_partition_filter on hot tables, a project-default maximum_bytes_billed, custom per-user daily quotas, and a billing budget with Pub/Sub automation. Any one can be bypassed; together they make a six-figure surprise structurally impossible.

Enterprise scenario

A fintech platform team I worked with ran clickstream analytics on a single events table approaching 90 TB. They were on-demand, and the bill had drifted past $40k/month. The table was partitioned by DATE(event_ts) — yet bytes scanned stayed enormous. The gotcha: their dbt models filtered on event_ts >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY). Because CURRENT_TIMESTAMP() is non-deterministic, the planner could not resolve the predicate to a static partition range, so pruning silently degraded to a near-full scan on every dashboard refresh. A dry run confirmed it: 88 TB estimated with the filter present.

The fix was two-part. First, pin the lower bound to a static, plannable expression so pruning engages — CURRENT_DATE() resolves at planning time, unlike CURRENT_TIMESTAMP() arithmetic on the column:

SELECT user_id, country
FROM `proj.ds.events`
WHERE DATE(event_ts) >= DATE_SUB(CURRENT_DATE(), INTERVAL 90 DAY)
  AND country = 'IN';

That alone dropped the estimate from 88 TB to under 3 TB. Second, they set require_partition_filter = TRUE on the table so any future model that lost its predicate failed loudly in CI instead of quietly billing five figures. Within a billing cycle the warehouse spend fell roughly 70%, enough that the eventual move to an Enterprise reservation with a 500-slot baseline and 1000-slot autoscale ceiling was a planned capacity decision rather than a panic. The lesson: a partitioned table is not a pruned table — validate the estimate, never the schema.

Checklist

Pitfalls and next steps

The recurring mistakes: assuming you have outgrown on-demand without modelling it against a commitment plus autoscaling (often on-demand wins for spiky workloads); wrapping the partition column in a function and silently disabling pruning; ordering clustering keys by intuition instead of by filter frequency; and treating a billing budget as a cost limit when it only alerts. Remember you cannot retrofit partitioning in place — you rebuild — and that clustered-table dry runs give estimates, so validate against real total_bytes_billed, not the estimate.

Next, push these defaults into infrastructure as code so every new table is born partitioned, clustered, and filter-required, and wire a scheduled query over INFORMATION_SCHEMA.JOBS that flags any query exceeding a bytes-scanned threshold straight into your alerting. At that point cost stops being a monthly surprise and becomes a tuned, observable property of the warehouse.

Taming BigQuery Cost and Performance: Partitioning, Clustering, and Reservations

1. Pick the right pricing model before you tune anything

2. Read the query plan to find the real bottleneck

3. Partitioning: cut the table into prunable slices

4. Clustering: sort within partitions so blocks prune too

5. Kill full scans with SELECT discipline and forced filters

6. Materialized views, BI Engine, and result caching for hot queries

7. Slot reservations, assignments, and autoscaling for isolation

Verify

8. Guardrails: quotas, cost controls, and per-query byte limits

Enterprise scenario

Checklist

Pitfalls and next steps

Written by Vinod

Comments

Keep Reading

BigQuery Fine-Grained Security: Column-Level, Row-Level, and Data Masking

Cloud DNS at Scale: Private Zones, Peering, Forwarding, and Response Policies

Event-Driven Architecture with Cloud Functions 2nd Gen and Eventarc