A shared data lake bucket starts clean and ends as a 20 KB bucket policy that nobody dares to edit. Forty teams, each needing a different prefix, a different VPC restriction, a different account — all crammed into one JSON document with a hard 20 KB ceiling and a single point of failure. S3 Access Points exist precisely to break that document apart: each consumer gets its own named endpoint to the bucket, with its own policy, and the bucket policy shrinks to one line that says “trust access points.” On top of that, S3 Object Lambda lets you transform objects on the read path without copying data, and Multi-Region Access Points (MRAP) give you one global endpoint over replicated buckets. This guide wires all three together the way a platform team actually does it.
The reason this matters in production terms: a shared bucket is a contention point, a blast-radius concentrator, and a correctness hazard all at once. One bad edit to the shared policy breaks every consumer; one team’s prefix wildcard accidentally grants another team’s data; one nightly Glue job that writes a redacted copy doubles your storage and introduces staleness the moment it runs. Access points, Object Lambda, and MRAP each attack one of those failure modes — single-purpose policies kill contention and blast radius, in-flight transforms kill the duplicate copy, and a global endpoint over replicated buckets kills the multi-region routing/divergence problem.
By the end you will stop treating the bucket policy as the place where authorization happens. You will know how to decompose it into N small, auditable access-point policies; when to reach for a VPC-bound access point versus a cross-account one; how to insert a Lambda into the GET path safely (including the Range-request trap that breaks naive transforms); and how to stand up a global active-active endpoint with the SigV4A signing that every first MRAP integration forgets. Every design decision below comes with the limit that constrains it, the error you’ll see when you get it wrong, and the exact aws command to confirm the fix.
What problem this solves
A single S3 bucket policy is one JSON document with a 20 KB hard size limit, edited by one change process, governing every principal that touches the bucket. That works for a private bucket with three IAM roles. It collapses for a shared data lake. The pain shows up in four distinct ways, none of which a bigger policy fixes:
- Blast radius. One malformed edit (a missing comma, a too-broad Resource, a fat-fingered Deny) breaks every consumer at once. There is no isolation between tenants when they all live in one policy.
- Change contention. Forty teams cannot safely co-edit one JSON file. Every onboarding is a coordination event, every change a merge-conflict risk, and the policy eventually grows past the size ceiling and simply won’t save, which is how partner onboarding fails at 18 KB.
- No per-tenant network control. You cannot cleanly say “team A reaches this prefix only from VPC-A, team B only from VPC-B” inside one document. Conditions pile up and become unreadable.
- Duplicate-copy redaction. When two audiences need two views of the same object (analysts see raw, partners must not see PII), teams reach for a nightly job that writes a second redacted copy — doubling storage and adding hours of staleness.
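To make the size ceiling concrete, here is a rough sketch that counts how many per-team statements fit in one monolithic policy before it crosses 20 KB. The statement shape, the team names, and the VPC endpoint IDs are all hypothetical, and measuring the compact JSON encoding is an assumption about how the limit is counted, so treat the result as an order-of-magnitude estimate rather than an exact budget.

```python
import json

POLICY_LIMIT_BYTES = 20 * 1024  # the 20 KB ceiling discussed above


def tenant_statement(team: str, account_id: str, vpce_id: str) -> dict:
    """One hypothetical per-team statement in a monolithic bucket policy."""
    return {
        "Sid": f"Allow{team.capitalize()}",
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{team}-etl"},
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": f"arn:aws:s3:::datalake-shared-prod/{team}/*",
        "Condition": {"StringEquals": {"aws:SourceVpce": vpce_id}},
    }


def monolith(n: int) -> list:
    """Statements for n illustrative teams."""
    return [
        tenant_statement(f"team{i:03d}", "111122223333", f"vpce-{i:017x}")
        for i in range(n)
    ]


def policy_bytes(statements: list) -> int:
    """Size of the policy document in compact JSON (an approximation)."""
    doc = {"Version": "2012-10-17", "Statement": statements}
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


def teams_that_fit(limit: int = POLICY_LIMIT_BYTES) -> int:
    """Largest number of teams whose statements stay within the limit."""
    n = 0
    while policy_bytes(monolith(n + 1)) <= limit:
        n += 1
    return n
```

Real statements tend to be larger than this one (several actions, several conditions), so the practical headroom is usually smaller than the sketch suggests.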
Who hits this: every platform team that owns a shared bucket fronting a data lake, a media library, a partner-export surface, or a multi-tenant SaaS storage tier. It bites hardest the moment you cross account boundaries (cross-account sharing crammed into a bucket policy), the moment PII enters a shared dataset (the duplicate-copy trap), and the moment you go multi-region (routing and divergence). Here is the field this article covers, the question each feature forces, and where you reach first:
| Capability | What it actually does | First question it forces | Where you configure it | Most common first mistake |
|---|---|---|---|---|
| Access Point | A named endpoint with its own policy + BPA on one bucket | Should the bucket policy authorize, or just delegate? | aws s3control create-access-point | Forgetting the bucket-policy delegation statement |
| VPC-bound Access Point | An access point reachable only via an interface/gateway endpoint in one VPC | Must this data ever have a public path? | --vpc-configuration at create | No matching VPC endpoint → all requests fail |
| Object Lambda Access Point (OLAP) | A Lambda inserted into the GET path; transforms bytes per request | Is the transform length-preserving (range-safe)? | create-access-point-for-object-lambda | Not handling Range/partNumber headers |
| Multi-Region Access Point | One global endpoint routing to the closest healthy bucket | Is the data actually replicated, or just routed? | create-multi-region-access-point | Routing over divergent (un-replicated) data |
| Cross-account Access Point | An access-point policy granting a principal in another account | Did both sides (AP policy + consumer IAM) allow it? | put-access-point-policy | Granting only one side of the cross-account pair |
Learning objectives
By the end of this article you can:
- Explain why a single bucket policy fails at scale (size, blast radius, contention, network) and rewrite it into a one-statement delegation document using s3:DataAccessPointAccount or s3:DataAccessPointArn.
- Create per-application access points with small, single-purpose policies, and address objects through the access-point ARN and hostname instead of the bucket name.
- Lock an access point to a single VPC so it has no public network path, and reason about how access-point BPA composes with account/bucket BPA (most-restrictive wins).
- Insert a Lambda into the S3 GET path with an Object Lambda Access Point, wire the supporting access point correctly, and stream transformed bytes back via WriteGetObjectResponse.
- Decide whether your transform is range-safe and either support GetObject-Range / GetObject-PartNumber or reject ranges with a clean 501 so SDK multipart downloads fall back to a full GET.
- Stand up a Multi-Region Access Point over regional buckets, back it with two-way Cross-Region Replication (and RTC for a 15-minute SLA), and drive a reversible failover with TrafficDialPercentage.
- Sign MRAP requests with SigV4A (CRT signing enabled) and diagnose the SignatureDoesNotMatch you get when you forget.
- Read the error/limit reference for all three features and attribute every request to the access point it came through via CloudTrail data events and request metrics.
Prerequisites & where this fits
You should already be comfortable with the S3 fundamentals — storage classes, versioning, lifecycle, and encryption, because access points sit on top of a normal bucket and inherit its versioning and encryption behaviour. You should understand IAM policy evaluation — users, roles, policies, and how Allow/Deny resolve, since the whole access-point model is a layering of resource policies, and you should know the least-privilege and permission-boundary patterns that make per-tenant policies safe. Cross-account sharing assumes familiarity with AWS Organizations, SCPs, and delegated administration. For the VPC-bound path you’ll want the VPC deep dive — subnets, routing, IGW, NAT, and endpoints, and Object Lambda assumes the Lambda deep dive — runtimes, triggers, layers, and concurrency.
This sits in the Storage & data-access track, one layer above plain S3 and one layer below a full S3 data protection and governance at scale program. Think of it as the access-plane design for a shared bucket: the bucket itself (and its replication, encryption, lifecycle) is the data plane; access points, OLAP, and MRAP are how you expose, transform, and globalize that plane without copying it. A quick map of who owns what during a shared-bucket design, so you pull in the right person:
| Layer | What lives here | Who usually owns it | What it can break |
|---|---|---|---|
| Bucket policy | Delegation statement only | Platform / data-lake team | Every consumer (blast radius) if it authorizes directly |
| Access-point policy | Per-tenant fine-grained grants | The consuming team (delegated) | One tenant’s access only |
| BPA (account + AP) | Public-access guardrails | Security / platform | Re-exposure if loosened (it can’t be, by design) |
| VPC + endpoints | Interface/gateway endpoint, endpoint policy | Network team | VPC-bound AP unreachable if endpoint missing |
| Lambda transform | The OLAP function + its role | App/data team | Wrong/empty bytes; latency on every GET |
| Replication + MRAP | CRR rules, RTC, routing dials | Platform / SRE | Divergent data served as “one” global object |
Core concepts
Five mental models make every later decision obvious.
An access point is a named door, not a copy. An S3 access point is a named network endpoint attached to a single bucket, each with its own resource policy, its own Block Public Access (BPA) settings, and optionally a VPC restriction. It is not a copy of the data and not a new storage location — it is an alternate front door into the same objects, with its own lock. Requests through it still reach the bucket, so the bucket owner delegates to access points with one statement and the access-point policy does the fine-grained work.
The bucket policy becomes a delegation document, not an authorization document. This is the single biggest shift. Instead of “allow role/finance-etl to s3:GetObject on finance/*,” the bucket policy says “allow any request that arrived via an access point owned by this account” (s3:DataAccessPointAccount). The authorization moves into N small access-point policies you can actually reason about. You go from one unauditable monolith to many single-responsibility policies.
Block Public Access composes as the most-restrictive of the layers. Every access point carries its own BPA, and the effective setting is the most restrictive of the access-point setting and the account/bucket setting. You can never use an access point to loosen public access the account locked down — if the account blocks public policies, no access point can re-expose the bucket. This is a one-way ratchet by design, and it’s why VPC-bound access points are the strongest network control S3 offers for a shared bucket.
Object Lambda transforms on the read path, so you keep exactly one authoritative copy. When a client GETs through an Object Lambda Access Point, S3 invokes your Lambda, hands it a pre-signed URL to the original object, and your function returns transformed bytes via WriteGetObjectResponse — without writing a derived copy back to S3. Redaction, PII masking, row filtering, watermarking, format conversion all happen per request, based on who is asking. The cost is a Lambda invocation per GET and the latency it adds.
MRAP routes; replication copies — they are different jobs. A Multi-Region Access Point is one global hostname that routes each request to the lowest-latency healthy underlying bucket using AWS Global Accelerator anycast under the hood. It does not move data between regions. “The object exists in the other region” is your responsibility via Cross-Region Replication (CRR). An MRAP over un-replicated buckets is latency routing over divergent data — a correctness bug waiting to happen. And because a single global request can be served from any region, it cannot be signed with region-scoped SigV4; it requires SigV4A.
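The split between routing and copying can be checked mechanically before an MRAP goes live. The sketch below assumes you have already collected your replication rules as (source region, destination region) pairs; that input shape and the function name are hypothetical, and the full-mesh requirement for three or more regions is an assumption you should confirm against your own replication design.

```python
from itertools import permutations


def missing_replication(regions, rules):
    """Return the (source, destination) region pairs that have no CRR rule.

    regions: region names of the buckets behind the MRAP.
    rules:   iterable of (source, destination) pairs that already replicate.
    """
    required = set(permutations(regions, 2))  # every ordered pair, both directions
    return sorted(required - set(rules))


# One-way replication only: the reverse direction is reported as missing.
gaps = missing_replication(
    ["us-east-1", "eu-west-1"],
    [("us-east-1", "eu-west-1")],
)
print(gaps)  # [('eu-west-1', 'us-east-1')]
```

An empty result means every bucket behind the endpoint receives writes made in every other region; anything else is a pair of regions where the MRAP could route a read to data that was never copied.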
The vocabulary in one table
Pin down every moving part before the deep sections. The glossary repeats these for lookup; this is the model side by side:
| Term | One-line definition | Where it lives | Why it matters |
|---|---|---|---|
| Access point (AP) | Named endpoint + policy + BPA on one bucket | Account, region of the bucket | Decomposes the bucket policy |
| Supporting access point | A plain AP that an OLAP points at | On the bucket | OLAP’s data source; must be the full ARN |
| Object Lambda AP (OLAP) | Lambda inserted into the GET path | References a supporting AP | Transform on read, no copy |
| MRAP | One global endpoint over multi-region buckets | Global (no region segment) | Latency routing + failover |
| CRR | Cross-Region Replication between buckets | Replication config on buckets | Keeps MRAP data consistent |
| RTC | Replication Time Control (15-min SLA) | CRR rule option | Bounded replication lag |
| s3:DataAccessPointAccount | Condition key: “request came via an AP in this account” | Bucket policy condition | The delegation statement |
| s3:DataAccessPointArn | Condition key: scope to specific AP ARNs | Bucket policy condition | Tighter delegation |
| BPA | Block Public Access (4 flags) | Account + each AP | Most-restrictive wins |
| SigV4A | Multi-region request signing variant | Client/SDK signer | Mandatory for MRAP |
| WriteGetObjectResponse | API the OLAP Lambda calls to return bytes | Lambda code + IAM | How transformed bytes reach the caller |
| VPC configuration | Binds an AP to one VPC | AP at create time | No public path, ever |
And the three access-point flavors side by side — pick by what job you need the door to do:
| Flavor | Primary job | Adds over a bucket | Signing | Reach for it when |
|---|---|---|---|---|
| Standard AP | Per-tenant policy + BPA | Decomposed authorization | SigV4 | Many teams/prefixes on one bucket |
| VPC-bound AP | No public path | Network isolation | SigV4 | Data must never be internet-reachable |
| Object Lambda AP | Transform on read | In-flight redaction/convert | SigV4 | Two views of one object (raw vs masked) |
| MRAP | Global routing + failover | Multi-region single endpoint | SigV4A | Active-active low-latency reads |
The mental model: an access point is a named door, not a copy
An S3 access point is a named network endpoint attached to a single bucket, each with its own resource policy, its own Block Public Access settings, and optionally a VPC restriction. It is not a copy of the data and it is not a new storage location — it is an alternate front door into the same objects, with its own lock.
Three facts drive every design decision below:
- Each access point has its own policy and its own BPA. You decompose one giant bucket policy into N small, single-purpose policies — one per application, team, or account.
- An access point can optionally be locked to a single VPC. Once you set VpcConfiguration, that access point only answers requests arriving over an S3 VPC endpoint (gateway or interface) in that VPC. There is no public path, ever.
- Requests through an access point still hit the bucket policy. The access point is additive. The bucket owner delegates to access points with one statement; the access point policy then does the fine-grained work.
Internalize this: the bucket policy becomes a delegation document (“allow access via my access points”), and the access point policies become the authorization documents. You move from one unauditable monolith to many small, single-responsibility policies you can actually reason about.
Access points use a distinct ARN shape and a distinct hostname, so application code addresses the access point, not the bucket:
arn:aws:s3:us-east-1:111122223333:accesspoint/finance-reports-ap
https://finance-reports-ap-111122223333.s3-accesspoint.us-east-1.amazonaws.com
The ARN and hostname forms differ across the three access-point types — using the wrong one is the most common “it worked in the console but not in code” error. Keep this table open while you write client code:
| Access-point type | ARN form | Hostname / how clients address it | Region segment? |
|---|---|---|---|
| Standard AP | arn:aws:s3:<region>:<acct>:accesspoint/<name> | <name>-<acct>.s3-accesspoint.<region>.amazonaws.com | Yes |
| AP object ref (in policy) | …:accesspoint/<name>/object/<key-or-prefix> | n/a (policy Resource) | Yes |
| Object Lambda AP | arn:aws:s3-object-lambda:<region>:<acct>:accesspoint/<name> | pass the ARN as --bucket to get-object | Yes |
| MRAP | arn:aws:s3::<acct>:accesspoint/<id>.mrap | <id>.mrap.accesspoint.s3-global.amazonaws.com | No (global) |
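Because the forms are easy to mix up, it can help to generate them from one place instead of hand-typing them in client code. This is a minimal sketch of string builders for the regional forms in the table; the function names are hypothetical and nothing here calls AWS.

```python
def access_point_arn(region: str, account_id: str, name: str) -> str:
    """ARN of a standard access point."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def access_point_hostname(region: str, account_id: str, name: str) -> str:
    """Virtual-hosted style hostname of a standard access point."""
    return f"{name}-{account_id}.s3-accesspoint.{region}.amazonaws.com"


def access_point_object_arn(region: str, account_id: str, name: str, key_or_prefix: str) -> str:
    """Object reference used as a policy Resource (note the /object/ segment)."""
    return f"{access_point_arn(region, account_id, name)}/object/{key_or_prefix}"


def object_lambda_arn(region: str, account_id: str, name: str) -> str:
    """ARN of an Object Lambda Access Point (different service namespace)."""
    return f"arn:aws:s3-object-lambda:{region}:{account_id}:accesspoint/{name}"


print(access_point_arn("us-east-1", "111122223333", "finance-reports-ap"))
print(access_point_hostname("us-east-1", "111122223333", "finance-reports-ap"))
```

The two printed values reproduce the ARN and hostname shown earlier for finance-reports-ap.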
And the naming/ARN identifiers you’ll juggle, with their constraints — names are not arbitrary, and collisions are scoped by account:
| Identifier | Format / rule | Constraint | Collision scope |
|---|---|---|---|
| AP name | lowercase, hyphens, 3–50 chars | No underscores, no uppercase | Per account + region |
| AP alias | auto-generated, S3-style | Read-only; used in some hostnames | Globally unique |
| MRAP name | 3–50 chars, lowercase | Cannot be reused after delete for a while | Per account |
| MRAP alias | auto-generated <id>.mrap | Immutable; this is what clients use | Globally unique |
| OLAP name | lowercase, hyphens | Same rules as AP | Per account + region |
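A name check like the following can run in CI before any create call. It encodes the rules from the table (lowercase, hyphens, 3 to 50 characters, no underscores or uppercase); allowing digits and requiring the name to start and end with a letter or digit are assumptions on my part, so verify them against the current S3 naming rules before relying on this.

```python
import re

# 3-50 characters: lowercase letters, digits, hyphens; no leading or trailing hyphen.
_AP_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,48}[a-z0-9]")


def is_valid_access_point_name(name: str) -> bool:
    """Return True if the name satisfies the access-point naming rules above."""
    return _AP_NAME.fullmatch(name) is not None


for candidate in ("finance-reports-ap", "Finance_AP", "ab", "-bad-ap"):
    print(candidate, is_valid_access_point_name(candidate))
```

Only the first candidate passes; the others fail on case and underscore, length, and a leading hyphen respectively.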
Why bucket policies break down at scale
A single bucket policy has a 20 KB size limit. That sounds generous until you have dozens of consumers, each needing a Condition block for their VPC, their aws:PrincipalOrgID, their prefix, and their allowed actions. You also hit operational problems that have nothing to do with size:
- Blast radius. One bad edit to the shared policy can break every consumer at once. There is no isolation.
- Change contention. Forty teams cannot safely co-edit one JSON file. Every change is a coordination event.
- No per-tenant network controls. You cannot say “team A reaches this prefix only from VPC-A, team B only from VPC-B” cleanly in one document.
Access points solve all four: separate policies (separate size budgets), per-access-point blast radius, independent change ownership, and per-access-point VPC binding. The bucket policy collapses to a delegation statement:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DelegateToAccessPoints",
      "Effect": "Allow",
      "Principal": { "AWS": "*" },
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::datalake-shared-prod",
        "arn:aws:s3:::datalake-shared-prod/*"
      ],
      "Condition": {
        "StringEquals": { "s3:DataAccessPointAccount": "111122223333" }
      }
    }
  ]
}
That s3:DataAccessPointAccount condition is the key: it says “permit any request that arrived via an access point owned by this account.” The wildcard principal is deliberate and is bounded by that condition; it is what lets a principal granted by an access-point policy, including one in another account, pass the bucket-policy check without a bucket-policy edit. The bucket stops making fine-grained decisions and lets the access points do it. (You can also use s3:DataAccessPointArn to scope to specific access points.)
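If you generate the delegation document from code, both condition keys fit in one small builder. This is a sketch only: the function name and parameters are hypothetical, the principal is left to the caller, and pairing s3:DataAccessPointArn with StringLike is my choice for wildcard matching rather than something the text above prescribes.

```python
import json


def delegation_policy(bucket, account_id, principal, ap_arn_patterns=None):
    """Build a one-statement bucket policy that delegates to access points.

    principal:       value for Principal.AWS (chosen by the caller).
    ap_arn_patterns: optional access-point ARN patterns for tighter delegation.
    """
    condition = {"StringEquals": {"s3:DataAccessPointAccount": account_id}}
    if ap_arn_patterns:
        # Both condition blocks must match, giving account + ARN scoping.
        condition["StringLike"] = {"s3:DataAccessPointArn": list(ap_arn_patterns)}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DelegateToAccessPoints",
                "Effect": "Allow",
                "Principal": {"AWS": principal},
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": condition,
            }
        ],
    }


policy = delegation_policy(
    "datalake-shared-prod",
    "111122223333",
    principal="*",  # assumption: wildcard bounded by the condition key
    ap_arn_patterns=["arn:aws:s3:us-east-1:111122223333:accesspoint/finance-*"],
)
print(json.dumps(policy, indent=2))
```

Dropping ap_arn_patterns yields the account-wide delegation; supplying it yields the combined, tightest form.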
Here is the honest before/after — the monolith versus the decomposed model — across the dimensions that actually bite a platform team:
| Dimension | Monolithic bucket policy | Access-point model |
|---|---|---|
| Size budget | One 20 KB document for everyone | 20 KB per access point, plus a tiny bucket policy |
| Blast radius | One edit can break all consumers | One AP policy affects one consumer |
| Change ownership | Central team gatekeeps every edit | Each team owns its AP policy (delegated) |
| Per-tenant network | Conditions piled into one doc | VpcConfiguration per access point |
| Auditability | One giant doc, hard to reason about | N single-purpose docs, each auditable |
| Revocation | Edit and hope you didn’t break others | Delete the access point — clean cut |
| Cross-account | Crammed in with Principal + conditions | Lives in one AP policy, bucket untouched |
The two delegation condition keys are not interchangeable — pick by how much you trust the access-point layer:
| Condition key | What it permits | Use when | Risk if misused |
|---|---|---|---|
| s3:DataAccessPointAccount | Any AP owned by the named account | You trust every AP in the account equally | A rogue AP in the account inherits trust |
| s3:DataAccessPointArn | Only the listed AP ARNs (supports wildcards) | You want to allow-list specific access points | Easy to forget to add a new AP → access denied |
| Both, combined | Account and ARN pattern must match | Tightest: account-scoped + ARN-scoped | More to maintain |
| Neither (direct grants) | Classic per-principal authorization | You are not using access points | The monolith problem returns |
Creating access points with scoped policies
Create an access point per application. The first one is internet-routable but still gated by its policy and BPA; you scope it down with the policy:
aws s3control create-access-point \
--account-id 111122223333 \
--name finance-reports-ap \
--bucket datalake-shared-prod
Now attach a policy that confines this access point to one prefix and one set of actions. Note the access point ARN in Resource and the /object/ segment used to reference objects through the access point:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "FinanceReadWritePrefix",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/finance-etl"
      },
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:us-east-1:111122223333:accesspoint/finance-reports-ap/object/finance/*"
    }
  ]
}
Apply it:
aws s3control put-access-point-policy \
--account-id 111122223333 \
--name finance-reports-ap \
--policy file://finance-ap-policy.json
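With many access points, the policy file is usually templated rather than hand-written. The sketch below generates the same single-prefix policy from parameters; the function name, its arguments, and the default Sid are hypothetical, and it only builds the document (you would still apply it with put-access-point-policy).

```python
import json


def prefix_policy(region, account_id, ap_name, principal_arn, prefix, actions,
                  sid="ScopedPrefixAccess"):
    """Build an access-point policy confined to one prefix and a set of actions."""
    resource = (
        f"arn:aws:s3:{region}:{account_id}:accesspoint/{ap_name}"
        f"/object/{prefix.strip('/')}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": sid,
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


finance = prefix_policy(
    "us-east-1", "111122223333", "finance-reports-ap",
    "arn:aws:iam::111122223333:role/finance-etl",
    prefix="finance",
    actions=["s3:GetObject", "s3:PutObject"],
    sid="FinanceReadWritePrefix",
)
print(json.dumps(finance, indent=2))
```

Serializing the result to finance-ap-policy.json gives the file the CLI command above expects.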
The create-access-point call accepts a focused set of parameters; knowing the default and the gotcha for each removes the guesswork:
| Parameter | What it sets | Default | When to set it | Gotcha |
|---|---|---|---|---|
| --name | The access-point name (→ hostname) | required | always | Immutable; lowercase, no underscores |
| --bucket | The bucket it fronts | required | always | One bucket per AP; cannot be changed |
| --bucket-account-id | Owner account of the bucket | the caller | cross-account bucket | Needed when AP and bucket are in different accounts |
| --vpc-configuration | Binds the AP to one VPC | none (internet) | private-only data | No internet path once set; needs a VPC endpoint |
| --public-access-block-configuration | The four BPA flags on this AP | all true | rarely loosen | Cannot loosen below account BPA |
The four BPA flags, what each blocks, and why you set all of them unless audited otherwise:
| BPA flag | Blocks | Default on AP | When you’d ever set false |
|---|---|---|---|
| BlockPublicAcls | New public ACLs on PUT | true | Legacy ACL-based workflow (avoid) |
| IgnorePublicAcls | Honoring existing public ACLs | true | Almost never |
| BlockPublicPolicy | Public bucket/AP policy statements | true | Never on a shared data lake |
| RestrictPublicBuckets | Cross-account/anonymous via public policy | true | Never on a shared data lake |
VPC-only access points
For an access point that must never be reachable from the internet, bind it to a VPC at creation. This is the single strongest network control S3 offers for a shared bucket: S3 rejects any request to the access point that does not originate in the bound VPC, so there is no public path to the data through it.
aws s3control create-access-point \
--account-id 111122223333 \
--name analytics-internal-ap \
--bucket datalake-shared-prod \
--vpc-configuration VpcId=vpc-0abc123def456 \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
A VPC-bound access point is reachable only through an S3 interface endpoint (or gateway endpoint) in that VPC. Combine it with an endpoint policy and you have a closed loop: traffic stays on the AWS network, and the access point rejects anything from outside vpc-0abc123def456.
The network-origin matrix — what each access-point shape will and won’t answer — is the difference between “locked down” and “accidentally public”:
| Access-point shape | Reachable from internet? | Reachable from VPC (endpoint)? | Reachable cross-account? | Strongest guarantee |
|---|---|---|---|---|
| Standard AP (policy-gated) | Yes, if policy allows | Yes (via endpoint) | Yes, if AP policy + IAM allow | Policy is the only gate |
| VPC-bound AP | No (requests from outside the VPC are rejected) | Yes (only that VPC) | Only when the caller's request originates in that VPC | No public path, ever |
| OLAP (on a supporting AP) | Yes, if policy allows | Yes | Yes | Transform always runs |
| MRAP | Yes (global anycast) | Yes (via endpoint, regional) | Yes, if policies allow | Latency routing + failover |
You can also pair the VPC-bound access point with an S3 gateway or interface endpoint policy for defense in depth. The two endpoint types differ in ways that matter for access points:
| Endpoint type | Cost | Used by | Works with VPC-bound AP | Note |
|---|---|---|---|---|
| Gateway endpoint | Free | Route-table prefix list | Yes | No private DNS; same-region only |
| Interface endpoint (PrivateLink) | Hourly + per-GB | ENI + private DNS | Yes | Private IP; cross-region capable |
| No endpoint | n/a | Public internet | No (AP won’t answer) | VPC-bound AP requires an endpoint |
Delegated cross-account access
Access points shine for cross-account sharing because the access point policy can grant to a principal in another account, and that consumer addresses the access point ARN directly — they never see your bucket name. The owning account still controls everything via the access point policy plus the delegating bucket policy.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PartnerReadOnly",
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::444455556666:role/partner-ingest" },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:us-east-1:111122223333:accesspoint/partner-share-ap/object/exports/*",
      "Condition": {
        "StringEquals": { "aws:PrincipalOrgID": "o-exampleorgid" }
      }
    }
  ]
}
The cross-account principal also needs a matching Allow in its own IAM policy — cross-account always requires both sides. But critically, the bucket policy stays untouched; all the partner-specific logic lives in one small access point policy you can revoke by deleting the access point.
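Cross-account access fails in predictable ways, and the caller sees the same AccessDenied no matter which side is at fault. This is the four-cornered checklist: the first three rows must all allow the request or the consumer gets AccessDenied, and the fourth is the guard that keeps the grant from being over-broad: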
| Side | What must allow | Owned by | Symptom if missing |
|---|---|---|---|
| Bucket policy (delegation) | s3:DataAccessPointAccount/Arn for the AP | Producer | AccessDenied even with a perfect AP policy |
| Access-point policy | The consumer principal + action + /object/ resource | Producer | AccessDenied; AP didn’t grant |
| Consumer IAM policy | s3:GetObject on the AP ARN (or *) | Consumer | AccessDenied; consumer side never allowed it |
| Org/condition guard | aws:PrincipalOrgID (optional, recommended) | Producer | Over-broad grant if omitted; confused-deputy risk |
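The checklist translates into a small triage helper for support tickets. This is a simplified model, not a policy evaluator: the three boolean inputs are assumed to come from your own inspection of each policy, and the function and message text are hypothetical.

```python
def cross_account_findings(bucket_delegates, ap_policy_allows,
                           consumer_iam_allows, org_guard=True):
    """Map the four checklist rows to findings; DENY rows cause AccessDenied."""
    findings = []
    if not bucket_delegates:
        findings.append("DENY bucket policy: no delegation statement covers this access point")
    if not ap_policy_allows:
        findings.append("DENY access-point policy: principal, action, or /object/ resource not granted")
    if not consumer_iam_allows:
        findings.append("DENY consumer IAM: no Allow for the action on the access-point ARN")
    if not org_guard:
        findings.append("WARN org guard: aws:PrincipalOrgID condition missing, grant may be over-broad")
    return findings


def is_denied(findings):
    """True if any finding would produce AccessDenied."""
    return any(f.startswith("DENY") for f in findings)


report = cross_account_findings(True, True, False)
print(is_denied(report), report)
```

Because every DENY row produces the same error for the caller, listing all failing rows at once saves a round trip between the producer and consumer teams.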
Block Public Access inheritance and naming patterns
Every access point carries its own BPA configuration, and it is the most restrictive of the access point setting and the account/bucket setting that wins. You cannot use an access point to loosen public-access controls that the account-level BPA has locked down. If the account blocks public policies, no access point can re-expose the bucket. Set all four flags on every access point unless you have an explicit, audited reason not to:
# BPA is set at create time; inspect it:
aws s3control get-access-point \
--account-id 111122223333 \
--name finance-reports-ap \
--query 'PublicAccessBlockConfiguration'
For shared datasets, adopt a naming convention that encodes ownership and intent, because the hostname is derived from the name. A consistent scheme — {team}-{dataset}-{rw|ro}-ap — keeps endpoints self-documenting and makes IAM Resource wildcards predictable:
| Access point name | Purpose | Network |
|---|---|---|
| finance-reports-rw-ap | Finance ETL read/write | Internet + policy |
| analytics-events-ro-ap | Analytics read-only | VPC-only |
| partner-share-ro-ap | Cross-account export | Internet + OrgID |
The ARN and hostname both embed the account ID, which is why two accounts can have an access point named reports-ap over different buckets without collision.
A naming convention is not cosmetic — each token does load-bearing work for IAM wildcards and operational clarity:
| Token | Encodes | Example | Why it pays off |
|---|---|---|---|
| {team} | Owning team | finance | Group AP policies by team; predictable wildcards |
| {dataset} | What data | reports | Self-documenting endpoint; maps to a prefix |
| {rw\|ro} | Intended access | ro | At-a-glance least-privilege intent |
| -ap suffix | Resource type | -ap / -olap / -mrap | Distinguish AP vs OLAP vs MRAP in logs |
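A convention only pays off if it is enforced, and a parser is enough to do that in a lint step. The sketch below assumes the team token contains no hyphens and that the suffix is one of ap, olap, or mrap; both are my reading of the scheme above, and the function name is hypothetical.

```python
import re

# {team}-{dataset}-{rw|ro}-{ap|olap|mrap}; dataset may itself contain hyphens.
_CONVENTION = re.compile(
    r"(?P<team>[a-z0-9]+)-(?P<dataset>[a-z0-9-]+)-(?P<mode>rw|ro)-(?P<kind>ap|olap|mrap)"
)


def parse_access_point_name(name: str):
    """Split a conforming name into its tokens, or return None if it does not conform."""
    match = _CONVENTION.fullmatch(name)
    return match.groupdict() if match else None


print(parse_access_point_name("finance-reports-rw-ap"))
print(parse_access_point_name("reports-ap"))  # does not follow the convention
```

A conforming name yields its team, dataset, mode, and kind; a non-conforming one yields None, which a lint step can turn into a failed check.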
The effective-BPA composition is worth a truth table, because “I set the AP to allow X” does not mean X is allowed — the account can still block it:
| Account BPA | Access-point BPA | Effective behavior | Can the AP re-expose? |
|---|---|---|---|
| All blocked | All blocked | Fully private | No |
| All blocked | Loosened (false) | Still fully private (account wins) | No |
| Loosened | All blocked | Private for this AP | No (AP is stricter) |
| Loosened | Loosened | Public policy possible (audit hard) | Yes — avoid on shared data |
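The truth table reduces to a per-flag logical OR across layers: a flag is in force if any layer sets it. Here is a minimal sketch of that composition, with hypothetical names, taking each layer as a plain dict of the four flags.

```python
BPA_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def effective_bpa(*layers):
    """Most-restrictive-wins: a flag is effective if any layer enables it."""
    return {flag: any(layer[flag] for layer in layers) for flag in BPA_FLAGS}


all_blocked = dict.fromkeys(BPA_FLAGS, True)
loosened = dict.fromkeys(BPA_FLAGS, False)

# Account blocks everything, access point tries to loosen: still fully blocked.
print(effective_bpa(all_blocked, loosened))
```

Passing account, bucket, and access-point settings as three layers gives the same answer as the table: only when every layer is loosened does a flag come out false.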
S3 Object Lambda: transform on the read path
Object Lambda inserts your own Lambda function into the GET path. When a client reads an object through an Object Lambda Access Point, S3 invokes your function and hands it a pre-signed URL to the original object; your function fetches the object and returns the transformed bytes to the caller without writing a derived copy back to S3. This is the right tool for redaction, PII masking, row-level filtering, watermarking, and format conversion, because you keep exactly one authoritative copy and transform per-request based on who is asking.
The topology has three layers:
- The bucket holds the authoritative object.
- A supporting access point (a normal access point) sits on the bucket.
- The Object Lambda Access Point points at that supporting access point and names the Lambda transform.
The Lambda receives an event with a pre-signed inputS3Url. It fetches the original, transforms it, and calls WriteGetObjectResponse to stream the result back. Here is a correct PII-masking transform in Python:
import boto3
import re
import urllib3

s3 = boto3.client("s3")
http = urllib3.PoolManager()

# Mask anything that looks like a US SSN.
SSN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")


def handler(event, context):
    ctx = event["getObjectContext"]

    # S3 hands us a pre-signed URL to the ORIGINAL object.
    resp = http.request("GET", ctx["inputS3Url"])
    original = resp.data

    transformed = SSN.sub(b"***-**-****", original)

    # Stream the transformed bytes back to the caller.
    s3.write_get_object_response(
        Body=transformed,
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"status_code": 200}
The function’s execution role needs s3-object-lambda:WriteGetObjectResponse. Because the inputS3Url is pre-signed by S3 itself, the function does not need separate s3:GetObject on the bucket for the standard fetch path — but grant it if your code makes additional S3 calls (e.g., reading a redaction config object).
The event S3 hands your Lambda is small but every field matters. Confuse outputRoute and outputToken and the response goes nowhere:
| Event field | Type | What it is | Used for |
|---|---|---|---|
| getObjectContext.inputS3Url | string | Pre-signed URL to the ORIGINAL object | Fetch the source bytes |
| getObjectContext.outputRoute | string | Where to send the transformed response | RequestRoute in WriteGetObjectResponse |
| getObjectContext.outputToken | string | One-time token authorizing the write-back | RequestToken in WriteGetObjectResponse |
| userRequest.url | string | The original request URL | Inspect key/query |
| userRequest.headers | map | Caller’s headers (incl. Range) | Decide range handling |
| userIdentity | object | Who made the request | Per-caller transform logic |
| configuration.payload | string | Static payload from the OLAP config | Pass tunables to the function |
WriteGetObjectResponse takes more than just a body — these are the parameters that turn a transform into a correct HTTP response:
| Parameter | Required | What it controls | Note |
|---|---|---|---|
| RequestRoute | yes | Routes the response to the caller | From outputRoute |
| RequestToken | yes | Authorizes this write-back | From outputToken; single-use |
| Body | yes (for 200) | The transformed bytes | Stream for large objects |
| StatusCode | no | HTTP status to return | Use 501/416 to reject ranges |
| ContentType | no | MIME of the result | Set when you change format |
| ContentLength | no | Length of the result | Required by some clients |
| ErrorCode / ErrorMessage | no | Structured error to the caller | For rejections |
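One way to keep these parameters consistent is to assemble them in a single place. The sketch below builds the keyword arguments for a success or a rejection; build_response_kwargs is a hypothetical helper, and the choice to always set ContentLength on success is an assumption made for illustration, not a requirement stated by S3.

```python
from typing import Optional


def build_response_kwargs(
    route: str,
    token: str,
    body: Optional[bytes] = None,
    status: int = 200,
    content_type: Optional[str] = None,
    error_code: Optional[str] = None,
    error_message: Optional[str] = None,
) -> dict:
    """Assemble keyword arguments for write_get_object_response (sketch)."""
    kwargs = {"RequestRoute": route, "RequestToken": token, "StatusCode": status}
    if status == 200:
        if body is None:
            raise ValueError("a 200 response needs a Body")
        kwargs["Body"] = body
        kwargs["ContentLength"] = len(body)
        if content_type:
            kwargs["ContentType"] = content_type
    else:
        # Rejections carry a structured error instead of a body.
        if error_code:
            kwargs["ErrorCode"] = error_code
        if error_message:
            kwargs["ErrorMessage"] = error_message
    return kwargs


ok = build_response_kwargs(
    "route-abc", "token-xyz", body=b"masked", content_type="text/csv"
)
rejected = build_response_kwargs(
    "route-abc", "token-xyz", status=501,
    error_code="NotImplemented", error_message="Range not supported",
)
print(sorted(ok))
print(sorted(rejected))
```

Inside a handler, the result would be passed as s3.write_get_object_response(**ok), with the route and token taken from the event's getObjectContext.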
The execution role’s least-privilege set is small — grant exactly these, nothing broader:
| Permission | On | Why | When to omit |
|---|---|---|---|
| s3-object-lambda:WriteGetObjectResponse | the OLAP | Stream bytes back to the caller | Never (required) |
| s3:GetObject | the supporting AP / bucket | Only if you call S3 beyond inputS3Url | Omit if you only use the pre-signed URL |
| s3:GetObject (config object) | a small config key | Read a redaction map / allow-list | Omit if config is in env vars |
| logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents (basic exec role) | CloudWatch Logs | Function logging | Never omit |
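As an illustration of that permission set, the following sketch generates an IAM policy document in Python. The statement Sids, the execution_role_policy helper, and the config object key are hypothetical; scoping WriteGetObjectResponse to the OLAP ARN follows the table, though many published examples use "*" for that resource instead.

```python
import json


def execution_role_policy(olap_arn: str, config_object_arn: str = "") -> dict:
    """Build a least-privilege policy document for the transform's role (sketch)."""
    statements = [
        {
            "Sid": "WriteBackToCaller",
            "Effect": "Allow",
            "Action": "s3-object-lambda:WriteGetObjectResponse",
            "Resource": olap_arn,
        },
        {
            "Sid": "FunctionLogging",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "*",
        },
    ]
    if config_object_arn:
        # Only needed when the function itself reads a config object from S3.
        statements.append(
            {
                "Sid": "ReadRedactionConfig",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": config_object_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


policy = execution_role_policy(
    "arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/pii-redacted-olap",
    # Hypothetical config key, used only for illustration.
    "arn:aws:s3:::datalake-shared-prod/config/redaction-map.json",
)
print(json.dumps(policy, indent=2))
```

Calling the helper without a config ARN yields the two-statement minimum, which matches the case where the function only uses the pre-signed URL.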
Wiring Object Lambda, supporting access points, and Range handling
First create the supporting access point (a plain access point), then the Object Lambda Access Point that references it. The Object Lambda Access Point’s SupportingAccessPoint must be the full ARN of the supporting access point:
# 1. Supporting access point on the bucket.
aws s3control create-access-point \
--account-id 111122223333 \
--name pii-supporting-ap \
--bucket datalake-shared-prod
# 2. Object Lambda Access Point referencing it.
aws s3control create-access-point-for-object-lambda \
--account-id 111122223333 \
--name pii-redacted-olap \
--configuration '{
"SupportingAccessPoint": "arn:aws:s3:us-east-1:111122223333:accesspoint/pii-supporting-ap",
"TransformationConfigurations": [
{
"Actions": ["GetObject"],
"ContentTransformation": {
"AwsLambda": {
"FunctionArn": "arn:aws:lambda:us-east-1:111122223333:function:pii-redactor"
}
}
}
]
}'
Clients then read through the Object Lambda Access Point ARN, and S3 invokes the transform transparently:
aws s3api get-object \
--bucket arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/pii-redacted-olap \
--key customers/2026/records.csv \
./redacted.csv
The OLAP configuration covers more than plain GetObject. TransformationConfigurations accepts the actions GetObject, HeadObject, ListObjects, and ListObjectsV2, while range and part-number support are switched on separately through the configuration’s AllowedFeatures list (GetObject-Range, GetObject-PartNumber). Knowing which of these your client paths use prevents “transform didn’t run” surprises:
| Action or feature | When it fires | Common need | If unconfigured |
|---|---|---|---|
| GetObject (action) | Plain GET | Redaction, conversion | Object returned untransformed |
| GetObject-Range (allowed feature) | Client sends a Range header | Range-safe slicing | SDK multipart fails or gets raw bytes |
| GetObject-PartNumber (allowed feature) | Client sends partNumber | Multipart-aware reads | Same as above |
| HeadObject (action) | HEAD request | Adjust Content-Length/metadata | HEAD reflects original, not transformed |
| ListObjects / ListObjectsV2 (actions) | Listing through the OLAP | Filter/redact listings | List shows original keys |
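A small generator can keep the configuration JSON consistent across environments. The sketch below is an assumption-laden illustration: olap_configuration is a hypothetical helper, and it places GetObject-Range and GetObject-PartNumber under AllowedFeatures, which is how the s3control API models them as far as I know; confirm the placement against the current API reference before relying on it.

```python
import json


def olap_configuration(
    supporting_ap_arn: str,
    function_arn: str,
    actions=("GetObject",),
    allow_ranges: bool = False,
) -> dict:
    """Build the --configuration document for an Object Lambda Access Point (sketch)."""
    config = {
        "SupportingAccessPoint": supporting_ap_arn,
        "TransformationConfigurations": [
            {
                "Actions": list(actions),
                "ContentTransformation": {
                    "AwsLambda": {"FunctionArn": function_arn}
                },
            }
        ],
    }
    if allow_ranges:
        # Only enable these when the transform is genuinely range-safe.
        config["AllowedFeatures"] = ["GetObject-Range", "GetObject-PartNumber"]
    return config


config = olap_configuration(
    "arn:aws:s3:us-east-1:111122223333:accesspoint/pii-supporting-ap",
    "arn:aws:lambda:us-east-1:111122223333:function:pii-redactor",
    actions=("GetObject", "HeadObject"),
    allow_ranges=True,
)
print(json.dumps(config, indent=2))
```

The printed JSON has the same shape as the --configuration argument shown for create-access-point-for-object-lambda.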
Handling Range and partial reads
This is where naive Object Lambda functions break in production. If a client sends a Range header or a partNumber query parameter (the AWS SDKs do this constantly for large objects and multipart downloads), your function must handle it. You have two correct options:
- Declare support and implement it: enable GetObject-Range and GetObject-PartNumber in the OLAP configuration’s AllowedFeatures, read the requested range from the original, and apply your transform to that slice. Only do this when your transform is range-safe (e.g., simple byte substitution that does not change length).
- Reject ranges for transforms that are not range-safe (format conversion, compression, anything that changes byte offsets) by returning a 501 so callers fall back to a full GET.
A length-preserving transform like fixed-width masking is range-safe; a format conversion (CSV to Parquet, or gzip) is not, because byte offsets in the output no longer map to the input. Know which one you have before you enable range support. A safe rejection looks like this:
headers = event.get("userRequest", {}).get("headers", {})
# HTTP header names are case-insensitive, so normalize before checking.
if any(name.lower() == "range" for name in headers):
    s3.write_get_object_response(
        StatusCode=501,
        ErrorCode="NotImplemented",  # keep the error code consistent with the 501 status
        ErrorMessage="Range requests are not supported by this transform",
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"status_code": 501}
Decide range strategy by the nature of the transform — this is the decision table to keep next to your function:
| Transform | Changes byte length? | Range-safe? | Strategy | Why |
|---|---|---|---|---|
| Fixed-width masking (***-**-****) | No | Yes | Support ranges; transform the slice | Offsets preserved |
| Variable redaction ([REDACTED]) | Yes | No | Reject ranges (501) | Output offsets diverge from input |
| CSV → Parquet | Yes | No | Reject ranges | Whole-object reframe |
| gzip / compression | Yes | No | Reject ranges | Stream is not slice-addressable |
| Watermark image | Maybe | Usually no | Reject ranges | Re-encode changes bytes |
| Row-level filter (drop rows) | Yes | No | Reject ranges | Row boundaries don’t map to byte ranges |
| Header/metadata rewrite only | No | Yes | Support ranges | Body unchanged |
When you reject, pick the status that makes the client do the right thing:
| Status to return | Client behavior | Use for |
|---|---|---|
| 501 Not Implemented | SDK retries with a full GET | “This transform never supports ranges” |
| 416 Range Not Satisfiable | Client treats range as invalid | The specific requested range is invalid |
| 200 with full body (ignore range) | Client may mis-assemble multipart | Avoid; breaks SDK multipart |
| 206 Partial Content (true range) | Client accepts the slice | Only when genuinely range-safe |
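The status choice can be expressed as a small pure function, which also makes it easy to unit-test away from Lambda. This is a sketch under stated assumptions: resolve_range and status_for are hypothetical helpers, only single "bytes=start-end" ranges are understood, and anything else (including multi-range requests) is treated as unsatisfiable.

```python
import re
from typing import Optional, Tuple

_RANGE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Return inclusive (start, end) offsets, or None if the range cannot be served."""
    match = _RANGE.match(header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form: bytes=-N means the final N bytes of the object.
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1
    if start >= size or start > end:
        return None
    return start, min(end, size - 1)


def status_for(header: Optional[str], size: int, range_safe: bool) -> int:
    """Pick the HTTP status for a request, following the table above."""
    if header is None:
        return 200
    if not range_safe:
        return 501
    return 206 if resolve_range(header, size) is not None else 416


print(resolve_range("bytes=0-99", 1000))
print(status_for("bytes=0-99", 1000, range_safe=False))
```

Deciding the status before fetching any bytes keeps the rejection path cheap: a non-range-safe transform never needs to download the original just to return 501.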
Multi-Region Access Points: one global endpoint
A Multi-Region Access Point (MRAP) is a single global endpoint that routes requests to whichever underlying bucket — across multiple Regions — is closest and healthy. You attach buckets in different Regions, wire S3 Cross-Region Replication (CRR) between them for active-active, and clients use one hostname that ends in .accesspoint.s3-global.amazonaws.com. S3 routes each request to the lowest-latency available bucket using latency-based routing built on AWS Global Accelerator under the hood.
Create the MRAP over two regional buckets:
aws s3control create-multi-region-access-point \
--account-id 111122223333 \
--details '{
"Name": "global-assets-mrap",
"Regions": [
{ "Bucket": "assets-use1" },
{ "Bucket": "assets-euw1" }
]
}'
This is asynchronous; poll the request token until it reports SUCCEEDED, then read the generated alias (the global hostname prefix):
aws s3control list-multi-region-access-points \
--account-id 111122223333 \
--query 'AccessPoints[?Name==`global-assets-mrap`].[Name,Alias,Status]'
For active-active you must configure two-way replication so a write in either Region propagates to the other. Enable replication on both buckets, turn on bidirectional sync (replica modifications and delete-marker replication as your data model requires), and ideally enable S3 Replication Time Control (RTC) for a 15-minute replication SLA. Without CRR, an MRAP is just latency routing over divergent data — which is a correctness bug waiting to happen.
Failover is automatic for read availability: if S3 detects a Regional impairment, it routes around it. But “the object exists in the other Region” is your responsibility via replication. MRAP routes; it does not copy. Replication copies.
An MRAP goes through several states during creation/deletion; polling the wrong way looks like a hang:
| MRAP status | Meaning | What to do | Typical duration |
|---|---|---|---|
| CREATING | Anycast endpoint being provisioned | Poll; do not use yet | up to ~30 min |
| READY | Endpoint live and routing | Use it | n/a |
| DELETING | Tear-down in progress | Wait before reusing the name | minutes |
| PARTIALLY_CREATED | Some regions failed to attach | Investigate the failed region | n/a |
| FAILED | Creation failed | Read the failure reason; recreate | n/a |
The replication options that make an MRAP correct rather than merely routed — each one closes a specific divergence gap:
| Replication option | What it does | Default | Enable when | Cost impact |
|---|---|---|---|---|
| CRR rule (one-way) | Source → destination copy | off | Read-only replica region | Per-GB transfer + request |
| Two-way (bidirectional) | Writes in either region propagate | off | Active-active MRAP | Doubles replication traffic |
| Replica modification sync | Replicate metadata/ACL changes on replicas | off | Active-active correctness | Minor |
| Delete-marker replication | Propagate delete markers | off | If deletes must mirror | Minor |
| RTC (Replication Time Control) | 15-min SLA + metrics | off | You need bounded lag | Per-GB RTC fee |
| Replica ownership override | Destination account owns replicas | off | Cross-account replication | None |
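To show how these options combine for an active-active pair, here is a sketch that builds one replication configuration per bucket. The helper names, rule IDs, and the IAM role ARN are hypothetical; the dictionary keys follow the S3 ReplicationConfiguration structure as I understand it, and both buckets are assumed to have versioning enabled already.

```python
def replication_rule(rule_id: str, destination_bucket_arn: str, rtc: bool = True) -> dict:
    """One rule with delete-marker replication, replica modification sync, and optional RTC."""
    destination = {"Bucket": destination_bucket_arn}
    if rtc:
        # RTC and replication metrics are enabled together, both at 15 minutes.
        destination["ReplicationTime"] = {"Status": "Enabled", "Time": {"Minutes": 15}}
        destination["Metrics"] = {"Status": "Enabled", "EventThreshold": {"Minutes": 15}}
    return {
        "ID": rule_id,
        "Priority": 1,
        "Status": "Enabled",
        "Filter": {},
        "DeleteMarkerReplication": {"Status": "Enabled"},
        "SourceSelectionCriteria": {"ReplicaModifications": {"Status": "Enabled"}},
        "Destination": destination,
    }


def two_way_replication(role_arn: str, bucket_a: str, bucket_b: str) -> dict:
    """Map each source bucket name to the configuration that replicates to the other."""
    return {
        bucket_a: {
            "Role": role_arn,
            "Rules": [replication_rule(f"{bucket_a}-to-{bucket_b}", f"arn:aws:s3:::{bucket_b}")],
        },
        bucket_b: {
            "Role": role_arn,
            "Rules": [replication_rule(f"{bucket_b}-to-{bucket_a}", f"arn:aws:s3:::{bucket_a}")],
        },
    }


configs = two_way_replication(
    "arn:aws:iam::111122223333:role/s3-crr-role",  # hypothetical role name
    "assets-use1",
    "assets-euw1",
)
for source, config in configs.items():
    print(source, "->", config["Rules"][0]["Destination"]["Bucket"])
```

Each value could be passed as the ReplicationConfiguration argument of put_bucket_replication for its source bucket; generating both directions from one function keeps the two rules symmetric.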
Request routing, failover, and SigV4A signing
MRAP supports two routing controls. By default it uses latency-based routing across all active Regions. You can also flip a Region’s routing status to drain it (for maintenance or a controlled failover) using the routing-control API:
# --mrap takes the MRAP ARN (alias form), not the friendly name.
aws s3control submit-multi-region-access-point-routes \
--region us-east-1 \
--account-id 111122223333 \
--mrap arn:aws:s3::111122223333:accesspoint/mfzwi23gnjvgw.mrap \
--route-updates '[
{ "Bucket": "assets-euw1", "Region": "eu-west-1", "TrafficDialPercentage": 0 },
{ "Bucket": "assets-use1", "Region": "us-east-1", "TrafficDialPercentage": 100 }
]'
Setting TrafficDialPercentage to 0 drains a Region without deleting anything — the canonical way to do a planned, reversible failover.
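A drain is easier to get right when the route list is generated rather than typed. The sketch below builds the --route-updates payload for draining one Region; drain_routes is a hypothetical helper, and it assumes the dial accepts only 0 or 100 and that at least one Region must stay active.

```python
import json


def drain_routes(regions: dict, drain_region: str) -> list:
    """Build route updates that drain one Region and keep the rest at 100.

    regions maps a Region name to the bucket attached in that Region.
    """
    if drain_region not in regions:
        raise KeyError(f"{drain_region} is not part of this MRAP")
    if len(regions) < 2:
        raise ValueError("draining the only Region would leave nothing to serve traffic")
    return [
        {
            "Bucket": bucket,
            "Region": region,
            "TrafficDialPercentage": 0 if region == drain_region else 100,
        }
        for region, bucket in regions.items()
    ]


routes = drain_routes(
    {"us-east-1": "assets-use1", "eu-west-1": "assets-euw1"},
    drain_region="eu-west-1",
)
print(json.dumps(routes, indent=2))
```

Reversing the drain is the same call with a different drain_region, or with every dial set back to 100, which is what makes this a reversible failover rather than a topology change.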
The routing controls and what each is for — confusing “drain” with “delete” is how people cause outages during maintenance:
| Control | API / setting | Effect | Reversible? | Use for |
|---|---|---|---|---|
| Latency routing (default) | none | Closest healthy region serves | n/a | Normal active-active |
| TrafficDialPercentage=0 | submit-...-routes | Drain a region (no new traffic) | Yes (dial back up) | Planned maintenance / failover |
| TrafficDialPercentage=100 | submit-...-routes | Full traffic to a region | Yes | Restore after drain |
| Automatic health failover | built-in | Route around an impaired region | n/a (auto) | Region outage |
| Remove a region | Not submit-multi-region-access-point-routes; an MRAP’s bucket list is fixed at creation, so delete and recreate the MRAP | Region leaves the topology | No | Permanent topology change |
SigV4A is mandatory
This is the detail that trips up every first MRAP integration. Because a single global request can be served from any Region, it cannot be signed with classic SigV4 (which is Region-scoped). MRAP requests must use Signature Version 4A (SigV4A), the multi-Region signing variant. The recent AWS SDKs and CLI v2 support SigV4A, but you typically must enable the CRT (Common Runtime) auth dependency. For the CLI:
# SigV4A for MRAP requires the CRT signing component.
pip install 'awscli[crt]' # or use a CLI v2 build with CRT bundled
# Address the MRAP by its ARN; the SDK selects SigV4A automatically.
aws s3api get-object \
--bucket arn:aws:s3::111122223333:accesspoint/mfzwi23gnjvgw.mrap \
--key images/logo.png \
./logo.png
Note the MRAP ARN form: arn:aws:s3::<account>:accesspoint/<alias>.mrap — no Region segment, because it is global. If you see SignatureDoesNotMatch on your first MRAP call, the cause is almost always a SigV4A-incapable signer; install the CRT extra and retry.
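Because the three ARN forms differ only in service and Region segments, a few helper functions can prevent the wrong form from reaching a client. This is a sketch with hypothetical helper names; needs_sigv4a is a simple string check on the ARN, not a statement of how any SDK actually selects its signer.

```python
def access_point_arn(account: str, region: str, name: str) -> str:
    """Standard (regional) access point ARN."""
    return f"arn:aws:s3:{region}:{account}:accesspoint/{name}"


def object_lambda_arn(account: str, region: str, name: str) -> str:
    """Object Lambda Access Point ARN (note the s3-object-lambda service segment)."""
    return f"arn:aws:s3-object-lambda:{region}:{account}:accesspoint/{name}"


def mrap_arn(account: str, alias: str) -> str:
    """Multi-Region Access Point ARN: empty Region segment, alias ending in .mrap."""
    if not alias.endswith(".mrap"):
        alias = f"{alias}.mrap"
    return f"arn:aws:s3::{account}:accesspoint/{alias}"


def needs_sigv4a(arn: str) -> bool:
    """True when the ARN has the global MRAP shape described above."""
    parts = arn.split(":", 5)
    if len(parts) != 6:
        return False
    _, _, service, region, _, resource = parts
    return service == "s3" and region == "" and resource.endswith(".mrap")


global_arn = mrap_arn("111122223333", "mfzwi23gnjvgw")
regional_arn = access_point_arn("111122223333", "us-east-1", "pii-supporting-ap")
print(global_arn, needs_sigv4a(global_arn))
print(regional_arn, needs_sigv4a(regional_arn))
```

A check like this is handy in a pre-flight script: if it reports that SigV4A is needed and the CRT dependency is absent, you can fail with a clear message instead of a SignatureDoesNotMatch from S3.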
SigV4 vs SigV4A side by side — this is the table that explains the failure you’ll inevitably hit once:
| Property | SigV4 (classic) | SigV4A (multi-region) |
|---|---|---|
| Scope | Single region | Multiple regions (*) |
| Used by | Standard S3, regional APs | MRAP |
| Dependency | Built into all SDKs | Requires CRT (e.g. awscli[crt]) |
| Symptom when wrong | n/a | SignatureDoesNotMatch on first MRAP call |
| Region in signature | The bucket’s region | * (any region) |
| How to enable | default | Install CRT; SDK auto-selects for .mrap ARNs |
Architecture at a glance
Trace the system left to right and it tells the whole story. On the left, two consumers approach the same authoritative data through different doors: an internal analytics role arrives over a VPC-bound access point (no public DNS path, reachable only via the S3 interface endpoint in vpc-0abc…), while an external partner arrives at an Object Lambda Access Point over the public edge. The center is the access plane: every door funnels into the bucket’s delegation policy, which authorizes nothing on its own — it simply trusts s3:DataAccessPointAccount. The partner’s path passes through the OLAP, which invokes the PII-redactor Lambda; that function fetches the original via the pre-signed inputS3Url, masks email and device fields, and streams the result back with WriteGetObjectResponse. The internal path goes straight to objects. On the right sits the data tier: the primary bucket datalake-shared-prod in us-east-1, plus — for the global-assets workload — a Multi-Region Access Point that anycasts reads to the closest healthy regional bucket, kept consistent by two-way Cross-Region Replication with RTC.
Five numbered failure points are marked on the hops where they actually bite. (1) is the bucket-policy delegation: forget s3:DataAccessPointAccount and every access point returns AccessDenied. (2) is the VPC-bound access point with no matching endpoint — every request times out. (3) is the OLAP Range trap — a naive transform mis-serves SDK multipart downloads. (4) is the Lambda write-back — a mismatched outputRoute/outputToken and the bytes go nowhere. (5) is the MRAP signing/replication pair — SigV4A missing gives SignatureDoesNotMatch, and un-replicated buckets make the “one global object” diverge. Read the legend as symptom · confirm · fix and you have the diagnostic map next to the architecture.
Real-world scenario
A media analytics platform team — call it PrismFeed — ran a single customer-events-prod bucket shared by 30+ internal teams plus two external data partners. The bucket policy had grown past 18 KB and was within sight of the 20 KB hard limit; the last partner onboarding had failed because the policy would not save. Worse, the same dataset had to be served two ways: internal analysts got raw event records, but a downstream BI partner was contractually forbidden from seeing raw email addresses and device IDs. The team had been solving this by running a nightly Glue job that wrote a second, redacted copy of every object to a partner/ prefix — doubling storage for 400 TB of events and introducing a 24-hour staleness gap that the partner kept complaining about.
The constraint: stay within the policy size limit, eliminate the duplicate redacted copy, and serve both audiences from one authoritative object — while keeping the partner traffic governed and the raw data inside a specific VPC.
They restructured around access points and Object Lambda. The bucket policy was rewritten to a single 400-byte delegation statement using s3:DataAccessPointAccount. Internal teams each got a VPC-bound access point scoped to their prefix. The partner got an Object Lambda Access Point whose transform masked email and device fields on the fly, eliminating the nightly Glue job and the 400 TB duplicate entirely — and the partner now saw live data with zero staleness. The masking function was deliberately length-preserving so range requests stayed safe:
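import re
# Capture the field value so it can be overwritten with same-length filler.
EMAIL = re.compile(rb'("email":")([^"]*)(")')
DEVICE = re.compile(rb'("device_id":")([^"]*)(")')
def _same_length(match):
    # Replace every byte of the value with '*' so byte offsets do not shift.
    return match.group(1) + b"*" * len(match.group(2)) + match.group(3)
def mask(chunk: bytes) -> bytes:
    chunk = EMAIL.sub(_same_length, chunk)
    return DEVICE.sub(_same_length, chunk)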
The result: the 18 KB policy became one line, partner onboarding stopped being a policy-size gamble, S3 storage for the dataset dropped by roughly half (the 400 TB redacted copy was eliminated), and the staleness complaint disappeared because redaction now happened at read time on the single live object. The one real cost they accepted was Lambda invocation on the partner read path, which, for a partner pulling a few thousand objects a day, was a rounding error next to 400 TB of duplicated storage.
The migration as a before/after ledger, because the deltas are the lesson:
| Dimension | Before (monolith + Glue copy) | After (access points + OLAP) | Net effect |
|---|---|---|---|
| Bucket policy size | ~18 KB, near the 20 KB wall | ~400 bytes (delegation only) | Onboarding no longer a size gamble |
| Partner data freshness | 24 h stale (nightly job) | Live (read-time redaction) | Complaint eliminated |
| Storage footprint | 400 TB raw + 400 TB redacted copy | 400 TB raw only | ~50% total storage cut |
| Redaction mechanism | Nightly Glue job | OLAP Lambda per GET | No batch pipeline to operate |
| Per-tenant network | Conditions in one policy | VPC-bound AP per team | Clean isolation |
| New cost introduced | Glue compute nightly | Lambda per partner GET | Far smaller; scales with reads |
| Blast radius of a bad edit | All 30+ teams | One access point | Contained |
Advantages and disadvantages
Decomposing a shared bucket into access points, OLAP, and MRAP is overwhelmingly the right move at scale — but each capability carries a cost you should accept with eyes open:
| Advantages | Disadvantages |
|---|---|
| Bucket policy collapses to one delegation line; per-tenant policies get their own 20 KB budget | One more layer to reason about — a request now traverses bucket policy and AP policy and consumer IAM |
| Per-access-point blast radius — a bad edit breaks one consumer, not all | More objects to govern (N access points, OLAPs, MRAPs) and to monitor |
| VPC-bound access points give a true no-public-path network control | A VPC-bound AP with no matching endpoint is silently unreachable |
| Object Lambda kills the duplicate-copy pattern — one authoritative object, transformed per request | Lambda invocation + latency on every GET through the OLAP; a hot read path can dominate cost |
| Cross-account sharing lives in one revocable AP policy; bucket stays untouched | Cross-account still needs both sides aligned (AP policy + consumer IAM) — silent AccessDenied if not |
| MRAP gives one global endpoint with automatic read failover | MRAP routes but does not copy — divergence if CRR lags or is misconfigured |
| Access points themselves are free and don’t change S3 request pricing | MRAP adds a per-GB routing charge; CRR adds transfer + (with RTC) an SLA fee |
| Request metrics + CloudTrail attribute every request to the door it came through | SigV4A (CRT) is an easy-to-miss prerequisite — first call fails with SignatureDoesNotMatch |
The model is right for any shared bucket fronting a data lake, a partner-export surface, or a multi-tenant storage tier — anywhere the bucket policy is a contention point or PII forces two views of one dataset. It is overkill for a private bucket with a handful of IAM roles, where a plain bucket policy is simpler. It bites hardest when teams forget the second side of a cross-account grant, ship an OLAP transform that ignores Range, or stand up an MRAP without actually wiring replication — all three are “works in the demo, fails in production” traps this article exists to prevent.
Hands-on lab
Stand up a real access-point decomposition, then an Object Lambda redaction, end to end. The lab is free-tier-friendly (one small object, one tiny Lambda). Run it in CloudShell or any shell with the AWS CLI v2 and credentials. Step 1 reads your account ID into $ACCT, so there is no placeholder account ID to replace in the lab commands.
Step 1 — Variables and a shared bucket.
ACCT=$(aws sts get-caller-identity --query Account --output text)
REGION=us-east-1
BUCKET=datalake-lab-$ACCT
aws s3api create-bucket --bucket $BUCKET --region $REGION
aws s3api put-public-access-block --bucket $BUCKET \
--public-access-block-configuration \
BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
Expected: the bucket is created and all four BPA flags are on.
Step 2 — Put a sample object with fake PII.
printf '{"name":"Asha","email":"asha@example.com","device_id":"dev-9921","ssn":"123-45-6789"}\n' > rec.json
aws s3api put-object --bucket $BUCKET --key customers/rec.json --body rec.json
Step 3 — Reduce the bucket policy to a delegation statement.
cat > delegate.json <<EOF
{ "Version":"2012-10-17","Statement":[{
"Sid":"DelegateToAPs","Effect":"Allow",
"Principal":{"AWS":"arn:aws:iam::$ACCT:root"},
"Action":"s3:*",
"Resource":["arn:aws:s3:::$BUCKET","arn:aws:s3:::$BUCKET/*"],
"Condition":{"StringEquals":{"s3:DataAccessPointAccount":"$ACCT"}}
}]}
EOF
aws s3api put-bucket-policy --bucket $BUCKET --policy file://delegate.json
Expected: the bucket now authorizes only via access points.
Step 4 — Create a standard access point and read through it.
aws s3control create-access-point --account-id $ACCT --name lab-ap --bucket $BUCKET
aws s3api get-object \
--bucket arn:aws:s3:$REGION:$ACCT:accesspoint/lab-ap \
--key customers/rec.json ./via-ap.json
cat ./via-ap.json # full record, unredacted
Expected: the object reads back through the access-point ARN — proof the delegation works.
Step 5 — Deploy a redactor Lambda.
cat > app.py <<'EOF'
import boto3, re, urllib3
s3 = boto3.client("s3"); http = urllib3.PoolManager()
EMAIL = re.compile(rb'"email":"[^"]*"'); SSN = re.compile(rb"\b\d{3}-\d{2}-\d{4}\b")
def handler(event, context):
    ctx = event["getObjectContext"]
    # Fetch the original object through the pre-signed URL that S3 supplies.
    data = http.request("GET", ctx["inputS3Url"]).data
    data = EMAIL.sub(b'"email":"[REDACTED]"', data)
    data = SSN.sub(b"***-**-****", data)
    s3.write_get_object_response(Body=data,
        RequestRoute=ctx["outputRoute"], RequestToken=ctx["outputToken"])
    return {"status_code": 200}
EOF
zip fn.zip app.py
# Assume an execution role 'olap-lab-role' with WriteGetObjectResponse + basic logging exists.
aws lambda create-function --function-name pii-redactor-lab \
--runtime python3.12 --handler app.handler --zip-file fileb://fn.zip \
--role arn:aws:iam::$ACCT:role/olap-lab-role --timeout 30
Step 6 — Create the supporting AP and the Object Lambda Access Point.
aws s3control create-access-point --account-id $ACCT --name lab-support-ap --bucket $BUCKET
aws s3control create-access-point-for-object-lambda --account-id $ACCT \
--name lab-redacted-olap --configuration "{
\"SupportingAccessPoint\":\"arn:aws:s3:$REGION:$ACCT:accesspoint/lab-support-ap\",
\"TransformationConfigurations\":[{\"Actions\":[\"GetObject\"],
\"ContentTransformation\":{\"AwsLambda\":{\"FunctionArn\":
\"arn:aws:lambda:$REGION:$ACCT:function:pii-redactor-lab\"}}}]}"
Step 7 — Read through the OLAP and confirm masking.
aws s3api get-object \
--bucket arn:aws:s3-object-lambda:$REGION:$ACCT:accesspoint/lab-redacted-olap \
--key customers/rec.json ./via-olap.json
diff <(cat ./via-ap.json) <(cat ./via-olap.json) && echo "NO MASKING (BUG)" || echo "MASKED OK"
cat ./via-olap.json # email + ssn now redacted
Expected: MASKED OK, and the OLAP output shows "email":"[REDACTED]" and ***-**-**** while the raw AP read did not. This is the exact check that catches a misconfigured OLAP topology.
Validation checklist — what each step proved:
| Step | What you did | What it proves |
|---|---|---|
| 3 | Bucket policy → delegation only | Authorization moved out of the bucket |
| 4 | Read via standard AP ARN | The delegation actually works |
| 5–6 | OLAP over a supporting AP | The three-layer topology is wired right |
| 7 | diff raw vs OLAP output | The transform actually changes bytes |
Cleanup (avoid lingering charges):
aws s3control delete-access-point-for-object-lambda --account-id $ACCT --name lab-redacted-olap
aws s3control delete-access-point --account-id $ACCT --name lab-support-ap
aws s3control delete-access-point --account-id $ACCT --name lab-ap
aws lambda delete-function --function-name pii-redactor-lab
aws s3 rm s3://$BUCKET --recursive
aws s3api delete-bucket --bucket $BUCKET
Cost note. One object, a handful of Lambda invocations, and a few requests — this lab costs effectively nothing (well under ₹10) and deleting the bucket + functions stops everything. Access points and OLAPs carry no standing charge.
Common mistakes & troubleshooting
This is the playbook — bookmark it. First as a scannable table, then the full reasoning for the entries that bite hardest. Confirm each layer is doing exactly what you intended before you hand the topology to consumers.
# 1. Access point exists and carries the VPC + BPA you expect.
aws s3control get-access-point \
--account-id 111122223333 --name analytics-internal-ap \
--query '{vpc:VpcConfiguration, bpa:PublicAccessBlockConfiguration}'
# 2. The access point policy is the small, scoped one (not the monolith).
aws s3control get-access-point-policy \
--account-id 111122223333 --name finance-reports-rw-ap
# 3. Object Lambda actually transforms: original vs. transformed bytes differ.
aws s3api get-object --bucket datalake-shared-prod --key customers/2026/records.csv ./raw.csv
aws s3api get-object \
--bucket arn:aws:s3-object-lambda:us-east-1:111122223333:accesspoint/pii-redacted-olap \
--key customers/2026/records.csv ./masked.csv
diff <(head -c 200 ./raw.csv) <(head -c 200 ./masked.csv) && echo "NO MASKING" || echo "MASKED OK"
# 4. MRAP is READY and reports both Regions.
aws s3control get-multi-region-access-point \
--account-id 111122223333 --name global-assets-mrap \
--query 'AccessPoint.{status:Status, regions:Regions[].Region}'
# 5. Replication is keeping both buckets in sync (expect low, steady latency).
# Replication metrics are published per rule, so supply all three dimensions.
aws cloudwatch get-metric-statistics \
--namespace AWS/S3 --metric-name ReplicationLatency \
--dimensions Name=SourceBucket,Value=assets-use1 \
Name=DestinationBucket,Value=assets-euw1 Name=RuleId,Value=<rule-id> \
--start-time "$(date -u -d '-1 hour' +%FT%TZ)" \
--end-time "$(date -u +%FT%TZ)" \
--period 300 --statistics Maximum
The symptom → root cause → confirm → fix playbook — read at 2am, act in minutes:
| # | Symptom | Root cause | Confirm (exact cmd / path) | Fix |
|---|---|---|---|---|
| 1 | Every AP request returns AccessDenied, even with a perfect AP policy | Bucket policy missing the delegation statement | aws s3api get-bucket-policy --bucket <b> shows no s3:DataAccessPointAccount | Add the DelegateToAccessPoints statement |
| 2 | Cross-account consumer gets AccessDenied | Only one side of the pair allows it | get-access-point-policy and the consumer’s IAM policy | Allow on both AP policy and consumer IAM |
| 3 | VPC-bound AP times out / connection refused | No S3 endpoint in that VPC | aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc> | Create a gateway/interface S3 endpoint |
| 4 | OLAP returns the original (unredacted) bytes | OLAP points at the wrong supporting AP, or transform no-op’d | diff raw.csv masked.csv → identical | Fix SupportingAccessPoint ARN; verify transform |
| 5 | SDK multipart download corrupts / fails through OLAP | Transform ignores Range/partNumber | Client sends Range; function doesn’t handle it | Support range (if safe) or reject with 501 |
| 6 | OLAP GetObject returns 500 | Lambda error or missing WriteGetObjectResponse perm | Lambda CloudWatch logs; check exec role | Grant s3-object-lambda:WriteGetObjectResponse; fix code |
| 7 | First MRAP call: SignatureDoesNotMatch | SigV4A signer not available | Using non-CRT CLI/SDK | pip install 'awscli[crt]' / enable CRT |
| 8 | MRAP serves stale/diverging data between regions | Replication not configured or lagging | get-metric-statistics ReplicationLatency high | Enable two-way CRR; turn on RTC |
| 9 | MRAP stuck in CREATING for ages | Normal async provisioning | list-multi-region-access-points → status | Wait (~30 min); poll the token |
| 10 | New AP can’t be reached despite policy | Account BPA blocks the public policy | get-public-access-block (account) | Use a VPC-bound AP or fix routing; don’t loosen BPA |
| 11 | put-bucket-policy fails: policy too large | Still authorizing in the bucket policy | Policy > 20 KB | Decompose into access points (this whole article) |
| 12 | Object reads fine by bucket name, fails by AP ARN | Wrong ARN/hostname form for the AP type | Compare ARN to the type table | Use the correct ARN form (standard vs OLAP vs MRAP) |
Entry 1 — every access-point request returns AccessDenied. Root cause: the bucket policy still authorizes directly (or was emptied) and lacks the delegation statement, so requests arriving via an access point are never permitted at the bucket. Confirm: aws s3api get-bucket-policy --bucket <b> shows no s3:DataAccessPointAccount/Arn condition. Fix: add the DelegateToAccessPoints statement. This is the number-one access-point onboarding failure — the access-point policy looks perfect, but the bucket never delegated to it.
Entry 3 — VPC-bound access point times out. Root cause: a VPC-bound access point has no public DNS path and is reachable only through an S3 gateway/interface endpoint in that VPC; if the endpoint is missing, every request hangs or is refused. Confirm: aws ec2 describe-vpc-endpoints --filters Name=vpc-id,Values=<vpc> returns nothing for S3. Fix: create a gateway endpoint (free, same-region) or an interface endpoint (PrivateLink) and ensure the route table / DNS is wired.
Entry 4 — Object Lambda returns the original bytes. Root cause: the OLAP’s SupportingAccessPoint points at the wrong access point, or the transform silently no-op’d (regex didn’t match, function returned the input). Confirm: diff the raw read against the OLAP read — if identical, the transform isn’t running. Fix: verify the SupportingAccessPoint is the full ARN of the correct supporting access point and that the Lambda actually mutates the bytes. This is exactly what Step 7 of the lab catches.
Entry 7 — SignatureDoesNotMatch on the first MRAP call. Root cause: MRAP requires SigV4A (region *), and your signer is a classic SigV4-only build. Confirm: you’re on a CLI/SDK without CRT. Fix: pip install 'awscli[crt]' or use a CLI v2 build with CRT bundled; the SDK then auto-selects SigV4A for .mrap ARNs.
Entry 8 — MRAP serves divergent data. Root cause: the MRAP routes to the closest region, but the regions hold different data because replication isn’t configured or has fallen behind — “global” over divergent buckets. Confirm: ReplicationLatency (or pending bytes/operations) is high. Fix: enable two-way CRR and turn on RTC for a bounded 15-minute SLA. Remember: MRAP routes, replication copies.
The error and limit reference
The codes and limits you’ll meet, what they mean for these features specifically, how to confirm, and the fix. The non-obvious ones are the cross-account AccessDenied (which has four possible causes) and SignatureDoesNotMatch (almost always SigV4A):
| Code / signal | Meaning here | Likely cause | How to confirm | Fix |
|---|---|---|---|---|
| `AccessDenied` (via AP) | Request through an AP was denied | Missing delegation, AP policy, or consumer IAM | Check all three layers | Align bucket delegation + AP policy + IAM |
| `AccessDenied` (cross-account) | One side didn’t allow it | Consumer IAM or AP policy missing the grant | `get-access-point-policy` + consumer IAM | Allow on both sides |
| `SignatureDoesNotMatch` (MRAP) | Region-scoped signature on a global endpoint | SigV4A not available | Using non-CRT signer | Install CRT; SDK uses SigV4A |
| `NoSuchAccessPoint` | AP ARN doesn’t resolve | Wrong name/region/account in the ARN | `list-access-points` | Correct the ARN form |
| `InvalidRequest` (OLAP) | Bad OLAP/transform config | Malformed `TransformationConfigurations` | `get-access-point-for-object-lambda` | Fix the configuration JSON |
| `501` (from OLAP) | Transform rejected a range | Your code returns `501` on `Range` | Inspect `WriteGetObjectResponse` | Intentional; clients fall back to full GET |
| `416 RangeNotSatisfiable` | Range invalid for the object | Bad range or non-range-safe transform | Function logic | Reject cleanly; document no-range support |
| Connection timeout (VPC-bound AP) | No reachable endpoint | Missing S3 VPC endpoint | `describe-vpc-endpoints` | Create the endpoint |
| `MalformedPolicy` (`put-bucket-policy`) | Bad delegation policy JSON | Syntax / wrong condition key | `put-bucket-policy` error text | Fix the JSON / condition key |
| `PolicyTooLarge` / size error | Bucket policy over 20 KB | Still authorizing in the bucket | Policy size | Decompose into access points |
| MRAP `FAILED` / `PARTIALLY_CREATED` | Creation didn’t fully succeed | A region failed to attach | `get-multi-region-access-point` | Read failure reason; recreate |
| High `ReplicationLatency` | Replicas lagging | CRR throttled or misconfigured | CloudWatch metric | Enable RTC; check CRR rules |
| `KMS.AccessDeniedException` (on read) | Consumer can’t decrypt | No `kms:Decrypt` on the key | CloudTrail KMS event | Grant `kms:Decrypt` to the principal/OLAP role |
| `503 SlowDown` | Request rate too high for a prefix | Hot prefix, not enough key spread | S3 request metrics | Spread keys across prefixes (not APs) |
| `InvalidAccessPointAliasError` | Alias used where ARN expected | Wrong identifier form | Compare to the ARN/alias table | Use the correct ARN form for the AP type |
| OLAP `502` / empty body | Lambda timed out or returned nothing | Slow/failed transform | Lambda duration + logs | Speed up transform; ensure `Body` is set |
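Several of the rows above come down to an identifier in the wrong form. As a minimal sketch, the ARN shapes can be generated instead of hand-typed. The helper names are hypothetical, the layouts assume the standard `aws` partition, and the account ID and names in the usage lines are placeholders:

```python
def access_point_arn(region: str, account_id: str, name: str) -> str:
    """ARN of a regular, single-Region access point."""
    return f"arn:aws:s3:{region}:{account_id}:accesspoint/{name}"


def access_point_object_arn(region: str, account_id: str, name: str, key: str) -> str:
    """ARN of an object reached through an access point (the form used in AP policies)."""
    return f"{access_point_arn(region, account_id, name)}/object/{key}"


def object_lambda_arn(region: str, account_id: str, name: str) -> str:
    """ARN of an Object Lambda Access Point; note the different service namespace."""
    return f"arn:aws:s3-object-lambda:{region}:{account_id}:accesspoint/{name}"


def mrap_arn(account_id: str, alias: str) -> str:
    """ARN of a Multi-Region Access Point: empty Region field, alias ends in .mrap."""
    if not alias.endswith(".mrap"):
        alias = f"{alias}.mrap"
    return f"arn:aws:s3::{account_id}:accesspoint/{alias}"


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    print(access_point_arn("us-east-1", "111122223333", "team-analytics"))
    print(mrap_arn("111122223333", "examplealias"))
```

Generating the string in one place makes a `NoSuchAccessPoint` caused by a swapped region or account field much less likely than when each consumer assembles the ARN by hand.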
And the hard limits and quotas that constrain a design — real numbers you should size against, not discover in production:
| Limit / quota | Value | Applies to | Note |
|---|---|---|---|
| Bucket policy size | 20 KB | Per bucket | The driver for decomposition |
| Access-point policy size | 20 KB | Per access point | Each AP gets its own budget |
| Access points per account (region) | 10,000 (default) | Per account per region | Adjustable quota; plan naming for scale |
| Access point name length | 3–50 chars | Per AP | Lowercase, hyphens, no underscores |
| GET requests per prefix | 5,500 / s | Bucket key namespace | Scales by prefix, not by AP |
| PUT/COPY/DELETE per prefix | 3,500 / s | Bucket key namespace | Same — spread keys, not APs |
| Object Lambda response size | up to 5 GB stream | Per `WriteGetObjectResponse` | Stream large bodies |
| Lambda timeout (OLAP) | up to 60 s effective for the GET path | Per request | Keep transforms fast |
| MRAP regions | up to ~17–20 (account/region dependent) | Per MRAP | One bucket per region per MRAP |
| RTC SLA | 99.99% within 15 min | CRR with RTC | Per-GB fee applies |
| Replication | async, no SLA without RTC | CRR | “Eventually” unless RTC |
| MRAP provisioning time | up to ~30 min | Per create | Async; poll the request token |
| OLAP `WriteGetObjectResponse` timeout | tied to the GET request budget | Per request | A slow transform stalls the caller |
| Supporting AP per OLAP | exactly 1 | Per OLAP | One data source; full ARN required |
| Object key length | up to 1,024 bytes | Per object | Unchanged by access points |
| Versioning requirement (CRR) | versioning must be on | Both buckets | CRR refuses un-versioned buckets |
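Two of these limits are cheap to check before a deploy. The sketch below is hypothetical tooling, not part of any AWS SDK. It assumes policy size is measured on the compact UTF-8 JSON serialization, which is best treated as a lower bound, and it approximates the naming rule from the table with a regular expression:

```python
import json
import re

POLICY_LIMIT_BYTES = 20 * 1024  # the 20 KB limit from the table above

# Approximation of the naming rule: 3-50 characters, lowercase letters,
# digits and hyphens, starting and ending with a letter or digit.
_AP_NAME = re.compile(r"^[a-z0-9][a-z0-9-]{1,48}[a-z0-9]$")


def policy_size_bytes(policy: dict) -> int:
    """Size of the policy when serialized as compact UTF-8 JSON."""
    return len(json.dumps(policy, separators=(",", ":")).encode("utf-8"))


def policy_headroom(policy: dict) -> int:
    """Bytes left before the limit; a negative value means the policy is too large."""
    return POLICY_LIMIT_BYTES - policy_size_bytes(policy)


def is_valid_access_point_name(name: str) -> bool:
    """True if the name fits the approximate 3-50 character rule above."""
    return bool(_AP_NAME.match(name))


if __name__ == "__main__":
    empty_policy = {"Version": "2012-10-17", "Statement": []}
    print(policy_headroom(empty_policy))
    print(is_valid_access_point_name("team-analytics"))
    print(is_valid_access_point_name("Team_Analytics"))
```

Running a check like this in CI on every policy change turns the 20 KB ceiling into a failed build rather than a failed `put-bucket-policy` call.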
Best practices
- Reduce the bucket policy to a single delegation statement using `s3:DataAccessPointAccount` (or `s3:DataAccessPointArn` for tighter scoping). The bucket should authorize nothing directly once you’ve adopted access points.
- One access point per application, team, or partner, each with a small, single-purpose policy. Decomposition is the whole point; resist the urge to make one “shared” access point.
- Set all four Block Public Access flags on every access point, and confirm the account-level BPA so no access point can ever loosen public access.
- Use VPC-bound access points for any data that must never have a public path. It’s the strongest network control S3 offers, and it composes with an endpoint policy for a closed loop.
- Grant cross-account access on both sides (the access-point policy and the consumer’s IAM policy), keep the bucket policy untouched, and add an `aws:PrincipalOrgID` condition to prevent confused-deputy access.
- Point each Object Lambda Access Point at the correct supporting access point ARN, and verify the transform actually changes bytes (the `diff` test) before consumers depend on it.
- Decide `Range`/`partNumber` handling explicitly: support it for length-preserving transforms, reject it with `501` for format-changing ones. Never ignore ranges and return the full body; that corrupts SDK multipart downloads.
- Grant the OLAP Lambda only `WriteGetObjectResponse` plus logging by default; add `s3:GetObject` only if the function reads beyond the pre-signed `inputS3Url`.
- Back every MRAP with two-way Cross-Region Replication (never route over divergent data) and enable RTC when you need the 15-minute SLA.
- Install SigV4A (CRT) before the first MRAP call and verify no `SignatureDoesNotMatch`; address the MRAP by its `.mrap` ARN so the SDK auto-selects the signer.
- Use `TrafficDialPercentage` for planned failover/drain. It’s reversible and deletes nothing. Document the procedure before you need it at 2am.
- Attribute every request to its door: enable request metrics filtered per access point and CloudTrail data events (both record the access-point ARN) so one broken consumer is visible without noise from the others.
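The first practice in the list, the single delegation statement, is small enough to generate. This is a minimal sketch under stated assumptions: the function name, bucket name, and account ID are hypothetical, and the wildcard principal plus `s3:*` action follow the delegation shape this guide describes rather than a policy tuned for your account:

```python
import json


def delegation_bucket_policy(bucket: str, owner_account_id: str) -> dict:
    """Build a bucket policy that only trusts requests arriving via the owner's access points."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DelegateToAccessPoints",
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # The condition is what makes the wildcard principal safe:
                # only requests made through an access point owned by this
                # account match the statement.
                "Condition": {
                    "StringEquals": {"s3:DataAccessPointAccount": owner_account_id}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder bucket and account ID, for illustration only.
    policy = delegation_bucket_policy("example-data-lake", "111122223333")
    print(json.dumps(policy, indent=2))
```

The printed JSON is what you would hand to `put-bucket-policy`. Swapping the condition key for `s3:DataAccessPointArn` narrows delegation to specific access points.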
Security notes
- Least-privilege per access-point policy. Each access point’s policy should grant the narrowest action set on the narrowest `/object/<prefix>/*` resource; the decomposition is wasted if every access point grants `s3:*` on `*`.
- VPC isolation for sensitive data. A VPC-bound access point plus an S3 endpoint policy keeps raw/PII data on the AWS backbone with no public DNS path, the strongest isolation S3 offers short of a private bucket.
- Object Lambda as a redaction boundary. Use OLAP to enforce that a class of caller can never see raw PII: the partner physically cannot retrieve unmasked bytes because the transform always runs on their door. Pair it with a VPC-bound, unredacted access point for internal callers.
- Cross-account guardrails. Always scope cross-account grants with `aws:PrincipalOrgID` (or specific role ARNs) to prevent a confused-deputy where a third party rides the partner’s access. Revoke cleanly by deleting the access point.
- Encryption is the bucket’s job, inherited. Access points don’t change encryption; the bucket’s default SSE-S3/SSE-KMS applies. For KMS-encrypted data, the consuming principal (and the OLAP Lambda) needs `kms:Decrypt` on the key, a common silent failure on cross-account reads.
- Audit through the door. CloudTrail data events and S3 server access logs both record the access-point ARN, so every read/write is attributable to the consumer that made it, which is essential for a shared dataset.
The security controls mapped to the threat each one closes:
| Control | Mechanism | Closes which threat | Also helps |
|---|---|---|---|
| Per-AP least-privilege policy | scoped `Action` + `/object/<prefix>` | Over-broad access via a shared door | Blast-radius containment |
| VPC-bound access point | `VpcConfiguration` + endpoint policy | Public exposure of sensitive data | Data-exfiltration paths |
| Account BPA ratchet | most-restrictive-wins composition | Accidental re-exposure by an AP | Guardrail across all APs |
| `aws:PrincipalOrgID` on cross-account | condition key | Confused-deputy / third-party ride-along | Scoping partner access |
| OLAP redaction transform | Lambda on the GET path | A caller class seeing raw PII | Eliminates duplicate redacted copies |
| KMS key policy for consumers | `kms:Decrypt` grant | Silent `AccessDenied` on encrypted reads | Cross-account key access |
| CloudTrail data events | per-AP ARN logging | Unattributable access on shared data | Forensics / compliance |
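The first and fourth controls in the table combine naturally in one access-point policy. The sketch below assumes a read-only partner. The function name and every ARN, prefix, and organization ID in it are hypothetical placeholders:

```python
import json


def partner_read_policy(ap_arn: str, prefix: str, partner_role_arn: str, org_id: str) -> dict:
    """Access-point policy: one principal, one action, one prefix, one organization."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PartnerReadOnly",
                "Effect": "Allow",
                "Principal": {"AWS": partner_role_arn},
                "Action": "s3:GetObject",
                # Objects are addressed through the access point, not the bucket.
                "Resource": f"{ap_arn}/object/{prefix}/*",
                # Confused-deputy guardrail: the caller must belong to this org.
                "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


if __name__ == "__main__":
    # All identifiers below are placeholders.
    policy = partner_read_policy(
        ap_arn="arn:aws:s3:us-east-1:111122223333:accesspoint/partner-reports",
        prefix="reports",
        partner_role_arn="arn:aws:iam::444455556666:role/report-reader",
        org_id="o-exampleorgid",
    )
    print(json.dumps(policy, indent=2))
```

This covers only the owner’s side of the grant. The partner’s own IAM policy still has to allow `s3:GetObject` on the same access-point resource before a cross-account read succeeds.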
Cost & sizing
A few economics and limits worth internalizing before you build a sprawling topology:
- Access points themselves are free, and they do not change S3 request pricing. You pay normal S3 request and storage rates regardless of how many access points front a bucket. Decomposing the policy costs nothing.
- Request rate scales per prefix, not per access point. The 5,500 GET / 3,500 PUT-per-second-per-prefix scaling is a property of the bucket key namespace. Adding access points does not multiply your request ceiling; spreading keys across more prefixes does.
- Object Lambda adds Lambda cost and latency to every GET, plus the `WriteGetObjectResponse` data path. Budget for invocation count = read count, and keep transforms fast and memory-right-sized. It is a per-request cost, so a hot read path can dominate your bill.
- MRAP adds a per-GB data-routing charge on top of normal S3 and any CRR transfer/replication costs. Cross-Region replication moves bytes between Regions and is billed accordingly; RTC adds a per-GB fee for the SLA.
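The per-prefix and per-request points above reduce to simple arithmetic. As a rough planning sketch (the function names are hypothetical, and it assumes load is spread evenly across prefixes, which real workloads rarely achieve):

```python
import math

GET_PER_PREFIX_PER_SEC = 5_500  # per-prefix GET rate quoted above
PUT_PER_PREFIX_PER_SEC = 3_500  # per-prefix PUT/COPY/DELETE rate quoted above


def prefixes_needed(get_rps: float, put_rps: float = 0.0) -> int:
    """Minimum number of prefixes to carry the target rates, assuming an even spread."""
    for_gets = math.ceil(get_rps / GET_PER_PREFIX_PER_SEC)
    for_puts = math.ceil(put_rps / PUT_PER_PREFIX_PER_SEC)
    return max(1, for_gets, for_puts)


def olap_invocations(reads_per_day: int, days: int = 30) -> int:
    """Object Lambda invocations over a period: one invocation per read."""
    return reads_per_day * days


if __name__ == "__main__":
    print(prefixes_needed(get_rps=20_000))           # GET-bound workload
    print(prefixes_needed(get_rps=1_000, put_rps=8_000))  # PUT-bound workload
    print(olap_invocations(reads_per_day=2_000_000))
```

Note that the number of access points does not appear anywhere in the calculation, which is the point: only the key layout moves the ceiling.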
For observability, enable request metrics with an access-point filter so each consumer’s traffic is independently visible, and turn on S3 server access logging or CloudTrail data events — both record the access point ARN, so you can attribute every request to the door it came through. A useful CloudWatch metric-math approach is to alarm per access point on 4xx rate, which surfaces a single broken consumer without noise from the others.
The cost drivers and what each one buys you, with rough figures:
| Cost driver | What you pay for | Rough figure | What it’s for | Watch-out |
|---|---|---|---|---|
| Access points | Nothing (free) | ₹0 | Policy decomposition | No reason not to use them |
| S3 requests | Per-1,000 GET/PUT | ~₹0.03–0.4 / 1k | Normal bucket access | Unchanged by access points |
| Object Lambda invocations | Lambda per GET + duration | scales with reads | In-flight transform | Hot read path dominates |
| `WriteGetObjectResponse` data | Per-GB returned | per-GB | OLAP response path | Large objects add up |
| MRAP routing | Per-GB routed | per-GB | Global endpoint + failover | On top of S3 + CRR |
| CRR transfer | Per-GB inter-region | per-GB | Keeping replicas in sync | Doubles for two-way |
| RTC | Per-GB + replication metrics | per-GB fee | 15-min SLA | Only if you need bounded lag |
| CloudTrail data events | Per-100k events | per-event | Per-door audit | High-traffic = real volume |
When to reach for each, sized by scale:
| If your situation is… | Reach for… | Because |
|---|---|---|
| One bucket, a few IAM roles | Plain bucket policy | Decomposition is overkill |
| Shared bucket, many teams/prefixes | Access points | Kills the 20 KB / blast-radius / contention problem |
| Two views of one dataset (raw vs redacted) | Object Lambda | One authoritative copy, transformed per request |
| Data that must never be public | VPC-bound access point | No public DNS path |
| Cross-account sharing | Access-point policy + OrgID | Revocable, bucket untouched |
| Global low-latency reads, active-active | MRAP + two-way CRR | One endpoint, replicated data |
| Just need a regional read replica | Plain CRR (no MRAP) | Routing isn’t needed |
Interview & exam questions
1. Why does a single bucket policy fail at scale, and what replaces it? A bucket policy is one 20 KB document with one change process governing every principal — it concentrates blast radius, creates change contention, can’t express clean per-tenant network rules, and eventually won’t save. Access points replace it: each consumer gets a named endpoint with its own 20 KB policy and BPA, and the bucket policy collapses to a one-statement delegation using s3:DataAccessPointAccount.
2. What does the bucket policy look like after you adopt access points? It contains a single Allow statement permitting s3:* on the bucket and its objects, conditioned on s3:DataAccessPointAccount equal to the owning account (or s3:DataAccessPointArn for specific access points). It authorizes nothing directly — all fine-grained authorization moves into the per-access-point policies.
3. How does Block Public Access compose between an access point and the account? The effective setting is the most restrictive of the access-point BPA and the account/bucket BPA. An access point can never loosen public access the account has locked down — it’s a one-way ratchet. If the account blocks public policies, no access point can re-expose the bucket.
4. What is a VPC-bound access point and why is it the strongest network control? An access point created with VpcConfiguration has no public DNS path — it answers only requests arriving over an S3 gateway/interface endpoint in that VPC. There is no policy mistake that can make it public, because the endpoint simply doesn’t exist on the internet. That’s stronger than any condition-based restriction.
5. Explain the three-layer Object Lambda topology. The bucket holds the authoritative object; a supporting access point (a plain AP) sits on the bucket; and the Object Lambda Access Point references that supporting AP’s full ARN and names the Lambda transform. Clients GET through the OLAP ARN, S3 invokes the Lambda with a pre-signed inputS3Url, and the function streams transformed bytes back via WriteGetObjectResponse.
6. Why is Range handling the classic Object Lambda trap? AWS SDKs send Range/partNumber headers constantly for large objects and multipart downloads. If your transform changes byte length (variable redaction, format conversion, compression), output offsets no longer map to input, so serving a “range” corrupts the download. You must either support ranges (only for length-preserving transforms) or reject them with 501 so the client falls back to a full GET.
7. What’s the difference between an MRAP and Cross-Region Replication? An MRAP is one global endpoint that routes each request to the closest healthy regional bucket (Global Accelerator anycast) — it does not move data. CRR copies objects between regions. An MRAP without replication is latency routing over divergent data. Routing and copying are separate jobs: MRAP routes, replication copies.
8. Why must MRAP requests use SigV4A? A single global MRAP request can be served from any region, so it can’t carry a region-scoped classic SigV4 signature. SigV4A signs for region *. Recent SDKs/CLI support it but require the CRT signing dependency; without it the first MRAP call fails with SignatureDoesNotMatch. Address the MRAP by its .mrap ARN and the SDK auto-selects SigV4A.
9. How do you perform a planned, reversible MRAP failover? Use submit-multi-region-access-point-routes to set a region’s TrafficDialPercentage to 0, which drains it without deleting anything; dial it back to 100 to restore. This is the canonical maintenance/failover move — distinct from automatic health-based failover (which routes around an impaired region without you doing anything).
10. A cross-account consumer gets AccessDenied despite a correct access-point policy. What’s missing? Cross-account always requires both sides: the access-point policy must grant the external principal, and that principal’s own IAM policy must allow s3:GetObject on the access-point ARN. The bucket policy stays untouched. Add an aws:PrincipalOrgID condition to prevent a confused-deputy.
11. Do access points multiply your request-rate ceiling? No. The 5,500 GET / 3,500 PUT per second is per prefix in the bucket key namespace, not per access point. Fronting a bucket with more access points doesn’t raise the ceiling; spreading keys across more prefixes does. Access points are an access-control and network tool, not a performance lever.
12. When is this whole pattern overkill? For a private bucket with a handful of IAM roles, a plain bucket policy is simpler and sufficient — there’s no contention, blast-radius, or PII-view problem to solve. Reach for access points when the bucket is shared (many teams/prefixes/accounts), for Object Lambda when two audiences need two views of one object, and for MRAP when you need a global active-active read endpoint.
These map primarily to the AWS Certified Solutions Architect – Professional (SAP-C02) (multi-account data sharing, cross-region architectures) and Security – Specialty (SCS-C02) (least-privilege data access, redaction, network isolation), with the storage mechanics relevant to Solutions Architect – Associate (SAA-C03). A compact cert mapping:
| Question theme | Primary cert | Objective area |
|---|---|---|
| Access-point decomposition, delegation | SAP-C02 / SAA-C03 | Design secure, scalable data access |
| VPC-bound access points, endpoints | SCS-C02 / SAP-C02 | Network isolation; data perimeter |
| Object Lambda redaction | SCS-C02 | Data protection; least exposure |
| Cross-account sharing + OrgID | SAP-C02 | Multi-account architectures |
| MRAP + replication + SigV4A | SAP-C02 | Multi-region, resilient design |
| Request-rate scaling per prefix | SAA-C03 | Performant storage design |
Quick check
- After adopting access points, what is the only thing the bucket policy should do, and which condition key expresses it?
- Your custom Object Lambda function returns the original, unredacted bytes through the OLAP. Name the two most likely causes.
- True or false: scaling out to more access points raises your bucket’s GET request-per-second ceiling.
- Your first MRAP `get-object` fails with `SignatureDoesNotMatch`. What’s wrong and what’s the fix?
- You need raw data to have no public path whatsoever for one team. Which access-point feature do you use, and what else must exist in the VPC for it to work?
Answers
- Delegate. It should contain a single `Allow` statement conditioned on `s3:DataAccessPointAccount` (or `s3:DataAccessPointArn`), permitting requests that arrived via an access point owned by the account. It authorizes nothing directly; the per-access-point policies do the fine-grained work.
- Either the OLAP’s `SupportingAccessPoint` points at the wrong access point, or the transform silently no-op’d (the regex didn’t match, or the function returned its input). Confirm with a `diff` of the raw read versus the OLAP read: identical output means the transform isn’t running.
- False. The 5,500 GET / 3,500 PUT per second is per prefix in the bucket key namespace, not per access point. More access points don’t raise the ceiling; spreading keys across more prefixes does.
- MRAP requires SigV4A (region `*`); your signer is classic SigV4-only. Install the CRT signing dependency (`pip install 'awscli[crt]'` or a CRT-bundled CLI v2) and address the MRAP by its `.mrap` ARN so the SDK auto-selects SigV4A.
- A VPC-bound access point (`VpcConfiguration` at create time), which has no public DNS path. For it to be reachable, the VPC must also have an S3 gateway or interface endpoint; without an endpoint the access point answers nothing.
Glossary
- Access point (AP): a named network endpoint attached to a single bucket, with its own resource policy and Block Public Access settings, optionally bound to one VPC; an alternate front door into the same objects, not a copy.
- Supporting access point: a plain access point that an Object Lambda Access Point points at as its data source; referenced by its full ARN.
- Object Lambda Access Point (OLAP): an access point that inserts your Lambda into the GET path, transforming object bytes per request via `WriteGetObjectResponse` without writing a derived copy.
- Multi-Region Access Point (MRAP): a single global endpoint (`<alias>.mrap`) that routes each request to the closest healthy underlying bucket across regions using Global Accelerator anycast.
- Cross-Region Replication (CRR): asynchronous copying of objects between buckets in different regions; what keeps an MRAP’s regional buckets consistent.
- Replication Time Control (RTC): a CRR option providing a 99.99% / 15-minute replication SLA plus metrics, for a per-GB fee.
- `s3:DataAccessPointAccount`: bucket-policy condition key meaning “this request arrived via an access point owned by this account”; the heart of the delegation statement.
- `s3:DataAccessPointArn`: bucket-policy condition key that scopes delegation to specific access-point ARNs (supports wildcards).
- Block Public Access (BPA): four flags (`BlockPublicAcls`, `IgnorePublicAcls`, `BlockPublicPolicy`, `RestrictPublicBuckets`) that prevent public exposure; effective setting is the most restrictive of account and access-point.
- VPC configuration: binding an access point to one VPC at creation, removing any public DNS path so it answers only via an S3 endpoint in that VPC.
- SigV4A: Signature Version 4A, the multi-region signing variant (region `*`) required for MRAP; needs the CRT signing dependency.
- `WriteGetObjectResponse`: the S3 Object Lambda API the transform Lambda calls to stream its result back, using the request’s `outputRoute` and one-time `outputToken`.
- `inputS3Url`: the pre-signed URL to the original object that S3 hands the OLAP Lambda so it can fetch the source bytes without its own `s3:GetObject` grant.
- `TrafficDialPercentage`: the per-region MRAP routing dial; `0` drains a region (reversible failover), `100` restores it.
- Range-safe transform: a transform that preserves byte length (e.g. fixed-width masking) so `Range`/`partNumber` requests can be served correctly; format-changing transforms are not range-safe and must reject ranges.
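The last entry is easier to see in code. The sketch below separates the decision (serve, serve a range, or reject) from any AWS call so it can be tested locally. The helper names are hypothetical, and the event fields it reads (`userRequest.headers`, `userRequest.url`) are assumptions about the Object Lambda event shape to verify against a real invocation:

```python
from urllib.parse import parse_qs, urlparse


def wants_partial_object(event: dict) -> bool:
    """True if the caller sent a Range header or a partNumber query parameter."""
    user_request = event.get("userRequest", {})
    headers = {k.lower(): v for k, v in user_request.get("headers", {}).items()}
    query = parse_qs(urlparse(user_request.get("url", "")).query)
    return "range" in headers or "partNumber" in query


def plan_response(event: dict, length_preserving: bool) -> dict:
    """Decide how the transform should answer this request."""
    if not wants_partial_object(event):
        return {"action": "transform_full"}
    if length_preserving:
        # Output offsets equal input offsets, so a range can be served.
        return {"action": "transform_range"}
    # Format-changing transform: reject so the client retries with a full GET.
    return {"action": "reject", "status_code": 501}


def mask_digits(data: bytes) -> bytes:
    """Fixed-width masking: every ASCII digit becomes '#', so length is unchanged."""
    return bytes(0x23 if 0x30 <= b <= 0x39 else b for b in data)


if __name__ == "__main__":
    ranged = {"userRequest": {"url": "https://example.invalid/obj",
                              "headers": {"Range": "bytes=0-99"}}}
    print(plan_response(ranged, length_preserving=False))
    print(mask_digits(b"SSN 123-45-6789"))
```

In a real handler the `reject` plan would translate into a `WriteGetObjectResponse` call carrying the status code instead of a body. `mask_digits` is range-safe because byte *i* of the output depends only on byte *i* of the input.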
Next steps
You can now decompose a shared bucket, transform on read, and serve globally without copying. Build outward:
- Next: S3 Deep Dive: Storage Classes, Versioning, Lifecycle & Encryption — the data plane beneath every access point, OLAP, and MRAP.
- Related: S3 Data Protection & Governance at Scale — Object Lock, inventory, and the governance program these access patterns plug into.
- Related: IAM Least Privilege & Permission Boundaries — make every per-access-point policy as tight as it should be.
- Related: Lambda Deep Dive: Runtimes, Triggers, Layers & Concurrency — right-size and harden the function on your Object Lambda read path.
- Related: VPC Deep Dive: Subnets, Routing, IGW, NAT & Endpoints — wire the S3 endpoint a VPC-bound access point depends on.
- Related: PrivateLink: Provider/Consumer & Cross-Account — the private-connectivity pattern for cross-account data access without a public path.