GCP Lesson 18 of 98

Google Cloud Memorystore, In Depth: Redis, Redis Cluster, Memcached, HA & Eviction

The fastest database query is the one you never run. When an application reads the same product catalogue, the same user session, or the same leaderboard thousands of times a second, paying the full cost of a relational query — parse, plan, disk, network — for every read is wasteful and, eventually, fatal to your database. The standard remedy is an in-memory cache: a tier of RAM-backed key/value storage that sits in front of the slow store and serves hot data in sub-millisecond time. On Google Cloud, the managed answer is Memorystore.

Memorystore is not one product but a family of three, and choosing badly among them is the most common Memorystore mistake. There is Memorystore for Redis (a managed single-shard Redis with an optional high-availability replica), Memorystore for Redis Cluster (a managed, horizontally sharded Redis that scales out to terabytes and far higher throughput), and Memorystore for Memcached (a managed, multi-node Memcached for simple, large, distributed caching). They share a goal — give you a fast in-memory tier without you patching, replicating or babysitting nodes — but differ sharply in data model, scaling model, durability and price.

This lesson is the exhaustive version. We will fix the vocabulary, compare the three offerings field by field, then walk every configuration option you set when you create and operate them: size and version, the maxmemory eviction policies, persistence (RDB snapshots and AOF), maintenance windows, and the full connectivity story — Private Service Access, Private Service Connect, the VPC plumbing, AUTH and in-transit TLS. We will cover scaling, the metrics that matter, and the cache patterns (cache-aside, write-through, session store) that decide whether your cache helps or hurts. We finish with the decision interviewers love: Redis or Memcached? Commands are real gcloud against current Memorystore (2026), with console names called out so you can follow along either way.

Learning objectives

By the end of this lesson you will be able to:

Prerequisites & where this fits

You should be comfortable with the Google Cloud resource hierarchy (organisation, folders, projects) and basic IAM, have the gcloud CLI installed and initialised, and — crucially — understand what a VPC, a subnet and private connectivity are, because a Memorystore instance has no public endpoint and lives entirely inside your private network. A passing familiarity with key/value caching and the idea of a hash map will make the data-model sections concrete. This is the Databases lesson of the GCP Zero-to-Hero course that follows Cloud SQL (the relational store you will most often put a cache in front of) and Firestore (the document store), and precedes the Bigtable deep dive. The networking it depends on is covered in the VPC deep dive — read that if subnets, routes and private access are unfamiliar.

Core concepts: the mental model

Before the settings, fix the vocabulary. Most Memorystore confusion comes from not being clear about which product you are talking about and what “node”, “shard” and “replica” mean for each.

The single most important early decision: which of the three products, and for legacy Redis, which tier. Get that wrong and no amount of tuning helps. The second most important: eviction policy and size, because a too-small instance with the wrong eviction policy is the classic 3 a.m. incident. We will return to both in depth.

The three offerings compared

Here is the field-by-field comparison. Read it before you create anything — it is the decision the rest of the lesson elaborates.

| Dimension | Memorystore for Redis | Memorystore for Redis Cluster | Memorystore for Memcached |
|---|---|---|---|
| Engine | Redis (single shard) | Redis (sharded, OSS Cluster API) | Memcached |
| Data types | Full Redis (strings, hashes, lists, sets, sorted sets, streams, …) | Full Redis | Opaque key/value only |
| Scaling model | Vertical (resize the node) | Horizontal (add/remove shards) + vertical | Horizontal (add/remove nodes) + per-node size |
| Max size (order of magnitude) | Up to ~300 GB per instance | Terabytes (many shards) | Large (multi-node; many GB per node × nodes) |
| High availability | Standard tier: primary + 1 replica, auto-failover | Built-in: up to 2 replicas per shard, auto-failover | None; node loss loses that node’s data |
| Read scaling | Read replicas (Standard, up to 5) | Replicas can serve reads | Reads spread across nodes by the client |
| Persistence | RDB and/or AOF (optional) | RDB and/or AOF (optional) | None |
| Single endpoint | Yes (primary endpoint; optional read endpoint) | Discovery endpoint (cluster-aware client) | Discovery endpoint (Auto Discovery) |
| In-transit TLS | Yes (optional, set at creation) | Yes | Yes |
| AUTH | Yes (optional) | IAM auth and/or Redis AUTH | SASL (optional) |
| Best for | General caching, sessions, queues, locks on a single logical Redis | Large, high-throughput Redis needing scale-out and resilience | Simple, large, distributed caching of opaque objects |
| Relative cost | Low–medium | Higher (multiple shards) | Low–medium (pay per node) |

The plain-English version:

Note the modern direction of travel: Redis Cluster is Google’s strategic Redis offering and where new capabilities land first. The single-node Redis product remains ideal when you genuinely need only one shard, but if you anticipate growth past one machine, design for Cluster from the start to avoid a migration.

Memorystore for Redis: tiers, replicas and failover

This is the workhorse. Everything here is about a single logical Redis (one primary), optionally made highly available.

Service tiers

The tier is the headline choice and is set at creation.

| Tier | Topology | Failover | SLA | Use for |
|---|---|---|---|---|
| Basic | Single node, no replica | None; a maintenance event or failure causes a full cache flush and brief unavailability | No availability SLA | Dev/test, or a pure cache where losing all data on restart is acceptable |
| Standard | Primary + 1 replica in different zones of the region | Automatic failover to the replica; endpoint preserved | Availability SLA (e.g. 99.9%) | Anything production |

The Basic-tier gotcha is brutal and frequently learned the hard way: a Basic instance loses its entire dataset during scaling and during routine maintenance, because there is no replica to fail over to. Never put a Basic instance where a cold cache would cause a thundering-herd outage on your database. For production, Standard tier is the default.

Read replicas

Standard tier supports read replicas: with read-replicas mode enabled, you can configure 1 to 5 replicas (the replica count is the number of read replicas). This gives you two things: more read throughput (clients can issue reads against a read endpoint that load-balances across replicas) and more resilience at failover. There are two endpoints: the primary endpoint, which serves reads and writes, and the read endpoint, which spreads reads across the replicas.

Reads from a replica are eventually consistent — replication is asynchronous, so a value written to the primary may not yet be visible on a replica for a few milliseconds. For a cache this is usually fine; for read-your-own-writes correctness, read from the primary.

How failover works

In Standard tier, the replica continuously receives the primary’s data stream. If the primary’s zone or node fails, Memorystore promotes the replica to primary automatically; the primary endpoint IP is preserved, so a well-written client reconnects to the same address and continues. The application’s job is simply to handle dropped connections and retry — connections are severed at failover, so your client library must reconnect (most do automatically) and your code must not assume a connection lives forever. You can also trigger a manual failover (for testing DR, or to force a primary onto a healthier zone) with gcloud redis instances failover.

A subtlety interviewers probe: during failover there may be a small window of unavailability (seconds) and, because replication is asynchronous, a small risk of losing the last few writes that had not yet reached the replica. HA protects availability and most data, not literally every last write. If you need stronger durability, enable AOF persistence (below).
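The "handle dropped connections and retry" discipline can be sketched as a small wrapper. This is a minimal illustration, not part of any Memorystore or Redis client API: the helper name `call_with_retry`, its parameters and the backoff values are all assumptions. It catches Python's built-in `ConnectionError` and `TimeoutError` by default; pass the exception classes your client library actually raises through `retry_on`.

```python
import random
import time


def call_with_retry(operation, attempts=5, base_delay=0.05, max_delay=1.0,
                    retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Run operation(), retrying connection-type errors with jittered backoff.

    Hypothetical helper: names and default values are illustrative only.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with full jitter, capped at max_delay,
            # so many clients do not all reconnect at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

A call such as `call_with_retry(lambda: r.get("product:1"))` would then ride out the few seconds of a failover, provided the client library reconnects on the next command. Retrying is safe for idempotent commands like `GET` and `SET`; think twice before blindly retrying non-idempotent ones such as `INCR`.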

Memorystore for Redis Cluster: sharding and scale-out

When a single Redis node is not enough, Redis Cluster spreads the keyspace across shards and runs them as one logical cluster.

Redis Cluster supports the same persistence (RDB/AOF), in-transit TLS, and IAM/AUTH options as single-node Redis, plus zone distribution of shards/replicas for availability-zone resilience. It is the right starting point when you expect to outgrow one node.
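To make the sharding concrete, here is a sketch of how a key maps to one of the 16,384 hash slots (slot = CRC16(key) mod 16384) and how a hash tag `{...}` pins related keys to the same slot. It assumes the CRC16 variant from the open-source Redis Cluster specification (CCITT/XMODEM), which Python's standard `binascii.crc_hqx` computes when given an initial value of 0. The function names are illustrative; a cluster-aware client does this for you.

```python
import binascii

SLOT_COUNT = 16384


def hashed_part(key: bytes) -> bytes:
    """Return the bytes that are hashed: the hash tag if present, else the key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Slot for a key: CRC16 (XMODEM variant) of the hashed part, mod 16384."""
    data = hashed_part(key.encode("utf-8"))
    return binascii.crc_hqx(data, 0) % SLOT_COUNT


if __name__ == "__main__":
    # Keys sharing the tag {42} land in one slot, so multi-key commands work.
    print(key_slot("user:{42}:cart"), key_slot("user:{42}:profile"))
    # Without a tag, related keys usually land in different slots.
    print(key_slot("user:42:cart"), key_slot("user:42:profile"))
```

The design point is that only the text inside the first `{...}` is hashed, so you choose co-location explicitly through your key naming.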

Memorystore for Memcached: nodes and Auto Discovery

Memcached is the simplest offering and scales by nodes.

Use Memcached when the workload is a large, flat, opaque cache and you value simplicity and horizontal scale over data structures, persistence and HA.

Sizing and versions: every setting

Across the products, sizing and version are core creation choices.

Capacity and node/shard layout

| Product | Capacity lever(s) | Notes |
|---|---|---|
| Redis (Basic/Standard) | Memory size in GB (--size) | Single node; pick enough RAM for working set + overhead. Resizable later (scale up/down) within limits; resizing Standard is online, Basic flushes. |
| Redis (Standard) | Read replica count (1–5) | Read throughput + resilience. |
| Redis Cluster | Shard count + replicas per shard (0–2) + node type | Horizontal scale; rebalanced online. Total RAM ≈ node memory × shards. |
| Memcached | Node count (1–20) + vCPUs/node + memory/node | Horizontal; total RAM = memory/node × nodes. |

Always size for the working set plus headroom. Redis itself needs memory beyond your data for replication buffers, client buffers and (if enabled) the fork used by RDB/AOF rewrites. A common rule of thumb is to keep usage comfortably below the ceiling (e.g. target ~70–80% steady-state) so eviction and background saves have room. Undersizing forces constant eviction (and cache misses) or, with a non-evicting policy, write failures.
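The headroom rule is simple arithmetic, sketched below. The function names and the 75% default are illustrative assumptions taken from the 70–80% rule of thumb above, not Memorystore limits; the second helper restates the "node memory × shards" (or "memory/node × nodes") total.

```python
import math


def required_instance_gb(working_set_gb: float,
                         target_utilisation: float = 0.75) -> int:
    """Smallest whole-GB size that keeps the working set at or below the target."""
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in (0, 1]")
    return math.ceil(working_set_gb / target_utilisation)


def total_capacity_gb(memory_per_node_gb: float, node_count: int) -> float:
    """Total RAM for a sharded or multi-node layout: per-node memory x count."""
    return memory_per_node_gb * node_count


if __name__ == "__main__":
    # A 12 GB working set at a 75% steady-state target needs a 16 GB instance.
    print(required_instance_gb(12, 0.75))
    # Six nodes of 13 GB each give 78 GB in total.
    print(total_capacity_gb(13, 6))
```

Treat the result as a floor: replication buffers, client buffers and persistence forks all come out of the same headroom.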

Engine version

You select a Redis version (e.g. Redis 7.x for the cluster/modern product; the single-node product offers a range of major versions) or a Memcached version at creation. Choose the latest supported version unless a specific client or feature pins you to an older one. Some features (e.g. certain persistence or TLS behaviours) require a minimum version. Major-version upgrades are supported as a managed operation but should be tested: Redis command and behaviour changes between majors are usually minor but not zero.

Region and zones

Memorystore is regional. You pick a region; for Standard-tier Redis you may specify the primary and replica zones (or let Google choose) to control zone placement, and for Redis Cluster you control zone distribution of shards/replicas across the region’s zones for availability-zone resilience. There is no cross-region replication built in — for multi-region you run independent instances and replicate at the application or data-pipeline layer.

Network and connectivity (set at creation)

Crucially, the connectivity model and the network are creation-time settings — you attach the instance to a VPC (and, for PSA, to an allocated IP range) when you create it. We cover this in full in the Connectivity section; just note here that you cannot bolt private connectivity on afterwards in the same way, so plan it up front.

Eviction: maxmemory policies in full

This is the setting that decides whether your Redis behaves like a cache or a store. When Redis reaches maxmemory and a new write needs space, the maxmemory-policy governs what happens. (Memcached has its own simpler LRU and does not expose this Redis setting.)

| Policy | What it evicts | Behaves like | When to use |
|---|---|---|---|
| noeviction | Nothing; write commands return errors when full (reads/deletes still work) | A store (never silently drops data) | When the data must not be evicted (e.g. Redis used as a durable-ish store/queue with persistence); you must size to fit and monitor closely |
| allkeys-lru | The least-recently-used key among all keys | A classic cache | General-purpose caching where any key may be dropped; the most common cache choice |
| allkeys-lfu | The least-frequently-used key among all keys | A cache that favours popular items | Caches with strong hot/cold skew where frequency predicts future use better than recency |
| allkeys-random | A random key among all keys | A cache (cheap eviction) | Rare; when access is uniform and you want minimal eviction overhead |
| volatile-lru | LRU only among keys with a TTL | A cache plus a protected no-TTL region | When some keys are “cache” (TTL set, evictable) and others are “permanent” (no TTL, never evicted) in the same instance |
| volatile-lfu | LFU among keys with a TTL | Same split, frequency-based | As above with hot/cold skew |
| volatile-random | Random among keys with a TTL | Same split, cheap | Rare |
| volatile-ttl | The key with the shortest remaining TTL | A cache that drops soonest-to-expire first | When you want to proactively shed near-expiry keys |

Two rules to internalise:

  1. A cache should use an allkeys-* policy (usually allkeys-lru). If you leave the default and it is noeviction (or volatile-*) and you never set TTLs, the instance will fill up and start rejecting writes rather than evicting — the classic “Redis stopped accepting writes” incident.
  2. The volatile-* policies only evict keys that have a TTL. If you choose a volatile-* policy but set no TTLs, Redis has nothing eligible to evict and behaves like noeviction when full. This trips up many teams.

You set the policy via the instance’s Redis configuration (--redis-config maxmemory-policy=allkeys-lru), and you can change it after creation. Memorystore manages the underlying maxmemory value relative to the instance size, reserving some memory for overhead — you tune the policy, not the raw byte ceiling.
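The two rules are easier to remember once you have watched them play out. The toy model below is a deliberately simplified sketch, not how Redis is implemented: it counts keys instead of bytes, uses exact LRU (Redis uses an approximated, sampled LRU), and models only three policies. The class and exception names are hypothetical.

```python
from collections import OrderedDict


class OutOfMemory(Exception):
    """Raised when a write is rejected because nothing is eligible to evict."""


class TinyCache:
    """Toy key/value store with a key-count ceiling and a maxmemory-style policy."""

    def __init__(self, max_keys: int, policy: str):
        if policy not in ("noeviction", "allkeys-lru", "volatile-lru"):
            raise ValueError(f"unsupported policy: {policy}")
        self.max_keys = max_keys
        self.policy = policy
        self.data = OrderedDict()  # key -> (value, has_ttl); least recent first

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key][0]

    def set(self, key, value, has_ttl=False):
        if key not in self.data and len(self.data) >= self.max_keys:
            self._evict_one()
        self.data[key] = (value, has_ttl)
        self.data.move_to_end(key)

    def _evict_one(self):
        if self.policy == "allkeys-lru":
            self.data.popitem(last=False)  # drop the least recently used key
            return
        if self.policy == "volatile-lru":
            for key, (_, has_ttl) in self.data.items():  # least recent first
                if has_ttl:
                    del self.data[key]
                    return
        # noeviction, or volatile-lru with no TTL-bearing keys: reject the write
        raise OutOfMemory("no evictable key; write rejected")
```

With `TinyCache(2, "volatile-lru")` and two keys written without a TTL, a third `set` raises `OutOfMemory`, which is rule 2 in miniature; the same sequence under `allkeys-lru` quietly drops the least recently used key instead.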

Other useful Redis config you can set the same way includes maxmemory-gb behaviour (managed), notify-keyspace-events (keyspace notifications), timeout (idle client timeout), maxmemory-clients, and the active-expiration tuning — the supported set is documented per version, and unsupported/dangerous directives are blocked by the managed service.

Persistence: RDB snapshots and AOF

By default a Memorystore Redis instance is a pure in-memory cache — restart it and the data is gone. You can opt into persistence to survive restarts and reduce data loss on failover. (Memcached has no persistence.)

RDB — point-in-time snapshots

RDB (Redis Database) persistence periodically writes a compact binary snapshot of the entire dataset to disk. You configure a snapshot period — every 1, 6, 12 or 24 hours — and an optional start time. On restart (or when promoting after a failure), Redis can reload from the latest snapshot.

AOF — append-only file

AOF (Append Only File) logs every write command to disk, so the dataset can be reconstructed by replaying the log. Memorystore offers AOF with an fsync policy (typically every second, appendfsync everysec), giving at most ~1 second of writes at risk.

Choosing

| Need | Choose |
|---|---|
| Pure cache; cold start on restart is fine | No persistence (default): cheapest, fastest |
| Survive restarts but some loss acceptable | RDB (pick a snapshot period matching your tolerance) |
| Minimise data loss (≈1s) | AOF (everysec): most durable |
| Maximum protection | AOF + RDB together (AOF for the small window, RDB for fast reload) |

Persistence turns Redis from a cache into something closer to a lightweight durable store — useful when Redis holds data you cannot trivially rebuild (e.g. a queue, or session state you do not want all users to lose on a restart). But remember: persistence is not a backup. For point-in-time recovery and protection against logical corruption, use Memorystore’s export/import (RDB to Cloud Storage) or the backup capability where available, on a schedule.

Maintenance windows

Memorystore performs managed maintenance (patching, minor upgrades) on a schedule. You control when with a maintenance window — a day of week and start time — so disruptive operations happen during your low-traffic period. For Standard-tier Redis and Redis Cluster, maintenance is failover-based and largely transparent (the replica takes over while a node updates), but connections may drop, so the same “retry on disconnect” discipline applies. For Basic tier, maintenance flushes the cache — another reason Basic is dev/test only. You can also view upcoming maintenance and, in some cases, reschedule a pending maintenance event. Set the window at creation with --maintenance-window-day, --maintenance-window-hour (and the equivalents for the cluster product).

Connectivity: every way to reach the instance

A Memorystore instance has no public IP. It is reachable only from inside your network via private connectivity. There are two mechanisms, and getting one of them right is half of operating Memorystore.

Private Service Access (PSA) — the classic model

Private Service Access is a one-time, per-VPC setup that lets Google-managed services (Cloud SQL, Memorystore, and others) get private IPs inside your VPC via VPC peering:

  1. Allocate an IP range for service producers in your VPC (gcloud compute addresses create … --purpose=VPC_PEERING --prefix-length=…). Memorystore (legacy Redis and Memcached) draws its private IP from this reserved range.
  2. Create the private connection (gcloud services vpc-peerings connect --service=servicenetworking.googleapis.com …). This peers your VPC with Google’s service-producer VPC.
  3. Create the instance attached to that VPC (--network=projects/PROJECT/global/networks/VPC). Memorystore assigns it a private IP from the allocated range.

Clients in the same VPC (or a VPC connected to it appropriately) then reach the instance by its private IP. This is the established model for single-node Redis and Memcached.

Private Service Connect (PSC) — the modern model

Private Service Connect is the newer, more flexible private-connectivity model and is the model used by Memorystore for Redis Cluster (and increasingly the recommended path). Instead of peering whole VPCs, PSC creates endpoints in your VPC that map to the service:

The deeper producer/consumer mechanics are covered in the Private Service Connect deep dive; the VPC deep dive covers the subnets, routes and firewall rules both models depend on.

Reaching Memorystore from each compute surface

Because the endpoint is private, where your client runs determines what plumbing you need:

| Client runs on | How it reaches Memorystore |
|---|---|
| GCE VM in the same VPC | Directly via the private IP; just open egress in the firewall to the instance’s port (6379 Redis / 11211 Memcached). |
| GKE (pods) | Pods are in the cluster’s VPC; reach the private IP directly (ensure the cluster is on the right VPC and firewall allows it). |
| Cloud Run / Cloud Functions (serverless) | Need Direct VPC egress or a Serverless VPC Access connector so the serverless workload can route into the VPC and reach the private IP. Without VPC egress, serverless cannot reach Memorystore. |
| On-prem / another VPC | Via the appropriate connection (Cloud VPN/Interconnect into the VPC for PSA; PSC endpoints for the cluster), with routing and firewall configured. |

AUTH and in-transit TLS

Two security controls, both important and partly creation-time:

Encryption at rest is handled by Google by default; for Redis Cluster and where supported you can use CMEK (customer-managed keys in Cloud KMS) to control the at-rest key.

Scaling: vertical, replicas and shards

Each product scales differently — match the lever to the product.

A scaling gotcha worth stating: changing a Memcached node count or a Redis Cluster shard count reshuffles which keys live where, so a portion of the cache effectively “misses” until it warms — plan scaling for low-traffic windows and pre-warm if a cold portion would hurt the backing store.
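To get a feel for how large that cold portion can be, the sketch below counts how many keys change owner when a node is added. It assumes the simplest possible placement, hash(key) mod node count, which is an illustrative worst case: clients that use consistent hashing, and slot-based resharding in the cluster product, move a smaller share. Function names and the key pattern are hypothetical.

```python
import hashlib


def node_for(key: str, node_count: int) -> int:
    """Naive placement: a stable hash of the key, modulo the number of nodes."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def moved_fraction(keys, old_nodes: int, new_nodes: int) -> float:
    """Share of keys whose owning node changes when the node count changes."""
    keys = list(keys)
    moved = sum(1 for k in keys
                if node_for(k, old_nodes) != node_for(k, new_nodes))
    return moved / len(keys)


if __name__ == "__main__":
    sample = [f"product:{i}" for i in range(10_000)]
    # Under modulo placement, growing from 3 to 4 nodes remaps roughly
    # three quarters of the keys; each remapped key is a miss until re-cached.
    print(f"{moved_fraction(sample, 3, 4):.0%} of keys moved")
```

Whatever the exact share for your client, the database sees the misses, so size the backing store's spare capacity for the scaling event, not just for steady state.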

Monitoring: the signals that matter

Memorystore publishes metrics to Cloud Monitoring (under the redis.googleapis.com, memorystore.googleapis.com and memcache.googleapis.com metric prefixes). The ones to alert on:

| Metric | Why it matters | Alert when |
|---|---|---|
| Memory usage ratio (used / max) | The master health signal | Sustained > ~80%: you are about to evict heavily or reject writes |
| Cache hit ratio (hits / (hits+misses)) | Whether the cache is earning its keep | Drops: too-small instance, bad TTLs, or churn |
| Evicted keys | Pressure under an allkeys-* policy | Rising: working set exceeds capacity; resize |
| Expired keys | TTL behaviour | Context for hit-ratio analysis |
| Connected clients | Connection-pool health | Near the connection limit: pool misconfig or leak |
| Blocked clients / rejected connections | Saturation | > 0 sustained: capacity or client problem |
| CPU utilisation (per node/shard) | Throughput ceiling | High → add shards/nodes (Memcached/Cluster) or resize |
| Replication lag / offset (Standard/Cluster) | Replica freshness, failover safety | Growing lag → replica struggling |
| Calls / commands per second | Load shape | For capacity planning |
| Keyspace / number of keys | Growth trend | Unbounded growth → missing TTLs |

The two you must never ignore are memory usage ratio (predicts eviction and write-rejection) and hit ratio (predicts whether the cache is actually offloading your database). A high memory ratio plus a low hit ratio means you are paying for a cache that is thrashing — resize, fix TTLs, or reconsider what you cache.
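Both ratios are straightforward to compute from raw counters such as `keyspace_hits` and `keyspace_misses`. The sketch below shows the arithmetic and a simple check for the "high memory, low hit ratio" combination. The function names are illustrative, the 80% memory threshold comes from the table above, and the 80% hit-ratio floor is a hypothetical value you would tune for your workload.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """hits / (hits + misses); 0.0 when there has been no traffic yet."""
    total = hits + misses
    return hits / total if total else 0.0


def memory_usage_ratio(used_bytes: int, max_bytes: int) -> float:
    """used / max, the share of the memory ceiling currently in use."""
    return used_bytes / max_bytes


def cache_warnings(used_bytes, max_bytes, hits, misses,
                   memory_threshold=0.80, hit_ratio_floor=0.80):
    """Return which of the two key signals are outside their thresholds."""
    warnings = []
    if memory_usage_ratio(used_bytes, max_bytes) > memory_threshold:
        warnings.append("memory")
    # Skip the hit-ratio check until the cache has actually served traffic.
    if hits + misses > 0 and hit_ratio(hits, misses) < hit_ratio_floor:
        warnings.append("hit_ratio")
    return warnings
```

A result containing both `"memory"` and `"hit_ratio"` is the thrashing case described above. In production you would express the same conditions as Cloud Monitoring alerting policies rather than polling from application code.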

Cache patterns: using the cache correctly

Owning a fast cache is not the same as using it well. These are the standard patterns; the first is the one you will use most.

Cache-aside (lazy loading) — the default

The application is in charge. On a read: check the cache; on a hit, return it; on a miss, read the database, write the value into the cache with a TTL, and return it. On a write: update the database and invalidate (delete) or update the cached key.

def get_product(pid):
    key = f"product:{pid}"
    val = r.get(key)               # 1. try cache
    if val is not None:
        return deserialize(val)    #    hit
    row = db.query_product(pid)    # 2. miss -> read DB
    r.set(key, serialize(row), ex=300)  # 3. populate with 5-min TTL
    return row

def update_product(pid, data):
    db.update_product(pid, data)   # 1. write DB
    r.delete(f"product:{pid}")     # 2. invalidate cache (read repopulates)

Write-through and write-behind

Session store

Storing web session state in Redis is a canonical use. Sessions are small, accessed every request, and benefit from Redis’s TTL (expire idle sessions) and HA (don’t log everyone out on a failover). Use Standard tier with AOF if losing active sessions on a rare failure is unacceptable. This decouples sessions from app instances, enabling stateless, horizontally-scaled app servers — a key reason to reach for Redis over a local in-process cache. Memcached can also hold sessions but, having no HA or persistence, a node loss logs those users out.
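A session store with idle expiry can be sketched in a few lines. The `SessionStore` class below is illustrative, not a library API; it expects a client exposing `setex`, `get`, `expire` and `delete` with Redis-style semantics (the method names mirror those of the redis-py client). The `InMemoryClient` stand-in is an assumption added only so the sketch runs without a server; the `session:` key prefix and the 30-minute default are arbitrary choices.

```python
import json
import secrets
import time


class InMemoryClient:
    """Local stand-in offering the few Redis-style methods used below."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def setex(self, name, ttl_seconds, value):
        self._items[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        item = self._items.get(name)
        if item is None or item[1] <= self._clock():
            self._items.pop(name, None)  # lazily drop expired entries
            return None
        return item[0]

    def expire(self, name, ttl_seconds):
        if self.get(name) is None:
            return False
        self._items[name] = (self._items[name][0], self._clock() + ttl_seconds)
        return True

    def delete(self, name):
        return 1 if self._items.pop(name, None) is not None else 0


class SessionStore:
    """Sessions as JSON values under session:<id>, expiring after idle time."""

    def __init__(self, client, idle_ttl_seconds=1800):
        self.client = client
        self.idle_ttl = idle_ttl_seconds

    def create(self, user_id):
        session_id = secrets.token_urlsafe(32)  # unguessable identifier
        payload = json.dumps({"user_id": user_id})
        self.client.setex(f"session:{session_id}", self.idle_ttl, payload)
        return session_id

    def load(self, session_id):
        key = f"session:{session_id}"
        raw = self.client.get(key)
        if raw is None:
            return None  # expired or never existed
        self.client.expire(key, self.idle_ttl)  # sliding expiry on each access
        return json.loads(raw)

    def destroy(self, session_id):
        self.client.delete(f"session:{session_id}")
```

Because every app server talks to the same store, any instance can serve any request. Resetting the TTL on each load means only genuinely idle sessions expire.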

Other Redis-native patterns

Because Redis has data structures, it does jobs Memcached cannot:

These are precisely the workloads where Redis beats Memcached — and where you should not use Basic tier without understanding the data-loss implications.

Embedded diagram

The diagram below maps the three Memorystore offerings side by side — single-node Redis (Standard with primary, replica and read endpoint), sharded Redis Cluster, and multi-node Memcached — together with how each connects privately into your VPC and how an application uses the cache-aside pattern in front of a database.

[Figure: Google Cloud Memorystore: Redis, Redis Cluster and Memcached topologies, private connectivity and caching patterns]

Use it as the mental index for this lesson: pick the product (left), wire it privately into the VPC (middle), and apply the right cache pattern (right).

Hands-on lab

We will create a small Standard-tier Memorystore for Redis instance over Private Service Access, connect to it from a VM, exercise caching commands, set an eviction policy, then clean up. Everything is small and short-lived to stay within the GCP Free Tier / $300 credit — but note Memorystore itself is not in the always-free tier, so we keep the instance tiny and delete it promptly.

Prerequisites

gcloud auth login
gcloud config set project YOUR_PROJECT_ID
gcloud services enable redis.googleapis.com compute.googleapis.com servicenetworking.googleapis.com

Step 1 — Set up Private Service Access on the default VPC

# Reserve an IP range for Google service producers (Memorystore draws from this)
gcloud compute addresses create memorystore-psa-range \
  --global \
  --purpose=VPC_PEERING \
  --prefix-length=24 \
  --network=default

# Create the private connection (peering) to servicenetworking
gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --ranges=memorystore-psa-range \
  --network=default

Expected: both commands start a long-running operation that completes successfully. You now have private connectivity for managed services on the default VPC.

Step 2 — Create a small Standard-tier Redis instance

gcloud redis instances create lab-cache \
  --size=1 \
  --region=us-central1 \
  --tier=standard \
  --redis-version=redis_7_0 \
  --network=default \
  --connect-mode=private-service-access \
  --redis-config maxmemory-policy=allkeys-lru \
  --maintenance-window-day=sunday \
  --maintenance-window-hour=3

This creates a 1 GB, HA (primary + replica) Redis 7 instance with an LRU eviction policy (so it behaves like a cache) and a maintenance window on Sunday at 03:00. Creation takes a few minutes.

Step 3 — Find the private endpoint

gcloud redis instances describe lab-cache --region=us-central1 \
  --format="value(host,port,authorizedNetwork,currentLocationId,replicaCount)"

Expected: a private IP (e.g. 10.x.x.x), port 6379, the network, the zone, and the replica count. Note the host and port.

Step 4 — Connect from a VM in the same VPC

# A tiny VM in the same region/VPC; install redis-cli
gcloud compute instances create lab-client \
  --zone=us-central1-a \
  --machine-type=e2-micro \
  --network=default

gcloud compute ssh lab-client --zone=us-central1-a --command='
  sudo apt-get update -qq && sudo apt-get install -y redis-tools
'

Now exercise the cache (replace REDIS_HOST with the IP from Step 3):

gcloud compute ssh lab-client --zone=us-central1-a --command='
  REDIS_HOST=10.x.x.x
  redis-cli -h $REDIS_HOST set product:1 "{\"name\":\"widget\"}" EX 300
  redis-cli -h $REDIS_HOST get product:1
  redis-cli -h $REDIS_HOST ttl product:1
  redis-cli -h $REDIS_HOST config get maxmemory-policy
  redis-cli -h $REDIS_HOST info stats | grep -E "keyspace_hits|keyspace_misses|evicted_keys"
'

Expected: OK on the set; the JSON value back on the get; a TTL counting down from ~300; maxmemory-policy reporting allkeys-lru; and hit/miss/eviction counters you can watch change.

Step 5 — Test a manual failover (HA)

gcloud redis instances failover lab-cache --region=us-central1

Expected: a long-running operation; the primary endpoint IP is preserved while the replica is promoted. Re-run the get from the client — it should still work (after a brief reconnect), demonstrating that a well-behaved client survives failover.

Validation

Cleanup

gcloud compute instances delete lab-client --zone=us-central1-a -q
gcloud redis instances delete lab-cache --region=us-central1 -q

# Optional: tear down PSA if you created it only for this lab
gcloud services vpc-peerings delete \
  --service=servicenetworking.googleapis.com --network=default -q
gcloud compute addresses delete memorystore-psa-range --global -q

Cost note

Memorystore is billed per GB-hour of provisioned capacity (Redis: per GB of instance size, doubled for Standard since it runs a replica; Memcached: per node-hour by shape; Redis Cluster: per shard). It is not in the always-free tier. A 1 GB Standard Redis runs only a few rupees/cents per hour, but a forgotten instance is a steady drain — delete it the moment the lab is done. There is no charge for the data; you pay for the provisioned RAM/nodes/shards, plus network egress for cross-zone traffic in HA. Persistence (RDB/AOF) adds modest disk/IO cost; in-transit TLS is free.

Common mistakes & troubleshooting

| Symptom | Likely cause | Fix |
|---|---|---|
| Cache lost all data after maintenance/scaling | Basic tier (no replica); it flushes on those events | Use Standard tier for anything you cannot afford to cold-start |
| Redis rejecting writes (“OOM command not allowed”) | maxmemory-policy=noeviction (or volatile-* with no TTLs) and the instance is full | Switch to allkeys-lru, and/or resize larger; set TTLs |
| Serverless (Cloud Run/Functions) cannot connect | No VPC egress to the private IP | Enable Direct VPC egress or a Serverless VPC Access connector |
| Connection refused / timeout from a VM | Firewall blocks egress to 6379/11211, or wrong VPC | Allow egress to the instance port; ensure the client is in the connected VPC |
| Redis Cluster client errors (MOVED/CROSSSLOT) | Client is not cluster-aware, or multi-key op spans slots | Use a cluster-mode client; co-locate related keys with hash tags {...} |
| Stale values served from cache | Missing invalidation on write, or no TTL | Invalidate on write and set TTLs so staleness is bounded |
| Reads return old data intermittently | Reading from a replica/read endpoint (async lag) | Read from the primary when read-your-writes consistency is required |
| Connections drop periodically | A failover or maintenance event; or idle timeout | Ensure the client reconnects/retries; tune timeout; use connection pooling |
| Low hit ratio, high DB load | Instance too small (constant eviction) or wrong keys cached | Resize, fix TTL strategy, cache the genuinely hot data |

Best practices

Security notes

Interview & exam questions

  1. What are the three Memorystore offerings and when do you use each? Memorystore for Redis (single logical Redis, optional HA) for general caching/sessions/queues; Memorystore for Redis Cluster (sharded, horizontally scalable) when you outgrow one node and need scale-out + resilience; Memorystore for Memcached (multi-node, opaque key/value) for big, simple, rebuildable caches without data structures or persistence.

  2. Basic vs Standard tier for Redis — what’s the difference? Basic = single node, no replica, no failover, and it flushes the cache on maintenance/scaling (no SLA). Standard = primary + replica in different zones, automatic failover, an availability SLA. Production should always use Standard.

  3. How does failover work in Standard-tier Redis, and what must the application do? The replica is promoted to primary automatically; the primary endpoint IP is preserved. The app must handle dropped connections and retry/reconnect — connections are severed at failover. Because replication is async, the last few unreplicated writes may be lost (mitigate with AOF).

  4. What is the difference between a read replica and high availability in Memorystore for Redis? HA (the Standard-tier replica) is about availability — automatic failover at the same endpoint after a zone/node failure. Read replicas (1–5) add read throughput via a separate read endpoint (eventually consistent). One protects uptime; the other scales reads.

  5. Explain Redis Cluster sharding. What changes for the client? The keyspace is split into 16,384 hash slots across shards; a key’s slot = CRC16(key) mod 16384. Clients must be cluster-aware (connect to the discovery endpoint, route by slot). Multi-key commands must touch one slot — use hash tags {...} to co-locate related keys. You scale by changing shard count (online rebalance).

  6. What does the maxmemory-policy setting control, and what’s the right value for a cache? It controls what Redis does when full: evict (LRU/LFU/random/TTL-based) or reject writes (noeviction). For a general cache use allkeys-lru. The classic bug is leaving it on noeviction (or volatile-* with no TTLs) so the instance stops accepting writes when full.

  7. What’s the difference between allkeys-lru and volatile-lru? allkeys-lru can evict any key; volatile-lru evicts only keys that have a TTL. With volatile-* and no TTLs set, nothing is eligible — Redis behaves like noeviction when full.

  8. RDB vs AOF persistence — when each? RDB = periodic binary snapshots (1/6/12/24h): compact, fast reload, but loses everything since the last snapshot. AOF = logs every write (fsync ~every second): ≈1s loss window, more durable, higher I/O. Use AOF when minimising data loss matters; RDB for cheap restart resilience; both for maximum protection. Neither is a backup.

  9. How do you connect to Memorystore privately, and how from serverless? Via Private Service Access (allocate an IP range, create the peering — Redis/Memcached) or Private Service Connect (endpoints in your subnets — Redis Cluster). From Cloud Run/Functions you additionally need Direct VPC egress or a Serverless VPC Access connector to route into the VPC and reach the private IP.

  10. Redis or Memcached — how do you choose? Redis when you need data structures (hashes, sorted sets, streams), persistence, HA/failover, pub/sub, scripting, or anything beyond opaque blobs — i.e. most of the time. Memcached when you want a simple, multi-threaded, horizontally-scaled cache of opaque objects with no need for those features, and node loss is harmless.

  11. What does Memorystore not do that you must design around? Three gaps: there is no built-in cross-region replication (run independent instances + app-level replication for multi-region); Memcached has no HA or persistence; and persistence is not a backup. Also, scaling Memcached/Redis Cluster reshuffles keys (a temporary miss spike).

  12. Which metrics would you alert on for a production cache? Memory usage ratio (eviction/write-rejection risk), cache hit ratio (is the cache earning its keep), evicted keys (capacity pressure), connected clients / rejected connections, CPU per node/shard, and replication lag (failover safety).
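
For question 5, the slot mapping and the hash-tag rule can be sketched in a few lines of Python. This is an illustration only, not a client library: it assumes that binascii.crc_hqx with an initial value of 0 is the same CRC16 variant the Redis Cluster specification uses, and the key names are made up. A real cluster-aware client does this routing for you.

```python
import binascii

HASH_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that is hashed.

    If the key contains '{' followed later by '}' with at least one
    character between them, only that substring is hashed. Otherwise
    the whole key is hashed.
    """
    start = key.find(b"{")
    if start == -1:
        return key
    end = key.find(b"}", start + 1)
    if end == -1 or end == start + 1:  # no closing brace, or empty "{}"
        return key
    return key[start + 1:end]


def key_slot(key: str) -> int:
    """slot = CRC16(hashed part of key) mod 16384."""
    data = hash_tag(key.encode("utf-8"))
    # Assumption: crc_hqx with initial value 0 matches the CRC16 used by Redis Cluster.
    return binascii.crc_hqx(data, 0) % HASH_SLOTS


if __name__ == "__main__":
    # Hypothetical keys: the shared tag {user:42} puts both on one slot,
    # so a multi-key command across them is allowed.
    print(key_slot("{user:42}:cart"), key_slot("{user:42}:profile"))
    # Without a tag, related keys usually land on different slots.
    print(key_slot("user:42:cart"), key_slot("user:42:profile"))
```

You can compare the result against a live cluster with CLUSTER KEYSLOT, which returns the slot the server itself computes for a key.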

Quick check

  1. Single-node, no replica, flushes the cache on maintenance — which Redis tier?
  2. You need to scale a Redis workload past a single machine’s RAM and throughput, online. Which offering?
  3. Your Redis “stopped accepting writes” when full. What is the most likely maxmemory-policy, and what should it be for a cache?
  4. True/false: a Cloud Run service can reach Memorystore over its private IP with no extra configuration.
  5. Which persistence option gives the smallest data-loss window, and roughly how small?

Answers

  1. Basic tier (single node, no failover, flushes on maintenance/scaling).
  2. Memorystore for Redis Cluster (sharded, horizontally scalable, online resharding).
  3. Likely noeviction (or a volatile-* policy with no TTLs); for a cache set allkeys-lru (and use TTLs).
  4. False — Cloud Run needs Direct VPC egress or a Serverless VPC Access connector to reach the private IP.
  5. AOF with appendfsync everysec — at most about 1 second of writes at risk.
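
Answer 3 is easier to remember with a toy model of which keys each policy family is allowed to evict. The sketch below is a deliberate simplification and not how Redis is implemented (Redis samples keys and ranks them by recency or frequency); the function name and the sample keys are hypothetical. It only shows eligibility: with a volatile-* policy and no TTLs, the candidate list is empty, exactly as with noeviction.

```python
from typing import Dict, List, Optional


def eviction_candidates(policy: str, ttls: Dict[str, Optional[int]]) -> List[str]:
    """Return the keys a policy is *allowed* to evict (toy model).

    ttls maps each key to its TTL in seconds, or None if the key has no expiry.
    """
    if policy == "noeviction":
        return []  # nothing is evicted; writes are rejected when full
    if policy.startswith("allkeys-"):
        return sorted(ttls)  # every key is eligible
    if policy.startswith("volatile-"):
        # only keys that carry a TTL are eligible
        return sorted(k for k, ttl in ttls.items() if ttl is not None)
    raise ValueError(f"unknown policy: {policy}")


if __name__ == "__main__":
    no_ttls = {"session:a": None, "session:b": None}
    mixed = {"session:a": 300, "config:site": None}

    print(eviction_candidates("allkeys-lru", no_ttls))   # both keys
    print(eviction_candidates("volatile-lru", no_ttls))  # empty list
    print(eviction_candidates("volatile-lru", mixed))    # only the key with a TTL
```

An empty candidate list on a full instance is the "stopped accepting writes" symptom from question 3.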

Exercise

Design and partially build a resilient cache tier in a sandbox project, justifying every choice:

  1. Set up Private Service Access on a VPC, then create a Standard-tier Memorystore for Redis instance with allkeys-lru eviction, a maintenance window, and in-transit TLS + AUTH enabled.
  2. From a VM, implement cache-aside in a short script against a small “database” (a file or Cloud SQL): read-through with a TTL, and invalidate on write. Demonstrate a hit, a miss, and a stale-then-corrected read after the TTL expires.
  3. Add a read replica and show reads served from the read endpoint; observe eventual consistency by reading immediately after a write.
  4. Trigger a manual failover and prove your client reconnects and continues against the preserved endpoint.
  5. Enable AOF, write some keys, restart/failover, and show the data survived; then export an RDB snapshot to Cloud Storage as a “backup”.
  6. Now re-architect on paper for 10× scale: lay out a Redis Cluster (shard count, replicas per shard, hash-tag strategy for multi-key ops, PSC connectivity) and write a paragraph on what changes in the client.
  7. Tear everything down and confirm the bill stops.

Write a short justification for each resilience and security choice (tier, eviction, persistence, TLS/AUTH, connectivity) and which failure or threat each addresses.
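
As a starting point for step 2, here is a minimal cache-aside sketch. Everything in it is an assumption made for illustration: TTLCache is an in-memory stand-in for the Redis GET, SET-with-expiry and DEL operations, the dict plays the small "database", and the function names are hypothetical. In the exercise you would swap the stand-in for a real Redis client pointed at your instance.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a Redis cache (get / set with TTL / delete)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:  # expired: behave like a missing key
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def read_product(cache: TTLCache, db: Dict[str, str], key: str,
                 ttl_seconds: float = 30.0) -> Tuple[str, str]:
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    cached = cache.get(key)
    if cached is not None:
        return cached, "hit"
    value = db[key]                      # the slow path
    cache.set(key, value, ttl_seconds)   # populate with a TTL
    return value, "miss"


def write_product(cache: TTLCache, db: Dict[str, str], key: str, value: str) -> None:
    """Write to the database first, then invalidate the cached copy."""
    db[key] = value
    cache.delete(key)


if __name__ == "__main__":
    db = {"sku:1": "v1"}
    cache = TTLCache()
    print(read_product(cache, db, "sku:1"))  # first read is a miss
    print(read_product(cache, db, "sku:1"))  # second read is a hit
    write_product(cache, db, "sku:1", "v2")
    print(read_product(cache, db, "sku:1"))  # miss again, now returns the new value
```

To demonstrate the stale-then-corrected read, change the database directly (bypassing write_product) and read again before and after the TTL expires. The injectable clock makes that easy to show without sleeping.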
