Azure Service Bus at Scale: Sessions, Deduplication, and Dead-Letter Handling

Azure Service Bus is the broker you reach for when “fire a message and hope” is no longer acceptable — when you need ordering per customer, no duplicate side effects, and a place for poison messages to land instead of taking down a consumer in a tight retry loop. The primitives that deliver this (sessions, duplicate detection, PeekLock, dead-letter queues) are individually simple and collectively easy to misuse. Get the lock model wrong and you double-process under load; get session affinity wrong and your “ordered” queue silently interleaves; forget the DLQ and a single malformed message stalls a partition for hours.

This guide builds the patterns the way they survive production. Examples use the Azure.Messaging.ServiceBus .NET SDK (the supported successor to Microsoft.Azure.ServiceBus and WindowsAzure.ServiceBus) plus az servicebus CLI for provisioning. The concepts map directly to the Java, Python, and JavaScript SDKs.

Tiers matter. Sessions, duplicate detection, and topics all require the Standard or Premium tier — the Basic tier gives you queues only. Anything throughput- or latency-sensitive belongs on Premium, which gives dedicated capacity (messaging units), predictable latency, and a hard 100 MB max message size. This guide assumes Standard at minimum.

1. Queues vs topics/subscriptions: choose the fan-out first

A queue is point-to-point: many senders, many competing consumers, each message delivered to exactly one consumer. A topic is publish/subscribe: senders publish once, and every subscription gets its own independent copy with its own cursor, DLQ, and filters.

The decision is not “which is better” — it is how many independent readers does this message need.

Need	Use
One logical consumer group competing on work	Queue
Multiple teams/services react to the same event independently	Topic + subscriptions
Routing the same event differently by content	Topic with SQL/correlation filters per subscription
Per-key ordering	Either — enable sessions on the queue or subscription

A subscription behaves like a queue with a filter in front. Everything below about PeekLock, sessions, lock renewal, and dead-lettering applies identically to a subscription’s receiver. Provision a namespace, a sessioned queue, and a topic:

RG=rg-sb-orders
NS=sb-orders-prod          # must be globally unique
LOC=eastus

az group create -n $RG -l $LOC
az servicebus namespace create -g $RG -n $NS -l $LOC --sku Premium --capacity 1

# Sessioned, duplicate-detected work queue
az servicebus queue create -g $RG --namespace-name $NS -n orders \
  --enable-session true \
  --enable-duplicate-detection true \
  --duplicate-detection-history-time-window PT10M \
  --max-delivery-count 10 \
  --lock-duration PT1M \
  --default-message-time-to-live P14D

# Topic with two subscriptions
az servicebus topic create -g $RG --namespace-name $NS -n order-events \
  --enable-duplicate-detection true
az servicebus topic subscription create -g $RG --namespace-name $NS \
  --topic-name order-events -n billing --max-delivery-count 10
az servicebus topic subscription create -g $RG --namespace-name $NS \
  --topic-name order-events -n analytics --max-delivery-count 10

--enable-session, --enable-duplicate-detection, and partitioning are immutable after creation. You cannot toggle them on an existing entity — you create a new one and migrate. Decide up front.

2. Ordered processing with sessions

Service Bus does not guarantee global FIFO on a plain queue — competing consumers and redelivery break ordering. Ordering is guaranteed only within a session. A session is a logical group identified by the SessionId you set on each message. All messages sharing a SessionId are delivered in order, to a single consumer at a time, who holds an exclusive lock on that session.

The right session key is your ordering boundary: CustomerId, AggregateId, DeviceId — never a constant (that serializes everything) and never unique-per-message (that defeats the point).

await using var client = new ServiceBusClient(fullyQualifiedNamespace,
    new DefaultAzureCredential());
var sender = client.CreateSender("orders");

var msg = new ServiceBusMessage(BinaryData.FromObjectAsJson(order))
{
    SessionId = order.CustomerId,           // ordering boundary
    MessageId = order.OrderId,              // drives dedup (step 3)
    ContentType = "application/json",
    Subject = "OrderPlaced",
};
await sender.SendMessageAsync(msg);

On the consumer side, use a session processor. It locks one session, drains it in order, then moves to the next free session — concurrency scales by number of active sessions, not message count:

var processor = client.CreateSessionProcessor("orders", new ServiceBusSessionProcessorOptions
{
    MaxConcurrentSessions = 8,              // 8 sessions in parallel
    MaxConcurrentCallsPerSession = 1,       // keep order within a session
    AutoCompleteMessages = false,           // complete explicitly on success
    SessionIdleTimeout = TimeSpan.FromSeconds(30),
});

processor.ProcessMessageAsync += async args =>
{
    var order = args.Message.Body.ToObjectFromJson<Order>();
    await HandleAsync(order, args.CancellationToken);
    await args.CompleteMessageAsync(args.Message);   // advance the session cursor
};
processor.ProcessErrorAsync += args =>
{
    log.LogError(args.Exception, "Session error on {Entity}", args.EntityPath);
    return Task.CompletedTask;
};
await processor.StartProcessingAsync();

Session state

Each session carries a small session state blob — server-side scratch space keyed to the SessionId, surviving across consumers and redeliveries. Use it as a checkpoint or saga cursor so a consumer that picks up an existing session knows where it left off:

processor.ProcessMessageAsync += async args =>
{
    var stateBytes = await args.GetSessionStateAsync();
    var cursor = stateBytes is null
        ? new SagaCursor()
        : stateBytes.ToObjectFromJson<SagaCursor>();

    cursor = await AdvanceAsync(cursor, args.Message);

    await args.SetSessionStateAsync(BinaryData.FromObjectAsJson(cursor));
    await args.CompleteMessageAsync(args.Message);
};

Session state counts against the entity’s storage quota, so keep it to a cursor or a few IDs — not the whole aggregate.

3. Duplicate detection for idempotent producers

At-least-once delivery means a sender that times out and retries can enqueue the same logical message twice. Duplicate detection makes the enqueue idempotent: within the configured history window, Service Bus drops any message whose MessageId it has already seen on that entity, silently and server-side.

# 10-minute dedup window set in step 1:
#   --enable-duplicate-detection true
#   --duplicate-detection-history-time-window PT10M

The contract is simple and strict:

You must set a deterministic MessageId derived from the business event (OrderId, a hash of the payload) — not a fresh GUID per send.
The window is a trade-off: longer windows catch slower retries but cost more throughput and storage. PT10M handles SDK retries and brief outages; PT1H covers a consumer-driven replay. The maximum is 7 days on Premium (1 day on Standard).
Dedup is per-entity and covers only the enqueue. It does not make your handler idempotent.

“Exactly-once-ish” is the honest framing. Dedup gives you exactly-once enqueue inside the window. End-to-end you still get at-least-once delivery (PeekLock can redeliver), so the consumer side must also be idempotent — typically an upsert keyed by MessageId or a processed-IDs table. Dedup and an idempotent handler are complementary, not redundant.

4. PeekLock vs ReceiveAndDelete, and lock renewal

There are two receive modes, and the choice is a data-safety decision:

ReceiveAndDelete removes the message the instant it is delivered. One network hop, fastest throughput, zero redelivery. If your consumer crashes mid-process, the message is gone. Use only for telemetry where loss is acceptable.
PeekLock (the default, and what you almost always want) delivers the message and places a time-bound lock on it. You then explicitly Complete (success — remove it), Abandon (release immediately for redelivery), DeadLetter (route to the DLQ), or Defer. If the lock expires before you act, the message is redelivered and its delivery count increments.

The trap is the lock duration. LockDuration maxes out at 5 minutes. A handler that runs longer than the lock loses it mid-flight, the message is redelivered, and now two consumers process it — the classic double-processing bug. Do not crank the lock to 5 minutes and hope; renew the lock for genuinely long handlers.

The processor renews automatically up to MaxAutoLockRenewalDuration — set it to your realistic worst-case handler time:

var processor = client.CreateProcessor("orders", new ServiceBusProcessorOptions
{
    ReceiveMode = ServiceBusReceiveMode.PeekLock,
    MaxConcurrentCalls = 16,
    PrefetchCount = 0,                                     // see step 8
    AutoCompleteMessages = false,
    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(10), // renew past LockDuration
});

If you receive messages manually instead of via the processor, renew explicitly before the lock window closes:

var receiver = client.CreateReceiver("orders");
var message = await receiver.ReceiveMessageAsync();
try
{
    using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(8));
    await receiver.RenewMessageLockAsync(message);   // call again as needed for very long work
    await DoLongWorkAsync(message, cts.Token);
    await receiver.CompleteMessageAsync(message);
}
catch (Exception ex)
{
    // Surface the reason on the DLQ so the re-drive processor can triage it.
    await receiver.DeadLetterMessageAsync(message,
        deadLetterReason: "ProcessingFailed",
        deadLetterErrorDescription: ex.Message);
}

Rule of thumb: keep LockDuration at 1 minute and let renewal extend it. A short base lock means a crashed consumer’s messages free up fast; renewal keeps a healthy slow consumer from losing its lock. Setting a 5-minute base lock gets you the worst of both — slow recovery from crashes with no protection past 5 minutes.

5. Dead-letter queues and a re-drive processor

Every queue and subscription has a system-managed dead-letter sub-queue at the address <entity>/$DeadLetterQueue. Messages land there for a handful of reasons:

MaxDeliveryCountExceeded — abandoned/lock-expired more than MaxDeliveryCount times (set it to 10, not the default of 10 by accident — pick deliberately).
TTLExpiredException — the message outlived its time-to-live (enable DeadLetteringOnMessageExpiration to capture these instead of dropping them).
HeaderSizeExceeded, or a subscription with dead-lettering on filter evaluation errors enabled.
Application dead-lettering — your handler called DeadLetterMessageAsync because the payload is unprocessable (bad schema, references a deleted entity).

The DLQ is a real queue: it does not auto-expire by default and it does not auto-empty. A DLQ filling up silently is one of the most common Service Bus incidents. Alert on its depth and build a re-drive processor to inspect, fix, and replay.

// Read the DLQ, log the reason, and either re-drive or discard.
var dlqReceiver = client.CreateReceiver("orders", new ServiceBusReceiverOptions
{
    SubQueue = SubQueue.DeadLetter,        // resolves to orders/$DeadLetterQueue
});
var resender = client.CreateSender("orders");

await foreach (var dead in dlqReceiver.ReceiveMessagesAsync())
{
    var reason = dead.DeadLetterReason;
    var desc   = dead.DeadLetterErrorDescription;
    log.LogWarning("DLQ {MessageId}: {Reason} / {Desc}", dead.MessageId, reason, desc);

    if (IsTransient(reason))
    {
        // Copy a NEW message from the dead one and resubmit to the main queue.
        var replay = new ServiceBusMessage(dead)        // copies body + app properties
        {
            MessageId = dead.MessageId,                 // preserve dedup identity
            SessionId = dead.SessionId,                 // preserve ordering boundary
        };
        await resender.SendMessageAsync(replay);
        await dlqReceiver.CompleteMessageAsync(dead);   // remove from DLQ only after re-send
    }
    else
    {
        await ArchiveForManualReviewAsync(dead);
        await dlqReceiver.CompleteMessageAsync(dead);
    }
}

You cannot move a message out of the DLQ in place — there is no “resubmit” verb. The pattern is always receive from $DeadLetterQueue, send a fresh copy to the source, then complete the dead-lettered one. Use new ServiceBusMessage(deadMessage) so the body and application properties carry over, and re-send after the new message is accepted so a crash mid-redrive never loses the message.

6. Subscription filters: SQL and correlation rules

On topics, each subscription decides which published messages it keeps via rules. A subscription created without an explicit rule gets a default 1=1 (match-all). For routing, attach filters:

CorrelationFilter — matches on system properties (Subject/Label, CorrelationId, MessageId, To, ReplyTo) and named application properties by exact equality. It is indexed and the cheapest filter — prefer it.
SQLFilter — a SQL-92-like boolean over system and application properties (<, >, LIKE, IN, AND/OR). More expressive, more expensive to evaluate.

# billing only wants high-value OrderPlaced events -> SQL filter
az servicebus topic subscription rule create -g $RG --namespace-name $NS \
  --topic-name order-events --subscription-name billing -n high-value \
  --filter-sql-expression "Subject = 'OrderPlaced' AND amount > 1000"

# analytics wants everything with region = 'emea' -> cheap correlation filter
az servicebus topic subscription rule create -g $RG --namespace-name $NS \
  --topic-name order-events --subscription-name analytics -n emea \
  --correlation-filter '{"properties": {"region": "emea"}}'

The sender sets those properties so filters have something to match:

var evt = new ServiceBusMessage(BinaryData.FromObjectAsJson(order))
{
    Subject = "OrderPlaced",
    CorrelationId = order.CorrelationId,
};
evt.ApplicationProperties["amount"] = order.Total;   // visible to SQL filters
evt.ApplicationProperties["region"] = order.Region;  // visible to correlation filters
await topicSender.SendMessageAsync(evt);

If you add a custom rule, delete the default $Default rule — otherwise the subscription matches everything and your filter, and you wonder why analytics is getting low-value orders. New custom rule, drop the default.

7. Auto-forwarding, scheduled messages, and deferral

Three features that cover most “I need to delay or chain this” requirements without external infrastructure:

Auto-forwarding chains an entity to another in the same namespace — a subscription forwards to a queue, or a queue to a topic — fully server-side. Use it to fan a topic’s matched messages into per-team work queues, or to build a single ingestion endpoint:

az servicebus topic subscription update -g $RG --namespace-name $NS \
  --topic-name order-events -n billing \
  --forward-to billing-work        # matched messages flow straight to the billing queue

Scheduled messages are enqueued now but become visible only at a future time — native delayed delivery, no Quartz or cron loop:

var seq = await sender.ScheduleMessageAsync(
    reminderMessage,
    DateTimeOffset.UtcNow.AddHours(24));   // visible in 24h
// Cancel before it fires if the situation changes:
await sender.CancelScheduledMessageAsync(seq);

Deferral is for “I received this, but I cannot process it yet” — an out-of-order step in a saga, or a dependency not ready. The message is set aside (kept off the active stream) and can only be retrieved later by its sequence number, which you must persist:

if (!ReadyToProcess(message))
{
    await receiver.DeferMessageAsync(message);
    await SaveForLaterAsync(message.SessionId, message.SequenceNumber); // you own this
    return;
}
// Later, once the dependency arrives:
var deferred = await receiver.ReceiveDeferredMessageAsync(savedSequenceNumber);
await Process(deferred);
await receiver.CompleteMessageAsync(deferred);

Deferral’s catch: a deferred message is invisible to normal receive. If you lose the sequence number you have effectively leaked the message until its TTL expires. Persist SequenceNumber durably (the session state in step 2 is a natural home) before you defer.

8. Scaling consumers, prefetch, and Premium throttling

Throughput on Service Bus is a function of consumer concurrency, prefetch, and — on Premium — provisioned capacity.

Concurrency. MaxConcurrentCalls (or MaxConcurrentSessions) sets how many messages a single processor handles in parallel. Scale out by running more consumer instances; competing consumers split the load automatically. Sessions cap effective parallelism at the number of active sessions, so a low session cardinality is itself a throughput ceiling.
Prefetch. PrefetchCount pulls N extra messages into a local buffer to hide round-trip latency. It is a throughput win and a correctness trap: prefetched messages hold their locks while sitting in the buffer. If PrefetchCount is large and handlers are slow, buffered locks expire before you touch them, the messages redeliver, and delivery counts climb toward the DLQ. Start at 0, raise to roughly MaxConcurrentCalls * (1 to 3) only for short, high-rate handlers, and never combine large prefetch with long processing.
Premium capacity. Premium is sold in messaging units (MU) — 1, 2, 4, 8, 16. Each MU is isolated, predictable capacity. When you exceed it you get throttling (HTTP 429-equivalent ServerBusyException), not failure; the SDK backs off and retries. Watch the ThrottledRequests, ServerErrors, and ActiveMessages metrics and scale MUs when throttling becomes sustained rather than spiky.

# Scale Premium capacity up to 4 messaging units under sustained load
az servicebus namespace update -g $RG -n $NS --capacity 4

The default SDK retry policy already handles transient ServerBusyException with exponential backoff; tune it only with evidence:

var client = new ServiceBusClient(fullyQualifiedNamespace, new DefaultAzureCredential(),
    new ServiceBusClientOptions
    {
        RetryOptions = new ServiceBusRetryOptions
        {
            Mode = ServiceBusRetryMode.Exponential,
            MaxRetries = 5,
            MaxDelay = TimeSpan.FromSeconds(30),
        },
    });

Verify

Prove each guarantee before you trust it:

# Queue + DLQ depth, delivery config — watch DLQ count, it should not grow unbounded.
az servicebus queue show -g $RG --namespace-name $NS -n orders \
  --query "{active: countDetails.activeMessageCount, dead: countDetails.deadLetterMessageCount, dup: requiresDuplicateDetection, session: requiresSession, maxDelivery: maxDeliveryCount}"

# Confirm subscription rules are what you think (no stray $Default left behind).
az servicebus topic subscription rule list -g $RG --namespace-name $NS \
  --topic-name order-events --subscription-name billing -o table

Ordering: send 50 messages with the same SessionId and out-of-order payloads; assert the consumer observed them in send order.
Dedup: send the same MessageId twice inside the window; assert exactly one delivery.
Lock renewal: run a handler that sleeps 90s with a 60s LockDuration and MaxAutoLockRenewalDuration = 10m; assert no redelivery (delivery count stays 1).
DLQ: force a handler exception 11 times (MaxDeliveryCount = 10); assert the message appears under $DeadLetterQueue with reason MaxDeliveryCountExceeded.

KQL for the dead-letter rate, wired to an alert:

AzureMetrics
| where ResourceProvider == "MICROSOFT.SERVICEBUS"
| where MetricName == "DeadletteredMessages"
| summarize Dead = sum(Total) by Resource, bin(TimeGenerated, 5m)
| where Dead > 0

Enterprise scenario

A payments platform processed wallet transactions through a single Standard-tier queue with competing consumers. Each transaction was independent — until the product team shipped running balances. Now two debits on the same wallet, processed concurrently, could read the same starting balance and both succeed, overdrawing the account. They also hit duplicate charges: a gateway timeout made the upstream service resend, and both copies were processed.

The constraint was hard: strict per-wallet ordering and no duplicate debit, without serializing the entire queue (millions of wallets, thousands of transactions per second) and with a 6-week audit retention requirement on anything that failed.

They fixed it with three changes and no new infrastructure:

Sessions keyed on WalletId. Per-wallet ordering became absolute — a wallet’s transactions process one at a time, in order — while different wallets still ran fully parallel. Effective concurrency stayed high because session cardinality (number of active wallets) was enormous.
Duplicate detection with a deterministic MessageId set to the upstream transaction ID, on a PT1H window sized to the gateway’s retry envelope, backed by an idempotent UPSERT keyed on the same ID so a redelivery past the window still could not double-debit.
A DLQ re-drive processor moved to Premium for predictable latency, alerting on DeadletteredMessages > 0 and archiving non-transient failures to a Storage account for the 6-week audit trail before completing them.

The session consumer that closed the overdraw race:

var processor = client.CreateSessionProcessor("wallet-tx", new ServiceBusSessionProcessorOptions
{
    MaxConcurrentSessions = 32,            // 32 wallets in flight
    MaxConcurrentCallsPerSession = 1,      // strict order per wallet
    PrefetchCount = 0,                     // long DB transaction -> no buffered lock loss
    MaxAutoLockRenewalDuration = TimeSpan.FromMinutes(5),
    AutoCompleteMessages = false,
});

processor.ProcessMessageAsync += async args =>
{
    var tx = args.Message.Body.ToObjectFromJson<WalletTx>();
    // Idempotent debit: succeeds once per TxId even on redelivery.
    await ApplyDebitIfNewAsync(tx, idempotencyKey: args.Message.MessageId);
    await args.CompleteMessageAsync(args.Message);
};

Result: zero overdrafts and zero duplicate debits in the following quarter, with no message-level locking in their own code and no external coordination service — the ordering came from sessions, the dedup from MessageId plus an idempotent write, and the safety net from the DLQ.

Azure Service Bus at Scale: Sessions, Deduplication, and Dead-Letter Handling

1. Queues vs topics/subscriptions: choose the fan-out first

2. Ordered processing with sessions

Session state

3. Duplicate detection for idempotent producers

4. PeekLock vs ReceiveAndDelete, and lock renewal

5. Dead-letter queues and a re-drive processor

6. Subscription filters: SQL and correlation rules

7. Auto-forwarding, scheduled messages, and deferral

8. Scaling consumers, prefetch, and Premium throttling

Verify

Enterprise scenario

Checklist

Written by Vinod

Comments

Keep Reading

GPU Workloads and KAITO Inference on AKS: Node Pools, Drivers, and Autoscaling

Running the Managed Istio Add-on on AKS: mTLS, Ingress Gateways, and Egress Control

Secrets Store CSI Driver on AKS: Mounting Key Vault Secrets with Rotation and K8s Sync