Idempotency keys: the layer you're protecting isn't the one that bites you
Five layers. Each has a different failure mode. Here is the map.
The standard idempotency tutorial covers a payment API scenario: your client generates a UUID, attaches it to the request as an Idempotency-Key header, and your server stores the result the first time it processes it. Any retry sends the same UUID, the server returns the cached response, and the charge doesn't land twice. The pattern is correct. For a synchronous HTTP handler with a single database write and no downstream calls, it's close to complete.
The problem: most production backends don't have that shape. A typical payment or notification flow receives an HTTP request, writes to a database, publishes a message to a queue, calls an external payment provider, and when something fails mid-way, runs compensating logic. Each of those steps can produce a duplicate operation. An idempotency key at the HTTP layer handles exactly one of them.
This is the map of all five layers, what breaks at each one, and what fix actually applies.
Layer 1: the HTTP key (the one everyone adds)
The HTTP layer key is well-understood: generate a UUID, include it in the Idempotency-Key header, do an atomic check on the server — if the key exists, return the cached response; if not, process and store. The things most tutorials cover correctly:
- Atomic storage. Use INSERT ... ON CONFLICT DO NOTHING or Redis SETNX, not SELECT then INSERT. A non-atomic check lets two concurrent requests with the same key both proceed (see the sketch after this list).
- Full response storage. Store the HTTP status code and the response body, not a processed flag. Replaying means returning exactly the same response, not re-running the handler.
- TTL aligned to your retry window. Across payment processors and mobile clients, 24 hours is a reasonable floor. For multi-day workflows, align TTL per operation type to the longest retry window that operation might see.
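A minimal sketch of the first two points together, assuming a psycopg2-style connection and an illustrative idempotency_keys table with key, status_code, and response_body columns (the schema and names are assumptions, not a prescribed design):

```python
def claim_or_replay(conn, key: str):
    """Atomically claim an idempotency key.

    Returns None if this request owns the key (caller processes it, then
    stores the response), or the cached (status_code, body) pair to replay.
    """
    with conn.cursor() as cur:
        # Atomic claim: of two concurrent requests with the same key,
        # exactly one insert succeeds.
        cur.execute(
            "INSERT INTO idempotency_keys (key) VALUES (%s) "
            "ON CONFLICT (key) DO NOTHING",
            (key,),
        )
        if cur.rowcount == 1:
            return None  # first time seen: caller runs the handler
        # Key already claimed: replay the stored response verbatim.
        # A production version also needs an in-flight state for keys
        # claimed before their response has been stored.
        cur.execute(
            "SELECT status_code, response_body "
            "FROM idempotency_keys WHERE key = %s",
            (key,),
        )
        return cur.fetchone()
```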
One thing most tutorials understate: the client bears a real burden. The key must be generated and persisted before the first network call. If a mobile app generates the key in memory and then crashes before saving it to local storage, the retry arrives with a different key and the server processes it again. On iOS and Android, this is not an edge case — it's a common pattern in background sync and offline-first apps. Most SDK retry implementations handle the key correctly; most hand-rolled fetch wrappers do not.
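A sketch of the client-side ordering, using sqlite3 as a stand-in for the app's local storage and a placeholder network call; every name here is illustrative, the point is only persist first, send second:

```python
import sqlite3
import uuid

db = sqlite3.connect("pending_requests.db")
db.execute("CREATE TABLE IF NOT EXISTS pending (key TEXT PRIMARY KEY, payload TEXT)")

def send_with_key(key: str, payload: str) -> None:
    """Placeholder for the real HTTP call carrying the Idempotency-Key header."""

def submit(payload: str) -> str:
    # 1. Generate AND durably persist the key before any network call.
    key = str(uuid.uuid4())
    db.execute("INSERT INTO pending (key, payload) VALUES (?, ?)", (key, payload))
    db.commit()
    # 2. Only now send. If the app dies anywhere after the commit, the
    #    retry loop reads the same key back instead of minting a new one.
    send_with_key(key, payload)
    return key
```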
A practical note on key format: UUID v4 (random) works fine for most uses. For high-volume APIs where idempotency keys are stored in a database, UUID v7 (time-ordered) performs better on insert because it doesn't fragment B-tree indexes. The difference matters at tens of thousands of requests per second; below that, it doesn't.
This layer is where most teams correctly invest effort. It's not where most production incidents originate.
Layer 2: the database write race condition
Your HTTP idempotency check passes and the request proceeds to write to your database. At this point, you're relying on the assumption that only one request is doing this write at a time. That assumption frequently breaks.
Consider a subscription activation flow. Your API receives a valid payment confirmation, the idempotency check passes, and the handler sets subscription_status to 'active'. At the same moment, a webhook from your payment processor also fires — a different HTTP request, with a different idempotency key, also valid, also authorized. Both requests arrive within milliseconds of each other. Neither is a retry in the HTTP sense. Both pass their respective idempotency checks. Both attempt to write the same state transition.
The fix is not another idempotency key. It's a database-level constraint: a conditional update that only proceeds if the row is in the expected state.
```sql
UPDATE subscriptions
SET status = 'active', activated_at = now(), version = version + 1
WHERE user_id = $1
  AND status = 'pending'
  AND version = $2;
-- If rowcount = 0, something else already ran this transition.
-- Check the current state and handle it explicitly.
```

If this update returns zero rows affected, another request already ran the transition. The right response depends on your domain: sometimes you return success (the desired state was achieved), sometimes you return a conflict error. What you don't do is silently ignore the zero-row result; that's how partial updates slip through undetected.
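A sketch of the handler around that update, assuming a DB-API-style cursor; the explicit zero-row branch is the part that tends to get skipped:

```python
class ConflictError(Exception):
    pass

def activate_subscription(conn, user_id: str, expected_version: int) -> str:
    with conn.cursor() as cur:
        cur.execute(
            "UPDATE subscriptions "
            "SET status = 'active', activated_at = now(), version = version + 1 "
            "WHERE user_id = %s AND status = 'pending' AND version = %s",
            (user_id, expected_version),
        )
        if cur.rowcount == 1:
            return "activated"
        # Zero rows: another request already ran the transition, or the row
        # is in a state this handler doesn't expect. Read it back and decide.
        cur.execute(
            "SELECT status FROM subscriptions WHERE user_id = %s", (user_id,)
        )
        row = cur.fetchone()
        if row and row[0] == "active":
            return "already_active"  # desired state reached: often a success
        raise ConflictError(f"unexpected subscription state: {row}")
```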
This is an orthogonal guarantee to the HTTP layer key. The HTTP key says 'this request has been seen.' The database constraint says 'this state transition has already completed.' You need both, and neither substitutes for the other.
Layer 3: message queue consumers and at-least-once delivery
When your HTTP handler publishes a message to a queue — Kafka, SQS, RabbitMQ, Pub/Sub — you get at-least-once delivery semantics. The queue guarantees the message will be delivered at least once. If your consumer crashes after processing but before acknowledging, or if a consumer group rebalance happens at the wrong moment, the same message arrives again.
The idempotency key from the original HTTP request is not automatically in the message. Even if you include it, the consumer has to be written to check it. Most queue consumers are not idempotent by default — they're written to process each message as if it's the first time.
The standard fix: include a deterministic message-scoped identifier in the payload, and check it against a processed-messages table before doing any work.
```sql
-- Before processing the message, atomically claim it:
INSERT INTO processed_messages (message_id, processed_at)
VALUES ($1, now())
ON CONFLICT (message_id) DO NOTHING;
-- If rowcount = 0, another consumer instance already processed this message.
-- Acknowledge the message and return without doing work.
```

There's a subtlety: this pattern has the same race condition as the HTTP layer. Two consumer instances reading the same message during a rebalance can both pass the 'have I seen this?' check before either has written to the dedup table. The insert-on-conflict pattern handles this because the database's unique constraint enforces the serialization. If both instances attempt the insert simultaneously, exactly one succeeds; the other gets zero rows and skips processing.
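Putting the claim into a consumer, assuming a psycopg2-style connection and a message object with id and ack(); both are illustrative stand-ins for your queue client:

```python
def handle_message(conn, message) -> None:
    with conn:  # one transaction: a crash before commit releases the claim
        with conn.cursor() as cur:
            # Atomic claim; the unique constraint on message_id serializes
            # concurrent consumers, so exactly one insert succeeds.
            cur.execute(
                "INSERT INTO processed_messages (message_id, processed_at) "
                "VALUES (%s, now()) ON CONFLICT (message_id) DO NOTHING",
                (message.id,),
            )
            if cur.rowcount == 0:
                message.ack()  # already processed elsewhere: ack, do no work
                return
            do_work(cur, message)  # business logic in the same transaction
    # Commit succeeded. If this ack is lost, redelivery hits the dedup row.
    message.ack()

def do_work(cur, message) -> None:
    """Placeholder for the consumer's actual side effects."""
```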
The failure mode when this layer is absent is specific: a customer gets charged once at the HTTP layer, but the order creation message is processed twice, resulting in two orders. The HTTP idempotency key is clean. The queue consumer is the gap.
Layer 4: external API calls and the unknown result
Your service calls an external payment processor. The request goes out, then your network connection drops, or the provider's response is lost in transit. You receive no response. You don't know whether the charge went through.
Two things break in practice here, and they're worth naming separately.
First: many retry implementations generate a new idempotency key for each attempt. This defeats the purpose. If you send the same charge to Stripe with two different idempotency keys, Stripe will process it twice. Check your payment SDK's retry configuration explicitly; behavior varies by library and version, and a hand-rolled retry loop around the SDK will almost always mint a fresh key per attempt. The safe pattern: generate the key before making any call, store it alongside the charge record, and pass it explicitly on every attempt for that charge.
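A sketch of that pattern with the Stripe Python library; the charges table and its columns are assumptions, while the idempotency_key parameter is Stripe's documented mechanism:

```python
import uuid
import stripe

def create_charge(conn, order_id: str, amount_cents: int):
    with conn.cursor() as cur:
        # Generate the key once and persist it BEFORE calling the provider.
        # A retry for the same order reuses the stored key instead.
        cur.execute(
            "INSERT INTO charges (order_id, idempotency_key, status) "
            "VALUES (%s, %s, 'pending') ON CONFLICT (order_id) DO NOTHING",
            (order_id, str(uuid.uuid4())),
        )
        cur.execute(
            "SELECT idempotency_key FROM charges WHERE order_id = %s",
            (order_id,),
        )
        key = cur.fetchone()[0]
    conn.commit()  # the key must survive a crash that happens mid-call
    # Every attempt for this charge carries the same key, so the provider
    # can collapse duplicates even when our process restarts between tries.
    return stripe.PaymentIntent.create(
        amount=amount_cents,
        currency="usd",
        idempotency_key=key,
    )
```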
Second: the HTTP key on your own API and the idempotency key you pass to the payment provider are separate. Your system's idempotency key says 'I won't create two charge records.' The provider's idempotency key says 'I won't charge the card twice.' When you receive a timeout, your local charge record is in an indeterminate state. The provider's charge status is unknown.
The resolution path is not optional: after a timeout, query the provider for the charge by reference ID, check its status, and use that to resolve your local record. Teams that defer this to 'future work' end up resolving it manually during incidents. The reconciliation query should be a first-class part of your payment integration, with error handling that covers 'charge not found,' 'charge succeeded,' and 'charge failed' as distinct states.
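A sketch of that reconciliation path; provider.get_charge, NotFoundError, and the returned status strings are hypothetical stand-ins for your provider client and local schema:

```python
class NotFoundError(Exception):
    pass

def reconcile_charge(provider, charge) -> str:
    """Resolve a local charge record left indeterminate by a timeout."""
    try:
        remote = provider.get_charge(reference_id=charge.reference_id)
    except NotFoundError:
        # The request never reached the provider: safe to retry the create,
        # reusing the SAME idempotency key stored before the first attempt.
        return "retry_create"
    if remote.status == "succeeded":
        return "succeeded"   # the card WAS charged: mark the local record paid
    if remote.status in ("failed", "canceled"):
        return "failed"      # safe to retry fresh, or surface the error
    return "pending"         # still settling: poll again later
```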
Layer 5: saga compensation must be idempotent too
Sagas — sequences of local transactions coordinated without a distributed lock — are the standard pattern for workflows that span multiple services. If a step fails, the saga runs compensating transactions on the steps that already succeeded. The compensation for 'reserve inventory' is 'release inventory.' The compensation for 'create order' is 'cancel order.'
Compensating transactions must be idempotent. An orchestrator that retries a compensation step after a partial failure will call the same compensation twice. Releasing inventory twice means more inventory than you actually have. Cancelling an order that's already been cancelled can trigger a second cancellation notification to the customer.
The fix: scope idempotency to each saga step independently. A common approach is deriving a deterministic key from the saga instance ID and the step name:
```python
import hashlib

def step_key(saga_id: str, step_name: str) -> str:
    return hashlib.sha256(f"{saga_id}::{step_name}".encode()).hexdigest()[:32]

# Usage:
# compensation_key = step_key(saga.id, "release_inventory")
# Pass this key to the inventory service as its idempotency key.
```

This key is stable across retries, unique per step, and derivable without storing additional state.
The deeper problem: compensation logic is usually written late in the development cycle, tested on the happy path only, and deployed once. Idempotency bugs in compensation surface during incidents, when the orchestrator is retrying because something went wrong and you're least equipped to debug it.
Test compensation paths explicitly. For each saga step, write a test that runs the compensation twice and asserts the downstream effect happens exactly once. This is harder to set up than unit tests; it requires real or realistic downstream services. But it's the only way to confirm the guarantee holds under retry conditions.
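A sketch of such a test in pytest style; inventory_service and saga are hypothetical fixtures backed by a real or realistic inventory service, and step_key is the helper above:

```python
STARTING_STOCK = 100

def test_release_inventory_applies_once(inventory_service, saga):
    # Forward step: reserve 5 units.
    inventory_service.reserve(
        sku="widget", qty=5, key=step_key(saga.id, "reserve_inventory")
    )
    # Run the compensation twice, as a retrying orchestrator would after
    # a partial failure. Both calls carry the same step-scoped key.
    comp_key = step_key(saga.id, "release_inventory")
    inventory_service.release(sku="widget", qty=5, key=comp_key)
    inventory_service.release(sku="widget", qty=5, key=comp_key)
    # The release must apply exactly once: stock returns to the starting
    # level, not 5 units above it.
    assert inventory_service.stock("widget") == STARTING_STOCK
```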
What 'done right' looks like across all five layers
Idempotency is not a feature you add to your API endpoint. It's a property you have to achieve independently at each layer, because each layer has a different failure mode and a different fix.
| Layer | What breaks without it | Canonical fix |
|---|---|---|
| HTTP API | Two concurrent requests with the same key both land a create | Atomic SETNX or INSERT ON CONFLICT; store full response with status code |
| Database write | Two valid operations race to update the same row | Conditional update with version check; unique constraint on the transition |
| Queue consumer | At-least-once delivery runs consumer twice; downstream effect doubles | Dedup table with unique message ID; atomic insert-on-conflict before processing |
| External API call | Timeout leaves result ambiguous; retry without stable key charges twice | Caller-controlled stable idempotency key; build the reconciliation query path |
| Saga compensation | Orchestrator retries compensation; compensating action applies twice | Step-scoped key derived from saga ID and step name; test compensation under retry |
None of these guarantees subsumes another. A team that correctly implements HTTP-layer idempotency and skips the queue consumer dedup will hit a production incident that the HTTP metrics won't show. A team that gets all five layers right for the happy path but skips saga compensation testing will discover the gap during an incident.
Idempotency bugs are hard to reproduce because they depend on specific timing at a specific layer. Fixing the wrong layer first is the most common response: teams add an idempotency key to the API endpoint after a double charge, when the actual duplicate was in the queue consumer that processed the payment confirmation event. The HTTP key is clean. The consumer is the gap.
Audit each layer independently. The five-minute version: for each layer in your flow, ask whether the same operation landing twice would produce a duplicate effect, and whether you have a mechanism that prevents that. If you're not sure, the answer is no.