Three idempotency failure modes that only show up in production
The concurrent-request race, partial commits, and request-body mismatch — with the Postgres patterns that fix each one
A payment API accepted a retry. The client had seen a timeout and sent the request again with the same idempotency key. The first request had already committed — the network had dropped the response before it reached the client. Two charges hit the account. The duplicate-detection logic was in the code. It still happened.
The standard idempotency pattern is straightforward to describe: accept a key, check if it exists, return the cached response if so, otherwise run the operation and store the result. This handles sequential retries cleanly. It breaks in three specific ways that only surface under production conditions, and most guides don't address any of them.
What the standard pattern gets right
The core guarantee: if a client sends the same request twice with the same idempotency key, the operation runs once and both requests receive the same response. This protects against the most common failure mode: a client that retries after a network timeout or a clear server error, not knowing whether the original request succeeded.
For synchronous operations with clear success or failure outcomes, sequential retries against a working implementation will behave correctly. The problems below require concurrent traffic, partial system failures, or clients that make mistakes with their keys.
Failure mode 1: the concurrent-request race
Two requests arrive within milliseconds of each other, both carrying the same idempotency key. Both pass the 'does this key exist' check. Both proceed to run the operation. Two records are created, two charges applied, two emails sent.
This happens because the naive implementation does a SELECT followed by a separate INSERT, and the two steps are not atomic:
```python
# The SELECT-then-INSERT pattern has a race condition.
# Two requests can both pass this check before either inserts the key.
def handle_request_naive(db, key):  # illustrative handler name
    existing = db.execute(
        "SELECT * FROM idempotency_keys WHERE key = $1", key
    )
    if existing:
        return existing.response

    # Both concurrent requests can reach this point simultaneously.
    result = run_operation()
    db.execute(
        "INSERT INTO idempotency_keys (key, response) VALUES ($1, $2)",
        key, result
    )
    return result
```

The SELECT and INSERT are two separate operations. A small window exists between them, and under any meaningful load (or in any test environment that runs concurrent requests) two requests will both find no existing row and both proceed past the check.
The fix is to flip the order: insert first, using a unique constraint on the key to enforce that only one request can succeed. All others hit the conflict and read back whatever the successful insert left:
```sql
-- Composite primary key enforces uniqueness per caller.
-- The NOT NULL on request_hash forces you to record what was requested.
CREATE TABLE idempotency_keys (
    caller_id    TEXT NOT NULL,
    key          TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'pending',
    request_hash TEXT NOT NULL,
    response     JSONB,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMPTZ,
    PRIMARY KEY (caller_id, key)
);
```

```python
def handle_request(db, request, caller_id, key):  # illustrative signature
    try:
        # Atomically reserve the key. Fails immediately if already claimed.
        # Assumes autocommit (or a savepoint around this INSERT): in Postgres
        # a failed statement aborts the enclosing transaction.
        db.execute(
            """
            INSERT INTO idempotency_keys (caller_id, key, request_hash)
            VALUES ($1, $2, $3)
            """,
            caller_id, key, hash_request_body(request.body)
        )
    except UniqueViolationError:
        # request_hash is fetched too so the body-mismatch check can use it.
        existing = db.execute(
            """
            SELECT status, request_hash, response
            FROM idempotency_keys
            WHERE caller_id = $1 AND key = $2
            """,
            caller_id, key
        )
        if existing.status == 'pending':
            # In-flight: tell the client to retry after a short delay
            # (send a Retry-After header with the 409).
            return http_409({"error": "request_in_progress"})
        return http_response_from_stored(existing.response)

    # Only one request per key reaches this line.
    result = run_operation()
    db.execute(
        """
        UPDATE idempotency_keys
        SET status = 'completed', response = $1, completed_at = NOW()
        WHERE caller_id = $2 AND key = $3
        """,
        result, caller_id, key
    )
    return http_response_from_stored(result)
```

The 409 for 'pending' keys matters in practice. Clients with aggressive retry logic regularly send the same request again while the first is still processing. Returning a 409 with a Retry-After header tells such a client to wait before trying again. Blocking at the database layer until the in-flight request completes also works, but it adds connection pressure under high retry rates.
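The reservation step can also be written so that a conflict never raises. The sketch below uses `INSERT ... ON CONFLICT DO NOTHING` and inspects the affected row count. It runs against the standard-library `sqlite3` module purely to stay self-contained; that substitution, the trimmed schema, and the function name `reserve_key` are assumptions for illustration. Postgres accepts the same `ON CONFLICT ... DO NOTHING` clause, with its own placeholder syntax.

```python
import sqlite3

# Trimmed version of the idempotency_keys table (SQLite types, illustration only).
SCHEMA = """
CREATE TABLE idempotency_keys (
    caller_id    TEXT NOT NULL,
    key          TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'pending',
    request_hash TEXT NOT NULL,
    response     TEXT,
    PRIMARY KEY (caller_id, key)
)
"""

def reserve_key(conn: sqlite3.Connection, caller_id: str, key: str,
                request_hash: str) -> bool:
    """Return True if this call claimed the key, False if it was already claimed."""
    cur = conn.execute(
        """
        INSERT INTO idempotency_keys (caller_id, key, request_hash)
        VALUES (?, ?, ?)
        ON CONFLICT (caller_id, key) DO NOTHING
        """,
        (caller_id, key, request_hash),
    )
    conn.commit()
    # One affected row means the insert happened; zero means a conflict.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
first = reserve_key(conn, "cust_1", "retry-001", "hash-a")         # claims the key
second = reserve_key(conn, "cust_1", "retry-001", "hash-a")        # conflict, no error
other_caller = reserve_key(conn, "cust_2", "retry-001", "hash-b")  # different scope
```

A request that gets `False` back would then follow the same read-back path as the conflict branch: look up the row, return 409 if it is still pending, otherwise replay the stored response.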
Failure mode 2: the partial commit
The operation commits. The process crashes before the response leaves the server, or the client's connection drops after the request is accepted. The client has a timeout and no response. It retries with the same key.
If your implementation stores the idempotency record in the same transaction as the operation, this resolves cleanly. The client gets back the stored result on retry; the operation ran exactly once. If the record update is a separate step (a second transaction, a post-commit hook, an async write), you can end up with the operation committed but the key still showing 'pending'. The retry then either executes the operation again or returns a 409. Both are wrong.
```sql
BEGIN;

-- The actual operation.
INSERT INTO orders (customer_id, item_id, quantity)
VALUES ($1, $2, $3)
RETURNING id;

-- The idempotency record update in the same transaction.
-- $4 is the order id returned by the INSERT above.
UPDATE idempotency_keys
SET
    status = 'completed',
    response = jsonb_build_object(
        'status', 201,
        'body', jsonb_build_object('order_id', $4)
    ),
    completed_at = NOW()
WHERE caller_id = $5 AND key = $6;

COMMIT;

-- If the process crashes before COMMIT, both writes roll back.
-- The order was never created, and the key is still 'pending' from the
-- earlier reservation. For the retry to run the operation once, cleanly,
-- the handler needs a rule for taking over a stale 'pending' key (for
-- example, after a timeout); the 409 branch alone would block it indefinitely.
```

For operations that call external services (a payment processor, a notification API, a third-party webhook), the same-transaction guarantee isn't available. The approach there is to store the external call's result (transaction ID, external status, timestamp) in the idempotency record as part of marking it completed. On retry, you check whether the external call already succeeded and return that result without re-calling the external service.
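As a rough illustration of the external-service case, the sketch below reuses a stored result instead of calling the processor again. A plain dict stands in for the idempotency row, and `charge_with_stored_result`, `fake_processor`, and the field names are all hypothetical; a real implementation would persist the result in the `idempotency_keys` table.

```python
from typing import Callable, Optional

def charge_with_stored_result(record: dict,
                              call_processor: Callable[[], dict]) -> dict:
    """Call the external service at most once per idempotency record."""
    stored: Optional[dict] = record.get("external_result")
    if stored is not None:
        # A previous attempt already reached the processor: reuse its result.
        return stored
    result = call_processor()
    # Persist the external identifiers as part of marking the key completed.
    record["external_result"] = result
    record["status"] = "completed"
    return result

calls = []

def fake_processor() -> dict:
    # Stand-in for a real payment API call; records how often it is invoked.
    calls.append(1)
    return {"transaction_id": "txn_example_1", "external_status": "succeeded"}

record = {"status": "pending", "external_result": None}
first = charge_with_stored_result(record, fake_processor)
retry = charge_with_stored_result(record, fake_processor)
```

This does not cover a crash between the external call and the write that stores its result. That window is one reason to also pass an idempotency key to the downstream service itself.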
Failure mode 3: request-body mismatch
A client reuses an idempotency key with different request data. This is almost always a client-side bug: a retry that accidentally modified the payload, or a key-generation scheme that produced collisions across unrelated requests. Without a body hash check, the server returns the original response for a completely different operation.
Store a hash of the canonical request body alongside each key. On subsequent requests, compare hashes before returning the stored result:
```python
import hashlib
import json

# Example only: list the fields that determine the outcome for your endpoint.
CANONICAL_FIELDS = ("customer_id", "item_id", "quantity")

def hash_request_body(body: dict) -> str:
    # Include only fields that determine the operation outcome.
    # Exclude: timestamps, client request IDs, logging metadata.
    canonical = {k: body[k] for k in CANONICAL_FIELDS if k in body}
    # sort_keys and fixed separators make the serialization deterministic.
    serialized = json.dumps(canonical, sort_keys=True, separators=(',', ':'))
    return hashlib.sha256(serialized.encode()).hexdigest()
```

```python
# When an existing key is found, compare hashes before returning the stored
# response (the lookup must also SELECT request_hash):
if existing and existing.request_hash != hash_request_body(request.body):
    return http_422({"error": "idempotency_key_reused_with_different_body"})
```

The CANONICAL_FIELDS list should include only the fields that determine the operation's outcome. Exclude fields like timestamps, client-generated request IDs, or logging metadata that legitimately differ across retries. Hash only what the operation actually acts on.
| Failure mode | What breaks | Root cause | Fix |
|---|---|---|---|
| Concurrent-request race | Duplicate operations under concurrent load | SELECT then INSERT is not atomic | Insert first under a unique constraint (catch the violation or use INSERT ... ON CONFLICT DO NOTHING); return 409 for pending keys |
| Partial commit | Double execution after crash or lost response | Key update and operation in separate transactions | Single transaction for both; store external call IDs for non-DB ops |
| Request-body mismatch | Stored result returned for a different operation | No body hash stored with the key | Hash canonical fields; return 422 on mismatch |
Key scope, expiry, and what to store
Idempotency keys must be scoped per caller. A key of 'retry-001' from one customer is not the same request as 'retry-001' from another. The primary key for an idempotency record is (caller_id, key), where caller_id comes from your authentication context: API key, user ID, or OAuth client ID. A global key namespace lets clients accidentally or deliberately collide with other customers' keys.
Key retention should match your retry window. 24 hours covers most synchronous API use cases. For workflows where clients can legitimately retry days later, 7 days is more practical. Clean up expired keys with a background job on a schedule; don't delete inline during request processing, where it adds latency to the hot path.
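A scheduled cleanup can be as small as one DELETE. The sketch below is an illustration under stated assumptions: it uses `sqlite3` with `created_at` stored as a Unix timestamp so that it runs anywhere, and the names `purge_expired_keys` and `RETENTION` are hypothetical. In Postgres the equivalent predicate would be `created_at < NOW() - INTERVAL '24 hours'`.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=24)  # assumption: match this to your retry window

def purge_expired_keys(conn: sqlite3.Connection, now: datetime) -> int:
    """Delete keys older than RETENTION and return how many rows were removed."""
    cutoff = (now - RETENTION).timestamp()
    cur = conn.execute(
        "DELETE FROM idempotency_keys WHERE created_at < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE idempotency_keys ("
    "caller_id TEXT NOT NULL, key TEXT NOT NULL, created_at REAL NOT NULL, "
    "PRIMARY KEY (caller_id, key))"
)
now = datetime(2025, 1, 2, 12, 0, tzinfo=timezone.utc)  # arbitrary fixed clock
rows = [
    ("cust_1", "old-key", (now - timedelta(hours=30)).timestamp()),
    ("cust_1", "fresh-key", (now - timedelta(hours=1)).timestamp()),
]
conn.executemany("INSERT INTO idempotency_keys VALUES (?, ?, ?)", rows)

removed = purge_expired_keys(conn, now)
remaining = [r[0] for r in conn.execute("SELECT key FROM idempotency_keys")]
```

Passing `now` in explicitly keeps the job easy to test; a scheduler would call it with the current UTC time.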
Store the full HTTP response in the idempotency record: status code plus body. A 201 Created and a 200 OK are different responses; returning a 200 for a request that originally produced a 201 is wrong. A minimal stored response looks like:
{"status": 201, "body": {"order_id": "ord_abc123"}}
Don't store response headers unless a specific header varies per request and the client depends on receiving it again on retry. The Location header for a 201 is worth including if your API uses it. Tracing headers, Content-Type, and infrastructure metadata aren't worth the complexity.
Testing the cases that break in production
Unit tests that mock the database verify the happy path. The concurrent-request race requires two real requests hitting a real database simultaneously; most test suites never run this:
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# TEST_CALLER_ID: the caller the api_client fixture authenticates as
# (defined in test setup).

def test_concurrent_same_key(api_client, db):
    key = "test-key-concurrent"
    payload = {"item_id": "widget-001", "quantity": 1}

    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [
            executor.submit(
                api_client.post, "/orders",
                json=payload,
                headers={"Idempotency-Key": key}
            )
            for _ in range(2)
        ]
        responses = [f.result() for f in as_completed(futures)]

    # Acceptable outcomes: (201, 201 with same order_id) or (201, 409).
    # Unacceptable: two 201s with different order IDs.
    assert all(r.status_code in (201, 409) for r in responses)
    successful = [r for r in responses if r.status_code == 201]
    order_ids = {r.json()["order_id"] for r in successful}
    assert len(order_ids) == 1, (
        f"Race condition: {len(order_ids)} distinct order IDs created for the same key"
    )

    row_count = db.execute(
        "SELECT COUNT(*) FROM idempotency_keys WHERE caller_id = $1 AND key = $2",
        TEST_CALLER_ID, key
    ).scalar()
    assert row_count == 1
```

For the partial-commit case, test that a key already in 'completed' status returns the stored response rather than executing the operation again. Insert the idempotency record manually in your test setup, then verify the endpoint returns the stored result and the operation table remains unchanged.
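The shape of that pre-completed-key test can be sketched without a database. Everything here is a stand-in: `handle_create_order` is a toy in-memory handler and the dict and list replace the real tables. In an integration suite the same assertions would run against the real endpoint and a real Postgres instance.

```python
def handle_create_order(keys: dict, orders: list, caller_id: str,
                        key: str, payload: dict) -> dict:
    """Toy handler: replay a completed key, otherwise create the order."""
    record = keys.get((caller_id, key))
    if record is not None and record["status"] == "completed":
        return record["response"]
    order_id = f"ord_{len(orders) + 1}"
    orders.append({"id": order_id, **payload})
    response = {"status": 201, "body": {"order_id": order_id}}
    keys[(caller_id, key)] = {"status": "completed", "response": response}
    return response

def test_completed_key_returns_stored_response():
    stored = {"status": 201, "body": {"order_id": "ord_abc123"}}
    # Setup: insert the idempotency record manually, already completed.
    keys = {("cust_1", "test-key-completed"): {"status": "completed",
                                               "response": stored}}
    orders = []
    response = handle_create_order(
        keys, orders, "cust_1", "test-key-completed",
        {"item_id": "widget-001", "quantity": 1},
    )
    assert response == stored  # stored result is replayed as-is
    assert orders == []        # the operation table is unchanged

test_completed_key_returns_stored_response()
```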
These two tests, concurrent submission and pre-completed key, cover the cases that slip through in most implementations. Add them to your integration test suite now, before the first customer reports a duplicate charge, and they will catch regressions as the codebase evolves.
Idempotency across service boundaries
When your endpoint calls a downstream service as part of its operation — a payment processor, an inventory system, a notification service — idempotency needs to propagate to those calls.
The standard pattern is key derivation: from the original client key, generate a deterministic key for each downstream call. If the original key is 'client-retry-abc', pass 'client-retry-abc:payment' to the payment API and 'client-retry-abc:inventory' to the inventory service. Each downstream service handles its own idempotency check against the derived key.
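Key derivation can be a one-line function. In the sketch below, `derive_downstream_key` is a hypothetical helper. The caller prefix goes beyond the example in the text: it assumes the downstream service sees a single key namespace, so two of your callers using the same raw key must not collide there.

```python
def derive_downstream_key(caller_id: str, client_key: str, service: str) -> str:
    """Build a deterministic idempotency key for one downstream call."""
    # Same inputs always produce the same key, so a retried request
    # reaches each downstream service with the key it used the first time.
    return f"{caller_id}:{client_key}:{service}"

payment_key = derive_downstream_key("cust_1", "client-retry-abc", "payment")
inventory_key = derive_downstream_key("cust_1", "client-retry-abc", "inventory")
other_caller_key = derive_downstream_key("cust_2", "client-retry-abc", "payment")
```

If a downstream API limits key length or character set, hashing the derived string is one option; check that service's documentation before relying on either form.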
Don't assume downstream services are idempotent by default. External payment processors like Stripe deduplicate a request only when you send an explicit idempotency key with it. Many internal services have no idempotency at all. Map every non-idempotent external call in your operation and add explicit keys. The derived-key pattern gives you uniqueness without requiring the original caller to know anything about your internal service topology.
The three patterns above (insert-first reservation under a unique constraint, same-transaction commit for partial-failure safety, and body-hash checking for key reuse) cover the practical failure surface of a production idempotency implementation. The concurrent-request integration test is the fastest way to find out whether your current implementation actually holds: two threads, same key, real database, verify one execution. If you haven't run that test, you don't yet know.