Frequently asked questions

How long should idempotency keys be retained?

24 hours covers most retry windows for synchronous APIs. For async or long-running operations where clients may retry days later, 7 days is more practical. Use a background job to clean up expired keys rather than deleting inline during request processing.

Should I use Redis or Postgres for idempotency keys?

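Postgres is the safer default if you're already using it. The unique constraint gives you atomic key reservation via INSERT ON CONFLICT, and the key update can share a transaction with the operation itself. Redis is faster but takes more care: SET with NX and EX reserves the key atomically, yet storing the response is a separate command that cannot join your database transaction. A crash between the two leaves the key reserved with no stored response, and any multi-step check-and-update on the key needs a Lua script to stay atomic.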

What happens if a client sends the same key with different request data?

Store a hash of the canonical request fields alongside the key. If a subsequent request matches an existing key but the hash differs, return a 422 — the client has reused a key incorrectly. Do not return the result of the original operation for a different payload.

How do I make idempotency work across service calls?

Derive a key per downstream call from the original client key. If the original key is 'client-abc', pass 'client-abc:payment' to the payment service and 'client-abc:inventory' to inventory. Each service handles its own idempotency check independently. Don't assume downstream services are idempotent by default — check the docs, and add explicit keys where needed.

Engineering | May 28, 2026 | 6 min read | Reviewed May 28, 2026

Three idempotency failure modes that only show up in production

The concurrent-request race, partial commits, and request-body mismatch — with the Postgres patterns that fix each one

By FlowVerify Editorial Team

A payment API accepted a retry. The client had seen a timeout and sent the request again with the same idempotency key. The first request had already committed — the network had dropped the response before it reached the client. Two charges hit the account. The duplicate-detection logic was in the code. It still happened.

The standard idempotency pattern is straightforward to describe: accept a key, check if it exists, return the cached response if so, otherwise run the operation and store the result. This handles sequential retries cleanly. It breaks in three specific ways that only surface under production conditions, and most guides don't address any of them.

What the standard pattern gets right

The core guarantee: if a client sends the same request twice with the same idempotency key, the operation runs once and both requests receive the same response. This protects against the most common failure mode: a client that retries after a network timeout or a clear server error, not knowing whether the original request succeeded.

For synchronous operations with clear success or failure outcomes, sequential retries against a working implementation will behave correctly. The problems below require either concurrent traffic, partial system failures, or clients that make mistakes with their keys.

Failure mode 1: the concurrent-request race

Two requests arrive within milliseconds of each other, both carrying the same idempotency key. Both pass the 'does this key exist' check. Both proceed to run the operation. Two records are created, two charges applied, two emails sent.

This happens because the naive implementation does a SELECT followed by a separate INSERT, and the two steps are not atomic:

broken-select-insert.py

# The SELECT-then-INSERT pattern has a race condition.
# Two requests can both pass this check before either inserts the key.
existing = db.execute(
    "SELECT * FROM idempotency_keys WHERE key = $1", key
)
if existing:
    return existing.response

# Both concurrent requests can reach this point simultaneously.
result = run_operation()
db.execute(
    "INSERT INTO idempotency_keys (key, response) VALUES ($1, $2)",
    key, result
)
return result

The SELECT and INSERT are two separate operations. A small window exists between them, and under any meaningful load (or in any test environment that runs concurrent requests) two requests will both find no existing row and both proceed past the check.

The fix is to flip the order: insert first, using a unique constraint on the key to enforce that only one request can succeed. All others hit the conflict and read back whatever the successful insert left:

schema.sql

-- Composite primary key enforces uniqueness per caller.
-- The NOT NULL on request_hash forces you to record what was requested.
CREATE TABLE idempotency_keys (
    caller_id    TEXT         NOT NULL,
    key          TEXT         NOT NULL,
    status       TEXT         NOT NULL DEFAULT 'pending',
    request_hash TEXT         NOT NULL,
    response     JSONB,
    created_at   TIMESTAMPTZ  NOT NULL DEFAULT NOW(),
    completed_at TIMESTAMPTZ,
    PRIMARY KEY (caller_id, key)
);

fixed-pattern.py

try:
    # Atomically reserve the key. Fails immediately if already claimed.
    db.execute(
        """
        INSERT INTO idempotency_keys (caller_id, key, request_hash)
        VALUES ($1, $2, $3)
        """,
        caller_id, key, hash_request_body(request.body)
    )
except UniqueViolationError:
    existing = db.execute(
        """
        SELECT status, response
        FROM idempotency_keys
        WHERE caller_id = $1 AND key = $2
        """,
        caller_id, key
    )
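    if existing.status == 'pending':
        # In-flight: tell the client to retry after a short delay.
        # The headers argument is illustrative; set Retry-After however
        # your framework builds responses.
        return http_409(
            {"error": "request_in_progress"},
            headers={"Retry-After": "1"},
        )
    return http_response_from_stored(existing.response)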

# Only one request per key reaches this line.
result = run_operation()

db.execute(
    """
    UPDATE idempotency_keys
    SET status = 'completed', response = $1, completed_at = NOW()
    WHERE caller_id = $2 AND key = $3
    """,
    result, caller_id, key
)
return http_response_from_stored(result)

The 409 for 'pending' keys matters in practice. If a client sends the same request again while the first is still processing, which happens regularly when clients implement aggressive retry logic, returning a 409 with a Retry-After header tells it to wait before trying again. Blocking at the database layer until the in-flight request completes also works but adds connection pressure under high retry rates.
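The reserve-first step can also be written with ON CONFLICT instead of catching the unique-violation exception. The sketch below is a minimal, hedged illustration: it uses Python's built-in sqlite3 module as a stand-in for Postgres so that it runs without a server, and the `make_db` and `reserve_key` names are assumptions made for this example, not part of any library. The `ON CONFLICT ... DO NOTHING` clause has the same shape in Postgres.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # In-memory SQLite stands in for Postgres in this sketch.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE idempotency_keys (
            caller_id    TEXT NOT NULL,
            key          TEXT NOT NULL,
            status       TEXT NOT NULL DEFAULT 'pending',
            request_hash TEXT NOT NULL,
            response     TEXT,
            PRIMARY KEY (caller_id, key)
        )
        """
    )
    return conn


def reserve_key(conn: sqlite3.Connection, caller_id: str, key: str,
                request_hash: str) -> bool:
    """Return True if this call claimed the key, False if it was already taken."""
    cur = conn.execute(
        """
        INSERT INTO idempotency_keys (caller_id, key, request_hash)
        VALUES (?, ?, ?)
        ON CONFLICT (caller_id, key) DO NOTHING
        """,
        (caller_id, key, request_hash),
    )
    conn.commit()
    # rowcount is 1 when the row was inserted and 0 when the conflict fired.
    return cur.rowcount == 1
```

A handler would call `reserve_key` first and run the operation only when it returns True; a False result leads to the status lookup and the 409 or replay branch described above.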

Failure mode 2: the partial commit

The operation commits. The process crashes before the response leaves the server, or the client's connection drops after the request is accepted. The client has a timeout and no response. It retries with the same key.

If your implementation stores the idempotency record in the same transaction as the operation, this resolves cleanly. The client gets back the stored result on retry; the operation ran exactly once. If the record update is a separate step (a second transaction, a post-commit hook, an async write), you can end up with the operation committed but the key still showing 'pending'. The retry then either executes the operation again or returns a 409. Both are wrong.

transactional-update.sql

BEGIN;

-- The actual operation.
INSERT INTO orders (customer_id, item_id, quantity)
VALUES ($1, $2, $3)
RETURNING id;

-- The idempotency record update in the same transaction.
-- $4 is the id returned by the INSERT above, bound by the application.
UPDATE idempotency_keys
SET
    status       = 'completed',
    response     = jsonb_build_object(
                       'status', 201,
                       'body',   jsonb_build_object('order_id', $4)
                   ),
    completed_at = NOW()
WHERE caller_id = $5 AND key = $6;

COMMIT;
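-- If the process crashes before COMMIT, both writes roll back.
-- The key is then still 'pending' with no committed order, so re-running the
-- operation is safe. The handler needs a rule for taking over a stale
-- 'pending' key (for example, a pending-age timeout) rather than
-- returning 409 indefinitely.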

For operations that call external services (a payment processor, a notification API, a third-party webhook), the same-transaction guarantee isn't available. The approach there is to store the external call's result (transaction ID, external status, timestamp) in the idempotency record as part of marking it completed. On retry, you check whether the external call already succeeded and return that result without re-calling the external service.
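A rough sketch of that flow follows. It is deliberately simplified: a plain dict stands in for the idempotency_keys table, and `IdempotencyRecord`, `charge_once`, and the `charge` callable are hypothetical names introduced for illustration rather than a real payment API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class IdempotencyRecord:
    status: str = "pending"
    external_id: Optional[str] = None  # e.g. the processor's transaction ID
    response: Optional[dict] = None


def charge_once(store: dict, caller_id: str, key: str,
                charge: Callable[[str], str]) -> dict:
    """Run `charge` at most once per (caller_id, key); replay the result afterwards."""
    record = store.setdefault((caller_id, key), IdempotencyRecord())
    if record.status == "completed":
        # The external call already succeeded: return its stored result.
        return record.response

    if record.external_id is None:
        # Pass a derived key so the processor can deduplicate on its side too.
        record.external_id = charge(f"{key}:payment")

    record.response = {
        "status": 201,
        "body": {"transaction_id": record.external_id},
    }
    record.status = "completed"
    return record.response
```

This does not close the window between the external call returning and its ID being stored. A crash there is only safe if the downstream service honours the derived key, which is why the key is passed along with the call.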

Failure mode 3: request-body mismatch

A client reuses an idempotency key with different request data. This is almost always a client-side bug: a retry that accidentally modified the payload, or a key-generation scheme that produced collisions across unrelated requests. Without a body hash check, the server returns the original response for a completely different operation.

Store a hash of the canonical request body alongside each key. On subsequent requests, compare hashes before returning the stored result:

hash-check.py

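import hashlib
import json

# CANONICAL_FIELDS is defined per endpoint: the names of the fields
# that determine the operation outcome.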

def hash_request_body(body: dict) -> str:
    # Include only fields that determine the operation outcome.
    # Exclude: timestamps, client request IDs, logging metadata.
    canonical = {k: body[k] for k in CANONICAL_FIELDS if k in body}
    serialized = json.dumps(canonical, sort_keys=True, separators=(',', ':'))
    return hashlib.sha256(serialized.encode()).hexdigest()

# When an existing key is found, compare hashes before returning:
if existing and existing.request_hash != hash_request_body(request.body):
    return http_422({"error": "idempotency_key_reused_with_different_body"})

The CANONICAL_FIELDS list should include only the fields that determine the operation's outcome. Exclude fields like timestamps, client-generated request IDs, or logging metadata that legitimately differ across retries. Hash only what the operation actually acts on.
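As a worked illustration, the snippet below applies the same hashing function to three payloads. The field list and the `sent_at` field are assumptions chosen for a hypothetical order-creation endpoint.

```python
import hashlib
import json

# Hypothetical field list for an order-creation endpoint.
CANONICAL_FIELDS = ("customer_id", "item_id", "quantity")


def hash_request_body(body: dict) -> str:
    canonical = {k: body[k] for k in CANONICAL_FIELDS if k in body}
    serialized = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(serialized.encode()).hexdigest()


original = {"customer_id": "cus_1", "item_id": "widget-001", "quantity": 1,
            "sent_at": "2026-05-28T10:00:00Z"}
# A genuine retry: same operation, different key order and timestamp.
retry = {"quantity": 1, "item_id": "widget-001", "customer_id": "cus_1",
         "sent_at": "2026-05-28T10:00:07Z"}
# A key-reuse bug: the quantity changed.
changed = {**original, "quantity": 2}

retry_matches = hash_request_body(retry) == hash_request_body(original)
changed_differs = hash_request_body(changed) != hash_request_body(original)
```

Here `retry_matches` and `changed_differs` are both True: the retry hashes identically despite the new timestamp and field order, while the changed quantity produces a different hash and would trigger the 422.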

Failure mode	What breaks	Root cause	Fix
Concurrent-request race	Duplicate operations under concurrent load	SELECT then INSERT is not atomic	INSERT ... ON CONFLICT; return 409 for pending keys
Partial commit	Double execution after crash or lost response	Key update and operation in separate transactions	Single transaction for both; store external call IDs for non-DB ops
Request-body mismatch	Stored result returned for a different operation	No body hash stored with the key	Hash canonical fields; return 422 on mismatch

Three failure modes: root causes and fixes

Key scope, expiry, and what to store

Idempotency keys must be scoped per caller. A key of 'retry-001' from one customer is not the same request as 'retry-001' from another. The primary key for an idempotency record is (caller_id, key), matching the schema above, where caller_id comes from your authentication context: API key, user ID, or OAuth client ID. A global key namespace lets clients accidentally or deliberately collide with other customers' keys.

Key retention should match your retry window. 24 hours covers most synchronous API use cases. For workflows where clients can legitimately retry days later, 7 days is more practical. Clean up expired keys with a background job on a schedule; don't delete inline during request processing, where it adds latency to the hot path.
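One possible shape for that cleanup job is sketched below. It is an illustration under stated assumptions: sqlite3 stands in for Postgres, `created_at` is stored as epoch seconds rather than TIMESTAMPTZ, and `delete_expired_keys`, `RETENTION_SECONDS`, and `BATCH_SIZE` are names invented for the example. In Postgres the cutoff comparison would be `created_at < NOW() - INTERVAL '24 hours'`.

```python
import sqlite3
import time
from typing import Optional

RETENTION_SECONDS = 24 * 60 * 60  # 24 hours; use 7 days for long-running workflows
BATCH_SIZE = 1000                 # keep each delete short to limit lock time


def delete_expired_keys(conn: sqlite3.Connection,
                        now: Optional[float] = None) -> int:
    """Delete expired keys in batches and return how many rows were removed."""
    cutoff = (time.time() if now is None else now) - RETENTION_SECONDS
    total = 0
    while True:
        cur = conn.execute(
            """
            DELETE FROM idempotency_keys
            WHERE rowid IN (
                SELECT rowid FROM idempotency_keys
                WHERE created_at < ?
                LIMIT ?
            )
            """,
            (cutoff, BATCH_SIZE),
        )
        conn.commit()
        if cur.rowcount == 0:
            return total
        total += cur.rowcount
```

Run it from a scheduler (cron, a worker queue, or similar) rather than from a request handler. Deleting in batches keeps each transaction small, which matters once the table holds millions of rows.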

Store the full HTTP response in the idempotency record: status code plus body. A 201 Created and a 200 OK are different responses; returning a 200 for a request that originally produced a 201 is wrong. A minimal stored response looks like:

{"status": 201, "body": {"order_id": "ord_abc123"}}

Don't store response headers unless a specific header varies per request and the client depends on receiving it again on retry. The Location header for a 201 is worth including if your API uses it. Tracing headers, Content-Type, and infrastructure metadata aren't worth the complexity.
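A small sketch of how the stored record might be serialized and replayed is shown here. `StoredResponse` is a hypothetical helper for this example; the JSON shape follows the minimal record above, with an optional `headers` entry for cases such as Location.

```python
import json
from dataclasses import dataclass, field


@dataclass
class StoredResponse:
    status: int
    body: dict
    # Only headers the client relies on receiving again, e.g. Location.
    headers: dict = field(default_factory=dict)

    def to_json(self) -> str:
        record = {"status": self.status, "body": self.body}
        if self.headers:
            record["headers"] = self.headers
        return json.dumps(record, sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "StoredResponse":
        data = json.loads(raw)
        return cls(data["status"], data["body"], data.get("headers", {}))
```

On a retry, the handler loads the record with `from_json` and sends back exactly that status, body, and any stored headers, so a request that originally produced a 201 produces a 201 again.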

Testing the cases that break in production

Unit tests that mock the database verify the happy path. The concurrent-request race requires two real requests hitting a real database simultaneously; most test suites never run this:

test-concurrent.py

from concurrent.futures import ThreadPoolExecutor, as_completed

def test_concurrent_same_key(api_client, db):
    key = "test-key-concurrent"
    payload = {"item_id": "widget-001", "quantity": 1}

    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [
            executor.submit(
                api_client.post, "/orders",
                json=payload,
                headers={"Idempotency-Key": key}
            )
            for _ in range(2)
        ]
        responses = [f.result() for f in as_completed(futures)]

    # Acceptable outcomes: (201, 201 with same order_id) or (201, 409).
    # Unacceptable: two 201s with different order IDs, or any other status.
    assert all(r.status_code in (201, 409) for r in responses)
    successful = [r for r in responses if r.status_code == 201]
    order_ids = {r.json()["order_id"] for r in successful}

    assert len(order_ids) == 1, (
        f"Expected exactly one order for the key, got {len(order_ids)}"
    )

    # TEST_CALLER_ID is the caller that the api_client fixture authenticates as.
    row_count = db.execute(
        "SELECT COUNT(*) FROM idempotency_keys WHERE caller_id = $1 AND key = $2",
        TEST_CALLER_ID, key
    ).scalar()
    assert row_count == 1

For the partial-commit case, test that a key already in 'completed' status returns the stored response rather than executing the operation again. Insert the idempotency record manually in your test setup, then verify the endpoint returns the stored result and the operation table remains unchanged.
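The pre-completed-key test could look roughly like the following. To keep the sketch self-contained it includes a toy `create_order` handler and uses sqlite3 in place of Postgres; both are assumptions for illustration, and in a real suite the test would call your actual endpoint against your actual database.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE orders (id INTEGER PRIMARY KEY, item_id TEXT NOT NULL);
CREATE TABLE idempotency_keys (
    caller_id TEXT NOT NULL,
    key       TEXT NOT NULL,
    status    TEXT NOT NULL DEFAULT 'pending',
    response  TEXT,
    PRIMARY KEY (caller_id, key)
);
"""


def create_order(conn, caller_id, key, item_id):
    """Toy handler: reserve the key, then create the order in one transaction."""
    cur = conn.execute(
        "INSERT INTO idempotency_keys (caller_id, key) VALUES (?, ?) "
        "ON CONFLICT (caller_id, key) DO NOTHING",
        (caller_id, key),
    )
    if cur.rowcount == 0:
        status, response = conn.execute(
            "SELECT status, response FROM idempotency_keys "
            "WHERE caller_id = ? AND key = ?",
            (caller_id, key),
        ).fetchone()
        if status == "completed":
            return json.loads(response)  # replay the stored response
        return {"status": 409, "body": {"error": "request_in_progress"}}

    order = conn.execute("INSERT INTO orders (item_id) VALUES (?)", (item_id,))
    response = {"status": 201, "body": {"order_id": order.lastrowid}}
    conn.execute(
        "UPDATE idempotency_keys SET status = 'completed', response = ? "
        "WHERE caller_id = ? AND key = ?",
        (json.dumps(response), caller_id, key),
    )
    conn.commit()
    return response


def test_completed_key_replays_stored_response():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    stored = {"status": 201, "body": {"order_id": 42}}
    # Insert the completed record by hand, as if an earlier request had finished.
    conn.execute(
        "INSERT INTO idempotency_keys (caller_id, key, status, response) "
        "VALUES ('cust_1', 'done-key', 'completed', ?)",
        (json.dumps(stored),),
    )
    conn.commit()

    response = create_order(conn, "cust_1", "done-key", "widget-001")

    assert response == stored
    # The operation table must be untouched: nothing ran a second time.
    assert conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 0
```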

These two tests, concurrent submission and pre-completed key, cover the cases that slip through in most implementations. Add them to your integration test suite now, before the first customer reports a duplicate charge, and they will catch regressions as the codebase evolves.

Idempotency across service boundaries

When your endpoint calls a downstream service as part of its operation — a payment processor, an inventory system, a notification service — idempotency needs to propagate to those calls.

The standard pattern is key derivation: from the original client key, generate a deterministic key for each downstream call. If the original key is 'client-retry-abc', pass 'client-retry-abc:payment' to the payment API and 'client-retry-abc:inventory' to the inventory service. Each downstream service handles its own idempotency check against the derived key.
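The derivation itself can be a one-line helper. The `derive_key` function below is a hypothetical name for this sketch; it simply joins the client key and a service label with a colon, as in the example above.

```python
def derive_key(client_key: str, service: str) -> str:
    """Build a deterministic idempotency key for one downstream call."""
    if not client_key or not service:
        raise ValueError("client_key and service must be non-empty")
    return f"{client_key}:{service}"


payment_key = derive_key("client-retry-abc", "payment")
inventory_key = derive_key("client-retry-abc", "inventory")
```

Because the derivation is deterministic, a retry of the original request produces the same downstream keys, so each downstream service sees the retry as a repeat rather than a new call. If a downstream API limits key length, hashing the joined string is one option, at the cost of readability in logs.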

Don't assume downstream services are idempotent by default. External payment processors like Stripe require an explicit idempotency key on each call. Many internal services have no idempotency at all. Map every non-idempotent external call in your operation and add explicit keys. The derived-key pattern gives you uniqueness without requiring the original caller to know anything about your internal service topology.

The three patterns above — INSERT ON CONFLICT for atomic reservation, same-transaction commit for partial-failure safety, and body-hash checking for key reuse — cover the practical failure surface of a production idempotency implementation. The concurrent-request integration test is the fastest way to find out whether your current implementation actually holds: two threads, same key, real database, verify one execution. If you haven't run that test, you don't yet know.

