Every Postgres isolation level, and the specific bug it lets through
A map from Read Committed to Serializable — not as theory, but as production failure modes
Race conditions that break production systems have usually been present since the first deploy. They never fire in staging because staging lacks concurrent users driving the same code paths. Postgres's isolation levels determine which race conditions your schema and application are exposed to. Most teams ship without knowing which level they're on.
Three anomaly classes matter in practice: non-repeatable reads, phantom reads, and write skew. Each is prevented by a higher isolation level. The gap is that most engineers learn these as abstract definitions rather than as the shape of a bug they're about to ship.
The three levels you actually use (SQL defines four)
SQL defines four isolation levels: Read Uncommitted, Read Committed, Repeatable Read, and Serializable. Postgres maps Read Uncommitted to Read Committed internally: any transaction that requests Read Uncommitted gets Read Committed behaviour. Since Read Uncommitted (where a transaction can read data that hasn't been committed yet) is almost never correct, this isn't a loss.
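The mapping is silent: Postgres accepts the weaker setting without complaint and simply delivers Read Committed semantics.

BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- behaves exactly as Read Committed: no statement here can see uncommitted data
COMMIT;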
That leaves three real choices:
Read Committed: Each statement in the transaction sees committed data as of the moment that statement runs. The view can shift between statements. This is the default.
Repeatable Read: The entire transaction sees a consistent snapshot taken at the start of its first statement. Rows you've read won't change mid-transaction. In Postgres's MVCC implementation, phantom reads are also prevented at this level.
Serializable: Postgres tracks dependencies between concurrent transactions and aborts any combination that couldn't have run one-at-a-time in some serial order.
Each level up prevents more anomalies and adds more contention. The practical question is which anomalies your code is actually exposed to.
Read Committed: the non-repeatable read
Read Committed is Postgres's default and provides less protection than most engineers assume. The key property: each statement inside the transaction gets a fresh snapshot of committed data at the time that statement executes. Between statements, the world can change.
Consider an order management system. A transaction reads order status, runs some conditional logic, then reads the status again before a final write:
BEGIN;
SELECT status FROM orders WHERE id = $1;
-- Result: 'pending'
-- 200ms of processing ...
SELECT status FROM orders WHERE id = $1;
-- Result: 'cancelled' -- a concurrent UPDATE committed between these two reads
COMMIT;

Each SELECT sees the world at its own execution time. If another session updated and committed the order in between, the second read returns a different value. The business logic ran against one state; the final decision is made against another.
In short transactions this rarely matters. In longer transactions where a read drives a subsequent write, it becomes a bug that only appears under concurrent load. The classic production version is financial reporting:
BEGIN;
SELECT SUM(amount) FROM line_items WHERE invoice_id = $1; -- reads 4500
-- ... processing, 300ms ...
SELECT SUM(amount) FROM line_items WHERE invoice_id = $1; -- reads 5200, new item added
COMMIT;

The report commits numbers that don't correspond to any single consistent state of the database. Auditors notice. Engineers spend a day finding a bug that has nothing to do with their code.
The fix for non-repeatable reads is either Repeatable Read isolation, or restructuring the transaction to read each thing once and not revisit it.
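A minimal sketch of the first fix, reusing the invoice example: under Repeatable Read both reads come from the same snapshot, so the sums agree even if a concurrent insert commits in between.

BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT SUM(amount) FROM line_items WHERE invoice_id = $1; -- reads 4500
-- a concurrent INSERT commits here
SELECT SUM(amount) FROM line_items WHERE invoice_id = $1; -- still 4500: same snapshot
COMMIT;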
Repeatable Read: the write skew problem
Repeatable Read freezes the transaction's view at the start of its first statement. Re-read any row and it looks the same. Concurrent inserts are also invisible. Postgres's MVCC implementation prevents phantom reads at this level, which standard SQL doesn't guarantee.
What Repeatable Read doesn't prevent: write skew. Write skew happens when two concurrent transactions each read some overlapping set of data, each make independent decisions based on that data, and both commit, producing a combined result that couldn't have occurred if they'd run in sequence.
The on-call scheduling example is the clearest version of this:
-- Rule: at least one doctor must be on call at all times.
-- Transaction A (Doctor Alice goes off call):
BEGIN;
SELECT COUNT(*) FROM on_call WHERE shift_id = $1; -- returns 2
-- "Two on call, safe to remove one"
UPDATE on_call SET active = false WHERE doctor_id = 'alice' AND shift_id = $1;
COMMIT;
-- Transaction B (Doctor Bob goes off call, concurrent with A):
BEGIN;
SELECT COUNT(*) FROM on_call WHERE shift_id = $1; -- also returns 2 (snapshot from B's start)
-- "Two on call, safe to remove one"
UPDATE on_call SET active = false WHERE doctor_id = 'bob' AND shift_id = $1;
COMMIT;
-- Result: zero doctors on call.

Both transactions read two active doctors. Both conclude it's safe to remove one. Both commit. Zero doctors remain on call.
Neither transaction touched the other's row — Alice's transaction updated Alice's row, Bob's updated Bob's. Repeatable Read guarantees that rows you've already read won't change under you. It doesn't account for the combined effect of your writes and the other transaction's writes on a shared invariant.
The same pattern appears in every capacity constraint:
- Seat reservation: both transactions see 9/10 seats booked, both book the 10th, 11 bookings result.
- Budget allocation: both transactions see 4,800 of a 5,000 budget spent, both allocate the remaining 200, and the budget is overspent by 200.
- Daily transaction limits: both sessions see a user at 90% of their limit, both approve transactions that together exceed it.
Repeatable Read can't help here because neither transaction ever reads the other's uncommitted write. Each sees a consistent, correct snapshot. The conflict is only visible at commit time, across both transactions together.
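Below Serializable, the standard workaround is to materialize the conflict with explicit row locks, so the two transactions can no longer proceed on disjoint rows. A sketch against the same on_call table, at Read Committed:

BEGIN;
SELECT count(*) FROM (
  SELECT 1 FROM on_call WHERE shift_id = $1 AND active FOR UPDATE
) locked; -- FOR UPDATE isn't allowed directly under an aggregate, hence the subquery
-- A concurrent transaction blocks on the same FOR UPDATE until this one
-- commits, then re-reads the rows and counts only one active doctor.
UPDATE on_call SET active = false WHERE doctor_id = 'alice' AND shift_id = $1;
COMMIT;

Row locks only cover rows that already exist, though. Insert-based capacity checks, like the seat reservation above, still need Serializable or an explicit advisory lock.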
Serializable: what the abort is actually buying you
Serializable adds Serializable Snapshot Isolation (SSI), in Postgres since 9.1. SSI tracks read/write dependencies between concurrent transactions. If two transactions' combined reads and writes produce a result impossible in any serial ordering, Postgres aborts one of them:
ERROR: could not serialize access due to read/write dependencies among transactions
DETAIL: Reason code: Canceled on identification as a pivot, during commit attempt.
HINT: The transaction might succeed if retried.
SQLSTATE: 40001

Error code 40001 is the signal to retry. This is non-negotiable: SSI works by occasionally aborting transactions, and if you don't retry on 40001 you get errors instead of anomalies. Every code path using Serializable needs a retry loop:
import psycopg2

MAX_RETRIES = 5

def book_seat(conn, slot_id, user_id):
    for attempt in range(MAX_RETRIES):
        try:
            with conn.cursor() as cur:
                # Must be the transaction's first statement.
                cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                cur.execute(
                    "SELECT COUNT(*) FROM bookings WHERE slot_id = %s",
                    (slot_id,)
                )
                count = cur.fetchone()[0]
                if count >= 10:
                    conn.rollback()
                    return False
                cur.execute(
                    "INSERT INTO bookings (slot_id, user_id) VALUES (%s, %s)",
                    (slot_id, user_id)
                )
                conn.commit()
                return True
        except psycopg2.OperationalError as e:
            if e.pgcode == "40001":  # serialization_failure: safe to retry
                conn.rollback()
                continue
            raise
    raise RuntimeError("Seat booking failed after max retries")

The performance overhead of SSI is lower than most teams expect. Postgres tracks dependencies efficiently, and for typical OLTP workloads the abort rate is small unless transactions are long or highly overlapping. The genuine engineering cost is the retry logic itself, and that logic must be genuinely idempotent: retrying a partially complete transaction must produce the same result as completing it fresh.
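One cheap backstop for that idempotency requirement, assuming the schema allows it: a unique constraint turns an accidental double-insert from a buggy retry path into a loud error instead of a silent double booking.

-- Hypothetical constraint name; one booking per user per slot.
ALTER TABLE bookings
  ADD CONSTRAINT bookings_slot_user_key UNIQUE (slot_id, user_id);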
What each level actually prevents
| Isolation level | Dirty reads | Non-repeatable reads | Phantom reads | Write skew |
|---|---|---|---|---|
| Read Uncommitted (maps to Read Committed in Postgres) | Prevented | Not prevented | Not prevented | Not prevented |
| Read Committed (default) | Prevented | Not prevented | Not prevented | Not prevented |
| Repeatable Read | Prevented | Prevented | Prevented (Postgres MVCC) | Not prevented |
| Serializable | Prevented | Prevented | Prevented | Prevented |
The Postgres documentation is precise on each cell. The one cell that surprises people: phantom reads are listed as prevented at Repeatable Read in Postgres but not in the SQL standard. This is because Postgres uses snapshot isolation rather than lock-based repeatable read. The effect is stronger guarantees at Repeatable Read than the standard requires, but write skew remains possible until Serializable.
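A two-session sketch of that surprising cell, assuming a plain orders table: the concurrent insert commits, but the Repeatable Read snapshot never admits it.

-- Session A:
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM orders; -- returns 5

-- Session B, concurrently (autocommits):
INSERT INTO orders DEFAULT VALUES;

-- Session A again:
SELECT COUNT(*) FROM orders; -- still 5: the committed insert is invisible
COMMIT;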
Your ORM is on Read Committed and probably didn't tell you
Rails, Django, SQLAlchemy, Prisma, Sequelize, GORM, Ecto: all default to Read Committed. Your migration runner is on Read Committed. Your database connection pooler is on Read Committed. The setting is inherited from the server default, which is Read Committed, and frameworks don't override it.
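If you want a different floor for every new connection, the server default itself can be changed. A sketch, assuming a database named app:

ALTER DATABASE app SET default_transaction_isolation = 'repeatable read';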
To check your current session's isolation level:
SHOW transaction_isolation;

To set it for a single transaction:
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
-- your queries
COMMIT;

Most ORMs expose a way to set this per-transaction. In SQLAlchemy:
with engine.connect() as conn:
    conn = conn.execution_options(isolation_level="REPEATABLE READ")
    # your queries
    conn.commit()

In Active Record (Rails):
ActiveRecord::Base.transaction(isolation: :repeatable_read) do
  # your queries
end

The practical guidance:
Read Committed works when writes are guarded by conditional expressions (UPDATE ... WHERE balance >= amount), unique constraints protect global invariants, and transactions are short. The majority of CRUD operations belong here.
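The guard pattern in full, sketched against a hypothetical accounts table. The condition and the write are a single atomic statement, so there is no read-then-decide gap for a concurrent writer to slip into:

-- Under Read Committed, the WHERE clause is re-checked against the current
-- row version, so a stale read can never drive the balance negative.
UPDATE accounts
SET balance = balance - $2
WHERE id = $1 AND balance >= $2;
-- An affected row count of 0 means the guard rejected the write.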
Repeatable Read is the right upgrade for long transactions that read rows multiple times and need a stable view throughout. Useful for generating consistent reports or snapshots. It does not protect against write skew.
Serializable is the correct choice when the invariant you're enforcing can only be stated as 'no combination of concurrent transactions should produce this result' — capacity checks, ledger constraints, anything with a global maximum across rows. Budget for retry logic before you enable it.
The most common mistake isn't running at too-low a level. It's not knowing what level you're on, and writing application code that assumes stronger guarantees than Read Committed provides. Checking SHOW transaction_isolation takes ten seconds. Understanding what the answer means is the work.
Related reading
LLM database access: the RBAC gap most teams don't see
Giving an LLM access to your database is easy. The problem is that your application-layer RBAC is invisible when the model generates SQL. Here's where it goes wrong and how to fix it at the layer that enforces.
Rate limiting in production: why the algorithm you chose is probably wrong for your workload
Most rate limiting failures aren't implementation errors. They come from picking an algorithm whose properties don't match the actual traffic shape. Here's a workload-first framework for making the right choice.
Idempotency keys: the layer you're protecting isn't the one that bites you
An Idempotency-Key header handles one of five layers where duplicates cause harm. Database writes, queue consumers, external API calls, and saga compensation each have failure modes the HTTP key doesn't cover.