Every Postgres index type, and the bug you get when you pick wrong
A canonical reference for engineers who know when to index but not which type to reach for
Every Postgres engineer learns the same lesson eventually: an index that exists is not the same as an index that is used. And an index that is used is not always the right kind.
The default instinct is B-tree. It is what you get when you write CREATE INDEX with no type specified, and it handles most queries well enough that teams rarely think past it. Postgres has six index types. Each one exists because B-tree handles a specific class of query structurally wrong: not just slower, but unable to serve the operator at all. The query planner recognises this, skips the index, and returns a sequential scan without any warning or error.
Why Postgres index types matter as much as which column to index
Standard indexing advice focuses on column selection: high-cardinality columns, columns in WHERE clauses, join keys. That advice is correct as far as it goes. The second question is which type. Most teams stop asking.
The planner's choice on any given query is binary. Either the index type supports the operator in the predicate, or it does not. If it does not, the planner executes a sequential scan, silently, without any degraded-performance notice. The index consumes disk and write overhead and contributes nothing to the query.
Three illustrations of indexes that exist and are never used:
- A B-tree on a jsonb column when the query uses the ? (key-exists) operator.
- A B-tree on a text column when the query pattern is LIKE '%word%' (infix search, not prefix).
- A Hash index on a timestamp column when the query has a BETWEEN range condition.
In each case, EXPLAIN ANALYZE shows Seq Scan with Filter, not Index Scan. The fix is a different index type, not a different column.
B-tree: the default, with three ways it breaks without warning
B-tree is correct for the majority of indexed queries: equality (=), range (<, >, BETWEEN), sorted result sets (ORDER BY), and prefix patterns (LIKE 'word%'). On ordered, scalar data with these operators, it is the right pick.
Three failure modes that produce no error:
Function wrapping. A query on lower(email) with an index on email will not use the index. The planner cannot infer that lower(email) is monotone over email's B-tree ordering. The fix is a functional index: CREATE INDEX ON users (lower(email)). The query expression must match exactly.
Type mismatches. The predicate WHERE user_id = '1234' when user_id is an integer type triggers an implicit cast. Whether the cast preserves the index plan depends on the cast direction and operator family. The safe rule: literals should match column types in parameterised queries, especially where the driver chooses the bind type.
Collation mismatches. If the database collation and index collation diverge, or if the system's glibc was upgraded without rebuilding indexes, sort order can be wrong. The planner may continue using the index and silently return rows in incorrect order or miss rows that should match. The glibc 2.28 change that corrupted sort order in several Linux distributions is the canonical example; it resurfaced in cloud environments as recently as 2024.
-- Confirm an index is being used
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM users WHERE lower(email) = '[email protected]';
-- With the functional index present:
-- Index Scan using idx_users_lower_email on users
-- (cost=0.43..8.45 rows=1 width=96) (actual time=0.082..0.083 rows=1 loops=1)
-- Without it:
-- Seq Scan on users (cost=0.00..1284.00 rows=6 width=96)
-- Filter: (lower((email)::text) = '[email protected]')
-- Rows Removed by Filter: 49994Hash: one thing B-tree cannot do, and almost never worth the tradeoff
Hash indexes store a hash of the column value and support only the = operator. No range scans, no sorting, no pattern matching. Before Postgres 10, they were not WAL-logged and could be lost on crash. That is fixed as of version 10 and are fully durable.
The theoretical case for Hash is O(1) lookup versus B-tree's O(log n). In practice, the benefit only shows up when the table is large enough that B-tree height becomes significant — roughly 50 million rows or more. Even then the margin is within noise on modern storage. The real cost: Hash cannot serve range queries, ORDER BY, or prefix patterns. Any query that later adds a range condition needs a second index. For most tables, a single B-tree that handles equality, range, and ORDER BY beats two indexes.
The narrow case where Hash makes sense: a column used exclusively for equality lookups, on a table large enough that you have measured a real improvement, and with certainty that the query pattern will not expand. In any other situation, B-tree is simpler and barely slower for equality.
GIN: the right type for multi-value data, at a write cost
GIN (Generalised Inverted Index) is built for columns that contain multiple indexable values: arrays, jsonb objects, and full-text search vectors. Where B-tree stores one indexed value per row, GIN indexes every element inside the value, building an inverted map from element to row IDs.
The operators GIN supports that B-tree cannot serve:
- @> (jsonb containment): WHERE metadata @> '{"env": "production"}'
- && (array overlap): WHERE tags && ARRAY['engineering', 'postgres']
- @@ (full-text match): WHERE to_tsvector('english', body) @@ plainto_tsquery('english', 'database')
For each of these, a B-tree index is structurally incorrect. The planner skips it and scans the heap.
The write penalty is real. Each INSERT or UPDATE that touches a GIN-indexed column can modify many index entries, one per element in the value. On tables with high update rates and large jsonb documents (20 or more keys each), GIN write overhead runs 3 to 5 times higher than an equivalent B-tree. Postgres buffers new entries in a pending list (default 4 MB, set by gin_pending_list_limit) and flushes in bulk during autovacuum. Until flushed, GIN reads run slightly slower. If INSERT latency climbs after adding a GIN index, raise gin_pending_list_limit or schedule manual flushes: SELECT gin_clean_pending_list('idx_name'::regclass).
GiST: spatial and range data, with a type-cast trap
GiST (Generalised Search Tree) is a framework for indexing types without a natural linear order: geometric types (point, polygon, circle), range types (tsrange, daterange, int4range), and network addresses (inet, cidr). PostGIS adds spatial indexing on top of GiST.
The operators that require GiST and cannot use B-tree:
- && (range overlap): WHERE reservation_period && '[2026-06-01, 2026-06-10)'::daterange
- <-> (nearest-neighbour distance): WHERE location <-> point(12.9, 77.6) < 5000
- << (network containment): WHERE ip_address << '10.0.0.0/8'
Without a GiST index, these queries scan every row in the table, regardless of table size.
The type-cast trap: GiST indexes are operator-class specific. A column indexed as timestamp without time zone will not be used by a query that casts implicitly to timestamp with time zone, which happens whenever the client connection sets a TimeZone parameter and the predicate compares against a timestamptz literal. If EXPLAIN shows an implicit cast above the Filter node, the predicate type does not match the index class.
GiST also powers trigram similarity search via the pg_trgm extension, enabling infix text patterns (LIKE '%word%') that B-tree cannot handle. Both GIN and GiST support pg_trgm; GIN is faster for reads, GiST cheaper to maintain on write-heavy tables.
BRIN: the minimal index that punishes random writes
BRIN (Block Range Index) stores the minimum and maximum value for each physical block range of the table. A query on a BRIN-indexed column checks only the block ranges whose min-max span overlaps the query value, skipping the rest.
BRIN is 40 to 100 times smaller than B-tree on the same column. On a 100 GB events table with a monotonically increasing created_at, a B-tree index might occupy 5 GB and sit only partially in the buffer pool. A BRIN index on the same column takes about 50 MB. For range queries spanning hours or days, BRIN often wins because it causes far fewer buffer reads. For point lookups, B-tree wins.
BRIN is only correct when column values are correlated with physical row order. Append-only, monotonically increasing data (event timestamps, auto-incrementing IDs, log sequence numbers) satisfies this. Data inserted out of order, or any column updated in place, does not. After enough non-monotonic writes, each BRIN block range expands to cover the full value domain, and the index reports nearly every block as a candidate, forcing a heap scan that discards most results.
Partial indexes: the precision tool most teams skip
A partial index is any index type with a WHERE clause attached. It indexes only the rows that satisfy the predicate.
The practical case: many real tables have status columns where queries concentrate on a small fraction of rows. An orders table with 50 million rows might have 49 million in status = 'completed' and 1 million in status = 'pending'. A query on pending rows with a full-table B-tree index works, but the index is 50 times larger than the query needs, occupying buffer pool that could serve the rows themselves.
-- Full-table index: large, competes for buffer pool space
CREATE INDEX idx_orders_created ON orders (created_at);
-- Partial index: 50x smaller, used only for pending rows
CREATE INDEX idx_orders_pending ON orders (created_at)
WHERE status = 'pending';
-- Used automatically by the planner:
SELECT * FROM orders
WHERE status = 'pending'
AND created_at > NOW() - INTERVAL '7 days';Partial indexes are also the correct answer for high-null columns. If 80% of rows have deleted_at IS NULL, a full B-tree on deleted_at wastes 80% of its entries on nulls that queries never filter on. A partial index with WHERE deleted_at IS NULL is five times smaller and stays in the buffer pool more consistently.
One constraint: the query WHERE clause must match or logically imply the index predicate for the planner to choose the partial index. WHERE status = 'pending' uses the partial index above. WHERE status IN ('pending', 'processing') does not, because 'processing' is not implied by the predicate. Write the predicate to match your most frequent query shape precisely.
| Index type | Operators supported | Best for | Avoid when | Common silent failure |
|---|---|---|---|---|
| B-tree | =, <, >, BETWEEN, LIKE 'x%', ORDER BY | Most scalar columns; ordered, unique data | Array / JSONB containment; infix text | Function wrapping; collation mismatch disables it |
| Hash | = only | Pure equality on very large tables (50M+ rows) | Any range or ORDER BY requirement | Not selected for any operator except = |
| GIN | @>, &&, @@, ? | Arrays, JSONB keys, full-text tsvector columns | High-update columns (3–5x write overhead) | Pending list lag degrades read latency on writes |
| GiST | Spatial, range &&, <->, <<, pg_trgm | Geometric, range, inet types; infix text via trgm | High-frequency point lookups; pure equality | Type-cast mismatch silently disables the index |
| BRIN | Range on monotonic data | Append-only tables: events, logs, time series | Non-monotonic inserts; in-place column updates | Degrades to full-range; becomes a no-op silently |
| Partial | Any type + WHERE filter | Status columns; high-null columns; sparse data | When query predicates do not imply the condition | Planner ignores if predicate is not implied by WHERE |
Reading EXPLAIN ANALYZE to catch the wrong index type
The diagnostic pattern is the same for every index problem. Run EXPLAIN (ANALYZE, BUFFERS) on the slow query and read the plan output:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, amount FROM orders
WHERE status = 'pending'
AND created_at > NOW() - INTERVAL '7 days';What each plan node means:
- Index Scan using idx_name: the index is used and the type is correct.
- Bitmap Index Scan on idx_name, then Bitmap Heap Scan: index used; a heap fetch follows. Normal for low-selectivity conditions.
- Seq Scan on table with Filter: the index is ignored. Check whether the operator in the predicate matches the index type you created.
- Seq Scan on table without Filter: the table is small enough that the planner correctly prefers a seq scan. Not a problem.
If the index exists but the plan shows Seq Scan with Filter, run ANALYZE on the table first; stale statistics are the most common cause of a planner that ignores a valid index. Then confirm the operator is supported by the index type. If both are correct and the index is still ignored, force it with SET enable_seqscan = off to verify it produces correct results. A forced index that works correctly but is not chosen normally usually means the statistics target is too low: ALTER TABLE t ALTER COLUMN c SET STATISTICS 500.
The shared hit and read counts from BUFFERS tell you whether the index is actually reducing I/O. A plan that uses an index but still shows high shared reads either has a buffer pool too small for the index size, or the index is generating many false positives that the heap scan then discards. BRIN on a degraded column produces exactly this pattern.
Closing thoughts
The six Postgres index types are not performance variants of the same idea. Each one exists because a specific class of operator cannot be answered by B-tree at all. Getting the type right is not an optimisation step. It is the difference between an index the planner can use and one it cannot.
As new index operator classes continue arriving in Postgres extensions, including pgvector for vector similarity, pg_trgm for trigrams, and PostGIS for geography. The number of types worth knowing will only grow. The diagnostic approach stays constant: EXPLAIN (ANALYZE, BUFFERS), find the Seq Scan, trace the operator back to the index type that handles it.
Frequently asked questions
Related reading
Why boring technology is more important in 2026, not less
AI coding tools made adopting new technology feel fast. That's not the same as making it safe, and the distinction matters more now than it did before.
You adopted Kafka to decouple your services. Here's the coupling that came back.
Event-driven architecture replaces synchronous coupling with three quieter kinds: schema coupling, ordering coupling, and observability coupling. Each one is invisible in a demo and surfaces as a production incident.
Postgres or MySQL in 2026: the answer is almost always Postgres, but here's when it isn't
In 2026 Postgres is the right default for almost every new project. But three specific workloads still favour MySQL — and migrating an existing codebase is rarely worth it without a concrete pain point.