Index examples
Simple one table example
Test data preparation
CREATE TABLE persons (
id SERIAL PRIMARY KEY,
email TEXT,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO persons (email)
SELECT md5(random()::text) || '@example.com'
FROM generate_series(1, 1000000);Execute a command while there is no index created for email column
EXPLAIN ANALYZE SELECT * FROM persons WHERE email = 'test@example.com';What you can see (IIUC when number of rows is about ~100k):
Seq Scan on persons (cost=0.00..2387.00 rows=1 width=57) (actual time=3.825..3.825 rows=0 loops=1)
Filter: (email = 'test@example.com'::text)
Rows Removed by Filter: 100000
Planning Time: 0.107 ms
Execution Time: 3.836 msSeq Scan on persons
Seq Scan on personsMeaning: PostgreSQL performed a sequential scan (full table scan) on the table
persons.Implication: Since there's no index on the
emailcolumn, PostgreSQL had to scan every row to find a match.
2. (cost=0.00..2387.00 rows=1 width=57)
(cost=0.00..2387.00 rows=1 width=57)Meaning: This is the query planner's estimate:
0.00: Startup cost (cost to begin the operation).2387.00: Total estimated cost to complete the operation.rows=1: Estimated number of rows returned.width=57: Estimated average size (in bytes) of each row.
Implication: The planner expected to find 1 matching row, but had to scan all rows (which it estimated would cost 2387 units of work).
3. (actual time=3.825..3.825 rows=0 loops=1)
(actual time=3.825..3.825 rows=0 loops=1)Meaning:
actual time: Time taken to execute this node in milliseconds.First number (
3.825): Time to return the first row.Second number (
3.825): Time to return all rows.
rows=0: Actual number of rows returned.loops=1: Number of times this operation was executed (usually 1 unless in a loop or subquery).
Implication: The query took ~3.825 ms to scan the entire table and found no rows matching
email = 'test@example.com'.
4. Filter: (email = 'test@example.com'::text)
Filter: (email = 'test@example.com'::text)Meaning: The condition used to filter rows during the scan.
Implication: Since there's no index, the filter is applied to every row during the scan.
5. Rows Removed by Filter: 100000
Rows Removed by Filter: 100000Meaning: Out of 100,000 rows in the table, all were scanned and none matched the filter condition.
Implication: This confirms that the query had to process all rows, and none met the condition β which is inefficient without an index.
6. Planning Time: 0.107 ms
Planning Time: 0.107 msMeaning: Time taken by PostgreSQL to plan the query execution.
Implication: Very fast, as expected for a simple query.
7. Execution Time: 3.836 ms
Execution Time: 3.836 msMeaning: Total time taken to execute the query from start to finish.
Implication: This is the real-world performance metric. Without an index, scanning 100,000 rows took ~3.8 ms β acceptable for small datasets, but could be much slower on larger tables.
What you can see (when number of rows is about ~1,1kk)
Gather (cost=1000.00..19226.06 rows=1 width=57) (actual time=20.481..22.033 rows=0 loops=1)
Meaning: This is the Gather node, which collects results from parallel worker processes.
Cost: Estimated cost range for the entire operation.
Rows: Estimated number of rows returned (still 1).
Actual Time: Time taken to gather results from all workers (20.481 ms to get first row, 22.033 ms to finish).
Rows: 0 rows returned (no match found).
Loops: 1 (the gather operation ran once).
2. Workers Planned: 2
Workers Planned: 2Meaning: PostgreSQL planned to use 2 parallel workers to speed up the scan.
Implication: The query planner decided that the table was large enough to benefit from parallelism.
3. Workers Launched: 2
Workers Launched: 2Meaning: PostgreSQL successfully launched 2 parallel workers.
Implication: The scan was split across 3 processes (1 main + 2 workers).
4. -> Parallel Seq Scan on persons (cost=0.00..18225.96 rows=1 width=57) (actual time=18.120..18.120 rows=0 loops=3)
-> Parallel Seq Scan on persons (cost=0.00..18225.96 rows=1 width=57) (actual time=18.120..18.120 rows=0 loops=3)Meaning: Each worker (and the main process) performed a parallel sequential scan on a portion of the
personstable.Cost: Estimated cost for this node (lower than the gather node).
Actual Time: Time taken per scan β all workers finished in ~18.120 ms.
Rows: 0 rows returned.
Loops: 3 (main process + 2 workers).
5. Filter: (email = 'test@example.com'::text)
Filter: (email = 'test@example.com'::text)Meaning: The same filter condition as before β applied during the scan.
Implication: No index, so every row had to be checked.
6. Rows Removed by Filter: 366667
Rows Removed by Filter: 366667Meaning: Each worker scanned ~366,667 rows and removed them all because they didnβt match the filter.
Total Rows Scanned: ~366,667 Γ 3 = 1,100,001 rows
Implication: The entire table was scanned in parallel, but still no match found.
7. Planning Time: 0.037 ms
Planning Time: 0.037 msMeaning: Time taken to plan the query.
Implication: Very fast, as the query is simple.
8. Execution Time: 22.043 ms
Execution Time: 22.043 msMeaning: Total time taken to execute the query.
Implication: Even with parallelism, scanning over a million rows took ~22 ms β and still no match.
Create an index for email column
Index Scan using idx_persons_email on persons
Meaning: PostgreSQL used an index scan on the
idx_persons_emailindex to find matching rows.Implication: Instead of scanning the entire table, it used the B-tree index to directly locate the rows where
email = 'test@example.com'.
2. (cost=0.43..8.45 rows=1 width=57)
(cost=0.43..8.45 rows=1 width=57)Meaning: Estimated cost of the operation:
0.43: Startup cost (cost to begin the index scan).8.45: Total estimated cost to retrieve all matching rows.rows=1: Estimated number of rows returned.width=57: Estimated average size (in bytes) of each row.
Implication: The query planner expects this to be a very fast operation, as it involves only a few index node lookups and possibly one table row fetch.
3. (actual time=0.018..0.018 rows=0 loops=1)
(actual time=0.018..0.018 rows=0 loops=1)Meaning:
actual time: Time taken to execute this node.First number (
0.018): Time to return the first row.Second number (
0.018): Time to return all rows.
rows=0: No rows matched the condition.loops=1: The scan was executed once.
Implication: The index allowed PostgreSQL to quickly determine that no matching rows exist, without scanning the entire table.
4. Index Cond: (email = 'test@example.com'::text)
Index Cond: (email = 'test@example.com'::text)Meaning: The condition used to search the index.
Implication: The index was used effectively to evaluate the
WHEREclause.
5. Planning Time: 0.176 ms
Planning Time: 0.176 msMeaning: Time taken by PostgreSQL to plan the query.
Implication: Very fast, as expected for a simple indexed query.
More complex example
Asssume there are tables
After index creation
Last updated
Was this helpful?