Does Postgres automatically put indexes on Foreign Keys and Primary Keys? How can I tell? Is there a command that will return all indexes on a table?
PostgreSQL automatically creates indexes on primary keys and unique constraints, but not on the referencing side of foreign key relationships.
When Pg creates an implicit index it will emit a NOTICE
-level message that you can see in psql
and/or the system logs, so you can see when it happens. Automatically created indexes are visible in \d
output for a table, too.
The documentation on unique indexes says:
PostgreSQL automatically creates an index for each unique constraint and primary key constraint to enforce uniqueness. Thus, it is not necessary to create an index explicitly for primary key columns.
and the documentation on constraints says:
Since a DELETE of a row from the referenced table or an UPDATE of a referenced column will require a scan of the referencing table for rows matching the old value, it is often a good idea to index the referencing columns. Because this is not always needed, and there are many choices available on how to index, declaration of a foreign key constraint does not automatically create an index on the referencing columns.
Therefore you have to create indexes on foreign-keys yourself if you want them.
Note that if you use primary-foreign-keys, like 2 FK's as a PK in a M-to-N table, you will have an index on the PK and probably don't need to create any extra indexes.
While it's usually a good idea to create an index on (or including) your referencing-side foreign key columns, it isn't required. Each index you add slows DML operations down slightly, so you pay a performance cost on every INSERT
, UPDATE
or DELETE
. If the index is rarely used it may not be worth having.
This query will list missing indexes on foreign keys, original source.
Edit: Note that it will not check small tables (less then 9 MB) and some other cases. See final WHERE
statement.
-- check for FKs where there is no matching index
-- on the referencing side
-- or a bad index
WITH fk_actions ( code, action ) AS (
VALUES ( 'a', 'error' ),
( 'r', 'restrict' ),
( 'c', 'cascade' ),
( 'n', 'set null' ),
( 'd', 'set default' )
),
fk_list AS (
SELECT pg_constraint.oid as fkoid, conrelid, confrelid as parentid,
conname, relname, nspname,
fk_actions_update.action as update_action,
fk_actions_delete.action as delete_action,
conkey as key_cols
FROM pg_constraint
JOIN pg_class ON conrelid = pg_class.oid
JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
JOIN fk_actions AS fk_actions_update ON confupdtype = fk_actions_update.code
JOIN fk_actions AS fk_actions_delete ON confdeltype = fk_actions_delete.code
WHERE contype = 'f'
),
fk_attributes AS (
SELECT fkoid, conrelid, attname, attnum
FROM fk_list
JOIN pg_attribute
ON conrelid = attrelid
AND attnum = ANY( key_cols )
ORDER BY fkoid, attnum
),
fk_cols_list AS (
SELECT fkoid, array_agg(attname) as cols_list
FROM fk_attributes
GROUP BY fkoid
),
index_list AS (
SELECT indexrelid as indexid,
pg_class.relname as indexname,
indrelid,
indkey,
indpred is not null as has_predicate,
pg_get_indexdef(indexrelid) as indexdef
FROM pg_index
JOIN pg_class ON indexrelid = pg_class.oid
WHERE indisvalid
),
fk_index_match AS (
SELECT fk_list.*,
indexid,
indexname,
indkey::int[] as indexatts,
has_predicate,
indexdef,
array_length(key_cols, 1) as fk_colcount,
array_length(indkey,1) as index_colcount,
round(pg_relation_size(conrelid)/(1024^2)::numeric) as table_mb,
cols_list
FROM fk_list
JOIN fk_cols_list USING (fkoid)
LEFT OUTER JOIN index_list
ON conrelid = indrelid
AND (indkey::int2[])[0:(array_length(key_cols,1) -1)] @> key_cols
),
fk_perfect_match AS (
SELECT fkoid
FROM fk_index_match
WHERE (index_colcount - 1) <= fk_colcount
AND NOT has_predicate
AND indexdef LIKE '%USING btree%'
),
fk_index_check AS (
SELECT 'no index' as issue, *, 1 as issue_sort
FROM fk_index_match
WHERE indexid IS NULL
UNION ALL
SELECT 'questionable index' as issue, *, 2
FROM fk_index_match
WHERE indexid IS NOT NULL
AND fkoid NOT IN (
SELECT fkoid
FROM fk_perfect_match)
),
parent_table_stats AS (
SELECT fkoid, tabstats.relname as parent_name,
(n_tup_ins + n_tup_upd + n_tup_del + n_tup_hot_upd) as parent_writes,
round(pg_relation_size(parentid)/(1024^2)::numeric) as parent_mb
FROM pg_stat_user_tables AS tabstats
JOIN fk_list
ON relid = parentid
),
fk_table_stats AS (
SELECT fkoid,
(n_tup_ins + n_tup_upd + n_tup_del + n_tup_hot_upd) as writes,
seq_scan as table_scans
FROM pg_stat_user_tables AS tabstats
JOIN fk_list
ON relid = conrelid
)
SELECT nspname as schema_name,
relname as table_name,
conname as fk_name,
issue,
table_mb,
writes,
table_scans,
parent_name,
parent_mb,
parent_writes,
cols_list,
indexdef
FROM fk_index_check
JOIN parent_table_stats USING (fkoid)
JOIN fk_table_stats USING (fkoid)
WHERE table_mb > 9
AND ( writes > 1000
OR parent_writes > 1000
OR parent_mb > 10 )
ORDER BY issue_sort, table_mb DESC, table_name, fk_name;
where
clauses: Amongst others, it only takes tables into consideration which size is more than 9 MB.
If you want to list the indexes of all the tables in your schema(s) from your program, all the information is on hand in the catalog:
select
n.nspname as "Schema"
,t.relname as "Table"
,c.relname as "Index"
from
pg_catalog.pg_class c
join pg_catalog.pg_namespace n on n.oid = c.relnamespace
join pg_catalog.pg_index i on i.indexrelid = c.oid
join pg_catalog.pg_class t on i.indrelid = t.oid
where
c.relkind = 'i'
and n.nspname not in ('pg_catalog', 'pg_toast')
and pg_catalog.pg_table_is_visible(c.oid)
order by
n.nspname
,t.relname
,c.relname
If you want to delve further (such as columns and ordering), you need to look at pg_catalog.pg_index. Using psql -E [dbname]
comes in handy for figuring out how to query the catalog.
\di
will also list all the indexes in the database." (comment copied form other answer, applies here as well)
I love how this is explained in the article Cool performance features of EclipseLink 2.5
Indexing Foreign Keys The first feature is auto indexing of foreign keys. Most people incorrectly assume that databases index foreign keys by default. Well, they don't. Primary keys are auto indexed, but foreign keys are not. This means any query based on the foreign key will be doing full table scans. This is any OneToMany, ManyToMany or ElementCollection relationship, as well as many OneToOne relationships, and most queries on any relationship involving joins or object comparisons. This can be a major perform issue, and you should always index your foreign keys fields.
This function, based on the work by Laurenz Albe at https://www.cybertec-postgresql.com/en/index-your-foreign-key/, list all the foreign keys with missing indexes. The size of the table is shown, as for small tables the scanning performance could be superior to the index one.
--
-- function: missing_fk_indexes
-- purpose: List all foreing keys in the database without and index in the referencing table.
-- author: Based on the work of Laurenz Albe
-- see: https://www.cybertec-postgresql.com/en/index-your-foreign-key/
--
create or replace function missing_fk_indexes ()
returns table (
referencing_table regclass,
fk_columns varchar,
table_size varchar,
fk_constraint name,
referenced_table regclass
)
language sql as $$
select
-- referencing table having ta foreign key declaration
tc.conrelid::regclass as referencing_table,
-- ordered list of foreign key columns
string_agg(ta.attname, ', ' order by tx.n) as fk_columns,
-- referencing table size
pg_catalog.pg_size_pretty (
pg_catalog.pg_relation_size(tc.conrelid)
) as table_size,
-- name of the foreign key constraint
tc.conname as fk_constraint,
-- name of the target or destination table
tc.confrelid::regclass as referenced_table
from pg_catalog.pg_constraint tc
-- enumerated key column numbers per foreign key
cross join lateral unnest(tc.conkey) with ordinality as tx(attnum, n)
-- name for each key column
join pg_catalog.pg_attribute ta on ta.attnum = tx.attnum and ta.attrelid = tc.conrelid
where not exists (
-- is there ta matching index for the constraint?
select 1 from pg_catalog.pg_index i
where
i.indrelid = tc.conrelid and
-- the first index columns must be the same as the key columns, but order doesn't matter
(i.indkey::smallint[])[0:cardinality(tc.conkey)-1] @> tc.conkey) and
tc.contype = 'f'
group by
tc.conrelid,
tc.conname,
tc.confrelid
order by
pg_catalog.pg_relation_size(tc.conrelid) desc
$$;
test it this way,
select * from missing_fk_indexes();
you'll see a list like this.
referencing_table | fk_columns | table_size | fk_constraint | referenced_table
------------------------+------------------+------------+----------------------------------------------+------------------
stk_warehouse | supplier_id | 8192 bytes | stk_warehouse_supplier_id_fkey | stk_supplier
stk_reference | supplier_id | 0 bytes | stk_reference_supplier_id_fkey | stk_supplier
stk_part_reference | reference_id | 0 bytes | stk_part_reference_reference_id_fkey | stk_reference
stk_warehouse_part | part_id | 0 bytes | stk_warehouse_part_part_id_fkey | stk_part
stk_warehouse_part_log | dst_warehouse_id | 0 bytes | stk_warehouse_part_log_dst_warehouse_id_fkey | stk_warehouse
stk_warehouse_part_log | part_id | 0 bytes | stk_warehouse_part_log_part_id_fkey | stk_part
stk_warehouse_part_log | src_warehouse_id | 0 bytes | stk_warehouse_part_log_src_warehouse_id_fkey | stk_warehouse
stk_product_part | part_id | 0 bytes | stk_product_part_part_id_fkey | stk_part
stk_purchase | parent_id | 0 bytes | stk_purchase_parent_id_fkey | stk_purchase
stk_purchase | supplier_id | 0 bytes | stk_purchase_supplier_id_fkey | stk_supplier
stk_purchase_line | reference_id | 0 bytes | stk_purchase_line_reference_id_fkey | stk_reference
stk_order | freighter_id | 0 bytes | stk_order_freighter_id_fkey | stk_freighter
stk_order_line | product_id | 0 bytes | stk_order_line_product_id_fkey | cnt_product
stk_order_fulfillment | freighter_id | 0 bytes | stk_order_fulfillment_freighter_id_fkey | stk_freighter
stk_part | sibling_id | 0 bytes | stk_part_sibling_id_fkey | stk_part
stk_order_part | part_id | 0 bytes | stk_order_part_part_id_fkey | stk_part
For those who decided to systematically create and index on every referencing column, this other version could be more efficient:
--
-- function: missing_fk_indexes2
-- purpose: List all foreing keys in the database without and index in the referencing table.
-- The listing contains create index sentences
-- author: Based on the work of Laurenz Albe
-- see: https://www.cybertec-postgresql.com/en/index-your-foreign-key/
--
create or replace function missing_fk_indexes2 ()
returns setof varchar
language sql as $$
select
-- create index sentence
'create index on ' ||
tc.conrelid::regclass ||
'(' ||
string_agg(ta.attname, ', ' order by tx.n) ||
')' as create_index
from pg_catalog.pg_constraint tc
-- enumerated key column numbers per foreign key
cross join lateral unnest(tc.conkey) with ordinality as tx(attnum, n)
-- name for each key column
join pg_catalog.pg_attribute ta on ta.attnum = tx.attnum and ta.attrelid = tc.conrelid
where not exists (
-- is there ta matching index for the constraint?
select 1 from pg_catalog.pg_index i
where
i.indrelid = tc.conrelid and
-- the first index columns must be the same as the key columns, but order doesn't matter
(i.indkey::smallint[])[0:cardinality(tc.conkey)-1] @> tc.conkey) and
tc.contype = 'f'
group by
tc.conrelid,
tc.conname,
tc.confrelid
order by
pg_catalog.pg_relation_size(tc.conrelid) desc
$$;
Now the output is the create index
sentence you have to add to your database.
select * from missing_fk_indexes2();
missing_fk_indexes2
----------------------------------------------------------
create index on stk_warehouse(supplier_id)
create index on stk_reference(supplier_id)
create index on stk_part_reference(reference_id)
create index on stk_warehouse_part(part_id)
create index on stk_warehouse_part_log(dst_warehouse_id)
create index on stk_warehouse_part_log(part_id)
create index on stk_warehouse_part_log(src_warehouse_id)
create index on stk_product_part(part_id)
create index on stk_purchase(parent_id)
create index on stk_purchase(supplier_id)
create index on stk_purchase_line(reference_id)
create index on stk_order(freighter_id)
create index on stk_order_line(product_id)
create index on stk_order_fulfillment(freighter_id)
create index on stk_part(sibling_id)
create index on stk_order_part(part_id)
For a PRIMARY KEY
, an index will be created with the following message:
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "index" for table "table"
For a FOREIGN KEY
, the constraint will not be created if there is no index on the referenced table.
An index on referencing table is not required (though desired), and therefore will not be implicitly created.
And here's a bash script that generates the SQL to create indexes for missing indexes on foreign keys using @sergeyB's SQL.
#!/bin/bash
read -r -d '' SQL <<EOM
WITH fk_actions ( code, action ) AS (
VALUES ( 'a', 'error' ),
( 'r', 'restrict' ),
( 'c', 'cascade' ),
( 'n', 'set null' ),
( 'd', 'set default' )
),
fk_list AS (
SELECT pg_constraint.oid as fkoid, conrelid, confrelid as parentid,
conname, relname, nspname,
fk_actions_update.action as update_action,
fk_actions_delete.action as delete_action,
conkey as key_cols
FROM pg_constraint
JOIN pg_class ON conrelid = pg_class.oid
JOIN pg_namespace ON pg_class.relnamespace = pg_namespace.oid
JOIN fk_actions AS fk_actions_update ON confupdtype = fk_actions_update.code
JOIN fk_actions AS fk_actions_delete ON confdeltype = fk_actions_delete.code
WHERE contype = 'f'
),
fk_attributes AS (
SELECT fkoid, conrelid, attname, attnum
FROM fk_list
JOIN pg_attribute
ON conrelid = attrelid
AND attnum = ANY( key_cols )
ORDER BY fkoid, attnum
),
fk_cols_list AS (
SELECT fkoid, array_to_string(array_agg(attname), ':') as cols_list
FROM fk_attributes
GROUP BY fkoid
),
index_list AS (
SELECT indexrelid as indexid,
pg_class.relname as indexname,
indrelid,
indkey,
indpred is not null as has_predicate,
pg_get_indexdef(indexrelid) as indexdef
FROM pg_index
JOIN pg_class ON indexrelid = pg_class.oid
WHERE indisvalid
),
fk_index_match AS (
SELECT fk_list.*,
indexid,
indexname,
indkey::int[] as indexatts,
has_predicate,
indexdef,
array_length(key_cols, 1) as fk_colcount,
array_length(indkey,1) as index_colcount,
round(pg_relation_size(conrelid)/(1024^2)::numeric) as table_mb,
cols_list
FROM fk_list
JOIN fk_cols_list USING (fkoid)
LEFT OUTER JOIN index_list
ON conrelid = indrelid
AND (indkey::int2[])[0:(array_length(key_cols,1) -1)] @> key_cols
),
fk_perfect_match AS (
SELECT fkoid
FROM fk_index_match
WHERE (index_colcount - 1) <= fk_colcount
AND NOT has_predicate
AND indexdef LIKE '%USING btree%'
),
fk_index_check AS (
SELECT 'no index' as issue, *, 1 as issue_sort
FROM fk_index_match
WHERE indexid IS NULL
UNION ALL
SELECT 'questionable index' as issue, *, 2
FROM fk_index_match
WHERE indexid IS NOT NULL
AND fkoid NOT IN (
SELECT fkoid
FROM fk_perfect_match)
),
parent_table_stats AS (
SELECT fkoid, tabstats.relname as parent_name,
(n_tup_ins + n_tup_upd + n_tup_del + n_tup_hot_upd) as parent_writes,
round(pg_relation_size(parentid)/(1024^2)::numeric) as parent_mb
FROM pg_stat_user_tables AS tabstats
JOIN fk_list
ON relid = parentid
),
fk_table_stats AS (
SELECT fkoid,
(n_tup_ins + n_tup_upd + n_tup_del + n_tup_hot_upd) as writes,
seq_scan as table_scans
FROM pg_stat_user_tables AS tabstats
JOIN fk_list
ON relid = conrelid
)
SELECT relname as table_name,
cols_list
FROM fk_index_check
JOIN parent_table_stats USING (fkoid)
JOIN fk_table_stats USING (fkoid)
ORDER BY issue_sort, table_mb DESC, table_name;
EOM
DB_NAME="dbname"
DB_USER="dbuser"
DB_PASSWORD="dbpass"
DB_HOSTNAME="hostname"
DB_PORT=5432
export PGPASSWORD="$DB_PASSWORD"
psql -h $DB_HOSTNAME -p $DB_PORT -U $DB_USER -d $DB_NAME -t -A -F"," -c "$SQL" | while read -r line; do
IFS=','
parts=($line)
unset IFS
tableName=${parts[0]}
colsList=${parts[1]}
indexName="${tableName}_${colsList//:/_}_index"
printf -- "\n--Index: %s\nDROP INDEX IF EXISTS %s;\n
CREATE INDEX %s\n\t\tON %s USING btree\n\t(%s);
" "$indexName" "$indexName" "$indexName" "$tableName" "$colsList"
done
Success story sharing
If the referenced column(s) are changed frequently, it might be wise to add an index to the referencing column(s) so that referential actions associated with the foreign key constraint can be performed more efficiently.