Query goal: | Find surrogate key columns that do not have an associated sequence generator. Surrogate key values must be generated by using the system (the sequence generator mechanism in case of PostgreSQL). If there are no sequence generators, then there is a question as to whether there are no surrogate keys in the database at all (could be possible and OK) or (more probable) developers have forgotten to implement the generation of surrogate keys. |
Notes about the query: | The query assumes that each surrogate key column has a prefix or a suffix "ID", it has an integer type, and the key consist of only one column. For instance, the column name could be person_id or id_person. The query also assumes that there is enough to have at most one surrogate key in a table, i.e., that the table should have no other surrogate keys. The query takes into account both external and internal (are created as the result of declaring a column as the identity column) sequence generators. The query takes into account a possibility that a column may be associated with a sequence generator through a domain. |
Query type: | Problem detection (Each row in the result could represent a flaw in the design) |
Query reliability: | Medium (Medium number of false-positive results) |
Query license: | MIT License |
Fixing suggestion: | Associate a sequence generator with the column. If the column values must be registered by a user, then rename the column (replace "id" with "code" or "kood" (in Estonian)). |
Data source: | INFORMATION_SCHEMA+system catalog |
SQL query: | Click on query to copy it
with keys as (select (select nspname from pg_namespace where oid=c.relnamespace) as key_schema, c.relname as key_table, c.oid as key_table_oid, o.conkey AS key_col from pg_constraint o inner join pg_class c on c.oid = o.conrelid where o.contype in ('u', 'p') and cardinality(o.conkey)=1), keys_unnest as (select key_schema, key_table, key_table_oid, key_col, key_col_num, ordin from keys, unnest(keys.key_col) with ordinality as k(key_col_num, ordin)), keys_with_names AS (select key_schema, key_table, a_key.attname as key_col from keys_unnest k inner join pg_attribute a_key on k.key_col_num = a_key.attnum and k.key_table_oid = a_key.attrelid and a_key.attisdropped = false ), fk as (select (select nspname from pg_namespace where oid=c.relnamespace) as key_schema, c.relname as key_table, c.oid as key_table_oid, o.conkey AS key_col from pg_constraint o inner join pg_class c on c.oid = o.conrelid where o.contype='f'), fk_unnest as (select key_schema, key_table, key_table_oid, key_col, key_col_num, ordin from fk, unnest(fk.key_col) with ordinality as k(key_col_num, ordin)), fk_with_names AS (select key_schema, key_table, a_key.attname as key_col from fk_unnest k inner join pg_attribute a_key on k.key_col_num = a_key.attnum and k.key_table_oid = a_key.attrelid and a_key.attisdropped = false), key_not_fk AS (select key_schema, key_table, key_col from keys_with_names except select key_schema, key_table, key_col from fk_with_names), surrogate_key AS ( SELECT c.table_schema, c.table_name FROM information_schema.columns c LEFT JOIN information_schema.domains d USING (domain_schema, domain_name) INNER JOIN information_schema.schemata s ON c.table_schema=s.schema_name WHERE (coalesce (c.column_default, d.domain_default) ILIKE '%nextval%' OR c.is_identity='YES') AND (c.table_schema = 'public' OR s.schema_owner<>'postgres')) SELECT c.table_schema, c.table_name , c.column_name FROM information_schema.columns c LEFT JOIN information_schema.domains d USING (domain_schema, domain_name) INNER JOIN information_schema.schemata s ON c.table_schema=s.schema_name WHERE (coalesce (c.column_default, d.domain_default) NOT ILIKE '%nextval%' OR coalesce (c.column_default, d.domain_default) IS NULL) AND c.is_identity='NO' AND (c.column_name~*'_id$' OR c.column_name~*'^id_') AND c.data_type~*'(integer|smallint|bigint)' AND (c.table_schema = 'public' OR s.schema_owner<>'postgres') AND EXISTS (SELECT 1 FROM key_not_fk AS k WHERE k.key_schema=c.table_schema AND k.key_table=c.table_name AND k.key_col=c.column_name) AND NOT EXISTS (SELECT 1 FROM surrogate_key AS sk WHERE sk.table_schema=c.table_schema AND sk.table_name=c.table_name) ORDER BY table_schema, table_name; |
Collection name | Collection description |
---|---|
Find problems about base tables | A selection of queries that return information about the data types, field sizes, default values as well as general structure of base tables. Contains all the types of queries - problem detection, software measure, and general overview |
Find problems automatically | Queries, that results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not . |
Category name | Category description |
---|---|
Comfortability of data management | Queries of this category provide information about the means that have been used to make the use or management of database more comfortable and thus, more efficient. |
Result quality depends on names | Queries of this category use names (for instance, column names) to try to guess the meaning of a database object. Thus, the goodness of names determines the number of false positive and false negative results. |
Sequence generators | Queries of this category provide information about sequence generators and their usage. |
Validity and completeness | Queries of this category provide information about whether database design represents the world (domain) correctly (validity) and whether database design captures all the information about the world (domain) that is correct and relevant (completeness). |