Goal This query identifies classifier tables where the code column uses a data type that accommodates unnecessarily large values for reference data. Specifically, it flags code columns defined as integer, bigint, or varchar(n) where n > 10.
Notes The query identifies classifier tables based on the following heuristics: it finds base tables that have four or less columns, have at most one foreign key, and have a column that is associated with a sequence generator. The query considers both external and internal sequence generators. The query only includes tables that name indicates that it is a classifier (reference data) table and it is not a master data table. The query considers both names in Estonian and English.
Type Problem detection (Each row in the result could represent a flaw in the design)
Reliability Medium (Medium number of false-positive results)
License MIT License
Fixing Suggestion Instead of integer/bigint use smallint. In case of VARCHAR(n) make sure that n is nnot bigger than 10.
Data Source INFORMATION_SCHEMA+system catalog
SQL Query
WITH fk AS (SELECT
(SELECT nspname FROM pg_namespace WHERE oid=c.relnamespace) AS target_schema,
c.relname AS target_table
FROM pg_constraint o INNER JOIN pg_class c ON c.oid = o.conrelid
INNER JOIN pg_class f ON f.oid = o.confrelid
WHERE o.contype = 'f'),

classifiers AS (SELECT A.table_schema, A.table_name 
FROM information_schema.tables A 
INNER JOIN information_schema.schemata S
  ON A.table_schema = S.schema_name
WHERE 
  A.table_type = 'BASE TABLE'
  AND (A.table_schema = 'public' OR S.schema_owner <> 'postgres')  
  AND A.table_name ~* '(klassifikaator|tüüp|tyyp|type|liik|liigid|kind|seisund|staatus|status|state|kood|code|sõnastik|sonastik|dict|register|loend|nimekiri|seadistus|kategoori|categor|lookup|ref|enum|class)'  
  AND A.table_name !~* '^[_]*(osapool|party|klient|customer|client|isik|person|kasutaja|user|toode|product|artikkel|article|teenus|service|töötaja|tootaja|employee|partner|ettevõte|ettevote|company|firma|organisatsioon|organization|leping|contract|asukoht|location|aadress|address|projekt|project|osakond|department)[_]*$'
  AND (SELECT Count(*) AS cnt FROM INFORMATION_SCHEMA.columns AS c
WHERE A.table_schema=c.table_schema
AND A.table_name=c.table_name)<=4
AND (SELECT Count(*) AS cnt FROM fk
WHERE A.table_schema=fk.target_schema
AND A.table_name=fk.target_table)<=1
)

SELECT table_schema, table_name, column_name, data_type, character_maximum_length
FROM INFORMATION_SCHEMA.columns
WHERE column_name~*'_(kood|code|id)$'
AND (data_type IN ('integer','bigint')
OR (data_type~'character' AND character_maximum_length>10))
AND (table_schema, table_name) IN (SELECT table_schema, table_name
FROM classifiers)
ORDER BY table_schema, table_name, ordinal_position;

Collections

This query belongs to the following collections:

NameDescription
Find problems automaticallyQueries, that results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not .
Categories

This query is classified under the following categories:

NameDescription
Classifier tablesQueries of this category provide information about registration of classifiers.
Field sizeQueries of this category provide information about the maximum size of values that can be recorded in column fields