Goal Find the pairs of table columns that name is different only by one symbol and that have the same type and/or domain.
Notes The query finds the pairs of columns of the same table where the Levenshtein distance between the names is one. The query uses a function from the fuzzystrmatch extension. The query removes digits from the names before the comparison, i.e., the result does not contain the pairs like col1 and col2.
Type Problem detection (Each row in the result could represent a flaw in the design)
Reliability Low (Many false-positive results)
License MIT License
Data Source INFORMATION_SCHEMA+system catalog
SQL Query
CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;
WITH columns AS (SELECT nspname AS table_schema, relname AS table_name, 
CASE WHEN relkind='r' THEN 'BASE TABLE'
WHEN relkind='v' THEN 'VIEW'
WHEN relkind='m' THEN 'MATERIALIZED VIEW'
WHEN relkind='f' THEN 'FOREIGN TABLE'
WHEN relkind='p' THEN 'PARTITIONED TABLE'
END AS table_type, 
translate(attname,'01234567890','') AS column_name_stripped,
pg_type.typname AS column_type,
domain_type.typname AS column_domain_type,
string_agg(attname, ',
' ORDER BY attname) AS columns FROM pg_class INNER JOIN pg_namespace ON pg_class.relnamespace=pg_namespace.oid INNER JOIN pg_attribute ON pg_class.oid=pg_attribute.attrelid INNER JOIN pg_type ON pg_attribute.atttypid =pg_type.oid LEFT JOIN pg_type AS domain_type ON domain_type.oid=pg_type.typbasetype WHERE attnum>=1 AND relkind IN ('r','v','m','f','p') AND nspname NOT IN (SELECT schema_name FROM INFORMATION_SCHEMA.schemata WHERE schema_name<>'public' AND schema_owner='postgres' AND schema_name IS NOT NULL) GROUP BY table_schema, table_name, table_type, column_name_stripped, column_type, column_domain_type ) SELECT c.table_schema, c.table_type, c.table_name, c.column_name_stripped AS col1, c.column_type AS col1_type, c.column_domain_type AS col1_domain_type, c.columns AS col1_family, c2.column_name_stripped AS col2, c2.column_type AS col2_type, c2.column_domain_type AS col2_domain_type, c2.columns AS col2_family FROM columns AS c INNER JOIN columns AS c2 USING (table_schema, table_name) WHERE c.column_name_stripped>c2.column_name_stripped AND levenshtein_less_equal(c.column_name_stripped,c2.column_name_stripped,1)=1 AND c.column_type IS NOT DISTINCT FROM c2.column_type AND c.column_domain_type IS NOT DISTINCT FROM c2.column_domain_type ORDER BY table_type, table_schema, table_name; DROP EXTENSION IF EXISTS fuzzystrmatch;
Categories

This query is classified under the following categories:

NameDescription
NamingQueries of this category provide information about the style of naming.