Goal Find as to whether columns for holding e-mail addresses, phone numbers, ip addresses, zip codes, ordinary addresses, or file addresses have inconsistent field sizes across tables.
Notes The query considers both column names in English and Estonian. Missing field size, i.e., unilimited field size is considered as a separate field size.
Type Problem detection (Each row in the result could represent a flaw in the design)
Reliability Medium (Medium number of false-positive results)
License MIT License
Fixing Suggestion If there are specific requirements that tell otherwise use consistent field sizes.
Data Source INFORMATION_SCHEMA only
SQL Query
WITH field_sizes AS (SELECT table_schema, table_name, column_name,  data_type, 
CASE WHEN data_type~*'(character varying|text)' AND character_maximum_length IS NULL THEN 'Unlimited'
ELSE character_maximum_length::text END AS character_maximum_length_for_count, character_maximum_length,
CASE WHEN column_name~*'(ip.*addr|ip.*aadr)' THEN 'ip'
WHEN column_name~*'(zip_)' THEN 'zip'
WHEN column_name~*'(phone|telef|tel_nr)' THEN 'phone'
WHEN column_name~*'e[_| ]*(mail|meil)' THEN 'mail'
WHEN column_name~*'(addr|aadr)' AND column_name!~*'(ip|phone|telef|tel_nr|mail|meil)' THEN 'address'
WHEN column_name~*'(path|tee)' AND column_name!~*'(url|http|ftp)' THEN 'path'
WHEN column_name~*'(url|http|ftp|veebileht|koduleht|homepage|webpage)' THEN 'URL'
ELSE 'other' END AS content_type
FROM INFORMATION_SCHEMA.columns
WHERE data_type~*'(char|text)' 
AND column_name~*'(zip_|ip.*addr|ip.*aadr|phone|telef|tel_nr|mail|meil|addr|aadr|path|tee|url|http|ftp|veebileht|koduleht|homepage|webpage)'
AND column_name!~*'(type|size|profile|prefix|suffix|tyyp|suurus|list|login|username|password|psswd|kasutajanimi|parool|salasona|shipping)'
AND (table_schema, table_name) IN (SELECT table_schema, table_name
FROM INFORMATION_SCHEMA.tables WHERE table_type='BASE TABLE') AND 
table_schema NOT IN (SELECT schema_name
FROM INFORMATION_SCHEMA.schemata
WHERE schema_name<>'public' AND
schema_owner='postgres' AND schema_name IS NOT NULL))
SELECT content_type, Count(DISTINCT character_maximum_length_for_count) AS nr_of_different_sizes, 
Count(*) AS nr_of_columns,
string_agg(table_schema || '.' || table_name || '.' || column_name || ' ' ||  
CASE WHEN character_maximum_length IS NOT NULL THEN data_type  || '(' || character_maximum_length || ')' ELSE data_type END, ';
' ORDER BY character_maximum_length DESC NULLS FIRST, table_schema, table_name, column_name) AS columns FROM field_sizes GROUP BY content_type HAVING Count(DISTINCT character_maximum_length)>1 ORDER BY Count(DISTINCT character_maximum_length) DESC, content_type;
Collections

This query belongs to the following collections:

NameDescription
Find problems about base tablesA selection of queries that return information about the data types, field sizes, default values as well as general structure of base tables. Contains all the types of queries - problem detection, software measure, and general overview
Find problems automaticallyQueries, that results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not .
Categories

This query is classified under the following categories:

NameDescription
Field sizeQueries of this category provide information about the maximum size of values that can be recorded in column fields
InconsistenciesQueries of this catergory provide information about inconsistencies of solving the same problem in different places.
Result quality depends on namesQueries of this category use names (for instance, column names) to try to guess the meaning of a database object. Thus, the goodness of names determines the number of false positive and false negative results.