Query goal: | Find non-key base table columns with the same name and type that have different default values. Be consistent. Columns with the same name and type shouldn't probably have different default values in case of different tables. "If you do something a certain way, do all similar things in the same way." (Robert C. Martin, Clean Code) |
Notes about the query: | In case of the string_agg function, the line break (br) tag is used as a part of the separator for the better readability in case the query result is displayed in a web browser. The query excludes columns that name suggests that it contains data about state/status. Entities that belong to different entity types might have different inital state. The queries consider both column names in Estonian and English. |
Query type: | Problem detection (Each row in the result could represent a flaw in the design) |
Query reliability: | Low (Many false-positive results) |
Query license: | MIT License |
Fixing suggestion: | Use default values consistently. |
Data source: | INFORMATION_SCHEMA+system catalog |
SQL query: | Click on query to copy it
WITH keys as (select o.conname, (select nspname from pg_namespace where oid=m.relnamespace) as key_schema, m.relname as key_table, m.oid as key_table_oid, o.conkey AS key_col from pg_constraint o inner join pg_class c on c.oid = o.conrelid inner join pg_class m on m.oid = o.conrelid where o.contype in ('u','p') and o.conrelid in (select oid from pg_class c where c.relkind = 'r')), keys_unnest as (select conname, key_schema, key_table, key_table_oid, key_col, key_col_num, ordin from keys, unnest(keys.key_col) with ordinality as k(key_col_num, ordin)), keys_with_names as (select key_schema as table_schema, key_table as table_name, a_key.attname as column_name from keys_unnest k inner join pg_attribute a_key on k.key_col_num = a_key.attnum and k.key_table_oid = a_key.attrelid and a_key.attisdropped = false), def_values AS (SELECT c.table_schema, c.table_name, c.column_name, c.data_type, coalesce(c.column_default, domain_default) AS def_value FROM information_schema.columns AS c LEFT JOIN information_schema.domains d ON c.domain_schema=d.domain_schema AND c.domain_name=d.domain_name WHERE (c.table_schema, c.table_name) IN (SELECT table_schema, table_name FROM INFORMATION_SCHEMA.tables WHERE table_type='BASE TABLE') AND c.table_schema NOT IN (SELECT schema_name FROM INFORMATION_SCHEMA.schemata WHERE schema_name<>'public' AND schema_owner='postgres' AND schema_name IS NOT NULL) AND (column_default IS NOT NULL OR domain_default IS NOT NULL) AND NOT EXISTS (SELECT 1 FROM keys_with_names AS kwn WHERE kwn.table_schema=c.table_schema AND kwn.table_name=c.table_name AND kwn.column_name=c.column_name) AND column_name!~*'(seisund|staatus|olek|status)') SELECT column_name, data_type, Count(*) AS number_of_columns, Count(DISTINCT def_value) AS number_of_different_def_values, string_agg(table_schema || '.' || table_name || ' ' || def_value,';<br>' ORDER BY table_schema, table_name) AS different_def_values FROM def_values GROUP BY column_name, data_type HAVING Count(DISTINCT def_value)>1 ORDER BY Count(DISTINCT def_value) DESC, Count(*) DESC, column_name; |
Collection name | Collection description |
---|---|
Find problems about base tables | A selection of queries that return information about the data types, field sizes, default values as well as general structure of base tables. Contains all the types of queries - problem detection, software measure, and general overview |
Find problems automatically | Queries, that results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not . |
Category name | Category description |
---|---|
Default value | Queries of this catergory provide information about the use of default values. |
Inconsistencies | Queries of this catergory provide information about inconsistencies of solving the same problem in different places. |
Result quality depends on names | Queries of this category use names (for instance, column names) to try to guess the meaning of a database object. Thus, the goodness of names determines the number of false positive and false negative results. |
Validity and completeness | Queries of this category provide information about whether database design represents the world (domain) correctly (validity) and whether database design captures all the information about the world (domain) that is correct and relevant (completeness). |
Reference |
---|
The corresponding code smell in case of cleaning code is "G11: Inconsistency". (Robert C. Martin, Clean Code) |
Smell "Overloaded attribute names": Sharma, T., Fragkoulis, M., Rizou, S., Bruntink, M. and Spinellis, D.: Smelly relations: measuring and understanding database schema quality. In: Proceedings of the 40th International Conference on Software Engineering: Software Engineering in Practice, pp. 55-64. ACM, (2018). |