Query goal: | Find domains that have the same properties (base type, character length, not null + check constraints, default value, collation). There should not be multiple domains that have the same properties. Do remember that the same task can be solved in SQL usually in multiple different ways. Therefore, the domains may have syntactically different check constraints that solve the same task. Thus, the exact copies are not the only possible duplication. |
Notes about the query: | The query finds exact copies that differ only by name. In case of the string_agg function, the line break (br) tag is used as a part of the separator for the better readability in case the query result is displayed in a web browser. |
Query type: | Problem detection (Each row in the result could represent a flaw in the design) |
Query reliability: | High (Few or no false-positive results) |
Query license: | MIT License |
Fixing suggestion: | All but one are redundant. Drop all the domains except one and use the domain in case of all the columns that need the domain. |
Data source: | INFORMATION_SCHEMA only |
SQL query: | Click on query to copy it
WITH domains AS (SELECT domain_schema, domain_name, collation_schema, collation_name, CASE WHEN data_type='numeric' THEN data_type || '(' || numeric_precision || ',' || numeric_scale || ')' WHEN character_maximum_length IS NOT NULL THEN data_type || '(' || character_maximum_length || ')' WHEN datetime_precision IS NOT NULL AND data_type<>'date' THEN data_type || '(' || datetime_precision || ')' ELSE data_type END AS data_type, domain_default, cc.check_clause FROM INFORMATION_SCHEMA.domains AS d LEFT JOIN INFORMATION_SCHEMA.domain_constraints AS dc USING (domain_schema, domain_name) LEFT JOIN INFORMATION_SCHEMA.check_constraints AS cc USING (constraint_schema, constraint_name) WHERE domain_schema NOT IN (SELECT schema_name FROM INFORMATION_SCHEMA.schemata WHERE schema_name<>'public' AND schema_owner='postgres' AND schema_name IS NOT NULL)), domains_check_agg AS (SELECT domain_schema, domain_name, collation_schema, collation_name, data_type, domain_default, string_agg(check_clause, ',' ORDER BY check_clause) AS constraints FROM domains GROUP BY domain_schema, domain_name, collation_schema, collation_name, data_type, domain_default) SELECT data_type, domain_default, collation_schema, collation_name, constraints, string_agg(domain_schema || '.' || domain_name, ';<br>' ORDER BY domain_schema, domain_name) AS domains, Count(*) AS number_of_domains FROM domains_check_agg GROUP BY data_type, domain_default, collation_schema, collation_name, constraints HAVING Count(*)>1 ORDER BY Count(*) DESC, data_type, domain_default, constraints; |
SQL query | Description |
---|---|
WITH domains_start AS (SELECT nspname AS domain_schema, t.typname AS domain_name, t.typnotnull AS not_null, t.typdefault AS def_value, string_agg(substring(pg_get_constraintdef(c.oid),7),';' ORDER BY substring(pg_get_constraintdef(c.oid),7)) AS constraints FROM pg_catalog.pg_type t INNER JOIN pg_catalog.pg_namespace n ON t.typnamespace=n.oid INNER JOIN pg_catalog.pg_constraint c ON t.oid=c.contypid WHERE t.typtype='d' AND nspname NOT IN (SELECT schema_name FROM INFORMATION_SCHEMA.schemata WHERE schema_name<>'public' AND schema_owner='postgres' AND schema_name IS NOT NULL) GROUP BY nspname, t.typname, t.typnotnull, t.typdefault), domains AS (SELECT domain_Schema, domain_name, not_null, def_value, constraints, (SELECT CASE WHEN data_type='numeric' THEN data_type || '(' || numeric_precision || ',' || numeric_scale || ')' WHEN character_maximum_length IS NOT NULL THEN data_type || '(' || character_maximum_length || ')' WHEN datetime_precision IS NOT NULL AND data_type<>'date' THEN data_type || '(' || datetime_precision || ')' ELSE data_type END AS data_type FROM INFORMATION_SCHEMA.domains AS d WHERE domains_start.domain_schema=d.domain_schema AND domains_start.domain_name=d.domain_name) AS data_type FROM domains_start), duplicate_domains AS (SELECT data_type, not_null, def_value, constraints, array_agg(domain_schema || '.' || domain_name ORDER BY domain_schema, domain_name) AS domain_array FROM domains GROUP BY data_type, not_null, def_value, constraints HAVING Count(*)>1) SELECT format('DROP DOMAIN %1$s;', unnest(domain_array)) AS statements FROM duplicate_domains ORDER BY statements; | Drop the domain. One of the domains must stay in place. |
Collection name | Collection description |
---|---|
Find problems automatically | Queries, that results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not . |
Category name | Category description |
---|---|
Comfortability of database evolution | Queries of this category provide information about the means that influence database evolution. |
Domains | Queries of this category provide information about reusable specifications of column properties. |
Duplication of implementation elements | Queries of this catergory provide information about the duplication of the database objects. |
Reference |
---|
https://refactoring.guru/smells/alternative-classes-with-different-interfaces |
https://refactoring.guru/smells/duplicate-code |
The corresponding code smell in case of cleaning code is "G5: Duplication". (Robert C. Martin, Clean Code) |