This query identifies a semantic mismatch in data types for columns intended to store standard codes, such as country, language, currency, or airport codes. It flags columns whose names suggest they contain these codes but which are defined with a non-text data type (e.g., integer, numeric, bigint). Since these codes often contain letters or leading zeros and are not used in mathematical calculations, they are semantically strings. Storing them in numeric types can lose information (for example, leading zeros are dropped) and reduces formatting flexibility.
Notes
The query considers column names in both English and Estonian.
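To illustrate which names the pattern picks up, the following sketch applies the query's regular expression to a few column names. The names are hypothetical and chosen only for this example.

```sql
-- Apply the same case-insensitive pattern that the query uses
-- to a handful of made-up column names.
SELECT candidate,
       candidate ~* '(riik|riigi|keel|valuuta|lennujaam|maakond|country|lang|currency|airport).*(kood|code)' AS is_matched
FROM (VALUES
        ('country_code'),     -- English name: matched
        ('riigi_kood'),       -- Estonian name: matched
        ('LanguageCode'),     -- matching ignores case: matched
        ('postal_code'),      -- no recognized prefix: not matched
        ('code_of_country'),  -- "code" precedes "country": not matched
        ('slang_decoder')     -- contains "lang" ... "code": matched (false positive)
     ) AS t(candidate);
```

The last row shows why each result needs human review: the pattern works on substrings, so an unrelated name can match by accident.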
Type
Problem detection (Each row in the result could represent a flaw in the design)
SELECT table_schema, table_name, column_name, data_type
FROM INFORMATION_SCHEMA.columns
-- Keep only columns whose type is not a character or text type
WHERE data_type !~ '(character|text)'
  -- The column name suggests a standard code (Estonian or English terms)
  AND column_name ~* '(riik|riigi|keel|valuuta|lennujaam|maakond|country|lang|currency|airport).*(kood|code)'
  -- Consider base tables only
  AND (table_schema, table_name) IN (
      SELECT table_schema, table_name
      FROM INFORMATION_SCHEMA.tables
      WHERE table_type = 'BASE TABLE')
  -- Exclude schemas owned by postgres, except public
  AND table_schema NOT IN (
      SELECT schema_name
      FROM INFORMATION_SCHEMA.schemata
      WHERE schema_name <> 'public'
        AND schema_owner = 'postgres'
        AND schema_name IS NOT NULL)
ORDER BY table_schema, table_name, ordinal_position;
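As a minimal sketch of what the query reports and how a finding might be resolved, consider the hypothetical table below. The table name, column names, and the choice of char(3) are assumptions made for illustration, not part of the original query.

```sql
-- Hypothetical table: the code column uses a numeric type,
-- so the query above would report country.country_code.
CREATE TABLE country (
    country_code smallint PRIMARY KEY,
    name         text NOT NULL
);

-- A numeric type cannot preserve leading zeros: '040' is stored as 40.
SELECT '040'::smallint AS stored_value;

-- One possible fix: convert the column to a fixed-length character type
-- and restore the leading zeros while converting the existing values.
ALTER TABLE country
    ALTER COLUMN country_code TYPE char(3)
    USING lpad(country_code::text, 3, '0');
```

After the change, the column's data_type is reported as character, so the query no longer flags it. Foreign key columns that reference the column would need the same conversion.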
Collections
This query belongs to the following collections:
Name
Description
Find problems automatically
Queries whose results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not.
Categories
This query is classified under the following categories:
Name
Description
Classifier tables
Queries of this category provide information about the registration of classifiers.
Data types
Queries of this category provide information about the data types and their usage.
Result quality depends on names
Queries of this category use names (for instance, column names) to try to guess the meaning of a database object. Thus, the quality of the names determines the number of false positive and false negative results.