Semantic mismatch: non-textual data types for classifier codes

Query Details
Classification

Goal	This query identifies a semantic mismatch in data types for columns intended to store standard codes, such as country, language, currency, or airport codes. It flags columns whose names suggest they contain these codes but which are defined with non-textual data types (e.g., integer, numeric, bigint). Because such codes often contain letters or leading zeros and are not used in arithmetic, they are semantically strings. Storing them in numeric types can lose data (for example, leading zeros are dropped) and reduces formatting flexibility.
Notes	The query considers column names in both English and Estonian.
Type	Problem detection (Each row in the result could represent a flaw in the design)
Reliability	Medium (Medium number of false-positive results)
License	MIT License
Data Source	INFORMATION_SCHEMA only
SQL Query	SELECT table_schema, table_name, column_name, data_type FROM INFORMATION_SCHEMA.columns WHERE data_type!~'(character|text)' AND column_name~'(riik|riigi|keel|valuuta|lennujaam|maakond|country|lang|currency|airport).(kood|code)' AND (table_schema, table_name) IN (SELECT table_schema, table_name FROM INFORMATION_SCHEMA.tables WHERE table_type='BASE TABLE') AND table_schema NOT IN (SELECT schema_name FROM INFORMATION_SCHEMA.schemata WHERE schema_name<>'public' AND schema_owner='postgres' AND schema_name IS NOT NULL) ORDER BY table_schema, table_name, ordinal_position;
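
To make the goal concrete, the following sketch shows a table that the query above would report and one possible way to repair it. The table country_demo and its columns are hypothetical, and the fixed width of 3 used for padding is an assumption about the intended code format, not something the query prescribes.

```sql
-- Hypothetical table: the column name suggests a code, but the type is numeric.
CREATE TABLE country_demo (
    country_code integer PRIMARY KEY,  -- name matches the pattern, type is not textual
    name         varchar(100) NOT NULL
);

-- A numeric type cannot preserve leading zeros: '004' is stored as 4.
INSERT INTO country_demo (country_code, name) VALUES ('004', 'Afghanistan');
SELECT country_code FROM country_demo;  -- returns 4

-- One possible repair (assumes every code is at most 3 digits long):
-- switch to a textual type and restore the zero padding.
ALTER TABLE country_demo
    ALTER COLUMN country_code TYPE varchar(3)
    USING lpad(country_code::text, 3, '0');
```

After the ALTER TABLE statement the column is reported by INFORMATION_SCHEMA as character varying, so the query no longer returns it. Whether such a change is appropriate still depends on the reviewer's judgment of what the column actually stores.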

Collections

This query belongs to the following collections:

Name	Description
Find problems automatically	Queries whose results point to problems in the database. Each query in the collection produces an initial assessment. However, a human reviewer has the final say as to whether there is a problem or not.

Categories

This query is classified under the following categories:

Name	Description
Classifier tables	Queries of this category provide information about registration of classifiers.
Data types	Queries of this category provide information about the data types and their usage.
Result quality depends on names	Queries of this category use names (for instance, column names) to try to guess the meaning of a database object. Thus, the quality of the names determines the number of false-positive and false-negative results.
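
The dependence on names can be explored by applying the query's name pattern to a few sample identifiers. The sketch below is only illustrative: the sample column names are made up, and the pattern is copied from the query's column_name condition.

```sql
-- Apply the query's name pattern to hypothetical column names.
SELECT column_name,
       column_name ~ '(riik|riigi|keel|valuuta|lennujaam|maakond|country|lang|currency|airport).(kood|code)' AS is_matched
FROM (VALUES
    ('country_code'),     -- true
    ('riigi_kood'),       -- true
    ('countrycode'),      -- false: the pattern requires one character before "code"
    ('language_code'),    -- false: "lang" must be followed by exactly one character, then "code"
    ('lang_code_count')   -- true, although the column would hold a count, not a code
) AS t(column_name);
```

If such a column had a numeric type, lang_code_count would be a false positive, while countrycode and language_code would be false negatives. This is consistent with the medium reliability stated for the query.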