I need to compare two PostgreSQL databases with exactly one SQL query. I tried the following query:
SELECT *
FROM information_schema.columns
WHERE table_name IN (SELECT table_name
FROM information_schema.tables
WHERE table_schema NOT IN ('information_schema', 'pg_catalog')
AND table_type = 'BASE TABLE'
ORDER BY table_schema, table_name);
It works for my problem, but the foreign and primary keys are missing in this table. Is there a way to include them in my SQL query?
It is OK if the result table is not normalized and displays data multiple times. It is purely a runtime comparison, and the result tables are deleted again once the program has finished.
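For what it's worth, one possible way to pull the key information into the same result (just a sketch, not from the original thread) is to left-join the column listing to information_schema.key_column_usage and information_schema.table_constraints. Columns that belong to a primary or foreign key then carry the constraint type and name, repeated once per constraint, which fits the "not normalized is fine" requirement:
SELECT c.table_schema,
       c.table_name,
       c.column_name,
       c.data_type,
       tc.constraint_type,   -- 'PRIMARY KEY' or 'FOREIGN KEY', NULL for plain columns
       tc.constraint_name
FROM information_schema.columns c
LEFT JOIN information_schema.key_column_usage kcu
       ON kcu.table_schema = c.table_schema
      AND kcu.table_name   = c.table_name
      AND kcu.column_name  = c.column_name
LEFT JOIN information_schema.table_constraints tc
       ON tc.constraint_schema = kcu.constraint_schema
      AND tc.constraint_name   = kcu.constraint_name
      AND tc.constraint_type IN ('PRIMARY KEY', 'FOREIGN KEY')
WHERE c.table_schema NOT IN ('information_schema', 'pg_catalog')
ORDER BY c.table_schema, c.table_name, c.ordinal_position;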
I have two tables, one a test table and one a production table, both with 200+ columns and a couple thousand lines of code to create them. I periodically make changes and am trying to automate QA. I would like to:
1. Compare all rows between the two tables to detect differences.
2. Exclude certain columns, either because they are new (added to test, do not exist in prod) or because they will differ on purpose (table_creation, created_by_used_id, etc.).
3. Use a variable to generate the SELECT list_of_column_names so I do not have to continually update the column names I need to compare between the two tables by hand.
#3 is the issue. I know how to do this in Python, but I am currently limited to doing this only in PostgreSQL and have never done anything with variables in SQL.
Code So Far
So far, I know I can get all column names with
SELECT *
FROM information_schema.columns
WHERE table_schema = 'my_test_schema'
AND table_name = 'my_test_table'
From there, I can do a FULL JOIN with a WHERE clause against the prod columns and get a one-column table containing only the subset of column names that I want.
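A minimal sketch of that step (assuming the prod table lives in a schema called my_prod_schema and that the columns to skip are the ones named in the question; a FULL JOIN filtered to rows matched on both sides is equivalent to the plain join used here):
SELECT t.column_name
FROM information_schema.columns t
JOIN information_schema.columns p
  ON p.column_name = t.column_name
WHERE t.table_schema = 'my_test_schema'
  AND t.table_name   = 'my_test_table'
  AND p.table_schema = 'my_prod_schema'   -- assumption: name of the prod schema
  AND p.table_name   = 'my_prod_table'
  AND t.column_name NOT IN ('table_creation', 'created_by_used_id')  -- columns that differ on purpose
ORDER BY t.ordinal_position;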
After that, I'm using an EXCEPT/UNION ALL script to compare the tables. The issue below is the *: I instead need some sort of variable or list to select the column names.
(SELECT * FROM my_test_table
 EXCEPT
 SELECT * FROM my_prod_table)
UNION ALL
(SELECT * FROM my_prod_table
 EXCEPT
 SELECT * FROM my_test_table)
I am open to alternate suggestions.
This will give you the columns that are present in one table but not in the other:
(SELECT column_name
 FROM information_schema.columns
 WHERE table_schema = 'my_test_schema'
   AND table_name = 'my_test_table'
 EXCEPT
 SELECT column_name
 FROM information_schema.columns
 WHERE table_schema = 'my_prod_schema'
   AND table_name = 'my_prod_table')
UNION
(SELECT column_name
 FROM information_schema.columns
 WHERE table_schema = 'my_prod_schema'
   AND table_name = 'my_prod_table'
 EXCEPT
 SELECT column_name
 FROM information_schema.columns
 WHERE table_schema = 'my_test_schema'
   AND table_name = 'my_test_table')
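To also cover #3, one possible psql-only approach (a sketch, assuming the my_prod_schema name and the exclusion list used above) is to aggregate the common column names into a single string, store it in a psql variable with \gset, and interpolate it into the comparison:
-- Build a comma-separated list of the columns present in both tables.
SELECT string_agg(quote_ident(t.column_name::text), ', ' ORDER BY t.ordinal_position) AS cols
FROM information_schema.columns t
JOIN information_schema.columns p
  ON p.column_name = t.column_name
WHERE t.table_schema = 'my_test_schema' AND t.table_name = 'my_test_table'
  AND p.table_schema = 'my_prod_schema' AND p.table_name = 'my_prod_table'
  AND t.column_name NOT IN ('table_creation', 'created_by_used_id') \gset

-- :cols now expands to the column list, so the symmetric difference becomes:
(SELECT :cols FROM my_test_schema.my_test_table
 EXCEPT
 SELECT :cols FROM my_prod_schema.my_prod_table)
UNION ALL
(SELECT :cols FROM my_prod_schema.my_prod_table
 EXCEPT
 SELECT :cols FROM my_test_schema.my_test_table);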
I have a table with 100 columns and I need to get distinct records across all the columns of the table.
I used below query to get distinct records from table
select distinct col1, col2, col3,........ from test_table
but is there any good query to fetch distinct records across all the columns without listing every column name in the query?
Since you want DISTINCT on all columns, and you want to select all columns, it couldn't be simpler:
SELECT DISTINCT * FROM test_table
I am not sure if there is a simpler way, but you can use information_schema to get your columns and then use them.
SELECT string_agg(column_name::character varying, ',') as columns
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name'
This will return you the list of columns in your table.
SELECT string_agg(column_name::character varying, ',') as columns
FROM information_schema.columns
WHERE table_schema = 'schema_name'
AND table_name = 'table_name' \gset
See the psql documentation for details on \gset.
For example, if your table has two columns 'a' and 'b', \gset will store 'a,b' in the variable columns.
\echo can be used to check what \gset has stored:
\echo :columns
The following query might help you:
select distinct :columns from table_name;
I have to update all columns of type "uuid" to "varchar(38)". I created all the necessary queries with:
SELECT format(
'ALTER TABLE %I.%I.%I ALTER COLUMN %I SET DATA TYPE varchar(38);',
table_catalog,
table_schema,
table_name,
column_name
)
FROM information_schema.columns
WHERE data_type = 'uuid'
AND table_schema NOT LIKE 'pg_%'
AND lower(table_schema) <> 'information_schema'
AND is_updatable = 'YES';
Obviously, I can't execute the resulting queries because of all the existing PK and FK constraints involving the uuid columns.
Is there a way to temporarily disable the constraints, then executing all the queries and reactivating the constraints afterwards without dropping the constraints?
Or if I have to drop all the constraints first, is there a way to set them all up again after the updates? I am not the creator of the database so I don't have all necessary queries to create the constraints again.
I found a way to create all queries for dropping and creating all constraints of the database.
So first I have to save the output of the first query
SELECT 'ALTER TABLE "'||nspname||'"."'||relname||'" DROP CONSTRAINT "'||conname||'";'
FROM pg_constraint
INNER JOIN pg_class ON conrelid=pg_class.oid
INNER JOIN pg_namespace ON pg_namespace.oid=pg_class.relnamespace
ORDER BY CASE WHEN contype='f' THEN 0 ELSE 1 END,contype,nspname,relname,conname;
and of the second query
SELECT 'ALTER TABLE "'||nspname||'"."'||relname||'" ADD CONSTRAINT "'||conname||'" '||
pg_get_constraintdef(pg_constraint.oid)||';'
FROM pg_constraint
INNER JOIN pg_class ON conrelid=pg_class.oid
INNER JOIN pg_namespace ON pg_namespace.oid=pg_class.relnamespace
ORDER BY CASE WHEN contype='f' THEN 0 ELSE 1 END DESC,contype DESC,nspname DESC,relname DESC,conname DESC;
When I had all the queries, I first dropped every constraint, updated the tables and then executed the queries for adding the constraints again. Worked perfectly!
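As a side note (not part of the original workflow): if you run these from psql, \gexec can execute each generated statement directly instead of copy-pasting the output, e.g. for the DROP step:
-- Generate the DROP CONSTRAINT statements and run each one immediately (FKs first),
-- mirroring the first query above.
SELECT format('ALTER TABLE %I.%I DROP CONSTRAINT %I', nspname, relname, conname)
FROM pg_constraint
JOIN pg_class ON conrelid = pg_class.oid
JOIN pg_namespace ON pg_namespace.oid = pg_class.relnamespace
ORDER BY CASE WHEN contype = 'f' THEN 0 ELSE 1 END, contype, nspname, relname, conname \gexec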
I want to extract the names of the tables I have.
The code below returns me tables AND views.
SELECT quote_ident(table_name) as tab_name
FROM information_schema.tables
WHERE table_schema='public'
Question
How can I obtain just the table names and exclude the views?
From the documentation (emphasis mine):
The view tables contains all tables and views defined in the current database.
You can use the table_type column to exclude views:
SELECT quote_ident(table_name) as tab_name
FROM information_schema.tables
WHERE table_schema = 'public'
AND table_type != 'VIEW'
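Alternatively, since table_type can also hold values for foreign and temporary tables, filtering on 'BASE TABLE' restricts the result to ordinary tables only:
SELECT quote_ident(table_name) AS tab_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE';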
Introduction
I've been developing a wizard to create complex Postgres database queries for users without any programming/SQL background. Thanks to the foreign key constraints exposed through a view in information_schema, the user may select any number of tables and the tool will find the correct join conditions (so the user does not have to add ON table_a.field_1 = table_b.field_2).
While developing, I have been using an administration database user and now wanted to change that to a read-only user to make it more secure. However, this read-only user seems not to be able to access the foreign key constraints.
Current situation
When more than one table has been selected, the tool tries to get the connections between the various tables in order to know how to join them. During that process, the following query is executed:
SELECT
tc.constraint_name,
tc.table_name,
kcu.column_name,
ccu.table_name AS foreign_table_name,
ccu.column_name AS foreign_column_name
FROM information_schema.table_constraints AS tc
JOIN information_schema.key_column_usage AS kcu
ON tc.constraint_name = kcu.constraint_name
JOIN information_schema.constraint_column_usage AS ccu
ON ccu.constraint_name = tc.constraint_name
WHERE constraint_type = 'FOREIGN KEY'
AND ccu.table_name = 'TableB'
AND tc.table_name IN ('TableA');
(Note: the last condition in the WHERE clause uses IN because there can be more than one base table available. TableA is the base table, and each successfully connected/joined table becomes available for additional joins, e.g. a third table could use AND ccu.table_name = 'TableC' AND tc.table_name IN ('TableA', 'TableB'); and so on.)
When the admin db user (which has most common privileges like GRANT, SELECT, INSERT, UPDATE, DELETE, TRUNCATE, ...) executes the query, the result looks something like this:
constraint_name | table_name | column_name | foreign_table_name | foreign_column_name
----------------+------------+-------------+--------------------+---------------------
constraint1 | TableA | field_1 | TableB | field_2
(1 row)
But when the read-only db user runs that query, it returns:
constraint_name | table_name | column_name | foreign_table_name | foreign_column_name
----------------+------------+-------------+--------------------+---------------------
(0 rows)
Because the foreign key constraint exists but is not returned, the joins cannot be written properly as SQL and the user-generated query (built with the wizard) fails.
What I tried
First, of course, I thought the read-only user (ro_user) might not have the permissions to access the tables and views in the schema information_schema. So I ran
GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO ro_user;
as admin, but to no avail. Digging deeper into the documentation, I found that all tables and views in information_schema are accessible to any user by default in PostgreSQL anyway, so granting the SELECT privilege shouldn't change anything.
Just to make sure, I also ran
GRANT REFERENCES ON ALL TABLES IN SCHEMA actual_database TO ro_user;
but of course, this didn't change anything either, since REFERENCES is only needed for creating new foreign keys; I just need to read them.
Next, I thought maybe the SQL from the tool was failing because some information was not available, so I queried the three views separately by running:
SELECT * FROM information_schema.table_constraints AS tc WHERE constraint_type = 'FOREIGN KEY';
SELECT * FROM information_schema.key_column_usage AS kcu;
SELECT * FROM information_schema.constraint_column_usage AS ccu;
And sure enough, the last one wouldn't return a single row for the ro_user:
psql=> SELECT * FROM information_schema.constraint_column_usage AS ccu;
table_catalog | table_schema | table_name | column_name | constraint_catalog | constraint_schema | constraint_name
---------------+--------------+------------+-------------+--------------------+-------------------+-----------------
(0 rows)
whereas the admin user got lots of results. So it came down to that one view, information_schema.constraint_column_usage.
While typing out this question over the course of an hour, recollecting and boiling down all the ideas I had tried during the last few days, I finally found the cause.
The view constraint_column_usage identifies all columns in the current database that are used by some constraint. Only those columns are shown that are contained in a table owned by a currently enabled role.
From the documentation, via this SO answer.
And through that I found a solution:
SELECT
conrelid::regclass AS table_from,
conname,
pg_get_constraintdef(c.oid) AS cdef
FROM pg_constraint c
JOIN pg_namespace n
ON n.oid = c.connamespace
WHERE contype IN ('f')
AND n.nspname = 'public'
AND pg_get_constraintdef(c.oid) LIKE '%"TableB"%'
AND conrelid::regclass::text IN ('"TableA"')
ORDER BY conrelid::regclass::text, contype DESC;
It doesn't output the same format as the old query, but it contains the same information and is - most importantly - available to the ro_user.
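If the old column-level output format is needed, a sketch that also stays in pg_catalog (and so keeps working for the ro_user) is to unnest conkey/confkey and join pg_attribute; the "TableA"/"TableB" names below follow the same example and are assumed to be visible on the search path:
SELECT c.conname              AS constraint_name,
       c.conrelid::regclass   AS table_name,
       a.attname              AS column_name,
       c.confrelid::regclass  AS foreign_table_name,
       af.attname             AS foreign_column_name
FROM pg_constraint c
CROSS JOIN LATERAL unnest(c.conkey, c.confkey) AS cols(attnum, fattnum)  -- pairs local/foreign key columns
JOIN pg_attribute a  ON a.attrelid  = c.conrelid  AND a.attnum  = cols.attnum
JOIN pg_attribute af ON af.attrelid = c.confrelid AND af.attnum = cols.fattnum
WHERE c.contype = 'f'
  AND c.confrelid = '"TableB"'::regclass
  AND c.conrelid  = '"TableA"'::regclass;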