Selecting all records in all tables of the Public Schema in PostgreSQL - postgresql

I have several tables in my PostgreSQL database's Public Schema. The tables are named "projects_2019", "projects_2020", "projects_2021", etc. and have the same columns. The idea is that a new table will be added every year.
I would like to select all records in all of the tables whose name includes "projects_", how could I do this without naming each and every table name (since I don't know how many there will be in the future)?
Here's what I have so far:
WITH t as
(SELECT * FROM information_schema.tables WHERE table_schema = 'public' and table_name ~ 'projects_')
SELECT * FROM t

You can do it using dynamic SQL and information_schema. For Example:
-- Sample Data
CREATE TABLE table1 (
id int4 NULL,
caption text NULL
);
CREATE TABLE table2 (
id int4 NULL,
caption text NULL
);
CREATE TABLE table3 (
id int4 NULL,
caption text NULL
);
CREATE TABLE table4 (
id int4 NULL,
caption text NULL
);
INSERT INTO table1 (id, caption) VALUES (1, 'text1');
INSERT INTO table2 (id, caption) VALUES (2, 'text2');
INSERT INTO table3 (id, caption) VALUES (3, 'text3');
INSERT INTO table4 (id, caption) VALUES (4, 'text4');
-- create function sample:
CREATE OR REPLACE FUNCTION select_tables()
RETURNS table(id integer, caption text)
LANGUAGE plpgsql
AS $function$
declare
v_sql text;
v_union text;
begin
SELECT string_agg('select * from ' || table_schema || '.' || table_name, ' union all ')
into v_sql
FROM information_schema.tables WHERE table_schema = 'public' and table_name ~ 'table';
return query
execute v_sql;
end ;
$function$
;
-- selecting data:
select * from select_tables()
-- Result:
id caption
1 text1
2 text2
3 text3
4 text4

You can try like this:
FOR i IN SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
and table_name ~ 'projects_'
LOOP
sqlstr := sqlstr || format($$
UNION
SELECT name FROM %I
$$,
i.table_name);
END LOOP;
EXECUTE sqlstr;

Related

snowflake incorporate insert and update to a table

I have three snowflake tables(TEST1, TEST2, CONTROL) as below.
TEST1
create OR REPLACE table TEST1 (
id varchar(100),
name varchar(100),
address VARCHAR(64)
);
INSERT INTO TEST1 values (100, 'ABC', null);
INSERT INTO TEST1 values (200, 'XYZ', null);
INSERT INTO TEST1 values (300, 'VBN', null);
TEST2
create OR REPLACE table TEST2 (
id varchar(100),
name varchar(100),
address VARCHAR(64)
);
INSERT INTO TEST2 values (100, 'ABC', null);
INSERT INTO TEST2 values (200, 'FDS', null);
CONTROL
create OR REPLACE table CONTROL (
KEY_COLUMNS VARCHAR,
TABLE_NAME_SOURCE VARCHAR,
TABLE_NAME_TARGET VARCHAR
);
INSERT INTO CONTROL values ('id,name,address', 'TEST2','TEST1');
I want to incorporate insert and update to 'TEST2' based on 'TEST1' table.If records does not exist in TEST2, then the stored procedure should insert rows into TEST2 from TEST1
If there is any update to TEST1, it should be captured in TEST2.
My stored procedure looks like below. I have declared variables which come from control table. My script works completely fine for insert.
For update, it should update all columns specified in KEY_COLUMNS from TEST1 to TEST2. I can hardcode and update (highlighted in bold), but instead I want to use KEY_COLUMN column from control table as I did for insert something like (WHEN MATCHED THEN update set target_columns = source_columns).
CREATE OR REPLACE PROCEDURE SP()
RETURNS string
LANGUAGE SQL
AS
$$
DECLARE
source_tbl STRING := (select TABLE_NAME_SOURCE from CONTROL);
source_columns STRING;
target_columns STRING;
query1 STRING;
BEGIN
SELECT KEY_COLUMNS INTO :source_columns FROM CONTROL WHERE TABLE_NAME_SOURCE = :source_tbl;
SELECT KEY_COLUMNS INTO :target_columns FROM CONTROL WHERE TABLE_NAME_SOURCE = :source_tbl;
QUERY1 := 'MERGE INTO TEST2 AS A
USING (
select '|| :source_columns ||' from TEST1) AS B
ON A.ID=B.ID
WHEN NOT MATCHED THEN INSERT ('|| :target_columns ||')
values ('|| :source_columns ||')
**WHEN MATCHED THEN update set A.ID=B.ID, A.name=B.name, A.address=B.address**;';
EXECUTE IMMEDIATE :QUERY1;
RETURN :QUERY1;
END;
$$;
call SP();
Expected TEST2 Output
You want a new statement.
SELECT listagg(' A.'||s.value||'=B.'||s.value, ',') FROM CONTROL, TABLE(SPLIT_TO_TABLE(KEY_COLUMNS, ',')) s WHERE TABLE_NAME_SOURCE = 'TEST2';
which gives:
LISTAGG(' A.'||S.VALUE||'=B.'||S.VALUE, ',')
A.id=B.id, A.name=B.name, A.address=B.address
Thus you SP becomes:
CREATE OR REPLACE PROCEDURE SP()
RETURNS string
LANGUAGE SQL
AS
$$
DECLARE
source_tbl STRING := (select TABLE_NAME_SOURCE from CONTROL);
source_columns STRING;
target_columns STRING;
update_sets STRING;
query1 STRING;
BEGIN
SELECT KEY_COLUMNS INTO :source_columns FROM CONTROL WHERE TABLE_NAME_SOURCE = :source_tbl;
SELECT KEY_COLUMNS INTO :target_columns FROM CONTROL WHERE TABLE_NAME_SOURCE = :source_tbl;
SELECT listagg(' A.'||s.value||'=B.'||s.value, ',') INTO :update_sets FROM CONTROL, TABLE(SPLIT_TO_TABLE(KEY_COLUMNS, ',')) s WHERE TABLE_NAME_SOURCE = :source_tbl;;
QUERY1 := 'MERGE INTO TEST2 AS A
USING (
select '|| :source_columns ||' from TEST1) AS B
ON A.ID=B.ID
WHEN NOT MATCHED THEN INSERT ('|| :target_columns ||')
values ('|| :source_columns ||')
WHEN MATCHED THEN update set '|| :update_sets ||';';
EXECUTE IMMEDIATE :QUERY1;
RETURN :QUERY1;
END;
$$;
which returns
SP
MERGE INTO TEST2 AS A USING ( select id,name,address from TEST1) AS B ON A.ID=B.ID WHEN NOT MATCHED THEN INSERT (id,name,address) values (id,name,address) WHEN MATCHED THEN update set A.id=B.id, A.name=B.name, A.address=B.address;
You can try to build your update statement merge string using value fetched from below query -
SNOWFLAKE1#COMPUTE_WH#TEST_DB.PUBLIC>select * from control;
+-----------------+-------------------+-------------------+
| KEY_COLUMNS | TABLE_NAME_SOURCE | TABLE_NAME_TARGET |
|-----------------+-------------------+-------------------|
| id,name,address | TEST2 | TEST1 |
+-----------------+-------------------+-------------------+
1 Row(s) produced. Time Elapsed: 0.156s
SNOWFLAKE1#COMPUTE_WH#TEST_DB.PUBLIC>select listagg(concat('A.',value,'=','B.',value),',') as set_upd from (select * fr
om control, lateral split_to_table(control.key_columns,','));
+---------------------------------------------+
| SET_UPD |
|---------------------------------------------|
| A.id=B.id,A.name=B.name,A.address=B.address |
+---------------------------------------------+
1 Row(s) produced. Time Elapsed: 0.185s

Postgres-11 : Alter Table dynamically

I am trying to alter table based on another table dynamically.
Below is the piece of code i wrote in postgresql stored procedure.
But running out into some syntax errors.
Please help me here.
I just started working in postgresql and i am from sql server background. Like how we do in sql server print stmts do debug dynamic queries inside procedures; do we have any link to refer please share that as well. It would help me.
DROP TABLE IF EXISTS temp_table;
CREATE TEMP TABLE temp_table AS
with cte as
(
select column_name,data_type,character_maximum_length
from information_schema."columns" c
where table_name = 'customer_new' and table_schema = 'public'
and column_default is null
)
,cte1 as
(
select column_name,data_type,character_maximum_length
from information_schema."columns" c
where table_name = 'customer_old' and table_schema = 'public'
and column_default is null
)
select cte.column_name,
case when cte.character_maximum_length is not null then cte.data_type||'('||cte.character_maximum_length||')' else cte.data_type end as data_type
from cte
left join cte1 on cte.column_name = cte1.column_name
where cte1.column_name is null;
for v_column_name,v_data_type in SELECT column_name,data_type FROM temp_table
loop
execute format ('alter table %s add column %s %s ;', v_dump_table_name, v_column_name, v_data_type);
end loop;
DROP TABLE IF EXISTS temp_table;
Thanks in advance.
DROP TABLE IF EXISTS temp_table;
CREATE TEMP TABLE temp_table AS
with cte as
(
select column_name,data_type,character_maximum_length
from information_schema."columns" c
where table_name = 'customer_new' and table_schema = 'public'
and column_default is null
)
,cte1 as
(
select column_name,data_type,character_maximum_length
from information_schema."columns" c
where table_name = 'customer_old' and table_schema = 'public'
and column_default is null
)
select 'alter table '||v_dump_table_name||' add column '||cte2.column_name||' '||data_type
as col
from cte2;
for v_column in SELECT col FROM temp_table
loop
execute format ('%s ;', v_column);
end loop;
DROP TABLE IF EXISTS temp_table;

t-sql select column names from all tables where there is at least 1 null value

Context: I am exploring a new database (in MS SQL server), and I want to know for each table, all columns that have null values.
I.e. result would look something like this:
table column nulls
Tbl1 Col1 8
I have found this code here on stackoverflow, that makes a table of table-columnnames - without the WHERE statement which is my addition.
I tried to filter for nulls in WHERE statement, but then the table ends up empty, and I see why - i am checking if the col name is actually null, and not its contents. But can't figure out how to proceed.
select schema_name(tab.schema_id) as schema_name,
tab.name as table_name,
col.name as column_name
from sys.tables as tab
inner join sys.columns as col
on tab.object_id = col.object_id
left join sys.types as t
on col.user_type_id = t.user_type_id
-- in this where statement, I am trying to filter for nulls, but i get an empty result. and i know there are nulls
where col.name is null
order by schema_name, table_name, column_id
I also tried this (see 4th line):
select schema_name(tab.schema_id) as schema_name,
tab.name as table_name,
col.name as column_name
,(select count(*) from tab.name where col.name is null) as countnulls
from sys.tables as tab
inner join sys.columns as col
on tab.object_id = col.object_id
left join sys.types as t
on col.user_type_id = t.user_type_id
order by schema_name, table_name, column_id
the last one returns an error "Invalid object name 'tab.name'."
column name can't be null but if you mean nullable column (column that accept null value) that has null value at least so you can use following statement:
declare #schema varchar(255), #table varchar(255), #col varchar(255), #cmd varchar(max)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name,tab.name , col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable =1
order by schema_name(tab.schema_id),tab.name,col.name
OPEN getinfo
FETCH NEXT FROM getinfo into #schema,#table,#col
WHILE ##FETCH_STATUS = 0
BEGIN
set #schema = QUOTENAME(#schema)
set #table = QUOTENAME(#table)
set #col = QUOTENAME(#col)
SELECT #cmd = 'IF EXISTS (SELECT 1 FROM '+ #schema +'.'+ #table +' WHERE ' + #col + ' IS NULL) BEGIN SELECT '''+#schema+''' as schemaName, '''+#table+''' as tablename, '''+#col+''' as columnName, * FROM '+ #schema +'.'+ #table +' WHERE ' + #col + ' IS NULL end'
EXEC(#cmd)
FETCH NEXT FROM getinfo into #schema,#table,#col
END
CLOSE getinfo
DEALLOCATE getinfo
that use cursor on all nullable columns in every table in the Database then check if this column has at least one null value if yes will select schema Name, table name, column name and all records that has null value in this column
but if you want to get only count of nulls you can use the following statement:
declare #schema varchar(255), #table varchar(255), #col varchar(255), #cmd varchar(max)
DECLARE getinfo cursor for
SELECT schema_name(tab.schema_id) as schema_name,tab.name , col.name from sys.tables as tab
inner join sys.columns as col on tab.object_id = col.object_id
where col.is_nullable =1
order by schema_name(tab.schema_id),tab.name,col.name
OPEN getinfo
FETCH NEXT FROM getinfo into #schema,#table,#col
WHILE ##FETCH_STATUS = 0
BEGIN
set #schema = QUOTENAME(#schema)
set #table = QUOTENAME(#table)
set #col = QUOTENAME(#col)
SELECT #cmd = 'IF EXISTS (SELECT 1 FROM '+ #schema +'.'+ #table +' WHERE ' + #col + ' IS NULL) BEGIN SELECT '''+#schema+''' as schemaName, '''+#table+''' as tablename, '''+#col+''' as columnName, count(*) as nulls FROM '+ #schema +'.'+ #table +' WHERE ' + #col + ' IS NULL end'
EXEC(#cmd)
FETCH NEXT FROM getinfo into #schema,#table,#col
END
that use cursor on all nullable columns in every table in the Database then check if this column has at least one null value if yes will select schema Name, table name, column name and count all records that has null value in this column

Postgres find all rows in database tables matching criteria on a given column

I am trying to write sub-queries so that I search all tables for a column named id and since there are multiple tables with id column, I want to add the condition, so that id = 3119093.
My attempt was:
Select *
from information_schema.tables
where id = '3119093' and id IN (
Select table_name
from information_schema.columns
where column_name = 'id' );
This didn't work so I tried:
Select *
from information_schema.tables
where table_name IN (
Select table_name
from information_schema.columns
where column_name = 'id' and 'id' IN (
Select * from table_name where 'id' = 3119093));
This isn't the right way either. Any help would be appreciated. Thanks!
A harder attempt is:
CREATE OR REPLACE FUNCTION search_columns(
needle text,
haystack_tables name[] default '{}',
haystack_schema name[] default '{public}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
FOR schemaname,tablename,columnname IN
SELECT c.table_schema,c.table_name,c.column_name
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND c.table_schema=ANY(haystack_schema)
AND t.table_type='BASE TABLE'
--AND c.column_name = "id"
LOOP
EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text) like %L',
schemaname,
tablename,
columnname,
needle
) INTO rowctid;
IF rowctid is not null THEN
RETURN NEXT;
END IF;
END LOOP;
END;
$$ language plpgsql;
select * from search_columns('%3119093%'::varchar,'{}'::name[]) ;
The only problem is this code displays the table name and column name. I have to then manually enter
Select * from table_name where id = 3119093
where I got the table name from the code above.
I want to automatically implement returning rows from a table but I don't know how to get the table name automatically.
I took the time to make it work for you.
For starters, some information on what is going on inside the code.
Explanation
function takes two input arguments: column name and column value
it requires a created type that it will be returning a set of
first loop identifies tables that have a column name specified as the input argument
then it forms a query which aggregates all rows that match the input condition inside every table taken from step 3 with comparison based on ILIKE - as per your example
function goes into the second loop only if there is at least one row in currently visited table that matches specified condition (then the array is not null)
second loop unnests the array of rows that match the condition and for every element it puts it in the function output with RETURN NEXT rec clause
Notes
Searching with LIKE is inefficient - I suggest adding another input argument "column type" and restrict it in the lookup by adding a join to pg_catalog.pg_type table.
The second loop is there so that if more than 1 row is found for a particular table, then every row gets returned.
If you are looking for something else, like you need key-value pairs, not just the values, then you need to extend the function. You could for example build json format from rows.
Now, to the code.
Test case
CREATE TABLE tbl1 (col1 int, id int); -- does contain values
CREATE TABLE tbl2 (col1 int, col2 int); -- doesn't contain column "id"
CREATE TABLE tbl3 (id int, col5 int); -- doesn't contain values
INSERT INTO tbl1 (col1, id)
VALUES (1, 5), (1, 33), (1, 25);
Table stores data:
postgres=# select * From tbl1;
col1 | id
------+----
1 | 5
1 | 33
1 | 25
(3 rows)
Creating type
CREATE TYPE sometype AS ( schemaname text, tablename text, colname text, entirerow text );
Function code
CREATE OR REPLACE FUNCTION search_tables_for_column (
v_column_name text
, v_column_value text
)
RETURNS SETOF sometype
LANGUAGE plpgsql
STABLE
AS
$$
DECLARE
rec sometype%rowtype;
v_row_array text[];
rec2 record;
arr_el text;
BEGIN
FOR rec IN
SELECT
nam.nspname AS schemaname
, cls.relname AS tablename
, att.attname AS colname
, null::text AS entirerow
FROM
pg_attribute att
JOIN pg_class cls ON att.attrelid = cls.oid
JOIN pg_namespace nam ON cls.relnamespace = nam.oid
WHERE
cls.relkind = 'r'
AND att.attname = v_column_name
LOOP
EXECUTE format('SELECT ARRAY_AGG(row(tablename.*)::text) FROM %I.%I AS tablename WHERE %I::text ILIKE %s',
rec.schemaname, rec.tablename, rec.colname, quote_literal(concat('%',v_column_value,'%'))) INTO v_row_array;
IF v_row_array is not null THEN
FOR rec2 IN
SELECT unnest(v_row_array) AS one_row
LOOP
rec.entirerow := rec2.one_row;
RETURN NEXT rec;
END LOOP;
END IF;
END LOOP;
END
$$;
Exemplary call & output
postgres=# select * from search_tables_for_column('id','5');
schemaname | tablename | colname | entirerow
------------+-----------+---------+-----------
public | tbl1 | id | (1,5)
public | tbl1 | id | (1,25)
(2 rows)

Postgresql, select a "fake" row

In Postgres 8.4 or higher, what is the most efficient way to get a row of data populated by defaults without actually creating the row. Eg, as a transaction (pseudocode):
create table "mytable"
(
id serial PRIMARY KEY NOT NULL,
parent_id integer NOT NULL DEFAULT 1,
random_id integer NOT NULL DEFAULT random(),
)
begin transaction
fake_row = insert into mytable (id) values (0) returning *;
delete from mytable where id=0;
return fake_row;
end transaction
Basically I'd expect a query with a single row where parent_id is 1 and random_id is a random number (or other function return value) but I don't want this record to persist in the table or impact on the primary key sequence serial_id_seq.
My options seem to be using a transaction like above or creating views which are copies of the table with the fake row added but I don't know all the pros and cons of each or whether a better way exists.
I'm looking for an answer that assumes no prior knowledge of the datatypes or default values of any column except id or the number or ordering of the columns. Only the table name will be known and that a record with id 0 should not exist in the table.
In the past I created the fake record 0 as a permanent record but I've come to consider this record a type of pollution (since I typically have to filter it out of future queries).
You can copy the table definition and defaults to the temp table with:
CREATE TEMP TABLE table_name_rt (LIKE table_name INCLUDING DEFAULTS);
And use this temp table to generate dummy rows. Such table will be dropped at the end of the session (or transaction) and will only be visible to current session.
You can query the catalog and build a dynamic query
Say we have this table:
create table test10(
id serial primary key,
first_name varchar( 100 ),
last_name varchar( 100 ) default 'Tom',
age int not null default 38,
salary float default 100.22
);
When you run following query:
SELECT string_agg( txt, ' ' order by id )
FROM (
select 1 id, 'SELECT ' txt
union all
select 2, -9999 || ' as id '
union all
select 3, ', '
|| coalesce( column_default, 'null'||'::'||c.data_type )
|| ' as ' || c.column_name
from information_schema.columns c
where table_schema = 'public'
and table_name = 'test10'
and ordinal_position > 1
) xx
;
you will get this sting as a result:
"SELECT -9999 as id , null::character varying as first_name ,
'Tom'::character varying as last_name , 38 as age , 100.22 as salary"
then execute this query and you will get the "phantom row".
We can build a function that build and excecutes the query and return our row as a result:
CREATE OR REPLACE FUNCTION get_phantom_rec (p_i test10.id%type )
returns test10 as $$
DECLARE
v_sql text;
myrow test10%rowtype;
begin
SELECT string_agg( txt, ' ' order by id )
INTO v_sql
FROM (
select 1 id, 'SELECT ' txt
union all
select 2, p_i || ' as id '
union all
select 3, ', '
|| coalesce( column_default, 'null'||'::'||c.data_type )
|| ' as ' || c.column_name
from information_schema.columns c
where table_schema = 'public'
and table_name = 'test10'
and ordinal_position > 1
) xx
;
EXECUTE v_sql INTO myrow;
RETURN myrow;
END$$ LANGUAGE plpgsql ;
and then this simple query gives you what you want:
select * from get_phantom_rec ( -9999 );
id | first_name | last_name | age | salary
-------+------------+-----------+-----+--------
-9999 | | Tom | 38 | 100.22
I would just select the fake values as literals:
select 1 id, 1 parent_id, 1 user_id
The returned row will be (virtually) indistinguishable from a real row.
To get the values from the catalog:
select
0 as id, -- special case for serial type, just return 0
(select column_default::int -- Cast to int, because we know the column is int
from INFORMATION_SCHEMA.COLUMNS
where table_name = 'mytable'
and column_name = 'parent_id') as parent_id,
(select column_default::int -- Cast to int, because we know the column is int
from INFORMATION_SCHEMA.COLUMNS
where table_name = 'mytable'
and column_name = 'user_id') as user_id;
Note that you must know what the columns are and their type, but this is reasonable. If you change the table schema (except default value), you would need to tweak the query.
See the above as a SQLFiddle.