Query on subquery that gets tables' names - postgresql

I have a few tables in my database. They all have the same columns (id, name) but differ in the table name. Those tables have names that start with the letter 'h'.
Not a very interesting schema design but I have to follow it.
I need to search for id in all those tables.
I tried something similar to:
select id from (select table_name
FROM information_schema.tables
where table_name like 'h%') as t;
I got error:
ERROR: column "id" does not exist.
I understand the error now, but I still do not know how to write the query.

You need dynamic SQL to do that since you cannot use values as identifiers in plain SQL. Write a PL/pgSQL function with EXECUTE:
CREATE FUNCTION f_all_tables()
RETURNS TABLE (id int) AS
$func$
DECLARE
_tbl regclass;
BEGIN
FOR _tbl IN
SELECT c.oid::regclass
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
AND c.relname LIKE 'h%'
AND n.nspname = 'public' -- your schema name here
LOOP
RETURN QUERY EXECUTE '
SELECT id FROM ' || _tbl;
END LOOP;
END
$func$ LANGUAGE plpgsql;
I am using a variable of the object identifier type regclass to prevent SQL injection effectively. More about that in this related answer:
Table name as a PostgreSQL function parameter
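To search for one particular id and see which table it came from, the same loop can also return the table name. This is a minimal sketch, assuming every matching table has an integer id column; the function name f_find_id and its parameter _id are made up for illustration:

```sql
CREATE OR REPLACE FUNCTION f_find_id(_id int)
  RETURNS TABLE (tbl regclass, id int)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _tbl regclass;
BEGIN
   FOR _tbl IN
      SELECT c.oid::regclass
      FROM   pg_catalog.pg_class c
      JOIN   pg_catalog.pg_namespace n ON n.oid = c.relnamespace
      WHERE  c.relkind = 'r'
      AND    c.relname LIKE 'h%'
      AND    n.nspname = 'public'   -- assumed schema name
   LOOP
      -- The table name is concatenated as an identifier (regclass is quoted
      -- automatically); values are passed as parameters with USING.
      RETURN QUERY EXECUTE
         'SELECT $1, t.id FROM ' || _tbl || ' t WHERE t.id = $2'
      USING _tbl, _id;
   END LOOP;
END
$func$;

-- Usage: which 'h%' tables contain id 42?
SELECT * FROM f_find_id(42);
```

Passing the searched value with USING keeps it out of the query string, so only the table name is concatenated.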

Related

Get tablename from regclass in PostgreSQL

I would like to get Tablename from regclass in PostgreSQL. I have found a work around but I am not feeling so happy with it:
SELECT split_part('datastore.inline'::regclass::TEXT, '.', 2);
Is there a dedicated function to extract the table name from regclass in PostgreSQL?
You can query pg_class:
select relname
from pg_class
where oid = 'datastore.inline'::regclass;
There is no built-in function but you can create your own one:
create or replace function get_relname(regclass)
returns name language sql as $$
select relname
from pg_class
where oid = $1
$$;
select get_relname('datastore.inline'::regclass);
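The text form of a regclass only includes the schema when the table is not visible through the current search_path, which is one reason the split_part workaround is fragile. If the schema name is needed as well, joining pg_namespace should return both parts reliably. A small sketch reusing the table name from the question:

```sql
SELECT n.nspname AS schema_name, c.relname AS table_name
FROM   pg_catalog.pg_class c
JOIN   pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE  c.oid = 'datastore.inline'::regclass;
```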

Perform query using tables and columns from information_schema

I'm trying to use information_schema.columns to find all of the columns in my database that have a geometry type and then check the SRID for the data in those columns.
I can do this with multiple queries where I first find the table names and column names
SELECT table_name, column_name
FROM information_schema.columns
WHERE udt_name = 'geometry';
and then (manually)
SELECT ST_SRID(column_name)
FROM table_name;
for each entry.
Does anyone know how to streamline this into a single query?
Table names can't be variable; Postgres needs to be able to come up with an execution plan before it knows the parameter values. So you can't do this in a simple SQL statement.
Instead, you need to construct a dynamic query string using a procedural language like PL/pgSQL:
CREATE FUNCTION SRIDs() RETURNS TABLE (
tablename TEXT,
columnname TEXT,
srid INTEGER
) AS $$
BEGIN
FOR tablename, columnname IN (
SELECT table_name, column_name
FROM information_schema.columns
WHERE udt_name = 'geometry'
)
LOOP
EXECUTE format(
'SELECT ST_SRID(%I) FROM %I', -- %I quotes identifiers safely
columnname, tablename
) INTO srid;
RETURN NEXT;
END LOOP;
END
$$
LANGUAGE plpgsql;
SELECT * FROM SRIDs();
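The function above reads the SRID of a single row per column and relies on the search_path to find each table. A possible variant, sketched under the assumption that PostGIS is installed, schema-qualifies the table and returns every distinct SRID found in each column. The name srids_per_column is hypothetical:

```sql
CREATE OR REPLACE FUNCTION srids_per_column()
  RETURNS TABLE (schemaname text, tablename text, columnname text, srid integer)
  LANGUAGE plpgsql AS
$func$
BEGIN
   FOR schemaname, tablename, columnname IN
      SELECT c.table_schema, c.table_name, c.column_name
      FROM   information_schema.columns c
      WHERE  c.udt_name = 'geometry'
   LOOP
      RETURN QUERY EXECUTE format(
         'SELECT DISTINCT %L::text, %L::text, %L::text, ST_SRID(%I)
          FROM %I.%I',
         schemaname, tablename, columnname,   -- literals for the output row
         columnname, schemaname, tablename);  -- identifiers for the query
   END LOOP;
END
$func$;

SELECT * FROM srids_per_column();
```

A column with mixed SRIDs then shows up as several rows, which is usually what a consistency check is looking for.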

Wait for the sequence to get last_value

I have a query that gives all sequences together with the nextval:
SELECT c.oid::regclass, setval(c.oid, nextval(c.oid), false)
FROM pg_class c
WHERE c.relkind = 'S'
But it throws an error on the production database:
ERROR: cannot access temporary tables of other sessions
I've also created a function with last_value (to avoid setting the sequence value) like in this post Get max id of all sequences in PostgreSQL
That doesn't help.
Is there a way to wait for all sequences to get finished without locking all tables?
That's my function:
CREATE TYPE tp_sequencedetails AS (sequence_name text, last_value bigint);
CREATE OR REPLACE FUNCTION getsequenceswithdetails()
RETURNS SETOF tp_sequencedetails AS
$BODY$
DECLARE
returnrec tp_sequencedetails;
sequence_name text;
BEGIN
FOR sequence_name IN (SELECT c.oid::regclass FROM pg_class c WHERE c.relkind = 'S')
LOOP
FOR returnrec IN EXECUTE 'SELECT ''' || sequence_name || ''', last_value FROM ' || sequence_name
LOOP
RETURN NEXT returnrec;
END LOOP;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ERROR: cannot access temporary tables of other sessions
Another session has created a temporary sequence. You are now trying to get or set a value for this sequence, but it is not accessible from your current session. Temporary tables and sequences can only be accessed by the session that created them.
Solution: Keep the temporary sequences out of your query.
SELECT c.oid::regclass, setval(c.oid, nextval(c.oid), false)
FROM pg_class c
JOIN pg_namespace ON pg_namespace.oid = relnamespace
WHERE c.relkind = 'S'
AND nspname NOT ILIKE 'pg_temp%';
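An alternative filter, as a sketch: pg_is_other_temp_schema() is a built-in system information function that reports whether a schema OID belongs to another session's temporary schema. Using it excludes only other sessions' temporary sequences and keeps those of the current session. This variant only lists the sequences and does not call nextval() or setval():

```sql
SELECT c.oid::regclass AS sequence_name
FROM   pg_catalog.pg_class c
WHERE  c.relkind = 'S'
AND    NOT pg_is_other_temp_schema(c.relnamespace);
```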

Alter table add column default and execute the default for each row

I am making a function that adds an id column to a given table, creates a sequence, and fills the new column's values. The column is created, but now I need to fill it with nextval() of the created sequence (1,2,3,4,5...). I don't know how to specify that in the ADD COLUMN statement.
CREATE OR REPLACE FUNCTION create_id(tabla character varying)
RETURNS void AS
$BODY$
DECLARE
BEGIN
IF NOT EXISTS (SELECT information_schema.columns.column_name FROM information_schema.columns WHERE information_schema.columns.table_name=tabla AND information_schema.columns.column_name='id')
THEN
EXECUTE 'ALTER TABLE '|| tabla ||' ADD COLUMN id numeric(8,0)';
IF NOT EXISTS (SELECT relname FROM pg_class WHERE relname='seq_id_'||tabla)
THEN
EXECUTE 'CREATE SEQUENCE seq_id_'||tabla||' INCREMENT 1 MINVALUE 1 MAXVALUE 9223372036854775807 START 1 CACHE 1';
EXECUTE 'GRANT ALL ON TABLE seq_id_'||tabla||' TO postgres';
EXECUTE 'ALTER TABLE ONLY '||tabla||' ALTER COLUMN id SET DEFAULT nextval(''seq_id_'||tabla||'''::regclass)';
END IF;
END IF;
RETURN;
END;
$BODY$
LANGUAGE plpgsql;
Your function suffers from a number of serious problems. Use this instead:
CREATE OR REPLACE FUNCTION f_create_id(_tbl text)
RETURNS void AS
$func$
DECLARE
_seq text := _tbl || '_id_seq';
BEGIN
IF EXISTS (
SELECT 1 FROM pg_namespace n
JOIN pg_class c ON c.relnamespace = n.oid
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname = current_schema() -- default to current schema
AND c.relname = _tbl
AND a.attname = 'id'
AND NOT a.attisdropped)
THEN
RAISE EXCEPTION 'Column already exists!'; RETURN;
END IF;
IF EXISTS (
SELECT 1 FROM pg_namespace n
JOIN pg_class c ON c.relnamespace = n.oid
WHERE n.nspname = current_schema() -- default to current schema
AND c.relname = _seq)
THEN
RAISE EXCEPTION 'Sequence already exists!'; RETURN;
END IF;
EXECUTE format('CREATE SEQUENCE %I.%I', current_schema(), _seq);
EXECUTE format($$ALTER TABLE %I.%I ADD COLUMN id numeric(8,0)
DEFAULT nextval('%I'::regclass)$$ -- one statement!
, current_schema(), _tbl, _seq);
END
$func$ LANGUAGE plpgsql;
Major points
If you set the column default in the same ALTER TABLE statement, values are inserted automatically. Be aware that this makes a big difference in performance for big tables, since every row has to be updated, while adding a NULL column only needs a tiny change to the system catalog.
You must define the schema to create objects in. If you want to default to the current schema, you still have to consider this in your queries to catalog (or information schema) tables. Table names are only unique in combination with the schema name.
I use the session information function current_schema() to find out the current schema.
You must safeguard against SQL injection when using dynamic SQL with user input. Details:
Table name as a PostgreSQL function parameter
If the sequence already exists, do not use it! You might interfere with existing objects.
Normally, you do not need EXECUTE GRANT ALL ON TABLE ... TO postgres. If postgres is a superuser (default) the role has all rights anyway. You might want to make postgres the owner. That would make a difference.
I am using the system catalog in both queries, while you use the information schema in one of them. I am generally not a fan of the information schema. Its bloated views are slow. The presented information adheres to a cross-database standard, but what's that good for when writing plpgsql functions, which are 100% not portable anyway?
Superior alternative
I would suggest not to use the column name id, which is an SQL anti-pattern. Use a proper descriptive name instead, like tablename || '_id'.
What's the point of using numeric(8,0)? If you don't want fractional digits, why not use integer? Simpler, smaller, faster.
Given that, you are much better off with a serial type, making everything much simpler:
CREATE OR REPLACE FUNCTION f_create_id(_tbl text)
RETURNS void AS
$func$
BEGIN
IF EXISTS (
SELECT 1 FROM pg_namespace n
JOIN pg_class c ON c.relnamespace = n.oid
JOIN pg_attribute a ON a.attrelid = c.oid
WHERE n.nspname = current_schema() -- default to current schema
AND c.relname = _tbl
AND a.attname = _tbl || '_id' -- proper column name
AND NOT a.attisdropped)
THEN
RAISE EXCEPTION 'Column already exists!';
ELSE
EXECUTE format('ALTER TABLE %I.%I ADD COLUMN %I serial'
, current_schema(), _tbl, _tbl || '_id');
END IF;
END
$func$ LANGUAGE plpgsql;
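A short usage sketch for the serial variant. The table demo_tbl is a hypothetical example and is assumed to be created in the current schema:

```sql
-- Hypothetical demo table without a key column
CREATE TABLE demo_tbl (name text);
INSERT INTO demo_tbl (name) VALUES ('a'), ('b');

SELECT f_create_id('demo_tbl');   -- adds column demo_tbl_id of type serial

-- Existing rows get their values from the new sequence
SELECT demo_tbl_id, name FROM demo_tbl;

-- A second call raises: Column already exists!
-- SELECT f_create_id('demo_tbl');
```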

Loop on tables with PL/pgSQL in Postgres 9.0+

I want to loop through all my tables to count rows in each of them. The following query gets me an error:
DO $$
DECLARE
tables CURSOR FOR
SELECT tablename FROM pg_tables
WHERE tablename NOT LIKE 'pg_%'
ORDER BY tablename;
tablename varchar(100);
nbRow int;
BEGIN
FOR tablename IN tables LOOP
EXECUTE 'SELECT count(*) FROM ' || tablename INTO nbRow;
-- Do something with nbRow
END LOOP;
END$$;
Errors:
ERROR: syntax error at or near ")"
LINE 1: SELECT count(*) FROM (sql_features)
^
QUERY: SELECT count(*) FROM (sql_features)
CONTEXT: PL/pgSQL function inline_code_block line 8 at EXECUTE statement
sql_features is a table's name in my DB. I already tried to use quote_ident() but to no avail.
I can't remember the last time I actually needed to use an explicit cursor for looping in PL/pgSQL.
Use the implicit cursor of a FOR loop, that's much cleaner:
DO
$$
DECLARE
rec record;
nbrow bigint;
BEGIN
FOR rec IN
SELECT *
FROM pg_tables
WHERE tablename NOT LIKE 'pg\_%'
ORDER BY tablename
LOOP
EXECUTE 'SELECT count(*) FROM '
|| quote_ident(rec.schemaname) || '.'
|| quote_ident(rec.tablename)
INTO nbrow;
-- Do something with nbrow
END LOOP;
END
$$;
You need to include the schema name to make this work for all schemas (including those not in your search_path).
Also, you actually need to use quote_ident() or format() with %I or a regclass variable to safeguard against SQL injection. A table name can be almost anything inside double quotes. See:
Table name as a PostgreSQL function parameter
Minor detail: escape the underscore (_) in the LIKE pattern to make it a literal underscore: tablename NOT LIKE 'pg\_%'
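To illustrate the quoting functions mentioned above, here is a small sketch with a made-up table name, 'My Table'. Identifiers are only double-quoted when they need to be:

```sql
SELECT quote_ident('sql_features')            AS plain_name      -- sql_features
     , quote_ident('My Table')                AS quoted_name     -- "My Table"
     , format('%I.%I', 'public', 'My Table')  AS qualified_name; -- public."My Table"
```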
How I might do it:
DO
$$
DECLARE
tbl regclass;
nbrow bigint;
BEGIN
FOR tbl IN
SELECT c.oid
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
AND n.nspname NOT LIKE 'pg\_%' -- system schema(s)
AND n.nspname <> 'information_schema' -- information schema
ORDER BY n.nspname, c.relname
LOOP
EXECUTE 'SELECT count(*) FROM ' || tbl INTO nbrow;
-- raise notice '%: % rows', tbl, nbrow;
END LOOP;
END
$$;
Query pg_catalog.pg_class instead of pg_tables; it provides the OID of the table.
The object identifier type regclass is handy to simplify. In particular, table names are double-quoted and schema-qualified automatically where necessary (which also prevents SQL injection).
This query also excludes temporary tables (temp schema is named pg_temp% internally).
To only include tables from a given schema:
AND n.nspname = 'public' -- schema name here, case-sensitive
The cursor returns a record, not a scalar value, so "tablename" is not a string variable.
The concatenation turns the record into a string that looks like this (sql_features). If you had selected e.g. the schemaname with the tablename, the text representation of the record would have been (public,sql_features).
So you need to access the column inside the record to create your SQL statement:
DO $$
DECLARE
tables CURSOR FOR
SELECT tablename
FROM pg_tables
WHERE tablename NOT LIKE 'pg_%'
ORDER BY tablename;
nbRow int;
BEGIN
FOR table_record IN tables LOOP
EXECUTE 'SELECT count(*) FROM ' || table_record.tablename INTO nbRow;
-- Do something with nbRow
END LOOP;
END$$;
You might want to use WHERE schemaname = 'public' instead of not like 'pg_%' to exclude the Postgres system tables.