Function or loop using table and topology names as arguments in PostgreSQL - postgresql

I'm working with topologies in PostGIS and to create a TopoGeometry column, I'm using this loop:
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM table_uf_11 LOOP
        BEGIN
            UPDATE table_uf_11
            SET tg_geom = toTopoGeom(ST_Force2D(geom), 'topology_uf_11', 1, 1)
            WHERE gid = r.gid;
        EXCEPTION
            WHEN OTHERS THEN
                RAISE WARNING 'Loading of record % failed: %', r.gid, SQLERRM;
        END;
    END LOOP;
END$$;
The reason for using this loop is that for some rows the toTopoGeom function raises an error, but these are just a few cases, for example 38 out of 24,000.
Using this structure I can identify which cases are problematic in the log and fix them later.
My problem is that I have another 26 tables with their respective topologies, all of them identified by a state code, for example:
table_uf_12 / topology_uf_12
table_uf_13 / topology_uf_13
table_uf_14 / topology_uf_14
...
table_uf_53 / topology_uf_53
The state codes are not necessarily sequential, but the names follow the same pattern. The column names, such as geom and tg_geom, are the same in all tables.
How can I make a function or another loop structure to replicate this process over all 27 tables and at the same time save the log for each table?
I tried to make a function, but in that case the arguments would be the table name and the topology name, and I'm having difficulty putting this structure together.
Any suggestions?

I think this should do it:
DO $BODY$
DECLARE
    t regclass;
    gid bigint;
BEGIN
    FOR t IN SELECT oid::regclass FROM pg_class WHERE relname ~ '^table_uf_\d+$' LOOP
        FOR gid IN EXECUTE 'SELECT gid FROM ' || t::text LOOP
            BEGIN
                EXECUTE
                    ' UPDATE ' || t::text ||
                    ' SET tg_geom = toTopoGeom(ST_Force2D(geom), $2, 1, 1)' ||
                    ' WHERE gid = $1'
                USING gid, replace(t::text, 'table', 'topology');
            EXCEPTION
                WHEN OTHERS THEN
                    RAISE WARNING 'Loading of record % failed: %', gid, SQLERRM;
            END;
        END LOOP;
    END LOOP;
END
$BODY$;
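If you would rather have a named function that takes the table and topology names as arguments, as originally asked, the same body can be wrapped up roughly like this (just a sketch: the function name load_topogeom is made up, and the gid/geom/tg_geom columns are assumed to match the question):

CREATE OR REPLACE FUNCTION load_topogeom(_tbl regclass, _topo text)
RETURNS void
LANGUAGE plpgsql AS $func$
DECLARE
    _gid bigint;
BEGIN
    FOR _gid IN EXECUTE 'SELECT gid FROM ' || _tbl::text LOOP
        BEGIN
            EXECUTE
                'UPDATE ' || _tbl::text ||
                ' SET tg_geom = toTopoGeom(ST_Force2D(geom), $2, 1, 1)' ||
                ' WHERE gid = $1'
            USING _gid, _topo;
        EXCEPTION
            WHEN OTHERS THEN
                RAISE WARNING 'Loading of record % in % failed: %', _gid, _tbl, SQLERRM;
        END;
    END LOOP;
END
$func$;

-- e.g. SELECT load_topogeom('table_uf_12', 'topology_uf_12');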

Related

How to implement execute command inside for loop - postgres, dbeaver

I am new to Postgres. I need to create a function that goes through a list of tables (whose names are stored in one table) and deletes, from each of them, the records that are older than x days and have a certain row_status. Some tables do not have a row_status column.
I get an error when I try to save the function in DBeaver: ERROR: syntax error at or near "||"
create function delete_old_records1(day1 int, row_status1 character default null, row_status2 character default null)
returns void
language plpgsql
as $$
declare
c all_tables1%rowtype;
begin
for c in select * from all_tables1 loop
if exists(SELECT column_name FROM information_schema.columns
WHERE table_schema = 'yard_kondor' AND table_name =c.table_name AND column_name = 'row_status') then
execute 'delete from '||c.table_name||' where row_create_datetime>current_date-day1+1 and
row_status in (coalesce(row_status1,''P''), row_status2)';
else
execute 'delete from '||c.table_name||' where row_create_datetime>current_date-day1+1';
raise notice 'Table '||c.table_name||' does not have row_status column';
end if;
end loop;
return;
commit;
end;
$$
Your immediate problem is this line:
raise notice 'Table '||c.table_name||' does not have row_status column';
That should be:
raise notice 'Table % does not have row_status column', c.table_name;
However, your function could be improved a bit. In general it is highly recommended to use format() to generate dynamic SQL to properly deal with identifiers. You also can't commit in a function. If you really need that, use a procedure.
create function delete_old_records1(day1 int, row_status1 character default null, row_status2 character default null)
returns void
language plpgsql
as $$
declare
c all_tables1%rowtype;
begin
for c in select * from all_tables1
loop
if exists (SELECT column_name FROM information_schema.columns
WHERE table_schema = 'yard_kondor'
AND table_name = c.table_name
AND column_name = 'row_status') then
execute format('delete from %I
                where row_create_datetime > current_date - %s + 1
                and row_status in (coalesce(row_status1,%L), row_status2)', c.table_name, day1, 'P');
else
execute format('delete from %I where row_create_datetime > current_date - day1 + 1', c.table_name);
raise notice 'Table % does not have row_status column', c.table_name;
end if;
end loop;
return;
-- you can't commit in a function
end;
$$
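For the commit part mentioned above, the same loop could be wrapped in a procedure instead. A stripped-down sketch (row_status handling omitted for brevity; it assumes the same all_tables1 helper table and is not part of the original answer):

create procedure delete_old_records1_proc(day1 int)
language plpgsql
as $$
declare
    c all_tables1%rowtype;
begin
    for c in select * from all_tables1 loop
        execute format('delete from %I where row_create_datetime > current_date - %s + 1',
                       c.table_name, day1);
        commit;  -- allowed in a procedure (one transaction per table), not in a function
    end loop;
end;
$$;
-- call delete_old_records1_proc(31);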
Thank you for the answer. Now I'm able to save the function, but I have a problem running it.
I started the function with:
DO $$ BEGIN
PERFORM "delete_old_records1"(31,'P','N');
END $$;
I started the script with ALT+X (I also tried select delete_old_records1(31,'P','N');) and get this error:
SQL Error [42703]: ERROR: column "day1" does not exist
Where: PL/pgSQL function delete_old_records1(integer,character,character) line 11 at EXECUTE statement
SQL statement "SELECT "delete_old_records1"(31,'P','N')"
PL/pgSQL function inline_code_block line 2 at PERFORM
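That error is not surprising with the function as written: day1, row_status1 and row_status2 are PL/pgSQL variables, and variables are not visible inside the string run by EXECUTE, so the dynamic statement treats them as column names. They have to be passed in with USING (or have their values interpolated with format()). A possible correction of the two branches, as a sketch only (same yard_kondor schema and column names as above):

if exists (SELECT column_name FROM information_schema.columns
           WHERE table_schema = 'yard_kondor'
             AND table_name = c.table_name
             AND column_name = 'row_status') then
    execute format('delete from %I
                    where row_create_datetime > current_date - $1 + 1
                    and row_status in (coalesce($2, ''P''), $3)', c.table_name)
    using day1, row_status1, row_status2;
else
    execute format('delete from %I
                    where row_create_datetime > current_date - $1 + 1', c.table_name)
    using day1;
    raise notice 'Table % does not have row_status column', c.table_name;
end if;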

How to access a record's field in an Execute sentence?

I need to access a dynamic field from a dynamic table, and this is the code I have for doing it. Here f_field holds the name of a column of t_table, determined dynamically.
When t_row is declared as record this code raises an error, but when it is declared with a concrete row type it runs as expected.
The question is: how can I declare t_row so that this code runs properly, or how can I achieve the same thing another way? Remember that t_table and f_field are dynamic, so their values change.
FOR t_row IN EXECUTE 'SELECT * from ' || t_table loop
EXECUTE format('select $1.%I', f_field) USING t_row into f;
RAISE notice '%', f;
END LOOP;
If I get your task right, you don't need $1 and USING here at all, e.g.:
do
$$
declare
t_table text;
f text;
f_field text := 'oid';
r record;
begin
for r in (select relname from pg_class where relname = 'pg_database') loop
EXECUTE format('select %I from %I', f_field, r.relname) into f;
RAISE notice '%', f;
end loop;
end;
$$
;
Mind that this way it returns only the first row out of many:
NOTICE: 12669
DO
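To get every value rather than just the first, the inner query can itself be run with a FOR ... IN EXECUTE loop, along these lines (a sketch built on the same pg_class/pg_database example):

do
$$
declare
    f_field text := 'oid';
    f text;
    r record;
    v record;
begin
    for r in (select relname from pg_class where relname = 'pg_database') loop
        -- loop over every row returned by the dynamic query
        for v in execute format('select %I as val from %I', f_field, r.relname) loop
            f := v.val;
            raise notice '%', f;
        end loop;
    end loop;
end;
$$;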
When I was doing my research I found out that statements like EXECUTE ... don't understand record structures; I found this out by experimenting with plpgsql, because there is no documentation about it. So I need to convert t_row into another kind of object that offers a way to extract the value of a dynamic field; I chose json because I think it is easy to use. This is my code:
FOR t_row IN EXECUTE 'SELECT * from dd.a' loop
f_json := row_to_json(t_row);
f := json_extract_path(f_json, f_field);
RAISE notice '%', f;
END LOOP;
It works fine for me, but this is only a workaround; if someone could point out why my first code is not working, I would appreciate it. Thanks in advance.
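For reference, a self-contained variant of that workaround (using pg_database in place of dd.a, and to_jsonb with the ->> operator instead of row_to_json/json_extract_path; purely illustrative):

do
$$
declare
    f_field text := 'datname';
    t_row record;
    f text;
begin
    for t_row in execute 'select * from pg_database' loop
        f := to_jsonb(t_row) ->> f_field;  -- record to jsonb, then pull the field out as text
        raise notice '%', f;
    end loop;
end;
$$;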

Creating a TEMP TABLE dynamically in PostgreSQL and selecting from the same table in a FOR loop, but getting an error near the pipe symbol

do
$xyz$
declare
y text;
i record;
begin
y := to_char(current_timestamp, 'YYYYMMDDHHMMSS');
raise notice '%',y;
execute 'CREATE TEMP TABLE someNewTable'
||y
||' AS select * from ( VALUES(0::int,-99999::numeric), (1::int, 100::numeric)) as t (key, value)';
for i in (select * from someNewTable||y) loop
raise notice '%',i.key;
end loop;
end;
$xyz$ language 'plpgsql'
ERROR: syntax error at or near "||"
LINE 13: for i in (select * from someNewTable||y) loop
I'm unable to understand why the error is at the pipe symbol. Please help me. I have been trying this in an Oracle db too, with the same error. Am I doing anything wrong here?
The query in the for ... loop statement also has to be dynamic, so you should use execute twice.
Use the format() function which is very convenient in conjunction with execute:
do $xyz$
declare
y text;
i record;
begin
y := to_char(current_timestamp, 'YYYYMMDDHHMMSS');
raise notice '%', y;
execute format($ex$
create temp table somenewtable%s
as select * from (
values
(0::int, -99999::numeric),
(1::int, 100::numeric)
) as t (key, value)
$ex$, y);
for i in
execute format($ex$
select * from somenewtable%s
$ex$, y)
loop
raise notice '%',i.key;
end loop;
end;
$xyz$ language 'plpgsql';
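One small design note on the above (not part of the original answer): %s splices y in as raw text, and since it ends up as part of an identifier, quoting the whole generated name with %I is the more defensive pattern. As a drop-in change to the two statements inside the same do block:

execute format('create temp table %I as select * from (values (0::int, -99999::numeric), (1::int, 100::numeric)) as t (key, value)',
               'somenewtable' || y);
for i in execute format('select * from %I', 'somenewtable' || y) loop
    raise notice '%', i.key;
end loop;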

Fixing invalid memory alloc request at PostgreSQL 9.2.9

I've encountered a problem querying some of my tables recently. When I try to select data I get this error: ERROR: invalid memory alloc request size 4294967293. This generally indicates data corruption. There is a nice and precise technique for deleting corrupted rows described here: https://confluence.atlassian.com/jirakb/invalid-memory-alloc-request-size-440107132.html
But, since I have lots of corrupted tables, this method is too slow. So, I've found a nice function which returns the last successful ctid here: http://blog.dob.sk/2012/05/19/fixing-pg_dump-invalid-memory-alloc-request-size/
Looking for a corrupted row is a bit faster when using it, but not fast enough. I slightly modified it to store all of the "last successful ctid" values in a separate table, and now it looks like this:
CREATE OR REPLACE FUNCTION
find_bad_row(tableName TEXT)
RETURNS void
as $find_bad_row$
DECLARE
result tid;
curs REFCURSOR;
row1 RECORD;
row2 RECORD;
tabName TEXT;
count BIGINT := 0;
BEGIN
DROP TABLE IF EXISTS bad_rows_tbl;
CREATE TABLE bad_rows_tbl (id varchar(255), offs BIGINT);
SELECT reverse(split_part(reverse($1), '.', 1)) INTO tabName;
OPEN curs FOR EXECUTE 'SELECT ctid FROM ' || tableName;
count := 1;
FETCH curs INTO row1;
WHILE row1.ctid IS NOT NULL LOOP
BEGIN
result = row1.ctid;
count := count + 1;
FETCH curs INTO row1;
EXECUTE 'SELECT (each(hstore(' || tabName || '))).* FROM '
|| tableName || ' WHERE ctid = $1' INTO row2
USING row1.ctid;
IF count % 100000 = 0 THEN
RAISE NOTICE 'rows processed: %', count;
END IF;
EXCEPTION
WHEN SQLSTATE 'XX000' THEN
RAISE NOTICE 'LAST CTID: %', result;
EXECUTE 'INSERT INTO bad_rows_tbl VALUES(' || result || ',' || count || ')';
END;
END LOOP;
CLOSE curs;
END
$find_bad_row$
LANGUAGE plpgsql;
I'm quite new to plpgsql, so I'm stuck on the following question: how can I get not the last successful ctid but the exact failing one (or compute the next one from the last successful one), so that I can insert it into bad_rows_tbl and use it as an argument for a DELETE statement later?
Hope for some help...
UPD: the function I ended up with:
CREATE OR REPLACE FUNCTION
find_bad_row(tableName TEXT)
RETURNS tid[]
as $find_bad_row$
DECLARE
result tid;
curs REFCURSOR;
row1 RECORD;
row2 RECORD;
tabName TEXT;
youNeedMe BOOLEAN = false;
count BIGINT := 0;
arrIter BIGINT := 0;
arr tid[];
BEGIN
CREATE TABLE bad_rows_tbl (id varchar(255), offs BIGINT);
SELECT reverse(split_part(reverse($1), '.', 1)) INTO tabName;
OPEN curs FOR EXECUTE 'SELECT ctid FROM ' || tableName;
count := 1;
FETCH curs INTO row1;
WHILE row1.ctid IS NOT NULL LOOP
BEGIN
result = row1.ctid;
count := count + 1;
IF youNeedMe THEN
arr[arrIter] = result;
arrIter := arrIter + 1;
RAISE NOTICE 'ADDING CTID: %', result;
youNeedMe = FALSE;
END IF;
FETCH curs INTO row1;
EXECUTE 'SELECT (each(hstore(' || tabName || '))).* FROM '
|| tableName || ' WHERE ctid = $1' INTO row2
USING row1.ctid;
IF count % 100000 = 0 THEN
RAISE NOTICE 'rows processed: %', count;
END IF;
EXCEPTION
WHEN SQLSTATE 'XX000' THEN
RAISE NOTICE 'LAST GOOD CTID: %', result;
youNeedMe = TRUE;
END;
END LOOP;
CLOSE curs;
RETURN arr;
END
$find_bad_row$
LANGUAGE plpgsql;
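Purely as an illustration of how the returned array could be used afterwards (table name made up here):

-- collect the ctids that follow the last readable rows
SELECT find_bad_row('public.my_table');
-- then, with the returned array (e.g. {"(12345,6)"}), remove those rows
DELETE FROM my_table WHERE ctid = ANY ('{"(12345,6)"}'::tid[]);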
This is supplemental to the function given in the question, and covers the next steps once the db is dumpable.
Your next steps should be:
Dump (pg_dumpall) and restore on a physically different system. The reason is that at this point we don't know what caused this, and the chances are not too bad that it might be hardware.
You need to take the old system down and run hardware diagnostics on it, looking for problems. You really want to find out what happened so you don't run into it again. Of particular interest:
Double check ECC RAM and MCE logs
Look at all RAID arrays and their battery backups
CPUs and PSUs
If it were me, I would also look at environmental factors such as incoming AC power and datacenter temperature.
Go over your backup strategy. In particular look at PITR (and related utility pgbarman). Make sure you can recover from a similar situation in the future if you run into it.
Data corruption doesn't just happen. In rare cases it can be caused by bugs in PostgreSQL, but in most cases it is due to your hardware or due to custom code you have running in the back-end. Narrowing down the cause and ensuring recoverability are critical going forward.
Assuming you aren't running custom C code in your database, most likely your data corruption is due to something on the hardware side.

Variables for identifiers inside IF EXISTS in a plpgsql function

CREATE OR REPLACE FUNCTION drop_now()
RETURNS void AS
$BODY$
DECLARE
row record;
BEGIN
RAISE INFO 'in';
FOR row IN
select relname from pg_stat_user_tables
WHERE schemaname='public' AND relname LIKE '%test%'
LOOP
IF EXISTS(SELECT row.relname.tm FROM row.relname
WHERE row.relname.tm < current_timestamp - INTERVAL '90 minutes'
LIMIT 1)
THEN
-- EXECUTE 'DROP TABLE ' || quote_ident(row.relname);
RAISE INFO 'Dropped table: %', quote_ident(row.relname);
END IF;
END LOOP;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Could you tell me how to use variables in the SELECT inside IF EXISTS? At the moment, row.relname.tm and row.relname are treated literally, which is not what I want.
CREATE OR REPLACE FUNCTION drop_now()
RETURNS void AS
$func$
DECLARE
_tbl regclass;
_found int;
BEGIN
FOR _tbl IN
SELECT relid
FROM pg_stat_user_tables
WHERE schemaname = 'public'
AND relname LIKE '%test%'
LOOP
EXECUTE format($f$SELECT 1 FROM %s
WHERE tm < now() - interval '90 min'$f$, _tbl);
GET DIAGNOSTICS _found = ROW_COUNT;
IF _found > 0 THEN
-- EXECUTE 'DROP TABLE ' || _tbl;
RAISE NOTICE 'Dropped table: %', _tbl;
END IF;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Major points
row is a reserved word in the SQL standard. Its use is allowed in Postgres, but it's still unwise. I make it a habit to prefix plpgsql variables with an underscore _ to avoid any naming conflicts.
You don't select the whole row anyway, just the table name in this example. Best use a variable of type regclass, thereby avoiding SQL injection by way of illegal table names automatically. Details in this related answer:
Table name as a PostgreSQL function parameter
You don't need LIMIT in an EXISTS expression, which only checks for the existence of any rows. And you don't need meaningful target columns for the same reason. Just write SELECT 1 or SELECT * or something.
You need dynamic SQL for queries with variable identifiers. Plain SQL does not allow for that. I.e.: build a query string and EXECUTE it. Details in this closely related answer:
Dynamic SQL (EXECUTE) as condition for IF statement
The same is true for a DROP statement, should you want to run it. I added a comment.
You'll need to build your query as a string then execute that - see the section on executing dynamic commands in the plpgsql section of the manual.
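A minimal sketch of that approach, reusing the loop from the question (the tm column and the 90-minute cutoff come from there; illustrative only):

DO $$
DECLARE
    _tbl record;
    _exists boolean;
BEGIN
    FOR _tbl IN
        SELECT relname FROM pg_stat_user_tables
        WHERE schemaname = 'public' AND relname LIKE '%test%'
    LOOP
        -- build the query string with the table name quoted, then execute it
        EXECUTE format(
            'SELECT EXISTS (SELECT 1 FROM %I WHERE tm < current_timestamp - interval ''90 minutes'')',
            _tbl.relname)
        INTO _exists;
        IF _exists THEN
            RAISE INFO 'Would drop table: %', _tbl.relname;
            -- EXECUTE format('DROP TABLE %I', _tbl.relname);
        END IF;
    END LOOP;
END
$$;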