Fixing invalid memory alloc request in PostgreSQL 9.2.9 - postgresql

I've encountered a problem querying some of my tables recently. When I try to select data I get an error saying: ERROR: invalid memory alloc request size 4294967293. This generally indicates data corruption. There is a nice and precise technique for deleting corrupted rows described here: https://confluence.atlassian.com/jirakb/invalid-memory-alloc-request-size-440107132.html
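The linked technique boils down to bisecting the table until the single failing row is isolated, and then deleting it by its physical address. A rough sketch (the table name, ranges and ctid here are hypothetical):

-- Keep halving the scanned range until one failing row is isolated:
SELECT * FROM mytable LIMIT 5000 OFFSET 0;   -- errors out?
SELECT * FROM mytable LIMIT 2500 OFFSET 0;   -- still errors? narrow further...
-- Once the culprit is pinned down, remove it by its physical address:
DELETE FROM mytable WHERE ctid = '(1105,22)';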
But, since I have lots of corrupted tables, this method is too slow. So, I've found a nice function which returns the last successful ctid here: http://blog.dob.sk/2012/05/19/fixing-pg_dump-invalid-memory-alloc-request-size/
Looking for a corrupted row is a bit faster when using it, but not fast enough. I slightly modified it to store all the "last successful ctids" in a different table, and now it looks like this:
CREATE OR REPLACE FUNCTION
find_bad_row(tableName TEXT)
RETURNS void
AS $find_bad_row$
DECLARE
    result tid;
    curs REFCURSOR;
    row1 RECORD;
    row2 RECORD;
    tabName TEXT;
    count BIGINT := 0;
BEGIN
    DROP TABLE IF EXISTS bad_rows_tbl;
    CREATE TABLE bad_rows_tbl (id varchar(255), offs BIGINT);
    -- strip the schema part, keeping only the bare table name
    SELECT reverse(split_part(reverse($1), '.', 1)) INTO tabName;
    OPEN curs FOR EXECUTE 'SELECT ctid FROM ' || tableName;
    count := 1;
    FETCH curs INTO row1;
    WHILE row1.ctid IS NOT NULL LOOP
        BEGIN
            result = row1.ctid;
            count := count + 1;
            FETCH curs INTO row1;
            -- force the whole row to be read and detoasted
            EXECUTE 'SELECT (each(hstore(' || tabName || '))).* FROM '
                || tableName || ' WHERE ctid = $1' INTO row2
                USING row1.ctid;
            IF count % 100000 = 0 THEN
                RAISE NOTICE 'rows processed: %', count;
            END IF;
        EXCEPTION
            WHEN SQLSTATE 'XX000' THEN
                RAISE NOTICE 'LAST CTID: %', result;
                -- the tid has to be passed as a parameter, not spliced in raw
                EXECUTE 'INSERT INTO bad_rows_tbl VALUES ($1, $2)'
                    USING result::text, count;
        END;
    END LOOP;
    CLOSE curs;
END
$find_bad_row$
LANGUAGE plpgsql;
I'm quite new to plpgsql, so I'm stuck with the following question: how can I get not the last successful ctid but the exact unsuccessful one (or calculate the next one from the last successful), so that I can insert it into bad_rows_tbl and later use it as the argument for a DELETE statement?
Hope for some help...
UPD: the function I ended up with:
CREATE OR REPLACE FUNCTION
find_bad_row(tableName TEXT)
RETURNS tid[]
AS $find_bad_row$
DECLARE
    result tid;
    curs REFCURSOR;
    row1 RECORD;
    row2 RECORD;
    tabName TEXT;
    youNeedMe BOOLEAN = false;
    count BIGINT := 0;
    arrIter BIGINT := 0;
    arr tid[];
BEGIN
    CREATE TABLE bad_rows_tbl (id varchar(255), offs BIGINT);
    SELECT reverse(split_part(reverse($1), '.', 1)) INTO tabName;
    OPEN curs FOR EXECUTE 'SELECT ctid FROM ' || tableName;
    count := 1;
    FETCH curs INTO row1;
    WHILE row1.ctid IS NOT NULL LOOP
        BEGIN
            result = row1.ctid;
            count := count + 1;
            IF youNeedMe THEN
                -- the previous iteration failed, so this ctid is the bad one
                arr[arrIter] = result;
                arrIter := arrIter + 1;
                RAISE NOTICE 'ADDING CTID: %', result;
                youNeedMe = FALSE;
            END IF;
            FETCH curs INTO row1;
            EXECUTE 'SELECT (each(hstore(' || tabName || '))).* FROM '
                || tableName || ' WHERE ctid = $1' INTO row2
                USING row1.ctid;
            IF count % 100000 = 0 THEN
                RAISE NOTICE 'rows processed: %', count;
            END IF;
        EXCEPTION
            WHEN SQLSTATE 'XX000' THEN
                RAISE NOTICE 'LAST GOOD CTID: %', result;
                youNeedMe = TRUE;
        END;
    END LOOP;
    CLOSE curs;
    RETURN arr;
END
$find_bad_row$
LANGUAGE plpgsql;
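With the tid[] return value, usage then looks something like this (a sketch; the table name is hypothetical, and it relies on tid equality working with = ANY):

-- collect the bad ctids once, then delete exactly those tuples
DO $$
DECLARE
    bad tid[] := find_bad_row('public.my_table');
BEGIN
    DELETE FROM public.my_table WHERE ctid = ANY (bad);
END $$;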

This is supplemental to the function given in the question, and it covers the next steps after the db is dumpable.
Your next steps should be:
Dump everything (pg_dumpall) and restore it on a physically different system. The reason being that at this point we don't know what caused this, and the chances are not too bad that it might be hardware.
You need to take the old system down and run hardware diagnostics on it, looking for problems. You really want to find out what happened so you don't run into it again. Of particular interest:
Double check ECC RAM and MCE logs
Look at all RAID arrays and their battery backups
CPUs and PSUs
If it were me, I would also look at environmental factors such as AC input and datacenter temperature.
Go over your backup strategy. In particular, look at PITR (and the related utility pgbarman). Make sure you can recover from a similar situation in the future if you run into it.
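For 9.2, the WAL-archiving essentials for PITR in postgresql.conf look roughly like this (a sketch only; the archive path is a placeholder, and regular base backups are still needed on top):

wal_level = archive               # on 9.6+ this is covered by 'replica'
archive_mode = on
archive_command = 'cp %p /mnt/wal_archive/%f'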
Data corruption doesn't just happen. In rare cases it can be caused by bugs in PostgreSQL, but in most cases it is due to your hardware or due to custom code you have running in the back-end. Narrowing down the cause and ensuring recoverability are critical going forward.
Assuming you aren't running custom C code in your database, most likely your data corruption is due to something on the hardware side.

Related

How to access a record's field in an Execute sentence?

I need to access a dynamic field in a dynamic table, and I have the code below for doing this.
Here f_field takes the name of a column from t_table dynamically.
When t_row is declared as a record, this code gets an error. But when it is declared with a static column type, it runs as expected.
The question is: how can I declare t_row for this code to run properly, or how can I achieve the same thing another way? Remember that t_table and f_field are dynamic, and therefore their values change.
FOR t_row IN EXECUTE 'SELECT * from ' || t_table LOOP
    EXECUTE format('select $1.%I', f_field) USING t_row INTO f;
    RAISE notice '%', f;
END LOOP;
If I get your task right, you don't need $1 and USING here at all, e.g.:
do
$$
declare
    t_table text;
    f text;
    f_field text := 'oid';
    r record;
begin
    for r in (select relname from pg_class where relname = 'pg_database') loop
        EXECUTE format('select %I from %I', f_field, r.relname) into f;
        RAISE notice '%', f;
    end loop;
end;
$$;
Mind that this way it raises only the first row out of many:
NOTICE: 12669
DO
When I was doing my research I found out that statements like EXECUTE don't understand record structure; I discovered this by experimenting with plpgsql, because there is no documentation about it. So I needed to convert t_row into another kind of object that has functions for extracting a value from a dynamic field; I chose json because I think it is easy to use. This is my code:
FOR t_row IN EXECUTE 'SELECT * from dd.a' LOOP
    f_json := row_to_json(t_row);
    f := json_extract_path(f_json, f_field);
    RAISE notice '%', f;
END LOOP;
It works fine for me, but it is only a workaround; if someone could point out why my first code is not working, I would appreciate it. Thanks in advance.
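For what it's worth, the same workaround can be written more compactly with the ->> operator (available since 9.3), assuming f_field holds the column name as text:

FOR t_row IN EXECUTE 'SELECT * from ' || t_table LOOP
    f := row_to_json(t_row) ->> f_field;   -- extract the field value as text
    RAISE notice '%', f;
END LOOP;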

Postgres pg_dump

My automated pg_dump process has been failing when attempting to back up a postgres database. The error message I'm receiving:
ERROR Message:

pg_dump: Error message from server: ERROR: invalid memory alloc request size 1249770967
pg_dump: The command was: COPY public.data_store (id, length, last_modified, data) TO stdout;
Custom backup of jackrabbit
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: invalid memory alloc request size 1249770967
pg_dump: The command was: COPY public.data_store (id, length, last_modified, data) TO stdout;
[!!ERROR!!] Failed to produce custom backup database jackrabbit
Plain backup of pdi_logging
Custom backup of pdi_logging
Plain backup of postgres
Custom backup of postgres
Plain backup of quartz
Custom backup of quartz
Based on my findings, everything seemed to point to corrupt data in the table, so I created a function to scan the table and extract the ctid, so that I could then find the culprit and delete the corrupt row.
Function:
CREATE OR REPLACE FUNCTION find_bad_row(tablename text)
RETURNS tid AS
$BODY$
DECLARE
    result tid;
    curs REFCURSOR;
    row1 RECORD;
    row2 RECORD;
    tabName TEXT;
    count BIGINT := 0;
BEGIN
    SELECT reverse(split_part(reverse($1), '.', 1)) INTO tabName;
    OPEN curs FOR EXECUTE 'SELECT ctid FROM ' || tableName;
    count := 1;
    FETCH curs INTO row1;
    WHILE row1.ctid IS NOT NULL LOOP
        result = row1.ctid;
        count := count + 1;
        FETCH curs INTO row1;
        EXECUTE 'SELECT (each(hstore(' || tabName || '))).* FROM '
            || tableName || ' WHERE ctid = $1' INTO row2
            USING row1.ctid;
        IF count % 100000 = 0 THEN
            RAISE NOTICE 'rows processed: %', count;
        END IF;
    END LOOP;
    CLOSE curs;
    RETURN row1.ctid;
EXCEPTION
    WHEN OTHERS THEN
        RAISE NOTICE 'LAST CTID: %', result;
        RAISE NOTICE '%: %', SQLSTATE, SQLERRM;
        RETURN result;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION find_bad_row(text)
    OWNER TO pentaho;
After calling the function with select find_bad_row('public.data_store'), the result given was (0,2). I searched the table for said ctid, select ctid, * from public.data_store, and deleted the preceding row. I then executed my pg_dump script and received the same error. Re-running the function on the table again returns the first row. Being new to postgres: is my approach altogether wrong, and is there another way to resolve this? Could it be that the entire table is corrupt?
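One detail worth checking: as in the original write-up of this function, result holds the last ctid that could still be read, so the corrupted tuple is the one after it in scan order, not the preceding row. With a result of (0,2), something like this (the exact next ctid is hypothetical) would target the suspect tuple:

-- the tuple following the last good ctid is the suspect one
DELETE FROM public.data_store WHERE ctid = '(0,3)';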

Function or loop using table and topology names as arguments in PostgreSQL

I'm working with topologies in PostGIS, and to create a TopoGeometry column I'm using this loop:
DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM table_uf_11 LOOP
        BEGIN
            UPDATE table_uf_11
            SET tg_geom = toTopoGeom(ST_Force2D(geom), 'topology_uf_11', 1, 1)
            WHERE gid = r.gid;
        EXCEPTION
            WHEN OTHERS THEN
                RAISE WARNING 'Loading of record % failed: %', r.gid, SQLERRM;
        END;
    END LOOP;
END$$;
The reason for using this loop is that the toTopoGeom function raises an error on some rows, but those are just a few cases, for example 38 cases out of 24,000.
Using this structure I can identify which cases are problematic in the log and fix them later.
My problem is that I have another 26 tables with their respective topologies, all of them identified by the state code, for example:
table_uf_12 / topology_uf_12
table_uf_13 / topology_uf_13
table_uf_14 / topology_uf_14
...
table_uf_53 / topology_uf_53
The state codes are not necessarily sequential, but the names follow the same pattern. Column names such as geom and tg_geom are the same for all tables.
How can I make a function or another loop structure to replicate this process across all 27 tables and at the same time save the log for each table?
I tried to make a function, but in this case the arguments would be the table name and the topology name, and I'm having difficulty putting this structure together.
Any suggestions?
I think this should do it:
DO $BODY$
DECLARE
    t regclass;
    gid bigint;
BEGIN
    FOR t IN SELECT oid::regclass FROM pg_class WHERE relname ~ '^table_uf_\d+$' LOOP
        FOR gid IN EXECUTE 'SELECT gid FROM ' || t::text LOOP
            BEGIN
                EXECUTE
                    ' UPDATE ' || t::text ||
                    ' SET tg_geom = toTopoGeom(ST_Force2D(geom), $2, 1, 1)'
                    ' WHERE gid = $1'
                USING gid, replace(t::text, 'table', 'topology');
            EXCEPTION
                WHEN OTHERS THEN
                    RAISE WARNING 'Loading of record % failed: %', gid, SQLERRM;
            END;
        END LOOP;
    END LOOP;
END
$BODY$;

Dynamically generated CURSOR in Postgresql

I have a cursor pointing to a SELECT, but the SELECT is generated dynamically. I want to assign the statement after the declaration.
I have one example that works and another that does NOT work. It is a simple example that just prints some data.
This is the table:
CREATE TABLE public.my_columns (
id serial NOT NULL,
"name" varchar(30) NOT NULL,
/* Keys */
CONSTRAINT my_columns_pkey
PRIMARY KEY (id)
) WITH (
OIDS = FALSE
);
CREATE INDEX my_columns_index01
ON public.my_columns
("name");
INSERT INTO public.my_columns
("name")
VALUES
('name1'),
('name2'),
('name3'),
('name4'),
('name5'),
('name6');
This is the function (I have included both the working code and the non-working code):
CREATE OR REPLACE FUNCTION public.dynamic_table()
RETURNS text AS $$
DECLARE
    v_sql_dynamic varchar;
    --NOT WORKING:
    --db_c CURSOR IS (v_sql_dynamic::varchar);
    --WORKING:
    db_c CURSOR IS (SELECT id, name from public.my_columns);
    db_rec RECORD;
BEGIN
    v_sql_dynamic := 'SELECT id, name from public.my_columns';
    FOR db_rec IN db_c LOOP
        RAISE NOTICE 'NAME: %', db_rec.name;
    END LOOP;
    RETURN 'OK';
EXCEPTION WHEN others THEN
    RETURN 'Error: ' || SQLERRM::text || ' ' || SQLSTATE::text;
END;
$$ LANGUAGE plpgsql;
Any ideas?
Thank you.
Do you really need the explicit cursor? If you need to iterate over dynamic SQL, then you can use FOR ... IN EXECUTE. It is a loop over an implicit (internal) cursor for the dynamic SQL:
FOR db_rec IN EXECUTE v_sql_dynamic
LOOP
    ...
END LOOP;
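Applied to the function from the question, that looks something like this (a sketch using the same table and query):

CREATE OR REPLACE FUNCTION public.dynamic_table()
RETURNS text AS $$
DECLARE
    v_sql_dynamic varchar := 'SELECT id, name FROM public.my_columns';
    db_rec RECORD;
BEGIN
    -- FOR ... IN EXECUTE iterates over an implicit cursor for the dynamic SQL
    FOR db_rec IN EXECUTE v_sql_dynamic LOOP
        RAISE NOTICE 'NAME: %', db_rec.name;
    END LOOP;
    RETURN 'OK';
END;
$$ LANGUAGE plpgsql;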
A little more complex solution is described in the documentation - OPEN FOR EXECUTE:
do $$
declare
    r refcursor;
    rec record;
begin
    open r for execute 'select * from pg_class';
    fetch next from r into rec;
    while found
    loop
        raise notice '%', rec;
        fetch next from r into rec;
    end loop;
    close r;
end $$;
With this kind of cursor, you cannot use FOR ... IN.

count number of rows to be affected before update in trigger

I want to know the number of rows that will be affected by an UPDATE query in a BEFORE per-statement trigger. Is that possible?
The problem is that I want to allow only queries that will update up to 4 rows. If the affected row count is 5 or more, I want to raise an error.
I don't want to do this in application code, because I need this check at the db level.
Is this at all possible?
Thanks in advance for any clues on that
Write a function that updates the rows for you or performs a rollback. Sorry for the poor formatting.
create function update_max(varchar, int)
RETURNS void AS
$BODY$
DECLARE
    sql ALIAS FOR $1;
    max ALIAS FOR $2;
    rcount INT;
BEGIN
    EXECUTE sql;
    GET DIAGNOSTICS rcount = ROW_COUNT;
    IF rcount > max THEN
        --ROLLBACK;
        RAISE EXCEPTION 'Too many rows affected (%).', rcount;
    END IF;
    --COMMIT;
END;
$BODY$ LANGUAGE plpgsql;
Then call it like
select update_max('update t1 set id=id+10 where id < 4', 3);
where the first parameter is your SQL statement and the second your max rows.
Simon had a good idea but his implementation is unnecessarily complicated. This is my proposition:
create or replace function trg_check_max_4()
returns trigger as $$
begin
    perform true from pg_class
    where relname = 'check_max_4' and relnamespace = pg_my_temp_schema();
    if not FOUND then
        create temporary table check_max_4
            (value int check (value <= 4))
            on commit drop;
        insert into check_max_4 values (0);
    end if;
    update check_max_4 set value = value + 1;
    return new;
end; $$ language plpgsql;
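The function still has to be attached to the guarded table as a row-level trigger, e.g. (the table name is hypothetical):

CREATE TRIGGER trg_check_max_4 BEFORE UPDATE ON test
    FOR EACH ROW EXECUTE PROCEDURE trg_check_max_4();

The check constraint on the temporary table then aborts the transaction on the fifth updated row.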
I've created something like this:
begin;

create table test (
    id integer
);
insert into test(id) select generate_series(1,100);

create or replace function trg_check_max_4_updated_records()
returns trigger as $$
declare
    counter_ integer := 0;
    tablename_ text := 'temptable';
begin
    raise notice 'trigger fired';
    -- does the per-transaction counter table exist yet?
    select count(42) into counter_
    from pg_catalog.pg_tables where tablename = tablename_;
    if counter_ = 0 then
        raise notice 'Creating table %', tablename_;
        execute 'create temporary table ' || tablename_ || ' (counter integer) on commit drop';
        execute 'insert into ' || tablename_ || ' (counter) values(1)';
        execute 'select counter from ' || tablename_ into counter_;
        raise notice 'Actual value for counter = [%]', counter_;
    else
        execute 'select counter from ' || tablename_ into counter_;
        execute 'update ' || tablename_ || ' set counter = counter + 1';
        raise notice 'updating';
        execute 'select counter from ' || tablename_ into counter_;
        raise notice 'Actual value for counter = [%]', counter_;
        if counter_ > 4 then
            raise exception 'Cannot change more than 4 rows in one transaction';
        end if;
    end if;
    return new;
end; $$ language plpgsql;

create trigger trg_bu_test before
update on test
for each row
execute procedure trg_check_max_4_updated_records();

update test set id = 10 where id <= 1;
update test set id = 10 where id <= 2;
update test set id = 10 where id <= 3;
update test set id = 10 where id <= 4;
update test set id = 10 where id <= 5;

rollback;
The main idea is to have a 'before update for each row' trigger that creates (if necessary) a temporary table (which is dropped at the end of the transaction). In this table there is just one row with one value: the number of updated rows in the current transaction. For each update the value is incremented; if the value exceeds 4, the transaction is stopped.
But I think this is the wrong solution to your problem. What prevents someone from running the bad query you wrote about twice, so that 8 rows end up changed? And what about deleting rows, or truncating them?
PostgreSQL has two types of triggers: row and statement triggers. Row triggers only work within the context of a row, so you can't use those. Unfortunately, "before" statement triggers don't see what kind of change is about to take place, so I don't believe you can use those either.
Based on that, I would say it's unlikely you'll be able to build that kind of protection into the database using triggers, unless you don't mind using an "after" trigger and rolling back the transaction if the condition isn't satisfied. Wouldn't mind being proved wrong. :)
Have a look at using the Serializable isolation level. I believe this will give you a consistent view of the database data within your transaction. Then you can use option #1 that MusiGenesis mentions below, without the timing vulnerability. Test it, of course, to validate.
I've never worked with postgresql, so my answer may not apply. In SQL Server, your trigger can call a stored procedure which would do one of two things:
Perform a SELECT COUNT(*) to determine the number of records that will be affected by the UPDATE, and then only execute the UPDATE if the count is 4 or less
Perform the UPDATE within a transaction, and only commit the transaction if the returned number of rows affected is 4 or less
No. 1 is timing-vulnerable (the number of records affected by the UPDATE may change between the COUNT(*) check and the actual UPDATE). No. 2 is pretty inefficient if there are many cases where the number of rows updated is greater than 4.
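Combining option #1 with the Serializable suggestion above, a PostgreSQL sketch might look like this (using the test table from the earlier answer; validate the concurrency behavior yourself):

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
DO $$
DECLARE
    n bigint;
BEGIN
    -- inside one serializable transaction the count and the update
    -- see the same snapshot, closing the timing window
    SELECT count(*) INTO n FROM test WHERE id <= 5;
    IF n > 4 THEN
        RAISE EXCEPTION 'would update % rows, limit is 4', n;
    END IF;
    UPDATE test SET id = 10 WHERE id <= 5;
END $$;
COMMIT;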