How to kill a busy connection in pgsql? - postgresql

CREATE OR REPLACE FUNCTION add_() RETURNS void AS
$BODY$
DECLARE
  foo int;
BEGIN
  FOR i IN 1..50 LOOP
    foo := i;
    RAISE NOTICE 'combination_array(%)', foo;
    UPDATE table_1
    SET r_id = foo
    WHERE id = (SELECT id FROM table_1 WHERE r_id IS NULL ORDER BY id LIMIT 1);
  END LOOP;
END;
$BODY$ LANGUAGE plpgsql;
SELECT add_();
After this execution, whenever I run the following UPDATE, the session just stays busy:
UPDATE table_1
SET r_id = foo
WHERE id = (SELECT id FROM table_1 WHERE r_id IS NULL ORDER BY id LIMIT 1);
Can anyone tell me how to clear the stuck connection in pgsql?

This returns all connections with their ids:
SELECT * FROM pg_stat_activity;
You can kill a specific connection by its procpid through the following query:
SELECT pg_terminate_backend(procpid)
FROM pg_stat_activity
WHERE datname = 'database-name';
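Note that pg_stat_activity's procpid column was renamed to pid in PostgreSQL 9.2. Also, as written the query above terminates every session connected to that database, including your own; a slightly safer variant for 9.2+ might be:
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE datname = 'database-name'
  AND pid <> pg_backend_pid();  -- skip the session running this query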

You can send a signal to this busy connected backend process.
To cancel a running query, send the SIGINT signal to the process running that command. To terminate a backend process cleanly, send SIGTERM to that process.
For example:
pg93#db-172-16-3-150-> psql
psql (9.3.3)
Type "help" for help.
digoal=# select pg_sleep(1000000);
-- find the pid
-- ps -ewf|grep postgres
pg93 24872 23190 0 00:11 ? 00:00:00 postgres: postgres digoal [local] SELECT
-- send signal
pg93#db-172-16-3-150-> kill -s SIGINT 24872
-- then we see that feedback from psql.
ERROR: canceling statement due to user request
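The same signals can be sent from SQL, without shell access to the server: pg_cancel_backend() sends SIGINT and pg_terminate_backend() sends SIGTERM. Using the pid found above:
-- cancel only the running query (SIGINT)
SELECT pg_cancel_backend(24872);
-- terminate the whole backend cleanly (SIGTERM)
SELECT pg_terminate_backend(24872);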

Related

Is it worth Parallel/Concurrent INSERT INTO... (SELECT...) to the same Table in Postgres?

I was attempting an INSERT INTO... (SELECT...) (inserting a batch of rows from a SELECT... subquery) onto the same table in my database. For the most part it was working; however, I did see a "Deadlock" exception logged every now and then. Does it make sense to do this, or is there a way to avoid a deadlock scenario? On a high level, my queries both resemble this structure:
CREATE OR REPLACE PROCEDURE myConcurrentProc() LANGUAGE plpgsql
AS $procedure$
DECLARE
row_count bigint := -1;
BEGIN
LOOP
EXIT WHEN row_count = 0;
WITH cte AS (SELECT *
FROM TableA tbla
WHERE EXISTS (SELECT 1 FROM TableB tblb WHERE tblb.id = tbla.id))
INSERT INTO concurrent_table SELECT id FROM cte;
GET DIAGNOSTICS row_count = ROW_COUNT;
COMMIT;
UPDATE log_tbl
SET status = 'FINISHED'
WHERE job_name = 'tblA_and_B_job';
END LOOP;
END
$procedure$;
And the other script, which runs in parallel and also INSERTs into the same table, is basically:
CREATE OR REPLACE PROCEDURE myConcurrentProc2() LANGUAGE plpgsql
AS $procedure$
DECLARE
row_count bigint := -1;
BEGIN
LOOP
EXIT WHEN row_count = 0;
WITH cte AS (SELECT *
FROM TableC c
WHERE EXISTS (SELECT 1 FROM TableD d WHERE d.id = c.id))
INSERT INTO concurrent_table SELECT id FROM cte;
GET DIAGNOSTICS row_count = ROW_COUNT;
COMMIT;
UPDATE log_tbl
SET status = 'FINISHED'
WHERE job_name = 'tbl_C_and_D_job';
END LOOP;
END
$procedure$;
So you can see I'm querying two different tables in each script, but inserting into the same concurrent_table. I also have the UPDATE... statement that writes to a log table, so I suppose that could also cause issues. Is there any way to use BEGIN... END here and COMMIT to avoid deadlock/concurrency issues, or should I just create a second table to hold the "tbl_C_and_D_job" data?
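One hedged way to avoid this kind of deadlock, assuming the two jobs may take the conflicting step one at a time, is to have both procedures grab the same transaction-level advisory lock right before the INSERT (the key 42 below is an arbitrary placeholder), so their row and index locks can never interleave:
-- Inside each procedure's loop, just before the INSERT:
PERFORM pg_advisory_xact_lock(42);  -- blocks until the other job's transaction
                                    -- ends; released automatically at COMMIT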

PostgreSQL read commit on different transactions

I was running some tests to better understand read committed in PostgreSQL.
I have two transactions running in parallel:
-- transaction 1
begin;
select id from item order by id asc FETCH FIRST 500 ROWS ONLY;
select pg_sleep(10);
commit;
--transaction 2
begin;
select id from item order by id asc FETCH FIRST 500 ROWS ONLY;
commit;
The first transaction selects the first 500 ids and then holds them by sleeping for 10 seconds.
The second transaction queries for the first 500 rows in the meantime.
Based on my understanding of read committed, the first transaction will select records 1 to 500 and the second transaction will select records 501 to 1000.
But the actual result is that both transactions select records 1 to 500.
I would really appreciate it if someone could point out which part is wrong. Thanks
You are misinterpreting the meaning of read committed. It means that a transaction cannot see (select) changes that are not yet committed; it does not reserve the rows a SELECT has read for that transaction, so both of your transactions see the same first 500 rows. Try the following:
create table read_create_test( id integer generated always as identity
, cola text
) ;
insert into read_create_test(cola)
select 'item-' || to_char(n,'fm000')
from generate_series(1,50) gs(n);
-- transaction 1
do $$
declare
max_val integer;
begin
insert into read_create_test(cola)
select 'item2-' || to_char(n+100,'fm000')
from generate_series(1,50) gs(n);
select max(id)
into max_val
from read_create_test;
raise notice 'Transaction 1 Max id: %',max_val;
perform pg_sleep(30); -- make sure 2nd transaction has time to start
commit;
end;
$$;
-- transaction 2 (run after transaction 1 begins but before it ends)
do $$
declare
max_val integer;
begin
select max(id)
into max_val
from read_create_test;
raise notice 'Transaction 2 Max id: %',max_val;
end;
$$;
-- transaction 3 (run after transaction 1 ends)
do $$
declare
max_val integer;
begin
select max(id)
into max_val
from read_create_test;
raise notice 'Transaction 3 Max id: %',max_val;
end;
$$;
Analyze the results keeping in mind that a transaction cannot see uncommitted DML.
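For reference, with these scripts and default identity settings (ids 1-50 from the setup insert, 51-100 from transaction 1), the notices you should expect are roughly:
Transaction 1 Max id: 100  -- a transaction does see its own uncommitted inserts
Transaction 2 Max id: 50   -- transaction 1's rows are not yet committed, hence invisible
Transaction 3 Max id: 100  -- after the commit, everything is visible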

Raise error without rollback in plpgsql/postgresql

I have two stored functions: delete_item, which deletes one item and logs its success or failure in an actionlog table; it returns 1 on error and 0 on success.
Secondly, I have a function remove_expired that finds what to delete and loops through it, calling delete_item.
All this is intended to be called from a simple bash script (a hard requirement from operations, so calling it like this is not up for discussion), and it has to return an error code when things don't work, for their reporting tools.
We want all the deletions possible to succeed (we don't expect errors, but humans are still humans, and errors do happen), so if we want to delete 10 items and 1 fails, we still want the other 9 to be deleted.
Secondly, we would really like the log entries to be in the actionlog table in both the success and the error case, i.e., we want that log to be complete.
Since plpgsql functions don't allow manual transaction management, that seems not to be an option (unless I missed a way to circumvent this?).
The only way I've found so far to achieve this is to wrap scripts around it outside plpgsql, but we would very much like this to be possible in pure plpgsql so we can just give operations a psql -c ... command and they shouldn't have to be concerned with anything else.
SQL to reproduce the problem:
DROP FUNCTION IF EXISTS remove_expired(timestamp with time zone);
DROP FUNCTION IF EXISTS delete_item(integer);
DROP TABLE IF EXISTS actionlog;
DROP TABLE IF EXISTS evil;
DROP TABLE IF EXISTS test;
CREATE TABLE test (
id serial primary key not null,
t timestamp with time zone not null
);
CREATE TABLE evil (
test_id integer not null references test(id)
);
CREATE TABLE actionlog (
eventTime timestamp with time zone not null default now(),
message text not null
);
INSERT INTO test (t)
VALUES ('2020-04-01T10:00:00+0200'),
('2020-04-01T10:15:00+0200'), -- Will not be deletable due to foreign key
('2020-04-01T10:30:00+0200')
;
INSERT INTO evil (test_id) SELECT id FROM test WHERE id = 2;
CREATE OR REPLACE FUNCTION remove_expired(timestamp with time zone)
RETURNS void
AS
$$
DECLARE
test_id int;
failure_count int = 0;
BEGIN
FOR test_id IN
SELECT id FROM test WHERE t < $1
LOOP
failure_count := delete_item(test_id) + failure_count;
END LOOP;
IF failure_count > 0 THEN
-- I want this to cause 'psql ... -c "SELECT * FROM remove_expired..."' to exit with exit code != 0
RAISE 'There was one or more errors deleting. See the log for details';
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION delete_item(integer)
RETURNS integer
AS
$$
BEGIN
DELETE FROM test WHERE id = $1;
INSERT INTO actionlog (message)
VALUES ('Deleted with ID: ' || $1);
RETURN 0;
EXCEPTION WHEN OTHERS THEN
INSERT INTO actionlog (message)
VALUES ('Error deleting ID: ' || $1 || '. The error was: ' || SQLERRM);
RETURN 1;
END
$$ LANGUAGE plpgsql;
Thanks in advance for any useful input
You can have something close to what you expect in PostgreSQL 11 or PostgreSQL 12, but only with procedures, because, as already said, functions always roll everything back in case of an error.
With:
DROP PROCEDURE IF EXISTS remove_expired(timestamp with time zone);
DROP PROCEDURE IF EXISTS delete_item(integer);
DROP FUNCTION IF EXISTS f_removed_expired;
DROP SEQUENCE IF EXISTS failure_count_seq;
DROP TABLE IF EXISTS actionlog;
DROP TABLE IF EXISTS evil;
DROP TABLE IF EXISTS test;
CREATE TABLE test (
id serial primary key not null,
t timestamp with time zone not null
);
CREATE TABLE evil (
test_id integer not null references test(id)
);
CREATE TABLE actionlog (
eventTime timestamp with time zone not null default now(),
message text not null
);
INSERT INTO test (t)
VALUES ('2020-04-01T10:00:00+0200'),
('2020-04-01T10:15:00+0200'), -- Will not be removed due to foreign key
('2020-04-01T10:30:00+0200')
;
select * from test where t < current_timestamp;
INSERT INTO evil (test_id) SELECT id FROM test WHERE id = 2;
CREATE SEQUENCE failure_count_seq MINVALUE 0;
SELECT SETVAL('failure_count_seq', 0, true);
CREATE OR REPLACE PROCEDURE remove_expired(timestamp with time zone)
AS
$$
DECLARE
test_id int;
failure_count int = 0;
BEGIN
FOR test_id IN
SELECT id FROM test WHERE t < $1
LOOP
call delete_item(test_id);
COMMIT;
END LOOP;
SELECT currval('failure_count_seq') INTO failure_count;
IF failure_count > 0 THEN
-- I want this to cause 'psql ... -c "SELECT * FROM remove_expired..."' to exit with exit code != 0
RAISE 'There was one or more errors deleting. See the log for details';
END IF;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE PROCEDURE delete_item(in integer)
AS
$$
DECLARE
forget_value int;
BEGIN
DELETE FROM test WHERE id = $1;
INSERT INTO actionlog (message)
VALUES ('Deleted with ID: ' || $1);
EXCEPTION WHEN OTHERS THEN
INSERT INTO actionlog (message)
VALUES ('Error deleting ID: ' || $1 || '. The error was: ' || SQLERRM);
-- no COMMIT here: a transaction cannot be ended inside a block with an
-- exception handler; the caller commits after each CALL
SELECT NEXTVAL('failure_count_seq') INTO forget_value;
END
$$ LANGUAGE plpgsql;
I get:
select * from test;
id | t
----+------------------------
1 | 2020-04-01 10:00:00+02
2 | 2020-04-01 10:15:00+02
3 | 2020-04-01 10:30:00+02
(3 rows)
select current_timestamp;
current_timestamp
-------------------------------
2020-04-01 16:52:26.171975+02
(1 row)
call remove_expired(current_timestamp);
psql:test.sql:80: ERROR: There was one or more errors deleting. See the log for details
CONTEXT: PL/pgSQL function remove_expired(timestamp with time zone) line 17 at RAISE
select currval('failure_count_seq');
currval
---------
1
(1 row)
select * from test;
id | t
----+------------------------
2 | 2020-04-01 10:15:00+02
(1 row)
select * from actionlog;
eventtime | message
-------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------
2020-04-01 16:52:26.172173+02 | Deleted with ID: 1
2020-04-01 16:52:26.179794+02 | Error deleting ID: 2. The error was: update or delete on table "test" violates foreign key constraint "evil_test_id_fkey" on table "evil"
2020-04-01 16:52:26.196503+02 | Deleted with ID: 3
(3 rows)
I use a sequence to record the number of failures: you can use this sequence to test for failures and return the right return code.
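On the operations side, note that psql itself exits non-zero when the CALL fails, so the bash wrapper can stay trivial; a sketch, with placeholder connection settings:
#!/bin/bash
# psql returns a non-zero exit status when the command fails,
# which is exactly when remove_expired() raises.
psql -X -d mydb -c "CALL remove_expired(now());"
exit $?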

Asynchronous data load by parallel sessions

Looking for help with a data load function designed to support asynchronous execution by parallel sessions.
Process_Log table contains the list of data load functions, with current status and a list of upstream dependencies.
Each session first looks for a function that is ready for execution, calls it, and updates status.
For further details please see comments in the code.
In Oracle PL/SQL I would design it as a nested block within the loop, with an autonomous transaction for the status updates.
Not sure how to achieve that in Postgres. Running 9.2.
CREATE OR REPLACE FUNCTION dm_operations.dm_load()
RETURNS void AS
$BODY$
declare
_run_cnt integer;
_ready_cnt integer;
_process_id dm_operations.process_log.process_id%type;
_exec_name dm_operations.process_log.exec_name%type;
_rowcnt dm_operations.process_log.rows_affected%type;
_error text;
_error_text text;
_error_detail text;
_error_hint text;
_error_context text;
begin
loop
begin -- nested block so the EXCEPTION section below can catch errors per iteration
--(1) Find one function ready to run
select sum(case when process_status = 'RUNNING' then 1 else 0 end) run_cnt,
sum(case when process_status = 'READY' then 1 else 0 end) ready_cnt,
min(case when process_status = 'READY' then process_id end) process_id
into _run_cnt, _ready_cnt, _process_id
from dm_operations.process_log; --One row per each executable data load function
--(2) Exit loop if nothing is ready
if _ready_cnt = 0 then exit;
else
--(3) Lock the row until the status is updated
select exec_name
into _exec_name
from dm_operations.process_log
where process_id = _process_id
for update;
--(4) Set status of the function to 'RUNNING'
--New status must be visible to other sessions
update dm_operations.process_log
set process_status = 'RUNNING',
start_ts = now()
where process_id = _process_id;
--(5) Release lock. (How?)
--(6) Execute data load function. See example below.
-- Note: EXECUTE runs a SQL command, so SELECT is needed here;
-- PERFORM exists only inside PL/pgSQL and is not valid in dynamic SQL.
execute 'select dm_operations.'||_exec_name;
--(7) Get number of rows processed by the data load function
GET DIAGNOSTICS _rowcnt := ROW_COUNT;
--(8) Upon successful function execution set status to 'SUCCESS'
update dm_operations.process_log
set process_status = 'SUCCESS',
end_ts = now(),
rows_affected = _rowcnt
where process_id = _process_id;
--(9) Check dependencies and update status
--These changes must be visible to the next loop iteration, and to other sessions
update dm_operations.process_log pl1
set process_status = 'READY'
where process_status is null
and not exists (select null from dm_operations.process_log pl2
where pl2.process_id in (select unnest(pl1.depends_on))
and (coalesce(pl2.process_status,'NULL') <> 'SUCCESS'));
end if;
--(10) Log error and allow the loop to continue
EXCEPTION
when others then
GET STACKED DIAGNOSTICS _error_text = MESSAGE_TEXT,
_error_detail = PG_EXCEPTION_DETAIL,
_error_hint = PG_EXCEPTION_HINT,
_error_context = PG_EXCEPTION_CONTEXT;
_error := _error_text||
_error_detail||
_error_hint||
_error_context;
update dm_operations.process_log
set process_status = 'ERROR',
start_ts = now(),
rows_affected = _rowcnt,
error_text = _error
where process_id = _process_id;
end;
end loop;
end;
$BODY$
LANGUAGE plpgsql;
Data load function example (6):
CREATE OR REPLACE FUNCTION load_target()
RETURNS void AS
$BODY$
begin
execute 'truncate table target_table';
insert into target_table
select ...
from source_table;
end;
$BODY$
LANGUAGE plpgsql;
You cannot start asynchronous operations in PL/pgSQL.
There are two options I can think of:
The hard way: Upgrade to a more recent PostgreSQL version and write a background worker in C that executes load_target. You'd have to use the background worker API for that.
Don't write your function in the database, but on the client side. Then you can simply open several database sessions and run functions in parallel that way.
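A minimal sketch of the second option, assuming each session simply calls dm_load() until the process_log queue is drained (the connection settings are placeholders):
#!/bin/bash
# Launch four parallel sessions; each keeps claiming READY entries
# from process_log until none are left.
for i in 1 2 3 4; do
    psql -X -d mydb -c "SELECT dm_operations.dm_load();" &
done
wait  # block until all four sessions have finished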

Postgres pg_dump

My automated pg_dump process has been failing when attempting to back up a Postgres database. The error message I'm receiving:
ERROR Message:
pg_dump: Error message from server: ERROR: invalid memory alloc request size 1249770967
pg_dump: The command was: COPY public.data_store (id, length, last_modified, data) TO stdout;
Custom backup of jackrabbit
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: invalid memory alloc request size 1249770967
pg_dump: The command was: COPY public.data_store (id, length, last_modified, data) TO stdout;
[!!ERROR!!] Failed to produce custom backup database jackrabbit
Plain backup of pdi_logging
Custom backup of pdi_logging
Plain backup of postgres
Custom backup of postgres
Plain backup of quartz
Custom backup of quartz
Based on my findings, everything seemed to point to corrupt data in the table, so I created a function to scan the table and extract the ctid, so that I could find the culprit and delete the corrupt row.
Function:
CREATE OR REPLACE FUNCTION find_bad_row(tablename text)
RETURNS tid AS
$BODY$
DECLARE
result tid;
curs REFCURSOR;
row1 RECORD;
row2 RECORD;
tabName TEXT;
count BIGINT := 0;
BEGIN
-- strip any schema qualifier; the bare name is needed as the row alias below
SELECT reverse(split_part(reverse($1), '.', 1)) INTO tabName;
OPEN curs FOR EXECUTE 'SELECT ctid FROM ' || tableName;
count := 1;
FETCH curs INTO row1;
WHILE row1.ctid IS NOT NULL LOOP
result = row1.ctid;
count := count + 1;
FETCH curs INTO row1;
-- expanding the row with hstore() forces every column to be detoasted;
-- a corrupt row makes this EXECUTE raise, and the handler below reports
-- the last ctid that was read successfully
EXECUTE 'SELECT (each(hstore(' || tabName || '))).* FROM '
|| tableName || ' WHERE ctid = $1' INTO row2
USING row1.ctid;
IF count % 100000 = 0 THEN
RAISE NOTICE 'rows processed: %', count;
END IF;
END LOOP;
CLOSE curs;
RETURN row1.ctid;
EXCEPTION
WHEN OTHERS THEN
RAISE NOTICE 'LAST CTID: %', result;
RAISE NOTICE '%: %', SQLSTATE, SQLERRM;
RETURN result;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION find_bad_row(text)
OWNER TO pentaho;
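Note that hstore() in the function above comes from the hstore extension, so this assumes it has been installed:
CREATE EXTENSION IF NOT EXISTS hstore;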
After calling the function with select find_bad_row('public.data_store'), the result given was (0,2). I searched the table for that ctid, select ctid, * from public.data_store, and deleted the preceding row. I then executed my pg_dump script and received the same error. Re-running the function on the table again returns the first row. Being new to Postgres: is my approach altogether wrong, and is there another way to resolve this? Could it be that the entire table is corrupt?