Within a PL/pgSQL stored procedure, I am attempting to write to a list of hosts (using PL/Proxy, which is unrelated to this problem) based on an active column in table A, where A contains a host column and an active column indicating whether the host is alive and should be written to (for each row in A where active = TRUE, call write_to_host(A.host)). The write_hosts function reads from table A and only chooses the hosts that are marked active = TRUE.
Upon catching an exception while attempting to write to one of the hosts, I want to update table A, setting A.active = FALSE WHERE A.host = <badhost>, and continue attempting to write to the other hosts. My attempt looked something like the following: try the insert, fail, update table A to mark the row as inactive, then recursively call the write_doc function, which should see only the remaining active entries and attempt to write to them:
CREATE OR REPLACE FUNCTION write_doc(args...)
RETURNS SETOF JSON AS $$
BEGIN
    PERFORM * FROM write_to_host(args...);
    RETURN NEXT ARRAY_TO_JSON(ARRAY[TRUE::text, 'inserted']);
    RETURN;
EXCEPTION WHEN OTHERS THEN
    IF SQLERRM ~ '.+Connection refused.+' THEN
        EXECUTE 'UPDATE hosts SET active = FALSE WHERE host = ''' || _bad_host || '''';
    END IF;
    RETURN QUERY SELECT * FROM write_doc(args...);
END;
$$ LANGUAGE plpgsql;
This results in an infinite loop, which I have to Ctrl-C out of.
When I comment out the RETURN QUERY bit, the deactivation works for the host in question. When I leave it in, table A is not updated (at least not before I stop the execution of the procedure).
An alternative approach using a LOOP still loops infinitely for me, and table A is not updated. It looks something like the following, where the RETURN statement should exit the loop:
CREATE OR REPLACE FUNCTION write_doc(args...)
RETURNS SETOF JSON AS $$
BEGIN
    LOOP
        BEGIN
            PERFORM * FROM write_to_host(args...);
            RETURN NEXT ARRAY_TO_JSON(ARRAY[TRUE::text, 'inserted']);
            RETURN;
        EXCEPTION WHEN OTHERS THEN
            IF SQLERRM ~ '.+Connection refused.+' THEN
                EXECUTE 'UPDATE hosts SET active = FALSE WHERE host = ''' || _bad_host || '''';
            END IF;
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
I understand that stored procedures run inside a transaction, but I thought the table would be updated within that transaction and the change would be visible to the code running in the stored procedure. I guess I am wrong?
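For what it's worth, the visibility assumption by itself does hold: an UPDATE executed inside an EXCEPTION handler is not rolled back with the failed block, and later statements in the same function can see its effect before anything commits. A minimal sketch demonstrating this, using a hypothetical scratch table t that is not part of the schema above:
CREATE TABLE t (x int);
INSERT INTO t VALUES (1);

CREATE OR REPLACE FUNCTION handler_visibility_demo() RETURNS int AS $$
DECLARE
    v int;
BEGIN
    BEGIN
        PERFORM 1 / 0;              -- force an exception inside the sub-block
    EXCEPTION WHEN division_by_zero THEN
        UPDATE t SET x = 2;         -- runs in the handler; not rolled back with the sub-block
    END;
    SELECT x INTO v FROM t;         -- already sees x = 2, before any commit
    RETURN v;
END;
$$ LANGUAGE plpgsql;

SELECT handler_visibility_demo();   -- returns 2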
Any suggestions or information to point me in the right direction? If you have questions, don't hesitate to ask.
Going off of this example, I am trying to lock someone's credit card account (row), check whether they have enough credit, and if they do, use it to pay. I need to lock the row to prevent a race condition where they have enough credit, but it is then spent in another transaction while my program is left thinking the credit is still available.
At the terminal level I can accomplish this: I open two psql sessions in two terminals, issue SELECT * FROM credit_card WHERE credit_card_number = 1234 FOR UPDATE; in one and then SELECT * FROM credit_card FOR UPDATE; in the other (or something else like UPDATE credit_card SET credits = credits - 99 WHERE credit_card_number = 1234;), and I can see that the former call blocks the latter. However, when I do the same thing in a function like so
CREATE OR REPLACE FUNCTION foo (p_credit_card_number BIGINT)
RETURNS VOID
AS $$
BEGIN
SELECT * FROM credit_card WHERE credit_card_number = p_credit_card_number FOR UPDATE
...
END
$$
LANGUAGE 'plpgsql';
I get the typical "query has no destination for result data" error (e.g. see here).
Question: How can I lock a specific row, or a number of rows, using SELECT ... FOR UPDATE inside a function while avoiding the above-mentioned error? If that is not possible, or not advisable, how else would I do this inside a function?
The error you are getting has nothing to do with the FOR UPDATE clause; that works exactly the same whether it is in a stored procedure or not. The error has to do with running a SELECT statement inside a procedure (or function or DO block). When you SELECT in a procedure, you must tell the procedure what to do with the selected value(s).
You do that with SELECT <column_list> INTO <variable_list> ...
So assuming you actually need the data then something like the following:
create or replace function foo (p_credit_card_number bigint)
returns void
language 'plpgsql'
as $$
declare
    v_credit_card credit_card%rowtype;
begin
    select *
    into v_credit_card
    from credit_card
    where credit_card_number = p_credit_card_number
    for update;
    ...
    update credit_card
    set ...
    where credit_card_number = v_credit_card.credit_card_number;
end;
$$;
A more common usage, though, is to create a cursor and update using 'where current of':
create or replace function foo (p_credit_card_number bigint)
returns void
language 'plpgsql'
as $$
declare
    c_credit_cards cursor for
        select *
        from credit_card
        where credit_card_number = p_credit_card_number
        for update;
begin
    for v_cc in c_credit_cards
    loop
        ... additional processing ...
        update credit_card
        set ...
        where current of c_credit_cards;
    end loop;
    ...
end;
$$;
But note that when the cursor is opened ALL rows in it are locked.
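Tying this back to the credit-check-and-pay scenario from the question, a sketch along the lines of the first (single-row) form could look like the following. The credits column and the amount parameter are assumptions made for illustration; the row stays locked until the surrounding transaction commits or rolls back:
create or replace function pay (p_credit_card_number bigint, p_amount numeric)
returns boolean
language plpgsql
as $$
declare
    v_credit_card credit_card%rowtype;
begin
    -- lock the row so a concurrent transaction cannot spend the credit in between
    select *
    into v_credit_card
    from credit_card
    where credit_card_number = p_credit_card_number
    for update;

    if not found then
        return false;           -- no such card
    end if;

    if v_credit_card.credits < p_amount then
        return false;           -- not enough credit; nothing is changed
    end if;

    update credit_card
    set credits = credits - p_amount
    where credit_card_number = p_credit_card_number;

    return true;
end;
$$;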
I have a table in Redshift (let's call it a status table) where I set the status of the tables that I want to truncate. I created a Redshift stored procedure to achieve that. Here is my code for the SP:
CREATE OR REPLACE PROCEDURE <schema>.truncate_table()
AS $$
DECLARE
v_tpsl RECORD;
exec_statement VARCHAR(256);
BEGIN
FOR v_tpsl in SELECT * from <schema>.tablename_process_status_log WHERE status = 'TRUE' LOOP
exec_statement := 'TRUNCATE TABLE <schema>.' || quote_ident(v_tpsl.staging_table_name) || '_test;';
RAISE INFO 'statement = %', exec_statement;
EXECUTE exec_statement;
END LOOP;
END;
$$
LANGUAGE plpgsql;
Now when I am CALLING the Stored Procedure, I am getting this error:
SQL Error [500310] [34000]: [Amazon](500310) Invalid operation: cursor does not exist;
I looked at the stored procedure documentation to check whether TRUNCATE is possible, and judging by the examples it looks like it is.
I am not sure what is going wrong in this. I am using RedshiftJDBC42-no-awssdk-1.2.34.1058.jar and connecting via DBeaver.
It looks like I have found the answer. According to this, "Any cursor that is open (explicitly or implicitly) is closed automatically when a COMMIT, ROLLBACK, or TRUNCATE statement is processed." In the next iteration of the loop, it tries to access the cursor, which is already closed.
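A possible workaround, sketched below (not tested against Redshift): since it is the implicit FOR-loop cursor that gets closed, fetch one pending table name at a time with SELECT ... INTO so that no cursor is held open across the TRUNCATE. Flipping the status column to 'DONE' afterwards is an assumption about the status table's design, made here only so the loop terminates:
CREATE OR REPLACE PROCEDURE <schema>.truncate_table()
AS $$
DECLARE
    v_table_name VARCHAR(128);
    exec_statement VARCHAR(256);
BEGIN
    LOOP
        -- fresh lookup on every pass; no cursor survives the commit caused by TRUNCATE
        SELECT staging_table_name INTO v_table_name
        FROM <schema>.tablename_process_status_log
        WHERE status = 'TRUE'
        LIMIT 1;

        EXIT WHEN v_table_name IS NULL;

        exec_statement := 'TRUNCATE TABLE <schema>.' || quote_ident(v_table_name) || '_test;';
        RAISE INFO 'statement = %', exec_statement;
        EXECUTE exec_statement;

        -- mark the row as handled so the next iteration picks a different table
        UPDATE <schema>.tablename_process_status_log
        SET status = 'DONE'
        WHERE staging_table_name = v_table_name;
    END LOOP;
END;
$$
LANGUAGE plpgsql;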
PostgreSQL 10/11.
I need to delete the row instead of updating it if the target cell value is null.
So I created this trigger function:
CREATE OR REPLACE FUNCTION delete_on_update_related_table() RETURNS trigger
AS $$
DECLARE
refColumnName text = TG_ARGV[0];
BEGIN
IF TG_NARGS <> 1 THEN
RAISE EXCEPTION 'Trigger function expects 1 parameters, but got %', TG_NARGS;
END IF;
EXECUTE 'DELETE FROM ' || TG_TABLE_NAME || ' WHERE $1 = ''$2'''
USING refColumnName, OLD.id;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
And a BEFORE UPDATE trigger:
CREATE TRIGGER proper_delete
BEFORE UPDATE OF def_id
ON public.definition_products
FOR EACH ROW
WHEN (NEW.def_id IS NULL)
EXECUTE PROCEDURE delete_on_update_related_table('def_id');
Table is simple:
id uuid primary key
def_id uuid not null
Test:
UPDATE definition_products SET
def_id = NULL
WHERE id = 'f47415e8-6b00-4c65-aeb8-cadc15ca5890';
-- rows affected 0
Documentation says:
Row-level triggers fired BEFORE can return null to signal the trigger
manager to skip the rest of the operation for this row (i.e.,
subsequent triggers are not fired, and the INSERT/UPDATE/DELETE does
not occur for this row).
Previously, I used a RULE instead of the trigger, but there is no way to use a WHERE clause and a RETURNING clause in the same rule:
You need an unconditional ON UPDATE DO INSTEAD rule with a RETURNING clause
So, is there a way?
While Jeremy's answer is good, there is still room for improvement.
Problems
You need to be very accurate in the definition of the objective. Your statement:
I need to delete the row instead of updating it if the target cell value is null.
... does not imply that the column was changed to NULL in the UPDATE at hand. Might have been NULL before, like, before you implemented the trigger. So not:
BEFORE UPDATE OF def_id ON public.definition_products
But just:
BEFORE UPDATE ON public.definition_products
Of course, if the column is defined NOT NULL (as it probably should be), there is no effective difference - except for the noise and an additional point of failure. The manual:
A column-specific trigger (one defined using the UPDATE OF column_name syntax) will fire when any of its columns are listed as targets in the UPDATE command's SET list. It is possible for a column's value to change even when the trigger is not fired, because changes made to the row's contents by BEFORE UPDATE triggers are not considered.
Also, nothing in your question indicates the need for dynamic SQL. (That would be the case if you wanted to reuse the same trigger function for multiple triggers on different tables. And even then it's often better to just create several distinct trigger functions for multiple reason: simpler, faster, less error-prone, easier to read & maintain, ...)
As for "error-prone": your original dynamic statement was just invalid:
EXECUTE 'DELETE FROM ' || TG_TABLE_NAME || ' WHERE $1 = ''$2'''
USING refColumnName, OLD.id;
Can't pass a column name as value (refColumnName).
Can't put single quotes around $2, which is passed as value and hence needs no quoting.
An unqualified, unquoted TG_TABLE_NAME can go terribly wrong, which is especially critical for a heavy-weight function that deletes rows.
Jeremy's version fixes most, but still features the unqualified TG_TABLE_NAME.
This would be good:
EXECUTE format('DELETE FROM %s WHERE %I = $1', TG_RELID::regclass, refColumnName) -- refColumnName still unquoted
USING OLD.id;
Or:
EXECUTE format('DELETE FROM %I.%I WHERE %I = $1', TG_TABLE_SCHEMA, TG_TABLE_NAME, refColumnName)
USING OLD.id;
Related:
Why does a PostgreSQL SELECT query return different results when a schema name is specified?
Table name as a PostgreSQL function parameter
Solution
Simpler trigger function:
CREATE OR REPLACE FUNCTION delete_on_update_related_table()
RETURNS trigger AS
$func$
BEGIN
DELETE FROM public.definition_products WHERE id = OLD.id; -- def_id?
RETURN NULL;
END
$func$ LANGUAGE plpgsql;
Simpler trigger:
CREATE TRIGGER proper_delete
BEFORE UPDATE ON public.definition_products
FOR EACH ROW
WHEN (NEW.def_id IS NULL) -- that's the defining condition!
EXECUTE PROCEDURE delete_on_update_related_table(); -- no parameter
You probably want to use OLD.id, not OLD.def_id. (The row to delete is best defined by its PK, not by the column changed to NULL.) But that's not entirely clear.
This works for me, with a few small changes:
CREATE OR REPLACE FUNCTION delete_on_update_related_table() RETURNS trigger
AS $$
DECLARE
refColumnName text = quote_ident(TG_ARGV[0]);
BEGIN
IF TG_NARGS <> 1 THEN
    RAISE EXCEPTION 'Trigger function expects 1 parameters, but got %', TG_NARGS;
END IF;
EXECUTE format('DELETE FROM %s WHERE %s = %s', quote_ident(TG_TABLE_NAME), refColumnName, quote_literal(OLD.id));
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
-- create trigger
CREATE TRIGGER proper_delete
BEFORE UPDATE OF def_id
ON public.definition_products
FOR EACH ROW
WHEN (NEW.def_id IS NULL)
EXECUTE PROCEDURE delete_on_update_related_table('id'); --Note id, not def_id
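A quick sanity check for either version of the trigger might look like this (the def_id value below is made up; the UPDATE should report 0 rows affected because the BEFORE trigger returns NULL, yet the row should be gone afterwards):
INSERT INTO public.definition_products (id, def_id)
VALUES ('f47415e8-6b00-4c65-aeb8-cadc15ca5890',
        '00000000-0000-0000-0000-000000000001');

UPDATE public.definition_products
SET def_id = NULL
WHERE id = 'f47415e8-6b00-4c65-aeb8-cadc15ca5890';
-- rows affected 0, but:

SELECT * FROM public.definition_products
WHERE id = 'f47415e8-6b00-4c65-aeb8-cadc15ca5890';
-- 0 rows: the trigger deleted the row instead of updating it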
This is a SQL learning exercise. I have a table 'tbl' with a single integer column, sequence. There are no indexes on this table. I have this function:
CREATE OR REPLACE FUNCTION rcs()
RETURNS VOID AS $$
DECLARE
c CURSOR FOR SELECT * FROM tbl ORDER BY sequence;
s INTEGER;
r RECORD;
BEGIN
s := 0;
OPEN c;
LOOP
FETCH c INTO r;
EXIT WHEN NOT FOUND;
RAISE NOTICE 'got it';
r.sequence = s;
s := s + 1;
RAISE NOTICE '%', r.sequence;
END LOOP;
CLOSE c;
END;
$$ language 'plpgsql';
This loads and runs cleanly and the RAISE statements suggest that the 'sequence' field gets updated to 0, 1 etc. in accordance with the ORDER BY.
However, when I SELECT from the table afterwards, the pre-existing values (which happen to all be 6) have not changed.
Is this something to do with transactions? I tried fiddling around with COMMIT, etc. to no avail.
This is a freshly installed Postgresql 9.4.4 running on a Linode with no hackery of config files or anything like that, from the 'psql' command line.
EDIT: maybe it's because 'r' isn't actually the DB table but some kind of temporary copy of it? If so, please clarify. Hopefully what I'm trying to achieve here is obvious (I know it may be a bit nonsensical, but surely it's possible without resorting to reading the set into Java, etc.).
The actual problem: your function does not contain an UPDATE statement so nothing gets written to disk. r.sequence = s; simply assigns a new value to a variable that is held in memory.
To fix this, you need something like:
UPDATE tbl
set sequence = s -- change the actual column in the table
WHERE current of c; -- for the current row of your cursor
If where current of doesn't work, you need to switch that to a "regular" where clause:
UPDATE tbl
set sequence = s
WHERE tbl.pk_column = r.pk_column; -- the current primary key value
But a much more efficient solution is to do this in a single statement:
update tbl
set sequence = tu.new_sequence
from (
select t.pk_column,
row_number() over (order by t.sequence) as new_sequence
from tbl t
) tu
where tbl.pk_column = tu.pk_column;
You need to replace the column name pk_column with the real primary key (or unique) column of your table.
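Put back into the original function, the cursor-based fix could look like the sketch below. FOR UPDATE is added to the cursor so that WHERE CURRENT OF can identify the row just fetched; this is untested but keeps the structure of the original rcs():
CREATE OR REPLACE FUNCTION rcs()
RETURNS VOID AS $$
DECLARE
    c CURSOR FOR SELECT * FROM tbl ORDER BY sequence FOR UPDATE;
    s INTEGER;
    r RECORD;
BEGIN
    s := 0;
    OPEN c;
    LOOP
        FETCH c INTO r;
        EXIT WHEN NOT FOUND;
        UPDATE tbl
        SET sequence = s        -- actually writes the new value back to the table
        WHERE CURRENT OF c;
        s := s + 1;
    END LOOP;
    CLOSE c;
END;
$$ LANGUAGE plpgsql;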
I have created a "merge" function which is supposed to execute either an UPDATE or an INSERT query, depending on existing data. Instead of writing an upsert-wrapper for each table (as in most of the available examples), this function takes entire SQL strings. Both of the SQL strings are automatically generated by our application.
The plan is to call the function like this:
-- hypothetical "settings" table, with a primary key of (user_id, setting):
SELECT merge(
$$UPDATE settings SET value = 'x' WHERE user_id = 42 AND setting = 'foo'$$,
$$INSERT INTO settings (user_id, setting, value) VALUES (42, 'foo', 'x')$$
);
Here's the full code of the merge() function:
CREATE OR REPLACE FUNCTION merge (update_sql TEXT, insert_sql TEXT) RETURNS TEXT AS
$func$
DECLARE
max_iterations INTEGER := 10;
i INTEGER := 0;
num_updated INTEGER;
BEGIN
-- usually returns before re-entering the loop
LOOP
-- first try the update
EXECUTE update_sql;
GET DIAGNOSTICS num_updated = ROW_COUNT;
IF num_updated > 0 THEN
RETURN 'UPDATE';
END IF;
-- nothing was updated: try the insert, watching out for concurrent inserts
BEGIN
EXECUTE insert_sql;
RETURN 'INSERT';
EXCEPTION WHEN unique_violation THEN
-- nop; just loop and try again from the top
END;
-- emergency brake
i := i + 1;
IF i >= max_iterations THEN
RAISE EXCEPTION 'merge(): tried looping % times, giving up now.', i;
EXIT;
END IF;
END LOOP;
END;
$func$
LANGUAGE plpgsql;
It appears to work well enough in my tests, but I'm not certain that I haven't missed anything crucial, especially regarding concurrent UPDATE/INSERT/DELETE queries, which may be issued without using this function. Did I overlook anything important?
Among the resources I consulted for this function are:
UPDATE/INSERT example 40.2 in the PostgreSQL manual
Why is UPSERT so complicated?
SO: Insert, on duplicate update (postgresql)
(Edit: one of the goals was to avoid locking the target table.)
The answer to your question depends on the context of how your application(s) will access the database. There are many ways to solve this, as nicely discussed in depesz's post that you cited yourself. In addition, you might want to consider using writable CTEs (see here). The question Insert, on duplicate update in PostgreSQL? also has some interesting discussion for your decision-making process.
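For illustration, the writable-CTE approach mentioned above might look like this for the hypothetical settings table from the question (the same concurrency caveats as the looping function still apply, and on PostgreSQL 9.5 or later INSERT ... ON CONFLICT would be the more idiomatic route):
WITH upsert AS (
    UPDATE settings
    SET value = 'x'
    WHERE user_id = 42 AND setting = 'foo'
    RETURNING *
)
INSERT INTO settings (user_id, setting, value)
SELECT 42, 'foo', 'x'
WHERE NOT EXISTS (SELECT 1 FROM upsert);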