I would love to be able to validate objects representing table rows using the database's existing constraints (triggers that raise exceptions and checks) without actually inserting them into the database.
Is there currently a way to do this in Postgres? At least with BEFORE INSERT triggers and CHECK constraints; I assume it makes no sense with AFTER INSERT triggers.
The easiest way I can think of right now would be to:
Lock the table
Insert a new row
If exception raise to the API / else DELETE the row and call it valid
Unlock
But I can see several issues with this.
A simpler way is to insert within a transaction and not commit:
BEGIN;
INSERT INTO tbl(...) VALUES (...);
-- see effects ...
ROLLBACK;
No need for additional locking. The row is never visible to any other transaction with the default transaction isolation level READ COMMITTED. (You might be stalling concurrent writes that conflict with the tested row.)
Notable side-effect: Sequences of serial or IDENTITY columns are advanced even if the INSERT is never committed. But gaps in sequential numbers are to be expected anyway and nothing to worry about.
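For illustration, a quick demo (assuming a hypothetical table tbl with a serial column id and a text column val):
BEGIN;
INSERT INTO tbl(val) VALUES ('test');  -- consumes the next serial number
ROLLBACK;
INSERT INTO tbl(val) VALUES ('real');  -- gets a later number: a gap remains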
Be wary of triggers with side-effects. All "transactional" SQL effects are rolled back, even most DDL commands. But some special operations (like advancing sequences) are never rolled back.
Also, DEFERRED constraints do not kick in. The manual:
DEFERRED constraints are not checked until transaction commit.
If you need this a lot, work with a copy of your table, or even your database.
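For the table copy, something like this could serve as a starting point (a sketch; note that LIKE ... INCLUDING ALL copies defaults, CHECK constraints and indexes, but not triggers, which you would have to recreate on the copy):
CREATE TABLE tbl_test (LIKE tbl INCLUDING ALL);
-- recreate any triggers on tbl_test, then run test inserts against it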
Strictly speaking, while any trigger / constraint / concurrent event is allowed, there is no other way to "validate objects" than to insert them into the actual target table in the actual target database at the actual point in time. Triggers, constraints, even default values, can interact with the current state of the whole DB. The more possibilities are ruled out and requirements are reduced, the more options we might have to emulate the test.
CREATE FUNCTION validate_function()
RETURNS trigger LANGUAGE plpgsql
AS $function$
DECLARE
   valid_flag boolean := true;
BEGIN
   -- validation code goes here; set valid_flag to false on failure
   IF NOT valid_flag THEN
      RAISE EXCEPTION 'This record is not valid, id %', NEW.id  -- assumes the table has an id column
      USING HINT = 'Please enter a valid record';
      RETURN NULL;  -- never reached: RAISE EXCEPTION aborts the function
   ELSE
      RETURN NEW;
   END IF;
END;
$function$;
CREATE TRIGGER validate_rec BEFORE INSERT OR UPDATE ON some_tbl
FOR EACH ROW EXECUTE FUNCTION validate_function();
With this function and trigger you validate inside the trigger. If the new record fails validation, you set valid_flag to false and use that to raise an exception. The RETURN NULL; is redundant: it is never reached, because RAISE EXCEPTION aborts the function and with it the insert or update. If the record is valid, you RETURN NEW and the insert/update completes.
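A hypothetical invocation (assuming some_tbl has an id column and the validation fails):
INSERT INTO some_tbl (id) VALUES (1);
-- ERROR:  This record is not valid, id 1
-- HINT:  Please enter a valid record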
I am trying to understand which type of lock to use for a trigger function.
Simplified function:
CREATE OR REPLACE FUNCTION max_count() RETURNS TRIGGER AS
$$
DECLARE
max_row INTEGER := 6;
association_count INTEGER := 0;
BEGIN
LOCK TABLE my_table IN ROW EXCLUSIVE MODE;
SELECT INTO association_count COUNT(*) FROM my_table WHERE user_id = NEW.user_id;
IF association_count > max_row THEN
RAISE EXCEPTION 'Too many rows';
END IF;
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE CONSTRAINT TRIGGER my_max_count
AFTER INSERT OR UPDATE ON my_table
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE max_count();
I initially was planning to use EXCLUSIVE, but it feels too heavy. What I really want is to ensure that, while this function executes, no new rows are added to the table with the user_id concerned.
If you want to prevent concurrent transactions from modifying the table, a SHARE lock would be correct. But that could lead to a deadlock if two such transactions run at the same time — each has modified some rows and is blocked by the other one when it tries to escalate the table lock.
Moreover, all table locks that conflict with SHARE UPDATE EXCLUSIVE will lead to autovacuum cancellation, which will cause table bloat when it happens too often.
So stay away from table locks; they are usually the wrong thing.
The better way to go about this is to use no explicit locking at all, but to use the SERIALIZABLE isolation level for all transactions that access this table.
Then you can simply use your trigger (without lock), and no anomalies can occur. If you get a serialization error, repeat the transaction.
This comes with a certain performance penalty, but allows more concurrency than a table lock. It also avoids the problems described in the beginning.
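From the application side that looks roughly like this (the column values are illustrative):
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO my_table (user_id) VALUES (42);
COMMIT;
-- on SQLSTATE 40001 (serialization_failure), retry the whole transaction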
I have a general function that can manipulate the sequence of any table (why is irrelevant to my question). It reads the current value, works out the new value, sets it, and returns its calculation, which is what's inserted. This is obviously a multi-step process.
I call it from a BEFORE INSERT trigger on tables where I need it.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
Specifically, does the BEFORE INSERT trigger have to complete before it is called again by another caller?
Logically, I would assume yes, but one never knows what may be going on under the hood.
If the answer is no, what minimal locking would I need on the function to guarantee I can read and write the sequence in a "thread-safe" manner?
I'm using PG 10.
EDIT
Here is the function updated with a lock:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
sv int8;
seq text := format('%I.%I_uts_seq', tg_table_schema, tg_table_name);
BEGIN
EXECUTE format('LOCK TABLE %I IN ROW EXCLUSIVE MODE;', tg_table_name);
EXECUTE 'SELECT last_value+1 FROM ' || seq INTO sv; -- currval(seq) isn't usable
PERFORM setval(seq, GREATEST(sv, (EXTRACT(epoch FROM localtimestamp) * 1000000)::int8), false);
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
However, the INSERT itself already acquires ROW EXCLUSIVE on the table, so this LOCK statement may be redundant and a stronger lock may be needed. Or, conversely, it may mean no explicit lock is needed.
UPDATE
If I am reading this SO question correctly, my original version without the LOCK should work since the trigger acquires the same lock my updated function is redundantly taking.
All I need to know is am I guaranteed that the function will be called by only one caller at a time in a multi-user environment?
No. There is no such guarantee for the function call itself, but you can achieve this behaviour with the SERIALIZABLE transaction isolation level:
This level emulates serial transaction execution for all committed
transactions; as if transactions had been executed one after another,
serially, rather than concurrently
But this approach introduces several tradeoffs, such as having to prepare your application to retry transactions that fail with a serialization error.
Maybe I missed something, but I really believe that you just need NEXTVAL, something like below:
CREATE OR REPLACE FUNCTION public.uts_set()
RETURNS TRIGGER AS
$$
DECLARE
sv int8;
-- First, use the %I wildcard for identifiers, and append the
-- _uts_seq suffix before formatting so the whole name gets quoted
seq text := format('%I.%I', tg_table_schema, tg_table_name || '_uts_seq');
BEGIN
-- Second, you can't call CURRVAL in a session
-- that hasn't issued NEXTVAL before
sv := NEXTVAL(seq);
-- Do your logic here...
-- The result is ignored since this is a STATEMENT trigger
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Remember that CURRVAL acts on session-local scope and NEXTVAL on global scope, so you have a reliable thread-safe mechanism in hand.
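For illustration (the sequence name is hypothetical):
CREATE SEQUENCE myseq;
SELECT currval('myseq');  -- ERROR: currval of sequence "myseq" is not yet defined in this session
SELECT nextval('myseq');  -- returns 1
SELECT currval('myseq');  -- now returns 1, session-locally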
The sequence itself handles thread safety across concurrent sessions. So it really comes down to the code that interacts with the sequence. The following code is thread-safe:
SELECT nextval('myseq');
If you are doing much fancier things with the sequence, like setval and currval, I would be more worried about that being done in a high-transaction, multi-user environment. Even so, the sequence itself should be locked from other queries while it is being manipulated.
I would like to make sure that the values of certain required fields can not be changed later on. Is there a way to define this on the schema level?
Currently, I'm thinking about implementing this using a Record Trigger to raise an exception if a value change is noticed but this feels clunky.
E.g.:
BEGIN
    IF (TG_OP = 'UPDATE') THEN
        IF (NEW.product_id IS DISTINCT FROM OLD.product_id) THEN
            RAISE EXCEPTION 'Attempt to change frozen field "product_id" on UPDATE.';
        END IF;
    END IF;
END
If you want a trigger with comparison on a field, you can save execution by specifying condition on the trigger itself:
A Boolean expression that determines whether the trigger function will
actually be executed. If WHEN is specified, the function will only be
called if the condition returns true. In FOR EACH ROW triggers, the
WHEN condition can refer to columns of the old and/or new row values
by writing OLD.column_name or NEW.column_name respectively. Of course,
INSERT triggers cannot refer to OLD and DELETE triggers cannot refer
to NEW.
eg:
CREATE TRIGGER check_update
BEFORE UPDATE ON accounts
FOR EACH ROW
WHEN (OLD.product_id IS DISTINCT FROM NEW.product_id)
EXECUTE PROCEDURE check_account_update();
Of course it does not freeze anything. You just can't change the value with a plain UPDATE anymore, unless you disable the trigger, run the update, and re-enable the trigger. But at least the latter requires ALTER TABLE, not just UPDATE.
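The trigger function itself is not shown above; a minimal sketch (the message text is an assumption) could be:
CREATE FUNCTION check_account_update() RETURNS trigger AS $$
BEGIN
    -- the WHEN clause already guarantees product_id changed
    RAISE EXCEPTION 'Attempt to change frozen field "product_id" on UPDATE.';
END;
$$ LANGUAGE plpgsql;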
I created 5 triggers in my small (2 table database).
After I added the last one (to change INVPOS.INVSYMBOL after INVOICE.SYMBOL has been updated) these triggers activated each other and I got a
Too many concurrent executions of the same request.
error.
Could you please look at the triggers I created and help me out?
What can I do to avoid these problems in future? Should I merge a few triggers into one?
One solution could be to check whether the interesting field(s) changed and only run the trigger's action when really necessary (the data has changed), i.e.:
CREATE TRIGGER Foo FOR T
AFTER UPDATE -- the event is assumed; use the event(s) your trigger needs
AS
BEGIN
    -- only execute the update statement when Fld changed
    if (new.Fld is distinct from old.Fld) then begin
        update ...
    end
END
Another option could be to check whether the trigger has already done its thing in this transaction, i.e.:
CREATE TRIGGER Foo FOR T
AFTER UPDATE -- the event is assumed, as above
AS
DECLARE trgrDone VARCHAR(255);
BEGIN
    trgrDone = RDB$GET_CONTEXT('USER_TRANSACTION', 'Foo');
    IF (trgrDone IS NULL) THEN BEGIN
        -- the trigger hasn't been executed yet in this transaction:
        -- register the execution
        rdb$set_context('USER_TRANSACTION', 'Foo', 1);
        -- do the work which might cause re-entry
        update ...
    END
END
You should avoid circular references between triggers.
In general, triggers are not suitable for complex business logic; they work well for simple "if-then" business rules.
For the case you described you'd better implement a stored procedure, where you can prepare the data for all tables (perform data checks, calculate necessary values, etc.) and then insert it. That leads to straightforward, fast, easy-to-maintain code.
Also, use a CHECK constraint for "preventing inserting 0 into AMOUNT and PRICENET", and computed fields for tasks like "calculate NETVAL".
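For example (a sketch in Firebird syntax; the table name INVPOS and the exact expressions are assumptions based on the question):
ALTER TABLE INVPOS ADD CONSTRAINT chk_invpos_positive
    CHECK (AMOUNT > 0 AND PRICENET > 0);
ALTER TABLE INVPOS ADD NETVAL COMPUTED BY (AMOUNT * PRICENET);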
I have just left MySQL behind in favor of PostgreSQL, and I have a question regarding triggers. This trigger is designed to update a field in the 'workflow' table if a row is deleted in the 'processes' table.
CREATE OR REPLACE FUNCTION fn_process_delete() RETURNS TRIGGER AS $$
BEGIN
UPDATE workflow SET deleted_process_name = OLD.process_name
WHERE process_id = OLD.process_id;
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
DROP TRIGGER IF EXISTS process_delete ON processes;
CREATE TRIGGER process_delete
AFTER DELETE ON processes
FOR EACH ROW
EXECUTE PROCEDURE fn_process_delete();
My question is two-fold:
If I use AFTER DELETE as above, the row is deleted, but the UPDATE statement does not update the field in the 'workflow' table.
If I use BEFORE DELETE, the delete on the processes table does not happen at all, and I get an error saying "No unique identifier for this row".
Can anyone advise?
Question 2:
Your trigger function ends with:
RETURN NULL;
With that you skip the execution of the triggering event. Per documentation on trigger procedures:
Row-level triggers fired BEFORE can return null to signal the trigger
manager to skip the rest of the operation for this row (i.e.,
subsequent triggers are not fired, and the INSERT/UPDATE/DELETE does
not occur for this row).
You need to replace that with:
RETURN OLD;
for the system to proceed with the deletion of the row. Here is why:
In the case of a before-trigger on DELETE, the returned value has no
direct effect, but it has to be nonnull to allow the trigger action to
proceed. Note that NEW is null in DELETE triggers, so returning that
is usually not sensible. The usual idiom in DELETE triggers is to
return OLD.
Question 1:
I see no reason why your trigger and trigger function should not work as AFTER DELETE. It goes without saying that a row with a matching process_id has to exist in table workflow.
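A quick self-contained way to verify (minimal versions of both tables, with the AFTER DELETE trigger and function from the question in place):
CREATE TABLE workflow (process_id int, deleted_process_name text);
CREATE TABLE processes (process_id int, process_name text);
INSERT INTO processes VALUES (1, 'foo');
INSERT INTO workflow (process_id) VALUES (1);
DELETE FROM processes WHERE process_id = 1;
SELECT deleted_process_name FROM workflow WHERE process_id = 1;  -- should return 'foo'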