Table partitioning with JPA throws OptimisticLockException - PostgreSQL

I have a log_table in my database that is partitioned according to the partitioning docs. I have a function that inserts records into the correct partition table depending on the date, and a trigger that calls that function, just like in the documentation.
Trigger example
CREATE TRIGGER insert_measurement_trigger
    BEFORE INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();
Function example
CREATE OR REPLACE FUNCTION measurement_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.logdate >= DATE '2006-02-01' AND
         NEW.logdate < DATE '2006-03-01' ) THEN
        INSERT INTO measurement_y2006m02 VALUES (NEW.*);
    ELSIF ( NEW.logdate >= DATE '2006-03-01' AND
            NEW.logdate < DATE '2006-04-01' ) THEN
        INSERT INTO measurement_y2006m03 VALUES (NEW.*);
    ...
    ELSIF ( NEW.logdate >= DATE '2008-01-01' AND
            NEW.logdate < DATE '2008-02-01' ) THEN
        INSERT INTO measurement_y2008m01 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;
I am using OpenJPA 2.3.0 as the JPA implementation.
With the trigger on the table
When I try to persist a new entity to this log_table in a transaction, I get an exception on committing:
Caused by: <openjpa-2.3.0-r422266:1540826 nonfatal store error> org.apache.openjpa.persistence.OptimisticLockException:
An optimistic lock violation was detected when flushing object instance "...entities.LogTable#67ec8ef4" to the data store.
This indicates that the object was concurrently modified in another transaction.
When I insert one record manually using INSERT, it works, but returns
Query returned successfully: 0 rows affected, 54 ms execution time.
Without the trigger on the table
It works fine using the same Java code that threw the exception, so the code should be OK. I do nothing special to persist the entity.
The manual INSERT command returns
Query returned successfully: one row affected, 51 ms execution time.
Could this difference in the number of affected rows be the reason why OpenJPA cannot handle this correctly? In the code mentioned in the exception, I found this:
try {
    int count = executeUpdate(stmnt, sql, row);
    if (count != 1) {
        logSQLWarnings(stmnt);
        Object failed = row.getFailedObject();
        if (failed != null)
            _exceptions.add(new OptimisticException(failed));
    ...
It seems that because the insert returns 0 affected rows instead of 1 when the trigger is present, the code treats it as an optimistic lock failure.
SUBQUESTION
Is there a way to make the trigger behave the same, i.e. report that 1 row was affected? Can the result somehow be propagated from the inner function?

I had the same problem using the Java Ebean ORM: inserting on the Java side caused an optimistic lock error because of the failing row count.
My research into this problem did not yield a solution for propagating the row count from the insert into the child table, or for hacking the row count functionality.
My solution: after your conditions, set NEW.id to the next value of the table's sequence and RETURN NEW. This causes an insert into the parent table as well, and gives you the tuple response you need for the row count.
CREATE OR REPLACE FUNCTION measurement_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.logdate >= DATE '2006-02-01' AND
         NEW.logdate < DATE '2006-03-01' ) THEN
        INSERT INTO measurement_y2006m02 VALUES (NEW.*);
    ELSIF ( NEW.logdate >= DATE '2006-03-01' AND
            NEW.logdate < DATE '2006-04-01' ) THEN
        INSERT INTO measurement_y2006m03 VALUES (NEW.*);
    ...
    ELSIF ( NEW.logdate >= DATE '2008-01-01' AND
            NEW.logdate < DATE '2008-02-01' ) THEN
        INSERT INTO measurement_y2008m01 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range. Fix the measurement_insert_trigger() function!';
    END IF;
    -- ORIGINAL CODE
    -- RETURN NULL;
    ----------------- NEW CODE -----------------
    -- give the parent-table copy its own id so it can be told apart
    -- from the row that was routed into the partition
    NEW.id := nextval('table_id_seq');
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;
Yes, for a moment we do have duplicate entries with two different ids, so hopefully no unique constraints exist on the parent table. If they do, you might have to remove them. We then need to delete the duplicate from the parent table.
Create a function that returns a trigger, to be used AFTER INSERT. The trigger will delete the NEW row that was just inserted into the parent table measurement.
CREATE OR REPLACE FUNCTION measurement_after_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    -- remove only the parent-table copy; the partition keeps its row
    DELETE FROM ONLY measurement WHERE measurement.id = NEW.id;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;
Lastly, add the AFTER INSERT trigger to the parent table measurement:
CREATE TRIGGER measurement_after_insert_trigger_delete_entry
    AFTER INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_after_insert_trigger();
I did consider another possible solution: just have raw SQL do the insert. I did not like that approach, as I would have to make sure all future constructors route to the raw SQL. The approach above lets the code behave as expected, as if we were inserting into a regular table.
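As a quick sanity check (a hedged sketch: it assumes the parent table has an id column fed by table_id_seq and a logdate column, as in the snippets above), with the BEFORE/AFTER trigger pair in place a plain INSERT on the parent once again reports one affected row, while the data ends up only in the partition:
INSERT INTO measurement (id, logdate)
VALUES (nextval('table_id_seq'), DATE '2006-02-15');
-- command tag now reads: INSERT 0 1

SELECT count(*) FROM ONLY measurement;      -- 0: the AFTER trigger removed the parent copy
SELECT count(*) FROM measurement_y2006m02;  -- 1: the partition kept the routed row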

Related

How can I write a trigger that gets the last inserted row into the table?

I want to populate a field is_continued_post if some conditions are true about the previously inserted row in the table (it's the same user, and its inserted_at is less than N minutes before the new row's inserted_at).
When a new comment is inserted into the database, I want to get the last comment (with the same post_id) that was inserted, then check that the old row's user_id is the same as the new row's user_id, and that the old row was inserted less than 2 minutes before the new row. If this is true, I want to flip a boolean on the new row to true before inserting it.
Is this possible with PostgreSQL triggers? Or is there a better way to do this?
This is what I've come up with so far:
CREATE OR REPLACE FUNCTION update_message_cont()
RETURNS trigger AS $$
BEGIN
    old := (SELECT m0.user_id, m0.inserted_at
            FROM messages AS m0
            WHERE (m0.post_id = NEW.post_id)
            ORDER BY m0.inserted_at DESC
            LIMIT 1);
    NEW.is_continued := CASE
        WHEN old IS NULL THEN FALSE
        WHEN old.user_id = NEW.user_id AND ((NEW.inserted_at - old.inserted_at) < 120) THEN TRUE
    END;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
Yes, that is possible, but only if you have a column in the table that allows you to identify the last inserted row. The order of insertion is not reflected in the table as such.
So introduce a column:
inserted_at timestamp with time zone DEFAULT clock_timestamp() NOT NULL
An index on (post_id, inserted_at) will make the query fast.
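For example (a sketch; the index name is just illustrative):
CREATE INDEX messages_post_id_inserted_at_idx
    ON messages (post_id, inserted_at);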
The whole trigger could look like:
CREATE FUNCTION update_message_cont() RETURNS trigger AS
$$BEGIN
    SELECT user_id IS NOT DISTINCT FROM NEW.user_id INTO NEW.is_continued
    FROM messages
    WHERE post_id = NEW.post_id
      AND inserted_at > NEW.inserted_at - INTERVAL '120 seconds'
    ORDER BY inserted_at DESC
    LIMIT 1;
    -- if no previous row was found:
    IF NEW.is_continued IS NULL THEN
        NEW.is_continued := FALSE;
    END IF;
    RETURN NEW;
END;$$ LANGUAGE plpgsql;
CREATE TRIGGER update_message_cont BEFORE INSERT ON messages
FOR EACH ROW EXECUTE PROCEDURE update_message_cont();
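A minimal way to try it out (a sketch under assumptions: the messages definition below is invented for the test and your real table will have more columns; create the table first, then the function and trigger above):
CREATE TABLE messages (
    id           bigserial PRIMARY KEY,
    post_id      bigint NOT NULL,
    user_id      bigint NOT NULL,
    is_continued boolean,
    inserted_at  timestamp with time zone DEFAULT clock_timestamp() NOT NULL
);

INSERT INTO messages (post_id, user_id) VALUES (1, 42);  -- is_continued = false (no previous row)
INSERT INTO messages (post_id, user_id) VALUES (1, 42);  -- is_continued = true  (same user within 120 s)
INSERT INTO messages (post_id, user_id) VALUES (1, 99);  -- is_continued = false (different user)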

BEFORE Trigger + RETURNING returns NULL

This is one of the BEFORE triggers that inserts into the right table partition:
CREATE OR REPLACE FUNCTION public.insert_install_session()
RETURNS trigger
LANGUAGE plpgsql
AS
$body$
BEGIN
    IF (NEW.created >= '2015-10-01 00:00:00' AND NEW.created < '2015-10-02 00:00:00') THEN
        INSERT INTO install_session_2015_10_01 VALUES (NEW.*);
    ELSIF (NEW.created >= '2015-10-02 00:00:00' AND NEW.created < '2015-10-03 00:00:00') THEN
        INSERT INTO install_session_2015_10_02 VALUES (NEW.*);
    ELSIF (NEW.created >= '2015-09-30 00:00:00' AND NEW.created < '2015-10-01 00:00:00') THEN
        INSERT INTO install_session_2015_09_30 VALUES (NEW.*);
    ELSE
        RETURN NEW;
    END IF;
    RETURN NULL;
END;
$body$;
CREATE TRIGGER trigger_insert_install_session
    BEFORE INSERT ON install_session
    FOR EACH ROW EXECUTE PROCEDURE insert_install_session();
and I have a query that uses RETURNING:
INSERT INTO "install_session"
(<columns here>)
VALUES
(<values here>)
RETURNING "install_session"."id";
How can I make the RETURNING work? It seems it always returns NULL.
Is it because of the RETURN NULL at the end of the function? I can't return NEW, because then the row would be inserted a second time, no? Here are the official docs.
This is not going to work with a trigger solution. You could make it work with rules instead of triggers IIRC, but that has other caveats ...
However, to just get the auto-generated ID from a serial column, you can call currval() immediately after the command in the same session:
SELECT currval('name_of_your_id_sequence_here'::regclass);
Or even just lastval(), which takes no argument, as long as nothing else in the same session (for instance a non-inherited id column of a partition drawing from its own sequence) has used another sequence in the meantime:
SELECT lastval();
You can use pg_get_serial_sequence() to find the name of the sequence if you don't know it:
SELECT currval(pg_get_serial_sequence('install_session', 'id'));
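Put together (a sketch; the placeholders mirror the query above):
INSERT INTO install_session (<columns here>)
VALUES (<values here>);

SELECT currval(pg_get_serial_sequence('install_session', 'id')) AS id;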
Related answer on dba.SE:
Return ID from partitioned table in postgres?

Postgres: count from table in an efficient way

In my application we are using PostgreSQL; the summary table now has one million records.
When I run the following query, it takes 80,927 ms:
SELECT COUNT(*) AS count
FROM summary_views
GROUP BY question_id,category_type_id
Is there any efficient way to do this?
COUNT(*) in PostgreSQL tends to be slow: because of MVCC, the rows have to be scanned to determine which ones are visible to the current transaction. One of the workarounds for the problem is a row counting trigger with a helper table:
create table table_count(
    table_count_id text primary key,
    rows int default 0
);

CREATE OR REPLACE FUNCTION table_count_update()
RETURNS trigger AS
$BODY$
begin
    if tg_op = 'INSERT' then
        update table_count set rows = rows + 1
        where table_count_id = TG_TABLE_NAME;
    elsif tg_op = 'DELETE' then
        update table_count set rows = rows - 1
        where table_count_id = TG_TABLE_NAME;
    end if;
    return null;
end;
$BODY$
LANGUAGE plpgsql VOLATILE;
The next step is to add the proper trigger declaration for each table you'd like to use it with. For example, for table tab_name:
begin;
insert into table_count values
    ('tab_name', (select count(*) from tab_name));
create trigger tab_name_table_count after insert or delete
    on tab_name for each row execute procedure table_count_update();
commit;
It is important to run this in a transaction block, to keep the actual count and the helper table in sync in case of a delete or insert between the initial count and the trigger creation. The transaction guarantees this. From now on, to get the current count instantly, just invoke:
select rows from table_count where table_count_id = 'tab_name';
Edit: in the case of your GROUP BY clause, you'll need a more sophisticated trigger function and count table; a sketch follows.
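For example, a per-group variant could look like this (a hedged sketch: it assumes PostgreSQL 9.5+ for ON CONFLICT, and borrows the column names from the query above; seed the count table in a transaction together with the trigger creation, as shown for tab_name):
create table summary_views_count(
    question_id int,
    category_type_id int,
    rows int default 0,
    primary key (question_id, category_type_id)
);

CREATE OR REPLACE FUNCTION summary_views_count_update()
RETURNS trigger AS
$BODY$
begin
    if tg_op = 'INSERT' then
        -- create the group on first sight, otherwise bump its counter
        insert into summary_views_count as c (question_id, category_type_id, rows)
        values (NEW.question_id, NEW.category_type_id, 1)
        on conflict (question_id, category_type_id)
            do update set rows = c.rows + 1;
    elsif tg_op = 'DELETE' then
        update summary_views_count set rows = rows - 1
        where question_id = OLD.question_id
          and category_type_id = OLD.category_type_id;
    end if;
    return null;
end;
$BODY$
LANGUAGE plpgsql VOLATILE;

create trigger summary_views_count_trigger after insert or delete
    on summary_views for each row execute procedure summary_views_count_update();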

GreenPlum - Table With Trigger - Insert Failed

This is my first time using triggers in a Greenplum environment. I think I have most of it set up, but I am facing some issues when I insert data. Here is my trigger:
CREATE TRIGGER insert_trigger
    BEFORE INSERT ON leads.abhi_temp
    FOR EACH ROW EXECUTE PROCEDURE leads.my_trigger();
Here is the definition of the trigger function:
CREATE OR REPLACE FUNCTION leads.my_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.date >= DATE '2003-01-01' AND
         NEW.date < DATE '2003-12-31' ) THEN
        INSERT INTO leads.abhi_temp_y2003 VALUES (NEW.*);
    ELSIF ( NEW.date >= DATE '2004-01-01' AND
            NEW.date < DATE '2004-12-31' ) THEN
        INSERT INTO leads.abhi_temp_y2004 VALUES (NEW.*);
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;
Now, to insert data into my table, I use:
insert into leads.myData (select column1, column2 from leads.someOtherDara where column1 = '1');
But this gives me an error:
ERROR: function cannot execute on segment because it issues a non-SELECT statement (functions.c:133)
I think the error is because I am using a nested query to insert the data. I am not sure how to fix this. Any recommendations? Thanks in advance for your help.
I am aware that there is very limited support for triggers in Greenplum; it does not support DML operations inside them.
May I know how you achieve this? I mean, how can the rules you mentioned in your previous comments be applied?
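For reference, a rule-based rewrite of the trigger above might look like this (a hedged sketch: it mirrors the PostgreSQL docs' partitioning-by-rules example; the rule name is illustrative, and whether your Greenplum version accepts it is worth verifying, since Greenplum's native table partitioning may be the better fit):
CREATE RULE abhi_temp_insert_y2003 AS
    ON INSERT TO leads.abhi_temp
    WHERE ( NEW.date >= DATE '2003-01-01' AND NEW.date < DATE '2004-01-01' )
    DO INSTEAD
        INSERT INTO leads.abhi_temp_y2003 VALUES (NEW.*);
A rule rewrites the statement itself instead of running procedural code per row, which is why it sidesteps the trigger restriction.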

count number of rows to be affected before update in trigger

I want to know the number of rows that will be affected by an UPDATE query in a BEFORE per-statement trigger. Is that possible?
The problem is that I want to allow only queries that update up to 4 rows. If the affected row count is 5 or more, I want to raise an error.
I don't want to do this in application code, because I need this check at the DB level.
Is this at all possible?
Thanks in advance for any clues on that.
Write a function that updates the rows for you or performs a rollback.
create function update_max(varchar, int)
RETURNS void AS
$BODY$
DECLARE
    sql ALIAS FOR $1;
    max ALIAS FOR $2;
    rcount INT;
BEGIN
    EXECUTE sql;
    GET DIAGNOSTICS rcount = ROW_COUNT;
    IF rcount > max THEN
        --ROLLBACK;
        RAISE EXCEPTION 'Too many rows affected (%).', rcount;
    END IF;
    --COMMIT;
END;
$BODY$ LANGUAGE plpgsql;
Then call it like:
select update_max('update t1 set id=id+10 where id < 4', 3);
where the first parameter is your SQL statement and the second is your maximum row count.
Simon had a good idea, but his implementation is unnecessarily complicated. This is my proposition:
create or replace function trg_check_max_4()
returns trigger as $$
begin
    perform true from pg_class
        where relname = 'check_max_4' and relnamespace = pg_my_temp_schema();
    if not FOUND then
        create temporary table check_max_4
            (value int check (value <= 4))
            on commit drop;
        insert into check_max_4 values (0);
    end if;
    update check_max_4 set value = value + 1;
    return new;
end; $$ language plpgsql;
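The answer stops short of attaching the function; presumably it would be wired up like this (the table name t1 is a placeholder):
create trigger trg_check_max_4 before update on t1
    for each row execute procedure trg_check_max_4();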
I've created something like this:
begin;

create table test (
    id integer
);
insert into test(id) select generate_series(1, 100);

create or replace function trg_check_max_4_updated_records()
returns trigger as $$
declare
    counter_ integer := 0;
    tablename_ text := 'temptable';
begin
    raise notice 'trigger fired';
    select count(42) into counter_
        from pg_catalog.pg_tables where tablename = tablename_;
    if counter_ = 0 then
        raise notice 'Creating table %', tablename_;
        execute 'create temporary table ' || tablename_ || ' (counter integer) on commit drop';
        execute 'insert into ' || tablename_ || ' (counter) values(1)';
        execute 'select counter from ' || tablename_ into counter_;
        raise notice 'Actual value for counter = [%]', counter_;
    else
        execute 'select counter from ' || tablename_ into counter_;
        execute 'update ' || tablename_ || ' set counter = counter + 1';
        raise notice 'updating';
        execute 'select counter from ' || tablename_ into counter_;
        raise notice 'Actual value for counter = [%]', counter_;
        if counter_ > 4 then
            raise exception 'Cannot change more than 4 rows in one transaction';
        end if;
    end if;
    return new;
end; $$ language plpgsql;

create trigger trg_bu_test before
    update on test
    for each row
    execute procedure trg_check_max_4_updated_records();

update test set id = 10 where id <= 1;
update test set id = 10 where id <= 2;
update test set id = 10 where id <= 3;
update test set id = 10 where id <= 4;
update test set id = 10 where id <= 5;

rollback;
The main idea is to have a BEFORE UPDATE FOR EACH ROW trigger that creates (if necessary) a temporary table (dropped at the end of the transaction). In this table there is just one row with one value: the number of rows updated in the current transaction. For each updated row the value is incremented. If the value is bigger than 4, the transaction is aborted.
But I think this is the wrong solution to your problem. What stops someone from running the offending query twice, changing 8 rows in total? And what about deleting rows, or truncating the table?
PostgreSQL has two types of triggers: row and statement triggers. Row triggers only work within the context of a row, so you can't use those. Unfortunately, "before" statement triggers don't see what kind of change is about to take place, so I don't believe you can use those, either.
Based on that, I would say it's unlikely you'll be able to build that kind of protection into the database using triggers, unless you don't mind using an "after" trigger and rolling back the transaction if the condition isn't satisfied. Wouldn't mind being proved wrong. :)
Have a look at using the Serializable isolation level. I believe this will give you a consistent view of the database data within your transaction. Then you can use option #1 that MusiGenesis mentioned below, without the timing vulnerability. Test it, of course, to validate.
I've never worked with PostgreSQL, so my answer may not apply. In SQL Server, your trigger can call a stored procedure which would do one of two things:
1. Perform a SELECT COUNT(*) to determine the number of records that will be affected by the UPDATE, and then only execute the UPDATE if the count is 4 or less.
2. Perform the UPDATE within a transaction, and only commit the transaction if the reported number of affected rows is 4 or less.
No. 1 is vulnerable to a race condition (the number of records affected by the UPDATE may change between the COUNT(*) check and the actual UPDATE). No. 2 is pretty inefficient if there are many cases where the number of updated rows is greater than 4.