How to address moddatetime apparently bypassing precision of column? - postgresql

We have a database where every timestamp is defined as timestamp(3) with time zone. We noticed that in some cases we were getting back timestamps with 6 fractional digits, not 3. I tracked this down to only happening in our update_date columns, and only when the value was set via moddatetime.
Does anybody have experience with gracefully fixing this?
I tested manual inserts and updates, from literals, from now(), etc., and none of those caused the issue.
I'm aware of being able to fix this by casting to ::timestamptz(3) wherever we use this field, but this will add a lot of mess, plus I am concerned about the performance if we need to do ... WHERE date_trunc('milliseconds', update_date) > other_tz3_value.
The best I've come up with would be changing the index we have on update_date to a functional index on date_trunc('milliseconds', update_date), but again, this seems clunky.
The tables in question have a trigger like:
CREATE TRIGGER mytrigger
BEFORE UPDATE ON mytable
FOR EACH ROW EXECUTE PROCEDURE moddatetime(update_date);
My expectation would be that no matter how a value is inserted into a timestamp(3) field, it always comes back with no more than 3 fractional digits.

I don't think it's possible with the moddatetime function.
This answer shows an example of using a custom trigger. We can modify it to define the precision of the generated timestamp like so:
CREATE OR REPLACE FUNCTION update_modified_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.modified = current_timestamp(3);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
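Bind it in place of moddatetime with an ordinary trigger; a minimal sketch against the table from the question (the function above sets NEW.modified, so rename that to update_date for this schema):

-- Sketch: swap the moddatetime trigger for the custom function above.
DROP TRIGGER IF EXISTS mytrigger ON mytable;

CREATE TRIGGER mytrigger
BEFORE UPDATE ON mytable
FOR EACH ROW EXECUTE PROCEDURE update_modified_column();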

Related

Create Trigger to get hourly difference between timestamps

I have a table where I would like to calculate the difference in time (in hours) between two columns after inserting a row. I would like to set up a trigger to do this whenever an insert or update is performed on the table.
My columns are delay_start, delay_stop, and delay_duration. I would like to do the following:
delay_duration = delay_stop - delay_start
The result should be a numeric(4,2) value and go into the delay_duration column. Below is what I have so far, but it will not populate the column for some reason.
BEGIN
INSERT INTO public.deckdelays(delay_duration)
VALUES(DATEDIFF(hh, delay_stop, delay_start));
RETURN NEW;
END;
I am quite new to all of this so if anyone could help I would greatly appreciate it!
If you have Postgres 12 or later you can define delay_duration as a generated column. This allows you to eliminate triggers.
create table deckdelays(id integer generated always as identity
, delay_start timestamp
, delay_stop timestamp
, delay_duration numeric(4,2)
generated always as
( extract(epoch from (delay_stop - delay_start))/3600 )
stored
--, other attributes
);
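A quick check with hypothetical values shows the column being computed on write:

insert into deckdelays (delay_start, delay_stop)
values ('2023-01-01 08:00', '2023-01-01 09:30');

select delay_duration from deckdelays;  -- 1.50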
But if you insist on a trigger:
create or replace function delayduration_func()
returns trigger
language plpgsql
as $$
begin
    -- use the NEW record, not the table name, inside a row-level trigger
    new.delay_duration = (extract(epoch from (new.delay_stop - new.delay_start)) / 3600)::numeric;
    return new;
end;
$$;
create trigger delaydurationset1
before insert
or update of delay_stop, delay_start
on deckdelays
for each row
execute procedure delayduration_func();
Changes:
Before trigger instead of after. A before trigger can modify the values in a column without additional DML statements; an after trigger cannot. Issuing a DML statement on a table from within a trigger on that same table can lead to all kinds of problems and is best avoided if possible.
Trigger name and function name are not the same. It might just be me, but I do not like different things having the same name; although it works, it often leads to confusion. Always avoid confusion if possible.
The trigger also fires on update of delay_start. An update of either delay_start or delay_stop recalculates delay_duration.
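With the trigger variant (where delay_duration is a plain numeric(4,2) column rather than a generated one), the same hypothetical data behaves the same way, and changing delay_stop recomputes the duration:

insert into deckdelays (delay_start, delay_stop)
values ('2023-01-01 08:00', '2023-01-01 09:30');   -- delay_duration = 1.50

update deckdelays
set delay_stop = '2023-01-01 10:00'
where delay_start = '2023-01-01 08:00';            -- delay_duration = 2.00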

Does CLOCK_TIMESTAMP from a BEFORE trigger match log/commit order *exactly* in PG 12.3?

I've got a Postgres 12.3 question: Can I rely on CLOCK_TIMESTAMP() in a trigger to stamp an updated_dts timestamp in exactly the same order as changes are committed to the permanent data?
On the face of it, this might sound like kind of a silly question, but I just spent two days tracking down a super rare race condition in a non-Postgres system that hinged on exactly this behavior. (Lagging commits made their 'last value seen' tracking data unreliable.) Now I'm trying to figure out if it's possible for CLOCK_TIMESTAMP() to not match the order of changes recorded in the WAL perfectly.
It's simple to see how this could occur with NOW/TRANSACTION_TIMESTAMP/CURRENT_TIMESTAMP as they're returning the transaction start time, not the completion time. It's pretty easy, in that case, to record a timestamp sequence where the stamps and log order don't agree. But I can't figure out if there's any chance for commits to be saved in a different order to the BEFORE trigger CLOCK_TIMESTAMP() values.
For background, we need a 100% reliable timeline for an external search to use. As I understand it, I can create one using logical replication, and a replication-target side trigger to stamp changes as they're replayed from the log. What I'm unclear on, is if it's possible to get the same fidelity from CLOCK_TIMESTAMP() on a single server.
I haven't got the chops to get deep into the Postgres internals, and see how requests are interleaved, nor how granular execution is, and am hoping that someone here knows definitively. If this is more of a question for one of the PG mailing lists, please let me know.
-- Thanks
Below is a bit of sample code for how I'm looking at building the timestamps. It works fine, but doesn't prove anything about behavior with lots of concurrent processes.
---------------------------------------------
-- Create the trigger function
---------------------------------------------
DROP FUNCTION IF EXISTS api.set_updated CASCADE;
CREATE OR REPLACE FUNCTION api.set_updated()
RETURNS TRIGGER
AS $BODY$
BEGIN
NEW.updated_dts = CLOCK_TIMESTAMP();
RETURN NEW;
END;
$BODY$
language plpgsql;
COMMENT ON FUNCTION api.set_updated() IS 'Sets updated_dts field to CLOCK_TIMESTAMP(), if the record has changed.';
---------------------------------------------
-- Create the table
---------------------------------------------
DROP TABLE IF EXISTS api.numbers;
CREATE TABLE api.numbers (
id uuid NOT NULL DEFAULT extensions.gen_random_uuid (),
number integer NOT NULL DEFAULT NULL,
updated_dts timestamptz NOT NULL DEFAULT 'epoch'::timestamptz
);
---------------------------------------------
-- Define the triggers (binding)
---------------------------------------------
-- NOTE: I'm guessing that in production I can use DEFAULT CLOCK_TIMESTAMP() instead of a BEFORE INSERT trigger.
-- I'm using a distinct DEFAULT value, as I want it to pop out if I'm not getting the trigger to fire.
CREATE TRIGGER trigger_api_number_before_insert
BEFORE INSERT ON api.numbers
FOR EACH ROW
EXECUTE PROCEDURE set_updated();
CREATE TRIGGER trigger_api_number_before_update
BEFORE UPDATE ON api.numbers
FOR EACH ROW
WHEN (OLD.* IS DISTINCT FROM NEW.*)
EXECUTE PROCEDURE set_updated();
---------------------------------------------
-- INSERT some data
---------------------------------------------
INSERT INTO numbers (number) values (1),(2),(3);
---------------------------------------------
-- Take a look
---------------------------------------------
SELECT * from numbers ORDER BY updated_dts ASC; -- The values should be listed as 1, 2, 3 as oldest to newest.
---------------------------------------------
-- UPDATE a row
---------------------------------------------
UPDATE numbers SET number = 11 where number = 1;
---------------------------------------------
-- Take a look
---------------------------------------------
SELECT * from numbers ORDER BY updated_dts ASC; -- The values should be listed as 2, 3, 11 as oldest to newest.
No, you cannot depend on clock_timestamp() order during trigger execution (or while evaluating a DEFAULT clause) being the same as commit order.
Commit will always happen later than the function call, and you cannot control how long it takes between them.
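To make that concrete, here is a sketch of a two-session timeline (the session labels and the stall are hypothetical) in which the row stamped first is committed last:

-- Session A
BEGIN;
UPDATE numbers SET number = 11 WHERE number = 1;
-- BEFORE trigger fires here: updated_dts = t1
-- ... session A stalls (scheduling, a lock elsewhere, a slow client) ...

-- Session B, meanwhile
BEGIN;
UPDATE numbers SET number = 22 WHERE number = 2;
-- BEFORE trigger fires here: updated_dts = t2, with t2 > t1
COMMIT;  -- B's change is committed (and written to the WAL) first

-- Session A, later
COMMIT;  -- A commits last, yet carries the earlier timestamp t1

Anything that orders rows by updated_dts now sees t1 before t2, even though the t2 change became visible first.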
But I am surprised that that is a problem for you. Typically, the commit time is not visible or relevant. Why don't you simply accept the clock_timestamp() as the measure of things?

Trigger to check valid input

I am inserting a lot of measurement data from different sources into a Postgres database. The data are a measured value and an uncertainty (and a lot of auxiliary data). The problem is that in some cases I get an absolute error value, e.g. 123 +/- 33; in other cases I get a relative error as a percentage of the measured value, e.g. 123 +/- 10%. I would like to store all measurements with an absolute error, i.e. the latter should be stored as 123 +/- 12.3 (at this point, I don't care too much about the valid number of digits).
My idea is to use a trigger to do this. Basically, if the error is numeric, store it as is; if it is non-numeric, check whether the last character is '%', and in that case multiply it by the measured value, divide by 100 and store the resulting value. I got an isnumeric() function from here: isnumeric() with PostgreSQL, which works fine. But when I try to make this into a trigger, it seems as if the input is checked for validity even before the trigger fires, so the insert is aborted before I get any chance to do anything with the values.
My trigger function (it still needs to do the calculation; I'm just setting the error to 0 here):
create function my_trigger_function()
returns trigger as'
begin
if not isnumeric(new.err) THEN
new.err=0;
end if;
return new;
end' language 'plpgsql';
then I connect it to the table:
create trigger test_trigger
before insert on test
for each row
execute procedure my_trigger_function();
Doing this, I would expect to get val=123 and err=0 for the following insert
insert into test(val,err) values(123,'10%');
but the insert fails with "invalid input syntax for type numeric", which must be raised before my trigger gets any chance to see the data (or I have misunderstood something basic). Is it possible to make new.err data-type agnostic, can I run the trigger even earlier, or is what I want to do just plain impossible?
It's not possible with a trigger, because the SQL parser rejects the value before any trigger runs.
When the trigger is launched, the NEW.* columns already have their definitive types matching the destination columns.
The closest alternative is to provide a function converting from text to numeric implementing your custom syntax rules and apply it in the VALUES clause:
insert into test(val,err) values(123, custom_convert('10%'));
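For illustration, such a function might look like the following; the two-argument signature is my own adaptation (the measured value is needed to resolve a percentage), and the name custom_convert is just a placeholder:

CREATE OR REPLACE FUNCTION custom_convert(val numeric, err text)
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $$
BEGIN
    IF right(trim(err), 1) = '%' THEN
        -- relative error: resolve it against the measured value
        RETURN val * left(trim(err), -1)::numeric / 100;
    ELSE
        -- already an absolute error
        RETURN err::numeric;
    END IF;
END;
$$;

insert into test(val, err) values (123, custom_convert(123, '10%'));  -- stores err = 12.3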
Daniel answered my original question, and I found out I had to think differently. His proposal may work for others, but because my system interfaces with the database by fetching table and column names directly from it, that approach would not work well for me.
Instead I added a boolean field relerr to the measurement table
alter table measure add relerr boolean default false;
Then I made a trigger that checks whether relerr is true, indicating that I am trying to store a relative error; if so, it recalculates the error column (called prec, for precision):
CREATE FUNCTION calc_fromrel_error()
RETURNS trigger AS '
BEGIN
    IF NEW.relerr THEN
        NEW.prec = NEW.prec * NEW.value / 100;
        NEW.relerr = FALSE;
    END IF;
    RETURN NEW;
END' LANGUAGE 'plpgsql';
and then
create trigger meas_calc_relerr_trigger
before insert on measure
for each row
execute procedure calc_fromrel_error();
Voila, by doing an
INSERT into measure (value,prec,relerr) values(220,10,true);
I get the table populated with 220, 22, false. Inserted values should normally never be updated; if that for some strange reason happens, I can recalculate the prec column manually.

Sanitize input to a column in postgres

So, I think this should be fairly simple, but the documentation makes it seem somewhat more complicated. I've written an SQL function in PostgreSQL (8.1, for now) which does some cleanup on some string input. For what it's worth, the string is an LDAP distinguished name, and I want there to consistently be no spaces after the commas - and the function is clean_dn(), which returns the cleaned DN. I want to do the same thing to force all input to another couple of columns to lower case, etc - which should be easy once I figure this part out.
Anyway, I want this function to be run on the "dn" column of a table any time anyone attempts to insert into or update that column. But all the rule examples I can find seem to assume that insert/update queries always modify all the columns in a table, which in my situation is not the case. What I think I really want is a constraint which changes the value rather than returning true or false, but that doesn't seem to fit the SQL idea of a constraint. Do I have my rule do an UPDATE into the NEW table? Do I have to create a new rule for every possible combination of NEW values? And if I add a column, do I have to go through and update all of my rule combinations to reflect every possible new combination of columns?
There has to be an easy way...
First, update to a current version of PostgreSQL. 8.1 is long dead, forgotten, unsupported and very, very old ... you get my point? The current version is PostgreSQL 9.2.
Then, use a trigger instead of a rule. It's simpler. It's the way most people go. I do.
For column col in table tbl ...
First, create a trigger function:
CREATE OR REPLACE FUNCTION trg_tbl_insupbef()
RETURNS trigger AS
$BODY$
BEGIN
NEW.col := f_myfunc(NEW.col); -- your function here, must return matching type
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Then use it in a trigger.
For ancient Postgres 8.1:
CREATE TRIGGER insupbef
BEFORE INSERT OR UPDATE
ON tbl
FOR EACH ROW
EXECUTE PROCEDURE trg_tbl_insupbef();
For modern day Postgres (9.0+)
CREATE TRIGGER insbef
BEFORE INSERT OR UPDATE OF col -- only call trigger, if column was updated
ON tbl
FOR EACH ROW
EXECUTE PROCEDURE trg_tbl_insupbef();
You could pack more stuff into one trigger, but then you can't condition the UPDATE trigger on just the one column ...
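For the DN cleanup described in the question, f_myfunc could be something like the following (a minimal sketch; the regular expression is my own guess at "no spaces after the commas"):

CREATE OR REPLACE FUNCTION clean_dn(dn text)
RETURNS text
LANGUAGE sql IMMUTABLE AS
$$ SELECT regexp_replace($1, ',\s+', ',', 'g') $$;

-- SELECT clean_dn('cn=Jane Doe, ou=People, dc=example, dc=org');
-- returns 'cn=Jane Doe,ou=People,dc=example,dc=org'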

PostgreSQL, triggers, and concurrency to enforce a temporal key

I want to define a trigger in PostgreSQL to check that the inserted row, on a generic table, has the property: "no other row exists with the same key in the same valid time" (the keys are sequenced keys). In fact, I have already implemented it. But since the trigger has to scan the entire table, now I'm wondering: is there a need for a table-level lock? Or is this managed somehow by PostgreSQL itself?
Here is an example.
In the upcoming PostgreSQL 9.0 I would have defined the table in this way:
CREATE TABLE medicinal_products
(
aic_code CHAR(9), -- sequenced key
full_name VARCHAR(255),
market_time PERIOD,
EXCLUDE USING gist
(aic_code CHECK WITH =,
market_time CHECK WITH &&)
);
but in fact I have defined it like this:
CREATE TABLE medicinal_products
(
PRIMARY KEY (aic_code, vs),
aic_code CHAR(9), -- sequenced key
full_name VARCHAR(255),
vs DATE NOT NULL,
ve DATE,
CONSTRAINT valid_time_range
CHECK (ve > vs OR ve IS NULL)
);
Then I have written a trigger that checks the constraint: "two distinct medicinal products can have the same code in two different periods, but not at the same time".
So the code:
INSERT INTO medicinal_products VALUES ('1','A','2010-01-01','2010-04-01');
INSERT INTO medicinal_products VALUES ('1','A','2010-03-01','2010-06-01');
raises an error on the second INSERT.
One solution is to have a second table to use for detecting clashes, and populate that with a trigger. Using the schema you added into the question:
CREATE TABLE medicinal_product_date_map(
aic_code char(9) NOT NULL,
applicable_date date NOT NULL,
UNIQUE(aic_code, applicable_date));
(note: this is the second attempt due to misreading your requirement the first time round. hope it's right this time).
Some functions to maintain this table:
CREATE FUNCTION add_medicinal_product_date_range(aic_code_in char(9), start_date date, end_date date)
RETURNS void STRICT VOLATILE LANGUAGE sql AS $$
INSERT INTO medicinal_product_date_map
SELECT $1, $2 + d
FROM generate_series(0, $3 - $2) AS g(d)
$$;
CREATE FUNCTION clr_medicinal_product_date_range(aic_code_in char(9), start_date date, end_date date)
RETURNS void STRICT VOLATILE LANGUAGE sql AS $$
DELETE FROM medicinal_product_date_map
WHERE aic_code = $1 AND applicable_date BETWEEN $2 AND $3
$$;
And populate the table first time with:
SELECT count(add_medicinal_product_date_range(aic_code, vs, ve))
FROM medicinal_products;
Now create triggers to populate the date map after changes to medicinal_products: after insert calls add_, after update calls clr_ (old values) and add_ (new values), after delete calls clr_.
CREATE FUNCTION sync_medicinal_product_date_map()
RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
IF TG_OP = 'UPDATE' OR TG_OP = 'DELETE' THEN
PERFORM clr_medicinal_product_date_range(OLD.aic_code, OLD.vs, OLD.ve);
END IF;
IF TG_OP = 'UPDATE' OR TG_OP = 'INSERT' THEN
PERFORM add_medicinal_product_date_range(NEW.aic_code, NEW.vs, NEW.ve);
END IF;
RETURN NULL;
END;
$$;
CREATE TRIGGER sync_date_map
AFTER INSERT OR UPDATE OR DELETE ON medicinal_products
FOR EACH ROW EXECUTE PROCEDURE sync_medicinal_product_date_map();
The uniqueness constraint on medicinal_product_date_map will trap any products being added with the same code on the same day:
steve#steve#[local] =# INSERT INTO medicinal_products VALUES ('1','A','2010-01-01','2010-04-01');
INSERT 0 1
steve#steve#[local] =# INSERT INTO medicinal_products VALUES ('1','A','2010-03-01','2010-06-01');
ERROR: duplicate key value violates unique constraint "medicinal_product_date_map_aic_code_applicable_date_key"
DETAIL: Key (aic_code, applicable_date)=(1 , 2010-03-01) already exists.
CONTEXT: SQL function "add_medicinal_product_date_range" statement 1
SQL statement "SELECT add_medicinal_product_date_range(NEW.aic_code, NEW.vs, NEW.ve)"
PL/pgSQL function "sync_medicinal_product_date_map" line 6 at PERFORM
This depends on the values being checked for having a discrete space, which is why I asked about dates vs timestamps. Although timestamps are, technically, discrete since PostgreSQL only stores microsecond resolution, adding an entry to the map table for every microsecond the product is applicable for is not practical.
Having said that, you could probably also get away with something better than a full-table scan to check for overlapping timestamp intervals, with some trickery on looking for only the first interval not after or not before... however, for easy discrete spaces I prefer this approach which IME can also be handy for other things too (e.g. reports that need to quickly find which products are applicable on a certain day).
I also like this approach because it feels right to leverage the database's uniqueness-constraint mechanism this way. Also, I feel it will be more reliable in the context of concurrent updates to the master table: without locking the table against concurrent updates, it would be possible for a validation trigger to see no conflict and allow inserts in two concurrent sessions, which are then found to conflict only once both transactions' effects are visible.
Just a thought: if the valid time blocks could be coded with a number or something, creating a UNIQUE index on Id+TimeBlock would be blazingly fast and would resolve all table-lock problems.
It is managed by PostgreSQL itself. A SELECT acquires an ACCESS SHARE lock, which means that you can query the table but not perform updates.
A radical solution which might help you is to use a cache like Ehcache or memcached to store the id/timeblock info and not use PostgreSQL at all. Many can be persisted so they survive a server restart, and they do not exhibit this locking behavior.
Why can't you use a UNIQUE constraint? It will be much faster (it's an index) and easier.
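If the valid time blocks can indeed be made discrete, as suggested above, that unique constraint might look like this (a sketch; table and column names are illustrative):

CREATE TABLE medicinal_product_blocks (
    aic_code   char(9) NOT NULL,
    time_block date    NOT NULL,  -- e.g. the first day of each covered month
    UNIQUE (aic_code, time_block)
);

This is essentially the same trick as the date map table above, just at a coarser granularity.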