One of my tables has the following definition:
CREATE TABLE incidents
(
id serial NOT NULL,
report integer NOT NULL,
year integer NOT NULL,
month integer NOT NULL,
number integer NOT NULL, -- Report serial number for this period
...
PRIMARY KEY (id),
UNIQUE (report, year, month, number)
);
How would you go about incrementing the number column for every report, year, and month independently? I'd like to avoid creating a sequence or table for each (report, year, month) set.
It would be nice if PostgreSQL supported incrementing "on a secondary column in a multiple-column index" the way MySQL's MyISAM tables do, but I couldn't find any mention of such a feature in the manual.
An obvious solution is to select the current maximum value in the table and add 1, but that is not safe for concurrent sessions. Maybe a pre-insert trigger would work, but are they guaranteed to be non-concurrent?
Also note that I'm inserting incidents individually, so I can't use generate_series as suggested elsewhere.
It would be nice if PostgreSQL supported incrementing "on a secondary column in a multiple-column index" like MySQL's MyISAM tables
Yeah, but note that in doing so, MyISAM locks your entire table, which is what makes it safe to find the biggest value + 1 without worrying about concurrent transactions.
In Postgres, you can do this too, and without locking the whole table. An advisory lock and a trigger will be good enough:
CREATE TYPE animal_grp AS ENUM ('fish','mammal','bird');
CREATE TABLE animals (
grp animal_grp NOT NULL,
id INT NOT NULL DEFAULT 0,
name varchar NOT NULL,
PRIMARY KEY (grp,id)
);
CREATE OR REPLACE FUNCTION animals_id_auto()
RETURNS trigger AS $$
DECLARE
_rel_id constant int := 'animals'::regclass::int;
_grp_id int;
BEGIN
-- Use the enum value's ordinal position as the second lock key.
_grp_id = array_length(enum_range(NULL, NEW.grp), 1);
-- Obtain an advisory lock on this table/group.
PERFORM pg_advisory_lock(_rel_id, _grp_id);
SELECT COALESCE(MAX(id) + 1, 1)
INTO NEW.id
FROM animals
WHERE grp = NEW.grp;
RETURN NEW;
END;
$$ LANGUAGE plpgsql STRICT;
CREATE TRIGGER animals_id_auto
BEFORE INSERT ON animals
FOR EACH ROW WHEN (NEW.id = 0)
EXECUTE PROCEDURE animals_id_auto();
CREATE OR REPLACE FUNCTION animals_id_auto_unlock()
RETURNS trigger AS $$
DECLARE
_rel_id constant int := 'animals'::regclass::int;
_grp_id int;
BEGIN
-- Same ordinal position as in animals_id_auto().
_grp_id = array_length(enum_range(NULL, NEW.grp), 1);
-- Release the lock.
PERFORM pg_advisory_unlock(_rel_id, _grp_id);
RETURN NEW;
END;
$$ LANGUAGE plpgsql STRICT;
CREATE TRIGGER animals_id_auto_unlock
AFTER INSERT ON animals
FOR EACH ROW
EXECUTE PROCEDURE animals_id_auto_unlock();
INSERT INTO animals (grp,name) VALUES
('mammal','dog'),('mammal','cat'),
('bird','penguin'),('fish','lax'),('mammal','whale'),
('bird','ostrich');
SELECT * FROM animals ORDER BY grp,id;
This yields:
grp | id | name
--------+----+---------
fish | 1 | lax
mammal | 1 | dog
mammal | 2 | cat
mammal | 3 | whale
bird | 1 | penguin
bird | 2 | ostrich
(6 rows)
There is one caveat. Advisory locks are held until released or until the session expires. If an error occurs during the transaction, the lock is kept around and you need to release it manually.
SELECT pg_advisory_unlock('animals'::regclass::int, i)
FROM generate_series(1, array_length(enum_range(NULL::animal_grp),1)) i;
In Postgres 9.1 and later, you can discard the unlock trigger and replace the pg_advisory_lock() call with pg_advisory_xact_lock(). That one is held until, and automatically released at, the end of the transaction.
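The locking function then shrinks to something like this (a sketch of the 9.1+ variant; everything else from the example above stays the same):
CREATE OR REPLACE FUNCTION animals_id_auto()
RETURNS trigger AS $$
DECLARE
    _rel_id constant int := 'animals'::regclass::int;
    _grp_id int;
BEGIN
    _grp_id = array_length(enum_range(NULL, NEW.grp), 1);
    -- Held until the end of the transaction, then released automatically.
    PERFORM pg_advisory_xact_lock(_rel_id, _grp_id);
    SELECT COALESCE(MAX(id) + 1, 1)
    INTO NEW.id
    FROM animals
    WHERE grp = NEW.grp;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql STRICT;
-- The animals_id_auto_unlock function and trigger are no longer needed.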
On a separate note, I'd stick to using a good old sequence. That will make things faster -- even if it's not as pretty-looking when you look at the data.
Lastly, a unique sequence per (year, month) combo could also be obtained by adding an extra table, whose primary key is a serial, and whose (year, month) value has a unique constraint on it.
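That idea might look like this (just a sketch; the table and column names are made up):
CREATE TABLE report_period (
    id    serial PRIMARY KEY,   -- a unique id per (year, month) combo
    year  integer NOT NULL,
    month integer NOT NULL,
    UNIQUE (year, month)        -- one row per combo
);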
I think I found a better solution. It doesn't depend on the type of grp (it can be an enum, integer, or string) and can be used in many cases.
myFunc() - the trigger function. You can name it whatever you want.
number - the auto-increment column, which counts up separately for each existing value of grp.
grp - the column whose values you want to count in number.
myTrigger - the trigger for your table.
myTable - the table on which you want to create the trigger.
unique_grp_number_key - the unique constraint. We need it to keep each pair of (grp, number) values unique.
ALTER TABLE "myTable"
ADD CONSTRAINT "unique_grp_number_key" UNIQUE(grp, number);
CREATE OR REPLACE FUNCTION myFunc() RETURNS trigger AS $body_start$
BEGIN
SELECT COALESCE(MAX(number) + 1, 1)
INTO NEW.number
FROM "myTable"
WHERE grp = NEW.grp;
RETURN NEW;
END;
$body_start$ LANGUAGE plpgsql;
CREATE TRIGGER myTrigger BEFORE INSERT ON "myTable"
FOR EACH ROW
WHEN (NEW.number IS NULL)
EXECUTE PROCEDURE myFunc();
How does it work? When you insert something into myTable, the trigger fires and checks whether the number field is empty. If it is, myFunc() selects the MAX value of number among the rows whose grp equals the new row's grp value. It returns that maximum + 1, like auto_increment, and replaces the NULL number field with the new auto-increment value.
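For example, assuming a minimal "myTable" with just the grp and number columns (hypothetical data):
INSERT INTO "myTable" (grp) VALUES (1); -- number becomes 1
INSERT INTO "myTable" (grp) VALUES (1); -- number becomes 2
INSERT INTO "myTable" (grp) VALUES (2); -- number becomes 1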
This solution is more generic than Denis de Bernardy's because it doesn't depend on the type of grp; still, thanks to him: his code helped me write my solution.
Maybe it's too late to write an answer, but I couldn't find a type-independent solution for this problem on Stack Overflow, so it may help someone. Enjoy and thanks for the help!
I think this will help:
http://www.varlena.com/GeneralBits/130.php
Note that in MySQL it is for MyISAM tables only.
P.S. I have tested advisory locks and found them useless for more than one transaction at the same time. I am using two windows of pgAdmin. The first test is as simple as possible:
BEGIN;
INSERT INTO animals (grp,name) VALUES ('mammal','dog');
COMMIT;
BEGIN;
INSERT INTO animals (grp,name) VALUES ('mammal','cat');
COMMIT;
ERROR: duplicate key violates unique constraint "animals_pkey"
Second:
BEGIN;
INSERT INTO animals (grp,name) VALUES ('mammal','dog');
INSERT INTO animals (grp,name) VALUES ('mammal','cat');
COMMIT;
ERROR: deadlock detected
SQL state: 40P01
Detail: Process 3764 waits for ExclusiveLock on advisory lock [46462,46496,2,2]; blocked by process 2712.
Process 2712 waits for ShareLock on transaction 136759; blocked by process 3764.
Context: SQL statement "SELECT pg_advisory_lock( $1 , $2 )"
PL/pgSQL function "animals_id_auto" line 15 at perform
And the database is left locked and cannot be unlocked, because it is unknown which keys to unlock.
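If it is unclear which keys are held, the session holding them can at least release all of its advisory locks at once:
SELECT pg_advisory_unlock_all();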
Given the following (simplified) schema:
CREATE TABLE period (
id UUID NOT NULL DEFAULT uuid_generate_v4(),
name TEXT
);
CREATE TABLE course (
id UUID NOT NULL DEFAULT uuid_generate_v4(),
name TEXT
);
CREATE TABLE registration (
id UUID NOT NULL DEFAULT uuid_generate_v4(),
period_id UUID NOT NULL REFERENCES period(id),
course_id UUID NOT NULL REFERENCES course(id),
inserted_at timestamptz NOT NULL DEFAULT now()
);
I now want to add a new column client_ref, which identifies a registration uniquely within a period but consists of only a 4-character string. I want to base the column value on pg_hashids, which requires a unique integer input.
I was thinking of setting up a trigger on the registration table that runs on inserting a new row. I came up with the following:
CREATE OR REPLACE FUNCTION set_client_ref()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
next_row_number integer;
BEGIN
WITH rank AS (
SELECT
period.id AS period_id,
row_number() OVER (PARTITION BY period.id ORDER BY registration.inserted_at)
FROM
registration
JOIN period ON registration.period_id = period.id ORDER BY
period.id,
row_number
)
SELECT
COALESCE(rank.row_number, 0) + 1 INTO next_row_number
FROM
period
LEFT JOIN rank ON (rank.period_id = period.id)
WHERE
period.id = NEW.period_id
ORDER BY
rank.row_number DESC
LIMIT 1;
NEW.client_ref = id_encode (next_row_number);
RETURN NEW;
END
$function$
;
The trigger is set up like this: CREATE TRIGGER set_client_ref BEFORE INSERT ON registration FOR EACH ROW EXECUTE FUNCTION set_client_ref();
This works as expected when inserting a single row into registration, but if I insert multiple rows within one statement, they end up having the same client_ref. I can reason about why this happens (the rows don't know about each other's existence, so they each assume they're next in line when retrieving their row_number), but I am not sure how to prevent it. I tried setting up the trigger as an AFTER trigger, but it resulted in the same (duplicated) behaviour.
What would be a better way to get the lowest possible, unique integer for the rows to be inserted (to base the hash function on) that also works when inserting multiple rows?
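One direction that survives multi-row inserts is the counter-table approach shown in an answer further down: keep one counter row per period and increment it atomically. A sketch (period_counter and last_nr are made-up names, and it assumes period.id is the primary key):
CREATE TABLE period_counter (
    period_id UUID PRIMARY KEY REFERENCES period(id),
    last_nr   integer NOT NULL
);

CREATE OR REPLACE FUNCTION set_client_ref()
RETURNS trigger
LANGUAGE plpgsql
AS $function$
DECLARE
    next_row_number integer;
BEGIN
    -- Upsert the per-period counter; every invocation gets a distinct
    -- value, even for several rows inserted by a single statement.
    INSERT INTO period_counter AS pc (period_id, last_nr)
    VALUES (NEW.period_id, 1)
    ON CONFLICT (period_id)
    DO UPDATE SET last_nr = pc.last_nr + 1
    RETURNING last_nr INTO next_row_number;

    NEW.client_ref = id_encode(next_row_number);
    RETURN NEW;
END
$function$;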
I have a composite primary key in a table in PostgreSQL (I am using pgAdmin 4).
Let's call the two primary key columns productno and version.
version represents the version of productno.
So when I insert a new record, it needs to be checked whether a record with this productno already exists:
If productno doesn't exist yet, then version should be (version) 1
If productno exists once, then version should be 2
If productno exists twice, then version should be 3
... and so on
So that we get something like:
productno | version
----------+--------
        1 |       1
        1 |       2
        1 |       3
        2 |       1
        2 |       2
I found a quite similar problem: auto increment on composite primary key
But I can't use that solution because the PostgreSQL syntax is a bit different, so I experimented a lot with functions and triggers but couldn't figure out the right way to do it.
You can keep the version numbers in a separate table (one for each "base PK" value). That is way more efficient than doing a max() + 1 on every insert and has the additional benefit that it's safe for concurrent transactions.
So first we need a table that keeps track of the version numbers:
create table version_counter
(
product_no integer primary key,
version_nr integer not null
);
Then we create a function that increments the version for a given product_no and returns that new version number:
create function next_version(p_product_no int)
returns integer
as
$$
insert into version_counter (product_no, version_nr)
values (p_product_no, 1)
on conflict (product_no)
do update
set version_nr = version_counter.version_nr + 1
returning version_nr;
$$
language sql
volatile;
The trick here is the INSERT ... ON CONFLICT, which increments an existing value or inserts a new row if the passed product_no does not yet exist.
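A quick way to see the effect (note that this consumes version numbers exactly as the trigger below will):
SELECT next_version(42); -- returns 1, row for product_no 42 inserted
SELECT next_version(42); -- returns 2, existing row updated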
For the product table:
create table product
(
product_no integer not null,
version_nr integer not null,
created_at timestamp default clock_timestamp(),
primary key (product_no, version_nr)
);
then create a trigger:
create function increment_version()
returns trigger
as
$$
begin
new.version_nr := next_version(new.product_no);
return new;
end;
$$
language plpgsql;
create trigger base_table_insert_trigger
before insert on product
for each row
execute procedure increment_version();
This is safe for concurrent transactions because the row in version_counter will be locked for that product_no until the transaction inserting the row into the product table is committed - which will commit the change to the version_counter table as well (and free the lock on that row).
If two concurrent transactions insert the same value for product_no, one of them will wait until the other finishes.
If two concurrent transactions insert different values for product_no, they can work without having to wait for the other.
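Illustrated with two concurrent sessions (a sketch of the timing, not literal psql output):
-- Session A                          -- Session B
BEGIN;
INSERT INTO product (product_no)
VALUES (1); -- gets version_nr 1
                                      BEGIN;
                                      INSERT INTO product (product_no)
                                      VALUES (1);
                                      -- blocks: the version_counter row
                                      -- for product_no = 1 is locked
COMMIT;
                                      -- unblocks, gets version_nr 2
                                      COMMIT;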
If we then insert these rows:
insert into product (product_no) values (1);
insert into product (product_no) values (2);
insert into product (product_no) values (3);
insert into product (product_no) values (1);
insert into product (product_no) values (3);
insert into product (product_no) values (2);
The product table looks like this:
select *
from product
order by product_no, version_nr;
product_no | version_nr | created_at
-----------+------------+------------------------
1 | 1 | 2019-08-23 10:50:57.880
1 | 2 | 2019-08-23 10:50:57.947
2 | 1 | 2019-08-23 10:50:57.899
2 | 2 | 2019-08-23 10:50:57.989
3 | 1 | 2019-08-23 10:50:57.926
3 | 2 | 2019-08-23 10:50:57.966
Online example: https://rextester.com/CULK95702
You can do it like this (inside a PL/pgSQL function; mytable stands in for your table name, since table itself is a reserved word):
-- Check whether the pk already exists
SELECT pk INTO temp_pk FROM mytable a WHERE a.pk = v_pk1;
-- If it exists, insert the new row referencing it
IF temp_pk IS NOT NULL THEN
    INSERT INTO mytable (pk, versionpk) VALUES (v_pk1, temp_pk);
END IF;
So, I got it to work now.
If you want a column to update depending on another column in PostgreSQL, have a look at this:
This is the function I use:
CREATE FUNCTION public.testfunction()
RETURNS trigger
LANGUAGE 'plpgsql'
COST 100
VOLATILE NOT LEAKPROOF
AS $BODY$
DECLARE v_productno INTEGER := NEW.productno;
BEGIN
IF NOT EXISTS (SELECT *
FROM testtable
WHERE productno = v_productno)
THEN
NEW.version := 1;
ELSE
NEW.version := (SELECT MAX(testtable.version)+1
FROM testtable
WHERE testtable.productno = v_productno);
END IF;
RETURN NEW;
END;
$BODY$;
And this is the trigger that runs the function:
CREATE TRIGGER testtrigger
BEFORE INSERT
ON public.testtable
FOR EACH ROW
EXECUTE PROCEDURE public.testfunction();
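For example (hypothetical data, assuming testtable has just productno and version):
INSERT INTO testtable (productno) VALUES (1); -- version becomes 1
INSERT INTO testtable (productno) VALUES (1); -- version becomes 2
INSERT INTO testtable (productno) VALUES (2); -- version becomes 1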
Thank you @ChechoCZ, you definitely helped me head in the right direction.
I have created the following function in Postgres 9.3.5:
CREATE OR REPLACE FUNCTION get_result(val1 text, val2 text)
RETURNS text AS
$BODY$
DECLARE
result text;
BEGIN
select min(id) into result from table
where id_used is null and id_type = val2;
update table set
id_used = 'Y',
col1 = val1,
id_used_date = now()
where id_type = val2
and id = result;
RETURN result;
END;
$BODY$
LANGUAGE plpgsql VOLATILE COST 100;
When I run this function in a loop over 1000 or more records, it just freezes and says "query is running". When I check my table, nothing is being updated. When I run it for one or two records, it runs fine.
Example of the function when being run:
select get_result('123','idtype');
table columns:
id character varying(200),
col1 character varying(200),
id_used character varying(1),
id_used_date timestamp without time zone,
id_type character(200)
id is the table index.
Can someone help?
Most probably you are running into race conditions. When you run your function a 1000 times in quick succession in separate transactions, something like this happens:
T1              T2              T3 ...
SELECT min(id)  -- id 1
                SELECT min(id)  -- id 1
                                SELECT min(id)  -- id 1
...
UPDATE id 1
                Row id 1 locked, wait ...
                                Row id 1 locked, wait ...
COMMIT
                Wake up, UPDATE id 1 again!
                COMMIT
                                Wake up, UPDATE id 1 again!
                                COMMIT
...
Largely rewritten and simplified as SQL function:
CREATE OR REPLACE FUNCTION get_result(val1 text, val2 text)
RETURNS text AS
$func$
UPDATE table t
SET id_used = 'Y'
, col1 = val1
, id_used_date = now()
FROM (
SELECT id
FROM table
WHERE id_used IS NULL
AND id_type = val2
ORDER BY id
LIMIT 1
FOR UPDATE -- lock to avoid race condition! see below ...
) t1
WHERE t.id_type = val2
-- AND t.id_used IS NULL -- repeat condition (not if row is locked)
AND t.id = t1.id
RETURNING id;
$func$ LANGUAGE sql;
Related question with a lot more explanation:
Atomic UPDATE .. SELECT in Postgres
Explain
Don't run two separate SQL statements. That is more expensive and widens the time frame for race conditions. One UPDATE with a subquery is much better.
You don't need PL/pgSQL for this simple task. You can still use PL/pgSQL if you prefer; the UPDATE stays the same (see the sketch further below).
You need to lock the selected row to defend against race conditions. But you cannot do this with the aggregate function you had, because, per the documentation:
The locking clauses cannot be used in contexts where returned rows
cannot be clearly identified with individual table rows; for example
they cannot be used with aggregation.
Bold emphasis mine. Luckily, you can replace min(id) easily with the equivalent ORDER BY / LIMIT 1 I provided above. Can use an index just as well.
If the table is big, you need an index on id at least. Assuming that id is indexed already as PRIMARY KEY, that would help. But this additional partial multicolumn index would probably help a lot more:
CREATE INDEX foo_idx ON table (id_type, id)
WHERE id_used IS NULL;
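As mentioned above, the same UPDATE also works wrapped in PL/pgSQL (a sketch; mytable stands in for the real table name, since table is a reserved word):
CREATE OR REPLACE FUNCTION get_result(val1 text, val2 text)
RETURNS text AS
$func$
DECLARE
   result text;
BEGIN
   UPDATE mytable t
   SET    id_used = 'Y'
        , col1 = val1
        , id_used_date = now()
   FROM  (
      SELECT id
      FROM   mytable
      WHERE  id_used IS NULL
      AND    id_type = val2
      ORDER  BY id
      LIMIT  1
      FOR    UPDATE   -- lock the row, as in the SQL version
      ) t1
   WHERE  t.id_type = val2
   AND    t.id = t1.id
   RETURNING t.id
   INTO   result;

   RETURN result;
END
$func$ LANGUAGE plpgsql;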
Alternative solutions
Advisory locks may be the superior approach here:
Postgres UPDATE ... LIMIT 1
Or you may want to lock many rows at once:
How to mark certain nr of rows in table on concurrent access
When using table inheritance, I would like to enforce that INSERT, UPDATE and DELETE statements are run against descendant tables only. I thought a simple way to do this would be a trigger function like this:
CREATE FUNCTION test.prevent_action() RETURNS trigger AS $prevent_action$
BEGIN
RAISE EXCEPTION
'% on % is not allowed. Perform % on descendant tables only.',
TG_OP, TG_TABLE_NAME, TG_OP;
END;
$prevent_action$ LANGUAGE plpgsql;
...which I would reference from a trigger specified using BEFORE INSERT OR UPDATE OR DELETE.
This seems to work fine for inserts, but not for updates and deletes.
The following test sequence demonstrates what I've observed:
DROP SCHEMA IF EXISTS test CASCADE;
psql:simple.sql:1: NOTICE: schema "test" does not exist, skipping
DROP SCHEMA
CREATE SCHEMA test;
CREATE SCHEMA
-- A function to prevent anything
-- Used for tables that are meant to be inherited
CREATE FUNCTION test.prevent_action() RETURNS trigger AS $prevent_action$
BEGIN
RAISE EXCEPTION
'% on % is not allowed. Perform % on descendant tables only.',
TG_OP, TG_TABLE_NAME, TG_OP;
END;
$prevent_action$ LANGUAGE plpgsql;
CREATE FUNCTION
CREATE TABLE test.people (
person_id SERIAL PRIMARY KEY,
last_name text,
first_name text
);
psql:simple.sql:17: NOTICE: CREATE TABLE will create implicit sequence "people_person_id_seq" for serial column "people.person_id"
psql:simple.sql:17: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "people_pkey" for table "people"
CREATE TABLE
CREATE TRIGGER prevent_action BEFORE INSERT OR UPDATE OR DELETE ON test.people
FOR EACH ROW EXECUTE PROCEDURE test.prevent_action();
CREATE TRIGGER
CREATE TABLE test.students (
student_id SERIAL PRIMARY KEY
) INHERITS (test.people);
psql:simple.sql:24: NOTICE: CREATE TABLE will create implicit sequence "students_student_id_seq" for serial column "students.student_id"
psql:simple.sql:24: NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "students_pkey" for table "students"
CREATE TABLE
--The trigger successfully prevents this INSERT from happening
--INSERT INTO test.people (last_name, first_name) values ('Smith', 'Helen');
INSERT INTO test.students (last_name, first_name) values ('Smith', 'Helen');
INSERT 0 1
INSERT INTO test.students (last_name, first_name) values ('Anderson', 'Niles');
INSERT 0 1
UPDATE test.people set first_name = 'Oh', last_name = 'Noes!';
UPDATE 2
SELECT student_id, person_id, first_name, last_name from test.students;
student_id | person_id | first_name | last_name
------------+-----------+------------+-----------
1 | 1 | Oh | Noes!
2 | 2 | Oh | Noes!
(2 rows)
DELETE FROM test.people;
DELETE 2
SELECT student_id, person_id, first_name, last_name from test.students;
student_id | person_id | first_name | last_name
------------+-----------+------------+-----------
(0 rows)
So I'm wondering what I've done wrong that allows updates and deletes directly against the test.people table in this example.
The trigger is set to execute FOR EACH ROW, but there are no rows stored in test.people itself (they physically live in test.students), so the row-level trigger on the parent never fires for UPDATE and DELETE.
As a sidenote, you may issue select * from ONLY test.people to list the rows in test.people that don't belong to child tables.
The solution seems easy: set a trigger FOR EACH STATEMENT instead of FOR EACH ROW, since you want to forbid the whole statement anyway.
CREATE TRIGGER prevent_action BEFORE INSERT OR UPDATE OR DELETE ON test.people
FOR EACH STATEMENT EXECUTE PROCEDURE test.prevent_action();
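With the statement-level trigger in place, even a statement that touches no rows in the parent is rejected (hypothetical session):
UPDATE test.people SET first_name = 'Oh', last_name = 'Noes!';
ERROR:  UPDATE on people is not allowed. Perform UPDATE on descendant tables only.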
To automatically add a row to a second table, tying it to the first table via a unique index, I have a rule such as the following:
CREATE OR REPLACE RULE auto_insert AS ON INSERT TO user DO ALSO
INSERT INTO lastlogin (id) VALUES (NEW.userid);
This works fine if user.userid is an integer. However, if it is a sequence (e.g., type serial or bigserial), what is inserted into table lastlogin is the next sequence id. So this command:
INSERT INTO user (username) VALUES ('john');
would insert the row [1, 'john', ...] into user but the row [2, ...] into lastlogin. The following two workarounds do work, except that the second one consumes twice as many serials since the sequence is still auto-incrementing:
CREATE OR REPLACE RULE auto_insert AS ON INSERT TO user DO ALSO
INSERT INTO lastlogin (id) VALUES (lastval());
CREATE OR REPLACE RULE auto_insert AS ON INSERT TO user DO ALSO
INSERT INTO lastlogin (id) VALUES (NEW.userid-1);
Unfortunately, the workarounds do not work if I'm inserting multiple rows:
INSERT INTO user (username) VALUES ('john'), ('mary');
The first workaround would use the same id for both rows, and the second workaround is completely broken.
Is it possible to do this via PostgreSQL rules, or should I simply do the second insertion into lastlogin myself, or use a row trigger? Actually, I think the row trigger would also auto-increment the sequence when I access NEW.userid.
Forget rules altogether. They're bad.
Triggers are way better for you, and that holds in 99% of the cases where someone thinks they need a rule. Try this:
create table users (
userid serial primary key,
username text
);
create table lastlogin (
userid int primary key references users(userid),
lastlogin_time timestamp with time zone
);
create or replace function lastlogin_create_id() returns trigger as $$
begin
insert into lastlogin (userid) values (NEW.userid);
return NEW;
end;
$$
language plpgsql volatile;
create trigger lastlogin_create_id
after insert on users for each row execute procedure lastlogin_create_id();
Then:
insert into users (username) values ('foo'),('bar');
select * from users;
userid | username
--------+----------
1 | foo
2 | bar
(2 rows)
select * from lastlogin;
userid | lastlogin_time
--------+----------------
1 |
2 |
(2 rows)