How to ensure a unique per-user order number field - PostgreSQL

Here is a table with the fields id, id_user and order_id.
When creating a record, I need to find the user's last order number and insert the next one in sequence.
I wrote a trigger function that fetches the next order number for the user, but even that does not guarantee a unique order number.
CREATE OR REPLACE FUNCTION get_next_order()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
DECLARE
    next_order_num bigint;
BEGIN
    SELECT order_id + 1 INTO next_order_num
    FROM payment_out
    WHERE payment_out.id_usr = NEW.id_usr
      AND payment_out.order_id IS NOT NULL
    ORDER BY payment_out.order_id DESC
    LIMIT 1;

    -- if no previous payment exists, start at 1
    NEW.order_id := coalesce(next_order_num, 1);
    RETURN NEW;
END;
$function$;
CREATE TRIGGER get_next_order
BEFORE INSERT ON payment_out
FOR EACH ROW
EXECUTE PROCEDURE get_next_order();
How can I avoid duplicate order numbers?

For this to work in the presence of multiple concurrent transactions inserting orders for the same user, you need a lock on a particular record to make them wait and execute serially.
e.g., before the first SELECT, you might:
PERFORM 1 FROM "users" where id_user = NEW.id_user FOR UPDATE;
where you lock the parent "users" record that owns the orders.
Otherwise, multiple concurrent transactions could execute your procedure at the same time, but they can't see each other's uncommitted inserted values, so they'll pick the same numbers.
However, beware: a foreign key constraint will already have taken a SHARE lock on the users row when you insert into a table that references it. Your trigger will then try to upgrade that to an exclusive row lock, but several transactions might already hold the SHARE lock, so the upgrade will block. You'll end up with transactions all waiting for each other until PostgreSQL aborts all but one of them with a deadlock error. The only way to avoid this is for the application to SELECT 1 FROM users WHERE id_user = blahblah FOR UPDATE before it creates the orders for that user.
A variant is to keep a next_order_id field in users and do an UPDATE users SET next_order_id = next_order_id + 1 WHERE id_user = ... RETURNING next_order_id, and use the result of that to set the order ID. The same lock upgrade problem applies.
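For illustration, a rough, untested sketch of that variant (it assumes users has a next_order_id bigint column, and that users.id_user corresponds to payment_out.id_usr - both assumptions, not part of the original schema):

CREATE OR REPLACE FUNCTION get_next_order()
RETURNS TRIGGER
LANGUAGE plpgsql
AS $function$
BEGIN
    -- The UPDATE takes a row lock on the user's row, so concurrent inserts
    -- for the same user serialize here (the FK lock-upgrade caveat above still applies).
    UPDATE users
       SET next_order_id = next_order_id + 1
     WHERE id_user = NEW.id_usr
    RETURNING next_order_id INTO NEW.order_id;
    RETURN NEW;
END;
$function$;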

Related

Serial id next value not getting sequential after failed insert attempt [duplicate]

I'm moving from MySQL to Postgres, and I noticed that when you delete rows in MySQL, the unique ids for those rows are re-used when you create new ones. With Postgres, if you create rows and delete them, the unique ids are not used again.
Is there a reason for this behaviour in Postgres? Can I make it act more like MySQL in this case?
Sequences have gaps to permit concurrent inserts. Attempting to avoid gaps or to re-use deleted IDs creates horrible performance problems. See the PostgreSQL wiki FAQ.
PostgreSQL SEQUENCEs are used to allocate IDs. These only ever increase, and they're exempt from the usual transaction rollback rules to permit multiple transactions to grab new IDs at the same time. This means that if a transaction rolls back, those IDs are "thrown away"; there's no list of "free" IDs kept, just the current ID counter. Sequences are also usually incremented if the database shuts down uncleanly.
Synthetic keys (IDs) are meaningless anyway. Their order is not significant, their only property of significance is uniqueness. You can't meaningfully measure how "far apart" two IDs are, nor can you meaningfully say if one is greater or less than another. All you can do is say "equal" or "not equal". Anything else is unsafe. You shouldn't care about gaps.
If you need a gapless sequence that re-uses deleted IDs, you can have one, you just have to give up a huge amount of performance for it - in particular, you cannot have any concurrency on INSERTs at all, because you have to scan the table for the lowest free ID, locking the table for write so no other transaction can claim the same ID. Try searching for "postgresql gapless sequence".
The simplest approach is to use a counter table and a function that gets the next ID. Here's a generalized version that uses a counter table to generate consecutive gapless IDs; it doesn't re-use IDs, though.
CREATE TABLE thetable_id_counter ( last_id integer not null );
INSERT INTO thetable_id_counter VALUES (0);
CREATE OR REPLACE FUNCTION get_next_id(countertable regclass, countercolumn text) RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
COMMENT ON FUNCTION get_next_id(regclass, text) IS 'Increment and return value from integer column $2 in table $1';
Usage:
INSERT INTO dummy(id, blah)
VALUES ( get_next_id('thetable_id_counter','last_id'), 42 );
Note that when one open transaction has obtained an ID, all other transactions that try to call get_next_id will block until the first transaction commits or rolls back. This is unavoidable; for gapless IDs it is by design.
If you want to store multiple counters for different purposes in a table, just add a parameter to the above function, add a column to the counter table, and add a WHERE clause to the UPDATE that matches the parameter to the added column. That way you can have multiple independently-locked counter rows. Do not just add extra columns for new counters.
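For illustration, a sketch of that multi-counter variant; the id_counters table, the counter_name column and the get_next_id_for name are hypothetical, and the code is untested:

-- One row per counter; each row is locked independently by the UPDATE.
CREATE TABLE id_counters (
    counter_name text PRIMARY KEY,
    last_id      integer NOT NULL DEFAULT 0
);
INSERT INTO id_counters (counter_name) VALUES ('invoices'), ('orders');

-- Same idea as get_next_id, with a WHERE clause selecting the counter row.
CREATE OR REPLACE FUNCTION get_next_id_for(countername text) RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    UPDATE id_counters
       SET last_id = last_id + 1
     WHERE counter_name = countername
    RETURNING last_id INTO next_value;
    RETURN next_value;
END;
$$ LANGUAGE plpgsql;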
The get_next_id function above does not re-use deleted IDs; it just avoids introducing gaps.
To re-use IDs I advise ... not re-using IDs.
If you really must, you can do so by adding an ON INSERT OR UPDATE OR DELETE trigger on the table of interest that adds deleted IDs to a free-list side table and removes them from the free-list table when they're INSERTed. Treat an UPDATE as a DELETE followed by an INSERT. Now modify the ID generation function above so that it does a SELECT free_id INTO next_value FROM free_ids LIMIT 1 FOR UPDATE and, if a row is found, DELETEs it; if not found, it gets a new ID from the counter table as normal. Here's an untested extension of the prior function to support re-use:
CREATE OR REPLACE FUNCTION get_next_id_reuse(countertable regclass, countercolumn text, freelisttable regclass, freelistcolumn text) RETURNS integer AS $$
DECLARE
    next_value integer;
BEGIN
    EXECUTE format('SELECT %I FROM %s LIMIT 1 FOR UPDATE', freelistcolumn, freelisttable) INTO next_value;
    IF next_value IS NOT NULL THEN
        EXECUTE format('DELETE FROM %s WHERE %I = %L', freelisttable, freelistcolumn, next_value);
    ELSE
        EXECUTE format('UPDATE %s SET %I = %I + 1 RETURNING %I', countertable, countercolumn, countercolumn, countercolumn) INTO next_value;
    END IF;
    RETURN next_value;
END;
$$ LANGUAGE plpgsql;
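A possible shape for the free-list side table, the trigger that feeds it, and a usage example - all names are illustrative and the code is untested:

-- Free-list table matching the function's freelisttable/freelistcolumn parameters.
CREATE TABLE free_ids ( free_id integer PRIMARY KEY );

-- Returns deleted IDs to the free list (UPDATEs of the id column would need
-- similar handling, per the description above).
CREATE OR REPLACE FUNCTION dummy_recycle_id() RETURNS trigger AS $$
BEGIN
    INSERT INTO free_ids (free_id) VALUES (OLD.id);
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER dummy_recycle_id
AFTER DELETE ON dummy
FOR EACH ROW EXECUTE PROCEDURE dummy_recycle_id();

-- Usage is the same as before:
INSERT INTO dummy(id, blah)
VALUES ( get_next_id_reuse('thetable_id_counter','last_id','free_ids','free_id'), 42 );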

Is it safe to use temporary tables when an application may try to create them for independent, but simultaneous processes?

I am hoping that I can articulate this effectively, so here it goes:
I am creating a model which will be run on a platform by users, possibly simultaneously, but each model run is marked by a unique integer identifier. This model will execute a series of PostgreSQL queries and eventually write a result elsewhere.
Now because of the required parallelization of model runs, I have to make sure that the processes will not collide, despite running in the same database. I am at a point now where I have to store a list of records, sorted by a score variable and then operate on them. This is the beginning of the query:
DO
$$
DECLARE
    row RECORD;
BEGIN
    DROP TABLE IF EXISTS ranked_clusters;

    CREATE TEMP TABLE ranked_clusters AS (
        SELECT
            pl.cluster_id      AS c_id,
            SUM(pl.total_area) AS cluster_score
        FROM emob.parking_lots AS pl
        WHERE pl.cluster_id IS NOT NULL
          AND pl.run_id = 2005149
        GROUP BY pl.cluster_id
        ORDER BY cluster_score DESC
    );

    FOR row IN SELECT c_id FROM ranked_clusters LOOP
        RAISE NOTICE 'Cluster %', row.c_id;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
So I create a temporary table called ranked_clusters and then iterate through it, at the moment just logging the identifiers of each record.
I have been careful to only build this list from records which have a run_id value equal to a certain number, so data from the same source, but with a different number will be ignored.
What I am worried about however is that a simultaneous process will also create its own ranked_clusters temporary table, which will collide with the first one, invalidating the results.
So my question is essentially this: Are temporary tables only visible to the session which creates them (or to the cursor object from say, Python)? And is it therefore safe to use a temporary table in this way?
The main reason I ask is because I see that these so-called "temporary" tables seem to persist after I execute the query in PgAdmin III, and the query fails on the next execution because the table already exists. This troubles me because it seems as though the tables are actually globally accessible during their lifetime and would therefore introduce the possibility of a collision when a simultaneous run occurs.
Thanks @a_horse_with_no_name for the explanation, but I am not yet convinced that it is safe, because I have been able to execute the following code:
import psycopg2 as pg2

conn = pg2.connect(dbname=CONFIG["GEODB_NAME"],
                   user=CONFIG["GEODB_USER"],
                   password=CONFIG["GEODB_PASS"],
                   host=CONFIG["GEODB_HOST"],
                   port=CONFIG["GEODB_PORT"])
conn.autocommit = True
cur = conn.cursor()

conn2 = pg2.connect(dbname=CONFIG["GEODB_NAME"],
                    user=CONFIG["GEODB_USER"],
                    password=CONFIG["GEODB_PASS"],
                    host=CONFIG["GEODB_HOST"],
                    port=CONFIG["GEODB_PORT"])
conn2.autocommit = True
cur2 = conn.cursor()  # note the typo: this should be conn2.cursor() - see the correction below

cur.execute("CREATE TEMPORARY TABLE temptable (tempcol INTEGER); INSERT INTO temptable VALUES (0);")
cur2.execute("SELECT tempcol FROM temptable;")
print(cur2.fetchall())
And I receive the value in temptable despite it being created as a temporary table in a completely different connection as the one which queries it afterwards. Am I missing something here? Because it seems like the temporary table is indeed accessible between connections.
The above had a typo: both cursors were actually being spawned from conn, rather than one from conn and the other from conn2. Separate connections in psycopg2 cannot access each other's temporary tables, but cursors spawned from the same connection can.
Temporary tables are only visible to the session (=connection) that created them. Even if two sessions create the same table, they won't interfere with each other.
Temporary tables are removed automatically when the session is disconnected.
If you want to automatically remove them when your transaction ends, use the ON COMMIT DROP option when creating the table.
So the answer is: yes, this is safe.
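For example, a minimal sketch of the ON COMMIT DROP variant of the temp table from the question (untested); it must be created and read within the same transaction, since it is dropped automatically at commit:

CREATE TEMP TABLE ranked_clusters ON COMMIT DROP AS
SELECT pl.cluster_id      AS c_id,
       SUM(pl.total_area) AS cluster_score
FROM   emob.parking_lots AS pl
WHERE  pl.cluster_id IS NOT NULL
  AND  pl.run_id = 2005149
GROUP BY pl.cluster_id;

-- Because the table is dropped at the end of the transaction, re-running the
-- block in the same session does not hit "relation already exists".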
Unrelated, but: you can't store rows "in a sorted way". Rows in a table have no implicit sort order. The only way you can get a guaranteed sort order is to use an ORDER BY when selecting the rows. The order by that is part of your CREATE TABLE AS statement is pretty much useless.
If you have to rely on the sort order of the rows, the only safe way to do that is in the SELECT statement:
FOR row IN SELECT c_id FROM ranked_clusters ORDER BY cluster_score
LOOP
RAISE NOTICE 'Cluster %', row.c_id;
END LOOP;

How can I get my table to populate using a trigger function?

I have a function:
CREATE OR REPLACE FUNCTION delete_student()
RETURNS TRIGGER AS
$BODY$
BEGIN
    IF (TG_OP = 'DELETE') THEN
        INSERT INTO cancel(eno, excode, sno, cdate, cuser)
        VALUES ((SELECT entry.eno
                 FROM entry
                 JOIN student ON (entry.sno = student.sno)
                 WHERE entry.sno = OLD.sno),
                (SELECT entry.excode
                 FROM entry
                 JOIN student ON (entry.sno = student.sno)
                 WHERE entry.sno = OLD.sno),
                OLD.sno, current_timestamp, current_user);
    END IF;
    RETURN OLD;
END; $BODY$ LANGUAGE plpgsql;
and I also have the trigger:
CREATE TRIGGER delete_student
BEFORE DELETE
on student
FOR EACH ROW
EXECUTE PROCEDURE delete_student();
The idea is that when I delete a student from the student relation, the corresponding entry in the entry relation is also deleted and my cancel relation is updated.
This is what I put into my student relation:
INSERT INTO student(sno, sname, semail)
VALUES (1, 'a. adedeji', 'ayooladedeji@live.com');
And this is what I put into my entry relation:
INSERT INTO entry(excode, sno, egrade)
VALUES (1, 1, 98.56);
When I execute the command
DELETE FROM student WHERE sno = 1;
it deletes the student and also the corresponding entry, and the query returns no errors. However, when I run a SELECT on my cancel table, it shows up empty.
You do not show how the corresponding entry is deleted. If the entry is deleted before the student record, that causes the problem: by the time the trigger runs, the SELECT statements no longer find a matching entry, so there are no values to insert. Is the corresponding entry deleted through a CASCADE delete on student?
Also, your trigger can be much simpler:
CREATE OR REPLACE FUNCTION delete_student() RETURNS trigger AS $BODY$
BEGIN
    INSERT INTO cancel(eno, excode, sno, cdate, cuser)
    SELECT eno, excode, sno, current_timestamp, current_user
    FROM entry
    WHERE sno = OLD.sno;
    RETURN OLD;
END; $BODY$ LANGUAGE plpgsql;
First of all, the function only fires on a DELETE trigger, so you do not have to test for TG_OP. Secondly, in the INSERT statement you never access any data from the student relation so there is no need to JOIN to that relation; the sno does come from the student relation, but through the OLD implicit parameter.
You didn't post your DB schema and it's not very clear what your problem is, but it looks like a cascade delete is interfering somewhere. Specifically:
Before deleting the student, you insert something into cancel that references it.
Postgres proceeds to delete the row in student.
Postgres proceeds to honor all applicable cascade deletes.
Postgres deletes rows in entry and ... cancel (including the one you just inserted).
A few remarks:
Firstly, and as a rule of thumb, BEFORE triggers should never have side effects on anything but the row itself. Inserting a row in a BEFORE DELETE trigger is a big no-no: besides introducing potential problems such as Postgres reporting an incorrect FOUND value or incorrect row counts upon completing the query, consider the case where a separate BEFORE trigger cancels the delete altogether by returning NULL. As such, your trigger function should run as an AFTER trigger -- only at that point can you be sure that the row is indeed deleted.
Secondly, you don't need these inefficient, redundant, and ugly-as-sin sub-select statements. Use the insert ... select ... variety of inserts instead:
INSERT INTO cancel(eno, excode, sno, cdate, cuser)
SELECT entry.eno, entry.excode, OLD.sno, current_timestamp, current_user
FROM entry
WHERE entry.sno = OLD.sno;
Thirdly, your trigger should probably be running on the entry table, like so:
INSERT INTO cancel(eno, excode, sno, cdate, cuser)
SELECT OLD.eno, OLD.excode, OLD.sno, current_timestamp, current_user;
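Put together, a rough, untested sketch of such a trigger on entry might look like this (the log_cancelled_entry name is just for illustration, and it assumes cancel does not itself cascade-delete along with entry):

CREATE OR REPLACE FUNCTION log_cancelled_entry() RETURNS trigger AS $BODY$
BEGIN
    -- OLD carries the deleted entry row, so no lookup is needed
    INSERT INTO cancel(eno, excode, sno, cdate, cuser)
    VALUES (OLD.eno, OLD.excode, OLD.sno, current_timestamp, current_user);
    RETURN OLD;
END;
$BODY$ LANGUAGE plpgsql;

CREATE TRIGGER log_cancelled_entry
AFTER DELETE ON entry
FOR EACH ROW EXECUTE PROCEDURE log_cancelled_entry();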
Lastly, there might be a few problems in your schema. If there is a unique row in entry for each row in student, and you need information in entry to make your trigger work in order to fill in cancel, it probably means the two tables (student and entry) ought to be merged. Whether you merge them or not, you might also need to remove (or manually manage) some cascade deletes where applicable, in order to enforce the business logic in the order you need it to run.

PL/pgSQL query in PostgreSQL returns result for new, empty table

I am learning to use triggers in PostgreSQL but run into an issue with this code:
CREATE OR REPLACE FUNCTION checkAdressen() RETURNS TRIGGER AS $$
DECLARE
    adrCnt int := 0;
BEGIN
    SELECT count(*) INTO adrCnt
    FROM Adresse
    WHERE gehoert_zu = NEW.kundenId;

    IF adrCnt < 1 OR adrCnt > 3 THEN
        RAISE EXCEPTION 'Customer must have 1 to 3 addresses.';
    ELSE
        RAISE EXCEPTION 'No exception';
    END IF;
END;
$$ LANGUAGE plpgsql;
I create a trigger with this procedure after freshly creating all my tables so they are all empty. However the count(*) function in the above code returns 1.
When I run SELECT count(*) FROM adresse; outside of PL/pgSQL, I get 0.
I tried using the FOUND variable but it is always true.
Even more strangely, when I insert some values into my tables and then delete them again so that they are empty again, the code works as intended and count(*) returns 0.
Also if I leave out the WHERE gehoert_zu = NEW.kundenId, count(*) returns 0 which means I get more results with the WHERE clause than without.
--Edit:
Here is an example of how I use the procedure:
CREATE TABLE kunde (
kundenId int PRIMARY KEY
);
CREATE TABLE adresse (
id int PRIMARY KEY,
gehoert_zu int REFERENCES kunde
);
CREATE CONSTRAINT TRIGGER adressenKonsistenzTrigger AFTER INSERT ON Kunde
DEFERRABLE INITIALLY DEFERRED
FOR EACH ROW
EXECUTE PROCEDURE checkAdressen();
INSERT INTO kunde VALUES (1);
INSERT INTO adresse VALUES (1,1);
It looks like I am getting the DEFERRABLE INITIALLY DEFERRED part wrong. I assumed the trigger would be executed after the first INSERT statement, but it happens after the second one, although the inserts are not inside a BEGIN/COMMIT block.
According to the PostgreSQL documentation, inserts are committed automatically when they are not inside such a block, so there shouldn't yet be an entry in adresse when the first INSERT statement is committed.
Can anyone point out my mistake?
--Edit:
The trigger and DEFERRABLE INITIALLY DEFERRED seem to be working all right.
My mistake was to assume that since I am not using a BEGIN/COMMIT block, each insert would be executed in its own transaction, with the trigger being executed afterwards each time.
However, even without the explicit BEGIN/COMMIT, all the inserts got bundled into one transaction and the trigger was executed afterwards.
Given this behaviour, what is the point in using BEGIN-COMMIT?
You need a transaction plus the DEFERRABLE INITIALLY DEFERRED because of the chicken-and-egg problem.
Starting with two empty tables:
you cannot insert a single row into the kunde table, because the trigger requires it to have at least one address;
you cannot insert a single row into the adresse table, because the FK constraint needs a corresponding row in kunde to exist.
This is why you need to bundle the two inserts into one operation: the transaction. You need the BEGIN + COMMIT, and the DEFERRABLE allows transient forbidden database states to exist: it causes the check to be evaluated at commit time.
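With the constraint trigger from the question, that looks like this:

BEGIN;
INSERT INTO kunde VALUES (1);
INSERT INTO adresse VALUES (1, 1);
COMMIT;
-- the deferred trigger fires here, at COMMIT, when the customer already has an address
-- (with the original function it still raises 'No exception'; see the RAISE INFO note below)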
This may seem a bit silly, but the answer is you need to stop deferring the trigger and run it BEFORE the insert. If you run it after the insert, of course there is data in the table.
As far as I can tell this is working as expected.
One further note: you probably don't mean:
RAISE EXCEPTION 'No Exception';
You probably want
RAISE INFO 'No Exception';
Then you can change your settings and run queries in transactions to test that the trigger does what you want it to do. As it is, every insert is going to fail and you have no way to move this into production without editing your procedure.
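For example, a sketch of the adjusted function; note that after switching to RAISE INFO the function must still return something, and for an AFTER trigger the return value is ignored, so RETURN NULL is fine:

CREATE OR REPLACE FUNCTION checkAdressen() RETURNS TRIGGER AS $$
DECLARE
    adrCnt int := 0;
BEGIN
    SELECT count(*) INTO adrCnt
    FROM Adresse
    WHERE gehoert_zu = NEW.kundenId;

    IF adrCnt < 1 OR adrCnt > 3 THEN
        RAISE EXCEPTION 'Customer must have 1 to 3 addresses.';
    ELSE
        RAISE INFO 'No exception';
    END IF;
    RETURN NULL;  -- return value of an AFTER trigger is ignored
END;
$$ LANGUAGE plpgsql;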

After Insert Trigger....Looping

What would the processing load concern be if I had an "After Insert" trigger created on a table and in that trigger I performed a While loop to iterate through "potentially" multiple rows?
The end result is that 99.999% of the time I will have only 1 row, but as the future is unpredictable, I also want to be able to handle multiple rows being inserted.
Trigger Model:
1) Insert information into the table
2) Create views specific to the client, via stored procedures (if possible)
What Say You? :)
I haven't fully developed it, but this is the design I am looking for; it may not be syntactically sound, but it should get the point across.
CREATE TRIGGER dbo.New_Client_Setup
ON dbo.client
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Fill temp table
    SELECT * INTO #clients
    FROM inserted;

    -- Iterate through temp table
    WHILE (SELECT count(*) FROM #clients) <> 0
    BEGIN
        DECLARE @id int, @clnt nvarchar(10);

        SELECT TOP(1)
            @id = id,
            @clnt = short
        FROM #clients
        ORDER BY id DESC;

        EXECUTE dbo.sp_Create_View_Client @id, @clnt;

        -- Drop used ID
        DELETE FROM #clients
        WHERE id = @id;
    END

    DROP TABLE #clients;
END
GO
Again, look at the design of the trigger rather than the exact syntax.
Design-wise, reading the comments, I think you do not necessarily need to do this in a trigger. I would do it as part of your insert logic, inside a transaction - i.e. do the insert, and then do the loop that you want to do (whatever that does - execute dbo.sp_Create_View_Client)...
The second thing I would ask is what exactly dbo.sp_Create_View_Client is doing - must it succeed together with the insert? In other words, what happens if the insert works fine and the trigger fails? I would do the whole insert and the execution of the SP in one transaction, so as to preserve data integrity.
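A rough sketch of that approach, not a definitive implementation - the id and short column names are taken from the trigger above, the VALUES list is illustrative, and it assumes id is an IDENTITY column:

BEGIN TRANSACTION;

    INSERT INTO dbo.client (short)  -- plus whatever other columns are required
    VALUES (N'ACME');

    DECLARE @id int = SCOPE_IDENTITY();
    DECLARE @clnt nvarchar(10);
    SELECT @clnt = short FROM dbo.client WHERE id = @id;

    EXECUTE dbo.sp_Create_View_Client @id, @clnt;

COMMIT TRANSACTION;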