incrementing on errors - postgresql

How do I suppress the 'id' in this table from incrementing when an error occurs?
db=> CREATE TABLE test (id serial primary key, info text, UNIQUE(info));
NOTICE: CREATE TABLE will create implicit sequence "test_id_seq" for serial column "test.id"
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "test_pkey" for table "test"
NOTICE: CREATE TABLE / UNIQUE will create implicit index "test_info_key" for table "test"
CREATE TABLE
db=> INSERT INTO test (info) VALUES ('hello') ;
INSERT 0 1
db=> INSERT INTO test (info) VALUES ('hello') ;
ERROR: duplicate key violates unique constraint "test_info_key"
db=> INSERT INTO test (info) VALUES ('hello') ;
ERROR: duplicate key violates unique constraint "test_info_key"
db=> INSERT INTO test (info) VALUES ('goodbye') ;
INSERT 0 1
db=> SELECT * from test; SELECT last_value from test_id_seq;
id | info
----+---------
1 | hello
4 | goodbye
(2 rows)
last_value
------------
4
(1 row)

You cannot suppress this - and there is nothing wrong with having gaps in your ID values.
The primary key is a totally meaningless value that is only used to uniquely identify one row in a table.
You cannot rely on the ID never having gaps - just think what happens when you delete a row.
Simply ignore it - nothing is wrong.
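For example, continuing the transcript above, deleting a row leaves exactly the same kind of gap:
db=> DELETE FROM test WHERE info = 'hello';
DELETE 1
db=> SELECT * from test;
 id |  info
----+---------
  4 | goodbye
(1 row)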
Edit
Just wanted to mention that this behaviour is also clearly stated in the manual:
To avoid blocking concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back
http://www.postgresql.org/docs/current/static/functions-sequence.html
(Scroll to the bottom)

Your question boils down to this: "Can I rollback the next value from a PostgreSQL sequence?"
And the answer is, "You can't." PostgreSQL documentation says
To avoid blocking concurrent transactions that obtain numbers from the same sequence, a nextval operation is never rolled back . . .

Imagine two different transactions go to insert. Transaction A gets id=1, transaction B gets id=2. Transaction B commits; transaction A rolls back. Now what do we do? How could we roll back the sequence for A without affecting B or later transactions?
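A sketch of that interleaving, as two sessions:
-- Session A                      -- Session B
BEGIN;
INSERT INTO test (info)
VALUES ('a');  -- gets id=1
                                  BEGIN;
                                  INSERT INTO test (info)
                                  VALUES ('b');  -- gets id=2
                                  COMMIT;        -- id=2 is permanent
ROLLBACK;
-- id=1 is simply discarded; resetting the sequence back to 1 here would
-- eventually hand out id=2 again and collide with session B's committed row.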

I figured it out.
I needed to write a wrapper function around my INSERT statement.
The database will normally have one user at a time, so the 'race to the next id' condition is rare. What I was concerned about was my (unmentioned) 'pull rows from remote database table' function trying to reinsert the growing remote table into the master database table. I am displaying the row ids, and I didn't want the users to see the gaps in numbering as missing data.
Anyway, here is my solution:
CREATE FUNCTION testInsert (test.info%TYPE) RETURNS void AS $$
BEGIN
    -- only insert (and thus call nextval) when the value is not already there
    PERFORM info FROM test WHERE info = $1;
    IF NOT FOUND THEN
        INSERT INTO test (info) VALUES ($1);
    END IF;
END;
$$ LANGUAGE plpgsql;
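A single-statement variant of the same idea (a sketch; 'hello' stands in for the value being inserted). Because the INSERT never runs when the SELECT produces no rows, nextval() is not called for duplicates:
INSERT INTO test (info)
SELECT 'hello'
WHERE NOT EXISTS (SELECT 1 FROM test WHERE info = 'hello');
Like the function above, this is not safe against two concurrent inserts of the same value; in that race one session still hits the unique constraint and burns a sequence value.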

Related

Is it safe to insert data from inside of a CTE expression?

Background
I have a function that generates detail rows and inserts these into a detail table. I want to automatically ensure that the referenced master row is available before inserting the detail rows.
A BEFORE INSERT trigger does the job, but unfortunately the job is done too well. If a detail row is prevented due to a unique index, the master row will still be inserted, leaving me with a master without children (I don't want that).
I managed to solve this by inserting the master rows inside of a CTE and then inserting the detail rows in the actual query. This works, but I'm worried that it's not a safe way of doing it.
INSERT rows from inside of a CTE expression
Here is the code to try this out. The input_cte is just generating static dummy data in the example. This particular example could be built with separate static insert SQLs, but in my real case the input_cte is dynamic.
Is this a safe way to solve this, or is it out of spec and could blow up in the next PG version?
CREATE TABLE master (
    id INT NOT NULL GENERATED BY DEFAULT AS IDENTITY,
    PRIMARY KEY (id)
);

CREATE TABLE detail (
    id INT NOT NULL GENERATED BY DEFAULT AS IDENTITY,
    master_id INT,
    PRIMARY KEY (id, master_id)
);

ALTER TABLE detail
    ADD CONSTRAINT detail_master_id_fkey
    FOREIGN KEY (master_id)
    REFERENCES master (id);

WITH input_cte AS (
    SELECT 1 AS master_id, 1 AS detail_id
    UNION SELECT 1, 2
    UNION SELECT 2, 1
),
insert_cte AS (
    --Ignore conflicts, as the master row could already exist
    INSERT INTO master (id)
    SELECT DISTINCT master_id
    FROM input_cte
    ON CONFLICT DO NOTHING
)
INSERT INTO detail (id, master_id)
SELECT detail_id, master_id
FROM input_cte;

SELECT * FROM master;
SELECT * FROM detail;
Edit
Bummer... I just realised that this method will also insert master rows even if the detail rows would be stopped by my unique index.
I see two options. Which should I choose?
1. Check exactly what I can insert before trying to insert, i.e. do the same thing as my unique index does.
2. Go ahead with the above solution or the BEFORE INSERT trigger concept, and then clean up unused master rows afterwards with a separate DELETE query.
Edit 2, as a response to Haleemur Ali's comment
I agree with Haleemur, but in my case it's a bit more complicated than the small example I created. The unique key on my detail table can actually have null values. The index on my detail table (project_sequence) looks like this to allow null values:
CREATE UNIQUE INDEX project_sequence_unique_combinations
    ON main.project_sequence
    (project_id, controlpoint_type_id, COALESCE(drawing_id, 0), COALESCE(layer_guid, '00000000-0000-0000-0000-000000000000'));
Due to possible NULL values I cannot use these fields in my primary key, so I have a surrogate integer key. I'm calculating these key values in my CTE so they will always be unique, i.e. they can be inserted into the master table (sequence) even if the detail rows get stopped by the unique index.
To clarify I've inserted my actual code below. This code works as it should, but it sure would be nice to be able to utilize the new fancy DEFERRED and REFERENCES triggers.
SELECT COALESCE(MAX(id), 0)
INTO _max_sequence_id
FROM main.sequence;

WITH cte AS (
    SELECT
        d.project_id,
        pct.controlpoint_type_id,
        d.id AS drawing_id,
        DENSE_RANK() OVER (ORDER BY d.id, sequence_group_key) + _max_sequence_id + 1 AS new_sequence_id
    FROM main.drawing d
    CROSS JOIN main.project_controlpoint_type pct
    --This LEFT JOIN along with "ps.project_id IS NULL" is my
    --current solution, i.e. it's "option 1" from above.
    LEFT JOIN main.project_sequence ps
        ON ps.project_id = d.project_id
        AND ps.drawing_id = d.id
        AND ps.controlpoint_type_id = pct.controlpoint_type_id
    WHERE d.project_id = _project_id
        AND pct.project_id = _project_id
        AND pct.sequence_level_id = 2
        AND ps.project_id IS NULL
),
insert_sequence_cte AS (
    INSERT INTO main.sequence
        (id, project_id, last_value)
    SELECT DISTINCT cte.new_sequence_id, cte.project_id, 0
    FROM cte
    ON CONFLICT DO NOTHING
)
INSERT INTO main.project_sequence
    (project_id, controlpoint_type_id, drawing_id, sequence_id)
SELECT
    project_id,
    controlpoint_type_id,
    drawing_id,
    new_sequence_id
FROM cte;
The manual page on WITH queries states that your use case is legitimate and supported:
You can use data-modifying statements (INSERT, UPDATE, or DELETE) in WITH.
and
... data-modifying statements are only allowed in WITH clauses that are attached to the top-level statement. However, normal WITH visibility rules apply, so it is possible to refer to the WITH statement's output from the sub-SELECT.
Further:
If a data-modifying statement in WITH lacks a RETURNING clause, then it forms no temporary table and cannot be referred to in the rest of the query. Such a statement will be executed nonetheless.
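For illustration, if you do want to refer to the inserted rows later in the query, add RETURNING to the CTE (a sketch based on the tables above; note that rows skipped by ON CONFLICT DO NOTHING are not returned):
WITH insert_cte AS (
    INSERT INTO master (id)
    VALUES (1), (2)
    ON CONFLICT DO NOTHING
    RETURNING id
)
SELECT id FROM insert_cte;  -- only the rows actually inserted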
If the rule "no master without a detail" is important to your model, you should let the DB enforce it. This will free you from "auto-inserting" master to reduce the potential for error and let you use conventional methods again.
Take a look at CONSTRAINT TRIGGERS. They allow you to just detect and reject violations of your no-master-without-detail rule while leaving the actual INSERTs to application code.
Your use case would need a CONSTRAINT TRIGGER on your master table that is DEFERRABLE INITIALLY DEFERRED. This allows you to INSERT master, then INSERT detail and still be sure that the transaction only commits if all is consistent.
From the manual linked above:
Constraint triggers must be AFTER ROW triggers on plain tables (not foreign tables). They can be fired either at the end of the statement causing the triggering event, or at the end of the containing transaction; in the latter case they are said to be deferred.
You'll need two triggers, one handling INSERT/UPDATE on master and another one handling DELETE/UPDATE on detail:
CREATE CONSTRAINT TRIGGER trigger_assert_master_has_detail
AFTER INSERT OR UPDATE OF id ON master
DEFERRABLE INITIALLY DEFERRED FOR EACH ROW
EXECUTE PROCEDURE assert_master_has_detail();
CREATE CONSTRAINT TRIGGER trigger_assert_no_leftover_master
AFTER DELETE OR UPDATE OF master_id ON detail
DEFERRABLE INITIALLY DEFERRED FOR EACH ROW
EXECUTE PROCEDURE assert_no_leftover_master();
Note that UPDATEs will only fire the trigger if they concern the FK/PK column.
The two trigger functions will then check if there are 1-n details for the master:
CREATE FUNCTION assert_master_has_detail() RETURNS trigger
AS
$$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM detail WHERE master_id = NEW.id)
    THEN
        RAISE EXCEPTION 'no detail for master_id=%', NEW.id;
    ELSE
        RETURN NEW;
    END IF;
END;
$$
LANGUAGE plpgsql;

CREATE FUNCTION assert_no_leftover_master() RETURNS trigger
AS
$$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM detail WHERE master_id = OLD.master_id)
    AND EXISTS (SELECT 1 FROM master WHERE id = OLD.master_id)
    THEN
        RAISE EXCEPTION 'last detail for master_id=% removed, but master still exists', OLD.master_id;
    ELSE
        RETURN NULL;
    END IF;
END;
$$
LANGUAGE plpgsql;
Example of a violation:
INSERT INTO master
VALUES (1);
-- ERROR: no detail for master_id=1
-- CONTEXT: PL/pgSQL function assert_master_has_detail() line 5 at RAISE
and a legal scenario:
INSERT INTO master
VALUES (1);
INSERT INTO detail
VALUES (10, 1);
-- trigger fired at end of transaction, finds everything is OK
Here's a complete solution as dbfiddle.

Sequence in postgresql

Converting the SQL Server procedure and table below, which store and generate a sequence, to PostgreSQL.
Can anyone guide me on how to do this in Postgres (via a table and a function like this) and not via a sequence or nextval or currval?
Sequence table
IF NOT EXISTS (SELECT name FROM sys.tables WHERE name = 'testtable')
    CREATE TABLE dbo.testtable(Sequence int NOT NULL )
go
IF NOT EXISTS (SELECT * FROM testtable)
    INSERT INTO testtable VALUES (-2147483648) 
go 
Sequence generating proc
CREATE PROCEDURE test_proc
AS
SET NOCOUNT ON
DECLARE @iReturn int
BEGIN TRANSACTION
SELECT @iReturn = Sequence FROM schema.test (TABLOCKX) -- set exclusive table lock
UPDATE schema.test SET Sequence = (Sequence + 1)
COMMIT TRANSACTION
SELECT @iReturn
RETURN @iReturn
go 
grant execute on schema.test to public 
go
Disclaimer: using a sequence is the only scalable and efficient way to generate unique numbers.
Having said that, it is possible to implement your own sequence generator. The only situation where it makes any sense is if you are required to generate gapless numbers. If you do not have such a requirement, use a sequence.
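For comparison, the native approach needs no extra infrastructure at all (a minimal sketch; the sequence name is arbitrary):
create sequence testsequence;
select nextval('testsequence');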
You need one table that stores the values of the sequences. I usually use one table with a row for each "generator" that avoids costly table locks.
create table seq_generator
(
    entity varchar(30) not null primary key,
    seq_value integer default 0 not null
);

insert into seq_generator (entity) values ('testsequence');
Then create a function to increment the sequence value:
create or replace function next_value(p_entity varchar)
    returns integer
as
$$
    update seq_generator
        set seq_value = seq_value + 1
    where entity = lower(p_entity)
    returning seq_value;
$$
language sql;
To obtain the next sequence value, e.g. inside an insert:
insert into some_table
(id, ...)
values
(next_value('testsequence'), ...);
Or make it a default value:
create table some_table
(
id integer not null primary key default next_value('testsequence'),
...
);
The UPDATE increments and locks the row in a single statement returning the new value for the sequence. If the calling transaction commits, the update to seq_generator will also be committed. If the calling transaction rolls back, the update will roll back as well.
If a second transaction calls next_value() for the same entity, it has to wait until the first transaction commits or rolls back.
So access to the generator is serialized through this function. Only one transaction at a time can do that.
If you need a second gapless sequence, just insert a new row into the seq_generator table.
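For example (using a hypothetical 'invoices' generator):
insert into seq_generator (entity) values ('invoices');
select next_value('invoices');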
This will seriously affect performance when you use it in an environment that does a lot of concurrent inserts.
The only reason that would justify this is a legal requirement to have a gapless number. In every other case you should really, really use a native Postgres sequence.

Postgresql function not working as expected with INSERT INTO

I have a function to insert data from one table to another:
$BODY$
BEGIN
    INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
    SELECT DISTINCT (c.uid), c.queue_id, c.connected, c.callerid2
    FROM public.calls c
    WHERE c.connected IS NOT NULL;
    RETURN;
EXCEPTION WHEN unique_violation THEN
    NULL;
END;
$BODY$
And the structure of the table:
CREATE TABLE backups.nc_calls_id
(
    uid character(30) NOT NULL,
    queue_id integer,
    callerid2 text,
    connected timestamp without time zone,
    id serial NOT NULL,
    CONSTRAINT calls2_pkey PRIMARY KEY (uid)
)
WITH (
    OIDS=FALSE
);
The first time I executed this query everything went OK; 200000 rows were inserted into the new table with unique ids.
But now, when I execute it again, no rows are inserted.
From the rather minimalist description given (no PostgreSQL version, no CREATE FUNCTION statement showing params etc, no other table structure, no function invocation) I'm guessing that you're attempting to do a merge, where you insert a row only if it doesn't exist by skipping rows if they already exist.
What the above function will do is skip all rows if any row already exists.
You need to either use a loop to do the insert within individual BEGIN ... EXCEPTION blocks (slow) or LOCK the table and do an INSERT INTO ... SELECT ... FROM newtable WHERE NOT EXISTS (SELECT 1 FROM oldtable where oldtable.key = newtable.key).
The INSERT INTO ... SELECT ... WHERE NOT EXISTS method will perform a lot better but will fail if more than one runs concurrently or if anything else inserts into the destination table at the same time. LOCKing the destination table before running it will make sure it's safe.
The PL/PgSQL looping BEGIN ... EXCEPTION method sounds nice and safe at first glance. Then you think about what happens when you run two of them at once. One will insert some keys first, one will insert other keys first, so they have a split of the values between them. That's OK, together they make up the full set. But what if only one of them commits and the other fails for some reason? You'll have an interesting sparsely inserted result. For that reason it's probably best to lock the destination table if using this approach too ... in which case you might as well use the vastly more efficient single pass INSERT with subquery-based uniqueness violation check.
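A sketch of the LOCK approach using the tables from the question (assuming uid is the key being checked; adjust names to your real schema):
BEGIN;
LOCK TABLE backups.calls2 IN EXCLUSIVE MODE;
INSERT INTO backups.calls2 (uid, queue_id, connected, callerid2)
SELECT DISTINCT ON (c.uid) c.uid, c.queue_id, c.connected, c.callerid2
FROM public.calls c
WHERE c.connected IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM backups.calls2 b WHERE b.uid = c.uid)
ORDER BY c.uid;
COMMIT;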

PostgreSQL function for last inserted ID

In PostgreSQL, how do I get the last id inserted into a table?
In MS SQL there is SCOPE_IDENTITY().
Please do not advise me to use something like this:
select max(id) from table
( tl;dr : goto option 3: INSERT with RETURNING )
Recall that in postgresql there is no "id" concept for tables, just sequences (which are typically but not necessarily used as default values for surrogate primary keys, with the SERIAL pseudo-type).
If you are interested in getting the id of a newly inserted row, there are several ways:
Option 1: CURRVAL(<sequence name>);.
For example:
INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John');
SELECT currval('persons_id_seq');
The name of the sequence must be known; it's really arbitrary. In this example we assume that the table persons has an id column created with the SERIAL pseudo-type. To avoid relying on this, and to be cleaner, you can use pg_get_serial_sequence instead:
INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John');
SELECT currval(pg_get_serial_sequence('persons','id'));
Caveat: currval() only works after an INSERT (which has executed nextval() ), in the same session.
Option 2: LASTVAL();
This is similar to the previous, only that you don't need to specify the sequence name: it looks for the most recent modified sequence (always inside your session, same caveat as above).
Both CURRVAL and LASTVAL are totally concurrent safe. The behaviour of sequence in PG is designed so that different session will not interfere, so there is no risk of race conditions (if another session inserts another row between my INSERT and my SELECT, I still get my correct value).
However, they do have a subtle potential problem. If the database has some TRIGGER (or RULE) that, on insertion into the persons table, makes some extra insertions in other tables, then LASTVAL will probably give us the wrong value. The problem can even happen with CURRVAL, if the extra insertions are done into the same persons table (this is much less usual, but the risk still exists).
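A minimal sketch of the trigger problem (the audit table and trigger here are hypothetical, purely for illustration):
CREATE TABLE audit_log (id serial PRIMARY KEY, note text);

CREATE FUNCTION log_person() RETURNS trigger AS $$
BEGIN
    -- this INSERT calls nextval() on audit_log's sequence
    INSERT INTO audit_log (note) VALUES ('inserted ' || NEW.lastname);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER persons_audit AFTER INSERT ON persons
    FOR EACH ROW EXECUTE PROCEDURE log_person();

INSERT INTO persons (lastname, firstname) VALUES ('Smith', 'John');
SELECT lastval();  -- returns the audit_log id, not the persons id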
Option 3: INSERT with RETURNING
INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John') RETURNING id;
This is the most clean, efficient and safe way to get the id. It doesn't have any of the risks of the previous.
Drawbacks? Almost none: you might need to modify the way you call your INSERT statement (in the worst case, perhaps your API or DB layer does not expect an INSERT to return a value); it's not standard SQL (who cares); it's been available since PostgreSQL 8.2 (Dec 2006...)
Conclusion: If you can, go for option 3. Otherwise, prefer 1.
Note: all these methods are useless if you intend to get the last inserted id globally (not necessarily by your session). For this, you must resort to SELECT max(id) FROM table (of course, this will not read uncommitted inserts from other transactions).
Conversely, you should never use SELECT max(id) FROM table instead one of the 3 options above, to get the id just generated by your INSERT statement, because (apart from performance) this is not concurrent safe: between your INSERT and your SELECT another session might have inserted another record.
See the RETURNING clause of the INSERT statement. Basically, the INSERT doubles as a query and gives you back the value that was inserted.
Leonbloy's answer is quite complete. I would only add the special case in which one needs to get the last inserted value from within a PL/pgSQL function where OPTION 3 doesn't fit exactly.
For example, if we have the following tables:
CREATE TABLE person(
id serial,
lastname character varying (50),
firstname character varying (50),
CONSTRAINT person_pk PRIMARY KEY (id)
);
CREATE TABLE client (
id integer,
CONSTRAINT client_pk PRIMARY KEY (id),
CONSTRAINT fk_client_person FOREIGN KEY (id)
REFERENCES person (id) MATCH SIMPLE
);
If we need to insert a client record we must refer to a person record. But let's say we want to devise a PL/pgSQL function that inserts a new record into client but also takes care of inserting the new person record. For that, we must use a slight variation of leonbloy's OPTION 3:
INSERT INTO person(lastname, firstname)
VALUES (lastn, firstn)
RETURNING id INTO [new_variable];
Note that there are two INTO clauses (one belongs to INSERT INTO, the other to RETURNING ... INTO). The PL/pgSQL function would therefore be defined like:
CREATE OR REPLACE FUNCTION new_client(lastn character varying, firstn character varying)
RETURNS integer AS
$BODY$
DECLARE
    v_id integer;
BEGIN
    -- Inserts the new person record and retrieves the last inserted id
    INSERT INTO person(lastname, firstname)
    VALUES (lastn, firstn)
    RETURNING id INTO v_id;
    -- Inserts the new client and references the inserted person
    INSERT INTO client(id) VALUES (v_id);
    -- Return the new id so we can use it in a select clause or return it to the user application
    RETURN v_id;
END;
$BODY$
LANGUAGE plpgsql VOLATILE;
Now we can insert the new data using:
SELECT new_client('Smith', 'John');
or
SELECT * FROM new_client('Smith', 'John');
And we get the newly created id.
new_client
(integer)
----------
1
You can use the RETURNING clause in the INSERT statement, just like the following:
wgzhao=# create table foo(id int,name text);
CREATE TABLE
wgzhao=# insert into foo values(1,'wgzhao') returning id;
id
----
1
(1 row)
INSERT 0 1
wgzhao=# insert into foo values(3,'wgzhao') returning id;
id
----
3
(1 row)
INSERT 0 1
wgzhao=# create table bar(id serial,name text);
CREATE TABLE
wgzhao=# insert into bar(name) values('wgzhao') returning id;
id
----
1
(1 row)
INSERT 0 1
wgzhao=# insert into bar(name) values('wgzhao') returning id;
id
----
2
(1 row)
INSERT 0 1
The other answers don't show how one might use the value(s) returned by RETURNING. Here's an example where the returned value is inserted into another table.
WITH inserted_id AS (
    INSERT INTO tbl1 (col1)
    VALUES ('foo')
    RETURNING id
)
INSERT INTO tbl2 (other_id)
VALUES ((SELECT id FROM inserted_id));
See the example below:
CREATE TABLE users (
-- make the "id" column a primary key; this also creates
-- a UNIQUE constraint and a b+-tree index on the column
id SERIAL PRIMARY KEY,
name TEXT,
age INT4
);
INSERT INTO users (name, age) VALUES ('Mozart', 20);
Then, to get the last inserted id, use this for table "users" with serial column "id":
SELECT currval(pg_get_serial_sequence('users', 'id'));
SELECT CURRVAL(pg_get_serial_sequence('my_tbl_name','id_col_name'))
You need to supply the table name and column name of course.
This will be for the current session / connection
http://www.postgresql.org/docs/8.3/static/functions-sequence.html
For those who need to get the whole inserted record, you can add
returning *
to the end of your query to get the whole object, including the id.
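For example, with the users table from the earlier answer:
INSERT INTO users (name, age) VALUES ('Chopin', 39) RETURNING *;
This returns the entire inserted row, id included.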
You can use RETURNING id after insert query.
INSERT INTO distributors (id, name) VALUES (DEFAULT, 'ALI') RETURNING id;
and result:
id
----
1
In the above example, id is an auto-increment field.
The better way is to use INSERT with RETURNING. Though there are already answers saying the same, I just want to add that if you want to save the id to a variable (inside a PL/pgSQL function), then you can do this:
INSERT INTO my_table(name) VALUES ('some name') RETURNING id INTO _my_id;
Postgres has an inbuilt mechanism for this, which in the same query returns the id, or whatever else you want the query to return.
Here is an example. Consider a table users_table with an id column and a name column, where you want the id to be returned after every insert.
# create table users_table(id serial not null primary key, name character varying);
CREATE TABLE
#insert into users_table(name) VALUES ('Jon Snow') RETURNING id;
id
----
1
(1 row)
# insert into users_table(name) VALUES ('Arya Stark') RETURNING id;
id
----
2
(1 row)
Try this:
select nextval('my_seq_name'); -- returns the next value
If this returns 1 (or whatever the start_value for your sequence is), then reset the sequence back to the original value, passing the false flag:
select setval('my_seq_name', 1, false);
Otherwise,
select setval('my_seq_name', nextValue - 1, true);
This will restore the sequence value to the original state and "setval" will return the sequence value you are looking for.
I had this issue with Java and Postgres.
I fixed it by updating to a new Connector-J version.
postgresql-9.2-1002.jdbc4.jar
https://jdbc.postgresql.org/download.html:
Version 42.2.12
https://jdbc.postgresql.org/download/postgresql-42.2.12.jar
Based on @ooZman's answer above, this seems to work for PostgreSQL v12 when you need to INSERT with the next value of a "sequence" (akin to auto_increment) without goofing anything up in your table(s) counter(s). (Note: I haven't tested it in more complex DB cluster configurations though...)
Pseudo code
$insert_next_id = $return_result->query("select (setval('"your_id_seq"', (select nextval('"your_id_seq"')) - 1, true)) +1");

Sequences not affected by transactions?

I have a table
create table testtable(
    testtable_rid serial not null,
    data integer not null,
    constraint pk_testtable primary key(testtable_rid)
);
So let's say I run this code about 20 times:
begin;
insert into testtable (data) values (0);
rollback;
and then I do
begin;
insert into testtable (data) values (0);
commit;
And finally a
select * from testtable
Result:
row0: testtable_rid=21 | data=0
Expected result:
row0: testtable_rid=1 | data=0
As you can see, sequences do not appear to be affected by transaction rollbacks. They continue to increment as if the transaction was committed and then the row was deleted. Is there some way to prevent sequences from behaving in this way?
It would not be a good idea to rollback sequences. Imagine two transactions happening at the same time, each of which uses the sequence for a unique id. If the second transaction commits and the first transaction rolls back, then the second inserted a row with "2" while the first rolls the sequence back to "1".
If that sequence is then used again, the value of the sequence will become "2" which could lead to a unique constraint problem.
No, there isn't. See the note at the bottom of this page. It's a bad idea to do something like that anyway. If you have two transactions running at the same time, each inserting one row, you want them to insert rows with different IDs.