How to check in postgres that lastval() is defined - postgresql

I'm creating a mini DB handling class and, for convenience, I want my generated INSERT queries to always try to return the last insert id, or null if not applicable. I tried to use:
INSERT INTO ... RETURNING lastval();
It works fine with tables that have a sequence, but if the table has no sequence-backed key field, nothing is inserted and the query fails with the error: lastval is not yet defined in this session.
How can I check, in the RETURNING clause, whether a sequence generator was used in this session, or else silently return the last id or null without raising an error?
Note: for usability, I want to avoid specifying the PK field name directly.

Related

CockroachDB: Why do I get the error "lastval is not yet defined in this session"?

If I have a SERIAL column on my table and insert a value, the column gets automatically populated, but if I call SELECT lastval() to get the value afterwards, even though it's the same session, I get the error "lastval is not yet defined in this session". This works in Postgres but is an error in CockroachDB. Why is that and how do I fix it?
lastval() itself works the same in CockroachDB and Postgres: it returns the most recent value generated by nextval() in the same SQL session, and raises that error if nextval() has never been called in the session. The difference is CockroachDB's default implementation of the SERIAL keyword. Postgres implements this by creating a sequence and implicitly calling nextval on it whenever you insert into the table. CockroachDB instead calls unique_rowid(), which is more performant but doesn't populate lastval. You can get compatible behavior by setting the serial_normalization variable to virtual_sequence before creating tables with SERIAL columns, and/or modifying existing serial columns to use a virtual sequence.
For example,
CREATE SEQUENCE dummy_seq VIRTUAL;
ALTER TABLE users ALTER COLUMN id SET DEFAULT nextval('dummy_seq');
Or you can avoid the extra trip to the database entirely by using a RETURNING clause on your insert.
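As a rough sketch of both options in CockroachDB: the accounts table and its email column are made up for illustration, and the first half assumes that serial_normalization behaves as described above.

```sql
-- Option 1: make SERIAL columns sequence-backed so that lastval() is populated.
-- The setting must be in place before the table is created.
SET serial_normalization = virtual_sequence;
CREATE TABLE accounts (id SERIAL PRIMARY KEY, email TEXT);  -- hypothetical table
INSERT INTO accounts (email) VALUES ('alice@example.com');
SELECT lastval();

-- Option 2: skip lastval() and read the generated id from the INSERT itself.
INSERT INTO accounts (email) VALUES ('bob@example.com') RETURNING id;
```

Option 2 does not depend on how SERIAL is implemented, so the same statement also works unchanged in Postgres.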

Way to migrate a create table with sequence from postgres to DB2

I need to migrate a DDL from Postgres to DB2, and it needs to work the same as in Postgres. There is a table that generates values from a sequence, but the values can also be explicitly given.
Postgres
create sequence hist_id_seq;
create table benchmarksql.history (
hist_id integer not null default nextval('hist_id_seq') primary key,
h_c_id integer,
h_c_d_id integer,
h_c_w_id integer,
h_d_id integer,
h_w_id integer,
h_date timestamp,
h_amount decimal(6,2),
h_data varchar(24)
);
(Look at the sequence call in the hist_id column to define the value of the primary key)
The business logic inserts into the table by explicitly providing an ID, and in other cases, it leaves the database to choose the number.
If I change this in DB2 to GENERATED ALWAYS, it will throw errors because some values are provided explicitly. On the other hand, if I create the table with GENERATED BY DEFAULT, DB2 will throw an error when the generated value collides with an existing one (SQL0803N), because the "internal sequence" does not take the already inserted values into account, and it does not retry with the next value.
And, I do not want to restart the sequence each time a provided ID was inserted.
This is the problem in BenchmarkSQL when trying to port it to DB2: https://sourceforge.net/projects/benchmarksql/ (File sqlTableCreates)
How can I implement the same database logic in DB2 as it does in Postgres (and apparently in Oracle)?
You're operating under a misconception: that sources external to the db get to dictate its internal keys. Ideally/conceptually, autogenerated ids will never need to be seen outside of the db, as conceptually there should be unique natural keys for export or reporting. Still, there are times when applications will need to manage some ids, often when setting up related entities (eg, JPA seems to want to work this way).
However, if you add an id value that you generated from a different source, the db won't be able to manage it. How could it? It's not efficient: for one thing, attempting to do so would do one of the following:
Be unsafe in the face of multiple clients (attempt to add duplicate keys)
Serialize access to the table (for a potentially slow query, too)
(This usually shows up when people attempt something like: SELECT MAX(id) + 1, which would require locking the entire table for thread safety, likely including statements that don't even touch that column. If you try to find any "first-unused" id - trying to fill gaps - this gets more complicated and problematic)
Neither is ideal, so it's best to not have the problem in the first place. This is usually done by having id columns be autogenerated, but (as pointed out earlier) there are situations where we may need to know what the id will be before we insert the row into the table. Fortunately, there's a standard SQL object for this: SEQUENCE. This provides a db-managed, thread-safe, fast way to get ids. It appears that in PostgreSQL you can use sequences in the DEFAULT clause for a column, but DB2 doesn't allow it. If you don't want to specify an id every time (it should be autogenerated some of the time), you'll need another way; this is the perfect time to use a BEFORE INSERT trigger:
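CREATE TRIGGER Add_Generated_Id
NO CASCADE BEFORE INSERT ON benchmarksql.history
REFERENCING NEW AS Incoming_Entity
FOR EACH ROW
-- only fill in the key when the caller did not supply one
WHEN (Incoming_Entity.hist_id IS NULL)
SET Incoming_Entity.hist_id = NEXT VALUE FOR hist_id_seq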
(something like this - not tested. You didn't specify where in the project this would belong)
So, if you then add a row with something like:
INSERT INTO benchmarksql.history (hist_id, h_data) VALUES(null, 'a')
or
INSERT INTO benchmarksql.history (h_data) VALUES('a')
an id will be generated and attached automatically. Note that ALL ids added to the table must come from the given sequence (as #mustaccio pointed out, this appears to be true even in PostgreSQL), or any UNIQUE CONSTRAINT on the column will start throwing duplicate-key errors. So any time your application needs an id before inserting a row in the table, you'll need some form of
SELECT NEXT VALUE FOR hist_id_seq
FROM sysibm.sysdummy1
... and that's it, pretty much. This is completely thread and concurrency safe, will not maintain/require long-term locks, nor require serialized access to the table.
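For the case where the application supplies the id itself, a minimal DB2 sketch might look like the following. It assumes the sequence has not been created yet, and the START WITH / INCREMENT BY values are only illustrative.

```sql
-- Created once; AS INTEGER matches the type of the hist_id column.
CREATE SEQUENCE hist_id_seq AS INTEGER START WITH 1 INCREMENT BY 1;

-- Take the id from the sequence inside the INSERT itself ...
INSERT INTO benchmarksql.history (hist_id, h_data)
VALUES (NEXT VALUE FOR hist_id_seq, 'a');

-- ... and, if needed, read back the value this session just generated.
SELECT PREVIOUS VALUE FOR hist_id_seq
FROM sysibm.sysdummy1;
```

Either way, every id that reaches the table originates from hist_id_seq, which is what keeps the unique constraint safe.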

PostgreSQL: trivial INSERT fails the first time, succeeds afterwards

I am puzzled by a weird Postgres problem I encounter in the trivial database shown below: If I first insert a tag and explicitly specify its ID and then try to insert another tag without passing an ID, then this second insert fails. If I try a third time (again without ID), the insert succeeds.
DROP DATABASE IF EXISTS mydb;
CREATE DATABASE mydb;
\c mydb
DROP SCHEMA public;
CREATE SCHEMA core;
CREATE TABLE core.tag
(
id serial PRIMARY KEY,
title text NOT NULL
);
-- this works: all columns specified explicitly
INSERT INTO core.tag(id, title) VALUES (1, 'known tag');
-- omitting the tag ID fails with
-- ERROR: duplicate key value violates unique constraint "tag_pkey"
-- DETAIL: Key (id)=(1) already exists.
INSERT INTO core.tag(title) VALUES ('unknown tag');
-- this works again ?!?
INSERT INTO core.tag(title) VALUES ('unknown tag');
The issue only seems to occur on a freshly created database and once it does, it does not seem to happen again. I have never come across anything like this - so far, I have just inserted data with or without explicit ID and AFAICS, nothing ever failed like this...
Does anyone have an idea what's going on here ?!?
Environment: PostgreSQL 9.1.3 on Mac OSX 10.7.5
Of course this fails.
What happens?
When you create the table, a sequence is also created that generates the values for the ID column. The sequence starts with 1 but it is only used if you do not specify a value for the ID column.
Now when you run
INSERT INTO core.tag(id, title) VALUES (1, 'known tag');
you bypass Postgres' automatic assignment of the ID value, so the sequence "stays" at one.
Now when you run
INSERT INTO core.tag(title) VALUES ('unknown tag');
Postgres takes the next value from the sequence, which is 1. But that already exists, so the insert fails. The value is consumed even though the insert failed, so the next value is 2, and the subsequent insert without an ID value gets 2 and succeeds.
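One way to watch this happen is to read the sequence's own state between statements. This is a sketch against the core.tag table from the question, run on a fresh table with autocommit on; last_value and is_called are columns of the sequence relation itself.

```sql
SELECT last_value, is_called FROM core.tag_id_seq;   -- 1, false: nothing handed out yet

INSERT INTO core.tag(id, title) VALUES (1, 'known tag');
SELECT last_value, is_called FROM core.tag_id_seq;   -- still 1, false: sequence untouched

INSERT INTO core.tag(title) VALUES ('unknown tag');  -- fails: nextval() returned 1
SELECT last_value, is_called FROM core.tag_id_seq;   -- 1, true: the value was consumed anyway

INSERT INTO core.tag(title) VALUES ('unknown tag');  -- succeeds with id 2
```

Sequence changes are not rolled back with the failed statement, which is why the retry works.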
The solution is either to never include the ID column in your inserts or, if you do, to request the ID from the sequence:
INSERT INTO core.tag(id, title) VALUES (nextval('core.tag_id_seq'), 'known tag');
When a serial column is created, it is automatically associated with a sequence named <table_name>_<column_name>_seq, which is the name used in the statement above (schema-qualified, because the table lives in the core schema and that schema is not necessarily on the search_path).
More details about how the serial "data type" works are in the manual: http://www.postgresql.org/docs/current/static/datatype-numeric.html#DATATYPE-SERIAL
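If you would rather not rely on the naming convention, Postgres can look the sequence up for you. A small sketch using the built-in pg_get_serial_sequence function with the table from the question:

```sql
-- Returns the schema-qualified name of the sequence that backs the column,
-- here core.tag_id_seq.
SELECT pg_get_serial_sequence('core.tag', 'id');

-- The result can be passed straight to nextval().
INSERT INTO core.tag(id, title)
VALUES (nextval(pg_get_serial_sequence('core.tag', 'id')), 'known tag');
```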

postgres autoincrement not updated on explicit id inserts

I have the following table in postgres:
CREATE TABLE "test" (
"id" serial NOT NULL PRIMARY KEY,
"value" text
)
I am doing following insertions:
insert into test (id, value) values (1, 'alpha')
insert into test (id, value) values (2, 'beta')
insert into test (value) values ('gamma')
In the first 2 inserts I am explicitly mentioning the id. However the table's auto increment pointer is not updated in this case. Hence in the 3rd insert I get the error:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
I never faced this problem in MySQL with either the MyISAM or the InnoDB engine. Explicit or not, MySQL always updates the autoincrement pointer based on the max row id.
What is the workaround for this problem in postgres? I need it because I want a tighter control for some ids in my table.
UPDATE:
I need it because for some values I need to have a fixed id. For other new entries I don't mind creating new ones.
I think it may be possible by manually incrementing the nextval pointer to max(id) + 1 whenever I am explicitly inserting the ids. But I am not sure how to do that.
That's how it's supposed to work: nextval('test_id_seq') is only called when the system needs a value for this column and you have not provided one. If you provide a value, no such call is performed and consequently the sequence is not "updated".
You could work around this by manually setting the value of the sequence after your last insert with explicitly provided values:
SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
The name of the sequence is autogenerated and, by default, follows the pattern tablename_columnname_seq.
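The two forms of setval behave slightly differently, which matters for empty tables. The following is a sketch against the test table from the question, using pg_get_serial_sequence instead of a hard-coded sequence name:

```sql
-- Two-argument form: the next nextval() returns MAX(id) + 1.
-- On an empty table MAX(id) is NULL and the sequence is left unchanged.
SELECT setval(pg_get_serial_sequence('test', 'id'),
              (SELECT MAX(id) FROM test));

-- Three-argument form with is_called = false: the next nextval() returns
-- exactly the given value, so this variant also covers an empty table.
SELECT setval(pg_get_serial_sequence('test', 'id'),
              COALESCE((SELECT MAX(id) FROM test), 0) + 1,
              false);
```

With the two explicit rows from the question in place (ids 1 and 2), either statement makes the next default-generated id 3.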
In recent versions of Django, this topic is discussed in the documentation:
Django uses PostgreSQL’s SERIAL data type to store auto-incrementing
primary keys. A SERIAL column is populated with values from a sequence
that keeps track of the next available value. Manually assigning a
value to an auto-incrementing field doesn’t update the field’s
sequence, which might later cause a conflict.
Ref: https://docs.djangoproject.com/en/dev/ref/databases/#manually-specified-autoincrement-pk
There is also a management command, manage.py sqlsequencereset app_label ..., that generates SQL statements for resetting the sequences of the given app name(s).
Ref: https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-sqlsequencereset
For example these SQL statements were generated by manage.py sqlsequencereset my_app_in_my_project:
BEGIN;
SELECT setval(pg_get_serial_sequence('"my_project_aaa"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_aaa";
SELECT setval(pg_get_serial_sequence('"my_project_bbb"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_bbb";
SELECT setval(pg_get_serial_sequence('"my_project_ccc"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_ccc";
COMMIT;
It can be done automatically using a trigger. This way you are sure that the largest value is always used as the next default value.
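CREATE OR REPLACE FUNCTION set_serial_id_seq()
RETURNS trigger AS
$BODY$
BEGIN
  -- TG_ARGV[0] is the serial column name; the sequence is assumed to have
  -- the default name <table>_<column>_seq.
  EXECUTE format('SELECT setval(''%s_%s_seq'', (SELECT MAX(%I) FROM %I));',
                 TG_TABLE_NAME,
                 TG_ARGV[0],
                 TG_ARGV[0],
                 TG_TABLE_NAME);
  -- The return value of an AFTER ... FOR EACH STATEMENT trigger is ignored.
  RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql;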
CREATE TRIGGER set_mytable_id_seq
AFTER INSERT OR UPDATE OR DELETE
ON mytable
FOR EACH STATEMENT
EXECUTE PROCEDURE set_serial_id_seq('mytable_id');
The function can be reused for multiple tables. Change "mytable" to the table of interest.
For more info regarding triggers:
https://www.postgresql.org/docs/9.1/plpgsql-trigger.html
https://www.postgresql.org/docs/9.1/sql-createtrigger.html

Using the serial datatype as a foreign key

Let's say that I have two tables.
The first is: table lists, with list_id SERIAL, list_name TEXT
The second table is, trivially, a table which says if the list is public: list_id INT, is_public INT
Obviously a bit of a contrived case, but I am planning out some tables and this seems to be an issue. If I insert a new list_name into table lists, then it'll give me a new serial number...but now I will need to use that serial number in the second table. Obviously in this case, you could simply add is_public to the first table, but in the case of a linking list where you have a compound key, you'll need to know the serial value that was returned.
How do people usually handle this? Do they get the return type from the insert using whatever system they're interacting with the database with?
One approach to this sort of thing is:
INSERT...
SELECT lastval()
INSERT...
INSERT into the first table, use lastval() to get the "value most recently obtained with nextval for any sequence" (in the current session), and then use that value to build your next INSERT.
There's also INSERT ... RETURNING:
The optional RETURNING clause causes INSERT to compute and return value(s) based on each row actually inserted. This is primarily useful for obtaining values that were supplied by defaults, such as a serial sequence number.
Using INSERT ... RETURNING id basically combines the first two steps above into one so you'd do:
INSERT ... RETURNING id
INSERT ...
where the second INSERT would use the id returned from the first INSERT.
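If both inserts can be issued together, a data-modifying CTE (PostgreSQL 9.1 or later) can feed the returned id directly into the second INSERT. This is only a sketch: the name list_visibility for the second table is made up, since the question does not name it.

```sql
WITH new_list AS (
    -- first insert: the serial column list_id gets its value from the sequence
    INSERT INTO lists (list_name)
    VALUES ('groceries')
    RETURNING list_id
)
-- second insert: reuse the id generated above
INSERT INTO list_visibility (list_id, is_public)
SELECT list_id, 1
FROM new_list;
```

This keeps everything in one statement and one round trip, and it does not depend on lastval() or on session state.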