postgresql: nested insert

I have two tables, let's say tblA and tblB.
I need to insert a row into tblA and use the returned id as the value of one of the columns in tblB.
I looked for this in the documentation but could not find it. Is it possible to write a single statement (intended to be used as a prepared statement) like
INSERT INTO tblB VALUES
(DEFAULT, (INSERT INTO tblA (DEFAULT, 'x') RETURNING id), 'y')
like we do for SELECT?
Or should I do this by creating a stored procedure? I'm not sure whether I can create a prepared statement out of a stored procedure.
Please advise.
Regards,
Mayank

You'll need to wait for PostgreSQL 9.1 for this:
with
ids as (
insert ...
returning id
)
insert ...
from ids;
In the meantime, you need to use PL/pgSQL, a temporary table, or some extra logic in your app.
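As a rough sketch of how the 9.1 form might look for the tables in the question: the column names val and a_id are assumptions, since the question never shows the table definitions.

```sql
-- Assumed schema (not given in the question):
--   tblA(id serial primary key, val text)
--   tblB(id serial primary key, a_id int references tblA(id), val text)
WITH ids AS (
    INSERT INTO tblA (val) VALUES ('x')
    RETURNING id
)
INSERT INTO tblB (a_id, val)
SELECT id, 'y'
FROM ids;
```

Because this is a single statement, it can be prepared like any other INSERT, with 'x' and 'y' replaced by parameters.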

This is possible with 9.0 and the new DO for anonymous blocks:
do $$
declare
new_id integer;
begin
insert into foo1 (id) values (default) returning id into new_id;
insert into foo2 (id) values (new_id);
end$$;
This can be executed as a single statement. I haven't tried creating a PreparedStatement out of that though.
Edit
Another approach would be to simply do it in two steps, first run the insert into tableA using the returning clause, get the generated value through JDBC, then fire the second insert, something like this:
PreparedStatement stmt_1 = con.prepareStatement("INSERT INTO tblA VALUES (DEFAULT, ?) returning id");
stmt_1.setString(1, "x");
stmt_1.execute(); // important! Do not use executeUpdate()!
ResultSet rs = stmt_1.getResultSet();
long newId = -1;
if (rs.next()) {
newId = rs.getLong(1);
}
PreparedStatement stmt_2 = con.prepareStatement("INSERT INTO tblB VALUES (default,?,?)");
stmt_2.setLong(1, newId);
stmt_2.setString(2, "y");
stmt_2.executeUpdate();

You can do this in two inserts, using currval() to retrieve the foreign key (provided that key is serial):
create temporary table tb1a (id serial primary key, t text);
create temporary table tb1b (id serial primary key,
tb1a_id int references tb1a(id),
t text);
begin;
insert into tb1a values (DEFAULT, 'x');
insert into tb1b values (DEFAULT, currval('tb1a_id_seq'), 'y');
commit;
The result:
select * from tb1a;
id | t
----+---
3 | x
(1 row)
select * from tb1b;
id | tb1a_id | t
----+---------+---
2 | 3 | y
(1 row)
Using currval in this way is safe whether inside or outside a transaction. From the PostgreSQL 8.4 documentation:
currval
Return the value most recently
obtained by nextval for this sequence
in the current session. (An error is
reported if nextval has never been
called for this sequence in this
session.) Because this is returning a
session-local value, it gives a
predictable answer whether or not
other sessions have executed nextval
since the current session did.
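If you would rather not hard-code the sequence name, pg_get_serial_sequence() can look it up from the table and column. A minimal sketch, reusing the tb1a and tb1b tables defined above:

```sql
begin;
insert into tb1a values (DEFAULT, 'x');
-- pg_get_serial_sequence returns the name of the sequence that backs tb1a.id
insert into tb1b values (DEFAULT, currval(pg_get_serial_sequence('tb1a', 'id')), 'y');
commit;
```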

You may want to use an AFTER INSERT trigger for that. Something along the lines of:
create function dostuff() returns trigger as $$
begin
insert into table_b(field_1, field_2) values ('foo', NEW.id);
return new; --values returned by after triggers are ignored, anyway
end;
$$ language 'plpgsql';
create trigger trdostuff after insert on table_name for each row execute procedure dostuff();
AFTER INSERT is needed because you need the generated id in order to reference it. Hope this helps.
Edit
A trigger runs in the same "block" as the command that fired it, even when no explicit transaction is used; in other words, it effectively becomes part of that command. Therefore, there is no risk of something changing the referenced id between the two inserts.

Related

postgresql unique non-sequential id for url

I've looked at a few methods of creating alphanumeric IDs on Stack Overflow, but they all had weaknesses: some did not check for collisions, and others used sequences, which are not a good option when using logical replication.
After some Googling I found this website that has the following script, which checks for collisions and does not use sequences. However, this is done as a trigger when a row is inserted into the table.
-- Create a trigger function that takes no arguments.
-- Trigger functions automatically have OLD, NEW records
-- and TG_TABLE_NAME as well as others.
CREATE OR REPLACE FUNCTION unique_short_id()
RETURNS TRIGGER AS $$
-- Declare the variables we'll be using.
DECLARE
key TEXT;
qry TEXT;
found TEXT;
BEGIN
-- generate the first part of a query as a string with safely
-- escaped table name, using || to concat the parts
qry := 'SELECT id FROM ' || quote_ident(TG_TABLE_NAME) || ' WHERE id=';
-- This loop will probably only run once per call until we've generated
-- millions of ids.
LOOP
-- Generate our string bytes and re-encode as a base64 string.
key := encode(gen_random_bytes(6), 'base64');
-- Base64 encoding contains 2 URL unsafe characters by default.
-- The URL-safe version has these replacements.
key := replace(key, '/', '_'); -- url safe replacement
key := replace(key, '+', '-'); -- url safe replacement
-- Concat the generated key (safely quoted) with the generated query
-- and run it.
-- SELECT id FROM "test" WHERE id='blahblah' INTO found
-- Now "found" will be the duplicated id or NULL.
EXECUTE qry || quote_literal(key) INTO found;
-- Check to see if found is NULL.
-- If we checked to see if found = NULL the test would never be TRUE
-- because (NULL = NULL) evaluates to NULL, not TRUE.
IF found IS NULL THEN
-- If we didn't find a collision then leave the LOOP.
EXIT;
END IF;
-- We haven't EXITed yet, so return to the top of the LOOP
-- and try again.
END LOOP;
-- NEW and OLD are available in TRIGGER PROCEDURES.
-- NEW is the mutated row that will actually be INSERTed.
-- We're replacing id, regardless of what it was before
-- with our key variable.
NEW.id = key;
-- The RECORD returned here is what will actually be INSERTed,
-- or what the next trigger will get if there is one.
RETURN NEW;
END;
$$ language 'plpgsql';
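Note that gen_random_bytes() is provided by the pgcrypto extension, so the extension has to be installed in the database before the trigger can run. A quick way to check this, and to see what a generated key looks like (a sketch; the key differs on every call):

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;
-- 6 random bytes encode to exactly 8 base64 characters, so no '=' padding appears
SELECT replace(replace(encode(gen_random_bytes(6), 'base64'), '/', '_'), '+', '-') AS key;
```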
I have a table which already contains data, and I have added a new column called pid. Would it be possible to modify this and use the function call as a default, so that all my prior data gets a short id?
Suppose you have a table test:
DROP TABLE IF EXISTS test;
CREATE TABLE test (foo text, bar int);
INSERT INTO test (foo, bar) VALUES ('A', 1), ('B', 2);
You could add an id column to it:
ALTER TABLE test ADD COLUMN id text;
and attach the trigger:
DROP TRIGGER IF EXISTS unique_short_id_on_test ON test;
CREATE TRIGGER unique_short_id_on_test
BEFORE INSERT ON test
FOR EACH ROW EXECUTE PROCEDURE unique_short_id();
Now make a temporary table, temp, with the same structure as test (but with no data):
DROP TABLE IF EXISTS temp;
CREATE TABLE temp (LIKE test INCLUDING ALL);
CREATE TRIGGER unique_short_id_on_temp
BEFORE INSERT ON temp
FOR EACH ROW EXECUTE PROCEDURE unique_short_id();
Pouring test into temp:
INSERT INTO temp (foo, bar)
SELECT foo, bar
FROM test
RETURNING *;
yields something like:
| foo | bar | id |
|------------+-----+----------|
| A | 1 | 9yt9XQwm |
| B | 2 | LCeiA-P8 |
If other tables have foreign key references on the test table or if test must remain online,
it may not be possible to drop test and rename temp to test.
Instead, it is safer to update test with the ids from temp.
Assuming test has a primary key (for concreteness, let's call it testid) and that testid was copied into temp along with foo and bar, then
you could update test with the ids from temp using:
UPDATE test
SET id = temp.id
FROM temp
WHERE test.testid = temp.testid;
Then you could drop the temp table:
DROP TABLE temp;
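One caveat: the UPDATE above can only match rows if temp actually contains the key values. A sketch of a copy step that carries the key along, assuming the hypothetical testid column exists on test (and therefore on temp, via LIKE):

```sql
-- The BEFORE INSERT trigger on temp still fills in id for every copied row
INSERT INTO temp (testid, foo, bar)
SELECT testid, foo, bar
FROM test;
```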

Trigger that will build one row from several rows

I mean:
INSERT INTO test VALUES(1, 'message'), (2, 'message'), (3, 'message');
The trigger should cause the result in the table to look like this:
1, E'message\nmessage\nmessage'
How can I prevent the individual rows from being inserted and instead continue operating on the data passed to the INSERT?
I am using PostgreSQL.
In Postgres 10+ you can use a transition table in an AFTER trigger, see Example 43.7. Auditing with Transition Tables. Assuming that id is a primary key (or unique):
create table my_table(id int primary key, message text);
you can update one and delete the remaining inserted rows:
create or replace function after_insert_on_my_table()
returns trigger language plpgsql as $$
declare r record;
begin
select
array_agg(id) as ids,
array_to_string(array_agg(message), e'\n') as message
from new_table
into r;
update my_table
set message = r.message
where id = r.ids[1];
delete from my_table
where id = any(r.ids[2:]);
return null;
end $$;
In a trigger definition declare a transition table (as new_table):
create trigger after_insert_on_my_table
after insert on my_table
referencing new table as new_table
for each statement
execute procedure after_insert_on_my_table();
In earlier versions of Postgres you can simulate the transition tables that were introduced in Postgres 10.
Test it in db<>fiddle.
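To try it locally, a statement like the one from the question can be run against my_table. With the trigger in place, this should leave a single row whose message is the three inputs joined by newlines. Which id survives depends on the order in which array_agg happens to see the rows, since the function does not specify one.

```sql
insert into my_table values (1, 'message'), (2, 'message'), (3, 'message');
select id, message from my_table;
```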

How to use variable settings in trigger functions?

I would like to record the id of a user in the session/transaction using SET, so that I can access it later in a trigger function using current_setting. Basically, I'm trying option no. 2 from a very similar ticket posted previously, with the difference that I'm using PG 10.1.
I've been trying 3 approaches to setting the variable:
SET LOCAL myvars.user_id = 4, thereby setting it locally in the transaction;
SET myvars.user_id = 4, thereby setting it in the session;
SELECT set_config('myvars.user_id', '4', false), which, depending on the last argument, is a shortcut for either of the previous 2 options.
None of them is usable in the trigger, which receives NULL when getting the variable through current_setting. Here is a script I've devised to troubleshoot it (can be easily used with the postgres docker image):
database=$POSTGRES_DB
user=$POSTGRES_USER
[ -z "$user" ] && user="postgres"
psql -v ON_ERROR_STOP=1 --username "$user" $database <<-EOSQL
DROP TRIGGER IF EXISTS add_transition1 ON houses;
CREATE TABLE IF NOT EXISTS houses (
id SERIAL NOT NULL,
name VARCHAR(80),
created_at TIMESTAMP WITHOUT TIME ZONE DEFAULT now(),
PRIMARY KEY(id)
);
CREATE TABLE IF NOT EXISTS transitions1 (
id SERIAL NOT NULL,
house_id INTEGER,
user_id INTEGER,
created_at TIMESTAMP WITHOUT TIME ZONE DEFAULT now(),
PRIMARY KEY(id),
FOREIGN KEY(house_id) REFERENCES houses (id) ON DELETE CASCADE
);
CREATE OR REPLACE FUNCTION add_transition1() RETURNS TRIGGER AS \$\$
DECLARE
user_id integer;
BEGIN
user_id := current_setting('myvars.user_id')::integer || NULL;
INSERT INTO transitions1 (user_id, house_id) VALUES (user_id, NEW.id);
RETURN NULL;
END;
\$\$ LANGUAGE plpgsql;
CREATE TRIGGER add_transition1 AFTER INSERT OR UPDATE ON houses FOR EACH ROW EXECUTE PROCEDURE add_transition1();
BEGIN;
%1% SELECT current_setting('myvars.user_id');
%2% SELECT set_config('myvars.user_id', '55', false);
%3% SELECT current_setting('myvars.user_id');
INSERT INTO houses (name) VALUES ('HOUSE PARTY') RETURNING houses.id;
SELECT * from houses;
SELECT * from transitions1;
COMMIT;
DROP TRIGGER IF EXISTS add_transition1 ON houses;
DROP FUNCTION IF EXISTS add_transition1;
DROP TABLE transitions1;
DROP TABLE houses;
EOSQL
The conclusion I came to was that the function is triggered in a different transaction and a different (?) session. Is this something that one can configure, so that all happens within the same context?
Handle all possible cases for the customized option properly:
1. option not set yet
All references to it raise an exception, including current_setting() unless called with the second parameter missing_ok. The manual:
If there is no setting named setting_name, current_setting throws an error unless missing_ok is supplied and is true.
2. option set to a valid integer literal
3. option set to an invalid integer literal
4. option reset (which boils down to a special case of 3.)
For instance, if you set a customized option with SET LOCAL or set_config('myvars.user_id3', '55', true), the option value is reset at the end of the transaction. It still exists and can be referenced, but it now returns an empty string (''), which cannot be cast to integer.
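These cases can be reproduced in a plain psql session. This is only a sketch: myvars.never_set is a made-up name, and the two-argument form of current_setting() requires PostgreSQL 9.6 or later.

```sql
-- Case 1: never set. With missing_ok = true this yields NULL instead of an error.
SELECT current_setting('myvars.never_set', true);

-- Case 2: set to a valid integer literal for the current transaction only.
BEGIN;
SET LOCAL myvars.user_id = '4';
SELECT current_setting('myvars.user_id', true)::int;
COMMIT;

-- Case 4: after the transaction the option is reset; as described above,
-- it now yields an empty string, so a direct cast to integer would fail.
SELECT current_setting('myvars.user_id', true) = '' AS is_empty;
```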
Obvious mistakes in your demo aside, you need to prepare for all 4 cases. So:
CREATE OR REPLACE FUNCTION add_transition1()
RETURNS trigger AS
$func$
DECLARE
_user_id text := current_setting('myvars.user_id', true); -- see 1.
BEGIN
IF _user_id ~ '^\d+$' THEN -- one or more digits?
INSERT INTO transitions1 (user_id, house_id)
VALUES (_user_id::int, NEW.id); -- valid int, cast is safe
ELSE
INSERT INTO transitions1 (user_id, house_id)
VALUES (NULL, NEW.id); -- use NULL instead
RAISE WARNING 'Invalid user_id % for house_id % was reset to NULL!'
, quote_literal(_user_id), NEW.id; -- optional
END IF;
RETURN NULL; -- OK for AFTER trigger
END
$func$ LANGUAGE plpgsql;
db<>fiddle here
Notes:
Avoid variable names that match column names. Very error prone. One popular naming convention is to prepend variable names with an underscore: _user_id.
Assign at declaration time to save one assignment. Note the data type text. We'll cast later, after sorting out invalid input.
Avoid raising / trapping an exception if possible. The manual:
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
Test for valid integer strings. This simple regular expression allows only digits (no leading sign, no white space): _user_id ~ '^\d+$'. I reset to NULL for any invalid input. Adapt to your needs.
I added an optional WARNING for your debugging convenience.
Cases 3. and 4. only arise because customized options are string literals (type text), so valid data types cannot be enforced automatically.
Related:
User defined variables in PostgreSQL
Is there a way to define a named constant in a PostgreSQL query?
All that aside, there may be more elegant solutions for what you are trying to do without customized options, depending on your exact requirements. Maybe this:
Fastest way to get current user's OID in Postgres?
It is not clear why you are trying to concat NULL to user_id but it is obviously the cause of the problem. Get rid of it:
CREATE OR REPLACE FUNCTION add_transition1() RETURNS TRIGGER AS $$
DECLARE
user_id integer;
BEGIN
user_id := current_setting('myvars.user_id')::integer;
INSERT INTO transitions1 (user_id, house_id) VALUES (user_id, NEW.id);
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
Note that
SELECT 55 || NULL
always gives NULL.
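If the intent behind || NULL was to fall back to NULL when the setting is absent, one possible way to express that is with the missing_ok parameter and NULLIF. A sketch, assuming PostgreSQL 9.6 or later; a non-numeric value would still raise an error on the cast:

```sql
-- missing_ok = true turns "never set" into NULL; NULLIF turns the reset value '' into NULL
SELECT NULLIF(current_setting('myvars.user_id', true), '')::integer AS user_id;
```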
You can catch the exception when the value doesn't exist. Here are the changes I made to get this to work:
CREATE OR REPLACE FUNCTION add_transition1() RETURNS TRIGGER AS $$
DECLARE
user_id integer;
BEGIN
BEGIN
user_id := current_setting('myvars.user_id')::integer;
EXCEPTION WHEN OTHERS THEN
user_id := 0;
END;
INSERT INTO transitions1 (user_id, house_id) VALUES (user_id, NEW.id);
RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION insert_house() RETURNS void as $$
DECLARE
user_id integer;
BEGIN
PERFORM set_config('myvars.user_id', '55', false);
INSERT INTO houses (name) VALUES ('HOUSE PARTY');
END; $$ LANGUAGE plpgsql;

Postgresql function: get id of updated or inserted row

I have this function in my PostgreSQL database that updates a row if it exists or inserts a new one if it doesn't:
CREATE OR REPLACE FUNCTION insert_or_update(val1 integer, val2 integer) RETURNS VOID AS $$
DECLARE
BEGIN
UPDATE my_table SET col2 = val2 WHERE col1 = val1;
IF NOT FOUND THEN
INSERT INTO my_table (col2) values ( val2 );
END IF;
END;
$$ LANGUAGE 'plpgsql';
For now it's working perfectly, but I want to get the id of the row that was updated or inserted.
How can I do it?
Your function is declared as returns void so it can't return anything.
Assuming col1 is the primary key and is also defined as a serial, you can do something like this:
CREATE OR REPLACE FUNCTION insert_or_update(val1 integer, val2 integer)
RETURNS int
AS $$
DECLARE
l_id integer;
BEGIN
l_id := val1; -- initialize the local variable.
UPDATE my_table
SET col2 = val2
WHERE col1 = val1; -- !! IMPORTANT: this assumes col1 is unique !!
IF NOT FOUND THEN
INSERT INTO my_table (col2) values ( val2 )
RETURNING col1 -- this makes the generated value available
into l_id; -- and this stores it in the local variable
END IF;
return l_id; -- return whichever was used.
END;
$$ LANGUAGE plpgsql;
I changed four things compared to your function:
the function is declared as returns integer in order to be able to return something
you need a variable where you can store the returned value from the insert statement
the generated value needs to be returned at the end of the function
the language name is an identifier, so it should not be quoted using single quotes
If you want to distinguish between an update or an insert from the caller, you could initialize l_id to null. In that case the function will return null if an update occurred and some value otherwise.
You can get the last inserted id using currval('sequence_name_of_table').
But the best way is always to use the INSERT or UPDATE queries with a RETURNING clause.
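CREATE OR REPLACE FUNCTION insert_or_update(val1 integer, val2 integer) RETURNS integer AS $$
DECLARE
result_id integer; -- holds the id produced by whichever statement ran
BEGIN
-- In PL/pgSQL, RETURNING needs an INTO target to receive the value
UPDATE my_table SET col2 = val2 WHERE col1 = val1 RETURNING col1 INTO result_id;
IF NOT FOUND THEN
INSERT INTO my_table (col2) values ( val2 ) RETURNING col1 INTO result_id;
END IF;
RETURN result_id;
END;
$$ LANGUAGE plpgsql;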
You can refer to the following examples:
Insert Command - Last Example
Postgres with RETURNING clause
Note: In your UPDATE query, the WHERE clause is col1 = val1. I assume that val1 is a unique value; otherwise multiple records will be updated. I also assume col1 is your primary key, such as an id.
The PostgreSQL wiki's entry on UPSERT describes INSERT ... ON CONFLICT DO UPDATE, which was added in PostgreSQL 9.5. It allows you to express the operation you want more directly, without resorting to a stored procedure and/or introducing race conditions.
This operation is otherwise surprisingly tricky to express in earlier PostgreSQL versions without the risk of database corruption and/or a race condition. The code fragments posted so far all contain an error in that if two callers happen to want to upsert the same nonexistent row, the initial UPDATE will update zero rows and then they will both attempt an INSERT, one of which will fail. It should at least fail safe, aborting the query and any transaction in progress.
The PostgreSQL documentation on INSERT (search on that page for the text "Attempt to insert a new stock item along with the quantity of stock") shows how to do it safely and correctly on PostgreSQL 9.4 and earlier. Of particular note is that it tries the INSERT first to avoid any races on that front, and if that fails, does an UPDATE of the row it now knows exists. It uses a SAVEPOINT to ensure that a failed INSERT does not abort the transaction.
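On 9.5 and later, the id can come straight from the statement, because RETURNING also works with ON CONFLICT. A minimal sketch, assuming my_table(col1, col2) with a unique or primary key constraint on col1 as in the answers above; the literal values are placeholders:

```sql
INSERT INTO my_table (col1, col2)
VALUES (1, 42)
ON CONFLICT (col1) DO UPDATE
    SET col2 = EXCLUDED.col2
RETURNING col1;  -- returned for both the inserted and the updated case
```

Unlike the functions above, this supplies col1 explicitly, so a serial default is not used for the inserted row.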

Postgresql, update if row with some unique value exists, else insert

I have a URLs table. It contains
(id int primary key,
url character varying unique,
content character varying,
last analyzed date).
I want to create a trigger or something similar (maybe a rule), so that each time I do an insert from my Java program, it updates the existing row if a row with that URL exists. Otherwise it should perform an insert.
Could you please provide complete code for PostgreSQL? Thanks.
This has been asked many times. A possible solution can be found here:
https://stackoverflow.com/a/6527838/552671
This solution requires both an UPDATE and INSERT.
UPDATE table SET field='C', field2='Z' WHERE id=3;
INSERT INTO table (id, field, field2)
SELECT 3, 'C', 'Z'
WHERE NOT EXISTS (SELECT 1 FROM table WHERE id=3);
With Postgres 9.1 it is possible to do it with one query:
https://stackoverflow.com/a/1109198/2873507
If INSERTs are rare, I would avoid doing a NOT EXISTS (...) since it emits a SELECT on all updates. Instead, take a look at wildpeaks' answer: https://dba.stackexchange.com/questions/5815/how-can-i-insert-if-key-not-exist-with-postgresql
CREATE OR REPLACE FUNCTION upsert_tableName(arg1 type, arg2 type) RETURNS VOID AS $$
DECLARE
BEGIN
UPDATE tableName SET col1 = value WHERE colX = arg1 and colY = arg2;
IF NOT FOUND THEN
INSERT INTO tableName values (value, arg1, arg2);
END IF;
END;
$$ LANGUAGE 'plpgsql';
This way Postgres will initially try to do an UPDATE. If no rows were affected, it will fall back to emitting an INSERT.
I found this post more relevant in this scenario:
WITH upsert AS (
UPDATE spider_count SET tally=tally+1
WHERE date='today' AND spider='Googlebot'
RETURNING *
)
INSERT INTO spider_count (spider, tally)
SELECT 'Googlebot', 1
WHERE NOT EXISTS (SELECT * FROM upsert);
First it tries an insert. If there is a conflict on the url column, it updates the content and last_analyzed fields instead. If updates are rare, this might be the better option.
INSERT INTO URLs (url, content, last_analyzed)
VALUES
(
%(url)s,
%(content)s,
NOW()
)
ON CONFLICT (url)
DO
UPDATE
SET content=%(content)s, last_analyzed = NOW();
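The same statement can avoid repeating the parameters by referring to the proposed row through EXCLUDED. A sketch with literal values in place of the %(...)s driver placeholders, assuming a urls table with a unique constraint on url:

```sql
INSERT INTO urls (url, content, last_analyzed)
VALUES ('http://example.com/', 'page content', now())
ON CONFLICT (url) DO UPDATE
    SET content = EXCLUDED.content,
        last_analyzed = EXCLUDED.last_analyzed;
```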
create table urls (
url_id serial primary key,
url text unique,
content text,
last_analyzed timestamptz);
insert into urls(url) values('hello'),
('How'),('are'),
('you'),('doing');
You can also do the upsert by creating a procedure (CREATE PROCEDURE requires PostgreSQL 11 or later).
CREATE OR REPLACE PROCEDURE upsert_url(_url text) LANGUAGE plpgsql
as $$
BEGIN
INSERT INTO URLs (url) values (_url)
ON CONFLICT (url)
DO UPDATE SET last_analyzed = NOW();
END
$$;
Test it by calling the procedure.
call upsert_url('I am is ok');
call upsert_url('hello');