PostgreSQL Stored Procedures: Using id from one statement in another

Is it possible to create a stored procedure with two insert statements, where the id/primary key generated by the first insert is used in the second?
For example:
INSERT INTO activity VALUES (DEFAULT, 'text', 'this is a test');
If the id returned from the above statement is x, the second insert will be like:
INSERT INTO activity_tree VALUES (DEFAULT, **x**, user_id) or something like that.
I understand that libpq has functions which can give the id from the first statement.
But I want to combine them into a stored procedure. Please advise.
Regards,
Mayank

Declare a variable, e.g. new_id, and then you can store the generated id in there:
INSERT INTO activity VALUES (DEFAULT, 'text', 'this is a test')
RETURNING id INTO new_id;
INSERT INTO activity_tree VALUES (DEFAULT, new_id, user_id);
This assumes the column where the value is generated is called id.
Btw: do not use "unqualified" insert statements. Always specify the columns in the INSERT part. That makes your code much more stable, because it keeps working when columns are added or reordered:
INSERT INTO activity (id, some_column, other_column)
VALUES
(DEFAULT, 'text', 'this is a test')
And make sure you do not give variables the same names as columns. user_id looks like a potential naming conflict here. This will compile with 8.x but might give you strange errors. I think this will no longer compile with 9.x.
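Putting the pieces together, a complete function might look like the sketch below. The table definitions, the column names (activity_type, description, activity_id) and the function name add_activity are assumptions, since the question does not show the schema. The parameter is prefixed with p_ to sidestep the naming conflict described above.

```sql
-- Hypothetical schema; the real tables are not shown in the question.
CREATE TABLE activity (
    id            serial PRIMARY KEY,
    activity_type text,
    description   text
);

CREATE TABLE activity_tree (
    id          serial PRIMARY KEY,
    activity_id integer REFERENCES activity (id),
    user_id     integer
);

-- add_activity is an illustrative name, not something from the question.
CREATE FUNCTION add_activity(p_user_id integer)
RETURNS integer AS
$$
DECLARE
    new_id integer;  -- receives the id generated by the first INSERT
BEGIN
    INSERT INTO activity (activity_type, description)
    VALUES ('text', 'this is a test')
    RETURNING id INTO new_id;

    -- Reuse the generated id in the second INSERT.
    INSERT INTO activity_tree (activity_id, user_id)
    VALUES (new_id, p_user_id);

    RETURN new_id;
END;
$$
LANGUAGE plpgsql;

-- Usage: inserts both rows and returns the new activity id.
SELECT add_activity(7);
```

Both inserts run inside the transaction of the calling statement, so either both rows are stored or neither is.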

Related

Good Pattern for executing SELECT after INSERT with Amazon Aurora Postgres?

Imagine you have an Amazon Aurora Postgres DB. You perform an INSERT into one table. You then need to do a SELECT to get the auto-generated CompanyId of the newly added record. You determine that there is often a significant delay between when the INSERT occurs and when the record is available to run the SELECT on.
I've discussed with my colleagues some possible code patterns to best handle this lag time. What, in your opinion, is the best approach?
You don't need a separate SELECT statement. The best and most efficient option is to just use the returning clause:
insert into some_table (c1, c2, c3)
values (...)
returning *;
Instead of returning * you can also specify the column you want, e.g.: returning company_id
Another option is to use currval() or lastval() after the insert to get the value of the sequence directly:
insert into some_table (..)
values (...);
select lastval();
The usage of lastval() requires that no other value is generated by a different sequence between the INSERT and the SELECT. If you can't guarantee that, use currval() and specify the name of the sequence:
insert into some_table (...)
values (...);
select currval('some_table_company_id_seq');
If you want to avoid hardcoding the sequence name, use pg_get_serial_sequence():
select currval(pg_get_serial_sequence('some_table', 'company_id'));
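As a concrete sketch of the returning approach: the clause also works inside a data-modifying CTE (PostgreSQL 9.1 or later), so the generated key can feed a second insert without a round trip. The tables company and company_contact and their columns are assumptions made up for this illustration.

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE company (
    company_id serial PRIMARY KEY,
    name       text NOT NULL
);

CREATE TABLE company_contact (
    company_id integer REFERENCES company (company_id),
    email      text
);

-- Insert the company, then use its generated key in the same statement.
WITH new_company AS (
    INSERT INTO company (name)
    VALUES ('Acme')
    RETURNING company_id
)
INSERT INTO company_contact (company_id, email)
SELECT company_id, 'info@example.com'
FROM new_company
RETURNING company_id;
```

Because the key comes straight back from the INSERT, there is no second query that could run before the new row is visible.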

INSERT INTO, return number of rows inserted [duplicate]

My database driver for PostgreSQL 8/9 does not return a count of records affected when executing INSERT or UPDATE.
PostgreSQL offers the non-standard syntax "RETURNING" which seems like a good workaround. But what might be the syntax? The example returns the ID of a record, but I need a count.
INSERT INTO distributors (did, dname) VALUES (DEFAULT, 'XYZ Widgets')
RETURNING did;
I know this question is oooolllllld and my solution is arguably overly complex, but that's my favorite kind of solution!
Anyway, I had to do the same thing and got it working like this:
-- Get count from INSERT
WITH rows AS (
INSERT INTO distributors
(did, dname)
VALUES
(DEFAULT, 'XYZ Widgets'),
(DEFAULT, 'ABC Widgets')
RETURNING 1
)
SELECT count(*) FROM rows;
-- Get count from UPDATE
WITH rows AS (
UPDATE distributors
SET dname = 'JKL Widgets'
WHERE did <= 10
RETURNING 1
)
SELECT count(*) FROM rows;
One of these days I really have to get around to writing a love sonnet to PostgreSQL's WITH clause ...
I agree w/ Milen, your driver should do this for you. What driver are you using and for what language? But if you are using plpgsql, you can use GET DIAGNOSTICS my_var = ROW_COUNT;
http://www.postgresql.org/docs/current/static/plpgsql-statements.html#PLPGSQL-STATEMENTS-DIAGNOSTICS
You can take ROW_COUNT after update or insert with this code:
insert into distributors (did, dname) values (DEFAULT, 'XYZ Widgets');
get diagnostics v_cnt = row_count;
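GET DIAGNOSTICS is only available inside PL/pgSQL, so the two lines above need a surrounding function or DO block. A minimal sketch, assuming the distributors table from the question; the function name rename_distributors and its parameters are made up for this example:

```sql
-- Assumes a table distributors(did, dname) as in the question.
CREATE FUNCTION rename_distributors(p_new_name text, p_max_did integer)
RETURNS integer AS
$$
DECLARE
    v_cnt integer;  -- number of rows touched by the last statement
BEGIN
    UPDATE distributors
    SET dname = p_new_name
    WHERE did <= p_max_did;

    -- Must directly follow the statement whose count is wanted.
    GET DIAGNOSTICS v_cnt = ROW_COUNT;
    RETURN v_cnt;
END;
$$
LANGUAGE plpgsql;

-- Usage: returns the number of updated rows.
SELECT rename_distributors('JKL Widgets', 10);
```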
It's not clear from your question how you're calling the statement. Assuming you're using something like JDBC you may be calling it as a query rather than an update. From JDBC's executeQuery:
Executes the given SQL statement, which returns a single ResultSet
object.
This is therefore appropriate when you execute a statement that returns some query results, such as SELECT or INSERT ... RETURNING. If you are making an update to the database and then want to know how many tuples were affected, you need to use executeUpdate which returns:
either (1) the row count for SQL Data Manipulation Language (DML)
statements or (2) 0 for SQL statements that return nothing
You could wrap your query in a transaction and it should show you the count before you ROLLBACK or COMMIT. Example:
BEGIN TRANSACTION;
INSERT .... ;
ROLLBACK TRANSACTION;
If you run the first 2 lines of the above, it should give you the count. You can then ROLLBACK (undo) the insert if you find that the number of affected lines isn't what you expected. If you're satisfied that the INSERT is correct, then you can run the same thing, but replace line 3 with COMMIT TRANSACTION;.
Important note: After you run any BEGIN TRANSACTION; you must either ROLLBACK; or COMMIT; the transaction, otherwise the transaction will create a lock that can slow down or even cripple an entire system, if you're running on a production environment.

Insert a result from a stored procedure in postgresql

I'm trying to understand how to deal with procedures in Postgresql.
I get the idea of creating a function that returns a variable. What I don't get is how I can use such variable, for instance, in an insert.
Imagine this, I have a function called getName(), that returns a variable $name$.
What I want is to insert such variable in another table... How can I do this?
If the function returns a single value, you can use it anywhere a constant could be used.
insert into some_table (id, name)
values (42, get_name());
this is the same as using a built-in function:
insert into some_table (id, modified_at)
values (42, now());
It can be used the same way in an update statement
update some_table
set name = get_name()
where id = 42;
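For completeness, here is a sketch that defines such a function and uses its result both directly and through a PL/pgSQL variable. The table definition and the body of get_name() are assumptions; the question does not show either.

```sql
-- Hypothetical table and function body for illustration.
CREATE TABLE some_table (
    id   integer PRIMARY KEY,
    name text
);

CREATE FUNCTION get_name() RETURNS text AS
$$
BEGIN
    RETURN 'Arthur';
END;
$$
LANGUAGE plpgsql;

-- Use the function result directly in the INSERT.
INSERT INTO some_table (id, name)
VALUES (42, get_name());

-- Or store it in a variable first, inside a PL/pgSQL block.
DO
$$
DECLARE
    v_name text;
BEGIN
    v_name := get_name();
    INSERT INTO some_table (id, name)
    VALUES (43, v_name);
END
$$;
```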

Composing SQL insert statement into another insert statement

I know that in T-SQL (SQL Server 2008 R2) I can use the 'Output' keyword to get the Id of a row I just inserted. For example, I can do
insert into [Membership].[dbo].[User] (EmailAddress)
output Inserted.UserId
values('testUser1@test.com')
Is there any way of composing this into another insert? For example, let's say I want to add a new user and immediately add that user to a UserRole table which maps the UserId to a RoleId.
Basically, I would like to do something like below.
insert into UserRole (RoleId, UserId)
values
(
1,
insert into [Membership].[dbo].[User] (EmailAddress)
output Inserted.UserId values('testUser1@test.com')
)
But I can't seem to get this to work. I tried wrapping the internal insert in brackets () or using a select * from () etc.
What am I missing? Is this composition even possible?
Thanks for the help.
Regards,
You would have to capture the output into a table variable:
DECLARE @TempVar TABLE (UserID INT)
insert into [Membership].[dbo].[User] (EmailAddress)
output Inserted.UserId INTO @TempVar(UserID)
values('testUser1@test.com')
and then in a second step do an insert from that table variable into the target table:
INSERT INTO dbo.UserRole (RoleId, UserId)
SELECT
(yourRoleId), tv.UserID
FROM @TempVar tv
You could also direct the OUTPUT clause directly into the target table - that'll work if you can e.g. use a fixed value for your RoleID:
DECLARE @FixedRoleID INT = 42
INSERT INTO [Membership].[dbo].[User] (EmailAddress)
OUTPUT @FixedRoleID, Inserted.UserId INTO dbo.UserRole(RoleId, UserId)
VALUES ('testUser1@test.com')
Another solution is to use triggers:
http://msdn.microsoft.com/en-us/library/ms189799.aspx
and pay attention to "after" insert triggers:
FOR | AFTER: AFTER specifies that the DML trigger is fired only when
all operations specified in the triggering SQL statement have executed
successfully. All referential cascade actions and constraint checks
also must succeed before this trigger fires.
AFTER is the default when FOR is the only keyword specified.
AFTER triggers cannot be defined on views.

how to emulate "insert ignore" and "on duplicate key update" (sql merge) with postgresql?

Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/NOTHING clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
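To map this back to the two MySQL features in the question, here is a small sketch. The table definition is an assumption; what matters is that ON CONFLICT (username) needs a unique index or constraint on that column to work.

```sql
-- Hypothetical definition; the primary key supplies the unique index
-- that ON CONFLICT (username) relies on.
CREATE TABLE user_logins (
    username text PRIMARY KEY,
    logins   integer NOT NULL DEFAULT 0
);

-- INSERT IGNORE equivalent: skip rows whose key already exists.
INSERT INTO user_logins (username, logins)
VALUES ('Naomi', 1)
ON CONFLICT (username) DO NOTHING;

-- ON DUPLICATE KEY UPDATE equivalent: EXCLUDED is the row that was
-- proposed for insertion, user_logins is the row already stored.
INSERT INTO user_logins (username, logins)
VALUES ('Naomi', 1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins
RETURNING username, logins;
```

Run in this order against the empty table, the first statement inserts Naomi with one login and the second one hits the conflict and raises the count to two.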
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this left join (an anti-join that keeps only source rows without a match in the target) will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a loop for the very rare race condition in that thinking.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single-row or few-row values. If you're dealing with a large number of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course - no need to write your main filter twice).
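The update-then-insert loop described above can be sketched as follows. This is modeled on the example in the PostgreSQL documentation that the answer links to; the table db and the function merge_db are illustrative names, so treat it as a sketch rather than a drop-in solution.

```sql
CREATE TABLE db (
    a integer PRIMARY KEY,
    b text
);

CREATE FUNCTION merge_db(p_key integer, p_data text)
RETURNS void AS
$$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE db SET b = p_data WHERE a = p_key;
        IF FOUND THEN
            RETURN;
        END IF;

        -- No row yet, so try to insert it. If another session inserts
        -- the same key at the same moment, the INSERT fails with
        -- unique_violation and the loop retries the UPDATE.
        BEGIN
            INSERT INTO db (a, b) VALUES (p_key, p_data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- do nothing, loop again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

-- Usage: the first call inserts, the second call updates.
SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');
```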
To get the insert ignore logic you can do something like below. I found simply inserting from a select statement of literal values worked best, then you can mask out the duplicate keys with a NOT EXISTS clause. To get the update on duplicate logic I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
);
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1');
As @hanmari mentioned in his comment, when inserting into a Postgres table, ON CONFLICT (...) DO NOTHING is the best way to avoid inserting duplicate data:
query = ("INSERT INTO db_table_name(column_name) "
         "VALUES(%s) ON CONFLICT (column_name) DO NOTHING;")
With the ON CONFLICT clause, the statement skips rows that would violate the unique constraint and still inserts the rest. The query above comes from code that loads data from an Excel file into a Postgres table.
I add constraints to the Postgres table to make sure the ID field is unique. Instead of deleting duplicate rows afterwards, I add a line of SQL that restarts the sequence behind the ID column at 1. Note that this resets the counter used for new rows; it does not renumber existing rows.
Example (the sequence name assumes the default table_column_seq naming of a serial column):
q = 'ALTER SEQUENCE db_table_name_id_column_seq RESTART WITH 1'
If my data has its own ID field, I do not use it as the primary/serial ID. I create a separate ID column and define it as serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.
This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
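The BEGIN ... EXCEPTION block above is PL/pgSQL, so it has to live inside a function or a DO block to run. A minimal runnable sketch, assuming a table definition for tableA that the answer does not show:

```sql
-- Hypothetical definition; the primary key is what raises unique_violation.
CREATE TABLE tableA (
    unique_column integer PRIMARY KEY,
    c2            integer,
    c3            integer
);

DO
$$
BEGIN
    INSERT INTO tableA (unique_column, c2, c3) VALUES (1, 2, 3);
EXCEPTION
    WHEN unique_violation THEN
        -- The key already exists, so update the stored row instead.
        UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END
$$;
```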
For bulk loads, you can always delete the row before the insert. Deleting a row that doesn't exist doesn't cause an error, so it's safely skipped.
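A sketch of that delete-then-insert approach, assuming the new rows sit in a staging table first. The names target_table, staging_table and the columns id and payload are hypothetical.

```sql
-- Hypothetical tables for illustration.
CREATE TABLE target_table (
    id      integer PRIMARY KEY,
    payload text
);

CREATE TABLE staging_table (
    id      integer PRIMARY KEY,
    payload text
);

BEGIN;

-- Remove rows that are about to be replaced; ids that are missing from
-- target_table simply match nothing.
DELETE FROM target_table t
USING staging_table s
WHERE t.id = s.id;

INSERT INTO target_table (id, payload)
SELECT id, payload
FROM staging_table;

COMMIT;
```

Running both statements in one transaction keeps other sessions from seeing the table with the old rows deleted but the new ones not yet inserted. It does not protect against a concurrent session inserting the same key between the two statements.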
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;