How do I do this in T-SQL? (How to UPDATE and SELECT at the same time)
The T-SQL equivalent of the answer in the linked question would be something like:
UPDATE [table]
SET foo=1
OUTPUT INSERTED.*, DELETED.*
WHERE boo=2
In an UPDATE statement you can use INSERTED to get the "after" values and DELETED to get the "before" values.
You're looking for the OUTPUT clause.
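If you want to work with the changed rows afterwards instead of just returning them to the client, OUTPUT can also write into a table variable. A minimal sketch, reusing the hypothetical table and columns from the snippet above:
DECLARE @changes TABLE (foo_before INT, foo_after INT);

UPDATE [table]
SET foo = 1
OUTPUT DELETED.foo, INSERTED.foo INTO @changes (foo_before, foo_after)
WHERE boo = 2;

-- the before/after values are now queryable
SELECT * FROM @changes;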
I'm using PostgreSQL with a Go driver. Sometimes I need to query columns that may not exist, just to check whether something exists in the DB. Before querying, I can't tell whether a given column exists. Example:
where size=10 or length=10
By default I get the error column "length" does not exist; however, the size column could exist and I could get some results.
Is it possible to handle such cases to return what is possible?
EDIT:
Yes, I could get all the existing columns first. But the initial queries can be rather complex and are not created by me directly; I can only modify them.
That means the query can be simple like the previous example and can be much more complex like this:
WHERE size=10 OR (length=10 AND n='example') OR (c BETWEEN 1 and 5 AND p='Mars')
If the missing columns are length and c, does that mean I have to parse the SQL, split it on OR (or other operators), check every part of the query, remove any part with missing columns, and in the end generate a new SQL query?
Any easier way?
I would try to check the information schema first:
select column_name from INFORMATION_SCHEMA.COLUMNS where table_name = 'table_name';
and then build the query based on the result.
Why don't you get a list of columns that are in the table first? Like this
select column_name
from information_schema.columns
where table_name = 'table_name' and (column_name = 'size' or column_name = 'length');
The result will be the columns that exist.
There is no way to do what you want, except for constructing an SQL string from the list of available columns, which can be obtained by querying information_schema.columns.
SQL statements are parsed before they are executed, and there is no conditional compilation or short-circuiting, so you get an error if a non-existent column is referenced.
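To illustrate what that string construction could look like (normally you would build the string in the Go code, but the idea is the same), here is a hedged sketch against a hypothetical table called things, keeping only the predicates whose columns actually exist:
DO $$
DECLARE
    cond text := '';
    n    bigint;
BEGIN
    -- keep only the predicates whose columns really exist
    IF EXISTS (SELECT 1 FROM information_schema.columns
               WHERE table_name = 'things' AND column_name = 'size') THEN
        cond := cond || ' OR size = 10';
    END IF;
    IF EXISTS (SELECT 1 FROM information_schema.columns
               WHERE table_name = 'things' AND column_name = 'length') THEN
        cond := cond || ' OR length = 10';
    END IF;

    IF cond <> '' THEN
        -- strip the leading ' OR ' and run the surviving query
        EXECUTE 'SELECT count(*) FROM things WHERE ' || right(cond, -4) INTO n;
        RAISE NOTICE 'matches: %', n;
    END IF;
END $$;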
I would like to execute an update similar to this:
UPDATE dummy
SET customer=subquery.customer,
address=subquery.address,
partn=subquery.partn
FROM (SELECT address_id, customer, address, partn
FROM dummy) AS subquery
WHERE dummy.address_id=subquery.address_id;
Taken from this answer: https://stackoverflow.com/a/6258586/411965
I found this and wondered if it can be auto-converted to jOOQ's fluent syntax.
What is the equivalent jOOQ query? Specifically, how do I reference the subquery in the outer WHERE clause?
Assuming you're using code generation, do it like this:
Table<?> subquery = table(
select(
DUMMY.ADDRESS_ID,
DUMMY.CUSTOMER,
DUMMY.ADDRESS,
DUMMY.PARTN
)
.from(DUMMY)
).as("subquery");
ctx.update(DUMMY)
.set(DUMMY.CUSTOMER, subquery.field(DUMMY.CUSTOMER))
.set(DUMMY.ADDRESS, subquery.field(DUMMY.ADDRESS))
.set(DUMMY.PARTN, subquery.field(DUMMY.PARTN))
.from(subquery)
.where(DUMMY.ADDRESS_ID.eq(subquery.field(DUMMY.ADDRESS_ID)))
.execute();
Of course, the query makes no sense as it is, because you're just going to touch every row without modifying it, but since you copied the SQL from another answer, I'm assuming your subquery's dummy table is really something else.
This has been asked multiple times here and here, but none of the answers are suitable in my case because I do not want to execute my update statement in a PL/PgSQL function and use GET DIAGNOSTICS integer_var = ROW_COUNT.
I have to do this in raw SQL.
For instance, in MS SQL Server we have @@ROWCOUNT, which could be used like the following:
UPDATE <target_table>
SET Property0 = Value0
WHERE <predicate>;
SELECT <computed_value_columns>
FROM <target_table>
WHERE @@ROWCOUNT > 0;
In one round trip to the database I know whether the update was successful and get the calculated values back.
What could be used instead of @@ROWCOUNT?
Can someone confirm that this is in fact impossible at this time?
Thanks in advance.
EDIT 1: I confirm that I need to use raw SQL (I wrote "raw plpgsql" in the original description).
In an attempt to make my question clearer, please consider that the update statement affects only one row, and think about optimistic concurrency:
The client did a SELECT statement first.
He builds the UPDATE and knows which database-computed columns are to be included in the SELECT clause. Among other things, the predicate includes a timestamp that is recomputed each time the row is updated.
So, if one row is returned, then everything is OK. If no row is returned, then we know that there was a previous update and the client may need to refresh the data before trying the update again. This is why we need to know how many rows were affected by the update statement before returning computed columns. No row should be returned if the update fails.
What you want is not currently possible in the form that you describe, but I think you can do what you want with UPDATE ... RETURNING. See UPDATE ... RETURNING in the manual.
UPDATE <target_table>
SET Property0 = Value0
WHERE <predicate>
RETURNING Property0;
It's hard to be sure, since the example you've provided is so abstract as to be somewhat meaningless.
You can also use a wCTE, which allows more complex cases:
WITH updated_rows AS (
UPDATE <target_table>
SET Property0 = Value0
WHERE <predicate>
RETURNING row_id, Property0
)
SELECT row_id, some_computed_value_from_property
FROM updated_rows;
See common table expressions (WITH queries) and depesz's article on wCTEs.
UPDATE: based on the added detail in the question, here's a demo using UPDATE ... RETURNING:
CREATE TABLE upret_demo(
id serial primary key,
somecol text not null,
last_updated timestamptz
);
INSERT INTO upret_demo (somecol, last_updated) VALUES ('blah',current_timestamp);
UPDATE upret_demo
SET
somecol = 'newvalue',
last_updated = current_timestamp
WHERE last_updated = '2012-12-03 19:36:15.045159+08' -- Change to your timestamp
RETURNING
somecol || '_computed' AS a,
'totally_new_computed_column' AS b;
Output when run the 1st time:
a | b
-------------------+-----------------------------
newvalue_computed | totally_new_computed_column
(1 row)
When run again, it'll have no effect and return no rows.
If you have more complex calculations to do in the result set, you can use a wCTE so you can JOIN on the results of the update and do other complex things.
WITH upd_row AS (
UPDATE upret_demo SET
somecol = 'newvalue',
last_updated = current_timestamp
WHERE last_updated = '2012-12-03 19:36:15.045159+08'
RETURNING id, somecol, last_updated
)
SELECT
'row_'||id||'_'||somecol||', updated '||last_updated AS calc1,
repeat('x',4) AS calc2
FROM upd_row;
In other words: Use UPDATE ... RETURNING, either directly to produce the calculated rows, or in a writeable CTE for more complex cases.
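And if all you really need in plain SQL is the affected-row count itself (the closest thing to @@ROWCOUNT in a single round trip), a hedged sketch reusing the upret_demo table above:
WITH upd AS (
    UPDATE upret_demo
    SET somecol = 'newvalue',
        last_updated = current_timestamp
    WHERE last_updated = '2012-12-03 19:36:15.045159+08' -- Change to your timestamp
    RETURNING 1
)
SELECT count(*) AS rows_updated FROM upd;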
Generally the answer to this question depends on the driver used.
The PQcmdTuples() function does what is needed if the application uses libpq. Libraries built on top of libpq need a wrapper around this function.
For JDBC, the Statement.executeUpdate() method does the job.
ODBC provides the SQLRowCount() function for a similar purpose.
Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?
With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;
Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.
For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this left join (anti-join) will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;
Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a retry loop to handle the very rare race condition between the UPDATE and the INSERT.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single-row, or few-row, values. If you're dealing with large numbers of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (with an appropriate join/subselect of course - no need to write your main filter twice).
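For reference, the loop-in-a-function approach from the linked example 40-2 boils down to something like this (a sketch closely following the manual's toy example, with its table db and columns a and b):
CREATE TABLE db (a INT PRIMARY KEY, b TEXT);

CREATE FUNCTION merge_db(key INT, data TEXT) RETURNS VOID AS
$$
BEGIN
    LOOP
        -- first try to update the key
        UPDATE db SET b = data WHERE a = key;
        IF found THEN
            RETURN;
        END IF;
        -- not there, so try to insert the key;
        -- if someone else inserts the same key concurrently,
        -- we get a unique-key failure and loop to retry the UPDATE
        BEGIN
            INSERT INTO db (a, b) VALUES (key, data);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$
LANGUAGE plpgsql;

SELECT merge_db(1, 'david');
SELECT merge_db(1, 'dennis');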
To get the insert-ignore logic you can do something like below. I found that simply inserting from a SELECT statement of literal values worked best; you can then mask out the duplicate keys with a NOT EXISTS clause. To get the on-duplicate-update logic, I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)
INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
As #hanmari mentioned in his comment, when inserting into a Postgres table, ON CONFLICT (..) DO NOTHING is the best way to avoid inserting duplicate data:
query = """INSERT INTO db_table_name (column_name)
           VALUES (%s) ON CONFLICT (column_name) DO NOTHING;"""
The ON CONFLICT line allows the insert statement to still insert the non-conflicting rows of data. The query-and-values code is an example of inserting data from an Excel sheet into a Postgres DB table.
I have constraints on a Postgres table to make sure the ID field is unique. Instead of running a delete on the duplicate rows of data, I add a line of SQL that restarts the ID column's sequence at 1.
Example (the sequence name depends on your table and column):
q = "ALTER SEQUENCE db_table_name_id_column_seq RESTART WITH 1"
If my data has an ID field, I do not use it as the primary/serial ID; I create a separate ID column and set it to serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.
Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it DO NOTHING if a row with the given primary key value already exists, or making it do an UPDATE instead of the INSERT in that case.
I haven't tried this myself, so I can't speak from experience or offer an example.
This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.
For bulk loads, you can always delete the rows before the insert. Deleting a row that doesn't exist doesn't cause an error, so it is safely skipped.
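A hedged sketch of that delete-then-insert approach for a bulk load (target_table, staging_table and id are placeholder names):
BEGIN;

-- remove any rows that would collide with the incoming data
DELETE FROM target_table t
USING staging_table s
WHERE t.id = s.id;

INSERT INTO target_table
SELECT * FROM staging_table;

COMMIT;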
For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;
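For illustration, a filled-in variant of the same pattern with hypothetical names (a table accounts with a name column), inserting the row only if it is not already there:
DO
$do$
BEGIN
   PERFORM 1
   FROM accounts
   WHERE name = 'acme';

   IF NOT FOUND THEN
      INSERT INTO accounts (name) VALUES ('acme');
   END IF;
END
$do$;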
In a similar vein to my previous question I again ask the SO guys for your collective wisdom and help.
In a stored procedure, and after passing some checks, I need to insert a new row and return the newly created ID for it. The check for whether a row exists works, so it is the bit after that which I am undecided on.
The table has two important columns: the LocationID and the CaseID. The CaseID is auto-incrementing, so when you insert a new LocationID it will automatically ratchet up.
I currently have this:
-- previous checks for existence of CaseID
IF @CaseID IS NULL
BEGIN
INSERT INTO
Cases(LocationID)
VALUES
(@LocationID)
-- what now?
END
I was thinking of performing a @CaseID = (SELECT blah) statement immediately after, but I was wondering if there is a better way?
Is there a better way? How would you do this?
SELECT @CaseID = SCOPE_IDENTITY()
In fact, you can just do (if that's the end of the stored proc.):
SELECT SCOPE_IDENTITY()
(The OUTPUT clause is only available in SQL Server 2005 onwards...)
Ref: SCOPE_IDENTITY
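Put back into the question's block, that looks roughly like this (a sketch; the variable and table names are the ones from the question):
-- previous checks for existence of CaseID
IF @CaseID IS NULL
BEGIN
    INSERT INTO Cases (LocationID)
    VALUES (@LocationID);

    SET @CaseID = SCOPE_IDENTITY();
END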
scope_identity()
SELECT SCOPE_IDENTITY()
As others mentioned, SCOPE_IDENTITY() is the way to go, though some ORM tools provide this functionality as well.
The only thing you need to remember is that SCOPE_IDENTITY() returns the last identity value generated in the current session and scope only. This is useful for filtering out keys that were created simultaneously by other connections, or by triggers fired as a side effect of your insert. SELECT @@IDENTITY, by contrast, returns the last key generated in the current session in any scope, so it can pick up a value generated by a trigger rather than by your own INSERT.
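To illustrate the difference with a hedged, hypothetical example: suppose a trigger on Cases inserts into an audit table that also has an identity column.
INSERT INTO Cases (LocationID) VALUES (42);

SELECT SCOPE_IDENTITY() AS case_id,  -- identity generated in this scope (Cases)
       @@IDENTITY       AS last_id;  -- last identity in this session, any scope
                                     -- (could be the audit table's, via the trigger)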
You need to use the OUTPUT clause
http://blog.jemm.net/articles/databases/how-to-using-sql-server-2005s-output-to-return-generated-identity/
...which, as pointed out, is only available in SQL Server 2005 onwards. Please disregard.