UPSERT from table with different table sizes

I'm getting this error on UPSERT:
ERROR: column "some_col_name" does not exist
Hint: There is a column named "some_col_name" in table "usert_test", but it cannot be referenced from this part of the query.
The cause of this error is that the source table (read in from an API) doesn't always have the same number of fields as the table I'm trying to UPSERT into. Is there a way to handle this within the UPSERT process? So far I've tried the following:
INSERT INTO scratch."usert_test" (many_cols)
SELECT *
FROM scratch.daily_scraper
ON CONFLICT (same_unique_id)
DO UPDATE
SET
many_fields = excluded.many_fields;

Name each column explicitly in every part of the statement.
insert into scratch."usert_test" (column_name1, column_name2, column_name3, column_name4)
select cola, colb, colc, colf
from scratch.daily_scraper
on conflict (column_name1, column_name4)
do update
set
column_name3 = excluded.column_name3
, column_name2 = excluded.column_name2;
However many columns you have, name every one explicitly, as (IMHO) you should always do.
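
When the source's column set varies from run to run, one option is to generate the statement from the columns the two tables share. The following is a minimal Python sketch of that idea and is not part of the original answer: build_upsert and its parameters are hypothetical names, the column lists are assumed to be known already (for instance read from information_schema.columns), and identifiers are interpolated as-is, so they must come from a trusted source.

```python
def build_upsert(target, source, target_cols, source_cols, key_cols):
    """Build an INSERT ... SELECT ... ON CONFLICT statement (PostgreSQL syntax)
    that names only the columns present in both tables.

    Hypothetical helper: it assumes target and source columns share names.
    """
    source_set = set(source_cols)
    # Keep the target's column order, dropping columns the source lacks.
    shared = [c for c in target_cols if c in source_set]

    missing_keys = [k for k in key_cols if k not in shared]
    if missing_keys:
        raise ValueError(f"key columns missing from source: {missing_keys}")

    # Only non-key columns are overwritten on conflict.
    updates = [c for c in shared if c not in key_cols]
    col_list = ", ".join(shared)
    if updates:
        assignments = ", ".join(f"{c} = EXCLUDED.{c}" for c in updates)
        action = f"DO UPDATE SET {assignments}"
    else:
        action = "DO NOTHING"

    return (
        f"INSERT INTO {target} ({col_list})\n"
        f"SELECT {col_list}\n"
        f"FROM {source}\n"
        f"ON CONFLICT ({', '.join(key_cols)})\n"
        f"{action};"
    )


# Example: the source is missing the "stock" column on this run.
statement = build_upsert(
    "scratch.usert_test",
    "scratch.daily_scraper",
    ["id", "name", "price", "stock"],
    ["id", "price", "name"],
    ["id"],
)
print(statement)
```

Columns absent from the source are simply left out of the INSERT, so on new rows they take their default values and on existing rows they keep what they had.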

Related

Upserting into one Postgres table from another table?

I have two tables with identical structures. All columns are integers and are named "A", "B" and "key".
I can insert from one table into another with some SQL like this:
INSERT INTO test_table_kewmfsznj
SELECT * FROM tmp_test_table_kewmfsznj_cnxtbkbq ta
But this doesn't work:
INSERT INTO test_table_kewmfsznj
SELECT * FROM tmp_test_table_kewmfsznj_cnxtbkbq ta
ON CONFLICT ("key")
DO NOTHING
My expectation was that this code would skip any row from "ta" where the key already exists in the table I'm inserting into. Instead I get this error:
ERROR: missing FROM-clause entry for table "ta"
Position: 152
Here's what I really want to do: When the key already exists in the table I'm inserting into, I want to update certain columns:
INSERT INTO test_table_kewmfsznj
SELECT * FROM tmp_test_table_kewmfsznj_cnxtbkbq ta
ON CONFLICT ("key")
DO UPDATE SET "A" = ta."A", "B" = ta."B"
Unfortunately this gives me (almost) the same error:
ERROR: missing FROM-clause entry for table "ta"
Position: 134
Can somebody explain what I am doing wrong here?
EDIT0: I tried it without upper case table names. The columns are now called "a", "b" and "key". The data remains unchanged - it's all integers.
INSERT INTO test_table_mrcvnaoia
SELECT * from tmp_test_table_mrcvnaoia_uuxkaidv ta
ON CONFLICT (key)
DO UPDATE SET a = ta.a, b = ta.b
... and now I get this error:
SQL Error [42P01]: ERROR: missing FROM-clause entry for table "ta"
Position: 129
To me, this suggests that there's something wrong with my ON CONFLICT statement, and probably not the first half of the query, but beyond that I'm out of clues. Can anybody help?

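You almost had it, but inside the ON CONFLICT clause you can't reference the source table or its alias. Instead, reference the special EXCLUDED table, which holds the row that was proposed for insertion: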
INSERT INTO test_table_mrcvnaoia
SELECT * from tmp_test_table_mrcvnaoia_uuxkaidv
ON CONFLICT (key)
DO UPDATE SET a = EXCLUDED.a, b = EXCLUDED.b;
Furthermore, to avoid errors in the future, make sure you explicitly specify the column names in the insert and select portions of your statement. They are the same for now, but that might not always be the case.
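
To see the effect of EXCLUDED end to end, here is a small runnable sketch in Python. It rests on several assumptions: it uses the built-in sqlite3 module (SQLite 3.24 or later) purely so that it runs without a server, the table names are shortened stand-ins, and the WHERE true line is an SQLite parsing requirement for INSERT ... SELECT that PostgreSQL does not need. The ON CONFLICT ... DO UPDATE SET ... = excluded.col part has the same form as in the answer above.

```python
import sqlite3


def upsert_from_staging(conn):
    """Copy rows from tmp_test_table into test_table; on a key clash,
    overwrite a and b with the values of the incoming row."""
    conn.execute(
        """
        INSERT INTO test_table (key, a, b)
        SELECT key, a, b FROM tmp_test_table
        WHERE true  -- SQLite only: keeps ON from being parsed as a join clause
        ON CONFLICT (key)
        DO UPDATE SET a = excluded.a, b = excluded.b
        """
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_table (key INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
conn.execute("CREATE TABLE tmp_test_table (key INTEGER, a INTEGER, b INTEGER)")
conn.execute("INSERT INTO test_table VALUES (1, 10, 100)")
conn.executemany(
    "INSERT INTO tmp_test_table VALUES (?, ?, ?)",
    [(1, 11, 111), (2, 20, 200)],
)

upsert_from_staging(conn)
rows = conn.execute("SELECT key, a, b FROM test_table ORDER BY key").fetchall()
print(rows)
```

Key 1 already existed, so its a and b are replaced by the staged values; key 2 is new and is inserted as-is.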

Generate a unique uuid for each row in a table with Postgres

I have a UUID constraint set up on the Id field of one of my tables. Despite this (I think because uuid_generate_v4 only creates one UUID per transaction?), when I imported a load of CSV data into the table, every row was given the same UUID.
I want to change this and give each row a unique UUID, however running
update monitors_nontest set id = uuid_generate_v1()
again produces the same UUID for every row.
How can I change this command so that each row gets a different UUID?

Just a possibility. How did you actually generate the uuid? The function uuid_generate_v1() generates a different value each time it is called. The key phrase, though, is "each time it is called". It seems that if it is called within a subselect, the Postgres optimizer may decide it can skip the repeated calls and reuse a cached result. Try the following.
create table uuid_gen( id1a uuid
, id1b uuid
, num1 integer
);
insert into uuid_gen(num1)
select generate_series(1,50);
update uuid_gen set id1a = uuid_generate_v1();
update uuid_gen set id1b = (select uuid_generate_v1());
select count(distinct id1a), count(distinct id1b) from uuid_gen;
Unfortunately, I could not find a fiddle processor that had the function uuid_generate_v1() available, nor uuid_generate_v4() which worked exactly the same.

I had a similar, though not identical, problem.
I was working with a pre-Postgres 13 version (so the built-in gen_random_uuid() was not available).
I had a table into which I needed to insert new rows while generating a new UUID (v4) for each new row.
I looked everywhere but couldn't find anything that worked without creating a function or installing extensions.
So I did it this way:
INSERT INTO monitors_nontest (id, col1, col2, col3)
SELECT uuid_in(md5(random()::text || random()::text)::cstring), mn.col1, mn.col2, mn.col3
FROM monitors_nontest mn
WHERE mn.col1 = 'some-text'
This could be adjusted for the UPDATE query. I hope it will help somebody else.
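
For comparison, here is a client-side sketch in Python, assuming the rows are prepared by a script before they are inserted. md5_uuid mirrors the md5 expression above and fresh_ids uses the standard library's uuid.uuid4(); both function names are hypothetical. Note that an md5 digest only has the shape of a UUID: its version and variant bits are whatever the hash produced, whereas uuid4() sets them as RFC 4122 specifies.

```python
import hashlib
import random
import uuid


def md5_uuid():
    """Mimic uuid_in(md5(random()::text || random()::text)::cstring):
    hash some random text and reinterpret the 32 hex digits as a UUID."""
    text = f"{random.random()}{random.random()}"
    return uuid.UUID(hex=hashlib.md5(text.encode()).hexdigest())


def fresh_ids(row_count):
    """Return one independently generated version 4 UUID per row."""
    return [uuid.uuid4() for _ in range(row_count)]


ids = fresh_ids(50)
print(len(set(ids)))   # number of distinct values among the 50 generated
print(md5_uuid())
```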

postgresql on conflict-cannot affect row a second time

I have a table with auto numbering (a sequence) on data_id:
tabledata
---------
data_id [PK]
data_code [Unique]
data_desc
Example code:
insert into tabledata(data_code,data_desc) values('Z01','red')
on conflict (data_code) do update set data_desc=excluded.data_desc
This works fine. Then I insert again:
insert into tabledata(data_code,data_desc) values('Z01','blue')
on conflict (data_code) do update set data_desc=excluded.data_desc
and I get this error:
[Err] ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
This is my real code:
insert into psa_aso_branch(branch_code,branch_desc,regional_code,status,created_date,lastmodified_date)
(select branch_code, branch, kode_regional,
case when status_data='Y' then true
else false end, current_date, current_date
from branch_history) on conflict (branch_code) do
update set branch_desc = excluded.branch_desc, regional_code = excluded.regional_code,status = (case when excluded.status='Y' then true else false end), created_date=current_date, lastmodified_date=current_date;
It works fine the first time, but not the next time (as in the example I gave above).

DO UPDATE acts on the existing row in the table, not on the row you are inserting.
Within the ON CONFLICT clause, the row proposed for insertion is available through the EXCLUDED table, which holds it temporarily.
In the first case the record is inserted, since there is no clash on data_code, and the update is not executed at all.
In the second insert you are inserting Z01, which already exists as a data_code, and data_code is unique.
The row in EXCLUDED still carries the duplicate data_code, so no new record is inserted; only the existing row is updated. For it to be inserted as a separate record, its data_code would have to be different.

I was stuck on this issue for about 24 hours.
Oddly, the query worked fine when I tested it on the CLI, and it worked fine when inserting a single row. The error only appeared when I used INSERT ... SELECT.
The problem is not INSERT ... SELECT itself. It is that the rows returned by the SELECT are not unique, so the same conflict is triggered more than once.
Thanks to @zivaricha's comment; I experimented based on his notes, although they were hard to understand at first.
Solution:
Use DISTINCT to make sure the SELECT returns unique rows. (The rows must be unique on the conflict key; in PostgreSQL, DISTINCT ON (key) can enforce that when other columns differ.)

This error occurs when the same conflict arises more than once within a single INSERT.
For example, suppose you have columns a, b and c, the combination of a and b is unique, and on conflict you update c.
Now suppose you already have a = 1, b = 2, c = 3 and you insert a = 1, b = 2, c = 4 and a = 1, b = 2, c = 4 in the same statement.
The conflict then occurs twice, and the same row cannot be updated twice.
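
The double conflict described above can be avoided by collapsing duplicate keys before the rows reach the INSERT. Here is a minimal client-side sketch in Python, assuming the batch is a list of dicts and that the last occurrence of a key should win. dedupe_by_key is a hypothetical helper, not a PostgreSQL or driver function.

```python
def dedupe_by_key(rows, key_cols):
    """Keep only the last row seen for each combination of key column values."""
    latest = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        latest[key] = row  # a later duplicate replaces the earlier one
    return list(latest.values())


batch = [
    {"a": 1, "b": 2, "c": 4},
    {"a": 1, "b": 2, "c": 5},  # same (a, b) as the row above
    {"a": 2, "b": 2, "c": 6},
]
unique_batch = dedupe_by_key(batch, ["a", "b"])
print(unique_batch)
```

After this step each (a, b) pair appears at most once, so a single INSERT ... ON CONFLICT (a, b) DO UPDATE touches every target row at most once.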

I think what is happening here is that when you do an update on conflict, the update conflicts again, and that raises the error.

The error message can be traced to the PostgreSQL source code, which makes it easier to understand why we get ON CONFLICT DO UPDATE command cannot affect row a second time.
In src/backend/executor/nodeModifyTable.c, in the function ExecOnConflictUpdate(), we find this comment:
This can occur when a just inserted tuple is updated again in the same command. E.g. because multiple rows with the same conflicting key values are inserted.
This is somewhat similar to the ExecUpdate() TM_SelfModified case. We do not want to proceed because it would lead to the same row being updated a second time in some unspecified order, and in contrast to plain UPDATEs there's no historical behavior to break.
As the comment says, we cannot update a row that was inserted by the same INSERT ... ON CONFLICT command. For example:
postgres=# CREATE TABLE t (id int primary key, name varchar);
postgres=# INSERT INTO t VALUES (1, 'smart'), (1, 'keyerror')
postgres-# ON CONFLICT (id) DO UPDATE SET name = 'Buuuuuz';
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
Remember that PostgreSQL's executor follows the volcano model, so it processes the rows we insert one at a time. When it processes (1, 'smart'), the table is empty, so the row is inserted normally. When it gets to (1, 'keyerror'), there is a conflict with the (1, 'smart') row we just inserted, so the update logic runs. That would update a row inserted by the same command, which PostgreSQL does not allow.
Similarly, we cannot update the same row of data twice:
postgres=# DROP TABLE IF EXISTS t;
postgres=# CREATE TABLE t (id int primary key, name varchar);
postgres=# INSERT INTO t VALUES (1, 'keyerror'), (1, 'buuuuz')
postgres-# ON CONFLICT (id) DO UPDATE SET name = 'Buuuuuuuuuz';
ERROR: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.

How to Update/Insert (Upsert) for Multiple Values DB2

I am trying to do an UPSERT in DB2 9.7 without creating a temporary table to merge. I am specifying values as parameters, however I'm always getting a syntax error for the comma separating the values when I try to include more than one row of values.
MERGE INTO table_name AS tab
USING (VALUES
(?,?),
(?,?)
) AS merge (COL1, COL2)
ON tab.COL1 = merge.COL1
WHEN MATCHED THEN
UPDATE SET tab.COL1 = merge.COL1,
tab.COL2 = merge.COL2
WHEN NOT MATCHED THEN
INSERT (COL1, COL2)
VALUES (merge.COL1, merge.COL2)
I have also tried teknopaul's answer from Does DB2 have an “insert or update” statement, but have received another syntax error complaining about the use of SELECT.
Does anybody know how to correctly include a table with values in my merge, without actually creating/dropping one on the database?

I believe you need something like USING (SELECT * FROM VALUES ( ...) ) AS ...

how to emulate "insert ignore" and "on duplicate key update" (sql merge) with postgresql?

Some SQL servers have a feature where INSERT is skipped if it would violate a primary/unique key constraint. For instance, MySQL has INSERT IGNORE.
What's the best way to emulate INSERT IGNORE and ON DUPLICATE KEY UPDATE with PostgreSQL?

With PostgreSQL 9.5, this is now native functionality (like MySQL has had for several years):
INSERT ... ON CONFLICT DO NOTHING/UPDATE ("UPSERT")
9.5 brings support for "UPSERT" operations.
INSERT is extended to accept an ON CONFLICT DO UPDATE/IGNORE clause. This clause specifies an alternative action to take in the event of a would-be duplicate violation.
...
Further example of new syntax:
INSERT INTO user_logins (username, logins)
VALUES ('Naomi',1),('James',1)
ON CONFLICT (username)
DO UPDATE SET logins = user_logins.logins + EXCLUDED.logins;

Edit: in case you missed warren's answer, PG9.5 now has this natively; time to upgrade!
Building on Bill Karwin's answer, to spell out what a rule based approach would look like (transferring from another schema in the same DB, and with a multi-column primary key):
CREATE RULE "my_table_on_duplicate_ignore" AS ON INSERT TO "my_table"
WHERE EXISTS(SELECT 1 FROM my_table
WHERE (pk_col_1, pk_col_2)=(NEW.pk_col_1, NEW.pk_col_2))
DO INSTEAD NOTHING;
INSERT INTO my_table SELECT * FROM another_schema.my_table WHERE some_cond;
DROP RULE "my_table_on_duplicate_ignore" ON "my_table";
Note: The rule applies to all INSERT operations until the rule is dropped, so not quite ad hoc.

For those of you that have Postgres 9.5 or higher, the new ON CONFLICT DO NOTHING syntax should work:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT field_one, field_two, field_three
FROM source_table
ON CONFLICT (field_one) DO NOTHING;
For those of us who have an earlier version, this left join (an anti-join) will work instead:
INSERT INTO target_table (field_one, field_two, field_three )
SELECT source_table.field_one, source_table.field_two, source_table.field_three
FROM source_table
LEFT JOIN target_table ON source_table.field_one = target_table.field_one
WHERE target_table.field_one IS NULL;

Try to do an UPDATE. If it doesn't modify any row that means it didn't exist, so do an insert. Obviously, you do this inside a transaction.
You can of course wrap this in a function if you don't want to put the extra code on the client side. You also need a retry loop to handle the very rare race condition in this approach.
There's an example of this in the documentation: http://www.postgresql.org/docs/9.3/static/plpgsql-control-structures.html, example 40-2 right at the bottom.
That's usually the easiest way. You can do some magic with rules, but it's likely going to be a lot messier. I'd recommend the wrap-in-function approach over that any day.
This works for single-row or few-row values. If you're dealing with a large number of rows, for example from a subquery, you're best off splitting it into two queries, one for INSERT and one for UPDATE (as an appropriate join/subselect of course; there's no need to write your main filter twice).
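
The update-then-insert pattern can be sketched as follows. This is an illustration under assumptions rather than the documentation's example: it uses Python's built-in sqlite3 module so that it runs without a server, and upsert_login and the user_logins table are hypothetical. With PostgreSQL and concurrent sessions, the INSERT can still hit a unique violation between the two statements; that is the race the retry loop is meant to handle, and the loop is omitted here.

```python
import sqlite3


def upsert_login(conn, username, logins):
    """Try an UPDATE first and INSERT only if no row was modified."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "UPDATE user_logins SET logins = logins + ? WHERE username = ?",
            (logins, username),
        )
        if cur.rowcount == 0:  # nothing updated, so the row did not exist
            conn.execute(
                "INSERT INTO user_logins (username, logins) VALUES (?, ?)",
                (username, logins),
            )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_logins (username TEXT PRIMARY KEY, logins INTEGER)")
upsert_login(conn, "Naomi", 1)  # no row yet: inserts
upsert_login(conn, "Naomi", 1)  # row exists: updates, logins becomes 2
upsert_login(conn, "James", 1)  # no row yet: inserts
result = conn.execute(
    "SELECT username, logins FROM user_logins ORDER BY username"
).fetchall()
print(result)
```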

To get the insert ignore logic you can do something like below. I found simply inserting from a select statement of literal values worked best, then you can mask out the duplicate keys with a NOT EXISTS clause. To get the update on duplicate logic I suspect a pl/pgsql loop would be necessary.
INSERT INTO manager.vin_manufacturer
(SELECT * FROM( VALUES
('935',' Citroën Brazil','Citroën'),
('ABC', 'Toyota', 'Toyota'),
('ZOM',' OM','OM')
) as tmp (vin_manufacturer_id, manufacturer_desc, make_desc)
WHERE NOT EXISTS (
--ignore anything that has already been inserted
SELECT 1 FROM manager.vin_manufacturer m where m.vin_manufacturer_id = tmp.vin_manufacturer_id)
)

INSERT INTO mytable(col1,col2)
SELECT 'val1','val2'
WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1='val1')
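
A runnable sketch of the same statement with bound parameters, again assuming Python's sqlite3 module as a stand-in so that no server is needed (a PostgreSQL driver would typically use %s placeholders instead of ?). insert_if_absent is a hypothetical wrapper. Without a unique constraint on col1, this check alone does not protect against two sessions inserting the same value at the same moment.

```python
import sqlite3


def insert_if_absent(conn, col1, col2):
    """Insert (col1, col2) unless a row with the same col1 already exists.

    Returns True when a row was inserted, False when it was skipped.
    """
    with conn:
        cur = conn.execute(
            "INSERT INTO mytable (col1, col2) "
            "SELECT ?, ? "
            "WHERE NOT EXISTS (SELECT 1 FROM mytable WHERE col1 = ?)",
            (col1, col2, col1),
        )
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 TEXT, col2 TEXT)")
first = insert_if_absent(conn, "val1", "val2")    # table is empty: inserted
second = insert_if_absent(conn, "val1", "other")  # col1 already present: skipped
count = conn.execute("SELECT count(*) FROM mytable").fetchone()[0]
print(first, second, count)
```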

As @hanmari mentioned in his comment, when inserting into a Postgres table, ON CONFLICT (..) DO NOTHING is the best way to avoid inserting duplicate data:
query = """INSERT INTO db_table_name(column_name)
VALUES(%s) ON CONFLICT (column_name) DO NOTHING;"""
With the ON CONFLICT clause, the statement still inserts the rows that do not conflict and skips the ones that do. The query above comes from code that loads data from an Excel file into a Postgres table.
I have constraints added to the Postgres table to make sure the ID field is unique. Instead of running a delete on rows that are the same, I add a line of SQL that restarts the ID column's sequence at 1.
Example:
# assumes the default sequence name PostgreSQL gives a serial column: <table>_<column>_seq
q = 'ALTER SEQUENCE db_table_name_id_column_seq RESTART WITH 1'
If my data has an ID field, I do not use it as the primary/serial ID; I create a separate ID column and set it to serial.
I hope this information is helpful to everyone.
*I have no college degree in software development/coding. Everything I know in coding, I study on my own.

Looks like PostgreSQL supports a schema object called a rule.
http://www.postgresql.org/docs/current/static/rules-update.html
You could create a rule ON INSERT for a given table, making it do NOTHING if a row exists with the given primary key value, or else making it do an UPDATE instead of the INSERT if a row exists with the given primary key value.
I haven't tried this myself, so I can't speak from experience or offer an example.

This solution avoids using rules:
BEGIN
INSERT INTO tableA (unique_column,c2,c3) VALUES (1,2,3);
EXCEPTION
WHEN unique_violation THEN
UPDATE tableA SET c2 = 2, c3 = 3 WHERE unique_column = 1;
END;
but it has a performance drawback (see PostgreSQL.org):
A block containing an EXCEPTION clause is significantly more expensive
to enter and exit than a block without one. Therefore, don't use
EXCEPTION without need.

For bulk loads, you can always delete the row before the insert. Deleting a row that doesn't exist doesn't cause an error, so it's safely skipped.

For data import scripts, to replace "IF NOT EXISTS", in a way, there's a slightly awkward formulation that nevertheless works:
DO
$do$
BEGIN
PERFORM id
FROM whatever_table;
IF NOT FOUND THEN
-- INSERT stuff
END IF;
END
$do$;
