Appending array during Postgresql upsert & getting ambiguous column error - postgresql

When trying to do a query like below:
INSERT INTO employee_channels (employee_id, channels)
VALUES ('46356699-bed1-4ec4-9ac1-76f124b32184', '{a159d680-2f2e-4ba7-9498-484271ad0834}')
ON CONFLICT (employee_id)
DO UPDATE SET channels = array_append(channels, 'a159d680-2f2e-4ba7-9498-484271ad0834')
WHERE employee_id = '46356699-bed1-4ec4-9ac1-76f124b32184'
AND NOT lower(channels::text)::text[] #> ARRAY['a159d680-2f2e-4ba7-9498-484271ad0834'];
I get the following error
[42702] ERROR: column reference "channels" is ambiguous Position: 245
The specific reference to channels it's referring to is the 'channels' inside array_append.
channels is a CITEXT[] data type

You may need to specify the EXCLUDED table in your set statement.
SET channels = array_append(EXCLUDED.channels, 'a159d680-2f2e-4ba7-9498-484271ad0834')
When using the ON CONFLICT DO UPDATE clause the values that aren't inserted because of the conflict are stored in the EXCLUDED table. Its an ephemeral table you don't have to actually make, the way NEW and OLD are in triggers.
From the PostgreSQL Manual:
conflict_action specifies an alternative ON CONFLICT action. It can be either DO NOTHING, or a DO UPDATE clause specifying the exact
details of the UPDATE action to be performed in case of a conflict.
The SET and WHERE clauses in ON CONFLICT DO UPDATE have access to the
existing row using the table's name (or an alias), and to rows
proposed for insertion using the special excluded table. SELECT
privilege is required on any column in the target table where
corresponding excluded columns are read.
Note that the effects of all per-row BEFORE INSERT triggers are reflected in excluded values, since those effects may have contributed
to the row being excluded from insertion.

Related

Postgres Upsert Cardinality Violation

I am trying to insert extracted data from a sql table into a postgres table where the rows may or may not exist. If they do exist, I would like to set a specific column to its default (0)
The table is as
site_notes (
job_id text primary key,
attachment_id text,
complete int default 0);
My query is
INSERT INTO site_notes (
job_id,
attachment_id
)
VALUES
{jobs_sql}
ON CONFLICT (job_id) DO UPDATE
SET complete = DEFAULT;
However I am getting an error: psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.
Would anyone be able to advise on how to set the complete column to the default on event of a conflict ?
Many Thanks
An INSERT ... ON CONFLICT DO UPDATE statement (and indeed an UPDATE statement too) is not allowed to modify the same row more than once. It is not clear what {jobs_sql} in your question is, but it must contain several rows, and at least two of those have the same job_id.
Make sure that the same job_id does not occur more than once in the rows you want to INSERT.

How to add a column to a table on production PostgreSQL with zero downtime?

Here
https://stackoverflow.com/a/53016193/10894456
is an answer provided for Oracle 11g,
My question is the same:
What is the best approach to add a not null column with default value
in production oracle database when that table contain one million
records and it is live. Does it create any locks if we do the column
creation , adding default value and making it as not null in a single
statement?
but for PostgreSQL ?
This prior answer essentially answers your query.
Cross referencing the relevant PostgreSQL doc with the PostgreSQL sourcecode for AlterTableGetLockLevel mentioned in the above answer shows that ALTER TABLE ... ADD COLUMN will always obtain an ACCESS EXCLUSIVE table lock, precluding any other transaction from accessing the table for the duration of the ADD COLUMN operation.
This same exclusive lock is obtained for any ADD COLUMN variation; ie. it doesn't matter whether you add a NULL column (with or without DEFAULT) or have a NOT NULL with a default.
However, as mentioned in the linked answer above, adding a NULL column with no DEFAULT should be very quick as this operation simply updates the catalog.
In contrast, adding a column with a DEFAULT specifier necessitates a rewrite the entire table in PostgreSQL 10 or less.
This operation is likely to take a considerable time on your 1M record table.
According to the linked answer, PostgreSQL >= 11 does not require such a rewrite for adding such a column, so should perform more similarly to the no-DEFAULT case.
I should add that for PostgreSQL 11 and above, the ALTER TABLE docs note that table rewrites are only avoided for non-volatile DEFAULT specifiers:
When a column is added with ADD COLUMN and a non-volatile DEFAULT is specified, the default is evaluated at the time of the statement and the result stored in the table's metadata. That value will be used for the column for all existing rows. If no DEFAULT is specified, NULL is used. In neither case is a rewrite of the table required.
Adding a column with a volatile DEFAULT [...] will require the entire table and its indexes to be rewritten. [...] Table and/or index rebuilds may take a significant amount of time for a large table; and will temporarily require as much as double the disk space.

would postgres really update page file when fields are all equals before and after update?

I am working with a little website crawler program. I use PostgresQL to store data and use such statement to update that,
INSERT INTO topic (......) VALUES (......)
ON CONFLICT DO UPDATE /* updagte all fields here */
The question is if all fields before upate and after update are really equals, would PostgresQL really update that?
Postgres (like nearly all other DBMS) will not check if the target values are different then the original ones. So the answer is: yes, it will update the row even if the values are different.
However, you can easily prevent the "empty" update in this case by including a where clause:
INSERT INTO topic (......)
VALUES (......)
ON CONFLICT (...)
DO UPDATE
set ... -- update all column
WHERE topic IS DISTINCT FROM excluded;
The where clause will prevent updating a row that is identical to the one that is being inserted. To make that work correctly your insert has to list all columns of the target tables. Otherwise the topic is distinct from excluded condition will always be true because the excluded row has fewer columns then the topic row and thus it id "distinct" from it.
Adding a check for modified values has been discussed multiple times on the mailing list and has always be discarded. The main reason being, that it doesn't make sense to have the overhead of checking for changes for every statement just to cope with a few badly written ones.

How to upsert in Postgres on conflict on one of 2 columns?

Is it possible to do upsert in Postgres 9.5 when conflict happens on one of 2 columns in a table.? Basically I have 2 columns and if either column throws unique constraint violation, then I would like to perform update operation.
Yes, and this behaviour is default. Any unique constraint violation constitutes a conflict and then the UPDATE is performed if ON CONFLICT DO UPDATE is specified. The INSERT statement can have only a single ON CONFLICT clause, but the conflict_target of that clause can specify multiple column names each of which must have an index, such as a UNIQUE constraint. You are, however, limited to a single conflict_action and you will not have information on which constraint caused the conflict when processing that action. If you need that kind of information, or specific action depending on the constraint violation, you should write a trigger function but then you lose the all-important atomicity of the INSERT ... ON CONFLICT DO ... statement.
I think in Postgres 9.5 ON CONFLICT can have only one constraint or multiple column names but on that multiple columns must have combine one index

How can I copy an IDENTITY field?

I’d like to update some parameters for a table, such as the dist and sort key. In order to do so, I’ve renamed the old version of the table, and recreated the table with the new parameters (these can not be changed once a table has been created).
I need to preserve the id field from the old table, which is an IDENTITY field. If I try the following query however, I get an error:
insert into edw.my_table_new select * from edw.my_table_old;
ERROR: cannot set an identity column to a value [SQL State=0A000]
How can I keep the same id from the old table?
You can't INSERT data setting the IDENTITY columns, but you can load data from S3 using COPY command.
First you will need to create a dump of source table with UNLOAD.
Then simply use COPY with EXPLICIT_IDS parameter as described in Loading default column values:
If an IDENTITY column is included in the column list, the EXPLICIT_IDS
option must also be specified in the COPY command, or the COPY command
will fail. Similarly, if an IDENTITY column is omitted from the column
list, and the EXPLICIT_IDS option is specified, the COPY operation
will fail.
You can explicitly specify the columns, and ignore the identity column:
insert into existing_table (col1, col2) select col1, col2 from another_table;
Use ALTER TABLE APPEND twice, first time with IGNOREEXTRA and the second time with FILLTARGET.
If the target table contains columns that don't exist in the source
table, include FILLTARGET. The command fills the extra columns in the
source table with either the default column value or IDENTITY value,
if one was defined, or NULL.
It moves the columns from one table to another, extremely quickly; took me 4s for 1GB table in dc1.large node.
Appends rows to a target table by moving data from an existing source
table.
...
ALTER TABLE APPEND is usually much faster than a similar CREATE TABLE
AS or INSERT INTO operation because data is moved, not duplicated.
Faster and simpler than UNLOAD + COPY with EXPLICIT_IDS.