I'm getting column "my_column" contains null values' when adding a composite primary key

I'm getting column "my_column" contains null values' when adding a composite primary key - postgresql

Is it not supposed to delete null values before altering the table? I'm confused...
My query looks roughly like this:
BEGIN;
DELETE FROM my_table
WHERE my_column IS NULL;
ALTER TABLE my_table DROP CONSTRAINT my_table_pk;
ALTER TABLE my_table ADD PRIMARY KEY (id, my_column);
-- this is to repopulate the data afterwards
INSERT INTO my_table (name, other_table_id, my_column)
SELECT
ya.name,
ot.id,
my_column
FROM other_table ot
LEFT JOIN yet_another ya
ON ya.id = ot."fileId"
WHERE NOT EXISTS (
SELECT
1
FROM my_table mt
WHERE ot.id = mt.other_table_id AND ot.my_column = mt.my_column
) AND my_column IS NOT NULL;
COMMIT;
sorry for naming

There are two possible explanations:
A concurrent session inserted a new row with a NULL value between the start of the DELETE and the start of ALTER TABLE.
To avoid that, lock the table in SHARE mode before you DELETE.
There is a row where id has a NULL value.

Related

How to insert correctly content of a table directly in an other table?

Good day, I have using Javascript in Mirth Connect to insert all raws that are allocated in a table of PostgreSql database directly in an other table , and in case of duplicate, update the row. I am trying with it,but it gives to me this error:
InsertIntoPatientMapping= dbConn.executeUpdate('insert into patient_mapping (select * from patient_mapping_test) ON DUPLICATE KEY UPDATE');
Wrapped org.postgresql.util.PSQLException: ERROR: syntax error at or near "DUPLICATE" Position: 69
What did I wrong?

Assuming you have a id column, for upserting considering only the key as duplicate, you can:
insert into patient_mapping (select * from patient_mapping_test) ON CONFLICT (id) DO UPDATE SET id=EXCLUDED.id
https://www.postgresql.org/docs/13/sql-insert.html
Edit
There are a couple of things you need to consider for this:
You need to specify which columns the conflict occurs;
Each column must have an index;
You need to specify how to replace the conflicted values.
For instance:
create table T1 (id integer unique);
create table T2 (id integer unique);
insert into T1 (id) values (1);
insert into T1 (id) values (2);
insert into T2 (id) values (1);
insert into T2 (select * from T1) on conflict (id) do update set id=EXCLUDED.id;
select * from T2;
This may help you: https://www.postgresqltutorial.com/postgresql-upsert/

Use COPY FROM command in PostgreSQL to insert in multiple tables

I'm trying to use the performance of COPY FROM command in PostgreSQL to get all data of 1 table of a CSV file (CSV -> table1) and I need to insert other data, but, in a new table. I will need of a primary key of first table to put as a foreign key in second table.
Example:
I need to insert 1,000,000 of names in table1 and 500,000 of names in table2, but, all names in table2 reference to 1 tuple in table1.
CREATE TABLE table1 (
table1Id bigserial NOT NULL,
Name varchar(100) NULL,
CONSTRAINT table1Id PRIMARY KEY (table1Id)
);
CREATE TABLE table2 (
table2Id bigserial NOT NULL,
Other_name varchar(100) NOT NULL
table1_table1Id int8 NOT NULL,
CONSTRAINT table2_pk PRIMARY KEY (table2Id)
);

Command COPY does not allow table manipulations while copying data (such as look up to other table for fetching proper foreign keys to insert). To insert into table2 ids for corresponding rows from table1 you need to drop NOT NULL constraint for that field, COPY data and then UPDATE that fields separately.
Assuming table1 and table2 tables can be joined by table1.Name = table2.Other_name, the code is:
Before COPY:
ALTER TABLE table2 ALTER COLUMN table1_table1Id DROP NOT NULL;
After COPY:
UPDATE table2 SET table2.table1_table1Id = table1.table1Id
FROM table1
WHERE table1.Name = table2.Other_name;
ALTER TABLE table2 ALTER COLUMN table1_table1Id SET NOT NULL;

Make duplicate row in Postgresql

I am writing migration script to migrate database. I have to duplicate the row by incrementing primary key considering that different database can have n number of different columns in the table. I can't write each and every column in query. If i simply just copy the row then, I am getting duplicate key error.
Query: INSERT INTO table_name SELECT * FROM table_name WHERE id=255;
ERROR: duplicate key value violates unique constraint "table_name_pkey"
DETAIL: Key (id)=(255) already exist
Here, It's good that I don't have to mention all column names. I can select all columns by giving *. But, same time I am also getting duplicate key error.
What's the solution of this problem? Any help would be appreciated. Thanks in advance.

If you are willing to type all column names, you may write
INSERT INTO table_name (
pri_key
,col2
,col3
)
SELECT (
SELECT MAX(pri_key) + 1
FROM table_name
)
,col2
,col3
FROM table_name
WHERE id = 255;
Other option (without typing all columns , but you know the primary key ) is to CREATE a temp table, update it and re-insert within a transaction.
BEGIN;
CREATE TEMP TABLE temp_tab ON COMMIT DROP AS SELECT * FROM table_name WHERE id=255;
UPDATE temp_tab SET pri_key_col = ( select MAX(pri_key_col) + 1 FROM table_name );
INSERT INTO table_name select * FROM temp_tab;
COMMIT;

This is just a DO block but you could create a function that takes things like the table name etc as parameters.
Setup:
CREATE TABLE public.t1 (a TEXT, b TEXT, c TEXT, id SERIAL PRIMARY KEY, e TEXT, f TEXT);
INSERT INTO public.t1 (e) VALUES ('x'), ('y'), ('z');
Code to duplicate values without the primary key column:
DO $$
DECLARE
_table_schema TEXT := 'public';
_table_name TEXT := 't1';
_pk_column_name TEXT := 'id';
_columns TEXT;
BEGIN
SELECT STRING_AGG(column_name, ',')
INTO _columns
FROM information_schema.columns
WHERE table_name = _table_name
AND table_schema = _table_schema
AND column_name <> _pk_column_name;
EXECUTE FORMAT('INSERT INTO %1$s.%2$s (%3$s) SELECT %3$s FROM %1$s.%2$s', _table_schema, _table_name, _columns);
END $$
The query it creates and runs is: INSERT INTO public.t1 (a,b,c,e,f) SELECT a,b,c,e,f FROM public.t1. It's selected all the columns apart from the PK one. You could put this code in a function and use it for any table you wanted, or just use it like this and edit it for whatever table.

Update new value on conflict

I'd like to update a row with an insert statement. Indeed, I can insert rows but if the unique attribute already exists, I'd like to update the content.
So I got these tables
CREATE TABLE dtest (
"id" text,
followers_count int4,
someuniq int4
);
CREATE UNIQUE INDEX dtest_idx ON dtest USING btree (someuniq);
CREATE TABLE temp_data (
tid text,
tfo int4,
tuniq int4);
Let's consider that I have a temp tab, and I insert/update data from this table
INSERT INTO temp_data VALUES ('id1',4,1);
INSERT INTO temp_data VALUES ('id2',0,2);
INSERT INTO temp_data VALUES ('id3',40,3);
INSERT INTO dtest("id","followers_count","someuniq")
SELECT t.tid, t.tfo, t.tuniq
FROM temp_data t
ON CONFLICT DO NOTHING;
Instead of doing an insert and then an update, I'd like to know if it's possible to update the values with something like this
INSERT INTO dtest("id","followers_count","someuniq")
SELECT tid, tfo, tuniq
FROM temp_data
ON CONFLICT ("someuniq")
DO UPDATE SET followers_count =
(SELECT tfo FROM temp_data where tid = EXCLUDED.tid)
WHERE EXCLUDED.id = tid;
Wich means, "Update some fields if the row already exists", what am I doing wrong ?

There is no need to use a sub-select. The excluded row will contain all target columns, including the followers_count:
INSERT INTO dtest (id, followers_count, someuniq)
SELECT tid, tfo, tuniq
FROM temp_data
ON CONFLICT (someuniq)
DO UPDATE
SET followers_count = excluded.followers_count;

SELECT or INSERT a row in one command

I'm using PostgreSQL 9.0 and I have a table with just an artificial key (auto-incrementing sequence) and another unique key. (Yes, there is a reason for this table. :)) I want to look up an ID by the other key or, if it doesn't exist, insert it:
SELECT id
FROM mytable
WHERE other_key = 'SOMETHING'
Then, if no match:
INSERT INTO mytable (other_key)
VALUES ('SOMETHING')
RETURNING id
The question: is it possible to save a round-trip to the DB by doing both of these in one statement? I can insert the row if it doesn't exist like this:
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING id
... but that doesn't give the ID of an existing row. Any ideas? There is a unique constraint on other_key, if that helps.

Have you tried to union it?
Edit - this requires Postgres 9.1:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH new_row AS (
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING *
)
SELECT * FROM new_row
UNION
SELECT * FROM mytable WHERE other_key = 'SOMETHING';
results in:
id | other_key
----+-----------
1 | SOMETHING
(1 row)

No, there is no special SQL syntax that allows you to do select or insert. You can do what Ilia mentions and create a sproc, which means it will not do a round trip fromt he client to server, but it will still result in two queries (three actually, if you count the sproc itself).

using 9.5 i successfully tried this
based on Denis de Bernardy's answer
only 1 parameter
no union
no stored procedure
atomic, thus no concurrency problems (i think...)
The Query:
WITH neworexisting AS (
INSERT INTO mytable(other_key) VALUES('hello 2')
ON CONFLICT(other_key) DO UPDATE SET existed=true -- need some update to return sth
RETURNING *
)
SELECT * FROM neworexisting
first call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|false |
second call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|true |
First create your table ;-)
CREATE TABLE mytable (
id serial NOT NULL,
other_key text NOT NULL,
created timestamptz NOT NULL DEFAULT now(),
existed bool NOT NULL DEFAULT false,
CONSTRAINT mytable_pk PRIMARY KEY (id),
CONSTRAINT mytable_uniq UNIQUE (other_key) --needed for on conflict
);

you can use a stored procedure
IF (SELECT id FROM mytable WHERE other_key = 'SOMETHING' LIMIT 1) < 0 THEN
INSERT INTO mytable (other_key) VALUES ('SOMETHING')
END IF

I have an alternative to Denis answer, that I think is less database-intensive, although a bit more complex:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH table_sel AS (
SELECT id
FROM mytable
WHERE other_key = 'test'
UNION
SELECT NULL AS id
ORDER BY id NULLS LAST
LIMIT 1
), table_ins AS (
INSERT INTO mytable (id, other_key)
SELECT
COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)),
'test'
FROM table_sel
ON CONFLICT (id) DO NOTHING
RETURNING id
)
SELECT * FROM table_ins
UNION ALL
SELECT * FROM table_sel
WHERE id IS NOT NULL;
In table_sel CTE I'm looking for the right row. If I don't find it, I assure that table_sel returns at least one row, with a union with a SELECT NULL.
In table_ins CTE I try to insert the same row I was looking for earlier. COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)) is saying: id could be defined, if so, use it; whereas if id is null, increment the sequence on id and use this new value to insert a row. The ON CONFLICT clause assure
that if id is already in mytable I don't insert anything.
At the end I put everything together with a UNION between table_ins and table_sel, so that I'm sure to take my sweet id value and execute both CTE.
This query needs to search for the value other_key only once, and is a "search this value" not a "check if this value not exists in the table", that is very heavy; in Denis alternative you use other_key in both types of searches. In my query you "check if a value not exists" only on id that is a integer primary key, that, for construction, is fast.

Minor tweak a decade late to Denis's excellent answer:
-- Create the table with a unique constraint
CREATE TABLE mytable (
id serial PRIMARY KEY
, other_key varchar NOT NULL UNIQUE
);
WITH new_row AS (
-- Only insert when we don't find anything, avoiding a table lock if
-- possible.
INSERT INTO mytable ( other_key )
SELECT 'SOMETHING'
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
)
RETURNING *
)
(
-- This comes first in the UNION ALL since it'll almost certainly be
-- in the query cache. Marginally slower for the insert case, but also
-- marginally faster for the much more common read-only case.
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
-- Don't check for duplicates to be removed
UNION ALL
-- If we reach this point in iteration, we needed to do the INSERT and
-- lock after all.
SELECT *
FROM new_row
) LIMIT 1 -- Just return whatever comes first in the results and allow
-- the query engine to cut processing short for the INSERT
-- calculation.
;
The UNION ALL tells the planner it doesn't have to collect results for de-duplication. The LIMIT 1 at the end allows the planner to short-circuit further processing/iteration once it knows there's an answer available.
NOTE: There is a race condition present here and in the original answer. If the entry does not already exist, the INSERT will fail with a unique constraint violation. The error can be suppressed with ON CONFLICT DO NOTHING, but the query will return an empty set instead of the new row. This is a difficult problem because getting that info from another transaction would violate the I in ACID.