How to retrieve duplicate records and delete them in table A, also insert these duplicate records in another table B (in postgres) - postgresql

how to retrieve duplicate records and delete them in table A, also insert these retrieved duplicate records in another table B (in postgres db)
SQL query's are required for my project.

To delete duplicates without having a unique column you can use the ctid virtual column which is essentially the same thing as the rowid in Oracle:
delete from table_A t1
where ctid <> (select min(t2.ctid)
from table_A t2
where t1.unique_column = t2.unique_column);
You can use the returning clause to get the deleted rows and insert them into the other table:
with deleted as (
delete from table_A x1
where ctid <> (select min(t2.ctid)
from table_A t2
where t1.unique_column = t2.unique_column);
returning *
)
insert into table_B (col_1, col_2)
select unique_column, some_other_column
from deleted;
If you further want to see those deleted rows, you can throw in another CTE:
with deleted as (
delete from table_A x1
where ctid <> (select min(t2.ctid)
from table_A t2
where t1.unique_column = t2.unique_column);
returning *
), moved as (
insert into table_B (col_1, col_2)
select unique_column, some_other_column
from deleted
returning *
)
select *
from moved;

Related

Postgres join involving tables having join condition defined on an text array

I have two tables in postgresql
One table is of the form
Create table table1(
ID serial PRIMARY KEY,
Type []Text
)
Create table table2(
type text,
sellerID int
)
Now i want to get all the rows from table1 which are having type same that in table2 but the problem is that in table1 the type is an array.
In case the type in the table has an identifiable delimiter like ',' ,';' etc. you can rewrite the query as regexp_split_to_table(type,',') or versions later than 9.5 unnest function can be use too.
For eg.,
select * from
( select id ,regexp_split_to_table(type,',') from table1)table1
inner join
select * from table2
on trim(table1.type) = trim(table2.type)
Another good example can be found - https://www.dbrnd.com/2017/03/postgresql-regexp_split_to_array-to-split-string-using-different-delimiters/
SELECT
a[1] AS DiskInfo
,a[2] AS DiskNumber
,a[3] AS MessageKeyword
FROM (
SELECT regexp_split_to_array('Postgres Disk information , disk 2 , failed', ',')
) AS dt(a)
You can use the ANY operator in the JOIN condition:
select *
from table1 t1
join table2 t2 on t2.type = any (t1.type);
Note that if the types in the table1 match multiple rows in table2, you would get duplicates (from table1) because that's how a join works. Maybe you want an EXISTS condition instead:
select *
from table1 t1
where exists (select *
from table2 t2
where t2.type = any(t1.type));

PostgreSQL - Append a table to another and add a field without listing all fields

I have two tables:
table_a with fields item_id,rank, and 50 other fields.
table_b with fields item_id, and the same 50 fields as table_a
I need to write a SELECT query that adds the rows of table_b to table_a but with rank set to a specific value, let's say 4.
Currently I have:
SELECT * FROM table_a
UNION
SELECT item_id, 4 rank, field_1, field_2, ...
How can I join the two tables together without writing out all of the fields and without using an INSERT query?
EDIT:
My idea is to join table_b to table_a somehow with the rank field remaining empty, then simply replace the null rank fields. The rank field is never null, but item_id can be duplicated and table_a may have item_id values that are not in table_b, and vice-versa.
I am not sure I understand why you need this, but you can use jsonb functions:
select (jsonb_populate_record(null::table_a, row)).*
from (
select to_jsonb(a) as row
from table_a a
union
select to_jsonb(b) || '{"rank": 4}'
from table_b b
) s
order by item_id;
Working example in rextester.
I'm pretty sure I've got it. The predefined rank column can be inserted into table_b by joining to the subset of itself with only the columns left of the column behind which you want to insert.
WITH
_leftcols AS ( SELECT item_id, 4 rank FROM table_b ),
_combined AS ( SELECT * FROM table_b JOIN _leftcols USING (item_id) )
SELECT * FROM _combined
UNION
SELECT * FROM table_a

Conditional delete and insert in postgres

I have two tables Table_A and Table_B. How can I write a conditional SQL that does the following logic
If table A records match table B records on id
then
delete records from table A and Insert records into Table B
How can I do this with SQL most likely using with
delete from Table_A where Exists (select a.id from TABLE_A
join TABLE_B as b on a.id = b.id)
The Insert is:Insert into Table_A (id) select id from TABLE_B
Use a CTE to catch the ids of the deleted records, and re-join these with the b records:
WITH del AS (
DELETE FROM a
WHERE EXISTS ( SELECT *
FROM b
WHERE b.id = a.id
)
returning *
)
INSERT INTO a (id, x, y, z)
SELECT id, x, y, z
FROM b
WHERE EXISTS (
SELECT *
FROM del
WHERE del.id = b.id
);
BTW: you should have very good reasons (such as wanting to activate the triggers) to prefer delete+insert to a update.

TSQL query to delete all duplicate records but one [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
delete duplicate records in SQL Server
I have a table in which unique records are denoted by a composite key, such as (COL_A, COL_B).
I have checked and confirmed that I have duplicate rows in my table by using the following query:
select COL_A, COL_B, COUNT(*)
from MY_TABLE
group by COL_A, COL_B
having count(*) > 1
order by count(*) desc
Now, I would like to remove all duplicate records but keep only one.
Could someone please shed some light on how to achieve this with 2 columns?
EDIT:
Assume the table only has COL_A and COL_B
1st solution,
It is flexible, because you can add more columns than COL_A and COL_B :
-- create table with identity filed
-- using idenity we can decide which row we can delete
create table MY_TABLE_COPY
(
id int identity,
COL_A varchar(30),
COL_B varchar(30)
/*
other columns
*/
)
go
-- copy data
insert into MY_TABLE_COPY (COL_A,COL_B/*other columns*/)
select COL_A, COL_B /*other columns*/
from MY_TABLE
group by COL_A, COL_B
having count(*) > 1
-- delete data from MY_TABLE
-- only duplicates (!)
delete MY_TABLE
from MY_TABLE_COPY c, MY_TABLE t
where c.COL_A=t.COL_A
and c.COL_B=t.COL_B
go
-- copy data without duplicates
insert into MY_TABLE (COL_A, COL_B /*other columns*/)
select t.COL_A, t.COL_B /*other columns*/
from MY_TABLE_COPY t
where t.id = (
select max(id)
from MY_TABLE_COPY c
where t.COL_A = c.COL_A
and t.COL_B = c.COL_B
)
go
2nd solution
If you have really two columns in MY_TABLE you can use:
-- create table and copy data
select distinct COL_A, COL_B
into MY_TABLE_COPY
from MY_TABLE
-- delete data from MY_TABLE
-- only duplicates (!)
delete MY_TABLE
from MY_TABLE_COPY c, MY_TABLE t
where c.COL_A=t.COL_A
and c.COL_B=t.COL_B
go
-- copy data without duplicates
insert into MY_TABLE
select t.COL_A, t.COL_B
from MY_TABLE_COPY t
go
Try:
-- Copy Current Table
SELECT * INTO #MY_TABLE_COPY FROM MY_TABLE
-- Delte all rows from current able
DELETE FROM MY_TABLE
-- Insert only unique values, removing your duplicates
INSERT INTO MY_TABLE
SELECT DISTINCT * FROM #MY_TABLE_COPY
-- Remove Temp Table
DROP TABLE #MY_TABLE_COPY
That should work as long as you don't break any foreign keys when deleting rows from MY_TABLE.

PostgreSQL delete duplicate rows ignoring case

I need to delete from a table rows that have the same value on a specified field ignoring case. For example if I have a row that has 'foo' as value for a field and another row that has 'Foo' as value for the same field, I want to delete only one of these rows (keeping 1 row).
I've tried something like this:
delete from table t1
where exists (select 1
from table t2
where t1.key <> t2.key
and t1.field ILIKE t2.field)
but this deletes the other row too.
Are there any suggestions?
Just change <> by <:
DELETE
FROM table t1
WHERE exists (
SELECT 1
FROM table t2
WHERE t1.key < t2.key
and t1.field ILIKE t2.field
)
This way you keep the rows with the highest key. You also could use > to keep the records with the lowest key.
Assuming key is the primary key of the table:
DELETE FROM the_table t1
WHERE t1.key not in (select min(t2.key)
from the_table t2
group by lower(t2.field));