How to find array differences using Postgres SQL - postgresql

I have a table in Postgres DB as below:
SOURCE | Entity         | DiFF
-------+----------------+-----
PRD    | E1,E2,E3,E4,E5 |
MC     | E1,E2          |
GT1    | E1,E2,E3       |
I need to insert the differences between PRD and each of MC and GT1 into the DIFF column using Postgres SQL.
Expected result:
SOURCE | Entity         | DiFF
-------+----------------+---------
PRD    | E1,E2,E3,E4,E5 |
MC     | E1,E2          | E3,E4,E5
GT1    | E1,E2,E3       | E4,E5

This is a terrible data model, but anyway:
The following query will get the "difference" between the comma-separated lists (not arrays) for MC and GT1:
select source,
       (select string_agg(xp.item, ',')
        from data p
          cross join unnest(string_to_array(p.entity, ',')) as xp(item)
        where p.source = 'PRD'
          and not exists (select *
                          from unnest(string_to_array(d.entity, ',')) as x(item)
                          where xp.item = x.item)) as diff
from data d
where source in ('MC', 'GT1')
Given your sample data this returns:
source | diff
-------+---------
MC | E3,E4,E5
GT1 | E4,E5
This can be used to UPDATE the table (not "insert"!):
update data
  set diff = t.diff
from (
  select source,
         (select string_agg(xp.item, ',')
          from data p
            cross join unnest(string_to_array(p.entity, ',')) as xp(item)
          where p.source = 'PRD'
            and not exists (select *
                            from unnest(string_to_array(d.entity, ',')) as x(item)
                            where xp.item = x.item)) as diff
  from data d
  where source in ('MC', 'GT1')
) t
where data.source = t.source;
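To try this end to end, here is a minimal, self-contained sketch. The temporary table and its text column types are assumptions based on the sample data in the question. It is a variant rather than the exact statement above: it uses a correlated subquery with `<> all (...)` in place of `not exists`, and adds an `order by` inside `string_agg` so the item order is deterministic.

```sql
-- Assumed table definition; the question does not show the real one.
create temporary table data (source text, entity text, diff text);

insert into data (source, entity) values
  ('PRD', 'E1,E2,E3,E4,E5'),
  ('MC',  'E1,E2'),
  ('GT1', 'E1,E2,E3');

-- For each non-PRD row, keep the PRD items that do not appear in that row's list.
update data d
   set diff = (select string_agg(xp.item, ',' order by xp.item)
               from data p
                 cross join unnest(string_to_array(p.entity, ',')) as xp(item)
               where p.source = 'PRD'
                 and xp.item <> all (string_to_array(d.entity, ',')))
where d.source in ('MC', 'GT1');

-- MC should now show E3,E4,E5 and GT1 should show E4,E5; PRD stays null.
select source, entity, diff
from data
order by source;
```

Note that a row whose list already contains every PRD item ends up with a null diff, because `string_agg` over zero rows returns null.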

Related

Find a difference between 2 tables

I want to check that the poi_equipement table (a relationship table) corresponds to the data in the data table, i.e. a two-way check:
https://dbfiddle.uk/gFMjbIpX
I want to detect that wc (in poi_equipement) is extra, because it is not present in the data table, and that hotel is missing from poi_equipement, because it is present only in the data table.
I don't understand why the EXCEPT query returns only hotel.
I want it to return both hotel and wc.
select object from data where subject = 'url1'
except
select subject from poi_equipement inner join equipement on poi_equipement.equipement_id = equipement.id;
Ideally, I want to know whether the difference is in poi_equipement, in data, or in both tables.
A full outer join will do:
with params as (
  select 'url1' as subject),
data_object as (
  select d.object
  from data d
    join params prm
      on d.subject = prm.subject),
equipment_subject as (
  select e.subject
  from poi_equipement pe
    join poi p
      on pe.poi_id = p.id
    join equipement e
      on pe.equipement_id = e.id
    join params prm
      on p.id_url = prm.subject)
select d.object as data,
       e.subject as poi_equipment
from data_object d
  full outer join equipment_subject e
    on d.object = e.subject
where d.object is null
   or e.subject is null;
Result:
data |poi_equipment|
-----+-------------+
hotel| |
|wc |
You can remove the WHERE clause if you also need to see which items are in both places.
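The reason the original query returns only hotel is that EXCEPT is one-directional: it returns rows from the first query that are missing from the second, never the reverse. As an alternative to the full outer join, the two directions can be computed separately and combined. The sketch below is self-contained and uses made-up inline values (hotel, restaurant, wc) standing in for the real tables, so the CTE names and contents are assumptions.

```sql
-- Hypothetical stand-ins for the objects in data and the subjects reachable
-- through poi_equipement.
with data_objects(item) as (
  values ('hotel'), ('restaurant')
),
equipment_subjects(item) as (
  values ('restaurant'), ('wc')
)
-- Direction 1: present in data, missing from poi_equipement.
select item, 'only in data' as found_in
from (select item from data_objects
      except
      select item from equipment_subjects) d
union all
-- Direction 2: present in poi_equipement, missing from data.
select item, 'only in poi_equipement'
from (select item from equipment_subjects
      except
      select item from data_objects) e
order by item;
```

With these values the query returns hotel as "only in data" and wc as "only in poi_equipement"; restaurant is in both and is not reported.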

Merge in postgres

I am trying to convert the Oracle query below to Postgres:
MERGE INTO table1 g
USING (SELECT distinct g.CDD , d.SGR
from table2 g, table3 d
where g.IDF = d.IDF) f
ON (g.SGR = f.SGR and g.CDD = f.CDD)
WHEN NOT MATCHED THEN
INSERT (SGR, CDD)
VALUES (f.SGR, f.CDD);
I changed it as below to be compatible with Postgres:
WITH f AS (
SELECT distinct g.CDD , d.SGR
from table2 g, table3 d
where g.IDF = d.IDF
),
upd AS (
update table1 g
set
SGR = f.SGR , CDD = f.CDD
FROM f where g.SGR = f.SGR and g.CDD = f.CDD
returning g.CDD, g.SGR
)
INSERT INTO table1(SGR, CDD ) SELECT f.SGR, f.CDD FROM f;
But I have doubts about it: my Oracle query does not update any columns when the data matches, and I am unable to convert it accordingly. Can anyone help me correct it?
Assuming you have a primary (or unique) key on (sgr, cdd) you can convert this to an insert ... on conflict statement:
insert into table1 (SGR, CDD)
select distinct d.SGR, g.CDD
from table2 g
  join table3 d ON g.IDF = d.IDF
on conflict (sgr, cdd) do nothing;
If you don't have a unique constraint (which raises the question: why?) then a straightforward INSERT ... SELECT statement should work (which would have worked in Oracle as well).
WITH f AS (
  SELECT distinct g.CDD, d.SGR
  from table2 g
    join table3 d on g.IDF = d.IDF
)
INSERT INTO table1 (SGR, CDD)
SELECT f.SGR, f.CDD
FROM f
WHERE NOT EXISTS (select *
                  from table1 t1
                  where (t1.sgr, t1.cdd) = (f.sgr, f.cdd));
Note that this is NOT safe for concurrent execution (and neither is Oracle's MERGE statement). You can still wind up with duplicate values in table1 (with regards to the combination of (sgr,cdd)).
The only sensible way to prevent duplicates is to create a unique index (or constraint) - which would enable you to use the much more efficient insert on conflict. You should really consider that if your business rules disallow duplicates.
Note that I converted your ancient, implicit join in the WHERE clause to a modern, explicit JOIN operator, but it is not required for this to work.
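As a rough illustration of the insert ... on conflict approach, here is a small self-contained sketch. The integer column types, the sample rows, and the unique constraint on (sgr, cdd) are all assumptions, since the question does not show the table definitions.

```sql
-- Assumed definitions; only the column names come from the question.
create temporary table table1 (sgr int, cdd int, unique (sgr, cdd));
create temporary table table2 (idf int, cdd int);
create temporary table table3 (idf int, sgr int);

insert into table1 (sgr, cdd) values (1, 10);        -- already present
insert into table2 (idf, cdd) values (100, 10), (200, 20);
insert into table3 (idf, sgr) values (100, 1), (200, 2);

-- (1, 10) conflicts with the existing row and is skipped;
-- (2, 20) is new and gets inserted.
insert into table1 (sgr, cdd)
select distinct d.sgr, g.cdd
from table2 g
  join table3 d on g.idf = d.idf
on conflict (sgr, cdd) do nothing;

select sgr, cdd
from table1
order by sgr;
```

After running this, table1 holds two rows, (1, 10) and (2, 20). Running the insert a second time adds nothing, which matches the WHEN NOT MATCHED THEN INSERT behaviour of the original MERGE.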

Query table with multiple joined values

I've created a query that joins six tables:
SELECT a.accession, b.value, c.name, d.description, e.value, f.seqlen, f.residues
FROM chado.dbxref a inner join chado.dbxrefprop b on a.dbxref_id = b.dbxref_id
inner join chado.biomaterial d on b.dbxref_id = d.dbxref_id
inner join chado.feature f on d.dbxref_id = f.dbxref_id
inner join chado.biomaterialprop e on d.biomaterial_id = e.biomaterial_id
inner join chado.contact c on d.biosourceprovider_id = c.contact_id;
The output:
I'm currently working with a PostgreSQL schema called Chado (http://gmod.org/wiki/Chado_Tables). My attempts to comply with the preexisting schema have led me to deposit multiple joined values within the same table (two different values within the dbxrefprop table, three different values within the biomaterialprop table). Querying the database results in a substantial amount of redundant output. Is there a way for me to reduce output redundancy by modifying my query statement? Ideally, I'd like the output to resemble the following:
test001 | GB0101 | source011 | Faaberg,K.; Lyoo,K.; Korol,D.M. | serum | T1 | Iowa, USA | 01 Jan 2005 | 1234 | AUGAACGCCUUGCAUUACUAUGACUAUGAUU
Working query statement:
SELECT a.accession,
       string_agg(distinct b.value, ' | ' ORDER BY b.value) AS bvalue_list,
       c.name,
       d.description,
       string_agg(distinct e.value, ' | ' ORDER BY e.value) AS evalue_list,
       f.seqlen,
       f.residues
FROM chado.dbxref a INNER JOIN chado.dbxrefprop b ON a.dbxref_id = b.dbxref_id
INNER JOIN chado.biomaterial d ON b.dbxref_id = d.dbxref_id
INNER JOIN chado.feature f ON d.dbxref_id = f.dbxref_id
INNER JOIN chado.biomaterialprop e ON d.biomaterial_id = e.biomaterial_id
INNER JOIN chado.contact c ON d.biosourceprovider_id = c.contact_id
GROUP BY a.accession, c.name, d.description, f.seqlen, f.residues;
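The redundancy comes from join fan-out: every property row repeats the parent row, and joining two property tables multiplies the repeats. A minimal sketch of how string_agg(distinct ...) collapses them is below; the inline rows are invented sample values, not actual Chado data.

```sql
-- Hypothetical rows as they might look after the joins: the same accession
-- repeated once per property value, with one value duplicated by fan-out.
with joined(accession, value) as (
  values ('test001', 'source011'),
         ('test001', 'GB0101'),
         ('test001', 'GB0101')
)
select accession,
       string_agg(distinct value, ' | ' order by value) as value_list
from joined
group by accession;
```

This returns a single row for test001 with value_list GB0101 | source011. With DISTINCT, the ORDER BY expression has to be one of the aggregate's own arguments, which is why it sorts by value.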

Postgres SQL Cursor

I'm a SQL Server guy and I have a need to write some dynamic SQL in Postgres. Here's what I need. The dynamic SQL would be dependent upon integers produced by this query:
SELECT local_channel_id
FROM d_channels dc
INNER JOIN channel c ON c.id = dc.channel_id
AND c.name LIKE '%__Achv'
Using this, I need to build and execute a select and subsequent union select on the query below, substituting the values produced by the above query where indicated by {X} (4 places):
SELECT
dmc.message_id,
dmm.received_date,
dmm.server_id,
dc.channel_id,
dmcm."SOURCE",
dmcm."TYPE",
dmm.status,
dmc.content
FROM
d_mc{X} dmc
INNER JOIN
d_mm{X} dmm ON dmc.message_id = dmm.message_id
INNER JOIN
d_channels dc ON dc.local_channel_id = {X}
INNER JOIN
d_mcm{X} dmcm ON dmcm.message_id = dmc.message_id
AND dmcm.metadata_id = 0
WHERE
dmm.connector_name = 'Source'
AND dmc.content_type = 1 --Raw
AND date(dmm.received_date) + interval '7' < now()
Can anybody help with this? I'm truly clueless when it comes to Postgres.
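One possible approach, sketched here under assumptions, is to build the UNION ALL text with format() and string_agg() inside a PL/pgSQL DO block and run it with EXECUTE. Everything in the miniature schema below is hypothetical (column types, sample rows, the reduced column list, and the achv_messages result table); only the table-name pattern d_mc{X} and the channel lookup come from the question. The full query with d_mm{X} and d_mcm{X} would follow the same pattern.

```sql
-- Hypothetical miniature schema with two matching channels.
create temporary table channel (id int, name text);
create temporary table d_channels (channel_id int, local_channel_id int);
create temporary table d_mc1 (message_id int, content text);
create temporary table d_mc2 (message_id int, content text);

insert into channel values (1, 'Lab__Achv'), (2, 'Rad__Achv'), (3, 'Other');
insert into d_channels values (1, 1), (2, 2), (3, 3);
insert into d_mc1 values (11, 'from channel 1');
insert into d_mc2 values (21, 'from channel 2');

do $$
declare
    union_sql text;
begin
    -- One SELECT per channel; %I quotes the generated table name safely.
    select string_agg(
               format('select %s as local_channel_id, message_id, content from %I',
                      dc.local_channel_id,
                      'd_mc' || dc.local_channel_id),
               ' union all ')
      into union_sql
      from d_channels dc
        join channel c on c.id = dc.channel_id
     where c.name like '%__Achv';

    -- A DO block cannot return rows, so the result is stored in a temp table.
    if union_sql is not null then
        execute 'create temporary table achv_messages as ' || union_sql;
    end if;
end
$$;

select local_channel_id, message_id, content
from achv_messages
order by message_id;
```

A DO block is used here only to keep the sketch short; a function declared with RETURNS TABLE and RETURN QUERY EXECUTE would let callers select from the result directly.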

Bulk update a column in Oracle 11G

I have two tables, say Table1 and Table2, that contain the following columns. I need to join on them and update a column of Table1 with the value of the same column in Table2.
Columns for Join condition:
Table1.mem_ssn and Table2.ins_ssn
Table1.sys_id and Table2.sys_id
Table1.grp_id and Table2.grp_id
Column to update:
Table1.dtofhire=Table2.dtofhire
I need a way to bulk update (using single update query without looping) the above mentioned column in Oracle 11G.
Table1 does not contain any key constraint specified since it will be used as a staging table for Data upload.
Please help me out to update the same.
You can use the MERGE statement.
It should look something like this:
MERGE INTO table1 D
USING (SELECT * FROM table2 ) S
ON (D.mem_ssn = S.ins_ssn and D.sys_id = S.sys_id and D.grp_id=S.grp_id)
WHEN MATCHED THEN
UPDATE SET D.dtofhire=S.dtofhire;
UPDATE:
Since you have more than one row in table2 with the same (ins_ssn,sys_id,grp_id) and you want the max dtofhire, you should change the query in the using clause:
MERGE INTO table1 D
USING (SELECT ins_ssn, sys_id, grp_id, max(dtofhire) m_dtofhire
FROM table2
GROUP BY ins_ssn,sys_id,grp_id) S
ON (D.mem_ssn = S.ins_ssn and D.sys_id = S.sys_id and D.grp_id=S.grp_id)
WHEN MATCHED THEN
UPDATE SET D.dtofhire=S.m_dtofhire;
The query I ended up using to achieve this is shown below:
UPDATE table1 T2
SET dtofhire = (SELECT Max(dtofhire) AS dtofhire
                FROM table2 T1
                WHERE T2.mem_ssn = T1.ins_ssn
                  AND T2.sys_id = T1.sys_id
                  AND T2.grp_id = T1.grp_id
                GROUP BY ins_ssn,
                         sys_id,
                         grp_id)
WHERE ( mem_ssn, sys_id, grp_id ) IN (SELECT ins_ssn,
                                             sys_id,
                                             grp_id
                                      FROM table2 );