Getting duplicate records from 2 SQL tables - db2

I have 2 SQL tables.
Table #1
account product expiry-date
101 prod1 2021-01-30
102 prod2 2021-02-20
103 prod3 2021-03-09
103 prod3 2021-03-19
104 prod4 2021-03-15
105 prod5 2021-04-23
105 prod5 2021-04-24
106 prod6 2021-04-25
Table #2
account
101
106
From the above 2 tables I want to get only unmatched records from Table1 and avoid duplicate records.
Result:
account product expiry-date
102 prod2 2021-02-20
103 prod3 2021-03-09
104 prod4 2021-03-15
105 prod5 2021-04-23
I tried the query below, but I still get duplicate accounts, because the expiry date differs between rows of the same account and so DISTINCT keeps both rows. The records I get are shown after the query.
SQL query I tried:
select distinct (a.account, a.product, a.expiry-date)
from table1 a
where a.account not in (select account from table2)
Result:
account product expiry-date
102 prod2 2021-02-20
103 prod3 2021-03-09
103 prod3 2021-03-19
104 prod4 2021-03-15
105 prod5 2021-04-23
105 prod5 2021-04-24

You can keep the same query and add aggregation, taking the earliest expiry date per account and product:
SELECT a.account
      ,a.product
      ,MIN(a.expiry) expiry
FROM table1 a
WHERE a.account NOT IN (
        SELECT account
        FROM table2
      )
GROUP BY a.account
        ,a.product
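To check the aggregation approach against the sample data from the question, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for Db2, and it assumes the date column is simply named expiry, as in the answer, and is stored as ISO text.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (account INTEGER, product TEXT, expiry TEXT);
    CREATE TABLE table2 (account INTEGER);
    INSERT INTO table1 VALUES
        (101, 'prod1', '2021-01-30'), (102, 'prod2', '2021-02-20'),
        (103, 'prod3', '2021-03-09'), (103, 'prod3', '2021-03-19'),
        (104, 'prod4', '2021-03-15'), (105, 'prod5', '2021-04-23'),
        (105, 'prod5', '2021-04-24'), (106, 'prod6', '2021-04-25');
    INSERT INTO table2 VALUES (101), (106);
""")

# Keep accounts missing from table2, one row per account/product,
# with the earliest expiry date (ISO text compares chronologically).
agg_rows = conn.execute("""
    SELECT a.account, a.product, MIN(a.expiry) AS expiry
    FROM table1 a
    WHERE a.account NOT IN (SELECT account FROM table2)
    GROUP BY a.account, a.product
    ORDER BY a.account
""").fetchall()

for row in agg_rows:
    print(row)
```

On this data the four printed rows match the result the question asks for.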

You can use an anti-join and then ROW_NUMBER(). For example:
select *
from (
  select a.*, row_number() over(partition by a.account order by a.expiry) as rn
  from table1 a
  left join table2 b on b.account = a.account
  where b.account is null
) x
where rn = 1
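The anti-join plus ROW_NUMBER() variant can be tried the same way. This is only a sketch under assumptions: SQLite 3.25 or newer (needed for window functions) stands in for Db2, and the date column is named expiry as in the answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Db2 (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (account INTEGER, product TEXT, expiry TEXT);
    CREATE TABLE table2 (account INTEGER);
    INSERT INTO table1 VALUES
        (101, 'prod1', '2021-01-30'), (102, 'prod2', '2021-02-20'),
        (103, 'prod3', '2021-03-09'), (103, 'prod3', '2021-03-19'),
        (104, 'prod4', '2021-03-15'), (105, 'prod5', '2021-04-23'),
        (105, 'prod5', '2021-04-24'), (106, 'prod6', '2021-04-25');
    INSERT INTO table2 VALUES (101), (106);
""")

# LEFT JOIN ... IS NULL keeps only accounts with no match in table2;
# ROW_NUMBER() then numbers the rows of each account by expiry date
# and the outer query keeps the first one.
dedup_rows = conn.execute("""
    SELECT account, product, expiry
    FROM (
        SELECT a.*,
               ROW_NUMBER() OVER (PARTITION BY a.account
                                  ORDER BY a.expiry) AS rn
        FROM table1 a
        LEFT JOIN table2 b ON b.account = a.account
        WHERE b.account IS NULL
    ) x
    WHERE rn = 1
    ORDER BY account
""").fetchall()

for row in dedup_rows:
    print(row)
```

Compared with GROUP BY and MIN, this form returns the whole earliest row, which matters once table1 has more columns than the ones being grouped.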

Related

postgresql select rows from same table twice

I want to compare the deposit of each person across rows and return all rows where the deposit has decreased.
The customer table is:
person_id employee_id deposit ts
101 201 44 2021-09-30 10:12:19+00
100 200 45 2021-09-30 10:12:19+00
101 201 47 2021-09-30 09:12:19+00
100 200 21 2021-09-29 10:12:19+00
104 203 54 2021-09-27 10:12:19+00
The result I want is:
person_id employee_id deposit ts
101 201 44 2021-09-30 10:12:19+00
Here is what I have done so far:
SELECT person_id,
employee_id,
deposit,
ts,
lag(deposit) over client_window as pre_deposit,
lag(ts) over client_window as pre_ts
FROM customer
WINDOW client_window as (partition by person_id order by ts)
ORDER BY person_id , ts
It returns the following result:
person_id employee_id deposit ts pre_deposit pre_ts
101 201 44 2021-09-30 10:12:19+00 47 2021-09-30 09:12:19+00
100 200 45 2021-09-30 10:12:19+00 21 2021-09-29 10:12:19+00
101 201 47 2021-09-30 09:12:19+00 null null
100 200 21 2021-09-29 10:12:19+00 null null
104 203 54 2021-09-27 10:12:19+00 null null
I then tried to filter on pre_deposit directly:
SELECT person_id,
employee_id,
deposit,
ts,
lag(deposit) over client_window as pre_deposit,
lag(ts) over client_window as pre_ts
FROM customer
WINDOW client_window as (partition by person_id order by ts)
WHERE pre_deposit > deposit -- this fails with "column not found" for pre_deposit
ORDER BY person_id , ts
It seems I need to select from the same table again to be able to apply this condition:
where pre_deposit > deposit
What makes sense here: a union, an outer join, a left join or a right join?

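Window functions are evaluated after the WHERE clause, so pre_deposit cannot be referenced in the WHERE of the same query level. Use your query as a subquery and filter the results in the outer query:
SELECT person_id, employee_id, deposit, ts
FROM (
  SELECT *, lag(deposit) over client_window as pre_deposit
  FROM customer
  WINDOW client_window as (partition by person_id order by ts)
) t
WHERE deposit < pre_deposit
ORDER BY person_id, ts;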
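The subquery approach can be reproduced on the question's sample rows with a small sketch. It assumes SQLite 3.25+ (for LAG) in place of PostgreSQL, stores ts as text, and writes the window inline instead of using a named WINDOW clause.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (person_id INTEGER, employee_id INTEGER,
                           deposit INTEGER, ts TEXT);
    INSERT INTO customer VALUES
        (101, 201, 44, '2021-09-30 10:12:19+00'),
        (100, 200, 45, '2021-09-30 10:12:19+00'),
        (101, 201, 47, '2021-09-30 09:12:19+00'),
        (100, 200, 21, '2021-09-29 10:12:19+00'),
        (104, 203, 54, '2021-09-27 10:12:19+00');
""")

# The inner query computes the previous deposit per person;
# the outer query can then filter on that computed column.
decreased_rows = conn.execute("""
    SELECT person_id, employee_id, deposit, ts
    FROM (
        SELECT *,
               LAG(deposit) OVER (PARTITION BY person_id
                                  ORDER BY ts) AS pre_deposit
        FROM customer
    ) t
    WHERE deposit < pre_deposit
    ORDER BY person_id, ts
""").fetchall()

print(decreased_rows)
```

Rows with no previous deposit have a null pre_deposit, so the comparison is unknown and they drop out without any extra condition.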

SQL Server CTE and recursion example misunderstanding

Could someone help me with a CTE expression? I have a table:
old_card new_card dt
111 555 2020-01-09
222 223 2020-02-10
333 334 2020-03-11
444 222 2020-04-12
555 666 2020-05-12
666 777 2020-06-13
777 888 2020-07-14
888 0 2020-08-15
999 333 2020-09-16
223 111 2020-10-16
I need to get the whole chain of changes from old_card number 111 to new_card number 0. So I should get 5 records from this table, with only new_card = 0 as the input parameter:
old_card new_card dt
111 555 2020-01-09
555 666 2020-05-12
666 777 2020-06-13
777 888 2020-07-14
888 0 2020-08-15
I am trying to do this with a CTE, but I get extra records from the source table and can't understand why. Here is my CTE:
;with cte as(
select
old_card,
new_card,
dt
from
cards_transfer
where
new_card = 0
union all
select
t1.old_card,
t1.new_card,
t1.dt
from
cards_transfer t1
inner join
cte on cte.old_card = t1.new_card)
But I get 8 rows instead. Can someone please tell me what I did wrong?

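You said you want the chain from 111 onwards, but nothing in the recursion tells it to stop there. Card 111 was itself issued as a replacement (223 -> 111, and before that 222 -> 223 and 444 -> 222), so the recursion keeps walking back and returns those 3 extra rows. You need to add that "stop" condition to the recursive part:
where cte.old_card <> 111
;with cte as(
  select
    old_card,
    new_card,
    dt
  from
    cards_transfer
  where
    new_card = 0
  union all
  select
    t1.old_card,
    t1.new_card,
    t1.dt
  from
    cards_transfer t1
  inner join
    cte on cte.old_card = t1.new_card
  where cte.old_card <> 111
)
select old_card, new_card, dt
from cte;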
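To see the effect of the stop condition, the sketch below runs the recursion with and without it on the question's data. It assumes SQLite in place of SQL Server, which is why the CTE is written as WITH RECURSIVE; the QUERY template and its stop placeholder are only illustration helpers.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cards_transfer (old_card INTEGER, new_card INTEGER, dt TEXT);
    INSERT INTO cards_transfer VALUES
        (111, 555, '2020-01-09'), (222, 223, '2020-02-10'),
        (333, 334, '2020-03-11'), (444, 222, '2020-04-12'),
        (555, 666, '2020-05-12'), (666, 777, '2020-06-13'),
        (777, 888, '2020-07-14'), (888, 0,   '2020-08-15'),
        (999, 333, '2020-09-16'), (223, 111, '2020-10-16');
""")

# Hypothetical template: {stop} is either empty or the extra WHERE clause.
QUERY = """
WITH RECURSIVE cte AS (
    SELECT old_card, new_card, dt
    FROM cards_transfer
    WHERE new_card = 0
    UNION ALL
    SELECT t1.old_card, t1.new_card, t1.dt
    FROM cards_transfer t1
    JOIN cte ON cte.old_card = t1.new_card
    {stop}
)
SELECT old_card, new_card, dt FROM cte ORDER BY dt
"""

# Without a stop condition the walk continues past card 111.
unbounded_rows = conn.execute(QUERY.format(stop="")).fetchall()
# With it, recursion ends once the row whose old_card is 111 is reached.
bounded_rows = conn.execute(
    QUERY.format(stop="WHERE cte.old_card <> 111")
).fetchall()

print(len(unbounded_rows), len(bounded_rows))
for row in bounded_rows:
    print(row)
```

The unbounded version returns the 8 rows from the question, while the bounded one returns the 5-row chain from 111 to 0.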

Joining distinct counts from another table per group

I have 2 tables:
table_1
date id_1 name id_2 transaction_id
202116 1 Google 235 ABAF51
202116 1 Google 489 GHH512
202116 1 Google 973 JDDF12
202116 1 Google 1189 HDFTS1
202116 1 Amazon 207 HSDY12
202116 1 Amazon 3329 KFGJD88
202116 1 Amazon 3360 JHTJDS1
202116 1 Facebook 862 SYTAHJ4
table_2
date id_1 name id_2
202116 1 Google 22
202116 1 Google 102
202116 1 Google 104
202116 1 Google 196
202116 1 Amazon 228
202116 1 Facebook 230
202116 1 Google 235
202116 1 Google 240
I am trying to have a table like so:
date id_1 name id_2 transactions
202116 1 Google 22 1
202116 1 Google 102 3
202116 1 Google 104 4
202116 1 Google 196 2
202116 1 Amazon 228 3
202116 1 Facebook 230 7
202116 1 Google 235 3
202116 1 Google 240 2
Where transactions is the distinct count of transaction_id from table_1 per group of date, id_1, name, id_2 (mapped to table_2 and joined by date, id_1, name, id_2).
So, the idea would be to count distinct transaction_id from table_1 values for
date id_1 name id_2
202116 1 Google 235
And assign the value ( let's say 1 ) to table_2 column transactions where:
date id_1 name id_2 transactions
202116 1 Google 235 1
And so on for each combination of date, id_1, name, id_2.
What I've tried:
select jp.date, jp.id_1, jp.name, jp.id_2, count(distinct(transaction_id)) from table_2 jp
left join table_1 using(date, id_1, name, id_2)
group by jp.date,jp.id_1, jp.name, jp.id_2,transaction_id
But it does not give me the correct output.
How can I achieve the desired result?

Without knowing the details of your table structure it's a bit hard to be sure, but why don't you solve the problem in 2 steps?
First, count the distinct transaction_id values from table_1 per date, id_1, name, id_2:
with first_selection as (
  select date, id_1, name, id_2, count(distinct transaction_id) nr_transactions
  from table_1
  group by date, id_1, name, id_2
)
Then join the result with table_2:
select t.date,
  t.id_1,
  t.name,
  t.id_2,
  fs.nr_transactions
from table_2 t
join first_selection fs
  on t.date = fs.date
  and t.id_1 = fs.id_1
  and t.id_2 = fs.id_2
  and t.name = fs.name
The complete query is then:
with first_selection as (
  select date, id_1, name, id_2, count(distinct transaction_id) nr_transactions
  from table_1
  group by date, id_1, name, id_2
)
select t.date,
  t.id_1,
  t.name,
  t.id_2,
  fs.nr_transactions
from table_2 t
join first_selection fs
  on t.date = fs.date
  and t.id_1 = fs.id_1
  and t.id_2 = fs.id_2
  and t.name = fs.name
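Here is a minimal sketch of the two-step idea on the sample rows from the question, assuming SQLite as the engine (the question does not name one) and integer and text column types. In this sample only the Google / 235 combination exists in both tables.

```python
import sqlite3

# In-memory SQLite database; engine and column types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (date INTEGER, id_1 INTEGER, name TEXT,
                          id_2 INTEGER, transaction_id TEXT);
    CREATE TABLE table_2 (date INTEGER, id_1 INTEGER, name TEXT,
                          id_2 INTEGER);
    INSERT INTO table_1 VALUES
        (202116, 1, 'Google', 235, 'ABAF51'),
        (202116, 1, 'Google', 489, 'GHH512'),
        (202116, 1, 'Google', 973, 'JDDF12'),
        (202116, 1, 'Google', 1189, 'HDFTS1'),
        (202116, 1, 'Amazon', 207, 'HSDY12'),
        (202116, 1, 'Amazon', 3329, 'KFGJD88'),
        (202116, 1, 'Amazon', 3360, 'JHTJDS1'),
        (202116, 1, 'Facebook', 862, 'SYTAHJ4');
    INSERT INTO table_2 VALUES
        (202116, 1, 'Google', 22), (202116, 1, 'Google', 102),
        (202116, 1, 'Google', 104), (202116, 1, 'Google', 196),
        (202116, 1, 'Amazon', 228), (202116, 1, 'Facebook', 230),
        (202116, 1, 'Google', 235), (202116, 1, 'Google', 240);
""")

# Step 1 (CTE): distinct transactions per group in table_1.
# Step 2: join those counts onto table_2 on all four key columns.
count_rows = conn.execute("""
    WITH first_selection AS (
        SELECT date, id_1, name, id_2,
               COUNT(DISTINCT transaction_id) AS nr_transactions
        FROM table_1
        GROUP BY date, id_1, name, id_2
    )
    SELECT t.date, t.id_1, t.name, t.id_2, fs.nr_transactions
    FROM table_2 t
    JOIN first_selection fs
      ON t.date = fs.date AND t.id_1 = fs.id_1
     AND t.name = fs.name AND t.id_2 = fs.id_2
""").fetchall()

print(count_rows)
```

Because this is an inner join, table_2 rows without any transaction disappear; a LEFT JOIN with COALESCE(fs.nr_transactions, 0) would keep them with a count of 0.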

PostgreSQL : comparing two sets of results does not work

I have a table with 3 id columns (clothes, shoes and customers) that relates them to each other.
I have a query that works fine (all clothes and shoes of customer 101):
select clothes, shoes from table where customers = 101
This returns
clothes - shoes (SET A)
1 6
1 2
33 12
24 null
Another query that works fine (all clothes and shoes of any customer other than 101 who has the specified clothes):
select clothes ,shoes from table
where customers in
(select customers from table where clothes = 1 and customers <> 101 )
This returns
shoes - clothes(SET B)
6 null
null 24
1 1
2 1
12 null
null 26
14 null
Now I want to get all clothes and shoes from SET A that are not in SET B.
So, for example, select from SET A where NOT IN SET B. This should return just clothes 33, right?
I tried to convert this into a working query:
select clothes, shoes from table where customers = 101
and
(clothes,shoes) not in
(
select clothes,shoes from
table where customers in
(select customers from table where clothes = 1 and customers <> 101 )
) ;
I tried different syntaxes, but the one above looks the most logical.
The problem is that I never get clothes 33, just an empty set.
How do I fix this? What goes wrong?
Thanks
Edit: here are the contents of the table
id shoes customers clothes
1 1 1 1
2 1 4 1
3 1 5 1
4 2 2 2
5 2 3 1
6 1 3 1
44 2 101 1
46 6 101 1
49 12 101 33
51 13 102
52 101 24
59 107 51
60 107 24
62 23 108 51
63 23 108 2
93 124 25
95 6 125
98 127 25
100 3 128
103 24 131
104 25 132
105 102 28
106 10 102
107 23 133
108 4 26
109 6 4
110 4 24
111 12 4
112 14 4
116 102 48
117 102 24
118 102 25
119 102 26
120 102 29
122 134 31

The EXCEPT clause in PostgreSQL works the way the MINUS operator does in Oracle. I think that will give you what you want.
Notionally your query looks right, but I suspect those pesky nulls are affecting your results. Just as a null is neither equal nor not equal to 5 (a comparison with null yields unknown), a null is neither "in" nor "not in" anything. A comparison against a null in the subquery result evaluates to unknown instead of true, and rows whose NOT IN condition is unknown are filtered out. EXCEPT, by contrast, treats two nulls as equal when it compares rows.
select clothes, shoes
from table1
where customers = 101
except
select clothes, shoes
from table1
where customers in (
  select customers
  from table1
  where clothes = 1 and customers != 101
)
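The difference between NOT IN and EXCEPT when nulls are involved can be seen in a small sketch. It assumes SQLite rather than PostgreSQL, uses a made-up table t1 with a handful of illustrative rows (not the full table from the question), and tests NOT IN on the single clothes column to keep the null logic easy to follow.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (customers INTEGER, clothes INTEGER, shoes INTEGER);
    -- Illustrative rows only: customer 101 plus one other customer (4).
    INSERT INTO t1 VALUES
        (101, 1, 6), (101, 1, 2), (101, 33, 12), (101, 24, NULL),
        (4, 1, 1), (4, 1, 2), (4, NULL, 6), (4, NULL, 12);
""")

# Other customers who have clothes = 1 (here: customer 4).
OTHERS = "SELECT customers FROM t1 WHERE clothes = 1 AND customers <> 101"

# The subquery returns 1, 1, NULL, NULL. For clothes 33 and 24 the
# NOT IN test is unknown (not true), so no row survives.
not_in_rows = conn.execute(f"""
    SELECT clothes FROM t1
    WHERE customers = 101
      AND clothes NOT IN (
          SELECT clothes FROM t1 WHERE customers IN ({OTHERS})
      )
""").fetchall()

# EXCEPT compares whole rows and treats NULLs as equal to each other.
except_rows = conn.execute(f"""
    SELECT clothes, shoes FROM t1 WHERE customers = 101
    EXCEPT
    SELECT clothes, shoes FROM t1 WHERE customers IN ({OTHERS})
""").fetchall()

print(not_in_rows)
print(sorted(except_rows, key=str))
```

With this data the NOT IN query returns nothing at all, while EXCEPT returns the pairs that customer 101 does not share with the other customer.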

In PostgreSQL null is an unknown value, so you must get rid of potential nulls in the subquery results:
select id,clothes,shoes from t1 where customers = 101 -- or select id...
and (
clothes not in
(
select COALESCE(clothes,-1) from
t1 where customers in
(select customers from t1 where clothes = 1 and customers <> 101 )
)
OR
shoes not in
(
select COALESCE(shoes,-1) from
t1 where customers in
(select customers from t1 where clothes = 1 and customers <> 101 )
)
)
If you want unique pairs, you would use:
select clothes, shoes from t1 where customers = 101
and
(clothes,shoes) not in
(
select coalesce(clothes,-1),coalesce(shoes,-1) from
t1 where customers in
(select customers from t1 where clothes = 1 and customers <> 101 )
) ;
You can't get just "clothes 33" if you are selecting both the clothes and shoes columns...
Also, if you need to know exactly which column (clothes or shoes) was unique to this customer, you might use this little "hack":
select id,clothes,-1 AS shoes from t1 where customers = 101
and
clothes not in
(
select COALESCE(clothes,-1) from
t1 where customers in
(select customers from t1 where clothes = 1 and customers <> 101)
)
UNION
select id,-1,shoes from t1 where customers = 101
and
shoes not in
(
select COALESCE(shoes,-1) from
t1 where customers in
(select customers from t1 where clothes = 1 and customers <> 101)
)
And your result would be:
id=49, clothes=33, shoes=-1
(I assume that there aren't any clothes or shoes with id -1; you may use any other value that cannot occur in the data.)
Cheers

Sql join and remove distinct in two separate column

I have a table named ordered:
form_id | procedure_id
----------+-------------
101 | 24
101 | 23
101 | 22
102 | 7
102 | 6
102 | 3
102 | 2
And another table named performed:
form_id | procedure_id
----------+-------------
101 | 42
101 | 45
102 | 5
102 | 3
102 | 7
102 | 12
102 | 13
Expected output
form_id o_procedure_id p_procedure_id
101 24 42
101 23 45
101 22 NULL
102 7 7
102 6 5
102 3 3
102 2 12
102 NULL 13
I tried the below query:
with ranked as
(select
dense_rank() over (partition by po.form_id order by po.procedure_id) rn1,
dense_rank() over (partition by po.form_id order by pp.procedure_id) rn2,
po.form_id,
po.procedure_id,
pp.procedure_id
from ordered po,
performed pp where po.form_id = pp.form_id)
select ranked.* from ranked
--where rn1=1 or rn2=1
The above query returns repeated values for the ordered and performed procedure IDs.
How can I get the expected output?

I wasn't quite sure how you would want to handle multiple null values and/or null values on both sides of your tables. My example therefore assumes the first table to be leading and to include all entries, while the second table might have holes. The query below uses test1 and test2 for the two tables and process_id for the procedure column. It isn't pretty, but I suppose it does what you expect:
select test1_sub.form_id, test1_sub.process_id as pid_1, test2_sub.process_id as pid_2 from (
  select form_id,
    process_id,
    rank() over (partition by form_id order by process_id asc nulls last)
  from test1) as test1_sub
left join (
  select * from (
    select form_id,
      process_id,
      rank() over (partition by form_id order by process_id asc nulls last)
    from test2
  ) as test2_nonexposed
) as test2_sub on test1_sub.form_id = test2_sub.form_id and test1_sub.rank = test2_sub.rank;
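The same pairing-by-position idea can be tried on the ordered and performed tables from the question. This is a sketch under assumptions: SQLite 3.25+ stands in for PostgreSQL, ROW_NUMBER() replaces rank() (equivalent here because no procedure_id repeats within a form), and the aliases o, p and rn are made up for the example.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ordered (form_id INTEGER, procedure_id INTEGER);
    CREATE TABLE performed (form_id INTEGER, procedure_id INTEGER);
    INSERT INTO ordered VALUES
        (101, 24), (101, 23), (101, 22),
        (102, 7), (102, 6), (102, 3), (102, 2);
    INSERT INTO performed VALUES
        (101, 42), (101, 45),
        (102, 5), (102, 3), (102, 7), (102, 12), (102, 13);
""")

# Number the rows of each form on both sides, then pair them up
# by (form_id, position). "ordered" is the leading table.
paired_rows = conn.execute("""
    WITH o AS (
        SELECT form_id, procedure_id,
               ROW_NUMBER() OVER (PARTITION BY form_id
                                  ORDER BY procedure_id) AS rn
        FROM ordered
    ),
    p AS (
        SELECT form_id, procedure_id,
               ROW_NUMBER() OVER (PARTITION BY form_id
                                  ORDER BY procedure_id) AS rn
        FROM performed
    )
    SELECT o.form_id, o.procedure_id, p.procedure_id
    FROM o
    LEFT JOIN p ON p.form_id = o.form_id AND p.rn = o.rn
    ORDER BY o.form_id, o.rn
""").fetchall()

for row in paired_rows:
    print(row)
```

The pairs come out in ascending procedure_id order on both sides, which is not the same pairing as in the expected output, and performed row 13 of form 102 is dropped by the left join. In PostgreSQL a FULL OUTER JOIN on the same keys would keep it with a null on the ordered side.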
