Comparing Subqueries - postgresql

I have two subqueries. Here is the output of subquery A....
id | date_lat_lng | stat_total | rnum
-------+--------------------+------------+------
16820 | 2016_10_05_10_3802 | 9 | 2
15701 | 2016_10_05_10_3802 | 9 | 3
16821 | 2016_10_05_11_3802 | 16 | 2
17861 | 2016_10_05_11_3802 | 16 | 3
16840 | 2016_10_05_12_3683 | 42 | 2
17831 | 2016_10_05_12_3767 | 0 | 2
17862 | 2016_10_05_12_3802 | 11 | 2
17888 | 2016_10_05_13_3683 | 35 | 2
17833 | 2016_10_05_13_3767 | 24 | 2
16823 | 2016_10_05_13_3802 | 24 | 2
and subquery B, in which date_lat_lng and stat_total has commonality with subquery A, but id does not.
id | date_lat_lng | stat_total | rnum
-------+--------------------+------------+------
17860 | 2016_10_05_10_3802 | 9 | 1
15702 | 2016_10_05_11_3802 | 16 | 1
17887 | 2016_10_05_12_3683 | 42 | 1
15630 | 2016_10_05_12_3767 | 20 | 1
16822 | 2016_10_05_12_3802 | 20 | 1
16841 | 2016_10_05_13_3683 | 35 | 1
15632 | 2016_10_05_13_3767 | 23 | 1
17863 | 2016_10_05_13_3802 | 3 | 1
16842 | 2016_10_05_14_3683 | 32 | 1
15633 | 2016_10_05_14_3767 | 12 | 1
Both subquery A and B pull data from the same table. I want to delete the rows in that table that share the same ID as subquery A but only where date_lat_lng and stat_total have a shared match in subquery B.
Effectively I need:
DELETE FROM table WHERE
id IN
(SELECT id FROM (subqueryA) WHERE
subqueryA.date_lat_lng=subqueryB.date_lat_lng
AND subqueryA.stat_total=subqueryB.stat_total)
Except I'm not sure where to place subquery B, or if I need an entirely different structure.

Something like this,
DELETE FROM table WHERE
id IN (
SELECT DISTINCT id
FROM subqueryA
JOIN subqueryB
USING (id,date_lat_lng,stat_total)
)

Related

postgresql: query two tables with same column names and show the result side by side ordered their column names, which occur in both tables

Having two tables (table1, table2) with the same column names (generation, parent), the desired output would be the combination of all columns of both tables. Thereby the rows of table2 should join table1 so that the rows of table2 are matching those of table1 on generation column. The parent number should be ordered ascending for the entries in table1 as well as in table2. The number of rows of the query results should be equal of those of table1.
Given the following tables
table1:
| generation | parent |
|:----------:|:------:|
| 0 | 1 |
| 0 | 2 |
| 0 | 3 |
| 1 | 3 |
| 1 | 2 |
| 1 | 1 |
| 2 | 2 |
| 2 | 1 |
| 2 | 3 |
table2:
| generation | parent |
|:----------:|:------:|
| 1 | 3 |
| 1 | 1 |
| 1 | 3 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
The following queries are thought for creating and populating two sample tables as shown above:
create table table1(generation integer, parent integer);
insert into table1 (generation, parent) values(0,1),(0,2),(0,3),(1,3),(1,2),(1,1),(2,2),(2,1),(2,3);
create table table2(generation integer, parent integer);
insert into table2 (generation, parent) values(1,3),(1,1),(1,3),(2,1),(2,2),(2,3);
the imagined query should lead to the following desired result:
| table1_generation | table1_parent | table2_generation | table2_parent |
|:-----------------:|:-------------:|:-----------------:|:-------------:|
| 0 | 1 | | |
| 0 | 2 | | |
| 0 | 3 | | |
| 1 | 1 | 1 | 1 |
| 1 | 2 | 1 | 3 |
| 1 | 3 | 1 | 3 |
| 2 | 1 | 2 | 1 |
| 2 | 2 | 2 | 2 |
| 2 | 3 | 2 | 3 |
Current query looks as follows:
with
p as (
select
generation,
parent
from
table1
order by
generation,
parent
), o as(
select
generation,
parent
from
table2
order by
generation,
parent
)
select
p.generation as table1_generation,
p.parent as table1_parent,
o.generation as table2_generation,
o.parent as table2_parent
from
p
left join o on
o.generation=p.generation;
Which leads to the following result:
| table1_generation | table1_parent | table2_generation | table2_parent |
|:-----------------:|:-------------:|:-----------------:|:-------------:|
| 0 | 1 | | |
| 0 | 2 | | |
| 0 | 3 | | |
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 3 |
| 1 | 1 | 1 | 3 |
| 1 | 2 | 1 | 1 |
| 1 | 2 | 1 | 3 |
| 1 | 2 | 1 | 3 |
| 1 | 3 | 1 | 1 |
| 1 | 3 | 1 | 3 |
| 1 | 3 | 1 | 3 |
| 2 | 1 | 2 | 1 |
| 2 | 1 | 2 | 2 |
| 2 | 1 | 2 | 3 |
| 2 | 2 | 2 | 1 |
| 2 | 2 | 2 | 2 |
| 2 | 2 | 2 | 3 |
| 2 | 3 | 2 | 1 |
| 2 | 3 | 2 | 2 |
| 2 | 3 | 2 | 3 |
This link led to the conclusion, that any join command might not what is necessary here ... But union does only append rows... so for me it is absolutely unclear, how the desired result can be achieved o.O
Any help is highly appreciated. Thanks in advance!
The main misunderstanding on this question arose from the fact that you mentioned join, which is a very precisely mathematically defined concept based on the Cartesian product and can be applied to any two sets. So the current output is clear.
But as you wrote in the title, you want to put two tables side by side. You take advantage of the fact that they have the same number of rows (triples).
This select returns the output you want.
I made artificial join columns, row_number() OVER (order by generation, parent) as rnum, and moved the second table using the addition of three. I hope this helps you:
with
p as (
select
row_number() OVER (order by generation, parent) as rnum,
generation,
parent
from
table1
order by
generation,
parent
), o as(
select
row_number() OVER (order by generation, parent) as rnum,
generation,
parent
from
table2
order by
generation,
parent
)
select
p.generation as table1_generation,
p.parent as table1_parent,
o.generation as table2_generation,
o.parent as table2_parent
from
p
left join o on
o.rnum+3=p.rnum
order by 1,2,3,4;
Output:
table1_generation
table1_parent
table2_generation
table2_parent
0
1
(null)
(null)
0
2
(null)
(null)
0
3
(null)
(null)
1
1
1
1
1
2
1
3
1
3
1
3
2
1
2
1
2
2
2
2
2
3
2
3

postgres sum different table columns from many to one joined data

Suppose I have the following two tables:
foo:
| id | goober | value |
|----|--------|-------|
| 1 | a1 | 25 |
| 2 | a1 | 125 |
| 3 | b2 | 500 |
bar:
| id | foo_id | value |
|----|--------|-------|
| 1 | 1 | 4 |
| 2 | 3 | 19 |
| 3 | 3 | 42 |
| 4 | 3 | 22 |
| 5 | 3 | 56 |
Note the n:1 relationship of bar.foo_id : foo.id.
My goal is to sum the value columns for tables foo and bar, joining on bar.foo_id=foo.id, and finally grouping by goober from foo. Then performing a calculation if possible, though not critical.
Resulting in a final output looking something like:
| goober | foo_value_sum | bar_value_sum | foo_bar_diff |
|--------|---------------|---------------|--------------|
| a1 | 150 | 4 | 146 |
| b2 | 500 | 139 | 361 |
This should be rather simple by the following query that creates two CTEs and then joins them afterwards:
with bar_agg as
(
select foo.goober
,sum(bar.value) bar_value_sum
from foo
join bar
on bar.foo_id = foo.id
group by foo.goober
)
,foo_agg as
(
select foo.goober
,sum(foo.value) foo_value_sum
from foo
group by foo.goober
)
select foo.goober
,foo_value_sum
,bar_value_sum
,foo_value_sum - bar_value_sum foo_bar_diff
from foo_agg foo
left join bar_agg bar
on bar.goober = foo.goober
order by foo.goober

Serial Number in logical order without gaps

I'm trying to generate a serial number based on a few conditions.
My dataset:
+--------+------------+------------+---------+--------+
| Client | Start_Date | End_date | Product | Ser_No |
+--------+------------+------------+---------+--------+
| 44 | 22-01-2018 | 31-12-2018 | A | |
+--------+------------+------------+---------+--------+
| 44 | 24-02-2018 | 01-01-2019 | B | |
+--------+------------+------------+---------+--------+
| 44 | 12-03-2018 | 01-01-2019 | C | |
+--------+------------+------------+---------+--------+
| 100 | 24-01-2018 | 30-11-2018 | A | |
+--------+------------+------------+---------+--------+
| 100 | 26-01-2018 | 15-12-2018 | D | |
+--------+------------+------------+---------+--------+
| 100 | 26-01-2018 | 01-02-2019 | E | |
+--------+------------+------------+---------+--------+
| 100 | 01-03-2018 | 31-01-2019 | F | |
+--------+------------+------------+---------+--------+
What I did to configure my serial number:
RANK() OVER(PARTITION BY Client ORDER BY Client, Start_date ASC)
So now it generates a serial number for my which looks like this:
+--------+------------+------------+---------+--------+
| Client | Start_Date | End_date | Product | Ser_No |
+--------+------------+------------+---------+--------+
| 44 | 22-01-2018 | 31-12-2018 | A | 1 |
+--------+------------+------------+---------+--------+
| 44 | 24-02-2018 | 01-01-2019 | B | 2 |
+--------+------------+------------+---------+--------+
| 44 | 12-03-2018 | 01-01-2019 | C | 3 |
+--------+------------+------------+---------+--------+
| 100 | 24-01-2018 | 30-11-2018 | A | 1 |
+--------+------------+------------+---------+--------+
| 100 | 26-01-2018 | 15-12-2018 | D | 2 |
+--------+------------+------------+---------+--------+
| 100 | 26-01-2018 | 01-02-2019 | E | 2 |
+--------+------------+------------+---------+--------+
| 100 | 01-03-2018 | 31-01-2019 | F | 4 |
+--------+------------+------------+---------+--------+
What goes wrong for my analysis is the last line, it generates the serial number. What it has to be is 3.
Can anayone help me to generate it in this order?
Thanks in advance!
Extra
In addition to my question from yesterday, there is something extra that I need to do. Because the Ser_No has to be the same when my Start_Date is the same, but the Ser_No has also be the same when my folowing records is the same product (also when it has a different Start_Date)
So what I I expect and what I get right now:
+--------+------------+------------+---------+--------+------------+
| Client | Start_Date | End_date | Product | Ser_No | Ser_No New |
+--------+------------+------------+---------+--------+------------+
| 44 | 22-01-2018 | 31-12-2018 | A | 1 | 1 |
+--------+------------+------------+---------+--------+------------+
| 44 | 24-02-2018 | 01-01-2019 | B | 2 | 2 |
+--------+------------+------------+---------+--------+------------+
| 44 | 12-03-2018 | 01-01-2019 | C | 2 | 2 |
+--------+------------+------------+---------+--------+------------+
| 100 | 24-01-2018 | 30-11-2018 | A | 1 | 1 |
+--------+------------+------------+---------+--------+------------+
| 100 | 26-01-2018 | 15-12-2018 | D | 2 | 2 |
+--------+------------+------------+---------+--------+------------+
| 100 | 26-01-2018 | 01-02-2019 | E | 2 | 2 |
+--------+------------+------------+---------+--------+------------+
| 100 | 01-03-2018 | 31-01-2019 | F | 3 | 3 |
+--------+------------+------------+---------+--------+------------+
| 100 | 11-04-2018 | 31-03-2019 | F | 4 | 3 |
+--------+------------+------------+---------+--------+------------+
| 100 | 20-04-2018 | 31-01-2019 | G | 5 | 4 |
+--------+------------+------------+---------+--------+------------+
| 100 | 21-04-2018 | 31-01-2019 | A | 6 | 5 |
+--------+------------+------------+---------+--------+------------+
| 100 | 21-04-2018 | 31-01-2019 | B | 6 | 5 |
+--------+------------+------------+---------+--------+------------+
| 100 | 01-05-2018 | 31-01-2019 | B | 7 | 5 |
+--------+------------+------------+---------+--------+------------+
Any idea on how to achieve this, because I won't get it
You need to use DENSE_RANK instead:
This function returns the rank of each row within a result set partition, with no gaps in the ranking values.
DENSE_RANK() OVER(PARTITION BY Client ORDER BY Start_date) AS Ser_no
Additionaly the Client in ORDER BY has no effect because it has the same value per partition.

Sort using auxiliary fields, start and end

In PostgreSQL, what is the best way to sort records using start and end fields in a generic way, without the need to include in the query the first record (where start_id=3)?
Example table:
+-------+----------+--------+--------+
| FK_ID | START_ID | END_ID | STRING |
+-------+----------+--------+--------+
| 77 | 1 | 9 | E |
| 82 | 5 | 2 | A |
| 77 | 7 | 1 | I |
| 77 | 3 | 7 | W |
| 82 | 9 | 5 | Q |
| 77 | 9 | 5 | X |
| 82 | 2 | 7 | G |
+-------+----------+--------+--------+
Sorted where FK_ID = 77:
+----+---+---+---+
| 77 | 3 | 7 | W |
| 77 | 7 | 1 | I |
| 77 | 1 | 9 | E |
| 77 | 9 | 5 | X |
+----+---+---+---+
Sorted where FK_ID = 82:
+----+---+---+---+
| 82 | 9 | 5 | Q |
| 82 | 5 | 2 | A |
| 82 | 2 | 7 | G |
+----+---+---+---+
Result query sequence:
+-------+----------+
| FK_ID | SEQUENCE |
+-------+----------+
| 82 | QAG |
| 77 | WIEX |
+-------+----------+
I do not think this is the most efficient way but you can try with a recursive CTE
WITH RECURSIVE path AS (
SELECT * FROM myTable AS t1 WHERE NOT EXISTS(
SELECT 1 FROM myTable AS t2 WHERE t1.fk_id = t2.fk_id AND t2.end_id = t1.start_id
) ORDER BY start_id LIMIT 1
UNION ALL
SELECT myTable.* FROM myTable JOIN path ON path.end_id = myTable.start_id
)
SELECT fk_id,array_to_string(array_agg(string)) FROM path GROUP BY fk_id

left join 2 tables not working

I have 2 tables:
Table1: 'op_ats'
| ID1 | numero |id_cofre | id_chave | estadoAT
| 1 | 111 | 1 | 3 | 1
| 2 | 222 | 3 | 3 | 2
| 3 | 333 | 1 | 4 | 2
| 4 | 444 | 1 | 2 | 3
Table_2: 'op_ats_cofres_chaves'
| ID2 | num_chave |
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
| 5 | E |
I have this SQL:
SELECT chaves.*, ats.numero numAT, ats.estadoAT
FROM op_ats_cofres_chaves chaves
LEFT JOIN op_ats ats ON ats.id_chave_cofre = chaves.id AND ats.id_cofre = 1
With this I get the following result:
| ID2 | num_chave | numAT | estadoAT |
| 1 | A | 444 | 3 |
| 2 | B | NULL | NULL |
| 3 | C | 111 | 1 |
| 4 | D | 333 | 2 |
| 5 | E | NULL | NULL |
Now the problem is that I want to filter the rows that are in Table1 but only that have the column 'estadoAT' with values 1 and 2. I've tried to add the line
WHERE op_ats.estadoAT = 1 OR op_ats.estadoAT = 2
But this makes the following result:
| ID2 | num_chave | numAT | estadoAT |
| 1 | A | 444 | 3 |
| 3 | C | 111 | 1 |
| 4 | D | 333 | 2 |
Resuming...
My intention is to get ALL rows in the Table2 and join the Table1 rows that have the 'id_cofre = 1' and '(estadoAT = 1 OR estadoAT = 2)'.
Any help is appreciated.
You have to move condition to JOIN clause instead of WHERE.
SELECT chaves.*, ats.numero numAT, ats.estadoAT
FROM op_ats_cofres_chaves chaves
LEFT JOIN op_ats ats ON ats.id_chave_cofre = chaves.id AND ats.id_cofre = 1
AND op_ats.estadoAT = 1 OR op_ats.estadoAT = 2;