Postgresql joining one column with 2 not null columns


I have two tables that I want to match by ID.
EDIT: Table1:
| id | other columns A |
| 23 | ... |
| 27 | ... |
| 9 | ... |
| 50 | ... |
Table2:
| id_new | id_old | other columns B |
| 23 | 7 | ... |
| 27 | 8 | ... |
| 33 | 9 | ... |
The problem is that the second table contains two ID columns, the first with the new ID and the second with the old one, and either of them can match the ID from the first table.
EDIT: Some rows in Table1 have an ID that matches neither id_new nor id_old, but I want to keep them in the new table.
This is my desired result:
| id | id_new | id_old | other columns A + B |
| 23 | 23 | 7 | A + B |
| 27 | 27 | 8 | A + B |
| 9 | 33 | 9 | A + B |
| 50 | -- | -- | A |
I tried this one but it's a huge dataset and my query takes a long time to execute.
create table spoj2
as
select *
from table1
left join table2 on table1.id = table2.id_new
or table1.id = table2.id_old

Is this what you need?
select table1.id, COALESCE(T2A.id_new, T2B.id_new) as id_new
, COALESCE(T2A.id_old, T2B.id_old) as id_old
FROM table1
LEFT JOIN table2 as T2A ON table1.id = T2A.id_new
LEFT JOIN table2 as T2B ON table1.id = T2B.id_old

WITH t1(id,o_t1) AS ( VALUES
(23,'...'),
(27,'...'),
(9,'...')
), t2(id_new,id_old,o_t2) AS ( VALUES
(23,7,'...'),
(27,8,'...'),
(33,9,'...')
)
SELECT t1.id,t2.id_new,t2.id_old,t1.o_t1,t2.o_t2 FROM t1
INNER JOIN t2 ON t2.id_new = t1.id
UNION ALL
SELECT t1.id,t2.id_new,t2.id_old,t1.o_t1,t2.o_t2 FROM t1
INNER JOIN t2 ON t2.id_old = t1.id;
Result:
id | id_new | id_old | o_t1 | o_t2
----+--------+--------+------+------
23 | 23 | 7 | ... | ...
27 | 27 | 8 | ... | ...
9 | 33 | 9 | ... | ...
(3 rows)
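
The UNION ALL query above returns only matched rows, so id 50 from the question's desired result is missing. Here is a minimal sketch of one way to keep unmatched rows while still avoiding the OR join condition: add a third branch that anti-joins table1 against both ID columns. It runs the SQL through Python's sqlite3 module purely so that it is self-contained. The column names a and b and the sample values are placeholders, not from the original tables, and although the same SELECT should be valid in PostgreSQL, its performance on a large dataset is untested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, a TEXT);
    CREATE TABLE table2 (id_new INTEGER, id_old INTEGER, b TEXT);
    -- Placeholder data mirroring the question, including the unmatched id 50.
    INSERT INTO table1 VALUES (23, 'A23'), (27, 'A27'), (9, 'A9'), (50, 'A50');
    INSERT INTO table2 VALUES (23, 7, 'B23'), (27, 8, 'B27'), (33, 9, 'B33');
""")

query = """
    -- Branch 1: rows matching the new ID.
    SELECT t1.id, t2.id_new, t2.id_old, t1.a, t2.b
    FROM table1 AS t1
    JOIN table2 AS t2 ON t2.id_new = t1.id
    UNION ALL
    -- Branch 2: rows matching the old ID.
    SELECT t1.id, t2.id_new, t2.id_old, t1.a, t2.b
    FROM table1 AS t1
    JOIN table2 AS t2 ON t2.id_old = t1.id
    UNION ALL
    -- Branch 3: rows matching neither ID, padded with NULLs.
    SELECT t1.id, NULL, NULL, t1.a, NULL
    FROM table1 AS t1
    WHERE NOT EXISTS (SELECT 1 FROM table2 AS t2 WHERE t2.id_new = t1.id)
      AND NOT EXISTS (SELECT 1 FROM table2 AS t2 WHERE t2.id_old = t1.id)
    ORDER BY 1
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a table1.id could match id_new in one table2 row and id_old in another, it would appear twice in the result; whether that is acceptable depends on the data.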

Related

Make sure every distinct value of Column1 has a row with every distinct value of Column2, by populating a table with 0s - postgresql

Here's a crude example I've made up to illustrate what I want to achieve:
table1:
| Shop | Product | QuantityInStock |
| a | Prod1 | 13 |
| a | Prod3 | 13 |
| b | Prod2 | 13 |
| b | Prod3 | 13 |
| b | Prod4 | 13 |
table1 becomes:
| Shop | Product | QuantityInStock |
| a | Prod1 | 13 |
| a | Prod2 | 0 | -- new
| a | Prod3 | 13 |
| a | Prod4 | 0 | -- new
| b | Prod1 | 0 | -- new
| b | Prod2 | 13 |
| b | Prod3 | 13 |
| b | Prod4 | 13 |
In this example, I want to represent every Shop/Product combination: every Shop {a, b} should have a row with every Product {Prod1, Prod2, Prod3, Prod4}.
QuantityInStock=13 has no significance; I just wanted a placeholder number :)

Use a calendar table cross join approach:
SELECT s.Shop, p.Product, COALESCE(t1.QuantityInStock, 0) AS QuantityInStock
FROM (SELECT DISTINCT Shop FROM table1) s
CROSS JOIN (SELECT DISTINCT Product FROM table1) p
LEFT JOIN table1 t1
ON t1.Shop = s.Shop AND
t1.Product = p.Product
ORDER BY
s.Shop,
p.Product;
The idea here is to generate an intermediate table containing all shop/product combinations via a cross join. Then, we left join this to table1. Any shop/product combinations which do not have a match in the actual table are assigned a zero stock quantity.
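
The SELECT above only reports the filled-in combinations. Since the question asks for table1 itself to gain the missing rows, here is a hedged sketch of the same cross join used inside an INSERT that adds only the combinations not already present. It uses Python's sqlite3 module so it can run on its own. The INSERT ... SELECT statement is plain SQL that should also be accepted by PostgreSQL, but that is an assumption not checked here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (Shop TEXT, Product TEXT, QuantityInStock INTEGER);
    INSERT INTO table1 VALUES
        ('a', 'Prod1', 13), ('a', 'Prod3', 13),
        ('b', 'Prod2', 13), ('b', 'Prod3', 13), ('b', 'Prod4', 13);
""")

# Build every Shop/Product pair, then insert a zero-quantity row
# for each pair that does not exist in table1 yet.
conn.execute("""
    INSERT INTO table1 (Shop, Product, QuantityInStock)
    SELECT s.Shop, p.Product, 0
    FROM (SELECT DISTINCT Shop FROM table1) AS s
    CROSS JOIN (SELECT DISTINCT Product FROM table1) AS p
    WHERE NOT EXISTS (
        SELECT 1 FROM table1 AS t
        WHERE t.Shop = s.Shop AND t.Product = p.Product
    )
""")

filled = conn.execute(
    "SELECT Shop, Product, QuantityInStock FROM table1 ORDER BY Shop, Product"
).fetchall()
for row in filled:
    print(row)
```

Running the INSERT a second time adds nothing, because every combination then already exists.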

Find rows in relation with at least n rows in a different table without joins

I have a table as such (tbl):
+----+------+-----+
| pk | attr | val |
+----+------+-----+
| 0 | ohif | 4 |
| 1 | foha | 56 |
| 2 | slns | 2 |
| 3 | faso | 11 |
+----+------+-----+
And another table in an n-to-1 relationship with tbl (tbl2):
+----+-----+
| pk | rel |
+----+-----+
| 0 | 0 |
| 1 | 1 |
| 2 | 0 |
| 3 | 2 |
| 4 | 2 |
| 5 | 3 |
| 6 | 1 |
| 7 | 2 |
+----+-----+
(tbl2.rel -> tbl.pk.)
I would like to select only the rows from tbl which are in relationship with at least n rows from tbl2.
I.e., for n = 2, I want this table:
+----+------+-----+
| pk | attr | val |
+----+------+-----+
| 0 | ohif | 4 |
| 1 | foha | 56 |
| 2 | slns | 2 |
+----+------+-----+
This is the solution I came up with:
SELECT DISTINCT ON (tbl.pk) tbl.*
FROM (
SELECT tbl.pk
FROM tbl
RIGHT OUTER JOIN tbl2 ON tbl2.rel = tbl.pk
GROUP BY tbl.pk
HAVING COUNT(tbl2.*) >= 2 -- n
) AS tbl_candidates
LEFT OUTER JOIN tbl ON tbl_candidates.pk = tbl.pk
Can it be done without selecting the candidates with a subquery and re-joining the table with itself?
I'm on Postgres 10. A standard SQL solution would be better, but a Postgres solution is acceptable.

OK, just join once, as below:
select
t1.pk,
t1.attr,
t1.val
from
tbl t1
join
tbl2 t2 on t1.pk = t2.rel
group by
t1.pk,
t1.attr,
t1.val
having(count(1)>=2) order by t1.pk;
pk | attr | val
----+------+-----
0 | ohif | 4
1 | foha | 56
2 | slns | 2
(3 rows)
Or join once and use a CTE (WITH clause), as below:
with tmp as (
select rel from tbl2 group by rel having(count(1)>=2)
)
select b.* from tmp t join tbl b on t.rel = b.pk order by b.pk;
pk | attr | val
----+------+-----
0 | ohif | 4
1 | foha | 56
2 | slns | 2
(3 rows)
Is the SQL clearer?
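
Both queries above still join tbl to tbl2. As a possible join-free variant, a correlated scalar subquery can count the related rows directly in the WHERE clause. The sketch below runs it with Python's sqlite3 module on the question's data, with n passed as a parameter. The helper name rows_with_at_least is made up for illustration, and how this compares in speed with the join versions on Postgres 10 is not something this sketch measures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl (pk INTEGER PRIMARY KEY, attr TEXT, val INTEGER);
    CREATE TABLE tbl2 (pk INTEGER PRIMARY KEY, rel INTEGER REFERENCES tbl(pk));
    INSERT INTO tbl VALUES (0, 'ohif', 4), (1, 'foha', 56), (2, 'slns', 2), (3, 'faso', 11);
    INSERT INTO tbl2 VALUES (0, 0), (1, 1), (2, 0), (3, 2), (4, 2), (5, 3), (6, 1), (7, 2);
""")

def rows_with_at_least(n):
    """Return the tbl rows referenced by at least n rows of tbl2."""
    return conn.execute(
        """
        SELECT pk, attr, val
        FROM tbl
        WHERE (SELECT COUNT(*) FROM tbl2 WHERE tbl2.rel = tbl.pk) >= ?
        ORDER BY pk
        """,
        (n,),
    ).fetchall()

print(rows_with_at_least(2))
print(rows_with_at_least(3))
```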

How to change the query to keep only leaf nodes

I have a table with the following data:
id | parent_id | short_name
----+-----------+----------------
6 | 5 | cpu
7 | 5 | ram
14 | 9 | tier-a
15 | 9 | rfc1918
16 | 9 | tolerant
17 | 9 | nononymous
13 | 12 | cloudstack
5 | 13 | virtualmachine
8 | 13 | volume
9 | 13 | ipv4
3 | | domain
4 | | account
12 | | vdc
(13 rows)
With a recursive query it looks like this:
with recursive tree ( id, parent_id, short_name, deep_name ) as (
select resource_type_id, parent_resource_type_id, short_name, short_name::text
from resource_type
where parent_resource_type_id is null
union all
select rt.resource_type_id as id, rt.parent_resource_type_id, rt.short_name,
tree.deep_name || '.' || rt.short_name
from tree, resource_type rt
where tree.id = rt.parent_resource_type_id
)
select * from tree;
id | parent_id | short_name | deep_name
----+-----------+----------------+-----------------------------------
4 | | account | account
3 | | domain | domain
12 | | vdc | vdc
13 | 12 | cloudstack | vdc.cloudstack
9 | 13 | ipv4 | vdc.cloudstack.ipv4
5 | 13 | virtualmachine | vdc.cloudstack.virtualmachine
8 | 13 | volume | vdc.cloudstack.volume
6 | 5 | cpu | vdc.cloudstack.virtualmachine.cpu
15 | 9 | rfc1918 | vdc.cloudstack.ipv4.rfc1918
17 | 9 | nononymous | vdc.cloudstack.ipv4.nononymous
16 | 9 | tolerant | vdc.cloudstack.ipv4.tolerant
14 | 9 | tier-a | vdc.cloudstack.ipv4.tier-a
7 | 5 | ram | vdc.cloudstack.virtualmachine.ram
(13 rows)
How can I fix the query so that the result contains only leaves, e.g. the vdc.cloudstack.volume row but not the vdc or vdc.cloudstack rows?
UPD: by leaves I mean rows with no children.

Exclude the rows where deep_name has a superstring somewhere else in the table:
WITH RECURSIVE tree AS (...)
SELECT * FROM tree AS t1
WHERE NOT EXISTS (
SELECT 1 FROM tree AS t2
WHERE t2.deep_name
LIKE t1.deep_name || '.%'
);

Laurenz Albe's answer gave me an idea. I think it would be more efficient to look for children than to work with strings.
My solution is:
WITH RECURSIVE tree AS (...)
SELECT * FROM tree t1
WHERE not EXISTS ( SELECT 1 FROM tree t2 WHERE t1.id = t2.parent_id );

A leaf node is a child which is not itself a parent.
If all you want is a list of leaf nodes, you don't need the recursive CTE; you just need an anti-join in your preferred format.
If (as I imagine you do) you need the deep_name, I would anti-join the result of the recursive CTE to the raw source table on id = parent_id.
WITH RECURSIVE tree AS (...)
SELECT * FROM tree AS t1
WHERE NOT EXISTS (SELECT 1 FROM resource_type AS t2
WHERE t2.parent_resource_type_id = t1.id);
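
To make the anti-join concrete, here is a small runnable sketch using Python's sqlite3 module and the question's 13 rows. It assumes the simplified column names id and parent_id shown in the question's table listing rather than resource_type_id and parent_resource_type_id, and it spells out the CTE that the answers abbreviate as (...).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE resource_type (id INTEGER, parent_id INTEGER, short_name TEXT)")
conn.executemany(
    "INSERT INTO resource_type VALUES (?, ?, ?)",
    [
        (6, 5, "cpu"), (7, 5, "ram"), (14, 9, "tier-a"), (15, 9, "rfc1918"),
        (16, 9, "tolerant"), (17, 9, "nononymous"), (13, 12, "cloudstack"),
        (5, 13, "virtualmachine"), (8, 13, "volume"), (9, 13, "ipv4"),
        (3, None, "domain"), (4, None, "account"), (12, None, "vdc"),
    ],
)

leaves = [
    name
    for (name,) in conn.execute("""
        WITH RECURSIVE tree(id, parent_id, short_name, deep_name) AS (
            -- Roots: rows without a parent.
            SELECT id, parent_id, short_name, short_name
            FROM resource_type
            WHERE parent_id IS NULL
            UNION ALL
            -- Children: extend the dotted path of the parent.
            SELECT rt.id, rt.parent_id, rt.short_name,
                   tree.deep_name || '.' || rt.short_name
            FROM tree
            JOIN resource_type AS rt ON rt.parent_id = tree.id
        )
        SELECT deep_name
        FROM tree AS t
        -- Anti-join: keep a node only if no row names it as parent.
        WHERE NOT EXISTS (
            SELECT 1 FROM resource_type AS c WHERE c.parent_id = t.id
        )
    """)
]
for name in sorted(leaves):
    print(name)
```

Note that the childless root rows account and domain also count as leaves under this definition.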

Valid periods - SQL VIEW

I have 2 tables (actually there are 4, but for now let's say it's 2) with data like this:
Table PersonA
ClientID ID From Till
1 10 1.1.2017 30.4.2017
1 12 1.8.2017 2.1.2018
Table PersonB
ClientID ID From Till
1 6 1.3.2017 30.6.2017
And I need to generate view that would show something like this:
ClientID From Till PersonA PersonB
1 1.1.2017 28.2.2017 10 NULL
1 1.3.2017 30.4.2017 10 6
1 1.5.2017 30.6.2017 NULL 6
1 1.8.2017 02.1.2018 12 NULL
So basically I need to create a view that shows which "persons" each client had in a given period.
When there is an overlap, the client has both PersonA and PersonB (the same should apply to PersonC and PersonD).
In the final view, a client can't have any overlapping dates.
I don't know how to approach this.

In an adaptation of this algorithm, we can already handle the overlaps:
-- Temporary tables holding the sample data
create table #PersonA (ClientID int, ID int, [From] date, Till date);
insert into #PersonA values (1,10,'20170101','20170430'),(1,12,'20170801','20180112');
create table #PersonB (ClientID int, ID int, [From] date, Till date);
insert into #PersonB values (1,6,'20170301','20170630');
create table #PersonC (ClientID int, ID int, [From] date, Till date);
insert into #PersonC values (1,12,'20170401','20170625');
create table #PersonD (ClientID int, ID int, [From] date, Till date);
insert into #PersonD values (1,14,'20170501','20170525'),(1,14,'20170510','20171122');
with X(ClientID,EdgeDate)
as (select ClientID
,case
when toggle = 1
then Till
else [From]
end as EdgeDate
from
(
select ClientID,[From],Till from #PersonA
union all
select ClientID,[From],Till from #PersonB
union all
select ClientID,[From],Till from #PersonC
union all
select ClientID,[From],Till from #PersonD
) as concated
cross join
(
select -1 as toggle
union all
select 1 as toggle
) as toggler
),merged
as (select distinct
S.ClientID
,S.EdgeDate as [From]
,min(E.EdgeDate) as Till
from
X as S
inner join X as E
on S.ClientID = E.ClientID
and S.EdgeDate < E.EdgeDate
group by S.ClientID
,S.EdgeDate
),prds
as (select distinct
merged.ClientID
,merged.[From]
,merged.Till
,A.ID as PersonA
,B.ID as PersonB
,C.ID as PersonC
,D.ID as PersonD
from
merged
left join #PersonA as A
on merged.ClientID = A.ClientID
and A.[From] <= merged.[From]
and merged.Till <= A.Till
left join #PersonB as B
on merged.ClientID = B.ClientID
and B.[From] <= merged.[From]
and merged.Till <= B.Till
left join #PersonC as C
on merged.ClientID = C.ClientID
and C.[From] <= merged.[From]
and merged.Till <= C.Till
left join #PersonD as D
on merged.ClientID = D.ClientID
and D.[From] <= merged.[From]
and merged.Till <= D.Till
where not(A.ID is null
and B.ID is null
and C.ID is null
and D.ID is null
)
)
select ClientID
,[From]
,case
when Till = lead([From]
) over(order by Till)
then dateadd(d,-1,Till)
else Till
end as Till
,PersonA
,PersonB
,PersonC
,PersonD
from
prds
order by ClientID
,[From]
,Till;
Output with just the two Person tables given in the question:
+----------+------------+------------+---------+---------+
| ClientID | From | Till | PersonA | PersonB |
+----------+------------+------------+---------+---------+
| 1 | 2017-01-01 | 2017-02-28 | 10 | NULL |
| 1 | 2017-03-01 | 2017-04-29 | 10 | 6 |
| 1 | 2017-04-30 | 2017-06-30 | NULL | 6 |
| 1 | 2017-08-01 | 2018-01-12 | 12 | NULL |
+----------+------------+------------+---------+---------+
Output of script as it is above, with four Person tables:
+----------+------------+------------+---------+---------+---------+---------+
| ClientID | From | Till | PersonA | PersonB | PersonC | PersonD |
+----------+------------+------------+---------+---------+---------+---------+
| 1 | 2017-01-01 | 2017-02-28 | 10 | NULL | NULL | NULL |
| 1 | 2017-03-01 | 2017-03-31 | 10 | 6 | NULL | NULL |
| 1 | 2017-04-01 | 2017-04-29 | 10 | 6 | 12 | NULL |
| 1 | 2017-04-30 | 2017-04-30 | NULL | 6 | 12 | NULL |
| 1 | 2017-05-01 | 2017-05-09 | NULL | 6 | 12 | 14 |
| 1 | 2017-05-10 | 2017-05-24 | NULL | 6 | 12 | 14 |
| 1 | 2017-05-25 | 2017-06-24 | NULL | 6 | 12 | 14 |
| 1 | 2017-06-25 | 2017-06-29 | NULL | 6 | NULL | 14 |
| 1 | 2017-06-30 | 2017-07-31 | NULL | NULL | NULL | 14 |
| 1 | 2017-08-01 | 2017-11-21 | 12 | NULL | NULL | 14 |
| 1 | 2017-11-22 | 2018-01-12 | 12 | NULL | NULL | NULL |
+----------+------------+------------+---------+---------+---------+---------+
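
The edge-date idea can also be sketched outside SQL. The Python below is not the answer's algorithm but a swapped-in variant: it treats each inclusive Till as the half-open edge Till + 1 day, which avoids adjusting end dates afterwards. It assumes the question's dates are day.month.year (so 2.1.2018 is 2 January 2018) and that periods within one Person table do not overlap for the same client. The function name valid_periods is made up for illustration.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)

# (ClientID, ID, From, Till) rows from the question; Till is inclusive.
person_a = [(1, 10, date(2017, 1, 1), date(2017, 4, 30)),
            (1, 12, date(2017, 8, 1), date(2018, 1, 2))]
person_b = [(1, 6, date(2017, 3, 1), date(2017, 6, 30))]

def valid_periods(tables):
    """Split each client's timeline at every From and Till + 1 day edge."""
    edges = {}
    for rows in tables:
        for client, _pid, start, end in rows:
            edges.setdefault(client, set()).update((start, end + ONE_DAY))

    result = []
    for client in sorted(edges):
        points = sorted(edges[client])
        # Each pair of neighbouring edges is a half-open segment [seg_start, seg_end).
        for seg_start, seg_end in zip(points, points[1:]):
            ids = []
            for rows in tables:
                # First row of this table that fully covers the segment, if any.
                match = next(
                    (pid for c, pid, start, end in rows
                     if c == client and start <= seg_start and seg_end <= end + ONE_DAY),
                    None,
                )
                ids.append(match)
            if any(pid is not None for pid in ids):
                # Convert back to an inclusive Till for output.
                result.append((client, seg_start, seg_end - ONE_DAY, *ids))
    return result

periods = valid_periods([person_a, person_b])
for period in periods:
    print(period)
```

With these assumptions the four periods line up with the view requested in the question, including the 30.4.2017 and 1.5.2017 boundary.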

1st and 7th row in grouping

I have this table named Samples. The Date column values are just symbolic date values.
+----+------------+-------+------+
| Id | Product_Id | Price | Date |
+----+------------+-------+------+
| 1 | 1 | 100 | 1 |
| 2 | 2 | 100 | 2 |
| 3 | 3 | 100 | 3 |
| 4 | 1 | 100 | 4 |
| 5 | 2 | 100 | 5 |
| 6 | 3 | 100 | 6 |
...
+----+------------+-------+------+
I want to group by Product_Id so that I get the 1st sample in descending date order, with a new column holding the Price of the 7th sample row in each product group. If the 7th row does not exist, the value should be null.
Example:
+----+------------+-------+------+----------+
| Id | Product_Id | Price | Date | 7thPrice |
+----+------------+-------+------+----------+
| 4 | 1 | 100 | 4 | 120 |
| 5 | 2 | 100 | 5 | 100 |
| 6 | 3 | 100 | 6 | NULL |
+----+------------+-------+------+----------+
I believe I can achieve the table without the '7thPrice' column with the following:
SELECT * FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples
) T WHERE T.r = 1
Any suggestions?

You can try something like this. I used your query to create a CTE, then left joined the rank 1 rows to the rank 7 rows.
;with sampleCTE
as
(SELECT ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY date DESC) r, * FROM Samples)
select *
from
(select * from samplecte where r = 1) a
left join
(select * from samplecte where r=7) b
on a.product_id = b.product_id
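
As a rough check of the rank 1 to rank 7 join, the sketch below runs an equivalent query with Python's sqlite3 module (window functions need SQLite 3.25 or newer). The sample prices and dates are invented for illustration, since the question elides most of its data: product 1 gets seven samples and product 2 only three, so its 7th price should come back as NULL. The output column is named SeventhPrice here because an identifier starting with a digit would need quoting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Samples (Id INTEGER PRIMARY KEY, Product_Id INTEGER, Price INTEGER, Date INTEGER)"
)
# Hypothetical data: product 1 has 7 samples, product 2 has 3.
samples = [(1, 100 + 10 * d, d) for d in range(1, 8)]
samples += [(2, 199 + d, d) for d in range(1, 4)]
conn.executemany("INSERT INTO Samples (Product_Id, Price, Date) VALUES (?, ?, ?)", samples)

result = conn.execute("""
    WITH ranked AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY Product_Id ORDER BY Date DESC) AS r
        FROM Samples
    )
    SELECT a.Id, a.Product_Id, a.Price, a.Date, b.Price AS SeventhPrice
    FROM ranked AS a
    -- The LEFT JOIN keeps products that have fewer than 7 samples.
    LEFT JOIN ranked AS b
           ON b.Product_Id = a.Product_Id AND b.r = 7
    WHERE a.r = 1
    ORDER BY a.Product_Id
""").fetchall()
for row in result:
    print(row)
```

Selecting explicit columns instead of * avoids getting every column twice from the two sides of the join.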