RECURSIVE query for Postgres on a single table

I want to write a recursive query on a single table in Postgres, based on a parent/child relationship.
Here is the demo table employee with its data:
id parentid managerid status
------------------------------------
3741 [null] 1709 7
3742 3741 1709 12
3749 3742 1709 12
3900 3749 1709 4
1) If status = 12, the result should be the rows that have status = 12 plus all the parents of those nodes.
The expected result is:
id parentid managerid status
--------------------------------------
3741 [null] 1709 7
3742 3741 1709 12
3749 3742 1709 12
For that, the query below works fine and gives the correct result, even if I change the status value.
WITH RECURSIVE nodes AS (
SELECT s1.id, case when s1.parentid=s1.id then null else s1.parentid end parentid,s1.managerid, s1.status
FROM employees s1 WHERE id IN
(SELECT employees.id FROM employees WHERE
"employees"."status" = 12 AND "employees"."managerid" = 1709)
UNION ALL
SELECT s2.id, case when s2.parentid=s2.id then null else s2.parentid end parentid,s2.managerid, s2.status
FROM employees s2 JOIN nodes ON s2.id = nodes.parentid
)
SELECT distinct nodes.id, nodes.parentid, nodes.managerid, nodes.status
FROM nodes ORDER BY nodes.id ASC NULLS FIRST;
2) If status != 12, the result should be only the parents of that particular node.
The expected result is:
id parentid managerid status
--------------------------------------
3741 [null] 1709 7
I want a query for the case where status is not equal to some value.

WITH RECURSIVE cte AS (
SELECT * FROM tablename
WHERE status != 12
UNION
SELECT t.*
FROM tablename t INNER JOIN cte c
ON c.parentid = t.id
)
SELECT DISTINCT * FROM cte;

This is a very simple solution, but I think it should work for smaller sets of data. Note that it only picks up the direct parents of the matching rows, not ancestors further up:
SELECT * FROM employee
WHERE
status=12
OR id IN (
SELECT DISTINCT parentId FROM employee WHERE status=12
)

With this recursive CTE:
with recursive cte as (
select * from tablename
where status = 12
union all
select t.*
from tablename t inner join cte c
on c.parentid = t.id
)
select distinct * from cte;
Results:
| id | parentid | managerid | status |
| ---- | -------- | --------- | ------ |
| 3741 | | 1709 | 7 |
| 3742 | 3741 | 1709 | 12 |
| 3749 | 3742 | 1709 | 12 |
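For anyone who wants to check this without a Postgres instance, here is a minimal sketch that runs the same CTE against an in-memory SQLite database from Python. SQLite is only a stand-in here (an assumption, not what the question uses), and the helper name ancestors_with_status is made up for the example.

```python
import sqlite3

def ancestors_with_status(status):
    """Return the rows with the given status plus all of their ancestors."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tablename "
        "(id INTEGER, parentid INTEGER, managerid INTEGER, status INTEGER)"
    )
    # Sample data from the question.
    conn.executemany(
        "INSERT INTO tablename VALUES (?, ?, ?, ?)",
        [(3741, None, 1709, 7), (3742, 3741, 1709, 12),
         (3749, 3742, 1709, 12), (3900, 3749, 1709, 4)],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE cte AS (
            SELECT * FROM tablename WHERE status = ?   -- seed: matching rows
            UNION ALL
            SELECT t.*                                 -- step: walk up to the parent
            FROM tablename t JOIN cte c ON c.parentid = t.id
        )
        SELECT DISTINCT * FROM cte ORDER BY id
        """,
        (status,),
    ).fetchall()
    conn.close()
    return rows

for row in ancestors_with_status(12):
    print(row)
```

The DISTINCT in the final SELECT matters because 3741 and 3742 are reached along more than one path when UNION ALL is used.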

WITH RECURSIVE CTE AS
(
SELECT *
FROM tablename
WHERE status = 12
UNION
SELECT t.*
FROM tablename t
INNER JOIN cte c ON c.Id = t.parentid
)
SELECT t.*
FROM tablename t
LEFT JOIN cte c on t.id=c.id
WHERE c.id IS NULL
ORDER BY id ASC NULLS FIRST;
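The query above reads as: collect every status = 12 row together with all of its descendants, then return whatever is left. A rough way to try it on the question's data is the sketch below, which uses Python's sqlite3 as a stand-in for Postgres (an assumption); the helper name rows_outside_subtree is hypothetical.

```python
import sqlite3

def rows_outside_subtree(status):
    """Rows that neither have the given status nor descend from such a row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tablename "
        "(id INTEGER, parentid INTEGER, managerid INTEGER, status INTEGER)"
    )
    conn.executemany(
        "INSERT INTO tablename VALUES (?, ?, ?, ?)",
        [(3741, None, 1709, 7), (3742, 3741, 1709, 12),
         (3749, 3742, 1709, 12), (3900, 3749, 1709, 4)],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE cte AS (
            SELECT * FROM tablename WHERE status = ?
            UNION                                   -- UNION drops repeated rows
            SELECT t.*
            FROM tablename t JOIN cte c ON c.id = t.parentid   -- walk down
        )
        SELECT t.*
        FROM tablename t
        LEFT JOIN cte c ON t.id = c.id
        WHERE c.id IS NULL                          -- anti-join: not in the subtree
        ORDER BY t.id
        """,
        (status,),
    ).fetchall()
    conn.close()
    return rows

print(rows_outside_subtree(12))
```

On the sample data only 3741 survives, which is the expected result for case 2.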

Related

how can I get all ids starting from a given id recursively in a postgresql table that references itself?

The title may not be very clear, so let's consider this example (this is not my code, I'm just using it to model my request).
I have a table that references itself (like a filesystem):
id | parent | name
----+----------+-------
1 | null | /
2 | 1 | home
3 | 2 | user
4 | 3 | bin
5 | 1 | usr
6 | 5 | local
Is it possible to write an SQL query so that if I choose:
1, I get a table containing 2, 3, 4, 5, 6 (because this is the root), matching:
/home
/home/user
/home/user/bin
/usr
etc...
2, I get a table containing 3, 4, matching:
/home/user
/home/user/bin
and so on
Use a recursive common table expression. Always start from the root, and carry an array of ancestor ids so that the WHERE clause can pick out the paths under a given id.
For id = 1:
with recursive cte(id, parent, name, ids) as (
select id, parent, name, array[id]
from my_table
where parent is null
union all
select t.id, t.parent, concat(c.name, t.name, '/'), ids || t.id
from cte c
join my_table t on c.id = t.parent
)
select id, name
from cte
where 1 = any(ids) and id <> 1
id | name
----+-----------------------
2 | /home/
5 | /usr/
6 | /usr/local/
3 | /home/user/
4 | /home/user/bin/
(5 rows)
For id = 2:
with recursive cte(id, parent, name, ids) as (
select id, parent, name, array[id]
from my_table
where parent is null
union all
select t.id, t.parent, concat(c.name, t.name, '/'), ids || t.id
from cte c
join my_table t on c.id = t.parent
)
select id, name
from cte
where 2 = any(ids) and id <> 2
id | name
----+-----------------------
3 | /home/user/
4 | /home/user/bin/
(2 rows)
Bidirectional query
The question is really interesting. The above query works well but is inefficient, because it traverses every tree node even when we're asking about a leaf. A more powerful solution is a bidirectional recursive query: the inner query walks from the given node up to the root, while the outer one walks from the node down to the leaves.
with recursive outer_query(id, parent, name) as (
with recursive inner_query(qid, id, parent, name) as (
select id, id, parent, name
from my_table
where id = 2 -- parameter
union all
select qid, t.id, t.parent, concat(t.name, '/', q.name)
from inner_query q
join my_table t on q.parent = t.id
)
select qid, null::int, right(name, -1)
from inner_query
where parent is null
union all
select t.id, t.parent, concat(q.name, '/', t.name)
from outer_query q
join my_table t on q.id = t.parent
)
select id, name
from outer_query
where id <> 2; -- parameter
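If only the ids are needed and not the full paths, a simpler variant (not the bidirectional query above, just a sketch) is to start the recursion at the children of the chosen node and walk down. The example below runs against SQLite through Python's sqlite3 as a stand-in for Postgres; the helper name descendant_ids is made up.

```python
import sqlite3

def descendant_ids(start_id):
    """Return the ids of all descendants of start_id, excluding start_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?)",
        [(1, None, "/"), (2, 1, "home"), (3, 2, "user"),
         (4, 3, "bin"), (5, 1, "usr"), (6, 5, "local")],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id) AS (
            SELECT id FROM my_table WHERE parent = ?   -- direct children
            UNION ALL
            SELECT t.id                                -- children of rows found so far
            FROM my_table t JOIN sub s ON t.parent = s.id
        )
        SELECT id FROM sub ORDER BY id
        """,
        (start_id,),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]

print(descendant_ids(1))  # [2, 3, 4, 5, 6]
print(descendant_ids(2))  # [3, 4]
```

This only touches the subtree under the chosen node, but it gives up the path strings that the answers above build.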

PostgreSQL UNION doesn't merge lines properly

I have 3 tables in a PostgreSQL database:
localities (loc, 12561 rows)
plants (pl, 17052 rows)
specimens or samples (esp, 9211 rows)
pl and esp each have a field loc, to specify where that tagged plant lives, or where that sample (usually a branch with leaves and flowers) came from.
I need a report of the places that have plants or samples, and the number of plants and samples in each place. The best I did up to now is the union of two subqueries, that runs very fast (33 ms to fetch 69 rows):
(select l.id,l.nome,count(pl.id) pls,null esps
from loc l
left join pl on pl.loc = l.id
where l.id in
(select distinct pl.loc
from pl
where pl.loc > 0)
group by l.id,l.nome
union
select l.id,l.nome,null pls,count(e.id) esps
from loc l
left join esp e on e.loc = l.id
where l.id in
(select distinct e.loc
from esp e
where e.loc > 0)
group by l.id,l.nome)
order by id
The point is, when the same place has both plants and samples, it becomes two distinct lines, like:
11950 | San Martin | | 5 |
11950 | San Martin | 61 | |
Of course what I want is:
11950 | San Martin | 61 | 5 |
Before that, I have tried doing all in one query:
select l.id,l.nome,count(pl.id),count(e.id) esps
from loc l
left join pl on pl.loc = l.id
left join esp e on e.loc = l.id
where l.id in
(select distinct pl.loc
from pl
where pl.loc > 0)
or l.id in
(select distinct e.loc
from esp e
where e.loc > 0)
group by l.id,l.nome
but it returns a strange repetition (it multiplies the two counts, 61 × 5 = 305, and shows the product in both columns):
11950 | San Martin | 305 | 305 |
I have tried without subqueries, but it was taking about 13 seconds, which is too long.
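The 305 in the single-query attempt is a join fan-out: each of the 61 plant rows is paired with each of the 5 sample rows for that place before the counts are taken. The sketch below reproduces it on an in-memory SQLite database (a stand-in for Postgres, with made-up row ids) and shows count(DISTINCT ...) as one possible way around it, assuming the ids are unique within each table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE loc (id INTEGER, nome TEXT);
    CREATE TABLE pl  (id INTEGER, loc INTEGER);
    CREATE TABLE esp (id INTEGER, loc INTEGER);
    INSERT INTO loc VALUES (11950, 'San Martin');
""")
# 61 plants and 5 samples, all at locality 11950 (ids are invented).
conn.executemany("INSERT INTO pl VALUES (?, 11950)", [(i,) for i in range(1, 62)])
conn.executemany("INSERT INTO esp VALUES (?, 11950)", [(i,) for i in range(1, 6)])

JOINS = """
    FROM loc l
    LEFT JOIN pl ON pl.loc = l.id
    LEFT JOIN esp e ON e.loc = l.id
    GROUP BY l.id, l.nome
"""

# Plain counts see the 61 x 5 joined rows.
naive = conn.execute(
    f"SELECT l.id, l.nome, count(pl.id), count(e.id) {JOINS}"
).fetchone()

# Counting distinct ids undoes the multiplication.
distinct = conn.execute(
    f"SELECT l.id, l.nome, count(DISTINCT pl.id), count(DISTINCT e.id) {JOINS}"
).fetchone()

print(naive)     # (11950, 'San Martin', 305, 305)
print(distinct)  # (11950, 'San Martin', 61, 5)
```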
I created a test layout with:
create table localities (id integer, loc_name text);
create table plants (plant_id integer, loc_id integer);
create table samples (sample_id integer, loc_id integer);
insert into localities select x, ('Loc ' || x::text) from generate_series(1, 12561) x ;
insert into plants select x, (random()*12561)::integer from generate_series(1, 17052) x;
insert into samples select x, (random()*12561)::integer from generate_series(1, 9211) x;
The trick is to create an intermediate table from plants and samples but with same structure. Where data doesn't make sense (plant has no sample_id), you add null:
select loc_id, plant_id, null as sample_id from plants
union all
select loc_id, null as plant_id, sample_id from samples
This table has unified structure and you can then aggregate on it (I'm using WITH to make it a bit more readable.):
with localities_used as (
select loc_id, plant_id, null as sample_id from plants
union all
select loc_id, null as plant_id, sample_id from samples)
select
localities_used.loc_id,
count(localities_used.plant_id) plant_count,
count(localities_used.sample_id) sample_count
from
localities_used
group by
localities_used.loc_id;
If you need additional data from localities, you can join them on the aggregated table:
with localities_used as (
select loc_id, plant_id, null as sample_id from plants
union all
select loc_id, null as plant_id, sample_id from samples),
aggregated as (
select
localities_used.loc_id,
count(localities_used.plant_id) plant_count,
count(localities_used.sample_id) sample_count
from
localities_used
group by
localities_used.loc_id)
select * from aggregated left outer join localities on aggregated.loc_id = localities.id;
This takes 75ms on my laptop all together.
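To see the UNION ALL plus aggregate idea on data small enough to check by hand, here is a minimal sketch using Python's sqlite3 as a stand-in for Postgres. The table and column names follow the test layout above; the three localities and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE localities (id INTEGER, loc_name TEXT);
    CREATE TABLE plants  (plant_id INTEGER, loc_id INTEGER);
    CREATE TABLE samples (sample_id INTEGER, loc_id INTEGER);
    INSERT INTO localities VALUES (1, 'Loc 1'), (2, 'Loc 2'), (3, 'Loc 3');
    INSERT INTO plants  VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO samples VALUES (1, 1), (2, 3);
""")

counts = conn.execute("""
    WITH localities_used AS (
        -- one row per plant or sample, padded with NULL for the other id
        SELECT loc_id, plant_id, NULL AS sample_id FROM plants
        UNION ALL
        SELECT loc_id, NULL AS plant_id, sample_id FROM samples
    )
    SELECT l.id, l.loc_name,
           count(u.plant_id)  AS plant_count,   -- count() skips NULLs
           count(u.sample_id) AS sample_count
    FROM localities_used u
    JOIN localities l ON l.id = u.loc_id
    GROUP BY l.id, l.loc_name
    ORDER BY l.id
""").fetchall()

for row in counts:
    print(row)
```

Because plants and samples are stacked rather than joined to each other, a locality with both gets one output row and the counts are not multiplied.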
This should be as easy as
select * from (
select
location.*,
(select count(id) from plant where plant.location = location.id) as plants,
(select count(id) from sample where sample.location = location.id) as samples
from location
) subquery
where subquery.plants > 0 or subquery.samples > 0;
id | name | plants | samples
----+------------+--------+---------
1 | San Martin | 2 | 1
2 | Rome | 1 | 2
3 | Dallas | 3 | 1
(3 rows)
This is the database I quickly set up to experiment with:
create table location(id serial primary key, name text);
create table plant(id serial primary key, name text, location integer references location(id));
create table sample(id serial primary key, name text, location integer references location(id));
insert into location (name) values ('San Martin'), ('Rome'), ('Dallas'), ('Ghost Town');
insert into plant (name, location) values ('San Martin Dandelion', 1),('San Martin Camomile', 1), ('Rome Raspberry', 2), ('Dallas Locoweed', 3), ('Dallas Lemongrass', 3), ('Dallas Setaria', 3);
insert into sample (name, location) values ('San Martin Bramble', 1), ('Rome Iris', 2), ('Rome Eucalypt', 2), ('Dallas Dogbane', 3);
tests=# select * from location;
id | name
----+------------
1 | San Martin
2 | Rome
3 | Dallas
4 | Ghost Town
(4 rows)
tests=# select * from plant;
id | name | location
----+----------------------+----------
1 | San Martin Dandelion | 1
2 | San Martin Camomile | 1
3 | Rome Raspberry | 2
4 | Dallas Locoweed | 3
5 | Dallas Lemongrass | 3
6 | Dallas Setaria | 3
(6 rows)
tests=# select * from sample;
id | name | location
----+--------------------+----------
1 | San Martin Bramble | 1
2 | Rome Iris | 2
3 | Rome Eucalypt | 2
4 | Dallas Dogbane | 3
(4 rows)
I didn't test this, but I think it could be something like this:
SELECT
l.id,
l.nome,
COUNT(DISTINCT pl.id) AS plants_count,
COUNT(DISTINCT e.id) AS esp_count
FROM loc l
LEFT JOIN pl ON pl.loc = l.id
LEFT JOIN esp e ON e.loc = l.id
GROUP BY l.id,l.nome
The point is to count the distinct non-null ids of each type; without DISTINCT, the two joins would count each plant once per sample at the same place.

SQL Selecting Maximum Based on Minor-Major scheme

I am trying to create a query that will select a DISTINCT line, select using a revision Minor / Major scheme. Below is an example table:
Serial Number | RevMajor | RevMinor
-----------------------------------
AQ155 | 1 | 1
AQ155 | 1 | 2
AQ155 | 1 | 1
AQ155 | 1 | 7
AQ155 | 2 | 1 <---------
JR2709 | 1 | 7
JR2709 | 2 | 2 <---------
How can I write a query in T-SQL 2008 that will select only the two highlighted lines, the "Newest Revision"?
Thanks in advance!
You could use row_number() and keep the top-ranked row for each serial number:
select * from (
select *, row_number() over (partition by [Serial Number] order by RevMajor desc, RevMinor desc) VersionRank
from table1
) T
where VersionRank = 1
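A quick way to convince yourself that the ranking approach picks the right rows is to run the same pattern on the sample data. The sketch below uses Python's sqlite3 rather than SQL Server (an assumption; window functions need SQLite 3.25 or newer), and the table and column names are simplified stand-ins for the ones in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revisions (serial TEXT, rev_major INTEGER, rev_minor INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO revisions VALUES (?, ?, ?)",
    [("AQ155", 1, 1), ("AQ155", 1, 2), ("AQ155", 1, 1), ("AQ155", 1, 7),
     ("AQ155", 2, 1), ("JR2709", 1, 7), ("JR2709", 2, 2)],
)

newest = conn.execute("""
    SELECT serial, rev_major, rev_minor
    FROM (
        SELECT *,
               row_number() OVER (
                   PARTITION BY serial                       -- rank within each serial
                   ORDER BY rev_major DESC, rev_minor DESC   -- newest revision first
               ) AS version_rank
        FROM revisions
    ) AS t
    WHERE version_rank = 1
    ORDER BY serial
""").fetchall()

print(newest)  # [('AQ155', 2, 1), ('JR2709', 2, 2)]
```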
select [serial number], revmajor, revminor
from table1
where revMajor = (select max(revmajor) from table1)
Another way to do this could be:
select [serial number], a.revmajor, revminor
from table1 a
inner join ( select max(revMajor) as revmajor from table1 ) b on a.revmajor = b.revmajor
Another way if you know there are only 2 rows:
select top 2 [serial number], revmajor, revminor
from table1 a
order by revmajor desc, revminor desc

sql join if value exists in other table then Count it

I have the following tables.
Table A
UserID | key
1 | A
2 | B
3 | A
4 | C
5 |
Table B
UserID | Num
1 | 50
1 | 300
2 |
3 | 100
4 | 20
I have a query like this:
SELECT COUNT(key) AS cnt, key
FROM A
WHERE key <> ''
GROUP BY key
ORDER BY cnt DESC
The results should be something like this:
key | cnt
A | 2
B | 1
C | 1
What I would like to add is a join to Table B.
If a UserID has a value in Num in Table B, I would like to count the UserIDs that have a Num, grouped by key.
Here are the desired results:
key | cnt | Has Num?
A | 2 | 2
B | 1 | 0
C | 1 | 1
I tried to write a subquery but I can't attach it to the main query. The subquery is something like this:
SELECT COUNT(DISTINCT UserID) AS num
FROM B
LEFT OUTER JOIN A ON B.UserID = A.UserID
WHERE Num <>'' AND key <> ''
GROUP BY key
If I'm understanding this correctly, what you're looking for is a count of the Keys in Table A when they were used by a UserID, and then a count of the number of unique UserIDs in Table B who both appeared in the first Table A query and had a Num.
Try this:
SELECT a.[Key], COUNT(a.[Key]) AS cnt, isNull(SUM(b.bCnt), 0) AS [Has Num?]
FROM #TableA a
LEFT OUTER JOIN (
SELECT b.UserID, 1
FROM #TableB b
WHERE LEN(b.Num) > 0
GROUP BY b.UserID
) b (UserID, bCnt) ON b.UserID = a.UserID
WHERE LEN(a.[Key]) > 0
GROUP BY a.[Key]
This query gives the results that you were expecting.
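The key step in that query is reducing Table B to one row per UserID that has a Num before joining, so a user with several Num rows is counted once. Here is a small sketch of the same idea on SQLite via Python's sqlite3 (a stand-in for SQL Server). The lowercase table and column names are invented for the example, and Num is assumed to be a nullable integer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (user_id INTEGER, key_val TEXT);
    CREATE TABLE b (user_id INTEGER, num INTEGER);
    INSERT INTO a VALUES (1, 'A'), (2, 'B'), (3, 'A'), (4, 'C'), (5, '');
    INSERT INTO b VALUES (1, 50), (1, 300), (2, NULL), (3, 100), (4, 20);
""")

result = conn.execute("""
    SELECT a.key_val,
           count(*)          AS cnt,
           count(n.user_id)  AS has_num    -- NULL when the user has no Num
    FROM a
    LEFT JOIN (
        SELECT DISTINCT user_id            -- one row per user with a Num
        FROM b
        WHERE num IS NOT NULL
    ) AS n ON n.user_id = a.user_id
    WHERE a.key_val <> ''
    GROUP BY a.key_val
    ORDER BY a.key_val
""").fetchall()

for row in result:
    print(row)
```

User 1 has two Num rows but contributes only 1 to the A group, which is why A comes out as 2 rather than 3.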
CREATE TABLE #TableA (UserID INT, [Key] CHAR(1))
INSERT INTO #TableA VALUES(1,'A'),(2,'B'),(3,'A'),(4,'C'),(5,'')
CREATE TABLE #TableB (UserID INT, Num INT NULL)
INSERT INTO #TableB VALUES(1,50),(1,300),(2,NULL),(3,100),(4,20)
SELECT x.[Key],x.Cnt,y.[Has Num?]
FROM
( SELECT [Key],Cnt = COUNT([Key])
FROM #TableA
WHERE LEN([Key])>0
GROUP BY [Key]
)X
JOIN
(
SELECT a.[Key],[Has Num?] = COUNT(b.Num)
FROM #TableA a
JOIN #TableB b ON a.UserID = b.UserID
GROUP BY a.[Key]
)Y
ON x.[Key] = Y.[Key]
Key Cnt Has Num?
A 2 3
B 1 0
C 1 1
How about an OUTER APPLY
SELECT [Key], COUNT(a.[Key]) AS cnt, SUM(x.NumCount) AS [Has Num?]
FROM #TableA a
OUTER APPLY (SELECT COUNT(NUM) AS NumCount
FROM #TableB b
WHERE b.UserId = a.UserId AND Num IS NOT NULL
) x
WHERE [Key] <> ''
GROUP BY [Key]
ORDER BY cnt DESC
Result:
Key cnt Has Num?
---- ----------- -----------
A 2 3
B 1 0
C 1 1

SQL query for insert into with a set of constants

It seems like there should be a query for this, but I can't think of how to do it.
I've got a table with a composite primary key consisting of two fields I'd like to populate with data,
I can do an insert into from one table to fill up half the keys, but I want to fill up the other half with a set of constants (0, 3, 5, 6, 9) etc...
so the end result would look like this
+--------------+
|AwesomeTable |
+--------------+
| Id1 | Id2 |
| 1 | 0 |
| 1 | 3 |
| 1 | 5 |
| 1 | 6 |
| 1 | 9 |
| 2 | 0 |
| 2 | 3 |
| ... | ... |
+--------------+
I've got as far as insert into awesometable (id1, id2) select id1, [need something here] from table1 [need something else here]
"I've got a table with 2 primary keys"
No, you don't. A table can only have one primary key. You probably mean a composite primary key.
I believe you want this:
INSERT
INTO awesometable (id1, id2)
SELECT t1.id1, q.id2
FROM table1 t1
CROSS JOIN
(
SELECT 0 AS id2
UNION ALL
SELECT 3
UNION ALL
SELECT 5
UNION ALL
SELECT 6
UNION ALL
SELECT 9
) q
Or, in Oracle:
INSERT
INTO awesometable (id1, id2)
SELECT t1.id1, q.id2
FROM table1 t1
CROSS JOIN
(
SELECT 0 AS id2
FROM dual
UNION ALL
SELECT 3
FROM dual
UNION ALL
SELECT 5
FROM dual
UNION ALL
SELECT 6
FROM dual
UNION ALL
SELECT 9
FROM dual
) q
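The CROSS JOIN pairs every id1 from table1 with every constant, so the insert produces (number of rows in table1) × 5 rows. A minimal sketch of that on an in-memory SQLite database, using Python's sqlite3 as a stand-in for the real engine and assuming table1 holds just the ids 1 and 2:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id1 INTEGER PRIMARY KEY);
    CREATE TABLE awesometable (
        id1 INTEGER,
        id2 INTEGER,
        PRIMARY KEY (id1, id2)        -- composite primary key
    );
    INSERT INTO table1 VALUES (1), (2);

    INSERT INTO awesometable (id1, id2)
    SELECT t1.id1, q.id2
    FROM table1 t1
    CROSS JOIN (
        SELECT 0 AS id2
        UNION ALL SELECT 3
        UNION ALL SELECT 5
        UNION ALL SELECT 6
        UNION ALL SELECT 9
    ) q;
""")

rows = conn.execute(
    "SELECT id1, id2 FROM awesometable ORDER BY id1, id2"
).fetchall()
print(rows)
```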
If I understand correctly, maybe you can use something like this:
insert into awesometable (id1, id2)
select id1, (select top 1 id2 from table2 where /*a condition here to retrieve only one result*/)
from table1