Sort hierarchical table CTE query - tsql

How can I sort a hierarchical table with a CTE query?
Sample table:
|ID|Name |ParentID|
| 0| |-1 |
| 1|1 |0 |
| 2|2 |0 |
| 3|1-1 |1 |
| 4|1-2 |1 |
| 5|2-1 |2 |
| 6|2-2 |2 |
| 7|2-1-1 |5 |
and my desired result is:
|ID|Name |ParentID|Level
| 0| |-1 |0
| 1|1 |0 |1
| 3|1-1 |1 |2
| 4|1-2 |1 |2
| 2|2 |0 |1
| 5|2-1 |2 |2
| 7|2-1-1 |5 |3
| 6|2-2 |2 |2
Another sample:
|ID|Name |ParentID|
| 0| |-1 |
| 1|Book |0 |
| 2|App |0 |
| 3|C# |1 |
| 4|VB.NET |1 |
| 5|Office |2 |
| 6|PhotoShop |2 |
| 7|Word |5 |
and my desired result is:
|ID|Name |ParentID|Level
| 0| |-1 |0
| 1|Book |0 |1
| 3|C# |1 |2
| 4|VB.NET |1 |2
| 2|App |0 |1
| 5|Office |2 |2
| 7|Word |5 |3
| 6|PhotoShop |2 |2

The hierarchyid datatype is able to represent hierarchical data, and already has the desired sorting order. If you can't replace your ParentID column, then you can convert to it on the fly:
(Most of this script is data setup, the actual answer is quite small)
create table #t (ID int not null,Name varchar(10) not null,ParentID int not null)
insert into #t(ID,Name,ParentID)
select 0,'' ,-1 union all
select 1,'Book' ,0 union all
select 2,'App' ,0 union all
select 3,'C#' ,1 union all
select 4,'VB.NET' ,1 union all
select 5,'Office' ,2 union all
select 6,'PhotoShop' ,2 union all
select 7,'Word' ,5
;With Sensible as (
select ID,Name,NULLIF(ParentID,-1) as ParentID
from #t
), Paths as (
select ID,CONVERT(hierarchyid,'/' + CONVERT(varchar(10),ID) + '/') as Pth
from Sensible where ParentID is null
union all
select s.ID,CONVERT(hierarchyid,p.Pth.ToString() + CONVERT(varchar(10),s.ID) + '/')
from Sensible s inner join Paths p on s.ParentID = p.ID
)
select
*
from
Sensible s
inner join
Paths p
on
s.ID = p.ID
order by p.Pth
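hierarchyid is specific to SQL Server, but the idea behind the query above carries over: build a path per row and sort by it. The following is a minimal sketch of that idea, not the answer's exact method. It assumes SQLite (through Python's sqlite3 module) and uses a zero-padded text path as a stand-in for hierarchyid. The padding width of 10 and the Level column are choices made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID INTEGER, Name TEXT, ParentID INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(0, "", -1), (1, "Book", 0), (2, "App", 0), (3, "C#", 1),
     (4, "VB.NET", 1), (5, "Office", 2), (6, "PhotoShop", 2), (7, "Word", 5)],
)

rows = conn.execute("""
    WITH RECURSIVE Paths(ID, Name, ParentID, Level, Pth) AS (
        -- anchor: the root row, marked by ParentID = -1
        SELECT ID, Name, ParentID, 0, printf('/%010d', ID)
        FROM t
        WHERE ParentID = -1
        UNION ALL
        -- step: append each child's zero-padded ID to its parent's path
        SELECT c.ID, c.Name, c.ParentID, p.Level + 1,
               p.Pth || printf('/%010d', c.ID)
        FROM t c
        JOIN Paths p ON c.ParentID = p.ID
    )
    SELECT ID, Name, ParentID, Level
    FROM Paths
    ORDER BY Pth  -- text order of the padded path is depth-first order
""").fetchall()

for row in rows:
    print(row)
```

Zero-padding matters here: without it, a plain text comparison would place ID 10 before ID 2. hierarchyid avoids the issue because it compares path components numerically.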

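ORDER BY Name works for the first sample, because the names there (1, 1-1, 1-2, ...) already encode each row's position in the tree; for arbitrary names such as those in the second sample it will not give this order: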
WITH CTE
AS(
SELECT parent.*, 0 AS Level
FROM #table parent
WHERE parent.ID = 0
UNION ALL
SELECT parent.*, Level+1
FROM #table parent
INNER JOIN CTE prev ON parent.ParentID = prev.ID
)
SELECT * FROM CTE
ORDER BY Name
Here's your sample data (please include it yourself next time):
create table #table(ID int,Name varchar(10),ParentID int);
insert into #table values(0,'',-1);
insert into #table values(1,'1',0);
insert into #table values(2,'2',0);
insert into #table values(3,'1-1',1);
insert into #table values(4,'1-2',1);
insert into #table values(5,'2-1',2);
insert into #table values(6,'2-2',2);
insert into #table values(7,'2-1-1',5);
Result:
ID Name ParentID Level
0 -1 0
1 1 0 1
3 1-1 1 2
4 1-2 1 2
2 2 0 1
5 2-1 2 2
7 2-1-1 5 3
6 2-2 2 2

Related

MariaDB - Conjunction-Search in Many-to-Many

I am having trouble implementing an "and-concatenated" (conjunctive) search across many-to-many tables. I have tried to give a simple example below. I use MariaDB.
I have a table of processes. Persons and tags can be assigned to a process. There is a table for tags and a table for persons.
There are two many-to-many relationships: tags_to_processes and persons_to_processes.
Example: find all processes with person 1 and person 2 and with tags 1 and 2. Result: process 1.
Example: find all processes with person 1 and person 2 and with tag 2. Result: process 1 and process 2.
Thank you very much!
'processes' Table
+-----------+-------------------+
|process_id |process_name |
+-----------+-------------------+
|1 |Process 1 |
|2 |Process 2 |
|3 |Process 3 |
+-----------+-------------------+
'persons' table
+----------+------------+
|person_id |person_name |
+----------+------------+
|1 |Person 1 |
|2 |Person 2 |
|3 |Person 3 |
|4 |Person 4 |
|5 |Person 5 |
+----------+------------+
'tags' table
+----------+-----------+
|tag_id |tag_name |
+----------+-----------+
|1 |Tag 1 |
|2 |Tag 2 |
|3 |Tag 3 |
|4 |Tag 4 |
|5 |Tag 5 |
|6 |Tag 6 |
+----------+-----------+
'persons_to_processes' table
+----------+-----------+
|person_id |process_id |
+----------+-----------+
|1 |1 |
|2 |1 |
|3 |1 |
|4 |1 |
|5 |1 |
|1 |2 |
|2 |2 |
|4 |3 |
+----------+-----------+
'tags_to_processes' table
+----------+-----------+
|tag_id |process_id |
+----------+-----------+
|1 |1 |
|2 |1 |
|3 |1 |
|6 |1 |
|2 |2 |
|2 |3 |
+----------+-----------+
You can join persons_to_processes to persons, filter the results for the persons that you want, and use aggregation:
SELECT ptp.process_id
FROM persons_to_processes ptp INNER JOIN persons p
ON p.person_id = ptp.person_id
WHERE p.person_name IN ('Person 1', 'Person 2')
GROUP BY ptp.process_id
HAVING COUNT(*) = 2 -- 2 persons
Similarly for the tables tags_to_processes and tags:
SELECT ttp.process_id
FROM tags_to_processes ttp INNER JOIN tags t
ON t.tag_id = ttp.tag_id
WHERE t.tag_name IN ('Tag 1', 'Tag 2')
GROUP BY ttp.process_id
HAVING COUNT(*) = 2 -- 2 tags
Finally, you can combine the 2 queries to get their common results with INTERSECT:
WITH
cte1 AS (
SELECT ptp.process_id
FROM persons_to_processes ptp INNER JOIN persons p
ON p.person_id = ptp.person_id
WHERE p.person_name IN ('Person 1', 'Person 2')
GROUP BY ptp.process_id
HAVING COUNT(*) = 2 -- 2 persons
),
cte2 AS (
SELECT ttp.process_id
FROM tags_to_processes ttp INNER JOIN tags t
ON t.tag_id = ttp.tag_id
WHERE t.tag_name IN ('Tag 1', 'Tag 2')
GROUP BY ttp.process_id
HAVING COUNT(*) = 2 -- 2 tags
)
SELECT process_id FROM cte1
INTERSECT
SELECT process_id FROM cte2;
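As a quick check of the approach, here is a minimal sketch that runs the same GROUP BY / HAVING / INTERSECT pattern against the question's link tables. It assumes SQLite (through Python's sqlite3 module) rather than MariaDB, filters by id instead of joining to the name tables, and uses COUNT(DISTINCT ...) in case a link row is duplicated. The helper find_processes is a hypothetical name introduced for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons_to_processes (person_id INTEGER, process_id INTEGER);
    CREATE TABLE tags_to_processes (tag_id INTEGER, process_id INTEGER);
    INSERT INTO persons_to_processes VALUES
        (1,1),(2,1),(3,1),(4,1),(5,1),(1,2),(2,2),(4,3);
    INSERT INTO tags_to_processes VALUES
        (1,1),(2,1),(3,1),(6,1),(2,2),(2,3);
""")


def find_processes(person_ids, tag_ids):
    """Return ids of processes linked to ALL given persons and ALL given tags."""
    person_marks = ",".join("?" * len(person_ids))
    tag_marks = ",".join("?" * len(tag_ids))
    sql = f"""
        SELECT process_id FROM persons_to_processes
        WHERE person_id IN ({person_marks})
        GROUP BY process_id
        HAVING COUNT(DISTINCT person_id) = ?
        INTERSECT
        SELECT process_id FROM tags_to_processes
        WHERE tag_id IN ({tag_marks})
        GROUP BY process_id
        HAVING COUNT(DISTINCT tag_id) = ?
        ORDER BY process_id
    """
    # The expected counts are derived from the argument lists, so the
    # "= 2" in the queries above does not have to be edited by hand.
    params = [*person_ids, len(person_ids), *tag_ids, len(tag_ids)]
    return [row[0] for row in conn.execute(sql, params)]


print(find_processes([1, 2], [1, 2]))
print(find_processes([1, 2], [2]))
```

The two calls correspond to the two examples in the question.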

POSTGRESQL Recursive LEFT JOIN on Array

I have the following table with the parent_path array:
Id | Account Name | parent_path
1 A {1}
2 B {2,1}
3 C {3,2,1}
4 D {4,3,2,1}
What I'm looking to do is a recursive left join that creates one column per item in the parent_path array:
Id | Account Name | parent_path | parent_name1 | parent_name2 | parent_name3
1 A NULL NULL NULL NULL
2 B {1} A NULL NULL
3 C {2,1} B A NULL
4 D {3,2,1} C B A
Thanks!
This is an abuse of SQL, but here goes:
with get_names as (
select h.id, h.account_name, h.parent_path, array_agg(h2.account_name order by p.rn) as name_path
from hier h
cross join lateral unnest(h.parent_path) with ordinality as p(path_id, rn)
join hier h2 on h2.id = p.path_id
group by h.id, h.account_name, h.parent_path
)
select id, account_name, parent_path,
name_path[2] as parent_name1,
name_path[3] as parent_name2,
name_path[4] as parent_name3,
name_path[5] as parent_name4,
name_path[6] as parent_name5,
name_path[7] as parent_name6,
name_path[8] as parent_name7,
name_path[9] as parent_name8
from get_names;
id | account_name | parent_path | parent_name1 | parent_name2 | parent_name3 | parent_name4 | parent_name5 | parent_name6 | parent_name7 | parent_name8
----+--------------+-------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------+--------------
1 | A | {1} | | | | | | | |
2 | B | {2,1} | A | | | | | | |
3 | C | {3,2,1} | B | A | | | | | |
4 | D | {4,3,2,1} | C | B | A | | | | |
(4 rows)
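A cleaner alternative uses the PostgreSQL intarray extension, whose integer[] - integer operator removes the row's own id from its path (here the table is called hierarchy, with columns id, name and path). It runs one correlated subquery per parent column, so it works best for small(ish) tables: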
CREATE EXTENSION intarray;
SELECT id,
name,
path - id as parents,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[1]) as parent_1,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[2]) as parent_2,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[3]) as parent_3,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[4]) as parent_4,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[5]) as parent_5,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[6]) as parent_6,
(SELECT name FROM hierarchy h2 WHERE h2.id = (h.path - h.id)[7]) as parent_7
FROM hierarchy h
In my code, it produced the following (truncated) output:
+--+-----------+-------+--------+--------+--------+
|id|name |parents|parent_1|parent_2|parent_3|
+--+-----------+-------+--------+--------+--------+
|1 |Europe | |NULL |NULL |NULL |
|2 |Germany |{1} |Europe |NULL |NULL |
|4 |Netherlands|{1} |Europe |NULL |NULL |
|7 |Africa | |NULL |NULL |NULL |
|10|France |{1} |Europe |NULL |NULL |
|12|America | |NULL |NULL |NULL |
|17|Finland |{1} |Europe |NULL |NULL |
|3 |Berlin |{1,2} |Europe |Germany |NULL |
+--+-----------+-------+--------+--------+--------+
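Both answers boil down to the same lookup: drop the row's own id from its path, translate the remaining ids to names, and pad the result to a fixed number of columns. The following is a minimal sketch of that logic in plain Python using the question's data, not a replacement for the SQL. MAX_PARENTS and parent_names are names introduced for this illustration.

```python
# Rows from the question: (id, account_name, parent_path), own id listed first.
rows = [
    (1, "A", [1]),
    (2, "B", [2, 1]),
    (3, "C", [3, 2, 1]),
    (4, "D", [4, 3, 2, 1]),
]

MAX_PARENTS = 3  # assumption: the number of parent_nameN columns wanted

# id -> name lookup, the counterpart of joining the table to itself
names = {row_id: name for row_id, name, _ in rows}


def parent_names(row_id, path):
    """Names of the ancestors in path order, padded with None to MAX_PARENTS."""
    parents = [names[p] for p in path if p != row_id]
    return (parents + [None] * MAX_PARENTS)[:MAX_PARENTS]


for row_id, name, path in rows:
    print(row_id, name, parent_names(row_id, path))
```

The fixed MAX_PARENTS mirrors the limitation of both queries: SQL needs the number of output columns to be known up front, which is why each answer spells out a fixed list of parent columns.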

SQL Select Unique Values Each Column

I'm looking to select unique values from each column of a table and output the results into a single table. Take the following example table:
+------+---------------+------+---------------+
|col1 |col2 |col_3 |col_4 |
+------+---------------+------+---------------+
|1 |"apples" |A |"red" |
|2 |"bananas" |A |"red" |
|3 |"apples" |B |"blue" |
+------+---------------+------+---------------+
the ideal output would be:
+------+---------------+------+---------------+
|col1 |col2 |col_3 |col_4 |
+------+---------------+------+---------------+
|1 |"apples" |A |"red" |
|2 |"bananas" |B |"blue" |
|3 | | | |
+------+---------------+------+---------------+
Thank you!
Edit: My actual table has many more columns, so ideally the SQL query can be done via a SELECT * as opposed to 4 individual select queries within the FROM statement.
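To make the requested transformation concrete, here is a minimal sketch of it outside SQL, in plain Python with the sample rows hard-coded: each column is deduplicated on its own, and shorter columns are padded with None. It only illustrates the expected result and is not an SQL solution. The variable names are introduced for this illustration.

```python
from itertools import zip_longest

columns = ["col1", "col2", "col_3", "col_4"]
rows = [
    (1, "apples", "A", "red"),
    (2, "bananas", "A", "red"),
    (3, "apples", "B", "blue"),
]

# zip(*rows) turns rows into columns; dict.fromkeys drops duplicates
# while keeping the order in which values first appear.
unique_per_column = [list(dict.fromkeys(col)) for col in zip(*rows)]

# zip_longest turns the columns back into rows, padding short ones with None.
result = list(zip_longest(*unique_per_column))

print(columns)
for row in result:
    print(row)
```

Because the columns are handled generically, the same code works for any number of columns, which is the requirement stated in the edit above.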

SPARK-SCALA: Update End date for a ID with the new start_date for the updated respective ID

I want to create a new column end_date that, for each id, holds the start_date of the next (updated) record with the same id, using Spark Scala.
Consider the following DataFrame:
+---+-----+----------+
| id|Value|start_date|
+---+-----+----------+
| 1 | a | 1/1/2018 |
| 2 | b | 1/1/2018 |
| 3 | c | 1/1/2018 |
| 4 | d | 1/1/2018 |
| 1 | e | 10/1/2018|
+---+-----+----------+
Here the initial start_date of id=1 is 1/1/2018 with value a, and on 10/1/2018 (start_date) the value of id=1 became e. So I need a new column end_date that is set to 10/1/2018 for the first record of id=1 and is NULL for all other records.
Result should be like below:
+---+-----+----------+---------+
| id|Value|start_date|end_date |
+---+-----+----------+---------+
| 1 | a | 1/1/2018 |10/1/2018|
| 2 | b | 1/1/2018 |NULL |
| 3 | c | 1/1/2018 |NULL |
| 4 | d | 1/1/2018 |NULL |
| 1 | e | 10/1/2018|NULL |
+---+-----+----------+---------+
I am using Spark 2.3.
Can anyone help me out here, please?
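With the window function lead (this assumes a SparkSession named spark, as in spark-shell):
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.lead
import spark.implicits._ // enables toDF and the $"col" syntax
val df = List(
(1, "a", "1/1/2018"),
(2, "b", "1/1/2018"),
(3, "c", "1/1/2018"),
(4, "d", "1/1/2018"),
(1, "e", "10/1/2018")
).toDF("id", "Value", "start_date")
// One window per id, ordered by start_date. The column is a string here, so the
// ordering is lexicographic; parse it to a date first if that matters for your data.
val idWindow = Window.partitionBy($"id")
.orderBy($"start_date")
// lead(..., 1) returns the start_date of the next row in the window, or null if there is none
val result = df.withColumn("end_date", lead($"start_date", 1).over(idWindow))
result.show(false)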
Output:
+---+-----+----------+---------+
|id |Value|start_date|end_date |
+---+-----+----------+---------+
|3 |c |1/1/2018 |null |
|4 |d |1/1/2018 |null |
|1 |a |1/1/2018 |10/1/2018|
|1 |e |10/1/2018 |null |
|2 |b |1/1/2018 |null |
+---+-----+----------+---------+
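To show what lead over a per-id window computes, here is a minimal sketch of the same logic in plain Python, without Spark. It parses the dates before sorting so that the order does not depend on string comparison. The month/day/year format is an assumption, since the question does not say which it is.

```python
from collections import defaultdict
from datetime import datetime

records = [
    (1, "a", "1/1/2018"),
    (2, "b", "1/1/2018"),
    (3, "c", "1/1/2018"),
    (4, "d", "1/1/2018"),
    (1, "e", "10/1/2018"),
]


def parse(date_text):
    # Assumption: month/day/year. Use "%d/%m/%Y" if the data is day-first.
    return datetime.strptime(date_text, "%m/%d/%Y")


# Partition by id, like Window.partitionBy($"id").
by_id = defaultdict(list)
for rec in records:
    by_id[rec[0]].append(rec)

result = []
for rec_id, group in by_id.items():
    # Order each partition by start_date, like .orderBy($"start_date").
    group.sort(key=lambda rec: parse(rec[2]))
    # lead(start_date, 1): each row gets the next row's start_date,
    # and the last row of the partition gets None.
    next_starts = [rec[2] for rec in group[1:]] + [None]
    for rec, end_date in zip(group, next_starts):
        result.append((*rec, end_date))

for row in result:
    print(row)
```

Only the first record of id 1 receives an end_date, because it is the only record with a later record in the same partition.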

How to search and join multi indexes with SphinxQL?

I have 2 indexes, indexA and indexB. These 2 indexes have different columns.
Example:
Index A:
+---+-----+
|id |text |
+---+-----+
|1 |john |
|2 |tom |
|3 |sam |
+---+-----+
Index B:
+---+---------+-----+
|id |parentid |num |
+---+---------+-----+
|1 |1 |64 |
|2 |1 |128 |
|3 |2 |256 |
+---+---------+-----+
Question:
How do I get a result like this?
/*Client search*/
SELECT
A.id, A.text, B.num
FROM
indexa A
INNER JOIN
indexb B ON A.id = B.parentid
WHERE
B.num > 100
Result:
+-----+--------+-------+
|A.id | A.text |B.num |
+-----+--------+-------+
|1 |john |128 |
|2 |tom |256 |
+-----+--------+-------+
After editing the index query, the problem was solved.
Index query that solved it:
SELECT
A.id,A.text,B.num
FROM
tableA A
LEFT JOIN
tableB B ON A.id=B.parentid
Search query:
SELECT * FROM indexA