Sort using auxiliary fields, start and end - postgresql

In PostgreSQL, what is the best way to sort records using start and end fields in a generic way, without the need to include in the query the first record (where start_id=3)?
Example table:
+-------+----------+--------+--------+
| FK_ID | START_ID | END_ID | STRING |
+-------+----------+--------+--------+
| 77 | 1 | 9 | E |
| 82 | 5 | 2 | A |
| 77 | 7 | 1 | I |
| 77 | 3 | 7 | W |
| 82 | 9 | 5 | Q |
| 77 | 9 | 5 | X |
| 82 | 2 | 7 | G |
+-------+----------+--------+--------+
Sorted where FK_ID = 77:
+----+---+---+---+
| 77 | 3 | 7 | W |
| 77 | 7 | 1 | I |
| 77 | 1 | 9 | E |
| 77 | 9 | 5 | X |
+----+---+---+---+
Sorted where FK_ID = 82:
+----+---+---+---+
| 82 | 9 | 5 | Q |
| 82 | 5 | 2 | A |
| 82 | 2 | 7 | G |
+----+---+---+---+
Result query sequence:
+-------+----------+
| FK_ID | SEQUENCE |
+-------+----------+
| 82 | QAG |
| 77 | WIEX |
+-------+----------+

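I do not think this is the most efficient way, but you can try a recursive CTE. The anchor picks, for each fk_id, the row whose start_id is not the end_id of any other row in that group; each recursive step then follows end_id to the next row's start_id:
WITH RECURSIVE path AS (
    -- Anchor: first row of each chain
    SELECT t1.fk_id, t1.end_id, t1.string, 1 AS pos
    FROM myTable AS t1
    WHERE NOT EXISTS (
        SELECT 1 FROM myTable AS t2 WHERE t2.fk_id = t1.fk_id AND t2.end_id = t1.start_id
    )
    UNION ALL
    -- Step: next row of the same fk_id, remembering its position in the chain
    SELECT t.fk_id, t.end_id, t.string, p.pos + 1
    FROM myTable AS t
    JOIN path AS p ON p.fk_id = t.fk_id AND p.end_id = t.start_id
)
SELECT fk_id, string_agg(string, '' ORDER BY pos) AS sequence
FROM path GROUP BY fk_id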

Related

Filtering out hierarchical data

I need help with a problem I am facing processing hierarchical data.
Schema of the tables that maintain hierarchical data:
Category table:
| ID | Label |
Mapping table:
| ID | QualifierID | ItemID | ParentID |
Step 1: Wrote a simple self-join query to transform the above mappings:
WITH category_masterlist AS (
SELECT id,
label
FROM Category
)
select id, id as itemid, label, NULL as parentId from [Category] where categoryLevel = 1
UNION
select itemid as id, itemId, (select label from category_masterlist where id = cm.itemid) Label, parentId
from [CategoryMapping] cm
Step 2: Wrote a self-join query using common table expression to return mapping data as follows:
WITH CategoryCTE(ParentID, ID, Label, CategoryLevel) AS
(
SELECT ParentID, ItemID, Label, 0 AS CategoryLevel
FROM [view_TreeviewCategoryMapping]
WHERE ParentID IS NULL
UNION ALL
SELECT e.ParentID, e.ItemID, e.Label, CategoryLevel + 1
FROM [view_TreeviewCategoryMapping] AS e
INNER JOIN CategoryCTE AS d
ON e.ParentID = d.ID
)
SELECT distinct ParentID, ID, Label, CategoryLevel
FROM CategoryCTE
| ID | Label | ParentID | CategoryLevel |
--------------------------------------------------------------------------------
| 90 | Satellite | NULL | 0 |
| 91 | Concrete | NULL | 0 |
| 92 | ETC | NULL | 0 |
| 93 | Chisel | NULL | 0 |
| 94 | Steel | NULL | 0 |
| 96 | Wood | NULL | 0 |
| 97 | MIC Systems | 90 | 1 |
| 97 | MIC Systems | 91 | 1 |
| 99 | Foundations | 91 | 1 |
| 100 | Down Systems | 91 | 1 |
| 101 | Side Systems | 91 | 1 |
| 102 | Systems | 91 | 1 |
| 98 | DWG | 92 | 1 |
| 97 | MIC Systems | 93 | 1 |
| 97 | MIC Systems | 94 | 1 |
| 99 | Foundations | 94 | 1 |
| 100 | Down Systems | 94 | 1 |
| 101 | Side Systems | 94 | 1 |
| 102 | Systems | 94 | 1 |
| 97 | MIC Systems | 95 | 1 |
| 98 | DWG | 95 | 1 |
| 102 | Systems | 95 | 1 |
| 103 | Project Management| 95 | 1 |
| 104 | Software | 95 | 1 |
| 99 | Foundations | 96 | 1 |
| 119 | Fronts | 97 | 2 |
| 121 | Technology | 98 | 2 |
| 112 | Root Systems | 98 | 2 |
| 112 | Root Systems | 99 | 2 |
| 137 | Closed Systems | 112 | 3 |
| 203 | Support | 121 | 3 |
Step 3: I would like to filter above results so that only categories that are mapped completely are returned. Completed mapping is a mapping that has children at level=3. For example, below is what I am looking for based on above resultset:
| ID | Label | ParentID | CategoryLevel |
--------------------------------------------------------------------------------
| 96 | Wood | NULL | 0 |
| 92 | ETC | NULL | 0 |
| 98 | DWG | 92 | 1 |
| 99 | Foundations | 96 | 1 |
| 121 | Technology | 98 | 2 |
| 112 | Root Systems | 98 | 2 |
| 112 | Root Systems | 99 | 2 |
| 137 | Closed Systems | 112 | 3 |
| 203 | Support | 121 | 3 |
Step 4: Ultimately, end user should be presented with a tree view control as follows:
Root
|
|---Wood
| |---Foundations
| |---Root Systems
| |---Closed Systems
|
|---ETC
| |---DWG
| |---Technology
| |---Support
| |---Root Systems
| |---Closed Systems
Please note, a category can have multiple parents. For example, Root Systems has two parents - DWG and Foundations. Did I get the schema correct for category and mapping table especially for the case when a category can have multiple parents?
How can I filter out categories that are not mapped completely from Step 2 to Step 3? That is the hurdle I am unable to cross. Any pointers? I can filter them out at the application level but would really love to filter them out at database level.
I am open to suggestions and recommendations that will help me achieve my goal. I also want a confirmation that the schema I am using is the most efficient one.
Thank you!
Here is a working option that uses the hierarchyid data type.
The nesting is optional and really just for illustration.
Example
Declare @Top int = null --<< Sets top of Hier Try 94
;with cteP as (
Select ID
,ParentID
,Label
,HierID = convert(hierarchyid,concat('/',ID,'/'))
From YourTable
Where IsNull(@Top,-1) = case when @Top is null then isnull(ParentID ,-1) else ID end
Union All
Select ID = r.ID
,Pt = r.ParentID
,Label = r.Label
,HierID = convert(hierarchyid,concat(p.HierID.ToString(),r.ID,'/'))
From YourTable r
Join cteP p on r.ParentID = p.ID)
Select Lvl = HierID.GetLevel()
,ID
,ParentID
,Label = replicate('|----',HierID.GetLevel()-1) + Label -- Nesting Optional ... For Presentation
,HierID_String = HierID.ToString()
From cteP A
Order By A.HierID
Results
Now if @Top was set to 94

Return unique grouped rows with the latest timestamp [duplicate]

At the moment I'm struggling with a problem that looks very easy.
Tablecontent:
Primary keys: Timestamp, COL_A, COL_B, COL_C, COL_D
+------------------+-------+-------+-------+-------+--------+--------+
| Timestamp | COL_A | COL_B | COL_C | COL_D | Data_A | Data_B |
+------------------+-------+-------+-------+-------+--------+--------+
| 31.07.2019 15:12 | - | - | - | - | 1 | 2 |
| 31.07.2019 15:32 | 1 | 1 | 100 | 1 | 5000 | 20 |
| 10.08.2019 09:33 | - | - | - | - | 1000 | 7 |
| 31.07.2019 15:38 | 1 | 1 | 100 | 1 | 33 | 5 |
| 06.08.2019 08:53 | - | - | - | - | 0 | 7 |
| 06.08.2019 09:08 | - | - | - | - | 0 | 7 |
| 06.08.2019 16:06 | 3 | 3 | 3 | 3 | 0 | 23 |
| 07.08.2019 10:43 | - | - | - | - | 0 | 42 |
| 07.08.2019 13:10 | - | - | - | - | 0 | 24 |
| 08.08.2019 07:19 | 11 | 111 | 111 | 12 | 0 | 2 |
| 08.08.2019 10:54 | 2334 | 65464 | 565 | 76 | 1000 | 19 |
| 08.08.2019 11:15 | 232 | 343 | 343 | 43 | 0 | 2 |
| 08.08.2019 11:30 | 2323 | rtttt | 3434 | 34 | 0 | 2 |
| 10.08.2019 14:47 | - | - | - | - | 123 | 23 |
+------------------+-------+-------+-------+-------+--------+--------+
Needed query output:
+------------------+-------+-------+-------+-------+--------+--------+
| Timestamp | COL_A | COL_B | COL_C | COL_D | Data_A | Data_B |
+------------------+-------+-------+-------+-------+--------+--------+
| 31.07.2019 15:38 | 1 | 1 | 100 | 1 | 33 | 5 |
| 06.08.2019 16:06 | 3 | 3 | 3 | 3 | 0 | 23 |
| 08.08.2019 07:19 | 11 | 111 | 111 | 12 | 0 | 2 |
| 08.08.2019 10:54 | 2334 | 65464 | 565 | 76 | 1000 | 19 |
| 08.08.2019 11:15 | 232 | 343 | 343 | 43 | 0 | 2 |
| 08.08.2019 11:30 | 2323 | rtttt | 3434 | 34 | 0 | 2 |
| 10.08.2019 14:47 | - | - | - | - | 123 | 23 |
+------------------+-------+-------+-------+-------+--------+--------+
As you can see, I'm trying to get single rows for my primary keys, using the latest timestamp, which is also a primary key.
Currently, I tried a query like:
SELECT Timestamp, COL_A, COL_B, COL_C, COL_D, Data_A, Data_B FROM XY op
WHERE Timestamp = (
SELECT MAX(Timestamp) FROM XY AS tsRow
WHERE op.COL_A = tsRow.COL_A
AND op.COL_B = tsRow.COL_B
AND op.COL_C = tsRow.COL_C
AND op.COL_D = tsRow.COL_D
);
which gives me a result that looks fine at first glance.
Is there a better or safer way to get my preferred result?
You can use the DISTINCT ON clause, which gives you the first record of an ordered group. Here your group is your (A, B, C, D). This is ordered by the Timestamp column, in descending order, to get the most recent record to be the first.
SELECT DISTINCT ON ("COL_A", "COL_B", "COL_C", "COL_D")
*
FROM
mytable
ORDER BY "COL_A", "COL_B", "COL_C", "COL_D", "Timestamp" DESC
If you want to get your expected order, you need a second ORDER BY after this operation:
SELECT
*
FROM (
SELECT DISTINCT ON ("COL_A", "COL_B", "COL_C", "COL_D")
*
FROM
mytable
ORDER BY "COL_A", "COL_B", "COL_C", "COL_D", "Timestamp" DESC
) s
ORDER BY "Timestamp"
Note: If the Timestamp column is part of the PK, are you sure you really need the four other columns in the PK as well? It seems that the Timestamp column is already unique.
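DISTINCT ON is specific to PostgreSQL. As a more portable alternative, the same "latest row per group" result can be sketched with ROW_NUMBER(). This is a different technique from the one above, shown only for comparison; the inline rows and the CTE names sample and ranked are made up for illustration, and only the column names come from the question.

```sql
-- Hypothetical sample rows (assumption); column names follow the question.
WITH sample ("Timestamp", "COL_A", "COL_B", "COL_C", "COL_D", "Data_A", "Data_B") AS (
    VALUES
        (TIMESTAMP '2019-07-31 15:32', '1', '1', '100', '1', 5000, 20),
        (TIMESTAMP '2019-07-31 15:38', '1', '1', '100', '1',   33,  5),
        (TIMESTAMP '2019-08-06 16:06', '3', '3', '3',   '3',    0, 23)
),
ranked AS (
    -- Number the rows of each (A, B, C, D) group, newest first.
    SELECT s.*,
           ROW_NUMBER() OVER (
               PARTITION BY "COL_A", "COL_B", "COL_C", "COL_D"
               ORDER BY "Timestamp" DESC
           ) AS rn
    FROM sample AS s
)
-- Keep only the newest row of each group, then sort by time.
SELECT "Timestamp", "COL_A", "COL_B", "COL_C", "COL_D", "Data_A", "Data_B"
FROM ranked
WHERE rn = 1
ORDER BY "Timestamp";
```

With these three rows, the 15:32 row is dropped because the 15:38 row has the same key columns and a later timestamp.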

Comparing Subqueries

I have two subqueries. Here is the output of subquery A....
id | date_lat_lng | stat_total | rnum
-------+--------------------+------------+------
16820 | 2016_10_05_10_3802 | 9 | 2
15701 | 2016_10_05_10_3802 | 9 | 3
16821 | 2016_10_05_11_3802 | 16 | 2
17861 | 2016_10_05_11_3802 | 16 | 3
16840 | 2016_10_05_12_3683 | 42 | 2
17831 | 2016_10_05_12_3767 | 0 | 2
17862 | 2016_10_05_12_3802 | 11 | 2
17888 | 2016_10_05_13_3683 | 35 | 2
17833 | 2016_10_05_13_3767 | 24 | 2
16823 | 2016_10_05_13_3802 | 24 | 2
and subquery B, in which some date_lat_lng and stat_total values match those in subquery A, but the ids do not.
id | date_lat_lng | stat_total | rnum
-------+--------------------+------------+------
17860 | 2016_10_05_10_3802 | 9 | 1
15702 | 2016_10_05_11_3802 | 16 | 1
17887 | 2016_10_05_12_3683 | 42 | 1
15630 | 2016_10_05_12_3767 | 20 | 1
16822 | 2016_10_05_12_3802 | 20 | 1
16841 | 2016_10_05_13_3683 | 35 | 1
15632 | 2016_10_05_13_3767 | 23 | 1
17863 | 2016_10_05_13_3802 | 3 | 1
16842 | 2016_10_05_14_3683 | 32 | 1
15633 | 2016_10_05_14_3767 | 12 | 1
Both subquery A and B pull data from the same table. I want to delete the rows in that table that share the same ID as subquery A but only where date_lat_lng and stat_total have a shared match in subquery B.
Effectively I need:
DELETE FROM table WHERE
id IN
(SELECT id FROM (subqueryA) WHERE
subqueryA.date_lat_lng=subqueryB.date_lat_lng
AND subqueryA.stat_total=subqueryB.stat_total)
Except I'm not sure where to place subquery B, or if I need an entirely different structure.
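Something like this, joining only on the two shared columns (id differs between the subqueries, so it must not be part of the join):
DELETE FROM table WHERE
id IN (
SELECT DISTINCT a.id
FROM (subqueryA) AS a
JOIN (subqueryB) AS b
USING (date_lat_lng, stat_total)
)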
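To make the shape of that delete concrete, here is a minimal sketch in PostgreSQL syntax. The table name readings and the two CTEs a and b are hypothetical stand-ins for the real table and for subqueries A and B; the question does not show how the subqueries are built, so they are reduced here to simple id lists.

```sql
-- Hypothetical table standing in for the real one (assumption).
CREATE TEMP TABLE readings (
    id           int PRIMARY KEY,
    date_lat_lng text,
    stat_total   int
);

INSERT INTO readings VALUES
    (17860, '2016_10_05_10_3802',  9),  -- returned by B
    (16820, '2016_10_05_10_3802',  9),  -- returned by A, matches B: deleted
    (15630, '2016_10_05_12_3767', 20),  -- returned by B
    (17831, '2016_10_05_12_3767',  0);  -- returned by A, stat_total differs: kept

WITH a AS (  -- stand-in for subquery A
    SELECT * FROM readings WHERE id IN (16820, 17831)
),
b AS (       -- stand-in for subquery B
    SELECT * FROM readings WHERE id IN (17860, 15630)
)
DELETE FROM readings AS r
WHERE EXISTS (
    SELECT 1
    FROM a
    JOIN b USING (date_lat_lng, stat_total)  -- match on the shared columns only
    WHERE a.id = r.id                        -- delete the row from A's side
);

-- Rows 17860, 15630 and 17831 remain.
SELECT id FROM readings ORDER BY id;
```

EXISTS is used instead of IN here only as a matter of taste; both express the same condition.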

left join 2 tables not working

I have 2 tables:
Table1: 'op_ats'
| ID1 | numero |id_cofre | id_chave | estadoAT
| 1 | 111 | 1 | 3 | 1
| 2 | 222 | 3 | 3 | 2
| 3 | 333 | 1 | 4 | 2
| 4 | 444 | 1 | 2 | 3
Table_2: 'op_ats_cofres_chaves'
| ID2 | num_chave |
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
| 5 | E |
I have this SQL:
SELECT chaves.*, ats.numero numAT, ats.estadoAT
FROM op_ats_cofres_chaves chaves
LEFT JOIN op_ats ats ON ats.id_chave_cofre = chaves.id AND ats.id_cofre = 1
With this I get the following result:
| ID2 | num_chave | numAT | estadoAT |
| 1 | A | 444 | 3 |
| 2 | B | NULL | NULL |
| 3 | C | 111 | 1 |
| 4 | D | 333 | 2 |
| 5 | E | NULL | NULL |
Now the problem is that I want to join only the Table1 rows whose 'estadoAT' column is 1 or 2. I tried adding the line
WHERE op_ats.estadoAT = 1 OR op_ats.estadoAT = 2
But this produces the following result:
| ID2 | num_chave | numAT | estadoAT |
| 1 | A | 444 | 3 |
| 3 | C | 111 | 1 |
| 4 | D | 333 | 2 |
To summarize:
My intention is to get ALL rows from Table2 and join only the Table1 rows that have 'id_cofre = 1' and '(estadoAT = 1 OR estadoAT = 2)'.
Any help is appreciated.
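You have to move the condition to the JOIN clause instead of WHERE, so that it restricts which Table1 rows are joined rather than filtering the final result. Use the alias ats, and group the two estadoAT values so the OR does not override the other join conditions.
SELECT chaves.*, ats.numero numAT, ats.estadoAT
FROM op_ats_cofres_chaves chaves
LEFT JOIN op_ats ats ON ats.id_chave_cofre = chaves.id AND ats.id_cofre = 1
AND ats.estadoAT IN (1, 2);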
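The effect of putting the filter in the ON clause can be checked with a small self-contained sketch, written in PostgreSQL syntax. It uses the sample rows from the question; the join columns id_chave and id2 are taken from the tables as printed, which is an assumption, since the original query refers to id_chave_cofre and id.

```sql
WITH op_ats_cofres_chaves (id2, num_chave) AS (
    VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')
),
op_ats (id1, numero, id_cofre, id_chave, estadoAT) AS (
    VALUES (1, 111, 1, 3, 1),
           (2, 222, 3, 3, 2),
           (3, 333, 1, 4, 2),
           (4, 444, 1, 2, 3)
)
SELECT chaves.id2, chaves.num_chave, ats.numero AS numAT, ats.estadoAT
FROM op_ats_cofres_chaves AS chaves
LEFT JOIN op_ats AS ats
       ON ats.id_chave = chaves.id2
      AND ats.id_cofre = 1            -- filters which op_ats rows may join
      AND ats.estadoAT IN (1, 2)      -- same: applied before the NULL padding
ORDER BY chaves.id2;
```

All five keys A to E come back. Only C and D carry values (111 and 333); the others are padded with NULL, including B, whose only candidate row has estadoAT = 3. Moving either condition to WHERE would instead drop the NULL-padded rows.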

AVG didn't give the correct value - Postgresql

I have a table in which there are many redundant points. I want to select the distinct points (using DISTINCT) together with the average of some column (e.g. rscp) over the duplicates.
Here we have an example :
| id | point | rscp | ci
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| 1 | POINT(10.1192 36.8018) | 10 | 701
| 2 | POINT(10.1192 36.8018) | 11 | 701
| 3 | POINT(10.1192 36.8018) | 12 | 701
| 4 | POINT(10.4195 36.0017) | 30 | 701
| 5 | POINT(10.4195 36.0017) | 44 | 701
| 6 | POINT(10.4195 36.0017) | 55 | 701
| 7 | POINT(10.9197 36.3014) | 20 | 701
| 8 | POINT(10.9197 36.3014) | 22 | 701
| 9 | POINT(10.9197 36.3014) | 25 | 701
What I want to get is the table below (rscp_avg is the average of rscp over the redundant points):
| id | point | rscp_avg | ci
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
| * | POINT(10.1192 36.8018) | 11 | *
| * | POINT(10.4195 36.0017) | 43 | *
| * | POINT(10.9197 36.3014) | 22.33 | *
I tried this, but it gave me a wrong average:
select distinct on(point)
id,st_astext(point),avg(rscp) as rscp_avg,ci
from mesures
group by id,point,ci;
Thanks for your help (^_^)
Hamdoulah! Thank God!
I just found the solution:
select distinct on (point)
id,st_astext(point),rscp_avg,ci
from
(select id,point,avg(rscp) over w as rscp_avg,ci
from mesures
window w as (partition by point order by id desc)
) ss
order by point,id asc;
The websites that helped me are:
http://www.postgresql.org/docs/9.1/static/tutorial-window.html
http://www.w3resource.com/PostgreSQL/postgresql-avg-function.php
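If the id of the individual rows is not needed in the output, a plain GROUP BY on the point may be a simpler way to get the same averages. The sketch below is an alternative to the window-function query, not a replacement for it. It is self-contained: plain text stands in for the geometry column (an assumption, so st_astext is not needed), and the rows are a subset of the example table.

```sql
WITH mesures (id, point, rscp, ci) AS (
    VALUES (1, 'POINT(10.1192 36.8018)', 10, 701),
           (2, 'POINT(10.1192 36.8018)', 11, 701),
           (3, 'POINT(10.1192 36.8018)', 12, 701),
           (7, 'POINT(10.9197 36.3014)', 20, 701),
           (8, 'POINT(10.9197 36.3014)', 22, 701),
           (9, 'POINT(10.9197 36.3014)', 25, 701)
)
SELECT point,
       ROUND(AVG(rscp), 2) AS rscp_avg,  -- one average per distinct point
       MIN(ci)             AS ci         -- ci is aggregated because it is not grouped
FROM mesures
GROUP BY point
ORDER BY point;
```

The original attempt grouped by id as well, so every group held a single row and the "average" was just that row's rscp. Grouping by point alone yields 11.00 and 22.33 for the two points above.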