I have data like in a dataframe
CommsId
Id
Amount
Date
85
1
10
07/10/2020
72
1
15
09/09/2021
85
1
25
09/09/2021
70
1
30
09/09/2021
72
1
-15
05/11/2020
70
1
-30
05/11/2020
For each date, I want to find the sum of amounts of the latest CommsId as of the date.
Expected output is as below
Date
Sum_Amount
Id
07/10/2020
10
1
09/09/2021
70
1
05/11/2021
25
1
I have my data that looks like this:
user_id touchpoint_number days_difference
1 1 5
1 2 20
1 3 25
1 4 10
2 1 2
2 2 30
2 3 4
I would like to create one more column that would create a cumulative sum of the days_difference, partitioned by user_id, but would reset whenever the value reaches 30 and starts counting from 0. I have been trying to do it, but I couldn't figure it out how to do it in PostgreSQL, because it has to be recursive.
The outcome I would like to have would be something like:
user_id touchpoint_number days_difference cum_sum_upto30
1 1 5 5
1 2 20 25
1 3 25 0 --- new count all over again
1 4 10 10
2 1 2 2
2 2 30 0 --- new count all over again
2 3 4 4
Do you have any cool ideas how this could be done?
This should do what you want:
with cte as (
select t.a, t.b, t.c, t.c as sumc
from t
where b = 1
union all
select t.a, t.b, t.c,
(case when t.c + cte.sumc > 30 then 0 else t.c + cte.sumc end)
from t join
cte
on t.b = cte.b + 1 and t.a = cte.a
)
select *
from cte
order by a, b;
Here is a rextester.
id o_num d_num
69af4bf986c4df522afb54da6512bdc5 5 5
69af6111de53b550b0d13f86398b59e5 19 19
69b264c4b93a1984450689b16807b293 10 10
69b26c0fb38ff1cd2d4b01696aa14883 20 20
69b5c46bdc8a8f49f913d9d2325f0a76 15 15
69b71276a69dece5630ed3405ceca411 1 6
69b790c7937602e8fd52bc4d28194625 5 17
69b7bfde4effdaf31d362165a23a8dd0 4 13
69b93626a799636aef2ab3567cf3a110 14 14
I have a table like this, there are total 20 o_num in the table, and i want to select all the row that o_num is 1 then group by the d_num to count the id number, and them change the o_num to 2, until o_num to 20. and the result is in one table.
here is my code for 1 time:
SELECT COUNT(id), o_num, d_num
FROM table1
WHERE o_num = 1
GROUP BY o_num, d_num
how can i change the code to get my table
I want get the reselt like this,a table with 3 columns
sum o_num d_num
9 1 1
8 1 2
4 1 3
……
5 1 20
4 2 1
6 2 2
8 2 3
……
3 2 20
5 3 1
……
……
2 20 20
TSQL MSSQL 2008r2
I'm re-writing the question to try and make it clear what the issue is that I'm trying to explain.
I've got a stored proc that takes 3 parameters. VehicleKey, StartDate and EndDateTime. I'm querying a Data Warehouse db. So the data shouldn't change.
When the proc is called with the same parameters then most of the time the results will be as expected but on some random occasions, with those same parameters, the results differ. I'm querying a Data WH so the data doesn't change.
The problem is with the dynamic derived column "Island".
It's completely random. The proc can be executed 20 times and give the expected results and then the next 2 will give incorrect results.
There can be 1 or more VehicleKey/DriverKey combinations in a given date range.
This is the problem query
SELECT
A.VehicleKey
,A.NodeId
,A.DriverKey
,MIN(A.StartTrip) 'StartTrip'
,MAX(A.EndTrip) 'EndTrip'
,SUM(A.PrivOdo) 'Private'
,SUM(A.BusOdo) 'Business'
,SUM(A.TravOdo) 'Travel'
,SUM(A.PrivOdo + A.BusOdo + A.TravOdo )'Total'
FROM
(
SELECT
Island = ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey ORDER BY MONTH(StartTrip)) ) - ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey, T.DriverKey ORDER BY T.StartTrip) )
,NodeId
,VehicleKey
,DriverKey
,StartTrip
,EndTrip
,BusOdo
,PrivOdo
,TravOdo
FROM
#xYTD_BPTotals T
) AS A
GROUP BY
A.Island
,A.VehicleKey
,A.NodeId
,A.DriverKey
ORDER BY
A.VehicleKey
,MIN(A.StartTrip);
I am of the understanding that the ORDER BY should be on the outside of the derived table for it to take effect.
I think I've narrowed it down to the issue presenting itself only when a Vehicle has 2 or more DriverKey combinations.
for example, Parameters VehicleKey 4865, StartDateTime = '2016-01-01', EndDateTime = '2016-10-31'
This is the correct result - including Island column
VehicleKey NodeId DriverKey Island StartTrip EndTrip Private Business Travel Total_
4865 458 0 0 2016-09-06 14:06:08 2016-09-28 17:02:08 54.75 737.83 0 792.58
4865 458 1202 134 2016-09-29 11:10:04 2016-09-30 17:25:51 0 211.32 0 211.32
4865 458 0 27 2016-10-03 07:39:25 2016-10-14 17:00:15 0 579.81 0 579.81
and this is when it's wrong. Parameters VehicleKey 4865, StartDateTime = '2016-01-01', EndDateTime = '2016-10-31'
- including Island column
The first two rows here should be combined.
VehicleKey NodeId DriverKey Island StartTrip EndTrip Private Business Travel Total_
4865 458 0 98 2016-09-06 14:06:08 2016-09-21 09:15:49 0 313.87 0 313.87
4865 458 0 -63 2016-09-21 09:21:10 2016-09-28 17:02:08 54.75 423.96 0 478.71
4865 458 1202 71 2016-09-29 11:10:04 2016-09-30 17:25:51 0 211.32 0 211.32
4865 458 0 27 2016-10-03 07:39:25 2016-10-14 17:00:15 0 579.81 0 579.81
If I show the first few rows from the derived table, I've broken down the "Island" column
SELECT
Island = ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey ORDER BY MONTH(StartTrip)) ) - ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey, T.DriverKey ORDER BY T.StartTrip) )
,Island_x =( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey ORDER BY MONTH(StartTrip)) )
,Island_y = ( ROW_NUMBER() OVER (PARTITION BY T.VehicleKey, T.DriverKey ORDER BY T.StartTrip) )
,NodeId
,VehicleKey
,DriverKey
,StartTrip
,EndTrip
,BusOdo
,PrivOdo
,TravOdo
FROM
#xYTD_BPTotals T
The correct result should be
Island Island_x Island_y NodeId VehicleKey DriverKey StartTrip EndTrip BusOdo PrivOdo TravOdo
0 1 1 24901 4865 0 2016-09-06 14:06:08 2016-09-06 14:08:50 0 0 0
0 2 2 24901 4865 0 2016-09-06 15:39:14 2016-09-06 15:40:53 114 0 0
0 3 3 24901 4865 0 2016-09-08 11:06:43 2016-09-08 11:07:23 0 0 0
0 4 4 24901 4865 0 2016-09-08 11:12:03 2016-09-08 11:12:26 20 0 0
0 5 5 24901 4865 0 2016-09-08 11:19:20 2016-09-08 11:19:52 1 0 0
0 6 6 24901 4865 0 2016-09-08 11:26:58 2016-09-08 11:27:56 88 0 0
0 7 7 24901 4865 0 2016-09-08 11:33:40 2016-09-08 11:35:02 1 0 0
0 8 8 24901 4865 0 2016-09-12 09:08:53 2016-09-12 09:10:42 34 0 0
but I sometimes get this with the same input paramaters.
Island Island_x Island_y NodeId VehicleKey DriverKey StartTrip EndTrip BusOdo PrivOdo TravOdo
98 1 1 24901 4865 0 2016-09-06 14:06:08 2016-09-06 14:08:50 0 0 0
98 2 2 24901 4865 0 2016-09-06 15:39:14 2016-09-06 15:40:53 114 0 0
98 3 3 24901 4865 0 2016-09-08 11:06:43 2016-09-08 11:07:23 0 0 0
98 4 4 24901 4865 0 2016-09-08 11:12:03 2016-09-08 11:12:26 20 0 0
98 5 5 24901 4865 0 2016-09-08 11:19:20 2016-09-08 11:19:52 1 0 0
98 6 6 24901 4865 0 2016-09-08 11:26:58 2016-09-08 11:27:56 88 0 0
98 7 7 24901 4865 0 2016-09-08 11:33:40 2016-09-08 11:35:02 1 0 0
98 8 8 24901 4865 0 2016-09-12 09:08:53 2016-09-12 09:10:42 34 0 0
Why is the "Island" calculated column wrong? 1-1 = 0 not 98.
Where am I going wrong?
EDIT - #YourData now looks like your raw table
Declare #YourTable table (VehicleKey int,NodeId int,DriverKey int,StartTrip datetime,EndTrip datetime,PrivOdo decimal(10,2),BusOdo decimal(10,2), TravOdo decimal(10,2))
Insert Into #YourTable values
(4865,458,0 ,'2016-09-06 14:06:08','2016-09-21 09:15:49',0 ,313.87,0),
(4865,458,0 ,'2016-09-21 09:21:10','2016-09-28 17:02:08',54.75,423.96,0),
(4865,458,1202,'2016-09-29 11:10:04','2016-09-30 17:25:51',0 ,211.32,0),
(4865,458,0 ,'2016-10-03 07:39:25','2016-10-14 17:00:15',0 ,579.81,0)
Select VehicleKey
,NodeID
,VehicleKey
,DriverKey
,StartTrip = min(StartTrip)
,EndTrip = max(EndTrip)
,Private = sum(PrivOdo)
,Business = sum(BusOdo)
,Travel = sum(TravOdo)
,Total = sum(PrivOdo + BusOdo + TravOdo )
From (
Select Island = ( ROW_NUMBER() OVER (PARTITION BY VehicleKey ORDER BY MONTH(StartTrip)) ) - ( ROW_NUMBER() OVER (PARTITION BY VehicleKey, DriverKey ORDER BY StartTrip) )
,*
From #YourTable
) A
Group By Island,VehicleKey,NodeID,VehicleKey,DriverKey
Order By min(StartTrip)
Returns
FYI - The sub-query produces
I've been trying to adapt other answers here for ages without success, so here goes... I have a basic query:
SELECT
*
FROM
#tbl_counts
ORDER BY
sgId,
CategoryCount DESC,
qccId;
This shows the following results:
sgId qccId CategoryCount
------- -------- -------------
4668 18 8
4668 77 7
4668 2 6
4669 43 2
4669 46 2
4670 25 3
4670 27 3
4670 74 2
4671 56 4
4671 60 3
4671 74 3
4671 54 3
4671 55 3
4671 78 2
4671 88 1
4671 89 1
4671 90 3
I need to amend this query to show the following:
For each unique sgId value, show the top 3 CategoryCount values (with ties if they exist), and the appropriate qccId value. Therefore, the results should be:
sgId qccId CategoryCount
------- ----- -------------
4668 18 8
4668 77 7
4668 2 6 -- top 3 4668
4669 43 2
4669 46 2 -- top 2 4669 because only 2 existed
4670 25 3
4670 27 3
4670 74 2 -- top 3 4670
4671 56 4
4671 60 3
4671 74 3
4671 54 3
4671 55 3 -- top 5 4671 caused by TIES, but discards others
Usually I would ROW_NUMBER here, but am struggling because it doesn't present TIES (I don't think). Therefore on adapting other answers I've found, I've got this far but it doesn't work properly...
SELECT
cnt.*
FROM
(
SELECT DISTINCT
sgId
FROM
#tbl_counts) sg INNER JOIN
(
SELECT TOP 3 WITH TIES
*
FROM
#tbl_counts
ORDER BY
CategoryCount DESC) cnt ON cnt.sgId = sg.sgId
Look like the perfect job for the window function DENSE_RANK:
;WITH
cte AS
(
SELECT *,
DENSE_RANK() OVER (PARTITION BY sgId ORDER BY CategoryCount DESC) As RowRank
FROM #tbl_counts
)
SELECT *
FROM cte
WHERE RowRank <= 3
ORDER BY sgId, CategoryCount DESC, qccId
You could use TOP 1 WITH TIES as you proposed in title:
SELECT TOP 1 WITH TIES *
FROM #tbl_counts
ORDER BY
IIF(DENSE_RANK() OVER(PARTITION BY sgId ORDER BY CategoryCount DESC)<=3,0,1);
LiveDemo