TSQL Row Number split by reference and date - tsql

Using a single table with a reference and a date column (example below), how can I produce the output below, where the row number is split by reference and date? The same reference on the same day should get the same row number.
Example:
MAINFRAJOB SyncDate Row Number
7861 02/10/2019 1
7861 02/10/2019 1
7861 03/10/2019 2
1045679 25/09/2019 1
10233649 03/10/2019 1
10233652 04/10/2019 1
10233652 04/10/2019 1
10233652 06/10/2019 2
123456789 02/10/2019 1
123456789 02/10/2019 1
123456789 03/10/2019 2
123456789 04/10/2019 3
I have tried this, but it does not produce the correct results:
ROW_NUMBER()over(partition by cast(ard.SyncDate as date), ard.actionref order by cast(ard.SyncDate as date) desc) AS 'RowNo'
Thanks for any guidance.

I think you are really looking for Dense_Rank() as BarneyL mentioned, but you also want to partition by MAINFRAJOB
Example
Select *
,Row_Number = DENSE_RANK() over (Partition By [MAINFRAJOB] Order by [SyncDate])
From YourTable
Returns

Try DENSE_RANK instead. You also need to remove the date from the partition, otherwise the number resets to 1 on each date change, and sort ascending so the earliest date gets 1:
DENSE_RANK() over (partition by ard.actionref order by cast(ard.SyncDate as date)) AS 'RowNo'
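
To see why DENSE_RANK fits here and ROW_NUMBER does not, the two can be compared side by side. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later); the table name jobs and the ISO date strings are assumptions made for the illustration, not part of the original question.

```python
import sqlite3

# Hypothetical in-memory table mirroring part of the sample data in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (mainfrajob INTEGER, syncdate TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?)",
    [
        (7861, "2019-10-02"),
        (7861, "2019-10-02"),
        (7861, "2019-10-03"),
        (123456789, "2019-10-02"),
        (123456789, "2019-10-02"),
        (123456789, "2019-10-03"),
        (123456789, "2019-10-04"),
    ],
)

# DENSE_RANK gives rows with the same job and date the same number and leaves
# no gaps; ROW_NUMBER keeps counting even when the date repeats.
rank_rows = conn.execute(
    """
    SELECT mainfrajob,
           syncdate,
           DENSE_RANK() OVER (PARTITION BY mainfrajob ORDER BY syncdate) AS dense_rank_no,
           ROW_NUMBER() OVER (PARTITION BY mainfrajob ORDER BY syncdate) AS row_no
    FROM jobs
    ORDER BY mainfrajob, syncdate
    """
).fetchall()

for row in rank_rows:
    print(row)
```

The third column reproduces the numbering asked for in the question, while the fourth shows the always-increasing numbering that ROW_NUMBER produces within each job.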

Related

how to get last known contiguous value in postgres ltree field?

I have a child table called wbs_numbers. Its primary key, id, is an ltree.
A typical example is:
id series_id
abc.xyz.00001 1
abc.xyz.00002 1
abc.xyz.00003 1
abc.xyz.00101 1
The parent table is called series. It has a field called last_contigous_max.
Given the above example, I want the series with id 1 to have a last_contigous_max of 3.
You can always assume that the ltree id of a wbs number has exactly 3 fragments separated by dots, and that the last fragment is a 5-digit numeric string left-padded with zeros. You can also assume that the first child always ends with 00001 and that the theoretical total number of children of a series will never exceed 9999.
If you think of it as gaps and islands, the wbs_numbers within a series will never start with a gap; they will always start with an island.
That is to say, this is not possible:
id series_id
abc.xyz.00010 1
abc.xyz.00011 1
abc.xyz.00012 1
abc.xyz.00101 1
This is possible
id series_id
abc.xyz.00001 1
abc.xyz.00004 1
abc.xyz.00005 1
abc.xyz.00051 1
abc.xyz.00052 1
abc.xyz.00100 1
abc.xyz.10001 2
abc.xyz.10002 2
abc.xyz.10003 2
abc.xyz.10051 2
abc.xyz.10052 2
abc.xyz.10100 2
abc.xyz.20001 3
abc.xyz.20002 3
abc.xyz.20003 3
abc.xyz.20004 3
abc.xyz.20052 3
abc.xyz.20100 3
so the last max contiguous in this case is
for series id 1 => 1
for series id 2 => 3
for series id 3 => 4
What's the query to calculate the last_contigous_max number for any given series_id?
I also don't mind having another table just to store "islands".
Also, you can safely assume that wbs_number records will never be deleted once created. The id in the wbs_numbers table will never be altered once filled in as well.
Meaning to say islands will only grow and never shrink.
You can solve this with the following steps:
extract the integer value from the last four digits of your "id" field
compute a row number per series, ordered by that value
keep only the rows where the row number matches the value
take the last matching row for each series, using FETCH FIRST ROWS WITH TIES
WITH cte AS (
SELECT *, CAST(RIGHT(id_, 4) AS INTEGER) AS idval
FROM tab
), ranked AS (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY series_id ORDER BY idval) AS rn
FROM cte
)
SELECT series_id, idval
FROM ranked
WHERE idval = rn
ORDER BY ROW_NUMBER() OVER(PARTITION BY series_id ORDER BY idval DESC)
FETCH FIRST ROWS WITH TIES
Check the demo here.
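
The same idea can be tried without a PostgreSQL server. The following is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or later for window functions). It is an adaptation, not the answer's exact query: SQLite has no FETCH FIRST ... WITH TIES, so MAX with GROUP BY is swapped in for the last step, and the id is stored as plain text because SQLite has no ltree type. The sample rows are a subset of the data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: ids are stored as TEXT here, since SQLite has no ltree type.
conn.execute("CREATE TABLE wbs_numbers (id TEXT PRIMARY KEY, series_id INTEGER)")
conn.executemany(
    "INSERT INTO wbs_numbers VALUES (?, ?)",
    [
        ("abc.xyz.00001", 1), ("abc.xyz.00004", 1), ("abc.xyz.00005", 1),
        ("abc.xyz.00051", 1),
        ("abc.xyz.10001", 2), ("abc.xyz.10002", 2), ("abc.xyz.10003", 2),
        ("abc.xyz.10051", 2),
        ("abc.xyz.20001", 3), ("abc.xyz.20002", 3), ("abc.xyz.20003", 3),
        ("abc.xyz.20004", 3), ("abc.xyz.20052", 3),
    ],
)

# A row belongs to the leading island exactly when its value equals its row
# number within the series: after the first gap the value is always larger.
contiguous_max = dict(
    conn.execute(
        """
        WITH vals AS (
            SELECT series_id, CAST(substr(id, -4) AS INTEGER) AS idval
            FROM wbs_numbers
        ), ranked AS (
            SELECT series_id,
                   idval,
                   ROW_NUMBER() OVER (PARTITION BY series_id ORDER BY idval) AS rn
            FROM vals
        )
        SELECT series_id, MAX(idval)
        FROM ranked
        WHERE idval = rn
        GROUP BY series_id
        """
    ).fetchall()
)
print(contiguous_max)
```

Because the question states that rows are never deleted or altered, the result per series can only grow, so it could also be cached in the parent table and recomputed after inserts.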

How to apply partition by in lag function using PostgreSQL

I have a table like as shown below
subject_id, date_inside, value
1 2110-02-12 19:41:00 1.3
1 2110-02-15 01:40:00 1.4
1 2110-02-15 02:40:00 1.5
2 2110-04-15 04:07:00 1.6
2 2110-04-15 08:00:00 1.7
2 2110-04-15 18:30:00 1.8
I would like to compute the date difference between consecutive rows for each subject
I tried the below
select a.subject_id,a.date_inside, a.value,
a.date_inside - lag(a.date_inside) over (order by a.date_inside) as difference
from table1 a
While the above works, I am not able to apply partition by for each subject. So, it ends up calculating the difference for all the rows (without considering the subject_id). Basically, the last row of each subject has to be null because that's his or her last row (and should not be subtracted from consecutive record of the next subject)
I expect my output to be like as shown below
subject_id, date_inside, difference
1 2110-02-12 19:41:00 66 hours
1 2110-02-15 01:40:00 1 hour
1 2110-02-15 02:40:00 NULL
2 2110-04-15 04:07:00 3 hours, 53 minutes
2 2110-04-15 08:00:00 10 hours, 30 minutes
2 2110-04-15 18:30:00 NULL
Just add a PARTITION BY clause, and also your expected output seems to want LEAD, not LAG:
SELECT subject_id, date_inside, value,
LEAD(date_inside) OVER (PARTITION BY subject_id ORDER BY date_inside)
- date_inside AS difference
FROM table1
ORDER BY
subject_id,
date_inside;
Think of "partition by" as similar to how you would use "group by". In this case the logical boundaries are determined by subject_id, so just include it in the over clause:
select a.subject_id, a.date_inside, a.value,
a.date_inside - lag(a.date_inside) over (partition by a.subject_id order by a.date_inside) as difference
from table1 a
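
The effect of PARTITION BY on LEAD can be checked with a small, self-contained sketch. It uses Python's built-in sqlite3 module rather than PostgreSQL (SQLite 3.25 or later for window functions), so the difference is computed in whole minutes from epoch seconds instead of as a PostgreSQL interval; the column name minutes_to_next is an assumption made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (subject_id INTEGER, date_inside TEXT, value REAL)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        (1, "2110-02-12 19:41:00", 1.3),
        (1, "2110-02-15 01:40:00", 1.4),
        (1, "2110-02-15 02:40:00", 1.5),
        (2, "2110-04-15 04:07:00", 1.6),
        (2, "2110-04-15 08:00:00", 1.7),
        (2, "2110-04-15 18:30:00", 1.8),
    ],
)

# LEAD looks at the next row inside the same subject only, so the last row of
# each subject has no successor and its difference is NULL (None in Python).
lead_rows = conn.execute(
    """
    SELECT subject_id,
           date_inside,
           (CAST(strftime('%s', LEAD(date_inside) OVER (
                     PARTITION BY subject_id ORDER BY date_inside)) AS INTEGER)
            - CAST(strftime('%s', date_inside) AS INTEGER)) / 60 AS minutes_to_next
    FROM table1
    ORDER BY subject_id, date_inside
    """
).fetchall()

for row in lead_rows:
    print(row)
```

Swapping LEAD for LAG in the sketch would move the NULL to the first row of each subject instead of the last.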

Get distinct rows based on one column with T-SQL

I have a column in the following format:
Time Value
17:27 2
17:27 3
I want to get the distinct rows based on one column: Time. So my expected result would be one row: either 17:27 2 or 17:27 3.
Distinct
T-SQL uses distinct on multiple columns instead of one. Distinct would return two rows since the combinations of Time and Value are unique (see below).
select distinct [Time], * from SAPQMDATA
would return
Time Value
17:27 2
17:27 3
instead of
Time Value
17:27 2
Group by
Also group by does not appear to work
select * from table group by [Time]
Will result in:
Column 'Value' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Questions
How can I select all unique 'Time' columns without taking into account other columns provided in a select query?
How can I remove duplicate entries?
This is where ROW_NUMBER will be your best friend. Using this as your sample data...
time value
-------------------- -----------
17:27 2
17:27 3
11:36 9
15:14 5
15:14 6
...below are two solutions that you can copy, paste, and run.
DECLARE @youtable TABLE ([time] VARCHAR(20), [value] INT);
INSERT @youtable VALUES ('17:27',2),('17:27',3),('11:36',9),('15:14',5),('15:14',6);
-- The most elegant way to solve this
SELECT TOP (1) WITH TIES t.[time], t.[value]
FROM @youtable AS t
ORDER BY ROW_NUMBER() OVER (PARTITION BY t.[time] ORDER BY (SELECT NULL));
-- A more efficient way to solve this
SELECT t.[time], t.[value]
FROM
(
SELECT t.[time], t.[value], ROW_NUMBER() OVER (PARTITION BY t.[time] ORDER BY (SELECT NULL)) AS RN
FROM @youtable AS t
) AS t
WHERE t.RN = 1;
Each returns:
time value
-------------------- -----------
11:36 9
15:14 5
17:27 2
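
With ORDER BY (SELECT NULL) the row kept for each time is not guaranteed, so if it matters which one survives, an explicit tie-breaker can be used inside the window. The following is a minimal sketch of that variation, run through Python's built-in sqlite3 module instead of SQL Server (SQLite 3.25 or later); the table name yourtable and the choice to keep the smallest value are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourtable ("time" TEXT, "value" INTEGER)')
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [("17:27", 2), ("17:27", 3), ("11:36", 9), ("15:14", 5), ("15:14", 6)],
)

# Number the rows inside each time group, smallest value first, then keep
# only the first row of every group.
distinct_rows = conn.execute(
    """
    SELECT "time", "value"
    FROM (
        SELECT "time",
               "value",
               ROW_NUMBER() OVER (PARTITION BY "time" ORDER BY "value") AS rn
        FROM yourtable
    ) AS numbered
    WHERE rn = 1
    ORDER BY "time"
    """
).fetchall()
print(distinct_rows)
```

Ordering by "value" DESC inside the window would keep the largest value per time instead.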

Postgresql : Average over a limit of Date with group by

I have a table like this
item_id date number
1 2000-01-01 100
1 2003-03-08 50
1 2004-04-21 10
1 2004-12-11 10
1 2010-03-03 10
2 2000-06-29 1
2 2002-05-22 2
2 2002-07-06 3
2 2008-10-20 4
I'm trying to get the average for each unique item_id over its last 3 dates.
It's difficult because there are missing dates in between, so a range of hardcoded dates doesn't always work.
I expect a result like :
item_id MyAverage
1 10
2 3
I don't really know how to do this. Currently I manage to do it for one item, but I have trouble extending it to multiple items:
SELECT AVG(MyAverage.number) FROM (
SELECT date,number
FROM item_list
where item_id = 1
ORDER BY date DESC limit 3
) as MyAverage;
My main problem is with generalising the "DESC limit 3" over a group by id.
attempt :
SELECT item_id,AVG(MyAverage.number)
FROM (
SELECT item_id,date,number
FROM item_list
ORDER BY date DESC limit 3) as MyAverage
GROUP BY item_id;
The limit is messing things up there.
I have made it "work" using BETWEEN on two dates, but that is not what I want because I need a limit and not a hardcoded date.
Can anybody help?
You can use row_number() to assign 1 to 3 to the records with the latest dates for each ID and then filter on that.
SELECT x.item_id,
avg(x.number)
FROM (SELECT il.item_id,
il.number,
row_number() OVER (PARTITION BY il.item_id
ORDER BY il.date DESC) rn
FROM item_list il) x
WHERE x.rn BETWEEN 1 AND 3
GROUP BY x.item_id;
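
The query can be checked against the sample data from the question. This is a minimal sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite 3.25 or later for window functions); dates are stored as ISO text, which is an assumption of the sketch and sorts the same way as real dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE item_list (item_id INTEGER, "date" TEXT, number INTEGER)')
conn.executemany(
    "INSERT INTO item_list VALUES (?, ?, ?)",
    [
        (1, "2000-01-01", 100), (1, "2003-03-08", 50), (1, "2004-04-21", 10),
        (1, "2004-12-11", 10), (1, "2010-03-03", 10),
        (2, "2000-06-29", 1), (2, "2002-05-22", 2), (2, "2002-07-06", 3),
        (2, "2008-10-20", 4),
    ],
)

# Number each item's rows from newest to oldest, keep the newest three,
# and average them per item.
averages = conn.execute(
    """
    SELECT x.item_id, AVG(x.number) AS my_average
    FROM (
        SELECT il.item_id,
               il.number,
               ROW_NUMBER() OVER (PARTITION BY il.item_id
                                  ORDER BY il."date" DESC) AS rn
        FROM item_list il
    ) AS x
    WHERE x.rn <= 3
    GROUP BY x.item_id
    ORDER BY x.item_id
    """
).fetchall()
print(averages)
```

The row number plays the role of a per-group LIMIT, which is why no hardcoded date range is needed.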

Postgresql difference between rows

My data:
id value
1 10
1 20
1 60
2 10
3 10
3 30
How to compute column 'change'?
id value change | my comment, how to compute
1 10 10 | 20-10
1 20 40 | 60-20
1 60 40 | default_value-60. In this example default_value=100
2 10 90 | default_value-10
3 10 20 | 30-10
3 30 70 | default_value-30
In other words: if row of id is last, then compute 100-value,
else compute next_value-value_now
You can access the value of the "next" (or "previous") row using a window function. The concept of a "next" row only makes sense if you have a column to define an order on the rows. You said you have a date column on which you can order the result. I used the column name your_date_column for this. You need to replace that with the actual column name of course.
select id,
value,
lead(value, 1, 100) over (partition by id order by your_date_column) - value as change
from the_table
order by id, your_date_column
lead(value, 1, 100) says: take the column value of the "next" row (that's the 1). If there is no such row, use the default value 100 instead.
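
The three-argument form of lead can be tried on the sample data with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite 3.25 or later for window functions). The column seq is hypothetical: it stands in for whatever column defines the row order in the real table, since the sample data in the question does not show one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical ordering column standing in for your_date_column.
conn.execute("CREATE TABLE the_table (id INTEGER, seq INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 60), (2, 1, 10), (3, 1, 10), (3, 2, 30)],
)

# lead(value, 1, 100): value of the next row in the same id, or 100 when the
# current row is the last one for that id.
change_rows = conn.execute(
    """
    SELECT id,
           value,
           LEAD(value, 1, 100) OVER (PARTITION BY id ORDER BY seq) - value AS change
    FROM the_table
    ORDER BY id, seq
    """
).fetchall()

for row in change_rows:
    print(row)
```

The change column matches the one requested in the question, including the rows that fall back to the default value of 100.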
Join on a subquery and use ROW_NUMBER to find the last value per group
WITH CTE AS(
SELECT id,value,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) rn,
(LEAD(value) OVER (PARTITION BY id ORDER BY date)-value) change FROM t)
SELECT cte.id,cte.value,
(CASE WHEN cte.change IS NULL THEN 100-cte.value ELSE cte.change END)as change FROM cte LEFT JOIN
(SELECT id,MAX(rn) mrn FROM cte
GROUP BY id) as x
ON x.mrn=cte.rn AND cte.id=x.id
FIDDLE