look for records with consecutive dates and know the number of days - tsql

I have a table containing the fields:
cazi, cdip, date
1 2 13/03/2021
1 2 14/03/2021
1 2 15/03/2021
1 2 18/03/2021
1 2 19/03/2021
1 3 13/03/2021
1 3 14/03/2021
1 3 15/03/2021
1 3 20/03/2021
1 3 21/03/2021
I can't get the result with the columns:
cazi, cdip, date1, date2, num_dd
1 2 13/03/2021 15/03/2021 3
1 2 18/03/2021 19/03/2021 2
1 3 13/03/2021 15/03/2021 3
1 3 20/03/2021 21/03/2021 2
Can you help me?
With the following code I get the min and max of the records, but I need the consecutive records:
WITH
dateGroup AS
(
SELECT DISTINCT
UniqueDate = [date]
,DateGroup = DATEADD(dd, - ROW_NUMBER() OVER (ORDER BY [date]), [date])
FROM malt
GROUP BY [date]
)
SELECT distinct
StartDate = MIN(UniqueDate)
,EndDate = MAX(UniqueDate)
,Days = DATEDIFF(dd,MIN(UniqueDate),MAX(UniqueDate))+1
,cazi
,cdip
FROM dateGroup JOIN
malt u ON u.date = UniqueDate
GROUP BY
DateGroup
,cazi
,cdip

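This is a classic gaps-and-islands problem. You can try the query below to get the desired result. Note that the row number must be partitioned by both cazi and cdip, and that the day count is derived from the first and last date of each island:
SELECT cazi, cdip, MIN(T.[date]) AS date1, MAX(T.[date]) AS date2,
DATEDIFF(DAY, MIN(T.[date]), MAX(T.[date])) + 1 AS num_dd
FROM (SELECT M.*, ROW_NUMBER() OVER(PARTITION BY cazi, cdip ORDER BY [date]) RN
FROM malt M) T
GROUP BY cazi, cdip, DATEADD(DAY, - RN, [date]);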
Demo.
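To see why subtracting the row number from the date isolates each run, here is a minimal Python sketch of the same idea applied to the sample rows from the question. The function name find_islands is made up for illustration, and the sketch assumes each (cazi, cdip, date) combination appears only once.

```python
from datetime import date, timedelta
from itertools import groupby


def find_islands(rows):
    """Group (cazi, cdip, date) rows into runs of consecutive dates.

    Returns tuples (cazi, cdip, date1, date2, num_dd).
    Assumes no duplicate dates within one (cazi, cdip) pair.
    """
    result = []
    ordered = sorted(rows)
    # The outer grouping plays the role of PARTITION BY cazi, cdip.
    for (cazi, cdip), group in groupby(ordered, key=lambda r: (r[0], r[1])):
        dates = [r[2] for r in group]
        numbered = enumerate(dates, start=1)
        # date minus row number is constant inside a run of consecutive days.
        for _, run in groupby(numbered, key=lambda p: p[1] - timedelta(days=p[0])):
            run_dates = [d for _, d in run]
            result.append((cazi, cdip, run_dates[0], run_dates[-1], len(run_dates)))
    return result


rows = [
    (1, 2, date(2021, 3, 13)),
    (1, 2, date(2021, 3, 14)),
    (1, 2, date(2021, 3, 15)),
    (1, 2, date(2021, 3, 18)),
    (1, 2, date(2021, 3, 19)),
    (1, 3, date(2021, 3, 13)),
    (1, 3, date(2021, 3, 14)),
    (1, 3, date(2021, 3, 15)),
    (1, 3, date(2021, 3, 20)),
    (1, 3, date(2021, 3, 21)),
]

islands = find_islands(rows)
for island in islands:
    print(island)
```

For the sample data this yields four islands, matching the four rows the question asks for.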

Related

How to calculate the number of messages within 10 seconds before the previous one?

I have a table with messages and I need to find chats where there were two or more messages within a period of 10 seconds. Table:
id message_id time
1 1 2021.11.10 13:09:00
1 2 2021.11.10 13:09:01
1 3 2021.11.10 13:09:50
2 1 2021.11.10 15:18:00
2 2 2021.11.10 15:20:00
3 1 2021.11.12 15:00:00
3 2 2021.11.12 15:10:00
3 2 2021.11.12 15:10:10
So the result looks like
id
1
3
I can't figure out how to group by a period of time. Or maybe it can be done some other way?
select id
from t
group by id, ?
having count(message_id) > 1

You can join the table with itself, matching them on the chat id and your timeframe.
create table messages (chat_id integer,message_id integer,"time" timestamp);
insert into messages values
(1,1,'2021.11.10 13:09:00'),
(1,2,'2021.11.10 13:09:01'),
(1,3,'2021.11.10 13:09:50'),
(2,1,'2021.11.10 15:18:00'),
(2,2,'2021.11.10 15:20:00'),
(3,1,'2021.11.12 15:00:00'),
(3,2,'2021.11.12 15:10:00'),
(3,2,'2021.11.12 15:10:10');
select target_chat,
target_message,
count(*) "number of messages preceding by no more than 10 seconds"
from
(select t1.chat_id target_chat,
t1.message_id target_message,
t1.time,
t2.chat_id,
t2.message_id,
t2.time
from messages t1
inner join messages t2
on t1.chat_id=t2.chat_id
and t2.time>=t1.time-'10 seconds'::interval
and t2.time<t1.time) a -- strictly earlier, so a row never matches itself even though message_id 2 repeats in chat 3
group by 1,2
order by 1,2;
-- target_chat | target_message | number of messages preceding by no more than 10 seconds
---------------+----------------+---------------------------------------------------------
-- 1 | 2 | 1
-- 3 | 2 | 1
--(2 rows)
From that you can select the records with your desired number of preceding messages.

This is a simple query that finds, for each row, the next message in the same chat and checks whether it arrives within our interval:
select id from test_table t where
t.time + interval '10 second' >=
(select min(time) from test_table where id=t.id and time>t.time)
group by id;
results
id
----
1
3

To find rows within a period of time, you can typically use a window function, which avoids a self-join on the table:
SELECT id, count(*) OVER (PARTITION BY id ORDER BY time RANGE BETWEEN CURRENT ROW AND interval '10 seconds' FOLLOWING) AS ct
FROM t
Then you can use this query as a subquery if you only want the ids with a count greater than 1:
SELECT DISTINCT l.id
FROM
( SELECT id, count(*) OVER (PARTITION BY id ORDER BY time RANGE BETWEEN CURRENT ROW AND interval '10 seconds' FOLLOWING) AS ct
FROM t
) AS l
WHERE l.ct > 1 ;
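The per-chat check behind these queries can also be sketched outside SQL: sort each chat's timestamps and compare neighbours. The following Python sketch uses the sample data from the question; the function name chats_with_bursts and the inclusive 10-second boundary are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def chats_with_bursts(messages, window=timedelta(seconds=10)):
    """Return sorted chat ids that have two messages at most `window` apart."""
    times_by_chat = defaultdict(list)
    for chat_id, _message_id, ts in messages:
        times_by_chat[chat_id].append(ts)

    hits = []
    for chat_id, times in times_by_chat.items():
        times.sort()
        # If any two messages are within the window, some adjacent pair is too,
        # so comparing neighbours in sorted order is enough.
        if any(later - earlier <= window for earlier, later in zip(times, times[1:])):
            hits.append(chat_id)
    return sorted(hits)


def parse(text):
    # Timestamp layout used in the question's sample table.
    return datetime.strptime(text, "%Y.%m.%d %H:%M:%S")


messages = [
    (1, 1, parse("2021.11.10 13:09:00")),
    (1, 2, parse("2021.11.10 13:09:01")),
    (1, 3, parse("2021.11.10 13:09:50")),
    (2, 1, parse("2021.11.10 15:18:00")),
    (2, 2, parse("2021.11.10 15:20:00")),
    (3, 1, parse("2021.11.12 15:00:00")),
    (3, 2, parse("2021.11.12 15:10:00")),
    (3, 2, parse("2021.11.12 15:10:10")),
]

burst_chats = chats_with_bursts(messages)
print(burst_chats)
```

Because the comparison uses timestamps only, the repeated message_id in chat 3 does not affect the result.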

How to enumerate rows by division?

I have the following table
id num sub_id
1 3 1
1 5 2
1 1 1
1 4 2
2 1 5
2 2 5
I want to get this result
id num sub_id number
1 3 1 1
1 5 2 2
1 1 1 1
1 4 2 2
2 1 5 1
2 2 5 1
I tried row_number() over (partition by id order by num, sub_id DESC), but the result obviously differs.

Your question doesn't explain the logic behind the numbering, so I'm not sure which rule you need, but maybe this query helps:
Result and info: dbfiddle
with recursive
cte_r as (
select id,
num,
sub_id,
row_number() over () as rn
from test),
cte as (
select id,
num,
sub_id,
rn,
rn as grp
from cte_r
where rn = 1
union all
select cr.id,
cr.num,
cr.sub_id,
cr.rn,
case
when cr.id != c.id then 1
when cr.id = c.id and cr.sub_id = c.sub_id then c.grp
when cr.id = c.id and cr.sub_id > c.sub_id then c.grp + 1
when cr.id = c.id and cr.sub_id < c.sub_id then 1
end
from cte c,
cte_r cr
where c.rn = cr.rn - 1)
select id,
num,
sub_id,
grp
from cte
order by id

It looks like you actually want to ignore the num column and then use DENSE_RANK on sub_id:
SELECT *, dense_rank() OVER (PARTITION BY id ORDER BY sub_id) AS number FROM …;
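As a rough illustration of what a dense rank per id computes, here is a small Python sketch over the sample rows from the question. The helper name dense_rank_by_id is hypothetical; the sketch keeps the input row order, which SQL would not guarantee without an ORDER BY.

```python
def dense_rank_by_id(rows):
    """Append the dense rank of sub_id within each id to every (id, num, sub_id) row."""
    distinct_subs = {}
    for id_, _num, sub_id in rows:
        distinct_subs.setdefault(id_, set()).add(sub_id)

    # Dense rank: the position of a value among the sorted distinct values,
    # so equal sub_ids share a rank and no rank numbers are skipped.
    rank = {
        id_: {sub: pos for pos, sub in enumerate(sorted(subs), start=1)}
        for id_, subs in distinct_subs.items()
    }
    return [(id_, num, sub_id, rank[id_][sub_id]) for id_, num, sub_id in rows]


rows = [
    (1, 3, 1),
    (1, 5, 2),
    (1, 1, 1),
    (1, 4, 2),
    (2, 1, 5),
    (2, 2, 5),
]

ranked = dense_rank_by_id(rows)
for row in ranked:
    print(row)
```

The last element of each tuple corresponds to the number column in the desired result.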

Select dates missing data in a range

I have a postgres table test_table that looks like this:
date | test_hour
------------+-----------
2000-01-01 | 1
2000-01-01 | 2
2000-01-01 | 3
2000-01-02 | 1
2000-01-02 | 2
2000-01-02 | 3
2000-01-02 | 4
2000-01-03 | 1
2000-01-03 | 2
I need to select all the dates which don't have test_hour = 1, 2, and 3, so it should return
date
------------
2000-01-03
Here is what I have tried:
SELECT date FROM test_table WHERE test_hour NOT IN (SELECT generate_series(1,3));
But that only returns dates that have extra hours beyond 1, 2, and 3.

You can use aggregation and conditional HAVING clauses, like so:
SELECT mydate
FROM mytable
GROUP BY mydate
HAVING
MAX(CASE WHEN test_hour = 1 THEN 1 ELSE 0 END) = 0
OR MAX(CASE WHEN test_hour = 2 THEN 1 ELSE 0 END) = 0
OR MAX(CASE WHEN test_hour = 3 THEN 1 ELSE 0 END) = 0

Another possibility would be to join the table against the series (or another subquery containing the hours) and do a [distinct] count of the hours aggregated per date:
select date from tst
inner join (select generate_series(1,3) "hour") hours on hours.hour = tst.hour
group by tst.date
having count(distinct tst.hour) < 3;
or
select date from tst
where hour in (select generate_series(1,3))
group by date
having count(distinct tst.hour) < 3;
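[You don't need the distinct if date/hour combinations in your table are unique. Note that a date with none of the hours 1 to 3 is filtered out before grouping, so neither query will return it.]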

A solution using set difference, giving you exactly the rows that are missing:
(SELECT DISTINCT
date, all_hour
FROM test_table
CROSS JOIN generate_series(1,3) all_hour)
EXCEPT
(TABLE test_table)
And a solution using an array aggregate and the array contains operator:
SELECT date
FROM test_table
GROUP BY date
HAVING NOT array_agg(test_hour) @> ARRAY(SELECT generate_series(1,3))
(online demos)
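Both solutions boil down to a set comparison per date: collect the hours present and check them against the required set. A minimal Python sketch of that check, using the sample table from the question (the names REQUIRED_HOURS and dates_missing_hours are invented for this example):

```python
from collections import defaultdict

REQUIRED_HOURS = frozenset({1, 2, 3})


def dates_missing_hours(rows, required=REQUIRED_HOURS):
    """Map each date lacking some required hour to the sorted list of missing hours."""
    hours_by_date = defaultdict(set)
    for day, hour in rows:
        hours_by_date[day].add(hour)

    # `required <= hours` is the set version of the array contains operator;
    # `required - hours` is the set difference used by the EXCEPT query.
    return {
        day: sorted(required - hours)
        for day, hours in sorted(hours_by_date.items())
        if not required <= hours
    }


rows = [
    ("2000-01-01", 1), ("2000-01-01", 2), ("2000-01-01", 3),
    ("2000-01-02", 1), ("2000-01-02", 2), ("2000-01-02", 3), ("2000-01-02", 4),
    ("2000-01-03", 1), ("2000-01-03", 2),
]

missing = dates_missing_hours(rows)
print(missing)
```

Like the SQL versions, this only considers dates that appear in the table at least once.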

SQL select converting transaction rows to columns

I have a table that lists all transactions as follows:
ID Account Date Amount
---------------------------
1 2 02/01/2015 30
2 5 05/01/2015 25
3 2 05/01/2015 12
4 2 07/01/2015 42
5 5 10/01/2015 19
6 2 11/01/2015 58
7 3 15/01/2015 36
I would like to write a SELECT statement that lists only the last 3 transactions of each account, as follows:
Account Date1 Amount Date2 Amount Date3 Amount
---------------------------------------------------------------
2 11/01/2015 58 07/01/2015 42 05/01/2015 12
3 15/01/2015 36
5 10/01/2015 19 05/01/2015 25
Thank you for any advice

You can use the row_number() function in a derived table to partition the data by account, and give each date within the partition a number, and then do a conditional aggregation over the rows with the top 3 numbers, grouped by account:
select
account,
date1 = max(case when rn = 1 then date end),
amount = max(case when rn = 1 then amount end),
date2 = max(case when rn = 2 then date end),
amount = max(case when rn = 2 then amount end),
date3 = max(case when rn = 3 then date end),
amount = max(case when rn = 3 then amount end)
from (
select *, rn = row_number() over (partition by account order by date desc)
from your_table
) a
where rn <= 3
group by account
Sample SQL Fiddle
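The same "number the rows per account, then spread the top three across columns" logic can be sketched in Python. This is only an illustration on the question's sample data, reading the dates as dd/mm/yyyy; the function name last_three_per_account is an assumption.

```python
from collections import defaultdict
from datetime import date


def last_three_per_account(transactions):
    """Map each account to [date1, amount1, date2, amount2, date3, amount3].

    Slots are filled newest first and padded with None when an account
    has fewer than three transactions.
    """
    by_account = defaultdict(list)
    for _id, account, day, amount in transactions:
        by_account[account].append((day, amount))

    result = {}
    for account, items in sorted(by_account.items()):
        # Newest first, like ORDER BY date DESC, keeping only rn <= 3.
        newest = sorted(items, reverse=True)[:3]
        row = []
        for day, amount in newest:
            row.extend([day, amount])
        row.extend([None] * (6 - len(row)))
        result[account] = row
    return result


transactions = [
    (1, 2, date(2015, 1, 2), 30),
    (2, 5, date(2015, 1, 5), 25),
    (3, 2, date(2015, 1, 5), 12),
    (4, 2, date(2015, 1, 7), 42),
    (5, 5, date(2015, 1, 10), 19),
    (6, 2, date(2015, 1, 11), 58),
    (7, 3, date(2015, 1, 15), 36),
]

pivoted = last_three_per_account(transactions)
for account, row in pivoted.items():
    print(account, row)
```

The None padding corresponds to the NULLs the conditional aggregation produces for accounts with fewer than three rows.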

Counting dates that fall between two dates in the same column

I have two tables. For each ID and Level combination in table1, I need to count how many times the matching ID appears in table2 between that level's Time and the next level's Time in table1.
For example, for ID=1 and Level=1 in table1, two Time entries from table2 for ID=1 fall between the Time of Level=1 and the Time of Level=2 in table1, so the count in the result table is 2.
table1:
ID Level Time
1 1 6/7/13 7:03
1 2 6/9/13 7:05
1 3 6/12/13 12:02
1 4 6/17/13 5:01
2 1 6/18/13 8:38
2 3 6/20/13 9:38
2 4 6/23/13 10:38
2 5 6/28/13 1:38
table2:
ID Time
1 6/7/13 11:51
1 6/7/13 14:15
1 6/9/13 16:39
1 6/9/13 19:03
2 6/20/13 11:02
2 6/20/13 15:50
Result would be
ID Level Count
1 1 2
1 2 2
1 3 0
1 4 0
2 1 0
2 3 2
2 4 0
2 5 0

select transformed_tab1.id, transformed_tab1.level, count(tab2.id)
from
(select tab1.id, tab1.level, tm, lead(tm) over (partition by id order by tm) as next_tm
from
(
select 1 as id, 1 as level, '2013-06-07 07:03'::timestamp as tm union
select 1 as id, 2 as level, '2013-06-09 07:05 '::timestamp as tm union
select 1 as id, 3 as level, '2013-06-12 12:02'::timestamp as tm union
select 1 as id, 4 as level, '2013-06-17 05:01'::timestamp as tm union
select 2 as id, 1 as level, '2013-06-18 08:38'::timestamp as tm union
select 2 as id, 3 as level, '2013-06-20 09:38'::timestamp as tm union
select 2 as id, 4 as level, '2013-06-23 10:38'::timestamp as tm union
select 2 as id, 5 as level, '2013-06-28 01:38'::timestamp as tm) tab1
) transformed_tab1
left join
(select 1 as id, '2013-06-07 11:51'::timestamp as tm union
select 1 as id, '2013-06-07 14:15'::timestamp as tm union
select 1 as id, '2013-06-09 16:39'::timestamp as tm union
select 1 as id, '2013-06-09 19:03'::timestamp as tm union
select 2 as id, '2013-06-20 11:02'::timestamp as tm union
select 2 as id, '2013-06-20 15:50'::timestamp as tm) tab2
on transformed_tab1.id=tab2.id and tab2.tm between transformed_tab1.tm and transformed_tab1.next_tm
group by transformed_tab1.id, transformed_tab1.level
order by transformed_tab1.id, transformed_tab1.level
;

SQL Fiddle
select t1.id, level, count(t2.id)
from
(
select id, level,
tsrange(
"time",
lead("time", 1, 'infinity') over(
partition by id order by level
),
'[)'
) as time_range
from t1
) t1
left join
t2 on t1.id = t2.id and t1.time_range @> t2."time"
group by t1.id, level
order by t1.id, level
The solution starts by creating a range of timestamps using the lead window function. Notice the '[)' parameter to the tsrange constructor. It means the lower bound is included and the upper bound is excluded.
Then it joins the two tables with the @> range operator, which tests whether the range contains the element.
The left join from t1 is necessary to keep the rows with zero counts.
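The half-open range logic described above can be sketched in Python with a binary search over sorted timestamps. This is an illustrative sketch on the question's sample data, reading the dates as month/day/year; count_between_levels is a made-up name, and datetime.max stands in for the 'infinity' upper bound.

```python
from bisect import bisect_left
from collections import defaultdict
from datetime import datetime


def count_between_levels(table1, table2):
    """For each (id, level), count table2 times in [level time, next level time)."""
    events = defaultdict(list)
    for id_, ts in table2:
        events[id_].append(ts)
    for times in events.values():
        times.sort()

    levels = defaultdict(list)
    for id_, level, ts in table1:
        levels[id_].append((level, ts))

    result = []
    for id_ in sorted(levels):
        ordered = sorted(levels[id_])  # ordered by level, like the lead() call
        times = events.get(id_, [])
        for i, (level, start) in enumerate(ordered):
            # The last level has no successor, so its range is open-ended.
            end = ordered[i + 1][1] if i + 1 < len(ordered) else datetime.max
            # Elements below `end` minus elements below `start` = elements in [start, end).
            count = bisect_left(times, end) - bisect_left(times, start)
            result.append((id_, level, count))
    return result


table1 = [
    (1, 1, datetime(2013, 6, 7, 7, 3)),
    (1, 2, datetime(2013, 6, 9, 7, 5)),
    (1, 3, datetime(2013, 6, 12, 12, 2)),
    (1, 4, datetime(2013, 6, 17, 5, 1)),
    (2, 1, datetime(2013, 6, 18, 8, 38)),
    (2, 3, datetime(2013, 6, 20, 9, 38)),
    (2, 4, datetime(2013, 6, 23, 10, 38)),
    (2, 5, datetime(2013, 6, 28, 1, 38)),
]
table2 = [
    (1, datetime(2013, 6, 7, 11, 51)),
    (1, datetime(2013, 6, 7, 14, 15)),
    (1, datetime(2013, 6, 9, 16, 39)),
    (1, datetime(2013, 6, 9, 19, 3)),
    (2, datetime(2013, 6, 20, 11, 2)),
    (2, datetime(2013, 6, 20, 15, 50)),
]

counts = count_between_levels(table1, table2)
for row in counts:
    print(row)
```

Levels without any matching event still appear with a count of 0, which is what the left join achieves in the SQL version.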
