Trigger an event when the event count is the maximum of the last 12-month window - complex-event-processing

I have a requirement: trigger an event when the idle well count is the maximum of the last 12-month window.
For example:
Well_date Count
1986-01-01 00:00:00 17
1986-02-01 00:00:00 16
1986-03-01 00:00:00 23
1986-04-01 00:00:00 33
1986-05-01 00:00:00 31
1986-06-01 00:00:00 42
1986-07-01 00:00:00 43
1986-08-01 00:00:00 43
1986-09-01 00:00:00 41
1986-10-01 00:00:00 42
1986-11-01 00:00:00 46
1986-12-01 00:00:00 52
Output:
1986-12-01 00:00:00 52
If the event count is below the maximum of the preceding 11 months, the event should be ignored.
Thanks in advance

This one will give you a stream of last max well counts, i.e. the max excluding the current event:
insert into LastMaxStream select rstream max(well_count) as lastMax from SomeEvent
The LastMaxStream can be used to compare:
@name('out') select * from SomeEvent(well_count > (select lastMax from LastMaxStream.std:lastevent()));
There may be other solutions, but this is the one that comes to mind. To restrict the comparison to a time period, add the period to the group-by clause, or declare a context that, for example, starts when 1986 starts and ends when 1986 ends.
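Outside of EPL, the windowing logic from the question can be sketched in plain Python. This is only an illustration under some assumptions: events arrive in order, one per month, an event fires only once a full 12-event window exists, and a tie with an earlier maximum still counts. The function name `twelve_month_highs` is made up for this example.

```python
from collections import deque

def twelve_month_highs(events, window=12):
    """Yield (date, count) when count is the maximum of the last `window` events."""
    recent = deque(maxlen=window)  # keeps only the most recent `window` counts
    for date, count in events:
        recent.append(count)
        # Fire only when the window is full and the newest count is its maximum.
        if len(recent) == window and count == max(recent):
            yield date, count

events = [
    ("1986-01-01", 17), ("1986-02-01", 16), ("1986-03-01", 23),
    ("1986-04-01", 33), ("1986-05-01", 31), ("1986-06-01", 42),
    ("1986-07-01", 43), ("1986-08-01", 43), ("1986-09-01", 41),
    ("1986-10-01", 42), ("1986-11-01", 46), ("1986-12-01", 52),
]
highs = list(twelve_month_highs(events))
print(highs)  # [('1986-12-01', 52)]
```

With the sample data only the December event is emitted, which matches the expected output in the question.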

Related

How to get the minimum value of unique items based upon a datediff function in t-sql?

I am trying to figure out the minimum time elapsed between two columns, grouped by values in a third column
ID  Start Time           End Time
1   2021-08-22 00:00:00  2021-08-24 00:00:00
1   2021-08-21 00:00:00  2021-08-24 00:00:00
2   2021-08-22 00:00:00  2021-08-24 00:00:00
2   2021-08-21 00:00:00  2021-08-24 00:00:00
3   2021-08-22 00:00:00  2021-08-24 00:00:00
3   2021-08-21 00:00:00  2021-08-24 00:00:00
From this table, I would like to get the results:
ID  Elapsed Time
1   48 hours
2   48 hours
3   48 hours
Currently I have this SQL query:
SELECT ID, datediff(hour, Start Time, End Time) as diff
FROM t
WHERE
MIN(diff)
GROUP BY ID
Jacob, this should give you the results you are looking for:
SELECT
ID,
MIN(DATEDIFF (HOUR, StartTime, EndTime)) AS diff
FROM
t
GROUP BY
ID;
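For comparison, the same group-wise minimum can be sketched in Python on the sample rows from the question. The function name `min_elapsed_hours` is hypothetical, and one difference is worth noting: this sketch measures elapsed whole hours, whereas T-SQL's DATEDIFF(HOUR, ...) counts hour boundaries crossed, so the two can differ for timestamps that are not on the hour.

```python
from datetime import datetime

rows = [
    (1, "2021-08-22 00:00:00", "2021-08-24 00:00:00"),
    (1, "2021-08-21 00:00:00", "2021-08-24 00:00:00"),
    (2, "2021-08-22 00:00:00", "2021-08-24 00:00:00"),
    (2, "2021-08-21 00:00:00", "2021-08-24 00:00:00"),
    (3, "2021-08-22 00:00:00", "2021-08-24 00:00:00"),
    (3, "2021-08-21 00:00:00", "2021-08-24 00:00:00"),
]

def min_elapsed_hours(rows):
    """Return {id: smallest number of whole hours between start and end}."""
    fmt = "%Y-%m-%d %H:%M:%S"
    best = {}
    for row_id, start, end in rows:
        delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
        hours = int(delta.total_seconds() // 3600)
        # Keep the smallest difference seen so far for this ID.
        best[row_id] = min(hours, best.get(row_id, hours))
    return best

result = min_elapsed_hours(rows)
print(result)  # {1: 48, 2: 48, 3: 48}
```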

Maximum count of overlapping intervals in PostgreSQL

Suppose there is a table structured as follows:
id start end
--------------------
01 00:18 00:23
02 00:22 00:31
03 00:23 00:48
04 00:23 00:39
05 00:24 00:25
06 00:24 00:31
07 00:24 00:38
08 00:25 00:37
09 00:26 00:42
10 00:31 00:34
11 00:33 00:38
The objective is to compute the overall maximum number of rows having been active (i.e. between start and end) at any given moment in time. This would be relatively straightforward using a procedural algorithm, but I'm not sure how to do this in SQL.
According to the above example, this maximum value would be 8 and would correspond to the 00:31 timestamp where active rows were 2, 3, 4, 6, 7, 8, 9, 10 (as shown in the schema below).
Obtaining the timestamp(s) and the active rows corresponding to the maximum value is not important, all is needed is the actual value itself.
At first I was thinking of using generate_series() to iterate over every minute, get the count of active intervals for each, and then take the max of these counts.
You can improve on your idea and iterate only over the "start" values from the table, because the maximum number of active rows is always reached at one of the "start" points.
select id, start,
(select count(1) from tbl t where tbl.start between t.start and t."end")
from tbl;
Here are the results:
id start count
-----------------
1 00:18:00 1
2 00:22:00 2
3 00:23:00 4
4 00:23:00 4
5 00:24:00 6
6 00:24:00 6
7 00:24:00 6
8 00:25:00 7
9 00:26:00 7
10 00:31:00 8
11 00:33:00 7
So this query gives you the maximum number of rows that were active at the same time:
select
max((select count(1) from tbl t where tbl.start between t.start and t."end"))
from tbl;
max
-----
8
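The reasoning behind checking only the start points can also be sketched procedurally. The Python below is a minimal illustration on the sample table, assuming inclusive interval ends (as with BETWEEN) and a non-empty list; the helper names `to_minutes` and `max_overlap` are made up for this example.

```python
intervals = [
    ("00:18", "00:23"), ("00:22", "00:31"), ("00:23", "00:48"),
    ("00:23", "00:39"), ("00:24", "00:25"), ("00:24", "00:31"),
    ("00:24", "00:38"), ("00:25", "00:37"), ("00:26", "00:42"),
    ("00:31", "00:34"), ("00:33", "00:38"),
]

def to_minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def max_overlap(intervals):
    """Count the active intervals at every start point and return the largest count."""
    spans = [(to_minutes(start), to_minutes(end)) for start, end in intervals]
    return max(
        sum(1 for start, end in spans if start <= point <= end)  # both ends inclusive
        for point, _ in spans
    )

print(max_overlap(intervals))  # 8
```

Like the correlated subquery, this compares every start point against every interval, so it is quadratic in the number of rows.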

PostgreSQL - How can I SUM until a certain hour of the day?

I'm trying to create a metric for a PostgreSQL integrated dashboard which would show today's "Total Payment Value" (TPV) of a certain product, as well as yesterday's TPV of the same product, up until the same moment as today, so if I'm accessing the dashboard at 5 pm, it will show what it was yesterday until 5 pm and today's TPV.
edit: My question wasn't very clear so I'm adding a few more lines and editing the query, which had a mistake.
I tried this:
select
sum(case when table.product in (13,14,15,16) then amount else 0 end) as "TPV"
,date_trunc('day', table.date) as "Day"
from table
where
date > current_date - 1
group by date_trunc('day', table.date)
order by 2,1
I only want to sum the amount when product = 13, 14, 15 or 16
An example of the product, date and amount would be like this:
product amount date
8 4750 19/03/2019 00:21
14 7840 12/04/2019 22:40
14 15000 22/03/2019 18:27
14 11715 19/03/2019 00:12
14 1054 22/03/2019 18:22
14 18491 17/03/2019 14:28
14 12253 17/03/2019 14:30
14 27600 17/03/2019 14:32
14 3936 17/03/2019 14:28
14 19007 19/03/2019 00:14
8 9400 19/03/2019 00:21
8 4750 19/03/2019 00:21
8 25000 19/03/2019 00:17
14 10346 22/03/2019 18:23
I would like to have a metric that always calculates the sum of the product value today up until the current moment - when the "product" corresponds to values 13, 14, 15 or 16 - as well as the same metric for yesterday, e.g., it's 1 PM now, I want today's TPV until 1 PM and yesterday's TPV until 1 PM as well!
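To make the requirement concrete, here is a rough Python sketch of the intended metric, not a dashboard query. It assumes the timestamps are in dd/mm/yyyy HH:MM format as in the sample, that "until now" includes the current minute, and that the current time is passed in explicitly; `tpv_until_now` is a hypothetical name.

```python
from datetime import datetime, timedelta

PRODUCTS = {13, 14, 15, 16}  # only these products count towards the TPV

def tpv_until_now(rows, now):
    """Return (today's TPV so far, yesterday's TPV up to the same time of day)."""
    today_start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    yesterday_start = today_start - timedelta(days=1)
    yesterday_cutoff = now - timedelta(days=1)
    today = yesterday = 0
    for product, amount, stamp in rows:
        if product not in PRODUCTS:
            continue
        when = datetime.strptime(stamp, "%d/%m/%Y %H:%M")
        if today_start <= when <= now:
            today += amount
        elif yesterday_start <= when <= yesterday_cutoff:
            yesterday += amount
    return today, yesterday

# A few rows taken from the sample data in the question.
rows = [
    (8, 4750, "19/03/2019 00:21"), (14, 11715, "19/03/2019 00:12"),
    (14, 19007, "19/03/2019 00:14"), (8, 25000, "19/03/2019 00:17"),
    (14, 18491, "17/03/2019 14:28"), (14, 12253, "17/03/2019 14:30"),
]
# Pretend it is 00:13 on 20/03/2019: only the 00:12 payment from the day before counts.
print(tpv_until_now(rows, datetime(2019, 3, 20, 0, 13)))  # (0, 11715)
```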

Day and night average per day in R

I have a data set from April to October with data recorded every 5 minutes. I want to get the average temperature and RH for day and night for every day, considering "day" to be from 7:30 to 18:30 and "night" to be the remaining hours.
The table looks like this:
Date Time Temp RH
18/04/2018 00:00:00 21.9 73
18/04/2018 00:05:00 21.9 73
18/04/2018 00:10:00 21.8 73
18/04/2018 00:15:00 21.6 73
18/04/2018 00:20:00 21.6 72
18/04/2018 00:25:00 21.5 72
18/04/2018 00:30:00 21.4 74
And so on until October. I have tried code from similar questions but, for one reason or another, I always get an error. In one example I saw that there is a column with "AM/PM" values to make this simpler, but then I'd have to create this new column for all the rows. I also tried "hourly.apply", but it seems that the function doesn't exist.
What I want to obtain is this:
Date Time Temp RH
18/04/2018 day 25.8 80
18/04/2018 night 17.3 43
19/04/2018 day 24.2 73
19/04/2018 night 15.1 42
I typed the code:
> n=287
> T24_GH111 <- aggregate(GH111[,3],list(rep(1:nrow(GH111%%n+1), each=n, leng=nrow(GH111))),mean)[-1];
But this will give me the average of 24 hours.
Thanks in advance!
Let's start with a simple example and create a data frame with datetimes.
library(lubridate) # for datetime manipulation
# Creating simple example
Datetime <- c(as.POSIXct("2018-04-17 22:00", tz="Europe/Berlin"),
as.POSIXct("2018-04-18 01:00", tz="Europe/Berlin"),
as.POSIXct("2018-04-18 10:00", tz="Europe/Berlin"),
as.POSIXct("2018-04-18 13:00", tz="Europe/Berlin"),
as.POSIXct("2018-04-18 22:00", tz="Europe/Berlin"),
as.POSIXct("2018-04-19 01:00", tz="Europe/Berlin")
)
x <- c(1,3,10,20,2,5)
df <- data.frame(Datetime,x)
Now, we are using local_time() from the lubridate package to define a new day/night variable.
# Getting local time in hours
df$time <- local_time(df$Datetime, units ="hours")
# Setting day night parameter
t1 <- 7.5 # 07:30
t2 <- 18.5 # 18:30
df$dayNight <- ""
idx <- t1 <= df$time & df$time < t2 # TRUE between 07:30 and 18:30
df$dayNight[idx] <- "day"
df$dayNight[!idx] <- "night"
To aggregate by day, we need to shift the date back by one day for all datetimes before 07:30, so that the early-morning hours count towards the night that began the previous evening. Fortunately, we have already set up the local time, so let's use it to build a dummyDate variable. (This will be the resulting Date.)
nightCondition <- df$time < t1 # readings before 07:30 belong to the previous date
# Using dummyDate for aggregate for dayNight values per day
df$dummyDate <- df$Datetime
df$dummyDate[nightCondition] <- df$Datetime[nightCondition] - days(1)
df$dummyDate <- floor_date(df$dummyDate, unit = "day") # flooring date for aggregation
df
Datetime x time dayNight dummyDate
1 2018-04-17 22:00:00 1 22 hours night 2018-04-17
2 2018-04-18 01:00:00 3 1 hours night 2018-04-17
3 2018-04-18 10:00:00 10 10 hours day 2018-04-18
4 2018-04-18 13:00:00 20 13 hours day 2018-04-18
5 2018-04-18 22:00:00 2 22 hours night 2018-04-18
6 2018-04-19 01:00:00 5 1 hours night 2018-04-18
Now we have set up all the variables needed to use the aggregate function to calculate the mean of x by dayNight and dummyDate.
# Aggregating x value per dummyDate and daynight variables
dfAgg <- aggregate(df[,2], list(Date = df$dummyDate, Time = df$dayNight), mean)
dfAgg
Date Time x
1 2018-04-18 day 15.0
2 2018-04-17 night 2.0
3 2018-04-18 night 3.5
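As a cross-check of the grouping logic, here is a small Python sketch on the same toy data. It assumes "day" means 07:30 (inclusive) to 18:30 (exclusive) and that readings before 07:30 are assigned to the night of the previous date; `day_night_means` is a name invented for this example.

```python
from collections import defaultdict
from datetime import datetime, time, timedelta
from statistics import mean

DAY_START = time(7, 30)
DAY_END = time(18, 30)

def day_night_means(readings):
    """Average the values per (date, 'day' or 'night') group."""
    groups = defaultdict(list)
    for stamp, value in readings:
        clock = stamp.time()
        period = "day" if DAY_START <= clock < DAY_END else "night"
        # Readings before 07:30 belong to the night that started the previous evening.
        date = stamp.date() - timedelta(days=1) if clock < DAY_START else stamp.date()
        groups[(date.isoformat(), period)].append(value)
    return {key: mean(values) for key, values in sorted(groups.items())}

readings = [
    (datetime(2018, 4, 17, 22, 0), 1.0), (datetime(2018, 4, 18, 1, 0), 3.0),
    (datetime(2018, 4, 18, 10, 0), 10.0), (datetime(2018, 4, 18, 13, 0), 20.0),
    (datetime(2018, 4, 18, 22, 0), 2.0), (datetime(2018, 4, 19, 1, 0), 5.0),
]
averages = day_night_means(readings)
print(averages)
# {('2018-04-17', 'night'): 2.0, ('2018-04-18', 'day'): 15.0, ('2018-04-18', 'night'): 3.5}
```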

Calculating Running Avg for YTD Sum with constant denominator for a year

I have the following table from SQL
ID Date Score
-----+-------------+----------
10 2015-01-10 5
20 2015-01-10 5
10 2015-02-10 15
40 2015-02-10 25
30 2015-02-10 5
10 2015-03-10 15
10 2014-01-10 25
20 2014-02-10 35
50 2014-03-10 45
In Tableau I want a line graph to display
(YTD Sum of Score)/Total number of IDs for a year.
For Jan 2015 - 10/4=2.5
For Feb 2015 - 55/4=13.75
For Feb 2014 - 60/3=20
The denominator should remain constant throughout the year and not change monthwise.
Looks like you can achieve your desired result with two calculated fields. First, make a [Year] field with:
year([Date])
Then make a second calculated field as follows:
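running_sum(sum([Score]))/min({fixed [Year] : countd([ID])})
This computes the running (year-to-date) sum of the score and divides it by the number of distinct IDs for the given year. The denominator uses a Level of Detail (LOD) calculation, so it stays constant throughout the year. It is wrapped in min() rather than sum() because the fixed value repeats on every row, and summing it would multiply the count by the number of rows. The running_sum() table calculation should be set to restart every year.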
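The arithmetic the chart is meant to show can be checked with a short Python sketch on the sample table. This only illustrates the calculation (year-to-date sum divided by the distinct ID count of the whole year), not how Tableau evaluates it; `ytd_average` is a hypothetical name.

```python
from collections import defaultdict

rows = [
    (10, "2015-01-10", 5), (20, "2015-01-10", 5), (10, "2015-02-10", 15),
    (40, "2015-02-10", 25), (30, "2015-02-10", 5), (10, "2015-03-10", 15),
    (10, "2014-01-10", 25), (20, "2014-02-10", 35), (50, "2014-03-10", 45),
]

def ytd_average(rows):
    """Return {YYYY-MM: year-to-date score sum / distinct IDs in that year}."""
    ids_per_year = defaultdict(set)
    monthly = defaultdict(int)
    for row_id, date, score in rows:
        ids_per_year[date[:4]].add(row_id)  # denominator: distinct IDs per year
        monthly[date[:7]] += score          # numerator input: score per month
    result, running = {}, defaultdict(int)
    for month in sorted(monthly):
        year = month[:4]
        running[year] += monthly[month]     # running total restarts for each year
        result[month] = running[year] / len(ids_per_year[year])
    return result

ytd = ytd_average(rows)
print(ytd["2015-01"], ytd["2015-02"], ytd["2014-02"])  # 2.5 13.75 20.0
```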