DB2 - get the average time between set of dates - db2

I have a list of events and each one has a startDate and endDate. I need to know the average time taken for each event.
I need something like this:
select sum ( (timestamp(startDate) - timestamp(endDate)) for each event )
/ (count of events)

It only makes mathematical sense to take the AVG() of a numeric value, not datetime values or durations. Since you want your answer to be in minutes precision, you want to get your difference in minutes, then convert back to days, hours, minutes. (There are 24*60=1440 minutes in a standard day.)
with q as
(select avg(
timestampdiff(4, char(endDate - startDate) )
) as avgmns
from yourChosenData
)
select int(avgmns / 1440) as avg_days,
int( mod(avgmns,1440) / 60) as avg_hours,
mod(avgmns, 60) as avg_mins
from q
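The split into days, hours and minutes in the final select is plain integer arithmetic. As a minimal sketch outside the database (Python is used only for illustration, and the function name split_minutes is hypothetical), the same decomposition could look like this:

```python
def split_minutes(total_minutes: int) -> tuple[int, int, int]:
    """Split a whole number of minutes into (days, hours, minutes)."""
    days, remainder = divmod(total_minutes, 1440)  # 1440 minutes in a standard day
    hours, minutes = divmod(remainder, 60)         # 60 minutes in an hour
    return days, hours, minutes


# 3125 minutes = 2 days (2880) + 4 hours (240) + 5 minutes
print(split_minutes(3125))
```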
As mentioned below, timestampdiff() is an estimate. To avoid this issue, one could use a more accurate calculation.
with q as
(select avg(
( days(endDate) - days(startDate) ) * 1440
+ ( midnight_seconds(endDate) - midnight_seconds(startDate) ) / 60
) as avgmns
from yourChosenData
)
select int(avgmns / 1440) as avg_days,
int( mod(avgmns,1440) / 60) as avg_hours,
mod(avgmns, 60) as avg_mins
from q
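To see why the days() / midnight_seconds() formula gives an exact minute count, here is a small sketch of the same arithmetic in Python. The helper names and the two sample events are made up for illustration; they are not part of DB2.

```python
from datetime import datetime


def midnight_seconds(ts: datetime) -> int:
    """Seconds elapsed since midnight, like DB2's midnight_seconds()."""
    return ts.hour * 3600 + ts.minute * 60 + ts.second


def minutes_between(start: datetime, end: datetime) -> float:
    """Whole-day difference in minutes plus the time-of-day difference in minutes."""
    day_diff = end.toordinal() - start.toordinal()  # plays the role of days(end) - days(start)
    return day_diff * 1440 + (midnight_seconds(end) - midnight_seconds(start)) / 60


# Hypothetical sample events: (startDate, endDate)
events = [
    (datetime(2024, 1, 1, 23, 0), datetime(2024, 1, 2, 1, 30)),  # crosses midnight
    (datetime(2024, 3, 1, 8, 0), datetime(2024, 3, 1, 8, 45)),
]
avg_minutes = sum(minutes_between(s, e) for s, e in events) / len(events)
print(avg_minutes)
```

The first event shows the negative time-of-day difference being compensated by the day difference, which is the case an estimate-based function can get wrong.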
In order to address the DST issue, if needed, one might choose either of:
include a UTC offset column corresponding to each timestamp field. This would also be useful if timestamps were being recorded in more than one timezone. The difference in offsets could then be fed into the calculation along with the timestamps.
provide a deterministic UDF which could return a UTC or DST adjustment offset for a given timestamp. If multiple timezones are involved, then the zone should also be a parameter to the function. Depending on the geographic areas involved, the logic may also need to consider areas which observe alternative DST rules.
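As an illustration of the first option, here is a hedged sketch of how a stored UTC offset per timestamp could be folded into the difference. It is written in Python rather than DB2 SQL, and the function name and the offset-in-minutes convention are assumptions made for this example.

```python
from datetime import datetime, timedelta


def elapsed_minutes(start_local: datetime, start_offset_min: int,
                    end_local: datetime, end_offset_min: int) -> float:
    """Convert both local timestamps to UTC using their stored offsets, then subtract."""
    start_utc = start_local - timedelta(minutes=start_offset_min)
    end_utc = end_local - timedelta(minutes=end_offset_min)
    return (end_utc - start_utc).total_seconds() / 60


# US Eastern time, 2024-03-10: clocks jump from 02:00 to 03:00.
# 01:30 is recorded with offset -300 (UTC-5), 03:30 with offset -240 (UTC-4).
start, end = datetime(2024, 3, 10, 1, 30), datetime(2024, 3, 10, 3, 30)
naive = (end - start).total_seconds() / 60         # ignores the offsets
adjusted = elapsed_minutes(start, -300, end, -240)  # uses the offsets
print(naive, adjusted)
```

The naive difference reports two hours, while only one hour actually elapsed.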

You have to be careful of the denominator to prevent a 0 division: SQL0802 - Data Conversion or Data Mapping Error
Depending on the precision of the results, you will need to convert the date. Let's suppose you need seconds (2)
select
sum ( timestampdiff(2, char(timestamp(endDate) - timestamp(startDate))) )
/
nullif(count(*), 0)
from yourTable
http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.sql.ref.doc/doc/r0000861.html

Related

How to bin timestamp data into buckets of n minutes in postgres

I have the following query which works, binning timestamped "observations" into buckets whose boundaries are defined by the bins table:
SELECT
count(id),
width_bucket(
time :: TIMESTAMP,
(SELECT ARRAY(SELECT start_time
FROM bins
WHERE owner_id = 'some id'
ORDER BY start_time ASC) :: TIMESTAMP[])
) bucket
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket
ORDER BY bucket;
I would like to modify this to allow for querying arbitrary n-minute bins starting from a specified timestamp, rather than having to pull from an actual "bins" table.
That is, given a start time, a "bin width" in minutes, and a number of bins, is there a way I can generate the array of timestamps to pass into the width_bucket function?
Alternatively, is there a different/simpler approach to get the same results?
Use the function generate_series(start, stop, step interval), e.g.
select array(
select generate_series(
timestamp '2018-04-15 00:00',
'2018-04-15 01:00',
'30 minutes'))
array
---------------------------------------------------------------------
{"2018-04-15 00:00:00","2018-04-15 00:30:00","2018-04-15 01:00:00"}
(1 row)
Example in Db<>fiddle.
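To make the mechanics concrete, here is a rough Python sketch of what the generated array and the lookup do together. The function names are hypothetical; width_bucket below only imitates the array form of the PostgreSQL function, where a value below the first lower bound lands in bucket 0.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def bin_boundaries(start: datetime, width_minutes: int, n_bins: int) -> list[datetime]:
    """Boundaries for n_bins bins: start, start + width, ..., start + n_bins * width."""
    step = timedelta(minutes=width_minutes)
    return [start + i * step for i in range(n_bins + 1)]


def width_bucket(value: datetime, thresholds: list[datetime]) -> int:
    """Number of lower bounds that are <= value (0 if value precedes all of them)."""
    return bisect_right(thresholds, value)


bounds = bin_boundaries(datetime(2018, 4, 15, 0, 0), 30, 2)
print(bounds)
print(width_bucket(datetime(2018, 4, 15, 0, 10), bounds))
```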
The above answers seem to do what you want, but as of PostgreSQL 14, there is now a function date_bin just for binning timestamps.
Quoting the documentation:
date_bin(stride,source,origin)
source is a value expression of type timestamp or timestamp with time zone. (Values of type date are cast automatically to timestamp.) stride is a value expression of type interval. The return value is likewise of type timestamp or timestamp with time zone, and it marks the beginning of the bin into which the source is placed.
Examples:
SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01');
Result: 2020-02-11 15:30:00
SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01 00:02:30');
Result: 2020-02-11 15:32:30
In the case of full units (1 minute, 1 hour, etc.), it gives the same result as the analogous date_trunc call, but the difference is that date_bin can truncate to an arbitrary interval.
The stride interval must be greater than zero and cannot contain units of month or larger.
I would like to call special attention to the line
The return value [...] marks the beginning of the bin into which the source is placed.
This means that input timestamps will always be binned by "rounding down", rather than binning to whichever bin is closest. E.g. if you do:
SELECT date_bin('1 hour', '2021-10-13 00:59:59', '2021-10-13 00:00:00');
Then the result will be 2021-10-13 00:00:00 (rounded down by 59 minutes and 59 seconds), NOT 2021-10-13 01:00:00 (which is only one second away from the supplied timestamp). So the date_bin function does something slightly different from exactly what you asked for, but I figure this is good to post for anyone coming here in the future.
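The rounding-down behaviour can be reproduced with a few lines of arithmetic. The following Python sketch is only an imitation of date_bin for sources at or after the origin, not PostgreSQL's implementation, and it re-checks the three examples quoted above.

```python
from datetime import datetime, timedelta


def date_bin(stride: timedelta, source: datetime, origin: datetime) -> datetime:
    """Start of the stride-wide bin (counted from origin) that contains source."""
    whole_strides = (source - origin) // stride  # floor division of two timedeltas gives an int
    return origin + whole_strides * stride


quarter_hour = timedelta(minutes=15)
ts = datetime(2020, 2, 11, 15, 44, 17)
print(date_bin(quarter_hour, ts, datetime(2001, 1, 1)))
print(date_bin(quarter_hour, ts, datetime(2001, 1, 1, 0, 2, 30)))
print(date_bin(timedelta(hours=1), datetime(2021, 10, 13, 0, 59, 59), datetime(2021, 10, 13)))
```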
A different approach without a series:
Divide the difference of time and start by the width of the bin (5 minutes in the example) and add 1 because the first bucket of width_bucket(...) is 1 not 0.
floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60) ) + 1 as bucket
Getting the start of the bin is also possible
to_timestamp(floor(extract(epoch from a.time) / (5 * 60)) * (5 * 60)) as bin_start
Putting this all together:
SELECT
count(id),
floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60) ) + 1 as bucket,
to_timestamp(floor(extract(epoch from time) / (5 * 60)) * (5 * 60)) as bin_start
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket, bin_start
ORDER BY bucket;
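For readers who want to check the arithmetic by hand, the two expressions can be sketched in Python as follows. The constants mirror the example (5-minute bins starting at 2019-06-04 00:00); naive datetimes are assumed and the function names are made up.

```python
from datetime import datetime, timedelta

WIDTH = timedelta(minutes=5)
START = datetime(2019, 6, 4, 0, 0)
EPOCH = datetime(1970, 1, 1)


def bucket(t: datetime) -> int:
    """1-based bucket number counted from START, like width_bucket's first bucket."""
    return (t - START) // WIDTH + 1


def bin_start(t: datetime) -> datetime:
    """Start of the bin containing t, aligned to the epoch as in the to_timestamp expression."""
    return EPOCH + ((t - EPOCH) // WIDTH) * WIDTH


t = datetime(2019, 6, 4, 0, 7, 30)
print(bucket(t), bin_start(t))
```

Note that bin_start is aligned to the epoch rather than to START; the two agree here only because START falls on a 5-minute boundary.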

PostgreSQL: Date Difference with fractions

SELECT cu.user_id, cu.last_activity, cu.updated_time,
DATE_PART('day', cu.last_activity - cu.updated_time), to_char(cu.last_activity - cu.updated_time, 'DD.HH24')
FROM stats.core_users cu
WHERE cu.user_id = '117132014' or cu.user_id = '117132012';
Get the result like:
117132014 2017-12-11 10:34:51.349905 2017-12-09 12:00:38.503518 1 01.22
117132012 2017-12-11 05:18:20.312283 2017-12-08 15:46:51.914085 2 02.13
Is it feasible to get the day difference with fractions like 1.91 days in the first case, instead of 1 day and 22 hours, to be more precise and easier to fit in a machine learning model?
date_part() does what its name says: it returns one part of several elements from a date, interval or timestamp. In your case it's one part of an interval (because timestamp - timestamp returns an interval).
If you want the result as a fraction, you need to extract the seconds of the interval and then divide that by 86400 (which is the number of seconds in a day)
extract(epoch from cu.last_activity - cu.updated_time) / 86400
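As a quick sanity check of that formula, here is the same division done in Python on the two sample rows from the question (the variable names are mine):

```python
from datetime import datetime

SECONDS_PER_DAY = 86400

rows = [
    (datetime(2017, 12, 11, 10, 34, 51, 349905), datetime(2017, 12, 9, 12, 0, 38, 503518)),
    (datetime(2017, 12, 11, 5, 18, 20, 312283), datetime(2017, 12, 8, 15, 46, 51, 914085)),
]

# total_seconds() plays the role of extract(epoch from interval)
fractional_days = [
    (last_activity - updated_time).total_seconds() / SECONDS_PER_DAY
    for last_activity, updated_time in rows
]
print([round(d, 2) for d in fractional_days])
```

The first row comes out at roughly 1.94 days (1 day and about 22.6 hours), and the second at roughly 2.56 days.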

Is it possible to find data from MySQL by month using JPA and java.time.LocalDate date format?

I am creating an application, and for that I need to find data by month using JPA and java.time.LocalDate. So, is it possible to retrieve data by month from MySQL?
Thanks in advance for help.
First find the start and end dates of the month, then use the between method of JPA to find the data for the current month.
LocalDate start = LocalDate.now(ZoneOffset.UTC).withDayOfMonth(1); // first day of the current month (UTC, as before)
LocalDate end = start.plusMonths(1).minusDays(1); // last day of the current month
In Repository
List<Object> findByCreatedateGreaterThanEqualAndCreatedateLessThanEqual(LocalDate start,LocalDate end); // inclusive bounds, so the first and last day of the month are matched
It's better to use the between keyword; it makes things a lot shorter.
List<Object> findByCreatedateBetween(LocalDate start,LocalDate end);
Also if you want to use the LocalDate or LocalDateTime objects with Spring Data you should use the converter class Jsr310JpaConverters or else the documents will be stored as Blobs instead of Dates (which is bad for portability of the database). Please see this tutorial on how to implement the Converter.
https://www.mkyong.com/spring-boot/spring-boot-spring-data-jpa-java-8-date-and-time-jsr310/
tl;dr
YearMonth.now( ZoneId.of( "Pacific/Auckland" ) ) // Get current month for particular time zone.
.atDayOfMonth( 1 ) // Get the first date of that month.
.plusMonths( 1 ) // Get first of next month for Half-Open query.
Details
Assuming your column in MySQL is of DATE type…
LocalDate
The LocalDate class represents a date-only value without time-of-day and without time zone.
Time zone
A time zone is crucial in determining a date. For any given moment, the date varies around the globe by zone. For example, a few minutes after midnight in Paris France is a new day while still “yesterday” in Montréal Québec.
Specify a proper time zone name in the format of continent/region, such as America/Montreal, Africa/Casablanca, or Pacific/Auckland. Never use the 3-4 letter abbreviation such as EST or IST as they are not true time zones, not standardized, and not even unique(!).
ZoneId z = ZoneId.of( "America/Montreal" );
LocalDate today = LocalDate.now( z );
YearMonth
The YearMonth class represents an entire month. Getting the current month requires a time zone as discussed above. Around the beginning/ending of the month, the current moment could be “next” month in Auckland New Zealand while still “previous” month in Kolkata India.
YearMonth currentMonth = YearMonth.now( z ) ;
Get the first date of the month.
LocalDate start = currentMonth.atDayOfMonth( 1 ) ;
Half-Open
Generally best to use the Half-Open [) approach to defining a span of time, where the beginning is inclusive while the ending is exclusive. So defining a month means starting with the first date of the month and running up to, but not including, the first date of the following month.
LocalDate stop = start.plusMonths( 1 ) ;
Query
Do not use the BETWEEN command in SQL as it is fully closed [], both beginning and ending being inclusive. Half-Open uses >= & < logic.
SELECT when FROM tbl
WHERE when >= start
AND when < stop
;
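The Half-Open bounds are easy to verify outside Java as well. The following is a language-neutral illustration written in Python (the function name month_range is hypothetical): it returns the first day of the month and the first day of the following month, to be used with >= and < respectively.

```python
from datetime import date


def month_range(today: date) -> tuple[date, date]:
    """Return (start, stop) where start is inclusive and stop is exclusive."""
    start = today.replace(day=1)
    if start.month == 12:
        stop = date(start.year + 1, 1, 1)  # roll over into January of the next year
    else:
        stop = start.replace(month=start.month + 1)
    return start, stop


print(month_range(date(2024, 12, 15)))
```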
it's also useful
@Query("from PogWorkTime p where p.codePto = :codePto and month(p.dateApply) = :month and year(p.dateApply) = :year")
Iterable<PtoExceptWorkTime> findByCodePtoAndDateApply_MonthAndDateApply_Year(@Param("codePto") String codePto, @Param("month") int month, @Param("year") int year);

Quartz .Net - Meaning of BigInt DateTime

We've used SQL Server as our persistent data store for Quartz.NET. I'd like to write some queries looking at the time values, specifically Qrtz_Fired_Triggers.Fired_Time, Qrtz_Triggers.Next_fire_time, and Prev_fire_time.
For the life of me, I can't find anything that says what this data is - ticks, milliseconds, microseconds, nanoseconds. I've guessed at a couple of things, but they've all proven wrong.
The best answer would include the math to convert the big int into a datetime and perhaps even a link(s) to the pages/documentation that I should have found - explaining the meaning of the data in those fields.
If you have specific instructions on using Quartz .Net libraries to view this information, that would be appreciated, but, I really have 2 goals - to understand the meaning of the date/time data being stored and to keep this in T-SQL. If I get the one, I can figure out T-SQL or out.
On the SQL side, you can convert from Quartz.NET BIGINT times to a DateTime in UTC time with:
SELECT CAST(NEXT_FIRE_TIME/864000000000.0 - 693595.0 AS DATETIME) FROM QRTZ_TRIGGERS
Numbers Explanation
Values stored in the column are the number of ticks from .NET DateTime.MinValue in UTC time. There are 10000 ticks per millisecond.
The 864000000000.0 represents the number of ticks in a single day. You can verify this with
SELECT DATEDIFF(ms,'19000101','19000102')*10000.0
Now, if we take March 13, 2013 at midnight, .NET returns 634987296000000000 as the number of ticks.
var ticks = new DateTime(2013, 3, 13).Ticks;
To get a floating point number where whole numbers represent days and decimal numbers represent time, we take the ticks and divide by the number of ticks per day (giving us 734939.0 in our example)
SELECT 634987296000000000/(DATEDIFF(ms,'19000101','19000102')*10000.0)
If we put the date in SQL and convert to a float, we get a different number: 41344.0
SELECT CAST(CAST('March 13, 2013 0:00' AS DATETIME) AS FLOAT)
So, we need to generate a conversion factor for the .NET-to-SQL days. SQL minimum date is January 1, 1900 0:00, so the correction factor can be calculated by taking the number of ticks for that time (599266080000000000) and dividing by the ticks per day, giving us 693595.0
SELECT 599266080000000000/(DATEDIFF(ms,'19000101','19000102')*10000.0)
So, to calculate the DateTime of a Quartz.NET date:
take the value in the column
divide by the number of ticks per day
subtract out the correction factor
convert to a DATETIME
SELECT CAST([Column]/864000000000.0 - 693595.0 AS DATETIME)
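The numbers in this derivation can be cross-checked outside SQL Server. Here is a minimal Python sketch (the function name is mine) that converts .NET ticks to a datetime and recomputes the day counts used above:

```python
from datetime import datetime, timedelta

TICKS_PER_DAY = 864_000_000_000  # 100-nanosecond ticks in one day
TICKS_PER_MICROSECOND = 10


def ticks_to_datetime(ticks: int) -> datetime:
    """Interpret ticks as 100 ns intervals since 0001-01-01 00:00 (DateTime.MinValue)."""
    return datetime(1, 1, 1) + timedelta(microseconds=ticks // TICKS_PER_MICROSECOND)


print(ticks_to_datetime(634987296000000000))   # the March 13, 2013 example
print(634987296000000000 // TICKS_PER_DAY)     # days since 0001-01-01
print(599266080000000000 // TICKS_PER_DAY)     # correction factor for 1900-01-01
```

Subtracting the two day counts gives 41344, the same value SQL Server reports when that date is cast to FLOAT.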
The value stored in database is the DateTime.Ticks value. From MSDN:
A single tick represents one hundred nanoseconds or one ten-millionth of a second. There are 10,000 ticks in a millisecond.
The value of this property represents the number of 100-nanosecond intervals that have elapsed since 12:00:00 midnight, January 1, 0001, which represents DateTime.MinValue. It does not include the number of ticks that are attributable to leap seconds.
So, unless I missed something and am making this too complicated, I couldn't get the dateadd functions in Ms Sql Server 2008 to handle such large values and I kept getting overflow errors. The approach I took in Ms Sql Server was this:
a) find a date closer to now than 0001.01.01 & its ticks value
b) use a function to give me a DateTime value.
Notes:
* for my application - seconds was good enough.
* I've not tested this extensively, but so far, it has acted pretty well for me.
The function:
CREATE FUNCTION [dbo].[net_ticks_to_date_time]
(
@net_ticks BIGINT
)
RETURNS DATETIME
AS
BEGIN
DECLARE
@dt_2010_11_01 AS DATETIME = '2010-11-01'
, @bi_ticks_for_2010_11_01 AS BIGINT = 634241664000000000
, @bi_ticks_in_a_second AS BIGINT = 10000000
RETURN
(
DATEADD(SECOND , ( ( @net_ticks - @bi_ticks_for_2010_11_01 ) / @bi_ticks_in_a_second ) , @dt_2010_11_01)
);
END
GO
Here is how I came up with the # of ticks to some recent date:
DECLARE
@dt2_dot_net_min AS DATETIME2 = '01/01/0001'
, @dt2_first_date AS DATETIME2
, @dt2_next_date AS DATETIME2
, @bi_seconds_since_0101001 BIGINT = 0
SET @dt2_first_date = @dt2_dot_net_min;
SET @dt2_next_date = DATEADD ( DAY, 1, @dt2_first_date )
WHILE ( @dt2_first_date < '11/01/2010' )
BEGIN
SELECT @bi_seconds_since_0101001 = DATEDIFF(SECOND, @dt2_first_date, @dt2_next_date ) + @bi_seconds_since_0101001
PRINT 'seconds 01/01/0001 to ' + CONVERT ( VARCHAR, @dt2_next_date, 101) + ' = ' + CONVERT ( VARCHAR, CAST ( @bi_seconds_since_0101001 AS MONEY ), 1)
SET @dt2_first_date = DATEADD ( DAY, 1, @dt2_first_date );
SET @dt2_next_date = DATEADD ( DAY, 1, @dt2_first_date )
END

How can I compare two datetime fields but ignore the year?

I get to dust off my VBScript hat and write some classic ASP to query a SQL Server 2000 database.
Here's the scenario:
I have two datetime fields called fieldA and fieldB.
fieldB will never have a year value that's greater than the year of fieldA
It is possible that the two fields will have the same year.
What I want is all records where fieldA >= fieldB, independent of the year. Just pretend that each field is just a month & day.
How can I get this? My knowledge of T-SQL date/time functions is spotty at best.
You may want to use the built in time functions such as DAY and MONTH. e.g.
SELECT * from table where
MONTH(fieldA) > MONTH(fieldB) OR(
MONTH(fieldA) = MONTH(fieldB) AND DAY(fieldA) >= DAY(fieldB))
Selecting all rows where either the fieldA's month is greater or the months are the same and fieldA's day is greater.
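The condition is the usual lexicographic comparison of (month, day) pairs. As a small illustration outside T-SQL (Python, with a made-up function name), the same logic can be checked like this:

```python
from datetime import datetime


def month_day_gte(field_a: datetime, field_b: datetime) -> bool:
    """True when field_a's (month, day) is on or after field_b's, ignoring the year."""
    return (field_a.month, field_a.day) >= (field_b.month, field_b.day)


print(month_day_gte(datetime(2010, 3, 15), datetime(2009, 3, 10)))  # same month, later day
print(month_day_gte(datetime(2010, 2, 28), datetime(2009, 3, 1)))   # earlier month
```

Tuple comparison looks at the month first and only falls back to the day when the months are equal, which is exactly what the OR/AND expression spells out.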
select *
from t
where datepart(month,t.fieldA) > datepart(month,t.fieldB)
or (datepart(month,t.fieldA) = datepart(month,t.fieldB)
and datepart(day,t.fieldA) >= datepart(day,t.fieldB))
If you care about hours, minutes, seconds, you'll need to extend this to cover the cases, although it may be faster to cast to a suitable string, remove the year and compare.
select *
from t
where substring(convert(varchar,t.fieldA,21),5,20)
>= substring(convert(varchar,t.fieldB,21),5,20)
SELECT *
FROM SOME_TABLE
WHERE MONTH(fieldA) > MONTH(fieldB)
OR ( MONTH(fieldA) = MONTH(fieldB) AND DAY(fieldA) >= DAY(fieldB) )
I would approach this from a Julian date perspective, convert each field into the Julian date (number of days after the first of year), then compare those values.
This may or may not produce desired results with respect to leap years.
If you were worried about hours, minutes, seconds, etc., you could adjust the DateDiff functions to calculate the number of hours (or minutes or seconds) since the beginning of the year.
SELECT *
FROM SOME_Table
WHERE DateDiff(d, '1/01/' + Cast(DatePart(yy, fieldA) AS VarChar(5)), fieldA) >=
DateDiff(d, '1/01/' + Cast(DatePart(yy, fieldB) AS VarChar(5)), fieldB)
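The leap-year caveat is easy to demonstrate. In this Python sketch (the helper name is hypothetical), the day offset from January 1 is computed the same way as the DateDiff expression, and a leap-year date collides with a later calendar day from a non-leap year:

```python
from datetime import date


def days_since_jan_1(d: date) -> int:
    """Days between January 1 of d's year and d, like DateDiff(d, '1/01/<year>', field)."""
    return (d - date(d.year, 1, 1)).days


leap_day = date(2012, 2, 29)   # leap year
march_first = date(2011, 3, 1)  # non-leap year
print(days_since_jan_1(leap_day), days_since_jan_1(march_first))
```

Both offsets are 59, so a >= comparison on them treats February 29 as being on or after March 1, whereas a month/day comparison would not.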
Temp table for testing
Create table #t (calDate date)
Declare @curDate date = '2010-01-01'
while @curDate < '2021-01-01'
begin
insert into #t values (@curDate)
Set @curDate = dateadd(dd,1,@curDate)
end
Example of any date greater than or equal to today
Declare @testDate date = getdate()
SELECT *
FROM #t
WHERE datediff(dd,dateadd(yy,1900 - year(@testDate),@testDate),dateadd(yy,1900 - year(calDate),calDate)) >= 0
One more example with any day less than today
Declare @testDate date = getdate()
SELECT *
FROM #t
WHERE datediff(dd,dateadd(yy,1900 - year(@testDate),@testDate),dateadd(yy,1900 - year(calDate),calDate)) < 0