Unable to convert age function to a float with PostgreSQL

I am trying to write a query to output the average earnings per hour spent dashing by day of week.
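worker_table:
 session_id | worker_id |    session_start    |     session_end     | total_pay | num_of_deliveries
------------+-----------+---------------------+---------------------+-----------+-------------------
       7712 |      9347 | 2020-08-31 03:32:43 | 2020-08-31 05:53:43 |     46.72 |                 3
       1560 |      5645 | 2020-07-26 01:48:40 | 2020-07-26 04:48:40 |     65.32 |                 4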
So far I am able to extract the day of week and the age, but I'm not sure how to cast the age to a numeric value so I can proceed with my query. When I run the query below, the age comes back as "02:21:00", but I want it to be a float so I can divide total_pay by it. Thanks.
select extract(dow from session_start), age(session_end, session_start)
from worker_table
Edit: If for some reason the table is not showing correctly, please view my table here: https://pastebin.com/g8GvyWWR

To get the number of seconds from an interval, you can use extract(epoch from ...). In your case, you could do something like this:
select extract(epoch from age(session_end, session_start)) as session_length_in_seconds
from worker_table
The full query might then look something like this:
select
    extract(dow from session_start),
    avg(total_pay / (extract(epoch from age(session_end, session_start)) / 3600.0)) as avg_earnings_per_hour
from worker_table
group by 1
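To sanity-check the arithmetic outside the database, here is a minimal Python sketch that applies the same formula (total_pay divided by session length in hours) to the two sample rows from the question. The helper name pay_per_hour is made up for illustration, and isoweekday() % 7 is used only to mimic PostgreSQL's dow numbering (Sunday = 0).

```python
from datetime import datetime

# Sample rows copied from worker_table in the question:
# (session_start, session_end, total_pay)
rows = [
    ("2020-08-31 03:32:43", "2020-08-31 05:53:43", 46.72),
    ("2020-07-26 01:48:40", "2020-07-26 04:48:40", 65.32),
]

def pay_per_hour(start, end, total_pay):
    """Hypothetical helper: return (day of week, earnings per hour)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    started = datetime.strptime(start, fmt)
    ended = datetime.strptime(end, fmt)
    # Same idea as extract(epoch from ...) / 3600.0 in the SQL above.
    hours = (ended - started).total_seconds() / 3600.0
    # isoweekday() is 1..7 with Monday = 1; % 7 maps Sunday to 0 like dow.
    dow = started.isoweekday() % 7
    return dow, total_pay / hours

results = [pay_per_hour(*row) for row in rows]
for dow, rate in results:
    print(dow, round(rate, 2))
```

For the sample data this gives roughly 19.88 per hour for the Monday session and 21.77 per hour for the Sunday session, which is what the SQL should average per day of week.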

Related

Postgres generate date series with exactly 100 steps

Let's say we have the dates '2017-01-01' and '2017-01-04', and I would like to get a series of exactly N timestamps between these dates, endpoints included. In this case N is 7:
SELECT * FROM
generate_series_n(
'2017-01-01'::timestamp,
'2017-01-04'::timestamp,
7
)
Which I would like to return something like this:
2017-01-01-00:00:00
2017-01-01-12:00:00
2017-01-02-00:00:00
2017-01-02-12:00:00
2017-01-03-00:00:00
2017-01-03-12:00:00
2017-01-04-00:00:00
How can I do this in postgres?
This may be useful: use generate_series for the step number and do the math in the select list.
select '2022-01-01'::date + generate_series *('2022-05-31'::date - '2022-01-01'::date)/15
FROM generate_series(1, 15)
;
output
?column?
------------
2022-01-11
2022-01-21
2022-01-31
2022-02-10
2022-02-20
2022-03-02
2022-03-12
2022-03-22
2022-04-01
2022-04-11
2022-04-21
2022-05-01
2022-05-11
2022-05-21
2022-05-31
(15 rows)
WITH seconds AS
(
    -- total length of the range in seconds
    SELECT EXTRACT(epoch FROM ('2017-01-04'::timestamp - '2017-01-01'::timestamp))::integer AS sec
),
step_seconds AS
(
    -- 7 values including both endpoints need 7 - 1 = 6 steps
    SELECT sec / (7 - 1) AS step FROM seconds
)
SELECT generate_series('2017-01-01'::timestamp, '2017-01-04'::timestamp, (step || 'S')::interval)
FROM step_seconds
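Converting this to a function is easy; let me know if you have trouble with it.
One caveat: extract(epoch ...) on an interval counts every month as 30 days. Subtracting two timestamps, as above, yields an interval made of days and time only, so this matters mainly if you build the interval some other way (for example with age()) over long ranges. If that is a problem for your use case, you can tweak the logic for getting the seconds from the interval.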
You can divide the difference between the end and the start value by the number of steps, which is one less than the number of values you want because both endpoints are included:
SELECT *
FROM generate_series('2017-01-01'::timestamp,
                     '2017-01-04'::timestamp,
                     ('2017-01-04'::timestamp - '2017-01-01'::timestamp) / (7 - 1))
This could be wrapped into a function if you want to avoid repeating the start and end value.
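A quick way to check the step size is to reproduce the series outside the database. The Python sketch below assumes both endpoints should be part of the result, so n values need n - 1 steps; evenly_spaced is a hypothetical helper name, not a PostgreSQL function.

```python
from datetime import datetime

def evenly_spaced(start, end, n):
    """Hypothetical helper: n timestamps from start to end, endpoints included."""
    if n < 2:
        raise ValueError("n must be at least 2 to include both endpoints")
    # n values are separated by n - 1 equal steps.
    step = (end - start) / (n - 1)
    return [start + i * step for i in range(n)]

series = evenly_spaced(datetime(2017, 1, 1), datetime(2017, 1, 4), 7)
for ts in series:
    print(ts)
```

For the dates in the question this yields seven timestamps 12 hours apart, from 2017-01-01 00:00:00 to 2017-01-04 00:00:00, matching the desired output.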

Number of days in a month in DB2

Is there a way to find the number of days in a month in DB2? For example, I have a datetime field which I display as Jan-2020, Feb-2020 and so on. Based on this field I need to fetch the number of days for that month. The output should look like the table below.
I'm using the query below:
select reportdate, TO_CHAR(reportdate, 'Mon-YYYY') as textmonth from mytable
Expected output:
 ReportDate       | textMonth | No of Days
------------------+-----------+------------
 1-1-2020 08:00   | Jan-2020  | 31
 1-2-2020 09:00   | Feb-2020  | 29
 12-03-2020 07:00 | Mar-2020  | 31
Try this:
/*
WITH MYTABLE (reportdate) AS
(
VALUES
TIMESTAMP('2020-01-01 08:00:00')
, TIMESTAMP('2020-02-01 09:00:00')
, TIMESTAMP('2020-03-12 07:00:00')
)
*/
SELECT reportdate, textMonth, DAYS(D + 1 MONTH) - DAYS(D) AS NO_OF_DAYS
FROM
(
SELECT
reportdate, TO_CHAR(reportdate, 'Mon-YYYY') textMonth
, DATE(TO_DATE('01-' || TO_CHAR(reportdate, 'Mon-YYYY'), 'dd-Mon-yyyy')) D
FROM MYTABLE
);
Db2 has the function DAYS_TO_END_OF_MONTH and several others that you could use. Based on your month input, construct the first day of the month, for example 2020-01-01 for Jan-2020 or 2020-02-01 for Feb-2020. The Db2 documentation lists several other conversion functions that let you transform between formats and perform date arithmetic.
Convert your column to a proper date and try this: day(last_day(date_column))
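The idea behind the first answer (first day of the next month minus first day of this month) can be checked with a small Python sketch. This is only an illustration of the arithmetic, not Db2 code, and days_in_month is a made-up helper name.

```python
from datetime import date

def days_in_month(d):
    """Hypothetical helper: days in the month that contains d."""
    first = d.replace(day=1)
    # Roll over to January of the next year after December.
    if first.month == 12:
        next_first = date(first.year + 1, 1, 1)
    else:
        next_first = date(first.year, first.month + 1, 1)
    return (next_first - first).days

# The three report dates from the question.
report_dates = [date(2020, 1, 1), date(2020, 2, 1), date(2020, 3, 12)]
counts = [days_in_month(d) for d in report_dates]
print(counts)
```

This gives 31, 29 and 31 for the sample rows, which agrees with the expected output in the question (2020 is a leap year).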

How to bin timestamp data into buckets of n minutes in postgres

I have the following query which works, binning timestamped "observations" into buckets whose boundaries are defined by the bins table:
SELECT
count(id),
width_bucket(
time :: TIMESTAMP,
(SELECT ARRAY(SELECT start_time
FROM bins
WHERE owner_id = 'some id'
ORDER BY start_time ASC) :: TIMESTAMP[])
) bucket
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket
ORDER BY bucket;
I would like to modify this to allow for querying arbitrary n-minute bins starting from a specified timestamp, rather than having to pull from an actual "bins" table.
That is, given a start time, a "bin width" in minutes, and a number of bins, is there a way I can generate the array of timestamps to pass into the width_bucket function?
Alternatively, is there a different/simpler approach to get the same results?
Use the function generate_series(start, stop, step interval), e.g.
select array(
select generate_series(
timestamp '2018-04-15 00:00',
'2018-04-15 01:00',
'30 minutes'))
array
---------------------------------------------------------------------
{"2018-04-15 00:00:00","2018-04-15 00:30:00","2018-04-15 01:00:00"}
(1 row)
The above answers seem to do what you want, but as of PostgreSQL 14, there is now a function date_bin just for binning timestamps.
Quoting the documentation:
date_bin(stride,source,origin)
source is a value expression of type timestamp or timestamp with time zone. (Values of type date are cast automatically to timestamp.) stride is a value expression of type interval. The return value is likewise of type timestamp or timestamp with time zone, and it marks the beginning of the bin into which the source is placed.
Examples:
SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01');
Result: 2020-02-11 15:30:00
SELECT date_bin('15 minutes', TIMESTAMP '2020-02-11 15:44:17', TIMESTAMP '2001-01-01 00:02:30');
Result: 2020-02-11 15:32:30
In the case of full units (1 minute, 1 hour, etc.), it gives the same result as the analogous date_trunc call, but the difference is that date_bin can truncate to an arbitrary interval.
The stride interval must be greater than zero and cannot contain units of month or larger.
I would like to call special attention to the line
The return value [...] marks the beginning of the bin into which the source is placed.
This means that input timestamps will always be binned by "rounding down", rather than binning to whichever bin is closest. E.g. if you do:
SELECT date_bin('1 hour', '2021-10-13 00:59:59', '2021-10-13 00:00:00');
Then the result will be 2021-10-13 00:00:00 (rounded down by 59 minutes and 59 seconds), NOT 2021-10-13 01:00:00 (which is only one second away from the supplied timestamp). So the date_bin function does something slightly different from exactly what you ask for, but I figure this is good to post for anyone coming here in the future.
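To make the "rounding down" behaviour concrete, here is a rough Python sketch of the arithmetic as I understand it from the quoted documentation: count how many whole strides fit between origin and source, then step that far from origin. The function name bin_start is hypothetical, and the sketch only claims to match date_bin for sources at or after the origin.

```python
from datetime import datetime, timedelta

def bin_start(stride, source, origin):
    """Hypothetical sketch: start of the stride-wide bin containing source."""
    # Floor division of two timedeltas gives the number of whole strides.
    steps = (source - origin) // stride
    return origin + steps * stride

quarter = timedelta(minutes=15)
first = bin_start(quarter, datetime(2020, 2, 11, 15, 44, 17), datetime(2001, 1, 1))
second = bin_start(quarter, datetime(2020, 2, 11, 15, 44, 17), datetime(2001, 1, 1, 0, 2, 30))
hourly = bin_start(timedelta(hours=1), datetime(2021, 10, 13, 0, 59, 59), datetime(2021, 10, 13))
print(first, second, hourly)
```

The first two calls reproduce the documentation examples (15:30:00 and 15:32:30), and the third shows 00:59:59 landing in the bin that starts at 00:00:00 rather than the closer 01:00:00.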
A different approach without a series:
Divide the difference of time and start by the width of the bin (5 minutes in the example) and add 1 because the first bucket of width_bucket(...) is 1 not 0.
floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60) ) + 1 as bucket
Getting the start of the bin is also possible
to_timestamp(floor(extract(epoch from time) / (5 * 60)) * (5 * 60)) as bin_start
Putting this all together:
SELECT
count(id),
floor(extract(epoch from (time - '2019-06-04 00:00'::timestamp)) / (5 * 60) ) + 1 as bucket,
to_timestamp(floor(extract(epoch from time) / (5 * 60)) * (5 * 60)) as bin_start
FROM observations
WHERE owner_id = 'some id'
GROUP BY bucket, bin_start
ORDER BY bucket;
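The bucket formula can be tried on a few values with a short Python sketch. It assumes the same start time (2019-06-04 00:00) and 5-minute width as the query above, and only covers timestamps at or after the start; bucket_number is a made-up name.

```python
from datetime import datetime
import math

def bucket_number(ts, start, width_minutes):
    """Hypothetical helper: 1-based bucket index of ts, counted from start."""
    elapsed = (ts - start).total_seconds()
    # Same as floor(extract(epoch from (time - start)) / (width * 60)) + 1.
    return math.floor(elapsed / (width_minutes * 60)) + 1

start = datetime(2019, 6, 4, 0, 0)
samples = [
    datetime(2019, 6, 4, 0, 0, 0),
    datetime(2019, 6, 4, 0, 4, 59),
    datetime(2019, 6, 4, 0, 5, 0),
    datetime(2019, 6, 4, 0, 7, 30),
]
buckets = [bucket_number(ts, start, 5) for ts in samples]
print(buckets)
```

The first two samples fall in bucket 1 and the last two in bucket 2, so a timestamp exactly on a boundary belongs to the bucket that starts there.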

PostgreSQL: Date Difference with fractions

SELECT cu.user_id, cu.last_activity, cu.updated_time,
DATE_PART('day', cu.last_activity - cu.updated_time), to_char(end_date - start_date, 'DD.HH24')
FROM stats.core_users cu
WHERE cu.user_id = '117132014' or cu.user_id = '117132012';
Get the result like:
117132014 2017-12-11 10:34:51.349905 2017-12-09 12:00:38.503518 1 01.22
117132012 2017-12-11 05:18:20.312283 2017-12-08 15:46:51.914085 2 02.13
Is it feasible to get the day difference with fractions, like 1.91 days in the first case instead of 1 day and 22 hours, to be more precise and easier to fit into a machine learning model?
date_part() does what its name says: it returns one part of several elements from a date, interval or timestamp. In your case it's one part of an interval (because timestamp - timestamp returns an interval).
If you want the result as a fraction, you need to extract the total seconds of the interval and then divide that by 86400 (the number of seconds in a day):
extract(epoch from cu.last_activity - cu.updated_time) / 86400
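As a cross-check of that formula, the same division can be done in Python on the first row from the question. This is just the arithmetic, not a replacement for the SQL.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S.%f"
# Values copied from the first result row in the question.
last_activity = datetime.strptime("2017-12-11 10:34:51.349905", fmt)
updated_time = datetime.strptime("2017-12-09 12:00:38.503518", fmt)

# Total seconds of the difference divided by seconds per day.
days = (last_activity - updated_time).total_seconds() / 86400
print(round(days, 4))
```

The difference is 1 day 22:34:12.85, which comes out at about 1.94 days.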

PostgreSQL difference between two dates and return in hours and minutes

Hello, I want to return the difference between two dates in PostgreSQL:
START: 2016-06-01 00:00:00
END: 2016-06-06 08:35:33
Expected return value: 128:35:33, formatted like format [h]:mm:ss;# in Excel. Hours must be added up if there is more than 24 hours of difference.
Here's my SQL:
SELECT EXTRACT(EPOCH FROM dt_termino::timestamp - dt_inicio::timestamp)/3600 FROM crm.task_interacao WHERE id_task_tarefa = 1
Update: I'm now facing another problem. I have a table like this:
start;end
2013-06-01 09:29:33;2016-06-07 14:08:19
2016-06-07 14:22:09;2016-06-07 14:22:43
2016-06-07 14:22:51; null
I need to sum the values. I'm trying what you suggested in the first answer, but I can't use a function because I'm running the query from PHP code:
SELECT SUM(COALESCE(end::timestamp, now()::timestamp) - start::timestamp) FROM crm.task_interacao WHERE id_task_tarefa = 1
but it returns
1102 days 26:07:54.864879
Why 26 hours? I expected the hours to be at most 24.
I also need the result as (Days HH:MM:SS), without milliseconds.
You can simply subtract timestamps to get interval:
select '2016-06-06 08:35:33'::timestamp- '2016-06-01 00:00:00' result
result
-----------------
5 days 08:35:33
(1 row)
There is no standard function to convert the result to the format you need but you can write one:
create or replace function interval_without_days(interval)
returns interval language sql as $$
select $1- date_part('day', $1)* '1d'::interval+ date_part('day', $1)* '24h'::interval;
$$;
select interval_without_days('2016-06-06 08:35:33'::timestamp- '2016-06-01 00:00:00');
interval_without_days
-----------------------
128:35:33
(1 row)
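The conversion itself is plain arithmetic on the total number of seconds, which the following Python sketch illustrates for the dates in the question. It assumes the end is not before the start, and hours_minutes_seconds is a hypothetical helper name.

```python
from datetime import datetime

def hours_minutes_seconds(start, end):
    """Hypothetical helper: format end - start as [h]:mm:ss (end >= start)."""
    total = int((end - start).total_seconds())
    # Hours are not capped at 24, so whole days are folded into them.
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

formatted = hours_minutes_seconds(datetime(2016, 6, 1), datetime(2016, 6, 6, 8, 35, 33))
print(formatted)
```

Five days and 08:35:33 become 5 * 24 + 8 = 128 hours, so the result is 128:35:33 as in the function output above.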
Question #2. Use the functions date_trunc(text, interval) and justify_hours(interval):
select date_trunc('sec', justify_hours('1102 days 26:07:54.864879'));
date_trunc
--------------------
1103 days 02:07:54
(1 row)
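As for why 26 hours shows up at all: as far as I understand, PostgreSQL keeps the day count and the time part of an interval separately and SUM adds each part on its own, so the hours can exceed 24 until justify_hours folds them into days. The Python sketch below mimics that normalization plus the truncation to whole seconds; timedelta happens to normalize on construction, which is an implementation detail of Python and not of PostgreSQL.

```python
from datetime import timedelta

# The un-normalized sum from the question: 1102 days 26:07:54.864879
total = timedelta(days=1102, hours=26, minutes=7, seconds=54, microseconds=864879)

# Drop the fractional seconds, like date_trunc('sec', ...).
whole_seconds = timedelta(days=total.days, seconds=total.seconds)

hours, remainder = divmod(whole_seconds.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
normalized = f"{whole_seconds.days} days {hours:02d}:{minutes:02d}:{seconds:02d}"
print(normalized)
```

This prints 1103 days 02:07:54, the same value the justify_hours query returns.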