Calculating average time of events with same ID - grafana

I'm trying to calculate the average of the elapsed time of events with the same id.
Event 1 started at 123 and ended at 129 -> it lasted 6 seconds.
Event 2 started at 134 and ended at 138 -> it lasted 4 seconds.
time | id
-----|---
123  | 1
125  | 1
129  | 1
134  | 2
138  | 2
The average would be 5 seconds.
With the query below I just get all the elapsed times, not grouped by id:
SELECT elapsed(id)
FROM measurement1
GROUP BY id

You can use the query that computes each event's elapsed time as a subquery, then take the average across ids.
SELECT AVG(elapsed)
FROM (
    SELECT MAX(time) - MIN(time) AS elapsed
    FROM measurement1
    GROUP BY id
)
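
The arithmetic the subquery performs (max minus min per id, then the average of those) can be cross-checked outside the database. A sketch in Python, using the sample points from the question:

```python
from collections import defaultdict

# Sample (time, id) points from the question
points = [(123, 1), (125, 1), (129, 1), (134, 2), (138, 2)]

# Group timestamps by id, then take max - min per id
times_by_id = defaultdict(list)
for t, event_id in points:
    times_by_id[event_id].append(t)

elapsed = {i: max(ts) - min(ts) for i, ts in times_by_id.items()}
average = sum(elapsed.values()) / len(elapsed)
print(elapsed)   # {1: 6, 2: 4}
print(average)   # 5.0
```

This matches the expected result from the question: event 1 lasted 6 seconds, event 2 lasted 4, and the average is 5.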

Related

Extracting timestamps difference by day, minute and second in Postgresql

I would like to get the difference between two timestamps, both of which include the day and month, and show the output using only the largest relevant units. Currently my output shows the difference in years as well as months, days, etc., but I only need to see the days and hours. Is there a way to do this?
This is the data using this code:
select
    id,
    ts1,
    ts2,
    (ts2 - ts1) as output
from table_a
an example row (out of multiple):
id |ts1 | ts2 | output
---|-----------------------|-----------------------| --------------------
3 |2020-04-27 11:00:46.00 |2020-04-27 12:52:04.00 | 0 years 0 mons 0 days 1 hours 51 mins 18.00 secs
But I would like to show the shortest answer possible (days only when necessary, then hours and minutes, excluding seconds), so in this case just 1 hours 51 mins.
Desired result:
id |ts1 | ts2 | output
----|-----------------------|-----------------------| --------------------
3 |2020-04-27 11:00:46.00 |2020-04-27 12:52:04.00 | 1 hours 51 mins
How do I do this (using PostgreSQL)?
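
No Postgres answer is shown here, but the formatting rule itself (drop leading zero units, stop before seconds) is easy to sketch in Python as a cross-check; the timestamps are taken from the example row:

```python
from datetime import datetime

ts1 = datetime.fromisoformat("2020-04-27 11:00:46")
ts2 = datetime.fromisoformat("2020-04-27 12:52:04")
delta = ts2 - ts1

# Break the interval into days / hours / mins
days = delta.days
hours, rem = divmod(delta.seconds, 3600)
mins = rem // 60

parts = [(days, "days"), (hours, "hours"), (mins, "mins")]
# Drop units from the front while they are zero, so only the
# largest nonzero unit onward is shown
while parts and parts[0][0] == 0:
    parts.pop(0)
output = " ".join(f"{value} {unit}" for value, unit in parts)
print(output)  # 1 hours 51 mins
```

In Postgres the same effect would need conditional formatting of the interval components (e.g. via EXTRACT on the difference); the sketch only demonstrates the trimming logic being asked for.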

Compare number of instances in a column for two different dates in SAS

I have the following dataset (items) with transactions on any date and amount paid on the next business day.
The amount paid for each id on the next business day is $10 for the ids whose rate is >5
My task is to check that the number of instances where rate > 5 is in line with the amount paid on the next business day (these payment rows carry the standard code 121).
For instance, there are four instances with rate > 5 on 4/14/2017; the amount $40 (4*10) is paid on 4/17/2017.
Date id rate code batch
4/14/2017 1 12 100 A1
4/14/2017 1 2 101 A1
4/14/2017 1 13 101 A1
4/14/2017 1 10 100 A1
4/14/2017 1 10 100 A1
4/17/2017 1 40 121
4/20/2017 2 12 100 A1
4/20/2017 2 2 101 A1
4/20/2017 2 3 101 A1
4/20/2017 2 10 100 A1
4/20/2017 2 10 100 A1
4/21/2017 2 30 121
My code
proc sql;
create table items2 as select
count(id) as id_count,
(case when code='121' then rate/10 else 0 end) as rate_count
from items
group by date,id;
quit;
This has not yielded the desired result and the challenge I have here is to check the transaction dates (4/14/2017 and 4/20/2017) and next business day dates (4/17/2017,4/21/2017).
Appreciate your help.
The LAG function will do the trick here, since we can use the lagged values to build the condition we want without needing an explicit rate > 5 check.
Here is the solution:
Data items;
    set items;
    Lag_dt=LAG(Date);
    Lag_id=LAG(id);
    Lag_rate=LAG(rate);
    if ((id=lag_id) and (code=121) and (Date>lag_dt)) then rate_count=(rate/lag_rate);
    else rate_count=0;
    Drop lag_dt lag_id lag_rate;
run;
Hope this helps.
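
As a cross-check (a Python sketch, not SAS), the lagged comparison reproduces the expected instance counts on the sample data, given the answer's assumption that each rate > 5 instance is paid at a flat $10:

```python
from datetime import datetime

def d(s):
    return datetime.strptime(s, "%m/%d/%Y")

# (Date, id, rate, code) rows from the question's dataset
rows = [
    ("4/14/2017", 1, 12, 100), ("4/14/2017", 1, 2, 101),
    ("4/14/2017", 1, 13, 101), ("4/14/2017", 1, 10, 100),
    ("4/14/2017", 1, 10, 100), ("4/17/2017", 1, 40, 121),
    ("4/20/2017", 2, 12, 100), ("4/20/2017", 2, 2, 101),
    ("4/20/2017", 2, 3, 101), ("4/20/2017", 2, 10, 100),
    ("4/20/2017", 2, 10, 100), ("4/21/2017", 2, 30, 121),
]

results = {}
for prev, cur in zip(rows, rows[1:]):
    date, rec_id, rate, code = cur
    # Mirror the SAS condition: same id, payment code 121, later date
    if rec_id == prev[1] and code == 121 and d(date) > d(prev[0]):
        results[rec_id] = rate / prev[2]   # rate / lag(rate)

print(results)  # {1: 4.0, 2: 3.0}
```

Id 1 has four rate > 5 rows on 4/14 and 40/10 = 4; id 2 has three on 4/20 and 30/10 = 3, so the lag-based count lines up with the payments.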

Postgresql Query for display of records every 45 days

I have a table that has data of user_id and the timestamp they joined.
If I need to display the data month-wise I could just use:
select
count(user_id),
date_trunc('month',(to_timestamp(users.timestamp))::timestamp)::date
from
users
group by 2
The date_trunc code allows to use 'second', 'day', 'week' etc. Hence I could get data grouped by such periods.
How do I get data grouped by "n-day" period say 45 days ?
Basically I need to display number users per 45 day period.
Any suggestion or guidance appreciated!
Currently I get:
Date Users
2015-03-01 47
2015-04-01 72
2015-05-01 123
2015-06-01 132
2015-07-01 136
2015-08-01 166
2015-09-01 129
2015-10-01 189
I would like the data to come in 45-day intervals. Something like:
Date Users
2015-03-01 85
2015-04-15 157
2015-05-30 192
2015-07-14 229
2015-08-28 210
2015-10-12 294
UPDATE:
I used the following to get the output, but one problem remains. I'm getting values that are offset.
with
new_window as (
select
generate_series as cohort
, lag(generate_series, 1) over () as cohort_lag
from
(
select
*
from
generate_series('2015-03-01'::date, '2016-01-01', '45 day')
)
t
)
select
--cohort
cohort_lag -- This worked. !!!
, count(*)
from
new_window
join users on
user_timestamp <= cohort
and user_timestamp > cohort_lag
group by 1
order by 1
But the output I am getting is:
Date Users
2015-04-15 85
2015-05-30 157
2015-07-14 193
2015-08-28 225
2015-10-12 210
Basically The users displayed at 2015-03-01 should be the users between 2015-03-01 and 2015-04-15 and so on.
But I seem to be getting values of users upto a date. ie: upto 2015-04-15 users 85. which is not the results I want.
Any help here ?
Try this query :
SELECT to_char(i::date,'YYYY-MM-DD') as date, 0 as users
FROM generate_series('2015-03-01', '2015-11-30','45 day'::interval) as i;
OUTPUT :
date users
2015-03-01 0
2015-04-15 0
2015-05-30 0
2015-07-14 0
2015-08-28 0
2015-10-12 0
2015-11-26 0
This looks like a hot mess, and it might be better wrapped in a function where you could use some variables, but would something like this work?
with number_of_intervals as (
select
min (timestamp)::date as first_date,
ceiling (extract (day from max (timestamp) - min (timestamp)) / 45)::int as num
from users
),
intervals as (
select
generate_series(0, num - 1, 1) int_start,
generate_series(1, num, 1) int_end
from number_of_intervals
),
date_spans as (
select
n.first_date + 45 * i.int_start as interval_start,
n.first_date + 45 * i.int_end as interval_end
from
number_of_intervals n
cross join intervals i
)
select
d.interval_start, count (*) as user_count
from
users u
join date_spans d on
u.timestamp >= d.interval_start and
u.timestamp < d.interval_end
group by
d.interval_start
order by
d.interval_start
With this sample data:
User Id timestamp derived range count
1 3/1/2015 3/1-4/15
2 3/26/2015 "
3 4/4/2015 "
4 4/6/2015 " (4)
5 5/6/2015 4/16-5/30
6 5/19/2015 " (2)
7 6/16/2015 5/31-7/14
8 6/27/2015 "
9 7/9/2015 " (3)
10 7/15/2015 7/15-8/28
11 8/8/2015 "
12 8/9/2015 "
13 8/22/2015 "
14 8/27/2015 " (5)
Here is the output:
2015-03-01 4
2015-04-15 2
2015-05-30 3
2015-07-14 5
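
The bucketing logic (window start = first date plus 45-day multiples, each signup assigned to the window it falls in) can be cross-checked in Python with the same sample data:

```python
from datetime import date, timedelta

# The 14 sample signup timestamps from the answer's test data
signups = [date(2015, 3, 1), date(2015, 3, 26), date(2015, 4, 4),
           date(2015, 4, 6), date(2015, 5, 6), date(2015, 5, 19),
           date(2015, 6, 16), date(2015, 6, 27), date(2015, 7, 9),
           date(2015, 7, 15), date(2015, 8, 8), date(2015, 8, 9),
           date(2015, 8, 22), date(2015, 8, 27)]

first = min(signups)
counts = {}
for ts in signups:
    # Index of the 45-day window this signup falls into
    bucket = (ts - first).days // 45
    start = first + timedelta(days=45 * bucket)
    counts[start] = counts.get(start, 0) + 1

for start in sorted(counts):
    print(start, counts[start])
```

This prints 2015-03-01 4, 2015-04-15 2, 2015-05-30 3, 2015-07-14 5, matching the query output above, with each count attributed to the start of its window rather than the end.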

Hive Calculation of Percentage

I am trying to write a simple code to calculate the percentage of occurrence of distinct instances in a table.
Can I do this in one go?
Below is my code which is giving me error.
select 100 * total_sum/sum(total_sum) from jav_test;
In the past when I have had to do similar things this is the approach I've taken:
SELECT
jav_test.total_sum AS total_sum,
withsum.total_sum AS sum_of_all_total_sum,
100 * (jav_test.total_sum / withsum.total_sum) AS percentage
FROM
jav_test,
(SELECT sum(total_sum) AS total_sum FROM jav_test) withsum -- This computes sum(total_sum) here as a single-row single-column table aliased as "withsum"
;
The presence of the total_sum and sum_of_all_total_sum columns in the output is just to convince myself that the correct math took place - the one you are interested in is percentage, based on the query you posted in the question.
After populating a small dummy table, this was the result:
hive> describe jav_test;
OK
total_sum int
Time taken: 1.777 seconds, Fetched: 1 row(s)
hive> select * from jav_test;
OK
28
28
90113
90113
323694
323694
Time taken: 0.797 seconds, Fetched: 6 row(s)
hive> SELECT
> jav_test.total_sum AS total_sum,
> withsum.total_sum AS sum_of_all_total_sum,
> 100 * (jav_test.total_sum / withsum.total_sum) AS percentage
> FROM jav_test, (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum;
...
... lots of mapreduce-related spam here
...
Total MapReduce CPU Time Spent: 3 seconds 370 msec
OK
28 827670 0.003382990805514275
28 827670 0.003382990805514275
90113 827670 10.887551802046708
90113 827670 10.887551802046708
323694 827670 39.10906520714777
323694 827670 39.10906520714777
Time taken: 41.257 seconds, Fetched: 6 row(s)
hive>
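
The arithmetic the query performs is easy to verify outside Hive. In Python, with the same six values:

```python
# The six total_sum values from the dummy table
values = [28, 28, 90113, 90113, 323694, 323694]

total = sum(values)  # the single-row subquery: sum(total_sum) = 827670
percentages = [100 * v / total for v in values]

for v, p in zip(values, percentages):
    print(v, total, p)
```

The percentages agree with the Hive output (28 → ~0.00338%, 90113 → ~10.89%, 323694 → ~39.11%), and they sum to 100 across all rows.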

How to subtract seconds from postgres datetime without having to add it in group by clause?

Say I have a column of type timestamp with the value "2014-04-14 12:17:55.772" and I need to subtract 2 seconds from it to get output like "12:17:53".
userid EndDate seconds
--------------------------------------------------------
1 "2014-04-14 12:17:14.295" 512
1 "2014-04-14 12:31:14.295" 12
2 "2014-04-14 12:48:14.295" 2
2 "2014-04-14 13:22:14.295" 12
and the query is:
select (enddate::timestamp - (seconds* interval '1 second')) seconds, userid
from user
group by userid
Now I need to group by userid only, but enddate and seconds appear in the select list, so Postgres asks me to add them to the group by clause, which will not give me the correct output.
I am expecting data in this format, where I need to calculate start_time from end_time and the total seconds spent.
user : 1
start_time end_time total (seconds)
"12:17" "12:17" 1
"12:22" "12:31" 512
total: 513
user : 2
"12:43" "12:48" 288
"13:22" "13:22" 1
total 289
Is there some way I could avoid the group by clause here?
Like #IMSoP says, you can use a window function to include a total for each user in your query output:
SELECT userid
, (enddate - (seconds * interval '1 second')) as start_time
, enddate as end_time
, seconds
, sum(seconds) OVER (PARTITION BY userid) as total
FROM so23063314.user;
Then you would only display the parts of the row you're interested in for each subtotal line, and display the total at the end of each block.
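
What the window function does (each row keeps its own start/end columns while the per-user total repeats on every row of that partition) can be sketched in Python with the sample rows:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (userid, enddate, seconds) rows from the question
rows = [
    (1, datetime(2014, 4, 14, 12, 17, 14), 512),
    (1, datetime(2014, 4, 14, 12, 31, 14), 12),
    (2, datetime(2014, 4, 14, 12, 48, 14), 2),
    (2, datetime(2014, 4, 14, 13, 22, 14), 12),
]

# Equivalent of sum(seconds) OVER (PARTITION BY userid):
# compute the total per user first...
totals = defaultdict(int)
for userid, enddate, seconds in rows:
    totals[userid] += seconds

# ...then emit every row unchanged, with its partition total attached
for userid, enddate, seconds in rows:
    start = enddate - timedelta(seconds=seconds)
    print(userid, start.time(), enddate.time(), seconds, totals[userid])
```

Unlike GROUP BY, no rows are collapsed: every detail row survives, which is why enddate and seconds can stay in the select list.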