Calculating average time of events with same ID - grafana

I'm trying to calculate the average of the elapsed time of events with the same id.
Event 1 started at 123 and ended at 129 -> it lasted 6 seconds.
Event 2 started at 134 and ended at 138 -> it lasted 4 seconds.
time | id
-----|---
123  | 1
125  | 1
129  | 1
134  | 2
138  | 2
The average would be 5 seconds.
With the query below I just get all the elapsed times, not grouped by id:
SELECT elapsed(id)
FROM measurement1
GROUP BY id

You can use the query that computes each event's elapsed time as a subquery, then take the average across ids.
SELECT AVG(elapsed)
FROM (
    SELECT MAX(time) - MIN(time) AS elapsed
    FROM measurement1
    GROUP BY id
)
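
The arithmetic the subquery performs (max minus min per id, then the average of those) can be cross-checked outside the database. A sketch in Python, using the sample points from the question:

```python
from collections import defaultdict

# Sample (time, id) points from the question
points = [(123, 1), (125, 1), (129, 1), (134, 2), (138, 2)]

# Group timestamps by id, then take max - min per id
times_by_id = defaultdict(list)
for t, event_id in points:
    times_by_id[event_id].append(t)

elapsed = {i: max(ts) - min(ts) for i, ts in times_by_id.items()}
average = sum(elapsed.values()) / len(elapsed)
print(elapsed)   # {1: 6, 2: 4}
print(average)   # 5.0
```

This matches the expected result from the question: event 1 lasted 6 seconds, event 2 lasted 4, and the average is 5.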

Related

Extracting timestamps difference by day, minute and second in Postgresql

I would like to get the difference between two timestamps, both of which include the day and month, and show the output using only the largest relevant units. Currently my output shows the difference in years as well as months, days, etc., but I only need to see the days and hours. Is there a way to do this?
This is the data using this code:
select
    id,
    ts1,
    ts2,
    (ts2 - ts1) as output
from table_a
an example row (out of multiple):
id |ts1 | ts2 | output
---|-----------------------|-----------------------| --------------------
3 |2020-04-27 11:00:46.00 |2020-04-27 12:52:04.00 | 0 years 0 mons 0 days 1 hours 51 mins 18.00 secs
But I would like to show the shortest answer possible (days only when necessary, then hours and minutes, excluding seconds), so in this case just 1 hours 51 mins.
Desired result:
id |ts1 | ts2 | output
----|-----------------------|-----------------------| --------------------
3 |2020-04-27 11:00:46.00 |2020-04-27 12:52:04.00 | 1 hours 51 mins
How do I do this (using PostgreSQL)?
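
No Postgres answer is shown here, but the formatting rule itself (drop leading zero units, stop before seconds) is easy to sketch in Python as a cross-check; the timestamps are taken from the example row:

```python
from datetime import datetime

ts1 = datetime.fromisoformat("2020-04-27 11:00:46")
ts2 = datetime.fromisoformat("2020-04-27 12:52:04")
delta = ts2 - ts1

# Break the interval into days / hours / mins
days = delta.days
hours, rem = divmod(delta.seconds, 3600)
mins = rem // 60

parts = [(days, "days"), (hours, "hours"), (mins, "mins")]
# Drop units from the front while they are zero, so only the
# largest nonzero unit onward is shown
while parts and parts[0][0] == 0:
    parts.pop(0)
output = " ".join(f"{value} {unit}" for value, unit in parts)
print(output)  # 1 hours 51 mins
```

In Postgres the same effect would need conditional formatting of the interval components (e.g. via EXTRACT on the difference); the sketch only demonstrates the trimming logic being asked for.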

Compare number of instances in a column for two different dates in SAS

I have the following dataset (items) with transactions on any date and amount paid on the next business day.
The amount paid for each id on the next business day is $10 for the ids whose rate is >5
My task is to check that the number of instances where rate > 5 is in line with the amount paid on the next business day (these payment rows carry the standard code 121).
For instance, there are four instances with rate > 5 on 4/14/2017; the amount $40 (4*10) is paid on 4/17/2017.
Date id rate code batch
4/14/2017 1 12 100 A1
4/14/2017 1 2 101 A1
4/14/2017 1 13 101 A1
4/14/2017 1 10 100 A1
4/14/2017 1 10 100 A1
4/17/2017 1 40 121
4/20/2017 2 12 100 A1
4/20/2017 2 2 101 A1
4/20/2017 2 3 101 A1
4/20/2017 2 10 100 A1
4/20/2017 2 10 100 A1
4/21/2017 2 30 121
My code
proc sql;
create table items2 as select
count(id) as id_count,
(case when code='121' then rate/10 else 0 end) as rate_count
from items
group by date,id;
quit;
This has not yielded the desired result and the challenge I have here is to check the transaction dates (4/14/2017 and 4/20/2017) and next business day dates (4/17/2017,4/21/2017).
Appreciate your help.
The LAG function will do the trick here, since we can use the lagged values to build the condition we want without needing an explicit rate > 5 check.
Here is the solution:
Data items;
    set items;
    Lag_dt=LAG(Date);
    Lag_id=LAG(id);
    Lag_rate=LAG(rate);
    if ((id=lag_id) and (code=121) and (Date>lag_dt)) then rate_count=(rate/lag_rate);
    else rate_count=0;
    Drop lag_dt lag_id lag_rate;
run;
Hope this helps.
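
As a cross-check (a Python sketch, not SAS), the lagged comparison reproduces the expected instance counts on the sample data, given the answer's assumption that each rate > 5 instance is paid at a flat $10:

```python
from datetime import datetime

def d(s):
    return datetime.strptime(s, "%m/%d/%Y")

# (Date, id, rate, code) rows from the question's dataset
rows = [
    ("4/14/2017", 1, 12, 100), ("4/14/2017", 1, 2, 101),
    ("4/14/2017", 1, 13, 101), ("4/14/2017", 1, 10, 100),
    ("4/14/2017", 1, 10, 100), ("4/17/2017", 1, 40, 121),
    ("4/20/2017", 2, 12, 100), ("4/20/2017", 2, 2, 101),
    ("4/20/2017", 2, 3, 101), ("4/20/2017", 2, 10, 100),
    ("4/20/2017", 2, 10, 100), ("4/21/2017", 2, 30, 121),
]

results = {}
for prev, cur in zip(rows, rows[1:]):
    date, rec_id, rate, code = cur
    # Mirror the SAS condition: same id, payment code 121, later date
    if rec_id == prev[1] and code == 121 and d(date) > d(prev[0]):
        results[rec_id] = rate / prev[2]   # rate / lag(rate)

print(results)  # {1: 4.0, 2: 3.0}
```

Id 1 has four rate > 5 rows on 4/14 and 40/10 = 4; id 2 has three on 4/20 and 30/10 = 3, so the lag-based count lines up with the payments.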

Postgresql Query for display of records every 45 days

I have a table that has data of user_id and the timestamp they joined.
If I need to display the data month-wise I could just use:
select
count(user_id),
date_trunc('month',(to_timestamp(users.timestamp))::timestamp)::date
from
users
group by 2
The date_trunc code allows to use 'second', 'day', 'week' etc. Hence I could get data grouped by such periods.
How do I get data grouped by "n-day" period say 45 days ?
Basically I need to display number users per 45 day period.
Any suggestion or guidance appreciated!
Currently I get:
Date Users
2015-03-01 47
2015-04-01 72
2015-05-01 123
2015-06-01 132
2015-07-01 136
2015-08-01 166
2015-09-01 129
2015-10-01 189
I would like the data to come in 45-day intervals. Something like:
Date Users
2015-03-01 85
2015-04-15 157
2015-05-30 192
2015-07-14 229
2015-08-28 210
2015-10-12 294
UPDATE:
I used the following to get the output, but one problem remains. I'm getting values that are offset.
with
new_window as (
select
generate_series as cohort
, lag(generate_series, 1) over () as cohort_lag
from
(
select
*
from
generate_series('2015-03-01'::date, '2016-01-01', '45 day')
)
t
)
select
--cohort
cohort_lag -- This worked. !!!
, count(*)
from
new_window
join users on
user_timestamp <= cohort
and user_timestamp > cohort_lag
group by 1
order by 1
But the output I am getting is:
Date Users
2015-04-15 85
2015-05-30 157
2015-07-14 193
2015-08-28 225
2015-10-12 210
Basically The users displayed at 2015-03-01 should be the users between 2015-03-01 and 2015-04-15 and so on.
But I seem to be getting values of users upto a date. ie: upto 2015-04-15 users 85. which is not the results I want.
Any help here ?
Try this query :
SELECT to_char(i::date,'YYYY-MM-DD') as date, 0 as users
FROM generate_series('2015-03-01', '2015-11-30','45 day'::interval) as i;
OUTPUT :
date users
2015-03-01 0
2015-04-15 0
2015-05-30 0
2015-07-14 0
2015-08-28 0
2015-10-12 0
2015-11-26 0
This looks like a hot mess, and it might be better wrapped in a function where you could use some variables, but would something like this work?
with number_of_intervals as (
select
min (timestamp)::date as first_date,
ceiling (extract (day from max (timestamp) - min (timestamp)) / 45)::int as num
from users
),
intervals as (
select
generate_series(0, num - 1, 1) int_start,
generate_series(1, num, 1) int_end
from number_of_intervals
),
date_spans as (
select
n.first_date + 45 * i.int_start as interval_start,
n.first_date + 45 * i.int_end as interval_end
from
number_of_intervals n
cross join intervals i
)
select
d.interval_start, count (*) as user_count
from
users u
join date_spans d on
u.timestamp >= d.interval_start and
u.timestamp < d.interval_end
group by
d.interval_start
order by
d.interval_start
With this sample data:
User Id timestamp derived range count
1 3/1/2015 3/1-4/15
2 3/26/2015 "
3 4/4/2015 "
4 4/6/2015 " (4)
5 5/6/2015 4/16-5/30
6 5/19/2015 " (2)
7 6/16/2015 5/31-7/14
8 6/27/2015 "
9 7/9/2015 " (3)
10 7/15/2015 7/15-8/28
11 8/8/2015 "
12 8/9/2015 "
13 8/22/2015 "
14 8/27/2015 " (5)
Here is the output:
2015-03-01 4
2015-04-15 2
2015-05-30 3
2015-07-14 5
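
The bucketing logic (window start = first date plus 45-day multiples, each signup assigned to the window it falls in) can be cross-checked in Python with the same sample data:

```python
from datetime import date, timedelta

# The 14 sample signup timestamps from the answer's test data
signups = [date(2015, 3, 1), date(2015, 3, 26), date(2015, 4, 4),
           date(2015, 4, 6), date(2015, 5, 6), date(2015, 5, 19),
           date(2015, 6, 16), date(2015, 6, 27), date(2015, 7, 9),
           date(2015, 7, 15), date(2015, 8, 8), date(2015, 8, 9),
           date(2015, 8, 22), date(2015, 8, 27)]

first = min(signups)
counts = {}
for ts in signups:
    # Index of the 45-day window this signup falls into
    bucket = (ts - first).days // 45
    start = first + timedelta(days=45 * bucket)
    counts[start] = counts.get(start, 0) + 1

for start in sorted(counts):
    print(start, counts[start])
```

This prints 2015-03-01 4, 2015-04-15 2, 2015-05-30 3, 2015-07-14 5, matching the query output above, with each count attributed to the start of its window rather than the end.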

Hive Calculation of Percentage

I am trying to write a simple code to calculate the percentage of occurrence of distinct instances in a table.
Can I do this in one go?
Below is my code which is giving me error.
select 100 * total_sum/sum(total_sum) from jav_test;
In the past when I have had to do similar things this is the approach I've taken:
SELECT
jav_test.total_sum AS total_sum,
withsum.total_sum AS sum_of_all_total_sum,
100 * (jav_test.total_sum / withsum.total_sum) AS percentage
FROM
jav_test,
(SELECT sum(total_sum) AS total_sum FROM jav_test) withsum -- This computes sum(total_sum) here as a single-row single-column table aliased as "withsum"
;
The presence of the total_sum and sum_of_all_total_sum columns in the output is just to convince myself that the correct math took place - the one you are interested in is percentage, based on the query you posted in the question.
After populating a small dummy table, this was the result:
hive> describe jav_test;
OK
total_sum int
Time taken: 1.777 seconds, Fetched: 1 row(s)
hive> select * from jav_test;
OK
28
28
90113
90113
323694
323694
Time taken: 0.797 seconds, Fetched: 6 row(s)
hive> SELECT
> jav_test.total_sum AS total_sum,
> withsum.total_sum AS sum_of_all_total_sum,
> 100 * (jav_test.total_sum / withsum.total_sum) AS percentage
> FROM jav_test, (SELECT sum(total_sum) AS total_sum FROM jav_test) withsum;
...
... lots of mapreduce-related spam here
...
Total MapReduce CPU Time Spent: 3 seconds 370 msec
OK
28 827670 0.003382990805514275
28 827670 0.003382990805514275
90113 827670 10.887551802046708
90113 827670 10.887551802046708
323694 827670 39.10906520714777
323694 827670 39.10906520714777
Time taken: 41.257 seconds, Fetched: 6 row(s)
hive>
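
The arithmetic the query performs is easy to verify outside Hive. In Python, with the same six values:

```python
# The six total_sum values from the dummy table
values = [28, 28, 90113, 90113, 323694, 323694]

total = sum(values)  # the single-row subquery: sum(total_sum) = 827670
percentages = [100 * v / total for v in values]

for v, p in zip(values, percentages):
    print(v, total, p)
```

The percentages agree with the Hive output (28 → ~0.00338%, 90113 → ~10.89%, 323694 → ~39.11%), and they sum to 100 across all rows.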

How to subtract seconds from postgres datetime without having to add it in group by clause?

Say I have a column of type timestamp with the value "2014-04-14 12:17:55.772" and I need to subtract 2 seconds from it to get output like "12:17:53".
userid EndDate seconds
--------------------------------------------------------
1 "2014-04-14 12:17:14.295" 512
1 "2014-04-14 12:31:14.295" 12
2 "2014-04-14 12:48:14.295" 2
2 "2014-04-14 13:22:14.295" 12
and the query is:
select (enddate::timestamp - (seconds* interval '1 second')) seconds, userid
from user
group by userid
Now I need to group by userid only, but enddate and seconds appear in the select list, so Postgres asks me to add them to the group by clause, which will not give me the correct output.
I am expecting data in this format, where I need to calculate start_time from end_time and the total seconds spent.
user : 1
start_time end_time total (seconds)
"12:17" "12:17" 1
"12:22" "12:31" 512
total: 513
user : 2
"12:43" "12:48" 288
"13:22" "13:22" 1
total 289
Is there some way I could avoid the group by clause here?
Like #IMSoP says, you can use a window function to include a total for each user in your query output:
SELECT userid
, (enddate - (seconds * interval '1 second')) as start_time
, enddate as end_time
, seconds
, sum(seconds) OVER (PARTITION BY userid) as total
FROM so23063314.user;
Then you would only display the parts of the row you're interested in for each subtotal line, and display the total at the end of each block.
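
What the window function does (each row keeps its own start/end columns while the per-user total repeats on every row of that partition) can be sketched in Python with the sample rows:

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (userid, enddate, seconds) rows from the question
rows = [
    (1, datetime(2014, 4, 14, 12, 17, 14), 512),
    (1, datetime(2014, 4, 14, 12, 31, 14), 12),
    (2, datetime(2014, 4, 14, 12, 48, 14), 2),
    (2, datetime(2014, 4, 14, 13, 22, 14), 12),
]

# Equivalent of sum(seconds) OVER (PARTITION BY userid):
# compute the total per user first...
totals = defaultdict(int)
for userid, enddate, seconds in rows:
    totals[userid] += seconds

# ...then emit every row unchanged, with its partition total attached
for userid, enddate, seconds in rows:
    start = enddate - timedelta(seconds=seconds)
    print(userid, start.time(), enddate.time(), seconds, totals[userid])
```

Unlike GROUP BY, no rows are collapsed: every detail row survives, which is why enddate and seconds can stay in the select list.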