How to find the average of 3 weeks

I am trying to get the average rate in total for weeks (43,44,45) and (47,48,49) and compare to week 46.
So I am looking for one average rate (in total) for weeks 43-45, then again (47-49). How can I do this?
SELECT
week, code, count(case when on_time = '1' then 1 else null end) * 1.0 / count(*) as arrived
FROM table
where Week in ('43','44','45','46','47','48', '49') and code in ('GAL',
'DRA');
What should I add to get the average arrived rate for each of the two buckets of weeks listed above?

You can group by a week bucket, treating week 46 as a single-item group. Then compute the on-time rate of each group by turning on_time into a 0/1 flag, summing it, and dividing by the row count:
SELECT
    CASE
        WHEN week IN ('43', '44', '45') THEN 'w1'
        WHEN week IN ('47', '48', '49') THEN 'w2'
        ELSE '46'
    END AS week_group,
    code,
    -- 1.0 forces decimal division; count(*) counts every row in the group
    sum(CASE WHEN on_time = '1' THEN 1.0 ELSE 0 END) / count(*) AS on_time_rate
FROM
    TABLE
WHERE
    week IN ('43', '44', '45', '46', '47', '48', '49')
    AND code IN ('GAL', 'DRA')
GROUP BY
    CASE
        WHEN week IN ('43', '44', '45') THEN 'w1'
        WHEN week IN ('47', '48', '49') THEN 'w2'
        ELSE '46'
    END,
    code; -- code is selected, so it has to be grouped as well
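The bucketing logic can also be checked outside the database. The following is a minimal Python sketch, not part of the original answer; the function names and the sample rows are made up for illustration, and the rows are assumed to be already filtered to weeks 43-49.

```python
from collections import defaultdict


def week_group(week: str) -> str:
    """Map a week number to its bucket; anything else falls into '46'."""
    if week in {"43", "44", "45"}:
        return "w1"
    if week in {"47", "48", "49"}:
        return "w2"
    return "46"


def on_time_rates(rows):
    """rows: iterable of (week, code, on_time) tuples holding strings."""
    totals = defaultdict(lambda: [0, 0])  # key -> [on_time_count, row_count]
    for week, code, on_time in rows:
        key = (week_group(week), code)
        totals[key][0] += 1 if on_time == "1" else 0
        totals[key][1] += 1
    # True division, so the rate is not truncated to an integer
    return {key: hits / count for key, (hits, count) in totals.items()}


# Hypothetical sample data
rows = [
    ("43", "GAL", "1"), ("44", "GAL", "0"), ("45", "GAL", "1"), ("45", "GAL", "1"),
    ("46", "GAL", "0"),
    ("47", "GAL", "1"), ("48", "GAL", "0"),
]
rates = on_time_rates(rows)
print(rates)
```

Each bucket gets one rate computed over all of its rows, which is different from averaging three separate weekly rates when the weeks have different row counts.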

Related

Count distinct dates between two timestamps

I want to compute the percentage of days on which a user was active. A query like this
select
a.id,
a.created_at,
CURRENT_DATE - a.created_at::date as days_since_registration,
NOW() as current_d
from public.accounts a where a.id = 3257
returns
id   | created_at              | days_since_registration | current_d                    | tot_active
3257 | 2022-04-01 22:59:00.000 | 1                       | 2022-04-02 12:00:0.000 +0400 | 2
The person registered less than 24 hours ago, but there are two distinct dates between the registration and now. Hence, if a user was active one hour before midnight and one hour after midnight, they were active on two days within less than one day since registration (active on 200% of days).
What is the right way to count distinct dates and get 2 for a user, who registered at 23:00:00 two hours ago?
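WITH cte as (
SELECT 42 as userID,'2022-04-01 23:00:00'::timestamp as d
union
SELECT 42,'2022-04-02 01:00:00'::timestamp as d
)
SELECT
userID,
count(d),
max(d)::date-min(d)::date+1 as NrOfDays,
-- count distinct active dates; multiply by 100 before dividing so integer division does not truncate to 0
count(DISTINCT d::date)*100/(max(d)::date-min(d)::date+1) as PercentageOnline
FROM cte
GROUP BY userID;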
output:
userid | count | nrofdays | percentageonline
42     | 2     | 2        | 100
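The same idea, sketched in Python as an illustration only (the function name is hypothetical): the number of days in the span is the date difference plus one, and the number of active days is the count of distinct calendar dates, not the count of events.

```python
from datetime import datetime


def active_day_stats(timestamps):
    """Return (distinct active dates, days in span, percentage of days active)."""
    days = {ts.date() for ts in timestamps}       # distinct calendar dates
    span = (max(days) - min(days)).days + 1       # inclusive day count
    return len(days), span, 100 * len(days) / span


# Two events one hour either side of midnight, as in the question
events = [datetime(2022, 4, 1, 23, 0), datetime(2022, 4, 2, 1, 0)]
stats = active_day_stats(events)
print(stats)
```

Counting distinct dates keeps the percentage at or below 100 even when a user has several events on the same day.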

Grouping data by quarter intervals (or any time interval) with a defined starting basis in postgresql

Let's say I have a table orders with columns amount and order_date.
I want to be able to group this data by quarters and aggregate the amount, the catch however is that the quarters do not start on January 1st but on any given arbitrary date, say July 12th. These quarters are also split in 13 week intervals. From what I see using something like date_trunc such as:
SELECT SUM(orders.amount), DATE_TRUNC('quarter', orders.order_date) AS interval FROM orders WHERE orders.order_date BETWEEN [date_start] AND [date_end] GROUP BY interval
is out of the question as this forces quarters to start on Jan 1st and it has 'hardcoded' quarter starting dates (Apr 1st, Jul 1st, etc).
I have tried using something like:
SELECT SUM(orders.amount),
to_timestamp(floor(extract('epoch' from orders.order_date) / 7862400) * 7862400) AT TIME ZONE 'UTC' AS interval
FROM orders
WHERE orders.order_date BETWEEN [date_start] AND [date_end]
GROUP BY interval
(where 7862400 is the time interval that I want)
But with this method I cannot figure out how to set the offset for the initial grouping date, in my example I would like it to start from July 12th of each year (then count 13 weeks and start the next quarter, and so on). Hope I was clear and I would appreciate any help!
You can use generate_series() to create the first day of each quarter, join it and group by it.
SELECT quarters.first_day,
quarters.first_day + '13 weeks'::interval last_day,
sum(orders.amount) amount
FROM orders
LEFT JOIN generate_series('2019-07-12'::timestamp,
'2020-07-10'::timestamp,
'13 weeks'::interval) quarters (first_day)
ON quarters.first_day <= orders.order_date
AND quarters.first_day + '13 weeks'::interval > orders.order_date
WHERE orders.order_date BETWEEN [date_start]
AND [date_end]
GROUP BY quarters.first_day,
quarters.first_day + '13 weeks'::interval;
You just need to make sure that the boundary days you give generate_series() cover the whole period you want to query, so they depend on your [date_start] and [date_end].
You can generate your own 'quarterly calendar' and use that in place of the Postgres 'quarter' date extraction.
create or replace function quarterly_calendar(annual_date text default extract('YEAR' from current_date)::text)
returns table( quarter integer
, quarter_start_date date
, quarter_end_date date
)
language sql stable strict
as $$
with RECURSIVE quarters as
(select 1 qtr, qdt::date q_start_dt, (qdt + interval '90 day' )::date q_end_dt, (qdt+interval '1 year' - interval '1 day')::date last_dt
from ( select date_trunc('year',current_date) + interval '6 month 11 day' qdt) q
union all
select qtr+1, (q_end_dt + interval '1 day')::date, least ((q_end_dt + interval '91 day')::date,last_dt), last_dt
from quarters
where qtr+1 <=5
)
select qtr, q_start_dt, q_end_dt
from quarters;
$$;
-- test
select * from quarterly_calendar();
This actually creates 5 quarters, because a year is not a multiple of 13 weeks (91 days, or 7862400 seconds). Your example year, 12-July-2019 through 11-July-2020, has 366 days, which is 2 days more than 4 times that interval. A short 5th quarter of 1 or 2 days occurs every year, and you'll have to decide how to handle it.
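The arithmetic behind the 5th quarter can be reproduced with a small Python sketch. This is an illustration under assumptions, not the function above: the name quarterly_calendar is reused only for readability, the fiscal year is assumed to start on the given date, and a 29 February start date is not handled.

```python
from datetime import date, timedelta


def quarterly_calendar(year_start: date, days_per_quarter: int = 91):
    """Split one year starting at year_start into 13-week quarters."""
    # Last day of the fiscal year (assumes year_start is not 29 February)
    year_end = year_start.replace(year=year_start.year + 1) - timedelta(days=1)
    quarters = []
    start = year_start
    number = 1
    while start <= year_end:
        # The final quarter is cut off at the end of the year
        end = min(start + timedelta(days=days_per_quarter - 1), year_end)
        quarters.append((number, start, end))
        start = end + timedelta(days=1)
        number += 1
    return quarters


calendar = quarterly_calendar(date(2019, 7, 12))
for number, start, end in calendar:
    print(number, start, end, (end - start).days + 1)
```

For the year starting 12 July 2019 this yields four 91-day quarters followed by a 2-day remainder starting on 10 July 2020.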

Add X days to a Received Date but Exclude Weekends/Holidays from a Date Table

I hope someone can help with a calculation that I am having trouble developing.
I am developing a report against a DB2 database in which I need to add "X" days to a "RECEIVED" date/time when an order comes in between X and Y, while skipping weekends and holidays. I have created a [TBLCALENDAR] table that lists the weekends and holidays (example below), and I want to use it to add X days and produce a "DUEDATE".
[tblCalendar]
DATE       DAYOFWK  DAY       HOLIDAY
1/19/2019  7        Saturday
1/20/2019  1        Sunday
1/21/2019  2        Monday    YES
So, for example 1, if I have an order that is placed on 1/18/2019 at 4:01pm; the due date should be 1/23/2019 at 11:00am.
Example 2: if I have an order that is placed on 1/18/2019 at
Conditions are:
Previous Date 4:01pm to Current Date 11:00am = Due Date should be + "X" business days by 11:00am
If order received Current day by 4:00pm = Due Date should be + "X" business days by 4:00pm
I have tried to reference the tblCalendar to get the [Received] date/time and add X number of days based off of an order, but it's not functioning the way I have hoped.
I have used the following code...but it doesn't exclude Weekends or Holidays when adding the specified number of days or have my order time requirement to take into account previous day after 4:00pm to current date of 11:00am:
RECEIVEDDATETIME + 2 days as DUEDATE;
I have also used the below code to reference TBLCALENDAR to find the # of holidays and weekend days in a date range:
( SELECT COUNT (*) FROM TBLCALENDAR AS C WHERE C.HOLIDAY = 'YES'
AND C.DATE BETWEEN TBLORDERS.RECEIVEDDATETIME
AND TBLORDERS.DUEDATETIME) +
(SELECT COUNT (*) FROM TBLCALENDAR
WHERE DAYOFWK IN (1,7)
AND DATE BETWEEN TBLORDERS.RECEIVEDDATETIME
AND TBLORDERS.UPLOADTIME) AS NONWORKINGDAYS
Expected field output
If order was received between 1/17/2019 4:01pm to 1/18/2019 10:59am = 1/23/2019 11:00am
If order received Current day by 4:00pm 1/18/2019 3:59am= 1/23/2019 by 4:00pm.
RECEIVEDDATETIME   DUEDATE
1/17/2019 4:01pm   1/23/2019 11:00am
1/18/2019 10:00am  1/23/2019 4:00pm
Here is a solution without the time logic.
with tblCalendar(DATE, DAYOFWK, DAY, HOLIDAY) as (values
(date('2019-01-19'), 7, 'Saturday', '')
, (date('2019-01-20'), 1, 'Sunday', '')
, (date('2019-01-21'), 2, 'Monday', 'YES')
, (date('2019-01-22'), 3, 'Tuesday', '')
, (date('2019-01-23'), 4, 'Wednesday', 'YES')
, (date('2019-01-24'), 5, 'Thursday', '')
, (date('2019-01-25'), 6, 'Friday', '')
, (date('2019-01-26'), 7, 'Saturday', '')
)
, mytab (RECEIVEDDATE, DAYS2ADD) as (values
(date('2019-01-19'), 2)
, (date('2019-01-20'), 2)
, (date('2019-01-21'), 2)
, (date('2019-01-22'), 2)
)
select m.*, t.date as DUEDATE
--, dayofweek(date) as DAYOFWK, dayname(date) as DAY
from mytab m
, table
(
select date
from table
(
select
date
, sum(case when HOLIDAY='YES' or dayofweek(date) in (7,1) then 0 else 1 end) over (order by date) as dn_
from tblCalendar t
where t.date > m.RECEIVEDDATE
)
where dn_ = m.DAYS2ADD
order by date
fetch first 1 row only
) t;
The idea is to number each calendar day after RECEIVEDDATE (the first parameter) with a running count that increases by 1 only on days that are neither holidays nor weekend days (the sum(...) over(...) expression).
Finally, we select the earliest date whose running count equals the number of days to add (the second parameter).
Solution idea:
Your tblCalendar is a good idea, but I recommend adding working-day information instead of (only) flagging the holidays and weekends. The problem with counting "off days" is that, once you have figured out how many of them fall between your receive date and the receive date + X days, you cannot simply add them on, because the extended period may contain further "off days".
By numbering all the work days, you can identify the work day that is closest to the receive date (equal or later). Retrieve its number and add the X days to that number. Then retrieve the date that has this work day number and you are done.
The time logic should be applied before all of that, because it could add another day to the X days.
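As a rough illustration of the date part only, here is a minimal Python sketch. The function name and the holiday set are assumptions; the 11:00am/4:00pm cutoff logic from the question is deliberately left out.

```python
from datetime import date, timedelta


def add_business_days(received: date, days_to_add: int, holidays: set) -> date:
    """Walk forward one day at a time, counting only working days."""
    current = received
    remaining = days_to_add
    while remaining > 0:
        current += timedelta(days=1)
        is_weekend = current.weekday() >= 5  # Monday=0 ... Saturday=5, Sunday=6
        if not is_weekend and current not in holidays:
            remaining -= 1
    return current


# Holiday taken from the question's calendar sample (Monday 1/21/2019)
holidays = {date(2019, 1, 21)}
due = add_business_days(date(2019, 1, 18), 2, holidays)
print(due)
```

With an order received on Friday 1/18/2019 and X = 2, the weekend and the Monday holiday are skipped, so the due date lands on Wednesday 1/23/2019, which matches the date in the question's first example.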

Group every N days

I've been searching for a way to do this in PostgreSQL, but haven't found one yet.
Let's suppose I have the following table:
id  order   amount  created_at
2   527837  10.0    2014-12-01T...
3   527838  50.0    2014-12-02T...
4   527839  30.0    2014-12-02T...
5   527840  40.0    2014-12-10T...
6   527841  80.0    2014-12-13T...
I want a query that returns the sum of all amounts for each full week of 7 days (even if some days had no orders):
Example:
week             total_amount
Dec/01 - Dec/07  90.0
Dec/08 - Dec/15  120.0
Dec/16 - Dec/23  0.0
//...and so on until current date
Also, let's suppose that January has 30 days and February has 28 days. I want the weeks to be grouped like this:
Jan/01-Jan/08
Jan/09-Jan/16
Jan/17-Jan/24
Jan/25-Feb/02 (there's no problem with crossing months)
Feb/03-Feb/10
What is the best way to do this?
EDIT 1:
I have found a way to build a query that generates a temporary table with the days I need for my grouping; I am just having difficulty grouping it and joining it with my original table...
(SELECT TO_CHAR(generate_series, 'YYYY-MM-DD') as "day" FROM generate_series('2013-06-01 00:00'::timestamp,
'2015-06-01 00:00'::timestamp, '1 Day'))
select date_trunc('week', created_At),date_trunc('week', created_At)+ INTERVAL '6' DAY,
SUM(amount)
from t
GROUP BY date_trunc('week', created_At)
ORDER BY MIN(created_At);
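Note that date_trunc('week', ...) in PostgreSQL truncates to the Monday of the week, and a plain GROUP BY returns no row for a week without orders. As an illustration of fixed N-day buckets counted from an arbitrary first day, including empty ones, here is a minimal Python sketch; the function name and parameters are hypothetical, and the sample rows are taken from the question's table.

```python
from datetime import date, timedelta


def sum_by_fixed_buckets(orders, first_day, last_day, bucket_days=7):
    """Sum amounts into consecutive bucket_days-long buckets from first_day."""
    bucket_count = (last_day - first_day).days // bucket_days + 1
    totals = [0.0] * bucket_count  # empty buckets stay at 0.0
    for created, amount in orders:
        if first_day <= created <= last_day:
            totals[(created - first_day).days // bucket_days] += amount
    result = []
    for index, total in enumerate(totals):
        start = first_day + timedelta(days=index * bucket_days)
        end = start + timedelta(days=bucket_days - 1)
        result.append((start, end, total))
    return result


orders = [
    (date(2014, 12, 1), 10.0), (date(2014, 12, 2), 50.0), (date(2014, 12, 2), 30.0),
    (date(2014, 12, 10), 40.0), (date(2014, 12, 13), 80.0),
]
weekly = sum_by_fixed_buckets(orders, date(2014, 12, 1), date(2014, 12, 21))
for start, end, total in weekly:
    print(start, end, total)
```

The bucket index is the integer division of the day offset by the bucket length, which is the same offset trick the epoch-based SQL approaches rely on.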

CASE expressions with MAX aggregate functions Oracle

Using Oracle, I have selected each title_id with its associated month of publication with:
SELECT title_id,
CASE EXTRACT(month FROM pubdate)
WHEN 1 THEN 'Jan'
WHEN 2 THEN 'Feb'
WHEN 3 THEN 'Mar'
WHEN 4 THEN 'Apr'
WHEN 5 THEN 'May'
WHEN 6 THEN 'Jun'
WHEN 7 THEN 'Jul'
WHEN 8 THEN 'Aug'
WHEN 9 THEN 'Sep'
WHEN 10 THEN 'Oct'
WHEN 11 THEN 'Nov'
ELSE 'Dec'
END MONTH
FROM TITLES;
Using the statement:
SELECT MAX(Most_Titles)
FROM (SELECT count(title_id) Most_Titles, month
FROM (SELECT title_id, extract(month FROM pubdate) AS MONTH FROM titles) GROUP BY month);
I was able to determine the month with the maximum number of books published.
Is there a way to join the two statements so that I can associate the month's text equivalent with the maximum number of titles?
In order to convert a month to a string, I wouldn't use a CASE statement, I'd just use a TO_CHAR. And you can use analytic functions to rank the results to get the month with the most books published.
SELECT num_titles,
to_char( publication_month, 'Mon' ) month_str
FROM (SELECT count(title_id) num_titles,
trunc(pubdate, 'MM') publication_month,
rank() over (order by count(title_id) desc) rnk
FROM titles
GROUP BY trunc(pubdate, 'MM'))
WHERE rnk = 1
A couple of additional caveats:
If there are two months that are tied with the most publications, this query will return both rows. If you want Oracle to arbitrarily pick one, you can use the row_number analytic function rather than rank.
If the PUBDATE column in your table only has dates of midnight on the first of the month where the book is published, you can eliminate the trunc on the PUBDATE column.
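The tie behavior described in the first caveat can be illustrated with a small Python sketch. It is only an analogy to the query, with a hypothetical function name and made-up dates. Because trunc(pubdate, 'MM') keeps the year, the sketch groups by (year, month) rather than by month alone, and it assumes at least one publication date.

```python
from collections import Counter
from datetime import date


def busiest_months(pubdates):
    """Return every (year, month) tied for the most titles, like rank() = 1."""
    counts = Counter((d.year, d.month) for d in pubdates)
    top = max(counts.values())
    return sorted((ym, n) for ym, n in counts.items() if n == top)


# Hypothetical publication dates with a tie between January and March 2020
pubdates = [
    date(2020, 1, 5), date(2020, 1, 20),
    date(2020, 3, 1), date(2020, 3, 15),
    date(2020, 7, 4),
]
winners = busiest_months(pubdates)
print(winners)      # all tied months, as rank() would return
print(winners[:1])  # a single row, similar in spirit to row_number()
```

Keeping all tied rows or only one is a reporting decision; the sketch simply makes both outcomes visible.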