Group every N days - postgresql

I've been searching for a way to do this in PostgreSQL, but haven't found one yet.
Let's suppose I have the following table:
id order amount created_at
2 527837 10.0 2014-12-01T...
3 527838 50.0 2014-12-02T...
4 527839 30.0 2014-12-02T...
5 527840 40.0 2014-12-10T...
6 527841 80.0 2014-12-13T...
I want a query that returns the sum of all amounts for each full week of 7 days (even if some days had no orders):
Example:
week total_amount
Dec/01 - Dec/07 90.0
Dec/08 - Dec/15 120.0
Dec/16 - Dec/23 0.0
//...and so on until current date
Also, let's suppose that January has 30 days and February has 28 days. I want the weeks to be grouped like this:
Jan/01-Jan/08
Jan/09-Jan/16
Jan/17-Jan/24
Jan/25-Feb/02 (there's no problem with crossing months)
Feb/03-Feb/10
What is the best way to do this?
EDIT 1:
I have found a way to build a query that generates a temporary table with the days I need for my grouping; I am just having difficulty grouping them and joining the result with my original table...
(SELECT TO_CHAR(generate_series, 'YYYY-MM-DD') as "day" FROM generate_series('2013-06-01 00:00'::timestamp,
'2015-06-01 00:00'::timestamp, '1 Day'))

select date_trunc('week', created_At),date_trunc('week', created_At)+ INTERVAL '6' DAY,
SUM(amount)
from t
GROUP BY date_trunc('week', created_At)
ORDER BY MIN(created_At);
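Note that date_trunc('week', ...) snaps to ISO weeks starting on Monday and only returns weeks that contain rows. The grouping the question asks for (fixed N-day windows counted from a chosen start date, with empty windows reported as zero) can be sketched outside the database. This is a minimal Python sketch, not a PostgreSQL solution: the function name bucket_totals and the anchor and end dates are assumptions made for illustration, and the rows are the sample data from the question.

```python
from datetime import date, timedelta

def bucket_totals(rows, anchor, end, n_days=7):
    """Sum amounts into consecutive n_days-wide windows starting at anchor.

    Windows that contain no rows are kept with a total of 0.0.
    """
    n_buckets = (end - anchor).days // n_days + 1
    totals = [0.0] * n_buckets
    for created, amount in rows:
        idx = (created - anchor).days // n_days  # window number of this row
        if 0 <= idx < n_buckets:                 # skip rows outside the range
            totals[idx] += amount
    result = []
    for i, total in enumerate(totals):
        start = anchor + timedelta(days=i * n_days)
        result.append((start, start + timedelta(days=n_days - 1), total))
    return result

# Sample rows from the question: (created_at date, amount)
rows = [
    (date(2014, 12, 1), 10.0),
    (date(2014, 12, 2), 50.0),
    (date(2014, 12, 2), 30.0),
    (date(2014, 12, 10), 40.0),
    (date(2014, 12, 13), 80.0),
]
weeks = bucket_totals(rows, anchor=date(2014, 12, 1), end=date(2014, 12, 21))
for start, stop, total in weeks:
    print(start, stop, total)
```

Changing n_days to 8 gives the 8-day grouping from the January/February example, since the window index is just the whole number of n_days periods elapsed since the anchor.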

Related

Count distinct dates between two timestamps

I want to count %days when a user was active. A query like this
select
a.id,
a.created_at,
CURRENT_DATE - a.created_at::date as days_since_registration,
NOW() as current_d
from public.accounts a where a.id = 3257
returns
id created_at days_since_registration current_d tot_active
3257 2022-04-01 22:59:00.000 1 2022-04-02 12:00:0.000 +0400 2
The person registered less than 24 hours ago (less than a day ago), but there are two distinct dates between the registration and now. Hence, if a user was active one hour before midnight and one hour after midnight, they count as active on two days in less than a day (active 200% of days).
What is the right way to count distinct dates and get 2 for a user, who registered at 23:00:00 two hours ago?
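WITH cte AS (
SELECT 42 AS userID, '2022-04-01 23:00:00'::timestamp AS d
UNION
SELECT 42, '2022-04-02 01:00:00'::timestamp
)
SELECT
userID,
-- count distinct calendar dates, not rows
count(DISTINCT d::date),
max(d)::date - min(d)::date + 1 AS NrOfDays,
-- multiply before dividing so integer division does not truncate to 0
100 * count(DISTINCT d::date) / (max(d)::date - min(d)::date + 1) AS PercentageOnline
FROM cte
GROUP BY userID;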
output:
userid count nrofdays percentageonline
42 2 2 100
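The same calculation, counting distinct calendar dates rather than rows, can be sketched in Python to check the arithmetic. The function name active_day_stats is an assumption made for illustration; the two timestamps are the ones from the example above.

```python
from datetime import datetime

def active_day_stats(events):
    """Return (distinct active dates, calendar dates spanned, percentage)."""
    days = {ts.date() for ts in events}      # distinct calendar dates
    span = (max(days) - min(days)).days + 1  # inclusive span, like max::date - min::date + 1
    return len(days), span, 100 * len(days) / span

events = [datetime(2022, 4, 1, 23, 0), datetime(2022, 4, 2, 1, 0)]
stats = active_day_stats(events)
print(stats)
```

Because the denominator is the number of calendar dates spanned rather than elapsed 24-hour periods, the percentage cannot exceed 100.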

Grouping data by quarter intervals (or any time interval) with a defined starting basis in postgresql

Let's say I have a table orders with columns amount and order_date.
I want to be able to group this data by quarters and aggregate the amount, the catch however is that the quarters do not start on January 1st but on any given arbitrary date, say July 12th. These quarters are also split in 13 week intervals. From what I see using something like date_trunc such as:
SELECT SUM(orders.amount), DATE_TRUNC('quarter', orders.order_date) AS interval FROM orders WHERE orders.order_date BETWEEN [date_start] AND [date_end] GROUP BY interval
is out of the question as this forces quarters to start on Jan 1st and it has 'hardcoded' quarter starting dates (Apr 1st, Jul 1st, etc).
I have tried using something like:
SELECT SUM(orders.amount),
to_timestamp(floor(extract('epoch' from orders.order_date) / 7862400) * 7862400) AT TIME ZONE 'UTC' AS interval
FROM orders
WHERE orders.order_date BETWEEN [date_start] AND [date_end]
GROUP BY interval
(where 7862400 is the time interval that I want)
But with this method I cannot figure out how to set the offset for the initial grouping date, in my example I would like it to start from July 12th of each year (then count 13 weeks and start the next quarter, and so on). Hope I was clear and I would appreciate any help!
You can use generate_series() to create the first day of each quarter, join it and group by it.
SELECT quarters.first_day,
quarters.first_day + '13 weeks'::interval last_day,
sum(orders.amount) amount
FROM orders
LEFT JOIN generate_series('2019-07-12'::timestamp,
'2020-07-10'::timestamp,
'13 weeks'::interval) quarters (first_day)
ON quarters.first_day <= orders.order_date
AND quarters.first_day + '13 weeks'::interval > orders.order_date
WHERE orders.order_date BETWEEN [date_start]
AND [date_end]
GROUP BY quarters.first_day,
quarters.first_day + '13 weeks'::interval;
Just make sure that the boundary dates you pass to generate_series() cover the whole period you want to query; they depend on your [date_start] and [date_end].
You can generate your own 'quarterly calendar' and use that in place of the Postgres 'quarter' date extraction.
create or replace function quarterly_calendar(annual_date text default extract('YEAR' from current_date)::text)
returns table( quarter integer
, quarter_start_date date
, quarter_end_date date
)
language sql stable strict -- stable, not immutable: the body reads current_date
as $$
with RECURSIVE quarters as
(select 1 qtr, qdt::date q_start_dt, (qdt + interval '90 day' )::date q_end_dt, (qdt+interval '1 year' - interval '1 day')::date last_dt
from ( select date_trunc('year',current_date) + interval '6 month 11 day' qdt) q
union all
select qtr+1, (q_end_dt + interval '1 day')::date, least ((q_end_dt + interval '91 day')::date,last_dt), last_dt
from quarters
where qtr+1 <=5
)
select qtr, q_start_dt, q_end_dt
from quarters;
$$;
-- test
select * from quarterly_calendar();
It does actually create 5 quarters, because a year is not a multiple of 13 weeks (or 91 days, or 7862400 seconds). Your given year, from 12-July-2019 through 11-July-2020, has 366 days, which is 2 days more than 4 times that interval. You'll have to decide how to handle that 5th quarter. It occurs every year and has either 1 or 2 days.
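The arithmetic behind the leftover fifth period can be checked with a short Python sketch. It is only an illustration of the 91-day split described above, not a replacement for the SQL function; fiscal_quarters and quarter_index are hypothetical helper names, and the sketch assumes the start date is not 29 February.

```python
from datetime import date, timedelta

def fiscal_quarters(year_start, period_days=91):
    """Split one year beginning at year_start into 13-week periods plus a stub."""
    # Assumes year_start is not Feb 29 (replace() would raise in that case).
    year_end = year_start.replace(year=year_start.year + 1) - timedelta(days=1)
    quarters = []
    start = year_start
    while start <= year_end:
        end = min(start + timedelta(days=period_days - 1), year_end)
        quarters.append((start, end))
        start = end + timedelta(days=1)
    return quarters

def quarter_index(d, year_start, period_days=91):
    """Zero-based period number of d, counted from year_start."""
    return (d - year_start).days // period_days

quarters = fiscal_quarters(date(2019, 7, 12))
for number, (start, end) in enumerate(quarters, 1):
    print(number, start, end, (end - start).days + 1)
```

quarter_index shows the offset idea from the question: subtracting the chosen start date before dividing by the period length moves the grouping origin to July 12th.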

Postgresql - 24 hour rolling window

I am currently using Metabase to put together a live dashboard of some internal company metrics, and one of the things I am trying to query is a 24-hour rolling window of transactions on our mobile app. Metabase has a useful visualization tool called "Smart Number" which allows you to compare changes in values over a defined time period.
I am having trouble writing a query that outputs data in 24-hour intervals so I can compare the past 24 hours to the 24 hours before that. I have tried using the date_trunc function to divide the transactions by hour and then possibly limit the results to the last 24, but it doesn't print out hours that don't have transactions. I also tried using the filter function as seen in the code below, but the data needs to be transposed for "Smart Numbers" to work. Does anyone have any suggestions as to how I should approach this problem?
Example of one of my approaches:
SELECT
(DATE_TRUNC('hour', (reservations.created_at::timestamptz))) as hour,
SUM(reservations.covers) as total_covers
FROM reservations
JOIN restaurants on restaurants.id = reservations.restaurant_id
WHERE reservations.origin = 'mobile'
and restaurants.relationship_type in ('listing_only', 'difficult', 'ipad')
GROUP BY hour
ORDER BY hour desc
Which outputs something like this:
hour total_covers
"2019-02-19 15:00:00+00" 4
"2019-02-19 13:00:00+00" 15
"2019-02-19 12:00:00+00" 4
"2019-02-19 11:00:00+00" 4
"2019-02-19 10:00:00+00" 26
"2019-02-19 09:00:00+00" 5
"2019-02-19 08:00:00+00" 8
"2019-02-19 07:00:00+00" 12
"2019-02-19 03:00:00+00" 2
I would like to get something like this:
Time_Interval Total_Covers
24 Hours 389
48 Hours 254
72 hours 459
96 Hours 239
This query uses the date function to group results.
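SELECT
DATE(reservations.created_at) AS day,
SUM(reservations.covers) AS total_covers
FROM reservations
-- the join is needed because the WHERE clause filters on restaurants
JOIN restaurants ON restaurants.id = reservations.restaurant_id
WHERE
reservations.origin = 'mobile' AND
restaurants.relationship_type IN ('listing_only', 'difficult', 'ipad')
GROUP BY day
ORDER BY day DESC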
This query calculates the total covers from the time of the query, and groups them by the number of days ago.
SELECT
CEIL(EXTRACT(EPOCH FROM NOW() - reservations.created_at) / 86400) AS days_ago,
SUM(reservations.covers) AS total_covers
FROM reservations
-- the join is needed because the WHERE clause filters on restaurants
JOIN restaurants ON restaurants.id = reservations.restaurant_id
WHERE
reservations.origin = 'mobile' AND
restaurants.relationship_type IN ('listing_only', 'difficult', 'ipad')
GROUP BY days_ago
ORDER BY days_ago;
See it in action on rextester.
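The days_ago grouping above amounts to rounding each row's age up to a whole number of 24-hour windows. A minimal Python sketch of that bucketing follows; covers_by_window is a hypothetical name, and the reservation rows and the fixed "now" are made-up values for illustration, not data from the question.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

def covers_by_window(reservations, now, window_hours=24):
    """Total covers per rolling window, keyed by the window's upper bound in hours.

    Key 24 holds rows from the last 24 hours, key 48 the 24 hours before that, etc.
    """
    totals = defaultdict(int)
    window = timedelta(hours=window_hours)
    for created_at, covers in reservations:
        age = now - created_at
        if age < timedelta(0):
            continue  # ignore rows dated in the future
        windows_ago = max(1, math.ceil(age / window))  # age 0 still counts as window 1
        totals[windows_ago * window_hours] += covers
    return dict(sorted(totals.items()))

now = datetime(2019, 2, 19, 16, 0)
reservations = [  # (created_at, covers), illustrative only
    (datetime(2019, 2, 19, 15, 0), 4),
    (datetime(2019, 2, 19, 3, 0), 2),
    (datetime(2019, 2, 18, 10, 0), 7),
    (datetime(2019, 2, 16, 20, 0), 5),
]
totals = covers_by_window(reservations, now)
print(totals)
```

The keys line up with the "24 Hours", "48 Hours", ... labels in the desired output; windows with no rows are simply absent here and would need to be filled in separately.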
Here is an example of using a generated series for the from, and then left-joining the data you want to count. Idea being you want every hour to appear in the result, and every piece of data to be counted, but it is optional.
Note I just checked the query compiled; I don't have a mock schema on hand to test this.
SELECT
hours.x AS hour,
COALESCE(SUM(reservations.covers), 0) AS total_covers
FROM generate_series(date_trunc('hour', now() - INTERVAL '24 hours'), now(), INTERVAL '1 hour') AS hours(x)
-- The filters live in the ON clauses: a WHERE on the joined tables would
-- discard the hours that have no matching reservations.
LEFT JOIN (
reservations
JOIN restaurants ON restaurants.id = reservations.restaurant_id
AND restaurants.relationship_type IN ('listing_only', 'difficult', 'ipad')
) ON hours.x = date_trunc('hour', reservations.created_at::timestamptz)
AND reservations.origin = 'mobile'
GROUP BY hours.x
ORDER BY hours.x DESC

PostgreSQL: Get rows which date is 5 days old than payment_date

I need to send an email to all users whose payment due_date expired 5 days ago and whose status = 1 (pending payment), restricted to the current month and year, because the table may also contain future or past dates. Example:
due_date= 27/06/2018 send email after 5 days 1/05/2018
My query to grab all users who are 5 days past their due date is the following:
SELECT payments_payment.id, payments_payment.due_date
FROM payments_payment
WHERE payments_payment.due_date < NOW() - '5 day'::interval
AND payments_payment.status = 1
AND EXTRACT(year FROM payments_payment.due_date) = EXTRACT(year FROM NOW())
AND EXTRACT(month FROM payments_payment.due_date) = EXTRACT(month FROM NOW())
ORDER BY payments_payment.due_date ASC;
This needs a different approach because the condition is inverted: compute the difference between the two dates and check it against the day limit. Here is the query.
PostgreSQL Query:
SELECT due_date
FROM payments_payment
WHERE payments_payment.due_date + interval '5 day' < current_date
AND payments_payment.status = 1
Explanation
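This returns the due dates of all payments with status 1 whose due_date plus 5 days is earlier than the current date, i.e. payments that are more than 5 days overdue. To restrict the result to the current month and year, add the two EXTRACT conditions from the question's query.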
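The comparison in the WHERE clause can be checked on a few rows with a small Python sketch. The function name overdue_payments, the dictionary layout and the sample rows are assumptions made for illustration only.

```python
from datetime import date, timedelta

def overdue_payments(payments, today, grace_days=5, pending_status=1):
    """Return pending payments whose due date plus grace_days is before today."""
    return [
        p for p in payments
        if p["status"] == pending_status
        and p["due_date"] + timedelta(days=grace_days) < today
    ]

today = date(2018, 7, 3)
payments = [  # illustrative rows
    {"id": 1, "due_date": date(2018, 6, 27), "status": 1},  # 6 days overdue
    {"id": 2, "due_date": date(2018, 6, 28), "status": 1},  # exactly 5 days overdue
    {"id": 3, "due_date": date(2018, 6, 1), "status": 2},   # not pending
]
overdue_ids = [p["id"] for p in overdue_payments(payments, today)]
print(overdue_ids)
```

With the strict `<` comparison a payment is picked up only from the sixth day after its due date; using `<=` would include the fifth day as well.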

calculating last 5 years from current date in hive

I need to calculate a count over a given time frame:
the dates between the current date and 5 years before it.
select count(*) from table where (year(current_date) -year('2015-12-01')) < 5 ;
The above query gives counts for the last 5 years, but it considers only the year part; I need exact counts that take days into account. If I write
select count(*) from table where datediff(current_date,final_dt) <= 1825 ;
it won't account for any leap years in the last 5 years.
Is there any function in Hive that calculates the exact difference between two dates, taking leap years into account?
Use add_months function (assuming the dates should go back to 2013-05-25 with the current date being 2018-05-25).
select count(*)
from table
where final_dt >= add_months(current_date,-60) and final_dt <= current_date
I think you are trying to count all records between current_date and the date 5 years before current_date. In that case, you can do something like this (BETWEEN expects the earlier date first):
SELECT count(*) FROM table_1 WHERE date_column BETWEEN to_date(CONCAT(YEAR(current_date) - 5, '-', MONTH(current_date), '-', DAY(current_date))) AND current_date;
And SELECT datediff( current_date() ,to_date(CONCAT(YEAR(current_date) - 5, '-', MONTH(current_date), '-', DAY(current_date))));
gives you 1826 (considering the fact that 2016 is a leap year).
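The 1826 figure can be verified with a calendar-aware subtraction. Below is a minimal Python sketch of that check; years_back is a hypothetical helper, and falling back to 28 February when the start date is 29 February is an assumption of this sketch, not a statement about how Hive handles that case.

```python
from datetime import date

def years_back(d, years):
    """Same calendar day `years` earlier; Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:  # Feb 29 with a non-leap target year
        return d.replace(year=d.year - years, day=28)

today = date(2018, 5, 25)
start = years_back(today, 5)
days_between = (today - start).days
print(start, days_between)
```

The difference is 5 × 365 plus one extra day for 29 February 2016, which is why a fixed 1825-day cutoff misses one day of data.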