How can I always get the full period when grouping by week in PostgreSQL? - postgresql

I usually use the following syntax when analysing weekly data:
select week(creation_date)::date as week,
count(*) as n
from table_1
where creation_date > current_date - 30
group by 1
However, by doing this I get only part of the first week.
Is there a smart way to always get a whole week at the beginning?
That is, to start from the first day of the week that I would otherwise get only half of.

First off you need to define what you mean by "week". This is more difficult than it appears. While humans have an intuitive sense of a week, computers are just not that smart. There are 2 common conventions: the ISO-8601 Standard and, for lack of a better term, Traditional. ISO-8601 defines a week as always beginning on Monday and always containing 7 days. Traditional weeks begin on Sunday (usually) but may have weeks with fewer than 7 days. This results from having the 1st week of the year begin on 1-Jan regardless of the day of the week. Thus the 1st and/or last weeks may have fewer than 7 days. ISO-8601 throws its own curve into the mix: the 1st week of the year is the week containing 4-Jan. Thus the last days of Dec may be in week 1 of the next year and the first days of Jan may be in week 52/53 of the prior year.
All of the below assumes ISO-8601.
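To make the year-boundary behaviour concrete, here is a minimal Python sketch (Python is used only for illustration; the helper name `iso_year_week` is made up for this example). It relies on `date.isocalendar()`, which follows the ISO-8601 week rules described above.

```python
from datetime import date

def iso_year_week(d):
    """Return (ISO year, ISO week number) for a date."""
    iso = d.isocalendar()
    return iso[0], iso[1]

# 31-Dec-2018 is a Monday and already belongs to week 1 of 2019.
print(iso_year_week(date(2018, 12, 31)))  # (2019, 1)
# 1-Jan-2021 is a Friday and still belongs to week 53 of 2020.
print(iso_year_week(date(2021, 1, 1)))    # (2020, 53)
```

Note that the week number alone is ambiguous around New Year, which is why the ISO year is returned alongside it.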
Secondly, there is no week function in Postgres. Instead you need the extract function. So for this particular case:
select extract(week from creation_date)::integer as week, ...
Finally, your predicate (current_date - 30) means you will usually not begin on the 1st day of a week. To get the correct date, take that result back 1 week, then go forward to the next Monday.
with days_to_monday (day_adj) as
( values ('{7,6,5,4,3,2,1}'::int[]) )
select current_date - 30
, current_date - 30 - 7 + day_adj[extract (isodow from current_date - 30 )]
from table_1 cross join days_to_monday;
The CTE establishes an array which, for a given day of the week, contains the number of days needed to reach the next Monday. The main query extracts the ISO day of week of current_date - 30 and uses it to index the array. The corresponding value is added to get the proper date.
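The same array lookup can be sketched in Python to check the arithmetic. This is an illustration only; `DAYS_TO_MONDAY` and `aligned_monday` are names invented here, and `isoweekday()` plays the role of `isodow` (Monday = 1 through Sunday = 7).

```python
from datetime import date, timedelta

# Days to add after stepping back one week; position 0 is Monday (isodow 1).
DAYS_TO_MONDAY = [7, 6, 5, 4, 3, 2, 1]

def aligned_monday(d):
    """Go back 7 days, then forward to the next Monday via the lookup array."""
    back_one_week = d - timedelta(days=7)
    return back_one_week + timedelta(days=DAYS_TO_MONDAY[d.isoweekday() - 1])

# Every day of the week starting Monday 2024-01-01 maps back to that Monday.
for offset in range(7):
    print(aligned_monday(date(2024, 1, 1) + timedelta(days=offset)))
```

The net effect is the Monday on or before the given date, so a date that already falls on a Monday is returned unchanged.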
Putting that together with your original query, we arrive at:
with next_week (monday) as
( values (current_date - 30 - 7
+ ('{7,6,5,4,3,2,1}'::int[])[extract (isodow from current_date - 30 )])
)
select extract(week from creation_date) as week,
count(*) as n
from table_1
where creation_date >= (select monday from next_week)
group by 1
order by 1;
For full example see fiddle.
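As a rough cross-check of the combined logic, the sketch below does the same thing in Python over an in-memory list of dates. It is not the answer's query: the names `week_start` and `weekly_counts` are hypothetical, and it groups by the Monday that starts each week instead of by week number, which is an assumption made here so that weeks on either side of New Year cannot collide.

```python
from collections import Counter
from datetime import date, timedelta

def week_start(d):
    """Monday on or before d."""
    return d - timedelta(days=d.isoweekday() - 1)

def weekly_counts(creation_dates, today, lookback_days=30):
    """Count rows per week, starting from the Monday of the week that
    contains (today - lookback_days), so the first week is always whole."""
    cutoff = week_start(today - timedelta(days=lookback_days))
    return Counter(week_start(d) for d in creation_dates if d >= cutoff)

rows = [date(2014, 12, 14), date(2014, 12, 15), date(2014, 12, 16), date(2014, 12, 22)]
print(weekly_counts(rows, today=date(2015, 1, 16)))
```

With today set to 2015-01-16 the cutoff is Monday 2014-12-15, so the row dated 2014-12-14 is excluded and the first reported week is complete.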

Related

How to find the 1st and 3rd Sunday/Monday of a current month in postgresql

Can anyone help me to find the specific week day of the month in postgresql... like 1st and 3rd week sunday/monday or 2nd and 4th week wednesday
First off, contrary to initial expectations, working with dates is complex, sometimes extremely so. The combination of week numbers and days of the week falls into the latter category.
The problem stems from the 2 ISO definitions:
All weeks start on Monday and are 7 days long.
The first week of the year is the week containing 4-Jan.
This dooms any effort (at least any reasonably simple one) to failure. While it is an admirable effort, I'll use @Abelisto's suggestion as a sample. See Fiddle. I've changed it just enough to use multiple parameters. It is correct for most months, but look at 30,31-Jan-2019 and Jan-2021.
The problem with the first definition is that while the ISO week is perfectly consistent, the calendar is not. As a result, the first week of a given month can be the same as the last week of the previous month, and vice versa.
That alone can usually be worked around, but not when combined with the second definition. Weeks that are always 7 days long, together with the 1st week of the year being the one containing 4-Jan, give rise to the larger problem: the last few days of Dec may be in the 1st week of the next year, and the first days of Jan may be in week 52 (or 53) of the prior year (see the 2nd query in the fiddle). Is there a solution? I'm sure there is one somewhere out there. I just don't have it, at least not with the extract function.
So how about this specific issue? Basically it comes down to getting the last day of the previous month, then finding the next DOW (Sunday or Monday) as needed. Coming from an Oracle background, I'd just use the NEXT_DAY function, which does just that. Unfortunately Postgres does not provide that useful function, but you can roll your own. Below are a couple of functions I wrote to do this in Postgres:
- utl_dates_first_dow_of_month(). It takes 2 parameters, the target Day-Of-Week (DOW) as the first 3 characters of the day name (case insensitive) and a date in the desired month. It returns the DATE which is the first occurrence of the requested DOW.
- utl_dates_next_dow(). It takes the same 2 parameters and returns the next calendar date of the specified DOW after the specified date. If the specified date falls on the requested DOW, the routine DOES NOT return the specified date. This function is used by the first.
Fortunately the routines are shorter than the description.
create or replace function utl_dates_next_dow(dow_in text, date_in date)
returns date
language sql
immutable strict
as $$
-- Given a DOW and a date return the calendar date for the next occurrence of DOW
with dy as (select string_to_array('mon,tue,wed,thu,fri,sat,sun', ',') dl)
, dn as (select array_position(dl, (substring(to_char(date_in, 'day'),1,3))) fn
, array_position(dl, lower(substring(dow_in,1,3))) dn
from dy
)
select case when dn <= fn
then (date_in + (dn+7-fn) * interval '1 day')::date
else (date_in + (dn-fn) * interval '1 day')::date
end
from dn;
$$;
create or replace function utl_dates_first_dow_of_month(dow_in text, date_in date)
returns date
language sql
immutable strict
as $$
-- Given a DOW and a date, return the calendar date of the first occurrence of that DOW in the month in which the date falls.
select utl_dates_next_dow(dow_in, (date_trunc('month', date_in) - interval '1 day')::date);
$$;
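For readers who want to check the logic outside the database, here is a small Python sketch of the same two routines. It is a loose translation under my own naming (`next_dow`, `first_dow_of_month`, `DAY_NAMES`), not a replacement for the SQL functions above.

```python
from datetime import date, timedelta

DAY_NAMES = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]

def next_dow(dow, d):
    """Next date strictly after d that falls on the given day of week."""
    target = DAY_NAMES.index(dow[:3].lower())
    delta = (target - d.weekday()) % 7
    # A delta of 0 means d is already that day, so jump a full week ahead.
    return d + timedelta(days=delta or 7)

def first_dow_of_month(dow, d):
    """First occurrence of the given day of week in the month containing d."""
    last_of_previous_month = d.replace(day=1) - timedelta(days=1)
    return next_dow(dow, last_of_previous_month)

print(first_dow_of_month("Sun", date(2020, 4, 1)))   # 2020-04-05
print(first_dow_of_month("Mon", date(2020, 6, 15)))  # 2020-06-01
```

Starting from the last day of the previous month is what lets the "strictly after" rule still return the 1st of the month when it falls on the requested day.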
Now with that out of the way, on to the issue at hand. As Abelisto and others indicate, the request is ambiguous. There is no such thing as a 1st or 3rd Sunday/Monday. Do you want the 1st and 3rd Sunday and the 1st and 3rd Monday of the month? Do you want the 1st and 3rd Sunday of the month and the Monday following each? Or do you want the Sunday and Monday of the 1st and 3rd week of the month (if so, Monday would always be the earlier date, see definition 1 above)?
Please try to be more specific with your questions, and include test data (as text, no images) and the expected results from that data. The solutions are, however, just slight modifications of each other. (No solution is given for the 3rd listed possibility.)
For the case of 1st and 3rd Sunday and the 1st and 3rd Monday:
with parms (dt) as (values ( date '2020-04-01'), (date '2020-06-01') )
, base_dates( fsun, fmon) as
( select utl_dates_first_dow_of_month('Sun',dt)
, utl_dates_first_dow_of_month('Mon',dt)
from parms
)
select '1st & 3rd Sunday and 1st & 3rd Monday'
, fsun "1st Sunday"
, (fsun+interval '14 days')::date "3rd Sunday"
, fmon "1st Monday"
, (fmon+interval '14 days')::date "3rd Monday"
from base_dates;
For the 1st and 3rd Sunday of the month and the Monday following:
with parms (dt) as (values ( date '2020-04-01'), (date '2020-06-01') )
, base_dates( fsun, fmon) as
( select utl_dates_first_dow_of_month('Sun',dt)
, (utl_dates_first_dow_of_month('Sun',dt)+interval '1 day')::date
from parms
)
select '1st & 3rd Sunday and Monday Following '
, fsun "1st Sunday"
, fmon "1st Monday"
, (fsun+interval '14 days')::date "3rd Sunday"
, (fmon+interval '14 days')::date "3rd Monday"
from base_dates;
select * from
(
select dow, i, row_number() over(partition by dow order by i) as rnk from
(
select
extract(dow from i::date) as dow,
i::date
from generate_series('2022-10-01'::date,'2022-10-31'::date,interval '1 Day') i
) tmp where dow = 1
)tmp_out where rnk = 3;
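The query above lists every day of October 2022, keeps the Mondays, ranks them, and picks rank 3. A minimal Python sketch of that enumerate-and-rank idea follows; `nth_weekday` is a name made up here, and note that Python numbers Monday as 0 whereas Postgres `dow` numbers it as 1.

```python
import calendar
from datetime import date

def nth_weekday(year, month, weekday, n):
    """n-th occurrence (1-based) of a weekday in a month, or None if absent.

    weekday uses Python's convention: Monday = 0 ... Sunday = 6.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    matches = [
        date(year, month, day)
        for day in range(1, days_in_month + 1)
        if date(year, month, day).weekday() == weekday
    ]
    return matches[n - 1] if 1 <= n <= len(matches) else None

print(nth_weekday(2022, 10, 0, 3))  # 2022-10-17, the 3rd Monday of October 2022
```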

Tricking Weekofyear in Hive by shifting the week, for counting

I've been working on this problem for a while now. Basically I have a simple set of data with UserId, and TimeStamp. I want to know how many distinct UserId's appear each week, the catch is my week is measured in Sunday-Saturday, NOT Monday - Sunday, which is what Weekofyear() uses.
Right now I'm hardcoding each week and running the query:
SELECT
count(distinct UserId)
FROM data.table
where from_unixtime((CAST(timestamp as BIGINT)))
between TO_DATE("2016-06-05") AND TO_DATE("2016-06-12")
I'm trying to find a way to shift the timestamp back a day to trick weekofyear into thinking my Sunday is actually a Monday, but have not been successful. My latest futile attempt looked like:
SELECT
count(distinct UserId), weekofyear(date_sub(from_unixtime(CAST(timestamp as BIGINT)),1))
FROM table.data
where from_unixtime((CAST(timestamp as BIGINT)))
between TO_DATE("2016-06-01") AND TO_DATE("2016-06-30")
group by weekofyear(date_sub(from_unixtime(CAST(timestamp as BIGINT)),1))
This results in the same numbers as if I didn't subtract a day. I'm not sure why this isn't working. I feel like there should be a way to manage this. Right now, if I wanted to pull all the data by week WHERE X is true, I'd have to do each week manually, which won't be sustainable. Any suggestions on how to work smarter?
Thank you.
Simple Solution
You can simply create your own formula instead of going with pre-defined function for "week of the year"
Advantage: you will be able to take any set of 7 days for a week.
In your case, since you want the week to run from Sunday to Saturday, we just need the date of the first Sunday of the year.
E.g. in 2016, the first Sunday is '2016-01-03' (3rd of Jan'16).
--assumption considering the timestamp column in the format 'yyyy-mm-dd'
SELECT
count(distinct UserId), floor(datediff(timestamp,'2016-01-03') / 7) + 1 as week_of_the_year
FROM table.data
where timestamp>='2016-01-03'
group by floor(datediff(timestamp,'2016-01-03') / 7) + 1;
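The week formula can be checked with a short Python sketch. The function names below are invented for illustration. The second helper is an aside that is not part of the answer: it shows that shifting a date forward by one day (not back) is what makes a Monday-based week function treat Sunday as the start of the week.

```python
from datetime import date, timedelta

FIRST_SUNDAY_2016 = date(2016, 1, 3)

def sunday_week_number(d, anchor=FIRST_SUNDAY_2016):
    """Week number counted in 7-day blocks from the anchor Sunday (1-based)."""
    return (d - anchor).days // 7 + 1

def shifted_iso_week(d):
    """ISO week of the following day, so that Sunday..Saturday share one number."""
    return (d + timedelta(days=1)).isocalendar()[1]

print(sunday_week_number(date(2016, 6, 4)))  # 22 (a Saturday)
print(sunday_week_number(date(2016, 6, 5)))  # 23 (a Sunday starts a new week)
```

Dates before the anchor yield zero or negative numbers, which matches the answer's choice to filter them out in the where clause.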

Get this week's monday's date in Postgres?

How can I get this week's monday's date in PostgreSQL?
For example, today is 01/16/15 (Friday). This week's monday date is 01/12/15.
You can use date_trunc() for this:
select date_trunc('week', current_date);
More details in the manual:
http://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC
If "today" is Monday it will return today's date.
SELECT current_date + cast(abs(extract(dow FROM current_date) - 7) + 1 AS int);
works, although there might be more elegant ways of doing it.
The general idea is to get the current day of the week, dow, subtract 7, and take the abs, which will give you the number of days till the end of the week, and add 1, to get to Monday. This gives you next Monday.
EDIT: having completely misread the question, to get the prior Monday, is much simpler:
SELECT current_date - ((6 + cast(extract(dow FROM current_date) AS int)) % 7)
i.e., subtract from today's date the number of days past Monday, computed as (6 + dow) % 7, to get back to Monday.
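A quick way to convince yourself that the modulo expression is right is to replay it in Python. This is only a sketch: `pg_dow` is a helper invented here to mimic the Postgres `dow` numbering (Sunday = 0 through Saturday = 6).

```python
from datetime import date, timedelta

def pg_dow(d):
    """Day of week in the Postgres `dow` convention: Sunday = 0 ... Saturday = 6."""
    return d.isoweekday() % 7

def this_weeks_monday(d):
    """Subtract the number of days elapsed since Monday."""
    return d - timedelta(days=(6 + pg_dow(d)) % 7)

print(this_weeks_monday(date(2015, 1, 16)))  # 2015-01-12, the example from the question
print(this_weeks_monday(date(2015, 1, 18)))  # 2015-01-12, Sunday closes the same week
```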
And for other mondays:
Next Monday:
date_trunc('week', now())+ INTERVAL '7days'
Last week's monday:
date_trunc('week', now())- INTERVAL '7days'
etc. :)
I usually use a calendar table. There are two main advantages.
Simple. Junior devs can query it correctly with little training.
Obvious. Correct queries are obviously correct.
Assuming that "this week's Monday" means the Monday before today, unless today is Monday then . . .
select max(cal_date) as previous_monday
from calendar
where day_of_week = 'Mon'
and cal_date <= current_date;

Need to sort by Date then Hour, then output Date, text Day of week , range of hours SQL Server 2008 R2

NEWBIE at work! I am trying to create a simple summary that counts the number of customer visits and groups by 1) date and 2) hour, BUT outputs this:
Date Day of Wk Hour #visits
8/12/2013 Monday 0 5
8/12/2013 Monday 1 7
8/12/2013 Monday 6 10
8/13/2013 Tuesday 14 25
8/13/2013 Tuesday 16 4
We are on military time, so 14 = 2:00 pm
Select
TPM300_PAT_VISIT.adm_ts as [Date]
,TPM300_PAT_VISIT.adm_ts as [Day of Week]
,TPM300_PAT_VISIT.adm_ts as [Hour]
,count(TPM300_PAT_VISIT.vst_ext_id) as [Total Visits]
From
TPM300_PAT_VISIT
Where
TPM300_PAT_VISIT.adm_srv_cd='22126'
and TPM300_PAT_VISIT.adm_ts between '07-01-2013' and '08-01-2013'
Group by
cast(TPM300_PAT_VISIT.adm_ts as DATE)
,datepart(weekday,TPM300_PAT_VISIT.adm_ts)
,datepart(hour,TPM300_PAT_VISIT.adm_ts)
Order by
CAST(TPM300_PAT_VISIT.adm_ts as DATE)
,DATEPART(hour,TPM300_PAT_VISIT.adm_ts)
This should solve the problem:
; With Streamlined as (
SELECT
DATEADD(hour,DATEDIFF(hour,'20010101',adm_ts),'20010101') as RoundedTime,
vst_ext_id
from
TPM300_PAT_VISIT
where
adm_srv_cd='22126' and
adm_ts >= '20130701' and
adm_ts < '20130801'
)
Select
CONVERT(date,RoundedTime) as [Date],
DATENAME(weekday,RoundedTime) as [Day of Week],
DATEPART(hour,RoundedTime) as [Hour],
count(vst_ext_id) as [Total Visits]
From
Streamlined
Group by
RoundedTime
Order by
CONVERT(date,RoundedTime),
DATEPART(hour,RoundedTime)
In the CTE (Streamlined)'s select list, we floor each adm_ts value down to the nearest hour using DATEADD/DATEDIFF. This makes the subsequent grouping easier to specify.
We also specify a semi-open interval for the datetime comparisons, which makes sure we include everything in July (including stuff that happened at 23:59:59.997) whilst excluding events that happened at midnight on 1st August. This is frequently the correct type of comparison to use when working with continuous data (floats, datetimes, etc), but means you have to abandon BETWEEN.
I'm also specifying the dates as YYYYMMDD, which is a safe, unambiguous format. Your original query could have been interpreted as either January 7th - January 8th or 1st July - 1st August, depending on the settings of whatever account you use to connect to SQL Server. Better yet, if these dates are supplied by some other (non-SQL) code, pass them as datetimes in the first place to avoid any formatting issues.
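The two ideas above, flooring to the hour and filtering with a semi-open interval, can be sketched in Python as follows. The function name, the sample timestamps, and the hard-coded English day names are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def hourly_visit_counts(timestamps, start, end):
    """Count visits per (date, hour) for start <= ts < end (semi-open interval)."""
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0)  # floor to the hour
        for ts in timestamps
        if start <= ts < end
    )
    return [
        (hour.date(), DAY_NAMES[hour.weekday()], hour.hour, n)
        for hour, n in sorted(counts.items())
    ]

visits = [
    datetime(2013, 7, 1, 0, 15),
    datetime(2013, 7, 1, 0, 45),
    datetime(2013, 7, 31, 23, 59, 59),
    datetime(2013, 8, 1, 0, 0),  # midnight on 1st August is excluded
]
for row in hourly_visit_counts(visits, datetime(2013, 7, 1), datetime(2013, 8, 1)):
    print(row)
```

Because the upper bound is exclusive, the last second of July is counted while midnight on 1st August is not, which is the behaviour BETWEEN cannot express.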

SQL DateDiff Weeks - Need an alternative

The MS SQL DateDiff function counts the number of boundaries crossed when calculating the difference between two dates.
Unfortunately for me, that's not what I'm after. For instance, 1 June 2012 -> 30 June 2012 crosses 4 boundaries, but covers 5 weeks.
Is there an alternative query that I can run which will give me the number of weeks that a month intersects?
UPDATE
To try and clarify exactly what I'm after:
For any given month I need the number of weeks that intersect with that month.
Also, for the suggestion of just taking the datediff and adding one, that won't work. For instance February 2010 only intersects with 4 weeks. And the DateDiff calls returns 4, meaning that simply adding 1 would leave me the wrong number of weeks.
Beware: Proper Week calculation is generally trickier than you think!
If you use Datepart(week, aDate) you make a lot of assumptions about the concept 'week'.
Does the week start on Sunday or Monday? How do you deal with the transition between week 1 and week 5x. The actual number of weeks in a year is different depending on which week calculation rule you use (first4dayweek, weekOfJan1 etc.)
if you simply want to deal with differences you could use
DATEDIFF(second, firstDateTime, secondDateTime) > (7 * 86400 * numberOfWeeks)
if the first dateTime is at 2011-01-01 15:43:22 then the difference is 5 weeks after 2011-02-05 15:43:22
EDIT: Actually, according to this post: Wrong week number using DATEPART in SQL Server
You can now use Datepart(isoww, aDate) to get ISO 8601 week number. I knew that week was broken but not that there was now a fix. Cool!
THIS WORKS if you are using monday as the first day of the week
set language british
select datepart(ww, @endofMonthDate) -
datepart(ww, @startofMonthDate) + 1
Datepart is language sensitive. Setting the language to British makes Monday the first day of the week.
This returns the correct values for February 2010 and June 2012 (because Monday, as opposed to Sunday, is the first day of the week).
It also seems to return correct number of weeks for january and december (regardless of year). The isoww parameter uses monday as the first day of the week, but it causes january to sometimes start in week 52/53 and december to sometimes end in week 1 (which would make your select statement more complex)
SET DATEFIRST is important when counting weeks. To check your current setting, use select @@datefirst. @@datefirst = 7 means that the first day of the week is Sunday.
set datefirst 7
declare @FromDate datetime = '20100201'
declare @ToDate datetime = '20100228'
select datepart(week, @ToDate) - datepart(week, @FromDate) + 1
Result is 5 because Sunday 28/2 - 2010 is the first day of the fifth week.
If you want to base your week calculations on Monday being the first day of the week, you need to do this instead.
set datefirst 1
declare @FromDate datetime = '20100201'
declare @ToDate datetime = '20100228'
select datepart(week, @ToDate) - datepart(week, @FromDate) + 1
Result is 4.
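The effect of the first-day-of-week setting on the count can be reproduced with a small Python sketch. The function `weeks_intersecting_month` and its `first_weekday` parameter are names chosen for this illustration. Instead of subtracting week numbers, it counts the 7-day steps between the week containing the 1st and the week containing the last day of the month.

```python
import calendar
from datetime import date, timedelta

def weeks_intersecting_month(year, month, first_weekday=0):
    """Number of calendar weeks that overlap the given month.

    first_weekday uses Python's convention: Monday = 0 ... Sunday = 6.
    """
    first = date(year, month, 1)
    last = date(year, month, calendar.monthrange(year, month)[1])

    def week_start(d):
        # Step back to the most recent first_weekday on or before d.
        return d - timedelta(days=(d.weekday() - first_weekday) % 7)

    return (week_start(last) - week_start(first)).days // 7 + 1

print(weeks_intersecting_month(2010, 2, first_weekday=6))  # 5 (weeks start on Sunday)
print(weeks_intersecting_month(2010, 2, first_weekday=0))  # 4 (weeks start on Monday)
print(weeks_intersecting_month(2012, 6, first_weekday=0))  # 5
```

Working from week start dates rather than week numbers also sidesteps the wrap-around between week 52/53 and week 1 in January and December.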