Why does justify_interval('360 days'::interval) return '1 year'? - postgresql

For some reason justify_interval(now() - '2013-02-14'::timestamptz) produces weird results:
postgres=# select justify_interval(concat(365*4 +1,' days')::interval);
-[ RECORD 1 ]----+----------------
justify_interval | 4 years 21 days
I checked one year:
postgres=# select justify_interval('365 days'::interval);
justify_interval
------------------
1 year 5 days
So I went further:
postgres=# select justify_interval('360 days'::interval);
justify_interval
------------------
1 year
(1 row)
This behavior is not platform specific (tried several Linuxes, 9.2, 9.3, 9.6)
Why is one year 360 days?

It seems you are looking for what PostgreSQL calls a "symbolic" result, one that uses years and months rather than just days. That is what the age(timestamp, timestamp) and age(timestamp) functions return.
select age(now(), '2013-02-14'); -- 4 years 16:41:02.571547
select age(timestamp '2013-02-14'); -- 4 years
The - operator always returns the difference in days (at most); it never produces months or years. The justify_*() functions (and the *, /, <, > operators) always "cut" values to an average (i.e. 1 day is 24 hours and 1 month is 30 days), even though 1 day can actually contain 23 to 25 hours (think of time zones with daylight saving time) and 1 month can contain 28 to 31 days. The true figures depend on the actual start and end points of the range that produced the interval.

According to the docs:
justify_interval(interval) - Adjust interval using justify_days and
justify_hours, with additional sign adjustments
and further:
justify_days(interval) - Adjust interval so 30-day time periods
are represented as months
So 30*12=360
Not what one would expect, but clearly defined in the docs...
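The 30*12=360 arithmetic can be checked with a small Python model. This is only a sketch of the rule quoted from the docs (30-day months, then 12-month years), not PostgreSQL's own code. The helper name justify_days_model is made up for illustration, and it assumes a non-negative whole number of days.

```python
def justify_days_model(days):
    """Split a non-negative day count into (years, months, days),
    treating every month as 30 days and every year as 12 months."""
    months, days = divmod(days, 30)     # 30-day blocks become months
    years, months = divmod(months, 12)  # 12 months become a year
    return years, months, days


# The three inputs tried in the question
for n in (360, 365, 365 * 4 + 1):
    print(n, "days ->", justify_days_model(n))
```

Under this model 360 days is exactly 12 months of 30 days, which is why it shows up as 1 year, and the 5 or 21 leftover days in the other two cases are simply what does not fit into a 30-day block.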

Related

How to find the week number per current month in PostgreSQL?

My working environment
PostgreSQL version: 10.9 (64 bits)
OS: Windows 10 (64 bits)
I need to find the week number of a given date in the current month. So unlike the ISO week number which is computed from the beginning of the year, here the computation is done from the beginning of the current month and therefore the result is a number between 1 and 5. Here are a few examples for June 2020:
Date in the current month Week number per current month
========================== ===============================
2020-06-01 -----> 1
2020-06-10 -----> 2
2020-06-19 -----> 3
2020-06-23 -----> 4
2020-06-23 -----> 5
I was reading the online documentation (Date/Time Functions and Operators). It seems that there is no function that directly provides what I'm looking for. So after a few successful tests, here is the solution that I found:
select
extract('week' from current_date) -
extract('week' from date_trunc('month', current_date))
+ 1;
I consider myself to be rather a beginner in using date functions, so just to make sure that I'm on the right track, do you think that this solution is correct? As I said, after a few tests it seems to me that it gets the job done.
The to_char() method offers such a feature:
W - week of month (1-5) (the first week starts on the first day of the month)
select to_char(current_date, 'W');
It returns a string value, but that can easily be cast to a number.
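To make the difference between the two approaches concrete, here is a small Python sketch. It assumes that "the first week starts on the first day of the month" means counting 7-day blocks from day 1, and it compares that with the ISO-week subtraction from the question. Both function names are hypothetical and this is a model of the arithmetic, not PostgreSQL code.

```python
from datetime import date


def week_of_month(d):
    # Days 1-7 -> 1, 8-14 -> 2, ..., 29-31 -> 5
    return (d.day - 1) // 7 + 1


def iso_week_difference(d):
    # The approach from the question: ISO week of the date minus
    # ISO week of the first day of its month, plus 1
    first = d.replace(day=1)
    return d.isocalendar()[1] - first.isocalendar()[1] + 1


for d in (date(2020, 6, 10), date(2020, 7, 6), date(2021, 1, 10)):
    print(d, week_of_month(d), iso_week_difference(d))
```

The two agree for June 2020 because that month starts on a Monday. They disagree when the month starts mid-week, and the subtraction breaks down in January whenever 1-Jan still belongs to ISO week 52 or 53 of the previous year.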

How can I always get the full period when grouping by week in PostgreSQL?

I usually use the following syntax when analysing weekly data:
select week(creation_date)::date as week,
count(*) as n
from table_1
where creation_date > current_date - 30
group by 1
However, by doing this I get just part of the first week.
Is there a smart way to always get a whole week at the beginning?
That is, to start from the first day of the week that I would otherwise get only half of.
First off, you need to define what you mean by "week". This is more difficult than it appears. While humans have an intuitive sense of a week, computers are just not that smart. There are 2 common conventions: the ISO-8601 standard and, for lack of a better term, Traditional. ISO-8601 defines a week as always beginning on Monday and always containing 7 days. Traditional weeks begin on Sunday (usually) but may have fewer than 7 days. This results from having the 1st week of the year begin on 1-Jan regardless of the day of the week, so the 1st and/or last weeks may be short. ISO-8601 throws its own curve into the mix: the 1st week of the year is the week containing 4-Jan. Thus the last days of Dec may be in week 1 of the next year and the first days of Jan may be in week 52/53 of the prior year.
Everything below assumes ISO-8601.
Secondly, there is no week function in Postgres. Instead you need the extract function. So for this particular case:
select extract(week from creation_date)::integer as week, ...
Finally, your predicate (current_date - 30) means you will usually not begin on the first day of a week. To get the correct date, take that result back 1 week, then go forward to the next Monday.
with days_to_monday (day_adj) as
( values ('{7,6,5,4,3,2,1}'::int[]) )
select current_date - 30
, current_date - 30 - 7 + day_adj[extract (isodow from current_date - 30 )]
from table_1 cross join days_to_monday;
The CTE establishes an array which, for a given day of the week, contains the number of days needed to reach the next Monday. The main query extracts the ISO day of week of current_date - 30 and uses that to index the array. The corresponding value is added to get the proper date.
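The same lookup can be sketched in Python to see what the array does. The names DAYS_TO_MONDAY and week_start are made up for this illustration, and the list is indexed from 0 where the SQL array is indexed from 1.

```python
from datetime import date, timedelta

# Days from each ISO day of week (Monday = 1 ... Sunday = 7) to the next Monday
DAYS_TO_MONDAY = [7, 6, 5, 4, 3, 2, 1]


def week_start(d):
    # Go back one week, then forward to the next Monday
    forward = DAYS_TO_MONDAY[d.isoweekday() - 1]
    return d - timedelta(days=7) + timedelta(days=forward)


cutoff = date(2020, 6, 10)         # a Wednesday, standing in for current_date - 30
print(cutoff, "->", week_start(cutoff))
```

Going back a full week first guarantees that the "next Monday" is never later than the original date, so the result is always the Monday of the week that contains it.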
Putting that together with your original query gives:
with next_week (monday) as
( values (current_date - 30 - 7
+ ('{7,6,5,4,3,2,1}'::int[])[extract (isodow from current_date - 30 )])
)
select extract(week from creation_date) as week,
count(*) as n
from table_1
where creation_date >= (select monday from next_week)
group by 1
order by 1;
For full example see fiddle.

PostgreSQL interval subtraction

I've tried to subtract timestamps to get an interval, but the result looks wrong compared with the number of days I get by subtracting 2 dates.
E.g.:
select
(now::date - past::date) as days,
(now::date - past::date) / 365.25 as years,
justify_interval(now - past::date) as interval_test
from (
select '2020-09-17 00:00:01'::timestamp as now, '2010-09-17 00:00:01'::timestamp as past
) b;
gives results:
3653 days
10.0013 years
'10 years 1 mon 23 days' interval test
Could anyone help me to understand what is wrong with subtracting?
When I do it vice versa, it's ok:
select
(past::date + 3653)::date,
(past + interval '10 years')::date,
(past + 10*interval '1 year')::date,
(past + 10*12*interval '1 month')::date
from (
select '2020-09-17 00:00:01'::timestamp as now, '2010-09-17 00:00:01'::timestamp as past
) b;
all results give the same date '2020-09-17'
What am I doing wrong?
I am using PostgreSQL 10.5.
There is nothing wrong with subtracting. It is just that justify_interval doesn't do what you seem to expect. justify_interval uses 30-day months and 24-hour days. So 12 months count as only 360 days and 10 years as only 3600 days. That leaves 53 of the 3653 days, which is 1 (30-day) month and 23 days.
Edit
The justify_interval documentation on this page refers to justify_days and justify_hours, which are directly above it and which do mention the use of 30-day months and 24-hour days.
The justify functions have to make these assumptions because the interval type is a general length of time (it has no specific start and end). So the justify functions do not know over which specific months the interval was originally calculated.
The age function, however, does not take an interval; it takes an end and a start, so it actually knows which specific months and years are in that period.
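A short Python sketch of the two views, using the dates from the question. The "fixed" part models the 30-day/12-month rule described above. The "calendar" part is a plain field-by-field comparison that only works here because no borrowing between fields is needed, so it should not be read as a reimplementation of age().

```python
from datetime import date

past = date(2010, 9, 17)
now = date(2020, 9, 17)
elapsed = (now - past).days  # plain day count, start and end are forgotten

# Fixed-length view: every month is 30 days, every year is 12 months
months, days = divmod(elapsed, 30)
years, months = divmod(months, 12)
fixed = (years, months, days)

# Calendar view: with both end points known, compare the fields directly
calendar = (now.year - past.year, now.month - past.month, now.day - past.day)

print("elapsed days:", elapsed)
print("fixed 30-day months:", fixed)
print("calendar fields:", calendar)
```

Once only the day count is kept, there is no way to tell that it spanned exactly ten calendar years, which is the information age() still has and justify_interval does not.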

inconsistency between month, day, second representation of interval data type

I understand why PostgreSQL uses month, day and second fields to represent the SQL interval data type. A month is not always the same length, and a day can have 23, 24 or 25 hours if a daylight saving time adjustment is involved. This is from the PostgreSQL documentation.
But I do not understand why this is not handled consistently for both months and days. See the following query, where the number of seconds between the two points in time can be calculated exactly:
select ('2017-01-01'::timestamp-'2016-01-01'::timestamp); -->366 days.
PostgreSQL chooses to give a result in days, not in months and not in seconds.
But why is the result in days and not in seconds? It is NOT defined how long days are (they can be 23, 24 or 25 hours long). So why does it not give the output in seconds?
Then, since the length of months is also not defined, why doesn't PostgreSQL give an output of 12 months instead of 366 days?
It does not care that the length of days is not defined, but it obviously cares that the length of months is not defined.
Why this asymmetry?
For further explanation, see this query:
select ('10 days'::interval-'24 hours'::interval); --> 10 days -24:00:00
You see that PostgreSQL correctly refuses to answer with 9 days. It is well aware that days and hours cannot be interchanged. But then again, why does the first query return days?
I can't answer your question, but I think I can point you in the right direction. I think the book SQL-99 Complete, Really is the most accessible source for understanding SQL intervals. It's available online: https://mariadb.com/kb/en/sql-99/08-temporal-values/.
SQL standards describe two kinds of intervals: year-month intervals and day-time intervals. They do this to prevent month parts and day parts from appearing in the same interval, because, as you already know, the number of days in a month is ambiguous. The number of days in the interval '3' month depends on which three months you're talking about.
I think this is the verbose, standard SQL way to write your first query.
select cast(timestamp '2017-01-01' - timestamp '2016-01-01' as interval day to hour) as new_column;
new_column
interval day to hour
--
366 days
I suspect that you'll find that SQL standards have rules for what a SQL dbms is supposed to do when things like interval day to hour are omitted. PostgreSQL might or might not follow those rules.
PostgreSQL chooses to give a result in days, not in months and not in seconds.
Standard SQL prevents month parts and day parts from appearing in the same interval. Also, the range of valid seconds is from 0 to 59.
select interval '59' second;
interval
interval second
--
00:00:59
select interval '60' second;
interval
interval second
--
00:01:00
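Returning to the two queries in the question, the PostgreSQL documentation notes that subtracting timestamps converts 24-hour blocks into days, similarly to justify_hours(), while interval arithmetic keeps the fields apart. The Python sketch below models that difference under stated assumptions: an interval is a plain (months, days, seconds) tuple, both function names are hypothetical, and timestamp_minus assumes the first argument is not earlier than the second.

```python
from datetime import datetime


def interval_minus(a, b):
    # Field-by-field subtraction of (months, days, seconds), with no carrying
    return tuple(x - y for x, y in zip(a, b))


def timestamp_minus(t1, t2):
    # Elapsed seconds, with whole 24-hour blocks folded into the days field
    total = int((t1 - t2).total_seconds())
    days, seconds = divmod(total, 86400)
    return (0, days, seconds)


# '10 days' - '24 hours': days and seconds stay in separate fields
print(interval_minus((0, 10, 0), (0, 0, 24 * 3600)))

# timestamp '2017-01-01' - timestamp '2016-01-01'
print(timestamp_minus(datetime(2017, 1, 1), datetime(2016, 1, 1)))
```

In this model the months field is never filled by a subtraction of timestamps, which matches the 366 days (rather than 12 months) seen in the first query.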

How to calculate average weekly hours between 2 dates covering multiple weeks?

Postgresql 8.4.
I'm new to this concept so if people could teach me I'd appreciate it.
For Obamacare, anyone that works 30 hours per week or more must be offered the same healthcare as is offered to any other worker. We can't afford that so we have to limit work hours for temp and part-timers. This is affecting the whole country.
I need to calculate the hours worked (doesn't matter if overtime, regular time, double time, etc.) between two dates, say Jan 1, 2014, and Nov 1, 2014 (Saturday), for each custom week (which begins on Sunday), not the week as defined by Postgresql (which begins on Monday).
Each of my custom work weeks begins on Sunday and ends on Saturday.
I don't know if I have to include weeks where they did not work at all in the average, but let's assume I do. Zero hours that week would draw down the average.
Table name is 'employeetime', date field is 'employeetime.stopdate', hours worked per day is in the field 'employeetime.hours', employeeid field is 'employeetime.empid'.
I'd prefer to do this in one query per employee and I will execute the query once per employee as I loop through employees. If not I'm open to suggestions. But I'd like to understand the SQL presented in the answer.
Currently EXTRACT(week from '2014-01-01') calculates the start of the week as a Monday, so that doesn't work for me. Link here.
How would I do that without doing, say a separate query for each week, per person? We have 200 people to process.
Thank you.
I have set up a table to match your format:
select * from employeetime order by date;
id date hours
1 2014-11-06 10
1 2014-11-07 3
1 2014-11-08 5
1 2014-11-09 3
1 2014-11-10 5
You can get the week starting on Sunday by shifting. Note, here the 9th is a Sunday, so that is where we want the boundary.
select *, extract(week from date + '1 day'::interval) as week
from employeetime
order by week;
id date hours week
1 2014-11-07 3 45
1 2014-11-06 10 45
1 2014-11-08 5 45
1 2014-11-09 3 46
1 2014-11-10 5 46
And now the week shifts on Sunday rather than Monday. From here, the query to get hours by week/employee would be simple:
select id, sum(hours) as hours, extract(week from date + '1 day'::interval) as week
from employeetime
group by id, week
order by id, week;
id hours week
1 18 45
1 8 46
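The shifting trick can be checked outside the database with a few lines of Python. This is a sketch over the same five sample rows; sunday_week is a made-up helper name, and the ISO week number stands in for extract(week from ...).

```python
from collections import defaultdict
from datetime import date, timedelta

# (id, date, hours) rows copied from the sample table above
rows = [
    (1, date(2014, 11, 6), 10),
    (1, date(2014, 11, 7), 3),
    (1, date(2014, 11, 8), 5),
    (1, date(2014, 11, 9), 3),
    (1, date(2014, 11, 10), 5),
]


def sunday_week(d):
    # Shift by one day so that Sunday lands in the following ISO week
    return (d + timedelta(days=1)).isocalendar()[1]


totals = defaultdict(int)
for emp_id, day, hours in rows:
    totals[(emp_id, sunday_week(day))] += hours

for (emp_id, week), hours in sorted(totals.items()):
    print(emp_id, hours, week)
```

Note that the week numbers are only labels for grouping. Across a year boundary the same number can refer to different years, so a range spanning more than one year would need the year in the grouping key as well.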