Inconsistent behavior while computing AGE in PostgreSQL?

I am getting inconsistent results from the AGE function in PostgreSQL 9.1.
This looks incorrect: the result jumps from 4 days to 7 days (two extra days) when the later date moves from Feb 28 to March 1.
Query : select AGE ('2015-02-28', '2012-06-24');
Result : 2 years 8 mons 4 days
Query : select AGE ('2015-03-01', '2012-06-24');
Result : 2 years 8 mons 7 days
Seems correct, when the age is computed from 2014-02-27. The extra days are not added here.
Query : select AGE ('2015-02-28', '2014-02-27');
Result : 1 year 1 day
Query : select AGE ('2015-03-01', '2014-02-27');
Result : 1 year 2 days
What must be happening while the query is being run?

The documentation says:
PostgreSQL's approach uses the month from the earlier of the two dates when calculating partial months. For example, age('2004-06-01', '2004-04-30') uses April to yield 1 mon 1 day, while using May would yield 1 mon 2 days because May has 31 days, while April has only 30.
So in your example, 2012-06-24 is the "earlier date", and June has 30 days. The partial month is therefore counted as 6 days from the 24th to the 30th, plus 1 more to reach the 1st, which gives 7 days.
This is not sane imho, but the age() function behaves exactly how it should according to the documentation.
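The rule quoted from the documentation can be sketched in Python to reproduce the results in the question. This is a simplified illustration, not PostgreSQL's actual implementation: the function name age_ymd is made up here, and it assumes plain dates (no time part) with the first argument not earlier than the second.

```python
import calendar
from datetime import date


def age_ymd(later: date, earlier: date):
    """Return (years, months, days) following the documented age() rule.

    Assumption: later >= earlier and both are plain dates.
    """
    years = later.year - earlier.year
    months = later.month - earlier.month
    days = later.day - earlier.day

    # A negative day count borrows one month, and the size of that month
    # is taken from the EARLIER date (June has 30 days in the example).
    if days < 0:
        days += calendar.monthrange(earlier.year, earlier.month)[1]
        months -= 1

    # A negative month count borrows one year.
    if months < 0:
        months += 12
        years -= 1

    return years, months, days


print(age_ymd(date(2015, 2, 28), date(2012, 6, 24)))
print(age_ymd(date(2015, 3, 1), date(2012, 6, 24)))
```

Under these assumptions the first call needs no borrowing (28 - 24 = 4), while the second starts at 1 - 24 = -23 and borrows June's 30 days to reach 7.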
Edit:
To be more precise: the age() function does not compute an exact difference between dates; it returns a symbolic years/months/days result. Don't use it when you need an exact difference.
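When an exact number of days is needed, subtracting the dates directly avoids the partial-month rule altogether (in PostgreSQL, subtracting one date value from another likewise yields a whole number of days). A minimal Python sketch of the same idea, with days_between as a hypothetical helper name:

```python
from datetime import date


def days_between(later: date, earlier: date) -> int:
    """Exact number of calendar days from earlier to later."""
    return (later - earlier).days


feb_28 = days_between(date(2015, 2, 28), date(2012, 6, 24))
mar_01 = days_between(date(2015, 3, 1), date(2012, 6, 24))

# Moving the later date forward by one calendar day changes the
# difference by exactly one day.
print(mar_01 - feb_28)
```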


Using partitions (window functions) in combination with aggregations in MongoDB

In MongoDB I have documents like the ones below (I crossed out the call names for confidentiality):
Now I need to build a query that returns results grouped by call name and, for each call, the number of calls by month, day and hour. The query also needs to be restricted to a range between two dates (including time).
In SQL Server this is done using window functions (partitions) in combination with aggregations, but how can I do the same in Mongo?
I am using MongoDB Compass as the Mongo client.
I need to obtain something like this:
call name    month    day  hour  #ByMonth  #ByDay  #ByHour
GetEmployee  January  1    14    10        6       1
GetEmployee  January  1    18    10        6       5
GetEmployee  January  3    12    10        4       4
GetEmployee  March    5    20    8         8       8
GetEmployee  April    12   17    45        35      35
GetEmployee  April    20   10    45        10      10
For example, for GetEmployee call the distribution is as below:
10 calls done in January
8 calls done in March
45 calls done in April
For January, the 10 calls are distributed as below:
6 calls done on 1st January (these 6 calls are distributed as follows: 1 call at 14h and 5 calls at 18h)
4 calls done on 3rd January (these 4 calls are all done at 12h)
and so on for the rest of the months.
For example, in SQL Server, if I have below table:
processName  initDateTime
processA     2020-06-15 13:31:15.330
processB     2020-06-20 10:00:30.000
processA     2020-06-20 13:31:15.330
...
and so on
The SQL query is:
select
    processName,
    month(initDateTime),
    day(initDateTime),
    datepart(hour, initDateTime),
    sum(count(*)) over(partition by processName, year(initDateTime), month(initDateTime)) byMonth,
    sum(count(*)) over(partition by processName, year(initDateTime), month(initDateTime), day(initDateTime)) byDay,
    count(*) byHour
from mytable
group by
    processName,
    year(initDateTime),
    month(initDateTime),
    day(initDateTime),
    datepart(hour, initDateTime)
So how can I do the same in Mongo? The processName and initDateTime fields above correspond to the "call" and "created" attributes, respectively, in MongoDB.
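To make the target output concrete, the counting logic of the SQL query above can be sketched in plain Python. This is not a MongoDB query, only an illustration of the numbers the result should contain; the function name call_counts and the in-memory list of (name, timestamp) pairs are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime


def call_counts(rows, start, end):
    """Count calls per name by month, day and hour within [start, end].

    rows is an iterable of (name, timestamp) pairs. Returns tuples of
    (name, month, day, hour, by_month, by_day, by_hour).
    """
    by_month, by_day, by_hour = Counter(), Counter(), Counter()
    for name, ts in rows:
        if not (start <= ts <= end):
            continue  # outside the requested date range
        by_month[(name, ts.year, ts.month)] += 1
        by_day[(name, ts.year, ts.month, ts.day)] += 1
        by_hour[(name, ts.year, ts.month, ts.day, ts.hour)] += 1

    # One output row per (name, year, month, day, hour) group, with the
    # coarser totals attached, like the windowed sums in the SQL.
    return [
        (name, month, day, hour,
         by_month[(name, year, month)],
         by_day[(name, year, month, day)],
         count)
        for (name, year, month, day, hour), count in sorted(by_hour.items())
    ]


rows = [
    ("processA", datetime(2020, 6, 15, 13, 31, 15)),
    ("processB", datetime(2020, 6, 20, 10, 0, 30)),
    ("processA", datetime(2020, 6, 20, 13, 31, 15)),
]
result = call_counts(rows, datetime(2020, 1, 1), datetime(2020, 12, 31))
for row in result:
    print(row)
```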

why justify_interval('360 days'::interval) results '1 year'

For some reason justify_interval(now() - '2013-02-14'::timestamptz) produces weird results:
postgres=# select justify_interval(concat(365*4 +1,' days')::interval);
-[ RECORD 1 ]----+----------------
justify_interval | 4 years 21 days
I checked one year:
postgres=# select justify_interval('365 days'::interval);
justify_interval
------------------
1 year 5 days
So I went further:
postgres=# select justify_interval('360 days'::interval);
justify_interval
------------------
1 year
(1 row)
This behavior is not platform specific (tried several Linuxes, 9.2, 9.3, 9.6)
Why is one year 360 days?
It seems you are looking for what PostgreSQL calls a "symbolic" result, one that uses years and months rather than just days. That is what the age(timestamp, timestamp) and age(timestamp) functions return.
select age(now(), '2013-02-14'); -- 4 years 16:41:02.571547
select age(timestamp '2013-02-14'); -- 4 years
The - operator never returns anything coarser than days. The justify_*() functions (and the *, /, <, > operators) always normalize values using fixed averages (i.e. 1 day is 24 hours and 1 month is 30 days), even though 1 day can actually contain 23-25 hours (think of daylight saving time) and 1 month can contain 28-31 days (so the true length depends on the actual start and end points of the range that produced the interval).
According to the docs:
justify_interval(interval) - Adjust interval using justify_days and justify_hours, with additional sign adjustments
and further:
justify_days(interval) - Adjust interval so 30-day time periods are represented as months
So 12 months of 30 days each give 30*12 = 360 days, which is reported as 1 year.
Not what you'd expect, but it is clearly defined in the docs.
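The arithmetic behind the three results in the question can be checked with a short Python sketch. It only mimics the 30-days-per-month and 12-months-per-year conversion described in the docs for a non-negative number of whole days; the function name justify_whole_days is hypothetical and sign handling and hours are left out.

```python
def justify_whole_days(total_days: int):
    """Split a non-negative day count into (years, months, days),
    treating every month as 30 days and every year as 12 months."""
    months, days = divmod(total_days, 30)
    years, months = divmod(months, 12)
    return years, months, days


for n in (360, 365, 365 * 4 + 1):
    print(n, justify_whole_days(n))
```

With these fixed sizes, 365 days is 12 months plus 5 leftover days, and 1461 days is 48 months plus 21 leftover days, which matches the output shown in the question.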

Tableau Fill missing value with latest value available

I just started to use Tableau and I would like to know how to take the latest value available.
For example I have :
ID Date Active
1 01/01/2016 1
1 01/02/2016 1
1 01/07/2016 0
2 01/02/2016 1
2 01/08/2016 0
Now I would like to have a view by month on the SUM of the Active flag, something like :
01/01/2016 1
01/02/2016 2
01/03/2016 2
01/04/2016 2
01/05/2016 2
01/06/2016 2
01/07/2016 1
01/08/2016 0
As you can see, we assume that the Active flag takes the latest value available, so that:
1 01/01/2016 1
1 01/02/2016 1
1 01/07/2016 0
will be transformed into:
01/01/2016 1
01/02/2016 1
01/03/2016 1
01/04/2016 1
01/05/2016 1
01/06/2016 1
01/07/2016 0
After that, you sum the Active flag.
I think I have to use a calculated field, but I didn't manage to find the right formula.
I assume your example contains an error so I give you the solution for the problem as I understand it. Please explain how I misinterpreted. I do feel like the techniques used should apply nevertheless.
I think you want two things: first you want Tableau to show missing months, which you can do by right clicking on the months and selecting show missing values. This would give you:
Month of Date Active
January 1
February 2
March
April
May
June
July 0
August 0
Secondly you would like missing values to have the value of the result of the previous month.
Month of Date Active
January 1
February 2
March 2
April 2
May 2
June 2
July 0
August 0
And here I have a difference from your example since you state July should have a value of 1 which I don't understand since the sum of July is 0.
If it is the case that this is just due to a typo you can achieve the above table by indeed using a calculated field:
ifnull(sum([Active]), previous_value(0))
If I misinterpreted some part of your problem, please let me know so I change my solution accordingly. But I think in general a combination of lookup, ifnull and previous_value will be able to solve your issue.
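The effect of ifnull(sum([Active]), previous_value(0)) can be illustrated outside Tableau with a small Python sketch. It assumes the monthly sums arrive as a list, with None standing in for the months Tableau shows as empty; the function name fill_forward is hypothetical.

```python
def fill_forward(values, initial=0):
    """Replace each None with the most recent filled value.

    initial plays the role of the 0 in previous_value(0): it is used
    only if the very first entry is missing.
    """
    filled = []
    previous = initial
    for value in values:
        if value is not None:
            previous = value
        filled.append(previous)
    return filled


# Monthly sums January..August after "show missing values".
monthly_sums = [1, 2, None, None, None, None, 0, 0]
print(fill_forward(monthly_sums))
```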

How to calculate average weekly hours between 2 dates covering multiple weeks?

Postgresql 8.4.
I'm new to this concept so if people could teach me I'd appreciate it.
For Obamacare, anyone that works 30 hours per week or more must be offered the same healthcare as is offered to any other worker. We can't afford that so we have to limit work hours for temp and part-timers. This is affecting the whole country.
I need to calculate the hours worked (doesn't matter if overtime, regular time, double time, etc) between two dates, say Jan 1, 2014, and Nov 1, 2014 (Saturday) for each custom week (which begins on Sunday), not the week as defined by Postgresql (which begins on Monday).
Each of my custom work weeks begins on Sunday and ends on Saturday.
I don't know if I have to include weeks where they did not work at all in the average, but let's assume I do. Zero hours that week would draw down the average.
Table name is 'employeetime', date field is 'employeetime.stopdate', hours worked per day is in the field 'employeetime.hours', employeeid field is 'employeetime.empid'.
I'd prefer to do this in one query per employee and I will execute the query once per employee as I loop through employees. If not I'm open to suggestions. But I'd like to understand the SQL presented in the answer.
Currently EXTRACT(week from '2014-01-01') calculates the start of the week as a Monday, so that doesn't work for me.
How would I do that without doing, say a separate query for each week, per person? We have 200 people to process.
Thank you.
I have set up a table to match your format:
select * from employeetime order by date;
id date hours
1 2014-11-06 10
1 2014-11-07 3
1 2014-11-08 5
1 2014-11-09 3
1 2014-11-10 5
You can get the week starting on Sunday by shifting. Note, here the 9th is a Sunday, so that is where we want the boundary.
select *, extract(week from date + '1 day'::interval) as week
from employeetime
order by week;
id date hours week
1 2014-11-07 3 45
1 2014-11-06 10 45
1 2014-11-08 5 45
1 2014-11-09 3 46
1 2014-11-10 5 46
And now the week shifts on Sunday rather than Monday. From here, the query to get hours by week/employee would be simple:
select id, sum(hours) as hours, extract(week from date + '1 day'::interval) as week
from employeetime
group by id, week
order by id, week;
id hours week
1 18 45
1 8 46
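The same one-day shift can be sketched in Python to check the week numbers above. This is only an illustration of the technique, assuming rows of (id, date, hours) as in the sample table; the function name hours_by_sunday_week is made up for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def hours_by_sunday_week(rows):
    """Sum hours per (id, week), with weeks running Sunday to Saturday.

    ISO weeks start on Monday, so each date is shifted forward by one
    day before taking its ISO week number; a Sunday then lands in the
    same week as the Monday that follows it.
    """
    totals = defaultdict(int)
    for emp_id, day, hours in rows:
        week = (day + timedelta(days=1)).isocalendar()[1]
        totals[(emp_id, week)] += hours
    return dict(totals)


rows = [
    (1, date(2014, 11, 6), 10),
    (1, date(2014, 11, 7), 3),
    (1, date(2014, 11, 8), 5),
    (1, date(2014, 11, 9), 3),
    (1, date(2014, 11, 10), 5),
]
weekly = hours_by_sunday_week(rows)
print(weekly)
```

Week numbers repeat from one year to the next, so a range that crosses a year boundary would also need the year in the grouping key.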

SAS - Creating a week variable

I'm using SAS 9.3
I need to create a way to sum up by week total, and I have no idea how to do it. So basically I have a year list of dates (left column below) with a total from that date (the right column). Our week goes from Friday to the previous Thursday (e.g. Thursday Oct 17 through Friday the Oct 25th).
Another issue is that, as you can see, the dates on the left are not completely daily and don't always have a Thursday date before the last Friday date. Would anyone know a way to add these weeks up - Week 1, Week 2, etc etc ...?
Thanks for any help that can be provided
2013-01-01 3
2013-01-02 8
2013-01-03 8
2013-01-04 10
2013-01-06 1
2013-01-07 10
2013-01-08 14
2013-01-09 12
2013-01-10 8
2013-01-11 9
2013-01-12 1
2013-01-14 12
2013-01-15 8
2013-01-16 5
2013-01-17 15
2013-01-18 7
2013-01-20 1
Trivial way:
data want;
  set have;
  weekno = ceil((date-'03JAN2013'd)/7);
run;
I.e., subtract the first Thursday and divide by 7, rounding up (so 1/1 through 1/3 get weekno=0).
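The same arithmetic can be checked with a short Python sketch. It assumes, as in the SAS step, that Thursday 2013-01-03 closes week 0 so that each later week runs Friday through Thursday; the names FIRST_THURSDAY and week_number are hypothetical.

```python
import math
from datetime import date

# Assumption: the Thursday that ends week 0, as in the SAS step.
FIRST_THURSDAY = date(2013, 1, 3)


def week_number(day: date) -> int:
    """Days since the first Thursday, divided by 7 and rounded up."""
    return math.ceil((day - FIRST_THURSDAY).days / 7)


for d in (date(2013, 1, 1), date(2013, 1, 4), date(2013, 1, 10), date(2013, 1, 11)):
    print(d, week_number(d))
```

Summing the totals by this week number then gives one value per Friday-to-Thursday week, and missing dates do not matter because each row is assigned a week independently.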
The INTCK function is also adept at calculating this. The basic structure is
weekno=intck('WEEK.6','04JAN2013'd,date); *the second argument is when you want your first week to start;
WEEK means count in weeks; a number before the decimal point sets a multi-week interval (WEEK2. is two-week periods), and the number after it is the shift index, i.e. the day the week starts on, counted from the default Sunday (1 = Sunday, so 6 = Friday).
You could also create a format that contained your weeks, and use that.