Create date efficiently - postgresql

On Pavel's page is the following function:
CREATE OR REPLACE FUNCTION makedate(year int, dayofyear int)
RETURNS date AS $$
SELECT (date '0001-01-01' + ($1 - 1) * interval '1 year' + ($2 - 1) * interval '1 day'):: date
$$ LANGUAGE sql;
I have the following code:
makedate(y.year,1)
What is the fastest way in PostgreSQL to create a date for January 1st of a given year?
Pavel's function would lead me to believe it is:
date '0001-01-01' + y.year * interval '1 year' + interval '1 day';
My thought would be more like:
to_date( y.year||'-1-1', 'YYYY-MM-DD');
Am looking for the fastest way using PostgreSQL 8.4. (The query that uses the date function can select between 100,000 and 1 million records, so it needs speed.)
Thank you!

I would just use the following, given that year is a variable holding the year, instead of using a function:
(year || '-01-01')::date
Btw. I can't believe that this conversion is your bottleneck. But maybe you should have a look at generate_series here (I don't know your usecase).
select current_date + s.a as dates from generate_series(0,14,7) as s(a);
dates
------------
2004-02-05
2004-02-12
2004-02-19
(3 rows)

Using to_date() is even simpler than you expect:
> select to_date('2008','YYYY');
to_date
------------
2008-01-01
(1 row)
> select to_date(2008::text,'YYYY');
to_date
------------
2008-01-01
(1 row)
Note that you still have to pass the year as a string, but no concatenation is needed.

As suggested by Daniel, in the unlikely case that this conversion is a bottleneck, you might prefer to precompute the function and store in a table. Eg:
select ynum, to_date( ynum ||'-01-01', 'YYYY-MM-DD') ydate
from generate_series(2000,2009) as ynum;
If there are only a few years (and hence no need for indexes), you might even create the table dynamically for the scope of each query, using the new WITH clause.

Related

How to write a query to get the first and last date of January and other months in PostgreSQL

How to get the first and last date of the particular month i.e if i pass the particular month name say March it should return output as 01/03/2019 and 31/03/2019.( For current year)
If you want to pass the value March, you would have to modify the code to understand every month name. I'm not sure it's worth the trouble. Anyway, here's code that returns two values (start and end of month) based on current_date. Should you wish to use a different date, you could put, for example, '2019-04-13' in its place.
SELECT
date_trunc('month', current_date) as month_start
, (date_trunc('month', current_date) + interval '1 month' - interval '1 day')::date as month_end
DATE_TRUNC function truncates the date to the precision specified in first argument, thus making the date as of first day of given month (taken from current_date in above example).
For end of month you need a bit more computation. I've always used this in production and what it does is it first truncates your date to first day of month, then adds one month and goes back one day, so that you have your end of month date (whether it's 30, 31, or special case for February during leap years).
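The truncate, add one month, subtract one day sequence described above can be cross-checked outside the database. The following is a minimal Python sketch of the same arithmetic, not the answer's own code; the function name month_bounds is hypothetical.

```python
from datetime import date, timedelta


def month_bounds(d):
    """Return (first_day, last_day) of the month containing d."""
    # Same effect as date_trunc('month', d): snap to the first of the month.
    first = d.replace(day=1)
    # Step to the first day of the following month (+ interval '1 month').
    if first.month == 12:
        next_first = date(first.year + 1, 1, 1)
    else:
        next_first = date(first.year, first.month + 1, 1)
    # Go back one day (- interval '1 day') to land on the last day.
    return first, next_first - timedelta(days=1)
```

Because the last day is derived from the first day of the next month, February in leap years and the December to January rollover need no special cases.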
for any month, the first day must be 1st,
so it is:
make_date(2019, 3, 1)
and for any month, the last day is 1 day before the first day of next month,
so it is:
make_date(2019, 4, 1) - integer '1'
sorry, I don't have a PostgreSQL environment to test if it is correct,
so please test it yourself.
and, BTW,
you can find more details about date/time operators and functions here:
https://www.postgresql.org/docs/current/functions-datetime.html
One straightforward approach, which would also work on most other databases, would be to truncate the incoming date by month to obtain the first day of that month. Then, truncate the date with one month added to it, and subtract one day, to obtain the last day of the month.
SELECT
DATE_TRUNC('month', '2019-03-15'::date) AS date_start,
DATE_TRUNC('month', '2019-03-15'::date + INTERVAL '1 MONTH')
- INTERVAL '1 DAY' AS date_end;
From here Date LastDay
SELECT date_trunc('MONTH', dtCol)::DATE;
CREATE OR REPLACE FUNCTION last_day(DATE)
RETURNS DATE AS
$$
SELECT (date_trunc('MONTH', $1) + INTERVAL '1 MONTH - 1 day')::DATE;
$$ LANGUAGE 'sql' IMMUTABLE STRICT;
The conversion from a month-name parameter is actually rather simple. Create an array with the month names and find the position of the parameter in that array; the result becomes the month argument to make_date, with the year extracted from the current date and the day set to 1. The code below contains an overloaded function that accepts either a date or a month name with an optional year.
create type first_last_date as ( first_of date, last_of date);
create or replace function first_last_of_month(date_in date)
returns first_last_date
language sql immutable strict leakproof
as $$
select (date_trunc('month', date_in))::date, (date_trunc('month', date_in) + interval '1 month' - interval '1 day')::date ;
$$;
create or replace function first_last_of_month( month_name_in text
, year_in integer default null
)
returns first_last_date
language sql stable leakproof -- stable, not immutable: the default year comes from now()
as $$
select first_last_of_month ( make_date ( coalesce (year_in, extract ('year' from now())::integer)
, array_position(ARRAY['jan','feb','mar','apr','may','jun','jul','aug','sep','oct','nov','dec']
, lower(substring(month_name_in,1,3)))
,1 ) );
$$;
-- test
Select first_last_of_month('March');
Select first_last_of_month('February') y2019
, first_last_of_month('February', 2020) y2020;
Select first_last_of_month(now()::date);
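The array-position lookup is easy to get subtly wrong, because one missing abbreviation shifts every later month. As a hedged illustration, here is a small Python sketch of the same lookup; the names MONTH_ABBREVIATIONS and month_number are hypothetical and not part of the answer.

```python
# All twelve three-letter abbreviations, in calendar order.
MONTH_ABBREVIATIONS = [
    'jan', 'feb', 'mar', 'apr', 'may', 'jun',
    'jul', 'aug', 'sep', 'oct', 'nov', 'dec',
]


def month_number(month_name):
    """Return 1-12 for a month name, or None if it is not recognized."""
    # Mirror lower(substring(month_name_in, 1, 3)) from the SQL version.
    key = month_name.strip().lower()[:3]
    try:
        # list.index is 0-based, array_position is 1-based, hence the + 1.
        return MONTH_ABBREVIATIONS.index(key) + 1
    except ValueError:
        # array_position returns NULL when the element is absent.
        return None
```

Checking that the list has exactly twelve entries and that 'October' maps to 10 is a cheap sanity test for this kind of table.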

Simplify calculation of months between 2 dates (postgresql)

This question is asked many times and one of the suggested queries to get months between 2 dates is not working.
SELECT date_part('month',age('2016-06-30', '2018-06-30'))
The result of this query is 0, because the month is 06 in both dates. It should be 24 months.
This works, but it is a bit clumsy compared to the sql server function:
SELECT date_part ('year', f) * 12 + date_part ('month', f)
FROM age ('2016-06-30', '2018-06-30') f
Like sql server (I think):
DATEDIFF(month, date1, date2)
Is there no simple way (like the above) to calculate the months between 2 dates in Postgresql? I prefer not to use a function if it is possible.
Unfortunately you already have the most elegant solution.
If you look at the documentation for extract (same as date_part):
https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT
month
For timestamp values, the number of the month within the year (1 - 12) ; for interval values, the number of months, modulo 12 (0 - 11)
SELECT EXTRACT(MONTH FROM TIMESTAMP '2001-02-16 20:38:40');
Result: 2
SELECT EXTRACT(MONTH FROM INTERVAL '2 years 3 months');
Result: 3
SELECT EXTRACT(MONTH FROM INTERVAL '2 years 13 months');
Result: 1
For your problem it would be nice if there was a version of month that wasn't modulo 12 but that doesn't exist.
The option you have (extract the year * 12 + month) is the best option there is.
Edit
If you do want to create a function then see the following two functions:
CREATE OR REPLACE FUNCTION get_months(i interval) RETURNS double precision AS $$
SELECT date_part ('year', i) * 12 + date_part ('month', i) ;
$$ LANGUAGE SQL IMMUTABLE;
SELECT get_months(age('2016-06-30', '2018-06-30'));
Or
CREATE OR REPLACE FUNCTION get_months(to_date date, from_date date) RETURNS double precision AS $$
SELECT date_part ('year', f) * 12 + date_part ('month', f)
FROM age (to_date, from_date) f;
$$ LANGUAGE SQL IMMUTABLE;
SELECT get_months('2016-06-30', '2018-06-30');
You can actually create both then just use whichever suits your code.
This will give you the # of months between two dates excluding days.
CREATE OR REPLACE FUNCTION get_months_between(to_date date, from_date date) RETURNS double precision AS $$
SELECT (date_part ('year', to_date) * 12 + date_part ('month', to_date)) - (date_part ('year', from_date) * 12 + date_part ('month', from_date))
$$ LANGUAGE SQL IMMUTABLE;
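To verify results from the year * 12 + month approach, the same whole-month arithmetic can be sketched in Python. This is only an illustration with a hypothetical function name; like get_months_between, it ignores the day of the month.

```python
from datetime import date


def months_between(from_date, to_date):
    """Whole calendar months from from_date to to_date, ignoring days."""
    # Convert each date to a running month count and subtract.
    return (to_date.year * 12 + to_date.month) - (from_date.year * 12 + from_date.month)
```

Note the argument order: swapping the two dates flips the sign, which is also why age('2016-06-30', '2018-06-30') yields a negative interval.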

How to get the absolute value of an interval

Consider the following statement:
select interval '-1 hours'
I couldn't figure out how to get the absolute value of an interval, i.e. to toggle or remove the sign if negative. The only thing that came to my mind is the following:
select abs(extract(epoch from interval '-1 hours'))
But I wonder if there is a more elegant way (a way that preserves the interval type)?
You can find the greatest value between i and -i. For example:
SELECT greatest(-'1 hour'::interval, '1 hour'::interval);
A CASE expression would look more self-explanatory. Example:
SELECT
i,
(CASE WHEN (i < INTERVAL '0') THEN (-i) ELSE i END) AS abs_i
FROM
(VALUES
(INTERVAL '-2 h'),
(INTERVAL '2 m')
) AS foo (i)
which produces:
i | abs_i
-----------+----------
-02:00:00 | 02:00:00
00:02:00 | 00:02:00
There's a discussion on the pgsql-general mailing-list: Absolute value of intervals on why a built-in abs(interval) function is not provided with PostgreSQL.
In short, there's no consensus about what it should do in some cases, when considering the componentized nature of the interval type.
But anyone can create their function implementing their own idea about what it should compute, for instance, building on the expression from LisMorski's answer:
CREATE FUNCTION abs(interval) RETURNS interval AS
$$ select case when ($1<interval '0') then -$1 else $1 end; $$
LANGUAGE sql immutable;
Simple SQL functions are generally inlined during query execution, so the performance should be comparable to having the expression inside the query.
Example:
=# select abs(interval '-2 days +3 minutes');
abs
------------------
2 days -00:03:00
=# select abs(now()-clock_timestamp());
abs
-----------------
00:00:00.000146
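The first result above looks odd because an interval has separate month, day, and time components, and the sign test applies to the interval as a whole. Below is a rough Python model of that behavior. It assumes that PostgreSQL compares intervals by counting a month as 30 days and a day as 24 hours; the tuple representation and the function names are hypothetical.

```python
def interval_compare_key(months, days, seconds):
    """Approximate total size in seconds: 1 month = 30 days, 1 day = 24 hours."""
    return (months * 30 + days) * 86400 + seconds


def abs_interval(months, days, seconds):
    """Negate every component if the interval as a whole is negative."""
    if interval_compare_key(months, days, seconds) < 0:
        return (-months, -days, -seconds)
    return (months, days, seconds)
```

With this model, '-2 days +3 minutes' is (0, -2, 180); it is negative overall, so every component is flipped, giving 2 days and -3 minutes, which matches the output shown.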
This way isn't more elegant than yours, but it returns an interval type:
select interval '-1 hours'*sign(extract(epoch from interval '-1 hours'))

Postgres birthdays selection

I work with a Postgres database. This DB has a table with users, who have a birthdate (date field). Now I want to get all users who have their birthday in the upcoming week....
My first attempt: SELECT id FROM public.users WHERE id IN (long list) AND birthdate > NOW() AND birthdate < NOW() + interval '1 week'
But this returns no results, obviously because of the year. How can I work around this problem?
And does anyone know how PostgreSQL would handle birthdays that fall on 29-02?
We can use a postgres function to do this in a really nice way.
Assuming we have a table people, with a date of birth in the column dob, which is a date, we can create a function that will allow us to index this column ignoring the year. (Thanks to Zoltán Böszörményi):
CREATE OR REPLACE FUNCTION indexable_month_day(date) RETURNS TEXT as $BODY$
SELECT to_char($1, 'MM-DD');
$BODY$ language 'sql' IMMUTABLE STRICT;
CREATE INDEX person_birthday_idx ON people (indexable_month_day(dob));
Now, we need to query against the table, and the index. For instance, to get everyone who has a birthday in April of any year:
SELECT * FROM people
WHERE
indexable_month_day(dob) >= '04-01'
AND
indexable_month_day(dob) < '05-01';
There is one gotcha: if our start/finish period crosses over a year boundary, we need to change the query:
SELECT * FROM people
WHERE
indexable_month_day(dob) >= '12-29'
OR
indexable_month_day(dob) < '01-04';
To make sure we match leap-day birthdays, we need to know if we will 'move' them a day forward or backwards. In my case, it was simpler to just match on both days, so my general query looks like:
SELECT * FROM people
WHERE
indexable_month_day(dob) > '%(start)%'
%(AND|OR)%
indexable_month_day(dob) < '%(finish)%';
I have a django queryset method that makes this all much simpler:
# Assumes "import datetime" at the top of the module.
def birthday_between(self, start, finish):
    """Return the members of this queryset whose birthdays
    lie on or between start and finish."""
    # Widen the window by one day on each side so the strict
    # comparisons below behave as inclusive bounds.
    start = start - datetime.timedelta(1)
    finish = finish + datetime.timedelta(1)
    return self.extra(where=["indexable_month_day(dob) < '%(finish)s' %(andor)s indexable_month_day(dob) > '%(start)s'" % {
        'start': start.strftime('%m-%d'),
        'finish': finish.strftime('%m-%d'),
        # AND within one year, OR when the window crosses New Year.
        'andor': 'and' if start.year == finish.year else 'or'
    }])

def birthday_on(self, date):
    return self.birthday_between(date, date)
Now, I can do things like:
Person.objects.birthday_on(datetime.date.today())
Matching leap-day birthdays only on the day before, or only on the day after, is also possible: you just need to change the SQL test to '>=' or '<=', and not adjust the start/finish in the Python function.
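The AND/OR switch for windows that cross the year boundary can be checked independently of the database. Here is a minimal Python sketch of the same 'MM-DD' string comparison with inclusive bounds; the function name birthday_in_window is hypothetical, and no leap-day adjustment is made.

```python
from datetime import date


def birthday_in_window(dob, start, finish):
    """True if dob's month-day lies on or between start and finish."""
    key = dob.strftime('%m-%d')
    start_key = start.strftime('%m-%d')
    finish_key = finish.strftime('%m-%d')
    if start_key <= finish_key:
        # Window stays inside one calendar year: both tests must hold (AND).
        return start_key <= key <= finish_key
    # Window wraps past December 31: either test may hold (OR).
    return key >= start_key or key <= finish_key
```

Zero-padded 'MM-DD' strings sort in calendar order, which is what makes plain string comparison (and the expression index) work.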
I'm not overly confident in this, but it seems to work in my testing. The key here is the OVERLAPS operator, and some date arithmetic.
I assume you have a table:
create temporary table birthdays (name varchar, bday date);
Then I put some stuff into it:
insert into birthdays (name, bday) values
('Aug 24', '1981-08-24'), ('Aug 04', '1982-08-04'), ('Oct 10', '1980-10-10');
This query will give me the people with birthdays in the next week:
select * from
(select *, bday + date_trunc('year', age(bday)) + interval '1 year' as anniversary from birthdays) bd
where
(current_date, current_date + interval '1 week') overlaps (anniversary, anniversary)
The date_trunc truncates the date at the year, so it should get you up to the current year. I wound up having to add one year. This suggests to me I have an off-by-one in there for some reason. Perhaps I just need to find a way to get dates to round up. In any case, there are other ways to do this calculation. age gives you the interval from the date or timestamp to today. I'm trying to add the years between the birthday and today to get a date in the current year.
The real key is using overlaps to find records whose dates overlap. I use the anniversary date twice to get a point-in-time.
Finally, to show the upcoming birthdays of the next 14 days I used this:
SELECT
-- 14 days before birthday of 2000
to_char( to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') - interval '14 days' , 'YYYY-MM-dd') as _14b_b2000,
-- birthday of 2000
to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') as date_b2000,
-- current date of 2000
to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') as date_c2000,
-- 14 days after current date of 2000
to_char( to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') + interval '14 days' , 'YYYY-MM-dd') as _14a_c2000,
-- 1 year after birthday of 2000
to_char( to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') + interval '1 year' , 'YYYY-MM-dd') as _1ya_b2000
FROM c
WHERE
-- the condition
-- current date of 2000 between 14 days before birthday of 2000 and birthday of 2000
to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') between
to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') - interval '14 days' and
to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd')
or
-- 1 year after birthday of 2000 between current date of 2000 and 14 days after current date of 2000
to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') + interval '1 year' between
to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') and
to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') + interval '14 days'
;
So:
To solve the leap-year issue, I set both birthdate and current date to 2000,
and handle intervals only from this initial correct dates.
To take care of the near end/beginning dates,
I compared first the 2000 current date to the 2000 birthday interval,
and in case current date is at the end of the year, and the birthday is at the beginning,
I compared the 2001 birthday to the 2000 current date interval.
Here's a query that gets the right result, most of the time.
SELECT
(EXTRACT(MONTH FROM DATE '1980-08-05'),
EXTRACT(DAY FROM DATE '1980-08-05'))
IN (
SELECT EXTRACT(MONTH FROM CURRENT_DATE + s.a) AS m,
EXTRACT(DAY FROM CURRENT_DATE + s.a) AS d
FROM GENERATE_SERIES(0, 6) AS s(a)
);
(It doesn't take care of leap years correctly, but you could use extract again to work the subselect in terms of a leap year instead of the current year.)
EDIT: Got it working for all cases, and as a useful query rather than a scalar select. I'm using some extra subselects so that I don't have to type the same date or expression twice for month and day, and of course the actual data would be in a table instead of the values expression. You might adapt this differently. It might still stand to improve by making a more intelligent series for weeks containing leap days, since sometimes that interval will only contain 6 days (for non-leap years).
I'll try to explain this from the inside out. The first thing I do is normalize the target date (CURRENT_DATE usually, but explicit in this code) into a year that I know is a leap year, so that February 29th appears among the dates. The next step is to generate a relation with all of the month-day pairs that are under consideration. Since there's no easy way to do an interval check in terms of month-day, it's all done using generate_series.
From there it's a simple matter of extracting the month and day from the target relation (the people alias) and filtering just the rows that are in the subselect.
SELECT *
FROM
(select column1 as birthdate, column2 as name
from (values
(date '1982-08-05', 'Alice'),
(date '1976-02-29', 'Bob'),
(date '1980-06-10', 'Carol'),
(date '1992-06-13', 'David')
) as birthdays) as people
WHERE
((EXTRACT(MONTH FROM people.birthdate),
EXTRACT(DAY FROM people.birthdate)) IN (
SELECT EXTRACT(MONTH FROM thedate.theday + s.a) AS m,
EXTRACT(DAY FROM thedate.theday + s.a) AS d
FROM
(SELECT date (v.column1 -
(extract (YEAR FROM v.column1)-2000) * INTERVAL '1 year'
) as theday
FROM (VALUES (date '2011-06-09')) as v) as thedate,
GENERATE_SERIES(0, 6) AS s(a)
)
)
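The generate_series approach above amounts to building a set of (month, day) pairs and testing membership. As a rough cross-check, the same idea can be sketched in Python with the sample data from the query; the function name upcoming_month_days is hypothetical.

```python
from datetime import date, timedelta


def upcoming_month_days(target, days=7):
    """Set of (month, day) pairs for `days` consecutive days from target."""
    # Normalize into the leap year 2000 so that February 29 can appear.
    base = target.replace(year=2000)
    pairs = set()
    for offset in range(days):
        day = base + timedelta(days=offset)
        pairs.add((day.month, day.day))
    return pairs


people = [
    (date(1982, 8, 5), 'Alice'),
    (date(1976, 2, 29), 'Bob'),
    (date(1980, 6, 10), 'Carol'),
    (date(1992, 6, 13), 'David'),
]
window = upcoming_month_days(date(2011, 6, 9))
matches = [name for born, name in people if (born.month, born.day) in window]
```

For the target date 2011-06-09 the window covers June 9 through June 15, so only Carol and David match.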
Operating on days, as I've done here, should work splendidly all the way up to a two-month interval (if you wanted to look out that far), since December 31 + two months and change should include the leap day. On the other hand, it's almost certainly more useful to just work on whole months for such a query, in which case you don't really need anything more than extract(month from ....
First find out how old the person currently is using age(), then grab the year from that extract(year from age()). This is how old they are currently in years, so for their age at their next birthday add 1 to the year. Then their next birthday is found by adding an interval of this many years * interval '1 year' to their birthday. Done.
I've used a subselect here to add the next_birth_day column in to the complete table to make the select clause simpler. You can then play with the where conditions to suit your needs.
select *
from (
select *,
(extract(year from age(birth_date)) + 1) * interval '1 year' + birth_date "next_birth_day"
from public.users
) as users_with_upcoming_birth_days
where next_birth_day between now() and now() + '7 days'
This is based on Daniel Lyons's anniversary idea, by calculating the interval between the next birthday and today, with just +/- date arithmetic:
SELECT
today,
birthday,
CASE
WHEN this_year_anniversary >= today
THEN this_year_anniversary
ELSE this_year_anniversary + '1 year'::interval
END - today < '1 week'::interval AS is_upcoming
FROM
(
SELECT
today,
birthday,
birthday + years AS this_year_anniversary
FROM
(
SELECT
today,
birthday,
((
extract(year FROM today) - extract(year from birthday)
) || ' years')::interval AS years
FROM
(VALUES ('2011-02-28'::date)) AS t1 (today),
(VALUES
('1975-02-28'::date),
('1975-03-06'::date),
('1976-02-28'::date),
('1976-02-29'::date),
('1976-03-06'::date)
) AS t2 (birthday)
) AS t
) AS t;
In case you want it to work with leap years:
create or replace function birthdate(date)
  returns date
as $$
  select (date_trunc('year', now()::date)
         + age($1, 'epoch'::date)
         - (extract(year from age($1, 'epoch'::date)) || ' years')::interval
         )::date;
$$ language sql stable strict;
Then:
where birthdate(birthdate) between current_date
and current_date + interval '1 week'
See also:
Getting all entries who's Birthday is today in PostgreSQL
Example: birthdate between Jan 20 and Feb 10
SELECT * FROM users WHERE TO_CHAR(birthdate, '1800-MM-DD') BETWEEN '1800-01-20' AND '1800-02-10'
Why 1800?
It doesn't matter; it could be any year.
In my registration form, users can enter either a full date of birth (with the year) or just the birthday (without the year), in which case I save the year as 1800 to make the date easier to work with.
Here's my take, which works with leap years too:
CREATE OR REPLACE FUNCTION days_until_birthday(
p_date date
) RETURNS integer AS $$
DECLARE
v_now date;
v_days integer;
v_date_upcoming date;
v_years integer;
BEGIN
v_now = now()::date;
IF (p_date IS NULL OR p_date > v_now) THEN
RETURN NULL;
END IF;
v_years = date_part('year', v_now) - date_part('year', p_date);
v_date_upcoming = p_date + v_years * interval '1 year';
IF (v_date_upcoming < v_now) THEN
v_date_upcoming = v_date_upcoming + interval '1 year';
END IF;
v_days = v_date_upcoming - v_now;
RETURN v_days;
END
$$ LANGUAGE plpgsql STABLE; -- not IMMUTABLE: the result depends on now()
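For testing the function's results against known dates, the same steps can be sketched in Python. This is an illustration only, with hypothetical names. It takes today as a parameter instead of calling now(), and it assumes that PostgreSQL clamps February 29 to February 28 when adding whole years lands in a non-leap year. Unlike the PL/pgSQL version, it recomputes the following year's date from the original birth date rather than adding one year to the clamped date.

```python
from datetime import date


def add_years(d, years):
    """Shift d by whole years, clamping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only Feb 29 can fail: the target year is not a leap year.
        return d.replace(year=d.year + years, day=28)


def days_until_birthday(birth, today):
    """Days from today to the next birthday (0 if it is today)."""
    if birth is None or birth > today:
        return None
    years = today.year - birth.year
    upcoming = add_years(birth, years)
    if upcoming < today:
        # This year's birthday has passed; use next year's.
        upcoming = add_years(birth, years + 1)
    return (upcoming - today).days
```

Passing today explicitly keeps the function deterministic, which makes edge cases such as leap-day birthdays easy to test.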
I know this post is old, but I had the same issue and came up with this simple and elegant solution:
It is pretty easy with age() and accounts for leap years... for the people who had their birthdays in the last 20 days:
SELECT * FROM c
WHERE date_trunc('year', age(birthdate)) != date_trunc('year', age(birthdate + interval '20 days'))
I simply built this year's date from the original birth date:
( DATE_PART('month', birth_date) || '/' || DATE_PART('day', birth_date) || '/' || DATE_PART('year', now()))::date between :start_date and :end_date
I hope this helps.

How to get the number of days in a month?

I am trying to get the following in Postgres:
select day_in_month(2);
Expected output:
28
Is there any built-in way in Postgres to do that?
SELECT
DATE_PART('days',
DATE_TRUNC('month', NOW())
+ '1 MONTH'::INTERVAL
- '1 DAY'::INTERVAL
)
Substitute NOW() with any other date.
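The query takes the day number of the month's last date. As a hedged sketch of the same trick outside the database, the Python function below takes an explicit year and month; the name days_in_month is hypothetical.

```python
from datetime import date, timedelta


def days_in_month(year, month):
    """Number of days in the given month, via 'first of next month minus one day'."""
    if month == 12:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, month + 1, 1)
    # The day number of the month's last date is the month's length.
    return (next_first - timedelta(days=1)).day
```

The year has to be part of the input: February has 29 days in 2016 but 28 in 2017, so a month number alone is not enough.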
Using the smart "trick" to extract the day part from the last date of the month, as demonstrated by Quassnoi. But it can be a bit simpler / faster:
SELECT extract(days FROM date_trunc('month', now()) + interval '1 month - 1 day');
Rationale
extract is standard SQL, so maybe preferable, but it resolves to the same function internally as date_part(). The manual:
The date_part function is modeled on the traditional Ingres equivalent to the SQL-standard function extract:
But we only need to add a single interval. Postgres allows multiple time units at once. The manual:
interval values can be written using the following verbose syntax:
[@] quantity unit [quantity unit...] [direction]
where quantity is a number (possibly signed); unit is microsecond,
millisecond, second, minute, hour, day, week, month, year, decade,
century, millennium, or abbreviations or plurals of these units;
ISO 8601 or standard SQL format are also accepted. Either way, the manual again:
Internally interval values are stored as months, days, and seconds.
This is done because the number of days in a month varies, and a day
can have 23 or 25 hours if a daylight savings time adjustment is
involved. The months and days fields are integers while the seconds
field can store fractions.
(Output / display depends on the setting of IntervalStyle.)
The above example uses default Postgres format: interval '1 month - 1 day'. These are also valid (while less readable):
interval '1 mon - 1 d' -- unambiguous abbreviations of time units are allowed
Standard SQL format:
interval '0-1 -1 0:0'
ISO 8601 format:
interval 'P1M-1D';
All the same.
Note that expected output for day_in_month(2) can be 29 because of leap years. You might want to pass a date instead of an int.
Also, beware of daylight saving time: remove the time zone, or else some month calculations could be wrong (the next example is in CET / CEST):
SELECT DATE_TRUNC('month', '2016-03-12'::timestamptz) + '1 MONTH'::INTERVAL
- DATE_TRUNC('month', '2016-03-12'::timestamptz) ;
------------------
30 days 23:00:00
SELECT DATE_TRUNC('month', '2016-03-12'::timestamp) + '1 MONTH'::INTERVAL
- DATE_TRUNC('month', '2016-03-12'::timestamp) ;
----------
31 days
This works as well.
WITH date_ AS (SELECT your_date AS d)
SELECT d + INTERVAL '1 month' - d FROM date_;
Or just:
SELECT your_date + INTERVAL '1 month' - your_date;
These two return interval, not integer.
SELECT cnt_dayofmonth(2016, 2); -- 29
create or replace function cnt_dayofmonth(_year int, _month int)
returns int2 as
$BODY$
-- ZU 2017.09.15, returns the count of days in the month; inputs are year and month
declare
datetime_start date := ('01.01.'||_year::char(4))::date;
datetime_month date := ('01.'||_month||'.'||_year)::date;
cnt int2;
begin
select extract(day from (select (datetime_month + INTERVAL '1 month -1 day'))) into cnt;
return cnt;
end;
$BODY$
language plpgsql;
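You can write a function (note that datediff and last_day are not built into PostgreSQL; they are available in Amazon Redshift):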
CREATE OR REPLACE FUNCTION get_total_days_in_month(timestamp)
RETURNS decimal
IMMUTABLE
AS $$
select cast(datediff(day, date_trunc('mon', $1), last_day($1) + 1) as decimal)
$$ LANGUAGE sql;