pgSQL date function - postgresql

How do I tweak this pgSQL SELECT using pgSQL's date functions so it always returns "01" for the day # and "00:00:00" for the time?
SELECT s.last_mailing + '1 month'::interval AS next_edition_date FROM
last_mailing is defined as
last_mailing timestamp without time zone
Examples of the result I am wanting are:
2015-10-01 00:00:00
2015-11-01 00:00:00
2015-12-01 00:00:00

You're looking for date_trunc().
psql (9.5alpha2, server 9.4.4)
Type "help" for help.
testdb=# create table subscriptions as select 1 "id", '2015-07-14T12:32'::timestamp last_mailing union all select 2, '2015-08-15T00:00';
SELECT 2
testdb=# select * from subscriptions;
id | last_mailing
----+---------------------
1 | 2015-07-14 12:32:00
2 | 2015-08-15 00:00:00
(2 rows)
testdb=# select *, date_trunc('month', last_mailing) + interval '1 month' AS next_edition_date from subscriptions;
id | last_mailing | next_edition_date
----+---------------------+---------------------
1 | 2015-07-14 12:32:00 | 2015-08-01 00:00:00
2 | 2015-08-15 00:00:00 | 2015-09-01 00:00:00
(2 rows)

If you want the first day of the next month, then use:
SELECT (s.last_mailing - (extract(day from s.last_mailing) - 1) * interval '1 day') +
interval '1 month' AS next_edition_date
FROM . . .
If you don't want the time, then use date_trunc():
SELECT date_trunc('day',
(s.last_mailing - (extract(day from s.last_mailing) - 1) * interval '1 day') +
interval '1 month'
) AS next_edition_date
FROM . . .
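As a quick sanity check, the two approaches can be compared side by side. This is only a sketch: the samples CTE and its values are made up for illustration (the first two rows mirror the subscriptions example above, the third is a month-end case), and the day offset is written as (day - 1) so that the subtraction lands on the first of the month before the month is added.

```sql
-- Hypothetical sample rows; only the last_mailing column name comes from the question.
WITH samples(last_mailing) AS (
    VALUES (timestamp '2015-07-14 12:32'),
           (timestamp '2015-08-15 00:00'),
           (timestamp '2015-01-31 23:59')   -- month-end edge case
)
SELECT last_mailing,
       -- Truncate to the first of the month, then step one month forward.
       date_trunc('month', last_mailing) + interval '1 month' AS via_date_trunc,
       -- Step back to day 1, add a month, then drop the time part.
       date_trunc('day',
                  last_mailing
                  - (extract(day FROM last_mailing) - 1) * interval '1 day'
                  + interval '1 month') AS via_extract
FROM samples;
```

Both columns should agree on every row. Moving back to day 1 before adding the month matters for dates such as January 31, because adding one month first would clamp to the end of February.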

Related

postgres, group by date, and bucketize per hour

I would like to create a result set that can be used with Grafana for a heatmap. To display the data correctly, I need the output to look like this:
| date | 00:00 | 01:00 | 02:00 | 03:00 | ...etc |
| 2023-01-01 | 1 | 2 | 0 | 1 | ... |
| 2023-01-02 | 0 | 0 | 1 | 1 | ... |
| 2023-01-03 | 4 | 0 | 2 | 0 | ... |
my data table structure:
trades
-----
id
closed_at
asset
So far, I know that I need to use generate_series and the interval function to return the hours, but I need my query to output these hours as columns, and I've not been able to do that, as it's getting a bit too advanced.
So far I have the following query:
SELECT
closed_at::DATE,
COUNT(id)
FROM trades
GROUP BY closed_at::DATE
ORDER BY closed_at::DATE
It now shows the amount of rows grouped by the days, I want to further aggregate the data, so it outputs the count per hour, as shown above.
Thanks for your help!
You can add more columns; here I only include 0:00 to 5:00.
filter usage: https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES
date_trunc usage: https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-TRUNC
BEGIN;
-- Temporary demo table mirroring the question's trades table
CREATE temp TABLE trades (
id bigint GENERATED BY DEFAULT AS IDENTITY,
closed_at timestamp,
asset text
) ON COMMIT DROP;
-- Random sample rows for 2023-01-01
INSERT INTO trades (closed_at)
SELECT
date '2023-01-01' + interval '10 min' * (random() * i * 10)::int
FROM
generate_series(1, 10) g (i);
-- Random sample rows for 2023-01-02
INSERT INTO trades (closed_at)
SELECT
date '2023-01-02' + interval '10 min' * (random() * i * 10)::int
FROM
generate_series(1, 10) g (i);
-- One FILTER clause per hour bucket, grouped by day
SELECT
closed_at::date
,COUNT(id) FILTER (WHERE date_trunc('hour', closed_at) = closed_at::date) AS "0:00"
,COUNT(id) FILTER (WHERE date_trunc('hour', closed_at) = closed_at::date + interval '1 hour') AS "1:00"
,COUNT(id) FILTER (WHERE date_trunc('hour', closed_at) = closed_at::date + interval '2 hour') AS "2:00"
,COUNT(id) FILTER (WHERE date_trunc('hour', closed_at) = closed_at::date + interval '3 hour') AS "3:00"
,COUNT(id) FILTER (WHERE date_trunc('hour', closed_at) = closed_at::date + interval '4 hour') AS "4:00"
,COUNT(id) FILTER (WHERE date_trunc('hour', closed_at) = closed_at::date + interval '5 hour') AS "5:00"
FROM
trades
GROUP BY
1;
END;
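A somewhat shorter variant of the same idea keys each FILTER on the hour number instead of comparing truncated timestamps. This is a sketch that assumes the trades table and closed_at column from the question; the trade_date alias and the column labels are illustrative choices, not something Grafana requires.

```sql
-- One FILTER per hour of the day (0-23); extend the list up to 23 as needed.
SELECT closed_at::date AS trade_date,
       count(*) FILTER (WHERE extract(hour FROM closed_at) = 0) AS "00:00",
       count(*) FILTER (WHERE extract(hour FROM closed_at) = 1) AS "01:00",
       count(*) FILTER (WHERE extract(hour FROM closed_at) = 2) AS "02:00",
       count(*) FILTER (WHERE extract(hour FROM closed_at) = 3) AS "03:00"
FROM trades
GROUP BY 1
ORDER BY 1;
```

Days with no trades at all do not appear in this result; a left join against generate_series over the date range would be needed to include them.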

Generate date series by month between two dates and average by month in postgresql

I want to create a row for every month between two dates. Each row should start on the start date or the first day of the month, and end on the last day of the month or the end date, with an average for the part of the month that is covered (if the start date is the 15th, the average should be 15/30) for my table.
input :
product_id | date_start | date_end
1 | 16-01-2020 | 15-03-2020
2 | 07-01-2020 | 22-04-2020
The result should be :
product_id | date_start | date_end | average
1 | 16-01-2020 | 31-01-2020 | 0.5
1 | 01-02-2020 | 29-02-2020 | 1
1 | 01-03-2020 | 15-03-2020 | 0.5
2 | 07-01-2020 | 31-01-2020 | 0.76 -- (30-07)/30
2 | 01-02-2020 | 29-02-2020 | 1
2 | 01-03-2020 | 31-03-2020 | 1
2 | 01-04-2020 | 22-04-2020 | 0.76
I tried using generate series and date trunc and union
SELECT (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE AS date_start ,
(date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ):: DATE AS date_end
FROM generate_series( DATE '2020-01-15', DATE '2020-05-21', interval '1 MONTH' ) AS dt
union select '2020-01-15' as date_start,
(date_trunc('month', '2020-01-15'::date) + INTERVAL '1 MONTH - 1 day' ):: DATE AS date_end
union select (date_trunc('month', '2020-05-21'::date) ):: DATE AS date_start ,
'2020-05-21' AS date_end
order by date_start
To add the average, I calculated the difference between the two dates:
SELECT (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE AS date_start ,
(date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ):: DATE AS date_end,
((date_trunc('month', dt) + INTERVAL '2 MONTH - 1 day' ) - (date_trunc('month', dt) + INTERVAL '1 MONTH' ):: DATE )
FROM generate_series( DATE '2020-01-15', DATE '2020-05-21', interval '1 MONTH' ) AS dt
At this point I hit a wall.
The following gives approximately the result you wanted; only the averages deviate. I believe this stems from an inconsistency in your calculations, where the dates are inclusive in some rows and exclude either the start or the end date in others; I was inclusive in all. The other difference is that I used the actual number of days in the month as the denominator instead of 30. This is necessary for Feb to ever have an average of 1; otherwise its maximum would be 0.97, and full months with 31 days would average 1.03.
with product_dates(product_id, date_start, date_end) as
( values (1,'2020-01-16'::date,'2020-03-15'::date)
, (2,'2020-01-07'::date,'2020-04-22'::date)
)
select product_id, start_date, end_date, round((end_date-start_date+1 ) * 1.0 / (eom-som+1),2) average
from (select product_id
, greatest(date_start,dt::date) start_date
, least(date_end, (dt+interval '1 month' -interval '1 day')::date) end_date
, dt::date som
, (dt+interval '1 month' -interval '1 day')::date eom
from product_dates
cross join generate_series(date_trunc('month', date_start)
,date_trunc('month', date_end) + interval '1 month' - interval '1 day'
,interval '1 month'
) gs(dt)
) s1;
The heart is generate_series working directly with dates; notice the date manipulation that ensures I get the first and last day of each month. The greatest and least functions then select either the parameter date or the generated one.
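To make the inclusive counting concrete, here is the arithmetic for a single row, worked by hand as a sketch: product 1 in January 2020, which is covered from the 16th through the 31st. The column aliases are illustrative only.

```sql
-- date - date yields an integer number of days; + 1 makes the range inclusive.
SELECT date '2020-01-31' - date '2020-01-16' + 1 AS days_covered,   -- 16
       date '2020-01-31' - date '2020-01-01' + 1 AS days_in_month,  -- 31
       round((date '2020-01-31' - date '2020-01-16' + 1) * 1.0
             / (date '2020-01-31' - date '2020-01-01' + 1), 2) AS average;  -- 0.52
```

This is why the first row comes out as 0.52 rather than the 0.5 shown in the question: 16 inclusive days out of 31.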

Postgres expand time window using date_part

Have two dates - '2018-05-01' and '2018-06-01'. I would like to expand this window to the past by day difference of those dates.
SELECT * FROM data
WHERE
start_time > CAST('2018-05-01' AS timestamptz) - INTERVAL '30 DAY'
AND start_time < CAST('2018-06-01' AS timestamptz)
How can I replace INTERVAL '30 DAY' with number of days between given dates without explicitly defining number of days? I know to calculate day difference:
date_part('day',age('2018-05-01', '2018-06-01'))
But I'm not sure how to incorporate it into the subtraction. The dates and the number of days between them will change.
You can use date_trunc('mon', some_date_expression) to round down to the start of a month:
select date_trunc('mon', now() - '3 mon'::interval) as date_begin
, date_trunc('mon', now() - '1 day'::interval) as date_end
;
Result
date_begin | date_end
------------------------+------------------------
2018-03-01 00:00:00+01 | 2018-06-01 00:00:00+02
(1 row)
You can simply subtract the difference from the start date:
with t (start_date, end_date) as (
values (date '2018-05-01', date '2018-06-01')
)
select start_date - (end_date - start_date) as new_start,
end_date
from t;
returns
new_start | end_date
-----------+-----------
2018-03-31 | 2018-06-01
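Applied to the WHERE clause from the question, the same subtraction might look like the sketch below. It assumes the data table and start_time column from the question and keeps the two dates as literals; in practice they would presumably be query parameters.

```sql
SELECT *
FROM data
WHERE start_time > date '2018-05-01'
                   - (date '2018-06-01' - date '2018-05-01')  -- date - date = 31 days
  AND start_time < date '2018-06-01';
```

Subtracting two dates yields an integer number of days, and subtracting that integer from a date yields a date again, so no explicit INTERVAL is needed.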

Getting data from postgres weekly (according to date)

user timespent(in sec) date(in timestamp)
u1 10 t1(2015-08-15)
u1 20 t2(2015-08-19)
u1 15 t3(2015-08-28)
u1 16 t4(2015-09-06)
Above is the format of my table, which represents timespent by user on a course and it is ordered by timestamp. I want to get sum of timespent by a particular user, say u1 weekly in the format :
start_date end_date sum
2015-08-15 2015-08-21 30
2015-08-22 2015-08-28 15
2015-08-29 2015-09-04 0
2015-09-05 2015-09-11 16
The difficulty lies in the fact that the seven-day periods that you want are not regular weeks starting on Monday.
You therefore cannot use the standard functions to get the week number from the date, and have to build your own week generator using generate_series().
Example data:
create table sessions (user_name text, time_spent int, session_date timestamp);
insert into sessions values
('u1', 10, '2015-08-15'),
('u1', 20, '2015-08-19'),
('u1', 15, '2015-08-28'),
('u1', 16, '2015-09-06');
The query for an arbitrarily chosen period from 2015-08-15 to 2015-09-06:
with weeks as (
select d::date start_date, d::date+ 6 end_date
from generate_series('2015-08-15', '2015-09-06', '7d'::interval) d
)
select w.start_date, w.end_date, coalesce(sum(time_spent), 0) total
from weeks w
left join (
select start_date, end_date, coalesce(time_spent, 0) time_spent
from weeks
join sessions
on session_date between start_date and end_date
where user_name = 'u1'
) s
on w.start_date = s.start_date and w.end_date = s.end_date
group by 1, 2
order by 1;
start_date | end_date | total
------------+------------+-------
2015-08-15 | 2015-08-21 | 30
2015-08-22 | 2015-08-28 | 15
2015-08-29 | 2015-09-04 | 0
2015-09-05 | 2015-09-11 | 16
(4 rows)
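The same result can probably be reached with a single left join, which may be easier to read. This is a sketch against the sessions example table above; the half-open range (>= start, < end + 1) is my own choice so that sessions with a time of day on the last day of a week are still counted.

```sql
WITH weeks AS (
    -- Seven-day buckets anchored on the first date of the period.
    SELECT d::date AS start_date, d::date + 6 AS end_date
    FROM generate_series(timestamp '2015-08-15',
                         timestamp '2015-09-06',
                         interval '7 days') AS d
)
SELECT w.start_date,
       w.end_date,
       coalesce(sum(s.time_spent), 0) AS total
FROM weeks w
LEFT JOIN sessions s
       ON s.session_date >= w.start_date
      AND s.session_date <  w.end_date + 1
      AND s.user_name = 'u1'   -- filter in the join so empty weeks survive
GROUP BY w.start_date, w.end_date
ORDER BY w.start_date;
```

Keeping the user filter inside the ON clause, rather than in WHERE, is what preserves the weeks with no sessions.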
select
ui,
date_trunc('week', the_date)::date as start_date,
date_trunc('week', the_date)::date + 6 as end_date,
sum(timespent) as "sum"
from t
group by 1, 2, 3
order by 1,2
Something like this (assuming that by timestamp you mean the data type timestamp).
To make Sunday the first day of the week, I added an extra day to "date" in the group by.
select (start_date - date_part('dow', start_date) * interval '1 day')::date start_date,
(start_date + (6 - date_part('dow', start_date)) * interval '1 day')::date end_date,
total_time_spent
from (
select min("date") start_date, sum(timespent) total_time_spent
from mytable
where "user" = 'u1'
group by date_part('year', "date"), date_part('week', "date" + interval '1 day')) "tmp"
order by start_date
This is a more generic approach, for any date interval.

Postgres: first and last days of any month

I have two date columns in postgres: 'StartDate' and 'EndDate'.
I need to assign various combinations of date ranges with a different code.
e.g
IF StartDate = EndDate THEN 'a'
What I'd like to be able to do is select any row in which the StartDate is the first day of any month AND the EndDate is the last day of any month. (IF StartDate = FirstDayOfMonth AND EndDate = LastDayOfMonth THEN 'b').
e.g. when StartDate = '01-02-2011' and EndDate = '31-05-2012' then 'b', or StartDate = '01-11-1996' and EndDate = '31-01-2001' then 'b'.
The easiest way to get the first and last day of a month is to rely on date arithmetic, e.g.:
# select date_trunc('month', now()::date);
date_trunc
------------------------
2014-12-01 00:00:00+01
(1 row)
# select date_trunc('month', now()::date)
+ interval '1 month'
- interval '1 day';
?column?
------------------------
2014-12-31 00:00:00+01
(1 row)
If needed, note that you can use generate_series() to compute the full list between two dates:
select d as first_day,
d + interval '1 month' - interval '1 day' as last_day
from generate_series('2014-01-01'::date,
'2014-12-01'::date,
'1 month') as d;
first_day | last_day
------------------------+------------------------
2014-01-01 00:00:00+01 | 2014-01-31 00:00:00+01
2014-02-01 00:00:00+01 | 2014-02-28 00:00:00+01
2014-03-01 00:00:00+01 | 2014-03-31 00:00:00+02
2014-04-01 00:00:00+02 | 2014-04-30 00:00:00+02
2014-05-01 00:00:00+02 | 2014-05-31 00:00:00+02
2014-06-01 00:00:00+02 | 2014-06-30 00:00:00+02
2014-07-01 00:00:00+02 | 2014-07-31 00:00:00+02
2014-08-01 00:00:00+02 | 2014-08-31 00:00:00+02
2014-09-01 00:00:00+02 | 2014-09-30 00:00:00+02
2014-10-01 00:00:00+02 | 2014-10-31 00:00:00+01
2014-11-01 00:00:00+01 | 2014-11-30 00:00:00+01
2014-12-01 00:00:00+01 | 2014-12-31 00:00:00+01
(12 rows)
You can extract the day part of the date and check its value. For the first day of the month it should be 1, and for the last - you simply add one day to the date and check whether it is the first day of the next month, this would mean that the original date is the last day of the current month.
...
CASE WHEN
EXTRACT('day' FROM StartDate) = 1
AND
EXTRACT('day' FROM EndDate + '1 day'::interval) = 1
THEN 'b'
...
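Put together as a complete statement, the classification could look like the sketch below. The ranges CTE and its rows are invented for illustration (the first two pairs are the examples from the question), and the check uses date_trunc() rather than EXTRACT, which is an equivalent way to express the same test.

```sql
-- Hypothetical sample rows; column names are lower-cased for convenience.
WITH ranges(startdate, enddate) AS (
    VALUES (date '2011-02-01', date '2012-05-31'),
           (date '1996-11-01', date '2001-01-31'),
           (date '2014-03-05', date '2014-03-05'),
           (date '2014-03-01', date '2014-03-30')
)
SELECT startdate,
       enddate,
       CASE
           WHEN startdate = enddate THEN 'a'
           -- First day of its month AND last day of its month.
           WHEN startdate = date_trunc('month', startdate::timestamp)::date
            AND enddate = (date_trunc('month', enddate::timestamp)
                           + interval '1 month' - interval '1 day')::date
           THEN 'b'
       END AS code
FROM ranges;
```

With these rows the first two pairs get 'b', the third gets 'a', and the last gets NULL because March 30 is not the end of its month.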