PostgreSQL: split date range by year parts (financial year)

I have a table like follows:
id start_date end_date
1 2020-01-01 2020-05-01
2 2020-03-01 2021-04-02
I need to split each row by financial year (e.g. 2020-04-01 -> 2021-03-31).
The result of the query would then be as follows:
id start_date end_date
1 2020-01-01 2020-03-31
1 2020-04-01 2020-05-01
2 2020-03-01 2020-03-31
2 2020-04-01 2021-03-31
2 2021-04-01 2021-04-02

Actually another post helped me resolve this: Date split-up based on Fiscal Year
DROP TABLE IF EXISTS your_table;
CREATE TABLE your_table (id int, start_date date, end_date date);
INSERT INTO your_table VALUES (1, '2020-01-01', '2020-05-01');
INSERT INTO your_table VALUES (2, '2020-03-01', '2021-04-02');
-- One output row per financial year (1 April .. 31 March) that a range touches.
SELECT
    id,
    -- make_date avoids the DateStyle dependence of ('01-04-' || year)::date
    GREATEST(start_date, make_date(series.year, 4, 1)) AS year_start,
    LEAST(end_date, make_date(series.year + 1, 3, 31)) AS year_end
FROM
    (SELECT
        id,
        start_date,
        end_date,
        -- shifting back 3 months maps a date to the calendar year its financial year starts in
        generate_series(
            date_part('year', your_table.start_date - INTERVAL '3 months')::int,
            date_part('year', your_table.end_date - INTERVAL '3 months')::int)
    FROM your_table) AS series(id, start_date, end_date, year)
ORDER BY
    id, year_start;
Result:
"id","year_start","year_end"
1,"2020-01-01","2020-03-31"
1,"2020-04-01","2020-05-01"
2,"2020-03-01","2020-03-31"
2,"2020-04-01","2021-03-31"
2,"2021-04-01","2021-04-02"
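If the financial year does not start in April, the same idea can probably be generalized by making the start month a parameter. The following is only a sketch under assumptions: the table name date_ranges, the params CTE and the fy_start_month name are made up for illustration, and make_date / make_interval require PostgreSQL 9.4 or later.

```sql
-- Hypothetical sample table mirroring the question's data.
CREATE TEMP TABLE date_ranges (id int, start_date date, end_date date);
INSERT INTO date_ranges VALUES
    (1, '2020-01-01', '2020-05-01'),
    (2, '2020-03-01', '2021-04-02');

WITH params AS (
    SELECT 4 AS fy_start_month   -- assumption: the financial year starts on 1 April
)
SELECT
    r.id,
    GREATEST(r.start_date, make_date(fy.fy_year, p.fy_start_month, 1)) AS year_start,
    -- the day before the next financial year starts
    LEAST(r.end_date, make_date(fy.fy_year + 1, p.fy_start_month, 1) - 1) AS year_end
FROM date_ranges AS r
CROSS JOIN params AS p
-- one row per financial year touched by the range
CROSS JOIN LATERAL generate_series(
    date_part('year', r.start_date - make_interval(months => p.fy_start_month - 1))::int,
    date_part('year', r.end_date - make_interval(months => p.fy_start_month - 1))::int
) AS fy(fy_year)
ORDER BY r.id, year_start;
```

With fy_start_month set to 4 this should return the same five rows as the result above.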

Related

Postgresql : Create a new table based on another table and create a date column using start date and end date

I have the sample table shown below and want to create a new table with an extra column, 'NewDate'. For each ID it should show the last day of the StartDate month and then the last day of every following month up to the EndDate. If an ID has a Null EndDate, the series should stop at the last day of the current month, which is May 2022.
ID StartDate EndDate
100 1/01/2022 26/04/2022
101 20/04/2022 Null
102 1/01/2022 27/02/2022
....
I am using Postgresql and my Expected Output:
ID StartDate EndDate NewDate
100 1/01/2022 26/04/2022 31/01/2022
100 1/01/2022 26/04/2022 28/02/2022
100 1/01/2022 26/04/2022 31/03/2022
100 1/01/2022 26/04/2022 30/04/2022
101 20/04/2022 Null 30/04/2022
101 20/04/2022 Null 31/05/2022
102 1/01/2022 27/02/2022 31/01/2022
102 1/01/2022 27/02/2022 28/02/2022
...
(
    SELECT
        id,
        start_date,
        end_date,
        (new_date::date + interval '1 month - 1 day')::date
    FROM
        test_date,
        generate_series((date_trunc('month', start_date)), (date_trunc('month', end_date) + interval '1 month - 1 day'), interval '1 month') g (new_date)
    ORDER BY
        id)
UNION ALL ((
    SELECT
        id,
        start_date,
        end_date,
        (date_trunc('month', start_date) + interval '1 month - 1 day')::date
    FROM
        test_date
    WHERE
        test_date.end_date IS NULL)
UNION ALL (
    SELECT
        id,
        start_date,
        end_date,
        (date_trunc('month', (date_trunc('month', start_date) + interval '1 month - 1 day')::date) + interval '2 month - 1 day')::date
    FROM
        test_date
    WHERE
        test_date.end_date IS NULL)
ORDER BY
    id);
The key trick is getting the last day of a month; see: How to get the last day of month in postgres?
There may be a simpler version, but this one works.
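One possibly simpler version is to let COALESCE substitute the current date for a NULL end date, so a single generate_series call covers both cases. This is a sketch only: the sample rows are retyped from the question in ISO format, the column names follow the answer above, and the rows returned for id 101 depend on the day the query is run.

```sql
-- Hypothetical sample data mirroring the question.
CREATE TEMP TABLE test_date (id int, start_date date, end_date date);
INSERT INTO test_date VALUES
    (100, '2022-01-01', '2022-04-26'),
    (101, '2022-04-20', NULL),
    (102, '2022-01-01', '2022-02-27');

SELECT
    t.id,
    t.start_date,
    t.end_date,
    -- first day of a month + 1 month - 1 day = last day of that month
    (g.month_start + interval '1 month - 1 day')::date AS new_date
FROM test_date AS t
CROSS JOIN LATERAL generate_series(
    date_trunc('month', t.start_date),
    -- an open-ended row runs up to the current month
    date_trunc('month', COALESCE(t.end_date, CURRENT_DATE)),
    interval '1 month'
) AS g(month_start)
ORDER BY t.id, new_date;
```

A row whose start_date lies in the future and whose end_date is NULL produces no output here, which may or may not be what you want.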

How do I generate months between start date and now() in postgresql

I also have a question about how to get code blocks to work on Stack Overflow, but that's a side issue.
I have this quasi-code that works:
select
*
from
unnest('{2018-6-1,2018-7-1,2018-8-1,2018-9-1}'::date[],
'{2018-6-30,2018-7-31,2018-8-31,2018-9-30}'::date[]
) zdate(start_date, end_date)
left join lateral pipe_f(zdate...
But now I want it to work from 6/1/2018 until now(). What's the best way to do this?
Oh, postgresql 10. yay!!
Your query gives a list of the first and last days of the months between "2018-06-01" and now. So I am assuming that you want to do this in a more dynamic way:
SELECT
    start_date,
    (start_date + interval '1 month -1 day')::date as end_date
FROM (
    SELECT generate_series('2018-6-1', now(), interval '1 month')::date as start_date
) s
Result:
start_date end_date
2018-06-01 2018-06-30
2018-07-01 2018-07-31
2018-08-01 2018-08-31
2018-09-01 2018-09-30
2018-10-01 2018-10-31
generate_series(timestamp, timestamp, interval) generates a list of timestamps. Starting with "2018-06-01" until now() with the 1 month interval gives this:
start_date
2018-06-01 00:00:00+01
2018-07-01 00:00:00+01
2018-08-01 00:00:00+01
2018-09-01 00:00:00+01
2018-10-01 00:00:00+01
These timestamps are converted into dates with ::date cast.
Adding 1 month gives the first day of the next month. As we are interested in the last day of the current month, I subtract one day again (+ interval '1 month -1 day').
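If the time zone offsets in the intermediate result get in the way, one possible variant keeps the whole series in timestamp without time zone and truncates the upper bound to the start of the current month. This is a sketch, not a replacement for the query above; only the start date literal comes from the question.

```sql
SELECT
    m::date AS start_date,
    -- first of the month + 1 month - 1 day = last day of that month
    (m + interval '1 month - 1 day')::date AS end_date
FROM generate_series(
    timestamp '2018-06-01',
    date_trunc('month', now()::timestamp),   -- first day of the current month
    interval '1 month'
) AS m
ORDER BY start_date;
```

The resulting rows can be joined laterally to a function in the same way as the unnest(...) call in the question.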
Another option that's more ANSI-compliant is to use a recursive CTE:
WITH RECURSIVE
dates(d) AS
(
    SELECT '2018-06-01'::TIMESTAMP
    UNION ALL
    SELECT d + INTERVAL '1 month'
    FROM dates
    -- stop at the current month; use a literal such as '2018-10-01' for a fixed end
    WHERE d + INTERVAL '1 month' <= CURRENT_TIMESTAMP
)
SELECT
    d AS start_date,
    -- add 1 month, then subtract 1 day, to get end of current month
    (d + interval '1 month') - interval '1 day' AS end_date
FROM dates

How to query hourly aggregated data by date with postgresql?

There is one table:
ID DATE
1 2017-09-16 20:12:48
2 2017-09-16 20:38:54
3 2017-09-16 23:58:01
4 2017-09-17 00:24:48
5 2017-09-17 00:26:42
..
The result I need is the last 7-days of data with hourly aggregated count of rows:
COUNT DATE
2 2017-09-16 21:00:00
0 2017-09-16 22:00:00
0 2017-09-16 23:00:00
1 2017-09-17 00:00:00
2 2017-09-17 01:00:00
..
I tried various approaches with EXTRACT and DISTINCT, and also the generate_series function (mostly taken from similar Stack Overflow questions).
This is my best attempt so far:
SELECT
    date_trunc('hour', demotime) as date,
    COUNT(demotime) as count
FROM demo
GROUP BY date
How to generate hourly series for 7 days and fill-in the count of rows?
SELECT dd, count("demotime")
FROM generate_series
    ( current_date - interval '7 days'
    , current_date
    , '1 hour'::interval) dd
LEFT JOIN Table1
    ON dd = date_trunc('hour', demotime)
GROUP BY dd;
To cover the window from 7 days ago up to the current hour instead:
SELECT dd, count("demotime")
FROM generate_series
    ( date_trunc('hour', NOW()) - interval '7 days'
    , date_trunc('hour', NOW())
    , '1 hour'::interval) dd
LEFT JOIN Table1
    ON dd = date_trunc('hour', demotime)
GROUP BY dd;
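To see the gap filling on the question's sample rows, here is a small self-contained sketch with a fixed window and an explicit ORDER BY (without one, the row order is not guaranteed). The table name demo and column demotime follow the question's attempt; the window bounds are chosen just for this illustration.

```sql
-- Hypothetical table holding the question's five sample rows.
CREATE TEMP TABLE demo (id int, demotime timestamp);
INSERT INTO demo VALUES
    (1, '2017-09-16 20:12:48'),
    (2, '2017-09-16 20:38:54'),
    (3, '2017-09-16 23:58:01'),
    (4, '2017-09-17 00:24:48'),
    (5, '2017-09-17 00:26:42');

SELECT
    dd AS hour_start,
    -- count(column) counts only matched rows, so empty hours yield 0
    count(d.demotime) AS row_count
FROM generate_series(
    timestamp '2017-09-16 20:00:00',
    timestamp '2017-09-17 00:00:00',
    interval '1 hour'
) AS dd
LEFT JOIN demo AS d
    ON date_trunc('hour', d.demotime) = dd
GROUP BY dd
ORDER BY dd;
```

This should give 2, 0, 0, 1, 2 for the hours starting at 20:00 through 00:00. Note that each bucket is labelled by the start of its hour; if you prefer the end of the hour as in the question's expected output, selecting dd + interval '1 hour' should do it.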

PostgreSQL Query: Column Sum for the latest available date of each month

Given a PostgreSQL table which looks like this:
date | data
2015-01-23 | 15
2015-01-23 | 11
2015-02-25 | 15
2015-02-25 | 11
2015-01-25 | 24
2015-01-25 | 2
2015-01-25 | 13
2015-01-29 | 5
2015-02-28 | 12
2015-02-28 | 1
2015-05-15 | 12
2015-05-16 | 1
How can I get the sum of data for the last available date of each month?
Example result:
date | data
2015-01-29 | 5
2015-02-28 | 13
2015-05-16 | 1
This is what I've tried so far:
SELECT year,month,max(day),sum(data) FROM
(
    SELECT
        date,
        date_part('year', date) AS year,
        date_part('month', date) AS month,
        date_part('day', date) AS day,
        sum(data) AS tdata
    FROM table a
    GROUP BY date, date_part('year', date), date_part('month', date), date_part('day', date)
    ORDER BY year ASC, month ASC, day ASC
) dataq
GROUP BY year,month
The sum I get from this appears to be wrong.
You should calculate the sums in the inner query, grouping by a single day. Then select the latest day of each month in the outer query:
select distinct on (year, month)
    make_date(year::int, month::int, day::int) as date,
    data
from (
    select
        date_part('year', date) as year,
        date_part('month', date) as month,
        date_part('day', date) as day,
        sum(data) as data
    from my_table
    group by date
) s
order by year, month, day desc
date | data
------------+------
2015-01-29 | 5
2015-02-28 | 13
2015-05-16 | 1
(3 rows)
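The same result can probably be had without splitting the date into parts, by grouping per day and letting DISTINCT ON keep the latest day per month via date_trunc. A minimal sketch, assuming the table is called my_table as in the query above and the date column has type date:

```sql
-- Hypothetical table holding the question's sample rows.
CREATE TEMP TABLE my_table (date date, data int);
INSERT INTO my_table VALUES
    ('2015-01-23', 15), ('2015-01-23', 11),
    ('2015-02-25', 15), ('2015-02-25', 11),
    ('2015-01-25', 24), ('2015-01-25', 2), ('2015-01-25', 13),
    ('2015-01-29', 5),
    ('2015-02-28', 12), ('2015-02-28', 1),
    ('2015-05-15', 12),
    ('2015-05-16', 1);

-- GROUP BY sums each day; DISTINCT ON then keeps the first row per month,
-- which is the latest day because of the descending sort on date.
SELECT DISTINCT ON (date_trunc('month', date))
    date,
    sum(data) AS data
FROM my_table
GROUP BY date
ORDER BY date_trunc('month', date), date DESC;
```

On the sample data this should return the three rows shown above.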
I guess you just need to exclude the days that you don't want to sum, for example using NOT EXISTS as follows:
SELECT year,month,max(day),sum(tdata) tdata FROM
(
    SELECT
        d,
        date_part('year', d) AS year,
        date_part('month', d) AS month,
        date_part('day', d) AS day,
        sum(data) AS tdata
    FROM tab a
    WHERE NOT EXISTS
    (
        SELECT *
        FROM tab a2
        WHERE date_part('year', a.d) = date_part('year', a2.d) AND
            date_part('month', a.d) = date_part('month', a2.d) AND
            date_part('day', a.d) < date_part('day', a2.d)
    )
    GROUP BY d, date_part('year', d), date_part('month', d), date_part('day', d)
    ORDER BY year ASC, month ASC, day ASC
) dataq
GROUP BY year,month

Getting data from postgres weekly (according to date)

user timespent(in sec) date(in timestamp)
u1 10 t1(2015-08-15)
u1 20 t2(2015-08-19)
u1 15 t3(2015-08-28)
u1 16 t4(2015-09-06)
Above is the format of my table, which records the time spent by a user on a course and is ordered by timestamp. I want to get the weekly sum of timespent for a particular user, say u1, in this format:
start_date end_date sum
2015-08-15 2015-08-21 30
2015-08-22 2015-08-28 15
2015-08-29 2015-09-04 0
2015-09-05 2015-09-11 16
The difficulty lies in the fact that the seven-day periods you want are not regular weeks starting on Monday.
You therefore cannot use the standard functions to get the week number from the date, and have to build your own week generator using generate_series().
Example data:
create table sessions (user_name text, time_spent int, session_date timestamp);
insert into sessions values
('u1', 10, '2015-08-15'),
('u1', 20, '2015-08-19'),
('u1', 15, '2015-08-28'),
('u1', 16, '2015-09-06');
The query for an arbitrary chosen period from 2015-08-15 to 2015-09-06:
with weeks as (
    select d::date start_date, d::date + 6 end_date
    from generate_series('2015-08-15', '2015-09-06', '7d'::interval) d
)
select w.start_date, w.end_date, coalesce(sum(time_spent), 0) total
from weeks w
left join (
    select start_date, end_date, coalesce(time_spent, 0) time_spent
    from weeks
    join sessions
    -- half-open range, so sessions later in the day on end_date are still counted
    on session_date >= start_date and session_date < end_date + 1
    where user_name = 'u1'
) s
on w.start_date = s.start_date and w.end_date = s.end_date
group by 1, 2
order by 1;
start_date | end_date | total
------------+------------+-------
2015-08-15 | 2015-08-21 | 30
2015-08-22 | 2015-08-28 | 15
2015-08-29 | 2015-09-04 | 0
2015-09-05 | 2015-09-11 | 16
(4 rows)
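The inner join between weeks and sessions may not be needed: a single LEFT JOIN with the user filter in the ON clause should give the same totals while still keeping empty weeks. A sketch, reusing the sessions table from the example data (recreated here as a temporary table so the block runs on its own):

```sql
CREATE TEMP TABLE sessions (user_name text, time_spent int, session_date timestamp);
INSERT INTO sessions VALUES
    ('u1', 10, '2015-08-15'),
    ('u1', 20, '2015-08-19'),
    ('u1', 15, '2015-08-28'),
    ('u1', 16, '2015-09-06');

WITH weeks AS (
    SELECT d::date AS start_date, d::date + 6 AS end_date
    FROM generate_series(timestamp '2015-08-15', timestamp '2015-09-06', interval '7 days') AS d
)
SELECT w.start_date, w.end_date, COALESCE(sum(s.time_spent), 0) AS total
FROM weeks AS w
LEFT JOIN sessions AS s
    -- the user filter must be in ON, not WHERE, or empty weeks would be dropped
    ON s.user_name = 'u1'
   AND s.session_date >= w.start_date
   AND s.session_date < w.end_date + 1   -- half-open: covers all of end_date
GROUP BY w.start_date, w.end_date
ORDER BY w.start_date;
```

On the sample data this should reproduce the four rows above (30, 15, 0, 16).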
select
    ui,
    date_trunc('week', the_date)::date as start_date,
    date_trunc('week', the_date)::date + 6 as end_date,
    sum(timespent) as "sum"
from t
group by 1, 2, 3
order by 1,2
Something like this (assuming that by timestamp you mean the data type timestamp).
To make Sunday the first day of the week, I added an extra day to "date" in the group by.
select (start_date - date_part('dow', start_date) * interval '1 day')::date start_date,
    (start_date + (6 - date_part('dow', start_date)) * interval '1 day')::date end_date,
    total_time_spent
from (
    select min("date") start_date, sum(timespent) total_time_spent
    from mytable
    -- "user" is a reserved word and must be quoted; 'u1' is a string literal
    where "user" = 'u1'
    group by date_part('year', "date"), date_part('week', "date" + interval '1 day')) "tmp"
order by start_date
This is a more generic approach, for any date interval.