Aggregate data per week - postgresql

I'd like to aggregate data weekly according to a date and a value.
I have a table like this:
create table test (t_val integer, t_date date);
insert into test values(1,'2017-02-09'),(2,'2017-02-10'),(4,'2017-02-16');
This is the query:
WITH date_range AS (
    SELECT MIN(t_date) AS start_date,
           MAX(t_date) AS end_date
    FROM test
)
SELECT date_part('year', f.date) AS date_year,
       date_part('week', f.date) AS date_week,
       f.val
FROM generate_series((SELECT start_date FROM date_range),
                     (SELECT end_date FROM date_range),
                     '7 day') d(date)
LEFT JOIN (
    SELECT t_val AS val, t_date AS date
    FROM test
    WHERE t_date >= (SELECT start_date FROM date_range)
      AND t_date <= (SELECT end_date FROM date_range)
    GROUP BY t_val, t_date
) f ON f.date BETWEEN d.date AND (d.date + interval '7 day')
GROUP BY date_part('year', f.date), date_part('week', f.date), f.val;
I expect a result like this:
| Year | Week | Val |
| 2017 | 6 | 3 |
| 2017 | 7 | 4 |
But the query returns:
| Year | Week | Val |
| 2017 | 6 | 1 |
| 2017 | 6 | 2 |
| 2017 | 7 | 4 |
What is missing?
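What appears to be missing is an aggregate: the query groups by f.val instead of summing it, so each distinct value keeps its own row. A minimal sketch of the aggregation that yields the expected weekly totals, assuming a plain SUM(t_val) per ISO year/week is what you're after (the generate_series scaffolding isn't needed for that):

SELECT date_part('year', t_date) AS date_year,
       date_part('week', t_date) AS date_week,
       SUM(t_val) AS val
FROM test
GROUP BY date_part('year', t_date), date_part('week', t_date)
ORDER BY date_year, date_week;

Note that date_part('week', ...) follows ISO 8601 week numbering (weeks start on Monday), which is what puts 2017-02-09 and 2017-02-10 into week 6 and 2017-02-16 into week 7.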

Related

Postgres: max month for each quarter/year

I have a table with columns: PK, Amount, quarter, month, year, program, policy. I would like to write a query that selects only the max month for each quarter/year for each policy, returning something like the result shown below. How would I accomplish this using Postgres?
WITH max_month AS (
    SELECT quarter,
           policy,
           max(to_date(month || ' ' || year, 'month YYYY')) AS max_date
    FROM qtr_test
    GROUP BY quarter, policy
)
SELECT qtr_test.*
FROM qtr_test, max_month
WHERE qtr_test.quarter = max_month.quarter
  AND qtr_test.policy = max_month.policy
  AND to_date(month || ' ' || year, 'month YYYY') = max_date
ORDER BY quarter, policy;
pk | amount | quarter | month | year | program | policy
----+--------+---------+-------+------+---------+--------
1 | 2500 | q1 | march | 2022 | A | p1
8 | 3000 | q1 | march | 2022 | A | p2
5 | 2700 | q2 | july | 2022 | A | p2
Use a CTE to find the maximum date for each (quarter, policy) combination, then use quarter, policy, and max_date to select the matching rows from the table.
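As a shorter alternative in Postgres, DISTINCT ON can pick one row per (quarter, policy) directly; a sketch, assuming the same qtr_test columns as above:

SELECT DISTINCT ON (quarter, policy) *
FROM qtr_test
ORDER BY quarter, policy,
         to_date(month || ' ' || year, 'month YYYY') DESC;

Unlike the CTE version, DISTINCT ON returns exactly one row per (quarter, policy) even when two rows share the same max month.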

Select the highest value of every N days

| Date | Price |
| 2022-05-11 04:00:00.0000000 +00:00 | 1 |
| 2022-05-12 04:00:00.0000000 +00:00 | 2 |
| 2022-05-13 04:00:00.0000000 +00:00 | 3 |
I have a long table that looks like the above, with various timestamps. I would like to select the highest price of every N days. How should I do the grouping?
Thanks @EdmCoff. In my case the answer looks like:
SELECT MAX(Price)
FROM MyTable
GROUP BY DATEADD(DAY, 0, 3 * FLOOR(DATEDIFF(DAY, 0, Date) / 3))
ORDER BY MIN(Date) ASC
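For comparison, the same fixed-size N-day bucketing can be written in PostgreSQL; this is a sketch, assuming a table mytable(ts timestamptz, price numeric) with N = 3 days (the names are placeholders):

SELECT MIN(ts) AS bucket_start,
       MAX(price) AS max_price
FROM mytable
GROUP BY floor(extract(epoch FROM ts) / (3 * 86400))
ORDER BY bucket_start;

Each bucket is a 3-day window counted from the Unix epoch, mirroring the DATEDIFF-from-day-zero trick above.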

Grouping with different timespans

I am currently struggling to achieve an aggregation that is somewhat overlapping.
The current structure of my table is:
|ymd |id|costs|
|--------|--|-----|
|20200101|a |10 |
|20200102|a |12 |
|20200101|b |13 |
|20200101|c |15 |
|20200102|c |1 |
However, I'd like to group it in a way that gives me different timespans per item. Assuming I am running this query on 20200103, the result I am trying to achieve is:
| timespan | id | costs |
|------------|----|-------|
| last 2 days| a | 22 |
| last 1 day | a | 12 |
| last 2 days| b | 13 |
| last 1 day | b | 0 |
| last 2 days| c | 16 |
| last 1 day | c | 1 |
I have tried many things, but so far I haven't been able to achieve what I need. This is the query I tried, which does not give correct results:
SELECT
    CASE
        WHEN ymd BETWEEN date_add(current_date(), -2) AND to_date(current_date()) THEN '2 days'
        WHEN ymd BETWEEN date_add(current_date(), -1) AND to_date(current_date()) THEN '1 day'
    END AS timespan,
    id,
    sum(costs) AS costs
FROM `table`
GROUP BY
    CASE
        WHEN ymd BETWEEN date_add(current_date(), -2) AND to_date(current_date()) THEN '2 days'
        WHEN ymd BETWEEN date_add(current_date(), -1) AND to_date(current_date()) THEN '1 day'
    END,
    id
You can build a derived table that stores the timespans, cross join it with the list of distinct ids to generate all possible combinations, then bring in the table with a left join and aggregate:
select d.timespan, i.id, coalesce(sum(t.costs), 0) costs
from (select distinct id from mytable) i
cross join (
    select 1 n, 'last 1 day' timespan
    union all select 2, 'last 2 days'
) d
left join mytable t
       on t.id = i.id
      and t.ymd between date_add(current_date(), - d.n) and current_date()
group by d.n, d.timespan, i.id
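If you happen to be on PostgreSQL (the date_add()/backtick syntax above suggests MySQL or Spark SQL), the same idea could look like this sketch, assuming ymd is a date column in a table called mytable:

select d.timespan, i.id, coalesce(sum(t.costs), 0) as costs
from (select distinct id from mytable) i
cross join (values (1, 'last 1 day'), (2, 'last 2 days')) as d(n, timespan)
left join mytable t
       on t.id = i.id
      and t.ymd between current_date - d.n and current_date
group by d.n, d.timespan, i.id
order by i.id, d.n desc;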

postgresql: splitting time period at event

I have a table of country-periods. In some cases, certain country attributes (e.g. the capital) change on a date within a time period. In that case I would like to split the country-period into two new periods, one before and one after the change.
Example:
Country | start_date | end_date | event_date
A | 1960-01-01 | 1999-12-31 | 1994-07-20
B | 1926-01-01 | 1995-12-31 | NULL
Desired output:
Country | start_date | end_date | event_date
A | 1960-01-01 | 1994-07-19 | 1994-07-20
A | 1994-07-20 | 1999-12-31 | 1994-07-20
B | 1926-01-01 | 1995-12-31 | NULL
I considered starting off with generate_series along these lines:
SELECT country, min(p1) as sdate1, max(p1) as edate1,
       min(p2) as sdate2, max(p2) as edate2
FROM
    (SELECT country,
            generate_series(start_date, (event_date - interval '1 day'), interval '1 day')::date as p1,
            generate_series(event_date, end_date, interval '1 day')::date as p2
     FROM table) t
GROUP BY country
But this seems way too inefficient and messy. Unfortunately I don't have any experience when it comes to writing functions. Any ideas on how I can solve this?
You can use UNION ALL instead. This way you don't generate unnecessary rows:
SELECT country, start_date,
       CASE WHEN event_date BETWEEN start_date AND end_date
            THEN event_date - 1
            ELSE end_date
       END AS end_date,
       event_date
FROM table1
UNION ALL
SELECT country, event_date, end_date, event_date
FROM table1
WHERE event_date BETWEEN start_date AND end_date
ORDER BY country, start_date, end_date, event_date
Here is a SQLFiddle demo
Output:
| country | start_date | end_date | event_date |
|---------|------------|------------|------------|
| A | 1960-01-01 | 1994-07-19 | 1994-07-20 |
| A | 1994-07-20 | 1999-12-31 | 1994-07-20 |
| B | 1926-01-01 | 1995-12-31 | (null) |
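An equivalent single pass over the table (a sketch against the same hypothetical table1) uses a LATERAL VALUES list to emit one or two periods per country instead of reading the table twice:

SELECT t.country, v.start_date, v.end_date, t.event_date
FROM table1 t
CROSS JOIN LATERAL (
    VALUES (t.start_date,
            CASE WHEN t.event_date BETWEEN t.start_date AND t.end_date
                 THEN t.event_date - 1
                 ELSE t.end_date
            END),
           (t.event_date, t.end_date)
) AS v(start_date, end_date)
WHERE v.start_date = t.start_date
   OR t.event_date BETWEEN t.start_date AND t.end_date
ORDER BY t.country, v.start_date;

The WHERE clause keeps the second generated row only when event_date actually falls inside the period, so countries without an event (like B) come out as a single unchanged row.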

postgresql - group by - complex query

Here's what my table TheTable looks like:
 ColA | ColB
------+------
 abc  | 2005
 abc  | 2010
 def  | 2009
 def  | 2010
 def  | 2011
 abc  | 2012
And I want to write a query to return this result:
ColA | ColB | ColC
------+-------+------
abc | 2005 | 2010
def | 2009 | 2011
abc | 2012 | -
I believe you can get the results you want using window functions and a nested subquery:
select "ColA"
     , max(case when parity = 0 then "ColB" end) as "ColB"
     , max(case when parity = 1 then "ColB" end) as "ColC"
from (
    select *
         , (rank() over (partition by "ColA" order by "ColB" asc) - 1) / 2 as result_row
         , (rank() over (partition by "ColA" order by "ColB" asc) - 1) % 2 as parity
    from TheTable
) t
group by "ColA", result_row