Postgres: max month for each quarter/year - postgresql

I have a table with columns: pk, amount, quarter, month, year, program, policy. I would like to create a query that selects only the row with the max month for each quarter/year for each policy. For example, it should return something like:

pk | amount | quarter | month | year | program | policy
----+--------+---------+-------+------+---------+--------
 1 |   2500 | q1      | march | 2022 | A       | p1
 8 |   3000 | q1      | march | 2022 | A       | p2
 5 |   2700 | q2      | july  | 2022 | A       | p2

How would I accomplish this using Postgres?

Use a CTE to find the maximum date for each quarter, policy combination. Then use the quarter, policy and max_date to select the matching rows from the table:

WITH max_month AS (
    SELECT
        quarter,
        policy,
        max(to_date(month || ' ' || year, 'month YYYY')) AS max_date
    FROM qtr_test
    GROUP BY quarter, policy
)
SELECT qtr_test.*
FROM qtr_test
JOIN max_month
  ON qtr_test.quarter = max_month.quarter
 AND qtr_test.policy = max_month.policy
 AND to_date(month || ' ' || year, 'month YYYY') = max_month.max_date
ORDER BY quarter, policy;
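As a quick runnable check, the same CTE-plus-join-back pattern can be sketched in SQLite via Python's sqlite3. SQLite has no to_date(), so this sketch builds a sortable year*100+month key from a small month-name lookup instead; the sample rows are invented to reproduce the expected output above.

```python
import sqlite3

# SQLite sketch of the accepted approach: a CTE finds the max month per
# (quarter, policy), then a join back picks the matching rows.
# The mm lookup replaces Postgres's to_date('march 2022', 'month YYYY').
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE qtr_test (pk INT, amount INT, quarter TEXT, month TEXT,
                       year INT, program TEXT, policy TEXT);
INSERT INTO qtr_test VALUES
    (1, 2500, 'q1', 'march',   2022, 'A', 'p1'),
    (2, 1000, 'q1', 'january', 2022, 'A', 'p1'),
    (8, 3000, 'q1', 'march',   2022, 'A', 'p2'),
    (4, 2000, 'q2', 'june',    2022, 'A', 'p2'),
    (5, 2700, 'q2', 'july',    2022, 'A', 'p2');
""")
rows = con.execute("""
    WITH mm(name, num) AS (
        VALUES ('january', 1), ('march', 3), ('june', 6), ('july', 7)
    ),
    max_month AS (
        SELECT q.quarter, q.policy, MAX(q.year * 100 + mm.num) AS max_ym
        FROM qtr_test q JOIN mm ON mm.name = q.month
        GROUP BY q.quarter, q.policy
    )
    SELECT q.*
    FROM qtr_test q
    JOIN mm ON mm.name = q.month
    JOIN max_month x
      ON x.quarter = q.quarter AND x.policy = q.policy
     AND q.year * 100 + mm.num = x.max_ym
    ORDER BY q.quarter, q.policy
""").fetchall()
```

The surviving pks are 1, 8, 5, matching the expected result.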

Related

Select the highest value of every N days

| Date | Price |
| 2022-05-11 04:00:00.0000000 +00:00 | 1 |
| 2022-05-12 04:00:00.0000000 +00:00 | 2 |
| 2022-05-13 04:00:00.0000000 +00:00 | 3 |
I have a long table which looks like above with various timestamps. I would like to select the highest price of every N days. How should I do the grouping?
Thanks @EdmCoff! In my case the answer looks like:
select MAX(Price)
from MyTable
Group by DATEADD(DAY, 0, 3 * FLOOR(DATEDIFF(DAY, 0, Date) / 3) )
order by min(Date) asc
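The same 3-day bucketing can be checked runnably. This is a SQLite translation, not the T-SQL above: DATEDIFF(DAY, 0, Date), i.e. days since a fixed anchor, becomes julianday(Date) - julianday(anchor), and the sample prices are invented.

```python
import sqlite3

# Group rows into 3-day buckets counted from an anchor date, then take
# the max price per bucket, ordered by each bucket's first date.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE MyTable (Date TEXT, Price INT);
INSERT INTO MyTable VALUES
    ('2022-05-11', 1), ('2022-05-12', 2), ('2022-05-13', 3),
    ('2022-05-14', 5), ('2022-05-15', 4);
""")
maxes = [r[0] for r in con.execute("""
    SELECT MAX(Price)
    FROM MyTable
    GROUP BY CAST((julianday(Date) - julianday('2022-05-11')) / 3 AS INT)
    ORDER BY MIN(Date)
""")]
```

Days 11-13 fall in the first bucket (max 3) and days 14-15 in the second (max 5).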

Distinct Count Dates by timeframe

I am trying to find the daily count of frequent visitors from a very large data-set. Frequent visitors in this case are visitor IDs used on 2 distinct days in a rolling 3 day period.
My data set looks like the below:
ID | Date | Location | State | Brand |
1 | 2020-01-02 | A | CA | XYZ |
1 | 2020-01-03 | A | CA | BCA |
1 | 2020-01-04 | A | CA | XYZ |
1 | 2020-01-06 | A | CA | YQR |
1 | 2020-01-06 | A | WA | XYZ |
2 | 2020-01-02 | A | CA | XYZ |
2 | 2020-01-05 | A | CA | XYZ |
This is the result I am going for. The count in the Visits column is the count of distinct days from the Date column within the current day and the 2 prior days, per ID. So for ID 1 on 2020-01-05, there was a visit on the 3rd and 4th, so the count is 2.
Date | ID | Visits | Frequent Prior 3 Days
2020-01-01 |Null| Null | Null
2020-01-02 | 1 | 1 | No
2020-01-02 | 2 | 1 | No
2020-01-03 | 1 | 2 | Yes
2020-01-03 | 2 | 1 | No
2020-01-04 | 1 | 3 | Yes
2020-01-04 | 2 | 1 | No
2020-01-05 | 1 | 2 | Yes
2020-01-05 | 2 | 1 | No
2020-01-06 | 1 | 2 | Yes
2020-01-06 | 2 | 1 | No
2020-01-07 | 1 | 1 | No
2020-01-07 | 2 | 1 | No
2020-01-08 | 1 | 1 | No
2020-01-09 | 1 | null | Null
I originally tried the following to get the result for the visits column, but ended up with 3 in every successive row once it first reached 3 for that ID:

count(ID) over (partition by ID order by Date asc rows between 3 preceding and current row) as visits
I've scoured the forum, but every somewhat similar question seems to involve counting the values rather than the dates and haven't been able to figure out how to tweak to get what I need. Any help is much appreciated.
You can aggregate the dataset by user and date, then use window functions with a range frame to look back over the preceding days.
You did not tell us which database you are running - not all databases support window range frames, and the syntax for interval literals varies. In standard SQL, you would go:
select
    id,
    date,
    count(*) cnt_visits,
    case
        when sum(count(*)) over (
            partition by id
            order by date
            range between interval '2' day preceding and current row
        ) >= 2
        then 'Yes'
        else 'No'
    end is_frequent_visitor
from mytable
group by id, date
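SQLite cannot run the standard-SQL query verbatim (its RANGE frames need a numeric ORDER BY), but the aggregate-then-window idea translates: julianday(date) makes days numeric, and a range of 2 preceding covers a 3-day window. This sketch counts grouped rows (one per visit day) rather than raw rows, so the two 2020-01-06 visits count once; data is from the question.

```python
import sqlite3

# Aggregate to one row per (id, day), then count visit days in a
# 3-calendar-day window with a numeric RANGE frame over julianday(date).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE mytable (id INT, date TEXT);
INSERT INTO mytable VALUES
    (1, '2020-01-02'), (1, '2020-01-03'), (1, '2020-01-04'),
    (1, '2020-01-06'), (1, '2020-01-06'),
    (2, '2020-01-02'), (2, '2020-01-05');
""")
rows = con.execute("""
    SELECT id, date,
           COUNT(date) OVER w AS cnt_visits,
           CASE WHEN COUNT(date) OVER w >= 2 THEN 'Yes' ELSE 'No' END AS freq
    FROM mytable
    GROUP BY id, date
    WINDOW w AS (PARTITION BY id ORDER BY julianday(date)
                 RANGE BETWEEN 2 PRECEDING AND CURRENT ROW)
    ORDER BY id, date
""").fetchall()
```

Only days with at least one visit appear, matching this first query's behavior.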
On the other hand, if you want a record for every user and every day (even when there is no visit), then it is a bit different. You can generate the full dataset first, then bring in the table with a left join:
select
    i.id,
    d.date,
    count(t.id) cnt_visits,
    case
        when sum(count(t.id)) over (
            partition by i.id
            order by d.date
            range between interval '2' day preceding and current row
        ) >= 2
        then 'Yes'
        else 'No'
    end is_frequent_visitor
from (select distinct id from mytable) i
cross join (select distinct date from mytable) d
left join mytable t
    on t.date = d.date
    and t.id = i.id
group by i.id, d.date
I would be inclined to approach this by expanding out the days and visitors using a cross join and then just using window functions. Assuming you have all dates in the data:
select i.id, d.date,
count(t.id) over (partition by i.id
order by d.date
rows between 2 preceding and current row
) as cnt_visits,
(case when count(t.id) over (partition by i.id
order by d.date
rows between 2 preceding and current row
) >= 2
then 'Yes' else 'No'
end) as is_frequent_visitor
from (select distinct id from t) i cross join
(select distinct date from t) d left join
(select distinct id, date from t) t
on t.date = d.date and
t.id = i.id;
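A runnable SQLite sketch of this cross-join expansion, using the question's data. Because every (id, date) pair gets a row, a plain ROWS frame of the 2 preceding rows spans exactly 3 calendar days (for the dates present in the table).

```python
import sqlite3

# Expand distinct ids x distinct dates, left join the deduplicated visits,
# then count non-null visits in a 3-row (= 3-day) window per id.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE t (id INT, date TEXT);
INSERT INTO t VALUES
    (1, '2020-01-02'), (1, '2020-01-03'), (1, '2020-01-04'),
    (1, '2020-01-06'), (1, '2020-01-06'),
    (2, '2020-01-02'), (2, '2020-01-05');
""")
rows = con.execute("""
    SELECT i.id, d.date,
           COUNT(v.id) OVER w AS cnt_visits,
           CASE WHEN COUNT(v.id) OVER w >= 2 THEN 'Yes' ELSE 'No' END AS freq
    FROM (SELECT DISTINCT id FROM t) i
    CROSS JOIN (SELECT DISTINCT date FROM t) d
    LEFT JOIN (SELECT DISTINCT id, date FROM t) v
           ON v.date = d.date AND v.id = i.id
    WINDOW w AS (PARTITION BY i.id ORDER BY d.date
                 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
    ORDER BY d.date, i.id
""").fetchall()
```

On 2020-01-05, ID 1 gets 2 visits (the 3rd and 4th) even though it has no row that day, which is what the expected output asks for.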

Grouping with different timespans

Currently I am struggling to compute aggregations whose time windows overlap.
The current structure of my table is:
|ymd |id|costs|
|--------|--|-----|
|20200101|a |10 |
|20200102|a |12 |
|20200101|b |13 |
|20200101|c |15 |
|20200102|c |1 |
However, I'd like to group it so that each id gets one row per timespan. Considering that I am running this query on 20200103, the result I am trying to achieve is:
| timespan | id | costs |
|------------|----|-------|
| last 2 days| a | 22 |
| last 1 day | a | 12 |
| last 2 days| b | 13 |
| last 1 day | b | 0 |
| last 2 days| c | 16 |
| last 1 day | c | 1 |
I have tried many things, but so far I wasn't able to achieve what I need. This is the query I tried, which does not give correct results:
SELECT
    CASE
        WHEN ymd BETWEEN date_add(current_date(), -2) AND to_date(current_date()) THEN '2 days'
        WHEN ymd BETWEEN date_add(current_date(), -1) AND to_date(current_date()) THEN '1 day'
    END AS timespan,
    id,
    sum(costs) AS costs
FROM `table`
GROUP BY
    CASE
        WHEN ymd BETWEEN date_add(current_date(), -2) AND to_date(current_date()) THEN '2 days'
        WHEN ymd BETWEEN date_add(current_date(), -1) AND to_date(current_date()) THEN '1 day'
    END,
    id
A row can match only one branch of a CASE expression, but your timespans overlap ('last 1 day' is contained in 'last 2 days'), so this cannot count the same row in both buckets. Instead, build a derived table that stores the timespans, cross join it with the list of distinct ids to generate all possible combinations, then bring in the table with a left join and aggregate:
select d.timespan, i.id, coalesce(sum(t.costs), 0) costs
from (select distinct id from mytable) i
cross join (
    select 1 n, 'last 1 day' timespan
    union all select 2, 'last 2 days'
) d
left join mytable t
    on t.id = i.id
    and t.ymd between date_add(current_date(), - d.n) and current_date()
group by d.n, d.timespan, i.id
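A runnable SQLite sketch of this derived-timespan approach, with the run date pinned to '2020-01-03' (instead of current_date()) so the result is reproducible, ymd stored as ISO date text, and the question's sample data.

```python
import sqlite3

# Cross join distinct ids with a derived table of timespans, then left join
# the costs that fall inside each timespan and sum them (0 when none match).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE costs (ymd TEXT, id TEXT, costs INT);
INSERT INTO costs VALUES
    ('2020-01-01', 'a', 10), ('2020-01-02', 'a', 12),
    ('2020-01-01', 'b', 13),
    ('2020-01-01', 'c', 15), ('2020-01-02', 'c', 1);
""")
rows = con.execute("""
    SELECT d.timespan, i.id, COALESCE(SUM(t.costs), 0) AS costs
    FROM (SELECT DISTINCT id FROM costs) i
    CROSS JOIN (SELECT 1 AS n, 'last 1 day' AS timespan
                UNION ALL SELECT 2, 'last 2 days') d
    LEFT JOIN costs t
           ON t.id = i.id
          AND t.ymd BETWEEN date('2020-01-03', '-' || d.n || ' days')
                        AND '2020-01-03'
    GROUP BY d.n, d.timespan, i.id
    ORDER BY i.id, d.n DESC
""").fetchall()
```

Note the t.id = i.id condition inside the left join: without it, every id would be credited with every row's costs.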

Update column with correct daterange using generate_series

I have a column with incorrect dateranges (a day is missing). The code
to generate these dateranges was written by a previous employee and
cannot be found.
The dateranges look like this, notice the missing day:
+-------+--------+-------------------------+
| id | client | date_range |
+-------+--------+-------------------------+
| 12885 | 30 | [2016-01-07,2016-01-13) |
| 12886 | 30 | [2016-01-14,2016-01-20) |
| 12887 | 30 | [2016-01-21,2016-01-27) |
| 12888 | 30 | [2016-01-28,2016-02-03) |
| 12889 | 30 | [2016-02-04,2016-02-10) |
| 12890 | 30 | [2016-02-11,2016-02-17) |
| 12891 | 30 | [2016-02-18,2016-02-24) |
+-------+--------+-------------------------+
And should look like this:
+-------------------------+
| range |
+-------------------------+
| [2016-01-07,2016-01-14) |
| [2016-01-14,2016-01-21) |
| [2016-01-21,2016-01-28) |
| [2016-01-28,2016-02-04) |
| [2016-02-04,2016-02-11) |
| [2016-02-11,2016-02-18) |
| [2016-02-18,2016-02-25) |
| [2016-02-25,2016-03-03) |
+-------------------------+
The code I've written to generate correct dateranges looks like this:
create or replace function generate_date_series(startsOn date, endsOn date, frequency interval)
returns setof date as $$
select (startsOn + (frequency * count))::date
from (
select (row_number() over ()) - 1 as count
from generate_series(startsOn, endsOn, frequency)
) series
$$ language sql immutable;
select DATERANGE(
generate_date_series(
'2016-01-07'::date, '2024-11-07'::date, interval '7days'
)::date,
generate_date_series(
'2016-01-14'::date, '2024-11-13'::date, interval '7days'
)::date
) as range;
However, I'm having trouble trying to update the column with the
correct dateranges. I initially executed this UPDATE query on a test
database I created:
update factored_daterange set date_range = dt.range from (
select daterange(
generate_date_series(
'2016-01-07'::date, '2024-11-07'::date, interval '7days'
)::date,
generate_date_series(
'2016-01-14'::date, '2024-11-14'::date, interval '7days'
)::date ) as range ) dt where client_id=30;
But that is not correct, it simply assigns the first generated
daterange to each row. I want to essentially update the dateranges
row-by-row since there is no other join or condition I can match the
dates up to. Any assistance in this matter is greatly appreciated.
You're working too hard. Just update the upper range value:
update your_table_name
set date_range = daterange(lower(date_range),(upper(date_range) + interval '1 day')::date) ;
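The UPDATE above is Postgres-specific, but the idea can be checked runnably. SQLite has no daterange type, so this sketch models each range as explicit [start, end) columns; the fix is the same single UPDATE pushing each exclusive upper bound one day later.

```python
import sqlite3

# Model [start, end) ranges as two date columns and extend every exclusive
# upper bound by one day, mirroring daterange(lower(..), upper(..) + 1 day).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE factored_daterange (id INT, client INT, start_d TEXT, end_d TEXT);
INSERT INTO factored_daterange VALUES
    (12885, 30, '2016-01-07', '2016-01-13'),
    (12886, 30, '2016-01-14', '2016-01-20'),
    (12887, 30, '2016-01-21', '2016-01-27');
""")
con.execute("UPDATE factored_daterange SET end_d = date(end_d, '+1 days')")
ends = [r[0] for r in con.execute(
    "SELECT end_d FROM factored_daterange ORDER BY id")]
```

Each range now ends the day its successor starts, as in the desired output.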

Aggregate data per week

I'd like to aggregate data weekly according to a date and a value.
I have a table like this :
create table test (t_val integer, t_date date);
insert into test values(1,'2017-02-09'),(2,'2017-02-10'),(4,'2017-02-16');
This is the query:
WITH date_range AS (
SELECT MIN(t_date) as start_date,
MAX(t_date) as end_date
FROM test
)
SELECT
date_part('year', f.date) as date_year,
date_part('week', f.date) as date_week,
f.val
FROM generate_series( (SELECT start_date FROM date_range), (SELECT end_date FROM date_range), '7 day')d
LEFT JOIN
(
SELECT t_val as val, t_date as date
FROM test
WHERE t_date >= (SELECT start_date FROM date_range)
AND t_date <= (SELECT end_date FROM date_range)
GROUP BY t_val, t_date
) f
ON f.date BETWEEN d.date AND (d.date + interval '7 day')
GROUP BY date_part('year', f.date),date_part('week', f.date), f.val;
I expect a result like this:
| Year | Week | Val |
| 2017 | 6 | 3 |
| 2017 | 7 | 4 |
But the query returns:
| Year | Week | Val |
| 2017 | 6 | 1 |
| 2017 | 6 | 2 |
| 2017 | 7 | 4 |
What is missing?
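What seems to be missing is an aggregate on the value: the query selects and groups by f.val, so each distinct value keeps its own row instead of being summed per week (the expected 3 is 1 + 2 from week 6). A sketch of the intended weekly sum, runnable in SQLite with strftime('%W') standing in for date_part('week'); the two agree for these February 2017 dates.

```python
import sqlite3

# Sum t_val per (year, week) instead of grouping by the value itself.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE test (t_val INT, t_date TEXT);
INSERT INTO test VALUES (1, '2017-02-09'), (2, '2017-02-10'), (4, '2017-02-16');
""")
rows = con.execute("""
    SELECT strftime('%Y', t_date) AS date_year,
           strftime('%W', t_date) AS date_week,
           SUM(t_val) AS val
    FROM test
    GROUP BY date_year, date_week
    ORDER BY date_year, date_week
""").fetchall()
```

This produces one row per week with the summed value, matching the expected output.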