4 weeks or accounting month grouping in PostgreSQL

I have a requirement to group data in a transaction table into 4-week groups (accounting months). Any suggestions on how to approach this in PostgreSQL?
Regards

SELECT sum(ITEM_C._demand) AS aggre_demand, ITEM_C._week AS account_month
FROM (SELECT sum(quantity) AS _demand,
             (CASE WHEN floor((extract(doy FROM shipment_date) - 1) / 7) + 1 > 0
                    AND floor((extract(doy FROM shipment_date) - 1) / 7) + 1 <= 4 THEN 1
                   WHEN floor((extract(doy FROM shipment_date) - 1) / 7) + 1 > 4
                    AND floor((extract(doy FROM shipment_date) - 1) / 7) + 1 <= 8 THEN 2
                   ELSE 3
              END) AS _week,
             date_trunc('year', shipment_date) AS _Year
      FROM smp.shipment
      GROUP BY floor((extract(doy FROM shipment_date) - 1) / 7) + 1,
               date_trunc('year', shipment_date)) AS ITEM_C
GROUP BY ITEM_C._week, ITEM_C._Year
I think this is not the ideal way to do this, as I would need to repeat this for weeks 1..52, 13 times in total.

Use date_trunc:
https://www.postgresql.org/docs/9.1/static/functions-datetime.html
Anything more than this will need some more information; post some sample data and anything you have tried already.
SELECT date_trunc('week', your_date_field)
FROM your_table_name
GROUP BY 1
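If the grouping really is "each 4 consecutive weeks of the year", a rough sketch (assuming the smp.shipment table and columns from the question) computes the accounting month directly by integer-dividing the day of year by 28, so the 13 CASE branches aren't needed:
SELECT date_trunc('year', shipment_date)                     AS _year,
       floor((extract(doy FROM shipment_date) - 1) / 28) + 1 AS account_month,
       sum(quantity)                                         AS aggre_demand
FROM smp.shipment
GROUP BY 1, 2
ORDER BY 1, 2;
-- Caveat: days 365/366 land in a 14th bucket; wrap the expression in
-- least(..., 13) if the final accounting month should absorb them.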

Related

Is there a dynamic way in BigQuery to select/create columns with pattern?

Not sure how to best phrase it, but essentially I need ~50 columns x 12 weeks in BigQuery and was hoping there was a way to do it more efficiently using some sort of logic or function.
I can generate the script in Python, but the end output itself is long and unwieldy. Is there a cleaner way to do it within BigQuery itself?
Example code:
select id,
sum(case when first_week_flag then visit else 0 end) as sum_visit_first_week,
sum(case when second_week_flag then visit else 0 end) as sum_visit_second_week,
... for 12 weeks,
avg(case when first_week_flag then gap else 0 end) as avg_gap_first_week,
avg(case when second_week_flag then gap else 0 end) as avg_gap_second_week,
... for 12 weeks,
etc. for 50 columns
from table
group by id
Potential for simplification:
select id,
sum(case when {WEEK}_flag then visit else 0 end) as sum_visit_{WEEK},
avg(case when {WEEK}_flag then gap else 0 end) as avg_gap_{WEEK},
etc. for 50 columns
from table
group by id
Can anyone point me in the right direction of what to search for? Thanks!
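One direction to search for, as a sketch only (the week column and the table name below are assumptions, since the question only shows per-week flag columns): keep the result in long form, grouped by id and a week index, and pivot into the wide 12-week layout afterwards in the reporting layer or with generated SQL.
SELECT id,
       week,                      -- assumed 1..12 week index instead of the *_flag columns
       SUM(visit) AS sum_visit,
       AVG(gap)   AS avg_gap
FROM `project.dataset.table`      -- placeholder table name
GROUP BY id, week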

Using 'over' function results in column "table.id" must appear in the GROUP BY clause or be used in an aggregate function

I'm currently writing an application which shows the growth of the total number of events in my table over time. I currently have the following query to do this:
query = session.query(
count(Event.id).label('count'),
extract('year', Event.date).label('year'),
extract('month', Event.date).label('month')
).filter(
Event.date.isnot(None)
).group_by('year', 'month').all()
This results in the following output:
Count | Year | Month
100   | 2021 | 1
50    | 2021 | 2
75    | 2021 | 3
While this is okay on its own, I want it to display the running total over time, not just the number of events in that month, so the desired output should be:
Count | Year | Month
100   | 2021 | 1
150   | 2021 | 2
225   | 2021 | 3
I read in various places that I should use a window function via SQLAlchemy's over function; however, I can't seem to wrap my head around it, and every time I try using it I get the following error:
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.GroupingError) column "event.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT count(event.id) OVER (PARTITION BY event.date ORDER...
^
[SQL: SELECT count(event.id) OVER (PARTITION BY event.date ORDER BY EXTRACT(year FROM event.date), EXTRACT(month FROM event.date)) AS count, EXTRACT(year FROM event.date) AS year, EXTRACT(month FROM event.date) AS month
FROM event
WHERE event.date IS NOT NULL GROUP BY year, month]
This is the query I used:
session.query(
count(Event.id).over(
order_by=(
extract('year', Event.date),
extract('month', Event.date)
),
partition_by=Event.date
).label('count'),
extract('year', Event.date).label('year'),
extract('month', Event.date).label('month')
).filter(
Event.date.isnot(None)
).group_by('year', 'month').all()
Could someone show me what I'm doing wrong? I've been searching for hours but can't figure out how to get the desired output, as adding event.id to the GROUP BY would stop my rows from being grouped by month and year.
The final query I ended up using:
query = session.query(
extract('year', Event.date).label('year'),
extract('month', Event.date).label('month'),
func.sum(func.count(Event.id)).over(order_by=(
extract('year', Event.date),
extract('month', Event.date)
)).label('count'),
).filter(
Event.date.isnot(None)
).group_by('year', 'month')
I'm not 100% sure what you want, but I'm assuming you want the cumulative number of events up to and including each month. You first need to calculate the number of events per month and then sum those counts with a PostgreSQL window function.
You can do that in a single select statement:
SELECT extract(year FROM events.date) AS year
, extract(month FROM events.date) AS month
, SUM(COUNT(events.id)) OVER(ORDER BY extract(year FROM events.date), extract(month FROM events.date)) AS total_so_far
FROM events
GROUP BY 1,2
but it might be easier to think about if you split it into two:
SELECT year, month, SUM(events_count) OVER (ORDER BY year, month) AS total_so_far
FROM (
    SELECT extract(year FROM events.date) AS year
         , extract(month FROM events.date) AS month
         , COUNT(events.id) AS events_count
    FROM events
    GROUP BY 1, 2
) AS monthly_counts
But I'm not sure how to do that in SQLAlchemy.

How to fix a "more than one row returned" error in PostgreSQL

I was trying to write a query to fetch records from a table and group them by some columns, but the subquery raises a "more than one row returned by a subquery used as an expression" error.
When I run each query independently I get the result I expect, but combining them is a problem.
select
year as Season,
cal_scheme as Scheme,
(case when cal_scheme='Mt.Elgon' then '1000'
when cal_scheme='West Nile' then '2000'
when cal_scheme='Rwenzori' then '1500' else '' end) as Target,
min(today::date) as startdatetime,
max(today::date)-min(today::date) as No_of_days,
(select count(id) as id from
kcl_internal_edit where new_farmer='' or new_farmer is null
group by year, cal_scheme)as growers
from kcl_internal_edit
group by year, cal_scheme
The expected result is to be as follows:
Season | Scheme    | Target | Startdatetime | No_of_days | growers
2019   | Mt.Elgon  | 1000   | 28-10-2019    | 5          | 5
2019   | West Nile | 2000   | 29-05-2019    | 10         | 1
2018   | Mt.Elgon  | 1500   | 29-08-2018    | 207        | 3
Your query should look like this:
select
year as Season,
cal_scheme as Scheme,
(case when cal_scheme='Mt.Elgon' then '1000'
when cal_scheme='West Nile' then '2000'
when cal_scheme='Rwenzori' then '1500' else '' end) as Target,
min(today::date) as startdatetime,
max(today::date)-min(today::date) as No_of_days,
count(id) FILTER (WHERE new_farmer='' or new_farmer is null) as growers
from kcl_internal_edit
group by year, cal_scheme;
There is no need for a subselect!
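If you are on a PostgreSQL version older than 9.4, which is where FILTER was introduced, a sketch of the same idea uses a CASE expression inside the aggregate (the other columns are omitted here for brevity):
select
  year as Season,
  cal_scheme as Scheme,
  -- COUNT only counts non-NULL values, so rows failing the condition yield NULL and are skipped
  count(case when new_farmer = '' or new_farmer is null then id end) as growers
from kcl_internal_edit
group by year, cal_scheme;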

Getting Dates by Selecting a week in oracle

I have a textbox that takes a number from 1 to 52 (a week number of the calendar) and a drop-down that lists years.
For example, if I enter 2 in the textbox with year 2014, then I want the dates returned as 05-1-2014 - 11-1-2014. Is it possible to do this?
Also, I have tried one query which doesn't match my requirement:
SELECT date_val, TO_CHAR (date_val, 'ww')
FROM (SELECT TO_DATE ('01-jan-2013', 'DD-MON-YYYY') + LEVEL AS date_val
FROM DUAL
CONNECT BY LEVEL <= 365)
Please help.
Try this. Here 2 is the week number in the year (FirstSunday + (NumberOfWeek - 1)*7 as WeekStart, FirstSunday + NumberOfWeek*7 - 1 as WeekEnd) and 2014 is the year:
select
FirstSunday+(2-1)*7 as WeekStart,
FirstSunday+ 2*7-1 as WeekEnd
from
(
Select NEXT_DAY(TO_DATE('01/01/'||'2014','DD/MM/YYYY')-7, 'SUN') as FirstSunday
from dual
)
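Since the week number and year come from a textbox and a drop-down, roughly the same query can take them as bind variables; this is just a sketch using assumed bind names :week_no and :year (with :year bound as a string):
select
  FirstSunday + (:week_no - 1) * 7 as WeekStart,
  FirstSunday + :week_no * 7 - 1   as WeekEnd
from
(
  -- NEXT_DAY(..., 'SUN') depends on the session date language; week 1 is taken
  -- to start on the Sunday on or before 01 January of :year
  select NEXT_DAY(TO_DATE('01/01/' || :year, 'DD/MM/YYYY') - 7, 'SUN') as FirstSunday
  from dual
)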
Try this too,
SELECT start_date,
start_date + 6 end_day
FROM(
SELECT TRUNC(Trunc(to_date('2014', 'YYYY'),'YYYY')+ 1 * 7,'IW')-1 start_date
FROM duaL
);

CASE expressions with MAX aggregate functions Oracle

Using Oracle, I have selected the title_id with its associated month of publication with:
SELECT title_id,
CASE EXTRACT(month FROM pubdate)
WHEN 1 THEN 'Jan'
WHEN 2 THEN 'Feb'
WHEN 3 THEN 'Mar'
WHEN 4 THEN 'Apr'
WHEN 5 THEN 'May'
WHEN 6 THEN 'Jun'
WHEN 7 THEN 'Jul'
WHEN 8 THEN 'Aug'
WHEN 9 THEN 'Sep'
WHEN 10 THEN 'Oct'
WHEN 11 THEN 'Nov'
ELSE 'Dec'
END MONTH
FROM TITLES;
Using the statement:
SELECT MAX(Most_Titles)
FROM (SELECT count(title_id) Most_Titles, month
FROM (SELECT title_id, extract(month FROM pubdate) AS MONTH FROM titles) GROUP BY month);
I was able to determine the month with the maximum number of books published.
Is there a way to join the two statements so that I can associate the month's text equivalent with the maximum number of titles?
In order to convert a month to a string, I wouldn't use a CASE statement, I'd just use a TO_CHAR. And you can use analytic functions to rank the results to get the month with the most books published.
SELECT num_titles,
to_char( publication_month, 'Mon' ) month_str
FROM (SELECT count(title_id) num_titles,
trunc(pubdate, 'MM') publication_month,
rank() over (order by count(title_id) desc) rnk
FROM titles
GROUP BY trunc(pubdate, 'MM'))
WHERE rnk = 1
A couple of additional caveats:
If there are two months that are tied with the most publications, this query will return both rows. If you want Oracle to arbitrarily pick one, you can use the row_number analytic function rather than rank (see the sketch below).
If the PUBDATE column in your table only has dates of midnight on the first of the month where the book is published, you can eliminate the trunc on the PUBDATE column.
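A sketch of that row_number variant, using the same assumed TITLES table as above:
SELECT num_titles,
       to_char( publication_month, 'Mon' ) month_str
FROM (SELECT count(title_id) num_titles,
             trunc(pubdate, 'MM') publication_month,
             -- row_number assigns distinct ranks even on ties, so exactly one row survives
             row_number() over (order by count(title_id) desc) rn
      FROM titles
      GROUP BY trunc(pubdate, 'MM'))
WHERE rn = 1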