I am trying to do a rollup in PostgreSQL 8.0. Recent versions of PostgreSQL have GROUP BY ROLLUP, but how can I implement a rollup in PostgreSQL 8.0? Does anyone have experience with this?
I tried the following:
SELECT
EXTRACT (YEAR FROM rental_date) y,
EXTRACT (MONTH FROM rental_date) M,
EXTRACT (DAY FROM rental_date) d,
COUNT (rental_id)
FROM
rental
GROUP BY
ROLLUP (
EXTRACT (YEAR FROM rental_date),
EXTRACT (MONTH FROM rental_date),
EXTRACT (DAY FROM rental_date)
);
But I am getting the following error:
42883: function rollup( integer, integer, integer) does not exist
I was following http://www.postgresqltutorial.com/postgresql-rollup/
GROUP BY ROLLUP was introduced in version 9.5, so the query cannot work on 8.0. But if you think about what ROLLUP does, it is easy in your case to write a query that produces the same result.
Basically, you want to have:
an overall sum
a sum per year
and a sum per month
for the daily counts
I've written the above in a special way, so that it becomes clear what you actually need:
produce daily counts
generate sum per month from daily counts
generate sum per year from monthly sums or daily counts
generate total from yearly sums, monthly sums or daily counts
UNION ALL of the above in the order you want
As the default for GROUP BY ROLLUP is to write-out the total first and then the individual grouping sets with NULLS LAST, the following query will do the same:
WITH
daily AS (
SELECT EXTRACT (YEAR FROM rental_date) y, EXTRACT (MONTH FROM rental_date) M, EXTRACT (DAY FROM rental_date) d, COUNT (rental_id) AS count
FROM rental
GROUP BY 1, 2, 3
),
monthly AS (
SELECT y, M, NULL::double precision d, SUM (count) AS count
FROM daily
GROUP BY 1, 2
),
yearly AS (
SELECT y, NULL::double precision M, NULL::double precision d, SUM (count) AS count
FROM monthly
GROUP BY 1
),
totals AS (
SELECT NULL::double precision y, NULL::double precision M, NULL::double precision d, SUM (count) AS count
FROM yearly
)
SELECT * FROM totals
UNION ALL
SELECT * FROM daily
UNION ALL
SELECT * FROM monthly
UNION ALL
SELECT * FROM yearly
;
The above works with PostgreSQL 8.4+. If you don't even have that version, we must fall back to the old-school UNION without re-using aggregation data:
SELECT NULL::double precision y, NULL::double precision M, NULL::double precision d, COUNT (rental_id) AS count
FROM rental
UNION ALL
SELECT EXTRACT (YEAR FROM rental_date) y, EXTRACT (MONTH FROM rental_date) M, EXTRACT (DAY FROM rental_date) d, COUNT (rental_id) AS count
FROM rental
GROUP BY 1, 2, 3
UNION ALL
SELECT EXTRACT (YEAR FROM rental_date) y, EXTRACT (MONTH FROM rental_date) M, NULL::double precision d, COUNT (rental_id) AS count
FROM rental
GROUP BY 1, 2
UNION ALL
SELECT EXTRACT (YEAR FROM rental_date) y, NULL::double precision M, NULL::double precision d, COUNT (rental_id) AS count
FROM rental
GROUP BY 1
;
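For anyone who wants to check the shape of the UNION ALL emulation without an old PostgreSQL server at hand, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. It is only an illustration under assumptions: the sample rows are made up, strftime stands in for EXTRACT, and the alias cnt replaces count.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, rental_date TEXT)")
conn.executemany(
    "INSERT INTO rental (rental_date) VALUES (?)",
    [("2005-05-24",), ("2005-05-24",), ("2005-05-25",), ("2005-06-01",), ("2006-02-14",)],
)

# Daily counts first, then each coarser level re-aggregates the finer one.
ROLLUP_SQL = """
WITH daily AS (
    SELECT CAST(strftime('%Y', rental_date) AS INTEGER) AS y,
           CAST(strftime('%m', rental_date) AS INTEGER) AS m,
           CAST(strftime('%d', rental_date) AS INTEGER) AS d,
           COUNT(rental_id) AS cnt
    FROM rental
    GROUP BY y, m, d
),
monthly AS (SELECT y, m, NULL AS d, SUM(cnt) AS cnt FROM daily GROUP BY y, m),
yearly  AS (SELECT y, NULL AS m, NULL AS d, SUM(cnt) AS cnt FROM monthly GROUP BY y),
totals  AS (SELECT NULL AS y, NULL AS m, NULL AS d, SUM(cnt) AS cnt FROM yearly)
SELECT * FROM totals
UNION ALL SELECT * FROM daily
UNION ALL SELECT * FROM monthly
UNION ALL SELECT * FROM yearly
"""

rows = conn.execute(ROLLUP_SQL).fetchall()
for row in rows:
    print(row)  # NULL columns come back as None and mark the subtotal rows
```

Rows where d is None are monthly subtotals, rows where m is also None are yearly subtotals, and the row with all three set to None is the grand total.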
I am running an analysis on medication prescribing practices. We want to identify whether someone has been on a class of medications for 60 days out of a 90 day quarter. We have a start and end date for each prescription, and the bounds of the quarter (e.g., 4/1/2022 – 6/30/2022). For each prescription I’ve calculated the number of days between the start and end date (only including days that fall within the bounds of the quarter). There are many instances in which multiple drugs within the same class are prescribed: someone might try one antidepressant but not like it, and so be given another in the same class.
My original strategy was just to total up number of days for each class of medication and see if it’s 60 or over. The days don’t have to be consecutive, but if they overlap, days during an overlap period shouldn’t count twice (which they would in a simple sum).
For instance in the data table below, patient 1 in row 1 should be included as they are over 60 days. Patient 2 should also get in (rows 2 and 3) because the non-overlapping total (57+8) within the same med class gets them to over 60 days. However, patient 3 should NOT get in, even though the total of 32 + 32 is over 60 because the intervals overlap. This means that they were really on the medication class for only 32 days – this is an instance where someone might be on two different antidepressants simultaneously.
It’s not sufficient to just sum the days in the interval, but I also have to include some way to examine whether the intervals are overlapping and only add days if an interval for a given medication class falls outside another interval for that same class.
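Row num | Patid | Med class | Start date | End date   | Interval
---------------------------------------------------------------
1       | 1     | A         | 2022-04-28 | 2022-09-12 | 63
2       | 2     | B         | 2022-05-03 | 2022-06-29 | 57
3       | 2     | B         | 2022-04-21 | 2022-04-29 | 8
4       | 3     | A         | 2022-01-19 | 2022-05-03 | 32
5       | 3     | A         | 2022-01-19 | 2022-05-03 | 32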
I’m having a hard time figuring out how to do this. Note, I'm limited to just using SQL for this.
Code that produced the above data. I would embed this in another query to generate a total interval but need to deal with the overlap issue.
DECLARE @startdt DATE;
DECLARE @enddt DATE;
SET @startdt='4/1/2022'
SET @enddt='6/30/2022'
--for q4 fy2022-23 (4/1/2022-6/30/2022)
SELECT DISTINCT
rx.patid, d.medication_category as medcat, start_date, end_date,
-- case statement to capture days within quarter only
CASE WHEN start_date<@startdt and end_date>@enddt then 90
WHEN start_date<@startdt and end_date>=@startdt then datediff(d,@startdt,end_date)
WHEN start_date>=@startdt and end_date>@enddt then datediff(d,start_date,@enddt)
ELSE datediff(d,start_date,end_date)
END as interval
FROM rx
INNER JOIN Drug_names_categories d
ON rx.drugname=d.drugname
WHERE start_date<'7/1/2022' and end_date>'3/30/2022'
AND rx.patid IS NOT NULL
AND d.medication_category IS NOT NULL
AND d.medication_category <>''
You can accomplish what you want by generating a calendar table (using a Common Table Expression) of individual days within the test range, joining those days with the prescriptions with overlapping days, and then counting distinct days for each patient and medication category combination.
Something like:
DECLARE @startdt DATE = '2022-04-01';
DECLARE @enddt DATE = '2022-06-30';
DECLARE @threshold INT = 60;
WITH Days AS (
SELECT @startdt AS Day
UNION ALL
SELECT DATEADD(day, 1, Day)
FROM Days
WHERE Day < @enddt
)
SELECT
rx.patid, d.medication_category as medcat,
COUNT(DISTINCT DD.Day) AS days_medicated,
MIN(DD.Day) AS start_date,
MAX(DD.Day) AS end_date
FROM rx
INNER JOIN Drug_names_categories d
ON rx.drugname = d.drugname
INNER JOIN Days DD
ON DD.Day BETWEEN rx.start_date AND rx.end_date
WHERE rx.start_date <= @enddt AND @startdt <= rx.end_date
GROUP BY rx.patid, d.medication_category
HAVING COUNT(DISTINCT DD.Day) >= @threshold
ORDER BY rx.patid, start_date;
If using SQL Server 2022 or later, the Days generator can be simplified by using the new GENERATE_SERIES() function:
WITH Days AS (
SELECT DATEADD(day, S.value, @startdt) AS Day
FROM GENERATE_SERIES(0, DATEDIFF(day, @startdt, @enddt)) S
)
See this db<>fiddle for an example with some sample data.
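To make the distinct-day idea concrete outside SQL, here is a small Python sketch that applies it to the sample rows from the question. The function name days_covered and the tuple layout are my own assumptions for illustration. Note that it counts days inclusively, like the BETWEEN join, so the numbers come out one higher than the DATEDIFF-based intervals in the question.

```python
from collections import defaultdict
from datetime import date, timedelta


def days_covered(prescriptions, quarter_start, quarter_end):
    """Count distinct covered days per (patid, medcat), clamped to the quarter."""
    covered = defaultdict(set)
    for patid, medcat, start, end in prescriptions:
        day = max(start, quarter_start)
        last = min(end, quarter_end)
        while day <= last:
            # A set ignores days already added by an overlapping prescription.
            covered[(patid, medcat)].add(day)
            day += timedelta(days=1)
    return {key: len(days) for key, days in covered.items()}


# Sample rows from the question: (patid, med class, start date, end date).
sample = [
    (1, "A", date(2022, 4, 28), date(2022, 9, 12)),
    (2, "B", date(2022, 5, 3), date(2022, 6, 29)),
    (2, "B", date(2022, 4, 21), date(2022, 4, 29)),
    (3, "A", date(2022, 1, 19), date(2022, 5, 3)),
    (3, "A", date(2022, 1, 19), date(2022, 5, 3)),
]

result = days_covered(sample, date(2022, 4, 1), date(2022, 6, 30))
qualifying = sorted(key for key, n in result.items() if n >= 60)
print(result)
print(qualifying)
```

With this data patients 1 and 2 reach the 60-day threshold, while patient 3 does not, because the two identical intervals only contribute each day once.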
I would do this using a date/calendar table, then it's pretty easy.
If you don't already have a date table, this link is one of many that describe how to create one easily ( https://www.mssqltips.com/sqlservertip/4054/creating-a-date-dimension-or-calendar-table-in-sql-server/ )
Here's the script from this link (in case the link dies)
DECLARE @StartDate date = '20100101';
DECLARE @CutoffDate date = DATEADD(DAY, -1, DATEADD(YEAR, 30, @StartDate));
;WITH seq(n) AS
(
SELECT 0 UNION ALL SELECT n + 1 FROM seq
WHERE n < DATEDIFF(DAY, @StartDate, @CutoffDate)
),
d(d) AS
(
SELECT DATEADD(DAY, n, @StartDate) FROM seq
),
src AS
(
SELECT
TheDate = CONVERT(date, d),
TheDay = DATEPART(DAY, d),
TheDayName = DATENAME(WEEKDAY, d),
TheWeek = DATEPART(WEEK, d),
TheISOWeek = DATEPART(ISO_WEEK, d),
TheDayOfWeek = DATEPART(WEEKDAY, d),
TheMonth = DATEPART(MONTH, d),
TheMonthName = DATENAME(MONTH, d),
TheQuarter = DATEPART(Quarter, d),
TheYear = DATEPART(YEAR, d),
TheFirstOfMonth = DATEFROMPARTS(YEAR(d), MONTH(d), 1),
TheLastOfYear = DATEFROMPARTS(YEAR(d), 12, 31),
TheDayOfYear = DATEPART(DAYOFYEAR, d)
FROM d
)
SELECT *
INTO MyDateTable
FROM src
ORDER BY TheDate
OPTION (MAXRECURSION 0);
Now that you have your new date table, you can join to it to get the list of dates that fall between the start and end dates, something like:
SELECT COUNT(DISTINCT dt.TheDate)
FROM rx
INNER JOIN MyDateTable dt ON dt.TheDate BETWEEN rx.start_date AND rx.end_date
INNER JOIN Drug_names_categories d ON rx.drugname=d.drugname
WHERE start_date<'7/1/2022' and end_date>'3/30/2022'
AND rx.patid IS NOT NULL
AND d.medication_category IS NOT NULL
AND d.medication_category <>''
Obviously this is a simple example, but you could easily extend it to include all the details you need. The point is that you now have a list of dates, or a distinct list of dates, which you can work with easily.
You could also simplify the date range filter by referencing the TheQuarter and TheYear columns. If this is a common task, consider extending the date table with a compound YearQuarter column (e.g. 2023Q1/202301 etc.).
Get the xth business day of a calendar month. For example, for Nov '21 the 3rd business day is 3 November, but for Oct '21 the 3rd business day is 5 October. We need to build a query or function to get this dynamically. We need to exclude the weekends (0,6) and any public holidays (from a table with public holidays).
I believe we don't have a direct calendar function in Postgres. Maybe we can take the month and an integer (the xth business day) as input and return the date as output.
If the input is Nov/11 (month) and 3 (xth business day), the output should be '2021-11-03'.
create or replace function nth_bizday(y integer, m integer, bizday integer)
returns date language sql as
$$
select max(d) from
(
select d
from generate_series
(
make_date(y, m, 1),
make_date(y, m, 1) + interval '1 month - 1 day',
interval '1 day'
) t(d)
where extract(isodow from d) < 6
-- and not exists (select from nb_days where nb_day = d)
limit bizday
) t;
$$;
select nth_bizday(2021, 11, 11);
-- 2021-11-15
If you want to skip other non-business days in addition to weekends, then the where clause should be extended as @SQLPro suggests, something like this (supposing that you have the non-business days listed in a table, nb_days):
where extract(isodow from d) < 6
and not exists (select from nb_days where nb_day = d)
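For comparison, the same walk over the days of a month can be sketched in Python. This is only an illustration of the logic, not a replacement for the SQL function: the holidays argument is a hypothetical stand-in for the nb_days table, and unlike the SQL version this sketch returns None when the month has fewer than n business days.

```python
import calendar
from datetime import date


def nth_bizday(year, month, n, holidays=frozenset()):
    """Return the n-th business day of the month, or None if there are fewer than n."""
    days_in_month = calendar.monthrange(year, month)[1]
    seen = 0
    for dom in range(1, days_in_month + 1):
        d = date(year, month, dom)
        # isoweekday(): Monday is 1 and Sunday is 7, matching isodow in PostgreSQL.
        if d.isoweekday() < 6 and d not in holidays:
            seen += 1
            if seen == n:
                return d
    return None


third_nov = nth_bizday(2021, 11, 3)
third_oct = nth_bizday(2021, 10, 3)
eleventh_nov = nth_bizday(2021, 11, 11)
# Hypothetical holiday on 1 November, just to show the effect of the exclusion set.
third_nov_with_holiday = nth_bizday(2021, 11, 3, holidays={date(2021, 11, 1)})
print(third_nov, third_oct, eleventh_nov, third_nov_with_holiday)
```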
Business days are generally specific to an organization. You must create a CALENDAR table with one entry per date from the beginning to the end, with a boolean column that indicates whether a day is on or off.
Then a view can compute the nth "on" days for every month...
I have a table temperatures with columns mac_address varchar(255), tm datetime, temperature float.
I'd like to create a report with 3 parameters: mac_address, week_number and year.
The report should show the maximum temperature for a given mac_address in a given week_number (01, ... 50, ...) of a given year. There may be more than one row for a given week_number and year.
The SQL Query for the dataset could be something like
select max(temperature), mac_address, tm
from temperatures
group by mac_address
having mac_address = @mac_address and week_number = @week_number and year = @year
Do you know how to construct the query correctly? Maybe I will need 3 more datasets.
For the @mac_address parameter it is easy. I will
select distinct mac_address from temperatures
But how can I do it with the @week_number and @year parameters? Is the only option to add the values for the dropdown list manually?
The possible result may be:
When the user select from the parameters
mac_address 001, week_number 47 and year 2019
max_temperature | tm
27.8 t1
27.8 t2
27.8 t3
Now it returns 3 rows. Most of the time there will be only one row.
From what I understand from your question, you want to dynamically search by weeknumber, mac_address and year returning the highest temperature for each day matching in that range. A query like the following should do what you are after.
declare @macaddress varchar(255) = '2', @WeekNumber int = 36, @year int = 2019
Select
Date = tm,
Temperature = max(temperature)
from
Temps t
where
year(tm)=@year
and mac_address=@macaddress
and datepart(week,tm) = @WeekNumber
group by
tm
If you want the mac_address included in the results, simply add to the Select and Group By sections.
Here's the SQL fiddle with a test setup.
For a data set to build the year parameter you could use something like this, which will get you 11 years (five back through five ahead of the current year). You can expand as needed:
;WITH years AS (
SELECT YEAR(DATEADD(YY,-5,GETDATE())) AS yr
UNION ALL
SELECT yr + 1
FROM years
WHERE yr < YEAR(DATEADD(YY,5,GETDATE()))
)
SELECT *
FROM years
For the weeks parameter you could hard-code them or use something like:
;WITH weeks AS (
SELECT 1 AS wk
UNION ALL
SELECT wk + 1
FROM weeks
WHERE wk < 52
)
SELECT *
FROM weeks
In your main SQL data set you would want to use something like:
select max(temperature), mac_address, tm
from temperatures
where mac_address = @mac_address
and week_number = @week_number
and year = @year
group by mac_address, tm
Edit: remove tm from the SELECT and GROUP BY
select max(temperature), mac_address
from temperatures
where mac_address = @mac_address
and week_number = @week_number
and year = @year
group by mac_address
I would like to calculate growth rate for customers for following data.
month | customers
-------------------------
01-2015 | 1
02-2015 | 10
03-2014 | 10
06-2015 | 15
I have used the following formula to calculate the growth rate. It works only for a one-month interval, and it does not give the expected output because of the gap between the 3rd and 6th months shown in the table above.
select
month, total,
(total::float / lag(total) over (order by month) - 1) * 100 growth
from (
select to_char(created, 'yyyy-mm') as month, count(id) total
from customers
group by month
) s
order by month;
I think this can be done by creating a date range and grouping by that range.
I expect two main outputs, separately:
1) Growth rate with a difference of exactly one month
2) Growth rate with an interval of two months instead of a single month. For the data above, sum the results of each two months and group by two months instead of by month.
Still not sure about the second part. Here's the growth from your adapted query, with a two-month column:
select
month, total,
(total::float / lag(total) over (order by m) - 1) * 100 growth,m,m2
from (
select created, (sum(customers) over (order by m))::float total,customers,m,m2,to_char(created, 'yyyy-mm') as month
from customers c
right outer join (
select generate_series('2015-01-01','2015-06-01','1 month'::interval) m
) m1 on m=c.created
left outer join (
select generate_series('2015-01-01','2015-06-01','2 month'::interval) m2
) m2 on m2=m
order by m
) s
order by m;
Basically, the answer is to use generate_series.
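To show what filling the gaps buys you, here is a rough Python sketch of the same idea as I read the query above: generate every month in the range, carry a running total across months with no new customers, and compare each month with the previous one. The helper names and the sample counts are assumptions for illustration only.

```python
from datetime import date


def month_range(first, last):
    """Yield the first day of every month from first to last, inclusive."""
    y, m = first.year, first.month
    while (y, m) <= (last.year, last.month):
        yield date(y, m, 1)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)


def growth_by_month(new_customers, first, last):
    """Return (month, running_total, growth_pct); growth is None when undefined."""
    rows, running, previous = [], 0, None
    for month in month_range(first, last):
        running += new_customers.get(month, 0)  # missing months add nothing
        growth = None if not previous else (running / previous - 1) * 100
        rows.append((month, running, growth))
        previous = running
    return rows


# Hypothetical new-customer counts with a gap in April and May.
sample = {
    date(2015, 1, 1): 1,
    date(2015, 2, 1): 10,
    date(2015, 3, 1): 10,
    date(2015, 6, 1): 15,
}

rows = growth_by_month(sample, date(2015, 1, 1), date(2015, 6, 1))
for month, total, growth in rows:
    print(month.strftime("%Y-%m"), total, growth)
```

The gap months show up as explicit rows with zero growth instead of silently disappearing, which is what makes the comparison with the previous row meaningful.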
I would like to create a bar chart displaying the number of objects that were available on a monthly basis. All rows have a start and end date. I know how to do the count for a single month:
SELECT COUNT(*) As NumberOfItems
FROM Items
WHERE DATEPART(MONTH, Items.StartDate) <= @monthNumber
AND DATEPART(MONTH, Items.EndDate) >= @monthNumber
Now I would like to create the SQL to get the month number and the number of items using a single SELECT statement.
Is there any elegant way of accomplishing this? I am aware I have to take the year number into account.
Assuming SQL Server 2005 or newer.
The CTE part returns month numbers spanning the years between @startDate and @endDate. The main body joins the month numbers with the items, performing the same conversion on Items.StartDate and Items.EndDate.
; with months (month) as (
select datediff (m, 0, @startDate)
union all
select month + 1
from months
where month < datediff (m, 0, @endDate)
)
-- convert the generated month number back to a date, so the count is
-- per calendar month rather than per item start month
select year (dateadd (m, months.month, 0)) Year,
month (dateadd (m, months.month, 0)) Month,
count (*) NumberOfItems
from months
inner join Items
on datediff (m, 0, Items.StartDate) <= months.month
and datediff (m, 0, Items.EndDate) >= months.month
group by
months.month
Note: if you intend to span more than a hundred months, you will need option (maxrecursion 0) at the end of the query.
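The month-number trick (months since a fixed origin, so that year boundaries stop mattering) can be sketched in Python as follows. The function names and sample items are hypothetical, and unlike the inner join this sketch also reports months in which no item was available.

```python
from datetime import date


def month_number(d):
    """Months since year 0, so consecutive months differ by exactly 1."""
    return d.year * 12 + (d.month - 1)


def items_per_month(items, start, end):
    """Count items whose [start, end] span touches each month in the range."""
    counts = {}
    for n in range(month_number(start), month_number(end) + 1):
        active = sum(1 for s, e in items if month_number(s) <= n <= month_number(e))
        counts[(n // 12, n % 12 + 1)] = active  # back to (year, month)
    return counts


# Hypothetical items as (StartDate, EndDate) pairs.
items = [
    (date(2020, 11, 15), date(2021, 1, 10)),
    (date(2020, 12, 1), date(2020, 12, 31)),
    (date(2021, 2, 1), date(2021, 3, 1)),
]

counts = items_per_month(items, date(2020, 11, 1), date(2021, 2, 28))
for (year, month), n in counts.items():
    print(year, month, n)
```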