PostgreSQL - get previous row and average - postgresql

I'm using this SQL query to minus current row from the previous row
output = current row - previous row
SELECT x.price_index FROM
(SELECT date_time, price - LAG(price) OVER (ORDER BY date_time
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS price_index FROM my_table) as x
But the problem is first not sure how to get the previous 5 rows. 2nd is the current average previous 5 rows minus the previous average of 6~10 rows
problem 1 (solved)
output = current row - previous row (row number 5)
SQL query
SELECT x.price_index FROM
(SELECT date_time, price - LAG(price, 5) OVER (ORDER BY date_time
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS price_index FROM my_table) as x
problem 2:
output = current average total of 5 previous rows (row number 1(current)~ 5) - previous average total of 6~10 previous rows (row number 6~10)

Related

How to sum for previous n number of days for a number of dates in PostgreSQL

I have a list of dates each with a value in Postgresql.
For each date I want to sum the value for this date and the previous 4 days.
I also want to sum the values for the start of that month to the present date. So for example:
For 07/02/2021 sum all values from 07/02/2021 to 01/02/2021
For 06/02/2021 sum all values from 06/02/2021 to 01/02/2021
For 31/01/2021 sum all values from 31/01/2021 to 01/01/2021
The output should look like, will be created as two separate tables:
Output
Any help would be appreciated.
Thanks
Sample data and structure: dbfiddle
For first part of query:
select date,
value,
sum(value) over (
order by to_date(date, 'DD/MM/YYYY')
rows between 4 preceding and current row) as five_day_period
from your_table_name
order by to_date(date, 'DD/MM/YYYY') desc;
For second part of query:
select date,
value,
sum(value)
over (
partition by regexp_replace(date, '[0-9]{2}/(.+)', '\1')
order by to_date(date, 'DD/MM/YYYY')
rows between unbounded preceding and current row) as month_to_date
from your_table_name
order by to_date(date, 'DD/MM/YYYY') desc;

How do I calculate cumulative sum for last 7 rows on a specific date in Postgresql?

I have a table that has these columns: user_id, day, valueA, valueB.
I'd like to calculate the running sum of last 7 rows of valueA and valueB for each user that has data on a specific day, for example '2020-08-01'.
(Note: Users only have a row when their valueA and valueB is not zero so there are some dates not in the table.)
I tried this query:
select user_id, day,
sum(valueA) over(partition by user_id rows between 7 preceding and current row) as last_7_A,
sum(valueB) over(partition by user_id rows between 7 preceding and current row) as last_7_B
from table where day='2020-08-01'
But this query doesn't calculate the running sum and returns me the valueA and valueB on date 2020-08-01
I could just calculate on each day and select the date I want but that'll be really inefficient. Any ideas how to add the date constraint and let it just calculate on just one row's last 7 running sum for each user?
As per question:
sum of last 7 rows for each user for a particular date, this might work
select user_id, sum(valueA) "sum of valueA", sum(valueB) "sum of valueB"
from sample_table
where id in (
select id
from sample_table
where day='2020-08-08'
order by id desc limit 7)
group by user_id;

How can I fetch next n rows after a particular column value in postgresql

I have a result set from which I want to get next n rows (or previous n rows) after (before) the row that matches a particular cell value.
So for example, here is my data:
A B C
1 10 100
2 20 200
3 30 300
4 40 400
5 50 500
6 60 600
I am interested to get next 3 rows after the row where C=300, including C=300 row, so my output should be
A B C
3 30 300
4 40 400
5 50 500
6 60 600
With FETCH and OFFSET, you need to know the exact position number of the row, here I have to search where the data condition, i.e C=300 resides so I cannot assume that it will be the 3rd row.
select *
from table
order by C asc
Assuming you've got a table named sample, you could use a nested query and window functions to solve your issue, something like:
select *
from (
select *, lag(c,3) over(order by c asc) as three_back
from sample
where sample.c >= 300
) t
where coalesce(three_back,300) = 300
If your rows are ordered by the column value you are interested in then
SELECT *
FROM table_name
WHERE column_name >= x
ORDER BY column_name
LIMIT n
should do it. If not you’ll have to get creative
If your column values are unique and you want to order by another value then
SELECT *
FROM table_name
WHERE other_column >= (
SELECT other_column
FROM table_name
WHERE column_value = x
)
ORDER BY other_column
LIMIT n
If your column values are not unique you can
SELECT MIN(other_column)
in the inner select. This finds the first occurrence (using the other column to order by), and then retrieves the next (n - 1) rows

SQL - how to sum groups of 15 rows and find the max sum

The purpose of this question is to optimize some SQL by using set-based operations vs iterative (looping, like I'm doing below):
Some Explanation -
I have this cte that is inserted to a temp table #dataForPeak. Each row represents a minute, and a respective value retrieved.
For every row, my code uses a while loop to add 15 rows at a time (the current row + the next 14 rows). These sums are inserted into another temp table #PeakDemandIntervals, which is my workaround for then finding the max sum of these groups of 15.
I've bolded my end goal above. My code achieves this but in about 12 seconds for 26k rows. I'll be looking at much more data, so I know this is not enough for my use case.
My question is,
can anyone help me find a fast alternative to this loop?
It can include more tables, CTEs, nested queries, whatever. The while loop might not even be the issue, it's probably the inner code.
insert into #dataForPeak
select timestamp, value
from cte
order by timestamp;
while ##ROWCOUNT<>0
begin
declare #timestamp datetime = (select top 1 timestamp from #dataForPeak);
insert into #PeakDemandIntervals
select #timestamp, sum(interval.value) as peak
from (select * from #dataForPeak base
where base.timestamp >= #timestamp
and base.timestamp < DATEADD(minute,14,#timestamp)
) interval;
delete from #dataForPeak where timestamp = #timestamp;
end
select max(peak)
from #PeakDemandIntervals;
Edit
Here's an example of my goal, using groups of 3min instead of 15min.
Given the data:
Time | Value
1:50 | 2
1:51 | 4
1:52 | 6
1:53 | 8
1:54 | 6
1:55 | 4
1:56 | 2
the max sum (peak) I'm looking for is 20, because the group
1:52 | 6
1:53 | 8
1:54 | 6
has the highest sum.
Let me know if I need to clarify more than that.
Based on the example given it seems like you are trying to get the maximum value of a rolling sum. You can calculate the 15-minute rolling sum very easily as follow:
SELECT [Time]
,[Value]
,SUM([Value]) OVER (ORDER BY [Time] ASC ROWS 14 PRECEDING) [RollingSum]
FROM #dataForPeak
Note the key here is the ROWS 14 PRECEDING statement. It effectively state that SQL Server should sum the preceding 14 records with the current record which will give you your 15 minute interval.
Now you can simply max the result of the rolling sum. The full query will look as follow:
;WITH CTE_RollingSum
AS
(
SELECT [Time]
,[Value]
,SUM([Value]) OVER (ORDER BY [Time] ASC ROWS 14 PRECEDING) [RollingSum]
FROM #dataForPeak
)
SELECT MAX([RollingSum]) AS Peak
FROM CTE_RollingSum

Postgres calculate growth rate over two month

I would like to calculate growth rate for customers for following data.
month | customers
-------------------------
01-2015 | 1
02-2015 | 10
03-2014 | 10
06-2015 | 15
I have used following formula to calculate the growth rate, it works only for one month interval as well as not able to give expected output due to gap between 3rd and 6th month as shown in above table
select
month, total,
(total::float / lag(total) over (order by month) - 1) * 100 growth
from (
select to_char(created, 'yyyy-mm') as month, count(id) total
from customers
group by month
) s
order by month;
I think this can be done by creating a date range and group by that range.
I expect two main output separately
1) Generate growth rate with exact one month difference
2) Growth rate with interval of 2 month instead of single month only. In above data sum the two month result and group by two month instead of month
Still not sure about the second part. Here's growth from your adapted query and twon month growth column:
select
month, total,
(total::float / lag(total) over (order by m) - 1) * 100 growth,m,m2
from (
select created, (sum(customers) over (order by m))::float total,customers,m,m2,to_char(created, 'yyyy-mm') as month
from customers c
right outer join (
select generate_series('2015-01-01','2015-06-01','1 month'::interval) m
) m1 on m=c.created
left outer join (
select generate_series('2015-01-01','2015-06-01','2 month'::interval) m2
) m2 on m2=m
order by m
) s
order by m;
basically answer is use generate_series