"partition by" giving incorrect value - postgresql

battery_pct tstamp charging phone_id
90 t1 yes 12
91 t2 yes 22
95 t3 no 22
89 t4 no 22
87 t5 no 22
80 t6 no 22
78 t7 yes 22
85 t8 yes 4
50 t9 no 4
40 t10 no 4
38 t11 no 4
20 t12 yes 4
I want to calculate battery depletion rate as : change in battery / time taken
This should be calculated for ALL the windows when charging is 'no' (sandwiched in between 2 "yes"), and then the average of those rates should be taken.
So, for this dataset it should be:
(95 - 80) / (t6 - t3) = rate for phone_id 22
(50 - 38) / (t11 - t9) = rate for phone_id 4
average rate = ( rate 1 + rate 2 ) / 2
Please note there can be more than one window of no's for each phone_id in the data.
I have to find the average rate across ALL phone ids, i.e. one value for the average rate which encompasses all phones.
Here is my current code. It does not give any error, but it returns a value that is NOT plausible:
with discharge_intervals as (
select battery_pct, tstamp,
sum((charging = 'yes')::int) over (partition by phone_id order by tstamp) as ival_number,
charging = 'no' as keep
from dataset
), interval_rates as (
select ival_number,
(max(battery_pct) - min(battery_pct))
/ extract(epoch from max(tstamp) - min(tstamp)) as ival_rate
from discharge_intervals
where keep
group by ival_number
)
select avg(ival_rate)
from interval_rates;

Your interval_rates are calculated without grouping by phone, but they should be. The ival_numbers are partitioned by phone_id, but that only means the numbering restarts for each phone, so multiple phones will create rows with the same ival_number, and grouping by ival_number alone merges their intervals. You'll want to use:
with discharge_intervals as (
select battery_pct, tstamp, phone_id,
-- ^^^^^^^^^
sum((charging = 'yes')::int) over (partition by phone_id order by tstamp) as ival_number,
charging = 'no' as keep
from dataset
), interval_rates as (
select (max(battery_pct) - min(battery_pct))
/ extract(epoch from max(tstamp) - min(tstamp)) as ival_rate
from discharge_intervals
where keep
group by phone_id, ival_number
-- ^^^^^^^^^
)
select avg(ival_rate)
from interval_rates;
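To see why the extra grouping column matters, here is a minimal Python sketch of the same two steps (number the intervals per phone, then group the 'no' rows). It is only an illustration, not the query itself; it assumes t1..t12 can be replaced by the integers 1..12, and the function name interval_rates and its key_includes_phone flag are made up for this example.

```python
from collections import defaultdict

# Rows from the question: (battery_pct, tstamp, charging, phone_id).
# Assumption: t1..t12 are replaced by the integers 1..12 so the arithmetic runs.
rows = [
    (90, 1, "yes", 12), (91, 2, "yes", 22), (95, 3, "no", 22), (89, 4, "no", 22),
    (87, 5, "no", 22), (80, 6, "no", 22), (78, 7, "yes", 22), (85, 8, "yes", 4),
    (50, 9, "no", 4), (40, 10, "no", 4), (38, 11, "no", 4), (20, 12, "yes", 4),
]

def interval_rates(rows, key_includes_phone):
    """Number the intervals per phone, then compute one rate per group of 'no' rows."""
    yes_count = defaultdict(int)   # running count of 'yes' rows per phone_id
    groups = defaultdict(list)     # group key -> [(battery_pct, tstamp), ...]
    for pct, ts, charging, phone in sorted(rows, key=lambda r: r[1]):
        if charging == "yes":
            yes_count[phone] += 1
            continue
        ival = yes_count[phone]
        key = (phone, ival) if key_includes_phone else ival
        groups[key].append((pct, ts))
    rates = {}
    for key, members in groups.items():
        pcts = [p for p, _ in members]
        stamps = [t for _, t in members]
        rates[key] = (max(pcts) - min(pcts)) / (max(stamps) - min(stamps))
    return rates

merged = interval_rates(rows, key_includes_phone=False)    # group by ival_number only
separate = interval_rates(rows, key_includes_phone=True)   # group by phone_id, ival_number
print(merged)
print(separate)
print(sum(separate.values()) / len(separate))
```

With the sample data, both phones end up with ival_number 1 for their 'no' rows, so grouping by ival_number alone produces a single interval spanning both phones, while grouping by (phone_id, ival_number) keeps the two windows apart.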

Related

Unable to calculate compound interest in PostgreSQL

I have a table table1 which contains the details of any depositor, like:
Depositor | Deposit_Amount | Deposit_Date | Maturity_Date | Tenure | Rate
-------------------------------------------------------------------------
A         | 25000          | 2021-08-10   | 2022-08-10    | 12     | 10%
I have another table table2 which contains the interest due date as:
Interest_Due_Date
2021-09-30
2021-12-31
2022-03-31
2022-06-30
2022-08-10
My Code is:
with recursive recur (n, start_bal, days,principle,interest, end_bal) as
(
select sno,deposit_amount,rate,days,deposit_amount * (((rate::decimal(18,2))/100)/365)*days as interest, deposit_amount+(deposit_amount * (((rate::decimal(18,2))/100)/365)*days) as end_bal from (
SELECT
sno, COALESCE(DATE_PART('day', deposit_date::TIMESTAMP - lag(deposit_date::TIMESTAMP) over
(ORDER BY sno ASC rows BETWEEN UNBOUNDED PRECEDING AND CURRENT row)),0) AS
days, deposit_date, deposit_amount, rate
FROM
( SELECT
ROW_NUMBER () OVER (ORDER BY deposit_date) AS sno,
deposit_date,
deposit_amount,
rate
FROM
( SELECT
t1.deposit_date, t1.deposit_amount, t1.rate from table1 t1
UNION ALL
SELECT
t2.Interest_Due_Date AS idate, 0 as depo_amount, 0 as rate
FROM
table2 t2
ORDER BY
deposit_date) dep) calc) b where sno = 1 union all select b.sno, b.end_bal,b.days,b.prin_bal,(coalesce(a.end_bal,0)) * (((b.rate)/100)/365)*b.days as interest_NEW,
coalesce(a.end_bal,0)+ ((a.end_bal) * (((calc.rate)/100)/365)*calc.days) as end_bal_NEW
from b, recur as a
where calc.sno = a.n+1 ) select * from recur
Every time I try to execute the query, it shows the error: relation "b" does not exist
...
The result table should be:
Deposit Amount | Date       | Days | Interest | Total Amount
-------------------------------------------------------------
25000          | 2021-08-10 | 0    | 0        | 25000
0              | 2021-09-30 | 51   | 349.32   | 25349.32
0              | 2021-12-31 | 92   | 638.94   | 25988.26
0              | 2022-03-31 | 90   | 640.81   | 26629.06
0              | 2022-06-30 | 91   | 663.90   | 27292.97
0              | 2022-08-10 | 41   | 306.58   | 27599.54
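For reference, the arithmetic behind the expected result can be sketched in a few lines of Python. This is not a fix for the recursive query. It assumes interest for each period is balance * rate / 100 / 365 * days, that the balance is carried forward unrounded and only rounded for display, and that the function name schedule is purely illustrative.

```python
from datetime import date

# Values taken from table1 and table2 in the question.
deposit_amount = 25000.0
annual_rate_pct = 10.0
deposit_date = date(2021, 8, 10)
interest_due_dates = [
    date(2021, 9, 30), date(2021, 12, 31), date(2022, 3, 31),
    date(2022, 6, 30), date(2022, 8, 10),
]

def schedule(principal, rate_pct, start, due_dates):
    """Return (date, days, interest, balance) rows, compounding on each due date."""
    result = [(start, 0, 0.0, principal)]
    balance, previous = principal, start
    for due in due_dates:
        days = (due - previous).days
        # Assumed convention: simple daily interest on the running balance, 365-day year.
        interest = balance * (rate_pct / 100) / 365 * days
        balance += interest
        result.append((due, days, interest, balance))
        previous = due
    return result

rows = schedule(deposit_amount, annual_rate_pct, deposit_date, interest_due_dates)
for day, days, interest, balance in rows:
    print(day, days, f"{interest:.2f}", f"{balance:.2f}")
```

Under those assumptions the printed rows match the expected table, which suggests each recursive step only needs the previous row's end balance plus the current row's day count.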

Calculating a simple rate for "windows" within data

Battery % time charging
90 t1 yes
91 t2 yes
95 t3 no
89 t4 no
87 t5 no
80 t6 no
78 t7 yes
85 t8 yes
50 t9 no
40 t10 no
38 t11 no
20 t12 yes
I want to calculate battery depletion rate as :
change in battery / time taken
This should be calculated for ALL the windows when charging is 'no' (sandwiched in between 2 "yes"), and then the average of those rates should be taken.
So, for this dataset it should be:
(95 - 80) / (t6 - t3) = rate 1
(50 - 38) / (t11 - t9) = rate 2
average rate = ( rate 1 + rate 2 ) / 2
Please note there can be more than 2 windows of no's in the data
Here is my current code -
select ((max(battery_Percentage) - min (battery_Percentage)) / NULLIF(Extract(epoch FROM (max(time) - min(time))/3600),0)) as rate_of_battery_decline
from table
where
table.charging = 'no'
but this is not taking into account windows of no's in between the yes's as I want. Please help.
You have to separate the runs between the charging = 'yes' blocks:
with discharge_intervals as (
select battery_pct, tstamp,
sum((charging = 'yes')::int) over (order by tstamp) as ival_number,
charging = 'no' as keep
from cstats
), interval_rates as (
select ival_number,
(max(battery_pct) - min(battery_pct))
/ extract(epoch from max(tstamp) - min(tstamp)) as ival_rate
from discharge_intervals
where keep
group by ival_number
)
select avg(ival_rate)
from interval_rates;
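The running sum is what separates the runs: every 'yes' row bumps the counter, so all 'no' rows between two 'yes' blocks share one number. A small Python sketch of just that labelling step, using the charging column from the question (the variable names are illustrative):

```python
from itertools import accumulate

# The charging column from the question, ordered by time (t1..t12).
charging = ["yes", "yes", "no", "no", "no", "no", "yes", "yes", "no", "no", "no", "yes"]

# Equivalent of sum((charging = 'yes')::int) over (order by tstamp):
# a running count of 'yes' rows up to and including the current row.
ival_number = list(accumulate(int(c == "yes") for c in charging))

# Equivalent of "where keep": only the 'no' rows take part in the grouping.
no_labels = [n for n, c in zip(ival_number, charging) if c == "no"]

print(ival_number)
print(no_labels)
```

The t3..t6 rows all carry the label 2 and the t9..t11 rows all carry the label 4, so grouping by that label yields one group per discharge window.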

PostgreSQL non-overlapping ranges

I use a PostgreSQL database and have a cards table.
Each record (card) in this table has an integer card_drop_rate value.
For example:
id | card_name |card_drop_rate
-------------------------------
1 |card1 |34
2 |card2 |16
3 |card3 |54
The max drop rate is 34 + 16 + 54 = 104.
According to my application logic, I need to find a random value between 0 and 104 and then retrieve the card according to this number, for example:
random value: 71
card1 range: 0 - 34(0 + 34)
card2 range: 34 - 50(34 + 16)
card3 range: 50 - 104(50 + 54)
So, my card is card3 because 71 is placed in the range 50 - 104
What is the proper way to reflect this structure in PostgreSQL? I'll need to query this data often, so performance is the number one criterion for this solution.
Following query works fine:
SELECT
b.id,
b.card_drop_rate
FROM (SELECT a.id, sum(a.card_drop_rate) OVER(ORDER BY id) - a.card_drop_rate as rate, card_drop_rate FROM cards as a) b
WHERE b.rate < 299 ORDER BY id DESC LIMIT 1
You can do this using cumulative sums and random(). Each card's threshold is the sum of the drop rates of the cards before it; draw a random value below the total and take the card with the highest threshold that does not exceed it. It is something like this:
with c as (
select c.*,
sum(card_drop_rate) over (order by id) - card_drop_rate as threshhold
from cards c
),
r as (
select random() * sum(card_drop_rate) as which_card
from cards c
)
select c.*
from c cross join
r
where which_card >= threshhold
order by threshhold desc
limit 1;
For performance, I would simply take the cards and generate a new table with 104 slots, numbered 0 to 103. Assign a card to each slot (34 slots for card1, 16 for card2, 54 for card3) and build an index on the slot number. Then get a value using:
select s.*
from slots s
-- the scalar subquery draws one random value for the whole query instead of one per row
where s.slotid = (select floor(random() * 104)::int);
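The range lookup from the question (0 - 34, 34 - 50, 50 - 104) can also be sketched outside the database. This is a rough Python illustration of the cumulative-sum idea, assuming half-open ranges [lower, upper); the names pick_card and random_card are hypothetical.

```python
import bisect
import random
from itertools import accumulate

# (card_name, card_drop_rate) from the question's sample table.
cards = [("card1", 34), ("card2", 16), ("card3", 54)]

# Running totals give each card's upper bound: [34, 50, 104].
upper_bounds = list(accumulate(rate for _, rate in cards))

def pick_card(value):
    """Return the card whose half-open range [lower, upper) contains value."""
    index = bisect.bisect_right(upper_bounds, value)
    # Guard against value == total, which falls just outside the last range.
    return cards[min(index, len(cards) - 1)][0]

def random_card(rng=random):
    """Draw a value in [0, total) and map it to a card."""
    return pick_card(rng.random() * upper_bounds[-1])

print(pick_card(71))
print(random_card())
```

The binary search plays the role of the index lookup: the random value 71 from the question lands in card3's range.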

SQL - Select max week from a group

I need to be able to get a result set which shows the last teacher for a course, for which I have the following SQL query:
SELECT
a.acad_period, MAX(a.start_week) as start_week,
a.staff_code, b.aos_code, b.aos_period
FROM
qlsdat.dbo.sttstaff a
INNER JOIN
qlsdat..sttrgaos b ON a.acad_period = b.acad_period
AND a.register_id = b.register_id
AND a.register_group = b.register_group
WHERE
a.acad_period = '14/15'
GROUP BY
a.acad_period, a.staff_code, b.aos_code, b.aos_period
However, the issue is that it returns to me the maximum start week for a teacher on that course, whereas I want the maximum start week for a course, and the teacher that happens to be teaching for that start week.
Here is a sample result set returned from the above query:
14/15 37 HKARUNATHIL A2ES 001A
14/15 37 CSHUKLA A2ES 001B
14/15 37 PSEDOV A2ES 002A
14/15 37 BBANFIELD A2ES 002B
14/15 14 VKRISHNASWA A2EX BL1 X
14/15 14 VKRISHNASWA A2EX BL2 X
14/15 6 BODAMEKENTO ACA2 BL1 A
14/15 41 SKLER ACA2 BL1 A
14/15 44 BODAMEKENTO ACAS BL1 F
14/15 37 MMILLER ARA2 BL1 C
14/15 45 MMILLER ARAS BL1 E
14/15 44 SHOULTON ARAS BL1 E
Here is an example of the problem within the result set:
14/15 10 HMALIK MMGX GB2F3
14/15 44 JMULLANEY MMGX GB2F3
In the above example I only want:
14/15 44 JMULLANEY MMGX GB2F3
The query produced is going to be used as a subquery in another query.
This will get the row for the highest start_week. However, you may encounter problems if you have data from more than one year; this can be resolved by adding your year field alongside the week column in this part:
row_number() over (partition by
a.acad_period, b.aos_code, b.aos_period
order by
a.start_year desc,
a.start_date desc) rn
Query:
;WITH CTE AS
(
SELECT
a.acad_period, a.start_week,
a.staff_code, b.aos_code, b.aos_period,
row_number() over (partition by
a.acad_period, b.aos_code,
b.aos_period
order by a.start_week desc) rn
FROM
qlsdat.dbo.sttstaff a
INNER JOIN
qlsdat..sttrgaos b ON a.acad_period = b.acad_period
AND a.register_id = b.register_id
AND a.register_group = b.register_group
WHERE
a.acad_period = '14/15'
)
SELECT
acad_period, start_week,
staff_code, aos_code, aos_period
FROM CTE
WHERE rn = 1
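The effect of row_number() with rn = 1 is "keep one row per course, the one with the highest start_week". A small Python sketch of that selection, using a few rows from the question's sample output (the split of each sample line into columns is an assumption, and the variable names are illustrative):

```python
# (acad_period, start_week, staff_code, aos_code, aos_period)
rows = [
    ("14/15", 37, "HKARUNATHIL", "A2ES", "001A"),
    ("14/15", 37, "CSHUKLA", "A2ES", "001B"),
    ("14/15", 10, "HMALIK", "MMGX", "GB2F3"),
    ("14/15", 44, "JMULLANEY", "MMGX", "GB2F3"),
]

# Partition by (acad_period, aos_code, aos_period); keep the row with the
# largest start_week in each partition, like "order by start_week desc" + rn = 1.
latest = {}
for row in rows:
    acad_period, start_week, staff_code, aos_code, aos_period = row
    key = (acad_period, aos_code, aos_period)
    if key not in latest or start_week > latest[key][1]:
        latest[key] = row

for row in latest.values():
    print(row)
```

The teacher is no longer part of the grouping key, so only JMULLANEY (week 44) survives for MMGX GB2F3, which is the result the question asks for.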

Complicated AVG within date range

I've got a table with a tracking of a plant's equipment installation.
Here is a sample:
ID Name Date Percentage
1 GT-001 2011-01-08 30
2 GT-002 2011-01-11 40
3 GT-003 2011-02-02 30
4 GT-001 2011-02-03 50
5 GT-003 2011-02-15 50
6 GT-004 2011-02-15 30
7 GT-002 2011-02-15 60
8 GT-001 2011-02-20 60
9 GT-003 2011-03-01 60
10 GT-004 2011-03-05 50
11 GT-001 2011-03-10 70
12 GT-004 2011-03-15 60
And the corresponding script:
CREATE TABLE [dbo].[SampleTable](
[ID] [int] NOT NULL,
[Name] [nvarchar](50) NULL,
[Date] [date] NULL,
[Percentage] [int] NULL) ON [PRIMARY]
GO
--Populate the table with values
INSERT INTO [dbo].[SampleTable] VALUES
('1', 'GT-001', '2011-01-08', '30'),
('2', 'GT-002', '2011-01-11', '40'),
('3', 'GT-003', '2011-02-02', '30'),
('4', 'GT-001', '2011-02-03', '50'),
('5', 'GT-003', '2011-02-15', '50'),
('6', 'GT-004', '2011-02-15', '30'),
('7', 'GT-002', '2011-02-15', '60'),
('8', 'GT-001', '2011-02-20', '60'),
('9', 'GT-003', '2011-03-01', '60'),
('10', 'GT-004', '2011-03-05', '50'),
('11', 'GT-001', '2011-03-10', '70'),
('12', 'GT-004', '2011-03-15', '60');
GO
What I need is to create a chart with Date on the X axis and Average Percentage on the Y axis. Average Percentage is the average percentage of all equipment as of that particular date, counted from the beginning of the installation process (MIN(Fields!Date.Value, "EquipmentDataset")).
Having no luck implementing this using SSRS only, I decided to create a more complicated dataset for it using T-SQL.
I guess that it is necessary to add a calculated column named 'AveragePercentage' that stores the average percentage on that date, using only the latest percentage value of each piece of equipment in the range between the beginning of the installation process (MIN(Date)) and the current row's date. Smells like recursion, but I'm a newbie to T-SQL.
Here is the desired output
ID Name Date Percentage Average
1 GT-001 2011-01-08 30 30
2 GT-002 2011-01-11 40 35
3 GT-003 2011-02-02 30 33
4 GT-001 2011-02-03 50 40
5 GT-003 2011-02-15 50 48
6 GT-004 2011-02-15 30 48
7 GT-002 2011-02-15 60 48
8 GT-001 2011-02-20 60 50
9 GT-003 2011-03-01 60 53
10 GT-004 2011-03-05 50 58
11 GT-001 2011-03-10 70 60
12 GT-004 2011-03-15 60 63
What do you think?
I would appreciate any help.
You could use outer apply with row_number to find the latest value for each machine as of each row's date. An additional subquery is required because you cannot use row_number in the where clause directly. Here's the query:
select t1.id
, t1.Name
, t1.Date
, t1.Percentage
, avg(1.0*last_per_machine.percentage)
from SampleTable t1
outer apply
(
select *
from (
select row_number() over (partition by Name order by id desc)
as rn
, *
from SampleTable t2
where t2.date <= t1.date
) as numbered
where rn = 1
) as last_per_machine
group by
t1.id
, t1.Name
, t1.Date
, t1.Percentage
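As a cross-check of the logic (latest value per machine up to each row's date, then the average of those), here is a short Python sketch over the sample data. It is only an illustration of the idea, not the T-SQL; it assumes "latest" means the highest ID on or before the row's date, and the function name running_averages is made up.

```python
# (ID, Name, Date, Percentage) from the sample table; ISO dates compare correctly as strings.
rows = [
    (1, "GT-001", "2011-01-08", 30), (2, "GT-002", "2011-01-11", 40),
    (3, "GT-003", "2011-02-02", 30), (4, "GT-001", "2011-02-03", 50),
    (5, "GT-003", "2011-02-15", 50), (6, "GT-004", "2011-02-15", 30),
    (7, "GT-002", "2011-02-15", 60), (8, "GT-001", "2011-02-20", 60),
    (9, "GT-003", "2011-03-01", 60), (10, "GT-004", "2011-03-05", 50),
    (11, "GT-001", "2011-03-10", 70), (12, "GT-004", "2011-03-15", 60),
]

def running_averages(rows):
    """For each row, average the newest percentage of every machine seen up to its date."""
    result = []
    for _, _, current_date, _ in rows:
        latest = {}  # name -> (id, percentage) of the newest row on or before current_date
        for row_id, name, row_date, pct in rows:
            if row_date <= current_date and (name not in latest or row_id > latest[name][0]):
                latest[name] = (row_id, pct)
        values = [pct for _, pct in latest.values()]
        result.append(sum(values) / len(values))
    return result

averages = running_averages(rows)
for row, avg in zip(rows, averages):
    print(row, avg)
```

The unrounded averages (for example 47.5 for the three rows dated 2011-02-15) round half up to the values in the desired output.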