Running Balance per Account Number - tsql

I need a running balance per Account Number.
I tried developing a function, but I keep getting the wrong values.
Below are sample data and how I currently calculate the running balance:
declare @tbl table (id int primary key identity, Sort varchar(100), MasterAccount varchar(50), SubAccount varchar(50), Amount float)
insert into @tbl (Sort,MasterAccount,SubAccount,Amount)
values
('1000_1','1000','aaOpening Balance',78137.58)
,('1000_2_13699','1000','1000',35516.34)
,('1000_2_14379','1000','1000',16675)
,('1000_2_15141','1000','1000',9252.21)
,('1000_2_15151','1000','1000',8167.3)
,('1000_2_15161','1000','1000',5729.3)
,('1000_2_15166','1000','1000',8898.7)
,('1000_2_15623','1000','1000',3335)
,('1000_2_15633','1000','1000',2620.85)
,('1000_2_15638','1000','1000',3425.39)
,('1000_2_17582','1000','1000',7281.55)
,('1000_2_18756','1000','1000',2126)
,('1000_2_19698','1000','1000',8000)
,('1000_2_19713','1000','1000',8000)
,('1000_2_19718','1000','1000',8000)
,('1000_2_19847','1000','1000',8000)
,('1000_2_20055','1000','1000',3933.1)
,('1000_2_20060','1000','1000',5304.37)
,('1000_2_20099','1000','1000',0.00000000123)
,('1000_2_20104','1000','1000',-0.00000000123)
,('1000_2_20330','1000','1000',8000)
,('1000_2_20340','1000','1000',8000)
,('1000_2_20360','1000','1000',8000)
,('1000_2_20390','1000','1000',8000)
,('1000_2_20416','1000','1000',8000)
,('1000_2_20576','1000','1000',8000)
,('1000_2_21033','1000','1000',8000)
,('1000_3','1000','zzClosing Balance',278402.69)
,('2000_1','2000','aaOpening Balance',128023.65)
,('2000_2_14381','2000','2000',15174.5)
,('2000_2_15143','2000','2000',10534.92)
,('2000_2_15153','2000','2000',9299.6)
,('2000_2_15163','2000','2000',6523.6)
,('2000_2_15168','2000','2000',10132.4)
,('2000_2_15625','2000','2000',3084.28)
,('2000_2_15635','2000','2000',2652.48)
,('2000_2_15640','2000','2000',3466.73)
,('2000_2_17584','2000','2000',8489.38)
,('2000_2_18758','2000','2000',2132.41)
,('2000_2_20057','2000','2000',3944.96)
,('2000_2_20062','2000','2000',5320.37)
,('2000_3','2000','zzClosing Balance',208779.28)
select
*
, round(case SubAccount when 'zzClosing Balance' then 0 else sum(Amount) over(order by [Sort] rows unbounded preceding) end,2) RunningBalance
from @tbl
The results look as follows:
The problem I am facing is that the running balance for the second account number (in this case account "2000") is calculated incorrectly.
I need the running balance for account 2000 to start with the value 128,023.65 and end with 208,779.28.
To give you a better idea, my expected results would look like this:
How would I get the Expected results?
Your assistance is greatly appreciated!

I would use conditional aggregation with a window function:
sum(case SubAccount when 'zzClosing Balance' then 0 else Amount end) over (partition by MasterAccount order by Sort rows unbounded preceding)
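Plugged into the full query, it would look something like this (a sketch against the @tbl sample data from the question; note that on the 'zzClosing Balance' rows this shows the accumulated running total rather than 0, unlike the query the OP settled on below):
select
*
, round(sum(case SubAccount when 'zzClosing Balance' then 0 else Amount end) over(partition by MasterAccount order by Sort rows unbounded preceding),2) RunningBalance
from @tbl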

I had tried partitioning the running balance, but could not get it right at first.
I kept trying and finally got it right.
I changed my running balance script to this and it worked:
select
*
, round(case SubAccount when 'zzClosing Balance' then 0 else sum(Amount) over(partition by MasterAccount order by Sort rows unbounded preceding) end,2) RunningBalance
from @tbl
Thanks for your help either way.

Related

How to Return Records Equal to a Specific Percentage of an Aggregate in Transact-SQL?

My requirement is to provide a random sample of claims that comprise 2.5% of the total amount paid and also comprise 2.5% of total claims for a given population. The goal is to deliver records in a report that meet both criteria. My staging table is defined as follows:
CREATE TABLE [Z_Monthly_Quality_Review] (
[RecordId] UniqueIdentifier NOT NULL PRIMARY KEY DEFAULT NEWID()
,ClaimNO varchar(50)
,Company_ID varchar(10)
,HPCode varchar(10)
,FinancialResponsibility varchar(30)
,ProviderType varchar(50)
,DateOfService date
,DatePaid date
,ClaimType varchar(50)
,TotalBilled numeric(11,2)
,TotalPaid numeric(11,2)
,ProcessorType varchar(100)
)
I've already built the logic to return 2.5% of the total number of claims, but I need guidance on how best to ensure both criteria are met.
Here's what I've tried thus far:
with cteTotals as (
Select Count(*) as TotalClaims, sum(TotalPaid) as TotalPaid, sum(TotalPaid) * .025 as PaidSampleAmount
from [Z_Monthly_Quality_Review]
),
ctePopulation as (
Select *
from [Z_Monthly_Quality_Review]
),
cteSampleRows as (
select TOP 2.5 PERCENT NEWID() RandomID, RecordID, ClaimNo, HPCode, FinancialResponsibility, ProviderType, ProcessorType,
Format(DateOfService, 'MM/dd/yyyy') as DateOfService, Format(DatePaid, 'MM/dd/yyyy') as DatePaid, ClaimType, TotalBilled, TotalPaid
from [Z_Monthly_Quality_Review]
order by NEWID()
),
cteSamplePaid as (
Select Top 2.5 PERCENT NEWID() RandomID, mqr.RecordID, mqr.ClaimNo, mqr.HPCode, mqr.FinancialResponsibility, mqr.ProviderType, mqr.ProcessorType,
Format(mqr.DateOfService, 'MM/dd/yyyy') as DateOfService, Format(mqr.DatePaid, 'MM/dd/yyyy') as DatePaid, mqr.ClaimType, mqr.TotalBilled, mqr.TotalPaid
from [Z_Monthly_Quality_Review] mqr
inner join ctePopulation cte on mqr.ClaimNo = cte.ClaimNO
order by NEWID()
)
-- no final SELECT yet; how to combine the two samples is the question
Since both criteria must be satisfied, how should I structure both CTEs to ensure this? In my cteSamplePaid, how do I ensure that the sum of TotalPaid equals 2.5% of the total for the population? Would this be accomplished with a HAVING clause? The end result will be displayed to my business users via SQL Server Reporting Services. Ideally, I would want to provide them with one sample that meets both criteria. If that's not possible, how do I randomly sample claims for both criteria?
I don't think there is a guaranteed way to make the sample add up to exactly 2.5% of the total. There is no guarantee of a result, and the performance would be very poor, as you would essentially have to brute-force every possible combination of rows. A way to get very close to your goal is to return rows that add up to within an acceptable margin of error.
Since no sample data was provided, I just used AdventureWorks2017 (downloaded from here)
USE AdventureWorks2017
GO
DROP TABLE IF EXISTS #SalesData
SELECT SalesOrderID AS ID,TotalDue
INTO #SalesData
FROM Sales.SalesOrderHeader
Declare @DesiredPercentage Numeric(10,3) = .025 /*Desired sum percentage of total rows*/
,@AcceptableMargin Numeric(10,3) = .01 /*Random row total can be plus or minus this percentage of the desired sum*/
DECLARE @DesiredSum Numeric(16,2) = @DesiredPercentage *(SELECT SUM(TotalDue) FROM #SalesData)
/*Loop variables*/
DECLARE @RowNum INT
,@LoopCounter INT = 1
WHILE (1=1)
BEGIN
DROP TABLE IF EXISTS #RandomData
SELECT RowNum = ROW_NUMBER() OVER (ORDER BY B.RandID),A.*,RunningTotal = SUM(TotalDue) OVER (ORDER BY B.RandID)
INTO #RandomData
FROM #SalesData AS A
CROSS APPLY (SELECT RandID = NEWID()) AS B
WHERE TotalDue < @DesiredSum /*If single row bigger than desired sum, then filter it out*/
ORDER BY B.RandID
SELECT Top(1) @RowNum = RowNum
FROM #RandomData AS A
CROSS APPLY (SELECT DeltaFromDesiredSum = ABS(RunningTotal-@DesiredSum)) AS B
WHERE RunningTotal BETWEEN @DesiredSum *(1-@AcceptableMargin) AND @DesiredSum *(1+@AcceptableMargin)
ORDER BY DeltaFromDesiredSum
IF (@RowNum IS NOT NULL)
BREAK;
IF (@LoopCounter >=100) /*Prevents infinite loops*/
THROW 59194,'Result unable to be generated in 100 tries. Recommend expanding acceptable margin',1;
SET @LoopCounter +=1;
END
SELECT *
FROM #RandomData
WHERE RowNum <= @RowNum
SELECT RandomRowTotal = SUM(TotalDue)
,DesiredSum = @DesiredSum
,PercentageFromDesiredSum = Concat(Cast(Round(100*(1-SUM(TotalDue)/@DesiredSum),2) as Float),'%')
FROM #RandomData
WHERE RowNum <= @RowNum

PostgreSQL - Unexpected division by zero using SUM

This query (minimal reproducible example):
WITH t as (
SELECT 3 id, 2 price, 0 amount
)
SELECT
CASE WHEN amount > 0 THEN
SUM(price / amount)
ELSE
price
END u_price
FROM t
GROUP BY id, price, amount
on PostgreSQL 9.4 throws
division by zero
Without the SUM it works.
How is this possible?
I liked this question, and I turned to the PostgreSQL documentation for help. The planner is the culprit:
A CASE cannot prevent evaluation of an aggregate expression contained
within it, because aggregate expressions are computed before other
expressions in a SELECT list or HAVING clause are considered
More details at https://www.postgresql.org/docs/10/static/sql-expressions.html#SYNTAX-EXPRESS-EVAL
I cannot figure out the "why" part, but here is a workaround...
WITH t as (
SELECT 3 id, 2 price, 0 amount
)
SELECT SUM(price / case when amount = 0 then 1 else amount end) u_cena
FROM t
GROUP BY id, price, amount
OR: you can use the following and avoid the "case"
SELECT SUM(price / power(amount,sign(amount))) u_cena
FROM t
GROUP BY id, price, amount
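If you'd rather avoid both the CASE and the power() trick, NULLIF is another common workaround (my suggestion, not part of the original answer): it turns a zero divisor into NULL, the division then yields NULL, and SUM skips NULLs. Note it is not a drop-in replacement, because a group whose amounts are all zero sums to NULL instead of returning price:
WITH t as (
SELECT 3 id, 2 price, 0 amount
)
SELECT SUM(price / NULLIF(amount, 0)) u_price -- NULLIF(0, 0) = NULL, so no division by zero
FROM t
GROUP BY id, price, amount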

How do I find the sum of all transactions since an event?

So, let's say that I have a group of donors, and they make donations on an irregular basis. I can put the donor name, the donation amount, and the donation date into a table, but then I want a report that shows all of that information PLUS the sum of all donations made after that one.
I know that I can parse through this using a loop, but is there a better way?
I'm cheating here by not bothering with the code that would go through and assign a transaction number per donor and ensure that everything is in the right order. That's easy enough.
DECLARE @Donors TABLE (
ID INT IDENTITY
, Name NVARCHAR(30)
, NID INT
, Amount DECIMAL(7,2)
, DonationDate DATE
, AmountAfter DECIMAL(7,2)
)
INSERT INTO @Donors VALUES
('Adam Zephyr',1,100.00,'2017-01-14',NULL)
, ('Adam Zephyr',2,200.00,'2017-01-17',NULL)
, ('Adam Zephyr',3,150.00,'2017-01-20',NULL)
, ('Braden Yu',1,50.00,'2017-01-11',NULL)
, ('Braden Yu',2,75.00,'2017-01-19',NULL)
DECLARE @Counter1 INT = 0
, @Name NVARCHAR(30)
WHILE @Counter1 < (SELECT MAX(ID) FROM @Donors)
BEGIN
SET @Counter1 += 1
SET @Name = (SELECT Name FROM @Donors WHERE ID = @Counter1)
UPDATE d1
SET AmountAfter = (SELECT ISNULL(SUM(Amount),0) FROM @Donors d2 WHERE ID > @Counter1 AND Name = @Name)
FROM @Donors d1
WHERE d1.ID = @Counter1
END
SELECT * FROM @Donors
It seems like there ought to be a way to do this recursively, but I just can't wrap my head around it.
This would show the latest donation per Name (which I presume identifies the donor) and the total of all amounts donated by that person. Perhaps it would be more appropriate to use NID for the partitions.
;with MostRecentDonations as (
select *,
row_number() over (partition by Name order by DonationDate desc) as rn,
sum(Amount) over (partition by Name) as TotalDonations
from @Donors
)
select * from MostRecentDonations
where rn = 1;
There's certainly no need to store a running total anywhere unless you have some kind of performance issue.
EDIT:
I've thought about your question and now I'm thinking that you just want a running total with all the transactions included. That's easy too:
select *,
sum(Amount) over (partition by Name order by DonationDate) as DonationsToDate
from @Donors
order by Name, DonationDate;
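And if you really do want the AmountAfter column from your loop (the sum of each donor's later donations), a window frame can produce it directly; a minimal sketch against the @Donors table, assuming DonationDate reflects the intended transaction order (this frame syntax requires SQL Server 2012+):
select *,
isnull(sum(Amount) over (partition by Name order by DonationDate
rows between 1 following and unbounded following), 0) as AmountAfterCalc
from @Donors
order by Name, DonationDate;
The frame is empty on each donor's last row, so the SUM yields NULL there and ISNULL maps it to 0, matching the loop's ISNULL(SUM(Amount),0).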

TSQL - Control a number sequence

I'm new to T-SQL.
I have a table with a field called ODOMETER for a vehicle. I need to get the number of kilometers driven in a period of time, from the 1st of the month to the end.
SELECT MAX(Odometer) - MIN(Odometer) as TotalKm FROM Table
This works in an ideal test scenario, but the odometer can be reset to 0 at any time.
Can someone help me solve this problem? Thank you.
I'm working with MS SQL 2012.
EXAMPLE of records:
Date            Odometer value
datetime var    37210
datetime var    37340
datetime var    0
datetime var    220
Try something like this using LAG. There are other ways, but this should be easy.
EDIT: Changed the sample data to include records outside of the desired month range, and simplified the readings for easy hand calculation. I will show a second option, as suggested by the OP.
DECLARE @tbl TABLE (stamp DATETIME, Reading INT)
INSERT INTO @tbl VALUES
('02/28/2014',0)
,('03/01/2014',10)
,('03/10/2014',20)
,('03/22/2014',0)
,('03/30/2014',10)
,('03/31/2014',20)
,('04/01/2014',30)
--Original solution with WHERE on the "outer" SELECT.
--This gives a result of 40, as it includes the change of 10 between 2/28 and 3/1.
;WITH cte AS (
SELECT Reading
,LAG(Reading,1,Reading) OVER (ORDER BY stamp ASC) LastReading
,Reading - LAG(Reading,1,Reading) OVER (ORDER BY stamp ASC) ChangeSinceLastReading
,CONVERT(date, stamp) stamp
FROM @tbl
)
SELECT SUM(CASE WHEN Reading = 0 THEN 0 ELSE ChangeSinceLastReading END)
FROM cte
WHERE stamp BETWEEN '03/01/2014' AND '03/31/2014'
--Second option with WHERE on the "inner" SELECT (within the CTE).
--This gives a result of 30, as the change of 10 between 2/28 and 3/1 is excluded by the filtered LAG.
;WITH cte AS (
SELECT Reading
,LAG(Reading,1,Reading) OVER (ORDER BY stamp ASC) LastReading
,Reading - LAG(Reading,1,Reading) OVER (ORDER BY stamp ASC) ChangeSinceLastReading
,CONVERT(date, stamp) stamp
FROM @tbl
WHERE stamp BETWEEN '03/01/2014' AND '03/31/2014'
)
SELECT SUM(CASE WHEN Reading = 0 THEN 0 ELSE ChangeSinceLastReading END)
FROM cte
I think Karl's solution using LAG is better than mine, but anyway:
;WITH [Rows] AS
(
SELECT o1.stamp, o1.Reading as CurrentValue,
(SELECT TOP 1 o2.Reading
FROM @tbl o2 WHERE o1.stamp < o2.stamp
ORDER BY o2.stamp ASC) as NextValue
FROM @tbl o1
)
SELECT SUM (CASE WHEN [NextValue] IS NULL OR [NextValue] < [CurrentValue] THEN 0 ELSE [NextValue] - [CurrentValue] END )
FROM [Rows]

Speeding up TSQL

Hi all, I'm wondering if there's a more efficient way of executing this T-SQL script. It gets the latest activity per account name and then joins that to the accounts table, so you get the very latest activity for each account. The problem is there are currently about 22,000 latest activities, so it obviously has to go through a lot of data. Is there a more efficient way of doing what I'm doing?
DECLARE @pastAppointments TABLE (objectid NVARCHAR(100), account NVARCHAR(500), startdate DATETIME, tasktype NVARCHAR(100), ownerid UNIQUEIDENTIFIER, owneridname NVARCHAR(100), RN NVARCHAR(100))
INSERT INTO @pastAppointments (objectid, account, startdate, tasktype, ownerid, owneridname, RN)
SELECT * FROM (
SELECT fap.regardingobjectid, fap.regardingobjectidname, fap.actualend, fap.activitytypecodename, fap.ownerid, fap.owneridname,
ROW_NUMBER() OVER (PARTITION BY fap.regardingobjectidname ORDER BY fap.actualend DESC) AS RN
FROM FilteredActivityPointer fap
WHERE fap.actualend < getdate()
AND fap.activitytypecode NOT LIKE 4201
) tmp WHERE RN = 1
ORDER BY regardingobjectidname
SELECT fa.name, fa.owneridname, fa.new_technicalaccountmanagername, fa.new_customerid, fa.new_riskstatusname, fa.new_numberofopencases,
fa.new_numberofurgentopencases, app.startdate, app.tasktype, app.ownerid, app.owneridname
FROM FilteredAccount fa LEFT JOIN @pastAppointments app on fa.accountid = app.objectid and fa.ownerid = app.ownerid
WHERE fa.statecodename = 'Active'
AND fa.ownerid LIKE @owner_search
ORDER BY fa.name
You can remove the ORDER BY regardingobjectidname from the first INSERT query - the only (narrow) purpose such a sort serves on an INSERT is when the target table has an identity column, and there isn't one here. So if the optimizer isn't smart enough to ignore it, it will perform a pointless sort.
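A minimal sketch of that first statement with the sort removed (same tables and columns as in the question):
INSERT INTO @pastAppointments (objectid, account, startdate, tasktype, ownerid, owneridname, RN)
SELECT * FROM (
SELECT fap.regardingobjectid, fap.regardingobjectidname, fap.actualend, fap.activitytypecodename, fap.ownerid, fap.owneridname,
ROW_NUMBER() OVER (PARTITION BY fap.regardingobjectidname ORDER BY fap.actualend DESC) AS RN
FROM FilteredActivityPointer fap
WHERE fap.actualend < getdate()
AND fap.activitytypecode NOT LIKE 4201
) tmp WHERE RN = 1 -- no ORDER BY: row order is irrelevant for the insert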