Is there a way to use windowing functions to simplify this query?

I have some data from a real-world example. The data is stored in such a way that I have to manipulate it to get the results I want. Essentially, there are two different types of invoices: a pre-invoice (P), where amounts are billed before an order is processed, and a standard invoice (I), where the remainder is billed after shipping. You would expect, under ordinary circumstances, that the standard invoice would have the pre-invoice amount subtracted from it, e.g., if the pre-invoice was for $2 and the order was for $10, the standard invoice would be $8. Instead, though, the standard invoice is stored as $10.
Below is a fully operational query (That query's operational! It's a trap!) that both populates the data and returns the results I want. The goal is to take the invoice amount and if it's a pre-invoice, return that amount; but if it's a standard invoice, to "use up" the value of the pre-invoice and return a new amount, minimum zero. I've included six scenarios, because the pre-invoice could be any amount and could technically be required at any time.
Any assistance on this would be much appreciated. I tried some windowing functions, including UNBOUNDED PRECEDING type stuff, but it always seemed like I needed to be recursive and go into an infinite loop. Hence my brute-force approach, below.
DECLARE @PData TABLE (
    CustNum INT
    ,TransNum INT
    ,InvType NVARCHAR(1)
    ,InvAmt DECIMAL(5,2)
    ,CRank INT
    ,TRank INT
    ,ModInvAmt DECIMAL(5,2)
)
INSERT INTO @PData (
    CustNum
    ,TransNum
    ,InvType
    ,InvAmt
)
VALUES
(124, 1,'P',2)
,(124, 2,'I',10)
,(124, 3,'I',10)
,(153, 4,'I',10)
,(153, 5,'P',2)
,(153, 6,'I',10)
,(324, 7,'I',10)
,(324, 8,'I',10)
,(324, 9,'P',2)
,(441,10,'P',12)
,(441,11,'I',10)
,(441,12,'I',10)
,(455,13,'I',10)
,(455,14,'P',12)
,(455,15,'I',10)
,(667,16,'I',10)
,(667,17,'I',10)
,(667,18,'P',12)
UPDATE pd1
SET CRank = pd2.CDR
FROM @PData pd1
JOIN (SELECT CustNum, TransNum, DENSE_RANK() OVER (ORDER BY CustNum) AS CDR
      FROM @PData) pd2
    ON pd1.TransNum = pd2.TransNum
UPDATE pd1
SET TRank = pd2.TDR
FROM @PData pd1
JOIN (SELECT CustNum, TransNum, DENSE_RANK() OVER (PARTITION BY CustNum ORDER BY TransNum) AS TDR
      FROM @PData) pd2
    ON pd1.TransNum = pd2.TransNum
DECLARE @Counter1 INT
    ,@Counter2 INT
    ,@CBal DECIMAL(5,2)
    ,@TNum INT
    ,@IAmt DECIMAL(5,2)
SET @Counter1 = 0
WHILE @Counter1 < (SELECT MAX(CRank) FROM @PData)
BEGIN
    SET @Counter1 += 1
    SET @CBal = 0
    SET @Counter2 = 0
    WHILE @Counter2 < (SELECT MAX(TRank) FROM @PData WHERE CRank = @Counter1)
    BEGIN
        SET @Counter2 += 1
        SET @TNum = (SELECT TransNum FROM @PData WHERE CRank = @Counter1 AND TRank = @Counter2)
        SET @IAmt = (SELECT InvAmt FROM @PData WHERE TransNum = @TNum)
        IF (SELECT InvType FROM @PData WHERE TransNum = @TNum) = 'P'
        BEGIN
            UPDATE @PData SET ModInvAmt = @IAmt WHERE TransNum = @TNum
            SET @CBal += -@IAmt
        END
        ELSE
        BEGIN
            UPDATE @PData SET ModInvAmt = (ABS(@IAmt+@CBal)+(@IAmt+@CBal))/2 -- MINIMUM = 0
            WHERE TransNum = @TNum
            SET @CBal += (@IAmt - (ABS(@IAmt+@CBal)+(@IAmt+@CBal))/2)
        END
    END
END
SELECT CustNum
    ,TransNum
    ,InvType
    ,InvAmt
    ,ModInvAmt
FROM @PData
Here are the results I get:
I wouldn't normally report the original invoice amount--just the new one--but I've included it here so it's more clear how it changed.
CustNum TransNum InvType InvAmt ModInvAmt
124 1 P 2.00 2.00
124 2 I 10.00 8.00
124 3 I 10.00 10.00
153 4 I 10.00 10.00
153 5 P 2.00 2.00
153 6 I 10.00 8.00
324 7 I 10.00 10.00
324 8 I 10.00 10.00
324 9 P 2.00 2.00
441 10 P 12.00 12.00
441 11 I 10.00 0.00
441 12 I 10.00 8.00
455 13 I 10.00 10.00
455 14 P 12.00 12.00
455 15 I 10.00 0.00
667 16 I 10.00 10.00
667 17 I 10.00 10.00
667 18 P 12.00 12.00
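For anyone who wants to cross-check the rule outside of T-SQL, here is a minimal Python sketch of the same "use up the pre-invoice" logic. It assumes transactions are processed per customer in TransNum order; the function name `apply_pre_invoices`, the `SAMPLE` list, and the tuple layout are illustrative only and not part of the original query.

```python
def apply_pre_invoices(rows):
    """Return (CustNum, TransNum, InvType, InvAmt, ModInvAmt) tuples.

    rows: iterable of (CustNum, TransNum, InvType, InvAmt).
    """
    credit = {}   # unused pre-invoice amount per customer
    result = []
    for cust, trans, inv_type, amt in sorted(rows, key=lambda r: (r[0], r[1])):
        available = credit.get(cust, 0)
        if inv_type == "P":
            # A pre-invoice is reported as-is and adds to the customer's credit.
            credit[cust] = available + amt
            mod_amt = amt
        else:
            # A standard invoice uses up as much credit as it can, minimum zero.
            used = min(available, amt)
            credit[cust] = available - used
            mod_amt = amt - used
        result.append((cust, trans, inv_type, amt, mod_amt))
    return result


# Same six scenarios as the sample data in the question.
SAMPLE = [
    (124, 1, "P", 2), (124, 2, "I", 10), (124, 3, "I", 10),
    (153, 4, "I", 10), (153, 5, "P", 2), (153, 6, "I", 10),
    (324, 7, "I", 10), (324, 8, "I", 10), (324, 9, "P", 2),
    (441, 10, "P", 12), (441, 11, "I", 10), (441, 12, "I", 10),
    (455, 13, "I", 10), (455, 14, "P", 12), (455, 15, "I", 10),
    (667, 16, "I", 10), (667, 17, "I", 10), (667, 18, "P", 12),
]

if __name__ == "__main__":
    for row in apply_pre_invoices(SAMPLE):
        print(row)
```

The sketch keeps one running credit per customer, which is the role `@CBal` plays in the loop above (with the opposite sign).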

I would do it recursively with common table expressions.
Best regards
Peter
DECLARE @PData TABLE (
    CustNum INT
    ,TransNum INT
    ,InvType NVARCHAR(1)
    ,InvAmt DECIMAL(5,2)
)
INSERT INTO @PData (
    CustNum
    ,TransNum
    ,InvType
    ,InvAmt
)
VALUES
(124, 1,'P',2)
,(124, 2,'I',10)
,(124, 3,'I',10)
,(153, 4,'I',10)
,(153, 5,'P',2)
,(153, 6,'I',10)
,(324, 7,'I',10)
,(324, 8,'I',10)
,(324, 9,'P',2)
,(441,10,'P',12)
,(441,11,'I',10)
,(441,12,'I',10)
,(455,13,'I',10)
,(455,14,'P',12)
,(455,15,'I',10)
,(455,19,'I',10)
,(667,16,'I',10)
,(667,17,'I',10)
,(667,18,'P',12)
;WITH Data as (
    SELECT
        CustNum,
        TransNum,
        InvType,
        InvAmt,
        ROW_NUMBER() OVER (PARTITION BY custNum ORDER BY Transnum ASC) row,
        CASE WHEN InvType = 'P' THEN cast(-1*InvAmt AS DECIMAL(5,2)) ELSE 0 END prepaidAmt
    FROM
        @PData
), modified as(
    SELECT
        CustNum,
        TransNum,
        InvType,
        InvAmt,
        prepaidAmt,
        row,
        InvAmt total
    FROM Data d1
    WHERE row = 1
    UNION ALL
    SELECT
        d2.CustNum,
        d2.TransNum,
        d2.InvType,
        d2.InvAmt,
        CASE
            WHEN
                d2.InvAmt+m.prepaidAmt < 0
            THEN
                CAST (d2.InvAmt+m.prepaidAmt AS DECIMAL(5,2))
            ELSE
                CASE
                    WHEN
                        d2.invtype = 'P'
                    THEN
                        CAST(-1*d2.invamt AS DECIMAL(5,2))
                    ELSE
                        0
                END
        END ,
        d2.row,
        CASE
            WHEN
                d2.InvAmt+m.prepaidAmt <0
            THEN
                0
            ELSE
                CAST( d2.InvAmt+m.prepaidAmt AS DECIMAL(5,2))
        END
    FROM Data d2
    JOIN modified m
        ON
            m.CustNum = d2.CustNum and
            m.row = d2.row-1
)
SELECT
    m.CustNum,
    m.TransNum,
    m.InvType,
    m.InvAmt,
    m.total
FROM modified m
ORDER BY
    custnum,
    transnum

Related

PostgreSQL cumulative sum max condition

I have the following table:
account
size id name
100 1 John
200 2 Mary
300 3 Jane
400 4 Anne
100 5 Mike
600 6 Joanne
I want to partition the rows in groups, where the sum of size <= 600.
Expected result:
account
group size id name
1 100 1 John
1 200 2 Mary
1 300 3 Jane
2 400 4 Anne
2 100 5 Mike
3 600 6 Joanne
I don't know how to do the partition and add the condition.

I cannot think of how to do this without recursion. I left the running_total in the result to make it easier to follow:
with recursive rns as ( -- Assign row numbers as rn in case of gaps in id
select *, row_number() over (order by id) as rn
from account
), sumgrp as ( -- Start with first row
select size, id, name, rn, 1 as grp, size as running_total
from rns
where rn = 1
union all
select n.size, n.id, n.name, n.rn,
case -- Increment the grp when running_total exceeds 600
when p.running_total + n.size > 600 then p.grp + 1
else p.grp
end as grp,
case -- Reset the running_total when it exceeds 600
when p.running_total + n.size > 600 then n.size
else p.running_total + n.size
end as running_total
from sumgrp p
join rns n on n.rn = p.rn + 1
)
select grp, size, id, name, running_total
from sumgrp
order by id;
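The row-by-row rule that the recursive query applies can be sketched in a few lines of Python. This is only an illustration of the logic, assuming rows arrive in id order; the function name `assign_groups` and the `limit` parameter are made up for the example.

```python
def assign_groups(sizes, limit=600):
    """Return a (grp, running_total) pair for each size, in input order."""
    result = []
    grp, running_total = 1, 0
    for size in sizes:
        if result and running_total + size > limit:
            # Adding this row would exceed the limit: start a new group.
            grp += 1
            running_total = size
        else:
            running_total += size
        result.append((grp, running_total))
    return result


if __name__ == "__main__":
    for pair in assign_groups([100, 200, 300, 400, 100, 600]):
        print(pair)
```

Each row needs the running total of the previous row after any reset, which is why a plain windowed SUM is not enough here and the query falls back on recursion.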

T-SQL vlookup with fake calendar table?

I am rather new in T-SQL and I have to create a view, where the output will be as shown below:
[Image: expected output]
But my sales table doesn't have any data about sales in February and May for customer ABC and no data in January for customer XYZ, but I really want to have 0 for these months. How to do it in T-SQL?

This is a great question about a very important topic that even many experienced developers need to brush up on. Since you are rather new to T-SQL, I won't just offer a solution; I'll explain the key concepts involved.
The Auxiliary Table of Numbers
First, let's learn what a tally table (aka numbers table) is all about.
What does this do?
SELECT N = 1 ;
It returns the number 1.
N
-----
1
How about this?
SELECT N = 1 FROM (VALUES(0)) AS e(N);
Same thing:
N
-----
1
What does this return?
SELECT N = 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n);
Here I'm leveraging the VALUES table constructor, which allows a list of values to be treated like a view. This returns:
N
-------
1
1
1
1
1
We don't need the ones, we need the rows. This will make more sense in a moment. Now, what does this do?
WITH e(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n))
SELECT N = 1 FROM e e1;
It returns the same thing, five 1's, but I've wrapped the code into a CTE named e. Think of CTEs as inline views that you can reference multiple times. Now let's CROSS JOIN e to itself. This returns 25 dummy rows (5*5).
WITH e(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n))
SELECT N = 1 FROM e e1, e e2;
Next we leverage ROW_NUMBER() over our set of dummy values.
WITH E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0)) AS e(n))
SELECT N = ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a;
Returns (truncated for brevity):
N
--------------------
1
2
3
...
24
25
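If the cross-join trick feels abstract, the same idea can be sketched in Python: pair every dummy row with every dummy row, then number the pairs. This is only an analogy for the T-SQL above; `tally` and `seed_rows` are names invented for the sketch.

```python
from itertools import product


def tally(seed_rows=5):
    """Number the rows produced by cross joining a dummy row set with itself."""
    seed = [0] * seed_rows          # like (VALUES(0),(0),(0),(0),(0))
    pairs = product(seed, seed)     # like FROM E1, E1 a (a CROSS JOIN)
    # like ROW_NUMBER(): the dummy values are ignored, only the position matters
    return [n for n, _ in enumerate(pairs, start=1)]


if __name__ == "__main__":
    numbers = tally()
    print(len(numbers), numbers[0], numbers[-1])
```

With ten seed rows instead of five, the same self-join yields 100 numbers.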
Using as an auxiliary numbers table
@OneToTen is a table with random numbers 1 to 10. I need to count how many there are, returning 0 when there aren't any. NOTE MY COMMENTS:
;--== 2. Simple Use Case - Counting all numbers, including missing ones (missing = 0)
DECLARE @OneToTen TABLE (N INT);
INSERT @OneToTen VALUES(1),(2),(2),(2),(4),(8),(8),(10),(10),(10);
WITH E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS e(n)),
iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a)
SELECT
    N = i.N,
    Wrong = COUNT(*), -- WRONG!!! Don't do THIS, this counts ALL rows returned
    Correct = COUNT(t.N) -- Correct, this counts numbers from @OneToTen AKA "t.N"
FROM iTally AS i -- Aux Table of numbers
LEFT JOIN @OneToTen AS t -- Table to evaluate
    ON i.N = t.N -- LEFT JOIN @OneToTen numbers to our Aux table of numbers
WHERE i.N <= 10 -- We only need the numbers 1 to 10
GROUP BY i.N; -- Group by with no Sort!!!
This returns:
N Wrong Correct
----- ----------- -----------
1 1 1
2 3 3
3 1 0
4 1 1
5 1 0
6 1 0
7 1 0
8 2 2
9 1 0
10 3 3
Note that I show you the wrong and right way to do this. COUNT(*) is wrong here because it also counts the row that the LEFT JOIN returns for a number with no match; you need COUNT(whatever you are counting).
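To see why the two counts differ, here is a small Python sketch that imitates the LEFT JOIN by hand. It is an illustration only; `count_per_number` and `ONE_TO_TEN` are hypothetical names, and `None` stands in for the NULL that an unmatched number gets.

```python
def count_per_number(values, upper=10):
    """Return (N, Wrong, Correct) per number, mimicking LEFT JOIN + GROUP BY."""
    rows = []
    for n in range(1, upper + 1):
        matches = [v for v in values if v == n]
        # A LEFT JOIN still returns one row (with NULL) when nothing matches.
        joined = matches if matches else [None]
        wrong = len(joined)                                  # like COUNT(*)
        correct = sum(1 for v in joined if v is not None)    # like COUNT(t.N)
        rows.append((n, wrong, correct))
    return rows


ONE_TO_TEN = [1, 2, 2, 2, 4, 8, 8, 10, 10, 10]

if __name__ == "__main__":
    for row in count_per_number(ONE_TO_TEN):
        print(row)
```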
Auxiliary table of Dates (AKA calendar table)
Now we use our numbers table to create a calendar table.
;--== 3. Auxiliary Month/Year Calendar Table
DECLARE @Start DATE = '20191001',
        @End DATE = '20200301';
WITH E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS e(n)),
iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a)
SELECT TOP(DATEDIFF(MONTH,@Start,@End)+1)
    TheDate = f.Dt,
    TheYear = YEAR(f.Dt),
    TheMonth = MONTH(f.Dt),
    TheWeekday = DATEPART(WEEKDAY,f.Dt),
    DayOfTheYear = DATEPART(DAYOFYEAR,f.Dt),
    LastDayOfMonth = EOMONTH(f.Dt)
FROM iTally AS i
CROSS APPLY (VALUES(DATEADD(MONTH, i.N-1, @Start))) AS f(Dt)
This returns:
TheDate TheYear TheMonth TheWeekday DayOfTheYear LastDayOfMonth
---------- ----------- ----------- ----------- ------------ --------------
2019-10-01 2019 10 3 274 2019-10-31
2019-11-01 2019 11 6 305 2019-11-30
2019-12-01 2019 12 1 335 2019-12-31
2020-01-01 2020 1 4 1 2020-01-31
2020-02-01 2020 2 7 32 2020-02-29
2020-03-01 2020 3 1 61 2020-03-31
You will only need the YEAR and MONTH.
The Auxiliary Customer table
Because you are performing aggregations (SUM,COUNT,etc.) against multiple customers we will also need an Auxiliary table of customers, more commonly known as a lookup or dimension.
SAMPLE DATA:
;--== Sample Data
DECLARE @sale TABLE
(
    Customer VARCHAR(10),
    SaleYear INT,
    SaleMonth TINYINT,
    SaleAmt DECIMAL(19,2),
    INDEX idx_cust(Customer)
);
INSERT @sale
VALUES('ABC',2019,12,410),('ABC',2020,1,668),('ABC',2020,1,50), ('ABC',2020,3,250),
('CDF',2019,10,200),('CDF',2019,11,198),('CDF',2020,1,333),('CDF',2020,2,5000),
('CDF',2020,2,325),('CDF',2020,3,1105),('FRED',2018,11,1105);
Distinct list of customers for an "Auxiliary Table of Customers"
SELECT DISTINCT s.Customer FROM @sale AS s;
For my sample data we get:
Customer
----------
ABC
CDF
FRED
Putting it all together
Here I'm going to:
1. Create a numbers table
2. Use my numbers table to create a calendar table
3. Create an auxiliary Customer table from @sale
4. CROSS JOIN (combine) both tables for a "junk dimension"
5. LEFT JOIN our sales data to our calendar/customer auxiliary tables/junk dimension
6. Group by the auxiliary table values
SOLUTION:
;--==== SAMPLE DATA
DECLARE @sale TABLE
(
    Customer VARCHAR(10),
    SaleYear INT,
    SaleMonth TINYINT,
    SaleAmt DECIMAL(19,2),
    INDEX idx_cust(Customer)
);
INSERT @sale
VALUES('ABC',2019,12,410),('ABC',2020,1,668),('ABC',2020,1,50), ('ABC',2020,3,250),
('CDF',2019,10,200),('CDF',2019,11,198),('CDF',2020,1,333),('CDF',2020,2,5000),
('CDF',2020,2,325),('CDF',2020,3,1105),('FRED',2018,11,1105);
;--==== START/END DATEs
DECLARE @Start DATE = '20191001',
        @End DATE = '20200301';
;--==== FINAL SOLUTION
WITH -- 6.1. Auxiliary Table of numbers:
E1(N) AS (SELECT 1 FROM (VALUES(0),(0),(0),(0),(0),(0),(0),(0),(0),(0)) AS e(n)),
iTally(N) AS (SELECT ROW_NUMBER() OVER (ORDER BY(SELECT NULL)) FROM E1, E1 a),
-- 6.2. Use numbers table to create an "Auxiliary Date Table" (Calendar Table):
MonthYear(SaleYear,SaleMonth) AS
(
    SELECT TOP(DATEDIFF(MONTH,@Start,@End)+1) YEAR(f.Dt), MONTH(f.Dt)
    FROM iTally AS i
    CROSS APPLY (VALUES(DATEADD(MONTH, i.N-1, @Start))) AS f(Dt)
)
SELECT
    Customer = cust.Customer,
    MonthYear = CONCAT(cal.SaleYear,'-',cal.SaleMonth),
    Sales = ISNULL(SUM(s.SaleAmt),0)
-- Auxiliary Table of Customers
FROM (SELECT DISTINCT s.Customer FROM @sale AS s) AS cust -- 6.3. Aux Customer Table
CROSS JOIN MonthYear AS cal -- 6.4. Cross join to create Calendar/Customer Junk Dimension
LEFT JOIN @sale AS s -- 6.5. Join @sale to Junk Dimension on Year,Month and Customer
    ON s.SaleYear = cal.SaleYear
    AND s.SaleMonth = cal.SaleMonth
    AND s.Customer = cust.Customer
GROUP BY cust.Customer, cal.SaleYear, cal.SaleMonth -- 6.6. Group by Junk Dim values
ORDER BY cust.Customer, cal.SaleYear, cal.SaleMonth; -- Order by not required
RESULTS:
Customer MonthYear Sales
---------- ------------ ------------
ABC 2019-10 0.00
ABC 2019-11 0.00
ABC 2019-12 410.00
ABC 2020-1 718.00
ABC 2020-2 0.00
ABC 2020-3 250.00
CDF 2019-10 200.00
CDF 2019-11 198.00
CDF 2019-12 0.00
CDF 2020-1 333.00
CDF 2020-2 5325.00
CDF 2020-3 1105.00
FRED 2019-10 0.00
FRED 2019-11 0.00
FRED 2019-12 0.00
FRED 2020-1 0.00
FRED 2020-2 0.00
FRED 2020-3 0.00
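The overall shape of the solution (every customer paired with every month, then sales summed onto that grid) can also be sketched in Python. This is a rough analogy under the same sample data, not a replacement for the T-SQL; `month_range`, `monthly_sales`, and `SALES` are names invented for the sketch.

```python
from collections import defaultdict
from datetime import date


def month_range(start, end):
    """List (year, month) pairs from start's month through end's month."""
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


def monthly_sales(sales, start, end):
    """Return (customer, 'year-month', total) for every customer/month pair."""
    totals = defaultdict(int)
    for customer, year, month, amount in sales:
        totals[(customer, year, month)] += amount
    customers = sorted({s[0] for s in sales})   # the auxiliary customer list
    return [
        # pairing customers with months plays the role of the CROSS JOIN;
        # the default of 0 plays the role of ISNULL(SUM(...), 0)
        (customer, f"{year}-{month}", totals.get((customer, year, month), 0))
        for customer in customers
        for year, month in month_range(start, end)
    ]


SALES = [
    ("ABC", 2019, 12, 410), ("ABC", 2020, 1, 668), ("ABC", 2020, 1, 50),
    ("ABC", 2020, 3, 250), ("CDF", 2019, 10, 200), ("CDF", 2019, 11, 198),
    ("CDF", 2020, 1, 333), ("CDF", 2020, 2, 5000), ("CDF", 2020, 2, 325),
    ("CDF", 2020, 3, 1105), ("FRED", 2018, 11, 1105),
]

if __name__ == "__main__":
    for row in monthly_sales(SALES, date(2019, 10, 1), date(2020, 3, 1)):
        print(row)
```

FRED shows up with zeros for every month because the customer list comes from all sales, while the only FRED sale falls outside the date range.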

Find Accounts with X Number of Transactions within Y Days of Each Other in a Larger Date Range

I am trying to write a SQL statement that will find the accounts that have had 3 or more transactions within 3 days whose absolute value is greater than $10.00 over the course of a week and then return those transactions.
Consider this data...
TransactionID AccountNumber TransactionDate TransactionAmount
------------- ------------- --------------- -----------------
1 0123 2020-09-01 45.75
2 0123 2020-09-02 5.23
3 0123 2020-09-03 9.94
4 0123 2020-09-05 8.35
5 0123 2020-09-06 -16.23
6 0123 2020-09-07 14.71
7 0123 2020-09-08 15.03
8 0123 2020-09-08 23.10
9 0123 2020-09-09 94.20
10 0123 2020-09-09 5.01
11 0123 2020-09-10 3.02
12 0123 2020-09-11 4.37
13 0123 2020-09-12 4.54
14 9876 2020-09-01 -45.75
15 9876 2020-09-02 5.27
16 9876 2020-09-05 19.79
17 9876 2020-09-05 -11.64
18 9876 2020-09-06 12.42
If the week under review is 2020-09-01 to 2020-09-07 I would expect only AccountNumber 9876 to fit the criteria with TransactionIDs 16, 17, and 18 being the 3 transactions within 3 days with an absolute value greater than $10.00.
It seems like I should be able to use window functions (and perhaps framing), but I can't figure out how to start.
I have attempted without the use of window functions based on the answers to this question...
multiple transactions within a certain time period, limited by date range
DECLARE
    @BeginDate DATE
    , @EndDate DATE
    , @ThresholdAmount DECIMAL(10, 2)
    , @ThresholdCount INT
    , @NumberOfDays INT;
SET @BeginDate = '09/01/2020';
SET @EndDate = '09/07/2020';
SET @ThresholdAmount = 10.00;
SET @ThresholdCount = 3;
SET @NumberOfDays = 3;
SELECT t.*
FROM (
    SELECT
        t1.*
        , (
            SELECT COUNT(*)
            FROM Transactions t2
            WHERE t2.AccountNumber = t1.AccountNumber
                AND t2.TransactionID <> t1.TransactionID
                AND t2.TransactionDate >= t1.TransactionDate
                AND t2.TransactionDate < DATEADD(DAY, @NumberOfDays, t1.TransactionDate)
                AND ABS(t2.TransactionAmount) > @ThresholdAmount
        ) AS NumberWithinXDays
    FROM Transactions t1
    WHERE t1.TransactionDate BETWEEN @BeginDate AND @EndDate
        AND ABS(t1.TransactionAmount) > @ThresholdAmount
) t
WHERE t.NumberWithinXDays >= @ThresholdCount;
SELECT *
FROM Transactions t
WHERE EXISTS (
    SELECT *
    FROM (
        SELECT t1.AccountNumber
        FROM Transactions t1
        INNER JOIN Transactions t2 ON t1.AccountNumber = t2.AccountNumber
            AND t1.TransactionID <> t2.TransactionID
            AND DATEDIFF(DAY, t1.TransactionDate, t2.TransactionDate) BETWEEN 0 AND (@NumberOfDays-1)
        WHERE t1.TransactionDate BETWEEN @BeginDate AND @EndDate
            AND t2.TransactionDate BETWEEN @BeginDate AND @EndDate
            AND ABS(t1.TransactionAmount) > @ThresholdAmount
            AND ABS(t2.TransactionAmount) > @ThresholdAmount
        GROUP BY t1.AccountNumber
        HAVING COUNT(t1.TransactionID) >= @ThresholdCount
    ) x
    WHERE x.AccountNumber = t.AccountNumber
)
    AND t.TransactionDate BETWEEN @BeginDate AND @EndDate
    AND ABS(t.TransactionAmount) > @ThresholdAmount
My first query comes back with...
TransactionID AccountNumber TransactionDate TransactionAmount NumberWithinXDays
------------- ------------- --------------- ----------------- -----------------
5 0123 2020-09-06 -16.23 3
6 0123 2020-09-07 14.71 3
Not even close. And the second query returns...
TransactionID AccountNumber TransactionDate TransactionAmount
------------- ------------- --------------- -----------------
14 9876 2020-09-01 -45.75
16 9876 2020-09-05 19.79
17 9876 2020-09-05 -11.64
18 9876 2020-09-06 12.42
Closer, but not restricted to just transaction within 3 days of each other. This is the result I want.
TransactionID AccountNumber TransactionDate TransactionAmount
------------- ------------- --------------- -----------------
16 9876 2020-09-05 19.79
17 9876 2020-09-05 -11.64
18 9876 2020-09-06 12.42
Now it is certainly possible I have not implemented these suggested queries correctly. Or maybe there is some subtle difference I am missing and they just don't fit my situation.
Any suggestions on fixing either of my attempted queries or something completely different with or without window functions?
Here is full dbfiddle of my code.

I was not able to come up with a solution using window functions. As I thought about it more, I thought I might be able to use a CTE, but I could not figure that out either.
I solved it using a couple of subqueries. I was concerned about performance given that my transaction table has 86 million rows. However, it runs in less than 30 seconds, and that is good enough for me.
-- DISTINCT is needed because a particular transaction may fit into more than
-- one transaction window but we only want to see it once in the results
SELECT DISTINCT
    t.TransactionID
    , t.AccountNumber
    , t.TransactionDate
    , t.TransactionAmount
FROM (
    SELECT
        t1.AccountNumber
        , t1.TransactionDateWindowBegin
        , t1.TransactionDateWindowEnd
        , COUNT(DISTINCT t2.TransactionID) AS Count
    FROM (
        -- establish the transaction window for each transaction within the
        -- larger date range and an absolute value above the threshold
        SELECT
            TransactionID
            , AccountNumber
            , TransactionDate AS [TransactionDateWindowBegin]
            , DATEADD(DAY, @NumberOfDays - 1, TransactionDate) AS [TransactionDateWindowEnd]
            , TransactionAmount
        FROM Transactions
        WHERE TransactionDate BETWEEN @BeginDate AND @EndDate
            AND ABS(TransactionAmount) > @ThresholdAmount
    ) t1
    -- join back to the transaction table to find transactions within the transaction window for
    -- each transaction, count them, and only keep those that are above the threshold count
    INNER JOIN Transactions t2 ON t1.AccountNumber = t2.AccountNumber
        AND t1.TransactionDateWindowBegin <= t2.TransactionDate
        AND t1.TransactionDateWindowEnd >= t2.TransactionDate
    WHERE t2.TransactionDate BETWEEN @BeginDate AND @EndDate
        AND ABS(t2.TransactionAmount) > @ThresholdAmount
    GROUP BY t1.AccountNumber
        , t1.TransactionDateWindowBegin
        , t1.TransactionDateWindowEnd
    HAVING COUNT(DISTINCT t2.TransactionID) >= @ThresholdCount
) x
-- join back to the transaction table again to get the details for the
-- transactions that meet the threshold amount and count criteria
INNER JOIN Transactions t ON x.AccountNumber = t.AccountNumber
    AND x.TransactionDateWindowBegin <= t.TransactionDate
    AND x.TransactionDateWindowEnd >= t.TransactionDate
    AND ABS(t.TransactionAmount) > @ThresholdAmount;
Here is the full demo.
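The windowing idea in this answer can be checked against the sample data with a short Python sketch. It is a simplified illustration, not the query itself: `flagged_transactions` and `TRANSACTIONS` are hypothetical names, and unlike the final join above, the sketch only ever returns transactions inside the review period.

```python
from datetime import date, timedelta


def flagged_transactions(transactions, begin, end, threshold_amount=10.00,
                         threshold_count=3, number_of_days=3):
    """Return sorted IDs of transactions that sit in a qualifying window.

    transactions: iterable of (id, account, date, amount).
    """
    # Keep only transactions in the review period and above the amount threshold.
    qualifying = [t for t in transactions
                  if begin <= t[2] <= end and abs(t[3]) > threshold_amount]
    flagged = set()
    for _, account, window_begin, _ in qualifying:
        # Each qualifying transaction opens a window of number_of_days days.
        window_end = window_begin + timedelta(days=number_of_days - 1)
        in_window = [t for t in qualifying
                     if t[1] == account and window_begin <= t[2] <= window_end]
        if len(in_window) >= threshold_count:
            # A set plays the role of SELECT DISTINCT.
            flagged.update(t[0] for t in in_window)
    return sorted(flagged)


def d(day):
    return date(2020, 9, day)


TRANSACTIONS = [
    (1, "0123", d(1), 45.75), (2, "0123", d(2), 5.23), (3, "0123", d(3), 9.94),
    (4, "0123", d(5), 8.35), (5, "0123", d(6), -16.23), (6, "0123", d(7), 14.71),
    (7, "0123", d(8), 15.03), (8, "0123", d(8), 23.10), (9, "0123", d(9), 94.20),
    (10, "0123", d(9), 5.01), (11, "0123", d(10), 3.02), (12, "0123", d(11), 4.37),
    (13, "0123", d(12), 4.54), (14, "9876", d(1), -45.75), (15, "9876", d(2), 5.27),
    (16, "9876", d(5), 19.79), (17, "9876", d(5), -11.64), (18, "9876", d(6), 12.42),
]

if __name__ == "__main__":
    print(flagged_transactions(TRANSACTIONS, d(1), d(7)))
```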

Postgres - Update running count whenever row meets a certain condition

I have a table with the following entries in them
id price quantity
1. 10 75
2. 10 75
3. 10 -150
4. 10 75
5. 10 -75
What I need to do is to update each row with a number that is the number of times the running total has been 0. In the above example, the cumulative totals would be
id. cum_total
1. 750
2. 1500
3. 0
4. 750
5. 0
Desired result
id price quantity seq
1. 10 75 1
2. 10 75 1
3. 10 -150 1
4. 10 75 2
5. 10 -75 2
I'm now lost in a spiral of CTEs and window functions and figured I'd ask the experts.
Thanks in advance :-)

Here is one option using analytic functions:
WITH cte AS (
SELECT *, CASE WHEN SUM(price*quantity) OVER (ORDER BY id) = 0 THEN 1 ELSE 0 END AS price_sum
FROM yourTable
),
cte2 AS (
SELECT *, LAG(price_sum, 1, 0) OVER (ORDER BY id) price_sum_lag
FROM cte
)
SELECT id, price, quantity, 1 + SUM(price_sum_lag) OVER (ORDER BY id) cumulative_total
FROM cte2
ORDER BY id;
You may try running each CTE in succession to see how the logic is working.

With window functions:
SELECT id, price, quantity,
coalesce(
sum(CASE WHEN iszero THEN 1 ELSE 0 END)
OVER (ORDER BY id
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
0
) + 1 AS batch
FROM (SELECT id, price, quantity,
sum(price * quantity) OVER (ORDER BY id) = 0 AS iszero
FROM mytable) AS subq;
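Both answers rely on the same observation: a row's sequence number is 1 plus the number of earlier rows at which the running total was zero. A minimal Python sketch of that rule, assuming rows are in id order (`batch_numbers` is a name made up for the illustration):

```python
def batch_numbers(rows):
    """Return (id, seq) pairs; rows are (id, price, quantity) in id order."""
    seq, running_total, result = 1, 0, []
    for row_id, price, quantity in rows:
        # The current row still belongs to the batch that is open right now.
        result.append((row_id, seq))
        running_total += price * quantity
        if running_total == 0:
            # The total returned to zero, so the next row starts a new batch.
            seq += 1
    return result


if __name__ == "__main__":
    rows = [(1, 10, 75), (2, 10, 75), (3, 10, -150), (4, 10, 75), (5, 10, -75)]
    print(batch_numbers(rows))
```

Counting zeros only among earlier rows is what the LAG in the first answer and the `1 PRECEDING` frame in the second one achieve.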

SQL Query to calculate remaining running balances based on a given conditions

I have a stock transaction table like this:
StockID Item TransDate TranType BatchNo Qty Price
10001 ABC 01-Apr-2012 IN 71001000 200 750.0
10002 ABC 02-Apr-2012 OUT 100
10003 ABC 03-Apr-2012 IN 71001001 50 700.0
10004 ABC 04-Apr-2012 IN 71001002 75 800.0
10005 ABC 10-Apr-2012 OUT 125
10006 XYZ 05-Apr-2012 IN 71001003 150 350.0
10007 XYZ 05-Apr-2012 OUT 120
10008 XYZ 15-Apr-2012 OUT 10
10009 XYZ 20-Apr-2012 IN 71001004 90 340.0
10010 PQR 06-Apr-2012 IN 71001005 50 510.0
10011 PQR 15-Apr-2012 IN 71001006 60 505.0
10012 MNO 01-Apr-2012 IN 71001007 76 410.0
10013 MNO 11-Apr-2012 OUT 76
Each of my IN transactions has price associated to it and a batch number (lot number). Now I would like to calculate the remaining quantity by First In First Out (FIFO) rule, meaning the first in should be adjusted with first out. After adjusting the quantities the remaining balances are to be calculated against each IN transaction for the same item as shown below:
StockID Item TransDate TranType BatchNo Qty Price RemainingQty
10001 ABC 01-Apr-2012 IN 71001000 200 750.0 0
10002 ABC 02-Apr-2012 OUT 100
10003 ABC 03-Apr-2012 IN 71001001 50 700.0 25
10004 ABC 04-Apr-2012 IN 71001002 75 800.0 75
10005 ABC 10-Apr-2012 OUT 125
10006 XYZ 05-Apr-2012 IN 71001003 150 350.0 20
10007 XYZ 05-Apr-2012 OUT 120
10008 XYZ 15-Apr-2012 OUT 10
10009 XYZ 20-Apr-2012 IN 71001004 90 340.0 90
10010 PQR 06-Apr-2012 IN 71001005 50 510.0 50
10011 PQR 15-Apr-2012 IN 71001006 60 505.0 60
10012 MNO 01-Apr-2012 IN 71001007 76 410.0 0
10013 MNO 11-Apr-2012 OUT 76
As we can see from the above table for item ABC, after adjusting the OUT qty (125 + 100) against the IN qty (200 + 50 + 75) using FIFO, the quantity remaining for batch 71001000 is 0, for batch 71001001 is 25, and for batch 71001002 is 75. From the remaining quantity the value can be derived.
Please help me to achieve this using any of the methods (either cursor based or CTE or JOINS, etc)
Thanks in advance for the help.
One of the users of Stack Overflow suggested this answer:
SELECT 10001 as stockid,'ABC' as item,'01-Apr-2012' as transdate,'IN' as trantype, 71001000 as batchno, 200 as qty, 750.0 as price INTO #sample
UNION ALL SELECT 10002 ,'ABC','02-Apr-2012','OUT', NULL ,100,NULL
UNION ALL SELECT 10003 ,'ABC','03-Apr-2012','IN', 71001001, 50 , 700.0
UNION ALL SELECT 10004 ,'ABC','04-Apr-2012','IN', 71001002, 75 , 800.0
UNION ALL SELECT 10005 ,'ABC','10-Apr-2012','OUT', NULL ,125,NULL
UNION ALL SELECT 10006 ,'XYZ','05-Apr-2012','IN', 71001003, 150 , 350.0
UNION ALL SELECT 10007 ,'XYZ','05-Apr-2012','OUT', NULL , 120 ,NULL
UNION ALL SELECT 10008 ,'XYZ','15-Apr-2012','OUT', NULL , 10 ,NULL
UNION ALL SELECT 10009 ,'XYZ','20-Apr-2012','IN', 71001004, 90 , 340.0
UNION ALL SELECT 10010 ,'PQR','06-Apr-2012','IN', 71001005, 50 , 510.0
UNION ALL SELECT 10011 ,'PQR','15-Apr-2012','IN', 71001006, 60 , 505.0
UNION ALL SELECT 10012 ,'MNO','01-Apr-2012','IN', 71001007, 76 , 410.0
UNION ALL SELECT 10013 ,'MNO','11-Apr-2012','OUT', NULL ,76 ,NULL
;WITH remaining AS
(
SELECT *,
CASE
WHEN trantype = 'IN' THEN 1
ELSE -1
END * qty AS stock_shift,
ROW_NUMBER() OVER(PARTITION BY item ORDER BY transdate) AS row,
CASE
WHEN trantype = 'OUT' THEN NULL
ELSE ROW_NUMBER()OVER(PARTITION BY item, CASE WHEN trantype = 'IN' THEN 0 ELSE 1 END ORDER BY transdate)
END AS in_row,
SUM(CASE WHEN trantype = 'OUT' THEN qty END) OVER(PARTITION BY item) AS total_out
FROM #sample
)
,remaining2 AS
(
SELECT r1.item,
r1.stockid,
MAX(r1.transdate) AS transdate,
MAX(r1.trantype) AS trantype,
MAX(r1.batchno) AS batchno,
MAX(r1.qty) AS qty,
MAX(r1.price) AS price,
MAX(r1.total_out) AS total_out,
MAX(r1.in_row) AS in_row,
CASE
WHEN MAX(r1.trantype) = 'OUT' THEN NULL
WHEN SUM(CASE WHEN r1.trantype = 'IN' THEN r2.qty ELSE 0 END) - MAX(r1.total_out) < 0 THEN SUM(CASE WHEN r1.trantype = 'IN' THEN r2.qty ELSE 0 END)
- MAX(r1.total_out)
ELSE 0
END AS running_in
FROM remaining r1
LEFT OUTER JOIN remaining r2
ON r2.row <= r1.row
AND r2.item = r1.item
GROUP BY
r1.item,
r1.stockid
)
SELECT r2.item,
r2.stockid,
MAX(r2.transdate) AS transdate,
MAX(r2.trantype) AS trantype,
MAX(r2.batchno) AS batchno,
MAX(r2.qty) AS qty,
MAX(r2.price) AS price,
MAX(CASE WHEN r2.trantype = 'OUT' THEN NULL ELSE ISNULL(r2.qty + r3.running_in, 0) END) AS remaining_stock
FROM remaining2 r2
LEFT OUTER JOIN remaining2 r3
ON r2.in_row - 1 = r3.in_row
AND r2.item = r3.item
GROUP BY
r2.item,
r2.stockid
This SQL has a problem, and the result is attached here. The records whose values do not match are indicated in yellow. Kindly help to solve the problem.

I think this should do the trick?
SELECT 10001 as stockid,'ABC' as item,'01-Apr-2012' as transdate,'IN' as trantype, 71001000 as batchno, 200 as qty, 750.0 as price INTO #sample
UNION ALL SELECT 10002 ,'ABC','02-Apr-2012','OUT', NULL ,100,NULL
UNION ALL SELECT 10003 ,'ABC','03-Apr-2012','IN', 71001001, 50 , 700.0
UNION ALL SELECT 10004 ,'ABC','04-Apr-2012','IN', 71001002, 75 , 800.0
UNION ALL SELECT 10005 ,'ABC','10-Apr-2012','OUT', NULL ,125,NULL
UNION ALL SELECT 10006 ,'XYZ','05-Apr-2012','IN', 71001003, 150 , 350.0
UNION ALL SELECT 10007 ,'XYZ','05-Apr-2012','OUT', NULL , 120 ,NULL
UNION ALL SELECT 10008 ,'XYZ','15-Apr-2012','OUT', NULL , 10 ,NULL
UNION ALL SELECT 10009 ,'XYZ','20-Apr-2012','IN', 71001004, 90 , 340.0
UNION ALL SELECT 10010 ,'PQR','06-Apr-2012','IN', 71001005, 50 , 510.0
UNION ALL SELECT 10011 ,'PQR','15-Apr-2012','IN', 71001006, 60 , 505.0
UNION ALL SELECT 10012 ,'MNO','01-Apr-2012','IN', 71001007, 76 , 410.0
UNION ALL SELECT 10013,'MNO','11-Apr-2012','OUT', NULL ,76 ,NULL
;with remaining_stock as
(
SELECT *
,CASE WHEN trantype = 'IN' THEN 1 ELSE -1 END * qty AS stock_shift
,row_number() OVER (PARTITION BY item ORDER BY transdate) as row
,CASE WHEN trantype = 'OUT' THEN NULL ELSE
row_number()OVER (PARTITION BY item,CASE WHEN trantype = 'IN' THEN 0 ELSE 1 END ORDER BY transdate) END as in_row
,CASE WHEN trantype = 'IN' THEN NULL ELSE
row_number()OVER (PARTITION BY item,CASE WHEN trantype = 'OUT' THEN 0 ELSE 1 END ORDER BY transdate) END as out_row
,ISNULL(SUM(CASE WHEN trantype = 'OUT' THEN qty END) OVER (PARTITION BY item),0) AS total_out
,ISNULL(SUM(CASE WHEN trantype = 'IN' THEN qty END) OVER (PARTITION BY item),0) AS total_in
FROM #sample
)
,remaining_stock2 AS
(
SELECT
r1.item
,r1.stockid
,MAX(r1.transdate) as transdate
,MAX(r1.trantype) as trantype
,MAX(r1.batchno) as batchno
,MAX(r1.qty) as qty
,MAX(r1.price) as price
,MAX(r1.total_in) as total_in
,MAX(r1.total_out) as total_out
,SUM(r2.qty) as running_in
FROM remaining_stock r1
LEFT OUTER JOIN remaining_stock r2 on r2.in_row <= r1.in_row
AND r2.item = r1.item
GROUP BY
r1.item
,r1.stockid
)
SELECT
item
,stockid
,transdate
,trantype
,batchno
,qty
,price
,CASE WHEN trantype = 'OUT' THEN NULL
WHEN total_out >= running_in THEN 0
WHEN (running_in - total_out) < qty THEN (running_in - total_out)
WHEN (running_in - total_out) >= qty THEN qty
END as remaining_stocks
FROM remaining_stock2
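The final CASE expression boils down to "apply the item's total OUT quantity to its IN batches, oldest first". Here is a small Python sketch of that reading, checked against the expected RemainingQty values from the question. Like the query, it assumes every OUT of an item counts against the IN batches in date order regardless of when the OUT happened; `fifo_remaining` and `STOCK` are names invented for the sketch.

```python
def fifo_remaining(transactions):
    """Return {stockid: remaining qty} for IN rows.

    transactions: (stockid, item, trantype, qty), in date order per item.
    """
    # Total OUT quantity per item that still has to be matched to IN batches.
    unapplied_out = {}
    for _, item, trantype, qty in transactions:
        if trantype == "OUT":
            unapplied_out[item] = unapplied_out.get(item, 0) + qty
    remaining = {}
    for stockid, item, trantype, qty in transactions:
        if trantype != "IN":
            continue
        # The oldest batch absorbs as much of the outstanding OUT qty as it can.
        consumed = min(qty, unapplied_out.get(item, 0))
        unapplied_out[item] = unapplied_out.get(item, 0) - consumed
        remaining[stockid] = qty - consumed
    return remaining


STOCK = [
    (10001, "ABC", "IN", 200), (10002, "ABC", "OUT", 100),
    (10003, "ABC", "IN", 50), (10004, "ABC", "IN", 75),
    (10005, "ABC", "OUT", 125), (10006, "XYZ", "IN", 150),
    (10007, "XYZ", "OUT", 120), (10008, "XYZ", "OUT", 10),
    (10009, "XYZ", "IN", 90), (10010, "PQR", "IN", 50),
    (10011, "PQR", "IN", 60), (10012, "MNO", "IN", 76),
    (10013, "MNO", "OUT", 76),
]

if __name__ == "__main__":
    for stockid, qty in fifo_remaining(STOCK).items():
        print(stockid, qty)
```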

Your question isn't very clear to me on how the FIFO logic is to be applied. I'm going to assume that you want to associate each IN record with the next OUT record, if one exists. To achieve this you need to join the table to itself, like the following:
select
t1.BatchNo,
isnull(t1.Qty,0) as 'IN Qty',
isnull(t2.Qty,0) as 'OUT Qty',
isnull(t1.Qty,0) - isnull(t2.Qty,0) as 'Remaining Qty'
from
tbl_test t1
left join tbl_test t2
on t2.StockID = (t1.StockID + 1)
and t2.TranType = 'OUT'
where
t1.TranType = 'IN'
The results will show you the following for the first 5 records for ABC from your question.
BatchNo | IN Qty | OUT Qty | Remaining Qty
71001000 | 200 | 100 | 100
71001001 | 50 | 0 | 50
71001002 | 75 | 125 | -50
The left join works on the assumption that the StockID for each IN record is always one less than that of the associated OUT record. I personally think your data model needs improving:
- OUT records should have a BatchNo assigned or a reference to the StockID of the associated IN record
- add a timestamp field for sequential ordering
- add a DateTime field for handling IN/OUT occurring on the same day
