T-SQL Counting Distinct Rows to Include Zeroes as Unique

I have the following test script:
CREATE TABLE #Test (number INT)
INSERT INTO #Test VALUES (6)
INSERT INTO #Test VALUES (6)
INSERT INTO #Test VALUES (6)
INSERT INTO #Test VALUES (2)
INSERT INTO #Test VALUES (2)
INSERT INTO #Test VALUES (0)
INSERT INTO #Test VALUES (0)
INSERT INTO #Test VALUES (0)
INSERT INTO #Test VALUES (0)
INSERT INTO #Test VALUES (0)
SELECT * FROM #Test
SELECT count(*) FROM #Test GROUP BY number
Results
number
6
6
6
2
2
0
0
0
0
0
(No column name)
5
2
3
I'm trying to get a count of 7, i.e. the 6's and the 2's each counted once (distinct), and every zero counted separately (unique).

The simplest way I came up with is this:
SELECT COUNT(DISTINCT NULLIF(Number, 0)) + SUM(CASE WHEN Number = 0 THEN 1 ELSE 0 END)
FROM #Test
The NULLIF makes the COUNT ignore numbers that are equal to 0, the DISTINCT is responsible for counting each number only once, and the SUM with the CASE counts the rows equal to 0. The ELSE 0 keeps the sum from being NULL when the table has rows but no zeros.
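The same idea can be tried outside SQL Server. Below is a minimal sketch, assuming Python's bundled sqlite3 module as a stand-in engine; the table name `Test` and the helper `count_with_unique_zeros` are made up for illustration, and the `COALESCE` is an addition so that an empty table yields 0 rather than NULL.

```python
import sqlite3


def count_with_unique_zeros(numbers):
    """Count each distinct non-zero value once and every zero row separately."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Test (number INTEGER)")
    conn.executemany("INSERT INTO Test VALUES (?)", [(n,) for n in numbers])
    (total,) = conn.execute(
        """
        SELECT COUNT(DISTINCT NULLIF(number, 0))   -- zeros become NULL and are skipped
             + COALESCE(SUM(CASE WHEN number = 0 THEN 1 ELSE 0 END), 0)  -- one per zero row
        FROM Test
        """
    ).fetchone()
    conn.close()
    return total


print(count_with_unique_zeros([6, 6, 6, 2, 2, 0, 0, 0, 0, 0]))  # prints 7
```

The expression itself (NULLIF, COUNT(DISTINCT ...), SUM(CASE ...)) is standard SQL, so it should carry over to T-SQL unchanged apart from the table name.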

Not exactly sure why you would want this but you could do both queries separately and perform a UNION ALL to combine the results.
Test Data
CREATE TABLE #TestData (Number int)
INSERT INTO #TestData (Number)
VALUES
(6), (6), (6), (2), (2), (0), (0), (0), (0), (0)
Query
SELECT DISTINCT
Number
FROM #TestData
WHERE Number <> 0
UNION ALL
SELECT
Number
FROM #TestData
WHERE Number = 0
Results
Number
2
6
0
0
0
0
0
If you want to return the number 7, then just wrap this in an outer query like this:
SELECT
COUNT(1) FinalCount
FROM
(
SELECT DISTINCT
Number
FROM #TestData
WHERE Number <> 0
UNION ALL
SELECT
Number
FROM #TestData
WHERE Number = 0
) a
Result
FinalCount
7

Try this:
CREATE TABLE #Test (number INT)
INSERT INTO #Test VALUES (6), (6), (6), (2), (2), (0), (0), (0), (0) ,(0)
SELECT COUNT(DISTINCT NewNumber)
FROM (SELECT number,
(CASE WHEN number = 0 THEN ROW_NUMBER() OVER(ORDER BY number) * RAND()
ELSE number END) AS NewNumber
FROM #Test) AS T
Output result:
7
If you add GROUP BY [number], you will get:
number cnt
0 5
2 1
6 1

Related

Label groups in SQL records based on consecutive matching records

Given the below dataset
CREATE TABLE #temp
( A NUMERIC,
B NUMERIC )
INSERT INTO #temp VALUES (243184, 0);
INSERT INTO #temp VALUES (240719, 0);
INSERT INTO #temp VALUES (236482, 1);
INSERT INTO #temp VALUES (230777, 0);
INSERT INTO #temp VALUES (226023, 0);
INSERT INTO #temp VALUES (222522, 0);
INSERT INTO #temp VALUES (214977, 1);
SELECT *
FROM #temp
ORDER BY A DESC
A B
--------------------------------------- ---------------------------------------
243184 0
240719 0
236482 1
230777 0
226023 0
222522 0
214977 1
How can I obtain the following output?
A B C
--------------------------------------- --------------------------------------- ----
243184 0 1
240719 0 1
236482 1 2
230777 0 3
226023 0 3
222522 0 3
214977 1 4
I want to put consecutive rows with the same [B] value into groups, with the rows sorted by 'A' in descending order,
so that each time [B] changes value (from 0 to 1 or from 1 to 0) a new group starts.
Any ideas?
Please try this:
SELECT s.A, s.B
,SUM(s.IsChange) OVER (ORDER BY s.A DESC ROWS UNBOUNDED PRECEDING) +1 AS [C]
FROM (
SELECT t.A,t.B
,CASE WHEN t.B <> LEAD(t.B)OVER(ORDER BY t.A) THEN 1 ELSE 0 END AS [IsChange]
FROM #temp t
) s
ORDER BY s.A DESC
;
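The query works in two steps: flag every row whose B differs from the row before it (in descending A order), then take a running sum of those flags. Here is a small sketch of the same technique, assuming Python's sqlite3 module with SQLite 3.25 or later for window functions; the table name `temp_ab` and the helper `label_groups` are hypothetical, and it uses LAG over the descending order, which is equivalent to LEAD over the ascending order used above.

```python
import sqlite3


def label_groups(rows):
    """Return (A, B, C) rows where C numbers each run of equal B values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE temp_ab (A INTEGER, B INTEGER)")
    conn.executemany("INSERT INTO temp_ab VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT s.A, s.B,
               -- running total of change flags, plus 1 so groups start at 1
               SUM(s.IsChange) OVER (ORDER BY s.A DESC ROWS UNBOUNDED PRECEDING) + 1 AS C
        FROM (
            SELECT t.A, t.B,
                   -- 1 when B differs from the previous row; the first row has no
                   -- previous row, so the comparison is NULL and falls to ELSE 0
                   CASE WHEN t.B <> LAG(t.B) OVER (ORDER BY t.A DESC)
                        THEN 1 ELSE 0 END AS IsChange
            FROM temp_ab t
        ) s
        ORDER BY s.A DESC
        """
    ).fetchall()
    conn.close()
    return result


data = [(243184, 0), (240719, 0), (236482, 1), (230777, 0),
        (226023, 0), (222522, 0), (214977, 1)]
for row in label_groups(data):
    print(row)
```

The sketch assumes the values of A are unique, since ties in the ORDER BY would make the row order, and therefore the group numbers, ambiguous.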

Postgresql Query Results in Division by 0 After Use of Case to Check for 0

The following query uses a subquery so that a weighted value can be calculated. The problem is a division by zero error that occurs seemingly at random, both for aggregates that are truly 0 and possibly for aggregates greater than 0 returned by the subquery.
SELECT
table1.id,
SUM(subquery1.total_value_1),
CASE
WHEN SUM(subquery1.total_value_1) = 0 THEN 0
ELSE ROUND(SUM(percentage_value * (table1.value_1 /subquery1.total_value_1 ::FLOAT)) ::NUMERIC,2)
END AS percentage_value
FROM
table1,
(SELECT
id,
SUM(value_1) AS total_value_1
FROM
table1
WHERE
report_time BETWEEN '2016-10-28 00:00' AND '2016-10-29 23:59'
GROUP BY
id
) subquery1
WHERE
table1.id = subquery1.id
AND report_time BETWEEN '2016-10-28 00:00' AND '2016-10-29 23:59'
AND table1.id = 12572
GROUP BY
table1.id
ORDER BY
table1.id
In some instances, the CASE expression still evaluates the division even though subquery1.total_value_1 is 0. Note that subquery1.total_value_1 cannot be NULL, because the table defaults this value to 0 on insert if no value is supplied.
In the example below, sum(column) is 1 for both rows, while the column itself is zero in one row and one in the other:
a=# with v as (
select generate_series(0,1,1) al
)
select sum(v.al) over(),v.al
from v;
sum | al
-----+----
1 | 0
1 | 1
(2 rows)
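So in your query SUM(subquery1.total_value_1) can be non-zero while an individual subquery1.total_value_1 is zero. The CASE only tests the aggregate, but the division inside the other SUM is evaluated per row, and that is how you get the division by zero. Guarding the divisor itself, for example with NULLIF(subquery1.total_value_1, 0), avoids the error because dividing by NULL yields NULL, which SUM ignores.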

Check for equal amounts of negative numbers as positive numbers

I have a table with two columns: intGroupID, decAmount
I want to have a query that can basically return the intGroupID as a result if for every positive(+) decAmount, there is an equal and opposite negative(-) decAmount.
So a table of (id=1,amount=1.0),(1,2.0),(1,-1.0),(1,-2.0) would return back the intGroupID of 1, because for each positive number there exists a negative number to match.
What I know so far is that there must be an equal number of decAmounts (so I enforce a count(*) % 2 = 0) and the sum of all rows must = 0.0. However, some cases that get by that logic are:
ID | Amount
1 | 1.0
1 | -1.0
1 | 2.0
1 | -2.0
1 | 3.0
1 | 2.0
1 | -4.0
1 | -1.0
This has a sum of 0.0 and has an even number of rows, but there is not a 1-for-1 relationship of positives to negatives. I need a query that can basically tell me if there is a negative amount for each positive amount, without reusing any of the rows.
I tried counting the distinct absolute values of the numbers and enforcing that it is less than the count of all rows, but it's not catching everything.
The code I have so far:
CREATE TABLE #tblTest (
intGroupID INT
,decAmount DECIMAL(19,2)
);
INSERT INTO #tblTest (intGroupID ,decAmount)
VALUES (1,-1.0),(1,1.0),(1,2.0),(1,-2.0),(1,3.0),(1,2.0),(1,-4.0),(1,-1.0);
DECLARE @intABSCount INT = 0
,@intFullCount INT = 0;
SELECT @intFullCount = COUNT(*) FROM #tblTest;
SELECT @intABSCount = COUNT(*) FROM (
SELECT DISTINCT ABS(decAmount) AS absCount FROM #tblTest GROUP BY ABS(decAmount)
) AS absCount
SELECT t1.intGroupID
FROM #tblTest AS t1
/* Make Sure Even Number Of Rows */
INNER JOIN
(SELECT COUNT(*) AS intCount FROM #tblTest
)
AS t2 ON t2.intCount % 2 = 0
/* Make Sure Sum = 0.0 */
INNER JOIN
(SELECT SUM(decAmount) AS decSum FROM #tblTest)
AS t3 ON decSum = 0.0
/* Make Sure Count of Absolute Values < Count of Values */
WHERE
@intABSCount < @intFullCount
GROUP BY t1.intGroupID
I think there is probably a better way to check this table, possibly by finding pairs and removing them from the table and seeing if there's anything left in the table once there are no more positive/negative matches, but I'd rather not have to use recursion/cursors.
Create TABLE #tblTest (
intA INT
,decA DECIMAL(19,2)
);
INSERT INTO #tblTest (intA,decA)
VALUES (1,-1.0),(1,1.0),(1,2.0),(1,-2.0),(1,3.0),(1,2.0),(1,-4.0),(1,-1.0), (5,-5.0),(5,5.0) ;
SELECT * FROM #tblTest;
SELECT
intA
, MIN(Result) as IsBalanced
FROM
(
SELECT intA, X,Result =
CASE
WHEN count(*)%2 = 0 THEN 1
ELSE 0
END
FROM
(
---- Start thinking here --- inside-out
SELECT
intA
, x =
CASE
WHEN decA < 0 THEN
-1 * decA
ELSE
decA
END
FROM #tblTest
) t1
Group by intA, X
)t2
GROUP BY intA
Not tested, but I think you can get the idea.
This returns the ids that do not conform.
The negative case is easier to test and debug.
select pos.*, neg.*
from
( select id, amount, count(*) as ccount
from tbl
where amount > 0
group by id, amount ) pos
full outer join
( select id, amount, count(*) as ccount
from tbl
where amount < 0
group by id, amount ) neg
on pos.id = neg.id
and pos.amount = -neg.amount
and pos.ccount = neg.ccount
where pos.id is null
or neg.id is null
I think this will return a list of the ids that do conform:
select distinct(id) from tbl
except
select distinct(isnull(pos.id, neg.id))
from
( select id, amount, count(*) as ccount
from tbl
where amount > 0
group by id, amount ) pos
full outer join
( select id, amount, count(*) as ccount
from tbl
where amount < 0
group by id, amount ) neg
on pos.id = neg.id
and pos.amount = -neg.amount
and pos.ccount = neg.ccount
where pos.id is null
or neg.id is null
Boy, I found a simpler way to do this than my previous answers. I hope all my crazy edits are saved for posterity.
This works by grouping all numbers for an id by their absolute value (1, -1 grouped by 1).
The sum of the group determines if there are an equal number of pairs. If it is 0 then it is equal, any other value for the sum means there is an imbalance.
The detection of evenness by the COUNT aggregate is only necessary to detect an even number of zeros. I assumed that 0's could exist and they should occur an even number of times. Remove it if this isn't a concern, as 0 will always pass the first test.
I rewrote the query a bunch of different ways to get the best execution plan. The final result below only has one big heap sort which was unavoidable given the lack of an index.
Query
WITH tt AS (
SELECT intGroupID,
-- any non-zero sum is an imbalance (a surplus of negatives as well as of positives)
CASE WHEN SUM(decAmount) <> 0 OR COUNT(*) % 2 = 1 THEN 1 ELSE 0 END unequal
FROM #tblTest
GROUP BY intGroupID, ABS(decAmount)
)
SELECT tt.intGroupID,
CASE WHEN SUM(unequal) != 0 THEN 'not equal' ELSE 'equal' END [pair]
FROM tt
GROUP BY intGroupID;
Tested Values
(1,-1.0),(1,1.0),(1,2),(1,-2), -- should work
(2,-1.0),(2,1.0),(2,2),(2,2), -- fail, two positive twos
(3,1.0),(3,1.0),(3,-1.0), -- fail two 1's , one -1
(4,1),(4,2),(4,-.5),(4,-2.5), -- fail: adds up the same sum, but different values
(5,1),(5,-1),(5,0),(5,0), -- work, test zeros
(6,1),(6,-1),(6,0), -- fail, test zeros
(7,1),(7,-1),(7,-1),(7,1),(7,1) -- fail, 3 x 1
Results
A pairs
_ _____
1 equal
2 not equal
3 not equal
4 not equal
5 equal
6 not equal
7 not equal
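To check the grouping logic against the tested values, here is a small runnable sketch, assuming Python's sqlite3 module in place of SQL Server. The table name `tblTest` without the `#`, the REAL column type, and the helper `pair_status` are stand-ins for illustration. The sketch tests the group sum with `<> 0`, so a surplus of negative amounts is flagged as well as a surplus of positive ones.

```python
import sqlite3


def pair_status(rows):
    """Map each intGroupID to 'equal' or 'not equal'."""
    conn = sqlite3.connect(":memory:")
    # REAL is an assumption; the thread uses DECIMAL(19,2), which is exact
    conn.execute("CREATE TABLE tblTest (intGroupID INTEGER, decAmount REAL)")
    conn.executemany("INSERT INTO tblTest VALUES (?, ?)", rows)
    result = conn.execute(
        """
        WITH tt AS (
            SELECT intGroupID,
                   -- within one absolute value, the amounts must cancel out;
                   -- the odd-count test only matters for rows equal to 0
                   CASE WHEN SUM(decAmount) <> 0 OR COUNT(*) % 2 = 1
                        THEN 1 ELSE 0 END AS unequal
            FROM tblTest
            GROUP BY intGroupID, ABS(decAmount)
        )
        SELECT intGroupID,
               CASE WHEN SUM(unequal) <> 0 THEN 'not equal' ELSE 'equal' END AS pair
        FROM tt
        GROUP BY intGroupID
        ORDER BY intGroupID
        """
    ).fetchall()
    conn.close()
    return dict(result)


tested_values = [
    (1, -1.0), (1, 1.0), (1, 2), (1, -2),
    (2, -1.0), (2, 1.0), (2, 2), (2, 2),
    (3, 1.0), (3, 1.0), (3, -1.0),
    (4, 1), (4, 2), (4, -.5), (4, -2.5),
    (5, 1), (5, -1), (5, 0), (5, 0),
    (6, 1), (6, -1), (6, 0),
    (7, 1), (7, -1), (7, -1), (7, 1), (7, 1),
]
print(pair_status(tested_values))
```

Comparing a floating-point sum with 0 is safe for these sample amounts, but with arbitrary fractional values an exact decimal type is the safer choice.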
The following should return the unbalanced groups:
;with pos as (
select intGroupID, ABS(decAmount) m
from TableName
where decAmount > 0
), neg as (
select intGroupID, ABS(decAmount) m
from TableName
where decAmount < 0
)
select distinct IsNull(p.intGroupID, n.intGroupID) as intGroupID
from pos p
full join neg n on n.intGroupID = p.intGroupID and abs(n.m - p.m) < 1e-8
where p.m is NULL or n.m is NULL
To get the unpaired elements, the select statement can be changed to the following:
select IsNull(p.intGroupID, n.intGroupID) as intGroupID, IsNull(p.m, -n.m) as decAmount
from pos p
full join neg n on n.intGroupID = p.intGroupID and abs(n.m - p.m) < 1e-8
where p.m is NULL or n.m is NULL
Does this help?
-- Expected result - group 1 and 3
create table #matches (groupid int, value decimal(5,2))
insert into #matches select 1, 1.0
insert into #matches select 1, -1.0
insert into #matches select 2, 2.0
insert into #matches select 2, -2.0
insert into #matches select 2, -2.0
insert into #matches select 3, 3.0
insert into #matches select 3, 3.5
insert into #matches select 3, -3.0
insert into #matches select 3, -3.5
insert into #matches select 4, 4.0
insert into #matches select 4, 4.0
insert into #matches select 4, -4.0
-- Get groups where we have matching positive/negatives, with the same number of each
select mat.groupid, min(case when pos.PositiveCount = neg.NegativeCount then 1 else 0 end) as 'Match'
from #matches mat
LEFT JOIN (select groupid, SUM(1) as 'PositiveCount', Value
from #matches where value > 0 group by groupid, value) pos
on pos.groupid = mat.groupid and pos.value = ABS(mat.value)
LEFT JOIN (select groupid, SUM(1) as 'NegativeCount', Value
from #matches where value < 0 group by groupid, value) neg
on neg.groupid = mat.groupid and neg.value = case when mat.value < 0 then mat.value else mat.value * -1 end
group by mat.groupid
-- If at least one pair within a group don't match, reject
having min(case when pos.PositiveCount = neg.NegativeCount then 1 else 0 end) = 1
You can compare your values this way:
create table #t (id int, amount decimal(4,1))
insert #t values(1,1.0),(1,-1.0),(1,2.0),(1,-2.0),(1,3.0),(1,2.0),(1,-4.0),(1,-1.0),(2,-1.0),(2,1.0)
;with a as
(
select count(*) cnt, id, amount
from #t
group by id, amount
)
select id from #t
except
select b.id from a
full join a b
on a.id = b.id and a.cnt = b.cnt and a.amount = -b.amount
where a.id is null
For some reason I can't write comments. However, Daniel's comment is not correct: my solution does accept (6,1),(6,-1),(6,0), which can be correct. Zero is not specified in the question, and since it is a 0 value it can be handled either way. My answer does NOT accept (3,1.0),(3,1.0),(3,-1.0).
To Blam: No, I am not missing
or b.id is null
My solution is like yours, but not identical.

Summing From Consecutive Rows

Assume we have a table and we want to do a sum of the Expend column so that the summation only adds up values of the same Week_Name.
SN Week_Name Exp Sum
-- --------- --- ---
1 Week 1 10 0
2 Week 1 20 0
3 Week 1 30 60
4 Week 2 40 0
5 Week 2 50 90
6 Week 3 10 0
I assume we will need to ORDER BY Week_Name, then compare the Week_Name of the previous row with the Week_Name of the current row.
If both are the same, put zero in the SUM column.
If not the same, add all expenditure, where Week_Name = Week_Name(Previous row) and place in the Sum column. The final output should look like the table above.
Any help on how to achieve this in T-SQL is highly appreciated.
Okay, I was eventually able to resolve this issue, praise Jesus! If you want the exact table I gave above, you can use GilM's response below; it is perfect. If you want your table to have running cumulative totals, i.e. row 3 should have 60, row 5 should have 150, row 6 should have 160 and so on, then you can use my code below:
USE CAPdb
IF OBJECT_ID ('dbo.[tablebp]') IS NOT NULL
DROP TABLE [tablebp]
GO
CREATE TABLE [tablebp] (
tablebpcCol1 int PRIMARY KEY
,tabledatekey datetime
,tableweekname varchar(50)
,expenditure1 numeric
,expenditure_Cummulative numeric
)
INSERT INTO [tablebp](tablebpcCol1,tabledatekey,tableweekname,expenditure1,expenditure_Cummulative)
SELECT b.s_tablekey,d.PK_Date,d.Week_Name,
SUM(b.s_expenditure1) AS s_expenditure1,
SUM(b.s_expenditure1) + COALESCE((SELECT SUM(s_expenditure1)
FROM source_table bs JOIN dbo.Time dd ON bs.[DATE Key] = dd.[PK_Date]
WHERE dd.PK_Date < d.PK_Date),0)
FROM source_table b
INNER JOIN dbo.Time d ON b.[Date key] = d.PK_Date
GROUP BY d.[PK_Date],d.Week_Name,b.s_tablekey,b.s_expenditure1
ORDER BY d.[PK_Date]
;WITH CTE AS (
SELECT tableweekname
,Max(expenditure_Cummulative) AS Week_expenditure_Cummulative
,MAX(tablebpcCol1) AS MaxSN
FROM [tablebp]
GROUP BY tableweekname
)
SELECT [tablebp].*
,CASE WHEN [tablebp].tablebpcCol1 = CTE.MaxSN THEN Week_expenditure_Cummulative
ELSE 0 END AS [RunWeeklySum]
FROM [tablebp]
JOIN CTE on CTE.tableweekname = [tablebp].tableweekname
I'm not sure why your SN=6 line is 0 rather than 10. Do you really not want the sum for the last Week? If having the last week total is okay, then you might want something like:
;WITH CTE AS (
SELECT Week_Name,SUM([Expend.]) as SumExpend
,MAX(SN) AS MaxSN
FROM T
GROUP BY Week_Name
)
SELECT T.*,CASE WHEN T.SN = CTE.MaxSN THEN SumExpend
ELSE 0 END AS [Sum]
FROM T
JOIN CTE on CTE.Week_Name = T.Week_Name
Based on the request in the comment for a running total in SUM, you could try this:
;WITH CTE AS (
SELECT Week_Name, MAX(SN) AS MaxSN
FROM T
GROUP BY Week_Name
)
SELECT T.SN, T.Week_Name, T.[Exp],
CASE WHEN T.SN = CTE.MaxSN THEN
(SELECT SUM(T2.[Exp]) FROM T T2
WHERE T2.SN <= T.SN) ELSE 0 END AS [SUM]
FROM T
JOIN CTE ON CTE.Week_Name = T.Week_Name
ORDER BY SN
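Both variants (the per-week total and the running total, each shown only on the last row of a week) can be compared side by side. The following is a rough sketch, assuming Python's sqlite3 module with SQLite 3.25 or later; the column name `Expend`, the CTE name `WeekTotals`, and the helper `weekly_sums` are hypothetical. For the running total it swaps the correlated subquery for a windowed `SUM(...) OVER (ORDER BY SN)`, which is a different technique and needs SQL Server 2012 or later if ported to T-SQL.

```python
import sqlite3


def weekly_sums(rows):
    """Return (SN, Week_Name, Expend, WeekSum, RunningSum) for each row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T (SN INTEGER PRIMARY KEY, Week_Name TEXT, Expend INTEGER)"
    )
    conn.executemany("INSERT INTO T VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        WITH WeekTotals AS (
            SELECT Week_Name, SUM(Expend) AS SumExpend, MAX(SN) AS MaxSN
            FROM T
            GROUP BY Week_Name
        )
        SELECT T.SN, T.Week_Name, T.Expend,
               -- total of the week, shown only on the week's last row
               CASE WHEN T.SN = w.MaxSN THEN w.SumExpend ELSE 0 END AS WeekSum,
               -- total of all rows so far, shown only on the week's last row
               CASE WHEN T.SN = w.MaxSN
                    THEN SUM(T.Expend) OVER (ORDER BY T.SN) ELSE 0 END AS RunningSum
        FROM T
        JOIN WeekTotals w ON w.Week_Name = T.Week_Name
        ORDER BY T.SN
        """
    ).fetchall()
    conn.close()
    return result


sample = [(1, "Week 1", 10), (2, "Week 1", 20), (3, "Week 1", 30),
          (4, "Week 2", 40), (5, "Week 2", 50), (6, "Week 3", 10)]
for row in weekly_sums(sample):
    print(row)
```

With the sample data from the question, the week totals come out as 60, 90 and 10 and the running totals as 60, 150 and 160, each on the last row of its week.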

Perform arithmetic in select statement

Let's suppose I have balance 2000, and want to select balance as
balance=balance-Cr+Dr
So my balance column will give values as below.
balance DR Cr
40000 0 60000
100000 60000 0
0 0 100000
How is this possible in SQL query?
Here is a recursive CTE that calculates the balance using the balance from the previous row. You need something that defines the order of the rows. I use the ID column in the sample table.
-- Test table
create table #T
(
ID int identity primary key,
DR int,
Cr int
)
-- Sample data
insert into #T (DR, Cr)
select 0, 60000 union all
select 60000, 0 union all
select 0, 100000
-- In value
declare @StartBalance int
set @StartBalance = 100000
-- Recursive cte calculating balance as a running sum
;with cte as
(
select
T.ID,
@StartBalance - T.Cr + T.DR as Balance,
T.DR,
T.Cr
from #T as T
where T.ID = 1
union all
select
T.ID,
C.Balance - T.Cr + T.DR as Balance,
T.DR,
T.Cr
from cte as C
inner join #T as T
on C.ID+1 = T.ID
)
select Balance, DR, Cr
from cte
option (maxrecursion 0)
Result:
Balance DR Cr
----------- ----------- -----------
40000 0 60000
100000 60000 0
0 0 100000
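As an alternative to the recursive CTE, the running balance can also be written as a windowed sum. This is a minimal sketch of that swapped-in technique, assuming Python's sqlite3 module with SQLite 3.25 or later; the helper name `running_balance` is made up, and a T-SQL port would need SQL Server 2012 or later for `SUM(...) OVER (ORDER BY ...)`.

```python
import sqlite3


def running_balance(start_balance, entries):
    """Return (Balance, DR, Cr) rows; entries is a list of (DR, Cr) in row order."""
    conn = sqlite3.connect(":memory:")
    # ID defines the row order, as in the answer's sample table
    conn.execute(
        "CREATE TABLE T (ID INTEGER PRIMARY KEY AUTOINCREMENT, DR INTEGER, Cr INTEGER)"
    )
    conn.executemany("INSERT INTO T (DR, Cr) VALUES (?, ?)", entries)
    result = conn.execute(
        """
        SELECT ? + SUM(DR - Cr) OVER (ORDER BY ID) AS Balance,  -- start + net movement so far
               DR, Cr
        FROM T
        ORDER BY ID
        """,
        (start_balance,),
    ).fetchall()
    conn.close()
    return result


for row in running_balance(100000, [(0, 60000), (60000, 0), (0, 100000)]):
    print(row)
```

With a starting balance of 100000 this gives balances of 40000, 100000 and 0, the same as the recursive version, and it does not depend on the ID values being consecutive.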
This should work:
SELECT (T.BALANCE-T.CR+T.DR) as "Balance", T.DR, T.CR
FROM <table-name> T
If you use Oracle, there is a function called LAG that reads data from the previous row: http://www.adp-gmbh.ch/ora/sql/analytical/lag.html
If you read this link, I think you will see that this is exactly what you need, but only if you use Oracle.