How to Select the Maximum value occurring 2 times? - tsql

Suppose I have a table of values looking like this:
Sample_Number |
-------------------
1 |
1 |
2 |
3 |
3 |
4 |
5 |
How can I write a SELECT statement to return the maximum sample number that occurs exactly 2 times? In the sample data the value I am looking for would be 3.
I imagine there could be a number of answers to this - I am especially interested in a solution with no inner selects and that uses the Having clause (if this is possible).

You can use this query:
SELECT TOP 1 Sample_Number As MaxSampleNumberThatOccursTwice
FROM dbo.TableName
GROUP BY Sample_Number
HAVING COUNT(*) = 2
ORDER BY Sample_Number DESC
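
To try the GROUP BY / HAVING / ORDER BY idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name samples and the helper max_value_occurring_twice are made up for illustration, and SQLite spells TOP 1 as LIMIT 1, so this is an analogue of the query above rather than the same statement.

```python
import sqlite3


def max_value_occurring_twice(values):
    """Return the largest value that appears exactly twice, or None if there is none."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for dbo.TableName from the answer.
    conn.execute("CREATE TABLE samples (sample_number INTEGER)")
    conn.executemany("INSERT INTO samples VALUES (?)", [(v,) for v in values])
    row = conn.execute(
        """
        SELECT sample_number
        FROM samples
        GROUP BY sample_number
        HAVING COUNT(*) = 2          -- keep only values seen exactly twice
        ORDER BY sample_number DESC  -- largest first
        LIMIT 1                      -- SQLite's counterpart of TOP 1
        """
    ).fetchone()
    conn.close()
    return row[0] if row else None


print(max_value_occurring_twice([1, 1, 2, 3, 3, 4, 5]))  # prints 3
```

No inner select is needed: the grouping filters the candidates and the ordering picks the maximum.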

I'm sure there's an easier way to do this, but you can do it by pulling all of the Sample_Numbers with exactly two entries, and pulling the MAX() of those values:
;With Cte As
(
Select Sample_Number
From Test
Group By Sample_Number
Having Count(Sample_Number) = 2
)
Select Max(Sample_Number)
From Cte

;with cte
as
(select sample_number
from #temp
group by Sample_Number
having
count(*)=2
)
select max(sample_number) from cte

I would use a subselect (SQL Server requires an alias on the derived table):
SELECT MAX(sample_number)
FROM (SELECT sample_number
FROM TAB1
GROUP BY sample_number
HAVING COUNT(sample_number) = 2
) AS t

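Group by Sample_Number, count the rows in each group,
and keep only the groups whose count is 2. Note that HAVING cannot refer to the column alias, so the aggregate is repeated:
select Sample_Number, count(*) as occurrences from someTable
group by Sample_Number
having count(*) = 2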

Related

PostgreSQL Window Function "column must appear in the GROUP BY clause"

I'm trying to get a leaderboard of summed user scores from a list of user score entries. A single user can have more than one entry in this table.
I have the following table:
rewards
=======
user_id | amount
I want to add up all of the amount values for given users and then rank them on a global leaderboard. Here's the query I'm trying to run:
SELECT user_id, SUM(amount) AS score, rank() OVER (PARTITION BY user_id) FROM rewards;
I'm getting the following error:
ERROR: column "rewards.user_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT user_id, SUM(amount) AS score, rank() OVER (PARTITION...
Isn't user_id already in an "aggregate function" because I'm trying to partition on it? The PostgreSQL manual shows the following entry which I feel is a direct parallel of mine, so I'm not sure why mine's not working:
SELECT depname, empno, salary, avg(salary) OVER (PARTITION BY depname) FROM empsalary;
They're not grouping by depname, so how come theirs works?
For example, for the following data:
user_id | score
===============
1 | 2
1 | 3
2 | 5
3 | 1
I would expect the following output (I have made a "tie" between users 1 and 2):
user_id | SUM(score) | rank
===========================
1 | 5 | 1
2 | 5 | 1
3 | 1 | 3
So user 1 has a total score of 5 and is ranked #1, user 2 is tied with a score of 5 and thus is also rank #1, and user 3 is ranked #3 with a score of 1.
You need to GROUP BY user_id since it's not being aggregated. Then you can rank by SUM(score) descending as you want:
SELECT user_id, SUM(score), RANK() OVER (ORDER BY SUM(score) DESC)
FROM rewards
GROUP BY user_id;
user_id | sum | rank
---------+-----+------
1 | 5 | 1
2 | 5 | 1
3 | 1 | 3
There is a difference between window functions and aggregate functions. Some functions can be used both as a window function and an aggregate function, which can cause confusion. Window functions can be recognized by the OVER clause in the query.
In your case the query can be split into two steps: first aggregate per user_id, then apply a window function to total_amount.
SELECT user_id, total_amount, RANK() OVER (ORDER BY total_amount DESC)
FROM (
SELECT user_id, SUM(amount) total_amount
FROM rewards
GROUP BY user_id
) q
ORDER BY total_amount DESC
If you have
SELECT user_id, SUM(amount) ....
^^^
aggregate function (not a window function)
....
FROM .....
You need
GROUP BY user_id
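
For anyone who wants to check the "aggregate first, then rank" approach locally, here is a small sketch using Python's sqlite3 module. It assumes a SQLite build with window functions (3.25 or later), and the helper name leaderboard and the alias rnk are invented for this example. The sample amounts are the ones from the question.

```python
import sqlite3


def leaderboard(rewards):
    """Return (user_id, total_amount, rank) rows, best score first."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rewards (user_id INTEGER, amount INTEGER)")
    conn.executemany("INSERT INTO rewards VALUES (?, ?)", rewards)
    rows = conn.execute(
        """
        SELECT user_id,
               total_amount,
               RANK() OVER (ORDER BY total_amount DESC) AS rnk
        FROM (
            -- Step 1: one row per user with the summed amount.
            SELECT user_id, SUM(amount) AS total_amount
            FROM rewards
            GROUP BY user_id
        ) AS q
        -- Step 2: rank the per-user totals; ties share a rank.
        ORDER BY rnk, user_id
        """
    ).fetchall()
    conn.close()
    return rows


print(leaderboard([(1, 2), (1, 3), (2, 5), (3, 1)]))
# prints [(1, 5, 1), (2, 5, 1), (3, 1, 3)]
```

The window function has no PARTITION BY because the ranking is global; partitioning by user_id would rank each user only against themselves.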

Summing Multiple Records by maxdate

I have a table with the following data
Bldg Suit SQFT Date
1 1 1,000 9/24/2012
1 1 1,500 12/31/2011
1 2 800 8/31/2012
1 2 500 10/1/2005
I want to write a query that sums the SQFT of the row with the latest date for each suit, so the desired result would be 1,800, and it must come back in one cell/row. This will ultimately be part of a subquery; I am just not getting what I expect with the queries I have written so far.
Thanks in advance.
You can use the following (the Date column is called dt here):
select sum(t1.sqft) Total
from yourtable t1
inner join
(
select max(dt) mxdt, suit, bldg
from yourtable
group by suit, bldg
) t2
on t1.dt = t2.mxdt
and t1.bldg = t2.bldg
and t1.suit = t2.suit
; With Data As
(
Select Bldg, Suit, SQFT, Row_Number() Over (Partition By Bldg, Suit Order By Date DESC) As RowID
From YourTableNameHere
)
Select Bldg, Sum(SQFT) As TotalSQFT
From Data
Where RowId = 1
Group By Bldg
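
Here is a rough, runnable version of the "join back to the max date per group" idea, sketched with Python's sqlite3 module. The table name spaces, the helper total_latest_sqft, and the ISO yyyy-mm-dd text dates are assumptions made for the example (SQLite has no native date type, so ISO strings are used because they sort chronologically).

```python
import sqlite3


def total_latest_sqft(rows):
    """Sum sqft over the most recent row of every (bldg, suit) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spaces (bldg INTEGER, suit INTEGER, sqft INTEGER, dt TEXT)"
    )
    conn.executemany("INSERT INTO spaces VALUES (?, ?, ?, ?)", rows)
    total = conn.execute(
        """
        SELECT SUM(s.sqft)
        FROM spaces AS s
        JOIN (
            -- Latest date for each building/suit combination.
            SELECT bldg, suit, MAX(dt) AS mxdt
            FROM spaces
            GROUP BY bldg, suit
        ) AS latest
          ON latest.bldg = s.bldg
         AND latest.suit = s.suit
         AND latest.mxdt = s.dt
        """
    ).fetchone()[0]
    conn.close()
    return total


sample = [
    (1, 1, 1000, "2012-09-24"),
    (1, 1, 1500, "2011-12-31"),
    (1, 2, 800, "2012-08-31"),
    (1, 2, 500, "2005-10-01"),
]
print(total_latest_sqft(sample))  # prints 1800
```

If two rows of the same suit share the latest date, the join counts both, whereas the Row_Number() version keeps exactly one of them.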

Deleting duplicate entry from table

Suppose I have a table as follows: (on DB2 9.7.2)
COL1 COL2 COL3
----------- ---------- ----------
3 4 xyz
3 4 xyz
Now I want to write a query such that only one from these two identical records will be deleted. How can I achieve this?
I can think of :
delete from <table>;
or
delete from <table> where col1=3;
but both of the above queries will delete both records whereas I want to keep one of them.
If LIMIT doesn't work, this will:
DELETE FROM (SELECT * FROM tbl WHERE col = 3 FETCH FIRST ROW ONLY)
Can't you use a limit clause?
DELETE FROM <table> WHERE <column>=3 LIMIT 1
This is something that served my purpose:
DELETE FROM tabA M
WHERE M.tabAky IN (SELECT tabAky
FROM (SELECT tabAky,
ROW_NUMBER() OVER (PARTITION BY tabAcol1,
tabAcol2,
tabAcoln)
FROM tabA a) AS X (tabAky, ROWNUM)
WHERE ROWNUM> 1) ;
Try this (number the rows within each group of identical values and delete everything after the first):
DELETE FROM (SELECT ROW_NUMBER() OVER (PARTITION BY col1, col2, col3 ORDER BY col1) AS rn FROM tbl) AS a WHERE a.rn > 1
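
The "keep one row per group of identical values" idea can be tried out with Python's sqlite3 module. Note that this sketch swaps in a different technique: it relies on SQLite's implicit rowid column, which DB2 does not have, so treat it as an illustration of the goal rather than DB2 syntax. The table name tbl and the helper delete_duplicates are made up for the example.

```python
import sqlite3


def delete_duplicates(rows):
    """Keep one copy of each identical (col1, col2, col3) row and return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (col1 INTEGER, col2 INTEGER, col3 TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    conn.execute(
        """
        DELETE FROM tbl
        WHERE rowid NOT IN (
            -- The first physical row of every distinct combination survives.
            SELECT MIN(rowid) FROM tbl GROUP BY col1, col2, col3
        )
        """
    )
    remaining = conn.execute(
        "SELECT col1, col2, col3 FROM tbl ORDER BY col1, col2, col3"
    ).fetchall()
    conn.close()
    return remaining


print(delete_duplicates([(3, 4, "xyz"), (3, 4, "xyz")]))  # prints [(3, 4, 'xyz')]
```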

T-SQL how to count the number of duplicate rows then print the outcome?

I have a table ProductNumberDuplicates_backups, which has two columns named ProductID and ProductNumber. There are some duplicate ProductNumbers. How can I count the distinct number of products, then print out the outcome like "() product(s) were backed up."? Because this is inside a stored procedure, I have to use a variable @numrecord for the distinct number of rows. My code looks like this:
set @numrecord= select distinct ProductNumber
from ProductNumberDuplicates_backups where COUNT(*) > 1
group by ProductID
having Count(ProductNumber)>1
Print cast(@numrecord as varchar)+' product(s) were backed up.'
Obviously the error is after the = sign, as the select cannot follow it. I've searched for similar cases but they are just select statements. Please help. Many thanks!
Try
select @numrecord= count(distinct ProductNumber)
from ProductNumberDuplicates_backups
Print cast(@numrecord as varchar)+' product(s) were backed up.'
begin tran
create table ProductNumberDuplicates_backups (
ProductNumber int
)
insert ProductNumberDuplicates_backups(ProductNumber)
select 1
union all
select 2
union all
select 1
union all
select 3
union all
select 2
select * from ProductNumberDuplicates_backups
declare @numRecord int
select @numRecord = count(ProductNumber) from
(select ProductNumber, ROW_NUMBER()
over (partition by ProductNumber order by ProductNumber) RowNumber
from ProductNumberDuplicates_backups) p
where p.RowNumber > 1
print cast(@numRecord as varchar) + ' product(s) were backed up.'
rollback
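
The two answers above count different things: the first counts distinct product numbers, the second counts the surplus rows (every copy after the first). A small sketch with Python's sqlite3 module makes the difference visible. The table name backups and the helper backup_counts are hypothetical, and the surplus count is computed here as COUNT(*) minus COUNT(DISTINCT ...), which gives the same number as filtering on RowNumber > 1 as long as there are no NULL product numbers.

```python
import sqlite3


def backup_counts(product_numbers):
    """Return (distinct product numbers, rows that are extra copies)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE backups (product_number INTEGER)")
    conn.executemany(
        "INSERT INTO backups VALUES (?)", [(n,) for n in product_numbers]
    )
    distinct_count, surplus_rows = conn.execute(
        """
        SELECT COUNT(DISTINCT product_number),
               COUNT(*) - COUNT(DISTINCT product_number)
        FROM backups
        """
    ).fetchone()
    conn.close()
    return distinct_count, surplus_rows


distinct_count, surplus_rows = backup_counts([1, 2, 1, 3, 2])
print(f"{distinct_count} product(s) were backed up.")  # prints 3 product(s) were backed up.
print(f"{surplus_rows} duplicate row(s) found.")       # prints 2 duplicate row(s) found.
```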

simple SUM in T-sql

This should be really simple. I am using SSMS 2008, trying to get a sum of just one column. Problem is that I currently group on this one column and also use a HAVING statement. How do I get sum of total # of records > 1? This is my T-SQL logic currently:
select count(*) as consumer_count from #consumer_initiations
group by consumer
having count(1) > 1
But this data looks like:
consumer_count
----------------
2
2
4
3
...
Wrap it?
SELECT SUM(consumer_count)
FROM (
select count(*) as consumer_count from #consumer_initiations
group by consumer
having count(1) > 1
) AS whatever
With a nested query:
select sum(consumer_count)
FROM (
select count(*) as consumer_count from #consumer_initiations
group by consumer
having count(1) > 1
) as child
select sum(t.consumer_count)
from (select count(*) as consumer_count
from #consumer_initiations
group by consumer
having count(1) > 1) t
Try:
select sum(t.consumer_count) from
(
select count(*) as consumer_count from #consumer_initiations
group by consumer
having count(1) > 1
) t
This will give you the sum of the counts that your original query returns. A query used in the FROM clause like this is called a nested query (or derived table).
Besides the wrapping in another query, you could use this:
SELECT COUNT(*) AS consumer_count
FROM #consumer_initiations AS a
WHERE EXISTS
( SELECT *
FROM #consumer_initiations AS b
WHERE b.consumer = a.consumer
AND b.PK <> a.PK -- the Primary Key of the table
)
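
To compare the wrapped query with the EXISTS variant, here is a minimal sketch using Python's sqlite3 module. The table name initiations, the column pk, and the helper repeated_consumer_rows are assumptions for the example; pk stands in for whatever primary key the real table has.

```python
import sqlite3


def repeated_consumer_rows(consumers):
    """Count rows belonging to consumers that appear more than once, computed two ways."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE initiations (pk INTEGER PRIMARY KEY, consumer TEXT)"
    )
    conn.executemany(
        "INSERT INTO initiations (consumer) VALUES (?)", [(c,) for c in consumers]
    )
    # Variant 1: sum the per-consumer counts produced by GROUP BY / HAVING.
    nested = conn.execute(
        """
        SELECT SUM(consumer_count)
        FROM (
            SELECT COUNT(*) AS consumer_count
            FROM initiations
            GROUP BY consumer
            HAVING COUNT(*) > 1
        ) AS t
        """
    ).fetchone()[0]
    # Variant 2: count every row that has another row with the same consumer.
    exists = conn.execute(
        """
        SELECT COUNT(*)
        FROM initiations AS a
        WHERE EXISTS (
            SELECT 1
            FROM initiations AS b
            WHERE b.consumer = a.consumer
              AND b.pk <> a.pk
        )
        """
    ).fetchone()[0]
    conn.close()
    return nested, exists


print(repeated_consumer_rows(["a", "a", "b", "b", "b", "b", "c", "d", "d", "d", "e", "e"]))
# prints (11, 11)
```

One difference to be aware of: when no consumer repeats, SUM over the empty derived table yields NULL (None in Python), while the EXISTS variant yields 0.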