Selecting row(s) that have distinct count (one) of certain column - tsql

I have following dataset:
org system_id punch_start_tb1 punch_start_tb2
CG 100242 2022-08-16T00:08:00Z 2022-08-16T03:08:00Z
LA 250595 2022-08-16T00:00:00Z 2022-08-16T03:00:00Z
LB 300133 2022-08-15T04:00:00Z 2022-08-16T04:00:00Z
LB 300133 2022-08-16T04:00:00Z 2022-08-15T04:00:00Z
MO 400037 2022-08-15T14:00:00Z 2022-08-15T23:00:00Z
MO 400037 2022-08-15T23:00:00Z 2022-08-15T14:00:00Z
I am trying to filter out data so that it only populates the outcome when Count of "system_id" = 1.
So, the expected outcome would be only following two rows:
org system_id punch_start_tb1 punch_start_tb2
CG 100242 2022-08-16T00:08:00Z 2022-08-16T03:08:00Z
LA 250595 2022-08-16T00:00:00Z 2022-08-16T03:00:00Z
I tried with Group by and Having clause, but I did not have a success.

You can try below
SELECT * FROM
(
SELECT org,system_id,punch_start_tbl,punch_start_tb2
,ROW_NUMBER()OVER(PARTITION BY system_id ORDER BY system_id)RN
FROM <TableName>
)X
WHERE RN = 1

CTE returns org with only one record then join with main table on org column.
;WITH CTE AS (
select org
from <table_name>
group by org
Having count(1) = 1
)
select t.*
from cte
inner join <table_name> t on cte.org = t.org

You can try this (use min because we have only one row):
select MIN(org), system_id, MIN(punch_start_tb1), MIN(punch_start_tb2)
from <table_name>
group by system_id
Having count(1) = 1
or use answer #Meyssam Toluie with group by by system_id

Related

TSQL -- display results of two queries on one row in SSMS

I am using TSQL, SSMS v.17.9.1 The underlying db is Microsoft SQL Server 2014 SP3
For display purposes, I want to concatenate the results of two queries:
SELECT TOP 1 colA as 'myCol1' FROM tableA
--
SELECT TOP 1 colB as 'myCol2' FROM tableB
and display the results from the queries in one row in SSMS.
(The TOP 1 directive would hopefully guarantee the same number of results from each query, which would assist displaying them together. If this could be generalized to TOP 10 per query that would help also)
This should work for any number of rows, it assumes you want to pair ordered by the values in the column displayed
With
TableA_CTE AS
(
SELECT TOP 1 colA as myCol1
,Row_Number() OVER (ORDER BY ColA DESC) AS RowOrder
FROM tableA
),
TableB_CTE AS
(
SELECT TOP 1 colB as myCol2
,Row_Number() OVER (ORDER BY ColB DESC) AS RowOrder
FROM tableB
)
SELECT A.myCol1, B.MyCol2
FROM TableA_CTE AS A
INNER JOIN TableB_CTE AS B
ON A.RowOrder = B.RowOrder
There are currently two issues with the accepted answer:
I) a missing comma before the line: "Table B As"
II) TSQL seems to find it recursive as written, so I re-wrote it in a non-recursive way:
This is a re-working of the accepted answer that actually works in T-SQL:
USE [Database_1];
With
CTE_A AS
(
SELECT TOP 1 [Col1] as myCol1
,Row_Number() OVER (ORDER BY [Col2] desc) AS RowOrder
FROM [TableA]
)
,
CTE_B AS
(
SELECT TOP 1 [Col2] as myCol2
,Row_Number() OVER (ORDER BY [Col2] desc) AS RowOrder
FROM [TableB]
)
SELECT A.myCol1, B.myCol2
FROM CTE_A AS A
INNER JOIN CTE_B AS B
ON ( A.RowOrder = B.RowOrder)

Postgresql Update Based on count, min and group by

Thank you for taking the time to look at my question.
I've seen similar questions, but not the same depth. Please help!
I would like to update a column all rows in a table that holds user_id and date_created with the lowest date_created for the user_id.
The following select gives me all the rows I would like to update:
select user_id, min(date_created) from mytable s1 where
(select count(1) from mytable s2 where
s1.user_id = s2.user_id group by s2.user_id)
> 1 group by user_id order by user_id;
I would have expected this update to work:
update mytable set join_status = 1 where date_created =
(select min(date_created) from mytable s1 where
(select count(1) from simplepay_payment s2 where
s1.user_id = s2.user_id group by s2.user_id)
> 1 group by user_id);
But is gave the following error:
ERROR: more than one row returned by a subquery used as an expression
I've tried a few different solutions, but nothing seems to help.
Does anyone have any ideas fro me?
Thanks again.
Change your SQL to:
update mytable set join_status = 1 where date_created IN
(select min(date_created) from mytable s1 where
(select count(1) from simplepay_payment s2 where
s1.user_id = s2.user_id group by s2.user_id)
> 1 group by user_id);
Read more on row comparison in the docs.
EDIT:
In the subquery you're performing GROUP BY user_id. This means that you will receive many rows, based on the number of unique user_id values in your simplepay_payment table.
To make your query working as expected, you should join using 2 columns: user_id and date_created. As you've mentioned, you already have the query that gives you the correct results, so you can use it like this:
WITH desired AS (
SELECT user_id, min(date_created) AS mindt
FROM mytable s1 where
(SELECT count(1) FROM mytable s2
WHERE s1.user_id = s2.user_id GROUP BY s2.user_id) > 1
GROUP BY user_id)
UPDATE mytable m SET join_status = 1 FROM desired d
WHERE d.user_id = m.user_id AND d.mindt = m.date_created;
I've wrapped in your query into the Common Table Expression and used it in the UPDATE statement. You can add RETURNING m.* at the end of the query to see the rows that had been updated and their new values.
You can test this query on SQL Fiddle.
EDIT2:
Common Table Expressions (WITH-queries) are not available before version 9.1 for UPDATE statements. You can simply move the CTE subquery into the update, like this:
UPDATE mytable m SET join_status = 1 FROM (
SELECT user_id, min(date_created) AS mindt
FROM mytable s1 where
(SELECT count(1) FROM mytable s2
WHERE s1.user_id = s2.user_id GROUP BY s2.user_id) > 1
GROUP BY user_id) d
WHERE d.user_id = m.user_id AND d.mindt = m.date_created;

T-SQL how to count the number of duplicate rows then print the outcome?

I have a table ProductNumberDuplicates_backups, which has two columns named ProductID and ProductNumber. There are some duplicate ProductNumbers. How can I count the distinct number of products, then print out the outcome like "() products was backup." ? Because this is inside a stored procedure, I have to use a variable #numrecord as the distinct number of rows. I put my codes like this:
set #numrecord= select distinct ProductNumber
from ProductNumberDuplicates_backups where COUNT(*) > 1
group by ProductID
having Count(ProductNumber)>1
Print cast(#numrecord as varchar)+' product(s) were backed up.'
obviously the error was after the = sign as the select can not follow it. I've search for similar cases but they are just select statements. Please help. Many thanks!
Try
select #numrecord= count(distinct ProductNumber)
from ProductNumberDuplicates_backups
Print cast(#numrecord as varchar)+' product(s) were backed up.'
begin tran
create table ProductNumberDuplicates_backups (
ProductNumber int
)
insert ProductNumberDuplicates_backups(ProductNumber)
select 1
union all
select 2
union all
select 1
union all
select 3
union all
select 2
select * from ProductNumberDuplicates_backups
declare #numRecord int
select #numRecord = count(ProductNumber) from
(select ProductNumber, ROW_NUMBER()
over (partition by ProductNumber order by ProductNumber) RowNumber
from ProductNumberDuplicates_backups) p
where p.RowNumber > 1
print cast(#numRecord as varchar) + ' product(s) were backed up.'
rollback

TSQL invalid HAVING count

I am using SSMS 2008 and trying to use a HAVING statement. This should be a real simple query. However, I am only getting one record returned event though there are numerous duplicates.
Am I doing something wrong with the HAVING statement here? Or is there some other function that I could use instead?
select
address_desc,
people_id
from
dbo.address_view
where people_id is not NULL
group by people_id , address_desc
having count(*) > 1
sample data from address_view:
people_id address_desc
---------- ------------
Murfreesboro, TN 37130 F15D1135-9947-4F66-B778-00E43EC44B9E
11 Mohawk Rd., Burlington, MA 01803 C561918F-C2E9-4507-BD7C-00FB688D2D6E
Unknown, UN 00000 C561918F-C2E9-4507-BD7C-00FB688D2D6E
Jacksonville, NC 28546 FC7C78CD-8AEA-4C8E-B93D-010BF8E4176D
Memphis, TN 38133 8ED8C601-5D35-4EB7-9217-012905D6E9F1
44 Maverick St., Fitchburg, MA 8ED8C601-5D35-4EB7-9217-012905D6E9F1
The GROUP BY is going to lump your duplicates together into a single row.
I think instead, you want to find all people_id values with duplicate address_desc:
SELECT a.address_desc, a.people_id
FROM dbo.address_view a
INNER JOIN (SELECT address_desc
FROM dbo.address_view
GROUP BY address_desc
HAVING COUNT(*) > 1) t
ON a.address_desc = t.address_desc
using row_number and partition you can find the duplicate occurrences where row_num>1
select address_desc,
people_id,
row_num
from
(
select
address_desc,
people_id,
row_number() over (partition by address_desc order by address_desc) row_num
from
dbo.address_view
where people_id is not NULL
) x
where row_num>1

Query to get row from one table, else random row from another

tblUserProfile - I have a table which holds all the Profile Info (too many fields)
tblMonthlyProfiles - Another table which has just the ProfileID in it (the idea is that this table holds 2 profileids which sometimes become monthly profiles (on selection))
Now when I need to show monthly profiles, I simply do a select from this tblMonthlyProfiles and Join with tblUserProfile to get all valid info.
If there are no rows in tblMonthlyProfile, then monthly profile section is not displayed.
Now the requirement is to ALWAYS show Monthly Profiles. If there are no rows in monthlyProfiles, it should pick up 2 random profiles from tblUserProfile. If there is only one row in monthlyProfiles, it should pick up only one random row from tblUserProfile.
What is the best way to do all this in one single query ?
I thought something like this
select top 2 * from tblUserProfile P
LEFT OUTER JOIN tblMonthlyProfiles M
on M.profileid = P.profileid
ORder by NEWID()
But this always gives me 2 random rows from tblProfile. How can I solve this ?
Try something like this:
SELECT TOP 2 Field1, Field2, Field3, FinalOrder FROM
(
select top 2 Field1, Field2, Field3, FinalOrder, '1' As FinalOrder from tblUserProfile P JOIN tblMonthlyProfiles M on M.profileid = P.profileid
UNION
select top 2 Field1, Field2, Field3, FinalOrder, '2' AS FinalOrder from tblUserProfile P LEFT OUTER JOIN tblMonthlyProfiles M on M.profileid = P.profileid ORDER BY NEWID()
)
ORDER BY FinalOrder
The idea being to pick two monthly profiles (if that many exist) and then 2 random profiles (as you correctly did) and then UNION them. You'll have between 2 and 4 records at that point. Grab the top two. FinalOrder column is an easy way to make sure that you try and get the monthly's first.
If you have control of the table structure, you might save yourself some trouble by simply adding a boolean field IsMonthlyProfile to the UserProfile table. Then it's a single table query, order by IsBoolean, NewID()
In SQL 2000+ compliant syntax you could do something like:
Select ...
From (
Select TOP 2 ...
From tblUserProfile As UP
Where Not Exists( Select 1 From tblMonthlyProfile As MP1 )
Order By NewId()
) As RandomProfile
Union All
Select MP....
From tblUserProfile As UP
Join tblMonthlyProfile As MP
On MP.ProfileId = UP.ProfileId
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) >= 1
Union All
Select ...
From (
Select TOP 1 ...
From tblUserProfile As UP
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) = 1
Order By NewId()
) As RandomProfile
Using SQL 2005+ CTE you can do:
With
TwoRandomProfiles As
(
Select TOP 2 ..., ROW_NUMBER() OVER ( ORDER BY UP.ProfileID ) As Num
From tblUserProfile As UP
Order By NewId()
)
Select MP.Col1, ...
From tblUserProfile As UP
Join tblMonthlyProfile As MP
On MP.ProfileId = UP.ProfileId
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) >= 1
Union All
Select ...
From TwoRandomProfiles
Where Not Exists( Select 1 From tblMonthlyProfile As MP1 )
Union All
Select ...
From TwoRandomProfiles
Where ( Select Count(*) From tblMonthlyProfile As MP1 ) = 1
And Num = 1
The CTE has the advantage of only querying for the random profiles once and the use of the ROW_NUMBER() column.
Obviously, in all the UNION statements the number and type of the columns must match.