How do I delete records based on query results - sql-delete

How do I delete records from a table that is referenced in my query? For example, the query below returns the correct number of results, and I now want to delete those records from the same table the query reads from.
;with cte as (select *,
row_number() over (partition by c.[Trust Discharge], c.[AE Admission], c.[NHS Number]
order by c.[Hospital Number]) as Rn,
count(*) over (partition by c.[Trust Discharge], c.[AE Admission], c.[NHS Number]) as cntDups
from CommDB.dbo.tblNHFDArchive as c)
Select * from cte
Where cte.Rn>1 and cntDups >1

Since you can already select the rows with Select * from cte Where cte.Rn>1 and cntDups >1, you can delete them by running delete from your_table where unique_column in (Select unique_column from cte Where cte.Rn>1 and cntDups >1), placed after the same cte definition.
Here unique_column is a column in your table that cannot hold duplicate values, and your_table is the table the rows live in.
Don't forget to back up the table first if it is in production.
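The delete-by-unique-column idea above can be tried out on a throwaway database first. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server (it assumes the bundled SQLite is 3.25 or newer, which added window functions), and the table name archive, the id column, and the sample rows are all made up for illustration. Note that Rn > 1 already implies the group has more than one row, so the cntDups check is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE archive (
        id INTEGER PRIMARY KEY,   -- hypothetical unique column
        trust_discharge TEXT,
        ae_admission TEXT,
        nhs_number TEXT,
        hospital_number TEXT
    );
    INSERT INTO archive (trust_discharge, ae_admission, nhs_number, hospital_number) VALUES
        ('2020-01-05', '2020-01-01', 'N1', 'H1'),
        ('2020-01-05', '2020-01-01', 'N1', 'H2'),
        ('2020-01-05', '2020-01-01', 'N1', 'H3'),
        ('2020-02-09', '2020-02-02', 'N2', 'H4');
""")

# Number the rows inside each duplicate group, then delete every row
# whose unique id belongs to a row numbered 2 or higher.
conn.execute("""
    WITH cte AS (
        SELECT id,
               ROW_NUMBER() OVER (
                   PARTITION BY trust_discharge, ae_admission, nhs_number
                   ORDER BY hospital_number
               ) AS rn
        FROM archive
    )
    DELETE FROM archive
    WHERE id IN (SELECT id FROM cte WHERE rn > 1)
""")
conn.commit()

remaining = [row[0] for row in conn.execute(
    "SELECT hospital_number FROM archive ORDER BY id")]
print(remaining)
```

Only the first row of each group (lowest hospital number) survives; the ORDER BY inside the window decides which row that is.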

Related

I want to delete duplicate rows from a MySQL table.

I tried to do it with this query, but it's not working...
DELETE FROM employee
WHERE ( SELECT * FROM
(SELECT row_number() OVER (partition by id) rn FROM employee) alias
) > 1;
The above query is not working and giving this error message:
Error Code: 1242. Subquery returns more than 1 row
Try a subquery like the one below:
delete from
(
select *, row_number() over (partition by id order by id) rn
from employee
) alias where rn > 1;
You are comparing an integer (1) with a set of rows returned from the subquery, which SQL does not allow.
You can only compare an integer (1) with a single value returned from a subquery.
Use the query below (using a CTE) to remove the duplicates.
;WITH TempEmp (id,duplicateRecCount)
AS
(
SELECT id,ROW_NUMBER() OVER(PARTITION by id ORDER BY id)
AS duplicateRecCount
FROM employee
)
DELETE FROM TempEmp
WHERE duplicateRecCount > 1
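The point about sets versus single values suggests one more way to phrase the delete: test membership with IN instead of comparing with > 1. The sketch below is an assumption-laden illustration rather than a MySQL or SQL Server recipe. It runs on SQLite through Python's sqlite3 module (SQLite 3.25+ assumed), and it relies on SQLite's implicit rowid to tell fully identical rows apart, which other engines do not offer under that name. The name column and sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER, name TEXT);  -- no key, so rows can repeat
    INSERT INTO employee VALUES
        (1, 'Ann'), (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (3, 'Cy'), (3, 'Cy');
""")

# The inner query may return many rows; IN accepts a set,
# whereas "(subquery) > 1" needs the subquery to yield one value.
conn.execute("""
    DELETE FROM employee
    WHERE rowid IN (
        SELECT rid
        FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY id ORDER BY rowid) AS rn
            FROM employee
        )
        WHERE rn > 1
    )
""")
conn.commit()

survivors = conn.execute("SELECT id, name FROM employee ORDER BY id").fetchall()
print(survivors)
```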

PostgreSQL - How to count when Distinct On

How do I get the count of rows for each user_id when using DISTINCT ON, as in this query?
select distinct on (user_id) *
from some_table
I want the same counts that this query returns:
select user_id, count(*)
from some_table
group by user_id
Try this:
SELECT DISTINCT ON (a.user_id)
a.*
FROM
(
SELECT user_id
, count(*) OVER(PARTITION BY user_id)
FROM some_table
) a
If you want to be able to use SELECT * in order to get a "sample row", depending on how large your table is you may be able to use a correlated subquery to get the count of rows for that particular user id:
select distinct on (user_id) *
, (select count (1)
from some_table st2
where st2.user_id = some_table.user_id) as user_row_count
from some_table
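For engines without DISTINCT ON, the same "one sample row plus a per-user count" result can be approximated with two window functions. This is a sketch under assumptions, not PostgreSQL code: it runs on SQLite via Python's sqlite3 module (SQLite 3.25+ assumed), the payload column and the rows are hypothetical, and ROW_NUMBER() with rn = 1 stands in for DISTINCT ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE some_table (user_id INTEGER, payload TEXT);
    INSERT INTO some_table VALUES
        (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (3, 'e'), (3, 'f');
""")

# COUNT(*) OVER gives every row its group's size;
# ROW_NUMBER() picks one representative row per user_id.
rows = conn.execute("""
    SELECT user_id, payload, user_row_count
    FROM (
        SELECT user_id, payload,
               COUNT(*) OVER (PARTITION BY user_id) AS user_row_count,
               ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY payload) AS rn
        FROM some_table
    )
    WHERE rn = 1
    ORDER BY user_id
""").fetchall()
print(rows)
```

The ORDER BY inside ROW_NUMBER() plays the role that the ORDER BY clause plays for DISTINCT ON: it decides which row of each group is kept.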

Get second last row for every record in postgresql query

I have a table, say table_inventory. On table_inventory I put a trigger that inserts a new row into the audit_inventory table on every update of stock.
The table columns look like this:
table_inventory
|sr_id|inventory_id|p_name|stock|
audit_inventory
|insert_time|sr_id|inventory_id|p_name|stock|
My problem is that every inventory_id in table_inventory has multiple entries in audit_inventory, because the trigger inserts a timestamped row there on every stock update. I want to select the second-last stock value for every inventory_id in table_inventory. I wrote some CTEs to do that, but I cannot get the value for every inventory_id.
WITH CTE as
(select inventory_id,stock from table_inventory),
cte_1 as(
SELECT
stock,
row_number() over (order by insert_time desc) rn
FROM audit_inventory where inventoryid in (select inventory_id from cte)
),cte_2 as(
SELECT stock
FROM CTE
WHERE rn = 2)
select * from cte,cte_1;
The above query returns the second-last value for a single inventory_id, but I don't understand how to write a query that gets the second-last row for every inventory_id in table_inventory.
Thanks for your precious time.
Try this. I think it is what you want:
WITH CTE as
( SELECT
stock,
inventory_id,
row_number() over (PARTITION BY inventory_id order by insert_time desc) rn
FROM audit_inventory
)
SELECT
CTE.stock,
ti.inventory_id,
ti.stock
FROM
table_inventory ti
inner join CTE on CTE.inventory_id=ti.inventory_id
WHERE
CTE.rn=2
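To see the per-inventory numbering at work, here is a small runnable sketch. It is not PostgreSQL: it uses SQLite through Python's sqlite3 module (SQLite 3.25+ assumed), the sample rows and the output aliases previous_stock and current_stock are invented, and the audit table is assumed to be named audit_inventory as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_inventory (sr_id INTEGER, inventory_id INTEGER, p_name TEXT, stock INTEGER);
    CREATE TABLE audit_inventory (insert_time TEXT, sr_id INTEGER, inventory_id INTEGER,
                                  p_name TEXT, stock INTEGER);
    INSERT INTO table_inventory VALUES (1, 10, 'bolt', 9), (2, 20, 'nut', 4), (3, 30, 'gear', 6);
    INSERT INTO audit_inventory VALUES
        ('2021-01-01 10:00', 1, 10, 'bolt', 5),
        ('2021-01-02 10:00', 1, 10, 'bolt', 7),
        ('2021-01-03 10:00', 1, 10, 'bolt', 9),
        ('2021-01-01 11:00', 2, 20, 'nut', 1),
        ('2021-01-02 11:00', 2, 20, 'nut', 4),
        ('2021-01-01 12:00', 3, 30, 'gear', 6);
""")

# PARTITION BY restarts the numbering for every inventory_id,
# so rn = 2 is the second-newest audit row of each item.
rows = conn.execute("""
    WITH cte AS (
        SELECT stock, inventory_id,
               ROW_NUMBER() OVER (PARTITION BY inventory_id
                                  ORDER BY insert_time DESC) AS rn
        FROM audit_inventory
    )
    SELECT ti.inventory_id, cte.stock AS previous_stock, ti.stock AS current_stock
    FROM table_inventory ti
    INNER JOIN cte ON cte.inventory_id = ti.inventory_id
    WHERE cte.rn = 2
    ORDER BY ti.inventory_id
""").fetchall()
print(rows)
```

Inventory 30 has only one audit row, so it has no second-last value and the inner join drops it; a LEFT JOIN with the rn = 2 condition moved into the ON clause would keep it with a NULL instead.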

SQL Server SUM() for DISTINCT records

I have a field called "Users", and I want to run SUM() on that field that returns the sum of all DISTINCT records. I thought that this would work:
SELECT SUM(DISTINCT table_name.users)
FROM table_name
But it's not selecting DISTINCT records, it's just running as if I had run SUM(table_name.users).
What would I have to do to add only the distinct records from this field?
Use count()
SELECT count(DISTINCT table_name.users)
FROM table_name
This code seems to indicate sum(distinct ) and sum() return different values.
with t as (
select 1 as a
union all
select '1'
union all
select '2'
union all
select '4'
)
select sum(distinct a) as DistinctSum, sum(a) as allSum, count(distinct a) as distinctCount, count(a) as allCount from t
Do you actually have non-distinct values?
select count(1), users
from table_name
group by users
having count(1) > 1
If not, the sums will be identical.
You can see for yourself that distinct works with the following example. Here I create a subquery with duplicate values, then I do a sum distinct on those values.
select DistinctSum=sum(distinct x), RegularSum=Sum(x)
from
(
select x=1
union All
select 1
union All
select 2
union All
select 2
) x
You can see that the distinct sum column returns 3 and the regular sum returns 6 in this example.
You can use a sub-query:
select sum(users)
from (select distinct users from table_name) as d;
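The claims above are easy to check on a scratch database. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, with four made-up rows (1, 1, 2, 2) mirroring the earlier example; the alias d on the derived table is just a placeholder name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_name (users INTEGER);
    INSERT INTO table_name VALUES (1), (1), (2), (2);
""")

# SUM(DISTINCT ...) adds each distinct value once; SUM(...) adds every row.
distinct_sum, regular_sum = conn.execute(
    "SELECT SUM(DISTINCT users), SUM(users) FROM table_name").fetchone()

# Same result by de-duplicating in a derived table first.
subquery_sum = conn.execute(
    "SELECT SUM(users) FROM (SELECT DISTINCT users FROM table_name) AS d"
).fetchone()[0]

print(distinct_sum, regular_sum, subquery_sum)
```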
SUM(DISTINCTROW table_name.something)
It worked for me (innodb).
Description - "DISTINCTROW omits data based on entire duplicate records, not just duplicate fields." http://office.microsoft.com/en-001/access-help/all-distinct-distinctrow-top-predicates-HA001231351.aspx
;WITH cte
as
(
SELECT table_name.users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM table_name
)
SELECT SUM(users)
FROM cte
WHERE rn = 1
TEST
DECLARE @table_name Table (Users INT );
INSERT INTO @table_name Values (1),(1),(1),(3),(3),(5),(5);
;WITH cte
as
(
SELECT users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM @table_name
)
)
SELECT SUM(users) DisSum
FROM cte
WHERE rn = 1
Result
DisSum
9
If circumstances make it difficult to weave a "distinct" into the sum clause, it will usually be possible to add an extra "where" clause to the entire query - something like:
select sum(t.ColToSum)
from SomeTable t
where (select count(*) from SomeTable t1 where t1.ColToSum = t.ColToSum and t1.ID < t.ID) = 0
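The ROW_NUMBER() approach and the extra "where" clause approach should agree with each other. The following sketch compares them on the same sample values (1, 1, 1, 3, 3, 5, 5). It runs on SQLite via Python's sqlite3 module (SQLite 3.25+ assumed) instead of SQL Server, and the ID column is an assumed surrogate key that the where-clause variant needs in order to pick one row per value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SomeTable (ID INTEGER PRIMARY KEY, ColToSum INTEGER);
    INSERT INTO SomeTable (ColToSum) VALUES (1), (1), (1), (3), (3), (5), (5);
""")

# Variant 1: keep only the first row of each value via ROW_NUMBER().
row_number_sum = conn.execute("""
    WITH cte AS (
        SELECT ColToSum,
               ROW_NUMBER() OVER (PARTITION BY ColToSum ORDER BY ColToSum) AS rn
        FROM SomeTable
    )
    SELECT SUM(ColToSum) FROM cte WHERE rn = 1
""").fetchone()[0]

# Variant 2: keep a row only if no earlier row (smaller ID) has the same value.
where_clause_sum = conn.execute("""
    SELECT SUM(t.ColToSum)
    FROM SomeTable t
    WHERE (SELECT COUNT(*) FROM SomeTable t1
           WHERE t1.ColToSum = t.ColToSum AND t1.ID < t.ID) = 0
""").fetchone()[0]

print(row_number_sum, where_clause_sum)
```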
This may be a duplicate of:
Trying to sum distinct values SQL
As per Declan_K's answer:
Get the distinct list first...
SELECT SUM(SQ.COST)
FROM
(SELECT DISTINCT [Tracking #] as TRACK,[Ship Cost] as COST FROM YourTable) SQ

ROW_NUMBER() in Redshift to select biggest row from each group?

I need to select one row from each group based on a COUNT(1) field.
In other databases I'd use the ROW_NUMBER() function, which Redshift does not support yet.
The answer is to use a SUM(1) OVER(PARTITION BY group_field ORDER BY order_field ROWS UNBOUNDED PRECEDING) construct like this:
SELECT id,
name,
cnt
FROM
(SELECT id,
name,
count(*) cnt,
sum(1) over (partition BY id ORDER BY cnt DESC ROWS UNBOUNDED PRECEDING) AS row_number
FROM table
GROUP BY id,
name) AS grouped
WHERE row_number = 1
ORDER BY name
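The running SUM(1) trick works because a cumulative sum of ones over an ordered partition counts 1, 2, 3, ... exactly like a row number. The sketch below illustrates this on SQLite through Python's sqlite3 module (SQLite 3.25+ assumed), not on Redshift. The events table and its rows are hypothetical, and the grouping is done in an inner subquery so the window function only sees the finished counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER, name TEXT);
    INSERT INTO events VALUES
        (1, 'alpha'), (1, 'alpha'), (1, 'beta'),
        (2, 'gamma'), (2, 'delta'), (2, 'delta'), (2, 'delta');
""")

# Inner query: count rows per (id, name).
# Middle query: running SUM(1) per id, largest count first, acts as a row number.
# Outer query: keep the row numbered 1, i.e. the biggest count in each id group.
rows = conn.execute("""
    SELECT id, name, cnt
    FROM (
        SELECT id, name, cnt,
               SUM(1) OVER (PARTITION BY id ORDER BY cnt DESC
                            ROWS UNBOUNDED PRECEDING) AS rn
        FROM (SELECT id, name, COUNT(*) AS cnt FROM events GROUP BY id, name) AS grouped
    ) AS numbered
    WHERE rn = 1
    ORDER BY name
""").fetchall()
print(rows)
```

If two names tie for the largest count within an id, which one gets number 1 is arbitrary unless a tie-breaker such as name is added to the window's ORDER BY.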