ROW_NUMBER() in Redshift to select biggest row from each group? - amazon-redshift

I need to select one row from each group based on a COUNT(1) field.
In other databases I'd use the ROW_NUMBER() function, which Redshift does not support yet.

The answer is to use a SUM(1) OVER(PARTITION BY group_field ORDER BY order_field ROWS UNBOUNDED PRECEDING) construct like this:
SELECT id,
       name,
       cnt
FROM
  (SELECT id,
          name,
          count(*) cnt,
          sum(1) over (partition BY id ORDER BY count(*) DESC ROWS UNBOUNDED PRECEDING) AS row_number
   FROM table
   GROUP BY id,
            name) t
WHERE row_number = 1
ORDER BY name
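
To try the running-sum trick without a Redshift cluster, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The table name events, the sample rows, and the helper top_row_per_group are made up for illustration, and it assumes a SQLite build with window functions (3.25 or later). The counts are computed in a CTE first, so the window only orders by a plain column.

```python
import sqlite3


def top_row_per_group(rows):
    """Return the (id, name, cnt) row with the highest count for each id."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table standing in for the asker's table.
    conn.execute("CREATE TABLE events (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    query = """
        WITH counts AS (
            SELECT id, name, count(*) AS cnt
            FROM events
            GROUP BY id, name
        )
        SELECT id, name, cnt
        FROM (SELECT id,
                     name,
                     cnt,
                     -- running sum of 1 over the ordered partition acts as a row number
                     sum(1) OVER (PARTITION BY id
                                  ORDER BY cnt DESC
                                  ROWS UNBOUNDED PRECEDING) AS rn
              FROM counts) AS t
        WHERE rn = 1
        ORDER BY id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


sample = [(1, "a"), (1, "a"), (1, "b"), (2, "c"), (2, "d"), (2, "d"), (2, "d")]
print(top_row_per_group(sample))
```

If two names tie for the highest count within an id, which one gets rn = 1 is arbitrary unless you add a tie-breaker to the ORDER BY.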

Related

How to get count(*) total from DB2 with having clause?

How do I get the sum of all return rows with group by clause in DB2?
For example:
Desc Ctr
---- ---
Bowl 30
Plate 21
Spoon 6
Sum 57
SELECT COUNT (name) as Desc, Count(*) OVER ALL
GROUP BY name
The above query returns an error from DB2. What is the proper SQL statement to return the sum of all rows?
Thanks,
Brandon.
Try this query,
select name, count(*) from table group by name
Which Db2 platform are you on?
If you want just the total count of rows, then
select count(*)
from mytable
If you want the subtotals by name plus the total, SQL didn't originally support that. You had to union the two results.
select name, count(*) as cnt
from mytable
group by name
UNION ALL
select '', count(*)
from mytable
However more modern versions have added ROLLUP (and CUBE) functionality...
select name, count(*) as cnt
from mytable
group by name with rollup
Edit
To put a value for name, you could simply use COALESCE() assuming name won't ever be null except in the total row.
select coalesce(name,'-Total-') as name, count(*) as cnt
from mytable
group by name with rollup
The more correct method is to use the GROUPING() function
either return just the flag
select name, count(*) as cnt, grouping(name) as IS_TOTAL
from mytable
group by name with rollup
or use it to set the text
select case grouping(name)
when 1 then '-Total-'
else name
end as name
, count(*) as cnt
from mytable
group by name with rollup
Include total
To include the total on each line, you could do something like so...
with tot as (select count(*) as cnt from mytable)
select name
, count(*) as name_cnt
, tot.cnt as total_cnt
from mytable
cross join tot
group by name
Note that this will read mytable twice, once for the total and again for the detail rows. But it's real obvious what you're doing.
Another option would be something like so
with allrows as (
select name, count(*) as cnt, grouping(name) as IS_TOTAL
from mytable
group by name with rollup
)
select dtl.name, dtl.cnt, tot.cnt
from allrows dtl
join allrows tot
on tot.is_total = 1
where
dtl.is_total = 0
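
For anyone who wants to check the numbers from the question (30 + 21 + 6 = 57), here is a rough sketch of the UNION ALL variant and the "total on each line" variant, run against in-memory SQLite from Python rather than DB2. SQLite has no ROLLUP, so that variant is not shown. The helper name counts_with_total and the is_total sort column are my own additions, not part of the answer above.

```python
import sqlite3


def counts_with_total(names):
    """Return (rows with a trailing 'Sum' row, rows with the total on each line)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (name TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?)", [(n,) for n in names])

    # Subtotals per name plus one grand-total row, glued together with UNION ALL.
    # is_total is a helper column used only to sort the total row last.
    union_rows = conn.execute("""
        SELECT name, cnt
        FROM (SELECT name, count(*) AS cnt, 0 AS is_total
              FROM mytable
              GROUP BY name
              UNION ALL
              SELECT 'Sum', count(*), 1
              FROM mytable)
        ORDER BY is_total, name
    """).fetchall()

    # Grand total repeated on every detail line via a one-row CTE.
    per_line = conn.execute("""
        WITH tot AS (SELECT count(*) AS cnt FROM mytable)
        SELECT name, count(*) AS name_cnt, tot.cnt AS total_cnt
        FROM mytable
        CROSS JOIN tot
        GROUP BY name
        ORDER BY name
    """).fetchall()

    conn.close()
    return union_rows, per_line


names = ["Bowl"] * 30 + ["Plate"] * 21 + ["Spoon"] * 6
union_rows, per_line = counts_with_total(names)
print(union_rows)
print(per_line)
```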

Get second last row for every record in postgresql query

I have a table, let's say table_inventory. On table_inventory I put a trigger that inserts a new row into the audit_inventory table on every update of stock.
The table columns look like this:
table_inventory
|sr_id|inventory_id|p_name|stock|
audit_inventory
|insert_time|sr_id|inventory_id|p_name|stock|
My problem is that every inventory_id in table_inventory has multiple entries in audit_inventory, because the trigger inserts a timestamped row on every stock update. I want to select the second-last stock value for every inventory_id of table_inventory. I wrote some CTEs to do that, but I can't get the value for every inventory_id.
WITH CTE as
(select inventory_id,stock from table_inventory),
cte_1 as(
SELECT
stock,
row_number() over (order by insert_time desc) rn
FROM audit_inventory where inventoryid in (select inventory_id from cte)
),cte_2 as(
SELECT stock
FROM CTE
WHERE rn = 2)
select * from cte,cte_1;
The above query returns the second-last value for a single inventory_id, but I don't understand how to write a query that gets the second-last row value for every inventory_id of table_inventory.
Thanks for your precious time.
Try doing this. I guess this is what you want:
WITH CTE as
( SELECT
stock,
inventory_id,
row_number() over (PARTITION BY inventory_id order by insert_time desc) rn
FROM audit_inventory
)
SELECT
CTE.stock,
ti.inventory_id,
ti.stock
FROM
table_inventory ti
inner join CTE on CTE.inventory_id=ti.inventory_id
WHERE
CTE.rn=2
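
A small sketch of the PARTITION BY idea, run against in-memory SQLite from Python instead of PostgreSQL so it can be tried anywhere. The sample rows and the helper name second_last_stock are invented, the audit table is cut down to the three columns that matter, and window functions need SQLite 3.25 or later.

```python
import sqlite3


def second_last_stock(audit_rows):
    """Return (inventory_id, stock) for the second most recent audit row per id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit_inventory "
        "(insert_time TEXT, inventory_id INTEGER, stock INTEGER)"
    )
    conn.executemany("INSERT INTO audit_inventory VALUES (?, ?, ?)", audit_rows)
    rows = conn.execute("""
        WITH ranked AS (
            SELECT inventory_id,
                   stock,
                   -- numbering restarts at 1 for each inventory_id, newest row first
                   row_number() OVER (PARTITION BY inventory_id
                                      ORDER BY insert_time DESC) AS rn
            FROM audit_inventory
        )
        SELECT inventory_id, stock
        FROM ranked
        WHERE rn = 2
        ORDER BY inventory_id
    """).fetchall()
    conn.close()
    return rows


audit = [
    ("2024-01-01", 1, 10),
    ("2024-01-02", 1, 8),
    ("2024-01-03", 1, 5),
    ("2024-01-01", 2, 40),
    ("2024-01-02", 2, 35),
    ("2024-01-01", 3, 7),  # only one audit row, so no second-last value
]
print(second_last_stock(audit))
```

An inventory_id with fewer than two audit rows simply does not appear in the result, and the inner join in the answer above drops it in the same way.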

How do I delete records based on query results

How do I delete records from a table that is referenced in my query? For example, below is my query, which returns the correct results, but I then want to delete those records from the same table that the query references.
;with cte as (select *,
row_number() over (partition by c.[Trust Discharge], c.[AE Admission], c.[NHS Number]
order by c.[Hospital Number]) as Rn,
count(*) over (partition by c.[Trust Discharge], c.[AE Admission], c.[NHS Number]) as cntDups
from CommDB.dbo.tblNHFDArchive as c)
Select * from cte
Where cte.Rn>1 and cntDups >1
As you can already select the rows by querying Select * from cte Where cte.Rn>1 and cntDups >1, you can delete them by running delete from your_table where unique_column in (Select unique_column from cte Where cte.Rn>1 and cntDups >1). Keep the ;with cte as (...) definition in front of the delete, because a CTE only exists for the single statement that follows it.
Note that unique_column is a column in your table that cannot have duplicate values, and your_table is the table where the rows reside.
And don't forget to back up your table first if it's in production.
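
Here is a hedged sketch of that delete pattern against in-memory SQLite from Python, not SQL Server. The archive table, its columns, and the helper delete_duplicates are hypothetical stand-ins for tblNHFDArchive. The id column plays the role of unique_column, and the CTE sits directly in front of the DELETE.

```python
import sqlite3


def delete_duplicates(rows):
    """Delete all but the first row of each (nhs_number, admission) group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive "
        "(id INTEGER PRIMARY KEY, nhs_number TEXT, admission TEXT)"
    )
    conn.executemany("INSERT INTO archive VALUES (?, ?, ?)", rows)
    # The CTE numbers the rows inside each duplicate group; rn > 1 marks the extras.
    conn.execute("""
        WITH cte AS (
            SELECT id,
                   row_number() OVER (PARTITION BY nhs_number, admission
                                      ORDER BY id) AS rn
            FROM archive
        )
        DELETE FROM archive
        WHERE id IN (SELECT id FROM cte WHERE rn > 1)
    """)
    remaining = conn.execute(
        "SELECT id, nhs_number, admission FROM archive ORDER BY id"
    ).fetchall()
    conn.close()
    return remaining


rows = [
    (1, "A", "2020-01-01"),
    (2, "A", "2020-01-01"),  # duplicate of row 1
    (3, "B", "2020-01-02"),
    (4, "A", "2020-01-01"),  # duplicate of row 1
    (5, "A", "2020-02-01"),
]
print(delete_duplicates(rows))
```

The rn > 1 filter already implies the group has more than one row, so the sketch leaves out the separate duplicate count.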

SQL Server SUM() for DISTINCT records

I have a field called "Users", and I want to run SUM() on that field that returns the sum of all DISTINCT records. I thought that this would work:
SELECT SUM(DISTINCT table_name.users)
FROM table_name
But it's not selecting DISTINCT records, it's just running as if I had run SUM(table_name.users).
What would I have to do to add only the distinct records from this field?
Use count()
SELECT count(DISTINCT table_name.users)
FROM table_name
This code seems to indicate sum(distinct ) and sum() return different values.
with t as (
select 1 as a
union all
select '1'
union all
select '2'
union all
select '4'
)
select sum(distinct a) as DistinctSum, sum(a) as allSum, count(distinct a) as distinctCount, count(a) as allCount from t
Do you actually have non-distinct values?
select count(1), users
from table_name
group by users
having count(1) > 1
If not, the sums will be identical.
You can see for yourself that distinct works with the following example. Here I create a subquery with duplicate values, then I do a sum distinct on those values.
select DistinctSum=sum(distinct x), RegularSum=Sum(x)
from
(
select x=1
union All
select 1
union All
select 2
union All
select 2
) x
You can see that the distinct sum column returns 3 and the regular sum returns 6 in this example.
You can use a sub-query:
select sum(users)
from (select distinct users from table_name) d;
SUM(DISTINCTROW table_name.something)
It worked for me (innodb).
Description - "DISTINCTROW omits data based on entire duplicate records, not just duplicate fields." http://office.microsoft.com/en-001/access-help/all-distinct-distinctrow-top-predicates-HA001231351.aspx
;WITH cte
as
(
SELECT table_name.users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM table_name
)
SELECT SUM(users)
FROM cte
WHERE rn = 1
TEST
DECLARE @table_name Table (Users INT );
INSERT INTO @table_name Values (1),(1),(1),(3),(3),(5),(5);
;WITH cte
as
(
SELECT users , rn = ROW_NUMBER() OVER (PARTITION BY users ORDER BY users)
FROM @table_name
)
)
SELECT SUM(users) DisSum
FROM cte
WHERE rn = 1
Result
DisSum
9
If circumstances make it difficult to weave a "distinct" into the sum clause, it will usually be possible to add an extra "where" clause to the entire query - something like:
select sum(t.ColToSum)
from SomeTable t
where (select count(*) from SomeTable t1 where t1.ColToSum = t.ColToSum and t1.ID < t.ID) = 0
May be a duplicate of
"Trying to sum distinct values SQL"
As per Declan_K's answer:
Get the distinct list first...
SELECT SUM(SQ.COST)
FROM
(SELECT DISTINCT [Tracking #] as TRACK,[Ship Cost] as COST FROM YourTable) SQ
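
To compare the variants side by side, here is a quick sketch in Python using in-memory SQLite as a stand-in for SQL Server. It uses the same values as the TEST above. The helper name distinct_and_plain_sum is made up.

```python
import sqlite3


def distinct_and_plain_sum(values):
    """Return (SUM(DISTINCT users), SUM(users), sum over a DISTINCT subquery)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_name (users INTEGER)")
    conn.executemany("INSERT INTO table_name VALUES (?)", [(v,) for v in values])
    distinct_sum, plain_sum = conn.execute(
        "SELECT sum(DISTINCT users), sum(users) FROM table_name"
    ).fetchone()
    # Same result as SUM(DISTINCT ...), written as a derived table with an alias.
    (subquery_sum,) = conn.execute(
        "SELECT sum(users) FROM (SELECT DISTINCT users FROM table_name) AS d"
    ).fetchone()
    conn.close()
    return distinct_sum, plain_sum, subquery_sum


print(distinct_and_plain_sum([1, 1, 1, 3, 3, 5, 5]))  # (9, 19, 9)
```

If the two sums come out equal on your own data, the column most likely has no repeated values.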

Order by on an inline view

I would like to get the top 10 rows from a table, sorted in ascending order by an outer query. Below is pseudocode for the query. What are the options other than using table-valued functions?
select * from
(select top 10 tour_date
from tourtable
order by tour_date desc)
order by tour_date asc
Your query as written should work, you'd just need to alias the subquery:
select *
from (select top 10 tour_date from tourtable order by tour_date desc) t
order by tour_date asc
Another alternative, assuming SQL Server 2005+:
SELECT t.tour_date
FROM (SELECT tour_date, ROW_NUMBER() OVER(ORDER BY tour_date DESC) AS RowNum
FROM tourtable) t
WHERE t.RowNum <= 10
ORDER BY t.tour_date ASC
which could also be written with a CTE:
WITH cteRowNum AS (
SELECT tour_date, ROW_NUMBER() OVER(ORDER BY tour_date DESC) AS RowNum
FROM tourtable
)
SELECT tour_date
FROM cteRowNum
WHERE RowNum <= 10
ORDER BY tour_date ASC
Tested in a non-tsql context:
select * from (select tour_date from tourtable order by tour_date desc limit 10) a order by tour_date asc
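
The LIMIT form can be tried from Python against in-memory SQLite. This is only a sketch: the helper latest_n_ascending and the sample dates are invented, and n is a parameter so the example can run on a short list.

```python
import sqlite3


def latest_n_ascending(dates, n=10):
    """Pick the n latest tour dates, then return them in ascending order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tourtable (tour_date TEXT)")
    conn.executemany("INSERT INTO tourtable VALUES (?)", [(d,) for d in dates])
    rows = conn.execute(
        """
        SELECT tour_date
        FROM (SELECT tour_date
              FROM tourtable
              ORDER BY tour_date DESC
              LIMIT ?) AS t           -- inner query picks the n latest dates
        ORDER BY tour_date ASC        -- outer query only re-sorts them
        """,
        (n,),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


dates = ["2024-01-03", "2024-01-01", "2024-01-05", "2024-01-02", "2024-01-04"]
print(latest_n_ascending(dates, n=3))
```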