Is T-SQL (2005) RANK OVER(PARTITION BY) the answer?

I have a stored procedure that does paging for the front end and is working fine. I now need to modify that procedure to group by four of the 20 columns returned and then return only the row within each group that has the lowest priority. So when resort_id, bedrooms, kitchen and checkin (date) all match, only the row with the minimum priority should be returned. I still have to maintain the paging functionality. @startIndex and @upperBound are parameters passed into the procedure from the front end for paging. I’m thinking that RANK OVER (PARTITION BY) is the answer; I just can’t quite figure out how to put it all together.
SELECT I.id,
I.resort_id,
I.[bedrooms],
I.[kitchen],
I.[checkin],
I.[priority],
I.col_1,
I.col_2 /* ..... (more cols) */
FROM (
SELECT ROW_NUMBER() OVER(ORDER by checkin) AS rowNumber,
*
FROM Inventory
) AS I
WHERE rowNumber >= @startIndex
AND rowNumber < @upperBound
ORDER BY rowNumber
Example 2 after fix:
SELECT I.resort_id,
I.[bedrooms],
I.[kitchen],
I.[checkin],
I.[priority],
I.col_1,
I.col_2 /* ..... (more cols) */
FROM Inventory i
JOIN
(
SELECT ROW_NUMBER() OVER(ORDER BY h.checkin) as rowNumber, MIN(h.id) as id
FROM Inventory h
JOIN (
SELECT resort_id, bedrooms, kitchen, checkin, id, MIN(priority) as priority
FROM Inventory
GROUP BY resort_id, bedrooms, kitchen, checkin, id
) h2 on h.resort_id = h2.resort_id and
h.bedrooms = h2.bedrooms and
h.kitchen = h2.kitchen and
h.checkin = h2.checkin and
h.priority = h2.priority
GROUP BY h.resort_id, h.bedrooms, h.kitchen, h.checkin, h.priority
) AS I2
on i.id = i2.id
WHERE rowNumber >= @startIndex
AND rowNumber < @upperBound
ORDER BY rowNumber

I would accomplish it this way.
SELECT I.resort_id,
I.[bedrooms],
I.[kitchen],
I.[checkin],
I.[priority],
I.col_1,
I.col_2 /* ..... (more cols) */
FROM Inventory i
JOIN
(
SELECT ROW_NUMBER() OVER (ORDER BY h.checkin) AS rowNumber, MIN(h.id) AS id
FROM Inventory h
JOIN (
SELECT resort_id, bedrooms, kitchen, checkin, MIN(priority) AS priority
FROM Inventory
GROUP BY resort_id, bedrooms, kitchen, checkin
) h2 on h.resort_id = h2.resort_id and
h.bedrooms = h2.bedrooms and
h.kitchen = h2.kitchen and
h.checkin = h2.checkin and
h.priority = h2.priority
GROUP BY h.resort_id, h.bedrooms, h.kitchen, h.checkin, h.priority
) AS I2
on i.id = i2.id
WHERE rowNumber >= @startIndex
AND rowNumber < @upperBound
ORDER BY rowNumber
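For comparison, here is a minimal sketch of the PARTITION BY idea the question asks about. It is not the answer's method: it uses ROW_NUMBER() rather than RANK(), because RANK() would return every row tied on the minimum priority. An in-memory SQLite database stands in for SQL Server, and the sample rows, the `ranked` and `numbered` CTE names, and the `fetch_page` helper are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; the rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Inventory (
    id INTEGER PRIMARY KEY, resort_id INT, bedrooms INT,
    kitchen TEXT, checkin TEXT, priority INT
);
INSERT INTO Inventory VALUES
    (1, 10, 2, 'full', '2009-01-03', 5),
    (2, 10, 2, 'full', '2009-01-03', 1),
    (3, 10, 1, 'mini', '2009-01-01', 2),
    (4, 20, 2, 'full', '2009-01-02', 3),
    (5, 20, 2, 'full', '2009-01-02', 7);
""")

PAGED_SQL = """
WITH ranked AS (
    -- Number the rows inside each group, lowest priority first.
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY resort_id, bedrooms, kitchen, checkin
               ORDER BY priority, id
           ) AS rn
    FROM Inventory
),
numbered AS (
    -- Keep one row per group, then number the survivors for paging.
    SELECT *, ROW_NUMBER() OVER (ORDER BY checkin, id) AS rowNumber
    FROM ranked
    WHERE rn = 1
)
SELECT id, resort_id, priority, rowNumber
FROM numbered
WHERE rowNumber >= :startIndex AND rowNumber < :upperBound
ORDER BY rowNumber
"""

def fetch_page(start_index, upper_bound):
    """Return one page of rows (hypothetical helper)."""
    params = {"startIndex": start_index, "upperBound": upper_bound}
    return conn.execute(PAGED_SQL, params).fetchall()

first_page = fetch_page(1, 3)
second_page = fetch_page(3, 5)
print(first_page)
print(second_page)
```

The same two-step shape (rank within the group, then number the surviving rows) should carry over to SQL Server 2005, with @startIndex and @upperBound in place of the named parameters.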

Related

extracting records from rank = 1

I would like to get the titles that have 1 in the rank column.
SELECT title, RANK() OVER(ORDER BY COUNT(*) DESC) rank
FROM rentals as w join copies as e on w.signature = e.signature join books as c on e.idbook = c.idbook
WHERE dateofloan <= CURRENT_DATE - 31
GROUP BY title;
My code shows two columns
title, rank
Thank you in advance for your help.
Subquery and restrict to the first rank:
WITH cte AS (
SELECT title, RANK() OVER (ORDER BY COUNT(*) DESC) rnk
FROM rentals w
INNER JOIN copies e ON w.signature = e.signature
INNER JOIN books c ON e.idbook = c.idbook
WHERE dateofloan <= CURRENT_DATE - 31
GROUP BY title
)
SELECT title
FROM cte
WHERE rnk = 1;
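One thing worth seeing in action is that RANK() keeps ties, so filtering on rank 1 can return more than one title. The sketch below is an assumption-laden simplification: a single hypothetical `loans` table replaces the three joined tables, SQLite stands in for the original database, and the count is computed in a separate `counts` CTE before ranking.

```python
import sqlite3

# Hypothetical single-table stand-in for the rentals/copies/books join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE loans (title TEXT);
INSERT INTO loans VALUES ('Dune'), ('Dune'), ('Emma'), ('Emma'), ('Ulysses');
""")

top_titles = [row[0] for row in conn.execute("""
WITH counts AS (
    SELECT title, COUNT(*) AS n
    FROM loans
    GROUP BY title
),
cte AS (
    SELECT title, RANK() OVER (ORDER BY n DESC) AS rnk
    FROM counts
)
SELECT title FROM cte WHERE rnk = 1 ORDER BY title
""")]
print(top_titles)
```

Both titles with two loans come back, since they share rank 1. If exactly one row is wanted, ROW_NUMBER() with a tie-breaking ORDER BY is the usual alternative.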

SQL Server - Select with Group By together Row_Number

I'm using SQL Server 2000 (80). So, it's not possible to use the LAG function.
I have a data set with four columns:
Purchase_Date
Facility_no
Seller_id
Sale_id
I need to identify missing Sale_ids. Every Sale_id is 100% sequential, so there should not be any gaps in the sequence.
This code works for a specific date and store, if specified. But I need it to work on the entire data set, looping through every facility and every seller for every purchase date.
declare @MAXCOUNT int
set @MAXCOUNT =
(
select MAX(Sale_Id)
from #table
where
Facility_no in (124) and
Purchase_date = '2/7/2020'
and Seller_id = 1
)
;WITH TRX_COUNT AS
(
SELECT 1 AS Number
union all
select Number + 1 from TRX_COUNT
where Number < @MAXCOUNT
)
select * from TRX_COUNT
where
Number NOT IN
(
select Sale_Id
from #table
where
Facility_no in (124)
and Purchase_Date = '2/7/2020'
and seller_id = 1
)
order by Number
OPTION (maxrecursion 0)
My Dataset
This column:
case when
Sale_Id=0 or 1=Sale_Id-LAG(Sale_Id) over (partition by Facility_no, Purchase_Date, Seller_id order by Sale_Id)
then 'OK' else 'Previous Missing' end
will tell you which Seller_Ids have some sale missing. If you want to go a step further and have exactly your desired output, then filter out and distinct the 'Previous Missing' ones, and join with a tally table on not exists.
Edit: OP mentions in comments they can't use LAG(). My suggestion, then, would be:
Make a temp table that has the max(sale_id) grouped by facility/seller_id.
Then you can get your missing results by this pseudocode query:
Select ...
from temptable t
inner join tally N on N.num <= t.maxsale
where not exists( select ... from sourcetable s where s.facility=t.facility and s.seller=t.seller and s.sale=N.num)
> because the only way to "construct" nonexisting combinations is to construct them all and just remove the existing ones.
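The pseudocode above can be sketched end to end as follows. This is an illustration under assumptions, not the answerer's exact query: SQLite stands in for SQL Server, the `sales` and `tally` table names and the sample rows are hypothetical, Purchase_Date is added to the grouping, and Sale_ids are assumed to start at 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (Facility_no INT, Purchase_Date TEXT, Seller_id INT, Sale_id INT)"
)
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", [
    (124, "2020-02-07", 1, 1), (124, "2020-02-07", 1, 2), (124, "2020-02-07", 1, 5),
    (124, "2020-02-07", 2, 1), (124, "2020-02-07", 2, 3),
    (200, "2020-02-08", 1, 1), (200, "2020-02-08", 1, 2),
])

# A physical tally (numbers) table; 100 rows is an arbitrary size for the demo.
conn.execute("CREATE TABLE tally (num INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tally VALUES (?)", [(n,) for n in range(1, 101)])

missing = conn.execute("""
SELECT t.Facility_no, t.Purchase_Date, t.Seller_id, n.num AS missing_sale_id
FROM (
    -- One row per facility/date/seller with its highest Sale_id.
    SELECT Facility_no, Purchase_Date, Seller_id, MAX(Sale_id) AS maxsale
    FROM sales
    GROUP BY Facility_no, Purchase_Date, Seller_id
) AS t
JOIN tally AS n ON n.num <= t.maxsale      -- construct every expected id
WHERE NOT EXISTS (                          -- then drop the ones that exist
    SELECT 1 FROM sales AS s
    WHERE s.Facility_no = t.Facility_no
      AND s.Purchase_Date = t.Purchase_Date
      AND s.Seller_id = t.Seller_id
      AND s.Sale_id = n.num
)
ORDER BY t.Facility_no, t.Purchase_Date, t.Seller_id, n.num
""").fetchall()
print(missing)
```

Because the grouping covers every facility, date and seller at once, no explicit loop is needed. The tally table must hold at least as many numbers as the largest Sale_id.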
This one worked out
; WITH cte_Rn AS (
SELECT *, ROW_NUMBER() OVER(PARTITION BY Facility_no, Purchase_Date, Seller_id ORDER BY Purchase_Date) AS [Rn_Num]
FROM (
SELECT
Facility_no,
Purchase_Date,
Seller_id,
Sale_id
FROM MyTable WITH (NOLOCK)
) a
)
, cte_Rn_0 as (
SELECT
Facility_no,
Purchase_Date,
Seller_id,
Sale_id
-- [Rn_Num] AS 'Skipped Sale'
-- , case when Sale_id = 0 Then [Rn_Num] - 1 Else [Rn_Num] End AS 'Skipped Sale for 0'
, [Rn_Num] - 1 AS 'Skipped Sale for 0'
FROM cte_Rn a
)
SELECT
Facility_no,
Purchase_Date,
Seller_id,
Sale_id,
-- [Skipped Sale],
[Skipped Sale for 0]
FROM cte_Rn_0 a
WHERE NOT EXISTS
(
select * from cte_Rn_0 b
where b.Sale_id = a.[Skipped Sale for 0]
and a.Facility_no = b.Facility_no
and a.Purchase_Date = b.Purchase_Date
and a.Seller_id = b.Seller_id
)
--ORDER BY Purchase_Date ASC

How to select from relationship based on the highest value of field postgres?

I have three tables, vehicles, trips and componentValues, which are related to each other as follows:
vehicles -> trips -> componentValues
vehicles Table : id, ...
trips Table: id, vehicle_id, ...
componentValues Table: id, trip_id, damage, ...
and I'm trying to get all the trips with the highest-damage component from the componentValues table, like this:
SELECT *
FROM (select * from trips WHERE "trips"."vehicle_id" = '7') as t
LEFT JOIN (
select * from "component_values"
where trip_id = '85'
order by damage
desc nulls last limit 1
) as h on t.id = h.trip_id
How can I make the line where trip_id = '85' dynamic, or is there another way to do this? Many thanks in advance.
expected result:
UPDATE
I have written a query that gets what I want, but how can I improve it by not using subqueries in the select statement?
select * ,
(select damage from "component_values" where trip_id = trips.id order by damage desc nulls last limit 1) as h_damage,
(select damage from "component_values" where trip_id = trips.id order by damage asc nulls first limit 1) as l_damage,
(select component_types.name from "component_values" left join component_types on component_values.component_type_id = component_types.id where trip_id = trips.id order by damage desc nulls last limit 1) as hc_damage,
(select component_types.name from "component_values" left join component_types on component_values.component_type_id = component_types.id where trip_id = trips.id order by damage asc nulls first limit 1) as lc_damage
from trips
WHERE trips."vehicle_id" = '7'
I think you want distinct on:
select distinct on (t.id) t.*, cv.damage
from trips t join
component_values cv
on cv.trip_id = t.id
where t.vehicle_id = 7 -- not sure if this is needed
order by t.id, cv.damage desc nulls last;
distinct on is usually the most efficient method in Postgres. You can also do this with window functions:
select t.*, cv.damage
from trips t join
(select cv.*,
row_number() over (partition by cv.trip_id order by cv.damage desc nulls last) as seqnum
from component_values cv
) cv
on cv.trip_id = t.id and cv.seqnum = 1
where t.vehicle_id = 7; -- not sure if this is needed
I think you want a lateral join.
SELECT *
FROM (select * from trips WHERE "trips"."vehicle_id" = '7') as t
LEFT JOIN lateral (
select * from "component_values"
where trip_id = t.id
order by damage
desc nulls last limit 1
) as h on true
Although I don't think there is a reason for the first subquery, so:
SELECT *
FROM trips
LEFT JOIN lateral (
select * from "component_values"
where trip_id = trips.id
order by damage
desc nulls last limit 1
) as h on true
WHERE "trips"."vehicle_id" = '7'
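To check the "one highest-damage row per trip" logic on a small data set, here is a sketch of the window-function variant. SQLite stands in for PostgreSQL, so DISTINCT ON and LATERAL are not available, and `damage IS NULL` in the ORDER BY plays the role of NULLS LAST. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trips (id INTEGER PRIMARY KEY, vehicle_id INT);
CREATE TABLE component_values (id INTEGER PRIMARY KEY, trip_id INT, damage REAL);
INSERT INTO trips VALUES (85, 7), (86, 7), (87, 8);
INSERT INTO component_values (trip_id, damage) VALUES
    (85, 0.2), (85, 0.9), (85, NULL), (86, 0.4), (87, 0.7);
""")

rows = conn.execute("""
SELECT t.id, cv.damage
FROM trips AS t
LEFT JOIN (
    SELECT trip_id, damage,
           ROW_NUMBER() OVER (
               PARTITION BY trip_id
               ORDER BY damage IS NULL, damage DESC  -- non-NULL first, then highest
           ) AS seqnum
    FROM component_values
) AS cv ON cv.trip_id = t.id AND cv.seqnum = 1
WHERE t.vehicle_id = 7
ORDER BY t.id
""").fetchall()
print(rows)
```

The LEFT JOIN keeps trips that have no component values at all, which matches the behaviour of the lateral-join version.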

Having issues combining HAVING with WHERE on a very simple QUERY

This is what I have tried so far:
SELECT group_id, player_id as winner_id
/*
,sum(M1.first_score + M2.second_score), sum(M2.first_score + M1.second_score)
*/
FROM players as P1
LEFT JOIN matches as M1
ON M1.first_player = P1.player_id
LEFT JOIN matches as M2
ON M2.second_player = P1.player_id
LEFT JOIN matches as M3
ON M3.first_player = P1.player_id
LEFT JOIN matches as M4
ON M4.second_player = P1.player_id
--WHERE P1.player_id is not null /*or P2.player_id is not null*/
GROUP BY group_id, P1.player_id
/*
HAVING
sum(M1.first_score + M2.second_score) > sum(M2.first_score + M1.second_score)
OR
sum(M3.first_score + M4.second_score) > sum(M4.first_score + M3.second_score)
*/
ORDER BY group_id ASC, player_id ASC
The results that I am getting are:
1,30
1,45
1,65
2,20
2,50
3,40
I know I am missing something very obvious as usual
This is my most recent attempt:
-- write your code in PostgreSQL 9.4
SELECT
group_id, player_id as winner_id
/*
,sum(M1.first_score + M2.second_score), sum(M2.first_score + M1.second_score)
*/
FROM players as P1
LEFT JOIN matches as M1
ON M1.first_player = P1.player_id
LEFT JOIN matches as M2
ON M2.second_player = P1.player_id
LEFT JOIN matches as M3
ON M3.first_player = P1.player_id
LEFT JOIN matches as M4
ON M4.second_player = P1.player_id
GROUP BY group_id, P1.player_id, M1,M2,M3,M4
/*
HAVING
(M1 is not null) OR (M2 is not null) OR (M3 is not null) OR (M4 is not null)
*/
/*
HAVING
sum(M1.first_score + M2.second_score) > sum(M2.first_score + M1.second_score)
OR
sum(M3.first_score + M4.second_score) > sum(M4.first_score + M3.second_score)
*/
ORDER BY group_id ASC
/*, player_id DESC
*/
How can I fix the query so that I get the expected results?
I don't have any PostgreSQL background, but let's see if this works:
I would start this by simplifying it, by writing a query that first returns the total score by player:
SELECT player_id, SUM(score) score
FROM (
SELECT first_player as player_id, first_score as score
FROM matches
UNION ALL
SELECT second_player, second_score
FROM matches
) AS s
GROUP BY player_id
Now, join that dataset to players to find the groups:
SELECT w.player_id, p.group_id, w.score
FROM
(
SELECT player_id, SUM(score) score
FROM (
SELECT first_player as player_id, first_score as score
FROM matches
UNION ALL
SELECT second_player, second_score
FROM matches
) AS s
GROUP BY player_id
) as w
inner join players p
on p.player_id = w.player_id
Now we have all players, their total score and their group. We want to identify the winner by group? We can use ranking functions to do this:
SELECT
w.player_id,
p.group_id,
w.score,
RANK() OVER (PARTITION BY p.group_id ORDER BY score DESC) as group_placement
FROM
(
SELECT player_id, SUM(score) score
FROM (
SELECT first_player as player_id, first_score as score
FROM matches
UNION ALL
SELECT second_player, second_score
FROM matches
) AS s
GROUP BY player_id
) as w
inner join players p
on p.player_id = w.player_id
Now we just pick out the top ones in each group (rank = 1) using WHERE
SELECT
player_id,
group_id
FROM
(
SELECT
w.player_id,
p.group_id,
w.score,
RANK() OVER (PARTITION BY p.group_id ORDER BY score DESC) as group_placement
FROM
(
SELECT player_id, SUM(score) score
FROM (
SELECT first_player as player_id, first_score as score
FROM matches
UNION ALL
SELECT second_player, second_score
FROM matches
) AS s
GROUP BY player_id
) as w
inner join players p
on p.player_id = w.player_id
) as gp
WHERE group_placement = 1
Looks complicated? Yes, but you can see how the final result is built up bit by bit. Each step is a 'subtable', and you can run it and observe the data at each point.
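As a quick way to try the final query, the sketch below runs it against a few made-up rows. SQLite stands in for PostgreSQL; the `players` and `matches` column layouts are inferred from the question, and the scores are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (player_id INT, group_id INT);
CREATE TABLE matches (
    match_id INT, first_player INT, second_player INT,
    first_score INT, second_score INT
);
INSERT INTO players VALUES (30, 1), (45, 1), (65, 1), (20, 2), (50, 2), (40, 3);
INSERT INTO matches VALUES
    (1, 30, 45, 10, 12),
    (2, 20, 50, 5, 5),
    (3, 65, 45, 10, 10),
    (4, 30, 65, 3, 15);
""")

winners = conn.execute("""
SELECT group_id, player_id
FROM (
    SELECT p.group_id, w.player_id,
           RANK() OVER (PARTITION BY p.group_id ORDER BY w.score DESC) AS group_placement
    FROM (
        -- Total score per player, counting both sides of every match.
        SELECT player_id, SUM(score) AS score
        FROM (
            SELECT first_player AS player_id, first_score AS score FROM matches
            UNION ALL
            SELECT second_player, second_score FROM matches
        ) AS s
        GROUP BY player_id
    ) AS w
    JOIN players AS p ON p.player_id = w.player_id
) AS gp
WHERE group_placement = 1
ORDER BY group_id, player_id
""").fetchall()
print(winners)
```

Two details show up in the output: players 20 and 50 tie in group 2 and both get rank 1, and player 40 (group 3) never appears because the inner join drops players with no matches. A LEFT JOIN from players with COALESCE on the score would be one way to keep them, if that is what the expected result requires.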

TSQL Compare 2 select's result and return result with most recent date

Wonder if someone could give me a quick hand. I have 2 select queries (as shown below) and I want to compare the results of both and only return the result that has the most recent date.
So say I have the following 2 results from the queries:-
--------- ---------- ----------------------- --------------- ------ --
COMPANY A EMPLOYEE A 2007-10-16 17:10:21.000 E-mail 6D29D6D5 SYSTEM 1
COMPANY A EMPLOYEE A 2007-10-15 17:10:21.000 E-mail 6D29D6D5 SYSTEM 1
I only want to return the result with the latest date (so the first one). I thought about putting the results into a temporary table and then querying that but just wondering if there's a simpler, more efficient way?
SELECT * FROM (
SELECT fc.accountidname, fc.owneridname, fap.actualend, fap.activitytypecodename, fap.createdby, fap.createdbyname,
ROW_NUMBER() OVER (PARTITION BY fc.accountidname ORDER BY fap.actualend DESC) AS RN
FROM FilteredContact fc
INNER JOIN FilteredActivityPointer fap ON fc.parentcustomerid = fap.regardingobjectid
WHERE fc.statecodename = 'Active'
AND fap.ownerid = '0F995BDC'
AND fap.createdon < getdate()
) tmp WHERE RN = 1
SELECT * FROM (
SELECT fa.name, fa.owneridname, fa.new_technicalaccountmanageridname, fa.new_customerid, fa.new_riskstatusname,
fa.new_numberofopencases, fa.new_numberofurgentopencases, fap.actualend, fap.activitytypecodename, fap.createdby, fap.createdbyname,
ROW_NUMBER() OVER (PARTITION BY fa.name ORDER BY fap.actualend DESC) AS RN
FROM FilteredAccount fa
INNER JOIN FilteredActivityPointer fap ON fa.accountid = fap.regardingobjectid
WHERE fa.statecodename = 'Active'
AND fap.ownerid = '0F995BDC'
AND fap.createdon < getdate()
) tmp2 WHERE RN = 1
If both queries return the same structure (matching column count and column types), you could just union the results of the two queries, order by the date descending, and select the top 1.
select top 1 * from
(
-- your first query
union all
-- your second query.
) T
order by YourDateColumn1 desc
You should GROUP BY and use MAX(createdon)
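The UNION ALL suggestion can be tried out with a sketch like the one below. It runs on SQLite, so `LIMIT 1` takes the place of T-SQL's `TOP 1`, and the two small tables, their columns and the extra `source` column are hypothetical stand-ins for the results of the two original queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact_activity (company TEXT, actualend TEXT);
CREATE TABLE account_activity (company TEXT, actualend TEXT);
INSERT INTO contact_activity VALUES ('COMPANY A', '2007-10-16 17:10:21');
INSERT INTO account_activity VALUES ('COMPANY A', '2007-10-15 17:10:21');
""")

latest = conn.execute("""
SELECT company, actualend, source
FROM (
    SELECT company, actualend, 'contact' AS source FROM contact_activity
    UNION ALL
    SELECT company, actualend, 'account' AS source FROM account_activity
) AS t
ORDER BY actualend DESC   -- newest first
LIMIT 1                   -- SQLite counterpart of TOP 1
""").fetchone()
print(latest)
```

Both branches of the union have to project the same columns in the same order, so in the original queries the shorter column list would need padding (or the longer one trimming) before they can be combined this way.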