T-SQL: Simplify Select statement with ROW_NUMBER - tsql

I have the following select SQL that does a basic select statement although it does include a calculated column:
Select *
From
(
Select *,
ROW_NUMBER() OVER
(ORDER BY
CASE WHEN #sortBy = 0 THEN R.DateCreated End Desc,
CASE WHEN #sortBy = 1 THEN R.DateCreated end Asc,
CASE WHEN #sortBy = 2 THEN TotalVotes END Desc,
CASE WHEN #sortBy = 2 THEN R.TotalFoundNotUseful END Desc
) AS RowNumber
From
(
Select *, (TotalFoundUseful + TotalFoundNotUseful) As TotalVotes
From Reviews
Where (DealID = #dealID) And (TotalAbuses < 10) And (Deleted = 0)
) As R
) As Rev
Where RowNumber BETWEEN #startRecord AND #endRecord
If you look carefully, the SELECT statement itself is executed 3 times. I can't believe that this is necessary. Is there a way to reduce this to 2 select statements (or possibly even one). I don't actually need to return the RowNumber. It is only used for selecting rows within a certain range.

You can do it with two by putting the rownumber in with the original select against reviews. You can't go to one if you want to have a WHERE clause on a windowing function like ROW_NUMBER.
You do have to write TotalFoundUseful + TotalFoundNotUseful twice but it will only be evaluated once so that doesn't effect performance.
I also wouldn't expect moving to two to have any effect on performance but you should test it.
Select *
From
(
Select *, (TotalFoundUseful + TotalFoundNotUseful) As TotalVotes,
ROW_NUMBER() OVER
(ORDER BY
CASE WHEN #sortBy = 0 THEN DateCreated End Desc,
CASE WHEN #sortBy = 1 THEN DateCreated end Asc,
CASE WHEN #sortBy = 2 THEN TotalFoundUseful + TotalFoundNotUseful END Desc,
CASE WHEN #sortBy = 2 THEN TotalFoundUseful END Desc
) AS RowNumber
From Reviews
Where (DealID = #dealID) And (TotalAbuses < 10) And (Deleted = 0)
) As Rev
Where RowNumber BETWEEN #startRecord AND #endRecord

Related

tsql - Group By on computed columns

Please advice on a better way to do this.
I am sure this can be done in one query itself.
declare #tempTale table (ID bigint, ArticleDate datetime,CommentDate
datetime,MostRecentDate datetime)
declare #MinDate datetime;
set #MinDate = getdate();
set #MinDate = DATEADD(YEAR,-100,#MinDate)
insert into #tempTale
select USER_ARTICLEID, User_Article.CREATED_ON, coalesce(comment.CREATED_ON,#MinDate),
case when coalesce(User_Article.CREATED_ON,#MinDate) > coalesce(comment.CREATED_ON,#MinDate) then User_Article.CREATED_ON else comment.CREATED_ON end as MostRecentDate
from User_Article left join Comment on Comment.CONTENTID = User_Article.USER_ARTICLEID and comment.CONTENT_TYPE = User_Article.CONTENT_TYPE
order by MostRecentDate desc
select distinct top 10 ID,MAX(MostRecentDate) from #tempTale group by ID
order by MAX(MostRecentDate) desc
obvious change is to use sub-queries:
select distinct top 10 ID, MAX(MostRecentDate) from
(
select
USER_ARTICLEID as ID,
(case
when coalesce(User_Article.CREATED_ON,#MinDate) > coalesce(comment.CREATED_ON,#MinDate) then User_Article.CREATED_ON
else comment.CREATED_ON end) as MostRecentDate
from User_Article
left join Comment
on Comment.CONTENTID = User_Article.USER_ARTICLEID and comment.CONTENT_TYPE = User_Article.CONTENT_TYPE
)
group by ID
order by 2 desc
but you don't group on computed columns, so you can go with simple one:
select distinct top 10
USER_ARTICLEID as ID,
(case
when coalesce(User_Article.CREATED_ON,#MinDate) > coalesce(comment.CREATED_ON,#MinDate) then User_Article.CREATED_ON
else comment.CREATED_ON end) as MostRecentDate
from User_Article
left join Comment
on Comment.CONTENTID = User_Article.USER_ARTICLEID and comment.CONTENT_TYPE = User_Article.CONTENT_TYPE
group by USER_ARTICLEID
order by 2 desc

TSQL - LEAD for Next Different Row

Is there a way to use the lead function such that I can get the next row where something has changed, as opposed it where it is the same?
In this example, the RowType can be 'in' or 'out', for each 'in' I need to know the next RowNumber where it has become 'out'. I have been playing with the lead function as it is really fast, however I haven't been able to get it working. I just need to do the following really, which is partition by a RowType which isn't the one in the current row.
select
RowNumber
,RowType --In this case I am only interested in RowType = 'In'
, Lead(RowNumber)
OVER (partition by "RowType = out" --This is the bit I am stuck on--
order by RowNumber ASC) as NextOutFlow
from table
order by RowNumber asc
Thanks in advance for any help
Rather than using lead() I would use an outer apply that returns the next row with type out for all rows with type in:
select RowNumber, RowType, nextOut
from your_table t
outer apply (
select min(RowNumber) as nextOut
from your_table
where RowNumber > t.RowNumber and RowType='Out'
) oa
where RowType = 'In'
order by RowNumber asc
Given sample data like:
RowNumber RowType
1 in
2 out
3 in
4 in
5 out
6 in
This would return:
RowNumber RowType nextOut
1 in 2
3 in 5
4 in 5
6 in NULL
I think this will work
If you would use a bit field for in out you would get better performance
;with cte1 as
(
SELECT [inden], [OnOff]
, lag([OnOff]) over (order by [inden]) as [lagOnOff]
FROM [OnOff]
), cte2 as
(
select [inden], [OnOff], [lagOnOff]
, lead([inden]) over (order by [inden]) as [Leadinden]
from cte1
where [OnOff] <> [lagOnOff]
or [lagOnOff] is null
)
select [inden], [OnOff], [lagOnOff], [Leadinden]
from cte2
where [OnOff] = 'true'
probably slower but if you have the right indexes may work
select t1.rowNum as 'rowNumIn', min(t2.rownum) as 'nextRowNumOut'
from tabel t1
join table t2
on t1.rowType = 'In'
and t2.rowType = 'Out'
and t2.rowNum > t1.rowNum
and t2.rowNum < t1.rowNum + 1000 -- if you can constrain it
group by t1.rowNum

How to match records for two different groups?

I have one main table called Event_log which contains all of the records that I need for this query. Within this table there is one column that I'm calling "Grp". To simplify things, assume that there are only two possible values for this Grp: A and B. So now we have one table, Event_log, with one column "Grp" and one more column called "Actual Date". Lastly I want to add one more Flag column to this table, which works as follows.
First, I order all of the records in descending order by date as demonstrated below. Then, I want to flag each Group "A" row with a 1 or a 0. For all "A" rows, if the previous record (earlier in date) = "B" row then I want to flag 1. Otherwise flag a 0. So this initial table looks like this before setting this flag:
Actual Date Grp Flag
1-29-13 A
12-27-12 B
12-26-12 B
12-23-12 A
12-22-12 A
But after these calculations are done, it should look like this:
Actual Date Grp Flag
1-29-13 A 1
12-27-12 B NULL
12-26-12 B NULL
12-23-12 A 0
12-22-12 A 0
How can I do this? This is simpler to describe than it is to query!
You can use something like:
select el.ActualDate
, el.Grp
, Flag = case
when el.grp = 'B' then null
when prev.grp = 'B' then 1
else 0
end
from Event_log el
outer apply
(
select top 1 prev.grp
from Event_log prev
where el.ActualDate > prev.ActualDate
order by prev.ActualDate desc
) prev
order by el.ActualDate desc
SQL Fiddle with demo.
Try this
;with cte as
(
SELECT CAST('01-29-13' As DateTime) ActualDate,'A' Grp
UNION ALL SELECT '12-27-12','B'
UNION ALL SELECT '12-26-12','B'
UNION ALL SELECT '12-23-12','A'
UNION ALL SELECT '12-22-12','A'
)
, CTE2 as
(
SELECT *, ROW_NUMBER() OVER (order by actualdate desc) rn
FROM cte
)
SELECT a.*,
case
when A.Grp = 'A' THEN
CASE WHEN b.Grp = 'B' THEN 1 ELSE 0 END
ELSE NULL
END Flag
from cte2 a
LEFT OUTER JOIN CTE2 b on a.rn + 1 = b.rn

Is T-SQL (2005) RANK OVER(PARTITION BY) the answer?

I have a stored procedure that does paging for the front end and is working fine. I now need to modify that procedure to group by four columns of the 20 returned and then only return the row within each group that contains the lowest priority. So when resort_id, bedrooms, kitchen and checkin (date) all match then only return the row that has the min priority. I have to still maintain the paging functionality. The #startIndex and #upperbound are parms passed into the procedure from the front end for paging. I’m thinking that RANK OVER (PARTITION BY) is the answer I just can’t quite figure out how to put it all together.
SELECT I.id,
I.resort_id,
I.[bedrooms],
I.[kitchen],
I.[checkin],
I.[priority],
I.col_1,
I.col_2 /* ..... (more cols) */
FROM (
SELECT ROW_NUMBER() OVER(ORDER by checkin) AS rowNumber,
*
FROM Inventory
) AS I
WHERE rowNumber >= #startIndex
AND rowNumber < #upperBound
ORDER BY rowNumber
Example 2 after fix:
SELECT I.resort_id,
I.[bedrooms],
I.[kitchen],
I.[checkin],
I.[priority],
I.col_1,
I.col_2 /* ..... (more cols) */
FROM Inventory i
JOIN
(
SELECT ROW_NUMBER() OVER(ORDER BY h.checkin) as rowNumber, MIN(h.id) as id
FROM Inventory h
JOIN (
SELECT resort_id, bedrooms, kitchen, checkin, id, MIN(priority) as priority
FROM Inventory
GROUP BY resort_id, bedrooms, kitchen, checkin, id
) h2 on h.resort_id = h2.resort_id and
h.bedrooms = h2.bedrooms and
h.kitchen = h2.kitchen and
h.checkin = h2.checkin and
h.priority = h2.priority
GROUP BY h.resort_id, h.bedrooms, h.kitchen, h.checkin, h.priority
) AS I2
on i.id = i2.id
WHERE rowNumber >= #startIndex
AND rowNumber < #upperBound
ORDER BY rowNumber
I would accompish it this way.
SELECT I.resort_id,
I.[bedrooms],
I.[kitchen],
I.[checkin],
I.[priority],
I.col_1,
I.col_2 /* ..... (more cols) */
FROM Inventory i
JOIN
(
SELECT ROW_NUMBER(ORDER BY Checkin) as rowNumber, MIN(id) id
FROM Inventory h
JOIN (
SELECT resort_id, bedrooms, kitchen, checkin id, MIN(priority) as priority
FROM Inventory
GROUP BY resort_id, bedrooms, kitchen, checkin
) h2 on h.resort_id = h2.resort and
h.bedrooms = h2.bedrooms and
h.kitchen = h2.kitchen and
h.checkin = h2.checkin and
h.priority = h2.priority
GROUP BY h.resort_id, h.bedrooms, h.kitchen, h.checkin, h.priority
) AS I2
on i.id = i2.id
WHERE rowNumber >= #startIndex
AND rowNumber < #upperBound
ORDER BY rowNumber

TSQL Compare 2 select's result and return result with most recent date

Wonder if someone could give me a quick hand. I have 2 select queries (as shown below) and I want to compare the results of both and only return the result that has the most recent date.
So say I have the following 2 results from the queries:-
--------- ---------- ----------------------- --------------- ------ --
COMPANY A EMPLOYEE A 2007-10-16 17:10:21.000 E-mail 6D29D6D5 SYSTEM 1
COMPANY A EMPLOYEE A 2007-10-15 17:10:21.000 E-mail 6D29D6D5 SYSTEM 1
I only want to return the result with the latest date (so the first one). I thought about putting the results into a temporary table and then querying that but just wondering if there's a simpler, more efficient way?
SELECT * FROM (
SELECT fc.accountidname, fc.owneridname, fap.actualend, fap.activitytypecodename, fap.createdby, fap.createdbyname,
ROW_NUMBER() OVER (PARTITION BY fc.accountidname ORDER BY fap.actualend DESC) AS RN
FROM FilteredContact fc
INNER JOIN FilteredActivityPointer fap ON fc.parentcustomerid = fap.regardingobjectid
WHERE fc.statecodename = 'Active'
AND fap.ownerid = '0F995BDC'
AND fap.createdon < getdate()
) tmp WHERE RN = 1
SELECT * FROM (
SELECT fa.name, fa.owneridname, fa.new_technicalaccountmanageridname, fa.new_customerid, fa.new_riskstatusname,
fa.new_numberofopencases, fa.new_numberofurgentopencases, fap.actualend, fap.activitytypecodename, fap.createdby, fap.createdbyname,
ROW_NUMBER() OVER (PARTITION BY fa.name ORDER BY fap.actualend DESC) AS RN
FROM FilteredAccount fa
INNER JOIN FilteredActivityPointer fap ON fa.accountid = fap.regardingobjectid
WHERE fa.statecodename = 'Active'
AND fap.ownerid = '0F995BDC'
AND fap.createdon < getdate()
) tmp2 WHERE RN = 1
if the tables have the same structure (column count and column types to match), then you could just union the results of the two queries, then order by the date desc and then select the top 1.
select top 1 * from
(
-- your first query
union all
-- your second query.
) T
order by YourDateColumn1 desc
You should GROUP BY and use MAX(createdon)