SQL Select based on each row of previous select

I have a table with answers to different questions, all of them numbered. It has these columns: IdAnswer (unique for each answer in the table), IdUser (which won't repeat even if the same user answers the questions a second time), IdQuestion and Answer.
IdAnswer IdUser IdQuestion Answer
1 John 1 0
2 John 4 1
3 John 5 1
4 John 6 0
5 Bob 1 1
6 Bob 3 1
7 Bob 5 0
8 Mark 2 0
9 Mark 7 1
10 Mark 5 0
I'd like to select from this table all answers to a specific question (say, IdQuestion = 5), and also the last question each user answered just before question number 5.
In the end I need a table that should look like this:
IdAnswer IdUser IdQuestion Answer
2 John 4 1
3 John 5 1
6 Bob 3 1
7 Bob 5 0
9 Mark 7 1
10 Mark 5 0
I've managed to make this work using a cursor to iterate through each line from the first SELECT result (which filters by IdQuestion), but I'm not sure if this is the best (and fastest) way of doing it. Is there any more efficient way of achieving the same result?
And by the way, I'm using SQL Server Management Studio 2012.

Here is one way, using the LEAD function (SQL Server 2012 and later):
SELECT *
FROM (
    SELECT *, NextQ = LEAD(IdQuestion) OVER (PARTITION BY IdUser ORDER BY IdAnswer)
    FROM Yourtable
) a
WHERE 5 IN (IdQuestion, NextQ)
For older versions:
;WITH cte
AS (SELECT prev_id = Min(CASE WHEN IdQuestion = 5 THEN rn - 1 END) OVER( partition BY IdUser),*
FROM (SELECT rn = Row_number()OVER(partition BY IdUser ORDER BY IdAnswer),*
FROM Yourtable)a)
SELECT *
FROM cte
WHERE rn IN ( prev_id, prev_id + 1 )
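To check the LEAD approach against the sample data from the question, here is a minimal sketch that runs it through Python's sqlite3 module. SQLite (3.25 or later, for window functions) is only a stand-in for SQL Server here, so the T-SQL `NextQ = ...` alias is written as `... AS NextQ`; the table name Yourtable is the placeholder used in the answer.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; window functions need SQLite 3.25+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Yourtable ("
    "IdAnswer INTEGER PRIMARY KEY, IdUser TEXT, IdQuestion INTEGER, Answer INTEGER)"
)
conn.executemany(
    "INSERT INTO Yourtable VALUES (?, ?, ?, ?)",
    [
        (1, "John", 1, 0), (2, "John", 4, 1), (3, "John", 5, 1), (4, "John", 6, 0),
        (5, "Bob", 1, 1), (6, "Bob", 3, 1), (7, "Bob", 5, 0),
        (8, "Mark", 2, 0), (9, "Mark", 7, 1), (10, "Mark", 5, 0),
    ],
)

# NextQ is the question the same user answered next (NULL for the last answer).
lead_query = """
SELECT IdAnswer, IdUser, IdQuestion, Answer
FROM (
    SELECT *,
           LEAD(IdQuestion) OVER (PARTITION BY IdUser ORDER BY IdAnswer) AS NextQ
    FROM Yourtable
) a
WHERE 5 IN (IdQuestion, NextQ)
ORDER BY IdAnswer
"""
lead_rows = conn.execute(lead_query).fetchall()
for row in lead_rows:
    print(row)
```

A row is kept either because it is the answer to question 5 itself or because the user's next answer is to question 5, which is why no cursor is needed.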


PostgreSQL group by and count on specific condition

I have the following tables (example)
Analyze_Line
id game_id bet_result game_type
1 1 WIN 0
2 2 LOSE 0
3 3 WIN 0
4 4 LOSE 0
5 5 LOSE 0
6 6 WIN 0
Game
id league_id home_team_id away_team_id
1 1 1 2
2 2 2 3
3 3 3 4
4 1 1 2
5 2 2 3
6 3 3 4
Required Data:
league_id WIN LOSE GameCnt
1 1 1 2
2 0 2 2
3 2 0 2
The Analyze_Line table is joined with the Game table, and I can simply get GameCnt by grouping by league_id, but I am not sure how to calculate the WIN count and LOSE count from bet_result.
You can use a conditional expression inside an aggregate function to split the win and lose bet results per league.
select
g.league_id,
sum(case when a.bet_result = 'WIN' then 1 else 0 end) as win,
sum(case when a.bet_result = 'LOSE' then 1 else 0 end) as lose,
count(*) as gamecnt
from
game g
inner join analyze_line a on
g.id = a.game_id
group by
g.league_id
Since the PostgreSQL version isn't mentioned, I can't recommend the FILTER clause (standard SQL, but supported by few databases besides PostgreSQL), since it might not work for you.
Adding to Kamil's answer - PostgreSQL introduced the filter clause in PostgreSQL 9.4, released about eight years ago (December 2014). At this point, I think it's safe enough to use in answers. IMHO, it's a tad more elegant than summing over a case expression, but it does have the drawback that few other databases support it, so it is not very portable:
SELECT g.league_id,
COUNT(*) FILTER (WHERE a.bet_result = 'WIN') AS win,
COUNT(*) FILTER (WHERE a.bet_result = 'LOSE') AS lose,
COUNT(*) AS gamecnt
FROM game g
JOIN analyze_line a ON g.id = a.game_id
GROUP BY g.league_id
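One detail worth checking with either style is what a league with no matching rows returns. The sketch below runs the conditional aggregates on the sample data through Python's sqlite3 module (SQLite is an assumed stand-in for PostgreSQL, and only the columns the query needs are created). It compares SUM over a CASE without an ELSE branch, SUM with ELSE 0, and COUNT over a CASE.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; the aggregate semantics shown
# here (SUM over only NULLs is NULL, COUNT ignores NULLs) are standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE game (id INTEGER PRIMARY KEY, league_id INTEGER);
CREATE TABLE analyze_line (id INTEGER PRIMARY KEY, game_id INTEGER, bet_result TEXT);
INSERT INTO game VALUES (1, 1), (2, 2), (3, 3), (4, 1), (5, 2), (6, 3);
INSERT INTO analyze_line VALUES
    (1, 1, 'WIN'), (2, 2, 'LOSE'), (3, 3, 'WIN'),
    (4, 4, 'LOSE'), (5, 5, 'LOSE'), (6, 6, 'WIN');
""")

conditional_query = """
SELECT g.league_id,
       SUM(CASE WHEN a.bet_result = 'WIN' THEN 1 END)        AS win_no_else,
       SUM(CASE WHEN a.bet_result = 'WIN' THEN 1 ELSE 0 END) AS win,
       COUNT(CASE WHEN a.bet_result = 'LOSE' THEN 1 END)     AS lose,
       COUNT(*)                                              AS gamecnt
FROM game g
JOIN analyze_line a ON g.id = a.game_id
GROUP BY g.league_id
ORDER BY g.league_id
"""
league_rows = conn.execute(conditional_query).fetchall()
for row in league_rows:
    print(row)
```

League 2 has no wins, so the SUM without ELSE yields NULL (None in Python) for it, while the ELSE 0 and COUNT forms yield 0, which is what the required output asks for.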

How to optimize query

I have the same problem as described in "In SQL, how to select the top 2 rows for each group". The answer there works fine, but it takes too much time. How can I optimize this query?
Example:
sample_table
act_id: act_cnt:
1 1
2 1
3 1
4 1
5 1
6 3
7 3
8 3
9 4
a 4
b 4
c 4
d 4
e 4
Now I want to group it (or use some other approach) and select 2 rows from each group. Sample output:
act_id: act_cnt:
1 1
2 1
6 3
7 3
9 4
a 4
I am new to SQL. How to do it?
The answer you linked to uses an inefficient workaround for MySQL's lack of window functions.
Using a window function is most probably much faster as you only need to read the table once:
select name,
score
from (
select name,
score,
dense_rank() over (partition by name order by score desc) as rnk
from the_table
) t
where rnk <= 2;
SQLFiddle: http://sqlfiddle.com/#!15/b0198/1
Having an index on (name, score) should speed up this query.
Edit after the question (and the problem) has been changed
select act_id,
act_cnt
from (
select act_id,
act_cnt,
row_number() over (partition by act_cnt order by act_id) as rn
from sample_table
) t
where rn <= 2;
New SQLFiddle: http://sqlfiddle.com/#!15/fc44b/1
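The two queries above use different ranking functions, and the choice matters when values tie. The following sketch, run through Python's sqlite3 module (an assumed stand-in needing SQLite 3.25+ for window functions; the rows are made-up illustration data), shows that dense_rank() can return more than two rows per group while row_number() never does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE the_table (name TEXT, score INTEGER);
-- Hypothetical data: group 'a' has a tie for the top score.
INSERT INTO the_table VALUES ('a', 10), ('a', 10), ('a', 9), ('a', 8);
""")

def top_two(ranking_function):
    # ranking_function is spliced into the SQL text, so pass only trusted names.
    sql = f"""
    SELECT name, score
    FROM (
        SELECT name, score,
               {ranking_function}() OVER (PARTITION BY name ORDER BY score DESC) AS rnk
        FROM the_table
    ) t
    WHERE rnk <= 2
    ORDER BY score DESC
    """
    return conn.execute(sql).fetchall()

dense_rows = top_two("dense_rank")   # ties share a rank: both 10s and the 9
rownum_rows = top_two("row_number")  # exactly two rows, ties broken arbitrarily
print(dense_rows)
print(rownum_rows)
```

So dense_rank() answers "rows holding the two highest distinct scores", while row_number() answers "any two rows per group", which is the form the edited question needs.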

T-SQL How to get all items from a tree in T-SQL?

I have a problem with a t-sql query.
Let's say I have a categories tree (categories ID)
cat_table
1
|
2-\
| 3-\
6 | 5
| 4 |
... ...
ads_table
ad_ID
category_ID
Of course, the category_ID column references the ID column in the cat_table.
The problem is how to get (recursively?) all advertisements from all categories whose top-most parent is the 1st category.
Try using a recursive Common Table Expression, aka "CTE" (available in SQL Server 2005 and up), like this:
--go through a nested table supervisor - user table and display the chain
DECLARE @Contacts table (id varchar(6), first_name varchar(10), reports_to_id varchar(6))
INSERT @Contacts VALUES ('1','Jerome', NULL )  -- tree is as follows:
INSERT @Contacts VALUES ('2','Joe'   ,'1')     --                      1-Jerome
INSERT @Contacts VALUES ('3','Paul'  ,'2')     --                     /        \
INSERT @Contacts VALUES ('4','Jack'  ,'3')     --              2-Joe           9-Bill
INSERT @Contacts VALUES ('5','Daniel','3')     --            /       \              \
INSERT @Contacts VALUES ('6','David' ,'2')     --     3-Paul          6-David       10-Sam
INSERT @Contacts VALUES ('7','Ian'   ,'6')     --    /      \            /    \
INSERT @Contacts VALUES ('8','Helen' ,'6')     -- 4-Jack  5-Daniel   7-Ian    8-Helen
INSERT @Contacts VALUES ('9','Bill ' ,'1')     --
INSERT @Contacts VALUES ('10','Sam'  ,'9')     --
DECLARE @Root_id char(4)
--get 2 and below
SET @Root_id=2
PRINT '@Root_id='+COALESCE(''''+@Root_id+'''','null')
;WITH StaffTree AS
(
    -- anchor: the root row (or every top-level row when @Root_id is NULL)
    SELECT
        c.id, c.first_name, c.reports_to_id, c.reports_to_id as Manager_id, cc.first_name AS Manager_first_name, 1 AS LevelOf
    FROM @Contacts c
        LEFT OUTER JOIN @Contacts cc ON c.reports_to_id=cc.id
    WHERE c.id=@Root_id OR (@Root_id IS NULL AND c.reports_to_id IS NULL)
    UNION ALL
    -- recursive step: everyone reporting to a row already in StaffTree
    SELECT
        s.id, s.first_name, s.reports_to_id, t.id, t.first_name, t.LevelOf+1
    FROM StaffTree t
        INNER JOIN @Contacts s ON t.id=s.reports_to_id
    WHERE s.reports_to_id=@Root_id OR @Root_id IS NULL OR t.LevelOf>1
)
SELECT * FROM StaffTree
output:
@Root_id='2   '
id first_name reports_to_id Manager_id Manager_first_name LevelOf
------ ---------- ------------- ---------- ------------------ -----------
2 Joe 1 1 Jerome 1
3 Paul 2 2 Joe 2
6 David 2 2 Joe 2
7 Ian 6 6 David 3
8 Helen 6 6 David 3
4 Jack 3 3 Paul 3
5 Daniel 3 3 Paul 3
(7 row(s) affected)
Change @Root_id to get different output:
@Root_id=null
id first_name reports_to_id Manager_id Manager_first_name LevelOf
------ ---------- ------------- ---------- ------------------ -----------
1 Jerome NULL NULL NULL 1
2 Joe 1 1 Jerome 2
9 Bill 1 1 Jerome 2
10 Sam 9 9 Bill 3
3 Paul 2 2 Joe 3
6 David 2 2 Joe 3
7 Ian 6 6 David 4
8 Helen 6 6 David 4
4 Jack 3 3 Paul 4
5 Daniel 3 3 Paul 4
(10 row(s) affected)
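Applied to the tables from the question, the same idea might look like the sketch below, run through Python's sqlite3 module. Several things here are assumptions: SQLite stands in for SQL Server (SQLite spells it WITH RECURSIVE, T-SQL uses plain WITH), the question never shows how cat_table stores the hierarchy so parent_ID is a hypothetical column, and the rows are made-up illustration data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- parent_ID is a hypothetical column; NULL marks a top-level category.
CREATE TABLE cat_table (ID INTEGER PRIMARY KEY, parent_ID INTEGER);
CREATE TABLE ads_table (ad_ID INTEGER PRIMARY KEY, category_ID INTEGER);
INSERT INTO cat_table VALUES
    (1, NULL), (2, 1), (3, 2), (6, 2), (4, 3), (5, 3), (7, NULL), (8, 7);
INSERT INTO ads_table VALUES (101, 1), (102, 4), (103, 6), (104, 8);
""")

# Anchor member: category 1. Recursive member: children of categories found so far.
tree_query = """
WITH RECURSIVE CatTree (ID) AS (
    SELECT ID FROM cat_table WHERE ID = 1
    UNION ALL
    SELECT c.ID
    FROM cat_table c
    JOIN CatTree t ON c.parent_ID = t.ID
)
SELECT a.ad_ID, a.category_ID
FROM ads_table a
JOIN CatTree t ON t.ID = a.category_ID
ORDER BY a.ad_ID
"""
tree_ads = conn.execute(tree_query).fetchall()
print(tree_ads)
```

Ad 104 is excluded because its category (8) sits under the other top-level category (7).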
There is an option to avoid recursion in tree-browsing queries. You can add a 'Path' column to your categories table. It should contain each element's ancestor IDs, delimited with some non-numeric character (like a slash).
For example, your "ID=4" category's path would look like this: "/1/2/3/"
Now when you join your ads table to categories, you need to do the following:
select * from ads_table
inner join cat_table on cat_table.ID = ads_table.category_ID
where cat_table.Path like '/1/%'
And that's your query.
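A quick way to try the path idea is the sketch below, using Python's sqlite3 module as an assumed stand-in for SQL Server. The rows are made-up illustration data, and storing "/" as the path of a top-level category is an assumption, since the answer only shows the path for category 4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Path holds the ancestors only, e.g. category 4 sits under 1 -> 2 -> 3.
CREATE TABLE cat_table (ID INTEGER PRIMARY KEY, Path TEXT);
CREATE TABLE ads_table (ad_ID INTEGER PRIMARY KEY, category_ID INTEGER);
INSERT INTO cat_table VALUES
    (1, '/'), (2, '/1/'), (3, '/1/2/'), (6, '/1/2/'),
    (4, '/1/2/3/'), (5, '/1/2/3/'), (7, '/'), (8, '/7/');
INSERT INTO ads_table VALUES (101, 1), (102, 4), (103, 6), (104, 8);
""")

# Ads in any descendant of category 1.
descendant_query = """
SELECT ads_table.ad_ID
FROM ads_table
INNER JOIN cat_table ON cat_table.ID = ads_table.category_ID
WHERE cat_table.Path LIKE '/1/%'
ORDER BY ads_table.ad_ID
"""
descendant_ads = [r[0] for r in conn.execute(descendant_query)]

# Same, but also including ads attached directly to category 1.
subtree_query = descendant_query.replace(
    "WHERE cat_table.Path LIKE '/1/%'",
    "WHERE cat_table.Path LIKE '/1/%' OR cat_table.ID = 1",
)
subtree_ads = [r[0] for r in conn.execute(subtree_query)]
print(descendant_ads, subtree_ads)
```

Because the path lists only ancestors, the LIKE filter alone skips ads attached to category 1 itself; the extra `OR cat_table.ID = 1` condition covers them. The trailing slash in the pattern also keeps '/1/%' from matching a path such as '/10/'.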
You can read more on this topic on my blog post
Are you familiar with Common Table Expressions in SQL Server? One of the many uses a CTE has is to do recursive queries.
The following is one of the best articles I've found on the subject:
https://web.archive.org/web/20210927200924/http://www.4guysfromrolla.com/webtech/071906-1.shtml

Select max value rows from table column

My table looks like this:
id name count
-- ---- -----
1 Mike 0
2 Duke 2
3 Smith 1
4 Dave 6
5 Rich 3
6 Rozie 8
7 Romeo 0
8 Khan 1
----------------------
I want to select the rows with the highest count, limited to 5 (the top 5 names by count).
That would look something like:
id name count
-- ---- -----
6 Rozie 8
4 Dave 6
5 Rich 3
2 Duke 2
3 Smith 1
Please help.
Thanks
Here is how:
MySQL:
SELECT * FROM tableName ORDER BY count DESC LIMIT 5
MS SQL:
SELECT TOP 5 * FROM tableName ORDER BY count DESC

SQL Server Multiple Running Totals

I have a table like this
UserID Score Date
5 6 2010-1-1
7 8 2010-1-2
5 4 2010-1-3
6 3 2010-1-4
7 4 2010-1-5
6 1 2010-1-6
I would like to get a table like this
UserID Score RunningTotal Date
5 6 6 2010-1-1
5 4 10 2010-1-3
6 3 3 2010-1-4
6 1 4 2010-1-6
7 8 8 2010-1-2
7 4 12 2010-1-5
Thanks!
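Unlike Oracle, PostgreSQL and even MySQL, SQL Server before the 2012 release has no efficient way to calculate running totals (SQL Server 2012 added ORDER BY support to windowed aggregates such as SUM() OVER).
If you have few scores per UserID, you can use this:
WITH so AS
(
    -- number each user's scores in date order
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY Date) AS rn
    FROM scores
)
SELECT so.UserID, so.Score,
    (
    SELECT SUM(si.Score)
    FROM so si
    WHERE si.UserID = so.UserID
        AND si.rn <= so.rn
    ) AS RunningTotal,
    so.Date
FROM so
ORDER BY so.UserID, so.rn
However, this will be very inefficient for larger tables.
For larger tables, you could benefit from using (God help me) a cursor.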
Would something like this work for you...?
SELECT UserID, Score,
(SELECT SUM(Score)
FROM TableName innerTable
WHERE innerTable.UserID = outerTable.userID
AND innerTable.Date <= outerTable.date) AS RunningTotal
FROM TableName outerTable
This assumes, though, that a user cannot have more than one score per day. (What is your PK?)
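For comparison, on engines that accept ORDER BY inside a windowed SUM (SQL Server 2012 and later among them), the running total needs neither a correlated subquery nor a cursor. The sketch below is run through Python's sqlite3 module (an assumed stand-in needing SQLite 3.25+); the table name scores is taken from the first answer, and the dates are written in ISO form so that text ordering matches date ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (UserID INTEGER, Score INTEGER, Date TEXT);
INSERT INTO scores VALUES
    (5, 6, '2010-01-01'), (7, 8, '2010-01-02'), (5, 4, '2010-01-03'),
    (6, 3, '2010-01-04'), (7, 4, '2010-01-05'), (6, 1, '2010-01-06');
""")

# ROWS BETWEEN ... makes each row add only the rows up to itself, so two
# scores on the same date would still get separate running totals.
running_query = """
SELECT UserID, Score,
       SUM(Score) OVER (PARTITION BY UserID ORDER BY Date
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal,
       Date
FROM scores
ORDER BY UserID, Date
"""
running_rows = conn.execute(running_query).fetchall()
for row in running_rows:
    print(row)
```

The table is read once and each partition is accumulated in order, which avoids the per-row subquery of the approaches above.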