Problem counting items in an individual row without duplication - tsql

I'm trying to write a query that includes a count of the primary and secondary activities for each individual RecordID, but only when GroupID = 260 and ItemID is in (1302, 1303, 1305, 1306). So far I've been able to single out the rows matching those conditions, but I only want to count the primary and secondary activities once per RecordID (they are the same for every row of a given RecordID) and only when they aren't null, regardless of how many rows match those conditions.
RecordID: GroupID: ItemID: PrimActivity: SecActivity:
320 260 1302 36 0
320 260 6456 36 0
320 312 1303 36 0
560 400 1302 46 48
560 312 1305 46 48
460 260 1305 45 56
460 260 1302 45 56
Result I'm getting:
RecordID: Count:
320 2
460 4
Expected result:
RecordID: Count:
320 1
460 2
SELECT dfr.RecordID,
       COUNT(CASE WHEN dfr.PrimActivity <> 0 AND a.GroupID = 260 THEN 1 ELSE NULL END) +
       COUNT(CASE WHEN dfr.SecActivity <> 0 AND a.GroupID = 260 THEN 1 ELSE NULL END) AS Count
FROM ActivityItem ai
JOIN DailyRecord dfr ON ai.PrimActivity = dfr.PrimActivity
JOIN AreaInfo af ON af.AreaInfoID = dfr.AreaInfoID
JOIN Information a ON dfr.RecordID = a.RecordID
JOIN Lookup lp ON lp.ItemID = a.ItemID
WHERE a.GroupID LIKE '260'
  AND EXISTS (SELECT b.RecordID, b.GroupID, b.ItemID
              FROM AreaInfo b
              WHERE a.RecordID = b.RecordID
                AND b.ItemID IN (1302, 1303, 1305, 1306))
GROUP BY dfr.RecordID

You should be clearer when explaining the structure of the tables you are using. However, I reach the expected result from your sample table by doing this:
SELECT RecordID, COUNT(*) AS Count
FROM (SELECT DISTINCT RecordID, ItemID, PrimActivity, SecActivity
      FROM [TABLE YOU POSTED]
      WHERE GroupID = 260 AND ItemID IN (1302, 1303, 1305, 1306)) A
GROUP BY RecordID
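If the count has to come out of the original multi-table query rather than the posted sample, the same de-duplicate-then-count idea can be folded in as a derived table. The sketch below is only an illustration against the table and column names visible in the question's query, and it assumes GroupID and ItemID live on Information (alias a) as the WHERE clause suggests:
-- Sketch only: collapse to one row per RecordID first, then count the
-- non-zero activities, so each activity is counted at most once.
SELECT x.RecordID,
       COUNT(CASE WHEN x.PrimActivity <> 0 THEN 1 END) +
       COUNT(CASE WHEN x.SecActivity  <> 0 THEN 1 END) AS [Count]
FROM (SELECT DISTINCT dfr.RecordID, dfr.PrimActivity, dfr.SecActivity
      FROM DailyRecord dfr
      JOIN Information a ON a.RecordID = dfr.RecordID
      WHERE a.GroupID = 260
        AND a.ItemID IN (1302, 1303, 1305, 1306)) x
GROUP BY x.RecordID;
Because PrimActivity and SecActivity are constant for a given RecordID, the DISTINCT collapses each record to one row, so the counts come out as 1 or 2 as expected.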

Related

Problem Counting Items For an Individual Row

I need to find the count of ActivityID and AdditionalActivityID for each DailyFieldRecordID when GroupID = 260 and ItemID is one of 1302, 1303, 1305, 1306. The problem is that no matter how many rows there are for each individual DailyFieldRecordID, there can only be one ActivityID and one AdditionalActivityID, even when many rows comply with the constraints.
Say someone is filling out a form and they list what their activity for the day was and what other activity they might have done. They can only list one primary activity (ActivityID) and one secondary activity (AdditionalActivityID), but during those activities they could participate with multiple groups (GroupID) or people (ItemID). When I run my query I'm able to separate the rows based on the constraints, but I only want to count how many activities they participated in, which will be either 1 or 2 for each DailyFieldRecordID, regardless of how many groups or people were involved. Right now my query counts ActivityID and AdditionalActivityID for every row that meets the criteria, which can give me many more than 1 or 2 per DailyFieldRecordID. I'm just not sure how to go about doing this. Any feedback is greatly appreciated.
DailyFieldRecordID: GroupID: ItemID: ActivityID: AdittionalActivityID:
3369320 260 1302 37 0
3369320 260 1305 37 0
3369320 210 2222 37 0
3369320 250 2222 37 0
3372806 260 1302 56 56
3372806 260 1305 56 56
3372806 250 2222 56 56
3388888 260 2222 45 32
Expected Result:
DailyFieldRecordID: Count:
3369320 1
3372806 2
Current Result:
DailyFieldRecordID: Count:
3369320 2
3372806 4
select a.DailyFieldRecordID,
count(case when a.ActivityID <>0 then 1 else null end) +
count(case when a.AdditionalActivityID <>0 then 1 else null end) as count
from AB953 a
where a.GroupID= 260 and exists(
select b.DailyFieldRecordID
from AB953 b
where a.DailyFieldRecordID = b.DailyFieldRecordID and b.ItemID in (1302,1303,1305,1306))
group by DailyFieldRecordID
I get this result when trying your data:
DailyfieldrecordID Count
3369320 3
3372806 2
3388888 1
SELECT DailyFieldRecordID,
       COUNT(CASE WHEN ActivityID <> 0 THEN 1 ELSE 0 END +
             CASE WHEN AdditionalActivityID <> 0 THEN 1 ELSE 0 END) AS Count
FROM Foo
WHERE GroupID = 260 AND EXISTS(
    SELECT b.DailyFieldRecordID
    FROM Foo b
    WHERE DailyFieldRecordID = b.DailyFieldRecordID AND b.ItemID IN (1302, 1303, 1305, 1306))
GROUP BY DailyFieldRecordID
New query: you might need to fiddle with this; I'm not sure whether your data is wrong or not, as I can't get it to select 3 and then 2:
SELECT DailyFieldRecordID,
       COUNT(CASE WHEN ActivityID <> 0 THEN 1 ELSE 0 END +
             CASE WHEN AdditionalActivityID <> 0 THEN 1 ELSE 0 END) AS Count
FROM Foo
WHERE GroupID = 260 AND DailyFieldRecordID IN(
    SELECT b.DailyFieldRecordID
    FROM Foo b
    WHERE b.ItemID IN (1302, 1303, 1305, 1306))
GROUP BY DailyFieldRecordID
This should do it:
;WITH CTE AS
(
SELECT A.DailyFieldRecordID
,ActivityID = IIF(A.ActivityID = 0, NULL, A.ActivityID)
,AdittionalActivityID = IIF(A.AdittionalActivityID = 0, NULL, A.AdittionalActivityID)
FROM AB953 A
WHERE A.GroupID = 260
AND A.ItemID IN (1302,1303,1305,1306)
)
SELECT DailyFieldRecordID
,CNT = COUNT(DISTINCT ActivityID) + COUNT(DISTINCT AdittionalActivityID)
FROM CTE
GROUP BY DailyFieldRecordID;
I created this DDL and test data for testing:
DROP TABLE IF EXISTS AB953
GO
CREATE TABLE AB953 (
DailyFieldRecordID INT, GroupID INT, ItemID INT, ActivityID INT, AdittionalActivityID INT)
INSERT INTO AB953
VALUES
( 3369320, 260, 1302, 37, 0 )
,( 3369320, 260, 1305, 37, 0 )
,( 3369320, 210, 2222, 37, 0 )
,( 3369320, 250, 2222, 37, 0 )
,( 3372806, 260, 1302, 56, 56 )
,( 3372806, 260, 1305, 56, 56 )
,( 3372806, 250, 2222, 56, 56 )
,( 3388888, 260, 2222, 45, 32 )
GO
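For comparison, here is a single-statement variant against the same AB953 test table. This is just a sketch, but NULLIF folds the 0 placeholders into NULLs so COUNT(DISTINCT ...) skips them, which should give the same counts as the CTE version (1 for 3369320 and 2 for 3372806):
-- Sketch: conditional counting without the CTE, using the AB953 test data above.
SELECT DailyFieldRecordID,
       COUNT(DISTINCT NULLIF(ActivityID, 0)) +
       COUNT(DISTINCT NULLIF(AdittionalActivityID, 0)) AS CNT
FROM AB953
WHERE GroupID = 260
  AND ItemID IN (1302, 1303, 1305, 1306)
GROUP BY DailyFieldRecordID;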

PostgreSQL: comparing two sets of results does not work

I have a table that contains three columns of IDs (clothes, shoes, customers) and relates them.
I have a query that works fine :
select clothes, shoes from table where customers = 101 (all clothes and shoes of customer 101). This returns
clothes - shoes (SET A)
1 6
1 2
33 12
24 null
Another query that works fine :
select clothes ,shoes from table
where customers in
(select customers from table where clothes = 1 and customers <> 101 ) (all clothes and shoes of any other customer than 101, with specified clothes). This returns
shoes - clothes(SET B)
6 null
null 24
1 1
2 1
12 null
null 26
14 null
Now I want to get all clothes and shoes from SET A that are not in SET B.
So (example) select from SET A where NOT IN SET B. This should return just clothes 33, right?
I try to convert this to a working query :
select clothes, shoes from table where customers = 101
and
(clothes,shoes) not in
(
select clothes,shoes from
table where customers in
(select customers from table where clothes = 1 and customers <> 101 )
) ;
I tried different syntaxes, but the above looks the most logical.
The problem is I never get clothes 33, just an empty set.
How do I fix this? What is going wrong?
Thanks
Edit: here are the contents of the table
id shoes customers clothes
1 1 1 1
2 1 4 1
3 1 5 1
4 2 2 2
5 2 3 1
6 1 3 1
44 2 101 1
46 6 101 1
49 12 101 33
51 13 102
52 101 24
59 107 51
60 107 24
62 23 108 51
63 23 108 2
93 124 25
95 6 125
98 127 25
100 3 128
103 24 131
104 25 132
105 102 28
106 10 102
107 23 133
108 4 26
109 6 4
110 4 24
111 12 4
112 14 4
116 102 48
117 102 24
118 102 25
119 102 26
120 102 29
122 134 31
The except clause in PostgreSQL works the way the minus operator does in Oracle. I think that will give you what you want.
I think notionally your query looks right, but I suspect those pesky nulls are impacting your results. Just like a null is not-NOT equal to 5 (it's nothing, therefore it's neither equal to nor not equal to anything), a null is also not-NOT "in" anything...
select clothes, shoes
from table1
where customers = 101
except
select clothes, shoes
from table1
where customers in (
select customers
from table1
where clothes = 1 and customers != 101
)
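If it helps to see the NULL behaviour in isolation, here is a minimal standalone demonstration (not from the original post) of why the (clothes, shoes) NOT IN (...) form comes back empty as soon as the subquery produces a NULL:
-- NOT IN evaluates to NULL rather than TRUE when the list contains a NULL,
-- so the outer row is silently dropped.
SELECT 33 AS val WHERE 33 NOT IN (SELECT unnest(ARRAY[1, 24, NULL]));  -- 0 rows
SELECT 33 AS val WHERE 33 NOT IN (SELECT unnest(ARRAY[1, 24]));        -- returns 33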
In PostgreSQL, NULL is an undefined value, so you must get rid of potential NULLs in your subquery results:
select id, clothes, shoes from t1 where customers = 101 -- or select id...
and (
    clothes not in
    (
        select COALESCE(clothes, -1)
        from t1
        where customers in (select customers from t1 where clothes = 1 and customers <> 101)
    )
    OR
    shoes not in
    (
        select COALESCE(shoes, -1)
        from t1
        where customers in (select customers from t1 where clothes = 1 and customers <> 101)
    )
)
If you wanted unique pairs, you would use:
select clothes, shoes from t1 where customers = 101
and (clothes, shoes) not in
(
    select coalesce(clothes, -1), coalesce(shoes, -1)
    from t1
    where customers in (select customers from t1 where clothes = 1 and customers <> 101)
);
You can't get "clothes 33" if You are selecting both clothes and shoes columns...
Also if u need to know exactly which column, clothes or shoes was unique to this customer, You might use this little "hack":
select id, clothes, -1 AS shoes from t1 where customers = 101
and clothes not in
(
    select COALESCE(clothes, -1)
    from t1
    where customers in (select customers from t1 where clothes = 1 and customers <> 101)
)
UNION
select id, -1, shoes from t1 where customers = 101
and shoes not in
(
    select COALESCE(shoes, -1)
    from t1
    where customers in (select customers from t1 where clothes = 1 and customers <> 101)
)
And your result would be:
id=49, clothes=33, shoes=-1
(I assume there aren't any clothes or shoes with ID -1; you may put any exotic value there.)
Cheers
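As a side note (not part of the answers above), PostgreSQL's IS NOT DISTINCT FROM is a NULL-safe comparison, so a NOT EXISTS rewrite can sidestep the -1 sentinel entirely; a sketch against the same t1 table:
-- Sketch only: IS NOT DISTINCT FROM treats two NULLs as equal, so NULL pairs
-- are matched correctly without a sentinel value.
SELECT a.id, a.clothes, a.shoes
FROM t1 a
WHERE a.customers = 101
  AND NOT EXISTS (
        SELECT 1
        FROM t1 b
        WHERE b.customers IN (SELECT customers FROM t1 WHERE clothes = 1 AND customers <> 101)
          AND b.clothes IS NOT DISTINCT FROM a.clothes
          AND b.shoes   IS NOT DISTINCT FROM a.shoes
      );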

Redshift - Get a value from one column A for each ID in the grouping ID column B based on max value in another column C

I have a SQL problem (on Redshift) where I need to get the value of column index for each id in column id, based on the max value in column final_score, and put this value in a new column fav_index. score2 at index n equals the value of score1 at index n + 1; for example, for id = abc1 and index = 0, score1 = 10 and score2 is the value of score1 at index = 1, which is 20. final_score is the difference score2 - score1.
It's easier if you look at the score table below. This score table is the result of a SQL query which is shown further down.
id index score1 score2 final_score
abc1 0 10 20 10
abc1 1 20 45 25
abc1 2 45 (null) (null)
abc2 0 5 10 5
abc2 1 10 (null) (null)
abc3 0 50 30 -20
abc3 1 30 (null) (null)
So, the resulting table containing column fav_index should look like this:
id index score1 score2 final_score fav_index
abc1 0 10 20 10 0
abc1 1 20 45 25 1
abc1 2 45 (null) (null) 0
abc2 0 5 10 5 0
abc2 1 10 (null) (null) 0
abc3 0 50 30 -20 0
abc3 1 30 (null) (null) 0
Below is the script to generate table score from table story:
select
m.id,
m.index,
max(m.max) as score1,
fmt.score2,
round(fmt.score2 - max(m.max), 1) as final_score
from
(select
sv.id,
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
max(sv.score1)
from
story as sv
group by
sv.id,
index,
sv.score1
order by
sv.id,
index
) as m
left join
(select
sv.id,
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
max(score1) as score2
from
story as sv
group by
id,
index
) as fmt
on
m.id = fmt.id
and
m.index = fmt.index - 1
group by
m.id,
m.index,
fmt.score2
Table story is as below:
id story_number score1
abc1 1 10
abc1 2 10
abc1 3 20
abc1 4 20
abc1 5 45
abc1 6 45
The only solution I can think of is to do something like,
select id, max(final_score) from score group by id
and then join it back to the long script above (which was used to generate table score). I really want to avoid writing such a long script to get just 1 extra column of information that I need.
Is there a better way to do this?
Thank you!
Update: an answer in MySQL is also accepted. Thanks!
After spending more hours on this and asking people around, I finally figured out a solution by referring to the PostgreSQL window function documentation: https://www.postgresql.org/docs/9.1/static/tutorial-window.html
I basically added two SELECT statements at the top and one WHERE clause at the very bottom. The WHERE clause takes care of the rows where final_score is null, because otherwise the rank() function would rank them as 1.
My code then becomes:
select
id, index, final_score, rank, case when rank = 1 then index else null end as fav_index
from
(select
id, index, final_score, rank() over (partition by id order by final_score desc)
from
(select
m.id,
m.index,
max(m.max) as score1,
fmt.score2,
round(fmt.score2 - max(m.max), 1) as final_score
from
(select
sv.id,
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
max(sv.score1)
from
story as sv
group by
sv.id,
index,
sv.score1
order by
sv.id,
index
) as m
left join
(select
sv.id,
case when sv.story_number % 2 = 0 then cast(sv.story_number / 2 - 1 as int) else cast(floor(sv.story_number/2) as int) end as index,
max(score1) as score2
from
story as sv
group by
id,
index
) as fmt
on
m.id = fmt.id
and
m.index = fmt.index - 1
group by
m.id,
m.index,
fmt.score2)
where
final_score is not null)
And the result is as follows:
id index final_score rank fav_index
abc1 0 10 2 (null)
abc1 1 25 1 1
abc2 0 5 1 0
abc3 0 -20 1 0
The result is slightly different from what I stated in the question; however, the fav_index for each id is identified, and that is what I really needed. Hope this helps someone. Cheers
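For anyone revisiting this later, the whole thing can probably be collapsed further with LEAD(), which removes the self-join on story entirely. This is only a sketch against the story table from the question (untested on Redshift), so treat it as a starting point rather than a drop-in replacement:
-- Sketch: pair up story_number into an index, take LEAD() for score2,
-- and pick fav_index with RANK() in a single pass.
WITH collapsed AS (
    SELECT id,
           CASE WHEN story_number % 2 = 0
                THEN CAST(story_number / 2 - 1 AS int)
                ELSE CAST(FLOOR(story_number / 2) AS int)
           END AS index,
           MAX(score1) AS score1
    FROM story
    GROUP BY id, index
),
scored AS (
    SELECT id, index, score1,
           LEAD(score1) OVER (PARTITION BY id ORDER BY index) AS score2,
           ROUND(LEAD(score1) OVER (PARTITION BY id ORDER BY index) - score1, 1) AS final_score
    FROM collapsed
)
SELECT id, index, score1, score2, final_score,
       CASE WHEN final_score IS NOT NULL
             AND RANK() OVER (PARTITION BY id
                              ORDER BY final_score DESC NULLS LAST) = 1
            THEN index ELSE 0 END AS fav_index
FROM scored;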

Ordering row having same values in two columns on top

I have data similar to that given below:
ID UserID PlayerID Name
1 56 21 A
2 57 34 B
3 77 77 C
4 65 23 D
5 77 77 E
I want the rows with same value in UserID and PlayerID column to be at the top.
I have currently done this:
select * from tblTest
order by abs(UserID - PlayerID ) asc
Any better way to achieve this result?
Try this
SELECT * From tblTest
Order By Case When UserID = PlayerID Then 0 Else 1 End
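If the order within each group also matters, a secondary sort key keeps it deterministic; a small variation (assuming ID reflects the order you want to preserve):
SELECT *
FROM tblTest
ORDER BY CASE WHEN UserID = PlayerID THEN 0 ELSE 1 END,  -- matching rows first
         ID                                              -- then stable order by ID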

Extract Unique Time Slices in Oracle

I use Oracle 10g and I have a table that stores a snapshot of data on a person for a given day. Every night an outside process adds new rows to the table for any person who has had any changes to their core data (stored elsewhere). This allows a query to be written using a date to find out what a person 'looked' like on some past day. A new row is added to the table even if only a single aspect of the person has changed--the implication being that many columns carry duplicate values from slice to slice, since not every detail changes in each snapshot.
Below is a data sample:
SliceID PersonID StartDt Detail1 Detail2 Detail3 Detail4 ...
1 101 08/20/09 Red Vanilla N 23
2 101 08/31/09 Orange Chocolate N 23
3 101 09/15/09 Yellow Chocolate Y 24
4 101 09/16/09 Green Chocolate N 24
5 102 01/10/09 Blue Lemon N 36
6 102 01/11/09 Indigo Lemon N 36
7 102 02/02/09 Violet Lemon Y 36
8 103 07/07/09 Red Orange N 12
9 104 01/31/09 Orange Orange N 12
10 104 10/20/09 Yellow Orange N 13
I need to write a query that pulls out the time slice records where some pertinent bits, not the whole record, have changed. So, referring to the above, if I only want to know the slices in which Detail3 has changed from its previous value, then I would expect to get only the rows with SliceID 1, 3 and 4 for PersonID 101; SliceID 5 and 7 for PersonID 102; SliceID 8 for PersonID 103; and SliceID 9 for PersonID 104.
I'm thinking I should be able to use some sort of Oracle hierarchical query (using CONNECT BY [PRIOR]) to get what I want, but I have not figured out how to write it yet. Perhaps you can help.
Thank you for your time and consideration.
Here is my take on the LAG() solution, which is basically the same as that of egorius, but I show my workings ;)
SQL> select * from
2 (
3 select sliceid
4 , personid
5 , startdt
6 , detail3 as new_detail3
7 , lag(detail3) over (partition by personid
8 order by startdt) prev_detail3
9 from some_table
10 )
11 where prev_detail3 is null
12 or ( prev_detail3 != new_detail3 )
13 /
SLICEID PERSONID STARTDT N P
---------- ---------- --------- - -
1 101 20-AUG-09 N
3 101 15-SEP-09 Y N
4 101 16-SEP-09 N Y
5 102 10-JAN-09 N
7 102 02-FEB-09 Y N
8 103 07-JUL-09 N
9 104 31-JAN-09 N
7 rows selected.
SQL>
The point about this solution is that it hauls in results for 103 and 104, who don't have slice records where detail3 has changed. If that is a problem, we can apply an additional filter to return only rows with changes:
SQL> with subq as (
2 select t.*
3 , row_number () over (partition by personid
4 order by sliceid ) rn
5 from
6 (
7 select sliceid
8 , personid
9 , startdt
10 , detail3 as new_detail3
11 , lag(detail3) over (partition by personid
12 order by startdt) prev_detail3
13 from some_table
14 ) t
15 where t.prev_detail3 is null
16 or ( t.prev_detail3 != t.new_detail3 )
17 )
18 select sliceid
19 , personid
20 , startdt
21 , new_detail3
22 , prev_detail3
23 from subq sq
24 where exists ( select null from subq x
25 where x.personid = sq.personid
26 and x.rn > 1 )
27 order by sliceid
28 /
SLICEID PERSONID STARTDT N P
---------- ---------- --------- - -
1 101 20-AUG-09 N
3 101 15-SEP-09 Y N
4 101 16-SEP-09 N Y
5 102 10-JAN-09 N
7 102 02-FEB-09 Y N
SQL>
Edit: As egorius points out in the comments, the OP does want hits for all users, even if they haven't changed, so the first version of the query is the correct solution.
In addition to OMG Ponies' answer: if you need to query slices for all persons, you'll need partition by:
SELECT s.sliceid
, s.personid
FROM (SELECT t.sliceid,
t.personid,
t.detail3,
LAG(t.detail3) OVER (
PARTITION BY t.personid ORDER BY t.startdt
) prev_val
FROM t) s
WHERE (s.prev_val IS NULL OR s.prev_val != s.detail3)
I think you'll have better luck with the LAG function:
SELECT s.sliceid
FROM (SELECT t.sliceid,
t.personid,
t.detail3,
LAG(t.detail3) OVER (PARTITION BY t.personid ORDER BY t.startdt) AS prev_val
FROM TABLE t) s
WHERE s.personid = 101
AND (s.prev_val IS NULL OR s.prev_val != s.detail3)
Subquery Factoring alternative:
WITH slices AS (
SELECT t.sliceid,
t.personid,
t.detail3,
LAG(t.detail3) OVER (PARTITION BY t.personid ORDER BY t.startdt) AS prev_val
FROM TABLE t)
SELECT s.sliceid
FROM slices s
WHERE s.personid = 101
AND (s.prev_val IS NULL OR s.prev_val != s.detail3)
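If more than one pertinent column has to be watched for changes, the same LAG() pattern extends column by column. This is a sketch only (not from the answers above), reusing the some_table name from the first answer; DECODE is used so that a column that is NULL in both slices is not flagged as a change:
-- Sketch: flag a slice when Detail2 OR Detail3 differs from the previous slice.
-- DECODE(a, b, 0, 1) returns 0 when a and b match, treating NULL = NULL as a match.
SELECT s.*
FROM (SELECT t.sliceid,
             t.personid,
             t.startdt,
             t.detail2,
             t.detail3,
             LAG(t.detail2) OVER (PARTITION BY t.personid ORDER BY t.startdt) AS prev_detail2,
             LAG(t.detail3) OVER (PARTITION BY t.personid ORDER BY t.startdt) AS prev_detail3
      FROM some_table t) s
WHERE DECODE(s.detail2, s.prev_detail2, 0, 1) = 1
   OR DECODE(s.detail3, s.prev_detail3, 0, 1) = 1;
Like the accepted query, this still returns the first slice of each person, because LAG() is NULL there and so counts as a change.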