Select only group of records for where condition - tsql

I have a table similar to this
THid Sid TID Sealantid
1 1 1 1
2 1 2 1
3 1 3 4
4 1 4 1
5 1 5 1
6 1 6 1
33 2 1 1
34 2 2 1
35 2 3 1
36 2 4 1
37 2 5 1
38 2 6 1
65 3 1 1
66 3 2 1
67 3 3 4
68 3 4 1
69 3 5 1
70 3 6 1
97 4 1 1
98 4 2 1
99 4 3 8
100 4 4 1
101 4 5 1
102 4 6 1
129 5 1 1
130 5 2 1
131 5 3 8
132 5 4 1
133 5 5 1
134 5 6 1
161 6 1 1
162 6 2 1
163 6 3 4
164 6 4 1
165 6 5 1
166 6 6 1
193 7 1 1
194 7 2 1
195 7 3 4
196 7 4 1
197 7 5 1
198 7 6 1
225 8 1 1
226 8 2 1
227 8 3 4
228 8 4 1
229 8 5 1
230 8 6 1
257 9 1 1
258 9 2 1
259 9 3 1
260 9 4 1
261 9 5 1
262 9 6 1
289 10 1 1
290 10 2 1
291 10 3 4
292 10 4 1
293 10 5 1
294 10 6 1
I want to find only the Sids for which every Sealantid = 1.
I tried this simple query:
select * from table where sealantid=1
but it returns rows for every Sid. I want only the Sids whose TIDs (1 to 6) all have Sealantid = 1.
In this table that would be Sids 2 and 9.

You can use NOT EXISTS with a subquery
Try this:
WITH SampleData AS (
SELECT V.*
FROM (VALUES
(1, 1, 1, 1)
,(2, 1, 2, 1)
,(3, 1, 3, 4)
,(4, 1, 4, 1)
,(5, 1, 5, 1)
,(6, 1, 6, 1)
,(33, 2, 1, 1)
,(34, 2, 2, 1)
,(35, 2, 3, 1)
,(36, 2, 4, 1)
,(37, 2, 5, 1)
,(38, 2, 6, 1)
,(65, 3, 1, 1)
,(66, 3, 2, 1)
,(67, 3, 3, 4)
,(68, 3, 4, 1)
,(69, 3, 5, 1)
,(70, 3, 6, 1)
,(97, 4, 1, 1)
,(98, 4, 2, 1)
,(99, 4, 3, 8)
,(100, 4, 4, 1)
,(101, 4, 5, 1)
,(102, 4, 6, 1)
,(129, 5, 1, 1)
,(130, 5, 2, 1)
,(131, 5, 3, 8)
,(132, 5, 4, 1)
,(133, 5, 5, 1)
,(134, 5, 6, 1)
,(161, 6, 1, 1)
,(162, 6, 2, 1)
,(163, 6, 3, 4)
,(164, 6, 4, 1)
,(165, 6, 5, 1)
,(166, 6, 6, 1)
,(193, 7, 1, 1)
,(194, 7, 2, 1)
,(195, 7, 3, 4)
,(196, 7, 4, 1)
,(197, 7, 5, 1)
,(198, 7, 6, 1)
,(225, 8, 1, 1)
,(226, 8, 2, 1)
,(227, 8, 3, 4)
,(228, 8, 4, 1)
,(229, 8, 5, 1)
,(230, 8, 6, 1)
,(257, 9, 1, 1)
,(258, 9, 2, 1)
,(259, 9, 3, 1)
,(260, 9, 4, 1)
,(261, 9, 5, 1)
,(262, 9, 6, 1)
,(289, 10, 1, 1)
,(290, 10, 2, 1)
,(291, 10, 3, 4)
,(292, 10, 4, 1)
,(293, 10, 5, 1)
,(294, 10, 6, 1)
) AS V (THid, Sid, TID, Sealantid)
)
SELECT DISTINCT SD.Sid
FROM SampleData AS SD
WHERE NOT EXISTS (
SELECT 1 FROM SampleData AS C
WHERE SD.Sid = C.Sid AND C.Sealantid <> 1
)
You can try it on fiddle
An alternative is a LEFT JOIN (reusing the SampleData CTE above):
SELECT DISTINCT SD.Sid
FROM SampleData AS SD
LEFT JOIN (
SELECT DISTINCT Sid FROM SampleData WHERE Sealantid <> 1
) AS C
ON SD.Sid = C.Sid
WHERE C.Sid IS NULL;
or NOT IN:
SELECT DISTINCT SD.Sid
FROM SampleData AS SD
WHERE SD.Sid NOT IN (
SELECT DISTINCT Sid FROM SampleData WHERE Sealantid <> 1
);

You can use group by with having:
Select sid
From table
Group by sid
Having min(sealantId) = 1
And max(sealantId) = 1
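As a quick cross-check that the NOT EXISTS and GROUP BY ... HAVING approaches agree on the sample data, here is a minimal sketch using Python's built-in sqlite3 module. It rests on assumptions: it runs on SQLite rather than SQL Server, the table name sample_data is hypothetical, and the rows are regenerated from the pattern visible in the question (THid = 32*(Sid-1) + TID, Sealantid differing from 1 only at TID 3).

```python
import sqlite3

# Sealantid at TID 3 for each Sid, copied from the question's table;
# every other TID has Sealantid = 1.
tid3_sealant = {1: 4, 2: 1, 3: 4, 4: 8, 5: 8, 6: 4, 7: 4, 8: 4, 9: 1, 10: 4}
rows = [
    (32 * (sid - 1) + tid, sid, tid, tid3_sealant[sid] if tid == 3 else 1)
    for sid in range(1, 11)
    for tid in range(1, 7)
]

conn = sqlite3.connect(":memory:")
# "sample_data" is a hypothetical table name used only for this sketch.
conn.execute("CREATE TABLE sample_data (THid INT, Sid INT, TID INT, Sealantid INT)")
conn.executemany("INSERT INTO sample_data VALUES (?, ?, ?, ?)", rows)

queries = {
    # Keep a Sid only if no row of that Sid has a Sealantid other than 1.
    "not_exists": """
        SELECT DISTINCT SD.Sid FROM sample_data AS SD
        WHERE NOT EXISTS (
            SELECT 1 FROM sample_data AS C
            WHERE SD.Sid = C.Sid AND C.Sealantid <> 1)
        ORDER BY SD.Sid""",
    # Keep a Sid only if its smallest and largest Sealantid are both 1.
    "group_by_having": """
        SELECT Sid FROM sample_data
        GROUP BY Sid
        HAVING MIN(Sealantid) = 1 AND MAX(Sealantid) = 1
        ORDER BY Sid""",
}

results = {name: [r[0] for r in conn.execute(sql)] for name, sql in queries.items()}
print(results)  # {'not_exists': [2, 9], 'group_by_having': [2, 9]}
```

Both queries return Sids 2 and 9, matching the result the question asks for.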


Generate percent using group_by and mutate

I am working on a dataset that contains predicted label (predicted) vs. true label (label) for each id and a column indicating whether the predicted label equals true label (match). I want to show the percentage of correct prediction for each label versus the total number of observations belonging to that label.
As an example, given the following data:
id <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
label <- c(6, 5, 1, 5, 4, 2, 3, 1, 6, 1)
predicted <- c(6, 5, 1, 3, 2, 2, 3, 1, 4, 4)
match <- c(1, 1, 1, 0, 0, 1, 1, 1, 0, 0)
dt <- data.frame(id, label, predicted, match)
head(dt)
id label predicted match
1 1 6 6 1
2 2 5 5 1
3 3 1 1 1
4 4 5 3 0
5 5 4 2 0
6 6 2 2 1
If I group_by(label), count(label, predicted), and then mutate(percent = sum(match == 1)/sum(n)), I expect to obtain a new grouped data frame like this:
library(plyr)
library(dplyr)
dt %>% group_by(label) %>% dplyr::count(label, predicted) %>% mutate(percent = sum(match == 1)/sum(n))
dt
id label predicted match percent
1 3 1 1 1 0.67
2 8 1 1 1 0.67
3 10 1 4 0 0.67
4 6 2 2 1 1.00
5 7 3 3 1 1.00
6 5 4 2 0 0.00
7 4 5 3 0 0.50
8 2 5 5 1 0.50
9 9 6 4 0 0.50
10 1 6 6 1 0.50
However, my code gives me the following output instead:
dt
# A tibble: 6 x 4
# Groups: label [5]
label predicted n percent
<dbl> <dbl> <int> <dbl>
1 1.00 1.00 2 0.600
2 1.00 4.00 1 0.600
3 2.00 2.00 1 0.600
4 3.00 3.00 1 0.600
5 4.00 2.00 1 0.600
6 5.00 3.00 1 0.600
It calculated the percentage of correct predictions across all labels (hence every row shows 0.600) instead of doing so for each label. How should I modify my code to get my desired output?
I wasn't able to reproduce your output with the code that you shared. I think the following will accomplish what you are seeking, though (I used total as the variable name rather than n):
dt %>%
  arrange(label) %>%
  group_by(label) %>%
  mutate(total = n(),
         percent = sum(match == 1) / total)
# A tibble: 10 x 6
# Groups: label [6]
id label predicted match total percent
<dbl> <dbl> <dbl> <dbl> <int> <dbl>
1 3 1 1 1 3 0.667
2 8 1 1 1 3 0.667
3 10 1 4 0 3 0.667
4 6 2 2 1 1 1
5 7 3 3 1 1 1
6 5 4 2 0 1 0
7 2 5 5 1 2 0.5
8 4 5 3 0 2 0.5
9 1 6 6 1 2 0.5
10 9 6 4 0 2 0.5
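As a cross-check of the per-label arithmetic (not a replacement for the dplyr code), the same grouping can be sketched in plain Python. The variable names hits, totals and percent are introduced here for illustration only; the label and match vectors are the ones from the question.

```python
from collections import defaultdict

# label and match vectors copied from the question
label = [6, 5, 1, 5, 4, 2, 3, 1, 6, 1]
match = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0]

hits = defaultdict(int)    # correct predictions per label
totals = defaultdict(int)  # observations per label
for lab, m in zip(label, match):
    totals[lab] += 1
    hits[lab] += m

# Share of correct predictions within each label
percent = {lab: hits[lab] / totals[lab] for lab in sorted(totals)}
for lab, p in percent.items():
    print(lab, totals[lab], round(p, 3))
```

Label 1 has three observations with two matches, which gives the 0.667 shown in the tibble; labels 5 and 6 each have two observations with one match, giving 0.5.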

How does [hist] from SMLib work in Pure Data?

I put the following message into a [hist 0 100 10] object (in SMLib):
0 1 2 3 3 4 5 5 5 6 7 7 7 7 8 9 10 11 11 11 11 11 12 13 14 15 16 17 18 19 20 21 22 23 23 23 23 23 23 23 23 23 23 67 99 100 107
I then hit 'absolute' and the following is output.
6 19 18 0 0 0 0 1 0 3
I was expecting it to count the occurrences of the numbers into even bins of size 10 but only six numbers are in the first bin, and the 67 is in the wrong bin!
I worked out how the values were being binned and got the following:
[0, 1, 2, 3, 3, 4] = 6
[5, 5, 5, 6, 7, 7, 7, 7, 8, 9, 10, 11, 11, 11, 11, 11, 12, 13, 14] = 19
[15, 16, 17, 18, 19, 20, 21, 22, 23, 23, 23, 23, 23, 23, 23, 23, 23, 23] = 18
[] = 0
[] = 0
[] = 0
[] = 0
[67] = 1
[] = 0
[99, 100, 107] = 3
But I was expecting the following result:
16 14 13 0 0 0 1 0 0 3
Fixed it!
I was using [hist 0 100 10] when I should have been using [hist 5 105 10]!
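The two outputs are consistent with a model in which the first argument is the centre of the first bin rather than its lower edge, so each value goes to the nearest bin centre and out-of-range values are clamped into the end bins. The sketch below implements that model in Python. It is inferred only from the numbers in this post, not taken from SMLib's source, and the function name hist_counts is hypothetical.

```python
import math

def hist_counts(values, lo, hi, nbins):
    """Count values into nbins bins whose centres are lo, lo+step, ...

    Assumed model (inferred from the observed output, not from SMLib):
    step = (hi - lo) / nbins, nearest-centre binning, clamped at both ends.
    """
    step = (hi - lo) / nbins
    counts = [0] * nbins
    for x in values:
        idx = math.floor((x - lo) / step + 0.5)  # nearest bin centre
        idx = min(max(idx, 0), nbins - 1)        # clamp out-of-range values
        counts[idx] += 1
    return counts

# The message sent to the object in the question
data = ([0, 1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 7, 7, 7, 8, 9, 10]
        + [11] * 5 + list(range(12, 23)) + [23] * 10 + [67, 99, 100, 107])

print(hist_counts(data, 0, 100, 10))  # [6, 19, 18, 0, 0, 0, 0, 1, 0, 3]
print(hist_counts(data, 5, 105, 10))  # [16, 14, 13, 0, 0, 0, 1, 0, 0, 3]
```

Under this model, [hist 0 100 10] puts the first bin over roughly -5 to 5, which is why only six values land there, while [hist 5 105 10] shifts the bins to 0-10, 10-20 and so on.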

find and match multiple values in the same row in array in matlab

I have a data set consisting of the following (head values for a finite-difference groundwater flow model with 200 rows, 200 columns, and 5 layers):
, "id", "k", "i", "j", "f", "Active"
1, 1, 1, 1, 1, 313, 0
2, 2, 1, 1, 2, 315.2.0, 0
3, 3, 1, 1, 3, 301.24, 0
4, 4, 1, 1, 4, 306.05, 0
5, 5, 1, 1, 5, -999.0, 0
6, 6, 1, 1, 6, -999.0, 0
7, 7, 1, 1, 7, 310.57, 0
8, 8, 1, 1, 8, -999.0, 0
9, 9, 1, 1, 9, -999.0, 0
.
.
.
200000, 200000, 5, 200, 200, -999.0, 0
Suppose I need to find the row that has a specific i, j, k.
For example, I want to find the row with i=100, j=50, k=3 and store its value f, and do this for multiple (i, j, k) combinations.
I've tried find, but it only finds the location of a single value.
I know it can be done with for and if, but that would be slow.
Is there a fast way to do this in MATLAB?
Let's suppose your text file has the following data:
"id", "k", "i", "j", "f", "Active"
1, 1, 1, 1, 313, 0
2, 1, 1, 2, 315.2.0, 0
3, 1, 1, 3, 301.24, 0
4, 1, 1, 4, 306.05, 0
5, 1, 1, 5, -999.0, 0
6, 1, 1, 6, -999.0, 0
7, 1, 1, 7, 310.57, 0
8, 1, 1, 8, -999.0, 0
9, 1, 1, 9, -999.0, 0
First read the file:
>> fileID = fopen('testfileaccess.txt');
>> C = textscan(fileID,'%s %s %s %s %s %s','Delimiter',',')
You will get 6 cells, one per column:
C =
Columns 1 through 5
{10x1 cell} {10x1 cell} {10x1 cell} {10x1 cell} {10x1 cell}
Column 6
{10x1 cell}
>> fclose(fileID);
>> Matrix = [str2double(C{2}), str2double(C{3}), str2double(C{4})]
This collects k, i and j into a matrix, one column per variable (the first row is NaN because the header line is not numeric):
Matrix =
NaN NaN NaN
1 1 1
1 1 2
1 1 3
1 1 4
1 1 5
1 1 6
1 1 7
1 1 8
1 1 9
Then, if you want to find for example k = 1, i = 1 and j = 8, use find():
find(Matrix(:,1) == 1 & Matrix(:,2) == 1 & Matrix(:,3) == 8)
and there you have it:
ans =
9
That is the 9th row of Matrix, i.e. the 8th data row, because row 1 of Matrix holds the header.
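Because the rows follow a regular order (in the sample, j changes fastest, then i, then k), you can also compute the data row directly without searching:
row = icount*jcount*(k-1) + jcount*(i-1) + j
In your case:
row = 200*200*(k-1) + 200*(i-1) + j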
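The direct index calculation can be sketched in Python as follows. It assumes the ordering visible in the sample rows of the question, where j changes fastest, then i, then k, so that the result equals the id column. The function name row_index and the small 2x3x4 check grid are hypothetical, added only to verify the arithmetic.

```python
def row_index(k, i, j, icount=200, jcount=200):
    """1-based data row for (k, i, j), assuming j varies fastest, then i, then k."""
    return icount * jcount * (k - 1) + jcount * (i - 1) + j

# Check against a small grid enumerated in the same order as the file
kcount, icount, jcount = 2, 3, 4
grid = [(k, i, j)
        for k in range(1, kcount + 1)
        for i in range(1, icount + 1)
        for j in range(1, jcount + 1)]
for row, (k, i, j) in enumerate(grid, start=1):
    assert row_index(k, i, j, icount, jcount) == row

print(row_index(1, 1, 8))      # 8, matches id 8 in the sample
print(row_index(5, 200, 200))  # 200000, the last row in the sample
print(row_index(3, 100, 50))   # 99850
```

If the real file is ordered differently, the multipliers have to be rearranged to match, so it is worth checking a few known rows first.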

MATLAB - Sort a matrix based off how a vector is sorted [duplicate]

I have a vector 'A' of 429 values and a matrix 'B' of 429x200 values. Rows in A and B share the same indices. My vector 'A' contains the values 1:1:429, but they are randomly ordered throughout the vector. I want to reorder A so that it runs in order from 1 to 429, and I also want to sort the rows of matrix 'B' in the same order as the newly sorted 'A'.
Can this be done quickly and easily without a for-loop?
Here's an example to illustrate my point:
A =
5
3
1
2
4
B =
3 7 0 4 6
1 2 5 0 8
4 0 2 0 0
3 0 1 0 5
2 2 3 4 4
sortedA =
1
2
3
4
5
sortedB =
4 0 2 0 0
3 0 1 0 5
1 2 5 0 8
2 2 3 4 4
3 7 0 4 6
Thank you everyone!
The example data:
A = [ 5, 3, 1, 2, 4 ]';
B = [ 3, 7, 0, 4, 6; 1, 2, 5, 0, 8; 4, 0, 2, 0, 0; 3, 0, 1, 0, 5; 2, 2, 3, 4, 4 ]
Sort the matrices:
[sortedA,IX] = sort(A);
sortedB = B(IX,:);
sortedA =
1
2
3
4
5
sortedB =
4 0 2 0 0
3 0 1 0 5
1 2 5 0 8
2 2 3 4 4
3 7 0 4 6
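The idea behind the second output of sort is a permutation: the index order that sorts A is reused to pick the rows of B. A minimal plain-Python sketch of the same idea follows, using the example data from the question. The names order, sorted_A and sorted_B are illustrative, and the indices here are 0-based, unlike MATLAB's 1-based IX.

```python
A = [5, 3, 1, 2, 4]
B = [[3, 7, 0, 4, 6],
     [1, 2, 5, 0, 8],
     [4, 0, 2, 0, 0],
     [3, 0, 1, 0, 5],
     [2, 2, 3, 4, 4]]

# Positions of A's elements in ascending order (the sorting permutation)
order = sorted(range(len(A)), key=lambda idx: A[idx])

sorted_A = [A[idx] for idx in order]
sorted_B = [B[idx] for idx in order]  # reorder B's rows with the same permutation

print(order)     # [2, 3, 1, 4, 0]
print(sorted_A)  # [1, 2, 3, 4, 5]
print(sorted_B[0], sorted_B[-1])  # [4, 0, 2, 0, 0] [3, 7, 0, 4, 6]
```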

export Matrix with this format MATLAB

How can I export a matrix of any size, such as
A=
1 2 3 4 5 6 ....9
3 6 7 8 9 9 ....4
...
6 7 7 4 4 5 ... 2
to a file in which the values are separated by ',':
1, 2, 3, 4, 5, 6, ....,9
3, 6, 7, 8, 9, 9, ....,4
...
6, 7, 7, 4, 4, 5, ... ,2
Use DLMWRITE. Doing this:
>> A = [1 2 3 4; 5 6 7 8];
>> dlmwrite('file.csv', A);
writes a file with this:
1,2,3,4
5,6,7,8