comparing multiple columns of matrix together - matlab

I have a 50x50 matrix named nghlist(i,j) containing 0 and 1 values. 1 means there is a relation between (i,j).
There is another 5x50 matrix named chlist.
I need to check the nghlist matrix, and if there is a connection between i and j (nghlist(i,j)==1) I need to go to the chlist matrix and compare the values in column i and column j. For example, compare columns (1,3,8,21,50) and count how many values they all share, e.g. find that all those columns have 3 values in common.
I tried the following code, but the problem is that I need to compare an unknown number of columns together (it depends on the node connections in nghlist, for example 4 or 5 columns).
for i=1:1:n
    for j=1:1:n
        if (i~=j && nghlist(i,j)==1)
            sum(ismember(chlist(:,i),chlist(:,j)));  % pairwise only: counts values of column i found in column j
        end
    end
end
Any help is highly appreciated.
++++ simplified example ++++++
take a look at the example http://i.imgur.com/mQjDqzz.jpg
nghlist matrix:
1 1 1 0 0
1 1 1 0 0
1 1 1 1 1
0 0 1 1 1
0 0 1 1 1
chlist matrix:
3 1 4 5 4
4 3 5 6 5
5 4 6 7 6
In this example since node 1 is connected to nodes 2 and 3, I need to compare column 1,2 and 3 from chlist. The output would be 1 (because they only share value '4').
And this value for node 5 would be 2 (because columns 3,4 and 5 only share value '5' and '6'). I hope now it is clear.

If the result of your simplified example is [1,2,0,3,2], then the following code worked for me.
(Matrix a stands for nghlist and matrix b for chlist; the result is stored in s.)
for i = 1:size(a,1)
    s(i)=0;
    row = a(i,:);
    idx = find(row==1);
    idx = idx(idx~=i);
    tempb = b(:,idx);
    for j=1:size(tempb,1)
        if sum(sum(tempb==tempb(j,1)))==size(tempb,2)
            s(i)=s(i)+1;
        end
    end
end
For every node, find all the ones in its row, then discard the one referring to the node itself. Pick the corresponding columns of chlist (tempb = b(:,idx)) to create a new matrix. For every element of the first column of this matrix, check whether it exists in all the other columns. If it does, increment s(i). Note that the check counts matches over the whole matrix, so it assumes no value repeats within a column.

Let's say the indices of the columns to be compared are stored in idxList:
idxList = [1,3,8,21,50];
You can compare every column with the first one and combine the results with "AND" to keep only the values present in all of them:
shared = true(size(chlist,1),1);
for ii = 2:length(idxList)
    % ismember (as in the question) tests membership regardless of row position
    shared = shared & ismember(chlist(:,idxList(1)), chlist(:,idxList(ii)));
end
Finally, sum as before:
sum(shared)
I haven't checked the exact code, but the concept should be clear.

I managed to solve it this way: I intersect the first and second columns of tempb and store the result in tem, then intersect tem with the third column of tempb, and so on. Thank you finmor and pyStarter, your code inspired me. I know it is not the best way, but at least it works.
for i=1:size(nghlist,1)
    s(i)=0;
    j=2;
    row=nghlist(i,:);
    idx=find(row==1);
    tempb=chlist(:,idx);
    if (size(tempb,2)>1)
        tem=intersect(tempb(:,1),tempb(:,2));
        if (size(tempb,2)<=2)
            s(i)=size(tem,1);
        end
        while (size(tempb,2)>j && ~isempty(tem))
            j=j+1;
            tem=intersect(tem(:,1),tempb(:,j));
            s(i)=size(tem,1);
        end
    end
end
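The same running-intersection idea can be written a little more compactly. This is only a sketch run on the simplified example from the question; it assumes the node's own column is included (the diagonal of nghlist is 1, as in the example) and that a node with fewer than two connected columns gets 0. The name common is introduced here for illustration.

```matlab
% Simplified example from the question
nghlist = [1 1 1 0 0; 1 1 1 0 0; 1 1 1 1 1; 0 0 1 1 1; 0 0 1 1 1];
chlist  = [3 1 4 5 4; 4 3 5 6 5; 5 4 6 7 6];

n = size(nghlist, 1);
s = zeros(1, n);                       % preallocate the result
for i = 1:n
    idx = find(nghlist(i, :) == 1);    % connected columns (includes i itself here)
    if numel(idx) < 2
        continue;                      % assumption: nothing to compare, leave s(i) = 0
    end
    common = chlist(:, idx(1));
    for k = 2:numel(idx)
        % keep only values that also occur in the next connected column
        common = intersect(common, chlist(:, idx(k)));
    end
    s(i) = numel(common);
end
disp(s)
```

For node 1 this gives 1 (only the value 4 is shared) and for node 5 it gives 2 (values 5 and 6), matching the question. To exclude the node's own column, as the first answer does, idx = idx(idx ~= i) could be added after the find.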

Related

Vectorizing cell find and summing in Matlab

Would someone please show me how I can go about changing this code from an iterated to a vectorized implementation to speed up performance in Matlab? It takes approximately 8 seconds per i for i=1:20 on my machine currently.
classEachWordCount = zeros(nwords_train, nClasses);
for i=1:nClasses % (20 classes)
for j=1:nwords_train % (53975 words)
classEachWordCount(j,i) = sum(groupedXtrain{i}(groupedXtrain{i}(:,2)==j,3));
end
end
If context is helpful basically groupedXtrain is a cell of 20 matrices which represent different classes, where each class matrix has 3 columns: document#,word#,wordcount, and unequal numbers of rows (tens of thousands). I'm trying to figure out the count total of each word, for each class. So classEachWordCount should be a matrix of size 53975x20 where each row represents a different word and each column a different label. There's got to be a built-in function to assist in something like this, right?
for example groupedXtrain{1} might start off like:
doc#,word#,wordcount
1 1 3
1 2 1
1 4 3
1 5 1
1 8 2
2 2 1
2 5 4
2 6 2
As mentioned in the comments, you can use accumarray to sum the values in the third column for each unique value in the second column, once per class:
results = zeros(nwords_train, numel(groupedXtrain));
for k = 1:numel(groupedXtrain)
    results(:,k) = accumarray(groupedXtrain{k}(:,2), groupedXtrain{k}(:,3), ...
        [nwords_train 1], @sum);
end
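To see what accumarray does here, the sketch below applies it to the sample rows of groupedXtrain{1} shown in the question. The value nwords_train = 8 is an assumption made only so the small example is self-contained; sampleClass and counts are illustrative names.

```matlab
% Sample rows from the question: doc#, word#, wordcount
sampleClass = [1 1 3
               1 2 1
               1 4 3
               1 5 1
               1 8 2
               2 2 1
               2 5 4
               2 6 2];
nwords_train = 8;   % assumption for this toy example only

% Sum column 3, grouped by the word number in column 2.
% Words that never occur (3 and 7 here) get the default fill value 0.
counts = accumarray(sampleClass(:, 2), sampleClass(:, 3), [nwords_train 1], @sum);
disp(counts')
```

Word 2 appears with counts 1 and 1, so its total is 2; word 5 appears with 1 and 4, so its total is 5. Summation is also accumarray's default, so the @sum argument mainly documents the intent.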

What is the easiest/ computationally efficient way to find the indexes at particular locations?

I have a matrix
m =
2 2 1
3 2 1
0 4 1
0 4 1
5 4 1
0 5 2
1 2 2
1 3 2
1 4 2
1 1 3
0 2 3
0 3 4
0 3 4
that is potentially of N x 3, where N can be very large.
I want to find the row indices (1-13) where the first column is zero, whether those rows are unique or duplicated, but I do not want rows whose 2nd and 3rd columns also appear in a row with a nonzero first column. In other words, if a row has a zero in the first column but another row with the same 2nd and 3rd columns has a nonzero first column, that zero row should be ignored. So, in the example above, I want to return only the indices 6, 11, 12, 13. Indices 3 and 4 should not be returned, because a row with the same 2nd and 3rd columns has a different first column, as we can see below:
0 4 1
0 4 1
5 4 1
One slow solution would be to find the rows whose first column is 0 (indm=m(:,1)==0) and then iterate over them, checking whether any other row in m has identical 2nd and 3rd columns but a different 1st column. If no such row exists, add the index of the row to the list returned by the program.
However, this method would require "for loops" going over large matrices.
One way to solve this (assuming that a row is bad if there is any other row with the same columns 2 and 3 and a nonzero first column) is to group the rows by columns 2 and 3, and then check whether the first column is zero throughout each group.
%# uIdx is the same for sets of rows where m(i,2:3) is equal
[~,~,uIdx] = unique(m(:,2:3),'rows');
%# allZeros is true if all entries in the first column of m
%# corresponding to a set are zero
allZeros = accumarray(uIdx,m(:,1),[],@(x)all(x==0));
%# a good row belongs to a set of rows from m(:,2:3)
%# where all corresponding entries in the first column are zeros
%# use allZeros(uIdx) to expand allZeros to size(m,1)
goodRowIndices = find(allZeros(uIdx) == true)
goodRowIndices =
6
11
12
13
Here is my solution:
mm = m(:,1)==0;
imm = find(mm);
[mu,~, imu] = unique(m(mm,2:3),'rows','stable');
[~,ia] = setdiff(mu,m(~mm,2:3),'rows');
X = imm(ismember(imu,ia));
Line 3 extracts the unique rows beginning with 0; line 4 keeps only those that do not appear among the rows not beginning with 0; and line 5 recovers the indices of the rows to keep.
I am not sure it is the most efficient way, because it involves two sorts.
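Another possible formulation, offered only as a sketch, tests each row's (column 2, column 3) pair directly against the pairs of the rows with a nonzero first column, using ismember with the 'rows' option. The names isZero and clash are introduced here for illustration.

```matlab
m = [2 2 1; 3 2 1; 0 4 1; 0 4 1; 5 4 1; 0 5 2; 1 2 2; ...
     1 3 2; 1 4 2; 1 1 3; 0 2 3; 0 3 4; 0 3 4];

isZero = m(:, 1) == 0;                                  % rows starting with 0
% clash(i) is true when row i's columns 2:3 also occur in a nonzero row
clash  = ismember(m(:, 2:3), m(~isZero, 2:3), 'rows');
goodRowIndices = find(isZero & ~clash);
disp(goodRowIndices')
```

On the example matrix this returns 6, 11, 12 and 13, the same indices as the two answers above.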

matlab the index of the next smallest element in a matrix

I can get the minimum value and its index as described in the question "matlab how to get min value and its index in a matrix".
From a matrix A
A=[1 3 6 2 0 4
6 8 9 5 1 3
7 2 7 8 9 2]
To get the minimal value MinVal in a given row r over a column interval [c .. c+x], together with its column index ind, I have to do
[MinVal,I]=min(A(r,c:c+x))
ind= c-1+I;
Example
[MinVal,I]=min(A(2,3:3+2))
ind= 3-1+I;
will give me
% MinVal= 1
% ind =5
Then I have
B.state=[ 0
0
1
0
1]
The problem is that I have another structure B that I need to check: if B(ind).state == 1, I have to move to the next smallest element and get its index, and so on until I find the first empty one.
So here I can't take ind=5 because B(5).state==1; I need to move to the next MinVal=5 with ind=4. Here that is OK and I can stop, but if B(4).state==1 I would need to move to the next smallest, and so on.
If I try like this
MinD = A(r,c:c+x);
[MinVal,Ind]=min(MinD);
ind= nbrT+Ind;
MinD2 = sort(MinD(:));
p=2;
while (B(ind).state == 1)
MinVal= MinD2(p);
%need to get the new index
%something like this
ind=find (A == MinVal) ;
p=p+1;
end
The problem is that I can get the next minimum value but the index I will get can be of more than one value if MinVal appears more than once so how can I get the one with state == 0
I don't want to use unique either because even if I have two different elements with the same minimum, they refer to two different places and I have to keep both (I can use the second one if the first one is full).
Modified code to
MinD = A(r,c:c+x);
[MinVal,Ind]=min(MinD);
ind= nbrT+Ind;
[MinD2, IndMinD2] = sort(MinD(:));
p=2;
while (B(ind).state == 1)
MinVal=MinD2(p);
Ind=IndMinD2(p);
p=p+1;
end
Ind= c-1+Ind;
So how can I do it?
I think this should work:
MinD2 = MinD(:);
for ii=1:numel(MinD2)
    [MinVal,Ind]=min(MinD2);
    %do your stuff with the index
    %and at the end do this:
    MinD2(Ind)=Inf;
end
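An alternative is to sort once and keep the sort permutation, so duplicates of the same value keep distinct column indices. The sketch below uses the question's A, r, c and x. It assumes B is a 1x5 struct array with a scalar state field per column, which is how B(ind).state is used in the question; the names state, vals, order, cols and k are illustrative.

```matlab
% Example data from the question
A = [1 3 6 2 0 4
     6 8 9 5 1 3
     7 2 7 8 9 2];
r = 2; c = 3; x = 2;

% Assumption: one struct element per column, state 1 = occupied
B = struct('state', {0, 0, 1, 0, 1});
state = [B.state];

[vals, order] = sort(A(r, c:c+x));   % ascending values and their positions
cols = c - 1 + order;                % positions converted to column numbers of A
k = find(state(cols) == 0, 1);       % first candidate whose state is free
if isempty(k)
    MinVal = []; ind = [];           % every candidate column is occupied
else
    MinVal = vals(k); ind = cols(k);
end
```

With the data above, column 5 (value 1) is skipped because B(5).state is 1, and the result is MinVal = 5 with ind = 4, as described in the question.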

What does it mean to use logical indexing/masking to extract data from a matrix? (MATLAB)

I am new to matlab and I was wondering what it meant to use logical indexing/masking to extract data from a matrix.
I am trying to write a function that accepts a matrix and a user-inputted value to compute and display the total number of values in column 2 of the matrix that match with the user input.
The function itself should have no return value and will be called on later in another loop.
But besides all that hubbub, someone suggested that I use logical indexing/masking in this situation but never told me exactly what it was or how I could use it in my particular situation.
EDIT: since you updated the question, I am updating this answer a little.
Logical indexing is explained really well in this and this. In general, I doubt, if I can do a better job, given available time. However, I would try to connect your problem and logical indexing.
Let's declare an array A which has 2 columns. The first column is an index (1,2,3,...) and the second column is its corresponding value, a random number.
A(:,1)=1:10;
A(:,2)=randi(5,[10 1]); % creates a 10x1 array of random integers from 1 to 5 and puts it into the second column of A
userInputtedValue=3; % self-explanatory
You want to check which values in the second column of A are equal to 3. Imagine you are making a query and MATLAB gives you a binary response, YES (1) or NO (0).
q=A(:,2)==3 % the query: which values in the second column of A equal 3?
Now, for the rows where the answer is YES, you want to extract the matching entries from the second column of A, then do some processing.
values=A(q,2); % only those elements will be extracted: 1. which lie in the
% second column of A AND where q takes value 1.
Now, if you want to count the total number of values, just do:
numValues=length(values);
I hope logical indexing is now clear to you. However, do read the MathWorks posts which I mentioned earlier.
I oversimplified the code, and wrote more code than required in order to explain things. It can be achieved in a one-liner:
sum(A(:,2)==userInputtedValue)
I'll give you an example that may illustrate what logical indexing is about:
array = [1 2 3 0 4 2];
array > 2
ans: [0 0 1 0 1 0]
Using logical indexing you can filter the elements that fulfill a certain condition:
array(array>2) will give: [3 4]
you could also perform alterations to only those elements:
array(array>2) = 100;
array(array<=2) = 0;
will result in "array" equal to
[0 0 100 0 100 0]
Logical indexing means to have a logical / Boolean matrix that is the same size as the matrix that you are considering. You would use this as input into the matrix you're considering, and any locations that are true would be part of the output. Any locations that are false are not part of the output. To perform logical indexing, you would need to use logical / Boolean operators or conditions to facilitate the selection of elements in your matrix.
Let's concentrate on vectors as it's the easiest to deal with. Let's say we had the following vector:
>> A = 1:9
A =
1 2 3 4 5 6 7 8 9
Let's say I wanted to retrieve all values that are 5 or more. The logical condition for this would be A >= 5. We want to retrieve all values in A that are greater than or equal to 5. Therefore, if we did A >= 5, we get a logical vector which tells us which values in A satisfy the above condition:
>> A >= 5
ans =
0 0 0 0 1 1 1 1 1
This certainly tells us where in A the condition is satisfied. The last step would be to use this as input into A:
>> B = A(A >= 5)
B =
5 6 7 8 9
Cool! As you can see, there isn't a need for a for loop to help us select out elements that satisfy a condition. Let's go a step further. What if I want to find all even values of A? This would mean that if we divide by 2, the remainder would be zero, or mod(A,2) == 0. Let's extract out those elements:
>> C = A(mod(A,2) == 0)
C =
2 4 6 8
Nice! So let's go back to your question. Given your matrix A, let's extract out column 2.
>> col = A(:,2)
Now, we want to check to see if any of column #2 is equal to a certain value. Well we can generate a logical indexing array for that. Let's try with the value of 3:
>> ind = col == 3;
Now you'll have a logical vector that tells you which locations are equal to 3. If you want to determine how many are equal to 3, you just have to sum up the values:
>> s = sum(ind);
That's it! s contains how many values were equal to 3. Now, if you wanted to write a function that only displayed how many values were equal to some user defined input and displayed this event, you can do something like this:
function checkVal(A, val)
disp(sum(A(:,2) == val));
end
Quite simply, we extract the second column of A and see how many values are equal to val. This produces a logical array, and we simply sum up how many 1s there are. This would give you the total number of elements that are equal to val.
Troy Haskin pointed you to a very nice link that talks about logical indexing in more detail: http://www.mathworks.com/help/matlab/math/matrix-indexing.html?refresh=true#bq7eg38. Read that for more details on how to master logical indexing.
Good luck!
%% M is your Matrix
M = randi(10,4)
%% Val is the value that you are seeking to find
Val = 6
%% Col is the value of the matrix column that you wish to find it in
Col = 2
%% r is a vector that has zeros in all positions except when the Matrix value equals the user input it equals 1
r = M(:,Col)==Val
%% We can now sum all the non-zero values in r to get the number of matches
n = sum(r)
M =
4 2 2 5
3 6 7 1
4 4 1 6
5 8 7 8
Val =
6
Col =
2
r =
0
1
0
0
n =
1

Remove duplicates appearing next to each other, but keep it if it appears again later

I have a vector that could look like this:
v = [1 1 2 2 2 3 3 3 3 2 2 1 1 1];
that is, the number of equal elements can vary, but they always increase and decrease stepwise by 1.
What I want is an easy way to be left with a new vector looking like this:
v2 = [ 1 2 3 2 1];
holding all the different elements (in the right order as they appear in v), but only one of each. Preferably without looping, since generally my vectors are about 10 000 elements long, and already inside a loop that's taking for ever to run.
Thank you so much for any answers!
You can use diff for this. All you're really asking for is: Delete any element that's equal to the one in front of it.
diff returns the difference between all adjacent elements in a vector. If there is no difference, it returns 0. v(ind~=0) gives you all elements of v at the positions where ind is nonzero, i.e. where the value changes. The 1 at the beginning makes sure the first element is kept; it is needed because diff returns the differences between elements, so numel(diff(v)) = numel(v)-1.
v = [1 1 2 2 2 3 3 3 3 2 2 1 1 1];
ind = [1 diff(v)];
v(ind~=0)
ans =
1 2 3 2 1
This can of course be done in a single line if you want:
v([1, diff(v)]~=0)
You could try using diff which, for a vector X, returns [X(2)-X(1) X(3)-X(2) ... X(n)-X(n-1)] (type help diff for details on this function). Since the elements in your vector always increase or decrease by 1, then
diff(v)
will be a vector (one element shorter than v) of zeros, ones and minus ones, where a nonzero entry indicates a step up or down. We can ignore all the zeros as they imply repeated numbers. We can convert this to a logical array as
logical(diff(v))
so that we can index into v and access its elements as
v(logical(diff(v)))
which returns
1 2 3 2
This is almost what you want, just without the final number which can be added as
[v(logical(diff(v))) v(end)]
Try the above and see what happens!