How to find x number of closest neighbours matching criteria in MATLAB?

I have an m*n matrix with values of 0-9. Basically, what I want to do is be able to find all cells that have a 0 as their value and find the k closest neighbours that aren't 0.
My code right now is along the lines of this:
1. locate all cells with a zero and make an a*2 matrix holding their x and y locations
2. do the same for all cells without 0
3. make a loop that cycles through each given zero coordinate and measure the distance from it to every non-zero cell and record it in row one of a new matrix (row 2 and 3 are the x and y coordinates, respectively)
4. use sortrows() to get this matrix in ascending order and then extract the first k values (i.e., the closest k coordinates)
5. repeat
What I was hoping to do was more along the lines of quickly finding the k closest non-zero cells without an internal loop and then repeating for each 0 cell. Any advice would be greatly appreciated, thanks very much.

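I would solve this problem by finding all the nonzero entries, as you are currently doing, and then determining the corresponding vertices (their coordinates). You can then use dsearchn to find the nearest of those points to each zero cell. Note that dsearchn returns only the single closest point per query; for k > 1, knnsearch (Statistics and Machine Learning Toolbox) returns the k nearest. If you want to do this repeatedly, you will benefit from constructing a delaunayTriangulation object.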
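If neither toolbox function is available, a loop-free brute-force version of the original approach is possible. The following is only a sketch under assumptions: the matrix A and the value of k are made-up examples, and it builds the full zeros-by-nonzeros distance table with bsxfun, which is fine for moderate sizes but memory-hungry for very large matrices.

```matlab
% Hypothetical example data: a small matrix with values 0-9
A = [0 3 0; 5 0 0; 0 0 7];
k = 2;                               % number of neighbours wanted (assumed)

[zr, zc] = find(A == 0);             % coordinates of the zero cells
[nr, nc] = find(A ~= 0);             % coordinates of the non-zero cells

% Squared Euclidean distance from every zero (rows) to every non-zero (columns)
D = bsxfun(@minus, zr, nr.').^2 + bsxfun(@minus, zc, nc.').^2;

% Sort each row; the first k columns index the k closest non-zero cells
[~, order] = sort(D, 2);
nearest = order(:, 1:k);             % indices into nr / nc

nearestRows = nr(nearest);           % row coordinates of the neighbours
nearestCols = nc(nearest);           % column coordinates of the neighbours
```

Row i of nearestRows and nearestCols then holds the k closest non-zero cells for the zero at (zr(i), zc(i)).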

Related

Select nodes of a mesh data with matlab to set boundary conditions

I have imported a 2Dmesh file of a surface with the mesh data into matlab.
This mesh file has 3 columns: the first one with the node number, the second one with the x-coordinate of the node and the third one with the y-coordinate of the node.
I want to select the nodes inside this circle x²+y² = 4. After importing the data file into matlab I have three column vectors, the node vector, the x-coordinate vector and the y-coordinate vector.
Any tips to impose the x² + y² < 4 condition to do this? Thank you.
You can easily do that with a for loop that scans the three vectors in parallel.
First of all, you might want to check that these 3 vectors have the same length. Let's say x is the vector with the x-coordinates, y is the vector with the y-coordinates and idx is the vector with the node numbers.
if(length(x)~=length(y) || length(x)~=length(idx) || length(y)~=length(idx))
    error('Vectors must have the same length.');
end
Then you can proceed.
SelectedNodes=[];
for i=1:length(x) %or length(y) or length(idx)...they must have the same length
    if(x(i)^2+y(i)^2<4)
        SelectedNodes=[SelectedNodes idx(i)];
    end
end
Now SelectedNodes holds the IDs of the nodes that lie inside your circle. To know how many nodes lie inside it, simply evaluate its length (length(SelectedNodes)).
Update: As @rayryeng correctly pointed out, there is a much smarter way of doing this by using logical indexing instead of a for-loop. Put simply, logical indexing puts a logical 1 (true) in the i-th position if the i-th element of a vector (or matrix) satisfies a particular condition, and a logical 0 (false) otherwise. By running, as suggested,
SelectedNodes=idx(x.^2+y.^2<4)
the expression x.^2+y.^2<4 returns a logical array of the same length as x (and y), with a 1 in position i if the i-th elements of x and y satisfy the inequality and a 0 otherwise. Using this array to index idx means "select from idx the values marked as true", and the result is stored in SelectedNodes.
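As a small illustration of the logical-indexing approach, here is a sketch with made-up node data (the five nodes and their coordinates are assumptions, not taken from the mesh file in the question):

```matlab
% Hypothetical mesh data: node numbers and coordinates as column vectors
idx = (1:5).';
x   = [0; 1; 2; -1.5; 3];
y   = [0; 1; 0.5; 1; 0];

% Logical mask: true where the node lies strictly inside x^2 + y^2 = 4
inside = x.^2 + y.^2 < 4;

SelectedNodes = idx(inside);   % node numbers inside the circle
numInside     = nnz(inside);   % how many nodes were selected
```

Keeping the mask in its own variable also lets you pick out the matching coordinates with x(inside) and y(inside).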

How to change the value of a random subset of elements in a matrix without using a loop?

I'm currently attempting to select a subset of 0's in a very large matrix (about 400x300 elements) and change their value to 1. I am able to do this, but it requires using a loop where each instance it selects the next value in a randperm vector. In other words, 50% of the 0's in the matrix are randomly selected, one-at-a-time, and changed to 1:
z=1;
for z=1:(.5*numberofzeroes)
    A(zeroposition(rpnumberofzeroes(z),1),zeroposition(rpnumberofzeroes(z),2))=1;
    z=z+1;
end
Where 'A' is the matrix, 'zeroposition' is a 2-column-wide matrix with the positions of the 0's in the matrix (the "coordinates" if you like), and 'rpnumberofzeroes' is a randperm vector from 1 to the number of zeroes in the matrix.
So for example, for z=20, the code might be something like this:
A(3557,2684)=1;
...so that the 0 which appears in this location within A will now be a 1.
It performs this loop thousands of times, because .5*numberofzeroes is a very big number. This inevitably takes a long time, so my question is can this be done without using a loop? Or at least, in some way that takes less processing resources/time?
As I said, the only thing that needs to be done is an entirely random selection of 50% (or whatever proportion) of the 0's changed to 1.
Thanks in advance for the help, and let me know if I can clear anything up! I'm new here, so apologies in advance if I've made any faux pas.
That's very easy. I'd like to introduce you to my friend sub2ind. sub2ind allows you to take row and column coordinates of a matrix and convert them into linear column-major indices so that you can access multiple values in a matrix simultaneously in a single call. As such, the equivalent code you want is:
%// rpnumberofzeroes is a randperm vector, so its first half is a random
%// selection of row numbers into zeroposition
vals = rpnumberofzeroes(1:floor(0.5*numberofzeroes));
%// Look up the row and column coordinates of the selected zeros
rows = zeroposition(vals, 1);
cols = zeroposition(vals, 2);
%// Get linear indices via sub2ind
ind1 = sub2ind(size(A), rows, cols);
%// Now set these locations to 1
A(ind1) = 1;
The first statement takes the first half of the random permutation stored in rpnumberofzeroes. Each of these values is a row number into zeroposition, whose first column holds the row coordinates and whose second column holds the column coordinates of the zeros in A. Indexing zeroposition with vals therefore gives the rows and columns of the randomly chosen zeros. sub2ind requires three inputs: the size of the matrix you are trying to access (here, size(A)), the row coordinates and the column coordinates. The output is a set of column-major linear indices, one for each row and column pair.
The last piece of the puzzle is to use these to index into A and set the locations to 1.
This can be accomplished with linear indexing as well:
% find linear position of all zeros in matrix
ix=find(abs(A)<eps);
% set one half of those, selected at random, to one.
A(ix(randperm(numel(ix),round(numel(ix)*.5))))=1;
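To make the linear-indexing idea concrete, here is a self-contained sketch on a small made-up matrix. The matrix contents and the 50% fraction are assumptions for illustration, and the two-argument form randperm(n, k) needs a reasonably recent MATLAB release.

```matlab
% Hypothetical example: 4x5 matrix whose first row is non-zero
A = zeros(4, 5);
A(1, :) = 2;

fraction = 0.5;                          % share of zeros to flip (assumed)

ix    = find(A == 0);                    % linear indices of all zeros
nPick = floor(fraction * numel(ix));     % how many zeros to change

% Choose nPick distinct zeros at random and set them to 1 in a single call
pick = ix(randperm(numel(ix), nPick));
A(pick) = 1;
```

Because the selection is made on linear indices, no row and column bookkeeping (and no sub2ind call) is needed.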

Remove duplicates in correlations in matlab

Please see the following issue:
P=rand(4,4);
for i=1:size(P,2)
    for j=1:size(P,2)
        [r,p]=corr(P(:,i),P(:,j))
    end
end
Clearly, the loop will cause the number of correlations to be doubled (i.e., both corr(P(:,1),P(:,4)) and corr(P(:,4),P(:,1)) are computed). Does anyone have a suggestion on how to avoid this? Perhaps not using a loop?
Thanks!
I have four suggestions for you, depending on what exactly you are doing to compute your matrices. I'm assuming the example you gave is a simplified version of what needs to be done.
First Method - Adjusting the inner loop index
One thing you can do is change your j loop index so that it only goes from 1 up to i. This way, you get a lower triangular matrix and just concentrate on the values within the lower triangular half of your matrix. The upper half would essentially be all set to zero. In other words:
for i = 1 : size(P,2)
    for j = 1 : i
        %// Your code here
    end
end
Second Method - Leave it unchanged, but then use unique
You can go ahead and use the same matrix like you did before with the full two for loops, but you can then filter the duplicates by using unique. In other words, you can do this:
[Y,indices] = unique(P);
Y will give you a list of unique values within the matrix P and indices will give you the locations of where these occurred within P. Note that these are column major indices, and so if you wanted to find the row and column locations of where these locations occur, you can do:
[rows,cols] = ind2sub(size(P), indices);
Third Method - Use pdist and squareform
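Since you're looking for a solution that requires no loops, take a look at the pdist function. Given an M x N matrix, pdist will find distances between each pair of rows in the matrix. squareform will then transform these distances into a matrix like what you have seen above. Note that with the 'correlation' option the distance is one minus the correlation coefficient, so subtract the result from 1 to recover the coefficients themselves. In other words, do this: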
dists = pdist(P.', 'correlation');
distMatrix = squareform(dists);
Fourth Method - Use the corr method straight out of the box
You can just use corr in the following way:
[rho, pvals] = corr(P);
corr in this case will produce an m x m matrix that contains the correlation coefficient between each pair of columns of the n x m matrix stored in P.
Hopefully one of these will work!
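If you want each pair listed exactly once, one option is to compute the full matrix and keep only the part above the diagonal. The sketch below is an illustration under assumptions: the data in P is made up, and corrcoef (base MATLAB) is swapped in for corr so that it runs without the Statistics Toolbox.

```matlab
% Hypothetical data: 5 observations of 4 variables (one variable per column)
P = [1 2 4 1;
     2 1 3 5;
     3 5 2 2;
     4 3 1 7;
     5 6 0 3];

R = corrcoef(P);                 % full symmetric correlation matrix
n = size(R, 1);

% Keep only entries strictly above the diagonal: each pair appears once
mask   = triu(true(n), 1);
[i, j] = find(mask);             % column indices of each unique pair (i < j)
pairs  = [i, j, R(mask)];        % one row per pair: [i, j, coefficient]
```

For 4 columns this leaves 6 rows in pairs instead of the 16 results produced by the double loop.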
Does this work?
for i=1:size(P,2)
    for j=1:i, [r,p]=corr(P(:,i),P(:,j)), end
end
Since you are just correlating each column with the other, then why not just use (straight from the documentation)
[Rho,Pval] = corr(P);
I don't have the Statistics Toolbox, but according to http://www.mathworks.com/help/stats/corr.html,
corr(X) returns a p-by-p matrix containing the pairwise linear correlation coefficient between each pair of columns in the n-by-p matrix X.

different sized bins in matlab

In Matlab I have a vector Muen which I want to reduce in size by dividing it into bins of different lengths. The vector has a few values that need high-accuracy bins and a lot of values that are roughly equal and could be collected into bins of up to a few hundred values.
I also need to know the index of every old element going into a new bin, in order to shorten a second vector, fluence.
The goal is to speed up the summation sum(fluence.*Muen) by using different-sized bins determined by Muen, summing fluence into the new bins before the vector multiplication.
For this I try to use
edges=[min(Muen):0.0001:Muen(13),Muen(12:-1:1)];
[N,bin]=histc(Muen,edges)
The problem is how to make the vector edges, as there is a large difference between the maximum and minimum of Muen and a small difference between other values. Is there a way to make the steps of edges depending on the derivative Muen?
The shorter version of Muen would then be something like
MuenShort=N.*edges;
but it did not work quite right (could be a fault in edges). Any suggestions?
I also do not really understand how bin gives the index of the values that go into the new bins.
clarification:
What I want to do is take the elements of a vector m (Muen) that are roughly equal and replace them with one element, while keeping track of which original indices go into each element of the new vector n (MuenShort). Example:
{m1}->n1,(1), {m2}->n2,(2), {m3,m4}-> m3=m4=n3,(3,4),{m5,m6,m7,m8}-> m5=m6=m7=m8=n4,{5,6,7,8}...
where n1>>n2 but the difference between n3 and n4 might not be so large. the number of m-elements in each n-element should be determined by the number of m-elements that are roughly equal to each other, or rather lies between two limits. So the bin size should vary between one element to a few hundred elements.
Then I want to use indexes to make the fluence vector shorter
fluenceShort(1:length(MuenShort))= [sum(fluence(1)),sum(fluence(2)),sum(fluence(3:4)),sum(fluence(5:8))...];
goal=sum(fluenceShort.*MuenShort)
Is there a way to implement this in Matlab?
Even if I don't understand your question clearly, I would suggest this. Perhaps you could sort your vector muen, pick a fixed number n, and define each bin so that it contains exactly n values of muen. For simplicity, the length of muen is assumed to be a multiple of n:
n = 10;
muen_sorted = sort(muen(:)).';   % sort first, and force a row vector
m = length(muen_sorted)/n;       % number of bins
edges = [-inf mean([muen_sorted(n:n:end-1); muen_sorted(n+1:n:end)], 1) inf ];
muen_short = mean(reshape(muen_sorted,n,m), 1);
Note that m+1 edges (vector edges) are obtained, corresponding to m bins. Bin edges lie exactly between the closest values of neighbouring bins. Thus, the upper edge of the first bin is (muen_sorted(n)+muen_sorted(n+1))/2; the upper edge of the next bin is (muen_sorted(2*n)+muen_sorted(2*n+1))/2, and so on.
The "representative value" of each bin (vector muen_short) is computed as the mean of the values that lie in that bin. Or perhaps the median would make more sense, depending on your application.
As a result of this code, muen_short(1) is the value corresponding to the bin with edges edges(1) and edges(2); muen_short(2) is the value corresponding to the bin with edges edges(2) and edges(3), etc.
You can now use the variable edges to build the histogram of fluence with those same edges.
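The last step, summing fluence into those bins, can be sketched as follows. This is an illustration under assumptions: the muen and fluence values are made up, n = 2 is chosen only to keep the example small, and accumarray is one possible way to do the per-bin sums (the second output of histc gives the bin number of each element; newer releases recommend discretize or histcounts instead).

```matlab
% Hypothetical data: six values forming three natural groups
muen    = [10 9.9 1 1.1 5 5.2];
fluence = [1 2 3 4 5 6];
n = 2;                                        % values per bin (assumed)

muen_sorted = sort(muen(:)).';
m = length(muen_sorted) / n;                  % number of bins
edges = [-inf mean([muen_sorted(n:n:end-1); muen_sorted(n+1:n:end)], 1) inf];
muen_short = mean(reshape(muen_sorted, n, m), 1);

% bin(i) is the number of the bin that muen(i) falls into
[~, bin] = histc(muen, edges);

% Sum the fluence values that share a bin
fluence_short = accumarray(bin(:), fluence(:), [m 1]).';

approx = sum(fluence_short .* muen_short);    % binned approximation
exact  = sum(fluence .* muen);                % full sum, for comparison
```

Here bin answers the question of which original indices go into which new bin: find(bin == 3) lists the elements merged into the third bin.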

Drawing a random non-zero element from a sparse matrix

I have a sparse logical matrix, which is quite large. I would like to draw random non-zero elements from it without storing all of its non-zero elements in a separate vector (eg. by using find command). Is there an easy way to do this?
Currently I am implementing rejection sampling, which is drawing a random element and checking whether that is non-zero or not. But it is not efficient when the ratio of non-zero elements is small.
A sparse logical matrix is not a very practical representation of your data if you want to pick random locations. Rejection sampling and find are the only two ways that make sense to me. Here's how you can do them efficiently (assuming you want to get 4 random locations):
%# using find
idx = find(S);
%# draw 4 without replacement
fourRandomIdx = idx(randperm(length(idx),4));
%# draw 4 with replacement
fourRandomIdx = idx(randi(length(idx),4,1));
%# get row, column values
[row,col] = ind2sub(size(S),fourRandomIdx);
%# using rejection sampling
density = nnz(S)/numel(S);
%# estimate how many samples you need to get at least 4 hits
%# (4/density on average) and multiply by 2 (or 3)
n = ceil( 4 / density ) * 2;
%# random linear indices w/ replacement
randIdx = randi(numel(S),n,1);
%# identify the first four draws that landed on non-zero elements
hits = randIdx(find(S(randIdx),4,'first'));
[row,col] = ind2sub(size(S),hits);
An n x m matrix with nnz non-zero elements requires nnz + m + 1 integers to store the locations of its non-zero entries (nnz row indices plus m + 1 column pointers). For a logical matrix there is no need to store the value of the non-zero entries: these are all true. Correspondingly, you would do best to convert your logical sparse matrix into a list of the linear indices of its non-zero entries, together with n and m, which requires only nnz + 2 integers of storage. From these (and ind2sub) you can readily reconstruct the subscripts corresponding to any non-zero entry that you choose randomly using randi over the range 1..nnz.
find is the standard interface to get the non-zero elements in a sparse matrix. Have a look here http://www.mathworks.se/help/techdoc/math/f6-9182.html#f6-13040
[i,j,s] = find(S)
find returns the row indices of nonzero values in vector i, the column indices in vector j, and the nonzero values themselves in the vector s.
No need to get s. Just pick a random index in i,j.
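A minimal sketch of that idea, with a made-up sparse logical matrix and an assumed sample size of 3 (drawn with replacement):

```matlab
% Hypothetical 5x4 sparse logical matrix with four non-zero entries
S = sparse([1 3 2 5], [2 2 4 1], true, 5, 4);

[i, j] = find(S);                  % coordinates of the non-zero entries

nDraws = 3;                        % number of samples (assumed)
pick   = randi(numel(i), nDraws, 1);

rows = i(pick);                    % row of each sampled entry
cols = j(pick);                    % column of each sampled entry
```

Every (rows(k), cols(k)) pair is guaranteed to be a non-zero location, so no rejection step is needed.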
By representing the entries in a 3 column format, aka a coordinate list (i, j, value), you can simply select the items from the list. To get this, you can either use your original method for creating the sparse matrix (i.e. the precursor to sparse()), or use the find command, a la [i,j,s] = find(S);
If you don't need the entries, and it seems you don't, you can just extract i and j.
If, for some reason, your matrix is massive and your RAM limitations are severe, you can simply divide the matrix into regions, and let the probability of selecting a given sub-matrix be proportional to the number of non-zero elements (using nnz) in that sub-matrix. You could go so far as to divide the matrix into individual columns, and the rest of the calculation is trivial. NB: by applying sum to the matrix, you can get the per-column counts (assuming your entries are just 1s).
In this way, you need not even bother with rejection sampling (which seems pointless to me in this case, since Matlab knows where all of the non-zero entries are).
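The column-by-column variant could look like the sketch below. It is only an illustration: the matrix is made up, and the cumulative-sum lookup is one possible way to pick a column with probability proportional to its number of non-zeros.

```matlab
% Hypothetical 5x4 sparse logical matrix; column 3 is empty
S = sparse([1 3 2 5 4], [2 2 4 1 4], true, 5, 4);

counts = full(sum(S, 1));          % non-zeros per column
cum    = cumsum(counts);           % running total across columns

% Draw the r-th non-zero entry (in column-major order) uniformly at random
r   = randi(cum(end));
col = find(cum >= r, 1, 'first');  % column that contains the r-th entry

% Position of that entry among the non-zeros of the chosen column
within    = r - (cum(col) - counts(col));
rowsInCol = find(S(:, col));       % only this one column is expanded
row       = rowsInCol(within);
```

Only the per-column counts and the non-zero rows of a single column are held in memory at any time, and empty columns can never be selected.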