Determining if any duplicate rows in two matrices in MatLab

Introduction to problem:
I'm modelling a system where I have a matrix X=([0,0,0];[0,1,0],...) where each row represents a point in 3D space. I then choose a random row, r, take all following rows, rotate them around the point represented by r, and build a new matrix, X_rot, from the rotated rows. I now want to check whether any row of X_rot is equal to any row of X (i.e. two vertices on top of each other) and, if so, reject the rotation and try again.
Actual question:
Until now I have used the following code:
X_sim=[X;X_rot];
if numel(unique(X_sim,'rows'))==numel(X_sim);
X(r+1:N+1,:,:)=X_rot;
end
This works, but it takes up over 50% of my running time, and I was wondering if anybody here knows a more efficient way to do it, since I don't need all the information that I get from unique.
P.S. If it matters, I typically have between 100 and 1000 rows in X.
Best regards,
Morten
Additional:
My x-matrix contains N+1 rows and I have 12 different rotational operations that I can apply to the sub-matrix x_rot:
step=ceil(rand()*N);
r=ceil(rand()*12);
x_rot=x(step+1:N+1,:);
x_rot=bsxfun(@minus,x_rot,x(step,:));
x_rot=x_rot*Rot(:,:,:,r);
x_rot=bsxfun(@plus,x_rot,x(step,:));

Two possible approaches (I don't know if they are faster than using unique):
Use pdist2:
d = pdist2(X, X_rot, 'hamming'); %// 0 if rows are equal, 1 if different.
%// Any distance function will do, so try those available and choose fastest
result = any(d(:)==0);
Use bsxfun:
d = squeeze(any(bsxfun(@ne, X, permute(X_rot, [3 2 1])), 2));
result = any(d(:)==0);
result is 1 if there is a row of X equal to some row of X_rot, and 0 otherwise.
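To make the shapes in the bsxfun approach concrete, here is a minimal sketch on made-up data (the values of X and X_rot are hypothetical, not taken from the question). It only illustrates what d contains and is not a claim about speed.

```matlab
% Hypothetical example data: each row is a point in 3-D space.
X     = [0 0 0; 0 1 0; 1 1 0];      % n = 3 existing points
X_rot = [2 2 0; 0 1 0];             % m = 2 rotated points

% permute turns X_rot (m x 3) into a 1 x 3 x m array, so bsxfun compares
% every row of X with every row of X_rot: the result is n x 3 x m.
neq = bsxfun(@ne, X, permute(X_rot, [3 2 1]));

% d(i, j) is true when X(i,:) and X_rot(j,:) differ in at least one coordinate.
d = squeeze(any(neq, 2));           % n x m

result = any(d(:) == 0);            % true if some pair of rows is identical
[iX, iRot] = find(~d);              % which rows coincide
```

Here row 2 of X coincides with row 2 of X_rot, so result is true and find reports that pair.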

How about ismember(X_rot, X, 'rows')?
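A minimal sketch of the ismember idea, again on small made-up matrices (X and X_rot below are hypothetical). With the 'rows' option, ismember returns one logical per row of X_rot, so any(...) reduces it to a single collision flag. Note that this tests exact equality; if the rotation matrices introduce floating-point round-off, the coordinates may need rounding before the comparison.

```matlab
% Hypothetical example data: each row is a point in 3-D space.
X     = [0 0 0; 0 1 0; 1 1 0; 1 2 0];
X_rot = [2 2 0; 0 1 0];              % second row coincides with X(2,:)

% One logical per row of X_rot: true where that row also occurs in X.
hit = ismember(X_rot, X, 'rows');

% Accept the rotation only if no rotated point lands on an existing one.
collision = any(hit);
```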

Related

How to save matrices from for loop into another matrix

I have a 5-by-200 matrix in which columns i:50:200, for i=1:50, are related to each other. For example, columns 1,51,101,151 are related to each other, and columns 49,99,149,199 are also related to each other.
I want to use a for-loop to create another matrix that re-sorts the previous matrix based on this relationship.
My code is
values=zeros(5,200);
for j=1:50
for m=1:4:200
a=factor_mat(:,j:50:200)
values(:,m)=a
end
end
However, the code does not work.

Here's what's happening. Let's say we're on the first iteration of the outer loop, so j == 1. This effectively gives you:
j = 1;
for m=1:4:200
a=factor_mat(:,j:50:200)
values(:,m)=a;
end
So you're creating the same submatrix for a (j doesn't change) 50 times and storing it at different places in the values matrix. This isn't really what you want to do.
To create each 4-column submatrix once and store them in 50 different places, you need to use j to tell you which of the 50 you're currently processing:
for j=1:50
a=factor_mat(:,j:50:200);
m=j*4; %// This gives us the **end** of the current range
values(:,m-3:m)=a;
end
I've used a little trick here, because the indices of Matlab arrays start at 1 rather than 0. I've calculated the index of the last column we want to insert. For the first group, this is column 4. Since j == 1, j * 4 == 4. Then I subtract 3 to find the first column index.
That will fix the problem you have with your loops. But loops aren't very Matlab-ish. They used to be very slow; now they're adequate. But they're still not the cool way to do things.
To do this without loops, you can use reshape and permute:
a=reshape(factor_mat,[],50,4);
b=permute(a,[1,3,2]);
values=reshape(b,[],200);
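As a quick sanity check, the following sketch compares the corrected loop with the reshape/permute version. The test matrix is a hypothetical stand-in for factor_mat in which every entry simply records its own column number, so the regrouping is easy to read off.

```matlab
% Hypothetical stand-in for factor_mat: entry (i, c) equals its column number c.
factor_mat = repmat(1:200, 5, 1);

% Loop version: place columns j, j+50, j+100, j+150 side by side.
values_loop = zeros(5, 200);
for j = 1:50
    values_loop(:, 4*j-3:4*j) = factor_mat(:, j:50:200);
end

% Vectorized version using reshape and permute.
a = reshape(factor_mat, [], 50, 4);   % 5 x 50 x 4
b = permute(a, [1, 3, 2]);            % 5 x 4 x 50
values_vec = reshape(b, [], 200);     % 5 x 200

same = isequal(values_loop, values_vec);
first_group = values_vec(1, 1:4);     % original column numbers of the first group
```

The first four output columns come from original columns 1, 51, 101 and 151, which is the grouping described in the question.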

Finding similar rows in MATLAB

I have a matrix with a large number of rows. I have another matrix that I will loop through one row at a time. For each row in the second matrix, I need to look for similar rows in the first matrix. Once all the similar rows are found, I need to know the row numbers of the similar rows. These rows will almost never be exact, so ismember does not work.
Also, the solution would preferably (not necessarily, however) give some way to set a level of similarity that would trigger the code to say it is similar and give me the row number.
Is there any way to do this? I've looked around, and I can't find anything.

You could use cosine similarity, which is the cosine of the angle between two vectors. Similar vectors (in your case, a row and your comparison vector) have a value close to 1, while dissimilar vectors have a value close to 0 (orthogonal) or below (pointing in opposite directions).
function d = cosSimilarity(u, v)
d = dot(u,v)/(norm(u)*norm(v));
end
To apply this function to all pairs of rows in the matrices M and V you could use nested for loops. Hardly the most elegant, but it will work:
numRowsM = size(M, 1);
numRowsV = size(V, 1);
similarThresh = .9;
for m = 1:numRowsM
for v = 1:numRowsV
similarity = cosSimilarity(V(v,:), M(m, :));
% Notify about similar rows
if similarity > similarThresh
disp([num2str(m) ' is similar to a row in V'])
end
end
end
Instead of nested for loops, there are definitely other ways. You could start by looking at the solution from this question, which will help you avoid the loop by converting the rows of the matrix into cells of a cell array and then applying the function with cellfun.
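Another loop-free option, shown here only as a sketch and not as the approach from the linked question, is to normalize every row to unit length and take a single matrix product. The data and the names Mn, Vn and S are hypothetical, and the code assumes neither matrix contains an all-zero row.

```matlab
% Hypothetical example data: rows of M are compared against rows of V.
M = [1 0 0; 0 1 0; 1 1 0];
V = [2 0 0; 1 1.1 0];
similarThresh = 0.9;

% Scale every row to unit length (assumes no all-zero rows).
Mn = bsxfun(@rdivide, M, sqrt(sum(M.^2, 2)));
Vn = bsxfun(@rdivide, V, sqrt(sum(V.^2, 2)));

% S(m, v) is the cosine similarity between M(m,:) and V(v,:).
S = Mn * Vn';

% Row numbers in M, and the matching row numbers in V, above the threshold.
[rowsM, rowsV] = find(S > similarThresh);
```

Each pair (rowsM(k), rowsV(k)) identifies a row of M that is similar to a row of V, and changing similarThresh adjusts how strict the match is.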

Changing numbers for given indices between matrices

I'm struggling with one of my MATLAB assignments. I want to create 10 different models. Each of them is based on the same original array of dimensions 1x100 m_est. Then with for loop I am choosing 5 random values from the original model and want to add the same random value to each of them. The cycle repeats 10 times, choosing different values each time and adding a different random number. Here is a part of my code:
steps=10;
for s=1:steps
for i=1:1:5
rl(s,i)=m_est(randi(numel(m_est)));
rl_nr(s,i)=find(rl(s,i)==m_est);
a=-1;
b=1;
r(s)=(b-a)*rand(1,1)+a;
end
pert_layers(s,:)=rl(s,:)+r(s);
M=repmat(m_est',s,1);
end
for k=steps
for m=1:1:5
M_pert=M;
M_pert(1:k,rl_nr(k,1:m))=pert_layers(1:k,1:m);
end
end
In matrix M I am storing 10 initial models and want to replace the random numbers with indices from rl_nr matrix into those stored in pert_layers matrix. However, the last loop responsible for assigning values from pert_layers to rl_nr indices does not work properly.
Does anyone know how to solve this?
Best regards

Your code uses a lot of loops and in this particular circumstance, it's quite inefficient. It's better if you actually vectorize your code. As such, let me go through your problem description one point at a time and let's code up each part (if applicable):
I want to create 10 different models. Each of them is based on the same original array of dimensions 1x100 m_est.
I'm interpreting this as you having an array m_est of 100 elements, and with this array, you wish to create 10 different "models", where each model is 5 elements sampled from m_est. rl will store these values from m_est while rl_nr will store the indices / locations of where these values originated from. Also, for each model, you wish to add a random value to every element that is part of this model.
Then with for loop I am choosing 5 random values from the original model and want to add the same random value to each of them.
Instead of doing this with a for loop, generate all of your random indices in one go. Since you have 10 steps, and we wish to sample 5 points per step, you have 10*5 = 50 points in total. As such, why don't you use randperm instead? randperm is exactly what you're looking for, and we can use this to generate unique random indices so that we can ultimately use this to sample from m_est. randperm generates a vector from 1 to N but returns a random permutation of these elements. This way, you only get numbers enumerated from 1 to N exactly once and we will ensure no repeats. As such, simply use randperm to generate 50 elements, then reshape this array into a matrix of size 10 x 5, where the number of rows tells you the number of steps you want, while the number of columns is the total number of points per model. Therefore, do something like this:
num_steps = 10;
num_points_model = 5;
ind = randperm(numel(m_est));
ind = ind(1:num_steps*num_points_model);
rl_nr = reshape(ind, num_steps, num_points_model);
rl = m_est(rl_nr);
The first two lines are pretty straightforward. We are just declaring the total number of steps you want to take, as well as the total number of points per model. Next, we generate a random permutation of length 100, whose elements run from 1 to 100 in random order. Each value in the range 1 to 100 appears in this vector exactly once. Because you only want 50 points in total, simply subset this vector to keep the first 50 random indices generated by randperm. These random indices get stored in ind.
Next, we simply reshape ind into a 10 x 5 matrix to get rl_nr. rl_nr is therefore a 10 x 5 matrix of indices into m_est. Finally, rl will be a matrix of the same size as rl_nr, but it will contain the actual values sampled from m_est at the indices stored in rl_nr.
Now, the final step would be to add the same random number to each model. You can certainly use repmat to replicate a random column vector of 10 elements 5 times, so that we have 5 columns, and then add this matrix to rl, so something like:
a = -1;
b = 1;
r = (b-a)*rand(num_steps, 1) + a;
r = repmat(r, 1, num_points_model);
M_pert = rl + r;
Now M_pert is the final result you want, where we take each model that is stored in rl and add the same random value to each corresponding model in the matrix. However, if I can suggest something more efficient, I would suggest you use bsxfun instead, which does this replication under the hood. Essentially, the above code would be replaced with:
a = -1;
b = 1;
r = (b-a)*rand(num_steps, 1) + a;
M_pert = bsxfun(@plus, rl, r);
Much easier to read, and less code. M_pert will contain your models in each row, with the same random value added to each particular model.
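If the perturbed values should also be written back into ten full-length copies of the model, as the question's matrix M suggests, one possible sketch uses sub2ind. The names pert, M_full and rows, and the contents of m_est, are assumptions made for illustration and are not part of the original code.

```matlab
% Hypothetical original model: a 1x100 row vector.
m_est = linspace(1, 100, 100);
num_steps = 10;
num_points_model = 5;

% Unique random indices, one row of 5 per model.
ind = randperm(numel(m_est));
rl_nr = reshape(ind(1:num_steps*num_points_model), num_steps, num_points_model);
rl = m_est(rl_nr);

% One random offset in [-1, 1] per model, added to all 5 sampled values.
r = 2*rand(num_steps, 1) - 1;
pert = bsxfun(@plus, rl, r);

% Write the perturbed values back into 10 stacked copies of the model.
M_full = repmat(m_est, num_steps, 1);                % 10 x 100
rows = repmat((1:num_steps)', 1, num_points_model);  % row number of each entry
M_full(sub2ind(size(M_full), rows, rl_nr)) = pert;

% Count how many entries of each model were changed.
num_changed = sum(M_full ~= repmat(m_est, num_steps, 1), 2);
```

Each row of M_full is then one complete 1x100 model in which only the five sampled positions have been shifted by that model's offset.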
The cycle repeats 10 times, choosing different values each time and adding a different random number.
Already done in the above steps.
I hope you didn't find it an imposition to completely rewrite your code so that it's more vectorized, but I think this was a great opportunity to show you some of the more advanced functions that MATLAB has to offer, as well as more efficient ways to generate your random values, rather than looping and generating the values one at a time.
Hopefully this will get you started. Good luck!

Assigning the different row to another matrix after comparing two matrices

I have two matrices:
r=10,000x2
q=10,000x2
I have to find the rows of q in which one or both values (it is a two-column matrix) differ from r, and store them in another matrix. Right now I am trying the code below. I cannot use isequal because I want to know which rows
are not equal, and this code gives me the individual elements that differ rather than the complete rows.
Can anyone help, please?
if r(:,:)~=q(:,:)
IN= find(registeredPts(:,:)~=q(:,:))
end

You can probably do this using ismember. Is this what you want? Here you get the values from q in rows that are different from r.
q=[1,2;3,4;5,6]
r=[1,2;3,5;5,6]
x = q(sum(ismember(q,r),2) < 2,:)
x =
3 4
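What this does:
ismember(q,r) creates an array with 1's at the elements of q that occur somewhere in r (not necessarily at the same position), and 0's elsewhere. sum(.., 2) takes the sum along each row. If the sum is less than 2, that row is included in the new array. Because the membership test is per element, a row of q can pass even when no identical row exists in r; ismember(q,r,'rows') or any(q~=r,2) compares whole rows instead.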
Update
If the values might differ slightly due to floating point arithmetic, check out ismemberf from the File Exchange. I haven't tested it myself, but it looks good.
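For comparison, here is a minimal sketch of two whole-row tests on the same small example. Which one fits depends on whether "different from r" means different from the row at the same position or absent from r altogether; the variable names diff_same_pos and diff_anywhere are made up for illustration.

```matlab
q = [1 2; 3 4; 5 6];
r = [1 2; 3 5; 5 6];

% Rows of q that differ from the row of r at the same position.
diff_same_pos = q(any(q ~= r, 2), :);

% Rows of q that do not occur anywhere in r.
diff_anywhere = q(~ismember(q, r, 'rows'), :);
```

Both tests pick out the row 3 4 here. They would give different answers if, for instance, r contained the same rows as q in a different order.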

Using ranges in Matlab/Octave matrices

Let's say I want to create a 100x100 matrix of which every row
contains the elements 1-100
A = [1:100; 1:100; 1:100... n]
Obviously forming a matrix is a bad idea, because it would force me to
create 100 rows of range 1:100.
I think I could do it by taking a 'ones' array and multiplying every
row by a vector... but I'm not sure how to do it
a = (ones(100,100))*([])
??
Any tips?

You can use the repeat matrix function (repmat()). Your code would then look like this:
A = repmat( 1:100, 100, 1 );
This means that you're repeating the first argument of repmat 100 times vertically and once horizontally (i.e. you leave it as is horizontally).

You could multiply a column vector of 100 1s with a row vector of 1:100.
ones(3,1)*(1:3)
ans =
1 2 3
1 2 3
1 2 3
Or you could use repmat ([edit] as Phonon wrote a few seconds before me [/edit]).

Yes, repmat is the easy solution, and even arguably the right solution. But knowing how to visualize your aim and how to create something that yields that aim will give long term benefits in MATLAB. So try other solutions. For example...
cumsum(ones(100),2)
bsxfun(@plus,zeros(100,1),1:100)
ones(100,1)*(1:100)
cell2mat(repmat({1:100},100,1))
and the boring
repmat(1:100,100,1)
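A short sketch to confirm that these variants build the same matrix. The variable names A1 to A4 and n are chosen for illustration only.

```matlab
n = 100;

A1 = repmat(1:n, n, 1);                 % repeat the row vector n times
A2 = ones(n, 1) * (1:n);                % outer product with a column of ones
A3 = cumsum(ones(n), 2);                % running sum of ones along each row
A4 = bsxfun(@plus, zeros(n, 1), 1:n);   % expand a zero column against 1:n

all_equal = isequal(A1, A2, A3, A4);
```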
