I have encountered a problem in MATLAB while running a loop. In each iteration, the eigenvalues and eigenvectors of a 3x3 matrix are calculated (the matrix differs with each iteration). Each iteration should always yield one eigenvector of the form [0 a 0], where only the middle value, a, is non-zero.
I need to obtain the index of the column of the eigenvector-matrix where this occurs. To do this I set up the following loop within my main-loop (where the matrix is generated):
for i = 1:3
if (eigenvectors(1,i)==0) && (eigenvectors(3,i)==0)
index_sh = i
end
end
The problem is that the eigenvector matrix in question will sometimes have an output of the form:
eigenvectors =
-0.7310 -0.6824 0
0 0 1.0000
0.6824 -0.7310 0
and in this case my code works well, and I get index_sh = 3. However, sometimes the matrix is of the form:
eigenvectors =
0.0000 0.6663 0.7457
-1.0000 0.0000 0.0000
-0.0000 -0.7457 0.6663
In this case, MATLAB does not assign any value to index_sh, even though I want index_sh to be equal to 1.
If anyone knows how I can tackle this problem, so that MATLAB also assigns a value when the zeros are displayed as 0.0000, I would be very grateful!
The problem is, very likely, that those "0.0000" are not exactly 0. To solve that, choose a tolerance and use it when comparing with 0:
tol = 1e-6;
index_sh = find(abs(eigenvectors(1,:))<tol & abs(eigenvectors(3,:))<tol);
In your code:
for ii = 1:3
if abs(eigenvectors(1,ii))<tol && abs(eigenvectors(3,ii))<tol
index_sh = ii
end
end
Or, instead of a tolerance, you could choose the column whose first- and third-row entries are closest to 0:
[~, index_sh] = min(abs(eigenvectors(1,:)) + abs(eigenvectors(3,:)));
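To see why the exact comparison fails, here is a minimal sketch. The matrix is made up for illustration (it is not the asker's data): its "zero" entries are tiny nonzero values that would still display as 0.0000 or -0.0000.

```matlab
% Hypothetical eigenvector matrix: the tiny entries print as 0.0000
eigenvectors = [ 1e-17   0.6663   0.7457;
                -1.0000  2e-17   -3e-17;
                -2e-17  -0.7457   0.6663];
tol = 1e-6;  % assumed tolerance, as in the answer above

% Exact comparison finds no column, because 1e-17 is not equal to 0
exact_hits = find(eigenvectors(1,:) == 0 & eigenvectors(3,:) == 0);

% Tolerance-based comparison finds the column of the form [0 a 0]
tol_hits = find(abs(eigenvectors(1,:)) < tol & abs(eigenvectors(3,:)) < tol);

% Tolerance-free alternative: column with the smallest first/third entries
[~, min_hit] = min(abs(eigenvectors(1,:)) + abs(eigenvectors(3,:)));
```

Here exact_hits comes back empty, while both tol_hits and min_hit pick the first column.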
In my code, I need to divide each value of a matrix by the corresponding value of another. I could use A./B, but some elements in B are 0. I know that if B(i,j) = 0 then A(i,j) = 0 too, and I want 0/0 = 0. So I wrote a function div and used bsxfun, but I get NaN instead of 0:
A = [1,0;1,1];
B = [1,0;1,2];
function n = div(a,b)
if(b==0)
n = 0;
else
n = a./b;
end
end
C = bsxfun(@div,A,B);
Why not just replace the unwanted values after?
C=A./B;
C(A==0 & B==0)=0;
You could do C(isnan(C))=0;, but this will replace all NaN, even the ones not created by 0/0. If zeros always happen together then just C(B==0)=0; will do
If you know the non-zero values in B are never smaller in magnitude than some very small number eps (for example 1e-300), a simple trick is to add eps to B. All non-zero values are unchanged to within rounding, while all zero values become eps. Dividing 0 by eps then gives the desired result.
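A small sketch of the two fixes suggested so far, using the A and B from the question. The variable name tiny is my own choice, picked to avoid confusion with MATLAB's built-in eps (about 2.2e-16), which is not small enough to leave a value such as 1 unchanged.

```matlab
A = [1 0; 1 1];
B = [1 0; 1 2];

% Fix 1: divide, then overwrite the entries where B is zero
C1 = A ./ B;          % C1(1,2) is NaN here (0/0)
C1(B == 0) = 0;

% Fix 2: shift B by a value far below the smallest non-zero entry
tiny = 1e-300;        % assumed lower bound, hypothetical name
C2 = A ./ (B + tiny); % 1 + 1e-300 rounds back to exactly 1

% The built-in eps behaves differently: 1 + eps is a distinct double
eps_changes_one = (1 + eps ~= 1);
```

Both C1 and C2 end up as [1 0; 1 0.5] for this input. The second fix relies on the stated assumption about the magnitudes in B.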
The reason this is happening is because bsxfun doesn't process the arrays element-wise. Consequently, your function doesn't get two scalars in. It is actually called only once. Your if statement does not work for non-scalar values of b.
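The behaviour of if with a non-scalar condition can be checked directly. This sketch assumes, as described above, that div receives the full arrays in a single call. An if on an array takes the true branch only when every element is nonzero, so the else branch runs and the plain division produces NaN for 0/0.

```matlab
A = [1 0; 1 1];
B = [1 0; 1 2];

% B == 0 is the logical matrix [0 1; 0 0]; not all elements are true
if (B == 0)
    branch = 'if';
else
    branch = 'else';   % this branch is taken
end

% What the else branch computes when given the whole arrays
C_whole = A ./ B;      % contains NaN at position (1,2)
```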
Replacing bsxfun with arrayfun will call your function with scalar inputs, and will yield the expected result:
>> C = arrayfun(@div,A,B)
C =
1.0000 0
1.0000 0.5000
Nonetheless, either of the other two answers will be more efficient:
>> C = A./B;
>> C(B==0) = 0 % Ander's answer
C =
1.0000 0
1.0000 0.5000
or
C = A./(B+eps) % user10259794's answer
C =
1.0000 0
1.0000 0.5000
I have a matrix X with 3 columns. For the purposes of the question, X=randn(5,3).
I want to normalize the columns of X such that each column has a mean of 0 and a standard deviation of 1. I was using the following code:
X=(X-mean(X))./std(X);
I am getting an std of 1. My mean, however, is a very small value close to 0 but not exactly 0. I tried playing a bit with the numbers to find an explanation:
X=1:15;
X=reshape(X,[5 3]);
mean(X-mean(X));
Which gives me 0 value for each column.
X=1:15;
X=reshape(X,[5 3]);
mean((X-mean(X))./std(X));
Which does not. But 0/anything is still 0. What am I missing?
Why am I not getting 0 values?
Are the values I am getting good enough for a pre-clustering algorithm normalization?
Here is a version that does what I think you're trying to do. You need to replicate the matrix because X-mean(X) isn't valid in releases without implicit expansion (before R2016b): there you can't subtract a 1x3 from a 5x3.
r = 5; c = 3;
X=randn(r,c);
Xm=repmat(mean(X),r,1);
Xstd = repmat(std(X),r,1);
Xn = (X-Xm)./Xstd;
mean(Xn)
std(Xn)
For me this prints out
ans =
1.0e-16 *
-0.6661 0 0.4441
ans =
1.0000 1.0000 1.0000
Which seems like exactly what you're looking for... note the 1e-16 multiplier on the mean values... this is essentially 0, with some floating point error.
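Rather than inspecting the printed means by eye, the result can be checked against a tolerance. This is a sketch using the integer matrix from the question. The 1e-12 threshold is an assumption chosen to sit well above rounding error for data of this size.

```matlab
X = reshape(1:15, [5 3]);
r = size(X, 1);

% Replicate the column statistics so the code also runs without
% implicit expansion
Xm   = repmat(mean(X), r, 1);
Xstd = repmat(std(X),  r, 1);
Xn   = (X - Xm) ./ Xstd;

tol = 1e-12;  % assumed tolerance
is_zero_mean = all(abs(mean(Xn)) < tol);      % means are 0 up to rounding
is_unit_std  = all(abs(std(Xn) - 1) < tol);   % stds are 1 up to rounding
```

Means of this size are harmless for a normalization step, since they are far below any meaningful scale in the data.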
I am trying to write a method that checks whether a matrix is orthogonal and returns TRUE if it is or FALSE if it isn't. My problem is that isequal() is not working the way I want it to. Basically I can do the check in two ways, based on two formulas:
ONE way is check to see if the transpose of matrix R is equal to the inverse of matrix R. If they are equal then it is orthogonal. (R'=inv(R))
ANOTHER way is to check whether matrix R times the transpose of matrix R equals the identity matrix of the same size (R*R'=I). If yes, then the matrix is orthogonal. I have mostly been using isequal(), but it keeps yielding false. Can someone look at my code and tell me why this would be so?
I use Z=orth(randn(3,3)) to generate a random orthogonal matrix and I call my method as isortho(Z).
function R = isortho(r)
%isortho(R), which returns true if R is orthogonal matrix, otherwise returns false.
if ismatrix(r) && size(r,1)==size(r,2) %checks if input is square matrix
'------'
trans=transpose(r)
inverted=inv(r)
isequal(trans,inverted)
trans==inverted
isequal(transpose(r),inv(r)) %METHOD ONE
i=size(r,1);
I=eye(i) %creating Identity matrix based on size of r
r*transpose(r)
r*transpose(r)==I %METHOD TWO
%check if r times the transpose of r equals the identity matrix
if (r*transpose(r)==I)
R= 'True';
else
R= 'False';
end
end
end
this is my output:
>> isortho(Z)
ans =
------
trans =
-0.2579 -0.7291 -0.6339
0.8740 0.1035 -0.4747
0.4117 -0.6765 0.6106
inverted =
-0.2579 -0.7291 -0.6339
0.8740 0.1035 -0.4747
0.4117 -0.6765 0.6106
ans = ////isequal(trans,inverted) which yielded 0 false
0
ans = ////trans==inverted
0 1 0
1 0 0
0 1 1
ans = ////isequal(transpose(r),inv(r))
0
I =
1 0 0
0 1 0
0 0 1
ans =
1.0000 0 0.0000
0 1.0000 0.0000
0.0000 0.0000 1.0000
ans =
1 1 0
1 1 0
0 0 1
ans =
False
>>
Could someone help me fix this, or tell me why isequal() is failing when the matrices inverted and trans appear to be the same?
As stated in the comments, you are running into computer precision issues. For more detail see Why is 24.0000 not equal to 24.0000 in MATLAB? and http://matlabgeeks.com/tips-tutorials/floating-point-comparisons-in-matlab/. This is not a Matlab specific thing, it's a computer thing, and you just have to deal with it.
In your case, you are trying to see whether two things are equal, but the two things are the result of a lot of floating point operations. So they will virtually never be exactly the same, but should always be very close. So, set a tolerance, say 1e-12, and say that the two things are equal if some measure of their difference is below that tolerance, e.g.:
norm(r.'-inv(r))<tol
This finds the 2-norm of the difference between the two matrices; if it is less than tol, the expression evaluates to 1, or true.
If I set tol=1e-12, then everything works well. If I set tol=1e-15, everything works well. But if I set tol=1e-16, then everything stops working! This is because the accumulated floating-point error is larger than 1e-16, so the result of norm(r.'-inv(r)) cannot be accurate to that tolerance. The relative spacing between adjacent double-precision numbers (eps) on my computer is roughly 2.2x10^(-16), so you have to ensure that your tolerance is set well above this value. Setting tol too large will, of course, mean you say some non-orthogonal matrices are orthogonal, but I would not expect tol=1e-14 to give you any significant issues.
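A minimal sketch of the tolerance-based check. It uses the R*R' = I form from the question instead of inv, which is my own substitution, and the test matrices are made up for illustration. The tolerance of 1e-12 is the value suggested above.

```matlab
tol = 1e-12;  % tolerance suggested in the answer

% A 2x2 rotation matrix is orthogonal by construction
theta = pi/6;
R = [cos(theta) -sin(theta);
     sin(theta)  cos(theta)];

% Exact comparison: may be true or false depending on rounding
exact_equal = isequal(R*R.', eye(2));

% Tolerance-based comparison on the 2-norm of the residual
is_ortho_R = norm(R*R.' - eye(size(R,1))) < tol;

% A matrix that is clearly not orthogonal, for contrast
S = [1 2; 3 4];
is_ortho_S = norm(S*S.' - eye(size(S,1))) < tol;
```

With these inputs is_ortho_R is true and is_ortho_S is false, whatever exact_equal turns out to be.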
In Matlab, there is the unique command, which returns the unique rows in an array. This is a very handy command.
But the problem is that I can't assign tolerance to it-- in double precision, we always have to compare two elements within a precision. Is there a built-in command that returns unique elements, within a certain tolerance?
With R2015a, this question finally has a simple answer (see my other answer to this question for details). For releases prior to R2015a, there is such a built-in (undocumented) function: _mergesimpts. A safe guess at the composition of the name is "merge similar points".
The function is called with the following syntax:
xMerged = builtin('_mergesimpts',x,tol,[type])
The data array x is N-by-D, where N is the number of points, and D is the number of dimensions. The tolerances for each dimension are specified by a D-element row vector, tol. The optional input argument type is a string ('first' (default) or 'average') indicating how to merge similar elements.
The output xMerged will be M-by-D, where M<=N. It is sorted.
Examples, 1D data:
>> x = [1; 1.1; 1.05]; % elements need not be sorted
>> builtin('_mergesimpts',x,eps) % but the output is sorted
ans =
1.0000
1.0500
1.1000
Merge types:
>> builtin('_mergesimpts',x,0.1,'first')
ans =
1.0000 % first of [1, 1.05] since abs(1 - 1.05) < 0.1
1.1000
>> builtin('_mergesimpts',x,0.1,'average')
ans =
1.0250 % average of [1, 1.05]
1.1000
>> builtin('_mergesimpts',x,0.2,'average')
ans =
1.0500 % average of [1, 1.1, 1.05]
Examples, 2D data:
>> x = [1 2; 1.06 2; 1.1 2; 1.1 2.03]
x =
1.0000 2.0000
1.0600 2.0000
1.1000 2.0000
1.1000 2.0300
All 2D points unique to machine precision:
>> xMerged = builtin('_mergesimpts',x,[eps eps],'first')
xMerged =
1.0000 2.0000
1.0600 2.0000
1.1000 2.0000
1.1000 2.0300
Merge based on second dimension tolerance:
>> xMerged = builtin('_mergesimpts',x,[eps 0.1],'first')
xMerged =
1.0000 2.0000
1.0600 2.0000
1.1000 2.0000 % first of rows 3 and 4
>> xMerged = builtin('_mergesimpts',x,[eps 0.1],'average')
xMerged =
1.0000 2.0000
1.0600 2.0000
1.1000 2.0150 % average of rows 3 and 4
Merge based on first dimension tolerance:
>> xMerged = builtin('_mergesimpts',x,[0.2 eps],'average')
xMerged =
1.0533 2.0000 % average of rows 1 to 3
1.1000 2.0300
>> xMerged = builtin('_mergesimpts',x,[0.05 eps],'average')
xMerged =
1.0000 2.0000
1.0800 2.0000 % average of rows 2 and 3
1.1000 2.0300 % row 4 not merged because of second dimension
Merge based on both dimensions:
>> xMerged = builtin('_mergesimpts',x,[0.05 .1],'average')
xMerged =
1.0000 2.0000
1.0867 2.0100 % average of rows 2 to 4
This is a difficult problem. I'd even claim it to be impossible to solve in general, because of what I'd call the transitivity problem. Suppose that we have three elements in a set, {A,B,C}. I'll define a simple function isSimilarTo, such that isSimilarTo(A,B) will return a true result if the two inputs are within a specified tolerance of each other. (Note that everything I will say here is meaningful in one dimension as well as in multiple dimensions.) So if two numbers are known to be "similar" to each other, then we will choose to group them together.
So suppose we have values {A,B,C} such that isSimilarTo(A,B) is true, and that isSimilarTo(B,C) is also true. Should we decide to group all three together, even though isSimilarTo(A,C) is false?
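The situation is easy to reproduce in one dimension. In this sketch, isSimilarTo is written as an anonymous function with an absolute tolerance. The values of A, B, C and tol are arbitrary choices for illustration.

```matlab
tol = 1;                                   % assumed tolerance
isSimilarTo = @(u, v) abs(u - v) <= tol;   % hypothetical helper

A = 0;
B = 0.9;
C = 1.8;

ab = isSimilarTo(A, B);   % true:  |0 - 0.9|   = 0.9
bc = isSimilarTo(B, C);   % true:  |0.9 - 1.8| = 0.9
ac = isSimilarTo(A, C);   % false: |0 - 1.8|   = 1.8
```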
Worse, move to two dimensions. Start with k points equally spaced around the perimeter of a circle. Assume the tolerance is chosen such that any point is within the specified tolerance of its immediate neighbors, but not to any other point. How would you choose to resolve which points are "unique" in the setting?
I'll claim that this problem of intransitivity makes the grouping problem impossible to resolve, at least not perfectly, and certainly not in any efficient manner. Perhaps one might try an approach based on a k-means style of aggregation. But this will be quite inefficient, and such an approach generally needs to know in advance the number of groups to look for.
Having said that, I would still offer a compromise, something that can sometimes work within limits. The trick is found in Consolidator, as found on the Matlab Central file exchange. My approach was to effectively round the inputs to within the specified tolerance. Having done that, a combination of unique and accumarray allows the aggregation to be done efficiently, even for large sets of data in one or many dimensions.
This is a reasonable approach when the tolerance is large enough that when multiple pieces of data belong together, they will be rounded to the same value, with occasional errors made by the rounding step.
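The round-then-aggregate idea can be sketched in a few lines. This is not the Consolidator source, only an illustration of the combination of rounding, unique and accumarray described above, with made-up data and an assumed tolerance.

```matlab
x   = [1.00; 1.04; 2.51; 2.49; 1.02];   % made-up data
tol = 0.1;                              % assumed tolerance

% Map each value to an integer bin of width tol
keys = round(x / tol);

% Group identical bins; grp(k) is the group index of x(k)
[ukeys, ~, grp] = unique(keys);

% Average the original values within each group
merged = accumarray(grp(:), x, [], @mean);
```

For this input the result has two entries, close to 1.02 and 2.5. Values that straddle a bin boundary can still end up in different groups, which is the occasional rounding error mentioned above.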
As of R2015a, there is finally a function to do this, uniquetol (before R2015a, see my other answer):
uniquetol Set unique within a tolerance.
uniquetol is similar to unique. Whereas unique performs exact comparisons, uniquetol performs comparisons using a tolerance.
The syntax is straightforward:
C = uniquetol(A,TOL) returns the unique values in A using tolerance TOL.
As are the semantics:
Each value of C is within tolerance of one value of A, but no two elements in C are within tolerance of each other. C is sorted in ascending order. Two values u and v are within tolerance if:
abs(u-v) <= TOL*max(A(:),[],1)
It can also operate "ByRows", and the tolerance can be scaled by an input "DataScale" rather than by the maximum value in the input data.
But there is an important note about uniqueness of the solutions:
There can be multiple valid C outputs that satisfy the condition, "no two elements in C are within tolerance of each other." For example, swapping columns in A can result in a different solution being returned, because the input is sorted lexicographically by the columns. Another result is that uniquetol(-A,TOL) may not give the same results as -uniquetol(A,TOL).
There is also a new function, ismembertol, which is related to ismember in the same way.
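A short usage sketch, assuming R2015a or later. The data are made up. Because several outputs can be valid, the example only looks at how many values survive, not at which representative is returned.

```matlab
A   = [1; 1 + 1e-12; 2; 2 - 1e-12];   % made-up data with near-duplicates
tol = 1e-9;                           % relative tolerance

% By default the tolerance is scaled by the magnitude of the data
C = uniquetol(A, tol);
n_unique = numel(C);                  % two clusters remain

% ismembertol answers "is this value within tolerance of any element?"
tf = ismembertol(2 + 1e-12, A, tol);
```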
There is no such function that I know of. One tricky aspect is that if your tolerance is, say, 1e-10, and you have a vector with values that are equally spaced at 9e-11, the first and the third entry are not the same, but the first is the same as the second, and the second is the same as the third - so how many "uniques" are there?
One way to solve the problem is that you round your values to a desired precision, and then run unique on that. You can do that using round2 (http://www.mathworks.com/matlabcentral/fileexchange/4261-round2), or using the following simple way:
r = rand(100,1); % some random data
roundedData = round(r*1e6)/1e6; % round to 1e-6
uniqueValues = unique(roundedData);
You could also do it using the hist command, as long as the precision is not too high:
r = rand(100,1); % create 100 random values between 0 and 1
centers = 0:0.001:1; % creates a vector of uniformly spaced values (avoids shadowing the grid function)
counts = hist(r,centers); % now you know for each element in 'centers' how many values there are
uniqueValues = centers(counts>0); % and these are the uniques
I've come across this problem before. The trick is to first sort the data and then use the diff function to find the difference between each item and the next. Then check where that difference is greater than your tolerance.
This is the code that I use:
tol = 0.001;
[Y, I] = sort(items(:));
uni_mask = [true; diff(Y) > tol]; % always keep the first sorted item
%if you just want the unique items:
uni_items = Y(uni_mask); % in sorted order
uni_items_orig = items(sort(I(uni_mask))); % in the original order
This doesn't take care of "drifting" ... so something like 0:0.00001:100 would actually return one unique value.
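Here is a small sketch of the sort-and-diff idea, followed by the drifting case. The data are made up. The mask is built as [true; diff(Y) > tol], which is my own variation that always keeps the smallest item regardless of how close it is to zero.

```matlab
items = [0.5003; 0.2; 0.5; 0.2004; 0.9];   % made-up data
tol   = 0.001;

[Y, I] = sort(items(:));
mask = [true; diff(Y) > tol];     % start of each run of close values
uni_sorted = Y(mask);             % one representative per run

% Drifting: every neighbour is within tol, so the whole range collapses
drift   = (0:0.0005:1).';
n_drift = nnz([true; diff(sort(drift)) > tol]);
```

For items this keeps 0.2, 0.5 and 0.9. For the drifting vector only a single value is kept, even though its ends are far apart.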
If you want something that can handle "drifting" then I would use histc but you need to make some sort of rough guess as to how many items you're willing to have.
NUM = round(numel(items) / 10); % a rough guess
bins = linspace(min(items), max(items), NUM);
counts = histc(items, bins);
uni_items = bins(counts > 0);
BTW: I wrote this in a text-editor away from matlab so there may be some stupid typos or off by one errors.
Hope that helps
This is hard to define well. Assume you have a tolerance of 1.
Then what would be the outcome of [1; 2; 3; 4]?
When you have multiple columns a definition could become even more challenging.
However, if you are mostly worried about rounding issues, you can solve most of it by one of these two approaches:
Round all numbers (considering your tolerance), and then use unique
Start with the top row as your unique set, use ismemberf to determine whether each new row is unique and if so, add it to your unique set.
The first approach has the weakness that 0.499999999 and 0.500000000 may not be seen as duplicates. Whilst the second approach has the weakness that the order of your input matters.
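The weakness of the rounding approach shows up with the two values just mentioned. This sketch rounds to the nearest integer for simplicity. The same effect occurs at any rounding precision whenever two close values fall on opposite sides of a rounding boundary.

```matlab
a = 0.499999999;
b = 0.500000000;

gap = abs(a - b);   % about 1e-9, far below any sensible tolerance

ra = round(a);      % rounds down to 0
rb = round(b);      % rounds up to 1

n_unique = numel(unique([ra; rb]));   % both survive as "unique"
```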
I was stuck the other day with a MatLab 2010, so, no round(X,n), no _mergesimpts (At least I couldn't get it to work) so, a simple solution that works (at least for my data):
Using rat default tolerance:
unique(cellstr(rat(x)))
Other tolerance:
unique(cellstr(rat(x,tol)))
I want to delete several specific values from a matrix (if they exist). It is highly probable that there are multiple copies of the values in the matrix.
For example, consider an N-by-2 matrix intersections. If the pairs of values [a b] and [c d] exist as rows in that matrix, I want to delete them.
Let's say I want to delete rows like [-2.0 0.5] and [7 7] in the following matrix:
intersections =
-4.0000 0.5000
-2.0000 0.5000
2.0000 3.0000
4.0000 0.5000
-2.0000 0.5000
So that after deletion I get:
intersections =
-4.0000 0.5000
2.0000 3.0000
4.0000 0.5000
What's the most efficient/elegant way to do this?
Try this one-liner (where A is your intersection matrix and B is the value to remove):
A = [-4.0 0.5;
-2.0 0.5;
2.0 3.0;
4.0 0.5;
-2.0 0.5];
B = [-2.0 0.5];
A = A(~all(A == repmat(B,size(A,1),1),2),:);
Then just repeat the last line for each new B you want to remove.
EDIT:
...and here's another option:
A = A((A(:,1) ~= B(1)) | (A(:,2) ~= B(2)),:);
WARNING: The answers here are best used for cases where small floating point errors are not expected (i.e. with integer values). As noted in this follow-up question, using the "==" and "~=" operators can cause unwanted results. In such cases, the above options should be modified to use relational operators instead of equality operators. For example, the second option I added would be changed to:
tolerance = 0.001; % Or whatever limit you want to set
A = A((abs(A(:,1)-B(1)) > tolerance) | (abs(A(:,2)-B(2)) > tolerance),:);
Just a quick heads-up! =)
SOME RUDIMENTARY TIMING:
In case anyone was really interested in efficiency, I just did some simple timing for three different ways to get the subindex for the matrix (the two options I've listed above and Fanfan's STRMATCH option):
>> % Timing for option #1 indexing:
>> tic; for i=1:10000, index = ~all(A == repmat(B,size(A,1),1),2); end; toc;
Elapsed time is 0.262648 seconds.
>> % Timing for option #2 indexing:
>> tic; for i=1:10000, index = (A(:,1) ~= B(1)) | (A(:,2) ~= B(2)); end; toc;
Elapsed time is 0.100858 seconds.
>> % Timing for STRMATCH indexing:
>> tic; for i=1:10000, index = strmatch(B,A); end; toc;
Elapsed time is 0.192306 seconds.
As you can see, the STRMATCH option is faster than my first suggestion, but my second suggestion is the fastest of all three. Note however that my options and Fanfan's do slightly different things: my options return logical indices of the rows to keep, and Fanfan's returns linear indices of the rows to remove. That's why the STRMATCH option uses the form:
A(index,:) = [];
while mine use the form:
A = A(index,:);
However, my indices can be negated to use the first form (indexing rows to remove):
A(all(A == repmat(B,size(A,1),1),2),:) = []; % For option #1
A((A(:,1) == B(1)) & (A(:,2) == B(2)),:) = []; % For option #2
The simple solution here is to look to set membership functions, i.e., setdiff, union, and ismember.
A = [-4 0.5;
-2 0.5;
2 3;
4 0.5;
-2 0.5];
B = [-2 .5;7 7];
See what ismember does with the two arrays. Use the 'rows' option.
ismember(A,B,'rows')
ans =
0
1
0
0
1
Since we wish to delete rows of A that are also in B, just do this:
A(ismember(A,B,'rows'),:) = []
A =
-4 0.5
2 3
4 0.5
Beware that set membership functions look for an EXACT match. Integers or multiples of 1/2 such as are in A satisfy that requirement. They are exactly represented in floating point arithmetic in MATLAB.
Had these numbers been real floating point numbers, I'd have been more careful. There I'd have used a tolerance on the difference. In that case, I might have computed the interpoint distance matrix between the two sets of numbers, removing a row of A only if it fell within some given distance of one of the rows of B.
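One way to sketch the distance-based removal described above. The last row of A is perturbed on purpose, and the tolerance is an assumed value. A loop over the rows of B is used instead of a full interpoint distance matrix to keep the example short.

```matlab
A = [-4 0.5;
     -2 0.5;
      2 3;
      4 0.5;
     -2 + 1e-13, 0.5];   % last row is a slightly perturbed copy of [-2 0.5]
B = [-2 0.5; 7 7];
tol = 1e-9;              % assumed distance tolerance

% Exact row matching misses the perturbed row
exact_mask = ismember(A, B, 'rows');

% Mark rows of A lying within tol (Euclidean distance) of any row of B
remove = false(size(A,1), 1);
for k = 1:size(B,1)
    d = sqrt(sum((A - repmat(B(k,:), size(A,1), 1)).^2, 2));
    remove = remove | (d < tol);
end
A_kept = A(~remove, :);
```

Here exact_mask flags only the second row, while the tolerance-based mask removes both near-copies of [-2 0.5].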
You can also abuse the strmatch function to suit your needs. The following code removes all occurrences of a given row b in a matrix A:
A(strmatch(b, A),:) = [];
If you need to delete more than one row, such as all rows from matrix B, iterate over them:
for b = B'
A(strmatch(b, A),:) = [];
end
Not sure when this function was introduced (using 2012b) but you can just do:
setdiff(A, B, 'rows')
ans =
-4.0000 0.5000
2.0000 3.0000
4.0000 0.5000
Based on:
A = [-4.0 0.5;
-2.0 0.5;
2.0 3.0;
4.0 0.5;
-2.0 0.5];
B = [-2.0 0.5];