MATLAB evaluates 0.9<0.9 to `True`

Below is my matlab code snippet:
clear all;
x0=0.5;
x2=1.4;
h=0.1;
while (x0 < x2)
    x0 = x0 + 0.1;
end
x0
There is no doubt that the result of x0 is 1.4. What confused me is that when I replace x2=1.4 with x2=0.8, 0.9, 1.0, 1.1, or 1.2 (any one of them), the result becomes incorrect.
For example, x2=0.9 makes the code produce x0=1.0 instead of x0=0.9. And I find that during the process, when x0 increases to 0.9, x0<x2 (0.9<0.9) yields 1 (true), which is definitely wrong.
What's going on here?

I would like to extend Dan's answer a little. Run this code:
clear all;
x0=0.5;
x2=0.9;
h=0.1;
while (x0 < x2)
    x0 = x0 + 0.1;
    disp(x2 - x0); % <-- I added this line
end
x0
Output (ideone):
0.30000
0.20000
0.10000
1.1102e-16
-0.100000
x0 = 1.00000
The 4th line of the output shows the rounding error: after adding 0.1 to 0.5 four times, the result is less than 0.9 by 1.1102e-16. Because x0 is still (barely) below x2, the loop runs one more time, and that's why x0 ends up at 1.0.

Basically you should never compare floating point numbers directly. See the link provided by Sam Roberts to better understand this.
In binary, 0.1 is a repeating fraction which your computer has to truncate. So you aren't really adding 0.1, and that rounding error compounds each time you add 0.1, eventually causing your comparison to miss. You are not actually evaluating 0.9<0.9; you are comparing an accumulated sum of truncated binary approximations of 0.1 against the truncated binary version of 0.9. The two numbers won't be the same.
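The same drift is easy to reproduce outside MATLAB; here is a quick Python sketch of the accumulation described above:

```python
x = 0.5
for _ in range(4):   # add 0.1 four times: 0.6, 0.7, 0.8, "0.9"
    x += 0.1

print(x == 0.9)   # False: the accumulated rounding error remains
print(0.9 - x)    # about 1.1e-16, matching the MATLAB output above
```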

If you truly need to compare floating point values, I would suggest doing it a little differently. You can do something like the function below, where threshold is an acceptable margin: if two numbers differ by less than threshold, they are considered equal; otherwise, they are considered different.
function result = floatCompare(a,b,operation,threshold)
    if nargin < 4
        threshold = eps;
    end
    if nargin < 3
        operation = 'eq';
    end
    switch operation
        case 'eq'
            disp('equals')
            result = abs(a-b) < threshold;
        case 'lt'
            disp('less than')
            result = b-a > threshold;
        case 'gt'
            disp('greater than')
            result = a-b > threshold;
    end
end
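For comparison, Python ships a tolerance-based comparison out of the box. One caveat: `math.isclose` uses a relative tolerance by default, unlike the absolute threshold in the MATLAB function above:

```python
import math

x = 0.5
for _ in range(4):
    x += 0.1

print(x == 0.9)              # False: direct comparison misses
print(math.isclose(x, 0.9))  # True: tolerance-based comparison succeeds
```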

Related

angle() of a real number

I'm a bit confused about the angle() function in Matlab, in particular when applied to an array of real numbers.
The angle() function should give me the phase of a complex number. Example: y = a + bi, ==> phase = arctan(b/a). Indeed, the following works:
for t = 1:1000
    comp(t) = exp(1i*(t/10));
end
phase_good_comp1 = unwrap(angle(comp)); %this gives me the right answer
b = imag(comp);
a = real(comp);
phase_good_comp2 = atan(b./a); %this gives me the right answer too, but wrapped
% (not sure if there is a way to unwrap this, but unwrap() does not work)
figure(1)
plot(phase_good_comp1)
hold on
plot(phase_good_comp2,'--r')
legend('good phase1', 'good phase2')
title('complex number')
Here's the plot for the complex numbers --
Note that I can use either the angle() function, or the explicit definition of phase, as I have shown above. Both yield good results (I can't unwrap the latter, but that's not my issue).
Now if I apply the same logic to an array of real numbers, I should get a constant phase everywhere, since no imaginary part exists, so arctan(b/a) = arctan(0) = 0. This works if I use the explicit definition of phase, but I get a weird result if I use angle():
for t = 1:1000
    ree(t) = cos((t/10));
end
phase_bad_re = unwrap(angle(ree)); %this gives me an unreasonable (?) answer
b = imag(ree);
a = real(ree);
phase_good_re = atan(b./a); %this gives me the right answer
figure(1)
plot(phase_bad_re)
hold on
plot(phase_good_re,'--r')
legend('bad phase', 'good phase')
title('real number')
Here's the plot for the real numbers --
Why the oscillation when I use angle()???
The Matlab documentation tells you how to compute this:
The angle function can be expressed as angle(z) = imag(log(z)) = atan2(imag(z),real(z)).
https://www.mathworks.com/help/matlab/ref/angle.html
Note that they define it with atan2 instead of atan.
Now your data is in the range of cosine, which includes both positive and negative numbers. The angle on the positive numbers should be 0 and the angle on the negative numbers should be an odd-integer multiple of pi in general. Using the specific definition that they've chosen to get a unique answer, it is pi. That's what you got. (Actually, for the positive numbers, any even-integer multiple of pi will do, but 0 is the "natural" choice and the one that you get from atan2.)
If you're not clear why the negative numbers don't have angle = 0, plot it out in the complex plane and keep in mind that the radial part of the complex number is positive by definition. That is z = r * exp(i*theta) for positive r and theta given by this angle you're computing.
Since the sign of the cosine function changes periodically, angle() oscillates as well.
Please, try this.
a=angle(1);
b=angle(-1);
Phase of 1+i*0 is 0, while phase of -1+i*0 is 3.14.
But, in case of atan, b/a is always 0, so that the result of atan() is all 0.
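The same behaviour can be checked with NumPy's `np.angle`, which is defined via `arctan2` just like MATLAB's `angle`:

```python
import numpy as np

print(np.angle(1.0))    # 0.0: positive reals have phase 0
print(np.angle(-1.0))   # 3.141592...: negative reals have phase pi

t = np.arange(1, 1001) / 10.0
ree = np.cos(t)
phase = np.angle(ree)
# the phase flips between 0 and pi wherever cos changes sign
print(np.unique(phase))  # only 0 and pi appear
```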

Return values differ depending on vector size

I noticed that for example the log(x) function returns slightly different values when called with vectors of different sizes in MATLAB.
Here is a minimal working example:
x1 = 0.1:0.1:1;
x2 = 0.1:0.1:1.1;
y1 = log(x1);
y2 = log(x2);
d = y1 - y2(1:length(x1));
d(7)
Executing returns:
>> ans =
-1.6653e-16
The behaviour seems to start when the vector becomes greater than 10 entries.
Although the difference is very small, when being scaled by a lot of operations using large vectors, the errors became big enough to notice.
Does anyone have an idea what is happening here?
The differences exist in x1 and x2 and those errors are propagated and potentially accentuated by log.
max(abs(x1 - x2(1:numel(x1))))
% 1.1102e-16
This is due to the inability of floating point number to represent your data exactly. See here for more information.
Per Suever’s answer, this is because for unfathomable reasons, Matlab’s colon operator [start : step : stop] with floating-point step produces non-bit-exact results even when start and step are the same, and only stop is different.
This is arguably wrong behavior, although it is not unknown: in a blog post from 2006 (search for "Typical MATLAB Pitfall"), Loren notes that the colon operator can suffer from floating-point accuracy issues.
Numpy/Python does it right:
import numpy as np
np.all(np.arange(0.1,1.0+1e-4, 0.1) == np.arange(0.1, 1.1+1e-4, 0.1)[:-1]) # => True
(np.arange(start, stop, step) doesn’t include stop so I use stop+1e-4 above.)
Julia does it right too:
all(collect(0.1 : 0.1 : 1) .== collect(0.1 : 0.1 : 1.1)[1:10]) # => true
Alternative. Here’s a straightforward guess as to what Numpy’s arange is doing, in Matlab:
function y = arange(start, stop, step)
%ARANGE An alternative to Matlab's colon operator
%
% The API for this function follows Numpy's arange [1].
%
% ARANGE(START, STOP, STEP) produces evenly-spaced values within the half-open
% interval [START, STOP). The resulting vector has CEIL((STOP - START) / STEP)
% elements and is roughly equivalent to (START : STEP : STOP - STEP / 2), but
% may differ from this COLON-based version due to numerical differences.
%
% ARANGE(START, STOP) assumes STEP of 1.0.
%
% [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html
    if nargin < 3 || isempty(step)
        step = 1.0;
    end
    len = ceil((stop - start) / step);
    y = start + (0 : len - 1) * step;
end
This function tries to keep things exact until the last possible moment, when it applies the scaling by step and shifting by start. With this, your original two vectors are bit-exact over their shared interval:
y1 = arange(0.1, 1.0 + 1e-4, 0.1);
y2 = arange(0.1, 1.1 + 1e-4, 0.1);
all(y2(1:numel(y1)) == y1) % => 1
And therefore all downstream operations like log are also bit-exact.
I will investigate whether this bug in Matlab is causing any problems in our internal code, and check whether we should enforce using linspace (which, I believe but have not checked, does not suffer as much from accuracy issues) or something like arange above instead of : for floating-point steps. (arange can also be tricky: as the docs note, depending on (stop-start)/step, you may sometimes get a vector whose last element is greater than stop. Those same docs recommend using linspace for non-unit steps.)
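The "keep things exact until the last possible moment" trick is easy to verify in Python as well; this sketch mirrors the arange function above (the helper name is my own):

```python
import numpy as np

def arange_like(start, stop, step):
    # build the integer grid first; scale and shift only at the end
    n = int(np.ceil((stop - start) / step))
    return start + np.arange(n) * step

y1 = arange_like(0.1, 1.0 + 1e-4, 0.1)
y2 = arange_like(0.1, 1.1 + 1e-4, 0.1)
print(np.array_equal(y1, y2[:len(y1)]))                    # True: bit-exact
print(np.array_equal(np.log(y1), np.log(y2[:len(y1)])))    # True downstream too
```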

MatLab using Fixed Point method to find a root

I want to find a root of the following equation with an error less than 0.05%:
3*x*tan(x) = 1
In MATLAB I've written this code to do so:
clc,close all
syms x;
x0 = 3.5
f= 3*x*tan(x)-1;
df = diff(f,x);
while (1)
x1 = 1 / 3*tan(x0)
%DIRV.. z= tan(x0)^2/3 + 1/3
er = (abs((x1 - x0)/x1))*100
if ( er <= 0.05)
break;
end
x0 = x1;
pause(1)
end
But it keeps running in an infinite loop with error 200.00, and I don't know why.
Don't use while true, as that's usually uncalled for and prone to getting stuck in infinite loops, like here. Simply set a limit on the while instead:
while er > 0.05
    %//your code
end
Additionally, to prevent getting stuck in an infinite loop you can use an iteration counter and set a maximum number of iterations:
ItCount = 0;
MaxIt = 1e5; %// maximum 100,000 iterations
while er > 0.05 && ItCount < MaxIt
    %//your code
    ItCount = ItCount + 1;
end
I see four points of discussion that I'll address separately:
Why does the error seemingly saturate at 200.0 and the loop continue infinitely?
The fixed-point iterator, as written in your code, is finding the root of f(x) = x - tan(x)/3; in other words, find a value of x at which the graphs of x and tan(x)/3 cross. The only point where this is true is 0. And, if you look at the value of the iterants, the value of x1 is approaching 0. Good.
The bad news is that you are also dividing by that value converging toward 0. While the value of x1 remains finite, in a floating point arithmetic sense, the division works but may become inaccurate, and er actually goes NaN after enough iterations because x1 underflowed below the smallest denormalized number in the IEEE-754 standard.
Why is er 200 before then? It is approximately 200 because the value of x1 is approximately 1/3 of the value of x0 since tan(x)/3 locally behaves as x/3 a la its Taylor Expansion about 0. And abs(1 - 3)*100 == 200.
Divisions-by-zero and relative orders-of-magnitude are why it is sometimes best to look at the absolute and relative error measures for both the values of the independent variable and function value. If need be, even putting an extremely (relatively) small finite, constant value in the denominator of the relative calculation isn't entirely a bad thing in my mind (I remember seeing it in some numerical recipe books), but that's just a band-aid for robustness's sake that typically hides a more serious error.
This convergence is far different compared to the Newton-Raphson iterations because it has absolutely no knowledge of slope and the fixed-point iteration will converge to wherever the fixed-point is (forgive the minor tautology), assuming it does converge. Unfortunately, if I remember correctly, fixed-point convergence is only guaranteed if the function is continuous in some measure, and tan(x) is not; therefore, convergence is not guaranteed since those pesky poles get in the way.
The function it appears you want to find the root of is f(x) = 3*x*tan(x)-1. A fixed-point iterator of that function would be x = 1/(3*tan(x)) or x = 1/3*cot(x), which is looking for the intersection of 3*tan(x) and 1/x. However, due to point number (2), those iterators still behave badly since they are discontinuous.
A slightly different iterator x = atan(1/(3*x)) should behave a lot better since small values of x will produce a finite value because atan(x) is continuous along the whole real line. The only drawback is that the domain of x is limited to the interval (-pi/2,pi/2), but if it converges, I think the restriction is worth it.
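As a sanity check on that iterator, here is a small Python sketch of the fixed-point iteration x = atan(1/(3x)); the starting value and the iteration count are arbitrary choices of mine:

```python
import math

x = 3.5  # same starting guess as the original code
for _ in range(200):
    x = math.atan(1.0 / (3.0 * x))

# x now satisfies 3*x*tan(x) = 1 (root near 0.547)
print(x, 3.0 * x * math.tan(x))
```

The iteration oscillates around the fixed point but contracts toward it, since |g'(x)| = 3/(9x^2+1) is about 0.81 near the root.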
Lastly, for any similar future coding endeavors, I highly recommend @Adriaan's advice. If you would like a sort of compromise between the styles, most of my iterative functions are written with a semantic variable notDone like this:
iter = 0;
iterMax = 1E4;
tol = 0.05;
notDone = tol < er && iter < iterMax;
while notDone
    %//your code
    iter = iter + 1;
    notDone = tol < er && iter < iterMax;
end
You can add flags and all that jazz, but that format is what I frequently use.
I believe that the code below achieves what you are after using Newton's method for the convergence. Please leave a comment if I have missed something.
% find x: 3*x*tan(x) = 1
f = @(x) 3*x*tan(x)-1;
dfdx = @(x) 3*tan(x)+3*x*sec(x)^2;
tolerance = 0.05; % your value?
perturbation = 1e-2;
converged = 1;
x = 3.5;
f_x = f(x);
% Use Newton's method to find the root
count = 0;
err = 10*tolerance; % something bigger than tolerance to start
while (err >= tolerance)
    count = count + 1;
    if (count > 1e3)
        converged = 0;
        disp('Did not converge.');
        break;
    end
    x0 = x;
    dfdx_x = dfdx(x);
    if (dfdx_x ~= 0)
        % Avoid division by zero
        f_x = f(x);
        x = x - f_x/dfdx_x;
    else
        % Perturb x and go back to top of while loop
        x = x + perturbation;
        continue;
    end
    err = (abs((x - x0)/x))*100;
end
if (converged)
    disp(['Converged to ' num2str(x,'%10.8e') ' in ' num2str(count) ...
          ' iterations.']);
end
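The same Newton scheme is easy to prototype in Python. Note that from the starting guess 3.5 it converges to the root near x ≈ 3.244; 3x·tan(x) = 1 has infinitely many roots, one per branch of tan, and Newton's method picks whichever basin the starting guess lies in:

```python
import math

f = lambda x: 3 * x * math.tan(x) - 1
dfdx = lambda x: 3 * math.tan(x) + 3 * x / math.cos(x) ** 2

x = 3.5
for _ in range(50):
    x_prev = x
    x = x - f(x) / dfdx(x)
    if abs((x - x_prev) / x) * 100 < 0.05:  # same relative-error stop as above
        break

print(x, f(x))  # root near 3.244, small residual
```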

why am I getting imaginary numbers instead of a simple answer

This is just the part of the code that matters and needs to be fixed. I don't know what I'm doing wrong here. All the variables are simple numbers; it's true that one is needed for the other, but there shouldn't be anything wrong with that. The answer for which I'm getting imaginary numbers is supposed to be part of a loop, so it's important I get it right. Please ignore the variables that are not needed, as I only wrote out part of the code.
The answer I get is:
KrInitialFirstPart = 0.000000000000000e+00 - 1.466747615972368e+05i
clear all;
clc;
% the initial position components
rInitial= 10; %kpc
zInitial= 0; %kpc
% the initial velocity components
vrInitial= 0; %km/s
vzInitial= 150; %tangential velocity component
vtInitial= 150; %not used
% the height
h= rInitial*vzInitial; % angular momentum constant
tInitial=0;
Dt=1e-3;
e=0.99;
pc=11613.5;
KrInitialFirstPart= -4*pi*pc*sqrt( 1-(e^2) / (e^3) )*rInitial
format long
In this expression
sqrt( 1-(e^2) / (e^3) )
you have
e=0.99;
so e < 1, and therefore e^3 is less than e^2.
Therefore
(e^2)/(e^3) > 1.
The division operation binds tighter than (i.e. is evaluated ahead of) the subtraction, so you are taking the square root of a negative number. Hence the imaginary component in your result.
Perhaps you require
sqrt( (1-(e^2)) / (e^3) )
which is guaranteed to yield a real number result since
1 - e^2 > 0
for your specified e
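The precedence issue is easy to demonstrate in Python, where `**` and `/` also bind tighter than `-`; `cmath.sqrt` makes the complex result visible:

```python
import cmath

e = 0.99
inner_wrong = 1 - e**2 / e**3    # parsed as 1 - (e**2 / e**3) = 1 - 1/e, negative
inner_fixed = (1 - e**2) / e**3  # positive for 0 < e < 1

print(inner_wrong)               # about -0.0101
print(cmath.sqrt(inner_wrong))   # purely imaginary result
print(cmath.sqrt(inner_fixed))   # real result (zero imaginary part)
```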

Can this Matlab function be vectorized (or made to run faster by another method)?

The main problem is finding values at a fixed offset from the current value.
My current method is very slow when the Value vector is large (typically 100000's of elements).
function [ AverageValue ] = CalculateAverageValueOverAngle( Value, Angle )
% function [ AverageValue ] = CalculateAverageValueOverAngle( Value, Angle )
% Calculate average value from instantaneous value and angle
% Average value is calculated over +- 90 degrees from current angle
    AverageValue = zeros( size( Value ) );
    UnwrappedRadians = unwrap( Angle ./ 180 * pi );
    for i = 1:length(UnwrappedRadians)
        mid = UnwrappedRadians(i);
        start = find( UnwrappedRadians(1:i) < (mid - pi/2), 1, 'Last');
        finish = find( UnwrappedRadians(i:end) > (mid + pi/2), 1, 'First');
        if isempty(start) || isempty(finish)
            AverageValue(i) = Value(i);
        else
            AverageValue(i) = mean(Value(start:finish+i-1)); % nanmean
        end
    end
end
A minor refactoring will skip the second find whenever the first one comes up empty, and initializing AverageValue with Value removes the need for the else branch.
UnwrappedRadians = unwrap( Angle ./ 180 * pi );
AverageValue = Value;
for i = 1:length(UnwrappedRadians)
    mid = UnwrappedRadians(i);
    start = find( UnwrappedRadians(1:i) < (mid - pi/2), 1, 'Last');
    if ~isempty(start)
        finish = find( UnwrappedRadians(i:end) > (mid + pi/2), 1, 'First');
        if ~isempty(finish)
            AverageValue(i) = mean(Value(start:finish+i-1)); % nanmean
        end
    end
end
If you find that the calculation of finish is empty more often than the calculation of start, you can switch their order so that the finish check is done first.
It's not clear whether UnwrappedRadians will always be sorted. If so, you can reuse the results from earlier finds to reduce the size of the subvector you search over. For instance, if you look for the last element smaller than 1 between index 11 and the end of the vector and it is element 23, then when you search for 1.1 you can reduce the search to between element 24 and the end of the vector. I've found that this technique can yield really big increases in speed.
Vectorization is difficult in cases like this, because you are using the indexing variable (i) as an index again (in the find statements). It is possible to rig something up using arrayfun, but it is highly unlikely to be faster (I would guess slower, actually) and will definitely be less readable than what you have.
MatlabSorter has provided a refactoring which makes sense to me, so if your code really does what you want, his refactoring is the way forward :-).
Note that in my tests with numel(Angle)=50000, his refactoring did not save much (probably because my sample data assumed that the find()s will almost never fail except at the very beginning and at the end of your data trace).
However, while looking at your code, I was wondering: are you sure you want to average all values from the first time the angle enters the mid-pi/2...mid+pi/2 range until the last time it leaves that range? If your unwrapped Angles are non-monotonic (for example if backwards movements are allowed, if the sampling rate is too low to avoid aliasing, or simply due to measurement noise in the angle), then you will also be averaging over some values outside (and possibly well outside) the 180° range.
Note that in any case the first measurement (Value(start)) you average over is always more than pi/2 before your "mid" angle (you start with the last angle before the interval), while your last measurement (Value(finish+i-1)) is always more than pi/2 behind mid. So your effective range you average over is always larger than pi, even if data values at exactly mid-pi/2 and mid+pi/2 are available... is that really intended?
So in case you are really interested in averaging only Values where Angle is less than pi/2 from mid, here is my code suggestion, which sadly has only marginally quicker runtime than what you currently use. Note that this is NOT a refactoring, because it acts differently from your code in the two ways described above.
UnwrappedRadians = unwrap( Angle ./ 180 * pi );
AverageValue = Value;
avgstart = find( UnwrappedRadians > (UnwrappedRadians(1) + pi/2), 1, 'First');
avgend = find( UnwrappedRadians < (UnwrappedRadians(end) - pi/2), 1, 'Last');
for i = avgstart:avgend
    AverageValue(i) = mean(Value(abs(UnwrappedRadians - UnwrappedRadians(i)) <= pi/2)); % nanmean
end