(0.3)^3 == (0.3)*(0.3)*(0.3) returns false in matlab?

I am trying to understand roundoff error for basic arithmetic operations in MATLAB and I came across the following curious example.
(0.3)^3 == (0.3)*(0.3)*(0.3)
ans = 0
I'd like to know exactly how the left-hand side is computed. MATLAB documentation suggests that for integer powers an 'exponentiation by squaring' algorithm is used.
"Matrix power. X^p is X to the power p, if p is a scalar. If p is an integer, the power is computed by repeated squaring."
So I assumed (0.3)^3 and (0.3)*(0.3)^2 would return the same value. But this is not the case. How do I explain the difference in roundoff error?

I don't know anything about MATLAB, but I tried it in Ruby:
irb> 0.3 ** 3
=> 0.026999999999999996
irb> 0.3 * 0.3 * 0.3
=> 0.027
According to the Ruby source code, the exponentiation operator casts the right-hand operand to a float if the left-hand operand is a float, and then calls the standard C function pow(). The float variant of the pow() function must implement a more complex algorithm for handling non-integer exponents, which would use operations that result in roundoff error. Maybe MATLAB works similarly.

Interestingly, scalar ^ seems to be implemented using pow while matrix ^ is implemented using square-and-multiply. To wit:
octave:13> format hex
octave:14> 0.3^3
ans = 3f9ba5e353f7ced8
octave:15> 0.3*0.3*0.3
ans = 3f9ba5e353f7ced9
octave:20> [0.3 0;0 0.3]^3
ans =
3f9ba5e353f7ced9 0000000000000000
0000000000000000 3f9ba5e353f7ced9
octave:21> [0.3 0;0 0.3] * [0.3 0;0 0.3] * [0.3 0;0 0.3]
ans =
3f9ba5e353f7ced9 0000000000000000
0000000000000000 3f9ba5e353f7ced9
This is confirmed by running octave under gdb and setting a breakpoint in pow.
The same is likely true in matlab, but I can't really verify.

Thanks to @Dougal I found this:
#include <stdio.h>
int main() {
    double x = 0.3;
    printf("%.40f\n", (x*x*x));
    /* same starting value, but the products are kept in long double */
    long double y = 0.3;
    printf("%.40f\n", (double)(y*y*y));
    return 0;
}
which gives:
This case is odd because the computation carried out with more digits gives the worse result. The reason is that the initial number 0.3 is itself stored only approximately, so we start with a relatively "large" error. In this particular case, the lower-precision computation introduces another "large" rounding error with the opposite sign, which compensates for the initial one. The higher-precision computation introduces only a much smaller second error, so the initial error remains.
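The compensation described above can be checked with exact rational arithmetic. This is a minimal sketch in Python (not MATLAB or C), assuming IEEE 754 doubles. The variable names are mine, and `nearest` stands in for "the exact cube of the stored 0.3, rounded once", which is what a well-behaved pow() would be expected to return.

```python
from fractions import Fraction

x = 0.3
exact_cube = Fraction(x) ** 3      # exact cube of the double nearest to 0.3
product = x * x * x                # rounds after each multiplication
nearest = float(exact_cube)        # exact cube, rounded only once to a double

# Error of each result relative to the exact cube of the stored value
err_product = abs(Fraction(product) - exact_cube)
err_nearest = abs(Fraction(nearest) - exact_cube)

# Error of each result relative to the decimal number 0.027
target = Fraction(27, 1000)
off_product = abs(Fraction(product) - target)
off_nearest = abs(Fraction(nearest) - target)

print(product, nearest)
print("product is further from the exact cube:", err_product > err_nearest)
print("product is closer to 0.027:", off_product < off_nearest)
```

So x*x*x is the less accurate cube of the value actually stored, yet it lands closer to the decimal 0.027 because its rounding errors happen to cancel the initial representation error.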

Here's a little test program that follows what the system pow() from Source/Intel/xmm_power.c, in Apple's Libm-2026, does in this case:
#include <stdio.h>
int main() {
    // basically lines 1130-1157 of xmm_power.c, modified a bit to remove
    // irrelevant things
    double x = .3;
    int i = 3;
    //calculate ix = f**i
    long double ix = 1.0, lx = (long double) x;
    //calculate x**i by doing lots of multiplication
    int mask = 1;
    //for each of the bits set in i, multiply ix by x**(2**bit_position)
    while(i != 0)
    {
        if( i & mask )
        {
            ix *= lx;
            i -= mask;
        }
        mask += mask;
        lx *= lx; // In double this might overflow spuriously, but not in long double
    }
    printf("%.40f\n", (double) ix);
    return 0;
}
This prints out 0.0269999999999999962252417162744677625597, which agrees with the results I get for .3 ^ 3 in Matlab and .3 ** 3 in Python (and we know the latter just calls this code). By contrast, .3 * .3 * .3 for me gets 0.0269999999999999996946886682280819513835, which is the same thing that you get if you just ask to print out 0.027 to that many decimal places and so is presumably the closest double.
So there's the algorithm. We could trace exactly what value is set at each step, but it's not too surprising that a different algorithm rounds to a very slightly smaller number.

Read Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (this is a reprint by Oracle). Do understand it. Floating point numbers are not the real numbers of calculus. Sorry, no TL;DR version available.


scipy integrate.quad return an incorrect value

I use scipy's integrate.quad to compute the CDF of a normal distribution:
import math
import numpy as np
from scipy import integrate
def nor(delta, mu, x):
    return 1 / (math.sqrt(2 * math.pi) * delta) * np.exp(-np.square(x - mu) / (2 * np.square(delta)))
delta = 0.1
mu = 0
t = np.arange(4.0, 10.0, 1)
nor_int = lambda t: integrate.quad(lambda x: nor(delta, mu, x), -np.inf, t)
nor_int_vec = np.vectorize(nor_int)
s = nor_int_vec(t)
for i in zip(s[0], s[1]):
    print(i)
It prints the following:
(1.0000000000000002, 1.2506543424265854e-08)
(1.9563704110140217e-11, 3.5403445591955275e-11)
(1.0000000000001916, 1.2616577562700088e-08)
(1.0842532749783998e-34, 1.9621183122960244e-34)
(4.234531567162006e-09, 7.753407284370446e-09)
(1.0000000000001334, 1.757986959115912e-10)
For some upper limits it returns a value close to zero, when it should return 1.
Can somebody tell me what is wrong?
Same reason as in why does quad return both zeros when integrating a simple Gaussian pdf at a very small variance? but seeing as I can't mark it as a duplicate, here goes:
You are integrating a function with tight localization (at scale delta) over a very large (in fact infinite) interval. The integration routine can simply miss the part of the interval where the function is substantially different from 0, judging it to be 0 instead. Some guidance is required. The parameter points can be used to this effect (see the linked question) but since quad over an infinite interval does not support it, the interval has to be manually split, like so:
for t in range(4, 10):
    int1 = integrate.quad(lambda x: nor(delta, mu, x), -np.inf, mu - 10*delta)[0]
    int2 = integrate.quad(lambda x: nor(delta, mu, x), mu - 10*delta, t)[0]
    print(int1 + int2)
This prints 1 or nearly 1 every time. I picked mu-10*delta as a point to split on, figuring most of the function lies to the right of it, no matter what mu and delta are.
Use np.sqrt etc.; there is usually no reason to put math functions in NumPy code. The NumPy versions are available and are vectorized.
Applying np.vectorize to quad does nothing besides making the code longer and slightly harder to read. Use a normal Python loop or list comprehension. See "NumPy vectorization with integration".
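As a cross-check on the quad results, the normal CDF also has a closed form in terms of the error function, so the expected values can be computed without any numerical integration. This is a minimal sketch using only the standard library. The function name `normal_cdf` is mine, and `delta` is treated as the standard deviation, as in the question's `nor`.

```python
import math

def normal_cdf(t, mu=0.0, delta=0.1):
    """CDF at t of a normal distribution with mean mu and standard deviation delta."""
    return 0.5 * (1.0 + math.erf((t - mu) / (delta * math.sqrt(2.0))))

# Same upper limits as in the question: 4, 5, ..., 9
values = [normal_cdf(t) for t in range(4, 10)]
print(values)
```

With delta = 0.1, every one of these limits is at least 40 standard deviations above the mean, so each value should come out as 1 to within double precision. Any quad result that differs noticeably has missed the peak.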

integers can only be combined with integers of same class or scalar doubles error

I do not know what this error means or how to fix it. I am trying to perform an image rotation in a separate coordinate space. When defining the reference space of the matrix to be at zero, I get the error that integers can only be combined with integers of the same class or scalar doubles. The line is
WZcentered = WZ - [x0;y0]*ones(1,Ncols);
WZ is listed in the workspace as a 400x299x3 uint8. It is an image. x0 and y0 are set to 0 when the function is called. How can I fix this issue, and what exactly is happening here?
Also, when I do the same thing yet make WZ to be equal to double(WZ) I get the error that 'matrix dimensions must agree.' I am not sure what the double function does however. Here is the whole code.
function [out_flag, WZout, x_final, y_final] = adopted_moveWZ(WZ, x0, y0);
%Initial Test of plot
if Nrows ~= 2
if Ncols ==2
WZ=transpose(WZ); %take transpose
[Nrows,Ncols]=size(WZ); %reset the number of rows and columns
fprintf('ERROR: Input file should have 2-vectors for the input points.\n');
title('These are the original points in the image');
%WZorig = WZ;
WZcentered = WZ - ([x0;y0] * ones(1,Ncols));
axis([-FigScale 2*FigScale -FigScale 2*FigScale])
disp('Hit any key to start the animation');
SceneCenter = zeros(Nrows,Ncols);
WZnew = WZcentered;
for ii=0:20
R = [cos(pi/ii) -sin(pi/ii) 0; sin(pi/ii) cos(pi/ii) 0; 0 0 1];
WZnew = R * WZnew;
%place WZnew at a different place in the scene
SceneCenter = (ii*[30;40])*ones(1,Ncols);
plot(SceneCenter(1,:) + WZnew(1,:), SceneCenter(2,:) + WZnew(2,:),'.')
axis([-FigScale 2*FigScale -FigScale 2*FigScale])
%Set final values for output at end of program
x_final = SceneCenter(1,1);
y_final = SceneCenter(2,1);
PPout = PPnew + SceneCenter;
This happens due to WZ and ([x0;y0] * ones(1,Ncols)) being of different data types. You might think MATLAB is loosely typed, and hence should do the right thing when you have a floating point type operated with an integer type, but this rule breaks every once in a while. A simpler example to demonstrate this is here:
X = uint8(magic(5))
Y = zeros(5)
X - Y
This breaks with the same error that you are reporting. One way to fix this is to force cast one of the operands to the other, typically up-casted to make sure the math works. When you do this, both the numbers you are working on are floating point (double precision), and so they are represented in the same byte formatting sequence in memory. This way, the '-' sign is valid, in the same way that you can say 3 apples + 4 apples = 7 apples, but 3 oranges (uint8) + 4 apples (double) = ?. The double(X) makes it clear that you really mean to use double precision arithmetic, and hence fixes the error. This is how it looks now:
double(X) - Y
After having identified this, the new error is 'matrix dimensions do not match'. This means exactly what it says. WZ is a 400x299x3 matrix, and the right hand side matrix is 2xnCols. Now can you subtract a 2D matrix from a 3D matrix of different sizes meaningfully?
Depending on what your code is really intending to do, you can pad the RHS matrix, or find out other ways to make the sizes equal.
All of this is why MATLAB includes routines to do image rotation, namely http://www.mathworks.com/help/images/ref/imrotate.html . This is part of the Image Processing Toolbox, though.

Inf*0 in Matlab

I have the following line in my code:
1 - sqrt(pi/2)*sig*sqrt(Eb)*theta_l*exp(theta_l^2*sig^2*Eb/2).*(1 + erf(-theta_l*sig*sqrt(Eb)/sqrt(2)));
When I evaluate this expression for the following parameters:
Eb = 6324.6;
sig = 1/sqrt(2);
theta = 0.7;
I get NaN. I know that this comes from the product of infinity and 0.
However when I tested the same line in Mathematica, the result was a finite value. How can I solve this issue? Thanks.
The problematic part of your function is the exp(theta_l^2*sig^2*Eb/2) factor. With your parameters the exponent is about 775, while exp overflows in double precision for arguments above roughly 709.8, so you get Inf. The erf factor evaluates to exactly 0, and Inf*0 gives NaN. (The numerical precision in Mathematica is evidently higher, or dynamic, probably at the cost of performance.)
However, you can just change the input units to your function to stop this happening. For example, if we define your function as an anonymous function ...
funky = @(Eb, sig, theta_l) ...
1 - sqrt(pi/2)*sig*sqrt(Eb)*theta_l*exp(theta_l^2*sig^2*Eb/2) .* ...
(1 + erf(-theta_l*sig*sqrt(Eb)/sqrt(2)));
funky(6324.6 / 1000, (1/sqrt(2))/1000, 0.7 / 1000) == ...
funky(6324.6 / 1e6, (1/sqrt(2))/1e6, 0.7 / 1e6) == ...
funky(6324.6 / 1e10, (1/sqrt(2))/1e10, 0.7 / 1e10) % etc
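Another way to avoid the Inf*0 product is to rewrite the expression so that the huge and the tiny factors are never formed separately. With z = theta_l*sig*sqrt(Eb)/sqrt(2), we have 1 + erf(-z) = erfc(z), and the expression becomes 1 - sqrt(pi)*z*exp(z^2)*erfc(z). The combination exp(z^2)*erfc(z) is the scaled complementary error function, which stays finite for large z. Below is a rough Python sketch, not MATLAB. The helper names `erfcx_cf` and `stable_expr` are hypothetical, and evaluating the scaled function with a continued fraction is my own choice and assumes z > 0. MATLAB has a built-in erfcx that could play the same role.

```python
import math

def erfcx_cf(z, terms=100):
    """Approximate exp(z*z)*erfc(z) for z > 0 via the continued fraction
    sqrt(pi)*exp(z^2)*erfc(z) = 1/(z + (1/2)/(z + (2/2)/(z + (3/2)/(z + ...))))."""
    t = z
    for k in range(terms, 0, -1):   # evaluate the fraction from the bottom up
        t = z + (k / 2.0) / t
    return 1.0 / (math.sqrt(math.pi) * t)

def stable_expr(Eb, sig, theta_l):
    """The expression from the question, rewritten so that exp() never overflows."""
    z = theta_l * sig * math.sqrt(Eb) / math.sqrt(2.0)
    return 1.0 - math.sqrt(math.pi) * z * erfcx_cf(z)

print(stable_expr(6324.6, 1 / math.sqrt(2), 0.7))
```

For the parameters in the question this should give a small positive number rather than NaN, and no rescaling of the inputs is needed.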

matlab evaluates 0.9<0.9 to `True`

Below is my matlab code snippet:
clear all;
There is no doubt that the result for x0 is 1.4. What confuses me is that when I replace x2=1.4 with x2=0.8, 0.9, 1.0, 1.1, or 1.2 (any one of them), the result becomes incorrect.
For example, x2=0.9 makes the code produce x0=1.0 instead of x0=0.9. I found that once x0 has increased to 0.9, x0<x2 (0.9<0.9) yields 1 (true), which is definitely wrong.
What's going on here?
I would like to extend Dan's answer a little. Run this code:
clear all;
disp(x2-x0); % <-- I added this line
Output (ideone):
x0 = 1.00000
The 4th line of the output shows the rounding error: after adding 0.1 to 0.5 four times, the result will be less than 0.9: the difference is 1.1102e-16. That's why x0 will be 1.0 in the end.
Basically you should never compare floating point numbers directly. See this link provided by Sam Roberts to better understand this.
In binary, 0.1 is a recurring fraction, which your computer eventually has to truncate. So you aren't really adding 0.1, and that rounding error compounds each time you add 0.1, eventually causing your comparison to miss. You are not actually evaluating 0.9<0.9, but rather comparing an accumulated sum of truncated binary values of 0.1 with the truncated binary version of 0.9. The numbers won't be the same.
If you truly need to compare floating point values, I would suggest doing it a little differently. You can do something like what is written below, where threshold is some acceptable margin: if the numbers differ by more than this amount they are considered different, and if the difference is less than this amount they are considered the same.
function result = floatCompare(a,b,operation,threshold)
    % default to an eps tolerance and an equality test
    if nargin < 4
        threshold = eps;
    end
    if nargin < 3
        operation = 'eq';
    end
    switch operation
        case 'eq'
            result = abs(a-b) < threshold;
        case 'lt'
            disp('less than')
            result = b-a > threshold;
        case 'gt'
            disp('greater than')
            result = a-b > threshold;
    end
end
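The same effect can be reproduced outside MATLAB. Here is a small Python sketch, assuming (as the explanation above describes) that the loop starts at 0.5 and adds 0.1 four times. It uses the standard library's math.isclose as the tolerance-based comparison. The tolerance value is an arbitrary choice for illustration.

```python
import math

x0 = 0.5
for _ in range(4):
    x0 += 0.1          # each addition rounds to the nearest double

x2 = 0.9

print(x0 < x2)         # the "0.9 < 0.9" comparison from the question
print(x2 - x0)         # size of the accumulated rounding error
print(math.isclose(x0, x2, rel_tol=1e-9))   # tolerance-based equality
```

A strict `<` sees the two values as different, while the tolerance-based test treats them as equal, which is usually what a loop condition like this is meant to express.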

Fast way to compute (1:N)'*(1:N)

I am looking for a fast way to compute
(1:N)'*(1:N)
for reasonably large N. I feel like the symmetry of the problem makes actually doing all the multiplications and additions wasteful.
The question of why you want to do this really matters.
In the theoretical sense, the triangular approach suggested in the other answers will save you operations. @jgmao's answer is especially interesting in reducing multiplies.
In the practical sense, number of CPU operations is no longer the metric to minimize when writing fast code. Memory bandwidth dominates when you have so few CPU operations, so tuned cache-aware access patterns are how to make this go fast. Matrix multiplication code is implemented extremely efficiently, since it's such a common operation, and every implementation of the BLAS numeric library worth its salt will use optimized access patterns, and SIMD computation as well.
Even if you wrote straight C and reduced your op count to the theoretic minimum, you'd probably still not beat the full matrix multiply. What this boils down to is to find the numeric primitive which most closely matches your operation.
All that said, there's a BLAS operation which gets a little closer than DGEMM (matrix multiply). It's called DSYRK, the rank-k update, and it can be used for exactly A'*A. The MEX function I wrote for this a long time ago is here. I haven't messed with it in a long time, but it did work when I first wrote it, and did in fact run faster than a straight A'*A.
/* xtrx.c: calculates x'*x taking advantage of the symmetry.
Peter Boettcher <email removed>
Last modified: <Thu Jan 23 13:53:02 2003> */
#include "mex.h"
const double one = 1;
const double zero = 0;
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    double *x, *z;
    int i, j, mrows, ncols;
    if(nrhs!=1) mexErrMsgTxt("One input required.");
    x = mxGetPr(prhs[0]);
    mrows = mxGetM(prhs[0]);
    ncols = mxGetN(prhs[0]);
    plhs[0] = mxCreateDoubleMatrix(ncols,ncols, mxREAL);
    z = mxGetPr(plhs[0]);
    /* Call the FORTRAN BLAS routine for rank k update */
    dsyrk_("U", "T", &ncols, &mrows, &one, x, &mrows, &zero, z, &ncols);
    /* Result is in the upper triangle. Copy it down the lower part */
    for(i=0; i<ncols; i++)
        for(j=i+1; j<ncols; j++)
            z[i*ncols + j] = z[j*ncols + i];
}
MATLAB's matrix multiplication is generally pretty fast, but here are a couple of ways to get just the upper triangular matrix. They are slower than naïvely computing the v'*v (or using a MEX wrapper that calls the more appropriate symmetric rank k update function in BLAS, not surprisingly!). Anyway, here are a few MATLAB-only solutions:
The first uses linear indexing:
% test vector
N = 1e3;
v = 1:N;
% compute upper triangle of product
[ii, jj] = find(triu(ones(N)));
upperMask = false(N,N);
upperMask(ii + N*(jj-1)) = true;
Mu = zeros(N);
Mu(upperMask) = v(ii).*v(jj); % other lines always the same computation
% validate
M = v'*v;
This next way won't be faster than the naive approach either, but here's another solution to compute the lower triangle with bsxfun:
Ml = bsxfun(@(x,y) [zeros(y-1,1); x(y:end)*y],v',v);
For the upper triangle:
Mu = bsxfun(@(x,y) [x(1:y)*y; zeros(numel(x)-y,1)],v',v);
Another solution for the whole matrix using cumsum for this special case (where v=1:N). This one is actually close in speed.
M = cumsum(repmat(v,[N 1]));
Maybe these can be a starting point for something better.
This is 3 times faster than (1:N).'*(1:N) provided an int32 result is acceptable (it's even faster if the numbers are small enough to use int16 instead of int32):
N = 1000;
aux = int32(1:N);
result = bsxfun(@times,aux.',aux);
>> N = 1000; aux = int32(1:N); tic, for count = 1:1e2, bsxfun(@times,aux.',aux); end, toc
Elapsed time is 0.734992 seconds.
>> N = 1000; aux = 1:N; tic, for count = 1:1e2, aux.'*aux; end, toc
Elapsed time is 2.281784 seconds.
Note that aux.'*aux cannot be used for aux = int32(1:N).
As pointed out by @DanielE.Shub, if the result is needed as a double matrix, a final cast has to be done, and in that case the gain is very small:
>> N = 1000; aux = int32(1:N); tic, for count = 1:1e2, double(bsxfun(@times,aux.',aux)); end, toc
Elapsed time is 2.173059 seconds.
Because of the special ordered structure of the input, consider the case N=4:
(1:4)'*(1:4) = [1 2 3 4
2 4 6 8
3 6 9 12
4 8 12 16]
You will find that the first row is just (1:N), and that from the second row on (j=2), each row is the previous row (j-1) plus (1:N).
So: 1. You do not need to do many multiplications. Instead, you can generate the matrix with N*N additions.
2. Since the output is symmetric, only half of the output matrix needs to be computed. So the total computation is (N-1)+(N-2)+...+1 = N(N-1)/2, roughly N^2/2, additions.
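The two observations above can be sketched as follows. This is plain Python with nested lists, purely to illustrate the operation count. The function name `outer_by_addition` is hypothetical, and nothing here suggests that it would beat an optimized matrix multiply in practice.

```python
def outer_by_addition(n):
    """Build the n-by-n table with entries (i+1)*(j+1) using additions only,
    computing the upper triangle and mirroring it into the lower one."""
    first_row = list(range(1, n + 1))
    m = [[0] * n for _ in range(n)]
    m[0] = first_row[:]                  # row 1 is just 1..n
    additions = 0
    for i in range(1, n):
        for j in range(i, n):            # upper triangle, diagonal included
            m[i][j] = m[i - 1][j] + first_row[j]
            additions += 1
        for j in range(i):               # lower triangle, copied by symmetry
            m[i][j] = m[j][i]
    return m, additions

table, count = outer_by_addition(4)
for row in table:
    print(row)
print(count)
```

For N = 4 this reproduces the table shown above with (N-1)+(N-2)+...+1 = 6 additions.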