How to use dsyev routine to calculate eigenvalues? - lapack

I am trying to write code to calculate eigenvectors and eigenvalues of a symmetric matrix. I understand how to calculate the eigenvalues with pen & paper, but I am slightly confused by the API. I am a beginner, so I may be misinterpreting the API parameters.
#include <stdio.h>
#include <stdlib.h>
#include <lapacke.h>

int main() {
    char jobz = 'V', uplo = 'U';
    int lda = 3, n = 3, info = 8, lwork = 9;
    // lapack_int lda=3,n=3,info=8;
    int i;
    double w[3], work[3];
    double a[9] = {
        3, 2, 4,
        2, 0, 2,
        4, 2, 3
    };
    info = LAPACKE_dsyev(LAPACK_ROW_MAJOR, jobz, uplo, n, a, lda, w);
    //dsyev_( &jobz, &uplo, &n, a, &lda, w, work, &lwork, &info );
    if (info > 0) {
        printf("The algorithm failed to compute eigenvalues.\n");
        exit(1);
    }
    for (i = 0; i < 3; i++) {
        printf("%f\n", w[i]);
    }
    for (i = 0; i < 9; i++) {
        printf("%f\n", a[i]);
    }
    exit(0);
}
output:
-1.000000
-1.000000
8.000000
0.617945
1.999713
-0.016938
0.010468
0.033876
0.999857
1.381966
0.618034
0.000000
whereas I expected the eigenvectors for k=-1: [1,-2,0], [4,2,-5] and for k=8: [2,1,2] to show up somewhere in the output!
Am I using the API incorrectly, or am I reading the output incorrectly?
Also, please suggest how to do the same task with the Fortran API,
as with Fortran I am unable to get proper eigenvalues.
The eigenvalues I get with Fortran are:
-0.134742
0.050742
0.523036
eigenvectors:
0.617945
1.999713
-0.016938
0.010468
0.033876
0.999857
1.381966
0.618034
0.000000

As suggested by @francis in the comments, the program works if you modify work[3] to work[9]. The obtained result is
Eigenvalues: w[0],w[1],w[2] => -1.000000 -1.000000 8.000000
1st eigenvector: a[0], a[1], a[2] => -0.494101 -0.472019 0.730111
2nd eigenvector: a[3], a[4], a[5] => -0.558050 0.816142 0.149979
3rd eigenvector: a[6], a[7], a[8] => 0.666667 0.333333 0.666667
For comparison, let's diagonalize the same matrix with different programs. For example, Python/Numpy gives the result
>>> import numpy as np
>>> a = np.array([[3,2,4], [2,0,2], [4,2,3]], dtype=float )
>>> np.linalg.eig( a )
(array([-1., 8., -1.]),
array([[-0.74535599, 0.66666667, -0.09414024],
[ 0.2981424 , 0.33333333, -0.84960833],
[ 0.59628479, 0.66666667, 0.5189444 ]]))
while Julia gives
julia> a = Float64[ 3 2 4 ; 2 0 2 ; 4 2 3 ]
3x3 Array{Float64,2}:
3.0 2.0 4.0
2.0 0.0 2.0
4.0 2.0 3.0
julia> eig( a )
([-0.9999999999999996,-0.9999999999999947,8.0],
3x3 Array{Float64,2}:
0.447214 -0.596285 -0.666667
-0.894427 -0.298142 -0.333333
0.0 0.745356 -0.666667)
In both cases, the first three numbers are the eigenvalues {-1,-1,8} and the following matrix contains the corresponding eigenvectors (in its columns). You will see that all the programs give different results for the eigenvectors. Because there are two degenerate eigenvalues (-1), any linear combination of the corresponding eigenvectors is also an eigenvector with the same eigenvalue, so the result is not unique. We can confirm that the degenerate eigenvectors obtained from C are related to those of Julia by a 2x2 orthogonal transformation (or a "rotation" by 101.6 degrees).
Interestingly, your expected eigenvectors [-1,2,0] and [4,2,-5] correspond exactly to the eigenvectors obtained from Julia (after normalization), but this is probably accidental and one cannot expect such agreement of degenerate eigenvectors.
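Just to double-check the LAPACKE result, here is a quick NumPy sketch that verifies that the three rows printed by the C program really are eigenvectors of the original matrix (the vectors are copied from the output above):

import numpy as np

a = np.array([[3, 2, 4], [2, 0, 2], [4, 2, 3]], dtype=float)

# rows of the overwritten array `a` as printed by the C program
v1 = np.array([-0.494101, -0.472019, 0.730111])   # eigenvalue -1
v2 = np.array([-0.558050,  0.816142, 0.149979])   # eigenvalue -1
v3 = np.array([ 0.666667,  0.333333, 0.666667])   # eigenvalue  8

for lam, v in [(-1, v1), (-1, v2), (8, v3)]:
    print(np.allclose(a @ v, lam * v, atol=1e-4))  # prints True three times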

Related

Scipy.curve_fit() vs. Matlab fit() weighted nonlinear least squares

I have a Matlab reference routine that I am trying to convert to numpy/scipy. I have encountered a curve-fitting problem that I cannot solve in Python. So here is a simple example which demonstrates the problem. The data is completely synthetic and not part of the problem.
Let's say I'm trying to fit a straight-line model of noisy data -
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0.1075, 1.3668, 1.5482, 3.1724, 4.0638, 4.7385, 5.9133, 7.0685, 8.7157, 9.5539]
For the unweighted solution in Matlab, I would code
g = @(m, b, x)(m*x + b)
f = fittype(g)
bestfit = fit(x, y, g)
which produces a solution of bestfit.m = 1.048, bestfit.b = -0.09219
Running this data through scipy.optimize.curve_fit() produces identical results.
If instead the fit uses decaying weights to reduce the impact of later data points,
dw = [0.7290, 0.5120, 0.3430, 0.2160, 0.1250, 0.0640, 0.0270, 0.0080, 0.0010, 0]
weightedfit = fit(x, y, g, 'Weights', dw)
This produces a slope of 0.944 and an offset of 0.1484.
I have not figured out how to conjure this result from scipy.optimize.curve_fit using the sigma parameter. If I pass the weights as provided to Matlab, the '0' causes a divide by zero exception. Clearly Matlab and scipy are thinking very differently about the meaning of the weights in the underlying optimization routine. Is there a simple way of converting between the two that allows me to provide a weighting function which produces identical results?
Ok, so after further investigation I can offer the answer, at least for this simple example.
import numpy as np
import scipy as sp
import scipy.optimize

def modelFun(x, m, b):
    return m * x + b

def testFit():
    # reciprocal weights placed on the diagonal of a 2-d (covariance-style) sigma
    w = np.diag([1.0, 1/0.7290, 1/0.5120, 1/0.3430, 1/0.2160, 1/0.1250,
                 1/0.0640, 1/0.0270, 1/0.0080, 1/0.0010])
    x = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638,
                  4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
    popt, pcov = sp.optimize.curve_fit(modelFun, x, y, sigma=w)
    print(popt)   # fitted [m, b]
    print(pcov)   # covariance of the fit

testFit()
Which produces the desired result.
In order to force sp.optimize.curve_fit to minimize the same chisq metric as Matlab using the curve fitting toolbox, you must do two things:
Use the reciprocal of the weight factors
Create a diagonal matrix from the new weight factors. According to the scipy reference:
sigma : None or M-length sequence or MxM array, optional
Determines the uncertainty in ydata. If we define residuals as
r = ydata - f(xdata, *popt), then the interpretation of sigma depends on
its number of dimensions:
A 1-d sigma should contain values of standard deviations of errors in ydata.
In this case, the optimized function is chisq = sum((r / sigma) ** 2).
A 2-d sigma should contain the covariance matrix of errors in ydata.
In this case, the optimized function is chisq = r.T @ inv(sigma) @ r.
New in version 0.19.
None (default) is equivalent of 1-d sigma filled with ones.
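An equivalent route that avoids building the full matrix (a sketch, not part of the original answer; it assumes that a Matlab weight of exactly zero is the same as dropping that point): since Matlab's 'Weights' minimize sum(w_i * r_i^2) and a 1-d sigma minimizes sum((r_i / sigma_i)^2), passing sigma_i = 1/sqrt(w_i) should reproduce the weighted fit.

import numpy as np
from scipy.optimize import curve_fit

def model(x, m, b):
    return m * x + b

x = np.arange(10, dtype=float)
y = np.array([0.1075, 1.3668, 1.5482, 3.1724, 4.0638,
              4.7385, 5.9133, 7.0685, 8.7157, 9.5539])
dw = np.array([0.7290, 0.5120, 0.3430, 0.2160, 0.1250,
               0.0640, 0.0270, 0.0080, 0.0010, 0.0])

keep = dw > 0                    # a zero weight removes that point from the fit
sigma = 1.0 / np.sqrt(dw[keep])  # Matlab weight w_i  <->  sigma_i = 1 / sqrt(w_i)
popt, pcov = curve_fit(model, x[keep], y[keep], sigma=sigma)
print(popt)                      # should land near the Matlab result m = 0.944, b = 0.1484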

Julia vs. MATLAB - Distance Matrix - Run Time Test

I started learning Julia not long ago, and I decided to do a simple comparison between Julia and Matlab using a simple piece of code for computing Euclidean distance matrices from a set of high-dimensional points.
The task is simple and can be divided into two cases:
Case 1: Given two datasets in the form of n x d matrices, say X1 and X2, compute the pairwise Euclidean distance between each point in X1 and all the points in X2. If X1 is of size n1 x d, and X2 is of size n2 x d, then the resulting Euclidean distance matrix D will be of size n1 x n2. In the general setting, matrix D is not symmetric, and its diagonal elements are not equal to zero.
Case 2: Given one dataset in the form of an n x d matrix X, compute the pairwise Euclidean distance between all the n points in X. The resulting Euclidean distance matrix D will be of size n x n, symmetric, with zero elements on the main diagonal.
My implementations of these functions in Matlab and in Julia are given below. Note that neither implementation relies on loops of any sort, only on simple linear-algebra operations. Also, note that the implementations in the two languages are very similar.
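For reference, the underlying identity is ||x - y||^2 = ||x||^2 + ||y||^2 - 2<x, y>, evaluated for all pairs with a single matrix product. A minimal NumPy sketch (not part of the benchmark, just to spell out what both implementations below do):

import numpy as np

def distmat_euc(X1, X2):
    # squared norms as a column (n1 x 1) and a row (1 x n2), broadcast to n1 x n2
    sq1 = np.sum(X1**2, axis=1)[:, None]
    sq2 = np.sum(X2**2, axis=1)[None, :]
    d2 = sq1 + sq2 - 2.0 * (X1 @ X2.T)   # squared distances via one matrix product
    return np.sqrt(np.maximum(d2, 0.0))  # clamp tiny negatives caused by roundoff

The Matlab and Julia versions below do the same thing, only with explicit ones(...) outer products instead of broadcasting.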
My expectation before running any tests was that the Julia code would be much faster than the Matlab code, by a significant margin. To my surprise, this was not the case!
The parameters for my experiments are given below with the code. My machine is a MacBook Pro (15-inch, Mid 2015) with a 2.8 GHz Intel Core i7 (quad core) and 16 GB of 1600 MHz DDR3.
Matlab version: R2018a
Julia version: 0.6.3
BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
LAPACK: libopenblas64_
LIBM: libopenlibm
LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
The results are given in Table (1) below.
Table 1: Average time in seconds (with standard deviation) over 30 trials for computing Euclidean distance matrices between two different datasets (Col. 1),
and between all pairwise points in one dataset (Col. 2).
            Two Datasets         One Dataset
Matlab      2.68 (0.12) sec.     1.88 (0.04) sec.
Julia V1    5.38 (0.17) sec.     4.74 (0.05) sec.
Julia V2    5.2  (0.1)  sec.     --
I was not expecting this significant difference between the two languages. I expected Julia to be faster than Matlab, or at least as fast as Matlab. It was really a surprise to see that Matlab is almost 2.5 times faster than Julia in this particular task. I didn't want to draw any early conclusions based on these results for a few reasons.
First, while I think that my Matlab implementation is as good as it can be, I'm wondering whether my Julia implementation is the best one for this task. I'm still learning Julia and I hope there is a more efficient Julia code that can yield faster computation time for this task. In particular, where is the main bottleneck for Julia in this task? Or, why does Matlab have an edge in this case?
Second, my current Julia package is based on the generic and standard BLAS and LAPACK packages for MacOS. I'm wondering whether JuliaPro with BLAS and LAPACK based on Intel MKL will be faster than the current version I'm using. This is why I opted to get some feedback from more knowledgeable people on StackOverflow.
The third reason is that I'm wondering whether the compile time for Julia was
included in the timings shown in Table 1 (2nd and 3rd rows), and whether there is a better way to assess the execution time for a function.
I will appreciate any feedback on my previous three questions.
Thank you!
Hint: This question has been identified as a possible duplicate of another question on StackOverflow. However, this is not entirely accurate. This question has three aspects, as reflected by the answers below. First, yes, one part of the question is related to the comparison of OpenBLAS vs. MKL. Second, it turns out that the implementation itself can be improved, as shown by one of the answers. And last, benchmarking the Julia code itself can be improved by using BenchmarkTools.jl.
MATLAB
num_trials = 30;
dim = 1000;
n1 = 10000;
n2 = 10000;
T = zeros(num_trials,1);
XX1 = randn(n1,dim);
XX2 = rand(n2,dim);
%%% DIFFERENT MATRICES
DD2ds = zeros(n1,n2);
for (i = 1:num_trials)
tic;
DD2ds = distmat_euc2ds(XX1,XX2);
T(i) = toc;
end
mt = mean(T);
st = std(T);
fprintf(1,'\nDifferent Matrices:: dim: %d, n1 x n2: %d x %d -> Avg. Time %f (+- %f) \n',dim,n1,n2,mt,st);
%%% SAME Matrix
T = zeros(num_trials,1);
DD1ds = zeros(n1,n1);
for (i = 1:num_trials)
tic;
DD1ds = distmat_euc1ds(XX1);
T(i) = toc;
end
mt = mean(T);
st = std(T);
fprintf(1,'\nSame Matrix:: dim: %d, n1 x n1 : %d x %d -> Avg. Time %f (+- %f) \n\n',dim,n1,n1,mt,st);
distmat_euc2ds.m
function [DD] = distmat_euc2ds (XX1,XX2)
n1 = size(XX1,1);
n2 = size(XX2,1);
DD = sqrt(ones(n1,1)*sum(XX2.^2.0,2)' + (ones(n2,1)*sum(XX1.^2.0,2)')' - 2.*XX1*XX2');
end
distmat_euc1ds.m
function [DD] = distmat_euc1ds (XX)
n1 = size(XX,1);
GG = XX*XX';
DD = sqrt(ones(n1,1)*diag(GG)' + diag(GG)*ones(1,n1) - 2.*GG);
end
JULIA
include("distmat_euc.jl")
num_trials = 30;
dim = 1000;
n1 = 10000;
n2 = 10000;
T = zeros(num_trials);
XX1 = randn(n1,dim)
XX2 = rand(n2,dim)
DD = zeros(n1,n2)
# Euclidean Distance Matrix: Two Different Matrices V1
# ====================================================
for i = 1:num_trials
tic()
DD = distmat_eucv1(XX1,XX2)
T[i] = toq();
end
mt = mean(T)
st = std(T)
println("Different Matrices V1:: dim:$dim, n1 x n2: $n1 x $n2 -> Avg. Time $mt (+- $st)")
# Euclidean Distance Matrix: Two Different Matrices V2
# ====================================================
for i = 1:num_trials
tic()
DD = distmat_eucv2(XX1,XX2)
T[i] = toq();
end
mt = mean(T)
st = std(T)
println("Different Matrices V2:: dim:$dim, n1 x n2: $n1 x $n2 -> Avg. Time $mt (+- $st)")
# Euclidean Distance Matrix: Same Matrix V1
# =========================================
for i = 1:num_trials
tic()
DD = distmat_eucv1(XX1)
T[i] = toq();
end
mt = mean(T)
st = std(T)
println("Same Matrix V1:: dim:$dim, n1 x n2: $n1 x $n2 -> Avg. Time $mt (+- $st)")
distmat_euc.jl
function distmat_eucv1(XX1::Array{Float64,2}, XX2::Array{Float64,2})
    (num1, dim1) = size(XX1)
    (num2, dim2) = size(XX2)
    if (dim1 != dim2)
        error("Matrices' 2nd dimensions must agree!")
    end
    DD = sqrt.((ones(num1)*sum(XX2.^2.0,2)') +
               (ones(num2)*sum(XX1.^2.0,2)')' - 2.0.*XX1*XX2');
end

function distmat_eucv2(XX1::Array{Float64,2}, XX2::Array{Float64,2})
    (num1, dim1) = size(XX1)
    (num2, dim2) = size(XX2)
    if (dim1 != dim2)
        error("Matrices' 2nd dimensions must agree!")
    end
    DD = (ones(num1)*sum(Base.FastMath.pow_fast.(XX2,2.0),2)') +
         (ones(num2)*sum(Base.FastMath.pow_fast.(XX1,2.0),2)')' -
         Base.LinAlg.BLAS.gemm('N','T',2.0,XX1,XX2);
    DD = Base.FastMath.sqrt_fast.(DD)
end

function distmat_eucv1(XX::Array{Float64,2})
    n = size(XX,1)
    GG = XX*XX';
    DD = sqrt.(ones(n)*diag(GG)' + diag(GG)*ones(1,n) - 2.0.*GG)
end
First question: If I re-write the julia distance function like so:
function dist2(X1::Matrix, X2::Matrix)
    size(X1, 2) != size(X2, 2) && error("Matrices' 2nd dimensions must agree!")
    return sqrt.(sum(abs2, X1, 2) .+ sum(abs2, X2, 2)' .- 2 .* (X1 * X2'))
end
I shave >40% off the execution time.
For a single dataset you can save a bit more, like this:
function dist2(X::Matrix)
    G = X * X'
    dG = diag(G)
    return sqrt.(dG .+ dG' .- 2 .* G)
end
Third question: You should do your benchmarking with BenchmarkTools.jl, and perform the benchmarking like this (remember $ for variable interpolation):
julia> using BenchmarkTools
julia> @btime dist2($XX1, $XX2);
Additionally, you should not do powers using floats, like this: X.^2.0. It is faster, and equally correct to write X.^2.
For multiplication there is no speed difference between 2.0 .* X and 2 .* X, but you should still prefer using an integer, because it is more generic. As an example, if X has Float32 elements, multiplying with 2.0 will promote the array to Float64s, while multiplying with 2 will preserve the eltype.
And finally, note that in new versions of Matlab, too, you can get broadcasting behaviour by simply adding Mx1 arrays with 1xN arrays. There is no need to first expand them by multiplying with ones(...).

Getting rank deficient warning when using regress function in MATLAB

I have a dataset comprising 30 independent variables, and I tried performing linear regression in MATLAB R2010b using the regress function.
I get a warning stating that my matrix X is rank deficient to within machine precision.
Now, the coefficients I get after executing this function don't match with the experimental one.
Can anyone please suggest how I should perform the regression analysis for this equation comprising 30 variables?
Following on from our discussion, the reason you are getting that warning is that you have what is known as an underdetermined system. Basically, you have more variables to solve for than observations available to constrain them. One example of an underdetermined system is:
x + y + z = 1
x + y + 2z = 3
There are an infinite number of combinations of (x,y,z) that can solve the above system. For example, (x, y, z) = (1, −2, 2), (2, −3, 2), and (3, −4, 2). What rank deficient means in your case is that there is more than one set of regression coefficients that would satisfy the governing equation that would describe the relationship between your input variables and output observations. This is probably why the output of regress isn't matching up with your ground truth regression coefficients. Though it isn't the same answer, do know that the output is one possible answer. By running through regress with your data, this is what I get if I define your observation matrix to be X and your output vector to be Y:
>> format long g;
>> B = regress(Y, X);
>> B
B =
0
0
28321.7264417536
0
35241.9719076362
899.386999172398
-95491.6154990829
-2879.96318251964
-31375.7038251919
5993.52959752106
0
18312.6649115112
0
0
8031.4391233753
27923.2569044728
7716.51932560781
-13621.1638587172
36721.8387047613
80622.0849069525
-114048.707780113
-70838.6034825939
-22843.7931997405
5345.06937207617
0
106542.307940305
-14178.0346010715
-20506.8096166108
-2498.51437396558
6783.3107243113
You can see that there are seven regression coefficients that are equal to 0, which corresponds to 30 - 23 = 7. We have 30 variables and 23 constraints to work with. Be advised that this is not the only possible solution. regress essentially computes the least-squares solution, i.e. the B for which the sum of squared residuals of Y - X*B is smallest. This essentially simplifies to:
B = X^(*)*Y
X^(*) is what is known as the pseudo-inverse of the matrix. MATLAB has this available, and it is called pinv. Therefore, if we did:
B = pinv(X)*Y
We get:
B =
44741.6923363563
32972.479220139
-31055.2846404536
-22897.9685877566
28888.7558524005
1146.70695371731
-4002.86163441217
9161.6908044046
-22704.9986509788
5526.10730457192
9161.69080479427
2607.08283489226
2591.21062004404
-31631.9969765197
-5357.85253691504
6025.47661106009
5519.89341411127
-7356.00479046122
-15411.5144034056
49827.6984426955
-26352.0537850382
-11144.2988973666
-14835.9087945295
-121.889618144655
-32355.2405829636
53712.1245333841
-1941.40823106236
-10929.3953469692
-3817.40117809984
2732.64066796307
You see that there are no zero coefficients because pinv finds the solution using the L2-norm, which promotes the "spreading out" of the errors (for lack of a better term). You can verify that these are correct regression coefficients by doing:
>> Y2 = X*B
Y2 =
16.1491563400241
16.1264219600856
16.525331600049
17.3170318001845
16.7481541301999
17.3266932502295
16.5465094100486
16.5184456100487
16.8428701100165
17.0749421099829
16.7393450000517
17.2993993099419
17.3925811702017
17.3347117202356
17.3362798302375
17.3184486799219
17.1169638102517
17.2813552099096
16.8792925100727
17.2557945601102
17.501873690151
17.6490477001922
17.7733493802508
Similarly, if we used the regression coefficients from regress, so B = regress(Y,X); then doing Y2 = X*B, we get:
Y2 =
16.1491563399927
16.1264219599996
16.5253315999987
17.3170317999969
16.7481541299967
17.3266932499992
16.5465094099978
16.5184456099983
16.8428701099975
17.0749421099985
16.7393449999981
17.2993993099983
17.3925811699993
17.3347117199991
17.3362798299967
17.3184486799987
17.1169638100025
17.281355209999
16.8792925099983
17.2557945599979
17.5018736899983
17.6490476999977
17.7733493799981
There are some slight computational differences, which is to be expected. Similarly, we can also find the answer by using mldivide:
B = X \ Y
B =
0
0
28321.726441712
0
35241.9719075889
899.386999170666
-95491.6154989513
-2879.96318251572
-31375.7038251485
5993.52959751295
0
18312.6649114859
0
0
8031.43912336425
27923.2569044349
7716.51932559712
-13621.1638586983
36721.8387047123
80622.0849068411
-114048.707779954
-70838.6034824987
-22843.7931997086
5345.06937206919
0
106542.307940158
-14178.0346010521
-20506.8096165825
-2498.51437396236
6783.31072430201
You can see that this curiously matches up with what regress gives you. That's because \ is a smarter operator. Depending on how your matrix is structured, it finds the solution to the system by a different method. I'd like to refer you to the post by Amro that talks about which algorithms mldivide uses when examining the properties of the input matrix being operated on:
How to implement Matlab's mldivide (a.k.a. the backslash operator "\")
What you should take away from this answer is that you can certainly go ahead and use those regression coefficients and they will more or less give you the expected output for each value of Y with each set of inputs for X. However, be warned that those coefficients are not unique. This is apparent as you said that you have ground truth coefficients that don't match up with the output of regress. It isn't matching up because it generated another answer that satisfies the constraints you have provided.
There is more than one answer that can describe that relationship if you have an underdetermined system, as you have seen by my experiments shown above.
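The same distinction can be reproduced outside MATLAB. Below is a small NumPy sketch on the toy system from the start of this answer (just for illustration): numpy.linalg.pinv and numpy.linalg.lstsq return the minimum-norm solution, while any other vector satisfying the constraints is an equally valid answer.

import numpy as np

# the underdetermined toy system from above:
#   x + y +  z = 1
#   x + y + 2z = 3
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 2.0]])
b = np.array([1.0, 3.0])

x_min_norm, *_ = np.linalg.lstsq(A, b, rcond=None)  # minimum-norm least-squares solution
x_pinv = np.linalg.pinv(A) @ b                      # same solution via the pseudo-inverse

print(x_min_norm)                      # [-0.5 -0.5  2. ]
print(x_pinv)                          # [-0.5 -0.5  2. ]
print(A @ np.array([1.0, -2.0, 2.0]))  # [1. 3.] -- a different, equally valid solution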

why is there significant double precision difference between Matlab and Mathematica?

I created a random double precision value in Matlab by
x = rand(1,1);
then display all possible digits of x by
vpa(x,100)
and obtain:
0.22381193949113697971853298440692014992237091064453125
I save x to a .mat file, and import it into Mathematica, and then convert it:
y = N[FromDigits[RealDigits[x]],100]
and obtain:
0.22381193949113690000
Then go back to Matlab and use (copy and paste all the Mathematica digits to Matlab):
vpa(0.22381193949113690000,100)
and obtain:
0.2238119394911368964518061375201796181499958038330078125
Why is there a significant difference between the same double-precision variable?
How to bridge the gap when exchanging data between Mathematica and Matlab?
You can fix this problem by using ReadList instead of Import. I have added some demo steps below to explore displayed rounding and equality. Note that the final test d == e is False in Mathematica 7 but True in Mathematica 9 (with all the expected digits), so it looks like some precision has been added to Import by version 9. The demo uses a demo file.
Contents of demo.dat:
0.22381193949113697971853298440692014992237091064453125
"0.22381193949113697971853298440692014992237091064453125"
Exploring:-
a = Import["demo.dat"]
b = ReadList["demo.dat"]
a[[1, 1]] == a[[2, 1]]
b[[1]] == b[[2]]
a[[1, 1]] == b[[1]]
a[[1, 1]] == ToExpression@b[[2]]
b[[1]] // FullForm
c = First@StringSplit[ToString@FullForm@b[[1]], "`"]
b[[2]]
ToExpression /@ {c, b[[2]]}
d = N[FromDigits[RealDigits[a[[1, 1]]]], 100]
e = N[FromDigits[RealDigits[b[[1]]]], 100]
d == e
The precision is as expected for double values. A double has a 53-bit fraction, thus the precision is about 53*log(2)/log(10) ≈ 16 significant digits. You have 16 significant digits, so it works as expected.
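A quick Python check makes the same point (a sketch; any tool with correctly rounded float parsing and printing behaves this way): 16 significant digits are not always enough to round-trip a double, but 17 always are.

from decimal import Decimal

x = 0.2238119394911369    # only 16 significant digits survived the copy-paste
print(Decimal(x))         # exact value of the stored double:
                          # 0.2238119394911368964518061375201796181499958038330078125

y = 0.22381193949113698   # 17 significant digits round-trip exactly
print(Decimal(y))         # 0.22381193949113697971853298440692014992237091064453125
print(repr(y))            # 0.22381193949113698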

(0.3)^3 == (0.3)*(0.3)*(0.3) returns false in matlab?

I am trying to understand roundoff error for basic arithmetic operations in MATLAB and I came across the following curious example.
(0.3)^3 == (0.3)*(0.3)*(0.3)
ans = 0
I'd like to know exactly how the left-hand side is computed. MATLAB documentation suggests that for integer powers an 'exponentiation by squaring' algorithm is used.
"Matrix power. X^p is X to the power p, if p is a scalar. If p is an integer, the power is computed by repeated squaring."
So I assumed (0.3)^3 and (0.3)*(0.3)^2 would return the same value. But this is not the case. How do I explain the difference in roundoff error?
I don't know anything about MATLAB, but I tried it in Ruby:
irb> 0.3 ** 3
=> 0.026999999999999996
irb> 0.3 * 0.3 * 0.3
=> 0.027
According to the Ruby source code, the exponentiation operator casts the right-hand operand to a float if the left-hand operand is a float, and then calls the standard C function pow(). The float variant of the pow() function must implement a more complex algorithm for handling non-integer exponents, which would use operations that result in roundoff error. Maybe MATLAB works similarly.
Interestingly, scalar ^ seems to be implemented using pow while matrix ^ is implemented using square-and-multiply. To wit:
octave:13> format hex
octave:14> 0.3^3
ans = 3f9ba5e353f7ced8
octave:15> 0.3*0.3*0.3
ans = 3f9ba5e353f7ced9
octave:20> [0.3 0;0 0.3]^3
ans =
3f9ba5e353f7ced9 0000000000000000
0000000000000000 3f9ba5e353f7ced9
octave:21> [0.3 0;0 0.3] * [0.3 0;0 0.3] * [0.3 0;0 0.3]
ans =
3f9ba5e353f7ced9 0000000000000000
0000000000000000 3f9ba5e353f7ced9
This is confirmed by running octave under gdb and setting a breakpoint in pow.
The same is likely true in matlab, but I can't really verify.
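For what it is worth, CPython's ** on floats also ends up in the C library's pow(), so the same pair of doubles shows up there. The hex strings below match the Octave bit patterns above (output from a 64-bit IEEE-754 machine; another libm could in principle differ):

>>> 0.3 ** 3
0.026999999999999996
>>> 0.3 * 0.3 * 0.3
0.027
>>> (0.3 ** 3).hex(), (0.3 * 0.3 * 0.3).hex()
('0x1.ba5e353f7ced8p-6', '0x1.ba5e353f7ced9p-6')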
Thanks to @Dougal I found this:
#include <stdio.h>
int main() {
    double x = 0.3;
    printf("%.40f\n", (x * x * x));          // all three multiplications done in double
    long double y = 0.3;
    printf("%.40f\n", (double)(y * y * y));  // same starting double, but the products are
                                             // accumulated in long double, then rounded back
}
which gives:
0.0269999999999999996946886682280819513835
0.0269999999999999962252417162744677625597
The case is strange because the computation with more digits gives a worse result. This is because the initial number 0.3 is itself only an approximation, so we start with a relatively "large" error. In this particular case, the computation with fewer digits happens to introduce another "large" error of opposite sign, which compensates the initial one; the computation with more digits introduces a second, smaller error, but the first one remains.
Here's a little test program that follows what the system pow() from Source/Intel/xmm_power.c, in Apple's Libm-2026, does in this case:
#include <stdio.h>
int main() {
    // basically lines 1130-1157 of xmm_power.c, modified a bit to remove
    // irrelevant things
    double x = .3;
    int i = 3;
    // calculate ix = x**i
    long double ix = 1.0, lx = (long double) x;
    // calculate x**i by doing lots of multiplication
    int mask = 1;
    // for each of the bits set in i, multiply ix by x**(2**bit_position)
    while (i != 0)
    {
        if (i & mask)
        {
            ix *= lx;
            i -= mask;
        }
        mask += mask;
        lx *= lx; // In double this might overflow spuriously, but not in long double
    }
    printf("%.40f\n", (double) ix);
}
This prints out 0.0269999999999999962252417162744677625597, which agrees with the results I get for .3 ^ 3 in Matlab and .3 ** 3 in Python (and we know the latter just calls this code). By contrast, .3 * .3 * .3 for me gets 0.0269999999999999996946886682280819513835, which is the same thing that you get if you just ask to print out 0.027 to that many decimal places and so is presumably the closest double.
So there's the algorithm. We could trace exactly what value is produced at each step, but it's not too surprising that it rounds to a very slightly smaller number given a different algorithm for doing it.
Read Goldberg's "What Every Computer Scientist Should Know About Floating-Point Arithmetic" (this is a reprint by Oracle). Do understand it. Floating point numbers are not the real numbers of calculus. Sorry, no TL;DR version available.