Numerical computation problems in MATLAB

In this question I am asking about numerical computation problems in MATLAB, and I want to learn how to avoid these problems/errors in the future.
For example, consider the following simple code:
t = 0.4 + 0.1 - 0.5
t =
0
This works fine, but
u = 0.4 - 0.5 + 0.1
u =
2.7756e-17
Of course, mathematically this is also 0, but why does it not give the same result as the first calculation? What is the difference? Also, please look at:
v = (sin(2*pi)==sin(4*pi))
v =
0
This suggests that the sine function is not periodic. So what is the general advice in this case? Introduce some epsilon? Like:
V=((sin(2*pi)-sin(4*pi))<eps)
V =
0
or
EPS=0.000000000000001
EPS =
1.0000e-15
>> V=((sin(2*pi)-sin(4*pi))<EPS)
V =
1
Please help me.

It's normal that you get these results: floating-point addition is not associative, so the rounding error depends on the order of the operations (0.4 + 0.1 happens to round to exactly 0.5, whereas 0.4 - 0.5 + 0.1 leaves a tiny residual). The floating-point relative accuracy in MATLAB is
eps('double')
ans =
2.2204e-16
For V=((sin(2*pi)-sin(4*pi))<eps), the comparison fails because
sin(2*pi)-sin(4*pi)
ans =
2.4493e-16
which is larger than eps('double'), so the result is V=0.
For V=((sin(2*pi)-sin(4*pi))<EPS), EPS is larger than 2.4493e-16, so the result is V=1.
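Note that a robust comparison should also take the absolute value of the difference, since the sign of the rounding error can go either way. A minimal sketch of a tolerance-based comparison (the helper name and tolerance are just illustrative):
% Approximate equality: compare |a-b| against a tolerance scaled by the operands
approxEqual = @(a, b, tol) abs(a - b) <= tol*max(1, max(abs(a), abs(b)));
approxEqual(sin(2*pi), sin(4*pi), 1e-12)   % returns logical 1
approxEqual(0.4 - 0.5 + 0.1, 0, 1e-12)     % returns logical 1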


Doing Andrew Ng's Logistic Regression exercise without fminunc

I've been trying to finish Andrew Ng's Machine Learning course; I am now at the logistic regression part. I am trying to find the parameters and also calculate the cost without using the MATLAB function fminunc. However, I am not converging to the correct results posted by other students who finished the assignment using fminunc. Specifically, my problems are:
the parameters theta are incorrect
my cost seems to be blowing up
I get many NaNs in my cost vector (I just create a vector of the costs to keep track)
I attempted to find the parameters via gradient descent as I understood it from the course. However, my implementation still seems to give incorrect results.
dataset = load('dataStuds.txt');
x = dataset(:,1:end-1);
y = dataset(:,end);
m = length(x);
% Padding with 1's (the intercept term, as they call it)
x = [ones(length(x),1), x];
thetas = zeros(size(x,2),1);
% Setting the learning rate to 0.1
alpha = 0.1;
for i = 1:100000
% theta transpose x (though in MATLAB it is written the other way
% round :)
ttrx = x * thetas;
% the hypothesis function h_x = g(z) = 1/(1 + exp(-z))
h_x = 1 ./ (1 + exp(-ttrx));
error = h_x - y; % residual (note: this shadows MATLAB's built-in error function and is unused below)
% the gradient (aka the derivative of J(\theta) aka the derivative
% term)
for j = 1:length(thetas)
gradient = 1/m * (h_x - y)' * x(:,j);
% Updating the parameters theta
thetas(j) = thetas(j) - alpha * gradient;
end
% Calculating the cost, just to keep track...
cost(i) = 1/m * ( -y' * log(h_x) - (1-y)' * log(1-h_x) );
end
% Displaying the final theta's that I obtained
thetas
The parameters theta that I get are:
thetas =
-482.8509
3.7457
2.6976
The results below is from one example that I downloaded, but the author used fminunc for this one.
Cost at theta found by fminunc: 0.203506
theta:
-24.932760
0.204406
0.199616
The data:
34.6236596245170 78.0246928153624 0
30.2867107682261 43.8949975240010 0
35.8474087699387 72.9021980270836 0
60.1825993862098 86.3085520954683 1
79.0327360507101 75.3443764369103 1
45.0832774766834 56.3163717815305 0
61.1066645368477 96.5114258848962 1
75.0247455673889 46.5540135411654 1
76.0987867022626 87.4205697192680 1
84.4328199612004 43.5333933107211 1
95.8615550709357 38.2252780579509 0
75.0136583895825 30.6032632342801 0
82.3070533739948 76.4819633023560 1
69.3645887597094 97.7186919618861 1
39.5383391436722 76.0368108511588 0
53.9710521485623 89.2073501375021 1
69.0701440628303 52.7404697301677 1
67.9468554771162 46.6785741067313 0
70.6615095549944 92.9271378936483 1
76.9787837274750 47.5759636497553 1
67.3720275457088 42.8384383202918 0
89.6767757507208 65.7993659274524 1
50.5347882898830 48.8558115276421 0
34.2120609778679 44.2095285986629 0
77.9240914545704 68.9723599933059 1
62.2710136700463 69.9544579544759 1
80.1901807509566 44.8216289321835 1
93.1143887974420 38.8006703371321 0
61.8302060231260 50.2561078924462 0
38.7858037967942 64.9956809553958 0
61.3792894474250 72.8078873131710 1
85.4045193941165 57.0519839762712 1
52.1079797319398 63.1276237688172 0
52.0454047683183 69.4328601204522 1
40.2368937354511 71.1677480218488 0
54.6351055542482 52.2138858806112 0
33.9155001090689 98.8694357422061 0
64.1769888749449 80.9080605867082 1
74.7892529594154 41.5734152282443 0
34.1836400264419 75.2377203360134 0
83.9023936624916 56.3080462160533 1
51.5477202690618 46.8562902634998 0
94.4433677691785 65.5689216055905 1
82.3687537571392 40.6182551597062 0
51.0477517712887 45.8227014577600 0
62.2226757612019 52.0609919483668 0
77.1930349260136 70.4582000018096 1
97.7715992800023 86.7278223300282 1
62.0730637966765 96.7688241241398 1
91.5649744980744 88.6962925454660 1
79.9448179406693 74.1631193504376 1
99.2725269292572 60.9990309984499 1
90.5467141139985 43.3906018065003 1
34.5245138532001 60.3963424583717 0
50.2864961189907 49.8045388132306 0
49.5866772163203 59.8089509945327 0
97.6456339600777 68.8615727242060 1
32.5772001680931 95.5985476138788 0
74.2486913672160 69.8245712265719 1
71.7964620586338 78.4535622451505 1
75.3956114656803 85.7599366733162 1
35.2861128152619 47.0205139472342 0
56.2538174971162 39.2614725105802 0
30.0588224466980 49.5929738672369 0
44.6682617248089 66.4500861455891 0
66.5608944724295 41.0920980793697 0
40.4575509837516 97.5351854890994 1
49.0725632190884 51.8832118207397 0
80.2795740146700 92.1160608134408 1
66.7467185694404 60.9913940274099 1
32.7228330406032 43.3071730643006 0
64.0393204150601 78.0316880201823 1
72.3464942257992 96.2275929676140 1
60.4578857391896 73.0949980975804 1
58.8409562172680 75.8584483127904 1
99.8278577969213 72.3692519338389 1
47.2642691084817 88.4758649955978 1
50.4581598028599 75.8098595298246 1
60.4555562927153 42.5084094357222 0
82.2266615778557 42.7198785371646 0
88.9138964166533 69.8037888983547 1
94.8345067243020 45.6943068025075 1
67.3192574691753 66.5893531774792 1
57.2387063156986 59.5142819801296 1
80.3667560017127 90.9601478974695 1
68.4685217859111 85.5943071045201 1
42.0754545384731 78.8447860014804 0
75.4777020053391 90.4245389975396 1
78.6354243489802 96.6474271688564 1
52.3480039879411 60.7695052560259 0
94.0943311251679 77.1591050907389 1
90.4485509709636 87.5087917648470 1
55.4821611406959 35.5707034722887 0
74.4926924184304 84.8451368493014 1
89.8458067072098 45.3582836109166 1
83.4891627449824 48.3802857972818 1
42.2617008099817 87.1038509402546 1
99.3150088051039 68.7754094720662 1
55.3400175600370 64.9319380069486 1
74.7758930009277 89.5298128951328 1
I ran your code and it does work fine. However, the tricky thing about gradient descent is ensuring that your costs don't diverge to infinity. If you look at your costs array, you will see that the costs definitely diverge and this is why you are not getting the correct results.
The best way to eliminate this in your case is to reduce the learning rate. Through experimentation, I have found that a learning rate of alpha = 0.003 is the best for your problem. I've also increased the number of iterations to 200000. Changing these two things gives me the following parameters and associated cost:
>> format long g;
>> thetas
thetas =
-17.6287417780435
0.146062780453677
0.140513170941357
>> cost(end)
ans =
0.214821863463963
This is more or less in line with the magnitudes of the parameters you see when using fminunc. However, the parameters and costs are slightly different because of the minimization method itself: fminunc uses a variant of L-BFGS, which finds the solution much faster.
What is most important is the actual accuracy. Remember that to classify an example as label 0 or 1, you take the weighted sum of the parameters and features, run it through the sigmoid function, and threshold at 0.5. The accuracy is then the fraction of examples for which the predicted label matches the expected label.
Using the parameters we found with gradient descent gives us the following accuracy:
>> ttrx = x * thetas;
>> h_x = 1 ./ (1 + exp(-ttrx)) >= 0.5;
>> mean(h_x == y)
ans =
0.89
This means that we've achieved an 89% classification accuracy. Using the parameters provided by fminunc also gives:
>> thetas2 = [-24.932760; 0.204406; 0.199616];
>> ttrx = x * thetas2;
>> h_x = 1 ./ (1 + exp(-ttrx)) >= 0.5;
>> mean(h_x == y)
ans =
0.89
So we can see that the accuracy is the same, so I wouldn't worry too much about the magnitude of the parameters; it is also consistent with the comparison of the costs between the two implementations.
As a final note, I would suggest looking at this post of mine for tips on making logistic regression work robustly. I would definitely recommend normalizing your features before finding the parameters to make the algorithm converge faster. It also addresses why you were getting the wrong parameters (namely, the cost blowing up): Cost function in logistic regression gives NaN as a result.
Normalizing the data using the mean and standard deviation, as follows, enables you to use a large learning rate and get a similar answer:
clear; clc
data = load('ex2data1.txt');
m = length(data);
alpha = 0.1;
theta = [0; 0; 0];
y = data(:,3);
% Normalizing the data
xm1 = mean(data(:,1)); xm2 = mean(data(:,2));
xs1 = std(data(:,1)); xs2 = std(data(:,2));
x1 = (data(:,1)-xm1)./xs1; x2 = (data(:,2)-xm2)./xs2;
X = [ones(m, 1) x1 x2];
for i=1:10000
% hypothesis: sigmoid of X*theta
h = 1./(1+exp(-(X*theta)));
% vectorized gradient descent update
theta = theta - (alpha/m)* (X'*(h-y));
% logistic cost, stored to check convergence
J(i) = (1/m)*(-y'*log(h)-(1-y)'*log(1-h));
end
theta
J(end)
figure
plot(J)
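As a quick check (a sketch reusing the variables defined above), the training accuracy with these normalized features can be computed the same way as before; it should be close to the ~0.89 seen earlier:
h = 1./(1+exp(-(X*theta))) >= 0.5;   % predicted labels at the 0.5 threshold
mean(h == y)                         % fraction of correct predictions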

Why does my Matlab code not work correctly?

My code
b(abs(b(1:3:length(b))) > 0.75) = 0.75
What it's supposed to do:
b1 = b(1:3:end);
b1(abs(b1)>0.75) = 0.75;
b(1:3:end) = b1;
How are these two not the same?
The expression abs(b(1:3:length(b))) > 0.75 returns a short logical vector (one element for every third entry of b), so b(...) = 0.75 only indexes into the first few entries of b: it changes the i-th entry of b to 0.75 whenever the absolute value of the (3*(i-1)+1)-th entry is greater than 0.75.
For example:
b = [-0.684; 0.941; 0.914; -0.029; 0.6; -0.716; -0.156; 0.831; 0.584; 0.919];
b_index = abs(b(1:3:length(b)))>0.75
would return
b_index =
0
0
0
1
and b(b_index) = 0.75 would change the 4th entry of b to 0.75.
One way of doing this as a one-liner is
b(1:3:end) = b(1:3:end).*(abs(b(1:3:end))<=0.75) + 0.75*(abs(b(1:3:end))>0.75);
but I think the three-liner is a bit clearer.
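As a quick sanity check (a sketch; the test vector is just illustrative), you can verify that the one-liner reproduces the three-line version:
b = randn(10,1);                 % some test data
b_ref = b;                       % apply the three-line version for reference
b1 = b_ref(1:3:end);
b1(abs(b1) > 0.75) = 0.75;
b_ref(1:3:end) = b1;
b(1:3:end) = b(1:3:end).*(abs(b(1:3:end))<=0.75) + 0.75*(abs(b(1:3:end))>0.75);
isequal(b, b_ref)                % returns logical 1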

How to write many functions depending on each other in matlab

I have some functions that depend on each other; they are from page 136 of this book: http://www.cs.helsinki.fi/u/ahyvarin/papers/bookfinal_ICA.pdf. The functions are presented below. How do I write them in MATLAB?
y(t) = W(t-1)*x(t)
h(t) = P(t-1)*y(t)
P(t)=(1/B)*Tri[P(t-1)-m(t)*h^T(t)]
m(t) = h(t)/(B + y^T(t)*h(t))
e(t) = x(t)-W^T(t-1)*y(t)
W(t) = W(t-1) + m(t)*e^T(t)
It solves for the weight matrix W(t) iteratively. I tried to do it like this in MATLAB, but it did not work, so maybe you can advise me how to correct the code:
for i=1:10
e=randn(3,5000);
A=[1 0 0;-0.5 0.5 0;0.3 0.1 0.1];
x=A*e;
y(t) = W(t-1)*x(t)
h(t) = P(t-1)*y(t)
P(t)=(1/B)*Tri[P(t-1)-m(t)*h^T(t)]
m(t) = h(t)/(B+y^T(t))*h(t))
e(t) = x(t)-W^T(t-1)*y(t)
W(t) = W(t-1) + m(t)*e^T(t)
end
Thanks
OK. I can't really understand exactly what you want, but your code shows that you are missing a few points. I will try to clarify them:
for i = 2:10
x = rand(3);
y = W(:,:,i-1)*x;
h = P(:,:,i-1)*y;
m=h/(1+y'*h);
P(:,:,i)=P(:,:,i-1)-m*h'; % the 1/B factor and the Tri[] step are omitted here (see below)
e=x-W(:,:,i-1)'*y;
W(:,:,i)=W(:,:,i-1)+m*e';
end
You should go about it something like this:
1. You calculate x and use it to calculate the other quantities.
2. All of them are matrices, so you need to define them first (for example y = ones(3), etc.); in particular, W(:,:,1) and P(:,:,1) must be initialized before the loop.
3. y^T and e^T are not new variables; they denote transposition (y' and e' in MATLAB). If you don't see the difference, it is probably too early for you to solve this task :)
And lastly: the Tri function will give you some trouble, but it is defined on page 136.
P.S. I left out beta because I don't know what it is :)
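For completeness, here is a minimal runnable sketch of the recursion, assuming W and P start as identity matrices and beta = 1; these initial values, the simulated data, and the omission of the Tri[] step are assumptions, not from the book:
B = 1;                         % assumed forgetting factor beta
W = eye(3);                    % assumed initial weight matrix
P = eye(3);                    % assumed initial P
A = [1 0 0; -0.5 0.5 0; 0.3 0.1 0.1];
X = A*randn(3,5000);           % mixed signals, one column per time step
for t = 1:size(X,2)
x = X(:,t);
y = W*x;
h = P*y;
m = h/(B + y'*h);
P = (1/B)*(P - m*h');          % Tri[] (symmetrization) step omitted
e = x - W'*y;
W = W + m*e';
end
W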

Matlab : Insert values from a 2-d matrix into a 3-d matrix on some condition

output is a 3-D matrix with size(output) == [height width N], and input is a 2-D matrix with size(input) == [height width]. I need to implement the following code in one line.
for k = 1:size(output,3)
f = output(:,:,k);
i_zero = (f==0);
f(i_zero) = input(i_zero);
output(:,:,k) = f;
end
bsxfun approach -
output = bsxfun(@times,output==0,input) + output
Alternative approach -
output = (output==0).*input(:,:,ones(1,N))+ output
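As a side note (not from the original answer): on MATLAB R2016b and newer, implicit expansion lets you drop both bsxfun and the explicit replication:
output = output + (output==0).*input;   % input is expanded along the 3rd dimension automatically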
I hope the "I need to implement" is not homework.
Here is a solution that should solve your problem, although not in one line.
new_input=repmat(input,1,1,size(output,3));
output(output==0)=new_input(output==0);
All the answers solve the problem when there is an exact comparison to 0 (as the OP required), but for the sake of generalization, if you intend to change to another comparison, be aware that not all methods behave the same way.
Example below:
CODE:
%Simulation
output=rand(10,10,3);
input=rand(10,10);
% output=randi(9,10,10,3);
% input=randi(9,10,10);
%OP code
output2=[];
for k = 1:size(output,3)
f = output(:,:,k);
i_zero = (f<0.5);
f(i_zero) = input(i_zero);
output2(:,:,k) = f;
end
%repmat code
output3=output;
new_input=repmat(input,1,1,size(output,3));
output3(output<0.5)=new_input(output<0.5);
any(output2(:)-output3(:))
%bsxfun code
output4 = bsxfun(@times,output<0.5,input) + output;
any(output2(:)-output4(:))
%other variation code
output5 = (output<0.5).*input(:,:,ones(1,size(output,3)))+ output;
any(output2(:)-output5(:))
% builtin code
output6=output;
output6(output<0.5)=builtin('_paren',repmat(input,[1,1,size(output,3)]),output<0.5);
any(output2(:)-output6(:))
'-----'
any(abs(output2(:)-output3(:))>eps)
any(abs(output2(:)-output4(:))>eps)
any(abs(output2(:)-output5(:))>eps)
any(abs(output2(:)-output6(:))>eps)
'-----'
sum(abs(output2(:)-output3(:)))
sum(abs(output2(:)-output4(:)))
sum(abs(output2(:)-output5(:)))
sum(abs(output2(:)-output6(:)))
OUTPUT:
ans =
0
ans =
1
ans =
1
ans =
0
-----
ans =
0
ans =
1
ans =
1
ans =
0
-----
ans =
0
ans =
150.5088
ans =
150.5088
ans =
0
If you insist on a single line solution, you can use the (:) operator along with the mod command:
output(output(:)==0) = input(mod(find(output(:)==0)-1,height*width)+1)
where the -1 and +1 are there to avoid an index of 0 when taking the modulus.
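To see why this works (a small illustrative example, not from the original answer): linear indices into the 3-D output run page by page, so taking them modulo height*width maps each zero entry back to the corresponding position in the 2-D input:
% Illustrative sizes and data (names and values are just for demonstration)
height = 2; width = 3; N = 2;
input  = reshape(1:height*width, height, width);
output = randi([0 1], height, width, N);   % contains some zeros to fill in
idx    = find(output(:)==0);               % linear indices into the 3-D array
idx2d  = mod(idx-1, height*width) + 1;     % same (row,col) position within a single page
output(idx) = input(idx2d);                % equivalent to the one-liner above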
Here it is in a one-liner... but it uses the undocumented builtin('_paren',...) to subscript-reference the output of a function:
output(output==0)=builtin('_paren',repmat(input,[1,1,N]),output==0)
Without the undocumented builtin, this method gets messy if you want it in one line...
output=subsasgn(output,struct('type','()','subs',{{output==0}}),...
subsref(repmat(input,[1,1,N]),struct('type','()','subs',{{output==0}})))
...sadly I forgot using masks and adding two matrices together was an option...
You could also look at the 'cat' function:
http://www.mathworks.nl/help/matlab/ref/cat.html
cat(3,matrix1,matrix2,...) concatenates along the 3rd dimension.

Solve in MATLAB a quadratic equation with very small coefficients

I'm implementing code in MATLAB to solve quadratic equations using the resolvent (quadratic) formula.
Here's the code:
clear all
format short
a=1; b=30000000.001; c=1/4;
rdelta=sqrt(b^2-4*a*c);
x1=(-b+rdelta)/(2*a);
x2=(-b-rdelta)/(2*a);
fprintf(' Roots of the polynomial %5.3f x^2 + %5.3f x+%5.3f \n',a,b,c)
fprintf ('x1= %e\n',x1)
fprintf ('x2= %e\n\n',x2)
valor_real_x1= -8.3333e-009;
valor_real_x2= -2.6844e+007;
error_abs_x1 = abs (valor_real_x1-x1);
error_abs_x2 = abs (valor_real_x2-x2);
error_rel_x1 = abs (error_abs_x1/valor_real_x1);
error_rel_x2 = abs (error_abs_x2/valor_real_x2);
fprintf(' absolute_errorx1 = |real value - obtained value| = |%e - %e| = %e \n',valor_real_x1,x1,error_abs_x1)
fprintf(' absolute_errorx2 = |real value - obtained value| = |%e - %e| = %e \n\n',valor_real_x2,x2,error_abs_x2)
fprintf(' relative error_x1 = |absolut error / real value| = |%e / %e| = %e \n',error_abs_x1,valor_real_x1,error_rel_x1 )
fprintf(' relative_error_x2 = |absolut error / real value| = |%e / %e| = %e \n',error_abs_x2,valor_real_x2,error_rel_x2)
The problem I have is that it does not give me the exact solution; i.e., for the values a = 1, b = 30000000.001, c = 1/4, the roots it returns are:
Roots of the polynomial 1.000 x^2 + 30000000.001 x+0.250
x1= -9.313226e-009
x2= -3.000000e+007
The exact values of the roots of the polynomial are:
x1= -8.3333e-009
x2= -2.6844e+007
This gives me the following absolute and relative errors in the calculations:
absolute_errorx1 = |real value - obtained value| = |-8.333300e-009 - -9.313226e-009| = 9.799257e-010
absolute_errorx2 = |real value - obtained value| = |-2.684400e+007 - -3.000000e+007| = 3.156000e+006
relative error_x1 = |absolut error / real value| = |9.799257e-010 / -8.333300e-009| = 1.175916e-001
relative_error_x2 = |absolut error / real value| = |3.156000e+006 / -2.684400e+007| = 1.175682e-001
My question is: is there an optimal method to obtain the roots of a quadratic equation? That is, can I make changes to my code to reduce the relative error between the expected solution and the computed solution?
Using the quadratic formula directly in cases like this results in a large loss of numerical precision from subtracting two values of very similar magnitude, because the expression
sqrt(b*b - 4*a*c)
is nearly the same as b. So you should use only one of these two roots, the one that does not involve subtracting two very close values, and for the other root you can use (for instance) the fact that the product of roots of a quadratic is c/a. I'll let you fill in the gaps.
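A sketch of that idea (one common formulation; the variable names are just illustrative): compute the large-magnitude root with a sign choice that avoids cancellation, then recover the small root from the product of roots c/a:
a = 1; b = 30000000.001; c = 0.25;
q  = -(b + sign(b)*sqrt(b^2 - 4*a*c))/2;   % b and the square root share a sign, so no cancellation (assumes b ~= 0)
x1 = q/a                                   % large-magnitude root, about -3.0000000001e+07
x2 = c/q                                   % small root from the product c/a, about -8.3333e-09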
Why does this sound like a homework problem from a first class in numerical analysis?
It has been a while since I was that young, but as I recall there is a trick. Anyway, your "exact" values are wrong. The true roots of that polynomial are:
solve('x^2 + 30000000.001*x + 0.25')
ans =
-30000000.000999991666666666944442
-0.0000000083333333330555578703796293981491
How well does roots do here?
p = [1 30000000.001 1/4];
format long g
roots(p)
ans =
-30000000.001
-8.33333333305556e-09
That actually seems pretty good. How does HPF do?
DefaultNumberOfDigits 64
a = hpf(1);
b = hpf('30000000.001');
c = hpf('0.25');
r1 = (-b + sqrt(b*b - 4*a*c))/(2*a)
r1 =
-0.000000008333333333055557870379629398149125529835186899898569329967
r2 = (-b - sqrt(b*b - 4*a*c))/(2*a)
r2 =
-30000000.000999991666666666944442129620370601850874470164813100101
Yep, HPF works nicely enough too.
So what happens when you use double precision numbers and the standard formula? Yeah, crapola arrives.
a = 1;
b = 30000000.001;
c = 0.25;
>> r1 = (-b + sqrt(b*b - 4*a*c))/(2*a)
r1 =
-7.45058059692383e-09
>> r2 = (-b - sqrt(b*b - 4*a*c))/(2*a)
r2 =
-30000000.001
Again, massive subtractive cancellation eats away at the result. (I seem to recall that was the problem you had in your last question.)
There is a trick you can use. Notice that the large root was estimated well, just not the one near zero. So, what happens if you solve for the roots of fliplr(p) using the quadratic formula? How does this solve your problem? What transformation is implicitly done when you do that? (Sorry, but I won't do your homework. I think the above was enough of a hint anyway.)
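For the intuition (a small illustration, not the full worked answer): reversing the coefficient vector replaces each root x by 1/x, so the root near zero of the original polynomial becomes a large, well-conditioned root of the reversed one:
a = 1; b = 30000000.001; c = 0.25;
% quadratic formula on the reversed polynomial c*z^2 + b*z + a = 0
z = (-b - sqrt(b^2 - 4*c*a))/(2*c);   % large-magnitude root, computed without cancellation
x_small = 1/z                         % reciprocal recovers the small root of the original, about -8.3333e-09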
I think your "real" values might be wrong (or maybe it's a precision thing... I dunno):
a*(valor_real_x1^2)+b*(valor_real_x1)+c
ans =
9.9999e-07
a*(valor_real_x2^2)+b*(valor_real_x2)+c
ans =
-8.4720e+13
A nice formula for this problem:
q  = sqrt(c*a)/b;
f  = 0.5 + 0.5*sqrt(1 - 4*q*q);
x1 = -b*f/a;
x2 = -c/(f*b);
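Applied to the coefficients above (a minimal check; the values in the comments are approximate expectations, not measured output):
a = 1; b = 30000000.001; c = 1/4;
q  = sqrt(c*a)/b;
f  = 0.5 + 0.5*sqrt(1 - 4*q*q);
x1 = -b*f/a     % approximately -3.0000000001e+07
x2 = -c/(f*b)   % approximately -8.3333e-09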