Secant Method not converging, Matlab

I have created a program in Matlab to try to find the root of f(x) = exp(2x) + 3x - 4 (the function "fopg1" in my code). My code is as follows:
format long
tic;
for dum=1:1000;
x(1) = 0.5;
x(2) = 0.4;
err_tol = 1e-8;
iteration(1) = 1;
n = 3;
while err_estimate > err_tol
iteration(n) = n;
x(n) = x(n-1) - fopg1(x(n-2)) * ((x(n-1) - x(n-2)) / (fopg1(x(n-1)) - fopg1(x(n-2))));
err_estimate(n) = abs(x(n) - x(n-1));
n = n + 1;
end
%end
time = toc;
avgtime = time/1000;
A = [iteration' x' err_estimate' tbd'];
f = '%2i %13.9f %13.9f %7.3f'; compose(f,A)
Unfortunately this does not converge. I feel like it should. Is there a flaw in my program or does it in fact not converge? Thanks in advance.
Maarten

I answered a very similar question here a few days ago. Using the same code, without an iteration limit and with the tolerance set to 1e-8 as per your example, I get the expected convergence for exp(2x) + 3x - 4 using the secant method:
clear();
clc();
com = Inf;
i = 2;
err_tol = 1e-8;
f = @(x) exp(2*x) + 3*x - 4;
x(1) = 0.5;
x(2) = 0.4;
while (abs(com) > err_tol)
com = f(x(i)) * (x(i) - x(i-1)) / (f(x(i)) - f(x(i-1)));
x(i+1)= x(i) - com;
i = i + 1;
end
display(['Root X = ' num2str(x(end))]);
The message I receive is: Root X = 0.47369. It shouldn't be hard for you to implement your additional data within this code.
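For example, a minimal sketch of recovering the per-iteration table from your original script (iteration number, estimate, and error) would replace the loop above with:
err_estimate = [NaN NaN];                      % no error estimate for the two starting guesses
while (abs(com) > err_tol)
    com = f(x(i)) * (x(i) - x(i-1)) / (f(x(i)) - f(x(i-1)));
    x(i+1) = x(i) - com;
    err_estimate(i+1) = abs(x(i+1) - x(i));    % same estimate your script uses
    i = i + 1;
end
iteration = (1:numel(x))';
A = [iteration x' err_estimate'];
fprintf('%2i %13.9f %13.9f\n', A');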


"Error in running optimization. Not enough input arguments" while running ga in MATLAB

I am using the following objective function for optimization:
function Length_Sum = objective_function( l1,l2,l3 )
Length_Sum = l1 + l2 + l3;
end
The constraint function is given below; it uses another function to calculate the values of the thetas:
function [c, ceq] = simple_constraint(l1,l2,l3)
c(1) = l3^2 + 200*l3*cos(30) + 10000 - (l1 + l2)^2;
c(2) = (100- l3*cos(30))^2 + (100*sin(30))^2 - (l1-l2)^2;
thetas = inverse_kinematics(l1,l2,l3);
c(3) = thetas(4,1) - 160;
c(4) = thetas(4,2) - 160;
c(5) = thetas(4,3) - 160;
c(6) = 20 - thetas(4,1);
c(7) = 20 - thetas(4,2);
c(8) = 20 - thetas(4,3);
c(9) = thetas(5,1) - 340;
c(10) = thetas(5,2) - 340;
c(11) = thetas(5,3) - 340;
c(12) = 200 - thetas(5,1);
c(13) = 200 - thetas(5,2);
c(14) = 200 - thetas(5,3);
c(15) = thetas(6,1) - 340;
c(16) = thetas(6,2) - 340;
c(17) = thetas(6,3) - 340;
c(18) = 200 - thetas(6,1);
c(19) = 200 - thetas(6,2);
c(20) = 200 - thetas(6,3);
ceq = [];
end
The function called by the constraint function is given below:
function thetas = inverse_kinematics(l1,l2,l3)
x = 100;
y = 0;
phi = 210*pi/180:60*pi/180:330*pi/180;
x1 = x - (l3*cos(phi));
y1 = y - (l3*sin(phi));
a = sqrt(x1.^2 + y1.^2);
y2 = -y1./a;
x2 = -x1./a;
gamma = atan2(y2,x2);
c = (- x1.^2 - y1.^2 - l1^2 + l2^2)./(2*l1*a);
d = acos(c);
theta1 = gamma + d;
if theta1 < 0
theta1 = theta1 + 2*pi;
end
theta4 = gamma - d;
if theta4 < 0
theta4 = theta4 + 2*pi;
end
e = (y1 - l1*sin(theta1))/l2;
f = (x1 - l1*cos(theta1))/l2;
theta2 = atan2(e,f) - theta1;
if theta2 < 0
theta2 = theta2 + 2*pi;
end
g = (y1 - l1*sin(theta4))/l2;
h = (x1 - l1*cos(theta4))/l2;
theta5 = atan2(g,h) - theta4;
if theta5 < 0
theta5 = theta5 + 2*pi;
end
theta3 = (phi)- (theta1 + theta2);
if theta3 < 0
theta3 = theta3 + 2*pi;
end
theta6 = (phi)- (theta4 + theta5);
if theta6 < 0
theta6 = theta6 + 2*pi;
end
thetas = [theta1;theta2;theta3;theta4;theta5;theta6].*180/pi;
end
After running this code with the ga toolbox, with lower bounds [20 20 20], upper bounds [100 100 100], and the rest of the parameters set to default, I get an "Error in running optimization. Not enough input arguments" error. Can someone help?
ga expects the constraint function to take a single vector input, with the number of elements corresponding to the number of constrained variables. You should change
function [c, ceq] = simple_constraint(l1,l2,l3)
to
function [c, ceq] = simple_constraint(input)
l1 = input(1);
l2 = input(2);
l3 = input(3);
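Your objective function has the same three-argument signature, so it needs the same treatment. As a sketch (the anonymous wrapper below is illustrative; it simply forwards the vector to your existing function), the call with your stated bounds could look like:
obj = @(v) objective_function(v(1), v(2), v(3)); % forward the vector to the 3-argument objective
nvars = 3;
lb = [20 20 20];
ub = [100 100 100];
[x, fval] = ga(obj, nvars, [], [], [], [], lb, ub, @simple_constraint); % simple_constraint now takes one vector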
Next time, I suggest you try the File->Generate Code... option from the mentioned toolbox; then you can debug more easily from the Matlab window.
There is also another problem in your program. Try running inverse_kinematics(20,20,20). It fails on line 29, but I won't go into details here, because this is not part of the question.

Why is the MATLAB output of this numerical method not getting more accurate?

Posting here vs math.stackexchange because I think my issue is syntax:
I'm trying to analyze the 2nd order ODE y'' + 2y' + 2y = e^(-x) * sin(x) using MATLAB code for the midpoint method. I first converted the ODE to a system of 1st order equations and then tried to apply the method below, but as the number of discretization points m is increased, the output stops improving at 0.2718. For example, m=11 yields:
ans =
0.2724
and m=101:
ans =
0.2718
and m=10001
ans =
0.2718
Here's the code:
function [y,t] = ODEsolver_midpointND(F,y0,a,b,m)
if nargin < 5, m = 11; end
if nargin < 4, a = 0; b = 1; end
if nargin < 3, a = 0; b = 1; end
if nargin < 2, error('invalid number of inputs'); end
t = linspace(a,b,m)';
h = t(2)-t(1);
n = length(y0);
y = zeros(m,n);
y(1,:) = y0;
for i=2:m
Fty = feval(F,t(i-1),y(i-1,:));
th = t(i-1)+h/2;
y(i,:) = y(i-1,:) + ...
h*feval(F,th,y(i-1,:)+(h/2)*Fty );
end
Separate file:
function F = Fexample1(t,y)
F1 = y(2);
F2 = exp(-t).*sin(t)-2.*y(2)-2.*y(1);
F = [F1,F2];
Third file:
[Y,t] = ODEsolver_midpointND('Fexample1',[0 0],0,1,11);
Ye = [(1./2).*exp(-t).*(sin(t)-t.*cos(t)) (1./2).*exp(-t).*((t-1).*sin(t)- t.*cos(t))];
norm(Y-Ye,inf)
Your ODE solver looks to me like it should work - however there's a typo in the analytic solution you're comparing to. It should be
Ye = [(1./2).*exp(-t).*(sin(t)-t.*cos(t)) (1./2).*exp(-t).*((t-1).*sin(t)+ t.*cos(t))];
i.e. with a + sign before the t.*cos(t) term in the derivative.
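As a quick check (a sketch; it assumes the three files above are on the path), comparing against the corrected exact solution at a few grid sizes should show the error dropping as the grid is refined, consistent with the second-order accuracy of the midpoint method:
for m = [11 101 1001]
    [Y, t] = ODEsolver_midpointND('Fexample1', [0 0], 0, 1, m);
    Ye = [(1./2).*exp(-t).*(sin(t)-t.*cos(t)) (1./2).*exp(-t).*((t-1).*sin(t)+t.*cos(t))];
    fprintf('m = %5d   max error = %g\n', m, norm(Y - Ye, inf));
end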

MATLAB beginning having difficulty with function handles [closed]

I'm working on a project for school that basically involves iteratively solving a cubic equation. I'm using MATLAB for it, but I've never really done much with MATLAB so I'm having some trouble with the logic of it.
Here's my code:
% Redlich/Kwong EOS
sigma = 1;
epsilon = 0;
omega = 0.08664;
psi = 0.42748;
beta = @(Psat_RK) omega*PsatRK/Pc/Tr(1); % Pc is in bar, may need a unit conversion later
alpha = (Tr(1))^(-1/2);
q = psi*alpha/omega/Tr(1);
A = @(beta) (sigma + epsilon - 1)*beta - 1;
B = @(beta) (sigma*epsilon - sigma - epsilon)*(beta^2) + (q - sigma - epsilon)*beta;
C = @(beta) -(sigma*epsilon*(1+beta) + q)*(beta^2);
Q = @(A,B) ((A^2) - 3*B)/9;
R = @(A,B,C) (2*(A^3) - 9*A*B + 27*C)/54;
M = @(R,Q) R^2 - Q^3;
if M > 0
Z_single = ((-R+(M^0.5))^(1/3)) + ((-R-(M^0.5))^(1/3)) - (A/3);
I = (1/(sigma-epsilon))*log((Z_single+sigma*beta)/(Z_single+epsilon*beta));
end
if M < 0
theta = acos(R/(Q^(1/3)));
Z(1) = -2*(Q^0.5)*(cos(theta/3)) - (A/3);
Z(2) = -2*(Q^0.5)*(cos((theta + 2*pi)/3)) - (A/3);
Z(3) = -2*(Q^0.5)*(cos((theta - 2*pi)/3)) - (A/3);
Z_liquid = min(Z)
Z_vapor = max(Z)
I_liquid = (1/(sigma-epsilon))*log((Z_liquid+sigma*beta)/(Z_liquid+epsilon*beta));
I_vapor = (1/(sigma-epsilon))*log((Z_vapor+sigma*beta)/(Z_vapor+epsilon*beta));
end
ln_phi_liquid = Z_liquid - 1 - log(Z_liquid - beta) - q*I_liquid;
ln_phi_vapor = Z_vapor - 1 - log(Z_vapor - beta) - q*I_vapor;
objfun = (ln_phi_liquid - ln_phi_vapor);
Psat_RK_solved = fsolve(objfun,10);
Basically, I'm trying to iterate on the value of Psat_RK until the objfun equals 0. I can post more details of the math if needed, but I figured this would be enough to get started. Thanks.
Edit: Sorry, forgot to actually mention the problem.
Here's the error I'm getting.
Undefined operator '>' for input arguments of type 'function_handle'.
Error in Proj2 (line 73)
if M > 0
I can't figure out how to establish in this line that M is being calculated off an anonymous function.
EDIT:
sigma = 1;
epsilon = 0;
omega = 0.08664;
psi = 0.42748;
beta = @(Psat_RK) omega*PsatRK/Pc/Tr(1); % Pc is in bar, may need a unit conversion later
alpha = (Tr(1))^(-1/2);
q = psi*alpha/omega/Tr(1);
A = @(Psat_RK) (sigma + epsilon - 1)*beta(Psat_RK) - 1;
B = @(Psat_RK) (sigma*epsilon - sigma - epsilon)*(beta(Psat_RK)^2) + (q - sigma - epsilon)*beta(Psat_RK);
C = @(Psat_RK) -(sigma*epsilon*(1+beta(Psat_RK)) + q)*(beta(Psat_RK)^2);
Q = @(Psat_RK) ((A(Psat_RK)^2) - 3*B(Psat_RK))/9;
R = @(Psat_RK) (2*(A(Psat_RK)^3) - 9*A(Psat_RK)*B(Psat_RK) + 27*C(Psat_RK))/54;
M = @(Psat_RK) R(Psat_RK)^2 - Q(Psat_RK)^3;
if M(Psat_RK) > 0
Z_single = ((-R+(M^0.5))^(1/3)) + ((-R-(M^0.5))^(1/3)) - (A/3);
I = (1/(sigma-epsilon))*log((Z_single+sigma*beta)/(Z_single+epsilon*beta));
end
if M(Psat_RK) < 0
theta = acos(R/(Q^(1/3)));
Z(1) = -2*(Q^0.5)*(cos(theta/3)) - (A/3);
Z(2) = -2*(Q^0.5)*(cos((theta + 2*pi)/3)) - (A/3);
Z(3) = -2*(Q^0.5)*(cos((theta - 2*pi)/3)) - (A/3);
Z_liquid = min(Z)
Z_vapor = max(Z)
I_liquid = (1/(sigma-epsilon))*log((Z_liquid+sigma*beta)/(Z_liquid+epsilon*beta));
I_vapor = (1/(sigma-epsilon))*log((Z_vapor+sigma*beta)/(Z_vapor+epsilon*beta));
end
ln_phi_liquid = Z_liquid - 1 - log(Z_liquid - beta) - q*I_liquid;
ln_phi_vapor = Z_vapor - 1 - log(Z_vapor - beta) - q*I_vapor;
objfun = (ln_phi_liquid - ln_phi_vapor);
Psat_RK_solved = fsolve(objfun,10);
I know the code needs some work further down, but that shouldn't affect why it fails at the first if statement, right? The error is:
Undefined function or variable 'Psat_RK'.
Error in Proj2 (line 122)
if M(Psat_RK) > 0
It looks like you are trying to use anonymous functions incorrectly.
If we take a look at one of them:
Q = @(A,B) ((A^2) - 3*B)/9;
To MATLAB, this is the equivalent of this function:
function C = Q(A, B)
C = ((A^2) - 3*B) / 9;
end
Q is the name of the function and doesn't represent a value. If, however, you pass Q the two arguments that it needs (A and B), then it will yield a value.
Obviously, you would want to call this in the following way:
value = Q(a,b);
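As a small illustration (the variable h below is made up), comparing a handle itself versus its evaluated value reproduces an error like the one you quoted:
h = @(x) x^2;   % h is a function handle, not a number
h > 0           % error: Undefined operator '>' for input arguments of type 'function_handle'
h(2) > 0        % fine: h(2) is the number 4, so the comparison returns logical 1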
If you look at your own code, you try to use Q directly as if it were a value rather than a function handle.
Z(1) = -2*(Q^0.5)*(cos(theta/3)) - (A/3);
If we break down this single line a little more, we realize that A (one of the inputs to Q) is also an anonymous function. Same actually goes for B.
Then further down the rabbit hole, A and B depend upon the output of the anonymous function beta which is finally defined at the top.
beta = @(Psat_RK) omega*PsatRK/Pc/Tr(1);
So assuming we have a value for Psat_RK, this whole chain would look like this.
betaValue = beta(Psat_RK);
aValue = A(betaValue);
bValue = B(betaValue);
qValue = Q(aValue, bValue);
Now you can use qValue as a value and the statement above would become
Z(1) = -2 * (qValue ^ 0.5) * (cos(theta / 3)) - (aValue / 3);
If you wanted to simplify this, you could redefine Q to be:
Q = @(Psat_RK)(A(beta(Psat_RK))^2 - 3 * B(beta(Psat_RK))) / 9;
This applies to all anonymous functions you have defined (including M which is giving you your first error).
Summary
Anonymous functions are useful for a number of things and functional programmers love them. For your case, I would probably recommend that you just write a simple function that is a function of Psat_RK and create a single anonymous function for that and pass it to fsolve.
fsolve(@objectiveFunction, x0);
function value = objectiveFunction(Psat_RK)
% Do all your calculations here to get objfun given Psat_RK
% No anonymous functions needed here!
end
Addendum
If we wanted to convert all of your anonymous functions to be a function of Psat_RK they would look like this.
A = @(Psat_RK) (sigma + epsilon - 1) * beta(Psat_RK) - 1;
B = @(Psat_RK) (sigma * epsilon - sigma - epsilon)*(beta(Psat_RK)^2) + (q - sigma - epsilon)*beta(Psat_RK);
C = @(Psat_RK) -(sigma*epsilon*(1+beta(Psat_RK)) + q)*(beta(Psat_RK)^2);
Q = @(Psat_RK) ((A(Psat_RK)^2) - 3*B(Psat_RK))/9;
R = @(Psat_RK) (2*(A(Psat_RK)^3) - 9*A(Psat_RK)*B(Psat_RK) + 27*C(Psat_RK))/54;
M = @(Psat_RK) R(Psat_RK)^2 - Q(Psat_RK)^3;
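Note that the if tests then need a numeric argument before they can be evaluated; for example (the trial value 10 below is illustrative, borrowed from your fsolve starting point, and this still requires Pc and Tr to be defined):
Psat_trial = 10;          % illustrative trial pressure
Mval = M(Psat_trial);     % now a number, so the comparison below is valid
if Mval > 0
    % ... use A(Psat_trial), Q(Psat_trial), R(Psat_trial) here as values
end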
Example
Here is how I would write this as a separate function without all of those anonymous functions.
objectiveFunction.m
function value = objectiveFunction(psat)
% Redlich/Kwong EOS
sigma = 1;
epsilon = 0;
omega = 0.08664;
psi = 0.42748;
% Pc is in bar, may need a unit conversion later
beta = omega * psat / Pc / Tr(1); % NOT SURE WHAT Tr or Pc ARE
alpha = (Tr(1))^(-1/2);
q = psi*alpha/omega/Tr(1);
A = (sigma + epsilon - 1)*beta - 1;
B = (sigma*epsilon - sigma - epsilon)*(beta^2) + (q - sigma - epsilon)*beta;
C = -(sigma*epsilon*(1+beta) + q)*(beta^2);
Q = ((A^2) - 3*B)/9;
R = (2*(A^3) - 9*A*B + 27*C)/54;
M = R^2 - Q^3;
if M > 0
Z_single = ((-R+(M^0.5))^(1/3)) + ((-R-(M^0.5))^(1/3)) - (A/3);
I = (1/(sigma-epsilon))*log((Z_single+sigma*beta)/(Z_single+epsilon*beta));
end
if M < 0
theta = acos(R/(Q^(1/3)));
Z(1) = -2*(Q^0.5)*(cos(theta/3)) - (A/3);
Z(2) = -2*(Q^0.5)*(cos((theta + 2*pi)/3)) - (A/3);
Z(3) = -2*(Q^0.5)*(cos((theta - 2*pi)/3)) - (A/3);
Z_liquid = min(Z);
Z_vapor = max(Z);
I_liquid = (1/(sigma-epsilon))*log((Z_liquid+sigma*beta)/(Z_liquid+epsilon*beta));
I_vapor = (1/(sigma-epsilon))*log((Z_vapor+sigma*beta)/(Z_vapor+epsilon*beta));
end
ln_phi_liquid = Z_liquid - 1 - log(Z_liquid - beta) - q*I_liquid;
ln_phi_vapor = Z_vapor - 1 - log(Z_vapor - beta) - q*I_vapor;
value = (ln_phi_liquid - ln_phi_vapor);
end
Then from the MATLAB command window, you could type the following to get your solution.
Psat_RK_solved = fsolve(#objectiveFunction, 10);
This way, the only anonymous function is the one that you use to point fsolve to your actual objective function.

Error in evaluating a function

EDIT: The code that I have pasted is too long. Basically, I don't know how to work with the second piece of code. If I knew how to calculate alpha from the second piece of code, I think my problem would be solved. I have tried a lot of input arguments for the second piece of code, but it does not work!
I have written the following code to solve a convex optimization problem using the gradient descent method:
function [optimumX,optimumF,counter,gNorm,dx] = grad_descent()
x0 = [3 3]';%'//
terminationThreshold = 1e-6;
maxIterations = 100;
dxMin = 1e-6;
gNorm = inf; x = x0; counter = 0; dx = inf;
% ************************************
f = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
%alpha = 0.01;
% ************************************
figure(1); clf; ezcontour(f,[-5 5 -5 5]); axis equal; hold on
f2 = @(x) f(x(1),x(2));
% gradient descent algorithm:
while and(gNorm >= terminationThreshold, and(counter <= maxIterations, dx >= dxMin))
g = grad(x);
gNorm = norm(g);
alpha = linesearch_strongwolfe(f,-g, x0, 1);
xNew = x - alpha * g;
% check step
if ~isfinite(xNew)
display(['Number of iterations: ' num2str(counter)])
error('x is inf or NaN')
end
% **************************************
plot([x(1) xNew(1)],[x(2) xNew(2)],'ko-')
refresh
% **************************************
counter = counter + 1;
dx = norm(xNew-x);
x = xNew;
end
optimumX = x;
optimumF = f2(optimumX);
counter = counter - 1;
% define the gradient of the objective
function g = grad(x)
g = [(8*x(1) + 2*x(2) +10)
(2*x(1) + 16*x(2) + 1)];
end
end
As you can see, I have commented out the alpha = 0.01; part. I want to calculate alpha via another piece of code. Here is that code (this code is not mine):
function alphas = linesearch_strongwolfe(f,d,x0,alpham)
alpha0 = 0;
alphap = alpha0;
c1 = 1e-4;
c2 = 0.5;
alphax = alpham*rand(1);
[fx0,gx0] = feval(f,x0,d);
fxp = fx0;
gxp = gx0;
i=1;
while (1 ~= 2)
xx = x0 + alphax*d;
[fxx,gxx] = feval(f,xx,d);
if (fxx > fx0 + c1*alphax*gx0) | ((i > 1) & (fxx >= fxp)),
alphas = zoom(f,x0,d,alphap,alphax);
return;
end
if abs(gxx) <= -c2*gx0,
alphas = alphax;
return;
end
if gxx >= 0,
alphas = zoom(f,x0,d,alphax,alphap);
return;
end
alphap = alphax;
fxp = fxx;
gxp = gxx;
alphax = alphax + (alpham-alphax)*rand(1);
i = i+1;
end
function alphas = zoom(f,x0,d,alphal,alphah)
c1 = 1e-4;
c2 = 0.5;
[fx0,gx0] = feval(f,x0,d);
while (1~=2),
alphax = 1/2*(alphal+alphah);
xx = x0 + alphax*d;
[fxx,gxx] = feval(f,xx,d);
xl = x0 + alphal*d;
fxl = feval(f,xl,d);
if ((fxx > fx0 + c1*alphax*gx0) | (fxx >= fxl)),
alphah = alphax;
else
if abs(gxx) <= -c2*gx0,
alphas = alphax;
return;
end
if gxx*(alphah-alphal) >= 0,
alphah = alphal;
end
alphal = alphax;
end
end
But I get this error:
Error in linesearch_strongwolfe (line 11) [fx0,gx0] = feval(f,x0,d);
As you can see I have written the f function and its gradient manually.
linesearch_strongwolfe(f,d,x0,alpham) takes a function f, the gradient of f, a vector x0, and a constant alpham. Is there anything wrong with my declaration of f? This code works just fine if I put alpha = 0.01; back in.
As I see it:
x0 = [3; 3]; %2-element column vector
g = grad(x0); %2-element column vector
f = @(x1,x2) 4.*x1.^2 + 2.*x1.*x2 +8.*x2.^2 + 10.*x1 + x2;
linesearch_strongwolfe(f,-g, x0, 1); %passing variables
inside the function:
[fx0,gx0] = feval(f,x0,-g); %variable names substituted with input vars
This will in effect call
[fx0,gx0] = f(x0,-g);
but f(x0,-g) is a single 2-element column vector with these inputs. Assigning the output to two variables will not work.
You either have to define f as a proper named function (just like grad) that outputs 2 variables (one for each component), or edit the code of linesearch_strongwolfe so that it requests a single output and then slice that into 2 separate variables yourself afterwards.
If you experience a very rare kind of laziness and don't want to define a named function, you can still use an anonymous function at the cost of duplicating code for the two components (at least I couldn't come up with a cleaner solution):
f = @(x1,x2) deal(4.*x1(1)^2 + 2.*x1(1)*x2(1) +8.*x2(1)^2 + 10.*x1(1) + x2(1),...
4.*x1(2)^2 + 2.*x1(2)*x2(2) +8.*x2(2)^2 + 10.*x1(2) + x2(2));
[fx0,gx0] = f(x0,-g); %now works fine
as long as you always have 2 output variables. Note that this is more like a proof of concept, since this is ugly, inefficient, and very susceptible to typos.
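One reading of the line-search code is that the second output of f is meant to be the directional derivative along d (it is used as gx0 and gxx in the Wolfe tests). Under that assumption, a sketch of a named function (fwolfe is an illustrative name; the formulas come from your f and grad) would be:
function [fx, gx] = fwolfe(x, d)
    % objective value at x
    fx = 4*x(1)^2 + 2*x(1)*x(2) + 8*x(2)^2 + 10*x(1) + x(2);
    % gradient at x, then the directional derivative along d
    g  = [8*x(1) + 2*x(2) + 10; 2*x(1) + 16*x(2) + 1];
    gx = g' * d;
end
This would then be passed as alpha = linesearch_strongwolfe(@fwolfe, -g, x, 1); inside the descent loop.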

Regularized logistic regression code in matlab

I'm trying my hand at regularized logistic regression (LR) in Matlab, working from these formulas:
The cost function:
J(theta) = (1/m)*sum(-y_i*log(h(x_i)) - (1-y_i)*log(1-h(x_i))) + (lambda/(2*m))*sum(theta_j^2)
The gradient:
∂J(theta)/∂theta_j = (1/m)*sum((h(x_i)-y_i)*x_ij)                          if j = 0
∂J(theta)/∂theta_j = (1/m)*sum((h(x_i)-y_i)*x_ij) + (lambda/m)*theta_j     if j >= 1
This is not Matlab code, just the formulas.
So far I've done this:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
J = 0;
grad = zeros(size(theta));
temp_theta = [];
%cost function
%get the regularization term
for jj = 2:length(theta)
temp_theta(jj) = theta(jj)^2;
end
theta_reg = lambda/(2*m)*sum(temp_theta);
temp_sum =[];
%for the sum in the cost function
for ii =1:m
temp_sum(ii) = -y(ii)*log(sigmoid(theta'*X(ii,:)'))-(1-y(ii))*log(1-sigmoid(theta'*X(ii,:)'));
end
tempo = sum(temp_sum);
J = (1/m)*tempo+theta_reg;
%regulatization
%theta 0
reg_theta0 = 0;
for jj=1:m
reg_theta0(jj) = (sigmoid(theta'*X(m,:)') -y(jj))*X(jj,1)
end
reg_theta0 = (1/m)*sum(reg_theta0)
grad_temp(1) = reg_theta0
%for the rest of thetas
reg_theta = [];
thetas_sum = 0;
for ii=2:size(theta)
for kk =1:m
reg_theta(kk) = (sigmoid(theta'*X(m,:)') - y(kk))*X(kk,ii)
end
thetas_sum(ii) = (1/m)*sum(reg_theta)+(lambda/m)*theta(ii)
reg_theta = []
end
for i=1:size(theta)
if i == 1
grad(i) = grad_temp(i)
else
grad(i) = thetas_sum(i)
end
end
end
The cost function is giving correct results, but I have no idea why the gradient (one step) is not. The cost gives J = 0.6931, which is correct, but the gradient gives grad = 0.3603 -0.1476 0.0320, which is not. (The regularization sum starts from 2 because the parameter theta(1) does not have to be regularized.) Any help? I guess there is something wrong with the code, but after 4 days I can't see it. Thanks.
Vectorized:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
hx = sigmoid(X * theta);
m = size(X, 1); % number of training examples
J = (sum(-y' * log(hx) - (1 - y')*log(1 - hx)) / m) + lambda * sum(theta(2:end).^2) / (2*m);
grad =((hx - y)' * X / m)' + lambda .* theta .* [0; ones(length(theta)-1, 1)] ./ m ;
end
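Both this answer and the ones below assume a sigmoid helper is already on the path; if you don't have one, a minimal version would be:
function g = sigmoid(z)
% element-wise logistic function
g = 1 ./ (1 + exp(-z));
end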
I used more variables, so you can see clearly what comes from the regular formula and what the regularization adds. Additionally, it is good practice to use vectorization instead of loops in Matlab/Octave; by doing this, you get a better optimized solution.
function [J, grad] = costFunctionReg(theta, X, y, lambda)
m = length(y); % number of training examples (not defined anywhere else in this snippet)
%Hypotheses
hx = sigmoid(X * theta);
%%The cost without regularization
J_partial = (-y' * log(hx) - (1 - y)' * log(1 - hx)) ./ m;
%%Regularization Cost Added
J_regularization = (lambda/(2*m)) * sum(theta(2:end).^2);
%%Cost when we add regularization
J = J_partial + J_regularization;
%Grad without regularization
grad_partial = (1/m) * (X' * (hx -y));
%%Grad Cost Added
grad_regularization = (lambda/m) .* theta(2:end);
grad_regularization = [0; grad_regularization];
grad = grad_partial + grad_regularization;
end
Finally got it. After rewriting it for about the 4th time, this is the correct code:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
J = 0;
grad = zeros(size(theta));
temp_theta = [];
for jj = 2:length(theta)
temp_theta(jj) = theta(jj)^2;
end
theta_reg = lambda/(2*m)*sum(temp_theta);
temp_sum =[];
for ii =1:m
temp_sum(ii) = -y(ii)*log(sigmoid(theta'*X(ii,:)'))-(1-y(ii))*log(1-sigmoid(theta'*X(ii,:)'));
end
tempo = sum(temp_sum);
J = (1/m)*tempo+theta_reg;
%regulatization
%theta 0
reg_theta0 = 0;
for i=1:m
reg_theta0(i) = ((sigmoid(theta'*X(i,:)'))-y(i))*X(i,1)
end
theta_temp(1) = (1/m)*sum(reg_theta0)
grad(1) = theta_temp
sum_thetas = []
thetas_sum = []
for j = 2:size(theta)
for i = 1:m
sum_thetas(i) = ((sigmoid(theta'*X(i,:)'))-y(i))*X(i,j)
end
thetas_sum(j) = (1/m)*sum(sum_thetas)+((lambda/m)*theta(j))
sum_thetas = []
end
for z=2:size(theta)
grad(z) = thetas_sum(z)
end
% =============================================================
end
I hope it helps someone; if anyone has comments on how I can do it better, please share. :)
Here is an answer that eliminates the loops
m = length(y); % number of training examples
predictions = sigmoid(X*theta);
reg_term = (lambda/(2*m)) * sum(theta(2:end).^2);
calcErrors = -y.*log(predictions) - (1 -y).*log(1-predictions);
J = (1/m)*sum(calcErrors)+reg_term;
% prepend a 0 column to our reg_term matrix so we can use simple matrix addition
reg_term = [0 (lambda*theta(2:end)/m)'];
grad = sum(X.*(predictions - y)) / m + reg_term;
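For completeness, a typical way to use any of these costFunctionReg implementations with a gradient-based solver (a sketch; X and y are assumed to come from your own data loading, and lambda = 1 is just an example value):
initial_theta = zeros(size(X, 2), 1);
lambda = 1;
options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, cost] = fminunc(@(t) costFunctionReg(t, X, y, lambda), initial_theta, options);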