DerivativeCheck fails in fsolve function (Jacobian) - matlab

I am trying to find a solution for a non-linear system in MATLAB using the fsolve function. I have an idea of the region the solution is, so my initial points are a random variation inside this area. The code follows:
%MATLAB parameters
digits = 32;
format shortEng
%Initialization of solving process
min_norm = realmax;
options = optimoptions('fsolve', 'Algorithm', 'trust-region-dogleg',...
'Display', 'off', 'FunValCheck', 'on', 'Jacobian', 'on',...
'DerivativeCheck', 'on');
%Solving loop
while 1
%Refreshes seed after 100000 iterations
rng('shuffle');
for n=1:100000
%Initial points are randomly placed in solving region
x_0 = zeros(1, 3);
x_0(1) = 20*(10^-3)+ abs(3*(10^-3)*randn);
x_0(2) = abs(10^(-90) + 10^-89*randn);
x_0(3) = abs(10*randn);
%Solving
[x, fval] = fsolve(#fbnd, x_0, options);
norm_fval = norm(fval);
%If norm is minimum, result is displayed
if all(x > 0.0) && (norm_fval < min_norm)
iter_solu = x;
display(norm_fval);
display(iter_solu);
min_norm = norm_fval;
end
end
end
The function to be optimized and its jacobian follow:
function [F, J] = fbnd(x)
F(1) = x(1) - x(2)*(exp(7.56/(1.5*0.025))-1);
F(2) = x(1) - x(2)*(exp((x(3)*0.02)/(1.5*0.025))-1)-0.02;
F(3) = x(1) - x(2)*(exp((6.06+x(3)*0.018)/(1.5*0.025))-1)-0.018;
J = [[1.0, -3.5790482371355382991651020082214*10^87, 0.0];...
[1.0, 1.0-exp((8/15)*x(3)),-(8/15)*x(2)*exp((8/15)*x(3))];...
[1.0, 1.0-exp(0.48*x(3)+161.6), -0.48*x(2)*exp(0.48*x(3)+161.6)]];
end
When I turn DerivativeCheck on, I get the following error:
Objective function derivatives:
Maximum relative difference between user-supplied
and finite-difference derivatives = 1.
User-supplied derivative element (1,1): 1
Finite-difference derivative element (1,1): 0
For some reason, it things dF(1)/dx(1) should be zero instead of 1, when it is obvious it is 1. So I went ahead and replaced the J(1,1) by 0, and now I get the same problem with J(3,1):
Objective function derivatives:
Maximum relative difference between user-supplied
and finite-difference derivatives = 1.
User-supplied derivative element (3,1): 1
Finite-difference derivative element (3,1): 0
So, I replaced J(3,1) by 0 and it catches only minor precision errors. Is there an explanation why it thinks J(1,1) and J(3,1) should be zero? It seems really strange to me.

Related

Why " The boundary condition function BCFUN should return a column vector of length 5" error message is came?

function Y6
solinit = bvpinit(linspace(0,1),[0;3;1;1],2);
sol = bvp4c(#ode, #bc, solinit);
y = sol.y;
time = sol.parameters*sol.x;
ut = -y(4,:);
figure(1);
plot(time,y([1 2],:)','-'); hold on;
plot(time, ut, 'k:');
axis([0 time(1,end) -1.5 3]);
text(1.3,2.5,'x_1(t)');
text(1.3,.9,'x_2(t)');
text(1.3,-.5,'u(t)');
xlabel('time');
ylabel('states');
title('Numerical solution');
hold off;
% -------------------------------------------------------------------------
% ODE's of augmented states
function dydt = ode(t,y,T)
dydt = T*[2*y(2);4*y(4);0;-2*y(3)];
% -------------------------------------------------------------------------
% boundary conditions: x1(0)=11;p2(0)=2; x2(tf)=3; 3*p1(tf)+p2(2)^2=0
function res = bc(ya,yb,T)
res = [ ya(1) - 11; ya(4); yb(2) - 3; 3*yb(3)+yb(4)^2];
I don't know why The boundary condition function BCFUN should return a column vector of length 5 error message is came.
Can you explain to me please? Thank you so much
The unknown parameters or constants, here T, also count as components. Thus your state has overall 4+1=5 components, requiring the setting of 5 boundary conditions.
Sometimes that is obvious, for instance in a Sturm-Liouville eigenvalue computation with the eigenvalue as parameter you want to avoid the zero solution, so demand that, e.g., the square integral has value 1.
In a ballistic shot with the flight time as parameter where you fix the location of the canon and the target as obvious boundary conditions it is not that obvious what to select as additional BC, but possibilities are to fix the initial speed or the initial angle.
I might be rusty, but I get from the variation of
L = 3*(x1(T)-9)^2 + integral(0,T, 2*u^2 +p^T*(F(x,u)-x') )
the saddle point conditions
T : 0 = 12*x2(T)*(x1(T)-9) + 2*u(T)^2 (BC5)
x(t) : 0 = p'(t)+F_x(x,u)^T*p
p1'(t) = -2*p2(t) (DE3)
p2'(t) = 0 (DE4)
p(t) : x'(t) = F(x,u)
x1'(t) = 2*x2(t) (DE1)
x2'(t) = 4*u(t) (DE2)
u(t) : 0 = 4*u + p^T*F_u(x,u)
u(t) = -p2(t) (OPT)
x1(T): -p1(T)+6*(x1(T)-9) (BC3)
x2(0): 0 = p2(0) (BC4)
This can be directly evaluated as p2=-u=0 (BC4+DE4+OPT) are constant, likewise p1 = 6*(x1(T)-9) (BC3+DE3). This in turn forces x1(T)=9 (BC5) As now x2(t)=3 is constant, we get x1(t)=11+6*t. Then the remaining expression of the functional 3*(2+6*T)^2 is minimal for T=-1/3. For T=0 the minimal value of the functional is L=12.

Laguerre's method to obtain poly roots (Matlab)

I must write using Laguerre's method a piece of code to find the real and complex roots of poly:
P=X^5-5*X^4-6*X^3+6*X^2-3*X+1
I have little doubt. I did the algorithm in the matlab, but 3 out of 5 roots are the same and I don't think that is correct.
syms X %Declearing x as a variabl
P=X^5-5*X^4-6*X^3+6*X^2-3*X+1; %Equation we interest to solve
n=5; % The eq. order
Pd1 = diff(P,X,1); % first differitial of f
Pd2 = diff(P,X,2); %second differitial of f
err=0.00001; %Answear tollerance
N=100; %Max. # of Iterations
x(1)=1e-3; % Initial Value
for k=1:N
G=double(vpa(subs(Pd1,X,x(k))/subs(P,X,x(k))));
H=G^2 - double(subs(Pd2,X,x(k))) /subs(P,X,x(k));
D1= (G+sqrt((n-1)*(n*H-G^2)));
D2= (G-sqrt((n-1)*(n*H-G^2)));
D = max(D1,D2);
a=n/D;
x(k+1)=x(k)-a
Err(k) = abs(x(k+1)-x(k));
if Err(k) <=err
break
end
end
output (roots of polynomial):
x =
0.0010 + 0.0000i 0.1434 + 0.4661i 0.1474 + 0.4345i 0.1474 + 0.4345i 0.1474 + 0.4345i
What you actually see are all the values x(k) which arose in the loop. The last one, 0.1474 + 0.4345i is the end result of this loop - the approximation of the root which is in your given tolerance threshold. The code
syms X %Declaring x as a variable
P = X^5 - 5 * X^4 - 6 * X^3 + 6 * X^2 - 3 * X + 1; %Polynomial
n=5; %Degree of the polynomial
Pd1 = diff(P,X,1); %First derivative of P
Pd2 = diff(P,X,2); %Second derivative of P
err = 0.00001; %Answer tolerance
N = 100; %Maximal number of iterations
x(1) = 0; %Initial value
for k = 1:N
G = double(vpa(subs(Pd1,X,x(k)) / subs(P,X,x(k))));
H = G^2 - double(subs(Pd2,X,x(k))) / subs(P,X,x(k));
D1 = (G + sqrt((n-1) * (n * H-G^2)));
D2 = (G - sqrt((n-1) * (n * H-G^2)));
D = max(D1,D2);
a = n/D;
x(k+1) = x(k) - a;
Err(k) = abs(x(k+1)-x(k));
if Err(k) <=err
fprintf('Initial value %f, result %f%+fi', x(1), real(x(k)), imag(x(k)))
break
end
end
results in
Initial value -2.000000, result -1.649100+0.000000i
If you want to get other roots, you have to use other initial values. For example one can obtain
Initial value 10.000000, result 5.862900+0.000000i
Initial value -2.000000, result -1.649100+0.000000i
Initial value 3.000000, result 0.491300+0.000000i
Initial value 0.000000, result 0.147400+0.434500i
Initial value 1.000000, result 0.147400-0.434500i
These are all zeros of the polynomial.
A method for calculating the next root when you have found another one would be that you divide through the corresponding linear factor and use your loop for the resulting new polynomial. Note that this is in general not very easy to handle since rounding errors can have a big influence on the result.
Problems with the existing code
You do not implement the Laguerre method properly as a method in complex numbers. The denominator candidates D1,D2 are in general complex numbers, it is inadvisable to use the simple max which only has sensible results for real inputs. The aim is to have a=n/D be the smaller of both variants, so that one has to look for the D in [D1,D2] with the larger absolute value. If there were a conditional assignment as in C, this would look like
D = (abs(D_1)>abs(D2)) ? D1 : D2;
As that does not exist, one has to use commands with a similar result
D = D1; if (abs(D_1)<abs(D2)) D=D2; end
The resulting sequence of approximation points is
x(0) = 0.0010000
x(1) = 0.143349512707684+0.466072958423667i
x(2) = 0.164462212064089+0.461399841949893i
x(3) = 0.164466373475316+0.461405404094130i
There is a point where one can not expect the (residual) polynomial value at the root approximation to substantially decrease. The value close to zero is obtained by adding and subtracting rather large terms in the sum expression of the polynomial. The accuracy lost in these catastrophic cancellation events can not be recovered.
The threshold for polynomial values that are effectively zero can be estimated as the machine constant of the double type times the polynomial value where all coefficients and the evaluation point are replaced by their absolute values. This test serves in the code primarily to avoid divisions by zero or near-zero.
Finding all roots
One approach is to apply the method to a sufficiently large number of initial points along some circle containing all the roots, with some strict rules for early termination at too slow convergence. One would have to make the list of the roots found unique, but keep the multiplicity,...
The other standard method is to apply deflation, that is, divide out the linear factor of the root found. This works well in low degrees.
There is no need for the slower symbolic operations as there are functions that work directly on the coefficient array, such as polyval and polyder. Deflation by division with remainder can be achieved using the deconv function.
For real polynomials, we know that the complex conjugate of a root is also a root. Thus initialize the next iteration with the deflated polynomial with it.
Other points:
There is no point in the double conversions as at no point there is a conversion into the single type.
If you don't do anything with it, it makes no sense to create an array, especially not for Err.
Roots of the example
Implementing all this I get a log of
x(0) = 0.001000000000000+0.000000000000000i, |Pn(x(0))| = 0.99701
x(1) = 0.143349512707684+0.466072958423667i, |dx|= 0.48733
x(2) = 0.164462212064089+0.461399841949893i, |dx|=0.021624
x(3) = 0.164466373475316+0.461405404094130i, |dx|=6.9466e-06
root found x=0.164466373475316+0.461405404094130i with value P0(x)=-2.22045e-16+9.4369e-16i
Deflation
x(0) = 0.164466373475316-0.461405404094130i, |Pn(x(0))| = 2.1211e-15
root found x=0.164466373475316-0.461405404094130i with value P0(x)=-2.22045e-16-9.4369e-16i
Deflation
x(0) = 0.164466373475316+0.461405404094130i, |Pn(x(0))| = 4.7452
x(1) = 0.586360702193454+0.016571894375927i, |dx|= 0.61308
x(2) = 0.562204173408499+0.000003168181059i, |dx|=0.029293
x(3) = 0.562204925474889+0.000000000000000i, |dx|=3.2562e-06
root found x=0.562204925474889+0.000000000000000i with value P0(x)=2.22045e-16-1.33554e-17i
Deflation
x(0) = 0.562204925474889-0.000000000000000i, |Pn(x(0))| = 7.7204
x(1) = 3.332994579372812-0.000000000000000i, |dx|= 2.7708
root found x=3.332994579372812-0.000000000000000i with value P0(x)=6.39488e-14-3.52284e-15i
Deflation
x(0) = 3.332994579372812+0.000000000000000i, |Pn(x(0))| = 5.5571
x(1) = -2.224132251798332+0.000000000000000i, |dx|= 5.5571
root found x=-2.224132251798332+0.000000000000000i with value P0(x)=-3.33067e-14+1.6178e-15i
for the modified code
P = [1, -2, -6, 6, -3, 1];
P0 = P;
deg=length(P)-1; % The eq. degree
err=1e-05; %Answer tolerance
N=10; %Max. # of Iterations
x=1e-3; % Initial Value
for n=deg:-1:1
dP = polyder(P); % first derivative of P
d2P = polyder(dP); %second derivative of P
fprintf("x(0) = %.15f%+.15fi, |Pn(x(0))| = %8.5g\n", real(x),imag(x), abs(polyval(P,x)));
for k=1:N
Px = polyval(P,x);
dPx = polyval(dP,x);
d2Px = polyval(d2P,x);
if abs(Px) < 1e-14*polyval(abs(P),abs(x))
break % if value is zero in relative accuracy
end
G = dPx/Px;
H=G^2 - d2Px / Px;
D1= (G+sqrt((n-1)*(n*H-G^2)));
D2= (G-sqrt((n-1)*(n*H-G^2)));
D = D1;
if abs(D2)>abs(D1) D=D2; end % select the larger denominator
a=n/D;
x=x-a;
fprintf("x(%d) = %.15f%+.15fi, |dx|=%8.5g\n",k,real(x),imag(x), abs(a));
if abs(a) < err*(err+abs(x))
break
end
end
y = polyval(P0,x); % check polynomial value of the original polynomial
fprintf("root found x=%.15f%+.15fi with value P0(x)=%.6g%+.6gi\n", real(x),imag(x),real(y),imag(y));
disp("Deflation");
[ P,R ] = deconv(P,[1,-x]); % division with remainder
x = conj(x); % shortcut for conjugate pairs and clustered roots
end

Why am I getting the wrong sign on my cos(x) approximation?

I am writing a Matlab script that will approximate sin(x) and cos(x) using their Maclaurin polynomials.
When I input
arg = (5*pi)/4 I expect to get the correct approximations for
sin((5*pi)/4) = -0.7071067811865474617
cos((5*pi)/4) = -0.7071067811865476838.
Instead I get the following when running the script:
Approximation of sin(3.92699) >> -0.7071067811865474617
Actual sin(3.92699) = -0.7071067811865474617
Error approximately = 0.0000000000000000000 (0)
----------------------------------------------------------
Approximation of cos(3.92699) >> 0.7071067811865474617
Actual cos(3.92699) = -0.7071067811865476838
Error approximately = 0.0000000000000001110 (1.1102e-16)
I am getting the correct answers for sin but incorrect for cosine when the argument (angle) is in quadrant 3 or 4. The problem is that I am getting the wrong sign on the cos(arg) value. Where have I messed up?
CalculatorForSineCosine.m
% Argument for sine/cosine in radians.
arg = (5*pi)/4;
% Move the argument x so it's within [0, pi/2].
newArg = moveArgumentV2(arg);
% Calculate what degree we need for our Taylorpolynomial.
TOL = 0; % If 0, assume we want Machine Epsilon.
r = findDegreeV2(TOL);
% Plot nth degree Taylorpolynomial around x = 0 for sine.
% and calculate approximation of sin(x).
[approximatedSin, errorSin] = sin_taylorV2(r, newArg);
eS = num2str(errorSin); % errorSin in string format
% Plot nth degree Taylorpolynomial around x = 0 for cosine.
% and calculate approximation of cos(x).
[approximatedCos, errorCos] = cos_taylorV2(r, newArg);
eC = num2str(errorCos); % errorCos in string format
% Print out the result.
fprintf('\nApproximation of sin(%.5f)\t >> %.19f\n', arg, approximatedSin);
fprintf('Actual sin(%.5f)\t\t\t\t = %.19f\n', arg, sin(arg));
fprintf('Error approximately\t\t\t\t = %.19f (%s)\n', errorSin, eS);
disp("----------------------------------------------------------")
fprintf('Approximation of cos(%.5f)\t >> %.19f\n', arg, approximatedCos);
fprintf('Actual cos(%.5f)\t\t\t\t = %.19f\n', arg, cos(arg));
fprintf('Error approximately\t\t\t\t = %.19f (%s)\n\n', errorCos, eC);
sin_taylorV2.m
function [approximatedSin, errorSin] = sin_taylorV2(r, x)
%% sss
% Q_2n+1(x) where 2n+1 = degree of polynomial.
n = (r - 1)/2;
% Approximate sin(x) using its Taylorpolynomial.
approximatedSin = 0;
for k = 0:n
approximatedSin = approximatedSin + (((-1).^k) .* (x.^(2.*k+1)))./(factorial(2.*k+1));
end
% Calculate the error.
errorSin = abs(sin(x) - approximatedSin);
end
cos_taylorV2.m
function [approximatedCos, errorCos] = cos_taylorV2(r, x)
%% sss
% Q_2n+1(x) where 2n+1 = degree of polynomial and n = # terms.
n = (r - 1)/2;
% Approximate cos(x) using its Taylorpolynomial.
approximatedCos = 0;
for k = 0:n
approximatedCos = approximatedCos + (((-1).^k) .* (x.^(2.*k)))./(factorial(2.*k));
end
% Calculate the error.
errorCos = abs(cos(x) - approximatedCos);
end
moveArgumentV2.m
function newArg = moveArgumentV2(arg)
%% Moves the argument x to the interval [0, pi/2].
% Make use of sines periodocity and choose n as ceil( (x-pi)/2pi) )
n = ceil((arg-pi)/(2*pi));
x1 = arg - 2*pi*n; % New angle will be in [-pi, pi]
x2 = abs(x1); % Angle will be in [0, pi]
if (x2 < pi/2) && (x2 > 0)
x3 = x2;
else
x3 = pi - x2;
end
newArg = x3*sign(x1); % Angle will be in [0, pi/2]
end
I would like to notice two things in your code.
First, you don't need the moveArgumentV2(arg) function, as, if you remember, the radius of convergence for the Maclaurin/Taylor series of the sin(x)/cos(x) is the set of all real numbers. That means the series should converge for any real x, disregarding the round-off errors inherently to every arithmetic operations done in a computer.
As a matter of fact, following your code, we can write a function that approximates the cos as:
function y = mycos(x,n)
y = 0;
for k=0:n
term = (-1)^k*x.^(2*k)/factorial(2*k);
y = y + term;
end
end
Notice this function works for values outside the range [-pi,pi]:
x = -10*pi:0.1:10*pi;
ye = cos(x) % exact value
ya = mycos(x,100) % approximated value
plot(x,ye,x,ya,'o')
The values returned by the mycos function are close to the exact value given by the cos built-in function. This happens because I calculated the approximation with the first 100 terms. The error, however, for higher values of x, is extremely large if we use just a few terms.
ya = mycos(x,10) % approximated value with 10 terms only
plot(x,ye-ya); title('error')
The problem now is that we can't just increase the number of terms without running in another problem.
If we increase the number of points, the mycos function crumbles due to round-off errors, because of the factorial function that overflows. A good idea is to try to change your code in order to avoid the use of the factorial function. Notice the recurrence between sucessive terms in the Maclaurin expansion of the cos function, and you can create another function without the use of the factorial:
function y = mycos2(x,n)
term = 1;
y = 1;
for k=1:n
term = -term.*x.^2/(2*k-1)/(2*k);
y = y + term;
end
end
Here, we calculate each term in the series expansion from the previous calculated term. We avoid the calculation of the factorial and make use of what we already have. This speeds the code and avoids overflow. As a matter of fact, if we now calculate the cos approximation with 500 terms, we get:
x = -10*pi:0.5:10*pi;
ye = cos(x); % exact value
ya = mycos(x,500); % approximated value
ya2 = mycos2(x,500); % approximated value
plot(x,ye,x,ya,'x',x,ya2,'s')
legend('ye','ya','ya2')
Notice in this figure the x marks are the calculations done with the mycos function, while the o marks are done without using the factorial function. The first function crumbles for values outside the range [-2,2], but the second one runs just fine. It works even when I use 1e5 terms. Increasing the number of terms reduces the errors, so you can estimate how much terms you will use on an approximation, given a desired tolerance. If this number is greater than 170, the first function will not work properly.
factorial(170) returns 7.2574e+306, but factorial(171) returns Inf, so any value that should be calculated with more than 170 terms will have problems in the first function. Avoid the calculation of factorial at all costs.
This is what I tried:
x = -3*pi:0.01:3*pi;
y = x;
for ii=1:numel(y)
y(ii) = moveArgumentV2(y(ii)); % not vectorized
end
plot(sin(x))
hold on
plot(sin(y))
Both sin(x) and sin(y) produce the same plot. But:
plot(cos(x))
hold on
plot(cos(y))
Now we see that cos(x) and cos(y) are not the same! This is because moveArgumentV2 changes the angle to be in the first and fourth quadrant (in the range [-pi/2, pi/2]), which is what you need for the sin function, but is not adequate for the cos function.
I would modify sin_taylorV2 and cos_taylorV2 to call moveArgumentV2, so you don't rely on the caller to know what the valid input range is. In cos_taylorV2 you would need to call it this way:
x = moveArgumentV2(x+pi/2) - pi/2;
and in sin_taylorV2 you'd call it the same way you do now.
Or, better, write cos_taylorV2 in terms of sin_taylorV2, which we know to be correct. This avoids code duplication.

Numerical Integral in MatLab using integral command

I am trying to compute the value of this integral using Matlab
Here the other parameters have been defined or computed in the earlier part of the program as follows
N = 2;
sigma = [0.01 0.1];
l = [15];
meu = 4*pi*10^(-7);
f = logspace ( 1, 6, 500);
w=2*pi.*f;
for j = 1 : length(f)
q2(j)= sqrt(sqrt(-1)*2*pi*f(j)*meu*sigma(2));
q1(j)= sqrt(sqrt(-1)*2*pi*f(j)*meu*sigma(1));
C2(j)= 1/(q2(j));
C1(j)= (q1(j)*C2(j) + tanh(q1(j)*l))/(q1(j)*(1+q1(j)*C2(j)*tanh(q1(j)*l)));
Z(j) = sqrt(-1)*2*pi*f(j)*C1(j);
Apprho(j) = meu*(1/(2*pi*f(j))*(abs(Z(j))^2));
Phi(j) = atan(imag(Z(j))/real(Z(j)));
end
%integration part
c1=w./(2*pi);
rho0=1;
fun = #(x) log(Apprho(x)/rho0)/(x.^2-w^2);
c2= integral(fun,0,Inf);
phin=pi/4-c1.*c2;
I am getting an error like this
could anyone help and tell me where i am going wrong.thanks in advance
Define Apprho in a separate *.m function file, instead of storing it in an array:
function [ result ] = Apprho(x)
%
% Calculate f and Z based on input argument x
%
% ...
%
meu = 4*pi*10^(-7);
result = meu*(1/(2*pi*f)*(abs(Z)^2));
end
How you calculate f and Z is up to you.
MATLAB's integral works by calling the function (in this case, Apprho) repeatedly at many different x values. The x values called by integral don't necessarily correspond to the 1: length(f) values used in your original code, which is why you received errors.

Calculate the derivative of the sum of a mathematical function-MATLAB

In Matlab I want to create the partial derivative of a cost function called J(theta_0, theta_1) (in order to do the calculations necessary to do gradient descent).
The function J(theta_0, theta_1) is defined as:
Lets say h_theta(x) = theta_1 + theta_2*x. Also: alpha is fixed, the starting values of theta_1 and theta_2 are given. Let's say in this example: alpha = 0.1 theta_1 = 0, theta_2 = 1. Also I have all the values for x and y in two different vectors.
VectorOfX =
5
5
6
VectorOfX =
6
6
10
Steps I took to try to solve this in Matlab: I have no clue how to solve this problem in matlab. So I started off with trying to define a function in Matlab and tried this:
theta_1 = 0
theta_2 = 1
syms x;
h_theta(x) = theta_1 + t2*x;
This worked, but is not what I really wanted. I wanted to get x^(i), which is in a vector. The next thing I tried was:
theta_1 = 0
theta_2 = 1
syms x;
h_theta(x) = theta_1 + t2*vectorOfX(1);
This gives the following error:
Error using sym/subsindex (line 672)
Invalid indexing or function definition. When defining a
function, ensure that the body of the function is a SYM
object. When indexing, the input must be numeric, logical or
':'.
Error in prog1>gradientDescent (line 46)
h_theta(x) = theta_1 + theta_2*vectorOfX(x);
I looked up this error and don't know how to solve it for this particular example. I have the feeling that I make matlab work against me instead of using it in my favor.
When I have to perform symbolic computations I prefer to use Mathematica. In that environment this is the code to get the partial derivatives you are looking for.
J[th1_, th2_, m_] := Sum[(th1 + th2*Subscript[x, i] - Subscript[y, i])^2, {i, 1, m}]/(2*m)
D[J[th1, th2, m], th1]
D[J[th1, th2, m], th2]
and gives
Coming back to MATLAB we can solve this problem with the following code
%// Constants.
alpha = 0.1;
theta_1 = 0;
theta_2 = 1;
X = [5 ; 5 ; 6];
Y = [6 ; 6 ; 10];
%// Number of points.
m = length(X);
%// Partial derivatives.
Dtheta1 = #(theta_1, theta_2) sum(2*(theta_1+theta_2*X-Y))/2/m;
Dtheta2 = #(theta_1, theta_2) sum(2*X.*(theta_1+theta_2*X-Y))/2/m;
%// Loop initialization.
toll = 1e-5;
maxIter = 100;
it = 0;
err = 1;
theta_1_Last = theta_1;
theta_2_Last = theta_2;
%// Iterations.
while err>toll && it<maxIter
theta_1 = theta_1 - alpha*Dtheta1(theta_1, theta_2);
theta_2 = theta_2 - alpha*Dtheta2(theta_1, theta_2);
it = it + 1;
err = norm([theta_1-theta_1_Last ; theta_2-theta_2_Last]);
theta_1_Last = theta_1;
theta_2_Last = theta_2;
end
Unfortunately for this case the iterations does not converge.
MATLAB is not very flexible for symbolic computations, however a way to get those partial derivatives is the following
m = 10;
syms th1 th2
x = sym('x', [m 1]);
y = sym('y', [m 1]);
J = #(th1, th2) sum((th1+th2.*x-y).^2)/2/m;
diff(J, th1)
diff(J, th2)