Trouble computing cost in logistic regression - matlab
I am taking Andrew Ng's Machine Learning course on Coursera. In this assignment, I am trying to compute the cost function for logistic regression in MATLAB, but I am receiving the following error:
Error using sfminbx (line 27)
Objective function is undefined at initial point. fminunc cannot continue.
I should add that the cost J within the costFunction function below is NaN because the log(sigmoid(X * theta)) is a -Inf vector. I'm sure this is related to the exception. Can you please help?
My cost function looks like the following:
function [J, grad] = costFunction(theta, X, y)
m = length(y); % number of training examples
J = 0;
grad = zeros(size(theta));
h = sigmoid(theta * X);
J = - (1 / m) * ((log(h)' * y) + (log(1 - h)' * (1 - y)));
grad = (1 / m) * X' * (h - y);
end
My code that calls this function looks like the following:
data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);
[m, n] = size(X);
% Add intercept term to x and X_test
X = [ones(m, 1) X];
% Initialize fitting parameters
initial_theta = zeros(n + 1, 1);
% Compute and display initial cost and gradient
[cost, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta (zeros): %f\n', cost);
fprintf('Expected cost (approx): 0.693\n');
fprintf('Gradient at initial theta (zeros): \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n');
% Compute and display cost and gradient with non-zero theta
test_theta = [-24; 0.2; 0.2];
[cost, grad] = costFunction(test_theta, X, y);
fprintf('\nCost at test theta: %f\n', cost);
fprintf('Expected cost (approx): 0.218\n');
fprintf('Gradient at test theta: \n');
fprintf(' %f \n', grad);
fprintf('Expected gradients (approx):\n 0.043\n 2.566\n 2.647\n');
fprintf('\nProgram paused. Press enter to continue.\n');
pause;
%% ============= Part 3: Optimizing using fminunc =============
% In this exercise, you will use a built-in function (fminunc) to find the
% optimal parameters theta.
% Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400, 'Algorithm', 'trust-region');
% Run fminunc to obtain the optimal theta
% This function will return theta and the cost
[theta, cost] = ...
fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
end
The dataset looks like the following:
34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
60.18259938620976,86.30855209546826,1
79.0327360507101,75.3443764369103,1
45.08327747668339,56.3163717815305,0
61.10666453684766,96.51142588489624,1
75.02474556738889,46.55401354116538,1
76.09878670226257,87.42056971926803,1
84.43281996120035,43.53339331072109,1
95.86155507093572,38.22527805795094,0
75.01365838958247,30.60326323428011,0
82.30705337399482,76.48196330235604,1
69.36458875970939,97.71869196188608,1
39.53833914367223,76.03681085115882,0
53.9710521485623,89.20735013750205,1
69.07014406283025,52.74046973016765,1
67.94685547711617,46.67857410673128,0
70.66150955499435,92.92713789364831,1
76.97878372747498,47.57596364975532,1
67.37202754570876,42.83843832029179,0
89.67677575072079,65.79936592745237,1
50.534788289883,48.85581152764205,0
34.21206097786789,44.20952859866288,0
77.9240914545704,68.9723599933059,1
62.27101367004632,69.95445795447587,1
80.1901807509566,44.82162893218353,1
93.114388797442,38.80067033713209,0
61.83020602312595,50.25610789244621,0
38.78580379679423,64.99568095539578,0
61.379289447425,72.80788731317097,1
85.40451939411645,57.05198397627122,1
52.10797973193984,63.12762376881715,0
52.04540476831827,69.43286012045222,1
40.23689373545111,71.16774802184875,0
54.63510555424817,52.21388588061123,0
33.91550010906887,98.86943574220611,0
64.17698887494485,80.90806058670817,1
74.78925295941542,41.57341522824434,0
34.1836400264419,75.2377203360134,0
83.90239366249155,56.30804621605327,1
51.54772026906181,46.85629026349976,0
94.44336776917852,65.56892160559052,1
82.36875375713919,40.61825515970618,0
51.04775177128865,45.82270145776001,0
62.22267576120188,52.06099194836679,0
77.19303492601364,70.45820000180959,1
97.77159928000232,86.7278223300282,1
62.07306379667647,96.76882412413983,1
91.56497449807442,88.69629254546599,1
79.94481794066932,74.16311935043758,1
99.2725269292572,60.99903099844988,1
90.54671411399852,43.39060180650027,1
34.52451385320009,60.39634245837173,0
50.2864961189907,49.80453881323059,0
49.58667721632031,59.80895099453265,0
97.64563396007767,68.86157272420604,1
32.57720016809309,95.59854761387875,0
74.24869136721598,69.82457122657193,1
71.79646205863379,78.45356224515052,1
75.3956114656803,85.75993667331619,1
35.28611281526193,47.02051394723416,0
56.25381749711624,39.26147251058019,0
30.05882244669796,49.59297386723685,0
44.66826172480893,66.45008614558913,0
66.56089447242954,41.09209807936973,0
40.45755098375164,97.53518548909936,1
49.07256321908844,51.88321182073966,0
80.27957401466998,92.11606081344084,1
66.74671856944039,60.99139402740988,1
32.72283304060323,43.30717306430063,0
64.0393204150601,78.03168802018232,1
72.34649422579923,96.22759296761404,1
60.45788573918959,73.09499809758037,1
58.84095621726802,75.85844831279042,1
99.82785779692128,72.36925193383885,1
47.26426910848174,88.47586499559782,1
50.45815980285988,75.80985952982456,1
60.45555629271532,42.50840943572217,0
82.22666157785568,42.71987853716458,0
88.9138964166533,69.80378889835472,1
94.83450672430196,45.69430680250754,1
67.31925746917527,66.58935317747915,1
57.23870631569862,59.51428198012956,1
80.36675600171273,90.96014789746954,1
68.46852178591112,85.59430710452014,1
42.0754545384731,78.84478600148043,0
75.47770200533905,90.42453899753964,1
78.63542434898018,96.64742716885644,1
52.34800398794107,60.76950525602592,0
94.09433112516793,77.15910509073893,1
90.44855097096364,87.50879176484702,1
55.48216114069585,35.57070347228866,0
74.49269241843041,84.84513684930135,1
89.84580670720979,45.35828361091658,1
83.48916274498238,48.38028579728175,1
42.2617008099817,87.10385094025457,1
99.31500880510394,68.77540947206617,1
55.34001756003703,64.9319380069486,1
74.77589300092767,89.52981289513276,1
The only problem I see is that you should have written h = sigmoid(X * theta) instead of h = sigmoid(theta * X). I am getting the same answer from your code after changing this as I was getting from my code for the same assignment.
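The reason the order matters is shape: X is m-by-(n+1) and theta is (n+1)-by-1, so X * theta gives one value per training example, while theta * X does not have matching inner dimensions. A minimal sketch of the corrected computation follows. The four-row dataset is made up for illustration, and the inline sigmoid is a stand-in for the course's sigmoid.m, which the question does not show.

```matlab
% Stand-in for the course's sigmoid.m (assumed; not shown in the question).
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Made-up data: intercept column plus two features, four examples.
X = [1 34.6 78.0; 1 30.3 43.9; 1 60.2 86.3; 1 79.0 75.3];
y = [0; 0; 1; 1];
m = length(y);
theta = zeros(size(X, 2), 1);

h = sigmoid(X * theta);                                  % m-by-1 predictions
J = -(1 / m) * (log(h)' * y + log(1 - h)' * (1 - y));    % scalar cost
grad = (1 / m) * X' * (h - y);                           % same shape as theta

fprintf('Cost at zero theta: %f\n', J);
```

With theta all zeros every prediction is 0.5, so the cost is log(2), about 0.693, regardless of the data. That matches the expected value printed by the exercise script.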
Related
Vectorize a regression map calculation
I compute the regression map of a time series A(t) on a field B(x,y,t) in the following way:
A=1:10; %time
B=rand(100,100,10); %x,y,time
rc=nan(size(B,1),size(B,2));
for ii=1:size(B,1)
    for jj=1:size(B,2)
        tmp = cov(A,squeeze(B(ii,jj,:))); %covariance matrix
        rc(ii,jj) = tmp(1,2); %covariance A and B
    end
end
rc = rc/var(A); %regression coefficient
Is there a way to vectorize/speed up the code? Or maybe some built-in function that I did not know of to achieve the same result?
In order to vectorize this algorithm, you would have to "get your hands dirty" and compute the covariance yourself. If you take a look inside cov you'll see that it has many lines of input checking and very few lines of actual computation. To summarize the critical steps:
y = varargin{1};
x = x(:);
y = y(:);
x = [x y];
[m,~] = size(x);
denom = m - 1;
xc = x - sum(x,1)./m; % Remove mean
c = (xc' * xc) ./ denom;
To simplify the above somewhat:
x = [x(:) y(:)];
m = size(x,1);
xc = x - sum(x,1)./m;
c = (xc' * xc) ./ (m - 1);
Now this is something that is fairly straightforward to vectorize...
function q51466884
A = 1:10; %time
B = rand(200,200,10); %x,y,time
%% Test Equivalence:
assert( norm(sol1-sol2) < 1E-10);
%% Benchmark:
disp([timeit(@sol1), timeit(@sol2)]);
%%
function rc = sol1()
    rc=nan(size(B,1),size(B,2));
    for ii=1:size(B,1)
        for jj=1:size(B,2)
            tmp = cov(A,squeeze(B(ii,jj,:))); %covariance matrix
            rc(ii,jj) = tmp(1,2); %covariance A and B
        end
    end
    rc = rc/var(A); %regression coefficient
end
function rC = sol2()
    m = numel(A);
    rB = reshape(B,[],10).'; % reshape
    % Center:
    cA = A(:) - sum(A)./m;
    cB = rB - sum(rB,1)./m;
    % Multiply:
    rC = reshape( (cA.' * cB) ./ (m-1), size(B(:,:,1)) ) ./ var(A);
end
end
I get these timings: [0.5381 0.0025], which means we saved two orders of magnitude in the runtime :)
Note that a big part of optimizing the algorithm is assuming you don't have any "strangeness" in your data, like NaN values etc. Take a look inside cov.m to see all the checks that we skipped.
`ode45` and tspan error Attempted to access
I'm using ode45 to solve a second-order differential equation. The time span is determined by how many numbers are in a txt file, so it is defined as follows:
i = 1;
t(i) = 0;
dt = 0.1;
numel(theta_d)
while ( i < numel(theta_d) )
    i = i + 1;
    t(i) = t(i-1) + dt;
end
Now the time elements should not exceed the size of the txt file (i.e. numel(theta_d)). In main.m, I have
x0 = [0; 0];
options= odeset('Reltol',dt,'Stats','on');
[t, x] = ode45('ODESolver', t, x0, options);
and the ODESolver.m header is
function dx = ODESolver(t, x)
If I run the code, I get this error:
Attempted to access theta_d(56); index out of bounds because numel(theta_d)=55.
Error in ODESolver (line 29)
theta_dDot = ( theta_d(i) - theta_dPrev ) / dt;
Why does ode45 not stick to the given time span?
Edit: this is the entire code.
main.m
clear all
clc
global error theta_d dt;
error = 0;
theta_d = load('trajectory.txt');
i = 1;
t(i) = 0;
dt = 0.1;
numel(theta_d)
while ( i < numel(theta_d) )
    i = i + 1;
    t(i) = t(i-1) + dt;
end
x0 = [pi/4; 0];
options= odeset('Reltol',dt,'Stats','on');
[t, x] = ode45(@ODESolver, t, x0, options);
%e = x(:,1) - theta_d; % Error theta
plot(t, x(:,2), 'r', 'LineWidth', 2);
title('Tracking Problem','Interpreter','LaTex');
xlabel('time (sec)');
ylabel('$\dot{\theta}(t)$', 'Interpreter','LaTex');
grid on
and ODESolver.m
function dx = ODESolver(t, x)
persistent i theta_dPrev
if isempty(i)
    i = 1;
    theta_dPrev = 0;
end
global error theta_d dt ;
dx = zeros(2,1);
%Parameters:
m = 0.5; % mass (Kg)
d = 0.0023e-6; % viscous friction coefficient
L = 1; % arm length (m)
I = 1/3*m*L^2; % inertia seen at the rotation axis. (Kg.m^2)
g = 9.81; % acceleration due to gravity m/s^2
% PID tuning
Kp = 5;
Kd = 1.9;
Ki = 0.02;
% theta_d first derivative
theta_dDot = ( theta_d(i) - theta_dPrev ) / dt;
theta_dPrev = theta_d(i);
% u: joint torque
u = Kp*(theta_d(i) - x(1)) + Kd*( theta_dDot - x(2)) + Ki*error;
error = error + (theta_dDot - x(1));
dx(1) = x(2);
dx(2) = 1/I*(u - d*x(2) - m*g*L*sin(x(1)));
i = i + 1;
end
and this is the error
Attempted to access theta_d(56); index out of bounds because numel(theta_d)=55.
Error in ODESolver (line 28)
theta_dDot = ( theta_d(i) - theta_dPrev ) / dt;
Error in ode45 (line 261)
f(:,2) = feval(odeFcn,t+hA(1),y+f*hB(:,1),odeArgs{:});
Error in main (line 21)
[t, x] = ode45(@ODESolver, t, x0, options);
The problem is that you have data at discrete time points, but ode45 needs to be able to evaluate the derivative at any time point in your time range. Once it has solved the problem, it interpolates the results back onto your desired time points. So it will evaluate the derivative many more times than just at the time points you specified, which is why your i counter cannot work.
Since you have discrete data, the only way to proceed with ode45 is to interpolate theta_d to any time t. You have a list of values theta_d corresponding to times 0:dt:(dt*(numel(theta_d)-1)), so to interpolate to a particular time t, use interp1(0:dt:(dt*(numel(theta_d)-1)),theta_d,t). I turned this into an anonymous function, thetaI, that gives the interpolated value of theta_d at a given time t. Then your derivative function will look like
function dx = ODESolver(t, x,thetaI)
global error % still shared with main.m; the integral term below uses it
dx = zeros(2,1);
%Parameters:
m = 0.5; % mass (Kg)
d = 0.0023e-6; % viscous friction coefficient
L = 1; % arm length (m)
I = 1/3*m*L^2; % inertia seen at the rotation axis. (Kg.m^2)
g = 9.81; % acceleration due to gravity m/s^2
% PID tuning
Kp = 5;
Kd = 1.9;
Ki = 0.02;
% theta_d first derivative
dt=1e-4;
theta_dDot = (thetaI(t) - thetaI(t-dt)) / dt; %// Note thetaI(t) is the interpolated theta_d values at time t
% u: joint torque
u = Kp*(thetaI(t) - x(1)) + Kd*( theta_dDot - x(2)) + Ki*error;
error = error + (theta_dDot - x(1));
dx=[x(2); 1/I*(u - d*x(2) - m*g*L*sin(x(1)))];
end
and you will have to define
thetaI=@(t) interp1(0:dt:(dt*(numel(theta_d)-1)),theta_d,t);
before calling ode45 using
[t, x] = ode45(@(t,x) ODESolver(t,x,thetaI), t, x0, options);
I removed a few things from ODESolver and changed how the derivative was computed. Note I can't test this, but it should get you on the way.
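To make the interpolation idea concrete, here is a minimal, self-contained sketch on a deliberately simpler problem than the pendulum. Everything in it is an assumption for illustration: the sampled trajectory is a sine wave standing in for trajectory.txt, the ODE is a first-order tracking law x' = k*(theta_d(t) - x), and the names tSamples, thetaI, and k are made up. The query time is clamped to the sampled range because interp1 returns NaN for points outside it.

```matlab
% Made-up reference trajectory sampled every dt seconds (stand-in for trajectory.txt).
dt = 0.1;
tSamples = 0:dt:5;
theta_d = sin(tSamples);

% Interpolant usable at any time; the query is clamped to the sampled range
% so that interp1 never sees an out-of-range point (it would return NaN).
clampT = @(t) min(max(t, tSamples(1)), tSamples(end));
thetaI = @(t) interp1(tSamples, theta_d, clampT(t));

% Simple first-order tracking ODE, used only to show the calling pattern.
k = 5;
rhs = @(t, x) k * (thetaI(t) - x);

% Passing the full sample vector as tspan makes ode45 report the solution
% at exactly those times, even though it evaluates rhs at many others.
[tOut, xOut] = ode45(rhs, tSamples, 0);
```

The solver is free to call rhs at whatever intermediate times it needs, and no counter is involved, so the index-out-of-bounds error cannot occur.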
Gradient descent Matlab
I have a problem with gradient descent in Matlab. I don't know how to build the function.
Default settings:
max_iter = 1000;
learing = 1;
degree = 1;
My logistic regression cost function: (Correct ???)
function [Jval, Jgrad] = logcost(function(theta, matrix, y)
mb = matrix * theta;
p = sigmoid(mb);
Jval = sum(-y' * log(p) - (1 - y')*log(1 - p)) / length(matrix);
if nargout > 1
    Jgrad = matrix' * (p - y) / length(matrix);
end
and now my gradient descent function:
function [theta, Jval] = graddescent(logcost, learing, theta, max_iter)
[Jval, Jgrad] = logcost(theta);
for iter = 1:max_iter
    theta = theta - learing * Jgrad; % is this correct?
    Jval[iter] = ???
end
thx for all help :), Hans
You can specify the code of your cost function in a regular MATLAB function:
function [Jval, Jgrad] = logcost(theta, matrix, y)
mb = matrix * theta;
p = sigmoid(mb);
Jval = sum(-y' * log(p) - (1 - y')*log(1 - p)) / length(matrix);
if nargout > 1
    Jgrad = matrix' * (p - y) / length(matrix);
end
end
Then, create your gradient descent method (Jgrad is recomputed in each loop iteration):
function [theta, Jval] = graddescent(logcost, learing, theta, max_iter)
for iter = 1:max_iter
    [Jval, Jgrad] = logcost(theta);
    theta = theta - learing * Jgrad;
end
end
and call it with a function handle that can be used to evaluate your cost:
% Initialize 'matrix' and 'y' ...
matrix = randn(2,2);
y = randn(2,1);
% Create function handle.
fLogcost = @(theta)(logcost(theta, matrix, y));
% Perform gradient descent.
[ theta, Jval] = graddescent(fLogcost, 1e-3, [ 0 0 ]', 10);
You can also take a look at fminunc, MATLAB's built-in function for unconstrained optimization, which includes an implementation of gradient descent, among other minimization techniques.
Regards.
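The question also asked how to record the cost at every iteration. MATLAB indexes with parentheses, not square brackets, so one option is to preallocate a history vector and fill it inside the loop. The sketch below is only an illustration: the six-row dataset, the step size 0.1, and the names costFn, gradFn, and Jhist are made up, and cost and gradient are written as two anonymous functions so the snippet runs as a single script.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Made-up, non-separable toy data: intercept column plus one feature.
X = [1 -2; 1 -1; 1 0; 1 1; 1 2; 1 0.5];
y = [0; 0; 1; 1; 1; 0];
m = length(y);

% Cost and gradient as separate handles (hypothetical names).
costFn = @(theta) (-y' * log(sigmoid(X * theta)) ...
                   - (1 - y)' * log(1 - sigmoid(X * theta))) / m;
gradFn = @(theta) X' * (sigmoid(X * theta) - y) / m;

learning_rate = 0.1;
max_iter = 200;
theta = zeros(2, 1);
Jhist = zeros(max_iter, 1);      % cost after each update

for iter = 1:max_iter
    theta = theta - learning_rate * gradFn(theta);   % gradient re-evaluated every pass
    Jhist(iter) = costFn(theta);                     % parentheses, not Jval[iter]
end

plot(1:max_iter, Jhist);
xlabel('iteration'); ylabel('cost');
```

Plotting Jhist is a quick sanity check: for a small enough step size the curve should decrease steadily, and a curve that grows or oscillates usually means the learning rate is too large.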
Forward Euler Method to solve first order ODEs in Matlab
I wrote this Matlab program that is supposed to solve the IVP du/dt = -5000(u(t) - cos(t)) - sin(t) with u(0)=1. The exact solution should be u(t) = cos(t), but the solution I am getting from forward Euler in my code is huge compared to what it should be and what I calculated, and I'm not sure where I've gone wrong in my code. Can you find my error?
function main
dt=5;
u0 = 1;
n=50;
[T,U] = euler(dt, u0, n);
uexact = cos(T);
plot(T,U)
hold on
plot(T, uexact, 'r')
end
function [T,U]= euler( dt, u0, n)
R= dt/n;
T=zeros(1,n+1);
U=zeros(1,n+1);
U(1)=u0;
T(1) = 0;
for j=1:n
    U(j+1)= U(j)+ R*(rhs(T(j),U(j)));
    T(j+1)= T(j) + R;
end
end
function dP = rhs(t, P)
P = zeros(1,1);
dP = (-5000)*(P - cos(t)) - sin(t);
end
You must not set P = zeros(1,1) inside the function that evaluates the derivative, because that overwrites the value of U passed in. Moreover, the problem you have is the [numerical instability of the forward Euler method][1]. You need a very small time step to make it converge (because of the large coefficient of P inside the function rhs).
function main
dt=5;
u0 = 1;
n=50000; % increase the number of subintervals (smaller time step) to get stability
[T,U] = euler(dt, u0, n);
uexact = cos(T);
plot(T,U,'bs')
hold on
plot(T, uexact, 'r')
end
function [T,U]= euler( dt, u0, n)
R= dt/n;
T=zeros(1,n+1);
U=zeros(1,n+1);
U(1)=u0;
T(1) = 0;
for j=1:n
    % Implicit method converges
    % U(j+1) = ( U(j) - R*(-5000)*cos(T(j)) - R*sin(T(j)))/(1 - R*(-5000));
    U(j+1)= U(j)+ R*(rhs(T(j),U(j)));
    T(j+1)= T(j) + R;
end
end
function dP = rhs(t, P)
%P = zeros(1,1); %% It must not be here
dP = (-5000)*(P - cos(t)) - sin(t);
end
[1]: http://en.wikipedia.org/wiki/Euler_method
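How small is "very small"? Writing e_j = U(j) - cos(T(j)) for the error, one forward Euler step gives e_(j+1) = (1 - 5000*R)*e_j plus a local truncation term of order R^2, so errors shrink only when |1 - 5000*R| <= 1, that is R <= 2/5000 = 4e-4. With n = 50 the step is R = 0.1 and the factor has magnitude 499, which explains the blow-up; with n = 50000 the factor is 0.5. The sketch below compares the two step sizes from this thread. The variable names (lambda, nValues, errs) are made up for illustration.

```matlab
lambda = 5000;                                   % stiffness coefficient from the IVP
rhs = @(t, u) -lambda * (u - cos(t)) - sin(t);   % right-hand side, exact solution cos(t)
tEnd = 5;

nValues = [50, 50000];            % the question's and the answer's step counts
errs = zeros(size(nValues));

for k = 1:numel(nValues)
    n = nValues(k);
    R = tEnd / n;                 % step size
    u = 1;
    t = 0;
    for j = 1:n
        u = u + R * rhs(t, u);    % forward Euler update
        t = t + R;
    end
    errs(k) = abs(u - cos(tEnd));
    fprintf('n = %d, R = %g, |1 - lambda*R| = %g, error at t = 5: %g\n', ...
            n, R, abs(1 - lambda * R), errs(k));
end
```

Any n above 5 * 5000 / 2 = 12500 satisfies the bound, so 50000 is comfortably inside the stable range.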
matlab fminunc not quitting (running indefinitely)
I have been trying to implement logistic regression in MATLAB for a while now. I have done it already, but for reasons unknown to me, I am unable to perform a single iteration using fminunc. When the function is called, the program just goes into wait mode indefinitely. Is there something wrong with the code, or is my data set too large?
function [theta J] = logisticReg(initial_theta, X, y, lambda, iter)
% Set Options
options = optimset('GradObj', 'on', 'MaxIter', iter);
% Optimize
[theta, J, exit_flag] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
end
where
X is a [676,6251] matrix
y is a [676,1] vector
lambda = 1
initial_theta is a [6251, 1] vector of zeros
iter = 1
Any 'pointing in the right direction' will be greatly appreciated!
P.S. I was able to run costFunctionReg, so I am assuming it is this function.
As requested, the costFunctionReg function:
function [J, grad] = costFunctionReg(theta, X, y, lambda)
m = length(y); % number of training examples
J = 0;
grad = zeros(size(theta));
hyp = sigmoid(X * theta);
cost_reg = (lambda / (2*m)) * sum(theta(2:end).^2);
J = 1/m * sum((-y .* log(hyp)) - ((1-y) .* log(1-hyp))) + cost_reg;
grad(1) = 1/m * sum((hyp - y)' * X(:,1));
grad(2:end) = (1/m * ((hyp - y)' * X(:,2:end))) + (lambda/m)*theta(2:end)';
To answer @Rasman's question:
Cost at initial theta: NaN
press any key to continue
Performing Logistic Regrestion
Error using sfminbx (line 28)
Objective function is undefined at initial point. fminunc cannot continue.
Error in fminunc (line 406)
[x,FVAL,~,EXITFLAG,OUTPUT,GRAD,HESSIAN] = sfminbx(funfcn,x,l,u, ...
Error in logisticReg (line 8)
[theta, J, exit_flag] = fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);
Error in main (line 40)
[theta J] = logisticReg(initial_theta, X, y, lambda, iter);
The first line is me running costFunctionReg with initial_theta.
It's possible that you already tried searching this link: http://www.mathworks.com/matlabcentral/newsreader/view_thread/290418
The general arc (I've copied and pasted/made edits to the text from the above site) is:
The error message indicates that your objective function "obj" either errors, or returns an invalid value such as Inf, NaN or a complex number when evaluated at the point x0. You may want to evaluate your function at x0 before calling fmincon (or in your case, fminunc) to see if it's well defined, with something like
costFunctionReg(initial_theta, X, y, lambda)
And if your function (costFunctionReg) is returning a complex-valued result, then use real() to strip the imaginary part away.
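One thing worth checking, offered as a guess since the asker's data is not shown: with initial_theta all zeros, sigmoid(X * theta) is 0.5 for every row as long as X is finite, so the unregularized cost should be log(2). A NaN at that point therefore hints at NaN or Inf entries in X or y rather than at the optimizer. The sketch below uses a made-up three-row matrix with one corrupted entry to show the effect and a way to locate the offending rows; the names cost, badRows, J_bad, and J_ok are hypothetical.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Made-up data with one corrupted entry.
X = [1 0.5; 1 NaN; 1 -1.2];
y = [1; 0; 1];
theta0 = zeros(2, 1);

% Unregularized logistic cost, written inline for a self-contained script.
cost = @(theta, X, y) mean(-y .* log(sigmoid(X * theta)) ...
                           - (1 - y) .* log(1 - sigmoid(X * theta)));

J_bad = cost(theta0, X, y);            % NaN, because 0 * NaN is NaN

% Locate rows containing NaN or Inf, then re-evaluate without them.
badRows = any(~isfinite(X), 2) | ~isfinite(y);
J_ok = cost(theta0, X(~badRows, :), y(~badRows));

fprintf('Rows with non-finite values: %s\n', mat2str(find(badRows)'));
fprintf('Cost with all rows: %f, cost with clean rows: %f\n', J_bad, J_ok);
```

Whether dropping rows is appropriate depends on the data; the point here is only that the check is cheap and can be run before handing the function to fminunc.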