The multivariate linear regression cost function is:

J(θ) = (1/(2m)) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))²
Is the following code in Matlab correct?
function J = computeCostMulti(X, y, theta)
m = length(y);
J = 0;
J=(1/(2*m)*(X*theta-y)'*(X*theta-y);
end
There are two ways I tried, which are essentially the same code.
J = (X * theta - y)'*(X * theta - y)/(2*m);
or you can try:
J = (1/(2*m))*(X * theta - y)'*(X * theta - y)
You are missing a ) at the end:
J=(1/(2*m))*(X*theta-y)'*(X*theta-y);
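For a quick sanity check, here is a minimal sketch (with made-up numbers) that calls the corrected computeCostMulti on a tiny data set; the expected values are worked out by hand in the comments.

% Tiny made-up data set: 3 examples, an intercept column plus one feature.
X = [1 1; 1 2; 1 3];
y = [2; 4; 6];

J = computeCostMulti(X, y, [0; 0])   % (2^2 + 4^2 + 6^2) / (2*3) = 9.3333
J = computeCostMulti(X, y, [0; 2])   % perfect fit y = 2*x, so J = 0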
I'm trying to implement a vectorized version of the regularised logistic regression. I have found a post that explains the regularised version but I don't understand it.
To make it easy I will copy the code below:
hx = sigmoid(X * theta);
m = length(X);
J = (sum(-y' * log(hx) - (1 - y') * log(1 - hx)) / m) + lambda * sum(theta(2:end).^2) / (2*m);
grad =((hx - y)' * X / m)' + lambda .* theta .* [0; ones(length(theta)-1, 1)] ./ m ;
I understand the first part of the cost equation. If I'm correct, it could be represented as:
J = ((-y' * log(hx)) - ((1-y)' * log(1-hx)))/m;
The problem is the regularization term. Let's look at it in more detail:
Dimensions:
X = (m x (n+1))
theta = ((n+1) x 1)
I don't understand why he leaves the first element of theta (theta_0) out of the equation, when in theory the regularization term is

(λ/(2m)) · Σ_j θ_j²

and it has to take into account all the thetas.
For the gradient descent, I think that this equation is equivalent:
L = eye(length(theta));
L(1,1) = 0;
grad = (1/m) * X' * (hx - y) + (lambda * (L * theta)) / m;
In MATLAB, indices begin at 1, while in the mathematical notation they begin at 0 (the indices in the formula you mentioned also begin at 0).
So, in theory, the first element of theta also needs to be left out of the equation.
And as for your second question, you're right! It is an equivalent, clean equation!
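To convince yourself, here is a small self-contained sketch (made-up random data, with an inline sigmoid so nothing else is needed) that checks numerically that the masked expression from the post and your L-matrix version give the same gradient:

m = 5; n = 3;
X = [ones(m,1) randn(m,n)];          % m x (n+1), intercept column included
y = double(rand(m,1) > 0.5);         % m x 1 labels in {0,1}
theta = randn(n+1, 1);
lambda = 0.1;

sig = @(z) 1 ./ (1 + exp(-z));       % inline sigmoid
hx = sig(X * theta);

% Expression from the post (the mask vector zeroes out the theta_0 term):
grad1 = ((hx - y)' * X / m)' + lambda .* theta .* [0; ones(n,1)] ./ m;

% Your version with the modified identity matrix L:
L = eye(n+1); L(1,1) = 0;
grad2 = (1/m) * X' * (hx - y) + (lambda/m) * (L * theta);

disp(max(abs(grad1 - grad2)))        % should be (numerically) zero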
function [J, grad] = costFunction(theta, X, y)
data = load('ex2data1.txt');
y = data(:, 3);
theta = [1;1;2];
m = length(y);
one = ones(m,1);
X1 = data(:, [1, 2]);
X = [one X1];
J = 0;
grad = zeros(size(theta));
J= 1/m *((sum(-y*log(sigmoid(X*theta)))) - (sum(1-y * log(1 - sigmoid(X*theta)))));
for i = 1:m
grad = (1/m) * sum (sigmoid(X*theta) - y')*X;
end
end
I want to know if I implemented the cost function and gradient descent correctly; I am getting a NaN answer with this. Also, does theta(1) always have to be 0? I have it as 1 here. How many iterations do I need for grad: should it be equal to the length of the matrix, or something else?
function [J, grad] = costFunction(theta, X, y)
m = length(y);
J = 0;
grad = zeros(size(theta));
sig = 1./(1 + (exp(-(X * theta))));
J = ((-y' * log(sig)) - ((1 - y)' * log(1 - sig)))/m;
grad = ((sig - y)' * X)/m;
end
where
sig = 1./(1 + (exp(-(X * theta))));
is the matrix representation of the logistic regression hypothesis, which is defined as

h_θ(x) = g(θᵀx)

where g is the sigmoid function, defined as

g(z) = 1 / (1 + e^(−z))
J = ((-y' * log(sig)) - ((1 - y)' * log(1 - sig)))/m;
is the matrix representation of the cost function in logistic regression:

J(θ) = (1/m) · Σ_{i=1}^{m} [ −y^(i) · log(h_θ(x^(i))) − (1 − y^(i)) · log(1 − h_θ(x^(i))) ]
and
grad = ((sig - y)' * X)/m;
is the matrix representation of the gradient of the cost, which is a vector of the same length as θ where the jth element (for j = 0, 1, ..., n) is defined as follows:

∂J(θ)/∂θ_j = (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
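For illustration, here is a minimal usage sketch with made-up data; with theta = 0 every prediction is 0.5, so J comes out as log(2) ≈ 0.6931 regardless of the data.

X = [ones(4,1) [1 2; 2 3; 3 4; 4 5]];   % 4 examples: intercept plus two made-up features
y = [0; 0; 1; 1];
theta = zeros(3, 1);

[J, grad] = costFunction(theta, X, y);
disp(J)      % ~ 0.6931 = log(2)
disp(grad)   % 1 x 3 row vector, since grad = ((sig - y)' * X)/m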
I have a vector x = [x_1 x_2 ... x_n], a vector y = [y_1 y_2 y_3] and a matrix X = [x_11 x_12 ... x_1n; x_21 x_22 ... x_2n; x_31 x_32 ... x_3n].
For i = 1, 2, ..., n, I want to compute the following sum in MATLAB:
sum((x(i) - y*X(:,i))^2)
What I have tried to write is the following MATLAB code:
vv = (x(1) - y*X(:,1))^2; % as an initialization for i=1
for i = 2 : n
vv = vv + (x(i) - y * X(:,i))^2
end
But I am wondering if I can compute this without a for loop, in order to reduce the computational time, especially when n is very large... Are there any more efficient ways to do that in MATLAB?
Any help will be very appreciated!
You do not need the loop at all,
for i = 2:n
y*X(:,i)
end
is the same as just y*X, so x(i) - y*X(:,i) over all i is simply x - y*X. So basically, it's:
vv = sum((x - y * X).^2);
Thanks to @beaker for pointing out the mistake.
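Here is a quick self-contained check with made-up random data, comparing the loop from the question with the vectorized one-liner; the two results should agree up to rounding.

n = 6;
x = randn(1, n);                 % 1 x n
y = randn(1, 3);                 % 1 x 3
X = randn(3, n);                 % 3 x n

% Loop version from the question:
vv_loop = (x(1) - y*X(:,1))^2;
for i = 2:n
    vv_loop = vv_loop + (x(i) - y*X(:,i))^2;
end

% Vectorized version:
vv_vec = sum((x - y*X).^2);

disp([vv_loop vv_vec])           % the two values should match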
function [J, grad] = costFunction(theta, X, y)
m = length(y);
h = sigmoid(X*theta);
sh = sigmoid(h);
grad = (1/m)*X'*(sh - y);
J = (1/m)*sum(-y.*log(sh) - (1 - y).*log(1 - sh));
end
I'm trying to compute the cost function for logistic regression. Can someone please tell me why this isn't accurate?
Update: Sigmoid function
function g = sigmoid(z)
g = zeros(size(z));
g = 1./(1 + exp(1).^(-z));
end
As Dan stated, your costFunction calls sigmoid twice. First, it performs the sigmoid function on X*theta; then it performs the sigmoid function again on the result of sigmoid(X*theta). Thus, sh = sigmoid(sigmoid(X*theta)). Your cost function should only call the sigmoid function once.
See the code below; I removed the sh variable and replaced it with h everywhere else, so the sigmoid function is only called once.
function [J, grad] = costFunction(theta, X, y)
m = length(y);
h = sigmoid(X*theta);
grad = (1/m)*X'*(h - y);
J = (1/m)*sum(-y.*log(h) - (1 - y).*log(1 - h));
end
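One quick way to convince yourself the corrected version is right is a gradient check: the sketch below (made-up data, using the sigmoid function from the update above) compares the analytic gradient with a finite-difference approximation, and the two columns printed at the end should be nearly identical.

X = [ones(5,1) randn(5,2)];      % 5 examples: intercept plus two made-up features
y = double(rand(5,1) > 0.5);     % labels in {0,1}
theta = randn(3, 1);

[J, grad] = costFunction(theta, X, y);

% Finite-difference approximation of the gradient for comparison:
delta = 1e-6;
numgrad = zeros(size(theta));
for j = 1:numel(theta)
    e = zeros(size(theta)); e(j) = delta;
    numgrad(j) = (costFunction(theta + e, X, y) - costFunction(theta - e, X, y)) / (2*delta);
end

disp([grad numgrad])             % the two columns should be nearly identical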
I have a problem with gradient descent in MATLAB.
I don't know how to build the function.
Default settings:
max_iter = 1000;
learing = 1;
degree = 1;
My logistic regression cost function: (Correct ???)
function [Jval, Jgrad] = logcost(function(theta, matrix, y)
mb = matrix * theta;
p = sigmoid(mb);
Jval = sum(-y' * log(p) - (1 - y')*log(1 - p)) / length(matrix);
if nargout > 1
Jgrad = matrix' * (p - y) / length(matrix);
end
and now my gradient descent function:
function [theta, Jval] = graddescent(logcost, learing, theta, max_iter)
[Jval, Jgrad] = logcost(theta);
for iter = 1:max_iter
theta = theta - learing * Jgrad; % is this correct?
Jval[iter] = ???
end
thx for all help :), Hans
You can specify the code of your cost function in a regular MATLAB function:
function [Jval, Jgrad] = logcost(theta, matrix, y)
mb = matrix * theta;
p = sigmoid(mb);
Jval = sum(-y' * log(p) - (1 - y')*log(1 - p)) / length(matrix);
if nargout > 1
Jgrad = matrix' * (p - y) / length(matrix);
end
end
Then, create your gradient descent method (Jgrad is automatically updated in each loop iteration):
function [theta, Jval] = graddescent(logcost, learing, theta, max_iter)
for iter = 1:max_iter
[Jval, Jgrad] = logcost(theta);
theta = theta - learing * Jgrad;
end
end
and call it with a function object that can be used to evaluate your cost:
% Initialize 'matrix' and 'y' ...
matrix = randn(2,2);
y = randn(2,1);
% Create function object.
fLogcost = @(theta)(logcost(theta, matrix, y));
% Perform gradient descent.
[ theta, Jval] = graddescent(fLogcost, 1e-3, [ 0 0 ]', 10);
You can also take a look at fminunc, MATLAB's built-in function for unconstrained optimization, which includes gradient-based minimization techniques.
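For reference, here is a minimal sketch of handing the same cost function to fminunc instead of the hand-written loop; the 'GradObj' option tells it that logcost also returns the gradient (this uses the classic optimset interface, and option names may differ across MATLAB versions).

% Reuse the function handle from above.
fLogcost = @(theta)(logcost(theta, matrix, y));

options = optimset('GradObj', 'on', 'MaxIter', 400);
[theta, Jval] = fminunc(fLogcost, [0 0]', options);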
Regards.