Constant term in MATLAB principal component regression (PCR) analysis

I am trying to learn principal component regression (PCR) with MATLAB. I am using this guide: http://www.mathworks.fr/help/stats/examples/partial-least-squares-regression-and-principal-components-regression.html
It's really good, but I just cannot understand one step:
we do the PCA and the regression, nice and clear:
[PCALoadings,PCAScores,PCAVar] = princomp(X);
betaPCR = regress(y-mean(y), PCAScores(:,1:2));
And then we adjust the first coefficient:
betaPCR = PCALoadings(:,1:2)*betaPCR;
betaPCR = [mean(y) - mean(X)*betaPCR; betaPCR];
yfitPCR = [ones(n,1) X]*betaPCR;
Why does the constant term need to be 'mean(y) - mean(X)*betaPCR'? Can you explain that to me?
Thanks in advance!

This is really a math question, not a coding question. Your PCA extracts a set of features and puts them in a matrix, which gives you PCALoadings and PCAScores. Pull out the first two principal components and their loadings, and put them in their own matrix:
W = PCALoadings(:, 1:2)
Z = PCAScores(:, 1:2)
The relationship between X and Z is:
Z = (X - mean(X)) * W, or equivalently X ~ mean(X) + Z * W' (1)
The intuition is that Z captures most of the "important information" in X, and the matrix W tells you how to transform between the two representations.
Now you can do a regression of y on Z. First you have to subtract the mean from y, so that both the left and right hand sides have mean zero:
y - mean(y) = Z * beta + errors (2)
Now you want to use that regression to make predictions for y from X. Substituting from equation (1) into equation (2) gives you
y - mean(y) = (X - mean(X)) * W * beta
= (X - mean(X)) * beta1
where we have defined beta1 = W * beta (you do this in your third line of code). Rearranging:
y = mean(y) - mean(X) * beta1 + X * beta1
= [ones(n,1) X] * [mean(y) - mean(X) * beta1; beta1]
= [ones(n,1) X] * betaPCR
which works out if we define
betaPCR = [mean(y) - mean(X) * beta1; beta1]
as in your fourth line of code.
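If you want to check this numerically, here is a short sketch (not part of the linked example; it assumes X, y and n are defined as in that guide) comparing the fit computed in the score space with the fit computed from the full betaPCR:
% sketch: verify that the intercept form reproduces the centered fit
[PCALoadings, PCAScores, PCAVar] = princomp(X);
betaCentered = regress(y - mean(y), PCAScores(:,1:2));
yfit1 = mean(y) + PCAScores(:,1:2) * betaCentered;    % fit in the score space
betaPCR = PCALoadings(:,1:2) * betaCentered;          % map back to the X space
betaPCR = [mean(y) - mean(X)*betaPCR; betaPCR];       % prepend the constant term
yfit2 = [ones(n,1) X] * betaPCR;                      % fit in the original space
% yfit1 and yfit2 agree up to floating-point round-off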

Related

Regularized logistic regression with vectorization

I'm trying to implement a vectorized version of the regularised logistic regression. I have found a post that explains the regularised version but I don't understand it.
To make it easy I will copy the code below:
hx = sigmoid(X * theta);
m = length(X);
J = (sum(-y' * log(hx) - (1 - y') * log(1 - hx)) / m) + lambda * sum(theta(2:end).^2) / (2*m);
grad =((hx - y)' * X / m)' + lambda .* theta .* [0; ones(length(theta)-1, 1)] ./ m ;
I understand the first part of the cost equation. If I'm correct, it could be represented as:
J = ((-y' * log(hx)) - ((1-y)' * log(1-hx)))/m;
The problem is the regularization term. Let's look at it in more detail:
Dimensions:
X = (m x (n+1))
theta = ((n+1) x 1)
I don't understand why he leaves the first element of theta (theta_0) out of the equation, when in theory the regularization term is:
lambda/(2*m) * sum_{j=1}^{n} theta_j^2
and it has to take into account all the thetas.
For the gradient descent, I think this equation is equivalent:
L = eye(length(theta));
L(1,1) = 0;
grad = (1/m) * X' * (hx - y) + (lambda/m) * (L * theta);
In MATLAB indices begin at 1, while in mathematical notation they begin at 0 (the indices in the formula you mentioned also begin at 0). So in the formula, too, the first element of theta (theta_0, which is theta(1) in MATLAB) is left outside of the regularization term.
And as for your second question, you are right! It is an equivalent, cleaner equation!
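Putting the two pieces together, here is a minimal sketch of the vectorized cost and gradient using your L matrix (it assumes X, y, theta and lambda are defined as in the question, and that a sigmoid helper exists as in the original code):
% sketch: vectorized regularized cost and gradient, excluding theta(1) via L
hx = sigmoid(X * theta);            % m-by-1 predictions
m  = size(X, 1);
L  = eye(length(theta));
L(1,1) = 0;                         % theta(1), i.e. theta_0, is not regularized
J = (-y' * log(hx) - (1 - y') * log(1 - hx)) / m + lambda * sum((L * theta).^2) / (2*m);
grad = (X' * (hx - y)) / m + (lambda / m) * (L * theta);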

Linear regression cost function and gradient descent

I have been studying data science and ML topics for a while, and I always get stuck at one point that causes great confusion for me.
In courses like Andrew Ng's, the error between the predicted value and the true value from, e.g., linear regression is defined as:
error = predicted_value - y
In some other tutorials/courses, the error is presented as:
error = y - predicted_value
Also, for instance, on Udacity's data science Nanodegree, the gradient descent weights update is given by:
error = y - predicted_value
W_new = W + learn_rate * np.matmul(error, X)
At the same time, in several other books/courses, the same procedure is given by:
error = predicted_value - y
W_new = W - learn_rate * np.matmul(error, X)
Could someone help me out with those different notations?
Thank you!
EDIT
Following @bottaio's answer, I got the following:
First case :
# compute errors
y_pred = np.matmul(X, W) + b
error = y_pred - y
# compute steps
W_new = W - learn_rate * np.matmul(error, X)
b_new = b - learn_rate * error.sum()
return W_new, b_new
Second case :
# compute errors
y_pred = np.matmul(X, W) + b
error = y - y_pred
# compute steps
W_new = W + learn_rate * np.matmul(error, X)
b_new = b + learn_rate * error.sum()
return W_new, b_new
Running the first and second cases, I get:
Third case :
# compute errors
y_pred = np.matmul(X, W) + b
error = y_pred - y
# compute steps
W_new = W + learn_rate * np.matmul(error, X)
b_new = b + learn_rate * error.sum()
return W_new, b_new
Running the third case, I get:
That's exactly the intuition I'm trying to achieve.
What's the relation between using error = y - y_pred and having to use the positive step W_new = W + learn_rate * np.matmul(error, X) instead of W_new = W - learn_rate * np.matmul(error, X)?
Thank you for all the support!!!!!
error = predicted_value - y
error' = y - predicted_value = -error
W = W + lr * matmul(error', X) = W + lr * matmul(-error, X) = W - lr * matmul(error, X)
These two expressions are two ways of looking at the same thing; you propagate the error backwards.
To be honest, the second states more clearly what is going on under the hood: the error is just the difference between what the model predicted and the ground truth (which explains predicted - y), and a gradient descent step changes the weights in the direction opposite to the gradient (which explains the minus).
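For instance, with any concrete numbers the two updates coincide. Here is a tiny check (just a sketch with made-up values, written in MATLAB like the rest of this page, but the algebra is the same in NumPy):
% sketch: both sign conventions produce the same updated weights
X = [1 2; 3 4];  y = [1; 0];  W = [0.1; -0.2];  lr = 0.01;
y_pred = X * W;
errA = y_pred - y;              % convention A: predicted - y
errB = y - y_pred;              % convention B: y - predicted
W_A = W - lr * (X' * errA);     % minus sign with convention A
W_B = W + lr * (X' * errB);     % plus sign with convention B
% W_A and W_B are identical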

Solving nonlinear equations with "solve", incorrect solution

I am testing MATLAB's capabilities in solving equations for a project I intend to do, so I gave it a test run with something simple, but the results it gives me are incorrect. I tried to solve two non-linear equations with two unknowns; one of the solutions is correct, the other is not.
syms theta d x y
eq1 = d * cos(theta) == x;
eq2 = d * sin(theta) == y;
sol = solve(eq1, eq2, theta, d)
sol.theta
sol.d
The solutions for d are correct, but for theta I get:
-2*atan((x - (x^2 + y^2)^(1/2))/y)
-2*atan((x + (x^2 + y^2)^(1/2))/y)
And the correct answer for theta is simply atan(y/x)
Then when I evaluate these solutions with x = 1, y = 0, I get:
eval(sol.d)
eval(sol.theta)
d = 1, -1
theta = NaN, -3.1416
Solutions for d are correct, but theta in that scenario should be 0.
What am I doing wrong?
EDIT: Solving it by hand, it looks like this: divide the y equation by the x equation:
y/x = (d * sin(theta)) / (d * cos(theta))
y/x = sin(theta)/cos(theta)
y/x = tan(theta)
theta = atan(y/x)
Even if MATLAB solves it in some other way and gets a different expression, it should still yield the same final result when I use numbers, and it PARTIALLY does.
For x = 1 and y = 0, theta should be 0 => this doesn't work, it gives NaN (explanation below)
For x = 1 and y = 1, theta should be 45 degrees => this works
For x = 0 and y = 1, theta should be 90 degrees => this works
And I just checked it again with the 45 and 90 degree values for x and y and it works, but for x = 1 and y = 0 it still gives NaN as one of the answers, and that is because it gets a 0/0 from the way it is expressing it:
-2*atan((x - (x^2 + y^2)^(1/2))/y)
-2*atan((1 - (1^2 + 0^2)^(1/2))/0)
-2*atan((1 - 1)/0)
-2*atan(0/0)
but if it's in the form of atan(y/x), the result is
theta = atan(0/1)
theta = atan(0)
theta = 0
Did you mean to solve this:
syms a b theta d real
eq1 = a==d * cos(theta) ;
eq2 = b==d * sin(theta) ;
sol = solve([eq1 eq2], [d theta], 'IgnoreAnalyticConstraints', true, 'Real', true, 'ReturnConditions', true);
When solving the equations with symbolic x and y, the solver will find a solution subject to a certain condition, which can be obtained using the argument 'ReturnConditions':
syms x y theta d real
eq1 = d*cos(theta) == x;
eq2 = d*sin(theta) == y;
sol = solve([eq1; eq2],[d theta],'ReturnConditions',true);
This gives the following result for sol
>> sol.d
(x^2 + y^2)^(1/2)
-(x^2 + y^2)^(1/2)
>> sol.theta
2*pi*k - 2*atan((x - (x^2 + y^2)^(1/2))/y)
2*pi*k - 2*atan((x + (x^2 + y^2)^(1/2))/y)
>> sol.parameters
k
>> sol.conditions
y ~= 0 & in(k, 'integer')
y ~= 0 & in(k, 'integer')
As you can see, y = 0 does not satisfy the conditions of this general solution given by the solver, which results in your problem for y = 0. You can find solutions for y = 0 by either making y numeric instead of symbolic, or by adding an assumption:
syms x y theta d real
assume(y==0)
sol = solve([eq1; eq2],[d theta],'ReturnConditions',true);
I guess it's easier to just set y = 0 numerically for this one case, since there are already 4 possible solutions and conditions for the three lines above.
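For example, here is a minimal sketch of the numeric route for the problematic case (my own numbers, not from the original post):
% sketch: treat the y == 0 case with numeric x and y instead of symbolic ones
syms theta d real
x = 1; y = 0;                        % numeric values for the case that gave NaN
eq1 = d*cos(theta) == x;
eq2 = d*sin(theta) == y;
sol = solve([eq1; eq2], [d theta], 'ReturnConditions', true);
% sol.theta now holds solutions valid for y == 0 (e.g. 0 and pi, up to multiples of 2*pi),
% instead of the general expression that degenerates to atan(0/0)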

Trying to compute a specific sum without using a for loop in MATLAB

I have a vector x = [x_1 x_2 ... x_n], a vector y = [y_1 y_2 y_3] and a matrix X = [x_11 x_12 ... x_1n; x_21 x_22 ... x_2n; x_31 x_32 ... x_3n].
For i = 1, 2, ..., n, I want to compute the following sum in MATLAB:
sum((x(i) - y*X(:,i))^2)
What I have tried to write is the following MATLAB code:
vv = (x(1) - y*X(:,1))^2; % as an initialization for i=1
for i = 2 : n
    vv = vv + (x(i) - y * X(:,i))^2;
end
But I am wondering if I can compute that without a for loop in order to reduce the computational time, especially if n is very large. Are there any more efficient ways to do that in MATLAB?
Any help will be very appreciated!
You do not need the loop at all. Computing
for i = 1:n
    y*X(:,i)
end
column by column is the same as just y*X, so x(i) - y*X(:,i) becomes x - y*X, and the whole sum is simply:
vv = sum((x - y * X).^2);
Thanks to @beaker for pointing out the mistake.
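If you want to convince yourself, a quick sanity check with random data (a sketch, not part of the question) compares the loop with the vectorized expression:
% sketch: the loop and the vectorized form give the same value
n = 5;
x = rand(1, n);        % 1-by-n row vector
y = rand(1, 3);        % 1-by-3 row vector
X = rand(3, n);        % 3-by-n matrix
vv_loop = 0;
for i = 1:n
    vv_loop = vv_loop + (x(i) - y*X(:,i))^2;
end
vv_vec = sum((x - y*X).^2);
% vv_loop and vv_vec agree up to floating-point round-off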

How to solve a system of ordinary differential equations (ODEs) in MATLAB

I have to solve a system of ordinary differential equations of the form:
dx/ds = 1/x * [y* (g + s/y) - a*x*f(x^2,y^2)]
dy/ds = 1/x * [-y * (b + y) * f()] - y/s - c
where x and y are the variables I need to find, and s is the independent variable; the rest are constants. I've tried to solve this with ode45 with no success so far:
y = ode45(@yprime, s, [1 1]);
function dyds = yprime(s,y)
global g a v0 d
dyds_1 = 1./y(1) .*(y(2) .* (g + s ./ y(2)) - a .* y(1) .* sqrt(y(1).^2 + (v0 + y(2)).^2));
dyds_2 = - (y(2) .* (v0 + y(2)) .* sqrt(y(1).^2 + (v0 + y(2)).^2))./y(1) - y(2)./s - d;
dyds = [dyds_1; dyds_2];
return
where @yprime contains the system of equations. I get the following error message:
YPRIME returns a vector of length 0, but the length of initial
conditions vector is 2. The vector returned by YPRIME and the initial
conditions vector must have the same number of elements.
Any ideas?
thanks
You should certainly have a look at your function yprime. Here is an example using a simple model that shares the number of differential state variables with your problem.
function dyds = yprime(s, y)
dyds = zeros(2, 1);
dyds(1) = y(1) + y(2);
dyds(2) = 0.5 * y(1);
end
yprime must return a column vector that holds the values of the two right-hand sides. In this example the input argument s is ignored because the model is time-independent (yours is not, since it contains the terms s/y and y/s). Your problem is somewhat more involved in that it is not yet written in the form dy/ds = f(s, y) with a single state vector; you will have to rearrange your equations as a first step, and it will help to rename your x into y(1) and your y into y(2).
Also, are you sure that your global g a v0 d are not empty? If any one of those variables remains uninitialized, you will be multiplying state variables with an empty matrix, eventually resulting in an empty vector dyds being returned. This can be tested with
assert(~isempty(v0), 'v0 not initialized');
in yprime, or you could employ a debugging breakpoint.
The syntax for ODE solvers is [s, y] = ode45(@yprime, [1 10], [2 2]).
Also, you don't need elementwise operations in your case, i.e. instead of .* just use *.
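If the real problem turns out to be empty globals, one way around it is to pass the constants to yprime explicitly through an anonymous function. A minimal sketch (the constant values below are made-up placeholders, not the ones from your project):
% in a file yprime.m: no globals, the constants come in as extra arguments
function dyds = yprime(s, y, g, a, v0, d)
    dyds = zeros(2, 1);
    dyds(1) = (y(2)*(g + s/y(2)) - a*y(1)*sqrt(y(1)^2 + (v0 + y(2))^2)) / y(1);
    dyds(2) = -(y(2)*(v0 + y(2))*sqrt(y(1)^2 + (v0 + y(2))^2)) / y(1) - y(2)/s - d;
end
% and then, in a script or at the command line:
g = 9.81; a = 0.1; v0 = 1.0; d = 0.05;    % placeholder values only
[s, y] = ode45(@(s, y) yprime(s, y, g, a, v0, d), [1 10], [1 1]);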