Doing a PCA using an optimization in Matlab

I'd like to find the principal components of a data matrix X in Matlab by solving the optimization problem min||X-XBB'||, where the norm is the Frobenius norm, and B is an orthonormal matrix. I'm wondering if anyone could tell me how to do that. Ideally, I'd like to be able to do this using the optimization toolbox. I know how to find the principal components using other methods. My goal is to understand how to set up and solve an optimization problem which has a matrix as the answer. I'd very much appreciate any suggestions or comments.
Thanks!
MJ

Optimization problems can be solved with many different methods, some of which require extensive computation. Given the orthonormality constraint on B, your solution is to use fmincon. Start by creating a file for the nonlinear constraints:
function [c,ceq] = nonLinCon(x)
c = [];                                  % no inequality constraints
ceq = norm(x'*x - eye(size(x)),'fro');   % zero exactly when B is orthonormal
then call the routine:
B = fmincon(@(B) norm(X - X*B*B','fro'),B0,[],[],[],[],[],[],@nonLinCon)
with B0 being a good guess on what the answer will be.
Also, you need to understand that this algorithm tries to find a local minimum, which may not be the solution you ultimately want. For instance:
X = randn(1,2)
fmincon(@(B) norm(X - X*B*B','fro'),rand(2),[],[],[],[],[],[],@nonLinCon)
ans =
0.4904 0.8719
0.8708 -0.4909
fmincon(@(B) norm(X - X*B*B','fro'),rand(2),[],[],[],[],[],[],@nonLinCon)
ans =
0.9864 -0.1646
0.1646 0.9864
So be careful when using these methods, and try to select a good starting point.
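As a sanity check (a minimal sketch of my own; the variable names and the rectangular-B constraint below are not part of the answer above), you can compare the subspace fmincon finds with the one given by the SVD of X, since the PCA loadings are the leading right singular vectors of the centered data:
% Sketch: recover the first k principal directions of X and compare with the SVD.
X = randn(100,5);
X = X - mean(X,1);                              % center the data first
k = 2;                                          % number of components sought
B0 = orth(randn(5,k));                          % orthonormal starting guess
nonlcon = @(B) deal([], B'*B - eye(k));         % B'*B = I_k for rectangular B
opts = optimoptions('fmincon','Display','off');
B = fmincon(@(B) norm(X - X*B*B','fro'), B0, ...
            [],[],[],[],[],[], nonlcon, opts);
[~,~,V] = svd(X,'econ');
% B and V(:,1:k) may differ by a rotation/sign, so compare the projectors instead:
norm(B*B' - V(:,1:k)*V(:,1:k)','fro')           % near zero at the global minimum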

The Statistics Toolbox has a built-in function, princomp (superseded by pca in newer releases), that does PCA. If you want to learn how to write your own PCA code (in general, without the Optimization Toolbox), this site is a good resource.
Since you've specifically mentioned wanting to use the Optimization Toolbox and to set this up as an optimization problem, note that there is a well-trusted third-party package, CVX from Stanford University, that can solve the kind of optimization problem you are referring to.

Do you have the optimization toolbox? The documentation is really good, just try one of their examples: http://www.mathworks.com/help/toolbox/optim/ug/brg0p3g-1.html.
But in general, a call to an optimization routine looks like this (pseudocode):
[OptimizedMatrix, OptimizedObjectiveValue] = optimizer(@(MatrixToOptimize) MyObjectiveFunction(MatrixToOptimize), InitialConditionsMatrix, ... optional constraints and options ... );
You must create MyObjectiveFunction() yourself; it must take the matrix you want to optimize as input and return a single scalar value indicating the cost of that input matrix. Most of the optimizers try to minimise this cost.
fmincon() is a good place to start; once you are used to the toolbox, choose a more specific optimization algorithm for your problem if you can.
To optimize a matrix rather than a vector, reshape the matrix to a vector, pass this vector to your objective function, and then reshape it back to the matrix within your objective function.
For example, say you are trying to optimize a 3 x 3 matrix M and you have defined the objective function MyObjectiveFunction(InputVector). Pass M as a vector:
MyObjectiveFunction(M(:));
And within MyObjectiveFunction you must reshape the input vector (if necessary) back into a matrix:
function cost = MyObjectiveFunction(InputVector)
InputMatrix = reshape(InputVector, [3 3]);
% Code that performs matrix operations on InputMatrix to produce a scalar cost,
% for example:
cost = norm(InputMatrix, 'fro');
end
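A minimal end-to-end sketch of this reshape pattern (the target matrix, function handle, and starting guess below are my own illustrative choices, not from the text above):
% Optimize a 3 x 3 matrix by passing it around as a 9 x 1 vector.
target = magic(3);                                    % fixed matrix to match
objfun = @(v) norm(reshape(v,[3 3]) - target,'fro');  % reshape inside the objective
v0 = zeros(9,1);                                      % starting guess as a vector
vOpt = fminunc(objfun, v0);
MOpt = reshape(vOpt, [3 3]);                          % back to matrix form
Here fminunc minimises the Frobenius distance to target, so MOpt should come back close to magic(3).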

Related

Minimizing Function with vector valued input in MATLAB

I want to minimize a function like the one below (the generalized Rosenbrock function):
f(x) = sum_{i=1}^{n-1} [ 100*(x_{i+1} - x_i^2)^2 + (1 - x_i)^2 ]
Here, n can be 5, 10, 50, etc. I want to use Matlab with Gradient Descent and a Quasi-Newton method with BFGS updates to solve this problem, along with a backtracking line search. I am a novice in Matlab. Can anyone help, please? I can find a solution for a similar problem at this link: https://www.mathworks.com/help/optim/ug/unconstrained-nonlinear-optimization-algorithms.html .
But I really don't know how to create a vector-valued function in Matlab (in my case the input x can be an n-dimensional vector).
You will have to make quite a leap to get where you want to be -- may I suggest going through a basic tutorial first, in order to digest basic MATLAB syntax and concepts? Another useful read is the very basic example of unconstrained optimization in the documentation. However, the answer to your question touches only basic syntax, so we can go through it quickly nevertheless.
The absolute minimum needed to invoke the unconstrained nonlinear optimization algorithms of the Optimization Toolbox is the formulation of an objective function. That function is supposed to return the function value f of your function at any given point x, and in your case it reads
function f = objfun(x)
f = sum(100 * (x(2:end) - x(1:end-1).^2).^2 + (1 - x(1:end-1)).^2);
end
Notice that
we select the individual components of the x vector by indexing, and that
the .^ operator squares each element of its operand.
For simplicity, save this function to a file objfun.m in your current working directory, so that you have it available from the command window.
Now all you have to do is call the appropriate optimization algorithm, say the quasi-Newton method, from the command window:
n = 10; % Use n variables
options = optimoptions(@fminunc,'Algorithm','quasi-newton'); % Use quasi-Newton method
x0 = rand(n,1); % Random starting guess
[x,fval,exitflag] = fminunc(@objfun, x0, options); % Solve!
fprintf('Final objval=%.2e, exitflag=%d\n', fval, exitflag);
On my machine I see that the algorithm converges:
Local minimum found.
Optimization completed because the size of the gradient is less than
the default value of the optimality tolerance.
Final objval=5.57e-11, exitflag=1
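If you later want the quasi-Newton iterations to use the exact gradient instead of finite differences, one possible extension (a sketch of my own, not part of the original answer) is to return the gradient as a second output and enable SpecifyObjectiveGradient:
function [f, g] = objfungrad(x)
% Objective and analytic gradient of the generalized Rosenbrock function.
d = x(2:end) - x(1:end-1).^2;
f = sum(100*d.^2 + (1 - x(1:end-1)).^2);
g = zeros(size(x));
g(1:end-1) = -400*x(1:end-1).*d - 2*(1 - x(1:end-1));
g(2:end)   = g(2:end) + 200*d;
end
and then call it with
options = optimoptions(@fminunc,'Algorithm','quasi-newton', ...
                       'SpecifyObjectiveGradient',true);
[x,fval,exitflag] = fminunc(@objfungrad, rand(n,1), options);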

Matlab: Solve for a single variable in a linear system of equations

I have a linear system of about 2000 sparse equations in Matlab. For my final result, I only really need the value of one of the variables: the other values are irrelevant. While there is no real problem in simply solving the equations and extracting the correct variable, I was wondering whether there was a faster way or Matlab command. For example, as soon as the required variable is calculated, the program could in principle stop running.
Is there anyone who knows whether this is at all possible, or if it would just be easier to keep solving the entire system?
Most of the computation time is spent factorizing the matrix and solving the full system, so if we can avoid a complete solve we may be able to improve the computation time. Let's assume I'm only interested in the solution for the last variable, x(N). Using the standard method we compute
x = A\b;
res = x(N);
Assuming A is full rank, we can instead use LU decomposition of the augmented matrix [A b] to get x(N), which looks like this:
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
This is essentially performing Gaussian elimination and then solving for x(N) using back-substitution.
We can extend this to find any value of x by swapping the columns of A before LU decomposition,
x_index = 123; % the index of the solution we are interested in
A(:,[x_index,end]) = A(:,[end,x_index]);
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
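A quick self-check (my own snippet, not part of the original answer) that the column-swap shortcut returns the same component as a full solve:
N = 200;
A = randn(N) + N*eye(N);            % well-conditioned full-rank test matrix
b = randn(N,1);
x_index = 123;
x_full = A\b;                       % reference solution
A(:,[x_index,end]) = A(:,[end,x_index]);
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
abs(res - x_full(x_index))          % should be on the order of machine precision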
Benchmarking performance in MATLAB R2017a with 10,000 random 200-dimensional systems, we get a slight speed-up:
Total time direct method : 4.5401s
Total time LU method : 3.9149s
Note that you may experience some precision issues if A isn't well conditioned.
Also, this approach doesn't take advantage of the sparsity of A. In my experiments, even with 2000x2000 sparse matrices everything slowed down considerably, and the LU method was significantly slower than the direct one. That said, a full matrix representation of that size only requires about 30MB, which shouldn't be a problem on most computers.
If you have access to the theory manuals for NASTRAN, I believe (from memory) there is coverage of partial solutions of linear systems. Also try looking for iterative or tridiagonal solvers for A*x = b. On this page, review the pqr solution answer by Shantachhani. Another reference.

Solve Ax = b using MATLAB

I have a linear system of equations Ax = b to solve in MATLAB. What I know is that A is sparse, positive definite and symmetric. I know the command x = A \ b works, yet I am not sure MATLAB takes full advantage of A's good properties so as to maximize efficiency. Is there any way to specify the algorithm used to solve it, for example the Conjugate Gradient algorithm, in MATLAB?
If your matrix is sparse, you can use MATLAB's iterative solvers, for example bicg for a biconjugate gradients method; since your A is symmetric positive definite, pcg (preconditioned conjugate gradients) is the natural fit.
MATLAB's mldivide operator does indeed take advantage of properties of A. See the documentation for details - expand the "Algorithm" section.
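A minimal sketch of the conjugate-gradient route (the test matrix, tolerance, iteration limit, and incomplete-Cholesky preconditioner below are my own illustrative choices):
% Solve a sparse symmetric positive-definite system with pcg.
A = gallery('poisson', 32);                  % sparse SPD test matrix (1024 x 1024)
b = rand(size(A,1), 1);
L = ichol(A);                                % incomplete Cholesky preconditioner
[x, flag, relres, iter] = pcg(A, b, 1e-8, 200, L, L');
fprintf('flag=%d, relres=%.2e, iter=%d\n', flag, relres, iter);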

vector valued limits in matlab integral

Is it possible to use vector limits with any MATLAB integration function? I have to avoid for loops because of the speed of my program. Can you please give me a clue on how to do this?
k=0:5
f=@(x)x^2
quad(f,k,k+1)
If somebody needs it, I found the answer to my question: quad with vector limit
I will try giving you an answer, based on my experience with the quad function.
Starting from this:
k=0:5;
f=@(x) x.^2;
Notice the difference in your f definition (incorrect) and mine (correct).
If you only mean to integrate f within the range (0,5) you can easily call
quad(f,k(1),k(end))
Without a function handle, you can get an approximation of the same integral in a different way, by making use of trapz:
x = 0:5;
y = x.^2;
trapz(x,y)
If, instead, you mean to perform a step-by-step integration in the small range [k(i),k(i+1)] you may type
arrayfun(@(ii) quad(f,k(ii),k(ii+1)),1:numel(k)-1)
For the sake of convenience, notice that (up to numerical tolerance)
sum(arrayfun(@(ii) quad(f,k(ii),k(ii+1)),1:numel(k)-1)) == quad(f,k(1),k(end))
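In newer MATLAB releases quad is discouraged in favour of integral; the same piecewise idea carries over directly (a small sketch of my own):
k = 0:5;
f = @(x) x.^2;
piecewise = arrayfun(@(ii) integral(f, k(ii), k(ii+1)), 1:numel(k)-1);
total     = integral(f, k(1), k(end));   % equals sum(piecewise) up to tolerance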
I hope this helps.

matlab genetic algorithm solver complex input and output

My Matlab program has multiple inputs as a struct (in.a, in.b, etc.)
and multiple outputs (out.a, out.b, etc.)
I would like to use the genetic algorithm solver from the Optimization Toolbox to find the best input in.a, while all the other inputs are constant. The fitness is one of the outputs, e.g. out.b(2,3).
How do I "tell" the solver this?
Thanks
Daniel
It is not uncommon in programming to have a situation where what is most convenient for your function and what some library call expects of it don't agree. The normal resolution to such a problem is to write a small layer in between that allows the two to talk; an interface.
From help ga:
X = GA(FITNESSFCN,NVARS) finds a local unconstrained minimum X to the
FITNESSFCN using GA. [...] FITNESSFCN accepts a vector X of size
1-by-NVARS, and returns a scalar evaluated at X.
So, ga expects vector input, scalar output, whereas you have a structure going in and out. You would have to write the following (sub)function:
function Y = wrapper_Objfun(X, in)
    in.a = X;                        % variable being optimized
    out  = YOUR_REAL_FUNCTION(in);   % call to your actual function
    Y    = out.b(2,3);               % objective value
end
and then the call to ga will look like
X = ga(@(x) wrapper_Objfun(x,in), N);
where N is however large in.a should be.
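If in.a also needs to stay within a range, the same call can carry lower and upper bounds (a sketch; the bound values and options below are placeholders of my own):
lb = zeros(1,N);               % hypothetical lower bounds on in.a
ub = 10*ones(1,N);             % hypothetical upper bounds on in.a
opts = optimoptions('ga','Display','iter');
X = ga(@(x) wrapper_Objfun(x,in), N, [],[],[],[], lb, ub, [], opts);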
Also have a read about it in Matlab's own documentation on the subject.