Multi-dimensional output GPML? - matlab

I would like to use a multi-dimensional Gaussian model for regression. Rasmussen's book has an algorithm, but it only handles one-dimensional output. Any ideas on how to modify it?

First, I presume that you know about, and this is not what you want.
Second, I consequently presume that your problem involves several functions. In this case, for most purposes, you can just run your regression on each function separately; that is, unless you have some weird norm on the output space prescribed to you.

Let's say you want to model f(x,y) = [u , v]^T. You could model u and v separately:
f1(x,y) = u
f2(x,y) = v
This assumes that u and v are conditionally independent given x and y. However, GPML advises that u and v can remain correlated because of a correlating noise process. Consult Chapter 9 of GPML for approaches to this case.
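A minimal sketch of the "model each output separately" idea, written from scratch rather than with the GPML toolbox. The squared-exponential kernel, the length scale ell, the noise level sn and the toy data are all assumptions chosen for illustration. Because both outputs share the same inputs and kernel, one Cholesky factorization serves both columns.

```matlab
% Deterministic toy data: X holds the inputs (x, y), Y holds u and v as columns.
[gx, gy] = meshgrid(linspace(0, 1, 5));
X = [gx(:), gy(:)];                          % 25 training inputs
Y = [sin(2*pi*X(:,1)), X(:,1).*X(:,2)];      % column 1 = u, column 2 = v

% Squared-exponential kernel; ell and sn are assumed hyperparameters.
ell = 0.3;
sn  = 1e-2;
sqd = @(A, B) max(bsxfun(@plus, sum(A.^2, 2), sum(B.^2, 2)') - 2*(A*B'), 0);
k   = @(A, B) exp(-0.5 * sqd(A, B) / ell^2);

% Factorize the noisy kernel matrix once and reuse it for u and v.
K = k(X, X) + sn^2 * eye(size(X, 1));
L = chol(K, 'lower');
alpha = L' \ (L \ Y);                        % one column of weights per output

% Predictive means at two test inputs (the first one is a training point).
Xs = [0.25 0.5; 0.6 0.1];
mu = k(Xs, X) * alpha;                       % column 1 predicts u, column 2 predicts v
```

Each column of mu is exactly what a separate single-output regression on u or v would return with these hyperparameters; any correlation between u and v is ignored.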


Matlab: Solve for a single variable in a linear system of equations

I have a linear system of about 2000 sparse equations in Matlab. For my final result, I only really need the value of one of the variables: the other values are irrelevant. While there is no real problem in simply solving the equations and extracting the correct variable, I was wondering whether there was a faster way or Matlab command. For example, as soon as the required variable is calculated, the program could in principle stop running.
Is there anyone who knows whether this is at all possible, or if it would just be easier to keep solving the entire system?
Most of the computation time is spent inverting the matrix. If we can find a way to avoid completely inverting the matrix, we may be able to reduce the computation time. Let's assume I'm only interested in the solution for the last variable, x(N). Using the standard method we compute
x = A\b;
res = x(N);
Assuming A is full rank, we can instead use LU decomposition of the augmented matrix [A b] to get x(N) which looks like this
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
This is essentially performing Gaussian elimination and then solving for x(N) using back-substitution.
We can extend this to find any value of x by swapping the columns of A before LU decomposition,
x_index = 123; % the index of the solution we are interested in
A(:,[x_index,end]) = A(:,[end,x_index]);
[~,U] = lu([A b]);
res = U(end,end)/U(end,end-1);
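A small sanity check of the column-swap approach against the full solve. The matrix here is an assumed, diagonally dominant toy example (so it is guaranteed to be full rank), not the asker's system.

```matlab
% Small, strictly diagonally dominant system (assumed test data).
N = 6;
A = 4*eye(N) + diag(ones(N-1,1), 1) + diag(2*ones(N-1,1), -1);
b = (1:N)';
x_index = 3;                          % the index of the solution we are interested in

% Reference: solve the full system and pick out one entry.
x_full = A \ b;

% LU method: move the wanted unknown to the last column, then eliminate.
As = A;
As(:, [x_index, N]) = As(:, [N, x_index]);
[~, U] = lu([As b]);
res = U(end,end) / U(end,end-1);      % last row reads U(N,N)*x = transformed b(N)

err = abs(res - x_full(x_index));
```

The last row of U has a single nonzero coefficient in front of the transformed right-hand side, so one division gives the wanted unknown without the rest of the back-substitution.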
Benchmarking performance in MATLAB R2017a with 10,000 random 200-dimensional systems, we get a slight speed-up:
Total time direct method : 4.5401s
Total time LU method : 3.9149s
Note that you may experience some precision issues if A isn't well conditioned.
Also, this approach doesn't take advantage of the sparsity of A. In my experiments with 2000x2000 sparse matrices, everything slowed down significantly and the LU method was significantly slower than the direct method. That said, a full matrix representation only requires about 30MB, which shouldn't be a problem on most computers.
If you have access to theory manuals on NASTRAN, I believe (from memory) there is coverage of partial solutions of linear systems. Also try looking for iterative or tridiagonal solvers for A*x = b. On this page, review the pqr solution answer by Shantachhani. Another reference.

What is the difference between 'qr' and 'SVD' in Matlab to get the singular vectors of a matrix?

Specifically, the following two pieces of code ideally produce the same S and V. However, the second one is usually faster than the first in Matlab. Can someone tell me the reason?
Moreover, which method is more numerically stable?
[~,S,V] = svd(B,'econ');
[Qc,Rc] = qr(B',0);
[U,S,~] = svd(Rc,'econ');
V = Qc*U;
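To see that the two routes agree, the following sketch compares them on a small wide matrix. The test matrix is an arbitrary assumption. Individual singular vectors are only determined up to sign, so the check looks at the singular values and at the orthonormality of V rather than comparing V entry by entry.

```matlab
% Wide test matrix (assumed data): 4 rows, 10 columns.
B = reshape(sin(1:40), 4, 10);

% Route 1: economy-size SVD directly on B.
[~, S1, V1] = svd(B, 'econ');

% Route 2: thin QR of B' first, then SVD of the small triangular factor.
[Qc, Rc] = qr(B', 0);                 % B' = Qc*Rc, with Rc 4-by-4
[U2, S2, ~] = svd(Rc, 'econ');
V2 = Qc * U2;                         % right singular vectors of B

sv_diff   = max(abs(diag(S1) - diag(S2)));
orth_err  = norm(V2' * V2 - eye(size(V2, 2)));
col_norms = sqrt(sum((B * V2).^2, 1))';   % should equal the singular values
```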
The second method does not have to be faster. For almost square matrices it can be slower. Consider as an example the Golub-Reinsch SVD algorithm:
Its work depends on the output you want to calculate (only S; S and V; or S, V and U).
If you want to calculate S and V without performing any preprocessing, the required work is 4mn^2+8n^3.
If you perform a QR decomposition first, the work needed for the Householder transformation is 2/3n^3+n^2+1/3n-2. If your matrix is almost square, i.e. m ≈ n, you will not have gained much, as R is still m x n. However, if m is larger than n, you can reduce R to an n x n matrix (called thin QR factorization). Now you want to calculate U and S, which adds 12n^3 for your SVD algorithm.
So only SVD: 4mn^2+8n^3
SVD with QR: (12+2/3)n^3+n^2+1/3n-2
However, most SVD algorithms should include some (R-)bidiagonalization, which reduces the work to 2mn^2+11n^3.
You can also apply QR, the R-bidiagonalization, and then SVD to make it even faster, but it all depends on your matrix dimensions.
Matlab uses the LAPACK libraries for SVD. You can look up the exact runtimes here. They are approximately the same as for the algorithm above.
Hope this helps.

Matlab own fft2 without loops

I have a problem.
I have a task to write my own fft2 without using for-loops in Matlab.
There is a formula for computing this task:
F(u,v) = sum_{m=0}^{M-1} sum_{n=0}^{N-1} f(m,n) * e^(-i*2*pi*(u*m/M + v*n/N))
It is easy to do it with two for-loops but I have no idea how to do this without these loops, absolutely no idea.
We get no help from the teaching staff. They don't even give a hint or a reference to a book where we could read about it.
Now, I want to try to get help here.
Are you familiar with the matrix form of the DFT? Have a look here:
You can do something similar in order to get a matrix form for 2D DFT.
You need two transformation matrices. The first is an M-by-M DFT matrix Wm that operates on the columns of the M-by-N signal f, as explained in the link above. Next you need another N-by-N DFT matrix Wn that operates on the rows of f. Finally, your transformed signal is given by
F = Wm * f * Wn;
without any loops.
Note that the DFT matrix can also be constructed without a loop by using something like
Just a little correction in Thp's answer: (1:M)*((1:M)') is not the right way to create the matrix, but (1:M)'*(1:M) is the correct way.
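A sketch of the loop-free construction, assuming f is an M-by-N matrix. The variable names Wm and Wn follow the answer above; the test signal is made up. The index vectors start at 0 (0:M-1 rather than 1:M) so that the result matches the formula in the question and the built-in fft2.

```matlab
% Test signal (assumed data): M = 3 rows, N = 4 columns.
f = reshape(1:12, 3, 4);
[M, N] = size(f);

% DFT matrices built from outer products of the index vectors, no loops.
Wm = exp(-2i*pi/M * ((0:M-1)' * (0:M-1)));   % M-by-M, transforms the columns
Wn = exp(-2i*pi/N * ((0:N-1)' * (0:N-1)));   % N-by-N, transforms the rows

% 2-D DFT as two matrix products.
F = Wm * f * Wn;

% Compare with the built-in implementation.
err = max(max(abs(F - fft2(f))));
```

Both DFT matrices are symmetric, which is why Wn can be applied from the right without a transpose.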

Doing a PCA using an optimization in Matlab

I'd like to find the principal components of a data matrix X in Matlab by solving the optimization problem min||X-XBB'||, where the norm is the Frobenius norm, and B is an orthonormal matrix. I'm wondering if anyone could tell me how to do that. Ideally, I'd like to be able to do this using the optimization toolbox. I know how to find the principal components using other methods. My goal is to understand how to set up and solve an optimization problem which has a matrix as the answer. I'd very much appreciate any suggestions or comments.
The thing about Optimization is that there are different methods to solve a problem, some of which can require extensive computation.
Your solution, given the constraints for B, is to use fmincon. Start by creating a file for the non-linear constraints:
function [c,ceq] = nonLinCon(x)
c = [];  % no inequality constraints
ceq = norm(x'*x - eye(size(x,2)),'fro');  % zero exactly when the columns of x (i.e. B) are orthonormal
end
then call the routine:
B = fmincon(@(B) norm(X - X*B*B','fro'),B0,[],[],[],[],[],[],@nonLinCon)
with B0 being a good guess on what the answer will be.
Also, you need to understand that this algorithm tries to find a local minimum, which may not be the solution you ultimately want. For instance:
X = randn(1,2)
fmincon(@(B) norm(X - X*B*B','fro'),rand(2),[],[],[],[],[],[],@nonLinCon)
ans =
0.4904 0.8719
0.8708 -0.4909
fmincon(@(B) norm(X - X*B*B','fro'),rand(2),[],[],[],[],[],[],@nonLinCon)
ans =
0.9864 -0.1646
0.1646 0.9864
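So be careful when using these methods, and try to select a good starting point. Note that in the example above B is square, so B*B' is the identity for any orthonormal B and every feasible point attains the same objective value; which one fmincon returns depends on the starting point.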
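One way to judge whether an optimizer has reached the global minimum is to compare its objective value with the closed-form answer. For this objective, the first k right singular vectors of X are a minimizer, and the minimum equals the square root of the sum of the discarded squared singular values (Eckart-Young). The sketch below uses made-up data and an assumed k = 2; X is used as-is, without centering, to match the objective in the question.

```matlab
% Assumed toy data: 10 observations, 3 variables, keep k = 2 components.
X = reshape(sin(1:30), 10, 3);
k = 2;

% Closed-form minimizer: the first k right singular vectors of X.
[~, S, V] = svd(X, 'econ');
B = V(:, 1:k);

% Objective value at the closed-form solution.
obj = norm(X - X*B*B', 'fro');

% Eckart-Young: the minimum is the norm of the discarded singular values.
s = diag(S);
expected = sqrt(sum(s(k+1:end).^2));
```

Any B returned by an optimizer should give an objective value no smaller than expected. B itself is not unique: B*Q is equally good for any k-by-k orthogonal Q.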
The Statistics toolbox has a built-in function 'princomp' that does PCA (newer releases provide 'pca' instead). If you want to learn (in general, without the optimization toolbox) how to create your own code to do PCA, this site is a good resource.
Since you've specifically mentioned wanting to use the Optimization Toolbox and to set this up as an optimization problem, there is a very well-trusted 3rd-party package known as CVX from Stanford University that can solve the optimization problem you are referring to at this site.
Do you have the optimization toolbox? The documentation is really good, just try one of their examples:
But in general the optimization functions look like this:
[OptimizedMatrix, OptimizedObjectiveFunction] = optimize( @(MatrixToOptimize) MyObjectiveFunction(MatrixToOptimize), InitialConditionsMatrix, ...optional constraints and options... );
You must create MyObjectiveFunction() yourself, it must take the Matrix you want to optimize as an input and output a scalar value indicating the cost of the current input Matrix. Most of the optimizers will try to minimise this cost. Note that the cost must be a scalar.
fmincon() is a good place to start. Once you are used to the toolbox, you should, if you can, choose a more specific optimization algorithm for your problem.
To optimize a matrix rather than a vector, reshape the matrix to a vector, pass this vector to your objective function, and then reshape it back to the matrix within your objective function.
For example say you are trying to optimize the 3 x 3 matrix M. You have defined objective function MyObjectiveFunction(InputVector). Pass M as a vector:
And within the MyObjectiveFunction you must reshape M (if necessary) to be a matrix again:
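function cost = MyObjectiveFunction(InputVector)
InputMatrix = reshape(InputVector, [3 3]);  % restore the 3 x 3 shape
% Code that performs matrix operations on InputMatrix to produce a scalar cost goes here
cost = norm(InputMatrix, 'fro');  % placeholder cost; replace with your own scalar value
end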
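The reshape pattern can be tried end to end without the Optimization Toolbox. The sketch below swaps in fminsearch (a derivative-free solver that ships with base MATLAB) in place of fmincon, and uses a made-up objective: the squared Frobenius distance to an assumed target matrix T. Only the vector-to-matrix round trip is the point here.

```matlab
% Assumed target; the optimizer should recover it.
T = [1 2; 3 4];

% The objective takes a vector, reshapes it to a matrix, and returns a scalar cost.
objective = @(v) norm(reshape(v, [2 2]) - T, 'fro')^2;

% Pass the initial matrix as a vector using (:).
M0 = ones(2, 2);
opts = optimset('TolX', 1e-8, 'TolFun', 1e-8, 'MaxFunEvals', 5000, 'MaxIter', 5000);
vOpt = fminsearch(objective, M0(:), opts);

% Reshape the optimized vector back into matrix form.
Mopt = reshape(vOpt, [2 2]);
```

reshape and (:) both use column-major order, so the round trip preserves the position of every entry.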

gmdistribution for classification in Matlab

Let's assume I have two gmdistribution models that I obtained using,1);,1);
Now I have an unknown 'data' observation, and I want to see if it belongs to data1 or data2.
Based on my understanding of these functions, the nlogl output of the posterior, cluster, or pdf commands wouldn't be a good measure, since I am comparing 'data' to two different distributions.
What measure or output should I use to find p(data|modeldata1) and p(data|modeldata2)?
Many thanks,
If I understand you correctly, you want to assign a new, unknown, datapoint to either class 1 or class 2 with the descriptors for each class (in this case the mean vector and covariance matrix) found by
On seeing this new datapoint, let's call it x, you should ask yourself what
p(modeldata1 | x) and p(modeldata2 | x) are, and assign x to whichever of these is the highest.
So how do you find these? You just apply Bayes' rule and pick whichever is the largest of:
p(modeldata1 | x) = p(x|modeldata1)p(modeldata1)/p(x)
p(modeldata2 | x) = p(x|modeldata2)p(modeldata2)/p(x)
Here you don't need to calculate p(x), as it is the same in each equation.
So, now you estimate the priors p(modeldata1) and p(modeldata2) by the number of training points from each class (or use some given information) and then calculate
p(x|modeldata1) = 1/((2*pi)^(d/2) * sqrt(det(Sigma1))) * exp(-0.5*(x-mu1)'/Sigma1*(x-mu1))
where d is the dimensionality of your data, Sigma1 is a covariance matrix, and mu1 is a mean vector (x and mu1 are column vectors). This is then the p(data|modeldata1) you asked for. (Just remember to also use p(modeldata1) and p(modeldata2) when you do the classification.)
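The rule above can be sketched in a few lines without any toolbox functions. The means, covariances, priors and the test point below are made-up values for illustration; in practice they would come from the two fitted single-component models and the class frequencies.

```matlab
% Assumed class descriptors (one Gaussian per class) and equal priors.
mu1 = [0; 0];  Sigma1 = eye(2);
mu2 = [3; 3];  Sigma2 = [2 0.5; 0.5 1];
prior1 = 0.5;  prior2 = 0.5;

% Multivariate Gaussian density for column vectors x and mu.
gauss = @(x, mu, Sigma) exp(-0.5 * (x - mu)' / Sigma * (x - mu)) / ...
        ((2*pi)^(numel(mu)/2) * sqrt(det(Sigma)));

% New observation to classify.
x = [0.5; 0.2];

% Unnormalized posteriors: likelihood times prior (p(x) cancels out).
score1 = gauss(x, mu1, Sigma1) * prior1;
score2 = gauss(x, mu2, Sigma2) * prior2;

% Normalized posterior for class 1 and the resulting decision.
post1 = score1 / (score1 + score2);
if score1 >= score2
    label = 1;
else
    label = 2;
end
```

In higher dimensions the densities can underflow, in which case comparing log-densities plus log-priors is the safer variant of the same rule.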
I know this was a bit unclear, but hopefully it can help you with a step in the right direction.
EDIT: Personally, I find a visualization such as the one below helpful (taken from Pattern Recognition by Theodoridis and Koutroumbas). Here you have two Gaussian mixtures with some priors and different covariance matrices. The blue area is where you would choose one class, while the gray area is where the other would be chosen.