Principal Component Analysis w/ Alternating Least Squares for Missing Data

Principal Component Analysis w/ Alternating Least Squares for Missing Data - matlab

In MATLAB R2014b there is a new function, pca(), that performs PCA that can handle missing data. In the documentation it says that it performs pca with the "alternating least squares" algorithm in order to estimate the missing values.
I would like to know if there are any practical references in how to apply PCA with this algorithm without the use of the function, or if there is a good reference on als. The reason is, there is no such function in Octave that can handle missing data and so I would like to code it myself.

Thanks for all your help. I went through the references and was able to find their matlab code on the als algorithm from two of the references. For anybody wondering, the source code can be found in these two links:
1) http://research.ics.aalto.fi/bayes/software/index.shtml
2) https://www.cs.nyu.edu/~roweis/code.html

Related

Matlab - output of the algorithm

I have a program using PSO algorithm using penalty function for Constraint Satisfaction. But when I run the program for different iterations, the output of the algorithm would be :
"Iteration 1: Best Cost = Inf"
.
Does anyone know why I always get inf answer?

There could be many reasons for that, none of which will be accurate if you don't provide a MWE with the code you have already tried or a context of the function you are analysing.
For instance, while studying the PSO algorithm you might use it on functions which have analytical solutions first. By doing this you can study the behaviour of the algorithm before applying to a similar problem, and fine tune its parameters.
My guess is that you might not be providing either the right function (I have done that already, getting a signal wrong is easy!), the right constraints (same logic applies), your weights for the penalty function and velocity update are way off.

alternatives to lsqlin MATLAB

Ok so I have a script that runs among other things lsqlin optimization function millions of times. To speed up this code I "codegen" it (basically automatically creates some mex files). This is a followup of Linear systems of inequations.
The problem here is that lsqlin as well as other optimization functions are not transformed and need to be called externally, which leads to loss of efficient.
I already found the MINQ toolbox but could not understand how to translate from lsqlin to this. Also found the QPC toolbox which requires a licence, which I am currently waiting.
Does anyone suggest another toolbox and how to convert from lsqlin to that?
General idea to codegen a lsqlin script (as can be seen a link is called and not a full conversion).
CODE:
function main_script()
coder.extrinsic('lsqlin_script')
for i=1:10^7
X=lsqlin_script(A,b,X0)
...
end
end
function X=lsqlin_script(A,b,X0)
X=lsqlin(eye(2),X0, A, b,[],[],[],[],X0, optimoptions('lsqlin','Display','Off'));
end
RUN:
codegen main_script.m
main_script_mex(INPUTS)

If you would describe your original problem I think you could expect more answers.
A possible approach to avoid lsqlin:
Calculate the orthogonal projection of Pxyz onto every plane defined by A and b.
Check whether the projections satisfy the inequality requirements. From those which satisfy select the closest point to Pxyz. If no valid point is found then the closest point will be on the intersection of planes. Calculate the shortest distance from Pxyz to every intersection lines... follow the steps applied for projection on planes..
As you can see it is not fully elaborated, you should work out the details if you think it may solve your problem.
For these calculations you do not need optimization function.

Functional form of 2D interpolation in Matlab

I need to construct an interpolating function from a 2D array of data. The reason I need something that returns an actual function is, that I need to be able to evaluate the function as part of an expression that I need to numerically integrate.
For that reason, "interp2" doesn't cut it: it does not return a function.
I could use "TriScatteredInterp", but that's heavy-weight: my grid is equally spaced (and big); so I don't need the delaunay triangularisation.
Are there any alternatives?

(Apologies for the 'late' answer, but I have some suggestions that might help others if the existing answer doesn't help them)
It's not clear from your question how accurate the resulting function needs to be (or how big, 'big' is), but one approach that you could adopt is to regress the data points that you have using a least-squares or Kalman filter-based method. You'd need to do this with a number of candidate function forms and then choose the one that is 'best', for example by using an measure such as MAE or MSE.
Of course this requires some idea of what the form underlying function could be, but your question isn't clear as to whether you have this kind of information.
Another approach that could work (and requires no knowledge of what the underlying function might be) is the use of the fuzzy transform (F-transform) to generate line segments that provide local approximations to the surface.
The method for this would be:
Define a 2D universe that includes the x and y domains of your input data
Create a 2D fuzzy partition of this universe - chosing partition sizes that give the accuracy you require
Apply the discrete F-transform using your input data to generate fuzzy data points in a 3D fuzzy space
Pass the inverse F-transform as a function handle (along with the fuzzy data points) to your integration function
If you're not familiar with the F-transform then I posted a blog a while ago about how the F-transform can be used as a universal approximator in a 1D case: http://iainism-blogism.blogspot.co.uk/2012/01/fuzzy-wuzzy-was.html
To see the mathematics behind the method and extend it to a multidimensional case then the University of Ostravia has published a PhD thesis that explains its application to various engineering problems and also provides an example of how it is constructed for the case of a 2D universe: http://irafm.osu.cz/f/PhD_theses/Stepnicka.pdf

If you want a function handle, why not define f=#(xi,yi)interp2(X,Y,Z,xi,yi) ?
It might be a little slow, but I think it should work.

If I understand you correctly, you want to perform a surface/line integral of 2-D data. There are ways to do it but maybe not the way you want it. I had the exact same problem and it's annoying! The only way I solved it was using the Surface Fitting Tool (sftool) to create a surface then integrating it.
After you create your fit using the tool (it has a GUI as well), it will generate an sftool object which you can then integrate in (2-D) using quad2d
I also tried your method of using interp2 and got the results (which were similar to the sfobject) but I had no idea how to do a numerical integration (line/surface) with the data. Creating thesfobject and then integrating it was much faster.
It was the first time I do something like this so I confirmed it using a numerically evaluated line integral. According to Stoke's theorem, the surface integral and the line integral should be the same and it did turn out to be the same.
I asked this question in the mathematics stackexchange, wanted to do a line integral of 2-d data, ended up doing a surface integral and then confirming the answer using a line integral!

What is the best way to implement a tree in matlab?

I want to write an implementation of a (not a binary) tree and and run some algorithms on it. The reason for using the matlab is that the rest of all programs are in matlab and it would be usful for some analysis and plotting. From an initial search in matlab i found that there aren't thing like pointers in matlab. So I'd like to know the best ( in terms on convinience) possible way to do this in matlab ? or any other ways ?

You can do this with MATLAB objects but you must make sure you use handle objects and not value objects because your nodes will contain cross-references to other nodes (i.e. parent, next sibling, first child).

This question is very old but still open. So I would just like to point readers to this implementation in plain MATLAB made by yours truly. Here is a tutorial that walks you through its use.

Matlab is very well suited to handle any kind of graphs (not only trees) represented as adjacency matrix or incidence matrix.
Matrices (representing graphs) can be either dense or sparse, depending on the properties of your graphs.
Last but not least, graph theory and linear algebra are in very fundamental ways related to each other see for example, so Matlab will be able to provide for you a very nice platform to harness such relationships.

ICA (Independent Component Analysis) in matlab

Could you provide an example of ICA Independent Component Analysis IN MATLAB?
I know PCA is implemented in matlab but ICA, what about RCA?

Have a look at the FastICA implementation. I've used their R version before, I assume the matlab implementation does the same thing... On that page you get a description of the algorithm and pointers to more info.

Dr G was right.
Now, you are able to find a complete and a very useful Matlab Package (works also with 2013a version):
FastICA
Also you can find a another ICA and PCA Matlab implementation package there: ICA/PCA. But I have no experience with it.

The topic is quite old, but it is worth mentioning that in 2017a, matlab introduced reconstruction independent component analysis (RICA), which may come in handy for someone searching for ICA.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Principal Component Analysis w/ Alternating Least Squares for Missing Data - matlab

Related

Matlab - output of the algorithm

alternatives to lsqlin MATLAB

Functional form of 2D interpolation in Matlab

What is the best way to implement a tree in matlab?

ICA (Independent Component Analysis) in matlab

Categories

Resources