Feature Selection methods in MATLAB? - matlab

I am trying to do some text classification with SVMs in MATLAB and really would to know if MATLAB has any methods for feature selection(Chi Sq.,MI,....), For the reason that I wan to try various methods and keeping the best method, I don't have time to implement all of them. That's why I am looking for such methods in MATLAB.Does any one know?

svmtrain
MATLAB has other utilities for classification like cluster analysis, random forests, etc.
If you don't have the required toolbox for svmtrain, I recommend LIBSVM. It's free and I've used it a lot with good results.

The Statistics Toolbox has sequentialfs. See also the documentation on feature selection.

A similar approach is dimensionality reduction. In MATLAB you can easily perform PCA or Factor analysis.
Alternatively you can take a wrapper approach to feature selection. You would search through the space of features by taking a subset of features each time, and evaluating that subset using any classification algorithm you decide (LDA, Decision tree, SVM, ..). You can do this as an exhaustively or using some kind of heuristic to guide the search (greedy, GA, SA, ..)
If you have access to the Bioinformatics Toolbox, it has a randfeatures function that does a similar thing. There's even a couple of cool demos of actual use cases.

May be this might help:
There are two ways of selecting the features in the classification:
Using fselect.py from libsvm tool directory (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/#feature_selection_tool)
Using sequentialfs from statistics toolbox.
I would recommend using fselect.py as it provides more options - like automatic grid search for optimum parameters (using grid.py). It also provides an F-score based on the discrimination ability of the features (see http://www.csie.ntu.edu.tw/~cjlin/papers/features.pdf for details of F-score).
Since fselect.py is written in python, either you can use python interface or as I prefer, use matlab to perform a system call to python:
system('python fselect.py <training file name>')
Its important that you have python installed, libsvm compiled (and you are in the tools directory of libsvm which has grid.py and other files).
It is necessary to have the training file in libsvm format (sparse format). You can do that by using sparse function in matlab and then libsvmwrite.
xtrain_sparse = sparse(xtrain)
libsvmwrite('filename.txt',ytrain,xtrain_sparse)
Hope this helps.
For sequentialfs with libsvm, you can see this post:
Features selection with sequentialfs with libsvm

Related

Can MATLAB fcnLayers function be compared with MatConvNet?

In order to leverage FCN facilities for an image processing project, firstly I headed to use MatConvNet. However, in the preparation steps I found that MATLAB provided a new function (fcnLayers) to do so.
Can fcnLayers and its functionality be compared with MatConvNet? Specifically, I mean that is it possible to train models or use pre-trained ones?
Finally, may I achieve same result by using each of them?

How can I train SVM in Matlab, with more than 2 classes

I was trying to use fitcsvm to train and classify my data. However, I notice - correct me if I'm wrong - that fitcsvm could only be used with 2 classes (groups).
My data have more than 2 classes. Is there away to do classify them in matlab?
I did some googling and I read that some recommend to use fitcecoc, while others recommend to use out of the box code multisvm
Morover, other recommend to use discriminant analysis
Please, advise on best approach to go.
you are correct, fitcsvm is for one or two classes, you may use svmtrain which is matlab's svm classifier for more then two classes, also there is a famous toolbox named libsvm, if you google it would be found easily.
https://github.com/cjlin1/libsvm
Recently I saw some new method for multiply svm classifer named DSVM, it's nice new method,this would be found in matlab's files exchange.
http://www.mathworks.com/matlabcentral/fileexchange/48632-multiclass-svm-classifier
good luck

svmtrain function in matlab never exits ... do alternatives exist?

I am trying to learn how to use support vector machines in matlab. I have the bioinformatics toolbox, which has SVM functions svmtrain and svmclassify.
I managed to successfully use it for some reference data sets, with some nice accuracy. When I try to use the svm on my actual data the training never stops. My data set is 400 instances in 25 dimensions, so it should not take very long?!
Can I use other solvers in matlab? I dont want to buy new toolbox please ...
There are several things that may cause problems for training, but it should not run infinitely. Do you get any errors when using the solver?
With regard to alternatives: LIBSVM has an interface to matlab. This is a state-of-the-art library with thousands of users. I highly recommend it, because it is easy to install/use and offers additional functionality for parameter tuning and more.

Creating a classifier in MATLAB to be used with classperf

I'm working on a new model and would like to use classperf to check the performance of my classifier. How do I make it use my classifier as opposed to one of the built-in ones? All the examples I found online use classifiers that are included in MATLAB. I want to use K-fold to test it.
It isn't clear form the MATLAB documentation how to do this, though you can edit functions like knnclassify or svmclassify to see how they were written, and try to emulate that functionality.
Alternatively, there's a free MATLAB pattern recognition toolbox that uses objects to represent classifiers:
http://www.mathworks.com/matlabcentral/linkexchange/links/2947-pattern-recognition-toolbox
And you can make a new classifier by sub-classing the base classifier object: prtClass.
Then you can do:
c = myClassifier;
yGuess = c.kfolds(dataSet,10); %10 fold X-val
(Full disclosure, I'm an author of the PRT toolbox)

Which is a better method? libsvm or svmclassify?

I have been recently trying to use svm for feature classification. While i was doing so, a question came to my mind.
Which would be a better method to use, LIBSVM or svmclassify? What I mean by svmclassify is to use in-built functions in MATLAB such as svmtrain and svmclassify. In that sense, I was interested to know which method would be more accurate and which would be easier to use.
Since MATLAB has already the Bioinformatics toolbox already, why would you use LIBSVM? Aren't the functions like svmtrain and svmclassify already built in.. what additional benefits does LIBSVM bring about?
I would like to hear some of your opinions. Please Pardon me if the question is stupid..
I expect you would get very similar result using each library.
They are both very easy to use. The only big difference is that one comes with the MATLAB Bioinformatics toolbox and the other one you need to obtain from the authors web site and install by hand. If to you this is an issue I would recommend you stick to what is already installed in your computer. If not consider using LIBSVM, as it is a very well tested and well regarded library.
Also, from personal experience on playing with both, libSVM is much faster than MATLAB svm routines for obvious reasons. Last but not the least, libSVM has MATLAB plugins which can be called from MATLAB if you are more comfortable within a MATLAB environment.
I have also the same question, but I think that Libsvm is very useful and very easy in the case of multi-classes classification , but the matlab toolbox is designed for only two classes classification.
In my experience the libsvm performed giving cross validaion results as 45% where matlab code did 90%. So I looked up the explanation of matlab function for svm where they had such options related with perceptrones, I wonder if they are using pure svm or not but will write again in my case matlab was much better. (multiclass svm)