I have a dataset with multiple labeled vectors and I wanted to perform a multi-class SVM with RBF Kernel with the integrated function in MATLAB called 'templateSVM'.
To do so, I use the templateSVM function with the following command:
t = templateSVM('BoxConstraint', 1, 'KernelFunction', 'rbf')
The problem is that I cannot find how to set the 'sigma' parameter.
From previous computations, I know that C=1 and sigma=8 give the best results. Not knowing how to set sigma leads to awful results.
Would you know how to set this parameter?
Thanks a lot in advance.
Unfortunately the options available with templateSVM seem to be quite limited (I had this problem myself and couldn't find a solution). There are some crucial options (such as the RBF sigma parameter) that do not seem to be available with templateSVM but are available with svmtrain.
I know that this isn't a real answer to your question, but I suggest that you look into using libsvm instead - it is very configurable and integrates well with Matlab.
I know it's an old question, but the answer would be useful for new users.
The link below answers the question:
https://www.mathworks.com/matlabcentral/answers/336748-support-vector-machine-parameters-matlab
"setting SIGMA": Use the 'KernelScale' name-value pair.
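To make that concrete, here is a minimal sketch of the multi-class case from the question. It assumes the Statistics and Machine Learning Toolbox and uses the fisheriris sample data as a stand-in for the asker's dataset. Wrapping the template in fitcecoc is my assumption; the question does not say which multi-class function was used.

```matlab
% Sketch only: fisheriris replaces the asker's data, fitcecoc is an assumed wrapper.
rng(0);                        % reproducible cross-validation folds
load fisheriris                % meas: 150x4 predictors, species: 150x1 labels

C     = 1;                     % box constraint from the question
sigma = 8;                     % RBF width from the question

% 'KernelScale' is the name-value pair that plays the role of sigma
t = templateSVM('KernelFunction', 'rbf', ...
                'BoxConstraint', C, ...
                'KernelScale', sigma);

% One-vs-one ECOC model built from the SVM template
Mdl = fitcecoc(meas, species, 'Learners', t);

% 5-fold cross-validated misclassification rate
CVMdl = crossval(Mdl, 'KFold', 5);
err   = kfoldLoss(CVMdl);
fprintf('Cross-validated error: %.3f\n', err);
```

One caveat, as far as I understand the documentation: KernelScale divides the predictors by s before the kernel is applied, so the Gaussian kernel becomes exp(-||x-z||^2 / s^2). If your sigma=8 was tuned with a tool that uses exp(-||x-z||^2 / (2*sigma^2)), the matching value would be s = sqrt(2)*sigma rather than 8, so it is worth checking which convention your earlier computations used.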
Related
I have been all over Google trying to find a good function/package to perform multivariate regression (i.e. predict multiple continuous variables given another set of multiple continuous variables).
I wish to use something like fitlm(), since that also gives me p-value statistics and R squared statistics. Does anything like that exist?
Matlab has a bundle of tools for this, see this page.
I believe that mvregress is the most rounded and mainstream tool. See this page for setting up an analysis with it.
Also, a comment in this post may be useful for alternatives, if needed: it is possible to approach this via separate regression analyses, one for each response variable.
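A minimal sketch of that last suggestion, assuming the Statistics and Machine Learning Toolbox. The data are synthetic and the coefficient matrix B is made up purely for illustration. Each response gets its own fitlm model, so you keep the p-values and R-squared values the question asks for.

```matlab
% Sketch only: synthetic data, one fitlm model per response column.
rng(0);                                % reproducible synthetic data
n = 100;
X = randn(n, 3);                       % three continuous predictors
B = [1 -2; 0.5 0; 0 3];                % made-up coefficients (3 predictors x 2 responses)
Y = X*B + 0.1*randn(n, 2);             % two continuous responses

nResp  = size(Y, 2);
models = cell(1, nResp);
for j = 1:nResp
    models{j} = fitlm(X, Y(:, j));     % fitlm adds the intercept automatically
    fprintf('Response %d: R^2 = %.3f\n', j, models{j}.Rsquared.Ordinary);
    disp(models{j}.Coefficients);      % Estimate, SE, tStat and pValue per term
end
```

Note that separate fits ignore any correlation between the errors of the different responses, so they give no joint tests across responses. If you need those, mvregress is the place to look.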
I know that fitcsvm is a new command in recent versions of MATLAB, and the latest documentation says that svmtrain will be removed. Are the two commands the same? I have noticed that they give different results in my recent work. Can anyone help me with this strange problem?
According to my experiments, the two functions are different. By default, fitcsvm takes the empirical class distribution into account, i.e. the prior is derived from the number of positive and negative samples. svmtrain, however, just takes this distribution to be [0.5 0.5], as if there were no prior knowledge.
The difference may also depend on whether the data have been standardized; see the SVM documentation for more on this.
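If you want to check the effect of the prior yourself, here is a small sketch, assuming the Statistics and Machine Learning Toolbox. It uses an unbalanced two-class subset of fisheriris as placeholder data and compares the default fit with one that forces a uniform prior. I also switch on 'Standardize' in the second model because, as far as I recall, svmtrain scaled the data by default while fitcsvm does not; treat that as an assumption to verify against your release.

```matlab
% Sketch only: unbalanced two-class subset of fisheriris as placeholder data.
load fisheriris
keep = [51:100, 101:130];              % 50 versicolor + 30 virginica
X = meas(keep, :);
Y = species(keep);

% Default: prior taken from the class frequencies ('empirical')
MdlDefault = fitcsvm(X, Y);

% Closer to the behaviour described for svmtrain: equal priors, scaled data
MdlUniform = fitcsvm(X, Y, 'Prior', 'uniform', 'Standardize', true);

disp(MdlDefault.Prior);                % class proportions in the training set
disp(MdlUniform.Prior);                % equal weight for both classes
```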
From the fitcsvm documentation:
fitcsvm and svmtrain use, among other algorithms, SMO for optimization. The software implements SMO differently between the two functions, but numerical studies show that there is sensible agreement in the results.
From the Wikipedia article on sequential minimal optimization:
SMO is an iterative algorithm for solving the optimization problem ...
Could anyone please provide an example showing how ANCOVA (analysis of covariance) can be done with SciPy/statsmodels in Python?
I am not sure if I am asking too much, but a quick search showed me this which is not informative enough for me.
Thanks!
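Statsmodels uses the linear model, OLS, to estimate ANOVA, so having additional continuous regressors as in ANCOVA does not change the analysis. For example, with the formula interface you can fit ols('y ~ C(group) + x', data=df).fit() and pass the result to anova_lm, where y, group, x and df stand for your own response, factor, covariate and DataFrame.
Here are a few links to the relevant documentation: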
Anova helper functions and examples for ANCOVA interactions
http://statsmodels.sourceforge.net/devel/examples/generated/example_interactions.html
using formulas to create the design matrix
http://statsmodels.sourceforge.net/devel/example_formulas.html
the core OLS model
http://statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.OLS.html
I have recently been trying to use SVM for feature classification. While doing so, a question came to my mind.
Which would be the better method to use, LIBSVM or svmclassify? By svmclassify I mean the built-in MATLAB functions such as svmtrain and svmclassify. I would like to know which method is more accurate and which is easier to use.
Since MATLAB already has the Bioinformatics toolbox, why would you use LIBSVM? Aren't functions like svmtrain and svmclassify already built in? What additional benefits does LIBSVM bring?
I would like to hear some of your opinions. Please pardon me if the question is stupid.
I expect you would get very similar results using either library.
They are both very easy to use. The only big difference is that one comes with the MATLAB Bioinformatics toolbox, while the other you need to obtain from the authors' website and install by hand. If that is an issue for you, I would recommend you stick to what is already installed on your computer. If not, consider using LIBSVM, as it is a very well tested and well regarded library.
Also, from personal experience playing with both, LIBSVM is much faster than the MATLAB SVM routines. Last but not least, LIBSVM has MATLAB plugins that can be called from MATLAB if you are more comfortable within a MATLAB environment.
I also have the same question, but I think that LIBSVM is very useful and very easy to use for multi-class classification, whereas the MATLAB toolbox functions are designed for two-class classification only.
In my experience (multi-class SVM), LIBSVM gave cross-validation results of 45% where the MATLAB code reached 90%. I looked up the documentation of the MATLAB SVM function and it had options related to perceptrons, so I wonder whether it is a pure SVM or not. Either way, in my case MATLAB was much better.
I'm quite new with this topic so any help would be great. What I need is to optimize a neural network in MATLAB by using GA. My network has [2x98] input and [1x98] target, I've tried consulting MATLAB help but I'm still kind of clueless about what to do :( so, any help would be appreciated. Thanks in advance.
Edit: As Dan pointed out in the first answer, I didn't say what is to be optimized. I guess the most important thing is the number of hidden neurons, and maybe the number of hidden layers and training parameters such as the number of epochs. Sorry for not providing enough info, I'm still learning about this.
If this is a homework assignment, do whatever you were taught in class.
Otherwise, ditch the MLP entirely. Support vector regression ( http://www.csie.ntu.edu.tw/~cjlin/libsvm/ ) is much more reliably trainable across a broad swath of problems, and pretty much never runs into the stuck-in-a-local-minimum problem often hit with back-propagation-trained MLPs, which forces you to solve a network topology optimization problem just to find a network that will actually train.
Well, you need to be more specific about what you are trying to optimize. Is it the size of the hidden layer? Do you have a hidden layer? Is it parameter optimization (learning rate, kernel parameters)?
I assume you have a set of parameters (# of hidden layers, # of neurons per layer...) that needs to be tuned. Instead of brute-force searching all combinations to pick a good one, GA can help you "jump" from one combination to another, so you can "explore" the search space for potential candidates.
GA can also help in selecting "helpful" features. Some features might be redundant and you want to prune them. However, the data may have too many features to search for the best subset with approaches such as forward selection. Again, GA can "jump" from one candidate set to another.
You will need to find a way to encode the data (input parameters, features...) fed to GA. For finding a set of input parameters or a good set of features, I think binary encoding should work. In addition, choosing the operators GA uses to produce offspring is also important. And GA itself needs to be tuned, too (e.g. early stopping, which can also be applied to ANN).
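Here is a rough sketch of the binary encoding idea for feature selection, assuming the Global Optimization Toolbox and the Statistics and Machine Learning Toolbox. Everything in it is illustrative: the data (fisheriris plus four noise columns), the helper name maskLoss, the discriminant classifier used for scoring, and the 0.01 penalty per selected feature are all my own choices, not something from the question. Local functions in a script need R2016b or later, and older releases use gaoptimset with 'Generations' instead of optimoptions with 'MaxGenerations'.

```matlab
% Sketch only: GA with a bit-string chromosome, one bit per candidate feature.
rng(1);                                   % the GA and the CV folds are both random
load fisheriris                           % meas: 150x4, species: 150x1 labels
X = [meas, randn(150, 4)];                % 4 real features + 4 pure-noise features
y = species;

fitness = @(mask) maskLoss(mask, X, y);   % ga passes one 0/1 row vector per call

opts = optimoptions('ga', ...
    'PopulationType', 'bitstring', ...    % binary encoding
    'PopulationSize', 20, ...
    'MaxGenerations', 10, ...
    'Display', 'off');

[bestMask, bestLoss] = ga(fitness, size(X, 2), [], [], [], [], [], [], [], opts);
fprintf('Selected features: %s (penalized loss %.3f)\n', ...
        mat2str(find(bestMask)), bestLoss);

function loss = maskLoss(mask, X, y)
    % Hypothetical fitness: 5-fold CV error on the selected columns,
    % plus a small penalty per feature to favour smaller subsets.
    mask = logical(mask);
    if ~any(mask)
        loss = 1;                         % empty feature set gets a bad score
        return
    end
    cvMdl = fitcdiscr(X(:, mask), y, 'KFold', 5);
    loss  = kfoldLoss(cvMdl) + 0.01 * nnz(mask);
end
```

Because the folds are redrawn on every call, the fitness is noisy; fixing one cvpartition up front and reusing it would make the comparison between candidates fairer.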
Here are just some ideas. You might want to search for more info about GA, feature selection, ANN pruning...
Since you're using MATLAB already I suggest you look into the Genetic Algorithms solver (known as GATool, part of the Global Optimization Toolbox) and the Neural Network Toolbox. Between those two you should be able to save quite a bit of figuring out.
You'll basically have to do 2 main tasks:
Come up with a representation (or encoding) for your candidate solutions
Code your fitness function (which basically tests candidate solutions) and pass it as a parameter to the GA solver.
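As a rough starting point for both tasks, here is a sketch that assumes the Global Optimization Toolbox and the Neural Network (Deep Learning) Toolbox. The encoding is a single integer (the number of hidden neurons, limited to 1..20 by my own choice) and the fitness is the validation MSE of a freshly trained fitnet. The data are random placeholders with the 2x98 / 1x98 shapes from the question, and hiddenSizeLoss is a made-up helper name. Local functions in a script need R2016b or later.

```matlab
% Sketch only: GA searching over the hidden layer size of a small fitnet.
rng(0);                                   % reproducible placeholder data and GA run
x = rand(2, 98);                          % placeholder inputs, 2x98 as in the question
t = sin(2*pi*x(1, :)) + x(2, :).^2;       % placeholder targets, 1x98

fitness = @(h) hiddenSizeLoss(h, x, t);   % task 2: the fitness function

opts = optimoptions('ga', ...
    'PopulationSize', 8, ...
    'MaxGenerations', 5, ...
    'Display', 'off');

% Task 1: the encoding. One integer decision variable in [1, 20].
nvars = 1;  lb = 1;  ub = 20;  intCon = 1;
[bestH, bestLoss] = ga(fitness, nvars, [], [], [], [], lb, ub, [], intCon, opts);
fprintf('Best hidden size: %d (validation MSE %.4f)\n', bestH, bestLoss);

function loss = hiddenSizeLoss(h, x, t)
    % Train a one-hidden-layer network with h neurons and score it
    % on the validation split that train() sets aside by default.
    net = fitnet(h);
    net.trainParam.showWindow = false;    % no training GUI inside the GA loop
    net.trainParam.epochs = 200;
    [~, tr] = train(net, x, t);
    loss = tr.best_vperf;                 % validation MSE at the best epoch
end
```

Each evaluation retrains a network from random initial weights, so the fitness is noisy and slow; averaging a few training runs per candidate is a common way to make it more stable. More decision variables (a second layer, number of epochs) can be added by increasing nvars and extending the bounds.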
If you need help in terms of coming up with a fitness function, or encoding of candidate solutions then you'll have to be more specific.
Hope it helps.
Matlab has a simple but great explanation for this problem here. It explains both the ANN and GA part.
For more info on using ANN in command line see this.
There is also plenty of literature on the subject if you google it. Most of it is not MATLAB-specific, though; it covers the method and the results.
Look up Matthew Settles on Google Scholar. He did some work in this area at the University of Idaho in the last 5-6 years. He should have citations relevant to your work.