Does transposing the training set affect the results with SVM - MATLAB

I am working on human age classification, where I have to classify data into two classes, namely Young and Old. As a classifier I am using SVM, and this is what I did so far to prepare the data:
The TrainingSet has size (11264, 284), where each column corresponds to an observation (a person). This means I have 284 persons for the training task and 11264 features.
The TestSet is formatted in the same way as the TrainingSet.
The Groups (labels) matrix has size (284, 1) and is filled with (1) for Old and (-1) for Young.
I trained the SVM using MATLAB's built-in function to obtain the SvmStruct:
SvmStruct = svmtrain(TrainingSet, Groups')
Then I introduce the TestSet using this MATLAB function in order to get the classification results:
SvmClassify = svmclassify(SvmStruct, TestSet)
After reviewing the MATLAB help about SVM, I deduced that the data has to be passed to the SVM classifier in such a way that each row of the TrainingSet corresponds to an observation (a person in my case) and each column corresponds to a feature. So what I was doing so far amounted to passing those matrices (TrainingSet and TestSet) transposed.
Is what I did wrong, and are all the results I got wrong?

I looked at the source code for svmtrain, and it transposes the training data if the number of groups does not match the number of rows (svmtrain.m, line 249 ff, MATLAB 2015b):
% make sure data is the right size
if size(training,1) ~= size(groupIndex,1)
    if size(training,2) == size(groupIndex,1)
        training = training';
    else
        error(message('stats:svmtrain:DataGroupSizeMismatch'))
    end
end
So no, your training results are not wrong.
However, svmclassify does not transpose the test data; it only checks for the right number of features (svmclassify.m, line 63 ff.):
if size(sample,2) ~= size(svmStruct.SupportVectors,2)
    error(message('stats:svmclassify:TestSizeMismatch'));
end
So this should have triggered an error (sample is your TestSet).
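A minimal sketch of the orientation the documented interface expects (rows as observations, so both matrices are transposed before the calls, while Groups stays a 284-by-1 column vector; this reuses your variable names and is only meant as an illustration):
% rows = observations (284 x 11264), labels as a 284 x 1 column vector
SvmStruct = svmtrain(TrainingSet', Groups);
% orient the test set the same way (rows = observations)
SvmClassify = svmclassify(SvmStruct, TestSet');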

Related

SVM Classifications on set of images of digits in Matlab

I have to use an SVM classifier on a digits dataset. The dataset consists of 28x28 images of digits, 2000 images in total.
I tried to use svmtrain, but MATLAB gave an error that svmtrain has been removed, so now I am using fitcsvm.
My code is as below:
labelData = zeros(2000,1);
for i = 1:1000
    labelData(i,1) = 1;
end
for j = 1001:2000
    labelData(j,1) = 1;
end
SVMStruct = fitcsvm(trainingData, labelData)
% where trainingData is the set of images of digits.
I need to know how I can predict the outputs of the test data using the SVM. Furthermore, is my code correct?
The function you are looking for is predict. It takes the SVM object as input, followed by a data matrix, and returns the predicted labels.
Make sure that you do not train your model on all of the data but on a reasonable subset (usually 70%). You can prepare such a split with cvpartition:
% create cross-validation object
cvp = cvpartition(Lbl,'HoldOut',0.3);
% extract logical vectors for training and testing data
lgTrn = cvp.training;
lgTst = cvp.test;
% train SVM
mdl = fitcsvm(Dat(lgTrn,:),Lbl(lgTrn));
% test / predict SVM
Lbl_prd = predict(mdl,Dat(lgTst,:));
Note that your labeling produces a vector of all ones; the second loop should assign a different label.
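For example, something along these lines (a minimal sketch, assuming the first 1000 images belong to one class and the last 1000 to the other):
% assuming images 1-1000 are one class and images 1001-2000 the other
labelData = [ones(1000,1); -ones(1000,1)];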
One reason The MathWorks replaced svmtrain with fitcsvm is clarity: it is now obvious whether you are doing classification (fitcsvm) or regression (fitrsvm).

Matlab predict function not working

I am trying to train a linear SVM on data which has 100 dimensions. I have 80 instances for training. I train the SVM using the fitcsvm function in MATLAB and check it using predict on the training data. When I classify the training data with the SVM, all the data points are classified into only one class.
SVM = fitcsvm(votes,b,'ClassNames',unique(b)');
predict(SVM,votes);
This outputs all 0's, which corresponds to the 0 class. b contains 1's and 0's indicating the class to which each data point belongs.
The data used, i.e. the matrix votes and the vector b, are given in the following link
Make sure you use a non-linear kernel, such as a Gaussian (RBF) kernel, and that the kernel parameters are tuned. Just as a starting point:
SVM = fitcsvm(votes,b,'KernelFunction','RBF', 'KernelScale','auto');
bp = predict(SVM,votes);
That said, you should split your data into a training set and a testing set; otherwise you risk overfitting.

Matlab train svm regressor using LIBVM with a sparse matrix of text

I am conducting a Bag of Words analysis where I have a sparse 80x60452 matrix, where the 80 rows correspond to the days on which the text was collected; each row represents the collection of tweets for a particular day. The experiment is part of a univariate time-series prediction and I'm not too sure how to implement this.
I am using the Text Matrix Generator (TMG) to construct this sparse matrix, which will be used against a response column vector of price observations (80x1).
I will use the first 70 observations to train the SVM regressor to predict future price instances, given a test data set of the remaining instances.
For example:
T = (length(priceData)-10);
trainBoW = BoW(1:T,1:end);
testBoW = BoW(T+1:end,1:end);
trainPrice = pData(1:T);
testPrice = pData(T+1:end);
model = svmtrain(trainBoW, trainPrice,'-c 1 -g 0.07 -b 1');
Running this code throws the error:
Error using svmtrain (line 274)
SVMTRAIN only supports classification into two groups. GROUP contains 62 groups.
How can I train an SVM regression model using LIBSVM to predict future price instances?
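Note that the error text and line number match the Statistics Toolbox svmtrain (which only does two-class classification and takes the data matrix first), so it looks like MATLAB's own function is being called rather than LIBSVM's MEX interface, which expects the label vector as the first argument. A rough sketch of a LIBSVM regression call, assuming the compiled LIBSVM functions are on the path ahead of the toolbox one ('-s 3' selects epsilon-SVR; the other option values are only placeholders):
% LIBSVM interface: label vector first, then the (sparse double) instance matrix
model = svmtrain(trainPrice, trainBoW, '-s 3 -t 2 -c 1 -g 0.07 -p 0.1');
% the second output holds accuracy / MSE / squared correlation coefficient
[predPrice, stats, ~] = svmpredict(testPrice, testBoW, model);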

leave-one-out regression using lasso in Matlab

I have 300 data samples with features of around 4000 dimensions each. Each input has a 5-dimensional output in the range of -2 to 2. I am trying to fit a lasso model to it. I went through a few posts which talk about cross-validation strategies, like this one: Leave one out cross validation algorithm in matlab
But I saw that lasso does not support 'leaveout' in MATLAB: http://www.mathworks.com/help/stats/lasso.html
How can I train a model using leave-one-out cross-validation and fit a model using lasso on my dataset? I am trying to do this in MATLAB. I would like to get a set of weights which I will be able to use for future predictions on other data.
I tried using glmnet: http://www.stanford.edu/~hastie/glmnet_matlab/intro.html but I couldn't compile it on my machine due to the lack of a proper MEX compiler.
Any solutions to my problem? Thanks :)
EDIT
I am also trying to use the lasso function built into MATLAB. It has an option to perform cross-validation. It outputs B and fit statistics, where B is the matrix of fitted coefficients: a p-by-L matrix, where p is the number of predictors (columns) in X and L is the number of Lambda values.
Now given a new test sample, how can I calculate the output using this model?
You can use a leave-one-out approach regardless of your training method. As explained here, you can use crossvalind to split the data into training and test sets.
[Train, Test] = crossvalind('LeaveMOut', N, M)
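As a rough sketch of how this could look for one output column (assuming X is the 300-by-~4000 feature matrix and y is one of the five output columns; the lambda with the lowest cross-validated MSE is picked inside the loop, and predictions are the fitted coefficients times the features plus the intercept from FitInfo):
N    = size(X,1);
yhat = zeros(N,1);
for i = 1:N
    Test  = false(N,1); Test(i) = true;   % leave the i-th sample out
    Train = ~Test;                        % (crossvalind('LeaveMOut',N,1) gives a random split instead)
    [B, FitInfo] = lasso(X(Train,:), y(Train), 'CV', 10);   % inner CV to pick lambda
    idx = FitInfo.IndexMinMSE;            % lambda with the lowest cross-validated MSE
    yhat(Test) = X(Test,:)*B(:,idx) + FitInfo.Intercept(idx);
end
The same last line answers the edit: a new test sample is scored as X_new*B(:,idx) + FitInfo.Intercept(idx).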

How to use SVM in Matlab?

I am new to MATLAB. Is there any sample code for classifying some data (with 41 features) with an SVM and then visualizing the result? I want to classify a data set (which has five classes) using the SVM method.
I read the "A Practical Guide to Support Vector Classification" article and I saw some examples. My dataset is kdd99. I wrote the following code:
%% Load Data
[data,colNames] = xlsread('TarainingDataset.xls');
groups = ismember(colNames(:,42),'normal.');
TrainInputs = data;
TrainTargets = groups;
%% Design SVM
C = 100;
svmstruct = svmtrain(TrainInputs,TrainTargets,...
'boxconstraint',C,...
'kernel_function','rbf',...
'rbf_sigma',0.5,...
'showplot','false');
%% Test SVM
[dataTset,colNamesTest] = xlsread('TestDataset.xls');
TestInputs = dataTset;
groups = ismember(colNamesTest(:,42),'normal.');
TestOutputs = svmclassify(svmstruct,TestInputs,'showplot','false');
but I don't know how to get the accuracy or MSE of my classification. Also, I use showplot in my svmclassify, but when it is true, I get this warning:
The display option can only plot 2D training data
Could anyone please help me?
I recommend you use another SVM toolbox, LIBSVM. The link is as follows:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/
After adding it to the MATLAB path, you can train and use your model like this:
model = svmtrain(train_label, train_feature, '-c 1 -g 0.07 -h 0');
% the parameters can be modified
[label, accuracy, probability] = svmpredict(test_label, test_feature, model);
train_label must be a vector; if there are more than two kinds of labels (not just 0/1), it will be handled as a multi-class SVM automatically.
train_feature is an n*L matrix for n samples. You'd better preprocess the features before using them. In the test part, they should be preprocessed in the same way.
The accuracy you want will be shown when the test is finished, but it's only for the whole dataset.
If you need the accuracy for positive and negative samples separately, you still have to calculate it yourself using the predicted labels.
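As a small illustrative sketch (assuming label holds the predicted labels from svmpredict and test_label the ground truth, both coded 0/1 as above):
acc_pos = mean(label(test_label == 1) == 1);   % accuracy on positive samples
acc_neg = mean(label(test_label == 0) == 0);   % accuracy on negative samples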
Hope this will help you!
Your feature space has 41 dimensions; plotting more than 3 dimensions is impossible.
A good way to better understand your data and the way SVM works is to begin with a linear SVM. This type of SVM is interpretable, which means that each of your 41 features has a weight (or 'importance') associated with it after training. You can then use plot3() with your data on the 3 'best' features from the linear SVM. Note how well your data is separated with those features and choose a kernel function and other parameters accordingly.
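A possible sketch of that idea with the newer fitcsvm (assuming the binary TrainInputs/TrainTargets from the question above; for a linear kernel, mdl.Beta holds one weight per feature):
mdl = fitcsvm(TrainInputs, TrainTargets, 'KernelFunction', 'linear');
[~, order] = sort(abs(mdl.Beta), 'descend');   % rank features by |weight|
top3 = order(1:3);                             % three most 'important' features
pos  = TrainTargets == 1;                      % e.g. the 'normal.' samples
scatter3(TrainInputs(pos,top3(1)),  TrainInputs(pos,top3(2)),  TrainInputs(pos,top3(3)),  'b.');
hold on
scatter3(TrainInputs(~pos,top3(1)), TrainInputs(~pos,top3(2)), TrainInputs(~pos,top3(3)), 'r.');
xlabel(sprintf('feature %d', top3(1))); ylabel(sprintf('feature %d', top3(2))); zlabel(sprintf('feature %d', top3(3)));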