SVM Classifications on set of images of digits in Matlab - matlab

I have to use SVM classifier on digits dataset. The dataset consists of images of digits 28x28 and a toal of 2000 images.
I tried to use svmtrain but the matlab gave an error that svmtrain has been removed. so now i am using fitcsvm.
My code is as below:
labelData = zeros(2000,1);
for i=1:1000
labelData(i,1)=1;
end
for j=1001:2000
labelData(j,1)=1;
end
SVMStruct =fitcsvm(trainingData,labelData)
%where training data is the set of images of digits.
I need to know how i can predict the outputs of test data using svm? Further is my code correct?

The function that you are looking for is predict. It takes the SVM-object as input followed by a data-matrix and returns the predicted labels.
Make sure that you do not train your model on all data but on a reasonable subset (usually 70%). You can use the cross-validation preparation:
% create cross-validation object
cvp = cvpartition(Lbl,'HoldOut',0.3);
% extract logical vectors for training and testing data
lgTrn = cvp.training;
lgTst = cvp.test;
% train SVM
mdl = fitcsvm(Dat(lgTrn,:),Lbl(lgTrn));
% test / predict SVM
Lbl_prd = predict(mdl,Dat(lgTst,:));
Note that your labeling produces a single vector of ones.
The reason why The Mathworks changed svmtrain to fitcsvm is conciseness. It is now clear whether it is "classification" (fitcsvm) or "regression" (fitrsvm).

Related

Matlab: make predictions with SVM for multiclass classification problems

I am trying to use a Support Vector Machine to classify my data in 3 classes. I used this Matlab function to train and cross-validate the SVM:
Mdl = fitcecoc(XTrain, yTrain, 'Learners', 'svm', 'ObservationsIn', 'rows', ...
'ScoreTransform', 'invlogit','Crossval','on', 'Holdout', 0.2);
where XTrain contains all of my data and yTrain is a cell containing the names of each class to be assigned to the input data in XTrain.
The function above returns to me:
Mdl --> 1x1 ClassificationPartitionedECOC
My question is, what function do I have to use in order to make predictions using new data? In the case of binary classification, I build the SVM with 'fitcsvm' and then I predicted the labels with:
[label, score] = predict(Mdl, XTest);
However, if I feed the ClassificationPartitionedECOC to the 'predict' function, it gives me this error:
No valid system or dataset was specified.
I haven't been able to find a function that allows me to perform prediction starting from the model format that I have, ClassificationPartitionedECOC.
Thanks for any help you may provide!
You can access the learner i through:
Mdl.BinaryLearners{i}
Because fitcecoc just trains a binary classifier like you would do with fitCSVM in a one versus one fashion.

Predict labels for new dataset (Test data) using cross validated Knn classifier model in matlab

I have a training dataset (50000 X 16) and test dataset (5000 X 16)[the 16th column in both the datasets are decision labels or response. The decision label in test dataset in used for checking the classification accuracy of the trained classifier]. I am using my training data for training and validating my cross validated knn classifier. I have created a cross validated knn classifier model using the following code :
X = Dataset2(1:50000,:); % Use some data for fitting
Y = Training_Label(1:50000,:); % Response of training data
%Create a KNN Classifier model
rng(10); % For reproducibility
Mdl = fitcknn(X,Y,'Distance', 'Cosine', 'Exponent', '', 'NumNeighbors', 10,'DistanceWeight', 'Equal', 'StandardizeData', 1);
%Construct a cross-validated classifier from the model.
CVMdl = crossval(Mdl,'KFold', 10);
%Examine the cross-validation loss, which is the average loss of each cross-validation model when predicting on data that is not used for training.
kloss = kfoldLoss(CVMdl, 'LossFun', 'ClassifError')
% Compute validation accuracy
validationAccuracy = 1 - kloss;
now I want to classify my Test data using this cross validated knn classifier but can't really figure out how to do that. I have gone through the available examples in matlab but couldn't find any suitable function or examples for doing this.
I know I can use the "predict" function for predicting the classlabels of my test data if my classifier is not cross validated. The code is as following :
X = Dataset2(1:50000,:); % Use some data for fitting
Y = Training_Label(1:50000,:); % Response of training data
%Create a KNN Classifier model
rng(10); % For reproducibility
Mdl = fitcknn(X,Y,'Distance', 'Cosine', 'Exponent', '', 'NumNeighbors', 10,'DistanceWeight', 'Equal', 'StandardizeData', 1);
%Classification using Test Data
Classifier_Output_Labels = predict(Mdl,TestDataset2(1:5000,:));
But I could not find any similar function (like "predict") for cross validated trained knn classifier. I found out the "kfoldPredict" function in Matlab documentation, but it says the function is used to evaluate the trained model.
http://www.mathworks.com/help/stats/classificationpartitionedmodel.kfoldpredict.html But I did not find any input of the new data through this function.
So could anyone please advise me how to use the cross validated knn classifier model to predict labels of new data? Any help is appreciated and badly needed. :( :(
Let's say you are doing 10-fold cross validation while learning the model. You can then use the kfoldLoss function to also get the CV loss for each fold and then choose the trained model that gives you the least CV loss in the following way:
modelLosses = kfoldLoss(Mdl,'mode','individual');
The above code will give you a vector of length 10 (10 CV error values) if you have done 10-fold cross-validation while learning. Assuming the trained model with least CV error is the 'k'th one, you would then use:
testSetPredictions = predict(Mdl.Trained{k}, testSetFeatures);
You seem to be confusing things here. Cross validation is a tool for model selection and evaluation. It is not training procedure per se. Consequently you cannot "use" cross validated object. You predict using trained object. Cross validation is a form of estimating generalization capabilities of a given model, it has nothing to do with actual training, it is rather a small statistical experiment to asses a particular property.

Matlab predict function not working

I am trying to train a linear SVM on a data which has 100 dimensions. I have 80 instances for training. I train the SVM using fitcsvm function in MATLAB and check the function using predict on the training data. When I classify the training data with the SVM all the data points are being classified into only one class.
SVM = fitcsvm(votes,b,'ClassNames',unique(b)');
predict(SVM,votes);
This gives outputs as all 0's which corresponds to 0th class. b contains 1's and 0's indicating the class to which each data point belongs.
The data used, i.e. matrix votes and vector b are given the following link
Make sure you use a non-linear kernel, such as a gaussian kernel and that the parameters of the kernel are tweaked. Just as a starting point:
SVM = fitcsvm(votes,b,'KernelFunction','RBF', 'KernelScale','auto');
bp = predict(SVM,votes);
that said you should split your set in a training set and a testing set, otherwise you risk overfitting

Compute the training error and test error in libsvm + MATLAB

I would like to draw learning curves for a given SVM classifier. Thus, in order to do this, I would like to compute the training, cross-validation and test error, and then plot them while varying some parameter (e.g., number of instances m).
How to compute training, cross-validation and test error on libsvm when used with MATLAB?
I have seen other answers (see example) that suggest solutions for other languages.
Isn't there a compact way of doing it?
Given a set of instances described by:
a set of features featureVector;
their corresponding labels (e.g., either 0 or 1),
if a model was previously inferred via libsvm, the MSE error can be computed as follows:
[predictedLabels, accuracy, ~] = svmpredict(labels, featureVectors, model,'-q');
MSE = accuracy(2);
Notice that predictedLabels contains the labels that were predicted by the classifier for the given instances.

How to use SVM in Matlab?

I am new to Matlab. Is there any sample code for classifying some data (with 41 features) with a SVM and then visualize the result? I want to classify a data set (which has five classes) using the SVM method.
I read the "A Practical Guide to Support Vector Classication" article and I saw some examples. My dataset is kdd99. I wrote the following code:
%% Load Data
[data,colNames] = xlsread('TarainingDataset.xls');
groups = ismember(colNames(:,42),'normal.');
TrainInputs = data;
TrainTargets = groups;
%% Design SVM
C = 100;
svmstruct = svmtrain(TrainInputs,TrainTargets,...
'boxconstraint',C,...
'kernel_function','rbf',...
'rbf_sigma',0.5,...
'showplot','false');
%% Test SVM
[dataTset,colNamesTest] = xlsread('TestDataset.xls');
TestInputs = dataTset;
groups = ismember(colNamesTest(:,42),'normal.');
TestOutputs = svmclassify(svmstruct,TestInputs,'showplot','false');
but I don't know that how to get accuracy or mse of my classification, and I use showplot in my svmclassify but when is true, I get this warning:
The display option can only plot 2D training data
Could anyone please help me?
I recommend you to use another SVM toolbox,libsvm. The link is as follow:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/
After adding it to the path of matlab, you can train and use you model like this:
model=svmtrain(train_label,train_feature,'-c 1 -g 0.07 -h 0');
% the parameters can be modified
[label, accuracy, probablity]=svmpredict(test_label,test_feaure,model);
train_label must be a vector,if there are more than two kinds of input(0/1),it will be an nSVM automatically.
train_feature is n*L matrix for n samples. You'd better preprocess the feature before using it. In the test part, they should be preprocess in the same way.
The accuracy you want will be showed when test is finished, but it's only for the whole dataset.
If you need the accuracy for positive and negative samples separately, you still should calculate by yourself using the label predicted.
Hope this will help you!
Your feature space has 41 dimensions, plotting more that 3 dimensions is impossible.
In order to better understand your data and the way SVM works is to begin with a linear SVM. This tybe of SVM is interpretable, which means that each of your 41 features has a weight (or 'importance') associated with it after training. You can then use plot3() with your data on 3 of the 'best' features from the linear svm. Note how well your data is separated with those features and choose a basis function and other parameters accordingly.