How to use classifiers in MATLAB

I want to use the SVM, kNN, and AdaBoost classifiers on my data features. I built up code where I calculated the frame differences and computed the features (eigenvalues, strain energy, potential energy), ending up with an array of [number of frames, features]. I try to use SVM as:
Features = data; % Features array [40, 5]
class = ones(numFrames-1, 1); % numFrames=41
class(1:(fix(numFrames/2))) = -1;
SVMstruct = svmtrain(Features, class, 'Kernel_Function', 'rbf');
newclass = svmclassify(SVMstruct, [40 5]); %Test data
I got an error:
The number of columns in TEST and training data must be equal.
%classperf(cp,newclass); % performance of the classifier given by cp
What is the reason for this error? And how do I use the other classifiers (kNN, AdaBoost) with this feature set?

Here is what can be inferred from the error you are getting.
There is no error in svmtrain, which means size(Features) = [40 5]. The error is in the last line. See the syntax of svmclassify: you pass a set of test samples that has the same number of features/columns as the training data (in your case, 5). Instead you are passing the size itself, [40 5], which has only two columns. Pass an actual test set of n rows and 5 columns. The last line should be
newclass= svmclassify(SVMstruct,testData); %where size(testData)=[n 5], n indicates how many test samples you have.
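Putting it together, a minimal sketch (svmtrain/svmclassify are the older toolbox functions you are already using; testData below is hypothetical, reusing training rows purely as a smoke test). The same train/predict pattern covers the other classifiers you mention, via fitcknn and fitcensemble in newer releases:
Features = data;                          % [40 x 5] feature array
class = ones(numFrames-1, 1);             % numFrames = 41
class(1:fix(numFrames/2)) = -1;
SVMstruct = svmtrain(Features, class, 'Kernel_Function', 'rbf');
testData = Features(1:10, :);             % hypothetical test set: n rows, 5 columns
newclass = svmclassify(SVMstruct, testData);
knnModel = fitcknn(Features, class);      % k-nearest neighbours, same pattern
adaModel = fitcensemble(Features, class, 'Method', 'AdaBoostM1'); % AdaBoost
predKnn  = predict(knnModel, testData);
predAda  = predict(adaModel, testData);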


SVM multiclassification with MATLAB R2015a

I am trying to use the MATLAB R2015a classification toolbox (the Classification Learner app) for my 4 classes. I imported my dataset and selected a Gaussian kernel to train my classifier. This is my dataset:
myData = [ 9.36 0;  8.72 0;  9.13 0;  7.38 0;  8.02 0;
          12.15 1; 11.02 1; 11.61 1; 12.31 1; 15.23 1;
          52.92 2; 54.49 2; 48.82 2; 52.00 2; 49.79 2;
          22.46 3; 30.38 3; 21.98 3; 24.46 3; 26.08 3];
Then I exported it into my workspace to use it with my new test data, but when I try to use it there, this message appears:
Variables have been created in the base workspace.
To use the exported classifier trainedClassifier to make predictions on new data, T, use
yfit = predict(trainedClassifier, T{:,trainedClassifier.PredictorNames})
If your new data contains any integer variables, then preprocess the data to doubles like this:
X = table2array(varfun(@double, T(:,trainedClassifier.PredictorNames)));
yfit = predict(trainedClassifier, X)
I don't understand what this means exactly, and what are T and yfit?
How can I test my new data with this classifier?
The thing is that you are trying to predict the classes of data stored in a cell array. Import it as a table first:
Home -> Import Data -> file name -> Import -> (here choose Table in the Imported Data section). Now you can use your predictor by passing this table's name.
yfit = a vector of predicted class labels for the predictor data in the table T.
T = Sample data, specified as a table. Each row of T corresponds to one observation, and each column corresponds to one predictor variable. Optionally, T can contain additional columns for the response variable and observation weights. T must contain all of the predictors used to train SVMModel. Multi-column variables and cell arrays other than cell arrays of strings are not allowed.
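For instance, a minimal sketch (the values in Tnew are hypothetical; your model has a single predictor column):
Tnew = array2table([10.2; 51.7; 23.9], ...
    'VariableNames', trainedClassifier.PredictorNames); % hypothetical new samples
yfit = predict(trainedClassifier, Tnew{:, trainedClassifier.PredictorNames})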
Example with held-out test data:
load newdataset
rng(1);
CVSVMModel = fitcsvm(X,Y,'Holdout',0.15,'ClassNames',{'classname1','classname2'},...
'Standardize',true);
CompactSVMModel = CVSVMModel.Trained{1}; % Extract trained, compact classifier
testInds = test(CVSVMModel.Partition); % Extract the test indices
XTest = X(testInds,:);
labels = predict(CompactSVMModel,XTest); % predicted labels for the held-out test set

libsvm: optimization finished, #iter = 1 nu = nan

I use libsvm to train an SVM model in MATLAB, but when I call
model = svmtrain(labels,Feature,'-t 0');
It gives me this result:
*
optimization finished, #iter = 1
nu = nan
obj = nan, rho = nan
nSV = 0, nBSV = 0
Total nSV = 0
My positive and negative samples are of almost equal number (935 vs. 904), so this problem is not caused by an unbalanced training dataset. I also tried other kernels and none of them work.
You do not want to use svmtrain any more. The new approach is templateSVM paired with fitcecoc. The documentation pages for both functions are quite extensive.
You'll eventually want to use your model to predict other data; use predict for that.
I recently encountered similar problems when trying to classify terrain in a point cloud with more than two classes. templateSVM and fitcecoc solved my problems.
The code I used is as follows, where trainingdata is my 5-dimensional training data (columns 4 to 8) and groups contains the label for each sample, corresponding to the cell array classes.
SVMtemp = templateSVM('KernelFunction','polynomial','IterationLimit',1e4,...
    'PolynomialOrder',4,'OutlierFraction',ExpOut,... % ExpOut: expected fraction of outliers
    'Standardize',true);  % create the SVM template
Model = fitcecoc(trainingdata(:,4:8),groups,'Learners',SVMtemp,'ClassNames',...
    classes);             % create the multiclass ECOC model
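To classify new samples with the fitted model (newData here is a hypothetical matrix with the same five feature columns):
predictedGroups = predict(Model, newData(:,4:8)); % labels drawn from 'classes'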

MATLAB ARX parameters

I want to use arx. X is a 1000x13 matrix (1000 samples with 13 features). I want to model the relationship between, for example, the 1st and 2nd columns of X. I don't know how to set up the input parameters correctly. What should the size of [na nb nk] be for my regression problem? The MATLAB documentation doesn't have much detail.
Here is my code:
data = iddata(X(:,1),[],1); % I have to make an iddata object first
Y = arx(data,[ [ones(size(X(:,1),2),size(X(:,1),2))] [ones(size(X(:,2),1),size(X(:,1),1))] [ones(size(X(:,1),2),size(X(:,1),1))] ])
Error is:
Error using horzcat
Dimensions of matrices being concatenated are not consistent.
I tried to change the dimensions of [na nb nk], but every time I got an error like:
Y = arx(data,[ [ones(size(X(:,1),2),size(X(:,1),2))] 1 [ones(size(X(:,1),2),size(X(:,1),1))] ])
Invalid ARX orders. Note that continuous-time ARX models are not supported.
Y = arx(data,[ 1 1 1])
Error using arx (line 77)
The model orders must be compatible with the input and output dimensions of the estimation data.
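For reference, a sketch of what arx actually expects (assuming column 1 as output and column 2 as input; the orders below are illustrative). For single-input, single-output data, na, nb and nk are scalar nonnegative integers, not matrices built with ones:
% output-only (time series) model of column 1: only na is needed
dataTS = iddata(X(:,1), [], 1);
sysAR = arx(dataTS, 2);              % AR model with na = 2
% relationship between column 1 (output) and column 2 (input)
dataIO = iddata(X(:,1), X(:,2), 1);  % iddata(y, u, Ts)
sysARX = arx(dataIO, [2 2 1]);       % [na nb nk], each a scalar here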

Bad results when testing libsvm in matlab

Can someone help me solve this?
I want to test whether this classification is good or not, so I am testing with the test data equal to the training data; it should give 100% accuracy if the classification is good.
This is the code that I found on this site:
data = [170 66; 160 50; 170 63; 173 61; 168 58; 184 88; 189 94; 185 88];
labels=[-1;-1;-1;-1;-1;1;1;1];
numInst = size(data,1);
numLabels = max(labels);
testVal = [1 2 3 4 5 6 7 8];
trainLabel = labels(testVal,:);
trainData = data(testVal,:);
testData=data(testVal,:);
testLabel=labels(testVal,:);
numTrain = 8; numTest = 8;
%# train one-against-all models
model = cell(numLabels,1);
for k=1:numLabels
model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -t 2 -g 0.2 -b 1');
end
%# get probability estimates of test instances using each model
prob = zeros(numTest,numLabels);
for k=1:numLabels
[~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
prob(:,k) = p(:,model{k}.Label==1); %# probability of class==k
end
%# predict the class with the highest probability
[~,pred] = max(prob,[],2);
acc = sum(pred == testLabel) ./ numel(testLabel) %# accuracy
C = confusionmat(testLabel, pred) %# confusion matrix
and these are the results:
optimization finished, #iter = 16
nu = 0.645259
obj = -2.799682, rho = -0.437644
nSV = 8, nBSV = 1
Total nSV = 8
Accuracy = 100% (8/8) (classification)
acc =
0.3750
C =
0 5
0 3
I don't know why there are two accuracy values, and why they differ: the first one is 100% and the second one is 0.375. Is my code wrong? It should be 100%, not 37.5%. Can you help me correct this code?
If you're using libsvm then you should change the name of the MEX file, since MATLAB already has an SVM toolbox with a function named svmtrain. However, the code is running, so it seems you did change the name, just not in the code you provided.
The second one is wrong; I don't know exactly why. However, I can tell you that you will almost always get 100% accuracy if you use test_Data = training_Data. That result really does not mean anything, since the algorithm can be overfit without it showing in your results. Test your algorithm against new data and that will give you a realistic accuracy.
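As an illustration, a minimal sketch of a held-out split (the index split below is arbitrary, chosen only so both classes appear in each set):
trainIdx = [1 2 3 6 7];                  % arbitrary split of the 8 samples
testIdx  = [4 5 8];
mdl = svmtrain(labels(trainIdx), data(trainIdx,:), '-c 1 -t 2 -g 0.2');
[pred, accHeldOut] = svmpredict(labels(testIdx), data(testIdx,:), mdl);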
Is that the code you're using? I don't think your svmtrain invocation is valid. You should have svmtrain(MAT, VECT, ...) where MAT is a matrix of data and VECT is a vector with the labels of each row of MAT. The remaining parameters are string-value pairs, meaning you'll have a string identifier and its corresponding value.
When I ran your code (Linux, R2011a) I got an error on the svmtrain call. Running with svmtrain(trainData, double(trainLabel==k)) gave a valid output (for that line). Of course, it appears that you're not using pure MATLAB, as your svmpredict call isn't native MATLAB, but rather a MATLAB binding from LIBSVM.
C = confusionmat(testLabel, pred)
Swap their positions:
C = confusionmat(pred, testLabel)
or use this:
[ConMat, order] = confusionmat(pred, testLabel)
which shows the confusion matrix and the class order.
The problem is in
[~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
p does not contain the predicted labels; it holds the probability estimates of the labels being correct. LIBSVM's svmpredict already calculates accuracy for you, which is why it says 100% in the debug output.
The fix is simple:
[p,~,~] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
According to LIBSVM's Matlab bindings README:
The function 'svmpredict' has three outputs. The first one,
predicted_label, is a vector of predicted labels. The second output,
accuracy, is a vector including accuracy (for classification), mean
squared error, and squared correlation coefficient (for regression).
The third is a matrix containing decision values or probability
estimates (if '-b 1' is specified). If k is the number of classes
in training data, for decision values, each row includes results of
predicting k(k-1)/2 binary-class SVMs. For classification, k = 1 is a
special case. Decision value +1 is returned for each testing instance,
instead of an empty vector. For probabilities, each row contains k values
indicating the probability that the testing instance is in each class.
Note that the order of classes here is the same as 'Label' field
in the model structure.
I am sorry to say that all of the answers above are wrong!
The main error in the code is:
numLabels = max(labels);
because it returns 1 (the labels are -1 and 1), although it should return 2, the number of classes, so that the one-against-all loop over svmtrain/svmpredict runs twice.
So change the line labels=[-1;-1;-1;-1;-1;1;1;1];
to labels=[2;2;2;2;2;1;1;1];
and it will work successfully ;)
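Concretely, the fix is just these two lines; the rest of the one-against-all loop stays unchanged:
labels = [2;2;2;2;2;1;1;1];  % positive class ids, so max(labels) = number of classes
numLabels = max(labels);     % now 2: the loop trains one model per class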

using precomputed kernels with libsvm

I'm currently working on classifying images with different image descriptors. Since these descriptors have their own metrics, I am using precomputed kernels. So given these NxN kernel matrices (for a total of N images), I want to train and test an SVM. I'm not very experienced with SVMs, though.
What confuses me is how to supply the input for training. Using an MxM subset of the kernel (M being the number of training images) trains the SVM with M features. However, if I understood correctly, this limits me to test data with a similar number of features. Trying to use a sub-kernel of size MxN causes infinite loops during training; consequently, using more features when testing gives poor results.
This leaves me using equal-sized training and test sets, which gives reasonable results. But if I only wanted to classify, say, a single image, or train with a given number of images of each class and test with the rest, this doesn't work at all.
How can I remove the dependency between the number of training images and the number of features, so that I can test with any number of images?
I'm using libsvm for MATLAB; the kernels are distance matrices ranging between [0,1].
You seem to have already figured out the problem... According to the README file included in the MATLAB package:
To use precomputed kernel, you must include sample serial number as
the first column of the training and testing data.
Let me illustrate with an example:
%# read dataset
[dataClass, data] = libsvmread('./heart_scale');
%# split into train/test datasets
trainData = data(1:150,:);
testData = data(151:270,:);
trainClass = dataClass(1:150,:);
testClass = dataClass(151:270,:);
numTrain = size(trainData,1);
numTest = size(testData,1);
%# radial basis function: exp(-gamma*|u-v|^2)
sigma = 2e-3;
rbfKernel = @(X,Y) exp(-sigma .* pdist2(X,Y,'euclidean').^2);
%# compute kernel matrices between every pairs of (train,train) and
%# (test,train) instances and include sample serial number as first column
K = [ (1:numTrain)' , rbfKernel(trainData,trainData) ];
KK = [ (1:numTest)' , rbfKernel(testData,trainData) ];
%# train and test
model = svmtrain(trainClass, K, '-t 4');
[predClass, acc, decVals] = svmpredict(testClass, KK, model);
%# confusion matrix
C = confusionmat(testClass,predClass)
The output:
*
optimization finished, #iter = 70
nu = 0.933333
obj = -117.027620, rho = 0.183062
nSV = 140, nBSV = 140
Total nSV = 140
Accuracy = 85.8333% (103/120) (classification)
C =
65 5
12 38
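The same setup also handles the single-image case from the question: the test kernel is just one row, its serial number followed by the kernel values against the numTrain training samples:
KK1 = [ 1 , rbfKernel(testData(1,:), trainData) ];  % 1 x (1+numTrain)
predOne = svmpredict(testClass(1), KK1, model);     % predict a single sample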