How to improve the performance of SVM? - matlab

I'm using LIBSVM and MATLAB to classify 34x5 data into 3 classes. I applied 10-fold cross-validation and an RBF kernel. The output is the confusion matrix below, with a 0.88 correct rate (88% accuracy). This is my confusion matrix:
9 0 0
0 3 0
0 4 18
I would like to know which SVM techniques I should consider to improve the accuracy, or which other machine learning classification methods might help. Any ideas?
Here is my SVM classification code
load Turn180SVM1; % load data file
libsvm_options = '-s 1 -t 2 -d 3 -r 0 -c 1 -n 0.1 -p 0.1 -m 100 -e 0.000001 -h 1 -b 0 -wi 1 -q'; % svm options
C = size(Turn180SVM1,2);
% cross validation
for i = 1:10
    indices = crossvalind('Kfold',Turn180SVM1(:,C),10);
    cp = classperf(Turn180SVM1(:,C));
    for j = 1:10
        [X, Z] = find(indices(:,end)==j);  % testing
        [Y, Z] = find(indices(:,end)~=j);  % training
        feature_training = Turn180SVM1([Y'],[1:C-1]); feature_testing = Turn180SVM1([X'],[1:C-1]);
        class_training = Turn180SVM1([Y'],end); class_testing = Turn180SVM1([X'], end);
        % SVM Training
        disp('training');
        [feature_training,ps] = mapminmax(feature_training',0,1);
        feature_training = feature_training';
        feature_testing = mapminmax('apply',feature_testing',ps)';
        model = svmtrain(class_training,feature_training,libsvm_options);
        %
        % SVM Prediction
        disp('testing');
        TestPredict = svmpredict(class_testing,sparse(feature_testing),model);
        TestErrap = sum(TestPredict~=class_testing)./length(class_testing)*100;
        cp = classperf(cp, TestPredict, X);
        disp(((i-1)*10)+j);
    end;
end;
[ConMat,order] = confusionmat(TestPredict,class_testing);
cp.CorrectRate;
cp.CountingMatrix;

Many methods exist. If your tuning procedure is optimal (e.g. well-executed cross-validation), your choices include the points below; a brief libsvm tuning sketch follows the list.
Improve preprocessing, perhaps tailor new aggregated features based on domain knowledge. Most importantly (and most effectively): make sure your inputs are standardized properly, for example by scaling every dimension onto [-1,1].
Use another kernel: RBF kernels are known to perform very well in a wide variety of settings, but specialised kernels exist for many tasks. Don't consider this unless you know what you are doing. Since you are dealing with a low-dimensional problem, RBF is probably a good choice if your data is not structured.
Reweight training instances: this is particularly important when your data set is imbalanced (e.g. some classes have far fewer instances than others). You can do this with the -wX options in libsvm. All sorts of reweighting schemes exist, including variants of boosting. I'm not a big fan of these, since such approaches are prone to overfitting.
Change the cross-validation cost function to suit your exact needs. Is accuracy really what you are looking for or do you want, say, high F1 or high ROC-AUC? It is surprising how many people optimize a performance measure they are not really interested in.
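As a rough illustration of the scaling and tuning points above, here is a minimal sketch. It assumes the Turn180SVM1 layout from the question (features in columns 1 to C-1, label in column C), switches to C-SVC (-s 0) so that -c is meaningful, and uses libsvm's built-in cross-validation switch (-v); the search grids are illustrative values only.
% Hedged sketch: scale features to [-1,1], then grid-search C and gamma with
% libsvm's k-fold cross-validation (-v 10).
load Turn180SVM1;
C = size(Turn180SVM1, 2);
labels   = Turn180SVM1(:, C);
features = mapminmax(Turn180SVM1(:, 1:C-1)', -1, 1)';   % scale each feature to [-1,1]
bestAcc = 0;
for log2c = -5:2:15
    for log2g = -15:2:3
        % a class weight such as '-w2 4' could be appended here (see the reweighting point)
        opts = sprintf('-s 0 -t 2 -c %g -g %g -v 10 -q', 2^log2c, 2^log2g);
        acc  = svmtrain(labels, features, opts);   % with -v, libsvm returns CV accuracy
        if acc > bestAcc
            bestAcc = acc; bestc = 2^log2c; bestg = 2^log2g;
        end
    end
end
fprintf('best CV accuracy %.2f%% at C = %g, gamma = %g\n', bestAcc, bestc, bestg);
Once bestc and bestg are found, retrain on the full training set with those values and evaluate on held-out data.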

Related

Standard arrays seem faster than gpuArray on conv net feed forward

I am implementing convolutional networks in MATLAB, and I added support for GPUs (I am using gpuArrays). I implemented the feed-forward part. When I run it with standard arrays (already sitting in my workspace), it takes 0.15 s. However, when I run the EXACT same thing with gpuArrays, which are likewise in my workspace before running the feed-forward script, it takes ~1.39 s. Can someone explain what's going on here? Thanks.
UPDATE: I profiled the running time, and everything suggests that the main bottleneck is the convolution part, so I will paste that part of the code here:
pad = (size(layers_W{layerNum}, 1)-1) / 2;
for imageNum = 1:options.minibatchSize
    for filterNum = 1:size(layers_W{layerNum}, 4)
        % accumulate the 2-D convolutions over the depth (channel) dimension
        for filterD = 1:size(layers_W{layerNum}, 3)
            c = conv2(convInput(:, :, filterD, imageNum), ...
                rot90(layers_W{layerNum}(:, :, filterD, filterNum), 2), 'valid');
            layers_activations{layerNum}(pad+1:end-pad, pad+1:end-pad, filterNum, imageNum) = ...
                layers_activations{layerNum}(pad+1:end-pad, pad+1:end-pad, filterNum, imageNum) + ...
                c;
        end
        % add the bias for this filter
        layers_activations{layerNum}(pad+1:end-pad, pad+1:end-pad, filterNum, imageNum) = ...
            layers_activations{layerNum}(pad+1:end-pad, pad+1:end-pad, filterNum, imageNum) + ...
            layers_b{layerNum}(filterNum);
    end
end
% elementwise activation
if strcmp(options.activation, 'relu') == 1
    layers_activations{layerNum} = max(0, layers_activations{layerNum});
elseif strcmp(options.activation, 'sigmoid') == 1
    layers_activations{layerNum} = 1 ./ (1 + exp(-layers_activations{layerNum}));
end
This exact piece of code is ~52 times slower on GPU than on CPU. Any ideas?
UPDATE 2: I separately timed the line that does the 2-D convolution (~10 times slower on the GPU) and the line below it that adds two matrices (~100 times slower on the GPU). I am completely confused about why this is happening.
This isn't at all a surprise. The GPU is efficient at doing convolutions on large images (HD, 4K) but not particularly at images 227x227 or smaller, such as are typical in CNNs. You need to at least be running a 3-D convolution so you can apply all the filters over each input activation in one call, rather than looping over all the filters and all the images. Try replacing the inner loop with a call to convn.
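As a rough sketch of that suggestion (keeping the question's variable names, and flipping the filter along all three dimensions so that convn's 'valid' output equals the original sum of per-depth conv2 calls), the inner filterD loop could be collapsed like this:
for imageNum = 1:options.minibatchSize
    for filterNum = 1:size(layers_W{layerNum}, 4)
        % flip the k x k x D filter in every dimension so the full convolution
        % computed by convn reproduces the summed correlations from conv2
        wFlipped = flip(flip(flip(layers_W{layerNum}(:, :, :, filterNum), 1), 2), 3);
        c = convn(convInput(:, :, :, imageNum), wFlipped, 'valid');
        layers_activations{layerNum}(pad+1:end-pad, pad+1:end-pad, filterNum, imageNum) = ...
            layers_activations{layerNum}(pad+1:end-pad, pad+1:end-pad, filterNum, imageNum) + ...
            c + layers_b{layerNum}(filterNum);
    end
end
This at least removes the per-channel kernel-launch overhead on the GPU, although, as noted below, a batched implementation will still be considerably faster.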
Smart GPU implementations of convolution in this context, such as that used by the Neural Network Toolbox in MATLAB, use custom kernels and multi-threading to take advantage of spatial parallelism and parallelism in the batch dimensions of filters and inputs. Your implementation throws away all the batch parallelism.

Optimize matlab for loop for big data

I want to calculate the Euclidean distance between two images using the Hyperbolic Tangent (Sigmoid) kernel. Please follow this link where I have discussed the same problem using Gaussian Kernel in detail.
If x=(i,j) and y=(i1,j1) are any two pixels in our image, then for the hyperbolic tangent kernel my H(x,y) is defined as:
H(x,y) = tanh(alpha*(x'*y) + c)
where alpha and c are parameters and x' is the transpose of x. The parameter alpha can be taken as 1/N, where N is my image dimension (8192 x 200 in my case), and c can take any value according to the problem. A more detailed description of the hyperbolic tangent kernel can be found here.
To achieve my goal & keeping the running time under consideration, I have written the below MATLAB script.
gray1=zeros(8192,200);
gray2=zeros(8192,200);
s1 = 8192;
s2 = 200;
alpha = s1*s2;
perms = combvec(1:s2,1:s1);
perms = [perms(2,:);perms(1,:)]';
perms1 = perms;
gray1(4096,100) = 10;
gray2(10,100) = 10;
img_diff = gray1 - gray2;
display('Calculation of Sigmoid Kernel started');
for i = 1:length(perms1)
    kernel = sum(bsxfun(@times,perms,perms1(i,:))');
    kernel1 = tanh((1/alpha)*kernel + 1)';
    g_temp(i) = img_diff(:)'*kernel1;
end
temp = g_temp*img_diff(:);
ans = sqrt(temp);
In spite of all my efforts I couldn't vectorize it further so as to decrease its running cost. Currently, it takes around 29 hours to complete, which is too long for me as I want to run it for various different images. I want to give it a completely vectorized form using intrinsic MATLAB functions, as @dan-man did in the case of the Gaussian kernel. With his help the Gaussian version took 1-2 seconds to complete. I tried my best to use the same conv2fft function in this case also, but it seems difficult to find a way to achieve that.
Can someone please help me remove that one remaining for loop so that the running cost of the algorithm becomes comparable to the Gaussian version of the same problem?
Thanks in advance.
Get rid of the nasty loop with matrix-multiplication -
g_temp = img_diff(:).'*tanh((1/alpha)*(perms*perms.')+1)
Timing on my PC for just 50 iterations, the original code takes 2.07 s.
Just changing the bsxfun line to
kernel = sum(bsxfun(@times,perms,perms1(i,:)),2)';
as the warning suggests, gets it down to 1.65 s.
If you use the Neural Network Toolbox and substitute tansig for tanh, the time goes down to 1.44 s.
If you write your own tanh as
kernel1 = (2./(1+exp(-2.*((1/alpha)*kernel + 1)))-1)';
the time goes down to 1.28 s.
Just these changes would mean an improvement from 29 h to about 18 h.
And remember to preallocate!
g_temp=zeros(length(perms1),1);
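Putting those loop-level suggestions together, a minimal sketch (using the question's perms, perms1, alpha and img_diff, and preallocating g_temp as a row vector so the final g_temp*img_diff(:) product from the question still works) looks like:
g_temp = zeros(1, length(perms1));                               % preallocate (row, as in the original)
for i = 1:length(perms1)
    kernel  = sum(bsxfun(@times, perms, perms1(i,:)), 2);        % N x 1 dot products
    kernel1 = 2 ./ (1 + exp(-2 .* ((1/alpha)*kernel + 1))) - 1;  % tanh written out by hand
    g_temp(i) = img_diff(:)' * kernel1;
end
temp   = g_temp * img_diff(:);
result = sqrt(temp);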

How to use dataset of UCI for classification of RVM?

I have downloaded some datasets from UCI for an RVM classification task. However, I am not sure how to use them. I guess these datasets must be normalized, or some other preprocessing must be done, before they are used for training and testing.
For example, I have downloaded the 'banknote authentication Data Set' from UCI, and I use svmtrain in MATLAB to obtain an SVM model (I use the SVM model on the test data first, and will then move on to RVM code if the SVM classification result is OK).
>> load banknote
>> meas = banknote(:,1:4);
>> species = banknote(:,5);
>> data = [meas(:,1), meas(:,2), meas(:,3), meas(:,4)];
>> groups = ismember(species,1);
>> [train, test] = crossvalind('holdOut',groups);
>> cp = classperf(groups);
>> svmStruct = svmtrain(data(train,:),groups(train),'showplot',true);
This is what I do in MATLAB, and I get the following message:
??? Error using ==> svmtrain at 470
Unable to solve the optimization problem:
Maximum number of iterations exceeded; increase options.MaxIter.
To continue solving the problem with the current solution as the
starting point, set x0 = x before calling quadprog.
And here is part of the dataset (1372 lines in total; some are used for training and the rest for testing):
3.6216,8.6661,-2.8073,-0.44699,0
4.5459,8.1674,-2.4586,-1.4621,0
3.866,-2.6383,1.9242,0.10645,0
3.4566,9.5228,-4.0112,-3.5944,0
0.32924,-4.4552,4.5718,-0.9888,0
4.3684,9.6718,-3.9606,-3.1625,0
3.5912,3.0129,0.72888,0.56421,0
2.0922,-6.81,8.4636,-0.60216,0
3.2032,5.7588,-0.75345,-0.61251,0
1.5356,9.1772,-2.2718,-0.73535,0
1.2247,8.7779,-2.2135,-0.80647,0
3.9899,-2.7066,2.3946,0.86291,0
1.8993,7.6625,0.15394,-3.1108,0
-1.5768,10.843,2.5462,-2.9362,0
3.404,8.7261,-2.9915,-0.57242,0
So, any good advice about this problem? Thank you all for helping.
Update: use a scaling function to normalize the features. And if the dataset has too many features, PCA can be used to reduce the dimensionality.
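A minimal sketch of that advice, assuming the banknote matrix from the question (features in columns 1:4, label in column 5); the per-column [0,1] scaling and the iteration limit are illustrative choices:
% Hedged sketch: scale each feature column before training and raise the
% quadprog iteration limit, as the error message itself suggests.
load banknote
meas   = banknote(:, 1:4);
groups = ismember(banknote(:, 5), 1);
measScaled = bsxfun(@rdivide, bsxfun(@minus, meas, min(meas)), ...
                    max(meas) - min(meas));        % per-column [0,1] scaling
[train, test] = crossvalind('holdOut', groups);
opts = optimset('MaxIter', 100000);                % well above the default limit
svmStruct = svmtrain(measScaled(train,:), groups(train), 'QuadProg_Opts', opts);
pred = svmclassify(svmStruct, measScaled(test,:));
cp = classperf(groups);
cp = classperf(cp, pred, test);
cp.CorrectRate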

Control Toolbox: How does the function minreal work in Matlab?

I have a question about the function minreal in MATLAB. From the MATLAB help I would assume that the output is a minimal realization of a system. To my understanding, that means the resulting system is both observable and controllable.
Example:
num = [ 6.40756397363316, -4511.90326777420, 7084807.91317081, -3549645853.18273, 2307781024837.00, -761727788683491, 2.26760542619190e+17, -1.54992537527829e+19, 5.58719150155001e+21 ];
den = [ 1, 824.614362937241, 1036273.19811846, 592905955.793358, 319582996989.696, 106244022544031, 2.87990542333047e+16, 2.36284104437760e+18, 3.50241006466156e+20, 0];
G = tf(num,den);
G_min = minreal(ss(G));
But it is not a minimal realization:
>> size(G_min)
State-space model with 1 outputs, 1 inputs, and 9 states.
>> rank(obsv(G_min))
ans = 6
>> rank(ctrb(G_min))
ans = 5
So obviously: rank(obsv(G_min)) != rank(ctrb(G_min)) != 9 (number of states).
Where is my mistake?
Thank you very much.
Conceptually you are correct, in that a minimal realization is controllable and observable. However, minreal makes no guarantees of that. As per the doc:
Pole-zero cancellation is a straightforward search through the poles and zeros looking for matches that are within tolerance. Transfer functions are first converted to zero-pole-gain form.
That is, minreal just does a somewhat mindless search for whether poles and zeros are close to each other, and makes no guarantees that the result satisfies any other conditions. Note that in your case you could specify a larger tolerance and more states would be eliminated:
>> G_red = minreal(G,10)
G_red =
6.408 s + 74.87
------------------------
s^2 + 625.7 s + 1.703e05
Continuous-time transfer function.
and you'd get something closer to what you might expect.
Alternatively, you'd most likely be better off transforming to a balanced realization and deciding which states to eliminate yourself. See the doc for balreal for an example of how to use it with modred to achieve this.
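For example, a minimal balreal/modred sketch on the G from the question could look like the following; the threshold on the Hankel singular values is an illustrative choice, and in practice you would inspect the values before deciding what to discard.
[sysb, hsv] = balreal(ss(G));   % balanced realization and Hankel singular values
elim  = hsv < 1e-6;             % states that contribute (almost) nothing
G_red = modred(sysb, elim);     % eliminate them
size(G_red)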
You might also take note of the doc for obsv, which clearly states that you shouldn't trust its results for anything other than toy problems:
obsv is here for educational purposes and is not recommended for serious control design. Computing the rank of the observability matrix is not recommended for observability testing. Ob will be numerically singular for most systems with more than a handful of states. This fact is well documented in the control literature. For example, see section III in http://lawww.epfl.ch/webdav/site/la/users/105941/public/NumCompCtrl.pdf

Thumb Recognition in matlab using SVM algo

I am working on a thumb recognition project. Following is my code: I read 118 images of size 42 x 25 and store them in a training matrix.
training = zeros(118, 1050);
imagefiles = dir('*.png');
nfiles = length(imagefiles);
for ii = 1:nfiles
    currentfilename = imagefiles(ii).name;
    I = imread(currentfilename);
    BW = im2bw(I, graythresh(I));       % binarize with Otsu's threshold
    temp = reshape(BW, 1, 1050);        % flatten the 42x25 image into a row vector
    training(ii,:) = temp;
end
Now I am creating a matrix of labelData to assign labels to images.
labelData = zeros(118,1);
labelData(1:50,:) = 0;
labelData(51:83,:) = 1;
labelData(84:118,:) = 2;
Here I am training my system with the training data and the label data.
options=optimset('MaxIter',5000);
SVMStruct = svmtrain(training,labelData,'Kernel_Function','linear','QuadProg_Opts',options);
But when I run this code, it gives me errors like:
Error 1 : SVMTRAIN only supports classification into two groups. GROUP contains 3 groups.
Error 2 : SVMStruct = svmtrain(training,labelData,'Kernel_Function','linear','QuadProg_Opts',options);
Kindly help me figure out what the problem is. I used this before and it was working fine, but now I don't know what is going on. Thanks in advance.
Error 1 tells you what the problem is - the MATLAB built-in SVM only supports binary classification. You are assigning 3 classes.
Your options are:
Construct three classifiers: 0 vs. 1,2 then 1 vs. 0,2 then 2 vs. 0,1 and look at the output of each.
Construct 0 vs. not 0 and then 1 vs. 2
Use a multi-class SVM trainer from LIBSVM or svmlight or other such packages.
The error message is pretty clear: MATLAB's svmtrain does not support multiclass classification, that is, only two classes are allowed.
So, you have two options: 1) write your own multiclass classifier as a wrapper around svmtrain; you can implement a one-vs-all or one-vs-one strategy (a one-vs-one sketch follows below). 2) Use an SVM implementation that already supports multiclass classification, such as libsvm.
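For option 1, a minimal one-vs-one wrapper sketch around the built-in svmtrain/svmclassify pair might look like the following. training and labelData are the matrices from the question, testData is a hypothetical matrix of new samples, and majority voting is just one possible way to combine the pairwise decisions.
classes  = unique(labelData);            % here: [0; 1; 2]
nClasses = numel(classes);
pairs    = nchoosek(1:nClasses, 2);      % every pair of classes -> one binary SVM
models   = cell(size(pairs, 1), 1);
for p = 1:size(pairs, 1)
    a = classes(pairs(p, 1));  b = classes(pairs(p, 2));
    idx = (labelData == a) | (labelData == b);
    models{p} = svmtrain(training(idx,:), labelData(idx), 'Kernel_Function', 'linear');
end
% classify new samples by majority vote over the pairwise classifiers
votes = zeros(size(testData, 1), nClasses);
for p = 1:size(pairs, 1)
    pred = svmclassify(models{p}, testData);
    for k = 1:nClasses
        votes(:, k) = votes(:, k) + (pred == classes(k));
    end
end
[~, winner] = max(votes, [], 2);
predictedLabels = classes(winner);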
Your problem is in the labelData vector: check it and find the error. You should use a one-against-all (OAA) architecture if the number of classes is more than two.