Could you give an example of classification of 4 classes using Support Vector Machines (SVM) in MATLAB, something like:
attribute_1 attribute_2 attribute_3 attribute_4 class
1 2 3 4 0
1 2 3 5 0
0 2 6 4 1
0 3 3 8 1
7 2 6 4 2
9 1 7 10 3
SVMs were originally designed for binary classification. They have since been extended to handle multi-class problems. The idea is to decompose the problem into several binary classification problems and then combine their outputs to obtain the final prediction.
One approach, called one-against-all, builds as many binary classifiers as there are classes, each trained to separate one class from the rest. To predict a new instance, we choose the classifier with the largest decision-function value.
Another approach, called one-against-one (which I believe is used in LibSVM), builds k(k-1)/2 binary classifiers, each trained to separate one pair of classes, and uses a majority-voting scheme (max-win strategy) to determine the output prediction.
There are also other approaches, such as using Error-Correcting Output Codes (ECOC) to build many somewhat-redundant binary classifiers and exploiting this redundancy to obtain more robust classifications (the same idea as Hamming codes).
Example (one-against-one):
%# load dataset
load fisheriris
[g gn] = grp2idx(species); %# nominal class to numeric
%# split training/testing sets
[trainIdx testIdx] = crossvalind('HoldOut', species, 1/3);
pairwise = nchoosek(1:length(gn),2); %# 1-vs-1 pairwise models
svmModel = cell(size(pairwise,1),1); %# store binary classifiers
predTest = zeros(sum(testIdx),numel(svmModel)); %# store binary predictions
%# classify using one-against-one approach, SVM with 3rd degree poly kernel
for k=1:numel(svmModel)
    %# get only training instances belonging to this pair
    idx = trainIdx & any( bsxfun(@eq, g, pairwise(k,:)) , 2 );
    %# train
    svmModel{k} = svmtrain(meas(idx,:), g(idx), ...
        'BoxConstraint',2e-1, 'Kernel_Function','polynomial', 'Polyorder',3);
    %# test
    predTest(:,k) = svmclassify(svmModel{k}, meas(testIdx,:));
end
pred = mode(predTest,2); %# voting: classify as the class receiving the most votes
%# performance
cmat = confusionmat(g(testIdx),pred);
acc = 100*sum(diag(cmat))./sum(cmat(:));
fprintf('SVM (1-against-1):\naccuracy = %.2f%%\n', acc);
fprintf('Confusion Matrix:\n'), disp(cmat)
Here is a sample output:
SVM (1-against-1):
accuracy = 93.75%
Confusion Matrix:
16 0 0
0 14 2
0 1 15
MATLAB does not support multiclass SVM at the moment. You could combine several two-class svmtrain models to achieve this, but it is much easier to use a standard SVM package.
I have used LIBSVM and can confirm that it's very easy to use.
%%# Your data
D = [
1 2 3 4 0
1 2 3 5 0
0 2 6 4 1
0 3 3 8 1
7 2 6 4 2
9 1 7 10 3];
%%# For clarity
Attributes = D(:,1:4);
Classes = D(:,5);
train = [1 3 5 6];
test = [2 4];
%%# Train
model = svmtrain(Classes(train),Attributes(train,:),'-s 0 -t 2');
%%# Test
[predict_label, accuracy, prob_estimates] = svmpredict(Classes(test), Attributes(test,:), model);
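LIBSVM resolves the multi-class case internally with one-against-one. If you want the one-against-all scheme described earlier instead, a minimal sketch could look like this (assuming LIBSVM's MATLAB interface, where the third output of svmpredict holds the decision values, and reusing Attributes, Classes, train and test from above):
%%# One-against-all (sketch)
labels = unique(Classes);                     %# the k class labels
decision = zeros(numel(test), numel(labels));
for i = 1:numel(labels)
    %# relabel: +1 for class i, -1 for all other classes
    yTrain = 2*(Classes(train)==labels(i)) - 1;
    model_i = svmtrain(yTrain, Attributes(train,:), '-s 0 -t 2');
    [~,~,d] = svmpredict(zeros(numel(test),1), Attributes(test,:), model_i);
    %# LIBSVM signs decision values w.r.t. the first label it saw in training,
    %# so flip the sign when that label was -1
    decision(:,i) = d * model_i.Label(1);
end
[~,idx] = max(decision, [], 2);               %# largest decision value wins
predicted = labels(idx)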
I want to create a 4-dimensional meshgrid.
I know I need to use the ndgrid function. However, the output of meshgrid and ndgrid is not exactly the same unless one permutes dimensions.
To illustrate, a three-dimensional meshgrid seems to be equivalent to a three-dimensional ndgrid if the following permutations are done:
[X,Y,Z] = meshgrid(1:3,4:6,7:9);
[X_ndgrid,Y_ndgrid,Z_ndgrid] = ndgrid(1:3,4:6,7:9);
X_meshgrid = permute(X_ndgrid,[2,1,3]);
Y_meshgrid = permute(Y_ndgrid,[2,1,3]);
Z_meshgrid = permute(Z_ndgrid,[2,1,3]);
sum(sum(sum(X == X_meshgrid))) == 27
sum(sum(sum(Y == Y_meshgrid))) == 27
sum(sum(sum(Z == Z_meshgrid))) == 27
I was wondering what the right permutations are for a 4-D meshgrid.
[X_ndgrid,Y_ndgrid,Z_ndgrid,K_ndgrid] = ndgrid(1:3,4:6,7:9,10:12)
Edit: EBH, thanks for your answer below. Just one more quick question: if the end goal is to create a grid in order to use interpn, what would be the difference between creating the grid with meshgrid or with ndgrid (assuming a 3-dimensional problem)?
The difference between meshgrid and ndgrid is that meshgrid orders the first input vector along the columns and the second along the rows, so:
>> [X,Y] = meshgrid(1:3,4:6)
X =
1 2 3
1 2 3
1 2 3
Y =
4 4 4
5 5 5
6 6 6
while ndgrid orders them the other way around:
>> [X,Y] = ndgrid(1:3,4:6)
X =
1 1 1
2 2 2
3 3 3
Y =
4 5 6
4 5 6
4 5 6
After the first 2 dimensions there is no difference between them, so permuting only the first 2 dimensions should be enough. For 4 dimensions you would just write:
[X_ndgrid,Y_ndgrid,Z_ndgrid,K_ndgrid] = ndgrid(1:3,4:6,7:9,10:12);
[X_meshgrid,Y_meshgrid,Z_meshgrid] = meshgrid(1:3,4:6,7:9);
X_meshgrid_p = permute(X_meshgrid,[2,1,3]);
Y_meshgrid_p = permute(Y_meshgrid,[2,1,3]);
all(X_ndgrid(1:27).' == X_meshgrid_p(:)) % the transpose is only relevant for the comparison, not for the result
all(Y_ndgrid(1:27).' == Y_meshgrid_p(:))
all(Z_ndgrid(1:27).' == Z_meshgrid(:))
and it will return:
ans =
1
ans =
1
ans =
1
If you want to use it as an input for interpn, you should use the ndgrid format.
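For example, a minimal sketch of interpolating on an ndgrid-format grid (assuming a 3-dimensional problem):
[X,Y,Z] = ndgrid(1:3, 4:6, 7:9);
V = X + Y.*Z;                            % sample values defined on the grid
Vq = interpn(X, Y, Z, V, 2.5, 4.5, 8.25) % interpolate at a single query point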
I have a set of data that I wish to approximate via random sampling in a non-parametric manner, e.g.:
eventl =
4
5
6
8
10
11
12
24
32
In order to accomplish this, I initially bin the data up to a certain value:
binsize = 5;
nbins = 20;
[bincounts,ind] = histc(eventl,1:binsize:binsize*nbins);
Then I populate a vector with all the possible numbers covered by the bins, from which the approximation can choose:
sizes = transpose(1:binsize*nbins);
To use the bin counts as weights for selection (i.e. bincounts(1-5) = 2, so the weight for choosing 1, 2, 3, 4 or 5 is 2, while bincounts(16-20) = 0, so 16, 17, 18, 19 or 20 can never be chosen), I simply take the bin counts and replicate them across the bin size:
w = repelem(bincounts,binsize);
To then perform weighted number selection, I use:
[~,R] = histc(rand(1,1),cumsum([0;w(:)./sum(w)]));
R = sizes(R);
For some reason this approach is unable to approximate the data. It was my understanding that, with sufficient sampling depth, the binned version of R would be identical to the binned version of eventl; however, there is significant variation, and data are often found in bins whose weights were 0.
Could anybody suggest a better method to do this or point out the error?
For a better method, I suggest randsample:
values = [1 2 3 4 5 6 7 8]; %# values from which you want to pick
numberOfElements = 1000; %# how many values you want to pick
weights = [2 2 2 2 2 1 1 1]; %# weights given to the values (1-5 are twice as likely as 6-8)
sample = randsample(values, numberOfElements, true, weights);
Note that even with 1000 samples, the distribution does not exactly correspond to the weights, so if you only pick 20 samples, the histogram may look rather different.
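As a quick sanity check, you can compare the empirical frequencies against the normalized weights (a small sketch reusing the variables above):
counts = histc(sample, values);                       % how often each value was drawn
[counts(:)/numberOfElements, weights(:)/sum(weights)] % empirical vs. target frequencies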
I have extracted HOG features for male and female pictures, and now I'm trying to use the leave-one-out method to classify my data.
Since the standard way to write it in MATLAB is:
[Train, Test] = crossvalind('LeaveMOut', N, M);
What should I write instead of N and M?
Also, should I write the above code statement inside or outside a loop?
This is my code, where I have a training folder for male (80 images) and female (80 images) pictures, and another one for testing (10 random images).
for i = 1:10
    [Train, Test] = crossvalind('LeaveMOut', N, 1);
    SVMStruct = svmtrain(Training_Set(Train), train_label(Train));
    Gender = svmclassify(SVMStruct, Test_Set_MF(Test));
end
Notes:
Training_Set: an array consisting of the HOG features of the training-folder images.
Test_Set_MF: an array consisting of the HOG features of the test-folder images.
N: total number of images in the training folder.
The SVM should detect which images are male and which are female.
I will focus on how to use crossvalind for the leave-one-out-method.
I assume you want to select random sets inside a loop. N is the length of your data vector, and M is the number of randomly selected observations placed in Test; equivalently, M is the number of observations left out of Train. So set N to the size of your training set, and use M to specify how many observations you want in your Test output (i.e. left out of your Train output).
Here is an example, selecting M=2 observations out of the dataset.
dataset = [1 2 3 4 5 6 7 8 9 10];
N = length(dataset);
M = 2;
for i = 1:5
    [Train, Test] = crossvalind('LeaveMOut', N, M);
    % do whatever you want with Train and Test
    dataset(Test) % display the test-entries
end
This outputs (the sets are generated randomly, so you won't get the same result):
ans =
1 9
ans =
6 8
ans =
7 10
ans =
4 5
ans =
4 7
As you have it in your code, you need to adjust this for a matrix of features (indexing rows rather than single elements):
Training_Set = rand(10,3); % 10 samples with 3 features each
N = size(Training_Set,1);
M = 2;
for i = 1:5
    [Train, Test] = crossvalind('LeaveMOut', N, M);
    Training_Set(Train,:) % displays the data to train on
end
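Tying this back to your gender classifier, a hedged sketch of the whole leave-one-out loop might look like this (assuming train_label is a numeric N-by-1 label vector and using the Bioinformatics Toolbox svmtrain/svmclassify; the key fix is indexing rows with (Train,:), and note that with 'LeaveMOut' you classify the held-out training rows, not Test_Set_MF):
N = size(Training_Set,1);   % total number of training images (80 male + 80 female)
predictions = NaN(N,1);
for i = 1:N
    [Train, Test] = crossvalind('LeaveMOut', N, 1);  % random hold-out each iteration
    SVMStruct = svmtrain(Training_Set(Train,:), train_label(Train));
    predictions(Test) = svmclassify(SVMStruct, Training_Set(Test,:));
end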
I'm trying to solve the following problem:
I have a kernel made of 0's and 1's,
e.g. a cross-like kernel
kernel =
0 1 0
1 1 1
0 1 0
and I need to apply it to a given matrix like
D =
16 2 3 13
5 11 10 8
9 7 6 12
4 14 15 1
For simplicity, let's assume we start from element D(2,2), which is 11, to avoid padding (which I can handle with padarray).
I should superimpose the kernel and extract only the elements where kernel==1, i.e.
[2,5,11,10,7], then apply a custom filter such as the median or average to them and replace the central element with the result.
Then I would like to pass through all the other elements (neglecting edge elements for simplicity) and do the same.
Right now I'm using tempS = ordfilt2(D,order,kernel,'symmetric');
which performs exactly that operation with a median filter, but I would like to use a different criterion (e.g. the average or some other custom operation).
Use blockproc. This also handles border effects automatically (see the documentation). For example, to compute the median of the values masked by the kernel:
mask = logical(kernel);
R = blockproc(D, [1 1], @(d) median(d.data(mask)), ...
    'bordersize', [1 1], 'trimborder', 0);
The first [1 1] indicates the block size (the step); the second [1 1] indicates how many border elements to take around the central one.
With your example D, the result is
R =
2 3 3 3
9 7 8 10
5 9 10 6
4 7 6 1
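Only the function handle changes if you want a different criterion; for instance, the mean over the masked neighborhood (a sketch following the same pattern):
R2 = blockproc(D, [1 1], @(d) mean(d.data(mask)), ...
    'bordersize', [1 1], 'trimborder', 0);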
This should do what you want:
D = rand(10,20);
kernel = [0,1,0;1,1,1;0,1,0];
% offsets of the kernel's 1-entries relative to its center
% (these should be calculated from the kernel size)
[dy,dx] = find(kernel==1);
dy = dy-2;
dx = dx-2;
result = zeros(size(D));
% start and stop indices should also be calculated from the kernel size
for y = 2:(size(D,1)-1)
    for x = 2:(size(D,2)-1)
        elements = D(sub2ind(size(D),y+dy,x+dx));
        result(y,x) = weirdOperation(elements);
    end
end
Nevertheless, this will perform very poorly in terms of speed. You should consider using built-in functions instead: conv2 or filter2 for linear filtering, and ordfilt2 for order-statistic functionality.
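For the average specifically, a plain linear filter already does the job; a minimal sketch (note that conv2 zero-pads the borders, so edge values will differ from the loop above):
avgKernel = kernel / sum(kernel(:)); % normalize the cross so its weights sum to 1
R = conv2(D, avgKernel, 'same');     % 'same' returns an output the size of D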