I am trying to code a neural network that predicts the cross product of two vectors in R^3.
Xtrain has dimensions 3x2x1000 and is an array of 3x2 matrices (essentially two 3x1 vectors side by side). Ytrain is a 3x1000 array. Here is my code:
Xtrain = zeros(3,2,1000);
Ytrain = zeros(3,1000);
for c = 1:1000
    v1 = -1 + 2*rand(3,1);   % components uniform in [-1,1]
    v2 = -1 + 2*rand(3,1);
    C = cross(v1,v2);
    m = [v1, v2];
    Xtrain(:,:,c) = m;
    Ytrain(:,c) = C;
end
net = feedforwardnet([3 2]);
net.layers{1}.transferFcn = 'tansig';
net.layers{2}.transferFcn = 'purelin';
net = train(net, Xtrain, Ytrain);
However, MATLAB gives an 'Inputs X is not two-dimensional.' error. What is the best solution here?
In feedforward neural networks, each input sample must be a 1-D column vector.
So unroll your 2-D input into a 1-D array and train with that, as in the sketch below.
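A minimal sketch of that fix with the data from the question: reshape each 3x2 sample into a 6x1 column, so train sees a 6x1000 input matrix against the 3x1000 target matrix.

% unroll each 3x2 sample into a 6x1 column: Xtrain becomes 6x1000
Xtrain2 = reshape(Xtrain, 6, 1000);

net = feedforwardnet([3 2]);
net.layers{1}.transferFcn = 'tansig';
net.layers{2}.transferFcn = 'purelin';
net = train(net, Xtrain2, Ytrain);   % 6 inputs, 3 outputs per sample

Whether two hidden layers of 3 and 2 neurons are large enough to actually learn the cross product is a separate question; the reshape just fixes the dimensionality error.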
My problem is simple. I have trained a feedforwardnet, and now I want to extract its weights and biases so I can reimplement it in another programming language. But when I test those trained weights with my own code, it always returns different results compared to the Neural Network Toolbox. Here is my code:
close all
RandStream.setGlobalStream (RandStream ('mrg32k3a','Seed', 1234));
[x,t] = simplefit_dataset;
plot(t)
hold on
topo = [2]
net = feedforwardnet(topo);
net = train(net,x,t);
view(net)
y = net(x);
plot(y)
% re-implement the net manually from the extracted weights and biases
BI = net.B{1};    % hidden-layer biases
WI = net.IW{1};   % input-to-hidden weights
BO = net.B{2};    % output-layer biases
WO = net.LW{2};   % hidden-to-output weights
% hidden layer (tansig)
Z = WI*x + BI*ones(1,length(x));
Z = 2./(1+exp(-2*Z))-1;
% output layer (purelin)
Y = WO*Z + BO*ones(1,length(x));
plot(Y)
legend('target','tool box result','my result')
It is a simple neural network with only two layers.
No scaling or normalization is applied.
Here is the result:
Neural network inputs and outputs are mapped to the [-1, 1] range by default, so the calculation of Z and Y is correct for the mapped values, not for the actual inputs and outputs. To avoid this, you can unset the processFcns properties:
....
net = feedforwardnet(topo);
net.inputs{1}.processFcns= {};
net.outputs{2}.processFcns= {};
net = train(net,x,t);
....
Alternatively, you can map the input and output values manually:
x = mapminmax(x,-1,1);              % map each input row to [-1,1]
Z = WI*x + BI*ones(1,length(x));
Z = 2./(1+exp(-2*Z))-1;             % tansig
Y = WO*Z + BO*ones(1,length(x));    % purelin
Y = mapminmax(Y,min(y),max(y));     % map the output back to the target range
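A variant of the manual mapping that reuses mapminmax's stored settings instead of re-deriving the ranges; this is a sketch, assuming the default removeconstantrows step removed no rows (xs and ts are the hypothetical settings structures returned by mapminmax):

[xn, xs] = mapminmax(x);     % the same mapping train() applies to inputs
[tn, ts] = mapminmax(t);     % the same mapping applied to targets
Z  = 2./(1+exp(-2*(WI*xn + BI*ones(1,size(xn,2)))))-1;   % tansig
Yn = WO*Z + BO*ones(1,size(xn,2));                       % purelin
Y  = mapminmax('reverse', Yn, ts);   % undo the target mapping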
I am using DeepLearnToolbox to do CNN (Convolutional Neural Networks).
I have computed my network successfully and I've seen my accuracy, but my question is:
how can I feed one single image through the network in order to get the predicted label?
The final result I want is the predicted label, and maybe the error for each label that was not predicted.
Thank you.
This is the code that I have for testing the accuracy:
function [er, bad] = cnntest(net, x, y)
    % net = network, x = test_x (images), y = test_y (labels)
    net = cnnff(net, x);            % feedforward pass
    [~, h] = max(net.o);            % predicted class indices
    [~, a] = max(y);                % true class indices
    bad = find(h ~= a);             % misclassified samples
    er = numel(bad) / size(y, 2);   % error rate
end
These two lines
net = cnnff(net, x);
[~, h] = max(net.o);
feed an image x through the network and then compute the index h of the largest output activation. You can simply do the same for an arbitrary single input image, as sketched below, and it will give you the class h.
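A minimal sketch for one query image, assuming DeepLearnToolbox's convention that images are stacked along the third dimension (test_x here is a hypothetical test set in that layout):

img = test_x(:, :, 1);           % one image (third dimension = 1 sample)
net = cnnff(net, img);           % forward pass through the trained CNN
[score, label] = max(net.o);     % predicted class index and its activation
% net.o holds one output activation per class, so the remaining entries
% give a rough confidence for each label that was not predicted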
I have a simple MATLAB script that generates some random data and then uses a Euclidean and a Mahalanobis classifier to classify it. The issue I am having is that the error results for the two classifiers are always the same: they both always misclassify the same vectors, even though the data is different each time.
The data is created in a simple way so the results are easy to check. Because we have three classes, all of which are equiprobable, I just generate 333 random values for each class and append them all to X to be classified. Thus the results should be [class 1, class 2, class 3], 333 of each.
I can tell the classifiers work because the data created by mvnrnd is random each time and the error changes with it. But between the two classifiers the error does not change.
Can anyone tell why?
% Create some initial values, means, covariance matrix, etc
c = 3;
P = 1/c; % All 3 classes are equiprobable
N = 999;
m1 = [1, 1];
m2 = [12, 8];
m3 = [16, 1];
m = [m1; m2; m3];
S = [4 0; 0 4]; % All share the same covar matrix
% Generate random data for each class
X1 = mvnrnd(m1, S, N*P);
X2 = mvnrnd(m2, S, N*P);
X3 = mvnrnd(m3, S, N*P);
X = [X1; X2; X3];
% Create the ground-truth array xEst to compare results against:
% 333 ones, then 333 twos, then 333 threes
xEst = ceil(3/999:3/999:3);
% Do the actual classification for mahalanobis and euclidean
zEuc = euc_mal_classifier(m', S, P, X', c, N, true);
zMal = euc_mal_classifier(m', S, P, X', c, N, false);
% Check the results
numEucErr = 0;
numMalErr = 0;
for i = 1:N
    if zEuc(i) ~= xEst(i)
        numEucErr = numEucErr + 1;
    end
    if zMal(i) ~= xEst(i)
        numMalErr = numMalErr + 1;
    end
end
% Tell the user the results of the classification
strE = ['Euclidean classifier error percent: ', num2str((numEucErr/N) * 100)];
strM = ['Mahalanob classifier error percent: ', num2str((numMalErr/N) * 100)];
disp(strE);
disp(strM);
And the classifier
function z = euc_mal_classifier(m, S, P, X, c, N, eOrM)
    t = zeros(1, c);   % distance to each class mean
    z = zeros(1, N);   % assigned class per sample
    for i = 1:N
        for j = 1:c
            if eOrM   % Euclidean distance
                t(j) = sqrt((X(:,i)-m(:,j))'*(X(:,i)-m(:,j)));
            else      % Mahalanobis distance
                t(j) = sqrt((X(:,i)-m(:,j))'*(S\(X(:,i)-m(:,j))));
            end
        end
        [~, z(i)] = min(t);
    end
end
The reason why there is no difference in classification lies in your covariance matrix.
Assume the offset of a point from the center of a class is [x, y].
The Euclidean distance is then:
sqrt(x*x + y*y)
For Mahalanobis, the inverse of the covariance matrix is:
inv([a,0;0,a]) = [1/a,0;0,1/a]
so the distance becomes:
sqrt(x*x/a + y*y/a) = 1/sqrt(a) * sqrt(x*x + y*y)
So the distances to the classes are the same as the Euclidean ones, up to a scale factor. Since the scale factor is the same for all classes and dimensions, you will not find a difference in your class assignments.
Test it with a covariance matrix that is not a scaled identity and you will find that the errors differ.
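A quick numeric check of this scale-factor argument, with a = 4 matching the S in the question:

a = 4; S = a*eye(2);
d = [3; -2];                % arbitrary offset from a class mean
dEuc = sqrt(d'*d);          % Euclidean distance
dMal = sqrt(d'*(S\d));      % Mahalanobis distance
disp(dMal - dEuc/sqrt(a))   % prints ~0: identical up to the 1/sqrt(a) factor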
With this kind of data and a (scaled) identity covariance matrix, all of these classifiers should give almost the same performance.
Here is the same experiment with a non-identity covariance matrix, where the three classifiers lead to different errors:
err_bayesian =
0.0861
err_euclidean =
0.1331
err_mahalanobis =
0.0871
close('all'); clear;
% Generate and plot dataset X1
% (generate_gauss_classes, plot_data, bayes_classifier, Gaussian_ML_estimate,
% euclidean_classifier and mahalanobis_classifier are presumably the companion
% helper functions from "An Introduction to Pattern Recognition: A MATLAB Approach")
m1=[1, 1]'; m2=[10, 5]';m3=[11, 1]';
m=[m1 m2 m3];
S1 = [7 4 ; 4 5];
S(:,:,1)=S1;
S(:,:,2)=S1;
S(:,:,3)=S1;
P=[1/3 1/3 1/3];
N=1000;
randn('seed',0);
[X,y] =generate_gauss_classes(m,S,P,N);
plot_data(X,y,m,1);
randn('seed',200);
[X4,y1] =generate_gauss_classes(m,S,P,N);
% 2.5_b.1 Applying Bayesian classifier
z_bayesian=bayes_classifier(m,S,P,X4);
% 2.5_b.2 Apply ML estimates of the mean values and covariance matrix (common to all three
% classes) using function Gaussian_ML_estimate
class1_data=X(:,find(y==1));
[m1_hat, S1_hat]=Gaussian_ML_estimate(class1_data);
class2_data=X(:,find(y==2));
[m2_hat, S2_hat]=Gaussian_ML_estimate(class2_data);
class3_data=X(:,find(y==3));
[m3_hat, S3_hat]=Gaussian_ML_estimate(class3_data);
S_hat=(1/3)*(S1_hat+S2_hat+S3_hat);
m_hat=[m1_hat m2_hat m3_hat];
% Apply the Euclidean distance classifier, using the ML estimates of the means, in order to
% classify the data vectors of X1
z_euclidean=euclidean_classifier(m_hat,X4);
% 2.5_b.3 Similarly, for the Mahalanobis distance classifier, we have
z_mahalanobis=mahalanobis_classifier(m_hat,S_hat,X4);
% 2.5_c. Compute the error probability for each classifier
err_bayesian = (1-length(find(y1==z_bayesian))/length(y1))
err_euclidean = (1-length(find(y1==z_euclidean))/length(y1))
err_mahalanobis = (1-length(find(y1==z_mahalanobis))/length(y1))
This is a follow up question to:
PCA Dimensionality Reduction
In order to classify the new 10-dimensional test data, do I have to reduce the training data down to 10 dimensions as well?
I tried:
X = bsxfun(@minus, trainingData, mean(trainingData,1));
covariancex = (X'*X)./(size(X,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimensions
Xtrain = bsxfun(@minus, trainingData, mean(trainingData,1));
pcatrain = Xtest*V;
But using the classifier with this and the 10-dimensional testing data produces very unreliable results. Is there something I am doing fundamentally wrong?
Edit:
X = bsxfun(@minus, trainingData, mean(trainingData,1));
covariancex = (X'*X)./(size(X,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimensions
Xtrain = bsxfun(@minus, trainingData, mean(trainingData,1));
pcatrain = Xtest*V;

X = bsxfun(@minus, pcatrain, mean(pcatrain,1));
covariancex = (X'*X)./(size(X,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimensions
Xtest = bsxfun(@minus, test, mean(pcatrain,1));
pcatest = Xtest*V;
You have to reduce both training and test data, but both in the same way. Once you have obtained your reduction matrix from PCA on the training data, you must use that same matrix to reduce the dimensionality of the test data. In short, you need one constant transformation that is applied to both training and test elements.
Using your code
% first, 0-mean the data (subtract the TRAINING mean from both sets)
mu = mean(Xtrain,1);
Xtrain = bsxfun(@minus, Xtrain, mu);
Xtest = bsxfun(@minus, Xtest, mu);

% compute PCA on the training data only
covariancex = (Xtrain'*Xtrain)./(size(Xtrain,1)-1);
[V D] = eigs(covariancex, 10); % reduce to 10 dimensions
pcatrain = Xtrain*V;
% here you should train your classifier on pcatrain and ytrain (correct labels)
pcatest = Xtest*V;
% here you can test your classifier on pcatest using ytest (compare with correct labels)
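For completeness, a sketch of that classification step on the reduced data; this assumes the Statistics and Machine Learning Toolbox and hypothetical label vectors ytrain and ytest:

mdl = fitcknn(pcatrain, ytrain, 'NumNeighbors', 5);   % any classifier works here
ypred = predict(mdl, pcatest);
acc = mean(ypred == ytest)   % fraction of correctly classified test samples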
I want to do a 10-fold cross-validation in my one-against-all support vector machine classification in MATLAB.
I tried to somehow mix these two related answers:
Multi-class classification in libsvm
Example of 10-fold SVM classification in MATLAB
But as I'm new to MATLAB and its syntax, I haven't managed to make it work so far.
On the other hand, I saw just the following few lines about cross validation in the LibSVM README files and I couldn't find any related example there:
option -v randomly splits the data into n parts and calculates cross
validation accuracy/mean squared error on them.
See libsvm FAQ for the meaning of outputs.
Could anyone provide me with an example of 10-fold cross-validation combined with one-against-all classification?
Mainly there are two reasons we do cross-validation:
as a testing method which gives us a nearly unbiased estimate of the generalization power of our model (by avoiding overfitting)
as a way of model selection (e.g. finding the best C and gamma parameters over the training data; see this post for an example, and the short sketch below)
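As a side note on that second use, here is a tiny grid-search sketch with libsvm's -v option (the parameter ranges are arbitrary placeholders):

bestAcc = 0;
for c = 2.^(-1:2:7)
    for g = 2.^(-7:2:1)
        opts = sprintf('-s 0 -t 2 -c %g -g %g -v 5 -q', c, g);
        cvAcc = libsvmtrain(labels, data, opts);   % with -v, libsvmtrain returns the CV accuracy
        if cvAcc > bestAcc, bestAcc = cvAcc; bestC = c; bestG = g; end
    end
end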
For the first case, which is the one we are interested in here, the process involves training k models, one per fold, and then training one final model over the entire training set.
We report the average accuracy over the k folds.
Now since we are using one-vs-all approach to handle the multi-class problem, each model consists of N support vector machines (one for each class).
The following are wrapper functions implementing the one-vs-all approach:
function mdl = libsvmtrain_ova(y, X, opts)
    if nargin < 3, opts = ''; end

    %# classes
    labels = unique(y);
    numLabels = numel(labels);

    %# train one-against-all models
    models = cell(numLabels,1);
    for k=1:numLabels
        models{k} = libsvmtrain(double(y==labels(k)), X, strcat(opts,' -b 1 -q'));
    end

    mdl = struct('models',{models}, 'labels',labels);
end
function [pred,acc,prob] = libsvmpredict_ova(y, X, mdl)
    %# classes
    labels = mdl.labels;
    numLabels = numel(labels);

    %# get probability estimates of test instances using each 1-vs-all model
    prob = zeros(size(X,1), numLabels);
    for k=1:numLabels
        [~,~,p] = libsvmpredict(double(y==labels(k)), X, mdl.models{k}, '-b 1 -q');
        prob(:,k) = p(:, mdl.models{k}.Label==1);
    end

    %# predict the class with the highest probability
    [~,pred] = max(prob, [], 2);

    %# compute classification accuracy
    acc = mean(pred == y);
end
And here are functions to support cross-validation:
function acc = libsvmcrossval_ova(y, X, opts, nfold, indices)
    if nargin < 3, opts = ''; end
    if nargin < 4, nfold = 10; end
    if nargin < 5, indices = crossvalidation(y, nfold); end

    %# N-fold cross-validation testing
    acc = zeros(nfold,1);
    for i=1:nfold
        testIdx = (indices == i); trainIdx = ~testIdx;
        mdl = libsvmtrain_ova(y(trainIdx), X(trainIdx,:), opts);
        [~,acc(i)] = libsvmpredict_ova(y(testIdx), X(testIdx,:), mdl);
    end
    acc = mean(acc);    %# average accuracy
end
function indices = crossvalidation(y, nfold)
    %# stratified n-fold cross-validation
    %#indices = crossvalind('Kfold', y, nfold);   %# Bioinformatics toolbox
    cv = cvpartition(y, 'kfold',nfold);           %# Statistics toolbox
    indices = zeros(size(y));
    for i=1:nfold
        indices(cv.test(i)) = i;
    end
end
Finally, here is a simple demo to illustrate the usage:
%# load dataset
S = load('fisheriris');
data = zscore(S.meas);
labels = grp2idx(S.species);

%# cross-validate using one-vs-all approach
opts = '-s 0 -t 2 -c 1 -g 0.25';   %# libsvm training options
nfold = 10;
acc = libsvmcrossval_ova(labels, data, opts, nfold);
fprintf('Cross Validation Accuracy = %.4f%%\n', 100*mean(acc));

%# compute final model over the entire dataset
mdl = libsvmtrain_ova(labels, data, opts);
Compare that against the one-vs-one approach, which libsvm uses by default:
acc = libsvmtrain(labels, data, sprintf('%s -v %d -q',opts,nfold));
model = libsvmtrain(labels, data, strcat(opts,' -q'));
It may be confusing that one of the two linked questions is not about LIBSVM; you should adapt that answer and ignore the other one.
You should select the folds yourself, and do the rest exactly as in the linked question. Assume the data has been loaded into data and the labels into labels:
n = size(data,1);
ns = floor(n/10);
numLabels = numel(unique(labels));
for fold=1:10,
    if fold==1,
        testindices = ((fold-1)*ns+1):fold*ns;
        trainindices = fold*ns+1:n;
    else
        if fold==10,
            testindices = ((fold-1)*ns+1):n;
            trainindices = 1:(fold-1)*ns;
        else
            testindices = ((fold-1)*ns+1):fold*ns;
            trainindices = [1:(fold-1)*ns, fold*ns+1:n];
        end
    end
    % use testindices only for testing and trainindices only for training
    trainLabel = labels(trainindices);
    trainData = data(trainindices,:);
    testLabel = labels(testindices);
    testData = data(testindices,:);

    %# train one-against-all models
    model = cell(numLabels,1);
    for k=1:numLabels
        model{k} = svmtrain(double(trainLabel==k), trainData, '-c 1 -g 0.2 -b 1');
    end

    %# get probability estimates of test instances using each model
    prob = zeros(size(testData,1),numLabels);
    for k=1:numLabels
        [~,~,p] = svmpredict(double(testLabel==k), testData, model{k}, '-b 1');
        prob(:,k) = p(:,model{k}.Label==1);   %# probability of class==k
    end

    %# predict the class with the highest probability
    [~,pred] = max(prob,[],2);
    acc = sum(pred == testLabel) ./ numel(testLabel)   %# accuracy
    C = confusionmat(testLabel, pred)                  %# confusion matrix
end