Problem with convergence of a 1D CNN model - MATLAB

I want to train a 1D CNN on sequential data. I have two classes, two features, and 1584 samples. Each element of my training array is a 2xn cell (n = number of frames in that sample; screenshot attached). I have tried the MATLAB example below, but my model does not converge at all; it just stays at 51%.
I don't know how to choose the number of filters, the filter size, or the number of layers.
Can anyone help?
https://www.mathworks.com/help/deeplearning/ug/sequence-classification-using-1-d-convolutions.html#SequenceClassificationUsing1DConvolutionsExample-2
filterSize = 2;
numFilters = 16;
numClasses = 2;
numFeatures = 2;

layers = [ ...
    sequenceInputLayer(numFeatures)
    convolution1dLayer(filterSize,numFilters,Padding="causal")
    reluLayer
    layerNormalizationLayer
    convolution1dLayer(filterSize,2*numFilters,Padding="causal")
    reluLayer
    layerNormalizationLayer
    globalAveragePooling1dLayer
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

miniBatchSize = 27;
options = trainingOptions("adam", ...
    MiniBatchSize=miniBatchSize, ...
    MaxEpochs=15, ...
    SequencePaddingDirection="left", ...
    ValidationData={ValidationData,labelsValidation}, ...
    Plots="training-progress", ...
    Verbose=0);
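The snippet never shows the actual training call. A minimal sketch of what it might look like, assuming the training sequences and labels live in hypothetical variables TrainingData (a cell array of 2-by-n matrices) and labelsTrain (a categorical vector), named by analogy with the validation variables above; z-scoring each feature often helps convergence when the raw features have very different scales:
% Hypothetical variable names: TrainingData, labelsTrain.
% Normalize each sequence per feature (zero mean, unit variance per row).
TrainingDataNorm = cellfun(@(s) (s - mean(s,2)) ./ std(s,0,2), ...
    TrainingData, 'UniformOutput', false);
net = trainNetwork(TrainingDataNorm, labelsTrain, layers, options);
If training still stalls at chance level, checking that the labels are a categorical vector aligned with the cell array, and lowering the learning rate via InitialLearnRate, are common next steps.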

Related

Confusion matrix for an R-CNN detector in MATLAB

I tagged 300 images and completed training of an R-CNN detector object based on the AlexNet network. I can test the pictures one by one; I also have 100 unlabeled test pictures containing 12 different people.
testImage = imread('E:.....\....jpg');
[bboxes, score, label] = detect(detector, testImage, 'MiniBatchSize', 128);

T = 0.7;            % confidence threshold
idx = score >= T;   % keep detections above the threshold
s = score(idx);
lbl = label(idx);
bbox = bboxes(idx, :);

outputImage = testImage;
for ii = 1:size(bbox, 1)
    annotation = sprintf('%s: (Confidence = %f)', lbl(ii), s(ii));
    outputImage = insertObjectAnnotation(outputImage, 'rectangle', bbox(ii,:), annotation);
end
figure
imshow(outputImage)
But for the 100 test pictures, each with 12 people (12 labels), I cannot plot a confusion matrix that shows the success of the network, i.e. the number of correct predictions. There are many examples; which one should I use? For instance, in one experiment I get the error below.
Example
net = trainNetwork(XTrain,YTrain,layers,options);
[XTest,YTest] = trainingData;
YPredicted = classify(net,XTest);
plotconfusion(YTest,YPredicted)
Error
[XTest,YTest] = detector;
Insufficient number of outputs from right hand side of equal sign to satisfy assignment.
I have 100 test pictures and an R-CNN detector trained using AlexNet. How can I use a confusion matrix to see how many of the 12 people (12 labels) are tagged correctly in the 100 images? How can I report the success of the R-CNN detector over 100 images and 12 different labels?
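No answer is shown for this question here, but one possible sketch (an assumption on my part, not from the thread): if each test image is dominated by one person, keep the highest-scoring detection per image and compare its label against that image's ground-truth label. The variables testFiles (cell array of image paths) and trueLabels (categorical vector of the 12 person labels) are hypothetical:
% Keep the strongest detection per image and build a confusion matrix.
predLabels = categorical(nan(numel(testFiles),1)); % <undefined> placeholders
for k = 1:numel(testFiles)
    I = imread(testFiles{k});
    [~, score, label] = detect(detector, I, 'MiniBatchSize', 128);
    if ~isempty(score)
        [~, best] = max(score);       % strongest detection wins
        predLabels(k) = label(best);
    end
end
C = confusionmat(trueLabels, predLabels)  % rows = true, columns = predicted
Images with no detection stay <undefined> and are treated as missing by confusionmat. A proper detector evaluation would instead match every predicted box to a ground-truth box, but this simpler per-image view already yields a 12x12 confusion matrix.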

Correct practice and approach for reporting the training and generalization performance

I am trying to learn the correct procedure for training a neural network for classification. There are many tutorials, but they never explain how to report the generalization performance. Can somebody please tell me whether the following is the correct method or not? I use the first 100 examples from the fisheriris data set, which have labels 1 and 2, and call the features and labels X and Y respectively. Then I split X into trainData and Xtest with a 90/10 split ratio. Using trainData I trained the NN model. The NN internally further splits trainData into train, validation, and test subsets. My confusion is which one is usually used for generalization purposes when reporting the performance of the model on unseen data in conferences/journals.
The dataset can be found in the link: https://www.mathworks.com/matlabcentral/fileexchange/71468-simple-neural-networks-with-k-fold-cross-validation-manner
rng('default')
load iris.mat;
X = [f(1:100,:) l(1:100)];
numExamples = size(X,1);
indx = randperm(numExamples);
X = X(indx,:);
Y = X(:,end);
split1 = cvpartition(Y,'Holdout',0.1,'Stratify',true); % 90% train/val, 10% test
istrainval = training(split1); % indices for fitting
istest = test(split1);         % indices for quality assessment
trainData = X(istrainval,:);
Xtest = X(istest,:);
Ytest = Y(istest);

numExamplesXtrainval = size(trainData,1);
indxXtrainval = randperm(numExamplesXtrainval);
trainData = trainData(indxXtrainval,:);
Ytrain = trainData(:,end);

hiddenLayerSize = 10;
% data format: rows = number of dimensions, columns = number of examples
net = patternnet(hiddenLayerSize);
net = init(net);
net.performFcn = 'crossentropy';
net.trainFcn = 'trainscg';
net.trainParam.epochs = 50;
[net, tr] = train(net,trainData',Ytrain');
Trained = sim(net,trainData'); % outputs predicted labels
train_predict = net(trainData');
performanceTrain = perform(net,Ytrain',train_predict)
lbl_train = grp2idx(Ytrain);
Yhat_train = (train_predict >= 0.5);
Lbl_Yhat_Train = grp2idx(Yhat_train);
[cmMatrixTrain] = confusionmat(lbl_train,Lbl_Yhat_Train)
accTrain = sum(lbl_train == Lbl_Yhat_Train)/size(lbl_train,1);
disp(['Training Set: Total Train Accuracy by MLP = ',num2str(100*accTrain), '%'])
[confTest] = confusionmat(lbl_train(tr.testInd),Lbl_Yhat_Train(tr.testInd))

% unknown test
test_predict = net(Xtest');
performanceTest = perform(net,Ytest',test_predict);
Yhat_test = (test_predict >= 0.5);
test_lbl = grp2idx(Ytest);
Lbl_Yhat_Test = grp2idx(Yhat_test);
[cmMatrix_Test] = confusionmat(test_lbl,Lbl_Yhat_Test)
This is the output.
Problem 1: There seems to be no prediction for the other class. Why?
Problem 2: Do I need a separate dataset like the Xtest I created for reporting the generalization error, or is it the practice to use trainData(tr.testInd,:) as the generalization test set? Did I create an unnecessary subset?
performanceTrain =
2.2204e-16
cmMatrixTrain =
45 0
45 0
Training Set: Total Train Accuracy by MLP = 50%
confTest =
9 0
5 0
cmMatrix_Test =
5 0
5 0
There are a few issues with the code. Let's deal with them before answering your question. First, you set a threshold of 0.5 for making decisions (Yhat_train = (train_predict >= 0.5);) while all points of your net's predictions are above 0.5. This means one class is never predicted, so one column of your confusion matrices is all zeros. You can plot the scores to choose a better threshold:
figure;
plot(train_predict(Ytrain == 1),'.b')
hold on
plot(train_predict(Ytrain == 2),'.r')
legend('label 1','label 2')
cvpartition gave me an error. It ran successfully as split1 = cvpartition(Y,'Holdout',0.1);. In any case, artificial neural networks usually manage the partitioning within the training process, so you feed in X and Y plus some parameters for how to do it. See here for example: link, where you set
net.divideParam.trainRatio = .4;
net.divideParam.valRatio = .3;
net.divideParam.testRatio = .3;
So how should you report the results? Only on the test data. The training data will suffer from overfitting and show deceptively good results. If you use validation data (you haven't), you cannot report results on it either, because it also suffers from overfitting. If you let the training routine do the validation for you, your test results will be safe from overfitting; a minimal sketch of that follows.
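A minimal sketch of reporting only on the internal test split, reusing the question's variables (the threshold of 1.5 is an assumption, since the targets are the labels 1 and 2; in practice pick it from the score plot above):
% Score only the indices that train() held out for testing.
testIdx   = tr.testInd;
scores    = net(trainData(testIdx,:)'); % network outputs for the test split
threshold = 1.5;                        % assumed midpoint between labels 1 and 2
YhatTest  = (scores >= threshold) + 1;  % map to labels 1/2
cmTest    = confusionmat(Ytrain(testIdx), YhatTest')
accTest   = mean(Ytrain(testIdx) == YhatTest')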

Using osusvm to recognize faces after the C2 layer of the HMAX model

I have implemented the computation of the original HMAX model and obtained the results at the C2 layer. Now I still have to do the tuning layer, in other words, to apply osusvm.
In my project I have two directories: one containing the training images and the other containing the test images.
Reference: lennon310's response in Training images and test images
First, I would like to show you my results at the C2 layer (the results are vectors). Note that I extracted only 2 prototypes in the S2 layer (in my project I used 256 prototypes, but for this question assume I used only 2) and four prototype sizes: [4 8 12 16]. So for each image we get 8 C2 units (2 prototypes x 4 patch sizes = 8).
C2res{1}: For the six training images:
0.0088 0.0098 0.0030 0.0067 0.0063 0.0057
0.0300 0.0315 0.0251 0.0211 0.0295 0.0248
0.1042 0.1843 0.1151 0.1166 0.0668 0.1134
0.3380 0.2529 0.3709 0.2886 0.3938 0.3078
0.2535 0.3255 0.3564 0.2196 0.1681 0.2827
3.9902 5.3475 4.5504 4.9500 6.7440 4.4033
0.8520 0.8740 0.7209 0.7705 0.4303 0.7687
6.3131 7.2560 7.9412 7.1929 9.8789 6.6764
C2res{2}: For the two test images:
0.0080 0.0132
0.0240 0.0001
0.1007 0.2214
0.3055 0.0249
0.2989 0.3483
4.6946 4.2762
0.7048 1.2791
6.7595 4.7728
Second, I downloaded the osu-svm MATLAB toolbox and added it to my path:
addpath(genpath('./osu-svm/')); % put your own path to osusvm here
useSVM = 1; % if you do not have osusvm installed you can set this to 0,
            % so that the classifier is a NN classifier
            % (note: NN is not a great classifier for these features)
Then I used the code below:
% Simple classification code
XTrain = [C2res{1}];                 % training examples as columns
XTest  = [C2res{2}];                 % testing examples as columns
ytrain = [ones(size(C2res{1},2),1)]; % the labels of the training set
ytest  = [ones(size(C2res{2},2),1)]; % the true labels of the test set
if useSVM
    Model = CLSosusvm(XTrain,ytrain);  % training
    [ry,rw] = CLSosusvmC(XTest,Model); % predicting new labels
else % use a Nearest Neighbor classifier
    Model = CLSnn(XTrain,ytrain);      % training
    [ry,rw] = CLSnnC(XTest,Model);     % predicting new labels
end
successrate = mean(ytest==ry) % a simple classification score
Is the code above correct? Why do I always get successrate = 1? I think I am going wrong somewhere; please, I need help. If the code is correct, is there another way to compute this? What can I use instead of successrate to get more meaningful results?
Note:
The function CLSosusvm is:
function Model = CLSosusvm(Xtrain,Ytrain,sPARAMS);
%function Model = CLSosusvm(Xtrain,Ytrain,sPARAMS);
%
%Builds an SVM classifier
%This is only a wrapper function for osu svm
%It requires that osu svm (http://www.ece.osu.edu/~maj/osu_svm/) is installed and included in the path
%X contains the data-points as COLUMNS, i.e., X is nfeatures \times nexamples
%y is a column vector of all the labels. y is nexamples \times 1
%sPARAMS is a structure of parameters:
%sPARAMS.KERNEL specifies the kernel type
%sPARAMS.C specifies the regularization constant
%sPARAMS.GAMMA, sPARAMS.DEGREE are parameters of the kernel function
%Model contains the parameters of the SVM model as returned by osu svm
Ytrain = Ytrain';

if nargin < 3
    SETPARAMS = 1;
elseif isempty(sPARAMS)
    SETPARAMS = 1;
else
    SETPARAMS = 0;
end

if SETPARAMS
    sPARAMS.KERNEL = 0;
    sPARAMS.C = 1;
end

switch sPARAMS.KERNEL
    case 0
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            LinearSVC(Xtrain, Ytrain, sPARAMS.C);
    case 1
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            PolySVC(Xtrain, Ytrain, sPARAMS.DEGREE, sPARAMS.C, 1, 0);
    case 2
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            PolySVC(Xtrain, Ytrain, sPARAMS.DEGREE, sPARAMS.C, 1, sPARAMS.COEF);
    case 3
        [AlphaY, SVs, Bias, Parameters, nSV, nLabel] = ...
            RbfSVC(Xtrain, Ytrain, sPARAMS.GAMMA, sPARAMS.C);
end

Model.AlphaY = AlphaY;
Model.SVs = SVs;
Model.Bias = Bias;
Model.Parameters = Parameters;
Model.nSV = nSV;
Model.nLabel = nLabel;
Model.sPARAMS = sPARAMS;
The function CLSosusvmC is:
function [Labels, DecisionValue]= CLSosusvmC(Samples, Model);
%function [Labels, DecisionValue]= CLSosusvmC(Samples, Model);
%
%wrapper function for osu svm classification
%Samples contains the data-points to be classified as COLUMNS, i.e., it is nfeatures \times nexamples
%Model is the model returned by CLSosusvm
%Labels are the predicted labels
%DecisionValue are the values assigned by the Model to the points (Labels = sign(DecisionValue))
[Labels, DecisionValue]= SVMClass(Samples, Model.AlphaY, ...
Model.SVs, Model.Bias, ...
Model.Parameters, Model.nSV, Model.nLabel);
Labels = Labels';
DecisionValue = DecisionValue';
Your code looks good to me.
Since you have only 2 test images, the possible success rates are limited to 0, 0.5, and 1, and a random classifier is expected to achieve 100% accuracy with 25% probability (the four equally likely outcomes being [0 1], [1 0], [1 1], [0 0]). You can shuffle the data, re-select 2 of the 8 images as the test set several times, and observe the accuracy.
Also try to add more images to both the training and test samples.
Machine learning makes little sense on a set of 8 images. Gather at least 10x more data and then analyze the results. With such a small dataset any result is possible (from 0 to 100 percent), and none of them is reliable.
Meanwhile you can try to perform repeated cross-validation (see the sketch after this list):
Shuffle your data.
Split it into two-element parts: [1 2] [3 4] [5 6] [7 8].
For each such part, test on it while training on the rest (for example, train on [3 4 5 6 7 8] and test on [1 2]) and record the score.
Repeat the whole process and report the mean score.
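A minimal sketch of that procedure, assuming X is an nfeatures-by-8 matrix holding the C2 vectors as columns and y is an 8-by-1 label vector (both hypothetical names), using the question's own wrapper functions:
% Repeated 4-fold cross-validation on 8 examples.
nRepeats = 100;
scores = zeros(nRepeats,1);
for r = 1:nRepeats
    order = randperm(8);                      % shuffle
    foldAcc = zeros(4,1);
    for f = 1:4
        testIdx  = order(2*f-1 : 2*f);        % two-element test fold
        trainIdx = setdiff(order, testIdx);   % train on the remaining six
        Model = CLSosusvm(X(:,trainIdx), y(trainIdx));
        ry = CLSosusvmC(X(:,testIdx), Model);
        foldAcc(f) = mean(y(testIdx) == ry);
    end
    scores(r) = mean(foldAcc);
end
fprintf('Mean CV accuracy over %d repeats: %.3f\n', nRepeats, mean(scores))
Note that this is only meaningful if y contains both classes; in the snippet above, both ytrain and ytest are all ones.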

Neural network not training

I'm trying to set up a custom neural network, but when I train it, it doesn't train: the training process performs 0 iterations! I don't get any errors, just 0 iterations, and I have no idea why. (The architecture might seem odd to you; it is supposed to be a custom PNN. But before we discuss whether it makes sense or not, I would like to be able to train it at all...)
Here is the code
net = network;
net.trainFcn = 'trainlm';
net.performFcn = 'mse';
net.numInputs = 1;
net.numLayers = (2*nbclasses)+1;     % (one pattern layer + one summation layer per class) + competition layer
net.inputConnect(1:nbclasses,:) = 1; % connect the input to all pattern layers
for i = 1:nbclasses                  % connect the pattern layers to their corresponding summation layers
    net.layerConnect(i+nbclasses,i) = 1;
    net.layers{i}.size = size(tr_feature,1);
    net.layers{i}.transferFcn = 'radbas';
end
for i = (nbclasses+1):(nbclasses*2)  % connect all summation layers to the competition layer
    net.layers{i}.size = 1;
    net.layerConnect(net.numLayers,i) = 1;
end
net.layers{net.numLayers}.transferFcn = 'compet';
net.outputConnect(1,end) = 1;
net.view;
[net, tr] = train(net,tr_feature',tr_true');
% tr_feature is an 800x2 data matrix, tr_true is the corresponding 800x1 label vector
Any idea?
Thanks in advance!

Using precomputed kernels with libsvm

I'm currently working on classifying images with different image descriptors. Since these have their own metrics, I am using precomputed kernels. So given NxN kernel matrices (for a total of N images), I want to train and test an SVM. I'm not very experienced with SVMs, though.
What confuses me is how to supply the input for training. Using an MxM subset of the kernel (M being the number of training images) trains the SVM with M features. However, if I understand correctly, this limits me to test data with a similar number of features. Trying to use a sub-kernel of size MxN causes infinite loops during training; consequently, using more features when testing gives poor results.
As a result, only equal-sized training and test sets give reasonable results. But if I want to classify, say, just one image, or to train with a given number of images per class and test with the rest, it doesn't work at all.
How can I remove the dependency between the number of training images and the number of features, so that I can test with any number of images?
I'm using libsvm for MATLAB; the kernels are distance matrices with values in [0,1].
You seem to already have figured out the problem... According to the README file included in the MATLAB package:
To use precomputed kernel, you must include sample serial number as
the first column of the training and testing data.
Let me illustrate with an example:
%# read dataset
[dataClass, data] = libsvmread('./heart_scale');
%# split into train/test datasets
trainData = data(1:150,:);
testData = data(151:270,:);
trainClass = dataClass(1:150,:);
testClass = dataClass(151:270,:);
numTrain = size(trainData,1);
numTest = size(testData,1);
%# radial basis function: exp(-gamma*|u-v|^2)
sigma = 2e-3;
rbfKernel = @(X,Y) exp(-sigma .* pdist2(X,Y,'euclidean').^2);
%# compute kernel matrices between every pairs of (train,train) and
%# (test,train) instances and include sample serial number as first column
K = [ (1:numTrain)' , rbfKernel(trainData,trainData) ];
KK = [ (1:numTest)' , rbfKernel(testData,trainData) ];
%# train and test
model = svmtrain(trainClass, K, '-t 4');
[predClass, acc, decVals] = svmpredict(testClass, KK, model);
%# confusion matrix
C = confusionmat(testClass,predClass)
The output:
*
optimization finished, #iter = 70
nu = 0.933333
obj = -117.027620, rho = 0.183062
nSV = 140, nBSV = 140
Total nSV = 140
Accuracy = 85.8333% (103/120) (classification)
C =
65 5
12 38