What is the training accuracy of this model? - matlab

[Image: training accuracy plot]
I'm trying to classify ECG signals using an LSTM in MATLAB. The plot above shows a training accuracy of 100%, but when I run the following code to calculate the accuracy myself, I get only 20%:
LSTMAccuracy = sum(trainPred == Labels)/numel(Labels)*100
[Image: training accuracy calculated]
Am I missing something here, or is there something wrong in my code?
Here is the configuration and the training code:
layers = [ ...
sequenceInputLayer(1)
bilstmLayer(100,'OutputMode','last')
fullyConnectedLayer(5)
softmaxLayer
classificationLayer
]
options = trainingOptions('adam', ...
'MaxEpochs',1750, ...
'MiniBatchSize', 150, ...
'InitialLearnRate', 0.0001, ...
'ExecutionEnvironment',"auto",...
'plots','training-progress', ...
'Verbose',false);
net = trainNetwork(Signals, Labels, layers, options);
trainPred = classify(net, Signals,'SequenceLength',1000);
LSTMAccuracy = sum(trainPred == Labels)/numel(Labels)*100
figure
confusionchart(Labels,trainPred,'ColumnSummary','column-normalized',...
'RowSummary','row-normalized','Title','Confusion Chart for LSTM');

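A likely cause is a mismatch in sequence handling: trainNetwork uses 'SequenceLength','longest' by default, whereas the classify call above passes 'SequenceLength',1000, which can split the signals into 1000-step chunks that the network was not trained on. Classifying with the same setting used for training solves the problem: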
trainPred = classify(net, Signals,'SequenceLength','longest');

Related

Problem with Converging of the 1D CNN model

I want to train a 1D CNN on sequence data. I have two classes, two features, and 1584 samples. Each element of my training array is a 2xn cell (n = number of frames in each sample, screenshot attached). I have tried the MATLAB example, but my model does not converge at all; the accuracy just stays at 51%.
I don't know how to choose the number of filters, the filter size, or the layers.
Can anyone help?
https://www.mathworks.com/help/deeplearning/ug/sequence-classification-using-1-d-convolutions.html#SequenceClassificationUsing1DConvolutionsExample-2
filterSize = 2;
numFilters = 16;
Num_classes=2;
numFeatures=2;
layers = [ ...
sequenceInputLayer(numFeatures)
convolution1dLayer(filterSize,numFilters,Padding="causal")
reluLayer
layerNormalizationLayer
convolution1dLayer(filterSize,2*numFilters,Padding="causal")
reluLayer
layerNormalizationLayer
globalAveragePooling1dLayer
fullyConnectedLayer(Num_classes)
softmaxLayer
classificationLayer];
miniBatchSize = 27;
options = trainingOptions("adam", ...
MiniBatchSize=miniBatchSize, ...
MaxEpochs=15, ...
SequencePaddingDirection="left", ...
ValidationData={ValidationData,labelsValidation}, ...
Plots="training-progress", ...
Verbose=0);

Correct practice and approach for reporting the training and generalization performance

I am trying to learn the correct procedure for training a neural network for classification. There are many tutorials, but they never explain how to report generalization performance. Can somebody please tell me whether the following method is correct? I am using the first 100 examples from the fisheriris data set, which have labels 1 and 2, and call the data and labels X and Y respectively. Then I split X into trainData and Xtest with a 90/10 split ratio. Using trainData I trained the NN model. The NN internally splits trainData further into tr, val, and test subsets. My confusion is: which subset is usually used to report the model's performance on unseen data in conferences/journals?
The dataset can be found in the link: https://www.mathworks.com/matlabcentral/fileexchange/71468-simple-neural-networks-with-k-fold-cross-validation-manner
rng('default')
load iris.mat;
X = [f(1:100,:) l(1:100)];
numExamples = size(X,1);
indx = randperm(numExamples);
X = X(indx,:);
Y = X(:,end);
split1 = cvpartition(Y,'Holdout',0.1,'Stratify',true); %90% trainval 10% test
istrainval = training(split1); % index for fitting
istest = test(split1); % indices for quality assessment
trainData = X(istrainval,:);
Xtest = X(istest,:);
Ytest = Y(istest);
numExamplesXtrainval = size(trainData,1);
indxXtrainval = randperm(numExamplesXtrainval);
trainData = trainData(indxXtrainval,:);
Ytrain = trainData(:,end);
hiddenLayerSize = 10;
% data format = rows = number of dim, column = number of examples
net = patternnet(hiddenLayerSize);
net = init(net);
net.performFcn = 'crossentropy';
net.trainFcn = 'trainscg';
net.trainParam.epochs=50;
[net tr]= train(net,trainData', Ytrain');
Trained = sim(net, trainData'); %outputs predicted labels
train_predict = net(trainData');
performanceTrain = perform(net,Ytrain',train_predict)
lbl_train=grp2idx(Ytrain);
Yhat_train = (train_predict >= 0.5);
Lbl_Yhat_Train = grp2idx(Yhat_train);
[cmMatrixTrain]= confusionmat(lbl_train,Lbl_Yhat_Train )
accTrain=sum(lbl_train ==Lbl_Yhat_Train)/size(lbl_train,1);
disp(['Training Set: Total Train Acccuracy by MLP = ',num2str(100*accTrain ), '%'])
[confTest] = confusionmat(lbl_train(tr.testInd),Lbl_Yhat_Train(tr.testInd) )
%unknown test
test_predict = net(Xtest');
performanceTest = perform(net,Ytest',test_predict);
Yhat_test = (test_predict >= 0.5);
test_lbl=grp2idx(Ytest);
Lbl_Yhat_Test = grp2idx(Yhat_test);
[cmMatrix_Test]= confusionmat(test_lbl,Lbl_Yhat_Test )
This is the output.
Problem1: There seems to be no prediction for the other class. Why?
Problem2: Do I need a separate dataset like the one I created as Xtest for reporting generalization error or is it the practice to use the data trainData(tr.testInd,:) as the generalization test set? Did I create an unnecessary subset?
performanceTrain =
2.2204e-16
cmMatrixTrain =
45 0
45 0
Training Set: Total Train Acccuracy by MLP = 50%
confTest =
9 0
5 0
cmMatrix_Test =
5 0
5 0
There are a few issues with the code. Let's deal with them before answering your question. First, you set a threshold of 0.5 for making decisions (Yhat_train = (train_predict >= 0.5);), but all of your network's predictions are above 0.5. Every sample is therefore assigned to the same class, which is why one column of each confusion matrix contains only zeros. You can plot the scores to choose a better threshold:
figure;
plot(train_predict(Ytrain == 1),'.b')
hold on
plot(train_predict(Ytrain == 2),'.r')
legend('label 1','label 2')
cvpartition gave me an error with the 'Stratify' option; it ran successfully as split1 = cvpartition(Y,'Holdout',0.1); In any case, artificial neural networks usually manage partitioning within the training process, so you feed in X and Y along with some parameters that control the split. See here for an example: link, where you set
net.divideParam.trainRatio = .4;
net.divideParam.valRatio = .3;
net.divideParam.testRatio = .3;
So how should you report the results? Only on the test data. Results on the training data suffer from overfitting and look misleadingly good. If you use validation data (you haven't), you cannot report results on it either, because it also suffers from overfitting. If you let the training routine handle validation for you, your test results will be safe from overfitting.
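As a rough sketch of that advice, the snippet below lets patternnet do the partitioning and reports accuracy only on the indices it held out for testing (tr.testInd). It assumes the Deep Learning Toolbox and the fisheriris data that ships with the Statistics and Machine Learning Toolbox; the one-hot target encoding and the variable names x, t, and testAccuracy are illustrative choices, not taken from the question's code.

```matlab
% Sketch only: report accuracy on the test subset chosen by the network.
rng('default')
load fisheriris
keep = 1:100;                                 % first two species only
x = meas(keep, :)';                           % features-by-examples (4-by-100)
t = full(ind2vec(grp2idx(species(keep))'));   % one-hot targets (2-by-100)

net = patternnet(10);
net.trainParam.showWindow = false;            % no training GUI
net.divideParam.trainRatio = 0.4;             % ratios from the answer above
net.divideParam.valRatio   = 0.3;
net.divideParam.testRatio  = 0.3;
[net, tr] = train(net, x, t);

% Pick the class with the highest score instead of a fixed threshold.
scores = net(x(:, tr.testInd));
[~, predicted] = max(scores, [], 1);
[~, actual]    = max(t(:, tr.testInd), [], 1);
testAccuracy = 100 * mean(predicted == actual);
fprintf('Test accuracy: %.1f%%\n', testAccuracy);
```

Taking the arg-max over the two output rows sidesteps the question of which threshold to use for a single output.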

How to implement KNN using Matlab and calculate the percentage accuracy

I'm new to MATLAB and my goal is to implement KNN. I have two different txt files: one contains the test data (sample) and the other contains the training data.
So far I think I should do something like this, but I'm not sure how to do it.
load fisheriris
x = meas(:,3:4);
gscatter(x(:,1),x(:,2),species)
newpoint = [5 1.45];
[n,d] = knnsearch(x,newpoint,'k',10);
line(x(n,1),x(n,2),'color',[.5 .5 .5],'marker','o','linestyle','none','markersize',10)
Or maybe this is a simpler way to do it. Here the two sets of data, sample and training, are clearly separate, but it doesn't show the accuracy of the predicted class.
A= [50, 60;
7,2;
13,12;
100,200;];
B=[1,0;
200,30;
19,10];
G={'First Row';
'Second Row';
'Third Row'};
class = knnclassify(A,B,G);
disp('Result: ');
disp(class);
The data looks like this:
Training data:
148.0,50.0,0
187.0,34.0,0
204.0,89.0,0
430.0,161.0,1
427.0,22.0,1
-42.0,469.0,1
more,more,class....
Test data:
290.0,-57.0,0
194.0,-80.0,0
174.0,33.0,0
465.0,691.0,1
270.0,-194.0,1
-56.0,665.0,1
more,more,class....
How can I classify this data using knn and show the predictions for each row so I can calculate the accuracy percentage?
-------EDITED------
I forgot to ask: if I need the accuracy for each class, what should I do?
Here is the updated code using knnclassify:
trainData= [148.0,50.0,0; ...
187.0,34.0,0; ...
204.0,89.0,0; ...
430.0,161.0,1; ...
427.0,22.0,1; ...
-42.0,469.0,1 ...
];
testData= [290.0,-57.0,0; ...
194.0,-80.0,0; ...
174.0,33.0,0; ...
465.0,691.0,1; ...
270.0,-194.0,1; ...
-56.0,665.0,1];
% Data
Sample=testData(:,1:2);
Training=trainData(:,1:2);
Group=trainData(:,3);
% Classify
k=1; % number of nearest neighbors used in the classification
Class = knnclassify(Sample, Training, Group,k);
% Display Prediction
fprintf('%.1f %.1f - Real %d , Predicted %d\n',[testData.'; Class.']);
% Calculate percentage accuracy for each class
trueClass=testData(:,3);
classList=unique(trueClass);
for classIndex=1:length(classList)
indexesOfEachClass=find(trueClass==classList(classIndex));
percentageAccuracyEachClass(classIndex,1)=sum(Class(indexesOfEachClass)==trueClass(indexesOfEachClass))/length(indexesOfEachClass)*100;
end
fprintf('\nClass %d Accuracy : %f%%',[classList.'; percentageAccuracyEachClass.']);
% Calculate overall percentage accuracy
dataClassifiedAccurately=Class==trueClass;
percentageAccuracy=sum(dataClassifiedAccurately)/length(dataClassifiedAccurately)*100;
fprintf('\n\nOverall Accuracy : %f%%\n',percentageAccuracy);
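knnclassify is no longer available in recent MATLAB releases, so the following is a minimal sketch of the same computation using fitcknn and predict from the Statistics and Machine Learning Toolbox. It reuses the six training and six test rows from the code above; the names mdl, cm, perClassAccuracy, and overallAccuracy are illustrative.

```matlab
% Sketch only: 1-nearest-neighbor classification with fitcknn.
trainData = [148 50 0; 187 34 0; 204 89 0; 430 161 1; 427 22 1; -42 469 1];
testData  = [290 -57 0; 194 -80 0; 174 33 0; 465 691 1; 270 -194 1; -56 665 1];

% Columns 1:2 are features, column 3 is the class label.
mdl = fitcknn(trainData(:, 1:2), trainData(:, 3), 'NumNeighbors', 1);
predicted = predict(mdl, testData(:, 1:2));

% Rows of cm are true classes, columns are predicted classes.
[cm, classOrder] = confusionmat(testData(:, 3), predicted);
perClassAccuracy = 100 * diag(cm) ./ sum(cm, 2);
overallAccuracy  = 100 * sum(diag(cm)) / sum(cm(:));

for i = 1:numel(classOrder)
    fprintf('Class %d accuracy: %.2f%%\n', classOrder(i), perClassAccuracy(i));
end
fprintf('Overall accuracy: %.2f%%\n', overallAccuracy);
```

Reading the per-class accuracy off the confusion matrix diagonal avoids the explicit loop over classes.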

How to perform two group classification with deep neural network? (Matlab)

I'm new to machine learning (and to Stack Overflow as well) and I want to do some classification tasks. I performed two-group classification on my data set (from the field of speech acoustics) with LIBSVM and with MATLAB's Pattern Recognition Tool from the Neural Network Toolbox, which creates a simple network with one hidden layer. In the hope of better classification results I want to try deep neural networks, and I found this code: http://www.mathworks.com/matlabcentral/fileexchange/42853-deep-neural-network
I have some difficulty understanding it.
My data consists of 127 samples of 19 parameters, so my number of inputs is 19. I want to classify them into two groups, 0 and 1, so my number of outputs is 1. The values in my data set are normalized between 0 and 1.
My code is the following:
clear all
clc
addpath('..');
load('data.mat')
inputdata = inputs;
outputdata = outputs;
datanum = 127;
outputnum = 1;
hiddennum = 3;
inputnum = 19;
% rbm = randRBM(inputnum, outputnum);
% rbm = pretrainRBM( rbm, inputdata );
dbn = randDBN([inputnum, hiddennum, outputnum]);
dbn = pretrainDBN( dbn, inputdata );
dbn = SetLinearMapping( dbn, inputdata, outputdata );
dbn = trainDBN( dbn, inputdata, outputdata );
estimate = v2h( dbn, inputdata )
[rmse AveErrNum] = CalcRmse(dbn, inputdata, outputdata)
The code runs. The rmse is 0.4183 and the AveErrNum is 0.1969. What I need is the classification accuracy between my targets (stored in outputdata) and the network's predictions (Accuracy = data classified correctly / all data).
Where do I find the network's predictions after binarization?
Am I using the right type of network for my classification?
Don't I need to divide my data into training, validation, and testing samples (as in the case of a simple neural network with one hidden layer)?
Thanks in advance for any help!
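For the accuracy formula quoted above, a minimal sketch is shown below. It assumes the network outputs are available as a column vector of scores in [0, 1] (for example the estimate variable, if it has that form) and uses 0.5 as the cut-off; the example values, the threshold, and the names scores and targets are hypothetical and not taken from the toolbox.

```matlab
% Sketch only: accuracy from thresholded scores (values are made up).
scores  = [0.91; 0.12; 0.58; 0.40; 0.77; 0.05];   % hypothetical network outputs
targets = [1; 0; 1; 1; 1; 0];                     % hypothetical true labels

threshold = 0.5;                                  % assumed decision threshold
predicted = double(scores >= threshold);          % binarized predictions

% Accuracy = data classified correctly / all data
accuracy = 100 * sum(predicted == targets) / numel(targets);
fprintf('Accuracy: %.2f%%\n', accuracy);
```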

How to train a neural network using validation in Matlab

I'm trying to show the difference between the results of an ANN trained with and without validation.
Assume that I'm trying to train an ANN to approximate a sine function.
This is going to be my training data:
x = -1:0.05:1;
t = sin(2*pi*x)+0.01*randn(size(x));
and for validation data I'm going to use this:
val.X = -0.975:.05:0.975;
val.T = sin(2*pi*val.X)+0.01*randn(size(val.X));
Then I configure my net as follows:
net = feedforwardnet(10,'trainlm');
net.trainParam.show = 50;
net.trainParam.epochs = 300;
Then I train the net without validation:
[net1, tr1] = train(net,x,t);
and for training with validation I use this code:
[net2,tr2]=train(net,x,t,[],[],val);
but it doesn't work.
EDIT:
The error says: "Error weights EW is not a matrix or cell array."
Could you tell me how to validate ANN training with custom data?
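One way to supply custom validation data, sketched here under the assumption that the Deep Learning Toolbox is available, is to concatenate the training and validation samples and tell the network which indices belong to which set through the 'divideind' division function. The names allX, allT, valX, and valT are introduced for illustration.

```matlab
% Sketch only: custom validation set via index-based data division.
x = -1:0.05:1;
t = sin(2*pi*x) + 0.01*randn(size(x));
valX = -0.975:0.05:0.975;
valT = sin(2*pi*valX) + 0.01*randn(size(valX));

% Stack training and validation samples into one data set.
allX = [x valX];
allT = [t valT];

net = feedforwardnet(10, 'trainlm');
net.trainParam.epochs = 300;
net.trainParam.showWindow = false;                 % no training GUI

% Assign samples to subsets by index instead of at random.
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:numel(x);
net.divideParam.valInd   = numel(x) + (1:numel(valX));
net.divideParam.testInd  = [];

[net2, tr2] = train(net, allX, allT);
```

With this setup the training record tr2 reports the validation performance on exactly the custom samples, and early stopping is driven by them.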