I'm trying to set up a custom neural network, but when I train it, training stops after 0 iterations. I don't get any errors, just 0 iterations, and I have no idea why. (The architecture might seem odd; it is supposed to be a custom PNN. But before we can even discuss whether it makes sense, I would like to be able to train it.)
Here is the code:
net = network;
net.trainFcn = 'trainlm';
net.performFcn = 'mse';
net.numInputs = 1;
net.numLayers = (2*nbclasses)+1; % (one pattern layer + one summation layer per class) + competition layer
net.inputConnect(1:nbclasses,:) = 1; % connects the input to all pattern layers
for i = 1:nbclasses % Connect the pattern layers to their corresponding summation layers
net.layerConnect(i+nbclasses,i) = 1;
net.layers{i}.size = size(tr_feature,1);
net.layers{i}.transferFcn = 'radbas';
end
for i = (nbclasses+1):(nbclasses*2) % Connect all summation layers to the competition layer
net.layers{i}.size = 1;
net.layerConnect(net.numLayers,i) = 1;
end
net.layers{net.numLayers}.transferFcn = 'compet';
net.outputConnect(1,end) = 1;
net.view;
[net, tr] = train(net,tr_feature',tr_true');
% tr_feature is a 800x2 data matrix, tr_true is the 800x1 corresponding labels
Any idea?
Thanks in advance!
I am trying to learn the correct procedure for training a neural network for classification. There are many tutorials, but they never explain how to report generalization performance. Can somebody please tell me whether the following method is correct? I take the first 100 examples of the fisheriris data set, which have labels 1 and 2, and call the features and labels X and Y respectively. Then I split X into trainData and Xtest with a 90/10 ratio. I train the NN model on trainData. The NN then internally splits trainData further into tr, val and test subsets. My confusion is: which subset is normally used to report the model's performance on unseen data in conference or journal papers?
The dataset can be found in the link: https://www.mathworks.com/matlabcentral/fileexchange/71468-simple-neural-networks-with-k-fold-cross-validation-manner
rng('default')
load iris.mat;
X = [f(1:100,:) l(1:100)];
numExamples = size(X,1);
indx = randperm(numExamples);
X = X(indx,:);
Y = X(:,end);
split1 = cvpartition(Y,'Holdout',0.1,'Stratify',true); %90% trainval 10% test
istrainval = training(split1); % index for fitting
istest = test(split1); % indices for quality assessment
trainData = X(istrainval,:);
Xtest = X(istest,:);
Ytest = Y(istest);
numExamplesXtrainval = size(trainData,1);
indxXtrainval = randperm(numExamplesXtrainval);
trainData = trainData(indxXtrainval,:);
Ytrain = trainData(:,end);
hiddenLayerSize = 10;
% data format = rows = number of dim, column = number of examples
net = patternnet(hiddenLayerSize);
net = init(net);
net.performFcn = 'crossentropy';
net.trainFcn = 'trainscg';
net.trainParam.epochs=50;
[net tr]= train(net,trainData', Ytrain');
Trained = sim(net, trainData'); %outputs predicted labels
train_predict = net(trainData');
performanceTrain = perform(net,Ytrain',train_predict)
lbl_train=grp2idx(Ytrain);
Yhat_train = (train_predict >= 0.5);
Lbl_Yhat_Train = grp2idx(Yhat_train);
[cmMatrixTrain]= confusionmat(lbl_train,Lbl_Yhat_Train )
accTrain=sum(lbl_train ==Lbl_Yhat_Train)/size(lbl_train,1);
disp(['Training Set: Total Train Acccuracy by MLP = ',num2str(100*accTrain ), '%'])
[confTest] = confusionmat(lbl_train(tr.testInd),Lbl_Yhat_Train(tr.testInd) )
%unknown test
test_predict = net(Xtest');
performanceTest = perform(net,Ytest',test_predict);
Yhat_test = (test_predict >= 0.5);
test_lbl=grp2idx(Ytest);
Lbl_Yhat_Test = grp2idx(Yhat_test);
[cmMatrix_Test]= confusionmat(test_lbl,Lbl_Yhat_Test )
This is the output.
Problem 1: There seems to be no prediction for the other class. Why?
Problem 2: Do I need a separate data set, like the Xtest I created, to report the generalization error, or is it common practice to use trainData(tr.testInd,:) as the generalization test set? Did I create an unnecessary subset?
performanceTrain =
2.2204e-16
cmMatrixTrain =
45 0
45 0
Training Set: Total Train Acccuracy by MLP = 50%
confTest =
9 0
5 0
cmMatrix_Test =
5 0
5 0
There are a few issues with the code. Let's deal with them before answering your question. First, you set a threshold of 0.5 for making decisions (Yhat_train = (train_predict >= 0.5);), while all of your network's predictions are above 0.5. Every sample is therefore assigned to the same class, which is why the second column of each confusion matrix contains only zeros. You can plot the scores to choose a better threshold:
figure;
plot(train_predict(Ytrain == 1),'.b')
hold on
plot(train_predict(Ytrain == 2),'.r')
legend('label 1','label 2')
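Besides inspecting the plot, the threshold can also be chosen numerically. The following is only a minimal sketch in base MATLAB: scores and labels are hypothetical stand-ins for train_predict and Ytrain, and it assumes that larger scores belong to label 2. It sweeps candidate thresholds and keeps the one with the highest accuracy.

```matlab
rng(0);
% Hypothetical scores: all above 0.5, as in the question, but separable.
scores = [0.55 + 0.10*rand(1,45), 0.80 + 0.10*rand(1,45)];
labels = [ones(1,45), 2*ones(1,45)];

% Try 101 evenly spaced thresholds between the smallest and largest score.
thresholds = linspace(min(scores), max(scores), 101);
accuracy = zeros(size(thresholds));
for n = 1:numel(thresholds)
    % Label 1 below the threshold, label 2 at or above it.
    predicted = 1 + (scores >= thresholds(n));
    accuracy(n) = mean(predicted == labels);
end

% max returns the first threshold that reaches the best accuracy.
[bestAccuracy, bestIdx] = max(accuracy);
bestThreshold = thresholds(bestIdx);
```

A threshold picked this way is tuned to the data it was chosen on, so it is better selected on training or validation data and then applied unchanged to the test data.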
cvpartition gave me an error with the 'Stratify' option; it ran successfully as split1 = cvpartition(Y,'Holdout',0.1);. In any case, neural network training in MATLAB usually manages the partitioning itself, so you feed in X and Y together with some parameters that control the split. See here for an example: link, where you set
net.divideParam.trainRatio = .4;
net.divideParam.valRatio = .3;
net.divideParam.testRatio = .3;
So how should you report the results? Only on the test data. Results on the training data suffer from overfitting and look misleadingly good. If you use validation data yourself to tune the model (you haven't), you cannot report results on it either, because it also suffers from overfitting. If you let the training function do the validation for you, your test results stay safe from overfitting.
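As a rough sketch of reporting on the test subset only, the snippet below lets train do the split and then evaluates on the indices in tr.testInd. The data, the 70/15/15 ratios and the hidden layer size are assumptions for illustration, not values from the question, and the code needs the neural network toolbox.

```matlab
rng(0);
% Hypothetical two-class data: 2 features, 200 samples, one sample per column.
x = [randn(2,100) - 2, randn(2,100) + 2];
classIdx = [ones(1,100), 2*ones(1,100)];
t = double([classIdx == 1; classIdx == 2]);   % one-hot targets, 2 x 200

net = patternnet(10);
net.trainParam.showWindow = false;            % no training GUI
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;
[net, tr] = train(net, x, t);

% Evaluate only on the samples that train held out as the test subset.
yTest = net(x(:, tr.testInd));
[~, predicted] = max(yTest, [], 1);           % predicted class = largest output
actual = classIdx(tr.testInd);
testAccuracy = mean(predicted == actual);
```

The samples in tr.testInd are not used to update the weights or to stop training, so testAccuracy is the figure to report.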
I am using MATLAB to train a neural network. Because of the large number of observations, I need to reduce the computation time. I would therefore like the network to keep the parameters computed at time t-1 and use them as the starting point at time t (so that it iterates 6-10 times instead of, say, 1000 times). The mock code I prepared is below, without the for loop. It runs, but it does not do what I want:
clear;
x = randn(1,50);
y = x.^2;
HU = 2;
nets = trainnet(HU)
nets = train(nets,x,y)
net1 = configure(nets,x,y);
net1.IW = nets.IW;
net1.LW = nets.LW;
net1.b = nets.b;
net1 = trainnet(HU)
net1 = train(net1,x,y)
function net = trainnet(HU)
trainFcn = 'trainlm';
hiddenLayerSize = HU;
net = fitnet(hiddenLayerSize,trainFcn);
end
I would really appreciate any help. Thanks in advance.
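One thing visible in the code above: the second call net1 = trainnet(HU) builds a fresh network and so discards the weights that were just copied into net1. A minimal sketch of a warm start, assuming the data at time t has the same input and output dimensions as at time t-1, is to keep the trained network and call train on it again with a smaller epoch limit, since train starts from the network's current weights. The variables xNew, yNew and wPrev are hypothetical names introduced here.

```matlab
rng(0);
% Time t-1: full training run.
x = randn(1,50);
y = x.^2;
net = fitnet(2, 'trainlm');
net.trainParam.showWindow = false;   % no training GUI
net.trainParam.epochs = 1000;
net = train(net, x, y);
wPrev = getwb(net);                  % all weights and biases as one vector

% Time t: hypothetical new observations of the same size.
xNew = randn(1,50);
yNew = xNew.^2;
net.trainParam.epochs = 10;          % only a few iterations this time
net = train(net, xNew, yNew);        % continues from the weights in net
wNew = getwb(net);
```

If the weights have to be moved into a different network object with the same architecture, setwb(otherNet, wPrev) could be used after configuring that network.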
I tried to implement the Nguyen-Widrow weight initialization in MATLAB 2014a to compare its performance against plain random weight initialization.
a = -1;
b = 1;
% WIDROW weights for Layer Input to Hidden Layer 1
sum_sq_wts = 0;
for k=1:30
iw(:,:) = zeros(num_input, nodes_hidden_layer);
for i=1:num_input
for j=1:nodes_hidden_layer
iw(i,j)=(b-a)*rand(1,1) + a;
sum_sq_wts = sum_sq_wts + (iw(i,j)*iw(i,j));
end
end
norm = sqrt(sum_sq_wts);
beta = 0.7*nodes_hidden_layer.^(1/num_input);
for i=1:num_input
for j=1:nodes_hidden_layer
iw(i,j) = beta*iw(i,j)/norm;
end
end
IW{k}=iw';
end
% WIDROW weights for Hidden Layer 1 to output Layer
sum_sq_wts = 0;
for k=1:30
lw(:,:) = zeros(nodes_hidden_layer,1);
for i=1:nodes_hidden_layer
for j=1:1
iw(i,j)=(b-a)*rand(1,1) + a;
sum_sq_wts = sum_sq_wts + iw(i,j)*iw(i,j);
end
end
norm = sqrt(sum_sq_wts);
beta = 0.7*nodes_hidden_layer.^(1/num_input);
for i=1:nodes_hidden_layer
for j=1:1
lw(i,j) = beta*lw(i,j)/norm;
end
end
LW{k}=lw';
end
WidNgu{1,1} = IW;
WidNgu{1,2} = LW;
The code above generates 30 different sets of Widrow weights. The problem is that a neural network trained from these weights performs worse than one trained from a random set of weights. The training problem I used was a simple function approximation problem.
One more interesting thing I observed: the first weight set generated by the code sometimes performs better than the random weight approach, but the other 29 sets always perform poorly.
Where have I gone wrong?
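Two things stand out when reading the code above. First, sum_sq_wts is set to zero only once, before each k loop, so the norm keeps growing from one set to the next and later sets are scaled down more and more, which would fit the observation that only the first set behaves well. Second, the second block assigns the random values to iw(i,j), so lw stays all zeros. The sketch below, with hypothetical layer sizes, recomputes the norm for every set and normalises each hidden neuron's weight vector separately, which is how the Nguyen-Widrow method is usually described; bias initialization is left out.

```matlab
rng(0);
num_input = 3;             % hypothetical sizes, not taken from the post
nodes_hidden_layer = 5;
numSets = 30;
a = -1;
b = 1;

beta = 0.7 * nodes_hidden_layer^(1/num_input);
IW = cell(1, numSets);
for k = 1:numSets
    % Fresh uniform weights in [a, b]; one row per hidden neuron.
    w = (b - a) * rand(nodes_hidden_layer, num_input) + a;
    % Norm of each neuron's weight vector, recomputed for every set.
    rowNorm = sqrt(sum(w.^2, 2));
    % Rescale every row to length beta (bsxfun keeps this R2014a compatible).
    IW{k} = beta * bsxfun(@rdivide, w, rowNorm);
end
```

Every one of the 30 sets now has the same scale, so differences in training performance come from the random directions only.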
I have implemented a Naive Bayes classifier for multiclass problems, but my error rate stays the same as I increase the size of the training set. I have debugged this over and over but wasn't able to figure out why it's happening, so I'm posting it here to find out whether I am doing anything wrong.
%Naive Bayes Classifier
%This function splits the data 80:20 into training and test sets, then
%trains on increasing fractions (5,10,15,20,30 percent) of the 80% to see
%how the error rate changes.
%Goal is to compare the plots in the Stanford paper
%http://ai.stanford.edu/~ang/papers/nips01-discriminativegenerative.pdf
function[tPercent] = naivebayes(file, iter, percent)
dm = load(file);
for i=1:iter
%Getting the index common to test and train data
idx = randperm(size(dm.data,1))
%Using same idx for data and labels
shuffledMatrix_data = dm.data(idx,:);
shuffledMatrix_label = dm.labels(idx,:);
percent_data_80 = round((0.8) * length(shuffledMatrix_data));
%Doing 80-20 split
train = shuffledMatrix_data(1:percent_data_80,:);
test = shuffledMatrix_data(percent_data_80+1:length(shuffledMatrix_data),:);
%Getting the label data from the 80:20 split
train_labels = shuffledMatrix_label(1:percent_data_80,:);
test_labels = shuffledMatrix_label(percent_data_80+1:length(shuffledMatrix_data),:);
%Getting the array of percents [5 10 15..]
percent_tracker = zeros(length(percent), 2);
for pRows = 1:length(percent)
percentOfRows = round((percent(pRows)/100) * length(train));
new_train = train(1:percentOfRows,:);
new_train_label = train_labels(1:percentOfRows);
%get unique labels in training
numClasses = size(unique(new_train_label),1);
classMean = zeros(numClasses,size(new_train,2));
classStd = zeros(numClasses, size(new_train,2));
priorClass = zeros(numClasses, size(2,1));
% Doing the K class mean and std with prior
for kclass=1:numClasses
classMean(kclass,:) = mean(new_train(new_train_label == kclass,:));
classStd(kclass, :) = std(new_train(new_train_label == kclass,:));
priorClass(kclass, :) = length(new_train(new_train_label == kclass))/length(new_train);
end
error = 0;
p = zeros(numClasses,1);
% Calculating the posterior for each test row for each k class
for testRow=1:length(test)
c=0; k=0;
for class=1:numClasses
temp_p = normpdf(test(testRow,:),classMean(class,:), classStd(class,:));
p(class, 1) = sum(log(temp_p)) + (log(priorClass(class)));
end
%Take the max of posterior
[c,k] = max(p(1,:));
if test_labels(testRow) ~= k
error = error + 1;
end
end
avgError = error/length(test);
percent_tracker(pRows,:) = [avgError percent(pRows)];
tPercent = percent_tracker;
plot(percent_tracker)
end
end
end
Here is the dimensionality of my data:
x =
data: [768x8 double]
labels: [768x1 double]
I am using the Pima data set from UCI.
What results does your implementation give on the training data itself? Does it fit it at all?
It's hard to be sure, but there are a couple of things that I noticed:
It is important for every class to have training data. You can't train a classifier to recognize a class for which there was no training data.
If possible, the number of training examples shouldn't be skewed towards some of the classes. For example, if class 1 makes up only 5% of the training and cross-validation examples in a 2-class problem, then a function that always returns class 2 will have an error of 5%. Did you try checking precision and recall separately?
You're fitting a normal distribution to each feature in a class and then using it for the posterior probabilities. I'm not sure how that plays out in terms of smoothing. Could you try to re-implement it with simple counting and see whether it gives different results?
It could also be that the features are highly redundant and the Bayes method overcounts probabilities.
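One detail in the posted code is worth checking as well: p has size numClasses x 1, so max(p(1,:)) takes the maximum over a single element and k is always 1, which would give the same error rate no matter how much training data is used. The following is only a sketch of Gaussian naive Bayes prediction in base MATLAB, on hypothetical data with labels 1..numClasses, that takes the maximum over all class log-posteriors.

```matlab
rng(0);
% Hypothetical data: 2 classes, 3 features, class means 3 units apart.
Xtrain = [randn(50,3); randn(50,3) + 3];
ytrain = [ones(50,1); 2*ones(50,1)];
Xtest  = [randn(20,3); randn(20,3) + 3];
ytest  = [ones(20,1); 2*ones(20,1)];

numClasses = 2;
logPost = zeros(size(Xtest,1), numClasses);
for k = 1:numClasses
    Xk = Xtrain(ytrain == k, :);
    mu = mean(Xk, 1);                        % per-feature mean
    sigma = std(Xk, 0, 1);                   % per-feature standard deviation
    prior = size(Xk,1) / size(Xtrain,1);
    % Log of the normal density for each test row and feature.
    z = bsxfun(@rdivide, bsxfun(@minus, Xtest, mu), sigma);
    logPdf = bsxfun(@minus, -0.5*z.^2, log(sigma) + 0.5*log(2*pi));
    % Features are treated as independent, so the log-densities add up.
    logPost(:,k) = sum(logPdf, 2) + log(prior);
end

% Pick the class with the largest log-posterior for every test row.
[~, predicted] = max(logPost, [], 2);
errorRate = mean(predicted ~= ytest);
```

In the posted loop the equivalent change would be to take the maximum over the whole vector p instead of over p(1,:).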
Hi, I found this code somewhere with very little information about it.
It's supposed to be a backpropagation neural network.
But it seems to be missing things like the weights and biases.
Is the code correct? Is it a test-while-train backpropagation neural network?
Thanks.
% --- Executes on button press in pushbutton6.
%~~~~~~~~~~~[L1 L2 1];first hidden layer,second & output layer~~~~~
layer = [11 15 1];
myepochs = 30;
attemption = 1; %i;
mytfn = {'tansig' 'tansig' 'purelin'};
%~~~~~~load data~~~~~~~~~~~~~~~~~~~~~~~
m = xlsread('D:\MATLAB\datatrain.csv');
%~~~~~~convert the data in Matrix form~~~~
[row,col] = size(m);
P = m(1:row,1:10)';
T1 = m(1:row, col)'; % target data for training...last column
net = newff([minmax(P)],layer,mytfn,'trainlm'); %nnet
net.trainParam.epochs = myepochs; % how many time newff will repeat the training
net.trainParam.showWindow = true;
net.trainParam.showCommandLine = true;
net = train(net,P,T1); % start training newff with input P and target T1
Y = sim(net,P); % training
save 'net7' net;
% --- Executes on button press in pushbutton4.
%~~~~~~load data~~~~~~~~~~~~~~~~~~~~~~~
mt = xlsread('D:\MATLAB\datatest.csv');
%~~~~~~convert the data in Matrix form~~~~
[row1,col1] = size(mt);
Pt= mt(1:row1,1:10)';
Tt = mt(1:row1, col1)';
load 'net7' -mat;
Yt= sim(net,Pt);
%~~~~~~~final result of the neural network~~~~~~~~
[r,c]=size(Yt);
result=Yt(c);
if result>0.7
error=1-result;
set(handles.edit39,'String','yes')
set(handles.edit40,'String',num2str(error))
set(handles.edit41,'String','Completed')
data1=[num2str(result) ];
fid = fopen('D:\MATLAB\record.csv','a+');
fprintf(fid,[data1,'\n']);
fclose(fid);
else
set(handles.edit39,'String','no')
set(handles.edit40,'String',num2str(result))
set(handles.edit41,'String','Completed')
data1=[num2str(result) ];
fid = fopen('D:\MATLAB\record.csv','a+');
fprintf(fid,[data1,'\n']);
fclose(fid);
end
The code is correct. The weights and biases of the neural network are stored inside the net object: you can access the weights via the net.IW and net.LW cell arrays, and the biases via net.b. This code trains a network using inputs P and targets T1, splitting them into training, testing and validation subsets that are used during training. Check the documentation for further information about the training procedure.
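To see where the weights and biases live, here is a small sketch. It uses feedforwardnet and configure instead of the older newff call from the question, and P and T are hypothetical random data with 10 inputs and 1 target, matching the layer sizes [11 15 1] only for illustration.

```matlab
rng(0);
P = rand(10, 40);                 % hypothetical: 10 inputs, 40 samples
T = rand(1, 40);                  % hypothetical: 1 target per sample

net = feedforwardnet([11 15]);    % two hidden layers plus the output layer
net = configure(net, P, T);       % sizes and initializes weights and biases

W1 = net.IW{1,1};   % input -> hidden layer 1, 11 x 10
W2 = net.LW{2,1};   % hidden layer 1 -> hidden layer 2, 15 x 11
W3 = net.LW{3,2};   % hidden layer 2 -> output layer, 1 x 15
b1 = net.b{1};      % biases of hidden layer 1, 11 x 1
b3 = net.b{3};      % bias of the output layer, 1 x 1
```

After train(net,P,T) the same cell arrays hold the trained values.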