I'm trying to find where I'm making mistakes. I'd be very glad if you could help me.
Here is my problem:
In serial, the train function from the Neural Network Toolbox behaves one way, but when I put it in a parfor loop everything goes crazy.
>> version
ans =
8.3.0.532 (R2014a)
Here is the function:
function per = neuralTr(tSet,Y,CrossVal,Ycv)
hiddenLayerSize = 94;
redeT = patternnet(hiddenLayerSize);
redeT.input.processFcns = {'removeconstantrows','mapminmax'};
redeT.output.processFcns = {'removeconstantrows','mapminmax'};
redeT.divideFcn = 'dividerand'; % Divide data randomly
redeT.divideMode = 'sample'; % Divide up every sample
redeT.divideParam.trainRatio = 80/100;
redeT.divideParam.valRatio = 10/100;
redeT.divideParam.testRatio = 10/100;
redeT.trainFcn = 'trainscg'; % Scaled conjugate gradient
redeT.performFcn = 'crossentropy'; % Cross-entropy
redeT.trainParam.showWindow = 0; % default is 1
redeT = train(redeT,tSet,Y);
outputs = sim(redeT,CrossVal);
per = perform(redeT,Ycv,outputs);
end
And here is the code I'm typing:
Data loaded in the workspace:
whos
  Name          Size      Bytes   Class     Attributes

  CrossVal    282x157    354192   double
  Y             2x363      5808   double
  Ycv           2x157      2512   double
  per           1x1           8   double
  tSet        282x363    818928   double
Executing the function in serial:
per = neuralTr(tSet,Y,CrossVal,Ycv)
per =
0.90
Starting the parallel pool:
>> parpool local
Starting parallel pool (parpool) using the 'local' profile ... connected to 12 workers.
ans =
Pool with properties:
Connected: true
NumWorkers: 12
Cluster: local
AttachedFiles: {}
IdleTimeout: Inf (no automatic shut down)
SpmdEnabled: true
Initializing and executing the function 12 times in parallel
per = cell(12,1);
parfor ii = 1:12
    per{ii} = neuralTr(tSet,Y,CrossVal,Ycv);
end
per
per =
[0.96]
[0.83]
[0.92]
[1.08]
[0.85]
[0.89]
[1.06]
[0.83]
[0.90]
[0.93]
[0.95]
[0.81]
Executing again to see whether random initialization produces different values:
per = cell(12,1);
parfor ii = 1:12
    per{ii} = neuralTr(tSet,Y,CrossVal,Ycv);
end
per
per =
[0.96]
[0.83]
[0.92]
[1.08]
[0.85]
[0.89]
[1.06]
[0.83]
[0.90]
[0.93]
[0.95]
[0.81]
EDIT 1:
Running the function with a plain for loop:
per = cell(12,1);
for ii = 1:12
    per{ii} = neuralTr(tSet,Y,CrossVal,Ycv);
end
per
per =
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
[0.90]
EDIT 2:
I modified my function and now everything works great. Maybe the problem is in how the data is divided in parallel, so I divided the data before sending it to the workers. Thanks a lot!
function per = neuralTr(tSet,Y,CrossVal,Ycv)
indt = 1:round(size(tSet,2) * 0.8) ;
indv = round(size(tSet,2) * 0.8):round(size(tSet,2) * 0.9);
indte = round(size(tSet,2) * 0.9):size(tSet,2);
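% (these 80/10/10 split indices are computed but never used below;
%  the network still divides the data with 'dividerand')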
hiddenLayerSize = 94;
redeT = patternnet(hiddenLayerSize);
redeT.input.processFcns = {'removeconstantrows','mapminmax'};
redeT.output.processFcns = {'removeconstantrows','mapminmax'};
redeT.divideFcn = 'dividerand'; % Divide data randomly
redeT.divideMode = 'sample'; % Divide up every sample
redeT.divideParam.trainRatio = 80/100;
redeT.divideParam.valRatio = 10/100;
redeT.divideParam.testRatio = 10/100;
redeT.trainFcn = 'trainscg'; % Scaled conjugate gradient
redeT.performFcn = 'crossentropy'; % Cross-entropy
redeT.trainParam.showWindow = 0; % default is 1
redeT = train(redeT,tSet,Y);
outputs = sim(redeT,CrossVal);
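% note: outputs is computed once from the redeT trained above and is
% reused unchanged in every parfor iteration below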
per = zeros(12,1);
parfor ii = 1:12
    redes = train(redeT,tSet,Y);
    per(ii) = perform(redes,Ycv,outputs);
end
end
Result:
>> per = neuralTr(tSet,Y,CrossVal,Ycv)
per =
0.90
0.90
0.90
0.90
0.90
0.90
0.90
0.90
0.90
0.90
0.90
0.90
Oh! I think I found it, but I can't test it.
You have this in your code:
redeT.divideFcn = 'dividerand'; % Divide data randomly
If each worker divides the data randomly (parallel workers use independent random number streams), then it's expected that they get different results, isn't it?
Try the following:
per = cell(12,1);
parfor ii = 1:12
    rng(1); % set the seed for the random number generator, so the numbers generated are the same every time
    per{ii} = neuralTr(tSet,Y,CrossVal,Ycv);
end
per
I'm not sure whether neuralTr sets the seed internally, but give it a go.
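Alternatively, here is a minimal, untested sketch (an assumption on my part) of seeding inside neuralTr itself, so that the 'dividerand' split and the weight initialization come out the same on every worker:
function per = neuralTr(tSet,Y,CrossVal,Ycv)
rng(1); % fix the random stream before patternnet/train run, so the
        % 'dividerand' split and the initial weights are reproducible
% ... rest of the function unchanged ...
end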
Using Matlab I am trying to construct a neural network that can classify handwritten digits that are 30x30 pixels. I use backpropagation to find the correct weights and biases. The network starts with 900 inputs, then has 2 hidden layers with 16 neurons each, and it ends with 10 outputs. Each output neuron has a value between 0 and 1 that represents the belief that the input should be classified as a certain digit. The problem is that after training, the output becomes almost indifferent to the input and goes towards a uniform belief of 0.1 for each output.
My approach is to take each 30x30-pixel image and reshape it into a 900x1 vector (note that 'Images_vector' is already in vector format when it is loaded). The weights and biases are initialized with random values between 0 and 1. I am using stochastic gradient descent to update the weights and biases, with 10 randomly selected samples per batch. The equations are as described by Nielsen.
The script is as follows.
%% Inputs
numberofbatches = 1000;
batchsize = 10;
alpha = 1;
cutoff = 8000;
layers = [900 16 16 10];
%% Initialization
rng(0);
load('Images_vector')
Images_vector = reshape(Images_vector', 1, 10000);
labels = [ones(1,1000) 2*ones(1,1000) 3*ones(1,1000) 4*ones(1,1000) 5*ones(1,1000) 6*ones(1,1000) 7*ones(1,1000) 8*ones(1,1000) 9*ones(1,1000) 10*ones(1,1000)];
newOrder = randperm(10000);
Images_vector = Images_vector(newOrder);
labels = labels(newOrder);
images_training = Images_vector(1:cutoff);
images_testing = Images_vector(cutoff + 1:10000);
w = cell(1,length(layers) - 1);
b = cell(1,length(layers));
dCdw = cell(1,length(layers) - 1);
dCdb = cell(1,length(layers));
for i = 1:length(layers) - 1
    w{i} = rand(layers(i+1),layers(i));
    b{i+1} = rand(layers(i+1),1);
end
%% Learning process
batches = randi([1 cutoff - batchsize],1,numberofbatches);
cost = zeros(numberofbatches,1);
c = 1;
for batch = batches
    for i = 1:length(layers) - 1
        dCdw{i} = zeros(layers(i+1),layers(i));
        dCdb{i+1} = zeros(layers(i+1),1);
    end
    for n = batch:batch+batchsize
        y = zeros(10,1);
        disp(labels(n))
        y(labels(n)) = 1;
        % Network
        a{1} = images_training{n};
        z{2} = w{1} * a{1} + b{2};
        a{2} = sigmoid(0, z{2});
        z{3} = w{2} * a{2} + b{3};
        a{3} = sigmoid(0, z{3});
        z{4} = w{3} * a{3} + b{4};
        a{4} = sigmoid(0, z{4});
        % Cost
        cost(c) = sum((a{4} - y).^2) / 2;
        % Gradient
        d{4} = (a{4} - y) .* sigmoid(1, z{4});
        d{3} = (w{3}' * d{4}) .* sigmoid(1, z{3});
        d{2} = (w{2}' * d{3}) .* sigmoid(1, z{2});
        dCdb{4} = dCdb{4} + d{4} / 10;
        dCdb{3} = dCdb{3} + d{3} / 10;
        dCdb{2} = dCdb{2} + d{2} / 10;
        dCdw{3} = dCdw{3} + (a{3} * d{4}')' / 10;
        dCdw{2} = dCdw{2} + (a{2} * d{3}')' / 10;
        dCdw{1} = dCdw{1} + (a{1} * d{2}')' / 10;
        c = c + 1;
    end
    % Adjustment
    b{4} = b{4} - dCdb{4} * alpha;
    b{3} = b{3} - dCdb{3} * alpha;
    b{2} = b{2} - dCdb{2} * alpha;
    w{3} = w{3} - dCdw{3} * alpha;
    w{2} = w{2} - dCdw{2} * alpha;
    w{1} = w{1} - dCdw{1} * alpha;
end
figure
plot(cost)
ylabel 'Cost'
xlabel 'Batches trained on'
with the sigmoid function defined as follows:
function y = sigmoid(derivative, x)
if derivative == 0
    y = 1 ./ (1 + exp(-x));
else
    y = sigmoid(0, x) .* (1 - sigmoid(0, x));
end
end
Other than this, I have also tried having one of each digit in each batch, but this gave the same result. I have also tried varying the batch size, the number of batches, and alpha, but with no success.
Does anyone know what I am doing wrong?
Correct me if I'm wrong: you have 10000 samples in your data, which you divide into 1000 batches of 10 samples. Your training process consists of running over these 10000 samples once.
This might be too little; normally a training process consists of several epochs (one epoch = iterating over every sample once). You can try going over your batches multiple times.
Also, for 900 inputs your network seems small. Try it with more neurons in the second layer. Hope it helps!
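For instance, a minimal sketch of going over the batches several times, wrapping the batch loop from your script in an outer epoch loop (numberofepochs is a hypothetical value to tune):
numberofepochs = 30; % hypothetical; increase until the cost stops improving
for epoch = 1:numberofepochs
    % redraw the batch starting points each epoch
    batches = randi([1 cutoff - batchsize],1,numberofbatches);
    for batch = batches
        % ... the existing gradient computation and weight update ...
    end
end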
My dataset is huge. Let X be the input training data, which is 6x140000, and T be the targets, which are 3x140000.
net = patternnet(10);
% Set divide parameters
net.divideFcn = 'divideind';
net.divideParam.trainInd = loc_Train;
net.divideParam.testInd = loc_Test;
net.divideParam.valInd = loc_Valid;
net.trainFcn = 'trainscg';
% Set training parameters
net.trainParam.epochs = 1000;
net.trainParam.max_fail = 20;
net.trainParam.min_grad = 1e-20;
net.trainParam.goal = 1e-10; % Set a very small value
% Set network performance functions
net.performFcn = 'crossentropy';
net.performParam.regularization = 0.02;
net.performParam.normalization = 'none';
net.trainParam.showWindow = 0;
net.trainParam.showCommandLine = 1;
After setting up my network, I run the following code to train it:
[net, tr] = train(net, X, T);
The command line shows:
Calculation mode: MEX

Training Pattern Recognition Neural Network with TRAINSCG.

Epoch 0/1000, Time 0.001, Performance 0.0061672/1e-10, Gradient 0.00065207/1e-20, Validation Checks 0/20
Epoch 20/1000, Time 2.214, Performance 0.0060292/1e-10, Gradient 6.3997e-05/1e-20, Validation Checks 20/20

Training with TRAINSCG completed: Validation stop.
The tr object, which is the training record, holds information such as the testing indices. However, tr.testInd returns empty.
I have the following simulation running in Matlab. For a period of 25 years, it simulates "Assets", which grow according to geometric brownian motion, and "Liabilities", which grow at a fixed rate of 7% each year. At the end of the simulation, I take the ratio of Assets to Liabilities, and the trial is successful if this is greater than 90%.
All inputs are fixed except for Sigma (the standard deviation). My goal is to find the lowest possible value of sigma that will result in a ratio of assets to liabilities > 0.9 for every year.
Is there anything in Matlab designed to solve this kind of optimization problem?
The code below sets up the simulation for a fixed value of sigma.
%set up inputs
nPeriods = 25;
years = 2016:(2016+nPeriods);
rate = Assumptions.Returns;
sigma = 0.15; %This is the input that I want to optimize
dt = 1;
T = nPeriods*dt;
nTrials = 500;
StartAsset = 81.2419;
%calculate fixed liabilities
StartLiab = 86.9590;
Liabilities = zeros(size(years))'
Liabilities(1) = StartLiab
for idx = 2:length(years)
    Liabilities(idx) = Liabilities(idx-1)*(1 + Assumptions.Discount)
end
%run simulation
obj = gbm(rate,sigma,'StartState',StartAsset);
%rng(1,'twister');
[X1,T] = simulate(obj,nPeriods,'DeltaTime',dt, 'nTrials', nTrials);
Ratio = zeros(size(X1))
for i = 1:nTrials
    Ratio(:,:,i) = X1(:,:,i)./Liabilities;
end
Unsuccessful = Ratio < 0.9
UnsuccessfulCount = sum(sum(Unsuccessful))
First make your simulation a function that takes sigma as the input:
function f = asset(sigma)
%set up inputs
nPeriods = 25;
years = 2016:(2016+nPeriods);
rate = Assumptions.Returns;
%sigma = %##.##; %This is the input of the function that I want to optimize
dt = 1;
T = nPeriods*dt;
nTrials = 500;
StartAsset = 81.2419;
%calculate fixed liabilities
StartLiab = 86.9590;
Liabilities = zeros(size(years))'
Liabilities(1) = StartLiab
for idx = 2:length(years)
    Liabilities(idx) = Liabilities(idx-1)*(1 + Assumptions.Discount)
end
%run simulation
obj = gbm(rate,sigma,'StartState',StartAsset);
%rng(1,'twister');
[X1,T] = simulate(obj,nPeriods,'DeltaTime',dt, 'nTrials', nTrials);
Ratio = zeros(size(X1))
for i = 1:nTrials
    Ratio(:,:,i) = X1(:,:,i)./Liabilities;
end
Unsuccessful = Ratio < 0.9
UnsuccessfulCount = sum(sum(Unsuccessful))
f = sigma + UnsuccessfulCount
end
Then you can use fminbnd (or fminsearch for multiple inputs) to find the value of sigma that minimizes f:
Sigma1 = 0.001;
Sigma2 = 0.999;
optSigma = fminbnd(@asset,Sigma1,Sigma2)
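Note that fminbnd expects a function handle, hence the @asset above. Also, because the simulation draws random paths, the objective is noisy from call to call; fixing the seed inside asset (for example, by uncommenting the rng(1,'twister') line) makes f deterministic, which fminbnd implicitly assumes.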
Hi everyone and thank you for taking time to read this.
I have the following code to train a neural network:
P = [-1 2 0.5 3];
T1 = 1;
T2 = 2;
T3 = 1.5;
net = newff([-1 3;-1 3;-1 3;-1 3],[2 1],{'logsig' 'logsig'},'traingd');
net.trainParam.epochs = 50;
net.trainParam.lr = 0.6;
%now start to train for the first 50 epochs
[net,Y,E,Pf,Af,tr] = train(net,P',T1)
I want to have the error for every epoch. I trained my network for 50 epochs, and at the end it gives me the final error, but I want all of the errors!
If you know you are going to train for 50 epochs, you can set up an array and record the error for each one.
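A minimal sketch of that idea, reusing the net, P, and T1 from your code: train one epoch at a time and store the error after each call.
net.trainParam.epochs = 1;            % one epoch per call to train
nEpochs = 50;
errHistory = zeros(1, nEpochs);       % preallocated error record
for k = 1:nEpochs
    [net, Y, E] = train(net, P', T1); % E holds the current epoch's errors
    errHistory(k) = mse(E);           % record the mean squared error
end
plot(errHistory)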
I tried to use a modified version of the NN backpropagation code by Phil Brierley
(www.philbrierley.com). When I try to solve the XOR problem it works perfectly, but when I try to solve a problem of the form output = x1^2 + x2^2 (output = sum of squares of the inputs), the results are not accurate. I have scaled the input and output between -1 and 1. I get different results every time I run the same program (I understand that's due to the random weight initialization), but the results are very different. I tried changing the learning rate, but the results still come out like this.
I have given the code below.
%---------------------------------------------------------
% MATLAB neural network backprop code
% by Phil Brierley
%--------------------------------------------------------
clear; clc; close all;
%user specified values
hidden_neurons = 4;
epochs = 20000;
input = [];
for i = -10:2.5:10
    for j = -10:2.5:10
        input = [input; i j];
    end
end
output = (input(:,1).^2 + input(:,2).^2);
output1 = output;
% Maximum input and output limit and scaling factors
m1 = -10; m2 = 10;
m3 = 0; m4 = 250;
c = -1; d = 1;
%Scale input and output
for i = 1:size(input,2)
    I = input(:,i);
    scaledI = ((d-c)*(I-m1) ./ (m2-m1)) + c;
    input(:,i) = scaledI;
end
for i = 1:size(output,2)
    I = output(:,i);
    scaledI = ((d-c)*(I-m3) ./ (m4-m3)) + c;
    output(:,i) = scaledI;
end
train_inp = input;
train_out = output;
%read how many patterns and add bias
patterns = size(train_inp,1);
train_inp = [train_inp ones(patterns,1)];
%read how many inputs and initialize learning rate
inputs = size(train_inp,2);
hlr = 0.1;
%set initial random weights
weight_input_hidden = (randn(inputs,hidden_neurons) - 0.5)/10;
weight_hidden_output = (randn(1,hidden_neurons) - 0.5)/10;
%Training
err = zeros(1,epochs);
for iter = 1:epochs
    alr = hlr;
    blr = alr / 10;
    %loop through the patterns, selecting randomly
    for j = 1:patterns
        %select a random pattern
        patnum = round((rand * patterns) + 0.5);
        if patnum > patterns
            patnum = patterns;
        elseif patnum < 1
            patnum = 1;
        end
        %set the current pattern
        this_pat = train_inp(patnum,:);
        act = train_out(patnum,1);
        %calculate the current error for this pattern
        hval = (tanh(this_pat*weight_input_hidden))';
        pred = hval'*weight_hidden_output';
        error = pred - act;
        % adjust weight hidden - output
        delta_HO = error.*blr.*hval;
        weight_hidden_output = weight_hidden_output - delta_HO';
        % adjust the weights input - hidden
        delta_IH = alr.*error.*weight_hidden_output'.*(1-(hval.^2))*this_pat;
        weight_input_hidden = weight_input_hidden - delta_IH';
    end
    % -- another epoch finished
    %compute overall network error at end of each epoch
    pred = weight_hidden_output*tanh(train_inp*weight_input_hidden)';
    error = pred' - train_out;
    err(iter) = ((sum(error.^2))^0.5);
    %stop if error is small
    if err(iter) < 0.001
        fprintf('converged at epoch: %d\n',iter);
        break
    end
end
%Output after training
pred = weight_hidden_output*tanh(train_inp*weight_input_hidden)';
Y = m3 + (m4-m3)*(pred-c)./(d-c);
% Testing for a new set of input
input_test = [6 -3.1; 0.5 1; -2 3; 3 -2; -4 5; 0.5 4; 6 1.5];
output_test = (input_test(:,1).^2 + input_test(:,2).^2);
input1 = input_test;
%Scale input
for i = 1:size(input1,2)
    I = input1(:,i);
    scaledI = ((d-c)*(I-m1) ./ (m2-m1)) + c;
    input1(:,i) = scaledI;
end
%Predict output
train_inp1 = input1;
patterns = size(train_inp1,1);
bias = ones(patterns,1);
train_inp1 = [train_inp1 bias];
pred1 = weight_hidden_output*tanh(train_inp1*weight_input_hidden)';
%Rescale
Y1 = m3 + (m4-m3)*(pred1-c)./(d-c);
analy_numer = [output_test Y1']
plot(err)
This is the sample output I get (state after 20000 epochs):
analy_numer =
45.6100 46.3174
1.2500 -2.9457
13.0000 11.9958
13.0000 9.7097
41.0000 44.9447
16.2500 17.1100
38.2500 43.9815
If I run it once more, I get different results. As can be observed, for small input values I get totally wrong answers (a negative answer is not possible); for other values the accuracy is still poor.
Can someone tell me what I am doing wrong and how to correct it?
Thanks,
Raman