Getting NaN values in neural network weight matrices - MATLAB

I am trying to develop a feedforward NN in MATLAB. I have a dataset of 12 inputs and 1 output with 46998 samples. I have some NaN values in the last rows of the matrix, because some inputs are accelerations and velocities, which are 1 and 2 samples shorter, respectively, than the displacements.
With this dataset I am getting w1_grad and w2_grad as NaN matrices. I tried to remove them using `Heave_dataset(isnan(Heave_dataset))=[];`, but my dataset gets converted into a 1x610964 vector.
Can anyone help me with this?
%% Clear Variables, Close Current Figures, and Create Results Directory
clc;
clear all;
close all;
mkdir('Results//'); %Directory for Storing Results
%% Configurations/Parameters
load 'Heave_dataset'
% Heave_dataset(isnan(Heave_dataset))=[];
nbrOfNeuronsInEachHiddenLayer = 24;
nbrOfOutUnits = 1;
unipolarBipolarSelector = -1; %0 for Unipolar, -1 for Bipolar
learningRate = 0.08;
nbrOfEpochs_max = 50000;
%% Read Data
Input = Heave_dataset(:, 1:length(Heave_dataset(1,:))-1);
TargetClasses = Heave_dataset(:, length(Heave_dataset(1,:)));
%% Calculate Number of Input and Output NodesActivations
nbrOfInputNodes = length(Input(1,:)); %=Dimension of any input sample
nbrOfLayers = 2 + length(nbrOfNeuronsInEachHiddenLayer);
nbrOfNodesPerLayer = [nbrOfInputNodes nbrOfNeuronsInEachHiddenLayer nbrOfOutUnits];
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Forward Pass %%%%%%%%%%%
%% Adding the Bias to Input layer
Input = [ones(length(Input(:,1)),1) Input];
%% Weights leading from input layer to hidden layer is w1
w1 = rand(nbrOfNeuronsInEachHiddenLayer,(nbrOfInputNodes+1));
%% Input & output of hidden layer
hiddenlayer_input = Input*w1';
hiddenlayer_output = -1 + 2./(1 + exp(-(hiddenlayer_input)));
%% Adding the Bias to hidden layer
hiddenlayer_output = [ones(length(hiddenlayer_output(:,1)),1) hiddenlayer_output];
%% Weights leading from hidden layer to output layer is w2
w2 = rand(nbrOfOutUnits,(nbrOfNeuronsInEachHiddenLayer+1));
%% Input & output of output layer
outerlayer_input = hiddenlayer_output*w2';
outerlayer_output = outerlayer_input;
%% Error Calculation
TotalError = 0.5*(TargetClasses-outerlayer_output).^2;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% Backward Pass %%%%%%%%%%%
d3 = outerlayer_output - TargetClasses;
d2 = (d3*w2).*hiddenlayer_output.*(1-hiddenlayer_output);
d2 = d2(:,2:end);
D1 = d2' * Input;
D2 = d3' * hiddenlayer_output;
w1_grad = D1/46998 + learningRate*[zeros(size(w1,1),1) w1(:,2:end)]/46998;
w2_grad = D2/46998 + learningRate*[zeros(size(w2,1),1) w2(:,2:end)]/46998;

You should try to vectorize your algorithm. First arrange your data in a 46998x12 matrix X. Add the bias to X like X = [ones(46998,1) X]. Then the weights leading from the input layer to the first hidden layer must be arranged in a matrix W1 with dimensions numberOfNeuronsInFirstHiddenLayer (24) x (inputs + 1). Then X*W1' is what you feed into your neuron function (whether it is a sigmoid or whatever else). The result (like sigmoid(X*W1')) is the output of the neurons at hidden level 1. You add a bias as before and multiply by the weight matrix W2 (the weights that lead from hidden layer 1 to hidden layer 2), and so on. Hope this helps you get started vectorizing your code, at least for the feedforward part. The backpropagation part is a little trickier, but luckily it involves the same matrices.
I will briefly recite the feedforward process so that we use the same language when talking about backpropagation.
There is the data, called X (dimensions 46998x12).
A1 = [ones(46998,1) X] is the input including the bias. (46998x13)
Z2 = A1*W1' (W1 is the weight matrix that leads from input to hidden layer 1)
A2 = sigmoid(Z2);
A2 = [ones(m,1) A2]; adding bias again
Z3 = A2 * W2';
A3 = sigmoid(Z3);
Supposing you only have one hidden layer, feedforward stops here. I'll start backwards now, and you can generalize as appropriate.
d3 = A3 - Y; (Y is part of your data: the actual values with which you train your nn)
d2 = (d3 * W2) .* A2 .* (1-A2); (The sigmoid function has the nice property that d(sigmoid(z))/dz = sigmoid(z)*(1-sigmoid(z)).)
d2 = d2(:,2:end); (You don't need the first column, which corresponds to the bias.)
D1 = d2' * A1;
D2 = d3' * A2;
W1_grad = D1/m + lambda*[zeros(size(W1,1),1) W1(:,2:end)]/m; (lambda is the learning rate, m is 46998)
W2_grad = D2/m + lambda*[zeros(size(W2,1),1) W2(:,2:end)]/m;
Everything must be in place now except for the vectorized cost function, which has to be minimized. Hope this helps a bit.
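As for the NaN issue itself: assigning [] through a logical index removes individual elements, so MATLAB has to flatten the matrix into a vector; that is why the dataset collapsed to 1x610964. A minimal sketch that instead drops the incomplete rows (the samples where velocity or acceleration is missing), keeping the 13-column layout:
% Drop every sample (row) that contains at least one NaN.
% any(...,2) flags rows with NaNs; deleting whole rows keeps the matrix shape.
badRows = any(isnan(Heave_dataset), 2);
Heave_dataset(badRows, :) = [];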

Related

MATLAB Backpropagation Algorithm not functioning as expected

I am attempting to write a Multi-Layer Perceptron network in MATLAB to help me better understand the calculus required for backpropagation.
The aim is to provide the network with XOR data (where upper-right and lower-left quadrant data is class 1 and the remaining quadrants class 0), train the network on this data, and then test it on new data.
My problem is that my loss curve looks very strange:
It appears to bounce between very low error and very high error, converging in the middle to a pretty poor error.
I was wondering if someone could check that I have correctly implemented the chain rule in MATLAB syntax.
The MLP network is structured as follows: the input layer has 2 neurons, there is 1 hidden layer with 2 neurons, and 1 output neuron.
Here is the MATLAB code:
%Create XOR Dataset
x1pos = rand(500,1);
x1neg = -rand(500,1);
x1 = [x1pos; x1neg];
p = randperm(length(x1));
x1 = x1(p);
x2pos = rand(500,1);
x2neg = -rand(500,1);
x2 = [x2pos; x2neg];
p = randperm(length(x2));
x2 = x2(p);
Data = [x1 x2];
TrainingData = Data(1:800,:);
TestData = Data(801:length(Data),:);
T = gt((Data(:,1).*Data(:,2)),0); %Create class label for data and assign to matrix T
%Neural Net
%Training
W1 = rand(2,2); %Initialize random weights
W2 = rand(1,2); %Initialize random weights
B1 = rand(2,1); %Initialize random biases
B2 = rand(1,1); %Initialize random biases
n = 0.05; %Set Learning Rate
for i = 1:800
%Fwd Pass
x1 = Data(i,1);
x2 = Data(i,2);
X = [x1; x2];
A1 = W1*X + B1;
H1 = sigmoid(A1);
A2 = W2*H1 + B2;
Y = sigmoid(A2);
%Loss
Loss = (Y-T(i))*(Y-T(i));
scatter(i, Loss)
hold on;
%Backpropagation
dEdY = 2*(Y-T(i)); %The partial derivative of the loss with respect to the output
dYdA2 = Y*(1-Y); %The partial derivative of the output with respect to the hidden layer output
dA2dH1 = W2.'; %The partial derivative of the hidden layer output with respect to the first layer activations
dH1dA1 = H1.*(1-H1); %The partial derivative of the first layer activations with respect to the first layer output
%Chain Rule
dEdW2 = dEdY.*dYdA2.*W2.';
dEdW1 = dEdY.*dYdA2.*dA2dH1.*dH1dA1.*W1.';
dEdB2 = dEdY.*dYdA2;
dEdB1 = dEdY.*dYdA2.*dA2dH1.*dH1dA1;
%Update Weights
W2 = (W2.' - n.*dEdW2).';
W1 = (W1.' - n.*dEdW1).';
%Update Biases
B2 = B2 - n.*dEdB2;
B1 = B1 - n.*dEdB1;
%Next training loop
end
%Testing
for i = 801:1000
x1 = Data(i,1);
x2 = Data(i,2);
X = [x1; x2];
A1 = W1*X + B1;
H1 = sigmoid(A1);
A2 = W2*H1 + B2;
Y = sigmoid(A2);
end
function o = sigmoid(input)
o = [];
for i = 1:length(input)
o = [o; 1/(1+exp(-input(i)))];
end
end
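As an aside, MATLAB's exp operates element-wise, so the sigmoid helper can be written without the loop and the growing array; a small sketch that is mathematically identical:
function o = sigmoid(input)
% Element-wise logistic function; works for scalars, vectors and matrices.
o = 1./(1 + exp(-input));
end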

How to plot a β-w diagram in MATLAB?

I am trying to plot the β-w diagrams for the given phase constants using MATLAB, but although I have looked at many web pages, I have not found a similar example of plotting a β-w diagram in MATLAB. Could you please clarify how to proceed, with some examples regarding this problem? Any help would really be appreciated.
Plot range: f = 10 MHz - 10 GHz
w : angular frequency
wc : a constant angular frequency
Parameters for the 1st case: wc1 = 0.2*w, wc2 = 0.4*w, wc3 = 0.6*w, wc4 = 0.8*w, ε1 = 1*ε0, μ = μ0
Parameters for the 2nd case: a1 = 0.08636 cm, a2 = 0.8636 cm, a3 = 2.286 cm, a4 = 29.21 cm, ε1 = 1*ε0, μ = μ0
As the OP asked, here is a sketch of MATLAB code.
I map the plot of B against w over the requested frequency range (the values can be changed).
The first case has four different values of wc, so four different plots of B (B1, B2, B3 and B4) are drawn in four different colors.
%constant initialization
mu = 1.2566E-6;
e = 1;
start_f = 10e6;  %10 MHz, start of the frequency range
end_f = 10e9;    %10 GHz, end of the frequency range
step = 1e6;      %evaluate the function every "step" Hz (integers only)
k = 1;
% form of B, for example: B = w*sqrt(mu*e)*sqrt(1-((wc/w)^2));
%vector preallocation to avoid the non-critical "consider preallocation" MATLAB warning
range_f = ceil((end_f - start_f)/step) + 1;
w = zeros(range_f,1);
B1 = zeros(range_f,1);
B2 = zeros(range_f,1);
B3 = zeros(range_f,1);
B4 = zeros(range_f,1);
for i=start_f:step:end_f %from 10 MHz to 10 GHz in steps of "step" Hz
%store the current frequency in the k-th cell of vector w
w(k) = i;
%cutoff values that need to be updated every iteration
w1 = 0.2*w(k);
w2 = 0.4*w(k);
w3 = 0.6*w(k);
w4 = 0.8*w(k);
%four different results of B
B1(k) = w(k)*sqrt(mu*e)*sqrt(1-((w1/w(k))^2));
B2(k) = w(k)*sqrt(mu*e)*sqrt(1-((w2/w(k))^2));
B3(k) = w(k)*sqrt(mu*e)*sqrt(1-((w3/w(k))^2));
B4(k) = w(k)*sqrt(mu*e)*sqrt(1-((w4/w(k))^2));
k = k+1;
end
%plot the 4 lines
plot(w,B1,'r') %red line of B1 = f(w)
hold on
plot(w,B2,'g') %green line of B2 = f(w)
hold on
plot(w,B3,'b') %blue line of B3 = f(w)
hold on
plot(w,B4,'k') %black line of B4 = f(w)
Four different cases are represented with four plots (in this example they are overlaid).
The second set can be handled in the same way; you have four constant parameters a1, a2, etc., which do not depend on w this time. So:
B1a(k) = sqrt((w(k)^2)*mu*e - ((pi/a1)^2));
B2a(k) = sqrt((w(k)^2)*mu*e - ((pi/a2)^2));
B3a(k) = sqrt((w(k)^2)*mu*e - ((pi/a3)^2));
B4a(k) = sqrt((w(k)^2)*mu*e - ((pi/a4)^2));
If some errors (due to "fast" writing) occur, report them in the comments and I will correct and update the code.
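For what it's worth, the same curves can be produced without an explicit loop, since every operation involved is element-wise. A minimal vectorized sketch, assuming the dispersion relation B = w*sqrt(mu*e)*sqrt(1-(wc/w).^2) with free-space constants and one fixed example cutoff (both assumptions, adjust as needed):
mu = 1.2566e-6;                    % mu0
e  = 8.854e-12;                    % eps0 (the loop above used e = 1)
f  = linspace(10e6, 10e9, 2000);   % 10 MHz to 10 GHz
w  = 2*pi*f;                       % angular frequency
wc = 2*pi*1e9;                     % example fixed cutoff (assumed)
B  = w.*sqrt(mu*e).*sqrt(max(1 - (wc./w).^2, 0)); % clamped to 0 below cutoff
plot(w, B, 'r')
xlabel('w (rad/s)'); ylabel('B (rad/m)')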

Error executing harmonic product spectrum

I've been trying to find the fundamental notes using the Harmonic Product Spectrum in MATLAB. I came across an algorithm and tried using it. I tested it with a C Major scale (piano) with the notes
C4 D4 E4 F4 G4 A4 B4 C5 B4 A4 G4 F4 E4 D4 C4
I get correct (or almost correct) frequency values for all the notes except the last C4 note, because at that point I get the error
Matrix dimensions must agree.
This is the line that shows the error
f_ym = (1*seg_fft) .* (1.0*seg_fft2) .* (1*seg_fft3) .* (1*seg_fft4);
I'm not quite sure what's wrong here.
I found the note onsets and performed an FFT on each onset and used that for the harmonic product spectrum. This is the part that does the HPS:
h = 1;
for i = 2:No_of_peaks
song_seg = song(max_col(i-1):max_col(i)-1);
L = length(song_seg);
NFFT = 2^nextpow2(L); % Next power of 2 from length of y
seg_fft = fft(song_seg,NFFT);%/L;
%HPS
seg_fft = seg_fft(1 : size(seg_fft,1) / 2);
seg_fft = abs(seg_fft);
%HPS: downsampling
for i = 1:length(seg_fft)
seg_fft2(i,1) = 1;
seg_fft3(i,1) = 1;
seg_fft4(i,1) = 1;
% f_x5(i,1) = 1;
end
for i = 1:floor((length(seg_fft)-1)/2)
seg_fft2(i,1) = (seg_fft(2*i,1) + seg_fft((2*i)+1,1))/2;
end
for i = 1:floor((length(seg_fft)-2)/3)
seg_fft3(i,1) = (seg_fft(3*i,1) + seg_fft((3*i)+1,1) + seg_fft((3*i)+2,1))/3;
end
for i = 1:floor((length(seg_fft)-3)/4)
seg_fft4(i,1) = (seg_fft(4*i,1) + seg_fft((4*i)+1,1) + seg_fft((4*i)+2,1) + seg_fft((4*i)+3,1))/4;
end
%HPS, PartII: calculate product
f_ym = (1*seg_fft) .* (1.0*seg_fft2) .* (1*seg_fft3) .* (1*seg_fft4);
%HPS, PartIII: find max
f_y1 = max(f_ym);
for c = 1 : size(f_ym)
if(f_ym(c, 1) == f_y1)
index = c;
end
end
% Convert that to a frequency
f_y(h) = (index / NFFT) * FS
h=h+1;
f_y = abs(f_y)';
end
Well, I'm trying to find the fundamental frequency in the presence of harmonics. The harmonic product spectrum is one way of doing it, and that is what is implemented in the code above. When I do the multiplication, the size of seg_fft seems to be half the size of seg_fft2, seg_fft3 and seg_fft4.
I don't know how I can make the dimensions the same size; that is where I need some help.
Would really appreciate some quick help. Thanks in advance :)
It is hard to understand exactly what you are trying to do without more explanation, but you get an error while trying to do element-wise multiplication of several matrices. This error message means that not all the matrices have the same size, which is required for element-wise operations. To debug this kind of error, the easiest approach is usually to check the sizes of the various matrices. So add these lines just before the line that gives the error:
disp(size(seg_fft))
disp(size(seg_fft2))
disp(size(seg_fft3))
disp(size(seg_fft4))
This should probably show you that one of them has a size different from what you expect. After this, try to fix your for-loops so that the arrays all end up the same size.
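If the sizes only disagree for the final segment, a likely cause (an assumption based on how MATLAB arrays persist between loop iterations) is that seg_fft2, seg_fft3 and seg_fft4 keep the longer length they were given by an earlier, longer segment. Re-initializing them to the current segment's length at the top of each outer iteration would rule that out:
% Reset the downsampled spectra to the current segment's length, so a
% shorter final segment cannot inherit a longer array from a previous pass.
seg_fft2 = ones(length(seg_fft), 1);
seg_fft3 = ones(length(seg_fft), 1);
seg_fft4 = ones(length(seg_fft), 1);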

Simple Linear Neural Network Weights from Training are not compatible with training results

The weights that I get from training, when applied directly to the input, return different results!
I'll show it with a very simple example.
Let's say we have an input vector x = 0:0.01:1;
and a target vector t = x^2 (I know it would be better to use a nonlinear network).
After training a 2-layer linear network with one neuron in each layer, we get:
sim(net,0.95) = 0.7850 (some error from training - that's OK and expected)
weights from net.IW, net.LW, net.b:
IW =
0.4547
LW =
2.1993
b =
0.3328 -1.0620
If I use the weights directly: Out = purelin(purelin(0.95*IW+b(1))*LW+b(2)) = 0.6200, I get a different result from sim's!
How can that be? What's wrong?
The code:
%Main_TestWeights
close all
clear all
clc
t1 = 0:0.01:1;
x = t1.^2;
hiddenSizes = 1;
net = feedforwardnet(hiddenSizes);
[Xs,Xi,Ai,Ts,EWs,shift] = preparets(net,con2seq(t1),con2seq(x));
net.layers{1,1}.transferFcn = 'purelin';
[net,tr,Y,E,Pf,Af] = train(net,Xs,Ts,Xi,Ai);
view(net);
IW = cat(2,net.IW{1});
LW = cat(2,net.LW{2,1});
b = cat(2,[net.b{1,1},net.b{2,1}]);
%Result from Sim
t2=0.95;
Yk = sim(net,t2)
%Result from Weights
x1 = IW*t2'+b(1)
x1out = purelin(x1)
x2 = purelin(x1out*(LW)+b(2))
The Neural Network Toolbox rescales inputs and outputs to the [-1,1] range by default. You must therefore rescale the input and unscale the output so that your manual computation matches sim()'s output:
%Result from Weights
x1 = 2*t2 - 1; % rescale the input from [0,1] to [-1,1]
x1 = IW*x1+b(1);
x1out = purelin(x1);
x2 = purelin(x1out*(LW)+b(2));
x2 = (x2+1)/2 % unscale the output back to [0,1]
then
>> x2 == Yk
ans =
1
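The hard-coded 2*t2-1 works here because the input happens to span [0,1]. More generally, the toolbox stores the exact mapping it applied; here is a sketch of recovering and reusing it via mapminmax (the processFcns/processSettings property paths are assumptions and may vary between toolbox versions):
% Locate the 'mapminmax' step among the input/output processing functions.
inIdx  = find(strcmp(net.inputs{1}.processFcns,  'mapminmax'));
outIdx = find(strcmp(net.outputs{2}.processFcns, 'mapminmax'));
inPS   = net.inputs{1}.processSettings{inIdx};
outPS  = net.outputs{2}.processSettings{outIdx};
x1 = mapminmax('apply', t2, inPS);              % rescale the raw input
y  = purelin(purelin(IW*x1 + b(1))*LW + b(2));  % manual forward pass
Yk_manual = mapminmax('reverse', y, outPS)      % unscale; should match sim()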

How to implement a neural network with a hidden layer?

I am trying to train a 3-input, 1-output neural network (with an input layer, one hidden layer and an output layer) that can classify quadratics in MATLAB. I am attempting to implement the phases: feed-forward $x_i^{out}=f(s_i)$ with $s_i=\sum_j w_{ij}x_j^{in}$, back-propagation $\delta_j^{in}=f'(s_j)\sum_i \delta_i^{out}w_{ij}$, and the update $w_{ij}^{new}=w_{ij}^{old}-\epsilon\,\delta_i^{out}x_j^{in}$, where $x$ is an input vector, $w$ is a weight and $\epsilon$ is a learning rate.
I am having trouble coding the hidden layer and adding the activation function $f(s)=\tanh(s)$, since the error in the output of the network doesn't seem to decrease. Can someone point out what I am implementing wrong?
The inputs are the real coefficients of the quadratic $ax^2 + bx + c = 0$, and the output should be positive if the quadratic has two real roots and negative if it doesn't.
nTrain = 100; % training set
nOutput = 1;
nSecondLayer = 7; % size of hidden layer (arbitrary)
trainExamples = rand(4,nTrain); % independent random set of examples
trainExamples(4,:) = ones(1,nTrain); % set the dummy input to be 1
T = sign(trainExamples(2,:).^2-4*trainExamples(1,:).*trainExamples(3,:)); % The teacher provides this for every example
%The student neuron starts with random weights
w1 = rand(4,nSecondLayer);
w2 = rand(nSecondLayer,nOutput);
nepochs=0;
nwrong = 1;
S1(nSecondLayer,nTrain) = 0;
S2(nOutput,nTrain) = 0;
while( nwrong>1e-2 ) % more than some small number close to zero
for i=1:nTrain
x = trainExamples(:,i);
S2(:,i) = w2'*S1(:,i);
deltak = tanh(S2(:,i)) - T(:,i); % back propagate
deltaj = (1-tanh(S2(:,i)).^2).*(w2*deltak); % back propagate
w2 = w2 - tanh(S1(:,i))*deltak'; % updating
w1 = w1- x*deltaj'; % updating
end
output = tanh(w2'*tanh(w1'*trainExamples));
dOutput = output-T;
nwrong = sum(abs(dOutput));
disp(nwrong)
nepochs = nepochs+1
end
nepochs
Thanks
After a few days of bashing my head against the wall I discovered a small typo. Below is a working solution:
clear
% Set up parameters
nInput = 4; % number of nodes in input
nOutput = 1; % number of nodes in output
nHiddenLayer = 7; % number of nodes in the hidden layer
nTrain = 1000; % size of training set
epsilon = 0.01; % learning rate
% Set up the inputs: random coefficients between -1 and 1
trainExamples = 2*rand(nInput,nTrain)-1;
trainExamples(nInput,:) = ones(1,nTrain); %set the last input to be 1
% Set up the student neurons for both hidden and the output layers
S1(nHiddenLayer,nTrain) = 0;
S2(nOutput,nTrain) = 0;
% The student neuron starts with random weights from both input and the hidden layers
w1 = rand(nInput,nHiddenLayer);
w2 = rand(nHiddenLayer+1,nOutput);
% Calculate the teacher outputs according to the quadratic formula
T = sign(trainExamples(2,:).^2-4*trainExamples(1,:).*trainExamples(3,:));
% Initialise values for looping
nEpochs = 0;
nWrong = nTrain*0.01;
Wrong = [];
Epoch = [];
while(nWrong >= (nTrain*0.01)) % as long as more than 1% of outputs are wrong
for i=1:nTrain
x = trainExamples(:,i);
S1(1:nHiddenLayer,i) = w1'*x;
S2(:,i) = w2'*[tanh(S1(:,i));1];
delta1 = tanh(S2(:,i)) - T(:,i); % back propagate
delta2 = (1-tanh(S1(:,i)).^2).*(w2(1:nHiddenLayer,:)*delta1); % back propagate
w1 = w1 - epsilon*x*delta2'; % update
w2 = w2 - epsilon*[tanh(S1(:,i));1]*delta1'; % update
end
outputNN = sign(tanh(S2));
delta = outputNN - T; % difference between student and teacher
nWrong = sum(abs(delta/2));
nEpochs = nEpochs + 1;
Wrong = [Wrong nWrong];
Epoch = [Epoch nEpochs];
end
plot(Epoch,Wrong);
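As a sanity check, the trained weights can be evaluated on fresh random quadratics and compared against the sign of the discriminant; a short sketch reusing the same variable names and conventions as above:
% Evaluate the trained network (w1, w2) on unseen data.
nTest = 200;
testX = 2*rand(nInput, nTest) - 1;   % random coefficients in [-1,1]
testX(nInput, :) = 1;                % dummy bias input, as in training
testT = sign(testX(2,:).^2 - 4*testX(1,:).*testX(3,:)); % true labels
hidden = [tanh(w1' * testX); ones(1, nTest)];  % hidden outputs plus bias
pred = sign(tanh(w2' * hidden));               % network predictions
accuracy = mean(pred == testT)                 % fraction classified correctly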