How to do Function approximation in coding a Wavelet neural network? - matlab

I am new to neural networks and am trying to write a wavelet neural network without the MATLAB toolbox.
For function approximation, the sigmoid activation function gives an output between 0 and 1, and the error is computed as a sum of squares; this is what I know so far.
After some iterations, however, the result becomes NaN (not a number) in MATLAB. Even when I change the weight-update parameters so that NaN no longer appears, the trained network returns one specific constant value for any test input.
More information about the network:
3 layers: 1 input in the input layer, 20 units in the hidden layer, 1 output in the output layer
This is part of the code:
--------------------------------------------------------
x=0:0.1*pi:pi;
y=sin(x);
HiddenLayer=20;%Number of Units in HiddenLayer
M=size(x,1);%Number of row in InputData
N=size(x,2);%Number of col in InputData
% % -------InitialValues------
w=zeros(1,HiddenLayer);
w(1,:)=1;%Initial
a=zeros(M,HiddenLayer);
a(:,:)=0.4;%Initial
b=zeros(M,HiddenLayer);
b(:,:)=0.5;
miu=0.2;%Number
beta=0.05;%Number
ErrorGoal=0.0001;%Number
%---------------Main--------------------
Iteration=1;
while Iteration<1000
ErrorSave=zeros(1,N);
z=zeros(1,HiddenLayer);
p=zeros(1,HiddenLayer);
multiwp=zeros(1,HiddenLayer);
for u=1:N % for more than one input if needed
% keyboard;
for i=1:HiddenLayer
z(i)=((x(u)-b(i))/a(i));
end
for j=1:HiddenLayer
p(j)=cos(1.75*z(j))*exp((-z(j)^2)/2);
end
for k=1:HiddenLayer
multiwp(k)=w(k)*p(k);
end
sumwp=sum(multiwp);
gTotal=sigmf(sumwp,[1 0]);
Error=0.5*(y(u)-gTotal).^2;
% keyboard;
ErrorSave(u)=Error;
% keyboard;
gradEW=-(y(u)-gTotal).*gTotal.*(1-gTotal)*cos(1.75*z);
gradEB=zeros(M,HiddenLayer);
for m=1:HiddenLayer
gradEB(m)=-(y(u)-gTotal).*gTotal.*(1-gTotal).*(w(m)./a(m)).*exp(-(z(m).^2)/2).*...
((1.75*sin(1.75*z(m)))+(z(m)*cos(1.75*z(m))));
end
gradEA=zeros(M,HiddenLayer);
for o=1:HiddenLayer
gradEA(o)=-(y(u)-gTotal).*gTotal.*(1-gTotal)*z(o).*(w(o)./a(o))*exp(-(z(o).^2)/2).*...
((1.75*sin(1.75*z(o)))+(z(o)*cos(1.75*z(o))));
end
% keyboard;
%------------Updating Parameter ---------------
w=w-(miu*gradEW)+beta*w;
b=b+(miu*gradEB)+beta*b;
a=a+(miu*gradEA)+beta*a;
end
----------------------------------------------------------------------
I would appreciate it if anyone could point out my mistakes in concept or coding.
Best wishes :)
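One way to look for mistakes of this kind is to compare hand-derived gradients against finite differences on a single sample. The following is a minimal sketch, not the asker's code: it assumes the same wavelet cos(1.75*z)*exp(-z^2/2), a sigmoid output and the 0.5*(y-g)^2 error, and the hidden-layer size and parameter values are made up for illustration.

```matlab
% One training sample and a small hidden layer (values are illustrative).
x = 0.3; y = sin(x);
H = 5;
w = linspace(0.5, 1.5, H);   % output weights
a = 0.4 * ones(1, H);        % dilations
b = linspace(0.1, 0.9, H);   % translations

% Error as a function of all parameters, so it can be re-evaluated.
wavelet = @(z) cos(1.75*z) .* exp(-z.^2/2);
sigm    = @(s) 1 ./ (1 + exp(-s));
loss    = @(w, a, b) 0.5 * (y - sigm(sum(w .* wavelet((x - b)./a)))).^2;

% Analytic gradients from the chain rule.
z  = (x - b) ./ a;
p  = wavelet(z);
g  = sigm(sum(w .* p));
dg = -(y - g) * g * (1 - g);                                  % dE/d(sum)
dp = -exp(-z.^2/2) .* (1.75*sin(1.75*z) + z.*cos(1.75*z));    % dp/dz
gradW = dg * p;                      % p includes the exp factor
gradB = dg * w .* dp .* (-1 ./ a);   % dz/db = -1/a
gradA = dg * w .* dp .* (-z ./ a);   % dz/da = -z/a

% Central finite differences for comparison.
h = 1e-6;
numW = zeros(1, H); numA = numW; numB = numW;
for k = 1:H
    e = zeros(1, H); e(k) = h;
    numW(k) = (loss(w+e, a, b) - loss(w-e, a, b)) / (2*h);
    numA(k) = (loss(w, a+e, b) - loss(w, a-e, b)) / (2*h);
    numB(k) = (loss(w, a, b+e) - loss(w, a, b-e)) / (2*h);
end
maxDiff = max(abs([gradW - numW, gradA - numA, gradB - numB]));
fprintf('largest analytic/numeric mismatch: %g\n', maxDiff);
```

Once the gradients agree, a plain gradient-descent step subtracts each of them (for example w = w - miu*gradW, and likewise for a and b). Note that a term such as +beta*w multiplies the parameter by (1+beta) on every sample, which can grow without bound over many iterations.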

Related

How to interpret the regression plot obtained at the end of neural network regression for multiple outputs?

I have trained my neural network model using the MATLAB NN Toolbox. My network has multiple inputs and multiple outputs, 6 and 7 respectively. I have a few questions about it:
The final regression plot shown at the end of training indicates very good accuracy, R~0.99. However, since I have multiple outputs, I am confused as to which scatter plot it represents. Shouldn't there be 7 target vs. predicted plots, one for each output variable?
As far as I know, R^2 is a better measure of the accuracy of the model, whereas MATLAB reports R in its plot. Do I treat that R as R^2, or should I square the reported R value to obtain R^2?
I have generated the MATLAB script containing the weights, biases and activation functions as the final result of the training. So shouldn't I be able to simply give my raw data as input and obtain the corresponding predicted output? I fed in the exact same training set, using the indices MATLAB chose for training (to cross-check), and plotted the predicted output vs. the actual output, but the result is not good at all; definitely not along the lines of R~0.99. Am I doing anything wrong?
code:
function [y1] = myNeuralNetworkFunction_2(x1)
%MYNEURALNETWORKFUNCTION neural network simulation function.
% X = [torque T_exh lambda t_Spark N EGR];
% Y = [O2R CO2R HC NOX CO lambda_out T_exh2];
% Generated by Neural Network Toolbox function genFunction, 17-Dec-2018 07:13:04.
%
% [y1] = myNeuralNetworkFunction(x1) takes these arguments:
% x = Qx6 matrix, input #1
% and returns:
% y = Qx7 matrix, output #1
% where Q is the number of samples.
%#ok<*RPMT0>
% ===== NEURAL NETWORK CONSTANTS =====
% Input 1
x1_step1_xoffset = [-24;235.248;0.75;-20.678;550;0.799];
x1_step1_gain = [0.00353982300884956;0.00284355877067267;6.26959247648903;0.0275865874012055;0.000366568914956012;0.0533831576137729];
x1_step1_ymin = -1;
% Layer 1
b1 = [1.3808996210168685;-2.0990163849711894;0.9651733083552595;0.27000953282929346;-1.6781835509820286;-1.5110463684800366;-3.6257438832309905;2.1569498669085361;1.9204156230460485;-0.17704342477904209];
IW1_1 = [-0.032892214008082517 -0.55848270745152429 -0.0063993424771670616 -0.56161004933654057 2.7161844536020197 0.46415317073346513;-0.21395624254052176 -3.1570133640176681 0.71972178875396853 -1.9132557838515238 1.3365248285282931 -3.022721627052706;-1.1026780445896862 0.2324603066452392 0.14552308208231421 0.79194435276493658 -0.66254679969168417 0.070353201192052434;-0.017994515838487352 -0.097682677816992206 0.68844109281256027 -0.001684535122025588 0.013605622123872989 0.05810686279306107;0.5853667840629273 -2.9560683084876329 0.56713425120259764 -2.1854386350040116 1.2930115031659106 -2.7133159265497957;0.64316656469750333 -0.63667017646313084 0.50060179040086761 -0.86827897068177973 2.695456517458648 0.16822164719859456;-0.44666821007466739 4.0993786464616679 -0.89370838440321498 3.0445073606237933 -3.3015566360833453 -4.492874075961689;1.8337574137485424 2.6946232855369989 1.1140472073136622 1.6167763205944321 1.8573696127039145 -0.81922672766933646;-0.12561950922781362 3.0711045035224349 -0.6535751823440773 2.0590707752473199 -1.3267693770634292 2.8782780742777794;-0.013438026967107483 -0.025741311825949621 0.45460734966889638 0.045052447491038108 -0.21794568374100454 0.10667240367191703];
% Layer 2
b2 = [-0.96846557414356171;-0.2454718918618051;-0.7331628718025488;-1.0225195290982099;0.50307202195645395;-0.49497234988401961;-0.21817117469133171];
LW2_1 = [-0.97716474643411022 -0.23883775971686808 0.99238069915206006 0.4147649511973347 0.48504023209224734 -0.071372217431684551 0.054177719330469304 -0.25963474838320832 0.27368380212104881 0.063159321947246799;-0.15570858147605909 -0.18816739764334323 -0.3793600124951475 2.3851961990944681 0.38355142531334563 -0.75308427071748985 -0.1280128732536128 -1.361052031781103 0.6021878865831336 -0.24725687748503239;0.076251356114485525 -0.10178293627600112 0.10151304376762409 -0.46453434441403058 0.12114876632815359 0.062856969143306296 -0.0019628163322658364 -0.067809039768745916 0.071731544062023825 0.65700427778446913;0.17887084584125315 0.29122649575978238 0.37255802759192702 1.3684190468992126 0.60936238465090853 0.21955911453674043 0.28477957899364675 -0.051456306721251184 0.6519451272106177 -0.64479205028051967;0.25743349663436799 2.0668075180209979 0.59610776847961111 -3.2609682919282603 1.8824214917530881 0.33542869933904396 0.03604272669356564 -0.013842766338427388 3.8534510207741826 2.2266745660915586;-0.16136175574939746 0.10407287099228898 -0.13902245286490234 0.87616472446622717 -0.027079111747601223 0.024812287505204988 -0.030101536834009103 0.043168268669541855 0.12172932035587079 -0.27074383434206573;0.18714562505165402 0.35267726325386606 -0.029241400610813449 0.53053853235049087 0.58880054832728757 0.047959541165126809 0.16152268183097709 0.23419456403348898 0.83166785128608967 -0.66765237856750781];
% Output 1
y1_step1_ymin = -1;
y1_step1_gain = [0.114200879346771;0.145581598485951;0.000139011547272197;0.000456244862967996;2.05816254143146e-05;5.27704485488127;0.00284355877067267];
y1_step1_xoffset = [-0.045;1.122;2.706;17.108;493.726;0.75;235.248];
% ===== SIMULATION ========
% Dimensions
Q = size(x1,1); % samples
% Input 1
x1 = x1';
xp1 = mapminmax_apply(x1,x1_step1_gain,x1_step1_xoffset,x1_step1_ymin);
% Layer 1
a1 = tansig_apply(repmat(b1,1,Q) + IW1_1*xp1);
% Layer 2
a2 = repmat(b2,1,Q) + LW2_1*a1;
% Output 1
y1 = mapminmax_reverse(a2,y1_step1_gain,y1_step1_xoffset,y1_step1_ymin);
y1 = y1';
end
% ===== MODULE FUNCTIONS ========
% Map Minimum and Maximum Input Processing Function
function y = mapminmax_apply(x,settings_gain,settings_xoffset,settings_ymin)
y = bsxfun(@minus,x,settings_xoffset);
y = bsxfun(@times,y,settings_gain);
y = bsxfun(@plus,y,settings_ymin);
end
% Sigmoid Symmetric Transfer Function
function a = tansig_apply(n)
a = 2 ./ (1 + exp(-2*n)) - 1;
end
% Map Minimum and Maximum Output Reverse-Processing Function
function x = mapminmax_reverse(y,settings_gain,settings_xoffset,settings_ymin)
x = bsxfun(@minus,y,settings_ymin);
x = bsxfun(@rdivide,x,settings_gain);
x = bsxfun(@plus,x,settings_xoffset);
end
The code above is the automatically generated function. The plot I generated to cross-check the first variable is produced as follows:
% X and Y are input and output - same as above
X_train = X(results.info1.train.indices,:);
y_train = Y(results.info1.train.indices,:);
out_train = myNeuralNetworkFunction_2(X_train);
scatter(y_train(:,1),out_train(:,1))
To answer your question about R: Yes, you should square R to get the R^2 value. In this case, they will be very close since R is very close to 1.
The graphs give the correlation between the estimated and real (target) values, so R is the strength of the correlation. You can square it to find R-squared.
The graph you drew and the one MATLAB gave do not plot the same variables. The ranges or scales of the axes are very different.
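To make the per-output question concrete, here is a minimal sketch that computes R and R^2 separately for each output column with corrcoef. Following the answers above, R^2 is taken as the square of the correlation coefficient. The matrices T and P are made-up stand-ins for the target and predicted outputs (3 columns instead of 7), not data from the question.

```matlab
% Hypothetical targets T and predictions P: one row per sample,
% one column per output variable.
T = [1 10 0.2; 2 12 0.1; 3 15 0.4; 4 19 0.3; 5 20 0.5];
P = [1.1 10.5 0.3; 1.9 11.8 0.2; 3.2 15.1 0.1; 3.8 18.7 0.4; 5.1 20.4 0.2];

nOut = size(T, 2);
R = zeros(1, nOut);
for k = 1:nOut
    c = corrcoef(T(:,k), P(:,k));  % 2x2 correlation matrix
    R(k) = c(1,2);                 % correlation of target vs. prediction
end
R2 = R.^2;                         % squared correlation per output

for k = 1:nOut
    fprintf('output %d: R = %.4f, R^2 = %.4f\n', k, R(k), R2(k));
end
```

Looking at each column on its own makes it easier to see whether one poorly predicted output is hidden behind a high overall R.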
First of all, is the problem you are trying to solve a regression problem? Or is it a classification problem with 7 classes converted to numeric values? I assume this is a classification problem, as you are trying to get the success rate for each class.
As for your first question: according to the literature, it is recommended to use the "All: R" value. If you want the success rate for each of your classes, the measures that apply to classification problems are Precision, Recall, F-measure, FP rate, TP rate, and so on; those are the values you need. There is a lot of MATLAB documentation on this (help ROC), and you can look at the details there. All the values I mentioned, which I think are what you actually want, are obtained from the confusion matrix.
Here is a good example of this.
[x,t] = simpleclass_dataset;
net = patternnet(10);
net = train(net,x,t);
y = net(x);
[c,cm,ind,per] = confusion(t,y)
I hope you will see what you want from the "nntraintool" window that appears when you run the code.
Your other questions have already been answered. Alternatively, you can consider using a machine learning algorithm with open source software such as Weka.

(matlab) MLP with relu and softmax not working with mini-batch SGD and produces similar predictions on MNIST dataset

I implemented a multilayer perceptron with 1 hidden layer on the MNIST dataset. The activation function in the hidden layer is leaky (0.01) ReLU, and the output layer has a softmax activation function. The learning method is mini-batch SGD. The network structure is 784*30*10. The problem is that the predictions the network makes are quite similar for every input sample; the model always tends to think the image is one particular digit. Thanks to @Lemm Ras for pointing out the label-data mismatch in the previous data_shuffle function, which is now fixed. But after some batch training, the predictions are still somewhat similar, which is confusing.
Another issue is that the update values are too small compared with the original weights. In the MLP code, I added the variables 'cc' and 'dd' to record the ratio between each weight update and its weight:
cc=W_OUTPUT_Update./W_OUTPUT;
dd=W_MLP_Update./W_MLP;
During debugging, the magnitude of cc is 10^-4 (0.0001), and dd is also 10^-4. This might be the reason that the accuracy doesn't seem to improve much.
After several days of debugging, I still have no idea why this happens or how to solve it; I have been stuck for a week. Can someone help me, please?
The screenshot shows the value of A2 after the softmax function.
[dimension, images, labels, labels_matrix, train_amount, test_labels_matrix, test_images, test_labels, test_amount] = load_mnist_data(); %initialize str
images=images(:,1:10000); % for debugging, get part of whole data set
labels=labels(1:10000,1);
labels_matrix=labels_matrix(:,1:10000);
test_images=test_images(:,1:500);
test_labels=test_labels(1:500,1);
train_amount=10000;
test_amount=500;
% initialize the structure
[ W_MAD, W_MLP, W_OUTPUT] = initialize_structure(dimension, train_amount, test_amount);
epoch=100;
correct_rate=zeros(1,epoch); %record testing accuracy
corr=zeros(1,epoch); %record training accuracy
lr=0.2;
lamda=0;
batch_size=50;
for i=1:epoch
sprintf('MLP in iteration %d over %d', i, epoch)
%shuffle data
[labels_shuffled labels_matrix_shuffled images_shuffled]=shuffle_data(labels, labels_matrix,images);
[ cor, W_MLP, W_OUTPUT ] = train_mlp_relu(lr, leaky, lamda, momentum_gamma, batch_size,W_MLP, W_OUTPUT, W_MAD, power, images_shuffled, train_amount, labels_shuffled, labels_matrix_shuffled);
corr(i)=cor/train_amount;
% test
correct_rate(i) = structure_test( W_MAD, W_MLP, W_OUTPUT, test_images, test_labels, test_amount );
end
% plot results
plot(1:epoch,correct_rate);
Here's the MLP training function; please ignore the L2 regularization parameter lamda, which is currently set to 0.
%MLP with batch size batch_size
cor=0;
%leaky=(1/batch_size);
leaky=0.001;
for i=1:train_amount/batch_size
batch_images=images(:,batch_size*(i-1)+1:batch_size*i);
batch_labels=labels_matrix(:,batch_size*(i-1)+1:batch_size*i);
%from MAD to MLP
V1=W_MLP'*batch_images;
V1(1,:)=1; %set bias unit as 1
V1_dirivative=ones(size(V1));
V1_dirivative(find(V1<0))=leaky;
A1=relu(V1,leaky); % A stands for activation
V2=W_OUTPUT'* A1;
A2=softmax(V2);
%write these scope control codes into functions.
%train error
[val idx]=max(A2);
idx=idx-1; %because the matrix index (idx) varies from 1 to 10 while the label varies from 0 to 9.
res=labels(batch_size*(i-1)+1:batch_size*i)-idx';
cor=cor+sum(res(:)==0);
%softmax loss, due to relu applied nodes that has
%contribution to activate neurons has gradient 1; while <0 nodes
%has no contribution
delta_softmax=-(1/batch_size)*(batch_labels-A2);
delta_output=W_OUTPUT*delta_softmax.*V1_dirivative;
%update
W_OUTPUT_Update=lr*(1/batch_size)*A1*delta_softmax'+lamda*W_OUTPUT;
cc=W_OUTPUT_Update./W_OUTPUT;
W_MLP_Update=lr*(1/batch_size)*batch_images*delta_output'+lamda*W_MLP;
dd=W_MLP_Update./W_MLP;
k=mean(A2,2);
W_OUTPUT=W_OUTPUT-W_OUTPUT_Update;
W_MLP=W_MLP-W_MLP_Update;
end
end
Here is the softmax function:
function [ val ] = softmax( val )
val=exp(val);
val=val./repmat(sum(val),10,1);
end
The labels_matrix is the aimed output matrix for A2 and created as:
labels_matrix=full(sparse(labels+1,1:train_amount,1));
test_labels_matrix=full(sparse(test_labels+1,1:test_amount,1));
And Relu:
function [ val ] = relu( val,leaky )
val(find(val<0))=leaky*val(find(val<0));
end
Data shuffle
%this version is wrong, due to it only shuffles label and data without doing the same shuffling on the 'labels_matrix' which is used to calculate MLP's delta in output layer. It destroyed the link between data and label.
% function [ label, data ] = shuffle_data( label, data )
% [row column]=size(data);
% array=randperm(column);
% data=data(:,array);
% label=label(array);
% %if shuffle respect to row then use the code below
% %data=data(randperm(row),:);
% end
function [ label, label_matrix, data ] = shuffle_data( label, label_matrix, data )
[row column]=size(data);
array=randperm(column);
data=data(:,array);
label=label(array);
label_matrix=label_matrix(:, array);
%if shuffle respect to row then use the code below
%data=data(randperm(row),:);
end
Data loading:
function [ dimension, images, labels, labels_matrix, train_amount, test_labels_matrix, test_images, test_labels, test_amount] = load_mnist_data()
%%load training and testing data, labels
data_location='C:\Users\yz39g15\Documents\MATLAB\common\mnist test\for the report/modify/train-images.idx3-ubyte';
label_location='C:\Users\yz39g15\Documents\MATLAB\common\mnist test\for the report/modify/train-labels.idx1-ubyte';
test_data_location='C:\Users\yz39g15\Documents\MATLAB\common\mnist test\for the report/modify/t10k-images.idx3-ubyte';
test_label_location='C:\Users\yz39g15\Documents\MATLAB\common\mnist test\for the report/modify/t10k-labels.idx1-ubyte';
images = loadMNISTImages(data_location);
labels = loadMNISTLabels(label_location);
test_images=loadMNISTImages(test_data_location);
test_labels=loadMNISTLabels(test_label_location);
%%data centralization
[dimension train_amount]=size(images);
[dimension test_amount]=size(test_images);
%%complete normalization
%%transform labels from index to matrix in order to apply square loss function in output layer
labels_matrix=full(sparse(labels+1,1:train_amount,1));
test_labels_matrix=full(sparse(test_labels+1,1:test_amount,1));
end
When you shuffle the images, the data-label association is lost. Since this association must survive, you need to enforce the same shuffling for both data and labels.
To do so you could, for instance, create an external shuffled index list, shuffled=randperm(N), with N the number of images, and then pass to the train method either that list or the elements of images and labels addressed by it.
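The first option, passing the index list and addressing every batch through it, can be sketched as follows. This is an illustration on toy data, not the MNIST code from the question; the variable names batch_targets and pairs_ok are made up, and the label of each sample is stored in its first feature only so that the pairing can be checked.

```matlab
% Toy data: 4 features x 8 samples, one column per sample.
N = 8;
images = reshape(1:4*N, 4, N);
labels = images(1,:)';                  % label tied to the sample's column
labels_matrix = [labels'; 2*labels'];   % stand-in for a per-sample target matrix

shuffled = randperm(N);                 % one permutation shared by everything
batch_size = 4;
pairs_ok = true;
for i = 1:N/batch_size
    idx = shuffled(batch_size*(i-1)+1 : batch_size*i);
    batch_images  = images(:, idx);
    batch_labels  = labels(idx);
    batch_targets = labels_matrix(:, idx);
    % every column must still carry its own label and target
    pairs_ok = pairs_ok && isequal(batch_images(1,:)', batch_labels) ...
                        && isequal(batch_targets(1,:)', batch_labels);
end
disp(pairs_ok)
```

Because images, labels and the target matrix are all indexed with the same idx, no array can fall out of step with the others, and the original arrays stay unshuffled for later inspection.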

Error Backpropagation - Neural network

I am trying to write code for error back-propagation in a neural network, but my code takes a really long time to execute. I know that training a neural network takes a long time, but even a single iteration takes a long time.
Multi-class classification problem!
Total number of training set = 19978
Number of inputs = 513
Number of hidden units = 345
Number of classes = 10
Below is my entire code:
X=horzcat(ones(19978,1),inputMatrix); %Adding bias
M=floor(0.66*(513+10)); %Taking two-thirds of input+output
Wji=rand(513,M);
aj=X*Wji;
zj=tanh(aj); %Hidden Layer output
Wkj=rand(M,10);
ak=zj*Wkj;
akTranspose = ak';
ykTranspose=softmax(akTranspose); %For multi-class classification
yk=ykTranspose'; %Final output
error=0;
%Initializing target variables
t = zeros(19978,10);
t(1:2000,1)=1;
t(2001:4000,2)=1;
t(4001:6000,3)=1;
t(6001:8000,4)=1;
t(8001:10000,5)=1;
t(10001:12000,6)=1;
t(12001:14000,7)=1;
t(14001:16000,8)=1;
t(16001:18000,9)=1;
t(18001:19778,10)=1;
errorArray=zeros(100000,1); %Storing error values to keep track of the error per iteration
errorDiff=zeros(100000,1);
for nIterations=1:5
errorOld=error;
aj=X*Wji; %Forward propagating in each iteration
zj=tanh(aj);
ak=zj*Wkj;
akTranspose = ak';
ykTranspose=softmax(akTranspose);
yk=ykTranspose';
error=0;
%Calculating error
for n=1:19978 %for 19978 training samples
for k=1:10 %for 10 classes
error = error + t(n,k)*log(yk(n,k)); %using cross entropy function
end
end
error=-error;
Ediff = error-errorOld;
errorArray(nIterations,1)=error;
errorDiff(nIterations,1)=Ediff;
%Calculating dervative of error wrt weights wji
derEWji=zeros(513,345);
derEWkj=zeros(345,10);
for i=1:513
for j=1:M;
derErrorTemp=0;
for k=1:10
for n=1:19978
derErrorTemp=derErrorTemp+Wkj(j,k)*(yk(n,k)-t(n,k));
%Calculating derivative of E wrt Wkj
derEWkj(j,k) = derEWkj(j,k)+(yk(n,k)-t(n,k))*zj(n,j);
end
end
for n=1:19978
%Calculating derivative of E wrt Wji
derEWji(i,j) = derEWji(i,j)+(1-(zj(n,j)*zj(n,j)))*derErrorTemp;
end
end
end
eta = 0.0001; %learning rate
Wji = Wji - eta.*derEWji; %updating weights
Wkj = Wkj - eta.*derEWkj;
end
for-loops are very time-consuming in MATLAB, even with the help of the JIT. Try to vectorize your code rather than organizing it in 3 or even 4 nested loops. For example,
for n=1:19978 %for 19978 training samples
for k=1:10 %for 10 classes
error = error + t(n,k)*log(yk(n,k)); %using cross entropy function
end
end
can be changed to:
error = sum(sum(t.*log(yk))); % t and yk are both n-by-k arrays that you construct
You can do something similar for the rest of your code. Use dot products or matrix multiplications on arrays, depending on the case.
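As an illustration of the same idea for the gradient loops, here is a minimal sketch of the standard backpropagation formulas for a tanh hidden layer with a softmax output and cross-entropy error, written with matrix products. It is not a line-by-line translation of the loops in the question, and the problem sizes and random data are made up so that the snippet runs on its own.

```matlab
% Small random problem; the sizes are illustrative, not the 19978x513 data.
N = 6; D = 4; M = 3; K = 5;
X = [ones(N,1), randn(N,D-1)];          % inputs with a bias column
cls = mod((0:N-1)', K) + 1;             % class index of each sample
t = zeros(N,K);
t(sub2ind([N K], (1:N)', cls)) = 1;     % one-hot targets
Wji = 0.1*randn(D,M);
Wkj = 0.1*randn(M,K);

% Forward pass (softmax written out so no toolbox is needed).
zj = tanh(X*Wji);
ak = zj*Wkj;
ek = exp(bsxfun(@minus, ak, max(ak,[],2)));
yk = bsxfun(@rdivide, ek, sum(ek,2));

% Cross-entropy error without loops.
err = -sum(sum(t .* log(yk)));

% Gradients without loops.
dK = yk - t;                            % N x K output deltas
derEWkj = zj' * dK;                     % M x K
dJ = (dK * Wkj') .* (1 - zj.^2);        % N x M hidden deltas
derEWji = X' * dJ;                      % D x M
```

Each matrix product replaces one summation over samples or classes, so the four nested loops collapse into a handful of array operations.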

complex valued neural network (CVNN) error divergence

I am currently working on my undergraduate thesis on complex-valued neural networks (CVNN). My topic is a single-layered complex-valued neural network for real-valued classification problems. I am using the gradient-descent learning rule to classify the dataset given below:
Data Set
The algorithm I used can be found on page 946 of the following PDF, labeled as the Complex-valued neuron (CVN) Model. The main algorithm is in section 3 of that topic.
Algorithm of CVN Model
But instead of converging, my error curve diverges. Here is my error curve:
error curve at CVNN implementation
I am simulating this in MATLAB. My implementation is given below:
clc
clear all
epoch=1000;
n=8;
%x=real input value
in=dlmread('Diabetes1.txt');
x=in(1:384,1:8);
%d=desired output value
out=dlmread('Diabetes1.txt');
data_1=out(1:384,9);
data_2=out(1:384,10);
%m=complex representation of input
m=(cos((pi).*(x(:,:)-0))+1i*sin((pi).*(x(:,:)-0)));
%
%research
%m=i.*x(:,:)
%m=x(:,:)+i.*x(:,:)
%Wih=weight
%
%m=x(:,:).*(cos(pi./4)+i.*sin(pi./4));
Wih1 =0.5* exp(1i * 2*pi*rand(8,1));
Wih2 =0.5* exp(1i * 2*pi*rand(8,1));
%Pih=bias
Pih1 =0.5*exp(1i * 2*pi*rand(1,1));
Pih2 =0.5*exp(1i * 2*pi*rand(1,1));
for ite=1:epoch
% www(ite)=ite;
E_Total=0;
E1t=0;
E2t=0;
for j=1:384
%blr=learning rate
blr=0.1;
%cpat=current pattern
cpat = m(j,:);
z1=cpat*Wih1+Pih1;
u1=real(z1);
v1=imag(z1);
fu1=1/(1+exp(-u1));
fv1=1/(1+exp(-v1));
%y=actual output
%for activation function 1
y1=sqrt((fu1).^2+(fv1).^2);
%for activation function 2
% y1=(fu1-fv1).^2;
error1=(data_1(j,1)-y1);
E1=((data_1(j,1)-y1).^2);
t11=1./(1+exp(-u1));
f11=t11.*(1-t11);
t21=1./(1+exp(-v1));
f21=t21.*(1-t21);
%for activation function 1
r1= blr.*(data_1(j,1)-y1).*((t11.*f11)./y1)+i.*blr.*(data_1(j,1)-y1).*((t21.*f21)./y1);
%for activation function 2
%r1=2.*blr.*(data_1(j,1)-y1).*(t11-t21).*f11+1i.*2.*blr.*(data_1(j,1)-y1).*(t21-t11).*f21;
%
Pih1=Pih1+r1;
Wih1= Wih1+(conj(m(j,:)))'.*r1;
%////////////////////////////////////////////////
%cpat=current pattern
z2=cpat*Wih2+Pih2;
u2=real(z2);
v2=imag(z2);
fu2=1./(1+exp(-u2));
fv2=1./(1+exp(-v2));
% fu2=tanh(u2);
% fv2=tanh(v2);
%y=actual output
%for activation function 1
y2=sqrt((fu2).^2+(fv2).^2);
%for activation function 2
% y2=(fu2-fv2).^2;
error2=(data_2(j,1)-y2);
E2=((data_2(j,1)-y2).^2);
t12=1./(1+exp(-u2));
f12=t12.*(1-t12);
t22=1./(1+exp(-v2));
f22=t22.*(1-t22);
%for activation function1
r2= blr.*(data_2(j,1)-y2).*((t12.*f12)./y2)+i.*blr.*(data_2(j,1)-y2).*((t22.*f22)./y2);
%for activation function 2
%r2=2*blr*(data_2(j,1)-y2)*(t12-t22)*f12+1i*2*blr*(data_2(j,1)-y2)*(t22-t12)*f22;
Pih2=Pih2+r2;
Wih2= Wih2+(conj(m(j,:)))'.*r2;
%///////////////////////////////////////////////
E1t=E1+E1t;
E2t=E2+E2t;
E_Total=(E1+E2+E_Total);
E1;
E2;
end
Err=E_Total/(2.*384);
figure(1)
plot(ite,Err,'b--')
hold on;
%figure(1)
end
dlmwrite('weight.txt',Wih1)
dlmwrite('weight.txt', Wih2, '-append', ...
'roffset', 1, 'delimiter', ' ')
dlmwrite('weight.txt', Pih1, '-append', ...
'roffset', 1, 'delimiter', ' ')
dlmwrite('weight.txt', Pih2, '-append', ...
'roffset', 1, 'delimiter', ' ')
I still could not figure out the reason behind this opposite behavior on the dataset, so any kind of help is appreciated.
If you are doing gradient descent, a very common debugging technique is to check whether the gradient you calculated actually matches the numerical gradient of your loss function.
That is, check
(f(x+dx)-f(x))/dx ≈ f'(x)
for a variety of small dx. Usually you try along each dimension, as well as in a variety of random directions (where the right-hand side becomes the directional derivative). You also want to do this check for a variety of values of x.
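The check described above can be sketched as follows. The loss f and its hand-derived gradient gradf are hypothetical examples chosen only to make the snippet self-contained; in practice they would be replaced by the network's error function and the gradient used in the update rule.

```matlab
% Example loss and its hand-derived gradient (both hypothetical).
f     = @(x) sum(x.^2) + sin(x(1));
gradf = @(x) 2*x + [cos(x(1)); zeros(numel(x)-1, 1)];

x  = [0.3; -1.2; 2.0];   % point at which to check
g  = gradf(x);
dx = 1e-6;

% Check along each coordinate direction.
numg = zeros(size(x));
for k = 1:numel(x)
    e = zeros(size(x)); e(k) = dx;
    numg(k) = (f(x + e) - f(x)) / dx;   % forward difference
end
coordErr = max(abs(numg - g));

% Check along one random unit direction d: the slope should equal g'*d.
d = randn(size(x)); d = d / norm(d);
dirErr = abs((f(x + dx*d) - f(x)) / dx - g' * d);

fprintf('coordinate error %g, directional error %g\n', coordErr, dirErr);
```

Both errors should be on the order of dx. For complex-valued weights, one option is to perturb the real and imaginary parts separately and compare each against the corresponding part of the update.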
You should take a look at this blog for complex back-propagation.
For holomorphic functions, complex BP is fairly straightforward.
Non-holomorphic functions (every CVNN must have at least one) need careful treatment.

fisheriris data and perceptron

I want to apply the perceptron algorithm to the fisheriris data, and I tried this code:
function [ ] = Per( )
%PERCEPTON_NN Summary of this function goes here
% Detailed explanation goes here
%%%%%%%%%%%STEP ONE INPUT DATA PREPERATION
%N=3000;
load fisheriris
tr=50; %traning
te=50; %test
epochs =150;
data=meas;
%N = size(meas,1);
%species=nonomil(species)
%figure,plot(data_shuffeled(1,:),data_shuffeled(2,:),'rx');
%%%%%%%%%%%%%%%%%%%STEP TWO INTIALIZE WEIGHT
baise=1;
w=[baise; 1 ; 1;1 ; 1];
eta=0.9; %%learning rate
%%%%%%%%%%%%%%%%%%%%%%%% Rosen Blatt learning Algo
for epoch=1 : epochs
for i=1 : tr
x=[1;data(i,1);data(i,2);data(i,3);data(i,4)]; % input vector
N = size(species,i); %desired output
y=w'*x; % y = actual output, w' = transpose of w; possibly y=w.*x
%%%%%%%%%%%%%%% Activation function (hardlimit)(step fn)
y=1*(y>=0)+(-1)*(y<0); % this replaces the if
%%%%%%%%%%% Error calcualtion %%%%%%%
err(i)=N-y;
%%%%%%%%%%%%%% update weight %%%%%%%%%5
wnew=w+ eta*err(i)*x;
w=wnew;
end
mse(epoch)=mean(err.^2);
end
%%%%%%%%%%%%%%%%%%%%%% step four classification (testing) %%%%%%%%%%%%%%%%%%5
hold on
for i=1 : te
%x=[1;data(i,1);data(i,2),data(i,3);data(i,4)];
x=[1;data(i,1);data(i,2);data(i,3);data(i,4)];
% d=data_shuffeled(3,i+tr);
N = size(species,i);
y=w'*x;
y=1*(y>=0)+(-1)*(y<0);
if (y==1)
plot(x(2),x(3),x(4),x(5),'rx');
elseif y==-1
plot(x(2),x(3),x(4),x(5),'r&');
end
end
hold off
if abs(N-y)>1E-6
testerro=testerro+1;
end
I wrote this code to run the perceptron algorithm with the fisheriris data "meas" as input and "species" as output.
Any help with this code, or any modification of it, is welcome.
Thanks.
First, did you know that MATLAB has something for neural network training called the Neural Network Toolbox?
Second, I think data_shuffeled is your own function. There is something called randperm in MATLAB that you should use to shuffle your data.
Third, you want to avoid using for-loops when you can use vectors/matrices in MATLAB.
Instead of doing (for testing)
for i = 1:te,
....
end
You might want to do
X = [ones(te,1), data(1:te,:)]; %X is [50x5] so each row of X is x'
y = 2*(X*w >= 0) - 1; %y is [50x1]: hard limit of X*w, with X [50x5] and w [5x1]
idx_p1 = y==1; %all the indices of y where y is +1
idx_m1 = y==-1; %all the indices of y where y is -1
plot(X(idx_p1,2),y(idx_p1),'rx');
hold on
plot(X(idx_m1,2),y(idx_m1),'ro');
hold off
I don't know how you were using plot with a 4-dimensional X, so the above just plots against the first feature (column) of X.
Additionally, the training looks strange to me. For one, I don't think you should use N for both the size of the data matrix meas and the desired output; 'yhat' or 'ybar' is a better name. Also, if N is the desired output, then why is it size(species,i), where i loops through 1:50? species is a [150x1] vector, so size(species,1) = 150, and size(species,x) for x from 2 to 50 will be 1. Are you sure you want this? Shouldn't it be something like:
yhat = -ones(size(species)); %Everything is -1 (species is [150x1])
yhat(strmatch('virginica',species)) = 1; %except virginicas which are +1
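With desired outputs coded as -1/+1 in this way, the training loop can be sketched as below. This is a minimal illustration on made-up, linearly separable 2-D points rather than the iris measurements, so that it runs without any toolbox; the names d, Xb and nErr are chosen for this example only.

```matlab
% Synthetic, linearly separable 2-D data (not the iris measurements).
X = [ 2 1; 3 2; 2.5 3; 4 3; -1 -2; -2 -1; -3 -2.5; -1.5 -3];
d = [ 1; 1; 1; 1; -1; -1; -1; -1 ];        % desired outputs in {-1,+1}
Xb = [ones(size(X,1),1), X];               % prepend the bias input

w = zeros(size(Xb,2), 1);
eta = 0.1;                                 % learning rate
for epoch = 1:20
    nErr = 0;
    for i = 1:size(Xb,1)
        y = 2*(Xb(i,:)*w >= 0) - 1;        % hard-limit activation
        e = d(i) - y;                      % 0 when correct, +/-2 when wrong
        w = w + eta * e * Xb(i,:)';        % Rosenblatt update
        nErr = nErr + (e ~= 0);
    end
    if nErr == 0, break; end               % no mistakes left on the training set
end
pred = 2*(Xb*w >= 0) - 1;                  % vectorized classification
accuracy = mean(pred == d);
```

The key difference from the code in the question is that the error compares the output with a per-sample target d(i), not with a value returned by size.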