Export a neural network trained with MATLAB to other programming languages

I trained a neural network using the MATLAB Neural Network Toolbox, in particular the command nprtool, which provides a simple GUI for the toolbox features and lets you export a net object containing the information about the generated NN.
In this way, I created a working neural network that I can use as a classifier. A diagram representing it is the following:
There are 200 inputs, 20 neurons in the first hidden layer, and 2 neurons in the last layer that provide a two-dimensional output.
What I want to do is use the network in some other programming language (C#, Java, ...).
To do this, I tried the following code in MATLAB:
y1 = tansig(net.IW{1} * input + net.b{1});
Results = tansig(net.LW{2} * y1 + net.b{2});
Assuming that input is a one-dimensional array of 200 elements, the previous code would work if net.IW{1} were a 20x200 matrix (20 neurons, 200 weights each).
The problem is that size(net.IW{1}) returns unexpected values:
>> size(net.IW{1})
ans =
20 199
I got the same problem with a network with 10000 inputs. In that case, the result wasn't 20x10000, but something like 20x9384 (I don't remember the exact value).
So, the question is: how can I obtain the weights of each neuron? And after that, can someone explain how I can use them to produce the same output as MATLAB?

I solved the problems described above, and I think it is useful to share what I've learned.
Premises
First of all, we need some definitions. Let's consider the following image, taken from [1]:
In the above figure, IW stands for input weights: these are the weights of the neurons in Layer 1, each of which is connected to every input element, as the following image shows [1]:
All the other weights are called layer weights (LW in the first figure); each is connected to every output of the previous layer. In our case, the network has only two layers, so we need only one LW array.
Solution of the problem
After the above introduction, we can divide the issue into two steps:
Force the number of initial weights to match with the input array length
Use the weights to implement and use the neural network just trained in other programming languages
A - Force the number of initial weights to match with the input array length
Using nprtool, we can train our network and, at the end of the process, export to the workspace some information about the training. In particular, we need to export:
a MATLAB network object that represents the neural network created
the input array used to train the network
the target array used to train the network
Also, we need to generate an M-file containing the code MATLAB uses to create the neural network, because we need to modify it to change some training options.
The following image shows how to perform these operations:
The generated M-code will be similar to the following:
function net = create_pr_net(inputs,targets)
%CREATE_PR_NET Creates and trains a pattern recognition neural network.
%
% NET = CREATE_PR_NET(INPUTS,TARGETS) takes these arguments:
% INPUTS - RxQ matrix of Q R-element input samples
% TARGETS - SxQ matrix of Q S-element associated target samples, where
% each column contains a single 1, with all other elements set to 0.
% and returns these results:
% NET - The trained neural network
%
% For example, to solve the Iris dataset problem with this function:
%
% load iris_dataset
% net = create_pr_net(irisInputs,irisTargets);
% irisOutputs = sim(net,irisInputs);
%
% To reproduce the results you obtained in NPRTOOL:
%
% net = create_pr_net(trainingSetInput,trainingSetOutput);
% Create Network
numHiddenNeurons = 20; % Adjust as desired
net = newpr(inputs,targets,numHiddenNeurons);
net.divideParam.trainRatio = 75/100; % Adjust as desired
net.divideParam.valRatio = 15/100; % Adjust as desired
net.divideParam.testRatio = 10/100; % Adjust as desired
% Train and Apply Network
[net,tr] = train(net,inputs,targets);
outputs = sim(net,inputs);
% Plot
plotperf(tr)
plotconfusion(targets,outputs)
Before starting the training process, we need to remove all the preprocessing and postprocessing functions that MATLAB executes on inputs and outputs. This can be done by adding the following lines just before the % Train and Apply Network comment:
net.inputs{1}.processFcns = {};
net.outputs{2}.processFcns = {};
After these changes to the create_pr_net() function, we can simply use it to create our final neural network:
net = create_pr_net(input, target);
where input and target are the values we exported through nprtool.
In this way, we are sure that the number of weights equals the length of the input array. This also simplifies porting to other programming languages.
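As a quick sanity check (a hypothetical session, assuming the 200-input, 20-neuron network from the question), the input weight matrix should now match the input length:
>> net = create_pr_net(input, target);
>> size(net.IW{1})
ans =
20 200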
B - Implement and use the neural network just trained in other programming languages
With these changes, we can define a function like this:
function [ Results ] = classify( net, input )
y1 = tansig(net.IW{1} * input + net.b{1});
Results = tansig(net.LW{2} * y1 + net.b{2});
end
In this code, we use the IW and LW arrays mentioned above, as well as the biases b used in the network scheme by nprtool. In this context, we don't care about the role of the biases; we simply have to use them because nprtool does.
Now, we can use either the classify() function defined above or the sim() function; both produce the same results, as the following example shows:
>> sim(net, input(:, 1))
ans =
0.9759
-0.1867
-0.1891
>> classify(net, input(:, 1))
ans =
0.9759
-0.1867
-0.1891
Obviously, the classify() function can be read as pseudocode and implemented in any programming language in which it is possible to define the MATLAB tansig() function [2] and the basic operations between arrays.
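For instance, tansig() reduces to elementary operations (see [2]); here is a quick MATLAB check that the explicit formula matches the built-in, which is exactly the form you would port:
% tansig expressed with elementary operations only:
tansig_manual = @(n) 2 ./ (1 + exp(-2*n)) - 1;
n = randn(5, 1);
max(abs(tansig(n) - tansig_manual(n))) % should be 0, up to rounding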
References
[1] Howard Demuth, Mark Beale, Martin Hagan: Neural Network Toolbox 6 - User Guide, MATLAB
[2] Mathworks, tansig - Hyperbolic tangent sigmoid transfer function, MATLAB Documentation center
Additional notes
Take a look at robott's answer and Sangeun Chi's answer for more details.

Thanks to VitoShadow's and robott's answers, I can export MATLAB neural network values to other applications.
I really appreciate them, but I found some trivial errors in their code and want to correct them.
1) In VitoShadow's code:
Results = tansig(net.LW{2} * y1 + net.b{2});
-> Results = net.LW{2} * y1 + net.b{2};
2) In robott's preprocessing code, it is easier to extract xmax and xmin from the net variable than to calculate them:
xmax = net.inputs{1}.processSettings{1}.xmax
xmin = net.inputs{1}.processSettings{1}.xmin
3) In robott's postprocessing code:
xmax = net.outputs{2}.processSettings{1}.xmax
xmin = net.outputs{2}.processSettings{1}.xmin
Results = (ymax-ymin)*(Results-xmin)/(xmax-xmin) + ymin;
-> Results = (Results-ymin)*(xmax-xmin)/(ymax-ymin) + xmin;
You can manually check and confirm the values as follows:
p2 = mapminmax('apply', input(:, 1), net.inputs{1}.processSettings{1})
-> preprocessed data
y1 = purelin(net.LW{2} * tansig(net.IW{1} * p2 + net.b{1}) + net.b{2})
-> Neural Network processed data
y2 = mapminmax( 'reverse' , y1, net.outputs{2}.processSettings{1})
-> postprocessed data
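Putting the three corrections together, a corrected classify() could look as follows (a sketch, assuming the network kept its default mapminmax processing on both input and output):
function [ Results ] = classify( net, input )
% pre-process the input with the mapminmax settings used in training
ps_in = net.inputs{1}.processSettings{1};
input = (ps_in.ymax - ps_in.ymin) * (input - ps_in.xmin) ./ (ps_in.xmax - ps_in.xmin) + ps_in.ymin;
% forward pass with a linear output layer (correction 1)
y1 = tansig(net.IW{1} * input + net.b{1});
Results = net.LW{2} * y1 + net.b{2};
% reverse the output mapping (correction 3)
ps_out = net.outputs{2}.processSettings{1};
Results = (Results - ps_out.ymin) .* (ps_out.xmax - ps_out.xmin) / (ps_out.ymax - ps_out.ymin) + ps_out.xmin;
end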
Reference:
http://www.mathworks.com/matlabcentral/answers/14517-processing-of-i-p-data

This is a small improvement on Vito Gentile's great answer.
If you want to use the 'mapminmax' preprocessing and postprocessing functions, pay attention: 'mapminmax' in MATLAB normalizes by ROW, not by column!
This is what you need to add at the top of the "classify" function above, to keep the pre/post-processing consistent:
[m, n] = size(input);
ymax = 1;
ymin = -1;
% normalize each ROW of the input to [ymin, ymax], as mapminmax does
for i = 1:m
    xmax = max(input(i,:));
    xmin = min(input(i,:));
    for j = 1:n
        input(i,j) = (ymax-ymin)*(input(i,j)-xmin)/(xmax-xmin) + ymin;
    end
end
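Equivalently, the loops can be vectorized (a sketch using bsxfun, so it also runs on older MATLAB versions):
% vectorized row-wise normalization to [ymin, ymax]:
xmax = max(input, [], 2); % per-row maximum
xmin = min(input, [], 2); % per-row minimum
input = (ymax - ymin) * bsxfun(@rdivide, bsxfun(@minus, input, xmin), xmax - xmin) + ymin;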
And this goes at the end of the function:
ymax = 1;
ymin = 0;
xmax = 1;
xmin = -1;
% map the network output from [-1, 1] back to [0, 1]
Results = (ymax-ymin)*(Results-xmin)/(xmax-xmin) + ymin;
This is Matlab code, but it can be easily read as pseudocode.
Hope this will be helpful!

I implemented a simple 2-layer NN in C++ using OpenCV and then exported the weights to Android, which worked quite well. I wrote a small script that generates a header file with the learned weights, and this is used in the following code snippet.
// Map Minimum and Maximum Input Processing Function
Mat mapminmax_apply(Mat x, Mat settings_gain, Mat settings_xoffset, double settings_ymin){
Mat y;
subtract(x, settings_xoffset, y);
multiply(y, settings_gain, y);
add(y, settings_ymin, y);
return y;
/* MATLAB CODE
y = x - settings_xoffset;
y = y .* settings_gain;
y = y + settings_ymin;
*/
}
// Sigmoid Symmetric Transfer Function
Mat transig_apply(Mat n){
Mat tempexp;
exp(-2*n, tempexp);
Mat transig_apply_result = 2 /(1 + tempexp) - 1;
return transig_apply_result;
}
// Map Minimum and Maximum Output Reverse-Processing Function
Mat mapminmax_reverse(Mat y, Mat settings_gain, Mat settings_xoffset, double settings_ymin){
Mat x;
subtract(y, settings_ymin, x);
divide(x, settings_gain, x);
add(x, settings_xoffset, x);
return x;
/* MATLAB CODE
function x = mapminmax_reverse(y,settings_gain,settings_xoffset,settings_ymin)
x = y - settings_ymin;
x = x ./ settings_gain;
x = x + settings_xoffset;
end
*/
}
Mat getNNParameter (Mat x1)
{
// convert double array to MAT
// input 1
Mat x1_step1_xoffsetM = Mat(1, 48, CV_64FC1, x1_step1_xoffset).t();
Mat x1_step1_gainM = Mat(1, 48, CV_64FC1, x1_step1_gain).t();
double x1_step1_ymin = -1;
// Layer 1
Mat b1M = Mat(1, 25, CV_64FC1, b1).t();
Mat IW1_1M = Mat(48, 25, CV_64FC1, IW1_1).t();
// Layer 2
Mat b2M = Mat(1, 48, CV_64FC1, b2).t();
Mat LW2_1M = Mat(25, 48, CV_64FC1, LW2_1).t();
// input 1
Mat y1_step1_gainM = Mat(1, 48, CV_64FC1, y1_step1_gain).t();
Mat y1_step1_xoffsetM = Mat(1, 48, CV_64FC1, y1_step1_xoffset).t();
double y1_step1_ymin = -1;
// ===== SIMULATION ========
// Input 1
Mat xp1 = mapminmax_apply(x1, x1_step1_gainM, x1_step1_xoffsetM, x1_step1_ymin);
Mat temp = b1M + IW1_1M*xp1;
// Layer 1
Mat a1M = transig_apply(temp);
// Layer 2
Mat a2M = b2M + LW2_1M*a1M;
// Output 1
Mat y1M = mapminmax_reverse(a2M, y1_step1_gainM, y1_step1_xoffsetM, y1_step1_ymin);
return y1M;
}
An example of a bias in the header could look like this:
static double b2[1][48] = {
{-0.19879, 0.78254, -0.87674, -0.5827, -0.017464, 0.13143, -0.74361, 0.4645, 0.25262, 0.54249, -0.22292, -0.35605, -0.42747, 0.044744, -0.14827, -0.27354, 0.77793, -0.4511, 0.059346, 0.29589, -0.65137, -0.51788, 0.38366, -0.030243, -0.57632, 0.76785, -0.36374, 0.19446, 0.10383, -0.57989, -0.82931, 0.15301, -0.89212, -0.17296, -0.16356, 0.18946, -1.0032, 0.48846, -0.78148, 0.66608, 0.14946, 0.1972, -0.93501, 0.42523, -0.37773, -0.068266, -0.27003, 0.1196}};
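The generator script itself is not shown; a minimal MATLAB sketch of the idea (write_c_matrix is a hypothetical helper) could be:
function write_c_matrix(fid, name, M)
% write matrix M into an open header file as a static C array named `name`
fprintf(fid, 'static double %s[%d][%d] = {\n', name, size(M,1), size(M,2));
for r = 1:size(M,1)
    fprintf(fid, '{');
    fprintf(fid, '%g, ', M(r,1:end-1));
    fprintf(fid, '%g}', M(r,end));
    if r < size(M,1)
        fprintf(fid, ',\n');
    else
        fprintf(fid, '};\n');
    end
end
end
For example, write_c_matrix(fid, 'b2', net.b{2}') would emit an array like the one above.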
Now that Google has published TensorFlow, this has become obsolete.


Related

How to interpret the regression plot obtained at the end of neural network regression for multiple outputs?

I have trained my neural network model using the MATLAB NN Toolbox. My network has multiple inputs and multiple outputs, 6 and 7 respectively, to be precise. I would like to clarify a few questions based on it:
The final regression plot shown at the end of training indicates very good accuracy, R~0.99. However, since I have multiple outputs, I am confused as to which scatter plot it represents. Shouldn't we have 7 target-vs-predicted plots, one for each output variable?
To my knowledge, R^2 is a better measure of model accuracy, whereas MATLAB reports R in its plot. Should I treat the reported R as R^2, or should I square it to obtain R^2?
I have generated the MATLAB script containing the weights, biases, and activation functions as a final result of the training. So shouldn't I be able to simply give my raw data as input and obtain the corresponding predicted output? I fed in the exact same training set, using the indices MATLAB chose for training (to cross-check), and plotted the predicted output vs the actual output, but the result is not good at all; definitely not along the lines of R~0.99. Am I doing anything wrong?
code:
function [y1] = myNeuralNetworkFunction_2(x1)
%MYNEURALNETWORKFUNCTION neural network simulation function.
% X = [torque T_exh lambda t_Spark N EGR];
% Y = [O2R CO2R HC NOX CO lambda_out T_exh2];
% Generated by Neural Network Toolbox function genFunction, 17-Dec-2018 07:13:04.
%
% [y1] = myNeuralNetworkFunction(x1) takes these arguments:
% x = Qx6 matrix, input #1
% and returns:
% y = Qx7 matrix, output #1
% where Q is the number of samples.
%#ok<*RPMT0>
% ===== NEURAL NETWORK CONSTANTS =====
% Input 1
x1_step1_xoffset = [-24;235.248;0.75;-20.678;550;0.799];
x1_step1_gain = [0.00353982300884956;0.00284355877067267;6.26959247648903;0.0275865874012055;0.000366568914956012;0.0533831576137729];
x1_step1_ymin = -1;
% Layer 1
b1 = [1.3808996210168685;-2.0990163849711894;0.9651733083552595;0.27000953282929346;-1.6781835509820286;-1.5110463684800366;-3.6257438832309905;2.1569498669085361;1.9204156230460485;-0.17704342477904209];
IW1_1 = [-0.032892214008082517 -0.55848270745152429 -0.0063993424771670616 -0.56161004933654057 2.7161844536020197 0.46415317073346513;-0.21395624254052176 -3.1570133640176681 0.71972178875396853 -1.9132557838515238 1.3365248285282931 -3.022721627052706;-1.1026780445896862 0.2324603066452392 0.14552308208231421 0.79194435276493658 -0.66254679969168417 0.070353201192052434;-0.017994515838487352 -0.097682677816992206 0.68844109281256027 -0.001684535122025588 0.013605622123872989 0.05810686279306107;0.5853667840629273 -2.9560683084876329 0.56713425120259764 -2.1854386350040116 1.2930115031659106 -2.7133159265497957;0.64316656469750333 -0.63667017646313084 0.50060179040086761 -0.86827897068177973 2.695456517458648 0.16822164719859456;-0.44666821007466739 4.0993786464616679 -0.89370838440321498 3.0445073606237933 -3.3015566360833453 -4.492874075961689;1.8337574137485424 2.6946232855369989 1.1140472073136622 1.6167763205944321 1.8573696127039145 -0.81922672766933646;-0.12561950922781362 3.0711045035224349 -0.6535751823440773 2.0590707752473199 -1.3267693770634292 2.8782780742777794;-0.013438026967107483 -0.025741311825949621 0.45460734966889638 0.045052447491038108 -0.21794568374100454 0.10667240367191703];
% Layer 2
b2 = [-0.96846557414356171;-0.2454718918618051;-0.7331628718025488;-1.0225195290982099;0.50307202195645395;-0.49497234988401961;-0.21817117469133171];
LW2_1 = [-0.97716474643411022 -0.23883775971686808 0.99238069915206006 0.4147649511973347 0.48504023209224734 -0.071372217431684551 0.054177719330469304 -0.25963474838320832 0.27368380212104881 0.063159321947246799;-0.15570858147605909 -0.18816739764334323 -0.3793600124951475 2.3851961990944681 0.38355142531334563 -0.75308427071748985 -0.1280128732536128 -1.361052031781103 0.6021878865831336 -0.24725687748503239;0.076251356114485525 -0.10178293627600112 0.10151304376762409 -0.46453434441403058 0.12114876632815359 0.062856969143306296 -0.0019628163322658364 -0.067809039768745916 0.071731544062023825 0.65700427778446913;0.17887084584125315 0.29122649575978238 0.37255802759192702 1.3684190468992126 0.60936238465090853 0.21955911453674043 0.28477957899364675 -0.051456306721251184 0.6519451272106177 -0.64479205028051967;0.25743349663436799 2.0668075180209979 0.59610776847961111 -3.2609682919282603 1.8824214917530881 0.33542869933904396 0.03604272669356564 -0.013842766338427388 3.8534510207741826 2.2266745660915586;-0.16136175574939746 0.10407287099228898 -0.13902245286490234 0.87616472446622717 -0.027079111747601223 0.024812287505204988 -0.030101536834009103 0.043168268669541855 0.12172932035587079 -0.27074383434206573;0.18714562505165402 0.35267726325386606 -0.029241400610813449 0.53053853235049087 0.58880054832728757 0.047959541165126809 0.16152268183097709 0.23419456403348898 0.83166785128608967 -0.66765237856750781];
% Output 1
y1_step1_ymin = -1;
y1_step1_gain = [0.114200879346771;0.145581598485951;0.000139011547272197;0.000456244862967996;2.05816254143146e-05;5.27704485488127;0.00284355877067267];
y1_step1_xoffset = [-0.045;1.122;2.706;17.108;493.726;0.75;235.248];
% ===== SIMULATION ========
% Dimensions
Q = size(x1,1); % samples
% Input 1
x1 = x1';
xp1 = mapminmax_apply(x1,x1_step1_gain,x1_step1_xoffset,x1_step1_ymin);
% Layer 1
a1 = tansig_apply(repmat(b1,1,Q) + IW1_1*xp1);
% Layer 2
a2 = repmat(b2,1,Q) + LW2_1*a1;
% Output 1
y1 = mapminmax_reverse(a2,y1_step1_gain,y1_step1_xoffset,y1_step1_ymin);
y1 = y1';
end
% ===== MODULE FUNCTIONS ========
% Map Minimum and Maximum Input Processing Function
function y = mapminmax_apply(x,settings_gain,settings_xoffset,settings_ymin)
y = bsxfun(@minus,x,settings_xoffset);
y = bsxfun(@times,y,settings_gain);
y = bsxfun(@plus,y,settings_ymin);
end
% Sigmoid Symmetric Transfer Function
function a = tansig_apply(n)
a = 2 ./ (1 + exp(-2*n)) - 1;
end
% Map Minimum and Maximum Output Reverse-Processing Function
function x = mapminmax_reverse(y,settings_gain,settings_xoffset,settings_ymin)
x = bsxfun(@minus,y,settings_ymin);
x = bsxfun(@rdivide,x,settings_gain);
x = bsxfun(@plus,x,settings_xoffset);
end
The above is the automatically generated code. The plot I generated to cross-check the first variable is below:
% X and Y are input and output - same as above
X_train = X(results.info1.train.indices,:);
y_train = Y(results.info1.train.indices,:);
out_train = myNeuralNetworkFunction_2(X_train);
scatter(y_train(:,1),out_train(:,1))
To answer your question about R: Yes, you should square R to get the R^2 value. In this case, they will be very close since R is very close to 1.
The graphs give the correlation between the estimated and real (target) values, so R is the strength of the correlation. You can square it to find the R-squared value.
The graph you drew and the one MATLAB gave are not graphs of the same variables; the ranges and scales of the axes are very different.
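For example, R can be computed and squared directly from the variables in the plotting snippet above (a sketch, assuming the toolbox's regression() function):
% R for output 1, then R^2:
r = regression(y_train(:,1)', out_train(:,1)');
r2 = r^2;
fprintf('R = %.4f, R^2 = %.4f\n', r, r2);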
First of all, is the problem you are trying to solve a regression problem? Or is it a classification problem with 7 classes converted to numeric? I assume this is a classification problem, as you are trying to get the success rate for each class.
As for your first question: according to the literature, it is recommended to use the "All: R" value. If you want the success rate for each of your classes, you need the metrics that are valid for classification problems: precision, recall, F-measure, FP rate, TP rate, etc. There is plenty of MATLAB documentation on this (help roc), and you can look at the details there. All of the values I mentioned, and which I think you actually want, are obtained from the confusion matrix.
There is a good example of this.
[x,t] = simpleclass_dataset;
net = patternnet(10);
net = train(net,x,t);
y = net(x);
[c,cm,ind,per] = confusion(t,y)
I hope you will see what you want from the "nntraintool" window that appears when you run the code.
Your other questions have already been answered. Alternatively, you can consider using a machine learning algorithm with open source software such as Weka.

Issues in fitting data to linear model

Assume a noiseless AR(1) process y(t) = a*y(t-1). I have the following conceptual questions and would be glad for clarification.
Q1 - Discrepancy between mathematical formulation and implementation - The mathematical formulation of an AR model is y(t) = -sum_{i=1..p}[a_i*y(t-i)] + eta(t), where p is the model order and eta(t) is white Gaussian noise. But when estimating the coefficients using any method like arburg() or least squares, we simply call that function; I do not know whether white Gaussian noise is implicitly added. Then, when we re-solve the AR equation with the estimated coefficients, I have seen that neither the negative sign is considered nor the noise term added.
What is the correct representation of the AR model, and how do I find the average coefficients over k trials when I have only a single sample of 1000 data points?
Q2 - Coding problem: how to simulate fitted_data for k trials and then find the residuals - I fitted data "data" generated from an unknown system and obtained the coefficient by
load('data.txt');
for trials = 1:10
model = ar(data,1,'ls');
original_data=data;
fitted_data(i)=coeff1*data(i-1); % **OR**
data(i)=coeff1*data(i-1);
fitted_data=data;
residual= original_data - fitted_data;
plot(original_data,'r'); hold on; plot(fitted_data);
end
When calculating the residual, is fitted_data obtained as above, by re-solving the AR equation with the obtained coefficients? MATLAB has a function for doing this, but I wanted to write my own. So, after finding the coefficients from the original data, how do I re-solve? The code above is incorrect. Attached is the plot of the original data and the fitted_data.
If your model is simply y(n) = a*y(n-1) with scalar a, then here is the solution:
y = randn(10, 1);
a = y(1 : end - 1) \ y(2 : end);
y_estim = y * a;
residual = y - y_estim;
Of course, you should separate the data into train and test sets, and apply a to the test data. You can generalize this approach to y(n) = a*y(n-1) + b*y(n-2), etc.
Note that \ represents the mldivide() function.
Edit:
% model: y[n] = c + a*y(n-1) + b*y(n-2) +...+z*y(n-n_order)
n_order = 3;
allow_offset = true; % allows c in the model
% train
y_train = randn(20,1); % from your data
[y_in, y_out] = shifted_input(y_train, n_order, allow_offset);
a = y_in \ y_out;
% now test
y_test = randn(20,1); % from your data
[y_in, y_out] = shifted_input(y_test, n_order, allow_offset);
y_estim = y_in * a; % same a
residual = y_out - y_estim;
here is shifted_input():
function [y_in, y_out] = shifted_input(y, n_order, allow_offset)
y_out = y(n_order + 1 : end);
n_rows = size(y, 1) - n_order;
y_in = nan(n_rows, n_order);
for k = 1 : n_order
y_in(:, k) = y(1 : n_rows);
y = circshift(y, -1);
end
if allow_offset
y_in = [y_in, ones(n_rows, 1)];
end
end
AR-type models can serve a number of purposes, including linear prediction, linear predictive coding, and noise filtering. The eta(t) terms are not something we are interested in retaining; rather, part of the point of these algorithms is to remove their influence as far as possible by looking for persistent patterns in the data.
I have textbooks that, in the context of linear prediction, do not include the negative sign in front of the sum in your expression. On the other hand, MATLAB's function lpc does:
Xp(n) = -A(2)*X(n-1) - A(3)*X(n-2) - ... - A(N+1)*X(n-N)
I recommend you look at function lpc if you haven't already, and at the examples from the documentation such as the following:
randn('state',0);
noise = randn(50000,1); % Normalized white Gaussian noise
x = filter(1,[1 1/2 1/3 1/4],noise);
x = x(45904:50000);
% Compute the predictor coefficients, estimated signal, prediction error, and autocorrelation sequence of the prediction error:
p = lpc(x,3);
est_x = filter([0 -p(2:end)],1,x); % Estimated signal
e = x - est_x; % Prediction error
[acs,lags] = xcorr(e,'coeff'); % ACS of prediction error
The estimated x is computed as est_x. Note how the example uses filter. Quoting the MATLAB documentation again, filter(b,a,x) is a "Direct Form II Transposed" implementation of the standard difference equation:
a(1)*y(n) = b(1)*x(n) + b(2)*x(n-1) + ... + b(nb+1)*x(n-nb)
- a(2)*y(n-1) - ... - a(na+1)*y(n-na)
which means that in the prior example est_x(n) is computed as
est_x(n) = -p(2)*x(n-1) -p(3)*x(n-2) -p(4)*x(n-3)
which is what you expect!
Edit:
As regards the function ar, the MATLAB documentation explains that the output coefficients have the same meaning as in the linear-prediction scenario discussed above.
The right way to evaluate the output of the AR model is to compute
data_armod(i)= -coeff(2)*data(i-1) -coeff(3)*data(i-2) -coeff(4)*data(i-3)
where coeff is the coefficient matrix returned with
model = ar(data,3,'ls');
coeff = model.a;
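Putting this together for your setup (a sketch, assuming data is a column vector and model order 3; the first three samples have no prediction):
model = ar(data, 3, 'ls');
coeff = model.a; % coeff = [1 a2 a3 a4]
fitted = zeros(size(data));
for i = 4:length(data)
    fitted(i) = -coeff(2)*data(i-1) - coeff(3)*data(i-2) - coeff(4)*data(i-3);
end
residual = data - fitted;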

Can I export my (Matlab-based) neural network to PHP?

I have trained a neural network in MATLAB (using the Neural Network Toolbox). Now I would like to export the calculated weights and biases to another platform (PHP) in order to make calculations with them. Is there a way to create a function or equation to do this?
I found this related question: Equation that compute a Neural Network in Matlab.
Is there a way to do what I want and port the results of my NN (29 inputs, 10 hidden layers, 1 output) to PHP?
Yes, the net properties also referenced in the other question are simple matrices:
W1=net.IW{1,1};
W2=net.LW{2,1};
b1=net.b{1,1};
b2=net.b{2,1};
So you can write them to a file, say, as comma-separated-values.
csvwrite('W1.csv',W1)
Then, in PHP read this data and convert or use it as you like.
<?php
// read every row of W1.csv into $data
$data = array();
if (($handle = fopen("W1.csv", "r")) !== FALSE) {
    while (($row = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $data[] = $row;
    }
    fclose($handle);
}
?>
Then, to process the weights, you can use the formula from the other question, replacing the tansig function, which is calculated according to:
tansig(n) = 2/(1+exp(-2*n)) - 1
This is mathematically equivalent to tanh(n), which exists in PHP as well.
source: http://dali.feld.cvut.cz/ucebna/matlab/toolbox/nnet/tansig.html
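The equivalence is easy to confirm numerically in MATLAB (a quick check):
n = randn(5, 1);
max(abs(tansig(n) - tanh(n))) % differs only by floating-point rounding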
Transferring all of this is pretty trivial. You will need to:
Write the code for matrix multiplication, which is a pretty simple couple of for loops.
Observe that, according to the MATLAB documentation, tansig(n) = 2/(1+exp(-2*n))-1. I'm pretty sure that PHP has exp (and if not, it has a pretty simple polynomial expansion you can write yourself).
Read, understand and apply Jasper van den Bosch's excellent answer to your question.
Hence the solution becomes (after correcting all the wrong parts):
Here I am giving a solution in MATLAB, but if you have a tanh() function, you can easily convert it to any programming language. For PHP, the tanh() function exists: php tanh(). This is just to show the fields of the network object and the operations you need.
Assume you have a trained ANN (network object) that you want to export, and that its name is trained_ann.
Here is the script for exporting and testing. The testing script compares the original network's result with the my_ann_evaluation() result:
% Export IT
exported_ann_structure = my_ann_exporter(trained_ann);
% Run and Compare
% Works only for single INPUT vector
% Please extend it to MATRIX version by yourself
input = [12 3 5 100];
res1 = trained_ann(input')';
res2 = my_ann_evaluation(exported_ann_structure, input')';
where you need the following two functions
First my_ann_exporter:
function [ my_ann_structure ] = my_ann_exporter(trained_netw)
% Just for extracting as Structure object
my_ann_structure.input_ymax = trained_netw.inputs{1}.processSettings{1}.ymax;
my_ann_structure.input_ymin = trained_netw.inputs{1}.processSettings{1}.ymin;
my_ann_structure.input_xmax = trained_netw.inputs{1}.processSettings{1}.xmax;
my_ann_structure.input_xmin = trained_netw.inputs{1}.processSettings{1}.xmin;
my_ann_structure.IW = trained_netw.IW{1};
my_ann_structure.b1 = trained_netw.b{1};
my_ann_structure.LW = trained_netw.LW{2};
my_ann_structure.b2 = trained_netw.b{2};
my_ann_structure.output_ymax = trained_netw.outputs{2}.processSettings{1}.ymax;
my_ann_structure.output_ymin = trained_netw.outputs{2}.processSettings{1}.ymin;
my_ann_structure.output_xmax = trained_netw.outputs{2}.processSettings{1}.xmax;
my_ann_structure.output_xmin = trained_netw.outputs{2}.processSettings{1}.xmin;
end
Second my_ann_evaluation:
function [ res ] = my_ann_evaluation(my_ann_structure, input)
% Works with only single INPUT vector
% Matrix version can be implemented
ymax = my_ann_structure.input_ymax;
ymin = my_ann_structure.input_ymin;
xmax = my_ann_structure.input_xmax;
xmin = my_ann_structure.input_xmin;
input_preprocessed = (ymax-ymin) * (input-xmin) ./ (xmax-xmin) + ymin;
% Pass it through the ANN matrix multiplication
y1 = tanh(my_ann_structure.IW * input_preprocessed + my_ann_structure.b1);
y2 = my_ann_structure.LW * y1 + my_ann_structure.b2;
ymax = my_ann_structure.output_ymax;
ymin = my_ann_structure.output_ymin;
xmax = my_ann_structure.output_xmax;
xmin = my_ann_structure.output_xmin;
res = (y2-ymin) .* (xmax-xmin) /(ymax-ymin) + xmin;
end
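With both functions on the path, res1 and res2 from the test script above should agree to rounding error, which you can confirm with:
max(abs(res1 - res2)) % expect something on the order of 1e-12 or smaller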

Creating and training a perceptron in MATLAB for gender classification

I am coding a perceptron to learn to categorize gender in pictures of faces. I am very new to MATLAB, so I need a lot of help. I have a few questions:
I am trying to code for a function:
function [y] = testset(x,w)
%y = sign(sigma(x*w-threshold))
where y is the predicted result, x is the training/testing set put in as a very large matrix, and w is the weight in the equation. The part after the % is what I am trying to write, but I do not know how to express it in MATLAB code. Any ideas out there?
I am trying to code a second function:
function [err] = testerror(x,w,y)
%err = sigma(max(0,-w*x*y))
w, x, and y have the same meanings as above, and err is my error function, which I am trying to minimize through the steps of the perceptron.
I am trying to create a step in my perceptron to lower the percent of error by using gradient descent on my original equation. Does anyone know how I can increment w using gradient descent in order to minimize the error function using an if then statement?
I can put up the code I have up till now if that would help you answer these questions.
Thank you!
edit--------------------------
OK, so I am still working on the code for this, and would like to put it up when I have something more complete. My biggest question right now is:
I have the following function:
function [y] = testset(x,w)
y = sign(sum(x*w-threshold))
Now I know that I am supposed to put a threshold in, but cannot figure out what I am supposed to put in as the threshold! Any ideas out there?
edit----------------------------
This is what I have so far. Changes still need to be made, but I would appreciate input, especially regarding structure, and advice on the changes that need to be made!
function [y] = Perceptron_Aviva(X,w)
y = sign(sum(X*w-1));
end
function [err] = testerror(X,w,y)
err = sum(max(0,-w*X*y));
end
%function [w] = perceptron(X,Y,w_init)
%w = w_init;
%end
%------------------------------
% input samples
X = X_train;
% output class [-1,+1];
Y = y_train;
% init weigth vector
w_init = zeros(size(X,1));
w = w_init;
%---------------------------------------------
loopcounter = 0
while abs(err) > 0.1 && loopcounter < 100
for j=1:size(X,1)
approx_y(j) = Perceptron_Aviva(X(j),w(j))
err = testerror(X(j),w(j),approx_y(j))
if err > 0 %wrong (structure is correct, test is wrong)
w(j) = w(j) - 0.1 %wrong
elseif err < 0 %wrong
w(j) = w(j) + 0.1 %wrong
end
% -----------
% if sign(w'*X(:,j)) ~= Y(j) %wrong decision?
% w = w + X(:,j) * Y(j); %then add (or subtract) this point to w
end
You can read this question I asked some time ago. It uses MATLAB code and a function perceptron:
function [w] = perceptron(X,Y,w_init)
w = w_init;
for iteration = 1 : 100 %<- in practice, use some stopping criterion!
for ii = 1 : size(X,2) %cycle through training set
if sign(w'*X(:,ii)) ~= Y(ii) %wrong decision?
w = w + X(:,ii) * Y(ii); %then add (or subtract) this point to w
end
end
sum(sign(w'*X)~=Y)/size(X,2) %show misclassification rate
end
and it is called from code like this (from @Itamar Katz, with random data):
% input samples
X1=[rand(1,100);rand(1,100);ones(1,100)]; % class '+1'
X2=[rand(1,100);1+rand(1,100);ones(1,100)]; % class '-1'
X=[X1,X2];
% output class [-1,+1];
Y=[-ones(1,100),ones(1,100)];
% init weigth vector
w=[.5 .5 .5]';
% call perceptron
wtag=perceptron(X,Y,w);
% predict
ytag=wtag'*X;
% plot prediction over original data
figure;hold on
plot(X1(1,:),X1(2,:),'b.')
plot(X2(1,:),X2(2,:),'r.')
plot(X(1,ytag<0),X(2,ytag<0),'bo')
plot(X(1,ytag>0),X(2,ytag>0),'ro')
legend('class -1','class +1','pred -1','pred +1')
I guess this can give you an idea of how to write the functions you described. For the error, compare the expected result with the real result (the class):
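For example, one error measure is the misclassification rate already printed inside perceptron(), computed from the variables above:
% fraction of misclassified points, reusing wtag, X, and Y:
err = sum(sign(wtag'*X) ~= Y) / size(X, 2);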
Assume your dataset is X, the data points, and Y, the labels of the classes.
f=newp(X,Y)
creates a perceptron.
If you want to create an MLP then:
f=newff(X,Y,NN)
where NN is the network architecture, i.e. an array that designates the number of neurons at each hidden layer. For example
NN=[5 3 2]
will correspond to a network with 5 neurons in the first hidden layer, 3 in the second, and 2 in the third.
What you call the threshold is the bias in machine learning nomenclature. It should be left as an input for the user because it is used during training.
Also, I wonder why you are not using the built-in MATLAB functions, i.e. newp or newff, e.g.:
ff=newp(X,Y)
Then you can set the properties of the object ff to select gradient descent and so on.
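For example (a sketch with an MLP; 'traingd' selects plain gradient descent, and the learning rate value is only illustrative):
ff = newff(X, Y, 5); % MLP with one hidden layer of 5 neurons
ff.trainFcn = 'traingd'; % plain gradient descent
ff.trainParam.lr = 0.05; % learning rate (illustrative value)
ff = train(ff, X, Y);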

How to set output size in Matlab newff method

Summary:
I'm trying to do classification of some images depending on the angles between body parts.
I assume the human body consists of 10 parts (as rectangles), find the center of each part, and calculate the angle of each part with reference to the torso.
I have three action categories: handwave, walking, and running.
My goal is to find which test images fall into which action category.
Facts:
TrainSet: a 1057x10 feature set; 1057 is the number of images.
TestSet: 821x10.
I want my output to be a 3x1 matrix, each row showing the classification percentage for an action category:
row1: Handwave
row2: Walking
row3: Running
Code:
actionCat='H';
[train_data_hw train_label_hw] = tugrul_traindata(TrainData,actionCat);
[test_data_hw test_label_hw] = tugrul_testdata(TestData,actionCat);
actionCat='W';
[train_data_w train_label_w] = tugrul_traindata(TrainData,actionCat);
[test_data_w test_label_w] = tugrul_testdata(TestData,actionCat);
actionCat='R';
[train_data_r train_label_r] = tugrul_traindata(TrainData,actionCat);
[test_data_r test_label_r] = tugrul_testdata(TestData,actionCat);
Train=[train_data_hw;train_data_w;train_data_r];
Test=[test_data_hw;test_data_w;test_data_r];
Target=eye(3,1);
net=newff(minmax(Train),[10 3],{'logsig' 'logsig'},'trainscg');
net.trainParam.perf='sse';
net.trainParam.epochs=50;
net.trainParam.goal=1e-5;
net=train(net,Train);
trainSize=size(Train,1);
testSize=size(Test,1);
if(trainSize > testSize)
pend=-1*ones(trainSize-testSize,size(Test,2));
Test=[Test;pend];
end
x=sim(net,Test);
Question:
I'm using the MATLAB newff method, but my output is always an Nx10 matrix, not 3x1.
My input set should be grouped into 3 classes, but it is grouped into 10 classes.
Thanks
%% Load data : I generated some random data instead
Train = rand(1057,10);
Test = rand(821,10);
TrainLabels = randi([1 3], [1057 1]);
TestLabels = randi([1 3], [821 1]);
trainSize = size(Train,1);
testSize = size(Test,1);
%% prepare the input/output vectors (1-of-N output encoding)
input = Train'; % matrix of size numFeatures-by-numImages
output = zeros(3,trainSize); % matrix of size numCategories-by-numImages
for i=1:trainSize
output(TrainLabels(i), i) = 1;
end
%% create net: one hidden layer with 10 nodes (output layer size is inferred: 3)
net = newff(input, output, 10, {'logsig' 'logsig'}, 'trainscg');
net.trainParam.perf = 'sse';
net.trainParam.epochs = 50;
net.trainParam.goal = 1e-5;
view(net)
%% training
net = init(net); % initialize
[net,tr] = train(net, input, output); % train
%% performance (on Training data)
y = sim(net, input); % predict
%[err cm ind per] = confusion(output, y);
[maxVals predicted] = max(y); % predicted
cm = confusionmat(predicted, TrainLabels);
acc = sum(diag(cm))/sum(cm(:));
fprintf('Accuracy = %.2f%%\n', 100*acc);
fprintf('Confusion Matrix:\n');
disp(cm)
%% Testing (on Test data)
y = sim(net, Test');
Note how I converted the category label of each instance (1/2/3) to a 1-of-N encoding vector ([1 0 0] for 1, [0 1 0] for 2, [0 0 1] for 3).
Also note that the test set is currently not being used, since by default the input data is divided into train/test/validation. You could achieve your manual division by setting net.divideFcn to the divideind function, and setting the corresponding net.divideParam parameters.
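For example (a sketch; this routes every column of input to training and leaves validation/test empty):
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:trainSize;
net.divideParam.valInd = [];
net.divideParam.testInd = [];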
I showed the testing on the same training data, but you could do the same for the Test data.