I'm having a problem setting up a neural network for one-class classification. Basically, I only have features that represent the background of an image, so the training phase trains the NN on those features alone. During the execution phase the NN receives features that could be "background" or "foreground" (the previous step is segmentation, which I've already done). NOTE: I can't train the NN on "foreground" because I don't know whether the segmentation process picks up only foreground objects. How am I supposed to set up my NN correctly?
Here is a piece of the code:
toTrainFeat = computeFeatures(backBboxes, frame);
classes(1:size(backBboxes,1)) = 1; % only one class available
[net, Y, E] = adapt(net, toTrainFeat, classes); % incremental learning
if numFrame >= 40 || sse(E) < 0.01 % classify only after 40 frames OR if the NN is smart enough
    y = net(toClassifyFeat);
    y
end
This code does not work, I think because I'm passing only ONE class to the adapt method (in fact, it crashes when adapt is called). Any help?
Thanks a lot.
I'm currently working through the SciML tutorials workshop exercises for the Julia language (https://tutorials.sciml.ai/html/exercises/01-workshop_exercises.html). Specifically, I'm stuck on exercise 6 part 3, which involves training a neural network to approximate the system of equations
function lotka_volterra(du,u,p,t)
    x, y = u
    α, β, δ, γ = p
    du[1] = dx = α*x - β*x*y
    du[2] = dy = -δ*y + γ*x*y
end
The goal is to replace the equation for du[2] with a neural network: du[2] = NN(u, p)
where NN is a neural net with parameters p and inputs u.
I have a set of sample data that the network should try to match. The loss function is the squared difference between the network model's output and that sample data.
I defined my network with
NN = Chain(Dense(2,30), Dense(30, 1))
I can get Flux.train! to run, but the problem is that sometimes the initial parameters for the neural network result in a loss on the order of 10^20, so training never converges. My best attempt got the loss down from about 2000 to about 20 using the ADAM optimizer over about 1000 iterations, but I can't seem to do any better.
How can I make sure my network is consistently trainable, and is there a way to get better convergence?
See the FAQ page on techniques for improving convergence. In a nutshell, the single shooting approach of most ML papers is very unstable and does not work on most practical problems, but there is a litany of techniques to help out. One of the best is multiple shooting, which optimizes only short bursts (in parallel) along the time series.
But training on a small interval and then growing the interval also works, and using more stable optimizers (BFGS) can help too. You can also weight the loss function so that earlier time points count more. Lastly, you can minibatch in a way similar to multiple shooting, i.e. start from a data point and only solve to the next one (in fact, if you actually look at the original neural ODE paper's NumPy code, they do not do the algorithm as explained but instead use this form of sampling to stabilize the spiral ODE training).
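To make the multiple shooting idea concrete, here is the shape of its loss written out (the segment boundaries t_i and the continuity weight λ below are illustrative, not taken from the FAQ). The trajectory is re-initialized at each data point, only short segments are solved, and a penalty glues the segments back together:

L(θ) = Σ_i Σ_{t_i ≤ t_j ≤ t_{i+1}} || x̂(t_j; x(t_i), θ) − x(t_j) ||² + λ Σ_i || x̂(t_{i+1}; x(t_i), θ) − x(t_{i+1}) ||²

where x̂(t; x(t_i), θ) denotes the ODE solution on segment i, started from the data point x(t_i) with parameters θ. Because each solve spans only a short interval, the gradients stay well-behaved even when a single long solve would blow up.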
I am using a multi-layer perceptron for fitting a model to data given input-output pairs, following the tutorial https://www.mathworks.com/help/deeplearning/gs/fit-data-with-a-neural-network.html.
Confusion 1) I am having a tough time understanding where the test set created by the command net.divideParam.testRatio is used. In general, we split the dataset into a training set, a validation set and an unseen test set that is used for performance evaluation and for reporting the confusion matrix. This approach is usually taken for classification tasks. But for regression and model fitting problems, e.g. using an NN, should we not also explicitly have a test set that is unseen during training? Is the command net.divideParam.testRatio creating that unseen test set, and is it then never used for testing the network? The program code below uses all of the inputs in the testing. It is unclear whether, after training, I should use an unseen dataset for testing and then report the performance, or not.
% Create a Fitting Network
hiddenLayerSize = 10;
net = fitnet(hiddenLayerSize);
inputs = houseInputs;
targets = houseTargets;
% Set up Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Train the Network
[net,tr] = train(net,inputs,targets);
% Test the Network
outputs = net(inputs);
errors = gsubtract(outputs,targets);
performance = perform(net,targets,outputs)
Confusion 2) When using the regression model mvregress, do we follow the same approach as in the answer to confusion 1)?
Please help. I am unable to find the correct practices and approach for these initial steps, and I believe that their proper use makes a great impact on the result.
I can help you mostly with confusion 1).
When you train a neural network, you are separating the dataset into 3 sets:
Training set, used to train the network (the only set that actually updates the network weights);
Validation set, used to stop the training (this is the Validation checks parameter in the GUI);
Test set, which is used for the performance plots and a first estimate of the overall performance of the fitter.
Therefore, of these 3, only the training set is seen by the network and influences the weight updates, while the validation set allows training to stop when the network starts overfitting the training data (i.e. an improvement in the fit to the training data no longer improves the fit to the validation data). Finally, the test set is useful as a first check of the fitter's performance.
If you check the value of net.divideParam, you can see that the network stores the percentage of values for each set; during training, the inputs and targets are randomly divided according to these three ratios, and this split is what the toolbox uses when it plots the performance of the network. You can avoid the random split by setting net.divideFcn to 'divideind'. This is mostly useful when you know your dataset well.
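For instance, a minimal sketch of a fixed split with 'divideind' (the index ranges below are made up for illustration, assuming 100 samples):

net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:70;    % first 70 samples update the weights
net.divideParam.valInd   = 71:85;   % next 15 samples control early stopping
net.divideParam.testInd  = 86:100;  % last 15 samples stay unseen during training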
When you train the network using
[net,tr] = train(net,inputs,targets);
tr stores the results of the training, including the indices of the training (tr.trainInd), validation (tr.valInd) and test (tr.testInd) sets. To retrieve each of the sets, you can index the inputs with those indices, while other quantities, such as the accuracy or the performance of the network, can be retrieved through tr.
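As a minimal sketch (assuming net, tr, inputs and targets from the code above), this is how you would evaluate the network on the held-out test set only:

% Recover the unseen test set from the training record and score it
testInputs  = inputs(:, tr.testInd);
testTargets = targets(:, tr.testInd);
testOutputs = net(testInputs);
testPerformance = perform(net, testTargets, testOutputs)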
Regarding confusion 2, I think that the regression model mvregress works with a different approach: it just estimates the parameters of the fit without splitting the dataset into three slices. It is up to you to evaluate the regression by holding some points out of the inputs.
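A minimal hold-out sketch for that case (the variable names are made up; this assumes a single response column, for which mvregress accepts a plain n-by-p design matrix X):

n      = size(X, 1);
idx    = randperm(n);                    % shuffle the rows
itrain = idx(1:round(0.8*n));            % 80% used to fit the coefficients
itest  = idx(round(0.8*n)+1:end);        % 20% held out for evaluation
beta   = mvregress(X(itrain,:), Y(itrain));
resid  = Y(itest) - X(itest,:) * beta;   % errors on the unseen rows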
I am working on a regression problem with the following sample training data.
As shown, I have an input of only 4 parameters, and only one of them (Z) changes, so the rest carry no real information, while the output has 124 parameters, denoted O1 to O124.
Note that O1 changes at a constant rate of 20 [1000, then 1020, then 1040, ...] while O2 changes at a different, but still constant, rate of 30, and the same applies to all 124 outputs: everything changes linearly at a constant rate.
I believed this was a trivial problem and that a very simple neural network model would reach 100% accuracy on the test data, but the results were the opposite.
I reached 100% test accuracy using a linear regressor and 99.99997% test accuracy using a KNN regressor.
I reached 41% test accuracy with a 10-layer neural network using relu activation, while all the other activation functions failed, and a shallow relu network failed as well.
Using a simple neural network with a linear activation function and no hidden layers, I reached 92% on the test data.
My question is: how can I get the neural network to reach 100% on the test data, like the linear regressor does?
A shallow network with linear activation is supposed to be equivalent to the linear regressor, but the results differ. Am I missing something?
If you use linear activations, a deep model is in principle the same as a linear regression / an NN with one layer. E.g. for a deep NN with linear activations the prediction is given as y = W_3(W_2(W_1 x)), which can be rewritten as y = (W_3 W_2 W_1) x, which is the same as y = W_4 x, i.e. a linear regression.
Given that, check whether your NN without a hidden layer converges to the same parameters as your linear regression. If it does not, your implementation is probably wrong. If it does, then your larger NN probably converges to some solution of the problem where the test accuracy is just worse; try different random seeds then.
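A toy numeric check of the collapse argument above (the sizes are made up; biases are omitted, but they collapse the same way, since W_2(W_1 x + b_1) + b_2 = (W_2 W_1) x + (W_2 b_1 + b_2)):

W1 = randn(30, 2); W2 = randn(1, 30); x = randn(2, 1);
y_deep   = W2 * (W1 * x);          % two linear layers
y_single = (W2 * W1) * x;          % one collapsed linear map
disp(max(abs(y_deep - y_single)))  % zero up to floating-point rounding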
I have trained a feedforward neural network in Matlab. Now I have to implement this neural network in C (or simulate the model in Matlab using the mathematical equations, without using the toolbox functions directly). How do I do that? I know that I have to take the weights, the biases and the activation function. What else is required?
There is no point in representing it as a mathematical function because it won't save you any computations.
Indeed, all you need is the weights, the biases, the activation functions and your architecture. Assuming it is a simple feedforward network, as you said, you need to implement some kind of matrix multiplication and addition in C, and you'll also need to implement the activation function. After that, your feedforward NN is ready to go. If the C code will not be used for training, it won't be necessary to implement the backpropagation algorithm in C.
A feedforward layer would be implemented as follows:
Output = Activation_function(Input * Weights + Bias)
where:
Input: (1 x number_of_input_parameters_for_this_layer)
Weights: (number_of_input_parameters_for_this_layer x number_of_neurons_for_this_layer)
Bias: (1 x number_of_neurons_for_this_layer)
Output: (1 x number_of_neurons_for_this_layer)
The output of a layer is the input to the next layer.
After some days of searching, I found the following webpage to be very useful: http://ufldl.stanford.edu/tutorial/supervised/MultiLayerNeuralNetworks/
The picture below shows a simple feedforward neural network. Picture taken from the above website.
In this figure, the circles denote the inputs to the network. The circles labeled “+1” are called bias units, and correspond to the intercept term. The leftmost layer of the network is called the input layer, and the rightmost layer the output layer (which, in this example, has only one node). The middle layer of nodes is called the hidden layer, because its values are not observed in the training set. In this example, the neural network has 3 input units (not counting the bias unit), 3 hidden units, and 1 output unit.
The mathematical equations representing this feedforward network are

a_1^(2) = f(W_11^(1) x_1 + W_12^(1) x_2 + W_13^(1) x_3 + b_1^(1))
a_2^(2) = f(W_21^(1) x_1 + W_22^(1) x_2 + W_23^(1) x_3 + b_2^(1))
a_3^(2) = f(W_31^(1) x_1 + W_32^(1) x_2 + W_33^(1) x_3 + b_3^(1))
h_W,b(x) = a_1^(3) = f(W_11^(2) a_1^(2) + W_12^(2) a_2^(2) + W_13^(2) a_3^(2) + b_1^(2))

This neural network has parameters (W,b) = (W^(1), b^(1), W^(2), b^(2)), where we write W_ij^(l) to denote the parameter (or weight) associated with the connection between unit j in layer l and unit i in layer l+1. (Note the order of the indices.) Also, b_i^(l) is the bias associated with unit i in layer l+1.
So, from the trained model, as Mido mentioned in his answer, we have to take the input weight matrix, which is W^(1), the layer weight matrix, which is W^(2), the biases, the hidden layer transfer function and the output layer transfer function. After this, use the above equations to estimate the output h_W,b(x). A popular choice for a regression problem is the tan-sigmoid transfer function in the hidden layer and a linear transfer function in the output layer.
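For instance, a hedged Matlab sketch of the forward pass for a trained fitnet-style network (tansig hidden layer, linear output). Note this assumes mapminmax is the only input/output processing function; with the toolbox defaults you must replicate every function listed in net.inputs{1}.processFcns:

% Manually simulate the trained network for one input column vector x
xn = mapminmax('apply', x, net.inputs{1}.processSettings{1});     % normalize input
a1 = tansig(net.IW{1,1} * xn + net.b{1});                         % hidden layer, W(1) and b(1)
yn = net.LW{2,1} * a1 + net.b{2};                                 % linear output layer, W(2) and b(2)
y  = mapminmax('reverse', yn, net.outputs{2}.processSettings{1}); % de-normalize output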
Those who use Matlab, these links are highly useful
try to simulate neural network in Matlab by myself
Neural network in MATLAB
Programming a Basic Neural Network from scratch in MATLAB
Suppose I have a very big training set, so that Matlab hangs while training or there is insufficient memory to hold the training set.
Is it possible to split the training set into parts and train the network by parts?
Is it possible to train the network with one sample at a time (one by one)?
You can just manually divide the dataset into batches and train on them one after another:
for bn = 1:num_batches
    inputs = <get batch bn inputs>;
    targets = <get batch bn targets>;
    net = train(net, inputs, targets);
end
The batch size should be greater than 1, though; even so, this should reduce the memory consumption during training.
In the case of the trainlm training algorithm, the net.efficiency.memoryReduction option can help.
Also, instead of the default trainlm algorithm, you can try less memory-hungry ones such as trainrp.
For details on the training algorithms, check the matlab documentation page.
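For illustration, the two memory-oriented settings mentioned above would look like this (the values are placeholders):

net.efficiency.memoryReduction = 2;  % trade speed for memory in trainlm's Jacobian computation
net.trainFcn = 'trainrp';            % or switch to a less memory-hungry algorithm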
I assumed above that you are using the corresponding matlab toolbox for neural networks.
Regarding training one sample at a time, you could try googling for the stochastic gradient descent algorithm. But it looks like it is not in the default set of training algorithms in the toolbox.
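That said, the toolbox's adapt function performs incremental updates and is the closest built-in analogue. A hedged sketch (whether it applies depends on the network's adaptFcn and learning functions):

% Present the samples one at a time; each call updates the weights once
for i = 1:size(inputs, 2)
    [net, y, e] = adapt(net, inputs(:, i), targets(:, i));
end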