How does backpropagation work? - neural-network

I have a question about the backpropagation algorithm used in deep learning.
How should I update the weights when we have n training samples?
Should I update the weights after each sample, and then update them again with the next sample?
Or should I average over the samples and then update using the average?
Please tell me which procedure is the sound one.
Thanks,
Afshin

Both approaches are correct. Updating the weights after every sample is called "online" learning; averaging over the samples and then updating is called "offline" (or "batch") learning.
Online learning
Online machine learning is used in the case where the data becomes available in a sequential fashion (excerpt of the definition on Wikipedia).
Offline learning
Offline or "batch" learning may be used when one has access to the entire training dataset at once. An advantage of batch learning is that each update follows the gradient averaged over all samples, so the descent is less noisy than with per-sample updates. This comes at a higher training cost: every update needs a full pass over the data, and the network often requires additional backpropagation iterations.
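To make the two update schemes concrete, here is a minimal sketch in plain Python. It assumes a linear model with a squared-error loss instead of a full neural network, and the names (online_epoch, batch_epoch), the toy data and the learning rate are all illustrative choices, not something from the question.

```python
import random

def predict(w, x):
    """Linear model: dot product of weights and inputs."""
    return sum(w_j * x_j for w_j, x_j in zip(w, x))

def online_epoch(w, X, y, lr):
    """One pass over the data, updating the weights after every sample."""
    for x_i, y_i in zip(X, y):
        err = predict(w, x_i) - y_i  # gradient of 0.5 * err**2 is err * x_i
        w = [w_j - lr * err * x_j for w_j, x_j in zip(w, x_i)]
    return w

def batch_epoch(w, X, y, lr):
    """One pass over the data, then a single update with the averaged gradient."""
    grad = [0.0] * len(w)
    for x_i, y_i in zip(X, y):
        err = predict(w, x_i) - y_i
        for j, x_j in enumerate(x_i):
            grad[j] += err * x_j / len(X)
    return [w_j - lr * g_j for w_j, g_j in zip(w, grad)]

# Toy, noise-free data generated from known weights (an assumption for the demo).
random.seed(0)
true_w = [1.5, -2.0, 0.5]
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [predict(true_w, x) for x in X]

w_online = [0.0, 0.0, 0.0]
w_batch = [0.0, 0.0, 0.0]
for _ in range(200):
    w_online = online_epoch(w_online, X, y, lr=0.05)
    w_batch = batch_epoch(w_batch, X, y, lr=0.05)

print("online:", w_online)
print("batch: ", w_batch)
```

Both loops approach the same weights here. The online version makes 200 updates per pass over the data, the batch version makes one.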

Related

How do I determine the architecture for deep NN training according to the number of examples?

As the title says, how can I determine the architecture, or build a reasonable model, for training a neural network with regard to the number of examples?
For example, assume that I have roughly 50 thousand images and have successfully converted all the data to fit the model, so they are ready for training. How can I choose a model that is suitable for training? I am sometimes a little confused when I have data but do not know how to set up a model for training a NN.
Fine tuning is the way
Sometimes you have a pre-trained CNN that you can use as a starting point for your domain. For more about fine tuning you can check here.
According to this, my advice is to fine tune a pre-trained neural network that you can find in Keras (This page, under "Available models") or TensorFlow. You can go as deep as your training set allows!
In any case, look at the number of samples per class rather than the total number of images in your training set. If you have enough samples per class, you can pick a state-of-the-art deep learning architecture and try to train it from scratch.
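As a small illustration of the last point, this sketch counts samples per class before choosing a model. The label list is made up for the example, and samples_per_class is a hypothetical helper, not part of Keras or TensorFlow.

```python
from collections import Counter

def samples_per_class(labels):
    """Return a mapping from class name to number of samples."""
    return Counter(labels)

# Hypothetical labels: the total is 500 images, but one class is very small.
labels = ["person"] * 400 + ["dog"] * 90 + ["bicycle"] * 10

counts = samples_per_class(labels)
rarest_class, rarest_count = min(counts.items(), key=lambda kv: kv[1])

print(counts)
print("rarest class:", rarest_class, "with", rarest_count, "samples")
```

A total of 500 images looks the same whether the classes are balanced or not, so it is the per-class counts that tell you whether training from scratch is realistic.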

Steps to think of while trying to increase accuracy of network

How do you go about increasing the accuracy of your neural network?
I have tried lots of architectures, yet in my image detection task (classification + localization) I can only get 75% accuracy.
I am using the VOC2007 dataset, and I extracted only the data where exactly 1 person is present.
What steps can I take to increase the accuracy of my object detector?
Thanks for the help.
You might want to have a look at my master's thesis, Analysis and Optimization of Convolutional Neural Network Architectures, chapter 2.5, page 15:
A machine learning developer has the following choices to improve the model’s quality:
(I1) Change the problem definition (e.g., the classes which are to be distinguished)
(I2) Get more training data
(I3) Clean the training data
(I4) Change the preprocessing (see Appendix B.1)
(I5) Augment the training data set (see Appendix B.2)
(I6) Change the training setup (see Appendices B.3 to B.5)
(I7) Change the model (see Appendices B.6 and B.7)
It's always good to check thoroughly where exactly the problem is and compare it with a human baseline. Reliably getting better than a human is super hard.

Continuously train MATLAB ANN, i.e. online training?

I would like to ask what options there are for training a MATLAB ANN (artificial neural network) continuously, i.e. without a pre-prepared training set. The idea is to have an "online" data stream: when first created, the network is completely untrained, but as samples flow in, the ANN is trained and converges.
The ANN will be used to classify a set of values, and the implementation would visualize how the training of the ANN improves as samples flow through the system. That is, each sample is used for training and is then also evaluated by the ANN, and the response is visualized.
The effect that I expect is that for the very first samples the response of the ANN will be more or less random, but as the training progresses the accuracy improves.
Any ideas are most welcome.
Regards, Ola
In MATLAB you can use the adapt function instead of train. You can do this incrementally (changing the weights every time you get a new piece of information) or every N samples, batch-style.
This document gives an in-depth run-down on the different styles of training from the perspective of a time-series problem.
I'd really think about what you're trying to do here, because adaptive learning strategies can be difficult. I found that they like to flail all over compared to their batch counterparts. This was especially true in my case where I work with very noisy signals.
Are you sure that you need adaptive learning? You can't periodically re-train your NN? Or build one that generalizes well enough?
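To show what "evaluate each sample, then train on it" looks like independent of MATLAB, here is a rough sketch in plain Python. It is not the adapt function: it assumes a single perceptron, a synthetic two-feature stream, and made-up names (stream, run_online), all chosen only for illustration.

```python
import random

def stream(n, seed=0):
    """Hypothetical data stream: 2-D points labelled by the side of x0 + x1 = 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        label = 1 if x[0] + x[1] > 0 else 0
        yield x, label

def run_online(samples, lr=0.1):
    """Classify each incoming sample first, then use it for one training step."""
    w = [0.0, 0.0]
    b = 0.0
    hits = []
    for x, label in samples:
        pred = 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0
        hits.append(pred == label)   # evaluation happens before the update
        err = label - pred           # perceptron rule: no change when correct
        w = [w_j + lr * err * x_j for w_j, x_j in zip(w, x)]
        b += lr * err
    return hits

hits = run_online(stream(2000))
early = sum(hits[:100]) / 100
late = sum(hits[-100:]) / 100
print("accuracy on first 100 samples:", early)
print("accuracy on last 100 samples: ", late)
```

Because every prediction is made before the sample is used for training, the list of hits can be plotted directly as a running accuracy curve, which is the kind of visualization described in the question.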

Neuroph Vs Encog

I have decided to use a feed-forward NN with back-propagation training for my OCR application for handwritten text. The input layer is going to have 32*32 (1024) neurons, and there will be at least 8-12 output neurons.
From reading some articles, I found Neuroph easy to use, while Encog is a few times better in performance. Considering the parameters in my scenario, which API is the most suitable one? I would also appreciate it if you could comment on the number of input nodes I have chosen: is it too large a value? (Although that is off topic.)
First, my disclaimer: I am one of the main developers on the Encog project. This means I am more familiar with Encog than Neuroph and perhaps biased towards it. In my opinion, the relative strengths of each are as follows. Encog supports quite a few interchangeable machine learning methods and training methods. Neuroph is VERY focused on neural networks, and you can express a connection between just about anything. So if you are going to create very custom/non-standard (research) neural networks with topologies different from the typical Elman/Jordan, NEAT, HyperNEAT, or feedforward networks, then Neuroph will fit the bill nicely.

Which multiplication and addition factor to use when doing adaptive learning rate in neural networks?

I am new to neural networks and, to get a grip on the matter, I have implemented a basic feed-forward MLP which I currently train through back-propagation. I am aware that there are more sophisticated and better ways to do that, but in Introduction to Machine Learning they suggest that with one or two tricks, basic gradient descent can be effective for learning from real world data. One of the tricks is an adaptive learning rate.
The idea is to increase the learning rate by a constant value a when the error gets smaller, and decrease it by a fraction b of the learning rate when the error gets larger. So basically the learning rate change is determined by:
+(a)
if we're learning in the right direction, and
-(b * <learning rate>)
if we're ruining our learning. However, the book gives no advice on how to set these parameters. I wouldn't expect a precise suggestion, since parameter tuning is a whole topic of its own, but at least a hint about their order of magnitude. Any ideas?
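Written out as code, the rule described above could look like the following sketch. The function name and the values passed for a and b are placeholders for illustration only, not recommendations, and leaving the rate unchanged when the error is identical is an assumption the description does not cover.

```python
def adapt_learning_rate(lr, prev_error, error, a, b):
    """Additive increase by a when the error drops, relative decrease by b otherwise."""
    if error < prev_error:
        return lr + a          # +(a): learning in the right direction
    if error > prev_error:
        return lr - b * lr     # -(b * <learning rate>): error got worse
    return lr                  # unchanged error: keep the rate (assumption)

# Placeholder values, chosen only to show the arithmetic.
improved = adapt_learning_rate(0.1, prev_error=1.0, error=0.8, a=0.01, b=0.5)
worsened = adapt_learning_rate(0.1, prev_error=0.8, error=1.0, a=0.01, b=0.5)
print(improved, worsened)
```

Since the decrease is proportional to the current rate and b is a fraction below 1, the rate can shrink quickly but never becomes negative, while the increase stays a fixed small step.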
Thank you,
Tunnuz
I haven't looked at neural networks for the longest time (10+ years), but after I saw your question I thought I would have a quick scout about. I kept seeing the same figures all over the internet for the increase factor a and the decrease factor b (1.2 and 0.5 respectively).
I have managed to track these values down to Martin Riedmiller and Heinrich Braun's RPROP algorithm (1992). Riedmiller and Braun are quite specific about sensible parameters to choose.
See: RPROP: A Fast Adaptive Learning Algorithm
I hope this helps.
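For reference, here is a rough sketch of how the 1.2 and 0.5 factors are used in an RPROP-style update. Note that in RPROP both factors multiply a per-weight step size, which differs from the additive increase in the question. This is a simplified variant without weight backtracking; the initial step of 0.1, the step bounds, the function name and the toy quadratic objective are assumptions made for the example.

```python
def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def rprop_step(w, grad, prev_grad, steps, inc=1.2, dec=0.5,
               step_min=1e-6, step_max=50.0):
    """One simplified RPROP update: adapt each step size, then move against the gradient sign."""
    new_w, new_steps = [], []
    for w_i, g_i, pg_i, s_i in zip(w, grad, prev_grad, steps):
        if g_i * pg_i > 0:       # same sign as last time: grow the step
            s_i = min(s_i * inc, step_max)
        elif g_i * pg_i < 0:     # sign flipped, we overshot: shrink the step
            s_i = max(s_i * dec, step_min)
        new_steps.append(s_i)
        new_w.append(w_i - sign(g_i) * s_i)  # only the sign of the gradient is used
    return new_w, new_steps

# Toy objective (an assumption): f(w) = sum((w_i - target_i) ** 2)
target = [3.0, -1.0]
w = [0.0, 0.0]
steps = [0.1, 0.1]
prev_grad = [0.0, 0.0]
for _ in range(100):
    grad = [2 * (w_i - t_i) for w_i, t_i in zip(w, target)]
    w, steps = rprop_step(w, grad, prev_grad, steps)
    prev_grad = grad

print(w)
```

The step sizes grow by 1.2 while the gradient keeps its sign and are halved on each overshoot, so the weights close in on the minimum without a global learning rate to tune.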