Neural Network Retraining

I am coding a simple Neural Network, but I have thought of one issue that is bothering me.
This NN is for finding categories in the input. To make this concrete, say the categories are the decimal digits (0, 1, 2, ..., 9).
To implement this, the output layer has 10 nodes. Say I train this NN with several input-output pairs and save the learned weights somewhere. As the learning process takes quite a lot of time, I then take a break, come back fresh the next day, and resume learning with new input-output pairs. So far so good.
But what happens if, at that point, I decide that I want to recognize hexadecimals (0, 1, ..., 9, A, B, ..., E, F)? In other words, the number of categories increases.
I suspect that would imply changing the structure of the NN, and therefore that I would have to retrain it from scratch.
Is this so?
Any comment, advice, or shared experience will be greatly appreciated.
EDIT: This question has been marked as a duplicate. I read the other question and, although it is similar, mine is more concrete. The other question speaks in generalities and its answer is also quite general; mine is very concrete, because I use an example:
If I train a NN to recognize decimal digits and later decide to add data so that it recognizes hexadecimals, is that possible? How? Do I have to retrain the whole NN? In other words, does the structure of the NN need to stay fixed, with either 10 or 16 outputs, from the beginning?
I would very much appreciate a concrete answer to this. Thanks.

A few considerations
Your training set and testing set should have the same distribution
Unless you have some way of specifying sample weights, as some algorithms allow, you should avoid training on biased data at all costs. This is true for machine learning in general, not only neural networks.
Resuming training from a previous session is equivalent to using good initial values
Technically, you're just using the previous network as the initial value instead of a random one. You should keep training on the whole dataset, as always, to avoid a biased network.
Short Answer
Yes, you should always retrain your network, if by retraining you mean running a training routine on the full dataset.
If you just mean retraining as running one really long training session, it isn't your choice anyway. You must always train the network until the training error and the testing error (or cross-validated error) converge. If you reuse the previously trained network, that will probably happen faster.
This is true no matter what kind of change you make to the model: the network architecture, the dataset, both (your example), or some other parameter.
Of course, if you change the network architecture, reusing the previous network takes a bit more work. You can reuse the learned parameters for the nodes that were kept and randomly initialize the parameters for the new nodes.
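For example, here is a minimal NumPy sketch of that idea for going from 10 to 16 output classes. The layer sizes, array names, and random stand-ins for the saved weights are just assumptions for illustration, not something fixed by the question:

    import numpy as np

    # Hypothetical sizes: 64 hidden units, 10 old output classes, 16 new ones.
    n_hidden, n_old, n_new = 64, 10, 16

    rng = np.random.default_rng(0)

    # Previously learned output-layer parameters (stand-ins for whatever you saved).
    W_old = rng.normal(size=(n_hidden, n_old))
    b_old = np.zeros(n_old)

    # New output layer: keep the 10 learned columns, randomly initialize the 6 new ones.
    W_new = np.empty((n_hidden, n_new))
    b_new = np.empty(n_new)
    W_new[:, :n_old] = W_old
    b_new[:n_old] = b_old
    W_new[:, n_old:] = rng.normal(0.0, 0.01, size=(n_hidden, n_new - n_old))
    b_new[n_old:] = 0.0

    # Hidden-layer weights are copied over unchanged; training then continues
    # on the full hexadecimal dataset, starting from these values.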

Related

Neural Network checkpoints?

I am new to neural networks and I don't know exactly what to search for on Google, so here is my problem; please let me know what I should be looking for.
I am working on a project that will have many contributors over time. Each contributor adds a new line to an Excel file and then runs the code to train on the dataset.
What I want to ask is: is there a way to save a checkpoint, so that each time the code doesn't have to train on the whole dataset from zero, but can just continue training on the new entries instead?
Please let me know what exactly I should google.
Kind regards
This is, as you guessed, extremely common and usually referred to as "fine-tuning". In your case, since the dataset barely changes between training runs, you can expect the model to be very similar, so you could initialize your weights to the weights of the previous best model and retrain for only a few epochs, likely with a small learning rate.
People usually do fine-tuning starting from a network trained on an entirely different dataset, so it's likely that you will find that use-case rather than yours, but it will work even better if you keep a very similar dataset.
"Continual learning without forgetting"

Increased Error with more Training Data for a Neural Network in Matlab

I have a question regarding the Matlab NN toolbox. As part of a research project, I decided to create a Matlab script that uses the NN toolbox for some fitting solutions.
I have a data stream that is being loaded into my system. The input data consists of 5 input channels and 1 output channel. I train on this configuration for a while and try to fit the output (for a certain period of time) as new data streams in. I retrain my network constantly to keep it updated.
So far everything works fine, but after a certain period of time the results get bad and no longer represent the desired output. I really can't explain why this happens, but I imagine there must be some kind of memory issue, since while the data set is still small, everything is OK.
Only when it gets bigger does the quality of the simulation drop. Is there some kind of memory that gets full, or is the bad simulation just a result of the huge data sets? I'm a beginner with this tool and would really appreciate your feedback. Best regards and thanks in advance!
Please elaborate on your method of retraining with new data. Do you run further iterations? What do you mean by "time"? Do you mean epochs?
At first glance, assuming "time" means epochs, I would say that you're overfitting the data. Neural networks are supposed to be trained for a limited number of epochs with early stopping. You could try regularization, different gradient descent methods (if you're using a GD method), or GD momentum. Also, depending on the values of your first few training datasets, you may have trained using an incorrect normalization range. You should check these issues if my assumptions are correct.
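I can't reproduce the exact toolbox calls here, but a sketch of the early-stopping idea in Python/Keras (the layer sizes and the random placeholder data are assumptions, shaped to match the 5-input, 1-output setup) looks like this:

    import numpy as np
    from tensorflow import keras

    # Placeholder data with the question's shape: 5 input channels, 1 output channel.
    X = np.random.rand(1000, 5)
    y = np.random.rand(1000, 1)

    model = keras.Sequential([
        keras.Input(shape=(5,)),
        keras.layers.Dense(10, activation="tanh"),
        keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # Stop as soon as the validation error stops improving and roll back to the
    # best weights seen, which guards against overfitting as the dataset grows.
    early_stop = keras.callbacks.EarlyStopping(monitor="val_loss", patience=10,
                                               restore_best_weights=True)
    model.fit(X, y, validation_split=0.2, epochs=500, callbacks=[early_stop])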

Neural network preprocessing

I'm working on a school project about data prediction with a NN. I have my data normalized, and I have three inputs and one output.
My questions are:
What is the difference between the training data and the test data? (Is the training data supposed to be the input data and the test data the output data?)
What is the testing rate? Is it any random number, or is there a rule to find it?
What is the training error?
And my final question: after training my data, I remember something about an error. I'm not quite sure, but do I need to find the error of my prediction, and how do I find it?
I know my questions might not be clear, but I'm confused and have tried to explain them as best I can.
Answering in a school spirit: suppose you are given 10 solved exercises to study. You study them, and then the teacher tests you on those exact exercises. You do well on the test. However, there is an important question: why did you do well? Did you really understand the exercises, or did you just memorize them? And how can the teacher know?
There is only one way: the teacher must test you on a set of similar but different exercises. If you also do well on them, you have gotten a feel for the subject and are able to generalize the knowledge you acquired. If not, you probably memorized them without understanding a thing, and that kind of knowledge is useless.
The same happens with neural networks. You use one set of patterns (the training set) to train them. But, to check whether they are able to generalize, you have to test them on a different set of patterns (the test set), without the network knowing the correct answers. Ideally, there should only be a small difference in performance between the two sets; that is good generalization ability.
So, both the training and test sets are inputs, not outputs. The only difference is when you use them: the training set during training, and the test set after it. The training/test rate is the percentage of the training/test set that you got correct. The training/test error is its complement, that is, the percentage you got wrong.
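As a concrete illustration in Python with scikit-learn (the random placeholder data simply mirrors the three-input, one-output setup, and the 80/20 split is just a common choice, not a rule):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import mean_squared_error

    # Placeholder data: three normalized inputs, one output.
    X = np.random.rand(500, 3)
    y = np.random.rand(500)

    # Hold 20% of the pairs back; the network never sees them during training.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000).fit(X_train, y_train)

    # Training error vs. test error: a large gap means the network memorized
    # the training pairs instead of generalizing.
    train_error = mean_squared_error(y_train, net.predict(X_train))
    test_error = mean_squared_error(y_test, net.predict(X_test))
    print(train_error, test_error)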
I know this reply might come late, but I will just complement the previous answer by saying that in supervised learning both the training set and the test set consist of input-output pairs. Structurally they are exactly the same: a set of inputs and their corresponding outputs (or labels). There is no difference in structure between them.
As blue_note said, they are just used on different occasions: one during training and one after it.

Is there a rule or good advice on how big an artificial neural network should be?

My last lecture on ANNs was a while ago, but I'm currently facing a project where I want to use one.
So the basics are set: the type (a multi-layer feedforward network), the training method (an evolutionary algorithm, which is a given for the project), the number of input neurons (8), and the number of output neurons (7).
But I'm currently trying to figure out how many hidden layers I should use and how many neurons to put in each of them (the EA doesn't modify the network itself, only the weights).
Is there a general rule or maybe a guideline on how to figure this out?
The best approach for this problem is to implement the cascade correlation algorithm, in which hidden nodes are added sequentially as needed to reduce the error rate of the network. This has been demonstrated to be very useful in practice.
An alternative, of course, is a brute-force test of various values. I don't think simple answers such as "10 or 20 is good" are meaningful, because you are directly addressing the separability of the data in high-dimensional space by the basis function.
A typical neural net relies on hidden layers in order to converge on a particular problem solution. A hidden layer of about 10 neurons is standard for networks with few input and output neurons. However, a trial-and-error approach often works best. Since the neural net will be trained by a genetic algorithm, the number of hidden neurons may not play a significant role in training, since it's the weights and biases of the neurons that would be modified by an algorithm like backpropagation.
As rcarter suggests, trial and error might do fine, but there's another thing you could try.
You could use a genetic algorithm to determine the number of hidden layers and the number of neurons in them.
I did something similar with a bunch of random forests, trying to find the best number of trees, branches, parameters given to each tree, and so on.
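If you go the brute-force route, here is a small sketch of such a sweep in Python with scikit-learn; the candidate layouts and the random placeholder data (shaped for 8 inputs and 7 outputs) are only illustrative, and the sketch uses gradient-based training rather than your evolutionary algorithm:

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.neural_network import MLPRegressor

    # Placeholder data shaped like the project: 8 inputs, 7 outputs.
    X = np.random.rand(300, 8)
    y = np.random.rand(300, 7)

    # Try a handful of hidden-layer layouts and compare cross-validated scores,
    # so the comparison is not biased by a single lucky train/test split.
    candidates = [(5,), (10,), (20,), (10, 10), (20, 10)]
    for layout in candidates:
        net = MLPRegressor(hidden_layer_sizes=layout, max_iter=2000)
        score = cross_val_score(net, X, y, cv=5).mean()
        print(layout, score)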

Neural network and an intrusion detection system

How do I approach the problem of using a neural network in an intrusion detection system where, let's say, we have an attack via FTP?
Let's say someone continuously attempts different logins via a brute-force attack on an FTP account.
How would I set up the structure of the NN? What do I have to consider? How would it recognize "similar approaches in the future"?
Any diagrams and input would be much appreciated.
Your question is extremely general, and a good answer would be a project in itself. I recommend contracting someone with experience in neural network design to help come up with an appropriate model, or even to tell you whether your problem is amenable to a neural network at all. A few ideas, though:
Inputs need to be quantized, so start by making a list of possible numeric inputs that you could measure.
Outputs also need to be quantized and you probably can't generate a simple "Yes/no" response. Most likely you'll want to generate one or more numbers that represent a rough probability of it being an attack, perhaps broken down by category.
You'll need to accumulate a large set of training data that has been analyzed and quantized into the inputs and outputs you've designed. Figuring out the process of doing this quantization is a huge part of the overall problem.
You'll also need a large set of validation data, quantized in the same way as the training data, but which should not take any part in the training; otherwise you will simply force correlations into the network that may well be completely meaningless.
Once you've completed the above, you can think about how you want to structure your network and the specific algorithms you want to use to train it. There is a wide range of literature on this topic, but, honestly, this is the simpler part of the problem. Representing the problem in a way that can be processed coherently is much more difficult.
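To make the quantization point above concrete, here is a hypothetical sketch of turning raw FTP log events into a fixed-length numeric input vector; the specific features and the tuple layout are examples I am assuming, not a prescription:

    import numpy as np

    def features_from_window(events):
        """Quantize one time window of FTP log events.

        events: list of (timestamp, username, success_flag) tuples.
        Returns a fixed-length numeric vector suitable as network input.
        """
        n_attempts = len(events)
        n_failures = sum(1 for _, _, ok in events if not ok)
        n_users = len({user for _, user, _ in events})
        failure_ratio = n_failures / n_attempts if n_attempts else 0.0
        return np.array([n_attempts, n_failures, n_users, failure_ratio])

    # The corresponding training target would be one or more numbers, e.g. a
    # rough probability that the window contains a brute-force attack (possibly
    # one score per attack category), rather than a plain yes/no label.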