Neural Network OCR - using probability to determine touching symbols?

Neural Network OCR - using probability to determine touching symbols? - neural-network

Now I use some neural network for OCR and it produces output symbol and some probability for it. Also I have algorithm to split touching characters.
I expected to use probability to decide when to apply splitting.
But now I cannot do this because my network sometimes gives probability for touching characters higher then for normal characters.
Also I cannot understand what happened even after splitting - sometimes normal symbol can be split into two another symbols that both can be recognized with higher probability that initial symbol.
So I need to decide what to do. The question is
can Neural Network at least in theory provide reliable probability for OCR in this sense?
If it is possible then what should I try to do? Should I try to process current output or train network more or choose another network?
Any kind of help or suggestion will be greatly appreciated

Your approach is good and should eventually work given enough training data and given that you remove enough bugs from your preprocessing, splitting, training, etc.
Make sure that you split in the training set (prior to training) exactly the same way that you split the digits when you test them.
But note that Machine Learning produces algorithms that are correct within some accuracy, so you will always find instances that fail. The question is how good is your overall test performance (e.g. % correct digits), and how to increase this to the level that your application requires.
The question is can Neural Network at least in theory provide reliable
probability for OCR in this sense?
Yes
If it is possible then what should I try to do? Should I try to
process current output or train network more or choose another
network?
all of the above until it works! Training size is one of the key factors, and as you grow your training size you can grow your network to improve accuracy.

Related

Choose training and test set for MLP and Hopfield network

I have a question regarding the choice of the training and the test set for a Multilayer Perceptron (MLP) and a Hopfield network.
For example, assume that we got 100 patterns of the digits 0-9 given in a bitmap format. 10 of them are perfect digits while the other 90 are distorted. Which of these patterns will be used for the training set and which for the test set? The goal is to classify the digits.
I suppose for the Hopfield network the perfect digits will be used as the training set, but what about the MLP? One approach I thought of was to take for example 70 of the distorted digits and use them as the training set along with the corresponding perfect digits as their intended targets. Is this approach correct?

Disclaimer: I have not worked with Hopfield Networks before, so I trust you in your statements about it, but it should not be of that great relevance for the answer, anyways.
I am also assuming that you want to classify the digits, which is something you don't explicitly state in your question.
As for a proper split: Aside from the fact that that little training data is generally not a feasible amount to get decent results for a MLP (even for a simple task such as digit classification), it is unlikely that you will be able to "pre-label" your training data in terms of quality in most real-world scenarios. You should therefore always assume that the data you are processing is inherently noisy. A good example for this is also the fact that data augmentation is frequently used to enrich your training corpus. Since data augmentation can consist of such simple changes as
added noise
minor rotations
horizontal/vertical flipping (the latter only makes so much sense for digits, though)
can improve your accuracy, it goes to show that visual quality and quantity for training are two very different things. Of course, it is not per se true that quantity alone will solve your problem (although research indicates that it is at least a good idea to use very much data)
Further, what you judge to be a good representation might be very much different from the network's perspective (although for labeling digits it might be rather easy to tell). A decent strategy is therefore to simply perform a random sampling for your training/test split.
Something I like to do when preprocessing a dataset is, when done splitting, to check whether every class is somewhat evenly represented in the splits, so you won't overfit.
Similarly, I would argue that having clean/high quality images of digits in both your test and training set might make the most sense, since you want to both be able to recognize a high quality number, as well as a sloppily written digit, and then test whether you can actually recognize it (with your test set).

Neural Network Retraining

I am coding a simple Neural Network, but I have thought of one issue that is bothering me.
This NN is for finding categories in the input. To better understand this, say the categories are "the numbers" (0,1,2...9).
To implement this the output layer is 10 nodes. Say I train this NN with several input -output pairs and save the learned weights somewhere. As the learning process takes quite a lot of time, after that I go and take a break. Come fresh the next day and re-start learning with new input -output pairs. So fair so goo
But what happen if on that time, I decide that I want to recognize hexadecimals (0,1,...9,A,B,,,E,F)... ergo the categories are increasing.
I suspect that would imply changing the structure of the NN and therefore I should retrain the NN from scratch.
Is this so?
Any comment, advice or your share of experience will be greatly appreciated
EDIT: This question has been marked as duplicate. I read the other question and although similar, my question is more concrete. While the other question speaks in generalities and the answer also is quite general- mine is very concrete as I use an example:
If I train a NN to recognize decimal numbers and later on decide to add data to make it recognize hexadecimals, can this be possible? How? Do I have to retrain the whole NN? In other words, does the structure of the NN needs to stay stationary with 10 OR 16 outputs since the beginning?
I would very much appreciate for a concrete answer to this. Thanks

A few considerations
Your training set and testing set should have the same distribution
Unless you have some way of specifying sample weights like some algorithms can you should at all costs avoid training on biased data. This is true for machine learning in general, not only neural networks.
Resuming training from a previous session is equivalent of using good initial values
Technically, you're just using the previous network as initial value instead of a random value. You should keep training in the whole dataset as always, to avoid a biased network.
Short Answer
Yes, you should always retrain your network if by retrain, you mean doing a training routine with the full dataset.
If you just mean retrain as doing a really long training iteration, it isn't your choice anyway. You must always train the network until the training error and testing error (or cross validated error) converge. If you reuse the previously trained network, that will probably happen faster.
You see, this is true no matter what kind of model change. If you change the network architecture, or the dataset, or both (your example), or some other parameter.
Of course, if you change the network architecture, you're going to have a bit of trouble on reusing the previous network. You could reuse the learned parameters from nodes that were kept and randomly initialize the parameters for the new nodes.

i need a way to train a neural network other than backpropagation

This is an on-going venture and some details are purposefully obfuscated.
I have a box that has several inputs and one output. The output voltage changes as the input voltages are changed. The desirability of the output sequence cannot be evaluated until many states pass and a look back process is evaluated.
I want to design a neural network that takes a number of outputs from the box as input and produce the correct input settings for the box to produce the optimal next output.
I cannot train this network using backpropagation. How do I train this network?

Genetic algorithm would be a good candidate here. A chromosome could encode the weights of the neural network. After evaluation you assign a fitness value to the chromosomes based on their performance. Chromosomes with higher fitness value have a higher chance to reproduce, helping to generate better performing chromosomes in the next generation.
Encoding the weights is a relatively simple solution, more complex ones could even define the topology of the network.
You might find some additional helpful information here:
http://en.wikipedia.org/wiki/Neuroevolution

Hillclimbing is the simplest optimization algorithm to implement. Just randomly modify the weights, see if it does better, if not reset them and try again. It's also generally faster than genetic algorithms. However it is prone to getting stuck in local optima, so try running it several times and selecting the best result.

Using a learned Artificial Neural Network to solve inputs

I've recently been delving into artificial neural networks again, both evolved and trained. I had a question regarding what methods, if any, to solve for inputs that would result in a target output set. Is there a name for this? Everything I try to look for leads me to backpropagation which isn't necessarily what I need. In my search, the closest thing I've come to expressing my question is
Is it possible to run a neural network in reverse?
Which told me that there, indeed, would be many solutions for networks that had varying numbers of nodes for the layers and they would not be trivial to solve for. I had the idea of just marching toward an ideal set of inputs using the weights that have been established during learning. Does anyone else have experience doing something like this?
In order to elaborate:
Say you have a network with 401 input nodes which represents a 20x20 grayscale image and a bias, two hidden layers consisting of 100+25 nodes, as well as 6 output nodes representing a classification (symbols, roman numerals, etc).
After training a neural network so that it can classify with an acceptable error, I would like to run the network backwards. This would mean I would input a classification in the output that I would like to see, and the network would imagine a set of inputs that would result in the expected output. So for the roman numeral example, this could mean that I would request it to run the net in reverse for the symbol 'X' and it would generate an image that would resemble what the net thought an 'X' looked like. In this way, I could get a good idea of the features it learned to separate the classifications. I feel as it would be very beneficial in understanding how ANNs function and learn in the grand scheme of things.

For a simple feed-forward fully connected NN, it is possible to project hidden unit activation into pixel space by taking inverse of activation function (for example Logit for sigmoid units), dividing it by sum of incoming weights and then multiplying that value by weight of each pixel. That will give visualization of average pattern, recognized by this hidden unit. Summing up these patterns for each hidden unit will result in average pattern, that corresponds to this particular set of hidden unit activities.Same procedure can be in principle be applied to to project output activations into hidden unit activity patterns.
This is indeed useful for analyzing what features NN learned in image recognition. For more complex methods you can take a look at this paper (besides everything it contains examples of patterns that NN can learn).
You can not exactly run NN in reverse, because it does not remember all information from source image - only patterns that it learned to detect. So network cannot "imagine a set inputs". However, it possible to sample probability distribution (taking weight as probability of activation of each pixel) and produce a set of patterns that can be recognized by particular neuron.

I know that you can, and I am working on a solution now. I have some code on my github here for imagining the inputs of a neural network that classifies the handwritten digits of the MNIST dataset, but I don't think it is entirely correct. Right now, I simply take a trained network and my desired output and multiply backwards by the learned weights at each layer until I have a value for inputs. This is skipping over the activation function and may have some other errors, but I am getting pretty reasonable images out of it. For example, this is the result of the trained network imagining a 3: number 3

Yes, you can run a probabilistic NN in reverse to get it to 'imagine' inputs that would match an output it's been trained to categorise.
I highly recommend Geoffrey Hinton's coursera course on NN's here:
https://www.coursera.org/course/neuralnets
He demonstrates in his introductory video a NN imagining various "2"s that it would recognise having been trained to identify the numerals 0 through 9. It's very impressive!
I think it's basically doing exactly what you're looking to do.
Gruff

neural network and a intrusion detection system

How do I approach the problem with a neural network and a intrusion detection system where by lets say we have an attack via FTP.
Lets say some one attempts to continuously try different logins via brute force attack on an ftp account.
How would I set the structure of the NN? What things do I have to consider? How would it recognise "similar approaches in the future"?
Any diagrams and input would be much appreciated.

Your question is extremely general and a good answer is a project in itself. I recommend contracting someone with experience in neural network design to help come up with an appropriate model or even tell you whether your problem is amenable to using a neural network. A few ideas, though:
Inputs need to be quantized, so start by making a list of possible numeric inputs that you could measure.
Outputs also need to be quantized and you probably can't generate a simple "Yes/no" response. Most likely you'll want to generate one or more numbers that represent a rough probability of it being an attack, perhaps broken down by category.
You'll need to accumulate a large set of training data that has been analyzed and quantized into the inputs and outputs you've designed. Figuring out the process of doing this quantization is a huge part of the overall problem.
You'll also need a large set of validation data, which should be quantized in the same way as the training data, but that should not take any part in the training, as otherwise you will simply force a correlation network that may well be completely meaningless.
Once you've completed the above, you can think about how you want to structure your network and the specific algorithms you want to use to train it. There is a wide range of literature on this topic, but, honestly, this is the simpler part of the problem. Representing the problem in a way that can be processed coherently is much more difficult.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse