What architectures/methods are used to make a neural network that can take an infinitely big input and/or return an infinitely big output?
I have an idea of how to make an infinitely big output.
I just need extra input neurons, and after the first calculation I send the output (or part of it) back to those input neurons.
But I have no clue how to make extensible input.
Maybe use multiple iterations, plug the output back into the input, and set the rest of the input neurons to the next portion of the input data?
Artificial intelligence is new to me, so it is possible that I'm asking for something I don't actually want, or something impossible. Please keep the answers simple.
The short answer is that any RNN is capable of consuming, and producing, arbitrary-length sequences. Depending on the structure of the data, CNNs, graph networks, etc. can also work with arbitrarily large inputs.
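For concreteness, here is a minimal sketch of that idea, assuming PyTorch and arbitrary layer sizes: the same recurrent model accepts sequences of any length and emits one output per input step.

```python
# Minimal sketch (PyTorch assumed): a GRU reads a sequence of any length
# and emits one output per step, so input and output length are not fixed.
import torch
import torch.nn as nn

class SeqModel(nn.Module):
    def __init__(self, in_dim=8, hidden_dim=32, out_dim=8):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x):                  # x: (batch, any_length, in_dim)
        h, _ = self.rnn(x)                 # h: (batch, any_length, hidden_dim)
        return self.head(h)                # one output vector per input step

model = SeqModel()
print(model(torch.randn(1, 5, 8)).shape)    # torch.Size([1, 5, 8])
print(model(torch.randn(1, 500, 8)).shape)  # torch.Size([1, 500, 8])
```

To generate an arbitrarily long output, you can feed each step's output back in as the next input, which is essentially the feedback loop described in the question.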
Related
I have a quick, simple question about neural networks. As we all know, it is often better to make the network deeper instead of wider. So what will happen if I set each hidden layer to be just one neuron and make my network really deep?
This question came up because I had a lecture about CNNs today. The reason we use CNNs is that we want to extract features from images and reduce the dimensionality of the input data. Since we keep making the input to each layer smaller and smaller, why not just use one neuron per layer and make the network deeper? Or will something bad happen?
Thanks!
Obviously, the single-neuron example doesn't work -- otherwise, that's what we'd use.
The kernels of each layer in a CNN harness spatial relationships and evaluate those juxtapositions with non-linear functions, which is the main thing differentiating a CNN from a simple linear-combination NN. Without those non-linear operations, the CNN layers are merely a programming convenience.
If you immediately collapse your input to a single value, you have a huge problem in trying to write the cascading non-linearities that comprise the output evaluation. Yes, it's theoretically possible to write a function with, say, 28x28x3 inputs and exactly the output you want -- and to train the multitude of parameters in that function -- but it's very messy to code and nearly impossible to maintain.
For instance, imagine trying to code an entire income tax form in a single function, such that the input was the entire range of applicable monetary amounts, Boolean status info, scalar arguments (e.g. how many children live at home), ... and have the output be the correct amount of your payment or refund. There are many Boolean equations to apply, step functions with changing tax rates, various categories of additional tax or relief, etc.
Now, parameterize all of the constant coefficients in that massive calculation. Get some 10^6 real-life observations, and train your model on only that input and labels (tax/refund amount). You don't get any intermediate results to check, only that final amount.
It's possible, but not easy to write, debug, or maintain.
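To make the bottleneck concrete, here is a small sketch (PyTorch assumed, sizes arbitrary) of a "deep but one-neuron-wide" network: it runs, but everything after the first layer only ever sees a single scalar.

```python
# Sketch (PyTorch assumed): a "deep" network whose hidden layers are one
# neuron wide collapses a 28x28x3 image to a single scalar after the first
# layer; everything downstream can only reshape that one number.
import torch
import torch.nn as nn

narrow = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28 * 3, 1), nn.ReLU(),   # all spatial structure is gone here
    nn.Linear(1, 1), nn.ReLU(),             # extra depth adds almost nothing
    nn.Linear(1, 1), nn.ReLU(),
    nn.Linear(1, 10),
)

x = torch.randn(4, 3, 28, 28)
print(narrow(x).shape)  # torch.Size([4, 10]) -- it runs, but the one-neuron
                        # bottleneck has already thrown away nearly all information
```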
I am using a Bidirectional Recurrent Neural Network (Bi-RNN) for binary classification.
I have searched lots of papers related to this and learned that either the output of the last time step or all of the outputs are used for learning and prediction.
However, I'd like to use the output of a certain step, which may not be the last step, depending on the input. If the C-th input is what I want to know about, I will use the C-th output. Because the C-th output is the result of all of the other hidden states in the Bi-RNN, I don't have to use the outputs of the other time steps if I am only interested in the C-th one.
I know this is a bit of a special case, but I can't find any example for it.
(An attention-based approach is different from what I want. I want to use only one output, which may not be the last output, for learning from an input sequence.)
Are there any examples or research papers related to what I mentioned?
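For clarity, here is a sketch of what I mean (PyTorch assumed, sizes made up): run the Bi-RNN over the whole sequence, but feed only the C-th step's output to the classifier.

```python
# Sketch (PyTorch assumed): run a bidirectional GRU over the whole sequence,
# then pick out only the C-th time step's output for classification.
import torch
import torch.nn as nn

rnn = nn.GRU(input_size=16, hidden_size=32, batch_first=True, bidirectional=True)
classifier = nn.Linear(2 * 32, 2)          # binary classification head

x = torch.randn(4, 50, 16)                 # (batch, time, features)
c = torch.tensor([3, 10, 49, 0])           # the step of interest per example

outputs, _ = rnn(x)                        # (batch, time, 2 * hidden)
picked = outputs[torch.arange(4), c]       # (batch, 2 * hidden), only step C
logits = classifier(picked)                # loss/gradient flow only through step C
```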
I currently use a neural network for OCR; it produces an output symbol and a probability for it. I also have an algorithm to split touching characters.
I expected to use the probability to decide when to apply the splitting.
But I cannot do this right now, because my network sometimes gives a higher probability for touching characters than for normal characters.
I also cannot understand what happens after splitting: sometimes a normal symbol can be split into two other symbols, both of which are recognized with a higher probability than the initial symbol.
So I need to decide what to do. The questions are:
Can a neural network, at least in theory, provide reliable probabilities for OCR in this sense?
If it is possible, what should I try to do? Should I post-process the current output, train the network more, or choose another network?
Any kind of help or suggestion will be greatly appreciated.
Your approach is good and should eventually work given enough training data and given that you remove enough bugs from your preprocessing, splitting, training, etc.
Make sure that you split in the training set (prior to training) exactly the same way that you split the digits when you test them.
But note that Machine Learning produces algorithms that are correct within some accuracy, so you will always find instances that fail. The question is how good is your overall test performance (e.g. % correct digits), and how to increase this to the level that your application requires.
Can a neural network, at least in theory, provide reliable probabilities for OCR in this sense?
Yes.
If it is possible, what should I try to do? Should I post-process the current output, train the network more, or choose another network?
All of the above, until it works! Training set size is one of the key factors, and as you grow your training set you can grow your network to improve accuracy.
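One practical thing to try when post-processing the current output is a guarded split decision, i.e. only accept a split when both halves beat the unsplit image by a clear margin. A rough sketch (the recognize/split helpers are placeholders for your own code, and the margin value is arbitrary):

```python
# Sketch (hypothetical helper names): only accept a split when both halves
# are recognized with clearly higher confidence than the unsplit image.
def should_split(recognize, split, image, margin=0.1):
    # recognize(img) -> (symbol, probability); split(img) -> (left, right)
    _, p_whole = recognize(image)
    left, right = split(image)
    _, p_left = recognize(left)
    _, p_right = recognize(right)
    # require a margin so tiny probability differences don't trigger a split
    return min(p_left, p_right) > p_whole + margin
```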
This is an ongoing venture and some details are purposefully obfuscated.
I have a box that has several inputs and one output. The output voltage changes as the input voltages are changed. The desirability of the output sequence cannot be evaluated until many states pass and a look-back process is evaluated.
I want to design a neural network that takes a number of outputs from the box as input and produces the correct input settings for the box to produce the optimal next output.
I cannot train this network using backpropagation. How do I train this network?
A genetic algorithm would be a good candidate here. A chromosome could encode the weights of the neural network. After evaluation, you assign a fitness value to each chromosome based on its performance. Chromosomes with a higher fitness value have a higher chance to reproduce, helping to generate better-performing chromosomes in the next generation.
Encoding the weights is a relatively simple solution; more complex encodings could even define the topology of the network.
You might find some additional helpful information here:
http://en.wikipedia.org/wiki/Neuroevolution
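A rough sketch of the weight-encoding idea (NumPy, arbitrary sizes; the fitness function below is a placeholder for building the network from the weights and scoring it against your box with the look-back evaluation):

```python
# Sketch (NumPy, arbitrary sizes): each chromosome is simply the flattened
# weight vector of a small network; selection and mutation evolve it.
import numpy as np

rng = np.random.default_rng(0)
N_WEIGHTS = 4 * 8 + 8 * 1        # e.g. 4 inputs -> 8 hidden -> 1 output
POP = 50

def fitness(weights):
    # placeholder: build the network from `weights`, run it against the box,
    # and return a score from the look-back evaluation described above
    return -np.sum(weights ** 2)

population = rng.normal(size=(POP, N_WEIGHTS))
for generation in range(100):
    scores = np.array([fitness(w) for w in population])
    parents = population[np.argsort(scores)][-POP // 2:]   # keep the fittest half
    children = (parents[rng.integers(0, len(parents), POP // 2)]
                + rng.normal(scale=0.1, size=(POP // 2, N_WEIGHTS)))  # mutated copies
    population = np.concatenate([parents, children])

best = population[np.argmax([fitness(w) for w in population])]
```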
Hill climbing is the simplest optimization algorithm to implement. Just randomly modify the weights and see if the network does better; if not, reset them and try again. It's also generally faster than genetic algorithms. However, it is prone to getting stuck in local optima, so try running it several times and selecting the best result.
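A rough sketch of the same idea (NumPy; the fitness function is again a placeholder for your box evaluation):

```python
# Sketch (NumPy): hill climbing over a flattened weight vector --
# perturb the weights, keep the change only if the score improves.
import numpy as np

rng = np.random.default_rng(1)

def fitness(weights):
    # placeholder for building/evaluating the network against the box
    return -np.sum(weights ** 2)

weights = rng.normal(size=40)
best_score = fitness(weights)
for step in range(10_000):
    candidate = weights + rng.normal(scale=0.05, size=weights.shape)
    score = fitness(candidate)
    if score > best_score:               # keep only improvements
        weights, best_score = candidate, score

# rerun from several random starting points and keep the best result
# to reduce the risk of landing in a poor local optimum
```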
How do I approach the problem of using a neural network in an intrusion detection system where, let's say, we have an attack via FTP?
Let's say someone continuously tries different logins via a brute-force attack on an FTP account.
How would I set up the structure of the NN? What things do I have to consider? How would it recognise "similar approaches in the future"?
Any diagrams and input would be much appreciated.
Your question is extremely general and a good answer is a project in itself. I recommend contracting someone with experience in neural network design to help come up with an appropriate model or even tell you whether your problem is amenable to using a neural network. A few ideas, though:
Inputs need to be quantized, so start by making a list of possible numeric inputs that you could measure.
Outputs also need to be quantized and you probably can't generate a simple "Yes/no" response. Most likely you'll want to generate one or more numbers that represent a rough probability of it being an attack, perhaps broken down by category.
You'll need to accumulate a large set of training data that has been analyzed and quantized into the inputs and outputs you've designed. Figuring out the process of doing this quantization is a huge part of the overall problem.
You'll also need a large set of validation data, which should be quantized in the same way as the training data but should not take any part in the training; otherwise you will simply train the network to fit correlations that may well be completely meaningless.
Once you've completed the above, you can think about how you want to structure your network and the specific algorithms you want to use to train it. There is a wide range of literature on this topic, but, honestly, this is the simpler part of the problem. Representing the problem in a way that can be processed coherently is much more difficult.
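To illustrate only the shape such a model could take (the feature names and attack categories below are invented for illustration, not a recommendation), here is a minimal sketch assuming PyTorch: a quantized feature vector goes in, and a rough per-category attack probability comes out.

```python
# Sketch (PyTorch assumed; feature names and categories are hypothetical):
# a quantized feature vector in, a rough per-category attack probability out.
import torch
import torch.nn as nn

# hypothetical numeric features measured over a time window, e.g.
# [failed_logins_per_min, distinct_usernames, mean_interval_s, bytes_sent, ...]
N_FEATURES = 8
CATEGORIES = ["brute_force", "port_scan", "dos", "other"]

model = nn.Sequential(
    nn.Linear(N_FEATURES, 32), nn.ReLU(),
    nn.Linear(32, len(CATEGORIES)),
    nn.Sigmoid(),                      # independent probability per category
)

window = torch.randn(1, N_FEATURES)    # one quantized observation window
print(dict(zip(CATEGORIES, model(window).squeeze(0).tolist())))
```

The hard part remains exactly what the answer describes: deciding which measurable quantities to feed in and how to label enough training and validation windows to learn anything meaningful.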