Input values of an ANN constructed with the Keras framework (using Theano)

I want to construct a neural network that will be trained on data I create. My question is: what form should these data take? In other words, does Keras allow neural networks that take strings/characters as input? If not, and it only accepts numbers, in what range should the input/output be?

The only condition on your input data, i.e. the features, is that it must be numerical. There isn't really any constraint on the range, but it's always a good idea to apply feature scaling or normalization so that features on very different scales don't confuse the model. Neural networks, like other machine learning methods, cannot accept strings (characters, words) directly, so you first need to convert strings to numbers. There are many ways to do that; the most common techniques include bag of words, tf-idf features, and word embeddings.
The following tutorials (using scikit-learn) might be a good starting point; a minimal sketch follows the links below:
http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html
https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-1-for-beginners-bag-of-words
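As a minimal sketch of the strings-to-numbers step (the texts, labels, and layer sizes below are made-up placeholders, and the fit argument name differs slightly between Keras versions), here is tf-idf vectorization with scikit-learn feeding a small Keras model:

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from keras.models import Sequential
    from keras.layers import Dense

    texts = ["good movie", "bad movie", "great film", "terrible film"]  # placeholders
    labels = np.array([1, 0, 1, 0])

    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(texts).toarray()  # strings -> numeric feature matrix

    model = Sequential()
    model.add(Dense(8, input_dim=X.shape[1], activation='relu'))
    model.add(Dense(1, activation='sigmoid'))      # probability of class 1
    model.compile(loss='binary_crossentropy', optimizer='adam')
    model.fit(X, labels, epochs=50, verbose=0)     # use nb_epoch=50 on Keras 1.x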

Related

Supervised neural network

Wanted to ask the opinion of SO experts about the type of neural network I should use to teach it to make yes/no answers based on a combination of over fifty parameters. Essentially I have an evaluation that may produce up to fifty different warnings or errors that are present in what's being evaluated. So far I've been using a weighted average with coefficients to produce a yes/no threshold, but I wanted to learn more about applying it through a supervised neural network, which I can feed different results and teach to give the final verdict. Which neural network can I use for such an undertaking? There are quite a few out there, and as I'm just entering the field of machine learning, I wanted to know which direction I should start looking in.
EDIT
What I'm starting to lean towards is employing some kind of back-propagation to adjust the coefficients for each of the rules, where the decision of whether the barcode data is correct or not will influence those coefficients. I'm pretty sure this can be achieved using a NN, but I'm not exactly sure which one to use.
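For what it's worth, the simplest fit for this description is a plain feed-forward binary classifier trained with back-propagation. A minimal Keras sketch (the 50 binary warning/error flags and the synthetic verdicts are placeholders, not real data):

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    X = np.random.randint(0, 2, size=(1000, 50))  # placeholder warning/error flags
    y = (X.sum(axis=1) > 25).astype(int)          # placeholder yes/no verdicts

    model = Sequential()
    model.add(Dense(16, input_dim=50, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))     # output in [0, 1]: the verdict
    model.compile(loss='binary_crossentropy', optimizer='adam')
    model.fit(X, y, epochs=10, verbose=0)         # nb_epoch=10 on Keras 1.x

Training with back-propagation adjusts the weights, which play the role of the hand-tuned coefficients in the weighted-average approach.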

Use a trained neural network to imitate its training data

I'm in the early stages of designing a prose imitation system. It will read a bunch of prose, then mimic it. It's mostly for fun, so the mimicking prose doesn't need to make too much sense, but I'd like to make it as good as I can with a minimal amount of effort.
My first idea is to use my example prose to train a classifying feed-forward neural network, which classifies its input as either part of the training data or not. Then I'd like to somehow invert the neural network, finding new random inputs that also get classified by the trained network as being part of the training data. The obvious, brute-force way of doing this is to randomly generate word lists and only output the ones that get classified above a certain threshold, but I think there is a better way, using the network itself to limit the search to certain regions of the input space. For example, maybe you could start with a random vector and do gradient ascent to find a local maximum around the random starting point. Is there a word for this kind of imitation process? What are some of the known methods?
How about Generative Adversarial Networks (GANs, Goodfellow et al. 2014) and their more advanced siblings, such as Deep Convolutional Generative Adversarial Networks (DCGANs)? There are plenty of proper research articles out there, and also gentler introductions, like this one on DCGAN and this one on GAN. To quote the latter:
GANs are an interesting idea that were first introduced in 2014 by a group of researchers at the University of Montreal led by Ian Goodfellow (now at OpenAI). The main idea behind a GAN is to have two competing neural network models. One takes noise as input and generates samples (and so is called the generator). The other model (called the discriminator) receives samples from both the generator and the training data, and has to be able to distinguish between the two sources. These two networks play a continuous game, where the generator is learning to produce more and more realistic samples, and the discriminator is learning to get better and better at distinguishing generated data from real data. These two networks are trained simultaneously, and the hope is that the competition will drive the generated samples to be indistinguishable from real data.
A (DC)GAN should fit your task quite well.
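To make the mechanics concrete, here is a minimal GAN sketch in Keras; the layer sizes, the 1-D toy "real data" distribution, and the training loop details are illustrative assumptions, not something from the quoted article:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Dense

    noise_dim, data_dim = 8, 1

    # Generator: maps noise to a fake sample.
    generator = Sequential([
        Dense(16, input_dim=noise_dim, activation='relu'),
        Dense(data_dim),
    ])

    # Discriminator: maps a sample to the probability that it is real.
    discriminator = Sequential([
        Dense(16, input_dim=data_dim, activation='relu'),
        Dense(1, activation='sigmoid'),
    ])
    discriminator.compile(loss='binary_crossentropy', optimizer='adam')

    # Stacked model trains the generator against a frozen discriminator.
    discriminator.trainable = False
    gan = Sequential([generator, discriminator])
    gan.compile(loss='binary_crossentropy', optimizer='adam')

    for step in range(1000):
        noise = np.random.normal(size=(32, noise_dim))
        fake = generator.predict(noise)
        real = np.random.normal(loc=4.0, size=(32, data_dim))  # toy "training data"
        # The discriminator learns real (1) vs. generated (0) samples...
        discriminator.train_on_batch(real, np.ones((32, 1)))
        discriminator.train_on_batch(fake, np.zeros((32, 1)))
        # ...while the generator learns to make it answer 1 for fakes.
        gan.train_on_batch(noise, np.ones((32, 1)))

For prose imitation you would replace the toy 1-D samples with encoded text, which is considerably harder and is where the more advanced variants come in.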

can a neural network be trained to recognize abstract pattern forms?

I'm curious about the kinds of limitations even an expertly designed network might have. This one in particular is what I could use some insight on:
Given:
a set of random integers of non-trivial size (say at least 500), and
an expertly created/trained neural network.
Task (number anagram):
Create the largest representation of an infinite sequence of integers possible in a given time frame, where the sequence either can be represented in closed form (e.g. n^2, 2x+5, etc.) or is registered in the OEIS (http://oeis.org/). The numbers used to create the sequence can be taken from the input set in any order, so if the network is fed (3, 5, 1, 7, ...), returning (1, 3, 5, 7, ...) would be an acceptable result.
It's my understanding that an ANN can be trained to look for a particular sequence pattern (again, n^2, 2x+5, etc.). What I'm wondering is whether it can be made to recognize a more general pattern like n^y or xy+z. My thinking is that it won't be able to, because n^y can produce sequences that look different enough from one another that a stable 'base pattern' can't be established. That is, intrinsic to the way ANNs work (taking sets of input and doing fuzzy matching against a static pattern they've been trained to look for) is a limit on the scope of what they can be trained to look for.
Have I got this right?
Continuing from the conversation I had with you in the comments:
Neural networks still might be useful. Instead of training a neural net to search for a single pattern, the neural net can be trained to predict the data. If the data contains a predictable pattern, the NN can learn it, and the weights of the NN will represent the pattern it has learned. I think that may be what you were intending to do.
Some things that might be helpful for you if you do this:
Autoencoders do unsupervised learning and can learn the structure of individual data points.
Recurrent neural networks can model sequences of data rather than just individual data points. This sounds more like what you are looking for (see the sketch after this list).
A compositional pattern-producing network (CPPN) is a fancy name for a neural network whose activation functions are arbitrary mathematical functions. This would let you model functions that aren't easily approximated by NNs with simple activation functions like sigmoids or ReLU. Usually this isn't necessary, though, so don't worry too much about it until you have a simple NN working.
Dropout is a simple technique that randomly disables a fraction (often half) of the hidden units at each training iteration. This seems to seriously reduce overfitting. It prevents complicated co-dependencies between neurons from forming, which should make the model more interpretable, which seems like your goal.
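To make the "train the net to predict the data" idea concrete, here is a minimal recurrent sketch in Keras; the window size, layer sizes, and the n^2 toy sequence are my own illustrative assumptions:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import LSTM, Dense

    # Toy target pattern: the squares. Scale into [0, 1] for stable training.
    seq = np.array([n ** 2 for n in range(1, 51)], dtype=float)
    seq /= seq.max()

    # Sliding windows: predict each element from the previous five.
    window = 5
    X = np.array([seq[i:i + window] for i in range(len(seq) - window)])
    y = seq[window:]
    X = X.reshape((-1, window, 1))  # (samples, timesteps, features)

    model = Sequential()
    model.add(LSTM(32, input_shape=(window, 1)))
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    model.fit(X, y, epochs=200, verbose=0)  # nb_epoch=200 on Keras 1.x

If the net learns to predict held-out windows, its weights implicitly encode the pattern, which is the sense in which it "recognizes" n^2 without being hard-coded for it.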

Using a learned Artificial Neural Network to solve inputs

I've recently been delving into artificial neural networks again, both evolved and trained. I had a question regarding what methods, if any, exist for solving for the inputs that would produce a given target output set. Is there a name for this? Everything I try to look for leads me to backpropagation, which isn't necessarily what I need. In my search, the closest thing I've found to expressing my question is
Is it possible to run a neural network in reverse?
which told me that there would, indeed, be many solutions for networks with varying numbers of nodes per layer, and that they would not be trivial to solve for. I had the idea of simply marching toward an ideal set of inputs using the weights established during learning. Does anyone else have experience doing something like this?
In order to elaborate:
Say you have a network with 401 input nodes representing a 20x20 grayscale image plus a bias, two hidden layers of 100 and 25 nodes, and 6 output nodes representing a classification (symbols, roman numerals, etc.).
After training the neural network so that it can classify with an acceptable error, I would like to run the network backwards. This would mean I input a classification at the output that I would like to see, and the network imagines a set of inputs that would result in that output. So for the roman numeral example, this could mean that I request it to run the net in reverse for the symbol 'X', and it generates an image resembling what the net thinks an 'X' looks like. In this way, I could get a good idea of the features it learned to separate the classifications. I feel it would be very beneficial in understanding how ANNs function and learn in the grand scheme of things.
For a simple feed-forward, fully connected NN, it is possible to project a hidden unit's activation into pixel space by taking the inverse of the activation function (for example, the logit for sigmoid units), dividing it by the sum of incoming weights, and then multiplying that value by the weight of each pixel. That gives a visualization of the average pattern recognized by this hidden unit. Summing up these patterns over all hidden units results in the average pattern corresponding to that particular set of hidden unit activities. The same procedure can in principle be applied to project output activations into hidden unit activity patterns.
This is indeed useful for analyzing what features a NN has learned in image recognition. For more complex methods you can take a look at this paper (among other things, it contains examples of patterns that a NN can learn).
You cannot exactly run a NN in reverse, because it does not remember all the information from the source image, only the patterns it learned to detect. So the network cannot "imagine a set of inputs". However, it is possible to sample a probability distribution (taking the weight as the probability of activation of each pixel) and produce a set of patterns that can be recognized by a particular neuron.
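As a rough NumPy sketch of that projection (the function names and the single-hidden-layer weight layout are my assumptions, not from the answer):

    import numpy as np

    def logit(a):
        """Inverse of the sigmoid activation (defined for 0 < a < 1)."""
        return np.log(a / (1.0 - a))

    def hidden_unit_pattern(W_in, j, activation):
        """Project hidden unit j's activation back into pixel space.

        W_in: (n_pixels, n_hidden) input-to-hidden weight matrix."""
        w = W_in[:, j]
        scale = logit(activation) / w.sum()  # invert activation, divide by weight sum
        return scale * w                     # weight each pixel's contribution

    def input_pattern(W_in, hidden_activations):
        """Average input pattern for a whole vector of hidden activities."""
        return sum(hidden_unit_pattern(W_in, j, a)
                   for j, a in enumerate(hidden_activations) if 0.0 < a < 1.0)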
I know that you can, and I am working on a solution now. I have some code on my GitHub here for imagining the inputs of a neural network that classifies the handwritten digits of the MNIST dataset, but I don't think it is entirely correct. Right now, I simply take a trained network and my desired output and multiply backwards by the learned weights at each layer until I have a value for the inputs. This skips over the activation function and may have some other errors, but I am getting pretty reasonable images out of it. For example, this is the result of the trained network imagining a 3: [image: the network's imagined 3]
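The naive "multiply backwards" idea is easy to write down. A minimal sketch (assuming each weight matrix W has shape (n_inputs, n_outputs), so the forward pass is x.dot(W), and ignoring biases and activations just as the answer describes):

    import numpy as np

    def imagine_input(weights, target_output):
        """Crude input reconstruction: push a desired output vector
        back through each layer's weights, skipping activations."""
        signal = np.asarray(target_output, dtype=float)
        for W in reversed(weights):  # walk from output layer back to input layer
            signal = W.dot(signal)
        return signal

    # e.g. for a 400-100-25-6 net (biases omitted):
    # weights = [W1, W2, W3] with shapes (400, 100), (100, 25), (25, 6)
    # image = imagine_input(weights, [0, 0, 0, 1, 0, 0]).reshape(20, 20)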
Yes, you can run a probabilistic NN in reverse to get it to 'imagine' inputs that would match an output it's been trained to categorise.
I highly recommend Geoffrey Hinton's coursera course on NN's here:
https://www.coursera.org/course/neuralnets
He demonstrates in his introductory video a NN imagining various "2"s that it would recognise having been trained to identify the numerals 0 through 9. It's very impressive!
I think it's basically doing exactly what you're looking to do.

Neural Network Categorical Data Implementation

I've been learning to work with neural networks as a hobby project, but I am at a complete loss as to how to handle categorical data. I read the article http://visualstudiomagazine.com/articles/2013/07/01/neural-network-data-normalization-and-encoding.aspx, which explains normalization of the input data and how to preprocess categorical data using effects encoding. I understand the concept of breaking the categories into vectors, but I have no idea how to actually implement this.
For example, if I'm using countries as categorical data (e.g. Finland, Thailand, etc), would I process the resulting vector into a single number to be fed to a single input, or would I have a separate input for each component of the vector? Under the latter, if there are 196 different countries, that would mean I would need 196 different inputs just to process this particular piece of data. If a lot of different categorical data is being fed to the network, I can see this becoming really unwieldy very fast.
Is there something I'm missing? How exactly is categorical data mapped to neuron inputs?
Neural network inputs
As a rule of thumb: different classes and categories should have their own input signals.
Why you can't encode it with a single input
Since a neural network acts on its input values through activation functions, a higher input value results in a higher activation input.
A higher input value makes the neuron more likely to fire.
So unless you want to tell the network that Thailand is "better" than Finland, you should not encode the country input signal as InputValue(Finland) = 24, InputValue(Thailand) = 140.
How it should be encoded
Each country deserves its own input signal so that all countries contribute equally to activating the neurons. This is commonly called one-hot encoding: a vector with one component per country, all zero except for the one that applies.
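A minimal sketch of that encoding (the three-country list is a made-up stand-in for the full 196):

    import numpy as np

    countries = ['Finland', 'Thailand', 'Brazil']  # 196 entries in practice
    index = {c: i for i, c in enumerate(countries)}

    def one_hot(country):
        vec = np.zeros(len(countries))
        vec[index[country]] = 1.0  # only this country's input signal is active
        return vec

    print(one_hot('Thailand'))  # [0. 1. 0.]

196 inputs sounds unwieldy, but the vectors are sparse and cheap to build, and typical neural network libraries handle input widths like this without trouble.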