Neural networks and unordered input data - neural-network

Much work has been done in recent years on neural networks where the input is a collection. In particular, convolutional networks often work well on N-dimensional arrays (exploiting spatial structure) and recurrent neural networks such as LSTM often work well on time-series data (exploiting temporal structure).
I'm currently looking at the question of using neural networks for heuristic guidance of search in theorem proving. One of the problems here is that the input takes the form not of an array but of a set of clauses, where there is not so much of a natural ordering between clauses, and we cannot say that each clause will mostly interact with nearby neighbors in space.
Aside from 'bag of words' for text documents (which works because there is a fixed dictionary), is anything already known about which neural network architectures work well for inputs that don't have a natural order?

Related

Are there deep learning methods for string similarity in machine translation?

I am interested in machine translation and, more specifically, I would like to examine the similarity between two strings. I would like to know if there are deep learning methods for text feature extraction. I have already tried well-known statistical methods such as cosine similarity, Levenshtein distance, word frequency and others.
Thank you
To find the similarity between two strings, try training a Siamese network on your dataset.
Siamese networks are a special type of neural network architecture. Instead of learning to classify its inputs, the network learns to differentiate between two inputs: both are passed through the same encoder, and the network learns how similar the resulting representations are.
https://medium.com/#gautam.karmakar/manhattan-lstm-model-for-text-similarity-2351f80d72f1
The link below covers a Kaggle competition where Siamese networks were used for text similarity:
https://medium.com/mlreview/implementing-malstm-on-kaggles-quora-question-pairs-competition-8b31b0b16a07
Hope this clears your doubts
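As a rough illustration of the Siamese idea described above, here is a minimal sketch in plain Python. The `encode` function is a hypothetical, untrained stand-in (hashed character bigrams) for whatever trained encoder you would actually use, such as the LSTM in the linked articles. The `exp(-L1 distance)` score follows the Manhattan-style similarity those articles are named after. None of the names below come from a library.

```python
import math

def encode(text, dim=64):
    """Hypothetical shared encoder: hash character bigrams into a count vector.

    In a real Siamese network this would be a trained model (e.g. an LSTM).
    """
    vec = [0.0] * dim
    padded = f" {text.lower()} "
    for a, b in zip(padded, padded[1:]):
        vec[(ord(a) * 31 + ord(b)) % dim] += 1.0
    return vec

def manhattan_similarity(h1, h2):
    """exp(-L1 distance): 1.0 for identical encodings, near 0 for distant ones."""
    return math.exp(-sum(abs(x - y) for x, y in zip(h1, h2)))

def siamese_similarity(a, b):
    # The key point: the SAME encoder (shared weights) processes both inputs.
    return manhattan_similarity(encode(a), encode(b))

same = siamese_similarity("hello", "hello")
different = siamese_similarity("hello", "world")
```

Training would adjust the encoder so that pairs labelled as similar score close to 1 and dissimilar pairs score close to 0. This sketch only shows the forward computation.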

Use a trained neural network to imitate its training data

I'm in the early stages of designing a prose imitation system. It will read a bunch of prose, then mimic it. It's mostly for fun, so the mimicking prose doesn't need to make too much sense, but I'd like to make it as good as I can with a minimal amount of effort.
My first idea is to use my example prose to train a classifying feed-forward neural network, which classifies its input as either part of the training data or not. Then I'd like to somehow invert the neural network, finding new random inputs that also get classified by the trained network as being part of the training data. The obvious and stupid way of doing this is to randomly generate word lists and only output the ones that get classified above a certain threshold, but I think there is a better way, using the network itself to limit the search to certain regions of the input space. For example, maybe you could start with a random vector and do gradient-based optimisation (gradient ascent on the classifier's output) to find a local maximum around the random starting point. Is there a word for this kind of imitation process? What are some of the known methods?
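To make the "start from a random vector and climb the classifier's output" idea concrete, here is a minimal sketch. It assumes a hypothetical stand-in classifier (a single logistic unit with made-up fixed weights) in place of a trained network. With a real network the gradient with respect to the input would come from an autodiff framework instead of the hand-written formula.

```python
import math
import random

# Hypothetical stand-in for a trained classifier: one logistic unit, fixed weights.
WEIGHTS = [1.5, -2.0, 0.5]
BIAS = -0.25

def score(x):
    """Classifier output in (0, 1): how strongly x looks like training data."""
    z = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

def ascend(x, steps=200, lr=0.5):
    """Gradient ascent on the INPUT (the weights stay fixed)."""
    x = list(x)
    for _ in range(steps):
        s = score(x)
        # For a logistic unit, d(score)/dx_i = s * (1 - s) * w_i
        x = [xi + lr * s * (1.0 - s) * w for xi, w in zip(x, WEIGHTS)]
    return x

rng = random.Random(0)
start = [rng.uniform(-1.0, 1.0) for _ in WEIGHTS]
result = ascend(start)
```

For this toy classifier the score simply saturates towards 1 rather than reaching a true local maximum. A multi-layer network has a more complicated surface, so different random starting points can end at different maxima.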
How about Generative Adversarial Networks (GAN, Goodfellow 2014) and their more advanced siblings like Deep Convolutional Generative Adversarial Networks? There are plenty of proper research articles out there, and also more gentle introductions like this one on DCGAN and this on GAN. To quote the latter:
GANs are an interesting idea that were first introduced in 2014 by a
group of researchers at the University of Montreal lead by Ian
Goodfellow (now at OpenAI). The main idea behind a GAN is to have two
competing neural network models. One takes noise as input and
generates samples (and so is called the generator). The other model
(called the discriminator) receives samples from both the generator
and the training data, and has to be able to distinguish between the
two sources. These two networks play a continuous game, where the
generator is learning to produce more and more realistic samples, and
the discriminator is learning to get better and better at
distinguishing generated data from real data. These two networks are
trained simultaneously, and the hope is that the competition will
drive the generated samples to be indistinguishable from real data.
(DC)GAN should fit your task quite well.

Input values of an ANN constructed with keras framework (using theano)

I want to construct a neural network which will be trained on data I create. My question is: what form should these data have? In other words, does Keras allow neural networks that take strings/characters as input? If not, and it only accepts numbers, what range should the input/output be in?
The only condition for your input data, i.e. the features, is that it should be numerical. There isn't really any constraint on the range, but it's always a good idea to apply feature scaling, normalization, etc. so that features on very different scales don't hamper training. Neural networks and other machine learning methods cannot accept strings (characters, words) directly, so you first need to convert the strings to numbers. There are many ways to do that; the most common techniques include bag of words, tf-idf features, word embeddings, etc.
The following tutorials (using scikit-learn) might be a good starting point:
http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html
https://www.kaggle.com/c/word2vec-nlp-tutorial/details/part-1-for-beginners-bag-of-words
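As a small illustration of the two steps mentioned above (turning strings into numbers, then scaling), here is a sketch in plain Python. It is not the scikit-learn or Keras API. The function names are made up for this example, and the tokenisation is a simple whitespace split.

```python
from collections import Counter

def build_vocabulary(documents):
    """Map each distinct lowercase word to a column index (alphabetical order)."""
    words = sorted({w for doc in documents for w in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}

def bag_of_words(document, vocabulary):
    """Count how often each vocabulary word occurs in the document."""
    counts = Counter(document.lower().split())
    # Dicts preserve insertion order, so columns follow the sorted vocabulary.
    return [counts.get(word, 0) for word in vocabulary]

def min_max_scale(values):
    """Rescale one feature column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

docs = ["the cat sat", "the dog sat down"]
vocab = build_vocabulary(docs)
vectors = [bag_of_words(d, vocab) for d in docs]
scaled = min_max_scale([10, 20, 30])
```

Each row of `vectors` is a fixed-length numeric vector that could be fed to a network's input layer.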

Can a neural network be trained to recognize abstract pattern forms?

I'm curious about the kinds of limitations even an expertly designed network might have. This one in particular is what I could use some insight on:
given:
a set of random integers of non-trivial size (say at least 500)
an expertly created/trained neural network.
task:
number anagram: create the largest representation of an infinite sequence of integers possible in a given time frame where the sequence
either can be represented in closed form (ie - n^2, 2x+5, etc) or is
registered in OEIS (http://oeis.org/). the numbers used to create the
sequence can be taken from the input set in any order. so if the
network is fed (3, 5, 1, 7...), returning (1, 3, 5, 7 ...) would be an
acceptable result.
It's my understanding that an ANN can be trained to look for a particular sequence pattern (again: n^2, 2x+5, etc.). What I'm wondering is whether it can be made to recognize a more general pattern like n^y or xy+z. My thinking is that it won't be able to, because n^y can produce sequences that look different enough from one another that a stable 'base pattern' can't be established. That is, intrinsic to the way ANNs work (taking sets of input and doing fuzzy matching against a static pattern they have been trained to look for) is that they are limited in the scope of what they can be trained to look for.
Have I got this right?
Continuing from the conversation I had with you in the comments:
Neural networks still might be useful. Instead of training a neural net to search for a single pattern, the neural net can be trained to predict the data. If the data contains a predictable pattern, the NN can learn it, and the weights of the NN will represent the pattern it has learned. I think that may be what you were intending to do.
Some things that might be helpful for you if you do this:
Autoencoders do unsupervised learning and can learn the structure of individual datapoints.
Recurrent Neural Networks can model sequences of data rather than just individual datapoints. This sounds more like what you are looking for.
A Compositional Pattern-Producing Network (CPPN) is a really fancy name for a neural network with mathematical functions as activation functions. This would allow you to model functions that aren't easily approximated by NNs with simple activation functions like sigmoids or ReLU. But usually this isn't necessary, so don't worry too much about it until after you have a simple NN working.
Dropout is a simple technique where you randomly remove a fraction (commonly half) of the hidden units at every training iteration. This seems to seriously reduce overfitting. It prevents complicated co-dependent relationships between neurons from forming, which should make the models more interpretable, which seems like your goal.
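The dropout step can be sketched in a few lines of plain Python. This is a minimal sketch and not any framework's API. The rescaling of the surviving units by `1 / keep_prob` (often called inverted dropout) is an assumption added here so that the expected activation stays the same between training and inference.

```python
import random

def dropout(activations, drop_prob=0.5, training=True, rng=random):
    """Randomly zero hidden-unit activations during training.

    Surviving units are scaled by 1 / keep_prob (an assumption of this sketch)
    so the layer's expected output is unchanged.
    """
    if not training or drop_prob == 0.0:
        return list(activations)  # no units are dropped at inference time
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

hidden = [1.0] * 100
dropped = dropout(hidden, drop_prob=0.5, rng=random.Random(0))
unchanged = dropout(hidden, training=False)
```

A fresh random mask is drawn on every call, so each training iteration sees a different thinned network.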

What should my input be in an ANN?

I am getting confused about the input data set. I am studying artificial neural networks, and my purpose is to use historical data (I have stock data for the last 10 years) to predict stock values in the future (for example, 2015). So, where is my input? For example, I have Excel sheet data as [Column1-Date | Column2-High | Column3-Low | Column4-Opening | Column5-Closing].
By profession I am a quant and I am currently pursuing a master's degree in Computer Science. There are many considerations when selecting financial input for a neural network, including:
Select indicators which are positively correlated to returns.
Indicators are independent variables which have predictive power on the dependent variable (stock returns). Common popular indicators include technical indicators derived from price and volume data, fundamental indicators about the underlying company or asset, and quantitative indicators such as descriptive statistics or even model parameters. If you have many indicators, you can narrow them down using correlation analysis, best subset, or principal component analysis.
Pre-process the indicators for use in Neural Networks
Neural networks work by connecting perceptrons together. Each perceptron contains an activation function, e.g. the sigmoid function or tanh. Most activation functions have an active range. For the sigmoid function this is between -sqrt(3) and +sqrt(3). What this means is that you should normalize your data to fall within the active range and seriously consider removing outliers.
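The pre-processing step above could look roughly like the following sketch. The price values and the outlier bounds are made up for illustration, and the function names are hypothetical. The target range uses the -sqrt(3) to +sqrt(3) figure quoted above.

```python
import math

def remove_outliers(values, lower, upper):
    """Keep only values inside [lower, upper]; the bounds are chosen by you."""
    return [v for v in values if lower <= v <= upper]

def scale_to_range(values, new_min=-math.sqrt(3), new_max=math.sqrt(3)):
    """Linearly map values so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        midpoint = (new_min + new_max) / 2.0
        return [midpoint for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]

# Made-up closing prices; 250.0 plays the role of an outlier.
closing_prices = [101.0, 103.5, 99.0, 250.0, 102.0]
cleaned = remove_outliers(closing_prices, lower=95.0, upper=110.0)
scaled = scale_to_range(cleaned)
```

The minimum and maximum used for scaling should be computed on the training data only and then reused for any later data, otherwise information from the future leaks into the inputs.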
There are many other potential issues with using Neural Networks. I wrote an article a while back which identified ten issues, including the ones mentioned here. Feel free to check it out.