Choosing distinct pairs with neural network

I'm trying to generate class 1 cubic graphs. To do so I need to choose a perfect matching, which is basically a partition of {1, ..., n} into pairs (each number should be in exactly one pair).
For example, for n = 8, I could choose (1, 7), (2, 4), (3, 6), (5, 8).
I want to use a neural net to do so, but I don't know at all how to do this properly, because my net needs to know which numbers are still available at each step. My first idea was to try to generate a permutation and then take the numbers two by two, but I don't know how to do that with a neural network. Any ideas?
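For what it's worth, the "generate a permutation, then take the numbers two by two" step is easy outside the network; the open question is only how to make a net emit the permutation. A minimal sketch of the pairing step (plain numpy; the random permutation is a stand-in for whatever the net would produce):

import numpy as np

n = 8
perm = (np.random.permutation(n) + 1).tolist()  # stand-in for a net-generated permutation of 1..n
pairs = list(zip(perm[0::2], perm[1::2]))       # take the numbers two by two
print(pairs)                                    # e.g. [(1, 7), (2, 4), (3, 6), (5, 8)]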

Related

Normalization before data split in Neural Network

I am trying to run an MLP regressor with one hidden layer on my dataset. I am standardizing my data, but I want to be clear as to whether it matters if I do the standardization before or after splitting the dataset into training and test sets. I want to know if there will be any difference in my prediction values if I carry out standardization before the data split.
Yes and no. If the mean and variance of the training and test sets are different, standardization can lead to a different outcome.
That being said, a good training and test set should be similar enough that the data points are distributed in a similar way, in which case post-split standardization should give (almost) the same results.
You should absolutely do it before splitting.
Imagine having [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] as your inputs, which get split into [1, 2, 3, 4, 5, 7, 9, 10] for training and [6, 8] for testing.
It's immediately clear that the min-max ranges, as well as the means and standard deviations, of the two samples are completely different, so by applying standardization "post-split" you are completely scrambling the relationship between the values in the first and the second set.
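To see this numerically, here is a quick numpy check of the example above (nothing here is specific to neural nets):

import numpy as np

train = np.array([1, 2, 3, 4, 5, 7, 9, 10], dtype=float)
test = np.array([6, 8], dtype=float)

# Standardizing each split with its own statistics puts the two sets
# on different scales:
print(train.mean(), train.std())   # ~5.125 and ~3.06
print(test.mean(), test.std())     # 7.0 and 1.0

# The raw value 6 would get z ~= +0.29 under the train statistics but
# z = -1.0 under the test statistics: the two scales disagree even on sign.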

Neural Network to identify Seven-Segment Numerals

I am studying machine learning and I am working on my first neural network as a project for one of my classes. I am programming the network in Java. The point of the network is to identify seven-segment numerals (like on a regular digital clock). The network does not actually have to be linked to any real sensors; it just needs to work in theory, based on inputs given as 0s and 1s in text form (not binary) which correspond to a hypothetical sensor matrix laid across the top of the number.
My question is, what sort of output am I looking to get?
Will the binary output just correspond to the same sort of matrix as the input, or is the binary output supposed to represent the input number in binary, such as returning 111 for 7?
If it does just return another matrix, what is the point of the network?
The input for a seven-segment numeral would be a (1 X 7) vector, with 1 for segments that are on and 0 for segments that are off.
As for the output, you don't specify what you want it to be, so let's assume you want it to tell you "which digit is the screen showing". Since there are 10 digits (0 through 9), you have 10 possible answers. The output would be a (1 X 10) vector, with each position corresponding to one of the digits. Its value represents how confident the network is that this is the correct answer (typically the output values lie in [0, 1], but it depends on your setup). Ideally you would want the network to return a vector with a 1 in one position and zeros in all the others.
Note, however, that in this case a classifier is not useful. A classification algorithm generalizes from what it has seen in the past, so it would be useful for handwriting recognition, because even if the same person writes the same digit twice, it is not exactly the same. In your case, each digit is the same across all 7-segment displays, so your network is not exactly learning, but rather memorizing the input.
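To make the encodings concrete, here is a minimal sketch in Python/numpy (the a-g segment ordering and the lookup table are my assumptions; they follow the usual 7-segment convention):

import numpy as np

# 7-segment patterns for digits 0-9, segment order (a, b, c, d, e, f, g), 1 = on.
SEGMENTS = np.array([
    [1, 1, 1, 1, 1, 1, 0],  # 0
    [0, 1, 1, 0, 0, 0, 0],  # 1
    [1, 1, 0, 1, 1, 0, 1],  # 2
    [1, 1, 1, 1, 0, 0, 1],  # 3
    [0, 1, 1, 0, 0, 1, 1],  # 4
    [1, 0, 1, 1, 0, 1, 1],  # 5
    [1, 0, 1, 1, 1, 1, 1],  # 6
    [1, 1, 1, 0, 0, 0, 0],  # 7
    [1, 1, 1, 1, 1, 1, 1],  # 8
    [1, 1, 1, 1, 0, 1, 1],  # 9
])

TARGETS = np.eye(10)    # one-hot (1 X 10) targets: row i is the target for digit i

x = SEGMENTS[7]         # network input for the digit 7: [1, 1, 1, 0, 0, 0, 0]
y = TARGETS[7]          # desired output: 1 in position 7, zeros elsewhere

# The network only ever sees these ten fixed input/output pairs, which is
# why it ends up memorizing the mapping rather than generalizing.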

Restricting output classes in multi-class classification in Tensorflow

I am building a bidirectional LSTM to do multi-class sentence classification.
I have 13 classes in total to choose from, and I am multiplying the output of my LSTM network by a matrix whose dimensionality is [2*num_hidden_unit, num_classes] and then applying a softmax to get the probability of the sentence falling into one of the 13 classes.
So if we consider output[-1] as the network output:
import tensorflow as tf

W_output = tf.Variable(tf.truncated_normal([2*num_hidden_unit, num_classes]))
bias = tf.Variable(tf.zeros([num_classes]))      # bias is used below but was never defined
result = tf.matmul(output[-1], W_output) + bias  # shape: [1, num_classes]
and I get my [1, 13] matrix (assuming I am not working with batches for the moment).
Now, I also have the information that a given sentence definitely does not fall into certain classes, and I want to restrict the number of classes considered for a given sentence. So let's say, for instance, that for a given sentence I know it can fall into only 6 classes, so the output should really be a matrix of dimensionality [1, 6].
One option I was thinking of is to put a mask over the result matrix, where I multiply the entries corresponding to the classes I want to keep by 1 and the ones I want to discard by 0, but in this way I will just lose some of the information instead of redirecting it.
Does anyone have a clue what to do in this case?
I think your best bet is, as you seem to have described, using a weighted cross-entropy loss function where the weights for your "impossible" classes are 0 and the weights for the possible classes are 1. Tensorflow has a weighted cross-entropy loss function.
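For what it's worth, a common way to implement the restriction without the 0/1 multiplication is to add a large negative constant to the logits of the impossible classes before the softmax, which drives their probabilities to effectively zero. A minimal sketch in the same TF1 style as the question's snippet (allowed_mask and y_true are assumed placeholders; result and num_classes come from the code above):

allowed_mask = tf.placeholder(tf.float32, [1, num_classes])  # 1.0 = possible class, 0.0 = impossible
y_true = tf.placeholder(tf.float32, [1, num_classes])        # one-hot target label

masked_logits = result + (1.0 - allowed_mask) * -1e9  # impossible logits pushed to -1e9
probs = tf.nn.softmax(masked_logits)                  # probability mass only on allowed classes
loss = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=masked_logits)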
Another interesting, but probably less effective, method is to feed whatever information you have about which classes your sentence can or cannot fall into to the network at some point (probably towards the end).

Can a neural network be trained to recognize abstract pattern forms?

I'm curious as to the kind of limitations even an expertly designed network might have. This one in particular is what I could use some insight on:
Given:
a set of random integers of non-trivial size (say at least 500);
an expertly created/trained neural network.
Task:
Number anagram: create the largest representation of an infinite sequence of integers possible in a given time frame, where the sequence either can be represented in closed form (i.e. n^2, 2x+5, etc.) or is registered in OEIS (http://oeis.org/). The numbers used to create the sequence can be taken from the input set in any order, so if the network is fed (3, 5, 1, 7, ...), returning (1, 3, 5, 7, ...) would be an acceptable result.
It's my understanding that an ANN can be trained to look for a particular sequence pattern (again, n^2, 2x+5, etc.). What I'm wondering is whether it can be made to recognize a more general pattern like n^y or xy+z. My thinking is that it won't be able to, because n^y can produce sequences that look different enough from one another that a stable 'base pattern' can't be established. That is, intrinsic to the way ANNs work (taking sets of input and doing fuzzy matching against a static pattern they have been trained to look for) is that they are limited in the scope of what they can be trained to look for.
Have I got this right?
Continuing from the conversation I had with you in the comments:
Neural networks still might be useful. Instead of training a neural net to search for a single pattern, the neural net can be trained to predict the data. If the data contains a predictable pattern, the NN can learn it, and the weights of the NN will represent the pattern it has learned. I think that may be what you were intending to do.
Some things that might be helpful for you if you do this:
Autoencoders do unsupervised learning and can learn the structure of individual datapoints.
Recurrent Neural Networks can model sequences of data rather than just individual datapoints. This sounds more like what you are looking for; see the sketch after this list.
A Compositional Pattern-Producing Network (CPPN) is a really fancy word for a neural network with mathematical functions as activation functions. This would allow you to model functions that aren't easily approximated by NNs with simple activation functions like sigmoids or ReLU. But usually this isn't necessary, so don't worry too much about it until after you have a simple NN working.
Dropout is a simple technique where you randomly remove a fraction (often half) of the hidden units on each training iteration. This seems to seriously reduce overfitting. It also prevents complicated relationships between neurons from forming, which should make the models more interpretable, which seems like your goal.
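As an illustration of the "train the net to predict the data" idea with an RNN, here is a minimal sketch using Keras (the n^2 toy sequence, the window size, and all hyperparameters are my own assumptions):

import numpy as np
from tensorflow import keras

# Toy data: sliding windows over n^2; each window of 5 terms predicts the next term.
seq = np.array([n ** 2 for n in range(1, 101)], dtype=float)
seq /= seq.max()                                 # scale to [0, 1] for training stability
window = 5
X = np.array([seq[i:i + window] for i in range(len(seq) - window)])
y = seq[window:]
X = X[..., None]                                 # shape: (samples, window, 1)

model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(window, 1)),
    keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=200, verbose=0)

# If the loss is low, the weights have encoded the pattern; the last window
# should predict something close to the true final term (1.0 after scaling).
print(model.predict(X[-1:], verbose=0))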

Neural Networks in Matlab. Adding large learning patterns

I am new to Matlab and I can't find a solution to my problem...
What is the problem?
I have to create a neural network in Matlab that will have almost 25k inputs and 10 outputs. There are also 300 patterns to learn.
When I was reading about neural networks in Matlab, I saw that all input/learning data go into one matrix. That is fine for XOR or something similarly small. Then I realized that I would have to create a matrix that contains 25,000 * 300 elements (7.5 million integers).
1) Is there any possibility to expand the matrix by adding new rows (learning patterns)?
2) Or maybe there is something like:
learnPatternMatrix1 = [1, 2, 3, ..., 25000];   % hypothetical API: one input pattern
perfectOutputMatrix1 = [1, 2, 3, ..., 10];     % ...and its target output
network.addPattern(learnPatternMatrix1, perfectOutputMatrix1);
network.addPattern(learnPatternMatrix2, perfectOutputMatrix2);
% ...
network.addPattern(learnPatternMatrix300, perfectOutputMatrix300);
network.learn();
Thanks for help ;)
I'm sorry I don't have an answer to making Matlab deal with that size of matrix. I do have some comments which may be relevant to the problem, however.
Neural networks, like most machine learning algorithms, are unlikely to perform well when there is a large number of features (inputs) compared to the number of data points. Unless you have an order of magnitude or two more data points than the 25,000 features you describe, this approach may not work, and you seem to have only 300 cases. Even support vector machines, supposedly robust to this problem, are unlikely to perform well under these conditions.
When there is not enough data for the number of features, you can think of it as guaranteed overfitting, since each data point will be uniquely situated and widely separated in feature space.
Have you considered feature reduction? That will solve your Matlab problem, and is likely to improve the performance of your ANN.
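If it helps, here is a minimal sketch of feature reduction with PCA, shown in Python with scikit-learn for brevity (Matlab has an equivalent pca function in the Statistics Toolbox); the random matrix is a stand-in for your real 300 x 25,000 data:

import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(300, 25000)     # stand-in: 300 patterns, 25k features each

# With only 300 samples there are at most 299 informative directions, so
# even a generous number of components shrinks the input dramatically.
pca = PCA(n_components=100)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)             # (300, 100): feed this to the network instead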