Is it possible to distinguish between the patterns [ABCDEFG] and [ABCDEGF]? And what about distinguishing between [ABCDEFGH] and [BCDEFGH]?
I did a PhD entitled "Temporal Sequence Processing in Neural Networks". It contains many ideas for solving exactly this type of question. You can download it here. Chapter 9 concerns the recognition of sequences, though it will probably refer to a great many things covered in earlier chapters, so I'm not sure you can read it on its own.
Yes, with Levenshtein distance.
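To make that concrete, here is a minimal Python sketch of the classic dynamic-programming edit distance (any off-the-shelf edit-distance library would do the same job):

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance:
    # prev[j] holds the distance between a[:i-1] and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

print(levenshtein("ABCDEFG", "ABCDEGF"))    # 2: the F/G swap costs two substitutions
print(levenshtein("ABCDEFGH", "BCDEFGH"))   # 1: one deletion
```

Both pairs from the question come out with a nonzero distance, so both are distinguishable.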
Recently, I was asked how to pre-train a deep neural network with unlabeled data, meaning that, instead of initializing the model weights with small random numbers, we set the initial weights from a model pretrained on unlabeled data.
Well, intuitively, I kinda get it: it probably helps with the vanishing gradient issue and shortens the training time when there isn't much labeled data available. But I still don't really know how it is done: how can you train a neural network with unlabeled data? Is it something like a SOM or a Boltzmann machine?
Has anybody heard about this? If so, can you provide some links to sources or papers? I am curious. Greatly appreciated!
There are lots of ways to deep-learn from unlabeled data. Layerwise pre-training was developed back in the 2000s by Geoff Hinton's group, though that's generally fallen out of favor.
More modern unsupervised deep learning methods include Auto-Encoders, Variational Auto-Encoders, and Generative Adversarial Networks. I won't dive into the details of all of them, but the simplest of these, auto-encoders, work by compressing an unlabeled input into a low dimensional real-valued representation, and using this compressed representation to reconstruct the original input. Intuitively, a compressed code that can effectively be used to recreate an input is likely to capture some useful features of said input. See here for an illustration and more detailed description. There are also plenty of examples implemented in your deep learning library of choice.
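To make the idea concrete, here is a minimal auto-encoder sketch in PyTorch (the framework, layer sizes, and toy data are my own assumptions for illustration, not taken from any particular paper):

```python
import torch
import torch.nn as nn

# Toy unlabeled data standing in for real inputs (e.g. flattened images).
X = torch.randn(256, 64)                               # 256 examples, 64 features

encoder = nn.Sequential(nn.Linear(64, 8), nn.ReLU())   # compress to 8 dimensions
decoder = nn.Sequential(nn.Linear(8, 64))              # try to rebuild the input

opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(100):
    opt.zero_grad()
    code = encoder(X)            # low-dimensional representation
    recon = decoder(code)        # reconstruction of the input
    loss = loss_fn(recon, X)     # reconstruction error -- no labels needed
    loss.backward()
    opt.step()

# After training, `encoder` could be reused to initialize the early
# layers of a supervised network, which is the pre-training idea.
```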
I guess in some sense any of the listed methods could be used as pre-training, e.g. for preparing a network for a discriminative task like classification, though I'm not aware of that being a particularly common practice. Initialization methods, activation functions, and other optimization tricks are generally advanced enough to do well without more complicated initialization procedures.
Can anybody tell whether these three models are based on the skip-gram or CBOW methodology?
Thanks in advance
The referenced paper, under those four "English word vectors" downloads, seems to only discuss their techniques with respect to the CBOW model, so it's probably that. (But other vectors from nearby pages, such as those labeled "Wiki word vectors", are clearly described as being skip-gram trained.)
But: once you've decided to trust some off-the-shelf vectors, does it really matter how they were trained? They're either good on your target tasks, or not.
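That said, checking the vectors yourself is cheap. A quick sketch with gensim (assuming the download is one of fastText's textual .vec files, which gensim reads as word2vec text format; the filename below is just an example):

```python
from gensim.models import KeyedVectors

# Example filename -- substitute whichever download you actually chose.
vecs = KeyedVectors.load_word2vec_format("wiki-news-300d-1M.vec", binary=False)

# Sanity-check against words from your own task:
print(vecs.most_similar("bank", topn=5))   # nearest neighbours
print(vecs.similarity("king", "queen"))    # cosine similarity
```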
Do you know if anyone has tried to compile high-level programming languages (Java, C#, etc.) into a recurrent neural network and then evolve them?
I mean that the whole process, including memory usage, is stored in the graph of a neural net, and I'm talking about complex programs (I'm thinking of natural language processing problems).
When I say neural net, I mean a directed weighted graph that spreads activation, where the nodes are functions of their inputs (linear, sigmoid, and multiplicative, to keep it simple).
Furthermore, is that what people mean in genetic programming or is there a difference?
Neural networks are not particularly well suited for evolving programs; their strength tends to be in classification. If anyone has tried, I haven't heard about it (which, considering I barely touch neural networks, is not a surprise; I am, however, active in the general AI field at the moment).
The main reason why neural networks aren't useful for generating programs is that they basically represent a mathematical equation (numeric, rather than functional). Given some numeric input, you get a numeric output. It is difficult to interpret these in the context of a program any more complicated than simple arithmetic.
Genetic Programming traditionally uses Lisp, a functional language, and programs are often shown as tree diagrams (which occasionally look similar to some neural network diagrams - is this the source of your confusion?). The programs are evolved by exchanging entire branches of a tree (a function and all its parameters) between programs, or by regenerating an entire branch randomly.
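As a toy illustration of that branch-exchange idea (a sketch of the general technique, not any particular GP system), programs can be represented as Lisp-style nested tuples and crossed over like this:

```python
import random

# Programs as Lisp-style nested tuples, e.g. ("+", "x", ("*", "x", "x")).

def all_paths(tree, path=()):
    # Yield the index path to every node in the tree.
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from all_paths(child, path + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, subtree):
    if not path:
        return subtree
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], subtree),) + tree[i + 1:]

def crossover(a, b):
    # Swap a randomly chosen branch of `a` with one of `b`.
    pa = random.choice(list(all_paths(a)))
    pb = random.choice(list(all_paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

parent1 = ("+", "x", ("*", "x", "x"))
parent2 = ("-", ("*", 2, "x"), 1)
print(crossover(parent1, parent2))
```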
There are certainly a lot of good (and a lot of bad) references on both of these topics out there - I refrain from listing them because it isn't clear what you are actually interested in. Wikipedia covers each of these techniques, and is a good starting point.
Genetic programming is very different from neural networks. What you are suggesting is more along the lines of genetic programming - making small random changes to a program, possibly "breeding" successful programs. It is not easy, and I have my doubts that it can be done successfully across a large program.
You may have more luck extracting a small but critical part of your program, one which has a few particular "aspects" (such as parameter values) that you can try to evolve.
Google is your friend.
Some sophisticated anti-virus programs, as well as sophisticated malware, use formal grammars and genetic operators to evolve against each other using neural networks.
Here is an example paper on the topic: http://nexginrc.org/nexginrcAdmin/PublicationsFiles/raid09-sadia.pdf
Sources: a class on Artificial Intelligence I took a couple of years ago.
With regards to your main question, no one has tried that on programming languages to the best of my knowledge, but there is some research in the field of evolutionary computation that could be compared to something like that (though it's obviously a far-fetched comparison). As a matter of possible interest, I asked a similar question about self-improving compilers a while ago.
For a difference between genetic algorithms and genetic programming, have a look at this question.
Neural networks have nothing to do with genetic algorithms or genetic programming, but you can obviously use either to evolve neural nets (as anything else, for that matter).
You could have a look at genetic-programming.org, where they claim to have produced some near-human-competitive results with genetic programming.
I have not heard of self-evolving and self-improving programs before. They may exist as special research tools, like those at genetic-programming.org, but there is nothing solid for generic use. And even if they exist, they are very limited to special-purpose operations like the malware detection Alain mentioned.
I am new to neural networks and I need to determine the pattern among a given set of inputs and outputs. How do I decide which neural network to use for training, or even which learning method to use? I have little idea about the pattern or relation between the given inputs and outputs.
Any sort of help will be appreciated. If you want me to read some stuff then it would be great if links are provided.
If any more info is needed, please say so.
Thanks.
Choosing the right neural network is something of an art form. It's difficult to give generic suggestions, as the best NN for a situation will depend on the problem at hand. As with many such problems, neural networks may or may not be the best solution. I'd highly recommend trying out different networks and testing their performance against a test data set. When I did this, I usually used the ANN tools through the R software package.
Also keep your mind open to other statistical learning techniques as well, things like decision trees and Support Vector Machines may be a better choice for some problems.
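As a concrete illustration of "try several models and test them" (shown with scikit-learn in Python purely as an example; the answer above used R's ANN tools):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for your own input/output pairs.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Compare a decision tree, an SVM, and a small neural network
# on the same held-out test set.
for model in (DecisionTreeClassifier(), SVC(), MLPClassifier(max_iter=1000)):
    model.fit(X_tr, y_tr)
    print(type(model).__name__, accuracy_score(y_te, model.predict(X_te)))
```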
I'd suggest the following books:
http://www.amazon.com/Neural-Networks-Pattern-Recognition-Christopher/dp/0198538642
http://www.stats.ox.ac.uk/~ripley/PRbook/#Contents
I'm quite new to this topic, so any help would be great. What I need is to optimize a neural network in MATLAB using a GA. My network has a [2x98] input and a [1x98] target. I've tried consulting the MATLAB help, but I'm still kind of clueless about what to do :( so any help would be appreciated. Thanks in advance.
Edit: I guess I didn't say what is to be optimized, as Dan noted in the first answer. I guess the most important thing is the number of hidden neurons, and maybe the number of hidden layers and training parameters like the number of epochs. Sorry for not providing enough info; I'm still learning about this.
If this is a homework assignment, do whatever you were taught in class.
Otherwise, ditch the MLP entirely. Support vector regression ( http://www.csie.ntu.edu.tw/~cjlin/libsvm/ ) is much more reliably trainable across a broad swath of problems, and it pretty much never runs into the stuck-in-a-local-minimum problem often hit with back-propagation-trained MLPs, which forces you to solve a network topology optimization problem just to find a network that will actually train.
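For reference, libsvm's regression mode is also exposed through scikit-learn, so a minimal sketch on data shaped like the question's looks like this (the data here are random stand-ins, and note that scikit-learn expects samples as rows, the transpose of MATLAB's [2x98] layout):

```python
import numpy as np
from sklearn.svm import SVR

X = np.random.rand(98, 2)   # stand-in for the real [2x98] inputs, transposed
y = np.random.rand(98)      # stand-in for the real [1x98] targets

model = SVR(kernel="rbf", C=1.0, epsilon=0.1)   # libsvm under the hood
model.fit(X, y)
print(model.predict(X[:5]))
```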
Well, you need to be more specific about what you are trying to optimize. Is it the size of the hidden layer? Do you have a hidden layer at all? Is it parameter optimization (learning rate, kernel parameters)?
I assume you have a set of parameters (number of hidden layers, number of neurons per layer...) that needs to be tuned. Instead of brute-force searching all combinations to pick a good one, a GA can help you "jump" from one combination to another, letting you "explore" the search space for potential candidates.
A GA can also help in selecting "helpful" features. Some features might be redundant and you may want to prune them, but the data may have too many features to search for the best subset with approaches such as forward selection. Again, a GA can "jump" from one candidate set to another.
You will need to find a way to encode the candidate solutions (input parameters, features...) fed to the GA. For finding a set of input parameters or a good set of features, binary encoding should work. In addition, choosing the operators the GA uses to produce offspring is important. And the GA itself needs tuning too (e.g. early stopping, which can also be applied to the ANN).
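As a toy sketch of binary-encoded GA feature selection (mutation-only to keep it short; a real GA would add crossover, and the dataset is a synthetic stand-in):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)
rng = np.random.default_rng(0)

def fitness(mask):
    # Score a candidate feature subset by cross-validated accuracy.
    if not mask.any():
        return 0.0
    return cross_val_score(DecisionTreeClassifier(), X[:, mask], y, cv=3).mean()

# Binary-encoded population: each individual is a feature mask.
pop = rng.random((20, X.shape[1])) < 0.5
for gen in range(15):
    scores = np.array([fitness(ind) for ind in pop])
    parents = pop[np.argsort(scores)[-10:]]        # keep the better half
    children = parents[rng.integers(0, 10, 10)].copy()
    flips = rng.random(children.shape) < 0.05      # mutation: flip a few bits
    children[flips] = ~children[flips]
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(ind) for ind in pop])]
print("selected features:", np.flatnonzero(best))
```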
Here are just some ideas. You might want to search for more info about GA, feature selection, ANN pruning...
Since you're using MATLAB already, I suggest you look into the Genetic Algorithm solver (known as GATool, part of the Global Optimization Toolbox) and the Neural Network Toolbox. Between those two, you should be able to save yourself quite a bit of figuring out.
You'll basically have to do 2 main tasks:
Come up with a representation (or encoding) for your candidate solutions
Code your fitness function (which basically tests candidate solutions) and pass it as a parameter to the GA solver; a sketch of the general shape follows this list.
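The general shape of such a fitness function, sketched in Python purely to illustrate the structure (MATLAB's GA solver expects the same kind of function handle; the encoding here, just the hidden-layer size, is an assumed example):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X = np.random.rand(98, 2)   # stand-ins shaped like the question's data
y = np.random.rand(98)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

def fitness(candidate):
    # Decode the candidate (here just [n_hidden_neurons]), train a
    # network with it, and return the validation error for the GA
    # solver to minimise.
    n_hidden = max(1, int(round(candidate[0])))
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), max_iter=2000, random_state=0)
    net.fit(X_tr, y_tr)
    return mean_squared_error(y_val, net.predict(X_val))

print(fitness([10]))   # score a candidate network with 10 hidden neurons
```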
If you need help in terms of coming up with a fitness function, or encoding of candidate solutions then you'll have to be more specific.
Hope it helps.
MATLAB has a simple but great explanation for this problem here. It explains both the ANN and the GA part.
For more info on using ANN in command line see this.
There is also plenty of literature on the subject if you google it. It is, however, not MATLAB-specific; it simply covers the methods and the results.
Look up Matthew Settles on Google Scholar. He did some work in this area at the University of Idaho in the last 5-6 years. He should have citations relevant to your work.