Brute Force and Minimax/AlphaBeta pruning - brute-force

Brute Force is essentially just searching every possible combination but how is minimax any different? Minimax searches every combination as well and then returns the best score?
I understand that when we used alpha beta pruning we take out ones that will have no effect on our min/max value but this occurs after we have already performed minimax so would this be considered brute force? Maybe I'm misunderstanding what I've been reading so far, so any help would be great!
Thank You!

Alpha-beta pruning does not occur "after" minimax, it occurs "during" minimax. And the pruning does in fact eliminate a vast number of "combinations" -- down to sqrt of the brute-force minimax number in the best case, n^(3/4) in the average case.

Minimax is a much more efficient way to walk through the decision tree than brute force. As only nodes that might still contain a better solution are visited, in contrary to brute force where all nodes are visited, without pruning of any kind.
Wiki:
Heuristics can also be used to make an early cutoff of parts of the search. One example of this is the minimax principle for searching game trees, that eliminates many subtrees at an early stage in the search.
In Minimax you cut parts of the tree depending on the available moves and their score, meaning that you would not go over all the possibilities, and this makes it superior to brute-force approach.

Related

Why use Crossover in Neural Network training?

Why specifically is it used?
I know it increases variation which may help explore the problem space, but how much does it increase the probability of finding the optimal solution/configuration in time? And does it do anything else advantageous?
And does it necessarily always help, or are there instances in which it would increase the time taken to find the optimal solution?
As Patrick Trentin said, crossover improve the speed of convergence, because it allows to combine good genes that are already found in the population.
But, for neuro-evolution, crossover is facing the "permutation problem", also known as "the competing convention problem". When two parents are permutations of the same network, then, except in rare cases, their offspring will always have a lower fitness. Because the same part of the network is copied in two different locations, and so the offspring is losing viable genes for one of these two locations.
for example the networks A,B,C,D and D,C,B,A that are permutations of the same network. The offspring can be:
A,B,C,D (copy of parent 1)
D,C,B,A (copy of parent 2)
A,C,B,D OK
A,B,C,A
A,B,B,A
A,B,B,D
A,C,B,A
A,C,C,A
A,C,C,D
D,B,C,A OK
D,C,B,D
D,B,B,A
D,B,B,D
D,B,C,D
D,C,C,A
D,C,C,D
So, for this example, 2/16 of the offspring are copies of the parents. 2/16 are combinations without duplicates. And 12/16 have duplicated genes.
The permutation problem occurs because networks that are permutations one of the other have the same fitness. So, even for an elitist GA, if one is selected as parent, the other will also often be selected as parent.
The permutations may be only partial. In this case, the result is better than for complete permutations, but the offspring will, in a lot of cases, still have a lower fitness than the parents.
To avoid the permutation problem, I heard about similarity based crossover, that compute similarity of neurons and their connected synapses, doing the crossing-over between the most similar neurons instead of a crossing-over based on the locus.
When evolving topology of the networks, some NEAT specialists think the permutation problem is part of a broader problem: "the variable lenght genome problem". And NEAT seems to avoid this problem by speciation of the networks, when two networks are too differents in topology and weights, they aren't allowed to mate. So, NEAT algorithm seems to consider permuted networks as too different, and doesn't allow them to mate.
A website about NEAT also says:
However, in another sense, one could say that the variable length genome problem can never be "solved" because it is inherent in any system that generates different constructions that solve the same problem. For example, both a bird and a bat represent solutions to the problem of flight, yet they are not compatible since they are different conventions of doing the same thing. The same situation can happen in NEAT, where very different structures might arise that do the same thing. Of course, such structures will not mate, avoiding the serious consequence of damaged offspring. Still, it can be said that since disparate representations can exist simultaneously, incompatible genomes are still present and therefore the problem is not "solved." Ultimately, it is subjective whether or not the problem has been solved. It depends on what you would consider a solution. However, it is at least correct to say, "the problem of variable length genomes is avoided."
Edit: To answer your comment.
You may be right for similarity based crossover, I'm not sure it totally avoids the permutation problem.
About the ultimate goal of crossover, without considering the permutation problem, I'm not sure it is useful for the evolution of neural networks, but my thought is: if we divide a neural network in several parts, each part contributes to the fitness, so two networks with a high fitness may have different good parts. And combining these parts should create an even better network. Some offspring will of course inherit the bad parts, but some other offspring will inherit the good parts.
Like Ray suggested, it could be useful to experiment the evolution of neural networks with and without crossover. As there is randomness in the evolution, the problem is to run a large number of tests, to compute the average evolution speed.
About evolving something else than a neural network, I found a paper that says an algorithm using crossover outperforms a mutation-only algorithm for solving the all-pairs shortest path problem (APSP).
Edit 2:
Even if the permutation problem seems to be only applicable to some particular problems like neuro-evolution, I don't think we can say the same about crossover, because maybe we are missing something about the problems that don't seem to be suitable for crossover.
I found a free version of the paper about similarity based crossover for neuro-evolution, and it shows that:
an algorithm using a naive crossover performs worse than a mutation-only algorithm.
using similarity based crossover it performs better than a mutation-only algorithm for all tested cases.
NEAT algorithm sometimes performs better than a mutation-only algorithm.
Crossover is complex and I think there is a lack of studies that compare it with mutation-only algorithms, maybe because its usefulness highly depends:
of its engineering, in function of particular problems like the permutation problem. So of the type of crossover we use (similarity based, single point, uniform, edge recombination, etc...).
And of the mating algorithm. For example, this paper shows that a gendered genetic algorithm strongly outperforms a non-gendered genetic algorithm for solving the TSP. For solving two other problems, the algorithm doesn't strongly outperforms, but it is better than the non-gendered GA. In this experiment, males are selected on their fitness, and females are selected on their ability to produce a good offspring. Unfortunately, this study doesn't compare the results with a mutation-only algorithm.

Best Method to Intersect Huge HyperLogLogs in Redis

The problem is simple: I need to find the optimal strategy to implement accurate HyperLogLog unions based on Redis' representation thereof--this includes handling their sparse/dense representations if the data structure is exported for use elsewhere.
Two Strategies
There are two strategies, one of which seems vastly simpler. I've looked at the actual Redis source and I'm having a bit of trouble (not big in C, myself) figuring out whether it's better from a precision and efficiency perspective to use their built-in structures/routines or develop my own. For what it's worth, I'm willing to sacrifice space and to some degree errors (stdev +-2%) in the pursuit of efficiency with extremely large sets.
1. Inclusion Principle
By far the simplest of the two--essentially I would just use the lossless union (PFMERGE) in combination with this principle to calculate an estimate of the overlap. Tests seem to show this running reliably in many cases, although I'm having trouble getting an accurate handle on in-the-wild efficiency and accuracy (some cases can produce errors of 20-40% which is unacceptable in this use case).
Basically:
aCardinality + bCardinality - intersectionCardinality
or, in the case of multiple sets...
aCardinality + (bCardinality x cCardinality) - intersectionCardinality
seems to work in many cases with good accuracy, but I don't know if I trust it. While Redis has many built-in low-cardinality modifiers designed to circumvent known HLL issues, I don't know if the issue of wild inaccuracy (using inclusion/exclusion) is still present with sets of high disparity in size...
2. Jaccard Index Intersection/MinHash
This way seems more interesting, but a part of me feels like it may computationally overlap with some of Redis' existing optimizations (ie, I'm not implementing my own HLL algorithm from scratch).
With this approach I'd use a random sampling of bins with a MinHash algorithm (I don't think an LSH implementation is worth the trouble). This would be a separate structure, but by using minhash to get the Jaccard index of the sets, you can then effectively multiply the union cardinality by that index for a more accurate count.
Problem is, I'm not very well versed in HLL's and while I'd love to dig into the Google paper I need a viable implementation in short order. Chances are I'm overlooking some basic considerations either of Redis' existing optimizations, or else in the algorithm itself that allows for computationally-cheap intersection estimates with pretty lax confidence bounds.
thus, my question:
How do I most effectively get a computationally-cheap intersection estimate of N huge (billions) sets, using redis, if I'm willing to sacrifice space (and to a small degree, accuracy)?
Read this paper some time back. Will probably answer most of your questions. Inclusion Principle inevitably compounds error margins a large number of sets. Min-Hash approach would be the way to go.
http://tech.adroll.com/media/hllminhash.pdf
There is a third strategy to estimate the intersection size of any two sets given as HyperLogLog sketches: Maximum likelihood estimation.
For more details see the paper available at
http://oertl.github.io/hyperloglog-sketch-estimation-paper/.

How many and which parents should we select for crossover in genetic algorithm

I have read many tutorials, papers and I understood the concept of Genetic Algorithm, but I have some problems to implement the problem in Matlab.
In summary, I have:
A chromosome containing three genes [ a b c ] with each gene constrained by some different limits.
Objective function to be evaluated to find the best solution
What I did:
Generated random values of a, b and c, say 20 populations. i.e
[a1 b1 c1] [a2 b2 c2]…..[a20 b20 c20]
At each solution, I evaluated the objective function and ranked the solutions from best to worst.
Difficulties I faced:
Now, why should we go for crossover and mutation? Is the best solution I found not enough?
I know the concept of doing crossover (generating random number, probability…etc) but which parents and how many of them will be selected to do crossover or mutation?
Should I do the crossover for the entire 20 solutions (parents) or only two of them?
Generally a Genetic Algorithm is used to find a good solution to a problem with a huge search space, where finding an absolute solution is either very difficult or impossible. Obviously, I don't know the range of your values but since you have only three genes it's likely that a good solution will be found by a Genetic Algorithm (or a simpler search strategy at that) without any additional operators. Selection and Crossover is usually carried out on all chromosome in the population (although it's not uncommon to carry some of the best from each generation forward as is). The general idea is that the fitter chromosomes are more likely to be selected and undergo crossover with each other.
Mutation is usually used to stop the Genetic Algorithm prematurely converging on a non-optimal solution. You should analyse the results without mutation to see if it's needed. Mutation is usually run on the entire population, at every generation, but with a very small probability. Giving every gene 0.05% chance that it will mutate isn't uncommon. You usually want to give a small chance of mutation, without it completely overriding the results of selection and crossover.
As has been suggested I'd do a lit bit more general background reading on Genetic Algorithms to give a better understanding of its concepts.
Sharing a bit of advice from 'Practical Neural Network Recipies in C++' book... It is a good idea to have a significantly larger population for your first epoc, then your likely to include features which will contribute to an acceptable solution. Later epocs which can have smaller populations will then tune and combine or obsolete these favourable features.
And Handbook-Multiparent-Eiben seems to indicate four parents are better than two. However bed manufactures have not caught on to this yet and seem to only produce single and double-beds.

Alternatives to Brute Force Search (Language Unspecific)

I am using a brute force method to optimize a solution in one of my recent projects and it is working quite well. Basically the optimization process involves searching for a global maximum in the space of all possible solutions. I was curious if there are other techniques which can be used to speed up a brute force search or other methods entirely. This is an area that I have little experience in but, as I said, I am quite curious.
Genetic algorithms are a good way to find maximums, even when is not possible to test all solutions.
It's a wide spread technique and there are implementations in very programming languages.
Simulated annealing is useful for solving local maxima problems, but is not always guaranteed to find the global maxima. It basically uses random 'jumps' in an attempt to find a better location/value than its current, and this can speed up searches.

What's a genetic algorithm that would produce interesting/surprising results and not have a boring/obvious end point?

I find genetic algorithm simulations like this to be incredibly entrancing and I think it'd be fun to make my own. But the problem with most simulations like this is that they're usually just hill climbing to a predictable ideal result that could have been crafted with human guidance pretty easily. An interesting simulation would have countless different solutions that would be significantly different from each other and surprising to the human observing them.
So how would I go about trying to create something like that? Is it even reasonable to expect to achieve what I'm describing? Are there any "standard" simulations (in the sense that the game of life is sort of standardized) that I could draw inspiration from?
Depends on what you mean by interesting. That's a pretty subjective term. I once programmed a graph analyzer for fun. The program would first let you plot any f(x) of your choice and set the bounds. The second step was creating a tree holding the most common binary operators (+-*/) in a random generated function of x. The program would create a pool of such random functions, test how well they fit to the original curve in question, then crossbreed and mutate some of the functions in the pool.
The results were quite cool. A totally weird function would often be a pretty good approximation to the query function. Perhaps not the most useful program, but fun nonetheless.
Well, for starters that genetic algorithm is not doing hill-climbing, otherwise it would get stuck at the first local maxima/minima.
Also, how can you say it doesn't produce surprising results? Look at this vehicle here for example produced around generation 7 for one of the runs I tried. It's a very old model of a bicycle. How can you say that's not a surprising result when it took humans millennia to come up with the same model?
To get interesting emergent behavior (that is unpredictable yet useful) it is probably necessary to give the genetic algorithm an interesting task to learn and not just a simple optimisation problem.
For instance, the Car Builder that you referred to (although quite nice in itself) is just using a fixed road as the fitness function. This makes it easy for the genetic algorithm to find an optimal solution, however if the road would change slightly, that optimal solution may not work anymore because the fitness of a solution may have grown dependent on trivially small details in the landscape and not be robust to changes to it. In real, cars did not evolve on one fixed test road either but on many different roads and terrains. Using an ever changing road as the (dynamic) fitness function, generated by random factors but within certain realistic boundaries for slopes etc. would be a more realistic and useful fitness function.
I think EvoLisa is a GA that produces interesting results. In one sense, the output is predictable, as you are trying to match a known image. On the other hand, the details of the output are pretty cool.