Dijkstra's algorithm Vs Uniform Cost Search (Time comlexity) - dijkstra

My question is as follows: According to different sources, Dijkstra's algorithm is nothing but a variant of Uniform Cost Search. We know that Dijkstra's algorithm finds the shortest path between a source and all destinations ( single-source ). However, we can always modify Dijkstra to find the the shortest path between a START and a GOAL state ( when the goal is popped from the priority queue, we simply stop); but doing so, the worst case scenario will be still finding the shortest path from START to all other nodes ( suppose the goal is the furthest node in the graph).
If we implement Dijkstra's algorithm using a min-priority heap, the running time will be
O(V log V +E) , where E is the number of edges and V the number of vertices.
Since Uniform Cost Search is the same as Dijkstra ( slightly different implementation), then the running time of UCS should be similar to Dijkstra, right? However, according to my AI class, Uniform Cost Search is exponential at the worst case, and it takes O(b1 + [C*/ε]), where C* is the cost of the optimal solution. ( b is the branching factor)
How can both algorithms be the same while they have different running times? Is the running time the same, but the way we look at it is different?
I would appreciate your help :):) Thank you

Is the running time the same, but the way we look at it is different?
Yes. Uniform cost search can be used on infinitely large graphs, on which Dijkstra's original algorithm would never terminate. In such situations, it's no use defining complexity in terms of V and E as both might be infinite and the resulting big-O figure meaningless.

Related

Solving non-convex optimization with global optimization algorithm using MATLAB

I have a simple unconstrained non-convex optimization problem. Since problems of these type have multiple local minima, I am looking for global optimization algorithm that yields a unique/global minimum. In the internet I came across global optimization algorithms like genetic algorithms, simulated annealing, etc but for solving a simple one variable unconstrained non-convex optimization problem, I think using these high level algorithms doesn't seem to be a good idea. Could anyone recommend me a simple global algorithm for solving such simple one variable unconstrained non-convex optimization problem? I would highly appreciate ideas on this.
"Since problems of these type have multiple local minima". It's not true, the real situation is the following:
Maybe you have one local minimum
Maybe you have infinite set of local miminums
Maybe you have finite number of local minimums
Maybe minimum is not attained
Maybe problem is unbounded below
Also big picture is that there are really true methods which really solve problems (numerically and they slow), but there is a slang to call method which is not nessesary find minumum value of function also call as "solve".
In fact M^n~M for any finite n and any infinite set M. So the fact that you problem has one dimension is nothing. It is still hard as problem with 1000000 parameters which are drawn from the set M from theoretical point of view.
If you interesting how approximately solve problem with known precision epsilon in domain - then split you domain into 1/espsilon regions, sample value(evalute function) at middle point, and select minimum
Method which I will describe below is precise method, and other methods: particle estimation, sequent.convex.programming, alternative direction, particle swarm, Neidler-Mead simplex method, mutlistart gradient/subgradient descend or any descend algorithm like Newton Method or coordinate descend, they all has no gurantess for non-convex problems and some of them even can no be applied if function is nonconvex.
If you interesting in really solve with some precision on function value then then you can take attention into method, which is called branch-and-bound and which truly found minimum, algorithms which you described I don't think so that they solve problem and find minimum in strong sense:
Basic idea of branch and bound - partition domain into convex sets and improve lower/upper bound, in your case it is intervals.
You should have a routine to find upper bound of optimal (min.) value: you can do it e.g. just by sampling subdomain and take smallest or use local optimization method start from random point.
But also you should have lower bound of optimal (min.) value by some principle and this is hard part:
convex relaxation of integer variables to make them real variables
use Lagrange Dual function
use Lipshitc constant on function, etc.
This is sophisticaed step.
If this two values are near - we're done in other case partion or refine partition.
Get info about lower and upper bound of child subproblems and then take min. of upper bounds and min. of lower bounds of children. If child returns more worse lower bound it can be upgraded by parent.
References:
For more great explanation please look into:
EE364B, Lecture 18, prof. Stephen Boyd, Stanford University. It's available on youtube and in ITunes University. If you new to this area I recommend you to look EE263, EE364A, EE364B courses of Stephen P. Boyd. You will love it
Since this is a one dimensional problem, things are easier.
A simple steepest descend procedure may be used as follows.
Suppose the interval of search is a<x<b.
Start the SD from a minimizing your function say f(x). You recover the first minimum Xm1. You should use a fine step, not too large.
Shift this point by adding a positive small constant Xm1+ε. Then maximize f or minimize -f, starting from this point. You get a max of f, you distort it by ε and start from there a minimization, and so on so forth.

Affinity propagation vs basic k-means algorithm

I have a dataset consists of (700 data points x 400 dimensions) which belong to 10 classes. I did cluster this data to see how data points will fit into clusters similar to their class. I performed two clustering experiments, one using basic k-means (euclidean) and another using Affinity Propagation. I noticed that the results using k-means are better and faster!! than the Affinity Propagation.
I could not understand the reason behind this. Can any of you help in giving explanation why I got such results (I thought Affinity Propagation is better than k-means)?
It could be a matter of granularity - the APC result could be close to a subclustering or superclustering of the class labels. There is a parameter that affects APC granularity (check yourself).
Another consideration is how you prepare the network that you give to APC (or any other network clustering algorithm). Ideally it should not be too dense. As a rough guideline, make sure that the distribution of { number of neighbours per node | all nodes } does not stray far outside [0.5 * sqrt(N) - 2.0 * sqrt(N)]. Especially try to avoid hubs, that is, nodes that have many more neighbours than that upper bound.
As a sanity check, are the values that you give to APC similarities? They should similarities be of course, not distances. You have a choice how the similarity is computed. The standard way to restrain the number of neighbours is to use a cut-off. Experiment with the combination of these. Finally you may also want to try MCL, an algorithm that precedes APC and uses conceptually similar principles but is a bit cleaner in its formulation (alternation of simple matrix operations). It is probably faster.

Why doesn't k-means give the global minima?

I read that the k-means algorithm only converges to a local minima and not to a global minima. Why is this? I can logically think of how initialization could affect the final clustering and there is a possibility of sub-optimum clustering, but I did not find anything that will mathematically prove that.
Also, why is k-means an iterative process?
Can't we just partially differentiate the objective function w.r.t. to the centroids, equate it to zero to find the centroids that minimizes this function? Why do we have to use gradient descent to reach the minimum step by step?
Consider:
. c .
. c .
Where c is a cluster centroid. The algorithm will stop, but a better solution is:
. .
c c
. .
With regards to a proof - You don't require a mathematical proof to prove that something isn't always true, you just need a single counter-example, as provided above. You can probably convert the above into a mathematical proof, but this is unnecessary and generally requires a lot of work; even in academia it is accepted to merely give a counter-example to disprove something.
The k-means algorithm is by definition an iterative process, it's simply the way it works. The problem of clustering is NP-hard, thus using an exact algorithm to calculate the centroids would take immensely long.
Don't mix the problem and the algorithm.
The k-means problem is finding the least-squares assignment to centroids.
There are multiple algorithms for finding a solution.
There is an obvious approach to find the global optimum: enumerating all k^n possible assignments - that will yield a global minimum, but in exponential runtime.
Much more attention was put to finding an approximate solution in faster time.
The Lloyd/Forgy algorithm is an EM-style iterative model refinement approach, that is guaranteed to converge to a local minimum simply because there is a finite number of states, and the objective function must decrease in every step. This algorithm runs in O(n*k*i) where i << n usually, but it may find a local minimum only.
The MacQueens method is technically not iterative. It's a single-pass, one-element-at-a-time algorithm that will not even find a local minimum in the Lloyd sense. (You can however run it multiple passes over the data set, until convergence, to get a local minimum too!) If you do a single pass, its in O(n*k), for multiple passes add i. It may or may not take more passes than Lloyd.
Then there is Hartigan and Wong. I don't remember the details, IIRC it was a clever, more lazy, variant of Lloyd/Forgy, so probably in O(n*k*i), too (although probably not recomputing all n*k distances for later iterations?)
You could also do a randomized alogrithm that just tests l random assignments. It probably won't find a minimum at all, but run in "linear" time O(n*l).
Oh, and you can try different random initializations, to improve your chances of finding the global minimum. Add a factor t for the number of trials...

Optimal solution for: All possible acyclic paths in a graph problem

I am dealing with undirected graph. I need to find all possible acyclic paths within a graph:
with G(V,E)
find all subsets of V that are acyclic paths
I am using either python scipy or matlab - whichever would be appropriate.
Is there any clever solution for this?
I'm trying to achieve it with a breadth-first search (see wiki)
I also have this toolbox in matlab: http://www.mathworks.com/matlabcentral/fileexchange/4266-grtheory-graph-theory-toolbox but it seems there's no straightforward solution for my problem.
PS. The problem practically is stated as: Transit Network Design Problem: Find such a transport network that minimizes cost of passangers and operators (i.e. optimal subway network for urban area)
Thanks in advance
Rafal
I think the problem as stated in your PS may be a NP problem. If so, there are straightforward solutions only for graphs with very limited numbers of nodes (N ~ <= 20). Other solutions will be approximate, giving rise to only local optimums. The solution to your problem as stated in the question will simply be to calculate all the permutations of the node orders. Again this will become computationally infeasible with comparatively low numbers of nodes (possibly higher than 20 but not much).
Do you need only the shortest paths between all pairs of vertices, or really all paths?

Why can't we apply Dijkstra's algorithm for a graph with negative weights?

Why can't we apply Dijkstra's algorithm for a graph with negative weights?
What does it mean to find the least expensive path from A to B, if every time you travel from C to D you get paid?
If there is a negative weight between two nodes, the "shortest path" is to loop backwards and forwards between those two nodes forever. The more hops, the "shorter" the path gets.
This is nothing to do with the algorithm, and all to do with the impossibility of answering such a question.
Edit:
The above claim assumes bidirectional links. If there is no cycles which have an overall negative weight, you do not have a way to loop around forever, being paid.
In such a case, Dijkstra's algorithm may still fail:
Consider two paths:
an optimal path that racks up a cost of 100, before crossing the final edge which has a -25 weight, giving a total of 75, and
a suboptimal path that has no negatively-weighted edges with a total cost of 90.
Dijkstra's algorithm will investigate the suboptimal path first, and will declare itself finished when it finds it. It will never follow up the subpath that is worse than the first solution found
I will give you an counterexample. Consider following graph
http://img853.imageshack.us/img853/7318/3fk8.png
Suppose you begun in vertex A and you want shortest path to D. Dijkstra's algorithm would do following steps:
Mark A as visited and add vertices B and C to queue
Fetch from queue vertex with minimal distance. It is B
Mark B as visited and add vertex D to queue.
Fetch from queue. Not it is vertex D.
Mark D as visited
Dijkstra says shortest path from A to D has length 2 but it is obviously not true.
Imagine you had a directed graph in it with a directed cycle, and the total "distance" around that was a negative weight. If on your way from the Start to the End vertex you could pass through that directed cycle, you could simply go around and around the directed cycle an arbitrary number of times.
And that means you could make you path across the graph have an infinitely negative distance (or effectively so).
However, as long as there are no directed cycles around your graph, you could get away with using Dijkstra's Algorithm without anything exploding on you.
All that being said, there if you have a graph with negative weights, you could use the Belman-Ford algorithm. Because of the generality of this algorithm, however, it is a bit slower. The Bellman-Ford algorithm takes O(V·E), where the Dijkstra's takes O(E + VlogV) time