Why can't we apply Dijkstra's algorithm for a graph with negative weights?

Why can't we apply Dijkstra's algorithm for a graph with negative weights?

What does it mean to find the least expensive path from A to B, if every time you travel from C to D you get paid?
If there is a negative weight between two nodes, the "shortest path" is to loop backwards and forwards between those two nodes forever. The more hops, the "shorter" the path gets.
This has nothing to do with the algorithm, and everything to do with the impossibility of answering such a question.
The above claim assumes bidirectional links. If there are no cycles with an overall negative weight, you do not have a way to loop around forever, being paid.
In such a case, Dijkstra's algorithm may still fail:
Consider two paths:
an optimal path that racks up a cost of 100, before crossing the final edge which has a -25 weight, giving a total of 75, and
a suboptimal path that has no negatively-weighted edges with a total cost of 90.
Dijkstra's algorithm will reach the destination via the suboptimal path first (cost 90) and will declare itself finished at that point. It will never extend the partial path whose cost of 100 is already worse than the first solution found, so it never sees the final -25 edge.

I will give you a counterexample. Consider the following graph.
Suppose you begin in vertex A and you want the shortest path to D. Dijkstra's algorithm would perform the following steps:
Mark A as visited and add vertices B and C to the queue.
Fetch the vertex with minimal distance from the queue. It is B.
Mark B as visited and add vertex D to the queue.
Fetch from the queue. Now it is vertex D.
Mark D as visited.
Dijkstra says the shortest path from A to D has length 2, but it is obviously not true.
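The steps above can be reproduced with a small sketch. The edge weights are not given in the text, so the ones below (A->B = 1, A->C = 3, B->D = 1, C->D = -2, all directed) are assumptions chosen to match the listed steps; the function name `dijkstra_early_exit` is also just for illustration.

```python
import heapq

def dijkstra_early_exit(graph, source, target):
    """Textbook Dijkstra that stops as soon as the target is marked visited."""
    dist = {source: 0}
    visited = set()
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if u in visited:
            continue
        visited.add(u)              # "mark u as visited"
        if u == target:
            return d                # declared final, never revised afterwards
        for v, w in graph.get(u, []):
            if v not in visited and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(queue, (d + w, v))
    return None

# Assumed weights (hypothetical, see the note above).
graph = {
    "A": [("B", 1), ("C", 3)],
    "B": [("D", 1)],
    "C": [("D", -2)],
}

reported = dijkstra_early_exit(graph, "A", "D")
true_cost = 3 + (-2)                # the path A -> C -> D
print(reported, true_cost)          # reported is 2, the true cost is 1
```

D is finalised at distance 2 while C is still waiting in the queue, so the negative edge C->D is never looked at.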

Imagine you had a directed graph with a directed cycle in it, and the total "distance" around that cycle was negative. If on your way from the Start to the End vertex you could pass through that directed cycle, you could simply go around and around it an arbitrary number of times.
And that means you could make your path across the graph have an infinitely negative distance (or effectively so).
However, as long as there are no directed cycles in your graph, you could get away with using Dijkstra's algorithm without anything exploding on you, though with negative edges the answer it returns is still not guaranteed to be the shortest path.
All that being said, if you have a graph with negative weights, you could use the Bellman-Ford algorithm. Because of the generality of this algorithm, however, it is a bit slower. The Bellman-Ford algorithm takes O(V·E) time, whereas Dijkstra's takes O(E + V log V) time (with a Fibonacci heap).
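For reference, here is a minimal Bellman-Ford sketch. The edge-list representation and the name `bellman_ford` are my own choices for illustration; it relaxes every edge up to V-1 times and then does one more pass to detect a negative cycle reachable from the source.

```python
def bellman_ford(vertices, edges, source):
    """Return shortest distances from source; edges are (u, v, weight) triples.

    Raises ValueError if a negative cycle is reachable from the source.
    """
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:             # nothing improved, so we can stop early
            break
    # If any edge can still be relaxed, some cycle has negative total weight.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from source")
    return dist

# A small directed example with one negative edge and no negative cycle.
edges = [("A", "B", 1), ("A", "C", 3), ("B", "D", 1), ("C", "D", -2)]
print(bellman_ford("ABCD", edges, "A")["D"])   # 1, via A -> C -> D
```

The nested loops over V-1 rounds and E edges are where the O(V·E) bound comes from.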


Dijkstra's algorithm when all edges have same weight

If all edges had the same weight in a given graph, will Dijkstra's algorithm still find the shortest path between 2 vertices?
Yes, Dijkstra's algorithm can find the shortest path even when all edges have the same weight. With a binary heap, Dijkstra has time complexity O((V+E) log V). You should choose BFS instead to do the same thing, because BFS has time complexity O(V+E), so BFS is asymptotically faster than Dijkstra.
Yes, it would. But you might want to take a look at breadth-first search, which solves the case you are referring to.
To find the path, you can write a recursive function that starts at the destination node, with flagged distance n, and moves to one of the neighbour nodes with flagged distance n-1.
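A rough sketch of that idea, assuming an undirected graph stored as an adjacency dict (the names `bfs_distances` and `walk_back` are made up for this example). I used a loop instead of recursion for the walk back, but the logic is the same: from a node flagged n, step to any neighbour flagged n-1.

```python
from collections import deque

def bfs_distances(graph, source):
    """Flag every reachable node with its hop count from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def walk_back(graph, dist, target):
    """Rebuild one shortest path by walking from target down to distance 0."""
    if target not in dist:
        return None                  # target is unreachable
    path = [target]
    while dist[path[-1]] > 0:
        current = path[-1]
        # Any neighbour flagged with distance n-1 lies on a shortest path.
        path.append(next(v for v in graph[current]
                         if dist.get(v) == dist[current] - 1))
    return path[::-1]

graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
dist = bfs_distances(graph, "A")
print(dist["E"], walk_back(graph, dist, "E"))
```

If the edges share a weight w other than 1, multiply the hop count by w to get the path length.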

Clustering words into groups

This is a Homework question. I have a huge document full of words. My challenge is to classify these words into different groups/clusters that adequately represent the words. My strategy to deal with it is using the K-Means algorithm, which as you know takes the following steps.
Generate k random means for the entire group
Create K clusters by associating each word with the nearest mean
Compute centroid of each cluster, which becomes the new mean
Repeat Step 2 and Step 3 until a certain benchmark/convergence has been reached.
Theoretically, I kind of get it, but not quite. I think at each step, I have questions that correspond to it, these are:
How do I decide on k random means? Technically I could say 5, but that may not necessarily be a good number. So is this k purely a random number, or is it actually driven by heuristics such as the size of the dataset, the number of words involved, etc.?
How do you associate each word with the nearest mean? Theoretically I can conclude that each word is associated by its distance to the nearest mean, hence if there are 3 means, any word that belongs to a specific cluster is dependent on which mean it has the shortest distance to. However, how is this actually computed? Between two words "group", "textword" and assume a mean word "pencil", how do I create a similarity matrix.
How do you calculate the centroid?
When you repeat step 2 and step 3, you are assuming each previous cluster as a new data set?
Lots of questions, and I am obviously not clear. If there are any resources that I can read from, it would be great. Wikipedia did not suffice :(
As you don't know the exact number of clusters, I'd suggest you use a kind of hierarchical clustering:
Imagine that all your words are just points in a non-Euclidean space. Use Levenshtein distance to calculate the distance between words (it works great if you want to detect clusters of lexicographically similar words).
Build a minimum spanning tree which contains all of your words.
Remove the links whose length is greater than some threshold.
The groups of words that remain linked are clusters of similar words.
Here is a small illustration:
P.S. You can find many papers on the web that describe clustering based on building a minimum spanning tree.
P.P.S. If you want to detect clusters of semantically similar words, you need some algorithm for automatic thesaurus construction.
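The four steps above can be sketched roughly as follows. This is only an illustration under my own assumptions: the words are distinct, the tree is built Kruskal-style with a union-find, and the function names and the threshold value are made up. Stopping Kruskal at the threshold gives the same groups as building the full tree and then cutting the long links.

```python
from itertools import combinations

def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]

def mst_clusters(words, threshold):
    """Kruskal's MST, keeping only links no longer than threshold."""
    parent = {w: w for w in words}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]   # path halving
            w = parent[w]
        return w

    edges = sorted((levenshtein(a, b), a, b) for a, b in combinations(words, 2))
    for d, a, b in edges:
        if d > threshold:
            break                           # every remaining link is too long
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb                 # this link joins the spanning forest
    clusters = {}
    for w in words:
        clusters.setdefault(find(w), []).append(w)
    return sorted(sorted(c) for c in clusters.values())

words = ["cat", "bat", "rat", "house", "mouse", "pencil"]
print(mst_clusters(words, threshold=1))
```

Computing all pairwise distances is quadratic in the number of words, so for a huge document you would probably want to deduplicate the words first.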
That you have to choose "k" for k-means is one of the biggest drawbacks of k-means.
However, if you use the search function here, you will find a number of questions that deal with the known heuristic approaches to choosing k, mostly by comparing the results of running the algorithm multiple times.
As for "nearest": k-means actually does not use distances. Some people believe it uses Euclidean distance, others say it is squared Euclidean. Technically, what k-means is interested in is the variance. It minimizes the overall variance by assigning each object to the cluster such that the variance is minimized. Coincidentally, the sum of squared deviations over all dimensions (one object's contribution to the total variance) is exactly the definition of squared Euclidean distance. And since the square root is monotone, you can also use Euclidean distance instead.
Anyway, if you want to use k-means with words, you first need to represent the words as vectors for which the squared Euclidean distance is meaningful. I don't think this will be easy, and it may not even be possible.
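To make the assignment and centroid steps from the question concrete, here is a minimal sketch of the usual k-means loop on plain numeric tuples (not words). It assumes the initial means are handed in by the caller, and the names `squared_distance`, `centroid` and `k_means` are just for illustration.

```python
def squared_distance(p, q):
    """Sum of squared per-dimension deviations between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Per-dimension average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def k_means(points, means, max_iter=100):
    means = [tuple(m) for m in means]
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign every point of the whole data set to its nearest mean.
        clusters = [[] for _ in means]
        for p in points:
            nearest = min(range(len(means)),
                          key=lambda i: squared_distance(p, means[i]))
            clusters[nearest].append(p)
        # Step 3: each mean moves to the centroid of its cluster
        # (an empty cluster keeps its old mean).
        new_means = [centroid(c) if c else m for c, m in zip(clusters, means)]
        if new_means == means:      # converged: nothing moved
            break
        means = new_means
    return means, clusters

points = [(0, 0), (0, 2), (10, 0), (10, 2)]
means, clusters = k_means(points, means=[(0, 0), (10, 0)])
print(means)    # [(0.0, 1.0), (10.0, 1.0)]
```

Note that every iteration reassigns all points of the original data set; the previous clusters are not treated as new data sets.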
About the distance: in fact, Levenshtein (or edit) distance satisfies the triangle inequality. It also satisfies the rest of the properties required of a metric (not all distance functions are metrics). Therefore you can implement a clustering algorithm using this metric, and this is the function you could use to compute your similarity matrix S:
S_{i,j} = d(x_i, x_j) = S_{j,i} = d(x_j, x_i)
It is worth mentioning that the restricted Damerau-Levenshtein distance (optimal string alignment) doesn't satisfy the triangle inequality, so be careful with it.
About the k-means algorithm: yes, in the basic version you must define the K parameter by hand. The rest of the algorithm is the same for a given metric.

Find K-farthest neighbors in a weighted graph in matlab

I want to find the K-farthest neighbors in a given undirected weighted graph (the graph is given as a sparse weight matrix, but I can use any representation advised).
Just to make sure the problem is well-defined: I want to find k nodes which have maximal distance from one another.
Solutions that are close to the optimal set are also ok - I just need it to find some farthest points in a mesh :)
Assuming you are just looking for a decent solution, I would recommend a simple approach similar to the "furthest insertion" starting position for the travelling salesman problem:
Add 1 point to the empty set, preferably one in a corner or on an edge (of course, you can just try all of them).
Add the furthest point to the set (the one that increases the distance from the points currently in the set the most).
Keep repeating the previous step until there are k points in the set.
It will not be optimal but probably not very bad.
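A rough sketch of the greedy procedure, in Python rather than MATLAB. It assumes you already have a full pairwise distance matrix `dist` (for a graph, for example, all-pairs shortest path lengths), and it reads "furthest" as "largest sum of distances to the points already chosen". Both the function name and that reading are my assumptions; maximising the minimum distance is another common choice.

```python
from math import dist as euclid

def greedy_farthest(dist, k, start=0):
    """Pick k indices greedily, each maximising summed distance to those chosen."""
    if k > len(dist):
        raise ValueError("k is larger than the number of points")
    chosen = [start]
    while len(chosen) < k:
        candidates = [i for i in range(len(dist)) if i not in chosen]
        best = max(candidates, key=lambda i: sum(dist[i][j] for j in chosen))
        chosen.append(best)
    return chosen

# Hypothetical example: five points in the plane, Euclidean distances.
points = [(0, 0), (1, 1), (10, 0), (0, 8), (9, 9)]
dist = [[euclid(p, q) for q in points] for p in points]
print(greedy_farthest(dist, 3))   # [0, 4, 2]
```

Trying every possible `start` and keeping the best result costs a factor of n more, which is usually affordable.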
If you want to improve on this you could use a heuristic to improve on the result, for example:
Consider the set with point 1 to j left out, j
Try all possible points to substitute these j points
record best possible solution
Consider the set with point 2 to j+1 left out
Furthermore, if k is not too large, say less than 5, and the total number of points is not too large, say less than 100, it will probably be easier to just evaluate all possible combinations. This assumes that the norm calculation can be done efficiently.
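For that small case, a brute-force sketch could look like this. It again assumes a precomputed distance matrix and scores a subset by the sum of its pairwise distances; `best_subset` is a made-up name.

```python
from itertools import combinations

def best_subset(dist, k):
    """Try every k-subset and keep the one with the largest pairwise sum."""
    def total(subset):
        return sum(dist[i][j] for i, j in combinations(subset, 2))
    return max(combinations(range(len(dist)), k), key=total)

# Hypothetical example: four points on a line at positions 0, 1, 5, 10.
xs = [0, 1, 5, 10]
dist = [[abs(a - b) for b in xs] for a in xs]
print(best_subset(dist, 2))   # (0, 3): the two end points
```

With n = 100 and k = 4 this is about 3.9 million subsets, which is still feasible.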
Once you know you want to implement this, the regular way to continue is to find something similar and edit it a bit to suit your needs. If you scroll down on this page you should find an example of furthest insertion. Editing it to follow your measure of 'far' should be manageable.

Dijkstra's algorithm vs Uniform Cost Search (time complexity)

My question is as follows: according to different sources, Dijkstra's algorithm is nothing but a variant of Uniform Cost Search. We know that Dijkstra's algorithm finds the shortest path between a source and all destinations (single-source). However, we can always modify Dijkstra to find the shortest path between a START and a GOAL state (when the goal is popped from the priority queue, we simply stop); but even so, the worst case will still be finding the shortest path from START to all other nodes (suppose the goal is the furthest node in the graph).
If we implement Dijkstra's algorithm using a min-priority heap, the running time will be
O(V log V +E) , where E is the number of edges and V the number of vertices.
Since Uniform Cost Search is the same as Dijkstra (with a slightly different implementation), the running time of UCS should be similar to Dijkstra's, right? However, according to my AI class, Uniform Cost Search is exponential in the worst case and takes O(b^(1 + ⌊C*/ε⌋)), where b is the branching factor, C* is the cost of the optimal solution, and ε is a lower bound on the cost of a single step.
How can both algorithms be the same while they have different running times? Is the running time the same, but the way we look at it is different?
I would appreciate your help :):) Thank you
Is the running time the same, but the way we look at it is different?
Yes. Uniform cost search can be used on infinitely large graphs, on which Dijkstra's original algorithm would never terminate. In such situations, it's no use defining complexity in terms of V and E as both might be infinite and the resulting big-O figure meaningless.
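To illustrate the point about infinite graphs, here is a minimal uniform cost search sketch in which the graph is never stored: it is given only through a successor function. The toy state space (positive integers, "+1" costs 1, "×2" costs 2) and all names are assumptions made up for this example.

```python
import heapq

def uniform_cost_search(start, is_goal, successors):
    """Return the cheapest cost from start to any goal state, or None."""
    frontier = [(0, start)]
    best = {start: 0}               # cheapest known cost per discovered state
    while frontier:
        cost, state = heapq.heappop(frontier)
        if cost > best.get(state, float("inf")):
            continue                # stale queue entry
        if is_goal(state):
            return cost             # stop as soon as a goal is popped
        for nxt, step in successors(state):
            new_cost = cost + step
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt))
    return None

def successors(n):
    # Infinite implicit graph: every integer has two outgoing edges.
    return [(n + 1, 1), (n * 2, 2)]

print(uniform_cost_search(1, lambda n: n == 10, successors))   # 6
```

There is no V or E to speak of here; the work done depends on how many states cost less than the goal, which is what the bound in terms of b, C* and ε describes.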

Finding the "tightest" subset in Euclidean space

I am given a set of points x_1, x_2, ... x_n \in R^d. I wish to find a subset of k points such that the sum of the distances between these k points is minimal. Naively this is an O(n choose k) problem, but I am looking for a faster algorithm.
I can think of two alternative equivalent formulations:
The minimal edge weight clique problem: think of the points as a graph whose edge weights are the distances, and find the k-clique of minimal weight. This is equivalent to the maximal edge weight clique problem, which is known to be NP-complete. However, I have the benefit of knowing that my graph is embedded in R^d, and that all the weights are positive, so perhaps that might help?
The minimal unconstrained sub-matrix problem: I am given the symmetric distance matrix, and I want to find a k×k principal submatrix with minimal sum.
I'd appreciate any help in this.
The most obvious optimization doesn't really require any different formula.
Just greedily find a near-optimal candidate first. Try to refine it in linear time by swapping members. Then do an exhaustive search but stop whenever the new candidates are worse than the greedy-candidate to prune the search space.
Compute the mean
Order objects by squared distance from mean
Test all n-k+1 intervals of length k in this order, choose the best
For any non-chosen object, try to swap it with one of the chosen objects, if it improves the score
Now you should have a reasonably good candidate for pruning.
Then do an exhaustive search, and stop whenever it is worse than this candidate.
Note: steps 1-3 are an inspiration taken from fast convex hull algorithms.
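Steps 1-4 could be sketched like this. It is only an illustration under my own assumptions (distinct points given as tuples, Euclidean distance, made-up function names), and it only produces the greedy candidate, not the exhaustive search that follows it.

```python
from itertools import combinations
from math import dist

def score(subset):
    """Sum of pairwise Euclidean distances within the subset."""
    return sum(dist(p, q) for p, q in combinations(subset, 2))

def greedy_tight_subset(points, k):
    # Step 1: compute the mean of all points.
    dims = len(points[0])
    mean = tuple(sum(p[i] for p in points) / len(points) for i in range(dims))
    # Step 2: order the points by squared distance from the mean.
    ordered = sorted(points,
                     key=lambda p: sum((a - m) ** 2 for a, m in zip(p, mean)))
    # Step 3: test every window of k consecutive points in that order.
    windows = [ordered[i:i + k] for i in range(len(ordered) - k + 1)]
    best = min(windows, key=score)
    # Step 4: swap a non-chosen point in whenever it improves the score.
    improved = True
    while improved:
        improved = False
        for outsider in points:
            if outsider in best:
                continue
            for idx in range(k):
                trial = best[:idx] + [outsider] + best[idx + 1:]
                if score(trial) < score(best):
                    best, improved = trial, True
                    break
    return best

points = [(0, 0), (0, 1), (1, 0), (10, 10), (20, 20), (-15, 5)]
print(sorted(greedy_tight_subset(points, 3)))   # [(0, 0), (0, 1), (1, 0)]
```

The score of the returned subset is the bound to prune against: in the exhaustive search, any partial subset whose pairwise sum already exceeds it can be abandoned.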