How can Dijkstra's algorithm apply to both undirected and directed graphs in one program?

The graph is represented in the following format:
MAX 12
NODE 1 1
NODE 2 2
NODE 3 3
NODE 4 4
NODE 5 5
NODE 6 6
NODE 7 7
NODE 9 9
NODE 8 8
NODE 10 10
NODE 11 11
NODE 12 12
EDGE 1 2
EDGE 2 3
EDGE 3 4
EDGE 4 5
EDGE 5 6
EDGE 6 7
EDGE 7 8
EDGE 8 9
EDGE 9 10
EDGE 10 11
EDGE 11 12
EDGE 1 12
EDGE 1 3
EDGE 1 4
EDGE 1 6
EDGE 1 8
EDGE 1 11
EDGE 1 10
EDGE 6 10
EDGE 3 6
EDGE 4 6
EDGE 5 7
EDGE 9 11
I need to use an adjacency list to read in those edges. But I want to use the graph as an undirected graph, that is, ignore the direction of every edge. How can I know the connectivity of each pair of nodes?
For example, the shortest distance between (NODE 2, NODE 8) is 2 (2->1->8) in the undirected graph, but applying Dijkstra's algorithm to the graph as listed gives 4 (2->3->6->7->8). How can I represent the undirected graph while still using the same technique for reading in the edges?

If you really don't want to change the technique for reading in the edges, you'd have to iterate over all the other nodes and check whether your node is in their adjacency lists, instead of the other way around.
This will increase your running time by quite a bit while not saving you much storage, so I'd advise just changing the way you read in the edges.
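As a minimal sketch of that change (in Python; the function name and file handling are my own, based on the MAX/NODE/EDGE format shown in the question), you simply store every EDGE line in both endpoints' adjacency lists:

from collections import defaultdict

def read_graph(path, directed=False):
    # Parse the MAX/NODE/EDGE format from the question into an adjacency list.
    adj = defaultdict(list)
    with open(path) as f:
        for line in f:
            parts = line.split()
            if not parts:
                continue
            if parts[0] == "NODE":
                adj[int(parts[1])]              # make sure isolated nodes get an entry
            elif parts[0] == "EDGE":
                u, v = int(parts[1]), int(parts[2])
                adj[u].append(v)
                if not directed:
                    adj[v].append(u)            # also store the reverse direction
    return adj

With directed=False, Dijkstra run on this adjacency list finds 2->1->8 (distance 2) instead of 2->3->6->7->8 (distance 4), matching the example in the question.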

Related

Algorithm to determine clusters of linked nodes in Spark

Say I have a dataset of 9 unique nodes that are connected in the following way:
start end
----------
1 2
2 3
4 5
6 7
7 8
7 1
4 9
9 5
The dataset represents a graph of nodes and the links between them. For instance, the links given above form two clusters: one with 6 nodes and one with 3 nodes.
CLUSTER 1            CLUSTER 2
1 --- 2 --- 3        4 --- 5
|                     \___ |
|                         \|
7 --- 6                    9
|
|
8
I want an efficient algorithm that clusters the nodes together like so:
node cluster
-------------
1 1
2 1
3 1
4 2
5 2
6 1
7 1
8 1
9 2
The problem is that I have a lot of these edges, and my current algorithm is pretty slow. Assuming that these datasets are represented as DataFrames in Spark, is there a more SQL-like way of achieving this besides stripping them down to RDDs and iterating over them like lists?
Spark's GraphX library comes with a connected components method: https://spark.apache.org/docs/latest/graphx-programming-guide.html#connected-components
However, I tried using that in the past and found it a bit slow, so I ended up implementing it myself using the algorithm described here: http://mmds-data.org/presentations/2014/vassilvitskii_mmds14.pdf
It is very fast but requires a fair amount of implementation work, probably beyond the scope of a Stack Overflow answer (and the code I wrote previously is proprietary). It's also possible that GraphX has improved its implementation in the years since.
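If it helps to see the target behaviour, here is a single-machine union-find sketch in Python; it is not Spark code and not the distributed algorithm from the linked slides, just an illustration that produces the node-to-cluster table from the question:

def connected_components(edges):
    # Label each node with a cluster id using union-find with path compression.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]       # path compression
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)               # union the two components

    roots, labels = {}, {}
    for node in parent:                         # number the roots 1, 2, ... and label every node
        labels[node] = roots.setdefault(find(node), len(roots) + 1)
    return labels

edges = [(1, 2), (2, 3), (4, 5), (6, 7), (7, 8), (7, 1), (4, 9), (9, 5)]
print(connected_components(edges))
# {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 1, 7: 1, 8: 1, 9: 2}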

How can we make clusters of rows of a matrix of 3D vertices using knnsearch?

Here is a subset of a 988x3 matrix (vertices of a 3D object):
2 3 4
1 2 3
8 5 2
6 2 4
7 8 9
9 5 1
3 5 8
6 5 7
1 2 8
. . .
Suppose the 7 nearest neighbors of the first vertex are v(2), v(20), v(5), v(15), v(19), v(50), and v(23). We then choose another vertex and find its 7 nearest neighbors, with the condition that the new vertex and its 7 nearest neighbors must not be among the previously chosen vertices. In short, I want to make clusters of 8 distinct vertices from a list of 988 vertices based on knnsearch. How can we do this in MATLAB?
kd-trees could help here. There is a kd-tree library for MATLAB that has a function called kdtree_k_nearest_neighbors.
In pseudocode:
construct the kd-tree
repeat until the set is empty:
    pick a remaining point
    find its k nearest neighbors
    store their indices
    remove the point and its k nearest neighbors from the set
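Here is that loop sketched in Python with SciPy's cKDTree instead of the MATLAB toolbox (the grouping logic is the same; rebuilding the tree after every removal is the simplest, though not the fastest, way to enforce the no-reuse condition):

import numpy as np
from scipy.spatial import cKDTree

def greedy_knn_clusters(points, group_size=8):
    # Greedily split the points into clusters of `group_size` mutually unused vertices.
    remaining = list(range(len(points)))
    clusters = []
    while len(remaining) >= group_size:
        tree = cKDTree(points[remaining])                 # tree over the leftover points only
        _, idx = tree.query(points[remaining[0]], k=group_size)
        cluster = [remaining[i] for i in idx]             # map back to original row indices
        clusters.append(cluster)
        used = set(cluster)
        remaining = [i for i in remaining if i not in used]
    if remaining:
        clusters.append(remaining)                        # leftovers (fewer than group_size)
    return clusters

verts = np.random.rand(988, 3)                            # stand-in for the 988x3 vertex matrix
groups = greedy_knn_clusters(verts, group_size=8)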

Genetic Algorithm for Flow shop Scheduling

I need help in MATLAB: I need to find out how to crossover any two sequences for a genetic algorithm in flow shop scheduling, e.g.
1st sequence = 1 5 4 7 3 2 9 8 10 6
2nd sequence = 7 8 9 10 5 4 2 1 3 6
After crossover, the offspring should be
offspring 1 = 1 5 4 7 3 2 8 9 10 6
offspring 2 = 7 8 9 10 1 5 4 3 2 6
Crossover should be such that each number doesn't repeat itself in the offspring sequence. Can anyone tell me how to do this?
There are a number of existing crossovers defined for permutation encodings. Among them the following would be useful for you:
Cyclic Crossover
Partially Matched Crossover
Uniform-like Crossover
Position-based Crossover
These crossovers aim to preserve the position of the job in the permutation. You can find implementations in C# in the PermutationEncoding plugin of HeuristicLab. Browse the source files and you can also find references to scientific articles that describe these crossovers.
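As a small illustration of the first of those (the cyclic crossover), here is a Python sketch; the function name is mine, and it will not reproduce the exact offspring shown in the question, but it guarantees that no job is repeated in either child:

def cycle_crossover(p1, p2):
    # Cyclic crossover (CX): children inherit whole value-cycles from alternating
    # parents, so every job keeps a position it had in one parent and appears exactly once.
    n = len(p1)
    pos_in_p1 = {v: i for i, v in enumerate(p1)}
    c1, c2 = [None] * n, [None] * n
    assigned = [False] * n
    take_from_p1 = True
    for start in range(n):
        if assigned[start]:
            continue
        i, cycle = start, []
        while not assigned[i]:                  # trace one cycle of positions
            assigned[i] = True
            cycle.append(i)
            i = pos_in_p1[p2[i]]
        for j in cycle:
            c1[j], c2[j] = (p1[j], p2[j]) if take_from_p1 else (p2[j], p1[j])
        take_from_p1 = not take_from_p1         # alternate parents between cycles
    return c1, c2

p1 = [1, 5, 4, 7, 3, 2, 9, 8, 10, 6]
p2 = [7, 8, 9, 10, 5, 4, 2, 1, 3, 6]
print(cycle_crossover(p1, p2))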

visualizing a distance matrix

Sorry if there's already an answer to this. I can't seem to find it.
I'm working on an application that pulls legislators' voting records on bills, and I'm trying to come up with some interesting ways of visualizing the data. There's one idea in my head right now, but I'm not sure it's mathematically possible to do the visualization I want to in two dimensions.
The data begins like this:
       HB1  HB2  HB3
Smith   1    0    1
Hill    1    1    1
Davis   0    1    0
Where 1 = aye, 0 = nay.
The next step I take is to measure the "distance" of each legislator from the other by summing the XORs of their voting records, so that each time one legislator disagrees with another they get a distance "point" with that legislator. That creates a table like this:
       Smith  Hill  Davis
Smith    0      1      3
Hill     1      0      2
Davis    3      2      0
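For concreteness, this is what that computation looks like as a small numpy sketch (illustrative only, not my actual code); it reproduces the table above from the 0/1 votes:

import numpy as np

votes = np.array([[1, 0, 1],     # Smith
                  [1, 1, 1],     # Hill
                  [0, 1, 0]])    # Davis

# Pairwise disagreement count: for each pair, sum the XOR of their vote vectors.
dist = (votes[:, None, :] != votes[None, :, :]).sum(axis=2)
print(dist)
# [[0 1 3]
#  [1 0 2]
#  [3 2 0]]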
So my idea is to graph each legislator as a point on a plane, and to have the distances between those points reflect the distance rating in the table. I think it presents an interesting opportunity to see if there are clusters of legislators with similar voting patterns, etc.
Now, obviously, this is easy to do with 3 points, since you can always draw a triangle with three given side lengths. But I can't figure out whether it's mathematically possible to plot many more (35-70) legislators and still have all the distances correct within a 2-dimensional space, or whether you potentially need one additional dimension for each legislator after the third.
So, for example, is it possible to preserve all the distances if the data table looks like this?
0 13 6 8 10 14 12 14 12 12
13 0 13 13 13 7 9 11 9 7
6 13 0 12 8 16 14 10 12 14
8 13 12 0 12 10 6 10 10 8
10 13 8 12 0 10 12 12 14 14
14 7 16 10 10 0 10 10 12 8
12 9 14 6 12 10 0 12 8 10
14 11 10 10 12 10 12 0 8 10
12 9 12 10 14 12 8 8 0 10
12 7 14 8 14 8 10 10 10 0
If so, does Octave have a built-in function, or can anyone point me to an algorithm?
Ok, found the answer.
No, it's generally not mathematically possible to do what I wanted to do.
The best approximation is an algorithm called multidimensional scaling. Octave has a built-in function: cmdscale.
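For reference, here is a small numpy sketch of what cmdscale computes (classical MDS: double-center the squared distances, then take the top eigenvectors); the 2-D coordinates only approximate the distances when the matrix is not exactly 2-D Euclidean:

import numpy as np

def classical_mds(D, dims=2):
    # Classical multidimensional scaling, the algorithm behind Octave's cmdscale.
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n         # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                 # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1]           # largest eigenvalues first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scale = np.sqrt(np.clip(eigvals[:dims], 0, None))   # drop negative (non-Euclidean) parts
    return eigvecs[:, :dims] * scale

D = np.array([[0, 1, 3],
              [1, 0, 2],
              [3, 2, 0]], dtype=float)
xy = classical_mds(D)        # 2-D points whose pairwise distances approximate D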
Hope others may find this helpful.

Matlab, looking for the largest value in a rotated square

Let me explain:
I have a matrix sitting in MATLAB that contains values of the height of terrain. I want to know the largest value inside a rectangle. However, this rectangle is usually rotated with respect to the orientation of the data points in the matrix. To illustrate:
10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10 10 10
This shows a matrix from which I would like to extract the data within a rectangle. Imagine a rotated rectangle laid over part of the matrix; the values it covers (shown in bold in the original post) are the data I want to examine.
I understand you get a jagged edge when you do something like this. I actually want my jagged, 'pixelated' outline to lie outside the rectangle I define. The data within the set cannot change (i.e., be interpolated), and I'm looking for the maximum value.
I've already been close to a solution, but it didn't work out. At first glance it seems quite simple, so hopefully someone with a fresh pair of eyes can help me out.
regards,
Berend
Select the values you're interested in by indexing, then calculate the maximum of those.
Steve on Image Processing has a number of posts on clever ways to do indexing: http://blogs.mathworks.com/steve/category/indexing/
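A sketch of that indexing idea in Python/numpy (the rectangle parameters below are made up for illustration): rotate the grid coordinates into the rectangle's frame, build a logical mask grown by half a cell so the pixelated selection lies outside the true rectangle, and take the maximum of the masked heights.

import numpy as np

def max_in_rotated_rect(height, center, half_w, half_h, angle_deg, cell=1.0):
    # Maximum of `height` over the grid cells covering a rotated rectangle.
    # center, half_w, half_h and angle_deg describe the rectangle in grid units;
    # the mask is grown by half a cell so the jagged selection encloses the rectangle.
    rows, cols = np.indices(height.shape)
    x = cols - center[1]                        # grid coordinates relative to the center
    y = rows - center[0]
    a = np.deg2rad(angle_deg)
    u = x * np.cos(a) + y * np.sin(a)           # rotate into the rectangle's frame
    v = -x * np.sin(a) + y * np.cos(a)
    mask = (np.abs(u) <= half_w + cell / 2) & (np.abs(v) <= half_h + cell / 2)
    return height[mask].max()

terrain = 10 * np.ones((7, 10))                 # stand-in for the height matrix above
print(max_in_rotated_rect(terrain, center=(3, 4), half_w=2.5, half_h=1.5, angle_deg=30))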