If I calculate the length of the shortest path using networkx as:
path_length = nx.shortest_path_length(G, source = origin, target = destination, weight = 'distance')
How does networkx know to interpret the edge attribute as a distance or a weight?
The documentation says either is acceptable but doesn't specify how the attribute will be interpreted.
In the case of a weight, I would expect high values to be preferred, with the shortest path traveling through the edges with the highest weights.
In the case of a distance, I would expect lower values to be preferred, to minimize total distance.
Am I missing something conceptually?
The results I've gotten are consistent with my expectations for distances but it's uncomfortable that I can't find anything in the docs that clarifies this.
From the docs:
weight (None or string, optional (default = None)) – If None, every
edge has weight/distance/cost 1. If a string, use this edge attribute
as the edge weight. Any edge attribute not present defaults to 1.
So whether it is called a distance or a weight, the objective is minimization. Words like profit/utility usually refer to maximization and weight/distance/cost to minimization, while some others, like fitness, may be used in both cases.
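To make the minimization concrete, here is a minimal standalone sketch of Dijkstra's algorithm in plain Python. It is not networkx's implementation; the function name `min_total_weight` and the toy graph are made up for illustration. It only shows the idea that the named edge attribute is summed along a path and the smallest total wins, with a missing attribute counted as 1 as the quoted docs describe.

```python
import heapq

def min_total_weight(adj, source, target, weight="distance"):
    """Smallest possible sum of the `weight` attribute from source to target."""
    best = {source: 0}
    heap = [(0, source)]
    while heap:
        total, node = heapq.heappop(heap)
        if node == target:
            return total
        if total > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper route was already found
        for neighbor, attrs in adj[node].items():
            candidate = total + attrs.get(weight, 1)  # missing attribute counts as 1
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    raise ValueError("target not reachable from source")

# Hypothetical graph: the direct edge A-C is "heavier" than the detour via B.
graph = {
    "A": {"B": {"distance": 1}, "C": {"distance": 5}},
    "B": {"A": {"distance": 1}, "C": {"distance": 1}},
    "C": {"A": {"distance": 5}, "B": {"distance": 1}},
}
print(min_total_weight(graph, "A", "C"))  # 2, the detour A-B-C is chosen
```

The edge with the high value (5) is avoided, which matches what the asker observed: larger attribute values are penalized, never preferred.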
I have a column vector "distances", and I want to select a value randomly from this vector such that smaller values have a higher probability of being selected. So far I am using the following, where "possible_cells" is the randomly selected value:
w=(fliplr(1:numel(distances)))/100
possible_cells=randsample((sort(distances)),1,true,w)
Basically, I flipped the vector to create the selection weights "w" (if I am understanding randsample correctly), so that the smallest value gets the highest probability of being selected. To check how well this works, I randomly drew 50 values, and a histogram shows that the values are higher than I would expect. Does anyone have any idea how else to do what I described above?
How about something like this?
Let's start with 10 sample distances, each no greater than 20, just to demonstrate:
d = randi(20,10,1);
Next, since we want smaller values to be more likely, let's take the reciprocal of those distances:
d_rec = 1./d;
Now, let's normalize so we can create a distribution from which to select our distance:
d_rec_norm = d_rec ./ sum(d_rec);
This new variable reflects the probability with which to select each given distance. Now comes a little trick... we choose the distance like this:
d_i = find(rand < cumsum(d_rec_norm),1);
This will give us the index of our chosen distance. The logic behind this is that when cumulatively summing the normalized values associated with each distance (d_rec_norm) we create "bins" whose widths are proportional to the likelihood of selecting each distance. All that is left is to pick a random number between 0 and 1 (rand) and see which "bin" it falls in.
I'm a new poster here, so let me know if this is unclear and I can try to improve my explanation.
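For readers who want the same cumulative-sum trick outside MATLAB, here is a rough Python sketch. The function name `pick_index` is made up; it takes the uniform random number `u` as an argument so the "bin" lookup is easy to check by hand, and it assumes all distances are strictly positive.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def pick_index(distances, u):
    """Index of the distance whose cumulative 'bin' contains u (0 <= u < 1)."""
    reciprocals = [1.0 / d for d in distances]       # smaller distance, larger weight
    total = sum(reciprocals)
    cumulative = list(accumulate(r / total for r in reciprocals))
    # First bin edge strictly greater than u, same as find(rand < cumsum(...), 1).
    # The min() guards against the last edge rounding to just below 1.0.
    return min(bisect_right(cumulative, u), len(distances) - 1)

distances = [1, 2, 4]          # bin edges are roughly 0.571, 0.857, 1.0
chosen = distances[pick_index(distances, random.random())]
print(chosen)
```

With these three distances, the value 1 is picked about 57% of the time, 2 about 29%, and 4 about 14%.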
I am trying to find the maximum of a fitted curve within a certain region of the plot. I have a 4th-order fit, and when I use max(x), the answer is an extrapolated value, while I am actually looking for the maximum of the 'bump' in my data.
So question, how do I select the max for only a certain region in the data while using a cfit? Or how do I exclude a part of the fit?
LF = pol4Fit(L,F);
Coefs= coeffvalues(LF);
This code only gives the optimum (the max value) at the real data points:
L_opt = feval(LF,L);
[F_opt,Num_Length]= max (L_opt);
Opt_Length= L(Num_Length);
So now I was trying something like: y=max(LF(F)), but this is not specific to select a region.
Try to only evaluate the region you are interested in.
For instance, let's say the specific region is a vector named S.
You can simply rewrite your code like below:
L_opt = feval(LF,S);
Use the specific region S instead of the whole domain L, so that the fit is only evaluated over the region you are concerned with. The max function should then work properly for you.
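The same idea, sketched in Python rather than MATLAB: evaluate the fitted polynomial only on a grid inside the region of interest and take the maximum there. The coefficients below are a made-up 4th-order example (x^4 - 2x^2), not the asker's fit; it has a local "bump" at x = 0 but grows without bound outside it.

```python
def polyval(coeffs, x):
    """Evaluate a polynomial (highest-order coefficient first) with Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def max_in_region(coeffs, lo, hi, samples=1001):
    """Grid search for the largest fitted value on [lo, hi]; returns (x, y)."""
    best_x, best_y = lo, polyval(coeffs, lo)
    for i in range(1, samples):
        x = lo + (hi - lo) * i / (samples - 1)
        y = polyval(coeffs, x)
        if y > best_y:
            best_x, best_y = x, y
    return best_x, best_y

coeffs = [1, 0, -2, 0, 0]                  # hypothetical fit: x^4 - 2x^2
print(max_in_region(coeffs, -2, 2))        # whole domain: the endpoints dominate
print(max_in_region(coeffs, -0.5, 0.5))    # restricted region: the bump at x = 0
```

A finer grid, or solving for the roots of the derivative inside the region, would give a more precise location if the grid resolution is not enough.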
I need to classify objects using fuzzy logic. Each object is characterized by 4 features - {size, shape, color, texture}. Each feature is fuzzified by linguistic terms and some membership function. The problem is I am unable to understand how to defuzzify such that I may know which class an unknown object belongs to. Using the Mamdani Max-Min inference, can somebody help in solving this issue?
Objects = {Dustbin, Can, Bottle, Cup}, denoted as {1,2,3,4} respectively. The fuzzy sets for each feature are:
Feature : Size
$\tilde{Size_{Large}}$ = {1/1, 1/2, 0/3, 0.6/4} for crisp values in range 10cm - 20cm
$\tilde{Size_{Small}}$ = {0/1,0/2,1/3,0.4/4} (4cm - 10cm)
Feature : Shape
$\tilde{Shape_{Square}}$ = {0.9/1, 0/2,0/3,0/4} for crisp values in range 50-100
$\tilde{Shape_{Cylindrical}}$ = {0.1/1, 1/2,1/3,1/4} (10-40)
Feature : Color
$\tilde{Color_{Reddish}}$ = {0/1, 0.8/2, 0.6/3,0.3/4} say red values in between 10-50 (not sure, assuming)
$\tilde{Color_{Greenish}}$ = {1/1, 0.2/2, 0.4/3, 0.7/4} say color values in 100-200
Feature : Texture
$\tilde{Tex_{Coarse}}$ = {0.2/1, 0.2/2,0/3,0.5/4} if texture crisp values 10-20
$\tilde{Tex_{Shiny}}$ = {0.8/1, 0.8/2, 1/3, 0.5/4} 30-40
The If then else rules for classification are
R1: IF object is large in size AND cylindrical shape AND greenish in color AND coarse in texture THEN object is Dustbin
or in tabular form just to save space
Object type Size Shape Color Texture
Dustbin : Large cylindrical greenish coarse
Can : small cylindrical reddish shiny
Bottle: small cylindrical reddish shiny
Cup : small cylindrical greenish shiny
Then, there is an unknown object with crisp feature values X = {12cm, 52, 120, 11}. How do I classify it? Or is my understanding incorrect, so that I need to reformulate the entire thing?
Fuzzy logic means that every pattern belongs to a class up to a certain level. In other words, the output of the algorithm for every pattern could be a vector of, let's say, percentages of similarity to each class that sum up to unity. The decision for a class could then be taken by checking a threshold. This means that the purpose of fuzzy logic is to quantify the uncertainty. If you need a decision for your case, a simple minimum distance classifier or a majority vote should be enough. Otherwise, redefine your problem by taking the "number factor" into consideration.
One possible approach could be to define centroids for each feature's distinct attribute, for example, Large_size=15cm and Small_size=7cm. The membership function could be then defined as a function of the distance from these centroids. Then you could do the following:
1) For every feature, calculate the Euclidean distance from the centroid and pass it through a Gaussian or Butterworth kernel (in order to capture the range around the centroid). Prepare a kernel for every class; for example, dustbin as a target needs large size, coarse texture, etc.
2) Calculate the product of all the above (this is a Naive Bayes approach). Fuzzy logic ends here.
3) Then, you could assign the pattern to the class with the highest value of the membership function.
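The three steps above could be sketched roughly as follows. Only Large_size=15cm and Small_size=7cm come from this answer; every other centroid is assumed to be the midpoint of the range given in the question, and all kernel widths are assumed too, so treat the numbers as placeholders rather than tuned values.

```python
import math

def gaussian(x, center, width):
    """Gaussian kernel: 1.0 at the centroid, falling off with distance."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)

# (centroid, width) per linguistic term. All values are assumptions.
TERMS = {
    "size": {"large": (15, 5), "small": (7, 3)},
    "shape": {"square": (75, 25), "cylindrical": (25, 15)},
    "color": {"reddish": (30, 20), "greenish": (150, 50)},
    "texture": {"coarse": (15, 5), "shiny": (35, 5)},
}

# One rule per class, taken from the table in the question.
RULES = {
    "Dustbin": {"size": "large", "shape": "cylindrical", "color": "greenish", "texture": "coarse"},
    "Can": {"size": "small", "shape": "cylindrical", "color": "reddish", "texture": "shiny"},
    "Bottle": {"size": "small", "shape": "cylindrical", "color": "reddish", "texture": "shiny"},
    "Cup": {"size": "small", "shape": "cylindrical", "color": "greenish", "texture": "shiny"},
}

def class_scores(sample):
    """Step 1 and 2: per-feature memberships, multiplied together per class."""
    scores = {}
    for label, rule in RULES.items():
        score = 1.0
        for feature, term in rule.items():
            center, width = TERMS[feature][term]
            score *= gaussian(sample[feature], center, width)
        scores[label] = score
    return scores

def classify(sample):
    """Step 3: the class with the highest combined membership."""
    scores = class_scores(sample)
    return max(scores, key=scores.get)

x = {"size": 12, "shape": 52, "color": 120, "texture": 11}
print(classify(x))  # Dustbin, under the assumed centroids and widths
```

Note that Can and Bottle share the same rule in the question, so they always get identical scores and cannot be told apart without an extra feature. Replacing the product with `min` over the memberships would give a min-style AND closer to the Mamdani inference the question mentions.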
Sorry for taking too long to answer, hope this will help.
I am trying to update a MST by adding a new vertex in the MST. For this, I have been following "Updating Spanning Tree" by Chin and Houck. http://www.computingscience.nl/docs/vakken/al/WerkC/UpdatingSpanningTrees.pdf
A step in the paper requires me to find the largest edge in the path/paths between two given vertices. My idea is to find all the possible paths between the vertices and then, subsequently find the largest edge from the paths. I have been trying to implement this in MATLAB. However, so far, I have been unsuccessful. Any lead / clear algorithm to find all paths between two vertices or even the largest edge in the path between two given nodes/ vertices would be really welcome.
For reference, I would like to put forward an example. If the graph has the following edges: 1-2, 1-3, 2-4 and 3-4, the paths between 4 and 4 are:
1) 4-2-1-3-4
2) 4-3-1-2-4
Thank you
The algorithm works by lowering the t value to exclude large edges from the new MST. When the algorithm completes, t will be the lowest edge that remains to be inserted to complete the MST.
The m value represents the largest edge on a path from r to z, local to each run of INSERT. m is lowered at each iteration of the loop if possible, thereby removing the previous m edge as a possible candidate for t.
It's not easy to explain in words; I recommend doing a run of the algorithm on paper until the steps are clear.
I made a quick attempt to sketch the steps here: http://jacob.midtgaard-olesen.dk/?p=140
But basically, the algorithm adds edges from the old MST unless it finds a smaller edge to add between the new node z and another node in the old MST. In the example, the edge (A,B) is not in the new tree, since a better connection to B was found by the algorithm.
Note that on selecting h and k, if t and (w,r) have equal edge value, I believe you should choose (w,r)
Finally, you should probably go through the proof following the algorithm to understand why the algorithm works. (I didn't read it all :) )
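As a side note on the original question: inside a spanning tree the path between two vertices is unique, so the largest edge on it can be found with one depth-first walk instead of enumerating all paths. The sketch below is in Python rather than MATLAB and is not the Chin and Houck procedure itself, just a simpler helper; the function name and the example tree are made up.

```python
from collections import defaultdict

def max_edge_on_tree_path(edges, source, target):
    """Heaviest edge (u, v, w) on the unique tree path from source to target."""
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    # Iterative DFS; each stack entry carries the heaviest edge seen so far.
    stack = [(source, None, None)]
    while stack:
        node, parent, best = stack.pop()
        if node == target:
            return best  # None when source == target
        for nxt, w in adj[node]:
            if nxt == parent:
                continue  # in a tree, skipping the parent is enough to avoid loops
            edge = (node, nxt, w)
            heavier = edge if best is None or w > best[2] else best
            stack.append((nxt, node, heavier))
    return None  # target not in the same tree

# Hypothetical spanning tree given as (u, v, weight) triples.
tree = [(1, 2, 4), (1, 3, 1), (3, 4, 7)]
print(max_edge_on_tree_path(tree, 2, 4))  # (3, 4, 7) on the path 2-1-3-4
```

This only works on a tree (such as the current MST). On a general graph with cycles, the parent check is not sufficient and the notion of "the" path is no longer unique.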
Can we use Dijkstra's algorithm with negative weights?
STOP! Before you think "lol nub you can just endlessly hop between two points and get an infinitely cheap path", I'm more thinking of one-way paths.
An application for this would be a mountainous terrain with points on it. Obviously going from high to low doesn't take energy, in fact, it generates energy (thus a negative path weight)! But going back again just wouldn't work that way, unless you are Chuck Norris.
I was thinking of incrementing the weight of all points until they are non-negative, but I'm not sure whether that will work.
As long as the graph does not contain a negative cycle (a directed cycle whose edge weights have a negative sum), it will have a shortest path between any two points, but Dijkstra's algorithm is not designed to find them. The best-known algorithm for finding single-source shortest paths in a directed graph with negative edge weights is the Bellman-Ford algorithm. This comes at a cost, however: Bellman-Ford requires O(|V|·|E|) time, while Dijkstra's requires O(|E| + |V|log|V|) time, which is asymptotically faster for both sparse graphs (where E is O(|V|)) and dense graphs (where E is O(|V|^2)).
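A minimal Bellman-Ford sketch in Python, for illustration only. Vertices are assumed to be numbered 0..n-1 and edges are given as (u, v, weight) triples; the small test graph is made up and contains one negative edge but no negative cycle.

```python
def bellman_ford(num_vertices, edges, source):
    """Single-source shortest distances; raises if a negative cycle is reachable."""
    dist = [float("inf")] * num_vertices
    dist[source] = 0
    # A shortest path uses at most |V| - 1 edges, so |V| - 1 passes suffice.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # nothing improved, distances are final
    # One extra pass: any further improvement means a negative cycle.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative cycle")
    return dist

# Directed edges: 0->3 costs 2, while 0->1->2->3 costs 1 + 1 - 2 = 0.
edges = [(0, 3, 2), (0, 1, 1), (1, 2, 1), (2, 3, -2)]
print(bellman_ford(4, edges, 0))  # [0, 1, 2, 0]
```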
In your example of a mountainous terrain (necessarily a directed graph, since going up and down an incline have different weights) there is no possibility of a negative cycle, since this would imply leaving a point and then returning to it with a net energy gain - which could be used to create a perpetual motion machine.
Increasing all the weights by a constant value so that they are non-negative will not work. To see this, consider the graph where there are two paths from A to B, one traversing a single edge of length 2, and one traversing edges of length 1, 1, and -2. The second path is shorter, but if you increase all edge weights by 2, the first path now has length 4, and the second path has length 6, reversing the shortest paths. This tactic will only work if all possible paths between the two points use the same number of edges.
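The arithmetic in that counterexample can be checked in a few lines of Python (the helper name `path_length` is made up):

```python
def path_length(weights, offset=0):
    """Total length of a path after adding `offset` to every edge on it."""
    return sum(w + offset for w in weights)

direct = [2]          # A -> B over a single edge
detour = [1, 1, -2]   # A -> B over three edges

print(path_length(direct), path_length(detour))        # 2 0  (detour is shorter)
print(path_length(direct, 2), path_length(detour, 2))  # 4 6  (direct is now shorter)
```

The offset is paid once per edge, so paths with more edges are penalized more, which is exactly why the ranking flips.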
If you read the proof of optimality, one of the assumptions made is that all the weights are non-negative. So, no. As Bart recommends, use Bellman-Ford if there are no negative cycles in your graph.
You have to understand that a negative edge isn't just a negative number --- it implies a reduction in the cost of the path. If you add a negative edge to your path, you have reduced the cost of the path --- if you increment the weights so that this edge is now non-negative, it does not have that reducing property anymore and thus this is a different graph.
I encourage you to read the proof of optimality --- there you will see that the assumption that adding an edge to an existing path can only increase (or not affect) the cost of the path is critical.
You can use Dijkstra's on a negative weighted graph but you first have to find the proper offset for each vertex. That is essentially what Johnson's algorithm does. But that would be overkill since Johnson's uses Bellman-Ford to find the weight offset(s). Johnson's is designed to find all shortest paths between pairs of vertices.
http://en.wikipedia.org/wiki/Johnson%27s_algorithm
There is actually an algorithm which uses Dijkstra's algorithm in a negative path environment; it does so by first reweighting the graph so that no negative edges remain. This algorithm is called 'Johnson's Algorithm'.
The way it works is by adding a new node (let's say Q) which has a 0-cost edge to every other node in the graph. It then runs Bellman-Ford on the graph from Q, getting a cost for each node with respect to Q, which we will call q[x]; this will either be 0 or a negative number (negative when the cheapest route from Q uses one of the negative edges).
E.g. a -> -3 -> b, therefore if we add a node Q which has 0 cost to all of these nodes, then q[a] = 0, q[b] = -3.
We then rebalance the edges using the formula: weight + q[source] - q[destination], so the new weight of a->b is -3 + 0 - (-3) = 0. We do this for all other edges in the graph, then remove Q and its outgoing edges and voila! We now have a rebalanced graph with no negative edges on which we can run Dijkstra's!
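The reweighting step can be sketched in Python as below. This covers only the Bellman-Ford potentials and the formula weight + q[source] - q[destination], not the follow-up Dijkstra runs; the function name is made up. The virtual node Q is never materialized: since it has a 0-cost edge to every node, starting every q[v] at 0 is equivalent to having relaxed those edges already.

```python
def johnson_reweight(vertices, edges):
    """Return (q, reweighted_edges) for directed edges given as (u, v, w)."""
    q = {v: 0 for v in vertices}  # effect of the 0-cost edges from Q
    for _ in range(len(vertices) - 1):
        for u, v, w in edges:
            if q[u] + w < q[v]:
                q[v] = q[u] + w
    for u, v, w in edges:
        if q[u] + w < q[v]:
            raise ValueError("graph contains a negative cycle")
    return q, [(u, v, w + q[u] - q[v]) for u, v, w in edges]

# The two-node example from the text: a -> b with weight -3.
q, new_edges = johnson_reweight(["a", "b"], [("a", "b", -3)])
print(q)          # {'a': 0, 'b': -3}
print(new_edges)  # [('a', 'b', 0)]
```

To recover a true distance after running Dijkstra's on the reweighted graph, undo the shift: original distance from s to t = reweighted distance - q[s] + q[t].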
The running time is O(nm) [bellman-ford] + n x O(m log n) [n Dijkstra's] + O(n^2) [weight computation] = O (nm log n) time
More info: http://joonki-jeong.blogspot.co.uk/2013/01/johnsons-algorithm.html
Actually I think it'll work to modify the edge weights. Not with an offset but with a factor. Assume instead of measuring the distance you are measuring the time required from point A to B.
weight = time = distance / velocity
You could even adapt velocity depending on the slope to use the physical one if your task is for real mountains and car/bike.
Yes, you could do that with adding one step at the end i.e.
If v ∈ Q, Then Decrease-Key(Q, v, v.d)
Else Insert(Q, v) and S = S \ {v}.
An expression tree is a binary tree in which all leaves are operands (constants or variables), and the non-leaf nodes are binary operators (+, -, /, *, ^). Implement this tree to model polynomials with the basic methods of the tree including the following:
A function that calculates the first derivative of a polynomial.
Evaluate a polynomial for a given value of x.
[20] Use the following rules for the derivative:
Derivative(constant) = 0
Derivative(x) = 1
Derivative(P(x) + Q(x)) = Derivative(P(x)) + Derivative(Q(x))
Derivative(P(x) - Q(x)) = Derivative(P(x)) - Derivative(Q(x))
Derivative(P(x) * Q(x)) = P(x)*Derivative(Q(x)) + Q(x)*Derivative(P(x))
Derivative(P(x) / Q(x)) = (Q(x)*Derivative(P(x)) - P(x)*Derivative(Q(x))) / (Q(x) ^ 2)
Derivative(P(x) ^ n) = n * (P(x) ^ (n - 1)) * Derivative(P(x)), for a constant exponent n
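One possible shape for such a tree, sketched in Python. The class and function names are made up, and the derivative uses the standard calculus quotient rule and the power rule for a constant exponent; no simplification of the resulting tree is attempted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    value: float

@dataclass(frozen=True)
class Var:
    name: str = "x"

@dataclass(frozen=True)
class BinOp:
    op: str        # one of + - * / ^
    left: object
    right: object

def evaluate(node, x):
    """Evaluate the expression tree for a given value of x."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Var):
        return x
    a, b = evaluate(node.left, x), evaluate(node.right, x)
    if node.op == "+":
        return a + b
    if node.op == "-":
        return a - b
    if node.op == "*":
        return a * b
    if node.op == "/":
        return a / b
    if node.op == "^":
        return a ** b
    raise ValueError(f"unknown operator {node.op!r}")

def derivative(node):
    """Return a new tree representing the first derivative with respect to x."""
    if isinstance(node, Const):
        return Const(0)
    if isinstance(node, Var):
        return Const(1)
    p, q = node.left, node.right
    if node.op in "+-":
        return BinOp(node.op, derivative(p), derivative(q))
    if node.op == "*":
        return BinOp("+", BinOp("*", p, derivative(q)), BinOp("*", q, derivative(p)))
    if node.op == "/":
        top = BinOp("-", BinOp("*", q, derivative(p)), BinOp("*", p, derivative(q)))
        return BinOp("/", top, BinOp("^", q, Const(2)))
    if node.op == "^":
        if not isinstance(q, Const):
            raise ValueError("only constant exponents are supported in this sketch")
        power = BinOp("^", p, Const(q.value - 1))
        return BinOp("*", BinOp("*", q, power), derivative(p))
    raise ValueError(f"unknown operator {node.op!r}")

# 3*x^2 + 2*x
poly = BinOp("+", BinOp("*", Const(3), BinOp("^", Var(), Const(2))),
             BinOp("*", Const(2), Var()))
print(evaluate(poly, 2))              # 16
print(evaluate(derivative(poly), 2))  # 14, since the derivative is 6*x + 2
```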