Build Graph given a SuccessorsFunction and a set of nodes - guava

I would like to build a Guava ImmutableGraph given a set of nodes (starting points) and a SuccessorsFunction. The graph would contain all the nodes reachable from any of the starting nodes and all the edges seen along the way thanks to the SuccessorsFunction. (E.g., given starting node {a} and successors a → b and b → c, the resulting graph should be {(a, b), (b, c)}.)
I see how I can obtain a Traverser to explore the reachable nodes in a certain order, given starting nodes and a SuccessorsFunction, but it does not meet my needs as I want to obtain a graph, not just the nodes.
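For reference, this is roughly what I mean (a sketch, assuming a SuccessorsFunction<String> called successorsFunction for the a → b → c example above): the Traverser only yields the reachable nodes, not the edges that were followed.
// Gives back [a, b, c] in BFS order; the edges (a, b) and (b, c) are lost.
Iterable<String> reachable = Traverser.forGraph(successorsFunction).breadthFirst("a");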
It is not very hard to define an algorithm that does this, but it’s subtle enough to deserve trying to re-use an existing solution. I would be surprised if it didn’t exist already in the library. Does it? Or is this requirement not sensible?
I didn’t find this in the related wiki page.

Guava doesn't have this feature built in, so you'll need a custom solution that does some sort of graph traversal, such as the breadth-first traversal in the following snippet.
public static <N> ImmutableGraph<N> buildGraphWithBreadthFirstTraversal(
    Iterable<N> startingNodes, SuccessorsFunction<N> successorsFunction) {
  MutableGraph<N> result = GraphBuilder.directed().allowsSelfLoops(true).build();
  startingNodes.forEach(result::addNode);
  Queue<N> nodesRemaining = Queues.newArrayDeque(startingNodes);
  while (!nodesRemaining.isEmpty()) {
    N next = nodesRemaining.remove();
    for (N successor : successorsFunction.successors(next)) {
      if (!result.edges().contains(EndpointPair.ordered(next, successor))) {
        nodesRemaining.add(successor);
        result.putEdge(next, successor);
      }
    }
  }
  return ImmutableGraph.copyOf(result);
}
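For example, with the a → b → c case from the question, it can be used like this (a quick sketch using String nodes):
// successors: a -> b, b -> c, c -> (none)
ImmutableGraph<String> graph =
    buildGraphWithBreadthFirstTraversal(
        ImmutableList.of("a"),
        node -> node.equals("a") ? ImmutableList.of("b")
            : node.equals("b") ? ImmutableList.of("c")
            : ImmutableList.<String>of());
// graph.edges() contains exactly the ordered pairs (a, b) and (b, c)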
Here is a basic JUnit 5 unit test that confirms the code works when given a starting node and a successorsFunction that together form a cycle of 1 -> 2 -> 4 -> 1.
@Test
void succeedsOnTraversalWithCycle() {
  var result =
      MoreGraphs.buildGraphWithBreadthFirstTraversal(
          ImmutableList.of(1),
          node -> {
            int nextNode = node * 2;
            return nextNode <= 4 ? ImmutableList.of(nextNode) : ImmutableList.of(1);
          });
  assertThat(result)
      .isEqualTo(
          GraphBuilder.directed()
              .allowsSelfLoops(true)
              .immutable()
              .putEdge(1, 2)
              .putEdge(2, 4)
              .putEdge(4, 1)
              .build());
}

Related

How to combine the elements of an arbitrary number of dependent Fluxes?

In the non-reactive world, the following code snippet is nothing special:
interface Enhancer {
    Result enhance(Result result);
}

Result result = Result.empty();
result = fooEnhancer.enhance(result);
result = barEnhancer.enhance(result);
result = bazEnhancer.enhance(result);
There are three different Enhancer implementations taking a Result instance, enhancing it and returning the enhanced result. Let's assume the order of the enhancer calls matters.
Now what if these methods are replaced by reactive variants returning a Flux<Result>? Because the methods depend on the result(s) of the preceding method, we cannot use combineLatest here.
A possible solution could be:
Flux.just(Result.empty())
    .switchMap(result -> first(result)
        .switchMap(firstResult -> second(firstResult)
            .switchMap(secondResult -> third(secondResult))))
    .subscribe(result -> doSomethingWith(result));
Note that the switchMap calls are nested. As we are only interested in the final result, we let switchMap switch to the next flux as soon as new events are emitted in preceding fluxes.
Now let's try to do it with a dynamic number of fluxes. Non-reactive (without fluxes), this would again be nothing special:
List<Enhancer> enhancers = <ordered list of different Enhancer impls>;
Result result = Result.empty();
for (Enhancer enhancer : enhancers) {
    result = enhancer.enhance(result);
}
But how can I generalize the above reactive example with three fluxes to deal with an arbitrary number of fluxes?
I found a solution using recursion:
@FunctionalInterface
interface FluxProvider {
    Flux<Result> get(Result result);
}

// recursive method creating the final Flux
private Flux<Result> cascadingSwitchMap(Result input, List<FluxProvider> fluxProviders, int idx) {
    if (idx < fluxProviders.size()) {
        return fluxProviders.get(idx).get(input)
            .switchMap(result -> cascadingSwitchMap(result, fluxProviders, idx + 1));
    }
    return Flux.just(input);
}
// code using the recursive method
List<FluxProvider> fluxProviders = new ArrayList<>();
fluxProviders.add(fooEnhancer::enhance);
fluxProviders.add(barEnhancer::enhance);
fluxProviders.add(bazEnhancer::enhance);
cascadingSwitchMap(Result.empty(), fluxProviders, 0)
    .subscribe(result -> doSomethingWith(result));
But maybe there is a more elegant solution using an operator/feature of project-reactor. Does anybody know such a feature? In fact, the requirement doesn't seem to be such an unusual one, does it?
switchMap feels inappropriate here. If you have a List<Enhancer> by the time the Flux pipeline is declared, why not apply logic close to what you had in imperative style:
List<Enhancer> enhancers = <ordered list of different Enhancer impls>;
Mono<Result> resultMono = Mono.just(Result.empty());
for (Enhancer enhancer : enhancers) {
    resultMono = resultMono.map(enhancer::enhance); // previousValue -> enhancer.enhance(previousValue)
}
return resultMono;
That can even be performed later at subscription time for even more dynamic resolution of the enhancers by wrapping the whole code above in a Mono.defer(() -> {...}) block.
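For example, roughly (a sketch reusing the enhancers list from above):
// Mono.defer builds the enhancer chain lazily, once per subscriber,
// instead of at assembly time.
Mono<Result> deferred = Mono.defer(() -> {
    Mono<Result> resultMono = Mono.just(Result.empty());
    for (Enhancer enhancer : enhancers) {
        resultMono = resultMono.map(enhancer::enhance);
    }
    return resultMono;
});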

Get all outgoing connectors from a shape programmatically

I want to rename a connector after a shape has been dropped.
Let's say I have a shape1 and I drop a shape2 connected to shape1.
I want to get the connector shape between shape1 and shape2 so that I can rename it.
I guess it depends on what stage you intercept the drop. If it's immediately, you might make some assumptions about how many connectors might be involved, but if it's some time after the drop, then you might want to determine how many connections are involved.
As an example, with the following shapes, you could approach this in a number of ways:
1) Use the GluedShapes method, working back from ShapeTwo
2) Use the GluedShapes method, including the 'from' shape
3) Iterate through the Connects collection of the Page
4) Iterate over the Connect objects on your target shape (ShapeOne)
I would definitely try and use the GluedShapes method (which came into Visio in 2010) over the Connect objects, but I'm adding them here as they can be useful depending on what you're trying to achieve.
Here's an example using LINQPad:
void Main()
{
    var vApp = MyExtensions.GetRunningVisio();
    var vPag = vApp.ActivePage;

    //For demo purposes I'm assuming the following shape IDs
    //but in reality you'd get a reference by other methods
    //such as Window.Selection, Page index or ID
    var shpOne = vPag.Shapes.ItemFromID[1];
    var shpTwo = vPag.Shapes.ItemFromID[2];

    Array gluedIds;

    Console.WriteLine("1) using GluedShapes with the 'to' shape only");
    gluedIds = shpTwo.GluedShapes(Visio.VisGluedShapesFlags.visGluedShapesIncoming1D, "");
    IterateByIds(vPag, gluedIds);

    Console.WriteLine("\n2) using GluedShapes with the 'to' and 'from' shapes");
    gluedIds = shpTwo.GluedShapes(Visio.VisGluedShapesFlags.visGluedShapesIncoming1D, "", shpOne);
    IterateByIds(vPag, gluedIds);

    Console.WriteLine("\n3) using the Connects collection on Page");
    var pageConns = from c in vPag.Connects.Cast<Visio.Connect>()
                    where c.FromSheet.OneD != 0
                    group c by c.FromSheet into connectPair
                    where connectPair.Any(p => p.ToSheet.ID == shpOne.ID)
                          && connectPair.Any(p => p.ToSheet.ID == shpTwo.ID)
                    select connectPair.Key.Text;
    pageConns.Dump();

    Console.WriteLine("\n4) using FromConnects and Linq to navigate from shpOne to shpTwo finding the connector in the middle");
    var shpConns = from c in shpOne.FromConnects.Cast<Visio.Connect>()
                   where c.FromSheet.OneD != 0
                   let targetConnector = c.FromSheet
                   from c2 in targetConnector.Connects.Cast<Visio.Connect>()
                   where c2.ToSheet.Equals(shpTwo)
                   select targetConnector.Text;
    shpConns.Dump();
}

private void IterateByIds(Visio.Page hostPage, Array shpIds)
{
    if (shpIds.Length > 0)
    {
        for (int i = 0; i < shpIds.Length; i++)
        {
            //Report on the shape text (or change it as required)
            Console.WriteLine(hostPage.Shapes.ItemFromID[(int)shpIds.GetValue(i)].Text);
        }
    }
}
Running the above will report the text of the matching connector shapes for each of the four approaches.
It's worth bearing in mind that the Connects code (3 and 4) makes the assumption that connector shapes (1D) are being connected to the flowchart shapes (2D) and not the other way round (which is possible).
You can think of the connect objects as being analogous to connection points, so in the diagram, the three connector shapes generate six connect objects.
Anyway, hope that gets you unstuck.
UPDATE - Just to be clear (and to answer the original question properly), the code to get all outgoing connectors from ShapeOne would be:
Console.WriteLine("using GluedShapes to report outgoing connectors");
gluedIds = shpOne.GluedShapes(Visio.VisGluedShapesFlags.visGluedShapesOutgoing1D, "");
IterateByIds(vPag, gluedIds);

Groovy/Scala - Abort Early while iterating using an accumulator

I have a fairly large collection that I would like to iterate and find out if the collection contains more than one instance of a particular number. Since the collection is large, I'd like to exit early, i.e. not traverse the complete list.
I have a dirty looking piece of code that does this in a non-functional programming way. However, I'm unable to find a functional programming way of doing this (in Groovy or Scala), since I need to do 2 things at the same time.
Accumulate state
Exit Early
The "accumulate state" part can be done using the "inject" or "fold" methods in Groovy/Scala, but there's no way of exiting early from those methods. The original Groovy code is below. Any thoughts?
def collection = [1,2,3,2,4,6,0,65,... 1 million more numbers]
def n = 2

boolean foundMoreThanOnce(List<Integer> collection, Integer n) {
    def foundCount = 0
    for (Integer i : collection) {
        if (i == n) {
            foundCount = foundCount + 1
        }
        if (foundCount > 1) {
            return true
        }
    }
    return false
}

print foundMoreThanOnce(collection, n)
One of many possible Scala solutions.
def foundMoreThanOnce[A](collection: Seq[A], target: A): Boolean =
collection.dropWhile(_ != target).indexOf(target,1) > 0
Or a slight variation...
collection.dropWhile(target.!=).drop(1).contains(target)
Scans the collection only until the 2nd target element is found.
Not sure about Groovy, but if it's possible for you to use Java 8, then there is a possibility:
collection.stream().filter(z -> z == 2).limit(2)
The limit will stop the stream processing as soon as it gets the 2nd occurrence of 2.
You can use it as below to ensure there are exactly two occurrences:
long occ = collection.stream().filter(z -> z == 2).limit(2).count();
if (occ == 2)
    return true;
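Wrapped up as a method in the shape of the question's foundMoreThanOnce (just a sketch; the names mirror the question):
// Stream-based sketch of foundMoreThanOnce: limit(2) short-circuits,
// so the stream stops as soon as a second match is seen.
static boolean foundMoreThanOnce(List<Integer> collection, int n) {
    return collection.stream().filter(i -> i == n).limit(2).count() == 2;
}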

Spark - how to handle lazy evaluation in the case of iterative (or recursive) function calls

I have a recursive function that needs to compare the results of the current call to the previous call to figure out whether it has reached convergence. My function does not contain any action - it only contains map, flatMap, and reduceByKey. Since Spark does not evaluate transformations (until an action is called), my next iteration does not get the proper values to compare for convergence.
Here is a skeleton of the function -
def func1(sc: SparkContext, nodes: RDD[List[Long]], didConverge: Boolean, changeCount: Int): RDD[List[Long]] = {
  if (didConverge)
    nodes
  else {
    val currChangeCount = sc.accumulator(0, "xyz")
    val newNodes = performSomeOps(nodes, currChangeCount) // does a few map/flatMap/reduceByKey operations
    if (currChangeCount.value == changeCount) {
      func1(sc, newNodes, true, currChangeCount.value)
    } else {
      func1(sc, newNodes, false, currChangeCount.value)
    }
  }
}
performSomeOps only contains map, flatMap, and reduceByKey transformations. Since it does not have any action, the code in performSomeOps does not execute, so my currChangeCount does not get the actual count. What that implies is that the condition that checks for convergence (currChangeCount.value == changeCount) is going to be invalid. One way to overcome this is to force an action within each iteration by calling a count, but that is unnecessary overhead.
I am wondering what I can do to force an action without much overhead, or whether there is another way to address this problem.
I believe there is a very important thing you're missing here:
For accumulator updates performed inside actions only, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task’s update may be applied more than once if tasks or job stages are re-executed.
Because of that, accumulators cannot be reliably used for managing control flow and are better suited for job monitoring.
Moreover, executing an action is not unnecessary overhead. If you want to know the result of the computation, you have to perform it, unless of course the result is trivial. The cheapest action possible is:
rdd.foreach { case _ => }
but it won't address the problem you have here.
In general iterative computations in Spark can be structured as follows:
def func1(checkpointInterval: Int)(sc: SparkContext, nodes: RDD[List[Long]],
    didConverge: Boolean, changeCount: Int, iteration: Int): RDD[List[Long]] = {
  if (didConverge) nodes
  else {
    // Accumulator from the question's skeleton - used for monitoring only
    val currChangeCount = sc.accumulator(0, "xyz")
    // Compute and cache new nodes
    val newNodes = performSomeOps(nodes, currChangeCount).cache
    // Periodically checkpoint to avoid stack overflow
    if (iteration % checkpointInterval == 0) newNodes.checkpoint
    /* Call a function which computes the values that determine control flow.
       This executes an action on newNodes. */
    val newChangeCount = computeChangeCount(newNodes)
    // Unpersist old nodes
    nodes.unpersist()
    func1(checkpointInterval)(
      sc, newNodes, newChangeCount == changeCount,
      newChangeCount, iteration + 1
    )
  }
}
I see that these map/flatMap/reduceByKey transformations are updating an accumulator. Therefore the only way to perform all the updates is to execute all these transformations, and count is the easiest way to achieve that; it gives the lowest overhead compared to other ways (cache + count, first or collect).
Previous answers put me on the right track to solve a similar convergence detection problem.
foreach is presented in the docs as:
foreach(func) : Run a function func on each element of the dataset. This is usually done for side effects such as updating an Accumulator or interacting with external storage systems.
It seems like instead of using rdd.foreach() as a cheap action to trigger accumulator increments placed in various transformations, it should be used to do the incrementing itself.
I'm unable to produce a Scala example, but here's a basic Java version, if it can still help:
// Convergence is reached when two iterations
// return the same number of results
long previousCount = -1;
long currentCount = 0;

while (previousCount != currentCount) {
    rdd = doSomethingThatUpdatesRdd(rdd);

    // Count entries in new rdd with foreach + accumulator
    rdd.foreach(tuple -> accumulator.add(1));

    // Update helper values
    previousCount = currentCount;
    currentCount = accumulator.sum();
    accumulator.reset();
}
// Convergence is reached
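In case it helps, the accumulator above can simply be a named LongAccumulator registered with the SparkContext (a sketch assuming Spark 2.x and a SparkContext called sc):
// org.apache.spark.util.LongAccumulator - driver-side handle, updated by the foreach above
LongAccumulator accumulator = sc.longAccumulator("entryCount");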

Implementing Enqueue and Dequeue for Queue using Alloy

Having the following:
sig Queue { root: Node }
sig Node { next: lone Node }
fact nextNotReflexive { no n:Node | n = n.next }
fact nextNotCyclic { no n:Node | n in n.^next }
Can anyone help with the implementation of Enq and Deq?
pred Enq[q,q':Queue, n:Node]{}
pred Deq [q,q':Queue]{}
Any help is appreciated.
You can define enqueue easily:
pred Enq[q, q': Queue, n: Node] {
    q'.root = n and n.next = q.root
}
But dequeue is not nearly as easy. The problem is that you need to modify a Node, not a Queue, in order to make the change to the last node that's required to dequeue an element. You actually need to replace the second-to-last Node in the Queue with a different Node whose "next" field is empty -- but since you're using atom identity to represent node uniqueness, that Node will actually be a different Node.