Implementing Enqueue and Dequeue for a Queue using Alloy

Having the following:
sig Queue { root: Node }
sig Node { next: lone Node }
fact nextNotReflexive { no n:Node | n = n.next }
fact nextNotCyclic { no n:Node | n in n.^next }
Can anyone help with the implementation of Enq and Deq?
pred Enq[q, q': Queue, n: Node] {}
pred Deq[q, q': Queue] {}
Any help is appreciated.

You can define enqueue easily:
pred Enq[q, q': Queue, n: Node] {
  q'.root = n and n.next = q.root
}
But dequeue is not nearly as easy. The problem is that you need to modify a Node, not a Queue, to make the change dequeuing requires: removing the last element means the second-to-last Node in the Queue must end up with an empty "next" field. You actually need to replace that second-to-last Node with a different Node whose "next" field is empty -- but since you're using atom identity to represent node uniqueness, that replacement really is a different Node.
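To make the difficulty concrete, here is the naive Deq constraint one might try (a sketch, not part of the original answer). Because next is a field of Node, shared by every Queue, the Analyzer finds no instance at all that satisfies it: any chain over the same atoms still reaches the dequeued node.
pred Deq[q, q': Queue] {
  -- 'last' is the node in q's chain with no successor
  let last = q.root.*next - next.Node |
    -- naive attempt: q' holds exactly q's nodes minus the last one;
    -- unsatisfiable, because the second-to-last node's 'next' still points at 'last'
    q'.root.*next = q.root.*next - last
}
A workable Deq therefore has to either let q' be built from fresh Node atoms that mirror q's chain, or change the model so that next becomes a per-queue relation (for example, a next: Node -> lone Node field declared in Queue).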

Related

Java - GraphStream node modified event listener

In my code, I have a listener on mouse released, and in there, I have a snippet like this:
for (Node n : gs) {
    Object o = n.getAttribute("ui.selected");
    if (o != null && (Boolean) o) {
        nodes[num++] = n;
    }
}
The problem is that the event manager hasn't yet updated all of the selected nodes to set the ui.selected attribute. This is a race condition.
I seem to be missing how to add an event listener to a node, or to the view, that fires when nodes are modified.
I am trying to calculate metrics on a set of selected nodes.

Build Graph given a SuccessorsFunction and a set of nodes

I would like to build a Guava ImmutableGraph given a set of nodes (starting points) and a SuccessorsFunction. The graph would contain all the nodes reachable from any of the starting node and all the edges seen on the way thanks to the SuccessorsFunction. (E.g., given starting node {a} and successors a → b and b → c, the resulting graph should be {(a, b), (b, c)}.)
I see how I can obtain a Traverser to explore the reachable nodes in a certain order, given starting nodes and a SuccessorsFunction, but it does not meet my needs as I want to obtain a graph, not just the nodes.
It is not very hard to define an algorithm that does this, but it’s subtle enough to deserve trying to re-use an existing solution. I would be surprised if it didn’t exist already in the library. Does it? Or is this requirement not sensible?
I didn’t find this in the related wiki page.
Guava doesn't have this feature built in, so you'll need a custom solution that does some sort of graph traversal (like breadth-first traversal), such as the following code snippet.
public static <N> ImmutableGraph<N> buildGraphWithBreadthFirstTraversal(
        Iterable<N> startingNodes, SuccessorsFunction<N> successorsFunction) {
    MutableGraph<N> result = GraphBuilder.directed().allowsSelfLoops(true).build();
    startingNodes.forEach(result::addNode);
    Queue<N> nodesRemaining = Queues.newArrayDeque(startingNodes);
    while (!nodesRemaining.isEmpty()) {
        N next = nodesRemaining.remove();
        for (N successor : successorsFunction.successors(next)) {
            if (!result.edges().contains(EndpointPair.ordered(next, successor))) {
                nodesRemaining.add(successor);
                result.putEdge(next, successor);
            }
        }
    }
    return ImmutableGraph.copyOf(result);
}
Here is a basic JUnit 5 unit test that confirms the code works when given a starting node and a successorsFunction that together form a cycle of 1 -> 2 -> 4 -> 1.
@Test
void succeedsOnTraversalWithCycle() {
    var result =
        MoreGraphs.buildGraphWithBreadthFirstTraversal(
            ImmutableList.of(1),
            node -> {
                int nextNode = node * 2;
                return nextNode <= 4 ? ImmutableList.of(nextNode) : ImmutableList.of(1);
            });
    assertThat(result)
        .isEqualTo(
            GraphBuilder.directed()
                .allowsSelfLoops(true)
                .immutable()
                .putEdge(1, 2)
                .putEdge(2, 4)
                .putEdge(4, 1)
                .build());
}

How to sessionize stream with Apache Flink?

I want to sessionize this stream: 1,1,1,2,2,2,2,2,3,3,3,3,3,3,3,0,3,3,3,5, ... to these sessions:
1,1,1
2,2,2,2,2
3,3,3,3,3,3,3
0
3,3,3
5
I've written a custom Trigger to detect when the stream elements change from 1 to 2 (2 to 3, 3 to 0, and so on) and then fire the trigger. But this is not the solution: when I process the first element of the 2's and fire the trigger, the window will be [1,1,1,2], whereas I need to fire the trigger on the last element of the 1's.
Here is the pseudocode of the onElement function in my custom trigger class:
override def onElement(element: Session, timestamp: Long, window: W, ctx: TriggerContext): TriggerResult = {
  if (prevState == element.value) {
    prevState = element.value
    TriggerResult.CONTINUE
  } else {
    prevState = element.value
    TriggerResult.FIRE
  }
}
How can I solve this problem?
I think a FlatMapFunction with a ListState is the easiest way to implement this use-case.
When a new element arrives (i.e., the flatMap() method is called), you check whether the value changed. If the value did not change, you append the element to the state. If the value changed, you emit the current list state as a session, clear the list, and insert the new element as the first element of the list state.
However, keep in mind that this assumes the order of elements is preserved. Flink ensures this within a partition, i.e., as long as elements are not shuffled and all operators run with the same parallelism.
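Here is a minimal sketch of that approach in Scala, assuming the stream has been keyed (for a single global stream, e.g. keyBy(_ => 0)) so that ListState is available as keyed state; the class name Sessionizer and the Int element type stand in for the question's Session elements:
import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.util.Collector
import scala.collection.JavaConverters._

// Emits a finished session (a list of equal values) whenever the value changes.
class Sessionizer extends RichFlatMapFunction[Int, List[Int]] {

  @transient private var current: ListState[Int] = _

  override def open(parameters: Configuration): Unit = {
    current = getRuntimeContext.getListState(
      new ListStateDescriptor[Int]("current-session", classOf[Int]))
  }

  override def flatMap(value: Int, out: Collector[List[Int]]): Unit = {
    val buffered = current.get().asScala.toList
    if (buffered.nonEmpty && buffered.head != value) {
      out.collect(buffered) // value changed: emit the finished session
      current.clear()       // and start a new one
    }
    current.add(value)      // the new element joins the (possibly new) open session
  }
}
Note that the final, still-open session is only emitted when a different value arrives; flushing it at the end of the input would need extra handling.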

Groovy/Scala - Abort Early while iterating using an accumulator

I have a fairly large collection that I would like to iterate to find out whether it contains more than one instance of a particular number. Since the collection is large, I'd like to exit early, i.e. not traverse the complete list.
I have a dirty-looking piece of code that does this in a non-functional-programming way. However, I'm unable to find a functional programming way of doing this (in Groovy or Scala), since I need to do two things at the same time:
Accumulate state
Exit Early
The "accumulate state" can be done using the "inject" or "fold" methods in Groovy/Scala but there's no way of exiting early from those methods. Original groovy code is below. Any thoughts?
def collection = [1, 2, 3, 2, 4, 6, 0, 65 /* ... 1 million more numbers */]
def n = 2
boolean foundMoreThanOnce(List<Integer> collection, Integer n) {
    def foundCount = 0
    for (Integer i : collection) {
        if (i == n) {
            foundCount = foundCount + 1
        }
        if (foundCount > 1) {
            return true
        }
    }
    return false
}
print foundMoreThanOnce(collection, n)
One of many possible Scala solutions.
def foundMoreThanOnce[A](collection: Seq[A], target: A): Boolean =
  collection.dropWhile(_ != target).indexOf(target, 1) > 0
Or a slight variation...
collection.dropWhile(target.!=).drop(1).contains(target)
Scans the collection only until the 2nd target element is found.
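For a very large strict collection you can also run the same check through an iterator, which stops as soon as a second occurrence is seen and never builds an intermediate collection (an alternative sketch, not from the original answer):
def foundMoreThanOnce[A](collection: Seq[A], target: A): Boolean = {
  val hits = collection.iterator.filter(_ == target) // lazy: elements are examined on demand
  hits.hasNext && { hits.next(); hits.hasNext }      // true iff at least two matches exist
}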
Not sure about Groovy, but if you can use Java 8 then there is a possibility:
collection.stream().filter(z -> z == 2).limit(2)
The limit will stop the stream processing as soon as it gets the 2nd occurrence of 2.
You can use it as below to check that there are at least two occurrences:
long occ = collection.stream().filter(z -> z == 2).limit(2).count();
if (occ == 2)
    return true;

Spark - how to handle with lazy evaluation in case of iterative (or recursive) function calls

I have a recursive function that needs to compare the results of the current call to the previous call to figure out whether it has reached a convergence. My function does not contain any action - it only contains map, flatMap, and reduceByKey. Since Spark does not evaluate transformations (until an action is called), my next iteration does not get the proper values to compare for convergence.
Here is a skeleton of the function -
def func1(sc: SparkContext, nodes: RDD[List[Long]], didConverge: Boolean, changeCount: Int): RDD[List[Long]] = {
  if (didConverge)
    nodes
  else {
    val currChangeCount = sc.accumulator(0, "xyz")
    val newNodes = performSomeOps(nodes, currChangeCount) // does a few map/flatMap/reduceByKey operations
    if (currChangeCount.value == changeCount) {
      func1(sc, newNodes, true, currChangeCount.value)
    } else {
      func1(sc, newNodes, false, currChangeCount.value)
    }
  }
}
performSomeOps only contains map, flatMap, and reduceByKey transformations. Since it does not include any action, the code in performSomeOps does not execute, so my currChangeCount does not get the actual count. That means the condition used to check for convergence (currChangeCount.value == changeCount) is invalid. One way to overcome this is to force an action within each iteration by calling a count, but that is unnecessary overhead.
I am wondering what I can do to force an action without much overhead, or whether there is another way to address this problem.
I believe there is a very important thing you're missing here:
For accumulator updates performed inside actions only, Spark guarantees that each task’s update to the accumulator will only be applied once, i.e. restarted tasks will not update the value. In transformations, users should be aware of that each task’s update may be applied more than once if tasks or job stages are re-executed.
Because of that, accumulators cannot be reliably used for managing control flow and are better suited for job monitoring.
Moreover, executing an action is not unnecessary overhead. If you want to know the result of the computation, you have to perform it (unless, of course, the result is trivial). The cheapest possible action is:
rdd.foreach { case _ => }
but it won't address the problem you have here.
In general, iterative computations in Spark can be structured as follows:
def func1(checkpointInterval: Int)(
    sc: SparkContext, nodes: RDD[List[Long]],
    didConverge: Boolean, changeCount: Int, iteration: Int): RDD[List[Long]] = {
  if (didConverge) nodes
  else {
    // Compute and cache new nodes
    val newNodes = performSomeOps(nodes).cache
    // Periodically checkpoint to avoid stack overflow
    if (iteration % checkpointInterval == 0) newNodes.checkpoint
    /* Call a function which computes the values
       that determine control flow. This executes an action on newNodes. */
    val currChangeCount = computeChangeCount(newNodes)
    // Unpersist old nodes
    nodes.unpersist
    func1(checkpointInterval)(
      sc, newNodes, currChangeCount == changeCount,
      currChangeCount, iteration + 1
    )
  }
}
I see that these map/flatMap/reduceByKey transformations are updating an accumulator. Therefore the only way to perform all the updates is to execute all these transformations, and count is the easiest way to achieve that: it gives the lowest overhead compared to the other options (cache + count, first, or collect).
Previous answers put me on the right track to solve a similar convergence detection problem.
foreach is presented in the docs as:
foreach(func) : Run a function func on each element of the dataset. This is usually done for side effects such as updating an Accumulator or interacting with external storage systems.
It seems like instead of using rdd.foreach() as a cheap action to trigger accumulator increments placed in various transformations, it should be used to do the incrementing itself.
I'm unable to produce a Scala example, but here's a basic Java version, if it can still help:
// Convergence is reached when two iterations
// return the same number of results
long previousCount = -1;
long currentCount = 0;
while (previousCount != currentCount) {
    rdd = doSomethingThatUpdatesRdd(rdd);
    // Count entries in new rdd with foreach + accumulator
    rdd.foreach(tuple -> accumulator.add(1));
    // Update helper values
    previousCount = currentCount;
    currentCount = accumulator.sum();
    accumulator.reset();
}
// Convergence is reached
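For reference, here is a rough Scala equivalent of the same loop (an untested sketch: doSomethingThatUpdatesRdd is the same placeholder as in the Java version, rdd is assumed to be a var initialized earlier, and the accumulator is a LongAccumulator created with sc.longAccumulator):
// Convergence is reached when two iterations
// return the same number of results
val accumulator = sc.longAccumulator("row-count")
var previousCount = -1L
var currentCount = 0L
while (previousCount != currentCount) {
  rdd = doSomethingThatUpdatesRdd(rdd)
  // Count entries in the new rdd with foreach + accumulator
  rdd.foreach(_ => accumulator.add(1L))
  // Update helper values
  previousCount = currentCount
  currentCount = accumulator.sum
  accumulator.reset()
}
// Convergence is reached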