How to implement deterministic single threaded network simulation - simulation

I read about how FoundationDB does its network testing/simulation here: http://www.slideshare.net/FoundationDB/deterministic-simulation-testing
I would like to implement something very similar, but cannot figure out how they actually did implement it. How would one go about writing, for example, a C++ class that does what they do. Is it possible to do the kind of simulation they do without doing any code generation (as they presumeably do)?
Also: How can a simulation be repeated, if it contains random events?? Each time the simulation would require to choose a new random value and thus be not the same run as the one before. Maybe I am missing something here...hope somebody can shed a bit of light on the matter.

You can find a little bit more detail in the talk that went along with those slides here: https://www.youtube.com/watch?v=4fFDFbi3toc
As for the determinism question, you're right that a simulation cannot be repeated exactly unless all possible sources of randomness and other non-determinism are carefully controlled. To that end:
(1) Generate all random numbers from a PRNG that you seed with a known value.
(2) Avoid any sort of branching or conditionals based on facts about the world which you don't control (e.g. the time of day, the load on the machine, etc.), or if you can't help that, then pseudo-randomly simulate those things too.
(3) Ensure that whatever mechanism you pick for concurrency has a mode in which it can guarantee a deterministic execution order.
Since it's easy to mess all those things up, you'll also want to have a way of checking whether determinism has been violated.
All of this is covered in greater detail in the talk that I linked above.

In the sims I've built the biggest issue with repeatability ends up being proper seed management (as per the previous answer). You want your simulations to give different results only when you supply a different seed to your random number generators than before.
After that the biggest issue I've seen seems tends to be making sure you don't iterate over collections with nondeterministic ordering. For instance, in Java, you'd use a LinkedHashMap instead of a HashMap.

Related

Optaplanner: Generating a partial solution to VRP where trucks and/or stops may remain unassigned based on Time windows

I am solving a variation on vehicle routing problem. The model worked until I implemented a change where certain vehicles and/or stops may remain unassigned because the construction filter does not allow the move due to time window considerations (late arrival not allowed).
The problem size is 2 trucks/3 stops. truck_1 has 2 stops (Stop_1 and Stop_2) assigned to it, and consequently 1 truck and 1 stop remain unassigned since truck_2 will arrive late to Stop_3.
I have the following error:
INFO o.o.c.i.c.DefaultConstructionHeuristicPhase - Construction Heuristic phase (0) ended: step total (2), time spent (141), best score (-164hard/19387soft).
java.lang.IllegalStateException: Local Search phase started with an uninitialized Solution. First initialize the Solution. For example, run a Construction Heuristic phase first.
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.phaseStarted(DefaultLocalSearchPhase.java:119)
at org.optaplanner.core.impl.localsearch.DefaultLocalSearchPhase.solve(DefaultLocalSearchPhase.java:60)
at org.optaplanner.core.impl.solver.DefaultSolver.runPhases(DefaultSolver.java:213)
at org.optaplanner.core.impl.solver.DefaultSolver.solve(DefaultSolver.java:176)
I tried to set the planning variable to null (nullable = true) but it seems it is not allowed in case of chained variables.
I am using Optaplanner 6.2.
Please help,
Thank you,
Piyush
Your construction filter may be too restrictive, it could prevent the construction heuristic from creating an initialized solution. You should remove the time window constraint from the construction filter and add it as a hard score constraint in your score calculator instead.
From the Optaplanner docs:
Instead of implementing a hard constraint, it can sometimes be built in. For example: If Lecture A should never be assigned to Room X, but it uses ValueRangeProvider on Solution, so the Solver will often try to assign it to Room X too (only to find out that it breaks a hard constraint). Use a ValueRangeProvider on the planning entity or filtered selection to define that Course A should only be assigned a Room different than X.
This can give a good performance gain in some use cases, not just because the score calculation is faster, but mainly because most optimization algorithms will spend less time evaluating unfeasible solutions. However, usually this not a good idea because there is a real risk of trading short term benefits for long term harm:
Many optimization algorithms rely on the freedom to break hard constraints when changing planning entities, to get out of local optima.
Both implementation approaches have limitations (feature compatiblity, disabling automatic performance optimizations, ...), as explained in their documentation.

How do *you* explain macro behaviour in Netlogo?

I'd be interested to know what strategies people are using to work backwards from the results of a model (from observed emergent behaviour) to try to answer the question: what is it about individual turtles that has led to this macro behaviour?
Are you asking, "How can you guess what rules govern agents [turtles], given the macro-level behavior?"? Or are you asking, "Given that you have the source code, how to do you figure out what it is about the source that generates the macro-level behavior?"?
Answering the first question is very hard, often. I don't have suggestions.
Answering the second can be hard, too.
One strategy is to experiment with different initial configurations, or experiment with different rules in agents [turtles]. If you have no guesses about what to vary, make arbitrary choices at first, until you begin to have guesses, or even vague intuitions. Then vary the code in ways that will explore whether your guesses are correct, and in ways that will allow you to refine your guesses. This strategy won't always work.
Perhaps it might be useful to try to think through how you could generate the macro-level behavior, if that's what you wanted to produce. You might not have any idea of what an answer might be--this would amount to answering the first question above--but if you do, it might lead to guesses that you could use for the preceding strategy.
I do this when I am debugging because sometimes I get unexpected macro-behaviour and I need to determine whether it is real or an error. Typically what I would do is set the random seed then look at which agents are involved in the behaviour and when (ticks in NetLogo) it occurs. I would then rerun the model and stop it a few ticks beforehand and inspect the agents that I know are going to be involved in the behaviour to see if there's something unusual about them or their environment. Macro behaviour typically occurs because of the interactions of agents, environment and events so I am trying to determine WHAT those elements are, before seeking to explain how they combine to create the behaviour. Once that's done, I can usually trace the code (dropping in a print statement and inspecting the relevant agents) to work out how it occurred.

G-Counters in Riak: Don't the underlying vclocks provide the same data?

I've been reading into CvRDTs and I'm aware that Riak has already added a few to Riak 2.
My question is: why would Riak implement a gcounter when it sounds like the underlying vclock that is associated with every object records the same information? Wouldn't the result be a gcounter stored with a vclock, each containing the same essential information?
My only guess right now would be that Riak may garbage-collect the vclocks, trimming information that would actually be important for the purpose of a gcounter (i.e. the number of increments).
I cannot read Erlang particularly well, so maybe I've wrongly assumed that Riak stores vclocks with these special-case data types. However, the question still applies to the homegrown solutions that are written on top of standard Riak (and hence inherit vclocks with each object persisted).
EDIT:
I have since written the following article to help explain CvRDTs in a more practical manner. This article also touches on the redundancy I have highlighted above:
Conflict-free Replicated Data Types (CRDT) - A digestible explanation with less math.
Riak prunes version vectors, no big deal for causality (false concurrency, more siblings, safe) but a disaster for counters.
Riak's CRDT support is general. We "hide" CRDTs inside the regular riak object.
Riak's CRDT support is in it's first wave, we'll be optimising further as we make further releases.
We have a great mailing list for questions like this, btw. Stack Overflow has it's uses but if you want to talk to the authors of an open source DB why not use their list? Since Riak is open source, you can submit a pull request, we'd love to incorporate your ideas into the code base.
Quick answer: Riak's counters are actually PN-Counters, ie they allow both increments and decrements, so can't be implemented like a vclock, as they require tracking the increments and decrements differently.
Long Answer:
This question suggests you have completely misunderstood the difference between a g-counter and a vector clock (or version vector).
A vector clock (vclock) is a system for tracking the causality of concurrent updates to a piece of data. They are a map of {actor => logical clock}. Actors only increment their logical clocks when the data they're associated with changes, and try to increment it as little as possible (so at most once per update). Two vclocks can either be concurrent, or one can dominate the other.
A g-counter is a CvRDT with what looks like the same structure as a vclock, but with important differences. They are implemented as a map of {actor => counter}. Actors can increment their own counter as much as they want. A g-counter has the concept of a "counter value", and the concept of a "merge", so that when concurrent operations are executed by different actors, they can work out what the actual "counter value" should be.
Importantly, g-counters can't track causality, and vclocks have no idea what their "counter value" is.
To conflate the two in a codebase would not only be confusing, but could also bring in errors.
Add this to the fact that riak actually implements pn-counters. The difference is that a g-counter can only be incremented, but pn-counters can be both incremented and decremented. Pn-counters work by being a map of {actor => (increment count, decrement count)}, which more obviously has a different structure to a vclock. You can only increment both those counts, hence why there are two and not just one.

How to handle the two signals depending on each other?

I read Deprecating the Observer Pattern with Scala.React and found reactive programming very interesting.
But there is a point I can't figure out: the author described the signals as the nodes in a DAG(Directed acyclic graph). Then what if you have two signals(or event sources, or models, w/e) depending on each other? i.e. the 'two-way binding', like a model and a view in web front-end programming.
Sometimes it's just inevitable because the user can change view, and the back-end(asynchronous request, for example) can change model, and you hope the other side to reflect the change immediately.
The loop dependencies in a reactive programming language can be handled with a variety of semantics. The one that appears to have been chosen in scala.React is that of synchronous reactive languages and specifically that of Esterel. You can have a good explanation of this semantics and its alternatives in the paper "The synchronous languages 12 years later" by Benveniste, A. ; Caspi, P. ; Edwards, S.A. ; Halbwachs, N. ; Le Guernic, P. ; de Simone, R. and available at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1173191&tag=1 or http://virtualhost.cs.columbia.edu/~sedwards/papers/benveniste2003synchronous.pdf.
Replying #Matt Carkci here, because a comment wouldn't suffice
In the paper section 7.1 Change Propagation you have
Our change propagation implementation uses a push-based approach based on a topologically ordered dependency graph. When a propagation turn starts, the propagator puts all nodes that have been invalidated since the last turn into a priority queue which is sorted according to the topological order, briefly level, of the nodes. The propagator dequeues the node on the lowest level and validates it, potentially changing its state and putting its dependent nodes, which are on greater levels, on the queue. The propagator repeats this step until the queue is empty, always keeping track of the current level, which becomes important for level mismatches below. For correctly ordered graphs, this process monotonically proceeds to greater levels, thus ensuring data consistency, i.e., the absence of glitches.
and later at section 7.6 Level Mismatch
We therefore need to prepare for an opaque node n to access another node that is on a higher topological level. Every node that is read from during n’s evaluation, first checks whether the current propagation level which is maintained by the propagator is greater than the node’s level. If it is, it proceed as usual, otherwise it throws a level mismatch exception containing a reference to itself, which is caught only in the main propagation loop. The propagator then hoists n by first changing its level to a level above the node which threw the exception, reinserting n into the propagation queue (since it’s level has changed) for later evaluation in the same turn and then transitively hoisting all of n’s dependents.
While there's no mention about any topological constraint (cyclic vs acyclic), something is not clear. (at least to me)
First arises the question of how is the topological order defined.
And then the implementation suggests that mutually dependent nodes would loop forever in the evaluation through the exception mechanism explained above.
What do you think?
After scanning the paper, I can't find where they mention that it must be acyclic. There's nothing stopping you from creating cyclic graphs in dataflow/reactive programming. Acyclic graphs only allow you to create Pipeline Dataflow (e.g. Unix command line pipes).
Feedback and cycles are a very powerful mechanism in dataflow. Without them you are restricted to the types of programs you can create. Take a look at Flow-Based Programming - Loop-Type Networks.
Edit after second post by pagoda_5b
One statement in the paper made me take notice...
For correctly ordered graphs, this process
monotonically proceeds to greater levels, thus ensuring data
consistency, i.e., the absence of glitches.
To me that says that loops are not allowed within the Scala.React framework. A cycle between two nodes would seem to cause the system to continually try to raise the level of both nodes forever.
But that doesn't mean that you have to encode the loops within their framework. It could be possible to have have one path from the item you want to observe and then another, separate, path back to the GUI.
To me, it always seems that too much emphasis is placed on a programming system completing and giving one answer. Loops make it difficult to determine when to terminate. Libraries that use the term "reactive" tend to subscribe to this thought process. But that is just a result of the Von Neumann architecture of computers... a focus of solving an equation and returning the answer. Libraries that shy away from loops seem to be worried about program termination.
Dataflow doesn't require a program to have one right answer or ever terminate. The answer is the answer at this moment of time due to the inputs at this moment. Feedback and loops are expected if not required. A dataflow system is basically just a big loop that constantly passes data between nodes. To terminate it, you just stop it.
Dataflow doesn't have to be so complicated. It is just a very different way to think about programming. I suggest you look at J. Paul Morison's book "Flow Based Programming" for a field tested version of dataflow or my book (once it's done).
Check your MVC knowledge. The view doesn't update the model, so it won't send signals to it. The controller updates the model. For a C/F converter, you would have two controllers (one for the F control, on for the C control). Both controllers would send signals to a single model (which stores the only real temperature, Kelvin, in a lossless format). The model sends signals to two separate views (one for C view, one for F view). No cycles.
Based on the answer from #pagoda_5b, I'd say that you are likely allowed to have cycles (7.6 should handle it, at the cost of performance) but you must guarantee that there is no infinite regress. For example, you could have the controllers also receive signals from the model, as long as you guaranteed that receipt of said signal never caused a signal to be sent back to the model.
I think the above is a good description, but it uses the word "signal" in a non-FRP style. "Signals" in the above are really messages. If the description in 7.1 is correct and complete, loops in the signal graph would always cause infinite regress as processing the dependents of a node would cause the node to be processed and vice-versa, ad inf.
As #Matt Carkci said, there are FRP frameworks that allow loops, at least to a limited extent. They will either not be push-based, use non-strictness in interesting ways, enforce monotonicity, or introduce "artificial" delays so that when the signal graph is expanded on the temporal dimension (turning it into a value graph) the cycles disappear.

How to start working with a large decision table

Today I've been presented with a fun challenge and I want your input on how you would deal with this situation.
So the problem is the following (I've converted it to demo data as the real problem wouldn't make much sense without knowing the company dictionary by heart).
We have a decision table that has a minimum of 16 conditions. Because it is an impossible feat to manage all of them (2^16 possibilities) we've decided to only list the exceptions. Like this:
As an example I've only added 10 conditions but in reality there are (for now) 16. The basic idea is that we have one baseline (the default) which is valid for everyone and all the exceptions to this default.
Example:
You have a foreigner who is also a pirate.
If you go through all the exceptions one by one, and condition by condition you remove the exceptions that have at least one condition that fails. In the end you'll end up with the following two exceptions that are valid for our case. The match is on the IsPirate and the IsForeigner condition. But as you can see there are 2 results here, well 3 actually if you count the default.
Our solution
Now what we came up with on how to solve this is that in the GUI where you are adding these exceptions, there should run an algorithm which checks for such cases and force you to define the exception more specifically. This is only still a theory and hasn't been tested out but we think it could work this way.
My Question
I'm looking for alternative solutions that make the rules manageable and prevent the problem I've shown in the example.
Your problem seem to be resolution of conflicting rules. When multiple rules match your input, (your foreigner and pirate) and they end up recommending different things (your cangetjob and cangetevicted), you need a strategy for resolution of this conflict.
What you mentioned is one way of resolution -- which is to remove the conflict in the first place. However, this may not always be possible, and not always desirable because when a user adds a new rule that conflicts with a set of old rules (which he/she did not write), the user may not know how to revise it to remove the conflict.
Another possible resolution method is prioritization. Mark a priority on each rule (based on things like the user's own authority etc.), sort the matching rules according to priority, and apply in ascending sequence of priority. This usually works and is much simpler to manage (e.g. everybody knows that the top boss's rules are final!)
Prioritization may also be used to mark a certain rule as "global override". In your example, you may want to make "IsPirate" as an override rule -- which means that it overrides settings for normal people. In other words, once you're a pirate, you're treated differently. This make it very easy to design a system in which you have a bunch of normal business rules governing 90% of the cases, then a set of "exceptions" that are treated differently, automatically overriding certain things. In this case, you should also consider making "?" available in the output columns as well.
One other possible resolution method is to include attributes in each of your conditions. For example, certain conditions must have no "zeros" in order to pass (? doesn't matter). Some conditions must have at least one "one" in order to pass. In other words, mark each condition as either "AND", "OR", or "XOR". Some popular file-system security uses this model. For example, CanGetJob may be AND (you want to be stringent on rights-to-work). CanBeEvicted may be OR -- you may want to evict even a foreigner if he is also a pirate.
An enhancement on the AND/OR method is to provide a threshold that the total result must exceed before passing that condition. For example, putting CanGetJob at a threshold of 2 then it must get at least two 1's in order to return 1. This is sometimes useful on conditions that are not clearly black-and-white.
You can mix resolution methods: e.g. first prioritize, then use AND/OR to resolve rules with similar priorities.
The possibilities are limitless and really depends on what your actual needs are.
To me this problem reminds business rules engine where there is no known algorithm to define outputs from inputs (e.g. using boolean logic) but the user (typically some sort of administrator) has to define all or some the logic itself.
This might sound a bit of an overkill but OTOH this provides virtually limit-less extension capabilities: you don't have to code any new business logic, just define a new rule set.
As I understand your problem, you are looking for a nice way to visualise the editing for these rules. But this all depends on your programming language and the tool you select for this. Java, for example, has JBoss Drools. Quoting their page:
Drools Guvnor provides a (logically
centralized) repository to store you
business knowledge, and a web-based
environment that allows business users
to view and (within certain
constraints) possibly update the
business logic directly.
You could possibly use this generic tool or write your own.
Everything depends on what your actual rules will look like. Rules like 'IF has an even number of these properties THEN' would be painful to represent in this format, whereas rules like 'IF pirate and not geek THEN' are easy.
You can 'avoid the ambiguity' by stating that you'll always be taking the first actual match, in other words your rules have a priority. You'd then want to flag rules which have no effect because they are 'shadowed' by rules higher up. They're not hard to find, so it's something your program should do.
Your interface could also indicate groups of rules where rules within the group can be in any order without changing the outcomes. This will add clarity to what the rules are really saying.
If some of your outputs are relatively independent of the others, you will also get a more compact and much clearer table by allowing question marks in the output. In that design the scan for first matching rule is done once for each output. Consider for example if 'HasChildren' is the only factor relevant to 'Can Be Evicted'. With question marks in the outputs (= no effect) you could be halving the number of exception rules.
My background for this is circuit logic design, not business logic. What you're designing is similar to, but not the same as, a PLA. As long as your actual rules are close to sum of products then it can work well. If your rules aren't, for example the 'even number of these properties' rule, then the grid like presentation will break down in a combinatorial explosion of cases. Your best hope if your rules are arbitrary is to get a clearer more compact presentation with either equations or with diagrams like a circuit diagram. To be avoided, if you can.
If you are looking for a Decision Engine with a GUI, than you can try this one: http://gandalf.nebo15.com/
We just released it, it's open source and production ready.
You probably need some kind of inference engine. Think about doing it in prolog.