Existence of 0- and 1-valent configurations in the proof of the FLP impossibility result - distributed-computing

In the well-known paper Impossibility of Distributed Consensus with One Faulty Process (JACM 1985), FLP (Fischer, Lynch and Paterson) proved the surprising result that no completely asynchronous consensus protocol can tolerate even a single unannounced process death.
In Lemma 3, after showing that D contains both 0-valent and 1-valent configurations, it says:
Call two configurations neighbors if one results from the other in a single step. By an easy induction, there exist neighbors C₀, C₁ ∈ C such that Dᵢ = e(Cᵢ) is i-valent, i = 0, 1.
I can follow the whole proof except when they claim the existence of such C₀ and C₁. Could you please give me some hints?

D (the set of possible configurations after applying e to elements of C) contains both 0-valent and 1-valent configurations (and is assumed to contain no bivalent configurations).
That is, e maps every element of C to either a 0-valent or a 1-valent configuration. By the definition of C, there is a root element connected to every other element by a chain of "neighbour" relationships, so somewhere along such a chain there must be a boundary point, where an element of C that leads to a 0-valent configuration after e is a neighbour of an element of C that leads to a 1-valent configuration after e.

I once went down the path of reading all these papers only to discover it's a complete waste of time.
The result is not surprising at all.
The paper you mention, "Impossibility of Distributed Consensus with One Faulty Process", is a long list of complex mathematical proofs that simply equate to:
1) Consensus is a deterministic state.
2) One (or more) faulty systems within an environment make it a non-deterministic environment.
3) No deterministic state, action or outcome can ever be reached within a non-deterministic environment.
The end. No further thought is required.
This is how it works in the real world outside of academia.
If you wish for agents to reach consensus, then synchronous (timing-model) approximation constructs have to be added to make the environment deterministic within a given set of constraints: for example, simple constructs like timeouts, Ack/Nack, handshakes, or witnesses, or far more complex constructs.
The closer you wish to get to a synchronous deterministic model, the more complex the constructs become; a hypothetical fully synchronous model would have infinitely complex constructs. Bear in mind also that a fully deterministic synchronous model can never be achieved in a non-trivial distributed system, because in any non-trivial dynamic multivariate system with a variable initial state there exists an infinite number of possible states, actions and outcomes at any point in time (Chaos Theory).
Consider the complexity of a construct for detecting dropped TCP packets caused by buffer-overflow errors in a router at hop number 21, and the complexity of detecting that same buffer-overflow error dropping the detection signal from the construct itself.

Define a mapping f such that f(C) = 0 if e(C) is 0-valent, and f(C) = 1 if e(C) is 1-valent.
Because e(C) cannot be bivalent (we assume D contains no bivalent configuration), f(C) can only be 0 or 1.
Arrange the configurations accessible from the initial bivalent configuration in a tree; there must be two neighbors C0, C1 in the tree such that f(C0) != f(C1). If not, all f(C) would be the same, which would mean D contains only 0-valent configurations or only 1-valent configurations, contradicting the fact that D contains both.
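A small sketch of this induction argument (illustrative Python, not FLP-specific machinery): walk along any path in the tree from a configuration labeled 0 by f to one labeled 1; the label has to flip across some single step, and that step yields the neighbors C0 and C1.

    def find_boundary_edge(path, f):
        """path: configurations in which consecutive entries are neighbors (one results
        from the other in a single step); f maps each configuration to 0 or 1.
        If f(path[0]) != f(path[-1]), the label must flip across some edge."""
        for c_prev, c_next in zip(path, path[1:]):
            if f(c_prev) != f(c_next):
                return c_prev, c_next      # the neighbors C0, C1 with differently valent e-images
        return None                        # f is constant along this path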


Why are there many instructions with a zero destination that do not affect the hardware in the RISC-V ISA?

The first register in the RISC-V ISA is hardwired to zero. It is used in many cases, such as moving zero into another register, or jumping without storing the return address, etc.
However, there are many possible instructions that don't change any state when the destination register is zero, and we don't need those instructions because they serve no purpose. I feel it is wasting bits that could encode other functional instructions. What am I missing here? Why is it so?
As far as I know, the bits in an ISA are expensive, so ISA developers try to keep the encoding as simple and compact as possible while covering many different functionalities. However, this makes me feel the reverse, because of the many instructions whose destination is the first register, which is hardwired to zero.
I don't know whether these encodings are reserved for future use, or whether the first register (x0) can be used without being hardwired to zero.
One question the original designers were concerned with answering is: what will cost less hardware for a small embedded system? Having useless instructions like add x0, x0, x0 or even add x0, a0, a1? Or doing something useful with those otherwise useless encodings? And the answer to the question of what will take less hardware is the former.
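To see why keeping those "useless" encodings is essentially free, here is a toy Python sketch of a register file with a hardwired zero register (an illustration of the idea, not an actual RISC-V implementation):

    class RegFile:
        """Toy 32-entry register file; x0 reads as zero and ignores writes."""
        def __init__(self):
            self.regs = [0] * 32                      # x0..x31

        def read(self, idx):
            return 0 if idx == 0 else self.regs[idx]  # x0 always reads as zero

        def write(self, idx, value):
            if idx != 0:                              # writes to x0 are silently dropped
                self.regs[idx] = value

    # An instruction like "add x0, a0, a1" flows through the normal datapath;
    # the only special case is that its write-back is discarded.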
Another is: what will allow the most common (and also useful) instructions to execute as quickly as possible? Decoding an add x0, x0, x0, or an add x0, a0, a1, so that it does something different (from a no-op) can slow down the useful add instructions as follows: while some of that decoding can happen in parallel (with added hardware), ultimately the two paths, decoding add x0, a0, a1 and decoding a normal add a0, a0, a1, have to merge, and that generally happens with muxes. The more muxes that are introduced, the longer the cycle has to be, so doing that has the effect of slowing down the whole processor.
The designers of RISC-V went to lengths to remove one mux from the decode phase as compared to MIPS, by keeping the target register field in a fixed position for both R- and I-type instructions.
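A small sketch of that last point, assuming the standard RV32 base encoding: the destination register field occupies the same bits [11:7] in both R-type and I-type instructions, so no mux is needed to select where rd comes from.

    def rd_field(instr):
        return (instr >> 7) & 0x1F       # rd sits in bits 11:7 of R-, I-, U- and J-type formats

    add_a0_a0_a1 = 0x00B50533            # R-type: add  a0, a0, a1
    addi_a0_a0_1 = 0x00150513            # I-type: addi a0, a0, 1
    print(rd_field(add_a0_a0_a1), rd_field(addi_a0_a0_1))   # both print 10 (a0 is x10)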

How to make the dynamic model in Dymola agree with the steady-state design result?

Modelica modeling is first-principles modeling, so how to test the model and set an effective benchmark is important. For example, I could design a fluid network as I wish, but when building a dynamic simulation model I need to know the detailed geometry and parameters to set up every piece of the model. Usually I build a steady-state model with simple energy and mass conservation laws, then design every piece of equipment based on the corresponding design manual; but when I put all the dynamic components together and simulate until steady state, the result differs more or less from the steady-state model. So I was wondering whether I should modify my workflow to make the dynamic model agree with the steady-state model. Any suggestions are welcome.
#dymola #modelica
To my understanding of the question, your parameter values are fixed and physically known. I would attempt the following approach as a heuristic to identify the (few) component(s) that one needs to carefully investigate in order to understand how they influence or violate the assumed first principles.
This is just as a first trial and it could be subject to further improvement and fine-tuning.
Consider the set of significant variables x_d(p,t) ∈ R^n and the parameters p ∈ R^m. Note that p also includes significant start values, and that it contains only the additional parameters not available in the steady-state model.
Denote the corresponding variables of the steady-state model by x_s.
Denote by t* a time point at which the dynamic model is "numerically" in a "semi-" steady state.
Consider the function C(x_d(p,t*), x_s) = ||D||^2 with D = x_d(p,t*) - x_s.
It could be beneficial to describe C as a vector rather than a single-valued function.
Compute the partial derivatives of C w.r.t. p, expressed in terms of dx_d/dp, i.e.
dC/dp = d[D^T D]/dp
= d[(x_d - x_s)^T (x_d - x_s)]/dp
= 2 (dx_d/dp)^T D
(since x_s does not depend on p).
Consider scaling the above quantity, i.e. (dC/dp) * p/C (avoiding the expected numerical issues via some epsilon tricks).
This gives you a ranking of the most significant parameters that are causing the apparent differences. The hopefully few components containing these parameters could be the ones causing such a violation.
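As a rough illustration of this ranking step, here is a finite-difference sketch in Python; the simulate callback and all names are hypothetical placeholders, not a Dymola API, and the derivatives are approximated numerically rather than via dx_d/dp.

    import numpy as np

    def rank_parameters(simulate, p, x_s, rel_step=1e-3, eps=1e-12):
        """simulate(p) -> x_d(p, t*): run the dynamic model to (near) steady state and
        return the vector of significant variables; p: nominal parameter values;
        x_s: the corresponding steady-state design values."""
        p = np.asarray(p, dtype=float)
        x_s = np.asarray(x_s, dtype=float)
        C0 = np.sum((simulate(p) - x_s) ** 2)        # C(p) = ||x_d(p, t*) - x_s||^2
        scaled = np.zeros(p.size)
        for i in range(p.size):
            dp = np.zeros_like(p)
            dp[i] = rel_step * (abs(p[i]) + eps)     # perturb one parameter at a time
            Ci = np.sum((simulate(p + dp) - x_s) ** 2)
            dCdp_i = (Ci - C0) / dp[i]               # finite-difference approximation of dC/dp_i
            scaled[i] = dCdp_i * p[i] / (C0 + eps)   # scaling dC/dp * p / C, with an epsilon guard
        order = np.argsort(-np.abs(scaled))          # most influential parameters first
        return order, scaled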
If this still does not help, perhaps due to high correlation among parameters, I would go further and set up a dummy parameter-identification problem, from which a more rigorous ranking of significant model parameters can be obtained.
If the Modelica language had capabilities for expressing dynamic parameter sensitivities, all of the above computation could easily be carried out as a single Modelica model (with a slightly modified formulation).
For instance, if we had something like der(x,p) corresponding to dx/dp, one could simply state
dcdp = der(C,p)
An alternative approach is proposed via the DerXP library

How can a connection between one gate input and multiple outputs of other gates cause circuit memory?

I'm reading the Digital Design and Computer Architecture by David Harris, Sarah Harris. The authors give the following definition of combinational logic:
A combinational circuit’s outputs depend only on the current values of
the inputs; in other words, it combines the current input values to
compute the output... A combinational circuit is memoryless, but a
sequential circuit has memory. The functional specification of a
combinational circuit expresses the output values in terms of the
current input values.
However, they claim that one of their example circuits, circuit (d), is not combinational because "node n6 connects to the output terminals of both I3 and I4". Indeed, that is one of the telltale signs that a circuit may not be combinational, but, according to the authors:
Certain circuits that disobey these rules are still combinational, so
long as the outputs depend only on the current values of the inputs.
As far as I can tell, the aforementioned circuit is such a case: its output is 1 if and only if both of its inputs are 1; otherwise the output is 0. So the output is defined as a function of the inputs (the AND function).
In fact, there was already a question about this circuit in the computer science network and it has an accepted answer. Here's an excerpt from it:
Circuit (d) cannot be written in this form [of formula], since the
outputs of I3 and I4 are wired together. What is the relation between
the input to the rightmost gate and the outputs of I3 and I4? Not
something that can be described combinatorially.
Unfortunately, I'm still confused, because:
The circuit, regarded as a black box, is still within the scope of the combinational-logic definition: its output values depend only on the current values of the inputs;
The relation between the input to the rightmost gate and the outputs of I3 and I4 can be described through the NAND function of the circuit inputs, and this function is quite "memoryless". It's not obvious to me why we cannot drive a gate input from multiple outputs of other gates.
I need some elaboration. Maybe things would fall into place if someone provided a circuit example where two gate outputs are connected to one input and it actually causes "memory" (in contrast to the considered sample).
Circuit (d) is not combinational because it is not a logic gate circuit at all.
I think it's a very silly example to explain combinational vs sequential circuits.
In a logic circuit, an output wire cannot go to another output wire. You assumed that the outputs, when connected together, will act as a logical OR or AND of themselves.
This is not true (otherwise why would we use AND/OR gates in the first place?).
What will happen depends on the specific implementation of the gates (i.e. specific IC or manufacturing process you used) and this is not something that a logic circuit is meant to model.
A logic circuit must behave the same, no matter what brand you are using.
In circuit (d), the output of I3 will feed both the input of the rightmost NOT and the output of I4 (and the converse is also true).
Most ICs will break if current flows into their outputs; others won't, but they will interfere with the ability of the rightmost NOT to sense its input.
Logic circuits are still circuits, so you should, in theory, perform a full circuit analysis, which includes solving differential equations, to solve for their output.
Digital electronics is a branch that abstracts from these "low-level" details but at the cost of making some assumptions, one of which is: outputs are never merged without a gate.
The whole point of a combinational circuit is that you can write out = f(in0, in1, ..., ink) but it's not always possible.
Take for example an edge detector: it is just f(A) = (NOT A) AND A, which should, by Boolean algebra, always output 0.
But it will not, because the NOT A path takes a slightly longer time to reach the AND input.
How can you describe this dynamic behaviour with an f(A) function?
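Here is a toy discrete-time sketch in Python (my own illustration, not from the book) of that edge detector, where the NOT path lags by one sample; no pure function of the current input alone can reproduce the output pulse.

    def edge_detector(samples, not_delay=1):
        """samples: the input A over time (0/1); the inverted path lags by `not_delay` steps.
        Assumes the input was steady at samples[0] before time 0."""
        out = []
        for t, a in enumerate(samples):
            a_delayed = samples[t - not_delay] if t >= not_delay else samples[0]
            out.append(a & (1 - a_delayed))   # AND of A with the delayed NOT A
        return out

    print(edge_detector([0, 0, 1, 1, 1, 0, 0]))   # [0, 0, 1, 0, 0, 0, 0]: a pulse at the rising edge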
Don't think too much of it, when you'll get to sequential circuits you'll spot the difference immediately (if you need a preview, look up for "latch circuit").

Finding Conditional Moments in a Markov Process

This question combines math and programming. I will first describe the general problem and then give an example that is (hopefully) simpler to understand.
General Question: Consider a Markov-chain process with N states and transition matrix Π. Each state is associated with a value x_n (n in {1,…,N}). Our goal is to find the unconditional average of the first two moments (mean and variance) along T-period paths, conditional on (i) the path starting in a subset of states N_0, (ii) ending in a subset of states N_T, and (iii) not passing through a subset of states N_not in any of the periods between 1 and T-1. By saying we are interested in the unconditional average of these two moments, I basically mean: what would be the average of these two moments under the stationary distribution? To be more concrete, let me illustrate the goal of the exercise in a simple case.
Simple Example: Consider a 3-state Markov-chain process with transition matrix Π, and let the three states be denoted by A, B, and C. Each of these states is associated with some value (x_A, x_B, and x_C, respectively). We are interested in what happens along paths that satisfy the following condition: the path starts at point A, after 3 periods is at either point B or C, and between periods 1 and 3 never goes through point A again. Denote this condition by (#). So, for example, a path we are interested in would be {A,B,B,C} with the associated values {x_A, x_B, x_B, x_C}. We are interested in the average and standard deviation along such paths. In particular, we would like to find the unconditional average of these first two moments over paths that satisfy (#).
Let me now propose a solution based on simulating the process. Since both T and N are quite large, this solution is too slow for my purpose.
Simulation Solution: Starting from some initial point, simulate the process for a very long time, and drop the first τ periods. Extract all paths along the simulation that satisfy condition (#) and compute the mean and std along each of these paths. Finally, simply take the average across these paths.
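For concreteness, a brute-force sketch of this simulation approach (Python/NumPy, all names are mine) might look like the following; as noted below, it is too slow for large T and N.

    import numpy as np

    def simulated_moments(P, x, T, start_set, end_set, not_set,
                          n_steps=1_000_000, burn_in=10_000, seed=0):
        """Average per-path mean and std over T-period windows satisfying condition (#).
        P: transition matrix; x: array of per-state values."""
        rng = np.random.default_rng(seed)
        n = P.shape[0]
        states = np.empty(n_steps, dtype=int)
        states[0] = 0
        for t in range(1, n_steps):                    # simulate the chain
            states[t] = rng.choice(n, p=P[states[t - 1]])
        states = states[burn_in:]                      # drop the first tau periods
        means, stds = [], []
        for t in range(len(states) - T):
            path = states[t:t + T + 1]                 # states at periods 0..T
            if (path[0] in start_set and path[-1] in end_set
                    and not any(s in not_set for s in path[1:-1])):
                vals = x[path]
                means.append(vals.mean())
                stds.append(vals.std())
        return np.mean(means), np.mean(stds)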
I'm hoping there is a better and more efficient way to achieve the goal. Since I want the solution to be accurate, and because of the size of T and N, the simulation takes a long time.
I would love to hear your thoughts and if you know of efficient methods to achieve this goal. Please let me know if something is not clear and I'll try to clarify it.
Thank you!!!
I think I know how to do this if N_0 consists of one state; let's call that state A.
The long run probability of being in A is pi(A) and can be obtained by solving pi = pi*P, with P the transition matrix.
The other thing you need to calculate is the probability of those transient paths. You probably need to introduce a modified P, where all states i in the set N_not are absorbing (i.e. P[i,i] = 1 and P[i,j] = 0 for j ≠ i). Then, starting from a vector p(0) which has a 1 in the element corresponding to state A and 0 otherwise, you can keep calculating p(n) = p(n-1)*P to get the probabilities of your transient paths.
Multiply the result of that by pi(A) to get the unconditional probability.
You can probably do something like this as well when N_0 is a set, but I don't know how you should select p(0) in that case.
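For concreteness, here is a sketch of this recursion in Python/NumPy for the single-start-state case (all names are mine; instead of making the forbidden states absorbing, it zeroes out their probability mass at each intermediate period, which gives the same path probabilities when N_T and N_not are disjoint):

    import numpy as np

    def stationary(P):
        """Long-run distribution: solve pi = pi*P together with sum(pi) = 1."""
        n = P.shape[0]
        A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
        b = np.zeros(n + 1); b[-1] = 1.0
        pi, *_ = np.linalg.lstsq(A, b, rcond=None)
        return pi

    def constrained_path_mass(P, start, not_set, T):
        """p[j] = P(X_T = j and X_1,...,X_{T-1} avoid not_set | X_0 = start)."""
        p = np.zeros(P.shape[0]); p[start] = 1.0
        for t in range(1, T + 1):
            p = p @ P
            if t < T:
                p[list(not_set)] = 0.0    # kill paths that hit a forbidden state at period t
        return p

    # Example: 3 states A, B, C; paths starting at A, ending in {B, C}, avoiding A in between.
    P = np.array([[0.2, 0.5, 0.3],
                  [0.1, 0.6, 0.3],
                  [0.4, 0.1, 0.5]])
    pi = stationary(P)
    mass = constrained_path_mass(P, start=0, not_set={0}, T=3)
    print(pi[0] * mass[[1, 2]].sum())      # multiply by pi(A), as suggested above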

Skipping steps in Normalization?

Just curious: is there some reason why one cannot do all necessary normalizations in a single step? Isn't normalization ultimately the redrawing of the functional dependency (FD) graph? We start out with an FD diagram/graph, and we want to end up with a graph (vertices are attributes; there is an edge between attributes a, b if b is functionally dependent on a) representing a relation in (Edit) BCNF?
EDIT: What I mean is: we start with an FD graph, which is a graph pairing attributes a, b iff b is functionally dependent on a, i.e., we join a and b with an edge iff b = f(a).
From this graph we want to obtain a graph (FD)_2 with certain traits that are equivalent to having been fully normalized, i.e., (FD)_2 is in 5NF or 6NF, using the graph-theoretical relation between a graph and a given normal form. If so, we are basically mapping one graph to another graph. Can we use this approach of drawing (FD)_2 directly, as a function of FD, to skip the normalization steps?
Yes: Normalization can be characterized by rearranging (hyper)graphs. It does not have to be done by moving through normal forms in some order. (It's just a common misconception that it is.)
The normal forms on the continuum from 1NF to 6NF are those dealing with problematic FDs (functional dependencies) and JDs (join dependencies). They can be ordered so that if a relation value or variable satisfies a form then it satisfies the forms before but not necessarily after. Currently: 1NF, 2NF, 3NF, EKNF, BCNF, 4NF, ETNF, RFNF, SKNF, 5NF aka PJ/NF, Overstrong PJ/NF, 6NF. This ordering has nothing to do per se with decomposing to relation values or variables that are in higher normal forms. It is not necessary to decompose through a sequence of forms.
The normal forms are just different conditions that have been found to have helpful properties. Moreover, the normal forms are just those that have been discovered; there may well be other helpful properties yet to be distinguished. We don't pass through them to normalize. ETNF dates from 2012!
As to your graph characterization:
An FD has a set of attributes as determinant, which determines another set. But since the one set determines the other if and only if it determines each of the sets containing exactly one member of the other, informally but unambiguously we also talk about a set of attributes determining an attribute. An FD {...} -> a holds iff a = f(...). (There can be zero or more determinant attributes.) BCNF is the highest normal form re problematic FDs, but there are higher normal forms re problematic JDs. A JD with given components holds in a relation iff the relation is always their join; i.e., its meaning/predicate can be expressed as the AND of the components' meanings/predicates. So an FD {...} -> a holds iff a JD holds corresponding to a meaning/predicate with conjunct a = f(...)! An MVD (multi-valued dependency) corresponds to a certain binary JD. 5NF means that every JD that holds is "implied by the keys" (a technical term).
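As a small illustration of "a JD holds iff the relation is always the join of its components" (my own sketch, with relations represented as sets of dicts):

    from itertools import product

    def project(rel, attrs):
        """Projection onto `attrs`; rows are frozensets of (attribute, value) pairs."""
        return {frozenset((a, dict(row)[a]) for a in attrs) for row in rel}

    def natural_join(r1, r2):
        out = set()
        for t1, t2 in product(r1, r2):
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):  # agree on common attributes
                out.add(frozenset({**d1, **d2}.items()))
        return out

    def jd_holds(rel, components):
        """The JD *(components) holds in rel iff rel equals the join of its projections."""
        rows = {frozenset(r.items()) for r in rel}
        joined = project(rows, components[0])
        for attrs in components[1:]:
            joined = natural_join(joined, project(rows, attrs))
        return joined == rows

    R = [{"A": 1, "B": 1, "C": 1}, {"A": 1, "B": 2, "C": 1}]
    print(jd_holds(R, [("A", "B"), ("A", "C")]))   # True: this binary JD (an MVD) holds in R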
There are algorithms that, starting with FDs, decompose directly to 2NF, directly to 3NF, and directly to BCNF (with various other properties, like preservation of FDs). See the Alice book. One can decompose to 6NF simply by decomposing until there are no nontrivial JDs, without regard to FDs.
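As a rough, simplified sketch of such a direct decomposition to BCNF (illustrative Python, not the exact algorithm from the Alice book; it only checks the determinants of the given FDs): repeatedly split a schema on any FD whose determinant is not a superkey of that fragment, until no violation remains.

    def closure(attrs, fds):
        """Attribute closure of `attrs` under `fds` (a list of (lhs, rhs) frozensets)."""
        result = set(attrs)
        changed = True
        while changed:
            changed = False
            for lhs, rhs in fds:
                if lhs <= result and not rhs <= result:
                    result |= rhs
                    changed = True
        return result

    def bcnf_decompose(rel, fds):
        """Split the schema `rel` (a set of attributes) until no given FD violates BCNF."""
        rel = set(rel)
        for lhs, _ in fds:
            if not lhs <= rel:
                continue                       # this FD's determinant is not inside the fragment
            cl = closure(lhs, fds) & rel
            if cl > lhs and cl < rel:          # lhs determines more than itself but is not a superkey
                return (bcnf_decompose(cl, fds)
                        + bcnf_decompose(lhs | (rel - cl), fds))
        return [rel]

    fds = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
    print(bcnf_decompose(set("ABC"), fds))     # e.g. [{'B', 'C'}, {'A', 'B'}]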
(See C. J. Date's Database Design and Relational Theory: Normal Forms and All That Jazz.)