Isolating run sequence definition seems inconsistent - unicode

I've been reading Unicode Standard Annex #9, which describes the Unicode Bidirectional Algorithm, and I've hit a snag in section BD13, the definition of an isolating run sequence.
An isolating run sequence is a maximal sequence of level runs such that for all level runs except the last one in the sequence, the last character of the run is an isolate initiator whose matching PDI is the first character of the next level run in the sequence.
How can this be the case? Section BD9, which defines the matching PDI of an isolate initiator, says that
an isolate initiator and its matching PDI are always assigned the same explicit embedding level
and section BD7 defines a level run as
a maximal substring of characters that have the same embedding level
This would seem to indicate that an isolate initiator and its matching PDI cannot belong to consecutive level runs. Either they should belong to the same level run, or there should be at least one level run between them.
What's the resolution of this apparent inconsistency? Is there a distinction between an "explicit embedding level" and an "embedding level"? Is one of the quotes I'm relying on non-normative and not completely accurate? Are isolating run sequences just always one level run long?
In case the wording changes, this question is based on revision 33, the current revision of the annex.

While writing up this question, I figured out the examples under section BD13 and how they resolve the issue. An isolating run sequence is not contiguous. Its elements are not required to be consecutive level runs; indeed, they must be separated by level runs belonging to other isolating run sequences. This should be made more explicit.

Related

Why the immediate offset in the riscv's JAL instruction has bit order changed?

The bit field is shown below
I don't see the point the of doing this re-ordering of bit-field.
Is there a special kind of manipulation when RISC-V processor is executing this instruction?
The purpose of the shuffling is to reduce the number of muxs involved in constructing the full sized operand from the immediates across the different instruction types.
For example, the sign-extend bit (which drives a lot wires) is always the same (inst[31]). You can also see that imm[10] is almost always in the same place too, across I-type, S-type, B-type, and J-type instructions.

Multi-instance and Loop in BPMN

I am trying to model a certain behaviour, where couple of activities in differents swimlanes supposed to be processed in a loop. Now BPMN uses tokens to ilustrate the flow and paths taken. I wonder how such tokens work in case of loops. Does every activity iteration creates a token which consequently travel through the connected activities?
E.g. Let's say Activity1 will be performed in a loop 10 times. Will that create 10 tokens where each will travel through the remaining activities of the process? Such behaviour would be undesirable, however if I am not mistaken multi-instance activities work that way.
The only solution on my mind which would comply with BPMN specification would be to create a Call activity for the whole block of activities and then run the Call activity in a loop.
Can anyone clarify for me the use of loops and multi-instances in BPMN from the view of tokens?
Thank you in advance!
Based upon my reading of the documentation: https://www.omg.org/spec/BPMN/2.0/PDF The answer from #qwerty_so does not seem to conform to the standard, although in part this seems to be because the question also seems imprecise or at least underspecified.
A token (see glossary) is simply an imaginary object that represents the flow unit in the process diagram. There are at least three different types of loops specified in the standard, which suggest different implications for the flow unit.
Sections 13.2.6 and 12.2.7 describe Loop Activity and Multiple Instance Activities respectively. While the latter, on its face, might not seem like a loop, the standard defines attributes of the activity that suggest otherwise including: MultipleInstanceLoopCharacteristics and ExpressionloopCardinality.
In the former case, it seems that the operational semantics suggest a single flow unit that repeats multiple times according to some policy or even unbounded.
In the latter case, the activity has "multiple instances spawned," including a parallel variant.
That multiple instances can flow forward in parallel, on its face, suggests that the system must at least allow for the possibility of spawning multiple tokens (or conceptually splitting the original token) to support multiple threads proceeding simultaneously along different paths.
That said, the Loop Activity (13.2.6) appears to support the OP's desired semantics.

How to handle the two signals depending on each other?

I read Deprecating the Observer Pattern with Scala.React and found reactive programming very interesting.
But there is a point I can't figure out: the author described the signals as the nodes in a DAG(Directed acyclic graph). Then what if you have two signals(or event sources, or models, w/e) depending on each other? i.e. the 'two-way binding', like a model and a view in web front-end programming.
Sometimes it's just inevitable because the user can change view, and the back-end(asynchronous request, for example) can change model, and you hope the other side to reflect the change immediately.
The loop dependencies in a reactive programming language can be handled with a variety of semantics. The one that appears to have been chosen in scala.React is that of synchronous reactive languages and specifically that of Esterel. You can have a good explanation of this semantics and its alternatives in the paper "The synchronous languages 12 years later" by Benveniste, A. ; Caspi, P. ; Edwards, S.A. ; Halbwachs, N. ; Le Guernic, P. ; de Simone, R. and available at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1173191&tag=1 or http://virtualhost.cs.columbia.edu/~sedwards/papers/benveniste2003synchronous.pdf.
Replying #Matt Carkci here, because a comment wouldn't suffice
In the paper section 7.1 Change Propagation you have
Our change propagation implementation uses a push-based approach based on a topologically ordered dependency graph. When a propagation turn starts, the propagator puts all nodes that have been invalidated since the last turn into a priority queue which is sorted according to the topological order, briefly level, of the nodes. The propagator dequeues the node on the lowest level and validates it, potentially changing its state and putting its dependent nodes, which are on greater levels, on the queue. The propagator repeats this step until the queue is empty, always keeping track of the current level, which becomes important for level mismatches below. For correctly ordered graphs, this process monotonically proceeds to greater levels, thus ensuring data consistency, i.e., the absence of glitches.
and later at section 7.6 Level Mismatch
We therefore need to prepare for an opaque node n to access another node that is on a higher topological level. Every node that is read from during n’s evaluation, first checks whether the current propagation level which is maintained by the propagator is greater than the node’s level. If it is, it proceed as usual, otherwise it throws a level mismatch exception containing a reference to itself, which is caught only in the main propagation loop. The propagator then hoists n by first changing its level to a level above the node which threw the exception, reinserting n into the propagation queue (since it’s level has changed) for later evaluation in the same turn and then transitively hoisting all of n’s dependents.
While there's no mention about any topological constraint (cyclic vs acyclic), something is not clear. (at least to me)
First arises the question of how is the topological order defined.
And then the implementation suggests that mutually dependent nodes would loop forever in the evaluation through the exception mechanism explained above.
What do you think?
After scanning the paper, I can't find where they mention that it must be acyclic. There's nothing stopping you from creating cyclic graphs in dataflow/reactive programming. Acyclic graphs only allow you to create Pipeline Dataflow (e.g. Unix command line pipes).
Feedback and cycles are a very powerful mechanism in dataflow. Without them you are restricted to the types of programs you can create. Take a look at Flow-Based Programming - Loop-Type Networks.
Edit after second post by pagoda_5b
One statement in the paper made me take notice...
For correctly ordered graphs, this process
monotonically proceeds to greater levels, thus ensuring data
consistency, i.e., the absence of glitches.
To me that says that loops are not allowed within the Scala.React framework. A cycle between two nodes would seem to cause the system to continually try to raise the level of both nodes forever.
But that doesn't mean that you have to encode the loops within their framework. It could be possible to have have one path from the item you want to observe and then another, separate, path back to the GUI.
To me, it always seems that too much emphasis is placed on a programming system completing and giving one answer. Loops make it difficult to determine when to terminate. Libraries that use the term "reactive" tend to subscribe to this thought process. But that is just a result of the Von Neumann architecture of computers... a focus of solving an equation and returning the answer. Libraries that shy away from loops seem to be worried about program termination.
Dataflow doesn't require a program to have one right answer or ever terminate. The answer is the answer at this moment of time due to the inputs at this moment. Feedback and loops are expected if not required. A dataflow system is basically just a big loop that constantly passes data between nodes. To terminate it, you just stop it.
Dataflow doesn't have to be so complicated. It is just a very different way to think about programming. I suggest you look at J. Paul Morison's book "Flow Based Programming" for a field tested version of dataflow or my book (once it's done).
Check your MVC knowledge. The view doesn't update the model, so it won't send signals to it. The controller updates the model. For a C/F converter, you would have two controllers (one for the F control, on for the C control). Both controllers would send signals to a single model (which stores the only real temperature, Kelvin, in a lossless format). The model sends signals to two separate views (one for C view, one for F view). No cycles.
Based on the answer from #pagoda_5b, I'd say that you are likely allowed to have cycles (7.6 should handle it, at the cost of performance) but you must guarantee that there is no infinite regress. For example, you could have the controllers also receive signals from the model, as long as you guaranteed that receipt of said signal never caused a signal to be sent back to the model.
I think the above is a good description, but it uses the word "signal" in a non-FRP style. "Signals" in the above are really messages. If the description in 7.1 is correct and complete, loops in the signal graph would always cause infinite regress as processing the dependents of a node would cause the node to be processed and vice-versa, ad inf.
As #Matt Carkci said, there are FRP frameworks that allow loops, at least to a limited extent. They will either not be push-based, use non-strictness in interesting ways, enforce monotonicity, or introduce "artificial" delays so that when the signal graph is expanded on the temporal dimension (turning it into a value graph) the cycles disappear.

Lex reserved word rules versus lookup table

This web page suggests that if your lex program "has a large number of reserved words, it is more efficient to let lex simply match a string and determine in your own code whether it is a variable or reserved word."
My question is: More efficient where, and why? If it means compiling the lexer is faster, I don't really care about that because it is one step removed from the program which uses the lexer to parse input.
It seems to be that lex just uses your description to build a state machine that processes one character at a time. It does not seem logical that increasing the size of the state machine would necessarily cause it to become any slower than using one rule for identifiers and then doing several string comparisons.
Additionally, if it turns out that there is some logical reason for this to make sense as an optimization, what would be considered a large number of reserved words? I have approximately 20, as compared to about 30 other rules for various things. Would that be considered a large number of reserved words? Should I attempt to use the same strategy for some of the other symbols?
I have attempted to google for a result, but the only relevant articles I found stated this strategy as though it were well-known without giving any reason.
In case it is relevant, I am using flex 2.5.35.
Edit: Here is another reference which claims that lex produces an inefficient scanner when asked to match several long literal strings. It also does not give a reason.
According to the flex manual, "[t]he speed of the scanner is independent of the number of rules or ... how complicated the rules are with regard to operators such as '*' and '|'."
The main performance losses are due to backtracking. This can be avoided by (among other things) using catch-all rules which will match tokens which "start with" the offending token. For example, if you have a list of reserved words consisting of [a-zA-Z_], and then a rule for matching identifiers of the form [a-zA-Z_][a-zA-Z_0-9]*, the rule for matching identifiers will catch any identifiers which start with the name of a reserved word without having to back up and try to match again.
According to the faq, flex generates a deterministic finite automaton which "does all the matching simultaneously, in parallel." The result of this is, as was said above, that the speed of the scanner is independent of the number of rules. On the other hand, string comparison is linear in the number of rules.
As a result, the reserved word rules should, in fact, be considerably faster than a lookup table.

How to start working with a large decision table

Today I've been presented with a fun challenge and I want your input on how you would deal with this situation.
So the problem is the following (I've converted it to demo data as the real problem wouldn't make much sense without knowing the company dictionary by heart).
We have a decision table that has a minimum of 16 conditions. Because it is an impossible feat to manage all of them (2^16 possibilities) we've decided to only list the exceptions. Like this:
As an example I've only added 10 conditions but in reality there are (for now) 16. The basic idea is that we have one baseline (the default) which is valid for everyone and all the exceptions to this default.
Example:
You have a foreigner who is also a pirate.
If you go through all the exceptions one by one, and condition by condition you remove the exceptions that have at least one condition that fails. In the end you'll end up with the following two exceptions that are valid for our case. The match is on the IsPirate and the IsForeigner condition. But as you can see there are 2 results here, well 3 actually if you count the default.
Our solution
Now what we came up with on how to solve this is that in the GUI where you are adding these exceptions, there should run an algorithm which checks for such cases and force you to define the exception more specifically. This is only still a theory and hasn't been tested out but we think it could work this way.
My Question
I'm looking for alternative solutions that make the rules manageable and prevent the problem I've shown in the example.
Your problem seem to be resolution of conflicting rules. When multiple rules match your input, (your foreigner and pirate) and they end up recommending different things (your cangetjob and cangetevicted), you need a strategy for resolution of this conflict.
What you mentioned is one way of resolution -- which is to remove the conflict in the first place. However, this may not always be possible, and not always desirable because when a user adds a new rule that conflicts with a set of old rules (which he/she did not write), the user may not know how to revise it to remove the conflict.
Another possible resolution method is prioritization. Mark a priority on each rule (based on things like the user's own authority etc.), sort the matching rules according to priority, and apply in ascending sequence of priority. This usually works and is much simpler to manage (e.g. everybody knows that the top boss's rules are final!)
Prioritization may also be used to mark a certain rule as "global override". In your example, you may want to make "IsPirate" as an override rule -- which means that it overrides settings for normal people. In other words, once you're a pirate, you're treated differently. This make it very easy to design a system in which you have a bunch of normal business rules governing 90% of the cases, then a set of "exceptions" that are treated differently, automatically overriding certain things. In this case, you should also consider making "?" available in the output columns as well.
One other possible resolution method is to include attributes in each of your conditions. For example, certain conditions must have no "zeros" in order to pass (? doesn't matter). Some conditions must have at least one "one" in order to pass. In other words, mark each condition as either "AND", "OR", or "XOR". Some popular file-system security uses this model. For example, CanGetJob may be AND (you want to be stringent on rights-to-work). CanBeEvicted may be OR -- you may want to evict even a foreigner if he is also a pirate.
An enhancement on the AND/OR method is to provide a threshold that the total result must exceed before passing that condition. For example, putting CanGetJob at a threshold of 2 then it must get at least two 1's in order to return 1. This is sometimes useful on conditions that are not clearly black-and-white.
You can mix resolution methods: e.g. first prioritize, then use AND/OR to resolve rules with similar priorities.
The possibilities are limitless and really depends on what your actual needs are.
To me this problem reminds business rules engine where there is no known algorithm to define outputs from inputs (e.g. using boolean logic) but the user (typically some sort of administrator) has to define all or some the logic itself.
This might sound a bit of an overkill but OTOH this provides virtually limit-less extension capabilities: you don't have to code any new business logic, just define a new rule set.
As I understand your problem, you are looking for a nice way to visualise the editing for these rules. But this all depends on your programming language and the tool you select for this. Java, for example, has JBoss Drools. Quoting their page:
Drools Guvnor provides a (logically
centralized) repository to store you
business knowledge, and a web-based
environment that allows business users
to view and (within certain
constraints) possibly update the
business logic directly.
You could possibly use this generic tool or write your own.
Everything depends on what your actual rules will look like. Rules like 'IF has an even number of these properties THEN' would be painful to represent in this format, whereas rules like 'IF pirate and not geek THEN' are easy.
You can 'avoid the ambiguity' by stating that you'll always be taking the first actual match, in other words your rules have a priority. You'd then want to flag rules which have no effect because they are 'shadowed' by rules higher up. They're not hard to find, so it's something your program should do.
Your interface could also indicate groups of rules where rules within the group can be in any order without changing the outcomes. This will add clarity to what the rules are really saying.
If some of your outputs are relatively independent of the others, you will also get a more compact and much clearer table by allowing question marks in the output. In that design the scan for first matching rule is done once for each output. Consider for example if 'HasChildren' is the only factor relevant to 'Can Be Evicted'. With question marks in the outputs (= no effect) you could be halving the number of exception rules.
My background for this is circuit logic design, not business logic. What you're designing is similar to, but not the same as, a PLA. As long as your actual rules are close to sum of products then it can work well. If your rules aren't, for example the 'even number of these properties' rule, then the grid like presentation will break down in a combinatorial explosion of cases. Your best hope if your rules are arbitrary is to get a clearer more compact presentation with either equations or with diagrams like a circuit diagram. To be avoided, if you can.
If you are looking for a Decision Engine with a GUI, than you can try this one: http://gandalf.nebo15.com/
We just released it, it's open source and production ready.
You probably need some kind of inference engine. Think about doing it in prolog.