I am trying to model a certain behaviour, where a couple of activities in different swimlanes are supposed to be processed in a loop. BPMN uses tokens to illustrate the flow and the paths taken. I wonder how such tokens work in the case of loops. Does every iteration of an activity create a token, which then travels through the connected activities?
E.g. let's say Activity1 will be performed in a loop 10 times. Will that create 10 tokens, each of which will travel through the remaining activities of the process? Such behaviour would be undesirable; however, if I am not mistaken, multi-instance activities work that way.
The only solution I can think of that would comply with the BPMN specification would be to create a Call Activity for the whole block of activities and then run the Call Activity in a loop.
Can anyone clarify for me the use of loops and multi-instances in BPMN from the point of view of tokens?
Thank you in advance!
Based upon my reading of the documentation (https://www.omg.org/spec/BPMN/2.0/PDF), the answer from @qwerty_so does not seem to conform to the standard, although in part this seems to be because the question is imprecise or at least underspecified.
A token (see glossary) is simply an imaginary object that represents the flow unit in the process diagram. There are at least three different types of loops specified in the standard, which suggest different implications for the flow unit.
Sections 13.2.6 and 13.2.7 describe Loop Activity and Multiple Instances Activity respectively. While the latter, on its face, might not seem like a loop, the standard defines attributes of the activity that suggest otherwise, including MultiInstanceLoopCharacteristics and its loopCardinality expression.
In the former case, the operational semantics suggest a single flow unit that repeats multiple times according to some policy, possibly without bound.
In the latter case, the activity has "multiple instances spawned," including a parallel variant.
That multiple instances can flow forward in parallel, on its face, suggests that the system must at least allow for the possibility of spawning multiple tokens (or conceptually splitting the original token) to support multiple threads proceeding simultaneously along different paths.
That said, the Loop Activity (13.2.6) appears to support the OP's desired semantics.
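To make the difference concrete, here is a toy token counter in Python. It is only a sketch of the reading given above, not the behaviour of any particular BPMN engine: the `Trace` class and both function names are invented for illustration, and the joining of the parallel instances into one outgoing token is an assumption that should be checked against the standard.

```python
from dataclasses import dataclass


@dataclass
class Trace:
    """Records how often the activity ran and how many tokens were involved."""
    executions: int = 0
    max_concurrent_tokens: int = 0
    tokens_leaving: int = 0


def run_standard_loop(iterations: int) -> Trace:
    """Loop activity: a single token re-enters the activity on each iteration."""
    trace = Trace()
    for _ in range(iterations):
        trace.executions += 1
        trace.max_concurrent_tokens = max(trace.max_concurrent_tokens, 1)
    trace.tokens_leaving = 1  # the same token continues once the loop ends
    return trace


def run_parallel_multi_instance(cardinality: int) -> Trace:
    """Parallel multi-instance: one internal token per spawned instance.

    Assumption: the instances are joined before the flow continues,
    so only one token leaves the activity.
    """
    trace = Trace()
    instance_tokens = [f"instance-{i}" for i in range(cardinality)]
    trace.max_concurrent_tokens = len(instance_tokens)
    for _ in instance_tokens:
        trace.executions += 1
    trace.tokens_leaving = 1
    return trace


loop = run_standard_loop(10)
multi = run_parallel_multi_instance(10)
print(loop)
print(multi)
```

In both cases the activity runs 10 times; the difference in this toy model is only how many tokens exist at the same time inside the activity.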
[Image: illustration of the point of freezing]
Context:
I am creating a scalable model of a production line to increase the man-machine optimization ratio. The model will be scaled so that an operator (resource) works on multiple machines (of the same type). During the process flow at a machine, the operator will be seized and released multiple times for different taskings.
Problem:
Entire process freezes when the operator is being seized at multiple seize blocks concurrently.
Thoughts:
Is there a way to create a list to which taskings are added when the resource is currently seized? The resource would then work through the list of taskings whenever it becomes idle. Any other methods to resolve this issue are also appreciated!
If this is going to become a complex model, you may want to consider using a pure agent-based approach.
Your resource has a LinkedList of JobRequest agents that are created and sent by the machines when necessary. They are sorted by some priority.
The resource then simply does one JobRequest after the next.
No ResourcePools or Seize elements required.
This is often the more powerful and flexible approach as you are not bound to the process blocks anymore. But obviously, it needs good control and testing from you :)
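The idea can be sketched outside AnyLogic in a few lines of Python. This is not AnyLogic code: the `Operator` class, its method names, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class JobRequest:
    priority: int                        # lower value = more urgent (assumption)
    sequence: int                        # tie-breaker: first come, first served
    machine: str = field(compare=False)
    task: str = field(compare=False)


class Operator:
    """A resource that owns its own sorted list of pending job requests."""

    def __init__(self):
        self._queue = []
        self._counter = itertools.count()
        self.completed = []

    def request(self, machine, task, priority):
        """Called by a machine; never blocks, the request is simply queued."""
        job = JobRequest(priority, next(self._counter), machine, task)
        heapq.heappush(self._queue, job)

    def work_until_idle(self):
        """Process one job request after the next until the list is empty."""
        while self._queue:
            job = heapq.heappop(self._queue)
            self.completed.append((job.machine, job.task))
        return self.completed


operator = Operator()
operator.request("M1", "load", priority=2)
operator.request("M2", "unjam", priority=1)
operator.request("M1", "inspect", priority=2)
print(operator.work_until_idle())
```

Because machines only enqueue requests and never wait on a seize, two machines asking for the operator at the same moment simply produce two entries in the list.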
Problem: Entire process freezes when the operator is being seized at multiple seize blocks concurrently.
You need to explain your problem better: it is not possible to "seize the same operator at multiple seize blocks concurrently" (unless you are using a resource choice condition or similar to try to 'force' seizing of a particular resource --- even then, this is more accurately framed as 'I've set up resource choice conditions which mean I end up having no valid resources available').
What does your model "freezing" represent? For example, it could just be a natural consequence of having no resources available, especially if you have long delay times or are using Delay blocks with "Until stopDelay() is called" set --- i.e., you are relying on events elsewhere in your model to free agents (and seized resources) from blocks, and with an incorrect model design those events might never happen in some circumstances. (If your model is "freezing" because no resources are available, it should 'unfreeze' when one becomes available.)
During the process flow at a machine, the operator will be seized and released multiple times for different taskings.
You can just do this bit by breaking down the actions at a machine into a number of Seize/Delay/Release actions with different characteristics (or a process flow that loops around a set of these driven by some data if you want it to be more flexible / data-driven).
I know that one can utilize multiple KieBases and multiple KieSessions, but I don't understand under what scenarios one would use one approach vs the other (I am having some trouble in general understanding the definitions and relationships between KieContainer, KieBase, KieModule, and KieSession). Can someone clarify this?
You use multiple KieBases when you have multiple sets of rules doing different things.
KieSessions are the actual session for rule execution -- that is, they hold your data and some metadata and are what actually executes the rules.
Let's say I have an application for a school. One part of my application monitors students' attendance. The other part of my application tracks their grades. I have a set of rules which decides if students are truant and we need to talk to their parents. I have a completely unrelated set of rules which determines whether a student is having trouble academically and needs to be put on probation/a performance plan.
These rules have nothing to do with one another. They have completely separate concerns, different rule inputs, and are triggered in different parts of the application. The part of the application that is tracking attendance doesn't need to trigger the rules that monitor student performance.
For this application, I would have two different KieBases: one for attendance, and one for academics. When I need to fire the rules, I fire one or the other -- there is no use case for firing both at the same time.
The KieSession is the runtime for when we fire those rules. We add to it the data we need to trigger the rules, and it also tracks some other metadata that's really not relevant to this discussion. When firing the academics rules, I would be adding to it the student's grades, their classes, and maybe some information about the student (e.g. the grade level, whether they're an "honors" student, etc.). For the attendance rules, we would need the student information, plus historical tardiness/absence records. Those distinct pieces of data get added to the sessions.
When we decide to fire rules, we first get the appropriate KieBase -- academics or attendance. Then we get a session for that rule set, populate the data, and fire it. We technically "execute" the session, not the rules (and definitely not the rule base.) The rule base is just the collection of the rules; the session is how we actually execute it.
There are two kinds of sessions -- stateful and stateless. As their names imply, they differ with how data is stored and tracked. In most cases, people use stateful sessions because they want their rules to do iterative work on the inputs. You can read more about the specific differences in the documentation.
For low-volume applications, there's generally little need to reuse your KieSessions. Create, use, and dispose of them as needed. There is, however, some inherent overhead in this process, so there comes a point at which reuse becomes something that you should consider. The documentation discusses the solution provided out of the box for Drools, which is session pooling.
(When trying to wrap your head around this, I like to use an analogy of databases. A session is like a JDBC connection: for small applications you can create them, use them, then close them as you need them. But as you scale you'll quickly find that you need to look into connection pooling to minimize this overhead. In this particular analogy, the rule base would be the database that the rules are executing against -- not the tables!)
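The split between "collection of rules" and "runtime that holds data" can be shown with a toy model. This is deliberately not the Drools API: `RuleBase`, `Session`, their methods, and the thresholds in the two example rules are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], str]


class RuleBase:
    """Plays the role of a KieBase: just a named collection of rules."""

    def __init__(self, name, rules):
        self.name = name
        self.rules = tuple(rules)

    def new_session(self):
        return Session(self)


class Session:
    """Plays the role of a KieSession: holds the facts and runs the rules."""

    def __init__(self, rule_base):
        self._rule_base = rule_base
        self._facts = []

    def insert(self, fact):
        self._facts.append(fact)

    def fire_all_rules(self):
        results = []
        for fact in self._facts:
            for rule in self._rule_base.rules:
                if rule.condition(fact):
                    results.append(rule.action(fact))
        return results


# Two unrelated rule sets, as in the school example (thresholds are invented).
attendance = RuleBase("attendance", [
    Rule("truancy", lambda s: s["absences"] > 10,
         lambda s: f"call parents of {s['name']}"),
])
academics = RuleBase("academics", [
    Rule("probation", lambda s: s["gpa"] < 2.0,
         lambda s: f"put {s['name']} on probation"),
])

session = attendance.new_session()
session.insert({"name": "Ann", "absences": 12})
print(session.fire_all_rules())
```

Each part of the application picks the rule base it cares about, opens a session on it, inserts only the data those rules need, and fires that session.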
Assume a BPMN process describing activities, gateways, and start and end events, as follows:
Each step is managed by a BPMN engine. At a given point, how can we tell what the state of the process is? Activities seem to define some state embodied as actions (e.g. evaluating request). Am I correct?
Also, if we assume an activity represents the state, how do we get a listing of the next possible states if we were to navigate through a dedicated follow-up application?
Should the process be modeled in a more workflow-oriented way to express those state/action possibilities? My intuition is that events could also be used to manage states and possible related actions.
Since I am not sure what exactly you understand as state of the process, I will try to define that first. I guess you are aware of the token concept, see a discussion in the Camunda forum:
A token is a BPMN concept that represents a state within a process instance. It does not have any variables or any message.
You may now define the state of the process as a statistic: how many tokens exist at a given time, and how many of them are currently in a given activity or event.
This statistic can be extracted from your favorite BPMN engine (and seen e.g. in Camunda's Cockpit as little colorful bubbles). With that statistic in hand, you could in principle generate a forecast of the next possible states, i.e. determine scenarios for how many tokens will probably be in each activity at the next point in time.
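As a rough sketch of this idea, the state can be held as a count of tokens per element, and the next possible states enumerated by moving one token along one sequence flow. The process in `OUTGOING` and both function names are hypothetical; a real engine would supply the token positions through its own API.

```python
from collections import Counter

# Hypothetical process: which elements can follow each element.
OUTGOING = {
    "Evaluate request": ["Approve", "Reject"],
    "Approve": ["End"],
    "Reject": ["End"],
    "End": [],
}


def process_state(token_positions):
    """Count how many tokens currently sit on each activity or event."""
    return Counter(token_positions)


def next_possible_states(state):
    """All states reachable by moving exactly one token along one sequence flow."""
    successors = []
    for element in state:
        for target in OUTGOING[element]:
            moved = Counter(state)
            moved[element] -= 1
            moved[target] += 1
            successors.append(+moved)  # unary plus drops elements with zero tokens
    return successors


state = process_state(["Evaluate request", "Evaluate request", "Approve"])
for successor in next_possible_states(state):
    print(dict(successor))
```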
State can have different meanings in BPMN. It could mean:
1 - Where is the token in the flow?
2 - Is the process flow running correctly or not?
3 - Or, by a specific variable (field) in the forms.
If you mean the third case, which is common in processes, you have to define a field in your data model as an enum (depending on the engine) and change its value in the forms, either manually or automatically.
Obviously, the rather abstract Petri-Net-style token flow semantics of BPMN does not capture the real semantics of business processes. It has just been artificially imposed on BPMN due to academic pressure groups. A really meaningful semantics must refer to the information context of a process in the business system that owns it.
Of course, a business system that is the owner of a process (type), is, at any point during a running process, in a certain complex dynamic information state, some part of which forms the context of the process and can therefore be considered its state.
In fact, the (information) state of a process is essentially given by all the property-value slots of objects that are used or affected by (events/activities of) the process. In addition to these "global variables", the state of a process also includes
- the values of (auxiliary) process variables,
- the information about which activities have been started (and are ongoing).
Take a look at the Imixs-Workflow project. It is an event-oriented workflow engine, in contrast to the task-oriented design often seen in BPM engines.
Each task in this kind of workflow engine defines a state in your process model. The workflow engine holds this state until an event is fired. An event defines the transition from one state to another.
You can find examples of how to model different scenarios in an event-driven workflow model here.
I've been using Spec Explorer for about a month now on a big project, and it's been going well except for one thing.
Sometimes new states are being generated instead of looping, for example
- Create object, new state
- Do something with object, new state
- Do something that changes nothing (e.g. trying to create the same object, which does not change any state variables): here I get a new state instead of looping
Most of the time it loops, as it should, but sometimes it does not, and there is absolutely no difference in the state comparison view except for the two top lines, which only cover the description of how the state came to be.
Has anyone had similar problems, or does anyone know what's going on?
There are several possible reasons.
But in most cases the problem is: scenarios introduce control states.
Here is the most detailed explanation you can get on "How are identical states identified?"
"Ideally, we would identify two states when they
(a) have the same state contents, and
(b) have the same future behavior.
The reason why (a) is not enough is that enabled actions don’t depend on the state contents only, but also potentially on scenarios applied in a Cord script. Scenarios introduce control states.
The problem here is that checking (b) is not feasible in practice, as it would imply looking ahead all paths stemming from a state.
So we rely on a heuristic, consisting on identifying states that not only have the same contents, but are also produced in the same step of a scenario.
So two states are equivalent if they contain the same data AND can perform the same actions.
For example, in a scenario such as A; A; B*, we have three states, all with the same (empty) contents.
When we compose this scenario in parallel with a model program, states corresponding to these three states will not be merged, regardless of their contents.
As a consequence, when you are comparing two states to understand why they are not merged, you should not just look at the values of their variables (data state), but also at the state description, which provides the control state.
States which have been generated by different machines using the Spec Explorer heuristic cannot safely be considered a single state.
As said, this is just a safe heuristic. So there’s no guarantee that two conceptually-equivalent states will always be merged;
but two states that are not conceptually equivalent should never be merged."
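The heuristic described in the quote can be illustrated with a small Python sketch. This is not how Spec Explorer is implemented: the `StateSpace` class and the string used as the control state are assumptions, chosen only to show why identical data can still yield distinct states.

```python
class StateSpace:
    """Assigns an id to each state; states merge only if data AND control state match."""

    def __init__(self):
        self._ids = {}

    def state_id(self, contents, control_state):
        # The key combines the data state with the scenario step that produced it.
        key = (frozenset(contents.items()), control_state)
        return self._ids.setdefault(key, len(self._ids))


space = StateSpace()
first = space.state_id({"objects": 1}, "scenario step 2")
again = space.state_id({"objects": 1}, "scenario step 2")  # merged: exploration loops
later = space.state_id({"objects": 1}, "scenario step 3")  # same data, new state

print(first == again)   # same id, so the graph loops back
print(first == later)   # different id, so a new state appears
```

This matches the symptom in the question: the state comparison view shows identical variables, and the only difference is in the description lines, which carry the control state.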
I read Deprecating the Observer Pattern with Scala.React and found reactive programming very interesting.
But there is a point I can't figure out: the author describes the signals as the nodes in a DAG (directed acyclic graph). What if you have two signals (or event sources, or models, whatever) depending on each other, i.e. a 'two-way binding', like a model and a view in web front-end programming?
Sometimes it's just inevitable, because the user can change the view and the back end (an asynchronous request, for example) can change the model, and you want the other side to reflect the change immediately.
Loop dependencies in a reactive programming language can be handled with a variety of semantics. The one that appears to have been chosen in scala.React is that of synchronous reactive languages, and specifically that of Esterel. You can find a good explanation of these semantics and their alternatives in the paper "The synchronous languages 12 years later" by A. Benveniste, P. Caspi, S.A. Edwards, N. Halbwachs, P. Le Guernic, and R. de Simone, available at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=1173191&tag=1 or http://virtualhost.cs.columbia.edu/~sedwards/papers/benveniste2003synchronous.pdf.
Replying to @Matt Carkci here, because a comment wouldn't suffice.
In section 7.1 of the paper, Change Propagation, you have
Our change propagation implementation uses a push-based approach based on a topologically ordered dependency graph. When a propagation turn starts, the propagator puts all nodes that have been invalidated since the last turn into a priority queue which is sorted according to the topological order, briefly level, of the nodes. The propagator dequeues the node on the lowest level and validates it, potentially changing its state and putting its dependent nodes, which are on greater levels, on the queue. The propagator repeats this step until the queue is empty, always keeping track of the current level, which becomes important for level mismatches below. For correctly ordered graphs, this process monotonically proceeds to greater levels, thus ensuring data consistency, i.e., the absence of glitches.
and later at section 7.6 Level Mismatch
We therefore need to prepare for an opaque node n to access another node that is on a higher topological level. Every node that is read from during n’s evaluation, first checks whether the current propagation level which is maintained by the propagator is greater than the node’s level. If it is, it proceed as usual, otherwise it throws a level mismatch exception containing a reference to itself, which is caught only in the main propagation loop. The propagator then hoists n by first changing its level to a level above the node which threw the exception, reinserting n into the propagation queue (since it’s level has changed) for later evaluation in the same turn and then transitively hoisting all of n’s dependents.
While there's no mention of any topological constraint (cyclic vs. acyclic), something is not clear (at least to me).
First, the question arises of how the topological order is defined.
Second, the implementation suggests that mutually dependent nodes would loop forever during evaluation through the exception mechanism explained above.
What do you think?
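To have something concrete to reason about, the push-based propagation quoted from section 7.1 can be sketched as follows. This is my own minimal Python reading of that paragraph, not Scala.React's code: the `Node` class and `propagate` function are invented names, and the level-mismatch hoisting from section 7.6 is left out entirely.

```python
import heapq


class Node:
    """A signal with a level one above its highest dependency (sources are level 0)."""

    def __init__(self, name, deps=(), compute=None, value=None):
        self.name = name
        self.deps = list(deps)
        self.compute = compute
        self.dependents = []
        self.level = 1 + max((d.level for d in self.deps), default=-1)
        self.value = value if compute is None else compute(*[d.value for d in self.deps])
        for d in self.deps:
            d.dependents.append(self)


def propagate(invalidated):
    """Validate nodes in ascending level order; returns the evaluation order."""
    queue = [(n.level, n.name, n) for n in invalidated]
    heapq.heapify(queue)
    queued = {n.name for n in invalidated}
    order = []
    while queue:
        _, _, node = heapq.heappop(queue)
        queued.discard(node.name)
        order.append(node.name)
        changed = True  # a source node was changed from outside
        if node.compute is not None:
            new_value = node.compute(*[d.value for d in node.deps])
            changed = new_value != node.value
            node.value = new_value
        if changed:
            for dependent in node.dependents:
                if dependent.name not in queued:
                    queued.add(dependent.name)
                    heapq.heappush(queue, (dependent.level, dependent.name, dependent))
    return order


# Diamond: d depends on b and c, which both depend on a.
a = Node("a", value=1)
b = Node("b", [a], lambda x: x + 1)
c = Node("c", [a], lambda x: x * 2)
d = Node("d", [b, c], lambda x, y: x + y)

a.value = 5
print(propagate([a]), d.value)
```

In this sketch `d` is evaluated once, after both `b` and `c`, which is the glitch freedom the paper describes. Note also that a level is computed from the dependencies at construction time, so two mutually dependent nodes cannot even be given consistent levels here; that is the difficulty raised above.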
After scanning the paper, I can't find where they mention that it must be acyclic. There's nothing stopping you from creating cyclic graphs in dataflow/reactive programming. Acyclic graphs only allow you to create Pipeline Dataflow (e.g. Unix command line pipes).
Feedback and cycles are a very powerful mechanism in dataflow. Without them you are restricted in the types of programs you can create. Take a look at Flow-Based Programming - Loop-Type Networks.
Edit after second post by pagoda_5b
One statement in the paper made me take notice...
For correctly ordered graphs, this process monotonically proceeds to greater levels, thus ensuring data consistency, i.e., the absence of glitches.
To me that says that loops are not allowed within the Scala.React framework. A cycle between two nodes would seem to cause the system to continually try to raise the level of both nodes forever.
But that doesn't mean that you have to encode the loops within their framework. It could be possible to have one path from the item you want to observe and then another, separate, path back to the GUI.
To me, it always seems that too much emphasis is placed on a programming system completing and giving one answer. Loops make it difficult to determine when to terminate. Libraries that use the term "reactive" tend to subscribe to this thought process. But that is just a result of the Von Neumann architecture of computers... a focus on solving an equation and returning the answer. Libraries that shy away from loops seem to be worried about program termination.
Dataflow doesn't require a program to have one right answer or ever terminate. The answer is the answer at this moment of time due to the inputs at this moment. Feedback and loops are expected if not required. A dataflow system is basically just a big loop that constantly passes data between nodes. To terminate it, you just stop it.
Dataflow doesn't have to be so complicated. It is just a very different way to think about programming. I suggest you look at J. Paul Morrison's book "Flow-Based Programming" for a field-tested version of dataflow, or my book (once it's done).
Check your MVC knowledge. The view doesn't update the model, so it won't send signals to it. The controller updates the model. For a C/F converter, you would have two controllers (one for the F control, one for the C control). Both controllers would send signals to a single model (which stores the only real temperature, Kelvin, in a lossless format). The model sends signals to two separate views (one for the C view, one for the F view). No cycles.
Based on the answer from @pagoda_5b, I'd say that you are likely allowed to have cycles (7.6 should handle it, at the cost of performance) but you must guarantee that there is no infinite regress. For example, you could have the controllers also receive signals from the model, as long as you guaranteed that receipt of said signal never caused a signal to be sent back to the model.
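A minimal sketch of that C/F converter layout, with plain method calls standing in for signals. All class and method names here are made up for illustration; the point is only that data flows controller -> model -> view and never back.

```python
class TemperatureModel:
    """Stores the only real temperature, in Kelvin, and notifies the views."""

    def __init__(self, kelvin=273.15):
        self.kelvin = kelvin
        self._views = []

    def attach(self, view):
        self._views.append(view)
        view.render(self.kelvin)

    def set_kelvin(self, kelvin):
        self.kelvin = kelvin
        for view in self._views:
            view.render(kelvin)


class CelsiusView:
    def __init__(self):
        self.text = ""

    def render(self, kelvin):
        self.text = f"{kelvin - 273.15:.1f} C"


class FahrenheitView:
    def __init__(self):
        self.text = ""

    def render(self, kelvin):
        self.text = f"{(kelvin - 273.15) * 9 / 5 + 32:.1f} F"


class CelsiusController:
    def __init__(self, model):
        self.model = model

    def user_typed(self, celsius):
        self.model.set_kelvin(celsius + 273.15)


class FahrenheitController:
    def __init__(self, model):
        self.model = model

    def user_typed(self, fahrenheit):
        self.model.set_kelvin((fahrenheit - 32) * 5 / 9 + 273.15)


model = TemperatureModel()
c_view, f_view = CelsiusView(), FahrenheitView()
model.attach(c_view)
model.attach(f_view)

FahrenheitController(model).user_typed(212)
print(c_view.text, "|", f_view.text)
```

The views only ever read from the model and the controllers only ever write to it, so the dependency graph stays acyclic even though both temperature fields appear to update each other.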
I think the above is a good description, but it uses the word "signal" in a non-FRP style. "Signals" in the above are really messages. If the description in 7.1 is correct and complete, loops in the signal graph would always cause infinite regress as processing the dependents of a node would cause the node to be processed and vice-versa, ad inf.
As @Matt Carkci said, there are FRP frameworks that allow loops, at least to a limited extent. They will either not be push-based, use non-strictness in interesting ways, enforce monotonicity, or introduce "artificial" delays so that when the signal graph is expanded on the temporal dimension (turning it into a value graph) the cycles disappear.