Can optaplanner solve partly pre-assigned planning entities using Drools rules? - drools

We are using the time dependent vehicle routing problem example of Optaplanner 6.2.
In our case the domain model consists of activities (corresponding to customers) and technicians (corresponding to vehicles).
Is it possible to initialize an optimization with partly pre-assigned activities to certain technicians, whereby the rest of activities is not assigned?
This would correspond to the case of optaplanner (cvrptw-case) when we stop the solving or wait for the solution, and then add at the end of the solved xml-file not assigned activities.
This file would then be used for the further optimization as input file. Here it is mandatory to lock the already assigned activities.
Can such a starting state: initially locked consecutive
chain-parts of pre-defined activities at the chain-start which should not be rearranged - while the rest activities should be optimized and put after the pre-assigned activities into the existing chains, handled with incremental constraint rules (with Drools)?

Read over the Immovable Planning Entities and Nonvolatile Replanning sections of the Repeated Planning chapter of the manual. It's fairly well explained.

Related

When to use multiple KieBases vs multiple KieSessions?

I know that one can utilize multiple KieBases and multiple KieSessions, but I don't understand under what scenarios one would use one approach vs the other (I am having some trouble in general understanding the definitions and relationships between KieContainer, KieBase, KieModule, and KieSession). Can someone clarify this?
You use multiple KieBases when you have multiple sets of rules doing different things.
KieSessions are the actual session for rule execution -- that is, they hold your data and some metadata and are what actually executes the rules.
Let's say I have an application for a school. One part of my application monitors students' attendance. The other part of my application tracks their grades. I have a set of rules which decides if students are truant and we need to talk to their parents. I have a completely unrelated set of rules which determines whether a student is having trouble academically and needs to be put on probation/a performance plan.
These rules have nothing to do with one another. They have completely separate concerns, different rule inputs, and are triggered in different parts of the application. The part of the application that is tracking attendance doesn't need to trigger the rules that monitor student performance.
For this application, I would have two different KieBases: one for attendance, and one for academics. When I need to fire the rules, I fire one or the other -- there is no use case for firing both at the same time.
The KieSession is the runtime for when we fire those rules. We add to it the data we need to trigger the rules, and it also tracks some other metadata that's really not relevant to this discussion. When firing the academics rules, I would be adding to it the student's grades, their classes, and maybe some information about the student (eg the grade level, whether they're an "honors" student, tec.). For the attendance rules, we would need the student information, plus historical tardiness/absence records. Those distinct pieces of data get added to the sessions.
When we decide to fire rules, we first get the appropriate KieBase -- academics or attendance. Then we get a session for that rule set, populate the data, and fire it. We technically "execute" the session, not the rules (and definitely not the rule base.) The rule base is just the collection of the rules; the session is how we actually execute it.
There are two kinds of sessions -- stateful and stateless. As their names imply, they differ with how data is stored and tracked. In most cases, people use stateful sessions because they want their rules to do iterative work on the inputs. You can read more about the specific differences in the documentation.
For low-volume applications, there's generally little need to reuse your KieSessions. Create, use, and dispose of them as needed. There is, however, some inherent overhead in this process, so there comes a point in which reuse does become something that you should consider. The documentation discusses the solution provided out-of-the box for Drools, which is session pooling.
(When trying to wrap your head around this, I like to use an analogy of databases. A session is like a JDBC connection: for small applications you can create them, use them, then close them as you need them. But as you scale you'll quickly find that you need to look into connection pooling to minimize this overhead. In this particular analogy, the rule base would be the database that the rules are executing against -- not the tables!)

How is the state of a BPMN process defined?

Assuming a BPMN process describing activities, gateways, start and end events. As follow:
Each step is managed by a BPMN engine. At one point, how can we tell which is the state of the process ? Activities seem to define some state embodied as actions (e.g. evaluating request). Am I correct ?
Also, if we assume activity represents the state, how do we get a listing of next possible states if we were to navigate through a dedicated follow-up application ?
Should the process be modeled in a more workflow oriented way to express those state/actions possibilities ? I have the intuition that events could also be used to manage states and possible related actions.
Since I am not sure what exactly you understand as state of the process, I will try to define that first. I guess you are aware of the token concept, see a discussion in the Camunda forum:
A token is a BPMN concept that represents a state within a process instance. It does not have any variables or any message.
You may now define the state of the process as a statistics how many tokens at a given time are existing, and how many are currently in a given activity or event.
This statistics can be extracted from your favorite BPMN engine (and seen e.g. in Camunda's Cockpit as little colorful bubbles). With that statistics in hand, you could in principle generate forecast on next possible states, i.e. determine scenarios how many tokens will be in the next time instance probably in each activity.
State has a different meaning in BPMN, it could mean:
1 - Where is the token in the flow?
2 - Is the process flow running correctly or not?
3 - Or, by a specific variable (field) in the forms.
If you mean the third case, which is common in processes, you have to define a field in your data model as enum (depends on the engine) and manually or automatically change its value in the forms.
Obviously, the rather abstract Petri-Net-style token flow semantics of BPMN does not capture the real semantics of business processes. It has just been artificially imposed on BPMN due to academic pressure groups. A really meaningful semantics must refer to the information context of a process in the business system that owns it.
Of course, a business system that is the owner of a process (type), is, at any point during a running process, in a certain complex dynamic information state, some part of which forms the context of the process and can therefore be considered its state.
In fact, the (information) state of a process is essentially given by all the property-value slots of objects that are used or affected by (events/activities of) the process. In addition to these "global variables", the state of a process also includes
the values of (auxiliary) process variables,
the information, which activities have been started (and are ongoing).
Take a look into the Imixs-Workflow project. It is a event orientated workflow engine instead of the task orientated design often seen in BPM engines.
Each task in this kind of workflow engine defines a state in your process model. The workflow engine holds this state until a event is fired. An event defines the transition from one state to another.
You can find examples how to model different szenarious in a event driven workflow model here.

Is rule engine suitable for validating data against set of rules?

I am trying to design an application that allows users to create subscriptions based on different configurations - expressing their interest to receive alerts when those conditions are met.
While evaluating the options for achieving the same, I was thinking about utilizing a generic rule engine such as Drools to achieve the same. Which seemed to be a natural fit to this problem looking at an high-level. But digging deeper and giving it a bit more thought, I am doubting if Business Rule Engine is the right thing to use.
I see Rule engine as something that can select a Rule based on predefined condition and apply the Rule to that data to produce an outcome. Whereas, my requirement is to start with a data (the event that is generated) and identify based on Rules (subscriptions) configured by users to identify all the Rules (subscription) that would satisfy the event being handled. So that Alerts can be generated to all those Subscribers.
To give an example, an hypothetical subscription from an user could be, to be alerted when a product in Amazon drops below $10 in the next 7 days. Another user would have created a subscription to be notified when a product in Amazon drops below $15 within the next 30 days and also offers free one-day shipping for Prime members.
After a bit of thought, I have settled down to storing the Rules/Subscriptions in a relational DB and identifying which Subscriptions are to fire an Alert for an Event by querying against the DB.
My main reason for choosing this approach is because of the volume, as the number of Rules/Subscriptions I being with will be about 1000 complex rules, and will grow exponentially as more users are added to the system. With the query approach I can trigger a single query that can validate all Rules in one go, vs. the Rule engine approach which would require me to do multiple validations based on the number of Rules configured.
While, I know my DB approach would work (may not be efficient), I just wanted to understand if Rule Engine can be used for such purposes and be able to scale well as the number of rules increases. (Performance is of at most importance as the number of Events that are to be processed per minute will be about 1000+)
If rule engine is not the right way to approach it, what other options are there for me to explore rather than writing my own implementation.
You are getting it wrong. A standard rule engine selects rules to execute based on the data. The rules constraints are evaluated with the data you insert into the rule engine. If all constraints in a rule match the data, the rule is executed. I would suggest you to try Drools.

Multi-instance and Loop in BPMN

I am trying to model a certain behaviour, where couple of activities in differents swimlanes supposed to be processed in a loop. Now BPMN uses tokens to ilustrate the flow and paths taken. I wonder how such tokens work in case of loops. Does every activity iteration creates a token which consequently travel through the connected activities?
E.g. Let's say Activity1 will be performed in a loop 10 times. Will that create 10 tokens where each will travel through the remaining activities of the process? Such behaviour would be undesirable, however if I am not mistaken multi-instance activities work that way.
The only solution on my mind which would comply with BPMN specification would be to create a Call activity for the whole block of activities and then run the Call activity in a loop.
Can anyone clarify for me the use of loops and multi-instances in BPMN from the view of tokens?
Thank you in advance!
Based upon my reading of the documentation: https://www.omg.org/spec/BPMN/2.0/PDF The answer from #qwerty_so does not seem to conform to the standard, although in part this seems to be because the question also seems imprecise or at least underspecified.
A token (see glossary) is simply an imaginary object that represents the flow unit in the process diagram. There are at least three different types of loops specified in the standard, which suggest different implications for the flow unit.
Sections 13.2.6 and 12.2.7 describe Loop Activity and Multiple Instance Activities respectively. While the latter, on its face, might not seem like a loop, the standard defines attributes of the activity that suggest otherwise including: MultipleInstanceLoopCharacteristics and ExpressionloopCardinality.
In the former case, it seems that the operational semantics suggest a single flow unit that repeats multiple times according to some policy or even unbounded.
In the latter case, the activity has "multiple instances spawned," including a parallel variant.
That multiple instances can flow forward in parallel, on its face, suggests that the system must at least allow for the possibility of spawning multiple tokens (or conceptually splitting the original token) to support multiple threads proceeding simultaneously along different paths.
That said, the Loop Activity (13.2.6) appears to support the OP's desired semantics.

Rules Based Database Engine

I would like to design a rules based database engine within Oracle for PeopleSoft Time entry application. How do I do this?
A rules-based system needs several key components:
- A set of rules defined as data
- A set of uniform inputs on which to operate
- A rules executor
- Supervisor hierarchy
Write out a series of use-cases - what might someone be trying to accomplish using the system?
Decide on what things your rules can take as inputs, and what as outputs
Describe the rules from your use-cases as a series of data, and thus determine your rule format. Expand 2 as necessary for this.
Create the basic rule executor, and test that it will take the rule data and process it correctly
Extend the above to deal with multiple rules with different priorities
Learn enough rule engine theory and graph theory to understand common rule-based problems - circularity, conflicting rules etc - and how to use (node) graphs to find cases of them
Write a supervisor hierarchy that is capable of managing the ruleset and taking decisions based on the possible problems above. This part is important, because it is your protection against foolishness on the part of the rule creators causing runtime failure of the entire system.
Profit!
Broadly, rules engines are an exercise in managing complexity. If you don't manage it, you can easily end up with rules that cascade from each other causing circular loops, race-conditions and other issues. It's very easy to construct these accidentally: consider an email program which you have told to move mail from folder A to B if it contains the magic word 'beta', and from B to A if it contains the word 'alpha'. An email with both would be shuttled back and forward until something broke, preventing all other rules from being processed.
I have assumed here that you want to learn about the theory and build the engine yourself. alphazero raises the important suggestion of using an existing rules engine library, which is wise - this is the kind of subject that benefits from academic theory.
I haven't tried this myself, but an obvious approach is to use Java procedures in the Oracle database, and use a Java rules engine library in that code.
Try:
http://www.oracle.com/technology/tech/java/jsp/index.html
http://www.oracle.com/technology/tech/java/java_db/pdf/TWP_AppDev_Java_DB_Reduce_your_Costs_and%20_Extend_your_Database_10gR1_1113.PDF
and
http://www.jboss.org/drools/
or
http://www.jessrules.com/
--
Basically you'll need to capture data events (inserts, updates, deletes), map to them to your rulespace's events, and apply rules.