Decision Report generation in Drools

How can we generate a decision report in Drools, like the OPA Decision Report?
I've checked the Drools websites and documentation, but I couldn't find any concrete information on this topic.

You'll have to enable all your rules to report that they are fired.
You can do this by adding (e.g.) the output of a log file entry to the rules' right hand side (or even a simple println to some text file).
A more generic way can be achieved by adding an event listener. A "rule fired" event can then access the rule to retrieve a metadata item and write an entry to a log, or file, or store it in a list or whatever. Obviously this is cleaner and safer (you'll detect a missing metadata entry) and more flexible.
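The listener approach can be sketched with a small collector that the "rule fired" callback would feed. This is a minimal sketch in plain Java, not Drools API: the class `DecisionReport`, the method `record`, and the metadata key `reportText` are all hypothetical names, and the wiring into an actual event listener is omitted so the block stays free of Drools dependencies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical report collector; a "rule fired" listener would call record().
class DecisionReport {
    // Assumed metadata key; use whatever key your rules actually declare.
    static final String REPORT_KEY = "reportText";

    private final List<String> entries = new ArrayList<>();

    // Called once per fired rule with the rule's name and its metadata map.
    void record(String ruleName, Map<String, Object> metadata) {
        Object text = metadata.get(REPORT_KEY);
        if (text == null) {
            // Failing loudly is how a missing metadata entry gets detected.
            throw new IllegalStateException(
                "Rule '" + ruleName + "' has no " + REPORT_KEY + " metadata");
        }
        entries.add(ruleName + ": " + text);
    }

    // Returns a copy so callers cannot modify the collected report.
    List<String> entries() {
        return new ArrayList<>(entries);
    }
}
```

Collecting into a list rather than writing straight to a log leaves the choice of output (file, log, response payload) to the caller once rule firing has finished.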
My white paper on rules design patterns describes a technique where you write rules to keep track of individual conditions being met by some fact, and at the end, whether or not rules fired, you can assess what isn't fulfilled. It requires more work than the "all-or-nothing" rule, but it isn't prohibitive (I think).

Related

Does drools have any validate utility that will scan a knowledge base?

If an incorrect rule is pushed to the knowledge base, can we scan the knowledge base to filter out the bad rules, which may create problems later?
Not only does this not exist, but it's not possible.
Simple syntax errors
The really easy problems -- straight up syntax errors -- are caught at runtime when the Drools rules are compiled. The rules themselves won't be loaded at all, and the Kie classes will throw an exception.
So, for example, let's say you were in a rush and you forgot the "when" keyword:
rule "Example 1"
salience 9
$s: SomeObject( foo == "bar" )
then
$s.doSomething()
end
This will be caught quickly at the time your rules are loaded. Other syntax errors like missing imports will be caught as well.
If you're following unit testing best practices, this will be caught at the unit testing stage, which -- given proper CI practices -- should keep the bad rules from being merged into your main code line.
If you're loading your rules from a Maven repository as a kjar, you'll likely want to have some sort of testing harness before publishing that kjar so that you don't end up with a situation where your service pulls new rules and ends up with a nonfunctional artifact.
Logical errors
Unlike syntax errors which can be caught early, logical errors really can't. The problem boils down to the fact that Drools is a framework for business logic -- it doesn't actually understand your business logic. And it can't, and it shouldn't need to.
If you write bad rules that put yourself into a bad situation, Drools has no way of actually identifying this because it doesn't understand your business needs.
Consider the following rules:
rule "A"
when
$f: Foo( value == 0 )
then
$f.setValue(100);
update($f);
end
rule "B"
when
$f: Foo( value > 10 )
then
$f.setValue(0);
update($f);
end
Assuming your initial input is Foo(value: 50), rule B will fire, setting value to 0. Then rule A will fire, setting value to 100. Then rule B will fire, setting value to 0. And so on, forever, until you kill the process manually.
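To make the data dependence concrete, here is a plain-Java imitation of rules "A" and "B" with a cap on the number of firings. It is only a sketch: `LoopDemo` and `fireWithCap` are hypothetical names, and the loop stands in for the engine re-evaluating rules after each update; it is not Drools code.

```java
// Plain-Java imitation of rules "A" and "B"; not Drools code.
class LoopDemo {
    // Fires whichever rule matches until none does or the cap is reached.
    // Returns the number of firings that happened.
    static int fireWithCap(int initialValue, int maxFirings) {
        int value = initialValue;
        int fired = 0;
        while (fired < maxFirings) {
            if (value == 0) {          // rule "A": value == 0
                value = 100;
            } else if (value > 10) {   // rule "B": value > 10
                value = 0;
            } else {
                break;                 // no rule matches, so firing stops
            }
            fired++;
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(fireWithCap(50, 1000)); // loops until the cap
        System.out.println(fireWithCap(5, 1000));  // no rule ever matches
    }
}
```

With an input of 50 the firing count runs into the cap, while an input of 5 never triggers either rule, so the same two rules are harmless or fatal depending purely on the data.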
This simple infinite loop is poor rule design, but it's only a bug because of your data characteristics. The above example is obvious and contrived, but that's because I'm just using it as an example. In practice, I once saw a set of 27 rules that triggered in a particular order given a very specific input and looped like that -- it took me a whole week to track down while manually tracing transactions through a customer system.
Drools can't "know" that this will cause an infinite loop because it doesn't understand your business rules or data. In the above Foo example with rules A and B, you could say "well obviously Drools could see that it's setting value in a way that'll trigger rules in a loop given the 'update' call" -- but you're making assumptions. What if I told you that there are situations where these rules don't loop? I never showed you the Foo model, maybe setValue has side effects.
The way you'd catch these is by actually validating your business logic. I usually start with unit tests and treat it as if I'm testing an "if" condition -- check the boundaries, check the edge cases, check the good value, check an obviously bad value, etc. Then I add end-to-end tests passing in real (sanitized) inputs collected from production -- the most common inputs, the weird inputs, the ones that have shown up in previous bug reports, etc.
And because that's not enough and you really need an understanding of all rules, we have internal style guidelines around DRL design to keep something like the above looping examples from happening. (We don't permit update, for example. There are specific rules around salience and retract as well.)
But built-in validation for this? Not actually possible; you need to do your own due diligence just like you would for your code.

How do I know which drools rule is running now?

For example, when I load a lot of Drools rules to run, how do I know which rule is running right now, so that I can track down a problem rule?
Assuming you're talking about the right hand side of the rules, you'll want to use an AgendaEventListener. This is an interface which defines a listener that you can create that watches the Event Lifecycle. For more information about the event model, please refer to the Drools documentation.
The easiest way to do this would be to extend either DefaultAgendaEventListener or DebugAgendaEventListener. Both of these classes implement all of the interface methods. The Default listener implements each method as a "no-op", so you can override just the methods you care about. The Debug listener implements each method with a logging statement, logging the toString() of the triggering event to INFO. If you're just learning about the Drools lifecycle, hooking up the various Debug listeners is a great way to watch and learn how rules and events process in rules.
(Also the cool thing about listeners is that they allow you to put breakpoints in the "when" clause that trigger when specific conditions are met -- eg when a rule match is created. In general I find that listeners are a great debugging tool because they allow you to put breakpoints in methods that trigger when different parts of the Drools lifecycle occur.)
Anyway, what you'll want to do is create an event listener and then pay attention to one or more of these specific events:
BeforeMatchFired
AfterMatchFired
MatchCreated
Which events to pay attention to depends on where you think the issue is.
If you think the issue is in the "when" clause (left-hand side, LHS), the MatchCreated event is what is triggered when Drools evaluates the LHS and decides that this rule is valid for firing based on the input data. It is then put on, effectively, a priority queue based on salience. When the rule is the highest priority on the queue, it is picked up for firing -- at this point the BeforeMatchFired event is triggered; note that this is before the "then" clause (right-hand side, RHS) is evaluated. Then Drools will actually do the work on the RHS, and once it finishes, trigger the AfterMatchFired.
Things get a little more complicated when your rules do things like updates/retracts/etc -- you'll start having to consider potential match cancellations when Drools re-evaluates the LHS and decides that a rule is no longer valid to be fired per the facts in working memory. But in general, these are the tools you'll want to start with.
The way I would traditionally identify long-running rules would be to start timing within the BeforeMatchFired and to stop timing in the AfterMatchFired, and then log the resulting rule execution time. Note that you want to be careful here to log the execution of the current rule, tracking it by name; if your rule extends another rule you might find that your execution flow goes BeforeMatchFired(Child) -> BeforeMatchFired(Parent) -> AfterMatchFired(Parent) -> AfterMatchFired(Child), so if you're naively stopping a shared timer you might start having issues. My preferred way of doing this is by tracking timers by rule name in thread local or even a thread-safe map implementation, but you can go whichever route you'd like.
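The per-rule timing idea might look like the following sketch. It deliberately avoids any Drools types: `RuleTimers`, `start`, and `stop` are hypothetical names, and the assumption is that your listener calls `start` from its before-match-fired callback and `stop` from its after-match-fired callback, passing the rule name.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: call start() before a rule fires and stop() after it.
class RuleTimers {
    private final Map<String, Long> startedAt = new ConcurrentHashMap<>();
    private final Map<String, Long> totalNanos = new ConcurrentHashMap<>();

    void start(String ruleName) {
        startedAt.put(ruleName, System.nanoTime());
    }

    // Returns elapsed nanoseconds for this firing, or -1 if start() was never called.
    long stop(String ruleName) {
        Long begin = startedAt.remove(ruleName);
        if (begin == null) {
            return -1L;
        }
        long elapsed = System.nanoTime() - begin;
        totalNanos.merge(ruleName, elapsed, Long::sum);
        return elapsed;
    }

    // Accumulated time across all completed firings of the rule.
    long totalNanos(String ruleName) {
        return totalNanos.getOrDefault(ruleName, 0L);
    }
}
```

Because timers are keyed by rule name, a nested Child/Parent sequence does not stop the wrong timer. The sketch keys on the name alone, so if several sessions fire the same rule concurrently you would need a per-thread or per-session key as well.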
If you're using a very new version of Drools (7.41+), there is a new library called drools-metric which you can use to identify slow rules. I haven't personally used this library yet because the newest versions of Drools have started introducing non-backwards-compatible changes in minor releases, but this is an option as well.
You can read more about drools-metric in the official documentation here (you'll need to scroll down a bit.) There's some tuning you'll need to do because the module only logs instances where the thresholds are exceeded. The docs that I've linked to include the Maven dependency you'll need to import, along with information about configuration, and some examples of the output and how to understand what it's telling you.

How Drools works?

I have a scenario wherein I need to add rules to a rule engine dynamically.
What if I add same rule twice/multiple times?
I am not able to work out the exact behavior of Drools from my POC (I am a newbie to Drools).
Also, does a rule, once inserted, remain in the knowledge base until I explicitly remove it?
You cannot add the same rule twice (which is sufficient to rule out "multiple times"). If it is the "same", it simply replaces the previous "same" rule. If only the titles differ, then (it isn't the same rule and) you have two rules with the same LHS and the same RHS. This may, or may not, produce the same reaction a second time: this depends on what the RHS does or does not do.
You can and should clarify these things by reading the documentation and/or experimenting with a simple setup.

How to start working with a large decision table

Today I've been presented with a fun challenge and I want your input on how you would deal with this situation.
So the problem is the following (I've converted it to demo data as the real problem wouldn't make much sense without knowing the company dictionary by heart).
We have a decision table that has a minimum of 16 conditions. Because it is an impossible feat to manage all of them (2^16 possibilities) we've decided to only list the exceptions. Like this:
As an example I've only added 10 conditions but in reality there are (for now) 16. The basic idea is that we have one baseline (the default) which is valid for everyone and all the exceptions to this default.
Example:
You have a foreigner who is also a pirate.
If you go through all the exceptions one by one, and condition by condition you remove the exceptions that have at least one condition that fails. In the end you'll end up with the following two exceptions that are valid for our case. The match is on the IsPirate and the IsForeigner condition. But as you can see there are 2 results here, well 3 actually if you count the default.
Our solution
What we came up with is that the GUI where you add these exceptions should run an algorithm which checks for such cases and forces you to define the exception more specifically. This is still only a theory and hasn't been tested, but we think it could work this way.
My Question
I'm looking for alternative solutions that make the rules manageable and prevent the problem I've shown in the example.
Your problem seems to be resolution of conflicting rules. When multiple rules match your input (your foreigner and pirate) and they end up recommending different things (your cangetjob and cangetevicted), you need a strategy for resolution of this conflict.
What you mentioned is one way of resolution -- which is to remove the conflict in the first place. However, this may not always be possible, and not always desirable because when a user adds a new rule that conflicts with a set of old rules (which he/she did not write), the user may not know how to revise it to remove the conflict.
Another possible resolution method is prioritization. Mark a priority on each rule (based on things like the user's own authority etc.), sort the matching rules according to priority, and apply in ascending sequence of priority. This usually works and is much simpler to manage (e.g. everybody knows that the top boss's rules are final!)
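As a rough illustration of prioritization, the sketch below sorts the matching rules by priority and applies them in ascending order, so the highest-priority rule writes last and wins. The types `PrioritizedRule` and `PriorityResolver`, and the use of a plain map of output names to booleans, are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical rule shape: a priority plus the outputs the rule wants to set.
class PrioritizedRule {
    final int priority;
    final Map<String, Boolean> outputs;

    PrioritizedRule(int priority, Map<String, Boolean> outputs) {
        this.priority = priority;
        this.outputs = outputs;
    }
}

class PriorityResolver {
    // Starts from the defaults and applies matching rules from lowest to
    // highest priority, so higher-priority rules overwrite lower ones.
    static Map<String, Boolean> resolve(Map<String, Boolean> defaults,
                                        List<PrioritizedRule> matching) {
        List<PrioritizedRule> sorted = new ArrayList<>(matching);
        sorted.sort(Comparator.comparingInt(r -> r.priority));
        Map<String, Boolean> result = new LinkedHashMap<>(defaults);
        for (PrioritizedRule rule : sorted) {
            result.putAll(rule.outputs);
        }
        return result;
    }
}
```

Outputs that a rule does not mention keep whatever a lower-priority rule or the default set, which is the map equivalent of leaving a "?" in an output column.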
Prioritization may also be used to mark a certain rule as "global override". In your example, you may want to make "IsPirate" an override rule -- which means that it overrides settings for normal people. In other words, once you're a pirate, you're treated differently. This makes it very easy to design a system in which you have a bunch of normal business rules governing 90% of the cases, then a set of "exceptions" that are treated differently, automatically overriding certain things. In this case, you should also consider making "?" available in the output columns as well.
One other possible resolution method is to include attributes in each of your conditions. For example, certain conditions must have no "zeros" in order to pass (? doesn't matter). Some conditions must have at least one "one" in order to pass. In other words, mark each condition as either "AND", "OR", or "XOR". Some popular file-system security uses this model. For example, CanGetJob may be AND (you want to be stringent on rights-to-work). CanBeEvicted may be OR -- you may want to evict even a foreigner if he is also a pirate.
An enhancement on the AND/OR method is to provide a threshold that the total result must exceed before passing that condition. For example, with CanGetJob at a threshold of 2, it must get at least two 1's in order to return 1. This is sometimes useful on conditions that are not clearly black-and-white.
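One way to read the AND, OR, and threshold modes is as different ways of combining the values that all matching rules propose for a single output. The sketch below is an interpretation, not a reference implementation: `OutputCombiner` is a hypothetical name, and `null` is used to stand for "?" (the rule has no opinion).

```java
import java.util.List;

// Hypothetical combiner for one output column across all matching rules.
// Each vote is TRUE (1), FALSE (0), or null ("?", no opinion).
class OutputCombiner {
    // Counts the 1s among the votes.
    static int countOnes(List<Boolean> votes) {
        int ones = 0;
        for (Boolean v : votes) {
            if (Boolean.TRUE.equals(v)) {
                ones++;
            }
        }
        return ones;
    }

    // AND: passes only if no matching rule says 0 ("?" doesn't matter).
    static boolean and(List<Boolean> votes) {
        for (Boolean v : votes) {
            if (Boolean.FALSE.equals(v)) {
                return false;
            }
        }
        return true;
    }

    // OR: passes if at least one matching rule says 1.
    static boolean or(List<Boolean> votes) {
        return countOnes(votes) >= 1;
    }

    // Threshold: passes if at least `threshold` matching rules say 1.
    static boolean atLeast(List<Boolean> votes, int threshold) {
        return countOnes(votes) >= threshold;
    }
}
```

Note that OR is just the threshold mode with a threshold of 1, so a single threshold per output column may be enough if AND-style strictness is not needed.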
You can mix resolution methods: e.g. first prioritize, then use AND/OR to resolve rules with similar priorities.
The possibilities are limitless and really depend on what your actual needs are.
To me this problem is reminiscent of a business rules engine, where there is no known algorithm to derive outputs from inputs (e.g. using boolean logic) and the user (typically some sort of administrator) has to define all or some of the logic.
This might sound a bit of an overkill but OTOH this provides virtually limit-less extension capabilities: you don't have to code any new business logic, just define a new rule set.
As I understand your problem, you are looking for a nice way to visualise the editing for these rules. But this all depends on your programming language and the tool you select for this. Java, for example, has JBoss Drools. Quoting their page:
Drools Guvnor provides a (logically centralized) repository to store you business knowledge, and a web-based environment that allows business users to view and (within certain constraints) possibly update the business logic directly.
You could possibly use this generic tool or write your own.
Everything depends on what your actual rules will look like. Rules like 'IF has an even number of these properties THEN' would be painful to represent in this format, whereas rules like 'IF pirate and not geek THEN' are easy.
You can 'avoid the ambiguity' by stating that you'll always be taking the first actual match, in other words your rules have a priority. You'd then want to flag rules which have no effect because they are 'shadowed' by rules higher up. They're not hard to find, so it's something your program should do.
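Finding shadowed rules in a first-match table can be sketched as below. The encoding is an assumption for illustration: each rule is a string with one character per condition column, '1', '0', or '?' for "don't care", and all rows have the same length. `ShadowFinder` is a hypothetical name, and the check only looks at one earlier rule at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Each rule is a condition row: '1', '0', or '?' (don't care) per column.
class ShadowFinder {
    // True if `earlier` matches every input that `later` matches.
    static boolean covers(String earlier, String later) {
        for (int i = 0; i < earlier.length(); i++) {
            char e = earlier.charAt(i);
            if (e != '?' && e != later.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    // Returns the indexes of rules that can never be the first match
    // because a single earlier rule already covers them.
    static List<Integer> shadowed(List<String> rulesInPriorityOrder) {
        List<Integer> result = new ArrayList<>();
        for (int j = 1; j < rulesInPriorityOrder.size(); j++) {
            for (int i = 0; i < j; i++) {
                if (covers(rulesInPriorityOrder.get(i), rulesInPriorityOrder.get(j))) {
                    result.add(j);
                    break;
                }
            }
        }
        return result;
    }
}
```

This pairwise check misses rules that are shadowed only by several earlier rules taken together (for example "??" after both "1?" and "0?"); catching those would need a check over the combined coverage of all earlier rules.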
Your interface could also indicate groups of rules where rules within the group can be in any order without changing the outcomes. This will add clarity to what the rules are really saying.
If some of your outputs are relatively independent of the others, you will also get a more compact and much clearer table by allowing question marks in the output. In that design the scan for first matching rule is done once for each output. Consider for example if 'HasChildren' is the only factor relevant to 'Can Be Evicted'. With question marks in the outputs (= no effect) you could be halving the number of exception rules.
My background for this is circuit logic design, not business logic. What you're designing is similar to, but not the same as, a PLA. As long as your actual rules are close to sum of products then it can work well. If your rules aren't, for example the 'even number of these properties' rule, then the grid like presentation will break down in a combinatorial explosion of cases. Your best hope if your rules are arbitrary is to get a clearer more compact presentation with either equations or with diagrams like a circuit diagram. To be avoided, if you can.
If you are looking for a Decision Engine with a GUI, then you can try this one: http://gandalf.nebo15.com/
We just released it, it's open source and production ready.
You probably need some kind of inference engine. Think about doing it in prolog.

Rules Based Database Engine

I would like to design a rules based database engine within Oracle for PeopleSoft Time entry application. How do I do this?
A rules-based system needs several key components:
- A set of rules defined as data
- A set of uniform inputs on which to operate
- A rules executor
- Supervisor hierarchy
1. Write out a series of use-cases - what might someone be trying to accomplish using the system?
2. Decide on what things your rules can take as inputs, and what as outputs
3. Describe the rules from your use-cases as a series of data, and thus determine your rule format. Expand step 2 as necessary for this.
4. Create the basic rule executor, and test that it will take the rule data and process it correctly
5. Extend the above to deal with multiple rules with different priorities
6. Learn enough rule engine theory and graph theory to understand common rule-based problems - circularity, conflicting rules etc - and how to use (node) graphs to find cases of them
7. Write a supervisor hierarchy that is capable of managing the ruleset and taking decisions based on the possible problems above. This part is important, because it is your protection against foolishness on the part of the rule creators causing runtime failure of the entire system.
8. Profit!
Broadly, rules engines are an exercise in managing complexity. If you don't manage it, you can easily end up with rules that cascade from each other causing circular loops, race-conditions and other issues. It's very easy to construct these accidentally: consider an email program which you have told to move mail from folder A to B if it contains the magic word 'beta', and from B to A if it contains the word 'alpha'. An email with both would be shuttled back and forth until something broke, preventing all other rules from being processed.
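A supervisor could flag the folder example with a simple graph check. The sketch below assumes a hypothetical model in which each rule is reduced to a directed edge "from -> to" (here, a folder move) and looks for a cycle with a depth-first search; `RuleCycleCheck` and the edge encoding are illustrative choices only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: each rule is an edge {from, to}, e.g. a folder move.
class RuleCycleCheck {
    static boolean hasCycle(List<String[]> edges) {
        Map<String, List<String>> graph = new HashMap<>();
        for (String[] e : edges) {
            graph.computeIfAbsent(e[0], k -> new ArrayList<>()).add(e[1]);
        }
        Set<String> done = new HashSet<>();    // fully explored nodes
        Set<String> onPath = new HashSet<>();  // nodes on the current DFS path
        for (String node : graph.keySet()) {
            if (visit(node, graph, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String node, Map<String, List<String>> graph,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) {
            return true;   // reached a node already on the path: cycle
        }
        if (done.contains(node)) {
            return false;  // already explored without finding a cycle
        }
        onPath.add(node);
        for (String next : graph.getOrDefault(node, Collections.<String>emptyList())) {
            if (visit(next, graph, done, onPath)) {
                return true;
            }
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }
}
```

A structural cycle only means a loop is possible: in the email example it takes a message containing both words to trigger it. Treat a positive result as a warning to review, not as proof that the rules will loop.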
I have assumed here that you want to learn about the theory and build the engine yourself. alphazero raises the important suggestion of using an existing rules engine library, which is wise - this is the kind of subject that benefits from academic theory.
I haven't tried this myself, but an obvious approach is to use Java procedures in the Oracle database, and use a Java rules engine library in that code.
Try:
http://www.oracle.com/technology/tech/java/jsp/index.html
http://www.oracle.com/technology/tech/java/java_db/pdf/TWP_AppDev_Java_DB_Reduce_your_Costs_and%20_Extend_your_Database_10gR1_1113.PDF
and
http://www.jboss.org/drools/
or
http://www.jessrules.com/
--
Basically you'll need to capture data events (inserts, updates, deletes), map them to your rulespace's events, and apply rules.