How to reduce the time cost when calling getKieContainer() in Drools?

I'm using the Drools API to build a KieContainer when my application starts, but I noticed that calling getKieContainer() takes a lot of time.
I am looking for a way to reduce the time it takes to get a reusable KieContainer.
KieHelper kieHelper = new KieHelper();
kieHelper.addContent(getContent(), ResourceType.GDST);
KieContainer kieContainer = kieHelper.getKieContainer();

Do it only once. You're supposed to call this only once, when you initialise the application.
Otherwise, you can consider using Drools' executable model.
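One way to make sure the build really happens only once is a small lazy holder that runs the expensive step on first use and returns the same instance afterwards. This is a minimal sketch in plain Java, not part of the Drools API: Lazy and LazyDemo are hypothetical names, and a plain Object stands in for the KieContainer so the example runs without Drools on the classpath.

```java
import java.util.function.Supplier;

/** Builds an expensive object on first use and returns the same instance afterwards. */
final class Lazy<T> {
    private final Supplier<T> factory;
    private volatile T value;

    Lazy(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T result = value;
        if (result == null) {
            // Double-checked locking: only the first caller pays the build cost.
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = factory.get();
                    value = result;
                }
            }
        }
        return result;
    }
}

final class LazyDemo {
    public static void main(String[] args) {
        // Hypothetical stand-in for the slow build step; in the application the
        // supplier would be () -> kieHelper.getKieContainer().
        Lazy<Object> container = new Lazy<>(() -> {
            System.out.println("building (slow, runs once)");
            return new Object();
        });
        Object first = container.get();
        Object second = container.get();
        System.out.println(first == second); // the same instance is reused
    }
}
```

With the container cached like this, each request would only create a new session from it (for example with kieContainer.newKieSession()) rather than rebuilding the container.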

Related

I am having a problem with Flutter/Dart Async code execution, as in how does it work

In Flutter we use async, await and Future. Can someone explain: if we don't use another thread (which we can't in Dart) and run the job on the main UI thread only, won't the app become choppy? Even if we are awaiting the job, it will ultimately execute on the UI thread.
I read somewhere about isolates as well, but I cannot form a clear picture. Could someone explain in detail?
I think you missed something when it comes to asynchronous methods in Dart.
Dart is a single-threaded programming language, whereas Java and C# are multi-threaded. Forget the idea that async starts a new thread; that doesn't happen in Dart.
This means that Dart can only run one instruction at a time, while Java can run multiple instructions concurrently.
As a rule, everything you do in Dart starts on the UI thread. Whatever method you call, whether sync, async or then, runs on the UI thread, since Dart is single-threaded.
In single-threaded languages like JavaScript and Dart, an async method is NOT executed in parallel; it follows the regular sequence of events handled by the event loop. There are some problems (and some approaches, as we will see below) if you run the following code in a multi-threaded language where fetch takes some time to execute:
String user = new Database.fetch(David);
String personalData = new Database.fetch(user);
You will receive David's data in user, and after that, you will receive the personal data.
This will lock your UI. Languages like Java, however, have threads, so you can perform this task in the background on another thread and the UI thread will run smoothly.
If you do this in Dart:
String user = new Database.fetch(David);
String personalData = new Database.fetch(user);
user will not hold the fetched value when it is passed to the second fetch, because fetch returns a Future.
How to solve this in Dart?
String user = await Database.fetch(David);
String personalData = await Database.fetch(user);
For those who like a more functional paradigm (I don't like it) you can use then.
Database.fetch(David).then((user) {
  Database.fetch(user).then((personal) {
    String personalData = personal;
  });
});
However, imagine that you have billions of records in that database. This heavy task will probably cause the animations on your screen to freeze, and the user will see jank in the UI. Isolates were invented for that purpose.
Dart isolates give you a way to perform real multi-threading in Dart. They have their own separate heaps (memory) and run code in the background, just like the threads of multi-threaded languages. I could explain how isolates work, but it would make this response very long, and the goal is just to differentiate asynchronous from multi-threaded methods.
A simple way to solve the problem above using isolates would be to use compute.
compute was created to make isolates easier to use: you just pass the function and the data that the function will process, and that's it!
It is important to remember that compute returns a Future, so you have to use await or then to get its result.
In our example, we could create a new thread and get its result when we finish by just calling compute like this:
String user = await compute(Database.fetch, David);
String personalData = await compute(Database.fetch, user);
Very simple, isn't it?
In summary:
In Dart, anything that takes some time to complete is represented by a "Future".
To wait for the result of a Future to be assigned to a variable, use await or then.
The asynchronous methods (await and then) can be used to obtain a result from a Future, and are executed ON THE MAIN THREAD because Dart is single-threaded.
If you want to run any function on a new thread, you can create an isolate. Dart offers an easy-to-use isolate wrapper called compute, where you only need to pass the method to run and the data it will process, and it will return its result as a Future.
NOTE: if you are going to use compute, make sure you pass a static or top-level method. It was no accident that I used Database.fetch in the example: if you need to call Database().fetch, or otherwise need to create an instance first, the method is not static and will not work with isolates.
English is not my first language and I didn't want to write so much because of that, but I hope I helped differentiate multi-threaded asynchronous programming from single-threaded asynchronous programming.

How can my Apache Spark code emit periodic heartbeats?

I'm developing an Apache Spark job to run and I plan to deploy it as one stage in an AWS Step Function. Unfortunately the particular way that I wish to deploy it isn't directly supported by Step Functions at this time; however, Step Functions has an API for a generic Task that I can make use of. Essentially, once the task is started, it needs to periodically make a call to sendTaskHeartbeat and then on completion it needs to call sendTaskSuccess.
My Spark job is written in Scala, and I'm wondering what the best approach is for running something on a timer within the context of an Apache Spark job. I see from other answers that I could make use of java.util.concurrent or perhaps java.util.Timer, but I'm not sure how the threading would work specifically in a Spark context. Since Spark is already doing a lot to distribute my code across each node, I'm not sure if there are some hidden considerations I need to be wary of (i.e. I don't really want more than one instance of my timer, I want to make sure it stops when the sparky parts of my code complete, etc.).
Is it safe to use a regular Timer in a Spark job? If I did something like this:
val timer = new Timer()
val task = new TimerTask {
/* sendTaskHeartbeat */
}
timer.schedule(task, 1000L, 1000L)
val myRDD = spark.read.parquet(pathToParquetFiles)
val transformedRDD = myRDD.map( /* business logic */ )
transformedRDD.saveAsHadoopDataset(config) andThen task.cancel
Would that be sufficient? Or is there a risk that this code would lose track of the task and timer objects by the time it reaches the andThen, due to the distribution across nodes?
I believe your implementation is sufficient. The timer task will only run on the driver node (as long as you do not reference it inside the RDD transformations).
The only thing you need to be careful about is error handling. Make sure the timer task is terminated when the transformation throws an error; otherwise your job could get stuck because the timer thread is still alive.
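The error-handling point can be sketched with a try/finally around the job. This is a minimal illustration in plain Java (the same java.util.concurrent classes are callable from Scala), using a ScheduledExecutorService in place of java.util.Timer. HeartbeatRunner and runWithHeartbeat are hypothetical names; the heartbeat Runnable stands in for the sendTaskHeartbeat call and the job Runnable for the Spark action, both of which are assumptions here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class HeartbeatRunner {
    /**
     * Runs the job while calling heartbeat at a fixed period on a single daemon
     * thread, and always stops the heartbeat, even when the job throws.
     */
    static void runWithHeartbeat(Runnable job, Runnable heartbeat, long periodMillis) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "heartbeat");
            t.setDaemon(true); // a daemon thread cannot keep the JVM alive on its own
            return t;
        });
        scheduler.scheduleAtFixedRate(heartbeat, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
        try {
            job.run(); // e.g. the code that triggers the Spark action on the driver
        } finally {
            scheduler.shutdownNow(); // stop heartbeats on success and on failure
        }
    }
}
```

The exception from the job still propagates to the caller, so the failure can be reported after the heartbeat has stopped. Note that scheduleAtFixedRate suppresses later runs if the heartbeat itself throws, so the heartbeat body may need its own try/catch.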
I ended up making use of a combination of a java.util.Timer and a SparkListener. I instantiate the Timer on the onJobStart event (and only once, so if (TIMER == null) { /* instantiate */ }, because the onJobStart event seemingly can fire multiple times). And then I handle the completion activity on the onApplicationEnd event (which does only fire once). The reason I didn't use onApplicationStart was because it seemed like by the time I hooked in my listener to the Spark context, this event had already fired.

Corrupted KieBase object?

For performance reasons, we try to re-use the same KieBase object to spawn a new KieSession for each rule invocation against the same ruleset. Everything works well until, after a period of time, the newly created kieSession from the cached kieBase suddenly stops firing the rules that it is supposed to.
But as soon as we get rid of the cached kieBase, re-create a new kieBase and create a new kieSession with it, it starts working again.
Our understanding is that a kieBase object does not hold session-specific data. But the behavior seems to indicate that the cached kieBase is being tampered with over time.
The version we are using is 6.3.0.Final.
Any hints on this would be highly appreciated.
I had this issue, too. After creating 20 sessions in rapid succession, the 21st session would throw a ClassCastException on a cast to BigDecimal in this rule even though runs on the previous 20 sessions did not:
rule IsMaterialChange_TotalMonthlyIncomeAmountUpdate when
Data("TOT_MO_INCM_AMT-ORIGINAL_VALUE"; $originalTotalMonthlyIncomeAmount : value)
Data("TOTALMONTHLYINCOMEAMOUNT"; (BigDecimal) value != (BigDecimal) $originalTotalMonthlyIncomeAmount)
then
insert(new IsMaterialChange("TotalMonthlyIncomeAmountUpdate", true));
end
After digging around for a bit, I found this answer that pointed to the JIT compiler as the cause. It has a default threshold of 20 which lined up exactly with my issue. For some reason the JIT compiler has problems casting this value.
I disabled the JIT compiler like this:
KieContainer kieContainer = kieServices.newKieContainer(kieServices.getRepository().getDefaultReleaseId());
KieBaseConfiguration kieBaseConfiguration = kieServices.newKieBaseConfiguration();
// Disable JIT Compiler to prevent ClassCastException BigDecimal
kieBaseConfiguration.setOption(ConstraintJittingThresholdOption.get(-1));
KieBase kieBase = kieContainer.newKieBase(loaderKey, kieBaseConfiguration);

Mock network delay asynchronously

I have a Scala application that relies on an external RESTful webservice for some part of its functionality. We'd like to do some performance tests on the application, so we stub out the webservice with an internal class that fakes the response.
One thing we would like to keep in order to make the performance test as realistic as possible is the network lag and response time from the remote host. This is between 50 and 500 msec (we measured).
Our first attempt was to simply do a Thread.sleep(random.nextInt(450) + 50), however I don't think that's accurate - we use NIO, which is non-blocking, and Thread.sleep is blocking and locks up the whole thread.
Is there a (relatively easy / short) way to stub a method that contacts an external resource, then returns and calls a callback object when ready? The bit of code we would like to replace with a stub implementation is as follows (using Sonatype's AsyncHttpClient), where we wrap its completion handler object in one of our own that does some processing:
def getActualTravelPlan(trip: Trip, completionHandler: AsyncRequestCompletionHandler) {
val client = clientFactory.getHttpClient
val handler = new TravelPlanCompletionHandler(completionHandler)
// non-blocking call here.
client.prepareGet(buildApiURL(trip)).setRealm(realm).execute(handler)
}
Our current implementation does a Thread.sleep in the method, but that's, like I said, blocking and thus wrong.
Use a ScheduledExecutorService. It allows you to schedule things to run at some time in the future without blocking the calling thread. The Executors class has factory methods for creating one fairly simply.
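A minimal sketch of that idea in plain Java follows. DelayedStub, fetch and the String response are hypothetical stand-ins, not the AsyncHttpClient API: the stub returns immediately and hands a canned response to a callback after a random delay in the configured range.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Stub that answers through a callback after a simulated network delay, without blocking the caller. */
final class DelayedStub implements AutoCloseable {
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final long minDelayMillis;
    private final long maxDelayMillis;

    DelayedStub(long minDelayMillis, long maxDelayMillis) {
        this.minDelayMillis = minDelayMillis;
        this.maxDelayMillis = maxDelayMillis;
    }

    /** Returns immediately; onCompleted receives cannedResponse once the delay has elapsed. */
    void fetch(String cannedResponse, Consumer<String> onCompleted) {
        // Random delay in [minDelayMillis, maxDelayMillis], e.g. 50 to 500 ms.
        long delay = ThreadLocalRandom.current().nextLong(minDelayMillis, maxDelayMillis + 1);
        scheduler.schedule(() -> onCompleted.accept(cannedResponse), delay, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        scheduler.shutdown(); // stop accepting new work
    }
}
```

In the stubbed getActualTravelPlan, the callback would invoke the wrapped completion handler with the fake response. A single scheduler thread runs callbacks one at a time; if the callbacks do real work, a larger pool from Executors.newScheduledThreadPool(n) may model concurrent responses better.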

Accessing the process instance from Rule Tasks in JBPM 5

The short version: How do I get JBPM5 Rule Nodes to use a DRL file which reads and updates process variables?
The long version:
I have a process definition, being run under JBPM5. The start of this process looks something like this:
[Start] ---> [Rule Node] ---> [Gateway (Diverge)] ... etc
The gateway uses constraints on a variable named 'isValid'.
My Rule Node is pointing to the RuleFlowGroup 'validate', which contains only one rule:
rule "Example validation rule"
ruleflow-group "validate"
when
processInstance : WorkflowProcessInstance()
then
processInstance.setVariable("isValid", new Boolean(false));
end
So, by my logic, if this is getting correctly processed then the gateway should always follow the "false" path.
In my Java code, I have something like the following:
KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
kbuilder.add(ResourceFactory.newClassPathResource("myProcess.bpmn"), ResourceType.BPMN2);
kbuilder.add(ResourceFactory.newClassPathResource("myRules.drl"), ResourceType.DRL);
KnowledgeBase kbase = kbuilder.newKnowledgeBase();
StatefulKnowledgeSession ksession = kbase.newStatefulKnowledgeSession();
new Thread(new Runnable() {
    public void run() {
        ksession.fireUntilHalt();
    }
}).start();
// start a new process instance
Map<String, Object> params = new HashMap<String, Object>();
params.put("isValid", true);
ksession.startProcess("test.processesdefinition.myProcess", params);
I can confirm the following:
The drl file is getting loaded into working memory, because when I put syntax errors in the file then I get errors.
If I include a value for "isValid" in the Java params map, the process only ever follows the path specified by Java, apparently ignoring the drools rule.
If I take the "isValid" parameter out of the params map, I get a runtime error.
From this I assume that the final "setVariable" line in the rule is either not executing, or is updating the wrong thing.
I think my issue is related to this statement in the official documentation:
Rule constraints do not have direct access to variables defined inside the process. It is
however possible to refer to the current process instance inside a rule constraint, by adding
the process instance to the Working Memory and matching for the process instance in your
rule constraint. We have added special logic to make sure that a variable processInstance of
type WorkflowProcessInstance will only match to the current process instance and not to other
process instances in the Working Memory. Note that you are however responsible yourself to
insert the process instance into the session and, possibly, to update it, for example, using Java
code or an on-entry or on-exit or explicit action in your process.
However I cannot figure out how to do what is described here. How do I add the process instance into working memory in a way that would make it accessible to this first Rule Node? Rule Nodes do not seem to support on-entry behaviors, and I can't add it to the Java code because the process could very easily complete execution of the rules node before the working memory has been updated to include the process.
As you mentioned, there are several options for inserting the process instance into the working memory:
- inserting it after calling startProcess()
- using an action script to insert it (using "insert(kcontext.getProcessInstance())")
If calling startProcess() might already have gone past the rule task (which is probably the case in your example), and you don't have another node in front of your rule task where you could just use an on-entry/exit script to do this (so that it is hidden), I would recommend using an explicit script task before your rule task to do this.
Kris