Drools 5.5.0: working with concurrency

How can Drools be used in an environment where multiple users are working on, or accessing, rules for the same operation?
Consider a DRL that contains 5 rules which are being accessed by multiple users. Obviously the rules will be stored in a knowledge session. Currently, each time a request arrives, the system does I/O, loads the decision tables and DRLs, and then creates a new knowledge session.
We are going to have more than 1,500 rules, managed in 150+ decision tables and 150+ DRLs.
A sample code lead would be appreciated.

The same knowledge session can be used by multiple requests, since the rules remain constant and are independent of the requests. I create a KIE session once, when the application loads, independent of the requests made.
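For illustration, a minimal sketch of that approach against the Drools 5.x knowledge-api (the resource name, fact type, and class name are placeholders, not from the original post):

import org.drools.KnowledgeBase;
import org.drools.KnowledgeBaseFactory;
import org.drools.builder.KnowledgeBuilder;
import org.drools.builder.KnowledgeBuilderFactory;
import org.drools.builder.ResourceType;
import org.drools.io.ResourceFactory;
import org.drools.runtime.StatelessKnowledgeSession;

public class RuleService {

    // Built once at application startup; a KnowledgeBase is safe to share across threads.
    private final KnowledgeBase kbase;

    public RuleService() {
        KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
        // "rules.drl" is an illustrative classpath resource.
        kbuilder.add(ResourceFactory.newClassPathResource("rules.drl"), ResourceType.DRL);
        if (kbuilder.hasErrors()) {
            throw new IllegalStateException(kbuilder.getErrors().toString());
        }
        kbase = KnowledgeBaseFactory.newKnowledgeBase();
        kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
    }

    // Each request gets its own lightweight session over the shared KnowledgeBase,
    // so no per-request I/O or rule compilation is needed.
    public void evaluate(Object fact) {
        StatelessKnowledgeSession session = kbase.newStatelessKnowledgeSession();
        session.execute(fact);
    }
}

This avoids the per-request rebuild entirely: the expensive compilation happens once, and sessions are cheap to create per request.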

Drools: all rules are getting loaded

I am using Drools in my project; assume it has 100 rules. I have two process flows (typically start node -> rule flow task -> end node). The first process flow's rule flow task is assigned a ruleflow-group containing 50 of the rules, and the second process flow's rule flow task is assigned a ruleflow-group containing the other 50. The groups don't overlap.
Now, when I use a KieSession and call startProcess for the first process flow, I see that it loads all 100 rules instead of only 50, and I get compilation and runtime errors. Please help me understand why rules from other ruleflow-groups are being evaluated in a process flow that has nothing to do with their ruleflow-group. I can see that all of their 'when' conditions are being evaluated.
The "unit of work" in Drools is the KieBase and not the rule-flow-group. All the rules in your KieBase will be present in your KieSessions and will be evaluated when required.
Hope it helps,

How to block a specific IP address with mod_security after a specific number of requests in one minute

Well, normally I'm not the person who is supposed to do this; I'm a PHP developer with only general knowledge of Apache and security administration, but in this emergency I have to do it now.
I'm in a situation where I need to write a ModSecurity rule that:
- blocks a specific IP address from accessing our website,
- for 5 minutes,
- if it tries to call more than 10 links in less than 10 seconds.
Can I achieve that by writing a mod_security rule?
ModSecurity can do this, but I wouldn't suggest it.
Have a look at the DOS rules in the OWASP CRS: https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/master/experimental_rules/modsecurity_crs_11_dos_protection.conf. Note that these depend on settings in the main CRS setup file: https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/master/modsecurity_crs_10_setup.conf.example
However, ModSecurity collections are not the most stable, especially at high volume: you run into problems with multiple threads accessing the collection file. You might also find that you have to delete the collection file regularly (e.g. every 24 hours) to prevent it growing continually.
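If you do want to try it, the building blocks are an IP-keyed collection, a request counter with a short expiry, and a block flag with a longer expiry. A rough, untested sketch in ModSecurity 2.x syntax (rule IDs, thresholds, and variable names are illustrative; persistent collections also require SecDataDir to be configured):

# Initialise a persistent collection keyed by client IP
SecAction "phase:1,id:100001,nolog,pass,initcol:ip=%{REMOTE_ADDR}"
# Deny requests from IPs currently flagged as blocked
SecRule IP:blocked "@eq 1" "phase:1,id:100002,deny,status:403,log,msg:'IP temporarily blocked'"
# Count every request; the counter expires after 10 seconds
SecRule REQUEST_URI ".*" "phase:1,id:100003,nolog,pass,setvar:ip.requests=+1,expirevar:ip.requests=10"
# More than 10 requests inside the window: set a block flag for 300 seconds (5 minutes)
SecRule IP:requests "@gt 10" "phase:1,id:100004,nolog,pass,setvar:ip.blocked=1,expirevar:ip.blocked=300,setvar:!ip.requests"

The CRS DOS rules linked above implement a more robust version of the same idea, with burst counting and separate thresholds.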

Preventing update loops for multiple databases using CDC

We have a number of legacy systems that we're unable to make changes to - however, we want to start taking data changes from these systems and applying them automatically to other systems.
We're thinking of some form of service bus (no specific tech picked yet) sitting in the middle, and a set of bus adapters (one per legacy application) to translate between database-specific concepts and general update messages.
One area I've been looking at is using Change Data Capture (CDC) to monitor update activity in the legacy databases and use that information to construct appropriate messages. However, I have a concern: how best could I, as a consumer of CDC information, distinguish changes applied by the application from changes applied by the bus adapter on receipt of messages? Otherwise, the first update distributed by the bus would be re-distributed by every receiver when they apply that change to their own system.
If I were implementing "poor man's" CDC, i.e. triggers, then those triggers would execute within the context/transaction/connection of the original DML statements, so I could either design them to ignore one particular user (the user applying incoming updates from the bus), or set and detect a session property to similarly ignore certain updates.
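For illustration, that session-property marking might look like this in SQL Server (a sketch; the table name and marker value are hypothetical):

-- In the bus adapter's connection, before applying incoming updates:
SET CONTEXT_INFO 0x01;

-- In the change-publishing trigger on each table:
CREATE TRIGGER trg_Employee_Changes ON dbo.Employee
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    -- Skip changes that the bus adapter itself applied, breaking the loop
    IF SUBSTRING(CONTEXT_INFO(), 1, 1) = 0x01 RETURN;
    -- ... otherwise record/emit the change for the bus ...
END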
Any ideas?
If I understand your question correctly, you're trying to define a message routing structure that works with a design you've already selected (an enterprise service bus) and a message implementation you can use to flow data off your legacy systems, forward-porting changes to your newer systems.
The difficulty is that you're trying to apply changes in such a way that they don't themselves generate a CDC message from the clients receiving the data from your legacy systems. In other words, you want your newer systems to consume the data without propagating messages back onto the bus, which would create unnecessary crosstalk that could grow exponentially and overload your infrastructure.
The key is how MSSQL's CDC feature reconciles changes as they propagate through the network. Specifically, note this caveat:
All the changes are logged in terms of LSN or Log Sequence Number. SQL distinctly identifies each DML operation via a Log Sequence Number. Any committed modifications on any tables are recorded in the transaction log of the database with a specific LSN provided by SQL Server. The __$operation column values are: 1 = delete, 2 = insert, 3 = update (values before update), 4 = update (values after update).
cdc.fn_cdc_get_net_changes_dbo_Employee gives us all the records net changed falling between the LSNs we provide to the function. We have three records returned by the net_change function; there was a delete, an insert, and two updates, but on the same record. In the case of the updated record, it simply shows the net changed value after both updates are complete.
For getting all the changes, execute cdc.fn_cdc_get_all_changes_dbo_Employee; there are options to pass either 'ALL' or 'ALL UPDATE OLD'. The 'ALL' option provides all the changes, but for updates it provides only the after-update values. Hence we find two records for updates: one record showing the first update, when Jason was updated to Nichole, and one record when Nichole was updated to EMMA.
While this documentation is somewhat terse and difficult to understand, it appears that changes are logged and reconciled in LSN order. Competing changes should be discarded by this system, allowing your consistency model to work effectively.
Note also:
CDC is disabled by default and must be enabled at the database level, followed by enabling it on the table.
Option B then becomes obvious: institute CDC on your legacy systems, then use your service bus to translate these changes into updates that aren't bound to CDC (using, for example, raw transactional update statements). This should allow for the one-way flow of data that you seek in the design of your system.
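For reference, enabling CDC on a legacy source and polling it from a bus adapter might look like this (a sketch; dbo.Employee is just the example table used in the quoted documentation):

-- Enable CDC at the database level, then per table
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Employee',
    @role_name     = NULL;

-- The bus adapter periodically reads net changes between two LSNs
DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Employee');
DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();
SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_Employee(@from_lsn, @to_lsn, N'all');

The adapter would publish these rows to the bus, and receivers would apply them as plain INSERT/UPDATE/DELETE statements on databases that do not publish to the bus, giving the one-way flow described above.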
For additional methods of reconciling changes, consider the concepts raised by this Wikipedia article on "eventual consistency". Best of luck with your internal database messaging system.

Drools: How to update only specific rules?

We use Drools with an interface where users can update / edit rules. Those rules are then stored (and versioned) in a database. Afterwards the rules are fetched from the database again and added one by one in the following way:
for (Rule rule : rules) { // "rules" is the list fetched from the database
    kbuilder.add(ResourceFactory.newByteArrayResource(rule.getRuleContent().getBytes()), ResourceType.DRL);
    if (kbuilder.hasErrors()) {
        throw new IllegalStateException(kbuilder.getErrors().toString());
    }
}
kbase = KnowledgeBaseFactory.newKnowledgeBase();
kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());
We have several hundred rules and a rule-checking throughput of also several hundred checks/second (on 2 nodes). As the number of rules grows, this KnowledgeBase rebuild takes longer and longer (compiling all those rules), and during that time no rule can be checked. So from the user's point of view, the system stands still.
There seems to be no way to refresh a rule selectively; is this correct? If so, what is the best way to handle such a situation? The first idea that comes to mind is using two KnowledgeBases in parallel...
The KnowledgeBase API has methods to remove rules/packages and also to add packages. So to accomplish this, you could remove the rule you want to update and then insert the updated version into the knowledge base. If you have any stateful sessions whose dispose() method has not been called, then the changes to the KnowledgeBase should get pushed out to them as well.
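A sketch of that remove-and-re-add cycle, assuming the Drools 5 knowledge-api (the package and rule names are placeholders; removeRule takes the package name and the rule name as declared in the DRL):

// Compile only the updated rule, not the whole rule base
KnowledgeBuilder kbuilder = KnowledgeBuilderFactory.newKnowledgeBuilder();
kbuilder.add(ResourceFactory.newByteArrayResource(updatedRule.getRuleContent().getBytes()), ResourceType.DRL);
if (kbuilder.hasErrors()) {
    throw new IllegalStateException(kbuilder.getErrors().toString());
}
// Swap the old version out of the live KnowledgeBase, then add the new package
kbase.removeRule("com.example.rules", "my updated rule");
kbase.addKnowledgePackages(kbuilder.getKnowledgePackages());

Since the existing KnowledgeBase stays live during the small recompilation, rule checking does not have to stand still while a single rule is refreshed.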

WF performance with 20,000 new persisted workflow instances each month

Windows Workflow Foundation is said to be slow when persisting WF instances.
I'm planning a project whose business layer will be based on WF exposed as WCF services. The project will have 20,000 new workflow instances created each month, and each instance could take up to 2 months to finish.
I was led to believe that, given WF's slowness when persisting, this load would make the project infeasible for performance reasons.
I have the following questions:
Is this true? Will my performance be terrible under that load (given WF persistence speed limitations)?
How can I solve the problem?
We currently have two possible solutions:
1. Each new business process request (e.g. "give me a new driver's license") will be a new WF instance, and the number of persistence operations will be limited by forwarding all status-request operations to saved state values in a separate database.
2. Have only a small number of workflow instances alive at any given time, with no persistence whatsoever (except in case of system crashes etc.), by breaking each workflow step into a separate workflow, where that workflow handles every business process request instance in the system currently at that step (e.g. I'm submitting my driver's license request form, which is step one; we have 100 cases of that, and my step-one workflow will handle every case simultaneously).
I'm very interested in a solution to this problem. If you want to discuss it, please feel free to mail me at nstjelja#gmail.com
The number of hydrated, executing workflows will be determined by environmental factors: memory, server throughput, etc. Persistence issues really only come into play if you are loading and unloading workflows all the time, i.e. in (near) real time; in that case workflow may not be the best solution.
In my current project we also use WF with persistence. We don't have quite the same volume (perhaps ~2000 instances/month), and they usually don't take as long to complete (they are normally done within 5 minutes, in some cases a few days). We did decide to split up the main workflow into two parts, at the point where the normal waiting state would be. I can't say that I have noticed any performance difference in the system due to this, but it did simplify things, since our system sometimes had problems matching incoming signals to the correct workflow instance (that was an issue in our code, not in WF).
I think that if I were to start a new project based on WF I would rather go for smaller workflows that are invoked in sequence, than to have big workflows handling the full process.
To be honest I am still investigating the performance characteristics of workflow foundation.
However if it helps, I have heard the WF team have made many performance improvements with the new release of WF 4.
Here are a couple of links that might help (if you haven't seen them already):
A Developer's Introduction to Windows Workflow Foundation (WF) in .NET 4 (discusses performance improvements)
Performance Characteristics of Windows Workflow Foundation (applies to WF 3.0)
WF on 3.5 had a performance problem. WF4 does not: 20,000 WF instances per month is nothing. If you were talking per minute, I'd be worried.