Using Drools Expert with dynamic decision tables

Here's what I'd like to do.
I'd like to put "rules" in a database table. This is sort of like the Drools XLS decision table format, except that all the rules would be rows in a table. That way I can modify the rules easily. I need to put this in a table and not an XLS file because my rules could change frequently. Is this possible with Drools? Can I build a knowledge base with rules retrieved from a DB (instead of a DRL or an XLS file), and every time the rules change, can I rebuild the knowledge base from scratch (or maybe just parts of it, essentially updating only those rules that have changed)?

It depends on what kind of rules you have in mind. A database-backed approach makes sense if you have lots of rules that share the same structure and only vary according to certain 'parameters'. In that case you can write a single generic rule and use the database to store all of the combinations that apply. For example, suppose you have rules to calculate shipping rates per country for an order, e.g.
rule "Shipping rates to France"
when
$order : Order(country == 'fr')
then
$order.setShippingRate(10.0);
update(order);
end
// Similar rules for other countries…
You could replace these rules with data from your database, where each CountryShippingRate row specifies the rate for one country. Then you insert all of the CountryShippingRate rows as fact objects into the rule session, along with a single rule like:
rule "Shipping rates"
when
$order : Order($country : country)
CountryShippingRate($rate : rate, country == $country)
then
$order.setShippingRate($rate);
update(order);
end
In practice, it turns out that lots of decision-table-style rules can be rewritten this way. This also addresses the concern about rebuilding: when a rate changes, you update a row and refresh the corresponding fact in the session, rather than recompiling the knowledge base.
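To make the shape of that refactoring concrete outside of Drools, here is a minimal Python analogue (the names and rates are illustrative, not from the original post): the per-country rules collapse into rows of data plus one generic matching function.

from dataclasses import dataclass

@dataclass
class CountryShippingRate:
    country: str
    rate: float

@dataclass
class Order:
    country: str
    shipping_rate: float = 0.0

# In the Drools version these rows would be loaded from the database
# and inserted into the session as facts; the values are illustrative.
rates = [CountryShippingRate("fr", 10.0), CountryShippingRate("de", 8.5)]

def apply_shipping_rate(order: Order) -> None:
    # The single generic "rule": match the order's country to its rate row.
    for row in rates:
        if row.country == order.country:
            order.shipping_rate = row.rate
            return

order = Order(country="fr")
apply_shipping_rate(order)
assert order.shipping_rate == 10.0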

Related

Active Record efficient querying on multiple different tables

Let me give a summary of what I've been attempting to do and the efficiency issues I've been running into:
Essentially, I want my users to be able to select parameters that filter data from my database, and then I want to pass the data that satisfies those filters from the controller.
However, these filters query data from multiple different tables (about 5-6 of them), some of which are quite large (as in 100k+ rows). These tables are all related to what I want to show, e.g. here is a bond that meets such-and-such criteria, which is issued by such-and-such issuer, which must meet these criteria, and so on.
In the end I only really need about 100 rows after querying based on the parameters given by the user, but it feels like I need to look at everything in every table because I don't know how strict the filters will be beforehand. E.g. with a starting universe of 100k sets of data, passing filters f1, f2 of table 1 might leave 90k; but after passing through filter f3 of table 2, filters f4, f5, f6 of table 3, and so on, we might end up with 100 or fewer sets of data, because the last filters checked might be quite strict.
How can I go about querying through these multiple different tables efficiently?
Doing a join between them seems like it'd yield a time complexity of |T_1| · |T_2| · |T_3| · |T_4| · |T_5| · |T_6|, where |T_i| is the size of table T_i.
On the other hand, just looking through the other tables based on the IDs that passed the previous filter (as in, IDs 5, 7, 8 pass the filters on T_1; which of those IDs then pass the filters on T_2; then which of those pass the filters on T_3; and so on) looks like it might(?) have a time complexity of |T_1| + |T_2| + ... + |T_6|.
I'm relatively new to Ruby on Rails, so I'm not entirely sure of all the tools at my disposal that could help with optimizing this, but at the same time I'm not entirely sure how to best approach this algorithmically.
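For illustration: if all of the selected filters are attached to a single joined query, the database applies them together and prunes with indexes, so the cost is nowhere near the |T_1| · … · |T_6| product of a naive cross join. Below is a minimal sketch of building such a query incrementally, written in Python with SQLAlchemy for concreteness (the question is about ActiveRecord, but the shape is ORM-agnostic); the Bond/Issuer models, columns, and filter names are hypothetical stand-ins for the tables described above.

from sqlalchemy import create_engine, select
from sqlalchemy.orm import Session
from models import Bond, Issuer  # hypothetical mapped classes; Bond.issuer is a relationship

engine = create_engine("postgresql://localhost/bonds")  # placeholder URL

def filtered_bonds(session, filters, limit=100):
    # Build one joined statement and attach only the filters the user
    # actually selected; the planner then prunes with indexes instead of
    # the application scanning each table's survivors in turn.
    stmt = select(Bond).join(Bond.issuer)
    if "currency" in filters:
        stmt = stmt.where(Bond.currency == filters["currency"])
    if "min_rating" in filters:
        stmt = stmt.where(Issuer.rating >= filters["min_rating"])
    return session.scalars(stmt.limit(limit)).all()

with Session(engine) as session:
    bonds = filtered_bonds(session, {"currency": "USD"})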

Modifying queries on the fly depending on selected filters

In an existing codebase some tables are now horizontally sharded within the same database but in different namespaces.
E.g. let's say there previously was a large table users which is now sharded by the country field so there are now the following tables: us.users, ca.users, es.users etc.
Since every single query to these tables already contains the country filter, I was thinking of the following minimalistic adjustment, so that there's no need to subclass the original model for every country manually or dynamically:
class SessionWithShardedTableSupport(Session):
    """ Use sharded tables on the fly """

    def connection(self, mapper=None, clause=None, bind=None, close_with_result=None, **kw):
        if mapper and mapper.local_table.name == 'users':
            mapper.local_table.schema = '???'  # `us` or `ca` or `es`
        return super().connection(mapper, clause, bind, close_with_result, **kw)
The solution would work fine if there were a way to get the country filter from the query from within the session, but there doesn't seem to be one (at least, inspecting both the mapper and clause parameters didn't reveal it).
1) Is there a way to get the where clause / the filters from within the session?
2) Is there maybe a better way to adjust the table name / table namespace on the fly with minimalistic changes to the existing code?
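Regarding question 2: when the country is known before the session is created, SQLAlchemy's schema_translate_map execution option does this kind of schema rewriting without subclassing Session. A minimal sketch, assuming the users table is mapped with no explicit schema (the User model and URL are placeholders):

from sqlalchemy import create_engine
from sqlalchemy.orm import Session

engine = create_engine("postgresql://localhost/app")  # placeholder URL

def session_for_country(country: str) -> Session:
    # schema_translate_map rewrites schema-less table references at
    # execution time, so `users` is emitted as e.g. `us.users`.
    shard_engine = engine.execution_options(schema_translate_map={None: country})
    return Session(bind=shard_engine)

session = session_for_country("us")
# session.query(User) now reads from us.users

This doesn't answer question 1, though: the shard has to be chosen up front, since the session never sees the WHERE clause.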

How to automate the execution process of data quality rules?

One of our clients has a requirement to build/develop data quality rules using HiveQL.
E.g. replace NULL values, change the date format to YYYY-MM-DD, standardize amount column values between US & EU formats, etc.
Problem Statement:
I have a set of data quality rules in one Hive table (dq_rules), and I want to execute each rule one by one and store the errors (data issues such as a null column, or a column with an incorrect date format) in another Hive table (dq_logging) for reporting/logging purposes.
Please suggest a solution, keeping one thing in mind: I want to make this solution generic and executable for any Hive table/columns (i.e., it should be parameterized).
Restriction: I cannot use existing data quality tools. I need to accomplish this using Hive only (the restriction is given by the client).
Schema for Tables:
dq_rules => Validation Rule ID, Rule Category, DQ Dimension, Rule Description, Date Added, Date Retired
dq_logging => Error_ID, Source_Name, Erroneous_Source_Fields, Source_File_Record, Validation Rule ID
A solution based on a shell/Python script would also work for me. I just need it to be an end-to-end process.
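Since a shell/Python script is acceptable, one possible shape is a small driver that reads the active rules and writes each rule's failures into dq_logging. This is only a sketch: it assumes PyHive is available, snake_cases the column names, and assumes each rule row stores a runnable SELECT (a rule_sql column, which is not in the stated schema) returning the offending records.

from pyhive import hive  # assumes a reachable HiveServer2; beeline via subprocess also works

conn = hive.connect(host="hive-server", port=10000)  # placeholder connection details
cur = conn.cursor()

# Assumed convention: each active rule stores an executable SELECT that
# returns (source_name, erroneous_source_fields, source_file_record).
cur.execute("SELECT validation_rule_id, rule_sql FROM dq_rules WHERE date_retired IS NULL")

for rule_id, rule_sql in cur.fetchall():
    # Wrap the rule's SELECT so every failing row lands in the log table,
    # tagged with the rule that caught it. uuid() needs Hive 2.3+; on older
    # versions use reflect('java.util.UUID', 'randomUUID') instead.
    cur.execute(
        "INSERT INTO TABLE dq_logging "
        "SELECT uuid(), t.*, '{0}' FROM ({1}) t".format(rule_id, rule_sql)
    )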

Creating decision tables in Red Hat Decision Central not reflecting complex types / structures

I have a DMN decision created in Decision Manager 7.3. I have a few data types created, all of which are "structures" (i.e. complex types) with nested fields. I have created a decision table in which the condition column is bound to one of these structures (Customer) and the output column is bound to a Result structure.
However, I would expect the column headers to reflect the structure of the objects as per the example here (step 9 onwards): https://access.redhat.com/documentation/en-us/red_hat_decision_manager/7.3/html-single/designing_a_decision_service_using_dmn_models/index#dmn-data-types-defining-proc_dmn-models
In the documentation example, the Loan_Qualification type has nested fields and these are shown as sub-columns in the table header.
My data types are defined as follows:
I have a Customer input node and a decision node defined as follows:
Yet in my decision table, the columns map to the top level object only as follows:
So any ideas as to what I might be missing? Thanks in advance.
UPDATE
I have used the answer given below by @karreiro, which works for the outcome/action column, but inserting an Input Clause left or right adds a new top-level column, not a sub-column, which then looks like the following:
Is this something you expect the decision table editor to be able to do as well?
Your expectations are correct.
The DMN editor aims to support the auto-creation of fields for Structure Data Types (for output clauses https://issues.jboss.org/browse/DROOLS-3685, and input clauses https://issues.jboss.org/browse/DROOLS-4491).
However, for the moment, users need to create these fields manually:
See how to create them here :-)

Using Views and CTE on DB2/AS400

A generic question...
I have an employee table (EMPMAST) which has new as well as old employee data. There is a flag called Current? which is 'Y' if the person is a current employee.
Now I have to select only the current records in my SQLRPGLE program, along with some other criteria (for example, EMPNAME = 'SAM'). What is the best way to deal with this, in terms of performance and system usage?
1. Create a view over EMPMAST with Current? = 'Y', then use it in the program with the other conditions.
2. Use a CTE (WITH ... AS) in the program which has the condition Current? = 'Y', and use that.
3. Use the table directly, without a CTE or view.
4. Any other option?
Options 1, 2 and 3 would all perform the same. They would likely all have the same optimized query and access plan.
A CTE and a view are two different things. A view is appropriate for a query that is going to be used in multiple places; a CTE is only available in the query in which it is defined. I usually don't use a CTE except to replace a complex subquery. In your case the condition is simple enough to be contained in the WHERE clause, so I don't see the need to introduce additional complexity.
Some folks will tell you not to query the table directly in the program, but to always use a view. That way you add an extra layer of insulation between the program and the database, and you can still define record structures with ExtName without having to worry about changes to the table unless they affect the view itself. In this case you would likely have a dedicated view for each program that uses the table.
I tend to use a hybrid of these techniques. I query tables, CTEs, or views depending on the situation, and define my record structures explicitly in the program. I prefer to just query the table, but if I have some complex query logic that is unique to the program, I will use a CTE. I do have a few views, but these are limited to queries that happen in multiple programs where I want to ensure the same logic is applied consistently.