Drools RETE algorithm confusion

I am having an issue understanding the RETE algorithm's beta nodes: what exactly are JoinNode and NotNode?
The documentation says:
There are two two-input nodes, JoinNode and NotNode, and both are
types of BetaNodes. BetaNodes are used to compare 2 objects, and their
fields, to each other. The objects may be the same or different types.
By convention, we refer to the two inputs as left and right. The left
input for a BetaNode is generally a list of objects; in Drools this is
a Tuple. The right input is a single object. Two Nodes can be used to
implement 'exists' checks. BetaNodes also have memory. The left input
is called the Beta Memory and remembers all incoming tuples. The right
input is called the Alpha Memory and remembers all incoming objects.
I understand that Alpha Nodes cover the various literal conditions of DRL rules, but the documentation above for BetaNodes is confusing me a bit.
Say the DRL condition for the above diagram is:
$person : Person( favouriteCheese == $cheddar )
Questions: 1) What exactly are the left and right inputs to the two-input beta nodes, as explained in the documentation above? I believe it's referring to facts and rules, where I believe the tuples would be facts?
2) Would a NotNode basically be a DRL condition matching a literal condition with not?
Updated question on 6 Sep 2017:
3) I believe the above diagram represents a JoinNode; how would a NotNode be represented if the above workflow were altered to suit a NotNode?

The condition corresponding to the diagram would be:
Cheese( $name: name == "Cheddar" )
Person( favouriteCheese == $name )
Once there is a match, a new tuple consisting of the matching Cheese and Person is composed; this tuple can then act as the left input for further matches if there is a third pattern in the condition.
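To make the left/right terminology concrete, here is a rough Scala sketch of what the join node's two memories do for the condition above (purely an illustration with assumed class shapes; this is not how Drools actually implements it):
case class Cheese(name: String)
case class Person(personName: String, favouriteCheese: String)

// Left input (beta memory): the tuples matched so far by earlier patterns.
val leftTuples: List[Cheese] = List(Cheese("Cheddar"))

// Right input (alpha memory): single facts propagated from the alpha network.
val rightFacts: List[Person] =
  List(Person("Alice", "Cheddar"), Person("Bob", "Stilton"))

// The join node pairs each left tuple with each right fact that satisfies
// the constraint (favouriteCheese == $name) and emits an extended tuple.
val joined: List[(Cheese, Person)] = for {
  cheese <- leftTuples
  person <- rightFacts
  if person.favouriteCheese == cheese.name
} yield (cheese, person)
// joined == List((Cheese("Cheddar"), Person("Alice", "Cheddar")))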
A not-node is one that asserts the non-existence of a matching fact; in DRL this corresponds to a pattern wrapped in the not conditional element (e.g. not Cheese( name == "Cheddar" )). It would fire only once.
You might find a much better description of "rete" on the web.

Related

Does my imposed condition have a name in category theory/set theory?

There's a requirement I'm placing on the signature of a total & referentially transparent function:
def add[T](a: T)(b: T): T
// requirement: type T under e.g. addition must always yield another value of type T
// and is not allowed to throw runtime arithmetic exceptions or such.
This requirement can easily be fulfilled for many types such as Int, String, Nat (natural numbers); yet it is also easily violated by types such as NonZeroInt, as the addition of two non-zero integers can in fact be zero.
My question: is there a coined term for this condition? Monoid comes to mind, but it's obvious I'm not imposing all the rules for monoids here.
If I understand what you're asking, then the term you are looking for is "closure" for a set given an operation. Refer to the mathematical definition in Wikipedia here. In short:
A set has closure under an operation if performance of that operation
on members of the set always produces a member of the same set
However, "closure" seems to have a different meaning in computer science. See the link here. And my searches pertaining to closure in the context of Scala, even when also putting it in the context of mathematics or set-theory, doesn't lead to any helpful results. Perhaps this is why you've had trouble finding the coined term.
Without any laws required it is just an operation on a set T, nothing more. If add is associative you can call it a Semigroup.
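For illustration, a minimal Scala sketch (the trait and object names are made up, not from any particular library):
// An operation that is closed over T: applying it to two values of T
// always yields another T and, being total, never throws for valid inputs.
trait ClosedOp[T] {
  def add(a: T)(b: T): T
}

// Int addition satisfies closure; it also happens to be associative,
// which is the extra law that would make it a Semigroup.
object IntAddition extends ClosedOp[Int] {
  def add(a: Int)(b: Int): Int = a + b
}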

How to check if the value is a number in Prolog manually?

How to check if the given value is a number in Prolog without using built-in predicates like number?
Let's say I have a list [a, 1, 2, 3]. I need a way to check if every element within this list is a number. The only part of the problem that bothers me is how to do the check itself without using the number predicate.
The reason why I'm trying to figure this out is that I've got a college assignment where it's specifically said not to use any of the built-in predicates.
You need some built-in predicate to solve this problem - unless you enumerate all numbers explicitly (which is not practical since there are infinitely many of them).
1
The most straightforward would be:
maplist(number, L).
Or, recursively
allnumbers([]).
allnumbers([N|Ns]) :-
    number(N),
    allnumbers(Ns).
2
In a comment you say that "the value is given as an atom". That could mean that you get either [a, '1', '2'] or '[a, 1, 2]'. I assume the first. Here again, you need a built-in predicate to analyze the atom's name. Relying on ISO Prolog's errors, we write:
numberatom(Atom) :-
    atom_chars(Atom, Chs),
    catch(number_chars(_, Chs), error(syntax_error(_),_), false).
Use numberatom/1 in place of number/1, so write a recursive rule or use maplist/2.
3
You might want to write a grammar instead of the catch... goal. There have been many such definitions recently; you may look at this question.
4
If the entire "value" is given as an atom, you will need again atom_chars/2or you might want some implementation specific solution like atom_to_term/3 and then apply one of the solutions above.

Why doesn't scala's parallel sequences have a contains method?

Why does
List.range(0,100).contains(2)
work, while
List.range(0,100).par.contains(2)
does not?
Is this planned for the future?
The non-teleological answer is that it's because contains is defined in SeqLike but not in ParSeqLike.
If that doesn't satisfy your curiosity, you can find that SeqLike's contains is defined thus:
def contains(elem: Any): Boolean = exists (_ == elem)
So for your example you can write
List.range(0,100).par.exists(_ == 2)
ParSeqLike is missing a few other methods as well, some of which would be hard to implement efficiently (e.g. indexOfSlice) and some for less obvious reasons (e.g. combinations - maybe because that's only useful on small datasets). But if you have a parallel collection you can also use .seq to get back to the linear version and get your methods back:
List.range(0,100).par.seq.contains(2)
As for why the library designers left it out... I'm totally guessing, but maybe they wanted to reduce the number of methods for simplicity's sake, and it's nearly as easy to use exists.
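If you really want a contains-style call on the parallel collection, one option is to add it yourself via exists with the "enrich my library" pattern. This is purely a sketch, not something the standard library provides, and it assumes a Scala version where scala.collection.parallel is available (pre-2.13, or the scala-parallel-collections module):
import scala.collection.parallel.ParSeq

object ParContainsDemo {
  // Hypothetical helper: a contains-like method for parallel sequences,
  // implemented in terms of the existing exists.
  implicit class ParOps[A](xs: ParSeq[A]) {
    def containsElem(elem: Any): Boolean = xs.exists(_ == elem)
  }

  def main(args: Array[String]): Unit =
    println(List.range(0, 100).par.containsElem(2)) // prints true
}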
This also raises the question, why is contains defined on SeqLike rather than on the granddaddy of all collections, GenTraversableOnce, where you find exists? A possible reason is that contains for Map is semantically a different method to that on Set and Seq. A Map[A,B] is a Traversable[(A,B)], so if contains were defined for Traversable, contains would need to take a tuple (A,B) argument; however Map's contains takes just an A argument. Given this, I think contains should be defined in GenSeqLike - maybe this is an oversight that will be corrected.
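To see that clash concretely:
val m: Map[String, Int] = Map("a" -> 1)

m.contains("a")           // Map's contains takes a key, i.e. just the String
m.exists(_ == ("a" -> 1)) // viewed as a collection of pairs, a predicate sees (String, Int)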
(I thought at first that maybe parallel sequences don't have contains because a search that stops once it finds its target is a lot less efficient on parallel collections than on the linear version (the various threads do a lot of unnecessary work after the value is found: see this question), but that can't be right because exists is there.)

Documenting Scala functional chains [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
Scala (and functional programming in general) advocates a style of programming where you produce functional "chains" of the form
collection.operation1(...).operation2(...)...
where the operations are various combinations of map, filter, etc.
Where the equivalent Java code might require 50 lines, the Scala code can be done in 1 or 2 lines. The functional chain can change an input collection to something completely different.
The disadvantage of the Scala code is that 10 minutes later (never mind 6 months later), I can't figure out what I was thinking, because the notation is so compact and lacks type information (the types are inferred).
How do you document this? Do you put a large block comment before the chain, changing an elegant 1-line solution into a bulky 40-line solution consisting of 39 lines of comment? Do you intersperse your comments like this?
collection.
  // Select the items that meet condition X
  filter(predicate_function).
  // Change these items from A's to B's
  map(transformation_function).
  // etc.
Something else? No documentation? (Leave them guessing. They'll never "downsize" you then, because no one else can maintain the code. :-))
If you find yourself writing comments at that detail level, you're just repeating what the code says.
For long functional chains, define new functions to replace parts of the chain. Give these meaningful names. Then you might be able to avoid comments. The names of these functions themselves should explain what they do.
The best comments are the ones that explain why the code does something. Well-written code should make the "how" obvious from the code itself.
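For instance, a minimal sketch (the Product and Customer shapes here are assumed, borrowed from the examples further down):
case class Customer(isPremium: Boolean, age: Int)
case class Product(customers: Seq[Customer])

// Named steps instead of an anonymous chain annotated with comments.
def customersOf(products: Seq[Product]): Seq[Customer] =
  products.flatMap(_.customers)

def isEligible(c: Customer): Boolean =
  c.isPremium && c.age < 20

def eligibleCustomers(products: Seq[Product]): Seq[Customer] =
  customersOf(products).filter(isEligible)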
I don't write that code to begin with (unless it's a script for one-time use or playing around in the REPL).
If I can explain what the code does in one comment and it reads okay, then I keep it as a one-liner:
// Find all real-valued square roots and group them in integer bins
ds.filter(_ >= 0).map(math.sqrt).groupBy(_.toInt).map(_._2)
If I can't understand this by reading carefully through the chain of commands, then I should break it up more into functionally distinct units. For example, if I expected someone to not realize that the square root of a negative number is not real-valued, I would say:
// Only non-negative numbers have a real-valued square root
val nonneg = ds.filter(_ >= 0)
// Find square roots and group them in integer bins
nonneg.map(math.sqrt).groupBy(_.toInt).map(_._2)
In particular, if someone doesn't know the Scala collections library well and doesn't have the patience to spend five to ten minutes understanding one line of code, then either they shouldn't be working on my code (nor on anything else that accomplishes something nontrivial that they don't understand and don't have the patience to understand), or I should know in advance that I'm providing, e.g., a language and mathematics tutorial in addition to writing working code: by writing a paragraph explaining how the following line works, or by breaking it out command by command, or by including comments at the start of each anonymous function explaining what is going on (as appropriate).
Anyway, if you can't understand what it does, you probably need some intermediate values. They are very helpful for mental-resetting ("I can't see how to get from A to C!...but...okay, I can understand A to B. And I can understand B to C.")
If your chained operations are all monadic transforms: map, flatMap, filter, then it's often much, much clearer to rewrite the logic as a for-comprehension.
coll.filter(predicate).map(transform)
could become
for(elem <- coll if predicate) yield transform(elem)
It's even easier to show off the power of the technique if you have a longer sequence of operations, such as with Kassen's example:
def eligibleCustomers(products: Seq[Product]) = for {
  product  <- products
  customer <- product.customers
  if customer.isPremium   // paying
  if customer.age < 20    // eligible
} yield customer
If you don't want to split it into multiple methods as hammar suggested, you can split the line and give the intermediate values names (and optionally types).
def eligibleCustomers: List[Customer] = {
  val customers = products.flatMap(_.customers)
  val paying    = customers.filter(_.isPremium)
  val eligible  = paying.filter(_.age < 20)
  eligible
}
Line length is a somewhat natural indicator of when your chain is getting too long. :)
Of course, it will depend upon how trivial the chain is:
customerdata.filter (_.age < 40).filter (_.city == "Rio").
  filter (_.income > 3000).filter (_.joined < 2005).
  filter (_.sex == 'f'). ...
I recently had your impression with an application of 3 files, one of them a bit lengthy, consisting of 4 classes, one of them not trivial, and about 10 to 20 methods. Each method was about 5 to 10 lines, and each pair of them could easily have been combined into a larger one, but I had to convince myself that, although measuring elegance in lines of code saved isn't completely wrong, sparing lines isn't the goal in itself.
Splitting a method into two often lowers the complexity per line, but not the overall complexity of understanding the whole program.
If the problem domain is complex - filter data at different levels, row-wise and column-wise, map it, group it, build averages, build graphs, paginate them ... - the complicated job has to be done somewhere.
The program isn't easier to understand; you just have to hit page-down less often. It is a readjustment: you have to read a line of code more slowly.
It doesn't bother me that much now that I'm used to Scala. If you want to be more explicit with types, you can always, for example, replace things like map(_.foo) with map { a: A => a.foo } to make the code more readable in lengthy/complex operations. Not that I usually find the need to do that.

Diff, compare and merge files - general concepts

Consider the example below without any assumption about language or file types. I have:
M_old : [A,B,C]
M_new : [A,B,X]
and now I would like to create:
M_result : [A,B,C,X]
which is basically the union of M_old and M_new. Is there any compare/diff/merge tool that supports this kind of operation - one that can take M_old and M_new and produce M_result based on the above instances?
I have had success with computing (using a simple diff/merge tool) M_result when:
M_old : [A,B,C]
M_new : [A,B,C,X]
since all elements in M_old are contained in M_new. But I don't see how this is possible when some of the elements in M_old are not present in M_new.
So, in more general terms, does a "merge" operation only support union under special conditions such as the one described above?
If you don't care about the exact order of the resulting list, you can compute M_result by appending every element in M_old that is not in M_new to M_new, or, alternatively, by concatenating M_new and M_old and then removing duplicate elements (These yield the exact same result if neither input list contains any duplicate elements to begin with).
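For example, a short Scala sketch of that union (the list names are just illustrative):
val mOld = List("A", "B", "C")
val mNew = List("A", "B", "X")

// Keep everything in M_new, then append the elements of M_old it lacks.
val mResult = mNew ++ mOld.filterNot(mNew.contains)
// mResult == List("A", "B", "X", "C") -- the union, ignoring order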
Correct merge tools will propose the solution that you want among other solutions; this is what our tool ECMerge does.
(I presume you have an ancestor: M_ancestor: [A,B])
Your problem here is that there is not enough data to determine what is expected when there is a replacement (should the merge result be ',C' then ',X', ',X' then ',C', or only one or the other?).
(I presume you have an ancestor: M_ancestor: [A,B,C])
The other case, when the change is an insertion of ',X' (M_old: [A,B,C] versus M_new: [A,B,C,X]), is generally resolved by adding ',X', as there is no particular choice. If the ancestor is only [A,B], then there is still a question: should the merger consider adding ',C' or ',C,X'? This is not obvious, as the version with only ',C' might have taken the knowledge of ',C,X' from outside the SCC system into account and already deleted the ',X'.
Even a 'clever' merger cannot do divination :)