Exists condition in Drools - drools

The context is employee shift rostering in OptaPlanner with Drools rules. Assume that I have some shifts and that I need to check if any of the shifts is in a list of pre-defined shifts to assign. The latter list of shifts to assign is, say, [S1,S2,S3]; I need to match the following condition (I use a colon as a "such that"):
exists s in [S1,S2,S3] : forall shift (shift != s)
How could I implement such a rule in Drools?

The following pattern will match if there is no Shift that is member of [S1, S2, S3].
not Shift( this memberOf [S1, S2, S3] )
Not sure if the list literal [S1, S2, S3] is Drools-compatible but I assume it's going to be something dynamic that will be inserted into working memory as a fact.

I think you need to represent the list of shifts to be assigned [S1, S2, S3] as a different type of fact (ShiftToAssign):
Shift { // planning entity
String id;
LocalDateTime: time;
Emploee employee;
}
ShiftToAssign { // planning fact
String id;
}
Then, instead of adding/removing elements to the pre-defined list of shifts to assign, you simply add them to #ProblemFactCollectionProperty. They will be automatically inserted into Drools working memory and so you'll be able to match your condition like this:
$s : ShiftToAssign // equivalent to: for each Si in [S1, S2, S3]
not Shift(id == $s.id) // equivalent to: there is no Shift whose id is member of the list

Related

Drools - Find matching object from a list of unconnected objects

I have a Mutation class with 3 variables, m1, m2, and m3. I also have a list of independent objects of a class Strain with 1 variable v. I need to be able to match a number of these objects i a list to the variables m1, m2 and m3.
Here is the Mutation class
class Mutation{
String m1;
String m2;
String m3;
}
The class Strain has a single variable m and is shown below:
class Strain {
String m;
}
I want to find the Strains that match the mutation. Here is what I tried with no success.
rule "Find the mathcing strains for the mutation"
when
$m: Mutation ( $m1: m1, $m2: m2, $m3: m3)
$s1: Strain( m == $m1 )
$s2: Strain( m == $m2 )
$s3: Strain ( m == $m3)
then
System.out.println("Found matching Strains");
end
However, this does not work. I have all the Strain objects and the Mutation objects inserted in the session. I also print out the values of the list of Mutation objects and the list of Strain objects and see the m values of the Strain objects that match the Mutation values, m1, m2 ad m3.
What am I doing wrong? Is this achievable using Drools? I'm sure it's my solution which has the problem. I would really appreciate finding a solution to this problem. I have scoured the net looking for a similar problem but haven't come across a solution that could fix this problem.
I summary, List mutations ... variables, m1,m2 and m3
List strains .. variable m
Please would really appreciate help in pointing me where I am going wrong, thanks in advance.

How to match specific tokens in UIMA Ruta?

DECLARE A,B;
DECLARE Annotation C(Annotation firstA, Annotation secondA,...);
"token1|token2|...|tokenn" -> A;
"token3|token4" -> B;
A A B {->MARK(C,1,3)};
I did with GATHER
(A COMMA A B) {-> GATHER(C,1,4,"firstA"=1,"secondA" = 3,"B"=4)};
But how about in case of unknown sequence of A type? as below, how can it be possible to store all A in features? The number of features are also unknown. In plan java, we declare String array and can add element, but in Ruta seems to be no as such process.
(A (COMMA A)+ B) {-PARTOF(C) -> GATHER(C,beginPosition,endPosition,"firstA"=1,"secondA" = 3,"thirdA"=?,so on)};
Annotations in UIMA refer to the complete span, from the begin offset to the end offset. So, if you want to specify something with two elements, then a simple annotation is not sufficient. You cannot create an annotation of the type C that covers the first A and the B but not the second A.
However, you can store the important information in feature values. How to implement it depends on various things.
If there are always exactly two annotation you want to remember, then add two features to type C and assign the feature values in the given rule, e.g., by CREATE(C,1,3,"first" = A, "second" = B ).
You can also use different actions like GATHER, or use one FSArray feature in order to store the annotations.
A complete example with a FSArray:
DECLARE A, B;
DECLARE Annotation C (FSArray as, B b);
"A" -> A;
"B" -> B;
(A (COMMA A)+ B){-PARTOF(C) -> CREATE(C, "as" = A, "b" = B)};
If applied on a text like "A, A, A B", the last rule creates one annotation of type C that stores three A annotations in the feature "as" and one B annotation in the feature "b"

Algorithm to evaluate value of Boolean expression

I had programming interview which consisted of 3 interviewers, 45 min each.
While first two interviewers gave me 2-3 short coding questions (i.e reverse linked list, implement rand(7) using rand(5) etc ) third interviewer used whole timeslot for single question:
You are given string representing correctly formed and parenthesized
boolean expression consisting of characters T, F, &, |, !, (, ) an
spaces. T stands for True, F for False, & for logical AND, | for
logical OR, ! for negate. & has greater priority than |. Any of these
chars is followed by a space in input string. I was to evaluate value
of expression and print it (output should be T or F). Example: Input:
! ( T | F & F ) Output: F
I tried to implement variation of Shunting Yard algorithm to solve the problem (to turn input in postfix form, and then to evaluate postfix expression), but failed to code it properly in given timeframe, so I ended up explaining in pseudocode and words what I wanted.
My recruiter said that first two interviewers gave me "HIRE", while third interviewer gave me "NO HIRE", and since the final decision is "logical AND", he thanked me for my time.
My questions:
Do you think that this question is appropriate to code on whiteboard in approx. 40 mins? To me it seems to much code for such a short timeslot and dimensions of whiteboard.
Is there shorter approach than to use Shunting yard algorithm for this problem?
Well, once you have some experience with parsers postfix algorithm is quite simple.
1. From left to right evaluate for each char:
if its operand, push on the stack.
if its operator, pop A, then pop B then push B operand A onto the stack. Last item on the stack will be the result. If there's none or more than one means you're doing it wrong (assuming the postfix notation is valid).
Infix to postfix is quite simple as well. That being said I don't think it's an appropriate task for 40 minutes if You don't know the algorithms. Here is a boolean postfix evaluation method I wrote at some stage (uses Lambda as well):
public static boolean evaluateBool(String s)
{
Stack<Object> stack = new Stack<>();
StringBuilder expression =new StringBuilder(s);
expression.chars().forEach(ch->
{
if(ch=='0') stack.push(false);
else if(ch=='1') stack.push(true);
else if(ch=='A'||ch=='R'||ch=='X')
{
boolean op1 = (boolean) stack.pop();
boolean op2 = (boolean) stack.pop();
switch(ch)
{
case 'A' : stack.push(op2&&op1); break;
case 'R' : stack.push(op2||op1); break;
case 'X' : stack.push(op2^op1); break;
}//endSwitch
}else
if(ch=='N')
{
boolean op1 = (boolean) stack.pop();
stack.push(!op1);
}//endIF
});
return (boolean) stack.pop();
}
In your case to make it working (with that snippet) you would first have to parse the expression and replace special characters like "!","|","^" etc with something plain like letters or just use integer char value in your if cases.

Matching a list (of tags) with another and detecting presence of common elements

My requirement is to match tags. In the example, this particular HourConstraint checks the TeacherHour assigned to Hour(23).
Specifically, it checks TeacherHour.attributes["tags"] for the values ["asst_ct","teacher_john_smith"] and detects atleast one match, two in this case (both "asst_ct" and "teacher_john_smith") .
TeacherHour:
id: 47
assigned_hour: Null
attributes:Map<List<String>>
"tags":["asst_ct","no_strenuous_duties","kinda_boring","teacher_john_smith"]
"another_attribute":[...]
HourConstraint:
hour: Hour(23)
attribute: "tags"
values_list: ["asst_ct","teacher_john_smith"]
Question: How do I detect the presence (true or false) of common elements between two lists?
Drools Expert has memberOf and contains, but they check a scalar vs a collection, never a collection vs a collection.
I see two potential ways:
introduce a function boolean isIntersecting(list,list) and tell Drools to use that for truth checking
Implement TeacherHour.attributes[] as a string instead of a list and HourConstraint.valueslist as a regular expression that can match that list
There are a few options. Most straight forward is to use the Collections class to do that for you:
rule X
when
$t: TeacherHour( )
HourConstraint( Collections.disjoint( $t.attributes["tags"], values_list ) == false )
...
If this is something you would use often in your rules, then I recommend wrapping that function in a pluggable operator, supported by Drools. Lets say you name the operator "intersect", you can then write your rules like this:
rule X
when
$t: TeacherHour( )
HourConstraint( values_list intersect $t.attributes["tags"] )
...
A third option, is to use "from", but that is less efficient in runtime as it causes iterations on the first list:
rule X
when
$t: TeacherHour( )
$tag : String() from $t.attributes["tags"]
exists( HourConstraint( values_list contains $tag ) )
...

Best way to create generic/method consistency for sort.data.frame?

I've finally decided to put the sort.data.frame method that's floating around the internet into an R package. It just gets requested too much to be left to an ad hoc method of distribution.
However, it's written with arguments that make it incompatible with the generic sort function:
sort(x,decreasing,...)
sort.data.frame(form,dat)
If I change sort.data.frame to take decreasing as an argument as in sort.data.frame(form,decreasing,dat) and discard decreasing, then it loses its simplicity because you'll always have to specify dat= and can't really use positional arguments. If I add it to the end as in sort.data.frame(form,dat,decreasing), then the order doesn't match with the generic function. If I hope that decreasing gets caught up in the dots `sort.data.frame(form,dat,...), then when using position-based matching I believe the generic function will assign the second position to decreasing and it will get discarded. What's the best way to harmonize these two functions?
The full function is:
# Sort a data frame
sort.data.frame <- function(form,dat){
# Author: Kevin Wright
# http://tolstoy.newcastle.edu.au/R/help/04/09/4300.html
# Some ideas from Andy Liaw
# http://tolstoy.newcastle.edu.au/R/help/04/07/1076.html
# Use + for ascending, - for decending.
# Sorting is left to right in the formula
# Useage is either of the following:
# sort.data.frame(~Block-Variety,Oats)
# sort.data.frame(Oats,~-Variety+Block)
# If dat is the formula, then switch form and dat
if(inherits(dat,"formula")){
f=dat
dat=form
form=f
}
if(form[[1]] != "~") {
stop("Formula must be one-sided.")
}
# Make the formula into character and remove spaces
formc <- as.character(form[2])
formc <- gsub(" ","",formc)
# If the first character is not + or -, add +
if(!is.element(substring(formc,1,1),c("+","-"))) {
formc <- paste("+",formc,sep="")
}
# Extract the variables from the formula
vars <- unlist(strsplit(formc, "[\\+\\-]"))
vars <- vars[vars!=""] # Remove spurious "" terms
# Build a list of arguments to pass to "order" function
calllist <- list()
pos=1 # Position of + or -
for(i in 1:length(vars)){
varsign <- substring(formc,pos,pos)
pos <- pos+1+nchar(vars[i])
if(is.factor(dat[,vars[i]])){
if(varsign=="-")
calllist[[i]] <- -rank(dat[,vars[i]])
else
calllist[[i]] <- rank(dat[,vars[i]])
}
else {
if(varsign=="-")
calllist[[i]] <- -dat[,vars[i]]
else
calllist[[i]] <- dat[,vars[i]]
}
}
dat[do.call("order",calllist),]
}
Example:
library(datasets)
sort.data.frame(~len+dose,ToothGrowth)
Use the arrange function in plyr. It allows you to individually pick which variables should be in ascending and descending order:
arrange(ToothGrowth, len, dose)
arrange(ToothGrowth, desc(len), dose)
arrange(ToothGrowth, len, desc(dose))
arrange(ToothGrowth, desc(len), desc(dose))
It also has an elegant implementation:
arrange <- function (df, ...) {
ord <- eval(substitute(order(...)), df, parent.frame())
unrowname(df[ord, ])
}
And desc is just an ordinary function:
desc <- function (x) -xtfrm(x)
Reading the help for xtfrm is highly recommended if you're writing this sort of function.
There are a few problems there. sort.data.frame needs to have the same arguments as the generic, so at a minimum it needs to be
sort.data.frame(x, decreasing = FALSE, ...) {
....
}
To have dispatch work, the first argument needs to be the object dispatched on. So I would start with:
sort.data.frame(x, decreasing = FALSE, formula = ~ ., ...) {
....
}
where x is your dat, formula is your form, and we provide a default for formula to include everything. (I haven't studied your code in detail to see exactly what form represents.)
Of course, you don't need to specify decreasing in the call, so:
sort(ToothGrowth, formula = ~ len + dose)
would be how to call the function using the above specifications.
Otherwise, if you don't want sort.data.frame to be an S3 generic, call it something else and then you are free to have whatever arguments you want.
I agree with #Gavin that x must come first. I'd put the decreasing parameter after the formula though - since it probably isn't used that much, and hardly ever as a positional argument.
The formula argument would be used much more and therefore should be the second argument. I also strongly agree with #Gavin that it should be called formula, and not form.
sort.data.frame(x, formula = ~ ., decreasing = FALSE, ...) {
...
}
You might want to extend the decreasing argument to allow a logical vector where each TRUE/FALSE value corresponds to one column in the formula:
d <- data.frame(A=1:10, B=10:1)
sort(d, ~ A+B, decreasing=c(A=TRUE, B=FALSE)) # sort by decreasing A, increasing B