Can't make accumulate work properly - Drools

I have the following class structure (Class A contains Class B):
class A {
    B object;
    ...
}
and I'm trying to do something whenever the average of a specific field of class B, accumulated over all B objects, passes a given threshold.
So I'm trying to write the following:
when
    A($var1 : object)
    accumulate( B($num : num) from $var1;
                $avg1 : avg($num); $avg1 < 10000 )
then ...
What happens is that instead of accumulating all entities in the session and calculating the average over all of them, the average is calculated for each entity separately.
So if the session already contains 5 numeric values bigger than 10000 and another one is inserted, the "then" part is invoked 6 times (each time with an average equal to the numeric value itself) instead of only once.
Do you have a hint that might help me solve this?
Thanks.

You have to accumulate over all A facts, accessing the field num of the field object.
when
    accumulate( A( $var1 : object );
                $avg1 : avg( $var1.getNum() ); $avg1 < 10000 )
then ...
Inserting all B objects as facts as well would permit you to write the straightforward
when
    accumulate( B( $num : num );
                $avg1 : avg( $num ); $avg1 < 10000 )
then ...

Multiple Knapsacks with Fungible Items

I am using cp_model to solve a problem very similar to the multiple-knapsack problem (https://developers.google.com/optimization/bin/multiple_knapsack). Just like in the example code, I use some boolean variables to encode membership:
# Variables
# x[i, j] = 1 if item i is packed in bin j.
x = {}
for i in data['items']:
    for j in data['bins']:
        x[(i, j)] = solver.IntVar(0, 1, 'x_%i_%i' % (i, j))
What is specific to my problem is that there are a large number of fungible items. There may be 5 items of type 1 and 10 items of type 2, and any item is exchangeable with items of the same type. Using one Boolean variable per item implicitly assumes that the order of the assignment for items of the same type matters. But in fact the order does not matter, and distinguishing the items only wastes computation time.
I am wondering if there is any way to design the model so that it accurately expresses that we are allocating from fungible pools of items, to save computation.
Instead of creating 5 Boolean variables for the 5 items of type i in bin b, just create an integer variable count[i][b] from 0 to 5 for the number of items of type i in bin b. Then constrain the sum over b of count[i][b] to equal the number of items of type i.
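A minimal CP-SAT sketch of that formulation; all of the data below (type counts, weights, values, capacities) is made up for illustration:

from ortools.sat.python import cp_model

model = cp_model.CpModel()

# Hypothetical data: pool size, per-item weight and value for each type,
# and the capacity of each bin.
type_count = {1: 5, 2: 10}
weight = {1: 4, 2: 3}
value = {1: 10, 2: 6}
capacity = {0: 20, 1: 25}

# count[t, b] = number of items of type t packed into bin b.
# One integer variable replaces type_count[t] Booleans per bin.
count = {}
for t in type_count:
    for b in capacity:
        count[t, b] = model.NewIntVar(0, type_count[t], f'count_{t}_{b}')

# Never pack more items of a type than the pool contains
# (use == instead of <= if every item must be packed).
for t in type_count:
    model.Add(sum(count[t, b] for b in capacity) <= type_count[t])

# Respect each bin's capacity.
for b in capacity:
    model.Add(sum(weight[t] * count[t, b] for t in type_count) <= capacity[b])

# Maximize the total packed value.
model.Maximize(sum(value[t] * count[t, b] for t in type_count for b in capacity))

solver = cp_model.CpSolver()
status = solver.Solve(model)
if status in (cp_model.OPTIMAL, cp_model.FEASIBLE):
    for (t, b), var in sorted(count.items()):
        print(f'type {t} -> bin {b}: {solver.Value(var)}')

This collapses the factorially many symmetric item-level assignments within each type into a single integer per (type, bin) pair, which removes exactly the wasted search the question describes.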

Overriding `Comparison method violates its general contract` exception

I have a comparator like this:
lazy val seq = mapping.toSeq.sortWith { case ((_, set1), (_, set2)) =>
  // Just propose all the most connected nodes first to the users
  // But also allow less connected nodes to pop out sometimes
  val popOutChance = random.nextDouble <= 0.1D && set2.size > 5
  if (popOutChance) set1.size < set2.size else set1.size > set2.size
}
My intention is to compare set sizes such that smaller sets may appear higher in the sorted list with a 10% chance.
But I can't do that: at runtime it throws java.lang.IllegalArgumentException: Comparison method violates its general contract! How can I override it?
I think the problem here is that, every time two elements are compared, the outcome is random, thus violating the transitive property required of a comparator function in any sorting algorithm.
For example, let's say that some instance a compares as less than b, and then b compares as less than c. These results should imply that a compares as less than c. However, since your comparisons are stochastic, you can't guarantee that outcome. In fact, you can't even guarantee that a will be less than b next time they're compared.
So don't do that. No sort algorithm can handle it. (Such an approach also violates the referential transparency principle of functional programming and will make your program much harder to reason about.)
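You can see the failure directly: sorting a largish sequence with an inconsistent comparator will usually (though not deterministically) trip the contract check in the JDK's TimSort, which Scala's sort delegates to:

import scala.util.Random

val random = new Random
// Likely to throw java.lang.IllegalArgumentException:
// "Comparison method violates its general contract!"
(1 to 1000).sortWith((_, _) => random.nextBoolean())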
Instead, what you need to do is to decorate your map's members with a randomly assigned weighting - before attempting to sort them - so that they can be sorted consistently. However, since this happens at the start of a sort operation, the result of the sort will be different each time, which I think is what you're looking for.
It's not clear what type mapping has in your example, but it appears to be something like: Map[Any, Set[_]]. (You can replace the types as required - it's not that important to this approach. For example, say mapping actually has the type Map[String, Set[SomeClass]], then you would replace references below to Any with String and Set[_] to Set[SomeClass].)
First, we'll create a case class that we'll use to score and compare the map elements. Then we'll map the contents of mapping to a sequence of elements of this case class. Next, we sort those elements. Finally, we extract the tuple from the decorated class. The result should look something like this:
final case class Decorated(x: (Any, Set[_]), rand: Double = random.nextDouble)
  extends Ordered[Decorated] {

  // Calculate a rank for this element. You'll need to change this to suit your precise
  // requirements. Here, if rand is less than 0.1 (a 10% chance), I'm adding 5 to the size;
  // otherwise, I'll report the actual size. This allows transitive comparisons, since
  // rand doesn't change once defined. Values are negated so bigger sets come to the fore
  // when sorted.
  private def rank: Int = {
    if (rand < 0.1) -(x._2.size + 5)
    else -x._2.size
  }

  // Compare this element with another, by their ranks.
  override def compare(that: Decorated): Int = rank.compare(that.rank)
}
// Now sort your mapping elements as follows and convert back to tuples.
lazy val seq = mapping.map(x => Decorated(x)).toSeq.sorted.map(_.x)
This should put the elements with larger sets towards the front, but there's a 10% chance that a set appears 5 bigger and so moves up the list. The result will be different each time the last line is re-executed, since map will create new random values for each element. However, during sorting, the ranks are fixed and will not change.
(Note that I'm setting the rank to a negative value. The Ordered[T] trait sorts elements in ascending order, so that - if we sorted purely by set size - smaller sets would come before larger sets. By negating the rank value, sorting will put larger sets before smaller sets. If you don't want this behavior, remove the negations.)
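For example, with a hypothetical mapping (note that random must be in scope before Decorated is defined, since the default value of rand references it):

import scala.util.Random

val random = new Random

val mapping: Map[String, Set[Int]] = Map(
  "a" -> Set(1, 2, 3, 4, 5, 6),
  "b" -> Set(1, 2),
  "c" -> Set(1, 2, 3, 4)
)

// Decorate, sort on the fixed ranks, then strip the decoration.
val seq = mapping.map(x => Decorated(x)).toSeq.sorted.map(_.x)
println(seq.map { case (k, s) => s"$k(${s.size})" }.mkString(", "))
// Usually prints a(6), c(4), b(2); occasionally a smaller set jumps ahead.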

Strange object value instead of float when using mapReduce in MongoDB with Doctrine

I use a mongo query to calculate the sum price for every item.
My query looks like this:
$queryBuilder = new Query\Builder($this, $documentName);
$queryBuilder->field('created')->gte($startDate);
$queryBuilder->field('is_test_value')->notEqual(true);
..........
$queryBuilder->map('function() { emit(this.item, this.price); }');
$queryBuilder->reduce('function(item, valuesPrices) {
    return {sum: Array.sum(valuesPrices)};
}');
And this works, no problem. But I found that in some cases (approximately 20 out of 200 results) I get a strange result in the sum field - instead of a sum value I see something like
[object Object]444444444444444
where 4 is the price for the item.
I tried to replace the reduce block with this:
var sum = 0;
for (var i = 0; i < valuesPrices.length; i++) {
    sum += parseFloat(valuesPrices[i]);
}
return {sum: sum};
In that case I see a NaN value.
I suspected that some data in the price field was inserted incorrectly (not as a float, but as a string, an object, etc.). I tried executing my query from the mongo CLI and saw that all price values are integers.
It's not "strange" at all. You "broke the rules" and now you are paying for it.
"MongoDB can invoke the reduce function more than once for the same key. In this case, the previous output from the reduce function for that key will become one of the input values to the next reduce function invocation for that key."
The primary rule of mapReduce (as cited above) is that you must return from the "reducer" exactly the same structure that you emit from the "mapper". This is because the "reducer" can actually run several times for the same "key". This is how mapReduce processes large lists.
You fix this by just returning a singular value, just like you did in the emit:
return Array.sum(valuesPrices);
And then there will not be a problem. Adding an object key to that makes the data inconsistent, and thus you get an error when the "reduced" result gets fed back into the "reducer" again.
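Applied to the Doctrine builder from the question, the corrected map/reduce pair would look like this (keeping the question's field names):

$queryBuilder->map('function() { emit(this.item, this.price); }');
$queryBuilder->reduce('function(item, valuesPrices) {
    return Array.sum(valuesPrices); // same shape as the emitted value
}');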

To satisfy x out of y constraints

I am making a timetabling program which does one-to-one matches from SubjectTeacherPeriod (a planning entity) to Period. There comes a case when I need: "for y periods, at least x of the SubjectTeacherPeriods must match a match_condition".
For example, I want to constrain 3 particular periods so that at least two of them are taught by teachers who match 'asst prof'.
Here is the data structure holding such a constraint:
class XOfYPeriods {
    SomeType match_condition;
    int x;
    List<Period> periods; // problem
}
SubjectTeacherPeriod has a Period, of course:
class SubjectTeacherPeriod {
    int id;
    SomeType attrib;
    Period period;
}
How do I write a rule that evaluates the individual Periods from the list, checking whether at least x of the SubjectTeacherPeriods allocated to those Periods meet the match condition?
Do correct me if I am defining my classes in bad form.
For the sake of example, here is a statement to be evaluated to determine a match: eval(matches($stp_attrib, $match_condition))
Sorry for the use of pseudocode if it confused more than clarified. SomeType is actually List<String>, so the match condition is checked with Collections.disjoint.
I will give it a try, but I'm not sure I completely understand your problem statement:
rule "X of Y Periods"
when
$c : XOfYPeriods( )
$list : List( size > $c.x ) from
accumulate( $stp : SubjectTeacherPeriod( matches(attrib, $c.match_condition),
period memberOf $c.periods ),
collectList( $stp ) )
then
// $list of STP that match the condition and
// whose period matches one of the periods in the list
end
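If you only need to know that the threshold was reached, not which STPs reached it, counting avoids building the list (a sketch under the same assumptions about matches and the class fields):

rule "X of Y Periods (count only)"
when
    $c : XOfYPeriods( )
    Number( intValue >= $c.x ) from
        accumulate( $stp : SubjectTeacherPeriod( matches(attrib, $c.match_condition),
                                                 period memberOf $c.periods ),
                    count( $stp ) )
then
    // at least x matching SubjectTeacherPeriods exist among the y periods
end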
Hope it helps.

Change the class of columns in a data frame

First of all, excuse me if I make any mistakes; English is not a language I use very often.
I have a data frame with numbers. A small part of the data frame is this:
nominal ordinal
2       2
2       1
2       1
2       2
So, I want to use the Gower distance function on these numbers.
Here ( http://rgm2.lab.nig.ac.jp/RGM2/R_man-2.9.0/library/StatMatch/man/gower.dist.html ) it says that in order to use gower.dist, all nominal variables must be of class "factor" and all ordinal variables of class "ordered".
By default, all the columns are of class "integer" and mode "numeric". In order to change the class of the columns, I use these commands:
DF = read.table("clipboard", header=TRUE, sep="\t")
# I select all the cells and I copy them to the clipboard.
# Then R, with this command, reads the data from there.
MyHeader = names(DF)  # I save the headers of the data frame to a temp vector
for (i in 1:length(DF)) {
    if (MyHeader[[i]] == "nominal") DF[[i]] = as.factor(DF[[i]])
}
for (i in 1:length(DF)) {
    if (MyHeader[[i]] == "ordinal") DF[[i]] = as.ordered(DF[[i]])
}
The first for/if loop changes the class from integer to factor, which is what I want, but the second changes the class of the ordinal variables to "ordered" "factor" (two classes).
I need the columns with the header "ordinal" to be of class "ordered", as the gower.dist documentation says.
Thanks in advance,
B.T.
What you are doing is fine, if perhaps a little inelegant.
With your ordered factor, you have something like:
> foo <- as.ordered(1:10)
> foo
[1] 1 2 3 4 5 6 7 8 9 10
Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 10
> class(foo)
[1] "ordered" "factor"
Notice that it has two classes, indicating that it is an ordered factor and that it is a factor:
> is.ordered(as.ordered(1:10))
[1] TRUE
> is.factor(as.ordered(1:10))
[1] TRUE
In some senses, you might like to think that foo is an ordered factor that also inherits from the factor class. Alternatively, if there isn't a specific method that handles ordered factors but there is a method for factors, R will use the factor method. As far as R is concerned, an ordered factor is an object with classes "ordered" and "factor". This is what your function for Gower's distance will require.
You could easily do this with:
DF$nominal <- as.factor(DF$nominal)
DF$ordinal <- as.ordered(DF$ordinal)
which gives you a data frame with the correct structure. If you work with data frames, please stay away from [[]] unless you know very well what you're doing. Take Dirk's advice and check Owen's R Guide as well. You definitely need it.
If I do the conversion as shown above, gower.dist() works perfectly fine. On a side note, the Gower distance can easily be calculated with the daisy() function as well:
DF <- data.frame(
    ordinal = c(1, 2, 3, 1, 2, 1),
    nominal = c(2, 2, 2, 2, 2, 2)
)
DF$nominal <- as.factor(DF$nominal)
DF$ordinal <- as.ordered(DF$ordinal)
library(cluster)
daisy(DF,metric="gower")
library(StatMatch)
gower.dist(DF)
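Either way, you can check that the columns now have the classes gower.dist() expects; for the small DF above, str() should report something like:

> str(DF)
'data.frame':   6 obs. of  2 variables:
 $ ordinal: Ord.factor w/ 3 levels "1"<"2"<"3": 1 2 3 1 2 1
 $ nominal: Factor w/ 1 level "2": 1 1 1 1 1 1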