Optimising first solution strategy for VRP - or-tools

I'm trying to pick the best first solution strategy to use on a VRP.
My use case is that an individual case takes around 60 seconds to solve on average, but I need to run hundreds or thousands of cases sequentially, so my whole solution takes hours.
I can trade off finding the optimal solution against time; a good solution is usually good enough.
Using the different strategies, I get solve times between 1 and 120 seconds.
My questions:
Is it reasonable to assume that the best strategy for one case will also be the best for other cases, given that my model does not change much (just different pickup nodes and time windows)?
Has anyone tried testing each strategy first and then picking the best one to use for the rest of the cases?
If I were to set the time limit to e.g. 1 second, would the strategy that gives the lowest objective value after 1 s also be likely to give the best solution after 60 s, or with no limit at all?
Many thanks!
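One way to run that kind of bake-off is to solve a few representative cases with each candidate strategy under a short time limit and keep the winner. A rough sketch in Python; the `create_model` helper, the `sample_cases` argument, and the strategy shortlist are all assumptions about your setup, not OR-Tools requirements:

```python
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

# Hypothetical shortlist of strategies to benchmark; extend as needed.
STRATEGIES = [
    routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC,
    routing_enums_pb2.FirstSolutionStrategy.SAVINGS,
    routing_enums_pb2.FirstSolutionStrategy.PARALLEL_CHEAPEST_INSERTION,
]

def pick_strategy(create_model, sample_cases, time_limit_s=1):
    """Solve each sample case with each strategy under a short time
    limit and return the strategy with the lowest total objective.
    create_model(case) is assumed to return a (manager, routing) pair."""
    best_strategy, best_total = None, float("inf")
    for strategy in STRATEGIES:
        total = 0
        for case in sample_cases:
            manager, routing = create_model(case)
            params = pywrapcp.DefaultRoutingSearchParameters()
            params.first_solution_strategy = strategy
            params.time_limit.FromSeconds(time_limit_s)
            solution = routing.SolveWithParameters(params)
            if solution is None:
                total = float("inf")  # strategy found nothing in time
                break
            total += solution.ObjectiveValue()
        if total < best_total:
            best_strategy, best_total = strategy, total
    return best_strategy
```

Whether the 1-second winner is also the 60-second winner is exactly the open question above, so it's worth validating the shortlisted strategy on a handful of cases at full time budget before committing.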

Related

Use number of solutions rather than maximum time to end solve attempts

I am using the CP-SAT solver on a JSP (job shop scheduling problem).
I am iterating, so the solver runs many times (basically simulating each day for a year). I do not need to find the optimal solution, just a reasonably good one, so I would like to be a bit smarter about ending the solver than simply allowing it to run for X seconds each time. For example, I would like to take the 5th solution each time, or even to stop once the current solution's makespan is less than 5% (for example) better than the previous solution's.
Is this possible? I am only aware of solver.parameters.max_time_in_seconds as a way of limiting the calculation time. Intermediate solutions are printed by SolutionPrinter, but I think this is output only and there is no way to break the solver during a run?
Wrong; you can stop the search in a callback. See this recipe:
https://github.com/google/or-tools/blob/stable/ortools/sat/docs/solver.md#stopping-search-early
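As a concrete illustration of that recipe, here is a minimal sketch of a solution callback that stops the search either after the Nth solution or once the relative improvement over the previous solution gets small. It assumes a minimization objective such as makespan; the thresholds are made up:

```python
from ortools.sat.python import cp_model

class EarlyStopCallback(cp_model.CpSolverSolutionCallback):
    """Stop after max_solutions solutions, or once the relative
    improvement over the previous solution drops below min_improvement."""

    def __init__(self, max_solutions=5, min_improvement=0.05):
        super().__init__()
        self._max_solutions = max_solutions
        self._min_improvement = min_improvement
        self._count = 0
        self._prev_objective = None

    def on_solution_callback(self):
        self._count += 1
        obj = self.ObjectiveValue()
        if self._count >= self._max_solutions:
            self.StopSearch()
        elif self._prev_objective is not None and self._prev_objective > 0:
            # Fractional improvement, assuming we are minimizing.
            improvement = (self._prev_objective - obj) / self._prev_objective
            if improvement < self._min_improvement:
                self.StopSearch()
        self._prev_objective = obj

# usage: solver.Solve(model, EarlyStopCallback())
```

You can still keep max_time_in_seconds set as a safety net; whichever limit trips first ends the run.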

Topics and LL/token in Mallet change every time

Why do I get different keywords and LL/token every time I run topic models in Mallet? Is it normal?
Please help. Thank you.
Yes, this is normal and expected. Mallet implements a randomized algorithm. Finding the exact optimal topic model for a collection is computationally intractable, but it's much easier to find one of countless "pretty good" solutions.
As an intuition, imagine shaking a box of sand. The smaller particles will sift towards one side, and the larger particles towards the other. That's way easier than trying to sort them by hand. You won't get the exact order, but each time you'll get one of a large number of equally good approximate sortings.
If you want to have a stronger guarantee of local optimality, add --num-icm-iterations 100 to switch from sampling to choosing the single best allocation for each token, given all the others.

Difference between using InsertLogical & accumulate

I was taking a look at different ways to formulate a constraint rule for OptaPlanner, and I was wondering about the use of insertLogical.
In the nurse rostering example, is it just a way to measure the length of runs of consecutive working days? In other words, I'd like to know the difference between using insertLogical (and then calculating the run length) and a plain and simple "accumulate" function.
Also, about this specific example, I'd like to know why performance is improved by applying different saliences.
insertLogicals are dreadfully slow. Out of 30 examples/quickstarts or so, nurse rostering is the only one using it, for the "n consecutive" constraints. Avoid it if you can.
For ConstraintStreams, we're working on better, faster, cleaner alternatives to handle this kind of constraint.

Does OptaPlanner have a "built-in" way to perform multi-unit score normalization?

At the moment, my problem has four metrics. Each of these measures something entirely different (each has different units, a different range, etc.) and each is weighted externally. I am using Drools for scoring.
I have only one score level (SimpleLongScore), and I have to find a way to appropriately combine the individual scores of these metrics into one long value.
The most significant problem at the moment is that the range of values for the metrics can be wildly different.
So if, for example, a move improves the score of a metric with a small possible range by, say, 10%, that could be completely dwarfed by an alternative move which improves a metric with a larger range by only 1%, because (to my knowledge) OptaPlanner only considers the actual score value, not the possible range of values or how changes affect them proportionally.
So, is there a way to handle this cleanly which is already part of OptaPlanner that I cannot find?
Is the only feasible solution to implement Pareto scoring? Because that seems like a hack-y nightmare.
So far I have code/math to compute the best-possible and worst-possible scores for a metric, which I access from within the Drools rules, and then I can compute where in that range a move puts us. But this also feels quite hack-y and will cause issues with incremental scoring if we want to scale non-linearly within that range.
I keep coming back to thinking I should just bite the bullet and implement Pareto scoring.
Thanks!
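For illustration, the range-based normalization described above amounts to something like the following. This is a plain sketch of the math, not an OptaPlanner API; the bounds and weights are made up:

```python
# Hypothetical per-metric bounds and external weights.
METRICS = {
    "travel_time": {"best": 0.0, "worst": 10_000.0, "weight": 0.4},
    "cost":        {"best": 0.0, "worst": 500.0,    "weight": 0.3},
    "lateness":    {"best": 0.0, "worst": 120.0,    "weight": 0.2},
    "idle_time":   {"best": 0.0, "worst": 60.0,     "weight": 0.1},
}

SCALE = 1_000_000  # resolution when packing into one long value

def combined_score(raw):
    """Map each raw metric value onto [0, 1] within its own
    best..worst range, weight it, and sum into a single long."""
    total = 0.0
    for name, m in METRICS.items():
        span = m["worst"] - m["best"]
        normalized = (raw[name] - m["best"]) / span  # 0 = best, 1 = worst
        total += m["weight"] * normalized
    return -round(total * SCALE)  # less negative is better

print(combined_score({"travel_time": 2500, "cost": 100,
                      "lateness": 30, "idle_time": 15}))  # -235000
```

The catch, as noted above, is that hard-coding best/worst bounds outside the rules fights incremental score calculation, especially if the scaling ever becomes non-linear.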
Take a look at @ConstraintConfiguration and @ConstraintWeight in the docs.
Also take a look at the chapter "explaining the score", which can tell you exactly which constraint had which score impact on the best solution found.
If, however, you need Pareto optimization, meaning you need multiple best solutions that don't dominate each other, then know that OptaPlanner doesn't support that yet, but I know of 2 cases that implemented it in OptaPlanner by hacking BestSolutionRecaller.
That being said, 99% of the cases that think of Pareto optimization are 100% happy with @ConstraintWeight instead, because users don't want multiple best solutions (except during simulations); they just want one in production.

What's the name of the algorithm to decide the best collect frequency in Facebook games?

So in many Facebook games there are various buildings with different collection frequencies, and the number of collections you can make depends on the length and spacing of the periods of free time you have in a day.
Thinking about how to schedule the most collections across the different frequencies reminds me of words like knapsack and scheduling, but I forgot the actual name of the algorithm for this, or whether it is as difficult as those problems.
So, what's the name I am looking for?
Thanks.
Sounds like weighted interval scheduling.
A list of tasks is given as a set of time intervals; for instance, one task might run from 2:00 to 5:00 and another from 6:00 to 8:00. A request corresponds to an interval of time, and a subset of requests is compatible if no two of them overlap in time. In the unweighted version the goal is simply to accept as large a compatible subset as possible; in the weighted version each interval carries a value, and the goal is to maximize the total value of the compatible subset.
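The weighted version has a classic O(n log n) dynamic programming solution: sort by end time and, for each interval, either skip it or take it plus the best answer over the intervals ending at or before its start. A minimal sketch; the interval data is made up:

```python
import bisect

def weighted_interval_scheduling(intervals):
    """intervals: list of (start, end, value). Returns the maximum
    total value of a set of pairwise non-overlapping intervals."""
    intervals = sorted(intervals, key=lambda iv: iv[1])  # sort by end time
    ends = [iv[1] for iv in intervals]
    # dp[i] = best value using only the first i intervals
    dp = [0] * (len(intervals) + 1)
    for i, (start, end, value) in enumerate(intervals, 1):
        # index of the last earlier interval ending at or before this start
        j = bisect.bisect_right(ends, start, 0, i - 1)
        dp[i] = max(dp[i - 1],       # skip interval i
                    dp[j] + value)   # take interval i
    return dp[-1]

# e.g. collections worth 10 (2:00-5:00), 6 (6:00-8:00), 9 (4:00-7:00)
print(weighted_interval_scheduling([(2, 5, 10), (6, 8, 6), (4, 7, 9)]))  # 16
```

For the game scenario, each possible collection window would become an interval whose value reflects the building's payout, and the free-time gaps bound which intervals exist at all.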