Scheduling with CP-SAT solver is very slow - or-tools

When I run this code from the official documentation with the default parameters num_nurses = 4, num_shifts = 3, num_days = 3, it finds a solution in less than 1 second of wall time.
When I increase the parameters to num_nurses = 40, num_shifts = 30, num_days = 30, I end up aborting the script after 2 hours because no solution has been found.
I appreciate that the larger parameters cause a combinatorial explosion, but scheduling 40 nurses across 30 shifts for a month is a realistic problem.
What can be done to solve this problem faster at larger sizes? Would solution hinting help, or increasing/decreasing the number of constraints?

This example is actually not very interesting. The biggest problem is that if num_nurses != num_shifts, there are no solutions. And this is hard to prove.
Please have a look at this example.
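Independent of the specific example, a few CpSolver settings usually make a big difference on larger models. Below is a minimal sketch, assuming the nurse model from the documentation has been built with its Boolean variables in the dict shifts[(n, d, s)]; the commented hinting lines assume a hypothetical previous_schedule dict:

from ortools.sat.python import cp_model

model = cp_model.CpModel()
# ... build the nurse model as in the documentation example,
# producing the Boolean variables shifts[(n, d, s)] ...

solver = cp_model.CpSolver()
solver.parameters.num_search_workers = 8       # parallel search helps a lot on big models
solver.parameters.max_time_in_seconds = 300.0  # take the best schedule found within 5 minutes
solver.parameters.log_search_progress = True   # watch the log to see whether bounds still move

# Solution hinting: warm-start from a known (partial) schedule, if you have one.
# for (n, d, s), value in previous_schedule.items():
#     model.AddHint(shifts[(n, d, s)], value)

status = solver.Solve(model)
print(solver.StatusName(status), solver.WallTime())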

Related

Ways to Improve Universal Differential Equation Training with sciml_train

About a month ago I asked a question about strategies for better convergence when training a neural differential equation. I've since gotten that example to work using the advice I was given, but when I applied the same advice to a more difficult model, I got stuck again. All of my code is in Julia, primarily making use of the DiffEqFlux library. In an effort to keep this post as brief as possible, I won't share all of the code for everything I've tried, but if anyone wants access to it to troubleshoot, I can provide it.
What I'm Trying to Do
The data I'm trying to learn comes from an SIRx model:
# Ground-truth model: SIR dynamics with an added behavior variable x.
function SIRx!(du, u, p, t)
    β, μ, γ, a, b = Float32.([280, 1/50, 365/22, 100, 0.05])
    S, I, x = u
    du[1] = μ*(1-x) - β*S*I - μ*S
    du[2] = β*S*I - (μ+γ)*I
    du[3] = a*I - b*x
    nothing
end;
The initial condition I used was u0 = Float32.([0.062047128, 1.3126149f-7, 0.9486445]);. I generated data from t=0 to 25, sampled every 0.02 (in training, I only use every 20th point or so for speed; using more doesn't improve results). The data looks like this: (figure: Training Data)
The UDE I'm training is
function SIRx_ude!(du, u, p, t)
    μ, γ = Float32.([1/50, 365/22])
    S, I, x = u
    # Each network reads its own slice of the flat parameter vector p.
    du[1] = μ*(1-x) - μ*S + ann_dS(u, @view(p[1:lenS]))[1]
    du[2] = -(μ+γ)*I + ann_dI(u, @view(p[lenS+1:lenS+lenI]))[1]
    du[3] = ann_dx(u, @view(p[lenS+lenI+1:end]))[1]
    nothing
end;
Each of the neural networks (ann_dS, ann_dI, ann_dx) is defined using FastChain(FastDense(3, 20, tanh), FastDense(20, 1)). I tried using a single neural network with 3 inputs and 3 outputs, but it was slower and didn't perform any better. I also tried normalizing the inputs to the network first, but it doesn't make a significant difference beyond slowing things down.
What I've Tried
Single shooting
The network just fits a line through the middle of the data. This happens even when I weight the earlier data points more heavily in the loss function. (figure: Single-shot Training)
Multiple Shooting
The best result I had was with multiple shooting. As seen in the figure (Multiple Shooting Result), it's not simply fitting a straight line, but it's not exactly fitting the data either. I've tried continuity terms ranging from 0.1 to 100 and group sizes from 3 to 30, and it doesn't make a significant difference.
Various Other Strategies
I've also tried iteratively growing the fit, two-stage training with collocation, and mini-batching, as outlined here: https://diffeqflux.sciml.ai/dev/examples/local_minima, https://diffeqflux.sciml.ai/dev/examples/collocation/, https://diffeqflux.sciml.ai/dev/examples/minibatch/. Iteratively growing the fit works well for the first couple of iterations, but as the length increases it goes back to fitting a straight line. Two-stage collocation training works really well for stage 1, but it doesn't actually improve performance in the second stage (I've tried both single and multiple shooting there). Finally, mini-batching worked about as well as single shooting (which is to say, not very well), but much more quickly.
My Question
In summary, I have no idea what to try next. There are so many strategies, each with so many parameters that can be tweaked. I need a way to diagnose the problem more precisely so I can better decide how to proceed. If anyone has experience with this sort of problem, I'd appreciate any advice or guidance.
This isn't a great SO question because it's more exploratory. Did you lower your ODE tolerances? That would improve your gradient calculation which could help. What activation function are you using? I would use something like softplus instead of tanh so that you don't have the saturating behavior. Did you scale the eigenvalues and take into account the issues explored in the stiff neural ODE paper? Larger neural networks? Different learning rates? ADAM? Etc.
This is much better suited for a forum for discussion like the JuliaLang Discourse. We can continue there since walking through this will not be fruitful without some back and forth.

OR-Tools CP-SAT solver returns inconsistent results

I have an optimization problem and I am using the or-tools cp_sat solver. The number of variables is around 3500 (all Boolean), but the number of constraints is huge (~750,000). Of the 3500 variables, ~3000 are directly dependent on the other 500.
There are 2 scenarios I tested:
1. With a simple objective function depending on the ~3000 constraint variables.
2. With a complex objective function depending on ~3000*3000 new variables, where each new variable is the pairwise logical_and of the variables in (1) (see the sketch below).
For each case, we seed the solver with hints for the ~500 variables.
For (1), it cannot find an optimal solution in reasonable time. After around 30-45 minutes of runtime the improvement to the objective is negligible, but the solutions are satisfactory.
For (2), the behavior is weird. Around half of the time it claims the problem is INFEASIBLE; the other half it claims it found an OPTIMAL solution but only returns the solution implied by the hints. Only rarely (in less than a couple percent of runs) does it do some optimization and return FEASIBLE.
In addition, case 1 uses around 4-6 GB of memory, but case 2 uses 100-120 GB.
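For concreteness, here is a sketch of the standard way such pairwise AND variables are reified in CP-SAT (a hedged sketch, assuming model is the CpModel and x is the list of Booleans from case 1; this is an assumption about the construction, not the exact code):

# v == AND(x[i], x[j]), reified the standard CP-SAT way.
# With n ~ 3000 this creates ~n^2/2 pair variables (~4.5M, or ~9M
# if ordered pairs are kept), which explains the memory blowup.
pair = {}
for i in range(len(x)):
    for j in range(i + 1, len(x)):
        v = model.NewBoolVar(f"and_{i}_{j}")
        model.AddBoolAnd([x[i], x[j]]).OnlyEnforceIf(v)   # v  =>  x[i] and x[j]
        model.AddBoolOr([x[i].Not(), x[j].Not(), v])      # x[i] and x[j]  =>  v
        pair[i, j] = v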
Is the behavior in case 2 expected? How should I approach debugging this?
For case 2, the problem becomes very big. You are creating 9M Boolean variables.
Are you using multithreading?
Can you try reducing the size of the model and see if it is still flaky?
Is the problem creation deterministic?
Are you using large coefficients? Is it possible you are hitting an integer overflow error?
Thanks
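Two of those checks can be scripted directly. A minimal sketch, assuming the case-2 model is in model:

from ortools.sat.python import cp_model

# Validate() returns a non-empty string when the model is malformed,
# including potential integer overflow in constraints or the objective.
error = model.Validate()
if error:
    print(error)

solver = cp_model.CpSolver()
solver.parameters.num_search_workers = 1  # single worker to rule out parallel nondeterminism
solver.parameters.random_seed = 0         # fix the seed so repeated runs are comparable
print(solver.StatusName(solver.Solve(model)))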

Too many for loop iterations - for loop terminates

In a classification task, I need to do feature selection. Out of featSize = 98 features (variables), I want to know which ones are relevant. For each combination, I train the classifier and tune its hyperparameters. I've run into a problem with my use of a for loop:
for b = 1:(2^featSize) - 1
    % this is to choose the features, e.g. [1 0 0] selects the first
    % feature out of three features if featSize = 3.
end
Matlab gives a warning: Warning: Too many FOR loop iterations. Stopping after 9223372036854775806 iterations.
Am I using the for loop in a prohibitive way? Is there another alternative method of completing this step?
Building a model for every possible combination of features is intractable. It's clear from your for loop that you would have to build an exponential number of models to cover every feature subset.
There are many approaches to feature selection that are practical to implement. The one most similar to your method is forward selection. Many algorithms offer a regularization parameter instead (e.g. LASSO or ridge regression). Some options for regression are discussed here: https://stats.stackexchange.com/questions/127444/a-guide-to-regularization-strategies-in-regression
This talk covers many approaches to the problem of feature selection: https://www.youtube.com/watch?v=JsArBz46_3s&index=21&list=PLGVZCDnMOq0ovNxfxOqYcBcQOIny9Zvb-&t=0s
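If prototyping outside MATLAB is an option, here is a sketch of greedy forward selection with scikit-learn; the synthetic X and y are stand-ins for your 98-feature data:

from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Stand-in for the real data: 500 samples with 98 features.
X, y = make_classification(n_samples=500, n_features=98, random_state=0)

# Greedy forward selection fits on the order of featSize * k models for a
# budget of k features, instead of the 2^featSize models of exhaustive search.
selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=10,   # feature budget; tune by cross-validation
    direction="forward",
)
selector.fit(X, y)
print(selector.get_support(indices=True))  # indices of the selected features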
2^98 = 316.9e27 = 300 thousand million million million million. If you run a billion* loop iterations a second, it would take ten thousand million** years to run that loop. I don't think you can afford the electricity bill... :)
It is scary, isn't it, how quickly exponential things explode?
Luckily, you don't need to loop this often to visit all pairs of features. If you have 98 features, then you have 98^2 pairs, not 2^98. Actually, you have 98*97, if you don't want to pair a feature with itself, and 98*97/2 if the order doesn't matter.
You can write a double loop to visit each pair:
N = 98;
for ii = 1:N-1
    for jj = ii+1:N
        % do something with the pair [ii,jj]
    end
end
* A billion as in a million million -- not the US billion.
** 2^98 /1e12 /60 /60 /24 /365 == 10.049e+9 -- I didn't take leap years or leap seconds into account... :)
I think you are requesting the for loop to do 2^98 = 316,910,000,000,000,000,000,000,000,000 iterations, so you will need to reduce the number of iterations.
As others have noted, yes, you are using the for loop in a prohibitive way. It's unreasonable to ask any regular computer, much less a supercomputer, to run that many iterations of a loop. So that part of your question is answered.
As for developing another method of tackling this: I don't know much about machine learning (perhaps a bad thing to admit while attempting an answer), but regardless, you haven't provided enough information for us to help you there. Either way, you will need to drastically reduce the number of loop iterations for this to run at all, and to avoid the warning.

Solving approach for a series

I am having great trouble finding the solution to this series.
index   1  2  3   4   5
number  0  1  5  15  35
Say the first index is an exception; what is the rule for this series that takes an index and gives the number? Please add an explanation of your solving approach.
I would also like some extra examples of solving approaches for other series of this kind.
Approaches to solving a general series-matching problem vary a lot, depending on the information you have about the series. You can start by reading up on time series.
For this series you can easily google it and find that the numbers are binomial coefficients, of the form n!/((n-4)!·4!). In terms of the index i, it is (i+2)!/(4!·(i-2)!) = C(i+2, 4); note this gives 0 at i = 1, so the first index isn't really an exception.
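A quick check of that closed form (math.comb returns 0 when k > n, which is exactly why the first index needs no special-casing):

from math import comb

# f(i) = C(i+2, 4) reproduces the series, first index included.
print([comb(i + 2, 4) for i in range(1, 6)])  # [0, 1, 5, 15, 35]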

Calculating the best speedup

I have been going over a previous exam for my computer architecture course on which I got an incorrect answer. How could I calculate the best possible speedup?
I understand there's a limit to how much a program can be sped up; I'm just unsure of the formula (the problem is part b). Any help will be upvoted and very much appreciated, thanks!
(6 points) To accelerate an application, two enhancements with the following speedups are proposed:
  Speedup1 = 25
  Speedup2 = 15
Enhancement 1 is usable for 40% of the instructions and enhancement 2 is usable for 30% of the instructions. The two enhancements do not overlap.
a) What is the speedup if both enhancements are applied?
b) If you keep improving these two enhancements, what is the best speedup you can reach?
Rather than trying to memorize a formula, use common sense. Imagine that both portions of speed-up-able code could be sped up infinitely: that is, made to take no time at all. What would be left? How much time would it take?
Let t be the total run time. Then:
(a) Although you are not asking about this part, I am giving a full solution for future readers.
t' = modified run time = 0.4t / 25 + 0.3t / 15 + 0.3t = 0.336t,
Thus, speedup = t/t' = t / 0.336t ≈ 2.98
(b) The question asks you to keep improving THESE two enhancements, so you cannot improve the rest of the program. The best speedup you can get, according to Amdahl's law, is bounded by the sequential, un-improvable part: the maximum speedup is 1/SEQUENTIAL_PART. What is the sequential part in your case? Make sure you understand why.
The idea of Amdahl's law is that, even assuming you can give the improved parts infinite speedup, the total speedup is still bounded by the non-improved part.
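For readers who want to verify the arithmetic, both parts in a few lines (run time normalized to t = 1; plain arithmetic, no assumptions beyond the fractions given in the problem):

t = 1.0

# (a) both enhancements applied
t_both = 0.4 * t / 25 + 0.3 * t / 15 + 0.3 * t   # 0.336
print(t / t_both)                                # ~2.98

# (b) let both speedups go to infinity: only the untouched 30% remains
print(t / (0.3 * t))                             # ~3.33, the best reachable speedup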