Setup and Goal:
I want to solve a vehicle routing problem with time windows and optional stops. I am looking for a solution that serves as many stops as possible. There is no need to minimize arc costs, I am only interested in the number of serviced stops. For each stop I have defined a cost for not servicing it. I define the cost of an arc (i,j) to be the transit time in seconds because I my understanding is that this is useful in problems with time windows.
Problem:
The arc costs interfere with the penalties for dropping stops. The solver might drop a stop because getting there is too expensive. But I don't care about the cost of getting to a stop.
Possible Solutions:
Multiply the stop drop penalties by a large number so that is much more expensive to drop a stop than to travel between stops.
Remove the arc cost from the objective function but keep the arc cost available for the first solution strategy.
Question:
Is 2) possible in or-tools? What's the best practice for guiding the first solution strategy but not accumulating costs in doing so?
Related
I'm trying to implement a high-precision, millisecond-timescale timer in matlab. Every T seconds, I want to query a camera linked to matlab, and if there is an image in the memory, I want to pull it out. The actual connection to the camera is straightforward - but a problem arises because images are coming in every ~60ms, and need to all be pulled off before another image enters the camera buffer. This essentially means that I need to be checking the camera buffer at least every ~30ms, and ideally every ~5ms.
While MATLAB's built in timer function ostensibly allows millisecond timing, it suffers from poor precision. While in >95% of executions the built in MATLAB timer will indeed pause for ~5ms between runs, in ~5% of cases it hovers around ~30ms, and in ~1% of cases it takes >100ms between executions, which unacceptably kills the performance. I should clarify in MATLAB's defense that simultaneously there are two other timers running (both with 1 s periods), as well as a number of figure windows open, so even though my machine is beefy (16-core, 64GB RAM), there is certainly a lot to be doing all at once. I have tried using timers based on .NET timers (System.Timers.Timer(period)) as well as with the Java sleep function (java.lang.Thread.sleep(period)), both of which should theoretically be more precise, and while both are better than the MATLAB timer (at the cost of being more unwieldy), none are able to consistently achieve <60ms execution delay across thousands of iterations.
Maybe I'm asking for something which is not implementable - but I hope that there is some way to implement a high-precision timer in MATLAB which will continue executing at a ms time-scale even when there are other figures/timers/commands being executed semi-simultaneously. I should maybe clarify that when running just a timer with no other timers/figures open I am able to consistently achieve <60ms execution (and really, consistent <10ms execution for a 5ms timer period). This is possible even when all those timers/figures are open in a different instance of MATLAB, so it seems the problem is to somehow separate the timer from the rest of MATLAB. Any advice or guidance would be appreciated in this regard.
Depending on what exactly you are doing, the timing system of Psychtoolbox may help you.
Specifically, check out the WaitSecs function and its documentation. It is supposed to be more precise than timer, and the documentation contains some tips about achieving high precision timing in general.
Also related is the GetSecs function.
It might however happen that switching to WaitSecs will not help you. In that case you can be quite sure that your machine is just too loaded to do what you are trying to do.
So far I am using the routing package of ortools with the arc cost evaluator:
costCallbackIndex = model.registerTransitCallback(this::costCallback);
model.setArcCostEvaluatorOfAllVehicles(costCallbackIndex);
But I have realized that I am more interested in maximizing the number of pickups and deliveries (they are optional, using model.addDisjunction with a drop penalty) than the overall sum of meters travelled. Since I am dealing with a lot of pickups and deliveries I don't want to put the extra strain of minimizing arc costs on the solver.
One option is to define a arc cost callback that always returns 0, but this might confuse the solver. I could also not call the method setArcCostEvaluatorOfAllVehicles and hence suggest to the solver that I am not interested in arc costs.
What is the recommended approach?
Why don't you just refrain from using setArcCostEvaluatorOfAllVehicles ?
I am confused by the hybrid modelling paradigm in Modelica. On one hand, events are useful, on the other hand, they are to be avoided. Let me explain my case:
I have a large model consisting of multiple buildings in a neighborhood that is simulated over 1 year. Initially, the model ran very slow. Adding noEvent() around as many if-conditions as possible drastically improved the speed.
As the development continued, the control of the model got more complicated, and I have again many events, sometimes at very short intervals. To give an idea:
Number of (model) time events : 28170
Number of (U) time events : 0
Number of state events : 22572
Number of step events : 0
These events blow up the output (for correct post-processing I need the variables at events) and slows the simulation. And moreover, I have the feeling that some of the noEvent(if...) lead to unexpected behavior.
I wonder if it would be a solution to force my events at certain time steps and prohibit them in between these time steps? Ideally, I would like to trigger these 'forced events' based on certain conditions. For example: during the day they should be every 15 minutes, at high solar radiation at every minute, during nights I don't want events at all.
Is this a good idea to do? I guess this will be faster as many of the state events will become time events? How can this be done with Modelica 3.2 (in Dymola)?
Thanks on beforehand for all answers.
Roel
A few comments.
First, if you have a simulation with lots of events (relative to the total duration of the simulation), the first thing I would encourage you to do is use a lower order integrator. The point here is that higher-order integrators normally allow you to take longer time steps. But if those steps are constantly truncated by events, they just end up being really expensive.
Second, you could try fixed-step integrators. Depending on the tool, they may implement this kind of "pool events and fire them all at once" kind of approach in the context of fixed-time step integrators. But the specification doesn't really say anything on how tools should deal with events that occur between fixed time steps.
Third, another way to approach this would be to "pool" your events yourself. The simplest way I could imagine doing this would be to take all the statements that currently generate events and wrap them in a "when sample(...,...) then" statement. This way, you could make sure that the events were only triggered at specific intervals. This would be more portable then the fixed time step approach. I think this is what you were actually proposing in your question but it is important to point out that it should not be based on time steps (the model has no concept of a time step) but rather on a model specified sampling interval (which will, in practice, be completely independent of time steps).
As you point out, using "sample(...,...)" will turn these into time events and, yes, this should be faster.
My question is specific to iPhone, iPod, and iPad, since I am assuming that the architecture makes a big difference. I'm hoping there is either a specification somewhere (for the various chips perhaps), or a reliable way to measure T for each specific instruction. I know I can use any number of tools to measure aggregate processor time used, memory used, etc. I want to quantify at a lower level.
So, I'm able to figure out how many times I go through the main part of the algorithm. For example, I iterate n * (n-1) times in a naive implementation, and between n (best case) and n + n * (n-1) (worst case) in another. I can also make a reasonable count of the total number of instructions (+ - = % * /, and logic statements), and I can compare those counts, but that's assuming the weight of each operation is the same. Also, I don't have any idea how to weight the actual time value of a logic statement (if, else, for, while) vs a mathematical operator... is "if" as much work as "+" each time I use it? I would love to know where to find this information.
So, for clarity, my goal is to discover how much processor time I am demanding of the CPU (or GPU or any U) so that I can design an optimal algorithm around processor time. Can someone give me an idea of where to start for iOS hardware?
Edit: This link to ClockServices.c and SIMD stuff in the developer portal might be a good start for people interested in this. A few more cups of coffee tonight and I might get through it ;)
On a modern platform, processor time isn't the only limiting factor. Often, memory access is.
Still, processor time:
Your basic approach at an estimation for the processor load is OK, though, and is sensible: Make a rough estimate of the cost based on your knowledge of typical platforms.
In this article, Table 1 shows the times for typical primitive operations in .NET. While your platform may vary, the relative time is usually very similar. Maybe you can find - or even make - one for iStuff.
(I haven't come across one so thorough for other platforms, except processor / instruction set manuals, but they deal with assembly instructions)
memory locality:
A cache miss can cost you hundreds of cycles, a disk access a thousand times as much. So controlling your memory access patterns (i.e. reducing the working set, restructuring and accessing data in a cache-friendly way) is an important part of evaluating an algorithm.
xCode has instruments to measure performance of each function/operation, you can simply use them.
I'm trying to optimize the performance of my code, but I'm not familiar with xcode's debuggers or debuggers in general. Is it possible to track the execution time and frequency of calls being made at runtime?
Imagine a chain of events with some recursive calls over a fraction of a second. What's the best way to track where the CPU spends most of its time?
Many thanks.
Edit: Maybe this is better asked by saying, how do I use the xcode debug tools to do a stack trace?
You want to use the built-in performance tools called 'Instruments', check out Apples guide to Instruments. Specifically you probably want the System Instruments. There's also the Tuning Guide which could be useful to you and Shark.
Imagine a chain of events with some
recursive calls over a fraction of a
second. What's the best way to track
where the CPU spends most of its time?
Short version of previous answer.
Learn an IDE or debugger. Make sure it has a "pause" button or you can interrupt it when your program is running and taking too long.
If your code runs too quickly to be manually paused, wrap a temporary loop of 10 to 1000 times around it.
When you pause it, make a copy of the call stack, into some text editor. Repeat several times.
Your answer will be in those stacks. If the CPU is spending most of its time in a statement, that statement will be at the bottom of most of the stack samples. If there is some function call that causes most of the time to be used, that function call will be on most of the stacks. It doesn't matter if it's recursive - that just means it shows up more than once on a stack.
Don't think about measuring microseconds, or counting calls. Think about "percent of time active". That's what stack samples tell you, and that's roughly what you'll save if you fix it.
It's that simple.
BTW, when you fix that problem, you will get a speedup factor. Then, other issues in your code will be magnified by that factor, so they will be easier to find. This way, you can keep going until you've squeezed every cycle out of it.
The first thing I tell people is to recognize the difference between
1) timing routines and counting how many times they are called, and
2) finding code that you can fruitfully optimize.
For (1) there are instrumenting profilers.
To be really successful at (2) you need a rare type of profiler.
You need a sampling profiler that
samples the entire call stack, not just the program counter
samples at random wall clock times, not just CPU, so as to capture possible I/O problems
samples when you want it to (not when waiting for user input)
for output, gives you, for each line of code that appears on stack samples, the percent of samples containing that line. That is a direct measure of the total time that could be saved if that line were not there.
(I actually do it by hand, interrupting the program under the debugger.)
Don't get sidetracked by problems you don't have, such as
accuracy of measurement. If a line of code appears on 30% of call stack samples, it's actual cost could be anywhere in a range around 30%. If you can find a way to eliminate it or invoke it a lot less, you will save what it costs, even if you don't know in advance exactly what its cost is.
efficiency of sampling. Since you don't need accuracy of time measurement, you don't need a large number of samples. Even if you get a large number of samples, they don't skew the results significantly, because they don't fail to spot the costly lines of code.
call graphs. They make nice graphics, but are not what you need to know. An arc on a call graph corresponds to a line of code in the best case, usually multiple lines, so knowing cost of an arc only tells the cost of a line in the best case. Call graphs concentrate on functions, when what you need to find is lines of code. Call graphs get wrapped up in the issue of recursion, which is irrelevant.
It's important to understand what to expect. Many programmers, using traditional profilers, can get a 20% improvement, consider that terrific, count the profiler a winner, and stop there. Others, working with large programs, can often get speedup factors of 20 times.
This is done by fixing a series of problems, each one giving a multiplicative speedup factor. As soon as the profiler fails to find the next problem, the process stops. That's why "good enough" isn't good enough.
Here is a brief explanation of the method.