GPU MATLAB gives different elapsed time between first and second execution

GPU MATLAB gives different elapsed time between first and second execution - matlab

When i execute my code using Matlab parallel toolbox it gives me two different time execution between first and second time.
In fact the first time is very slow (more than CPU version) however the second time is faster and logical, and subsequent runs are the same as the second time. Why does this happen?

That is correct, and expected.
When you call it for the first time, it needs to initialize the GPU ("turn it on" in some sense), set up the CUDA contexts, etc etc. The second time you run it the GPU is ready to take anything you throw at it.
On top of that, depends on how you wrote the code, maybe the first time it requires to move some data to the GPU, and perhaps in the second time the memory is already there.
Often doing gpuDevice(1) will initialize the context enough, but otherwise just throw a small matrix multiplication to it to initialize.
All this is somehow true for other parallel computing paradigms in MATLAB, e.g. if you want to use parfor you need to initialize the parallel pool or it will take very long the first time.

Related

matlab script node in Labview with different timing

I have a DAQ for Temperature measurment. I take a continuous sample rate and after DAQ, calculating temperature difference per minute (Cooling Rate: CR) during this process. This CR and temperature values are inserted into the Matlab script for a physical model running (predicting the temperature drop for next 30 sec). Then, I record and compare the predicted and experimental values in LabVIEW.
What i am trying to do is the matlab model is executing every 30 sec, and send out its predictions as an output from matlab script. One of this outputs helps me to change the Air Blower Motor Speed until next matlab run( eventually affect the temperature drop for next 30 sec as well, which becomes a closed loop). After 30 sec while main process is still running, sending CR and temperature values to matlab model again, and so on.
I have a case structure for this Matlab script. And inside of case structure i applied an elapsed time function to control the timing for the matlab script, but this is not working.

Yes. Short answer: I believe (one of) the reasons the program behaves weird on changed timing are several race conditions present in the code.
The part of the diagram presented shows several big problems with the code:
Local variables lead to race conditions. Use dataflow. E.g. you are writing to Tinitial local variable, and reading from Tinitial local varaible in the chunk of code with no data dependencies. It is not known whether reading or writing will happen first. It may not manifest itself badly with small delays, while big delays may be an issue. Solution: rewrite you program using the following example:
From Bad:
To Good:
(nevermind broken wires)
Matlab script node executes in the main UI execution system. If it is executing for a long time, it may freeze indicators/controls as well as execution of other pieces of code. Change execution system of other VIs in your program (say to "other 1") and see if the situation improves.

Show progress of a Matlab instruction execution

Is there any way to show the execution progress (even a rough estimation) of a Matlab instruction?
For example let's say I am computing distances using pdist:
D = pdist(my_matrix,'cosine');
and this computation takes hours, does Matlab provide any way to show periodically the execution progress?

Not intrinsically. You can of course do after-the-fact checking with the profiler or tic/toc.
If this is something you will be doing a lot for a single function, you could consider modifying the function and saving it in your path with a new name (I have a directory called "Modified Builtin" for just this purpose). In the case of pdist.m, you might save pdist_updates.m. Looking a the function, the actual distances are calculated starting around line 250 with a series of nested loops. Add in a line like:
disp(sprintf('Processing pair %d of %d',i,n-1));
at line 265. If you really wanted to get fancy, you could use tic and toc to time each loop and provide an estimate of how long the entire calculation would take so you would know how long you had to run to the coffee machine.
Of course this will cause problems if you end up cancelling your Statistics Toolbox license or if Mathworks upgrades the toolbox and changes the functionality, so use this approach sparingly.

Faster way to run simulink simulation repeatedly for a large number of time

I want to run a simulation which includes SimEvent blocks (thus only Normal option is available for sim run) for a large number of times, like at least 1000. When I use sim it compiles the program every time and I wonder if there is any other solution which just run the simulation repeatedly in a faster way. I disabled Rebuild option from Configuration Parameter and it does make it faster but still takes ages to run for around 100 times.
And single simulation time is not long at all.
Thank you!

It's difficult to say why the model compiles every time without actually seeing the model and what's inside it. However, the Parallel Computing Toolbox provides you with the ability to distribute the iterations of your model across several cores, or even several machines (with the MATLAB Distributed Computing Server). See Run Parallel Simulations in the documentation for more details.

Why does Matlab run faster after a script is "warmed up"?

I have noticed that the first time I run a script, it takes considerably more time than the second and third time1. The "warm-up" is mentioned in this question without an explanation.
Why does the code run faster after it is "warmed up"?
I don't clear all between calls2, but the input parameters change for every function call. Does anyone know why this is?
1. I have my license locally, so it's not a problem related to license checking.
2. Actually, the behavior doesn't change if I clear all.

One reason why it would run faster after the first time is that many things are initialized once, and their results are cached and reused the next time. For example in the M-side, variables can be defined as persistent in functions that can be locked. This can also occur on the MEX-side of things.
In addition many dependencies are loaded after the first time and remain so in memory to be re-used. This include M-functions, OOP classes, Java classes, MEX-functions, and so on. This applies to both builtin and user-defined ones.
For example issue the following command before and after running your script for the first run, then compare:
[M,X,C] = inmem('-completenames')
Note that clear all does not necessarily clear all of the above, not to mention locked functions...
Finally let us not forget the role of the accelerator. Instead of interpreting the M-code every time a function is invoked, it gets compiled into machine code instructions during runtime. JIT compilation occurs only for the first invocation, so ideally the efficiency of running object code the following times will overcome the overhead of re-interpreting the program every time it runs.

Matlab is interpreted. If you don't warm up the code, you will be losing a lot of time due to interpretation instead of the actual algorithm. This can skew results of timings significantly.
Running the code at least once will enable Matlab to actually compile appropriate code segments.

Besides Matlab-specific reasons like JIT-compilation, modern CPUs have large caches, branch predictors, and so on. Warming these up is an issue for benchmarking even in assembly language.
Also, more importantly, modern CPUs usually idle at low clock speed, and only jump to full speed after several milliseconds of sustained load.
Intel's Turbo feature gets even more funky: when power and thermal limits allow, the CPU can run faster than its sustainable max frequency. So the first ~20 seconds to 1 minute of your benchmark may run faster than the rest of it, if you aren't careful to control for these factors.

Another issue not mensioned by Amro and Marc is memory (pre)allocation.
If your script does not pre-allocate its memory it's first run would be very slow due to memory allocation. Once it completed its first iteration all memory is allocated, so consecutive invokations of the script would be more efficient.
An illustrative example
for ii = 1:1000
vec(ii) = ii; %// vec grows inside loop the first time this code is executed only
end

Synchronise real-time workshop in matlab for grt target

I am trying to run a real-time simulation in Simulink using Real-time Workshop. The target is grt(I have tried rtwin, but my simulation refuses to compile for it). I need the simulation to run in real-time so that one second in simulation lasts one second of real time. Grt ignores realtime and finishes the simulation in shortest time possible. Is there any way to synchronise it?
I have tried http://www.mathworks.com/matlabcentral/fileexchange/3175 but could not get it to work(does not compile).
Thank you for any suggestions.

Looks like it is impossible. I was able to slow down the execution by using Sleep(time in ms) function from WinApi and clock function from time.h, which looked quite good for low sample rates. However, when I increased the sample rate the Sleep function was sleeping for too long, which resulted in errors, with one second in simulation lasting more than one real world second.
The idea was to say that one period of iteration should last, let's say 200ms. Then time how long it takes for one iteration of code to execute using the clock function. Then call Sleep(200 - u), where u is the length of the iteration. The problem is that Sleep function sleeps the process and wakes it up when it wants to, not when you tell it to in the argument.
I know this is not a solution, but post this so that if anyone faces the same problem as me they won't try this dead-end solution. I had to rewrite the simulation for rtwin and now it works fine.
Another idea would be to somehow use interrupts, but I guess it would be quite complicated and not worth the trouble.