LabVIEW Real-Time Timed Loop resolution - real-time

we are using LabVIEW Real-Time with the PXI-8110 Controller.
I am facing the following problem:
I have a loop with 500µs period time (time-loop) and no other task. I write the time each loop iteration into ram and then save the data afterwords.
It is necessary that the period is exact, but I see that it is 500µs with +/- 25 µs.
The clock for the timed loop is 1 MHz.
How is it possible to have 500µs - 25µs. I would understand if I get 500µs + xx µs when my compution is to heavy. But till now I just do an addition nothing more.
So does anyone have a clue what is going wrong?
I thought it would be possible to have resolution of 1µs as NI advertise (if the computation isn't so heavy).
Thanks.

You may need to check which thread the code is working in. An easier way to work is to use the Timed Loop as this will try and correct for overruns. Also pre-allocate the array that you are storing the data into and then replace array subset which each new value. You should see a massive improvement with this way.
If you display that value and are running in development mode you will see jitter +- time as you are reporting everything back to the host. Build the executable and again jitter will shrink.

Related

High precision millisecond timer in matlab

I'm trying to implement a high-precision, millisecond-timescale timer in matlab. Every T seconds, I want to query a camera linked to matlab, and if there is an image in the memory, I want to pull it out. The actual connection to the camera is straightforward - but a problem arises because images are coming in every ~60ms, and need to all be pulled off before another image enters the camera buffer. This essentially means that I need to be checking the camera buffer at least every ~30ms, and ideally every ~5ms.
While MATLAB's built in timer function ostensibly allows millisecond timing, it suffers from poor precision. While in >95% of executions the built in MATLAB timer will indeed pause for ~5ms between runs, in ~5% of cases it hovers around ~30ms, and in ~1% of cases it takes >100ms between executions, which unacceptably kills the performance. I should clarify in MATLAB's defense that simultaneously there are two other timers running (both with 1 s periods), as well as a number of figure windows open, so even though my machine is beefy (16-core, 64GB RAM), there is certainly a lot to be doing all at once. I have tried using timers based on .NET timers (System.Timers.Timer(period)) as well as with the Java sleep function (java.lang.Thread.sleep(period)), both of which should theoretically be more precise, and while both are better than the MATLAB timer (at the cost of being more unwieldy), none are able to consistently achieve <60ms execution delay across thousands of iterations.
Maybe I'm asking for something which is not implementable - but I hope that there is some way to implement a high-precision timer in MATLAB which will continue executing at a ms time-scale even when there are other figures/timers/commands being executed semi-simultaneously. I should maybe clarify that when running just a timer with no other timers/figures open I am able to consistently achieve <60ms execution (and really, consistent <10ms execution for a 5ms timer period). This is possible even when all those timers/figures are open in a different instance of MATLAB, so it seems the problem is to somehow separate the timer from the rest of MATLAB. Any advice or guidance would be appreciated in this regard.
Depending on what exactly you are doing, the timing system of Psychtoolbox may help you.
Specifically, check out the WaitSecs function and its documentation. It is supposed to be more precise than timer, and the documentation contains some tips about achieving high precision timing in general.
Also related is the GetSecs function.
It might however happen that switching to WaitSecs will not help you. In that case you can be quite sure that your machine is just too loaded to do what you are trying to do.

Possible way to speed up SUMO simulation

Hi all I am a new SUMO user. I am having simulation iteratively with DUAROUTER and SUMO. The simulation consist of 20000 trips in Singapore network and it's very slow, took one hour and more to complete one simulation.
Anyone knows any way to speed up the process? I need to do 50 iterations. 1 hour per iteration is too slow.
My commands are as follows:
duarouter --net-file sg_left_v1.net.xml --trip-files trips20000_merged.trips.xml --output-file 0.20000.route.xml --ignore-errors true --no-warnings true --repair true
sumo -c simulation_sg_20000.sumocfg --tripinfo-output 0.20000.trip.output.xml --no-warnings true --tripinfo-output.write-unfinished true --vehroute-output 0.20000.individual.output.xml --link-output 0.20000.link-state.output.xml
The number X in X.20000.something.xml is increased on each iteration by my python code.
Thank you all in advance.
There are different things you can do to speed up the process by analyzing the bottlenecks. I would do the following:
Check whether the traffic flow is smooth. If there are big jams piling up the simulation slows down.
Do the vehicles depart at the times you expect them too. Even is there no visible jam, the backlog slows the simulation down. A good indicator is that vehicles which have an intended departure time near the end of the simulation, take much longer to depart (it's also in the tripinfo).
Recheck whether you need all outputs. To get a feeling whether it helps disable them one by one and have a look at the running time.
3a. Extend SUMO to aggregate your data. It is open source after all, so if the outputs are the bottleneck, aggregate inside the simulation.
Think about parallel execution. Maybe you do not need to start the iterations one after another?
Make the scenario smaller.
To accelerate the simulation, you will need to pass a parameter to Sumo called step-length
which a ratio of sumoTime / realWorldtime.
sumo your-other-args-here --step-length 1
It should enable you the get the wanted result

How to run a multi-queue code using OpenCL?

For example, I'm doing some image processing work on every frame of a video.
Every frame's processing using 200ms including writing, processing and reading.
And the fps is 25, in that case every two frames' distance is 40ms. Then the processing is too slow to show continuous result.
So here is my idea, I use multi-queues for this work.
In CPU part,
while(video is not over)
{
1. read the frame0;
processing the **frame0** using **queue0**;
wait 40 ms;
2. read the frame1;
processing the frame1 using **queue1**;
wait 40 ms;
3.4.5.
...(after 5 frames(just about the 200ms's processing time))
6. download the **frame0**'s result.
7. read the frame5;
processing the frame5 using **queue0**;
wait 40 ms;
...
}
The code means that, I use different queues for reading and processing the same frame in a video.
However, my experiment result is faster, but just 2 times faster, but not in my imaginary speed.
Can anyone tell me how to deal it? THX!
Assuming you have one Device, here are some thoughts on this point:
Main reason to have multiple Command Queues (CQ) per single OpenCL Device is the ability to execute kernels & do IO operations simultaneously.
Usually one CQ is enought to load single Device at ~100%. Though, your multi-CQ idea is good (in my opinion), as you're constantly feeding GPU with workload.
Look at kernel execution time. May be, it's big enough, so that your Device is constantly executing kernels & can't go any faster.
I think, you don't need to wait for 40ms. Good solution is to process frames in queue, in which they are put to eliminate the difference between bitstream & display order.
If you have too many CQ, your OpenCL driver thread will be busy maintaining them, so that performance may decrease.

Why does Matlab run faster after a script is "warmed up"?

I have noticed that the first time I run a script, it takes considerably more time than the second and third time1. The "warm-up" is mentioned in this question without an explanation.
Why does the code run faster after it is "warmed up"?
I don't clear all between calls2, but the input parameters change for every function call. Does anyone know why this is?
1. I have my license locally, so it's not a problem related to license checking.
2. Actually, the behavior doesn't change if I clear all.
One reason why it would run faster after the first time is that many things are initialized once, and their results are cached and reused the next time. For example in the M-side, variables can be defined as persistent in functions that can be locked. This can also occur on the MEX-side of things.
In addition many dependencies are loaded after the first time and remain so in memory to be re-used. This include M-functions, OOP classes, Java classes, MEX-functions, and so on. This applies to both builtin and user-defined ones.
For example issue the following command before and after running your script for the first run, then compare:
[M,X,C] = inmem('-completenames')
Note that clear all does not necessarily clear all of the above, not to mention locked functions...
Finally let us not forget the role of the accelerator. Instead of interpreting the M-code every time a function is invoked, it gets compiled into machine code instructions during runtime. JIT compilation occurs only for the first invocation, so ideally the efficiency of running object code the following times will overcome the overhead of re-interpreting the program every time it runs.
Matlab is interpreted. If you don't warm up the code, you will be losing a lot of time due to interpretation instead of the actual algorithm. This can skew results of timings significantly.
Running the code at least once will enable Matlab to actually compile appropriate code segments.
Besides Matlab-specific reasons like JIT-compilation, modern CPUs have large caches, branch predictors, and so on. Warming these up is an issue for benchmarking even in assembly language.
Also, more importantly, modern CPUs usually idle at low clock speed, and only jump to full speed after several milliseconds of sustained load.
Intel's Turbo feature gets even more funky: when power and thermal limits allow, the CPU can run faster than its sustainable max frequency. So the first ~20 seconds to 1 minute of your benchmark may run faster than the rest of it, if you aren't careful to control for these factors.
Another issue not mensioned by Amro and Marc is memory (pre)allocation.
If your script does not pre-allocate its memory it's first run would be very slow due to memory allocation. Once it completed its first iteration all memory is allocated, so consecutive invokations of the script would be more efficient.
An illustrative example
for ii = 1:1000
vec(ii) = ii; %// vec grows inside loop the first time this code is executed only
end

Synchronise real-time workshop in matlab for grt target

I am trying to run a real-time simulation in Simulink using Real-time Workshop. The target is grt(I have tried rtwin, but my simulation refuses to compile for it). I need the simulation to run in real-time so that one second in simulation lasts one second of real time. Grt ignores realtime and finishes the simulation in shortest time possible. Is there any way to synchronise it?
I have tried http://www.mathworks.com/matlabcentral/fileexchange/3175 but could not get it to work(does not compile).
Thank you for any suggestions.
Looks like it is impossible. I was able to slow down the execution by using Sleep(time in ms) function from WinApi and clock function from time.h, which looked quite good for low sample rates. However, when I increased the sample rate the Sleep function was sleeping for too long, which resulted in errors, with one second in simulation lasting more than one real world second.
The idea was to say that one period of iteration should last, let's say 200ms. Then time how long it takes for one iteration of code to execute using the clock function. Then call Sleep(200 - u), where u is the length of the iteration. The problem is that Sleep function sleeps the process and wakes it up when it wants to, not when you tell it to in the argument.
I know this is not a solution, but post this so that if anyone faces the same problem as me they won't try this dead-end solution. I had to rewrite the simulation for rtwin and now it works fine.
Another idea would be to somehow use interrupts, but I guess it would be quite complicated and not worth the trouble.