Flutter compute function takes time to start executing - flutter

I am trying to use Flutter's compute function to do some real-time, heavy image processing using C++ code and dart:ffi.
I tried wrapping the call to the heavy function in compute to avoid blocking the UI thread, and I took some time measurements to see what takes the most time to execute.
The code looks like this:
double _work(CheckPhotoData p) {
  DateTime s = DateTime.now();
  Pointer<Double> rPointer = Pointer.fromAddress(p.rPointerAddress);
  Pointer<Double> gPointer = Pointer.fromAddress(p.gPointerAddress);
  Pointer<Double> bPointer = Pointer.fromAddress(p.bPointerAddress);
  final a = NativeCCode.checkPhoto(rPointer, gPointer, bPointer, p.w, 1);
  print("ACTUAL NativeCCode.checkPhoto took: " + DateTime.now().difference(s).inMilliseconds.toString());
  return a;
}
class CheckPhotoWrapper {
  static Future<double> checkPhotoWrapper(Uint8List photo) async {
    final CheckPhotoData deconstructData = _deconstructData(photo);
    DateTime s = DateTime.now();
    double res = await compute(_work, deconstructData);
    print("compute took: " + DateTime.now().difference(s).inMilliseconds.toString());
    return res;
  }
  ...
}
After running the code I got this output:
ACTUAL NativeCCode.checkPhoto took: 106
compute took: 514
(this means that compute took 408ms more than the code it runs)
From what I understand from these results, the compute call itself takes much more time than the code it executes, causing a big overhead that hurts performance.
Even worse, my app's UI freezes when the processing starts.
Is there a way to reduce the overhead that compute introduces, or a different approach to this issue that I couldn't figure out?
Thanks for any idea or solution to my problem.
Notes:
I ran the test in debug mode on a physical device.
CheckPhotoData is a simple class containing the parameters to my _work function.
I am using Flutter version 2.2.3, stable channel.

The overhead seems to be caused by debug mode. I saw a similar compute delay of several hundred milliseconds in my app (using Flutter 2.10.2), but when running in release mode it's less than 10 milliseconds.
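Beyond switching to release mode, a different option (just a sketch, not part of the answer above) is to avoid paying the isolate spawn cost on every call: compute() spawns a fresh isolate each time, so for repeated real-time work you can keep one worker isolate alive and send it jobs over ports. The PhotoWorker and _workerMain names below are hypothetical, and the sketch assumes CheckPhotoData holds only primitive fields (the pointer addresses and the width) so it can be sent between isolates, and that _work is the top-level function shown earlier.
import 'dart:async';
import 'dart:isolate';

// Worker isolate entry point: receives [replyPort, CheckPhotoData] messages
// and answers each one with the result of the existing _work() function.
void _workerMain(SendPort toMain) {
  final inbox = ReceivePort();
  toMain.send(inbox.sendPort); // hand our SendPort back to the main isolate
  inbox.listen((dynamic message) {
    final SendPort replyTo = message[0] as SendPort;
    final CheckPhotoData data = message[1] as CheckPhotoData;
    replyTo.send(_work(data));
  });
}

class PhotoWorker {
  late SendPort _toWorker;

  // Spawn the isolate once, e.g. at app startup.
  Future<void> start() async {
    final fromWorker = ReceivePort();
    await Isolate.spawn(_workerMain, fromWorker.sendPort);
    _toWorker = await fromWorker.first as SendPort;
  }

  // Each call reuses the running isolate instead of spawning a new one.
  Future<double> checkPhoto(CheckPhotoData data) async {
    final reply = ReceivePort();
    _toWorker.send([reply.sendPort, data]);
    final result = await reply.first as double;
    reply.close();
    return result;
  }
}
The worker would be started once (for example from main() or initState), after which each checkPhoto call only pays the message-passing cost rather than a full isolate startup.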

Related

Slowdown on tensorflow convolutional network with custom parameter update

I'm trying to implement a custom parameter update on a convolutional network, but every mini-batch takes longer and longer to execute.
I realize that there's no need to go through this trouble with a fixed learning rate, but I plan to update this later.
I call this in a loop where the feed_dict is the mini_batch.
sess.run(layered_optimizer(cost,.1,1),feed_dict = feed_dict)
where
def layered_optimizer(cost, base_rate, rate_multiplier):
    gradients = tf.gradients(cost, [*weights, *biases])
    print(gradients)
    # update parameters based on gradients: var = var - gradient * base_rate * multiplier
    for i in range(len(weights)-1):
        weights[i].assign(tf.subtract(weights[i], tf.multiply(gradients[i], base_rate * rate_multiplier)))
        biases[i].assign(tf.subtract(biases[i], tf.multiply(gradients[len(weights)+i], base_rate * rate_multiplier)))
    return(cost)
I'm not sure if this has to do with the problem, but after trying to run the code a second time I get the following errors and have to restart.
could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
could not destroy cudnn handle: CUDNN_STATUS_BAD_PARAM
Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
What happens is that every time this gets called
gradients = tf.gradients(cost, [*weights, *biases])
a new set of gradient ops is added to the graph, taking up unnecessary memory and making each step slower.
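A minimal sketch of the fix that diagnosis implies (assuming TF 1.x graph mode, with weights, biases and cost already defined; build_layered_optimizer and train_op are names made up here): build the gradient and assign ops once, before the training loop, then run the same ops for every mini-batch.
import tensorflow as tf

def build_layered_optimizer(cost, base_rate, rate_multiplier):
    params = [*weights, *biases]
    grads = tf.gradients(cost, params)          # gradient ops created once
    updates = [tf.assign(var, var - grad * base_rate * rate_multiplier)
               for var, grad in zip(params, grads)]
    return tf.group(*updates)                   # single op that applies every update

train_op = build_layered_optimizer(cost, 0.1, 1)    # build the graph before the loop

# inside the training loop:
# _, batch_cost = sess.run([train_op, cost], feed_dict=feed_dict)
Note also that in the original code the tensors returned by assign() are never fetched in sess.run, so the variables would not actually be updated; grouping the updates into a single train_op and fetching that op addresses both issues.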

using protractor for performance testing

I am having a terrible time trying to get decent front-end timing numbers using Protractor. I have tried using protractor-perf, but the timings from that don't seem to really reflect the reality of the page load time. It says that the "Program" metric is the total time, however I am seeing it report timings much faster than what you actually see when running the test manually.
I have also tried creating my own timer, and that is proving very difficult based on the controlFlow and all the promises.
Has anyone done any performance testing with Protractor? Is there any good guidance to follow when trying to get timings? Has anyone successfully implemented a timer?
You can use your own timers; you just have to insert them into the control flow right before and after the functions you are trying to measure:
var startTime;
browser.controlFlow().execute(function() {
  startTime = Date.now();
});
element(by.css('#startThing')).click();
element(by.css('#endThing')).getText();
browser.controlFlow().execute(function() {
  var endTime = Date.now();
  var elapsed = endTime - startTime;
  console.log('clicking the startThing until getText of the endThing = ' + elapsed + 'ms');
});

Unity3d Parse FindAsync method freezes UI

I'm running a simple Parse FindAsync method, as shown below (on Unity3d):
Task queryTask = query.FindAsync();
Debug.Log("Start");
Thread.Sleep(5000);
Debug.Log("Middle");
while (!queryTask.IsCompleted) {
    Debug.Log("Waiting");
    Thread.Sleep(1);
}
Debug.Log("Finished");
I'm running this method on a separate thread, and I put a loading circle on the UI. The loading circle freezes (for roughly a second) somewhere in the middle of the Thread.Sleep call. It looks like when FindAsync finishes processing, it freezes the UI until it completes its job. Is there anything I could do?
PS: This works perfectly in the editor; the problem is on Android devices.
PS2: I'm running Parse 1.4.1.
PS3: I already tried the ContinueWith method, but the same problem happens.
IEnumerator RunSomeLongLastingTask () {
    Task queryTask = query.FindAsync();
    Debug.Log("Start");
    //Thread.Sleep(5000); //Replace with the call below
    yield return new WaitForSeconds(5); //Try this
    Debug.Log("Middle");
    while (!queryTask.IsCompleted) {
        Debug.Log("Waiting");
        //Thread.Sleep(1);
        yield return new WaitForSeconds(0.001f);
    }
    Debug.Log("Finished");
}
To call this function, use:
StartCoroutine(RunSomeLongLastingTask());
Making the thread sleep might not be a good idea, mainly because the number of threads available is different on each device.
Unity has a built-in scheduler that uses coroutines, so it is better to use it.
IEnumerator RunSomeLongLastingTask()
{
    Task queryTask = query.FindAsync();
    while (!queryTask.IsCompleted)
    {
        Debug.Log("Waiting"); // consider removing this log because it also impacts performance
        yield return null; // wait until next frame
    }
}
Now, one possible issue is that if your task takes too much CPU, the UI will still not be responsive. If possible, try to give this task a lower priority.

How to ignore the 60fps limit in javafx?

I need to create a 100 fps animation that displays 3D data from a file containing 100 frames per second, but the AnimationTimer in JavaFX only gives me 60 fps. How can I get around this?
Removing the JavaFX Frame Rate Cap
You can remove the 60fps JavaFX frame rate cap by setting a system property, e.g.,
java -Djavafx.animation.fullspeed=true MyApp
This is an undocumented and unsupported setting.
Removing the JavaFX frame rate cap may make your application considerably less efficient in terms of resource usage (e.g. a JavaFX application without a frame rate cap will consume more CPU than an application with the frame rate cap in place).
Configuring the JavaFX Frame Rate Cap
Additionally, there is another undocumented system property you could try:
javafx.animation.framerate
I have not tried it.
Debugging JavaFX Frames (Pulses)
There are other settings like -Djavafx.pulseLogger=true which you could enable to help you debug the JavaFX architecture and validate that your application is actually running at the framerate you expect.
JavaFX 8 has a Pulse Logger (-Djavafx.pulseLogger=true system property) that "prints out a lot of crap" (in a good way) about the JavaFX engine's execution. There is a lot of provided information on a per-pulse basis including pulse number (auto-incremented integer), pulse duration, and time since last pulse. The information also includes thread details and events details. This data allows a developer to see what is taking most of the time.
Warning
Normal warnings for using undocumented features apply, as Richard Bair from the JavaFX team notes:
Just a word of caution, if we haven't documented the command line switches, they're fair game for removal / modification in subsequent releases :-)
Fullspeed=true will give you a high frame rate without control (and thereby decrease the performance of your app, as it renders too much); javafx.animation.framerate indeed doesn't work.
Use:
-Djavafx.animation.pulse=value
You can check your frame rate with the following code. I checked that it actually works by setting the pulse rate to 2, 60 and 120 (I have a 240 Hz monitor), and you can see a difference in how fast the random number changes.
private final long[] frameTimes = new long[100];
private int frameTimeIndex = 0;
private boolean arrayFilled = false;

Label label = new Label();
root.getChildren().add(label);

AnimationTimer frameRateMeter = new AnimationTimer() {
    @Override
    public void handle(long now) {
        long oldFrameTime = frameTimes[frameTimeIndex];
        frameTimes[frameTimeIndex] = now;
        frameTimeIndex = (frameTimeIndex + 1) % frameTimes.length;
        if (frameTimeIndex == 0) {
            arrayFilled = true;
        }
        if (arrayFilled) {
            long elapsedNanos = now - oldFrameTime;
            long elapsedNanosPerFrame = elapsedNanos / frameTimes.length;
            double frameRate = 1_000_000_000.0 / elapsedNanosPerFrame;
            label.setText(String.format("Current frame rate: %.3f", frameRate) + ", Random number: " + Math.random());
        }
    }
};
frameRateMeter.start();

iPhone: different system timers?

I have been using mach_absolute_time() for all my timing functions so far, calculating how long passes between frames, etc.
I now want to get the exact time touch input events happen, using event.timestamp in the touch callbacks.
The problem is that these two seem to use completely different timers. Sure, you can get them both in seconds, but their origins are different and seemingly random.
Is there any way to sync the two different timers?
Or is there any way to get access to the same timer that the touch input uses to generate that timestamp property? Otherwise it's next to useless.
Had some trouble with this myself. There isn't a lot of good documentation, so I went with experimentation. Here's what I was able to determine:
mach_absolute_time depends on the processor of the device. It returns ticks since the device was last rebooted (otherwise known as uptime). To get it into a human-readable form, you have to scale it by the ratio returned from mach_timebase_info, which converts the ticks to billionths of a second (nanoseconds). To make this more usable I use a function like the one below:
#include <mach/mach_time.h>

int getUptimeInMilliseconds()
{
    static const int64_t kOneMillion = 1000 * 1000;
    static mach_timebase_info_data_t s_timebase_info;

    if (s_timebase_info.denom == 0) {
        (void) mach_timebase_info(&s_timebase_info);
    }

    // mach_absolute_time() * numer / denom gives nanoseconds,
    // so divide by one million to get milliseconds
    return (int)((mach_absolute_time() * s_timebase_info.numer) / (kOneMillion * s_timebase_info.denom));
}
Get the initial difference between the two: record what mach_absolute_time() returns when your application starts, and grab event.timestamp at the same moment.
Store that difference; it stays the same for as long as your application runs, so you can use it to convert one timeline to the other.
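A rough sketch of that idea (the calibrateClocks and touchTimeToMachSeconds names and the g_offsetSeconds variable are made up here; getUptimeInMilliseconds is the helper defined above): capture both clocks once, keep the offset, and apply it to later touch timestamps.
#include <mach/mach_time.h>

static double g_offsetSeconds = 0.0;  // (mach uptime) minus (touch-timestamp clock), captured once

// Call this once, passing event.timestamp from the first touch you receive.
void calibrateClocks(double touchTimestampSeconds)
{
    double machSeconds = getUptimeInMilliseconds() / 1000.0;
    g_offsetSeconds = machSeconds - touchTimestampSeconds;
}

// Convert a later event.timestamp onto the mach_absolute_time()-based timeline.
double touchTimeToMachSeconds(double touchTimestampSeconds)
{
    return touchTimestampSeconds + g_offsetSeconds;
}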
How about CFAbsoluteTimeGetCurrent?