OpenCL doesn't support recursion. CUDA does, but only on devices of compute capability 2.0 or higher. An initial search suggested that RenderScript supports recursion, but I couldn't find anything explicit.
Does RenderScript support recursive function calls?
Yes it does. However, using this will limit a script to processors capable of recursion.
I'm looking to replace the C atan2 function with something more efficient. RenderScript does offer atan2, including versions that take vectors.
The examples I found demonstrate calling RenderScript from Java. Is it possible to call RS from C code? An example would be great.
Thanks
It used to be possible, though RS support in the NDK has been dropped for some time now. It may still work, but even the NDK samples no longer include RS examples. Starting with Android 7 you can try "Single Source RenderScript", described here, which is supposed to be usable from C/C++ code.
The efficiency gains you may see using RS are due to a few possible reasons (which are very platform dependent):
RS will parallelize operations over your data set. In some cases the function you are calling (such as atan2) may itself be parallelized.
Your RS code may be executed on a co-processor (such as a GPU or DSP).
The RS-provided intrinsics and library functions are highly optimized for the platform. Using atan2 as an example again, the function may be more optimized in the RS core than in the standard C library, as it could use a co-processor or an architecture-specific optimized implementation (assembly).
All of that being said, your code can take an I/O hit when moving data between RS space (an Allocation) and non-RS code; the sketch below illustrates that round trip.
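For illustration, here is a minimal sketch of that round trip from JVM code (Scala, but it calls the same android.renderscript API that the Java examples use). ScriptC_mykernel and forEach_root are hypothetical stand-ins for whatever class the RS compiler would generate from your script:

import android.content.Context
import android.renderscript.{Allocation, Element, RenderScript}

// Hypothetical: ScriptC_mykernel is the class the RS toolchain would
// generate for a script file named mykernel.rs containing a root kernel.
def runKernel(ctx: Context, values: Array[Float]): Array[Float] = {
  val rs = RenderScript.create(ctx)
  val input = Allocation.createSized(rs, Element.F32(rs), values.length)
  val output = Allocation.createSized(rs, Element.F32(rs), values.length)
  input.copyFrom(values)              // JVM -> Allocation: one side of the I/O hit
  val script = new ScriptC_mykernel(rs)
  script.forEach_root(input, output)  // the kernel may run on a CPU, GPU or DSP
  val results = new Array[Float](values.length)
  output.copyTo(results)              // Allocation -> JVM: the other side of the hit
  results
}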
I have found two examples; here is the one I got to build and run:
https://github.com/adhere/NDKCallRenderScriptDemo
I've been searching for documentation of the C++ API but haven't found it.
I found the following code in BLAS.scala:
// For level-1 routines, we use Java implementation.
private def f2jBLAS: NetlibBLAS = {
  // Lazily create the pure-Java (f2j-translated) BLAS on first use.
  if (_f2jBLAS == null) {
    _f2jBLAS = new F2jBLAS
  }
  _f2jBLAS
}
I think native BLAS is faster than a pure Java implementation. So why does Spark choose f2jBLAS for level-1 routines? Is there a reason I'm not aware of?
Thank you!
The answer is most likely to be found in the Performance section of the readme file of the netlib-java repository.
Java has a reputation with older generation developers because Java applications were slow in the 1990s. Nowadays, the JIT ensures that Java applications keep pace with – or exceed the performance of – C / C++ / Fortran applications.
This is followed by charts showing detailed benchmark results for various BLAS routines in both pure Java (translated from Fortran with f2j) and from native BLAS on both Linux on ARM and macOS on x86_64. The ddot benchmark shows that on x86 (JRE for ARM doesn't seem to have JIT capabilities) F2J performs on par with the reference native BLAS implementation for longer vector sizes and even outperforms it for shorter vector sizes. The caveat here is that the JIT kicks in after a couple of invocations, which is not a problem as most ML algorithms are iterative in nature. Most of the level 1 routines are fairly simple and the JIT compiler is able to generate well optimised code. This is also why the tuning efforts in highly optimised BLAS implementations go into the level 2 and 3 routines.
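To make the "fairly simple" point concrete, here is a sketch of what a level-1 routine such as ddot boils down to, next to the equivalent call through netlib-java's pure-Java F2jBLAS class:

import com.github.fommil.netlib.F2jBLAS

// A level-1 routine is just a tight loop over the vectors; this is
// essentially what the f2j-translated Fortran does, and the JIT compiles
// it to well-optimised machine code after a few warm-up invocations.
def ddot(n: Int, x: Array[Double], y: Array[Double]): Double = {
  var sum = 0.0
  var i = 0
  while (i < n) {
    sum += x(i) * y(i)
    i += 1
  }
  sum
}

val x = Array(1.0, 2.0, 3.0)
val y = Array(4.0, 5.0, 6.0)
println(ddot(3, x, y))                     // 32.0
println(new F2jBLAS().ddot(3, x, 1, y, 1)) // same result via netlib-java (unit strides)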
Since non-tail-recursive calls use stack frames, as in Java, I'd think you'd use them very sparingly, if at all. However, this seems severely restrictive, given that recursion is one of the most important tools.
When can I use non-tail-recursive functions? Also, are there plans to remove the memory restriction in the future?
In the same situations where it would be safe in Java: where the data set you are working with never grows huge, and the code isn't on a performance-critical hot path of your app.
Also, IMHO, there are times when the clarity of the non-tail-recursive version of an algorithm is far better than that of the tail-recursive version; see the sketch below.
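As an illustration of that clarity argument (a Scala sketch; the same point applies to any JVM language): a non-tail-recursive tree traversal reads exactly like the definition of the problem, while the tail-recursive rewrite needs an explicit worklist.

sealed trait Tree
case class Node(left: Tree, right: Tree) extends Tree
case object Leaf extends Tree

// Non-tail-recursive: one stack frame per level of depth, but the code
// mirrors the structure of the problem. Safe while the tree stays shallow.
def size(t: Tree): Int = t match {
  case Leaf       => 1
  case Node(l, r) => 1 + size(l) + size(r) // work remains after the calls
}

// Tail-recursive equivalent: constant stack space, but it needs an explicit
// list of pending subtrees and an accumulator, which obscures the idea.
@scala.annotation.tailrec
def sizeIter(pending: List[Tree], acc: Int = 0): Int = pending match {
  case Nil                => acc
  case Leaf :: rest       => sizeIter(rest, acc + 1)
  case Node(l, r) :: rest => sizeIter(l :: r :: rest, acc + 1)
}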
Hello everyone, please help!
What can I use to avoid the objc_msgSend function overhead?
Maybe the answer is IMP, but I'm not sure.
You could simply inline the function to avoid any function call overhead. Then it would be faster than even a C function! But before you start down this path: are you certain this level of optimisation is warranted? You are more likely to get a better payoff by optimising the algorithm.
The use of IMP is very rarely required. The method dispatching in Objective-C (especially in the 64-bit runtime) has been very heavily optimised, and exploits many tricks for speed.
What profiling have you done that tells you method dispatching is the cause of your performance issue? I suggest you first examine the algorithm to see where the most expensive operations are, and whether there is a more efficient way to implement it.
To answer your question, a quick search finds some directly relevant questions similar to yours right here on SO, with some great and detailed answers:
Objective-C optimization
Objective-C and use of SEL/IMP
My Scala application needs to perform simple operations over large arrays of integers and doubles, and performance is a bottleneck. I've struggled to put my finger on exactly when certain optimizations kick in (e.g. escape analysis), although I can observe their results through benchmarking. I'd love to do some AOT compilation of my Scala application so I can see, enforce, or implement certain optimizations, or compile to native code if possible, so I can cut corners like bounds checking and observe whether it makes a difference.
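For context, a hypothetical sketch of the kind of loop being described (not the asker's actual code); on HotSpot, simple counted loops like this are where optimizations such as bounds-check elimination typically apply:

// Hypothetical example of "simple operations over large arrays": the JIT
// can usually hoist or eliminate the array bounds checks in a counted
// loop of this shape, which is part of what an AOT or native build
// would need to match.
def axpy(a: Double, x: Array[Double], y: Array[Double]): Unit = {
  var i = 0
  while (i < x.length) {
    y(i) += a * x(i)
    i += 1
  }
}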
My question: what alternative compilation methods work for Scala? I'm interested in tools like LLVM, VMKit, Soot, GCJ, etc. Who is using these successfully with Scala at this point, or are none of them currently compatible or maintained?
GCJ can compile JVM classes to native code. This blog describes tests done with Scala code: http://lampblogs.epfl.ch/b2evolution/blogs/index.php/2006/10/02/scala_goes_native_almost?blog=7
To answer my own question, there is no alternative backend for Scala except for the JVM. The .NET backend has been in development for a long time, but its status is unclear. The LLVM backend is also not yet ready for use, and it's not clear what its future is.