Is the argout paradigm good practice in RenderScript? - renderscript

Reflection classes in RenderScript contain functions that execute the kernels. These functions follow the out-argument paradigm: one of their arguments is an Allocation in which the output is stored.
Is there a reason this is better practice than returning the output Allocation? (Should I follow suit and use out arguments in my RenderScript-related functions?)
For example, I have implemented the following helper class which wraps ScriptC_gradient and computes the gradient of a Bitmap. It can infer from the input Allocation what type the output Allocation should have, and thus hide the boilerplate necessary to set up the destination Allocation. Is there a reason to prefer one implementation of compute() over the other?
import android.renderscript.Allocation;
import android.renderscript.Element;
import android.renderscript.RenderScript;
import android.renderscript.Type;

public class Gradient {
    private RenderScript mRS;
    private ScriptC_gradient mScript;

    public Gradient(RenderScript RS) {
        mRS = RS;
        mScript = new ScriptC_gradient(mRS);
    }

    /* Out-argument implementation
     *
     * This closely mirrors RenderScript's kernel functions, but
     * it requires the caller to write boilerplate to set up the
     * destination Allocation.
     */
    public void compute(Allocation elevation, Allocation gradient) {
        mScript.invoke_setDimensions(elevation);
        mScript.forEach_root(elevation, gradient);
    }

    /* Allocation-returning implementation
     *
     * This hides the boilerplate.
     */
    public Allocation compute(Allocation elevation) {
        Allocation gradient = Allocation.createTyped(mRS,
                new Type.Builder(mRS, Element.F32_2(mRS))
                        .setX(elevation.getType().getX())
                        .setY(elevation.getType().getY())
                        .create(),
                Allocation.USAGE_SCRIPT);
        mScript.invoke_setDimensions(elevation);
        mScript.forEach_root(elevation, gradient);
        return gradient;
    }
}

Yes, the reason to prefer passing in the output Allocation is memory reuse. Creating allocations is expensive and should not be done more often than necessary.
The second method would also cause problems for "tiling", where you do multiple kernel launches that each fill in part of one output allocation. Because the output would be reallocated on every call, the previous contents would be lost (or would have to be copied).
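The same trade-off exists outside RenderScript. As a rough analogy only, here is a plain C++ sketch in which a std::vector stands in for an Allocation. The names fill_tile, fill_whole and run_tiled are hypothetical and are not part of any RenderScript API. It shows how an out argument lets several calls each fill part of one caller-owned buffer, while a convenience function that allocates and returns can still be layered on top.

```cpp
#include <cstddef>
#include <vector>

// Out-argument style: the caller owns `out` and can reuse it across calls.
// Fills out[begin, end) with twice the corresponding input value.
void fill_tile(const std::vector<float>& in, std::vector<float>& out,
               std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        out[i] = 2.0f * in[i];
    }
}

// Returning style: allocates a new buffer on every call.
// Implemented on top of the out-argument version.
std::vector<float> fill_whole(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    fill_tile(in, out, 0, in.size());
    return out;
}

// Two "launches" that each fill half of one shared output buffer.
std::vector<float> run_tiled(const std::vector<float>& in) {
    std::vector<float> out(in.size());        // allocated once
    const std::size_t mid = in.size() / 2;
    fill_tile(in, out, 0, mid);               // first tile
    fill_tile(in, out, mid, in.size());       // second tile keeps the first
    return out;
}
```

With only the returning style available, the second tile would land in a fresh buffer and the first tile would have to be copied across.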

Make the code shorter in Unity3D with Vector3 and float value in 1 axis

This is my code:
transform.localPosition = new Vector3(transform.localPosition.x, Mathf.PingPong(Time.time * movementSpeed, movementRange), transform.localPosition.z);
Can you help me shorten it?
Like
transform.localPosition = Vector3.up * Mathf.PingPong(Time.time * movementSpeed, movementRange)
I constantly run into Vector3 expressions of this form, but I can't find a better way to write this code.
This is a perfect use case for extension methods:
public static class Vector3Extensions {
    public static Vector3 SetY(this Vector3 vector, float y) {
        //This works as-is because structs, such as Vector_ in Unity, are pass-by-value
        vector.y = y;
        return vector;
    }
}
Usage:
//You have to assign the result of the method because, again, pass-by-value...
//..So the vector that was modified inside the method is a different object from the original
transform.localPosition = transform.localPosition.SetY(Mathf.PingPong(Time.time * movementSpeed, movementRange));
Your second example is functionally different from the first. In it, X and Z will be zero instead of inheriting the original transform's values. If that is your intention, then your example's code is already about as short as it gets without another extension method on the Transform class, i.e. transform.SetLocalPositionY(y).
If you need to keep the X and Z values, and using SetY would still be too verbose for your liking, then use a Transform extension method as mentioned above:
public static class TransformExtensions {
    public static void SetLocalPositionY(this Transform transform, float y) {
        transform.localPosition = transform.localPosition.SetY(y);
    }
}
Usage:
//Assignment not needed in this case, as transform is a class...
//..and assignment of localPosition is handled inside the extension method.
transform.SetLocalPositionY(Mathf.PingPong(Time.time * movementSpeed, movementRange));
There is a slight performance cost to extension methods on struct types, such as the ones used here, because structs are passed by value. It is negligible in this specific case, and should be negligible for structs in general if you are using structs correctly.
On pass-by-reference types, the cost should be no different from that of any standard static method modifying the data.
Note that the operation does not need to be a single expression. If you're willing to spread it across a few lines, you can apply the same principle as the extension method (modify a cached copy of the original vector, then assign it back to the transform) with shorter lines and without the extra struct instance from passing a parameter and then returning it (one instance creation instead of two):
var localPos = transform.localPosition; //Creates temp/cached instance
localPos.y = Mathf.PingPong(Time.time * movementSpeed, movementRange); //Modification
transform.localPosition = localPos; //(Re)Assignment.
But except in cases requiring extreme performance, the extension method will be fine. In fact, unless you have verified an actual problem in the debugger, doing it this way purely out of performance concern is premature optimization.

Protobuf without serialization

As the title suggests, I was wondering whether it makes sense to use Protobuf when there is currently no requirement to serialize the data in any form (this might change in the future). I mean using the messages purely as data structures to pass information from one function to another, all executed in the same address space. Or do you feel it would be overkill, and do you see other alternatives?
Background:
I have to design a library that implements certain interfaces. At the moment, my colleagues have implemented it using several functions taking arguments.
Example:
void readA(int iIP1, int iIP2, Result& oOP);
void readB(std::string iIP1, Result& oOP);
void readC(std::vector<int> iIP1, Result& oOP);
I want to change this and provide just one interface function:
void ReadFn(ReadMsg& ip, ReadResult& res);
And the data structures are defined in Protobuf as below:
message ReadMsg {
    enum ReadWhat {
        A = 0;
        B = 1;
        C = 2;
    }
    message readA {
        int32 iIP1 = 1;
        int32 iIP2 = 2;
    }
    message readB {
        string IP1 = 1;
    }
    message readC {
        repeated int32 IP1 = 1;
    }
    oneof actRead {
        readA rA = 1;
        readB rB = 2;
        readC rC = 3;
    }
}
It offers many advantages over traditional interface design (using functions), with very little effort on my side. And it will be future-proof should these components be deployed as services in different processes or machines (of course with additional implementation). But given that Protocol Buffers' strength is their serialization features, which I do not make use of at the moment, would you choose to use them for such trivial tasks?
Thank you
It can make sense to group function arguments into a struct if there are many of them. And it can make sense to combine your readA, readB and readC functions into a single function if they share a lot of common parts.
What doesn't, however, make sense in my opinion is introducing a separate .proto file and a protobuf dependency if you are not going to use it for serialization. Similar features for grouping data into reusable structures already exist in most languages. And when you use the built-in features of the language, all the code remains in the same place and is easier to understand.
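To make the "built-in features" option concrete, here is a minimal C++17 sketch that uses std::variant in the role of the protobuf oneof. It is only one possible shape, not the asker's actual design: the struct names, the field names, and the ReadResult definition (which the question does not show) are assumptions made for illustration.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Plain structs mirroring the three request kinds from the question.
struct ReadA { int ip1; int ip2; };
struct ReadB { std::string ip1; };
struct ReadC { std::vector<int> ip1; };

// Plays the role of the protobuf `oneof`: exactly one alternative is active.
using ReadMsg = std::variant<ReadA, ReadB, ReadC>;

// Hypothetical result type; the question does not show its definition.
struct ReadResult { std::string description; };

// Single entry point that dispatches on the active alternative.
void ReadFn(const ReadMsg& msg, ReadResult& res) {
    std::visit([&res](const auto& m) {
        using T = std::decay_t<decltype(m)>;
        if constexpr (std::is_same_v<T, ReadA>) {
            res.description = "A:" + std::to_string(m.ip1 + m.ip2);
        } else if constexpr (std::is_same_v<T, ReadB>) {
            res.description = "B:" + m.ip1;
        } else {
            res.description = "C:" + std::to_string(m.ip1.size());
        }
    }, msg);
}
```

If serialization becomes a requirement later, structs like these can be mapped to generated protobuf messages at the process boundary, so the in-process interface does not have to depend on protobuf from the start.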

What operations are unsafe before __libc_init_array is invoked?

I want to run some code before main begins, and before constructors for static variables run. I can do this with code like the following (ideone):
extern "C" {
    static void do_my_pre_init(void) {
        // something
    }
    __attribute__ ((section (".preinit_array"))) void(*p_init)(void) = &do_my_pre_init;
}
Are there any language features that will not work correctly when executed in this function, due to _init and .init_array not yet having been executed?
Or is it only user code that should be hooking into this mechanism?
Some background on __libc_init_array
The source for a typical __libc_init_array is something like:
static void __libc_init_array() {
    size_t count, i;

    count = __preinit_array_end - __preinit_array_start;
    for (i = 0; i < count; i++)
        __preinit_array_start[i]();

    _init();

    count = __init_array_end - __init_array_start;
    for (i = 0; i < count; i++)
        __init_array_start[i]();
}
Where the __... symbols come from a linker script containing
. = ALIGN(4);
__preinit_array_start = .;
KEEP (*(.preinit_array))
__preinit_array_end = .;
. = ALIGN(4);
__init_array_start = .;
KEEP (*(SORT(.init_array.*)))
KEEP (*(.init_array))
__init_array_end = .;
Are there any language features that will not work correctly when executed in this function, due to _init and .init_array not yet having been executed?
This question is impossible to answer in general, because the language itself has no concept of .preinit_array, or _init, or .init_array. All of these concepts are implementation details for a particular system.
Strictly speaking, you aren't guaranteed that anything will work at all. Things as simple as malloc may not work (e.g. because the malloc subsystem itself may be using .preinit_array to initialize itself).
In practice, with dynamic linking on a GLIBC-based platform, almost everything will work (because libc.so.6 initializes itself long before the first instruction of the main executable runs).
For a fully static executable, all bets are off.
For a non-GLIBC platform, you'll need to look into the specifics of that platform (and you are very unlikely to find any guarantees).
Update:
Can I make function calls,
Function calls need no setup with fully static linking, and with dynamic linking they need the dynamic loader to have initialized. No dynamic loader will start executing code in the application before it has fully initialized itself, so function calls should be safe.
assign structs
In C, at best, this is a few instructions. At worst, this is a call to memcpy or memset. That should be safe.
use array initializers.
This is just a special case of struct assignment, so should be safe.

Using boost::python::handle as temporary?

In a custom converter, I am checking whether a sequence item is some type. So far I've had this code (simplified)
namespace bp=boost::python;
/* ... */
static void* convertible(PyObject* seq_ptr){
    if(!PySequence_Check(seq_ptr)) return 0;
    for(int i=0; i<PySequence_Size(seq_ptr); i++)
        if(!bp::extract<double>(PySequence_GetItem(seq_ptr,i)).check()) return 0;
    /* ... */
}
/* ... */
but this is leaking memory, since PySequence_GetItem is returning a new reference. So either I can do something like this in the loop:
PyObject* it=PySequence_GetItem(seq_ptr,i);
bool ok=bp::extract<double>(it).check();
Py_DECREF(it); // will delete the object which had been newly created
if(!ok) return 0;
but that is quite clumsy; I could make a stand-alone function doing that, but that is where I recalled bp::handle implementing the ref-counting machinery; so something like this might do:
if(!bp::extract<double>(bp::handle<>(PySequence_GetItem(seq_ptr,i))).check()) return 0;
but this page mentions using handles as temporaries is discouraged. Why? Can the object be destroyed before .check() is actually called? Is there some other elegant way to write this?
The object will not be destroyed before the .check() is called and is safe in the posted context.
The recommendation to not use temporaries is due to the unspecified order of evaluation of the arguments and exception safety. If there is only one order in which arguments can be evaluated, such as in your example, then it is safe. For instance, consider function bad() which always throws an exception:
f(boost::python::handle<>(PySequence_GetItem(...)), bad());
If bad() gets evaluated between PySequence_GetItem(...) and boost::python::handle<>(...), then the new reference will be leaked as the stack will begin to unwind before the construction of boost::python::handle<>. On the other hand, when a non-temporary is used, there is no chance for something to throw between PySequence_GetItem() and boost::python::handle<>(), so the following is safe in the presence of exceptions:
boost::python::handle<> item_handle(PySequence_GetItem(...));
f(item_handle, bad());
Consider reading Herb Sutter's GotW #56: Exception-Safe Function Calls for more details.

Storing and Using State in a GUI Application

I'm writing an iPhone App, and I'm finding that as I add features, predictably, the permutations of state increase dramatically.
I then find myself having to add code all over the place of the form:
If this and that and not the other then do x and y and set state z
Does anybody have suggestions for systematic approaches to deal with this?
Even though my app is iPhone, I think this applies to many GUI cases.
In general, a user interface application is always waiting for an event to happen. The event can be an action by the user (tap, shake iPhone, type letter on virtual keyboard), or by another process (network packet becomes available, battery runs out), or a time event (a timer expires). Whenever an event takes place ("if this"), you consult the current state of your application ("... and that and not the other") and then do something ("do x and y"), which most likely changes the application state ("set state z"). This is what you described in your question. And this is a general pattern.
There is no single systematic approach that gets this right, but since you ask for suggestions, here are some:
HINT 1: Use as few data structures and variables as possible to represent the internal state, and avoid duplicating state wherever you can (until you run into performance issues). This makes the "do x and y and set state z" part shorter, because the state gets set implicitly. Trivial example: instead of having (examples in C++)
if (namelen < 20) { name.append(c); namelen++; }
use
if (name.size() < 20) { name.append(c); }
The second example correctly avoids the replicated state variable 'namelen', making the action part shorter.
HINT 2: Whenever a compound condition (X and Y or Z) appears many times in your program, abstract it away into a procedure, so instead of
if ((x && y) || z) { ... }
write
bool my_condition() { return (x && y) || z; }
if (my_condition()) { ... }
HINT 3: If your user interface has a small number of clearly defined states, and the states affect how events are handled, you can represent the states as singleton instances of classes which inherit from an interface for handling those events. For example:
class UIState {
public:
    virtual ~UIState() {}
    virtual void HandleShake() = 0;
};
class MainScreen : public UIState {
public:
    void HandleShake() { ... }
};
class HelpScreen : public UIState {
public:
    void HandleShake() { ... }
};
Create one instance of every derived class, and keep a pointer that points to the current state object:
UIState *current;
UIState *mainscreen = new MainScreen();
UIState *helpscreen = new HelpScreen();
current = mainscreen;
To handle shake then, call:
current->HandleShake();
To change UI state later:
current = helpscreen;
In this way, you can collect state-related procedures into classes, and encapsulate and abstract them away. Of course, you can add all kinds of interesting things into these state-specific (singleton) classes.
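One of those "interesting things" is letting each state decide the next state itself. The following is a self-contained sketch under assumptions: the App struct, its log field, the extra App& parameter on HandleShake, and the specific transitions (shake toggles between the two screens) are invented for illustration and are not part of the original design.

```cpp
#include <string>

class UIState;

// Hypothetical context object holding the pointer to the current state.
struct App {
    UIState* current = nullptr;
    std::string log;   // records what happened, for illustration only
};

class UIState {
public:
    virtual ~UIState() = default;
    virtual void HandleShake(App& app) = 0;
};

class MainScreen : public UIState {
public:
    void HandleShake(App& app) override;
};

class HelpScreen : public UIState {
public:
    void HandleShake(App& app) override;
};

// One shared instance per state.
MainScreen mainscreen;
HelpScreen helpscreen;

// Assumed behaviour: shaking on the main screen opens help ...
void MainScreen::HandleShake(App& app) {
    app.log += "main->help;";
    app.current = &helpscreen;
}

// ... and shaking on the help screen goes back to the main screen.
void HelpScreen::HandleShake(App& app) {
    app.log += "help->main;";
    app.current = &mainscreen;
}
```

The event-dispatching code stays a single line, app.current->HandleShake(app), and the "if this and that then set state z" logic lives inside the state that it concerns.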
HINT 4: In general, if you have N boolean state variables and T different events that can be triggered, there are T * 2**N entries in the "matrix" of all possible events in all possible conditions. It requires your architectural view and domain expertise to correctly identify those dimensions and areas in the matrix which are most logical and natural to encapsulate into objects, and how. And that's what software engineering is about. But if you try to do your project without proper encapsulation and abstraction, you can't scale it far.
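To get a feel for how quickly that matrix grows, here is a small worked calculation as a sketch. The function name handler_matrix_entries is hypothetical, and the numbers are example inputs rather than measurements from any real app.

```cpp
#include <cstdint>

// Number of (event, state) combinations for `events` event types and
// `flags` independent boolean state variables: events * 2^flags.
constexpr std::uint64_t handler_matrix_entries(std::uint64_t events,
                                               unsigned flags) {
    return events * (std::uint64_t{1} << flags);
}

// 5 event types and 4 boolean flags already give 5 * 16 = 80 cases.
static_assert(handler_matrix_entries(5, 4) == 80, "5 events x 16 states");
```

Each additional boolean flag doubles the count, which is why folding flags into a few well-defined states (as in HINT 3) pays off early.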