Cython callback causing memory corruption/segfaults

I am interfacing Python with a C++ library using Cython. I need a callback function that the C++ code can call, and I also need to pass a reference to a specific Python object to this function. This is all quite clear from the callback demo.
However, I get various errors when the callback is called from a C++ thread (pthread). The sequence is:
1. Pass function pointer and class/object (as void*) to C++
2. Store the pointers within C++
3. Start new thread (pthread) running a loop
4. Call function using the stored function pointer and pass back the class pointer (void*)
5. In Python: cast void* back to class/object
6. Call a method of above class/object (Error)
Steps 2 and 3 are in C++.
The errors are mostly segmentation faults, but sometimes I get complaints about some low-level Python calls.
I have the exact same code where I create a thread in Python and call the callback directly. This works OK, so the callback itself is working.
I do need a separate thread running in C++, since this thread is communicating with hardware and on occasion calling my callback.
I have also triple-checked all the pointers that are being passed around. They point to valid locations.
I suspect there are some problems when using Cython classes from a C++ thread?
I am using Python 2.6.6 on Ubuntu.
So my question is:
Can I manipulate Python objects from a non-Python thread?
If not, is there a way I can make the thread (pthread) Python-compatible?
This is the minimal callback that already causes problems when called from a C++ thread:
cdef int CheckCollision(float* Angles, void* user_data):
    self = <CollisionDetector>user_data
    return self.__sizeof__()  # <====== Error

No, you must not manipulate Python objects without first acquiring the GIL. From a thread that Python did not create, call PyGILState_Ensure() before touching any Python object and PyGILState_Release() afterwards (see the PyGILState_Ensure documentation).
You can also make sure that your object is not deleted by Python by explicitly taking a reference with Py_INCREF() and releasing it with Py_DECREF() when you no longer use it. If you pass the callback to your C++ code without holding a reference, your Python object may already have been freed when the callback fires, and the crash happens (see the Py_INCREF documentation).
Disclaimer: I'm not saying this is your issue, just giving you tips to track down the bug :)
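As a minimal sketch of the two points above, the following uses Python's ctypes instead of Cython and assumes CPython 3 on Linux (where pthread_t is an unsigned long and pthread_create is reachable through the process's own symbols). The CollisionDetector class and its check method are hypothetical stand-ins. A ctypes callback wrapper acquires the GIL on entry, which is the job PyGILState_Ensure() does in hand-written C; in Cython, declaring the callback `with gil` should serve the same purpose.

```python
import ctypes

# Assumption: Linux + CPython; dlopen(NULL) exposes pthread_create/pthread_join.
libc = ctypes.CDLL(None)

# C signature of a thread entry point: void *(*start_routine)(void *)
START_ROUTINE = ctypes.CFUNCTYPE(ctypes.c_void_p, ctypes.c_void_p)

libc.pthread_create.argtypes = [
    ctypes.POINTER(ctypes.c_ulong),  # pthread_t * (unsigned long on Linux)
    ctypes.c_void_p,                 # const pthread_attr_t * (NULL = defaults)
    START_ROUTINE,                   # callback
    ctypes.c_void_p,                 # user_data
]
libc.pthread_create.restype = ctypes.c_int
libc.pthread_join.argtypes = [ctypes.c_ulong, ctypes.c_void_p]
libc.pthread_join.restype = ctypes.c_int

# Function forms of Py_INCREF / Py_DECREF, callable through ctypes.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None
ctypes.pythonapi.Py_DecRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_DecRef.restype = None


class CollisionDetector(object):  # hypothetical stand-in
    def check(self):
        return 42


results = []


@START_ROUTINE
def thread_main(user_data):
    # ctypes has already taken the GIL for this foreign thread, so it is
    # safe to turn the void* back into a Python object and call a method.
    detector = ctypes.cast(user_data, ctypes.py_object).value
    results.append(detector.check())
    return None  # NULL


def run_in_native_thread(detector):
    ctypes.pythonapi.Py_IncRef(detector)      # keep the object alive for the thread
    try:
        tid = ctypes.c_ulong()
        # id(obj) is the object's address in CPython.
        rc = libc.pthread_create(ctypes.byref(tid), None, thread_main, id(detector))
        if rc != 0:
            raise OSError(rc, "pthread_create failed")
        libc.pthread_join(tid, None)          # ctypes drops the GIL while waiting
    finally:
        ctypes.pythonapi.Py_DecRef(detector)  # give the reference back


run_in_native_thread(CollisionDetector())
print(results)
```

The explicit Py_IncRef/Py_DecRef pair is redundant here because the local variable already keeps the object alive; it is shown only to mirror what C++ code that stores the pointer for longer would need to do.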

Related

release build variable corruption when using ne10 math library assembly function

Has anyone experienced the following issue?
A stack variable gets changed/corrupted after calling an Ne10 assembly function such as ne10_len_vec2f_neon.
e.g.
float gain = 8.0;
ne10_len_vec2f_neon(src, dst, len);
After the call to ne10_len_vec2f_neon, the value of gain changes because its memory is getting corrupted.
1. Note this only happens when the project is compiled as a release build, not as a debug build.
2. Do Ne10 assembly functions preserve registers?
3. Replacing the assembly function call with the C equivalent, such as ne10_len_vec2f_c, makes both release and debug builds work OK.
Thanks for any help on this. I'm not sure whether there's an inherent issue within the program or whether the call to ne10_len_vec2f_neon really is causing the corruption in the release build.
I had a quick rummage through the master NEON code here:
https://github.com/projectNe10/Ne10/blob/master/modules/math/NE10_len.neon.s
... and it doesn't really touch the address-based stack at all, so I'm not sure it's a stack problem in memory.
However, based on what I remember of the NEON procedure call standard, q4-q7 (alias d8-d15 or s16-s31) should be preserved by the callee, and as far as I can tell that code is clobbering q4-q6 without the necessary save/restore, so it does indeed look like it's clobbering values that the caller keeps in registers rather than in stack memory.
In the failed case do you know if gain is still stored in FPU registers, and if yes which ones? If it's stored in any of s16/17/18/19 then this looks like the problem. It also seems plausible that a compiler would choose to use s16 upwards for things it needs to keep across a function call, as it avoids the need to touch in-RAM stack memory.
In terms of a fix, if you perform the following replacements:
s/q4/q8/
s/q5/q9/
s/q6/q10/
in that file, then I think it should work. I have no means to test it here, but those higher registers are not callee-saved.
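The three substitutions can also be applied in a single pass. This is only a sketch: it assumes the file does not already use q8-q10 for something else and does not refer to q4-q6 through their d or s aliases, neither of which has been checked against the Ne10 source. The function name rename_registers and the sample lines are made up for illustration.

```python
import re

# Renames proposed in the answer: callee-saved q4-q6 -> q8-q10.
RENAMES = {"q4": "q8", "q5": "q9", "q6": "q10"}

# \b restricts the match to whole register names.
_PATTERN = re.compile(r"\b(q4|q5|q6)\b")


def rename_registers(asm_text):
    """Return asm_text with q4/q5/q6 replaced by q8/q9/q10 in one pass."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], asm_text)


# Made-up instructions, not taken from NE10_len.neon.s.
sample = "vmul.f32 q4, q0, q0\nvadd.f32 q5, q4, q6\n"
print(rename_registers(sample))
```

Doing the replacement in one pass avoids any chance of a renamed register being renamed a second time by a later substitution.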

wrap function without dlsym

How can I write a shared library that:
1. wraps a system function (say malloc),
2. internally uses the real version of the wrapped functions (e.g., malloc defined in libc), AND
3. can be linked from client code without giving --wrap=malloc every time it is used?
I learned from several posts that I can wrap system functions with --wrap option of ld; something like this:
#include <stddef.h>
void *__real_malloc(size_t sz);  /* resolved by ld because of --wrap=malloc */
void * __wrap_malloc(size_t sz) {
    return __real_malloc(sz);
}
and get a shared library with:
gcc -O0 -g -Wl,--wrap=malloc -shared -fPIC m.c -o libwrapmalloc.so
But when client code links against this library, it needs to pass --wrap=malloc every time. I want to hide this from the client code, as the library I am working on actually wraps tons of system functions.
An approach I was using was to define malloc and find the real malloc in libc using dlopen and dlsym. This was nearly what I needed, but, as someone posted before in "Function interposition in Linux without dlsym", dlsym and dlopen internally call memory-allocation functions (calloc, as I witnessed), so we cannot easily override calloc/malloc with this approach.
I recently learned about --wrap and thought it was neat, but I do not want to ask clients to give tons of --wrap=xxxx arguments every time they build executables...
I want to have a situation in which malloc in the client code calls malloc defined in my shared library whereas malloc in my shared library calls malloc in libc.
If this is impossible, I would like to reduce the burden of the clients to give lots of --wrap=... arguments correctly.

Not working python breakpoints in C thread in pycharm or eclipse+pydev

I have a Django application using a C++ library (imported via SWIG).
The C++ library launches its own thread, which calls callbacks in Python code.
I cannot set a breakpoint in that Python code, in either PyDev or PyCharm.
I also tried the 'gevent compatibility' option, with no luck.
I verified the callbacks are properly called, as logging.info dumps what is expected. Breakpoints set in other threads work fine. So it seems that Python debuggers cannot manage breakpoints in Python code called by threads created in non-Python code.
Does anyone know a workaround? Maybe there is some 'magic' thread initialization sequence I could use?
You have to set up the debugger machinery yourself for it to work on non-Python threads. This is done automatically when a Python thread is created, but when you create a thread for which Python doesn't have any creation hook, you have to do it yourself. Note that for some frameworks, such as QThread/Gevent, things are monkey patched so that we know about the initialization and start the debugger, but for other frameworks you have to do it yourself.
To do that, after starting the thread you have to call:
import pydevd
pydevd.settrace(suspend=False, trace_only_current_thread=True)
Note that if you had put suspend=True, it'd simulate a manual breakpoint and would stop at that point of the code.
This is a follow-up to @fabio-zadrozny's answer.
Here is a mixin I've created that my class (which gets callbacks from a C-thread) inherits from.
import pydevd

class TracingMixing(object):
    """The callbacks in the FUSE Filesystem are C threads and breakpoints don't work normally.
    This mixin adds callbacks to every function call so that we can breakpoint them."""
    def __call__(self, op, path, *args):
        pydevd.settrace(suspend=False, trace_only_current_thread=True, patch_multiprocessing=True)
        return getattr(self, op)(path, *args)

Haxe: define a function/macro which fires when an object goes out of scope?

Is it possible in Haxe to have the compiler automatically insert a function call / code segment at the point where an object instance goes out of scope? I have object instances that require manual cleanup beyond what garbage collection does (for the JS target).
More Info
I'm experimenting with allocating small data structures in JavaScript code manually inside a virtual heap (an ArrayBuffer), similar to what is done with compiled asm.js programs. I am using Haxe in part because I can create abstract types as convenient aliases/abstractions for their underlying data allocated in the heap (always some manner of ArrayBufferView), while suffering no runtime overhead from the abstraction.
The only issue is that deallocation must be done manually. It's simple enough to resort to calling a destructor function manually within code, but I find this error-prone and messy. I was hoping Haxe would have some mechanism I could use to automate the insertion of these function calls whenever a variable went out of scope, in a deterministic, compile-time manner.

Scriptable Plugin, Javascript returns undefined

I'm trying to write a scriptable plugin, using Mozilla's example below as my guide and also looking at FireBreath to see how it wraps the code. I am getting stuck on the return value to JavaScript.
Mozilla scriptable example
When JavaScript calls my function, Allocate, HasProperty, HasMethod, and Invoke all get called. I return the result in Invoke, and the JavaScript variable is either undefined or the browser crashes when modifying the result.
STRINGZ_TO_NPVARIANT(_strdup("Hello World"), *result);
STRINGZ_TO_NPVARIANT is actually a bit dangerous; when you put a string into an NPVariant object you give ownership of that memory to the browser. However, if you didn't allocate that memory with NPN_MemAlloc things may explode when it tries to release that memory (possibly the source of your crash).
Look at what STRINGZ_TO_NPVARIANT is actually doing and don't use it 'til you understand how it works; until then, you may try performing the steps by hand so you have a better understanding. Allocate memory using NPN_MemAlloc and then strcpy your string to it. I bet this fixes your problem; after you've got it figured out build your own inline functions or whatnot to clean up the code again.