In pybind11-wrapped c++, called from python, I can do this:
py::array_t<double> t1 = py::array_t<double>(3);
But: if I do that in a separate thread, it crashes with a segmentation fault (actually it seems to crash when t1 goes out of scope or gets destructed).
I can fix this by doing
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();
pybind11::array_t<double> t1 = pybind11::array_t<double>(3)
PyGILState_Release(gstate);
So obviously there is some GIL-dependent stuff in pybind11::array_t. Is it necessarily so? Do I have to acquire the GIL in order to instantiate it?
Yes, GIL is required.
When you create a python object you are interacting with python interpreter. At some point of the process you need to acquire GIL to do it right.
That said, creation of pybind11::array_t might have expensive part which doesn't require GIL and could be done in another thread. For example you may allocate and initialize raw data and pass it to pybind11::array_t constructor to quickly create object with GIL.
Related
The Apple doc says that
If a property marked with the lazy modifier is accessed by multiple
threads simultaneously and the property has not yet been initialized,
there is no guarantee that the property will be initialized only once.
My question is what are the potential repercussions of a property getting initialized more than once?
And in case of a property getting initialized more than once, which one of it will be used? How Swift manages them?
I went through some of the answers.
Is it normal that lazy var property is initialized twice?
But they are just saying that lazy properties can get initialized more than once. I want to know what are the repercussions of this.
Thanks in advance.
(See my comment to rmaddy's answer regarding my concern about thread-safety on writing the pointer itself. My gut is that memory corruption is not possible, but that object duplication is. But I can't prove so far from the documentation that memory corruption isn't possible.)
Object duplication is a major concern IMO if the lazy var has reference semantics. Two racing threads can get different instances:
Thread 1 begins to initialize (object A)
Thread 2 begins to initialize (object B)
Thread 1 assigns A to var and returns A to caller
Thread 2 assigns B to var and returns B to caller
This means that thread 1 and thread 2 have different instances. That definitely could be a problem if they are expecting to have the same instance. If the type has value semantics, then this shouldn't matter (that being the point of value semantics). But if it has reference semantics, then this very likely be a problem.
IMO, lazy should always be avoided if multi-threaded callers are possible. It throws uncertainty into what thread the object construction will occur on, and the last thing you want in a thread-safe object is uncertainty about what thread code will run on.
Personally I've rarely seen good use cases for lazy except for where you need to pass self in the initializer of one of your own properties. (Even then, I typically use ! types rather than lazy.) In this way, lazy is really just a kludgy work-around a Swift init headache that I wish we could solve another way, and do away with lazy, which IMO has the same "doesn't quite deliver what it promises, and so you probably have to write your own version anyway" problem as #atomic in ObjC.
The concept of "lazy initialization" is only useful if the type in question is both very expensive to construct, and unlikely to ever be used. If the variable is actually used at some point, it's slower and has less deterministic performance to make it lazy, plus it forces you to make it var when it is most often readonly.
The answer completely depends on the code you have inside the implementation of the lazy property. The biggest problem would arise from any side effects you've put in the code since they might be called more than once.
If all you do is create a self-contained object, initialize it, and return it, then there won't be any issues.
But if also do things like add a view, update an array or other data structure, or modify other properties, then you have an issue if the lazy variable is created more than once since all of those side effects will happen more than once. You end up adding two views or adding two objects to the array, etc.
Ensure that the code in your lazy property only creates and initializes an object and does not perform any other operations. If you do that, then your code won't cause any issues if the lazy property gets created multiple times from multiple threads.
I want to know what is difference between perform selector in backgorund and detachNewThread
They Are identical. as you can see in Documentation section Click Here
performSelectorInBackground:withObject: The effect of calling this method is the same as if you called the detachNewThreadSelector:toTarget:withObject: method of NSThread with the current object, selector, and parameter object as parameters.
performSelectorInBackground:withObject: is easier way rather than NSThread.
However, NSThread can control its priority, stacksize, etc. If you'd like to customize the behavior, I recommend NSThread instead of performSelectorInBackground:withObject:.
I would look at it from a semantic point of view. There is no technical reason to use one or the other.
Use NSThread if you actually "think" of having a thread that "does something"; in particular, it will probably be the most appropiate way of creating a thread if your thread runs some form of event- or messaging loop. In such a case, the "thread object" is really just that; in many cases it's not an "application realm" object with actual application data, as these will be handed over to the thread in some way.
Use the NSObject-based methods if your thread is merely meant to run some single operation in the background. You don't really care about this being a "thread", and the object that you run this on is likely to be the "application realm" object with the data; there's no event- or messageloop to feed it commands from other threads.
Thus, I would base the decision on abstract factors, as in "what looks better in the given context". Having an NSThread "feels" like a more detached entity that is willing to offer services to multiple clients, whereas the NSObject method feels like it's closely attached to the data object that it runs with, and doesn't really deal with anything else unless it's vital to the cause.
When implementing algorithms and other things while trying to maintain reusability and separation patterns, I regularly get stuck in situations like this:
I communicate back and forth with an delegate while traversing a big graph of objects. My concern is how much all this messaging hurts, or how much I must care about Objective-C messaging overhead.
The alternative is to not separate anything and always put the individual code right into the algorithm like for example this graph traverser. But this would be nasty to maintain later and is not reusable.
So: Just to get an idea of how bad it really is: How many Objective-C messages can be sent in one second, on an iPhone 4?
Sure I could write a test but I don't want to get it biased by making every message increment a variable.
There's not really a constant number to be had. What if the phone is checking email in the background, or you have a background thread doing IO work?
The approach to take with things like this is, just do the simple thing first. Call delegates as you would, and see if performance is OK.
If it's not, then figure out how to improve things. If messaging is the overhead you could replace it with a plan C function call.
Taking the question implicitly to be "at what point do you sacrifice good design patterns for speed?", I'll add that you can eliminate many of the Objective-C costs while keeping most of the benefits of good design.
Objective-C's dynamic dispatch consults a table of some sort to map Objective-C selectors to the C-level implementations of those methods. It then performs the C function call, or drops back onto one of the backup mechanisms (eg, forwarding targets) and/or ends up throwing an exception if no such call exists. In situations where you've effectively got:
int c = 1000000;
while(c--)
{
[delegate something]; // one dynamic dispatch per loop iteration
}
(which is ridiculously artificial, but you get the point), you can instead perform:
int c = 1000000;
IMP methodToCall = [delegate methodForSelector:#selector(something)];
while(c--)
{
methodToCall(delegate, #selector(something));
// one C function call per loop iteration, and
// delegate probably doesn't know the difference
}
What you've done there is taken the dynamic part of the dispatch — the C function lookup — outside the inner loop. So you've lost many dynamic benefits. 'delegate' can't method swizzle during the loop, with the side effect that you've potentially broken key-value observing, and all of the backup mechanisms won't work. But what you've managed to do is pull the dynamic stuff out of the loop.
Since it's ugly and defeats many of the Objective-C mechanisms, I'd consider this bad practice in the general case. The main place I'd recommend it is when you have a tightly constrained class or set of classes hidden somewhere behind the facade pattern (so, you know in advance exactly who will communicate with whom and under what circumstances) and you're able to prove definitively that dynamic dispatch is costing you significantly.
For full details of the inner workings at the C level, see the Objective-C Runtime Reference. You can then cross-check that against the NSObject class reference to see where convenience methods are provided for getting some bits of information (such as the IMP I use in the example).
I'm dealing with neural networks here, but it's safe to ignore that, as the real question has to deal with blocks in objective-c. Here is my issue. I found a way to convert a neural network into a big block that can be executed all at once. However, it goes really, really slow, relative to activating the network. This seems a bit counterintuitive.
If I gave you a group of nested functions like
CGFloat answer = sin(cos(gaussian(1.5*x + 2.5*y)) + (.3*d + bias))
//or in block notation
^(CGFloat x, CGFloat y, CGFloat d, CGFloat bias) {
return sin(cos(gaussian(1.5*x + 2.5*y)) + (.3*d + bias));
};
In theory, running that function multiple times should be easier/quicker than looping through a bunch of connections, and setting nodes active/inactive, etc, all of which essentially calculate this same function in the end.
However, when I create a block (see thread: how to create function at runtime) and run this code, it is slow as all hell for any moderately sized network.
Now, what I don't quite understand is:
When you copy a block, what exactly are you copying?
Let's say, I copy a block twice, copy1 and copy2. If I call copy1 and copy2 on the same thread, is the same function called? I don't understand exactly what the docs mean for block copies: Apple Block Docs
Now if I make that copy again, copy1 and copy2, but instead, I call the copies on separate threads, now how do the functions behave? Will this cause some sort of slowdown, as each thread attempts to access the same block?
When you copy a block, what exactly
are you copying?
You are copying any state the block has captured. If that block captures no state -- which that block appears not to -- then the copy should be "free" in that the block will be a constant (similar to how #"" works).
Let's say, I copy a block twice, copy1
and copy2. If I call copy1 and copy2
on the same thread, is the same
function called? I don't understand
exactly what the docs mean for block
copies: Apple Block Docs
When a block is copied, the code of the block is never copied. Only the captured state. So, yes, you'll be executing the exact same set of instructions.
Now if I make that copy again, copy1
and copy2, but instead, I call the
copies on separate threads, now how do
the functions behave? Will this cause
some sort of slowdown, as each thread
attempts to access the same block?
The data captured within a block is not protected from multi-threaded access in any way so, no, there would be no slowdown (but there will be all the concurrency synchronization fun you might imagine).
Have you tried sampling the app to see what is consuming the CPU cycles? Also, given where you are going with this, you might want to become acquainted with your friendly local disassembler (otool -TtVv binary/or/.o/file) as it can be quite helpful in determining how costly a block copy really is.
If you are sampling and seeing lots of time in the block itself, then that is just your computation consuming lots of CPU time. If the block were to consume CPU during the copy, you would see the consumption in a copy helper.
Try creating a source file that contains a bunch of different kinds of blocks; with parameters, without, with captured state, without, with captured blocks with/without captured state, etc.. and a function that calls Block_copy() on each.
Disassemble that and you'll gain a deep understanding on what happens when blocks are copied. Personally, I find x86_64 assembly to be easier to read than ARM. (This all sounds like good blog fodder -- I should write it up).
I am trying to determine if an object is valid. The program has (at least) two threads and one of the threads might invalidate the object by removing it from an NSMutableArray. I need the other thread to check either its existence or validity before acting on it.
You can't. The only way to check if the memory your object pointer has still represents a valid object is to dereference it, but dereferencing an "invalid" object (by which I assume you mean one that has been dealloced) will result in either accessing the memory of a new object that has been allocated in the same location, garbage data that may or may not be identical to a normal object, or an unmapped memory page that will result in an immediate EXEC_BAD_ACCESS.
Any time you are holding a reference to an object you might use in the future you must retain it. If you don't you have not shown any interest or ownership in the object and the system may throw it away at any time.
Using objective C accessors and properties instead of directly setting ivars and using retain/release simplifies doing the right thing quite a bit.
Multi-threaded programming is hard. Hard does not begin to capture how difficult it is. This is the kind of hard in which a general, useable, 'reasonably qualified' way of deterministically adding two different numbers together that are being mutated and shared by multiple threads in bounded time without the use of any special assistance from the CPU in the form of atomic instructions would be a major breakthrough and the thesis of your PhD. A deity of your choice would publicly thank you for your contribution to humanity. Just for adding two numbers together. Actually, multi-threaded programming is even harder than that.
Take a look at: Technical Note TN2059
Using collection classes safely with multithreaded applications. It covers this topic in general, and outlines some of the non-obvious pitfalls that await you.
You say
I need the other thread to check either its existence or validity before acting on it.
The easiest way is to hold on to the index of the object in the NSMutableArray and then do the following
if( myObject == [myArray objectAtIndex: myObjectIndex] ) {
// everything is good !
}
else {
// my object is not what I think it is anymore
}
There are clear problem with this approach however
insertion, and deletion will stuff you up
The approach is not thread safe since the array can be changed while you are reading it
I really recomend using a different way to share this array between the two threads. Does it have to be mutable? If it doesn't then make it immutable and then you no longer have to worry about the threading issues.
If it does, then you really have to reconsider your approach. Hopefully someone can give an cocoa way of doing this in a thread safe way as I don't have the experience.