Is Swift's copy-on-write thread-safe?

Making an array or dictionary a value type by definition, but actually copying it only when one reference to it tries to modify it, is a lovely idea, but it makes me wary in a multi-queue/multi-threaded context. I need to know:
Is Swift's copy-on-write capability thread-safe? E.g., if I create an array on one queue and pass it to another queue, is it safe for either queue to modify it while the other might be reading or modifying it? Since by definition the copy was made when the array reference was passed into the second queue, can we assume that the Swift engineers did the right thing and implemented copy-on-write in a queue-safe way?
I found this old discussion of the issue, which seems authoritative, but in both directions!
https://developer.apple.com/forums/thread/53488
Some credible voices say it's thread-safe, others say it isn't. I imagine that this may be because in some early version of Swift it was not, while perhaps in Swift 5 it is. Does anyone here know for sure for Swift 5?
Here's some sample code to illustrate the issue:
import Dispatch

let queue = DispatchQueue(label: "worker") // hypothetical queue, added so the example compiles

func func1()
{
    var strings1: [String] = ["A", "B", "C"]
    var strings2: [String] = strings1 // array not actually copied yet; storage is shared
    queue.async()
    {
        strings2.append("D") // the first write triggers the copy
    }
    print(strings1[0])   // is this read thread-safe?
    strings1.append("E") // is this modification thread-safe?
}

OK, since no one from Apple/Swift Inc is replying, I'll venture my best guess:
I imagine that when you have an Array value in Swift, it's a reference to a reference to an NSArray or an NSMutableArray. (Yes, I know this is only true for class objects, but let's keep it simple here.) Without assigning a new value to your Array value, the lower-level reference can be made to refer to a different NS-object by simple operations on it, such as appending or trimming. There is also a reference count attached to the underlying NS-object.
When you modify an Array, the first thing Swift does is check to see if the Swift reference is the only one to the underlying NS-object. If so, the NSArray is converted to an NSMutableArray if necessary and the modification takes place. If not, the NSArray is copied into an NSMutableArray, the modification takes place, and the lower-level Swift reference is changed to point to the new NS-object.
If this is indeed the process that copy-on-write follows, and if we can assume that the reference-count mechanism and the retain/release system are thread-safe, then I would say that copy-on-write IS thread-safe. Even if another thread is making the modification, as described above, it's modifying a copy that IT just made, so it shouldn't alter the original array at all.
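For what it's worth, here is a minimal sketch of the uniqueness check that this mental model implies, built on the standard library's isKnownUniquelyReferenced (a real function); the CowBox and Storage types are invented for illustration:

// A hand-rolled copy-on-write wrapper, mirroring the process
// described above.
final class Storage {
    var values: [String]
    init(_ values: [String]) { self.values = values }
}

struct CowBox {
    private var storage: Storage

    init(_ values: [String]) { storage = Storage(values) }

    var values: [String] { storage.values }

    mutating func append(_ value: String) {
        // If another variable shares this storage, copy before writing.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.values)
        }
        storage.values.append(value)
    }
}

var a = CowBox(["A", "B", "C"])
var b = a       // no copy yet; storage is shared
b.append("D")   // b copies its storage first, so a is untouched
print(a.values) // ["A", "B", "C"]

Note the limit of this sketch: the uniqueness check protects distinct variables that merely share storage; it says nothing about mutating one and the same variable from two threads at once.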
If you know for sure that anything I've written is incorrect, or if you know other information pertinent to this issue, PLEASE share it here. This issue is far too important to leave to "It should just work" status. :-)

Related

Passing in "const string &test" is bugging me

Hello wonderful folks,
I'm scratching my head regarding this issue. 'Const' means the contents are not changed, but '&' means the contents can be modified. So how do they fit together?
& means that the contents are passed by reference, not necessarily that they are modified. Rather, it means that the object is not copied, meaning that its copy constructor isn't invoked to pass it. The confusion may arise because one (common) use of a reference is to allow the callee to modify the referenced object.
A const reference, as shown here, is a pretty natural way to express that a reference to an object will be passed without copying the object, and that the callee will not modify the object. It's a simple and somewhat effective optimization when the object is expensive to copy, the callee doesn't need to modify it, and there are no relevant lifetime concerns that would require a copy to be taken.
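Since the code elsewhere in this compilation is Swift, here is a rough Swift analogue of the same distinction. Swift has no const references, but parameters are immutable by default, and String's copy-on-write storage means no deep copy is made on the call; inout is the closest analogue to a non-const reference:

// Rough analogue of void f(const std::string& s): the callee gets
// cheap read access and cannot mutate the caller's value.
func printTwice(_ text: String) {
    // text.append("!") // compile error: parameters are constants
    print(text)
    print(text)
}

// Rough analogue of void f(std::string& s): changes are visible
// to the caller after the call.
func shout(_ text: inout String) {
    text += "!"
}

var greeting = "hello"
printTwice(greeting)
shout(&greeting)
print(greeting) // "hello!"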

self. in trailing Swift closures, meaning and purpose?

whenever I use a trailing closure on an action ... example:
run(SKAction.wait(forDuration: 10)){timeRemains = false}
I’m seeing this:
Reference to property (anything) in closure requires explicit use of ‘self’ to make capture semantics explicit.
What does this mean? And what is it on about? I'm curious because I'm only ever doing this in the context/scope of the property or function I want to call in the trailing closure, so I don't know why I need `self`, and I'm fascinated by the use of the word "semantics" here. Does it have some profound meaning, and will I magically understand closures if I understand this?
Does it have some profound meaning, and will I magically understand closures if I understand this?
No and no. Or perhaps, maybe and maybe.
The reason for this syntactical demand is that this might really be a closure, that is, it might capture and preserve its referenced environment (because the anonymous function you are passing might be stored somewhere for a while). That means that if you refer here to some property of self, such as myProperty, you are in fact capturing a reference to self. Swift demands that you acknowledge this fact explicitly (by saying self.myProperty, not merely myProperty) so that you understand that this is what's happening.
Why do you need to understand it? Because under some circumstances you can end up with a retain cycle, or in some other way preserving the life of self in ways that you didn't expect. This is a way of getting you to think about that fact.
(If it is known that this particular function will not act as a closure, i.e. that it will be executed immediately, then there is no such danger and Swift will not demand that you say self explicitly.)
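To make that concrete, here is a small sketch; the Countdown class and its members are invented for illustration, while DispatchQueue.asyncAfter is a real API:

import Dispatch

final class Countdown {
    var timeRemains = true

    func start(on queue: DispatchQueue) {
        // Escaping closure: Swift insists on explicit `self` so the
        // capture is visible at the call site.
        queue.asyncAfter(deadline: .now() + 10) {
            self.timeRemains = false // strongly captures self
        }

        // If the closure could outlive this object, capture self weakly
        // to avoid extending its lifetime (or creating a retain cycle).
        queue.asyncAfter(deadline: .now() + 10) { [weak self] in
            self?.timeRemains = false
        }
    }
}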

Why NOT use optionals in Swift?

I was reading up on how to program in Swift, and the concept of optionals bugged me a little bit. Not in terms of why to use optionals, that makes sense, but rather in what cases you would not want to use them. From what I understand, an optional just allows an object to be set to nil, so why would you not want that feature? Isn't setting an object to nil the way you tell ARC to release an object? And don't most of the functions in Foundation and Cocoa return optionals? So outside of having to type an extra character each time to refer to an object, is there any good reason NOT to use an optional in place of a regular type?
There are tons of reasons NOT to use optionals. The main one: you want to express that a value MUST be available. For example, when you open a file, you want the file name to be a string, not an optional string. Using nil as a filename simply makes no sense.
I will consider two main use cases: Function arguments and function return values.
For function arguments, the following holds: if the argument must be provided, an optional should not be used. If handing in nothing is okay and a valid (documented!) input, then accept an optional.
For function return values, returning a non-optional is especially nice: you promise the caller that they will receive an object, not either an object or nothing. When you do not return an optional, the caller knows they may use the value right away instead of checking for nil first.
For example, consider a factory method. Such a method should always return an object, so why would you use an optional here? There are a lot of examples like this.
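A small sketch of the contrast; the types and functions here are hypothetical:

struct User {
    let name: String
}

// A factory always produces a value, so it returns a non-optional:
func makeGuestUser() -> User {
    return User(name: "guest")
}

// A lookup can legitimately find nothing, so an optional fits:
func findUser(named name: String, in users: [User]) -> User? {
    return users.first { $0.name == name }
}

let guest = makeGuestUser() // usable immediately, no nil check needed
if let admin = findUser(named: "admin", in: [guest]) {
    print(admin.name) // the compiler forces the check on the optional path
}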
Actually, most APIs should use non-optionals rather than optionals. Most of the time, simply passing or receiving possibly nothing is just not what you want. There are rather few cases where nothing is an option.
Each case where an optional is used must be thoroughly documented: under which circumstances will a method return nothing? When is it okay to hand nothing to a method, and what will the consequences be? That is a lot of documentation overhead.
Then there is also conciseness: if you use an API that uses optionals all over the place, your code will be cluttered with nil checks. Of course, if every use of an optional is intentional, then these checks are fine and necessary. However, if the API uses optionals only because its author was lazy, then the checks are unnecessary boilerplate.
But beware!
My answer may sound as if the concept of optionals is quite crappy. The opposite is true! By having a concept like optionals, the programmer is able to declare whether handing in or returning nothing is okay. The caller of a function is always aware of that, and the compiler enforces safety. Compare that to plain old C: you could not declare whether a pointer could be null. You could add documentation comments stating whether it may be null, but such comments were not enforced by the compiler. If the caller forgot to check a return value for null, you got a segfault. With optionals you can be sure that no one dereferences a null pointer anymore.
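That enforcement is visible at compile time. A tiny sketch:

// The compiler enforces the check that a C comment could only request.
func length(of text: String?) -> Int {
    // return text.count // compile error: optional not unwrapped
    guard let text = text else { return 0 }
    return text.count // safe here: text is non-nil
}

print(length(of: "hello")) // 5
print(length(of: nil))     // 0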
So in conclusion, a null-safe type system is one of the major advances in modern programming languages.
The original idea of optionals (which existed long before Swift) is to force the programmer to check a value for nil before using it, or to prevent outside code from passing nil where it is not allowed. A huge part of crashes in software, maybe even most of them, happens at address 0x00000000 (or with a NullPointerException, or the like) precisely because it is way too easy to forget about the nil-pointer scenario. (In 2009, Tony Hoare apologized for inventing null pointers.)
Not using optionals is as valid and widespread a use case as using them: when the value absolutely cannot be missing, there should be a non-optional type; when it can, there should be an optional.
But currently the existing frameworks are written in Obj-C without optionals in mind, so the automatically generated bridges between Swift and Obj-C just have to take and return optionals, because it is impossible to automatically analyze each method deeply and figure out which arguments and return values should be optional. I'm sure over time Apple will manually fix every case where they got it wrong; right now you should not use those frameworks as an example, because they are definitely not a good one. (For good examples, you could check a popular functional language like Haskell, which has had optionals since the beginning.)

MATLAB weak references to handle class objects

While thinking about the possibility of a handle class based ORM in MATLAB, the issue of caching instances came up. I could not immediately think of a way to make weak references or a weak map, though I'm guessing that something could be contrived with event listeners. Any ideas?
More Info
In MATLAB, a handle class (as opposed to a value class) has reference semantics. An example included with MATLAB is the containers.Map class. If you instantiate one and pass it to a function, any modifications the function makes to the object will be visible via the original reference. That is, it works like a Java or Python object reference.
Like Java and Python, MATLAB keeps track in one way or another of how many things are referencing each object of a handle class. When there aren't any more, MATLAB knows it is safe to delete the object.
A weak reference is one that refers to the object but does not count as a reference for purposes of garbage collection. So if the only remaining references to the object are weak, then it can be thrown away. Generally an event or callback can be supplied to the weak reference - when the object is thrown away, the weak references to it will be notified, allowing cleanup code to run.
For instance, a weak value map is like a normal map, except that the values (as opposed to the keys) are implemented as weak references. The weak map class can arrange a callback or event on each of these weak references so that when the referenced object is deleted, the key/value entry in the map is removed, keeping the map nice and tidy.
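The idea is easier to see in a language that exposes weak references directly. Here is a sketch of a weak-value map in Swift (the language used elsewhere in this compilation); both types are invented for illustration:

// A box holding a weak reference, so the map's values don't keep
// their objects alive.
final class WeakBox<T: AnyObject> {
    weak var value: T?
    init(_ value: T) { self.value = value }
}

struct WeakValueMap<Key: Hashable, Value: AnyObject> {
    private var storage: [Key: WeakBox<Value>] = [:]

    subscript(key: Key) -> Value? {
        get { storage[key]?.value } // nil once the object is deallocated
        set { storage[key] = newValue.map { WeakBox($0) } }
    }

    // Drop entries whose objects have already been deallocated.
    mutating func compact() {
        storage = storage.filter { $0.value.value != nil }
    }
}

Swift gives no deallocation callback here, so the tidy-up is manual rather than event-driven; Foundation's NSMapTable with weak values provides similar behaviour out of the box.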
These special reference types are really a language-level feature, something you need the VM and GC to do. Trying to implement them in user code will likely end in tears, especially if you lean on undocumented behavior. (Sorry to be a stick in the mud.)
There are a couple of ways you could do something similar. These are just ideas, not endorsements; I haven't actually done them.
Perhaps instead of caching Matlab object instances per se, you could cache expensive computational results using a real Java weak ref map in the JVM embedded inside Matlab. If you can convert your Matlab values to and from Java relatively quickly, this could be a win. If it's relatively flat numeric data, primitives like double[] or double[][] convert quickly using Matlab's implicit conversion.
Or you could make a regular LRU object cache at the Matlab level (maybe using a containers.Map keyed by hashcodes) that explicitly removes the objects inside it when new ones are added. Either use it directly, or add an onCleanup() behavior to your objects that has them automatically add a copy of themselves to a global "recently deleted objects" LRU cache of fixed size, keyed by an externally meaningful id; mark the instances in the cache so your onCleanup() method doesn't try to re-add them when they're deleted due to expiration from the cache. Then you could have a factory method or other lookup method "resurrect" instances from the cache instead of constructing brand new ones the expensive way. This sounds like a lot of work, though, and really not idiomatic Matlab.
This is not an answer to your question but just my 2 cents.
Weak references are a feature of the garbage collector. In Java and .NET, the garbage collector is invoked when memory pressure is high, and is therefore nondeterministic.
This MATLAB Digest post says that MATLAB does not use an (indeterministic) garbage collector. In MATLAB, references are deleted from memory (deterministically) on each stack pop, i.e. on leaving each function.
Thus I do not think that weak references belong in MATLAB's reference-handling concept. But MATLAB has always had tons of undocumented features, so I cannot exclude that it is buried somewhere.
In this SO post I asked about MATLAB's garbage collector implementation and got no real answer. One MathWorks staff member, instead of answering my question, accused me of trying to construct a Python vs. MATLAB argument. Another MathWorks staff member wrote something that looked reasonable but was in substance a clever deception, a purposeful distraction from the problem I asked about. And the best answer has been:
if you ask this question, then MATLAB is not the right language for you!

How to make Scala's immutable collections hold immutable objects

I'm evaluating Scala and am having a problem with its immutable collections.
I want to make immutable collections that are completely immutable, right down through all the contained objects, the objects they reference, ad infinitum.
Is there a simple way to do this?
The code on http://www.finalcog.com/immutable-containers-scala illustrates what I'm trying to achieve, and a nasty workaround (ImmutablePoint).
The problem with the workaround is that every time I want to change an object I have to manually make a new copy. I understand that the runtime will have to implement copy-on-write, but can this be made transparent to the developer?
I suppose I'm looking to make Immutable Objects, where methods change the current object state, but all other 'val' (and all immutable container) references to the object retain the 'old' state.
This is not possible out of the box with Scala via some specific language construct, unless you have followed the idiom that all of your objects are immutable, in which case this behaviour comes for free!
With 2.8, named parameters have made "copy constructors" quite nice to use, from a readability perspective. But you are correct, this works as copy-on-write. The behaviour you are asking for, where the "current" object is the only one mutated goes completely against the way the JVM works, unfortunately (for you)!
Actually the phrase "the current object" makes no sense; really you mean "the current reference"! All other references (outside the current lexical scope) which point to the same object, erm, point to the same object! There is only one object!
Hence it's just not possible for this object to appear to be mutable from the perspective of the current lexical scope but immutable to others.
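For comparison, Swift (the language used elsewhere in this compilation) builds this copy-on-assignment behaviour into structs: mutating through one variable never changes what another variable sees, which is the "old state is retained" behaviour the question asks for. A sketch:

struct Point {
    var x: Double
    var y: Double
}

var p1 = Point(x: 0, y: 0)
let p2 = p1 // p2 is an independent value, not a reference
p1.x = 10   // mutates only p1
print(p2.x) // 0.0: p2 still sees the "old" state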
If you're interested in some more general theory on how to handle updates to immutable data structures efficiently, http://en.wikipedia.org/wiki/Zipper_%28data_structure%29 might prove interesting.