Does String.init(cString: UnsafePointer<Int8>) copy the memory contents? - swift

What are the inner workings?
Does it create a Swift string copy of the C string data?
Does it use it as a reference and return it as a Swift string, so the string returned uses the same data? How does it work?
Does it copy the C string into a newly allocated Swift string?
If String(cString: UnsafePointer<Int8>) indeed works by copying the C string into a newly allocated swift string, is there a way to convert C strings to swift by referencing the already existing data instead of copying it?
How does String(cString: UnsafePointer<Int8>) work, and how can I determine whether it copies, or whether it references the same memory as a Swift string?

The documentation clearly states that the data is copied:
Initializer
init(cString:)
Creates a new string by copying the null-terminated UTF-8 data referenced by the given pointer.
is there a way to convert C strings to swift by referencing the already existing data instead of copying it?
Nope. Strings are frequently copied/destroyed, which involves doing retain/release operations on the underlying buffer, to do the necessary booking keeping of thethe reference count. If the memory is not owned by the String, then there's no way to reliably de-allocate it.
What are you trying to achieve by avoiding the copy?

Related

Does Data's "copy constructor" copy its referenced bytes if inited with `freeWhenDone:false`?

If I allocate a Data object with bytesNoCopy:count:deallocator:.none, it should reference the given bytes but in an unsafe manner, where I as programmer promise the bytes will be available during the lifetime of the Data, rather than Data controlling that on its own.
That's all fine. What I wonder is... Since it's a value type rather than a reference type, what happens when I assign another Data variable from my nocopy-Data? Does it COPY THE DATA (against my explicit wishes)? Or does it create one more unsafe Data instance which I must track the lifetime of, or risk crashes?
Here's an illustration:
let unsafe = malloc(5);
func makeUnsafeData() -> Data
{
return Data(bytesNoCopy: unsafe, count: 5, deallocator: .none)
}
struct Foo
{
var d: Data
}
var foo = Foo(d: makeUnsafeData())
free(unsafe)
The question is: does foo.d contain a dangling pointer to the freed bytes that used to be in unsafe? Or does it contain its own copy of those bytes, and is safe to use?
This experiment gist seems to indicate that NSData crashes in the above scenario, as expected, but Data does not; so my tentative conclusion is that Data copies the data, and there's no way to use a Data instance to transport bytes between functions without copying the bytes. But I'd love a reference to any documentation refuting or confirming this theory.
Turns out, the answer is in the documentation after all... as long as all your Data instances are let and you don't mutate to it ever, ONLY the original bytes should be in memory, all Datas just referencing it.
I might just use NSData instead though since it's a reference type and less magic going on...

How does an array in swift deep copy itself when copied or assigned

We all know an array in swift is a value type, this means after copying or assigning an array to another, modify the new array will not effect the old one. Such as:
var a = ["a", "b", "c", "d", "e"]
var b = a
b[0] = "1"
print(a[0]) // a
print(b[0]) // 1
But I'm wondering how could an array work like that. The length for a 'var' array is dynamical. Usually we must alloc some heap memory to contain all the values. And I do peek some source codes for struct Array, the underlining buffer for an array is implemented using a class. But when copying a struct which contains class or memory pointer member, the class and alloced memory will not copied by default.
So how could an array copy its buffer when copy or assign it to another one?
Assignment of any struct (such as Array) causes a shallow copy of the structure contents. There's no special behavior for Array. The buffer that stores the Array's elements is not actually part of the structure. A pointer to that buffer, stored on the heap, is part of the Array structure, meaning that upon assignment, the buffer pointer is copied, but it still points to the same buffer.
All mutating operations on Array do a check to see if the buffer is uniquely referenced. If so, then the algorithm proceeds. Otherwise, a copy of the buffer is made, and the pointer to the new buffer is saved to that Array instance, then the algorithm proceeds as previously. This is called Copy on Write (CoW). Notice that it's not an automatic feature of all value types. It is merely a manually implemented feature of a few standard library types (like Array, Set, Dictionary, String, and others). You could even implement it yourself for your own types.
When CoW occurs, it does not do any deep copying. It will copy values, which means:
In the case of value types (struct, enum, tuples), the values are the struct/enum/tuples themselves. In this case, a deep and shallow copy are the same thing.
In the case of reference types (class), the value being copied is the reference. The referenced object is not copied. The same object is pointed to by both the old and copied reference. Thus, it's a shallow copy.

Why storing substrings may lead to memory leak in Swift?

On Apple's documentation on Substring, is says:
Don’t store substrings longer than you need them to perform a specific operation. A substring holds a reference to the entire storage of the string it comes from, not just to the portion it presents, even when there is no other reference to the original string. Storing substrings may, therefore, prolong the lifetime of string data that is no longer otherwise accessible, which can appear to be memory leakage.
I feel confused that String is a value type in Swift and how does it lead to memory leak?
Swift Arrays, Sets, Dictionaries and Strings have value semantics, but they're actually copy-on-write wrappers for reference types. In other words, they're all struct wrappers around a class. This allows the following to work without making a copy:
let foo = "ABCDEFG"
let bar = foo
When you write to a String, it uses the standard library function isUniquelyReferencedNonObjC (unless it's been renamed again) to check if there are multiple references to the backing object. If so, it creates a copy before modifying it.
var foo = "ABCDEFG"
var bar = foo // no copy (yet)
bar += "HIJK" // backing object copied to keep foo and bar independent
When you use a Substring (or array slice), you get a reference to the entire backing object rather than just the bit that you want. This means that if you have a very large string and you have a substring of just 4 characters, as long as the substring is live, you're holding the entire string backing buffer in memory. This is the leak that this warns you about.
Given the way Swift is often portrayed your confusion is understandable. Types such as String, Array and Dictionary present value semantics but are library types constructed from a combination of value and references types.
The implementation of these types use dynamically allocated storage. This storage can be shared between different values. However library facilities are used to implement copy-on-write so that such shared storage is copied as needed to maintain value semantics, that is behaviour like that of value types.
HTH

Array pass by value by default & thread-safety

Say I have a class which has an Array of object Photo:
class PhotoManager {
fileprivate var _photos: [Photo] = []
var photos: [Photo] {
return _photos
}
}
I read one article which says the following:
By default in Swift class instances are passed by reference and
structs passed by value. Swift’s built-in data types like Array and
Dictionary, are implemented as structs.
Meaning that the above getter returns a copy of [Photo] array.
Then, that same article tries to make the getter thread-safe by refactoring the code to:
fileprivate let concurrentPhotoQueue = DispatchQueue(label: "com.raywenderlich.GooglyPuff.photoQueue",
attributes: .concurrent)
fileprivate var _photos: [Photo] = []
var photos: [Photo] {
var photosCopy: [Photo]!
concurrentPhotoQueue.sync {
photosCopy = self._photos
}
return photosCopy
}
The above code explictly make a copy of self._photos in getter.
My questions are:
If by default swift already return an copy (pass by value) like the article said in the first place, why the article copy again to photosCopy to make it thread-safe? I feel myself do not fully understand these two parts mentioned in that article.
Does Swift3 really pass by value by default for Array instance like the article says?
Could someone please clarify it for me? Thanks!
I'll address your questions in reverse:
Does Swift3 really pass by value by default for Array instance like the article says?
Simple Answer: Yes
But I'm guessing that is not what your concern is when asking "Does Swift3 really pass by value". Swift behaves as if the array is copied in its entirety but behind the scenes it optimises the operation and the whole array is not copied until, and if, it needs to be. Swift uses an optimisation known as copy-on-write (COW).
However for the Swift programmer how the copy is done is not so important as the semantics of the operation - which is that after an assignment/copy the two arrays are independent and mutating one does not effect the other.
If by default swift already return an copy (pass by value) like the article said in the first place, why the article copy again to photosCopy to make it thread-safe? I feel myself do not fully understand these two parts mentioned in that article.
What this code is doing is insuring that the copy is done in a thread-safe way.
An array is not a trivial value, it is implemented as multi-field struct and some of those fields reference other structs and/or objects - this is needed to support an arrays ability to grow in size, etc.
In a multi-threaded system one thread could try to copy the array while another thread is trying to change the array. If these are allowed to happen at the same time then things easily can go wrong, e.g. the array could change while the copy is in progress, resulting in an invalid copy - part old value, part new value.
Swift per se is not thread safe; and in particular it will not prevent an array from being changed while a copy is being performed. The code you have addresses this by using a GCD queue so that during any alteration to the array by one thread all other writes or reads to the array in any other thread are blocked until the alteration is complete.
You might also be concerned that their are multiple copies going on here, self._photos to photoCopy, then photoCopy to the return value. While semantically this is what happens in practice there will probably only be one complex copy (and that will be thread safe) as the Swift system will optimise.
HTH
1) In code example what you provided will be returned copy of _photos.
As wrote in article:
The getter for this property is termed a read method as it’s reading
the mutable array. The caller gets a copy of the array and is protected
against mutating the original array inappropriately.
that's mean what you can access to _photos from outside of class, but you can't change them from there. Values of photos could be changed only inside class what make this array protected from it accidental changing.
2)Yes, Array is a value-type struct and it will be passed by value. You can easily check it in Playground
let arrayA = [1, 2, 3]
var arrayB = arrayA
arrayB[1] = 4 //change second value of arrayB
print(arrayA) //but arrayA didn't change
UPD #1
In article they have method func addPhoto(_ photo: Photo) what add new photo to _photos array what makes access to this property not thread-safe. That's mean what value of _photos could be changed on few thread in same time what will lead to issues.
They fixed it by writing photos on concurrentQueue with .barrier what make it thread-safely, _photos array will changed once per time
func addPhoto(_ photo: Photo) {
concurrentPhotoQueue.async(flags: .barrier) { // 1
self._photos.append(photo) // 2
DispatchQueue.main.async { // 3
self.postContentAddedNotification()
}
}
}
Now for ensure thread safety you need to read of _photos array on same queue. That's only reason why they refactored read method

How to open a file in binary form in Swift

I'm new to Swift and I'm not sure how I can pick up a file and store it in a binary array.
I do know how to pick up a file, but I don't know how I can store it in a binary array which would be modified later.
Suppose the variable "chosenFile" is the file I pick up ( in NSData type)
And the variable "bArray" ( [int8] array) is the array used to store the binary representation of the file.
var bArray: [Int8] = [Int8]()
var chosenFile: NSData! = NSData(contentsOfURL: "xxxxxxxx")
Any help ?
If you create an instance of NSMutableData, you can use the mutableBytes property to get a reference to the underlying data. In C, this would be a void *; in Swift, it is an UnsafeMutablePointer<Void>.
You can then work with this data directly (there's no need for another byte array). So long as you stay within the length of the NSMutableData, you can, eg, replace individual bits of data, or indeed the entirety of the data.
Alternatively, rather than working on the primitive directly, you can use replaceBytesInRange:withBytes: to selectively modify part of your data. This is also explained in the documentation.