Why may storing substrings lead to a memory leak in Swift?

Apple's documentation on Substring says:
Don’t store substrings longer than you need them to perform a specific operation. A substring holds a reference to the entire storage of the string it comes from, not just to the portion it presents, even when there is no other reference to the original string. Storing substrings may, therefore, prolong the lifetime of string data that is no longer otherwise accessible, which can appear to be memory leakage.
I'm confused: String is a value type in Swift, so how can it lead to a memory leak?

Swift Arrays, Sets, Dictionaries and Strings have value semantics, but they're actually copy-on-write wrappers for reference types. In other words, they're all struct wrappers around a class. This allows the following to work without making a copy:
let foo = "ABCDEFG"
let bar = foo
When you write to a String, it performs a uniqueness check, of the kind the standard library exposes as isKnownUniquelyReferenced (called isUniquelyReferencedNonObjC before Swift 3), to see whether there are multiple references to the backing object. If so, it creates a copy before modifying it.
var foo = "ABCDEFG"
var bar = foo // no copy (yet)
bar += "HIJK" // backing object copied to keep foo and bar independent
When you use a Substring (or array slice), you get a reference to the entire backing object rather than just the bit that you want. This means that if you have a very large string and you have a substring of just 4 characters, as long as the substring is live, you're holding the entire string backing buffer in memory. This is the leak that this warns you about.
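One way to avoid holding on to the whole buffer is to convert the Substring into a new String before storing it. The following is a minimal sketch of that idea; the names big, slice and owned are made up for illustration, and the size of the string is arbitrary.

```swift
// A large string, of which only the last four characters are interesting.
let big = String(repeating: "x", count: 100_000) + "tail"

// `suffix(_:)` returns a Substring that shares storage with `big`.
// While `slice` is alive, the whole 100,004-character buffer stays alive too.
let slice: Substring = big.suffix(4)

// Creating a String from the Substring copies just those four characters
// into storage of its own, so `owned` no longer depends on `big`.
let owned = String(slice)

// `base` exposes the full string that the slice still refers to.
print(slice.base.utf8.count, owned.utf8.count)
```

Storing owned rather than slice in a long-lived property means the large buffer can be released as soon as nothing else refers to it.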

Given the way Swift is often portrayed, your confusion is understandable. Types such as String, Array and Dictionary present value semantics but are library types constructed from a combination of value and reference types.
The implementation of these types uses dynamically allocated storage. This storage can be shared between different values. However, library facilities are used to implement copy-on-write, so that such shared storage is copied as needed to maintain value semantics, that is, behaviour like that of value types.
HTH

Related

Does String.init(cString: UnsafePointer<Int8>) copy the memory contents?

What are the inner workings?
Does it create a Swift string copy of the C string data?
Does it use it as a reference and return it as a Swift string, so the string returned uses the same data? How does it work?
Does it copy the C string into a newly allocated Swift string?
If String(cString: UnsafePointer<Int8>) indeed works by copying the C string into a newly allocated Swift string, is there a way to convert C strings to Swift by referencing the already existing data instead of copying it?
How does String(cString: UnsafePointer<Int8>) work, and how can I determine whether it copies, or whether it references the same memory as a Swift string?
The documentation clearly states that the data is copied:
Initializer
init(cString:)
Creates a new string by copying the null-terminated UTF-8 data referenced by the given pointer.
Is there a way to convert C strings to Swift by referencing the already existing data instead of copying it?
Nope. Strings are frequently copied and destroyed, which involves retain/release operations on the underlying buffer to do the necessary bookkeeping of the reference count. If the memory is not owned by the String, then there's no way to reliably deallocate it.
What are you trying to achieve by avoiding the copy?
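The copy can also be observed directly. In this small sketch (the variable names are invented for the example), the C buffer is modified after the Swift string has been created, and the string is unaffected:

```swift
// A null-terminated C string: "Hi\0".
var cBytes: [CChar] = [72, 105, 0]

// String(cString:) copies the bytes up to the terminator into new storage.
let swiftString = cBytes.withUnsafeBufferPointer { buffer in
    String(cString: buffer.baseAddress!)
}

// Overwrite the first byte of the original buffer ('H' becomes 'B').
cBytes[0] = 66

print(swiftString) // prints "Hi"
```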

ARC doesn't apply to struct and enum, how are they deallocated in Swift

Since ARC doesn't apply to structs and enums, how are they deallocated from memory? I got stuck when this was asked in interviews, and I couldn't find much information by googling. I know Swift is smart at handling value types. But how?
The memory management of objects (instances of classes) is relatively difficult, because objects can outlive a function call, the life of other objects, or even the life of the threads that allocated them. They're independent entities on the heap that need bookkeeping to make sure they're freed once they're not needed (once they're no longer referenced from any other threads/objects, they're unreachable, thus can't possibly be needed, so are safe to delete).
On the other hand, structs and enums just have their instances stored inline:
If they're declared as a global variable, they're stored in the program's static data segment.
If they're declared as a local variable, they're allocated on the stack (or in registers, but never mind that).
If they're allocated as a property of another object, they're just stored directly inline within that object.
They're only ever deleted by virtue of their containing context being deallocated, such as when a function returns, or when an object is deallocated.
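The inline storage can be seen with MemoryLayout. The types below (Point, Segment, Box) are hypothetical and exist only for this sketch:

```swift
struct Point {
    var x: Int32
    var y: Int32
}

// A struct containing structs: the two Points are stored inline,
// so Segment is exactly as big as two Points.
struct Segment {
    var start: Point
    var end: Point
}

// A class instance lives on the heap; a variable of class type
// only holds a pointer-sized reference to it.
final class Box {
    var segment = Segment(start: Point(x: 0, y: 0), end: Point(x: 1, y: 1))
}

print(MemoryLayout<Point>.size)   // 8
print(MemoryLayout<Segment>.size) // 16
print(MemoryLayout<Box>.size == MemoryLayout<UnsafeRawPointer>.size) // true
```

Because Segment has no allocation of its own, there is nothing to free separately: its bytes go away together with the stack frame or object that contains it.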

Does Data's "copy constructor" copy its referenced bytes if inited with `freeWhenDone:false`?

If I allocate a Data object with bytesNoCopy:count:deallocator:.none, it should reference the given bytes but in an unsafe manner, where I as programmer promise the bytes will be available during the lifetime of the Data, rather than Data controlling that on its own.
That's all fine. What I wonder is... Since it's a value type rather than a reference type, what happens when I assign another Data variable from my nocopy-Data? Does it COPY THE DATA (against my explicit wishes)? Or does it create one more unsafe Data instance which I must track the lifetime of, or risk crashes?
Here's an illustration:
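import Foundation
let unsafe = malloc(5)! // malloc returns an optional pointer; force-unwrapped for brevity
func makeUnsafeData() -> Data {
    // No copy is requested, and .none means Data will not free the bytes itself.
    return Data(bytesNoCopy: unsafe, count: 5, deallocator: .none)
}
struct Foo {
    var d: Data
}
var foo = Foo(d: makeUnsafeData())
free(unsafe) // from here on, any Data still pointing at these bytes is dangling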
The question is: does foo.d contain a dangling pointer to the freed bytes that used to be in unsafe? Or does it contain its own copy of those bytes, and is safe to use?
This experiment gist seems to indicate that NSData crashes in the above scenario, as expected, but Data does not; so my tentative conclusion is that Data copies the data, and there's no way to use a Data instance to transport bytes between functions without copying the bytes. But I'd love a reference to any documentation refuting or confirming this theory.
Turns out, the answer is in the documentation after all: as long as all your Data instances are let and you never mutate them, only the original bytes should be in memory, with every Data instance just referencing them.
I might just use NSData instead though since it's a reference type and less magic going on...
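The copy-on-write behaviour that this relies on can be observed with an ordinary Data value. This sketch deliberately uses bytes that Data owns itself rather than bytesNoCopy, so it only illustrates the value semantics and says nothing about the lifetime of externally owned memory:

```swift
import Foundation

// Data that owns its bytes (not the bytesNoCopy case from the question).
let original = Data([1, 2, 3, 4, 5])

// Assigning to another variable does not give the two values
// observably separate contents until one of them is mutated.
var copy = original

// Mutating `copy` must not be visible through `original`.
copy[0] = 9

print(original[0], copy[0]) // prints "1 9"
```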

How does an array in swift deep copy itself when copied or assigned

We all know an array in Swift is a value type. This means that after copying or assigning an array to another, modifying the new array will not affect the old one. For example:
var a = ["a", "b", "c", "d", "e"]
var b = a
b[0] = "1"
print(a[0]) // a
print(b[0]) // 1
But I'm wondering how an array can work like that. The length of a 'var' array is dynamic, so usually some heap memory must be allocated to hold all the values. I peeked at the source code for struct Array, and the underlying buffer for an array is implemented using a class. But when copying a struct that contains a class or memory pointer member, the class instance and the allocated memory are not copied by default.
So how could an array copy its buffer when copy or assign it to another one?
Assignment of any struct (such as Array) causes a shallow copy of the structure contents. There's no special behavior for Array. The buffer that stores the Array's elements lives on the heap and is not actually part of the structure. Only a pointer to that buffer is part of the Array structure, so upon assignment the pointer is copied, and both copies still point to the same buffer.
All mutating operations on Array do a check to see if the buffer is uniquely referenced. If so, then the algorithm proceeds. Otherwise, a copy of the buffer is made, and the pointer to the new buffer is saved to that Array instance, then the algorithm proceeds as previously. This is called Copy on Write (CoW). Notice that it's not an automatic feature of all value types. It is merely a manually implemented feature of a few standard library types (like Array, Set, Dictionary, String, and others). You could even implement it yourself for your own types.
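As a rough illustration of implementing it yourself, here is a minimal copy-on-write sketch. IntBag and Storage are hypothetical names, and this is not how Array is actually written; it only shows the uniqueness check followed by a copy, using the standard library function isKnownUniquelyReferenced.

```swift
// Heap-allocated backing storage, shared between copies of IntBag.
final class Storage {
    var values: [Int]
    init(values: [Int]) { self.values = values }
}

struct IntBag {
    private var storage: Storage

    init(_ values: [Int]) {
        storage = Storage(values: values)
    }

    var values: [Int] { storage.values }

    mutating func append(_ value: Int) {
        // If another IntBag also points at this storage, copy it first
        // so the mutation is not visible through the other value.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(values: storage.values)
        }
        storage.values.append(value)
    }
}

let a = IntBag([1, 2])
var b = a      // copies only the reference to Storage
b.append(3)    // storage is shared, so it is copied before the write

print(a.values) // [1, 2]
print(b.values) // [1, 2, 3]
```

Every mutating method of such a type needs the same check, which is why the standard library types perform it in all of their mutating operations.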
When CoW occurs, it does not do any deep copying. It will copy values, which means:
In the case of value types (struct, enum, tuples), the values are the struct/enum/tuples themselves. In this case, a deep and shallow copy are the same thing.
In the case of reference types (class), the value being copied is the reference. The referenced object is not copied. The same object is pointed to by both the old and copied reference. Thus, it's a shallow copy.
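The reference-type case can be checked with a small experiment. Counter is a made-up class for this sketch:

```swift
final class Counter {
    var count = 0
}

let shared = Counter()
let first = [shared]

// `second` starts out sharing the buffer of `first`.
var second = first

// Appending forces `second` to get its own buffer. The references in it
// are copied, but the Counter objects they point to are not.
second.append(Counter())

// Both arrays still refer to the same Counter at index 0,
// so a change made through one is visible through the other.
second[0].count = 42

print(first[0].count)          // 42
print(first[0] === second[0])  // true
```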

iOS Obj-C: Variable object that can be assigned as a double or a string?

I'm pretty new to iOS development, and I want to figure out if there's a good way to handle this issue. Basically, I'm making a technical calculator that returns some product specifications based on user input parameters. The product in question has specs for some, but not all user parameters, so I . In a constants file, I have a bunch of ATTEN_SPEC_X variables which are const double or const NSString *. Now, it's perfectly okay to be missing a spec, so my plan was to leverage NSArray's ability to hold different types and use introspection later to handle strings vs doubles before I report the returned specs.
Here's an incomplete example of one method I'm implementing. It's just a big conditional tree that should return a two-element array of the final values of spec and nominal.
- (NSArray *)attenuatorSwitching:(double *)attenuator{
double spec, nominal;
{...}
else if (*attenuator==0){
spec=ATTEN_SPEC_3; //this atten spec is a string!
nominal=ATTEN_NOM_3;
}
{...}
return {array of spec, nominal} //not actual obj-c code
So instead of making spec and nominal doubles, can I make them some other general type? The really important thing here is that I don't want to use any special handling within this method; another coder should be able to go back to the constants file, change ATTEN_NOM_3 to a double, and not have to retool this method at all.
Thanks.
The problem you'll run into is that NSArrays can't directly handle doubles. However, you can get around this if you start using NSNumber instances instead - you can return an NSArray * containing an NSString * and an NSNumber * with no problems. If you need even more general typing, the Objective-C type id can be used for any object instance (though still not with primitives; you can't make a double an id).
Later, when you get an array, you can use the NSObject method -isKindOfClass: to determine the type of object you're pulling out of the array, and deal with the string or number depending on the resultant type. If you need to convert your NSNumber back to a double, just use the NSNumber instance method -doubleValue to unbox your double. (+[NSNumber numberWithDouble:] goes the other way, giving you an NSNumber out of a double.)
If you're using a recent enough version of Xcode, you can even make these things literals, rather than having to litter calls to +numberWithDouble: all over the place:
return @[ @3, @"number of things" ]
