Does string concatenation in Swift make a new copy? - swift

I can concatenate two strings in Swift like this:
var c = "Hello World"
c += "!"
Does this create a new string? (Allocating a new block of memory, copying over the original string, concatenating the "!" string and returning the new memory.) Or, does it update the original string in place (only allocating a new block of memory if the original block can't fit the character).

No, it does not make a new copy. As you can see, the original string has changed.But the address remains same.

As it says in the apple documentation : https://developer.apple.com/documentation/swift/string
in the section of Performance optimizations:
"Although strings in Swift have value semantics, strings use a copy-on-write strategy to store their data in a buffer. This buffer can then be shared by different copies of a string. A string’s data is only copied lazily, upon mutation, when more than one string instance is using the same buffer. Therefore, the first in any sequence of mutating operations may cost O(n) time and space."
IOs use copy-on-write so if more than 1 process use the same variable, or has more than 1 copy (i don't fully understand this part), it makes a copy, but if the variable is only used for 1 process and has only one copy, then you can mutate it as you wish without generating copies

Related

Swift Mutating Data at Range

I am building a Data object that looks something like this:
struct StructuredData {
var crc: UInt16
var someData: UInt32
var someMoreData: UInt64
// etc.
}
I'm running a CRC algorithm that will start at byte 2 and process length 12.
When the CRC is returned, it must exist at the beginning of the Data object. As I see it, my options are:
Generate a Data object that does not include the CRC, process it, and then build another Data object that does (so that the CRC value that I now have will be at the start of the Data object.
Generate the data object to include a zero'd out CRC to start with and then mutate the data at range [0..<2].
Obviously, 2 would be preferable, as it uses less memory and less processing, but this type of an optimization I'm not sure is necessary anymore. I'd still rather go with 2, except I do not know how to mutate the data at a given index range. Any help is greatly appreciated.
I do not recommend the way of mutating Data using this:
data.replaceSubrange(0..<2, with: UnsafeBufferPointer(start: &self.crc, count: 1))
Please try this:
data.replaceSubrange(0..<2, with: &self.crc, count: 2)
It is hard to explain why, but I'll try...
In Swift, inout parameters work in copy-in-copy-out semantics. When you write something like this:
aMethod(&param)
Swift allocates some region of enough size to hold the content of param,
copies param into the region, (copy-in)
calls the method with passing the address of the region,
and when returned from the call, copies the content of the region back to param (copy-out).
In many cases, Swift optimizes (which may happen even in -Onone setting) the steps by just passing the actual address of param, but it is not clearly documented.
So, when an inout parameter is passed to the initializer of UnsafeBufferPointer, the address received by UnsafeBufferPointer may be pointing to a temporal region, which will be released immediately after the initializer is finished.
Thus, replaceSubrange(_:with:) may copy the bytes in the already released region into the Data.
I believe the first code would work in this case as crc is a property of a struct, but if there is a simple and safe alternative, you should better avoid the unsafe way.
ADDITION for the comment of Brandon Mantzey's own answer.
data.append(UnsafeBufferPointer(start: &self.crcOfRecordData, count: 1))
Using safe in the meaning above. This is not safe, for the same reason described above.
I would write it as:
data.append(Data(bytes: &self.crcOfRecordData, count: MemoryLayout<UInt16>.size))
(Assuming the type of crcOfRecordData as UInt16.)
If you do not prefer creating an extra Data instance, you can write it as:
withUnsafeBytes(of: &self.crcOfRecordData) {urbp in
data.append(urbp.baseAddress!.assumingMemoryBound(to: UInt8.self), count: MemoryLayout<UInt16>.size)
}
This is not referred in the comment, but in the meaning of above safe, the following line is not safe.
let uint32Data = Data(buffer: UnsafeBufferPointer(start: &self.someData, count: 1))
All the same reason.
I would write it as:
let uint32Data = Data(bytes: &self.someData, count: MemoryLayout<UInt32>.size)
Though, observable unexpected behavior may happen in some very limited condition and with very little probability.
Such behavior would happen only when the following two conditions are met:
Swift compiler generates non-Optimized copy-in-copy-out code
Between the very narrow period, since the temporal region is released till the append method (or Data.init) finishes copying the whole content, the region is modified for another use.
The condition #1 becomes true only limited cases in the current implementation of Swift.
The condition #2 happens very rarely only in the multi-threaded environment. (Though, Apple's framework uses many hidden threads as you can find in the debugger of Xcode.)
In fact, I have not seen any questions regarding the unsafe cases above, my safe may be sort of overkill.
But alternative safe codes are not so complex, are they?
In my opinion, you should better be accustomed to use all-cases-safe code.
I figured it out. I actually had a syntax error that was boggling me because I hadn't seen that before.
Here's the answer:
data.replaceSubrange(0..<2, with: UnsafeBufferPointer(start: &self.crc, count: 1))

Swift pointer from ArraySlice

I'm trying to determine the "Swift-y" way of creating my own contiguous memory containers (in my particular case, I'm building n-dimensional arrays). I want my containers to be as close to Swift's builtin Array as possible - in terms of functionality and usability.
I need to access the pointer to memory of my containers for stuff like Accelerate and BLAS operations.
I want to know whether an ArraySlice's pointer would point to the first element of the slice, or the first element of its base.
When I tried to test UnsafePointer<Int>(array) == UnsafePointer<Int>(array[1...2]) it looks like Swift doesn't allow pointer construction from ArraySlices (or I just did it incorrectly).
I'm looking for advice on which way would be the most "Swift-y"?
I understand that when slicing an array the follow is true:
let array = [1, 2, 3]
array[1] == array[1...2][1]
and
array[1...2][0] != 2 # index out of bounds error
In other words, indexing is always performed relative to the base Array.
Which suggests: that we should return a pointer to the base's first element. Because slices are relative to their base.
However, iteration through a slice (obviously) only considers elements of that slice:
for i in array[1..2] # i takes on 2 followed by 3
Which suggests: that we should return a pointer to the slice's first element. Because slices have their own starting point.
If my user wanted to operate on a slice in a BLAS operation it would be intuitive to expect:
mmul(matrix1[1...2, 0...1].pointer, matrix2[4...5, 0...1].pointer)
to point to the first elements of slice, but I don't know if this is the way a Swift ArraySlice would do things.
My Question: Should a container slice object's pointer point to the first element of the slice, or, the first element of the base container.
This operation is unsafe:
UnsafePointer<Int>(array)
What you mean is:
array.withUnsafeBufferPointer { ... }
This applies to your types as well, and is the pattern you should employ to interoperate with BLAS and Accelerate. You should not try to use a pointer method IMO.
There is no promise that array will continue to exist by the time you actually access the pointer, even if that happens in the same line of code. ARC is free to destroy that memory shockingly quickly.
UnsafeBufferPointer is actually a very nice type in that it is already promised to be contiguous and it behaves as a Collection.
My suggestion here would be to manage your own memory internally, probably with a ManagedBuffer, but maybe just with a UnsafeMutablePointer that you alloc and destroy yourself. It's very important that you manage the layout of the data so that it's compatible with Accelerate. You don't want Array<Array<UInt8>>. That's going to add too much structure. You want a blob of bytes that you index into in the good-ol' C ways (row*width+column, etc). You probably don't want your slices to return pointers at all directly. Your mmul function is likely going to need special logic to understand how to pull the pieces it needs out of slices with minimal copying so that it works with vDSP_mmul. "Generic" and "Accelerate" seldom go together.
For example, considering this:
mmul(matrix1[1...2, 0...1].pointer, matrix2[4...5, 0...1].pointer)
(Obviously I assume your real matrices are dramatically larger; this kind of matrix doesn't make much sense to send to vDSP.)
You're going to have to write your own mmul here obviously since this memory isn't laid out correctly. So you might as well pass the slices. Then you'd do something like (totally untested, uncompiled, and I'm sure the syntax is wildly wrong):
mmul(m1: MatrixSlice, m2: MatrixSlice) -> Matrix {
var s1 = UnsafeMutablePointer<Float>.alloc(m1.rows * m1.columns)
// use vDSP_vgathr to copy each sliced row out of m1 into s1
var s2 = UnsafeMutablePointer<Float>.alloc(m2.rows * m2.columns)
// use vDSP_vgathr to copy each sliced row out of m2 into s2
var result = UnsafeMutablePointer<Float>.alloc(m1.rows * m2.columns)
vDSP_mmul(s1, 1, s2, 1, result, 1, m1.rows, m2.columns, m1.columns)
s1.destroy()
s2.destroy()
// This will need to call result.move() or moveInitializeFrom or something
return Matrix(result)
}
I'm just throwing out stuff here, but this is probably the kind of structure you'd want.
To your underlying question about whether the pointer to the container or to the data is usually passed by Swift, the answer is unfortunately "magic" for Array and no one else. Passing an Array to something that wants a pointer will magically (by the compiler, not the stdlib) pass a pointer to the storage of the Array. No other type gets this magic. Not even ContiguousArray gets this magic. If you pass a ContiguousArray to something that wants a pointer, you'll pass the pointer to the container (and if it's mutable, corrupt the container; true story, hated that one…)
Thanks in part to #RobNapier the answer to my question is: ArraySlice's pointer should point to the slice's first element.
The way I verified this was simply:
var array = [5,4,3,325,67,7,3]
array.withUnsafeBufferPointer{ $0 } != array[3...6].withUnsafeBufferPointer{ $0 } # true
^--- points to 5's address ^--- points to 325's address

How to cast [Int8] to [UInt8] in Swift

I have a buffer that contains just characters
let buffer: [Int8] = ....
Then I need to pass this to a function process that takes [UInt8] as an argument.
func process(buffer: [UInt8]) {
// some code
}
What would be the best way to pass the [Int8] buffer to cast to [Int8]? I know following code would work, but in this case the buffer contains just bunch of characters, and it is unnecessary to use functions like map.
process(buffer.map{ x in UInt8(x) }) // OK
process([UInt8](buffer)) // error
process(buffer as! [UInt8]) // error
I am using Xcode7 b3 Swift2.
I broadly agree with the other answers that you should just stick with map, however, if your array were truly huge, and it really was painful to create a whole second buffer just for converting to the same bit pattern, you could do it like this:
// first, change your process logic to be generic on any kind of container
func process<C: CollectionType where C.Generator.Element == UInt8>(chars: C) {
// just to prove it's working...
print(String(chars.map { UnicodeScalar($0) }))
}
// sample input
let a: [Int8] = [104, 101, 108, 108, 111] // ascii "Hello"
// access the underlying raw buffer as a pointer
a.withUnsafeBufferPointer { buf -> Void in
process(
UnsafeBufferPointer(
// cast the underlying pointer to the type you want
start: UnsafePointer(buf.baseAddress),
count: buf.count))
}
// this prints [h, e, l, l, o]
Note withUnsafeBufferPointer means what it says. It’s unsafe and you can corrupt memory if you get this wrong (be especially careful with the count). It works based on your external knowledge that, for example, if any of the integers are negative then your code doesn't mind them becoming corrupt unsigned integers. You might know that, but the Swift type system can't, so it won't allow it without resort to the unsafe types.
That said, the above code is correct and within the rules and these techniques are justifiable if you need the performance edge. You almost certainly won’t unless you’re dealing with gigantic amounts of data or writing a library that you will call a gazillion times.
It’s also worth noting that there are circumstances where an array is not actually backed by a contiguous buffer (for example if it were cast from an NSArray) in which case calling .withUnsafeBufferPointer will first copy all the elements into a contiguous array. Also, Swift arrays are growable so this copy of underlying elements happens often as the array grows. If performance is absolutely critical, you could consider allocating your own memory using UnsafeMutablePointer and using it fixed-size style using UnsafeBufferPointer.
For a humorous but definitely not within the rules example that you shouldn’t actually use, this will also work:
process(unsafeBitCast(a, [UInt8].self))
It's also worth noting that these solutions are not the same as a.map { UInt8($0) } since the latter will trap at runtime if you pass it a negative integer. If this is a possibility you may need to filter them first.
IMO, the best way to do this would be to stick to the same base type throughout the whole application to avoid the whole need to do casts/coercions. That is, either use Int8 everywhere, or UInt8, but not both.
If you have no choice, e.g. if you use two separate frameworks over which you have no control, and one of them uses Int8 while another uses UInt8, then you should use map if you really want to use Swift. The latter 2 lines from your examples (process([UInt8](buffer)) and
process(buffer as! [UInt8])) look more like C approach to the problem, that is, we don't care that this area in memory is an array on singed integers we will now treat it as if it is unsigneds. Which basically throws whole Swift idea of strong types to the window.
What I would probably try to do is to use lazy sequences. E.g. check if it possible to feed process() method with something like:
let convertedBuffer = lazy(buffer).map() {
UInt8($0)
}
process(convertedBuffer)
This would at least save you from extra memory overhead (as otherwise you would have to keep 2 arrays), and possibly save you some performance (thanks to laziness).
You cannot cast arrays in Swift. It looks like you can, but what's really happening is that you are casting all the elements, one by one. Therefore, you can use cast notation with an array only if the elements can be cast.
Well, you cannot cast between numeric types in Swift. You have to coerce, which is a completely different thing - that is, you must make a new object of a different numeric type, based on the original object. The only way to use an Int8 where a UInt8 is expected is to coerce it: UInt8(x).
So what is true for one Int8 is true for an entire array of Int8. You cannot cast from an array of Int8 to an array of UInt8, any more than you could cast one of them. The only way to end up with an array of UInt8 is to coerce all the elements. That is exactly what your map call does. That is the way to do it; saying it is "unnecessary" is meaningless.

Swift: Converting a string into a variable name

I have variables with incremented numbers within, such as row0text, row1text, row2text, etc.
I've figured out how to dynamically create string versions of those variable names, but once I have those strings, how can I use them as actual variable names rather than strings in my code?
Example:
var row3text = "This is the value I need!"
var firstPart = "row"
var rowNumber = 3
var secondPart = "text"
var together = (firstPart+String(rowNumber)+secondPart)
// the below gives me the concatenated string of the three variables, but I'm looking for a way to have it return the value set at the top.
println (together)
Once I know how to do this, I'll be able to iterate through those variables using a for loop; it's just that at the moment I'm unsure of how to use that string as a variable name in my code.
Thanks!
Short Answer: There is no way to do this for good reason. Use arrays instead.
Long Answer:
Essentially you are looking for a way to define an unknown number of variables that are all linked together by their common format. You are looking to define an ordered set of elements of variable length. Why not just use an array?
Arrays are containers that allow you to store an ordered set or list of elements and access them by their ordered location, which is exactly what you're trying to do. See Apple's Swift Array Tutorial for further reading.
The advantage of arrays is that they are faster, far more convenient for larger sets of elements (and probably the same for smaller sets as well), and they come packaged with a ton of useful functionality. If you haven't worked with arrays before it is a bit of a learning curve but absolutely worth it.

var vs let in Swift [duplicate]

This question already has answers here:
What is the difference between `let` and `var` in Swift?
(32 answers)
Closed 8 years ago.
I'm new to Swift programming, and I've met the var and let types. I know that let is a constant and I know what that means, but I never used a constant mainly because I didn't need to. So why should I use var instead of let, at what situation should I use it?
Rather than constant and variable, the correct terminology in swift is immutable and mutable.
You use let when you know that once you assign a value to a variable, it doesn't change - i.e. it is immutable. If you declare the id of a table view cell, most likely it won't change during its lifetime, so by declaring it as immutable there's no risk that you can mistakenly change it - the compiler will inform you about that.
Typical use cases:
A constant (the timeout of a timer, or the width of a fixed sized label, the max number of login attempts, etc.). In this scenario the constant is a replacement for the literal value spread over the code (think of #define)
the return value of a function used as input for another function
the intermediate result of an expression, to be used as input for another expression
a container for an unwrapped value in optional binding
the data returned by a REST API call, deserialized from JSON into a struct, which must be stored in a database
and a lot more. Every time I write var, I ask myself: can this variable change?. If the answer is no, I replace var with let. Sometimes I also use a more protective approach: I declare everything as immutable, then the compiler will let me know when I try to modify one of them, and for each case I can proceed accordingly.
Some considerations:
For reference types (classes), immutable means that once you assign an instance to the immutable variable, you cannot assign another instance to the same variable.
For value types (numbers, strings, arrays, dictionaries, structs, enums) immutable means that that once you assign a value, you cannot change the value itself. For simple data types (Int, Float, String) it means you cannot assign another value of the same type. For composite data types (structs, arrays, dictionaries) it means you cannot assign a new value (such as a new instance of a struct) and you cannot change any of their stored properties.
Also an immutable variable has a semantic meaning for the developer and whoever reading the code - it clearly states that the variable won't change.
Last, but maybe less important from a pure development point of view, immutables can be subject to optimizations by the compiler.
Generally speaking, mutable state is to avoid as much as possible.
Immutable values help in reasoning about code, because you can easily track them down and clearly identify the value from the start to the end.
Mutable variables, on the other hand, make difficult to follow your data flow, since anyone can modify them at any time. Especially when dealing with concurrent applications, reasoning about mutable state can quickly become an incredibly hard task.
So, as a design principle, try to use let whenever possible and if you need to modify an object, simply produce a new instance.
Whenever you need to use a var, perhaps because using it makes the code clearer, try to limit their scope as much as possible and not to expose any mutable state. As an example, if you declare a var inside a function, it's safe to do so as long as you don't expose that mutability to the caller, i.e. from the caller point of view, it must not matter whether you used a var or a val in the implementation.
In general, if you know a variable's value is not going to change, declare it as a constant. Immutable variables will make your code more readable as you know for sure a particular variable is never being changed. This might also be better for the compiler as it can take advantage of the fact that a variable is constant and perform some optimisations.
This doesn't only apply to Swift. Even in C when the value of a variable is not be changed after being initialised, it's good practise to make sure it's const.
So, the way you think about "I didn't need to" should change. You don't need a constant only for values like TIMEOUT etc. You should have a constant variable anywhere you know the value of a variable doesn't need to be changed after initialisation.
Note: This is more of a general "programming as a whole" answer and not specific to Swift. #Antonio's answer has more of a focus on Swift.