Why are Data.endIndex and Data.count different? - swift

let str = "This is a swift bug"
let data = Data(str.utf8)
print("data size = ", data.endIndex, data.count)
let trimmed = data[2..<data.endIndex]
print("trimmed size = ", trimmed.endIndex, trimmed.count)
The result is
data size = 19 19
trimmed size = 19 17
According to the Apple doc about endIndex:
This is the “one-past-the-end” position, and will always be equal to the count.
Is this a bug, or am I missing something?

You should open an Apple Feedback for the documentation of Data.endIndex. It's incorrect.
The startIndex of Data is not promised to be zero, and this is an example of when it isn't. Using the Int subscript on Data is unfortunately very dangerous unless you know precisely how the Data was constructed (and specifically that it has a zero index).
Data uniquely mixes two facts that make it tricky to use correctly:
It is its own Slice
Its Index is Int
For some discussion of this, and suggested patterns, see Data.popFirst(), removeFirst() adjust indices. Also see Data ranged subscribe strange behavior for another version of this question.
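A minimal sketch of what this means in practice, reusing the values from the question. It only assumes Foundation's Data; the constant names are illustrative. The point is that count equals endIndex - startIndex, so it matches endIndex only when startIndex happens to be zero.

```swift
import Foundation

let data = Data("This is a swift bug".utf8)
let trimmed = data[2..<data.endIndex]   // still typed as Data, but its indices start at 2

// count is the distance between the two indices, not endIndex itself.
let distance = trimmed.endIndex - trimmed.startIndex
print(trimmed.startIndex, trimmed.endIndex, trimmed.count, distance)

// Index-agnostic ways to read the first byte of any Data value.
let firstByte = trimmed[trimmed.startIndex]
let sameByte = trimmed.first
print(firstByte, sameByte as Any)
```

Reading through startIndex or first works for both a freshly built Data and a slice, which is why it is the safer habit when a function does not control how its argument was created.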

When you use an expression like array[2..<array.endIndex] you are creating a slice. A slice is a sort of window onto an array (or something similar to an array). Its startIndex is not necessarily 0 and its endIndex is not necessarily one after the last index of the original.
Example:
let arr = Array(1...10)
print(arr.startIndex) // 0
print(arr.endIndex) // 10
let slice = arr[2...4]
print(slice.startIndex) // 2
print(slice.endIndex) // 5
print(slice.count) // 3
You see how this works? The slice has its own logic. Its size (count) is the size of the slice, but its index numbers come from the original array, because the slice is nothing but a pointer into a section of the original array. It has no independent existence; it is just a way of seeing, as it were.
An important consequence is that slice[0] will crash: the first available index of slice is 2, as we have already been told. This is why it is crucial to know whether you're dealing with an original array or a slice.
However, at least you have reason to know that this issue might exist, because slice has a special type: Array<Int>.SubSequence, meaning an ArraySlice. Encountering this by way of Data makes it trickier, because trimmed is typed as a Data, not as a DataSlice! It is in fact a Data.SubSequence, but you have no simple way of finding that out, because Data.SubSequence is typealiased to Data itself. This is to be regarded as a flaw in the Data implementation.
Nevertheless, it is exactly the same phenomenon. These answers should look strangely familiar:
let str = "This is a swift bug"
let data = Data(str.utf8)
let trimmed = data[2...4]
print(trimmed.startIndex) // 2
print(trimmed.endIndex) // 5
print(trimmed.count) // 3
The best way to solve this is Don't Do That. To take a subrange of a Data as a true Data, use subdata:
let trimmed2 = data.subdata(in: 2..<5)
print(trimmed2.startIndex) // 0, and so on; it's an independent copy

Related

Does this sorting algorithm exist? (implemented in Swift)

This might be a bad question but I am curious.
I was following some data structures and algorithms courses online, and I came across algorithms such as selection sort, insertion sort, bubble sort, merge sort, quick sort, and heap sort. They almost never get close to O(n) when the array is reverse-sorted.
I was wondering one thing: why do we not trade space for time?
When I organise something, I pick up one item and put it where it belongs. So I thought that if we have an array of items, we could just put each value at the index equal to that value.
Here is my implementation in Swift 4:
let simpleArray = [5,8,3,2,1,9,4,7,0]
let maxSpace = 20
func spaceSort(array: [Int]) -> [Int] {
    guard array.count > 1 else {
        return array
    }
    var realResult = [Int]()
    var result = Array<Int>(repeating: -1, count: maxSpace)
    for i in 0..<array.count {
        if result[array[i]] != array[i] {
            result[array[i]] = array[i]
        }
    }
    for i in 0..<result.count {
        if result[i] != -1 {
            realResult.append(i)
        }
    }
    return realResult
}
var spaceSorted = [Int]()
var execTime = BenchTimer.measureBlock {
    spaceSorted = spaceSort(array: simpleArray)
}
print("Average execution time for simple array: \(execTime)")
print(spaceSorted)
Results I get:
Does this sorting algorithm exist already?
Is this a bad idea because it only takes unique values and loses the duplicates? Or could there be uses for it?
And why can't I use Int.max for the maxSpace?
Edit:
When I use let maxSpace = Int.max, I get the error below:
error: Execution was interrupted.
MyPlayground(6961,0x7000024af000) malloc: Heap corruption detected, free list is damaged at 0x600003b7ebc0
* Incorrect guard value: 0
MyPlayground(6961,0x7000024af000) malloc: * set a breakpoint in malloc_error_break to debug
Thanks for the answers
This is an extreme version of radix sort. Quoted from Wikipedia:
radix sort is a non-comparative sorting algorithm. It avoids comparison by creating and distributing elements into buckets according to their radix. For elements with more than one significant digit, this bucketing process is repeated for each digit, while preserving the ordering of the prior step, until all digits have been considered. For this reason, radix sort has also been called bucket sort and digital sort.
In this case you choose your radix as maxSpace, and so you don't have any "elements with more than one significant digit" (from quote above).
Now, if you would use a Hash Set data structure instead of an array, you would actually not need to really allocate the space for the whole range. You would still keep all the loop iterations though (from 0 to maxSpace), and it would check whether the hash set contains the value of i (the loop variable), and if so, output it.
This can only be an efficient algorithm if maxSpace has the same order of magnitude as the number of elements n in your input array. Other sorting algorithms can sort with O(n log n) time complexity, so for cases where maxSpace is much greater than n log n, the algorithm is not that compelling.
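On the question of lost duplicates: a closely related technique, usually called counting sort, stores how many times each value occurs instead of the value itself. The sketch below is not the asker's code; the function name countingSort and the upperBound parameter are made up for illustration, and it assumes every input value lies in 0..<upperBound. It also shows why a bound like Int.max cannot work: the table needs one slot per possible value.

```swift
// Counting sort for small non-negative integers (illustrative sketch).
// Assumes every value lies in 0..<upperBound; anything else fails the precondition.
func countingSort(_ values: [Int], upperBound: Int) -> [Int] {
    precondition(values.allSatisfy { (0..<upperBound).contains($0) }, "value out of range")

    // One counter per possible value: this is the space traded for time.
    var counts = [Int](repeating: 0, count: upperBound)
    for value in values {
        counts[value] += 1
    }

    // Walk the table in index order and emit each value as often as it was seen.
    var sorted = [Int]()
    sorted.reserveCapacity(values.count)
    for (value, count) in counts.enumerated() where count > 0 {
        sorted.append(contentsOf: repeatElement(value, count: count))
    }
    return sorted
}

print(countingSort([5, 8, 3, 2, 1, 9, 4, 7, 0, 3, 3], upperBound: 20))
```

The running time is proportional to the input length plus upperBound, which matches the point above: it only pays off when the value range is comparable to the number of elements.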

Data ranged subscribe strange behavior

I was playing with Swift's Data in the following small piece of code:
var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]
print("\(d2[0])")
To my surprise, this code crashes at print(), while the following code does not:
var d = Data(count: 10)
d[5] = 3
let d2 = d.subdata(in: 5..<8)
print("\(d2[0])")
I somewhat understand why this happens, but I don't get why it is designed this way. When I use subdata() I get a full copy of the range, so indexing is valid from 0. But when I use the range subscript [], I get access to the requested range while the indices stay the same as before. So in my first example d2[5] is 3.
Why is it designed like this? I don't want to copy my data with subdata(). I just want to access a portion of my data with more convenient indexing.
This creates especially unexpected behavior if you pass the slice to a function. For example, the following code crashes, and it is not easy to see why:
func testit(idata: Data) {
    if idata.count > 0 {
        print("\(idata.count)")
        print("\(idata[0])")
    }
}
//...
var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]
testit(idata: d2)
This code is really strange: if you debug it, you see that print("\(idata.count)") prints 3 as the size of idata, which is correct, but accessing idata[0] crashes.
Is there any reason for this design? I expected to be able to index the Data returned by the range subscript starting at 0, but that is not the case. Can I do this without using subdata(), which copies the data, and without passing the slice's base offset as an additional argument?
d[5..<8] returns Data.Slice – which happens to be Data. Generally, slices share the indices with their base collection, as documented in Slice.
One possible reason for this design decision is that it guarantees that subscripting a slice is an O(1) operation (adding an offset for accessing the base collection is not necessarily O(1), e.g. not for strings).
It is also convenient, as in this example to locate the text after the second occurrence of a character in a string:
let string = "abcdefgabcdefg"
// Find first occurrence of "d":
if let r1 = string.range(of: "d") {
    // Find second occurrence of "d":
    if let r2 = string[r1.upperBound...].range(of: "d") {
        print(string[r2.upperBound...]) // efg
    }
}
As a consequence, you must never assume that the indices of a collection are zero-based (unless documented, as for Array.startIndex). Use startIndex to get the first index, or first to get the first element.
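Applied to the testit(idata:) example from the question, that advice could look like the sketch below. The function name describe is hypothetical; it uses first for the leading byte and enumerated() for zero-based offsets, so it behaves the same for a whole Data and for a slice, without copying through subdata().

```swift
import Foundation

// Hypothetical index-agnostic variant of testit(idata:).
func describe(_ data: Data) -> String {
    guard let first = data.first else { return "empty" }
    // enumerated() yields zero-based offsets regardless of the slice's own indices.
    let bytes = data.enumerated().map { "\($0.offset):\($0.element)" }
    return "count=\(data.count) first=\(first) bytes=[\(bytes.joined(separator: " "))]"
}

var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]          // indices 5, 6, 7

print(describe(d2))

// Offset-based access relative to startIndex instead of assuming index 0.
let second = d2[d2.index(d2.startIndex, offsetBy: 1)]
print(d2[d2.startIndex], second)
```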

If given a Substring, is it possible to access the underlying complete String on which it is based?

Say I have the following code...
let x = "ABCDE"
// 'x' is a String
var y = x[x.index(x.startIndex, offsetBy: 1)...x.index(x.startIndex, offsetBy: 3)]
// 'y' is a Substring that equals "BCD"
If you only have access to y, is it possible to access x, or specifically parts of x which are outside the range of y? (i.e. can you access 'A' or 'E', or grow the range of y?)
So here's what Apple says:
Important
Don’t store substrings longer than you need them to perform a specific
operation. A substring holds a reference to the entire storage of the
string it comes from, not just to the portion it presents, even when
there is no other reference to the original string. Storing substrings
may, therefore, prolong the lifetime of string data that is no longer
otherwise accessible, which can appear to be memory leakage.
Now I find their use of the word "otherwise" in the last sentence rather interesting. It seems to me to keep the door open on this question - could a substring be manipulated to be expanded to include memory on either side that we know still exists as part of the original string?
So here's what I'd think is a fair test:
let x = "ABCDEFGH"
let substr = x.prefix(3)
var substrIndex = substr.startIndex
substr.formIndex(&substrIndex, offsetBy: 4) // offset beyond the substring
let prefix = substr.prefix(through:substrIndex)
print(prefix)
So what'cha think that would print?
Actually we never get to the print. We get a runtime fatal error instead.
Thread 1: Fatal error: Operation results in an invalid index
BTW, even trying the following results in an EXC_BAD_ACCESS crash:
let x = "ABCDEFGH"
var substr = x.prefix(3)
withUnsafePointer(to: &substr) { substrPointer in
    let z = substrPointer.advanced(by: 3)
    print(z.pointee)
}
So I don't think there's a way to get to the rest of the string by offsetting indices or pointers on a substring like this, from within the Substring or String types anyhow, or even dealing with unsafe pointers. I'm sure there's a way using direct memory access, for Apple claims the rest of the String's memory is there... but you'd probably have to fall back to C or C++.
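For what it is worth, the experiments above go through index offsets and raw pointers only. As far as I know, the standard library's Substring also has a read-only base property that returns the String it was sliced from; the sketch below assumes a toolchain where that property is available. Because a substring shares indices with its base, its bounds can then be used to reach the characters on either side.

```swift
let x = "ABCDE"
let y = x.dropFirst().prefix(3)     // Substring "BCD"

// base is the String the substring was sliced from (assumed available, see above).
let whole = y.base
print(whole)

// The substring's own bounds are valid indices into the base string.
let before = whole[..<y.startIndex]
let after = whole[y.endIndex...]
print(before, after)
```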

Can this be more Swift3-like?

What I want to do is populate an Array (sequence) by appending the elements of another Array (availableExercises), one by one. I want to do it one by one because the sequence has to hold a given number of items. The list of available exercises is finite, and I want to cycle through its elements as many times as needed, rather than only in whole multiples of the list's length.
The current code included does exactly that and works. It is possible to just paste that in a Playground to see it at work.
My question is: Is there a better Swift3 way to achieve the same result? Although the code works, I'd like to not need the variable i. Swift3 allows for structured code like closures and I'm failing to see how I could use them better. It seems to me there would be a better structure for this which is just out of reach at the moment.
Here's the code:
import UIKit
let repTime = 20 //seconds
let restTime = 10 //seconds
let woDuration = 3 //minutes
let totalWOTime = woDuration * 60
let sessionTime = repTime + restTime
let totalSessions = totalWOTime / sessionTime
let availableExercises = ["push up","deep squat","burpee","HHSA plank"]
var sequence = [String]()
var i = 0
while sequence.count < totalSessions {
    if i < availableExercises.count {
        sequence.append(availableExercises[i])
        i += 1
    }
    else { i = 0 }
}
sequence
You can get rid of i by taking sequence.count modulo availableExercises.count, like this:
var sequence = [String]()
while sequence.count < totalSessions {
    let currentIndex = sequence.count % availableExercises.count
    sequence.append(availableExercises[currentIndex])
}
print(sequence)
//["push up", "deep squat", "burpee", "HHSA plank", "push up", "deep squat"]
You can condense your logic by using map(_:) and the remainder operator %:
let sequence = (0..<totalSessions).map {
    availableExercises[$0 % availableExercises.count]
}
map(_:) will iterate from 0 up to (but not including) totalSessions, and for each index, the corresponding element in availableExercises will be used in the result, with the remainder operator allowing you to 'wrap around' once you reach the end of availableExercises.
This also has the advantage of preallocating the resultant array (which map(_:) will do for you), preventing it from being needlessly re-allocated upon appending.
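One detail the map(_:) version leaves open is an empty source array, where % availableExercises.count would trap on division by zero. A small sketch of the same idea wrapped in a function with that guard; the name cycled(_:count:) is made up for illustration.

```swift
// Hypothetical helper: repeats `items` cyclically until `count` elements are produced.
func cycled<T>(_ items: [T], count: Int) -> [T] {
    // An empty source would make `% items.count` trap, so return early.
    guard !items.isEmpty, count > 0 else { return [] }
    // Index i maps to i % items.count, which wraps around at the end of `items`.
    return (0..<count).map { items[$0 % items.count] }
}

let availableExercises = ["push up", "deep squat", "burpee", "HHSA plank"]
print(cycled(availableExercises, count: 6))
```

With four exercises and six sessions, the indices used are 0, 1, 2, 3, 0, 1.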
Personally, Nirav's solution is probably the best, but I can't help offering this solution, particularly because it demonstrates (pseudo-)infinite lazy sequences in Swift:
Array(
    repeatElement(availableExercises, count: .max)
        .joined()
        .prefix(totalSessions))
If you just want to iterate over this, you of course don't need the Array(), you can leave the whole thing lazy. Wrapping it up in Array() just forces it to evaluate immediately ("strictly") and avoids the crazy BidirectionalSlice<FlattenBidirectionalCollection<Repeated<Array<String>>>> type.

Why do Slice<T>s always have an even capacity?

I'm currently using the Slice type in a project.
I noticed some weird behaviour, so I decided to take a closer look at Slices. While testing around I discovered this:
var slice = Slice<Int>()
var range = 1...9
let length = range.endIndex - range.startIndex
println(" length of 'range': \(length)") //prints "length of 'range': 9"
slice.reserveCapacity(length)
println("capacity of 'slice': \(slice.capacity)") //prints "capacity of 'slice': 10"
Now when changing the range the capacity of slice is still always rounded up to the next even number. Why is that?
Update #1:
The first problem has now been addressed by @MartinR. The initial reason I asked this question was the following, though.
Let's add this chunk of code:
for index in range.startIndex..<range.endIndex {
    slice[index - range.startIndex] = index
}
I assumed this would fill slice with the values of the range. It doesn't, though; it fails with: fatal error: Slice index out of range.
When I check the indices though, like here, they're fine.
Why is this happening then?
Increasing the capacity only allocates internal memory, but does not increase the endIndex of the slice.
You still have to append new elements:
for index in range.startIndex..<range.endIndex {
    slice.append(index)
}
which is the same as
slice += range
Or you can replace an empty slice with a new one, for example
slice.replaceRange(0 ..< 0, with: range)
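The code in this thread uses early Swift spellings (Slice<T>, println, replaceRange). A rough modern equivalent, assuming current Swift where the type is ArraySlice<T>, illustrates the same point: reserving capacity leaves count at zero, so elements must be appended rather than assigned by index.

```swift
var slice = ArraySlice<Int>()
let range = 1...9

// Reserving capacity allocates storage but adds no elements.
slice.reserveCapacity(range.count)
let countAfterReserve = slice.count
let hasRoom = slice.capacity >= range.count
print(countAfterReserve, hasRoom)

// Subscript assignment would trap here because there are no valid indices yet;
// appending is what actually grows the slice.
slice.append(contentsOf: range)
print(slice.count, Array(slice))
```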