Data ranged subscribe strange behavior - swift

I was playing with Swift's Data in the following small piece of code:
var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]
print("\(d2[0])")
To my surprise, this code throws an exception at print(), while the following code does not:
var d = Data(count: 10)
d[5] = 3
let d2 = d.subdata(in: 5..<8)
print("\(d2[0])")
I more or less understand why this happens, but I don't get why it is designed like this. When I use subdata() I get a full copy of the range, so indexing is valid from 0. But when I use the range subscript [], I get access to the requested range while the indices stay the same as before. So in my first example d2[5] is 3.
But I wonder why it is designed like this? I don't want to make a copy of my data by using subdata() method. I just wanted to access a portion of my data with better indexing.
This especially creates unexpected behavior if you pass the slice to a function. For example, the following code produces unexpected results and exceptions, and you may not easily find out why:
func testit(idata: Data) {
    if idata.count > 0 {
        print("\(idata.count)")
        print("\(idata[0])")
    }
}
//...
var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]
testit(idata: d2)
This code is really strange: if you debug it, you see that print("\(idata.count)") prints 3 as the size of idata, which is correct, but accessing it with idata[0] raises an exception.
Is there any reason for this design? I was expecting that I could access the Data returned by the range subscript starting at index 0, but that is not the case. Can I do this without using subdata(), which creates a copy of the data, and without passing the base offset of the slice as an additional argument?

d[5..<8] returns a Data.SubSequence, which happens to be Data itself. Generally, slices share their indices with their base collection, as documented in Slice.
One possible reason for this design decision is that it guarantees that subscripting a slice is an O(1) operation (adding an offset to access the base collection is not necessarily O(1), e.g. not for strings).
It is also convenient, as in this example to locate the text after the second occurrence of a character in a string:
import Foundation // for range(of:)

let string = "abcdefgabcdefg"
// Find first occurrence of "d":
if let r1 = string.range(of: "d") {
    // Find second occurrence of "d":
    if let r2 = string[r1.upperBound...].range(of: "d") {
        print(string[r2.upperBound...]) // efg
    }
}
As a consequence, you must never assume that the indices of a collection are zero-based (unless documented, as for Array.startIndex). Use startIndex to get the first index, or first to get the first element.
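To make that advice concrete for the testit(idata:) case from the question, here is a minimal sketch. The helper name byte(at:in:) is hypothetical and not part of Foundation; it relies only on the fact that Data's index type is Int, so an offset can be added to startIndex without copying.

```swift
import Foundation

// Hypothetical helper: reads the byte `offset` positions past the start of
// any Data value, whether it is a whole buffer or a slice of one.
func byte(at offset: Int, in data: Data) -> UInt8? {
    guard offset >= 0, offset < data.count else { return nil }
    return data[data.startIndex + offset]
}

var d = Data(count: 10)
d[5] = 3
let d2 = d[5..<8]                  // slice whose valid indices are 5, 6, 7

print(d2.startIndex)               // 5
print(d2.first ?? 0)               // 3
print(byte(at: 0, in: d2) ?? 0)    // 3
```

A function that receives a Data can use the same pattern internally, so callers are free to pass either a slice or a freshly created buffer.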

Related

Why are Data.endIndex and Data.count different?

let str = "This is a swift bug"
let data = Data(str.utf8)
print("data size = ", data.endIndex, data.count)
let trimmed = data[2..<data.endIndex]
print("trimmed size = ", trimmed.endIndex, trimmed.count)
The result is
data size = 19 19
trimmed size = 19 17
According to the Apple doc about endIndex:
This is the “one-past-the-end” position, and will always be equal to the count.
Is it a bug, or am I missing something?
You should open an Apple Feedback for the documentation of Data.endIndex. It's incorrect.
The startIndex of Data is not promised to be zero, and this is an example of when it isn't. Using the Int subscript on Data is unfortunately very dangerous unless you know precisely how the Data was constructed (and specifically that it has a zero index).
Data uniquely mixes two facts that make it tricky to use correctly:
It is its own Slice
Its Index is Int
For some discussion of this, and suggested patterns, see Data.popFirst(), removeFirst() adjust indices. Also see Data ranged subscribe strange behavior for another version of this question.
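As a rough illustration of the point about indices, the sketch below reuses the values from the question and avoids any assumption that the first index is zero. It is only one possible pattern, not the one recommended in the linked discussions.

```swift
import Foundation

let data = Data("This is a swift bug".utf8)
let trimmed = data[2..<data.endIndex]

// count is the distance between the two indices, not endIndex itself.
print(trimmed.startIndex, trimmed.endIndex, trimmed.count)   // 2 19 17
print(trimmed.count == trimmed.endIndex - trimmed.startIndex) // true

// Iterate with the collection's own indices instead of 0..<count.
for i in trimmed.indices {
    _ = trimmed[i]   // always a valid access
}

// Or avoid indices entirely with sequence operations.
let firstTwo = Array(trimmed.prefix(2))   // bytes for "is"
print(firstTwo)                           // [105, 115]
```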
When you use an expression like array[2..<array.endIndex] you are creating a slice. A slice is a sort of window onto an array (or something similar to an array). Its startIndex is not necessarily 0 and its endIndex is not necessarily one after the last index of the original.
Example:
let arr = Array(1...10)
print(arr.startIndex) // 0
print(arr.endIndex) // 10
let slice = arr[2...4]
print(slice.startIndex) // 2
print(slice.endIndex) // 5
print(slice.count) // 3
You see how this works? The slice has its own logic. Its size (count) is the size of the slice, but its index numbers come from the original array, because the slice is nothing but a pointer into a section of the original array. It has no independent existence; it is just a way of seeing, as it were.
An important consequence is that slice[0] will crash: the first available index of slice is 2, as we have already been told. This is why it is crucial to know whether you're dealing with an original array or a slice.
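For completeness, here is a small sketch of ways to read the first element of such a slice without hard-coding an index. It uses only standard library API; copying into a new Array is shown as one way to get zero-based indices back, at the cost of a copy.

```swift
let arr = Array(1...10)
let slice = arr[2...4]                       // ArraySlice<Int>, indices 2, 3, 4

let firstViaIndex = slice[slice.startIndex]  // 3
let firstViaProperty = slice.first           // Optional(3)
let rebased = Array(slice)                   // [3, 4, 5] with indices 0, 1, 2

print(firstViaIndex, firstViaProperty ?? -1, rebased[0])  // 3 3 3
```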
However, at least you have reason to know that this issue might exist, because slice has a special type — Array<Int>.SubSequence, meaning an ArraySlice. But the fact that you are encountering this by way of Data makes it more tricky, because trimmed is typed as a Data, not as a DataSlice! It is in fact a Data.SubSequence, but you have no simple way of finding that out! That's because Data.SubSequence is typealiased to Data itself. This is to be regarded as a flaw in the Data implementation.
Nevertheless, it is exactly the same phenomenon. These answers should look strangely familiar:
let str = "This is a swift bug"
let data = Data(str.utf8)
let trimmed = data[2...4]
print(trimmed.startIndex) // 2
print(trimmed.endIndex) // 5
print(trimmed.count) // 3
The best way to solve this is Don't Do That. To take a subrange of a Data as a true Data, use subdata:
let trimmed2 = data.subdata(in: 2..<5)
print(trimmed2.startIndex) // 0, and so on; it's an independent copy

In swift, is there a way to only check part of an array in a for loop (with a set beginning and ending point)

So let's say we have an array a = [20,50,100,200,500,1000]
Generally speaking, we could do for number in a { print(number) } if we wanted to check the entirety of a.
How can you limit which indexes are checked? That is, have a set beginning and end index (b and e respectively), and limit the values of number that are checked to those between b and e?
For example, in a, if b is set to 1 and e is set to 4, then only a[1] through a[4] are checked.
I tried doing for number in a[b...e] { print(number) }, I also saw here someone do this,
for j in 0..<n { x[i] = x[j] }, which works if we want just an ending.
This makes me think I can do something like for number in b..<=e { print(a[number]) }
Is this correct?
I'm practicing data structures in Swift and this is one of the things I've been struggling with. Would really appreciate an explanation!
b..<=e is not valid syntax. You need to use the closed range operator (...) instead, i.e.
for number in b...e {
    print(a[number])
}
And as for what you've already tried:
for number in a[b...e] {
    print(number)
}
There is nothing wrong with this syntax either. You can do it either way.
An array has a subscript that accepts a range, array[range], and returns a slice of the array (an ArraySlice).
A range of integers can be defined as either b...e or b..<e (There are other ways as well), but not b..<=e
A range itself is a sequence (something that supports a for-in loop)
So you can either do
for index in b...e {
    print(a[index])
}
or
for number in a[b...e] {
    print(number)
}
In both cases, it is on you to ensure that b...e are valid indices into the array.
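One way to take care of that validity check is to clamp the requested range to the array's own indices before slicing. The sketch below is only an illustration: the function name elements(of:from:through:) is hypothetical, and it assumes that silently shrinking an out-of-range request is acceptable for your use case.

```swift
// Hypothetical helper: returns a[b...e], shrunk to the indices that exist.
// Assumes e < Int.max so that e + 1 does not overflow.
func elements(of a: [Int], from b: Int, through e: Int) -> ArraySlice<Int> {
    guard b <= e else { return [] }
    let wanted = b..<(e + 1)                 // half-open form of b...e
    return a[wanted.clamped(to: a.indices)]  // never out of bounds
}

let a = [20, 50, 100, 200, 500, 1000]

for number in elements(of: a, from: 1, through: 4) {
    print(number)   // 50, 100, 200, 500
}

print(Array(elements(of: a, from: 1, through: 10)))  // [50, 100, 200, 500, 1000]
print(Array(elements(of: a, from: 8, through: 12)))  // []
```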

If given a Substring, is it possible to access the underlying complete String on which it is based?

Say I have the following code...
let x = "ABCDE"
// 'x' is a String
let start = x.index(x.startIndex, offsetBy: 1)
let end = x.index(x.startIndex, offsetBy: 3)
var y = x[start...end]
// 'y' is a Substring that equals "BCD"
If you only have access to y, is it possible to access x, or specifically parts of x which are outside the range of y? (i.e. can you access 'A' or 'E', or grow the range of y?)
So here's what Apple says:
Important
Don’t store substrings longer than you need them to perform a specific
operation. A substring holds a reference to the entire storage of the
string it comes from, not just to the portion it presents, even when
there is no other reference to the original string. Storing substrings
may, therefore, prolong the lifetime of string data that is no longer
otherwise accessible, which can appear to be memory leakage.
Now I find their use of the word "otherwise" in the last sentence rather interesting. It seems to me to keep the door open on this question - could a substring be manipulated to be expanded to include memory on either side that we know still exists as part of the original string?
So here's what I'd think is a fair test:
let x = "ABCDEFGH"
let substr = x.prefix(3)
var substrIndex = substr.startIndex
substr.formIndex(&substrIndex, offsetBy: 4) // offset beyond the substring
let prefix = substr.prefix(through:substrIndex)
print(prefix)
So what'cha think that would print?
Actually we never get to the print. We get a runtime fatal error instead.
Thread 1: Fatal error: Operation results in an invalid index
BTW, even trying the following results in an EXC_BAD_ACCESS crash:
let x = "ABCDEFGH"
var substr = x.prefix(3)
withUnsafePointer(to: &substr)
{ substrPointer in
let z = substrPointer.advanced(by: 3)
print(z.pointee)
}
So I don't think there's a way to get to the rest of the string if you just have a substring... from within Substring or String classes anyhow, or even dealing with unsafe pointers. I'm sure there's a way using direct memory access, for Apple claims the rest of the String's memory is there... but you'd probably have to fall back to C or C++.
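It is worth adding that the standard library's Substring has a base property that returns the String it was sliced from, and that a substring's indices are valid in that base string. A minimal sketch, assuming a toolchain recent enough to include base:

```swift
let x = "ABCDE"
// Build the substring "BCD" using String.Index values.
let y = x[x.index(after: x.startIndex)..<x.index(before: x.endIndex)]

let whole = y.base                    // the full original string
let before = whole[..<y.startIndex]   // the part in front of y
let after = whole[y.endIndex...]      // the part behind y

print(whole, before, after)           // ABCDE A E
```

Because y.startIndex and y.endIndex are positions in the original string, they can also be used to form a wider range on base, which is one way to "grow" the substring.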

Can this be more Swift3-like?

What I want to do is populate an Array (sequence) by appending the elements of another Array (availableExercises), one by one. I want to do it one by one because the sequence has to hold a given number of items. The list of available exercises is finite by nature, and I want to reuse its elements as many times as needed, so the length of the sequence does not have to be a multiple of the list's length.
The current code included does exactly that and works. It is possible to just paste that in a Playground to see it at work.
My question is: Is there a better Swift3 way to achieve the same result? Although the code works, I'd like to not need the variable i. Swift3 allows for structured code like closures and I'm failing to see how I could use them better. It seems to me there would be a better structure for this which is just out of reach at the moment.
Here's the code:
import UIKit
let repTime = 20 //seconds
let restTime = 10 //seconds
let woDuration = 3 //minutes
let totalWOTime = woDuration * 60
let sessionTime = repTime + restTime
let totalSessions = totalWOTime / sessionTime
let availableExercises = ["push up","deep squat","burpee","HHSA plank"]
var sequence = [String]()
var i = 0
while sequence.count < totalSessions {
    if i < availableExercises.count {
        sequence.append(availableExercises[i])
        i += 1
    }
    else { i = 0 }
}
sequence
You can get rid of i by using sequence.count % availableExercises.count as the index, like this:
var sequence = [String]()
while sequence.count < totalSessions {
    let currentIndex = sequence.count % availableExercises.count
    sequence.append(availableExercises[currentIndex])
}
print(sequence)
//["push up", "deep squat", "burpee", "HHSA plank", "push up", "deep squat"]
You can condense your logic by using map(_:) and the remainder operator %:
let sequence = (0..<totalSessions).map {
    availableExercises[$0 % availableExercises.count]
}
map(_:) will iterate from 0 up to (but not including) totalSessions, and for each index, the corresponding element in availableExercises will be used in the result, with the remainder operator allowing you to 'wrap around' once you reach the end of availableExercises.
This also has the advantage of preallocating the resultant array (which map(_:) will do for you), preventing it from being needlessly re-allocated upon appending.
Personally, Nirav's solution is probably the best, but I can't help offering this solution, particularly because it demonstrates (pseudo-)infinite lazy sequences in Swift:
Array(
    repeatElement(availableExercises, count: .max)
        .joined()
        .prefix(totalSessions))
If you just want to iterate over this, you of course don't need the Array(), you can leave the whole thing lazy. Wrapping it up in Array() just forces it to evaluate immediately ("strictly") and avoids the crazy BidirectionalSlice<FlattenBidirectionalCollection<Repeated<Array<String>>>> type.

Swift 3 subscript range works for first cluster but not for middle

I'm trying to figure out why the following works on the first string cluster (character) but not on a second one. Perhaps the endIndex cannot be applied on another String?
let part = "A"
let full = "ABC"
print(full[part.startIndex ... part.startIndex]) // "A"
print(full[part.endIndex ... part.endIndex]) // "" <- ???
print(full[part.endIndex ... full.index(after: part.endIndex)]) // "B"
The second print should output "B", but instead it prints an empty string. But the proof that one string's index works on another is that the last statement works.
EDIT:
Assuming full.hasPrefix(part) is true.
Swift puzzles.
You cannot use the indices of one string to subscript a different
string. That may work by chance (in your first example) or not
(in your second example), or crash at runtime.
In this particular case, part.endIndex (which is the "one past the end position" for the part string) returns
String.UnicodeScalarView.Index(_position: 1), _countUTF16: 0)
with _countUTF16: (which is the "count of this extended grapheme cluster in UTF-16 code units") being zero, i.e. it describes
a position (in the unicode scalar view) with no extent. Then
full[part.endIndex ... part.endIndex]
returns an empty string. But that is an implementation detail
(compare StringCharacterView.swift). The real answer is just "you can't do that".
A safe way to obtain the intended (?) result is
import Foundation // for range(of:)

let part = "A"
let full = "ABC"
if let range = full.range(of: part) {
    print(full[range]) // A
    if range.upperBound != full.endIndex {
        print(full[range.upperBound...range.upperBound]) // B
    }
}
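Given the assumption from the question's edit that full.hasPrefix(part) is true, another option is to carry over a character count rather than an index, and let full compute its own index from that count. The sketch below uses only the standard library; the function name character(after:in:) is hypothetical and exists only for illustration.

```swift
// Hypothetical helper: returns the character of `full` that directly follows
// the prefix `part`, or nil if `part` is not a prefix or nothing follows it.
func character(after part: String, in full: String) -> Character? {
    guard full.hasPrefix(part),
          // Convert the length of `part` into an index that belongs to `full`.
          let next = full.index(full.startIndex,
                                offsetBy: part.count,
                                limitedBy: full.endIndex),
          next < full.endIndex
    else { return nil }
    return full[next]
}

if let c = character(after: "A", in: "ABC") {
    print(c)   // B
}
```

The point of the design is that no index obtained from part is ever used to subscript full; only an integer offset crosses between the two strings.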