How to handle large strings in swift? - swift

I have this code, and it takes a long time to execute in Swift.
Each iteration takes about 1 second to execute. Why?
CPU usage while the loop runs is 97-98%, and the energy impact is High.
Here is the code:
var braces:Int = 1;
var i:Int = startIndex;
let jsFileChars = Array(javascriptFile);
while(i < javascriptFile.count){ //count:1240265
    if (braces == 0) {
        break;
    }
    if (jsFileChars[i] == "{"){
        braces = braces+1;
    }else if (jsFileChars[i] == "}"){
        braces = braces-1;
    }
    i = i+1;
}
This loop iterates at a very slow pace. Why?

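The loop is slow because determining the count of a Swift string is an
O(N) operation, where N is the number of characters in the string.
The while condition evaluates javascriptFile.count on every iteration,
so the loop as a whole does O(N²) work.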
See also Counting Characters in “The Swift Programming Language”:
NOTE
Extended grapheme clusters can be composed of multiple Unicode scalars. This means that different characters—and different representations of the same character—can require different amounts of memory to store. Because of this, characters in Swift don’t each take up the same amount of memory within a string’s representation. As a result, the number of characters in a string can’t be calculated without iterating through the string to determine its extended grapheme cluster boundaries. ...
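To make the quoted note concrete, here is a small sketch (the two example strings are my own, not from the question) showing that the same visible character can be stored with different numbers of scalars and bytes, which is why count has to walk the string:

```swift
// "é" written as "e" followed by U+0301 COMBINING ACUTE ACCENT.
let decomposed = "e\u{301}"
// The same character as the single precomposed scalar U+00E9.
let precomposed = "\u{E9}"

print(decomposed.count)                 // 1 Character
print(decomposed.unicodeScalars.count)  // 2 Unicode scalars
print(decomposed.utf8.count)            // 3 UTF-8 code units
print(precomposed.utf8.count)           // 2 UTF-8 code units
print(decomposed == precomposed)        // true (canonically equivalent)
```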
Replacing javascriptFile.count by jsFileChars.count should already
improve the performance, because the length of an array is determined in
constant time.
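As a rough sketch of that minimal fix, the loop from the question could be wrapped like this. The function name and the choice to return nil for unbalanced input are my own assumptions; the logic is otherwise the question's loop with the count taken from the array once:

```swift
/// Hypothetical helper: returns the index just past the "}" that closes a block
/// opened before `startIndex`, or nil if the braces never balance.
func indexAfterClosingBrace(in characters: [Character], from startIndex: Int) -> Int? {
    var braces = 1
    var i = startIndex
    let count = characters.count   // O(1) for an Array, and read only once
    while i < count {
        if characters[i] == "{" {
            braces += 1
        } else if characters[i] == "}" {
            braces -= 1
            if braces == 0 { return i + 1 }
        }
        i += 1
    }
    return nil
}

let jsFileChars = Array("{ a { b } c } d")
if let end = indexAfterClosingBrace(in: jsFileChars, from: 1) {
    print(end)   // 13
}
```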
Even better, iterate over the characters directly, without creating an
array at all:
var braces = 1
for char in javascriptFile {
    if char == "{" {
        braces += 1
    } else if char == "}" {
        braces -= 1
    }
}
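Iterating over the UTF-16 view is even faster on Swift versions whose
strings use UTF-16 as internal storage, as was the case before Swift 5
(since Swift 5, native strings are stored as UTF-8, so the same loop over
the utf8 view is the better fit there):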
let openingBrace = "{".utf16.first!
let closingBrace = "}".utf16.first!
var braces = 1
for char in javascriptFile.utf16 {
    if char == openingBrace {
        braces += 1
    } else if char == closingBrace {
        braces -= 1
    }
}
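The loops above count every brace in the file, whereas the question's loop starts at startIndex and stops once the count reaches zero. A possible sketch that keeps both behaviours while iterating over code units is below. It assumes Swift 5 or later (hence the utf8 view), the function name is hypothetical, and the start position is taken as a UTF-8 offset rather than a Character index. Comparing raw bytes is safe here because ASCII values never occur inside a multi-byte UTF-8 sequence.

```swift
/// Hypothetical helper: returns the UTF-8 offset just past the "}" that closes a
/// block opened before `start`, or nil if the braces never balance.
func utf8OffsetAfterClosingBrace(in text: String, fromUTF8Offset start: Int) -> Int? {
    let openingBrace = UInt8(ascii: "{")
    let closingBrace = UInt8(ascii: "}")
    var braces = 1
    var offset = start
    for byte in text.utf8.dropFirst(start) {
        offset += 1   // offset now points just past the byte being examined
        if byte == openingBrace {
            braces += 1
        } else if byte == closingBrace {
            braces -= 1
            if braces == 0 { return offset }
        }
    }
    return nil
}

let sample = "{ a { b } c } d"
if let end = utf8OffsetAfterClosingBrace(in: sample, fromUTF8Offset: 1) {
    print(end)   // 13
}
```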

When you're thinking of iterating over a collection in Swift (and a String is a collection of characters), it's sometimes faster to use reduce() instead. You can implement your brace counter using reduce() like this:
let braces = javascriptFile.reduce(0, { count, char in
    switch char {
    case "{": return count + 1
    case "}": return count - 1
    default: return count
    }
})
I don't know if that'll actually be faster than using a for loop in your case, but might be worth a try. If nothing else, the intent is very clear when written this way.

Related

Swift Dictionary is slow?

Situation: I was solving LeetCode 3, Longest Substring Without Repeating Characters. When I use a Dictionary in Swift, the result is Time Limit Exceeded on the last test case, but the same approach in C++ actually passes with a perfectly fine runtime. I thought a Swift Dictionary was the same thing as an unordered_map.
Some research: I found some resources that say to use NSDictionary instead of the regular one, but it requires reference types instead of Int, Character, etc.
Expected result: fast performance when accessing a Dictionary in Swift.
Question: I know there are better answers to the problem itself, but the main goal here is: is there an efficient way to read from and write to a Dictionary, or something we can use as a substitute?
func lengthOfLongestSubstring(_ s: String) -> Int {
    var window:[Character:Int] = [:] //swift dictionary is kind of slow?
    let array = Array(s)
    var res = 0
    var left = 0, right = 0
    while right < s.count {
        let rightChar = array[right]
        right += 1
        window[rightChar, default: 0] += 1
        while window[rightChar]! > 1 {
            let leftChar = array[left]
            window[leftChar, default: 0] -= 1
            left += 1
        }
        res = max(res, right - left)
    }
    return res
}
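The Dictionary is not the problem. The complexity of count on a String is O(n), and your while condition evaluates s.count on every iteration, so you should save the count in a variable (or use array.count, which is O(1)). You can read about this in the chapter
Strings and Characters in the Swift book: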
Extended grapheme clusters can be composed of multiple Unicode scalars. This means that different characters—and different representations of the same character—can require different amounts of memory to store. Because of this, characters in Swift don’t each take up the same amount of memory within a string’s representation. As a result, the number of characters in a string can’t be calculated without iterating through the string to determine its extended grapheme cluster boundaries. If you are working with particularly long string values, be aware that the count property must iterate over the Unicode scalars in the entire string in order to determine the characters for that string.
The count of the characters returned by the count property isn’t always the same as the length property of an NSString that contains the same characters. The length of an NSString is based on the number of 16-bit code units within the string’s UTF-16 representation and not the number of Unicode extended grapheme clusters within the string.
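For illustration, here is one way the function from the question might look with that change applied. This is only a sketch: the algorithm is unchanged, the count is read once from the array, and the variable names are mine.

```swift
func lengthOfLongestSubstring(_ s: String) -> Int {
    let characters = Array(s)       // one O(n) pass over the string
    let count = characters.count    // O(1) on an Array, read once
    var window: [Character: Int] = [:]
    var longest = 0
    var left = 0
    var right = 0
    while right < count {
        let rightChar = characters[right]
        right += 1
        window[rightChar, default: 0] += 1
        // Shrink the window from the left until rightChar is unique again.
        while window[rightChar, default: 0] > 1 {
            let leftChar = characters[left]
            window[leftChar, default: 0] -= 1
            left += 1
        }
        longest = max(longest, right - left)
    }
    return longest
}

print(lengthOfLongestSubstring("abcabcbb"))   // 3
```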

Are overflow operators less efficient than performing operations that don't result in overflows?

What I Am Doing: I am writing a chess engine in Swift. One of the most important parts of writing a strong chess engine is the ability to generate as many possible future board positions in as little time possible. The more positions your engine can generate and evaluate in a shorter amount of time, the stronger the engine is.
That being said, I've written functions for generating moves for sliding pieces (bishops, rooks, and queens). These functions make use of overflow operators (&+, &-, &*), as using the normal arithmetic operators frequently causes overflow errors.
Generating said moves requires two functions, one for generating all legal vertical and horizontal moves for a sliding piece, and one for generating all legal diagonal moves for a sliding piece. These two functions effectively go about doing the same thing, we just manipulate the arguments slightly differently. Here is what the function for generating horizontal and vertical moves looks like:
//occupied is a bitboard that represents every square on the chess board that is occupied
//this value is set somewhere before our move generation functions are ever called
var occupied: UInt64 = 0
//the rankMasks and fileMasks are simply arrays of bitboards that represent each individual file and rank on a chess board
//rankMasks8[0] would represent the squares a8-h8, rankMasks8[1] would represent the squares a7-h7
//fileMasks8[0] would represent the squares a1-a8, fileMasks8[1] would represent the squares b1-b8
let rankMasks8: [UInt64] = [ 255, 65280, 16711680, 4278190080, 1095216660480, 280375465082880, 71776119061217280, 18374686479671623680 ]
let fileMasks8: [UInt64] = [ 72340172838076673, 144680345676153346, 289360691352306692, 578721382704613384, 1157442765409226768, 2314885530818453536, 4629771061636907072, 9259542123273814144 ]
...
//We pass a square (0 - 63) as s and we are returned a UInt64, the bitboard representing all the squares that the piece on the passed square can move to.
func horizontalAndVerticalMoves(s: Int) -> UInt64 {
    //convert the passed square into a bitboard that represents its location, by raising 2 to the power of s
    let binaryS: UInt64 = 1<<s
    //formula for generating possible horizontal moves
    let possibilitiesHorizontal: UInt64 = (occupied &- (2 &* binaryS)) ^ UInt64.reverse(UInt64.reverse(occupied) &- 2 &* UInt64.reverse(binaryS))
    //formula for generating vertical moves
    let possibilitiesVertical: UInt64 = ((occupied & fileMasks8[s % 8]) &- (2 &* binaryS)) ^ UInt64.reverse(UInt64.reverse(occupied & fileMasks8[s % 8]) &- (2 &* UInt64.reverse(binaryS)))
    //we return possible horizontal moves OR possible vertical moves
    return (possibilitiesHorizontal & rankMasks8[s / 8]) | (possibilitiesVertical & fileMasks8[s % 8])
}
The only important thing you need to recognize about the above function is that it gives us the expected output and it does so using overflow operators.
Now, my previous iteration of this same method (before I understood how to circumvent the overflows using overflow operators) was much more drawn out. It required running four while loops that would move away from the current piece in either the "north", "south", "east", or "west" direction until it came into contact with a piece that blocks further movement in the respective direction. Here's what this iteration of the horizontalAndVerticalMoves function looked like:
func horizontalAndVerticalMoves(s: Int) -> UInt64 {
    let rankMask: UInt64 = rankMasks8[s/8]
    let fileMask: UInt64 = fileMasks8[s%8]
    let pseudoPossibleMoves: UInt64 = rankMask ^ fileMask
    var unblockedRanks: UInt64 = 0
    var unblockedFiles: UInt64 = 0
    var direction: Direction! = Direction.north
    var testingSquare: Int = s - 8
    while direction == .north {
        if testingSquare < 0 || testingSquare%8 != s%8 {
            direction = .east
        } else {
            if 1<<testingSquare&occupied != 0 {
                unblockedRanks += rankMasks8[testingSquare/8]
                direction = .east
            } else {
                unblockedRanks += rankMasks8[testingSquare/8]
                testingSquare -= 8
            }
        }
    }
    testingSquare = s + 1
    while direction == .east {
        if testingSquare > 63 || testingSquare/8 != s/8 {
            direction = .south
        } else {
            if 1<<testingSquare&occupied != 0 {
                unblockedFiles += fileMasks8[testingSquare%8]
                direction = .south
            } else {
                unblockedFiles += fileMasks8[testingSquare%8]
                testingSquare += 1
            }
        }
    }
    testingSquare = s + 8
    while direction == .south {
        if testingSquare > 63 || testingSquare%8 != s%8 {
            direction = .west
        } else {
            if 1<<testingSquare&occupied != 0 {
                unblockedRanks += rankMasks8[testingSquare/8]
                direction = .west
            } else {
                unblockedRanks += rankMasks8[testingSquare/8]
                testingSquare += 8
            }
        }
    }
    testingSquare = s - 1
    while direction == .west {
        if testingSquare < 0 || testingSquare/8 != s/8 {
            direction = .north
        } else {
            if 1<<testingSquare&occupied != 0 {
                unblockedFiles += fileMasks8[testingSquare%8]
                direction = .north
            } else {
                unblockedFiles += fileMasks8[testingSquare%8]
                testingSquare -= 1
            }
        }
    }
    let mask = unblockedRanks | unblockedFiles
    let possibleMoves = pseudoPossibleMoves&mask
    return possibleMoves
}
I figured my newly implemented version of this function (the one that makes use of overflow operators) would be not only more succinct, but also much more efficient. The only important things you need to note about this iteration of the same function is that it gives us the expected output, but appears much more drawn out and doesn't use overflow operators.
What I've Noticed: As mentioned, I expected that my newer, cleaner code using overflow operators would perform much more quickly than the iteration that uses a bunch of while loops. When running tests to see how quickly I can generate chess moves, I've found that the version that uses while loops instead of overflow operators was significantly faster. Calculating every combination of the first three moves in a chess game takes the original function a little less than 6 seconds, while the newer function that uses overflow operators takes a little under 13 seconds.
What I'm Wondering: As I want to create the strongest chess engine possible, I am hunting for bits of my code that I can make execute faster. The old function performing quicker than the new function seems counterintuitive to me. So I am wondering, are overflow operators in Swift notoriously slow/inefficient?
Here is what the class in question that generates these moves looks like: https://github.com/ChopinDavid/Maestro/blob/master/Maestro/Moves.swift
So I am wondering, are overflow operators in Swift notoriously slow/inefficient?
No, if anything the opposite might be true.
The machine-level instructions for multiply etc. may set an overflow flag but they don't do any more than that. For the standard operators Swift has to compile additional instructions to test that flag and generate an error and this code includes branches (though branch prediction should mitigate those effectively).
The code for your overflow operator version is shorter than that for the standard operator version; it's also branch-free.
What the performance difference is between versions is another matter, but the overflow version should not be slower.
You probably need to look for your performance difference elsewhere. Happy hunting!
Note: the above comparison is based on fully optimised code ("Fastest, Smallest [-Os]") produced by the Swift compiler in Xcode 11.3.1, a debug build might produce very different results.
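To illustrate the difference in semantics (not a benchmark), here is a small sketch with made-up values. The overflow operators simply wrap, while the standard operators behave like the reporting variant followed by a trap when overflow is true:

```swift
let maxValue = UInt64.max

// Overflow operators wrap around silently.
let wrapped = maxValue &+ 1               // 0
let underflowed: UInt64 = 0 &- 1          // UInt64.max
let highBit: UInt64 = 1 << 63
let doubled = highBit &* 2                // 0, the top bit is shifted out

// The reporting variant exposes the flag that the checked operators test.
// Writing `maxValue + 1` instead would trap at runtime.
let report = maxValue.addingReportingOverflow(1)

print(wrapped, underflowed == UInt64.max, doubled)
print(report.partialValue, report.overflow)   // 0 true
```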

Swift 5: String prefix with a maximum UTF-8 length

I have a string that can contain arbitrary Unicode characters and I want to get a prefix of that string whose UTF-8 encoded length is as close as possible to 32 bytes, while still being valid UTF-8 and without changing the characters' meaning (i.e. not cutting off an extended grapheme cluster).
Consider this CORRECT example:
let string = "\u{1F3F4}\u{E0067}\u{E0062}\u{E0073}\u{E0063}\u{E0074}\u{E007F}\u{1F1EA}\u{1F1FA}"
print(string) // 🏴󠁧󠁢󠁳󠁣󠁴󠁿🇪🇺
print(string.count) // 2
print(string.utf8.count) // 36
let prefix = string.utf8Prefix(32) // <-- function I want to implement
print(prefix) // 🏴󠁧󠁢󠁳󠁣󠁴󠁿
print(prefix.count) // 1
print(prefix.utf8.count) // 28
print(string.hasPrefix(prefix)) // true
And this example of a WRONG implementation:
let string = "ar\u{1F3F4}\u{200D}\u{2620}\u{FE0F}\u{1F3F4}\u{200D}\u{2620}\u{FE0F}\u{1F3F4}\u{200D}\u{2620}\u{FE0F}"
print(string) // ar🏴‍☠️🏴‍☠️🏴‍☠️
print(string.count) // 5
print(string.utf8.count) // 41
let prefix = string.wrongUTF8Prefix(32) // <-- wrong implementation
print(prefix) // ar🏴‍☠️🏴‍☠️🏴
print(prefix.count) // 5
print(prefix.utf8.count) // 32
print(string.hasPrefix(prefix)) // false
What's an elegant way to do this? (besides trial&error)
You've shown no attempt at a solution and SO doesn't normally write code for you. So instead here are some algorithm suggestions for you:
What's an elegant way to do this? (besides trial&error)
By what definition of elegant? (like beauty it depends on the eye of the beholder...)
Simple?
Start with String.makeIterator, write a while loop, append Characters to your prefix as long as the byte count ≤ 32.
It's a very simple loop; the worst case is 32 iterations and 32 appends.
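A minimal sketch of that simple loop might look like the following. The method name utf8PrefixByAppending is hypothetical, and it uses a for-in loop over the characters rather than calling makeIterator by hand:

```swift
extension String {
    /// Hypothetical helper: the longest prefix made of whole Characters whose
    /// UTF-8 encoding fits in `maxLength` bytes.
    func utf8PrefixByAppending(_ maxLength: Int) -> String {
        var result = ""
        var byteCount = 0
        for character in self {
            let characterBytes = String(character).utf8.count
            // Stop before a Character that would push us past the limit.
            if byteCount + characterBytes > maxLength { break }
            result.append(character)
            byteCount += characterBytes
        }
        return result
    }
}

// First example string from the question (two flags, 36 bytes in total).
let flags = "\u{1F3F4}\u{E0067}\u{E0062}\u{E0073}\u{E0063}\u{E0074}\u{E007F}\u{1F1EA}\u{1F1FA}"
let flagPrefix = flags.utf8PrefixByAppending(32)
print(flagPrefix.utf8.count)         // 28, matching the question's expected output
print(flags.hasPrefix(flagPrefix))   // true
```

Because only whole Characters are appended, the result never splits an extended grapheme cluster.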
"Smart" Search Strategy?
You could implement a strategy based on the average byte length of each Character in the String and using String.prefix(_:).
E.g. for your first example the character count is 2 and the byte count 36, giving an average of 18 bytes/character, 18 goes into 32 just once (we don't deal in fractional characters or bytes!) so start with prefix(1), which has a byte count of 28 and leaves 1 character and 8 bytes – so the remainder has an average byte length of 8 and you are seeking at most 4 more bytes, 8 goes into 4 zero times and you are done.
The above example shows the case of extending (or not) your prefix guess. If your prefix guess is too long you can just start your algorithm from scratch using the prefix character & byte counts rather than the original string's.
If you have trouble implementing your algorithm ask a new question showing the code you've written, describe the issue, and someone will undoubtedly help you with the next step.
HTH
I discovered that String and String.UTF8View share the same indices, so I managed to create a very simple (and efficient?) solution, I think:
extension String {
    func utf8Prefix(_ maxLength: Int) -> Substring {
        if self.utf8.count <= maxLength {
            return Substring(self)
        }
        var index = self.utf8.index(self.startIndex, offsetBy: maxLength+1)
        self.formIndex(before: &index)
        return self.prefix(upTo: index)
    }
}
Explanation (assuming maxLength == 32 and startIndex == 0):
The first case (utf8.count <= maxLength) should be clear, that's where no work is needed.
For the second case we first get the utf8-index 33, which is either
A: the endIndex of the string (if it's exactly 33 bytes long),
B: an index at the start of a character (after 33 bytes of previous characters)
C: an index somewhere in the middle of a character (after <33 bytes of previous characters)
So if we now move our index back one character (with formIndex(before:)) this will jump to the first extended grapheme cluster boundary before index which in case A and B is one character before and in C the start of that character.
In any case, the utf8-index will now be guaranteed to be at most 32 and at an extended grapheme cluster boundary, so prefix(upTo: index) will safely create a prefix with length ≤32.
…but it's not perfect.
In theory this should also be always the optimal solution, i.e. the prefix's count is as close as possible to maxLength but sometimes when the string ends with an extended grapheme cluster consisting of more than one Unicode scalar, formIndex(before: &index) goes back one character too many than would be necessary, so the prefix ends up shorter. I'm not exactly sure why that's the case.
EDIT: A not as elegant but in exchange completely "correct" solution would be this (still only O(n)):
extension String {
    func utf8Prefix(_ maxLength: Int) -> Substring {
        if self.utf8.count <= maxLength {
            return Substring(self)
        }
        let endIndex = self.utf8.index(self.startIndex, offsetBy: maxLength)
        var index = self.startIndex
        while index <= endIndex {
            self.formIndex(after: &index)
        }
        self.formIndex(before: &index)
        return self.prefix(upTo: index)
    }
}
I like the first solution you came up with. I've found it works more correctly (and is simpler) if you take out the formIndex:
extension String {
    func utf8Prefix(_ maxLength: Int) -> Substring {
        if self.utf8.count <= maxLength {
            return Substring(self)
        }
        let index = self.utf8.index(self.startIndex, offsetBy: maxLength)
        return self.prefix(upTo: index)
    }
}
My solution looks like this:
extension String {
    func prefix(maxUTF8Length: Int) -> String {
        if self.utf8.count <= maxUTF8Length { return self }
        var utf8EndIndex = self.utf8.index(self.utf8.startIndex, offsetBy: maxUTF8Length)
        while utf8EndIndex > self.utf8.startIndex {
            if let stringIndex = utf8EndIndex.samePosition(in: self) {
                return String(self[..<stringIndex])
            } else {
                self.utf8.formIndex(before: &utf8EndIndex)
            }
        }
        return ""
    }
}
It takes the highest possible utf8 index, checks if it is a valid character index using the Index.samePosition(in:) method. If not, it reduces the utf8 index one by one until it finds a valid character index.
The advantage is that you could replace utf8 with utf16 and it would also work.

I seem to have an infinite while loop in my Swift code and I can't figure out why

var array: [Int] = []
//Here I make an array to try to dictate when to perform an IBaction.
func random() -> Int {
    let rand = arc4random_uniform(52)*10+10
    return Int(rand)
}
//this function makes a random integer for me
func finalRand() -> Int {
    var num = random()
    while (array.contains(num) == true){
        if (num == 520){
            num = 10
        }else {
            num += 10
        }
    }
    array.append(num)
    return num
}
The logic in the while statement is somewhat confusing, but you could try this:
var array:Array<Int> = []
func finalRand() -> Int {
    var num = Int(arc4random_uniform(52)*10+10)
    while array.contains(num) {
        num = Int(arc4random_uniform(52)*10+10)
    }
    array.append(num)
    return num
}
This way there will never be a repeat, and you have less boilerplate code.
There is probably a better method involving Sets, but I'm sorry I do not know much about that.
A few things:
Once your array has all 52 values, an attempt to add the 53rd number will end up in an infinite loop because all 52 values are already in your array.
In contemporary Swift versions, you can simplify your random routine to
func random() -> Int {
    return Int.random(in: 1...52) * 10
}
It seems like you might want a shuffled array of your 52 different values, which you can reduce to:
let array = Array(1...52).map { $0 * 10 }
    .shuffled()
Just iterate through that shuffled array of values.
If you really need to continue generating numbers when you’re done going through all of the values, you could, for example, reshuffle the array and start from the beginning of the newly shuffled array.
As an aside, your routine will not generate a truly random sequence. For example, let’s imagine that your code just happened to populate the values 10 through 500, with only 510 and 520 being the final possible remaining values: Your routine is 51 times as likely to generate 510 over 520 for the next value. You want to do a Fisher-Yates shuffle, like the built-in shuffled routine does, to generate a truly evenly distributed series of values. Just generate an array of possible values and shuffle it.
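Putting those suggestions together, one possible sketch is a small type that hands out values from a shuffled array and reshuffles once it runs out. The type name ShuffledDeck and its next() method are my own invention, not anything from the question:

```swift
struct ShuffledDeck {
    // Values not yet handed out in the current pass.
    private var remaining: [Int] = []

    /// Returns the next value, reshuffling all 52 values when the pass is exhausted.
    mutating func next() -> Int {
        if remaining.isEmpty {
            remaining = (1...52).map { $0 * 10 }.shuffled()
        }
        return remaining.removeLast()
    }
}

var deck = ShuffledDeck()
var firstPass: [Int] = []
for _ in 0..<52 {
    firstPass.append(deck.next())
}
print(Set(firstPass).count)   // 52, no repeats within a pass
```

Unlike the while loop in the question, this cannot spin forever: the 53rd call just starts a freshly shuffled pass.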

Can this be more Swift3-like?

What I want to do is populate an Array (sequence) by appending in the elements of another Array (availableExercises), one by one. I want to do it one by one because the sequence has to hold a given number of items. The available exercises list is in nature finite, and I want to use its elements as many times as I want, as opposed to a multiple number of the available list total.
The current code included does exactly that and works. It is possible to just paste that in a Playground to see it at work.
My question is: Is there a better Swift3 way to achieve the same result? Although the code works, I'd like to not need the variable i. Swift3 allows for structured code like closures and I'm failing to see how I could use them better. It seems to me there would be a better structure for this which is just out of reach at the moment.
Here's the code:
import UIKit
let repTime = 20 //seconds
let restTime = 10 //seconds
let woDuration = 3 //minutes
let totalWOTime = woDuration * 60
let sessionTime = repTime + restTime
let totalSessions = totalWOTime / sessionTime
let availableExercises = ["push up","deep squat","burpee","HHSA plank"]
var sequence = [String]()
var i = 0
while sequence.count < totalSessions {
    if i < availableExercises.count {
        sequence.append(availableExercises[i])
        i += 1
    }
    else { i = 0 }
}
sequence
You can get rid of i by using the remainder sequence.count % availableExercises.count as the index, like this:
var sequence = [String]()
while(sequence.count < totalSessions) {
    let currentIndex = sequence.count % availableExercises.count
    sequence.append(availableExercises[currentIndex])
}
print(sequence)
//["push up", "deep squat", "burpee", "HHSA plank", "push up", "deep squat"]
You can condense your logic by using map(_:) and the remainder operator %:
let sequence = (0..<totalSessions).map {
    availableExercises[$0 % availableExercises.count]
}
map(_:) will iterate from 0 up to (but not including) totalSessions, and for each index, the corresponding element in availableExercises will be used in the result, with the remainder operator allowing you to 'wrap around' once you reach the end of availableExercises.
This also has the advantage of preallocating the resultant array (which map(_:) will do for you), preventing it from being needlessly re-allocated upon appending.
Personally, Nirav's solution is probably the best, but I can't help offering this solution, particularly because it demonstrates (pseudo-)infinite lazy sequences in Swift:
Array(
    repeatElement(availableExercises, count: .max)
        .joined()
        .prefix(totalSessions))
If you just want to iterate over this, you of course don't need the Array(), you can leave the whole thing lazy. Wrapping it up in Array() just forces it to evaluate immediately ("strictly") and avoids the crazy BidirectionalSlice<FlattenBidirectionalCollection<Repeated<Array<String>>>> type.
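As a sketch of the "leave it lazy" variant, the values below are copied from the question, with totalSessions written out as the 6 that the question's constants produce. The name lazySessions is mine:

```swift
let availableExercises = ["push up", "deep squat", "burpee", "HHSA plank"]
let totalSessions = 6   // (3 * 60) / (20 + 10), as computed in the question

// No Array is built here; elements are produced as the loop asks for them.
let lazySessions = repeatElement(availableExercises, count: .max)
    .joined()
    .prefix(totalSessions)

for (number, exercise) in lazySessions.enumerated() {
    print("Session \(number + 1): \(exercise)")
}
```

Wrapping lazySessions in Array() at the point where you need random access gives the same six elements as the map(_:) version.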