components(separatedBy:) versus split(separator:) - Swift

In Swift 4, Apple introduced a new method, split(separator:), on the String struct. To split a string on whitespace, which of the two is faster? For example:
let str = "My name is Sudhir"
str.components(separatedBy: " ")
//or
str.split(separator: " ")

Performance aside, there is an important difference between split(separator:) and components(separatedBy:) in how they treat empty subsequences.
They produce different results if your input contains trailing whitespace:
let str = "My name is Sudhir " // trailing space
str.split(separator: " ")
// ["My", "name", "is", "Sudhir"]
str.components(separatedBy: " ")
// ["My", "name", "is", "Sudhir", ""] ← Additional empty string
To have both produce the same result, use the omittingEmptySubsequences:false argument (which defaults to true):
// To get the same behavior:
str.split(separator: " ", omittingEmptySubsequences: false)
// ["My", "name", "is", "Sudhir", ""]
Details here:
https://developer.apple.com/documentation/swift/string/2894564-split
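A further difference worth illustrating is the return type: split(separator:) yields [Substring], while components(separatedBy:) yields [String]. The following is a minimal sketch, assuming a Swift 5 toolchain with Foundation available; the input string and variable names are made up for illustration.

```swift
import Foundation

let input = "a  b "  // two spaces in the middle, one trailing space

// split(separator:) returns [Substring] and drops empty pieces by default.
let pieces: [Substring] = input.split(separator: " ")

// components(separatedBy:) returns [String] and keeps empty pieces.
let parts: [String] = input.components(separatedBy: " ")

// Convert the substrings when an API expects [String].
let strings: [String] = pieces.map { String($0) }

print(strings) // ["a", "b"]
print(parts)   // ["a", "", "b", ""]
```

Because a Substring shares storage with the original string, converting to String is only needed when the pieces have to outlive the source or be passed to an API that takes String.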

I ran a sample test with the following code.
import Foundation // needed for Date and components(separatedBy:)
let str = """
One of those refinements is to the String API, which has been made a lot easier to use (while also gaining power) in Swift 4. In past versions of Swift, the String API was often brought up as an example of how Swift sometimes goes too far in favoring correctness over ease of use, with its cumbersome way of handling characters and substrings. This week, let’s take a look at how it is to work with strings in Swift 4, and how we can take advantage of the new, improved API in various situations. Sometimes we have longer, static strings in our apps or scripts that span multiple lines. Before Swift 4, we had to do something like inline \n across the string, add an appendOnNewLine() method through an extension on String or - in the case of scripting - make multiple print() calls to add newlines to a long output. For example, here is how TestDrive’s printHelp() function (which is used to print usage instructions for the script) looks like in Swift 3 One of those refinements is to the String API, which has been made a lot easier to use (while also gaining power) in Swift 4. In past versions of Swift, the String API was often brought up as an example of how Swift sometimes goes too far in favoring correctness over ease of use, with its cumbersome way of handling characters and substrings. This week, let’s take a look at how it is to work with strings in Swift 4, and how we can take advantage of the new, improved API in various situations. Sometimes we have longer, static strings in our apps or scripts that span multiple lines. Before Swift 4, we had to do something like inline \n across the string, add an appendOnNewLine() method through an extension on String or - in the case of scripting - make multiple print() calls to add newlines to a long output. For example, here is how TestDrive’s printHelp() function (which is used to print usage instructions for the script) looks like in Swift 3
"""
var newString = String()
for _ in 1..<9999 {
    newString.append(str)
}
var methodStart = Date()
_ = newString.components(separatedBy: " ")
print("Execution time Separated By: \(Date().timeIntervalSince(methodStart))")
methodStart = Date()
_ = newString.split(separator: " ")
print("Execution time Split By: \(Date().timeIntervalSince(methodStart))")
I ran the above code on an iPhone 6. Here are the results:
Execution time Separated By: 8.27463299036026
Execution time Split By: 4.06880903244019
Conclusion: in this test, split(separator:) is faster than components(separatedBy:).

Maybe a little late to answer:
split is a native Swift method.
components is an NSString method from Foundation.
When you play with them, they behave a little differently:
str.components(separatedBy: "\n\n")
This call can give you some interesting results
str.split(separator: "\n\n")
In Swift 4, this leads to a compile error, because the separator must be a single Character.
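To make the difference concrete, here is a small sketch, assuming Foundation is available; the sample text and variable names are invented for illustration. It contrasts a multi-character separator with components(separatedBy:) against the single-Character and closure-based forms of split.

```swift
import Foundation

let text = "first\n\nsecond\nthird"

// Foundation: the separator may be a whole string.
let paragraphs = text.components(separatedBy: "\n\n")
// ["first", "second\nthird"]

// Standard library: a single Character separator; empty pieces are dropped.
let lines = text.split(separator: "\n").map { String($0) }
// ["first", "second", "third"]

// Standard library: a closure decides what counts as a separator.
let linesByPredicate = text.split(whereSeparator: { $0.isNewline }).map { String($0) }

print(paragraphs, lines, linesByPredicate)
```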

Related

Swift string vs [Character] and performance

From the very beginning, Swift strings have been tricky because they handle Unicode correctly, and there is a standard example from Apple:
let cafe1 = "Cafe\u{301}"
let cafe2 = "Café"
print(cafe1 == cafe2)
// Prints "true"
It means that comparison has some implicit logic; it's not a simple check that two memory areas are the same. I used to see recommendations to flatten strings into [Character], since when you do this all Unicode-related conversions take place once and then all operations are faster. Additionally, strings do not necessarily use a contiguous memory area, which makes them more expensive to compare than character arrays.
Long story short, I solved this problem on LeetCode: https://leetcode.com/problems/implement-strstr/ and tried different approaches: KMP, character arrays, and strings. To my surprise, strings are the fastest.
How is that so? KMP has some prework and is less efficient in general, but why are strings faster than [Character]? Is this new in some recent Swift version, or am I missing something conceptually?
Code that I used for reference:
[Character], 8 ms, 15 MB memory
func strStr(_ haystack: String, _ needle: String) -> Int {
    guard !needle.isEmpty else { return 0 }
    guard haystack.count >= needle.count else { return -1 }
    var result: Int = -1
    let str = Array(haystack)
    let pattern = Array(needle)
    for i in 0...(str.count - pattern.count) {
        if str[i] == pattern[0] && Array(str[i...(i + pattern.count - 1)]) == pattern {
            result = i
            break
        }
    }
    return result
}
Strings, 4 ms(!!!), 14.5 MB memory
func strStr(_ haystack: String, _ needle: String) -> Int {
    guard !needle.isEmpty else { return 0 }
    guard haystack.count >= needle.count else { return -1 }
    var result: Int = -1
    for i in 0...(haystack.count - needle.count) {
        let hIdx = haystack.index(haystack.startIndex, offsetBy: i)
        if haystack[hIdx] == needle[needle.startIndex] {
            let hEndIdx = haystack.index(hIdx, offsetBy: needle.count - 1)
            if haystack[hIdx...hEndIdx] == needle {
                result = i
                break
            }
        }
    }
    return result
}
First, I think there may be some misunderstandings on your part:
"flatten strings into [Character], since when you do this all Unicode-related conversions take place once and then all operations are faster"
This doesn't make a lot of sense. Character has exactly the same issues as String. It still may be made of composed or decomposed UnicodeScalars that need special handling for equality.
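A small sketch of that point, using only the standard library (the variable names are illustrative): two Character values built from different scalar sequences still compare equal, so a [Character] comparison has to do the same Unicode-aware work per element.

```swift
// "e" followed by a combining acute accent: two scalars, one Character.
let decomposed: Character = "e\u{301}"
// The precomposed form: one scalar, one Character.
let precomposed: Character = "\u{E9}"

print(decomposed.unicodeScalars.count)  // 2
print(precomposed.unicodeScalars.count) // 1
print(decomposed == precomposed)        // true

// The same holds element by element for character arrays.
let chars1 = Array("Cafe\u{301}")
let chars2 = Array("Caf\u{E9}")
print(chars1.count, chars1 == chars2)   // 4 true
```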
"Additionally, strings do not necessarily use a contiguous memory area"
This is equally true of Array. Nothing in Array promises that memory is contiguous. That's why ContiguousArray exists.
As to why String is faster than hand-coded abstractions, that should be obvious. If you could easily out-perform String with no major tradeoffs, then stdlib would implement String to do that.
To the mechanics of it, String does not promise any particular internal representation, so it heavily depends on how you're creating your strings. Small strings, for example, can be reduced all the way to a tagged pointer that requires zero memory (it can live in a register). Strings can be stored in UTF-8, but they can also be stored in UTF-16 (which is extremely fast to work with).
When Strings are compared with other Strings that know they have the same internal representations, then they can apply various optimizations. And this really points to one part of your problem:
Array(str[i...(i + pattern.count - 1)])
This is forcing a memory allocation and copy to create a new Array out of str. You would probably do much better if you used Slice for this work rather than making full Array copies. You'd almost certainly find in that case that you're exactly matching String's implementation (which uses Substring).
But the real lesson here is that you're unlikely to beat String at its own game in the general case. If you happen to have very specialized knowledge about your Strings, then I can see where you'd be able to beat the general-purpose String algorithms. But if you think you're beating stdlib for arbitrary strings, why would stdlib not just implement what you're doing (and beat you using knowledge of the internal details of String)?
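The slice suggestion could look roughly like the sketch below. It is not the answerer's code: the function name strStrUsingSlices is made up here, and no timing claim is attached to it. The only change from the [Character] version in the question is that each window is compared as an ArraySlice view instead of being copied into a new Array.

```swift
func strStrUsingSlices(_ haystack: String, _ needle: String) -> Int {
    guard !needle.isEmpty else { return 0 }
    let str = Array(haystack)
    let pattern = Array(needle)
    guard str.count >= pattern.count else { return -1 }
    for i in 0...(str.count - pattern.count) {
        // str[i..<end] is an ArraySlice: a view into str, no allocation or copy.
        if str[i..<(i + pattern.count)].elementsEqual(pattern) {
            return i
        }
    }
    return -1
}

print(strStrUsingSlices("hello", "ll"))  // 2
print(strStrUsingSlices("aaaaa", "bba")) // -1
```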

Unused positional argument skipped in String formatting (Swift)

I want to format a string, in Swift, with two potential arguments (using Format specifiers). The string to format may have a place for only the first argument, only the second argument, or both arguments. If I use the first or both arguments it works, but if I use only the second argument, it does not work. For instance:
let title = "M."
let name = "David"
let greetingFormat = "Hello %1$@ %2$@"
print(String(format: greetingFormat, title, name))
// OUTPUT> Hello M. David
// OK
If I use only the first argument in the String to format:
let greetingFormat = "Hello %1$@"
print(String(format: greetingFormat, title, name))
// OUTPUT> Hello M.
// OK
But when using only the second argument
let greetingFormat = "Hello %2$@"
print(String(format: greetingFormat, title, name))
// OUTPUT> Hello M.
// NOT THE EXPECTED RESULT!
In the last case I expected "Hello David". Is it a bug? How can I obtain the intended result for the last case where only the second argument is used?
Remarks:
Please note that this problem occurs in the context of localization (i.e. the string to format comes from a Localizable.strings file), so I don’t have the possibility to remove unused argument directly.
The question does not relate to formatting of people's names. This is just taken as an example.
I am answering my own question, but all credit goes to @Martin R, who provided the relevant information in the comments.
It is not a bug: String(format:) does not support omitting positional parameters.
It is a known behavior dating back to Objective-C; see: stackoverflow.com/a/2946880/1187415
If you only have String arguments you can use multiple String substitutions with String.replacingOccurrences(of:, with:) instead of String(format:).
More detail on the last solution: the following works both when only one argument is used and when both arguments are used in the greetingFormat String:
greetingFormat
    .replacingOccurrences(of: "%1$@", with: title)
    .replacingOccurrences(of: "%2$@", with: name)
Of course, with String.replacingOccurrences(of:, with:) you can choose other identifiers for the substitution than %1$@ and %2$@.
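Wrapped in a function, the substitution approach might look like the sketch below. The function name greeting(format:title:name:) is hypothetical, and the sketch assumes all arguments are plain Strings.

```swift
import Foundation

// Substitutes the placeholders by plain text replacement, so a format
// that skips %1$@ still gets the right value for %2$@.
func greeting(format: String, title: String, name: String) -> String {
    return format
        .replacingOccurrences(of: "%1$@", with: title)
        .replacingOccurrences(of: "%2$@", with: name)
}

print(greeting(format: "Hello %1$@ %2$@", title: "M.", name: "David")) // Hello M. David
print(greeting(format: "Hello %2$@", title: "M.", name: "David"))      // Hello David
```

One caveat of chaining replacements: if title itself contained the text %2$@, the second call would substitute it as well, so this is only suitable when the argument values cannot contain the placeholders.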

Swift: Simple method to replace a single character in a String?

I wanted to replace the first character of a String and got it to work like this:
s.replaceSubrange(Range(NSMakeRange(0,1),in:s)!, with:".")
I wonder if there is a simpler method to achieve the same result?
[edit]
The question "Get nth character of a string in Swift programming language" doesn't provide a mutable substring. And it requires writing a String extension, which isn't really helping when trying to shorten code.
To replace the first character, you can use String concatenation with dropFirst():
var s = "😃hello world!"
s = "." + s.dropFirst()
print(s)
Result:
.hello world!
Note: This will not crash if the String is empty; it will just create a String with the replacement character.
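If in-place replacement is preferred, a range built from String.Index values avoids the NSRange conversion used in the question. This is a sketch with a made-up helper name, replacingFirstCharacter(of:with:). Unlike the dropFirst() version, it leaves an empty string empty rather than producing just the replacement character.

```swift
func replacingFirstCharacter(of s: String, with replacement: Character) -> String {
    // indices.first is nil for an empty string, so nothing is replaced.
    guard let first = s.indices.first else { return s }
    var copy = s
    // The closed range first...first covers exactly one Character,
    // however many bytes or scalars it is made of.
    copy.replaceSubrange(first...first, with: String(replacement))
    return copy
}

print(replacingFirstCharacter(of: "😃hello world!", with: ".")) // .hello world!
```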
Strings work very differently in Swift than many other languages. In Swift, a character is not a single byte but instead a single visual element. This is very important when working with multibyte characters like emoji (see: Why are emoji characters like 👩‍👩‍👧‍👦 treated so strangely in Swift strings?)
If you really do want to set a single random byte of your string to an arbitrary value as you expanded on in the comments of your question, you'll need to drop out of the string abstraction and work with your data as a buffer. This is sort of gross in Swift thanks to various safety features but it's doable:
let input = "Hello, world!"
// access the byte buffer (NUL-terminated UTF-8)
var utf8Buffer = input.utf8CString
// replace the first byte with whatever random data we want
utf8Buffer[0] = 46 // ASCII encoding of '.'
// now convert back to a Swift string
var output: String! = nil // buffer for holding our new target
utf8Buffer.withUnsafeBufferPointer { (ptr) in
    // load the byte buffer into a Swift string
    output = String(cString: ptr.baseAddress!)
}
print(output!) // .ello, world!
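A variant of the same idea that avoids the unsafe pointer is sketched below, assuming Swift 5. It copies the UTF-8 bytes into an ordinary array and decodes them again with String(decoding:as:). This is an alternative technique, not the code from the answer above.

```swift
let input = "Hello, world!"

// Copy the UTF-8 code units into a plain [UInt8] (no trailing NUL here).
var bytes = Array(input.utf8)

// Overwrite the first byte, guarding against an empty string.
if !bytes.isEmpty {
    bytes[0] = 46 // ASCII encoding of "."
}

// Decode back into a String; invalid UTF-8 is repaired with U+FFFD.
let output = String(decoding: bytes, as: UTF8.self)
print(output) // .ello, world!
```

As with the buffer version, overwriting a single byte of a multi-byte character produces invalid UTF-8, which the decoder replaces rather than preserving.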

Expression was too complex to be solved in reasonable time- Appending Strings [duplicate]

I find this amusing more than anything. I've fixed it, but I'm wondering about the cause. Here is the error: DataManager.swift:51:90: Expression was too complex to be solved in reasonable time; consider breaking up the expression into distinct sub-expressions. Why is it complaining? It seems like one of the most simple expressions possible.
The compiler points to the columns + ");"; section:
func tableName() -> String { return("users"); }
func createTableStatement(schema: [String]) -> String {
    var schema = schema;
    schema.append("id string");
    schema.append("created integer");
    schema.append("updated integer");
    schema.append("model blob");
    var columns: String = ",".join(schema);
    var statement = "create table if not exists " + self.tableName() + "(" + columns + ");";
    return(statement);
}
the fix is:
var statement = "create table if not exists " + self.tableName();
statement += "(" + columns + ");";
this also works (via @efischency) but I don't like it as much because I think the ( get lost:
var statement = "create table if not exists \(self.tableName()) (\(columns))"
I am not an expert on compilers - I don't know if this answer will "change how you think in a meaningful way," but my understanding of the problem is this:
It has to do with type inference. Each time you use the + operator, Swift has to search through all of the possible overloads for + and infer which version of + you are using. I counted just under 30 overloads for the + operator. That's a lot of possibilities, and when you chain 4 or 5 + operations together and ask the compiler to infer all of the arguments, you are asking a lot more than it might appear at first glance.
That inference can get complicated - for example, if you add a UInt8 and an Int using +, the output will be an Int, but there's some work that goes into evaluating the rules for mixing types with operators.
And when you are using literals, like the String literals in your example, the compiler is doing the work of converting the String literal to a String, and then doing the work of inferring the argument and return types for the + operator, etc.
If an expression is sufficiently complex - i.e., it requires the compiler to make too many inferences about the arguments and the operators - it quits and tells you that it quit.
Having the compiler quit once an expression reaches a certain level of complexity is intentional. The alternative is to let the compiler try and do it, and see if it can, but that is risky - the compiler could go on trying forever, bog down, or just crash. So my understanding is that there is a static threshold for the complexity of an expression that the compiler will not go beyond.
My understanding is that the Swift team is working on compiler optimizations that will make these errors less common. You can learn a little bit about it on the Apple Developer forums.
On the Dev forums, Chris Lattner has asked people to file these errors as radar reports, because they are actively working on fixing them.
That is how I understand it after reading a number of posts here and on the Dev forum about it, but my understanding of compilers is naive, and I am hoping that someone with a deeper knowledge of how they handle these tasks will expand on what I have written here.
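As an illustration of giving the type checker less to infer, here is a sketch of the question's function written with current API names (joined(separator:) in place of the older join) and with explicit type annotations on each intermediate value. It is free-standing rather than a method, so self is dropped. Whether any particular compiler version needs this help is not something the sketch can show.

```swift
func tableName() -> String { return "users" }

func createTableStatement(schema: [String]) -> String {
    var allColumns: [String] = schema
    allColumns.append(contentsOf: ["id string", "created integer", "updated integer", "model blob"])

    // Each sub-expression has an explicit type and at most a few + operators.
    let columns: String = allColumns.joined(separator: ",")
    let prefix: String = "create table if not exists " + tableName()
    let statement: String = prefix + "(" + columns + ");"
    return statement
}

print(createTableStatement(schema: ["name text"]))
// create table if not exists users(name text,id string,created integer,updated integer,model blob);
```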
This is almost the same as the accepted answer but with some added dialogue (I had with Rob Napier, his other answers and Matt, Oliver, David from Slack) and links.
See the comments in this discussion. The gist of it is:
+ is heavily overloaded (Apple seems to have fixed this for some cases)
The + operator is heavily overloaded; as of now it has 27 different functions, so if you are concatenating 4 strings, i.e. you have 3 + operators, the compiler has to check between 27 operators each time, so that's 27^3 combinations. But that's not all.
There is also a check to see whether the lhs and rhs of the + function are both valid; if they are, it calls through to the core append. There you can see that a number of somewhat intensive checks can occur. If the string is stored non-contiguously, which appears to be the case when the string you're dealing with is actually bridged to NSString, Swift has to reassemble all the byte array buffers into a single contiguous buffer, which requires creating new buffers along the way. Eventually you get one buffer that contains the string you're attempting to concatenate.
In a nutshell, there are 3 clusters of compiler checks that will slow you down, i.e. each sub-expression has to be reconsidered in light of everything it might return. As a result, concatenating strings with interpolation, i.e. using "My fullName is \(firstName) \(lastName)", is much better than "My firstName is" + firstName + lastName, since interpolation doesn't have any overloading.
Swift 3 has made some improvements. For more information read How to merge multiple Arrays without slowing the compiler down?. Nonetheless the + operator is still overloaded and it's better to use string interpolation for longer strings.
Usage of optionals (ongoing problem - solution available)
In this very simple project:
import UIKit
class ViewController: UIViewController {
    let p = Person()
    let p2 = Person2()
    func concatenatedOptionals() -> String {
        return (p2.firstName ?? "") + "" + (p2.lastName ?? "") + (p2.status ?? "")
    }
    func interpolationOptionals() -> String {
        return "\(p2.firstName ?? "") \(p2.lastName ?? "")\(p2.status ?? "")"
    }
    func concatenatedNonOptionals() -> String {
        return (p.firstName) + "" + (p.lastName) + (p.status)
    }
    func interpolatedNonOptionals() -> String {
        return "\(p.firstName) \(p.lastName)\(p.status)"
    }
}
struct Person {
    var firstName = "Swift"
    var lastName = "Honey"
    var status = "Married"
}
struct Person2 {
    var firstName: String? = "Swift"
    var lastName: String? = "Honey"
    var status: String? = "Married"
}
The compile times for the functions are as follows:
21664.28ms /Users/Honey/Documents/Learning/Foundational/CompileTime/CompileTime/ViewController.swift:16:10 instance method concatenatedOptionals()
2.31ms /Users/Honey/Documents/Learning/Foundational/CompileTime/CompileTime/ViewController.swift:20:10 instance method interpolationOptionals()
0.96ms /Users/Honey/Documents/Learning/Foundational/CompileTime/CompileTime/ViewController.swift:24:10 instance method concatenatedNonOptionals()
0.82ms /Users/Honey/Documents/Learning/Foundational/CompileTime/CompileTime/ViewController.swift:28:10 instance method interpolatedNonOptionals()
Notice how crazy high the compilation duration for concatenatedOptionals is.
This can be solved by doing:
let emptyString: String = ""
func concatenatedOptionals() -> String {
    return (p2.firstName ?? emptyString) + emptyString + (p2.lastName ?? emptyString) + (p2.status ?? emptyString)
}
which compiles in 88 ms.
The root cause of the problem is that the compiler doesn't identify the "" as a String. It's actually ExpressibleByStringLiteral.
The compiler will see ?? and will have to loop through all types that conform to this protocol until it finds a type that can serve as a default for String.
By using emptyString, which is hardcoded to String, the compiler no longer needs to loop through all conforming types of ExpressibleByStringLiteral.
To learn how to log compilation times see here or here
Other similar answers by Rob Napier on SO:
Why string addition takes so long to build?
How to merge multiple Arrays without slowing the compiler down?
Swift Array contains function makes build times long
This is quite ridiculous no matter what you say! :)
But this compiles without trouble:
return "\(year) \(month) \(dayString) \(hour) \(min) \(weekDay)"
I had a similar issue:
expression was too complex to be solved in reasonable time; consider breaking up the expression into distinct sub-expressions
In Xcode 9.3, the line looked like this:
let media = entities.filter { (entity) -> Bool in
After changing it into something like this:
let media = entities.filter { (entity: Entity) -> Bool in
everything worked out.
It probably has something to do with the Swift compiler trying to infer the parameter type from the surrounding code.
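A self-contained sketch of that fix follows. The Entity type, its kind property, and the sample data are hypothetical stand-ins, since the original code is not shown beyond the filter line.

```swift
// Hypothetical model type standing in for the poster's Entity.
struct Entity {
    let kind: String
}

let entities = [Entity(kind: "media"), Entity(kind: "text"), Entity(kind: "media")]

// Annotating the closure parameter and return type spares the compiler
// from inferring them from the closure body.
let media: [Entity] = entities.filter { (entity: Entity) -> Bool in
    return entity.kind == "media"
}

print(media.count) // 2
```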
Great news - this seems to be fixed in the upcoming Xcode 13.
I was filing radar reports for this:
http://openradar.appspot.com/radar?id=4962454186491904
https://bugreport.apple.com/web/?problemID=39206436
... and Apple has just confirmed that this is fixed.
I have tested all cases that I have with complex expressions and SwiftUI code and everything seems to work great in Xcode 13.
Hi Alex,
Thanks for your patience, and thanks for your feedback. We believe this issue is resolved.
Please test with the latest Xcode 13 beta 2 release and update your feedback report with your results by logging into https://feedbackassistant.apple.com or by using the Feedback Assistant app.

Multiline statement in Swift

I was working on a Swift tutorial and found that Swift has a strange way of handling multi-line statements.
First, I defined an extension to the standard String class:
extension String {
    func replace(target: String, withString: String) -> String {
        return self.stringByReplacingOccurrencesOfString(target, withString: withString)
    }
    func toLowercase() -> String {
        return self.lowercaseString
    }
}
This works as expected:
let str = "HELLO WORLD"
let s1 = str.lowercaseString.replace("hello", withString: "goodbye") // -> goodbye world
This doesn't work:
let s2 = str
    .lowercaseString
    .replace("hello", withString: "goodbye")
// Error: could not find member 'lowercaseString'
If I replace the reference to the lowercaseString property with a function call, it works again:
let s3 = str
    .toLowercase()
    .replace("hello", withString: "goodbye") // -> goodbye world
Is there anything in the Swift language specification that prevents a property from being placed on its own line?
Code at Swift Stub.
This is definitely a compiler bug. Issue has been resolved in Xcode 7 beta 3.
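For reference, a sketch of the same chain with current API names, assuming a Swift 5 toolchain with Foundation. Here lowercased() and replacingOccurrences(of:with:) stand in for the older lowercaseString property and the custom replace extension from the question.

```swift
import Foundation

let str = "HELLO WORLD"

// Each member access sits on its own line, with the dot leading the line.
let s2 = str
    .lowercased()
    .replacingOccurrences(of: "hello", with: "goodbye")

print(s2) // goodbye world
```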
This feels like a compiler bug, but it relates to the fact that you can define prefix, infix, and postfix operators in Swift (but not the . operator, ironically enough). I don't know why it only gives you grief on the property and not the function call, but it is a combination of two things:
the whitespace before and after the . (dot) operator for properties (only)
some nuance of this ever-growing language that treats properties differently from function calls (even though functions are supposed to be first-class types).
I would file a bug to see what comes out of it; Swift is not supposed to be Pythonic in this way. That said, to work around it, you can either not break the property from the type, or you can add white space before and after the . .
let s2 = str.lowercaseString
    .replace("hello", withString: "goodbye")
let s3 = str
    . lowercaseString
    .replace("hello", withString: "goodbye")
Using semicolons is not mandatory in Swift, and I think that the problems with multiline statements in Swift arise because semicolons are optional.
Note that, at the time of writing, Swift did not support multiline string literals (they were added in Swift 4). Check here: Swift - Split string over multiple lines
So maybe Swift cannot handle multiline statements. I am not sure about this, and it could be only one of the reasons, so I would appreciate it if anyone else could help with this issue.