How to get the number of distinct characters in a string? (Swift 4.2+)

The approach should work for any required number of unique characters in a string; only the condition checked afterwards changes.
For instance, to find out whether a string has at least 7 unique characters, we can do:
let number_of_distinct = Set(some_string.characters).count
if(number_of_distinct >= 7)
{
// yes we have at least 7 unique chars.
}
else
{
// no we don't have at least 7 unique chars.
}
However, this technique seems to be deprecated in Swift 4.2+, due to the way String was updated in Swift 4.0+.
What is the correct replacement for the technique above?

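Just remove the .characters. Since Swift 4, String is itself a collection of Character values, so it can be passed to Set directly: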
let number_of_distinct = Set(some_string).count
if(number_of_distinct >= 7)
{
print("yes")
// yes we have at least 7 unique chars.
}
else
{
print("no")
// no we don't have at least 7 unique chars.
}
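One thing to keep in mind is that Set(some_string) counts Character values, which are extended grapheme clusters compared by canonical equivalence, not bytes or Unicode scalars. A minimal sketch of what that means in practice; distinctCharacterCount is a hypothetical helper name, not part of the answer above:

```swift
// Hypothetical helper: counts distinct Characters (grapheme clusters).
func distinctCharacterCount(_ text: String) -> Int {
    return Set(text).count
}

let precomposed = "\u{E9}"      // "é" as a single Unicode scalar
let decomposed = "e\u{301}"     // "e" followed by a combining acute accent
let mixed = precomposed + decomposed

print(distinctCharacterCount("hello"))  // 4 (h, e, l, o)
print(mixed.count)                      // 2 Characters...
print(distinctCharacterCount(mixed))    // ...but only 1 distinct Character
```

If you need distinct scalars or bytes instead, build the Set from some_string.unicodeScalars or some_string.utf8.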

You can also do this without using the Set.
func printUniqueCompleteSubString(from string: String) {
    // Append each character only if the result does not already contain it,
    // so the first occurrence of every character is kept in order.
    let uniqueString = string.reduce("") { (result, char) -> String in
        if result.contains(char) {
            return result
        }
        else {
            return result + String(char)
        }
    }
    print("Unique String is:", uniqueString)
}
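The reduce version rescans the accumulated result for every character. If the order-preserving string of unique characters is what you want, a Set can serve as a "seen" record while filtering. This is only a sketch, and uniqueCharacters(in:) is a hypothetical name:

```swift
// Hypothetical helper: keeps the first occurrence of each Character, in order.
func uniqueCharacters(in text: String) -> String {
    var seen = Set<Character>()
    // insert(_:) returns (inserted: Bool, memberAfterInsert:); `inserted`
    // is true only the first time a given Character is added.
    return text.filter { seen.insert($0).inserted }
}

print(uniqueCharacters(in: "mississippi"))        // misp
print(uniqueCharacters(in: "mississippi").count)  // 4
```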

Related

Count number of characters between two specific characters

I am trying to write a function that counts the characters between two specified characters, for example:
count the characters between "@" and "." or between "@" and ".com"
If the code below is the only solution, could it be written more simply, with .count or something less confusing?
func validateEmail(_ str: String) -> Bool {
    let range = 0..<str.count
    var numAt = 0
    var numDot = 0
    if str.contains("@") && str.contains(".") && str.first != "@" {
        for num in range {
            if str[str.index(str.startIndex, offsetBy: num)] == "@" {
                numAt = num
                print("The position of @ is \(numAt)")
            } else if str[str.index(str.startIndex, offsetBy: num)] == "." {
                numDot = num
                print("The position of . is \(numDot)")
            }
        }
        // At least one character must sit between the last "@" and the last "."
        if (numDot - numAt) > 1 {
            return true
        }
    }
    return false
}
With help from @Βασίλης Δ. I wrote a direct if statement for validateEmail that checks whether fewer than 1 character lies between them:
if (str.split(separator: "@").last?.split(separator: ".").first!.count)! < 1 {
    return false
}
It could be useful.
There are many edge cases to what you're trying to do, and email validation is notoriously complicated. I recommend doing as little of it as possible. Many, many things are legal email addresses. So you will need to think carefully about what you want to test. That said, this addresses what you've asked for, which is the distance between the first @ and the first . that follows it.
func lengthOfFirstComponentAfterAt(in string: String) -> Int? {
    guard
        // Find the first @ in the string
        let firstAt = string.firstIndex(of: "@"),
        // Find the first "." after that
        let firstDotAfterAt = string[firstAt...].firstIndex(of: ".")
    else {
        return nil
    }
    // Return the distance between them (not counting the dot itself)
    return string.distance(from: firstAt, to: firstDotAfterAt) - 1
}
lengthOfFirstComponentAfterAt(in: "rob@example.org") // Optional(7)
There's a very important lesson about Collections in this code. Notice the expression:
string[firstAt...].firstIndex(of: ".")
When you subscript a Collection with a range, each element of the resulting slice has the same index as in the original collection. The value returned from firstIndex can therefore be used directly to subscript string without offsetting. This is very different from how indexes work in many other languages; it allows powerful algorithms, but also creates a lot of bugs when developers forget it.
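To make the shared-index behavior concrete, here is a small sketch. It assumes the separator is "@", and firstComponentAfterAt(in:) is a hypothetical helper that returns the text itself rather than its length:

```swift
// Slices keep the indices of their base collection.
let numbers = [10, 20, 30, 40]
let tail = numbers[2...]     // ArraySlice<Int>
print(tail.startIndex)       // 2, not 0
print(tail[2])               // 30

// Hypothetical helper: the text between the first "@" and the next ".".
func firstComponentAfterAt(in text: String) -> Substring? {
    guard let at = text.firstIndex(of: "@"),
          // `dot` is found in the slice but is a valid index into `text` too.
          let dot = text[at...].firstIndex(of: ".")
    else {
        return nil
    }
    return text[text.index(after: at)..<dot]
}

print(firstComponentAfterAt(in: "rob@example.org") ?? "none")  // example
```

Writing tail[0] in the array example would trap at runtime, which is the kind of bug the paragraph above warns about.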

Fastest way to convert Character to uppercase or lowercase in Swift 4?

I understand the reasons why the Character class doesn't support toUpper() and toLower() but my use case is not for language purposes. Furthermore, I do not wish to revert to NSString.
So what's the fastest way to convert a character to upper case or lower case using Swift 4?
// Is there something better than this?
extension Character {
func toLower() -> Character {
return String(self).lowercased().first!
}
}
Use the uppercase2() below if you only need to uppercase the first char. It’s a 5x speed up over uppercasing the entire string.
import Foundation
// too slow, maybe with some bitwise operations could get faster 🤷‍♀️
func uppercase(_ string: String) -> Character? {
let key: Int8 = string.utf8CString[0]
guard key>0, key<127, let c = Unicode.Scalar(Int(key >= 97 ? key - Int8(32) : key)) else { return nil }
return Character(c)
}
// winner but using internal _core stuff
func uppercase2(_ string: String) -> Character? {
guard let key = string._core.asciiBuffer?[0] else { return nil }
return Character(Unicode.Scalar(key >= 97 ? key - 32 : key)) // use < + to lowercase
}
func measure(times: Int, task: ()->()){
let start1 = CFAbsoluteTimeGetCurrent()
for _ in 0..<times {
task()
}
print(CFAbsoluteTimeGetCurrent() - start1)
}
print("😀".uppercased().first as Any) // Optional("😀")
print(uppercase("😀") as Any) // nil
print(uppercase2("😀") as Any) // nil
measure(times: 10_000_000) { _ = "ABCDEFGHIJKLMNOPQRSTUVWXYZ".uppercased().first } // 4.17883902788162
measure(times: 10_000_000) { _ = uppercase("ABCDEFGHIJKLMNOPQRSTUVWXYZ") } // 4.91275697946548
measure(times: 10_000_000) { _ = uppercase2("ABCDEFGHIJKLMNOPQRSTUVWXYZ") } // 0.720575034618378
In a 10 million run, Apple’s uppercased() ran 148 times faster than the code at the bottom of this post, even with the force-unwrap. I’ll leave that code for comedic purposes.
Their approach is of course, way lower level. See lowercased(). They check for an internal asciiBuffer and then use an _asciiUpperCaseTable.
My understanding is that if the original String is already a Swift String, it will be represented by a StringCore class which is already optimized to deal with ASCII characters at a low level. Thus, you won’t be able to beat the uppercase function of Swift.
So, kind of an answer: the fastest way is to use the regular uppercased() method.
I'm assuming that “my use-case is not for language purposes” means the input is ASCII only. The advantage is that ASCII letters keep the same code points in Unicode, and each lowercase letter sits exactly 32 above its uppercase counterpart, so changing case means subtracting or adding a fixed number.
import Foundation
print("a".unicodeScalars.first!.value) // 97
print("A".unicodeScalars.first!.value) // 65
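// Note: this subtracts 32 from every scalar, so it is only valid for input made of a...z.
let uppercase = String("abcde".compactMap { (character: Character) -> Character? in
    guard let char = character.unicodeScalars.first,
          let uppercased = Unicode.Scalar(char.value - UInt32(97 - 65))
    else {
        return nil
    }
    return Character(uppercased)
})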
print(uppercase) // ABCDE
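If ASCII-only conversion of a single Character is really what is needed, the fixed offset can be applied behind a range check so that other input passes through untouched. This is a sketch only: it assumes Swift 5 (for Character.asciiValue), the property names asciiUppercased and asciiLowercased are hypothetical, and no timings were taken for it:

```swift
extension Character {
    /// Hypothetical helper: uppercases a...z and returns anything else unchanged.
    var asciiUppercased: Character {
        guard let value = asciiValue, (97...122).contains(value) else { return self }
        return Character(Unicode.Scalar(value - 32))
    }

    /// Hypothetical helper: lowercases A...Z and returns anything else unchanged.
    var asciiLowercased: Character {
        guard let value = asciiValue, (65...90).contains(value) else { return self }
        return Character(Unicode.Scalar(value + 32))
    }
}

print(Character("a").asciiUppercased)  // A
print(Character("Z").asciiLowercased)  // z
print(Character("😀").asciiUppercased) // 😀 (not ASCII, returned as is)
```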

Trim only trailing whitespace from end of string in Swift 3

Every example of trimming strings in Swift remove both leading and trailing whitespace, but how can only trailing whitespace be removed?
For example, if I have a string:
" example "
How can I end up with:
" example"
Every solution I've found shows trimmingCharacters(in: CharacterSet.whitespaces), but I want to retain the leading whitespace.
RegEx is a possibility, or a range can be derived to determine index of characters to remove, but I can't seem to find an elegant solution for this.
With regular expressions:
let string = " example "
let trimmed = string.replacingOccurrences(of: "\\s+$", with: "", options: .regularExpression)
print(">" + trimmed + "<")
// > example<
\s+ matches one or more whitespace characters, and $ matches the end of the string.
In Swift 5
This code will also remove trailing newlines.
It relies on the Character property .isWhitespace, which was added in Swift 5.
extension String {
    var trailingSpacesTrimmed: String {
        var newString = self
        while newString.last?.isWhitespace == true {
            newString = String(newString.dropLast())
        }
        return newString
    }
}
This short Swift 3 extension of String uses the .anchored and .backwards options of rangeOfCharacter and then calls itself recursively if it needs to loop. Because the compiler expects a CharacterSet as the parameter, you can just supply the static member when calling, e.g. "1234 ".trailingTrim(.whitespaces) will return "1234". (I've not done timings, but would expect it to be faster than regex.)
extension String {
func trailingTrim(_ characterSet : CharacterSet) -> String {
if let range = rangeOfCharacter(from: characterSet, options: [.anchored, .backwards]) {
return self.substring(to: range.lowerBound).trailingTrim(characterSet)
}
return self
}
}
In Foundation you can get ranges of indices matching a regular expression. You can also replace subranges. Combining this, we get:
import Foundation
extension String {
func trimTrailingWhitespace() -> String {
if let trailingWs = self.range(of: "\\s+$", options: .regularExpression) {
return self.replacingCharacters(in: trailingWs, with: "")
} else {
return self
}
}
}
You can also have a mutating version of this:
import Foundation
extension String {
mutating func trimTrailingWhitespace() {
if let trailingWs = self.range(of: "\\s+$", options: .regularExpression) {
self.replaceSubrange(trailingWs, with: "")
}
}
}
If we match against \s* (as Martin R. did at first) we can skip the if let guard and force-unwrap the optional since there will always be a match. I think this is nicer since it's obviously safe, and remains safe if you change the regexp. I did not think about performance.
Handy String extension In Swift 4
extension String {
func trimmingTrailingSpaces() -> String {
var t = self
while t.hasSuffix(" ") {
t = "" + t.dropLast()
}
return t
}
mutating func trimmedTrailingSpaces() {
self = self.trimmingTrailingSpaces()
}
}
Swift 4
extension String {
var trimmingTrailingSpaces: String {
if let range = rangeOfCharacter(from: .whitespacesAndNewlines, options: [.anchored, .backwards]) {
return String(self[..<range.lowerBound]).trimmingTrailingSpaces
}
return self
}
}
Demosthese's answer is a useful solution to the problem, but it's not particularly efficient. This is an upgrade to their answer, extending StringProtocol instead, and utilizing Substring to remove the need for repeated copying.
extension StringProtocol {
@inline(__always)
var trailingSpacesTrimmed: Self.SubSequence {
var view = self[...]
while view.last?.isWhitespace == true {
view = view.dropLast()
}
return view
}
}
No need to create a new string when dropping from the end each time.
extension String {
func trimRight() -> String {
String(reversed().drop { $0.isWhitespace }.reversed())
}
}
This operates on the collection and only converts the result back into a string once.
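Another way to avoid building intermediate strings is to locate the last non-whitespace character and slice once. A minimal sketch, assuming Swift 5 (lastIndex(where:) needs 4.2, Character.isWhitespace needs 5); the method name trimmingTrailingWhitespace is hypothetical:

```swift
extension String {
    /// Hypothetical helper: removes trailing whitespace and newlines in one slice.
    func trimmingTrailingWhitespace() -> String {
        // If every character is whitespace (or the string is empty), nothing is kept.
        guard let lastKept = lastIndex(where: { !$0.isWhitespace }) else { return "" }
        return String(self[...lastKept])
    }
}

print(">" + "  example  ".trimmingTrailingWhitespace() + "<")  // >  example<
```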
It's a little bit hacky :D
let message = " example "
var trimmed = ("s" + message).trimmingCharacters(in: .whitespacesAndNewlines)
trimmed = trimmed.substring(from: trimmed.index(after: trimmed.startIndex))
I don't know of a direct way to achieve that without regular expressions. Alternatively, you can use the function below to get the required result. Note that it removes every space after the leading run, including spaces inside the text, so it only suits single-word input:
func removeTrailingSpaces(with spaces: String) -> String {
    // Count the leading spaces so they can be restored afterwards.
    var spaceCount = 0
    for character in spaces {
        if character == " " {
            spaceCount += 1
        } else {
            break
        }
    }
    // Strip all spaces, then put the leading ones back.
    let duplicateString = spaces.replacingOccurrences(of: " ", with: "")
    let leadingSpaces = String(repeating: " ", count: spaceCount)
    return leadingSpaces + duplicateString
}
You can use this function as follows:
let str = " Himanshu "
print(removeTrailingSpaces(with : str))
One line solution with Swift 4 & 5
As a beginner in Swift and iOS programming I really like @demosthese's solution above with the while loop, as it's very easy to understand. However, the example code seems longer than necessary. The following uses essentially the same logic but implements it as a single-line while loop.
// Remove trailing spaces from myString
while myString.last == " " { myString = String(myString.dropLast()) }
This can also be written using the .isWhitespace property, as in @demosthese's solution, as follows:
while myString.last?.isWhitespace == true { myString = String(myString.dropLast()) }
This has the benefit (or disadvantage, depending on your point of view) that this removes all types of whitespace, not just spaces but (according to Apple docs) also including newlines, and specifically the following characters:
“\t” (U+0009 CHARACTER TABULATION)
“ ” (U+0020 SPACE)
U+2029 PARAGRAPH SEPARATOR
U+3000 IDEOGRAPHIC SPACE
Note: Even though .isWhitespace is a Boolean, it can't be used directly as the while condition, because the optional chaining on .last (which returns nil if the String or collection is empty) makes the whole expression a Bool?. The == true comparison gets around this, since nil == true evaluates to false.
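The optional-chaining behavior described in the note can be checked in isolation. A small sketch, assuming Swift 5; trailingWhitespaceRemoved(from:) is a hypothetical wrapper, and it uses removeLast() to mutate in place instead of rebuilding the string on each pass:

```swift
// On an empty string, .last is nil, so the whole chain is nil (type Bool?).
let emptyCheck: Bool? = "".last?.isWhitespace
print(emptyCheck == nil)   // true
print(emptyCheck == true)  // false, so a while loop using this condition stops

// Hypothetical wrapper around the one-line loop.
func trailingWhitespaceRemoved(from text: String) -> String {
    var result = text
    while result.last?.isWhitespace == true {
        result.removeLast()
    }
    return result
}

print(">" + trailingWhitespaceRemoved(from: " hi \n") + "<")  // > hi<
```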
I'd love to get some feedback on this, esp. in case anyone sees any issues or drawbacks with this simple single line approach.
Swift 5
extension String {
func trimTrailingWhiteSpace() -> String {
guard self.last == " " else { return self }
var tmp = self
repeat {
tmp = String(tmp.dropLast())
} while tmp.last == " "
return tmp
}
}

How can I check if a string contains Chinese in Swift?

I want to know how I can check whether a string contains Chinese in Swift.
For example, I want to check if there's Chinese inside:
var myString = "Hi! 大家好!It's contains Chinese!"
Thanks!
This answer to "How to determine if a character is a Chinese character" can also easily be translated from Ruby to Swift (now updated for Swift 3):
extension String {
var containsChineseCharacters: Bool {
return self.range(of: "\\p{Han}", options: .regularExpression) != nil
}
}
if myString.containsChineseCharacters {
print("Contains Chinese")
}
In a regular expression, "\p{Han}" matches all characters with the "Han" Unicode property, which – as I understand it – are the characters from the CJK languages.
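A quick way to see what the property covers is to try it on other East Asian scripts. In the sketch below, containsHan is a hypothetical free-function version of the extension above. Han ideographs used in Japanese text match, while kana and Hangul do not, so a match means "contains Han characters" rather than "is written in Chinese":

```swift
import Foundation

// Hypothetical helper mirroring the containsChineseCharacters extension.
func containsHan(_ text: String) -> Bool {
    return text.range(of: "\\p{Han}", options: .regularExpression) != nil
}

print(containsHan("大家好"))      // true  (Chinese)
print(containsHan("日本語"))      // true  (Japanese kanji are Han characters)
print(containsHan("こんにちは"))  // false (hiragana)
print(containsHan("한국어"))      // false (Hangul)
```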
Looking at questions on how to do this in other languages (such as this accepted answer for Ruby), it looks like the common technique is to determine whether each character in the string falls in the CJK range. The Ruby answer could be adapted to Swift strings as an extension with the following code:
extension String {
var containsChineseCharacters: Bool {
return self.unicodeScalars.contains { scalar in
let cjkRanges: [ClosedInterval<UInt32>] = [
0x4E00...0x9FFF, // main block
0x3400...0x4DBF, // extended block A
0x20000...0x2A6DF, // extended block B
0x2A700...0x2B73F, // extended block C
]
return cjkRanges.contains { $0.contains(scalar.value) }
}
}
}
// true:
"Hi! 大家好!It's contains Chinese!".containsChineseCharacters
// false:
"Hello, world!".containsChineseCharacters
The ranges may already exist somewhere in Foundation, which would avoid hardcoding them manually.
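ClosedInterval no longer exists in current Swift; ClosedRange took its place. A sketch of the same range check for Swift 4 and later, written as a free function with the hypothetical name containsCJKIdeograph and limited to the four blocks listed above (it does not attempt to cover every CJK block in Unicode):

```swift
// Hypothetical helper: true if any scalar falls in one of the listed CJK blocks.
func containsCJKIdeograph(_ text: String) -> Bool {
    let cjkRanges: [ClosedRange<UInt32>] = [
        0x4E00...0x9FFF,    // main block
        0x3400...0x4DBF,    // extended block A
        0x20000...0x2A6DF,  // extended block B
        0x2A700...0x2B73F,  // extended block C
    ]
    return text.unicodeScalars.contains { scalar in
        cjkRanges.contains { $0.contains(scalar.value) }
    }
}

print(containsCJKIdeograph("Hi! 大家好!It's contains Chinese!"))  // true
print(containsCJKIdeograph("Hello, world!"))                      // false
```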
The above is for Swift 2.0. For earlier versions, you will have to use the free contains function rather than the protocol extension method (twice):
extension String {
var containsChineseCharacters: Bool {
return contains(self.unicodeScalars) {
// older version of compiler seems to need extra help with type inference
(scalar: UnicodeScalar)->Bool in
let cjkRanges: [ClosedInterval<UInt32>] = [
0x4E00...0x9FFF, // main block
0x3400...0x4DBF, // extended block A
0x20000...0x2A6DF, // extended block B
0x2A700...0x2B73F, // extended block C
]
return contains(cjkRanges) { $0.contains(scalar.value) }
}
}
}
The accepted answer only finds whether a string contains a Chinese character. I created one that suits my own case:
enum ChineseRange {
case notFound, contain, all
}
extension String {
    var findChineseCharacters: ChineseRange {
        // Find the first run of one or more Han characters.
        guard let a = self.range(of: "\\p{Han}+", options: .regularExpression) else {
            return .notFound
        }
        // The run spans the whole string only if every character is a Han character.
        return a == self.startIndex..<self.endIndex ? .all : .contain
    }
}
if "你好".findChineseCharacters == .all {
print("All Chinese")
}
if "Chinese".findChineseCharacters == .notFound {
print("Not found Chinese")
}
if "Chinese你好".findChineseCharacters == .contain {
print("Contains Chinese")
}
gist here: https://gist.github.com/williamhqs/6899691b5a26272550578601bee17f1a
Try this in Swift 2:
var myString = "Hi! 大家好!It's contains Chinese!"
var a = false
for c in myString.characters {
let cs = String(c)
a = a || (cs != cs.stringByApplyingTransform(NSStringTransformMandarinToLatin, reverse: false))
}
print("\(myString) contains Chinese characters = \(a)")
I have created a Swift 3 String extension for checking how many Chinese characters a String contains. It is similar to the code by Airspeed Velocity but more comprehensive: it checks various Unicode ranges to see whether a character is Chinese. See the Chinese character ranges listed in the tables under section 18.1 of the Unicode standard specification: http://www.unicode.org/versions/Unicode9.0.0/ch18.pdf
The String extension can be found on GitHub: https://github.com/niklasberglund/String-chinese.swift
Usage example:
let myString = "Hi! 大家好!It contains Chinese!"
let chinesePercentage = myString.chinesePercentage()
let chineseCharacterCount = myString.chineseCharactersCount()
print("String contains \(chinesePercentage) percent Chinese. That's \(chineseCharacterCount) characters.")

How can I check if a string contains letters in Swift? [duplicate]

I'm trying to check whether a specific string contains letters or not.
So far I've come across NSCharacterSet.letterCharacterSet() as a set of letters, but I'm having trouble checking whether a character in that set is in the given string. When I use this code, I get an error stating:
'Character' is not convertible to 'unichar'
For the following code:
for chr in input{
if letterSet.characterIsMember(chr){
return "Woah, chill out!"
}
}
You can use CharacterSet in the following way:
let letters = CharacterSet.letters
let phrase = "Test case"
let range = phrase.rangeOfCharacter(from: letters)
// range will be nil if no letters are found
if range != nil {
    print("letters found")
}
else {
    print("letters not found")
}
Or you can do this too :
func containsOnlyLetters(input: String) -> Bool {
for chr in input {
if (!(chr >= "a" && chr <= "z") && !(chr >= "A" && chr <= "Z") ) {
return false
}
}
return true
}
In Swift 2:
func containsOnlyLetters(input: String) -> Bool {
for chr in input.characters {
if (!(chr >= "a" && chr <= "z") && !(chr >= "A" && chr <= "Z") ) {
return false
}
}
return true
}
It's up to you which way you choose. I hope this helps.
You should use String's built-in range functions with NSCharacterSet rather than rolling your own solution. This gives you a lot more flexibility too (such as case-insensitive search, if you so desire).
let str = "Hey this is a string"
let characterSet = NSCharacterSet(charactersInString: "aeiou")
if let _ = str.rangeOfCharacterFromSet(characterSet, options: .CaseInsensitiveSearch) {
println("true")
}
else {
println("false")
}
Substitute "aeiou" with whatever letters you're looking for.
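The snippet above uses the Swift 1 spellings. A rough equivalent for Swift 3 and later might look like the sketch below. containsAny(of:in:) is a hypothetical helper, and it lowercases the input before searching rather than relying on a search option, so it assumes the character list itself is given in lowercase:

```swift
import Foundation

// Hypothetical helper: true if `text` contains any of `characters`, ignoring case.
func containsAny(of characters: String, in text: String) -> Bool {
    let characterSet = CharacterSet(charactersIn: characters)
    return text.lowercased().rangeOfCharacter(from: characterSet) != nil
}

print(containsAny(of: "aeiou", in: "Hey this is a string"))  // true
print(containsAny(of: "aeiou", in: "rhythm"))                // false
```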
A less flexible, but fun, Swift note all the same is that you can use any of the functions available for sequences. So you can do this:
contains("abc", "c")
This of course will only work for individual characters, and is not flexible and not recommended.
The trouble with .characterIsMember is that it takes a unichar (a typealias for UInt16).
If you iterate your input using the utf16 view of the string, it will work:
let set = NSCharacterSet.letterCharacterSet()
for chr in input.utf16 {
if set.characterIsMember(chr) {
println("\(chr) is a letter")
}
}
You can also skip the loop and use the contains algorithm if you only want to check for presence/non-presence:
if contains(input.utf16, { set.characterIsMember($0) }) {
println("contains letters")
}
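In current Swift the unichar mismatch can be sidestepped: CharacterSet has a contains(_:) method that takes a Unicode.Scalar, and Swift 5 adds Character.isLetter. A sketch of both under those assumptions, with hypothetical function names:

```swift
import Foundation

// Hypothetical helper: checks each Unicode scalar against CharacterSet.letters.
func containsLetterScalar(_ input: String) -> Bool {
    return input.unicodeScalars.contains { CharacterSet.letters.contains($0) }
}

// Hypothetical helper: standard library only, using Character.isLetter (Swift 5).
func containsLetterCharacter(_ input: String) -> Bool {
    return input.contains { $0.isLetter }
}

print(containsLetterScalar("123a"))       // true
print(containsLetterCharacter("123 !?"))  // false
```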