Counting character frequencies in a Swift string

I'm trying to convert Java code to Swift and am getting this compiler error:
single-quoted string literal found, use '"'
charArray[(s[i].asciiValue)! - ('a'.asciiValue)!]++
^~~
"a"
Java Code:
for(String s: str){
char arr[] = new char[26];
for(int i =0;i< s.length(); i++){
arr[s.charAt(i) -'a']++;
}
}
Swift Code:
extension String {
var asciiArray: [UInt32] {
return unicodeScalars.filter{$0.isASCII}.map{$0.value}
}
}
extension Character {
var asciiValue: UInt32? {
return String(self).unicodeScalars.filter{$0.isASCII}.first?.value
}
}
class GroupXXX {
func groupXXX(strList: [String]) {
for str in strList {
var charArray = [Character?](repeating: nil, count: 26)
var s = str.characters.map { $0 }
for i in 0..<s.count {
charArray[(s[i].asciiValue)! - ('a'.asciiValue)!]++
}
}
}
}

There are several problems in your Swift code:
There are no single-quoted character literals in Swift (as already explained
by JeremyP).
The ++ operator has been removed in Swift 3.
s[i] does not compile because Swift strings are not indexed by
integers.
Defining the array as [Character?] makes no sense, and you cannot
increment a Character?. The Swift equivalent of the Java char
would be UInt16.
You don't check if the character is in the range "a"..."z".
Apparently you want to count the number of occurrences of
each character "a" to "z" in a string.
This is how I would do it in Swift:
Define the "frequency" array as an array of integers.
Enumerate the unicodeScalars property of the string.
Use a switch statement to check for the valid range of characters.
Then the custom extensions are no longer needed and the code becomes
var frequencies = [Int](repeating: 0, count: 26)
for c in str.unicodeScalars {
switch c {
case "a"..."z":
frequencies[Int(c.value - UnicodeScalar("a").value)] += 1
default:
break // ignore all other characters
}
}
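On Swift 5 or later, the same count can also be written with the standard library's Character.asciiValue property, which likewise makes the custom extensions unnecessary. The following is a minimal sketch under that assumption; the function name letterFrequencies(in:) is made up for illustration and does not come from the question.

```swift
/// Counts how often each lowercase ASCII letter "a"..."z" occurs in `text`.
/// Index 0 holds the count for "a", index 25 the count for "z".
func letterFrequencies(in text: String) -> [Int] {
    var frequencies = [Int](repeating: 0, count: 26)
    let lowerA = UInt8(ascii: "a")
    let lowerZ = UInt8(ascii: "z")
    for character in text {
        // asciiValue is nil for non-ASCII characters, so those are skipped,
        // and the range check drops everything outside "a"..."z".
        guard let ascii = character.asciiValue,
              (lowerA...lowerZ).contains(ascii) else { continue }
        frequencies[Int(ascii - lowerA)] += 1
    }
    return frequencies
}

let counts = letterFrequencies(in: "banana!")
print(counts[0], counts[1], counts[13]) // counts for "a", "b" and "n"
```

Returning a fresh array per call mirrors the Java loop, which allocates a new 26-element array for every input string.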

charArray[(s[i].asciiValue)! - ('a'.asciiValue)!]++
As the error says, use double quotes. Swift has no literal syntax that distinguishes characters from strings (a character may itself be a sequence of several bytes in Swift).
You may need to construct a Character explicitly if the compiler can't infer the type.
charArray[(s[i].asciiValue)! - (Character("a").asciiValue)!]++

Related

Adding numbers inside a string in Swift

Reading through this problem in a book
Given a string that contains both letters and numbers, write a
function that pulls out all the numbers then returns their sum. Sample
input and output
The string “a1b2c3” should return 6 (1 + 2 + 3). The string
“a10b20c30” should return 60 (10 + 20 + 30). The string “h8ers” should
return “8”.
My solution so far is
import Foundation
func sumOfNumbers(in string: String) -> Int {
var numbers = string.filter { $0.isNumber }
var numbersArray = [Int]()
for number in numbers {
numbersArray.append(Int(number)!)
}
return numbersArray.reduce(0, { $0 * $1 })
}
However, I get the error
Solution.swift:8:33: error: cannot convert value of type 'String.Element' (aka 'Character') to expected argument type 'String'
numbersArray.append(Int(number)!)
^
And I'm struggling to get this number of type String.Element into a Character. Any guidance would be appreciated.

The error occurs because Int.init is expecting a String, but the argument number you gave is of type Character.
It is easy to fix the compiler error just by converting the Character to String by doing:
numbersArray.append(Int("\(number)")!)
or just:
numbersArray.append(number.wholeNumberValue!)
However, this does not produce the expected output. First, you are multiplying the numbers together, not adding. Second, you are considering each character separately, and not considering groups of digits as one number.
You can instead implement the function like this:
func sumOfNumbers(in string: String) -> Int {
string.components(separatedBy: CharacterSet(charactersIn: "0"..."9").inverted)
.compactMap(Int.init)
.reduce(0, +)
}
The key thing is to split the string on "non-digits", so that "10", "20", etc. are each treated as a single number.
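The same split-on-non-digits idea can also be sketched without Foundation, assuming Swift 5.1 or later for the Character properties isASCII and isNumber and for the implicit return. The name sumOfDigitRuns(in:) is hypothetical and chosen only to avoid clashing with the function above.

```swift
/// Sums every run of consecutive ASCII digits found in `string`.
func sumOfDigitRuns(in string: String) -> Int {
    string
        // Split wherever a character is not an ASCII digit 0-9.
        // Empty pieces are omitted by default.
        .split(whereSeparator: { !($0.isASCII && $0.isNumber) })
        // Each piece is a Substring of digits; Int(_:) returns nil
        // if the run is too large to fit in an Int.
        .compactMap { Int($0) }
        .reduce(0, +)
}

print(sumOfDigitRuns(in: "a1b2c3"))    // 6
print(sumOfDigitRuns(in: "a10b20c30")) // 60
print(sumOfDigitRuns(in: "h8ers"))     // 8
```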

What is the best way to test if a CharacterSet contains a Character in Swift 4?

I'm looking for a way, in Swift 4, to test if a Character is a member of an arbitrary CharacterSet. I have this Scanner class that will be used for some lightweight parsing. One of the functions in the class is to skip any characters, at the current position, that belong to a certain set of possible characters.
class MyScanner {
let str: String
var idx: String.Index
init(_ string: String) {
str = string
idx = str.startIndex
}
var remains: String { return String(str[idx..<str.endIndex])}
func skip(charactersIn characters: CharacterSet) {
while idx < str.endIndex && characters.contains(str[idx])) {
idx = str.index(idx, offsetBy: 1)
}
}
}
let scanner = MyScanner("fizz buzz fizz")
scanner.skip(charactersIn: CharacterSet.alphanumerics)
scanner.skip(charactersIn: CharacterSet.whitespaces)
print("what remains: \"\(scanner.remains)\"")
I would like to implement the skip(charactersIn:) function so that the above code would print buzz fizz.
The tricky part is characters.contains(str[idx])) in the while - .contains() requires a Unicode.Scalar, and I'm at a loss trying to figure out the next step.
I know I could pass in a String to the skip function, but I'd like to find a way to make it work with a CharacterSet, because of all the convenient static members (alphanumerics, whitespaces, etc.).
How does one test a CharacterSet if it contains a Character?

Not sure if it's the most efficient way, but you can create a new CharacterSet from the character and check whether it is a subset of the other set (set comparison is rather quick):
let newSet = CharacterSet(charactersIn: "a")
// let newSet = CharacterSet(charactersIn: "\(character)")
print(newSet.isSubset(of: CharacterSet.decimalDigits)) // false
print(newSet.isSubset(of: CharacterSet.alphanumerics)) // true

Swift 4.2
CharacterSet extension function to check whether it contains Character:
extension CharacterSet {
func containsUnicodeScalars(of character: Character) -> Bool {
return character.unicodeScalars.allSatisfy(contains(_:))
}
}
Usage example:
CharacterSet.decimalDigits.containsUnicodeScalars(of: "3") // true
CharacterSet.decimalDigits.containsUnicodeScalars(of: "a") // false

I know that you wanted to use CharacterSet rather than String, but CharacterSet does not (yet, at least) support characters that are composed of more than one Unicode.Scalar. See the "family" character (👩‍👩‍👧‍👦) or the international flag characters (e.g. "🇯🇵" or "🇯🇲") that Apple demonstrated in the string discussion in WWDC 2017 video What's New in Swift. The multiple skin tone emoji also manifest this behavior (e.g. 👩🏻 vs 👩🏽).
As a result, I'd be wary of using CharacterSet (which is a "set of Unicode character values for use in search operations"). Or, if you want to provide this method for the sake of convenience, be aware that it will not work correctly with characters represented by multiple unicode scalars.
So, you might offer a scanner that provides both CharacterSet and String renditions of the skip method:
class MyScanner {
let string: String
var index: String.Index
init(_ string: String) {
self.string = string
index = string.startIndex
}
var remains: String { return String(string[index...]) }
/// Skip characters in a string
///
/// This rendition is safe to use with strings that have characters
/// represented by more than one unicode scalar.
///
/// - Parameter skipString: A string with all of the characters to skip.
func skip(charactersIn skipString: String) {
while index < string.endIndex, skipString.contains(string[index]) {
index = string.index(index, offsetBy: 1)
}
}
/// Skip characters in character set
///
/// Note, character sets cannot (yet) include characters that are represented by
/// more than one unicode scalar (e.g. 👩‍👩‍👧‍👦 or 🇯🇵 or 👰🏻). If you want to test
/// for these multi-unicode characters, you have to use the `String` rendition of
/// this method.
///
/// This will simply stop scanning if it encounters a multi-unicode character in
/// the string being scanned (because it knows the `CharacterSet` can only represent
/// single-unicode characters) and you want to avoid false positives (e.g., mistaking
/// the Jamaican flag, 🇯🇲, for the Japanese flag, 🇯🇵).
///
/// - Parameter characterSet: The character set to check for membership.
func skip(charactersIn characterSet: CharacterSet) {
while index < string.endIndex,
string[index].unicodeScalars.count == 1,
let character = string[index].unicodeScalars.first,
characterSet.contains(character) {
index = string.index(index, offsetBy: 1)
}
}
}
Thus, your simple example will still work:
let scanner = MyScanner("fizz buzz fizz")
scanner.skip(charactersIn: CharacterSet.alphanumerics)
scanner.skip(charactersIn: CharacterSet.whitespaces)
print(scanner.remains) // "buzz fizz"
But use the String rendition if the characters you want to skip might include multiple unicode scalars:
let family = "👩\u{200D}👩\u{200D}👧\u{200D}👦" // 👩‍👩‍👧‍👦
let boy = "👦"
let charactersToSkip = family + boy
let string = boy + family + "foobar" // 👦👩‍👩‍👧‍👦foobar
let scanner = MyScanner(string)
scanner.skip(charactersIn: charactersToSkip)
print(scanner.remains) // foobar
As Michael Waterfall noted in the comments below, CharacterSet has a bug and doesn’t even handle 32-bit Unicode.Scalar values correctly, meaning that it doesn’t even handle single scalar characters properly if the value exceeds 0xffff (including emoji, amongst others). The String rendition, above, handles these correctly, though.
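If the goal is mainly to reuse ready-made classes such as "alphanumeric" or "whitespace", one further option, assuming Swift 5 or later, is to pass a predicate over Character and rely on the standard library's Character properties (isLetter, isNumber, isWhitespace). The sketch below is not from the answers above; PredicateScanner and skip(while:) are hypothetical names, and the Character properties are not guaranteed to match the corresponding CharacterSet members for every code point.

```swift
/// A hypothetical scanner that skips characters matching a predicate.
final class PredicateScanner {
    let string: String
    private(set) var index: String.Index

    init(_ string: String) {
        self.string = string
        index = string.startIndex
    }

    /// Everything from the current position to the end of the string.
    var remains: String { String(string[index...]) }

    /// Advances past every leading character for which `predicate` is true.
    /// Works on whole Characters, so multi-scalar clusters are never split.
    func skip(while predicate: (Character) -> Bool) {
        while index < string.endIndex, predicate(string[index]) {
            index = string.index(after: index)
        }
    }
}

let predicateScanner = PredicateScanner("fizz buzz fizz")
predicateScanner.skip { $0.isLetter || $0.isNumber }
predicateScanner.skip { $0.isWhitespace }
print(predicateScanner.remains) // buzz fizz
```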

Guard Statement Parameter Error [duplicate]

How do you get the length of a String? For example, I have a variable defined like:
var test1: String = "Scott"
However, I can't seem to find a length method on the string.

As of Swift 4+
It's just:
test1.count
for reasons.
(Thanks to Martin R)
As of Swift 2:
With Swift 2, Apple has changed global functions to protocol extensions, extensions that match any type conforming to a protocol. Thus the new syntax is:
test1.characters.count
(Thanks to JohnDifool for the heads up)
As of Swift 1
Use the count characters method:
let unusualMenagerie = "Koala 🐨, Snail 🐌, Penguin 🐧, Dromedary 🐪"
println("unusualMenagerie has \(count(unusualMenagerie)) characters")
// prints "unusualMenagerie has 40 characters"
right from the Apple Swift Guide
(note, for versions of Swift earlier than 1.2, this would be countElements(unusualMenagerie) instead)
for your variable, it would be
length = count(test1) // was countElements in earlier versions of Swift
Or you can use test1.utf16count

TLDR:
For Swift 2.0 and 3.0, use test1.characters.count. But, there are a few things you should know. So, read on.
Counting characters in Swift
Before Swift 2.0, count was a global function. As of Swift 2.0, it can be called as a member function.
test1.characters.count
It will return the actual number of Unicode characters in a String, so it's the most correct alternative in the sense that, if you'd print the string and count characters by hand, you'd get the same result.
However, because of the way Strings are implemented in Swift, characters don't always take up the same amount of memory, so be aware that this behaves quite differently than the usual character count methods in other languages.
For example, you can also use test1.utf16.count
But, as noted below, the returned value is not guaranteed to be the same as that of calling count on characters.
From the language reference:
Extended grapheme clusters can be composed of one or more Unicode
scalars. This means that different characters—and different
representations of the same character—can require different amounts of
memory to store. Because of this, characters in Swift do not each take
up the same amount of memory within a string’s representation. As a
result, the number of characters in a string cannot be calculated
without iterating through the string to determine its extended
grapheme cluster boundaries. If you are working with particularly long
string values, be aware that the characters property must iterate over
the Unicode scalars in the entire string in order to determine the
characters for that string.
The count of the characters returned by the characters property is not
always the same as the length property of an NSString that contains
the same characters. The length of an NSString is based on the number
of 16-bit code units within the string’s UTF-16 representation and not
the number of Unicode extended grapheme clusters within the string.
An example that perfectly illustrates the situation described above is that of checking the length of a string containing a single emoji character, as pointed out by n00neimp0rtant in the comments.
var emoji = "👍"
emoji.characters.count //returns 1
emoji.utf16.count //returns 2

Swift 1.2 Update: There's no longer a countElements for counting the size of collections. Just use the count function as a replacement: count("Swift")
Swift 2.0, 3.0 and 3.1:
let strLength = string.characters.count
Swift 4.2 (4.0 onwards): [Apple Documentation - Strings]
let strLength = string.count

Swift 1.1
extension String {
var length: Int { return countElements(self) } //
}
Swift 1.2
extension String {
var length: Int { return count(self) } //
}
Swift 2.0
extension String {
var length: Int { return characters.count } //
}
Swift 4.2
extension String {
var length: Int { return self.count }
}
let str = "Hello"
let count = str.length // returns 5 (Int)

Swift 4
"string".count
;)
Swift 3
extension String {
var length: Int {
return self.characters.count
}
}
usage
"string".length

If you are just trying to see if a string is empty or not (checking for length of 0), Swift offers a simple boolean test method on String
myString.isEmpty
The other side of this coin was people asking in Objective-C how to check whether a string was empty, where the answer was to check for a length of 0:
NSString is empty

Swift 5.1, 5
let flag = "🇵🇷"
print(flag.count)
// Prints "1" -- counts extended grapheme clusters, so the flag emoji has length 1
print(flag.unicodeScalars.count)
// Prints "2" -- counts Unicode scalars (code points); the flag is made of two regional indicators
print(flag.utf16.count)
// Prints "4"
print(flag.utf8.count)
// Prints "8"

tl;dr If you want the length of a String in terms of the number of human-readable characters (extended grapheme clusters), use countElements(). If you want the length in terms of the underlying index positions, which can exceed the number of characters, use endIndex. Read on for details.
The String type is implemented as an ordered collection (i.e., sequence) of Unicode characters, and it conforms to the CollectionType protocol, which conforms to the _CollectionType protocol, which is the input type expected by countElements(). Therefore, countElements() can be called, passing a String type, and it will return the count of characters.
However, in conforming to CollectionType, which in turn conforms to _CollectionType, String also implements the startIndex and endIndex computed properties, which actually represent the position of the index before the first character cluster, and position of the index after the last character cluster, respectively. So, in the string "ABC", the position of the index before A is 0 and after C is 3. Therefore, endIndex = 3, which is also the length of the string.
So, endIndex can be used to get the length of any String type, then, right?
Well, not always...Unicode characters are actually extended grapheme clusters, which are sequences of one or more Unicode scalars combined to create a single human-readable character.
let circledStar: Character = "\u{2606}\u{20DD}" // ☆⃝
circledStar is a single character made up of U+2606 (a white star), and U+20DD (a combining enclosing circle). Let's create a String from circledStar and compare the results of countElements() and endIndex.
let circledStarString = "\(circledStar)"
countElements(circledStarString) // 1
circledStarString.endIndex // 2
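countElements() and an endIndex that displays as a plain number belong to Swift 1.x. Assuming Swift 4 or later, the same comparison can be sketched with the string's views instead; the variable names below are illustrative.

```swift
let circledStarText = "\u{2606}\u{20DD}" // ☆⃝

// One user-perceived character (extended grapheme cluster)...
let clusterCount = circledStarText.count
// ...built from two Unicode scalars, each stored as one UTF-16 code unit.
let scalarCount = circledStarText.unicodeScalars.count
let utf16UnitCount = circledStarText.utf16.count

print(clusterCount, scalarCount, utf16UnitCount) // 1 2 2
```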

In Swift 2.0 the global count() function no longer works on a String. You can use this instead:
var testString = "Scott"
var length = testString.characters.count

Here's something shorter, and more natural than using a global function:
aString.utf16count
I don't know if it's available in beta 1, though. But it's definitely there in beta 2.

Updated for Xcode 6 beta 4, change method utf16count --> utf16Count
var test1: String = "Scott"
var length = test1.utf16Count
Or
var test1: String = "Scott"
var length = test1.lengthOfBytesUsingEncoding(NSUTF16StringEncoding)

As of Swift 1.2 utf16Count has been removed. You should now use the global count() function and pass the UTF16 view of the string. Example below...
let string = "Some string"
count(string.utf16)

For Xcode 7.3 and Swift 2.2.
let str = "🐶"
1. If you want the number of visual characters:
str.characters.count
2. If you want the "16-bit code units within the string’s UTF-16 representation":
str.utf16.count
Most of the time, 1 is what you need.
When would you need 2? I've found a use case for 2:
let regex = try! NSRegularExpression(pattern:"🐶",
options: NSRegularExpressionOptions.UseUnixLineSeparators)
let str = "🐶🐶🐶🐶🐶🐶"
let result = regex.stringByReplacingMatchesInString(str,
options: NSMatchingOptions.WithTransparentBounds,
range: NSMakeRange(0, str.utf16.count), withTemplate: "dog")
print(result) // dogdogdogdogdogdog
If you use 1, the result is incorrect:
let result = regex.stringByReplacingMatchesInString(str,
options: NSMatchingOptions.WithTransparentBounds,
range: NSMakeRange(0, str.characters.count), withTemplate: "dog")
print(result) // dogdogdog🐶🐶🐶
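With Swift 4 or later and Foundation, the UTF-16 length does not have to be spelled out by hand: NSRange(_:in:) converts a range of String.Index values into the matching NSRange. A minimal sketch of the same replacement under that assumption (the constant names are illustrative):

```swift
import Foundation

let dogPattern = try! NSRegularExpression(pattern: "🐶")
let dogs = "🐶🐶🐶🐶🐶🐶"

// Covers the whole string, measured in UTF-16 code units as NSString expects.
let fullRange = NSRange(dogs.startIndex..<dogs.endIndex, in: dogs)

let replaced = dogPattern.stringByReplacingMatches(
    in: dogs,
    options: [],
    range: fullRange,
    withTemplate: "dog")
print(replaced) // dogdogdogdogdogdog
```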

You could try like this
var test1: String = "Scott"
var length = test1.bridgeToObjectiveC().length

in Swift 2.x the following is how to find the length of a string
let findLength = "This is a string of text"
findLength.characters.count
returns 24

Swift 2.0:
Get a count: yourString.text.characters.count
Fun example of how this is useful would be to show a character countdown from some number (150 for example) in a UITextView:
func textViewDidChange(textView: UITextView) {
yourStringLabel.text = String(150 - yourStringTextView.text.characters.count)
}

In Swift 4 I had always used string.count, until today I found that
string.endIndex.encodedOffset
is faster: for a 50,000-character string it is about 6 times faster than .count. The cost of .count grows with the string length, while that of .endIndex.encodedOffset does not.
There is one catch, though. It gives the wrong result for strings containing emoji, so only .count is correct in general.
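encodedOffset was deprecated in Swift 5 in favour of utf16Offset(in:). The sketch below, which assumes Swift 5 or later, shows why the offset of endIndex is not a character count: it measures UTF-16 code units, so it only agrees with count when every character fits in a single code unit.

```swift
let plain = "hello"
let withEmoji = "a👍"

// For plain ASCII text the UTF-16 offset of endIndex equals the character count.
let plainOffset = plain.endIndex.utf16Offset(in: plain)

// "👍" is one Character but two UTF-16 code units, so the offset is larger.
let emojiOffset = withEmoji.endIndex.utf16Offset(in: withEmoji)

print(plain.count, plainOffset)     // 5 5
print(withEmoji.count, emojiOffset) // 2 3
```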

In Swift 4 :
If the string contains only ASCII characters, count is all you need:
let str : String = "abcd"
let count = str.count // output 4
If the string contains non-ASCII characters, the character count and the UTF-8 byte count differ, so pick the one you need:
let spain = "España"
let count1 = spain.count // output 6
let count2 = spain.utf8.count // output 7

In Xcode 6.1.1
extension String {
var length : Int { return self.utf16Count }
}
I think that brainiacs will change this on every minor version.

Get string value from your textview or textfield:
let textlengthstring = (yourtextview?.text)! as String
Find the count of the characters in the string:
let numberOfChars = textlengthstring.characters.count

Here is what I ended up doing
let replacementTextAsDecimal = Double(string)
if string.characters.count > 0 &&
replacementTextAsDecimal == nil &&
replacementTextHasDecimalSeparator == nil {
return false
}

Swift 4 update comparing with swift 3
Swift 4 removes the need for a characters array on String. This means that you can call count directly on a string without getting the characters array first.
"hello".count // 5
In Swift 3, by contrast, you have to get the characters array and then count the elements in it. Note that the following form is still available in Swift 4.0, as you can still call characters to access the characters of a given string:
"hello".characters.count // 5
Swift 4.0 also adopts Unicode 9 and can now interpret grapheme clusters such as emoji sequences correctly. For example, counting an emoji gives you 1, while in Swift 3.0 you may get a count greater than 1.
"👍🏽".count // Swift 4.0 prints 1, Swift 3.0 prints 2
"👨‍❤️‍💋‍👨".count // Swift 4.0 prints 1, Swift 3.0 prints 4

Swift 4
let str = "Your name"
str.count
Remember: spaces are counted too.

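You can get the length by writing an extension (keep only the variant that matches your Swift version, since the Swift 3 and Swift 4 variants share a signature):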
extension String {
// MARK: Use if it's Swift 2
func stringLength(str: String) -> Int {
return str.characters.count
}
// MARK: Use if it's Swift 3
func stringLength(_ str: String) -> Int {
return str.characters.count
}
// MARK: Use if it's Swift 4
func stringLength(_ str: String) -> Int {
return str.count
}
}

One way to get the length of a String in Swift 1.2 is to count its UTF-16 code units:
var str = "Hello World"
var length = count(str.utf16)

String and NSString are toll-free bridged, so you can use the methods available on NSString with a Swift String. Note that NSString exposes its size as length, not count:
let x = "test" as NSString
let y : NSString = "string 2"
let lenx = x.length
let leny = y.length

test1.characters.count
will get you the number of letters/numbers etc in your string.
ex:
test1 = "StackOverflow"
print(test1.characters.count)
(prints "13")

Apple made this different from other major languages. The current way is to call:
test1.characters.count
Be careful, though: when you say length, make sure you mean the count of characters and not the count of bytes, because the two can differ when the string contains non-ASCII characters.
For example;
"你好啊hi".characters.count will give you 5 but this is not the count of the bytes.
To get the real count of bytes, you need to do "你好啊hi".lengthOfBytes(using: String.Encoding.utf8). This will give you 11.
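In Swift 4 and later the same two numbers are available without Foundation: count gives the characters and the utf8 view gives the bytes. A short sketch, with illustrative constant names:

```swift
let greeting = "你好啊hi"

let characterCount = greeting.count   // user-perceived characters
let byteCount = greeting.utf8.count   // UTF-8 code units, i.e. bytes

// Each of the three Chinese characters takes 3 bytes in UTF-8,
// and "h" and "i" take 1 byte each: 3 * 3 + 2 = 11.
print(characterCount, byteCount) // 5 11
```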

Right now (in Swift 2.3) if you use:
myString.characters.count
the method will return a "Distance" type, if you need the method to return an Integer you should type cast like so:
var count = myString.characters.count as Int

my two cents for swift 3/4
If You need to conditionally compile
#if swift(>=4.0)
let len = text.count
#else
let len = text.characters.count
#endif

Swift (for in) immutable value

I am trying to create a function within the ViewController class. I want to loop through a string and count the number of times a specific character occurs. Code looks like this:
var dcnt:Int = 0
func decimalCount(inputvalue: String) -> Int {
for chr in inputvalue.characters {
if chr == “.” {
++dcnt
}
}
return dcnt
}
The input string comes from a UILabel!
I get a warning: Immutable value ‘chr’ was never used.
How can I fix this problem

The problem, as so often in Swift, lies elsewhere. It's the curly quotes. Put this:
if chr == "." {
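Once the quotes are straight, the function itself can be sketched in current Swift as follows. This is an illustrative rewrite, not the asker's code: it assumes Swift 4 or later (where a String can be iterated directly and ++ no longer exists), and it keeps the counter local so that repeated calls do not accumulate into a shared property.

```swift
/// Counts how many times "." occurs in `inputValue`.
func decimalCount(in inputValue: String) -> Int {
    var count = 0
    // The where clause limits the loop body to matching characters.
    for character in inputValue where character == "." {
        count += 1
    }
    return count
}

print(decimalCount(in: "1.2.3")) // 2
```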

Validate Unicode code point in Swift

I'm writing a routine in Swift that needs to try to convert an arbitrary integer to a UnicodeScalar or return an error. The constructor UnicodeScalar(_:Int) does the job for valid code points, but it crashes when passed integers that are not valid code points.
Is there a Swift (or Foundation) function I can call to pre-flight that an integer i is a valid Unicode code point and won't cause UnicodeScalar(i) to crash?

Update for Swift 3:
UnicodeScalar has a failable initializer now which verifies if the
given number is a valid Unicode code point or not:
if let unicode = UnicodeScalar(0xD800) {
print(unicode)
} else {
print("invalid")
}
(Previous answer:) You can use the built-in UTF32() codec to check if a given integer
is a valid Unicode scalar:
extension UnicodeScalar {
init?(code: UInt32) {
var codegen = GeneratorOfOne(code) // As suggested by @rintaro
var utf32 = UTF32()
guard case let .Result(scalar) = utf32.decode(&codegen) else {
return nil
}
self = scalar
}
}
(Using ideas from https://stackoverflow.com/a/24757284/1187415
and https://stackoverflow.com/a/31285671/1187415.)

The Swift documentation states
A Unicode scalar is any Unicode code point in the range U+0000 to
U+D7FF inclusive or U+E000 to U+10FFFF inclusive.
The UnicodeScalar constructor does not crash for any value in those ranges in Swift 2.0b4. I use this convenience constructor:
extension UnicodeScalar {
init?(code: Int) {
guard (code >= 0 && code <= 0xD7FF) || (code >= 0xE000 && code <= 0x10FFFF) else {
return nil
}
self.init(code)
}
}
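In Swift 3 and later the explicit range check can be delegated to the failable Unicode.Scalar(UInt32) initializer. The sketch below assumes that initializer and adds only the conversion from Int; the function name validScalar(from:) is hypothetical.

```swift
/// Returns the Unicode scalar for `code`, or nil if `code` is not a valid
/// Unicode scalar value (negative, a surrogate, or above U+10FFFF).
func validScalar(from code: Int) -> Unicode.Scalar? {
    // Rejects negative values and anything that does not fit in 32 bits.
    guard let value = UInt32(exactly: code) else { return nil }
    // The failable initializer rejects surrogates and values above 0x10FFFF.
    return Unicode.Scalar(value)
}

print(validScalar(from: 0x41) != nil)   // true
print(validScalar(from: 0xD800) != nil) // false
```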
