How can I count the characters in a string, excluding the spaces between words? - Swift

My string value is shown below:
let str = "Hello I m iOS developer"
If I want to count the characters in str, I can do it like this:
print(str.utf16.count) // 23
But this code also counts the spaces. I only want the number of characters, without the spaces. Is there a simple way to count just the characters?

let str = "Hello I m iOS developer"
let filter = str.filter {!$0.isWhitespace}
print(filter.count)
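As a rough sketch of the difference between the two counts: `utf16.count` counts every UTF-16 code unit, spaces included, while the filter above keeps only the `Character` values that are not whitespace. The helper name `nonWhitespaceCount` is made up for illustration. It uses `reduce`, so no intermediate filtered string is built.

```swift
// Hypothetical helper: counts the characters that are not whitespace.
func nonWhitespaceCount(_ text: String) -> Int {
    // Add 1 for every Character that is not whitespace (spaces, tabs, newlines).
    return text.reduce(0) { count, character in
        character.isWhitespace ? count : count + 1
    }
}

let sample = "Hello I m iOS developer"
print(sample.utf16.count)          // 23 (UTF-16 code units, spaces included)
print(sample.count)                // 23 (Characters, spaces included)
print(nonWhitespaceCount(sample))  // 19 (Characters, spaces excluded)
```

Note that `isWhitespace` also excludes tabs and newlines. If only the space character should be skipped, comparing against `" "` would be the narrower check.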

Related

Swift 5 split string at integer index

It used to be that you could use substring to get a portion of a string. That has been deprecated in favor of string indices. But I can't seem to make a string index out of integers.
var str = "hellooo"
let newindex = str.index(after: 3)
str = str[newindex...str.endIndex]
No matter what the string is, I want the second 3 characters, so str would contain "loo". How can I do this?
Drop the first three characters, then take the first three of the remaining characters:
let str = "helloo"
let secondThreeCharacters = String(str.dropFirst(3).prefix(3))
You might add some code to handle the case where the string has fewer than 6 characters.
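One possible way to handle short strings is sketched below. The function name `secondThreeCharacters(of:)` and the choice to return nil for strings shorter than 6 characters are assumptions made for illustration. It relies on `index(_:offsetBy:limitedBy:)`, which returns nil instead of trapping when the offset would run past the limit.

```swift
// Hypothetical helper: returns characters 4 through 6,
// or nil if the string has fewer than 6 characters.
func secondThreeCharacters(of text: String) -> String? {
    guard
        // Index of the 4th character (or endIndex if the string has exactly 3).
        let start = text.index(text.startIndex, offsetBy: 3, limitedBy: text.endIndex),
        // Index just past the 6th character; nil if fewer than 6 characters exist.
        let end = text.index(start, offsetBy: 3, limitedBy: text.endIndex)
    else {
        return nil
    }
    return String(text[start..<end])
}

print(secondThreeCharacters(of: "hellooo") ?? "too short")  // loo
print(secondThreeCharacters(of: "hello") ?? "too short")    // too short
```

The `dropFirst(3).prefix(3)` version does not crash on short input either. It simply returns fewer than three characters, so which behavior is preferable depends on the caller.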

Getting a random emoji/character from a unicode string

My goal is to get a random emoticon, from a list, in F#.
I started with this:
let pickOne (icons: string) : char = icons.[Helpers.random.Next(icons.Length)]
let happySymbols = "🔥😂😊😍🙏😎💪😋😇🎉🙌🤘👍🤑🤩🤪🤠🥳😌🤤😍😀"
let sadSymbols = "😭😔😒😩😢🤦🤷😱👎🤨😑😬🙄🤮😵🤯🧐😕😟😤😡🤬"
that doesn't work because:
"🔥😂😊😍🙏😎💪😋😇🎉🙌🤘👍🤑🤩🤪🤠🥳😌🤤😍😀".Length
returns 44, because Length returns the number of chars in the string, which doesn't work well with Unicode characters like these.
I can't just divide by 2 because I may add some single byte characters in the string at some point.
Indexing doesn't work either:
let a = "🔥😂😊😍🙏😎💪😋😇🎉🙌🤘👍🤑🤩🤪🤠🥳😌🤤😍😀"
a.[0]
will not return 🔥 but I get some unknown character symbol.
so, plan B was: let's make this an array instead of a string:
let a = [| '🔥'; '😂'; '😊'; '😍'; '🙏'; '😎'; '💪'; '😋'; '😇'; '🎉'; '🙌'; '🤘'; '👍'; '🤑'; '🤩'; '🤪'; '🤠'; '🥳'; '😌'; '🤤'; '😍'; '😀' |]
this is not compiling, I'm getting:
Parse error Unexpected quote symbol in binding. Expected '|]' or other token.
why is that?
anyhow, I can make a list of strings and get it to work, but I'm curious: is there a "proper" way to make the first one work and take a random unicode character from a unicode string?
Asti's answer works for your purpose, but I wasn't too happy about where we landed on this. I guess I got hung up on the word "proper" in the question. After a lot of research in various places, I got curious about the method String.EnumerateRunes, which in turn led me to the type Rune. The documentation for that type is particularly enlightening about proper string handling, and about what's in a Unicode (UTF-16) string in .NET. I also experimented in LINQPad, and got this.
let dump x = x.Dump()
let runes = "abcABCæøåÆØÅ😂😊😍₅茨茧茦茥".EnumerateRunes().ToArray()
runes.Length |> dump
// 20
runes |> Array.iter (fun rune -> dump (string rune))
// a b c A B C æ ø å Æ Ø Å 😂 😊 😍 ₅ 茨 茧 茦 茥
dump runes
// see screenshot
let smiley = runes.[13].ToString()
dump smiley
// 😊
All strings in .NET are 16-bit unicode strings.
That's the definition of char:
Represents a character as a UTF-16 code unit.
Every character takes at least the minimum encoding size (2 bytes for UTF-16) and as many more bytes as it requires. Emojis don't fit in 2 bytes, so they take up 4 bytes, or 2 chars.
So what's the solution? align(4) all the things! (insert GCC joke here).
First we convert everything into UTF32:
open System.Text
let utf32 (source: string) =
    Encoding.Convert(Encoding.Unicode, Encoding.UTF32, Encoding.Unicode.GetBytes(source))
Then we can pick and choose any "character":
let pick (arr: byte[]) index =
    Encoding.UTF32.GetString(arr, index * 4, 4)
Test:
let happySymbols = "🔥😂😊😍🙏😎💪😋😇🎉🙌🤘👍🤑🤩🤪🤠🥳😌🤤😍😀YTHO"
> pick (utf32 happySymbols) 0;;
val it : string = "🔥"
> pick (utf32 happySymbols) 22;;
val it : string = "Y"
For the actual length, just div by 4.
open System
let surpriseMe (arr: byte[]) =
    let rnd = Random()
    pick arr (rnd.Next(0, arr.Length / 4))
Hmmm
> surpriseMe (utf32 happySymbols);;
val it : string = "😍"

How to get an uppercase character from a string in this case

I want to get an uppercase "T".
How do I get the uppercase character?
str = "Test Version"
print(str.upper())
print(str[3])
It's not clear what you are asking.
But from context I am guessing you would like to make the second, non-capitalised "t" in the string uppercase. I'm also going to assume you are using Python 3, given your use of upper().
If you just want to get the "t" (and not change the string itself):
upper_T = str[3].upper()
If you want to create a new string from the original, you may be running into the fact that strings in Python are immutable. You therefore must create a new string.
One way to do this:
str2 = list(str)
str2[3] = str[3].upper()
str2 = ''.join(str2)

How to split a Korean word into its components?

So, for example, the character 김 is made up of ㄱ, ㅣ and ㅁ. I need to split the Korean word into its components to get the resulting 3 characters.
I tried doing the following, but it doesn't seem to output it correctly:
let str = "김"
let utf8 = str.utf8
let first:UInt8 = utf8.first!
let char = Character(UnicodeScalar(first))
The problem is that this code returns ê, when it should be returning ㄱ.
You need to use the string's decomposedStringWithCompatibilityMapping property to get the Unicode scalar values, and then use those scalar values to get the characters. Something like the following:
import Foundation
let string = "김"
for scalar in string.decomposedStringWithCompatibilityMapping.unicodeScalars {
    print("\(scalar) ")
}
Output:
ᄀ
ᅵ
ᆷ
You can create a list of single-character strings like this:
let chars = string.decomposedStringWithCompatibilityMapping.unicodeScalars.map { String($0) }
print(chars)
// ["ᄀ", "ᅵ", "ᆷ"]
Korean-related information from the Apple docs:
Extended grapheme clusters are a flexible way to represent many
complex script characters as a single Character value. For example,
Hangul syllables from the Korean alphabet can be represented as either
a precomposed or decomposed sequence. Both of these representations
qualify as a single Character value in Swift:
let precomposed: Character = "\u{D55C}" // 한
let decomposed: Character = "\u{1112}\u{1161}\u{11AB}" // ᄒ, ᅡ, ᆫ
// precomposed is 한, decomposed is 한
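Decomposition yields conjoining jamo (U+1100 block), whereas the letters named in the question (ㄱ, ㅣ, ㅁ) are compatibility jamo (U+3130 block). As a sketch of one way to get the latter, the code below uses the standard Hangul syllable arithmetic from the Unicode Standard (syllable base 0xAC00, 21 × 28 = 588 syllables per initial consonant, 28 per vowel) to index into hand-written tables. The function name `compatibilityJamo(of:)` and the three tables are illustrative, not part of any Apple API, and the function only accepts precomposed syllables.

```swift
// Compatibility jamo, ordered to match the Unicode Hangul syllable composition order.
let initials = Array("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")              // 19 initial consonants
let medials = Array("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")           // 21 vowels
let finals = Array("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  // 27 final consonants

// Hypothetical helper: splits one precomposed Hangul syllable into compatibility jamo.
// Returns nil for anything outside U+AC00...U+D7A3 (including decomposed sequences).
func compatibilityJamo(of syllable: Character) -> [Character]? {
    guard syllable.unicodeScalars.count == 1,
          let scalar = syllable.unicodeScalars.first,
          (0xAC00...0xD7A3).contains(scalar.value) else {
        return nil
    }
    let index = Int(scalar.value - 0xAC00)
    var result: [Character] = [
        initials[index / 588],        // 588 = 21 vowels * 28 final slots
        medials[(index % 588) / 28]
    ]
    let finalIndex = index % 28       // 0 means the syllable has no final consonant
    if finalIndex > 0 {
        result.append(finals[finalIndex - 1])
    }
    return result
}

print(compatibilityJamo(of: "\u{AE40}") ?? [])  // ["ㄱ", "ㅣ", "ㅁ"]
```

A decomposed input could be normalized first with Foundation's `precomposedStringWithCanonicalMapping` before calling the helper.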

Remove substring from a string knowing first and last characters in Swift

Having a string like this:
let str = "In 1273, however, they lost their son in an accident;[2] the young Theobald was dropped by his nurse over the castle battlements.[3]"
I'm looking for a way to remove every pair of square brackets along with anything between them.
I tried using String's replacingOccurrences(of:with:) method, but it requires the exact substring to be removed, so it doesn't work for me.
You can use:
let updated = str.replacingOccurrences(of: "\\[[^\\]]+\\]", with: "", options: .regularExpression)
The regular expression (without the escapes required in a Swift string) is:
\[[^\]]+\]
The \[ and \] look for the characters [ and ]. They have a backslash to remove the normal special meaning of those characters in a regular expression.
The [^\]] means to match any character except the ] character. The + means match 1 or more.
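The same idea can be wrapped in a small helper, sketched below. The function name `removingBracketedText(from:)` is made up for illustration. Two deliberate differences from the line above: it uses a raw string literal (Swift 5 and later) so the pattern needs no doubled backslashes, and it uses `*` instead of `+` so that empty brackets "[]" are removed as well.

```swift
import Foundation

// Hypothetical helper: removes every "[...]" group, including empty "[]".
func removingBracketedText(from text: String) -> String {
    // \[ and \] match literal brackets; [^\]]* matches any run of characters other than "]".
    let pattern = #"\[[^\]]*\]"#
    return text.replacingOccurrences(of: pattern, with: "", options: .regularExpression)
}

print(removingBracketedText(from: "an accident;[2] the young Theobald"))
// an accident; the young Theobald
```

Nested brackets such as "[a[b]c]" are not handled by this pattern. It stops at the first "]" it finds.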
You can create a while loop to get the lowerBound of the range of the first string and the upperBound of the range of the second string and create a range from that. Next just remove the subrange of your string and set the new startIndex for the search.
var str = "In 1273, however, they lost their son in an accident;[2] the young Theobald was dropped by his nurse over the castle battlements.[3]"
var start = str.startIndex
while let from = str.range(of: "[", range: start..<str.endIndex)?.lowerBound,
      let to = str.range(of: "]", range: from..<str.endIndex)?.upperBound,
      from != to {
    str.removeSubrange(from..<to)
    start = from
}
print(str) // "In 1273, however, they lost their son in an accident; the young Theobald was dropped by his nurse over the castle battlements."