How can I count the number of sentences in a given text in Swift? [closed] - swift

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I wanted to create a playground that counts the number of sentences in a given text.
import Foundation
let input = "That would be the text . it hast 3. periods. "
func sentencecount() {
// Split on whitespace and punctuation (this yields word-like fragments, not sentences)
let separators = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)
let parts = input.components(separatedBy: separators)
let periods2 = parts.count
print("The Average Sentence length is \(periods2)")
}
sentencecount()

You can use enumerateSubstrings(in: Range) and use option .bySentences:
let input = "Hello World !!! That would be the text. It hast 3 periods."
var sentences: [String] = []
input.enumerateSubstrings(in: input.startIndex..., options: .bySentences) { (string, range, enclosingRange, stop) in
// The substring is optional, so unwrap it instead of force-unwrapping
if let string = string { sentences.append(string) }
}
An alternative is to use an array of Substrings instead of Strings:
var sentences: [Substring] = []
input.enumerateSubstrings(in: input.startIndex..., options: .bySentences) { (string, range, enclosingRange, stop) in
sentences.append(input[range])
}
print(sentences) // "["Hello World !!! ", "That would be the text. ", "It hast 3 periods."]\n"
print(sentences.count) // "3\n"

This should work:
let input = "That would be the text . it hast 3. periods. "
// String is itself a collection of Characters, so filter it directly
let occurrences = input.filter { $0 == "." || $0 == "?" }.count
print(occurrences)
// result: 3

Add every character that should end a sentence to the character set used for splitting.
I am assuming ? . , for now:
let input = "That would be the text. it hast 3? periods."
let charset = CharacterSet(charactersIn: ".?,")
let arr = input.components(separatedBy: charset)
let count = arr.count - 1
Here arr would be:
["That would be the text", " it hast 3", " periods", ""]
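Subtract 1 from the count to get the actual number of sentences, because the final terminator leaves an empty string at the end of the array.
Note: If you don't want "," to count as a sentence separator, remove it from charset.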
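The `arr.count - 1` adjustment above only holds when the text ends with exactly one terminator. A minimal sketch that filters out empty fragments instead is shown below; the function name `countSentences(in:terminators:)` and the default terminator set `".?!"` are assumptions made for illustration, not part of the original answer.

```swift
import Foundation

/// Counts the non-empty fragments between sentence terminators.
/// `terminators` is an assumed default; adjust it to your own rules.
func countSentences(in text: String, terminators: String = ".?!") -> Int {
    let separators = CharacterSet(charactersIn: terminators)
    return text
        .components(separatedBy: separators)
        // Drop fragments that are empty or contain only whitespace
        .filter { !$0.trimmingCharacters(in: .whitespacesAndNewlines).isEmpty }
        .count
}

print(countSentences(in: "That would be the text. it hast 3? periods."))  // 3
print(countSentences(in: "Wait... what?!"))                               // 2
```

Like the answer it follows, this is purely character-based, so abbreviations such as "e.g." or decimal numbers would still be miscounted.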

As far as I can see, you need to trim the whitespace and then split the text on ".", as follows:
func sentencecount () {
let result = input.trimmingCharacters(in: .whitespaces).split(separator: ".")
print ("The Average Sentence length is \(result.count)") // 3
}
Good luck!

Related

How to lowercase only first word in the sentence? [duplicate]

This question already has answers here:
Swift apply .uppercaseString to only the first letter of a string
(31 answers)
Closed last year.
I have the strings "Hello Word" and "Word Hello".
How can I get "hello Word" and "word Hello"? For example:
let string1 = "Hello Word"
let referenceString1 = "hello Word"
let string2 = "Word Hello"
let referenceString2 = "word Hello"
Take the first letter of the first word, lowercase it, then append the rest of the string.
extension StringProtocol {
// Lowercases only the first character and keeps the remainder unchanged
var lowerCaseFirst: String { prefix(1).lowercased() + dropFirst() }
}
let str = "Hello World"
print(str.lowerCaseFirst) // hello World
def fun(sentence : str) -> str:
    words = sentence.split()
    if not (words[0])[0].islower():
        words[0] = (words[0])[0].lower() + (words[0])[1:]
    out = " ".join(words)
    return out
if __name__ == '__main__':
    sentence1 = "Hello Everyone"
    out1 = fun(sentence1)
    print(sentence1, " : ", out1)
    sentence2 = "HOW ARE YOU"
    out2 = fun(sentence2)
    print(sentence2, " : ", out2)
Output:
Hello Everyone : hello Everyone
HOW ARE YOU : hOW ARE YOU
The selected answer works perfectly for the OP's use case:
Hello Word
However, if we are talking about the first word of each sentence, the string could be more complex and contain multiple sentences:
Hello Word. JELLO world? WEllo Word.\n ZEllo world! MeLLow lord.
In that case, a regular expression can handle both scenarios.
Someone with stronger regex skills might be able to improve the pattern.
extension String
{
func withLowerCaseSentences() -> String
{
do
{
// A sentence is something that starts after a period, question mark,
// exclamation followed by a space. This is only not true for the first
// sentence
//
// REGEX to capture this
// (?:^|(?:[.!?]\\s+))
// Exclude group to get the starting character OR
// anything after a period, exclamation, question mark followed by a
// whitespace (space or optional new line)
//
// ([A-Z])
// Capture group to capture all the capital characters
let regexString = "(?:^|(?:[.!?]\\s+))([A-Z])"
// Initialize the regex
let regex = try NSRegularExpression(pattern: regexString,
options: .caseInsensitive)
// Work on a mutable copy of the string
var result = self
// Loop through all the regex matches; the NSRange must cover the
// UTF-16 length of the string, not its Character count
for match in regex.matches(in: self,
options: [],
range: NSRange(startIndex..., in: self))
{
// We are not looking for the match, but for the group
// For example Hello Word. JELLO word will give us two matches
// "H" and ". J" but each of the groups within each match
// will give us what we want which is "H" and "J" so we check if we
// have a group. Look up matches and groups to learn more.
// The NSRange of the first group is converted to a String range
if match.numberOfRanges > 1,
let groupRange = Range(match.range(at: 1), in: result)
{
// Replace the capital letter with its lowercased form
result.replaceSubrange(groupRange, with: result[groupRange].lowercased())
}
}
// Return the processed string
return result
}
catch
{
// handle errors
print("error")
return self
}
}
}
Then it can be used:
let simpleString = "Hello world"
print("BEFORE")
print(simpleString)
print("\nAFTER")
print(simpleString.withLowerCaseSentences())
let complexString
= "Hello Word. JELLO world? WEllo Word.\n ZEllo world! MeLLow lord."
print("\nBEFORE")
print(complexString)
print("\nAFTER")
print(complexString.withLowerCaseSentences())
This gives the output:
BEFORE
Hello world
AFTER
hello world
BEFORE
Hello Word. JELLO world? WEllo Word.
ZEllo world! MeLLow lord.
AFTER
hello Word. jELLO world? wEllo Word.
zEllo world! meLLow lord.

Swift filter map reduce which option [duplicate]

This question already has answers here:
How to get the first character of each word in a string?
(11 answers)
Closed 1 year ago.
I have a quick question about Swift: assuming I have the string "New Message", which of filter, map, or reduce should I use to get just the initials, NM?
I would use map to get the first character of each word in the string, then use reduce to combine them.
let string = "New Message"
let individualWords = string.components(separatedBy: " ")
let firstCharacters = individualWords.map { $0.prefix(1) }.reduce("", +)
print("firstCharacters is \(firstCharacters)")
Result:
firstCharacters is NM
Edit: Per @LeoDabus' comment, joined is more concise than reduce("", +), and does the same thing.
let firstCharacters = individualWords.map { $0.prefix(1) }.joined()
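Splitting with `components(separatedBy: " ")` produces empty strings when words are separated by more than one space. A small sketch that splits on any whitespace instead is shown below; the function name `initials(of:)` is hypothetical and chosen only for this example.

```swift
/// Builds initials from the first character of each whitespace-separated word.
func initials(of text: String) -> String {
    let firstCharacters = text
        // split omits empty subsequences by default, so repeated spaces are harmless
        .split(whereSeparator: { $0.isWhitespace })
        .compactMap { $0.first }
    return String(firstCharacters)
}

print(initials(of: "New Message"))        // NM
print(initials(of: "  New   Message "))   // NM
```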

Remove characters from string [Swift] [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
There is a line of console output that needs to be displayed in the application. Can anyone suggest how to clean up the line?
"\u{1B}[2K\u{1B}[1G\u{1B}[32msuccess\u{1B}[39m Checking for changed pages - 0.000s"
And get in the end:
"success Checking for changed pages - 0.000s"
Use the regular expression mentioned in this answer
let str = "\u{1B}[2K\u{1B}[1G\u{1B}[32msuccess\u{1B}[39m Checking for changed pages - 0.000s"
let regex = try! NSRegularExpression(pattern: "\u{1B}(?:[@-Z\\-_]|\\[[0-?]*[ -/]*[@-~])")
let range = NSRange(str.startIndex..<str.endIndex, in: str)
let cleaned = regex.stringByReplacingMatches(in: str, options: [], range: range, withTemplate: "")
print(cleaned) // "success Checking for changed pages - 0.000s"
@Gereon's answer is one way to skin the cat. Here's another:
let s = "\u{1B}[2K\u{1B}[1G\u{1B}[32msuccess\u{1B}[39m Checking for changed pages - 0.000s"
guard let r = s.range(of: "Checking for changed pages") else {
fatalError("Insert code for what to do if the substring isn't found")
}
let cleaned = "success " + String(s[r.lowerBound...])
Here I just literally insert the "success". But if you really need to verify that it's in the string, that can be done too.
guard let r = s.range(of: "Checking for changed pages"), s.contains("success")
else
{
fatalError("Insert code for what to do if the substring isn't found")
}
To solve the more general problem of removing ANSI escape sequences, you'll need to parse them. Neither my simple solution, nor the regex solution will do it. You'll need to explicitly look for the possible valid codes that follow the escapes.
let escapeSequences: [String] =
[/* list of escape sequences */
"[2K", "[1G", "[32m", "[39m", // etc...
]
let escapeChar = Character("\u{1B}")
var result = ""
var i = s.startIndex
outer: while i != s.endIndex
{
if s[i] == escapeChar
{
// Skip the escape character, then try to skip a known sequence after it
i = s.index(after: i)
for sequence in escapeSequences {
if s[i...].hasPrefix(sequence) {
i = s.index(i, offsetBy: sequence.count)
continue outer
}
}
// Unknown sequence: only the escape character itself is dropped
continue outer
}
else { result.append(s[i]) }
i = s.index(after: i)
}
print(result)
The thing is, I think ANSI escape sequences can be combined in interesting ways so that what would be multiple escapes can be merged into a single one in some cases. So it can be more complex than just the simple parser I presented.
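For the common case of CSI sequences (the ones that start with ESC followed by "["), the list of known codes can be replaced by a scan over byte classes: parameter bytes 0x30-0x3F, then intermediate bytes 0x20-0x2F, then one final byte 0x40-0x7E. The sketch below takes that approach; the function name `strippingCSISequences(from:)` is made up for this example, and other escape families (for instance OSC sequences) are deliberately not handled.

```swift
/// Removes CSI escape sequences (ESC "[" parameters intermediates final) from `text`.
/// Anything that is not a complete CSI sequence is copied through unchanged.
func strippingCSISequences(from text: String) -> String {
    let scalars = Array(text.unicodeScalars)
    var output = String.UnicodeScalarView()
    var i = 0
    while i < scalars.count {
        // A CSI sequence starts with ESC (0x1B) followed by "["
        if scalars[i].value == 0x1B, i + 1 < scalars.count, scalars[i + 1] == "[" {
            var j = i + 2
            // Parameter bytes: 0x30...0x3F (digits, ";", "?", ...)
            while j < scalars.count, (0x30...0x3F).contains(scalars[j].value) { j += 1 }
            // Intermediate bytes: 0x20...0x2F
            while j < scalars.count, (0x20...0x2F).contains(scalars[j].value) { j += 1 }
            // Final byte: 0x40...0x7E; only then is the sequence complete
            if j < scalars.count, (0x40...0x7E).contains(scalars[j].value) {
                i = j + 1
                continue
            }
        }
        output.append(scalars[i])
        i += 1
    }
    return String(output)
}

let line = "\u{1B}[2K\u{1B}[1G\u{1B}[32msuccess\u{1B}[39m Checking for changed pages - 0.000s"
print(strippingCSISequences(from: line))  // success Checking for changed pages - 0.000s
```

Working on Unicode scalars rather than Characters keeps the byte-range checks simple and avoids any dependence on grapheme clustering.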

Swift String Tokenizer / Parser [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
Hello there fellow Swift devs!
I am a junior dev, and I'm trying to figure out a best way to tokenize / parse Swift String as an exercise.
What I have is a string which looks like this:
let string = "This is a {B}string{/B} and this is a substring."
What I would like to do is, tokenize the string, and change the "strings / tokens" inside the tags you see.
I can see using NSRegularExpression and its matches, but it feels too generic. I would like to support only, say, 2 of these tags that change the text. What would be the best approach in Swift 5.2+?
if let regex = try? NSRegularExpression(pattern: "\\{[a-z0-9]+\\}", options: .caseInsensitive) {
let string = self as NSString
return regex.matches(in: self, options: [], range: NSRange(location: 0, length: string.length)).map {
// now $0 is the result? but it won't work for enclosing the tags :/
}
}
If the option of using html tags instead of {B}{/B} is acceptable, then you can use the StringEx library that I wrote for this purpose.
You can select a substring inside the html tag and replace it with another string like this:
let string = "This is a <b>string</b> and this is a substring."
let ex = string.ex
ex[.tag("b")].replace(with: "some value")
print(ex.rawString) // This is a <b>some value</b> and this is a substring.
print(ex.string) // This is a some value and this is a substring.
If necessary, you can also style the selected substrings and get an NSAttributedString:
ex[.tag("b")].style([
.font(.boldSystemFont(ofSize: 16)),
.color(.black)
])
myLabel.attributedText = ex.attributedString
Not sure whether you have solved it with NLTokenizer, but you can certainly solve it with a regular expression. Here is how (I have kept it generic, so if you later need to handle different kinds of tags and substitute a different string for each, a small tweak to the logic should do the job):
override func viewDidLoad() {
super.viewDidLoad()
let regexStr = "(\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\})"
let regex = try! NSRegularExpression(pattern: regexStr)
let string = "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
var foundRanges = [NSRange]()
// NSRange counts UTF-16 code units, so use utf16.count rather than count
regex.enumerateMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count)) { (match, flag, stop) in
if let matchRange = match?.range(at: 1) {
foundRanges.append(matchRange)
}
}
let substituteString = "abcd"
var replacedString = string as NSString
let foundRangesCount = foundRanges.count
var currentRange = 0
while foundRangesCount > currentRange {
let range = foundRanges[currentRange]
replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - substituteString.count)
currentRange += 1
}
debugPrint(replacedString)
}
func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
var newFoundRange = [NSRange]()
for range in ranges {
newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
}
ranges = newFoundRange
}
Input: "Sandeep {B}Bhandaari{/B} is here"
Output: Sandeep abcd is here
Input: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}"
Output: Sandeep abcd is hereabcd
Note the edge cases this handles: longer strings replaced by shorter substitute strings and vice versa, and tagged strings with or without surrounding spaces.
EDIT 1:
The regex (\\{B\\}(\\s*\\w+\\s*)*\\{\\/B\\}) should be self-explanatory; in case you need help understanding it, use a regex cheat sheet.
regex.enumerateMatches(in: string, options: [], range: NSMakeRange(0, string.utf16.count)) { (match, flag, stop) in
if let matchRange = match?.range(at: 1) {
foundRanges.append(matchRange)
}
}
I could have modified the string right here, but if there is more than one match, mutating the string would invalidate the ranges that are still to be processed. Hence I save all found ranges in an array and apply the replacement to each of them later.
let substituteString = "abcd"
var replacedString = string as NSString
let foundRangesCount = foundRanges.count
var currentRange = 0
while foundRangesCount > currentRange {
let range = foundRanges[currentRange]
replacedString = replacedString.replacingCharacters(in: range, with: substituteString) as NSString
reEvaluateAllRanges(ranges: &foundRanges, byOffset: range.length - substituteString.count)
currentRange += 1
}
Here I iterate through all found match ranges and replace the characters in each range with the substitute string. You can always put a switch or an if-else ladder inside the while loop to look for different types of tags and pass a different substitute string for each tag.
func reEvaluateAllRanges(ranges: inout [NSRange], byOffset: Int) {
var newFoundRange = [NSRange]()
for range in ranges {
newFoundRange.append(NSMakeRange(range.location - byOffset, range.length))
}
ranges = newFoundRange
}
This function shifts all the ranges in the array by the offset. Remember that you only need to modify each range's location; the length remains the same.
One optimisation you could make is to drop from the array the ranges for which you have already applied the substitute string.
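An alternative to re-evaluating the saved ranges is to apply the replacements from the last match to the first, so that the ranges still to be processed are never shifted. The sketch below illustrates that idea; the function name `replacingBoldTags(in:with:)` and the simplified non-greedy pattern are assumptions for this example, not the pattern used in the answer above.

```swift
import Foundation

/// Replaces every `{B}…{/B}` span (tags included) with `replacement`.
func replacingBoldTags(in text: String, with replacement: String) -> String {
    // Non-greedy match so adjacent tagged spans are found separately
    // (assumed pattern; "." does not match newlines by default)
    let pattern = "\\{B\\}.*?\\{/B\\}"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return text }
    var result = text
    let fullRange = NSRange(text.startIndex..., in: text)
    // Walk the matches back to front: text before a match is untouched,
    // so the UTF-16 offsets of the remaining matches stay valid
    for match in regex.matches(in: text, options: [], range: fullRange).reversed() {
        if let range = Range(match.range, in: result) {
            result.replaceSubrange(range, with: replacement)
        }
    }
    return result
}

print(replacingBoldTags(in: "Sandeep {B}Bhandaari{/B} is here{B}Sandeep{/B}", with: "abcd"))
// Sandeep abcd is hereabcd
```

When every match gets the same fixed replacement, `stringByReplacingMatches(in:options:range:withTemplate:)` does the job in a single call; the explicit loop is mainly useful when the replacement depends on the tag or on the matched text.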

Swift - Parse a String to extract and edit numbers in it [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
Given a String taken from the internet, such as:
"A dusting of snow giving way to moderate rain (total 10mm) heaviest on Thu night. Freeze-thaw conditions (max 8°C on Fri morning, min -2°C on Wed night). Mainly strong winds."
Using Swift 3, I want to convert the temperatures to Fahrenheit. So I need to find any numbers that have °C after them (including negative numbers); convert them to Fahrenheit, and then replace the integer back into the string.
I was originally trying to use: components(separatedBy: String). I did get it to work with this method. Although I think there is probably a better way.
func convertStringToFahrenheit (_ message: String) -> String{
var stringBuilder = String()
let stringArray = message.components(separatedBy: "°C")
for subString in stringArray {
if subString != stringArray.last {
if subString.contains("(max "){
let subStringArray = subString.components(separatedBy: "(max ")
stringBuilder.append(subStringArray[0])
stringBuilder.append("(max ")
if var tempInt = Int(subStringArray[1]){
tempInt = convertCelsiusToFahrenheit(tempInt)
stringBuilder.append(String(tempInt))
stringBuilder.append("°F")
}
}
else if subString.contains(", min "){
let subStringArray = subString.components(separatedBy: ", min ")
stringBuilder.append(subStringArray[0])
stringBuilder.append(", min ")
if var tempInt = Int(subStringArray[1]){
tempInt = convertCelsiusToFahrenheit(tempInt)
stringBuilder.append(String(tempInt))
stringBuilder.append("°F")
}
}
}
else {
stringBuilder.append(subString)
}
}
return stringBuilder
}
This is a job for a regular expression.
The pattern "(-?\\d+)°C" searches for
an optional minus sign -?
followed by one or more digits \\d+
followed by °C
The group – within the parentheses – captures the degrees value.
var string = "A dusting of snow giving way to moderate rain (total 10mm) heaviest on Thu night. Freeze-thaw conditions (max 8°C on Fri morning, min -2°C on Wed night). Mainly strong winds."
let pattern = "(-?\\d+)°C"
do {
let regex = try NSRegularExpression(pattern: pattern)
let matches = regex.matches(in: string, range: NSRange(location: 0, length: string.utf16.count))
for match in matches.reversed() { // reversed() is crucial to preserve the indexes.
let celsius = (string as NSString).substring(with: match.rangeAt(1))
let fahrenheit = Double(celsius)! * 1.8 + 32
let range = match.range
let start = string.index(string.startIndex, offsetBy: range.location)
let end = string.index(start, offsetBy: range.length)
string = string.replacingCharacters(in: start..<end, with: String(format: "%.0f°F", fahrenheit))
}
} catch {
print("Regex Error:", error)
}
print(string)
The most complicated part of the code is the conversion NSRange -> Range<String.Index>
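In Swift 4 and later, Foundation provides `Range(_:in:)` and `NSRange(_:in:)`, which take care of that conversion. A sketch of the same approach in newer Swift might look like the following; the function name `convertingCelsiusToFahrenheit(in:)` is an assumption made for this example, and results are rounded to whole degrees as in the code above.

```swift
import Foundation

/// Rewrites every integer "N°C" in `text` as a rounded "M°F".
func convertingCelsiusToFahrenheit(in text: String) -> String {
    guard let regex = try? NSRegularExpression(pattern: "(-?\\d+)°C") else { return text }
    var result = text
    let matches = regex.matches(in: text, options: [], range: NSRange(text.startIndex..., in: text))
    // Process matches back to front so earlier offsets remain valid
    for match in matches.reversed() {
        guard let fullRange = Range(match.range, in: result),
              let valueRange = Range(match.range(at: 1), in: result),
              let celsius = Double(String(result[valueRange])) else { continue }
        let fahrenheit = (celsius * 1.8 + 32).rounded()
        result.replaceSubrange(fullRange, with: "\(Int(fahrenheit))°F")
    }
    return result
}

print(convertingCelsiusToFahrenheit(in: "max 8°C on Fri morning, min -2°C on Wed night"))
// max 46°F on Fri morning, min 28°F on Wed night
```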