I am trying to alphabetically sort an array of non-English strings which contain a number of special Unicode characters. I can create a CharacterSet sequence which contains the desired lexicographic sort order.
Is there an approach in Swift 5 for performing this type of customized sort?
I believe I saw such a function some years back, but a pretty exhaustive search today failed to turn anything up.
Any pointers would be appreciated!
As a simple implementation of matt's cosorting comment:
// You have `t` twice in your string; I've removed the first one.
let alphabet = "ꜢjiyꜤwbpfmnRrlhḥḫẖzsšqkgtṯdḏ "
// Map characters to their location in the string as integers
let order = Dictionary(uniqueKeysWithValues: zip(alphabet, 0...))
// Make the alphabet backwards as a test string
let string = alphabet.reversed()
// This sorts unknown characters at the end. Or you could throw instead.
let sorted = string.sorted { order[$0] ?? .max < order[$1] ?? .max }
print(sorted)
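The snippet above sorts the characters of a single string, while the question asks about an array of strings. A minimal sketch of that case, assuming the same alphabet and using the standard library's lexicographicallyPrecedes(_:by:); the names customAlphabet, rank, precedes and the sample words are made up for illustration:

```swift
// Same alphabet as above; each character is ranked by its position.
let customAlphabet = "ꜢjiyꜤwbpfmnRrlhḥḫẖzsšqkgtṯdḏ "
let rank = Dictionary(uniqueKeysWithValues: zip(customAlphabet, 0...))

// Compares two strings character by character using the custom ranks.
// Characters missing from the alphabet sort after all known ones.
func precedes(_ lhs: String, _ rhs: String) -> Bool {
    return lhs.lexicographicallyPrecedes(rhs) {
        (rank[$0] ?? .max) < (rank[$1] ?? .max)
    }
}

// Hypothetical sample words, only meant to exercise the comparison.
let unsortedWords = ["nfr", "Ꜥnḫ", "jb", "ḥtp"]
let sortedWords = unsortedWords.sorted(by: precedes)
print(sortedWords) // ["jb", "Ꜥnḫ", "nfr", "ḥtp"]
```

A shorter string sorts before any longer string that starts with it, which matches the usual dictionary convention.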
Rather than building your own “non-English” sorting, you might consider localized comparison. E.g.:
import Foundation
let strings = ["a", "á", "ä", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t"]
let result1 = strings.sorted()
print(result1) // ["a", "b", "c", "d", "e", "f", "r", "s", "t", "ß", "á", "ä", "é"]
let result2 = strings.sorted {
$0.localizedCaseInsensitiveCompare($1) == .orderedAscending
}
print(result2) // ["a", "á", "ä", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t"]
let locale = Locale(identifier: "sv")
let result3 = strings.sorted {
$0.compare($1, options: .caseInsensitive, locale: locale) == .orderedAscending
}
print(result3) // ["a", "á", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t", "ä"]
And a non-Latin example:
let strings = ["あ", "か", "さ", "た", "い", "き", "し", "ち", "う", "く", "す", "つ", "ア", "カ", "サ", "タ", "イ", "キ", "シ", "チ", "ウ", "ク", "ス", "ツ", "が", "ぎ"]
let result4 = strings.sorted {
$0.localizedCaseInsensitiveCompare($1) == .orderedAscending
}
print(result4) // ["あ", "ア", "い", "イ", "う", "ウ", "か", "カ", "が", "き", "キ", "ぎ", "く", "ク", "さ", "サ", "し", "シ", "す", "ス", "た", "タ", "ち", "チ", "つ", "ツ"]
I have a string let string = "!101eggs". Now, I want to have an array like this ["!", "101", "e", "g", "g", "s"]. How can I do this?
I presume the hard part for you is "Where's the number"? As long as this is just a simple sequence of digits, a regular expression makes it easy to find:
import Foundation
let string = "!101eggs"
let patt = "\\d+"
let reg = try! NSRegularExpression(pattern:patt)
let r = reg.rangeOfFirstMatch(in: string,
options: [],
range: NSMakeRange(0,string.utf16.count)) // {1,3}
So now you know that the number starts at position 1 and is 3 characters long. The rest is left as an exercise for the reader.
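For completeness, here is one way the whole split could look. This is a sketch that uses a plain loop over the characters rather than the regular expression above, and the function name splitKeepingNumbers is made up for illustration:

```swift
// Splits a string so that each run of ASCII digits stays together
// and every other character becomes its own element.
func splitKeepingNumbers(_ input: String) -> [String] {
    var parts: [String] = []
    var digits = ""
    for character in input {
        if character.isASCII && character.isNumber {
            digits.append(character)        // extend the current digit run
        } else {
            if !digits.isEmpty {            // close a finished digit run
                parts.append(digits)
                digits = ""
            }
            parts.append(String(character))
        }
    }
    if !digits.isEmpty {                    // trailing digit run, if any
        parts.append(digits)
    }
    return parts
}

print(splitKeepingNumbers("!101eggs")) // ["!", "101", "e", "g", "g", "s"]
```

Leading zeros are preserved because the digits are collected as text and never converted to Int.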
Sorry, this is rather long.
When the input is:
print("-1-2a000+4-1/000!00005gf101eg14g1s46nj3j4b1j5j23jj212j4b2j41234j01010101g0000z00005g0000".toArrayByNumber())
Result: ["-", "1", "-", "2", "a", "000", "+", "4", "-", "1", "/", "000", "!", "00005", "g", "f", "101", "e", "g", "14", "g", "1", "s", "46", "n", "j", "3", "j", "4", "b", "1", "j", "5", "j", "23", "j", "j", "212", "j", "4", "b", "2", "j", "41234", "j", "01010101", "g", "0000", "z", "00005", "g", "0000"]
extension Int {
func toZeroString() -> String {
return (0 ..< self).reduce("", { (result, zero) -> String in
return result + "0"
})
}
}
extension String {
func toArrayByNumber() -> [String] {
var array: [String] = []
var num = 0
var zeroCount = 0
var zeroEnd = false
for char in self {
if let number = Int("\(char)") {
if zeroEnd == false && number == 0 {
zeroCount += 1
} else {
num = num * 10 + number
zeroEnd = true
}
} else {
if num != 0 {
array.append(zeroCount.toZeroString() + ("\(num)"))
} else if zeroCount > 0 {
array.append(zeroCount.toZeroString())
}
array.append(String(char))
num = 0
zeroCount = 0
zeroEnd = false
}
}
if num != 0 {
array.append(zeroCount.toZeroString() + ("\(num)"))
} else if zeroCount > 0 {
array.append(zeroCount.toZeroString())
}
return array
}
}
I want to write a procedure that finds out how many words there are in a string, separated by a space, a comma, or some other character, and then adds up the total later.
I'm making an average calculator, so I want the total count of data and then add up all the words.
update: Xcode 10.2.x • Swift 5 or later
Using the Foundation method enumerateSubstrings(in: Range) and setting .byWords as options:
import Foundation
let sentence = "I want to an algorithm that could help find out how many words are there in a string separated by space or comma or some character. And then append each word separated by a character to an array which could be added up later I'm making an average calculator so I want the total count of data and then add up all the words. By words I mean the numbers separated by a character, preferably space Thanks in advance"
var words: [Substring] = []
sentence.enumerateSubstrings(in: sentence.startIndex..., options: .byWords) { _, range, _, _ in
words.append(sentence[range])
}
print(words) // ["I", "want", "to", "an", "algorithm", "that", "could", "help", "find", "out", "how", "many", "words", "are", "there", "in", "a", "string", "separated", "by", "space", "or", "comma", "or", "some", "character", "And", "then", "append", "each", "word", "separated", "by", "a", "character", "to", "an", "array", "which", "could", "be", "added", "up", "later", "I'm", "making", "an", "average", "calculator", "so", "I", "want", "the", "total", "count", "of", "data", "and", "then", "add", "up", "all", "the", "words", "By", "words", "I", "mean", "the", "numbers", "separated", "by", "a", "character", "preferably", "space", "Thanks", "in", "advance"]
print(words.count) // 79
Or using native Swift 5 new Character property isLetter and the split method:
let words = sentence.split { !$0.isLetter }
print(words) // ["I", "want", "to", "an", "algorithm", "that", "could", "help", "find", "out", "how", "many", "words", "are", "there", "in", "a", "string", "separated", "by", "space", "or", "comma", "or", "some", "character", "And", "then", "append", "each", "word", "separated", "by", "a", "character", "to", "an", "array", "which", "could", "be", "added", "up", "later", "I", "m", "making", "an", "average", "calculator", "so", "I", "want", "the", "total", "count", "of", "data", "and", "then", "add", "up", "all", "the", "words", "By", "words", "I", "mean", "the", "numbers", "separated", "by", "a", "character", "preferably", "space", "Thanks", "in", "advance"]
print(words.count) // 80
Extending StringProtocol to support Substrings as well:
extension StringProtocol {
var words: [SubSequence] {
return split { !$0.isLetter }
}
var byWords: [SubSequence] {
var byWords: [SubSequence] = []
enumerateSubstrings(in: startIndex..., options: .byWords) { _, range, _, _ in
byWords.append(self[range])
}
return byWords
}
}
sentence.words // ["I", "want", "to", "an", "algorithm", "that", "could", "help", "find", "out", "how", "many", "words", "are", "there", "in", "a", "string", "separated", "by", "space", "or", "comma", "or", "some", "character", "And", "then", "append", "each", "word", "separated", "by", "a", "character", "to", "an", "array", "which", "could", "be", "added", "up", "later", "I", "m", "making", "an", "average", "calculator", "so", "I", "want", "the", "total", "count", "of", "data", "and", "then", "add", "up", "all", "the", "words", "By", "words", "I", "mean", "the", "numbers", "separated", "by", "a", "character", "preferably", "space", "Thanks", "in", "advance"]
let sentences = "Let there be light!"
let separatedCount = sentences.split(whereSeparator: { ",.! ".contains($0) }).count
print(separatedCount) // prints out 4 (if you just want the array, you can omit ".count")
Use this code if you have a specific set of punctuation characters to split on, and if you prefer to stick to the Swift standard library :).
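Since the question mentions an average calculator, the same split can feed the count, the total, and the average. A rough sketch, assuming the "words" are numbers separated by spaces or commas; the input string and the names data, readings, sum and mean are invented for illustration:

```swift
// Hypothetical input: numbers separated by spaces and/or commas.
let data = "12, 7.5 3,4"

// Split on the chosen separators, then keep only tokens that parse as numbers.
let readings = data
    .split(whereSeparator: { ", ".contains($0) })
    .compactMap { Double($0) }

let sum = readings.reduce(0, +)
// Guard against dividing by zero when no number was found.
let mean = readings.isEmpty ? 0 : sum / Double(readings.count)

print(readings.count, sum, mean) // 4 26.5 6.625
```

The period is left out of the separator set here because it would break decimal numbers such as 7.5 apart.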
You may want to try components(separatedBy:) (spelled componentsSeparatedByCharactersInSet in Swift 2):
import Foundation
let s = "Let there be light"
let c = CharacterSet(charactersIn: " ,.")
let a = s.components(separatedBy: c).filter { !$0.isEmpty }
// a = ["Let", "there", "be", "light"]
You can use regular expression and extension to simplify your code like this:
import Foundation
extension String {
var wordCount: Int {
let regex = try? NSRegularExpression(pattern: "\\w+")
return regex?.numberOfMatches(in: self, range: NSRange(location: 0, length: self.utf16.count)) ?? 0
}
}
let text = "I live in iran and i love Here"
print(text.wordCount) // 8
If you are targeting recent operating systems (such as iOS 13), there is no need to reinvent the wheel by counting words yourself. You can use a powerful API dedicated to this purpose. It can split text into words for many languages you may not even know, classify parts of speech, show lemmas, detect scripts, and more.
Check this in playground.
import NaturalLanguage
let taggerLexical = NLTagger(tagSchemes: [.lexicalClass, .lemma])
let txt = "I'm an architector 👨🏻💼 by 90%. My family 👨👩👧👦 and I live in 🏴."
taggerLexical.string = txt
let lexicalTags = NSCountedSet()
taggerLexical.enumerateTags(in: txt.startIndex..<txt.endIndex, unit: .word, scheme: .lexicalClass, options: [.omitPunctuation, .omitWhitespace]) { tag, tokenRange in
if let tag = tag {
lexicalTags.add(tag)
let lemma = taggerLexical.tag(at: tokenRange.lowerBound, unit: .word, scheme: .lemma).0?.rawValue ?? ""
let word = String(txt[tokenRange])
print("\(word): \(tag.rawValue)\(word == lemma ? "" : " | Lemma: \(lemma) " )")
}
return true
}
let sortedLexicalTagCount = lexicalTags.allObjects.map({ (($0 as! NLTag), lexicalTags.count(for: $0))}).sorted(by: {$0.1 > $1.1})
print("Total word count: \(sortedLexicalTagCount.map({ $0.1}).reduce(0, +)) \nTotal word count without grapheme clusters: \(sortedLexicalTagCount.compactMap({ $0.0 == NLTag.otherWord ? nil : $0.1 }).reduce(0, +)) \nDetails: \(sortedLexicalTagCount.map {($0.0.rawValue, $0.1)})")
// Output:
I: Pronoun
'm: Verb | Lemma: be
an: Determiner
architector: Adjective | Lemma:
👨🏻💼: OtherWord | Lemma:
by: Preposition
90: Number | Lemma:
My: Determiner | Lemma: I
family: Noun
👨👩👧👦: OtherWord | Lemma:
and: Conjunction
I: Pronoun
live: Verb
in: Preposition
🏴: OtherWord | Lemma:
Total word count: 15
Total word count without grapheme clusters: 12
Details: [("OtherWord", 3), ("Pronoun", 2), ("Determiner", 2), ("Verb", 2), ("Preposition", 2), ("Number", 1), ("Noun", 1), ("Conjunction", 1), ("Adjective", 1)]
On older Apple operating systems, the earlier linguisticTags API is an option.
import Foundation
let linguisticTags = txt.linguisticTags(in: txt.startIndex..., scheme: NSLinguisticTagScheme.tokenType.rawValue)
print("Total word count: \(linguisticTags.filter({ [NSLinguisticTag.word.rawValue, NSLinguisticTag.other.rawValue].contains($0) }).count)\nTotal word count without grapheme clusters: \(linguisticTags.filter({ [NSLinguisticTag.word.rawValue].contains($0) }).count)")
// Output:
Total word count: 15
Total word count without grapheme clusters: 12
Another option is NSRegularExpression. It knows how to match word boundaries (\\b), word characters (\\w), and non-word characters (\\W).
Using .numberOfMatches(in:range:) looks more efficient, since it returns only the number of matches rather than the matches themselves. However, this approach has issues with strings that contain emoji.
extension String {
private var regexMatchWords: NSRegularExpression? { try? NSRegularExpression(pattern: "\\w+") }
var aproxWordCount: Int {
guard let regex = regexMatchWords else { return 0 }
return regex.numberOfMatches(in: self, range: NSRange(self.startIndex..., in: self))
}
var wordCount: Int {
guard let regex = regexMatchWords else { return 0 }
return regex.matches(in: self, range: NSRange(self.startIndex..., in: self)).reduce(0) { (r, match) in
r + (Range(match.range, in: self) == nil ? 0 : 1)
}
}
var words: [String] {
var w = [String]()
guard let regex = regexMatchWords else { return [] }
regex.enumerateMatches(in: self, range: NSRange(self.startIndex..., in: self)) { (match, _, _) in
guard let match = match else { return }
guard let range = Range(match.range, in: self) else { return }
w.append(String(self[range]))
}
return w
}
}
let text = "We're a family 👨👩👧👦 of 4. Next week we'll go to 🇬🇷."
print("Approximate word count: \(text.aproxWordCount)\nWord count: \(text.wordCount)\nWords:\(text.words)")
// Output:
Approximate word count: 15
Word count: 12
Words:["We", "re", "a", "family", "of", "4", "Next", "week", "we", "ll", "go", "to"]
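Part of the background to the emoji caveat is that NSRange and NSRegularExpression count UTF-16 code units, while Swift's String counts user-perceived characters. A small sketch of the difference; the sample string is made up, and the flag is written with scalar escapes so it cannot be mangled in transit:

```swift
import Foundation

// "go " followed by the two regional-indicator scalars of a flag emoji.
let sample = "go \u{1F1EC}\u{1F1F7}"

// Swift counts grapheme clusters: "g", "o", " ", and one flag.
print(sample.count)        // 4

// UTF-16 needs a surrogate pair for each of the two flag scalars.
print(sample.utf16.count)  // 7

// Building the NSRange from String indices keeps both views consistent.
let fullRange = NSRange(sample.startIndex..., in: sample)
print(fullRange.length)    // 7
```

This is why the code above converts with NSRange(_:in:) and Range(_:in:) instead of using character counts as lengths.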
This works for me,
let spaces = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)
let words = YourString.components(separatedBy: spaces)
if words.count > 8 { return 110 } else { return 90 }
You may try some of these options:
let name = "some name with, space # inbetween -- and more"
let wordsSeparatedBySpaces = name.components(separatedBy: .whitespacesAndNewlines) // CharacterSet
let wordsSeparatedByPunctuations = name.components(separatedBy: .punctuationCharacters) // CharacterSet
// Can also be separated by a string
let wordsSeparatedByHashChar = name.components(separatedBy: "#") // String protocol
let wordsSeparatedByComma = name.components(separatedBy: ",") // String protocol
let wordsSeparatedBySomeString = name.components(separatedBy: " -- ") // String protocol
let total = wordsSeparatedBySpaces.count + wordsSeparatedByPunctuations.count + wordsSeparatedByHashChar.count + wordsSeparatedByComma.count
print("Total number of separators = \(total)")
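One caveat with components(separatedBy:), which both of the last two answers rely on: adjacent separators produce empty strings, so the raw count can be higher than the number of words. A short sketch with a made-up input, showing the effect and one way to filter it out:

```swift
import Foundation

// Hypothetical input with three separators in a row: a comma and two spaces.
let phrase = "one,  two"

let separators = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)

// components(separatedBy:) keeps the empty strings between adjacent separators.
let raw = phrase.components(separatedBy: separators)
print(raw)            // ["one", "", "", "two"]

// Dropping the empty pieces leaves only the actual words.
let nonEmpty = raw.filter { !$0.isEmpty }
print(nonEmpty.count) // 2
```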
I've searched Google and Stack Overflow, but I just can't find an answer to my question.
I'm currently using Jaspersoft Studio 5.6.2 final and I'm trying to get a custom function to show up in my Expression Editor, but whatever I try, I cannot get the category and function to show.
I started by following the tutorial at https://community.jaspersoft.com/wiki/jaspersoft-studio-expression-editor-how-extend-it-and-contribute-your-own-functions-part-2-0. Generating all the necessary files with the Functions Library wizard (File > New > Other > Jaspersoft Studio > Functions Library) and creating the function itself was easy.
The tutorial is not very clear about how the *.properties files should be configured, so I've also looked at http://jasperreports.sourceforge.net/sample.reference/functions/index.html#functions. After adjusting the properties files, the custom function and category still do not show.
I've tried testing my code directly inside Jaspersoft Studio, exporting it to a jar file, creating the jar file in another Eclipse installation, and restarting Jaspersoft Studio. Nothing works.
Below are the contents of my files.
The category class
package net.sf.jasperreports.functions.custom;
import net.sf.jasperreports.functions.annotations.FunctionCategory;
// I've also tried @FunctionCategory("Able")
@FunctionCategory()
public final class Able {
}
The class containing the function
package net.sf.jasperreports.functions.custom;
import net.sf.jasperreports.functions.annotations.Function;
import net.sf.jasperreports.functions.annotations.FunctionCategories;
import net.sf.jasperreports.functions.annotations.FunctionParameter;
import net.sf.jasperreports.functions.annotations.FunctionParameters;
import net.sf.jasperreports.functions.standard.TextCategory;
@FunctionCategories({ Able.class, TextCategory.class })
public final class AbleFunctions
{
/**
* Returns a barcode the custom font IDAutomationSHI25M will understand based on the page number
* and whether or not the current pagenumber is the lastpage
* @param pagenumber
* @param lastpage
* @return
*/
@Function("CREATE_BARCODE")
@FunctionParameters({
@FunctionParameter("pagenumber"),
@FunctionParameter("lastpage")})
public static String CREATE_BARCODE(Integer pagenumber, boolean lastpage)
{
String[] barcodeArray = {"!","\"", "#", "$", "%", "&", "(", ")", "*", "+", ",", "-", ".","/",
"0", "1", "2", "3", "4", "5", "6", "7", "8","9", ":", ";", "<", "=", ">", "?", "@",
"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q",
"R", "S", "T", "U", "V", "W", "X", "Y", "Z", "[", "\\", "]", "^","_", "`","a", "b",
"c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s",
"t", "u", "v", "w", "x", "y", "z", "{", "|", "}", "~", "┼", "ã", "Ã", "╚", "╔", "╩"};
String onePage = barcodeArray[1] + barcodeArray[1];
String middlePagePrefix = barcodeArray[0];
String lastPagePrefix = barcodeArray[10];
// There are a couple of specific conditions that will generate a specific outcome.
// Checking these conditions first
if (pagenumber == 1 && lastpage) {
return onePage;
} else if (pagenumber > 1 && lastpage) {
return lastPagePrefix + barcodeArray[pagenumber];
} else {
return middlePagePrefix + barcodeArray[pagenumber];
}
}
}
jasperreports_messages.properties file
net.sf.jasperreports.functions.custom.Able.CREATE_BARCODE.description = Provide the current pagenumber and a boolean property telling if this is the lastpage and this method will return a string that can be turned in a barcode
net.sf.jasperreports.functions.custom.Able.CREATE_BARCODE.lastpage.description = A boolean value telling if the current page number belongs to the lastpage or not
net.sf.jasperreports.functions.custom.Able.CREATE_BARCODE.lastpage.name = lastpage
net.sf.jasperreports.functions.custom.Able.CREATE_BARCODE.name = CREATE_BARCODE
net.sf.jasperreports.functions.custom.Able.CREATE_BARCODE.pagenumber.description = The current page number
net.sf.jasperreports.functions.custom.Able.CREATE_BARCODE.pagenumber.name = pagenumber
net.sf.jasperreports.functions.custom.Able.description = Custom Able functions for Jasperreports
net.sf.jasperreports.functions.custom.Able.name = Able
jasperreports_extension.properties
net.sf.jasperreports.extension.registry.factory.functions=net.sf.jasperreports.functions.FunctionsRegistryFactory
net.sf.jasperreports.extension.functions.ablefunctions=eu.able.functions.AbleFunctions
As you can see in the AbleFunctions class I've also tried to add the function to the existing TextCategory class, but this also has no effect.
Does anybody have a clue what the problem could be? This has already taken me days without any success, so any help would be great!
This seems to be an old bug fixed with the latest version of the Community Edition: https://community.jaspersoft.com/questions/848371/custom-functions