String sort using CharacterSet order

I am trying to alphabetically sort an array of non-English strings which contain a number of special Unicode characters. I can create a CharacterSet sequence which contains the desired lexicographic sort order.
Is there an approach in Swift 5 for performing this kind of customized sort?
I believe I saw such a function some years back, but a pretty exhaustive search today failed to turn anything up.
Any pointers would be appreciated!

As a simple implementation of matt's cosorting comment:
// You have `t` twice in your string; I've removed the first one.
let alphabet = "ꜢjiyꜤwbpfmnRrlhḥḫẖzsšqkgtṯdḏ "
// Map characters to their location in the string as integers
let order = Dictionary(uniqueKeysWithValues: zip(alphabet, 0...))
// Make the alphabet backwards as a test string
let string = alphabet.reversed()
// This sorts unknown characters at the end. Or you could throw instead.
let sorted = string.sorted { order[$0] ?? .max < order[$1] ?? .max }
print(sorted)
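The snippet above sorts the characters of a single string. To sort an array of strings, as the question asks, one option is to map each string to an array of ranks and compare those arrays lexicographically. This is only a minimal sketch: the helper name `sortKey` and the sample words are assumptions made for illustration, and unknown characters are again placed last.

```swift
let customAlphabet = "ꜢjiyꜤwbpfmnRrlhḥḫẖzsšqkgtṯdḏ "
// Rank of each character in the custom alphabet.
let rank = Dictionary(uniqueKeysWithValues: zip(customAlphabet, 0...))

// Hypothetical helper: turns a string into its sequence of ranks.
func sortKey(_ s: String) -> [Int] {
    return s.map { rank[$0] ?? .max }
}

// Made-up sample words, used only to exercise the comparison.
let words = ["nfr", "Ꜣbd", "nb", "jb"]
let sortedWords = words.sorted {
    sortKey($0).lexicographicallyPrecedes(sortKey($1))
}
print(sortedWords)
```

For long arrays it may be worth computing each key once (for example by sorting pairs of key and string) rather than recomputing it inside the comparison closure.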

Rather than building your own “non-English” sorting, you might consider localized comparison. E.g.:
let strings = ["a", "á", "ä", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t"]
let result1 = strings.sorted()
print(result1) // ["a", "b", "c", "d", "e", "f", "r", "s", "t", "ß", "á", "ä", "é"]
let result2 = strings.sorted {
    $0.localizedCaseInsensitiveCompare($1) == .orderedAscending
}
print(result2) // ["a", "á", "ä", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t"]
let locale = Locale(identifier: "sv")
let result3 = strings.sorted {
    $0.compare($1, options: .caseInsensitive, locale: locale) == .orderedAscending
}
print(result3) // ["a", "á", "b", "c", "d", "e", "é", "f", "r", "s", "ß", "t", "ä"]
And a non-Latin example:
let strings = ["あ", "か", "さ", "た", "い", "き", "し", "ち", "う", "く", "す", "つ", "ア", "カ", "サ", "タ", "イ", "キ", "シ", "チ", "ウ", "ク", "ス", "ツ", "が", "ぎ"]
let result4 = strings.sorted {
    $0.localizedCaseInsensitiveCompare($1) == .orderedAscending
}
print(result4) // ["あ", "ア", "い", "イ", "う", "ウ", "か", "カ", "が", "き", "キ", "ぎ", "く", "ク", "さ", "サ", "し", "シ", "す", "ス", "た", "タ", "ち", "チ", "つ", "ツ"]

Related

An allowed character is being percent encoded

Documentation for the string method addingPercentEncoding(withAllowedCharacters:):
Returns a new string made from the receiver by replacing all characters not in the specified set with percent-encoded characters.
The predefined set CharacterSet.alphanumerics says:
Returns a character set containing the characters in Unicode General Categories L*, M*, and N*.
The L (Letter) category consists of 5 subcategories: Ll,Lm,Lt,Lu,Lo. So I assume L* means all of L's subcategories.
I'll choose to look at the Ll subcategory (https://www.compart.com/en/unicode/category/Ll#UNC_SCRIPTS), and pick the character "æ" (U+00E6).
I can then see that the alphanumerics character set indeed contains this character. But when I add percent encoding to a string containing this character, it gets percent encoded.
"\u{E6}" // "æ"
CharacterSet.alphanumerics.contains("\u{E6}") // true
"æ".addingPercentEncoding(withAllowedCharacters: .alphanumerics) // "%C3%A6" 🤨
// Let's try with "a"
"\u{61}" // "a"
CharacterSet.alphanumerics.contains("\u{61}") // true
"a".addingPercentEncoding(withAllowedCharacters: .alphanumerics) // "a"
Why does this happen? It's in the allowed character set that I passed in, so it shouldn't be replaced, right?
I feel like it has something to do with the fact that "a" (U+0061) is also 0x61 in UTF-8 but "æ" (U+00E6) is [0xC3, 0xA6]; not 0xE6. Or that it takes up more than 1 byte?
String(data: Data([0x61]), encoding: .utf8)! // "a"
String(data: Data([0xC3, 0xA6]), encoding: .utf8)! // "æ"
String(data: Data([0xE6]), encoding: .utf8)! // crashes 🌋
Update
Is it because the percent-encoding algorithm converts the string to Data and goes through it one byte at a time? It would look at 0xC3, which isn't an allowed character, so that gets percent encoded. Then it would look at 0xA6, which also isn't an allowed character, so that gets percent encoded too. So allowed characters technically have to be a single byte?

A truly allowed character has to be in the allowed character set and also be an ASCII character. Thanks @alobaili for pointing that out.
If you're curious, the predefined set CharacterSet.alphanumerics contains 129172 characters in total, but only 62 are truly allowed when this set is passed to a string's addingPercentEncoding(withAllowedCharacters:) method.
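To make the byte-level view concrete, here is a small sketch that percent-encodes every UTF-8 byte of "æ" by hand. It is not Foundation's actual implementation, only an illustration of why a non-ASCII character produces several escapes; the names `text`, `bytes` and `manual` are assumptions.

```swift
import Foundation

let text = "æ"
// The UTF-8 encoding of "æ" is two bytes: 0xC3 and 0xA6.
let bytes = Array(text.utf8)
// Escape each byte as "%" followed by two uppercase hex digits.
let manual = bytes.map { String(format: "%%%02X", $0) }.joined()
print(bytes.count)
print(manual)
```

The hand-built string has the same shape as the "%C3%A6" result shown in the question.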
A quick way to inspect all the truly allowed characters in a particular CharacterSet can be done like so:
func inspect(charSet: CharacterSet) {
    var characters: [String] = []
    for char: UInt8 in 0..<128 { // ASCII range
        let u = UnicodeScalar(char)
        if charSet.contains(u) {
            characters.append(String(u))
        }
    }
    print("Characters:", characters.count)
    print(characters)
}
inspect(charSet: .alphanumerics) // [a-z, A-Z, 0-9]
This is handy as you can't simply iterate through a CharacterSet. It can be useful to know what those allowed elements are. For example, the predefined CharacterSet.urlQueryAllowed only says:
Returns the character set for characters allowed in a query URL component.
We can know what those allowed characters are:
inspect(charSet: .urlQueryAllowed)
// Characters: 81
// ["!", "$", "&", "\'", "(", ")", "*", "+", ",", "-", ".", "/", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", ":", ";", "=", "?", "@", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "_", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "~"]
Just For Fun
There's another (long, sure-fire) way, which looks at all the characters in the set (not just the ASCII ones), and compares a string of the character itself, with the string after adding percent encoding with only that character in the allowed set. When those two are equal then you know it really is an allowed character. Code adapted from this helpful article.
func inspect(charSet: CharacterSet) {
    var characters: [String] = []
    var allowed: [String] = []
    var asciiCount = 0
    for plane: UInt8 in 0..<17 {
        if charSet.hasMember(inPlane: plane) {
            let planeStart = UInt32(plane) << 16
            let nextPlaneStart = (UInt32(plane) + 1) << 16
            for char: UTF32Char in planeStart..<nextPlaneStart {
                if let u = UnicodeScalar(char), charSet.contains(u) {
                    let s = String(u)
                    characters.append(s)
                    if s.addingPercentEncoding(withAllowedCharacters: CharacterSet([u])) == s {
                        allowed.append(s)
                    }
                    if u.isASCII {
                        asciiCount += 1
                    }
                }
            }
        }
    }
    print("Characters:", characters.count)
    print("Allowed:", allowed.count)
    print("ASCII:", asciiCount)
}
inspect(charSet: .alphanumerics)
// Characters: 129172
// Allowed: 62
// ASCII: 62

How to sort a multidimensional character array alphabetically Swift

I am trying to sort a [[Character]] array filled with random characters so that the whole thing is in alphabetical order. Example of the current output:
["H", "P", "C"]
["F", "K", "V"]
["J", "Y", "B"]
I need it to be like this
["A", "B", "C"]
["D", "E", "F"]
["G", "H", "I"]
Any Ideas?

Please check :
let input = [["H", "P", "C"], ["F", "K", "V"], ["J", "Y", "B"], ["A", "L"]]
let sortedArray = input.flatMap({ $0 }).sorted()
var finalArray: [[String]] = []
var subArray: [String] = []
for i in 0..<sortedArray.count {
    subArray.append(sortedArray[i])
    // Close the current row after 3 elements or at the end of the input.
    if subArray.count == 3 || i == sortedArray.count - 1 {
        finalArray.append(subArray)
        subArray = []
    }
}
print(finalArray)
// Output : [["A", "B", "C"], ["F", "H", "J"], ["K", "L", "P"], ["V", "Y"]]

func flattenArray(nestedArray: [[Character]]) -> [[Character]] {
    // Flatten all rows into one array and sort it.
    let myFlattendArray = nestedArray.flatMap { $0 }.sorted(by: { $0 < $1 })
    // Rebuild rows using the width of the first row
    // (assumes the input is rectangular, as in the question).
    let width = max(nestedArray.first?.count ?? 1, 1)
    var sortedArray = [[Character]]()
    var arrayForArray = [Character]()
    for char in myFlattendArray {
        arrayForArray.append(char)
        if arrayForArray.count == width {
            sortedArray.append(arrayForArray)
            arrayForArray.removeAll()
        }
    }
    // Keep any leftover characters as a final, shorter row.
    if !arrayForArray.isEmpty {
        sortedArray.append(arrayForArray)
    }
    return sortedArray
}

You can use flatMap to flatten your array, sort it, and then group the result using this extension from this answer as follows:
extension Array {
    // Splits the array into consecutive chunks of at most n elements.
    func group(of n: Int) -> [[Element]] {
        return stride(from: 0, to: count, by: n)
            .map { Array(self[$0..<Swift.min($0 + n, count)]) }
    }
}
let arr = [["H", "P", "C"],
           ["F", "K", "V"],
           ["J", "Y", "B"]]
let sorted = arr.flatMap { $0 }.sorted().group(of: 3)
sorted // [["B", "C", "F"], ["H", "J", "K"], ["P", "V", "Y"]]

Swift 3: split a string to array by number [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have a string let string = "!101eggs". Now, I want to have an array like this ["!", "101", "e", "g", "g", "s"]. How can I do this?

I presume the hard part for you is "Where's the number"? As long as this is just a simple sequence of digits, a regular expression makes it easy to find:
import Foundation // NSRegularExpression
let string = "!101eggs"
let patt = "\\d+"
let reg = try! NSRegularExpression(pattern: patt)
let r = reg.rangeOfFirstMatch(in: string,
                              options: [],
                              range: NSMakeRange(0, string.utf16.count)) // {1,3}
So now you know that the number starts at position 1 and is 3 characters long. The rest is left as an exercise for the reader.
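One possible way to finish that exercise, sketched here with the same NSRegularExpression approach: match either a run of digits or any single non-digit character, and collect every match in order. The function name `splitByNumber` and the pattern `\d+|\D` are assumptions of this sketch, not part of the original answer.

```swift
import Foundation

// Hypothetical helper: splits a string into digit runs and single non-digit characters.
func splitByNumber(_ string: String) -> [String] {
    // "\\d+" matches a whole number; "\\D" matches one non-digit character.
    let regex = try! NSRegularExpression(pattern: "\\d+|\\D")
    let whole = NSRange(location: 0, length: string.utf16.count)
    return regex.matches(in: string, options: [], range: whole).compactMap { match in
        // Convert the NSRange of the match back into a Swift string range.
        Range(match.range, in: string).map { String(string[$0]) }
    }
}

print(splitByNumber("!101eggs"))
```

Because each alternative consumes at least one character, the matches cover the whole input, so nothing is dropped between pieces.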

Sorry, it's a bit long.
When the input is
print("-1-2a000+4-1/000!00005gf101eg14g1s46nj3j4b1j5j23jj212j4b2j41234j01010101g0000z00005g0000".toArrayByNumber())
Result: ["-", "1", "-", "2", "a", "000", "+", "4", "-", "1", "/", "000", "!", "00005", "g", "f", "101", "e", "g", "14", "g", "1", "s", "46", "n", "j", "3", "j", "4", "b", "1", "j", "5", "j", "23", "j", "j", "212", "j", "4", "b", "2", "j", "41234", "j", "01010101", "g", "0000", "z", "00005", "g", "0000"]
extension Int {
    // Returns a string made of `self` zeros, e.g. 3 -> "000".
    func toZeroString() -> String {
        return String(repeating: "0", count: self)
    }
}
extension String {
    func toArrayByNumber() -> [String] {
        var array: [String] = []
        var num = 0          // value of the current digit run, ignoring leading zeros
        var zeroCount = 0    // number of leading zeros in the current digit run
        var zeroEnd = false  // true once a non-zero digit has been seen in the run
        for char in self {
            if let number = Int("\(char)") {
                if zeroEnd == false && number == 0 {
                    zeroCount += 1
                } else {
                    num = num * 10 + number
                    zeroEnd = true
                }
            } else {
                // A non-digit ends the current run: flush it, then add the character.
                if num != 0 {
                    array.append(zeroCount.toZeroString() + ("\(num)"))
                } else if zeroCount > 0 {
                    array.append(zeroCount.toZeroString())
                }
                array.append(String(char))
                num = 0
                zeroCount = 0
                zeroEnd = false
            }
        }
        // Flush a digit run that reaches the end of the string.
        if num != 0 {
            array.append(zeroCount.toZeroString() + ("\(num)"))
        } else if zeroCount > 0 {
            array.append(zeroCount.toZeroString())
        }
        return array
    }
}

How to check if string contains a certain character?

I've been trying to find this but I can't. I have the code:
func ayylmao(vorno: String){
if (vorno.(WHATEVER THE FUNCTION FOR FINDING A STRING GOES HERE)("a", "e", "i", "o", "u"))
{
print("Input Includes vowels")
}
}
but right at the if statement I can't find anything to check if the characters are in the string.

Like this:
let s = "hello"
let ok = s.contains { "aeiou".contains($0) } // true

I suggest two implementations:
func numberOfVowelsIn(_ string: String) -> Int {
    let vowels: [Character] = ["a", "e", "i", "o", "u", "y", "A", "E", "I", "O", "U", "Y"]
    return string.reduce(0, { $0 + (vowels.contains($1) ? 1 : 0) })
}
numberOfVowelsIn("hello my friend") //returns 5.
... and the second one with this code snippet hereafter to reach your goal:
let isVowel: (Character) -> Bool = { "aeiouyAEIOUY".contains($0) }
isVowel("B") //returns 'false'.
isVowel("a") //returns 'true'.
isVowel("U") //returns 'true'.

Need an algorithm to shuffle elements of 5 arrays, each with the same 5 elements, such that no two arrays have the same element at the same index

I've the following five arrays
var E1 = ["A", "B", "C", "D", "E"]
var E2 = ["A", "B", "C", "D", "E"]
var E3 = ["A", "B", "C", "D", "E"]
var E4 = ["A", "B", "C", "D", "E"]
var E5 = ["A", "B", "C", "D", "E"]
Each array has the same five elements, namely "A", "B", "C", "D" and "E". I want to write an algorithm that arranges the elements in all the arrays such that no two arrays have the same element (let's say "A") at the same index.
A sort of the sample output that will work for me will be like:
var E1 = ["A", "B", "C", "D", "E"]
var E2 = ["B", "C", "D", "E", "A"]
var E3 = ["C", "D", "E", "A", "B"]
var E4 = ["D", "E", "A", "B", "C"]
var E5 = ["E", "A", "B", "C", "D"]
I've tried to solve this but couldn't complete it. I've only written a shuffling function for the elements of two arrays (E1 and E2).
var E1 = ["A", "B", "C", "D", "E"]
var E2 = ["A", "B", "C", "D", "E"]
var E3 = ["A", "B", "C", "D", "E"]
var E4 = ["A", "B", "C", "D", "E"]
var E5 = ["A", "B", "C", "D", "E"]
func shuffledArrays(var array1: [String],var array2: [String]) {
if array1[0] == array2[0] || array1[1] == array2[1] || array1[2] == array2[2] || array1[3] == array2[3] || array1[4] == array2[4] {
shuffled1 = GKRandomSource.sharedRandom().arrayByShufflingObjectsInArray(array1)
shuffled2 = GKRandomSource.sharedRandom().arrayByShufflingObjectsInArray(array2)
var array3 = shuffled1 as! [String]
var array4 = shuffled2 as! [String]
} else {
var array3 = array1
var array4 = array2
}
array1 = array3
array2 = array4
}
// Now calling the function on arrays E1 and E2
shuffledArrays(E1, array2: E2)
print(E1)
print(E2)
With this code I'm getting the following error on Xcode Playground. While sometimes the error is removed and the output is correct at lines 102 and 103 but still I'm unable to extract that output out and save it permanently into E1 and E2 respectively. Please help me with the whole algorithm in arranging the five arrays' elements.
Thanks

Since you know arrays E1, ..., E5 to hold identical entries (in identical order), you needn't explicitly hold the arrays E2 through E5 (since you know these are value-equal to E1).
Hence, you could simply define a shift function and create E2 through E5 by repeated shifting of previous array.
import GameplayKit
// Rotate left by one: the first element moves to the end.
func shiftByOne(_ arr: [String]) -> [String] {
    var shiftedArr = arr
    shiftedArr.append(shiftedArr.removeFirst())
    return shiftedArr
}
var E1 = ["A", "B", "C", "D", "E"]
E1 = GKRandomSource.sharedRandom().arrayByShufflingObjects(in: E1) as! [String]
var E2 = shiftByOne(E1)
var E3 = shiftByOne(E2)
var E4 = shiftByOne(E3)
var E5 = shiftByOne(E4)
/** Result without initial shuffle:
E1 = ["A", "B", "C", "D", "E"]
E2 = ["B", "C", "D", "E", "A"]
E3 = ["C", "D", "E", "A", "B"]
E4 = ["D", "E", "A", "B", "C"]
E5 = ["E", "A", "B", "C", "D"] */
This method starts with E1 (possibly shuffling only this array), and guarantees that E2 through E5 are constructed to all differ, w.r.t. order, from each other.
As noted by R Menke below, if you shuffle array E1, the same behaviour will hold (however with shuffled initial array). Here I've used the same shuffler as in your example, but for a more Swifty approach, see e.g.:
How do I shuffle an array in Swift?

As far as I can see you can make it this way:
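1. Take a valid solution (just like the example you gave).
2. Shuffle the columns (this can only lead to valid solutions).
3. Shuffle the rows (this can only lead to valid solutions).
This way you can reach many valid solutions, although not every possible one: rearranging rows and columns of a single starting grid yields at most 5! × 5! = 14400 arrangements.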
For example, starting with a valid solution.
["A", "B", "C", "D", "E"]
["B", "C", "D", "E", "A"]
["C", "D", "E", "A", "B"]
["D", "E", "A", "B", "C"]
["E", "A", "B", "C", "D"]
Shuffle the rows.
["C", "D", "E", "A", "B"]
["B", "C", "D", "E", "A"]
["A", "B", "C", "D", "E"]
["D", "E", "A", "B", "C"]
["E", "A", "B", "C", "D"]
Then shuffle the columns.
["C", "E", "A", "D", "B"]
["B", "D", "E", "C", "A"]
["A", "C", "D", "B", "E"]
["D", "A", "B", "E", "C"]
["E", "B", "C", "A", "D"]
This guarantees no two arrays have the same element in the same index and they're randomized.
Here is an example in Ruby. The algorithm is applicable to any language.
# Initialize the first row of the matrix
matrix = []
matrix[0] = ('A'..'E').to_a
size = matrix[0].size
# Initialize the rotated array
for i in 1..size-1
  matrix[i] = matrix[i-1].rotate
end
puts "Original matrix"
for x in matrix
  puts x.inspect
end
# Shuffle the indexes of the rows and columns
rows = (0..size-1).to_a.shuffle
cols = (0..size-1).to_a.shuffle
# Shuffle the rows
for i in 0..size-1
  row1 = i
  row2 = rows[i]
  tmp = matrix[row1]
  matrix[row1] = matrix[row2]
  matrix[row2] = tmp
end
# Shuffle the columns.
for i in 0..size-1
  col1 = i
  col2 = cols[i]
  for j in 0..size-1
    tmp = matrix[j][col1]
    matrix[j][col1] = matrix[j][col2]
    matrix[j][col2] = tmp
  end
end
puts "Shuffled matrix"
for x in matrix
  puts x.inspect
end
@Schwern: Thanks for adding the example and the code.
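For readers working in Swift, the same idea can be sketched as follows: build the rotated grid, then pick a random order for the rows and for the columns. This is a rough sketch rather than a port of the Ruby code line by line; the names `shuffledLatinSquare` and `columnsAreDistinct` are made up for this example.

```swift
// Builds a grid where row r is `symbols` rotated left by r, then shuffles rows and columns.
func shuffledLatinSquare<T>(_ symbols: [T]) -> [[T]] {
    let n = symbols.count
    let base: [[T]] = (0..<n).map { r in
        (0..<n).map { c in symbols[(r + c) % n] }
    }
    // Random permutations of the row and column indexes.
    let rowOrder = Array(0..<n).shuffled()
    let colOrder = Array(0..<n).shuffled()
    return rowOrder.map { r in colOrder.map { c in base[r][c] } }
}

// Returns true when no column contains the same element twice.
func columnsAreDistinct<T: Hashable>(_ rows: [[T]]) -> Bool {
    guard let width = rows.first?.count else { return true }
    return (0..<width).allSatisfy { c in
        Set(rows.map { $0[c] }).count == rows.count
    }
}

let square = shuffledLatinSquare(["A", "B", "C", "D", "E"])
square.forEach { print($0) }
print(columnsAreDistinct(square))
```

Permuting whole rows or whole columns never puts two equal elements into the same column, which is why the check at the end should always succeed.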

I am answering from an iPhone, so I have no compiler to test some code. I can just give you a basic idea how to solve this.
Create a dictionary where the key is one element (like "A") and the value is an array of already added indices.
var dict = [String : [Int]]()
Then iterate through all 5 arrays and append each element. Before appending an element, check in the dictionary whether it has already been added somewhere else at the same index. If not, you can add it; if so, append a different element instead and record the added index in the dict.
I hope you get the idea.
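The bookkeeping part of this idea can be sketched as a checker that records, for every element, the indices at which it has already appeared. This is only an illustration of the dictionary lookup, not a complete solver; the function name `hasClash` is hypothetical, and a Set<Int> is used instead of [Int] purely for faster lookups.

```swift
// Returns true if some element sits at the same index in two different rows.
func hasClash(_ rows: [[String]]) -> Bool {
    // Maps each element to the set of indices where it has been seen so far.
    var usedIndices = [String: Set<Int>]()
    for row in rows {
        for (index, element) in row.enumerated() {
            // `inserted` is false when this index was already recorded for the element.
            if !usedIndices[element, default: []].insert(index).inserted {
                return true
            }
        }
    }
    return false
}

print(hasClash([["A", "B", "C"], ["B", "C", "A"]]))
print(hasClash([["A", "B", "C"], ["A", "C", "B"]]))
```

A full solver would consult the same dictionary before placing each element and pick another candidate whenever the index is already taken.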

First some shortcuts :
Construct a type that easily displays a nested array in rows / columns.
Have a way to detect duplicate elements in columns (same element with same index in different row)
This can be found here
I did not include it in the answer because it is not part of the algorithm.
Extend Array to implement a function only when the Array is multidimensional and its nested Element is Equatable.
extension Array where Element : _ArrayType, Element.Element : Equatable {
Self will be of type Array<Element> and will not recognise that it is cast-able to Array<Array<Element.Element>>. See the gist for the conversion.
func entropy() -> [[Element.Element]]? { // returns nil when it is impossible
Create a matrix from the array.
var matrix = self.matrix() // see gist for Matrix type
This nested function does the actual Array mutations. It swaps two elements in a row.
func swap(row r:Int,column c:Int) {
let nextColumn : Int = {
if (c + 1) < matrix[r].count {
return c + 1
} else {
return 0
}
}()
let element = matrix[r][c]
let neighbour = matrix[r][nextColumn]
matrix[r][c] = neighbour
matrix[r][nextColumn] = element
}
This is the looping logic:
As long as the columns of the matrix contain duplicates -> go over each column -> find a duplicate -> swap it with a neighbour.
It is not very efficient. But you can try different functions to mutate the rows and see what works best.
while matrix.columnsContainDuplicates() {
for c in 0..<matrix.columns.count {
let column = matrix.columns[c]
if let dupeIndex = column.indexOfDuplicate() {
swap(row: dupeIndex, column: c)
}
}
}
return matrix.rows
}
}
Complete code with an extra nested function to check whether it is solvable. There might be other reasons why it is not solvable.
extension Array where Element : _ArrayType, Element.Element : Equatable {
    func entropy() -> [[Element.Element]]? {
        var matrix = self.matrix() // this comes from an extension to Array: if nested, can be converted to a matrix
        // is there a possible configuration where no column has duplicates?
        func solvable() -> Bool {
            // this just checks if element x does not appear more than there are possible indexes.
            var uniqueElements : [Element.Element] = []
            var occurences : [Int] = []
            for row in matrix.rows {
                var rowCopy = row
                while let pop = rowCopy.popLast() {
                    if let index = uniqueElements.indexOf(pop) {
                        occurences[index] += 1
                        occurences[index]
                    } else {
                        uniqueElements.append(pop)
                        occurences.append(1)
                    }
                }
            }
            var highest = 0
            for times in occurences {
                if highest < times { highest = times }
            }
            if highest > matrix.columns.count {
                highest
                return false
            }
            return true
        }
        func swap(row r:Int,column c:Int) {
            let nextColumn : Int = {
                if (c + 1) < matrix[r].count {
                    return c + 1
                } else {
                    return 0
                }
            }()
            let element = matrix[r][c]
            let neighbour = matrix[r][nextColumn]
            matrix[r][c] = neighbour
            matrix[r][nextColumn] = element
        }
        guard solvable() else {
            return nil
        }
        while matrix.columnsContainDuplicates() {
            for c in 0..<matrix.columns.count {
                let column = matrix.columns[c]
                if let dupeIndex = column.indexOfDuplicate() {
                    swap(row: dupeIndex, column: c)
                }
            }
        }
        return matrix.rows
    }
}
Tests :
let test = [[4,3,2,1],[5,2,3,4],[1,6,3,4],[1,2,3,4]] // 8 loops
test.entropy() // [[3, 4, 1, 2], [4, 3, 2, 5], [6, 1, 4, 3], [1, 2, 3, 4]]
let test2 = [[1,2,3,4],[1,2,3,4],[1,2,3,4],[1,2,3,4]] // 8 loops
test2.entropy() // [[3, 4, 1, 2], [4, 3, 2, 1], [2, 1, 4, 3], [1, 2, 3, 4]]

I like Leo's answer. My answer is almost 'pure' Swift (import Foundation is there because of the arc4random_uniform function).
import Foundation // arc4random_uniform
let arr = Array(1...5)
var result: [[Int]] = []
repeat {
    // Produce a random ordering by sorting with a random comparison.
    let r = arr.sorted { (a, b) -> Bool in
        arc4random_uniform(2) == 0
    }
    // Only keep orderings that have not been generated before.
    if !result.contains(where: { (s) -> Bool in
        s == r
    }) {
        result.append(r)
    }
} while result.count < 6
print(result)
// [[2, 3, 1, 4, 5], [4, 2, 1, 3, 5], [2, 1, 3, 4, 5], [2, 3, 1, 5, 4], [2, 4, 5, 1, 3], [2, 1, 4, 3, 5]]
