Get Length of a substring in string before certain character Swift - swift

My main string looks like this: "90000+8000-1000*10". I want to find the length of each substring that contains a number and collect those lengths into an array, so that:
print(substringLength[0]) //Show 5
print(substringLength[1]) //Show 4
Could anyone help me with this? Thanks in advance!

⚠️ Be aware of using replacingOccurrences!
Although this method (mentioned by @Raja Kishan) may work in some cases, it is not forward compatible and will fail on any character it does not handle (such as other expression operators).
✅ Just write it as you say it:
let numbers = "90000+8000-1000*10".split { !$0.isWholeNumber && $0 != "." }
You have the numbers! Go ahead and count their lengths:
numbers[0].count // shows 5
numbers[1].count // shows 4
🎁 You can also have the operators like:
let operators = "90000+8000-1000*10".split { $0.isWholeNumber || $0 == "." }
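As a small sketch of how those pieces become the array of lengths the question asks for (the names `expression`, `lengths` and `decimalLengths` are only illustrative), the same closure can be combined with `map`. Because the closure treats "." as part of a number, a decimal value stays in one piece:

```swift
let expression = "90000+8000-1000*10"

// Split wherever the character is neither a whole-number digit nor a decimal point.
let numbers = expression.split { !$0.isWholeNumber && $0 != "." }

// Map every piece to its character count.
let lengths = numbers.map { $0.count }
print(lengths) // [5, 4, 4, 2]

// A decimal point is kept inside the number, so "12.5" counts as 4 characters.
let decimalLengths = "12.5+3".split { !$0.isWholeNumber && $0 != "." }.map { $0.count }
print(decimalLengths) // [4, 1]
```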

You can split when the character is not a number.
The maxSplits parameter is used for performance, so the function does not split parts of the input it doesn't need. The preconditions guard against bad input.
func substringLength(of input: String, at index: Int) -> Int {
precondition(index >= 0, "Index is negative")
let sections = input.split(maxSplits: index + 1, omittingEmptySubsequences: false) { char in
!char.isNumber
}
precondition(index < sections.count, "Out of range")
return sections[index].count
}
let str = "90000+8000-1000*10"
substringLength(of: str, at: 0) // 5
substringLength(of: str, at: 1) // 4
substringLength(of: str, at: 2) // 4
substringLength(of: str, at: 3) // 2
substringLength(of: str, at: 4) // Precondition failed: Out of range
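If trapping on bad input is not wanted, one possible variation is to return an optional instead. This is only a sketch; `optionalSubstringLength` is a hypothetical name, and it assumes the same splitting rule as the function above:

```swift
// Hypothetical non-trapping variant: returns nil instead of failing a precondition.
func optionalSubstringLength(of input: String, at index: Int) -> Int? {
    guard index >= 0 else { return nil }
    // Same split as above: stop early once the requested section is complete.
    let sections = input.split(maxSplits: index + 1, omittingEmptySubsequences: false) { char in
        !char.isNumber
    }
    guard index < sections.count else { return nil }
    return sections[index].count
}

let sample = "90000+8000-1000*10"
print(optionalSubstringLength(of: sample, at: 0) as Any) // Optional(5)
print(optionalSubstringLength(of: sample, at: 4) as Any) // nil
```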

If the set of signs (operators) is fixed, you can replace every sign with one common sign and then split the string on that sign.
Here is an example:
extension String {
func getSubStrings() -> [String] {
let commonSignStr = self.replacingOccurrences(of: "+", with: "-").replacingOccurrences(of: "*", with: "-")
return commonSignStr.components(separatedBy: "-")
}
}
let str = "90000+8000-1000*10"
str.getSubStrings().forEach({print($0.count)})
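A related sketch, assuming the operators are known up front: Foundation's `components(separatedBy:)` also accepts a `CharacterSet`, which avoids chaining one replacement per sign. The set "+-*/" is an assumption and would need extending for other operators:

```swift
import Foundation

// Assumed operator set; extend the string if the input can contain more signs.
let operatorSet = CharacterSet(charactersIn: "+-*/")

// Split on any character in the set in a single pass.
let parts = "90000+8000-1000*10/5".components(separatedBy: operatorSet)
let partLengths = parts.map { $0.count }
print(partLengths) // [5, 4, 4, 2, 1]
```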

I'd assume that the separators are not numbers, regardless of what they are.
let str = "90000+8000-1000*10"
let arr = str.split { !$0.isNumber }
let substringLength = arr.map { $0.count }
print(substringLength) // [5, 4, 4, 2]
print(substringLength[0]) //Show 5
print(substringLength[1]) //Show 4

Don't use the Character property isNumber. It also accepts fraction characters and many others that are not single digits 0...9.
Discussion
For example, the following characters all represent numbers:
“7” (U+0037 DIGIT SEVEN)
“⅚” (U+215A VULGAR FRACTION FIVE SIXTHS)
“㊈” (U+3288 CIRCLED IDEOGRAPH NINE)
“𝟠” (U+1D7E0 MATHEMATICAL DOUBLE-STRUCK DIGIT EIGHT)
“๒” (U+0E52 THAI DIGIT TWO)
let numbers = "90000+8000-1000*10".split { !("0"..."9" ~= $0) } // ["90000", "8000", "1000", "10"]
let numbers2 = "90000+8000-1000*10 ५ ๙ 万 ⅚ 𝟠 ๒ ".split { !("0"..."9" ~= $0) } // ["90000", "8000", "1000", "10"]
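To see the difference between the properties side by side, here is a minimal sketch. The helper `isASCIIDigit` is hypothetical and combines `isASCII` with `isWholeNumber`, which is a different technique from the range pattern above but targets the same goal, assuming only the ASCII digits should count:

```swift
extension Character {
    // Hypothetical helper: a single ASCII scalar that is also a whole number.
    var isASCIIDigit: Bool { isASCII && isWholeNumber }
}

let fraction: Character = "\u{215A}"   // VULGAR FRACTION FIVE SIXTHS
let thaiNine: Character = "\u{0E59}"   // THAI DIGIT NINE
let seven: Character = "7"

print(fraction.isNumber, fraction.isWholeNumber, fraction.isASCIIDigit) // true false false
print(thaiNine.isNumber, thaiNine.isWholeNumber, thaiNine.isASCIIDigit) // true true false
print(seven.isNumber, seven.isWholeNumber, seven.isASCIIDigit)          // true true true
```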

Related

Text Recognition - Matching Strings to Patterns

I am using the Apple example on text recognition/reading phone numbers. I would like to change it so that instead of recognizing phone numbers it recognizes two different patterns, CMW followed by numbers and letters or DWP followed by numbers and letters.
Here is the code I am using; I am unsure what to change:
import Foundation
extension Character {
// Given a list of allowed characters, try to convert self to those in list
// if not already in it. This handles some common misclassifications for
// characters that are visually similar and can only be correctly recognized
// with more context and/or domain knowledge. Some examples (should be read
// in Menlo or some other font that has different symbols for all characters):
// 1 and l are the same character in Times New Roman
// I and l are the same character in Helvetica
// 0 and O are extremely similar in many fonts
// oO, wW, cC, sS, pP and others only differ by size in many fonts
func getSimilarCharacterIfNotIn(allowedChars: String) -> Character {
let conversionTable = [
"s": "S",
"S": "5",
"5": "S",
"o": "O",
"Q": "O",
"O": "0",
"0": "O",
"l": "I",
"I": "1",
"1": "I",
"B": "8",
"8": "B"
]
// Allow a maximum of two substitutions to handle 's' -> 'S' -> '5'.
let maxSubstitutions = 2
var current = String(self)
var counter = 0
while !allowedChars.contains(current) && counter < maxSubstitutions {
if let altChar = conversionTable[current] {
current = altChar
counter += 1
} else {
// Doesn't match anything in our table. Give up.
break
}
}
return current.first!
}
}
extension String {
// Extracts the first US-style phone number found in the string, returning
// the range of the number and the number itself as a tuple.
// Returns nil if no number is found.
func extractPhoneNumber() -> (Range<String.Index>, String)? {
// Do a first pass to find any substring that could be a US phone
// number. This will match the following common patterns and more:
// xxx-xxx-xxxx
// xxx xxx xxxx
// (xxx) xxx-xxxx
// (xxx)xxx-xxxx
// xxx.xxx.xxxx
// xxx xxx-xxxx
// xxx/xxx.xxxx
// +1-xxx-xxx-xxxx
// Note that this doesn't only look for digits since some digits look
// very similar to letters. This is handled later.
let pattern = #"""
(?x) # Verbose regex, allows comments
(?:\+1-?)? # Potential international prefix, may have -
[(]? # Potential opening (
\b(\w{3}) # Capture xxx
[)]? # Potential closing )
[\ -./]? # Potential separator
(\w{3}) # Capture xxx
[\ -./]? # Potential separator
(\w{4})\b # Capture xxxx
"""#
guard let range = self.range(of: pattern, options: .regularExpression, range: nil, locale: nil) else {
// No phone number found.
return nil
}
// Potential number found. Strip out punctuation, whitespace and country
// prefix.
var phoneNumberDigits = ""
let substring = String(self[range])
let nsrange = NSRange(substring.startIndex..., in: substring)
do {
// Extract the characters from the substring.
let regex = try NSRegularExpression(pattern: pattern, options: [])
if let match = regex.firstMatch(in: substring, options: [], range: nsrange) {
for rangeInd in 1 ..< match.numberOfRanges {
let range = match.range(at: rangeInd)
let matchString = (substring as NSString).substring(with: range)
phoneNumberDigits += matchString as String
}
}
} catch {
print("Error \(error) when creating pattern")
}
// Must be exactly 10 digits.
guard phoneNumberDigits.count == 10 else {
return nil
}
// Substitute commonly misrecognized characters, for example: 'S' -> '5' or 'l' -> '1'
var result = ""
let allowedChars = "0123456789"
for var char in phoneNumberDigits {
char = char.getSimilarCharacterIfNotIn(allowedChars: allowedChars)
guard allowedChars.contains(char) else {
return nil
}
result.append(char)
}
return (range, result)
}
func extractSerialNumber() -> (Range<String.Index>, String)? {
// Do a first pass to find any substring that could be a US phone
// number. This will match the following common patterns and more:
// xxx-xxx-xxxx
// xxx xxx xxxx
// (xxx) xxx-xxxx
// (xxx)xxx-xxxx
// xxx.xxx.xxxx
// xxx xxx-xxxx
// xxx/xxx.xxxx
// +1-xxx-xxx-xxxx
// Note that this doesn't only look for digits since some digits look
// very similar to letters. This is handled later.
let pattern = #"""
(?x) # Verbose regex, allows comments
(?:\+1-?)? # Potential international prefix, may have -
[(]? # Potential opening (
\b(\w{3}) # Capture xxx
[)]? # Potential closing )
[\ -./]? # Potential separator
(\w{3}) # Capture xxx
[\ -./]? # Potential separator
(\w{4})\b # Capture xxxx
"""#
guard let range = self.range(of: pattern, options: .regularExpression, range: nil, locale: nil) else {
// No phone number found.
return nil
}
// Potential number found. Strip out punctuation, whitespace and country
// prefix.
var phoneNumberDigits = ""
let substring = String(self[range])
let nsrange = NSRange(substring.startIndex..., in: substring)
do {
// Extract the characters from the substring.
let regex = try NSRegularExpression(pattern: pattern, options: [])
if let match = regex.firstMatch(in: substring, options: [], range: nsrange) {
for rangeInd in 1 ..< match.numberOfRanges {
let range = match.range(at: rangeInd)
let matchString = (substring as NSString).substring(with: range)
phoneNumberDigits += matchString as String
}
}
} catch {
print("Error \(error) when creating pattern")
}
// Must be exactly 10 digits.
guard phoneNumberDigits.count == 10 else {
return nil
}
// Substitute commonly misrecognized characters, for example: 'S' -> '5' or 'l' -> '1'
var result = ""
let allowedChars = "0123456789"
for var char in phoneNumberDigits {
char = char.getSimilarCharacterIfNotIn(allowedChars: allowedChars)
guard allowedChars.contains(char) else {
return nil
}
result.append(char)
}
return (range, result)
}
}
class StringTracker {
var frameIndex: Int64 = 0
typealias StringObservation = (lastSeen: Int64, count: Int64)
// Dictionary of seen strings. Used to get stable recognition before
// displaying anything.
var seenStrings = [String: StringObservation]()
var bestCount = Int64(0)
var bestString = ""
func logFrame(strings: [String]) {
for string in strings {
if seenStrings[string] == nil {
seenStrings[string] = (lastSeen: Int64(0), count: Int64(-1))
}
seenStrings[string]?.lastSeen = frameIndex
seenStrings[string]?.count += 1
print("Seen \(string) \(seenStrings[string]?.count ?? 0) times")
}
var obsoleteStrings = [String]()
// Go through strings and prune any that have not been seen in while.
// Also find the (non-pruned) string with the greatest count.
for (string, obs) in seenStrings {
// Remove previously seen text after 30 frames (~1s).
if obs.lastSeen < frameIndex - 30 {
obsoleteStrings.append(string)
}
// Find the string with the greatest count.
let count = obs.count
if !obsoleteStrings.contains(string) && count > bestCount {
bestCount = Int64(count)
bestString = string
}
}
// Remove old strings.
for string in obsoleteStrings {
seenStrings.removeValue(forKey: string)
}
frameIndex += 1
}
func getStableString() -> String? {
// Require the recognizer to see the same string at least 10 times.
if bestCount >= 10 {
return bestString
} else {
return nil
}
}
func reset(string: String) {
seenStrings.removeValue(forKey: string)
bestCount = 0
bestString = ""
}
}
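One possible starting point for the CMW/DWP case, offered only as a sketch: replace the phone-number pattern with one that looks for either prefix followed by letters and digits. The helper name `extractPrefixedCode` and the assumption that a code is an unbroken run of ASCII letters and digits are mine, not part of Apple's sample, and the sketch does not apply the misrecognition table from the sample:

```swift
import Foundation

extension String {
    // Hypothetical helper, not part of Apple's sample: finds the first token that
    // starts with CMW or DWP and continues with one or more letters or digits.
    func extractPrefixedCode() -> (Range<String.Index>, String)? {
        let pattern = #"\b(?:CMW|DWP)[A-Za-z0-9]+\b"#
        guard let range = self.range(of: pattern, options: .regularExpression) else {
            // No matching code found.
            return nil
        }
        return (range, String(self[range]))
    }
}

if let match = "Meter DWP12AB9 installed".extractPrefixedCode() {
    print(match.1) // DWP12AB9
}
```

If the serial numbers have a fixed length, a quantifier such as `{n}` in place of `+` would tighten the match; that length is not stated in the question.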

Replace string subrange with character and maintain length in Swift

I would like to hide sensitive user data in strings by replacing a certain subrange with asterisks. For instance, replace all characters except the first and last three, turning
"sensitive info"
into
"sen********nfo".
I have tried this:
func hideInfo(sensitiveInfo: String) -> String {
let fIndex = sensitiveInfo.index(sensitiveInfo.endIndex, offsetBy: -4)
let sIndex = sensitiveInfo.index(sensitiveInfo.startIndex, offsetBy: 3)
var hiddenInfo = sensitiveInfo
let hiddenSubstring = String(repeating: "*", count: hiddenInfo.count - 6)
hiddenInfo.replaceSubrange(sIndex...fIndex, with: hiddenSubstring)
return hiddenInfo
}
and it works. But it seems overcomplicated. Is there a simpler and/or more elegant way of achieving this?
How about building the string from the first three characters (prefix(3)), the generated asterisk substring, and the last three characters (suffix(3))?
func hideInfo(sensitiveInfo: String) -> String {
let length = sensitiveInfo.count // count Characters, the same unit prefix and suffix use
if length <= 6 { return sensitiveInfo }
return String(sensitiveInfo.prefix(3) + String(repeating: "*", count: length - 6) + sensitiveInfo.suffix(3))
}
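A slightly more general sketch of the same idea, with the number of visible characters and the masking symbol as parameters. The name `mask` and its parameters are hypothetical; the length is measured in Characters so that the result keeps the original character count even for non-ASCII input:

```swift
// Hypothetical generalisation: `visible` is the number of characters kept at each end.
func mask(_ text: String, visible: Int = 3, with symbol: Character = "*") -> String {
    let length = text.count   // Character count, so a multi-byte character counts once
    guard visible >= 0, length > visible * 2 else { return text }
    let stars = String(repeating: symbol, count: length - visible * 2)
    return String(text.prefix(visible)) + stars + String(text.suffix(visible))
}

print(mask("sensitive info")) // sen********nfo
```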

How to separate characters in String by whitespace with multiple strides?

I have a working function that separates every n character with whitespace, which works fine.
Here is the code (Swift 5):
extension String {
/// Creates a new string, separating characters specified by stride length.
/// - Parameters:
/// - stride: Desired stride length.
/// - separator: Character to be placed in between separations
func separate(every stride: Int, with separator: Character) -> String {
return String(self.enumerated().map { $0 > 0 && $0 % stride == 0 ? [separator, $1] : [$1] }.joined())
}
}
This prints an example string of 1234123412341234 like this
1234 1234 1234 1234
Now, how can I separate the string 1234123412341234 with multiple strides, for example with whitespace inserted after the 4th character, then after the next 6, and then after the next 5, like this:
1234 123412 34123 4
Here's how I would do this:
// Prints sequences of bools using 1/0s for easy reading
func p<S: Sequence>(_ bools: S) where S.Element == Bool {
print(bools.map { $0 ? "1" : "0"}.joined())
}
// E.g. makeWindow(span: 3) returns 001
func makeWindow(span: Int) -> UnfoldSequence<Bool, Int> {
return sequence(state: span) { state in
state -= 1
switch state {
case -1: return nil
case 0: return true
case _: return false
}
}
}
// E.g. calculateSpacePositions(spans: [4, 6, 5]) returns 000100000100001
func calculateSpacePositions<S: Sequence>(spans: S)
-> LazySequence<FlattenSequence<LazyMapSequence<S, UnfoldSequence<Bool, Int>>>>
where S.Element == Int {
return spans.lazy.flatMap(makeWindow(span:))
}
extension String {
func insertingSpaces(at spans: [Int]) -> String {
let spacePositions = calculateSpacePositions(spans: spans + [Int.max])
// p(spacePositions.prefix(self.count))
let characters = zip(self, spacePositions)
.flatMap { character, shouldHaveSpace -> [Character] in
return shouldHaveSpace ? [character, "_"] : [character]
}
return String(characters)
}
}
let inputString = "1234123412341234"
let result = inputString.insertingSpaces(at: [4, 6, 5])
print(result)
The main idea is that I want to zip(self, spacePositions), so that I obtain a sequence of the characters of self, along with a boolean that tells me if I should append a space after the current character.
To calculate spacePositions, I first started by making a function that, given an Int span, returns span - 1 falses followed by a true. E.g. makeWindow(span: 3) returns a sequence that yields false, false, true.
From there, it's just a matter of making one of these windows per element of the input, and joining them all together using flatMap. I do this all lazily, so that we don't actually need to store all of these repeated booleans.
I hit one snag though. With the input [4, 6, 5], the output used to be 4 characters, space, 6 characters, space, 5 characters, end. The rest of the string was lost, because zip yields a sequence whose length equals that of the shorter of its two inputs.
To remedy this, I append Int.max to the spans input. That way, the space positions are 000100000100001, now followed by a practically endless run of falses.
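To make the window shape concrete, here is a standalone sketch that repeats makeWindow(span:) as described and prints what it yields. The eager `Array` conversion and the names `window` and `marks` are only for inspection and are not part of the lazy pipeline:

```swift
// Same shape as makeWindow(span:) above: span - 1 falses, then a single true.
func makeWindow(span: Int) -> UnfoldSequence<Bool, Int> {
    return sequence(state: span) { state in
        state -= 1
        switch state {
        case -1: return nil
        case 0: return true
        default: return false
        }
    }
}

let window = Array(makeWindow(span: 3))
print(window) // [false, false, true]

// Joining the windows for [4, 6, 5] marks the characters that get a separator after them.
let marks = [4, 6, 5].flatMap { Array(makeWindow(span: $0)) }
print(marks.map { $0 ? "1" : "0" }.joined()) // 000100000100001
```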
func separate(text: String,every stride: [Int], with separator: Character)->String {
var separatorLastPosition = 0 // This is the last separator position in text
var myText = text
if text.count < stride.reduce(0,+){
return text // If the text is too short to place every separator, return it unmodified (an error could be returned instead)
}else{
for (index, item) in stride.enumerated(){
myText.insert(separator, at:myText.index(myText.startIndex, offsetBy: index == 0 ? item : separatorLastPosition+item))
separatorLastPosition += item+1
}
return myText
}
}
print(separate(text: "12345678901234567890", every: [2,4,5,2], with: " "))
//Result -- 12 3456 78901 23 4567890
func separateCharcters(numbers: String, every: inout [Int], character: Character) ->String{
var counter = 0
var numbersWithSpaces = ""
for (_, number) in numbers.enumerated(){
numbersWithSpaces.append(number)
if !every.isEmpty{
counter += 1
if counter == every.first!{
numbersWithSpaces.append(character)
every.removeFirst()
counter = 0
}
}
}
return numbersWithSpaces
}
Test Case
var numberArray = [4, 6, 5]
separateCharcters(numbers: "1234123412341234", every: &numberArray, character: " ")
Return Result = "1234 123412 34123 4"
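The function above consumes its `every` array through `inout`, so the caller's array is emptied as a side effect. A possible non-mutating sketch of the same counting approach follows; the name `separated` and its parameter labels are hypothetical:

```swift
// Hypothetical non-mutating variant: `strides` is passed by value and walked by index.
func separated(_ text: String, strides: [Int], with separator: Character) -> String {
    var result = ""
    var strideIndex = 0   // which stride is currently being filled
    var counter = 0       // characters emitted since the last separator
    for character in text {
        result.append(character)
        guard strideIndex < strides.count else { continue }
        counter += 1
        if counter == strides[strideIndex] {
            result.append(separator)
            strideIndex += 1
            counter = 0
        }
    }
    return result
}

print(separated("1234123412341234", strides: [4, 6, 5], with: " ")) // 1234 123412 34123 4
```

Like the original, this appends a trailing separator when the text ends exactly on a stride boundary.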

Remove leading characters and zero in swift

I am trying to remove some characters from a string in Swift 4. For example, given:
QC00012345, return 12345
QC00009876, return 9876
QC12345678, return 12345678
That is, remove the first two characters and any leading zeros.
I looked around here, and people just use dropFirst, convert the
String into an Int, then convert it back to a String.
Is there any better way?
You could use a regular expression and dropFirst:
let str = "QC00009876"
let clean = str.dropFirst(2).replacingOccurrences(of: "^0*", with: "", options: .regularExpression)
The expression ^0* means "zero or more 0 characters at the start of the string".
Or with just a regular expression:
let str = "QC00009876"
let clean = str.replacingOccurrences(of: "^..0*", with: "", options: .regularExpression)
The expression ^..0* means "any two characters followed by zero or more 0 characters at the start of the string".
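As a quick check against the three inputs from the question, the single-expression version can be wrapped in a small function. This is a sketch, and `stripPrefixAndZeros` is a hypothetical name:

```swift
import Foundation

// Strips any two leading characters plus the zeros that follow them.
func stripPrefixAndZeros(_ code: String) -> String {
    return code.replacingOccurrences(of: "^..0*", with: "", options: .regularExpression)
}

for code in ["QC00012345", "QC00009876", "QC12345678"] {
    print(stripPrefixAndZeros(code))
}
// 12345
// 9876
// 12345678
```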
You can simply use drop(while:):
let test = "QC00012345"
let num = test.dropFirst(2).drop { $0 == "0"}
You can also create a ClosedRange<Character> and use it as a predicate if you don't want to drop the first two characters manually:
let num = test.drop { !("1"..."9" ~= $0) }
print(num) // 12345
or check whether the string "123456789" does not contain the character:
let num = test.drop { !"123456789".contains($0) }
edit/update:
Swift 5 or later
Using the Character property isWholeNumber
let num = test.drop { !$0.isWholeNumber || $0 == "0" }
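One edge case worth noting: when the input holds nothing but the prefix and zeros, drop(while:) leaves an empty result. A small sketch of a wrapper that falls back to "0" in that case follows; the name `significantDigits` and the choice of "0" as the fallback are assumptions:

```swift
// Hypothetical wrapper: drops everything up to the first non-zero digit,
// and returns "0" when nothing is left.
func significantDigits(of code: String) -> String {
    let trimmed = code.drop { !$0.isWholeNumber || $0 == "0" }
    return trimmed.isEmpty ? "0" : String(trimmed)
}

print(significantDigits(of: "QC00012345")) // 12345
print(significantDigits(of: "QC00000"))    // 0
```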

Iterating based on a variable number of inner loops

In the code below I am trying to go through all possible combinations of letters for a number of characters that is only known at runtime.
The purpose of this code is to build a kind of password cracker, which basically brute-force guesses the string. I want to use a loop because I can break out of it as soon as the correct combination is hit, saving the time and resources that would otherwise be needed to build an array of all possible combinations first.
I have static code that works for a string 5 characters long, but in reality my string could be any length. How can I make my code work with any length of string?
let len = textField.text?.count // Length of string
let charRange = "abcdefghijklmnopqrstuvwxyz" // Allowed character set
for char1 in charRange {
for char2 in charRange {
for char3 in charRange {
for char4 in charRange {
for char5 in charRange {
// Do whatever with all possible combinations
}
}
}
}
}
I think I have to use something like for totalChars in 1...len {, but I can't figure out how to create the for loops dynamically.
Idea: form the string using an array of indices into your alphabet; each time increment the indices.
[0, 0, 0] -> [1, 0, 0] -> [2, 0, 0] ->
[0, 1, 0] -> [1, 1, 0] -> [2, 1, 0] ->
[0, 2, 0] -> [1, 2, 0] -> [2, 2, 0] ->
[0, 0, 1] ... [2, 2, 2]
Here's an example using a length of 3 and an alphabet of abcd
let len = 3
let alphabet = "abcd".map { String($0) }
var allStrings = [String]()
let maxIndex = alphabet.endIndex
var indicies = Array(repeating: 0, count: len)
outerLoop: while true {
// Generate string from indicies
var string = ""
for i in indicies {
let letter = alphabet[i]
string += letter
}
allStrings.append(string)
print("Adding \(string)")
// Increment the index
indicies[0] += 1
var idx = 0
// If idx overflows then (idx) = 0 and (idx + 1) += 1 and try next
while indicies[idx] == maxIndex {
// Reset current
indicies[idx] = 0
// Increment next (as long as we haven't run past the last position)
idx += 1
// Compare against the number of positions, not the alphabet size
if idx >= len {
print("Breaking outer loop")
break outerLoop
}
indicies[idx] += 1
}
}
print("All Strings: \(allStrings)")
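Since the question asks to stop as soon as the right combination is found, the same index-incrementing idea can be sketched with an early exit instead of collecting every string. The function name `firstMatch` and its closure parameter are hypothetical:

```swift
// Hypothetical helper: walks the index "odometer" and stops at the first match.
func firstMatch(alphabet: [Character], length: Int, where isMatch: (String) -> Bool) -> String? {
    guard length > 0, !alphabet.isEmpty else { return nil }
    var indices = Array(repeating: 0, count: length)
    while true {
        let candidate = String(indices.map { alphabet[$0] })
        if isMatch(candidate) { return candidate }
        // Increment position 0 and carry into later positions on overflow.
        var position = 0
        indices[position] += 1
        while indices[position] == alphabet.count {
            indices[position] = 0
            position += 1
            if position == length { return nil }   // every combination was tried
            indices[position] += 1
        }
    }
}

var attempts = 0
let found = firstMatch(alphabet: Array("abcd"), length: 3) { candidate in
    attempts += 1
    return candidate == "cab"
}
print(found ?? "none", attempts) // cab 19
```

Only one candidate string is alive at a time, so memory use stays flat regardless of how many combinations exist.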
As suggested by Martin R, you can use recursion
This is the function
func visit(alphabet: [Character], combination: [Character], combinations: inout [String], length: Int) {
guard length > 0 else {
combinations.append(String(combination))
return
}
alphabet.forEach {
visit(alphabet: alphabet, combination: combination + [$0], combinations: &combinations, length: length - 1)
}
}
The helper function
func combinations(alphabet: String, length: Int) -> [String] {
var combinations = [String]()
visit(alphabet: [Character](alphabet), combination: [Character](), combinations: &combinations, length: length)
return combinations
}
Test
Now if you want every combination of 3 chars, and you want "ab" as the alphabet, then
combinations(alphabet: "ab", length: 3) // ["aaa", "aab", "aba", "abb", "baa", "bab", "bba", "bbb"]
Duplicates
Please note that if you put duplicate characters into your alphabet, you'll get duplicate elements in the result.
Time complexity
The visit function is invoked as many times as there are nodes in a perfect k-ary tree of height h, where:
k: the number of elements in the alphabet param
h: the length param
Such a tree has (k^(h+1) - 1) / (k - 1) nodes (for k > 1). And this is the exact number of times the function will be invoked.
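As a sanity check on that count, the following sketch compares a direct count of the recursive calls with the closed form for a perfect k-ary tree. Both function names are hypothetical, and the closed form assumes k > 1:

```swift
// Counts the calls a visit-style recursion makes for an alphabet of size k and a length h.
func countCalls(k: Int, h: Int) -> Int {
    guard h > 0 else { return 1 }                 // leaf: one call, no further recursion
    return 1 + k * countCalls(k: k, h: h - 1)     // this call plus k subtrees
}

// Closed form for the number of nodes in a perfect k-ary tree of height h (k > 1).
func nodeCount(k: Int, h: Int) -> Int {
    var power = 1
    for _ in 0...h { power *= k }                 // k^(h+1)
    return (power - 1) / (k - 1)
}

print(countCalls(k: 2, h: 3), nodeCount(k: 2, h: 3)) // 15 15
```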
Space complexity
In theory, the maximum number of stack frames allocated at the same time to execute visit is proportional to length.
Swift does not guarantee tail call optimization, and the recursive call here happens inside the forEach closure rather than in tail position, so the recursion depth should be assumed to grow with length.
Finally we must consider that combinations will be as big as the number of results: alphabet^length
So the space complexity is the sum of the recursion depth and the number of elements in the result.
And it is O(length + alphabet^length)
Update
It turns out you want a brute-force password breaker, so here is a version that stops early:
func find(alphabet: [Character], combination: [Character] = [Character](), length: Int, check: (String) -> Bool) -> String? {
guard length > 0 else {
let keyword = String(combination)
return check(keyword) ? keyword : nil
}
for char in alphabet {
if let keyword = find(alphabet: alphabet, combination: combination + [char], length: length - 1, check: check) {
return keyword
}
}
return nil
}
The last param check is a closure to verify if the current word is the correct password. You will put your logic here and the find will stop as soon as the password is found.
Example
find(alphabet: [Character]("tabcdefghil"), length: 3) { keyword -> Bool in
return keyword == "cat" // write your code to verify the password here
}
Alternative to recursion; loop radix representation of incremental (repeated) traversing of your alphabet
An alternative to recursion is to loop over a numeral representation of your alphabet, using the number of unique letters as the radix. A limitation of this method is that the String(_:radix:) initializer allows at most base-36 numbers (radix 36), i.e., you can perform your "password cracking" only with a set of at most 36 unique characters.
Help function
// helper used to pad the incremental alphabet cycling to e.g. "aa..."
let padToTemplate: (String, String) -> String = { str, template in
return str.count < template.count
? String(template.dropFirst(str.count)) + str
: str
}
Main radix brute-force password checking method
// attempt brute-force attempts to crack isCorrectPassword closure
// for a given alphabet, suspected word length and for a maximum number of
// attempts, optionally with a set starting point
func bruteForce(_ isCorrectPassword: (String) -> Bool, forAlphabet alphabet: [Character], forWordLength wordLength: Int, forNumberOfAttempts numAttempts: Int, startingFrom start: Int = 0) -> (Int, String?) {
// remove duplicate characters (but preserve order)
var exists: [Character: Bool] = [:]
let uniqueAlphabet = alphabet.filter { exists.updateValue(true, forKey: $0) == nil }
// limitation: String(_:radix:) accepts a radix from 2 through 36 only
let radix = uniqueAlphabet.count
guard radix > 1, radix < 37 else {
return (-1, nil)
}
// begin brute-force attempts
for i in start..<start+numAttempts {
let baseStr = String(i, radix: radix)
.compactMap { Int(String($0), radix: radix) }
.map { String(uniqueAlphabet[$0]) }
.joined()
// construct attempt of correct length
let attempt = padToTemplate(baseStr,
String(repeating: alphabet.first!, count: wordLength))
// log
//print(i, attempt)
// test attempt
if isCorrectPassword(attempt) { return (i, attempt) }
}
return (start+numAttempts, nil) // next to test
}
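To make the digit-to-letter mapping concrete, here is a small standalone sketch of the conversion step on its own. The function `candidate(for:alphabet:wordLength:)` is a hypothetical name, and it assumes a non-negative attempt number and an alphabet of 2 to 36 unique characters:

```swift
// Hypothetical helper: converts an attempt number into a candidate word.
// Assumes `alphabet` has between 2 and 36 unique characters and attempt >= 0.
func candidate(for attempt: Int, alphabet: [Character], wordLength: Int) -> String {
    let radix = alphabet.count
    // String(_:radix:) yields digits such as "ia"; each digit is parsed back
    // into an Int and used as an index into the alphabet.
    let letters = String(attempt, radix: radix).compactMap { digit -> Character? in
        guard let index = Int(String(digit), radix: radix) else { return nil }
        return alphabet[index]
    }
    // Left-pad with the first alphabet character, which plays the role of zero.
    let padding = String(repeating: alphabet[0], count: max(0, wordLength - letters.count))
    return padding + String(letters)
}

let lowercase = Array("abcdefghijklmnopqrstuvwxyz")
print(candidate(for: 478, alphabet: lowercase, wordLength: 3)) // ask
print(candidate(for: 0, alphabet: lowercase, wordLength: 3))   // aaa
```

Attempt 478 is 18 * 26 + 10 in base 26, which String(_:radix:) renders as "ia"; the digits 18 and 10 index the letters "s" and "k", and padding with "a" gives "ask".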
Example usage
Example usage #1
// unknown content closure
let someHashBashing: (String) -> Bool = {
return $0 == "ask"
}
// setup alphabet
let alphabet = [Character]("abcdefghijklmnopqrstuvwxyz")
// any success for 500 attempts?
if case (let i, .some(let password)) =
bruteForce(someHashBashing, forAlphabet: alphabet,
forWordLength: 3, forNumberOfAttempts: 500) {
print("Password cracked: \(password) (attempt \(i))")
} /* Password cracked: ask (attempt 478) */
Example usage #2 (picking up one failed "batch" with another)
// unknown content closure
let someHashBashing: (String) -> Bool = {
return $0 == "axk"
}
// setup alphabet
let alphabet = [Character]("abcdefghijklmnopqrstuvwxyz")
// any success for 500 attempts?
let firstAttempt = bruteForce(someHashBashing, forAlphabet: alphabet,
forWordLength: 3, forNumberOfAttempts: 500)
if let password = firstAttempt.1 {
print("Password cracked: \(password) (attempt \(firstAttempt.0))")
}
// if not, try another 500?
else {
if case (let i, .some(let password)) =
bruteForce(someHashBashing, forAlphabet: alphabet,
forWordLength: 3, forNumberOfAttempts: 500,
startingFrom: firstAttempt.0) {
print("Password cracked: \(password) (attempt \(i))")
} /* Password cracked: axk (attempt 608) */
}