In the code below I am trying to go through all possible combinations of letters for a number of characters that is only known at runtime.
The purpose of this code is to build a kind of password cracker, which basically guesses the string by brute force. I want to use a loop because I can break out of it as soon as the correct combination is hit, saving the time and resources that would be required if I first built an array of all possible combinations.
I have static code which works for a string 5 characters long, but in reality my string could be any length. How can I make my code work with any length of string?
let len = textField.text?.characters.count //Length of string
let charRange = "abcdefghijklmnopqrstuvwxyz" //Allowed characterset
for char1 in charRange.characters {
for char2 in charRange.characters {
for char3 in charRange.characters {
for char4 in charRange.characters {
for char5 in charRange.characters {
// Do whatever with all possible combinations
}
}
}
}
}
I think I have to use for totalChars in 1...len { somehow, but I can't figure out how to create the for loops dynamically.
Idea: form the string using an array of indices into your alphabet; each time increment the indices.
[0, 0, 0] -> [1, 0, 0] -> [2, 0, 0] ->
[0, 1, 0] -> [1, 1, 0] -> [2, 1, 0] ->
[0, 2, 0] -> [1, 2, 0] -> [2, 2, 0] ->
[0, 0, 1] ... [2, 2, 2]
Here's an example using a length of 3 and an alphabet of abcd
let len = 3
let alphabet = "abcd".characters.map({ String($0) })
var allStrings = [String]()
let maxIndex = alphabet.endIndex
var indicies = Array(count: len, repeatedValue: 0)
outerLoop: while (true) {
// Generate string from indicies
var string = ""
for i in indicies {
let letter = alphabet[i]
string += letter
}
allStrings.append(string)
print("Adding \(string)")
// Increment the index
indicies[0] += 1
var idx = 0
// If idx overflows then (idx) = 0 and (idx + 1) += 1 and try next
while (indicies[idx] == maxIndex) {
// Reset current
indicies[idx] = 0
// Increment next (as long as we haven't run past the last position)
idx += 1
if (idx >= indicies.count) { // every position has wrapped around, so all combinations were visited
print("Breaking outer loop")
break outerLoop
}
indicies[idx] += 1
}
}
print("All Strings: \(allStrings)")
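Since the goal in the question is to stop at the first hit rather than collect every string, the same index-array idea can be wrapped in a function that returns early. What follows is only a sketch in current Swift syntax; the name firstMatch and the matches closure are assumptions made for illustration and are not part of the code above.

```swift
// Hypothetical helper: walks the index array like an odometer and stops at the first match.
func firstMatch(alphabet: [Character], length: Int, matches: (String) -> Bool) -> String? {
    guard length > 0, !alphabet.isEmpty else { return nil }
    var indices = [Int](repeating: 0, count: length)
    while true {
        // Build the candidate from the current indices and test it right away
        let candidate = String(indices.map { alphabet[$0] })
        if matches(candidate) { return candidate }
        // Increment position 0 and carry into the next position on overflow
        var position = 0
        while position < length {
            indices[position] += 1
            if indices[position] < alphabet.count { break }
            indices[position] = 0
            position += 1
        }
        // The carry ran past the last position: every combination was tried
        if position == length { return nil }
    }
}

let hit = firstMatch(alphabet: Array("abcd"), length: 3) { $0 == "cab" }
print(hit ?? "not found") // prints "cab"
```

Nothing is stored besides the index array, so memory use stays proportional to the word length regardless of how many candidates are tried.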
As suggested by Martin R, you can use recursion
This is the function
func visit(alphabet:[Character], combination:[Character], inout combinations:[String], length: Int) {
guard length > 0 else {
combinations.append(String(combination))
return
}
alphabet.forEach {
visit(alphabet, combination: combination + [$0], combinations: &combinations, length: length - 1)
}
}
The helper function
func combinations(alphabet: String, length: Int) -> [String] {
var combinations = [String]()
visit([Character](alphabet.characters), combination: [Character](), combinations: &combinations, length: length)
return combinations
}
Test
Now if you want every combination of 3 chars, and you want "ab" as alphabet then
combinations("ab", length: 3) // ["aaa", "aab", "aba", "abb", "baa", "bab", "bba", "bbb"]
Duplicates
Please note that if you insert duplicates into your alphabet, you'll get duplicate elements into the result.
Time complexity
The visit function is invoked as many times as there are nodes in a perfect k-ary tree of height h, where:
k: the number of elements in the alphabet param
h: the length param
Such a tree has
$\frac{k^{h+1} - 1}{k - 1}$
nodes (for k > 1), and this is the exact number of times the function will be invoked.
Space complexity
The number of visit stack frames alive at the same time grows linearly with length: one frame per level of the recursion.
The recursive call is made from inside the forEach closure, so it is not in tail position, and Swift does not guarantee tail call optimization in any case; the stack depth therefore stays proportional to length.
Finally, we must consider that combinations will be as big as the number of results: alphabet^length
So the space complexity is the sum of the recursion depth and the number of elements in the result.
And it is O(length + alphabet^length)
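As a quick sanity check on the invocation count, the recursion can be mirrored by a function that only counts calls and compared against the closed form (k^(h+1) - 1) / (k - 1) for a perfect k-ary tree. This is a sketch; visitCount and perfectTreeNodes are names made up for this example.

```swift
// Mirrors the shape of the recursion: one call for this node, plus k subtrees of height h - 1.
func visitCount(k: Int, h: Int) -> Int {
    guard h > 0 else { return 1 } // a leaf call only appends a result
    return 1 + k * visitCount(k: k, h: h - 1)
}

// Closed form for the node count of a perfect k-ary tree of height h (requires k > 1).
func perfectTreeNodes(k: Int, h: Int) -> Int {
    precondition(k > 1 && h >= 0)
    var power = 1
    for _ in 0...h { power *= k } // k^(h+1)
    return (power - 1) / (k - 1)
}

print(visitCount(k: 2, h: 3), perfectTreeNodes(k: 2, h: 3)) // prints "15 15"
```

For the "ab" alphabet and length 3 used in the test above, that is 15 calls for 8 results.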
Update
It turns out you want a brute-force password breaker, so:
func find(alphabet:[Character], combination:[Character] = [Character](), length: Int, check: (keyword:String) -> Bool) -> String? {
guard length > 0 else {
let keyword = String(combination)
return check(keyword: keyword) ? keyword : nil
}
for char in alphabet {
if let keyword = find(alphabet, combination: combination + [char], length: length - 1, check: check) {
return keyword
}
}
return nil
}
The last param check is a closure to verify if the current word is the correct password. You will put your logic here and the find will stop as soon as the password is found.
Example
find([Character]("tabcdefghil".characters), length: 3) { (keyword) -> Bool in
return keyword == "cat" // write your code to verify the password here
}
Alternative to recursion; loop radix representation of incremental (repeated) traversing of your alphabet
An alternative to recursion is to loop over a numeral representation of your alphabet, using a radix equal to the number of distinct letters. A limitation of this method is that the String(_:radix:) initializer allows at most base 36 (radix 36), i.e., you can only perform your "password cracking" with a set of at most 36 unique characters.
Help function
// helper that left-pads the incrementally generated string to full length, e.g. with leading "a"s
let padToTemplate: (str: String, withTemplate: String) -> String = {
return $0.characters.count < $1.characters.count
? String($1.characters.suffixFrom($0.characters.endIndex)) + $0
: $0
}
Main radix brute-force password checking method
// attempt brute-force attempts to crack isCorrectPassword closure
// for a given alphabet, suspected word length and for a maximum number of
// attempts, optionally with a set starting point
func bruteForce(isCorrectPassword: (String) -> Bool, forAlphabet alphabet: [Character], forWordLength wordLength: Int, forNumberOfAttempts numAttempts: Int, startingFrom start: Int = 0) -> (Int, String?) {
// remove duplicate characters (but preserve order)
var exists: [Character:Bool] = [:]
let uniqueAlphabet = Array(alphabet.filter { return exists.updateValue(true, forKey: $0) == nil })
// limitation: allows at most base36 radix
guard case let radix = uniqueAlphabet.count
where radix < 37 else {
return (-1, nil)
}
// begin brute-force attempts
for i in start..<start+numAttempts {
let baseStr = String(i, radix: radix).characters
.flatMap { Int(String($0), radix: radix) }
.map { String(uniqueAlphabet[$0]) }
.joinWithSeparator("")
// construct attempt of correct length
let attempt = padToTemplate(str: baseStr,
withTemplate: String(count: wordLength, repeatedValue: alphabet.first!))
// log
//print(i, attempt)
// test attempt
if isCorrectPassword(attempt) { return (i, attempt) }
}
return (start+numAttempts, nil) // next to test
}
Example usage
Example usage #1
// unknown content closure
let someHashBashing : (String) -> Bool = {
return $0 == "ask"
}
// setup alphabet
let alphabet = [Character]("abcdefghijklmnopqrstuvwxyz".characters)
// any success for 500 attempts?
if case (let i, .Some(let password)) =
bruteForce(someHashBashing, forAlphabet: alphabet,
forWordLength: 3, forNumberOfAttempts: 500) {
print("Password cracked: \(password) (attempt \(i))")
} /* Password cracked: ask (attempt 478) */
Example usage #2 (picking up one failed "batch" with another)
// unknown content closure
let someHashBashing : (String) -> Bool = {
return $0 == "axk"
}
// setup alphabet
let alphabet = [Character]("abcdefghijklmnopqrstuvwxyz".characters)
// any success for 500 attempts?
let firstAttempt = bruteForce(someHashBashing, forAlphabet: alphabet,
forWordLength: 3, forNumberOfAttempts: 500)
if let password = firstAttempt.1 {
print("Password cracked: \(password) (attempt \(firstAttempt.0))")
}
// if not, try another 500?
else {
if case (let i, .Some(let password)) =
bruteForce(someHashBashing, forAlphabet: alphabet,
forWordLength: 3, forNumberOfAttempts: 500,
startingFrom: firstAttempt.0) {
print("Password cracked: \(password) (attempt \(i))")
} /* Password cracked: axk (attempt 608) */
}
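The base-36 ceiling comes from String(_:radix:), not from the radix idea itself. A possible workaround, sketched here in current Swift, is to do the digit extraction by hand with repeated division. This is a swapped-in technique rather than the method above, and word(forAttempt:alphabet:length:) is a hypothetical name.

```swift
// Hypothetical helper: converts an attempt number to a fixed-length word,
// treating the alphabet as the digits of a base-(alphabet.count) number.
func word(forAttempt n: Int, alphabet: [Character], length: Int) -> String? {
    precondition(!alphabet.isEmpty && n >= 0 && length >= 0)
    var remaining = n
    // Start fully padded with the "zero" digit, i.e. the first letter
    var letters = [Character](repeating: alphabet[0], count: length)
    var position = length - 1
    while remaining > 0 {
        // More digits than the word has positions: n is out of range
        guard position >= 0 else { return nil }
        letters[position] = alphabet[remaining % alphabet.count]
        remaining /= alphabet.count
        position -= 1
    }
    return String(letters)
}

let letters = Array("abcdefghijklmnopqrstuvwxyz")
print(word(forAttempt: 478, alphabet: letters, length: 3) ?? "out of range") // prints "ask"
print(word(forAttempt: 608, alphabet: letters, length: 3) ?? "out of range") // prints "axk"
```

The attempt numbers line up with the ones reported in the two examples above, and the alphabet can hold any number of distinct characters.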
The problem: return the number of zeros in an array that contains 0s and 1s; if there are three 0s in a row, count them as one. For example, [0, 1, 0, 0, 0, 1, 0, 1, 0] should return 4. But when I try to solve it like this:
func findZeros(_ c: [Int]) -> Int {
var zeros = 0
for var i in 0..<c.count {
switch c[i] {
case _ where c[i] == 0 && c[i+1] == 0 && c[i+2] == 0: // row 5
zeros += 1
i += 2
case _ where c[i] == 0:
zeros += 1
default:
break
}
}
return zeros
}
I always get an index out of range error in row 5, although when I hardcode c[1], c[2], c[3] == 0, it just evaluates to false and carries on... I've just started to learn Swift, so maybe this isn't optimal, but I can't even get this one working :/
You can group the consecutive elements and sum how many groups you have of those elements:
extension Collection where Element: Equatable {
var grouped: [[Element]] {
reduce(into: []) {
// check if the last element of the last collection is equal to the current element
$0.last?.last == $1 ?
// append the element to the last collection
$0[$0.index(before: $0.endIndex)].append($1) :
// otherwise add a new collection with the new element
$0.append([$1])
}
}
func repeatedOccurences(of element: Element) -> Int {
// if the collection first element is equal to the element add one otherwise return the current result
grouped.reduce(0) { $1.first == element ? $0 + 1 : $0 }
}
}
[0, 1, 0, 0, 0, 1, 0, 1, 0].repeatedOccurences(of: 0) // 4
[0, 1, 0, 0, 0, 1, 0, 1, 0].repeatedOccurences(of: 1) // 3
If you were only accessing c[i], then counting i up to c.count - 1 would be fine - but you're also trying to access c[i+1] and c[i+2], so you need to take that into account when setting the upper boundary of the range - and then verify that the array indeed has at least 3 elements:
if c.count < 3 {
return 0;
}
for var i in 0..<c.count-2 {
// now you can safely access c[i+2]
}
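Two further details matter for the original loop: assigning to i inside a for-in body does not change which index comes next, and stopping at c.count - 2 would leave the last two elements uncounted. One way around both, sketched below, is a while loop that checks the bounds before looking ahead. The name countZeros is made up for this example, and the left-to-right greedy grouping of triples is an assumption about what the question intends.

```swift
// Counts zeros, collapsing each group of three consecutive zeros into one (greedy, left to right).
func countZeros(_ c: [Int]) -> Int {
    var zeros = 0
    var i = 0
    while i < c.count {
        if c[i] == 0 {
            zeros += 1
            // Check the bounds first so c[i + 1] and c[i + 2] are never read past the end
            if i + 2 < c.count && c[i + 1] == 0 && c[i + 2] == 0 {
                i += 3 // skip the rest of the triple
                continue
            }
        }
        i += 1
    }
    return zeros
}

print(countZeros([0, 1, 0, 0, 0, 1, 0, 1, 0])) // prints "4"
```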
I'm going to show a very different solution approach. It's not one that would come naturally to a new beginner, so don't worry if it's foreign to you. It's something that you might come up with from having a bit more experience and being able to relate scattered concepts with each other.
The overall process is actually very straightforward, and is an almost exact codification of your English explanation of the solution:
1. Identify all the sub-sequences of repeating elements. These are called "runs", and there's a concept called "run-length encoding". It takes an input like ["A", "B", "B", "C", "D", "D", "D"], and turns it into a sequence like [("A", 1), ("B", 2), ("C", 1), ("D", 3)]
2. Identify the runs of 0s that repeat precisely 3 times, and treat them as runs of a single 0.
3. Count the number of 0s in the runs.
Here's what the code to do that would look like:
let input = [0, 1, 0, 0, 0, 1, 0, 1, 0]
let result = input.runLengthEncoded()
.lazy
.filter(keepOnlyRunsOfZeros)
.map(convertThreeCountRunsIntoOneCountRuns)
.map { $0.count }
.reduce(0, +)
print(result)
And here are the supporting functions that make it possible:
typealias Run = (element: Int, count: Int)
func keepOnlyRunsOfZeros(_ run: Run) -> Bool {
return run.element == 0
}
func convertThreeCountRunsIntoOneCountRuns(_ run: Run) -> Run {
if run.count == 3 {
return (element: run.element, count: 1)
}
else {
return run
}
}
Knowing that the process in #1 is a known algorithm called run-length encoding, I can find and reuse an existing implementation. Over time as a developer, you build up a collection of useful functions and techniques that you reuse. Often you also come across other people's libraries that prove useful, which you bookmark and reuse.
In this case, I have an implementation of run-length encoding that I've written and used in previous projects. It's quite long, but it's generalized and lazily evaluated (it doesn't need to build an array of all runs; it serves them one by one as you request them, which improves performance for huge inputs). That isn't strictly necessary in this case, but it's what I already have on hand.
There are alternative implementations that are eagerly evaluated and less generic (which should be fine for a simple problem like this); feel free to substitute one of those instead.
public extension Sequence where Self.Iterator.Element: Equatable {
func runLengthEncoded() -> LazySequenceRunLengthEncoder<Self> {
return LazySequenceRunLengthEncoder(encoding: self)
}
}
public struct LazySequenceRunLengthEncoder<WrappedSequence: Sequence>: Sequence
where WrappedSequence.Element: Equatable {
public let wrappedSequence: WrappedSequence
public init(encoding wrappedSequence: WrappedSequence) {
self.wrappedSequence = wrappedSequence
}
public func makeIterator() -> RunLengthEncodingIterator<WrappedSequence.Iterator> {
return RunLengthEncodingIterator(encoding: wrappedSequence.makeIterator())
}
}
public struct RunLengthEncodingIterator<WrappedIterator: IteratorProtocol>: IteratorProtocol
where WrappedIterator.Element: Equatable {
public private(set) var wrappedIterator: WrappedIterator
public private(set) var currentGrouping: (element: WrappedIterator.Element, count: Int)? = nil
public init(encoding wrappedIterator: WrappedIterator) {
self.wrappedIterator = wrappedIterator
}
public mutating func next() -> (element: WrappedIterator.Element, count: Int)? {
while let newElement = wrappedIterator.next() { // Take all elements of this run
if let currentGrouping = self.currentGrouping {
if newElement == currentGrouping.element { // increment the current run
let newCount = currentGrouping.count + 1
self.currentGrouping = (element: newElement, count: newCount)
} else { // Broke the streak
defer {
self.currentGrouping = (element: newElement, count: 1) // start a new group
}
return self.currentGrouping
}
} else { // There is no current group, this is the first element
self.currentGrouping = (element: newElement, count: 1)
}
}
// Reached end of the wrapped iterator
// 2. Only return the current grouping once, return the `nil` next time to end this iterator.
defer { self.currentGrouping = nil }
// 1. Return current grouping, if there is one
return self.currentGrouping
}
}
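For comparison, an eager and non-generic variant of the kind mentioned above could look roughly like this. It is a minimal sketch for [Int] input only; eagerRunLengthEncoded is a hypothetical name, and the whole list of runs is built up front instead of being produced on demand.

```swift
// Hypothetical eager helper: collects all runs of equal neighbours into an array.
func eagerRunLengthEncoded(_ input: [Int]) -> [(element: Int, count: Int)] {
    var runs: [(element: Int, count: Int)] = []
    for value in input {
        if let last = runs.last, last.element == value {
            runs[runs.count - 1].count += 1 // extend the current run
        } else {
            runs.append((element: value, count: 1)) // start a new run
        }
    }
    return runs
}

let runs = eagerRunLengthEncoded([0, 1, 0, 0, 0, 1, 0, 1, 0])
let zeroCount = runs
    .filter { $0.element == 0 }           // keep only runs of zeros
    .map { $0.count == 3 ? 1 : $0.count } // a run of exactly three counts once
    .reduce(0, +)
print(zeroCount) // prints "4"
```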
I have a working function that separates every n character with whitespace, which works fine.
Here is the code (Swift 5):
extension String {
/// Creates a new string, separating characters specified by stride length.
/// - Parameters:
/// - stride: Desired stride length.
/// - separator: Character to be placed in between separations
func separate(every stride: Int, with separator: Character) -> String {
return String(self.enumerated().map { $0 > 0 && $0 % stride == 0 ? [separator, $1] : [$1] }.joined())
}
}
This prints an example string of 1234123412341234 like this
1234 1234 1234 1234
Now, how can I separate this string 1234123412341234 with multiple strides, for example with whitespace inserted after the 4th character, then after the next 6, and then after the next 5, like this:
1234 123412 34123 4
Here's how I would do this:
// Prints sequences of bools using 1/0s for easy reading
func p<S: Sequence>(_ bools: S) where S.Element == Bool {
print(bools.map { $0 ? "1" : "0"}.joined())
}
// E.g. makeWindow(span: 3) returns 001
func makeWindow(span: Int) -> UnfoldSequence<Bool, Int> {
return sequence(state: span) { state in
state -= 1
switch state {
case -1: return nil
case 0: return true
case _: return false
}
}
}
// E.g. calculateSpacePositions(spans: [4, 6, 5]) returns 000100000100001
func calculateSpacePositions<S: Sequence>(spans: S)
-> LazySequence<FlattenSequence<LazyMapSequence<S, UnfoldSequence<Bool, Int>>>>
where S.Element == Int {
return spans.lazy.flatMap(makeWindow(span:))
}
extension String {
func insertingSpaces(at spans: [Int]) -> String {
let spacePositions = calculateSpacePositions(spans: spans + [Int.max])
// p(spacePositions.prefix(self.count))
let characters = zip(self, spacePositions)
.flatMap { character, shouldHaveSpace -> [Character] in
return shouldHaveSpace ? [character, "_"] : [character]
}
return String(characters)
}
}
let inputString = "1234123412341234"
let result = inputString.insertingSpaces(at: [4, 6, 5])
print(result)
The main idea is that I want to zip(self, spacePositions), so that I obtain a sequence of the characters of self, along with a boolean that tells me if I should append a space after the current character.
To calculate spacePositions, I first started by making a function that, when given an Int input span, returns span - 1 falses followed by a true. E.g. makeWindow(span: 3) returns a sequence that yields false, false, true.
From there, it's just a matter of making one of these windows per element of the input, and joining them all together using flatMap. I do this all lazily, so that we don't actually need to store all of these repeated booleans.
I hit one snag though. If you give the input [4, 6, 5], the output I would get used to be 4 characters, space, 6 characters, space, 5 characters, end. The rest of the string was lost, because zip yields a sequence whose length is equal to the length of the shorter of the two inputs.
To remedy this, I append Int.max to the spans input. That way, the space positions are 000100000100001, now followed by a practically endless run of falses.
func separate(text: String,every stride: [Int], with separator: Character)->String {
var separatorLastPosition = 0 // This is the last separator position in text
var myText = text
if text.count < stride.reduce(0,+){
return text //if your text length not enough for adding separator for all stride positions it will return the text without modifications.you can return error msg also
}else{
for (index, item) in stride.enumerated(){
myText.insert(separator, at:myText.index(myText.startIndex, offsetBy: index == 0 ? item : separatorLastPosition+item))
separatorLastPosition += item+1
}
return myText
}
}
print(separate(text: "12345678901234567890", every: [2,4,5,2], with: " "))
//Result -- 12 3456 78901 23 4567890
func separateCharcters(numbers: String, every: inout [Int], character: Character) ->String{
var counter = 0
var numbersWithSpaces = ""
for (_, number) in numbers.enumerated(){
numbersWithSpaces.append(number)
if !every.isEmpty{
counter += 1
if counter == every.first!{
numbersWithSpaces.append(character)
every.removeFirst()
counter = 0
}
}
}
return numbersWithSpaces
}
Test Case
var numberArray = [4, 6, 5]
separateCharcters(numbers: "1234123412341234", every: &numberArray, character: " ")
Return Result = "1234 123412 34123 4"
I am trying to solve code fights interview practice questions, but I am stuck on how to solve this particular problem in swift. My first thought was to use a dictionary with the counts of each character, but then I would have to iterate over the string again to compare, so that doesn't work per the restrictions. Any help would be good. Thank you. Here is the problem and requirements:
Note: Write a solution that only iterates over the string once and uses O(1) additional memory, since this is what you would be asked to do during a real interview.
Given a string s, find and return the first instance of a non-repeating character in it. If there is no such character, return '_'
Here is the code I started with (borrowed from another post)
func firstNotRepeatingCharacter(s: String) -> Character {
var countHash:[Character:Int] = [:]
for character in s {
countHash[character] = (countHash[character] ?? 0) + 1
}
let nonRepeatingCharacters = s.filter({countHash[$0] == 1})
let firstNonRepeatingCharacter = nonRepeatingCharacters.first!
return firstNonRepeatingCharacter
}
firstNotRepeatingCharacter(s:"abacabad")
You can create a dictionary to store the occurrences and use first(where:) method to return the first occurrence that happens only once:
Swift 4
func firstNotRepeatingCharacter(s: String) -> Character {
var occurrences: [Character: Int] = [:]
s.forEach{ occurrences[$0, default: 0] += 1 }
return s.first{ occurrences[$0] == 1 } ?? "_"
}
Swift 3
func firstNotRepeatingCharacter(s: String) -> Character {
var occurrences: [Character:Int] = [:]
s.characters.forEach{ occurrences[$0] = (occurrences[$0] ?? 0) + 1}
return s.characters.first{ occurrences[$0] == 1 } ?? "_"
}
Another option is to iterate over the string in reverse order, using an array of 26 elements to store the character occurrences:
func firstNotRepeatingCharacter(s: String) -> Character {
var chars = Array(repeating: 0, count: 26)
var characters: [Character] = []
var charIndex = 0
var strIndex = 0
s.characters.reversed().forEach {
let index = Int(String($0).unicodeScalars.first!.value) - 97
chars[index] += 1
if chars[index] == 1 && strIndex >= charIndex {
characters.append($0)
charIndex = strIndex
}
strIndex += 1
}
return characters.reversed().first { chars[Int(String($0).unicodeScalars.first!.value) - 97] == 1 } ?? "_"
}
Use a dictionary to store the character counts as well as where each character was first encountered. Then loop over the dictionary (which is bounded in size, since there are only so many distinct characters, and thus takes constant time to iterate) and find the earliest-occurring character with a count of 1.
func firstUniqueCharacter(in s: String) -> Character
{
var characters = [Character: (count: Int, firstIndex: Int)]()
for (i, c) in s.characters.enumerated()
{
if let t = characters[c]
{
characters[c] = (t.count + 1, t.firstIndex)
}
else
{
characters[c] = (1, i)
}
}
var firstUnique = (character: Character("_"), index: Int.max)
for (k, v) in characters
{
if v.count == 1 && v.firstIndex <= firstUnique.index
{
firstUnique = (k, v.firstIndex)
}
}
return firstUnique.character
}
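If the input is known to consist only of lowercase ASCII letters (an assumption that the 26-element array answer above also makes), the dictionary can be swapped for two fixed 26-slot tables, which keeps the extra memory constant. This is a sketch; firstUniqueLowercaseLetter is a hypothetical name, and characters outside a...z are simply skipped.

```swift
// Hypothetical helper: one pass over the string, then one pass over 26 fixed slots.
func firstUniqueLowercaseLetter(in s: String) -> Character {
    var counts = [Int](repeating: 0, count: 26)
    var firstIndex = [Int](repeating: Int.max, count: 26)
    let base = UInt8(ascii: "a")
    for (i, byte) in s.utf8.enumerated() {
        guard byte >= base, byte < base + 26 else { continue } // ignore anything outside a...z
        let slot = Int(byte - base)
        counts[slot] += 1
        if counts[slot] == 1 { firstIndex[slot] = i } // remember the first position only
    }
    // Among the letters seen exactly once, pick the one that appeared earliest
    var best: (slot: Int, index: Int)? = nil
    for slot in 0..<26 where counts[slot] == 1 {
        if best == nil || firstIndex[slot] < best!.index {
            best = (slot: slot, index: firstIndex[slot])
        }
    }
    guard let found = best else { return "_" }
    return Character(UnicodeScalar(base + UInt8(found.slot)))
}

print(firstUniqueLowercaseLetter(in: "abacabad")) // prints "c"
```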
Swift
Use a dictionary of character counts together with an array of candidate characters (in order of first appearance) and an optional uniqueChar that holds the current first non-repeated character. Whenever the current uniqueChar is seen again, drop it from the candidates and move on to the next one. After the whole string has been processed, uniqueChar holds the first non-repeated character, or nil if there is none. The following code shows the details:
func findFirstNonRepeatingCharacter(string:String) -> Character?{
var uniqueChars:[Character] = []
var uniqueChar:Character?
var chars = string.lowercased().characters
var charWithCount:[Character:Int] = [:]
for char in chars{
if let count = charWithCount[char] { //amazon
charWithCount[char] = count+1
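if char == uniqueChar{
// drop the current candidate and any following candidates that have already repeated
while let first = uniqueChars.first, (charWithCount[first] ?? 0) > 1 {
uniqueChars.removeFirst()
}
uniqueChar = uniqueChars.first
}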
}else{
charWithCount[char] = 1
uniqueChars.append(char)
if uniqueChar == nil{
uniqueChar = char
}
}
}
return uniqueChar
}
// Use
findFirstNonRepeatingCharacter(string: "eabcdee")
I have a function in Swift that computes the hamming distance of two strings and then puts them into a connected graph if the result is 1.
For example, read to hear returns a hamming distance of 2 because read[0] != hear[0] and read[3] != hear[3].
At first, I thought my function was taking a long time because of the quantity of input (8,000+ word dictionary), but I knew that several minutes was too long. So, I rewrote my same algorithm in Java, and the computation took merely 0.3s.
I have tried writing this in Swift two different ways:
Way 1 - Substrings
extension String {
subscript (i: Int) -> String {
return self[Range(i ..< i + 1)]
}
}
private func getHammingDistance(w1: String, w2: String) -> Int {
if w1.length != w2.length { return -1 }
var counter = 0
for i in 0 ..< w1.length {
if w1[i] != w2[i] { counter += 1 }
}
return counter
}
Results: 434 seconds
Way 2 - Removing Characters
private func getHammingDistance(w1: String, w2: String) -> Int {
if w1.length != w2.length { return -1 }
var counter = 0
var c1 = w1, c2 = w2 // need to mutate
let length = w1.length
for i in 0 ..< length {
if c1.removeFirst() != c2.removeFirst() { counter += 1 }
}
return counter
}
Results: 156 seconds
Same Thing in Java
Results: 0.3 seconds
Where it's being called
var graph: Graph
func connectData() {
let verticies = graph.canvas // canvas is Array<Node>
// Node has key that holds the String
for vertex in 0 ..< verticies.count {
for compare in vertex + 1 ..< verticies.count {
if getHammingDistance(w1: verticies[vertex].key!, w2: verticies[compare].key!) == 1 {
graph.addEdge(source: verticies[vertex], neighbor: verticies[compare])
}
}
}
}
156 seconds is still far too inefficient for me. What is the absolute most efficient way of comparing characters in Swift? Is there a possible workaround for computing hamming distance that involves not comparing characters?
Edit
Edit 1: I am taking an entire dictionary of 4 and 5 letter words and creating a connected graph where the edges indicate a hamming distance of 1. Therefore, I am comparing 8,000+ words to each other to generate edges.
Edit 2: Added method call.
Unless you choose a fixed-length character model for your strings, methods and properties such as .count and .characters will have a complexity of O(n) or at best O(n/2) (where n is the string length). If you were to store your data in an array of characters (e.g. [Character]), your functions would perform much better.
You can also combine the whole calculation in a single pass using the zip() function
let hammingDistance = zip(word1.characters,word2.characters)
.filter{$0 != $1}.count
but that still requires going through all characters of every word pair.
...
Given that you're only looking for Hamming distances of 1, there is a faster way to get to all the unique pairs of words:
The strategy is to group words by the 4 (or 5) patterns that correspond to one "missing" letter. Each of these pattern groups defines a smaller scope for word pairs because words in different groups would be at a distance other than 1.
Each word will belong to as many groups as its character count.
For example :
"hear" will be part of the pattern groups:
"*ear", "h*ar", "he*r" and "hea*".
Any other word that would correspond to one of these 4 pattern groups would be at a Hamming distance of 1 from "hear".
Here is how this can be implemented:
// Test data 8500 words of 4-5 characters ...
var seenWords = Set<String>()
var allWords = try! String(contentsOfFile: "/usr/share/dict/words")
.lowercased()
.components(separatedBy:"\n")
.filter{$0.characters.count == 4 || $0.characters.count == 5}
.filter{seenWords.insert($0).inserted}
.enumerated().filter{$0.0 < 8500}.map{$1}
// Compute patterns for a Hamming distance of 1
// Replace each letter position with "*" to create patterns of
// one "non-matching" letter
public func wordH1Patterns(_ aWord:String) -> [String]
{
var result : [String] = []
let fullWord : [Character] = aWord.characters.map{$0}
for index in 0..<fullWord.count
{
var pattern = fullWord
pattern[index] = "*"
result.append(String(pattern))
}
return result
}
// Group words around matching patterns
// and add unique pairs from each group
func addHamming1Edges()
{
// Prepare pattern groups ...
//
var patternIndex:[String:Int] = [:]
var hamming1Groups:[[String]] = []
for word in allWords
{
for pattern in wordH1Patterns(word)
{
if let index = patternIndex[pattern]
{
hamming1Groups[index].append(word)
}
else
{
let index = hamming1Groups.count
patternIndex[pattern] = index
hamming1Groups.append([word])
}
}
}
// add edge nodes ...
//
for h1Group in hamming1Groups
{
for (index,sourceWord) in h1Group.dropLast(1).enumerated()
{
for targetIndex in index+1..<h1Group.count
{ addEdge(source:sourceWord, neighbour:h1Group[targetIndex]) }
}
}
}
On my 2012 MacBook Pro, the 8500 words go through 22817 (unique) edge pairs in 0.12 sec.
[EDIT] to illustrate my first point, I made a "brute force" algorithm using arrays of characters instead of Strings :
let wordArrays = allWords.map{Array($0.unicodeScalars)}
for i in 0..<wordArrays.count-1
{
let word1 = wordArrays[i]
for j in i+1..<wordArrays.count
{
let word2 = wordArrays[j]
if word1.count != word2.count { continue }
var distance = 0
for c in 0..<word1.count
{
if word1[c] == word2[c] { continue }
distance += 1
if distance > 1 { break }
}
if distance == 1
{ addEdge(source:allWords[i], neighbour:allWords[j]) }
}
}
This goes through the unique pairs in 0.27 sec. The reason for the speed difference is the internal model of Swift Strings, which is not actually an array of equal-length elements (characters) but rather a chain of variable-length encoded characters (similar to the UTF-8 model, where special bytes indicate that the following 2 or 3 bytes are part of a single character). There is no simple base+displacement indexing of such a structure, which must always be iterated from the beginning to get to the Nth element.
Note that I used unicodeScalars instead of Character because they are fixed-width values (each Unicode.Scalar wraps a 32-bit integer) that allow a direct binary comparison. The Character type isn't as straightforward and takes longer to compare.
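The early exit in the loop above can also be packaged as a small predicate in current Swift. This is a sketch only: differsByExactlyOne is a made-up name, and the words are converted once to arrays of UTF-8 bytes, which assumes that byte-wise comparison is adequate for the word list (true for plain ASCII words).

```swift
// Hypothetical helper: true only when the two words have equal length and exactly one mismatch.
func differsByExactlyOne(_ a: [UInt8], _ b: [UInt8]) -> Bool {
    guard a.count == b.count else { return false }
    var mismatches = 0
    for i in 0..<a.count where a[i] != b[i] {
        mismatches += 1
        if mismatches > 1 { return false } // stop as soon as the distance exceeds 1
    }
    return mismatches == 1
}

// Convert each word once, up front, instead of on every comparison
let sampleWords = ["read", "road", "hear", "head", "heard"]
let sampleBytes = sampleWords.map { Array($0.utf8) }
for i in 0..<sampleBytes.count {
    for j in (i + 1)..<sampleBytes.count where differsByExactlyOne(sampleBytes[i], sampleBytes[j]) {
        print(sampleWords[i], "-", sampleWords[j])
    }
}
```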
Try this:
extension String {
func hammingDistance(to other: String) -> Int? {
guard self.characters.count == other.characters.count else { return nil }
return zip(self.characters, other.characters).reduce(0) { distance, chars in
distance + (chars.0 == chars.1 ? 0 : 1)
}
}
}
print("read".hammingDistance(to: "hear")) // => 2
The following code executed in 0.07 seconds for 8500 characters:
func getHammingDistance(w1: String, w2: String) -> Int {
if w1.characters.count != w2.characters.count {
return -1
}
let arr1 = Array(w1.characters)
let arr2 = Array(w2.characters)
var counter = 0
for i in 0 ..< arr1.count {
if arr1[i] != arr2[i] { counter += 1 }
}
return counter
}
After some messing around, I found a faster solution than @Alexander's answer (and my previous broken answer):
extension String {
func hammingDistance(to other: String) -> Int? {
guard !self.isEmpty, !other.isEmpty, self.characters.count == other.characters.count else {
return nil
}
var w1Iterator = self.characters.makeIterator()
var w2Iterator = other.characters.makeIterator()
var distance = 0;
while let w1Char = w1Iterator.next(), let w2Char = w2Iterator.next() {
distance += (w1Char != w2Char) ? 1 : 0
}
return distance
}
}
For comparing strings with a million characters, on my machine it's 1.078 sec compared to 1.220 sec, so roughly a 10% improvement. My guess is this is due to avoiding .zip and the slight overhead of .reduce and tuples
As others have noted, calling .characters repeatedly takes time. If you convert all of the strings once, it should help.
func connectData() {
let verticies = graph.canvas // canvas is Array<Node>
// Node has key that holds the String
// Convert all of the keys to utf16, and keep them
let nodesAsUTF = verticies.map { $0.key!.utf16 }
for vertex in 0 ..< verticies.count {
for compare in vertex + 1 ..< verticies.count {
if getHammingDistance(w1: nodesAsUTF[vertex], w2: nodesAsUTF[compare]) == 1 {
graph.addEdge(source: verticies[vertex], neighbor: verticies[compare])
}
}
}
}
// Calculate the hamming distance of two UTF16 views
func getHammingDistance(w1: String.UTF16View, w2: String.UTF16View) -> Int {
if w1.count != w2.count {
return -1
}
var counter = 0
// Walk both views in lockstep and compare w1 against w2 (not against itself)
for (unit1, unit2) in zip(w1, w2) {
if unit1 != unit2 {
counter += 1
}
}
return counter
}
I used UTF16, but you might want to try UTF8 depending on the data. Since I don't have the dictionary you are using, please let me know the result!
*broken*, see new answer
My approach:
private func getHammingDistance(w1: String, w2: String) -> Int {
guard w1.characters.count == w2.characters.count else {
return -1
}
let countArray: Int = w1.characters.indices
.reduce(0, {$0 + (w1[$1] == w2[$1] ? 0 : 1)})
return countArray
}
comparing 2 strings of 10,000 random characters took 0.31 seconds
To expand a bit: it should only require one iteration through the strings, adding as it goes.
Also it's way more concise 🙂.
let numbers = [1,3,4,5,5,9,0,1]
To find the first 5, use:
numbers.indexOf(5)
How do I find the second occurrence?
You can perform another search for the index of the element in the remaining array slice as follows:
edit/update: Swift 5.2 or later
extension Collection where Element: Equatable {
/// Returns the second index where the specified value appears in the collection.
func secondIndex(of element: Element) -> Index? {
guard let index = firstIndex(of: element) else { return nil }
return self[self.index(after: index)...].firstIndex(of: element)
}
}
extension Collection {
/// Returns the second index in which an element of the collection satisfies the given predicate.
func secondIndex(where predicate: (Element) throws -> Bool) rethrows -> Index? {
guard let index = try firstIndex(where: predicate) else { return nil }
return try self[self.index(after: index)...].firstIndex(where: predicate)
}
}
Testing:
let numbers = [1,3,4,5,5,9,0,1]
if let index = numbers.secondIndex(of: 5) {
print(index) // "4\n"
} else {
print("not found")
}
if let index = numbers.secondIndex(where: { $0.isMultiple(of: 3) }) {
print(index) // "5\n"
} else {
print("not found")
}
Once you've found the first occurrence, you can use indexOf on the remaining slice of the array to locate the second occurrence:
let numbers = [1,3,4,5,5,9,0,1]
if let firstFive = numbers.indexOf(5) { // 3
let secondFive = numbers[firstFive+1..<numbers.count].indexOf(5) // 4
}
I don't think you can do it with indexOf. Instead you'll have to use a for-loop. A shorthand version:
let numbers = [1,3,4,5,5,9,0,1]
var indexes = [Int]()
numbers.enumerate().forEach { if $0.element == 5 { indexes += [$0.index] } }
print(indexes) // [3, 4]
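The snippet above uses Swift 2 naming (enumerate() was renamed enumerated() in Swift 3). A minimal sketch of the same "collect every matching index" approach for a recent Swift version might look like this; the variable names are assumptions for illustration.

```swift
let numbers = [1, 3, 4, 5, 5, 9, 0, 1]

// Keep every index whose element equals 5.
let indexes = numbers.indices.filter { numbers[$0] == 5 }
print(indexes)

// The second occurrence, or nil if there are fewer than two matches.
let second = indexes.dropFirst().first
print(second as Any)
```

Using dropFirst().first instead of indexes[1] avoids a crash when the value occurs fewer than two times.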
Here's a general use extension of Array that will work for finding the nth element of a kind in any array:
extension Array where Element: Equatable {
// returns nil if there is no nth occurrence
// or the index of the nth occurrence if there is
func findNthIndexOf(n: Int, thing: Element) -> Int? {
guard n > 0 else { return nil }
var count = 0
for (index, item) in enumerate() where item == thing {
count += 1
if count == n {
return index
}
}
return nil
}
}
let numbers = [1,3,4,5,5,9,0]
numbers.findNthIndexOf(2, thing: 5) // returns 4
EDIT: as per @davecom's comment, I've included a similar but slightly more complex solution at the bottom of the answer.
I see a couple of good solutions here, especially considering the limitations of the relatively new Swift language. There is a really concise way to do it too, but beware: it is rather quick-and-dirty. It may not be the perfect solution, but it is pretty quick to write. It is also very versatile (not to brag).
extension Array where Element: Equatable {
func indexes(search: Element) -> [Int] {
return enumerate().reduce([Int]()) { $1.1 == search ? $0 + [$1.0] : $0 }
}
}
Using this extension, you could access the second index as follows:
let numbers = [1, 3, 4, 5, 5, 9, 0, 1]
let indexesOf5 = numbers.indexes(5) // [3, 4]
indexesOf5[1] // 4
And you're done!
Basically, the method works like this: enumerate() pairs each element of the array with its index. In this case, [1, 3, 4, 5, 5, 9, 0, 1].enumerate() returns a sequence of type EnumerateSequence<Array<Int>> which, converted to an array of (index, element) tuples, is [(0,1), (1,3), (2,4), (3,5), (4,5), (5,9), (6,0), (7,1)].
The rest of the work is done using reduce (called 'inject' in some languages), which is an extremely powerful tool that many coders are not familiar with. If the reader is among those coders, I'd recommend checking out this article regarding use of the function in JS (keep in mind that in JS the initial value is passed after the callback, rather than before the closure as seen here).
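To make the two steps concrete, here is a small sketch written for a recent Swift version, where enumerate() is spelled enumerated() and the tuple labels are offset and element. It uses reduce(into:), which is a swapped-in variant of the reduce call in the answer; the variable names are assumptions for illustration.

```swift
let numbers = [1, 3, 4, 5, 5, 9, 0, 1]

// Step 1: pair each element with its position.
let pairs = Array(numbers.enumerated())
print(pairs[3]) // the pair at position 3 holds the first 5

// Step 2: fold the pairs into an array of matching positions.
// reduce(into:) mutates one accumulator instead of creating a new array per step.
let indexesOf5 = numbers.enumerated().reduce(into: [Int]()) { result, pair in
    if pair.element == 5 {
        result.append(pair.offset)
    }
}
print(indexesOf5)
```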
Thanks for reading.
P.S. not to be too long-winded on this relatively simple solution, but if the syntax for the indexes method shown above is a bit too quick-and-dirty, you could try something like this in the method body, where the closure's parameters are expanded for a bit more clarity:
return enumerate().reduce([Int]()) { memo, element in
element.1 == search ? memo + [element.0] : memo
}
EDIT: Here's another option that allows the implementer to scan for a specific "index at index" (e.g. the second occurrence of 5) for a more efficient solution.
extension Array where Element: Equatable {
func nIndex(search: Element, n: Int) -> Int? {
let info = enumerate().reduce((count: 0, index: 0), combine: { memo, element in
memo.count < n && element.1 == search ? (count: memo.count + 1, index: element.0) : memo
})
return info.count == n ? info.index : nil
}
}
[1, 3, 4, 5, 5, 9, 0, 1].nIndex(5, n: 2) // 4
[1, 3, 4, 5, 5, 9, 0, 1].nIndex(5, n: 3) // nil
The new method still iterates over the entire array, but it is much more efficient because it does not build a new array at each step, as the previous method did. That difference would be negligible for the 8-element array used in most of the examples here. But consider a list of 10,000 random numbers from 0 to 99:
let randomNumbers = (1...10000).map{_ in Int(rand() % 100)}
let indexes = randomNumbers.indexes(93) // count -> 100 (in my first run)
let index1 = indexes[1] // 238
// executed in 29.6603130102158 sec
let index2 = randomNumbers.nIndex(93, n: 2) // 238
// executed in 3.82625496387482 sec
As can be seen, this new method is considerably faster with the (very) large dataset; it is a bit more cumbersome and confusing though, so depending on your application, you may prefer the simpler solution, or a different one entirely.
(Again) thanks for reading.
extension Collection where Element: Equatable {
func nth(occurance: Int, of element: Element) -> Index? {
var level : Int = occurance
var position = self.startIndex
while let index = self[position...].index(of: element) {
level -= 1
guard level >= 0 else { return nil }
guard level != 0 else { return index }
position = self.index(after: index)
}
return nil
}
}
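A usage sketch for this last approach follows. Because index(of:) is deprecated in recent Swift in favor of firstIndex(of:), the block restates the idea under a different, hypothetical name (`nthIndex(_:of:)`) so that it is self-contained; treat it as an illustration rather than the answer's exact code.

```swift
extension Collection where Element: Equatable {
    /// Hypothetical variant of the extension above: returns the index of the
    /// nth occurrence of `element` (counting from 1), or nil if there is none.
    func nthIndex(_ n: Int, of element: Element) -> Index? {
        guard n > 0 else { return nil }
        var remaining = n
        var position = startIndex
        // Search only the part of the collection that has not been scanned yet.
        while let index = self[position...].firstIndex(of: element) {
            remaining -= 1
            if remaining == 0 { return index }
            position = self.index(after: index)
        }
        return nil
    }
}

let numbers = [1, 3, 4, 5, 5, 9, 0, 1]
print(numbers.nthIndex(2, of: 5) as Any)
print(numbers.nthIndex(3, of: 5) as Any)

// Because it extends Collection, it also works on a String.
let word = "mississippi"
if let index = word.nthIndex(3, of: "s") {
    print(word.distance(from: word.startIndex, to: index))
}
```

Unlike the reduce-based versions, this stops scanning as soon as the nth match is found.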