Filter & Map over a Swift Dictionary? - swift

I have two methods doing the same thing, one using the more functional Swift.
func getConstraints1(forState newState: State) -> (on: [NSLayoutConstraint], off: [NSLayoutConstraint]) {
return (
constraints.filter { $0.key.rawValue == newState.rawValue }.map { $1 }.first!,
constraints.filter { $0.key.rawValue != newState.rawValue }.map { $1 }.first!
)
}
The other one using a standard procedural approach.
func getConstraints2(for nextState: State) -> (on: [NSLayoutConstraint], off: [NSLayoutConstraint]) {
var on = [NSLayoutConstraint]()
var off = [NSLayoutConstraint]()
for (state, constraints) in constraints {
if state == nextState {
on += constraints
} else {
off += constraints
}
}
return (on, off)
}
I feel like getConstraints2 is going to be faster than getConstraints1, but I like the syntax in getConstraints1. Is there ever a case fort doing filter/map/reduce operations on a Dictionary? Is this it?

Every time you say filter or map, you are cycling through the entire sequence. So getConstraints1 cycles at least twice (maybe four times, but maybe not, because of laziness considerations), whereas getConstraints2 cycles once.
The question is: do you care? Only you (and Instruments) can answer that one.
Personally, if I were going to deal a deck of cards into two piles (say, all red cards and all black cards), I would work my way through the deck once. Having dealt all the red cards into one pile, I'd feel pretty silly working my way through the deck again looking to see if each remaining card was black.

If your data set is so large that iterating over it twice matters than use function2. Normally, I wouldn't think it matters because it will still run in O(n). If you want a functional approach you can alway modify your filter function to return two arrays in a tuple if you are worried about the second iteration.

It really depends what you are map/filter/reduce'ing over since it will iterate over each item, every time one is called. Sometimes using map/filter/reduce can be more readable than larger for loops.
To make getConstraints2 a bit more functional, you could use a reduce here and it would only iterate over the dictionary once. I did a bit of guessing with the types so hopefully this matches up :)
typealias StatefulConstraints = (on: [NSLayoutConstraint], off: [NSLayoutConstraint])
var constraints = [State: [NSLayoutConstraint]]()
func getConstraints3(for nextSate: State) -> StatefulConstraints {
let initial = (on: [NSLayoutConstraint](), off: [NSLayoutConstraint]())
return constraints.reduce(initial) { (combined, next: (state: State, constraints: [NSLayoutConstraint])) -> StatefulConstraints in
var mutableCombined = combined
if next.state == nextSate {
mutableCombined.on += next.constraints
} else {
mutableCombined.off += next.constraints
}
return mutableCombined
}
}

Related

How can I compare two Arrays in Swift and mutate one of the arrays if two Items are the same

I want to compare two Arrays with each other, that means each single item of them.
I need to run some code if two items in this arrays are the same.
I've done that so far with two For-Loops, but someone in the comments says that's not that good (because I get an Error too.
Has anybody an Idea which Code I can use to reach that?
Btw: That's the code I used before:
var index = 0
for Item1 in Array1 {
for Item2 in Array2 {
if (Item1 == Item2) {
// Here I want to put some code if the items are the same
}
}
// Index is in this case the counter which counts on which place in the Array1 I am.
index += 1
}
Okay I'll try again to describe what I mean:
I want to compare two Arrays. If there are some Items the same, I want to delete that Item from Array A / Array 1.
Okay if that is working, I want to add a few more statements that alow me to delete a Item only if an parameter of this item has a special worth, but I think I can do this step alone.
If you want to compare items from different array you need to add Equatable protocol for your Item
For example:
struct Item: Equatable {
let name: String
static func ==(l: Item, r: Item) -> Bool {
return l.name == r.name
}
}
You need to decide by which attributes you want to compare your Item. In my case I compare by name.
let array1: [Item] = [
.init(name: "John"),
.init(name: "Jack"),
.init(name: "Soer"),
.init(name: "Kate")
]
let array2: [Item] = [
.init(name: "John"),
]
for item1 in array1 {
if array2.contains(item1) {
// Array 2 contains item from the array1 and you can perform your code.
}
}
If you want to support this
Okay I'll try again to describe what I mean: I want to compare two
Arrays. If there are some Items the same, I want to delete that Item
from Array A / Array 1. Okay if that is working, I want to add a few
more statements that alow me to delete a Item only if an parameter of
this item has a special worth, but I think I can do this step alone.
I guess it can fit for you
You need to make your array1 mutable
var array1: [Item] = ...
Filter the array1 like this
let filteredArray1 = array1.filter { (item) -> Bool in
return !array2.contains(item)
}
And redefine your array1 with filtered array.
array1 = filteredArray1
Let me know it it works for you.
var itemsToRemove = array1.filter { array2.contains($0) }
for item in itemsToRemove {
if let i = array1.firstIndex(of: item) {
// run your conditional code here.
array1.remove(at: i)
}
}
Update
The question has been restated that elements are to be removed from one of the arrays.
You don't provide the actual code where you get the index out of bounds error, but it's almost certainly because you don't account for having removed elements when using the index, so you're probably indexing into a region that was valid at the start, but isn't anymore.
My first advice is don't do the actual removal inside the search loop. There is a solution to achieve the same result, which I'll give, but apart from having to be very careful about indexing into the shortened array, there is also a performance issue: Every deletion requires Array to shift all the later elements down one. That means that every deletion is an O(n) operation, which makes an the overall algorithim O(n^3).
So what do you do instead? The simplest method is to create a new array containing only the elements you wish to keep. For example, let's say you want to remove from array1 all elements that are also in array2:
array1 = array1.filter { !array2.contains($0) }
I should note that one of the reasons I kept my original answer below is because you can use those methods to replace array2.contains($0) to achieve better performance in some cases, and the original answer describes those cases and how to achieve it.
Also even though the closure is used to determine whether or not to keep the element, so it has to return a Bool, there is nothing that prevents you from putting additional code in it to do any other work you might want to:
array1 = array1.filter
{
if array2.contains($0)
{
doSomething(with: $0)
return false // don't keep this element
}
return true
}
The same applies to all of the closures below.
You could just use the removeAll(where:) method of Array.
array1.removeAll { array2.contains($0) }
In this case, if the closure returns true, it means "remove this element" which is the opposite sense of the closure used in filter, so you have to take that into account if you do additional work in it.
I haven't looked up how the Swift library implements removeAll(where:) but I'm assuming it does its job the efficient way rather than the naive way. If you find the performance isn't all that good you could roll your own version.
extension Array where Element: Equatable
{
mutating func efficientRemoveAll(where condition: (Element) -> Bool)
{
guard count > 0 else { return }
var i = startIndex
var iEnd = endIndex
while i < iEnd
{
if condition(self[i])
{
iEnd -= 1
swapAt(i, iEnd)
continue // note: skips incrementing i,iEnd has been decremented
}
i += 1
}
removeLast(endIndex - iEnd)
}
}
array1.efficientRemoveAll { array2.contains($0) }
Instead of actually removing the elements inside the loop, this works by swapping them with the end element, and handling when to increment appropriately. This collects the elements to be removed at the end, where they can be removed in one go after the loop finishes. So the deletion is just one O(n) pass at the end, which avoids increasing the algorithmic complexity that removing inside the loop would entail.
Because of the swapping with the current "end" element, this method doesn't preserve the relative order of the elements in the mutating array.
If you want to preserve the order you can do it like this:
extension Array where Element: Equatable
{
mutating func orderPreservingRemoveAll(where condition: (Element) -> Bool)
{
guard count > 0 else { return }
var i = startIndex
var j = i
repeat
{
swapAt(i, j)
if !condition(self[i]) { i += 1 }
j += 1
} while j != endIndex
removeLast(endIndex - i)
}
}
array1.orderPreservingRemoveAll { array2.contains($0) }
If I had to make a bet, this would be very close to how standard Swift's removeAll(where:) for Array is implemented. It keeps two indices, one for the current last element to be kept, i, and one for the next element to be examined, j. This has the effect of accumulating elements to be removed at the end (those past i).
Original answer
The previous solutions are fine for small (yet still surprisingly large) arrays, but they are O(n^2) solutions.
If you're not mutating the arrays, they can be expressed more succinctly
array1.filter { array2.contains($0) }.forEach { doSomething($0) }
where doSomething doesn't actually have to be a function call - just do whatever you want to do with $0 (the common element).
You'll normally get better performance by putting the smaller array inside the filter.
Special cases for sorted arrays
If one of your arrays is sorted, then you might get better performance in a binary search instead of contains, though that will require that your elements conform to Comparable. There isn't a binary search in the Swift Standard Library, and I also couldn't find one in the new swift-algorithms package, so you'd need to implement it yourself:
extension RandomAccessCollection where Element: Comparable, Index == Int
{
func sortedContains(_ element: Element) -> Bool
{
var range = self[...]
while range.count > 0
{
let midPoint = (range.startIndex + range.endIndex) / 2
if range[midPoint] == element { return true }
range = range[midPoint] > element
? self[..<midPoint]
: self[index(after: midPoint)...]
}
return false
}
}
unsortedArray.filter { sortedArray.sortedContains($0) }.forEach { doSomething($0) }
This will give O(n log n) performance. However, you should test it for your actual use case if you really need performance, because binary search is not especially friendly for the CPU's branch predictor, and it doesn't take that many mispredictions to result in slower performance than just doing a linear search.
If both arrays are sorted, you can get even better performance by exploiting that, though again you have to implement the algorithm because it's not supplied by the Swift Standard Library.
extension Array where Element: Comparable
{
// Assumes no duplicates
func sortedIntersection(_ sortedOther: Self) -> Self
{
guard self.count > 0, sortedOther.count > 0 else { return [] }
var common = Self()
common.reserveCapacity(Swift.min(count, sortedOther.count))
var i = self.startIndex
var j = sortedOther.startIndex
var selfValue = self[i]
var otherValue = sortedOther[j]
while true
{
if selfValue == otherValue
{
common.append(selfValue)
i += 1
j += 1
if i == self.endIndex || j == sortedOther.endIndex { break }
selfValue = self[i]
otherValue = sortedOther[j]
continue
}
if selfValue < otherValue
{
i += 1
if i == self.endIndex { break }
selfValue = self[i]
continue
}
j += 1
if j == sortedOther.endIndex { break }
otherValue = sortedOther[j]
}
return common
}
}
array1.sortedIntersection(array2).forEach { doSomething($0) }
This is O(n) and is the most efficient of any of the solutions I'm including, because it makes exactly one pass through both arrays. That's only possible because both of the arrays are sorted.
Special case for large array of Hashable elements
However, if your arrays meet some criteria, you can get O(n) performance by going through Set. Specifically if
The arrays are large
The array elements conform to Hashable
You're not mutating the arrays
Make the larger of the two arrays a Set:
let largeSet = Set(largeArray)
smallArray.filter { largeSet.contains($0) }.forEach { doSomething($0) }
For the sake of analysis if we assume both arrays have n elements, the initializer for the Set will be O(n). The intersection will involve testing each element of the smallArray's elements for membership in largeSet. Each one of those tests is O(1) because Swift implements Set as a hash table. There will be n of those tests, so that's O(n). Then the worst case for forEach will be O(n). So O(n) overall.
There is, of course, overhead in creating the Set, the worst of which is the memory allocation involved, so it's not worth it for small arrays.

Does initializing an array from a set have a complexity and if so what is it?

Swift: Option 1
var dictionaryWithoutDuplicates = [Int: Int]()
for item in arrayWithDuplicates {
if dictionaryWithoutDuplicates[item] == nil {
dictionaryWithoutDuplicates[item] = 1
}
}
print(dictionaryWithoutDuplicates.keys)
// [1,2,3,4]
Option 2
let arrayWithDuplicates = [1,2,3,3,2,4,1]
let arrayWithoutDuplicates = Array(Set(arrayWithDuplicates))
print(arrayWithoutDuplicates)
// [1,2,3,4]
For the first option there might be a more elegant way to do it but that's not my point, I just wanted to show an example that has a complexity of n.
Both options return an array without duplicates. Since the first option has a complexity of O(n), I was wondering if the second option even has a complexity and if so what is it?
What you did is pretty much exactly what Set does. A Set<T> is pretty much just a [T: Void] (a.k.a. Dictionary<T, Void>).
Both examples have O(arrayWithDuplicates.count) time and space complexity.

Efficiency of using filter{where:} vs. removeAll{where:} when modifying a parameter value

Swift 4.2 introduced a new removeAll {where:} function. From what I have read, it is supposed to be more efficient than using filter {where:}. I have several scenarios in my code like this:
private func getListOfNullDates(list: [MyObject]) -> [MyObject] {
return list.filter{ $0.date == nil }
.sorted { $0.account?.name < $1.account?.name }
}
However, I cannot use removeAll{where:} with a param because it is a constant. So I would need to redefine it like this:
private func getListOfNullDates(list: [MyObject]) -> [MyObject] {
var list = list
list.removeAll { $0.date == nil }
return list.sorted { $0.account?.name < $1.account?.name }
}
Is using the removeAll function still more efficient in this scenario? Or is it better to stick with using the filter function?
Thank you for this question 🙏🏻
I've benchmarked both functions using this code on TIO:
let array = Array(0..<10_000_000)
do {
let start = Date()
let filtering = array.filter { $0 % 2 == 0 }
let end = Date()
print(filtering.count, filtering.last!, end.timeIntervalSince(start))
}
do {
let start = Date()
var removing = array
removing.removeAll { $0 % 2 == 0 }
let end = Date()
print(removing.count, removing.last!, end.timeIntervalSince(start))
}
(To have the removing and filtering identical, the closure passed to removeAll should have been { $0 % 2 != 0 }, but I didn't want to give an advantage to either snippet by using a faster or slower comparison operator.)
And indeed, removeAll(where:) is faster when the probability of removing elements (let's call it Pr)is 50%! Here are the benchmark results :
filter : 94ms
removeAll : 74ms
This is the same case when Pr is less than 50%.
Otherwise, filtering is faster for a higher Pr.
One thing to bear in mind is that in your code list is mutable, and that opens the possibility for accidental modifications.
Personally, I would choose performance over old habits, and in a sense, this use case is more readable since the intention is clearer.
Bonus : Removing in-place
What's meant by removing in-place is the swap the elements in the array in such a way that the elements to be removed are placed after a certain pivot index. The elements to keep are the ones before the pivot element :
var inPlace = array
let p = inPlace.partition { $0 % 2 == 0 }
Bear in mind that partition(by:) doesn't keep the original order.
This approach clocks better than removeAll(where:)
Beware of premature optimization. The efficiency of a method often depends on your specific data and configuration, and unless you're working with a large data set or performing many operations at once, it's not likely to have a significant impact either way. Unless it does, you should favor the more readable and maintainable solution.
As a general rule, just use removeAll when you want to mutate the original array and filter when you don't. If you've identified it as a potential bottleneck in your program, then test it to see if there's a performance difference.

How to check that a function always return a value (aka "doesn't fall off the end")?

I'm building a didactic compiler, and I'd like to check if the function will always return a value. I intend to do this in the semantic analysis step (as this is not covered by the language grammar).
Out of all the flow control statements, this didactic language only has if, else, and while statements (so no do while, for, switch cases, etc). Note that else if is also possible. The following are all valid example snippets:
a)
if (condition) {
// non-returning commands
}
return value
b)
if (condition) {
return value
}
return anotherValue
c)
if (condition) {
return value1
} else {
return value2
}
// No return value needed here
I've searched a lot about this but couldn't find a pseudoalgorithm that I could comprehend. I've searched for software path testing, the white box testing, and also other related Stack Overflow questions like this and this.
I've heard that this can be solved using graphs, and also using a stack, but I have no idea how to implement those strategies.
Any help with pseudocode would be very helpful!
(and if it matters, I'm implementing my compiler in Swift)
If you have a control flow graph, checking that a function always returns is as easy as checking that the implicit return at the end of the function is unreachable. So since there are plenty of analyses and optimizations where you'll want a CFG, it would not be a bad idea to construct one.
That said, even without a control flow graph, this property is pretty straight forward to check assuming some common restrictions (specifically that you're okay with something like if(cond) return x; if(!cond) return y; being seen as falling of the end even though it's equivalent to if(cond) return x; else return y;, which would be allowed). I also assume there's no goto because you didn't list it in your list of control flow statements (I make no assumptions about break and continue because those only appear within loops and loops don't matter).
We just need to consider the cases of what a legal block (i.e. one that always reaches a return) would look like:
So an empty block would clearly not be allowed because it can't reach a return if it's empty. A block that directly (i.e. not inside an if or loop) contains a return would be allowed (and if it isn't at the end of the block, everything after the return in the block would be unreachable, which you might also want to turn into an error or warning).
Loops don't matter. That is, if your block contains a loop, it still has to have a return outside of the loop even if the loop contains a return because the loop condition may be false, so there's no need for us to even check what's inside the loop. This wouldn't be true for do-while loops, but you don't have those.
If the block directly contains an if with an else and both the then-block and the else-block always reach a return, this block also always reaches a return. In that case, everything after the if-else is unreachable. Otherwise the if doesn't matter just like loops.
So in pseudo code that would be:
alwaysReturns( {} ) = false
alwaysReturns( {return exp; ...rest} ) = true
alwaysReturns( { if(exp) thenBlock else elseBlock; ...rest}) =
(alwaysReturns(thenBlock) && alwaysReturns(elseBlock)) || alwaysReturns(rest)
alwaysReturns( {otherStatement; ...rest} ) = alwaysReturns(rest)
So, after 5 hours thinking how to implement this, I came up with a decent solution (at least I haven't been able to break it so far). I actually spent most of the time browsing the web (with no luck) than actually thinking about the problem and trying to solve it on my own.
Below is my implementation (in Swift 4.2, but the syntax is fairly easy to pick up), using a graph:
final class SemanticAnalyzer {
private var currentNode: Node!
private var rootNode: Node!
final class Node {
var nodes: [Node] = []
var returnsExplicitly = false
let parent: Node?
var elseNode: Node!
var alwaysReturns: Bool { return returnsExplicitly || elseNode?.validate() == true }
init(parent: Node?) {
self.parent = parent
}
func validate() -> Bool {
if alwaysReturns {
return true
} else {
return nodes.isEmpty ? false : nodes.allSatisfy { $0.alwaysReturns }
}
}
}
/// Initializes the components of the semantic analyzer.
func startAnalyzing() {
rootNode = Node(parent: nil)
currentNode = rootNode
}
/// Execute when an `if` statement is found.
func handleIfStatementFound() {
let ifNode = Node(parent: currentNode)
let elseNode = Node(parent: currentNode)
// Assigning is not necessary if the current node returns explicitly.
// But assigning is not allowed if the else node always returns, so we check if the current node always returns.
if !currentNode.alwaysReturns {
currentNode.elseNode = elseNode
}
currentNode.nodes += [ ifNode, elseNode ]
currentNode = ifNode
}
/// Execute when an `else` statement is found.
func handleElseStatementFound() {
currentNode = currentNode.elseNode
}
/// Execute when a branch scope is closed.
func handleBranchClosing() {
currentNode = currentNode.parent! // If we're in a branch, the parent node is never nil
}
/// Execute when a function return statement is found.
func handleReturnStatementFound() {
currentNode.returnsExplicitly = true
}
/// Determine whether the function analyzed always returns a value.
///
/// - Returns: whether the root node validates.
func validate() -> Bool {
return rootNode.validate()
}
}
Basically what it does is:
When it finds an if statement is create 2 new nodes and point the current node to both of them (as in a binary tree node).
When the else statement is found, we just switch the current node to the else node created previously in the if statement.
When a branch is closed (e.g. in an if statement's } character), it switches the current node to the parent node.
When it finds a function return statement, it can assume that the current node will always have a return value.
Finally, to validate a node, either the node has an explicit return value, or all of the nodes must be valid.
This works with nested if/else statements, as well as branches without return values at all.

Improving performance of higher order functions vs loops in Swift (3.0)

import XCTest
class testTests: XCTestCase {
static func makearray() -> [Int] {
var array: [Int] = []
for x in 0..<1000000 {
array.append(x)
}
return array
}
let array = testTests.makearray()
func testPerformanceExample() {
self.measure {
var result: [String] = []
for x in self.array {
let tmp = "string\(x)"
if tmp.hasSuffix("1234") {
result.append(tmp)
}
}
print(result.count)
}
}
func testPerformanceExample2() {
self.measure {
let result = self.array.map { "string\($0)" }
.filter { $0.hasSuffix("1234") }
print(result.count)
}
}
func testPerformanceExample3() {
self.measure {
let result = self.array.flatMap { int -> String? in
let tmp = "string\(int)"
return tmp.hasSuffix("1234") ? tmp : nil
}
print(result.count)
}
}
}
In this code I am trying to see how the higher order functions perform with respect to processing a large array.
The 3 tests produce the same results with times of around 0.75s for loop, 1.38s map/filter, 1.21s flatmap.
Assuming HOFs are, more or less, functions wrapping loops, this makes sense as in the map/filter case, it is looping through the first array for map, then looping through the result of that to filter.
In the case of flatmap, is it doing the map first, and then able to do a simpler filter operation?
Is my understanding of what is happening under the hood (roughly) correct?
If so, would it be fair to say that the compiler is not able to do much optimisation of this?
And finally, is there a better way of doing this? The HOF versions are definitely easier for me to understand, but for performance critical areas, it looks like for-loops are the way to go?
The flatmap approach is likely to be nearly equivalent to the loop approach. Algorithmically, it is equivalent. I would add that even the map/filter approach, in this instance, should be "nearly" as fast because the bulk of the running time in taken by the operations on strings.
For good performance, one wants to avoid working with temporary strings. We can achieve the desired result as follows...
func fastloopflatmap (_ test_array: [Int]) -> [String] {
var result: [String] = []
for x in array {
if x % 10000 == 1234 {
result.append("string\(x)")
}
}
return result;
}
Here are my timings :
loop : 632 ns/element
filter/map : 650 ns/element
flatmap : 632 ns/element
fast loop : 1.2 ns/element
Thus, as you can see, the bulk (99%) of the running time of the slow functions is due to the operation over temporary strings.
Source code : https://github.com/lemire/Code-used-on-Daniel-Lemire-s-blog/tree/master/extra/swift/flatmap
Technically speaking, these HOF could have better Big O performance if the compiler substituted an implementation that used multiple cores in parallel. However, Swift does not do this at this time. A step further would be using granularity control to use an iterative or parallel implementation appropriately, based on weighing the overhead costs of parallelization vs. the input size:
https://en.wikipedia.org/wiki/Granularity_(parallel_computing)