What does the reduce(_:_:) function do in Swift? [duplicate] - swift

This question already has answers here:
What is the reduce() function doing, in Swift
(4 answers)
Closed 9 months ago.
Here is a piece of code I don't understand. This code uses swift's reduce(::) function along with the closure which I am having trouble to understand. What are the values set in maxVerticalPipCount and maxHorizontalPipCount? Are they 5 and 2 respectively?
let pipsPerRowForRank = [[0], [1], [1,1], [1,1,1], [2,2], [2,1,2],
[2,2,2], [2,1,2,2], [2,2,2,2], [2,2,1,2,2],
[2,2,2,2,2]]
let maxVerticalPipCount = CGFloat(pipsPerRowForRank.reduce(0) { max($1.count, $0) })
let maxHorizontalPipCount = CGFloat(pipsPerRowForRank.reduce(0) { max($1.max() ?? 0, $0) })

By the way, if you’re wondering what precisely reduce does, you can always refer to the source code, where you can see the actual code as well as a nice narrative description in the comments.
But the root of your question is that this code is not entirely obvious. I might suggest that if you’re finding it hard to reason about the code snippet, you can replace the opaque shorthand argument names, $0 and $1, with meaningful names, e.g.:
let verticalMax = pipsPerRowForRank.reduce(0) { previousMax, nextArray in
max(nextArray.count, previousMax)
}
let horizontalMax = pipsPerRowForRank.reduce(0) { previousMax, nextArray in
max(nextArray.max() ?? 0, previousMax)
}
By using argument names that make the functional intent more clear, it often is easier to grok what the code is doing. IMHO, especially when there are multiple arguments, using explicit argument names can make it more clear.
That having been said, I’d probably not use reduce and instead do something like:
let verticalMax = pipsPerRowForRank
.lazy
.map { $0.count }
.max() ?? 0
To my eye, that makes the intent extremely clear, namely that we’re counting how many items are in each sub-array and returning the maximum count.
Likewise, for the horizontal one:
let horizontalMax = pipsPerRowForRank
.lazy
.flatMap { $0 }
.max() ?? 0
Again, I think that’s clear that we’re creating a flat array of the values, and then getting the maximum value.
And, in both cases, we’re using lazy to avoid building interim structures (in case our arrays were very large), but evaluating it as we go along. This improves memory characteristics of the routine and the resulting code is more efficient. Frankly, with an array of arrays this small, lazy isn’t needed, but I include it for your reference.
Bottom line, the goal with functional patterns is not to write code with the fewest keystrokes possible (as there are more concise renditions we could have written), but rather to write efficient code whose intent is as clear as possible with the least amount of cruft. But we should always be able to glance at the code and reason about it quickly. Sometimes if further optimization is needed, we’ll make a conscious decision to sacrifice readability for performance reasons, but that’s not needed here.

This is what the reduce functions do here
var maxVerticalPipCount:CGFloat = 0
for rark in pipsPerRowForRank {
if CGFloat(rark.count) > maxVerticalPipCount {
maxVerticalPipCount = CGFloat(rark.count)
}
}
var maxHorizontalPipCount:CGFloat = 0
for rark in pipsPerRowForRank {
if CGFloat(rark.max() ?? 0) > maxHorizontalPipCount {
maxHorizontalPipCount = CGFloat(rark.max() ?? 0)
}
}
You shouldn't use reduce(::) function for finding the max value. Use max(by:)
function like this
let maxVerticalPipCount = CGFloat(pipsPerRowForRank.max { $0.count < $1.count }?.count ?? 0)
let maxHorizontalPipCount = CGFloat(pipsPerRowForRank.max { ($0.max() ?? 0) < ($1.max() ?? 0) }?.max() ?? 0)

The reduce function loops over every item in a collection, and combines them into one value. Think of it as literally reducing multiple values to one value. [Source]
From Apple Docs
let numbers = [1, 2, 3, 4]
let numberSum = numbers.reduce(0, { x, y in
x + y
})
// numberSum == 10
In your code,
maxVerticalPipCount is iterating through the whole array and finding the max between count of 2nd element and 1st element of each iteration.
maxHorizontalPipCount is finding max of 2nd element's max value and first element.
Try to print each element inside reduce function for better understandings.
let maxVerticalPipCount = pipsPerRowForRank.reduce(0) {
print($0)
return max($1.count, $0)
}

Reduce adds together all the numbers in an array opens a closure and really do whatever you tell it to return.
let pipsPerRowForRank = [[1,1], [2,2,2]]
let maxVerticalPipCount = CGFloat(pipsPerRowForRank.reduce(0) {
max($1.count, $0)})
Here it starts at 0 at reduce(0) and loops through the full array. where it takes the highest value between it's previous value it's in process of calculating and the number of items in the subarray. In the example above the process will be:
maxVerticalPipCount = max(2, 0)
maxVerticalPipCount = max(3, 2)
maxVerticalPipCount = 3
As for the second one
let pipsPerRowForRank = [[1,2], [1,2,3], [1,2,3,4], []]
let maxHorizontalPipCount = CGFloat(pipsPerRowForRank.reduce(0) {
max($1.max() ?? 0, $0)})
Here we instead of checking count of array we check the max value of the nested array, unless it's empty, the it's 0. So here goes this one:
let maxHorizontalPipCount = max(2, 0)
let maxHorizontalPipCount = max(3, 2)
let maxHorizontalPipCount = max(4, 3)
let maxHorizontalPipCount = max(0, 4)
let maxHorizontalPipCount = 4

Example With Swift 5,
enum Errors: Error {
case someError
}
let numbers = [1,2,3,4,5]
let inititalValue = 0
let sum = numbers.reduce(Result.success(inititalValue)) { (result, value) -> Result<Int, Error> in
if let initialValue = try? result.get() {
return .success(value + initialValue)
} else {
return .failure(Errors.someError)
}
}
switch sum {
case .success(let totalSum):
print(totalSum)
case .failure(let error):
print(error)
}

Related

Finding the largest number in a [String: [Int]] Swift dictionary

I am new in Swift.
I have an error with this code and I can't find any answer on this site.
I print largest number but I want to print largest number's kind.
let interestingNumbers = [
"Prime": [2,3,5,7,11,13],
"Fibonacci": [1,1,2,3,5,8,13],
"Square": [1,4,9,16,25,36]
]
var largestNumber = 0
for (kind, numbers) in interestingNumbers {
for x in numbers {
for y in kind {
if x > largestNumber {
largestNumber = x
}
}
}
}
print("the largest number is = \(largestNumber)")
Try this instead:
var largestNumber = 0
var largestNumberKind: String!
for (kind, numbers) in interestingNumbers {
for x in numbers {
if x > largestNumber {
largestNumber = x
largertNumberKind = kind
}
}
}
print("the largest number is = \(largestNumber)")
print("the largest number kind is = \(largestNumberKind)")
Regarding your original code:
you were only keeping track of the largest number, losing the kind info you wanted. The largestNumberKind variable I added does just that.
looping over the kind: String didn't make any sense (the for y in kind line). Your outside loop already iterates a key at a time, so such inner loop is pointless.
There is nothing wrong with Paulo's approach (with some minor corrections; see comments there), but this is a reasonable problem to explore more functional approaches that don't require looping and mutation.
For example, we can just flatten each kind to its maximum element (or Int.min if it's empty), then take the kind with the highest max:
interestingNumbers
.map { (kind: $0, maxValue: $1.max() ?? .min) } // Map each kind to its max value
.max { $0.maxValue < $1.maxValue }? // Find the kind with the max value
.kind // Return the kind
This does create a slight edge condition that I don't love. If you evaluate the following:
let interestingNumbers = [
"ImaginaryReals": [],
"Smallest": [Int.min],
]
It's not well defined here which will be returned. Clearly the correct answer is "Smallest," but it's kind of order-dependent. With a little more thought (and code) we can fix this. The problem is that we are taking a little shortcut by treating an empty list as having an Int.min maximum (this also prevents our system from working for things like Float, so that's sad). So let's fix that. Let's be precise. The max of an empty list is nil. We want to drop those elements.
We can use a modified version of mapValues (which is coming in Swift 4 I believe). We'll make flatMapValues:
extension Dictionary {
func flatMapValues<T>(_ transform: (Value) throws -> T?) rethrows -> [Key: T] {
var result: [Key: T] = [:]
for (key, value) in self {
if let newValue = try transform(value) {
result[key] = newValue
}
}
return result
}
}
And with that, we can be totally precise, even with empty lists in the mix:
interestingNumbers
.flatMapValues { $0.max() }
.max { $0.1 < $1.1 }?
.key
Again, nothing wrong with Paulo's approach if you find it clear, but there are other ways of thinking about the problem.
BTW, the equivalent version that iterates would look like this:
var largestNumber: Int? = nil
var largestNumberKind: String? = nil
for (kind, numbers) in interestingNumbers {
for x in numbers {
if largestNumber == nil || largestNumber! < x {
largestNumber = x
largestNumberKind = kind
}
}
}

A better approach to recursion?

I built this code sample in Swift Playgrounds as a proof-of-concept for part of a larger project that I'm working on. What I need to do is pass in a series of options (represented by optionsArray or testArray) where each int is the number of options available. These options will eventually be built into 300+ million separate PDFs and HTML files. The code currently works, and puts out the giant list of possibilities that I want it to.
My question is this: Is there a better approach to handling this kind of situation? Is there something more elegant or efficient? This is not something that will be run live on an app or anything, it will run from a command line and take all the time it needs, but if there is a better approach for performance or stability I'm all ears.
Things I already know: It can't handle a value of 0 coming out of the array. The array is a constant, so it won't happen by accident. The way the code down the line will handle things, 0 is a nonsensical value to use. Each element represents the number of options available, so 2 is essentially a Boolean, 1 would be false only. So if I needed placeholder elements for future expansion, they would be a value of 1 and show up as a 0 in the output.
Also, the final product will not just barf text to the console as output, it will write a file in the permutationEnding() function based on the currentOptions array.
let optionsArray: [Int] = [7,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,3,2,2,2,2,2,2,2,2]
let testArray: [Int] = [7,2,3,2]
var currentOptions: [Int] = []
var outputString: String = ""
func buildPermutations(array: Array<Int>) {
currentOptions.removeAll()
permutationRecursion(array: array, index: 0)
}
func permutationRecursion(array: Array<Int>, index: Int) {
for i in 1 ... array[index] {
currentOptions.append(Int(i-1))
if array.count > (index + 1) {
permutationRecursion(array: array, index: index + 1)
} else {
permutationEnding()
}
currentOptions.removeLast()
}
}
func permutationEnding() {
for i in 1 ... currentOptions.count { // Output Elements
outputString += String(currentOptions[i-1])
}
outputString += "\n" // Goes after output elements closing bracket.
}
// buildPermutations(array: optionsArray)
buildPermutations(array: testArray)
print(outputString)
Thoughts?
I think I've figured out what you're trying to do. You want a string output of every possible integer combination that could map all possible routes on the decision tree.
I got it down to four or five lines.
let n = testArray.count // for readability
let products = ([Int](1...n)).map({testArray[$0..<n].reduce(1, *)})
// products is the cross product of element i + 1 to element n of the array for all i in the array
let zipped = zip(testArray, products)
for i in 0..<testArray.reduce(1, *) { // this reduce is the cross product of the whole array
let treePath = zipped.map(){ String(i / $0.1 % $0.0) }.joined()
outputString += treePath + "\n"
}
One more edit: I think this might be faster with some fancy matrix operations like NumPy. I wonder if the Accelerate framework could do some magic for you, but I have not worked with it.
edit: I was curious so I timed it with this test array
let testArray: [Int] = [7,2,2,2,2,2,2,3,2]
The recursive function in the question was: 132.56 s
The zip-map here was: 14.44 s
And it appears to be exponential as I add elements to the test array.

Strange values from vDSP_meanD

I am using the vDSP_meanD function to determine the average of a data set (consecutive diferences from an array)
The code I am using is below
func F(dataAllFrames:[Double],std:Double,medida:String)->Double{
let nframes=dataAllFrames.count
var diferencas_consecutivas_media = [Double](count: dataAllFrames.count-1, repeatedValue:0.0)
var mediaDifConseq:Double = 0
for(var i:Int=1; i<dataAllFrames.count; i++){
diferencas_consecutivas_media[i-1]=dataAllFrames[i]-dataAllFrames[i-1]
}
var meanConseqDif = [Double](count: 1, repeatedValue:0.0)
var meanConseqDifPtr = UnsafeMutablePointer<Double>(meanConseqDif)
vDSP_meanvD(diferencas_consecutivas_media,1,meanConseqDifPtr,UInt(nframes))
print( meanConseqDif[0])
}
The function F is called within a thread block
let group = dispatch_group_create()
let queue = dispatch_queue_create("myqueue.data.processor", DISPATCH_QUEUE_CONCURRENT)
dispatch_group_async(group, queue) {
F(measureData,std: std, medida: medida)
}
The F function is called in multiple dispatch block with different variables instances every now and then i get different values for the value returned from vDSP_meanD is there any context where this may happen ?
May the thread call have some influence on that?
Any "lights" would be greatly appreciated
I wouldn't expect this code to work. This shouldn't be correct:
var meanConseqDif = [Double](count: 1, repeatedValue:0.0)
var meanConseqDifPtr = UnsafeMutablePointer<Double>(meanConseqDif)
vDSP_meanvD(diferencas_consecutivas_media,1,meanConseqDifPtr,UInt(nframes))
I believe this is pointing directly at the Array struct, so you're probably blowing away the metadata rather than updating the value you meant. But I would expect that you don't get the right answers at all in that case. Have you validated that your results are correct usually?
I think the code you mean is like this:
func F(dataAllFrames: [Double], std: Double, medida: String) -> Double {
let nframes = UInt(dataAllFrames.count)
var diferencas_consecutivas_media = [Double](count: dataAllFrames.count-1, repeatedValue:0.0)
for(var i = 1; i < dataAllFrames.count; i += 1) {
diferencas_consecutivas_media[i-1] = dataAllFrames[i] - dataAllFrames[i-1]
}
var mediaDifConseq = 0.0
vDSP_meanvD(diferencas_consecutivas_media, 1, &mediaDifConseq, nframes)
return mediaDifConseq
}
You don't need an output array to collect a single result. You can just use a Double directly, and use & to take an unsafe pointer to it.
Unrelated point, but you can get rid of all of the difference-generating code with a single zip and map:
let diferencasConsecutivasMedia = zip(dataAllFrames, dataAllFrames.dropFirst())
.map { $1 - $0 }
I haven't profiled these two approaches, though. It's possible that your approach is faster. I find the zip and map much clearer and less error-prone, but others may feel differently.

Unwrapping optional inside of closure using reduce

I have a quick question that is confusing me a little bit. I made a simple average function that takes an array of optional Ints. I check to make sure the array does not contain a nil value but when I use reduce I have to force unwrap one of the two elements in the closure. Why is it that I only force unwrap the second one (in my case $1!)
func average2(array: [Int?]) -> Double? {
let N = Double(array.count)
guard N > 0 && !array.contains({$0 == nil}) else {
return nil
}
let sum = Double(array.reduce(0) {$0+$1!})
let average = sum / N
return average
}
I know it is simple but I would like to understand it properly.
The first parameter of reduce is the sum, which is 0 in the beginning. The second one is the current element of your array which is an optional Int and therefore has to be unwrapped.
Your invocation of reduce does this:
var sum = 0 // Your starting value (an Int)
for elem in array {
sum = sum + elem! // This is the $0 + $1!
}
EDIT: I couldn't get a more functional approach than this to work:
func average(array: [Int?]) -> Double? {
guard !array.isEmpty else { return nil }
let nonNilArray = array.flatMap{ $0 }
guard nonNilArray.count == array.count else { return nil }
return Double(nonNilArray.reduce(0, combine: +)) / Double(nonNilArray.count)
}
You can also discard the second guard if you want something like average([1, 2, nil]) to return 1.5 instead of nil

Process Array in parallel using GCD

I have a large array that I would like to process by handing slices of it to a few asynchronous tasks. As a proof of concept, I have the written the following code:
class TestParallelArrayProcessing {
let array: [Int]
var summary: [Int]
init() {
array = Array<Int>(count: 500000, repeatedValue: 0)
for i in 0 ..< 500000 {
array[i] = Int(arc4random_uniform(10))
}
summary = Array<Int>(count: 10, repeatedValue: 0)
}
func calcSummary() {
let group = dispatch_group_create()
let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)
for i in 0 ..< 10 {
dispatch_group_async(group, queue, {
let base = i * 50000
for x in base ..< base + 50000 {
self.summary[i] += self.array[x]
}
})
}
dispatch_group_notify(group, queue, {
println(self.summary)
})
}
}
After init(), array will be initialized with random integers between 0 and 9.
The calcSummary function dispatches 10 tasks that take disjoint chunks of 50000 items from array and add them up, using their respective slot in summary as an accummulator.
This program crashes at the self.summary[i] += self.array[x] line. The error is:
EXC_BAD_INSTRUCTION (code = EXC_I386_INVOP).
I can see, in the debugger, that it has managed to iterate a few times before crashing, and that the variables, at the time of the crash, have values within correct bounds.
I have read that EXC_I386_INVOP can happen when trying to access an object that has already been released. I wonder if this has anything to do with Swift making a copy of the array if it is modified, and, if so, how to avoid it.
This is a slightly different take on the approach in #Eduardo's answer, using the Array type's withUnsafeMutableBufferPointer<R>(body: (inout UnsafeMutableBufferPointer<T>) -> R) -> R method. That method's documentation states:
Call body(p), where p is a pointer to the Array's mutable contiguous storage. If no such storage exists, it is first created.
Often, the optimizer can eliminate bounds- and uniqueness-checks within an array algorithm, but when that fails, invoking the same algorithm on body's argument lets you trade safety for speed.
That second paragraph seems to be exactly what's happening here, so using this method might be more "idiomatic" in Swift, whatever that means:
func calcSummary() {
let group = dispatch_group_create()
let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)
self.summary.withUnsafeMutableBufferPointer {
summaryMem -> Void in
for i in 0 ..< 10 {
dispatch_group_async(group, queue, {
let base = i * 50000
for x in base ..< base + 50000 {
summaryMem[i] += self.array[x]
}
})
}
}
dispatch_group_notify(group, queue, {
println(self.summary)
})
}
When you use the += operator, the LHS is an inout parameter -- I think you're getting race conditions when, as you mention in your update, Swift moves around the array for optimization. I was able to get it to work by summing the chunk in a local variable, then simply assigning to the right index in summary:
func calcSummary() {
let group = dispatch_group_create()
let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)
for i in 0 ..< 10 {
dispatch_group_async(group, queue, {
let base = i * 50000
var sum = 0
for x in base ..< base + 50000 {
sum += self.array[x]
}
self.summary[i] = sum
})
}
dispatch_group_notify(group, queue, {
println(self.summary)
})
}
You can also use concurrentPerform(iterations: Int, execute work: (Int) -> Swift.Void) (since Swift 3).
It has a much simpler syntax and will wait for all threads to finalise before returning.:
DispatchQueue.concurrentPerform(iterations: iterations) { i in
performOperation(i)
}
I think Nate is right: there are race conditions with the summary variable. To fix it, I used summary's memory directly:
func calcSummary() {
let group = dispatch_group_create()
let queue = dispatch_get_global_queue(QOS_CLASS_USER_INITIATED, 0)
let summaryMem = UnsafeMutableBufferPointer<Int>(start: &summary, count: 10)
for i in 0 ..< 10 {
dispatch_group_async(group, queue, {
let base = i * 50000
for x in base ..< base + 50000 {
summaryMem[i] += self.array[x]
}
})
}
dispatch_group_notify(group, queue, {
println(self.summary)
})
}
This works (so far).
EDIT
Mike S has a very good point, in his comment below. I have also found this blog post, which sheds some light on the problem.
Any solution that assigns the i'th element of the array concurrently risks race condition (Swift's array is not thread-safe). On the other hand, dispatching to the same queue (in this case main) before updating solves the problem but results in a slower performance overall. The only reason I see for taking either of these two approaches is if the array (summary) cannot wait for all concurrent operations to finish.
Otherwise, perform the concurrent operations on a local copy and assign it to summary upon completion. No race condition, no performance hit:
Swift 4
func calcSummary(of array: [Int]) -> [Int] {
var summary = Array<Int>.init(repeating: 0, count: array.count)
let iterations = 10 // number of parallel operations
DispatchQueue.concurrentPerform(iterations: iterations) { index in
let start = index * array.count / iterations
let end = (index + 1) * array.count / iterations
for i in start..<end {
// Do stuff to get the i'th element
summary[i] = Int.random(in: 0..<array.count)
}
}
return summary
}
I've answered a similar question here for simply initializing an array after computing on another array