Scala: fold vs foldLeft

I am trying to understand how fold and foldLeft and the respective reduce and reduceLeft work. I used fold and foldLeft as my example
scala> val r = List((ArrayBuffer(1, 2, 3, 4),10))
scala> r.foldLeft(ArrayBuffer(1,2,4,5))((x,y) => x -- y._1)
res28: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(5)
scala> r.fold(ArrayBuffer(1,2,4,5))((x,y) => x -- y._1)
<console>:11: error: value _1 is not a member of Serializable with Equals
r.fold(ArrayBuffer(1,2,4,5))((x,y) => x -- y._1)
Why didn't fold work like foldLeft? What is Serializable with Equals? I understand that fold and foldLeft have slightly different API signatures in terms of their generic type parameters. Please advise. Thanks.

The method fold (originally added for parallel computation) is less powerful than foldLeft in terms of types it can be applied to. Its signature is:
def fold[A1 >: A](z: A1)(op: (A1, A1) => A1): A1
This means that the type over which the folding is done has to be a supertype of the collection element type. Compare this with foldLeft, whose result type B is unconstrained:
def foldLeft[B](z: B)(op: (B, A) => B): B
The reason is that fold can be implemented in parallel, while foldLeft cannot. This is not only because of the *Left part, which implies that foldLeft goes from left to right sequentially, but also because the operator op cannot combine results computed in parallel -- it only defines how to combine the aggregation type B with the element type A, not how to combine two aggregations of type B. The fold method, in turn, does define this, because the aggregation type A1 has to be a supertype of the element type A, that is A1 >: A. This supertype relationship allows a single operator both to fold elements into the aggregation and to combine two aggregations.
But this supertype relationship between the aggregation and the element type also means that the aggregation type A1 in your example must be a supertype of (ArrayBuffer[Int], Int). Since the zero element of your aggregation is ArrayBuffer(1, 2, 4, 5), of type ArrayBuffer[Int], the aggregation type is inferred to be a supertype of both of these -- and that's Serializable with Equals, the least upper bound of a tuple and an array buffer.
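The same inference can be seen in a smaller setting. The following is only a minimal sketch; the object name FoldTypes and the sample values are made up for illustration and do not come from the question. The accumulator type of fold is pinned to Any explicitly so the example does not rely on what a particular compiler version would infer.

```scala
object FoldTypes {
  val xs = List(1, 2, 3)

  // fold: the accumulator type A1 must be a supertype of the element type Int.
  val total: Int = xs.fold(0)(_ + _)

  // A String zero forces A1 up to a common supertype of Int and String.
  // It is written out as Any here instead of being left to inference.
  val viaFold: Any = xs.fold[Any]("")((a, b) => a.toString + b.toString)

  // foldLeft: the accumulator type B is independent of the element type.
  val viaFoldLeft: String = xs.foldLeft("")((acc, n) => acc + n)
}
```

Inside the fold operator both arguments are only known to be Any, which is the same situation that made y._1 unavailable in the question.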
In general, if you want to allow parallel folding for arbitrary types (which is done out of order) you have to use the method aggregate which requires defining how two aggregations are combined. In your case:
r.aggregate(ArrayBuffer(1, 2, 4, 5))({ (x, y) => x -- y._1 }, (x, y) => x intersect y)
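To make the two roles of the operators concrete, here is a rough sequential sketch of what an aggregate-style method has to do. The helper aggregateChunks is hypothetical (it is not a standard library method), and it assumes that z is neutral with respect to combop, since z is used once per chunk.

```scala
object AggregateSketch {
  // seqop folds elements into a partial result of type B;
  // combop merges two partial results, which foldLeft alone cannot express.
  def aggregateChunks[A, B](chunks: List[List[A]], z: B)(seqop: (B, A) => B, combop: (B, B) => B): B =
    chunks.map(_.foldLeft(z)(seqop)).reduceOption(combop).getOrElse(z)

  // Each inner list stands for a chunk that could be processed independently.
  val words = List(List("ab", "c"), List("def"))
  val totalLength: Int = aggregateChunks(words, 0)((n, s) => n + s.length, _ + _)
}
```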
Btw, try writing your example with reduce/reduceLeft -- because of the supertype relationship between the element type and the aggregation type that both these methods have, you will find that it leads to a similar error as the one you've described.

Related

Why does this function parameter need double braces on a tuple?

Note: I am using scalac. Please do not recommend to use sbt instead.
I ran into a peculiar issue that I could resolve, but I am wondering why it works that way and not the way I did it before. Here's a code snippet:
def multiply[A](r1: Vector[A], r2: Vector[A], multOp: (A,A) => A, sumOp: (A, A) => A) =
r1.zip(r2).map(multOp).reduce(sumOp)
It does not compile, resulting in an error message like:
Error:(73, 20) type mismatch;
found : (A, A) => A
required: ((A, A)) => ?
r1.zip(r2).map(multOp).reduce(sumOp)
Changing the snippet to:
def multiply[A](r1: Vector[A], r2: Vector[A], multOp: ((A,A)) => A, sumOp: (A, A) => A) =
r1.zip(r2).map(multOp).reduce(sumOp)
will resolve the issue.
Note that sumOp does work with only one pair of braces.
Why?
Method map is defined as taking a single parameter, and (A, A) => A has two. By converting two parameters of type A into one parameter which is a tuple of type (A, A), it compiles.
(A, A) => A // fails due to two params of type A
((A, A)) => A // works due to one param of type (A, A)
On the other hand, reduce is defined as taking two parameters of the same type, so it's happy to take sumOp which matches that description.
Here are the full signatures found in TraversableLike and TraversableOnce respectively:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That
def reduce[A1 >: A](op: (A1, A1) => A1): A1
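If you would rather keep multOp as a two-parameter function, one option is to adapt it at the call site with Function2's tupled method, which turns (A, A) => A into ((A, A)) => A. A minimal sketch follows; the object name TupledDemo is made up for illustration, and the type argument is given explicitly so the call does not depend on inference within a single argument list.

```scala
object TupledDemo {
  // multOp keeps its two-parameter shape; .tupled adapts it for map,
  // which expects a one-parameter function over the zipped pairs.
  def multiply[A](r1: Vector[A], r2: Vector[A], multOp: (A, A) => A, sumOp: (A, A) => A): A =
    r1.zip(r2).map(multOp.tupled).reduce(sumOp)

  // Dot product: 1*4 + 2*5 + 3*6
  val dot: Int = multiply[Int](Vector(1, 2, 3), Vector(4, 5, 6), _ * _, _ + _)
}
```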
EDIT (extra info):
The reason for this is that reduce always takes a function of arity 2 (that is, a function of two parameters) in order to reduce the collection to a single value by repeatedly applying that function to the result of the previous application and the next value. map, on the other hand, always takes a function of arity 1 and maps the underlying value with it. In the case of Option, Future, etc. there is only a single underlying value, while in the case of a Vector (like yours) there can be many, so the function is applied to every element of the collection.
In some libraries you might come across map2 which takes a two-parameter function. For example, in order to combine two Options (actually, any applicative functors, but let's leave theory aside), you might do:
// pseudocode
Option(1, 2).map2((a, b) => a + b)
which would give you an Option(3). I think this mechanism has been dropped in favour of the more easily understandable map + product:
// pseudocode
(Option(1) product Option(2)) map ((a, b) => a + b)
Actual scalaz syntax for the line above would be (in Scalaz 7):
(Option(1) |@| Option(2))((a, b) => a + b)
The two are equally powerful (whatever one can do, the other can do as well, no more, no less), so the latter is usually preferred and is sometimes the only one provided, but yes, you might come across map2 from time to time.
Alright, that's a bit of extra info. As far as map is concerned, just remember there's always just one single parameter coming in and one value coming out.
The first snippet will work, if you define it as:
def multiply[A](r1: Vector[A], r2: Vector[A], multOp: (A,A) => A, sumOp: (A, A) => A) =
(r1, r2).zipped.map(multOp).reduce(sumOp)
The map method of a zipped tuple takes a function with two arguments (A, A) => B, because that's the expected usage pattern.
This approach also avoids the creation of an intermediate Vector[(A, A)].
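On Scala 2.13 and later, zipped on tuples is deprecated in favour of lazyZip, whose map likewise takes a two-parameter function. A minimal sketch, assuming Scala 2.13 or newer (the object name LazyZipDemo is made up for illustration):

```scala
object LazyZipDemo {
  // lazyZip pairs the elements lazily, and its map accepts (A, A) => B directly,
  // so multOp needs no tupling and no intermediate Vector[(A, A)] is built.
  def multiply[A](r1: Vector[A], r2: Vector[A], multOp: (A, A) => A, sumOp: (A, A) => A): A =
    r1.lazyZip(r2).map(multOp).reduce(sumOp)

  // Dot product: 1*4 + 2*5 + 3*6
  val dot: Int = multiply[Int](Vector(1, 2, 3), Vector(4, 5, 6), _ * _, _ + _)
}
```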

What's the difference between foldRight and foldLeft in concat

Why can't I use foldLeft in the following code:
def concatList[T](xs: List[T],ys:List[T]): List[T]=
(xs foldLeft ys)(_::_)
Actually, it is hard for me to understand the differences between foldRight and foldLeft. Are there any examples that illustrate the real differences?
Thanks.
Well, you can,
scala> def concatList[T](xs: List[T],ys:List[T]) =
(xs foldLeft ys)( (a, b) => b :: a )
concatList: [T](xs: List[T], ys: List[T])List[T]
scala> concatList(List(1,2,3), List(6,7,8))
res0: List[Int] = List(3, 2, 1, 6, 7, 8)
Was that the result you were expecting? I don't think so.
First let's look at the signature of the folds and :: (only a simplification for illustrative purposes, but fits perfectly in our case) :
given a List[T]
def ::(v:T): List[T] // This is a right associative method, more below
def foldLeft[R](r:R)(f: (R,T) => R):R
def foldRight[R](r:R)(f: (T,R) => R):R
Now, applying the first argument list of foldLeft, we get xs.foldLeft(ys). Unifying the types in our foldLeft signature with this sample call:
List[T] : List[Int], therefore T : Int and R : List[Int], which applied to the foldLeft signature gives
foldLeft[List[Int]](r:List[Int])( f:(List[Int],Int) => List[Int] )
Now, for the usage of ::, a :: b compiles to b.::(a); Scala refers to this as a right-associative method. This is special syntactic sugar for methods whose names end in :, and it is quite convenient when defining a list: 1 :: 2 :: Nil is like writing Nil.::(2).::(1).
Continuing our instantiation of foldLeft, the function we need to pass has to look like this: (List[Int],Int) => List[Int]. Consider (a,b) => a :: b. If we unify that with the type of f, we get:
a : List[Int] and b : Int. Compare that with the signature of a2 :: b2, where a2 : Int and b2 : List[Int]. For this to compile, a must have the same type as a2, and b the same type as b2. They don't!
Notice that in my example I inverted the arguments, making a match the type of b2 and b match the type of a2.
I will offer yet another version that compiles:
def concatList[T](xs: List[T],ys:List[T]) = (xs foldLeft ys)( _.::(_) )
To cut the story short, look at foldRight signature
def foldRight[R](r:R)(f: (T,R) => R):R
The arguments are already inverted, so making f = _ :: _ gives us the right types.
Wow, that was a lot of explanation about type inference and I am short on time, but I still owe an explanation of the difference in meaning between fold left and fold right. For now, have a look at https://wiki.haskell.org/Fold, in particular the two diagrams there.
Notice that the arguments to foldl and foldr are in a different order there: they take the function first and then the initial argument (r in the signatures), and instead of :: for list construction Haskell uses just :. Two very small details.
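As a stand-in for the missing explanation, the following sketch shows the difference in how the two folds nest their applications. The object name ConcatDemo and the string traces are made up for illustration.

```scala
object ConcatDemo {
  // foldRight starts from the last element, so consing onto ys keeps the order of xs.
  def concatRight[T](xs: List[T], ys: List[T]): List[T] =
    xs.foldRight(ys)(_ :: _)

  // foldLeft starts from the first element, so consing onto ys reverses xs.
  def concatLeft[T](xs: List[T], ys: List[T]): List[T] =
    xs.foldLeft(ys)((acc, x) => x :: acc)

  val right = concatRight(List(1, 2, 3), List(6, 7, 8)) // List(1, 2, 3, 6, 7, 8)
  val left  = concatLeft(List(1, 2, 3), List(6, 7, 8))  // List(3, 2, 1, 6, 7, 8)

  // Building strings instead of lists makes the nesting visible.
  val traceLeft  = List("a", "b", "c").foldLeft("z")((acc, x) => s"($acc op $x)")
  val traceRight = List("a", "b", "c").foldRight("z")((x, acc) => s"($x op $acc)")
}
```

Here traceLeft is (((z op a) op b) op c), while traceRight is (a op (b op (c op z))).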

Why is fold curried?

Could the start value just be a parameter in the op argument list?
fold is defined on List as
def fold[A1 >: A](z: A1)(op: (A1, A1) ⇒ A1): A1
Folds the elements of this traversable or iterator using the specified associative binary operator.
What would be the implications of defining fold as
def fold[A1 >: A](op: (z:A1,A1, A1) ⇒ A1): A1
So in this version the initial value is passed as a value to the function instead of being curried in a separate parameter list.
If you're looking to motivate that particular signature of foldLeft, it may be worthwhile to first examine reduceLeft.
// Slightly simplified to remove the supertype constraint
def reduceLeft(f: (A, A) => A): A
reduceLeft squishes the entire collection into a single element and it takes as an argument a function that tells it how to squish each new element in the collection onto what it's got so far.
There's, however, a problem. reduceLeft is partial. In particular if the collection is empty, reduceLeft has nowhere to begin squishing things. So we can make it total, by telling reduceLeft where to begin. So we give reduceLeft an additional parameter.
def reduceLeftTotal(initial: A, f: (A, A) => A): A
Note that if we just glommed initial as another argument to f, we wouldn't fix the partiality of reduceLeft. If this is an empty collection, we still blow up.
// This doesn't get us what we want. Where does the initial `A` come from?
def reduceLeftNotWhatWeWant(f: (A, A, A) => A): A
Okay, now that we've got reduceLeftTotal, there's an immediate new avenue for generalization. Why does the thing that we're squishing all the elements of our collection onto have to have the same type as the elements? The answer is it doesn't!
def generalReduceLeftTotal[B](initial: B, f: (B, A) => B): B
Finally because type information in previous argument lists, but not previous arguments in the same list, can be used to help Scala's type inference, we can reduce the amount of explicit type annotations we need by currying.
// And we're back to foldLeft!
def foldLeft[B](initial: B)(f: (B, A) => B): B
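The effect of currying on inference can be tried out with two free-standing helpers. This is only a sketch: foldUncurried and foldCurried are hypothetical names, and the remark about required annotations applies to Scala 2, which is what the answer describes.

```scala
object CurryDemo {
  // Single argument list: on Scala 2 the function's parameter types
  // are not inferred from `initial`, so they are annotated at the call site.
  def foldUncurried[A, B](xs: List[A], initial: B, f: (B, A) => B): B =
    xs.foldLeft(initial)(f)

  // Two argument lists: B is fixed by `initial` before `f` is type-checked.
  def foldCurried[A, B](xs: List[A], initial: B)(f: (B, A) => B): B =
    xs.foldLeft(initial)(f)

  val a: Int = foldUncurried(List(1, 2, 3), 0, (acc: Int, x: Int) => acc + x)
  val b: Int = foldCurried(List(1, 2, 3), 0)(_ + _)

  // The initial value also makes the operation total on empty input.
  val onEmpty: Int = foldCurried(List.empty[Int], 0)(_ + _)
}
```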

type inference in fold left one-liner?

I was trying to reverse a List of Integers as follows:
List(1,2,3,4).foldLeft(List[Int]()){(a,b) => b::a}
My question is: is there a way to specify the seed as some List[_], where the _ is filled in automatically by Scala's type inference, instead of having to spell out the type as List[Int]?
Thanks
Update: After reading a bit more on Scala's type inference, I found a better answer to your question. This article which is about the limitations of the Scala type inference says:
Type information in Scala flows from function arguments to their results [...], from left to right across argument lists, and from first to last across statements. This is in contrast to a language with full type inference, where (roughly speaking) type information flows unrestricted in all directions.
So the problem is that Scala's type inference is rather limited. It first looks at the first argument list (the list in your case) and then at the second argument list (the function). But it does not go back.
This is why neither this
List(1,2,3,4).foldLeft(Nil){(a,b) => b::a}
nor this
List(1,2,3,4).foldLeft(List()){(a,b) => b::a}
will work. Why? First, the signature of foldLeft is defined as:
foldLeft[B](z: B)(f: (B, A) => B): B
So if you use Nil as the first argument z, the compiler will assign Nil.type to the type parameter B. And if you use List(), the compiler will use List[Nothing] for B.
Now, the type of the second argument f is fully defined. In your case, it's either
(Nil.type, Int) => Nil.type
or
(List[Nothing], Int) => List[Nothing]
And in both cases the lambda expression (a, b) => b :: a is not valid, since its return type is inferred to be List[Int].
Note that the quoted passage above says "argument lists" and not "arguments". The article later explains:
Type information does not flow from left to right within an argument list, only from left to right across argument lists.
So the situation is even worse if you have a method with a single argument list.
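Since the element type of the seed has to be stated somewhere, here are a few equivalent ways of writing it. This is a small sketch; the object name SeedDemo is made up for illustration.

```scala
object SeedDemo {
  val xs = List(1, 2, 3, 4)

  // Give the seed an explicit element type...
  val viaEmpty = xs.foldLeft(List.empty[Int])((acc, x) => x :: acc)

  // ...or ascribe a type to Nil...
  val viaAscription = xs.foldLeft(Nil: List[Int])((acc, x) => x :: acc)

  // ...or fix the type parameter B of foldLeft directly.
  val viaTypeArg = xs.foldLeft[List[Int]](Nil)((acc, x) => x :: acc)
}
```

All three produce List(4, 3, 2, 1); none of them avoids mentioning Int, they only move where it is written.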
The only way I know is to wrap it in a generic method:
scala> def foldList[T](l: List[T]) = l.foldLeft(List[T]()){(a,b) => b::a}
foldList: [T](l: List[T])List[T]
scala> foldList(List(1,2,3,4))
res19: List[Int] = List(4, 3, 2, 1)
scala> foldList(List("a","b","c"))
res20: List[java.lang.String] = List(c, b, a)

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
A few things to mention here before giving the actual answer:
Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
The difference is not the implementation at all, just look at the signatures.
The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int], or (List[Int], Int). It calculates the sum and returns a tuple with a list of integers and its sum. By the way, the list is returned backwards, because we used foldLeft instead of foldRight.
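To check the remark about ordering, here is the same method next to a foldRight variant. This is a sketch; listWithSumOrdered and the object name are made up for illustration.

```scala
object ListWithSumDemo {
  // foldLeft visits elements front to back, so prepending reverses the list.
  def listWithSum(numbers: List[Int]): (List[Int], Int) =
    numbers.foldLeft((List.empty[Int], 0)) { (acc, n) =>
      (n :: acc._1, n + acc._2)
    }

  // foldRight visits elements back to front, so prepending keeps the order.
  // Note the swapped parameter order of the function: (element, accumulator).
  def listWithSumOrdered(numbers: List[Int]): (List[Int], Int) =
    numbers.foldRight((List.empty[Int], 0)) { (n, acc) =>
      (n :: acc._1, n + acc._2)
    }
}
```

For List(1, 2, 3) the first returns (List(3, 2, 1), 6) and the second (List(1, 2, 3), 6).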
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, reduceLeft will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList.foldLeft("") { (a, b) => a + b }
is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
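The three behaviours on an empty list can be compared side by side. This is a minimal sketch; the object name EmptyDemo is made up, and Try is used only to capture the exception so the example runs to completion.

```scala
import scala.util.Try

object EmptyDemo {
  val empty = List.empty[String]

  // foldLeft falls back to the initial value.
  val folded: String = empty.foldLeft("")(_ + _)

  // reduceLeftOption reports the missing result as None.
  val reducedOpt: Option[String] = empty.reduceLeftOption(_ + _)

  // reduceLeft throws UnsupportedOperationException; Try captures it as a Failure.
  val reduced: Try[String] = Try(empty.reduceLeft(_ + _))
}
```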
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op xn
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft returns a value of type U, which is not necessarily the same as the list's element type T, whereas reduceLeft returns a value of the element type T.
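The course definitions above live inside an abstract List class and are not runnable as shown. The following sketch restates them as free-standing functions over the standard List; the object name LectureFolds and the choice of UnsupportedOperationException are assumptions made for this illustration, not the course's code.

```scala
object LectureFolds {
  // Returns z for the empty list, otherwise folds the head into z and recurses.
  @annotation.tailrec
  def foldLeft[T, U](xs: List[T], z: U)(op: (U, T) => U): U = xs match {
    case Nil     => z
    case y :: ys => foldLeft(ys, op(z, y))(op)
  }

  // Uses the head as the initial value, so it has no answer for the empty list.
  def reduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match {
    case Nil     => throw new UnsupportedOperationException("Nil.reduceLeft")
    case y :: ys => foldLeft(ys, y)(op)
  }
}
```

For example, LectureFolds.reduceLeft(List(1, 2, 3))(_ - _) evaluates (1 - 2) - 3, and LectureFolds.foldLeft(List(1, 2, 3), "")(_ + _) builds a String from Ints, which reduceLeft cannot do.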
To really understand what you are doing with fold/reduce, check this: http://wiki.tcl.tk/17983. It is a very good explanation. Once you get the concept of fold, reduce will come together with the answer above:
list.tail.foldLeft(list.head)(_)
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther