What's the diff between reduceLeft and reduceRight in Scala? - scala

What's the diff between reduceLeft and reduceRight in Scala?
val list = List(1, 0, 0, 1, 1, 1)
val sum1 = list reduceLeft {_ + _}
val sum2 = list reduceRight {_ + _}
println { sum2 == sum2 }
In my snippet sum1 = sum2 = 4, so the order does not matter here.

When do they produce the same result
As Lionel already pointed out, reduceLeft and reduceRight only produce the same result if the function you are using to combine the elements is associative (this isn't always true, see my note at the bottom). For instance when running reduceLeft and reduceRight on Seq(1,2,3) with the function (a: Int, b: Int) => a - b you get a different result.
scala> Seq(1,2,3)
res0: Seq[Int] = List(1, 2, 3)
scala> res0.reduceLeft(_ - _)
res5: Int = -4
scala> res0.reduceRight(_ - _)
res6: Int = 2
Why this happens can be made clear if we look at how each of the functions is applied over the list.
For reduceRight this is what the calls look like if we were to unwrap them.
(1 - (2 - 3))
(1 - (-1))
2
For reduceLeft the sequence is built up starting from the left,
((1 - 2) - 3)
((-1) - 3)
(-4)
Tail Recursion
Further because reduceLeft is implemented using Tail Recursion, it will not stack overflow when operating on very large collections (possibly even infinite). reduceRight is not tail recursive, so given a collection of large enough size, it will produce a stack overflow.
For instance, on my machine if I run the following I get an Out of Memory error,
scala> (0 to 100000000).reduceRight(_ - _)
java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.lang.Integer.valueOf(Integer.java:832)
at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:65)
at scala.collection.immutable.Range.apply(Range.scala:61)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:65)
at scala.collection.Iterator$class.foreach(Iterator.scala:742)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1194)
at scala.collection.TraversableOnce$class.reversed(TraversableOnce.scala:99)
at scala.collection.AbstractIterator.reversed(Iterator.scala:1194)
at scala.collection.TraversableOnce$class.reduceRight(TraversableOnce.scala:197)
at scala.collection.AbstractIterator.reduceRight(Iterator.scala:1194)
at scala.collection.IterableLike$class.reduceRight(IterableLike.scala:85)
at scala.collection.AbstractIterable.reduceRight(Iterable.scala:54)
... 20 elided
But if I compute with reduceLeft I don't get the OOM,
scala> (0 to 100000000).reduceLeft(_ - _)
res16: Int = -987459712
You might get slightly different results on your system, depending your JVM default memory settings.
Prefer left versions
So, because of tail recursion, if you know that reduceLeft and reduceRight will produce the same value, you should prefer the reduceLeft variant. This generally holds true of the other left/right functions, such as foldRight and foldLeft (which are just more general versions of reduceRight and reduceLeft).
When do they really always produce the same result
A small note about reduceLeft and reduceRight and the Associative Property of the function you are using. I said that reduceRight and reduceLeft only produce the same results if the operator is associative. This isn't always true for all collection types. That is somewhat of another topic though, so consult the ScalaDoc, but in short the function you are reducing with needs to be both commutative and associative in order to get the same results for all collection types.

Reduce left doesn't always equal the same result as reduce right. Consider an asymmetric function on your array.
Assuming the same result, performance is one obvious difference
See performance-characteristics
The data structure is build with constant access time to head and tail. Iterating backwards will perform worse for large lists.

The best way to know the difference is read the source code in library/scala/collection/LinearSeqOptimized.scala:
def reduceLeft[B >: A](f: (B, A) => B): B =
......
tail.foldLeft[B](head)(f)
def reduceRight[B >: A](op: (A, B) => B): B =
......
op(head, tail.reduceRight(op))
def foldLeft[B](z: B)(f: (B, A) => B): B = {
var acc = z
var these = this
while (!these.isEmpty) {
acc = f(acc, these.head)
these = these.tail
}
acc
}
The above is some key part of the code, and you can see reduceLeft is based on foldLeft , while reduceRight is implemented via recursion.
I guess reduceLeft has a better performance.

Related

Function takes two List[Int] arguments and produces a List[Int]. SCALA [duplicate]

This question already has answers here:
Scala - Combine two lists in an alternating fashion
(4 answers)
Closed 3 years ago.
The elements of the resulting list should alternate between the elements of the arguments. Assume that the two arguments have the same length.
USE RECURSION
My code as follows
val finalString = new ListBuffer[Int]
val buff2= new ListBuffer[Int]
def alternate(xs:List[Int], ys:List[Int]):List[Int] = {
while (xs.nonEmpty) {
finalString += xs.head + ys.head
alternate(xs.tail,ys.tail)
}
return finalString.toList
}
EXPECTED RESULT:
alternate ( List (1 , 3, 5) , List (2 , 4, 6)) = List (1 , 2, 3, 4, 6)
As far for the output, I don't get any output. The program is still running and cannot be executed.
Are there any Scala experts?
There are a few problems with the recursive solutions suggested so far (including yours, which would actually work, if you replace while with if): appending to end of list is a linear operation, making the whole thing quadratic (taking a .length of a list too, as well ас accessing elements by index), don't do that; also, if the lists are long, a recursion may require a lot of space on the stack, you should be using tail-recursion whenever possible.
Here is a solution that is free of both those problems: it builds the output backwards, by prepending elements to the list (constant time operation) rather than appending them, and reverses the result at the end. It is also tail-recursive: the recursive call is the last operation in the function, which allows the compiler to convert it into a loop, so that it will only use a single stack frame for execution regardless of the size of the lists.
#tailrec
def alternate(
a: List[Int],
b: List[Int],
result: List[Int] = Nil
): List[Int] = (a,b) match {
case (Nil, _) | (_, Nil) => result.reversed
case (ah :: at, bh :: bt) => alternate(at, bt, bh :: ah :: result)
}
(if the lists are of different lengths, the whole thing stops when the shortest one ends, and whatever is left in the longer one is thrown out. You may want to modify the first case (split it into two, perhaps) if you desire a different behavior).
BTW, your own solution is actually better than most suggested here: it is actually tail recursive (or rather can be made one if you add else after your if, which is now while), and appending to ListBuffer isn't actually as bad as to a List. But using mutable state is generally considered "code smell" in scala, and should be avoided (that's one of the main ideas behind using recursion instead of loops in the first place).
Condition xs.nonEmpty is true always so you have infinite while loop.
Maybe you meant if instead of while.
A more Scala-ish approach would be something like:
def alternate(xs: List[Int], ys: List[Int]): List[Int] = {
xs.zip(ys).flatMap{case (x, y) => List(x, y)}
}
alternate(List(1,3,5), List(2,4,6))
// List(1, 2, 3, 4, 5, 6)
A recursive solution using match
def alternate[T](a: List[T], b: List[T]): List[T] =
(a, b) match {
case (h1::t1, h2::t2) =>
h1 +: h2 +: alternate(t1, t2)
case _ =>
a ++ b
}
This could be more efficient at the cost of clarity.
Update
This is the more efficient solution:
def alternate[T](a: List[T], b: List[T]): List[T] = {
#annotation.tailrec
def loop(a: List[T], b: List[T], res: List[T]): List[T] =
(a, b) match {
case (h1 :: t1, h2 :: t2) =>
loop(t1, t2, h2 +: h1 +: res)
case _ =>
a ++ b ++ res.reverse
}
loop(a, b, Nil)
}
This retains the original function signature but uses an inner function that is an efficient, tail-recursive implementation of the algorithm.
You're accessing variables from outside the method, which is bad. I would suggest something like the following:
object Main extends App {
val l1 = List(1, 3, 5)
val l2 = List(2, 4, 6)
def alternate[A](l1: List[A], l2: List[A]): List[A] = {
if (l1.isEmpty || l2.isEmpty) List()
else List(l1.head,l2.head) ++ alternate(l1.tail, l2.tail)
}
println(alternate(l1, l2))
}
Still recursive but without accessing state from outside the method.
Assuming both lists are of the same length, you can use a ListBuffer to build up the alternating list. alternate is a pure function:
import scala.collection.mutable.ListBuffer
object Alternate extends App {
def alternate[T](xs: List[T], ys: List[T]): List[T] = {
val buffer = new ListBuffer[T]
for ((x, y) <- xs.zip(ys)) {
buffer += x
buffer += y
}
buffer.toList
}
alternate(List(1, 3, 5), List(2, 4, 6)).foreach(println)
}

Fold method using List as accumulator

To find prime factors of a number I was using this piece of code :
def primeFactors(num: Long): List[Long] = {
val exists = (2L to math.sqrt(num).toLong).find(num % _ == 0)
exists match {
case Some(d) => d :: primeFactors(num/d)
case None => List(num)
}
}
but this I found a cool and more functional approach to solve this using this code:
def factors(n: Long): List[Long] = (2 to math.sqrt(n).toInt)
.find(n % _ == 0).fold(List(n)) ( i => i.toLong :: factors(n / i))
Earlier I was using foldLeft or fold simply to get sum of a list or other simple calculations, but here I can't seem to understand how fold is working and how this is breaking out of the recursive function.Can somebody plz explain how fold functionality is working here.
Option's fold
If you look at the signature of Option's fold function, it takes two parameters:
def fold[B](ifEmpty: => B)(f: A => B): B
What it does is, it applies f on the value of Option if it is not empty. If Option is empty, it simply returns output of ifEmpty (this is termination condition for recursion).
So in your case, i => i.toLong :: factors(n / i) represents f which will be evaluated if Option is not empty. While List(n) is termination condition.
fold used for collection / iterators
The other fold that you are taking about for getting sum of collection, comes from TraversableOnce and it has signature like:
def foldLeft[B](z: B)(op: (B, A) => B): B
Here, z is starting value (suppose incase of sum it's 0) and op is associative binary operator which is applied on z and each value of collection from left to right.
So both folds differ in their implementation.

What is point of /: function?

Documentation for /: includes
Note: might return different results for different runs, unless the underlying collection type is ordered or the operator
is associative and commutative.
( src)
This just applies if the par version of this function is run, otherwise the result is deterministic (same as foldLeft) ?
Also this function is calling foldLeft under the hood : def /:[B](z: B)(op: (B, A) => B): B = foldLeft(z)(op)
Their function definitions are same (except for function param label, "op" instad of "f") :
def /:[B](z: B)(op: (B, A) ⇒ B): B
def foldLeft[B](z: B)(f: (B, A) ⇒ B): B
For these reasons what is point of /: function and when should it be used in favour of foldLeft ?
Is my reasoning incorrect ?
It's just an alternative syntax. Methods ending in : are called on the right hand side.
Instead of
list.foldLeft(0) { op(_, _) }
or
list./:(0) { op(_, _) }
you can
( z /: list ) { op(_, _) }
For example,
scala> val a = List(1,2,3,4)
a: List[Int] = List(1, 2, 3, 4)
scala> ( 0 /: a ) { _ + _ }
res5: Int = 10
Yes, those are aliases originating from dark times when people liked their operators like this:
val x = y |#<#|: z.
The point of the note is to remind that for collections with unspecified iteration order the result of folds might differ. Consider having a Set {1,2,3} that doesn't guarantee the same access order even if left unmodified, and applying an operation that is not e. g. associative (like /). Even if run not after par call, this might result in the following (pseudocode):
{1,2,3} foldLeft / ==> (1 / 2) / 3 ==> 1/6 = 0.1(6)
{3,1,2} foldLeft / ==> (3 / 1) / 2 ==> 3/2 = 1.5
In terms of consistency this is similar to applying non-parallelizable operations to parallel collections, though.

Composing Functions Leads to StackOverflow

Functional Programming in Scala lists the following example as to how composing functions can lead to a StackOverflowError.
scala> val f = (x: Int) => x
f: Int => Int = <function1>
scala> val g = List.fill(100000)(f).foldLeft(f)(_ compose _)
g: Int => Int = <function1>
scala> g(42)
java.lang.StackOverflowError
As the book explains, g is a composite function that has 100,000 functions where each one calls the next.
Since foldLeft is tail-recursive, why does this StackOverflowError occur? How, if at all, are tail-recursion and StackOverflow's related?
When (as it's expanded) the second argument (B, A) => B of foldLeft, ((acc, elem) => acc.compose(elem)), doesn't each fold step result in composing only 2 functions?
Since foldLeft is tail-recursive, why does this StackOverflowError occur? How, if at all, are tail-recursion and StackOverflow's related?
When (as it's expanded) the second argument (B, A) => B of foldLeft, ((acc, elem) => acc.compose(elem)), doesn't each fold step result in composing only 2 functions?
Note that the fold itself (i.e. the line val g = ...) doesn't overflow stack. However, g ends up being defined effectively as f(f(...(x))) and therefore you need 100000 stack frames to evaluate g(42) which obviously does overflow.
It's not because of foldLeft or compose itself. It's because g(x) = f(f(f(...(x))).

foldRight Efficiency?

I heard foldLeft is much more efficient in most of operations, but Scala School (from Twitter) gave the following example. Can someone give an analysis of its efficiency and should we achieve the same operation using foldLeft?
val numbers = List(1,2,3,4,5,...10)
def ourMap(numbers: List[Int], fn: Int => Int): List[Int] = {
numbers.foldRight(List[Int]()) { (x: Int, xs: List[Int]) =>
fn(x) :: xs
}
}
scala> ourMap(numbers, timesTwo(_))
res0: List[Int] = List(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
As you can see from the docs, List's foldRight and foldLeft methods are defined in LinearSeqOptimized. So take a look at the source:
override /*TraversableLike*/
def foldLeft[B](z: B)(f: (B, A) => B): B = {
var acc = z
var these = this
while (!these.isEmpty) {
acc = f(acc, these.head)
these = these.tail
}
acc
}
override /*IterableLike*/
def foldRight[B](z: B)(f: (A, B) => B): B =
if (this.isEmpty) z
else f(head, tail.foldRight(z)(f))
So foldLeft uses a while-loop and foldRight uses a simple-recursive method. In particular it is not tail-recursive. Thus foldRight has the overhead of creating new stack frames and has a tendency to overflow the stack if you try it on a long list (try, for example ((1 to 10000).toList :\ 0)(_+_). Boom! But it works fine without the toList, because Range's foldRight works by reversing the folding left).
So why not always use foldLeft? For linked lists, a right fold is arguably a more natural function, because linked lists need to be built in reverse order. You can use foldLeft for your method above, but you need to reverse the output at the end. (Do not try appending to Lists in a left fold, as the complexity is O(n-squared).)
As for which is quicker in practice, foldRight or foldLeft + reverse, I ran a simple test and foldRight is faster for Lists by between 10 and 40 %. This must be why List's foldRight is implemented the way it is.
foldRight reverses the list and applies foldLeft.