Fold left and fold right - scala

I am trying to learn how to use fold left and fold right. This is my first time learning functional programming. I am having trouble understanding when to use fold left and when to use fold right. It seems to me that a lot of the time the two functions are interchangeable. For example (in Scala), the two functions:
val nums = List(1, 2, 3, 4, 5)
val sum1 = nums.foldLeft(0) { (total, n) =>
total + n
}
val sum2 = nums.foldRight(0) {(total, n) =>
total + n
}
both yield the same result. Why and when would I choose one or the other?

foldLeft and foldRight differ in the way the function applications are nested.
foldLeft: (((...) + a) + a) + a
foldRight: a + (a + (a + (...)))
Since the function you are using is addition, both of them give the same result. Try using subtraction and the results will differ.
Moreover, the reason to pick foldLeft or foldRight is usually not the result, since for many common operations both yield the same value. It depends on which way you want your function applications to be nested.
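To make the nesting visible, here is a minimal sketch that builds the parenthesised expression as a string and then evaluates the same folds with subtraction. The object and value names are hypothetical; only the standard library's foldLeft and foldRight on List are used.

```scala
object Nesting {
  val nums = List(1, 2, 3)

  // Render how each fold nests its applications (start value shown as "0").
  val leftShape: String  = nums.foldLeft("0")((acc, n) => s"($acc - $n)")
  val rightShape: String = nums.foldRight("0")((n, acc) => s"($n - $acc)")

  // Evaluate the same shapes with real subtraction.
  val leftValue: Int  = nums.foldLeft(0)(_ - _)  // ((0 - 1) - 2) - 3
  val rightValue: Int = nums.foldRight(0)(_ - _) // 1 - (2 - (3 - 0))
}
```

Printing Nesting.leftShape and Nesting.rightShape shows the two nestings side by side; the numeric results differ because subtraction is not associative.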

The operator you are using is both associative and commutative (for example, a + b = b + a), which is why foldLeft and foldRight give the same result. That is not true in general. The example below uses string concatenation, where '+' is associative but not commutative: 'a' + 'b' != 'b' + 'a'. Because the start value ends up on the left in foldLeft and on the right in foldRight, the two results differ.
val listString = List("a", "b", "c") // : List[String] = List(a,b,c)
val leftFoldValue = listString.foldLeft("z")((acc, el) => acc + el) // : String = zabc
val rightFoldValue = listString.foldRight("z")((el, acc) => el + acc) // : String = abcz
Or, in shorthand:
val leftFoldValue = listString.foldLeft("z")(_ + _) // : String = zabc
val rightFoldValue = listString.foldRight("z")(_ + _) // : String = abcz
Explanation:
foldLeft works as ((('z' + 'a') + 'b') + 'c') = (('za' + 'b') + 'c') = ('zab' + 'c') = 'zabc'
and foldRight as ('a' + ('b' + ('c' + 'z'))) = ('a' + ('b' + 'cz')) = ('a' + 'bcz') = 'abcz'
So in short, for operators that are associative and commutative, foldLeft and foldRight are equivalent (even though there may be a difference in efficiency).
But sometimes only one of the two folds is appropriate.
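As a small check of when the two folds agree, the sketch below folds the same list of strings twice: once with the empty string (the identity for concatenation) and once with "z" as the start value. The object and value names are made up for illustration.

```scala
object FoldAgreement {
  val letters = List("a", "b", "c")

  // Concatenation is associative and "" is its identity, so both folds agree.
  val leftEmpty: String  = letters.foldLeft("")(_ + _)
  val rightEmpty: String = letters.foldRight("")(_ + _)

  // With a non-identity start value the position of "z" differs,
  // because concatenation is not commutative.
  val leftZ: String  = letters.foldLeft("z")(_ + _)
  val rightZ: String = letters.foldRight("z")(_ + _)
}
```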

How to get more functional Scala based code when manipulating string by counting its length?

I am quite new to Scala and functional programming.
I wrote the simple code below, which manipulates a string depending on how many comma-delimited parts it has.
When the 4th comma-delimited part is empty, I concatenate only three columns; otherwise I concatenate all the columns, including the value, as in the code below.
But I don't think it is very functional, because I use an if statement to check whether the input contains a value.
How can I change it into more Scala-like code?
str = "aa,bb,1668268540040,34.0::aa,bb,1668268540040"
val parts = str.split("::")
for (case <- parts) {
val ret = case.map(c => if (c.value.isEmpty) {
c.columnFamily + "," + c.qualifier + "," + c.ts
} else {
c.columnFamily + "," + c.qualifier + "," + c.ts + "," + c.value
})
}
str = "aa,bb,1668268540040,34.0::aa,bb,166826434343"
val parts = str.split("::")
for (part <- parts) {
val elem = part.split(",", 4)
if (elem.length == 4) {
val Array(f, q, t, v) = elem
state.put(f + ":" + q, (v, t.toLong))
} else {
val Array(f, q, t) = elem
state.put(f + ":" + q, ("", t.toLong))
}
}
@LeviRamsey's comment actually tells you everything, but to make your code more "Scala-ish", you should first of all avoid mutable data structures (which is what you're doing with state, which I assume is a mutable Map) and use immutable ones instead. Your if-else part is actually fine in FP, but in Scala you can pattern match on a list rather than checking the length manually and destructuring Arrays. Something like this:
parts.foldLeft(Map.empty[String, (String, Long)]) {
case (state, part) =>
part.split(",", 4).toList match {
case f :: q :: t :: v :: Nil =>
state.updated(f + ":" + q, (v, t.toLong))
case f :: q :: t :: Nil =>
state.updated(f + ":" + q, ("", t.toLong))
case _ => state // or whatever you want to do when the split yields neither 4 nor 3 elements
}
}
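For reference, a self-contained variant of the same idea might look like the sketch below. The object name ParseParts and the method name parse are hypothetical, and it pattern matches directly on the Array returned by split instead of converting to a List; the shape of the state map follows the snippet above.

```scala
object ParseParts {
  // Fold the "::"-separated parts into an immutable Map keyed by "family:qualifier".
  def parse(str: String): Map[String, (String, Long)] =
    str.split("::").foldLeft(Map.empty[String, (String, Long)]) { (state, part) =>
      part.split(",", 4) match {
        case Array(f, q, t, v) => state.updated(f + ":" + q, (v, t.toLong))
        case Array(f, q, t)    => state.updated(f + ":" + q, ("", t.toLong))
        case _                 => state // ignore malformed parts (an assumption)
      }
    }

  val sample: Map[String, (String, Long)] =
    parse("aa,bb,1668268540040,34.0::cc,dd,166826434343")
}
```

Note that if two parts share the same key, the later one replaces the earlier one, just as with state.put in the original loop.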

Scala fold right and fold left

I am trying to learn functional programming and Scala, so I'm reading "Functional Programming in Scala" by Chiusano and Bjarnason. I'm having trouble understanding what the fold left and fold right methods do in the case of a list. I've looked around here but I haven't found anything beginner friendly. The code provided by the book is:
def foldRight[A,B](as: List[A], z: B)(f: (A, B) => B): B = as match {
case Nil => z
case Cons(h, t) => f(h, foldRight(t, z)(f))
}
def foldLeft[A,B](l: List[A], z: B)(f: (B, A) => B): B = l match {
case Nil => z
case Cons(h,t) => foldLeft(t, f(z,h))(f)
}
Where Cons and Nil are:
case class Cons[+A](head: A, tail: List[A]) extends List[A]
case object Nil extends List[Nothing]
So what do fold left and fold right actually do? Why are they needed as "utility" methods? There are many other methods that use them, and I have trouble understanding those as well, since I don't get these two.
In my experience, one of the best ways to build intuition is to see how it works on very simple examples:
List(1, 3, 8).foldLeft(100)(_ - _) == ((100 - 1) - 3) - 8 == 88
List(1, 3, 8).foldRight(100)(_ - _) == 1 - (3 - (8 - 100)) == -94
As you can see, foldLeft/foldRight just passes each element of the list, together with the result of the previous application, to the operation in the second parameter list.
It should also be mentioned that, applied to the same list, these methods are only guaranteed to return equal results if the operation is associative and the start value behaves as its identity (or the operation is also commutative).
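The subtraction example can be replayed against the book-style definitions. The sketch below is an assumption-laden copy: the list type is renamed to MyList, MyCons and MyNil (hypothetical names) so that it does not clash with the standard library, and the two folds are otherwise the same as in the question.

```scala
object FoldSketch {
  sealed trait MyList[+A]
  case object MyNil extends MyList[Nothing]
  final case class MyCons[+A](head: A, tail: MyList[A]) extends MyList[A]

  // Recurses to the end of the list first, then applies f on the way back.
  def foldRight[A, B](as: MyList[A], z: B)(f: (A, B) => B): B = as match {
    case MyNil        => z
    case MyCons(h, t) => f(h, foldRight(t, z)(f))
  }

  // Applies f immediately and carries the accumulator forward (tail recursive).
  @annotation.tailrec
  def foldLeft[A, B](l: MyList[A], z: B)(f: (B, A) => B): B = l match {
    case MyNil        => z
    case MyCons(h, t) => foldLeft(t, f(z, h))(f)
  }

  val xs: MyList[Int] = MyCons(1, MyCons(3, MyCons(8, MyNil)))
  val left: Int  = foldLeft(xs, 100)(_ - _)  // ((100 - 1) - 3) - 8
  val right: Int = foldRight(xs, 100)(_ - _) // 1 - (3 - (8 - 100))
}
```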
Say you have a list of numbers, and you want to add them all up. How would you do that?
You add the first and the second, then take the result of that and add it to the third, take the result of that and add it to the fourth, and so on.
That's what fold lets you do.
List(1,2,3,4,5).foldLeft(0)(_ + _)
The "+" is the function you want to apply, with the first operand being the result of its application to the elements so far, and the second operand being the next element.
As you don't have a "result so far" for the first application, you provide a start value - in this case 0, as it is the identity element for addition.
Say you want to multiply all of your list elements. With fold, that'd be
List(1,2,3,4,5).foldLeft(1)(_ * _)
Fold has its own Wikipedia page you might want to check.
Of course there are also ScalaDoc entries for foldLeft and foldRight.
Another way to visualise foldLeft and foldRight in Scala is through string concatenation, which clearly shows how each one works. See the example below:
val listString = List("a", "b", "c") // : List[String] = List(a,b,c)
val leftFoldValue = listString.foldLeft("z")((acc, el) => acc + el) // : String = zabc
val rightFoldValue = listString.foldRight("z")((el, acc) => el + acc) // : String = abcz
Or, in shorthand:
val leftFoldValue = listString.foldLeft("z")(_ + _) // : String = zabc
val rightFoldValue = listString.foldRight("z")(_ + _) // : String = abcz
Explanation:
foldLeft works as ((('z' + 'a') + 'b') + 'c') = (('za' + 'b') + 'c') = ('zab' + 'c') = 'zabc'
and foldRight as ('a' + ('b' + ('c' + 'z'))) = ('a' + ('b' + 'cz')) = ('a' + 'bcz') = 'abcz'

Strange results when using Scala collections

I have some tests with results that I can't quite explain.
The first test does a filter, map and reduce on a list containing 4 elements:
{
val counter = new AtomicInteger(0)
val l = List(1, 2, 3, 4)
val filtered = l.filter{ i =>
counter.incrementAndGet()
true
}
val mapped = filtered.map{ i =>
counter.incrementAndGet()
i*2
}
val reduced = mapped.reduce{ (a, b) =>
counter.incrementAndGet()
a+b
}
println("counted " + counter.get + " and result is " + reduced)
assert(20 == reduced)
assert(11 == counter.get)
}
The counter is incremented 11 times as I expected: once for each element during filtering, once for each element during mapping and three times to add up the 4 elements.
Using wildcards the result changes:
{
val counter = new AtomicInteger(0)
val l = List(1, 2, 3, 4)
val filtered = l.filter{
counter.incrementAndGet()
_ > 0
}
val mapped = filtered.map{
counter.incrementAndGet()
_*2
}
val reduced = mapped.reduce{ (a, b) =>
counter.incrementAndGet()
a+b
}
println("counted " + counter.get + " and result is " + reduced)
assert(20 == reduced)
assert(5 == counter.get)
}
I can't work out how to use wildcards in the reduce (the code doesn't compile), but now the counter is only incremented 5 times!
So, question #1: Why do wildcards change the number of times the counter is called and how does that even work?
Then my second, related question. My understanding of views was that they would lazily execute the functions passed to the monadic methods, but the following code doesn't show that.
{
val counter = new AtomicInteger(0)
val l = Seq(1, 2, 3, 4).view
val filtered = l.filter{
counter.incrementAndGet()
_ > 0
}
println("after filter: " + counter.get)
val mapped = filtered.map{
counter.incrementAndGet()
_*2
}
println("after map: " + counter.get)
val reduced = mapped.reduce{ (a, b) =>
counter.incrementAndGet()
a+b
}
println("after reduce: " + counter.get)
println("counted " + counter.get + " and result is " + reduced)
assert(20 == reduced)
assert(5 == counter.get)
}
The output is:
after filter: 1
after map: 2
after reduce: 5
counted 5 and result is 20
Question #2: How come the functions are being executed immediately?
I'm using Scala 2.10
You're probably thinking that
filter {
println
_ > 0
}
means
filter{ i =>
println
i > 0
}
but Scala has other ideas. The reason is that
{ println; _ > 0 }
is a block that first prints something and then returns the > 0 function. So Scala interprets what you're doing as a funny way to specify the function, equivalent to:
val p = { println; (i: Int) => i > 0 }
filter(p)
which in turn is equivalent to
println
val temp = (i: Int) => i > 0 // Temporary name, forget we did this!
val p = temp
filter(p)
which as you can imagine doesn't quite work out the way you want--you only print (or in your case do the increment) once at the beginning. Both your problems stem from this.
Make sure if you're using underscores to mean "fill in the parameter" that you only have a single expression! If you're using multiple statements, it's best to stick to explicitly named parameters.
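The difference can be checked directly with two counters. This is a minimal sketch with hypothetical names (UnderscoreBlock, once, perElement); it only contrasts the block-with-underscore form against an explicitly named parameter.

```scala
import java.util.concurrent.atomic.AtomicInteger

object UnderscoreBlock {
  val once = new AtomicInteger(0)
  // The block is evaluated a single time and its last expression, the
  // function `_ > 0`, is what gets passed to filter.
  val a: List[Int] = List(1, 2, 3, 4).filter {
    once.incrementAndGet()
    _ > 0
  }

  val perElement = new AtomicInteger(0)
  // Here the whole body is the function, so the increment runs per element.
  val b: List[Int] = List(1, 2, 3, 4).filter { i =>
    perElement.incrementAndGet()
    i > 0
  }
}
```

With the counts from the question, this accounts for 1 (filter) + 1 (map) + 3 (reduce) = 5 increments.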

Difference between fold and foldLeft or foldRight?

NOTE: I am on Scala 2.8—can that be a problem?
Why can't I use the fold function the same way as foldLeft or foldRight?
In the Set scaladoc it says that:
The result of folding may only be a supertype of this parallel collection's type parameter T.
But I see no type parameter T in the function signature:
def fold [A1 >: A] (z: A1)(op: (A1, A1) ⇒ A1): A1
What is the difference between the foldLeft-Right and fold, and how do I use the latter?
EDIT: For example how would I write a fold to add all elements in a list? With foldLeft it would be:
val foo = List(1, 2, 3)
foo.foldLeft(0)(_ + _)
// now try fold:
foo.fold(0)(_ + _)
<console>:7: error: value fold is not a member of List[Int]
foo.fold(0)(_ + _)
^
Short answer:
foldRight associates to the right. I.e. elements will be accumulated in right-to-left order:
List(a,b,c).foldRight(z)(f) = f(a, f(b, f(c, z)))
foldLeft associates to the left. I.e. an accumulator will be initialized and elements will be added to the accumulator in left-to-right order:
List(a,b,c).foldLeft(z)(f) = f(f(f(z, a), b), c)
fold leaves the association unspecified: the order in which the elements are combined is not defined, so for a predictable result the operation should be associative and z should be its identity. I.e. the arguments to fold should form a monoid.
fold, contrary to foldRight and foldLeft, does not offer any guarantee about the order in which the elements of the collection will be processed. You'll probably want to use fold, with its more constrained signature, with parallel collections, where the lack of a guaranteed processing order helps the parallel collection implement folding in parallel. The reason for changing the signature is similar: with the additional constraints, it's easier to make a parallel fold.
You're right about the old version of Scala being a problem. If you look at the scaladoc page for Scala 2.8.1, you'll see no fold defined there (which is consistent with your error message). Apparently, fold was introduced in Scala 2.9.
For your particular example you would code it the same way you would with foldLeft.
val ns = List(1, 2, 3, 4)
val s0 = ns.foldLeft (0) (_+_) //10
val s1 = ns.fold (0) (_+_) //10
assert(s0 == s1)
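The more constrained signature of fold shows up as soon as the result type differs from the element type. The sketch below, with hypothetical names, sums with fold (result type Int, a supertype of the element type) and builds a String with foldLeft, which fold cannot express without widening the result to a common supertype such as Any.

```scala
object FoldVsFoldLeft {
  val ns = List(1, 2, 3, 4)

  // fold: the start value and the result share one type A1 >: Int.
  val sum: Int = ns.fold(0)(_ + _)

  // foldLeft: the accumulator type (String) is independent of the element type.
  val digits: String = ns.foldLeft("")((acc, n) => acc + n)
}
```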
I agree with the other answers. I thought I'd give a simple illustrative example:
object MyClass {
def main(args: Array[String]): Unit = {
val numbers = List(5, 4, 8, 6, 2)
val a = numbers.fold(0) { (z, i) =>
{
println("fold val1 " + z +" val2 " + i)
z + i
}
}
println(a)
val b = numbers.foldLeft(0) { (z, i) =>
println("foldleft val1 " + z +" val2 " + i)
z + i
}
println(b)
// Note: for foldRight the element comes first and the accumulator second.
val c = numbers.foldRight(0) { (i, z) =>
println("fold right val1 " + i + " val2 " + z)
i + z
}
println(c)
}
}
The output speaks for itself (for foldRight, val1 is the element and val2 the accumulator):
fold val1 0 val2 5
fold val1 5 val2 4
fold val1 9 val2 8
fold val1 17 val2 6
fold val1 23 val2 2
25
foldleft val1 0 val2 5
foldleft val1 5 val2 4
foldleft val1 9 val2 8
foldleft val1 17 val2 6
foldleft val1 23 val2 2
25
fold right val1 2 val2 0
fold right val1 6 val2 2
fold right val1 8 val2 8
fold right val1 4 val2 16
fold right val1 5 val2 20
25
There are two ways to solve such problems: iteratively and recursively. Let's understand this with a simple example: a function that sums all numbers up to a given number.
For example, for the input 5 I should get 15 as output, as shown below.
Input: 5
Output: (1+2+3+4+5) = 15
Iterative solution:
Iterate from 1 to 5 and add up each element.
def sumNumber(num: Int): Long = {
var sum=0
for(i <- 1 to num){
sum+=i
}
sum
}
Recursive solution:
Break the bigger problem down into smaller problems and solve them.
def sumNumberRec(num:Int, sum:Int=0): Long = {
if(num == 0){
sum
}else{
val newNum = num - 1
val newSum = sum + num
sumNumberRec(newNum, newSum)
}
}
foldLeft is like the iterative solution.
foldRight is like the recursive solution.
I am not sure whether they use memoization to improve the complexity.
So if you run foldRight and foldLeft on a small list, both will give you a result with similar performance.
However, if you run foldRight on a long list, it might throw a StackOverflowError (depending on the stack size available).
Check the following screenshot, where foldLeft ran without error while foldRight on the same list gave an OutOfMemoryError.
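One common way to sidestep deep recursion in a right fold is to reverse the list and reuse a tail-recursive left fold. The sketch below illustrates that technique with hypothetical names; it is not a claim about how the standard library implements foldRight.

```scala
object StackSafeFold {
  // Tail-recursive left fold: the compiler turns this into a loop.
  @annotation.tailrec
  def foldLeft[A, B](xs: List[A], z: B)(f: (B, A) => B): B = xs match {
    case Nil    => z
    case h :: t => foldLeft(t, f(z, h))(f)
  }

  // Right fold expressed as a left fold over the reversed list,
  // with the argument order of f flipped.
  def foldRight[A, B](xs: List[A], z: B)(f: (A, B) => B): B =
    foldLeft(xs.reverse, z)((acc, a) => f(a, acc))

  val big: List[Int] = List.range(1, 100001) // 1 to 100000
  val total: Long    = foldRight(big, 0L)(_ + _)
  val joined: String = foldRight(List("a", "b", "c"), "z")(_ + _)
}
```

The trade-off is one extra pass over the list and a temporary reversed copy, in exchange for constant stack depth.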
fold() on a parallel collection may process elements in parallel, so it does not guarantee the processing order,
whereas foldLeft and foldRight process the items sequentially, from left to right (foldLeft) or from right to left (foldRight).
Example of summing a list:
val numList = List(1, 2, 3, 4, 5)
val r1 = numList.par.fold(0)((acc, value) => {
println("adding accumulator=" + acc + ", value=" + value + " => " + (acc + value))
acc + value
})
println("fold(): " + r1)
println("#######################")
/*
* You can see from the output that,
* fold process the elements of parallel collection in parallel
* So it is parallel not linear operation.
*
* adding accumulator=0, value=4 => 4
* adding accumulator=0, value=3 => 3
* adding accumulator=0, value=1 => 1
* adding accumulator=0, value=5 => 5
* adding accumulator=4, value=5 => 9
* adding accumulator=0, value=2 => 2
* adding accumulator=3, value=9 => 12
* adding accumulator=1, value=2 => 3
* adding accumulator=3, value=12 => 15
* fold(): 15
*/
val r2 = numList.par.foldLeft(0)((acc, value) => {
println("adding accumulator=" + acc + ", value=" + value + " => " + (acc + value))
acc + value
})
println("foldLeft(): " + r2)
println("#######################")
/*
* You can see that foldLeft
* picks elements from left to right.
* It means foldLeft does sequence operation
*
* adding accumulator=0, value=1 => 1
* adding accumulator=1, value=2 => 3
* adding accumulator=3, value=3 => 6
* adding accumulator=6, value=4 => 10
* adding accumulator=10, value=5 => 15
* foldLeft(): 15
* #######################
*/
// Note: in foldRight the second argument is the accumulator.
val r3 = numList.par.foldRight(0)((value, acc) => {
println("adding value=" + value + ", acc=" + acc + " => " + (value + acc))
acc + value
})
println("foldRight(): " + r3)
println("#######################")
/*
* You can see that foldRight
* picks elements from right to left.
* It means foldRight does sequence operation.
*
* adding value=5, acc=0 => 5
* adding value=4, acc=5 => 9
* adding value=3, acc=9 => 12
* adding value=2, acc=12 => 14
* adding value=1, acc=14 => 15
* foldRight(): 15
* #######################
*/

Why do I have to explicitly state Tuple2(a, b) to be able to use Map add in a foldLeft?

I wish to create a Map keyed by name containing the count of things with that name. I have a list of the things with name, which may contain more than one item with the same name. Coded like this I get an error "type mismatch; found : String required: (String, Int)":
//variation 0, produces error
(Map[String, Int]() /: entries)((r, c) => { r + (c.name, if (r.contains(c.name)) (c.name) + 1 else 1) })
This confuses me, as I thought (a, b) was a Tuple2 and therefore suitable for use with Map add. Either of the following variations works as expected:
//variation 1, works
(Map[String, Int]() /: entries)((r, c) => { r + Tuple2(c.name, if (r.contains(c.name)) (c.name) + 1 else 1) })
//variation 2, works
(Map[String, Int]() /: entries)((r, c) => {
val e = (c.name, if (r.contains(c.name)) (c.name) + 1 else 1)
r + e })
I'm unclear on why there is a problem with my first version; can anyone advise? I am using Scala-IDE 2.0.0 beta 2 to edit the source; the error is from the Eclipse Problems window.
When passing a single tuple argument to a method used with operator notation, like your + method, you should use double parentheses:
(Map[String, Int]() /: entries)((r, c) => { r + ((c.name, r.get(c.name).map(_ + 1).getOrElse(1) )) })
I've also changed the computation of the Int, which looks funny in your example…
Because + is also used to concatenate things with strings. In this case, the parentheses are not taken to mean a tuple, but to delimit an argument list.
Scala has used + for other things, which has resulted in all sorts of problems, just like the one you mention.
Replace + with updated, or use -> instead of ,.
r + (c.name, if (r.contains(c.name)) (c.name) + 1 else 1)
is parsed as
r.+(c.name, if (r.contains(c.name)) (c.name) + 1 else 1)
So the compiler looks for a + method with 2 arguments on Map and doesn't find it. The form I prefer over double parentheses (as Jean-Philippe Pellet suggests) is
r + (c.name -> (if (r.contains(c.name)) (c.name) + 1 else 1))
UPDATE:
If Pellet is correct, it's better to write
r + (c.name -> (r.getOrElse(c.name, 0) + 1))
(and of course James Iry's solution expresses the same intent even better).
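Putting the suggestions together, a runnable sketch of the name count could look like this. The Entry case class and the sample data are hypothetical stand-ins for the asker's "things with a name"; foldLeft is used in place of the /: operator, which is equivalent here.

```scala
object NameCounts {
  final case class Entry(name: String) // hypothetical element type
  val entries = List(Entry("a"), Entry("b"), Entry("a"))

  // Arrow syntax; the inner parentheses are needed because -> and + have the
  // same precedence and associate to the left.
  val viaArrow: Map[String, Int] =
    entries.foldLeft(Map.empty[String, Int]) { (r, c) =>
      r + (c.name -> (r.getOrElse(c.name, 0) + 1))
    }

  // Double parentheses: the outer pair is the argument list, the inner the tuple.
  val viaDoubleParens: Map[String, Int] =
    entries.foldLeft(Map.empty[String, Int]) { (r, c) =>
      r + ((c.name, r.getOrElse(c.name, 0) + 1))
    }

  // updated avoids the tuple syntax altogether.
  val viaUpdated: Map[String, Int] =
    entries.foldLeft(Map.empty[String, Int]) { (r, c) =>
      r.updated(c.name, r.getOrElse(c.name, 0) + 1)
    }
}
```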