Scala: Is the foldl operator infix?

Code that uses foldl is hard to read because of its syntax. For example:
def lstToMap(lst:List[(String,Int)], map: Map[String, Int] ):Map[String, Int] = {
(map /: lst) (addToMap)
}
Is /: an infix operator? What does (map /: lst) mean: is it a partial application? Why can't I call it like this:
`/: map lst addToMap`

Methods whose names end in a : character are right-associative: in infix notation, the instance they are called on is the right operand and the argument goes on the left. In this case, /: is a method on List. As per the Scaladoc:
Note: /: is alternate syntax for foldLeft; z /: xs is the same as xs foldLeft z.
An alternative to what you wrote would be:
lst./:(map)(addToMap)
Edit: and another alternative with foldLeft:
lst.foldLeft(map)(addToMap)
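The question never shows addToMap, so the sketch below assumes a hypothetical version that adds each pair's count to whatever the map already holds for that key. With that assumption, the foldLeft form can be run as follows (LstToMapDemo is just an illustrative wrapper name):

```scala
object LstToMapDemo {
  // Hypothetical helper: the original question does not show addToMap.
  // It is assumed here to add the pair's count to any existing count for the key.
  def addToMap(acc: Map[String, Int], kv: (String, Int)): Map[String, Int] =
    acc.updated(kv._1, acc.getOrElse(kv._1, 0) + kv._2)

  // Same fold as (map /: lst)(addToMap), written with foldLeft.
  def lstToMap(lst: List[(String, Int)], map: Map[String, Int]): Map[String, Int] =
    lst.foldLeft(map)(addToMap)

  def main(args: Array[String]): Unit =
    println(lstToMap(List("a" -> 1, "b" -> 2, "a" -> 3), Map("b" -> 10)))
}
```

The receiver is lst, the start value is map, and addToMap is called once per list element with the map built so far.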

Yes, /: can be used as an infix operator. However, the fold operation takes three arguments:
The sequence to fold across
The initial value for the reduction
The function used for folding
Using infix you can only specify two of these three arguments: the sequence (which is the receiver) and the initial value. The fact that (map /: lst) is a partial application reflects the fact that you're still missing an argument. Here's an example of a product of a sequence of numbers, starting with an initial value of 1:
(1 /: xs)(_*_)
Since Scala supports curly braces for function literals, you can also use that to make the function argument look more like a function body:
(1 /: xs) { (x, y) =>
x * y
}
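For readers on a Scala version where /: is deprecated, here is a small sketch of the same product written with foldLeft. The list xs and the object name are made up for illustration:

```scala
object ProductDemo {
  // Illustrative input; any List[Int] works.
  val xs = List(2, 3, 4)

  // Placeholder form: the first _ is the running product, the second is the next element.
  val viaPlaceholder: Int = xs.foldLeft(1)(_ * _)

  // Curly-brace form with named parameters, equivalent to the line above.
  val viaBraces: Int = xs.foldLeft(1) { (acc, x) =>
    acc * x
  }
}
```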

Related

Folding lists in scala

Folding list in scala using /: and :\ operator
I tried to look at different sites and they only talk about the foldRight and foldLeft functions.
def sum(xs: List[Int]): Int = (0 /: xs) (_ + _)
sum(List(1,2,3))
res0: Int = 6
The code segment works as described, but I do not completely understand the method definition. What I understand is that the part inside the first parentheses is 0 /: xs, where /: is a right-associative operator, so the object is xs and the parameter is 0. I am not sure about the return type of that operation (most probably another list?). The second part is a function that sums its two parameters, but I don't understand which object invokes it, or what the function's name is. Can someone please help me understand?
The signature of /: is
/:[B](z: B)(op: (B, A) ⇒ B): B
It is a method with multiple argument lists, so when it is invoked with just one argument (i.e. 0 /: xs in your case), what remains is the second argument list, (op: (B, A) ⇒ B): B. So you have to pass it a function with 2 parameters ( _ + _ ) that is used to combine the elements of the list starting from z.
This method is usually called foldLeft:
(0 /: xs)(_ + _) is the same as xs.foldLeft(0)(_ + _)
You can find more details here: https://www.scala-lang.org/api/2.12.3/scala/collection/immutable/List.html
Thanks @HaraldGliebe & @LuisMiguelMejíaSuárez for your great responses. I am enlightened now! I am just summarising the answer here, which may benefit others who read this thread.
"/:" is actually the name of the function which is defined inside the List class. The signature of the function is: /:[B](z: B)(op: (B, A) ⇒ B): B --> where B is the type parameter, z is the first parameter; op is the second parameter which is of functional type.
The function is curried, which means we can pass fewer parameters than the total number. If we do that,
the partially applied function can be stored in a temporary variable; we can then use the temporary variable to pass the remaining parameters.
If supplied with all parameters, "/:" can be called as: x./:(0)(_+_) where x is val/var of List type. OR "/:" can be called in two steps which are given as:
step:1 val temp = x./:(0)(_) where we pass only the first parameter. This results in a partially applied function which is stored in the temp variable.
step:2 temp(_+_) here using the partially applied function temp is passed with the second (final) parameter.
If we decide to follow the first style ( x./:(0)(_+_) ), the call with the first parameter can be written in operator notation: x /: 0
Since the method name ends with a colon, the object is taken from the right side. So x /: 0 is invalid and has to be written as 0 /: x.
This is equivalent to the temp variable. After 0 /: x, the second parameter also needs to be passed, so the whole construct becomes: (0 /: x)(_+_)
This is how the definition of the function sum in the question, is interpreted.
Note that when we use the curried version of the function in operator notation, we have to supply all the parameters in one go.
That is, (0 /: x) (_) and (0 /: x) _ seem to throw syntax errors.
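The two-step reading in this summary can be sketched with foldLeft and an explicitly typed function value, which avoids relying on any particular placeholder or eta-expansion syntax. The names temp, total and CurriedFoldDemo are illustrative only:

```scala
object CurriedFoldDemo {
  val xs = List(1, 2, 3)

  // Step 1: fix the start value. What remains is a function that still
  // expects the combining operation.
  val temp: ((Int, Int) => Int) => Int = op => xs.foldLeft(0)(op)

  // Step 2: supply the operation to finish the fold.
  val total: Int = temp(_ + _)

  // The same fold written in a single call.
  val viaOneCall: Int = xs.foldLeft(0)(_ + _)
}
```

Because temp is an ordinary function value, it can be reused with a different operation, for example temp(_ max _).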

Understanding infix behavior in scala

Wasn't sure if I should ask this here or on Programmers, but anyway
In Scala it's possible to write method calls using infix syntax, i.e. omitting dots and parens.
As an example you could do this:
lst foreach println // equivalent to lst.foreach(println)
Naturally one would assume that lst map _.toString would be evaluated to lst.map(_.toString), which is equivalent to lst.map(x$1 => x$1.toString)
But dropping lst map _.toString into the REPL yields a surprising result: it is evaluated as ((x$1) => lst.map(x$1.toString)), causing the method call to malfunction.
So why is that? Why is it that the simple rule of a.f(b) being equivalent to a f b no longer applies when writing a f _.b?
Because the expression is ambiguous.
From Scala's (somewhat outdated) spec P94: http://www.scala-lang.org/docu/files/ScalaReference.pdf
An expression(of syntactic category Expr) may contain embedded underscore symbols _ at places where identifiers are legal. Such an expression represents an anonymous function where subsequent occurrences of underscores denote successive parameters.
Since lst map _.toString is a legal expression, it can naturally be evaluated as an anonymous function like (x) => lst.map(x.toString).
You can still use the infix form by wrapping the argument in curly braces, which makes the Scala compiler expand the placeholder function first.
scala> val lst = List(1,2,3,4,5)
lst: List[Int] = List(1, 2, 3, 4, 5)
scala> lst map { _.toString }
res43: List[String] = List(1, 2, 3, 4, 5)
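As a rough sketch of the forms that do scope the placeholder to the argument, the three calls below should all produce the same list. This assumes a compiler that accepts Scala 2 style infix calls, and the object and value names are illustrative:

```scala
object PlaceholderDemo {
  val lst = List(1, 2, 3)

  // Dot notation: the underscore is scoped to the argument of map.
  val dotted: List[String] = lst.map(_.toString)

  // Infix notation with parentheses around the argument.
  val infixParens: List[String] = lst map (_.toString)

  // Infix notation with curly braces, as in the REPL session above.
  val infixBraces: List[String] = lst map { _.toString }
}
```

In each case the parentheses or braces delimit the expression that the underscore expands in, so the anonymous function becomes the argument rather than the whole call.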

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
A few things to mention here, before giving the actual answer:
Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
The difference is not the implementation at all, just look at the signatures.
The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int] or (List[Int], Int). It calculates the sum and returns a tuple with a list of integers and its sum. By the way, the list is returned backwards, because we used foldLeft instead of foldRight.
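To make the remark about ordering concrete, here is a small sketch that repeats listWithSum and adds a foldRight variant. The second method, listWithSumInOrder, is not from the answer; it is a hypothetical name used only for comparison:

```scala
object ListWithSumDemo {
  // The method from the answer: the accumulator is a (list, sum) tuple.
  def listWithSum(numbers: List[Int]): (List[Int], Int) =
    numbers.foldLeft((List.empty[Int], 0)) { (acc, n) =>
      (n :: acc._1, n + acc._2)
    }

  // Hypothetical foldRight variant: elements are prepended starting from the
  // last one, so the original order is preserved.
  def listWithSumInOrder(numbers: List[Int]): (List[Int], Int) =
    numbers.foldRight((List.empty[Int], 0)) { (n, acc) =>
      (n :: acc._1, n + acc._2)
    }
}
```

For List(1, 2, 3) the first method yields (List(3, 2, 1), 6) and the second yields (List(1, 2, 3), 6). Note that the argument order of the function is swapped between the two folds.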
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, reduceLeft will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList.foldLeft("") { (a, b) => a + b } // assuming myList is a List[String]
is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
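The three behaviours on an empty list can be compared side by side. This is a minimal sketch that assumes a list of strings; the object and value names are illustrative:

```scala
import scala.util.Try

object EmptyReduceDemo {
  val empty: List[String] = Nil
  val words: List[String] = List("a", "b", "c")

  // foldLeft falls back to the start value when there is nothing to fold.
  val foldedEmpty: String = empty.foldLeft("")(_ + _)

  // reduceLeftOption signals "no result" with None instead of throwing.
  val reducedEmpty: Option[String] = empty.reduceLeftOption(_ + _)
  val reducedWords: Option[String] = words.reduceLeftOption(_ + _)

  // reduceLeft throws on an empty list; Try captures the exception here.
  val reduceLeftFails: Boolean = Try(empty.reduceLeft(_ + _)).isFailure
}
```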
The basic reason they are both in the Scala standard library is probably that they are both in the Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op xn
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft returns a value of type U, which is not necessarily the same as the element type T, whereas reduceLeft returns a value of the element type T.
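The course definitions are methods of its own List class. A runnable approximation over the standard library List might look like the sketch below; writing them as standalone functions, the object name, and the exception type are choices made here, not part of the course material:

```scala
object HandRolledFolds {
  // foldLeft as a standalone function: the start value z is threaded through
  // the list from left to right.
  @annotation.tailrec
  def foldLeft[T, U](xs: List[T], z: U)(op: (U, T) => U): U = xs match {
    case Nil          => z
    case head :: tail => foldLeft(tail, op(z, head))(op)
  }

  // reduceLeft uses the first element as the start value, so the result type
  // is the element type T and an empty list has no valid result.
  def reduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match {
    case Nil          => throw new UnsupportedOperationException("Nil.reduceLeft")
    case head :: tail => foldLeft(tail, head)(op)
  }
}
```

For example, HandRolledFolds.foldLeft(List(1, 2, 3), "")((acc, x) => acc + x) builds a String from a List[Int], which reduceLeft cannot express.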
To really understand what you are doing with fold/reduce,
check this: http://wiki.tcl.tk/17983
It is a very good explanation. Once you get the concept of fold,
reduce will follow from the answer above:
list.tail.foldLeft(list.head)(_)
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther

Scala Vector fold syntax (/: and :\ and /:\)

Can someone provide some examples of how /:, :\ and /:\ actually get used? I assume they're shortcuts to the reduce / fold methods, but there are no examples of how they actually get used in the Scala docs, and they're impossible to google or search for on StackOverflow.
I personally prefer the /: and :\ forms of foldLeft and foldRight. Two reasons:
It has a more natural feel because you can see that you are pushing a value into the left/right of a collection and applying a function. That is
(1 /: ints) { _ + _ }
ints.foldLeft(1) { _ + _ }
Are both equivalent, but I tend to think the former emphasises my intuition as to what is happening. If you want to know how this is happening (i.e. the method appears to be called on the value 1, not the collection), it's because methods ending in a colon are right-associative. This can be seen in ::, +: etc etc elsewhere in the standard library.
The ordering of the Function2 parameters is the same order as the folded element and that which is folded into:
   (b /: as) { (bb, a) => f(bb, a) }
//  ^    ^      ^   ^
//  ^    ^      ^   ^
//  B    A      B   A
Better in every way than:
as.foldLeft(b) { (bb, a) => f(bb, a) }
Although I admit that this was a far more important difference in the era before decent IDE support: nowadays IDEA can tell me what function is expected with a simple CTRL-P
I hope it should also be obvious how :\ works with foldRight - it's basically exactly the same, except that the value appears to be being pushed in from the right hand side. I must say, I tend to steer well clear of foldRight in scala because of how it is implemented (i.e. wrongly).
/: is a synonym for foldLeft and :\ for foldRight.
But remember that : makes /: apply to the object to the right of it.
Assuming you know that (_ * _) is an anonymous function that's equivalent to (a, b) => a * b, and the signature of foldLeft and foldRight are
def foldLeft [B] (z: B)(f: (B, A) ⇒ B): B
def foldRight [B] (z: B)(f: (A, B) ⇒ B): B
i.e. they're curried functions taking a start value and a function combining the start value / accumulator with an item from the list, some examples are:
List(1,2,3).foldLeft(1)(_*_)
which is the same as
(1 /: List(1,2,3))(_*_)
And
List(1,2,3).foldRight(1)(_*_)
in infix notation is
(List(1,2,3) foldRight 1)(_*_)
which is the same as
(List(1,2,3) :\ 1)(_*_)
Add your own collections and functions and enjoy!
The thing to remember with the short (/: and :\) notations is that, because you're using the infix notations you need to put parentheses around the first part in order for it to pick up the second argument list properly. Also, remember that the functions for foldLeft and foldRight are the opposite way round, but it makes sense if you're visualising the fold in your head.
Rex Kerr has written a nice answer about folds here. Near the end you can see an example of the shortcut syntax for foldLeft and foldRight.
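One way to visualise the difference in parameter order is to fold with a function that builds a string showing the nesting. This is only an illustrative sketch; the start value "z" and the label "op" are arbitrary:

```scala
object FoldShapeDemo {
  val xs = List(1, 2, 3)

  // foldLeft: the accumulator is the first parameter and the nesting grows to the left.
  val left: String = xs.foldLeft("z")((acc, x) => s"($acc op $x)")

  // foldRight: the accumulator is the second parameter and the nesting grows to the right.
  val right: String = xs.foldRight("z")((x, acc) => s"($x op $acc)")

  def main(args: Array[String]): Unit = {
    println(left)  // (((z op 1) op 2) op 3)
    println(right) // (1 op (2 op (3 op z)))
  }
}
```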

Difference in asymptotic time of two variants of flatten

I am going through the Scala by Example document and I am having trouble with exercise 9.4.2. Here is the text:
Exercise 9.4.2 Consider the problem of writing a function flatten, which takes a list of element lists as arguments. The result of flatten should be the concatenation of all element lists into a single list. Here is an implementation of this method in terms of :\.
def flatten[A](xs: List[List[A]]): List[A] =
(xs :\ (Nil: List[A])) {(x, xs) => x ::: xs}
Consider replacing the body of flatten by
((Nil: List[A]) /: xs) ((xs, x) => xs ::: x)
What would be the difference in asymptotic complexity between the two versions of flatten?
In fact flatten is predefined together with a set of other useful functions in an object called List in the standard Scala library. It can be accessed from user programs by calling List.flatten. Note that flatten is not a method of class List – it would not make sense there, since it applies only to lists of lists, not to all lists in general.
I do not see how the asymptotic time of these two function variants is different. I'm sure it's because I am missing something fundamental about the meaning of fold left and fold right.
Here is a pdf of the document I am describing:
http://www.scala-lang.org/docu/files/ScalaByExample.pdf
I am generally finding this an excellent introduction into Scala.
Look at the implementation of concatenation ::: (p.68).
Notice that it is linear (in the number of :: operations) in the size of the left argument (the list that ends up being the prefix of the result).
Assume (for the sake of the complexity analysis) that your list of lists contains n equal-sized small lists of size a fixed constant k, k<n. If you use foldLeft, you compute:
f (... (f (f a b1) b2) ...) bn
Where f is the concatenation. If you use foldRight:
f a1 (f a2 (... (f an b) ...))
Again, f stands for concatenation written in prefix notation. The second case is easy: you add k elements at the head each time, so you do k*n cons operations.
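For the first case (foldLeft), the first concatenation (f a b1) starts from the empty list a and yields a list of size k. In the second round you prepend it to b2 to form (f (f a b1) b2) of size 2k, which costs k cons, and so on: each round copies everything accumulated so far. In total you do k + 2k + 3k + ... + (n-1)k = k*sum_{i=1}^{n-1}(i) = k*n(n-1)/2 cons, which is quadratic in n.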
(Followup question : is this the only parameter that should be taken into account while thinking of the efficiency of that function ? Doesn't foldLeft have an advantage -not asymptotic complexity- that foldRight doesn't ?)
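The cost argument can be checked with a small counting sketch. It assumes the cost model used in this answer, namely that xs ::: ys performs one cons per element of the left operand xs. The object and method names are made up for illustration and are not part of Scala by Example:

```scala
object FlattenCostDemo {
  // Cost model (an assumption taken from the answer): xs ::: ys performs
  // one cons per element of the left operand xs.
  def consCount[A](left: List[A]): Long = left.length.toLong

  // foldRight version: the left operand is always one of the small inner lists.
  def flattenRight[A](xss: List[List[A]]): (List[A], Long) =
    xss.foldRight((List.empty[A], 0L)) { case (x, (acc, cost)) =>
      (x ::: acc, cost + consCount(x))
    }

  // foldLeft version: the left operand is the growing accumulator.
  def flattenLeft[A](xss: List[List[A]]): (List[A], Long) =
    xss.foldLeft((List.empty[A], 0L)) { case ((acc, cost), x) =>
      (acc ::: x, cost + consCount(acc))
    }

  def main(args: Array[String]): Unit = {
    val xss = List.fill(100)(List(1, 2, 3)) // n = 100 lists of size k = 3
    println(flattenRight(xss)._2) // 300
    println(flattenLeft(xss)._2)  // 14850
  }
}
```

Both versions return the same flattened list; only the counted work differs, growing linearly in n for foldRight and quadratically for foldLeft under this model.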