How is foldLeft operator implemented in Scala? - scala

Why does the foldLeft operator syntax work? For example, I would expect this code
(10 /: (1 to 5))(_ + _)
to give me the error "value /: is not a member of Int". How does Scala extend the method /: to all types in the type system?

Here is the definition of the "shortcut" operator:
def /:[B](z: B)(op: (B, A) => B): B = foldLeft(z)(op)
If an operator ends with a colon, it is right-associative, so it is invoked on its right-hand operand. 1 :: Nil is another example: there is no method :: on Int.
All of these work:
(1 to 5)./:(10)(_ + _)
((1 to 5) foldLeft 10)(_ + _) (almost the same as your example,
but here it's more obvious that foldLeft is actually a method on the
range object)
(1 to 5).foldLeft(10)(_ + _)

Your question is not entirely clear (there's no n mentioned in your expression), but: Operators that end with a colon are interpreted as methods on the right-hand argument, not the left. Your expression is equivalent to
(1 to 5)./:(10)(_ + _)
in which /: is more clearly seen to be a method of the Range object on the left.
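To see the colon rule in isolation, here is a minimal sketch. The Box class and its +>: method are hypothetical names made up for this illustration; only foldLeft comes from the standard library.

```scala
object RightAssocDemo {
  // Hypothetical wrapper, not part of the standard library.
  final case class Box(items: List[Int]) {
    // The name ends in ':', so `x +>: box` is parsed as `box.+>:(x)`.
    def +>:(x: Int): Box = Box(x :: items)
  }

  // Operator form: the method is looked up on the right-hand operand.
  val viaOperator: Box = 1 +>: Box(List(2, 3))

  // The same call written with an explicit dot.
  val viaMethod: Box = Box(List(2, 3)).+>:(1)

  // The fold from the question, written without the operator alias.
  val folded: Int = (1 to 5).foldLeft(10)(_ + _)

  def main(args: Array[String]): Unit = {
    println(viaOperator)
    println(viaMethod)
    println(folded)
  }
}
```

Int never needs a +>: method here, just as it never needs /: or ::, because the compiler resolves the call on the operand to the right of the operator.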

Related

Scala underscore notation for map and filter

Say I have the following code:
val a: List[(Int, String)] = List((1,"A"),(2,"B"),(3,"C"))
val b: List[String] = List("A","C","E")
I can do:
a.map{case (fst,snd) => (fst,snd + "a")}
a.filter{case (_,snd) => b.contains(snd)}
But why can't I do:
a.map((_._1,_._2 + "a"))
a.filter(b.contains(_._2))
Is there a way to accomplish this using underscore notation, or am I forced to use the longer forms here?
For the example:
a.map((_._1,_._2 + "a"))
Each placeholder (i.e. each underscore/_) introduces a new parameter in the argument expression.
To cite the Scala spec
An expression (of syntactic category Expr)
may contain embedded underscore symbols _ at places where identifiers
are legal. Such an expression represents an anonymous function where subsequent
occurrences of underscores denote successive parameters.
[...]
The anonymous functions in the left column use placeholder
syntax. Each of these is equivalent to the anonymous function on its right.
|---------------------------|----------------------------|
|`_ + 1` | `x => x + 1` |
|`_ * _` | `(x1, x2) => x1 * x2` |
|`(_: Int) * 2` | `(x: Int) => (x: Int) * 2` |
|`if (_) x else y` | `z => if (z) x else y` |
|`_.map(f)` | `x => x.map(f)` |
|`_.map(_ + 1)` | `x => x.map(y => y + 1)` |
You'll have to use the expanded forms when you need to use a given parameter more than once. So your example has to be rewritten as:
a.map(x => (x._1, x._2 + "a"))
For the example
a.filter(b.contains(_._2))
The problem is that you are effectively passing in an anonymous function to contains rather than filter, so you won't be able to use underscore notation here either. Instead you'll have to write
a.filter(x => b.contains(x._2))
You can't do
a.map((_._1,_._2 + "a"))
because each _ introduces a new parameter rather than referring back to the same element. The first _ and the second _ are two separate parameters, so _._1 and _._2 never refer to the same tuple, and the expression is not the one-argument function on (Int, String) that map expects. The Scala compiler therefore reports a compilation error.
In your second line of code
a.filter(b.contains(_._2))
_._2 is expanded inside the call to contains, so b.contains receives the anonymous function x => x._2 as its argument instead of the String taken from each tuple of a.
To make it work, you can write
a.map(x => (x._1, x._2 + "a"))
a.filter(x => b.contains(x._2))
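The rules above can be checked with a small sketch that reuses the a and b from the question. The object name and the extra letters and total values are only there for illustration.

```scala
object PlaceholderDemo {
  val a: List[(Int, String)] = List((1, "A"), (2, "B"), (3, "C"))
  val b: List[String] = List("A", "C", "E")

  // One use of the parameter: the placeholder form is fine.
  val letters: List[String] = a.map(_._2)

  // Two uses of the same parameter: name it explicitly.
  val mapped: List[(Int, String)] = a.map(x => (x._1, x._2 + "a"))

  // The parameter is used inside a nested call: name it explicitly.
  val filtered: List[(Int, String)] = a.filter(x => b.contains(x._2))

  // Two placeholders mean two parameters, which suits a binary function.
  val total: Int = a.map(_._1).reduce(_ + _)

  def main(args: Array[String]): Unit = {
    println(letters)
    println(mapped)
    println(filtered)
    println(total)
  }
}
```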

Difference between dot and space in Scala

What precisely is the difference between . and a space when used to invoke methods on objects in Scala?
For some reason, I get variations, like:
scala> val l:List[Int] = 1::Nil
l: List[Int] = List(1, 2, 3)
scala> l foldLeft(0)((hd, nxt) => hd + nxt)
<console>:13: error: Int(1) does not take parameters
| foldLeft(1)((hd, nxt) => hd + nxt)
^
scala> l.foldLeft(0)((hd, nxt) => hd + nxt)
res2: Int = 2
(And while I'm at it, what's the name of that operation? I kept trying to find the strict definition of the . operator and I have no idea what it's called.)
Having space instead of dot is called postfix notation if there are no arguments in the called function on the object, or infix notation if there is an argument that the function requires.
Postfix example: l sum, equivalent to l.sum
Infix example: l map (_ * 2), equivalent to l.map(_ * 2)
The issue with these notations is that they are inherently more ambiguous in their interpretation. A classic example from math:
1 + 2 * 3 + 4 is ambiguous and depends on the priority of the operators.
1.+(2.*(3)).+(4) has only one meaningful interpretation.
Therefore it is not a different operator, but the same as the dot, just susceptible to ambiguity that can lead to syntactical errors like your case or even worse logical errors when you chain infix operators.
You can actually express foldLeft with infix notation in this way:
(l foldLeft 0)((hd, nxt) => hd + nxt)
or even
(0 /: l)((hd, nxt) => hd + nxt)
Here /: is just an alias for foldLeft. It relies on the special rule for operators ending in a colon (:), which are invoked on the right-hand operand: (0 /: l) is interpreted as l./:(0), the reverse of the usual.
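As a rough check of the equivalences described in this answer, the sketch below compares each spaced form with its dotted counterpart. The object and value names are made up for the example, and the integer literals are wrapped in parentheses only to keep the dotted arithmetic unambiguous to read.

```scala
object DotVsSpaceDemo {
  val l: List[Int] = List(1, 2, 3)

  // Infix and dot notation call the same method.
  val doubledInfix: List[Int] = l map (_ * 2)
  val doubledDot: List[Int] = l.map(_ * 2)

  // Operator precedence decides how the infix form is grouped.
  val infixArithmetic: Int = 1 + 2 * 3 + 4
  val explicitArithmetic: Int = (1).+((2).*(3)).+(4)

  // foldLeft has two parameter lists, so the infix part needs its own parentheses.
  val foldInfix: Int = (l foldLeft 0)((hd, nxt) => hd + nxt)
  val foldDot: Int = l.foldLeft(0)((hd, nxt) => hd + nxt)

  def main(args: Array[String]): Unit = {
    println(doubledInfix == doubledDot)
    println(infixArithmetic == explicitArithmetic)
    println(foldInfix == foldDot)
  }
}
```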
Desugar it with "-Xprint:parser" or "-Xprint:typer"
Example 1 Desugared:
scala> (List(1,2) foldLeft 0)((hd, nxt) => hd + nxt)
...
List(1, 2).foldLeft(0)(((hd, nxt) => hd.$plus(nxt)))
...
immutable.this.List.apply[Int](1, 2).foldLeft[Int](0)(((hd: Int, nxt: Int) => hd.+(nxt)));
As you can see, (List(1,2) foldLeft 0) translates into (List(1, 2).foldLeft(0)) in the parser phase. This expression returns a curried function that takes in the second set of parentheses to produce a result (remember that a curried function is just a function that takes in an argument and returns another function with one fewer argument).
Example 2 Desugared:
scala> List(1,2) foldLeft(0)((hd, nxt) => hd + nxt)
...
List(1, 2)(foldLeft(0)(((hd, nxt) => hd.$plus(nxt))))
...
<console>:8: error: not found: value foldLeft
List(1,2) (foldLeft(0)((hd, nxt) => hd + nxt))
The parentheses end up around (foldLeft(0)((hd, nxt) => hd + nxt)), so the compiler looks for a standalone value named foldLeft.
Style:
The way you are supposed to use space-delimited methods is one object followed by one method followed by one set of parentheses, which produces a new object that can be followed by a new method.
obj method parameter // good
obj method1 parameter1 method2 parameter2 // good
obj method1 parameter1 method2 parameter2 method3 parameter3 // compiles, but might need to be broken up
You can follow an object with a postfix method that takes no parameters, but this isn't always the approved style, especially for accessors.
foo.length // good
foo length // compiles, but can be confusing.
Space-delimited methods are normally reserved for either pure functions (like map, flatMap, filter) or for domain-specific languages (DSLs).
In the case of foo.length, there is no () on length, so the whitespace isn't necessary to convey the idea that length is pure.

Parser combinator grammar not yielding correct associativity

I am working on a simple expression parser, however given the following parser combinator declarations below, I can't seem to pass my tests and a right associative tree keeps on popping up.
def EXPR:Parser[E] = FACTOR ~ rep(SUM|MINUS) ^^ {case a~b => (a /: b)((acc,f) => f(acc))}
def SUM:Parser[E => E] = "+" ~ EXPR ^^ {case "+" ~ b => Sum(_, b)}
def MINUS:Parser[E => E] = "-" ~ EXPR ^^ {case "-" ~ b => Diff(_, b)}
I've been debugging this for hours. I hope someone can help me figure out why it's not coming out right.
"5-4-3" would yield a tree that evaluates to 4 instead of the expected -2.
What is wrong with the grammar above?
I don't work with Scala but do work with F# parser combinators and also needed associativity with infix operators. While I am sure you can do 5-4 or 2+3, the problem comes in with a sequence of two or more such operators of the same precedence and operator, i.e. 5-4-2 or 2+3+5. The problem won't show up with addition as (2+3)+5 = 2+(3+5) but (5-4)-2 <> 5-(4-2) as you know.
See: Monadic Parser Combinators 4.3 Repetition with meaningful separators. Note: The separators are the operators such as "+" and "*" and not whitespace or commas.
See: Functional Parsers Look for the chainl and chainr parsers in section 7. More parser combinators.
For example, an arithmetical expressions, where the operators that
separate the subexpressions have to be part of the parse tree. For
this case we will develop the functions chainr and chainl. These
functions expect that the parser for the separators yields a function
(!);
The function f should operate on an element and a list of tuples, each
containing an operator and an element. For example, f(e0, [(⊕1, e1),
(⊕2, e2), (⊕3, e3)]) should return ((e0 ⊕1 e1) ⊕2 e2) ⊕3 e3. You may
recognize a version of foldl in this (albeit an uncurried one), where
a tuple (⊕, y) from the list and intermediate result x are combined
applying x ⊕ y.
You need a fold function in the semantic parser, i.e. the part that converts the tokens from the syntactic parser into the output of the parser. In your code I believe it is this part.
{case a~b => (a /: b)((acc,f) => f(acc))}
Sorry I can't do better as I don't use Scala.
"-" ~ EXPR ^^ {case "-" ~ b => Diff(_, b)}
for 5-4-3, it expands to
Diff(5, 4-3)
which is
Diff(5, Diff(4, 3))
However, what you need is:
Diff(Diff(5, 4), 3)
// for 5 + 4 - 3 it should be
Diff(Sum(5, 4), 3)
You need to involve a stack.
It seems using "+" ~ EXPR made the answer incorrect. It should have been FACTOR instead.
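The difference between the two trees can be sketched without the parser library. The E, Num, Sum and Diff types and the eval function below are hypothetical stand-ins for the asker's own definitions, and steps is an assumption about what rep(SUM|MINUS) would hand to the semantic action for "5-4-3" if SUM and MINUS each consumed a single FACTOR rather than a whole EXPR.

```scala
object LeftAssocDemo {
  // Hypothetical expression tree, standing in for the asker's E.
  sealed trait E
  final case class Num(value: Int) extends E
  final case class Sum(left: E, right: E) extends E
  final case class Diff(left: E, right: E) extends E

  def eval(e: E): Int = e match {
    case Num(v)     => v
    case Sum(l, r)  => eval(l) + eval(r)
    case Diff(l, r) => eval(l) - eval(r)
  }

  // Assumed parse result for "5-4-3": a first operand plus one function per operator.
  val first: E = Num(5)
  val steps: List[E => E] = List(Diff(_, Num(4)), Diff(_, Num(3)))

  // The same left fold as in the question's semantic action.
  val leftTree: E = steps.foldLeft(first)((acc, f) => f(acc))

  // The tree the original grammar produces, because "-" ~ EXPR swallows "4-3".
  val rightTree: E = Diff(Num(5), Diff(Num(4), Num(3)))

  def main(args: Array[String]): Unit = {
    println(s"$leftTree evaluates to ${eval(leftTree)}")
    println(s"$rightTree evaluates to ${eval(rightTree)}")
  }
}
```

The fold itself is already left-associative. It only builds the wrong tree when each function in the list has captured the rest of the expression as its right operand.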

Why does leaving the dot out in foldLeft cause a compilation error?

Can anyone explain why I see this compile error for the following when I omit the dot notation for applying the foldLeft function? (version 2.9.2)
scala> val l = List(1, 2, 3)
l: List[Int] = List(1, 2, 3)
scala> l foldLeft(1)(_ * _)
<console>:9: error: Int(1) does not take parameters
l foldLeft(1)(_ * _)
^
but
scala> l.foldLeft(1)(_ * _)
res27: Int = 6
This doesn't hold true for other higher order functions such as map which doesn't seem to care whether I supply the dot or not.
I don't think it's an associativity thing because I can't just invoke foldLeft(1)
It's because foldLeft is curried. As well as using the dot notation, you can also fix this by adding parentheses:
scala> (l foldLeft 1)(_ * _)
res3: Int = 6
Oh - and regarding your comment about not being able to invoke foldLeft(1), you can, but you need to partially apply it like this:
scala> (l foldLeft 1) _
res3: ((Int, Int) => Int) => Int = <function1>
Omitting the dot is possible because of scala's syntactic support for the infix notation, which expects 3 parts:
leftOperand operator rightOperand.
But because foldLeft has two parameter lists, you end up with 4 parts at the syntactic level: l foldLeft (1) (_ * _)
Which does not fit infix notation, hence the error.
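A small sketch of the two parameter lists at work. The names withSeed, product and sumPlusOne are invented for the example, and the partially applied form is written as an explicit lambda rather than with a trailing underscore, which is a choice made here for portability across compiler versions.

```scala
object CurriedFoldDemo {
  val l: List[Int] = List(1, 2, 3)

  // Dot notation: both parameter lists follow the method name directly.
  val viaDot: Int = l.foldLeft(1)(_ * _)

  // Infix notation: parenthesise the three-part infix expression first,
  // then apply the second parameter list to its result.
  val viaInfix: Int = (l foldLeft 1)(_ * _)

  // Supplying only the first parameter list leaves a function
  // that still expects the combining operator.
  val withSeed: ((Int, Int) => Int) => Int = op => l.foldLeft(1)(op)
  val product: Int = withSeed(_ * _)
  val sumPlusOne: Int = withSeed(_ + _)

  def main(args: Array[String]): Unit = {
    println(viaDot)
    println(viaInfix)
    println(product)
    println(sumPlusOne)
  }
}
```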

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
A few things to mention here, before giving the actual answer:
Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
The difference is not the implementation at all, just look at the signatures.
The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int] or (List[Int], Int). It calculates the sum and returns a tuple with a list of integers and its sum. By the way, the list is returned backwards, because we used foldLeft instead of foldRight.
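To illustrate the point about ordering, here is a sketch that puts listWithSum next to a foldRight variant. The name listWithSumInOrder and the shortened parameter names are made up for this comparison.

```scala
object ListWithSumDemo {
  // Same logic as above: prepending while walking left to right reverses the list.
  def listWithSum(numbers: List[Int]): (List[Int], Int) =
    numbers.foldLeft((List.empty[Int], 0)) { (acc, n) =>
      (n :: acc._1, n + acc._2)
    }

  // foldRight walks right to left, so prepending keeps the original order.
  // Note that the accumulator is the second argument of the function here.
  def listWithSumInOrder(numbers: List[Int]): (List[Int], Int) =
    numbers.foldRight((List.empty[Int], 0)) { (n, acc) =>
      (n :: acc._1, n + acc._2)
    }

  def main(args: Array[String]): Unit = {
    println(listWithSum(List(1, 2, 3)))
    println(listWithSumInOrder(List(1, 2, 3)))
  }
}
```

In both cases the result type (List[Int], Int) differs from the element type Int, which is something reduceLeft cannot express.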
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
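For a non-empty list the equivalence can be checked directly. This is only a sketch: the sample list and the subtraction operator are arbitrary choices, picked because subtraction makes the grouping visible.

```scala
object ReduceAsFoldDemo {
  val list: List[Int] = List(10, 3, 2)
  val op: (Int, Int) => Int = _ - _

  // reduceLeft seeds the computation with the first element: (10 - 3) - 2
  val reduced: Int = list.reduceLeft(op)

  // The same computation spelled out with foldLeft over the tail.
  val folded: Int = list.tail.foldLeft(list.head)(op)

  // With an explicit seed, the seed takes part as well: ((0 - 10) - 3) - 2
  val seeded: Int = list.foldLeft(0)(op)

  def main(args: Array[String]): Unit = {
    println(reduced)
    println(folded)
    println(seeded)
  }
}
```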
foldLeft is more generic, you can use it to produce something completely different than what you originally put in. Whereas reduceLeft can only produce an end result of the same type or super type of the collection type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, reduceLeft will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList.foldLeft("") { (a, b) => a + b }
is one potential option. Another is to use the reduceLeftOption variant, which returns an Option-wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
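The three behaviours on an empty collection can be compared side by side. This is a sketch that assumes a list of strings; the value names are invented, and Try is used only to capture the exception instead of letting it propagate.

```scala
import scala.util.Try

object EmptyCollectionDemo {
  val empty: List[String] = Nil

  // foldLeft falls back to the initial value.
  val folded: String = empty.foldLeft("")(_ + _)

  // reduceLeftOption signals "no result" with None.
  val reducedOption: Option[String] = empty.reduceLeftOption(_ + _)

  // reduceLeft throws; Try turns the exception into a Failure value.
  val reducedUnsafe: Try[String] = Try(empty.reduceLeft(_ + _))

  // On a non-empty list, reduceLeftOption wraps the result in Some.
  val nonEmpty: Option[String] = List("a", "b", "c").reduceLeftOption(_ + _)

  def main(args: Array[String]): Unit = {
    println(folded.isEmpty)
    println(reducedOption)
    println(reducedUnsafe.isFailure)
    println(nonEmpty)
  }
}
```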
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op xn
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft returns a value of type U, which is not necessarily the same as the element type T, whereas reduceLeft returns a value of the element type T.
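The abstract definitions above are elided, so they cannot be run as they stand. A runnable approximation, written as standalone functions over the standard List, might look like the sketch below. The names myFoldLeft and myReduceLeft are hypothetical, and the exception type is an arbitrary choice.

```scala
object FoldDefinitionsDemo {
  // Carries the accumulator z through the list, returning it once the list is empty.
  def myFoldLeft[T, U](xs: List[T], z: U)(op: (U, T) => U): U = xs match {
    case Nil       => z
    case x :: rest => myFoldLeft(rest, op(z, x))(op)
  }

  // Uses the head as the initial accumulator, so it has no answer for Nil.
  def myReduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match {
    case Nil       => throw new UnsupportedOperationException("Nil.reduceLeft")
    case x :: rest => myFoldLeft(rest, x)(op)
  }

  def main(args: Array[String]): Unit = {
    println(myFoldLeft(List(1, 2, 3), 0)(_ + _))
    println(myFoldLeft(List(1, 2, 3), "")((acc, n) => acc + n))
    println(myReduceLeft(List(5, 4, 3))(_ - _))
  }
}
```

The second call in main shows the type difference: the accumulator is a String while the elements are Int, which myReduceLeft cannot express because its operator must take and return T.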
To really understand what you are doing with fold/reduce,
check this: http://wiki.tcl.tk/17983
It is a very good explanation. Once you get the concept of fold,
reduce will come together with the answer above:
list.tail.foldLeft(list.head)(_)
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther