Named parameters vs _, dot notation vs infix operation, curly vs round brackets when using higher-order functions in Scala - scala

I'm having the hardest time understanding when I can or can't omit brackets and/or periods, and how this interplays with _.
The specific case I had with this was
val x: X = ???
val xss: List[List[X]] = ???
xss map x :: _ //this doesn't compile
xss map _.::(x) //this is the same as the above (and thus doesn't compile)
the above two seem to be identical to xss.map(_).::(x)
xss map (x :: _) //this works as expected
xss map {x :: _} //this does the same thing as the above
meanwhile, the following also fail:
xss.map xs => x :: xs //';' expected but '=>' found.
xss.map x :: _ //missing arguments for method map in class List; follow this method with `_' if you want to treat it as a partially applied function
//so when I try following the method with _, I get my favourite:
xss.map _ x :: _ //Cannot construct a collection of type That with elements of type B based on a collection of type List[List[Main.X]]
//as opposed to
xss map _ x :: _ //missing parameter type for expanded function ((x$1) => xss.map(x$1).x(($colon$colon: (() => <empty>))))
Right now, I often play "toggle the symbols until it compiles" which I believe to be a suboptimal programming strategy. How does this all work?

First we need to distinguish between xss.map(f) and xss map f. According to the Scala documentation, any method that takes a single parameter can be used as an infix operator.
The map method in List is one of these methods. Ignoring the full signature and the fact that it's inherited from TraversableLike, the signature is as follows:
final def map[B](f: (A) ⇒ B): List[B]
So it takes a single parameter, namely f, which is a function with type A => B. So if you have a function value defined as
val mySize = (xs:List[Int]) => xs.size
you can choose between
xss.map(mySize)
or
xss map mySize
This is a matter of preference, but according to the Scala Style Guide the latter is preferred in this case, unless it is part of a complex expression, where it's better to stick with dot notation.
Note that if you opt for dot notation, you always need to wrap the argument in brackets (round or curly). That's why none of the following compiles.
xss.map xs => x :: xs // Won't compile
xss.map x :: _ // Won't compile
xss.map _ x :: _ // Won't compile
But most of the time, instead of passing a function value, you pass a function literal (aka anonymous function). With dot notation you again need something like xss.map(_.size). With infix notation, it becomes a matter of precedence and of how far the placeholder _ reaches.
For example
xss map x :: _ // Won't compile!
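does not work. Operator precedence groups x :: _ together, but without brackets the placeholder _ is bound at the level of the whole expression, so the compiler reads the line roughly as y => xss map (x :: y) and cannot infer a type for y. You need brackets to delimit the function literal for the compiler: xss map (x :: _).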
The rule for curly braces is simple: any call that passes exactly one argument can use curly braces instead of round brackets, in both infix and dot notation. So the following statements will compile.
xss.map{x :: _}
xss map {x :: _}
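To make the working forms concrete, here is a minimal sketch that puts them side by side. The object name InfixDemo and the sample values are made up for illustration, and X is replaced by Int so the snippet compiles on its own.

```scala
object InfixDemo {
  // Illustrative stand-ins for the question's x and xss (X replaced by Int).
  val x: Int = 0
  val xss: List[List[Int]] = List(List(1, 2), List(3))

  // Dot notation: the argument must be wrapped in round or curly brackets.
  val viaDot: List[List[Int]] = xss.map(x :: _)

  // Infix notation: the brackets delimit the function literal `x :: _`.
  val viaInfix: List[List[Int]] = xss map (x :: _)

  // Curly braces are accepted for a single-argument call in either notation.
  val viaBraces: List[List[Int]] = xss map { x :: _ }

  // Fully explicit function literal that the shorthands above stand for.
  val explicit: List[List[Int]] = xss.map((xs: List[Int]) => x :: xs)
}
```

All four values hold the same list, with x prepended to every inner list.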
To avoid confusion, you can begin with dot notation and explicit parameter types. Once the code compiles (and probably after writing some unit tests for it), you can start refactoring by removing unnecessary types, using infix notation, and using curly braces instead of brackets where it makes sense.
For this purpose you can refer to the Scala Style Guide and to Martin Odersky's talk at Scala Days 2013 on Scala coding style. You can also ask your IDE for help with refactoring the code to be more concise.

Related

Scala: What is :: in { case x :: y :: _ => y}? Method or case class?

object Test1 extends App {
val list: List[Int] => Int = {
case x :: y :: _ => y //what is ::? method or case class?
}
println(list(List(1, 2, 3))) //result is 2.
}
In Scala IDE's syntax-coloring settings I set the foreground color of methods to red, but the :: in the pattern is still shown in black.
And I can't open declaration of black ::, so I don't know what it is.
If black :: is method, it should be called by this way:
... {case _.::(y).::(x) => y} //compile failed!
So, What is black ::? method or case class?
Thanks a lot!
In a pattern it is the class scala.::, not the method, as described here. For history's sake, in case that page goes away, here's the blurb:
About pattern matching on Lists
If you review the possible forms of patterns explained in Chapter 15,
you might find that neither List(...) nor :: looks like it fits one of
the kinds of patterns defined there. In fact, List(...) is an instance
of a library-defined extractor pattern. Such patterns will be treated
in Chapter 24. The "cons" pattern x :: xs is a special case of an
infix operation pattern. You know already that, when seen as an
expression, an infix operation is equivalent to a method call. For
patterns, the rules are different: When seen as a pattern, an infix
operation such as p op q is equivalent to op(p, q). That is, the infix
operator op is treated as a constructor pattern. In particular, a cons
pattern such as x :: xs is treated as ::(x, xs). This hints that there
should be a class named :: that corresponds to the pattern
constructor. Indeed there is such a class. It is named scala.:: and
is exactly the class that builds non-empty lists. So :: exists twice
in Scala, once as a name of a class in package scala, and again as a
method in class List. The effect of the method :: is to produce an
instance of the class scala.::. You'll find out more details about how
the List class is implemented in Chapter 22.
So it's scala.::(a,b)
Here an infix constructor (cons) pattern is applied.
case x :: y :: _ => y
means that the first value of the passed list is bound to x, the second value is bound to y, and all remaining values are matched by the wildcard pattern (_).
The case then returns y, the second value.
Another example: case x :: y :: z :: _ => z
This case returns the third element of the list.
Another example: case _ :: _ :: z :: _ => z
In this example the first and second elements are replaced with underscores because we don't need them.
It also throws a scala.MatchError if the passed list is shorter than the pattern expects. For example, the following throws:
val third: List[Int] => Int = {case _ :: _ :: z :: _ => z}
third(List(1, 2))
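As a small sketch of the points above, the infix pattern can be rewritten as an explicit constructor pattern on the class scala.::, and the failure on a short list can be observed with Try. The object and value names are hypothetical.

```scala
import scala.util.Try

object ConsPatternDemo {
  // Infix form of the cons pattern: binds the second element to y.
  val secondInfix: List[Int] => Int = {
    case _ :: y :: _ => y
  }

  // The same pattern written as a constructor pattern on the class scala.::
  val secondPrefix: List[Int] => Int = {
    case ::(_, ::(y, _)) => y
  }

  // A list that is too short matches no case, so a scala.MatchError is thrown;
  // Try captures it as a Failure instead of letting it propagate.
  val tooShort: Try[Int] = Try(secondInfix(List(1)))
}
```

Both functions return 2 for List(1, 2, 3), which shows that the infix and prefix spellings name the same pattern.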

Scala collections: why do we need a case statement to extract values tuples in higher order functions?

Related to Tuple Unpacking in Map Operations, I don't understand why we need a case (which looks like a partial function to me) to extract values from a tuple, like this:
arrayOfTuples map {case (e1, e2) => e1.toString + e2}
Instead of extracting in the same way it works in foldLeft, for example
def sum(list: List[Int]): Int = list.foldLeft(0)((r,c) => r+c)
Anyway we don't specify the type of parameters in the first case, so why do we need the case statement?
Because in Scala function argument lists and tuples are not a unified concept as they are in Haskell and other functional languages. So a function:
(t: (Int, Int)) => ...
is not the same thing as a function:
(e1: Int, e2: Int) => ...
In the first case you can use pattern matching to extract the tuple elements, and that's always done using case syntax. Actually, the expression:
{case (e1, e2) => ...}
is shorthand for:
t => t match {case (e1, e2) => ...}
There have been some discussions about unifying tuples and function argument lists, but there are complications regarding Java overloading rules, and also default/named arguments. So, I think it's unlikely the concepts will ever be unified in Scala.
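The equivalence between the case form and the explicit match can be checked with a short sketch. The object name and sample data below are made up for illustration.

```scala
object TupleCaseDemo {
  val pairs: List[(Int, String)] = List((1, "a"), (2, "b"))

  // Pattern-matching anonymous function: destructures the single tuple argument.
  val viaCase: List[String] = pairs.map { case (n, s) => n.toString + s }

  // What the case form is shorthand for: one tuple parameter, matched explicitly.
  val viaMatch: List[String] =
    pairs.map(t => t match { case (n, s) => n.toString + s })

  // Without pattern matching, the tuple fields are accessed by position.
  val viaFields: List[String] = pairs.map(t => t._1.toString + t._2)
}
```

All three calls pass map a function of one parameter (the tuple); only the way the tuple is taken apart differs.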
Lambda with one primitive parameter
With
var listOfInt=(1 to 100).toList
listOfInt.foldRight(0)((current,acc)=>current+acc)
you have a lambda function operating on two parameters.
Lambda with one parameter of type tuple
With
var listOfTuple=List((1,"a"),(2,"b"),(3," "))
listOfTuple.map(x => x._1.toString + x._2.toString)
you have a lambda function working on one parameter (of type Tuple2[Int, String]).
Both work fine with type inference.
Partial lambda with one parameter
With
listOfTuple.map{case (x,y) => x.toString + y.toString}
you have a lambda function, working with one parameter (of type Tuple2[Int, String]). This lambda function then uses Tuple2.unapply internally to decompose the one parameter in multiple values. This still works fine with type inference. The case is needed for the decomposition ("pattern matching") of the value.
This example is a little bit unintuitive, because unapply returns a Tuple as its result. In this special case there might indeed be a trick, so Scala uses the provided tuple directly. But I am not really aware of such a trick.
Update: Lambda function with currying
Indeed there is a trick. With
import Function.tupled
listOfTuple map tupled{(x,y) => x.toString + y.toString}
you can directly work with the tuple. But of course this is really a trick: You provide a function operating on two parameters and not with a tuple. tupled then takes that function and changes it to a different function, operating on a tuple. This technique is also called uncurrying.
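A minimal sketch of what tupled does to the function's type, with the parameter types written out so that nothing depends on inference. The names concat and concatTupled are hypothetical.

```scala
import scala.Function.tupled

object TupledDemo {
  // A plain two-parameter function (a Function2).
  val concat: (Int, String) => String = (n, s) => n.toString + s

  // tupled adapts it into a one-parameter function that takes a Tuple2.
  val concatTupled: ((Int, String)) => String = tupled(concat)

  val result: List[String] = List((1, "a"), (2, "b")).map(concatTupled)

  // Function2 offers the same adaptation as a method.
  val sameResult: List[String] = List((1, "a"), (2, "b")).map(concat.tupled)
}
```

The extra pair of brackets in ((Int, String)) => String is what distinguishes "one tuple parameter" from "two parameters" in a function type.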
Remark:
The y.toString is superfluous when y is already a string. This is not considered good style. I leave it in for the sake of the example. You should omit it in real code.

difference between foldLeft and reduceLeft in Scala

I have learned the basic difference between foldLeft and reduceLeft
foldLeft:
initial value has to be passed
reduceLeft:
takes first element of the collection as initial value
throws exception if collection is empty
Is there any other difference ?
Any specific reason to have two methods with similar functionality?
A few things to mention here before giving the actual answer:
Your question doesn't have anything to do with left, it's rather about the difference between reducing and folding
The difference is not the implementation at all, just look at the signatures.
The question doesn't have anything to do with Scala in particular, it's rather about the two concepts of functional programming.
Back to your question:
Here is the signature of foldLeft (could also have been foldRight for the point I'm going to make):
def foldLeft [B] (z: B)(f: (B, A) => B): B
And here is the signature of reduceLeft (again the direction doesn't matter here)
def reduceLeft [B >: A] (f: (B, A) => B): B
These two look very similar and thus caused the confusion. reduceLeft is a special case of foldLeft (which by the way means that you sometimes can express the same thing by using either of them).
When you call reduceLeft say on a List[Int] it will literally reduce the whole list of integers into a single value, which is going to be of type Int (or a supertype of Int, hence [B >: A]).
When you call foldLeft say on a List[Int] it will fold the whole list (imagine rolling a piece of paper) into a single value, but this value doesn't have to be even related to Int (hence [B]).
Here is an example:
def listWithSum(numbers: List[Int]) = numbers.foldLeft((List.empty[Int], 0)) {
(resultingTuple, currentInteger) =>
(currentInteger :: resultingTuple._1, currentInteger + resultingTuple._2)
}
This method takes a List[Int] and returns a Tuple2[List[Int], Int], also written (List[Int], Int). It calculates the sum and returns a tuple containing the list of integers and its sum. By the way, the list is returned in reverse order, because we used foldLeft instead of foldRight.
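For a concrete run, here is a sketch of the same fold with the accumulator tuple destructured through a case pattern, which is only a readability choice and not part of the original answer. The wrapper object name is hypothetical.

```scala
object ListWithSumDemo {
  // Same fold as listWithSum, with the accumulator destructured via `case`.
  def listWithSum(numbers: List[Int]): (List[Int], Int) =
    numbers.foldLeft((List.empty[Int], 0)) {
      case ((seen, sum), n) => (n :: seen, sum + n)
    }

  // Steps: (Nil, 0) -> (List(1), 1) -> (List(2, 1), 3) -> (List(3, 2, 1), 6)
  val result: (List[Int], Int) = listWithSum(List(1, 2, 3))

  // reduceLeft has to stay within Int (or a supertype), so it can only give the sum.
  val sumOnly: Int = List(1, 2, 3).reduceLeft(_ + _)
}
```

The fold's result type (List[Int], Int) is unrelated to the element type Int, which is exactly what reduceLeft cannot express.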
Watch One Fold to rule them all for a more in depth explanation.
reduceLeft is just a convenience method. It is equivalent to
list.tail.foldLeft(list.head)(_)
foldLeft is more generic: you can use it to produce something completely different from what you originally put in, whereas reduceLeft can only produce a result of the same type as, or a supertype of, the collection's element type. For example:
List(1,3,5).foldLeft(0) { _ + _ }
List(1,3,5).foldLeft(List[String]()) { (a, b) => b.toString :: a }
The foldLeft will apply the closure with the last folded result (first time using initial value) and the next value.
reduceLeft on the other hand will first combine two values from the list and apply those to the closure. Next it will combine the rest of the values with the cumulative result. See:
List(1,3,5).reduceLeft { (a, b) => println("a " + a + ", b " + b); a + b }
If the list is empty foldLeft can present the initial value as a legal result. reduceLeft on the other hand does not have a legal value if it can't find at least one value in the list.
For reference, reduceLeft will error if applied to an empty container with the following error.
java.lang.UnsupportedOperationException: empty.reduceLeft
Reworking the code to use
myList.foldLeft("") {(a, b) => a + b} // assuming myList: List[String]
is one potential option. Another is to use the reduceLeftOption variant which returns an Option wrapped result.
myList reduceLeftOption {(a,b) => a+b} match {
case None => // handle no result as necessary
case Some(v) => println(v)
}
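The three behaviours on an empty list can be compared in one sketch. It assumes a List[String]; the object and value names are made up for illustration.

```scala
import scala.util.Try

object EmptyReduceDemo {
  val empty: List[String] = Nil

  // foldLeft falls back to its initial value.
  val folded: String = empty.foldLeft("")(_ + _)

  // reduceLeftOption reports "no elements" as None instead of throwing.
  val reduced: Option[String] = empty.reduceLeftOption(_ + _)

  // reduceLeft throws UnsupportedOperationException; Try captures it as a Failure.
  val thrown: Try[String] = Try(empty.reduceLeft(_ + _))

  // On a non-empty list the Option simply wraps the reduced value.
  val nonEmpty: Option[String] = List("a", "b", "c").reduceLeftOption(_ + _)
}
```

Which variant fits depends on whether an empty input has a sensible default (foldLeft) or should be surfaced to the caller (reduceLeftOption).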
The basic reason they are both in Scala standard library is probably because they are both in Haskell standard library (called foldl and foldl1). If reduceLeft wasn't, it would quite often be defined as a convenience method in different projects.
From Functional Programming Principles in Scala (Martin Odersky):
The function reduceLeft is defined in terms of a more general function, foldLeft.
foldLeft is like reduceLeft but takes an accumulator z, as an additional parameter, which is returned when foldLeft is called on an empty list:
(List (x1, ..., xn) foldLeft z)(op) = (...(z op x1) op ...) op xn
[as opposed to reduceLeft, which throws an exception when called on an empty list.]
The course (see lecture 5.5) provides abstract definitions of these functions, which illustrates their differences, although they are very similar in their use of pattern matching and recursion.
abstract class List[T] { ...
def reduceLeft(op: (T,T)=>T) : T = this match{
case Nil => throw new Error("Nil.reduceLeft")
case x :: xs => (xs foldLeft x)(op)
}
def foldLeft[U](z: U)(op: (U,T)=>U): U = this match{
case Nil => z
case x :: xs => (xs foldLeft op(z, x))(op)
}
}
Note that foldLeft returns a value of type U, which is not necessarily the same as the element type T, whereas reduceLeft returns a value of the element type T.
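The course definitions live inside a custom List class. As a sketch, the same recursion can be written as standalone functions over the standard List; myFoldLeft and myReduceLeft are hypothetical names, not library methods.

```scala
import scala.annotation.tailrec

object FoldSketch {
  // Hypothetical helper mirroring the course's foldLeft on the standard List.
  @tailrec
  def myFoldLeft[T, U](xs: List[T], z: U)(op: (U, T) => U): U = xs match {
    case Nil     => z
    case y :: ys => myFoldLeft(ys, op(z, y))(op)
  }

  // Hypothetical helper mirroring reduceLeft: the head becomes the initial value.
  def myReduceLeft[T](xs: List[T])(op: (T, T) => T): T = xs match {
    case Nil     => throw new UnsupportedOperationException("Nil.reduceLeft")
    case y :: ys => myFoldLeft(ys, y)(op)
  }
}
```

For instance, FoldSketch.myFoldLeft(List(1, 2, 3), "")((acc, n) => acc + n) builds a String from Ints, while myReduceLeft is confined to the element type.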
To really understand what you are doing with fold/reduce,
check this: http://wiki.tcl.tk/17983
It is a very good explanation. Once you get the concept of fold,
reduce will follow from the answer above:
list.tail.foldLeft(list.head)(_)
Scala 2.13.3, Demo:
val names = List("Foo", "Bar")
println("ReduceLeft: "+ names.reduceLeft(_+_))
println("ReduceRight: "+ names.reduceRight(_+_))
println("Fold: "+ names.fold("Other")(_+_))
println("FoldLeft: "+ names.foldLeft("Other")(_+_))
println("FoldRight: "+ names.foldRight("Other")(_+_))
outputs:
ReduceLeft: FooBar
ReduceRight: FooBar
Fold: OtherFooBar
FoldLeft: OtherFooBar
FoldRight: FooBarOther

Usage of _ in scala lambda functions

Can anyone please explain to me why I can do:
a.mapValues(_.size)
instead of
a.mapValues(x => x.size)
but I can't do
a.groupBy(_)
instead of
a.groupBy(x => x)
When you write a.groupBy(_) the compiler understands it as an anonymous function:
x => a.groupBy(x)
According to the Scala Specification §6.23, an underscore placeholder in an expression is replaced by an anonymous parameter. So:
_ + 1 is expanded to x => x + 1
f(_) is expanded to x => f(x)
_ is not expanded by itself (the placeholder is not part of any expression).
The expression x => a.groupBy(x) will confuse the compiler because it cannot infer the type of x. If a is some collection of type E elements, then the compiler expects x to be a function of type (E) => K, but type K cannot be inferred...
It isn't easy to see it here:
a.groupBy(_)
But it's easier to see it in something like this:
a.mkString("<", _, ">")
I'm partially applying the method/function. I'm applying it to some parameters (the first and last), and leaving the second parameter unapplied, so I'm getting a new function like this:
x => a.mkString("<", x, ">")
The first example is just a special case where the sole parameter is partially applied. When you use underscore on an expression, however, it stands for positional parameters in an anonymous function.
a.mapValues(_.size)
a.mapValues(x => x.size)
It is easy to get confused, because they both result in an anonymous function. In fact, there's a third underscore that is used to convert a method into a method value (which is also an anonymous function), such as:
a.groupBy _
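The three uses of the underscore can be put next to each other in a short sketch. It is written with Scala 2 syntax in mind; the object name, the values, and the helper method toUpper are made up for illustration.

```scala
object UnderscoreDemo {
  val a: List[Int] = List(1, 2, 3)

  // Placeholder inside an expression: expands to x => x * 2.
  val doubled: List[Int] = a.map(_ * 2)

  // Placeholder as a whole argument: expands to sep => a.mkString("<", sep, ">").
  // The declared type String => String supplies the parameter type.
  val wrap: String => String = a.mkString("<", _, ">")

  // Method value: the trailing underscore turns a method into a function object.
  def toUpper(s: String): String = s.toUpperCase
  val upper: String => String = toUpper _
}
```

Here wrap(", ") applies the remaining argument, and upper can be passed around like any other function value.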

Scala underscore minimal function

Let's create a value for the sake of this question:
val a = 1 :: Nil
now, I can demonstrate that the anonymous functions can be written in shorthand form like this:
a.map(_*2)
is it possible to write a shorthand of this function?:
a.map((x) => x)
my solution doesn't work:
a.map(_)
For the record, a.map(_) does not work because it stands for x => a.map(x), and not a.map(x => x). This happens because a single _ standing in for a whole argument produces a partially applied function, whereas in _*2 the _ is part of a larger expression and produces an anonymous function of that expression. These two uses are so close that it is very common to confuse them.
Your first shorthand form can also be written point-free
a map (2*)
Thanks to multiplication being commutative.
As for (x) => x, you want the identity function. This is defined in Predef and is generic, so you can be sure that it's type-safe.
You should use identity function for this use case.
a.map(identity)
identity is defined in scala.Predef. In older Scala 2 releases the definition read as follows (later releases drop the implicit modifier):
implicit def identity[A](x: A): A = x
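A minimal sketch of identity in use, with the explicit lambda next to it for comparison. The object name and the groupBy example are illustrative additions.

```scala
object IdentityDemo {
  val a: List[Int] = 1 :: Nil

  // The explicit lambda and Predef.identity give the same result.
  val explicit: List[Int] = a.map(x => x)
  val viaIdentity: List[Int] = a.map(identity)

  // A place where identity reads naturally: grouping equal elements together.
  val grouped: Map[String, List[String]] = List("a", "b", "a").groupBy(identity)
}
```

The groupBy(identity) call is also the working replacement for the a.groupBy(_) attempt discussed earlier on this page.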