What does an underscore after a scala method call mean? - scala

The scala documentation has a code example that includes the following line:
val numberFunc = numbers.foldLeft(List[Int]())_
What does the underscore after the method call mean?

It's a partially applied function. You only provide the first parameter to foldLeft (the initial value), but you don't provide the second one; you postpone it for later. In the docs you linked they do it in the next line, where they define squares:
val numberFunc = numbers.foldLeft(List[Int]())_
val squares = numberFunc((xs, x) => xs:+ x*x)
See that (xs, x) => xs:+ x*x, that's the missing second parameter which you omitted while defining numberFunc. If you had provided it right away, then numberFunc would not be a function - it would be the computed value.
So basically the whole thing can also be written as a one-liner in the curried form:
val squares = numbers.foldLeft(List[Int]())((xs, x) => xs:+ x*x)
However, if you want to be able to reuse foldLeft over and over again, having the same collection and initial value, but providing a different function every time, then it's very convinient to define a separate numbersFunc (as they did in the docs) and reuse it with different functions, e.g.:
val squares = numberFunc((xs, x) => xs:+ x*x)
val cubes = numberFunc((xs, x) => xs:+ x*x*x)
...
Note that the compiler error message is pretty straightforward in case you forget the underscore:
Error: missing argument list for method foldLeft in trait
LinearSeqOptimized Unapplied methods are only converted to functions
when a function type is expected. You can make this conversion
explicit by writing foldLeft _ or foldLeft(_)(_) instead of
foldLeft. val numberFunc = numbers.foldLeft(ListInt)
EDIT: Haha I just realized that they did the exact same thing with cubes in the documentation.

I don't know if it helps but I prefer this syntax
val numberFunc = numbers.foldLeft(List[Int]())(_)
then numberFunc is basically a delegate corresponding to an instance method (instance being numbers) waiting for a parameter. Which later comes to be a lambda expression in the scala documentation example

Related

Folding lists in scala

Folding list in scala using /: and :\ operator
I tried to to look at different sites and they only talk about foldRight and foldLeft functions.
def sum(xs: List[Int]): Int = (0 /: xs) (_ + _)
sum(List(1,2,3))
res0: 6
The code segment works as described. But I am not able to completely understand the method definition. What I understand is that the one inside the first parenthesis -> 0 /: xs where /: is a right associate operator. The object is xs and the parameter is 0. I am not sure about the return type of the operation (most probably it would be another list?). The second part is a functional piece which sums its two parameters. But I don't understand what object invokes it ? and the name of function. Can someone please help me to understand.
The signature of :/ is
/:[B](z: B)(op: (B, A) ⇒ B): B
It is a method with multiple argument lists, so when it is just invoked with on argument (i.e. 0 /: xs in your case) the return type is (op: (B, A) ⇒ B): B. So you have to pass it a method with 2 parameters ( _ + _ ) that is used to combine the elements of the list starting from z.
This method is usually called foldLeft:
(0 /: xs)(_ + _) is the same as xs.foldLeft(0)(_ + _)
You can find more details here: https://www.scala-lang.org/api/2.12.3/scala/collection/immutable/List.html
Thanks #HaraldGliebe & #LuisMiguelMejíaSuárez for your great responses. I am enlightened now!. I am just summarisig the answer here which may benefit others who read this thread.
"/:" is actually the name of the function which is defined inside the List class. The signature of the function is: /:[B](z: B)(op: (B, A) ⇒ B): B --> where B is the type parameter, z is the first parameter; op is the second parameter which is of functional type.
The function follows curried version --> which means we can pass less number of parameters than that of the actual number. If we do that,
the partially applied function is stored in a temporary variable; we can then use the temporary variable to pass the remaining parameters.
If supplied with all parameters, "/:" can be called as: x./:(0)(_+_) where x is val/var of List type. OR "/:" can be called in two steps which are given as:
step:1 val temp = x./:(0)(_) where we pass only the first parameter. This results in a partially applied function which is stored in the temp variable.
step:2 temp(_+_) here using the partially applied function temp is passed with the second (final) parameter.
If we decide to follow the first style ( x./:(0)(_+_) ), calling the first parameter can be written in operator notion which is: x /: 0
Since the method name ends with a colon, the object will be pulled from right side. So x /: 0 is invalid and it has to be written as 0 /: x which is correct.
This one is equivalent to the temp variable. On following 0 /: x, second parameter also needs to be passed. So the whole construct becomes: (0/:x)(_+_)
This is how the definition of the function sum in the question, is interpreted.
We have to note that when we use curried version of the function in operator notion, we have to supply all the parameters in a single go.
That is: (0 /: x) (_) OR (0 /: x) _ seems throwing syntax errors.

why scala lambda with _ can't using && to combine two bool expression

As far as I understand .
_ is a short lambda to omit a=>
i find this code (can find here scala-function-true-power)
val file = List("warn 2013 msg", "warn 2012 msg", "error 2013 msg", "warn 2013 msg")
val size = file.filter(_.contains("warn")).filter(_.contains("2013")).size
//val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
val size2 = file.filter( a=> a.contains("warn") && a.contains("2013")).size
println("cat file | grep 'warn' | grep '2013' | wc : " +size )
the line to get size1 has syntax error,looks like it can't recognize the "_" ,it's not a element in fileList.
but i use a=>,the normal kind,it works good .
so,why the scala work by this way?
is there more difference in _ and a=> ?
In scala, any _ placeholder is matched against the passed arguments in the context of calling function. So for example if the signature of the function you are trying to use is f : A ⇒ B and you are calling something like collectionOfFunctA.map(_.f) - Scala compiler will infer the correct type of the function and will use the first underscore to put the actual item from a collection and call the function f over it. But if you will try to write it as collectionOfFunctA.map(_.f + _.size) - that will fail, because Scala compiler will pick up the first placeholder as of type that has function f defined, and the second underscore will not match any function in the context. So it will expect to have a function that takes two parameters instead of one.
More on this
As jdevelop says, but here in the words of the compiler/REPL:
scala> val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
<console>:8: error: missing parameter type for expanded function ((x$1, x$2) => x$1.contains("warn").$amp$amp(x$2.contains("2013")))
val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
^
<console>:8: error: missing parameter type for expanded function ((x$1: <error>, x$2) => x$1.contains("warn").$amp$amp(x$2.contains("2013")))
val size1 = file.filter(_.contains("warn") && _.contains("2013")).size
^
You see that hint: for expanded function ((x$1, x$2) => x$1.contains("warn").$amp$amp(x$2.contains("2013")))
It is expecting 2 parameters while there is just one.
You can think of the place holder as being matched with the lambda's arguments positionnally.
The first occurrence of the _ is matched with the first argument, the second occurence is matched with the second argument, etc.
As the other answers have shown, this means that using the placeholder twice will be desugared as trying to pass a lamba with 2 arguments to the filter which only expects one.
In your example :
val size = file.filter(_.contains("warn") && _.contains("2013")).size
would be desugared as
val size = file.filter((a,b)=>a.contains("warn") && b.contains("2013")).size
which will not compile since filter expects a predicate p: A => Boolean
Now, a reason the placeholder is matched positionnally is to avoid ambiguity in lambdas with more than one argument.
How can the compiler guess the correct implementation for the following case if the place holder can be reused multiple times for the same argument:
file.fold("")(_++_)
Should it be desugared as :
file.fold("")((a,b)=> a++b )
or as
file.fold("")((a,b)=> a++a )
or as
file.fold("")((a,b)=> b++b )
and worse, what would you expect for
file.fold("")(_++_++_)
There is no general way for the compiler to infer the correct implementation.
One might argue for relaxing the constraint when the expected lambda only accepts one argument. I suggest doing a more detailed research before taking the first steps to the scala improvement process as it seems likely that this particular design decision has been challenged and explained before.
If you are worried about the performance of iterating over the list twice (which is the case when you write)
file.filter(_.contains("warn")).filter(_.contains("2013")).size
In theory it should be possible for the compiler to detect that both filters can be applied within the same iteration.
In scala, the collections are eager by default but you can get the lazy evaluation by using views.
The current implementation has known issues which are being worked on. Other collection implementations in scala are actively being developed to be able to combine transformations and computations by default (see psp-std for example)

Scala collections: why do we need a case statement to extract values tuples in higher order functions?

Related to Tuple Unpacking in Map Operations, I don't understand why do we need a case (that looks like a partial function to me) to extract values from tuple, like that:
arrayOfTuples map {case (e1, e2) => e1.toString + e2}
Instead of extracting in the same way it works in foldLeft, for example
def sum(list: List[Int]): Int = list.foldLeft(0)((r,c) => r+c)
Anyway we don't specify the type of parameters in the first case, so why do we need the case statement?
Because in Scala function argument lists and tuples are not a unified concept as they are in Haskell and other functional languages. So a function:
(t: (Int, Int)) => ...
is not the same thing as a function:
(e1: Int, e2: Int) => ...
In the first case you can use pattern matching to extract the tuple elements, and that's always done using case syntax. Actually, the expression:
{case (e1, e2) => ...}
is shorthand for:
t => t match {case (e1, e2) => ...}
There has been some discussions about unifying tuples and function argument lists, but there are complications regarding Java overloading rules, and also default/named arguments. So, I think it's unlikely the concepts will ever be unified in Scala.
Lambda with one primitive parameter
With
var listOfInt=(1 to 100).toList
listOfInt.foldRight(0)((current,acc)=>current+acc)
you have a lambda function operating on two parameter.
Lambda with one parameter of type tuple
With
var listOfTuple=List((1,"a"),(2,"b"),(3," "))
listOfTuple.map(x => x._1.toString + x._2.toString)
you have a lambda function working on one parameter (of type Tuple2[Int, String])
Both works fine with type inference.
Partial lambda with one parameter
With
listOfTuple.map{case (x,y) => x.toString + y.toString}
you have a lambda function, working with one parameter (of type Tuple2[Int, String]). This lambda function then uses Tuple2.unapply internally to decompose the one parameter in multiple values. This still works fine with type inference. The case is needed for the decomposition ("pattern matching") of the value.
This example is a little bit unintuitive, because unapply returns a Tuple as its result. In this special case there might indeed be a trick, so Scala uses the provided tuple directly. But I am not really aware of such a trick.
Update: Lambda function with currying
Indeed there is a trick. With
import Function.tupled
listOfTuple map tupled{(x,y) => x.toString + y.toString}
you can directly work with the tuple. But of course this is really a trick: You provide a function operating on two parameters and not with a tuple. tupled then takes that function and changes it to a different function, operating on a tuple. This technique is also called uncurrying.
Remark:
The y.toString is superfluous when y is already a string. This is not considered good style. I leave it in for the sake of the example. You should omit it in real code.

When should I use Scala method & function

e.g.
val f1 = (a:Int) => a+1
def f2 = (a:Int) => a+1
Seems they are quite inter-changeable, are there any specific use case which I should use a particular one?
Typically one would write the def version as:
def f2(a:Int) = a+1
If you write it they way you did, it's actually creating a method f2 that, when called, constructs a new function closure. So in both of your cases, you still have a function. But in the first case, the function gets constructed only once and stored (as f1) for later use. But in the second case, the function gets constructed every time f2 is referenced, which is kind of a waste.
So when to prefer a function over a method? Well, you can work with a function a bit more easily in certain situations. Consider this case:
(1 to 3).map(f1 andThen f1) // totally fine
(1 to 3).map(f2 andThen f1) // "missing arguments" error
(1 to 3).map(f2 _ andThen f1) // fixed by adding "_" after the method name
So f1 is always a function and so you can call a method like andThen on it. The variable f2 represents a method, so it doesn't expect to be used like this; it always expects to be followed by arguments. You can always turn a method into a function by following it with an underscore, but this can look a little ugly if it's not necessary.

Scala underscore minimal function

Let's create a value for the sake of this question:
val a = 1 :: Nil
now, I can demonstrate that the anonymous functions can be written in shorthand form like this:
a.map(_*2)
is it possible to write a shorthand of this function?:
a.map((x) => x)
my solution doesn't work:
a.map(_)
For the record, a.map(_) does not work because it stands for x => a.map(x), and not a.map(x => x). This happens because a single _ in place of a parameter stands for a partially applied function. In the case of 2*_, that stands for an anonymous function. These two uses are so close that is very common to get confused by them.
Your first shorthand form can also be written point-free
a map (2*)
Thanks to multiplication being commutative.
As for (x) => x, you want the identity function. This is defined in Predef and is generic, so you can be sure that it's type-safe.
You should use identity function for this use case.
a.map(identity)
identity is defined in scala.Predef as:
implicit def identity[A](x: A): A = x