What does the parameter "_" in the method call below signify?
Is it a wildcard that accepts a parameter of any type?
val integerSorter = msort[Int]((a, b) => a < b) _
The signature of the method msort:
def msort[T](less: (T, T) => Boolean)(xs: List[T]): List[T] = {
The easiest way to explain this is probably to let the compiler do most of the explaining—just try the first line without the underscore:
scala> val integerSorter = msort[Int]((a, b) => a < b)
<console>:11: error: missing arguments for method msort;
follow this method with `_' if you want to treat it as a partially applied function
val integerSorter = msort[Int]((a, b) => a < b)
^
So there you have it—the msort method has two parameter lists, but you've only passed arguments for the first, and the trailing underscore is the syntax that Scala provides to tell the compiler that you want partial application in that situation.
(If you try that line in the REPL with the underscore, you'll see that the inferred type of integerSorter is List[Int] => List[Int], so to answer your second question, no, the underscore doesn't allow you to provide a parameter of any type.)
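To make this concrete, here is a minimal sketch in Scala 2 syntax. The question only shows the signature of msort, so the merge sort body and the wrapper object MsortDemo are assumptions added for illustration:

```scala
object MsortDemo {
  // Assumed body: the question shows only the signature, so this merge sort is illustrative.
  def msort[T](less: (T, T) => Boolean)(xs: List[T]): List[T] = {
    def merge(left: List[T], right: List[T]): List[T] = (left, right) match {
      case (Nil, _) => right
      case (_, Nil) => left
      case (l :: ls, r :: rs) =>
        if (less(l, r)) l :: merge(ls, right) else r :: merge(left, rs)
    }
    val n = xs.length / 2
    if (n == 0) xs
    else {
      val (front, back) = xs.splitAt(n)
      merge(msort(less)(front), msort(less)(back))
    }
  }

  // Only the first parameter list is supplied; the trailing underscore turns
  // the remainder into a function value of type List[Int] => List[Int].
  val integerSorter: List[Int] => List[Int] = msort[Int]((a, b) => a < b) _
}
```

Calling MsortDemo.integerSorter(List(3, 1, 2)) then supplies the second parameter list, and it only accepts a List[Int].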
For more information, see section 6.7 of the language specification:
The expression e _ is well-formed if e is of method type or if e is a
call-by-name parameter. If e is a method with parameters, e _
represents e converted to a function type by eta expansion
(§6.26.5).
Reading the section on eta expansion may also be helpful.
msort takes two parameter lists: a function that returns a Boolean, and a list of items to be sorted. integerSorter supplies the first, and the underscore stands for the list that still needs to be specified. Look up currying (http://www.scala-lang.org/old/node/135.html) for a more detailed explanation.
Related
2 different examples, the first one works:
import cats.syntax.either._
val e = 10.asRight[String]
def i2s(i:Int):String = i.toString
e.map(i => List(i2s(i))) //using explicit parameter
e.map(List(i2s(_))) //using no-name _ parameter
Now the same example with Option is not compiled:
e.map(Option(i2s(_)))
The error:
Error:(27, 15) type mismatch;
found : Option[Int => String]
required: Int => ?
e.map(Option(i2s(_)))
With explicit parameter it works fine:
e.map(i => Option(i2s(i)))
In both cases apply method is invoked with List and Option. List.apply signature:
def apply[A](xs: A*): List[A] = ???
Option.apply signature:
def apply[A](x: A): Option[A]
Please explain the difference.
Both of your List examples compile but they don't mean the same thing and don't produce the same results.
e.map(i => List(i2s(i))) //res0: scala.util.Either[String,List[String]] = Right(List(10))
e.map(List(i2s(_))) //java.lang.IndexOutOfBoundsException: 10
The 1st is easy to understand, so what's going on with the 2nd?
What's happening is that you're using eta expansion to create an Int => String function from the i2s() method. You then populate a List with that single function as the only element in the list, and then try to retrieve the value at index 10, which doesn't exist, thus the exception.
If you change the 1st line to val e = 0.asRight[String] then the exception goes away because something does exist at index 0, the function that was just put in there.
This compiles because a List instance will accept an Int as a parameter (via the hidden apply() method), but an Option instance does not have an apply() method that takes an Int (*) so that can't be compiled.
(*) The Option object does have an apply() method, but that's a different animal.
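The behaviour described above can be reproduced without cats. This is a sketch that assumes scala.util.Either (right-biased since Scala 2.12) in place of the cats syntax, and the names ListAsFunctionDemo, fs and picked are made up for illustration:

```scala
object ListAsFunctionDemo {
  def i2s(i: Int): String = i.toString

  // Stand-in for 0.asRight[String] from the question.
  val e: Either[String, Int] = Right(0)

  // A one-element list whose only element is the function x => i2s(x).
  val fs: List[Int => String] = List(i2s(_))

  // A List is itself usable as an Int => A function (index lookup),
  // so map accepts it and looks up index 0, returning the stored function.
  val picked: Either[String, Int => String] = e.map(fs)
}
```

With Right(10) instead of Right(0), the same map call would look up index 10 and throw, which matches the exception in the question.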
There are multiple things at play here as to why your first example with List[A] works. First, let's look at the expansion that happens on the expression:
val res: Either[String, Int => String] =
e.map[Int => String](List.apply[Int => String](((x$1: Int) => FlinkTest.this.i2s(x$1))));
Notice two things:
The expansion of the lambda expression happens inside List.apply, and not outside of List.apply as you perhaps expected, like this:
e.map(i => List(i2s(i)))
The return type from .map is not Either[String, List[Int => String]] but Either[String, Int => String]. This is because List[A] extends PartialFunction[Int, A] in its type hierarchy, so the list itself is accepted as the function argument to map, and map applies it to the Int inside the Right (an index lookup).
This doesn't work for Option[A], as it doesn't extend PartialFunction anywhere in its type hierarchy.
The key takeaway here is that the expansion of the lambda expression doesn't work as you expect: List(i2s(_)) expands to List(x => i2s(x)) and not to i => List(i2s(i)). For more on underscore expansion, see What are all the uses of an underscore in Scala?
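As a small sketch of the form that behaves as intended for both containers, the explicit parameter keeps the whole List(...) or Option(...) call inside the lambda body. It assumes scala.util.Either rather than cats, and the object name is hypothetical:

```scala
object PlaceholderScopeDemo {
  def i2s(i: Int): String = i.toString

  // Stand-in for 10.asRight[String] from the question.
  val e: Either[String, Int] = Right(10)

  // The lambda wraps the whole constructor call, so map receives Int => List[String].
  val viaList: Either[String, List[String]] = e.map(i => List(i2s(i)))

  // Same shape with Option: map receives Int => Option[String].
  val viaOption: Either[String, Option[String]] = e.map(i => Option(i2s(i)))
}
```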
I am new to Scala and am trying to understand the function documentation for Scala in Spark. The flatMap function has documentation like this:
def flatMap[U](f: (T) ⇒ TraversableOnce[U])(implicit arg0: ClassTag[U]): RDD[U]
Return a new RDD by first applying a function to all elements of this RDD, and then flattening the results.
Although I know exactly what flatMap does, the documentation seems too cryptic (with letters like U, f, T, etc.). I would appreciate it if someone could explain what each part of this documentation conveys.
def flatMap: this is a method called flatMap.
[U]: it's generic, with one type parameter, U.
(f: (T) ⇒ TraversableOnce[U]): it takes one argument, f, of type T ⇒ TraversableOnce[U] (T is the generic parameter of RDD itself, so e.g. if you have an RDD[String] then T = String). So f is a one-parameter function that takes a T and returns a TraversableOnce[U]. Remember that U is the type parameter (generic) on the method. So you can call this method with any function that takes T and returns TraversableOnce[Something].
(implicit arg0: ClassTag[U]): the method requires an implicit parameter of type ClassTag[U] to be available. Implicits like this often constrain what types a type parameter can be. In this case ClassTag means that the type U needs to have concrete type information available at compile time. In practice you can ignore this unless you're trying to call flatMap from a generic method of your own.
: RDD[U]: the method returns an RDD[U]. Remember U was the type parameter on the method. So if you call flatMap with an f that returns TraversableOnce[Int], the return type will be RDD[Int]; if you call flatMap with an f that returns TraversableOnce[Potato] the return type will be RDD[Potato], and so on.
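The pieces of the signature walked through above can be tried out without Spark. In this sketch, MiniRDD is a hypothetical stand-in for RDD backed by a Vector, and Iterable[U] is used in place of TraversableOnce[U]; neither is Spark's actual implementation:

```scala
import scala.reflect.ClassTag

// MiniRDD is a hypothetical stand-in for Spark's RDD, backed by a plain Vector.
final case class MiniRDD[T](items: Vector[T]) {
  // Same shape as the documented signature: type parameter U, a function
  // from T to a collection of U, and an implicit ClassTag[U] in its own list.
  def flatMap[U](f: T => Iterable[U])(implicit tag: ClassTag[U]): MiniRDD[U] =
    MiniRDD(items.flatMap(f))
}

object FlatMapSignatureDemo {
  val lines: MiniRDD[String] = MiniRDD(Vector("a b", "c"))

  // T = String, U = String: the compiler supplies the ClassTag[String].
  val words: MiniRDD[String] = lines.flatMap(line => line.split(" ").toList)

  // T = String, U = Int: the result type follows what f returns.
  val lengths: MiniRDD[Int] = words.flatMap(w => List(w.length))
}
```

Note that the caller never passes the ClassTag explicitly; the compiler fills in the implicit parameter list.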
def flatMap[U](f: (T) ⇒ TraversableOnce[U])(implicit arg0: ClassTag[U]): RDD[U]
Try replacing T with Person and U with Pet.
flatMap takes a function f as an argument. This function takes an instance of type Person as an argument, and returns a collection of Pets - i.e., that person's pets. flatMap will then return a single collection of Pets - i.e., RDD[Pet].
def flatMap[Pet](f: (Person) ⇒ TraversableOnce[Pet])(implicit arg0: ClassTag[Pet]): RDD[Pet]
// usage
val allPets = people.flatMap(person => person.pets)
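A runnable version of the Person and Pet substitution might look like the following. It uses a plain List instead of an RDD, and the case classes and sample data are assumptions made up for the example:

```scala
object PetsDemo {
  // Hypothetical domain types; the answer only names Person and Pet.
  final case class Pet(name: String)
  final case class Person(name: String, pets: List[Pet])

  val people: List[Person] = List(
    Person("Ann", List(Pet("Rex"), Pet("Tom"))),
    Person("Bob", Nil)
  )

  // Each person maps to a collection of pets; flatMap flattens them into one list.
  val allPets: List[Pet] = people.flatMap(person => person.pets)
}
```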
The implicit ClassTag on the second parameter list is a different story. That's used to ask the compiler to create a manifest for the type Pet, so that flatMap can reflect on the type.
Read more about it here: TypeTags and Manifests
I'm a bit confused by this Scala notation:
List(1, 2, 3).foldLeft(0)((x, acc) => acc+x)
Both "0" and the function are arguments for foldLeft, so why are they passed in two adjacent groups of parentheses? I'd expect this to work:
List(1, 2, 3).foldLeft(0, ((x, acc) => acc+x))
But it doesn't. Can anyone explain this to me? Also, how and why would you declare a function like this? Thanks
Scala allows you to have multiple argument lists:
def foo(a: Int)(b: String) = ???
def bar(a: Int)(b: String)(c: Long) = ???
The reason for using this syntax for foldLeft is the way the compiler does type inference: types already inferred from an earlier argument group are used to infer types in the following groups. In the case of foldLeft, this lets you drop the type ascriptions on (x, acc), so instead of:
List(1, 2, 3).foldLeft(0)((x: Int, acc: Int) => acc+x)
you can write just
List(1, 2, 3).foldLeft(0)((x, acc) => acc+x)
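The inference difference can be seen side by side with two hypothetical helpers, foldOneList and foldTwoLists, which are not part of the standard library and only differ in how their parameters are grouped:

```scala
object ParamListInferenceDemo {
  // Everything in one parameter list: under Scala 2, the lambda's parameter
  // types must be written out at the call site.
  def foldOneList[A, B](xs: List[A], z: B, op: (B, A) => B): B =
    xs.foldLeft(z)(op)

  // The function sits in a second parameter list: A and B are already fixed
  // by the first list, so the lambda needs no annotations.
  def foldTwoLists[A, B](xs: List[A], z: B)(op: (B, A) => B): B =
    xs.foldLeft(z)(op)

  val annotated: Int = foldOneList(List(1, 2, 3), 0, (acc: Int, x: Int) => acc + x)
  val inferred: Int = foldTwoLists(List(1, 2, 3), 0)((acc, x) => acc + x)
}
```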
This is an example of multiple parameter lists in Scala. They're really just syntactic sugar for a normal method call (if you look at the class file's method signatures with javap you'll see that when compiled to Java bytecode they're all combined into a single argument list). The reasons for supporting multiple parameter lists are twofold:
Passing functions as arguments: Scala will allow you to replace a parameter list that takes a single argument with a function literal in curly braces {}. For example, your code could be re-written as List(1, 2, 3).foldLeft(0) { (x, acc) => acc+x }, which might be considered more readable. (Then again, I'd just use List(1, 2, 3).foldLeft(0)(_+_) in this case...) Being able to use curly braces like this makes it possible for the user to declare new functions that look more like native syntax. A good example of this is the react function for Actors.
Type inference: There are some details of the type inference process (which I admit I don't fully understand) that make it easier to infer the types used in a later list based on the types in an earlier list. For example, the initial z value passed to foldLeft is used to infer the result type (and left argument type) of the function parameter.
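The three call styles mentioned in the first point are equivalent; a short sketch (the object and value names are only for illustration):

```scala
object FoldSyntaxDemo {
  val xs = List(1, 2, 3)

  // Function literal in parentheses.
  val withParens: Int = xs.foldLeft(0)((acc, x) => acc + x)

  // The single-argument second list written with curly braces.
  val withBraces: Int = xs.foldLeft(0) { (acc, x) => acc + x }

  // Placeholder syntax: each underscore stands for one parameter, in order.
  val withPlaceholders: Int = xs.foldLeft(0)(_ + _)
}
```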
Because in Scala you can define function arguments in multiple groups separated by ()
def test(a: String)(b: String)(implicit ev: Something) { }
The most practical scenario is when an implicit parameter (for example, via a context bound) or currying is required, e.g. a specific implicit definition must be available in scope.
For instance, Future will expect an implicit executor. Look here.
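As a sketch of the Future case, the method below takes its ExecutionContext in a separate implicit parameter list. The name doubleAsync and the use of the global execution context are assumptions chosen for the example:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ImplicitListDemo {
  // The second parameter list is implicit: callers normally leave it out
  // and the compiler looks for an ExecutionContext in scope.
  def doubleAsync(n: Int)(implicit ec: ExecutionContext): Future[Int] =
    Future(n * 2)

  // Assumed for the sketch: use the global execution context.
  implicit val ec: ExecutionContext = ExecutionContext.global

  // A def rather than a val, so the Future is not started during object initialization.
  def run(): Int = Await.result(doubleAsync(21), 2.seconds)
}
```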
If you look at the definition of the foldLeft method, you will see that the first argument is the initial accumulator value and the second, in its own parameter list, is the function that combines the accumulator with each element.
def foldLeft[B](z: B)(op: (B, A) ⇒ B): B
The parentheses thing is a very useful separation of concerns.
Also, once you define a method with:
def test(a: String)(b: String)
You can't call it with: test("a", "b");
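A short sketch of how such a method is called instead; the String return type and the body of test are assumptions, since the answer leaves the body out:

```scala
object CallShapeDemo {
  // Assumed body for illustration: concatenate the two arguments.
  def test(a: String)(b: String): String = a + b

  // Each parameter list gets its own pair of parentheses.
  val full: String = test("a")("b")

  // Supplying only the first list works when a function type is expected.
  val prefixed: String => String = test("a")
}
```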
I wrote the following
def mapFun[T, U](xs: List[T], f: T => U): List[U] = (xs foldRight List[U]())( f(_)::_ )
and when I did
def f(x: Int):Int=x*x
mapFun(List(1,2,3), f)
It worked fine. However, I really wanted to make the following work too
mapFun(List(1,2,3), x=>x*x)
It complains about "missing parameter type". I know that I could use currying, but is there any way to still use anonymous function for non-currying def I had above?
It seems to me that because "f" is in the same parameter list as "xs", you're required to give some information regarding the type of x so that the compiler can solve it.
In your case, this will work:
mapFun(List(1,2,3) , (x: Int) => x * x)
Do you see how I'm informing the compiler that x is an Int?
A "trick" that you can do is currying f. If you don't know what currying is check this out: http://www.codecommit.com/blog/scala/function-currying-in-scala
You will end up with a mapFun like this:
def mapFun[T, U](xs: List[T])(f: T => U): List[U] =
(xs foldRight List[U]())( f(_)::_ )
And this will work:
mapFun(List(1,2,3))(x => x * x)
In the last call, the type of x is resolved when the compiler checks the first parameter list.
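The options can be compared in one place. This sketch renames the curried variant to mapFunCurried (a name introduced here so both definitions can coexist) and writes the fold with explicit parameters:

```scala
object MapFunDemo {
  // Single parameter list, as in the question.
  def mapFun[T, U](xs: List[T], f: T => U): List[U] =
    xs.foldRight(List.empty[U])((x, acc) => f(x) :: acc)

  // Curried variant: f lives in a second parameter list.
  def mapFunCurried[T, U](xs: List[T])(f: T => U): List[U] =
    xs.foldRight(List.empty[U])((x, acc) => f(x) :: acc)

  // Option 1: annotate the lambda parameter.
  val annotated: List[Int] = mapFun(List(1, 2, 3), (x: Int) => x * x)

  // Option 2: pass the type arguments explicitly.
  val explicitTypes: List[Int] = mapFun[Int, Int](List(1, 2, 3), x => x * x)

  // Option 3: curried form, where T is known before f is type-checked.
  val curried: List[Int] = mapFunCurried(List(1, 2, 3))(x => x * x)
}
```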
EDIT:
As Dominic pointed out, you could tell the compiler what your types are. Leading to:
mapFun[Int, Int](List(1,2,3), x => x * x)
Cheers!
The limitation of Scala's type system that you're running into here is that type information flows from left to right across parameter groups but does not flow from left to right within a parameter group.
What this means is that fixing the type parameter T by providing a List[Int] does not make that information available to other parameters in the same group, such as f, which results in a missing parameter type error. It is made available to f if f is part of the next parameter group, which is why the curried approach works.
i.e. if you defined it like this:
def mapFun[T, U](xs: List[T])(f: T => U): List[U] = (xs foldRight List[U]())( f(_)::_ )
The type parameter T that you define in the first parameter group: (xs: List[T]) as Int will be made available to the next parameter group: (f: T => U). So now you do not have to explicitly specify T at the call site.
I'm following the tutorial Pattern matching & functional composition on Scala compose and andThen methods. There's such an example:
scala> def addUmm(x: String) = x + " umm"
scala> def addAhem(x: String) = x + " ahem"
val ummThenAhem = addAhem(_).compose(addUmm(_))
When I try to use it I get an error:
<console>:7: error: missing parameter type for expanded function ((x$1) => addAhem(x$1).compose(((x$2) => addUmm(x$2))))
val ummThenAhem = addAhem(_).compose(addUmm(_))
^
<console>:7: error: missing parameter type for expanded function ((x$2) => addUmm(x$2))
val ummThenAhem = addAhem(_).compose(addUmm(_))
^
<console>:7: error: type mismatch;
found : java.lang.String
required: Int
val ummThenAhem = addAhem(_).compose(addUmm(_))
However, this works:
val ummThenAhem = addAhem _ compose addUmm _
or even
val ummThenAhem = addAhem _ compose addUmm
What's wrong with the code in the tutorial? Isn't the latter expression the same as the first one without parenthesis?
Well, this:
addUmm _
is an eta expansion. It converts methods into functions. On the other hand, this:
addUmm(_)
is an anonymous function. In fact, it is a partial function application, in that this parameter is not applied, and the whole thing is converted into a function. It expands to:
x => addUmm(x)
The exact rules for expansion are a bit difficult to explain, but, basically, the function will "start" at the innermost expression delimiter. The exception is partial function applications, where the "x" is moved outside the function -- if _ is used in place of a parameter.
Anyway, this is how it expands:
val ummThenAhem = x => addAhem(x).compose(y => addUmm(y))
Alas, the type inferencer doesn't know the type of x or y. If you wish, you can see exactly what it tried using the parameter -Ytyper-debug.
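One way to keep the placeholder or lambda style is to give the compiler the parameter types and to delimit each function with parentheses. This is a sketch under those assumptions, with made-up names for the object and values:

```scala
object ExplicitTypesDemo {
  def addUmm(x: String): String = x + " umm"
  def addAhem(x: String): String = x + " ahem"

  // Explicit lambdas: the parentheses make each one a separate function value.
  val viaLambdas: String => String =
    ((x: String) => addAhem(x)).compose((y: String) => addUmm(y))

  // Typed placeholders: the parentheses stop the first underscore from
  // expanding around the whole compose call.
  val viaPlaceholders: String => String =
    (addAhem(_: String)).compose(addUmm(_: String))
}
```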
addAhem is a method. The compose method is defined on functions. addAhem _ converts addAhem from a method to a function, so compose can be called on it. compose expects a function as its argument. You are giving it the method addUmm by converting addUmm into a function with addUmm _ (the underscore can be left out because the compiler can automatically convert a method into a function when it knows that a function is expected anyway). So your code:
addAhem _ compose addUmm
is the same as
(addAhem _).compose(addUmm)
but not
addAhem(_).compose(addUmm(_))
PS
I didn't look at the link you provided.
From compose documentation:
Composes two instances of Function1 in a new Function1, with this
function applied last.
so you should write
scala> val ummThenAhem = (addAhem _).compose(addUmm _)
ummThenAhem: String => java.lang.String = <function1>
to treat addAhem and addUmm as partially applied functions (i.e. Function1 instances)
scala> addAhem _
res0: String => java.lang.String = <function1>
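The ordering stated in the compose documentation can be checked with a small sketch; andThen is included only for contrast, and the object name is hypothetical:

```scala
object ComposeDemo {
  def addUmm(x: String): String = x + " umm"
  def addAhem(x: String): String = x + " ahem"

  // compose: the receiver (addAhem) is applied last, so addUmm runs first.
  val ummThenAhem: String => String = (addAhem _).compose(addUmm _)

  // andThen: the receiver (addAhem) is applied first.
  val ahemThenUmm: String => String = (addAhem _).andThen(addUmm _)
}
```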
I believe the tutorial was written for an earlier version of Scala (probably 2.7.7 or earlier). There have been some changes in the compiler since then, namely extensions to the type system, which now cause type inference to fail on:
addAhem(_).compose(addUmm(_))
The lifting to a function still works with that syntax if you just write:
addUmm(_)