Let's say we have a function def fun(x: X): X => Y and we pass this function as a parameter to another function using fun _ instead of just fun. I understand that fun _ is actually a function value, while fun refers to a function definition.
For example let:
val a = List(1,2,3,4)
def fun(x: Int) = {println(x); x + 1}
Then running:
//This line works
a.map(fun _)
//This one also works, even though "fun" is not a function value
a.map(fun)
They have the same output:
1
2
3
4
resX: List[Int] = List(2, 3, 4, 5)
For the most part they seem to work the same. Are there any examples in which the function value is not equivalent to the function definition?
In the signature of map, you can see that it's expecting a "function" to apply to each element.
But in your code, fun is a regular method in a class. So when you do:
a.map(fun _)
you are explicitly asking for eta-expansion. When you do:
a.map(fun)
you are implicitly asking for eta-expansion.
Because fun is a "method", and is being used in a place where a Function type is expected, it's automagically converted to that type. Basically to something like:
new Function1[Int, Int] {
def apply(x: Int): Int = fun(x)
}
This transformation converting the name fun to a Function is called eta-expansion. See documentation for details.
Unfortunately, there are various ways of doing what you're doing - a.map(fun), a.map(fun _), a.map(fun(_)) and a.map(x => fun(x)). This is one of those frequent scenarios in Scala where you can explicitly do something yourself, or explicitly ask the compiler to do it for you, or just let the compiler do it implicitly. They can have different behaviors because of implicits and it can be a major source of confusion. Also, _ is heavily overloaded in the language, only adding to the confusion. So I use implicit behavior sparingly in general.
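To make the four spellings concrete, here is a minimal sketch that runs each of them on the same list. The object name MapForms and the val names are made up for illustration, and the println from the question's fun is dropped to keep the output quiet. The trailing-underscore form is Scala 2 syntax.

```scala
object MapForms {
  def fun(x: Int): Int = x + 1

  val a = List(1, 2, 3, 4)

  // Implicit eta-expansion: map expects a function, so the method is converted.
  val viaImplicit: List[Int] = a.map(fun)
  // Explicit eta-expansion with the trailing underscore (Scala 2 syntax).
  val viaUnderscore: List[Int] = a.map(fun _)
  // Placeholder syntax: shorthand for the lambda on the next line.
  val viaPlaceholder: List[Int] = a.map(fun(_))
  // Hand-written lambda that calls the method.
  val viaLambda: List[Int] = a.map(x => fun(x))

  def main(args: Array[String]): Unit =
    println(List(viaImplicit, viaUnderscore, viaPlaceholder, viaLambda))
}
```

In this simple case all four produce the same list; the differences only show up with implicits, overloads, or side effects in the expression before the dot.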
As others have pointed out in the comments, you need to use the fun _ syntax (which performs an eta expansion) when you need a value (methods have no value by themselves). Within the context of a map (or other functional contexts) an eta expansion is performed on the method implicitly. There are some cases where the eta expansion must be triggered manually.
As a concrete example of where an explicit eta expansion is needed, consider this valid snippet:
def f1(x: Int): Int = 2*x
def f2(x: Int): Int = 3*x
val fList1 = List(f1 _, f2 _)
fList1.map(_(2)) // List(4, 6)
as opposed to this snippet, which Scala 2 rejects because no function type is expected at that point:
val fList2 = List(f1, f2)
def weirdfunc(message: String, f: (Int, Int) => Int){
println(message + s" ${f(3,5)}")
}
I have the function as above. How do I make it so that the function f is generic for all Numeric types?
What you want is called a higher-ranked type (specifically, rank 2). Haskell has support for these types, and Scala gets a lot of its type theory ideas from Haskell, but Scala has yet to directly support this particular feature.
Now, the thing is, with a bit of black magic, we can get Scala to do what you want, but the syntax is... not pretty. In Scala, functions are always monomorphic, but you want to pass a polymorphic function around as an argument. We can't do that, but we can pass a polymorphic function-like object around that looks and behaves mostly like a function. What would this object look like?
trait NumFunc {
def apply[A : Numeric](a: A, b: A): A
}
It's just a trait that defines a polymorphic apply. Now we can define the function that you really want.
def weirdfunc(message: String, f: NumFunc) = ???
The trouble here, as I mentioned, is that the syntax is really quite atrocious. To call this, we can't just pass in a function anymore. We have to create a NumFunc and pass that in. Essentially, from a type theoretic perspective, we have to prove to the compiler that our function works for all numeric A. For instance, calling the simple weirdfunc that only takes integers and passing it the addition function is very simple.
weirdfunc("Some message", (_ + _))
However, to call our "special" weirdfunc that works for all number types, we have to write this mess.
weirdfunc("Hi", new NumFunc {
override def apply[A : Numeric](a: A, b: A): A = {
import math.Numeric.Implicits._
a + b
}
})
And we can't hide that away with an implicit conversion because, as I alluded to earlier, functions are monomorphic, so any conversion coming out of a function type is going to be monomorphic.
Bottom line. Is it possible? Yes. Is it worth the costs in terms of readability and usability? Probably not.
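For completeness, here is a self-contained sketch of the NumFunc approach with the ??? filled in. The body of weirdfunc is my own assumption (it returns a String instead of printing, and applies f at both Int and Double to show that one argument really works for several numeric types); NumFuncDemo and add are made-up names.

```scala
import scala.math.Numeric.Implicits._

// A function-like object whose apply is polymorphic in the numeric type.
trait NumFunc {
  def apply[A: Numeric](a: A, b: A): A
}

object NumFuncDemo {
  // Hypothetical body: uses the same f at two different numeric types.
  def weirdfunc(message: String, f: NumFunc): String =
    message + s" ${f(3, 5)} ${f(1.5, 2.0)}"

  // The "proof" that addition works for every Numeric A.
  val add: NumFunc = new NumFunc {
    def apply[A: Numeric](a: A, b: A): A = a + b
  }

  def main(args: Array[String]): Unit =
    println(weirdfunc("Sums:", add))
}
```

The import of Numeric.Implicits._ is what makes a + b available on a generic A.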
Scala has a typeclass for this, so it's quite easy to achieve using a context bound and the standard lib.
def weirdfunc[T: Numeric](message: String, x: T, y: T, f: (T, T) => T): Unit = {
println(message + s" ${f(x, y)}")
}
def test[T](a: T, b: T)(implicit ev: Numeric[T]): T = ev.plus(a, b)
weirdfunc[Int]("The sum is ", 3, 5, test)
// The sum is 8
Sorry cktang, you cannot generify this. The caller gets to set the generic parameter, not the called function, just like the caller passes function parameters.
However you can use currying so that you pass the 'f' of type Int once, and then pass different Int pairs. Then you may pass 'f' of type Double, and pass different Double pairs.
def weirdfunc[A](message: String, f: (A, A) => A)(x: A, y: A){
println(message + s" ${f(x, y)}")
}
def g(x: Int, y: Int): Int = x * y
val weirdfuncWithF = weirdfunc("hello", g) _
weirdfuncWithF(3, 5)
weirdfuncWithF(2, 3)
In particular what you want cannot be done as it will break generics rules.
I am new to Scala. I just heard the term "eta expansion" and roughly know that it means to expand a method to a function object. But I find few resources in SO that systematically introduce it.
I am curious about how eta expansion works in Scala. In which scenarios is eta expansion needed? And how is eta expansion implemented in Scala?
I roughly know that in cases like this:
def someMethod(x: Int): Int = x * x
someMethod _ will roughly be translated to a new function object like this:
new Function1[Int, Int] {
def apply(x: Int): Int = x * x
}
Is it all that Scala does?
The definition, and some examples, are given in http://scala-lang.org/files/archive/spec/2.11/06-expressions.html#method-values.
someMethod _ will roughly be translated to a new function object like this:
Not quite: it's actually
new Function1[Int, Int] {
def apply(x: Int): Int = someMethod(x)
}
The difference matters e.g. if someMethod is overridden somewhere.
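A small sketch of why that matters, assuming a made-up Base/Derived pair: the eta-expanded function calls someMethod through normal dynamic dispatch, while a hand-copied body stays frozen at Base's implementation.

```scala
object OverrideDemo {
  class Base {
    def someMethod(x: Int): Int = x * x

    // Eta-expansion delegates to someMethod, which is resolved at call time.
    def asFunction: Int => Int = someMethod _

    // Copying the body by hand ignores any override in a subclass.
    def inlinedBody: Int => Int = new Function1[Int, Int] {
      def apply(x: Int): Int = x * x
    }
  }

  class Derived extends Base {
    override def someMethod(x: Int): Int = x + 100
  }

  def main(args: Array[String]): Unit = {
    val d = new Derived
    println(d.asFunction(3))  // goes through the override
    println(d.inlinedBody(3)) // uses the copied body
  }
}
```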
Is it all that Scala does?
You also need to take into account what happens if the method takes multiple parameter lists (you get a function which returns a function) or by-name parameters.
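A rough sketch of both cases, written against Scala 2 syntax; numAdder, twice and the object name are illustrative only. A method with two parameter lists becomes a function that returns a function, and a by-name parameter stays by-name in the resulting function type.

```scala
object MultiListEta {
  def numAdder(a: Int)(b: Int): Int = a + b

  // Two parameter lists: eta-expansion yields a function returning a function.
  val curried: Int => Int => Int = numAdder _

  def twice(x: => Int): Int = x + x

  // The by-name parameter is preserved in the function type.
  val twiceFn: (=> Int) => Int = twice _

  // Counts how often the by-name argument is evaluated through twiceFn.
  def byNameDemo(): (Int, Int) = {
    var evaluations = 0
    val result = twiceFn { evaluations += 1; 21 }
    (result, evaluations)
  }
}
```

Since twice uses x two times, the argument block runs twice when called through twiceFn, just as it would when calling the method directly.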
In which scenarios is eta expansion needed?
When you specifically ask for it (e.g. someMethod _).
When you use a method (with parameters) where a value of a function type (or a SAM type in Scala 2.12) is expected. E.g.
def foo(f: Int => Int) = ???
foo(someMethod)
That's it.
Note that using eta-expansion and an anonymous function with placeholders (someMethod(_)) can behave differently due to type inference, implicits, etc.
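One difference that is easy to observe, assuming Scala 2 semantics for method values: with eta-expansion the expression before the dot is evaluated once, when the function value is created, whereas the placeholder form re-evaluates it on every call. Greeter and makeGreeter below are hypothetical names for the sketch.

```scala
object EtaVsPlaceholder {
  class Greeter { def greet(name: String): String = s"hi $name" }

  // Returns how many Greeters exist after each step.
  def demo(): (Int, Int, Int) = {
    var created = 0
    def makeGreeter(): Greeter = { created += 1; new Greeter }

    // Eta-expansion: makeGreeter() runs once, right here.
    val eta: String => String = makeGreeter().greet _
    val afterEta = created

    // Placeholder: means x => makeGreeter().greet(x), so nothing runs yet.
    val lam: String => String = makeGreeter().greet(_)
    val afterLam = created

    eta("a"); eta("b") // reuses the Greeter captured above
    lam("a"); lam("b") // builds a fresh Greeter on each call

    (afterEta, afterLam, created)
  }
}
```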
Eta expansion, at a high level, is the process of translating methods into functions. Why? What? Aren't they the same? Let's explain:
A method in Scala is what we know as def someMethodName(SomeParamList): SomeReturnType. It starts with def. It may have a parameter list, or even more than one. For example:
def numAdder(num1: Int)(num2: Int): Int =
num1 + num2
A function, or lambda function looks something like: (SomeParams) => SomeReturnType. For example:
val aFunction: Int => Int => Int = (num1: Int) => (num2: Int) => num1 + num2
The important thing to understand about functions is that this syntax is basically syntactic sugar for an instance of FunctionN and its apply method.
In which scenarios is eta expansion needed?
Some examples:
Example1 - Applying a method inside map (or filter, flatMap etc)
Writing such code:
def addPlus1(x: Int): Int = x + 1
List(1,2,3).map(addPlus1)
The compiler needs to have a function inside the map. So, it transforms the method given into a function:
List(1,2,3).map(x => addPlus1(x)). This is Eta expansion.
Example2 - currying
When defining curried method, for example:
def numAdder(num1: Int)(num2: Int): Int =
num1 + num2
And then creating a function like:
val curriedFunction: Int => Int = numAdder(4)
//or
val curriedFunction2 = numAdder(4) _
We defined a function out of a method. This is Eta expansion.
Some more examples
Define a method which accepts a function value:
def someMethod(f: () => Int): Int = f()
def method(): Int = 10
And then run:
someMethod(method)
This call transforms the method named method into a function. This is eta expansion.
I was going through the test code for Spark. I understand the logic behind the function given below, but what does this syntax mean, and what is the benefit of defining the function this way?
Test Code
def withStreamingContext[R](ssc: StreamingContext)(block: StreamingContext => R): R = {
try {
block(ssc)
} finally {
try {
ssc.stop(stopSparkContext = true)
} catch {
case e: Exception =>
logError("Error stopping StreamingContext", e)
}
}
}
Why does it have to be defined this way? Why can't it be
def withStreamingContext[R](ssc: StreamingContext,block: StreamingContext => R): R =
Well, it can. Separating arguments into two or more parameter lists is called currying. This way a two-parameter function can be turned into a function that takes one argument and returns a function that takes one argument and returns the result. This is what happened in the code you posted. Every n-parameter function can be seen as a chain of n one-parameter functions (in fact, in Haskell all functions are treated like this).
Note that Scala also has a concept of partially applied functions, which boils down to the same thing. Both PAF and currying allow you to only pass a subset of parameters, thus receiving a function that takes the rest.
For example,
def sum(x: Int, y: Int) = x + y
can be curried and then you could say, for example:
def sum(x: Int)(y: Int) = x + y
def addTwo = sum(2) _ // type of addTwo is Int => Int
which gives you the same function, but with its first parameter applied. Using PAF, it would be
def sum(x: Int, y: Int) = x + y
def addTwo = sum(2, _: Int)
It is more convenient to use:
withStreamingContext(ssc) {
doSomething()
doSomethingElse()
}
vs
withStreamingContext(ssc, { doSomething(); doSomethingElse() })
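The same shape works for any resource. Below is a minimal sketch of the pattern with a made-up Resource class standing in for StreamingContext (which needs Spark on the classpath); withResource and demo are hypothetical names.

```scala
object LoanPattern {
  // Hypothetical stand-in for something that must be stopped or closed.
  final class Resource {
    var closed = false
    def close(): Unit = { closed = true }
  }

  // Second parameter list takes the block, so callers can use curly braces.
  def withResource[R](res: Resource)(block: Resource => R): R =
    try block(res)
    finally res.close()

  def demo(): (Int, Boolean) = {
    val res = new Resource
    val result = withResource(res) { r =>
      // The resource is still open while the block runs.
      val base = if (r.closed) 0 else 40
      base + 2
    }
    (result, res.closed)
  }
}
```

The caller gets the block's result back, and the cleanup in finally runs whether or not the block throws.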
First of all
def a(x: Int)(y: Int) = x * y
behaves much like (although the compiler does not literally rewrite it to)
def a(x: Int) = (y: Int) => x * y
That means you can think of it as a method that returns a function (closed over x).
You can invoke such a method without supplying all parameter lists and pass the returned function around. You can also partially apply any other method, but I think this syntax is cleaner.
Moreover, a parameter list that takes a single argument can be supplied as a block in curly braces.
withStreamingContext(ssc) {
// your code block passed to function
}
This style of declaring functions is referred to as currying. It was introduced by Moses Schönfinkel, and then later independently by Haskell Curry, from whom it takes its name. The concept originates in mathematics and was later introduced into computer science.
It is often conflated with partial function application; the main difference is that a call to a partially applied function returns the result immediately, not another function down the "currying" chain.
scala> def foo (x:Int, y:Int, z:Int) : Int = x + y + z
foo: (x: Int, y: Int, z: Int)Int
scala> val pa = foo(1, _:Int, _:Int)
pa: (Int, Int) => Int = <function2>
scala> pa(2,3)
res0: Int = 6
In contrast, given f:(x,y,z) -> n, currying produces f':x -> (y -> (z -> n)). In other words, applying each argument in turn to a single argument function returned by the previous invocation.
After calling f'(1), a function that takes a single argument and returns another function is returned, not a function that takes two arguments.
Partial function application, on the other hand, refers to the process of fixing a number of arguments to a function, producing another function of smaller arity.
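A short sketch putting the two side by side for the foo from the REPL session above, using the standard curried method on Function3 (the object and val names are illustrative; foo _ is Scala 2 syntax):

```scala
object CurryVsPartial {
  def foo(x: Int, y: Int, z: Int): Int = x + y + z

  // Partial application: fix x and get ONE function of the remaining two arguments.
  val partial: (Int, Int) => Int = foo(1, _: Int, _: Int)

  // Currying: a chain of single-argument functions.
  val curried: Int => Int => Int => Int = (foo _).curried

  // Applying one argument to the curried form returns another
  // single-argument function, not a two-argument one.
  val afterOne: Int => Int => Int = curried(1)

  def main(args: Array[String]): Unit = {
    println(partial(2, 3))
    println(afterOne(2)(3))
  }
}
```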
The benefits/advantages of currying have already been mentioned elsewhere. The main issue you had was understanding the syntax and its origins, which have been explained.
I'm trying to understand the crucial difference between these two approaches of referencing / defining Function Literal (reference to anonymous function):
By val
scala> val v2 = new Function[Int, Int] {
| def apply(a: Int): Int = a + 1
| }
v2: Int => Int = <function1>
And by def
scala> def f2 = new Function[Int, Int] {
| def apply(a: Int): Int = a + 1
| }
f2: Int => Int
It seems that they are pretty much the same in terms of use. I can pass either v2 or f2 to a function that accepts (Int) => Int as an argument, and I can pass arguments to either of them.
I guess in the case of v2 it creates a Function1 object that refers to the Function1 object, like a proxy?
OK, my question is: what are the advantages and disadvantages of the first and second approaches?
And if it is defined by def, is it still a function literal?
First of all, neither of your examples are actually function literals—you're creating a Function instance in the plain old sugar-free way, and in fact you could use this approach (new Function { ... }) to create an instance of scala.Function from Java code.
The following are both function literals, and are exactly equivalent to your definitions:
val v2 = (a: Int) => a + 1
def f2 = (a: Int) => a + 1
The only real difference here is that the val will create a single instance once and for all, no matter how many times you use v2 (and even if you never use it), while the def will create a new instance every time (or not at all, if you never use it). So you'll generally want to go with a val.
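One way to see this for yourself is to put a counter in front of the literal. This is only a sketch: the counters and the demo method are made up, and it counts how often the right-hand side is evaluated rather than inspecting object identity.

```scala
object ValVsDef {
  def demo(): (Int, Int) = {
    var valBodyRuns = 0
    var defBodyRuns = 0

    // Right-hand side is evaluated once, when v2 is defined.
    val v2: Int => Int = { valBodyRuns += 1; (a: Int) => a + 1 }
    // Right-hand side is evaluated again on every reference to f2.
    def f2: Int => Int = { defBodyRuns += 1; (a: Int) => a + 1 }

    val viaVal = v2(1) + v2(2) + v2(3)
    val viaDef = f2(1) + f2(2) + f2(3)
    assert(viaVal == viaDef) // same results either way

    (valBodyRuns, defBodyRuns)
  }
}
```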
There are cases, however, where you need to use def. Consider the following:
def myIdentity[A] = (a: A) => a
There's no way we could write this as a val, since Scala doesn't have polymorphic functions in this sense (for any instance of Function[A, B], A and B have to be concrete types). But we can define a polymorphic method that returns a function, and when we write e.g. myIdentity(1), the A will be inferred to be Int, and we'll create (and apply) a Function[Int, Int] exactly as you'd expect.
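As a small illustration, here is the method instantiated at a few concrete types. The type arguments are written out explicitly in this sketch so that it does not depend on how a particular compiler version infers A; PolyIdentity, intId and strId are made-up names.

```scala
object PolyIdentity {
  // A polymorphic method; each call produces a monomorphic function.
  def myIdentity[A]: A => A = (a: A) => a

  // Each val holds an ordinary Function1 with concrete types.
  val intId: Int => Int = myIdentity[Int]
  val strId: String => String = myIdentity[String]

  def main(args: Array[String]): Unit = {
    println(intId(1))
    println(strId("x"))
    println(myIdentity[Double](2.5))
  }
}
```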
I'm a bit confused by this Scala notation:
List(1, 2, 3).foldLeft(0)((x, acc) => acc+x)
Both "0" and the function are arguments for foldLeft, so why are they passed in two adjacent groups of brackets? I'd expect this to work:
List(1, 2, 3).foldLeft(0, ((x, acc) => acc+x))
But it doesn't. Can anyone explain this to me? Also, how and why to declare such a type of function? Thanks
Scala allows you to have multiple argument lists:
def foo(a: Int)(b: String) = ???
def bar(a: Int)(b: String)(c: Long) = ???
The reason for using such syntax for foldLeft is the way the compiler does type inference: types already inferred from an earlier group of arguments are used to infer types in the following groups. In the case of foldLeft, it allows you to drop the type ascription next to (x, acc), so instead of:
List(1, 2, 3).foldLeft(0)((x: Int, acc: Int) => acc+x)
you can write just
List(1, 2, 3).foldLeft(0)((x, acc) => acc+x)
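To see the effect on your own methods, compare two hypothetical helpers that differ only in how the parameters are grouped. Under Scala 2 the single-list version needs the lambda's parameter types written out, since all arguments in one list are typed together; newer compilers may be more lenient, so the annotated call is used here to stay on the safe side.

```scala
object InferenceDemo {
  // Hypothetical helper: everything in ONE parameter list.
  def foldOneList[A, B](xs: List[A], z: B, op: (B, A) => B): B =
    xs.foldLeft(z)(op)

  // Hypothetical helper: the function sits in its OWN parameter list.
  def foldTwoLists[A, B](xs: List[A], z: B)(op: (B, A) => B): B =
    xs.foldLeft(z)(op)

  // Parameter types spelled out so the lambda can be typed alongside xs and z.
  val annotated: Int = foldOneList(List(1, 2, 3), 0, (acc: Int, x: Int) => acc + x)

  // A and B are fixed by the first list, so the lambda needs no annotations.
  val inferred: Int = foldTwoLists(List(1, 2, 3), 0)((acc, x) => acc + x)
}
```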
This is an example of multiple parameter lists in Scala. They're really just syntactic sugar for a normal method call (if you look at the class file's method signatures with javap you'll see that when compiled to Java bytecode they're all combined into a single argument list). The reasons for supporting multiple parameter lists are twofold:
Passing functions as arguments: Scala will allow you to replace a parameter list that takes a single argument with a function literal in curly braces {}. For example, your code could be re-written as List(1, 2, 3).foldLeft(0) { (x, acc) => acc+x }, which might be considered more readable. (Then again, I'd just use List(1, 2, 3).foldLeft(0)(_+_) in this case...) Being able to use curly braces like this makes it possible for the user to declare new functions that look more like native syntax. A good example of this is the react function for Actors.
Type inference: There are some details of the type inference process (which I admit I don't fully understand) that make it easier to infer the types used in a later list based on the types in an earlier list. For example, the initial z value passed to foldLeft is used to infer the result type (and left argument type) of the function parameter.
Because in Scala you can define function arguments in multiple groups separated by ()
def test(a: String)(b: String)(implicit ev: Something) { }
The most practical scenario is where a context bound or currying is required, e.g. a specific implicit definition available in scope.
For instance, Future will expect an implicit executor.
If you look at the definition of the foldLeft method, you will see that the first parameter list takes the initial accumulator value and the second takes the function that combines the accumulator with each element.
def foldLeft[B](z: B)(op: (B, A) ⇒ B): B
The parentheses thing is a very useful separation of concerns.
Also, once you define a method with:
def test(a: String)(b: String)
You can't call it with: test("a", "b");
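Here is a short sketch of how such a method is called instead, plus a version with a trailing implicit list. Separator, join and dash are made-up names for illustration, test is given a return value so there is something to check, and the trailing underscore is Scala 2 syntax.

```scala
object ParamLists {
  def test(a: String)(b: String): String = a + b

  // Each parameter list gets its own pair of parentheses.
  val full: String = test("a")("b")

  // Supplying only the first list leaves a function awaiting the second.
  val partial: String => String = test("a") _

  // Hypothetical type used only to demonstrate an implicit parameter list.
  final case class Separator(value: String)

  def join(a: String)(b: String)(implicit sep: Separator): String =
    a + sep.value + b

  implicit val dash: Separator = Separator("-")

  // The implicit list is filled in by the compiler from the value in scope.
  val joined: String = join("a")("b")
}
```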