I am new to Scala. I just heard the term "eta expansion" and roughly know that it means to expand a method to a function object. But I find few resources in SO that systematically introduce it.
I am curious about how eta expansion works in Scala. What are the scenarios that eta expansion are needed? And how eta expansion is implemented in Scala?
I roughly know that in cases like this:
def someMethod(x: Int): Int = x * x
someMethod _ will roughly be translated to a new function object like this:
new Function1[Int, Int] {
def apply(x: Int): Int = x * x
}
Is it all that Scala does?
The definition, and some examples, are given in http://scala-lang.org/files/archive/spec/2.11/06-expressions.html#method-values.
someMethod _ will roughly be translated to a new function object like this:
Not quite: it's actually
new Function1[Int, Int] {
def apply(x: Int): Int = someMethod(x)
}
The difference matters e.g. if someMethod is overridden somewhere.
Is it all that Scala does?
You also need to take into account what happens if the method takes multiple parameter lists (you get a function which returns a function) or by-name parameters.
What are the scenarios that eta expansion are needed?
When you specifically ask for it (e.g. someMethod _).
When you use a method (with parameters) where a value of a function type (or a SAM type in Scala 2.12) is expected. E.g.
def foo(f: Int => Int) = ???
foo(someMethod)
That's it.
Note that using eta-expansion and an anonymous function with placeholders (someMethod(_)) can behave differently due to type inference, implicits, etc.
Eta expansion In high level, is a process of translating methods into functions. Why? What? Aren't them the same? Let's explain:
A method in scala is what we know as def someMethodName(SomePramList): SomeReturnType. It starts with def. It may have parameter list, or even maybe more then 1. For example:
def numAdder(num1: Int)(num2: Int): Int =
num1 + num2
A function, or lambda function looks something like: (SomeParams) => SomeReturnType. For example:
val aFunction: Int => Int => Int = (num1: Int) => (num2: Int) => num1 + num2
Important to understand about functions is that this syntax is basically a syntactic sugar to FunctionN.apply method.
What are the scenarios that eta expansion are needed?
Some examples:
Example1 - Applying a method inside map (or filter, flatMap etc)
Writing such code:
def addPlus1(x: Int): Int = x + 1
List(1,2,3).map(addPlus1)
The compiler needs to have a function inside the map. So, it transforms the method given into a function:
List(1,2,3).map(x => addPlus1(x)). This is Eta expansion.
Example2 - currying
When defining curried method, for example:
def numAdder(num1: Int)(num2: Int): Int =
num1 + num2
And them creating a function like:
val curriedFunction: Int => Int = numAdder(4)
//or
val curriedFunction2 = numAdder(4) _
We defined a function out of a method. This is Eta expansion.
Some more examples
Defined a method which accepts a function value:
def someMethod(f: () => Int): Int = f()
def method(): Int = 10
And then run:
someMethod(method)
will transform the method method into a function. This is Eta expansion
Related
Why are functions values but methods are not, or what makes a function a value in particular? What would allow me to decipher that methods do not contain that characteristic?
Scala is an object-oriented language. In object-oriented languages, every value is an object, and every object is a value.
Methods are bound to objects. It is awkward to have methods at the same time be a building block of objects and be objects themselves. It is easier to have them not be objects.
A function in Scala is, in some sense, just an object with an apply method. If there is no method named foo in scope, then foo() is simply syntactic sugar for foo.apply(). So, functions are values because they are objects.
While any object that has an apply method can be called as if it were a function, when we talk about "functions" in Scala, we usually mean something more specific: an instance of one of the FunctionN traits such as Function2[-T1, -T2, +R]. In particular, the function literal syntax
val add = (a: Int, b: Int) => a + b
is syntactic sugar for
val add = new Function2[Int, Int, Int] {
override def apply(a: Int, b: Int) = a + b
}
And the function type
type F = (Int, String, Long) => Boolean
is syntactic sugar for
type F = Function3[Int, String, Long, Boolean]
[Scastie link]
where each of the FunctionN traits is defined like this:
package scala
trait Function0[+R] {
def apply: R
override def toString = "<function>"
}
trait Function1[-T, +R] {
def apply(x: T): R
override def toString = "<function>"
}
trait Function2[-T1, -T2, +R] {
def apply(x1: T1, x2: T2): R
override def toString = "<function>"
}
trait Function3[-T1, -T2, -T3, +R] {
def apply(x1: T1, x2: T2, x3: T3): R
override def toString = "<function>"
}
and so on.
It is possible to convert a method into a function value using η-expansion. This can be done explicitly using a trailing underscore:
val f = println _
or in some cases, when it is clear that a function value is required, even by just using the bare name of the method:
val it = Iterable(1, 2, 3)
it.foreach(println)
[Scastie link]
Note that this is not much different from other languages. In Java, C#, and Ruby, for example, it is the same thing: methods are defined as part of classes (or structs in C# and modules in Ruby) and are bound to objects, but aren't objects themselves. Instead, you have a separate notion of a function (instance of a SAM interface / functional interface in Java, Action or Func in C#, Proc in Ruby), which is an object.
In Ruby, you can create a proxy object for a method that is bound to an object which has the same interface as Proc.
Other languages make different choices, i.e. in ECMAScript and Python, methods are objects / values, but they are not as tightly bound to objects as they are in Scala, Java, C#, and Ruby. Instead, methods are basically normal functions that are assigned to fields of the object.
One of the defining characteristics of a value is that it can be assigned to a variable. A method, technically speaking, cannot be assigned to a variable, for example
def m(i: Int): Int = i + 1
val f = m // error
However there is a process called eta expansion that converts a method to a function value. In Scala 2, we can trigger it by providing type ascription to a variable
val f: Int => Int = m
or by using underscore _
val f = m _
However this restriction has been lifted in Scala 3 (Dotty) which now provides automatic eta expansion so the following compiles successfully
scala> def f(i: Int): Int = i + 1
| val x = f
def f(i: Int): Int
val x: Int => Int = Lambda$1218/158882051#62cf6a84
I was wondering how one would implement an infinite curried add function, for the case of explanation i would stick to scala.
I know how to prepare a simple curry like
def add(a: Int): Int => Int = {
def iadd(b: Int): Int = {
a + b
}
iadd
}
add(4)(5) // 9
How would i got about implementing add(5)(4)(x1)(x2)..(xn)
The Smart Way
The question is the comments is well-posed: when do you stop the currying and produce a result?
One solution is to stop the recursion by calling the function with zero arguments. Scala's overloading with let us do this.
add(1)(2)(3)(4)() // The () indicates that we're done currying
This is relatively straightforward. We just need a class with an apply that returns a new instance of itself
// A class with an apply method is callable like a function
class Adder(val acc: Int) {
def apply(a: Int): Adder =
new Adder(acc + a)
def apply(): Int =
acc
}
def add: Adder = new Adder(0)
println(add(1)(2)(3)(4)()) // 10
If you ever had a real reason to do this, this would be the way I would recommend. It's simple, easy to read, and adds very little boilerplate on top of the currying.
The Slightly Unhinged Way
But what fun is simple and logical? Let's get rid of those silly parentheses at the end, eh? We can do it with Scala's implicit conversions. First, we'll need to import the feature, so that Scala will stop warning us that what we're doing is silly and not a good idea.
import scala.language.implicitConversions
Then we make it so that Adder can be converted to Int
// Don't do this in real code
implicit def adderToInt(adder: Adder): Int =
adder()
Now, we don't need those parentheses at the end. We do, however, need to indicate to the type system that we want an Int.
val result: Int = add(1)(2)(3)(4)
println(result) // 10
Passing the result to a function which takes an Int, for instance, would also suffice.
Comments
Since you mentioned functional programming in general, I will note that you can do similar tricks in Haskell, using typeclasses. You can see this in action in the standard library with Text.PrintF. Note that since Haskell functions always take one argument, you'll need to have a sentinel value to indicate the "end" of the arguments (() may suffice, depending on how generic your argument types are).
If you want to reinterpret every integer n as function n.+, then just do it:
implicit class Add(val x: Int) extends AnyVal { def apply(i: Int) = x + i }
val add = 0
or even shorter (with implicit conversions):
implicit def asAdd(n: Int): Int => Int = n.+
val add = 0
Example:
add(1)(2)(3)(4) // res1: Int = 10
There is no such thing as "infinitely curryable", it's not a meaningful notion.
Well, this is not exactly infinite currying, but it gives you the something similar.
final class InfiniteCurrying[A, B] private (acc: A, op: (A, B) => A) {
final val run: A = acc
final def apply(b: B): InfiniteCurrying[A, B] =
new InfiniteCurrying(
acc = op(acc, b),
op,
)
}
object InfiniteCurrying {
def add(initial: Int): InfiniteCurrying[Int, Int] =
new InfiniteCurrying(
acc = initial,
op = (acc, b) => acc + b
)
}
import InfiniteCurrying._
val r = add(10)(20)(30)
r.run // res: Int = 60
Lets say we have a function def fun(x: X): X => Y and we pass this function as a parameter to another function using fun _ instead of just fun. I understand that fun _ is actually a function value, while fun refers to a function definition.
For example let:
val a = List(1,2,3,4)
def fun(x: Int) = {println(x); x + 1}
Then Running:
//This line works
a.map(fun _)
//This one also works, even though "fun" is not a function value
a.map(fun)
They have the same output:
1
2
3
4
resX: List[Int] = List(2, 3, 4, 5)
For the most part they seem to work the same, are there any examples in which the function value is not equivalent to the function definition?
In the signature of map, you can see that it's expecting a
"function" to apply to each element
But in your code, fun is a regular method in a class. So when you do:
a.map(fun _)
you are explicitly asking for eta-expansion. When you do:
a.map(fun)
you are implicitly asking for eta-expansion.
Because fun is a "method", and is being used in a place where a Function type is expected, it's automagically converted to that type. Basically to something like:
new Function1[Int, Int] {
def apply(x: Int): Int = fun(x)
}
This transformation converting the name fun to a Function is called eta-expansion. See documentation for details.
Unfortunately, there are various ways of doing what you're doing - a.map(fun), a.map(fun _), a.map(fun(_)) and a.map(x => fun(x)). This is one of those frequent scenarios in Scala where you can explicitly do something yourself, or explicitly ask the compiler to do it for you, or just let the compiler do it implicitly. They can have different behaviors because of implicits and it can be a major source of confusion. Also, _ is heavily overloaded in the language, only adding to the confusion. So I use implicit behavior sparingly in general.
As others have pointed out in the comments, you need to use the fun _ syntax (which performs an eta expansion) when you need a value (methods have no value by themselves). Within the context of a map (or other functional contexts) an eta expansion is performed on the method implicitly. There are some cases where the eta expansion must be triggered manually.
As a concrete example of where an explicit eta expansion is needed, consider this valid snippet:
def f1(x: Int): Int = 2*x
def f2(x: Int): Int = 3*x
val fList1 = List(f1 _, f2 _)
fList1.map(_(2)) // List(4, 6)
as opposed to this invalid snippet.
val fList2 = List(f1, f2)
I was going through the test code for spark. While I understand the logic behind the function given below
What does it means and What is the benefit of defining in the below syntax ?
Test Code
def withStreamingContext[R](ssc: StreamingContext)(block: StreamingContext => R): R = {
try {
block(ssc)
} finally {
try {
ssc.stop(stopSparkContext = true)
} catch {
case e: Exception =>
logError("Error stopping StreamingContext", e)
}
}
}
why does it has to be defined this way ? why can't it be
def withStreamingContext[R](ssc: StreamingContext,block: StreamingContext => R): R =
Well, it can. Separating arguments into two or more parameter lists is called currying. This way a two-parameter function can be turned into a function that takes one argument and returns a function that takes one argument and returns the result. This is what happened in the code you posted. Every n-parameter function can be seen as n 1-parameter functions (in fact, in Haskell all functions are treated like this).
Note that Scala also has a concept of partially applied functions, which boils down to the same thing. Both PAF and currying allow you to only pass a subset of parameters, thus receiving a function that takes the rest.
For example,
def sum(x: Int, y: Int) = x + y
can be curried and then you could say, for example:
def sum(x: Int)(y: Int) = x + y
def addTwo = sum(2) _ // type of addTwo is Int => Int
which gives you the same function, but with its first parameter applied. Using PAF, it would be
def sum(x: Int, y: Int) = x + y
def addTwo = sum(2, _: Int)
It is more convenient to use:
withStreamingContext(ssc) {
doSomething()
doSomethingElse()
}
vs
withStreamingContext(ssc, { doSomething(); doSomethingElse() })
First of all
def a(x: Int)(y: Int) = x * y
Is a syntactic sugar for
def a(x: Int) = (y: Int) => x * y
That means that you define a method that returns a function (closed over x)
You can invoke such method without all parameter lists and pass returned function around. You can also partially apply any other method but I think this syntax is cleaner.
Moreover, functions/methods with unary parameter lists can be invoked with expression syntax.
withStreamingContext(ssc) {
// your code block passed to function
}
This style of declaring functions is referred to as currying. It was independently introduced by Moses Schönfinkel, and then later by Haskell Curry from where it takes its name. The concept actually originates in mathematics and then introduced into computer science.
It is often conflated with partial function application; the main difference is that a call to a partially applied function returns the result immediately, not another function down the "currying" chain.
scala> def foo (x:Int, y:Int, z:Int) : Int = x + y + z
foo: (x: Int, y: Int, z: Int)Int
scala> val pa = foo(1, _:Int, _:Int)
pa: (Int, Int) => Int = <function2>
scala> pa(2,3)
res0: Int = 6
In contrast, given f:(x,y,z) -> n, currying produces f':x -> (y -> (z -> n)). In other words, applying each argument in turn to a single argument function returned by the previous invocation.
After calling f'(1), a function that takes a single argument and returns another function is returned, not a function that takes two arguments.
In contrast partial function application refers to the process of fixing a number of arguments to a function, producing another function of smaller arity. These two are often conflated.
The benefits/advantages of currying have already been mentioned elsewhere. The main issue you had was understanding the syntax and it's origins which has been explained.
I'm trying to understand the crucial difference between these two approaches of referencing / defining Function Literal (reference to anonymous function):
By val
scala> val v2 = new Function[Int, Int] {
| def apply(a: Int): Int = a + 1
| }
v2: Int => Int = <function1>
And by def
scala> def f2 = new Function[Int, Int] {
| def apply(a: Int): Int = a + 1
| }
f2: Int => Int
It seems that it pretty much the same in terms of use. I either can pass v2 or f2 to the function that accepts (Int) => Int as an argument. Passing arguments to its..
I guess or the case of v2 it creates an Function1 object that refers to the Function1 object.. like a proxy?
Ok.. My question is: what is advantage and disadvantages of 1th and 2nd approach?
And of it is defined by def, is it still Function Literal?
First of all, neither of your examples are actually function literals—you're creating a Function instance in the plain old sugar-free way, and in fact you could use this approach (new Function { ... }) to create an instance of scala.Function from Java code.
The following are both function literals, and are exactly equivalent to your definitions:
val v2 = (a: Int) => a + 1
def f2 = (a: Int) => a + 1
The only real difference here is that the val will create a single instance once and for all, no matter how many times you use v2 (and even if you never use it), while the def will create a new instance every time (or not at all, if you never use it). So you'll generally want to go with a val.
There are cases, however, where you need to use def. Consider the following:
def myIdentity[A] = (a: A) => a
There's no way we could write this as a val, since Scala doesn't have polymorphic functions in this sense (for any instance of Function[A, B], A and B have to be concrete types). But we can define a polymorphic method that returns a function, and when we write e.g. myIndentity(1), the A will be inferred to be Int, and we'll create (and apply) a Function[Int, Int] exactly as you'd expect.