I was going through the test code for Spark. While I understand the logic of the function given below,
what does this syntax mean, and what is the benefit of defining it this way?
Test code:
def withStreamingContext[R](ssc: StreamingContext)(block: StreamingContext => R): R = {
try {
block(ssc)
} finally {
try {
ssc.stop(stopSparkContext = true)
} catch {
case e: Exception =>
logError("Error stopping StreamingContext", e)
}
}
}
Why does it have to be defined this way? Why can't it be
def withStreamingContext[R](ssc: StreamingContext,block: StreamingContext => R): R =
Well, it can. Separating arguments into two or more parameter lists is called currying. This way a two-parameter function can be turned into a function that takes one argument and returns a function that takes one argument and returns the result. This is what happened in the code you posted. Every n-parameter function can be seen as n 1-parameter functions (in fact, in Haskell all functions are treated like this).
Note that Scala also has a concept of partially applied functions, which boils down to the same thing. Both PAF and currying allow you to only pass a subset of parameters, thus receiving a function that takes the rest.
For example,
def sum(x: Int, y: Int) = x + y
can be curried and then you could say, for example:
def sum(x: Int)(y: Int) = x + y
def addTwo = sum(2) _ // type of addTwo is Int => Int
which gives you the same function, but with its first parameter applied. Using PAF, it would be
def sum(x: Int, y: Int) = x + y
def addTwo = sum(2, _: Int)
It is more convenient to use:
withStreamingContext(ssc) { ctx =>
doSomething()
doSomethingElse()
}
vs
withStreamingContext(ssc, { ctx => doSomething(); doSomethingElse() })
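To make the convenience argument concrete, here is a minimal, self-contained sketch of the same pattern. FakeContext and withContext are hypothetical stand-ins (not Spark APIs), used only so the example runs without Spark on the classpath.

```scala
object LoanDemo {
  // Hypothetical resource standing in for StreamingContext.
  final class FakeContext {
    var stopped = false
    def stop(): Unit = stopped = true
  }

  // The block lives in its own parameter list, so callers can pass it in braces.
  def withContext[R](ctx: FakeContext)(block: FakeContext => R): R =
    try block(ctx)
    finally ctx.stop() // always runs, even if the block throws

  val ctx = new FakeContext

  val result: Int = withContext(ctx) { c =>
    // The body runs while the context is still open.
    if (c.stopped) -1 else 1 + 2
  }
}
```

The call site reads like a built-in control structure, and the resource is released no matter how the block exits.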
First of all
def a(x: Int)(y: Int) = x * y
is roughly syntactic sugar for
def a(x: Int) = (y: Int) => x * y
That means you define a method that returns a function (closed over x).
You can invoke such a method without all of its parameter lists and pass the returned function around. You can also partially apply any other method, but I think this syntax is cleaner.
Moreover, a parameter list that takes a single argument can be supplied with block syntax, using braces instead of parentheses.
withStreamingContext(ssc) { ctx =>
// your code block, passed to the function as its second argument
}
This style of declaring functions is referred to as currying. It was introduced by Moses Schönfinkel and later, independently, by Haskell Curry, from whom it takes its name. The concept originated in mathematics and was later adopted in computer science.
It is often conflated with partial function application; the main difference is that a call to a partially applied function returns the result immediately, not another function down the "currying" chain.
scala> def foo (x:Int, y:Int, z:Int) : Int = x + y + z
foo: (x: Int, y: Int, z: Int)Int
scala> val pa = foo(1, _:Int, _:Int)
pa: (Int, Int) => Int = <function2>
scala> pa(2,3)
res0: Int = 6
In contrast, given f:(x,y,z) -> n, currying produces f':x -> (y -> (z -> n)). In other words, each argument is applied in turn to a single-argument function returned by the previous invocation.
After calling f'(1), a function that takes a single argument and returns another function is returned, not a function that takes two arguments.
Partial function application, by contrast, refers to the process of fixing a number of arguments to a function, producing another function of smaller arity.
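The difference in shape can be checked directly in Scala. The sketch below reuses foo from the REPL session above; the other names (foo3, curried, afterOne, partial) are made up for illustration.

```scala
object CurryShapes {
  def foo(x: Int, y: Int, z: Int): Int = x + y + z

  // Turn the method into a 3-argument function value first.
  val foo3: (Int, Int, Int) => Int = foo

  // Currying: one argument per call, each call returns the next function.
  val curried: Int => Int => Int => Int = foo3.curried
  val afterOne: Int => Int => Int = curried(1)

  // Partial application: fix x and get a function of smaller arity.
  val partial: (Int, Int) => Int = foo(1, _: Int, _: Int)

  val viaCurried: Int = afterOne(2)(3)
  val viaPartial: Int = partial(2, 3)
}
```

Both routes compute the same sum; only the types of the intermediate functions differ.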
The benefits/advantages of currying have already been mentioned elsewhere. The main issue you had was understanding the syntax and its origins, which have been explained.
I was reading about scala anonymous functions here and saw that they can take the format:
{ case p1 => b1 … case pn => bn }
However, I thought that this was how partial functions were written. In fact, in this blog post, the author calls a partial function an anonymous function. At first he says that collect takes a partial function but then appears to call it an anonymous function ("collect can handle the fact that your anonymous function...").
Is it just that some anonymous functions are partial functions? If so, are all partial functions anonymous? Or are they only anonymous if in format like Alvin Alexander's example here:
val divide2: PartialFunction[Int, Int] = {
case d: Int if d != 0 => 42 / d
}
Anonymous and partial are different concepts. We would not say the following function is anonymous
val divide2: PartialFunction[Int, Int] = {
case d: Int if d != 0 => 42 / d
}
because it is bound to the name divide2, however we could say divide2 is defined in terms of the anonymous (function) value
{ case d: Int if d != 0 => 42 / d }
in the same sense x is defined in terms of anonymous value 42 in the following definition
val x: Int = 42
The orthogonal concept of partial refers to a special subtype of function, as opposed to whether values of a particular type are bound to a name or not.
From the documentation on pattern matching anonymous functions that you linked:
If the expected type is SAM-convertible to scala.Functionk[S1,…,Sk, R], the expression is taken to be equivalent to the anonymous function:
(x1:S1,…,xk:Sk) => (x1,…,xk) match {
case p1 => b1 … case pn => bn
}
Here, each xi is a fresh name. As was shown here, this anonymous function is in turn equivalent to the following instance creation expression, where T is the weak least upper bound of the types of all bi.
new scala.Functionk[S1,…,Sk, T] {
def apply(x1:S1,…,xk:Sk): T = (x1,…,xk) match {
case p1 => b1 …
case pn => bn
}
}
If the expected type is scala.PartialFunction[S, R], the expression is taken to be equivalent to the following instance creation expression:
new scala.PartialFunction[S, T] {
def apply(x: S): T = x match {
case p1 => b1 … case pn => bn
}
def isDefinedAt(x: S): Boolean = {
case p1 => true … case pn => true
case _ => false
}
}
Your first code snippet is a pattern matching anonymous function, but not necessarily a partial function. It would only be turned into a PartialFunction if it was given to a method with a PartialFunction parameter or assigned to a variable of type PartialFunction.
So you are right that just some (pattern matching) anonymous functions are partial functions (AFAIK, function literals defined with fat arrows, such as x => x, can only ever be used to create FunctionN instances and not PartialFunction instances).
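A small sketch of that point: the same pattern matching literal becomes either a plain Function1 or a PartialFunction depending only on the expected type. The names asFunction and asPartial are made up for this example.

```scala
object LiteralTyping {
  // Expected type Int => Int: a plain Function1 with no isDefinedAt.
  val asFunction: Int => Int = { case d if d != 0 => 42 / d }

  // Expected type PartialFunction: the compiler also generates isDefinedAt.
  val asPartial: PartialFunction[Int, Int] = { case d if d != 0 => 42 / d }

  val definedAtZero: Boolean = asPartial.isDefinedAt(0)

  // The plain function can only signal the gap by throwing a MatchError.
  val functionThrows: Boolean =
    try { asFunction(0); false }
    catch { case _: MatchError => true }
}
```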
However, not all partial functions are anonymous functions. A sugar-free way to define PartialFunctions is extending the PartialFunction trait (which extends Function1) and manually overriding the isDefinedAt and apply methods. For example, divide2 could also be defined like this, with an anonymous class:
val divide2 = new PartialFunction[Int, Int] {
override def isDefinedAt(x: Int) = x != 0
override def apply(x: Int) = 42 / x
}
You probably won't see this very often, though, as it's a lot easier to just use pattern matching to define a PartialFunction.
In the blog post by Alvin Alexander you linked, the author refers to the pattern matching anonymous partial function literal as an anonymous function only because it happens to be both a partial function and an anonymous function. You could also define the function like so:
List(42, "cat").collect(new PartialFunction[Any, Int] {
def isDefinedAt(x: Any) = x.isInstanceOf[Int]
def apply(x: Any) = x match {
case i: Int => i + 1
}
})
It's no longer an anonymous function, although it's still an anonymous object that's the instance of an anonymous class. Or you could define a singleton object beforehand and then use that.
object Foo extends PartialFunction[Any, Int] {
def isDefinedAt(x: Any) = x.isInstanceOf[Int]
def apply(x: Any) = x match {
case i: Int => i + 1
}
}
List(42, "cat").collect(Foo)
No matter how you define it, though, it's a partial function.
Here's another way of writing your pattern matching partial function.
val divide = new PartialFunction[Int, Int] {
def apply(x: Int) = 42 / x
def isDefinedAt(x: Int) = x != 0
}
In essence, a partial function is a function that isn't defined for some of its inputs. As in the example, that may be because an input makes no sense (dividing by 0), or because you want to exclude specific values.
The nifty thing about partial functions is that they work well with orElse, andThen and collect. If you pass 0 to divide, the input can fall through to an alternative via orElse; if it isn't 0, the result can be passed along to andThen. Finally, collect applies your partial function only to the inputs on which it is defined.
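As a rough illustration of those combinators, here is a sketch built around the same divide function. The fallback function and the sample inputs are assumptions chosen for the example.

```scala
object PartialCombinators {
  val divide: PartialFunction[Int, Int] = { case d if d != 0 => 42 / d }

  // Hypothetical fallback that handles only the input divide rejects.
  val fallback: PartialFunction[Int, Int] = { case 0 => 0 }

  // orElse: try divide first, then fallback where divide is undefined.
  val total: PartialFunction[Int, Int] = divide.orElse(fallback)

  // andThen: post-process the result; still undefined at 0.
  val plusOne: PartialFunction[Int, Int] = divide.andThen((n: Int) => n + 1)

  // collect: filter and map in one pass, skipping undefined inputs.
  val collected: List[Int] = List(0, 6, 7).collect(divide)

  // lift: turn the partial function into a total one returning Option.
  val lifted: Int => Option[Int] = divide.lift
}
```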
The way you create a partial function is usually through pattern matching with case as shown in your example.
Last thing, an anonymous function in Scala is like a lambda in Python. It's just a way of creating a function without "naming" it.
E.g.
val f: Int => Int = (x: Int) => x * x
List(1, 2, 3).collect {
case a: Int => 1 - a
}
def weirdfunc(message: String, f: (Int, Int) => Int): Unit = {
println(message + s" ${f(3,5)}")
}
I have the function as above. How do I make it so that the function f is generic for all Numeric types?
What you want is called a higher-ranked type (specifically, rank 2). Haskell has support for these types, and Scala gets a lot of its type theory ideas from Haskell, but Scala 2 does not directly support this particular feature (Scala 3 later added polymorphic function types).
Now, the thing is, with a bit of black magic, we can get Scala to do what you want, but the syntax is... not pretty. In Scala 2, function values are always monomorphic, but you want to pass a polymorphic function around as an argument. We can't do that, but we can pass a polymorphic function-like object around that looks and behaves mostly like a function. What would this object look like?
trait NumFunc {
def apply[A : Numeric](a: A, b: A): A
}
It's just a trait that defines a polymorphic apply. Now we can define the function that you really want.
def weirdfunc(message: String, f: NumFunc) = ???
The trouble here, as I mentioned, is that the syntax is really quite atrocious. To call this, we can't just pass in a function anymore. We have to create a NumFunc and pass that in. Essentially, from a type-theoretic perspective, we have to prove to the compiler that our function works for all numeric A. For instance, calling the simple weirdfunc that only takes integers and passing it the addition function is very simple.
weirdfunc("Some message", (_ + _))
However, to call our "special" weirdfunc that works for all number types, we have to write this mess.
weirdfunc("Hi", new NumFunc {
override def apply[A : Numeric](a: A, b: A): A = {
import math.Numeric.Implicits._
a + b
}
})
And we can't hide that away with an implicit conversion because, as I alluded to earlier, functions are monomorphic, so any conversion coming out of a function type is going to be monomorphic.
Bottom line. Is it possible? Yes. Is it worth the costs in terms of readability and usability? Probably not.
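For completeness, here is one way the pieces could fit together in a runnable form. The body of weirdfunc is an assumption (the answer leaves it as ???): this sketch returns a string and applies f at both Int and Double to show that a single NumFunc value really is used at two different types.

```scala
object Rank2Demo {
  trait NumFunc {
    def apply[A: Numeric](a: A, b: A): A
  }

  // One value that works for every Numeric type.
  val add: NumFunc = new NumFunc {
    def apply[A: Numeric](a: A, b: A): A = implicitly[Numeric[A]].plus(a, b)
  }

  // Assumed body: use f at Int and at Double within the same call.
  def weirdfunc(message: String, f: NumFunc): String =
    s"$message ${f(3, 5)} ${f(1.5, 2.5)}"

  val out: String = weirdfunc("sums:", add)
}
```

A plain (Int, Int) => Int argument could not be applied to the Double pair, which is exactly what the rank-2 encoding buys.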
Scala has a typeclass for this, so it's quite easy to achieve using a context bound and the standard lib.
def weirdfunc[T: Numeric](message: String, x: T, y: T, f: (T, T) => T): Unit = {
println(message + s" ${f(x, y)}")
}
def test[T](a: T, b: T)(implicit ev: Numeric[T]): T = ev.plus(a, b)
weirdfunc[Int]("The sum is", 3, 5, test)
// The sum is 8
Sorry cktang, you cannot generify this. The caller gets to set the generic parameter, not the called function, just as the caller passes the function's arguments.
However, you can use currying so that you pass an 'f' of type Int once and then pass different Int pairs. Then you may pass an 'f' of type Double and pass different Double pairs.
def weirdfunc[A](message: String, f: (A, A) => A)(x: A, y: A): Unit = {
println(message + s" ${f(x, y)}")
}
def g(x: Int, y: Int): Int = x * y
val wierdfuncWithF = weirdfunc("hello", g) _
wierdfuncWithF(3, 5)
wierdfuncWithF(2, 3)
In particular, what you want cannot be done, as it would break the rules of generics.
Let's say we have a function def fun(x: X): Y and we pass this function as a parameter to another function using fun _ instead of just fun. I understand that fun _ is actually a function value, while fun refers to a function definition.
For example let:
val a = List(1,2,3,4)
def fun(x: Int) = {println(x); x + 1}
Then running:
//This line works
a.map(fun _)
//This one also works, even though "fun" is not a function value
a.map(fun)
They have the same output:
1
2
3
4
resX: List[Int] = List(2, 3, 4, 5)
For the most part they seem to work the same, are there any examples in which the function value is not equivalent to the function definition?
In the signature of map, you can see that it's expecting a "function" to apply to each element.
But in your code, fun is a regular method in a class. So when you do:
a.map(fun _)
you are explicitly asking for eta-expansion. When you do:
a.map(fun)
you are implicitly asking for eta-expansion.
Because fun is a "method", and is being used in a place where a Function type is expected, it's automagically converted to that type. Basically to something like:
new Function1[Int, Int] {
def apply(x: Int): Int = fun(x)
}
This transformation converting the name fun to a Function is called eta-expansion. See documentation for details.
Unfortunately, there are various ways of doing what you're doing - a.map(fun), a.map(fun _), a.map(fun(_)) and a.map(x => fun(x)). This is one of those frequent scenarios in Scala where you can explicitly do something yourself, or explicitly ask the compiler to do it for you, or just let the compiler do it implicitly. They can have different behaviors because of implicits and it can be a major source of confusion. Also, _ is heavily overloaded in the language, only adding to the confusion. So I use implicit behavior sparingly in general.
As others have pointed out in the comments, you need to use the fun _ syntax (which performs an eta expansion) when you need a value (methods have no value by themselves). Within the context of a map (or other functional contexts) an eta expansion is performed on the method implicitly. There are some cases where the eta expansion must be triggered manually.
As a concrete example of where an explicit eta expansion is needed, consider this valid snippet:
def f1(x: Int): Int = 2*x
def f2(x: Int): Int = 3*x
val fList1 = List(f1 _, f2 _)
fList1.map(_(2)) // List(4, 6)
as opposed to this snippet, which Scala 2 rejects because no function type is expected and f1 and f2 are methods, not values:
val fList2 = List(f1, f2)
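A third option, sketched below, is to state the element type explicitly. The expected type Int => Int on each argument is then enough for the compiler to eta-expand f1 and f2 without the trailing underscore. The names fList3 and results are made up for this example.

```scala
object EtaInList {
  def f1(x: Int): Int = 2 * x
  def f2(x: Int): Int = 3 * x

  // The explicit element type gives each argument an expected function type,
  // so the methods are eta-expanded implicitly.
  val fList3 = List[Int => Int](f1, f2)

  val results: List[Int] = fList3.map(f => f(2))
}
```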
I am new to Scala. I just heard the term "eta expansion" and roughly know that it means expanding a method into a function object. But I find few resources on SO that introduce it systematically.
I am curious about how eta expansion works in Scala. In what scenarios is eta expansion needed? And how is eta expansion implemented in Scala?
I roughly know that in cases like this:
def someMethod(x: Int): Int = x * x
someMethod _ will roughly be translated to a new function object like this:
new Function1[Int, Int] {
def apply(x: Int): Int = x * x
}
Is it all that Scala does?
The definition, and some examples, are given in http://scala-lang.org/files/archive/spec/2.11/06-expressions.html#method-values.
someMethod _ will roughly be translated to a new function object like this:
Not quite: it's actually
new Function1[Int, Int] {
def apply(x: Int): Int = someMethod(x)
}
The difference matters e.g. if someMethod is overridden somewhere.
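A minimal sketch of why that matters, using hypothetical Base and Derived classes: the eta-expanded function calls someMethod on the current instance at invocation time, so an override is picked up, whereas a copy of the body is not affected.

```scala
object EtaOverride {
  class Base {
    def someMethod(x: Int): Int = x * x

    // Eta-expansion (triggered here by the expected type): behaves like
    // x => this.someMethod(x), so the call is dispatched dynamically.
    val expanded: Int => Int = someMethod

    // A function that merely copies the original body.
    val inlined: Int => Int = x => x * x
  }

  class Derived extends Base {
    override def someMethod(x: Int): Int = x + 100
  }

  val d = new Derived
  val viaExpanded: Int = d.expanded(3) // goes through the override
  val viaInlined: Int = d.inlined(3)   // ignores the override
}
```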
Is it all that Scala does?
You also need to take into account what happens if the method takes multiple parameter lists (you get a function which returns a function) or by-name parameters.
In what scenarios is eta expansion needed?
When you specifically ask for it (e.g. someMethod _).
When you use a method (with parameters) where a value of a function type (or a SAM type in Scala 2.12) is expected. E.g.
def foo(f: Int => Int) = ???
foo(someMethod)
That's it.
Note that using eta-expansion and an anonymous function with placeholders (someMethod(_)) can behave differently due to type inference, implicits, etc.
Eta expansion, at a high level, is the process of translating methods into functions. Why? What? Aren't they the same? Let's explain:
A method in Scala is what we know as def someMethodName(SomeParamList): SomeReturnType. It starts with def. It may have a parameter list, or even more than one. For example:
def numAdder(num1: Int)(num2: Int): Int =
num1 + num2
A function, or lambda function, looks something like: (SomeParams) => SomeReturnType. For example:
val aFunction: Int => Int => Int = (num1: Int) => (num2: Int) => num1 + num2
The important thing to understand about functions is that calling one is basically syntactic sugar for calling the apply method of a FunctionN instance.
In what scenarios is eta expansion needed?
Some examples:
Example1 - Applying a method inside map (or filter, flatMap etc)
Writing such code:
def addPlus1(x: Int): Int = x + 1
List(1,2,3).map(addPlus1)
The compiler needs to have a function inside the map. So, it transforms the method given into a function:
List(1,2,3).map(x => addPlus1(x)). This is Eta expansion.
Example2 - currying
When defining curried method, for example:
def numAdder(num1: Int)(num2: Int): Int =
num1 + num2
And then creating a function like:
val curriedFunction: Int => Int = numAdder(4)
//or
val curriedFunction2 = numAdder(4) _
We have defined a function out of a method. This is eta expansion.
Some more examples
Define a method which accepts a function value:
def someMethod(f: () => Int): Int = f()
def method(): Int = 10
And then running
someMethod(method)
will transform the method named method into a function. This is eta expansion.
I've been experimenting with Scala. I am trying to understand implicits and came across this situation.
Is the behaviour of parameter b the same for both functions?
Are parameter lists just syntactic sugar over returning a function?
My experiments show, they behave the same.
Thanks
implicit val v = 2
// 1.
def testB(a: Int)(b: Int)(implicit i: Int): Int = {
println(a + b + i)
11
}
println(testB(7)(8))
println(testB(7) {
8
})
// 2.
def testC(a: Int): (Int) => Int = {
def innerTest2C(b: Int)(implicit i: Int) = {
println(a + b + i)
11
}
innerTest2C
}
println(testC(7)(8))
println(testC(7) {
8
})
The rule is that whenever a function takes exactly one parameter, you can replace the normal parentheses () with curly braces {}. Curly braces define a block and allow you to place several statements inside it. Like any block, it evaluates to the value of its last expression.
In 2., the function testC returns another function from Int to Int, so you can call the result of testC(7) again with one parameter: testC(7)(x). If you just consider the println statements, there is nothing different here.
What you need to understand is that
def testB(a: Int)(b: Int)
is different from
def testB(a: Int, b: Int)
insofar as the former represents two functions, as in your second case. You can write testB(x) _ to obtain another function from Int to Int. Defining a method with several parameter lists in this way is called currying; applying only some of the parameters in order to obtain another function is partial application.
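One place where the two definitions from the question are not interchangeable is the implicit parameter. The sketch below is a simplified variant of the question's code (the println calls are dropped and the methods return the sum, both changes made for illustration): with testB the caller can still supply i explicitly, while in testC the implicit is resolved once, where the inner method is turned into a function.

```scala
object ImplicitSites {
  implicit val default: Int = 2

  // Implicit resolved at each call site of testB.
  def testB(a: Int)(b: Int)(implicit i: Int): Int = a + b + i

  def testC(a: Int): Int => Int = {
    def inner(b: Int)(implicit i: Int): Int = a + b + i
    inner // eta-expanded here; i is fixed to the implicit in scope (default)
  }

  val b1: Int = testB(7)(8)      // uses default
  val b2: Int = testB(7)(8)(100) // caller overrides the implicit
  val c1: Int = testC(7)(8)      // always uses default
  // testC(7)(8)(100) would not compile: testC(7)(8) is already an Int.
}
```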