Not sure how to properly formulate the question, there is a problem with currying in the merge sort example from Scala by Example book on page 69. The function is defined as follows:
def msort[A](less: (A, A) => Boolean)(xs: List[A]): List[A] = {
def merge(xs1: List[A], xs2: List[A]): List[A] =
if (xs1.isEmpty) xs2
else if (xs2.isEmpty) xs1
else if (less(xs1.head, xs2.head)) xs1.head :: merge(xs1.tail, xs2)
else xs2.head :: merge(xs1, xs2.tail)
val n = xs.length/2
if (n == 0) xs
else merge(msort(less)(xs take n), msort(less)(xs drop n))
}
and then there is an example of how to create other functions from it by currying:
val intSort = msort((x : Int, y : Int) => x < y)
val reverseSort = msort((x:Int, y:Int) => x > y)
however these two lines give me errors about insufficient number of arguments. And if I do like this:
val intSort = msort((x : Int, y : Int) => x < y)(List(1, 2, 4))
val reverseSort = msort((x:Int, y:Int) => x > y)(List(4, 3, 2))
it WILL work. Why? Can someone explain? Looks like the book is really dated since it is not the first case of such an inconsistance in its examples. Could anyone point to something more real to read? (better a free e-book).
My compiler (2.9.1) agrees, there seems to be an error here, but the compiler does tell you what to do:
error: missing arguments for method msort in object $iw;
follow this method with `_' if you want to treat it as a partially applied function
So, this works:
val intSort = msort((x : Int, y : Int) => x < y) _
Since the type of intSort is inferred, the compiler doesn't otherwise seem to know whether you intended to partially apply, or whether you missed arguments.
The _ can be omitted when the compiler can infer from the expected type that a partially applied function is what is intended. So this works too:
val intSort: List[Int] => List[Int] = msort((x: Int, y: Int) => x < y)
That's obviously more verbose, but more often you will take advantage of this without any extra boilerplate, for example if msort((x: Int, y: Int) => x < y) were the argument to a function where the parameter type is already known to be List[Int] => List[Int].
Edit: Page 181 of the current edition of the Scala Language Specification mentions a tightening of rules for implicit conversions of partially applied methods to functions since Scala 2.0. There is an example of invalid code very similar to the one in Scala by Example, and it is described as "previously legal code".
Related
Simple question, and sorry if this is a stupid question as I am just beginning in scala. I am getting a type mismatch error that says:
found : (AnyRef, org.apache.tinkerpop.gremlin.hadoop.structure.io.VertexWritable) => List[Object]
required: ((AnyRef, org.apache.tinkerpop.gremlin.hadoop.structure.io.VertexWritable)) => scala.collection.GenTraversableOnce[?]
But according to this post (I have a Scala List, how can I get a TraversableOnce?), a scala.collection.immutable.List is an Iterable and therefore also a GenTraversableOnce. And yet this error seems to indicate otherwise. And furthermore, when I actually look at the link in the accepted answer of that post, I don't see any reference to the word "traversable".
If the problem has to do with my inner class not being correct, then I have to say this error is extremely uninformative, since requiring that the inner class be of type "?" is obviously a vacuous statement ... Any help in understanding this would be appreciated.
Function2[X, Y, Z] is not the same thing as Function1[(X, Y), Z].
Compare these two definitions:
val f: ((Int, Int)) => Int = xy => xy._1 + xy._2
val f: (Int, Int) => Int = (x, y) => x + y
The first could also be written with a pattern-matching, that first decomposes the tuple:
val f: ((Int, Int)) => Int = { case (x, y) => x + y }
This is exactly what the error message asks you to do: provide an unary function that takes a tuple as argument, not a binary function. Note that there is the tupled-method, that does exactly that.
The return types of the functions are mostly irrelevant here, the compiler doesn't get to unify them, because it fails on the types of the inputs.
Also related:
Same story with eta-expansions: Why does my implementation of Haskell snd not compile in Scala
Whilst I understand what a partially applied/curried function is, I still don't fully understand why I would use such a function vs simply overloading a function. I.e. given:
def add(a: Int, b: Int): Int = a + b
val addV = (a: Int, b: Int) => a + b
What is the practical difference between
def addOne(b: Int): Int = add(1, b)
and
def addOnePA = add(1, _:Int)
// or currying
val addOneC = addV.curried(1)
Please note I am NOT asking about currying vs partially applied functions as this has been asked before and I have read the answers. I am asking about currying/partially applied functions VS overloaded functions
The difference in your example is that overloaded function will have hardcoded value 1 for the first argument to add, i.e. set at compile time, while partially applied or curried functions are meant to capture their arguments dynamically, i.e. at run time. Otherwise, in your particular example, because you are hardcoding 1 in both cases it's pretty much the same thing.
You would use partially applied/curried function when you pass it through different contexts, and it captures/fills-in arguments dynamically until it's completely ready to be evaluated. In FP this is important because many times you don't pass values, but rather pass functions around. It allows for higher composability and code reusability.
There's a couple reasons why you might prefer partially applied functions. The most obvious and perhaps superficial one is that you don't have to write out intermediate functions such as addOnePA.
List(1, 2, 3, 4) map (_ + 3) // List(4, 5, 6, 7)
is nicer than
def add3(x: Int): Int = x + 3
List(1, 2, 3, 4) map add3
Even the anonymous function approach (that the underscore ends up expanding out to by the compiler) feels a tiny bit clunky in comparison.
List(1, 2, 3, 4) map (x => x + 3)
Less superficially, partial application comes in handy when you're truly passing around functions as first-class values.
val fs = List[(Int, Int) => Int](_ + _, _ * _, _ / _)
val on3 = fs map (f => f(_, 3)) // partial application
val allTogether = on3.foldLeft{identity[Int] _}{_ compose _}
allTogether(6) // (6 / 3) * 3 + 3 = 9
Imagine if I hadn't told you what the functions in fs were. The trick of coming up with named function equivalents instead of partial application becomes harder to use.
As for currying, currying functions often lets you naturally express transformations of functions that produce other functions (rather than a higher order function that simply produces a non-function value at the end) which might otherwise be less clear.
For example,
def integrate(f: Double => Double, delta: Double = 0.01)(x: Double): Double = {
val domain = Range.Double(0.0, x, delta)
domain.foldLeft(0.0){case (acc, a) => delta * f(a) + acc
}
can be thought of and used in the way that you actually learned integration in calculus, namely as a transformation of a function that produces another function.
def square(x: Double): Double = x * x
// Ignoring issues of numerical stability for the moment...
// The underscore is really just a wart that Scala requires to bind it to a val
val cubic = integrate(square) _
val quartic = integrate(cubic) _
val quintic = integrate(quartic) _
// Not *utterly* horrible for a two line numerical integration function
cubic(1) // 0.32835000000000014
quartic(1) // 0.0800415
quintic(1) // 0.015449626499999999
Currying also alleviates a few of the problems around fixed function arity.
implicit class LiftedApply[A, B](fOpt: Option[A => B]){
def ap(xOpt: Option[A]): Option[B] = for {
f <- fOpt
x <- xOpt
} yield f(x)
}
def not(x: Boolean): Boolean = !x
def and(x: Boolean)(y: Boolean): Boolean = x && y
def and3(x: Boolean)(y: Boolean)(z: Boolean): Boolean = x && y && z
Some(not _) ap Some(false) // true
Some(and _) ap Some(true) ap Some(true) // true
Some(and3 _) ap Some(true) ap Some(true) ap Some(true) // true
By having curried functions, we've been able to "lift" a function to work on Option for as many arguments as we need. If our logic functions had not been curried, then we would have had to have separate functions to lift A => B to Option[A] => Option[B], (A, B) => C to (Option[A], Option[B]) => Option[C], (A, B, C) => D to (Option[A], Option[B], Option[C]) => Option[D] and so on for all the arities we cared about.
Currying also has some other miscellaneous benefits when it comes to type inference and is required if you have both implicit and non-implicit arguments for a method.
Finally, the answers to this question list out some more times you might want currying.
I am working on spark and not an expert in scala. I have got the two variants of map function. Could you please explain the difference between them.?
first variant and known format.
first variant
val.map( (x,y) => x.size())
Second variant -> This has been applied on tuple
val.map({case (x, y) => y.toString()});
The type of val is RDD[(IntWritable, Text)]. When i tried with first function, it gave error as below.
type mismatch;
found : (org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text) ⇒ Unit
required: ((org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text)) ⇒ Unit
When I added extra parenthesis it said,
Tuples cannot be directly destructured in method or function parameters.
Well you say:
The type of val is RDD[(IntWritable, Text)]
so it is a tuple of arity 2 with IntWritable and Text as components.
If you say
val.map( (x,y) => x.size())
what you're doing is you are essentially passing in a Function2, a function with two arguments to the map function. This will never compile because map wants a function with one argument. What you can do is the following:
val.map((xy: (IntWritable, Text)) => xy._2.toString)
using ._2 to get the second part of the tuple which is passed in as xy (the type annotation is not required but makes it more clear).
Now the second variant (you can leave out the outer parens):
val.map { case (x, y) => y.toString() }
this is special scala syntax for creating a PartialFunction that immediately matches on the tuple that is passed in to access the x and y parts. This is possible because PartialFunction extends from the regular Function1 class (Function1[A,B] can be written as A => B) with one argument.
Hope that makes it more clear :)
I try this in repl:
scala> val l = List(("firstname", "tom"), ("secondname", "kate"))
l: List[(String, String)] = List((firstname,tom), (secondname,kate))
scala> l.map((x, y) => x.size)
<console>:9: error: missing parameter type
Note: The expected type requires a one-argument function accepting a 2-Tuple.
Consider a pattern matching anonymous function, `{ case (x, y) => ... }`
l.map((x, y) => x.size)
maybe can give you some inspire.
Your first example is a function that takes two arguments and returns a String. This is similar to this example:
scala> val f = (x:Int,y:Int) => x + y
f: (Int, Int) => Int = <function2>
You can see that the type of f is (Int,Int) => Int (just slightly changed this to be returning an int instead of a string). Meaning that this is a function that takes two Int as arguments and returns an Int as a result.
Now the second example you have is a syntactic sugar (a shortcut) for writing something like this:
scala> val g = (k: (Int, Int)) => k match { case (x: Int, y: Int) => x + y }
g: ((Int, Int)) => Int = <function1>
You see that the return type of function g is now ((Int, Int)) => Int. Can you spot the difference? The input type of g has two parentheses. This shows that g takes one argument and that argument must be a Tuple[Int,Int] (or (Int,Int) for short).
Going back to your RDD, what you have is an Collection of Tuple[IntWritable, Text] so the second function will work, whereas the first one will not work.
If I'm using the F# interpreter, I can define a simple function like this:
> // Function to check if x is an integer multiple of y
> let multipleOf x y = (x % y = 0);;
val multipleOf : x:int -> y:int -> bool
If I know a function exists in the F# interpreter session but I'm unsure of its precise type, I can ask the interpreter to give me its type simply by typing the function's name:
> // I can't remember the type of the function multipleOf!
> multipleOf;;
val it : (int -> int -> bool) = <fun:it#12-1>
Clearly, this tells me that the function multipleOf is of type int->int->bool. I find this incredibly useful as a tool to jog my memory when working in the F# interpreter.
However, I can't seem to find similar functionality in Scala's REPL. I can define an equivalent function in Scala easily enough of course:
def multipleOf(x: Int, y: Int) = x % y == 0
But if I'm ten minutes on in my Scala REPL session and can't remember the type of the function, typing multipleOf gives no information about the type (in fact, it gives an error). Similarly, :type multipleOf tells me nothing useful.
scala> val f = (i: Int, j: Int) => i % j == 0
f: (Int, Int) => Boolean = <function2>
scala> f
res2: (Int, Int) => Boolean = <function2>
scala> def multipleOf(x: Int, y: Int) = x % y == 0
multipleOf: (x: Int, y: Int)Boolean
scala> :type multipleOf(_, _)
(Int, Int) => Boolean
Yuck! This is one of those occasions where the solution to a question occurs to you just as you're about to submit the question to StackOverflow. Hopefully someone will find it useful if I answer it here myself.
It turns out that Scala will play ball on providing type information for functions as long as you tell it to evaluate the function as a partially evaluated function! In other words, the following does the trick:
scala> multipleOf _
res0: (Int, Int) => Boolean = <function2>
In other words, the REPL gives you information about the type of a function only if you reevaluate the function as a partially evaluated version of itself. This seems significantly less than optimal. ;-)
Perhaps somebody can mention in the comments why it's sane for Scala to approach things this way?
I had a List of Scala tuples like the following:
val l = List((1,2),(2,3),(3,4))
and I wanted to map it in a list of Int where each item is the sum of the Ints in a the corresponding tuple. I also didn't want to use to use the x._1 notation so I solved the problem with a pattern matching like this
def addTuple(t: (Int, Int)) : Int = t match {
case (first, second) => first + second
}
var r = l map addTuple
Doing that I obtained the list r: List[Int] = List(3, 5, 7) as expected. At this point, almost by accident, I discovered that I can achieve the same result with an abbreviated form like the following:
val r = l map {case(first, second) => first + second}
I cannot find any reference to this syntax in the documentation I have. Is that normal? Am I missing something trivial?
See Section 8.5 of the language reference, "Pattern Matching Anonymous Functions".
An anonymous function can be defined by a sequence of cases
{case p1 =>b1 ... case pn => bn }
which appear as an expression without a prior match. The expected type of such an expression must in part be defined. It must be either scala.Functionk[S1, ..., Sk, R] for some k > 0, or scala.PartialFunction[S1, R], where the argument type(s) S1, ..., Sk must be fully determined, but the result type R may be undetermined.
The expected type deternines whether this is translated to a FunctionN or PartialFunction.
scala> {case x => x}
<console>:6: error: missing parameter type for expanded function ((x0$1) => x0$1 match {
case (x # _) => x
})
{case x => x}
^
scala> {case x => x}: (Int => Int)
res1: (Int) => Int = <function1>
scala> {case x => x}: PartialFunction[Int, Int]
res2: PartialFunction[Int,Int] = <function1>
{case(first, second) => first + second} is treated as a PartialFunction literal. See examples in "Partial Functions" section here: http://programming-scala.labs.oreilly.com/ch08.html or section 15.7 of Programming in Scala.
Method map accepts a function. In your first example you create a function, assign it to a variable, and pass it to the map method. In the second example you pass your created function directly, omitting assigning it to a variable. You are doing just the same thing.