I have a question about this pattern match in Scala:
val div: (Double, Double) => Double = {
case (x, y) if y != 0 => x / y
}
I understand how pattern matching works and its syntax in Scala, but this expression drives me crazy. How does the compiler know that x and y are the arguments of the function, and how does it pattern match on them?
The rules for this are defined in section 8.5, "Pattern Matching Anonymous Functions", of the Scala Language Specification. When you write an anonymous function as a sequence of cases, the expected type must be at least partially known. You provide it by declaring the type as (Double, Double) => Double, which is shorthand for Function2[Double, Double, Double].
Now:
If the expected type is scala.Functionk[S1,…,Sk, R], the expression is taken to be equivalent to the anonymous function:
(x1:S1,…,xk:Sk) => (x1,…,xk) match {
case p1 => b1 … case pn => bn
}
So no matter what the arity of your function, the pattern match is passed a tuple of the function's arguments, hence you can use the regular tuple extractor syntax.
So your example is short for
val div: (Double, Double) => Double = (a, b) => (a, b) match {
case (x, y) if y != 0 => x / y
}
or
val div = (a: Double, b: Double) => (a, b) match {
case (x, y) if y != 0 => x / y
}
The names of the pattern variables x and y are up to you. You decide what to call the elements that the tuple pattern extracts; you could just as well write case (foo, bar) => ...
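As a minimal sketch of the expansion described above (the names DivExample, divExpanded, safeDiv and volume are made up for illustration and do not come from the question), the following puts the case-based literal next to its hand-expanded form. It also adds a total variant, since the original div throws a MatchError when y is 0:

```scala
object DivExample {
  // Original form: the expected type Function2 drives the desugaring.
  val div: (Double, Double) => Double = {
    case (x, y) if y != 0 => x / y
  }

  // Hand-written equivalent of the expansion given in the specification.
  val divExpanded: (Double, Double) => Double =
    (a, b) => (a, b) match {
      case (x, y) if y != 0 => x / y
    }

  // Hypothetical total variant: covers y == 0 instead of throwing MatchError.
  val safeDiv: (Double, Double) => Option[Double] = {
    case (x, y) if y != 0 => Some(x / y)
    case _                => None
  }

  // The same mechanism applies to other arities, here Function3.
  val volume: (Double, Double, Double) => Double = {
    case (w, h, d) => w * h * d
  }

  def main(args: Array[String]): Unit = {
    println(div(6.0, 3.0))
    println(divExpanded(6.0, 3.0))
    println(safeDiv(1.0, 0.0))
    println(volume(1.0, 2.0, 3.0))
  }
}
```

Returning an Option is only one way to handle the missing case; the point is that each case clause sees the arguments packed into a tuple.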
I'm struggling to write an anonymous function with a by-name parameter. Here is what I tried.
val fun = (x: Boolean, y: =>Int) => if(x) y else 0
This fails with the following error.
Error:(106, 31) identifier expected but '=>' found.
val fun = (x: Boolean, y: =>Int) => if(x) y else 0
^
Error:(109, 3) ')' expected but '}' found.
}
^
However, the same code works as a standard method.
def fun1(x: Boolean, y: =>Int) = if(x) y else 0
Any pointers?
---------------Edit-----------------
I had a two-part problem. senia's answer solved the initial case. Suppose I have a function that takes a function.
def xxx[A,B](f:(A,=>B)=>B)={}
With senia's solution it works.
val fun: (Int, =>Boolean) => Boolean = (x, y) => y
xxx[Int,Boolean](fun)
However, I want to get rid of the intermediate fun and call xxx with an anonymous function. Doing
xxx((Int, =>Boolean) => Boolean = (x, y) => y)
will not work. Any ideas how to do this?
You could specify the type of the anonymous function instead of the types of its parameters, like this:
val fun: (Boolean, => Int) => Int = (x, y) => if(x) y else 0
scala> fun(false, {println("!"); 2})
res1: Int = 0
scala> fun(true, {println("!"); 2})
!
res2: Int = 2
=> Int is not a valid type on its own; it is special syntax for by-name parameters, allowed only in the parameter block of a method declaration or in an anonymous function type.
See SLS 4.6, Function Declarations and Definitions:
ParamType ::= Type
| ‘=>’ Type
| Type ‘*’
If you don't want to assign the anonymous function to a variable, you could either use type inference like this:
xxx[Int, Boolean]{ (x, y) => y }
Or specify its type this way:
xxx({ (x, y) => y }: ((Int, => Boolean) => Boolean))
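To see that the second parameter really is passed by name, here is a small sketch. The names ByNameExample, evaluations, expensive and pick are hypothetical and exist only to count how often the argument is evaluated:

```scala
object ByNameExample {
  // Counts how often the by-name argument is actually evaluated.
  var evaluations = 0

  def expensive(): Int = {
    evaluations += 1
    42
  }

  // The by-name marker lives in the function type, not in the lambda's parameter list.
  val pick: (Boolean, => Int) => Int = (use, value) => if (use) value else 0

  def main(args: Array[String]): Unit = {
    println(pick(false, expensive())) // the argument is never forced
    println(evaluations)
    println(pick(true, expensive()))  // the argument is forced once
    println(evaluations)
  }
}
```

The counter only moves when the lambda body touches value, which matches the REPL session above where "!" is printed only for fun(true, ...).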
Given Scala 2.12.6:
val list = List(1)
val x = 2
This works:
list.map ( y => x + y )
returning List[Int] = List(3)
and this works:
list.map ( (y: Int) => x + y )
returning the same value.
Same for this:
list.map { (y: Int) => x + y }
And same for this:
list.map { y: Int => x + y }
Yet this fails:
list.map ( y: Int => x + y )
producing the error:
error: not found: type +
list.map ( y: Int => x + y )
^
Why does Scala think the + is meant to indicate a type, and where is this difference between parentheses and curly braces documented and explained?
Section 6.23 of the language specification, on anonymous functions, says:
In the case of a single untyped formal parameter, (x) => e can be abbreviated to x => e. If an anonymous function (x: T) => e with a single typed parameter appears as the result expression of a block, it can be abbreviated to x: T => e.
Thus, in a block { ... }, the function literal (y: Int) => x + y can be abbreviated to just y: Int => x + y.
Without the block, the entire Int => x + y part is treated as a type ascription, so the error message actually makes sense. For example, here is a context in which the offending expression becomes valid:
type y = Unit
type x = Unit
type +[A, B] = Int
val y = (i: Int) => 42 + i
val list = List(1)
println(
list.map ( y: Int => x + y )
) // happily prints `List(43)`.
This is because there are two ys in two separate scopes (one value, one type alias), so that (y: Int => x + y) becomes (y: Int => +[x, y]), and then (y: Int => Int), which is just a type ascription enforcing that value y is indeed of function type Int => Int (which it is, so everything compiles and runs). Here is another similar example.
My suggestion: stick to the slightly more verbose (foo: Foo) => { ... } notation; it will cause fewer surprises for everyone who reads or modifies the code. Otherwise there is some risk that:
argument types in bindings collide with type ascriptions
=> of the anonymous lambda collides with function type =>
arithmetic operation + collides with binary infix type constructor +[_,_]
values x, y collide with undefined types x, y.
The fact that the same syntax can denote both types and expressions can be somewhat of a double-edged sword.
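A short sketch of the forms that avoid the ambiguity, using the list and x from the question (the object name LambdaSyntax and the val names are made up). The block-only abbreviation y: Int => x + y and the failing parenthesized form are left out of the live code on purpose:

```scala
object LambdaSyntax {
  val list = List(1)
  val x = 2

  // Untyped parameter: parentheses around the parameter are optional.
  val untyped: List[Int] = list.map(y => x + y)

  // Typed parameter wrapped in its own parentheses: fine inside ( ... ) and { ... }.
  val typedParens: List[Int] = list.map((y: Int) => x + y)
  val typedBraces: List[Int] = list.map { (y: Int) => x + y }

  // Deliberately not compiled here: list.map(y: Int => x + y), which is read
  // as the expression y ascribed with the type Int => (x + y).

  def main(args: Array[String]): Unit = {
    println(untyped)
    println(typedParens)
    println(typedBraces)
  }
}
```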
So I have this function in Scala:
def f(a: Int)(b: Int)(c: Double)(d: Double): Double = a * c + b * d
The question is: what are the three types that make the following statements compile?
def g: <Type1> = f(1)(2)(3.0)
def h: <Type2> = f(1)(2)
def k: <Type3> = f(1)
I'm still new to Scala and I don't really understand the concept of currying. An answer to this question with some explanation would really help me. Thanks.
First, the main idea: a function that takes two parameters a and b and returns a value c can be viewed as a function that takes a and returns a function that takes b and returns c. This change of point of view is called currying.
Imagine a function that sums up two numbers. You give it 2 and 3, it returns 5. It can be viewed as a function that takes one number and returns a function from a number to a number. You give it a 2, it returns a function that takes some number and adds 2 to it.
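The sum example can be written down roughly like this. The names CurriedAdd, add, addCurried, addViaLibrary and addTwo are illustrative; curried is the standard method on Function2:

```scala
object CurriedAdd {
  // Uncurried: takes both numbers at once.
  val add: (Int, Int) => Int = (a, b) => a + b

  // Curried by hand: takes one number and returns a function awaiting the other.
  val addCurried: Int => Int => Int = a => b => a + b

  // The standard library can derive the curried form from a Function2.
  val addViaLibrary: Int => Int => Int = add.curried

  // Supplying only the first number gives "a function that adds 2".
  val addTwo: Int => Int = addCurried(2)

  def main(args: Array[String]): Unit = {
    println(add(2, 3))
    println(addCurried(2)(3))
    println(addViaLibrary(2)(3))
    println(addTwo(3))
  }
}
```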
Now, the types you asked for:
// Valid Scala, given the definition of f above.
// Three argument lists are supplied; only "d" remains.
// Same as (d: Double) => 1 * 3.0 + 2 * d
def g: Double => Double = f(1)(2)(3.0)
// Two argument lists are supplied; "c" and "d" remain.
// Same as (c: Double) => (d: Double) => 1 * c + 2 * d
// (the uncurried (Double, Double) => Double is a different type and does not fit)
def h: Double => Double => Double = f(1)(2)
// One argument list is supplied; "b", "c" and "d" remain. Note that b is an Int.
// Same as (b: Int) => (c: Double) => (d: Double) => 1 * c + b * d
def k: Int => Double => Double => Double = f(1)
Currying is, in my opinion, one of the most confusing concepts in Scala. The term comes from the functional programming paradigm and, according to Wikipedia, is
the technique of translating the evaluation of a function that takes
multiple arguments (or a tuple of arguments) into evaluating a
sequence of functions, each with a single argument.
which means that the function call f(a, b, c) is represented by f(a)(b)(c). Looks like Scala? Not exactly. Here we have three function calls, and all but the last return another function. The type of such a curried f with your parameters (in Scala speak) would be Int => (Int => (Double => (Double => Double))). Let's look at your f:
scala> def f(a: Int)(b: Int)(c: Double)(d: Double): Double = a * c + b * d
f: (a: Int)(b: Int)(c: Double)(d: Double)Double
As you can see, there are no arrows here. What we have is a method with multiple parameter lists. A method is not a value and can't be assigned or passed anywhere; it belongs to an object. A function, on the other hand, is an object and can be assigned or passed to another method or function. In most cases, omitting parameter lists is not allowed for methods:
scala> f(0)
<console>:01: error: missing argument list for method f
Unapplied methods are only converted to functions when a function type is expected.
You can make this conversion explicit by writing `f _` or `f(_)(_)(_)(_)` instead of `f`.
There is one exception, though, as the error message implies: if f(0) is placed where a function type is expected, Scala will perform automatic eta-expansion, which means it will convert your method to a function:
scala> val fl: (Int => (Double => (Double => Double))) = f(0)
fl: Int => (Double => (Double => Double)) = $$Lambda$1342/937956960@43c1614
where eta-expansion means literally this:
scala> val fl: (Int => (Double => (Double => Double))) = (b => (c => (d => f(0)(b)(c)(d))))
fl: Int => (Double => (Double => Double)) = $$Lambda$1353/799716194@52048150
Another (explicit) way to convert a method to a curried function is to use the placeholder _ (which gives you the correct types right away):
scala> f _
res11: Int => (Int => (Double => (Double => Double))) = $$Lambda$1354/1675405592@4fa649d8
scala> f(0) _
res12: Int => (Double => (Double => Double)) = $$Lambda$1355/1947050122@ba9f744
Also be aware that:
def g: Int => (Double => (Double => Double)) = f(0)
is in fact
def g: Int => (Double => (Double => Double)) = (b => (c => (d => f(0)(b)(c)(d))))
i.e. it's a method g, which creates a function on the fly and returns it. So g(0) means "call method g without parameters, get back a function and apply it to 0".
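As a rough check of the claim that eta-expansion and the hand-written lambda are the same thing, the sketch below applies both step by step. The names EtaExpansionExample, auto, manual, step1, step2 and result are invented for this illustration:

```scala
object EtaExpansionExample {
  def f(a: Int)(b: Int)(c: Double)(d: Double): Double = a * c + b * d

  // Automatic eta-expansion: the expected type is a function type.
  val auto: Int => Double => Double => Double = f(0)

  // The same thing written out by hand.
  val manual: Int => Double => Double => Double =
    b => c => d => f(0)(b)(c)(d)

  // Arguments can then be supplied one step at a time.
  val step1: Double => Double => Double = auto(2)
  val step2: Double => Double = step1(3.0)
  val result: Double = step2(4.0)

  def main(args: Array[String]): Unit = {
    println(result)
    println(manual(2)(3.0)(4.0))
    println(f(0)(2)(3.0)(4.0))
  }
}
```

Each intermediate value is an ordinary function object, which is why it can be stored in a val, unlike the method f itself.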
I am working with Spark and am not an expert in Scala. I have two variants of a map call. Could you please explain the difference between them?
First variant, the familiar format:
val.map( (x,y) => x.size())
Second variant, which is applied to a tuple:
val.map({case (x, y) => y.toString()});
The type of val is RDD[(IntWritable, Text)]. When I tried the first variant, it gave the error below.
type mismatch;
found : (org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text) ⇒ Unit
required: ((org.apache.hadoop.io.IntWritable, org.apache.hadoop.io.Text)) ⇒ Unit
When I added extra parentheses, it said:
Tuples cannot be directly destructured in method or function parameters.
Well, you say:
The type of val is RDD[(IntWritable, Text)]
so each element is a tuple of arity 2 with IntWritable and Text as components.
If you say
val.map( (x,y) => x.size())
what you are doing is passing a Function2, a function with two arguments, to map. This will never compile because map wants a function with one argument. What you can do is the following:
val.map((xy: (IntWritable, Text)) => xy._2.toString)
using ._2 to get the second part of the tuple, which is passed in as xy (the type annotation is not required but makes it clearer).
Now the second variant (you can leave out the outer parens):
val.map { case (x, y) => y.toString() }
This is special Scala syntax for a pattern-matching anonymous function: it immediately matches on the tuple that is passed in, giving access to its x and y parts. Because map expects a one-argument function, the literal is compiled to a Function1 here (Function1[A, B] can be written as A => B); where a PartialFunction is expected, the same syntax produces a PartialFunction, which itself extends Function1.
Hope that makes it clearer :)
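The two working styles can be tried without Spark. This sketch assumes a plain List of pairs as a stand-in for RDD[(IntWritable, Text)]; the names TupleMapExample, pairs, viaAccessor and viaCase are made up:

```scala
object TupleMapExample {
  // A plain List of pairs standing in for RDD[(IntWritable, Text)].
  val pairs: List[(Int, String)] = List((1, "one"), (2, "two"))

  // One parameter that is a tuple, accessed with ._2.
  val viaAccessor: List[String] = pairs.map(xy => xy._2.toUpperCase)

  // Pattern-matching anonymous function: destructures the tuple into x and y.
  val viaCase: List[String] = pairs.map { case (x, y) => y.toUpperCase }

  def main(args: Array[String]): Unit = {
    println(viaAccessor)
    println(viaCase)
  }
}
```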
I tried this in the REPL:
scala> val l = List(("firstname", "tom"), ("secondname", "kate"))
l: List[(String, String)] = List((firstname,tom), (secondname,kate))
scala> l.map((x, y) => x.size)
<console>:9: error: missing parameter type
Note: The expected type requires a one-argument function accepting a 2-Tuple.
Consider a pattern matching anonymous function, `{ case (x, y) => ... }`
l.map((x, y) => x.size)
Maybe the compiler's note can give you some inspiration.
Your first example is a function that takes two arguments and returns a String. This is similar to this example:
scala> val f = (x:Int,y:Int) => x + y
f: (Int, Int) => Int = <function2>
You can see that the type of f is (Int, Int) => Int (I changed it slightly to return an Int instead of a String). This means it is a function that takes two Ints as arguments and returns an Int as a result.
Now the second example you have is syntactic sugar (a shortcut) for writing something like this:
scala> val g = (k: (Int, Int)) => k match { case (x: Int, y: Int) => x + y }
g: ((Int, Int)) => Int = <function1>
You can see that the type of g is now ((Int, Int)) => Int. Can you spot the difference? The input type of g has two sets of parentheses. This shows that g takes one argument, and that argument must be a Tuple2[Int, Int] (or (Int, Int) for short).
Going back to your RDD, what you have is a collection of Tuple2[IntWritable, Text], so the second function will work, whereas the first one will not.
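If you already have a two-argument function, the standard library can convert between the two shapes. A small sketch, with TupledExample, pairs, viaTupled, viaG and h as illustrative names; tupled and Function.untupled are standard library members:

```scala
object TupledExample {
  // Two-argument function: type (Int, Int) => Int.
  val f: (Int, Int) => Int = (x, y) => x + y

  // One-argument function over a pair: type ((Int, Int)) => Int.
  val g: ((Int, Int)) => Int = { case (x, y) => x + y }

  val pairs = List((1, 2), (2, 3))

  // In Scala 2, f does not fit map directly, but its tupled form does.
  val viaTupled: List[Int] = pairs.map(f.tupled)
  val viaG: List[Int] = pairs.map(g)

  // Going the other way: from a function over a pair to a two-argument function.
  val h: (Int, Int) => Int = Function.untupled(g)

  def main(args: Array[String]): Unit = {
    println(viaTupled)
    println(viaG)
    println(h(1, 2))
  }
}
```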
I had a List of Scala tuples like the following:
val l = List((1,2),(2,3),(3,4))
and I wanted to map it to a list of Int where each item is the sum of the Ints in the corresponding tuple. I also didn't want to use the x._1 notation, so I solved the problem with pattern matching like this
def addTuple(t: (Int, Int)) : Int = t match {
case (first, second) => first + second
}
var r = l map addTuple
Doing that I obtained the list r: List[Int] = List(3, 5, 7) as expected. At this point, almost by accident, I discovered that I can achieve the same result with an abbreviated form like the following:
val r = l map {case(first, second) => first + second}
I cannot find any reference to this syntax in the documentation I have. Is that normal? Am I missing something trivial?
See Section 8.5 of the language reference, "Pattern Matching Anonymous Functions".
An anonymous function can be defined by a sequence of cases
{ case p1 => b1 ... case pn => bn }
which appear as an expression without a prior match. The expected type of such an expression must in part be defined. It must be either scala.Functionk[S1, ..., Sk, R] for some k > 0, or scala.PartialFunction[S1, R], where the argument type(s) S1, ..., Sk must be fully determined, but the result type R may be undetermined.
The expected type determines whether this is translated to a FunctionN or a PartialFunction.
scala> {case x => x}
<console>:6: error: missing parameter type for expanded function ((x0$1) => x0$1 match {
case (x @ _) => x
})
{case x => x}
^
scala> {case x => x}: (Int => Int)
res1: (Int) => Int = <function1>
scala> {case x => x}: PartialFunction[Int, Int]
res2: PartialFunction[Int,Int] = <function1>
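The practical difference between the two translations shows up when no case matches. The following sketch uses invented names (CaseLiteralTypes, asFunction, asPartial) and a guard so that some inputs are not covered:

```scala
import scala.util.Try

object CaseLiteralTypes {
  // Expected type Function1: the literal becomes an ordinary function.
  val asFunction: Int => String = { case n if n > 0 => "positive" }

  // Expected type PartialFunction: same literal, but isDefinedAt is available.
  val asPartial: PartialFunction[Int, String] = { case n if n > 0 => "positive" }

  def main(args: Array[String]): Unit = {
    // The partial function can be asked whether it covers an input.
    println(asPartial.isDefinedAt(-1))
    // collect uses isDefinedAt to skip inputs that are not covered.
    println(List(-1, 2).collect(asPartial))
    // The plain function throws a MatchError for an input that is not covered.
    println(Try(asFunction(-1)).isFailure)
  }
}
```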
{case(first, second) => first + second} is a pattern-matching anonymous function, often described as a partial function literal; passed to map, it is compiled to an ordinary Function1. See the examples in the "Partial Functions" section here: http://programming-scala.labs.oreilly.com/ch08.html or section 15.7 of Programming in Scala.
The map method accepts a function. In your first example you define a function under a name and pass it to map. In the second example you pass the function directly, without giving it a name first. You are doing exactly the same thing.
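Both versions from the question can be placed side by side to confirm they agree. The wrapper object AddTupleExample and the names viaMethod and viaLiteral are added here only for illustration:

```scala
object AddTupleExample {
  val l = List((1, 2), (2, 3), (3, 4))

  // Named method, converted to a function when it is passed to map.
  def addTuple(t: (Int, Int)): Int = t match {
    case (first, second) => first + second
  }

  val viaMethod: List[Int] = l.map(addTuple)

  // The same logic passed directly as a pattern-matching anonymous function.
  val viaLiteral: List[Int] = l.map { case (first, second) => first + second }

  def main(args: Array[String]): Unit = {
    println(viaMethod)
    println(viaLiteral)
  }
}
```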