(data:Any)=>println(data) vs data:Any=>println(data) - scala

I came across the following scala code:
val x = 10 match {
case _: Int => data: Any => println(data) // no issue
}
val y = data: Any => println(data) //compiling error
In the match case clause, I don't have to write data: Any => println(data) as
(data: Any) => println(data)
But in val y = data: Any => println(data), I have to write as val y = (data: Any) => println(data). Why?

The relevant rule is
If an anonymous function (x: T) => e with a single typed parameter appears as the result expression of a block, it can be abbreviated to x: T => e.
The first case (case _: Int => data: Any => println(data)) satisfies this condition, the second one doesn't. But you can rewrite it to
val y = { data: Any => println(data) }

The difference is that one is a case clause in a pattern match and the other is an assignment. In the case clause, everything to the right of => is treated as a statement block whose result is returned, as shown below.
scala> {data: Any => println(data)}
res0: Any => Unit = $$Lambda$1141/1699449247@71468613
scala> data: Any => println(data)
<console>:1: error: ';' expected but '=>' found.
data: Any => println(data)
scala> val x = 10 match {
| case _: Int => val y = 56;data: Any => println(data) // no issue
| }
x: Any => Unit = $$Lambda$1171/115016870@8f374de
In a case clause, everything after => is treated as a statement block; in the second situation (assignment, i.e., assigning a function value) it is not. Hence the parentheses around the parameter declaration are necessary.
EDIT:
Though I gave the answer above based on my observations in the Scala REPL, I felt it did not
clearly answer the Why? part of the question. So I ran a few more
trials on how the Scala compiler behaves, starting from the exact error message we get when we type
val y = data:Int => println(data)
scala> val y = data:Int => println(data)
<console>:1: error: ';' expected but '(' found.
val y = data:Int => println(data)
This seems to be due to Scala's type inference for val y.
In Scala, a type can be specified explicitly wherever required, and the compiler can infer it implicitly wherever it can. Here the type of y must be inferred, because it was not declared explicitly, and it is during this inference attempt that the error message above occurred.
Now if we want to explicitly declare the type there are two ways in scala:
First way : val y : Int = 5
Second way : val y = 5:Int
Both of the above assignment statements are valid in scala.
Because in our specific assignment statement, i.e.,
val y = data: Any => println(data)
we are forcing Scala to infer the type of y in the second way shown above: the right-hand side is read as a type ascription.
To infer the type of y, the Scala compiler proceeds as explained below:
the compiler assumes that data is a value defined somewhere before this line of code, ascribed the type Int => println(data). It then checks the validity of this type. Int is a valid type, but println(data) is not, because a valid type name cannot contain parentheses (the '(' character), and that is why we get the error message above.
If we wrap the right-hand side of the assignment in a block (i.e., between curly
braces), or put the parameter declaration within parentheses, there is no problem inferring the type of y.
Alternatively, using the first way, we can make it compile without curly braces:
val y : Any => Unit = data =>println(data)
scala> val y : Any => Unit = data =>println(data)
y: Any => Unit = $$Lambda$1059/1854873748@799c8758
Hope this explains the why part of the question.
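Pulling these observations together, here is a minimal sketch (Scala 2; the object and value names are just for illustration) showing the forms that do compile:

```scala
object AscriptionDemo extends App {
  // `expr: T` is a type ascription, so at statement level
  // `data: Any => println(data)` would be read as ascribing `data`
  // the (invalid) type `Any => println(data)`. All of these avoid that:
  val a = 5: Int                               // ascription form of val a: Int = 5
  val y1 = (data: Any) => println(data)        // parenthesized parameter: a lambda
  val y2 = { data: Any => println(data) }      // block form: abbreviation allowed
  val y3: Any => Unit = data => println(data)  // explicit type on the val

  println(a)
  y1("hi"); y2("hi"); y3("hi")
}
```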

Related

Why is pattern match needed to preserve existential type information? [duplicate]

This question already has an answer here:
Scala lists with existential types: `map{ case t => ... }` works, `map{ t => ... }` doesn't?
(1 answer)
Closed 2 years ago.
A blog describing how to use type classes to avoid F-bounded polymorphism (see Returning the "Current" Type in Scala) mentions near its end:
The problem here is that the connection between the types of p._1 and p._2 is lost in this context, so the compiler no longer knows that they line up correctly. The way to fix this, and in general the way to prevent the loss of existentials, is to use a pattern match.
I have verified the code mentioned does not work:
pets.map(p => esquire(p._1)(p._2))
while the other pattern matching variant does:
pets.map { case (a, pa) => esquire(a)(pa) }
There is also another variant not mentioned which also works:
pets.map{case p => esquire(p._1)(p._2)}
What is the magic here? Why does using case p => instead of p => preserve the existential type information?
I have tested this with Scala 2.12 and 2.13.
Scastie link to play with the code: https://scastie.scala-lang.org/480It2tTS2yNxCi1JmHx8w
The question needs to be modified for Scala 3 (Dotty), as existential types no longer exist there (pun intended). It seems it works even without the case there, as demonstrated by another scastie: https://scastie.scala-lang.org/qDfIgkooQe6VTYOssZLYBg (you can check that you still need the case p even with a helper class in Scala 2.12 / 2.13 - you will get a compile error without it).
Modified code with a helper case class:
case class PetStored[A](a: A)(implicit val pet: Pet[A])
val pets = List(PetStored(bob), PetStored(thor))
println(pets.map{case p => esquire(p.a)(p.pet)})
Based on https://stackoverflow.com/a/49712407/5205022, consider the snippet
pets.map { p =>
val x = p._1
val y = p._2
esquire(x)(y)
}
after typechecking -Xprint:typer becomes
Hello.this.pets.map[Any](((p: (A, example.Hello.Pet[A]) forSome { type A }) => {
val x: Any = p._1;
val y: example.Hello.Pet[_] = p._2;
Hello.this.esquire[Any](x)(<y: error>)
}))
whilst the snippet with pattern matching
pets.map { case (a, pa) =>
val x = a
val y = pa
esquire(x)(y)
}
after typechecking becomes
Hello.this.pets.map[Any](((x0$1: (A, example.Hello.Pet[A]) forSome { type A }) => x0$1 match {
case (_1: A, _2: example.Hello.Pet[A]): (A, example.Hello.Pet[A])((a @ _), (pa @ _)) => {
val x: A = a;
val y: example.Hello.Pet[A] = pa;
Hello.this.esquire[A](x)(y)
}
}));
We note that in the latter pattern matching case, the existential type parameter A is re-introduced
val x: A = a;
val y: example.Hello.Pet[A] = pa;
and so the relationship between x and y is re-established, whilst in the case without pattern matching the relationship is lost
val x: Any = p._1;
val y: example.Hello.Pet[_] = p._2;
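The effect can be reproduced without the blog's Pet class. A minimal sketch (Scala 2 only, since forSome was removed in Scala 3; Show here is a hypothetical stand-in typeclass):

```scala
import scala.language.existentials

trait Show[A] { def show(a: A): String }

object ExistentialDemo extends App {
  val intShow: Show[Int]    = _.toString
  val strShow: Show[String] = s => "\"" + s + "\""

  // The element type of each pair is hidden behind an existential.
  val items: List[(A, Show[A]) forSome { type A }] =
    List((42, intShow), ("hi", strShow))

  // Pattern matching re-binds the hidden type variable, so the
  // connection between the two components of the pair is preserved:
  println(items.map { case (a, sa) => sa.show(a) })

  // items.map(p => p._2.show(p._1))  // does not compile: connection lost
}
```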

Scala function declaration: placeholder vs parameter? [duplicate]

I've taken a look at the list of surveys taken on scala-lang.org and noticed a curious question: "Can you name all the uses of “_”?". Can you? If yes, please do so here. Explanatory examples are appreciated.
The ones I can think of are
Existential types
def foo(l: List[Option[_]]) = ...
Higher kinded type parameters
case class A[K[_],T](a: K[T])
Ignored variables
val _ = 5
Ignored parameters
List(1, 2, 3) foreach { _ => println("Hi") }
Ignored names of self types
trait MySeq { _: Seq[_] => }
Wildcard patterns
Some(5) match { case Some(_) => println("Yes") }
Wildcard patterns in interpolations
"abc" match { case s"a$_c" => }
Sequence wildcard in patterns
C(1, 2, 3) match { case C(vs @ _*) => vs.foreach(f(_)) }
Wildcard imports
import java.util._
Hiding imports
import java.util.{ArrayList => _, _}
Joining letters to operators
def bang_!(x: Int) = 5
Assignment operators
def foo_=(x: Int) { ... }
Placeholder syntax
List(1, 2, 3) map (_ + 2)
Method values
List(1, 2, 3) foreach println _
Converting call-by-name parameters to functions
def toFunction(callByName: => Int): () => Int = callByName _
Default initializer
var x: String = _ // unloved syntax may be eliminated
There may be others I have forgotten!
Example showing why foo(_) and foo _ are different:
This example comes from 0__:
trait PlaceholderExample {
def process[A](f: A => Unit)
val set: Set[_ => Unit]
set.foreach(process _) // Error
set.foreach(process(_)) // No Error
}
In the first case, process _ represents a method; Scala takes the polymorphic method and attempts to make it monomorphic by filling in the type parameter, but realizes that there is no type that can be filled in for A that will give the type (_ => Unit) => ? (Existential _ is not a type).
In the second case, process(_) is a lambda; when writing a lambda with no explicit argument type, Scala infers the type from the argument that foreach expects, and _ => Unit is a type (whereas just plain _ isn't), so it can be substituted and inferred.
This may well be the trickiest gotcha in Scala I have ever encountered.
Note that this example compiles in 2.13. Ignore it like it was assigned to underscore.
From (my entry) in the FAQ, which I certainly do not guarantee to be complete (I added two entries just two days ago):
import scala._ // Wild card -- all of Scala is imported
import scala.{ Predef => _, _ } // Exception, everything except Predef
def f[M[_]] // Higher kinded type parameter
def f(m: M[_]) // Existential type
_ + _ // Anonymous function placeholder parameter
m _ // Eta expansion of method into method value
m(_) // Partial function application
_ => 5 // Discarded parameter
case _ => // Wild card pattern -- matches anything
val (a, _) = (1, 2) // same thing
for (_ <- 1 to 10) // same thing
f(xs: _*) // Sequence xs is passed as multiple parameters to f(ys: T*)
case Seq(xs @ _*) // Identifier xs is bound to the whole matched sequence
var i: Int = _ // Initialization to the default value
def abc_<>! // An underscore must separate alphanumerics from symbols on identifiers
t._2 // Part of a method name, such as tuple getters
1_000_000 // Numeric literal separator (Scala 2.13+)
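Several entries from the list above, gathered into one runnable sketch (Scala 2.13, to allow the numeric literal separator):

```scala
object UnderscoreFaqDemo extends App {
  val (a, _) = (1, 2)             // ignore part of a tuple
  for (_ <- 1 to 3) print("*")    // ignore the loop variable
  println()

  def sum(xs: Int*) = xs.sum
  val ys = Seq(1, 2, 3)
  println(sum(ys: _*))            // sequence passed as varargs: 6

  Seq(4, 5) match {
    case Seq(zs @ _*) => println(zs.length)  // bind the whole sequence: 2
  }

  println((1, "a")._2)            // tuple getter: a
  println(1_000_000)              // numeric literal separator (2.13+)
}
```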
This is also part of this question.
An excellent explanation of the uses of the underscore is Scala _ [underscore] magic.
Examples:
def matchTest(x: Int): String = x match {
case 1 => "one"
case 2 => "two"
case _ => "anything other than one and two"
}
expr match {
case List(1, _, _) => "a list with three elements whose first element is 1"
case List(_*) => "a list with zero or more elements"
case m: Map[_, _] => "a map with any key type and any value type"
case _ =>
}
List(1,2,3,4,5).foreach(print(_))
// Doing the same without underscore:
List(1,2,3,4,5).foreach( a => print(a))
In Scala, _ acts similarly to * in Java when importing packages.
// Imports all the classes in the package matching
import scala.util.matching._
// Imports all the members of the object Fun (static import in Java).
import com.test.Fun._
// Imports all the members of the object Fun but renames Foo to Bar
import com.test.Fun.{ Foo => Bar , _ }
// Imports all the members except Foo. To exclude a member rename it to _
import com.test.Fun.{ Foo => _ , _ }
In Scala, a getter and setter are implicitly defined for every non-private var in a class or object. The getter's name is the same as the variable name, and _= is appended for the setter's name.
class Test {
private var a = 0
def age = a
def age_=(n:Int) = {
require(n>0)
a = n
}
}
Usage:
val t = new Test
t.age = 5
println(t.age)
If you try to assign a method to a new variable, the method will be invoked and its result assigned to the variable instead. This confusion occurs because parentheses are optional for method invocation. We should add _ after the method name to assign it, as a function value, to another variable.
class Test {
def fun = {
// Some code
}
val funLike = fun _
}
There is one usage I can see everyone here seems to have forgotten to list...
Rather than doing this:
List("foo", "bar", "baz").map(n => n.toUpperCase())
You can simply do this:
List("foo", "bar", "baz").map(_.toUpperCase())
Here are some more examples where _ is used:
val nums = List(1,2,3,4,5,6,7,8,9,10)
nums filter (_ % 2 == 0)
nums reduce (_ + _)
nums.exists(_ > 5)
nums.takeWhile(_ < 8)
In all the examples above, a single underscore represents an element of the list (for reduce, the first underscore represents the accumulator)
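For reference, each placeholder desugars to an explicit parameter, which the following sketch checks directly (the object name is illustrative):

```scala
object PlaceholderDemo extends App {
  val nums = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

  // each _ stands for the next parameter of the anonymous function
  assert(nums.filter(_ % 2 == 0) == nums.filter(n => n % 2 == 0))
  assert(nums.reduce(_ + _) == nums.reduce((acc, n) => acc + n))
  assert(nums.exists(_ > 5) == nums.exists(n => n > 5))
  assert(nums.takeWhile(_ < 8) == nums.takeWhile(n => n < 8))

  println(nums.reduce(_ + _)) // 55
}
```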
Besides the usages that JAiro mentioned, I like this one:
def getConnectionProps = {
( Config.getHost, Config.getPort, Config.getSommElse, Config.getSommElsePartTwo )
}
If someone needs all connection properties, he can do:
val ( host, port, sommEsle, someElsePartTwo ) = getConnectionProps
If you need just a host and a port, you can do:
val ( host, port, _, _ ) = getConnectionProps
There is a specific example where "_" is used:
type StringMatcher = String => (String => Boolean)
def starts: StringMatcher = (prefix:String) => _ startsWith prefix
which is equivalent to:
def starts: StringMatcher = (prefix:String) => (s)=>s startsWith prefix
Applying “_” in scenarios like this automatically expands it to “(x$n) => x$n”.
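A runnable version of the StringMatcher example (Scala 2; the object name is illustrative):

```scala
object StringMatcherDemo extends App {
  type StringMatcher = String => (String => Boolean)

  def starts: StringMatcher = (prefix: String) => _ startsWith prefix
  // the underscore above expands to: (s: String) => s startsWith prefix

  val m = starts("sca")
  println(m("scala")) // true
  println(m("java"))  // false
}
```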

Scala function composition: brackets and types

(1) Having defined two functions in the Scala REPL:
scala> def f(s: String) = "f(" + s + ")"
f: (s: String)String
scala> def g(s: String) = "g(" + s + ")"
g: (s: String)String
(2) Composing them without brackets works as expected:
scala> f _ compose g _
res18: String => String = <function1>
(3) Composing them with brackets doesn't:
scala> f(_).compose(g(_))
<console>:14: error: missing parameter type for expanded function ((x$1) => f(x$1).compose(((x$2) => g(x$2))))
f(_).compose(g(_))
^
<console>:14: error: missing parameter type for expanded function ((x$2) => g(x$2))
f(_).compose(g(_))
^
<console>:14: error: type mismatch;
found : String
required: Int
f(_).compose(g(_))
^
Question 1: Can somebody explain why?
Question 2: Why the type mismatch? Why is Scala expecting an Int at all?
(4) Surrounding f(_) with brackets seems to help a little bit, by making the first two errors go away:
scala> (f(_)).compose(g(_))
<console>:14: error: missing parameter type for expanded function ((x$2) => g(x$2))
(f(_)).compose(g(_))
^
Question 3: Why do these brackets help?
Question 4: Why does Scala need the parameter types, even though they are clearly defined in f and g respectively?
(5) Finally, adding the parameter type makes it work:
scala> (f(_)).compose(g(_:String))
res22: String => String = <function1>
Could you please explain what's going on, and provide alternative syntaxes to achieve the composition?
Thanks.
You can see the (unexpected) expansion using magic show comment:
scala> f(_).compose(g(_)) // show
[snip]
val res0 = ((x$1) => f(x$1).compose(((x$2) => g(x$2))))
Function literals need their parameters constrained, as you showed. f _ is eta expansion, which is different from f(_), which is sugar for x => f(x).
Since the unintended application f(x$1) returns a String, which when viewed as a function is an Int => Char (indexing), you get the added type mismatch.
Underscore is covered by many SO questions, including one canonical.
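Putting the working alternatives side by side (a sketch; f and g are the functions from the question, and the object and value names are illustrative):

```scala
object ComposeDemo extends App {
  def f(s: String) = "f(" + s + ")"
  def g(s: String) = "g(" + s + ")"

  val h1: String => String = (f _).compose(g _)  // eta-expansion
  val h2 = (f(_: String)).compose(g(_: String))  // typed placeholders
  val h3: String => String = s => f(g(s))        // plain lambda
  val h4: String => String = (f _).andThen(g _)  // g after f, for contrast

  println(h1("x")) // f(g(x))
  println(h4("x")) // g(f(x))
  assert(h1("x") == h2("x") && h2("x") == h3("x"))
}
```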

Convert value depending on a type in SparkSQL via case matching of type

Is it possible to match a parametric type in Scala? Let's say I have a function that receives two parameters: a value and a type. I would like to use pattern matching to do a type conversion.
Something like this:
datatype match {
case IntegerType => return value.toInt
case FloatType => return value.toFloat
case StringType => return value
case DecimalType(_,_) => return BigDecimal(value) // this is not working
case _ => return strrepr
}
Here DecimalType accepts two parameters specifying the required precision and scale. It can be for example:
org.apache.spark.sql.types.DecimalType = DecimalType(10,2)
I have tried several options and nothing seems to be working:
For case DecimalType => return BigDecimal(value) I get:
error: pattern type is incompatible with expected type;
found : org.apache.spark.sql.types.DecimalType.type
required: org.apache.spark.sql.types.DataType
Note: if you intended to match against the class, try `case DecimalType(_,_)`
For case DecimalType(_,_) => return BigDecimal(value) I get:
error: result type Boolean of unapply defined in method unapply in object DecimalType does not conform to Option[_] or Boolean
For case DecimalType[_,_] => return BigDecimal(value) I get:
error: org.apache.spark.sql.types.DecimalType does not take type parameters
Turns out that DecimalType only pattern matches with zero arguments:
case DecimalType() => ...
If you need the precision and scale, you must define the type of the case and manually extract them:
datatype match {
case dt: DecimalType =>
val precision = dt.precision
val scale = dt.scale
...
The problem is the use of return in your code. You said you use this code snippet in a function somewhere; what is the return type of that function? You intend it to be sometimes Integer, sometimes String, sometimes BigDecimal, but with return the compiler uses the type of the returned object to determine the function's result type. In general, you should strongly avoid using return in Scala: the last evaluated expression in the function body is the result. The only case for return is forcing a value to be returned somewhere else in the function body, and even then it is usually better to bind the result to a variable and evaluate that variable on the last line. In short: never use return!
Without return it works
scala> val datatype = DecimalType(10, 2)
datatype: org.apache.spark.sql.types.DecimalType = DecimalType(10,2)
scala> val value = BigDecimal(10)
value: scala.math.BigDecimal = 10
scala> datatype match {case DecimalType(_,_) => value}
res150: scala.math.BigDecimal = 10
Problems with return:
scala> def test = {datatype match {case DecimalType(_,_) => return value}}
<console>:138: error: method test has return statement; needs result type
def test = {datatype match {case DecimalType(_,_) => return value}}
scala> def test:BigDecimal = {datatype match {case DecimalType(_,_) => return value}}
test: BigDecimal
scala> def test:DataType = {datatype match {case DecimalType(_,_) => return value}}
<console>:138: error: type mismatch;
found : scala.math.BigDecimal
required: org.apache.spark.sql.types.DataType
def test:DataType = {datatype match {case DecimalType(_,_) => return value}}
scala> def test3 = {datatype match {case DecimalType(_,_) => value}}
test3: scala.math.BigDecimal
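The same pitfall can be reproduced without Spark types. A plain-Scala sketch (method and object names are illustrative):

```scala
object ReturnDemo extends App {
  // Idiomatic: the last evaluated expression is the result, no `return`,
  // so the String result type is inferred from the match branches.
  def classify(x: Any): String = x match {
    case _: Int        => "integer"
    case _: BigDecimal => "decimal"
    case _             => "other"
  }

  // With `return`, the result type must be declared explicitly;
  // `def bad(x: Any) = { return "?" }` does not compile.

  println(classify(1))             // integer
  println(classify(BigDecimal(1))) // decimal
  println(classify("s"))           // other
}
```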
Could be something specific to the code I'm working on, or perhaps it varies depending on the SQL vendor, but I found that DecimalType doesn't have a single underlying type. Sometimes I get a spark Decimal and other times I get a java BigDecimal. If I try to getAs[Decimal] when it's a BigDecimal, I get an exception. If I try to getAs[BigDecimal] when it's a Decimal, I get an exception.
To handle this I had to do some more sniffing after matching DecimalType:
case d: DecimalType =>
// Oddly a column that matches to DecimalType can be of several different
// class types and trying to getAs[Decimal] when it's a BigDecimal and/or
// trying to getAs[BigDecimal] when the column is a Decimal results in an
// exception, so make the right decision based on the instance class.
val decimal = row.get(index) match {
case bigDecimal: java.math.BigDecimal => Decimal(bigDecimal)
case decimal: Decimal => decimal
case _ => throw(
new RuntimeException("Encountered unexpected decimal type")
)
}
From there you can do whatever you need to do knowing that decimal is of type Decimal.

How is a match word omitted in Scala?

In Scala, you can do
list.filter { item =>
item match {
case Some(foo) => foo.bar > 0
}
}
But you can also do it the shorter way by omitting match:
list.filter {
case Some(foo) => foo.bar > 0
}
How is this supported in Scala? Is this new in 2.9? I have been looking for it, and I can't figure out what makes this possible. Is it just part of the Scala compiler?
Edit: parts of this answer are wrong; please refer to huynhjl's answer.
If you omit the match, you signal the compiler that you are defining a partial function. A partial function is a function that is not defined for every input value. For instance, your filter function is only defined for values of type Some[A] (for your custom type A).
PartialFunctions throw a MatchError when you try to apply them where they are not defined. Therefore, when you pass a PartialFunction where a regular Function is expected, you should make sure your partial function will never be called with an unhandled argument. Such a mechanism is very useful e.g. for unpacking tuples in a collection:
val tupleSeq: Seq[(Int, Int)] = // ...
val sums = tupleSeq.map { case (i1, i2) => i1 + i2 }
APIs which ask for a partial function, like the collect filter-like operation on collections, usually call isDefinedAt before applying the partial function. There, it is safe (and often wanted) to have a partial function that is not defined for every input value.
So you see that although the syntax is close to that of a match, it is actually quite a different thing we're dealing with.
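A small runnable illustration of this difference (the names are illustrative):

```scala
object PartialDemo extends App {
  // Omitting `match` inside braces defines a PartialFunction when one
  // is expected:
  val pf: PartialFunction[Option[Int], Int] = { case Some(n) => n }

  println(pf.isDefinedAt(None))    // false
  println(pf.isDefinedAt(Some(1))) // true

  // `collect` checks isDefinedAt before applying, so undefined inputs
  // are skipped rather than throwing a MatchError:
  println(List(Some(1), None, Some(3)).collect(pf)) // List(1, 3)
}
```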
The language specification addresses that in section 8.5. The relevant portions:
An anonymous function can be defined by a sequence of cases
{ case p1 => b1 ... case pn => bn }
If the expected type is scala.Functionk[S1, ..., Sk, R], the expression is taken to
be equivalent to the anonymous function:
(x1 : S1, ..., xk : Sk) => (x1, ..., xk) match {
case p1 => b1 ... case pn => bn
}
If the expected type is scala.PartialFunction[S, R], the expression is taken to
be equivalent to the following instance creation expression:
new scala.PartialFunction[S, R] {
def apply(x: S): R = x match {
case p1 => b1 ... case pn => bn
}
def isDefinedAt(x: S): Boolean = x match {
case p1 => true ... case pn => true
case _ => false
}
}
So typing the expression as PartialFunction or a Function influences how the expression is compiled.
Also trait PartialFunction [-A, +B] extends (A) ⇒ B so a partial function PartialFunction[A,B] is also a Function[A,B].
-- Revised post --
Hmm, I'm not sure I see a difference, Scala 2.9.1.RC3,
val f: PartialFunction[Int, Int] = { case 2 => 3 }
f.isDefinedAt(1) // evaluates to false
f.isDefinedAt(2) // evaluates to true
f(1) // match error
val g: PartialFunction[Int, Int] = x => x match { case 2 => 3 }
g.isDefinedAt(1) // evaluates to false
g.isDefinedAt(2) // evaluates to true
g(1) // match error
It seems f and g behave exactly the same as PartialFunctions.
Here's another example demonstrating the equivalence:
Seq(1, "a").collect(x => x match { case s: String => s }) // evaluates to Seq(a)
Even more interesting:
// this compiles
val g: PartialFunction[Int, Int] = (x: Int) => {x match { case 2 => 3 }}
// this fails; found Function[Int, Int], required PartialFunction[Int, Int]
val g: PartialFunction[Int, Int] = (x: Int) => {(); x match { case 2 => 3 }}
So there's some special casing at the compiler level to convert between x => x match {...} and just {...}.
Update. After reading the language spec, this seems like a bug to me. I filed SI-4940 in the bug tracker.