Scala: pattern match problem with fully qualified classnames in parameterization

I have a small problem pattern matching on an object in Scala when its type is parameterized with a fully qualified class name. This is on Scala 2.9.0.1. Does anyone know what's wrong with this code?
scala> "foo" match {
| case y : Seq[Integer] =>
| case y : Seq[java.lang.Integer] =>
<console>:3: error: ']' expected but '.' found.
case y : Seq[java.lang.Integer] =>
Why does the first version work, but the latter fail? The problem only seems to occur when a fully qualified classname is used for the parameterization.

From the Scala Language Specification, section 8.1 Patterns, the identifier after the : needs to be what is referred to as a Type Pattern, defined in Section 8.2:
Type patterns consist of types, type variables, and wildcards. A type
pattern T is of one of the following forms:
...
A parameterized type pattern T [a(1), . . . , a(n)], where the a(i) are
type variable patterns or wildcards _. This type pattern matches all
values which match T for some arbitrary instantiation of the type
variables and wildcards. The bounds or alias type of these type
variable are determined as described in (§8.3).
...
A type variable pattern is a simple identifier which starts with a
lower case letter. However, the predefined primitive type aliases
unit, boolean, byte, short, char, int, long, float, and double are not
classified as type variable patterns.
So, syntactically, you can't use a fully qualified class as a type variable pattern IN THIS POSITION. You can, however, use a type alias:
type JavaInt = java.lang.Integer
List(new java.lang.Integer(5)) match {
  case y: Seq[JavaInt] => 6
  case _ => 7
}
will return 6 as expected. The problem is that, as Alan Burlison points out, the following also returns 6:
List("foobar") match {
  case y: Seq[JavaInt] => 6
  case _ => 7
}
because the type is being erased. You can see this by running the REPL, or scalac with the -unchecked option.

In fact, your first example doesn't work either. If you run the REPL with -unchecked, you'll see the following warning:
warning: non variable type-argument Integer in type pattern Seq[Integer] is unchecked since it is eliminated by erasure
So you can't actually do what you are trying to do - at run-time there's no difference between a List[Integer] and a List[AnythingElse], so you can't pattern-match on it. You might be able to do this with a Manifest, see http://ofps.oreilly.com/titles/9780596155957/ScalasTypeSystem.html#Manifests and http://www.scala-blogs.org/2008/10/manifests-reified-types.html
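Manifest was superseded by ClassTag and TypeTag in Scala 2.10. Independently of either, one way around erasure is to test the elements of the collection rather than its type argument. The following is only a minimal sketch under that assumption: the object name ElementCheck and the helper isSeqOf are hypothetical, and an empty sequence trivially passes the check.

```scala
import scala.reflect.ClassTag

object ElementCheck {
  // Hypothetical helper: true when `value` is a Seq whose elements are all
  // instances of A's runtime class. Use boxed types such as java.lang.Integer
  // here, because the runtime class of a primitive such as Int never matches
  // a boxed element.
  def isSeqOf[A](value: Any)(implicit tag: ClassTag[A]): Boolean = value match {
    case xs: Seq[_] => xs.forall(x => tag.runtimeClass.isInstance(x))
    case _          => false
  }
}
```

With this helper, ElementCheck.isSeqOf[java.lang.Integer](List(Integer.valueOf(5))) is true, while the same call on List("foobar") or on "foo" is false. The cost is a full traversal of the sequence.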

Related

Pattern matching with type parameter bounded to final class

Here is an example
def maybeeq[A <: String](x: A):A = x match {
  case z:A => x
}
It produced the following error message during compilation
Error:(27, 12) scrutinee is incompatible with pattern type;
found : A
required: String
case z:A => x
I can put any final class into A's bound to reproduce the error.
Why does this compile for non-final classes but fail for final ones? Why doesn't type erasure simply replace A with String?
Edited:
Note: such a bound still allows me to pass a String-typed value as the 'x' parameter. So 'x' can be just a String and doesn't have to be a subtype of String, so I'm not asking the compiler to compile a method with an incorrect signature. In real-world code I would just use String instead of the A parameter, but from an experimental perspective I'm interested in why such an extra restriction on top of the existing restriction (based on the nature of final classes) is needed.
TBH this is a question about compiler design which can only be answered by those who implemented such a check.
There is a test in the compiler test suite that requires this error to be shown. It has something to do with type information being discarded to a point where a concrete type cannot be assigned to the variable, but the reasons for that cannot be understood from the git blame of that test.
I'll point out, however, that there are still a number of ways to satisfy A <: String without A being known at compile time to be a String. For one, Null and Nothing satisfy that, being at the bottom of the Scala type hierarchy. Those two are explicitly disallowed in type matching. The other example is a bit more involved:
val UhOh: { type T <: String } = new { type T = String }
implicitly[UhOh.T <:< String] // satisfies type bound
implicitly[UhOh.T =:= String] // won't compile - compiler cannot prove the type equality
This is similar to some newtyping patterns, e.g. shapeless.tag.
Out of all these possibilities, the only one that can do anything reasonable is A =:= String, because String is the only type that can actually be checked at runtime. Oh, except when you use a generic type in the match: that does not work at all (not without a ClassTag in scope, at least) because such types are eliminated by erasure.
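The remark about ClassTag can be illustrated with a small sketch. The names ClassTagMatch and narrow are hypothetical. Note that the check only covers the erased class, so type arguments such as the Int in List[Int] are still not verified.

```scala
import scala.reflect.ClassTag

object ClassTagMatch {
  // With a ClassTag[A] in implicit scope, the type pattern `a: A` becomes a
  // runtime check against A's erased class instead of an unchecked match.
  def narrow[A](x: Any)(implicit tag: ClassTag[A]): Option[A] = x match {
    case a: A => Some(a)
    case _    => None
  }
}
```

Here ClassTagMatch.narrow[String]("hi") yields Some("hi"), and ClassTagMatch.narrow[String](42) yields None.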
A final class cannot be extended. Since String is final, no subtype can extend it, so the bound in def maybeeq[A <: String](x: A) does not make sense, and the compiler points out this issue.

Can I use @switch and Enumerations?

Can I use switch-case for pattern matching on enumerations?
I tried
import scala.annotation.switch
object Foo extends Enumeration {
  val First = Value
  val Second = Value
  val Third = Value
}
object Main {
  def foo(x: Foo.Value) = (x: @switch) match {
    case Foo.First => 1
    case Foo.Second => 2
    case Foo.Third => 3
  }
}
but get the following warning (Scala 2.11.4):
warning: could not emit switch for @switch annotated match
def foo(x: Foo.Value) = (x: @switch) match {
I then tried defining the enumeration in Java instead, since Java's enums are different than Scala's Enumeration. Still no luck.
Is @switch pattern matching only available on primitive types?
To complete Regis's answer: in Scala in Depth, Joshua Suereth states that the following conditions must be true for Scala to apply the tableswitch optimization:
The matched value must be a known integer.
The matched expression must be “simple.” It can’t contain any type checks, if statements, or extractors.
The expression must also have its value available at compile time.
There should be more than two case statements.
The Foo object does not meet any of the above criteria, so it is not eligible for the tableswitch optimisation.
The point of the switch annotation is to make sure that your match is compiled into a tableswitch or lookupswitch JVM instruction. Those instructions only work on ints, which means that the switch annotation will only have any effect on types that can safely fit in an Int, meaning Int itself as well as Char, Byte, Short and Boolean. In addition, the values you match against have to be literal values (as opposed to values stored in a val). Given that an Enumeration value is a reference value, it is not compatible with the switch annotation. The restriction to literal values actually means that there is probably no way to use this annotation for Short and Byte, for purely syntactic reasons, as there is no support for literal shorts and bytes in Scala: you have to use a literal Int along with a type ascription, as in 123: Byte, but this is not accepted as a pattern.
So that leaves only Int, Char and Boolean as valid types (the usefulness of using @switch for a Boolean value is dubious, to say the least).
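For comparison, here is a sketch of matches that should be eligible for @switch because the scrutinee is an Int and every case is an integer literal. The object name SwitchSketch is hypothetical. Matching on the id of an Enumeration value is only an assumed workaround: it hard-codes the ids that Enumeration assigns in declaration order starting from 0, so it breaks silently if the declarations are reordered.

```scala
import scala.annotation.switch

object SwitchSketch {
  // Int scrutinee and literal cases only.
  def name(i: Int): String = (i: @switch) match {
    case 1 => "one"
    case 2 => "two"
    case 3 => "three"
    case _ => "other"
  }

  object Foo extends Enumeration {
    val First = Value
    val Second = Value
    val Third = Value
  }

  // Assumed workaround: switch on the Int id rather than on the Value itself.
  def foo(x: Foo.Value): Int = (x.id: @switch) match {
    case 0 => 1
    case 1 => 2
    case 2 => 3
    case _ => -1
  }
}
```

For example, SwitchSketch.name(2) returns "two" and SwitchSketch.foo(SwitchSketch.Foo.Second) returns 2.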

Pattern matching on List[T] and Set[T] in Scala vs. Haskell: effects of type erasure

Would the Haskell equivalent of the code below produce correct answers?
Can this Scala code be fixed to produce correct answers? If yes, how?
object TypeErasurePatternMatchQuestion extends App {
  val li=List(1,2,3)
  val ls=List("1","2","3")
  val si=Set(1,2,3)
  val ss=Set("1","2","3")
  def whatIsIt(o:Any)=o match{
    case o:List[Int] => "List[Int]"
    case o:List[String] => "List[String]"
    case o:Set[Int] => "Set[Int]"
    case o:Set[String] => "Set[String]"
  }
  println(whatIsIt(li))
  println(whatIsIt(ls))
  println(whatIsIt(si))
  println(whatIsIt(ss))
}
prints:
List[Int]
List[Int]
Set[Int]
Set[Int]
but I would expect it to print:
List[Int]
List[String]
Set[Int]
Set[String]
You must understand that by declaring o:Any you erase all the specific information about the type; from then on, Any is all that the compiler knows about the value o. That's why from that point on you can only rely on the runtime information about the type.
The case-expressions like case o:List[Int] are resolved using the JVM's instanceof runtime mechanism. However, the buggy behaviour you experience is caused by this mechanism only taking the first-rank type into account (the List in List[Int]) and ignoring the parameters (the Int in List[Int]). That's why it treats List[Int] as equal to List[String]. This issue is known as "Generics Erasure".
Haskell, on the other hand, performs a complete type erasure, which is well explained in the answer by Ben.
So the problem in both languages is the same: we need to provide runtime information about the type and its parameters.
In Scala you can achieve that using the "reflection" library, which resolves that information implicitly:
import reflect.runtime.{universe => ru}
def whatIsIt[T](o : T)(implicit t : ru.TypeTag[T]) =
  if( t.tpe <:< ru.typeOf[List[Int]] )
    "List[Int]"
  else if ( t.tpe <:< ru.typeOf[List[String]] )
    "List[String]"
  else if ( t.tpe <:< ru.typeOf[Set[Int]] )
    "Set[Int]"
  else if ( t.tpe <:< ru.typeOf[Set[String]] )
    "Set[String]"
  else sys.error("Unexpected type")
println(whatIsIt(List("1","2","3")))
println(whatIsIt(Set("1","2","3")))
Output:
List[String]
Set[String]
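If the signature has to stay whatIsIt(o: Any), so that no TypeTag can be passed in, one alternative is to inspect an element at runtime instead of the type argument. This is only a sketch with clear limits: the names ElementInspection and label are hypothetical, it looks at the first element only, and it cannot classify empty collections.

```scala
object ElementInspection {
  // Classify a collection by the runtime class of its first element, if any.
  private def label(kind: String, first: Option[Any]): String = first match {
    case Some(_: Int)    => s"$kind[Int]"
    case Some(_: String) => s"$kind[String]"
    case _               => s"$kind[?]"
  }

  def whatIsIt(o: Any): String = o match {
    case l: List[_] => label("List", l.headOption)
    case s: Set[_]  => label("Set", s.headOption)
    case _          => "unknown"
  }
}
```

With this version, ElementInspection.whatIsIt(List("1","2","3")) returns "List[String]" and ElementInspection.whatIsIt(Set(1,2,3)) returns "Set[Int]", while an empty list yields "List[?]".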
Haskell has a very different approach to polymorphism. Above all, it does not have subtype polymorphism (which is not a weakness), so type-switching pattern matches like the one in your example are simply irrelevant. However, it is possible to translate the Scala solution above into Haskell quite closely:
{-# LANGUAGE MultiWayIf, ScopedTypeVariables #-}
import Data.Dynamic
import Data.Set
whatIsIt :: Dynamic -> String
whatIsIt a =
  if | Just (_ :: [Int]) <- fromDynamic a -> "[Int]"
     | Just (_ :: [String]) <- fromDynamic a -> "[String]"
     | Just (_ :: Set Int) <- fromDynamic a -> "Set Int"
     | Just (_ :: Set String) <- fromDynamic a -> "Set String"
     | otherwise -> error "Unexpected type"
main = do
  putStrLn $ whatIsIt $ toDyn ([1, 2, 3] :: [Int])
  putStrLn $ whatIsIt $ toDyn (["1", "2", "3"] :: [String])
  putStrLn $ whatIsIt $ toDyn (Data.Set.fromList ["1", "2", "3"] :: Set String)
Output:
[Int]
[String]
Set String
However, I must stress that this is far from a typical scenario in Haskell programming. The language's type system is powerful enough to solve extremely intricate problems while maintaining all the type-level information (and safety). Dynamic is only used in very special cases in low-level libraries.
GHC does even more type erasure than the JVM; at runtime the types are completely gone (not just the type parameters).
Haskell's approach to types is to use them at compile time to guarantee that no ill-typed operation can ever be carried out, and since Haskell doesn't have OO-style subtyping and dynamic dispatch, there's no purpose at all to keeping the types around. So data is compiled to a memory structure that simply contains the right values, and functions are compiled with baked-in knowledge of the structure of the types on which they operate[1], and just blindly expect their arguments to have that structure. That's why you get fun things like segmentation faults if you mess with unsafeCoerce incorrectly, not just a runtime exception saying the value was not of the expected type; at runtime Haskell has no idea whether a value is of any given type.
So rather than Haskell giving "the right answer" to the equivalent program, Haskell disallows your program as unsafe! There is no Any type in Haskell to which you can cast whatever you want.
That's not 100% true; in both Haskell and Scala there are ways of keeping type information alive at runtime. Essentially it's done by creating ordinary data structures that represent types, and passing them around together with values that are of those types, so at runtime you can refer to the type representation object for information about the type of the other object. There are library and language facilities in both languages to let you use this mechanism at a higher (and more principled) level, so that it's easier to use safely. Because it requires the type tokens to be passed around, you have to "opt-in" to such features, and your callers have to be aware of it to pass you the required type tokens (whether the actual generation and passing of the token is done implicitly or explicitly).
Without using such features, Haskell provides no way to pattern match on a value that could be of type List Int or Set String to find out which one it is. Either you're using a monomorphic type, in which case it can only be one type and the others will be rejected, or you're using a polymorphic type, in which case you can only apply code to it that will do the same thing[2] regardless of which concrete type instantiates the polymorphic type.
[1] Except for polymorphic functions, which assume nothing about their polymorphic arguments, and so can basically do nothing with them except pass them to other polymorphic functions (with matching type class constraints, if any).
[2] Type class constrained polymorphic types are the only exception to this. Even then, if you've got a value of a type that's a member of some type class, all you can do with it is pass it to other functions that accept values of any type that is a member of that type class. And if those functions are general functions defined outside of the type class in question, they'll be under the same restriction. It's only the type class methods themselves that can actually "do something different" for different types in the class, and that's because they are the union of a whole bunch of monomorphic definitions that operate on one particular type in the class. You can't write code that gets to take a polymorphic value, inspect it to see what it was instantiated with, and then decide what to do.
Of course Haskell prints the right answer:
import Data.Set
import Data.Typeable
main = do
  let li=[1,2,3]
  let ls=["1","2","3"]
  let si=Data.Set.fromList[1,2,3]
  let ss=Data.Set.fromList["1","2","3"]
  print $ typeOf li
  print $ typeOf ls
  print $ typeOf si
  print $ typeOf ss
prints
[Integer]
[[Char]]
Set Integer
Set [Char]

Zero-arg pattern matches when one arg expected

Given this definition in Scala:
class Foo(val n: Int)
object Foo {
  def unapply(foo: Foo): Option[Int] = Some(foo.n)
}
This expression compiles and returns ok:
new Foo(1) match {
  case Foo() => "ok"
}
Why does this even compile? I would expect that an extractor with Option[T] implies matching patterns with exactly one argument only.
What does the pattern Foo() mean here? Is it equivalent to Foo(_)?
In other words, what is the language rule that enables the observed behavior?
Today (in 2.11 milestone) you get the error:
<console>:15: error: wrong number of patterns for object Foo offering Int: expected 1, found 0
case Foo() => "ok"
^
I encountered this when adding Regex.unapply(c: Char). At some point, the case you point out was accepted, then later rejected. I remember I liked the idea that if my extractor returns Some(thing), then the Boolean match case r() would work the same as case r(_).
What works is described in the scaladoc of unapply(Char):
http://www.scala-lang.org/files/archive/nightly/docs-master/library/#scala.util.matching.Regex
Section 8.1.8 of the Scala Language Specification discusses this type of pattern matching. According to the specification, a pattern like Foo() should only match if unapply returns a Boolean. If unapply returns Option[T] for some T that isn't a tuple, then the pattern must include exactly one parameter, e.g. Foo(_). Unless I'm really misunderstanding what is happening here, it looks like this is an edge case where the compiler violates the spec.
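The two extractor shapes described by the specification can be sketched side by side. The names ExtractorArity, IsPositive and HasN are hypothetical and only serve as illustration: a Boolean-returning unapply pairs with a zero-argument pattern, and an Option[Int]-returning unapply pairs with a one-argument pattern.

```scala
object ExtractorArity {
  class Foo(val n: Int)

  // Boolean extractor: matched with a zero-argument pattern, IsPositive().
  object IsPositive {
    def unapply(foo: Foo): Boolean = foo.n > 0
  }

  // Option[Int] extractor: matched with exactly one sub-pattern, HasN(_).
  object HasN {
    def unapply(foo: Foo): Option[Int] = Some(foo.n)
  }

  def describe(foo: Foo): String = foo match {
    case IsPositive() => "positive"
    case HasN(n)      => s"n = $n"
  }
}
```

ExtractorArity.describe(new ExtractorArity.Foo(1)) returns "positive", and for Foo(-2) the second case applies and returns "n = -2".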

Scala priority of method call on implicit object

Let's say I have the following Scala code:
case class Term(c:Char) {
  def unary_+ = Plus(this)
}
case class Plus(t:Term)
object Term {
  implicit def fromChar(c:Char) = Term(c)
}
Now I get this from the scala console:
scala> val p = +'a'
p: Int = 97
scala> val q:Plus = +'a'
<console>:16: error: type mismatch;
found : Int
required: Plus
val q:Plus = +'a'
^
Because '+' is already present on the Char type, the implicit conversion does not take place, I think. Is there a way to override the default behaviour and apply '+' on the converted Term before applying on the Char type?
(BTW, the example is artificial and I'm not looking for alternative designs. The example is just here to illustrate the problem)
No, there is no way to override the default + operator, not even with an implicit conversion. When it encounters an operator (actually a method, as operators are just plain methods) that is not defined on the receiving object, the compiler will look for an implicit conversion to an object that does provide this operator. But if the operator is already defined on the receiving object, it will never look for a conversion; the original operator will always be called.
You should thus define a separate operator whose name will not conflict with any preexisting operator.
UPDATE:
The precise rules that govern implicit conversions are defined in the Scala Language Specification:
Views are applied in three situations.
1. If an expression e is of type T, and T does not conform to the expression’s expected type pt. In this case an implicit v is searched which is applicable to e and whose result type conforms to pt. The search proceeds as in the case of implicit parameters, where the implicit scope is the one of T => pt. If such a view is found, the expression e is converted to v(e).
2. In a selection e.m with e of type T, if the selector m does not denote a member of T. In this case, a view v is searched which is applicable to e and whose result contains a member named m. The search proceeds as in the case of implicit parameters, where the implicit scope is the one of T. If such a view is found, the selection e.m is converted to v(e).m.
3. In a selection e.m(args) with e of type T, if the selector m denotes some member(s) of T, but none of these members is applicable to the arguments args. In this case a view v is searched which is applicable to e and whose result contains a method m which is applicable to args. The search proceeds as in the case of implicit parameters, where the implicit scope is the one of T. If such a view is found, the selection e.m is converted to v(e).m(args).
In other words, an implicit conversion occurs in 3 situations:
(1) when an expression is of type T but is used in a context where the unrelated type T' is expected, an implicit conversion from T to T' (if any such conversion is in scope) is applied.
(2) when trying to access an object's member that does not exist on said object, an implicit conversion from the object into another object that does have this member (if any such conversion is in scope) is applied.
(3) when trying to call a method of an object with a parameter list that does not match any of the corresponding overloads, the compiler applies an implicit conversion from the object into another object that does have a method of this name and with a compatible parameter list (if any such conversion is in scope).
Note for completeness that this actually applies to more than just methods (inner objects/vals with an apply method are eligible too). Note also that this is the case that Randall Schulz was talking about in his comment below.
So in your case, points (2) and (3) are relevant. Given that you want to define a method named unary_+, which already exists for type Char (the type of 'a'), case (2) won't kick in. And given that your version has the same parameter list as the built-in Char.unary_+ method (they are both parameterless), point (3) won't kick in either. So you definitely cannot define an implicit that will redefine unary_+.
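The advice to pick a non-conflicting name can be sketched as follows. This is an illustration under assumptions, not the asker's design: the object name PrefixOperatorSketch and the choice of unary_! and plus are hypothetical. They work only because Char has no member with either name, so situation (2) applies and the view is used. The conversion is also placed directly in lexical scope here so that it can be found.

```scala
import scala.language.implicitConversions

object PrefixOperatorSketch {
  case class Term(c: Char) {
    // Char defines unary_+, unary_- and unary_~ but not unary_!,
    // so `!'a'` can only typecheck through the implicit view below.
    def unary_! : Plus = Plus(this)
    // A plain method name avoids the limited set of prefix operators.
    def plus: Plus = Plus(this)
  }
  case class Plus(t: Term)

  implicit def fromChar(c: Char): Term = Term(c)

  val viaBang: Plus   = !'a'     // rewritten to fromChar('a').unary_!
  val viaMethod: Plus = 'a'.plus // rewritten to fromChar('a').plus
  val builtIn: Int    = +'a'     // Char.unary_+ still wins, no view applied
}
```

Both viaBang and viaMethod evaluate to Plus(Term('a')), while builtIn is 97, matching the REPL output in the question.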