I want to map over a Map that has the type List[~(A,Option[B])] but before I map over it I group it by A. Now that I can map over it, I have to match the Tuple of the Map:
val rawData: List[A ~ Option[B]]
rawData
.groupBy(_._1)
.map(case (first: A, second: Seq[A ~ Option[B]]) =>
C(first, second.map(_._2))
)
Now the Compiler warns me:
non-variable type argument anorm.~[A,Option[B]] in type pattern Seq[anorm.~[A,Option[B]]] is unchecked since it is eliminated by erasure
I found several solutions to make that matching possible, but I have the feeling, that it could be also possible to avoid the matching at all as I only want to go through that Map that has already a defined Type. How could this be possible?
In this case you actually don't have to worry about this. The error is because your case statement is too verbose. Change to the following:
rawData.groupBy(_._1).map(case (first, second) =>
C(first, second.map(_._2))
)
The types in the case statement restrict the type of the tuple (which is unnecessary). However, they restrict it in a way which is not verifiable at runtime (due to type erasure), that's why you get the error.
Related
This question already has answers here:
How do I get around type erasure on Scala? Or, why can't I get the type parameter of my collections?
(11 answers)
Closed 3 months ago.
I am new to Scala(2.13.8) and working on code to use pattern matching to handle a value in different ways, code is very simply like below
def getOption(o: Option[Any]): Unit = {
o match {
case l: Some[List[String]] => handleListData(l)
case _ => handleData(_)
}
}
getOption(Some(3))
getOption(Some(Seq("5555")))
The result is handleListData() been invoked for both input. Can someone help on what's wrong in my code?
As sarveshseri mentioned in the comments, the problem here is caused by type erasure. When you compile this code, scalac issues a warning:
[warn] /Users/tmoore/IdeaProjects/scala-scratch/src/main/scala/PatternMatch.scala:6:15: non-variable type argument List[String] in type pattern Some[List[String]] is unchecked since it is eliminated by erasure
[warn] case l: Some[List[String]] => handleListData(l)
[warn] ^
This is because the values of type parameters are not available at runtime due to erasure, so this case is equivalent to:
case l: Some[_] => handleListData(l.asInstanceOf[Some[List[String]]])
This may fail at runtime due to an automatically-inserted cast in handleListData, depending on how it actually uses its argument.
One thing you can do is take advantage of destructuring in the case pattern in order to do a runtime type check on the content of the Option:
case Some(l: List[_]) => handleListData(l)
This will work with a handleListData with a signature like this:
def handleListData(l: List[_]): Unit
Note that it unwraps the Option, which is most likely more useful than passing it along.
However, it does not check that the List contains strings. To do so would require inspecting each item in the list. The alternative is an unsafe cast, made with the assumption that the list contains strings. This opens up the possibility of runtime exceptions later if the list elements are cast to strings, and are in fact some other type.
This change also reveals a problem with the second case:
case _ => handleData(_)
This does not do what you probably think it does, and issues its own compiler warning:
warn] /Users/tmoore/IdeaProjects/scala-scratch/src/main/scala/PatternMatch.scala:7:28: a pure expression does nothing in statement position
[warn] case _ => handleData(_)
[warn] ^
What does this mean? It's telling us that this operation has no effect. It does not invoke the handleData method with o as you might think. This is because the _ character has special meaning in Scala, and that meaning depends on the context where it's used.
In the pattern match case _, it is a wildcard that means "match anything without binding the match to a variable". In the expression handleData(_) it is essentially shorthand for x => handleData(x). In other words, when this case is reached, it evaluates to a Function value that would invoke handleData when applied, and then discards that value without invoking it. The result is that any value of o that doesn't match the first case will have no effect, and handleData is never called.
This can be solved by using o in the call:
case _ => handleData(o)
or by assigning a name to the match:
case x => handleData(x)
Returning to the original problem: how can you call handleListData only when the argument contains a List[String]? Since the type parameter is erased at runtime, this requires some other kind of runtime type information to differentiate it. A common approach is to define a custom algebraic data type instead of using Option:
object PatternMatch {
sealed trait Data
case class StringListData(l: List[String]) extends Data
case class OtherData(o: Any) extends Data
def handle(o: Data): Unit = {
o match {
case StringListData(l) => handleListData(l)
case x => handleData(x)
}
}
def handleListData(l: List[String]): Unit = println(s"Handling string list data: $l")
def handleData(value: Any): Unit = println(s"Handling data: $value")
def main(args: Array[String]): Unit = {
PatternMatch.handle(OtherData(3))
PatternMatch.handle(StringListData(List("5555", "6666")))
PatternMatch.handle(OtherData(List(7777, 8888)))
PatternMatch.handle(OtherData(List("uh oh!")))
/*
* Output:
* Handling data: OtherData(3)
* Handling string list data: List(5555, 6666)
* Handling data: OtherData(List(7777, 8888))
* Handling data: OtherData(List(uh oh!))
*/
}
}
Note that it's still possible here to create an instance of OtherData that actually contains a List[String], in which case handleData is called instead of handleListData. You would need to be careful not to do this when creating the Data passed to handle. This is the best you can do if you really need to handle Any in the default case. You can also extend this pattern with other special cases by creating new subtypes of Data, including a case object to handle the "empty" case, if needed (similar to None for Option):
case object NoData extends Data
// ...
PatternMatch.handle(NoData) // prints: 'Handling data: NoData'
I wonder why a Set[A] is converted to a Vector[A] if I ask for a Seq[A] subclass? To illustrate this take the following example:
val A = Set("one", "two")
val B = Set("one", "two", "three")
def f(one: Seq[String], other : Seq[String]) = {
one.intersect(other) match {
case head :: tail => head
case _ => "unknown"
}
}
f(A.to, B.to)
This function will return "unknown" instead of one. The reason is that A.to will be casted to a Vector[String]. The cons operator (::) is not defined for Vectors but for Lists so the second case is applied and "unknown" is returned. To fix this problem I could use the +: operator which is defined for all Seqs or convert the Set to List (A.to[List]). So my (academic) question is:
Why does A.to returns a Vector. At least according to the scala docs the default implementation of Seq is LinearSeq and the default of this is List. What did I got wrong?
Because it can, you are depending on runtime class implementation details, instead of compile-time type information guarantees. The to or toSeq method is free to return anything that typechecks, it could even generate a random number and chose a concrete class in base of that number, so you may get a List something other times a Vector or whatever. It may even decide in base of the operating system. Of course, I am being pedantic here and hopefully, they do not do that, but my point is, we can't really explain, that is what the implementation does and it may change in the future.
Also, the "default implementation of Seq is a List", applies only in the constructor. And again, they may change that in any moment.
So, if you want a List ask for a List, not for a Seq.
I'm trying to define "immutable setter traits", and generic functions for those.
I have a working implementation, but i'm bit disturbed about the "unchecked" warnings from the pattern matching. I'm not really sure what i can do about it.
type Point = (Double, Double)
trait Sizable[A] {
this: A =>
def size: Point
/* immutable object value setter, returns a new of the same object*/
def size(point: Point): A with Sizable[A]
}
def partitionToSizable[T](elements: List[T]):
(List[T], List[T with Sizable[T]]) =
elements.foldLeft((List[T](), List[T with Sizable[T]]()))((p, c) =>
c match {
case a: T with Sizable[T] => (p._1, p._2 ++ List(a))
case a => (p._1 ++ List(a), p._2)
})
The code above demonstrates the problem.
I'm not even sure how big of an issue is that T being unchecked, since all the elements in the list will have a type of T, and the point of the pattern matching is not to determine if it's type is T since we already know that.
In theory Sizable will always have type of T because of the signature of it's enclosing function.
If it's there is no other solution i'd at least like to suppress the warning. #unchecked annotations does not seem to suppress the warning.
If i modify case a: T with Sizable[T] to case a: Sizable[_] it will not compile, since the result type will obviously not confirm to T.
ClassTags or TypeTags might solve the warning, but i suspect they are not necessary really. (also that might have a performance overhead and TypeTags don't work with Scala.js)
I think just case a: Sizable[T #unchecked] => a.size((15,20)) should work.
The main problem here is in partitionToSizable. It's taking a List[T] and then trying to partition it into a List[T] and a List[T with Sizable[T]]. You are taking a list of a single type and then saying that one type T is actually two different types T and T with Sizable[T]. This indicates an issue with your type design. I would recommend solving that first.
One solution might be to recognise that things that are sizable and things that are not sizable should be represented as two different types. Then you can use the following signature:
def partitionToSizable[T1, T2 <: Sizable[T2]](
elements: List[Either[T1, T2]]): (List[T1], List[T2])
Then you don't need to cast; you can pattern match or fold on the Either.
Define a list first:
val list = List(1,2,3)
Scala compiler gives warning (even if it can match):
list match {
case head :: tail => println(s"h:${head} ~ t: ${tail}")
}
Scala compiler won't give warning (even if it can't match):
list match {
case List(a,b) => println("!!!")
}
I can't understand the second one
The "match may not be exhaustive" warning is only given when pattern matching on a type that is a sealed class and you have cases for only a subset of the subclasses or objects. List is a sealed class with a subclass :: and a subobject Nil, something like:
sealed abstract class List[+T]
class ::[+T] extends List[+T]
object Nil extends List[Nothing]
If you have a match and don't have a case for :: and one for Nil, and also don't have a case that might match any List, Scala knows the match isn't exhaustive and will report it. A case _ would match anything and will prevent the warning. But List(a, b) also prevents the warning, because Scala doesn't know if it only matches some of the subclasses.
When you use List as an extractor, as in List(a, b), you are using the extractor List.unapplySeq to take apart the matched value. Scala doesn't try to make assumptions about how the extractor behaves, and therefore doesn't know that the match isn't exhaustive. Without knowing the implementation details of List.unapplySeq, there is no way to know it won't happily match everything and return the required two values.
Would the Haskell equivalent of the code below produce correct answers?
Can this Scala code be fixed to produce correct answers ? If yes, how ?
object TypeErasurePatternMatchQuestion extends App {
val li=List(1,2,3)
val ls=List("1","2","3")
val si=Set(1,2,3)
val ss=Set("1","2","3")
def whatIsIt(o:Any)=o match{
case o:List[Int] => "List[Int]"
case o:List[String] => "List[String]"
case o:Set[Int] => "Set[Int]"
case o:Set[String] => "Set[String]"
}
println(whatIsIt(li))
println(whatIsIt(ls))
println(whatIsIt(si))
println(whatIsIt(ss))
}
prints:
List[Int]
List[Int]
Set[Int]
Set[Int]
but I would expect it to print:
List[Int]
List[String]
Set[Int]
Set[String]
You must understand that by saying o:Any you erase all the specific information about the type and further on the type Any is all that the compiler knows about value o. That's why from that point on you can only rely on the runtime information about the type.
The case-expressions like case o:List[Int] are resolved using the JVM's special instanceof runtime mechanism. However the buggy behaviour you experience is caused by this mechanism only taking the first-rank type into account (the List in List[Int]) and ignoring the parameters (the Int in List[Int]). That's why it treats List[Int] as equal to List[String]. This issue is known as "Generics Erasure".
Haskell on the other hand performs a complete type erasure, which is well explained in the answer by Ben.
So the problem in both languages is the same: we need to provide a runtime information about the type and its parameters.
In Scala you can achieve that using the "reflection" library, which resolves that information implicitly:
import reflect.runtime.{universe => ru}
def whatIsIt[T](o : T)(implicit t : ru.TypeTag[T]) =
if( t.tpe <:< ru.typeOf[List[Int]] )
"List[Int]"
else if ( t.tpe <:< ru.typeOf[List[String]] )
"List[String]"
else if ( t.tpe <:< ru.typeOf[Set[Int]] )
"Set[Int]"
else if ( t.tpe <:< ru.typeOf[Set[String]] )
"Set[String]"
else sys.error("Unexpected type")
println(whatIsIt(List("1","2","3")))
println(whatIsIt(Set("1","2","3")))
Output:
List[String]
Set[String]
Haskell has a very different approach to polymorphism. Above all, it does not have subtype polymorphism (it's not a weakness though), that's why the type-switching pattern matches as in your example are simply irrelevant. However it is possible to translate the Scala solution from above into Haskell quite closely:
{-# LANGUAGE MultiWayIf, ScopedTypeVariables #-}
import Data.Dynamic
import Data.Set
whatIsIt :: Dynamic -> String
whatIsIt a =
if | Just (_ :: [Int]) <- fromDynamic a -> "[Int]"
| Just (_ :: [String]) <- fromDynamic a -> "[String]"
| Just (_ :: Set Int) <- fromDynamic a -> "Set Int"
| Just (_ :: Set String) <- fromDynamic a -> "Set String"
| otherwise -> error "Unexpected type"
main = do
putStrLn $ whatIsIt $ toDyn ([1, 2, 3] :: [Int])
putStrLn $ whatIsIt $ toDyn (["1", "2", "3"] :: [String])
putStrLn $ whatIsIt $ toDyn (Data.Set.fromList ["1", "2", "3"] :: Set String)
Output:
[Int]
[String]
Set String
However I must outline boldly that this is far from a typical scenario of Haskell programming. The language's type-system is powerful enough to solve extremely intricate problems while maintaining all the type-level information (and safety). Dynamic is only used in very special cases in low-level libraries.
GHC does even more type erasure than the JVM; at runtime the types are completely gone (not just the type parameters).
Haskell's approach to types is to use them at compile time to guarantee that no ill-typed operation can ever be carried out, and since Haskell doesn't have OO-style subtyping and dynamic dispatch, there's no purpose at all to keeping the types around. So data is compiled to a memory structure that simply contains the right values, and functions are compiled with baked-in knowledge of the structure of the types on which they operate1, and just blindly expect their arguments to have that structure. That's why you get fun things like segmentation faults if you mess with unsafeCoerce incorrectly, not just a runtime exception saying the value was not of the expected type; at runtime Haskell has no idea whether a value is of any given type.
So rather than Haskell giving "the right answer" to the equivalent program, Haskell disallows your program as unsafe! There is no Any type in Haskell to which you can cast whatever you want.
That's not 100% true; in both Haskell and Scala there are ways of keeping type information alive at runtime. Essentially it's done by creating ordinary data structures that represent types, and passing them around together values that are of those types, so at runtime you can refer to the type representation object for information about the type of the other object. There are library and language facilities in both languages to let you use this mechanism at a higher (and more principled) level, so that it's easier to use safely. Because it requires the type tokens to be passed around, you have to "opt-in" to such features, and your callers have to be aware of it to pass you the required type tokens (whether the actual generation and passing of the token is done implicitly or explicitly).
Without using such features, Haskell provides no way to pattern match on a value that could be of type List Int or Set String to find out which one it is. Either you're using a monomorphic type, in which case it can only be one type and the others will be rejected, or you're using a polymorphic type, in which case you can only apply code to it that will do the same thing2 regardless of which concrete type instantiates the polymorphic type.
1 Except for polymorphic functions, which assume nothing about their polymorphic arguments, and so can basically do nothing with them except pass them to other polymorphic functions (with matching type class constraints, if any).
2 Type class constrained polymorphic types are the only exception to this. Even then, if you've got a value a type that's a member of some type class, all you can do with it is pass it to other functions that accept values in any type that is a member of that type class. And if those functions are general functions defined outside of the type class in question, they'll be under the same restriction. It's only the type class methods themselves that can actually "do something different" for different types in the class, and that's because they are the union of a whole bunch of monomorphic definitions that operate on one particular type in the class. You can't write code that gets to take a polymorphic value, inspect it to see what it was instantiated with, and then decide what to do.
Of course Haskell prints the right answer:
import Data.Set
import Data.Typeable
main = do
let li=[1,2,3]
let ls=["1","2","3"]
let si=Data.Set.fromList[1,2,3]
let ss=Data.Set.fromList["1","2","3"]
print $ typeOf li
print $ typeOf ls
print $ typeOf si
print $ typeOf ss
prints
[Integer]
[[Char]]
Set Integer
Set [Char]