type stable parametric polymorphism - scala

I don't understand why the following scala code doesn't compile:
sealed trait A

case class B() extends A {
  def funcB: B = this
}

case class C() extends A {
  def funcC: C = this
}

def f[T <: A](s: T): T = s match {
  case s: B => s.funcB
  case s: C => s.funcC
}
It works if I replace f with
def f[T <: A](s: T): A = s match {
  case s: B => s.funcB
  case s: C => s.funcC
}
and then cast to the subtype when f is called, using asInstanceOf, for example. But I would like to be able to construct a function that unifies some previously defined methods and have it be type stable. Can anyone please explain?
Also, note that the following f also compiles:
def f[T <: A](s: T): T = s match {
  case s: B => s
  case s: C => s
}

What makes it work?
In particular, in Scala 3 you could use match types:
scala> type Foo[T <: A] = T match {
| case B => B
| case C => C
| }
|
| def f[T <: A](s:T): Foo[T] = s match {
| case s: B => s.funcB
| case s: C => s.funcC
| }
def f[T <: A](s: T): Foo[T]
scala> f(B())
val res0: B = B()
scala> f(C())
val res1: C = C()
In general, for the solution to "return current type" problems see Scala FAQ How can a method in a superclass return a value of the “current” type?
Compile-time techniques such as type classes and match types can be thought of as a kind of compile-time pattern matching: they instruct the compiler to reduce to the most specific, informationally rich type used at the call site, instead of falling back to a poorer upper-bound type.
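For instance, here is a minimal sketch of the type-class variant in Scala 2, reusing the question's A, B and C (the Process name is just illustrative): each implicit instance fixes the result type to the concrete subtype, so no upper bound is ever taken.

// A type class capturing "how to process a T"; each instance
// pins the result type to the concrete subtype.
trait Process[T <: A] { def apply(s: T): T }

object Process {
  implicit val processB: Process[B] = new Process[B] { def apply(s: B): B = s.funcB }
  implicit val processC: Process[C] = new Process[C] { def apply(s: C): C = s.funcC }
}

def f[T <: A](s: T)(implicit p: Process[T]): T = p(s)

f(B()) // inferred as B
f(C()) // inferred as C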
Why does it not work?
The key concept to understand is that parametric polymorphism is a kind of universal quantification, which means it must make sense to the compiler for all instantiations of type parameters at call sites. Consider the typing specification
def f[T <: A](s: T): T
which the compiler might interpret like so:
For all types T that are a subtype of A, then f should return that
particular subtype T.
hence the expression expr representing the body of f
def f[T <: A](s:T): T = expr
must type to that particular T. Now let's try to type our expr
s match {
  case s: B => s.funcB
  case s: C => s.funcC
}
The type of
case s: B => s.funcB
is B, and the type of
case s: C => s.funcC
is C. Given we have B and C, the compiler now has to take the least upper bound of the two, which is A. But A is certainly not always T. Hence the typecheck fails.
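The same widening can be observed outside of a match, for example:

// The compiler infers the least upper bound of the branch types
// (refined with Product with Serializable for case classes):
val x = if (scala.util.Random.nextBoolean()) B() else C()
// x is typed as A (up to refinement) -- never as B, C, or "that particular T"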
Now let's do the same exercise with
def f[T <: A](s: T): A
This specification means (and observe the "for all" again)
For all types T that are a subtype of A, then f should return
their supertype A.
Now let's type the method body expression
s match {
  case s: B => s.funcB
  case s: C => s.funcC
}
As before we arrive at types B and C, so the compiler takes the least upper bound, which is the supertype A. And indeed this is the very return type we specified, so the typecheck succeeds. However, despite succeeding, at compile time we lost some typing information: the compiler will no longer consider all the information that comes with the specific T passed in at the call site, only the information available via its supertype A. For example, if T has a member that does not exist in A, we will not be able to call it.
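To see that information loss concretely, here is a hypothetical call site using the A-returning version of f:

val b = f(B())   // b's static type is A, even though we passed a B
// b.funcB       // does not compile: value funcB is not a member of A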
What to avoid?
Regarding asInstanceOf, this is us telling the compiler to stop helping us because we will take the reins. Two groups of people tend to use it in Scala to make things work: mad-scientist library authors, and those transitioning from more dynamically typed languages. In most application-level code, however, it is considered bad practice.

It all comes down to our old friend (fiend?) the compile-time/run-time barrier. (And ne'er the twain shall meet.)
T is resolved at compile-time at the call site. When the compiler sees f(B()) then T means B, and when the compiler sees f(C()) then T becomes C.
But match { case ... } is resolved at run-time. The compiler can't know which case branch will be chosen. From the compiler's point of view all case options are equally likely. So if T is resolved to B but the code might take a C branch... well, the compiler can't allow that.
Looking at what does compile:
def f[T <: A](s: T): A = s match { // f() returns an A
  case s: B => s.funcB             // B is an A sub-type
  case s: C => s.funcC             // C is an A sub-type
}                                  // OK, all is good
Your 2nd "also works" example does not compile for me.

To answer the question why it does not work.
f returns the result of the expression s match {...}.
The type of that expression is A (sometimes it evaluates to a B, and sometimes to a C), not T as it is supposed to be. T is sometimes B and sometimes C; s match {...} is never either of those. Its type is their supertype, which is A.
Re. this:
s match {
  case s: B => s
  case s: C => s
}
The type of this expression is obviously T, because s is T. It certainly does compile despite what #jwvh might be saying :)

Related

Scala type inference rule over contravariance?

I'm using Circe and noticed something that I am not so comfortable with, and I would like to understand what is going on under the hood.
Fundamentally it is not really a Circe issue. I was just playing around with Circe to test a few things, so I could have decoded straight into JsonObject, but that is beside the point.
val jobjectStr = """{
| "idProperty": 1991264,
| "nIndex": 0,
| "sPropertyValue": "0165-5728"
| }""".stripMargin
val jobject = decode[Json](jobjectStr).flatMap{ json =>
json.as[JsonObject]
}
My issue is with the flatMap signature of Either, contravariance, and what is happening here.
We have the following types:
decode[Json](jobjectStr): Either[Error, Json]
json.as[JsonObject]: Decoder.Result[JsonObject]
where circe defines
final type Result[A] = Either[DecodingFailure, A]
and
sealed abstract class DecodingFailure(val message: String) extends Error {
Now the signature of flatMap in either is:
def flatMap[A1 >: A, B1](f: B => Either[A1, B1]): Either[A1, B1]
In other words, talking only about types, it is as if my code were doing
Either[Error, Json] flatMap Either[DecodingFailure, JsonObject]
Hence my issue is: DecodingFailure >: Error is not true
And indeed the type of the full expression is:
decode[Json](jobjectStr).flatMap { json =>
  json.as[JsonObject]
}: Either[Error, JsonObject]
Hence I'm confused, because my understanding is that the first type parameter of Either is treated contravariantly in the flatMap signature. There seems to be some weird least-upper-bound inference going on... but I am not sure why, or if that is even the case.
Any explanation?
This really isn't a variance issue. A1 >: A is just telling us that the result type, A1, might have to be a super-type of the received type, A, if the compiler has to go looking for a least upper bound (the LUB). (The use of A1 in the f: B => ... description is, I think, a bit confusing.)
Consider the following:
class Base
class SubA extends Base
class SubB extends Base

Either.cond(true, "a string", new SubA)
  .flatMap(Either.cond(true, _, new SubB))
// res0: scala.util.Either[Base,String] = Right(a string)
Notice how the result is Either[Base,String] because Base is the LUB of SubA and SubB.
So first of all, we need to understand that the compiler will always try to infer types that allow compilation. The only real way to prevent something from compiling is to use implicits.
(not sure if this is part of the language specification, or a compiler implementation detail, or something common to all compilers, or a bug or a feature).
Now, let's start with a simpler example: List and ::.
sealed trait List[+A] {
  def ::[B >: A](b: B): List[B] = Cons(b, this)
}

final case class Cons[+A](head: A, tail: List[A]) extends List[A]
final case object Nil extends List[Nothing]
So, given that the compiler will always try to make code like x :: list compile, we have three scenarios:
x is of type A and list is a List[A], so it is obvious that the returned value has to be of type List[A].
x is of some type C and list is a List[A], and C is a subtype of A (C <: A). Then the compiler simply upcasts x to type A and the process continues as in the first scenario.
x is of some type D and list is a List[A], and D is not a subtype of A. Then the compiler finds a new type B which is the LUB of D and A, upcasts x to B and list to List[B] (possible due to covariance), and proceeds as in the first scenario (see the example after this list).
Also, note that due to the existence of types like Any and Nothing there is "always" a LUB between two types.
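For instance, with the standard library List the behaviour is the same (Fruit, Apple, and Orange are illustrative names, not from the question):

sealed trait Fruit
case object Apple extends Fruit
case object Orange extends Fruit

val apples: List[Apple.type] = List(Apple)

// Orange.type is not a subtype of Apple.type, so the compiler widens
// to their LUB, Fruit (scenario 3 above):
val fruits: List[Fruit] = Orange :: apples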
Now let's see Either and flatMap.
sealed trait Either[+L, +R] {
  def flatMap[LL >: L, RR](f: R => Either[LL, RR]): Either[LL, RR]
}

final case class Left[+L](l: L) extends Either[L, Nothing]
final case class Right[+R](r: R) extends Either[Nothing, R]
Now, assuming my left side is an Error, I feel this behaviour of returning the LUB of the two possible lefts is the best one: in the end I will have either the first error, the second error, or the final value. Since I do not know which of the two errors it was, that error must be of some type that encapsulates both possibilities.
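As a sketch with two hypothetical error types (not from Circe), assuming a right-biased Either (Scala 2.12+):

sealed trait AppError
case class ParseError(msg: String) extends AppError
case class ValidationError(msg: String) extends AppError

val parsed: Either[ParseError, Int] = Right(1)
def validate(i: Int): Either[ValidationError, Int] = Right(i)

// LL is inferred as the LUB of ParseError and ValidationError, i.e. AppError:
val result: Either[AppError, Int] = parsed.flatMap(validate)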

Scala Option type upper bound don't understand

I'm reading Functional Programming in Scala, and in chapter 04 the authors implement Option on their own. Now, when defining the function getOrElse they use an upper bound to restrict the type of A to a supertype (if I understood correctly).
So, the definition goes:
sealed trait Option[+A] {
  def getOrElse[B >: A](default: => B): B = this match {
    case None => default
    case Some(a) => a
  }
}
So, when we have something like
val a = Some(4)
println(a.getOrElse(None))    // prints an integer value
val b = None
println(b.getOrElse(Some(3))) // prints an Option[Int] value
a has type Option[Int], so A would be type Int. B would be type Nothing. Nothing is a subtype of every other type. That means that Option[Nothing] is a subtype of Option[Int] (because of covariance), right?
But with B >: A we said that B has to be a supertype?! So how can we get an Int back? This is a bit confusing for me...
Anyone care to try and clarify?
That means that Option[Nothing] is a subtype of Option[Int] (because of covariance), right?
Correct. Option[Nothing] is an Option[Int].
But with B >: A we said that B has to be a supertype?! So how can we get an Int back?
It doesn't have to be a super-type. It just requires A as a lower bound, which means you can still pass an Int to getOrElse if A is Int.
But that doesn't mean you can't pass instances of a sub-class. For instance:
class A
class B extends A
class C extends B
scala> Option(new B)
res196: Option[B] = Some(B#661f82ac)
scala> res196.getOrElse(new C)
res197: B = B#661f82ac
scala> res196.getOrElse(new A)
res198: A = B#661f82ac
scala> res196.getOrElse("...")
res199: Object = B#661f82ac
I can still pass an instance of C, because C can be up-cast to B. I can also pass a type higher up the inheritance tree, and getOrElse will return that type instead. If I pass a type that has nothing to do with the type contained in the Option, then the least upper bound of the two types is inferred. In the above case, it's AnyRef (displayed as Object).
So why is the lower-bound there at all? Why not have:
def getOrElse[B <: A](default: => B): B
This won't work, because getOrElse must return either the A contained in the Option or the default B. But if we return the A, it is not necessarily a B, so the declared return type would be violated. Perhaps if getOrElse returned A:
def getOrElse[B <: A](default: => B): A
This would work (if it were really defined that way), but you would be restricted by the type-bounds. So in my above example, you could only pass B or C to getOrElse on an Option[B]. In any case, this is not how it is in the standard library.
The standard library getOrElse allows you to pass anything to it. Say you have Option[A]. If we pass a sub-type of A, then it is up-cast to A. If we pass A, obviously this is okay. And if we pass some other type, then the compiler infers the least upper-bound between the two. In all cases, the type-bound B >: A is met.
Because getOrElse allows you to pass anything to it, many consider it very tricky. For example you could have:
val number = "blah"
// ... lots of code
val result = Option(1).getOrElse(number)
And this will compile. We'll just have a result of type Any that will probably cause an error somewhere else down the line.
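As a sketch, one way to surface the mistake is to annotate the expected type, so the silent widening becomes a compile error:

val number = "blah"

// Without an annotation, the LUB of Int and String is inferred:
val loose = Option(1).getOrElse(number)   // loose: Any

// With an expected type, the same call no longer compiles:
// val strict: Int = Option(1).getOrElse(number)
// error: type mismatch (the String default cannot satisfy B = Int)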

Scala pattern match infers `Any` instead of an existential type, breaks type safety?

I ran into a puzzling type inference problem with case classes. Here's a minimal example:
trait T[X]
case class Thing[A, B, X](a: A, f: A => B) extends T[X]

def hmm[X](t: T[X]) = t match {
  case Thing(a, f) => f("this really shouldn't typecheck")
}
Scala decides that a: Any and f: Any => Any, but that's inappropriate; they really ought to have types a: SomeTypeA and f: SomeTypeA => SomeTypeB, where SomeTypeA and SomeTypeB are unknown types.
Another way of saying this is that I think the hypothetical Thing.unapply method should look something like
def unapply[X](t: T[X]): Option[(A, A => B)] forSome { type A; type B } = {
  t match {
    case thing: Thing[_, _, X] => Some((thing.a, thing.f))
  }
}
This version correctly gives a type error at f("this really shouldn't typecheck").
Does this seem like a bug in the compiler, or am I missing something?
Edit: This is on Scala 2.10.3.
Mark Harrah pointed this out in the #scala channel on Freenode: yes, this is a bug.
https://issues.scala-lang.org/browse/SI-6680

Scala: F-Bounded Polymorphism Woes

Assume the existence of the following types and method:
trait X[A <: X[A]]
case class C() extends X[C]

def m(x: PartialFunction[X[_], Boolean]): Unit
I want to be able to create a PartialFunction to be passed into m.
A first attempt would be to write
val f: PartialFunction[X[_], Boolean] = {
  case c: C => true
}
m(f)
This fails with type arguments [_$1] do not conform to trait X's type parameter bounds [A <: X[A]]. So, it seems we have to constrain X's type parameter.
A second attempt:
val f: PartialFunction[{ type A <: X[A] }, Boolean] = {
  case c: C => true
}
m(f)
This fails on the application of m because PartialFunction[AnyRef{type A <: X[this.A]},Boolean] <: PartialFunction[X[_],Boolean] is false.
Is there any way not involving casting that actually satisfies the compiler both on the definition of the partial function and on the application of m?
I'm not sure what exactly you want, but since you are using an existential type (in the disguise of the _ syntax), this is how you can make that work:
val f: PartialFunction[X[A] forSome { type A <: X[A] }, Boolean] = {
  case c: C => true
}
The _ syntax isn't good enough here, since you need to give the existential type the right upper bound. That is only possible with the more explicit forSome syntax.
What I find surprising, though, is that Scala accepts the declaration
def m(x: PartialFunction[X[_], Boolean])
in the first place. It seems weird that it even considers X[_] a well-formed type. This is short for X[A] forSome {type A <: Any}, which should not be a valid application of X, because it does not conform to the parameter bounds.
Not sure if that's what you wanted to achieve, but this is the working sequence:
trait X[A <: X[A]]
case class C() extends X[C]

def m[T <: X[T]](x: PartialFunction[X[T], Boolean]) = print("yahoo!")

scala> def f[T <: X[T]]: PartialFunction[X[T], Boolean] = {
     |   case c: C => true
     | }
f: [T <: X[T]]=> PartialFunction[X[T],Boolean]
scala> m(f)
yahoo!

Scala: specify a default generic type instead of Nothing

I have a pair of classes that look something like this. There's a Generator that generates a value based on some class-level values, and a GeneratorFactory that constructs a Generator.
case class Generator[T, S](a: T, b: T, c: T) {
  def generate(implicit bf: CanBuildFrom[S, T, S]): S =
    (bf() += (a, b, c)).result
}

case class GeneratorFactory[T]() {
  def build[S <% Seq[T]](seq: S) = Generator[T, S](seq(0), seq(1), seq(2))
}
You'll notice that GeneratorFactory.build accepts an argument of type S and Generator.generate produces a value of type S, but there is nothing of type S stored by the Generator.
We can use the classes like this. The factory works on a sequence of Char, and generate produces a String because build is given a String.
val gb = GeneratorFactory[Char]()
val g = gb.build("this string")
val o = g.generate
This is fine and handles the String type implicitly because we are using the GeneratorFactory.
The Problem
Now the problem arises when I want to construct a Generator without going through the factory. I would like to be able to do this:
val g2 = Generator('a', 'b', 'c')
g2.generate // error
But I get an error because g2 has type Generator[Char,Nothing] and Scala "Cannot construct a collection of type Nothing with elements of type Char based on a collection of type Nothing."
What I want is a way to tell Scala that the "default value" of S is something like Seq[T] instead of Nothing. Borrowing from the syntax for default parameters, we could think of this as being something like:
case class Generator[T, S=Seq[T]]
Insufficient Solutions
Of course it works if we explicitly tell the generator what its generated type should be, but I think a default option would be nicer (my actual scenario is more complex):
val g3 = Generator[Char, String]('a', 'b', 'c')
val o3 = g3.generate // works fine, o3 has type String
I thought about overloading Generator.apply to have a one-generic-type version, but this causes an error since apparently Scala can't distinguish between the two apply definitions:
object Generator {
  def apply[T](a: T, b: T, c: T) = new Generator[T, Seq[T]](a, b, c)
}
val g2 = Generator('a', 'b', 'c') // error: ambiguous reference to overloaded definition
Desired Output
What I would like is a way to simply construct a Generator without specifying the type S and have it default to Seq[T] so that I can do:
val g2 = Generator('a', 'b', 'c')
val o2 = g2.generate
// o2 is of type Seq[Char]
I think that this would be the cleanest interface for the user.
Any ideas how I can make this happen?
Is there a reason you don't want to use a base trait and then narrow S as needed in its subclasses? The following for example fits your requirements:
import scala.collection.generic.CanBuildFrom

trait Generator[T] {
  type S
  def a: T; def b: T; def c: T
  def generate(implicit bf: CanBuildFrom[S, T, S]): S = (bf() += (a, b, c)).result
}

object Generator {
  def apply[T](x: T, y: T, z: T) = new Generator[T] {
    type S = Seq[T]
    val (a, b, c) = (x, y, z)
  }
}

case class GeneratorFactory[T]() {
  def build[U <% Seq[T]](seq: U) = new Generator[T] {
    type S = U
    val Seq(a, b, c, _*) = seq: Seq[T]
  }
}
I've made S an abstract type to keep it a little more out of the way of the user, but you could just as well make it a type parameter.
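A hypothetical usage session, assuming a pre-2.13 standard library (where CanBuildFrom exists):

val g = Generator('a', 'b', 'c')
val o = g.generate               // o: Seq[Char], via the default S = Seq[T]

val gb = GeneratorFactory[Char]()
val g2 = gb.build("this string")
val o2 = g2.generate             // o2: String, because S = String here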
This does not directly answer your main question, as I think others are handling that. Rather, it is a response to your request for default values for type arguments.
I have put some thought into this, even going so far as starting to write a proposal for instituting a language change to allow it. However, I stopped when I realized where the Nothing actually comes from. It is not some sort of "default value" like I expected. I will attempt to explain where it comes from.
In order to assign a type to a type argument, Scala uses the most specific possible/legal type. So, for example, suppose you have "class A[T](x: T)" and you say "new A[Int]". You directly specified the value of "Int" for T. Now suppose that you say "new A(4)". Scala knows that 4 and T have to have the same type. 4 can have a type anywhere between "Int" and "Any". In that type range, "Int" is the most specific type, so Scala creates an "A[Int]". Now suppose that you say "new A[AnyVal]". Now, you are looking for the most specific type T such that Int <: T <: Any and AnyVal <: T <: AnyVal. Luckily, Int <: AnyVal <: Any, so T can be AnyVal.
Continuing, now suppose that you have "class B[S >: String <: AnyRef]". If you say "new B", you won't get a B[Nothing]. Rather, you will find that you get a B[String]. This is because S is being constrained as String <: S <: AnyRef and String is at the bottom of that range.
So, you see, for "class C[R]", "new C" doesn't give you a C[Nothing] because Nothing is some sort of default value for type arguments. Rather, you get a C[Nothing] because Nothing is the lowest thing that R can be (if you don't specify otherwise, Nothing <: R <: Any).
This is why I gave up on my default type argument idea: I couldn't find a way to make it intuitive. In this system of restricting ranges, how do you implement a low-priority default? Or, does the default out-priority the "choose the lowest type" logic if it is within the valid range? I couldn't think of a solution that wouldn't be confusing for at least some cases. If you can, please let me know, as I'm very interested.
edit: Note that the logic is reversed for contravariant parameters. So if you have "class D[-Q]" and you say "new D", you get a D[Any].
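A small sketch of those inference rules, using the classes named above:

class B[S >: String <: AnyRef]
class C[R]
class D[-Q]

val b = new B // B[String]: the lower bound is the most specific legal choice
val c = new C // C[Nothing]: Nothing is the lowest that R can be
val d = new D // D[Any]: contravariant parameters are maximized instead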
One option is to move the summoning of the CanBuildFrom to a place where it (or, rather, its instances) can help to determine S,
case class Generator[T, S](a: T, b: T, c: T)(implicit bf: CanBuildFrom[S, T, S]) {
  def generate: S =
    (bf() += (a, b, c)).result
}
Sample REPL session,
scala> val g2 = Generator('a', 'b', 'c')
g2: Generator[Char,String] = Generator(a,b,c)
scala> g2.generate
res0: String = abc
Update
The GeneratorFactory will also have to be modified so that its build method propagates an appropriate CanBuildFrom instance to the Generator constructor,
case class GeneratorFactory[T]() {
  def build[S](seq: S)(implicit conv: S => Seq[T], bf: CanBuildFrom[S, T, S]) =
    Generator[T, S](seq(0), seq(1), seq(2))
}
Note that with Scala < 2.10.0 you can't mix view bounds and implicit parameter lists in the same method definition, so we have to translate the bound S <% Seq[T] to its equivalent implicit parameter S => Seq[T].