Issue with existential in Scala - scala

I have an issue working with existentials in Scala. My problem started when creating a mini workflow engine. I started on the idea that it was a directed graph, implemented the model for the latter first and then modeled the Workflow like this:
case class Workflow private(states: List[StateDef], transitions: List[_, _], override val edges: Map[String, List[StateDef]]) extends Digraph[String, StateDef, Transition[_, _]](states, edges) { ... }
In this case class, the first two fields are a list of states which behave as node, transitions which behave as edges.
The Transition parameter types are for the input and output parameters, as this should behave as an executable piece in the workflow, like a function of some sort:
case class Transition[-P, +R](tailState: StateDef, headState: StateDef, action: Action[P, R], condition: Option[Condition[P]] = None) extends Edge[String, StateDef](tailState, headState) {
def execute(param: P): Try[State[R]] = ...
}
I realized soon enough that dealing with a list of transitions in the Workflow object was giving me troubles with its type parameters. I tried to use parameters with [[Any]] and [[Nothing]], but I couldn't make it work (gist [1]).
If I'd do Java, I'd use a wildcard ? and use its 'less type safe and more dynamic' property and Java would have to believe me. But Scala is stricter and with variance and covariance of the Transition parameter types, it's hard to define wildcards and handle these properly. For example, using forSome notation and having a method in Workflow, I would get this error (gist [2]):
Error:(55, 24) type mismatch;
found : List[A$A27.this.Transition[A$A27.this.CreateImage,A$A27.this.Image]]
required: List[A$A27.this.Transition[P forSome { type P },R forSome { type R }]]
lazy val w = Workflow(transitions)
^
Hence then I created an existential type based on a trait (gist [3]), as explained in this article.
trait Transitions {
type Param
type Return
val transition: Transition[Param, Return]
val evidenceParam: StateValue[Param]
val evidenceReturn: StateValue[Return]
}
So now I could plug this existential in my Workflow class like this:
case class Workflow private(states: List[StateDef], transitions: List[Transitions], override val edges: Map[String, List[StateDef]])
extends Digraph[String, StateDef, Transitions](states, edges)
Working in a small file proved to be working (gist [3]). But when I moved on to the real code, my Digraph parent class does not like this Transitions existential. The former needs an Edge[ID, V] type, which Transition complies with but not the Transitions existential of course.
How in Scala does one resolve this situation? It seems troublesome to work with parameter types to get generics in Scala. Is there an easier solution that I haven't tried? Or a magic trick to specify the correct compatible parameter type between Digraph which need an Edge[ID, V] type and not an existential type that basically erase type traces?
I am sorry as this is convoluted, I will try my best to update the question if necessary.
Here are the Gist references for some of my trials and errors:
https://gist.github.com/jimleroyer/943efd00c764880b8119786d9dd6c3a2
https://gist.github.com/jimleroyer/1ce238b3934882ddc02a09485f52f407
https://gist.github.com/jimleroyer/17227b7e334d020a21deb36086b9b978
EDIT-1
Based on #HTNW answer, I've modified the scope of the existentials using forSome and updated the solution: https://gist.github.com/jimleroyer/2cb4ccbec13620585d21d53b4431ce22
I still have an issue though to properly bind the generics with the matchTransition & getTransition methods and without an explicit cast using asInstanceOf. I'll open another question specific to that one issue.

You scoped your existential quantifiers wrong.
R forSome { type R }
is equal to Any, because every single type is a type, so every single type is a subtype of that existential type, and that is the distinguishing feature of Any. Therefore
Transition[P forSome { type P }, R forSome { type R }]
is really
Transition[Any, Any]
and the Transitions end up needing to take Anys as parameter, and you lose all information about the type of the return. Make it
List[Transition[P, R] forSome { type P; type R }] // if every transition can have different types
List[Transition[P, R]] forSome { type P; type R } // if all the transitions need similar types
// The first can also be sugared to
List[Transition[_, _]]
// _ scopes so the forSome is placed outside the nearest enclosing grouping
Also, I don't get where you got the idea that Java's ? is "less safe". Code using it has a higher chance of being unsafe, sure, because ? is limited, but on its own it is perfectly sound (modulo null).

Related

Understanding covariance in my Scala code

I am working through the correct syntax and structure for the following problem.
I have two datasets with two separate schemas--call them ClientEvent and ServerEvent--stored on disk. The codebase I am working on has defined a class, Reader[T :< Asset] where ClientEvent and ServerEvent are subtypes of Asset. Asset is a trait.
I am writing a function:
def getPathAndReader(config): (String, Reader[Asset]) = {
if (config.readClient) {
return getClientPathAndReader(config)
} else {
return getServerPathAndReader(config)
}
}
This does not compile in my Scala code. From my understanding, T must be a subtype of Asset, which both ServerEvent and ClientEvent are, therefore Reader[ServerEvent] <: Reader[Asset]. But since functions are covariant in their inputs, the function I wrote cannot just return this lower type, I'd have to cast it to a supertype? Does that lose too much information?
load is a function on the trait Asset
trait Reader[T <: Asset] {
def load(raw: DataFrame): Dataset[T]
}
What would be an alternative way to structure this code?
The code's intent is to take the file path returned, and call Reader::load(filePath: String) to get data back. The subtyped readers have some internal logic to clean the data that it retrieves from disk before it's returned as a Dataframe. This means it relies on the type that it passes in. I come from a C++/C# background so my thinking is that if you have a generic Reader[Asset] but call Reader::load(path: String) it will know what to do based on the type it actually is, similar to Base* ptr and calling a derived method.
Your claim that
"From my understanding, T must be a subtype of Asset, which both ServerEvent and ClientEvent are, therefore Reader[ServerEvent] <: Reader[Asset]." is not correct. Generally if A and B are usual types such as A <: B and G[T] is a generic type, then all 3 cases are possible:
Co-variant case G[A] <: G[B] - typical example is some read-only collection like Iterator
Contra-variant case G[A] :> G[B] - typical example is some kind of a consumer like a function T => ()
Invariant case where G[A] and G[B] are not related. The most typical case when some uses of the T are co-variant and some a contravariant. For example, a simple mapping function T => T is invariant. Also most of the mutable collections are invariant as well because the both "produce" and "consume" objects.
Unfortunately for you Dataset[T] is invariant (rather than covariant Dataset[+T] or contravariant Dataset[-T]). This effectively makes your Reader also invariant. As to how to work this around, it is hard to advice without understanding a larger context. For example, why your getClientPathAndReader and getServerPathAndReader do not return Dataset[Asset]? If you really then use specific ServerEvent and ClientEvent, then your design is not type-safe anyway. If you use only Asset, then changing your readers to return Dataset[Asset] seems the easiest solution.

Scala: Casting results of groupBy(_.getClass)

In this hypothetical, I have a list of operations to be executed. Some of the operations in that list will be more efficient if they can be batched together (eg, lookup up different rows from the same table in a database).
trait Result
trait BatchableOp[T <: BatchableOp[T]] {
def resolve(batch: Vector[T]): Vector[Result]
}
Here we use F-bounded Polymorphism to allow the implementation of the operation to refer to its own type, which is highly convenient.
However, this poses a problem when it comes time to execute:
def execute(operations: Vector[BatchableOp[_]]): Vector[Result] = {
def helper[T <: BatchableOp[T]](clazz: Class[T], batch: Vector[T]): Vector[Result] =
batch.head.resolve(batch)
operations
.groupBy(_.getClass)
.toVector
.flatMap { case (clazz, batch) => helper(clazz, batch)}
}
This results in a compiler error stating inferred type arguments [BatchableOp[_]] do not conform to method helper's type parameter bounds [T <: BatchableOp[T]].
How can the Scala compiler be convinced that the group is all of the same type (which is a subclass of BatchableOp)?
One workaround is to specify the type explicitly, but in this case the type is unknown.
Another workaround relies on enumerating the child types, but I'd like to not have to update the execute method after implementing a new BatchableOp type.
I would like to approach the question systematically, so that the same solution strategy can be applied in similar cases.
First, an obvious remark: you want to work with a vector. The content of the vector can be of different types. The length of the vector is not limited. The number of types of entries of the vector is not limited. Therefore, the compiler cannot prove everything at compile time: you will have to use something like asInstanceOf at some point.
Now to the solution of the actual question:
This here compiles under 2.12.4:
import scala.language.existentials
trait Result
type BOX = BatchableOp[X] forSome { type X <: BatchableOp[X] }
trait BatchableOp[C <: BatchableOp[C]] {
def resolve(batch: Vector[C]): Vector[Result]
// not abstract, needed only once!
def collectSameClassInstances(batch: Vector[BOX]): Vector[C] = {
for (b <- batch if this.getClass.isAssignableFrom(b.getClass))
yield b.asInstanceOf[C]
}
// not abstract either, no additional hassle for subclasses!
def collectAndResolve(batch: Vector[BOX]): Vector[Result] =
resolve(collectSameClassInstances(batch))
}
def execute(operations: Vector[BOX]): Vector[Result] = {
operations
.groupBy(_.getClass)
.toVector
.flatMap{ case (_, batch) =>
batch.head.collectAndResolve(batch)
}
}
The main problem that I see here is that in Scala (unlike in some experimental dependently typed languages) there is no simple way to write down complex computations "under the assumption of existence of a type".
Therefore, it seems difficult / impossible to transform
Vector[BatchOp[T] forSome T]
into a
Vector[BatchOp[T]] forSome T
Here, the first type says: "it's a vector of batchOps, their types are unknown, and can be all different", whereas the second type says: "it's a vector of batchOps of unknown type T, but at least we know that they are all the same".
What you want is something like the following hypothetical language construct:
val vec1: Vector[BatchOp[T] forSome T] = ???
val vec2: Vector[BatchOp[T]] forSome T =
assumingExistsSomeType[C <: BatchOp[C]] yield {
/* `C` now available inside this scope `S` */
vec1.map(_.asInstanceOf[C])
}
Unfortunately, we don't have anything like it for existential types, we can't introduce a helper type C in some scope S such that when C is eliminated, we are left with an existential (at least I don't see a general way to do it).
Therefore, the only interesting question that is to be answered here is:
Given a Vector[BatchOp[X] forSome X] for which I know that there is one common type C such that they all are actually Vector[C], where is the scope in which this C is present as a usable type variable?
It turns out that BatchableOp[C] itself has a type variable C in scope. Therefore, I can add a method collectSameClassInstances to BachableOp[C], and this method will actually have some type C available that it can use in the return type. Then I can immediately pass the result of collectSameClassInstances to the resolve method, and then I get a completely benign Vector[Result] type as output.
Final remark: If you decide to write any code with F-bounded polymorphisms and existentials, at least make sure that you have documented very clearly what exactly you are doing there, and how you will ensure that this combination does not escape in any other parts of the codebase. It doesn't feel like a good idea to expose such interfaces to the users. Keep it localized, make sure these abstractions do not leak anywhere.
Andrey's answer has a key insight that the only scope with the appropriate type variable is on the BatchableOp itself. Here's a reduced version that doesn't rely on importing existentials:
trait Result
trait BatchableOp[T <: BatchableOp[T]] {
def resolve(batch: Vector[T]): Vector[Result]
def unsafeResolve(batch: Vector[BatchableOp[_]]): Vector[Result] = {
resolve(batch.asInstanceOf[Vector[T]])
}
}
def execute(operations: Vector[BatchableOp[_]]): Vector[Result] = {
operations
.groupBy(_.getClass)
.toVector
.flatMap{ case (_, batch) =>
batch.head.unsafeResolve(batch)
}
}

Scala: Typecast without explicitly known type parameter

Consider the following example:
case class C[T](x:T) {
def f(t:T) = println(t)
type ValueType = T
}
val list = List(1 -> C(2), "hello" -> C("goodbye"))
for ((a,b) <- list) {
b.f(a)
}
In this example, I know (runtime guarantee) that the type of a will be some T, and b will have type C[T] with the same T. Of course, the compiler cannot know that, hence we get a typing error in b.f(a).
To tell the compiler that this invocation is OK, we need to do a typecast à la b.f(a.asInstanceOf[T]). Unfortunately, T is not known here. So my question is: How do I rewrite b.f(a) in order to make this code compile?
I am looking for a solution that does not involve complex constructions (to keep the code readable), and that is "clean" in the sense that we should not rely on code erasure to make it work (see the first approach below).
I have some working approaches, but I find them unsatisfactory for various reasons.
Approaches I tried:
b.asInstanceOf[C[Any]].f(a)
This works, and is reasonably readable, but it is based on a "lie". b is not of type C[Any], and the only reason we do not get a runtime error is because we rely on the limitations of the JVM (type erasure). I think it is good style only to use x.asInstanceOf[X] when we know that x is really of type X.
b.f(a.asInstanceOf[b.ValueType])
This should work according to my understanding of the type system. I have added the member ValueType to the class C in order to be able to explicitly refer to the type parameter T. However, in this approach we get a mysterious error message:
Error:(9, 22) type mismatch;
found : b.ValueType
(which expands to) _1
required: _1
b.f(a.asInstanceOf[b.ValueType])
^
Why? It seems to complain that we expect type _1 but got type _1! (But even if this approach works, it is limited to the cases where we have the possibility to add a member ValueType to C. If C is some existing library class, we cannot do that either.)
for ((a,b) <- list.asInstanceOf[List[(T,C[T]) forSome {type T}]]) {
b.f(a)
}
This one works, and is semantically correct (i.e., we do not "lie" when invoking asInstanceOf). The limitation is that this is somewhat unreadable. Also, it is somewhat specific to the present situation: if a,b do not come from the same iterator, then where can we apply this type cast? (This code also has the side effect of being too complex for Intelli/J IDEA 2016.2 which highlights it as an error in the editor.)
val (a2,b2) = (a,b).asInstanceOf[(T,C[T]) forSome {type T}]
b2.f(a2)
I would have expected this one to work since a2,b2 now should have types T and C[T] for the same existential T. But we get a compile error:
Error:(10, 9) type mismatch;
found : a2.type (with underlying type Any)
required: T
b2.f(a2)
^
Why? (Besides that, the approach has the disadvantage of incurring runtime costs (I think) because of the creation and destruction of a pair.)
b match {
case b : C[t] => b.f(a.asInstanceOf[t])
}
This works. But enclosing the code with a match makes the code much less readable. (And it also is too complicated for Intelli/J.)
The cleanest solution is, IMO, the one you found with the type-capture pattern match. You can make it concise, and hopefully readable, by integrating the pattern directly inside your for comprehension, as follows:
for ((a, b: C[t]) <- list) {
b.f(a.asInstanceOf[t])
}
Fiddle: http://www.scala-js-fiddle.com/gist/b9030033133ee94e8c18ad772f3461a0
If you are not in a for comprehension already, unfortunately the corresponding pattern assignment does not work:
val (c, d: C[t]) = (a, b)
d.f(c.asInstanceOf[t])
That's because t is not in scope anymore on the second line. In that case, you would have to use the full pattern matching.
Maybe I'm confused about what you are trying to achieve, but this compiles:
case class C[T](x:T) {
def f(t:T) = println(t)
type ValueType = T
}
type CP[T] = (T, C[T])
val list = List[CP[T forSome {type T}]](1 -> C(2), "hello" -> C("goodbye"))
for ((a,b) <- list) {
b.f(a)
}
Edit
If the type of the list itself is out of your control, you can still cast it to this "correct" type.
case class C[T](x:T) {
def f(t:T) = println(t)
type ValueType = T
}
val list = List(1 -> C(2), "hello" -> C("goodbye"))
type CP[T] = (T, C[T])
for ((a,b) <- list.asInstanceOf[List[CP[T forSome { type T }]]]) {
b.f(a)
}
Great question! Lots to learn here about Scala.
Other answers and comments have already addressed most of the issues here, but I'd like to address a few additional points.
You asked why this variant doesn't work:
val (a2,b2) = (a,b).asInstanceOf[(T,C[T]) forSome {type T}]
b2.f(a2)
You aren't the only person who's been surprised by this; see e.g. this recent very similar issue report: SI-9899.
As I wrote there:
I think this is working as designed as per SLS 6.1: "The following skolemization rule is applied universally for every expression: If the type of an expression would be an existential type T, then the type of the expression is assumed instead to be a skolemization of T."
Basically, every time you write a value-level expression that the compiler determines to have an existential type, the existential type is instantiated. b2.f(a2) has two subexpressions with existential type, namely b2 and a2, so the existential gets two different instantiations.
As for why the pattern-matching variant works, there isn't explicit language in SLS 8 (Pattern Matching) covering the behavior of existential types, but 6.1 doesn't apply because a pattern isn't technically an expression, it's a pattern. The pattern is analyzed as a whole and any existential types inside only get instantiated (skolemized) once.
As a postscript, note that yes, when you play in this area, the error messages you get are often confusing or misleading and ought to be improved. See for example https://github.com/scala/scala-dev/issues/205
A wild guess, but is it possible that you need something like this:
case class C[+T](x:T) {
def f[A >: T](t: A) = println(t)
}
val list = List(1 -> C(2), "hello" -> C("goodbye"))
for ((a,b) <- list) {
b.f(a)
}
?
It will type check.
I'm not quite sure what "runtime guarantee" means here, usually it means that you are trying to fool type system (e.g. with asInstanceOf), but then all bets are off and you shouldn't expect type system to be of any help.
UPDATE
Just for the illustration why type casting is an evil:
case class C[T <: Int](x:T) {
def f(t: T) = println(t + 1)
}
val list = List("hello" -> C(2), 2 -> C(3))
for ((a, b: C[t]) <- list) {
b.f(a.asInstanceOf[t])
}
It compiles and fails at runtime (not surprisingly).
UPDATE2
Here's what generated code looks like for the last snippet (with C[t]):
...
val a: Object = x1._1();
val b: Test$C = x1._2().$asInstanceOf[Test$C]();
if (b.ne(null))
{
<synthetic> val x2: Test$C = b;
matchEnd4({
x2.f(scala.Int.unbox(a));
scala.runtime.BoxedUnit.UNIT
})
}
...
Type t simply vanished (as it should have been) and Scala is trying to convert a to an upper bound of T in C, i.e. Int. If there is no upper bound it's going to be Any (but then method f is nearly useless unless you cast again or use something like println which takes Any).

What does `SomeType[_]` mean in scala?

Going through Play Form source code right now and encountered this
def bindFromRequest()(implicit request: play.api.mvc.Request[_]): Form[T] = {
I am guessing it takes request as an implicit parameter (you don't have to call bindFromRequet(request)) of type play.api.mvc.Request[_] and return a generic T type wrapped in Form class. But what does [_] mean.
The notation Foo[_] is a short hand for an existential type:
Foo[A] forSome {type A}
So it differs from a normal type parameter by being existentially quantified. There has to be some type so your code type checks where as if you would use a type parameter for the method or trait it would have to type check for every type A.
For example this is fine:
val list = List("asd");
def doSomething() {
val l: List[_] = list
}
This would not typecheck:
def doSomething[A]() {
val l: List[A] = list
}
So existential types are useful in situations where you get some parameterized type from somewhere but you do not know and care about the parameter (or only about some bounds of it etc.)
In general, you should avoid existential types though, because they get complicated fast. A lot of instances (especially the uses known from Java (called wildcard types there)) can be avoided be using variance annotations when designing your class hierarchy.
The method doesn't take a play.api.mvc.Request, it takes a play.api.mvc.Request parameterized with another type. You could give the type parameter a name:
def bindFromRequest()(implicit request: play.api.mvc.Request[TypeParameter]): Form[T] = {
But since you're not referring to TypeParameter anywhere else it's conventional to use an underscore instead. I believe the underscore is special cased as a 'black hole' here, rather than being a regular name that you could refer to as _ elsewhere in the type signature.

Type parameters versus member types in Scala

I'd like to know how do the member types work in Scala, and how should I associate types.
One approach is to make the associated type a type parameter. The advantages of this approach is that I can prescribe the variance of the type, and I can be sure that a subtype doesn't change the type. The disadvantages are, that I cannot infer the type parameter from the type in a function.
The second approach is to make the associated type a member of the second type, which has the problem that I can't prescribe bounds on the subtypes' associated types and therefore, I can't use the type in function parameters (when x : X, X#T might not be in any relation with x.T)
A concrete example would be:
I have a trait for DFAs (could be without the type parameter)
trait DFA[S] { /* S is the type of the symbols in the alphabet */
trait State { def next(x : S); }
/* final type Sigma = S */
}
and I want to create a function for running this DFA over an input sequence, and I want
the function must take anything <% Seq[alphabet-type-of-the-dfa] as input sequence type
the function caller needn't specify the type parameters, all must be inferred
I'd like the function to be called with the concrete DFA type (but if there is a solution where the function would not have a type parameter for the DFA, it's OK)
the alphabet types must be unconstrained (ie. there must be a DFA for Char as well as for a yet unknown user-defined class)
the DFAs with different alphabet types are not subtypes
I tried this:
def runDFA[S, D <: DFA[S], SQ <% Seq[S]](d : D)(seq : SQ) = ....
this works, except the type S is not inferred here, so I have to write the whole type parameter list on each call site.
def runDFA[D <: DFA[S] forSome { type S }, SQ <% Seq[D#Sigma]]( ... same as above
this didn't work (invalid circular reference to type D??? (what is it?))
I also deleted the type parameter, created an abstract type Sigma and tried binding that type in the concrete classes. runDFA would look like
def runDFA[D <: DFA, SQ <% Seq[D#Sigma]]( ... same as above
but this inevitably runs into problems like "type mismatch: expected dfa.Sigma, got D#Sigma"
Any ideas? Pointers?
Edit:
As the answers indicate there is no simple way of doing this, could somebody elaborate more on why is it impossible and what would have to be changed so it worked?
The reasons I want runDFA ro be a free function (not a method) is that I want other similar functions, like automaton minimization, regular language operations, NFA-to-DFA conversions, language factorization etc. and having all of this inside one class is just against almost any principle of OO design.
First off, you don't need the parameterisation SQ <% Seq[S]. Write the method parameter as Seq[S]. If SQ <% Seq[S] then any instance of it is implicitly convertable to Seq[S] (that's what <% means), so when passed as Seq[S] the compiler will automatically insert the conversion.
Additionally, what Jorge said about type parameters on D and making it a method on DFA hold. Because of the way inner classes work in Scala I would strongly advise putting runDFA on DFA. Until the path dependent typing stuff works, dealing with inner classes of some external class can be a bit of a pain.
So now you have
trait DFA[S]{
...
def runDFA(seq : Seq[S]) = ...
}
And runDFA is all of a sudden rather easy to infer type parameters for: It doesn't have any.
Scala's type inference sometimes leaves much to be desired.
Is there any reason why you can't have the method inside your DFA trait?
def run[SQ <% Seq[S]](seq: SQ)
If you don't need the D param later, you can also try defining your method without it:
def runDFA[S, SQ <% Seq[S]](d: DFA[S])(seq: SQ) = ...
Some useful info on how the two differs :
From the the shapeless guide:
Without type parameters you cannot make dependent types , for example
trait Generic[A] {
type Repr
def to(value: A): Repr
def from(value: Repr): A
}
import shapeless.Generic
def getRepr[A](value: A)(implicit gen: Generic[A]) =
gen.to(value)
Here the type returned by to depends on the input type A (because the supplied implicit depends on A):
case class Vec(x: Int, y: Int)
case class Rect(origin: Vec, size: Vec)
getRepr(Vec(1, 2))
// res1: shapeless.::[Int,shapeless.::[Int,shapeless.HNil]] = 1 :: 2 ::
HNil
getRepr(Rect(Vec(0, 0), Vec(5, 5)))
// res2: shapeless.::[Vec,shapeless.::[Vec,shapeless.HNil]] = Vec(0,0)
:: Vec(5,5) :: HNil
without type members this would be impossible :
trait Generic2[A, Repr]
def getRepr2[A, R](value: A)(implicit generic: Generic2[A, R]): R =
???
We would have had to pass the desired value of Repr to getRepr as a
type parameter, effec vely making getRepr useless. The intui ve
take-away from this is that type parameters are useful as “inputs” and
type members are useful as “outputs”.
please see the shapeless guide for details.