Covariant Trees in Scala

Covariant Trees in Scala - scala

Looking at scalaz.Tree[A], it is invariant in A. I'm looking for a multi-way tree which I can dump values of a hierarchy in
E.g. if I have an ADT of
trait MyThing
case object Thing1 extends MyThing
case object Thing2 extends MyThing
And I want a Tree of MyThings, I cant use this without doing a cast to MyThing in scalaz;
import scalaz.{Scalaz, Tree}
import Scalaz._
val tree = Thing1.asInstanceOf[MyThing].
node(Thing2.asInstanceOf[MyThing].leaf)
Which is a bit of a pain.
Is there are covariant [+A] version of Tree?
Why is Tree invariant in the first place?

First of all, I want to second Huw's recommendation that you use a type ascription instead of downcasting with asInstanceOf. As Huw says, using a type ascription will fail at compile time instead of runtime if something changes in your type hierarchy that makes the cast invalid. It's also just a good practice to avoid asInstanceOf for any upcasts. You can use asInstanceOf for either upcasting or downcasting, but only using it for downcasting makes it easy to identify unsafe casting in your code.
To answer your two questions—no, there's not a covariant tree type in Scalaz, for reasons discussed in detail in the pull request linked by Huw above. At first this may seem like a huge inconvenience, but the decision to avoid non-invariant structures in Scalaz is related to a similar design decision—avoiding the use of subtyping in ADTs—that makes invariant trees, etc. less painful.
In other languages with good support for ADTs (like Haskell and OCaml, for example), the leaves of an ADT aren't subtypes of the ADT type, and Scala's somewhat unusual subtyping-based implementation can make type inference messy. The following is a common example of this problem:
scala> List(1, 2, 3).foldLeft(None)((_, i) => Some(i))
<console>:14: error: type mismatch;
found : Some[Int]
required: None.type
List(1, 2, 3).foldLeft(None)((_, i) => Some(i))
^
Because the type of the accumulator is inferred from the first argument to foldLeft, it ends up as None.type, which is pretty much useless. You have to provide a type ascription (or explicit type parameters for foldLeft), which can be pretty inconvenient.
Scalaz attempts to address this issue by promoting the use of ADT constructors that don't return the most specific subtype for the ADT leaf. For example, it includes none[A] and some[A](a: A) constructors for Option that return an Option[A].
(For more discussion of these issues, see my answer here and this related question).
In your case, implementing this approach could be as simple as writing the following:
val thing1: MyThing = Thing1
val thing2: MyThing = Thing2
Which allows you to write thing1.node(thing2.leaf). If you want to go further down this path, I'd highly recommend looking at Argonaut's Json ADT as a good example of ADT design that downplays the role of subtyping.

Related

How to express self-referential types using type definitions for existential (forSome) use case correctly in Scala

I'm facing issues due to two scala compiler limitations
It's not possible to refer to the type of this (this.type is not it for obv reasons) for cases where you want to write traits which implement common behavior which requires construction of a concrete type. The recommended pattern to work around this is
trait Foo[SelfType <: Foo[SelfType]] {
this: SelfType =>
final def foo: SelfType = newInstance(??? /*do some work*/)
protected def newInstance(i: Int): SelfType
}
Scala compiler (fixed in Dotty AFAIK) does not keep track of bounds in existential types (I've found at least 3 bugs filed with the scala github project going back from 2009 and on). This means, given the above type if I were to write a method that take some unknown Foo as such
def process(f: Foo[_])
I may get some weird buggy compiler behavior in certain cases. Instead, I need to manually express to the compiler that the _ here obeys the type bounds to work around its limitations. For that, I'd need to do something like
def process[F <: Foo[F]](f: Foo[F])
and invoke the method without passing any type params which hopefully captures this correctly. However, this can make def signatures quite cumbersome if Foo takes other type params leading to a lot of burden for clients. So my question is, is it possible to use type definitions as a shorthand to express such self-referential types correctly. E.g. I've tried all sorts of things.
type SomeFoo = Foo[SelfType] with SelfType forSome { type SelfType <: Foo[SelfType] }
but that doesn't seem to quite do it. Somehow the compiler doesn't realize that SelfType is the same unknown type SomeFoo.
I hope this made sense. Thanks for your help. P.S. feel free to suggest renaming the question as it's not super clear right now.

Why Scala doesn't deduce trait type parameters?

trait foo[F] {
def test: F
}
class ah extends foo[(Int,Int) => Int] {
def test = (i: Int,j: Int) => i+j
}
So the question is, why Scala known to be so smart about types cannot just deduce type (Int,Int) => Int from the type of test asking me instead to specify it? Or It is still possible? Or maybe this is backed by some thoughts that I don't have in mind.

Your question is basically "why does Scala have only local type inference" or equivalently "why does Scala not have non-local type inference". And the answer is: because the designers don't want to.
There are several reasons for this. One reason is that the most reasonable alternative to local type inference is global type inference, but that's impossible in Scala: Scala has separate compilation and modular type checking, so there simply is no point in time where the compiler has a global view of the entire program. In fact, Scala has dynamic code loading, which means that at compile time the entire code doesn't even need to exist yet! Global type inference is simply impossible in Scala. The best we could do is "whole-compilation-unit type inference", but that is undesirable, too: it would mean that whether or not you need type annotations depends on whether or not you compile your code in multiple units or just one.
Another important reason is that type annotations at module boundaries serve as a double-check, kind of like double-entry book keeping for types. Note that even in languages with global type inference like Haskell, it is strongly recommended to put type annotations on module boundaries and public interfaces.
A third reason is that global type inference can sometimes lead to confusing error messages, when instead of the type checker failing at a type annotation, the type inferencer happily chugs along inferring increasingly non-sensical types, until it finally gives up at a location far away from the actual error (which might just be a simple typo) with a type error that is only tangentially related to the original types at the error site. This can sometimes happen in Haskell, for example. The Scala designers value helpful error messages so much that they are willing to sacrifice language features unless they can figure out how to implement them with good error messages. (Note that even with Scala's very limited type inference, you can get such "helpful" messages as "expected Foo got Product with Serializable".)
However, what Scala's local type inference can do in this example, is to work in the other direction, and infer the parameter types of the anonymous function from the return type of test:
trait foo[F] {
def test: F
}
class ah extends foo[(Int, Int) ⇒ Int] {
def test = (i, j) ⇒ i + j
}
(new ah) test(2, 3) //=> 5

For you particular example, inferring type parameters based on inheritance can be ambiguous:
trait A
trait B extends A
trait Foo[T] {
val x: T
}
trait Bar extends Foo[?] {
val x: B
}
The type that could go in the ? could be either A or B. This is a general example of where Scala will not infer types that could be ambiguous.
Now, you are correct to observe that there is a similarity between
class Foo[T](x: T)
and
trait Foo[T] { x: T }
I have seen work by some into possibly generalizing the similarity (but I can't seem to find it right now). That could, in theory, allow type inference of type parameters based on a member. But I don't know if it will ever get to that point.

Why do we have to explicitly specify the ClassTag typeclass

Now that scala has iterated towards a JVM type erasure fix with the ClassTag typeclass, why is it an opt-in, rather than having the compiler always capture the type signature for runtime inspection. Having it an implicit parametrized type constraint would make it possible to call classTag[T] regardless of the generic parameter declaration.
EDIT: I should clarify that I don't mean that scala should change the signature behind the scenes to always inlcude ClassTag. Rather I mean that since ClassTag shows that scala can capture runtime Type information and therefore avoid type erasure limitations, why can't that capture be implicit as part of the compiler so that that information is always available in scala code?
My suspicion is that it's backwards compatibility, java ecosystem compatibility, binary size or runtime overhead related, but those are just speculation.

Backwards compatibility would be completely destroyed, really. If you have a simple method like:
def foo[A](a: A)(implicit something: SomeType) = ???
Then suppose that in the next version of Scala, the compiler suddenly added implicit ClassTags to the signature of all methods with type parameters. This method would be broken. Anywhere it was being explicitly called like foo(a)(someTypeValue) wouldn't work anymore. Binary and source compatibility would be gone.
Java interoperability would be ugly. Supposing our method now looks like this:
def foo[A : ClassTag](a: A) = ???
Because ClassTags are generated by the Scala compiler, using this method from Java would prove more difficult. You'd have to create the ClassTag yourself.
ClassTag<MyClass> tag = scala.reflect.ClassTag$.MODULE$.apply(MyClass.class);
foo(a, tag);
My Java might not be 100% correct, but you get the idea. Anything parameterized would become very ugly. Well, it already is if it requires an implicit ClassTag, but the class of methods where this would be necessary would increase dramatically.
Moreover, type erasure isn't that much of a problem in most parameterized methods we (I, at least) use. I think automatically requiring a ClassTag for each type parameter would be far more trouble than it would help, for the above reasons.
Certainly this would add more compiler overhead, as it would need to generate more ClassTags than it usually would. I don't think it would add much more runtime overhead unless the ClassTag makes a difference. For example, in a simple method like below, the ClassTag doesn't really do anything:
def foo[A : ClassTag](a: A): A = a
We should also note that they're not perfect, either. So adding them isn't an end-all solution to erasure problems.
val list = List(1, "abc", List(1, 2, 3), List("a", "b"))
def find[A: ClassTag](l: List[Any]): Option[A] =
l collectFirst { case a: A => a }
scala> find[List[String]]
res2: Option[List[String]] = Some(List(1, 2, 3)) // Not quite! And no warnings, either.
Adding a ClassTag to every single class instance would add overhead, and surely also break compatibility. It's also in many places not possible. We can't just infuse java.lang.String with a ClassTag. Furthermore, we'd still be just as susceptible to erasure. Having a ClassTag field in each class is really no better than using getClass. We could make comparisons like
case a if(a.getClass == classOf[String]) => a.asInstanceOf[String]
But this is horribly ugly, requires a cast, and isn't necessarily what the ClassTag is meant to fix. If I tried something like this with my find method, it wouldn't work--at all.
// Can't compile
def find[A](l: List[Any]): Option[A] =
l collectFirst { case a if(a.getClass == classOf[A]) => a.asInstanceOf[A] }
Even if I were to fashion this to somehow work with ClassTag, where would it come from? I could not say a.classTag == classTag[A], because A has already been erased. I need the ClassTag at the method call site.

Yes, you would penalise any generic method or class for a use case that is quite seldom (e.g. requiring array construction or heterogeneous map value recovery). With this idea of "always class tags" you would also effectively destroy the possibility to call Scala code from Java. In summery, it simply doesn't make any sense to require a class tag to be always present, neither from compatibility, performance or class size point of view.

What can I use scala.Singleton for?

To clarify: I am NOT asking what I can use a Singleton design pattern for. The question is about largely undocumented trait provided in scala.
What is this trait for? The only concrete use case I could find so far was to limit a trait to objects only, as seen in this question: Restricting a trait to objects?
This question sheds some light on the issue Is scala.Singleton pure compiler fiction?, but clearly there was another use case as well!
Is there some obvious use that I can't think of, or is it just mainly compiler magicks?

I think the question is answered by Martin Odersky's comment on the mailing list thread linked to from the linked question:
The type Singleton is essentially an encoding trick for existentials
with values. I.e.
T forSome { val x: T }
is turned into
[x.type := X] T forSome { type X <: T with Singleton }
Singleton types are usually not used directly…
In other words, there is no intended use beyond guiding the typer phase of the compiler. The Scala Language Specification has this bit in §3.2.10, also §3.2.1 indicates that this trait might be used by the compiler to declare that a type is stable.
You can also see this with the following (Scala 2.11):
(new {}).isInstanceOf[Singleton]
<console>:54: warning: fruitless type test: a value of type AnyRef cannot also
be a Singleton
(new {}).isInstanceOf[Singleton]
^
res27: Boolean = true
So you cannot even use that trait in a meaningful test.
(This is not a definite answer, just my observation)

Any reason why scala does not explicitly support dependent types?

There are path dependent types and I think it is possible to express almost all the features of such languages as Epigram or Agda in Scala, but I'm wondering why Scala does not support this more explicitly like it does very nicely in other areas (say, DSLs) ?
Anything I'm missing like "it is not necessary" ?

Syntactic convenience aside, the combination of singleton types, path-dependent types and implicit values means that Scala has surprisingly good support for dependent typing, as I've tried to demonstrate in shapeless.
Scala's intrinsic support for dependent types is via path-dependent types. These allow a type to depend on a selector path through an object- (ie. value-) graph like so,
scala> class Foo { class Bar }
defined class Foo
scala> val foo1 = new Foo
foo1: Foo = Foo#24bc0658
scala> val foo2 = new Foo
foo2: Foo = Foo#6f7f757
scala> implicitly[foo1.Bar =:= foo1.Bar] // OK: equal types
res0: =:=[foo1.Bar,foo1.Bar] = <function1>
scala> implicitly[foo1.Bar =:= foo2.Bar] // Not OK: unequal types
<console>:11: error: Cannot prove that foo1.Bar =:= foo2.Bar.
implicitly[foo1.Bar =:= foo2.Bar]
In my view, the above should be enough to answer the question "Is Scala a dependently typed language?" in the positive: it's clear that here we have types which are distinguished by the values which are their prefixes.
However, it's often objected that Scala isn't a "fully" dependently type language because it doesn't have dependent sum and product types as found in Agda or Coq or Idris as intrinsics. I think this reflects a fixation on form over fundamentals to some extent, nevertheless, I'll try and show that Scala is a lot closer to these other languages than is typically acknowledged.
Despite the terminology, dependent sum types (also known as Sigma types) are simply a pair of values where the type of the second value is dependent on the first value. This is directly representable in Scala,
scala> trait Sigma {
| val foo: Foo
| val bar: foo.Bar
| }
defined trait Sigma
scala> val sigma = new Sigma {
| val foo = foo1
| val bar = new foo.Bar
| }
sigma: java.lang.Object with Sigma{val bar: this.foo.Bar} = $anon$1#e3fabd8
and in fact, this is a crucial part of the encoding of dependent method types which is needed to escape from the 'Bakery of Doom' in Scala prior to 2.10 (or earlier via the experimental -Ydependent-method types Scala compiler option).
Dependent product types (aka Pi types) are essentially functions from values to types. They are key to the representation of statically sized vectors and the other poster children for dependently typed programming languages. We can encode Pi types in Scala using a combination of path dependent types, singleton types and implicit parameters. First we define a trait which is going to represent a function from a value of type T to a type U,
scala> trait Pi[T] { type U }
defined trait Pi
We can than define a polymorphic method which uses this type,
scala> def depList[T](t: T)(implicit pi: Pi[T]): List[pi.U] = Nil
depList: [T](t: T)(implicit pi: Pi[T])List[pi.U]
(note the use of the path-dependent type pi.U in the result type List[pi.U]). Given a value of type T, this function will return a(n empty) list of values of the type corresponding to that particular T value.
Now let's define some suitable values and implicit witnesses for the functional relationships we want to hold,
scala> object Foo
defined module Foo
scala> object Bar
defined module Bar
scala> implicit val fooInt = new Pi[Foo.type] { type U = Int }
fooInt: java.lang.Object with Pi[Foo.type]{type U = Int} = $anon$1#60681a11
scala> implicit val barString = new Pi[Bar.type] { type U = String }
barString: java.lang.Object with Pi[Bar.type]{type U = String} = $anon$1#187602ae
And now here is our Pi-type-using function in action,
scala> depList(Foo)
res2: List[fooInt.U] = List()
scala> depList(Bar)
res3: List[barString.U] = List()
scala> implicitly[res2.type <:< List[Int]]
res4: <:<[res2.type,List[Int]] = <function1>
scala> implicitly[res2.type <:< List[String]]
<console>:19: error: Cannot prove that res2.type <:< List[String].
implicitly[res2.type <:< List[String]]
^
scala> implicitly[res3.type <:< List[String]]
res6: <:<[res3.type,List[String]] = <function1>
scala> implicitly[res3.type <:< List[Int]]
<console>:19: error: Cannot prove that res3.type <:< List[Int].
implicitly[res3.type <:< List[Int]]
(note that here we use Scala's <:< subtype-witnessing operator rather than =:= because res2.type and res3.type are singleton types and hence more precise than the types we are verifying on the RHS).
In practice, however, in Scala we wouldn't start by encoding Sigma and Pi types and then proceeding from there as we would in Agda or Idris. Instead we would use path-dependent types, singleton types and implicits directly. You can find numerous examples of how this plays out in shapeless: sized types, extensible records, comprehensive HLists, scrap your boilerplate, generic Zippers etc. etc.
The only remaining objection I can see is that in the above encoding of Pi types we require the singleton types of the depended-on values to be expressible. Unfortunately in Scala this is only possible for values of reference types and not for values of non-reference types (esp. eg. Int). This is a shame, but not an intrinsic difficulty: Scala's type checker represents the singleton types of non-reference values internally, and there have been a couple of experiments in making them directly expressible. In practice we can work around the problem with a fairly standard type-level encoding of the natural numbers.
In any case, I don't think this slight domain restriction can be used as an objection to Scala's status as a dependently typed language. If it is, then the same could be said for Dependent ML (which only allows dependencies on natural number values) which would be a bizarre conclusion.

I would assume it is because (as I know from experience, having used dependent types in the Coq proof assistant, which fully supports them but still not in a very convenient way) dependent types are a very advanced programming language feature which is really hard to get right - and can cause an exponential blowup in complexity in practice. They're still a topic of computer science research.

I believe that Scala's path-dependent types can only represent Σ-types, but not Π-types. This:
trait Pi[T] { type U }
is not exactly a Π-type. By definition, Π-type, or dependent product, is a function which result type depends on argument value, representing universal quantifier, i.e. ∀x: A, B(x). In the case above, however, it depends only on type T, but not on some value of this type. Pi trait itself is a Σ-type, an existential quantifier, i.e. ∃x: A, B(x). Object's self-reference in this case is acting as quantified variable. When passed in as implicit parameter, however, it reduces to an ordinary type function, since it is resolved type-wise. Encoding for dependent product in Scala may look like the following:
trait Sigma[T] {
val x: T
type U //can depend on x
}
// (t: T) => (∃ mapping(x, U), x == t) => (u: U); sadly, refinement won't compile
def pi[T](t: T)(implicit mapping: Sigma[T] { val x = t }): mapping.U
The missing piece here is an ability to statically constraint field x to expected value t, effectively forming an equation representing the property of all values inhabiting type T. Together with our Σ-types, used to express the existence of object with given property, the logic is formed, in which our equation is a theorem to be proven.
On a side note, in real case theorem may be highly nontrivial, up to the point where it cannot be automatically derived from code or solved without significant amount of effort. One can even formulate Riemann Hypothesis this way, only to find the signature impossible to implement without actually proving it, looping forever or throwing an exception.

The question was about using dependently typed feature more directly and, in my opinion,
there would be a benefit in having a more direct dependent typing approach than what Scala offers.
Current answers try to argue the question on type theoretical level.
I want to put a more pragmatic spin on it.
This may explain why people are divided on the level of support of dependent types in the Scala language. We may have somewhat different definitions in mind. (not to say one is right and one is wrong).
This is not an attempt to answer the question how easy would it be to turn
Scala into something like Idris (I imagine very hard) or to write a library
offering more direct support for Idris like capabilities (like singletons tries to be in Haskell).
Instead, I want to emphasize the pragmatic difference between Scala and a language like Idris.
What are code bits for value and type level expressions?
Idris uses the same code, Scala uses very different code.
Scala (similarly to Haskell) may be able to encode lots of type level calculations.
This is shown by libraries like shapeless.
These libraries do it using some really impressive and clever tricks.
However, their type level code is (currently) quite different from value level expressions
(I find that gap to be somewhat closer in Haskell). Idris allows to use value level expression on the type level AS IS.
The obvious benefit is code reuse (you do not need to code type level expressions
separately from value level if you need them in both places). It should be way easier to
write value level code. It should be easier to not have to deal with hacks like singletons (not to mention performance cost). You do not need to learn two things you learn one thing.
On a pragmatic level, we end up needing fewer concepts. Type synonyms, type families, functions, ... how about just functions? In my opinion, this unifying benefits go much deeper and are more than syntactic convenience.
Consider verified code. See:
https://github.com/idris-lang/Idris-dev/blob/v1.3.0/libs/contrib/Interfaces/Verified.idr
Type checker verifies proofs of monadic/functor/applicative laws and the
proofs are about actual implementations of monad/functor/applicative and not some encoded
type level equivalent that may be the same or not the same.
The big question is what are we proving?
The same can me done using clever encoding tricks (see the following for Haskell version, I have not seen one for Scala)
https://blog.jle.im/entry/verified-instances-in-haskell.html
https://github.com/rpeszek/IdrisTddNotes/wiki/Play_FunctorLaws
except the types are so complicated that it is hard to see the laws, the value
level expressions are converted (automatically but still) to type level things and
you need to trust that conversion as well.
There is room for error in all of this which kinda defies the purpose of compiler acting as
a proof assistant.
(EDITED 2018.8.10) Talking about proof assistance, here is another big difference between Idris and Scala. There is nothing in Scala (or Haskell) that can prevent from writing diverging proofs:
case class Void(underlying: Nothing) extends AnyVal //should be uninhabited
def impossible() : Void = impossible()
while Idris has total keyword preventing code like this from compiling.
A Scala library that tries to unify value and type level code (like Haskell singletons) would be an interesting test for Scala's support of dependent types. Can such library be done much better in Scala because of path-dependent types?
I am too new to Scala to answer that question myself.