Primitive types are not traited Immutable in scala?

Can anyone please share insight into the trait "Immutable" in scala? At first glance I thought this would be a nice control structure to limit a class I'm building, but oddly I noticed that primitive types do not extend this. Is there a reason for this? Is there a way to bind the syntax to Immutable or AnyVal?
class Test {
  def test[T <: Immutable](x: T) = {
    println("passes " + x)
  }
  case class X(s: String) extends Immutable
  test(X("hello")) // passes
  // test("fail") - does not pass compiler
}

The only direct subtypes of Immutable in the Scala core library are:
collection.immutable.Traversable
collection.parallel.immutable.ParIterable
Nothing else refers to Immutable at all.
Immutable hasn't been changed since it was added in 2009 in Martin Odersky's "massive new collections checkin". I'm searching through that commit, and it looks like Immutable was never even used as a bound when it was first introduced either.
Honestly, I doubt there's much intent behind these traits anymore. Odersky probably planned to use Immutable to bound the type arguments on immutable collections, and then thought better of it. But that's just my speculation.
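As for the second half of the question (binding to AnyVal instead), here is a minimal sketch, assuming an upper bound of AnyVal is acceptable for your use case. The names AnyValBound and describe are made up for illustration. Note that this bound says "value type", not "immutable", so it is not a substitute for the Immutable marker trait.

```scala
object AnyValBound {
  // Accepts only subtypes of AnyVal: Int, Boolean, Double, ... and user-defined value classes.
  def describe[T <: AnyVal](x: T): String = "passes " + x

  val fromInt: String = describe(5)
  val fromBoolean: String = describe(true)

  // describe("fail") // does not compile: String is an AnyRef, not an AnyVal
}
```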

So-called primitive types (Boolean, Byte, Char, Short, Int, Long, Float, Double) are intrinsically immutable. 5 is 5 is 5. You cannot do anything to 5 to turn it into anything that is not 5.
Otherwise, immutability is a property of how a value is stored. If stored in a var, that var may be replaced freely with a new value (of a compatible type). By extension, constructed types (classes, traits and objects) may be either immutable or mutable depending on whether they allow any of their internal state to be altered following construction.
Java's String (also used as Scala's String) is immutable.
However, none of this has anything to do with your example, since you did not demonstrate mutability. You simply showed what happens when one applies the + method of one value to another value.
While it is certainly possible that one can implement a + method that mutates its (apparent) left-hand operand, one rarely does that. If there's a need for that kind of mutation, one would conventionally define the += method instead.
+ is somewhat special in that it may be applied to any value (if the argument / right-hand operand is a String) by virtue of an implicit conversion to a special class that defines +(s: String) so that the string concatenation interpretation of + may be applied. In other words, if you write e1 + "e2" and the type of the expression e1 does not define +, then Scala will convert e1 to String and concatenate it with "e2".
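To make the + versus += convention concrete, here is a small sketch. Money and Tally are hypothetical classes invented for this illustration: the first is immutable and its + returns a new value, the second holds a var and its += alters state in place.

```scala
object PlusDemo {
  // Immutable: + builds a new Money and leaves both operands untouched.
  final case class Money(cents: Long) {
    def +(other: Money): Money = Money(cents + other.cents)
  }

  // Mutable: += changes the internal state of the receiver.
  final class Tally(private var total: Long) {
    def +=(n: Long): Unit = total += n
    def value: Long = total
  }

  val a: Money = Money(100)
  val b: Money = a + Money(50) // a is still Money(100)

  val tally: Tally = new Tally(0)
  tally += 5 // same object, new internal state
}
```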

Related

Disfunctionality of type parameter

I’m new to using Scala and am trying to see if a list contains any objects of a certain type.
When I make a method to do this, I get the following results:
var l = List("Some string", 3)
def containsType[T] = l.exists(_.isInstanceOf[T])
containsType[Boolean] // val res0: Boolean = true
l.exists(_.isInstanceOf[Boolean]) // val res1: Boolean = false
Could someone please help me understand why my method doesn’t return the same results as the expression on the last line?
Thank you,
Johan
Alin's answer details perfectly why the generic isn't available at runtime. You can get a bit closer to what you want with the magic of ClassTag, but you still have to be conscious of some issues with Java generics.
import scala.reflect.ClassTag
var l = List("Some string", 3)
def containsType[T](implicit cls: ClassTag[T]): Boolean = {
  l.exists(cls.runtimeClass.isInstance(_))
}
Now, whenever you call containsType, a hidden extra argument of type ClassTag[T] gets passed to it. So when you write, for instance, println(containsType[String]), then this gets compiled to
scala.this.Predef.println($anon.this.containsType[String](ClassTag.apply[String](classOf[java.lang.String])))
An extra argument gets passed to containsType, namely ClassTag.apply[String](classOf[java.lang.String]). That's a really long-winded way of explicitly passing a Class<String>, which is what you'd have to do manually in Java. And java.lang.Class has an isInstance method.
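For comparison, this is roughly what the "manual" version looks like when the Class is passed explicitly instead of through a ClassTag. It is only a sketch: ExplicitClass and containsClass are names made up here, and the list is typed List[Any] for clarity.

```scala
object ExplicitClass {
  val l: List[Any] = List("Some string", 3)

  // The caller supplies the runtime class by hand, as one would in Java.
  def containsClass(cls: Class[_]): Boolean = l.exists(x => cls.isInstance(x))

  val hasString: Boolean = containsClass(classOf[String])
  val hasBoolean: Boolean = containsClass(classOf[java.lang.Boolean])
}
```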
Now, this will mostly work, but there are still major caveats. Generic type arguments are completely erased at runtime, so this won't help you distinguish between an Option[Int] and an Option[String] in your list, for instance. As far as the JVM is concerned, they're both Option.
Second, Java has an unfortunate history with primitive types, so containsType[Int] will actually be false in your case, despite the fact that the 3 in your list is actually an Int. This is because, in Java, generics can only be class types, not primitives, so a generic List can never contain int (note the lowercase 'i', this is considered a fundamentally different thing in Java than a class).
Scala paints over a lot of these low-level details, but the cracks show through in situations like this. Scala sees that you're constructing a list of Strings and Ints, so it wants to construct a list of the common supertype of the two, which is Any (strings and ints have no common supertype more specific than Any). At runtime, Scala Int can translate to either int (the primitive) or Integer (the object). Scala will favor the former for efficiency, but when storing in generic containers, it can't use a primitive type. So while Scala thinks that your list l contains a String and an Int, Java thinks that it contains a String and a java.lang.Integer. And to make things even crazier, both int and java.lang.Integer have distinct Class instances.
So the runtime class of summon[ClassTag[Int]] in Scala is java.lang.Integer.TYPE, which is a Class<Integer> instance representing the primitive type int (yes, the non-class type int has a Class instance representing it), while the runtime class of summon[ClassTag[java.lang.Integer]] is classOf[java.lang.Integer] (Integer.class in Java), a distinct Class<Integer> representing the non-primitive type Integer. And at runtime, your list contains the latter.
In summary, generics in Java are a hot mess. Scala does its best to work with what it has, but when you start playing with reflection (which ClassTag does), you have to start thinking about these problems.
println(containsType[Boolean]) // false
println(containsType[Double]) // false
println(containsType[Int]) // false (list can't contain primitive type)
println(containsType[Integer]) // true (3 is converted to an Integer)
println(containsType[String]) // true (class type so it works the way you expect)
println(containsType[Unit]) // false
println(containsType[Long]) // false
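The two distinct Class instances behind Int and java.lang.Integer can be inspected directly. The sketch below uses implicitly so that it compiles on Scala 2 as well as Scala 3; BoxedTags and its member names are made up for the example.

```scala
import scala.reflect.ClassTag

object BoxedTags {
  // Runtime class behind ClassTag[Int]: the primitive int.
  val primitive: Class[_] = implicitly[ClassTag[Int]].runtimeClass
  // Runtime class behind ClassTag[java.lang.Integer]: the boxed wrapper.
  val boxed: Class[_] = implicitly[ClassTag[java.lang.Integer]].runtimeClass

  // Stored as Any, the 3 is boxed, just like an element of List[Any].
  val elem: Any = 3

  val primitiveIsIntType: Boolean = primitive == java.lang.Integer.TYPE
  val distinctClasses: Boolean = primitive != boxed
  val elemIsBoxed: Boolean = boxed.isInstance(elem)
  // isInstance on a primitive Class returns false for every object.
  val elemIsPrimitive: Boolean = primitive.isInstance(elem)
}
```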
Scala uses the type erasure model of generics. This means that no information about type arguments is kept at runtime, so there's no way to determine at runtime the specific type arguments of the given List object. All the system can do is determine that a value is a List of some arbitrary type parameters.
You can verify this behavior by trying any List concrete type:
val l = List("Some string", 3)
println(l.isInstanceOf[List[Int]]) // true
println(l.isInstanceOf[List[String]]) // true
println(l.isInstanceOf[List[Boolean]]) // also true
println(l.isInstanceOf[List[Unit]]) // also true
Now regarding your example:
def containsType[T] = l.exists(_.isInstanceOf[T])
println(containsType[Int]) // true
println(containsType[Boolean]) // also true
println(containsType[Unit]) // also true
println(containsType[Double]) // also true
isInstanceOf is a synthetic function (a function generated by the Scala compiler at compile-time, usually to work around the underlying JVM limitations) and does not work the way you would expect with generic type arguments like T, because after compilation this would normally be equivalent in Java to instanceof T, which, by the way, is illegal in Java.
Why is it illegal? Because of type erasure. Type erasure means all your generic code (generic classes, generic methods, etc.) is converted to non-generic code. This usually means 3 things:
all type parameters in generic types are replaced with their bounds or Object if they are unbounded;
wherever necessary the compiler inserts type casts to preserve type-safety;
bridge methods are generated if needed to preserve polymorphism of all generic methods.
However, in the case of instanceof T, the JVM cannot differentiate between types of T at execution time, so this makes no sense. The type used with instanceof has to be reifiable, meaning that all information about the type needs to be available at runtime. This property does not apply to generic types.
So if Java forbids this because it can't work, why does Scala even allow it? The Scala compiler is indeed more permissive here, but for one good reason: it treats it differently. Like the Java compiler, the Scala compiler also erases all generic code at compile-time, but since isInstanceOf is a synthetic function in Scala, calls to it using generic type arguments such as isInstanceOf[T] are replaced during compilation with instanceof Object.
Here's a sample of your code decompiled:
public <T> boolean containsType() {
  return this.l().exists(x$1 -> BoxesRunTime.boxToBoolean(x$1 instanceof Object));
}
Main$.l = (List<Object>)package$.MODULE$.List().apply((Seq)ScalaRunTime$.MODULE$.wrapIntArray(new int[] { 1, 2, 3 }));
Predef$.MODULE$.println((Object)BoxesRunTime.boxToBoolean(this.containsType()));
Predef$.MODULE$.println((Object)BoxesRunTime.boxToBoolean(this.containsType()));
This is why, no matter what type you give to the polymorphic function containsType, it will always result in true. Basically, containsType[T] is equivalent to containsType[_] from Scala's perspective, which actually makes sense because a generic type T, without any upper bounds, is just a placeholder for type Any in Scala. Because Scala cannot have raw types, you cannot, for example, create a List without providing a type parameter, so every List must be a List of "something", and that "something" is at least an Any, if not given a more specific type.
Therefore, isInstanceOf can only be called with specific (concrete) type arguments like Boolean, Double, String, etc. That is why this works as expected:
println(l.exists(_.isInstanceOf[Boolean])) // false
We said that Scala is more permissive, but that does not mean you get away without a warning.
To alert you to the possibly non-intuitive runtime behavior, the Scala compiler usually emits unchecked warnings. For example, if you had run your code in the Scala interpreter (or compiled it using scalac), you would have received this:

Scala Nothing datatype

I know Scala Nothing is the bottom type. When I see the API it extends from "Any" which is the top in the hierarchy.
Now since Scala does not support multiple inheritance, how can we say that it is the bottom type. In other words it is not inheriting directly all the classes or traits like Seq, List, String, Int and so on. If that is the case how can we say that it is the bottom of all type ?
What I meant is that if we are able to assign List[Nothing] (Nil) to List[String] as List is covariant in scala how it is possible because there is no direct correlation between Nothing and String type. As we know Nothing is a bottom type but I am having little difficulty in seeing the relation between String and Nothing as I stated in the above example.
Thanks & Regards,
Mohamed
tl;dr summary: Nothing is a subtype of every type because the spec says so. It cannot be explained from within the language. Every language (or at least almost every language) has some things at the very core that cannot be explained from within the language, e.g. java.lang.Object having no superclass even though every class has a superclass, since even if we don't write an extends clause, the class will implicitly get a superclass. Or the "bootstrap paradox" in Ruby, Object being an instance of Class, but Class being a subclass of Object, and thus Object being an indirect instance of itself (and even more directly: Class being an instance of Class).
I know Scala Nothing is the bottom type. When I see the API it extends from "Any" which is the top in the hierarchy.
Now since Scala does not support multiple inheritance, how can we say that it is the bottom type.
There are two possible answers to this.
The simple and short answer is: because the spec says so. The spec says Nothing is a subtype of all types, so Nothing is the subtype of all types. How? We don't care. The spec says it is so, so that's what it is. Let the compiler designers worry about how to represent this fact within their compiler. Do you care how Any is able to have no superclass? Do you care how def is represented internally in the compiler?
The slightly longer answer is: Yes, it's true, Nothing inherits from Any and only from Any. But! Inheritance is not the same thing as subtyping. In Scala, inheritance and subtyping are closely tied together, but they are not the same thing. The fact that Nothing can only inherit from one class does not mean that it cannot be the subtype of more than one type. A type is not the same thing as a class.
In fact, to be very specific, the spec does not even say that Nothing is a subtype of all types. It only says that Nothing conforms to all types.
In other words it is not inheriting directly all the classes or traits like Seq, List, String, Int and so on. If that is the case how can we say that it is the bottom of all type ?
Again, we can say that, because the spec says we can say that.
How can we say that def defines a method? Because the spec says so. How can we say that a b c means the same thing as a.b(c) and a b_: c means the same thing as { val __some_unforgeable_id__ = a; c.b_:(__some_unforgeable_id__) }? Because the spec says so. How can we say that "" is a string and '' is a character? Because the spec says so.
What I meant is that if we are able to assign List[Nothing] (Nil) to List[String] as List is covariant in scala how it is possible because there is no direct correlation between Nothing and String type.
Yes, there is a direct correlation between the types Nothing and String. Nothing is a subtype of String because Nothing is a subtype of all types, including String.
As we know Nothing is a bottom type but I am having little difficulty in seeing the relation between String and Nothing as I stated in the above example.
The relation between String and Nothing is that Nothing is a subtype of String. Why? Because the spec says so.
The compiler knows Nothing is a subtype of String the same way it knows 1 is an instance of Int and has a + method, even though if you look at the source code of the Scala standard library, the Int class is actually abstract and all its methods have no implementation.
Someone, somewhere wrote some code within the compiler that knows how to handle adding two numbers, even though those numbers are actually represented as JVM primitives and don't even exist inside the Scala object system. The same way, someone, somewhere wrote some code within the compiler that knows that Nothing is a subtype of all types even though this fact is not represented (and is not even representable) in the source code of Nothing.
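You can watch the compiler apply this rule. The sketch below (BottomType, fail and parsePositive are hypothetical names) relies only on Nothing conforming to every type and on List being covariant:

```scala
object BottomType {
  // Nil has type List[Nothing]; covariance plus Nothing <: String makes this assignment legal.
  val empty: List[Nothing] = Nil
  val strings: List[String] = empty
  val ints: List[Int] = empty

  // A method that never returns normally has result type Nothing ...
  def fail(msg: String): Nothing = throw new IllegalStateException(msg)

  // ... so it can stand in wherever an Int (or any other type) is expected.
  def parsePositive(n: Int): Int = if (n > 0) n else fail("not positive: " + n)
}
```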
Now since Scala does not support multiple inheritance
Scala does support multiple inheritance, using trait mixin. This is currently not commutative, i.e. the type A with B is not identical with B with A (this will happen with Dotty), but still it's a form of multiple inheritance, and indeed one of Scala's strong points, as it solves the diamond problem through its linearisation rules.
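A short sketch of the linearisation at work, with made-up traits Base, A and B. Both mixin orders compile, but super resolves differently depending on the order, which is why A with B and B with A are not interchangeable:

```scala
object Linearisation {
  trait Base { def name: String = "Base" }
  trait A extends Base { override def name: String = "A>" + super.name }
  trait B extends Base { override def name: String = "B>" + super.name }

  // The rightmost trait comes first in the linearisation, so its override wins.
  class AB extends A with B
  class BA extends B with A

  val ab: String = new AB().name
  val ba: String = new BA().name
}
```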
By the way, Null is another bottom type, inherited from Java (which could also be said to have a Nothing bottom type because you can throw a runtime exception in any possible place).
I think you need to distinguish between class inheritance and type bounds. There is no contradiction in defining Nothing as a bottom type, although it does not "explicitly" inherit from any type you want, such as List. It's more like a capability, the capability to throw an exception.
if we are able to assign List[Nothing] (Nil) to List[String] as List is covariant in scala how it is possible because there is no direct correlation between Nothing and String type
Yes, the idea of the bottom type is that Nothing is also (among many other things) a sub-type of String. So you can write
def foo: String = throw new Exception("No")
This only works because Nothing (the type of throwing an exception) is more specific than the declared return type String.

How to define a union type that works at runtime?

Following on from this excellent set of answers on how to define union types in Scala. I've been using the Miles Sabin definition of Union types, but one question remains.
How do you work with these if the type isn't known until runtime? For example:
trait inv[-A] {}
type Or[A, B] = {
  type check[X] = (inv[A] with inv[B]) <:< inv[X]
}
case class Foo[A : (Int Or String)#check](a: A)
Foo(1) // Foo[Int] = Foo(1)
Foo("hi") // Foo[String] = Foo(hi)
Foo(2.0) // Error!
This example works since the parameter A is known at compile time, and calling Foo(1) is really calling Foo[Int](1). However, what do you do if parameter A isn't known until runtime? Maybe you're parsing a file that contains the data for Foo's, in which case the type parameter of Foo isn't known until you read the data. There's no easy way to set parameter A in this case.
The best solutions I've been able to come up with are:
Pattern Match on the data you've read and then create different Foo's based on that type. In my case this isn't feasible because my case-class actually contains dozens of union types, so there'd be hundreds of combinations of types to pattern match.
Cast the type you've just read to be (String or Int), so you have a single type to pass around, that passes the Type Class constraint when you create Foo with it. Then return Foo[_] instead. This puts the onus back on the Foo user to work out the type of each field (since they'll appear to be type Any), but at least it defers having to know the type until the field is actually used, in which case a pattern match seems more tractable.
The second solution looks like this:
def parseLine: Any // Parses data point, but can be either a String or
                   // Int, so returns Any.
def mkFoo: Foo[_] = {
  val a = parseLine.asInstanceOf[Int with String]
  Foo(a) // Passes type constraint now
}
In practice I've ended up using the second solution, but I'm wondering if there's something better I can do?
Another way to state the problem is: What does it mean to return a Union Type? Functions can only return a single type, and the trickery we use with Miles Sabin union types is only useful for the types you pass in, not for the types you return.
PS. For context, why this is a problem in my case is that I'm generating a set of case-classes from a Json schema file. Json naturally supports union types, so I would like to make my case classes reflect that too. This works great in one direction: users creating case-classes to be serialized out to Json. But it gets sticky in the other direction: users parsing Json files to have a set of populated case classes returned to them.
The "standard" Scala solution to this problem is to use an ordinary discriminated-union type (ie, to forego true union types altogether):
sealed trait Foo
case class IntFoo(x: Int) extends Foo
case class StringFoo(x: String) extends Foo
This reflects the fact that, as you observe, the particular type of the member is a runtime value; the JVM type-tag of the Foo instance provides this runtime value.
Miles Sabin's implementation of union types is very clever, but I'm not sure if it provides any practical benefit, because it only restricts the type of thing that can go into a Foo, but provides the user of a Foo with no computable version of that restriction, in the way a match provides you with a computable version of the sealed trait. In general, for a restriction to be useful, it needs two sides: a check that only the right things are put in, and an extractor (aka an eliminator) that allows the same right things to come out the other end.
Perhaps if you gave some explanation of why you're looking for a purer union type it would illuminate whether regular discriminated unions are sufficient or if you really need something more.
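A sketch of both sides of that restriction with the discriminated union: a constructor side that only lets Int and String in, and a match that gets them back out. mkFoo taking a raw Any and the describe function are assumptions made for this example, standing in for whatever your parser produces.

```scala
object DiscriminatedUnion {
  sealed trait Foo
  final case class IntFoo(x: Int) extends Foo
  final case class StringFoo(x: String) extends Foo

  // "Check" side: decide at runtime which case to build from an untyped value.
  def mkFoo(raw: Any): Option[Foo] = raw match {
    case i: Int    => Some(IntFoo(i))
    case s: String => Some(StringFoo(s))
    case _         => None // anything else is rejected
  }

  // "Extractor" side: the compiler checks that this match is exhaustive.
  def describe(foo: Foo): String = foo match {
    case IntFoo(x)    => "int " + x
    case StringFoo(x) => "string " + x
  }
}
```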
There's a reason every JSON parser for Scala requires well defined types into which the JSON will be converted, even if some fields have to be dropped: you cannot work with something you don't know the type of.
To give an example, say you have a, and maybe a is a String, maybe it's an Int, but you don't know what it is. What computation could you possibly make with a, not knowing its type? Why would your code compute the sum of all a's, for instance, if you didn't know in advance it was a number?
Generally, the answer to that is to perform user-provided data manipulation at runtime over data with unknown characteristics, as the user itself sees that it's a number and decides they want to know what the sum of that field is. Fine, but you are going the wrong way about it if so.
There is a well defined way to represent JSON data in Scala (and, for that matter, any data that has the same characteristics as JSON), which is using a hierarchy of classes. A json value may be a json object, array or one of a number of primitives. A json object contains a list of key/value pairs, whose keys are json strings and values are json values. And so on. This is easy to represent, and there are many libraries doing so already. In fact, there are so many that there's a project called Json4s which presents a unified API which can be used and is implemented by many of the aforementioned libraries.
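Such a hierarchy might look like the sketch below. This is not the Json4s API: the type names (JsonValue, JsonObject, and so on) and the sumField helper are invented here purely to illustrate the shape, including the "sum a field only where it is a number" case mentioned above.

```scala
object JsonAst {
  sealed trait JsonValue
  case object JsonNull extends JsonValue
  final case class JsonBool(value: Boolean) extends JsonValue
  final case class JsonNumber(value: Double) extends JsonValue
  final case class JsonString(value: String) extends JsonValue
  final case class JsonArray(items: List[JsonValue]) extends JsonValue
  final case class JsonObject(fields: List[(String, JsonValue)]) extends JsonValue

  // Sums the given key over all objects, skipping rows where it is absent or not a number.
  def sumField(rows: List[JsonValue], key: String): Double =
    rows
      .collect { case JsonObject(fields) => fields }
      .flatMap(_.collect { case (`key`, JsonNumber(n)) => n })
      .sum
}
```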
Things like the records which Miles Sabin's Shapeless library provide are intended to be used when the input doesn't have a well defined schema, but the program knows what it needs from that input. And, yes, the program might know what to do with a if it is an Int or a String, but not every possible value.
The next Scala 3 (mid 2020), based on Dotty, will implement the Union Type proposal from Sept. 2018.
You can see it in "a tour of Scala 3" (June 2019):
Union Types Provide ad-hoc combinations of types
Subsetting = Subtyping
No boxing overhead
case class UserName(name: String)
case class Password(hash: Hash)
def help(id: UserName | Password) = {
  val user = id match {
    case UserName(name) => lookupName(name)
    case Password(hash) => lookupPassword(hash)
  }
  ...
}
Union Types Work also with singleton types
Great for JS interop
type Command = "Click" | "Drag" | "KeyPressed"
def handleEvent(kind: Command) = kind match {
  case "Click" => MouseClick()
  case "Drag" => MoveTo()
  case "KeyPressed" => KeyPressed()
}

Is it reasonable, and is there a benefit to a Scala Symbol class that extends AnyVal?

It seems that one issue with scala.Symbol is that it is two objects, the Symbol and the String it is based on.
Why can this extra object not be eliminated by defining Sym something like:
class Sym private(val name: String) extends AnyVal {
  override def toString = "'" + name
}
object Sym {
  def apply(name: String) = new Sym(name.intern)
}
Admittedly the performance implications of object allocation are likely tiny, but comments from those with a deeper understanding of Scala would be illuminating. In particular, does the above provide efficient maps via equality by reference?
Another advantage of the simple 'Sym' above is in a map centric application where there are lots of string keys, but where the strings are naming many entirely different kinds of things, type safe Sym classes can be defined so that Maps will definitively show to the programmer, the compiler and refactoring tools what the key really is.
(Neither Symbol nor Sym can be extended, the former apparently by choice, and the latter because it extends AnyVal, but Sym is trivial enough to just duplicate with an appropriate name)
It is not possible to do Symbol as an AnyVal. The main benefit of Symbols over simple Strings is that Symbols are guaranteed to be interned, so you can test equality of symbols using a simple reference comparison instead of an expensive string comparison.
See the source code of Symbol. Equals is overridden and redefined to do a reference comparison using the eq method.
But unfortunately an AnyVal does not allow you to redefine equality. From the SIP-15 for user-defined value classes:
C may not define concrete equals or hashCode methods.
So while it would be extremely useful to have a way to redefine equality without incurring runtime overhead, it is unfortunately not possible.
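To see what equality you do get, here is a sketch of a Sym-like value class (a simplified variant of the one in the question, without interning, nested in an object called SymDemo for the example). Since equals cannot be overridden, comparisons fall back on the underlying String's equals rather than on reference identity:

```scala
object SymDemo {
  final class Sym(val name: String) extends AnyVal {
    override def toString: String = "'" + name
    // An `override def equals(...)` here would be rejected by the compiler.
  }

  // Two distinct String instances with the same contents.
  val first: Sym = new Sym("key")
  val second: Sym = new Sym(new String("key"))

  // Compared by String contents, not by reference.
  val sameSymbol: Boolean = first == second

  // Used as map keys, the wrappers are boxed, but lookup still works by contents.
  val table: Map[Sym, Int] = Map(first -> 1)
  val lookedUp: Option[Int] = table.get(second)
}
```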
Edit: never use string.intern in any program where performance is important. The performance of string.intern is horrible compared to even a trivial intern table. See this SO question and answer. See the source code of Symbol above for a simple intern table.
Unfortunately, object allocation for an AnyVal is forced whenever it is put into a collection, like the Map in your example. This is because the value class has to be cast to the type parameter of the collection, and casting to a new type always forces allocation. This eliminates almost any advantage of declaring Sym as a value class. See Allocation Details in the Scala documentation page for value classes.
For AnyVal the class is actually the String. The magically added methods and type-safety are just compiler tricks. It's the String that gets transferred all around.
For pattern matching (Symbol's purpose as I suppose) Scala needs the class of an object. Thus — Symbol extends AnyRef.

Why are value classes restricted to AnyVal?

As far as I understand value classes in Scala are just there to wrap primitive types like Int or Boolean into another type without introducing additional memory usage. So they are basically used as a lightweight alternative to ordinary classes.
That reminds me of Haskell's newtype notation which is also used to wrap existing types in new ones, thus introducing a new interface to some data without consuming additional space (to see the similarity of both languages consider for instance the restriction to one "constructor" with one field both in Haskell and in Scala).
What I am wondering is why the concept of introducing new types that get inlined by the compiler is not generalized to Haskell's approach of having zero-overhead type wrappers for any kind of type. Why did the Scala guys stick to primitive types (aka AnyVal) here?
Or is there already a way in Scala to also define such wrappers for Scala.AnyRef types?
They're not limited to AnyVal.
implicit class RichOptionPair[A, B](val o: Option[(A, B)]) extends AnyVal {
  def ofold[C](f: (A, B) => C) = o map { case (a, b) => f(a, b) }
}
scala> Some("fish",5).ofold(_ * _)
res0: Option[String] = Some(fishfishfishfishfish)
There are various limitations on value classes that make them act like lightweight wrappers, but only being able to wrap primitives is not one of them.
The reasoning is documented as Scala Improvement Process (SIP)-15. As Alexey Romanov pointed out in his comment, the idea was to look for an expression using existing keywords that would allow the compiler to determine this situation.
In order for the compiler to perform the inlining, several constraints apply, such as the wrapping class being "ephemeral" (no field or object members, constructor body etc.). Your suggestion of automatically generating inlining classes has at least two problems:
The compiler would need to go through the whole list of constraints for each class. And because the status as value class is implicit, it may flip by adding members to the class at a later point, breaking binary compatibility
More constraints are added by the compiler, e.g. the value class becomes final, prohibiting inheritance. So you would have to add these constraints to any class that wants to be inlineable that way, and then you gain nothing but extra verbosity.
One could think of other hypothetical constructs, e.g. val class Meter(underlying: Double) { ... }, but the advantage of extends AnyVal IMO is that no syntactic extensions are needed. Also all primitive types are extending AnyVal, so there is a nice analogy (no reference, no inheritance, effective representation etc.)
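For reference, the existing extends AnyVal spelling of that Meter wrapper, next to a wrapper around a reference type, could be sketched as follows. The Wrappers object, UserId and the method names are made up for the example.

```scala
object Wrappers {
  // Wraps a primitive: exactly one val parameter, only defs in the body.
  final class Meter(val underlying: Double) extends AnyVal {
    def +(other: Meter): Meter = new Meter(underlying + other.underlying)
  }

  // Wraps a reference type in exactly the same way.
  final class UserId(val value: String) extends AnyVal {
    def masked: String = value.take(2) + "***"
  }

  val total: Meter = new Meter(1.5) + new Meter(2.0)
  val id: UserId = new UserId("alice")
}
```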