Null and Nothing: are they an exception to single inheritance in Scala? - scala

Here's an image of Scala hierarchy, taken from the official documentation: https://docs.scala-lang.org/resources/images/tour/unified-types-diagram.svg
So I know Scala supports only single inheritance.
But, if for example Int extends only AnyVal (abstract final class Int extends AnyVal), how come Nothing and Null can extend more than one class? I've read somewhere it's because they are subtypes of every other type, but not subclasses.
But that's not clear from the image above, since they are drawn exactly like any other class, and the relationships that connect Nothing and Null to the rest look like the same subclass-superclass relationships I see for the other classes in the hierarchy.
So, what is the difference between type and class? How come in the image above, Nothing and Null extend more than one class?

Related

Using Enumerations in Scala Best Practices

I have been using sealed traits and case objects to define enumerated types in Scala and I recently came across another approach to extend the Enumeration class in Scala like this below:
object CertificateStatusEnum extends Enumeration {
val Accepted, SignatureError, CertificateExpired, CertificateRevoked, NoCertificateAvailable, CertChainError, ContractCancelled = Value
}
against doing something like this:
sealed trait CertificateStatus
object CertificateStatus {
case object Accepted extends CertificateStatus
case object SignatureError extends CertificateStatus
case object CertificateExpired extends CertificateStatus
case object CertificateRevoked extends CertificateStatus
case object NoCertificateAvailable extends CertificateStatus
case object CertChainError extends CertificateStatus
case object ContractCancelled extends CertificateStatus
}
What is considered a good approach?
They both get the job done for simple purposes, but in terms of best practice, the use of sealed traits + case objects is more flexible.
The backstory is that Java had enumerations, so Scala had to provide them too for interoperability reasons. But Scala does not really need them: it supports ADTs (algebraic data types), so it can encode an enumeration in a functional way, like the one you just saw.
You'll encounter certain limitations with the normal Enumeration class:
the inability of the compiler to detect pattern matches exhaustively
it's actually harder to extend the elements to hold more data besides the String name and the Int id, because Value is final.
at runtime, all enums have the same type because of type erasure, so limited type level programming - for example, you can't have overloaded methods.
when you did object CertificateStatusEnum extends Enumeration your enumerations will not be defined as CertificateStatusEnum type, but as CertificateStatusEnum.Value - so you have to use some type aliases to fix that. The problem with this is the type of your companion will still be CertificateStatusEnum.Value.type so you'll end up doing multiple aliases to fix that, and have a rather confusing enumeration.
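A minimal sketch of the first and last limitations above, using a shortened two-value version of the enumeration. The wrapper object EnumerationSketch, the alias name CertificateStatus and the describe method are illustrative assumptions, not part of the original code.

```scala
object EnumerationSketch {
  object CertificateStatusEnum extends Enumeration {
    // Alias so that signatures can say CertificateStatus
    // instead of CertificateStatusEnum.Value.
    type CertificateStatus = Value
    val Accepted, SignatureError = Value
  }

  // The values are typed as CertificateStatusEnum.Value,
  // not as CertificateStatusEnum.
  val status: CertificateStatusEnum.Value = CertificateStatusEnum.Accepted

  // This match omits SignatureError. Value is not a sealed hierarchy,
  // so the compiler cannot check exhaustiveness here; the omission
  // only surfaces as a MatchError at runtime.
  def describe(s: CertificateStatusEnum.CertificateStatus): String = s match {
    case CertificateStatusEnum.Accepted => "accepted"
  }
}
```

With a sealed trait and case objects, the same incomplete match would normally be reported by the compiler's exhaustiveness check instead.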
On the other hand, the algebraic data type comes as a type-safe alternative where you specify the shape of each element and to encode the enumeration you just need sum types which are expressed exactly using sealed traits (or abstract classes) and case objects.
These solve the limitations of the Enumeration class, but you'll encounter some other (minor) drawbacks, though these are not that limiting:
case objects won't have a default order - so if you need one, you'll have to add your id as an attribute in the sealed trait and provide an ordering method.
a somewhat problematic issue is that even though case objects are serializable, if you need to deserialize your enumeration, there is no easy way to deserialize a case object from its enumeration name. You will most probably need to write a custom deserializer.
you can't iterate over them by default as you could using Enumeration. But it's not a very common use case. Nevertheless, it can be easily achieved, e.g. :
object CertificateStatus {
val values: Seq[CertificateStatus] = Seq(
Accepted,
SignatureError,
CertificateExpired,
CertificateRevoked,
NoCertificateAvailable,
CertChainError,
ContractCancelled
)
// rest of the code
}
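The other two workarounds from the list above (an explicit order and lookup by name) could be sketched as follows, on a shortened three-value version of the type. The wrapper object OrderingSketch, the id member and the fromName helper are hypothetical names chosen for illustration; the lookup relies on the fact that a case object's default toString is its name.

```scala
object OrderingSketch {
  sealed trait CertificateStatus {
    def id: Int // explicit position, since case objects have no default order
  }

  object CertificateStatus {
    case object Accepted extends CertificateStatus { val id = 0 }
    case object SignatureError extends CertificateStatus { val id = 1 }
    case object CertificateExpired extends CertificateStatus { val id = 2 }

    val values: Seq[CertificateStatus] =
      Seq(Accepted, SignatureError, CertificateExpired)

    // Found automatically by sorted/min/max because it lives in the companion.
    implicit val ordering: Ordering[CertificateStatus] =
      Ordering.by[CertificateStatus, Int](_.id)

    // Poor man's deserializer: look a value up by its name.
    def fromName(name: String): Option[CertificateStatus] =
      values.find(_.toString == name)
  }
}
```

Unlike Enumeration, reordering the case objects here has no effect on the order; only the explicit id values matter.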
In practice, there's nothing that you can do with Enumeration that you can't do with sealed trait + case objects. So the former went out of people's preferences, in favor of the latter.
This comparison only concerns Scala 2.
In Scala 3, they unified ADTs and their generalized versions (GADTs) with enums under a new powerful syntax, effectively giving you everything you need. So you'll have every reason to use them. As Gael mentioned, they became first-class entities.
It depends on what you want from enum.
In the first case, you implicitly have an order on items (accessed by id property). Reordering has consequences.
I'd prefer 'case object', in some cases enum item could have extra info in the constructor (like, Color with RGB, not just name).
Also, I'd recommend https://index.scala-lang.org/mrvisser/sealerate or similar libraries. That allows iterating over all elements.

Does the AnyVal with AnyRef as a parameter make sense in Scala?

Does the construct below make any sense? Are there any benefits to using it?
final case class Id(uuid: UUID) extends AnyVal
As I understand the construct above, Id doesn't have to be instantiated in some scenarios described here. But I have some doubts because I didn't find any example with AnyRef as a parameter.
Yes, this example makes sense. Extending AnyVal is useful when you want specific semantics for a type but don't want to pay the additional allocation cost that comes with a wrapper. For example, say you have a typeclass for outputting string representations of values, like Show[A], and you want to give specific semantics to UUID, but an instance of Show[UUID] that you can't control already exists in scope. This is when wrapping the type and introducing an implicit typeclass instance for the wrapper can be useful.
Do note that AnyVal may end up allocating instances of the wrapper class in specific cases as mentioned in this documentation:
A value class is actually instantiated when:
a value class is treated as another type.
a value class is assigned to an array.
doing runtime type tests, such as pattern matching.
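A small sketch of both points, assuming a hand-written Show typeclass (hypothetical here, not taken from any particular library). It gives Id its own instance and then exercises the three situations listed above in which the wrapper is actually allocated.

```scala
import java.util.UUID

object ValueClassSketch {
  // Hypothetical minimal typeclass, defined only for this example.
  trait Show[A] { def show(a: A): String }

  final case class Id(uuid: UUID) extends AnyVal

  // Instance for the wrapper; independent of any Show[UUID] that may exist.
  implicit val showId: Show[Id] = new Show[Id] {
    def show(id: Id): String = s"id:${id.uuid}"
  }

  def render[A](a: A)(implicit s: Show[A]): String = s.show(a)

  val raw: UUID = new UUID(0L, 42L)
  val id: Id = Id(raw)

  // Situations in which the wrapper is instantiated, per the list above:
  val asAny: Any = id                // treated as another type
  val inArray: Array[Id] = Array(id) // assigned to an array
  val matched: Option[UUID] = asAny match { // runtime type test
    case Id(u) => Some(u)
    case _     => None
  }
}
```

Note that render is generic, so passing an Id to it also counts as treating the value class as another type.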

class Int is abstract; cannot be instantiated

While going through Programming in Scala, I came across:
While you can define your own value classes (see Section 11.4), there
are nine value classes built into Scala: Byte, Short, Char, Int, Long,
Float, Double, Boolean, and Unit. The first eight of these correspond
to Java's primitive types, and their values are represented at run
time as Java's primitive values. The instances of these classes are
all written as literals in Scala. For example, 42 is an instance of
Int, 'x' is an instance of Char, and false an instance of Boolean. You
cannot create instances of these classes using new. This is enforced
by the "trick" that value classes are all defined to be both abstract
and final.
Because of this, new Int fails in Scala with the error class Int is abstract; cannot be instantiated:
val a: Int = new Int
Java, by contrast, allows new Integer(23).
Question: What is the trick the author is talking about? Why does Scala define value classes to be abstract and final?
What is the trick the author is talking about?
The "trick" is that
when a class is abstract, you cannot make instances of it (cannot call new).
when a class is final, you cannot make subclasses
when a class is abstract and you cannot make subclasses, then you also cannot make a concrete subclass that you could instantiate
So as a result, value classes cannot be instantiated by application code.
Why does Scala define value classes to be abstract and final?
The point of value classes is that they are defined by their (immutable) value/contents. The object identity is not relevant.
The Scala compiler also tries to optimize value classes by not creating any objects at all where possible (just using unboxed primitives directly). That only works if we can be sure that you can just box and unbox at will.
In Java new Integer(1) and another new Integer(1) are two different objects, but that is not useful for a pure value class (if you want to use these different instances as lock monitor objects or something else where you need object identity, you are just confusing yourself and others and should not have used Integer).
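To illustrate the "value, not identity" point, here is a minimal sketch that boxes the same Int twice via Int.box from the standard library. The wrapper object BoxingSketch and the val names are made up for the example.

```scala
object BoxingSketch {
  val n: Int = 1000

  // Int.box explicitly produces a boxed java.lang.Integer.
  val boxedA: java.lang.Integer = Int.box(n)
  val boxedB: java.lang.Integer = Int.box(n)

  // Value equality holds no matter how many boxes exist.
  val sameValue: Boolean = boxedA == boxedB

  // Reference identity of the boxes depends on JVM caching and is not
  // something code should rely on, so no result is claimed for it here.
  val sameBox: Boolean = boxedA eq boxedB

  // Unboxing gives back the original value.
  val roundTrip: Int = Int.unbox(boxedA)
}
```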

Why the first base class in parent list must be non-trait class?

In the Scala spec, it's said that in a class template sc extends mt1, mt2, ..., mtn
Each trait reference mti must denote a trait. By contrast, the
superclass constructor sc normally refers to a class which is not a
trait. It is possible to write a list of parents that starts with a
trait reference, e.g. mt1 with … with mtn. In that case the
list of parents is implicitly extended to include the supertype of
mt1 as first parent type. The new supertype must have at least one
constructor that does not take parameters. In the following, we will
always assume that this implicit extension has been performed, so that
the first parent class of a template is a regular superclass
constructor, not a trait reference.
If I understand it correctly, I think it means:
trait Base1 {}
trait Base2 {}
class Sub extends Base1 with Base2 {}
Will be implicitly extended to:
trait Base1 {}
trait Base2 {}
class Sub extends Object with Base1 with Base2 {}
My questions are:
Is my understanding correct?
Does this requirement (the first subclass in the parent list must be non-trait class) and the implicit extension only applies to class template (e.g. class Sub extends Mt1, Mt2) or also trait template (e.g. trait Sub extends Mt1, Mt2)?
Why this requirement and the implicit extension is necessary?
Disclaimer: I'm not and never was a member of the "Scala design committee" or anything like that, so the answer on the "why?" question is mostly speculation but I think a useful one.
Disclaimer #2: I've written this post over several hours and in several takes so it is probably not very consistent
Disclaimer #3 (shameless self-promotion for future readers): If you find this quite long answer useful, you might also take a look at another long answer of mine to another question by Lifu Huang on a similar topic.
Short answers
This is one of those complicated things for which I don't think there is a good short answer unless you already know what the answer is. Although my real answer will be long, here are my best short answers:
Why the first base class in parent list must be non-trait class?
Because there has to be only one non-trait base class, and it makes things easier if it is always the first.
Is my understanding correct?
Yes, your implicit example is what will happen. However I'm not sure that it shows full understanding of the topic.
Does this requirement (the first subclass in the parent list must be non-trait class) and the implicit extension only applies to class template (e.g. class Sub extends Mt1, Mt2) or also trait template (e.g. trait Sub extends Mt1, Mt2)?
No, the implicit extension happens for traits as well. Actually, how else could you expect Mt1 to have its own "supertype" to be promoted down to the class that extends it?
Actually here are two IMHO non-obvious examples proving this is true:
Example #1
trait TAny extends Any
trait TNo
// works
class CGood(val value: Int) extends AnyVal with TAny
// fails
// illegal inheritance; superclass AnyVal is not a subclass of the superclass Object
class CBad(val value: Int) extends AnyVal with TNo
This example fails because the spec says
The extends clause extends sc with mt1 with … with mtn can be omitted, in which case extends scala.AnyRef is assumed.
so TNo actually extends AnyRef which is incompatible with AnyVal.
Example #2
class CFirst
class CSecond extends CFirst
// did you know that traits can extend classes as well?
trait TFirst extends CFirst
trait TSecond extends CSecond
// works
class ChildGood extends TSecond with TFirst
// fails
// illegal inheritance; superclass CFirst is not a subclass of the superclass CSecond of the mixin trait TSecond
class ChildBad extends TFirst with TSecond
Again ChildBad fails because TSecond requires CSecond but TFirst only provides CFirst as the base class.
Why this requirement and the implicit extension is necessary?
There are three major reasons:
Compatibility with the main target platform (JVM)
Traits have "mixin" semantics: you have a class and you mix additional behavior in
Completeness, consistency and simplicity of the rest of the spec (e.g. of the linearization rules). This might be restated as follows: each class must declare 0 or 1 non-trait base classes, and after compilation the target platform enforces that there is exactly 1 non-trait base class. So it makes the rest of the spec easier if you just assume there is always exactly one base class. That way you have to write the implicit extension rule only once, rather than every time the behavior depends on the base class.
Scala spec goals/intentions
I believe that when one reads a spec there are two different sets of questions:
What exactly is written? What is the meaning of the spec?
Why it is written so? What was the intention?
Actually I think in many cases #2 is more important than #1 but unfortunately specs rarely explicitly contain insights into that area. Anyway I will start with my speculations over #2: what were the intentions/goals/limitations of the classes system in Scala? The main high-level goal was to create a type system richer than the one in Java or .Net (which are quite similar) but that can be:
compiled to efficient code on those target platforms
able to interact reasonably, in both directions, with the "native" code on those target platforms
Side note: Support of the .Net was dropped years ago but it was one of the target platforms for years and this affected the design.
Single base class
Short summary: this section describes some reasons why Scala designers had a strong motivation to have the "exactly one base class" rule in the language.
A major problem with OO design, and with inheritance in particular, is that AFAIK the question "where exactly is the border between the 'good and useful' practices and the 'bad' ones?" is still open. This means each language must find its own trade-off between making the wrong things impossible and making the useful things possible (and easy). Many believe that in C++, which was obviously a major inspiration for Java and .Net, the trade-off is shifted too far into the "allow everything even if it is potentially harmful" zone. That led many designers of newer languages to look for a more restrictive trade-off. In particular, both the JVM and the .Net platform enforce the rule that all types are split into "value types" (aka primitive types), "classes" and "interfaces", and each class except the root class (java.lang.Object/System.Object) has exactly one "base class" and zero or more "base interfaces". This decision was a reaction to many issues of multiple inheritance, including the infamous "diamond problem" but many others as well.
Sidenote (about memory layout): Another major problem with multiple inheritance is object layout in memory. Consider the following ridiculous (and impossible in current Scala) example inspired by Achilles and the tortoise:
trait Achilles {
def getAchillesPos: Int
def stepAchilles(): Unit
}
class AchillesImpl(var achillesPos: Int) extends Achilles {
def getAchillesPos: Int = achillesPos
def stepAchilles(): Unit = {
achillesPos += 2
}
}
class TortoiseImpl(var tortoisePos: Int) {
def getTortoisePos: Int = tortoisePos
def stepTortoise(): Unit = {
tortoisePos += 1
}
}
class AchillesAndTortoise(handicap: Int) extends AchillesImpl(0) with TortoiseImpl(handicap) {
def catchTortoise(): Int = {
var time = 0
while (getAchillesPos < getTortoisePos) {
time += 1
stepAchilles()
stepTortoise()
}
time
}
}
The tricky part here is how to actually lay out the achillesPos and tortoisePos fields in the memory of the object. The issue is that you probably want to have only one compiled copy of each method in memory, and you want the code to be efficient. This means that getAchillesPos and stepAchilles should know some fixed offset of achillesPos relative to the this pointer. Similarly, getTortoisePos and stepTortoise should know some fixed offset of tortoisePos relative to the this pointer. And none of the choices you have for achieving this looks nice. For example:
You might decide that achillesPos is always first and tortoisePos is always second. But this means that in the instances of TortoiseImpl tortoisePos should also be the second field but there is nothing to fill the first field with so you waste some memory. Moreover if both AchillesImpl and TortoiseImpl come from pre-compiled libraries, you should have some way to move access to the fields in them as well.
You might try to "fix" this pointer on-the-fly when you call into TortoiseImpl (AFAIK this is the way C++ really works). This becomes especially funny when TortoiseImpl is an abstract class that is aware of the trait Achilles (but not the specific class AchillesImpl) via extends and tries to call back some methods from there via this or pass this to some method that takes Achilles as an argument so this has to be "fixed back". Note that this is not the same as the "diamond problem" because there is only one copy of all fields and implementations.
You might agree to have a unique copy of the methods compiled for each specific class that are aware of the specific layout. This is bad for memory usage and performance because it blows CPU caches and forces JIT to make independent optimizations for each.
You might say that no method except for getter and setter can have direct access to the fields and should use getters and setters instead. Or store all the fields in some kind of a dictionary which is effectively the same. This might be bad for performance (but this is the closest to what Scala does with mixin-traits).
In actual Scala this issue does not exist because a trait can't really declare any fields. When you declare a val or var in a trait, you actually declare a getter (and, for a var, a setter) method that will be implemented by the particular class that extends the trait, and each class has full control over the layout of its fields. In terms of performance this most probably works OK, because the JVM (JIT) can inline such a virtual call in many real-world scenarios.
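A minimal sketch of this, assuming a JVM backend. The names TraitFieldSketch, Positioned and Runner are invented for the example; the reflection calls are plain java.lang.Class methods.

```scala
object TraitFieldSketch {
  trait Positioned {
    var pos: Int = 0 // declares the accessor methods pos and pos_=
    def step(): Unit = pos += 1 // goes through the accessors
  }

  // The concrete class is where storage for pos is actually laid out.
  class Runner extends Positioned

  val runner = new Runner
  runner.step()
  runner.step()

  // On the JVM the trait is compiled to an interface, and an interface
  // cannot carry instance fields.
  val traitIsInterface: Boolean = classOf[Positioned].isInterface

  // The backing field is expected to be declared by the concrete class
  // (its exact name is a compiler detail, so it is not relied on here).
  val runnerDeclaresField: Boolean = classOf[Runner].getDeclaredFields.nonEmpty
}
```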
End of the Sidenote
Another major point is interoperability with the target platform. Even if Scala somehow supported true multiple inheritance, so that you could have a type that inherits from both String and Date and can be passed both to methods that expect String and to methods that expect Date, what would this look like from the Java point of view? Also, if the target platform enforces the rule that every class has to be an (indirect) subtype of the same root class (Object), you can't work around this in your higher-level language.
Traits and Mix-ins
Many think that the "one class and many interfaces" trade-off made in Java and .Net is too restrictive. For example, it makes it hard to share a common default implementation of some interface methods between different classes. Over time, Java and .Net designers seem to have come to the same conclusion and rolled out their own fixes for this kind of issue: Extension methods in .Net and then Default methods in Java. Scala designers added a feature called Mixins that was known to fare well in many practical cases. However, unlike many dynamic languages that have a similar feature, Scala still had to meet the "exactly one base class" rule and other limitations of the target platform.
It is important to note that one major practical use of mixins is to implement a variation of the Decorator or Adapter patterns, both of which rely on the fact that you can restrict your base type to something more specific than Any or AnyRef. A prime example of such usage is the scala.collection package.
Scala syntax
So now you have the following goals/restrictions:
Exactly one base class for each class
Ability to add logic to classes from mixins
Support of mixins with restricted base type
Classes from the target platform (Java) when seen from Scala are mapped to the Scala classes (because what else they can be mapped to?) and they come pre-compiled and we don't want to mess with their implementation
Other good qualities such as simplicity, type safety, determinism, etc.
If you want some kind of multiple inheritance support in your language, you need to develop conflict resolution rules: what happens when several base types provide some logic that would fit the same "slot" in your class. After prohibition of fields in traits we are left with the following "slots":
Base class in terms of the target platform
Constructors
Methods with the same name and signature
And possible conflict resolution strategies are:
Prohibit (fail compilation)
Decide which one wins and wipes others
Somehow chain them
Somehow preserve all with renaming. This is not really possible in JVM. For example in .Net see Explicit Interface Implementation
In a sense Scala uses all available (i.e. the first 3) strategies, but the high-level goal is: let's try to preserve as much logic as we can.
The most important part for this discussion is conflicts resolution for constructors and methods.
We want the rules to be the same for different slots, because otherwise it is not clear how to achieve safety (if traits A and B both override methods foo and bar but the resolution rules for foo and bar are different, invariants for A and B might easily be broken). Scala's approach is based on class linearization. In short, this is a way to "flatten" the hierarchy of base classes into a simple linear structure in a predictable way, based on the idea that the further left a type is in the with chain, the more "base" (higher in the inheritance) it is. After you do this, the conflict resolution rule for methods becomes simple: you go through the list of base types and chain behavior via super calls; if super is not called, the chaining stops. This produces quite predictable semantics that people can reason about.
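The chaining rule can be sketched with a few toy types. Everything below (LinearizationSketch, Base, A, B, Stop and the describe method) is invented for illustration; each override records its own name and then delegates to super, except Stop, which deliberately does not.

```scala
object LinearizationSketch {
  class Base { def describe: List[String] = List("Base") }

  trait A extends Base {
    override def describe: List[String] = "A" :: super.describe
  }
  trait B extends Base {
    override def describe: List[String] = "B" :: super.describe
  }
  // Does not call super, so the chain ends here.
  trait Stop extends Base {
    override def describe: List[String] = List("Stop")
  }

  // The rightmost trait is the most derived, so it runs first.
  class AB extends Base with A with B             // B -> A -> Base
  class BA extends Base with B with A             // A -> B -> Base
  class WithStop extends Base with A with Stop with B // B -> Stop, A is never reached
}
```

Here new AB().describe yields List("B", "A", "Base"), new BA().describe yields List("A", "B", "Base"), and new WithStop().describe yields List("B", "Stop").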
Now assume you allow a non-trait class to appear somewhere other than first. Consider the following example:
class CBase {
def getValue = 2
}
trait TFirst extends CBase {
override def getValue = super.getValue + 1
}
trait TSecond extends CBase {
override def getValue = super.getValue * 2
}
class CThird extends CBase with TSecond {
override def getValue = 100 - super.getValue
}
class Child extends TFirst with TSecond with CThird
In which order should TFirst.getValue and TSecond.getValue be called? Obviously CThird is already compiled and you can't change what the super for it is, so it has to be moved to the first position, and there is already a TSecond.getValue call inside it. But on the other hand this breaks the rule that everything on the left is base and everything on the right is child. The simplest way to avoid such confusion is to enforce the rule that non-trait classes must go first.
The same logic applies if you just extend the previous example by substituting class CThird with a trait that extends it:
trait TFourth extends CThird
class AnotherChild extends TFirst with TSecond with TFourth
Again, the only non-trait class AnotherChild can extend is CThird and this again makes conflict resolution rules quite hard to reason about.
That's why Scala makes the rule much simpler: whatever provides the base class must come in the first position. It then makes sense to extend the same rule to traits as well, so if the first position is occupied by a trait, that trait also defines the base class.
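For contrast, here is a sketch of a variant of the earlier example that does follow the rule, with the class in the first position. It assumes TSecond extends CBase, and the wrapper object ClassFirstSketch is made up for the example.

```scala
object ClassFirstSketch {
  class CBase { def getValue: Int = 2 }

  trait TFirst extends CBase {
    override def getValue: Int = super.getValue + 1
  }
  trait TSecond extends CBase {
    override def getValue: Int = super.getValue * 2
  }

  // Linearization: CThird, TSecond, CBase  =>  100 - (2 * 2)
  class CThird extends CBase with TSecond {
    override def getValue: Int = 100 - super.getValue
  }

  // The class comes first, the trait is mixed in on top of it.
  // Linearization: Child, TFirst, CThird, TSecond, CBase  =>  (100 - 4) + 1
  class Child extends CThird with TFirst
}
```

CThird keeps the super chain it was compiled with, and TFirst is simply layered on top, so there is no ambiguity about who calls whom.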
1) Basically yes, your understanding is correct. As in Java, every class inherits from java.lang.Object (AnyRef in Scala). So, since you are defining a concrete class, it implicitly inherits from Object. If you check in the REPL, you get:
scala> trait Base1 {}
defined trait Base1
scala> trait Base2 {}
defined trait Base2
scala> class Sub extends Base1 with Base2 {}
defined class Sub
scala> classOf[Sub].getSuperclass
res0: Class[_ >: Sub] = class java.lang.Object
2) Yes, from the "Traits" paragraph in the specs, this applies also to them. In "Templates" paragraph we have:
The new supertype must have at least one constructor that does not take parameters
And then in "Traits" paragraph:
Unlike normal classes, traits cannot have constructor parameters. Furthermore, no constructor arguments are passed to the superclass of the trait. This is not necessary as traits are initialized after the superclass is initialized.
Assume a trait D defines some aspect of an instance x of type C (i.e. D is a base class of C). Then the actual supertype of D in x is the compound type consisting of all the base classes in L(C) that succeed D.
This is needed to define the base constructor with no-parameters.
3) As per answer (2), it's needed to define the base constructor.

Case object extending class with constructor in Scala

I am a beginner in Scala and was playing around to learn more about abstract data types. I wrote the following definition to replicate the Option type:
sealed abstract class Maybe[+A](x:A)
case object Nothing extends Maybe[Nothing](Nothing)
case class Just[A](x:A) extends Maybe[A](x)
But I encountered the following error.
found : Nothing.type
required: Nothing
case object Nothing extends Maybe[Nothing](Nothing)
How do I pass Nothing instead of Nothing.type?
I referred to the following question for hints:
How to extend an object in Scala with an abstract class with constructor?, but it was not helpful.
Maybe more like this. Your Nothing shouldn't have a value, just the type. Also, people usually use traits instead of abstract classes.
sealed trait Maybe[+A]
case object None extends Maybe[Nothing]
case class Just[A](x:A) extends Maybe[A]
You probably shouldn't create your own Nothing. That's going to be confusing: you will confuse yourself and the compiler about whether you are referring to your own one or the one at the bottom of the type hierarchy.
As mentioned by Stephen, the correct way to do this would be to use a trait rather than an abstract class. However, I thought it might be informative to explain why the current approach fails and how to fix it.
The main issue is with this line:
case object Nothing extends Maybe[Nothing](Nothing)
First thing (as mentioned) you shouldn't call your object Nothing. Secondly, you set the object to extend Maybe[Nothing]. Nothing can't have any actual values so you can't use it as an object. Also, you can't use the object itself as the constructor parameter because that would cause a cyclic behavior.
What you need is to have a bottom type (i.e. a type which all A have in common) and an object of that type. Nothing is a bottom type but has no objects.
A possible solution is to limit yourself to AnyRef (i.e. nullable objects) and use the Null bottom type which has a valid object (null):
sealed abstract class Maybe[+A <: AnyRef](x:A)
case object None extends Maybe[Null](null)
This is a bit of clarification for Assaf Mendelson's answer, but it's too big for a comment.
case object Nothing extends Maybe[Nothing](Nothing)
Scala has separate namespaces for types and values. Nothing in case object Nothing is a value. Nothing in Maybe[Nothing] is a type. Since you didn't define a type called Nothing, it refers to the automatically imported scala.Nothing and you must pass a value of this type as an argument. By definition it has no values but e.g. case object Nothing extends Maybe[Nothing](throw new Exception) would compile, as the type of throw expressions is Nothing. Instead you pass the value Nothing, i.e. the same case object you are defining; its type is written as Nothing.type.
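A short sketch of the type/value distinction and of throw having type Nothing. The case object is called Empty here to avoid clashing with scala.Nothing; Empty, fail and orFail are hypothetical names for this example only.

```scala
object NamespaceSketch {
  sealed trait Maybe[+A]
  // Nothing in the type argument refers to the type scala.Nothing.
  case object Empty extends Maybe[Nothing]
  final case class Just[A](x: A) extends Maybe[A]

  // The value Empty has the singleton type Empty.type.
  val e: Empty.type = Empty

  // By covariance, Maybe[Nothing] is a subtype of Maybe[Int].
  val m: Maybe[Int] = Empty

  // A throw expression has type Nothing, so this method type-checks
  // even though it can never return a value.
  def fail(msg: String): Nothing = throw new IllegalStateException(msg)

  // Because Nothing is a subtype of Int, both branches conform to Int.
  def orFail(m: Maybe[Int]): Int = m match {
    case Just(x) => x
    case Empty   => fail("empty")
  }
}
```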
How do I pass Nothing instead of Nothing.type?
It seems like there is no way to do so.
As it says at http://www.scala-lang.org/api/2.9.1/scala/Nothing.html:
there exist no instances of this type.