Nulls in Scala... why is this possible?

I was coding in Scala and doing some quick refactoring in IntelliJ when I stumbled upon the following piece of weirdness:
package misc

/**
 * Created by abimbola on 05/10/15.
 */
object WTF extends App {
  val name: String = name
  println(s"Value is: $name")
}
I then noticed that the compiler didn't complain, so I decided to run this, and I got a very interesting output:
Value is: null
Process finished with exit code 0
Can anyone tell me why this works?
EDIT:
First problem: the value name is assigned a reference to itself even though it does not exist yet. Why exactly does the Scala compiler not explode with errors?
Second problem: why is the value of the assignment null?

1.) Why does the compiler not explode
Here is a reduced example. This compiles because the explicit type lets the compiler fall back to a default value:
class Example { val x: Int = x }
scalac Example.scala
Example.scala:1: warning: value x in class Example does nothing other than call itself recursively
class Example { val x: Int = x }
This does not compile because the recursive method has no declared result type, so none can be inferred:
class ExampleDoesNotCompile { def x = x }
scalac ExampleDoesNotCompile.scala
ExampleDoesNotCompile.scala:1: error: recursive method x needs result type
class ExampleDoesNotCompile { def x = x }
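Note that the def version fails only because the result type is missing. Supplying one makes it compile too, with the same self-recursion warning; the recursion then becomes a runtime problem (a quick sketch; calling x would in practice end in a StackOverflowError, since a public, overridable method is not tail-call optimized):

class ExampleCompilesToo { def x: Int = x } // compiles; calling x never returns normally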
1.1 What happens here
This is my interpretation, so beware: the uniform access principle kicks in. The initializer of the val x calls the accessor x(), which returns the still-uninitialized backing field of x. So x is set to the default value.
This can be seen in the tree the compiler prints after the cleanup phase (reproducible with scalac -Xprint:cleanup Example.scala):

[[syntax trees at end of cleanup]] // Example.scala
package <empty> {
  class Example extends Object {
    private[this] val x: Int = _;
    <stable> <accessor> def x(): Int = Example.this.x;
    def <init>(): Example = {
      Example.super.<init>();
      Example.this.x = Example.this.x();
      ()
    }
  }
}
2.) Why the value is null
The default values are determined by the target environment Scala compiles to. In your example you are running on the JVM, where the default value for reference types (Object) is null. So when you do not provide a value, the default value is used as a fallback.
Default values on the JVM:

byte    0
short   0
int     0
long    0L
float   0.0f
double  0.0d
char    '\u0000'
boolean false
Object  null // Strings are objects
Note that the default value must also be a valid value for the given type: null works for reference types (Null is a subtype of every reference type) but not for value types such as Int.
Here is an example in the REPL:
scala> val x : Int = 0
x: Int = 0
scala> val x : Int = null
<console>:10: error: an expression of type Null is ineligible for implicit conversion
val x : Int = null
^
scala> val x : String = null
x: String = null
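You can observe all of these defaults at once with self-referential vals, a minimal sketch of the effect from the question (each initializer reads its own still-uninitialized field):

class Defaults {
  val i: Int = i         // 0
  val l: Long = l        // 0L
  val d: Double = d      // 0.0
  val b: Boolean = b     // false
  val s: String = s      // null
}

println(new Defaults().s) // prints null, just like the original example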

why exactly does the Scala compiler not explode with errors?
Because this problem can't be solved in the general case. Do you know the halting problem? The halting problem says that it is impossible to write an algorithm that decides whether an arbitrary program ever halts. Since the problem of finding out whether a recursive definition results in a null assignment can be reduced to the halting problem, it is not solvable either.
It would, of course, be quite easy to forbid recursive definitions entirely; this is in fact done for values that are not class members:
scala> def f = { val k: String = k+"abc" }
<console>:11: error: forward reference extends over definition of value k
def f = { val k: String = k+"abc" }
^
For class values this is not forbidden, for a few reasons:
Their scope is not limited to a single block
The JVM initializes them with a default value (which is null for reference types)
Recursive values are useful
Your use case is trivial, as is this:
scala> val k: String = k+"abc"
k: String = nullabc
But what about this:
scala> object X { val x: Int = Y.y+1 }; object Y { val y: Int = X.x+1 }
defined object X
defined object Y
scala> X.x
res2: Int = 2
scala> Y.y
res3: Int = 1
Accessing them in the opposite order flips the results:

scala> object X { val x: Int = Y.y+1 }; object Y { val y: Int = X.x+1 }
defined object X
defined object Y
scala> Y.y
res4: Int = 2
scala> X.x
res5: Int = 1
Or this:
scala> val f: Stream[BigInt] = 1 #:: 1 #:: f.zip(f.tail).map { case (a,b) => a+b }
f: Stream[BigInt] = Stream(1, ?)
scala> f.take(10).toList
res7: List[BigInt] = List(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
As you can see, it is quite easy to write programs where it is no longer obvious what value they will produce. And since the halting problem is not solvable, we cannot let the compiler do this work for us in non-trivial cases.
This also means that trivial cases, like the one in your question, could be hardcoded in the compiler. But since no algorithm can detect all possible trivial cases, every case ever found would have to be hardcoded individually (not to mention that no definition of a "trivial case" exists). Therefore it wouldn't be wise to even start hardcoding some of these cases; it would ultimately result in a slower compiler that is more difficult to maintain.
One could argue that for a use case that burns every second user it would be wise to at least hardcode such an extreme scenario. On the other hand, some people just need to get burned in order to learn something new. ;)

I think @Andreas' answer already has the necessary info. I'll just try to provide additional explanation:
When you write val name: String = name at the class level, this does a few different things at the same time:
create the field name
create the getter name()
create code for the assignment name = name, which becomes part of the primary constructor
This is what's made explicit by the tree in Andreas' section 1.1:
package <empty> {
  class Example extends Object {
    private[this] val x: Int = _;
    <stable> <accessor> def x(): Int = Example.this.x;
    def <init>(): Example = {
      Example.super.<init>();
      Example.this.x = Example.this.x();
      ()
    }
  }
}
The syntax is not Scala; it is (as suggested by [[syntax trees at end of cleanup]]) a textual representation of what the compiler will later convert into bytecode. Some unfamiliar syntax aside, we can interpret this like the JVM would:
the JVM creates an object. At this point, all fields have default values. val x: Int = _; is like int x; in Java, i.e. the JVM's default value is used, which is 0 for the JVM type I (int in Java, Int in Scala)
the constructor is called for the object
(the super constructor is called)
the constructor calls x()
x() returns x, which is 0
x is assigned 0
the constructor returns
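The same sequence can be hand-desugared into ordinary Scala (a sketch; it uses a var only because user code cannot leave a val uninitialized the way the compiler can):

class ExampleDesugared {
  private var x0: Int = 0 // the backing field, starting at the JVM default
  def x: Int = x0         // the generated accessor
  x0 = x                  // constructor body: reads the accessor while x0 is still 0
}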
As you can see, after the initial parsing step there is nothing in the syntax tree that seems immediately wrong, even though the original source code looks wrong. I wouldn't say that this is the behavior I expect, so I would imagine one of three things:
Either, the Scala devs saw it as too intricate to recognize and forbid
or, it's a regression and simply wasn't found as a bug
or, it's a "feature" and there is legitimate need for this behavior
(ordering reflects my opinion of likeliness, in decreasing order)

Scala recursive val behaviour

What do you think this prints?
val foo: String = "foo" + foo
println(foo)
val foo2: Int = 3 + foo2
println(foo2)
Answer:
foonull
3
Why? Is there a part in specification that describes/explains this?
EDIT: To clarify my astonishment: I do realize that foo is undefined at val foo: String = "foo" + foo, and that's why it gets the default value null (zero for integers). But this doesn't seem very "clean", and I see opinions here that agree with me. I was hoping the compiler would stop me from doing something like that. It does make sense in some particular cases, such as when defining Streams, which are lazy by nature, but for strings and integers I would expect either being stopped due to reassignment to a val, or being told that I'm trying to use an undefined value, just as if I had written val foo = whatever (given that whatever was never defined).
To further complicate things, @dk14 points out that this behaviour is only present for values represented as fields, and doesn't happen within blocks, e.g.
val bar: String = {
  val foo: String = "foo" + foo // error: forward reference extends...
  "bar"
}
Yes, see the SLS Section 4.2.
foo is a String which is a reference type. It has a default value of null. foo2 is an Int which has a default value of 0.
In both cases, you are referring to a val which has not been initialized, so the default value is used.
In both cases foo and foo2, respectively, have their default values according to the JVM specification: null for a reference type, and 0 for an int (or Int, as Scala spells it).
Scala's Int is backed by a primitive type (int) underneath, and its default value (before initialization) is 0.
On the other hand, String is an Object which is indeed null when not initialized.
Unfortunately, Scala is not as safe as, say, Haskell, because of compromises with Java's OOP model ("Object-oriented meets Functional"). There is a safe subset of Scala called Scalazzi, and some sbt/scalac plugins can give you more warnings/errors:
https://github.com/puffnfresh/wartremover (doesn't check for the case you've found though)
Returning to your case: this happens only for values represented as fields, and doesn't happen when you're inside a function/method/block:
scala> val foo2: Int = 3 + foo2
foo2: Int = 3
scala> {val foo2: Int = 3 + foo2 }
<console>:14: error: forward reference extends over definition of value foo2
{val foo2: Int = 3 + foo2 }
^
scala> def method = {val a: Int = 3 + a}
<console>:12: error: forward reference extends over definition of value a
def method = {val a: Int = 3 + a}
The reason for this behaviour is integration with Java: a val compiles to a final field on the JVM, so Scala retains all the field-initialization specifics of JVM classes (I believe this is part of JSR-133, the Java Memory Model). Here is a more complicated example that explains the behaviour:
scala> object Z{ val a = b; val b = 5}
<console>:12: warning: Reference to uninitialized value b
object Z{ val a = b; val b = 5}
^
defined object Z
scala> Z.a
res12: Int = 0
So here you can see the warning that you didn't get in your original example.
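Reordering the definitions so that each val is declared before it is used avoids the problem entirely (a quick REPL sketch):

scala> object Z2 { val b = 5; val a = b }
defined object Z2

scala> Z2.a
res13: Int = 5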

Eta-expansion on non-methods works for fields but not for local variables

The following code is pretty much self-explanatory:
class EtaExpansionOnNonMethods { // or object
  val zero = 0
  val zeroEta = zero _ // compiles: () => Int
  def f {
    val one = 1
    val oneEta = one _ // compilation error
  }
}
Error:(7, 18) _ must follow method; cannot follow Int
val oneEta = one _
^
Why is eta-expansion allowed on, e.g., an Int field (resulting in () => Int) but not on an Int local variable (resulting in a compilation error)? I'm using version 2.11.7.
That's because val members are actually compiled down to getter-like methods. For example, running javap on the EtaExpansionOnNonMethods.class that you get from scalac gives you:
E:\EtaExp>"C:\Program Files\Java\jdk1.8.0_51\bin\javap" EtaExpansionOnNonMethods.class
Compiled from "EtaExp.scala"
public class EtaExpansionOnNonMethods {
  public int zero();
  public EtaExpansionOnNonMethods();
}
Notice that if you were to declare the member as private[this] val zero = 0, which is compiled down to a final field, you'd get the exact same error you get when trying to eta-expand a local variable or value.
In the end, the general premise still holds: you can use eta-expansion on methods, even when those methods are not really explicit. :)
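If you actually need a () => Int from a local value, an explicit function literal does the job without eta-expansion (a sketch):

def g(): Unit = {
  val one = 1
  val oneThunk: () => Int = () => one // an ordinary closure over the local value
  println(oneThunk())                 // prints 1
}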

Getting a null with a val depending on abstract def in a trait [duplicate]

This question already has answers here: Scala - initialization order of vals (3 answers). Closed 7 years ago.
I'm seeing some initialization weirdness when mixing vals and defs in my trait. The situation can be summarized with the following example.
I have a trait which provides an abstract field, let's call it fruit, which should be implemented in child classes. It also uses that field in a val:
scala> class FruitTreeDescriptor(fruit: String) {
| def describe = s"This tree has loads of ${fruit}s"
| }
defined class FruitTreeDescriptor
scala> trait FruitTree {
| def fruit: String
| val descriptor = new FruitTreeDescriptor(fruit)
| }
defined trait FruitTree
When overriding fruit with a def, things work as expected:
scala> object AppleTree extends FruitTree {
| def fruit = "apple"
| }
defined object AppleTree
scala> AppleTree.descriptor.describe
res1: String = This tree has loads of apples
However, if I override fruit using a val...
scala> object BananaTree extends FruitTree {
| val fruit = "banana"
| }
defined object BananaTree
scala> BananaTree.descriptor.describe
res2: String = This tree has loads of nulls
What's going on here?
In simple terms, at the point you're calling:
val descriptor = new FruitTreeDescriptor(fruit)
the constructor for BananaTree has not been given the chance to run yet. This means the value of fruit is still null, even though it's a val.
This is a subcase of the well-known quirk of the non-declarative initialization of vals, which can be illustrated with a simpler example:
class A {
  val x = a
  val a = "String"
}
scala> new A().x
res1: String = null
(Although thankfully, in this particular case, the compiler will detect something being afoot and will present a warning.)
To avoid the problem, declare fruit as a lazy val, which defers its evaluation until first access.
The problem is the initialization order. val fruit = ... is being initialized after val descriptor = ..., so at the point when descriptor is being initialized, fruit is still null. You can fix this by making fruit a lazy val, because then it will be initialized on first access.
Your descriptor field is initialized earlier than the fruit field, because a trait is initialized before the class that extends it. null is a field's value before initialization; that's why you get it. In the def case there is no field, just a method call, so everything is fine (a method's body may be run several times; no initialization is involved). See http://docs.scala-lang.org/tutorials/FAQ/initialization-order.html
Why is def so different? Because a def may be called several times, but a val only once (so its first and only call is effectively the initialization of the field).
The typical solution to such a problem is to use a lazy val instead; it will be initialized when you actually need it. Another solution is early initializers.
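Both fixes, sketched against the FruitTree example from the question (early initializers are Scala 2 syntax; they were deprecated in 2.13 and removed in Scala 3):

// Fix 1: implement the abstract def with a lazy val; it is evaluated on first
// access, even when that access happens during trait initialization
object BananaTree extends FruitTree {
  lazy val fruit = "banana"
}

// Fix 2: an early initializer assigns fruit before the trait body runs
object BananaTree2 extends { val fruit = "banana" } with FruitTree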
Another, simpler example of what's going on:
scala> class A {val a = b; val b = 5}
<console>:7: warning: Reference to uninitialized value b
class A {val a = b; val b = 5}
^
defined class A
scala> (new A).a
res2: Int = 0 // the uninitialized default, analogous to null
Talking more generally: in theory Scala could analyze the dependency graph between fields (which field needs which other field) and start initialization from the leaf nodes. But in practice every module is compiled separately, and the compiler might not even know those dependencies (it might even be Java calling Scala calling Java), so it just does sequential initialization.
Because of that, it cannot even detect simple cycles:
scala> class A {val a: Int = b; val b: Int = a}
<console>:7: warning: Reference to uninitialized value b
class A {val a: Int = b; val b: Int = a}
^
defined class A
scala> (new A).a
res4: Int = 0
scala> class A {lazy val a: Int = b; lazy val b: Int = a}
defined class A
scala> (new A).a
java.lang.StackOverflowError
Actually, such a cycle (within one module) could in theory be detected by a separate build step, but it wouldn't help much, as the cycle is already pretty obvious from the source.

Array of elements of compound type

I have been playing with compound types lately, and I recently tried the following code:
import scala.reflect._
object Main {
  def main(args: Array[String]) {
    val a1 = classTag[Int with String with Double].newArray(3)
    println(a1.mkString(" "))
    val a2 = classTag[String with Int with Double].newArray(3)
    println(a2.mkString(" "))
    val a3 = classTag[Double with Int with String].newArray(3)
    println(a3.mkString(" "))
  }
}
With the following output:
0 0 0
null null null
0.0 0.0 0.0
This seems a bit strange to me. Each of the array elements has access to the methods of all three types: Int, String and Double. What exactly is happening behind the scenes here? Are the compound types actually instances of the first type? Can a compound type be instantiated explicitly? Is their use case only for when the types composing the compound are related through inheritance, and so on? Thanks.
P.S.: I'm using Scala 2.11.4
What is happening exactly behind the scenes here? Are the compound types actually instances of the first type?
Looking at the newArray method as defined for ClassTag:
override def newArray(len: Int): Array[T] =
  runtimeClass match {
    case java.lang.Byte.TYPE    => new Array[Byte](len).asInstanceOf[Array[T]]
    case java.lang.Integer.TYPE => new Array[Int](len).asInstanceOf[Array[T]]
    /* snip */
    case _ => java.lang.reflect.Array.newInstance(runtimeClass, len).asInstanceOf[Array[T]]
  }
And then your type:
scala> classTag[Int with String with Double].runtimeClass
res4: Class[_] = int
It seems pretty clear how it's arriving at the conclusion to use the first type. The runtimeClass of Int with String with Double is Int, and the newArray method uses the runtimeClass to construct the new Array. So the Array is filled with default Int values. Likewise for the other order combinations.
Why is the runtime class Int? Well, Int with String with Double isn't an actual class, but Int is. The compiler has to pick some class to represent the compound at runtime, so why not the first?
Can a compound type be instantiated explicitly?
I'm not sure what you mean by this. If you think about it, the compiler has to favor one type in this composition. What would an explicit Int with String with Double even look like? What about Int with String with Double with Product with Process? I don't know, either.
Each of the array elements has access to the methods of the three types: Int, String and Double
They do, and yet they don't.
scala> val arr = classTag[Int with Process].newArray(3)
arr: Array[Int with Process] = Array(0, 0, 0)
scala> arr(0).exitValue()
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Process
You're really just tricking the compiler into believing you have an Array of these types, when in fact you don't (and can't, because nothing can be a sub-type of both Int and Process).
Their use case is only for when the types composing the compound are related through inheritance and so on?
Pretty much, yes. I can't think of any situation in which you'd want to build these types directly. Their main use case is to guarantee that a type inherits from other specific types.
For example, a method that requires that a parameter inherit methods from two traits:
trait Hello { def hello = println("Hello") }
trait Bye { def bye = println("Bye") }
def greet(hb: Hello with Bye): Unit = { hb.hello; hb.bye }
class A extends Hello
class B extends Hello with Bye
scala> greet(new A)
<console>:15: error: type mismatch;
found : A
required: Hello with Bye
greet(new A)
^
scala> greet(new B)
Hello
Bye
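For what it's worth, a trait-based compound type can be instantiated directly with an anonymous class, which is as close to "instantiating a compound type" as it gets (continuing the example above):

scala> greet(new Hello with Bye {})
Hello
Bye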

How to make a "n to m" restricted int variant in Scala?

I'd like to have types such as Int_1to3 or Uint in Scala. Preferably, there'd be a general factory method that can provide any such type.
It's mostly for self-documentation purposes, but the values would also be checked on arrival (i.e. via assert).
I was somewhat surprised not to find a solution to this during my initial (Google) search. The closest I came was Unsigned.scala, but that's overkill for my needs.
This must be dead simple?
Just to give an idea of the usage, something like this would be splendid! :)
type Int_1to3 = Int_limited( 1 to 3 )
type Uint = Int_limited( _ >= 0 )
I see two potential solutions:
First, you can have a look at Unboxed Type Tags. They allow you to attach a tag to a type at compile time without having to box the integer. The compiler will enforce the tagged types where they are required, but the values are checked at runtime.
From the cited article, you could write something like:
type Tagged[U] = { type Tag = U }
type ##[T, U] = T with Tagged[U]
trait Positive
trait One2Three
type PositiveInt = Int ## Positive
type SmallInt = Int ## One2Three
//Builds a small int
def small(i: Int): SmallInt = {
  require( i > 0 && i < 4, "should be between 1 and 3" )
  i.asInstanceOf[SmallInt]
}

//Builds a positive int
def positive(i: Int): PositiveInt = {
  require( i >= 0, "should be positive" )
  i.asInstanceOf[PositiveInt]
}

//Example usage in methods
def mul(i: SmallInt, j: PositiveInt): PositiveInt = positive(i*j)
Then in the REPL:
scala> mul( small(2), positive(4) )
res1: PositiveInt = 8
scala> mul( small(4), positive(2) ) //RUNTIME ERROR
java.lang.IllegalArgumentException: should be between 1 and 3
scala> mul( 2, positive(2) ) //COMPILE TIME ERROR
<console>:16: error: type mismatch;
found : Int(2)
required: SmallInt
mul( 2, positive(2) )
^
The second solution may be value classes, introduced in Scala 2.10. You can read SIP-15 to see how to use them.
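A sketch of the value-class route (assuming the SIP-15 rules; the range check still runs at runtime, and note that nothing stops a caller from writing new Int1to3(7) directly, since a value class cannot have statements in its body):

class Int1to3(val i: Int) extends AnyVal

object Int1to3 {
  def apply(i: Int): Int1to3 = {
    require(i >= 1 && i <= 3, "should be between 1 and 3")
    new Int1to3(i)
  }
}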
A "pattern" you can use here is to declare a sealed type with a private constructor as a wrapper around the underlying value, which is restricted to a single point that it can be validated and instantiated. Eg
sealed abstract class Int_1to3(val i: Int)
object Int_1to3 {
  def apply(i: Int): Option[Int_1to3] =
    if (1.to(3).contains(i)) Some(new Int_1to3(i) {}) else None
}
That way, whenever you end up with an instance of some x of type Int_1to3, you have a compile-time guarantee that x.i will be 1, 2 or 3.
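Usage then looks like this (a quick check):

Int_1to3(2).map(_.i) // Some(2)
Int_1to3(5)          // None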
Because you have such "low standards", it is enough to do:
def safeInt(i: Int, f: Int => Boolean): Int =
  if (f(i)) i else throw new IllegalArgumentException("wrong int")

def int1to3(i: Int) =
  safeInt(i, 1 to 3 contains _)

def uInt(i: Int) =
  safeInt(i, _ >= 0)
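Usage (a quick check):

int1to3(2) // 2
uInt(-1)   // throws IllegalArgumentException: wrong int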
Making this a type doesn't make much sense unless you want the compiler to enforce the invariant and keep your code safe. That is possible, but, as you said, not what you need.
No, there is nothing like that in the language. The solutions available -- through libraries -- are what you call "overkilL".
I saw the video on ScalaTest/Scalactic 3.0 yesterday, and in it @Bill-Venners discussed the PosInt and PozInt types, which are very close to what I was asking for in 2012.
He also presented an OddInt sample showing how to create such value types ourselves.
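A sketch in that spirit (hypothetical; not the actual Scalactic source), combining a value class with a validating factory:

final class OddInt(val value: Int) extends AnyVal {
  override def toString = s"OddInt($value)"
}

object OddInt {
  // returns None for even inputs instead of throwing
  def from(value: Int): Option[OddInt] =
    if (value % 2 != 0) Some(new OddInt(value)) else None
}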