Instance of trait throws null pointer - Scala

Why does a null pointer exception occur in the code below?
object Test extends App {
  trait MyTrait[A] { self =>
    val seq: Seq[A]
    val size = seq.size // null pointer here
  }
  val p = new MyTrait[Int] {
    val seq = Seq(1, 2, 3)
  }
}
If I change the size field to a lazy val, then it's OK.

Fields are initialized in the order in which they are mixed in, so everything in the trait happens first, and only then is the val assigned to Seq(1, 2, 3) (since you're essentially mixing the trait into an anonymous subclass).
As you have discovered, a lazy val is often a way out of this mess: seq.size isn't actually evaluated until you need it, which is after the seq field has been populated.

The style recommendation is to avoid vals in traits, in favor of defs and lazy vals, precisely because of these initialization-order land mines.
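For completeness, a minimal sketch of both fixes applied to the code from the question:

object Test extends App {
  trait MyTrait[A] { self =>
    val seq: Seq[A]
    lazy val size = seq.size // evaluated on first access, after seq is initialized
    // or: def size = seq.size  (recomputed on every call)
  }
  val p = new MyTrait[Int] {
    val seq = Seq(1, 2, 3)
  }
  println(p.size) // prints 3, no NPE
}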
Sample conversation:
https://groups.google.com/forum/?fromgroups=#!topic/scala-user/nrOrjPOYmb0

Related

Strange NPE with io.circe.Decoder

I have 2 values declared as follows:
implicit val decodeURL: Decoder[URL] = Decoder.decodeString.emapTry(s => Try(new URL(s))) // #1
implicit val decodeCompleted = Decoder[List[URL]].prepare(_.downField("completed")) // #2
Both lines compile and run.
However, if I annotate #2 with its type, i.e. implicit val decodeCompleted: Decoder[List[URL]] = Decoder[List[URL]].prepare(_.downField("completed")), it still compiles, but #2 throws a NullPointerException (NPE) at runtime.
How could this happen? I don't know whether this is a Circe issue or plain Scala. Why is #2 different from #1? Thanks
The issue is that you are supposed to always annotate implicit definitions with their types.
When you use them unannotated, you get into a sort of undefined/invalid behavior zone. That is why something like this happens with the unannotated version:
I want to use Decoder[List[URL]]
there is no (annotated) Decoder[List[URL]] implicit in the scope
let's derive it normally (no need for generic.auto._ because the definition for that is in a companion object)
once derived, you call .prepare(_.downField("completed")) on it
the final result is of type Decoder[List[URL]], so that is the inferred type of decodeCompleted
Now, what happens if you annotate?
I want to use Decoder[List[URL]]
there is decodeCompleted, declared as something that fulfills that requirement
let's use the decodeCompleted value
but decodeCompleted wasn't initialized! In fact, we are initializing it right now!
as a result you end up with decodeCompleted = null
This is virtually equivalent to:
val decodeCompleted = decodeCompleted
except that the layer of indirection gets in the way of the compiler discovering the absurdity of this. (If you replaced val with def you would end up with infinite recursion and a stack overflow):
# implicit val s: String = implicitly[String]
s: String = null
# implicit def s: String = implicitly[String]
defined function s
# s
java.lang.StackOverflowError
ammonite.$sess.cmd1$.s(cmd1.sc:1)
ammonite.$sess.cmd1$.s(cmd1.sc:1)
ammonite.$sess.cmd1$.s(cmd1.sc:1)
ammonite.$sess.cmd1$.s(cmd1.sc:1)
Yup, it's messed up by the compiler. You did nothing wrong, and in a perfect world it would work.
The Scala community mitigates this by distinguishing between:
auto derivation - when you need an implicit somewhere and it is automatically derived, without you defining a value for it
semiauto derivation - when you derive into a value, and make that value implicit
In the latter case, you usually have some utilities like:
import io.circe.generic.semiauto._
implicit val decodeCompleted: Decoder[List[URL]] = deriveDecoder[List[URL]]
It works because it takes a DerivedDecoder[A] implicit and then extracts the Decoder[A] from it, so you never end up in the implicit val a: A = implicitly[A] scenario.
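For contrast, here is a minimal sketch of the semiauto pattern on a type where deriveDecoder genuinely applies, a case class (the Job type and its field are made up for illustration):

import io.circe.Decoder
import io.circe.generic.semiauto._

// Hypothetical payload type; deriveDecoder works on case classes like this.
case class Job(completed: List[String])

implicit val jobDecoder: Decoder[Job] = deriveDecoder[Job]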
Indeed, the problem is that you introduce a recursive val, as @MateuszKubuszok explained.
The most straightforward—although slightly ugly—workaround is:
implicit val decodeCompleted: Decoder[List[URL]] = {
  val decodeCompleted = null
  Decoder[List[URL]].prepare(_.downField("completed"))
}
By shadowing decodeCompleted in the right-hand side, implicit search will no longer consider it as a candidate inside that code block, because it can no longer be referenced.
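Another way to sidestep the self-reference, sketched here assuming circe's usual companion decoders, is to name the list decoder explicitly instead of summoning it through implicit search:

import java.net.URL
import io.circe.Decoder

// Decoder.decodeList builds a Decoder[List[A]] from a Decoder[A] (here the
// decodeURL from #1), so the right-hand side never consults the implicit
// being defined.
implicit val decodeCompleted: Decoder[List[URL]] =
  Decoder.decodeList[URL].prepare(_.downField("completed"))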

Why does each new instance of a case class evaluate lazy vals again in Scala?

From what I have understood, Scala treats val definitions as values.
So any two instances of a case class with the same parameters should be equal.
But,
case class A(a: Int) {
  lazy val k = {
    println("k")
    1
  }
}
val a1 = A(5)
println(a1.k)
Output:
k
res1: Int = 1
println(a1.k)
Output:
res2: Int = 1
val a2 = A(5)
println(a2.k)
Output:
k
res3: Int = 1
I was expecting that println(a2.k) would not print k.
Since this is not the behavior I want, how should I implement this so that, for all instances of a case class with the same parameters, a lazy val definition is only executed once? Do I need some memoization technique, or can Scala handle this on its own?
I am very new to Scala and functional programming, so please excuse me if you find the question trivial.
Assuming you're not overriding equals or doing something ill-advised like making the constructor args vars, it is the case that two case class instantiations with the same constructor arguments will be equal. However, this does not mean that they will point to the same object in memory:
case class A(a: Int)
A(5) == A(5) // true, same as `A(5).equals(A(5))`
A(5) eq A(5) // false
If you want the constructor to always return the same object in memory, then you'll need to handle this yourself. Maybe use some sort of factory:
case class A private (a: Int) {
  lazy val k = {
    println("k")
    1
  }
}

object A {
  private[this] val cache = collection.mutable.Map[Int, A]()
  def build(a: Int) = {
    cache.getOrElseUpdate(a, A(a))
  }
}
val x = A.build(5)
x.k // prints k
val y = A.build(5)
y.k // doesn't print anything
x == y // true
x eq y // true
If, instead, you don't care about the constructor returning the same object, but you just care about the re-evaluation of k, you can just cache that part:
case class A(a: Int) {
  lazy val k = A.kCache.getOrElseUpdate(a, {
    println("k")
    1
  })
}

object A {
  private[A] val kCache = collection.mutable.Map[Int, Int]()
}
A(5).k // prints k
A(5).k // doesn't print anything
The trivial answer is "this is what the language does according to the spec". That's the correct, but not very satisfying answer. It's more interesting why it does this.
It might be clearer that it has to do this if we look at a different example:
case class A[B](b: B) {
  lazy val k = {
    println(b)
    1
  }
}
When you're constructing two A's, you can't know whether they are equal, because you haven't defined what it means for them to be equal (or what it means for two Bs to be equal). And you can't statically initialize k either, as it depends on the B that gets passed in.
And given that this case has to print twice, it would hardly be intuitive if the behavior changed depending on whether or not k happens to depend on b.
When you ask
how should I implement this so that for all instances of a case class with same parameters, it should only execute a lazy val definition only once
that's a trickier question than it sounds. You make "the same parameters" sound like something that can be known at compile time without further information. It's not; you can only know it at runtime.
And if you only know that at runtime, it means you have to keep all previously constructed A[B] instances alive. That is a built-in memory leak; no wonder Scala has no built-in way to do this.
If you really want this (and think long and hard about the memory leak first), construct a Map[B, A[B]], try to get a cached instance from that map, and if it doesn't exist, construct one and put it in the map.
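A minimal interning sketch along those lines, reusing the generic A[B] from above (repeated here for self-containment; Interner and intern are hypothetical names, the cast is forced by the heterogeneous cache, and a plain Map pins every instance in memory forever, which is exactly the leak just mentioned):

import scala.collection.mutable

case class A[B](b: B) {
  lazy val k = {
    println(b)
    1
  }
}

object Interner {
  private val cache = mutable.Map.empty[Any, A[_]]
  // Returns the cached instance for `b`, constructing it on first request.
  def intern[B](b: B): A[B] =
    cache.getOrElseUpdate(b, A(b)).asInstanceOf[A[B]]
}

Interner.intern("x").k // prints x
Interner.intern("x").k // same cached instance: prints nothing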
I believe case classes only consider the arguments to their primary constructor (not any auxiliary constructor) to be part of their equality concept. Consider that when you use a case class in a match statement, unapply only gives you access (by default) to the constructor parameters.
Consider anything in the body of a case class as "extra" or "side-effect" stuff. I consider it a good tactic to keep case classes as near-empty as possible and to put any custom logic in a companion object. E.g.:
case class Foo(a: Int)
object Foo {
  def apply(s: String) = Foo(s.toInt)
}
In addition to dhg's answer, I should say that I'm not aware of any functional language that memoizes constructors by default. You should understand that such memoization means all constructed instances stay in memory, which is not always desirable.
Manual caching is not that hard; consider this simple code:
import scala.collection.mutable

class Doubler private (a: Int) {
  lazy val double = {
    println("calculated")
    a * 2
  }
}

object Doubler {
  val cache = mutable.WeakHashMap.empty[Int, Doubler]
  def apply(a: Int): Doubler = cache.getOrElseUpdate(a, new Doubler(a))
}
Doubler(1).double //calculated
Doubler(5).double //calculated
Doubler(1).double //most probably not calculated

Does a case class's copy method use structural sharing?

Structural sharing in Scala's immutable collections is quite straightforward, and there's a lot of material floating around explaining it.
Now, every Scala case class automatically defines a copy method, which returns a new copy with the specified attributes replaced. My question is: does this method use structural sharing?
So when I have a
case class A(x: HugeObject, y: Int)
and call the copy method
val a = A(x,y)
val b = a.copy(y = 5)
Does it copy x?
A case class is a flat tuple; as such, when using copy, a new instance is allocated with slots for each product element. However, the elements themselves are not duplicated in any form but shared by reference (except for the values passed into the copy method).
case class Foo(a: AnyRef, b: AnyRef)
val f1 = Foo(new AnyRef, new AnyRef)
val f2 = f1.copy(a = new AnyRef)
f1.a == f2.a // false
f1.b == f2.b // true
f1.b eq f2.b // true
So in your case, x is reused by reference, not structurally duplicated.
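Conceptually, the generated copy is just a constructor call whose default arguments point at the current fields, which is why the untouched elements are shared. A hand-written sketch of the equivalent (not the literal compiler output):

class FooLike(val a: AnyRef, val b: AnyRef) {
  // Default arguments reuse this instance's fields by reference.
  def copy(a: AnyRef = this.a, b: AnyRef = this.b): FooLike =
    new FooLike(a, b)
}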

Getting a null with a val depending on abstract def in a trait [duplicate]

This question already has answers here: Scala - initialization order of vals
I'm seeing some initialization weirdness when mixing vals and defs in my trait. The situation can be summarized with the following example.
I have a trait which provides an abstract field, let's call it fruit, which should be implemented in child classes. It also uses that field in a val:
scala> class FruitTreeDescriptor(fruit: String) {
| def describe = s"This tree has loads of ${fruit}s"
| }
defined class FruitTreeDescriptor
scala> trait FruitTree {
| def fruit: String
| val descriptor = new FruitTreeDescriptor(fruit)
| }
defined trait FruitTree
When overriding fruit with a def, things work as expected:
scala> object AppleTree extends FruitTree {
| def fruit = "apple"
| }
defined object AppleTree
scala> AppleTree.descriptor.describe
res1: String = This tree has loads of apples
However, if I override fruit using a val...
scala> object BananaTree extends FruitTree {
| val fruit = "banana"
| }
defined object BananaTree
scala> BananaTree.descriptor.describe
res2: String = This tree has loads of nulls
What's going on here?
In simple terms, at the point you're calling:
val descriptor = new FruitTreeDescriptor(fruit)
the constructor for BananaTree has not been given the chance to run yet. This means the value of fruit is still null, even though it's a val.
This is a special case of the well-known quirk that vals are initialized strictly in declaration order, not based on their dependencies, which can be illustrated with a simpler example:
class A {
  val x = a
  val a = "String"
}
scala> new A().x
res1: String = null
(Although thankfully, in this particular case, the compiler will detect something being afoot and will present a warning.)
To avoid the problem, declare fruit as a lazy val, which is evaluated on first access; here that access happens while descriptor is being initialized, so the value is ready in time.
The problem is the initialization order. val fruit = ... is initialized after val descriptor = ..., so at the point when descriptor is being initialized, fruit is still null. You can fix this by making fruit a lazy val, because then it will be initialized on first access.
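A sketch of that fix, reusing the FruitTree and FruitTreeDescriptor definitions from the question:

object BananaTree extends FruitTree {
  lazy val fruit = "banana" // initialized on first access, during descriptor's init
}

BananaTree.descriptor.describe // "This tree has loads of bananas"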
Your descriptor field is initialized earlier than the fruit field, because the trait is initialized before the class that extends it. null is a field's value before initialization; that's why you see it. In the def case there is no field to initialize, just a method call, so everything is fine (a method's body may be executed several times, and no initialization is involved). See http://docs.scala-lang.org/tutorials/FAQ/initialization-order.html
Why is def so different? Because a def may be called several times, but a val only once (so its first and only call is effectively the initialization of the field).
The typical solution to such a problem is to use a lazy val instead; it will be initialized when you actually need it. One more solution is early initializers, sketched below.
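A sketch of the early-initializer form (Scala 2 only; deprecated in 2.13 and dropped in Scala 3):

object BananaTree extends {
  val fruit = "banana" // runs before FruitTree's initializer
} with FruitTree

BananaTree.descriptor.describe // "This tree has loads of bananas"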
Another, simpler example of what's going on:
scala> class A {val a = b; val b = 5}
<console>:7: warning: Reference to uninitialized value b
class A {val a = b; val b = 5}
^
defined class A
scala> (new A).a
res2: Int = 0 // 0 is the uninitialized default for Int, the analogue of null
Talking more generally, Scala could theoretically analyze the dependency graph between fields (which field needs which other field) and start initialization from the leaf nodes. But in practice every module is compiled separately, and the compiler might not even know those dependencies (it might even be Java calling Scala, which calls Java), so it just does sequential initialization.
So, because of that, it cannot even detect simple loops:
scala> class A {val a: Int = b; val b: Int = a}
<console>:7: warning: Reference to uninitialized value b
class A {val a: Int = b; val b: Int = a}
^
defined class A
scala> (new A).a
res4: Int = 0
scala> class A {lazy val a: Int = b; lazy val b: Int = a}
defined class A
scala> (new A).a
java.lang.StackOverflowError
Actually, such a loop (inside one module) could theoretically be detected in a separate build step, but it wouldn't help much, as such cases are pretty obvious anyway.

Extending Scala collections: One based Array index exercise

As an exercise, I'd like to extend the Scala Array collection to my own OneBasedArray (does what you'd expect, indexing starts from 1). Since this is an immutable collection, I'd like to have it return the correct type when calling filter/map etc.
I've read the resources here, here and here, but am struggling to understand how to translate this to Arrays (or collections other than the ones in the examples). Am I on the right track with this sort of structure?
class OneBasedArray[T]
  extends Array[T]
  with GenericTraversableTemplate[T, OneBasedArray]
  with ArrayLike[T, OneBasedArray]
Are there any further resources that help explain extending collections?
For an in-depth overview of the new collections API: The Scala 2.8 Collections API
For a nice view of the relations between the main classes and traits, see this.
By the way, I don't think Array is a collection in Scala.
Here is an example of pimping iterables with a method that always returns the expected runtime type of the iterable it operates on:
import scala.collection.generic.CanBuildFrom

trait MapOrElse[A] {
  val underlying: Iterable[A]
  def mapOrElse[B, To]
      (m: A => Unit)
      (pf: PartialFunction[A, B])
      (implicit cbf: CanBuildFrom[Iterable[A], B, To]): To = {
    val builder = cbf(underlying.repr)
    for (a <- underlying) if (pf.isDefinedAt(a)) builder += pf(a) else m(a)
    builder.result
  }
}

implicit def toMapOrElse[A](it: Iterable[A]): MapOrElse[A] =
  new MapOrElse[A] { val underlying = it }
The new function mapOrElse is similar to the collect function, but it lets you pass a method m: A => Unit, which is invoked whenever pf is undefined, in addition to the partial function pf. m can for example be a logging method.
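A quick usage sketch (assuming a pre-2.13 Scala, since CanBuildFrom was removed in 2.13):

// Logs odd numbers via m, doubles even numbers via pf.
val doubled = Seq(1, 2, 3, 4).mapOrElse(a => println(s"skipped $a")) {
  case a if a % 2 == 0 => a * 2
}
// prints "skipped 1" and "skipped 3"; doubled is List(4, 8)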
An Array is not a Traversable -- trying to work with that as a base class will cause all sorts of problems. Also, it is not immutable either, which makes it completely unsuited to what you want. Finally, Array is an implementation -- try to inherit from traits or abstract classes.
Array isn't a typical Scala collection... It's simply a Java array that's pimped to look like a collection by way of implicit conversions.
Given the messed-up variance of Java Arrays, you really don't want to be using them without an extremely compelling reason, as they're a source of lurking bugs.
(see here: http://www.infoq.com/presentations/Java-Puzzlers)
Creating a 1-based collection like this isn't really a good idea either, as you have no way of knowing how many other collection methods rely on the assumption that sequences are 0-based. So to do it safely (if you really must), you'll want to add a new method that leaves the default one unchanged:
class OneBasedLookup[T](seq: Seq[T]) {
  def atIdx(i: Int) = seq(i - 1)
}

implicit def seqHasOneBasedLookup[T](seq: Seq[T]) = new OneBasedLookup(seq)
// now use `atIdx` on any sequence.
Even safer still, you can create a Map[Int, T] with the indices being one-based:
(Iterator.from(1) zip seq.iterator).toMap
This is arguably the most "correct" solution, although it will also carry the highest performance cost.
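For example, a tiny sketch:

val seq = Seq("a", "b", "c")
val oneBased: Map[Int, String] = (Iterator.from(1) zip seq.iterator).toMap
oneBased(1) // "a"
oneBased(3) // "c"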
Not an array, but here's a one-based immutable IndexedSeq implementation that I recently put together. I followed the example given here where they implement an RNA class. Between that example, the ScalaDocs, and lots of "helpful" compiler errors, I managed to get it set up correctly. The fact that OneBasedSeq is genericized made it a little more complex than the RNA example. Also, in addition to the traits extended and methods overridden in the example, I had to extend IterableLike and override the iterator method, because various methods call that method behind the scenes, and the default iterator is zero-based.
Please pardon any stylistic or idiomatic oddities; I've been programming in Scala for less than 2 months.
import collection.{IndexedSeqLike, IterableLike}
import collection.generic.CanBuildFrom
import collection.mutable.{Builder, ArrayBuffer}

// OneBasedSeq class
final class OneBasedSeq[T] private (s: Seq[T]) extends IndexedSeq[T]
    with IterableLike[T, OneBasedSeq[T]] with IndexedSeqLike[T, OneBasedSeq[T]] {

  private val innerSeq = s.toIndexedSeq
  def apply(idx: Int): T = innerSeq(idx - 1)
  def length: Int = innerSeq.length
  override def iterator: Iterator[T] = new OneBasedSeqIterator(this)
  override def newBuilder: Builder[T, OneBasedSeq[T]] = OneBasedSeq.newBuilder
  override def toString = "OneBasedSeq" + super.toString
}
// OneBasedSeq companion object
object OneBasedSeq {
  private def fromSeq[T](s: Seq[T]) = new OneBasedSeq(s)
  def apply[T](vals: T*) = fromSeq(IndexedSeq(vals: _*))
  def newBuilder[T]: Builder[T, OneBasedSeq[T]] =
    (new ArrayBuffer[T]).mapResult(OneBasedSeq.fromSeq)
  implicit def canBuildFrom[T, U]: CanBuildFrom[OneBasedSeq[T], U, OneBasedSeq[U]] =
    new CanBuildFrom[OneBasedSeq[T], U, OneBasedSeq[U]] {
      def apply() = newBuilder
      def apply(from: OneBasedSeq[T]): Builder[U, OneBasedSeq[U]] = newBuilder[U]
    }
}
// Iterator class for OneBasedSeq
class OneBasedSeqIterator[T](private val obs: OneBasedSeq[T]) extends Iterator[T] {
  private var index = 1
  def hasNext: Boolean = index <= obs.length
  def next: T = {
    val ret = obs(index)
    index += 1
    ret
  }
}
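A quick usage sketch (again assuming a pre-2.13 Scala, where CanBuildFrom and the *Like traits still exist):

val s = OneBasedSeq(10, 20, 30)
s(1)         // 10: indexing starts at 1
s.map(_ * 2) // another OneBasedSeq, built via the custom CanBuildFrom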