Structural Sharing in Scala Immutable Collections is quite straightforward and there's a lot of material floating around to understand it.
Now every Scala case class automatically defines a copy method, which returns a new copy with the new attributes specified, my question is, does the method use Structural Sharing?
So when I have a
case class A(x: HugeObject, y: Int)
and call the copy method
val a = A(x,y)
val b = a.copy(y = 5)
Does it copy x?
A case class is flat tuple, as such when using copy a new instance is allocated with slots for each product element. However, the elements themselves are not in any form duplicated but shared by reference (except for the value passed into the copy method).
case class Foo(a: AnyRef, b: AnyRef)
val f1 = Foo(new AnyRef, new AnyRef)
val f2 = f1.copy(a = new AnyRef)
f1.a == f2.a // false
f1.b == f2.b // true
f1.b eq f2.b // true
So in your case, x is only reused as the same reference but not structurally duplicated.
Related
I am trying to create a method that will take a list of Objects and a list of Types and then check whether a particular object matches a particular type.
I tried to create a dummy example. In the following code validate method takes list of objects and a List of type then tried to validate the types of each object. I know following code is wrong but I just created to communicate the problem, it would be really great if you can tell me how can I do this in Scala.
object T extends App {
trait MyT
case class A(a: Int) extends MyT
case class B(b: Int) extends MyT
def validate(values: Seq[Any], types: Seq[MyT]): Seq[Boolean] =
for ((v, ty: MyT) ← values.zip(types)) yield v.isInstanceOf[ty]
// create objects
val a = A(1)
val b = B(1)
// call validate method
validate(Seq(a, b), Seq(A, B))
}
Types are not values, so you can't have a list of them.
But, you may use Class instead:
def validate(values: Seq[Any], types: Seq[Class[_ <: MyT]]): Seq[Boolean] =
values.lazyZip(types).map {
case (v, c) =>
c.isInstance(v)
}
Which can be used like this:
validate(Seq(a, b), Seq(classOf[A], classOf[B]))
// res: Seq[Boolean] = List(true, true)
You can see the code running here.
However, note that this will break if your classes start to have type parameters due to type erasure.
Note, I personally don't recommend this code since, IMHO, the very definition of validate is a code smell and I would bet you have a design error that lead to this situation.
I recommend searching for a way to avoid this situation and solve the root problem.
I am trying to understand how a case class can be passed as an argument to a function which accepts functions as arguments. Below is an example:
Consider the below function
def !![B](h: Out[B] => A): In[B] = { ... }
If I understood correctly, this is a polymorphic method which has a type parameter B and accepts a function h as a parameter. Out and In are other two classes defined previously.
This function is then being used as shown below:
case class Q(p: boolean)(val cont: Out[R])
case class R(p: Int)
def g(c: Out[Q]) = {
val rin = c !! Q(true)_
...
}
I am aware that currying is being used to avoid writing the type annotation and instead just writing _. However, I cannot grasp why and how the case class Q is transformed to a function (h) of type Out[B] => A.
EDIT 1 Updated !! above and the In and Out definitions:
abstract class In[+A] {
def future: Future[A]
def receive(implicit d: Duration): A = {
Await.result[A](future, d)
}
def ?[B](f: A => B)(implicit d: Duration): B = {
f(receive)
}
}
abstract class Out[-A]{
def promise[B <: A]: Promise[B]
def send(msg: A): Unit = promise.success(msg)
def !(msg: A) = send(msg)
def create[B](): (In[B], Out[B])
}
These code samples are taken from the following paper: http://drops.dagstuhl.de/opus/volltexte/2016/6115/
TLDR;
Using a case class with multiple parameter lists and partially applying it will yield a partially applied apply call + eta expansion will transform the method into a function value:
val res: Out[Q] => Q = Q.apply(true) _
Longer explanation
To understand the way this works in Scala, we have to understand some fundamentals behind case classes and the difference between methods and functions.
Case classes in Scala are a compact way of representing data. When you define a case class, you get a bunch of convenience methods which are created for you by the compiler, such as hashCode and equals.
In addition, the compiler also generates a method called apply, which allows you to create a case class instance without using the new keyword:
case class X(a: Int)
val x = X(1)
The compiler will expand this call to
val x = X.apply(1)
The same thing will happen with your case class, only that your case class has multiple argument lists:
case class Q(p: boolean)(val cont: Out[R])
val q: Q = Q(true)(new Out[Int] { })
Will get translated to
val q: Q = Q.apply(true)(new Out[Int] { })
On top of that, Scala has a way to transform methods, which are a non value type, into a function type which has the type of FunctionX, X being the arity of the function. In order to transform a method into a function value, we use a trick called eta expansion where we call a method with an underscore.
def foo(i: Int): Int = i
val f: Int => Int = foo _
This will transform the method foo into a function value of type Function1[Int, Int].
Now that we posses this knowledge, let's go back to your example:
val rin = c !! Q(true) _
If we just isolate Q here, this call gets translated into:
val rin = Q.apply(true) _
Since the apply method is curried with multiple argument lists, we'll get back a function that given a Out[Q], will create a Q:
val rin: Out[R] => Q = Q.apply(true) _
I cannot grasp why and how the case class Q is transformed to a function (h) of type Out[B] => A.
It isn't. In fact, the case class Q has absolutely nothing to do with this! This is all about the object Q, which is the companion module to the case class Q.
Every case class has an automatically generated companion module, which contains (among others) an apply method whose signature matches the primary constructor of the companion class, and which constructs an instance of the companion class.
I.e. when you write
case class Foo(bar: Baz)(quux: Corge)
You not only get the automatically defined case class convenience methods such as accessors for all the elements, toString, hashCode, copy, and equals, but you also get an automatically defined companion module that serves both as an extractor for pattern matching and as a factory for object construction:
object Foo {
def apply(bar: Baz)(quux: Corge) = new Foo(bar)(quux)
def unapply(that: Foo): Option[Baz] = ???
}
In Scala, apply is a method that allows you to create "function-like" objects: if foo is an object (and not a method), then foo(bar, baz) is translated to foo.apply(bar, baz).
The last piece of the puzzle is η-expansion, which lifts a method (which is not an object) into a function (which is an object and can thus be passed as an argument, stored in a variable, etc.) There are two forms of η-expansion: explicit η-expansion using the _ operator:
val printFunction = println _
And implicit η-expansion: in cases where Scala knows 100% that you mean a function but you give it the name of a method, Scala will perform η-expansion for you:
Seq(1, 2, 3) foreach println
And you already know about currying.
So, if we put it all together:
Q(true)_
First, we know that Q here cannot possibly be the class Q. How do we know that? Because Q here is used as a value, but classes are types, and like most programming languages, Scala has a strict separation between types and values. Therefore, Q must be a value. In particular, since we know class Q is a case class, object Q is the companion module for class Q.
Secondly, we know that for a value Q
Q(true)
is syntactic sugar for
Q.apply(true)
Thirdly, we know that for case classes, the companion module has an automatically generated apply method that matches the primary constructor, so we know that Q.apply has two parameter lists.
So, lastly, we have
Q.apply(true) _
which passes the first argument list to Q.apply and then lifts Q.apply into a function which accepts the second argument list.
Note that case classes with multiple parameter lists are unusual, since only the parameters in the first parameter list are considered elements of the case class, and only elements benefit from the "case class magic", i.e. only elements get accessors implemented automatically, only elements are used in the signature of the copy method, only elements are used in the automatically generated equals, hashCode, and toString() methods, and so on.
See this implementation follows the upper bound example http://docs.scala-lang.org/tutorials/tour/upper-type-bounds.html
class Fruit(name: String)
class Apple (name: String) extends Fruit(name)
class Orange(name: String) extends Fruit(name)
class BigOrange(name:String) extends Orange(name)
class BigFLOrange(name:String) extends BigOrange(name)
// Straight from the doc
trait Node[+B ] {
def prepend[U >: B ](elem: U)
}
case class ListNode[+B](h: B, t: Node[B]) extends Node[B] {
def prepend[U >:B ](elem: U) = ListNode[U](elem, this)
def head: B = h
def tail = t
}
case class Nil[+B ]() extends Node[B] {
def prepend[U >: B ](elem: U) = ListNode[U](elem, this)
}
But this definition seems to allow multiple unrelated things in the same container
val f = new Fruit("fruit")
val a = new Apple("apple")
val o = new Orange("orange")
val bo = new BigOrange("big orange")
val foo :ListNode[BigOrange] = ListNode[BigOrange](bo, Nil())
foo.prepend(a) // add an apple to BigOrangeList
foo.prepend(o) // add an orange to BigOrangeList
val foo2 : ListNode[Orange] = foo // and still get to assign to OrangeList
So I am not sure this is a great example in the docs. And, question, how doe I modify the constraints so that..this behaves more like a List?
User #gábor-bakos points out that I am confusing invariance with covariance. So I tried the mutable list buffer. It does not later allow apple to be inserted into an Orange list Buffer, but it is not covariant
val ll : ListBuffer[BigOrange]= ListBuffer(bo)
ll += bo //good
ll += a // not allowed
So..can my example above (ListNode) be modified so that
1. it is covariant (it already is)
2 It is mutable, but mutable like the ListBuffer example (will not later allow apples to be inserted into BigOrange list
A mutable list cannot/should not be covariant in its argument type.
Exactly because of the reason you noted.
Suppose, you could have a MutableList[Orange], that was a subclass of a MutableList[Fruit]. Now, there is nothing that would prevent you from making a function:
def putApple(fruits: MutableList[Fruit], idx: Int) =
fruits(idx) = new Apple
You can add Apple to the list of Fruits, because Apple is a Fruit, nothing wrong with this.
But once you have a function like that, there is no reason you can't call it like this:
val oranges = new MutableList[Orange](new Orange, new Orange)
putApple(oranges, 0)
This will compile, since MutableList[Orange] is a subclass of MutableList[Fruit]. But now:
val firstOrange: Orange = oranges(0)
will crash, because the first element of oranges is actually an Apple.
For this reason, mutable collections have to be invariant in the element type (to answer the question you asked in the comments, to make the list invariant remove the + before B, and also get rid of the type parameter in prepend. It should just be def pretend(elem: B)).
How to get around it? The best solution is to simply not use mutable collections. You should not need them in 99% or real life scala code. If you think you need one, there is a 99% you are doing something wrong.
The major thing that you are probably missing is that prepend does not modify a list. In the line val foo2 : ListNode[Orange] = foo the list foo is still of type ListNode[BigOrange] and given covariance of parameters this assignment is valid (and there is nothing particularly awkward about it). Compiler will prevent you from assigning Fruits to Oranges (in this case assigning an Apple to an Orange is hazardous) but you have to save modified lists beforehand:
val foo: ListNode[Fruit] = ListNode[BigOrange](bo, Nil()).prepend(a).prepend(o)
val foo2: ListNode[Orange] = foo // this is not valid
Moreover your Node definition lacks return type for prepend - thus compiler infers wrong return type (Unit instead of ListNode[U]).
Here's fixed version:
trait Node[+B] {
def prepend[U >: B ](elem: U): Node[U]
}
From what I have understood, scala treats val definitions as values.
So, any instance of a case class with same parameters should be equal.
But,
case class A(a: Int) {
lazy val k = {
println("k")
1
}
val a1 = A(5)
println(a1.k)
Output:
k
res1: Int = 1
println(a1.k)
Output:
res2: Int = 1
val a2 = A(5)
println(a1.k)
Output:
k
res3: Int = 1
I was expecting that for println(a2.k), it should not print k.
Since this is not the required behavior, how should I implement this so that for all instances of a case class with same parameters, it should only execute a lazy val definition only once. Do I need some memoization technique or Scala can handle this on its own?
I am very new to Scala and functional programming so please excuse me if you find the question trivial.
Assuming you're not overriding equals or doing something ill-advised like making the constructor args vars, it is the case that two case class instantiations with same constructor arguments will be equal. However, this does not mean that two case class instantiations with same constructor arguments will point to the same object in memory:
case class A(a: Int)
A(5) == A(5) // true, same as `A(5).equals(A(5))`
A(5) eq A(5) // false
If you want the constructor to always return the same object in memory, then you'll need to handle this yourself. Maybe use some sort of factory:
case class A private (a: Int) {
lazy val k = {
println("k")
1
}
}
object A {
private[this] val cache = collection.mutable.Map[Int, A]()
def build(a: Int) = {
cache.getOrElseUpdate(a, A(a))
}
}
val x = A.build(5)
x.k // prints k
val y = A.build(5)
y.k // doesn't print anything
x == y // true
x eq y // true
If, instead, you don't care about the constructor returning the same object, but you just care about the re-evaluation of k, you can just cache that part:
case class A(a: Int) {
lazy val k = A.kCache.getOrElseUpdate(a, {
println("k")
1
})
}
object A {
private[A] val kCache = collection.mutable.Map[Int, Int]()
}
A(5).k // prints k
A(5).k // doesn't print anything
The trivial answer is "this is what the language does according to the spec". That's the correct, but not very satisfying answer. It's more interesting why it does this.
It might be clearer that it has to do this with a different example:
case class A[B](b: B) {
lazy val k = {
println(b)
1
}
}
When you're constructing two A's, you can't know whether they are equal, because you haven't defined what it means for them to be equal (or what it means for B's to be equal). And you can't statically intitialize k either, as it depends on the passed in B.
If this has to print twice, it would be entirely intuitive if that would only be the case if k depends on b, but not if it doesn't depend on b.
When you ask
how should I implement this so that for all instances of a case class with same parameters, it should only execute a lazy val definition only once
that's a trickier question than it sounds. You make "the same parameters" sound like something that can be known at compile time without further information. It's not, you can only know it at runtime.
And if you only know that at runtime, that means you have to keep all past uses of the instance A[B] alive. This is a built in memory leak - no wonder Scala has no built-in way to do this.
If you really want this - and think long and hard about the memory leak - construct a Map[B, A[B]], and try to get a cached instance from that map, and if it doesn't exist, construct one and put it in the map.
I believe case classes only consider the arguments to their constructor (not any auxiliary constructor) to be part of their equality concept. Consider when you use a case class in a match statement, unapply only gives you access (by default) to the constructor parameters.
Consider anything in the body of case classes as "extra" or "side effect" stuffs. I consider it a good tactic to make case classes as near-empty as possible and put any custom logic in a companion object. Eg:
case class Foo(a:Int)
object Foo {
def apply(s: String) = Foo(s.toInt)
}
In addition to dhg answer, I should say, I'm not aware of functional language that does full constructor memoizing by default. You should understand that such memoizing means that all constructed instances should stick in memory, which is not always desirable.
Manual caching is not that hard, consider this simple code
import scala.collection.mutable
class Doubler private(a: Int) {
lazy val double = {
println("calculated")
a * 2
}
}
object Doubler{
val cache = mutable.WeakHashMap.empty[Int, Doubler]
def apply(a: Int): Doubler = cache.getOrElseUpdate(a, new Doubler(a))
}
Doubler(1).double //calculated
Doubler(5).double //calculated
Doubler(1).double //most probably not calculated
I have a class with several parameters such as class Building(val a: Int, val b: Int, val c: Int). This code I have to update it is this:
def updatedA(a: Int): Building = new Building(a, this.b, this.c)
def updatedB(b: Int): Building = new Building(this.a, b, this.c)
Is there a shorter way to get an updated object like the following?
def updatedA(newA: Int): Building = new { val a = newA } extends this // doesn't compile/ type is AnyRef instead of Building
Have you looked at the copyWith() construction mechanism ?
The copyWith method takes advantage of Scala 2.8's default arguments.
For each field in the class, it has a parameter which defaults to the
current value. Whichever arguments aren't specified will take on their
original values. We can call the method using named arguments (also in
Scala 2.8).
val newJoe = joe.copyWith(phone = "555-1234")
Or check out the copy mechanism for case classes.
/* The real power is the copy constructor that is automatically
generated in the case class. I can make a copy with any or all
attributes modifed by using the copy constructor and declaring which
field to modify */
scala> parent.copy(right = Some(node))