Scala HashMap: doesn't += reassign to left hand side?

I am reading Seven Languages in Seven Weeks to get a taste of different programming paradigms. In the chapter about Scala, I found out that collections are immutable (at least the ones from scala.collection.immutable).
However, there is an example which confuses me:
scala> val hashMap = HashMap(0->0)
scala> hashMap += 1->1
scala> hashMap
res42: scala.collection.mutable.HashMap[Int,Int] = Map(1 -> 1, 0 -> 0)
but
scala> map = map + 2->2
<console>:9: error: reassignment to val
map = map + 2->2
Is += reassigning an immutable collection? How is it that += can reassign a val HashMap while = fails?
Moreover, I tried other collections (List and Map) and a "primitive" (Int), and += fails with the reassignment error. How are HashMaps special? I did not find anything about this in the Scala API, and I cannot find a definition for the += operator (I am assuming it is an operator and not a function in Scala, as it is in C++ or Java).
Sorry for the dumb question, but since I am new to Scala I am having difficulties finding resources by myself.

You're right that this works with a var, where the compiler can take
hashMap += 1->1
and desugar it to
hashMap = hashMap + (1->1)
But there's another possibility too. If your hashMap is of type scala.collection.mutable.HashMap, then the expression directly calls the += method defined on that type:
hashMap.+=(1->1)
No val is reassigned, the map just internally mutates itself.
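The two readings of += described above can be compared side by side. The following is only a sketch; the object and method names (PlusEqualsDemo, viaReassignment, viaMutation) are made up for illustration.

```scala
import scala.collection.mutable

object PlusEqualsDemo {
  // Case 1: an immutable Map held in a var. Immutable Map has no += method,
  // so the compiler rewrites `m += (1 -> 1)` to `m = m + (1 -> 1)`.
  def viaReassignment(): (Map[Int, Int], Boolean) = {
    var m = Map(0 -> 0)
    val before = m
    m += (1 -> 1)
    (m, m eq before) // m now points at a new map, so identity differs
  }

  // Case 2: a mutable HashMap held in a val. Here += is an ordinary method
  // that changes the map in place; the val is never reassigned.
  def viaMutation(): (mutable.Map[Int, Int], Boolean) = {
    val m = mutable.HashMap(0 -> 0)
    val before = m
    m += (1 -> 1)
    (m, m eq before) // still the same object
  }

  def main(args: Array[String]): Unit = {
    println(viaReassignment())
    println(viaMutation())
  }
}
```

Both maps end up with the same contents; only the second one is still the original object.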

hashMap in code sample is collection.mutable.HashMap[Int,Int]. Note the mutable package.
There is a mutable version of many Scala collections.
And there is a method named += in mutable.HashMap.

There are two types of collections in Scala: Mutable and Immutable
Mutable: http://www.scala-lang.org/api/2.10.3/index.html#scala.collection.mutable.package
Immutable: http://www.scala-lang.org/api/2.10.3/index.html#scala.collection.immutable.package
What you used definitely belongs to the mutable category, so it can be modified in place (the val itself is not reassigned). In fact, if you try this:
val hashMap = scala.collection.immutable.HashMap(0->0)
hashMap += (1->1)
you'll get a compile error, because the immutable HashMap has no += method and a val cannot be reassigned.

Unlike in some other languages, x += y doesn't always compile into x = x + y in Scala. hashMap += 1->1 is actually an infix method call (+= is a valid method name in Scala), and the method is defined in the mutable.HashMap class.
Your first example uses a mutable HashMap, as you can see in the last line. The immutable HashMap doesn't have a += method.
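To see that += really is just a method name, here is a small sketch with a hypothetical Counter class (not part of the Scala library) that defines its own += method:

```scala
// Hypothetical class, for illustration only.
final class Counter(private var total: Int = 0) {
  // += is an ordinary method name. Returning this.type mirrors what the
  // mutable collections do.
  def +=(n: Int): this.type = {
    total = total + n
    this
  }

  def value: Int = total
}

object CounterDemo {
  def run(): Int = {
    val c = new Counter() // a val: the reference is never reassigned
    c += 5                // infix form of c.+=(5)
    c.+=(2)               // the same call written explicitly
    c.value
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

Because Counter has a += method, the compiler never tries to rewrite c += 5 into an assignment, so it compiles even though c is a val.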

Note that for a mutable HashMap
scala> val map = scala.collection.mutable.HashMap[Int,Int](0 -> 0)
map: scala.collection.mutable.HashMap[Int,Int] = Map(0 -> 0)
its contents can be changed even though it is declared with val,
scala> map += 1->1
res1: map.type = Map(1 -> 1, 0 -> 0)
scala> map += 2->2
res2: map.type = Map(2 -> 2, 1 -> 1, 0 -> 0)
scala> map
res3: scala.collection.mutable.HashMap[Int,Int] = Map(2 -> 2, 1 -> 1, 0 -> 0)
However for an immutable HashMap declared with val
scala> val imap = scala.collection.immutable.HashMap[Int,Int](0 -> 0)
imap: scala.collection.immutable.HashMap[Int,Int] = Map(0 -> 0)
we cannot for instance add new pairs,
scala> imap += 1->1
<console>:10: error: value += is not a member of scala.collection.immutable.HashMap[Int,Int]
imap += 1->1
^
However we can create a new HashMap from the original and add a new pair,
scala> val imap2 = imap.updated(1,1)
imap2: scala.collection.immutable.HashMap[Int,Int] = Map(0 -> 0, 1 -> 1)
Even so, an immutable HashMap declared with var
scala> var imap = scala.collection.immutable.HashMap[Int,Int](0 -> 0)
imap: scala.collection.immutable.HashMap[Int,Int] = Map(0 -> 0)
allows for updating the contents, because += is rewritten to imap = imap + (1->1), which rebinds the var to a new map,
scala> imap += 1->1
scala> imap
res11: scala.collection.immutable.HashMap[Int,Int] = Map(0 -> 0, 1 -> 1)

Related

Is it instance name/id that scala REPL prints?

The tutorial mentions mutable sets early on, but why would the REPL change the instance name from res0 to res1 when a new element is added? Is 'res' not the instance name that the REPL prints? The code is below for context. I am a beginner in Scala, so please bear with me if the question is trivial.
scala> val set = scala.collection.mutable.Set[Int]()
val set: scala.collection.mutable.Set[Int] = Set()
scala> set += 1
val res0: scala.collection.mutable.Set[Int] = Set(1)
scala> set += 2 += 3
val res1: scala.collection.mutable.Set[Int] = Set(1, 2, 3)
The reference did not change, though: res0 and res1 refer to the same object, so res0 eq res1 is true. The Scala REPL generates a name for every expression result that is not assigned a name, whether the value is mutable or not.
Additionally, take a look at the docs: for mutable.Set, the method += returns Set.this.type. Since a value is returned, the REPL has to give it a name.
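A quick way to check this outside the REPL is to bind the results yourself and compare them with eq. This is a sketch; r0 and r1 stand in for the REPL's res0 and res1, and SameInstanceDemo is a made-up name.

```scala
import scala.collection.mutable

object SameInstanceDemo {
  def run(): (Boolean, Boolean) = {
    val set = mutable.Set[Int]()
    val r0 = (set += 1)        // what the REPL would have named res0
    val r1 = ((set += 2) += 3) // what the REPL would have named res1
    // Both results are the very same object as `set`.
    (r0 eq set, r1 eq r0)
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

The names differ, but every name refers to the one mutable set.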

Checking sameness/equality in Scala

As I asked in another post (Unique id for Scala object), it doesn't seem that I can get an id the way I can in Python.
I still need to check sameness in Scala for unit tests. I run a test and compare the returned value, a nested collection object (i.e., List[Map[Int, ...]]), with one that I create.
However, the hashCode of a mutable map is the same as that of an immutable map with the same contents, and (x == y) returns true.
scala> val x = Map("a" -> 10, "b" -> 20)
x: scala.collection.immutable.Map[String,Int] = Map(a -> 10, b -> 20)
scala> x.hashCode
res0: Int = -1001662700
scala> val y = collection.mutable.Map("b" -> 20, "a" -> 10)
y: scala.collection.mutable.Map[String,Int] = Map(b -> 20, a -> 10)
scala> y.hashCode
res2: Int = -1001662700
In some cases that's OK, but in other cases I may need the test to fail. So, here are my questions.
Q1: What is the usual way to check whether two values (including very complicated data types) are the same? I could compare the toString() results, but I don't think this is a good idea.
Q2: Is it a general rule that a mutable data structure has the same hashCode as its immutable counterpart?
You are looking for AnyRef.eq which does reference equality (which is as close as you can get to Python's id function and is identical if you just want to compare references and you don't care about the actual ID):
scala> x == y
true
scala> x eq y
false
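As a sketch of how the two comparisons behave, including on nested collections like the ones mentioned in the question (EqualityDemo and its field names are made up for illustration):

```scala
import scala.collection.mutable

object EqualityDemo {
  val x: Map[String, Int] = Map("a" -> 10, "b" -> 20)
  val y: mutable.Map[String, Int] = mutable.Map("b" -> 20, "a" -> 10)

  // == compares contents, regardless of mutability or insertion order.
  val sameContents: Boolean = x == y
  // eq compares references: x and y are two distinct objects.
  val sameReference: Boolean = x eq y

  // == also works structurally on nested collections.
  val nested1: List[Map[Int, String]] = List(Map(1 -> "one"), Map(2 -> "two"))
  val nested2: List[Map[Int, String]] = List(Map(1 -> "one"), Map(2 -> "two"))
  val nestedSameContents: Boolean = nested1 == nested2
  val nestedSameReference: Boolean = nested1 eq nested2

  def main(args: Array[String]): Unit =
    println((sameContents, sameReference, nestedSameContents, nestedSameReference))
}
```

Since objects that are equal under == must report equal hash codes, the matching hashCode values in the question follow from x == y being true.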

When does Scala actually copy objects?

Background
I have a chunk of code that looks like this:
val big_obj = new BigObj
big_obj.recs(5).foo()
... // other code
big_obj.recs(7).bar()
Problem
I want to do something like this
val big_obj = new BigObj
alias ref = big_obj.recs // looking for something like an alias
ref(5).foo()
... // other code
ref(7).bar()
because I am afraid of making copies of big objects (coming from C++). But then I realised that Scala is probably smart and if I simply do this:
val big_obj = new BigObj
val ref = big_obj.recs // no copies made?
the compiler is probably smart enough to not copy anyways, since it's all read-only.
Question
This got me wondering about Scala's memory model.
Under what situations will copies be made/not made?
I am looking for a simple answer or rule-of-thumb which I can keep in my mind when I deal with really_big_objects, whenever I make assignments, including passing arguments.
Just like in Java (and Python, and probably a lot of other languages), objects are never copied implicitly. When you assign an object or pass it as an argument, only the reference to the object is copied; the actual object just sits in memory and has one more thing pointing to it. The only things that get copied are primitives (integers, doubles, etc.).
As you pointed out, this is obviously good for immutable objects, but it's true of all objects, even mutable ones:
scala> val a = collection.mutable.Map(1 -> 2)
a: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> val b = a
b: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> b += (2 -> 4)
res41: b.type = Map(2 -> 4, 1 -> 2)
scala> a
res42: scala.collection.mutable.Map[Int,Int] = Map(2 -> 4, 1 -> 2)
scala> def addTo(m: collection.mutable.Map[Int,Int]): Unit = { m += (3 -> 9) }
addTo: (m: scala.collection.mutable.Map[Int,Int])Unit
scala> addTo(b)
scala> a
res44: scala.collection.mutable.Map[Int,Int] = Map(2 -> 4, 1 -> 2, 3 -> 9)
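If an independent copy is actually wanted, it has to be requested explicitly. A minimal sketch, assuming a shallow copy via clone() is enough (AliasVsCopyDemo is a made-up name):

```scala
import scala.collection.mutable

object AliasVsCopyDemo {
  def run(): (Boolean, Boolean, Boolean) = {
    val a = mutable.Map(1 -> 2)
    val alias = a        // copies only the reference
    val copy = a.clone() // explicit shallow copy: a distinct map

    alias += (2 -> 4)    // visible through a, because alias and a are one object
    copy += (3 -> 9)     // not visible through a

    (a.contains(2), a.contains(3), copy.contains(3))
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

Note that clone() copies the map itself but not the objects stored in it.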

immutable val vs mutable ArrayBuffer

mutable vs. immutable in Scala collections
Before I post this question, I have read the above article. Apparently if you store something in val, you can't modify it, but then if you store a mutable collection such as ArrayBuffer, you can modify it!
scala> val b = ArrayBuffer[Int](1,2,3)
b: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)
scala> b += 1
res50: b.type = ArrayBuffer(1, 2, 3, 1)
scala> b
res51: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3, 1)
What is the use of using val to store a mutable ArrayBuffer? I assume the only reason b changes is because val b holds the memory address to that ArrayBuffer(1,2,3).
If you try var x = 1; val y = x; x = 5; y, the output will still be 1. In this case, y stores an actual value instead of the address to x.
Java doesn't have this confusion because it's clear an Object can't be assigned to an int variable.
How do I know when a variable in Scala is carrying a value and when it is carrying a memory address? What's the point of storing a mutable collection in an immutable variable?
A simple answer is that vals and vars are all references. At the language level Scala has no separate primitive types; Int, Double and the rest are treated as objects, even though the compiler may represent them as JVM primitives.
val x = 1
is a reference named x that points to an immutable integer object 1. You cannot do 1.changeTo(2) or something, so if you have
val value = 5
val x = value
var y = value
You can do y += 10. This changes y to reference a new object, (5 + 10) = 15. The original 5 remains 5.
On the other hand, you cannot do x += 10 because x is a val which means it must always point to 5. So, this doesn't compile.
You may wonder why you can do val b = ArrayBuffer(...) and then b += something even though b is a val. That's because += is actually a method here, not an assignment. Calling b += something gets translated to b.+=(something). The method += just adds a new element (something) to its mutable self and returns itself, so the result can be chained or assigned.
Let's see an example
scala> val xs = ArrayBuffer(1,2,3)
xs: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3)
scala> val ys = ( xs += 999 )
ys: xs.type = ArrayBuffer(1, 2, 3, 999)
scala> xs
res0: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(1, 2, 3, 999)
scala> ys
res1: xs.type = ArrayBuffer(1, 2, 3, 999)
scala> xs eq ys
res2: Boolean = true
This confirms xs and ys point to the same (mutable) ArrayBuffer. The eq method is like Java's ==, which compares object identity. Immutable/mutable references (val/var) and mutable/immutable data structures (ArrayBuffer, List) are different things. So, if you do another xs += 888, then ys, which is an immutable reference pointing to a mutable data structure, also contains 888.
What's the point of storing a mutable collection in an immutable variable?
val a = ArrayBuffer(1)
a = new ArrayBuffer[Int]()
<console>:9: error: reassignment to val
It prevents the variable from being pointed at a different object. In practice, though, Scala encourages you not to use mutable state (to avoid locking, blocking, etc.), so I'm having trouble coming up with a real situation where the choice of var or val for mutable state matters.
Immutable object and constant value are two different things.
Defining your collection as a val means that the variable will always reference the same collection instance. But this instance can be mutable or immutable: if it is immutable, you cannot add or remove items in that instance; if it is mutable, you can. With an immutable collection, adding or removing items always produces a new collection.
How do I know when is the variable in scala carrying a value, when is a memory address?
Scala primarily runs on the JVM (.NET support was discontinued), so types that are primitive types on the JVM will be treated as primitive types by Scala.
What is the use of using val to store a mutable ArrayBuffer?
The closest alternative would be to use a var to store an immutable Seq. If that Seq was very large, you wouldn't want to have to copy the whole Seq every time you made a change to it - but that's what you might have to do! That would be very slow!
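The two combinations discussed in this thread, a val holding a mutable collection and a var holding an immutable one, can be contrasted in a short sketch (ValVarDemo and the variable names are made up for illustration):

```scala
import scala.collection.mutable.ArrayBuffer

object ValVarDemo {
  def run(): (List[Int], List[Int], List[Int]) = {
    // val + mutable collection: one object, changed in place.
    val buf = ArrayBuffer(1, 2, 3)
    val bufAlias = buf
    buf += 4 // bufAlias sees the change, it is the same buffer

    // var + immutable collection: each update builds a new collection
    // and rebinds the variable.
    var vec = Vector(1, 2, 3)
    val snapshot = vec
    vec = vec :+ 4 // snapshot still refers to the old Vector

    (bufAlias.toList, vec.toList, snapshot.toList)
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

With the val and mutable buffer, everyone holding the reference observes updates; with the var and immutable Vector, earlier references keep their old view.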

Adding element to a scala set which is a map value

I have the following map in Scala:
var m = Map[Int,Set[Int]]()
m += 1 -> Set(1)
m(1) += 2
I've discovered that the last line doesn't work. I get "error: reassignment to val".
So I tried
var s = m(1)
s += 2
Then when I compared m(1) with s after I added 2 to it, their contents were different. So how can I add an element to a set which is the value of a map?
I come from a Java/C++ background so what I tried seems natural to me, but apparently it's not in Scala.
You're probably using immutable.Map. You need to use mutable.Map, or else replace the set stored in the immutable map with an updated copy instead of trying to modify it in place.
Here's a reference describing the mutable vs immutable data structures.
So...
import scala.collection.mutable.Map
var m = Map[Int,Set[Int]]()
m += 1 -> Set(1)
m(1) += 2
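The reason the mutable.Map version compiles is that, since the immutable Set has no += method, m(1) += 2 is rewritten to m(1) = m(1) + 2, which in turn is a call to the map's update method. A sketch comparing the sugared and explicit forms (UpdateDesugarDemo is a made-up name):

```scala
import scala.collection.mutable

object UpdateDesugarDemo {
  def run(): (mutable.Map[Int, Set[Int]], mutable.Map[Int, Set[Int]]) = {
    // Sugared form: relies on the compiler rewriting += into an update call.
    val sugar = mutable.Map[Int, Set[Int]](1 -> Set(1))
    sugar(1) += 2

    // Explicit form of the same operation.
    val explicit = mutable.Map[Int, Set[Int]](1 -> Set(1))
    explicit.update(1, explicit(1) + 2)

    (sugar, explicit)
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

An immutable Map has no update method, which is why the same line is rejected there.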
In addition to @Stefan's answer:
instead of using a mutable Map, you can use a mutable Set
import scala.collection.mutable.{Set => mSet}
var m = Map[Int,mSet[Int]]()
m += 1 -> mSet(1)
m(1) += 2
mSet is a shortcut to mutable Set introduced to reduce verbosity.
scala> m
res9: scala.collection.immutable.Map[Int,scala.collection.mutable.Set[Int]] = Map(1 -> Set(2, 1))
I think what you really want here is a MultiMap
import collection.mutable.{Set, Map, HashMap, MultiMap}
val m = new HashMap[Int,Set[Int]] with MultiMap[Int, Int]
m.addBinding(1,1)
m.addBinding(1,2)
m.addBinding(2,3)
Note that m itself is a val, as it's the map itself which is now mutable, not the reference to the map.
At this point, m will now be a:
Map(
1 -> Set(1,2),
2 -> Set(3)
)
Unfortunately, there's no immutable equivalent to MultiMap, and you have to specify the concrete subclass of mutable.Map that you'll use at construction time.
For all subsequent operations, it's enough to just pass the thing around typed as a MultiMap[Int,Int]
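If an immutable structure is preferred, similar behaviour can be approximated with a plain Map[K, Set[V]] and a small helper. This is only a sketch: the addBinding function below is a hypothetical helper written for this example, not the library's MultiMap.addBinding.

```scala
object ImmutableMultiMapDemo {
  // Hypothetical helper: returns a new map with v added to the set stored under k.
  def addBinding[K, V](m: Map[K, Set[V]], k: K, v: V): Map[K, Set[V]] =
    m.updated(k, m.getOrElse(k, Set.empty[V]) + v)

  def run(): Map[Int, Set[Int]] =
    List(1 -> 1, 1 -> 2, 2 -> 3).foldLeft(Map.empty[Int, Set[Int]]) {
      case (acc, (k, v)) => addBinding(acc, k, v)
    }

  def main(args: Array[String]): Unit =
    println(run())
}
```

Each call returns a new map, so the result has to be kept (for example in a var or by threading it through a fold, as here).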