Checking sameness/equality in Scala

Checking sameness/equality in Scala - scala

As I asked in other post (Unique id for Scala object), it doesn't seem like that I can have id just like Python.
I still need to check the sameness in Scala for unittest. I run a test and compare the returned value of some nested collection object (i.e., List[Map[Int, ...]]) with the one that I create.
However, the hashCode for mutable map is the same as that of immutable map. As a result (x == y) returns True.
scala> val x = Map("a" -> 10, "b" -> 20)
x: scala.collection.immutable.Map[String,Int] = Map(a -> 10, b -> 20)
scala> x.hashCode
res0: Int = -1001662700
scala> val y = collection.mutable.Map("b" -> 20, "a" -> 10)
y: scala.collection.mutable.Map[String,Int] = Map(b -> 20, a -> 10)
scala> y.hashCode
res2: Int = -1001662700
In some cases, it's OK, but in other cases, I may need to make it failed test. So, here comes my question.
Q1: What is the normally used method for comparing two values (including very complicated data types) are the same? I may compare the toString() results, but I don't think this is a good idea.
Q2: Is it a general rule that mutable data structure has the same hashCode with immutable counterpart?

You are looking for AnyRef.eq which does reference equality (which is as close as you can get to Python's id function and is identical if you just want to compare references and you don't care about the actual ID):
scala> x == y
true
scala> x eq y
false

Related

Finding a map in list/vector of maps in Scala

I have a vector/list of maps (Map[String,Int]). How can I find if a key-value pair exists in one of these maps in the list of maps using .find?

val res = List(Map("1" -> 1), Map("2" -> 2)).find(t => t.exists(j => j == ("2", 2)))
println(res)
use find with exists to check whether it exists in maps.

chengpohi's solution is pretty inefficient, and also different to how I understand the question.
Let m: Map[String,Int].
Why chengpoi's solution is inefficient
First, using m.exists(j => j == ("2",2)), which can also be written m.contains("2" -> 2) looks at every entry of m, while m("2").toSeq.contains(2) performs only a single map lookup.
Note that m.contains("2" -> 2) will not work, as contains is overridden for Map to check for a key, i.e., m.contains("2") works—and is also fast.
To obtain the same result as chengpoi, but efficiently:
def mapExists[K,V](ms: List[Map[K,V]], k: K, v: V): Option[(K,V)] =
ms.get(k).filter(_ == v).map(_ => k -> v)
Note that this method returns its arguments, which is quite redundant.
How I understand the question
Second, I understood the question as checking whether the List contains a Map with a specific pair.
This would translate to
def mapExists[K,V](ms: List[Map[K,V]], k: K, v: V): Boolean =
ms.exists(_.get(k).contains(v))

It can be done even like this using just the key value we are interested to find:
scala> val res = List(Map("A" -> 10), Map("B" -> 20)).find(_.keySet.contains("B"))
res: Option[scala.collection.immutable.Map[String,Int]] = Some(Map(B -> 20))
scala>

Efficient way to check if a traversable has more than 1 element in Scala

I need to check if a Traversable (which I already know to be nonEmpty) has a single element or more.
I could use size, but (tell me if I'm wrong) I suspect that this could be O(n), and traverse the collection to compute it.
I could check if tail.nonEmpty, or if .head != .last
Which are the pros and cons of the two approaches? Is there a better way? (for example, will .last do a full iteration as well?)

All approaches that cut elements from beginning of the collection and return tail are inefficient. For example tail for List is O(1), while tail for Array is O(N). Same with drop.
I propose using take:
list.take(2).size == 1 // list is singleton
take is declared to return whole collection if collection length is less that take's argument. Thus there will be no error if collection is empty or has only one element. On the other hand if collection is huge take will run in O(1) time nevertheless. Internally take will start iterating your collection, take two steps and break, putting elements in new collection to return.
UPD: I changed condition to exactly match the question

Not all will be the same, but let's take a worst case scenario where it's a List. last will consume the entire List just to access that element, as will size.
tail.nonEmpty is obtained from a head :: tail pattern match, which doesn't need to consume the entire List. If you already know the list to be non-empty, this should be the obvious choice.
But not all tail operations take constant time like a List: Scala Collections Performance

You can take a view of a traversable. You can slice the TraversableView lazily.
The initial star is because the REPL prints some output.
scala> val t: Traversable[Int] = Stream continually { println("*"); 42 }
*
t: Traversable[Int] = Stream(42, ?)
scala> t.view.slice(0,2).size
*
res1: Int = 2
scala> val t: Traversable[Int] = Stream.fill(1) { println("*"); 42 }
*
t: Traversable[Int] = Stream(42, ?)
scala> t.view.slice(0,2).size
res2: Int = 1
The advantage is that there is no intermediate collection.
scala> val t: Traversable[_] = Map((1 to 10) map ((_, "x")): _*)
t: Traversable[_] = Map(5 -> x, 10 -> x, 1 -> x, 6 -> x, 9 -> x, 2 -> x, 7 -> x, 3 -> x, 8 -> x, 4 -> x)
scala> t.take(2)
res3: Traversable[Any] = Map(5 -> x, 10 -> x)
That returns an unoptimized Map, for instance:
scala> res3.getClass
res4: Class[_ <: Traversable[Any]] = class scala.collection.immutable.HashMap$HashTrieMap
scala> Map(1->"x",2->"x").getClass
res5: Class[_ <: scala.collection.immutable.Map[Int,String]] = class scala.collection.immutable.Map$Map2

What about pattern matching?
itrbl match { case _::Nil => "one"; case _=>"more" }

How to find the number of (key , value) pairs in a map in scala?

I need to find the number of (key , value) pairs in a Map in my Scala code. I can iterate through the map and get an answer but I wanted to know if there is any direct function for this purpose or not.

you can use .size
scala> val m=Map("a"->1,"b"->2,"c"->3)
m: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3)
scala> m.size
res3: Int = 3

Use Map#size:
The size of this traversable or iterator.
The size method is from TraversableOnce so, barring infinite sequences or sequences that shouldn't be iterated again, it can be used over a wide range - List, Map, Set, etc.

When does Scala actually copy objects?

Background
I have a chunk of code that looks like this:
val big_obj = new BigObj
big_obj.recs[5].foo()
... // other code
big_obj.recs[7].bar()
Problem
I want to do something like this
val big_obj = new BigObj
alias ref = big_obj.recs // looking for something like an alias
ref[5].foo()
... // other code
ref[7].bar()
because I am afraid of making copies of big objects (coming from C++). But then I realised that Scala is probably smart and if I simply do this:
val big_obj = new BigObj
val ref = big_obj.recs // no copies made?
the compiler is probably smart enough to not copy anyways, since it's all read-only.
Question
This got me wondering about Scala's memory model.
Under what situations will copies be made/not made?
I am looking for a simple answer or rule-of-thumb which I can keep in my mind when I deal with really_big_objects, whenever I make assignments, including passing arguments.

Just like Java (and python and probably a lot of other languages), copies are never made of objects. When you assign an object or pass it as an argument, it only copies the reference to the object; the actual object just sits in memory and has an extra thing pointing to it. The only things that would get copied are primitives (integers, doubles, etc).
As you pointed out, this is obviously good for immutable objects, but it's true of all objects, even mutable ones:
scala> val a = collection.mutable.Map(1 -> 2)
a: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> val b = a
b: scala.collection.mutable.Map[Int,Int] = Map(1 -> 2)
scala> b += (2 -> 4)
res41: b.type = Map(2 -> 4, 1 -> 2)
scala> a
res42: scala.collection.mutable.Map[Int,Int] = Map(2 -> 4, 1 -> 2)
scala> def addTo(m: collection.mutable.Map[Int,Int]) { m += (3 -> 9) }
addTo: (m: scala.collection.mutable.Map[Int,Int])Unit
scala> addTo(b)
scala> a
res44: scala.collection.mutable.Map[Int,Int] = Map(2 -> 4, 1 -> 2, 3 -> 9)

Nested Default Maps in Scala

I'm trying to construct nested maps in Scala, where both the outer and inner map use the "withDefaultValue" method. For example, the following :
val m = HashMap.empty[Int, collection.mutable.Map[Int,Int]].withDefaultValue( HashMap.empty[Int,Int].withDefaultValue(3))
m(1)(2)
res: Int = 3
m(1)(2) = 5
m(1)(2)
res: Int = 5
m(2)(3) = 6
m
res : scala.collection.mutable.Map[Int,scala.collection.mutable.Map[Int,Int]] = Map()
So the map, when addressed by the appropriate keys, gives me back what I put in. However, the map itself appears empty! Even m.size returns 0 in this example. Can anyone explain what's going on here?

Short answer
It's definitely not a bug.
Long answer
The behavior of withDefaultValue is to store a default value (in your case, a mutable map) inside the Map to be returned in the case that they key does not exist. This is not the same as a value that is inserted into the Map when they key is not found.
Let's look closely at what's happening. It will be easier to understand if we pull the default map out as a separate variable so we can inspect is at will; let's call it default
import collection.mutable.HashMap
val default = HashMap.empty[Int,Int].withDefaultValue(3)
So default is a mutable map (that has its own default value). Now we can create m and give default as the default value.
import collection.mutable.{Map => MMap}
val m = HashMap.empty[Int, MMap[Int,Int]].withDefaultValue(default)
Now whenever m is accessed with a missing key, it will return default. Notice that this is the exact same behavior as you have because withDefaultValue is defined as:
def withDefaultValue (d: B): Map[A, B]
Notice that it's d: B and not d: => B, so it will not create a new map each time the default is accessed; it will return the same exact object, what we've called default.
So let's see what happens:
m(1) // Map()
Since key 1 is not in m, the default, default is returned. default at this time is an empty Map.
m(1)(2) = 5
Since m(1) returns default, this operation stores 5 as the value for key 2 in default. Nothing is written to the Map m because m(1) resolves to default which is a separate Map entirely. We can check this by viewing default:
default // Map(2 -> 5)
But as we said, m is left unchanged
m // Map()
Now, how to achieve what you really wanted? Instead of using withDefaultValue, you want to make use of getOrElseUpdate:
def getOrElseUpdate (key: A, op: ⇒ B): B
Notice how we see op: => B? This means that the argument op will be re-evaluated each time it is needed. This allows us to put a new Map in there and have it be a separate new Map for each invalid key. Let's take a look:
val m2 = HashMap.empty[Int, MMap[Int,Int]]
No default values needed here.
m2.getOrElseUpdate(1, HashMap.empty[Int,Int].withDefaultValue(3)) // Map()
Key 1 doesn't exist, so we insert a new HashMap, and return that new value. We can check that it was inserted as we expected. Notice that 1 maps to the newly added empty map and that they 3 was not added anywhere because of the behavior explained above.
m2 // Map(1 -> Map())
Likewise, we can update the Map as expected:
m2.getOrElseUpdate(1, HashMap.empty[Int,Int].withDefaultValue(1))(2) = 6
and check that it was added:
m2 // Map(1 -> Map(2 -> 6))

withDefaultValue is used to return a value when the key was not found. It does not populate the map. So you map stays empty. Somewhat like using getOrElse(a, b) where b is provided by withDefaultValue.

I just had the exact same problem, and was happy to find dhg's answer. Since typing getOrElseUpdate all the time is not very concise, I came up with this little extension of the idea that I want to share:
You can declare a class that uses getOrElseUpdate as default behavior for the () operator:
class DefaultDict[K, V](defaultFunction: (K) => V) extends HashMap[K, V] {
override def default(key: K): V = return defaultFunction(key)
override def apply(key: K): V =
getOrElseUpdate(key, default(key))
}
Now you can do what you want to do like this:
var map = new DefaultDict[Int, DefaultDict[Int, Int]](
key => new DefaultDict(key => 3))
map(1)(2) = 5
Which does now result in map containing 5 (or rather: containing a DefaultDict containing the value 5 for the key 2).

What you're seeing is the effect that you've created a single Map[Int, Int] this is the default value whenever the key isn't in the outer map.
scala> val m = HashMap.empty[Int, collection.mutable.Map[Int,Int]].withDefaultValue( HashMap.empty[Int,Int].withDefaultValue(3))
m: scala.collection.mutable.Map[Int,scala.collection.mutable.Map[Int,Int]] = Map()
scala> m(2)(2)
res1: Int = 3
scala> m(1)(2) = 5
scala> m(2)(2)
res2: Int = 5
To get the effect that you're looking for, you'll have to wrap the Map with an implementation that actually inserts the default value when a key isn't found in the Map.
Edit:
I'm not sure what your actual use case is, but you may have an easier time using a pair for the key to a single Map.
scala> val m = HashMap.empty[(Int, Int), Int].withDefaultValue(3)
m: scala.collection.mutable.Map[(Int, Int),Int] = Map()
scala> m((1, 2))
res0: Int = 3
scala> m((1, 2)) = 5
scala> m((1, 2))
res3: Int = 5
scala> m
res4: scala.collection.mutable.Map[(Int, Int),Int] = Map((1,2) -> 5)

I know it's a bit late but I've just seen the post while I was trying to solve the same problem.
Probably the API are different from the 2012 version but you may want to use withDefaultinstead that withDefaultValue.
The difference is that withDefault takes a function as parameter, that is executed every time a missed key is requested ;)