Scala Semantic Equality [duplicate]

scala> List(1,2,3) == List(1,2,3)
res2: Boolean = true
scala> Map(1 -> "Olle") == Map(1 -> "Olle")
res3: Boolean = true
But when trying to do the same with Array, it does not work the same. Why?
scala> Array('a','b') == Array('a','b')
res4: Boolean = false
I have used 2.8.0.RC7 and 2.8.0.Beta1-prerelease.

Because the definition of "equals" for Arrays is that they refer to the same array.
This is consistent with Java's array equality, which uses Object.equals and therefore compares references.
If you want to compare the elements pairwise, use sameElements:
Array('a','b').sameElements(Array('a','b'))
or deepEquals, which has been deprecated in 2.8, so instead use:
Array('a','b').deep.equals(Array('a','b').deep)
There's a good Nabble discussion on array equality.
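The difference between reference comparison and element-wise comparison can be sketched as follows. This is only a minimal illustration: the object name ArrayEqualityDemo and its value names are made up for this example, and java.util.Arrays.equals is the standard Java helper, included because it should behave the same regardless of Scala version.

```scala
object ArrayEqualityDemo {
  val a: Array[Char] = Array('a', 'b')
  val b: Array[Char] = Array('a', 'b')
  val alias: Array[Char] = a // second reference to the same array

  // == on arrays is reference equality, so two distinct arrays are not equal.
  val referenceEqual: Boolean = a == b

  // The same array reached through another reference is equal.
  val aliasEqual: Boolean = a == alias

  // sameElements walks both arrays and compares element by element.
  val elementwiseEqual: Boolean = a.sameElements(b)

  // Java standard-library comparison for flat (non-nested) arrays.
  val javaEqual: Boolean = java.util.Arrays.equals(a, b)
}
```

Only referenceEqual is false here; the other three comparisons are true.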

The root cause is that Scala uses the same Array implementation as Java, and Array is the only collection that does not support == as a value-equality operator.
Also, it's important to note that the chosen answer presents sameElements and deep comparison as equivalent, when it's actually preferable to use:
Array('a','b').deep.equals(Array('a','b').deep)
Or, because now we can use == back again:
Array('a','b').deep == Array('a','b').deep
Instead of:
Array('a','b').sameElements(Array('a','b'))
Because sameElements won't work for nested arrays: it's not recursive, whereas deep comparison is.
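The nested-array case can be illustrated with a small sketch. It uses java.util.Arrays.deepEquals rather than .deep, on the assumption that you may be on a Scala version (2.13 or later, as far as I know) where .deep is no longer available. The object and value names are hypothetical.

```scala
object NestedArrayDemo {
  val x: Array[Array[Int]] = Array(Array(1, 2), Array(3))
  val y: Array[Array[Int]] = Array(Array(1, 2), Array(3))

  // sameElements compares the inner arrays with ==, i.e. by reference,
  // so structurally identical nested arrays are reported as different.
  val shallow: Boolean = x.sameElements(y)

  // Arrays.deepEquals recurses into nested arrays. The cast is needed because
  // Scala arrays are invariant; at runtime an int[][] is already an Object[].
  val deep: Boolean = java.util.Arrays.deepEquals(
    x.asInstanceOf[Array[AnyRef]],
    y.asInstanceOf[Array[AnyRef]]
  )
}
```

Here shallow is false and deep is true, which is the distinction the answer above is drawing.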

Related

Compare Seq and Array different behavior

Scala seems to treat Seqs with the same values as equal, but not Arrays.
Seq behaves the same way as List and Set.
scala> Array(1) == Array(1)
res2: Boolean = false
scala> Seq(1) == Seq(1)
res3: Boolean = true
Why does this happen? What's the reason behind it?
This is because Array is essentially an alias for Java’s array, which implements equals as reference equality - only returning true if two variables point to the same array instance.
Array is the only Scala collection for which == checks for reference equality; for all others it delegates to .equals, which checks for value equality.
However, Scala 2.13 introduces immutable.ArraySeq, an immutable array-backed sequence that behaves as expected.
For now, you can use .sameElements or .deep to compare instead.
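Another option, sketched here under the assumption that copying or wrapping the array is acceptable, is to convert both arrays to a Seq or List first, since those types define value equality. The names below are illustrative only.

```scala
object SeqVsArrayDemo {
  val a: Array[Int] = Array(1)
  val b: Array[Int] = Array(1)

  // Reference comparison: false for two distinct arrays.
  val arraysEqual: Boolean = a == b

  // Seq and List define equals structurally, so the converted values compare equal.
  val asSeqEqual: Boolean = a.toSeq == b.toSeq
  val asListEqual: Boolean = a.toList == b.toList
}
```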

Scala iterable set and sameElements [duplicate]

Whilst working through the Scala exercises on Iterables, I encountered the following strange behaviour:
val xs = Set(5,4,3,2,1)
val ys = Set(1,2,3,4,5)
xs sameElements ys // true
val xs = Set(3,2,1)
val ys = Set(1,2,3)
xs sameElements ys // false - WAT?!
Surely these Sets have the same elements, and should ignore ordering; and why does this work as expected only for the larger set?
The Scala collections library provides specialised implementations for Sets of fewer than 5 values (see the source). The iterators for these implementations return elements in the order in which they were added, rather than the consistent, hash-based ordering used for larger Sets.
Furthermore, sameElements (scaladoc) is defined on Iterables (it is implemented in IterableLike - see the source); it returns true only if the iterators return the same elements in the same order.
So although Set(1,2,3) and Set(3,2,1) ought to be equivalent, their iterators are different, therefore sameElements returns false.
This behaviour is surprising, and arguably a bug since it violates the mathematical expectations for a Set (but only for certain sizes of Set!).
As I.K. points out in the comments, == works fine if you are just comparing Sets with one another, i.e. Set(1,2,3) == Set(3,2,1). However, sameElements is more general in that it can compare the elements of any two iterables. For example, List(1, 2, 3) == Array(1, 2, 3) is false, but List(1, 2, 3) sameElements Array(1, 2, 3) is true.
More generally, equality can be confusing - note that:
List(1,2,3) == Vector(1,2,3)
List(1,2,3) != Set(1,2,3)
List(1,2,3) != Array(1,2,3)
Array(1,2,3) != Array(1,2,3)
I have submitted a fix for the Scala exercises that explains the sameElements problem.
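To make the order sensitivity concrete, here is a small sketch. It compares iterators directly (which is what sameElements does under the hood) so that it should compile on both older and newer collections libraries. sameElementsAnyOrder is a hypothetical helper, not a library method.

```scala
object SetComparisonDemo {
  val xs: Set[Int] = Set(3, 2, 1)
  val ys: Set[Int] = Set(1, 2, 3)

  // Set equality ignores iteration order.
  val setsEqual: Boolean = xs == ys

  // Comparing the iterators is order-sensitive; small Sets iterate in
  // insertion order, so these two iterators yield different sequences.
  val sameIterationOrder: Boolean = xs.iterator.sameElements(ys.iterator)

  // Hypothetical helper: order-insensitive comparison of any two collections.
  // Converting to Set also discards duplicates, so List(1, 1, 2) matches Set(1, 2).
  def sameElementsAnyOrder[A](left: Iterable[A], right: Iterable[A]): Boolean =
    left.toSet == right.toSet
}
```

For example, SetComparisonDemo.sameElementsAnyOrder(List(3, 2, 1), Set(1, 2, 3)) is true, whereas == between a List and a Set is always false.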

Scala: Why does Array.ofDim[String](n) create an Array where values are initialized as nulls rather than empty strings?

Since Scala advocates avoiding the use of null, I do not understand why the default values are null in this specific use case. Can someone please explain this?
scala> Array.ofDim[String](1)
res0: Array[String] = Array(null)
On the other hand, the defaults for other types seem to be fine.
scala> Array.ofDim[Double](1)
res1: Array[Double] = Array(0.0)
scala> Array.ofDim[Int](1)
res2: Array[Int] = Array(0)
scala> Array.ofDim[Boolean](1)
res3: Array[Boolean] = Array(false)
scala> Array.ofDim[Float](1)
res4: Array[Float] = Array(0.0)
UPDATE/EDIT:
Not that it is a big deal: I can map over the array, wrap each element in Option, and fold it with "" as the default value. I just want to understand why this is the default behavior.
ofDim can be used to create an array when, for example, you have to call a Java API that expects nulls.
Another use case might be performance, when every instance counts, since arrays are the only collection which is supported directly by the JVM.
If you want an initialized array, you can use fill:
val sa = Array.fill(5)("Hi")
It is because you declared an array of Strings. And a String's default value is null. (Remember that this String is a java.lang.String, whose default is null.)
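The options mentioned in this thread can be put side by side in a short sketch. The object and value names are invented for the example, and the empty string is just an assumed default.

```scala
object ArrayDefaultsDemo {
  // ofDim allocates a JVM array; reference-typed slots start out as null.
  val raw: Array[String] = Array.ofDim[String](3)

  // fill evaluates the element expression once per slot.
  val filled: Array[String] = Array.fill(3)("")

  // Option(null) is None, so this replaces each null with the empty string.
  val cleaned: Array[String] = raw.map(s => Option(s).getOrElse(""))
}
```

If the array never needs to hold nulls, Array.fill avoids the extra pass that the Option-based cleanup requires.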
