Scala - Collection comparison - Why is Set(1) == ListSet(1)? - scala

Why does this comparison evaluate to true?
import scala.collection.immutable.ListSet
Set(1) == ListSet(1) // Expect false
//Output
res0: Boolean = true
And, more generally, how is the comparison actually done?

Since the inheritance chain Set <: GenSet <: GenSetLike is a bit lengthy, it might not be immediately obvious where to look for the code of equals, so I'll quote it here:
GenSetLike.scala:
/** Compares this set with another object for equality.
 *
 *  '''Note:''' This operation contains an unchecked cast: if `that`
 *  is a set, it will assume with an unchecked cast
 *  that it has the same element type as this set.
 *  Any subsequent ClassCastException is treated as a `false` result.
 *  @param that the other object
 *  @return     `true` if `that` is a set which contains the same elements
 *              as this set.
 */
override def equals(that: Any): Boolean = that match {
  case that: GenSet[_] =>
    (this eq that) ||
    (that canEqual this) &&
    (this.size == that.size) &&
    (try this subsetOf that.asInstanceOf[GenSet[A]]
     catch { case ex: ClassCastException => false })
  case _ =>
    false
}
Essentially, it checks whether the other object is also a GenSet. If it is, it performs some fail-fast checks (invoking canEqual and comparing sizes), and if those pass, it checks whether this set is a subset of the other set, presumably by checking each element. Two sets of equal size where one is a subset of the other must contain exactly the same elements.
So the exact class used to represent the set at runtime is irrelevant; what matters is that the compared object is also a GenSet and has the same elements.
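As a small illustration of that point, the sketch below compares sets backed by different implementations. The object and value names are made up for the example; only the standard collection classes are real.

```scala
import scala.collection.immutable.{HashSet, ListSet, TreeSet}

object SetEqualityDemo {
  // Same elements, four different runtime representations.
  val plain: Set[Int] = Set(1, 2, 3)
  val listBased       = ListSet(3, 2, 1)
  val treeBased       = TreeSet(2, 3, 1)
  val hashBased       = HashSet(1, 2, 3)

  // Equality only looks at "is it a set?" and "same elements?".
  val sameElements: Boolean =
    plain == listBased && listBased == treeBased && treeBased == hashBased

  // Different sizes fail the fail-fast size check.
  val differentElements: Boolean = plain == ListSet(1, 2)
}
```

Here sameElements is true and differentElements is false, regardless of insertion order or backing data structure.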

From Scala collections equality:
The collection libraries have a uniform approach to equality and hashing. The idea is, first, to divide collections into sets, maps, and sequences.
...
On the other hand, within the same category, collections are equal if and only if they have the same elements
In your case, both collections are considered sets and they contain the same elements, hence, they're equal.
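To make the category rule concrete, here is a minimal sketch (names are illustrative). The values are upcast to Iterable so the compiler does not warn about comparisons it can already tell are false.

```scala
import scala.collection.immutable.ListMap

object CategoryEqualityDemo {
  // Same category (sequences), same elements in the same order: equal.
  val seqsEqual: Boolean = List(1, 2, 3) == Vector(1, 2, 3)

  // Same category (maps), same key/value pairs: equal.
  val mapsEqual: Boolean = Map("a" -> 1) == ListMap("a" -> 1)

  // Different categories (set vs. sequence): never equal,
  // even though the elements are the same.
  val asSet: Iterable[Int]  = Set(1, 2, 3)
  val asList: Iterable[Int] = List(1, 2, 3)
  val setVsSeq: Boolean     = asSet == asList
}
```

seqsEqual and mapsEqual are true, while setVsSeq is false.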

Scala 2.12.8 documentation:
This class implements immutable sets using a list-based data
structure.
So ListSet is a set too, just with a concrete (list-based) implementation.

Related

Contains of Option[String] in Scala not working as expected?

I just discovered something weird. This statement:
Some("test this").contains("test")
Evaluates to false. While this evaluates to true:
Some("test this").contains("test this")
How does this make sense? I thought the Option would run the contains on the wrapped object if possible.
EDIT:
I'm also thinking about this from a code readability perspective. Imagine you are seeing this code:
person.name.contains("Roger")
Must name be equal to Roger? Or can it merely contain Roger? The behavior depends on whether it's a String or an Option[String].
There's a principle in typed functional programming called "parametric reasoning". Broadly stated, the principle is that it's desirable to be able to have intuitions about what a function does just from looking at its type signature.
If we "devirtualize" contains (effectively turning it into a static method; this is actually a fairly common optimization step in object-oriented runtimes), Option's contains method has the signature:
def contains[A, A1 >: A](opt: Option[A], elem: A1): Boolean
That is, it takes an Option[A] and an A1 (where A1 is a supertype of A, if it's not an A) and returns a Boolean. Implicitly in Scala's typesystem, of course, we know that A and A1 are both subtypes of Any.
Without knowing anything more about what the types A and A1 are (A might be String and A1 might be AnyRef, or A and A1 might both be Int: whatever our intuition, it has to apply as much in either situation), what could we possibly do? We're basically limited to combinations of operations involving an Option[Any] and an Any which eventually get us to a Boolean (and, ideally, won't throw an exception).
For instance, opt.nonEmpty && opt.get == elem works: we can always call nonEmpty on an Option[Any] and then compare the contents using equality. We could also do something like opt.isEmpty || (opt.get.## % 43) == (elem.## % 57), but knowing that the contents of the Option and some other object have equal remainders in two different bases doesn't strike one as useful.
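The argument above can be sketched as a standalone function. This is not the library's code, just the "devirtualized" signature filled in with the only operations that are available for arbitrary A and A1; the enclosing object name is an assumption.

```scala
object DevirtualizedContains {
  // All we know is A1 >: A, so the only thing we can do with the
  // contents and `elem` is compare them with ==.
  def contains[A, A1 >: A](opt: Option[A], elem: A1): Boolean =
    opt.nonEmpty && opt.get == elem

  val wholeString: Boolean = contains(Some("test this"), "test this")
  val substring: Boolean   = contains(Some("test this"), "test")
  val otherType: Boolean   = contains(Some(1), "1") // A1 is inferred as Any
  val empty: Boolean       = contains(Option.empty[Int], 1)
}
```

Only wholeString is true; nothing in the signature lets the function look inside a String for a substring.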
Note that in your specific case, delegating to the wrapped value's own contains method can't work in general, because there's no contains method on Any. What should the behavior be if we have an Option[Int]?
It might actually be useful, since we do have the ability to convert arbitrary objects into Strings via the toString method (thank you Java!), to implement a containsSubstring method on Option[A]:
def containsSubstring(substring: String): Boolean =
  nonEmpty && get.toString.contains(substring)
You could implement an enrichment class along these lines:
object Enrichments {
  implicit class OptionOps[A](opt: Option[A]) extends AnyVal {
    def containsSubstring(substring: String): Boolean =
      opt.nonEmpty && opt.get.toString.contains(substring)
  }
}
then you only need:
import Enrichments.OptionOps
Some("test this").containsSubstring("test") // evaluates true
case class Person(name: Option[String], age: Int)
// Option(p).containsSubstring("Roger") would also work, assuming Person doesn't override toString...
def isRoger(p: Person): Boolean = p.name.containsSubstring("Roger")
I recommend checking the docs, and if necessary the source, of the API you are using. The docs detail what Option's contains does and how it works:
/** Tests whether the option contains a given value as an element.
 *
 *  This is equivalent to:
 *
 *      option match {
 *        case Some(x) => x == elem
 *        case None    => false
 *      }
 *
 *      // Returns true because Some instance contains string "something" which equals "something".
 *      Some("something") contains "something"
 *
 *      // Returns false because "something" != "anything".
 *      Some("something") contains "anything"
 *
 *      // Returns false when method called on None.
 *      None contains "anything"
 *
 *  @param elem the element to test.
 *  @return `true` if the option has an element that is equal (as
 *          determined by `==`) to `elem`, `false` otherwise.
 */
final def contains[A1 >: A](elem: A1): Boolean =
  !isEmpty && this.get == elem
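If the intent is "the wrapped string contains a substring", one common way to say so with the standard library is Option's exists combined with String's contains. A minimal sketch, with illustrative names:

```scala
object OptionContainsDemo {
  val name: Option[String] = Some("test this")

  // Option.contains compares the wrapped value with ==.
  val equalsWhole: Boolean = name.contains("test this")
  val equalsPart: Boolean  = name.contains("test")

  // Option.exists applies a predicate to the wrapped value, if any,
  // so String.contains does the substring check.
  val hasSubstring: Boolean = name.exists(_.contains("test"))
  val onNone: Boolean       = Option.empty[String].exists(_.contains("test"))
}
```

equalsWhole and hasSubstring are true; equalsPart and onNone are false. Writing person.name.exists(_.contains("Roger")) also removes the readability ambiguity raised in the question.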

Override equality for floating point values in Scala

Note: Bear with me, I'm not asking how to override equals or how to create a custom method to compare floating point values.
Scala is very nice in allowing comparison of objects by value, and in providing a series of tools to do so with little code: in particular, case classes, tuples, and comparison of entire collections.
I often call methods that do intensive computations and return a non-trivial data structure. I can then write a unit test that, given a certain input, calls the method and compares the result against a hardcoded value. For instance:
def compute() =
{
// do a lot of computations here to produce the set below...
Set(('a', 1), ('b', 3))
}
val A = compute()
val equal = A == Set(('a', 1), ('b', 3))
// equal = true
This is a bare-bones example and I'm omitting here any code from specific test libraries, etc.
Given that floating point values are not reliably compared with equals, the following, and rather equivalent example, fails:
def compute() =
{
// do a lot of computations here to produce the set below...
Set(('a', 1.0/3.0), ('b', 3.1))
}
val A = compute()
val equal2 = A == Set(('a', 0.33333), ('b', 3.1)) // Use some arbitrary precision here
// equal2 = false
What I would want is to have a way to make all floating-point comparisons in that call to use an arbitrary level of precision. But note that I don't control (or want to alter in any way) either Set or Double.
I tried defining an implicit conversion from Double to a new class and then overriding equals in that class so that it returns true for approximately equal values. I could then use instances of that class in my hardcoded validations.
implicit class DoubleAprox(d: Double)
{
  override def hashCode = d.hashCode()
  override def equals(other : Any) : Boolean = other match {
    case that : Double => (d - that).abs < 1e-5
    case _ => false
  }
}
val equals3 = DoubleAprox(1.0/3.0) == 0.33333 // true
val equals4 = 0.33333 == DoubleAprox(1.0/3.0) // false
But as you can see, it breaks symmetry. Given that I'm then comparing more complex data-structures (sets, tuples, case classes), I have no way to define a priori if equals() will be called on the left or the right. Seems like I'm bound to traverse all the structures and then do single floating-point comparisons on the branches... So, the question is: is there any way to do this at all??
As a side note: I read an entire chapter on object equality and several blogs, but they only provide solutions for inheritance problems and basically require you to own all the classes involved and change all of them. And all of it seems rather convoluted given what it is trying to solve.
Seems to me that equality is one of those things that is fundamentally broken in Java due to the method having to be added to each class and permanently overridden time and again. What seems more intuitive to me would be to have comparison methods that the compiler can find. Say, you would provide equals(DoubleAprox, Double) and it would be used every time you want to compare 2 objects of those classes.
I think that changing the meaning of equality to mean anything fuzzy is a bad idea. See my comments in Equals for case class with floating point fields for why.
However, it can make sense to do this in a very limited scope, e.g. for testing. I think for numerical problems you should consider using the spire library as a dependency. It contains a large amount of useful things. Among them a type class for equality and mechanisms to derive type class instances for composite types (collections, tuples, etc) based on the type class instances for the individual scalar types.
Since, as you observe, equality in the Java world is fundamentally broken, spire uses other operators (=== for type-safe equality).
Here is an example of how you would redefine equality in a limited scope to get fuzzy equality for comparing test results:
// import the machinery for operators like === (when an Eq type class instance is in scope)
import spire.syntax.all._
object Test extends App {
  // redefine the equality for double, just in this scope, to mean fuzzy equality
  implicit object FuzzyDoubleEq extends spire.algebra.Eq[Double] {
    def eqv(a:Double, b:Double) = (a-b).abs < 1e-5
  }
  // this passes. === looks up the Eq instance for Double in the implicit scope. And
  // since we have not imported the default instance but defined our own, this will
  // find the Eq instance defined above and use its eqv method
  require(0.0 === 0.000001)
  // import automatic generation of type class instances for tuples based on type class instances of the scalars
  // if there is an Eq available for each scalar type of the tuple, this will also make an Eq instance available for the tuple
  import spire.std.tuples._
  require((0.0, 0.0) === (0.000001, 0.0)) // works also for tuples containing doubles
  // import automatic generation of type class instances for arrays based on type class instances of the scalars
  // if there is an Eq instance for the element type of the array, there will also be one for the entire array
  import spire.std.array._
  require(Array(0.0,1.0) === Array(0.000001, 1.0)) // and for arrays of doubles
  import spire.std.seq._
  require(Seq(1.0, 0.0) === Seq(1.000000001, 0.0))
}
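For readers who would rather not add a dependency just for tests, the same type class idea can be sketched by hand. Everything below (ApproxEq, instance, approxEqual, the 1e-5 tolerance) is a hypothetical illustration and not spire's API; it covers Double, Char, pairs and Seq only.

```scala
object ApproxEqSketch {
  // Hypothetical type class: equality that the compiler looks up by type.
  trait ApproxEq[A] { def eqv(x: A, y: A): Boolean }

  object ApproxEq {
    def instance[A](f: (A, A) => Boolean): ApproxEq[A] =
      new ApproxEq[A] { def eqv(x: A, y: A): Boolean = f(x, y) }

    // Fuzzy for Double (tolerance chosen arbitrarily), exact for Char.
    implicit val doubleApprox: ApproxEq[Double] =
      instance[Double]((x, y) => math.abs(x - y) < 1e-5)
    implicit val charExact: ApproxEq[Char] =
      instance[Char]((x, y) => x == y)

    // Composite instances are derived from the element instances.
    implicit def tuple2Approx[A, B](implicit ea: ApproxEq[A], eb: ApproxEq[B]): ApproxEq[(A, B)] =
      instance[(A, B)]((x, y) => ea.eqv(x._1, y._1) && eb.eqv(x._2, y._2))
    implicit def seqApprox[A](implicit ea: ApproxEq[A]): ApproxEq[Seq[A]] =
      instance[Seq[A]]((xs, ys) =>
        xs.length == ys.length && xs.zip(ys).forall { case (x, y) => ea.eqv(x, y) })
  }

  def approxEqual[A](x: A, y: A)(implicit ev: ApproxEq[A]): Boolean = ev.eqv(x, y)

  val computed = Seq(('a', 1.0 / 3.0), ('b', 3.1))
  val expected = Seq(('a', 0.33333), ('b', 3.1))
  val close: Boolean  = approxEqual(computed, expected)
  val apart: Boolean  = approxEqual(1.0, 1.1)
}
```

Because both arguments go through the same instance, the comparison is symmetric by construction. The sketch deliberately uses Seq rather than Set: approximate equality is not transitive, so a hash-based set cannot be compared element by element this way without first deciding how to pair elements up.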
Java equals is indeed not as principled as it should be - people who are very bothered about this use something like Scalaz' Equal and ===. But even that assumes a symmetry of the types involved; I think you would have to write a custom typeclass to allow comparing heterogeneous types.
It's quite easy to write a new typeclass and have instances recursively derived for case classes, using Shapeless' automatic type class instance derivation. I'm not sure that extends to a two-parameter typeclass though. You might find it best to create distinct EqualityLHS and EqualityRHS typeclasses, and then your own equality method for comparing A: EqualityLHS and B: EqualityRHS, which could be pimped onto A as an operator if desired. (Of course it should be possible to extend the technique generically to support two-parameter typeclasses in full generality rather than needing such workarounds, and I'm sure shapeless would greatly appreciate such a contribution).
Best of luck - hopefully this gives you enough to find the rest of the answer yourself. What you want to do is by no means trivial, but with the help of modern Scala techniques it should be very much within the realms of possibility.

Scala mutable MultiMap addBinding and insert order preservation

It appears that MultiMap's addBinding does not preserve the insertion order of values bound to the same key, as the underlying mechanism it uses is a HashSet. What would be an idiomatic way to preserve insertion order with a MultiMap?
Based on the MultiMap source, whose documentation states:
/** Creates a new set.
 *
 *  Classes that use this trait as a mixin can override this method
 *  to have the desired implementation of sets assigned to new keys.
 *  By default this is `HashSet`.
 *
 *  @return An empty set of values of type `B`.
 */
protected def makeSet: Set[B] = new HashSet[B]
You can simply define:
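import scala.collection.mutable.{LinkedHashSet, MultiMap, Set}
trait OrderedMultimap[A, B] extends MultiMap[A, B] {
  // use an insertion-ordered set for the values bound to each key
  override def makeSet: Set[B] = new LinkedHashSet[B]
}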
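A possible usage sketch follows. It redefines the trait so the block is self-contained, and the object and value names are illustrative. Note that, to my knowledge, MultiMap is deprecated as of Scala 2.13, so expect deprecation warnings there.

```scala
import scala.collection.mutable

object OrderedMultiMapDemo {
  trait OrderedMultimap[A, B] extends mutable.MultiMap[A, B] {
    // LinkedHashSet iterates in insertion order.
    override protected def makeSet: mutable.Set[B] = new mutable.LinkedHashSet[B]
  }

  val mm = new mutable.HashMap[String, mutable.Set[Int]] with OrderedMultimap[String, Int]

  // The second binding of 3 is ignored: values are still a set.
  mm.addBinding("k", 3).addBinding("k", 1).addBinding("k", 2).addBinding("k", 3)

  val values: List[Int] = mm("k").toList
}
```

values comes out as List(3, 1, 2): insertion order is kept, but duplicates are still dropped because the values are held in a set.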
One way would be to fall back to a regular Map (not a MultiMap) whose value type is a collection in which order can be enforced (i.e. not a Set). As I understand it, to preserve insertion order in the wider sense that also allows repeated elements, the natural Scala collection to use would be a Seq implementation (e.g. Vector or Queue, depending on the access patterns).
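A minimal sketch of that alternative, assuming an immutable Map with Vector values; the helper name addBinding is borrowed from MultiMap for familiarity and is defined here, not taken from the library.

```scala
object SeqValuedMapDemo {
  // Append `value` to the Vector stored under `key`, creating it if needed.
  def addBinding[K, V](m: Map[K, Vector[V]], key: K, value: V): Map[K, Vector[V]] =
    m.updated(key, m.getOrElse(key, Vector.empty[V]) :+ value)

  val bindings = List("k" -> 3, "k" -> 1, "j" -> 7, "k" -> 3)

  val result: Map[String, Vector[Int]] =
    bindings.foldLeft(Map.empty[String, Vector[Int]]) {
      case (acc, (key, value)) => addBinding(acc, key, value)
    }
}
```

Here result("k") is Vector(3, 1, 3): insertion order is preserved and repeated values are kept, which a set-based MultiMap cannot do.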

Check 2 sets for inclusion in Scala

abstract class FinSet[T] protected () {
// given a set other, it returns true iff every element of this is an element of other
def <=(other:FinSet[T]): Boolean =
// ????
That is what I am given so far. I am somewhat confused on how to implement this method. Would I be calling the method like so:
Set(1,2,3).<=(Set(3,2,1)) which should return true
I was wondering if this would work, but it seems too simple:
def <=(other:FinSet[T]): Boolean = if (this == other) true else false
Just looking for some guidance. Thanks.
& means intersection. If the second set is missing any element of the first set, the following expression returns false:
(thisSet & thatSet) == thisSet
In detail, this code computes the intersection of this set and the other set, and checks whether the result is equal to this set. That is the case exactly when every element of this set is also in the other set.
See the & method, or intersect (its more verbose alias), in the Scaladoc.
You can also do something like this:
thisSet.forall(x => thatSet contains x)
or less verbose:
thisSet.forall(thatSet contains _)
or like this:
(thisSet ++ thatSet) == thatSet
or maybe like this:
(thatSet -- thisSet).size == (thatSet.size - thisSet.size)
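For the standard library's own Set, these variants can be checked against each other and against the built-in subsetOf method. The wrapper object and method names below are illustrative only.

```scala
object SubsetChecks {
  def viaIntersection[A](a: Set[A], b: Set[A]): Boolean = (a & b) == a
  def viaForall[A](a: Set[A], b: Set[A]): Boolean       = a.forall(b.contains)
  def viaUnion[A](a: Set[A], b: Set[A]): Boolean        = (a ++ b) == b
  // |b -- a| = |b| - |a & b|, which equals |b| - |a| only when a is a subset of b.
  def viaDifference[A](a: Set[A], b: Set[A]): Boolean   = (b -- a).size == (b.size - a.size)
  // The standard library method for exactly this question.
  def viaSubsetOf[A](a: Set[A], b: Set[A]): Boolean     = a.subsetOf(b)

  // True when all five formulations give the same answer for (a, b).
  def allAgree[A](a: Set[A], b: Set[A]): Boolean =
    Set(viaIntersection(a, b), viaForall(a, b), viaUnion(a, b),
        viaDifference(a, b), viaSubsetOf(a, b)).size == 1
}
```

For example, every variant returns true for Set(1, 2) against Set(3, 2, 1) and false for Set(1, 4) against Set(3, 2, 1). The forall version can stop at the first missing element, whereas the others build an intermediate set first.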
Rephrasing the requirement: you want to check if, for all elements of this set, the other set contains the element.
This sounds like the combination of two more primitive functions that you will probably want anyway. So, if you haven't done so already, I would define the methods:
def forall(predicate: T => Boolean): Boolean // Checks that predicate holds for all elements
def contains(elem: T): Boolean // Check that elem is an element of the set
Then the method <= devolves to:
def <=(other: FinSet[T]): Boolean = forall(other.contains)
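Put together, a self-contained sketch might look like the following. ListFinSet is a hypothetical list-backed implementation invented for the example; the assignment's real FinSet may be structured differently.

```scala
object FinSetSketch {
  abstract class FinSet[T] {
    def forall(predicate: T => Boolean): Boolean // holds for every element?
    def contains(elem: T): Boolean               // is elem a member?

    // this <= other iff every element of this is an element of other
    def <=(other: FinSet[T]): Boolean = forall(other.contains)
  }

  // Hypothetical implementation backed by a duplicate-free List.
  final class ListFinSet[T](elems: List[T]) extends FinSet[T] {
    private val members = elems.distinct
    def forall(predicate: T => Boolean): Boolean = members.forall(predicate)
    def contains(elem: T): Boolean               = members.contains(elem)
  }

  val small = new ListFinSet(List(1, 2))
  val big   = new ListFinSet(List(3, 2, 1))
  val empty = new ListFinSet(List.empty[Int])
}
```

With these values, small <= big and empty <= small hold, while big <= small does not. The equality-based attempt from the question would wrongly reject small <= big.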

Scala: checking if an object is Numeric

Is it possible for a pattern match to detect if something is a Numeric? I want to do the following:
class DoubleWrapper(value: Double) {
override def equals(o: Any): Boolean = o match {
case o: Numeric => value == o.toDouble
case _ => false
}
override def hashCode(): Int = value ##
}
But of course this doesn't really work because Numeric isn't the supertype of things like Int and Double, it's a typeclass. I also can't do something like def equals[N: Numeric](o: N) because o has to be Any to fit the contract for equals.
So how do I do it without listing out every known Numeric class (including, I guess, user-defined classes I may not even know about)?
The original problem is not solvable, and here is my reasoning why:
To find out whether a type is an instance of a typeclass (such as Numeric), we need implicit resolution. Implicit resolution is done at compile time, but we would need it to be done at runtime. That is currently not possible, because as far as I can tell, the Scala compiler does not leave all necessary information in the compiled class file. To see that, one can write a test class with a method that contains a local variable, that has the implicit modifier. The compilation output will not change when the modifier is removed.
Are you using DoubleWrapper to add methods to Double? Then it should be a transparent type, i.e. you shouldn't be keeping instances, but rather define the pimped methods to return Double instead. That way you can keep using == as defined for primitives, which already does what you want (6.0 == 6 yields true).
Ok, so if not, how about
override def equals(o: Any): Boolean = o == value
If you construct equals methods of other wrappers accordingly, you should end up comparing the primitive values again.
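A different workaround, not taken from the answers above, is to match on java.lang.Number, which boxed primitives such as Int and Double are instances of at runtime. The sketch below is an assumption-laden illustration: it does not recognise user-defined types that merely have a Numeric instance, and sameValueAs is a hypothetical helper showing the compile-time alternative.

```scala
object NumberMatchSketch {
  class DoubleWrapper(val value: Double) {
    override def equals(o: Any): Boolean = o match {
      case w: DoubleWrapper    => value == w.value
      // Boxed Int, Long, Float, Double, ... all extend java.lang.Number.
      case n: java.lang.Number => value == n.doubleValue
      case _                   => false
    }
    override def hashCode(): Int = value.##

    // Compile-time alternative: the Numeric instance is resolved statically.
    def sameValueAs[N](n: N)(implicit num: Numeric[N]): Boolean =
      value == num.toDouble(n)
  }

  val six = new DoubleWrapper(6.0)
}
```

With this, six.equals(6) and six.sameValueAs(6L) are true and six.equals("6") is false. The equals version is not symmetric, though: the boxed Integer 6 knows nothing about DoubleWrapper, so comparing in the other direction yields false, which can cause surprises inside collections.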
Another question is whether you should have such an equals method for a stateful wrapper. I don't think mutable objects should be equal according to one of the values they hold—you will most likely run into trouble with that.