Can I create a collection in Scala that uses different equals/hashCode/compare implementations? - scala

I'm looking for a simple way to create an identity set. I just want to be able to keep track of whether or not I've "seen" a particular object while traversing a graph.
I can't use a regular Set because Set uses "==" (the equals method in Scala) to compare elements. What I want is a Set that uses "eq."
Is there any way to create a Set in Scala that uses some application-specified method for testing equality rather than calling equals on the set elements? I looked for some kind of "wrapEquals" method that I could override but did not find it.
I know that I could use Java's IdentityHashMap, but I'm looking for something more general-purpose.
Another idea I had was to just wrap each set element in another object that implements equals in terms of eq, but it's wasteful to generate tons of new objects just to get a new equals implementation.
Thanks!

Depending on your needs you could create a box for which you use identity checks on the contained element such as:
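class IdentBox[T <: AnyRef](val value: T) {
  // Match on a wildcard: the type argument is erased at runtime, so IdentBox[T] cannot be checked.
  override def equals(other: Any): Boolean = other match {
    case that: IdentBox[_ <: AnyRef] => that.value eq this.value
    case _ => false
  }
  // Identity-based hash, so it stays consistent with the eq-based equals above.
  override def hashCode(): Int = System.identityHashCode(value)
}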
Then have the collection contain those boxes instead of the elements directly: Set[IdentBox[T]]
It has some overhead of boxing / unboxing but it might be tolerable in your use case.

This is a similar question. The accepted answer in that case was to use a TreeSet and provide a custom Comparator.
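As a rough illustration of the TreeSet idea, here is a minimal sketch in which a custom Ordering, rather than equals/hashCode, decides membership. The case-insensitive ordering and the names OrderingSetSketch, caseInsensitive and words are made up for the example. Note that this approach needs a sensible total order, which plain reference identity does not naturally provide.

```scala
import scala.collection.immutable.TreeSet

object OrderingSetSketch {
  // Hypothetical ordering: strings compare equal when they differ only in case.
  val caseInsensitive: Ordering[String] = Ordering.by[String, String](_.toLowerCase)

  // TreeSet consults the Ordering, not equals/hashCode, to decide membership.
  val words: TreeSet[String] =
    TreeSet.empty[String](caseInsensitive) + "Hello" + "HELLO" + "world"
}
```

Here "Hello" and "HELLO" collapse into one element because the ordering treats them as equal.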

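Since you don't require a reference to the "seen" objects, but just a boolean value for "contains", I would suggest just using a mutable.Set[Int] and loading it with values obtained by calling System.identityHashCode(obj). Be aware that identity hash codes are not guaranteed to be unique, so two distinct objects can occasionally be mistaken for one another.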
Scala custom collections have enough conceptual surface area to scare off most people who want a quick tweak like this.
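If occasional hash collisions are not acceptable, one option is a set view over Java's IdentityHashMap, which compares keys by reference. The following is only a sketch: Node, newIdentitySet and countDistinctInstances are hypothetical names, and it uses the Java Set interface directly rather than a Scala collection.

```scala
import java.util.{Collections, IdentityHashMap}

object IdentitySeenSketch {
  // Hypothetical graph node; case-class equality makes two nodes with the same label "==".
  final case class Node(label: String)

  // A java.util.Set that compares members by reference identity (eq), not by equals.
  def newIdentitySet[A <: AnyRef](): java.util.Set[A] =
    Collections.newSetFromMap(new IdentityHashMap[A, java.lang.Boolean]())

  // Counts distinct instances; add returns false when that instance was already seen.
  def countDistinctInstances(nodes: Seq[Node]): Int = {
    val seen = newIdentitySet[Node]()
    nodes.count(n => seen.add(n))
  }
}
```

This avoids allocating a wrapper per element, at the cost of working with a Java collection type.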

Related

Is there a way to make sure that a class overrides hashCode

I am creating a cache, and want to make sure that the key type overrides hashCode.
If hashCode was not already defined on Object, something like this would work
trait Key {
def hashCode: Int
}
If the keys are always case classes it is obviously not a problem, but I want to make sure that if somebody passes a regular class it will fail. Is there a way to do it in Scala?
On a side note: My key is specifications for a SQL query which currently is represented as case classes. For example
case class Filter(age: Option[Int], gender: Option[String])
But eventually, I want to represent it using a cleaner specification pattern implementation (for example: https://gist.github.com/lbialy/912fad3c909374b81ce7)
If you want to explicitly whitelist classes that are allowed to use their hashCode, you cannot use inheritance for that, but you can provide your own typeclass:
trait HasApprovedHashCode[X] {
def hashCode(x: X): Int
}
and then modify all the methods that crucially rely on a proper implementation of hashCode like this:
def methodRelyingOnHashCode[K: HasApprovedHashCode, V](...) = ...
Now you can explicitly whitelist only those classes that you deem as having good enough implementation of hashCode.
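A minimal sketch of how the whitelist could look in practice, assuming a hypothetical Cache class and reusing the Filter case class from the question. Only the trait comes from the answer; everything else is illustrative.

```scala
object ApprovedHashSketch {
  trait HasApprovedHashCode[X] {
    def hashCode(x: X): Int
  }

  final case class Filter(age: Option[Int], gender: Option[String])

  object Filter {
    // Whitelist Filter: case classes get a structural hashCode from the compiler.
    implicit val approved: HasApprovedHashCode[Filter] =
      new HasApprovedHashCode[Filter] {
        def hashCode(x: Filter): Int = x.hashCode()
      }
  }

  // Hypothetical cache; it compiles only for key types that have an instance in scope.
  final class Cache[K: HasApprovedHashCode, V] {
    private val underlying = scala.collection.mutable.Map.empty[K, V]
    def put(k: K, v: V): Unit = underlying.update(k, v)
    def get(k: K): Option[V] = underlying.get(k)
  }
}
```

With this in place, new Cache[Filter, String] compiles, while a key type without a HasApprovedHashCode instance is rejected at compile time.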
Usually, I would say: the hash code of the key is not your responsibility. If the users of your library insist on shooting themselves in the foot, you cannot prevent it. You shouldn't facilitate it, or create a situation where it is almost inevitable, but it's not your responsibility to hunt down every single class out there that could somehow misbehave when used as a key of a map.

Use mocked function return value in real function call

Is it possible to use a mocked function inside a real function call? Both functions are in the same object. So for example, if I have
object A {
  def mockThis(value: Int): Int = {
    value * 5
  }
  def realFuncIWantToTest(value: Int): Int = {
    val v = mockThis(value)
    v
  }
}
Obviously this is an extremely simple case and this isn't what my code is doing (v is actually a complicated object). Essentially I want realFuncIWantToTest to use the mocked function return value that I define.
Thanks!
You might be able to do this using Mockito's spies; see here for an example on that.
Spies basically work by having that spy wrapping around a real object of your class under test.
But one word here: even when it is possible, please consider changing your design instead. This "partial mocking" is often a good indication that your class is violating the single responsibility principle. Meaning: a class should be responsible for "one" thing. But the idea that you can or have to partially mock things within your class indicates that your class is responsible for at least two somewhat disconnected aspects.
In that sense: the better approach would be that mockThis() would be a call on another object; which could be inserted via dependency injection into this class.
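The dependency-injection idea might look roughly like the sketch below. Scaler, TimesFive and StubScaler are hypothetical names, and a hand-written stub stands in for a mocking library so the example has no external dependencies.

```scala
object InjectionSketch {
  // Hypothetical collaborator extracted from the original object A.
  trait Scaler {
    def scale(value: Int): Int
  }

  // Production implementation, equivalent to the original mockThis.
  object TimesFive extends Scaler {
    def scale(value: Int): Int = value * 5
  }

  // The class under test receives its collaborator instead of calling a sibling method.
  final class A(scaler: Scaler) {
    def realFuncIWantToTest(value: Int): Int = scaler.scale(value)
  }

  // Hand-written stub standing in for a mock; it always returns the canned value.
  final class StubScaler(canned: Int) extends Scaler {
    def scale(value: Int): Int = canned
  }
}
```

In a test, new A(new StubScaler(42)) exercises realFuncIWantToTest with a controlled return value, and no partial mocking is needed.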
Long story short: at the Java level your idea should work from a technical perspective (though I have some doubts that Mockito will work nicely with your Scala objects); but from a conceptual point of view, you should rather avoid doing it this way.

Scala type alias with companion object

I'm a relatively new Scala user and I wanted to get an opinion on the current design of my code.
I have a few classes that are all represented as fixed length Vector[Byte] (ultimately they are used in a learning algorithm that requires a byte string), say A, B and C.
I would like these classes to be referred to as A, B and C elsewhere in the package for readability sake and I don't need to add any extra class methods to Vector for these methods. Hence, I don't think the extend-my-library pattern is useful here.
However, I would like to include all the useful functional methods that come with Vector without having to 'drill' into a wrapper object each time. As efficiency is important here, I also didn't want the added weight of a wrapper.
Therefore I decided to define type aliases in the package object:
package object abc {
  type A = Vector[Byte]
  type B = Vector[Byte]
  type C = Vector[Byte]
}
However, each has its own fixed length and I would like to include factory methods for their creation. It seems like this is what companion objects are for. This is how my final design looks:
package object abc {
  type A = Vector[Byte]
  object A {
    val LENGTH: Int = ...
    def apply(...): A = {
      Vector.tabulate...
    }
  }
  ...
}
Everything compiles and it allows me to do stuff like this:
val a: A = A(...)
a map {...} mkString(...)
I can't find anything specifically warning against writing companion objects for type aliases, but it seems it goes against how type aliases should be used. It also means that all three of these classes are defined in the same file, when ideally they should be separated.
Are there any hidden problems with this approach?
Is there a better design for this problem?
Thanks.
I guess it is totally ok, because you are not really implementing a companion object.
If you were, you would have access to private fields of immutable.Vector from inside object A (like e.g. private var dirty), which you do not have.
Thus, although it somewhat feels like A is a companion object, it really isn't.
If it were possible to create a companion object for any type just by using a type alias, member visibility constraints would be moot (except maybe for private|protected[this]).
Furthermore, naming the object like the type alias clarifies context and purpose of the object, which is a plus in my book.
Having them all in one file is something that is pretty common in scala as I know it (e.g. when using the type class pattern).
Thus:
No pitfalls that I know of.
And, imho, no need for a different approach.
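For reference, a compilable sketch of the alias-plus-object pattern. It uses a plain object abc instead of a package object so the snippet stands alone, and the length of 4 and the padding factory are assumptions made purely for illustration; the question elides the real values.

```scala
object abc {
  type A = Vector[Byte]

  // Same name as the alias, but an ordinary object: it has no access to Vector's private members.
  object A {
    val Length: Int = 4 // hypothetical fixed length

    // Hypothetical factory: pads with zeros or truncates to exactly Length bytes.
    def apply(bytes: Byte*): A =
      Vector.tabulate(Length)(i => if (i < bytes.length) bytes(i) else 0.toByte)
  }
}
```

Because A is only an alias, the result is a plain Vector[Byte] and all Vector methods are available without unwrapping.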

Pattern Matching Design

I recently wrote some code like the block below and it left me with thoughts that the design could be improved if I was more knowledgeable on functional programming abstractions.
sealed trait Foo
case object A extends Foo
case object B extends Foo
case object C extends Foo
.
.
.
object Foo {
  private def someFunctionSemanticallyRelatedToA() = { /* do stuff */ }
  private def someFunctionSemanticallyRelatedToB() = { /* do stuff */ }
  private def someFunctionSemanticallyRelatedToC() = { /* do stuff */ }
  .
  .
  .
  def somePublicFunction(x: Foo) = x match {
    case A => someFunctionSemanticallyRelatedToA()
    case B => someFunctionSemanticallyRelatedToB()
    case C => someFunctionSemanticallyRelatedToC()
    .
    .
    .
  }
}
My questions are:
Is the somePublicFunction() suffering from code smell or even the whole design? My concern is that the list of value constructors could grow quite big.
Is there a better FP abstraction to handle this type of design more elegantly or even concisely?
You've just run into the expression problem. In your code sample, the problem is that potentially every time you add or remove a case from your Foo algebraic data type, you'll need to modify every single match (like in somePublicFunction) against values of Foo. In Nimrand's answer, the problem is in the opposite end of the spectrum: you can add or remove cases from Foo easily, but every time you want to add or remove a behaviour (a method), you'll need to modify every subclass of Foo.
There are various proposals to solve the expression problem, but one interesting functional way is Oleg Kiselyov's Typed Tagless Final Interpreters, which replaces each case of the algebraic data type with a function that returns some abstract value that's considered to be equivalent to that case. Using generics (i.e. type parameters), these functions can all have compatible types and work with each other no matter when they were implemented. E.g., I've implemented an example of building and evaluating an arithmetic expression tree using TTFI: https://github.com/yawaramin/scala-ttfi
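To make the tagless final idea a little more concrete, here is a small sketch written for this discussion; it is not taken from the linked repository. ExprAlg, EvalAlg, ShowAlg and sample are hypothetical names. Each former case of the data type becomes a method that returns an abstract result type R, and each interpreter picks its own R.

```scala
object TtfiSketch {
  // The "language": each former ADT case becomes a method returning an abstract R.
  trait ExprAlg[R] {
    def lit(n: Int): R
    def add(a: R, b: R): R
  }

  // One interpreter evaluates terms to an Int.
  object EvalAlg extends ExprAlg[Int] {
    def lit(n: Int): Int = n
    def add(a: Int, b: Int): Int = a + b
  }

  // Another interpreter, added independently, renders terms as text.
  object ShowAlg extends ExprAlg[String] {
    def lit(n: Int): String = n.toString
    def add(a: String, b: String): String = s"($a + $b)"
  }

  // A term is written once against the abstract algebra and reused by every interpreter.
  def sample[R](alg: ExprAlg[R]): R =
    alg.add(alg.lit(1), alg.add(alg.lit(2), alg.lit(3)))
}
```

New interpreters can be added without touching existing ones, and new operations can be added by extending the trait in a separate algebra.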
Your explanation is a bit too abstract to give you a confident answer. However, if the list of subclasses of Foo is likely to grow/change in the future, I would be inclined to make it an abstract method of Foo, and then implement the logic for each case in the sub classes. Then you just call Foo.myAbstractMethod() and polymorphism handles everything neatly.
This keeps the code specific to each object with the object itself, which keeps things more neatly organized. It also means that you can add new subclasses of Foo without having to jump around to multiple places in the code to augment existing match statements.
Case classes and pattern matching work best when the set of subclasses is relatively small and fixed. For example, Option[T] has only two subclasses, Some[T] and None. That will NEVER change, because to change it would be to fundamentally change what Option[T] represents. Therefore, it's a good candidate for pattern matching.
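The abstract-method alternative could be sketched as follows. The method name describe and its String result are placeholders chosen for the example, not part of the original code.

```scala
object PolymorphicFooSketch {
  sealed trait Foo {
    // Hypothetical abstract method; each case supplies its own behaviour.
    def describe(): String
  }
  case object A extends Foo { def describe(): String = "handled A" }
  case object B extends Foo { def describe(): String = "handled B" }
  case object C extends Foo { def describe(): String = "handled C" }

  // No match needed: dynamic dispatch picks the implementation.
  def somePublicFunction(x: Foo): String = x.describe()
}
```

Adding a case D then only requires a new object with its own describe, at the price that adding a new operation touches every case.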

How to compare two objects for equality in Scala?

I have a very basic equality check between two objects but it fails.
package foo
import org.junit.Assert._
object Sandbox extends App {
  class A
  val a = new A
  val b = new A
  assertEquals(a, b)
}
My use-case is more complex but I wanted to get my basics right. I get an assertion error when I run the code:
Caused by: java.lang.AssertionError: expected:<foo.Sandbox$A@3f86d38b> but was:<foo.Sandbox$A@206d63fd>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
. . . .
How can I compare two objects for equality?
You can override AnyRef's equals method (the same way you would override Object.equals in Java):
override def equals(arg0: Any): Boolean
Or you can make A a case class; Scala generates a correct equals for case classes out of the box.
Unfortunately, Java defines equals, hashCode, and toString methods on every object. The default implementation of equals checks referential equality, that is, whether both references point to the exact same object. Since you created two different objects and gave A no equals method, they are not equal, because they are not the same instance.
You can provide A with a better equals method, but any time you override equals you should override hashCode as well to ensure that the two are in agreement. The rules for them being in agreement are:
if equals returns true, the two objects must return the same hashCode
if objects have different hashCodes, equals must return false.
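Both routes can be sketched briefly. Since the class A in the question has no fields, the example assumes a hypothetical Point with a single field so that there is something to compare; CasePoint shows the case-class variant.

```scala
object EqualitySketch {
  // Hypothetical class with one field, so there is something to compare.
  class Point(val x: Int) {
    override def equals(other: Any): Boolean = other match {
      case that: Point => this.x == that.x
      case _ => false
    }
    // Derived from the same field as equals, so equal points share a hash code.
    override def hashCode(): Int = x.hashCode()
  }

  // The case-class route: equals, hashCode and toString are generated by the compiler.
  final case class CasePoint(x: Int)
}
```

With either definition, an assertEquals on two instances holding the same value would pass, because JUnit delegates to equals.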