Prevent Mixin overriding equals from breaking case class equality - scala

Squeryl defines a trait KeyedEntity which overrides equals, checking for several conditions in an if and calling super.equals in the end. Since super is Object, it will always fail.
Consider:
trait T { override def equals(z: Any):Boolean = super.equals(z)} }
case class A(a: Int) extends T
val a = A(1); val b = A(1)
a==b // false
Thus, if you declare
case class Record(id: Long, name: String ...) extends KeyedEntity[Long] { ... }
-- and you create several Record instances but do not persist them, their comparison will break. I found this by implementing both Salat and Squeryl back ends for the same class, and then all Salat tests fail since isPersisted from KeyedEntity is false.
Is there a design whereby KeyedEntity will preserve case class equality if mixed into a case class? I tried self-typing and parameterizing BetterKeyedEntity[K,P] { self: P => ... } for the case class type as P but it causes infinite recursion in equals.
As things stand right now, super is Object so the final branch of the overridden equals in KeyedEntity will always return false.

The structural equality check usually generated for case classes seems not to be generated if there is an equals override. However some subtleties have to be noted.
Mixing the concept of id-based equality falling back to structural equality might not be a good idea, since I can imagine that it may lead to subtle bugs. For example:
x: A(1) and and y: A(1) are not yet persisted, so they are equal
then they get persisted, and since they are separate objects, the persistence framework may persist them as separate entities (I don't know Squeryl, maybe not an issue there, but this is a thin line to walk)
after persisting, they are suddenly not equal since the id differs.
Even worse, if x and y get persisted to the same id, the hashCode will differ before and after persisting (the source shows that if persisted it is the hashCode of the id). This breaks immutability, and will lead to very bad behavior (when put in maps for example). See this gist in which I demonstrate the assert failing.
So don't mix structural and id-based equality implicitly. Also see this explained in the context of Hibernate.
Typeclasses
It have to be noted that others pointed out (ref needed) that the concept of method-based equality is flawed, for such reasons (there is not only one way two thing can be equal). Therefore you can define a typeclass which describes Equality:
trait Eq[A] {
def equal(x: A, y: A): Boolean
}
and define (possibly multiple) instances of that typeclass for your classes:
// structural equality
implicit object MyClassEqual extends Eq[MyClass] { ... }
// id based equality
def idEq[K, A <: KeyedEntity[K]]: Eq[A] = new Eq[A] {
def equal(x: A, y: A) = x.id == y.id
}
then you can request that things are members of the Eq typeclass:
def useSomeObjects[A](a: A, b: A)(implicit aEq: Eq[A]) = {
... aEq.equal(a, b) ...
}
So you can decide which notion of equality to use by importing the appropriate typeclass in scope, or passing the typeclass instance directly as in useSomeObjects(x, y)(idEq[Int, SomeClass])
Note that you might also need a Hashable typeclass, similarly.
Autogenerating Eq instances
This situation is pretty similar to the Scala stdlib's scala.math.Ordering typeclass. Here is an example for auto-deriving structural Ordering instances for case classes using the excellent shapeless library.
The same would easy to be done for Eq and Hashable.
Scalaz
Note that scalaz has Equal typeclass, with nice pimp patterns with which you can write x === y instead of eqInstance.equal(x, y). I'm not aware it has Hashable typeclass, yet.

Related

'with' keyword usage in scala with case classes

I though if something like this makes sense in scala:
object CaseClassUnion extends App {
case class HttpConfig(bindUrl: String, port: String)
case class DbConfig(url: String, usr: String, pass: String)
val combined: HttpConfig with DbConfig = ???
//HttpConfig("0.0.0.0", "21") ++ DbConfig("localhost", "root", "root")
//would be nice to have something like that
}
At least this compiles... Is there a way, probably with macros magic to achieve union of two classes given their instances?
In zio I believe there is something like in reverse:
val live: ZLayer[ProfileConfiguration with Logging, Nothing, ApplicationConfiguration] =
ZLayer.fromServices[ProfileConfigurationModule.Service, Logger[String], Service] { (profileConfig, logger) => ???
where we convert ProfileConfiguration with Logging to function of ProfileConfigurationModule.Service, Logger[String] => Service
Several things.
When you have several traits combined with with Scala does a trait linearization to combine them into one class with a linear hierarchy. But that's true for traits which doesn't have constructors!
case class (which is not a trait) cannot be extended with another case class (at all) because that would break contracts like:
case class A(a: Int)
case class B(a: Int, b: String) extends A(a)
A(1) == B(1, "") // because B is A and their respective fields match
B(1, "") != A(1) // because A is not B
B(1, "").hashCode != A(1).hashCode // A == B is true but hashCodes are different!
which means that you cannot even generate case class combination manually. You want to "combine" them, use some product: a tuple, another case class, etc.
If you are curious about ZIO it:
uses traits
uses them as some sort of type-level trick to represent an unordered set of dependencies, where type inference would calculate set sum when you combine operations and some clever trickery to remove traits from the list using .provide to remove dependency from the set
ZLayers are just making these shenanigans easier
so and if you even pass there some A with B you either combined it yourself by using cake pattern, or you passed dependencies one by one. ZIO developer might never be faced with the problem of needing some macro to combine several case classes (.provide(combineMagically(A, B, C, D, ...)) as they could pass implementations of each dependency one by one (.provide(A).provide(B)) and the code underneath would never need the combination of these types as one value - it's just a compile-time trick that might never translate to the requirement of an actual value of type A with B with C with D ....
TL;DR: You cannot generate a combination of 2 case classes; ZIO uses compound types as some sort of type-level set to trace dependencies and it doesn't actually require creating values of the compound types.

Require semigroup to be associative in scala

A semigroup is required to be associative, but I could define a Semigroup like:
trait Semigroup[T] {
def op(t1:T, t2:T) : T
}
def plus = new Semigroup[Int] { def op(t1:Int, t2:Int) = t1 - t2 }
I am able to implement plus which is not associative, yet the class is still a Semigroup. Is there a safeguard in place against this or is the user expected to rely on testing to prevent this?
There will be no compilation exception, that the associativity property does not hold. In other words, it's up to you to make sure it's implemented correctly.
But if you use cats, you can use laws to make sure all the properties required for the semigroup and other structures defined in cats. Take a look at the documentation. You can create a test to check if the semigroup you defined is alright:
class TreeLawTests extends AnyFunSuite with Discipline {
checkAll("YourSemigroup[YourType].SemigroupLaws", SemigroupTests[YourSemigroup[YourType]].semigroup)
}

Override equality for floating point values in Scala

Note: Bear with me, I'm not asking how to override equals or how to create a custom method to compare floating point values.
Scala is very nice in allowing comparison of objects by value, and by providing a series of tools to do so with little code. In particular, case classes, tuples and allowing comparison of entire collections.
I've often call methods that do intensive computations and generate o non-trivial data structure to return and I can then write a unit test that given a certain input will call the method and then compare the results against a hardcoded value. For instance:
def compute() =
{
// do a lot of computations here to produce the set below...
Set(('a', 1), ('b', 3))
}
val A = compute()
val equal = A == Set(('a', 1), ('b', 3))
// equal = true
This is a bare-bones example and I'm omitting here any code from specific test libraries, etc.
Given that floating point values are not reliably compared with equals, the following, and rather equivalent example, fails:
def compute() =
{
// do a lot of computations here to produce the set below...
Set(('a', 1.0/3.0), ('b', 3.1))
}
val A = compute()
val equal2 = A == Set(('a', 0.33333), ('b', 3.1)) // Use some arbitrary precision here
// equal2 = false
What I would want is to have a way to make all floating-point comparisons in that call to use an arbitrary level of precision. But note that I don't control (or want to alter in any way) either Set or Double.
I tried defining an implicit conversion from double to a new class and then overloading that class to return true. I could then use instances of that class in my hardcoded validations.
implicit class DoubleAprox(d: Double)
{
override def hashCode = d.hashCode()
override def equals(other : Any) : Boolean = other match {
case that : Double => (d - that).abs < 1e-5
case _ => false
}
}
val equals3 = DoubleAprox(1.0/3.0) == 0.33333 // true
val equals4 = 1.33333 == DoubleAprox(1.0/3.0) // false
But as you can see, it breaks symmetry. Given that I'm then comparing more complex data-structures (sets, tuples, case classes), I have no way to define a priori if equals() will be called on the left or the right. Seems like I'm bound to traverse all the structures and then do single floating-point comparisons on the branches... So, the question is: is there any way to do this at all??
As a side note: I gave a good read to an entire chapter on object equality and several blogs, but they only provides solutions for inheritance problems and requires you to basically own all classes involved and change all of them. And all of it seems rather convoluted given what it is trying to solve.
Seems to me that equality is one of those things that is fundamentally broken in Java due to the method having to be added to each class and permanently overridden time and again. What seems more intuitive to me would be to have comparison methods that the compiler can find. Say, you would provide equals(DoubleAprox, Double) and it would be used every time you want to compare 2 objects of those classes.
I think that changing the meaning of equality to mean anything fuzzy is a bad idea. See my comments in Equals for case class with floating point fields for why.
However, it can make sense to do this in a very limited scope, e.g. for testing. I think for numerical problems you should consider using the spire library as a dependency. It contains a large amount of useful things. Among them a type class for equality and mechanisms to derive type class instances for composite types (collections, tuples, etc) based on the type class instances for the individual scalar types.
Since as you observe, equality in the java world is fundamentally broken, they are using other operators (=== for type safe equality).
Here is an example how you would redefine equality for a limited scope to get fuzzy equality for comparing test results:
// import the machinery for operators like === (when an Eq type class instance is in scope)
import spire.syntax.all._
object Test extends App {
// redefine the equality for double, just in this scope, to mean fuzzy equali
implicit object FuzzyDoubleEq extends spire.algebra.Eq[Double] {
def eqv(a:Double, b:Double) = (a-b).abs < 1e-5
}
// this passes. === looks up the Eq instance for Double in the implicit scope. And
// since we have not imported the default instance but defined our own, this will
// find the Eq instance defined above and use its eqv method
require(0.0 === 0.000001)
// import automatic generation of type class instances for tuples based on type class instances of the scalars
// if there is an Eq available for each scalar type of the tuple, this will also make an Eq instance available for the tuple
import spire.std.tuples._
require((0.0, 0.0) === (0.000001, 0.0)) // works also for tuples containing doubles
// import automatic generation of type class instances for arrays based on type class instances of the scalars
// if there is an Eq instance for the element type of the array, there will also be one for the entire array
import spire.std.array._
require(Array(0.0,1.0) === Array(0.000001, 1.0)) // and for arrays of doubles
import spire.std.seq._
require(Seq(1.0, 0.0) === Seq(1.000000001, 0.0))
}
Java equals is indeed not as principled as it should be - people who are very bothered about this use something like Scalaz' Equal and ===. But even that assumes a symmetry of the types involved; I think you would have to write a custom typeclass to allow comparing heterogeneous types.
It's quite easy to write a new typeclass and have instances recursively derived for case classes, using Shapeless' automatic type class instance derivation. I'm not sure that extends to a two-parameter typeclass though. You might find it best to create distinct EqualityLHS and EqualityRHS typeclasses, and then your own equality method for comparing A: EqualityLHS and B: EqualityRHS, which could be pimped onto A as an operator if desired. (Of course it should be possible to extend the technique generically to support two-parameter typeclasses in full generality rather than needing such workarounds, and I'm sure shapeless would greatly appreciate such a contribution).
Best of luck - hopefully this gives you enough to find the rest of the answer yourself. What you want to do is by no means trivial, but with the help of modern Scala techniques it should be very much within the realms of possibility.

Put method in trait or in case class?

There are two ways of defining a method for two different classes inheriting the same trait in Scala.
sealed trait Z { def minus: String }
case class A() extends Z { def minus = "a" }
case class B() extends Z { def minus = "b" }
The alternative is the following:
sealed trait Z { def minus: String = this match {
case A() => "a"
case B() => "b"
}
case class A() extends Z
case class B() extends Z
The first method repeats the method name, whereas the second method repeats the class name.
I think that the first method is the best to use because the codes are separated. However, I found myself often using the second one for complicated methods, so that adding additional arguments can be done very easily for example like this:
sealed trait Z {
def minus(word: Boolean = false): String = this match {
case A() => if(word) "ant" else "a"
case B() => if(word) "boat" else "b"
}
case class A() extends Z
case class B() extends Z
What are other differences between those practices? Are there any bugs that are waiting for me if I choose the second approach?
EDIT:
I was quoted the open/closed principle, but sometimes, I need to modify not only the output of the functions depending on new case classes, but also the input because of code refactoring. Is there a better pattern than the first one? If I want to add the previous mentioned functionality in the first example, this would yield the ugly code where the input is repeated:
sealed trait Z { def minus(word: Boolean): String ; def minus = minus(false) }
case class A() extends Z { def minus(word: Boolean) = if(word) "ant" else "a" }
case class B() extends Z { def minus(word: Boolean) = if(word) "boat" else "b" }
I would choose the first one.
Why ? Merely to keep Open/Closed Principle.
Indeed, if you want to add another subclass, let's say case class C, you'll have to modify supertrait/superclass to insert the new condition... ugly
Your scenario has a similar in Java with template/strategy pattern against conditional.
UPDATE:
In your last scenario, you can't avoid the "duplication" of input. Indeed, parameter type in Scala isn't inferable.
It still better to have cohesive methods than blending the whole inside one method presenting as many parameters as the method union expects.
Just Imagine ten conditions in your supertrait method. What if you change inadvertently the behavior of one of each? Each change would be risked and supertrait unit tests should always run each time you modify it ...
Moreover changing inadvertently an input parameter (not a BEHAVIOR) is not "dangerous" at all. Why? because compiler would tell you that a parameter/parameter type isn't relevant any more.
And if you want to change it and do the same for every subclasses...ask to your IDE, it loves refactoring things like this in one click.
As this link explains:
Why open-closed principle matters:
No unit testing required.
No need to understand the sourcecode from an important and huge class.
Since the drawing code is moved to the concrete subclasses, it's a reduced risk to affect old functionallity when new functionality is added.
UPDATE 2:
Here a sample avoiding inputs duplication fitting your expectation:
sealed trait Z {
def minus(word: Boolean): String = if(word) whenWord else whenNotWord
def whenWord: String
def whenNotWord: String
}
case class A() extends Z { def whenWord = "ant"; def whenNotWord = "a"}
Thanks type inference :)
Personally, I'd stay away from the second approach. Each time you add a new sub class of Z you have to touch the shared minus method, potentially putting at risk the behavior tied to the existing implementations. With the first approach adding a new subclass has no potential side effect on the existing structures. There might be a little of the Open/Closed Principle in here and your second approach might violate it.
Open/Closed principle can be violated with both approaches. They are orthogonal to each other. The first one allows to easily add new type and implement required methods, it breaks Open/Closed principle if you need to add new method into hierarchy or refactor method signatures to the point that it breaks any client code. It is after all reason why default methods were added to Java8 interfaces so that old API can be extended without requiring client code to adapt.
This approach is typical for OOP.
The second approach is more typical for FP. In this case it is easy to add methods but it is hard to add new type (it breaks O/C here). It is good approach for closed hierarchies, typical example are Algebraic Data Types (ADT). Standardized protocol which is not meant to be extended by clients could be a candidate.
Languages struggle to allow to design API which would have both benefits - easy to add types as well as adding methods. This problem is called Expression Problem. Scala provides Typeclass pattern to solve this problem which allows to add functionality to existing types in ad-hoc and selective manner.
Which one is better depends on your use case.
Starting in Scala 3, you have the possibility to use trait parameters (just like classes have parameters), which simplifies things quite a lot in this case:
trait Z(x: String) { def minus: String = x }
case class A() extends Z("a")
case class B() extends Z("b")
A().minus // "a"
B().minus // "b"

Scala contravariance - real life example

I understand covariance and contravariance in scala. Covariance has many applications in the real world, but I can not think of any for contravariance applications, except the same old examples for Functions.
Can someone shed some light on real world examples of contravariance use?
In my opinion, the two most simple examples after Function are ordering and equality. However, the first is not contra-variant in Scala's standard library, and the second doesn't even exist in it. So, I'm going to use Scalaz equivalents: Order and Equal.
Next, I need some class hierarchy, preferably one which is familiar and, of course, it both concepts above must make sense for it. If Scala had a Number superclass of all numeric types, that would have been perfect. Unfortunately, it has no such thing.
So I'm going to try to make the examples with collections. To make it simple, let's just consider Seq[Int] and List[Int]. It should be clear that List[Int] is a subtype of Seq[Int], ie, List[Int] <: Seq[Int].
So, what can we do with it? First, let's write something that compares two lists:
def smaller(a: List[Int], b: List[Int])(implicit ord: Order[List[Int]]) =
if (ord.order(a,b) == LT) a else b
Now I'm going to write an implicit Order for Seq[Int]:
implicit val seqOrder = new Order[Seq[Int]] {
def order(a: Seq[Int], b: Seq[Int]) =
if (a.size < b.size) LT
else if (b.size < a.size) GT
else EQ
}
With these definitions, I can now do something like this:
scala> smaller(List(1), List(1, 2, 3))
res0: List[Int] = List(1)
Note that I'm asking for an Order[List[Int]], but I'm passing a Order[Seq[Int]]. This means that Order[Seq[Int]] <: Order[List[Int]]. Given that Seq[Int] >: List[Int], this is only possible because of contra-variance.
The next question is: does it make any sense?
Let's consider smaller again. I want to compare two lists of integers. Naturally, anything that compares two lists is acceptable, but what's the logic of something that compares two Seq[Int] being acceptable?
Note in the definition of seqOrder how the things being compared becomes parameters to it. Obviously, a List[Int] can be a parameter to something expecting a Seq[Int]. From that follows that a something that compares Seq[Int] is acceptable in place of something that compares List[Int]: they both can be used with the same parameters.
What about the reverse? Let's say I had a method that only compared :: (list's cons), which, together with Nil, is a subtype of List. I obviously could not use this, because smaller might well receive a Nil to compare. It follows that an Order[::[Int]] cannot be used instead of Order[List[Int]].
Let's proceed to equality, and write a method for it:
def equalLists(a: List[Int], b: List[Int])(implicit eq: Equal[List[Int]]) = eq.equal(a, b)
Because Order extends Equal, I can use it with the same implicit above:
scala> equalLists(List(4, 5, 6), List(1, 2, 3)) // we are comparing lengths!
res3: Boolean = true
The logic here is the same one. Anything that can tell whether two Seq[Int] are the same can, obviously, also tell whether two List[Int] are the same. From that, it follows that Equal[Seq[Int]] <: Equal[List[Int]], which is true because Equal is contra-variant.
This example is from the last project I was working on. Say you have a type-class PrettyPrinter[A] that provides logic for pretty-printing objects of type A. Now if B >: A (i.e. if B is superclass of A) and you know how to pretty-print B (i.e. have an instance of PrettyPrinter[B] available) then you can use the same logic to pretty-print A. In other words, B >: A implies PrettyPrinter[B] <: PrettyPrinter[A]. So you can declare PrettyPrinter[A] contravariant on A.
scala> trait Animal
defined trait Animal
scala> case class Dog(name: String) extends Animal
defined class Dog
scala> trait PrettyPrinter[-A] {
| def pprint(a: A): String
| }
defined trait PrettyPrinter
scala> def pprint[A](a: A)(implicit p: PrettyPrinter[A]) = p.pprint(a)
pprint: [A](a: A)(implicit p: PrettyPrinter[A])String
scala> implicit object AnimalPrettyPrinter extends PrettyPrinter[Animal] {
| def pprint(a: Animal) = "[Animal : %s]" format (a)
| }
defined module AnimalPrettyPrinter
scala> pprint(Dog("Tom"))
res159: String = [Animal : Dog(Tom)]
Some other examples would be Ordering type-class from Scala standard library, Equal, Show (isomorphic to PrettyPrinter above), Resource type-classes from Scalaz etc.
Edit:
As Daniel pointed out, Scala's Ordering isn't contravariant. (I really don't know why.) You may instead consider scalaz.Order which is intended for the same purpose as scala.Ordering but is contravariant on its type parameter.
Addendum:
Supertype-subtype relationship is but one type of relationship that can exist between two types. There can be many such relationships possible. Let's consider two types A and B related with function f: B => A (i.e. an arbitrary relation). Data-type F[_] is said to be a contravariant functor if you can define an operation contramap for it that can lift a function of type B => A to F[A => B].
The following laws need to be satisfied:
x.contramap(identity) == x
x.contramap(f).contramap(g) == x.contramap(f compose g)
All of the data types discussed above (Show, Equal etc.) are contravariant functors. This property lets us do useful things such as the one illustrated below:
Suppose you have a class Candidate defined as:
case class Candidate(name: String, age: Int)
You need an Order[Candidate] which orders candidates by their age. Now you know that there is an Order[Int] instance available. You can obtain an Order[Candidate] instance from that with the contramap operation:
val byAgeOrder: Order[Candidate] =
implicitly[Order[Int]] contramap ((_: Candidate).age)
An example based on a real-world event-driven software system. Such a system is based on broad categories of events, like events related to the functioning of the system (system events), events generated by user actions (user events) and so on.
A possible event hierarchy:
trait Event
trait UserEvent extends Event
trait SystemEvent extends Event
trait ApplicationEvent extends SystemEvent
trait ErrorEvent extends ApplicationEvent
Now the programmers working on the event-driven system need to find a way to register/process the events generated in the system. They will create a trait, Sink, that is used to mark components in need to be notified when an event has been fired.
trait Sink[-In] {
def notify(o: In)
}
As a consequence of marking the type parameter with the - symbol, the Sink type became contravariant.
A possible way to notify interested parties that an event happened is to write a method and to pass it the corresponding event. This method will hypothetically do some processing and then it will take care of notifying the event sink:
def appEventFired(e: ApplicationEvent, s: Sink[ApplicationEvent]): Unit = {
// do some processing related to the event
// notify the event sink
s.notify(e)
}
def errorEventFired(e: ErrorEvent, s: Sink[ErrorEvent]): Unit = {
// do some processing related to the event
// notify the event sink
s.notify(e)
}
A couple of hypothetical Sink implementations.
trait SystemEventSink extends Sink[SystemEvent]
val ses = new SystemEventSink {
override def notify(o: SystemEvent): Unit = ???
}
trait GenericEventSink extends Sink[Event]
val ges = new GenericEventSink {
override def notify(o: Event): Unit = ???
}
The following method calls are accepted by the compiler:
appEventFired(new ApplicationEvent {}, ses)
errorEventFired(new ErrorEvent {}, ges)
appEventFired(new ApplicationEvent {}, ges)
Looking at the series of calls you notice that it is possible to call a method expecting a Sink[ApplicationEvent] with a Sink[SystemEvent] and even with a Sink[Event]. Also, you can call the method expecting a Sink[ErrorEvent] with a Sink[Event].
By replacing invariance with a contravariance constraint, a Sink[SystemEvent] becomes a subtype of Sink[ApplicationEvent]. Therefore, contravariance can also be thought of as a ‘widening’ relationship, since types are ‘widened’ from more specific to more generic.
Conclusion
This example has been described in a series of articles about variance found on my blog
In the end, I think it helps to also understand the theory behind it...
Short answer that might help people who were super confused like me and didn't want to read these long winded examples:
Imagine you have 2 classes Animal, and Cat, which extends Animal. Now, imagine that you have a type Printer[Cat], that contains the functionality for printing Cats. And you have a method like this:
def print(p: Printer[Cat], cat: Cat) = p.print(cat)
but the thing is, that since Cat is an Animal, Printer[Animal] should also be able to print Cats, right?
Well, if Printer[T] were defined like Printer[-T], i.e. contravariant, then we could pass Printer[Animal] to the print function above and use its functionality to print cats.
This is why contravariance exists. Another example, from C#, for example, is the class IComparer which is contravariant as well. Why? Because we should be able to use Animal comparers to compare Cats, too.