Define equality/ordering implicitly for collections - scala

Is it possible to define my own notion of equality or ordering for the collections in Scala? Overriding equals and hashCode doesn't work in this case because I'd like to have more than one notion of equality for the same element type.
Here is roughly what I had in mind: (ignore the invalidity of this code)
implicit val customEq1(x: Int, y: Int) = x % 8 == y % 8
val customEq2(x: Int, y: Int) = x.toString == y.toString.take(2)
val union = Set(1,15,3).union(Set(3,7,8)) // => Set(1,3,8)
I'd imagine equality/ordering being a typeclass, but functions such as diff, union and intersect don't seem to offer any such functionality.

If you have multiple different implementations for comparison, you can create a class for each with the appropriate overrides, then coerce the type of the set with implicit conversion, something like this:
class MyCompInt { overrides... }
val union = Set[MyCompInt](1, 15, 3).union(...)
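A minimal sketch of that wrapper approach, assuming the modulo-8 equality from the question. The names Mod8, fromInt and byMod8 are made up for illustration, and floorMod is used so that negative numbers fall into the same bucket as their positive counterparts. For ordering (as opposed to equality), sorted collections such as TreeSet already take an Ordering instance as an implicit parameter, so a custom one can be passed explicitly:

```scala
import scala.collection.immutable.TreeSet
import scala.language.implicitConversions

// Hypothetical wrapper: two values are equal when they agree modulo 8.
final class Mod8(val value: Int) {
  private def bucket: Int = java.lang.Math.floorMod(value, 8)
  override def equals(other: Any): Boolean = other match {
    case that: Mod8 => bucket == that.bucket
    case _          => false
  }
  // hashCode must agree with equals, so it is derived from the same bucket.
  override def hashCode: Int = bucket
  override def toString: String = s"Mod8($value)"
}

object Mod8 {
  // Lets Int literals be used where a Mod8 is expected.
  implicit def fromInt(i: Int): Mod8 = new Mod8(i)
}

// 15 and 7 fall into the same bucket, so the union has four elements.
val union = Set[Mod8](1, 15, 3).union(Set[Mod8](3, 7, 8))

// Ordering is already a typeclass: pass a custom instance to a sorted set.
val byMod8: Ordering[Int] = Ordering.by((i: Int) => java.lang.Math.floorMod(i, 8))
val sorted = TreeSet(1, 15, 3)(byMod8) ++ Set(3, 7, 8)
```

A second notion of equality would get its own wrapper class, which is the price of this approach: the equality is tied to the element type rather than to the set operation.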

Related

Scala Implicit Conversion Function Name Clashes

I am working with a simple complex number case class in Scala and would like to create an add function that works between complex numbers, doubles and ints. Below is a simple example of a working solution:
case class Complex(re: Double, im: Double)
implicit def toComplex[A](n: A)(implicit f: A => Double): Complex = Complex(n, 0)
implicit class NumberWithAdd[A](n: A)(implicit f: A => Complex) {
def add(m: Complex) = Complex(n.re + m.re, n.im + m.im)
}
Note I am deliberately not including the add function in the complex case class. Using the above I can do all of this:
scala> val z = Complex(1, 2); val w = Complex(2, 3)
z: Complex = Complex(1.0,2.0)
w: Complex = Complex(2.0,3.0)
scala> z add w
res5: Complex = Complex(3.0,5.0)
scala> z add 1
res6: Complex = Complex(2.0,2.0)
scala> 1 add z
res7: Complex = Complex(2.0,2.0)
I'd like to use '+' instead of 'add', but this does not work. I get the following error:
Error:(14, 4) value + is not a member of A$A288.this.Complex
z + 1
^
Both z + w and 1 + z still work however.
What I'd like to know is why does changing the function name from 'add' to '+' break this? Is there an alternate route to getting this functionality (without simply putting the add function in the complex case class)? Any help would be appreciated.
Edit - Motivation
I'm playing around with monoids and other algebraic structures. I would like to be able to generalise the '...WithAdd' function to automatically work for any class that has a corresponding monoid:
trait Monoid[A] {
val identity: A
def op(x: A, y: A): A
}
implicit class withOp[A](n: A)(implicit val monoid: Monoid[A]) {
def +(m: A): A = monoid.op(n, m)
}
case class Complex(re: Double, im: Double) {
override def toString: String = re + " + " + im + "i"
}
class ComplexMonoid extends Monoid[Complex] {
val identity = Complex(0, 0)
def op(z: Complex, w: Complex): Complex = {
Complex(z.re + w.re, z.im + w.im)
}
}
implicit val complexMonoid = new ComplexMonoid
Using the above I can now do Complex(1, 2) + Complex(3, 1) giving Complex = 4.0 + 3.0i. This is great for code reuse as I could now add extra functions to the Monoid and withAdd function (such as applying op n times to an element, giving the power function for multiplication) and it would work for any case class that has a corresponding monoid. It is only with complex numbers and trying to incorporate doubles, ints, etc., that I then run into the problem above.
I would use a regular class, not a case class. Then it would be easy to create methods to add or subtract these Complex numbers, like:
class Complex(val real : Double, val imag : Double) {
def +(that: Complex) =
new Complex(this.real + that.real, this.imag + that.imag)
def -(that: Complex) =
new Complex(this.real - that.real, this.imag - that.imag)
override def toString = real + " + " + imag + "i"
}
As the source page shows, it will now support something that looks like operator overloading (it's not, because + and - are functions and not operators).
The problem with implicit class NumberWithAdd and its method + is that a method with the same name also exists in number classes such as Int and Double. The + method of NumberWithAdd basically allows you to start with a value that can be converted to Complex and add a Complex object to it. That is, the left-hand value can be anything (as long as it can be converted) and the right-hand value must be Complex.
That works great for w + z (no need to convert w) and 1 + z (an implicit conversion from Int to Complex is available). It fails for z + 1 because + is not available in the class Complex.
Since z + 1 is actually z.+(1), Scala will look for other possible matches for +(i: Int) in classes that Complex can be converted into. It also checks NumberWithAdd, which does have a + function, but that one requires a Complex as its right-hand value. (It would match a function that requires an Int as right-hand value.) There are other functions named + that do accept Int, but there's no conversion from Complex to what those functions want as left-hand values.
The same definition of + does work when it's in the (case) class Complex. In that case, both w + z and z + 1 simply use that definition. The case 1 + z is now a little more complicated. Since Int does not have a function + that accepts a Complex value, Scala will find the one that does (in Complex) and determine whether or not it is possible to convert Int into Complex. That is possible using the implicit functions, so the conversion takes place and the function is executed.
When the function + in the class NumberWithAdd is renamed add, there's no confusion with functions in Int because Int does not have a function add. So Scala will try harder to apply the function add and it will do the Int to Complex conversion. It will even do that conversion when you try 1 add 2.
Note: My explanations may not fully describe the actual inner workings.
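The variant described above, with + defined in Complex itself, could look roughly like the following. This is a sketch, not the asker's code: the conversion names fromInt and fromDouble are invented for illustration, and the conversions are kept in lexical scope so they are visible at every call site.

```scala
import scala.language.implicitConversions

final case class Complex(re: Double, im: Double) {
  // Defined on the class itself, so z + w and z + 1 resolve here directly.
  def +(that: Complex): Complex = Complex(re + that.re, im + that.im)
}

// Illustrative conversions (hypothetical names), visible in the current scope.
implicit def fromInt(i: Int): Complex = Complex(i.toDouble, 0.0)
implicit def fromDouble(d: Double): Complex = Complex(d, 0.0)

val z = Complex(1, 2)
val w = Complex(2, 3)

val a = z + w   // no conversion needed
val b = z + 1   // the argument 1 is converted to Complex(1.0, 0.0)
val c = 1 + z   // Int has no + taking a Complex, so the receiver 1 is converted
```

This trades away the asker's goal of keeping + out of the case class, but it shows all three call shapes resolving as the paragraphs above describe.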

Partially applied/curried function vs overloaded function

Whilst I understand what a partially applied/curried function is, I still don't fully understand why I would use such a function vs simply overloading a function. I.e. given:
def add(a: Int, b: Int): Int = a + b
val addV = (a: Int, b: Int) => a + b
What is the practical difference between
def addOne(b: Int): Int = add(1, b)
and
def addOnePA = add(1, _:Int)
// or currying
val addOneC = addV.curried(1)
Please note I am NOT asking about currying vs partially applied functions as this has been asked before and I have read the answers. I am asking about currying/partially applied functions VS overloaded functions
The difference in your example is that the overloaded function has the value 1 hardcoded as the first argument to add, i.e. it is fixed at compile time, while partially applied or curried functions are meant to capture their arguments dynamically, i.e. at run time. Otherwise, in your particular example, because you are hardcoding 1 in both cases, it's pretty much the same thing.
You would use partially applied/curried function when you pass it through different contexts, and it captures/fills-in arguments dynamically until it's completely ready to be evaluated. In FP this is important because many times you don't pass values, but rather pass functions around. It allows for higher composability and code reusability.
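To illustrate the run-time capture, here is a small sketch in which the first argument is not a literal. The helper name adderFor and the list of offsets are made up for the example:

```scala
def add(a: Int, b: Int): Int = a + b

// Hypothetical helper: the first argument is only known at run time.
def adderFor(n: Int): Int => Int = add(n, _: Int)

val offsets = List(1, 10, 100)        // could just as well come from user input
val adders  = offsets.map(adderFor)   // three functions, each capturing its own n
val results = adders.map(f => f(5))   // List(6, 15, 105)
```

Writing a separate named wrapper such as addOne for every possible offset is not an option here, because the offsets are not known when the code is written.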
There are a couple of reasons why you might prefer partially applied functions. The most obvious and perhaps superficial one is that you don't have to write out intermediate functions such as addOnePA.
List(1, 2, 3, 4) map (_ + 3) // List(4, 5, 6, 7)
is nicer than
def add3(x: Int): Int = x + 3
List(1, 2, 3, 4) map add3
Even the anonymous function approach (which is what the compiler expands the underscore into) feels a tiny bit clunky in comparison.
List(1, 2, 3, 4) map (x => x + 3)
Less superficially, partial application comes in handy when you're truly passing around functions as first-class values.
val fs = List[(Int, Int) => Int](_ + _, _ * _, _ / _)
val on3 = fs map (f => f(_, 3)) // partial application
val allTogether = on3.foldLeft{identity[Int] _}{_ compose _}
allTogether(6) // (6 / 3) * 3 + 3 = 9
Imagine if I hadn't told you what the functions in fs were. The trick of coming up with named function equivalents instead of partial application becomes harder to use.
As for currying, currying functions often lets you naturally express transformations of functions that produce other functions (rather than a higher order function that simply produces a non-function value at the end) which might otherwise be less clear.
For example,
def integrate(f: Double => Double, delta: Double = 0.01)(x: Double): Double = {
  val domain = Range.Double(0.0, x, delta)
  // Left Riemann sum: accumulate delta * f(a) over the sample points
  domain.foldLeft(0.0){case (acc, a) => delta * f(a) + acc}
}
can be thought of and used in the way that you actually learned integration in calculus, namely as a transformation of a function that produces another function.
def square(x: Double): Double = x * x
// Ignoring issues of numerical stability for the moment...
// The underscore is really just a wart that Scala requires to bind it to a val
val cubic = integrate(square) _
val quartic = integrate(cubic) _
val quintic = integrate(quartic) _
// Not *utterly* horrible for a two line numerical integration function
cubic(1) // 0.32835000000000014
quartic(1) // 0.0800415
quintic(1) // 0.015449626499999999
Currying also alleviates a few of the problems around fixed function arity.
implicit class LiftedApply[A, B](fOpt: Option[A => B]){
def ap(xOpt: Option[A]): Option[B] = for {
f <- fOpt
x <- xOpt
} yield f(x)
}
def not(x: Boolean): Boolean = !x
def and(x: Boolean)(y: Boolean): Boolean = x && y
def and3(x: Boolean)(y: Boolean)(z: Boolean): Boolean = x && y && z
Some(not _) ap Some(false) // Some(true)
Some(and _) ap Some(true) ap Some(true) // Some(true)
Some(and3 _) ap Some(true) ap Some(true) ap Some(true) // Some(true)
By having curried functions, we've been able to "lift" a function to work on Option for as many arguments as we need. If our logic functions had not been curried, then we would have had to have separate functions to lift A => B to Option[A] => Option[B], (A, B) => C to (Option[A], Option[B]) => Option[C], (A, B, C) => D to (Option[A], Option[B], Option[C]) => Option[D] and so on for all the arities we cared about.
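If a function was not written in curried form, its function value can still be converted with curried and then lifted the same way. A brief sketch, repeating the ap extension so the snippet stands alone; the wrapper object OptionSyntax and the function or are hypothetical names:

```scala
object OptionSyntax {
  // Same idea as LiftedApply above, wrapped in an object so it can be imported.
  implicit class LiftedApply[A, B](fOpt: Option[A => B]) {
    def ap(xOpt: Option[A]): Option[B] = for { f <- fOpt; x <- xOpt } yield f(x)
  }
}
import OptionSyntax._

// An ordinary two-argument function value (not curried).
val or: (Boolean, Boolean) => Boolean = _ || _

// Function2#curried turns (A, B) => C into A => B => C, which ap can consume.
val lifted  = Option(or.curried) ap Some(false) ap Some(true)   // Some(true)
val missing = Option(or.curried) ap Some(false) ap None         // None
```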
Currying also has some other miscellaneous benefits when it comes to type inference and is required if you have both implicit and non-implicit arguments for a method.
Finally, the answers to this question list out some more times you might want currying.

Curried update method

I'm trying to have curried apply and update methods like this:
def apply(i: Int)(j: Int) = matrix(i)(j)
def update(i: Int, j: Int, value: Int) =
new Matrix(n, m, (x, y) => if ((i,j) == (x,y)) value else matrix(x)(y))
Apply method works correctly, but update method complains:
scala> matrix(2)(1) = 1
<console>:16: error: missing arguments for method apply in class Matrix;
follow this method with `_' if you want to treat it as a partially applied function
matrix(2)(1) = 1
Calling directly update(2)(1)(1) works, so it is a conversion to update method that doesn't work properly. Where is my mistake?
The desugaring of assignment syntax into invocations of update maps the concatenation of a single argument list on the LHS of the assignment with the value on the RHS of the assignment to the first parameter block of the update method definition, irrespective of how many other parameter blocks the update method definition has. Whilst this transformation in a sense splits a single parameter block into two (one on the LHS, one on the RHS of the assignment), it will not further split the left parameter block in the way that you want.
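For reference, this is what the supported form looks like when apply and update each use a single parameter list. It is a sketch with a hypothetical mutable Grid class, not the asker's Matrix:

```scala
// Hypothetical mutable grid, used only to show the desugaring.
class Grid(rows: Int, cols: Int) {
  private val cells = Array.ofDim[Int](rows, cols)
  def apply(i: Int, j: Int): Int = cells(i)(j)
  def update(i: Int, j: Int, value: Int): Unit = cells(i)(j) = value
}

val g = new Grid(3, 3)
g(2, 1) = 7          // rewritten by the compiler to g.update(2, 1, 7)
val read = g(2, 1)   // rewritten to g.apply(2, 1)
```

The arguments on the left of = and the value on the right are concatenated into the single argument list of update, which is why the indices have to appear together as g(2, 1).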
I also think you're mistaken about the example of the explicit invocation of update that you show. This doesn't compile with the definition of update that you've given,
scala> class Matrix { def update(i: Int, j: Int, value: Int) = (i, j, value) }
defined class Matrix
scala> val m = new Matrix
m: Matrix = Matrix@37176bc4
scala> m.update(1)(2)(3)
<console>:10: error: not enough arguments for method update: (i: Int, j: Int, value: Int)(Int, Int, Int).
Unspecified value parameters j, value.
m.update(1)(2)(3)
^
I suspect that during your experimentation you actually defined update like so,
scala> class Matrix { def update(i: Int)(j: Int)(value: Int) = (i, j, value) }
defined class Matrix
The update desugaring does apply to this definition, but probably not in the way that you expect: as described above, it only applies to the first argument list, which leads to constructs like,
scala> val m = new Matrix
m: Matrix = Matrix@39741f43
scala> (m() = 1)(2)(3)
res0: (Int, Int, Int) = (1,2,3)
Here the initial one-place parameter block is split to an empty parameter block on the LHS of the assignment (ie. the ()) and a one argument parameter block on the RHS (ie. the 1). The remainder of the parameter blocks from the original definition then follow.
If you're surprised by this behaviour you won't be the first.
The syntax you're after is achievable via a slightly different route,
scala> class Matrix {
| class MatrixAux(i : Int) {
| def apply(j : Int) = 23
| def update(j: Int, value: Int) = (i, j, value)
| }
|
| def apply(i: Int) = new MatrixAux(i)
| }
defined class Matrix
scala> val m = new Matrix
m: Matrix = Matrix@3af30087
scala> m(1)(2) // invokes MatrixAux.apply
res0: Int = 23
scala> m(1)(2) = 3 // invokes MatrixAux.update
res1: (Int, Int, Int) = (1,2,3)
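The same row-proxy idea with real storage behind it might look like this. It is a sketch that assumes a mutable backing array, which differs from the immutable Matrix in the question; the names Row and data are invented:

```scala
class Matrix(rows: Int, cols: Int) {
  private val data = Array.ofDim[Int](rows, cols)

  // Proxy for a single row; it carries the first index.
  final class Row(i: Int) {
    def apply(j: Int): Int = data(i)(j)
    def update(j: Int, value: Int): Unit = data(i)(j) = value
  }

  def apply(i: Int): Row = new Row(i)
}

val m = new Matrix(3, 3)
m(2)(1) = 1            // m.apply(2).update(1, 1)
val cell = m(2)(1)     // m.apply(2).apply(1)
```

Each m(i) allocates a small Row object, which is the cost of getting the m(i)(j) = v syntax.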
My guess is that it is simply not supported. Probably not due to an explicit design decision, because I don't see why it shouldn't work in principle.
The translation concerned with apply, i.e., the one performed when converting m(i)(j) into m.apply(i)(j), seems to be able to cope with currying. Run scala -print on your program to see the code resulting from the translation.
The translation concerned with update, on the other hand, doesn't seem to be able to cope with currying. Since the error message is missing arguments for method apply, it even looks as if the currying confuses the translator such that it tries to translate m(i)(j) = v into m.apply, but then screws up the number of required arguments. scala -print unfortunately won't help here, because the type checker terminates the translation too early.
Here is what the language specs (Scala 2.9, "6.15 Assignments") say about assignments. Since currying is not mentioned, I assume that it is not explicitly supported. I couldn't find the corresponding paragraph for apply, but I guess it is purely coincidental that currying works there.
An assignment f(args) = e with a function application to the left of
the ‘=’ operator is interpreted as f.update(args, e), i.e. the
invocation of an update function defined by f.

scala currying by nested functions or by multiple parameter lists

In Scala, I can define a function with two parameter lists.
def myAdd(x :Int)(y :Int) = x + y
This makes it easy to define a partially applied function.
val plusFive = myAdd(5) _
But, I can accomplish something similar by defining and returning a nested function.
def myOtherAdd(x :Int) = {
def f(y :Int) = x + y
f _
}
Cosmetically, I've moved the underscore, but this still feels like currying.
val otherPlusFive = myOtherAdd(5)
What criteria should I use to prefer one approach over the other?
There are at least four ways to accomplish the same thing:
def myAddA(x: Int, y: Int) = x + y
val plusFiveA: Int => Int = myAddA(5,_)
def myAddB(x: Int)(y : Int) = x + y
val plusFiveB = myAddB(5) _
def myAddC(x: Int) = (y: Int) => x + y
val plusFiveC = myAddC(5)
def myAddD(x: Int) = {
def innerD(y: Int) = x + y
innerD _
}
val plusFiveD = myAddD(5)
You might want to know which is most efficient or which is the best style (for some non-performance based measure of best).
As far as efficiency goes, it turns out that all four are essentially equivalent. The first two cases actually emit exactly the same bytecode; the JVM doesn't know anything about multiple parameter lists, so once the compiler figures it out (you need to help it with a type annotation on the case A), it's all the same under the hood. The third case is also extremely close, but since it promises up front to return a function and specifies it on the spot, it can avoid one internal field. The fourth case is pretty much the same as the first two in terms of work done; it just does the conversion to Function1 inside the method instead of outside.
In terms of style, I suggest that B and C are the best ways to go, depending on what you're doing. If your primary use case is to create a function, not to call in-place with both parameter lists, then use C, because it tells you what it's going to do. (This version is also particularly familiar to people coming from Haskell, for instance.) On the other hand, if you are mostly going to call it in place but will only occasionally curry it, then use B. Again, it says more clearly what it's expected to do.
You could also do this:
def yetAnotherAdd(x: Int) = x + (_: Int)
You should choose the API based on intention. The main reason in Scala to have multiple parameter lists is to help type inference. For instance:
def f[A](x: A)(f: A => A) = ...
f(5)(_ + 5)
One can also use it to have multiple varargs, but I have never seen code like that. And, of course, there's the need for the implicit parameter list, but that's pretty much another matter.
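A runnable illustration of the type-inference point, with hypothetical function names. In Scala 2, type parameters are solved one parameter list at a time, so the first list can fix A before the lambda in the second list is checked (Scala 3's inference is more permissive with single lists):

```scala
// Two parameter lists: A is fixed by the first list,
// so the lambda in the second list needs no type annotation.
def applyTwice[A](x: A)(f: A => A): A = f(f(x))

val fifteen = applyTwice(5)(_ + 5)        // A = Int
val shouted = applyTwice("hi")(_ + "!")   // A = String

// Single parameter list: annotating the lambda parameter works in any version.
def applyTwiceFlat[A](x: A, f: A => A): A = f(f(x))
val alsoFifteen = applyTwiceFlat(5, (n: Int) => n + 5)
```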
Now, there are many ways you can have functions returning functions, which is pretty much what currying does. You should use them if the API should be thought of as a function which returns a function.
I think it is difficult to get any more precise than this.
Another benefit of having a method return a function directly (instead of using partial application) is that it leads to much cleaner code when using infix notation, allowing you to avoid a bucketload of parentheses and underscores in more complex expressions.
Consider:
val list = List(1,2,3,4)
def add1(a: Int)(b: Int) = a + b
list map { add1(5) _ }
//versus
def add2(a: Int) = a + (_: Int)
list map add2(5)

How does one write the Pythagoras Theorem in Scala?

The square of the hypotenuse of a right triangle is equal to the sum of the squares on the other two sides.
This is Pythagoras's Theorem. A function to calculate the hypotenuse based on the lengths "a" and "b" of its sides would return sqrt(a * a + b * b).
The question is, how would you define such a function in Scala in such a way that it could be used with any type implementing the appropriate methods?
For context, imagine a whole library of math theorems you want to use with Int, Double, Int-Rational, Double-Rational, BigInt or BigInt-Rational types depending on what you are doing, and the speed, precision, accuracy and range requirements.
This requires Scala 2.8 or later (Numeric was introduced in 2.8), but it does work:
scala> def pythagoras[T](a: T, b: T, sqrt: T => T)(implicit n: Numeric[T]) = {
| import n.mkNumericOps
| sqrt(a*a + b*b)
| }
pythagoras: [T](a: T,b: T,sqrt: (T) => T)(implicit n: Numeric[T])T
scala> def intSqrt(n: Int) = Math.sqrt(n).toInt
intSqrt: (n: Int)Int
scala> pythagoras(3,4, intSqrt)
res0: Int = 5
More generally speaking, the trait Numeric is effectively a reference on how to solve this type of problem. See also Ordering.
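A variation on the Numeric approach that avoids passing sqrt in: do the arithmetic in T and convert to Double only for the root. This is a sketch that assumes a Double result is acceptable; the name hypotenuse is made up:

```scala
// Sum of squares is computed in T; only the square root is taken as a Double.
def hypotenuse[T](a: T, b: T)(implicit n: Numeric[T]): Double = {
  import n.mkNumericOps
  math.sqrt((a * a + b * b).toDouble)
}

val h1 = hypotenuse(3, 4)                    // Int
val h2 = hypotenuse(BigInt(5), BigInt(12))   // BigInt
val h3 = hypotenuse(1.5, 2.0)                // Double
```

The conversion to Double loses precision for very large BigInt values, so this is not a substitute for a type-specific sqrt when exact results matter.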
The most obvious way:
type Num = {
def +(a: Num): Num
def *(a: Num): Num
}
def pyth[A <: Num](a: A, b: A)(sqrt: A=>A) = sqrt(a * a + b * b)
// usage
pyth(3, 4)(Math.sqrt)
This is horrible for many reasons. First, we have the problem of the recursive type, Num. This is only allowed if you compile this code with the -Xrecursive option set to some integer value (5 is probably more than sufficient for numbers). Second, the type Num is structural, which means that any usage of the members it defines will be compiled into corresponding reflective invocations. Putting it mildly, this version of pyth is obscenely inefficient, running on the order of several hundred thousand times slower than a conventional implementation. There's no way around the structural type though if you want to define pyth for any type which defines +, * and for which there exists a sqrt function.
Finally, we come to the most fundamental issue: it's over-complicated. Why bother implementing the function in this way? Practically speaking, the only types it will ever need to apply to are real Scala numbers. Thus, it's easiest just to do the following:
def pyth(a: Double, b: Double) = Math.sqrt(a * a + b * b)
All problems solved! This function is usable on values of type Double, Int, Float, even odd ones like Short thanks to the marvels of implicit conversion. While it is true that this function is technically less flexible than our structurally-typed version, it is vastly more efficient and eminently more readable. We may have lost the ability to calculate the Pythagorean theorem for unforeseen types defining + and *, but I don't think you're going to miss that ability.
Some thoughts on Daniel's answer:
I've experimented with generalizing Numeric to a Real typeclass, which would be a more appropriate place for this function's sqrt to live. This would result in:
def pythagoras[T](a: T, b: T)(implicit n: Real[T]) = {
import n.mkNumericOps
(a*a + b*b).sqrt
}
It is tricky, but possible, to use literal numbers in such generic functions.
def pythagoras[T](a: T, b: T)(sqrt: (T => T))(implicit n: Numeric[T]) = {
import n.mkNumericOps
implicit val fromInt = n.fromInt _
//1 * sqrt(a*a + b*b) Not Possible!
sqrt(a*a + b*b) * 1 // Possible
}
Type inference works better if the sqrt is passed in a second parameter list.
Parameters a and b would be passed as Objects, but @specialized could fix this. Unfortunately there will still be some overhead in the math operations.
You can almost do without the import of mkNumericOps. I got frustratingly close!
There is a method in java.lang.Math:
public static double hypot (double x, double y)
for which the Javadoc states:
Returns sqrt(x² + y²) without intermediate overflow or underflow.
Looking into src.zip, Math.hypot delegates to StrictMath, where it is a native method:
public static native double hypot(double x, double y);
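From Scala the method can be called directly. The sketch below also shows the overflow case the Javadoc refers to; the value 1e200 is chosen arbitrarily so that its square exceeds the Double range:

```scala
val h = java.lang.Math.hypot(3.0, 4.0)   // approximately 5.0

val big   = 1e200
val naive = math.sqrt(big * big + big * big)   // big * big overflows to Infinity
val safe  = java.lang.Math.hypot(big, big)     // stays finite, roughly 1.414e200
```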