How to use scala.util.Sorting.quickSort() with arbitrary types? - scala

I need to sort an array of pairs by second element. How do I pass comparator for my pairs to the quickSort function?
I'm using the following ugly approach now:
type AccResult = (AccUnit, Long) // pair
class Comparator(a:AccResult) extends Ordered[AccResult] {
def compare(that:AccResult) = lessCompare(a, that)
def lessCompare(a:AccResult, that:AccResult) = if (a._2 == that._2) 0 else if (a._2 < that._2) -1 else 1
}
scala.util.Sorting.quickSort(data)(d => new Comparator(d))
Why is quickSort designed to have an ordered view instead of usual comparator argument?
Scala 2.7 solutions are preferred.

I tend to prefer non-implicit arguments unless the ordering is being used in more than a few places.
type Pair = (String,Int)
val items : Array[Pair] = Array(("one",1),("three",3),("two",2))
quickSort(items)(new Ordering[Pair] {
def compare(x: Pair, y: Pair) = {
x._2 compare y._2
}
})
Edit: After learning about view bounds in another question, I think that this approach might be better:
val items : Array[(String,Int)] = Array(("one",1),("three",3),("two",2))
class OrderTupleBySecond[X,Y <% Comparable[Y]] extends Ordering[(X,Y)] {
def compare(x: (X,Y), y: (X,Y)) = {
x._2 compareTo y._2
}
}
util.Sorting.quickSort(items)(new OrderTupleBySecond[String,Int])
In this way, OrderTupleBySecond could be used for any Tuple2 type where the type of the 2nd member of the tuple has a view in scope which would convert it to a Comparable.

Ok, I'm not sure exactly what you are unhappy about in what you are currently doing, but perhaps all you are looking for is this?
implicit def toComparator(a: AccResult) = new Comparator(a)
scala.util.Sorting.quickSort(data)
If, on the other hand, the problem is that the tuple is already Ordered and you want a different ordering, well, that's why it changed in Scala 2.8.
* EDIT *
Ouch! Sorry, I only now realize you said you preferred Scala 2.7 solutions. I have edited this answer to put the solution for 2.7 above. What follows is a 2.8 solution.
Scala 2.8 expects an Ordering, supplied through a context bound, rather than an Ordered, supplied through a view bound. You'd write your code in 2.8 like this:
type AccResult = (AccUnit, Long) // pair
implicit object AccResultOrdering extends Ordering[AccResult] {
def compare(x: AccResult, y: AccResult) = if (x._2 == y._2) 0 else if (x._2 < y._2) -1 else 1
}
Or maybe just:
type AccResult = (AccUnit, Long) // pair
implicit val AccResultOrdering = Ordering by ((_: AccResult)._2)
And use it like:
scala.util.Sorting.quickSort(data)
On the other hand, the usual way to sort in Scala 2.8 is just to call one of the sorting methods on the collection itself, such as:
data.sortBy((_: AccResult)._2)
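As a minimal sketch of the 2.8-style API described above: quickSort sorts the array in place and takes the Ordering in a second parameter list, so it can also be passed explicitly rather than declared implicit. The object name, the helper sortedBySecond and the sample data are hypothetical, and String stands in for the question's AccUnit.

```scala
import scala.util.Sorting

object SortBySecondSketch {
  // Hypothetical stand-in for the question's (AccUnit, Long) pairs.
  type AccResult = (String, Long)

  // Returns a sorted copy; quickSort itself sorts the array in place.
  def sortedBySecond(data: Array[AccResult]): Array[AccResult] = {
    val copy = data.clone()
    // Pass the Ordering explicitly instead of relying on an implicit in scope.
    Sorting.quickSort(copy)(Ordering.by((r: AccResult) => r._2))
    copy
  }

  def main(args: Array[String]): Unit =
    println(sortedBySecond(Array(("a", 3L), ("b", 1L), ("c", 2L))).mkString(", "))
}
```

Passing the Ordering explicitly keeps it out of implicit scope, which matters when the same tuple type needs different orderings in different places.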

Have your type extend Ordered, like so:
case class Thing(number : Integer, name: String) extends Ordered[Thing] {
def compare(that: Thing) = name.compare(that.name)
}
And then pass it to sort, like so:
val array = Array(Thing(4, "Doll"), Thing(2, "Monkey"), Thing(7, "Green"))
scala.util.Sorting.quickSort(array)
Printing the array will give you:
array.foreach { e => print(s"$e ") }
>> Thing(4,Doll) Thing(7,Green) Thing(2,Monkey)

Scala Cats Accumulating Errors and Successes with Ior

I am trying to use Cats datatype Ior to accumulate both errors and successes of using a service (which can return an error).
def find(key: String): F[Ior[NonEmptyList[Error], A]] = {
(for {
b <- service.findByKey(key)
} yield b.rightIor[NonEmptyList[Error]])
.recover {
case e: Error => Ior.leftNel(AnotherError)
}
}
def findMultiple(keys: List[String]): F[Ior[NonEmptyList[Error], List[A]]] = {
keys map find reduce (_ |+| _)
}
My confusion lies in how to combine the errors/successes. I am trying to use the Semigroup combine (infix syntax) to combine with no success. Is there a better way to do this? Any help would be great.
I'm going to assume that you want both all errors and all successful results. Here's a possible implementation:
class Foo[F[_]: Applicative, A](find: String => F[IorNel[Error, A]]) {
def findMultiple(keys: List[String]): F[IorNel[Error, List[A]]] = {
keys.map(find).sequence.map { nelsList =>
nelsList.map(nel => nel.map(List(_)))
.reduceOption(_ |+| _).getOrElse(Nil.rightIor)
}
}
}
Let's break it down:
We will be trying to "flip" a List[IorNel[Error, A]] into IorNel[Error, List[A]]. However, from doing keys.map(find) we get List[F[IorNel[...]]], so we need to also "flip" it in a similar fashion first. That can be done by using .sequence on the result, and is what forces F[_]: Applicative constraint.
N.B. Applicative[Future] is available whenever there's an implicit ExecutionContext in scope. You can also get rid of F and use Future.sequence directly.
Now, we have F[List[IorNel[Error, A]]], so we want to map the inner part to transform the nelsList we got. You might think that sequence could be used there too, but it cannot: it has the "short-circuit on first error" behavior, so we'd lose all successful values. Let's try to use |+| instead.
Ior[X, Y] has a Semigroup instance when both X and Y have one. Since we're using IorNel, X = NonEmptyList[Z], and that is satisfied. For Y = A - your domain type - it might not be available.
But we don't want to combine all results into a single A, we want Y = List[A] (which also always has a semigroup). So, we take every IorNel[Error, A] we have and map A to a singleton List[A]:
nelsList.map(nel => nel.map(List(_)))
This gives us List[IorNel[Error, List[A]]], which we can reduce. Unfortunately, since Ior does not have a Monoid, we can't quite use the convenient syntax. So, with stdlib collections, one way is to do .reduceOption(_ |+| _).getOrElse(Nil.rightIor).
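To make the combining step concrete, here is a small sketch with hand-written values in place of real service calls. It assumes cats-core 2.2 or later on the classpath; the object name and the sample error and values are hypothetical.

```scala
import cats.data.{Ior, IorNel, NonEmptyList}
import cats.syntax.all._

object IorCombineSketch {
  // Hypothetical per-key results: one failure and two successes.
  val results: List[IorNel[String, Int]] = List(
    Ior.leftNel[String, Int]("key1 not found"),
    Ior.right[NonEmptyList[String], Int](2),
    Ior.right[NonEmptyList[String], Int](3)
  )

  // Lift each success into a singleton List so the right side has a Semigroup,
  // then combine pairwise; errors and successes are both kept.
  val combined: IorNel[String, List[Int]] =
    results
      .map(ior => ior.map(a => List(a)))
      .reduceOption(_ |+| _)
      .getOrElse(Ior.right[NonEmptyList[String], List[Int]](Nil))

  def main(args: Array[String]): Unit = println(combined)
}
```

Combining a Left with a Right yields a Both, which is why the failure for the first key does not discard the two successful values.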
This can be improved by doing a few things:
x.map(f).sequence is equivalent to doing x.traverse(f)
We can demand that keys are non-empty upfront, and give nonempty result back too.
The latter step gives us Reducible instance for a collection, letting us shorten everything by doing reduceMap
class Foo2[F[_]: Applicative, A](find: String => F[IorNel[Error, A]]) {
def findMultiple(keys: NonEmptyList[String]): F[IorNel[Error, NonEmptyList[A]]] = {
keys.traverse(find).map { nelsList =>
nelsList.reduceMap(nel => nel.map(NonEmptyList.one))
}
}
}
Of course, you can make a one-liner out of this:
keys.traverse(find).map(_.reduceMap(_.map(NonEmptyList.one)))
Or, you can do the non-emptiness check inside:
class Foo3[F[_]: Applicative, A](find: String => F[IorNel[Error, A]]) {
def findMultiple(keys: List[String]): F[IorNel[Error, List[A]]] = {
NonEmptyList.fromList(keys)
.map(_.traverse(find).map { _.reduceMap(_.map(List(_))) })
.getOrElse(List.empty[A].rightIor.pure[F])
}
}
Ior is a good choice for warning accumulation, that is errors and a successful value. But, as mentioned by Oleg Pyzhcov, Ior.Left case is short-circuiting. This example illustrates it:
scala> val shortCircuitingErrors = List(
Ior.leftNec("error1"),
Ior.bothNec("warning2", 2),
Ior.bothNec("warning3", 3)
).sequence
shortCircuitingErrors: Ior[Nec[String], List[Int]] = Left(Chain(error1))
One way to accumulate both errors and successes is to convert all your Left cases into Both. One approach is using Option as right type and converting Left(errs) values into Both(errs, None). After calling .traverse, you end up with optList: List[Option] on the right side and you can flatten it with optList.flatMap(_.toList) to filter out None values.
class Error
class KeyValue
def find(key: String): Ior[Nel[Error], KeyValue] = ???
def findMultiple(keys: List[String]): Ior[Nel[Error], List[KeyValue]] =
keys
.traverse { k =>
val ior = find(k)
ior.putRight(ior.right)
}
.map(_.flatMap(_.toList))
Or more succinctly:
def findMultiple(keys: List[String]): Ior[Nel[Error], List[KeyValue]] =
keys.flatTraverse { k =>
val ior = find(k)
ior.putRight(ior.toList) // Ior[A,B].toList: List[B]
}

Scala primitives as reference types?

Does Scala provide a means of accessing primitives by reference (e.g., on the heap) out of the box? E.g., is there an idiomatic way of making the following code return 1?:
import scala.collection.mutable
val m = new mutable.HashMap[String, Int]
var x = m.getOrElseUpdate("foo", 0)
x += 1
m.get("foo") // The map value should be 1 after the preceding update.
I expect I should be able to use a wrapper class like the following as the map's value type (thus storing pointers to the WrappedInts):
class WrappedInt(var theInt:Int)
...but I'm wondering if I'm missing a language or standard library feature.
You can't do that with primitives or their non-primitive counterparts in either Java or Scala. I don't see any other way than to use the WrappedInt.
If your goal is to increment map values by key, then you can use some nicer solutions instead of a wrapper.
val key = "foo"
val v = m.put(key, m.getOrElse(key, 0) + 1)
or another approach would be to set a default value 0 for the map:
val m2 = m.withDefault(_ => 0)
val v = m2.put(key, m2(key) + 1)
or add extension method updatedWith
implicit class MapExtensions[K, V](val map: mutable.Map[K, V]) extends AnyVal {
def updatedWith(key: K, default: V)(f: V => V) = {
map.put(key, f(map.getOrElse(key, default)))
}
}
val m3 = m.updatedWith("foo", 0) { _ + 1 }
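A related sketch, assuming the goal is simply to count occurrences: a mutable map with a default value supports += on an entry, because counts(key) += 1 is rewritten by the compiler to counts(key) = counts(key) + 1, which is a call to update. The names CounterSketch and countWords are hypothetical.

```scala
import scala.collection.mutable

object CounterSketch {
  // Counts how often each word occurs, incrementing map entries in place.
  def countWords(words: Seq[String]): mutable.Map[String, Int] = {
    // Missing keys read as 0, so no getOrElse is needed at the call site.
    val counts = mutable.HashMap.empty[String, Int].withDefaultValue(0)
    // Desugars to counts.update(w, counts(w) + 1).
    words.foreach(w => counts(w) += 1)
    counts
  }

  def main(args: Array[String]): Unit =
    println(countWords(Seq("foo", "bar", "foo")))
}
```

This does not give a reference to the Int itself; the map entry is replaced on every increment, which is usually all that is needed.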

How can I extend Scala collections with member values?

Say I have the following data structure:
case class Timestamped[M, CC[X] <: Seq[X]](elems: CC[M], timestamp: String)
So it's essentially a sequence with an attribute -- a timestamp -- attached to it. This works fine and I could create new instances with the syntax
val t = Timestamped(Seq(1,2,3,4),"2014-02-25")
t.elems.head // 1
t.timestamp // "2014-02-25"
The syntax is unwieldy and instead I want to be able to do something like:
Timestamped(1,2,3,4)("2014-02-25")
t.head // 1
t.timestamp // "2014-02-25"
Where Timestamped is just an extension of a Seq and its implementation SeqLike, with a single attribute val timestamp : String.
This seems easy to do; just use a Seq with a mixin TimestampMixin { val timestamp : String }. But I can't figure out how to create the constructor. My question is: how do I create a constructor in the companion object, that creates a sequence with an extra member value? The signature is as follows:
object Timestamped {
def apply[M](elems: M*)(timestamp : String) : Seq[M] with TimestampMixin = ???
}
You'll see that it's not straightforward; collections use Builders to instantiate themselves, so I can't simply call the constructor and override some vals.
Scala collections are very complicated structures when it comes down to it. Extending Seq requires implementing apply, length, and iterator methods. In the end, you'll probably end up duplicating existing code for List, Set, or something else. You'll also probably have to worry about CanBuildFroms for your collection, which in the end I don't think is worth it if you just want to add a field.
Instead, consider an implicit conversion from your Timestamped type to Seq.
case class Timestamped[A](elems: Seq[A])(val timestamp: String)
object Timestamped {
implicit def toSeq[A](ts: Timestamped[A]): Seq[A] = ts.elems
}
Now, whenever I try to call a method from Seq, the compiler will implicitly convert Timestamped to Seq, and we can proceed as normal.
scala> val ts = Timestamped(List(1,2,3,4))("1/2/34")
ts: Timestamped[Int] = Timestamped(List(1, 2, 3, 4))
scala> ts.filter(_ > 2)
res18: Seq[Int] = List(3, 4)
There is one major drawback here, and it's that we're now stuck with Seq after performing operations on the original Timestamped.
Go the other way... extend Seq, it only has 3 abstract members:
case class Stamped[T](elems: Seq[T], stamp: Long) extends Seq[T] {
override def apply(i: Int) = elems.apply(i)
override def iterator = elems.iterator
override def length = elems.length
}
val x = Stamped(List(10,20,30), 15L)
println(x.head) // 10
println(x.stamp) // 15
println(x.map { _ * 10}) // List(100, 200, 300)
println(x.filter { _ > 20}) // List(30)
Keep in mind, this only works as long as Seq is specific enough for your use cases; if you later find you need more complex collection behavior, this may become untenable.
EDIT: Added a version closer to the signature you were trying to create. Not sure if this helps you any more:
case class Stamped[T](elems: T*)(stamp: Long) extends Seq[T] {
def timeStamp = stamp
override def apply(i: Int) = elems.apply(i)
override def iterator = elems.iterator
override def length = elems.length
}
val x = Stamped(10,20,30)(15L)
println(x.head) // 10
println(x.timeStamp) // 15
println(x.map { _ * 10}) // List(100, 200, 300)
println(x.filter { _ > 20}) // List(30)
Where elems would end up being a generically created WrappedArray.
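For comparison, the exact signature from the question can be satisfied with an anonymous class that mixes the trait into Seq and delegates the three abstract members. This is only a sketch: TimestampMixin is taken from the question, while the object name TimestampedSeq is hypothetical.

```scala
trait TimestampMixin { def timestamp: String }

object TimestampedSeq {
  // Builds a Seq that delegates to the varargs and carries one extra member.
  def apply[M](elems: M*)(ts: String): Seq[M] with TimestampMixin =
    new Seq[M] with TimestampMixin {
      val timestamp: String = ts
      def apply(i: Int): M = elems(i)
      def length: Int = elems.length
      def iterator: Iterator[M] = elems.iterator
    }

  def main(args: Array[String]): Unit = {
    val t = TimestampedSeq(1, 2, 3, 4)("2014-02-25")
    println(t.head)
    println(t.timestamp)
  }
}
```

As with the other approaches, transformations such as map and filter return a plain Seq, so the timestamp is lost after the first operation.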

Scala operator overloading with multiple parameters

In short: I am trying to write something like A <N B for a DSL in Scala, for an integer N and A, B of type T. Is there a nice way to do so?
Longer: I am trying to write a DSL for TGrep2 in Scala. I'm currently interested in writing
A <N B    B is the Nth child of A (the first child is <1).
in a nice way, as close as possible to the original definition, in Scala. Is there a way to overload the < operator so that it can take an N and a B as arguments?
What I tried: I tried two different possibilities which did not make me very happy:
scala> val N = 10
N: Int = 10
scala> case class T(n:String) {def <(i:Int,j:T) = println("huray!")}
defined class T
scala> T("foo").<(N,T("bar"))
huray!
and
scala> case class T(n:String) {def <(i:Int) = new {def apply(j:T) = println("huray!")}}
defined class T
scala> (T("foo")<N)(T("bar"))
warning: there were 1 feature warnings; re-run with -feature for details
huray!
I'd suggest you use something like nth instead of the < symbol, which makes the semantics clear. A nth N is B would make a lot of sense to me at least. It would translate to something like
case class T (label:String){
def is(j:T) = {
label equals j.label
}
}
case class J(i:List[T]){
def nth(index:Int) :T = {
i(index)
}
}
You can easily do:
val t = T("Mice")
val t1 = T("Rats")
val j = J(List(t1,t))
j nth 1 is t //res = true
The problem is that apply doesn't work as a postfix operator, so you can't write it without the parentheses. You could write this:
case class T(n: String) {
def <(in: (Int, T)) = {
in match {
case (i, t) =>
println(s"${t.n} is the ${i} child of ${n}")
}
}
}
implicit class Param(lower: Int) {
def apply(t: T) = (lower, t)
}
but then,
T("foo") < 10 T("bar")
would still fail, but you could work it out with:
T("foo") < 10 (T("bar"))
There isn't a good way of doing what you want without adding parentheses somewhere.
I think that you might want to go for parser combinators instead if you really want to stick with this syntax. Or, as @korefn proposed, you break compatibility and do it with new operators.
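One further possibility, which departs from TGrep2's exact notation, is to lean on operator precedence: operators starting with - bind tighter than operators starting with <, so A < N -> B parses as A < (N -> B), and the standard -> builds the (N, B) pair. The sketch below assumes that trade-off is acceptable; Node and the returned message are hypothetical.

```scala
object TGrepDslSketch {
  case class Node(label: String) {
    // Receives the (N, child) pair built by the standard `->` operator.
    def <(nthChild: (Int, Node)): String = {
      val (n, child) = nthChild
      s"${child.label} is child number $n of $label"
    }
  }

  // Parsed as Node("foo") < (2 -> Node("bar")) because `->` has higher precedence.
  val relation: String = Node("foo") < 2 -> Node("bar")

  def main(args: Array[String]): Unit = println(relation)
}
```

The extra -> is visible noise compared to the original notation, but it avoids both parentheses and an implicit conversion on Int.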

Nearest keys in a SortedMap

Given a key k in a SortedMap, how can I efficiently find the largest key m that is less than or equal to k, and also the smallest key n that is greater than or equal to k? Thank you.
Looking at the source code for 2.9.0, the following code seems to be about the best you can do
def getLessOrEqual[A,B](sm: SortedMap[A,B], bound: A): B = {
val key = sm.to(bound).lastKey
sm(key)
}
I don't know exactly how the splitting of the RedBlack tree works, but I guess it's something like an O(log n) traversal of the tree/construction of new elements and then a balancing, presumably also O(log n). Then you need to go down the new tree again to get the last key. Unfortunately you can't retrieve the value in the same go. So you have to go down again to fetch the value.
In addition the lastKey might throw an exception and there is no similar method that returns an Option.
I'm waiting for corrections.
Edit and personal comment
The SortedMap area of the std lib seems to be a bit neglected. I'm also missing a mutable SortedMap. And looking through the sources, I noticed that some important methods are missing (like the one the OP asks for or the ones pointed out in my answer), and some have poor implementations, like 'last', which is defined by TraversableLike and goes through the complete tree from first to last to obtain the last element.
Edit 2
Now that the question has been reformulated, my answer is not valid anymore (well, it wasn't before anyway). I think you have to do the thing I'm describing twice, once for lessOrEqual and once for greaterOrEqual. You can take a shortcut if you find the equal element.
Scala's SortedSet trait has no method that will give you the closest element to some other element.
It is presently implemented with TreeSet, which is based on RedBlack. The RedBlack tree is not exposed through public methods on TreeSet, although the method tree is protected and therefore reachable from a subclass. Unfortunately, that is basically useless: you'd have to override the methods returning TreeSet to return your subclass, but most of them are based on newSet, which is private.
So, in the end, you'd have to duplicate most of TreeSet. On the other hand, it isn't all that much code.
Once you have access to RedBlack, you'd have to implement something similar to RedBlack.Tree's lookup, so you'd have O(log n) performance. That's actually the same complexity as range, though it would certainly do less work.
Alternatively, you'd make a zipper for the tree, so that you could actually navigate through the set in constant time. It would be a lot more work, of course.
Using Scala 2.11.7, the following will give what you want:
scala> val set = SortedSet('a', 'f', 'j', 'z')
set: scala.collection.SortedSet[Char] = TreeSet(a, f, j, z)
scala> val beforeH = set.to('h').last
beforeH: Char = f
scala> val afterH = set.from('h').head
afterH: Char = j
Generally you should use lastOption and headOption, as the specified elements may not exist. If you are looking to squeeze out a little more efficiency, you can try replacing from(...).head with keysIteratorFrom(...).next()
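The Option-returning variant could be wrapped up as follows. This is a sketch assuming Scala 2.13 or later, where the range methods to and from are named rangeTo and rangeFrom; the object and method names are hypothetical.

```scala
import scala.collection.immutable.SortedMap

object NearestKeysSketch {
  // Returns (largest key <= k, smallest key >= k); each side is None
  // when no such key exists.
  def nearest[A, B](sm: SortedMap[A, B], k: A): (Option[A], Option[A]) =
    (sm.rangeTo(k).lastOption.map(_._1), sm.rangeFrom(k).headOption.map(_._1))

  def main(args: Array[String]): Unit =
    println(nearest(SortedMap(1 -> "a", 5 -> "b", 9 -> "c"), 6))
}
```

When k itself is a key, both sides of the result are Some(k).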
Sadly, the Scala library only allows to make this type of query efficiently:
and also the smallest key n that is greater than or equal to k.
val n = TreeMap(...).keysIteratorFrom(k).next
You can hack this by keeping two structures, one with normal keys, and one with negated keys. Then you can use the other structure to make the second type of query.
val n = - TreeMap(...).keysIteratorFrom(-k).next
Looks like I should file a ticket to add 'fromIterator' and 'toIterator' methods to 'Sorted' trait.
Well, one option is certainly using java.util.TreeMap.
It has floorKey and ceilingKey methods, which do exactly what you want (lowerKey and higherKey are the strict variants that exclude k itself).
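A short sketch of the java.util.TreeMap route from Scala: floorKey and ceilingKey are the inclusive lookups (<= and >=), while lowerKey and higherKey are strict. Both return null when there is no such key, so the sketch uses boxed java.lang.Integer keys and wraps the result in Option. The object name, the helper nearest and the generated values are hypothetical.

```scala
import java.util.{TreeMap => JTreeMap}

object JavaTreeMapSketch {
  // Boxed keys are used so that a missing neighbour comes back as null, not 0.
  def nearest(keys: Seq[Int], k: Int): (Option[Int], Option[Int]) = {
    val m = new JTreeMap[java.lang.Integer, String]()
    keys.foreach(key => m.put(key, s"value$key"))
    val floor = Option(m.floorKey(k)).map(_.intValue)     // largest key <= k
    val ceiling = Option(m.ceilingKey(k)).map(_.intValue) // smallest key >= k
    (floor, ceiling)
  }

  def main(args: Array[String]): Unit =
    println(nearest(Seq(1, 5, 9), 6))
}
```

Declaring the key type as Scala's Int instead would unbox a null result to 0, which is indistinguishable from a real key of 0.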
I had a similar problem: I wanted to find the closest element to a given key in a SortedMap. I remember the answer to this question being, "You have to hack TreeSet," so when I had to implement it for a project, I found a way to wrap TreeSet without getting into its internals.
I didn't see jazmit's answer, which more closely answers the original poster's question with minimum fuss (two method calls). However, those method calls do more work than needed for this application (multiple tree traversals), and my solution provides lots of hooks where other users can modify it to their own needs.
Here it is:
import scala.collection.immutable.TreeSet
import scala.collection.SortedMap
// generalize the idea of an Ordering to metric sets
trait MetricOrdering[T] extends Ordering[T] {
def distance(x: T, y: T): Double
def compare(x: T, y: T) = {
val d = distance(x, y)
if (d > 0.0) 1
else if (d < 0.0) -1
else 0
}
}
class MetricSortedMap[A, B]
(elems: (A, B)*)
(implicit val ordering: MetricOrdering[A])
extends SortedMap[A, B] {
// while TreeSet searches for an element, keep track of the best it finds
// with *thread-safe* mutable state, of course
private val best = new java.lang.ThreadLocal[(Double, A, B)]
best.set((-1.0, null.asInstanceOf[A], null.asInstanceOf[B]))
private val ord = new MetricOrdering[(A, B)] {
def distance(x: (A, B), y: (A, B)) = {
val diff = ordering.distance(x._1, y._1)
val absdiff = Math.abs(diff)
// the "to" position is a key-null pair; the object of interest
// is the other one
if (absdiff < best.get._1)
(x, y) match {
// in practice, TreeSet always picks this first case, but that's
// insider knowledge
case ((to, null), (pos, obj)) =>
best.set((absdiff, pos, obj))
case ((pos, obj), (to, null)) =>
best.set((absdiff, pos, obj))
case _ =>
}
diff
}
}
// use a TreeSet as a backing (not TreeMap because we need to get
// the whole pair back when we query it)
private val treeSet = TreeSet[(A, B)](elems: _*)(ord)
// find the closest key and return:
// (distance to key, the key, its associated value)
def closest(to: A): (Double, A, B) = {
treeSet.headOption match {
case Some((pos, obj)) =>
best.set((ordering.distance(to, pos), pos, obj))
case None =>
throw new java.util.NoSuchElementException(
"SortedMap has no elements, and hence no closest element")
}
treeSet((to, null.asInstanceOf[B])) // called for side effects
best.get
}
// satisfy the contract (or throw UnsupportedOperationException)
def +[B1 >: B](kv: (A, B1)): SortedMap[A, B1] =
new MetricSortedMap[A, B](
elems :+ (kv._1, kv._2.asInstanceOf[B]): _*)
def -(key: A): SortedMap[A, B] =
new MetricSortedMap[A, B](elems.filter(_._1 != key): _*)
def get(key: A): Option[B] = treeSet.find(_._1 == key).map(_._2)
def iterator: Iterator[(A, B)] = treeSet.iterator
def rangeImpl(from: Option[A], until: Option[A]): SortedMap[A, B] =
new MetricSortedMap[A, B](treeSet.rangeImpl(
from.map((_, null.asInstanceOf[B])),
until.map((_, null.asInstanceOf[B]))).toSeq: _*)
}
// test it with A = Double
implicit val doubleOrdering =
new MetricOrdering[Double] {
def distance(x: Double, y: Double) = x - y
}
// and B = String
val stuff = new MetricSortedMap[Double, String](
3.3 -> "three",
1.1 -> "one",
5.5 -> "five",
4.4 -> "four",
2.2 -> "two")
println(stuff.iterator.toList)
println(stuff.closest(1.5))
println(stuff.closest(1000))
println(stuff.closest(-1000))
println(stuff.closest(3.3))
println(stuff.closest(3.4))
println(stuff.closest(3.2))
I've been doing:
val m = SortedMap(myMap.toSeq:_*)
val offsetMap = (m.toSeq zip m.keys.toSeq.drop(1)).map {
case ( (k,v),newKey) => (newKey,v)
}.toMap
I do this when I want the values of my map offset by one key. I'm also looking for a better way, preferably without storing an extra map.