Nearest keys in a SortedMap - scala

Given a key k in a SortedMap, how can I efficiently find the largest key m that is less than or equal to k, and also the smallest key n that is greater than or equal to k. Thank you.

Looking at the source code for 2.9.0, the following code seems about to be the best you can do
def getLessOrEqual[A,B](sm: SortedMap[A,B], bound: A): B = {
val key = sm.to(x).lastKey
sm(key)
}
I don't know exactly how the splitting of the RedBlack tree works, but I guess it's something like a O(log n) traversal of the tree/construction of new elements and then a balancing, presumable also O(log n). Then you need to go down the new tree again to get the last key. Unfortunately you can't retrieve the value in the same go. So you have to go down again to fetch the value.
In addition the lastKey might throw an exception and there is no similar method that returns an Option.
I'm waiting for corrections.
Edit and personal comment
The SortedMap area of the std lib seems to be a bit neglected. I'm also missing a mutable SortedMap. And looking through the sources, I noticed that there are some important methods missing (like the one the OP asks for or the ones pointed out in my answer) and also some have bad implementation, like 'last' which is defined by TraversableLike and goes through the complete tree from first to last to obtain the last element.
Edit 2
Now the question is reformulated my answer is not valid anymore (well it wasn't before anyway). I think you have to do the thing I'm describing twice for lessOrEqual and greaterOrEqual. Well you can take a shortcut if you find the equal element.

Scala's SortedSet trait has no method that will give you the closest element to some other element.
It is presently implemented with TreeSet, which is based on RedBlack. The RedBlack tree is not visible through methods on TreeSet, but the protected method tree is protected. Unfortunately, it is basically useless. You'd have to override methods returning TreeSet to return your subclass, but most of them are based on newSet, which is private.
So, in the end, you'd have to duplicate most of TreeSet. On the other hand, it isn't all that much code.
Once you have access to RedBlack, you'd have to implement something similar to RedBlack.Tree's lookup, so you'd have O(logn) performance. That's actually the same complexity of range, though it would certainly do less work.
Alternatively, you'd make a zipper for the tree, so that you could actually navigate through the set in constant time. It would be a lot more work, of course.

Using Scala 2.11.7, the following will give what you want:
scala> val set = SortedSet('a', 'f', 'j', 'z')
set: scala.collection.SortedSet[Char] = TreeSet(a, f, j, z)
scala> val beforeH = set.to('h').last
beforeH: Char = f
scala> val afterH = set.from('h').head
afterH: Char = j
Generally you should use lastOption and headOption as the specified elements may not exist. If you are looking to squeeze a little more efficiency out, you can try replacing from(...).head with keysIteratorFrom(...).head

Sadly, the Scala library only allows to make this type of query efficiently:
and also the smallest key n that is greater than or equal to k.
val n = TreeMap(...).keysIteratorFrom(k).next
You can hack this by keeping two structures, one with normal keys, and one with negated keys. Then you can use the other structure to make the second type of query.
val n = - TreeMap(...).keysIteratorFrom(-k).next

Looks like I should file a ticket to add 'fromIterator' and 'toIterator' methods to 'Sorted' trait.

Well, one option is certainly using java.util.TreeMap.
It has lowerKey and higherKey methods, which do excatly what you want.

I had a similar problem: I wanted to find the closest element to a given key in a SortedMap. I remember the answer to this question being, "You have to hack TreeSet," so when I had to implement it for a project, I found a way to wrap TreeSet without getting into its internals.
I didn't see jazmit's answer, which more closely answers the original poster's question with minimum fuss (two method calls). However, those method calls do more work than needed for this application (multiple tree traversals), and my solution provides lots of hooks where other users can modify it to their own needs.
Here it is:
import scala.collection.immutable.TreeSet
import scala.collection.SortedMap
// generalize the idea of an Ordering to metric sets
trait MetricOrdering[T] extends Ordering[T] {
def distance(x: T, y: T): Double
def compare(x: T, y: T) = {
val d = distance(x, y)
if (d > 0.0) 1
else if (d < 0.0) -1
else 0
}
}
class MetricSortedMap[A, B]
(elems: (A, B)*)
(implicit val ordering: MetricOrdering[A])
extends SortedMap[A, B] {
// while TreeSet searches for an element, keep track of the best it finds
// with *thread-safe* mutable state, of course
private val best = new java.lang.ThreadLocal[(Double, A, B)]
best.set((-1.0, null.asInstanceOf[A], null.asInstanceOf[B]))
private val ord = new MetricOrdering[(A, B)] {
def distance(x: (A, B), y: (A, B)) = {
val diff = ordering.distance(x._1, y._1)
val absdiff = Math.abs(diff)
// the "to" position is a key-null pair; the object of interest
// is the other one
if (absdiff < best.get._1)
(x, y) match {
// in practice, TreeSet always picks this first case, but that's
// insider knowledge
case ((to, null), (pos, obj)) =>
best.set((absdiff, pos, obj))
case ((pos, obj), (to, null)) =>
best.set((absdiff, pos, obj))
case _ =>
}
diff
}
}
// use a TreeSet as a backing (not TreeMap because we need to get
// the whole pair back when we query it)
private val treeSet = TreeSet[(A, B)](elems: _*)(ord)
// find the closest key and return:
// (distance to key, the key, its associated value)
def closest(to: A): (Double, A, B) = {
treeSet.headOption match {
case Some((pos, obj)) =>
best.set((ordering.distance(to, pos), pos, obj))
case None =>
throw new java.util.NoSuchElementException(
"SortedMap has no elements, and hence no closest element")
}
treeSet((to, null.asInstanceOf[B])) // called for side effects
best.get
}
// satisfy the contract (or throw UnsupportedOperationException)
def +[B1 >: B](kv: (A, B1)): SortedMap[A, B1] =
new MetricSortedMap[A, B](
elems :+ (kv._1, kv._2.asInstanceOf[B]): _*)
def -(key: A): SortedMap[A, B] =
new MetricSortedMap[A, B](elems.filter(_._1 != key): _*)
def get(key: A): Option[B] = treeSet.find(_._1 == key).map(_._2)
def iterator: Iterator[(A, B)] = treeSet.iterator
def rangeImpl(from: Option[A], until: Option[A]): SortedMap[A, B] =
new MetricSortedMap[A, B](treeSet.rangeImpl(
from.map((_, null.asInstanceOf[B])),
until.map((_, null.asInstanceOf[B]))).toSeq: _*)
}
// test it with A = Double
implicit val doubleOrdering =
new MetricOrdering[Double] {
def distance(x: Double, y: Double) = x - y
}
// and B = String
val stuff = new MetricSortedMap[Double, String](
3.3 -> "three",
1.1 -> "one",
5.5 -> "five",
4.4 -> "four",
2.2 -> "two")
println(stuff.iterator.toList)
println(stuff.closest(1.5))
println(stuff.closest(1000))
println(stuff.closest(-1000))
println(stuff.closest(3.3))
println(stuff.closest(3.4))
println(stuff.closest(3.2))

I've been doing:
val m = SortedMap(myMap.toSeq:_*)
val offsetMap = (m.toSeq zip m.keys.toSeq.drop(1)).map {
case ( (k,v),newKey) => (newKey,v)
}.toMap
When I want the results of my map off-set by one key. I'm also looking for a better way, preferably without storing an extra map.

Related

loop until a condition stands in scala

I'd like to write a generic loop until a given condition stands, in a functional way.
I've came up with the following code :
def loop[A](a: A, f: A => A, cond: A => Boolean) : A =
if (cond(a)) a else loop(f(a), f, cond)
What are other alternatives ? Is there anything in scalaz ?
[update] It may be possible to use cats and to convert A => A into Reader and afterwards use tailRecM. Any help would be appreciated.
I agree with #wheaties's comment, but since you asked for alternatives, here you go:
You could represent the loop's steps as an iterator, then navigate to the first step where cond is true using .find:
val result = Iterator.iterate(a)(f).find(cond).get
I had originally misread, and answered as if the cond was the "keep looping while true" condition, as with C-style loops. Here's my response as if that was what you asked.
val steps = Iterator.iterate(a)(f).takeWhile(cond)
If all you want is the last A value, you can use steps.toIterable.last (oddly, Iterator doesn't have .last defined). Or you could collect all of the values to a list using steps.toList.
Example:
val steps = Iterator.iterate(0)(_ + 1).takeWhile(_ < 10)
// remember that an iterator is read-once, so if you call .toList, you can't call .last
val result = steps.toIterable.last
// result == 9
From your structure, I think what you are describing is closer to dropWhile than takeWhile. What follows is 100% educational and I don't suggest that this is useful or the proper way to solve this problem. Nevertheless, you might find it useful.
If you want to be generic to any container (List, Array, Option, etc.) You will need a method to access the first element of this container (a.k.a. the head):
trait HasHead[I[_]]{
def head[X](of: I[X]): X
}
object HasHead {
implicit val listHasHead = new HasHead[List] {
def head[X](of: List[X]) = of.head
}
implicit val arrayHasHead = new HasHead[Array] {
def head[X](of: Array[X]) = of.head
}
//...
}
Here is the generic loop adapted to work with any container:
def loop[I[_], A](
a: I[A],
f: I[A] => I[A],
cond: A => Boolean)(
implicit
hh: HasHead[I]): I[A] =
if(cond(hh.head(a))) a else loop(f(a), f, cond)
Example:
loop(List(1,2,3,4,5), (_: List[Int]).tail, (_: Int) > 2)
> List(3, 4, 5)

Scala: Grouping list of tuples

I need to group list of tuples in some unique way.
For example, if I have
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
Then I should group the list with second value and map with the set of first value
So the result should be
Map(2 -> Set(1,4), 3 -> Set(2,10))
By so far, I came up with this
l groupBy { p => p._2 } mapValues { v => (v map { vv => vv._1 }).toSet }
This works, but I believe there should be a much more efficient way...
This is similar to this question. Basically, as #serejja said, your approach is correct and also the most concise one. You could use collection.breakOut as builder factory argument to the last map and thereby save the additional iteration to get the Set type:
l.groupBy(_._2).mapValues(_.map(_._1)(collection.breakOut): Set[Int])
You shouldn't probably go beyond this, unless you really need to squeeze the performance.
Otherwise, this is how a general toMultiMap function could look like which allows you to control the values collection type:
import collection.generic.CanBuildFrom
import collection.mutable
def toMultiMap[A, K, V, Values](xs: TraversableOnce[A])
(key: A => K)(value: A => V)
(implicit cbfv: CanBuildFrom[Nothing, V, Values]): Map[K, Values] = {
val b = mutable.Map.empty[K, mutable.Builder[V, Values]]
xs.foreach { elem =>
b.getOrElseUpdate(key(elem), cbfv()) += value(elem)
}
b.map { case (k, vb) => (k, vb.result()) } (collection.breakOut)
}
What it does is, it uses a mutable Map during building stage, and values gathered in a mutable Builder first (the builder is provided by the CanBuildFrom instance). After the iteration over all input elements has completed, that mutable map of builder values is converted into an immutable map of the values collection type (again using the collection.breakOut trick to get the desired output collection straight away).
Ex:
val l = List((1,2,3),(4,2,5),(2,3,3),(10,3,2))
val v = toMultiMap(l)(_._2)(_._1) // uses Vector for values
val s: Map[Int, Set[Int] = toMultiMap(l)(_._2)(_._1) // uses Set for values
So your annotated result type directs the type inference of the values type. If you do not annotate the result, Scala will pick Vector as default collection type.

Scala: Why does SortedMap's mapValues returns a Map and not a SortedMap?

I'm new to Scala.
I'm using SortedMap in my code, and I wanted to use mapValues to create a new map with some transformation on the values.
Instead of returning a new SortedMap, the mapValues function returns a new Map, which I then have to convert to a SortedMap.
For example
val my_map = SortedMap(1 -> "one", 0 -> "zero", 2 -> "two")
val new_map = my_map.mapValues(name => name.toUpperCase)
// returns scala.collection.immutable.Map[Int,java.lang.String] = Map(0 -> ZERO, 1 -> ONE, 2 -> TWO)
val sorted_new_map = SortedMap(new_map.toArray:_ *)
This looks inefficient - the last convertion probably sorts the keys again, or at least verify that they are sorted.
I could use the normal map function which operates both on the keys and the values, and deliberately not change the keys in my transformation function. This looks inefficient too, since the implementation of Map probably assume that the transformation may change the order of the keys (like in the case: my_map.map(tup => (-tup._1, tup._2)) - so it probably "re-sorts" them too.
Is anyone familiar with the internal implementations of Map and SortedMap, and could tell me if my assumptions are correct? Can the compiler recognize automatically that the keys have not been reordered? Is there an internal reason for why mapValues should not return a SortedMap? Is there a better way to transform the map's values without loosing the order of the keys?
Thanks
You've stumbled upon a tricky feature of Scala's Map implementation. The catch that you are missing is that mapValues does not actually return a new Map: it returns a view of a Map. In other words, it wraps your original map in such a way that whenever you access a value it will compute .toUpperCase before returning the value to you.
The upside to this behavior is that Scala won't compute the function for values that aren't accessed, and it won't spend time copying all the data into a new Map. The downside is that the function is re-computed every time that value is accessed. So you might end up doing extra computation if you access the same values many times.
So why does SortedMap not return a SortedMap? Because it's actually returning a Map-wrapper. The underlying Map, then one that is wrapped, is still a SortedMap, so if you were to iterate through, it would still be in sorted order. You and I know that, but the type-checker doesn't. It certainly seems like they could have written it in such a way that it still maintains the SortedMap trait, but they didn't.
You can see in the code that it's not returning a SortedMap, but that the iteration behavior is still going to be sorted:
// from MapLike
override def mapValues[C](f: B => C): Map[A, C] = new DefaultMap[A, C] {
def iterator = for ((k, v) <- self.iterator) yield (k, f(v))
...
The solution to your problem is the same as the solution to getting around the view issue: use .map{ case (k,v) => (k,f(v)) }, as you mentioned in your question.
If you really want that convenience method though, you can do what I do, and write you own, better, version of mapValues:
class EnrichedWithMapVals[T, U, Repr <: GenTraversable[(T, U)]](self: GenTraversableLike[(T, U), Repr]) {
/**
* In a collection of pairs, map a function over the second item of each
* pair. Ensures that the map is computed at call-time, and not returned
* as a view as 'Map.mapValues' would do.
*
* #param f function to map over the second item of each pair
* #return a collection of pairs
*/
def mapVals[R, That](f: U => R)(implicit bf: CanBuildFrom[Repr, (T, R), That]) = {
val b = bf(self.asInstanceOf[Repr])
b.sizeHint(self.size)
for ((k, v) <- self) b += k -> f(v)
b.result
}
}
implicit def enrichWithMapVals[T, U, Repr <: GenTraversable[(T, U)]](self: GenTraversableLike[(T, U), Repr]): EnrichedWithMapVals[T, U, Repr] =
new EnrichedWithMapVals(self)
Now when you call mapVals on a SortedMap you get back a non-view SortedMap:
scala> val m3 = m1.mapVals(_ + 1)
m3: SortedMap[String,Int] = Map(aardvark -> 2, cow -> 6, dog -> 10)
It actually works on any collection of pairs, not just Map implementations:
scala> List(('a,1),('b,2),('c,3)).mapVals(_+1)
res8: List[(Symbol, Int)] = List(('a,2), ('b,3), ('c,4))

Scala: Generalised method to find match and return match dependant values in collection

I wish to find a match within a List and return values dependant on the match. The CollectFirst works well for matching on the elements of the collection but in this case I want to match on the member swEl of the element rather than on the element itself.
abstract class CanvNode (var swElI: Either[CSplit, VistaT])
{
private[this] var _swEl: Either[CSplit, VistaT] = swElI
def member = _swEl
def member_= (value: Either[CSplit, VistaT] ){ _swEl = value; attach}
def attach: Unit
attach
def findVista(origV: VistaIn): Option[Tuple2[CanvNode,VistaT]] = member match
{
case Right(v) if (v == origV) => Option(this, v)
case _ => None
}
}
def nodes(): List[CanvNode] = topNode :: splits.map(i => List(i.n1, i.n2)).flatten
//Is there a better way of implementing this?
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.map(i => i.findVista(origV)).collectFirst{case Some (r) => r}
Do I need a View on that, or will the collectFirst method ensure the collection is only created as needed?
It strikes me that this must be a fairly general pattern. Another example could be if one had a List member of the main List's elements and wanted to return the fourth element if it had one. Is there a standard method I can call? Failing that I can create the following:
implicit class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
And then I can replace the above with:
val temp: Option[Tuple2[CanvNode, VistaT]] =
nodes.findSome(i => i.findVista(origV))
This uses implicit classes from 2.10, for pre 2.10 use:
class TraversableOnceRichClass[A](n: TraversableOnce[A])
{
def findSome[T](f: (A) => Option[T]) = n.map(f(_)).collectFirst{case Some (r) => r}
}
implicit final def TraversableOnceRichClass[A](n: List[A]):
TraversableOnceRichClass[A] = new TraversableOnceRichClass(n)
As an introductory side node: The operation you're describing (return the first Some if one exists, and None otherwise) is the sum of a collection of Options under the "first" monoid instance for Option. So for example, with Scalaz 6:
scala> Stream(None, None, Some("a"), None, Some("b")).map(_.fst).asMA.sum
res0: scalaz.FirstOption[java.lang.String] = Some(a)
Alternatively you could put something like this in scope:
implicit def optionFirstMonoid[A] = new Monoid[Option[A]] {
val zero = None
def append(a: Option[A], b: => Option[A]) = a orElse b
}
And skip the .map(_.fst) part. Unfortunately neither of these approaches is appropriately lazy in Scalaz, so the entire stream will be evaluated (unlike Haskell, where mconcat . map (First . Just) $ [1..] is just fine, for example).
Edit: As a side note to this side note: apparently Scalaz does provide a sumr that's appropriately lazy (for streams—none of these approaches will work on a view). So for example you can write this:
Stream.from(1).map(Some(_).fst).sumr
And not wait forever for your answer, just like in the Haskell version.
But assuming that we're sticking with the standard library, instead of this:
n.map(f(_)).collectFirst{ case Some(r) => r }
I'd write the following, which is more or less equivalent, and arguably more idiomatic:
n.flatMap(f(_)).headOption
For example, suppose we have a list of integers.
val xs = List(1, 2, 3, 4, 5)
We can make this lazy and map a function with a side effect over it to show us when its elements are accessed:
val ys = xs.view.map { i => println(i); i }
Now we can flatMap an Option-returning function over the resulting collection and use headOption to (safely) return the first element, if it exists:
scala> ys.flatMap(i => if (i > 2) Some(i.toString) else None).headOption
1
2
3
res0: Option[java.lang.String] = Some(3)
So clearly this stops when we hit a non-empty value, as desired. And yes, you'll definitely need a view if your original collection is strict, since otherwise headOption (or collectFirst) can't reach back and stop the flatMap (or map) that precedes it.
In your case you can skip findVista and get even more concise with something like this:
val temp = nodes.view.flatMap(
node => node.right.toOption.filter(_ == origV).map(node -> _)
).headOption
Whether you find this clearer or just a mess is a matter of taste, of course.

Encoding recursive tree-creation with while loop + stacks

I'm a bit embarassed to admit this, but I seem to be pretty stumped by what should be a simple programming problem. I'm building a decision tree implementation, and have been using recursion to take a list of labeled samples, recursively split the list in half, and turn it into a tree.
Unfortunately, with deep trees I run into stack overflow errors (ha!), so my first thought was to use continuations to turn it into tail recursion. Unfortunately Scala doesn't support that kind of TCO, so the only solution is to use a trampoline. A trampoline seems kinda inefficient and I was hoping there would be some simple stack-based imperative solution to this problem, but I'm having a lot of trouble finding it.
The recursive version looks sort of like (simplified):
private def trainTree(samples: Seq[Sample], usedFeatures: Set[Int]): DTree = {
if (shouldStop(samples)) {
DTLeaf(makeProportions(samples))
} else {
val featureIdx = getSplittingFeature(samples, usedFeatures)
val (statsWithFeature, statsWithoutFeature) = samples.partition(hasFeature(featureIdx, _))
DTBranch(
trainTree(statsWithFeature, usedFeatures + featureIdx),
trainTree(statsWithoutFeature, usedFeatures + featureIdx),
featureIdx)
}
}
So basically I'm recursively subdividing the list into two according to some feature of the data, and passing through a list of used features so I don't repeat - that's all handled in the "getSplittingFeature" function so we can ignore it. The code is really simple! Still, I'm having trouble figuring out a stack-based solution that doesn't just use closures and effectively become a trampoline. I know we'll at least have to keep around little "frames" of arguments in the stack but I would like to avoid closure calls.
I get that I should be writing out explicitly what the callstack and program counter handle for me implicitly in the recursive solution, but I'm having trouble doing that without continuations. At this point it's hardly even about efficiency, I'm just curious. So please, no need to remind me that premature optimization is the root of all evil and the trampoline-based solution will probably work just fine. I know it probably will - this is basically a puzzle for it's own sake.
Can anyone tell me what the canonical while-loop-and-stack-based solution to this sort of thing is?
UPDATE: Based on Thipor Kong's excellent solution, I've coded up a while-loops/stacks/hashtable based implementation of the algorithm that should be a direct translation of the recursive version. This is exactly what I was looking for:
FINAL UPDATE: I've used sequential integer indices, as well as putting everything back into arrays instead of maps for performance, added maxDepth support, and finally have a solution with the same performance as the recursive version (not sure about memory usage but I would guess less):
private def trainTreeNoMaxDepth(startingSamples: Seq[Sample], startingMaxDepth: Int): DTree = {
// Use arraybuffer as dense mutable int-indexed map - no IndexOutOfBoundsException, just expand to fit
type DenseIntMap[T] = ArrayBuffer[T]
def updateIntMap[#specialized T](ab: DenseIntMap[T], idx: Int, item: T, dfault: T = null.asInstanceOf[T]) = {
if (ab.length <= idx) {ab.insertAll(ab.length, Iterable.fill(idx - ab.length + 1)(dfault)) }
ab.update(idx, item)
}
var currentChildId = 0 // get childIdx or create one if it's not there already
def child(childMap: DenseIntMap[Int], heapIdx: Int) =
if (childMap.length > heapIdx && childMap(heapIdx) != -1) childMap(heapIdx)
else {currentChildId += 1; updateIntMap(childMap, heapIdx, currentChildId, -1); currentChildId }
// go down
val leftChildren, rightChildren = new DenseIntMap[Int]() // heapIdx -> childHeapIdx
val todo = Stack((startingSamples, Set.empty[Int], startingMaxDepth, 0)) // samples, usedFeatures, maxDepth, heapIdx
val branches = new Stack[(Int, Int)]() // heapIdx, featureIdx
val nodes = new DenseIntMap[DTree]() // heapIdx -> node
while (!todo.isEmpty) {
val (samples, usedFeatures, maxDepth, heapIdx) = todo.pop()
if (shouldStop(samples) || maxDepth == 0) {
updateIntMap(nodes, heapIdx, DTLeaf(makeProportions(samples)))
} else {
val featureIdx = getSplittingFeature(samples, usedFeatures)
val (statsWithFeature, statsWithoutFeature) = samples.partition(hasFeature(featureIdx, _))
todo.push((statsWithFeature, usedFeatures + featureIdx, maxDepth - 1, child(leftChildren, heapIdx)))
todo.push((statsWithoutFeature, usedFeatures + featureIdx, maxDepth - 1, child(rightChildren, heapIdx)))
branches.push((heapIdx, featureIdx))
}
}
// go up
while (!branches.isEmpty) {
val (heapIdx, featureIdx) = branches.pop()
updateIntMap(nodes, heapIdx, DTBranch(nodes(child(leftChildren, heapIdx)), nodes(child(rightChildren, heapIdx)), featureIdx))
}
nodes(0)
}
Just store the binary tree in an array, as described on Wikipedia: For node i, the left child goes into 2*i+1 and the right child in to 2*i+2. When doing "down", you keep a collection of todos, that still have to be splitted to reach a leaf. Once you've got only leafs, to go upward (from right to left in the array) to build the decision nodes:
Update: A cleaned up version, that also supports the features stored int the branches (type parameter B) and that is more functional/fully pure and that supports sparse trees with a map as suggested by ron.
Update2-3: Make economical use of name space for node ids and abstract over type of ids to allow of large trees. Take node ids from Stream.
sealed trait DTree[A, B]
case class DTLeaf[A, B](a: A, b: B) extends DTree[A, B]
case class DTBranch[A, B](left: DTree[A, B], right: DTree[A, B], b: B) extends DTree[A, B]
def mktree[A, B, Id](a: A, b: B, split: (A, B) => Option[(A, A, B)], ids: Stream[Id]) = {
#tailrec
def goDown(todo: Seq[(A, B, Id)], branches: Seq[(Id, B, Id, Id)], leafs: Map[Id, DTree[A, B]], ids: Stream[Id]): (Seq[(Id, B, Id, Id)], Map[Id, DTree[A, B]]) =
todo match {
case Nil => (branches, leafs)
case (a, b, id) :: rest =>
split(a, b) match {
case None =>
goDown(rest, branches, leafs + (id -> DTLeaf(a, b)), ids)
case Some((left, right, b2)) =>
val leftId #:: rightId #:: idRest = ids
goDown((right, b2, rightId) +: (left, b2, leftId) +: rest, (id, b2, leftId, rightId) +: branches, leafs, idRest)
}
}
#tailrec
def goUp[A, B](branches: Seq[(Id, B, Id, Id)], nodes: Map[Id, DTree[A, B]]): Map[Id, DTree[A, B]] =
branches match {
case Nil => nodes
case (id, b, leftId, rightId) :: rest =>
goUp(rest, nodes + (id -> DTBranch(nodes(leftId), nodes(rightId), b)))
}
val rootId #:: restIds = ids
val (branches, leafs) = goDown(Seq((a, b, rootId)), Seq(), Map(), restIds)
goUp(branches, leafs)(rootId)
}
// try it out
def split(xs: Seq[Int], b: Int) =
if (xs.size > 1) {
val (left, right) = xs.splitAt(xs.size / 2)
Some((left, right, b + 1))
} else {
None
}
val tree = mktree(0 to 1000, 0, split _, Stream.from(0))
println(tree)