Class mutable.HashTable in Scala

The online documentation doesn't say much about it (http://www.scala-lang.org/api/2.12.0/scala/collection/mutable/HashTable.html), and it's impossible to find any example of its usage.
I'm wondering how to even put elements into a mutable.HashTable.
Is there any data structure with the same constant time complexity for search and delete operations that I could use?

mutable.HashTable is a trait:
trait HashTable[A, Entry >: Null <: HashEntry[A, Entry]] extends HashTable.HashUtils[A]
with a default load factor of 75% (stored as 750 out of 1000):
private[collection] final def defaultLoadFactor: Int = 750
It uses an array of entries as its underlying storage:
protected var table: Array[HashEntry[A, Entry]] = new Array(initialCapacity)
1) You have to use concrete implementations of HashTable, such as mutable.HashMap or mutable.LinkedHashMap.
For example,
scala> import scala.collection.mutable.HashMap
import scala.collection.mutable.HashMap
scala> new HashMap[String, String]
res9: scala.collection.mutable.HashMap[String,String] = Map()
add element
scala> res9+= "order-id-1" -> "delivered"
res10: res9.type = Map(order-id-1 -> delivered)
scala> res9
res11: scala.collection.mutable.HashMap[String,String] = Map(order-id-1 -> delivered)
access element
scala> res9("order-id-1")
res12: String = delivered
Write complexity should be O(1), the same as for a standard hash map data structure: compute the key's hash code and add the entry to the corresponding slot of the entry array.
def += (kv: (A, B)): this.type = {
  val e = findOrAddEntry(kv._1, kv._2)
  if (e ne null) e.value = kv._2
  this
}

protected def findOrAddEntry[B](key: A, value: B): Entry = {
  val h = index(elemHashCode(key))
  val e = findEntry0(key, h)
  if (e ne null) e else { addEntry0(createNewEntry(key, value), h); null }
}

private[this] def findEntry0(key: A, h: Int): Entry = {
  var e = table(h).asInstanceOf[Entry]
  while (e != null && !elemEquals(e.key, key)) e = e.next
  e
}

private[this] def addEntry0(e: Entry, h: Int) {
  e.next = table(h).asInstanceOf[Entry]
  table(h) = e
  tableSize = tableSize + 1
  nnSizeMapAdd(h)
  if (tableSize > threshold)
    resize(2 * table.length)
}
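Lookup and removal follow the same pattern as insertion: hash the key, index into the entry array, and walk the (normally short) chain of entries, so get and -= are also expected constant time on average. A minimal usage sketch (not from the original answer):
import scala.collection.mutable.HashMap

val orders = HashMap("order-id-1" -> "delivered", "order-id-2" -> "processing")

orders.get("order-id-2")      // Some(processing): expected O(1) lookup
orders -= "order-id-1"        // expected O(1) removal
orders.contains("order-id-1") // false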
2) If you use LinkedHashMap, it additionally maintains insertion order by linking the entries:
@transient protected var firstEntry: Entry = null
@transient protected var lastEntry: Entry = null
For example,
scala> import scala.collection.mutable.LinkedHashMap
import scala.collection.mutable.LinkedHashMap
scala> LinkedHashMap[String, String]()
res0: scala.collection.mutable.LinkedHashMap[String,String] = Map()
scala> res0 += "line-item1" -> "Delivered" += "line-item2" -> "Processing"
res2: res0.type = Map(line-item1 -> Delivered, line-item2 -> Processing)
scala> res0
res3: scala.collection.mutable.LinkedHashMap[String,String] = Map(line-item1 -> Delivered, line-item2 -> Processing)

Related

Map whose keys are compared using eq [duplicate]

I have an object which stores information about specific instances. For that, I would like to use a Map, but the keys are not compared by reference (they aren't, right?) but by the hashes provided by the hashCode method. For better understanding:
import collection.mutable._
import java.util.Random
object Foo {
  var myMap = HashMap[AnyRef, Int]()

  def doSomething(ar: AnyRef): Int = {
    myMap.get(ar) match {
      case Some(x) => x
      case None => {
        myMap += ar -> new Random().nextInt()
        doSomething(ar)
      }
    }
  }
}

object Main {
  def main(args: Array[String]) {
    case class ExampleClass(x: String);
    val o1 = ExampleClass("test1")
    val o2 = ExampleClass("test1")
    println(o2 == o1) // true
    println(o2 eq o1) // false
    // I want the following two lines to yield different numbers
    // and i do not have control over the classes, messing with their
    // equals implementation is not possible.
    println(Foo.doSomething(o1))
    println(Foo.doSomething(o2))
  }
}
In cases where I have instances with the same hash code, the "caching" of the random value will return the same value for both instances even though they are not the same instance. Which data structure is best suited to this situation?
Clarification/Edit
I know how this works normally, based on the hashCode and equals method. But that is exactly what I want to avoid. I updated my example to make that clearer. :)
EDIT: Based on clarifications to the question, you can create your own Map implementation, and override elemEquals().
The original implementation (in HashMap)
protected def elemEquals(key1: A, key2: A): Boolean = (key1 == key2)
Change this to:
protected def elemEquals(key1: A, key2: A): Boolean = (key1 eq key2)
class MyHashMap[A <: AnyRef, B] extends scala.collection.mutable.HashMap[A, B] {
  protected override def elemEquals(key1: A, key2: A): Boolean = (key1 eq key2)
}
Note that to use eq, you need to restrict the key type to AnyRef, or do a pattern match inside the elemEquals() method (a match-based variant is sketched after the example below).
case class Foo(i: Int)
val f1 = new Foo(1)
val f2 = new Foo(1)
val map = new MyHashMap[Foo, String]()
map += (f1 -> "f1")
map += (f2 -> "f2")
map.get(f1) // Some(f1)
map.get(f2) // Some(f2)
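If you would rather not constrain the key type to AnyRef, the match-based variant mentioned above could look roughly like this (MyAnyHashMap is a hypothetical name, not from the original answer):
import scala.collection.mutable

// Sketch only: compare keys by reference where possible, fall back to == otherwise.
// Note that primitive keys get boxed, so they are also compared by reference here.
class MyAnyHashMap[A, B] extends mutable.HashMap[A, B] {
  protected override def elemEquals(key1: A, key2: A): Boolean =
    (key1, key2) match {
      case (k1: AnyRef, k2: AnyRef) => k1 eq k2
      case _ => key1 == key2
    }
}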
--
Original answer
Map works with hashCode() and equals(). Have you implemented equals() correctly in your objects? Note that in Scala, == gets translated to a call to equals(). To get the behaviour of Java's == (reference comparison), use the Scala operator eq:
case class Foo(i: Int)
val f1 = new Foo(1)
val f2 = new Foo(1)
f1 == f2 // true
f1.equals(f2) // true
f1 eq f2 // false
val map = Map(f1 -> "f1", f2 -> "f2")
map.get(f1) // Some("f2")
map.get(f2) // Some("f2")
Here, the case class implements equals() as value equivalence, in this case:
f1.i == f2.i
You need to override equals() in your objects to use object identity instead, i.e. something like:
override def equals(o: Any) = { o.asInstanceOf[AnyRef] eq this }
This should still work with the same hashCode().
You can also use IdentityHashMap together with scala.collection.JavaConversions.
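For instance, a java.util.IdentityHashMap viewed as a mutable Scala Map keeps eq-distinct keys apart; a sketch using scala.collection.JavaConverters (the non-deprecated counterpart of JavaConversions):
import java.util.IdentityHashMap
import scala.collection.JavaConverters._

case class ExampleClass(x: String)

val cache = new IdentityHashMap[AnyRef, Int]().asScala

val o1 = ExampleClass("test1")
val o2 = ExampleClass("test1")

cache += o1 -> 1
cache += o2 -> 2

cache(o1) // 1
cache(o2) // 2 - separate entries, because the keys are compared with eq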
Ah, based on the comment... You could use a wrapper that overrides equals to get reference semantics.
class EqWrap[T <: AnyRef](val value: T) {
  override def hashCode() = if (value == null) 0 else value.hashCode
  override def equals(a: Any) = a match {
    case ref: EqWrap[_] => ref.value eq value
    case _ => false
  }
}

object EqWrap {
  def apply[T <: AnyRef](t: T) = new EqWrap(t)
}
case class A(i: Int)
val x = A(0)
val y = A(0)
val map = Map[EqWrap[A], Int](EqWrap(x) -> 1)
val xx = map.get(EqWrap(x))
val yy = map.get(EqWrap(y))
//xx: Option[Int] = Some(1)
//yy: Option[Int] = None
Original answer (based on not understanding the question - I have to leave this so that the comment makes sense...)
Map already has this semantic (unless I don't understand your question).
scala> val x = A(0)
x: A = A(0)
scala> val y = A(0)
y: A = A(0)
scala> x == y
res0: Boolean = true // objects are equal
scala> x.hashCode
res1: Int = -2081655426
scala> y.hashCode
res2: Int = -2081655426 // same hash code
scala> x eq y
res3: Boolean = false // not the same object
scala> val map = Map(x -> 1)
map: scala.collection.immutable.Map[A,Int] = Map(A(0) -> 1)
scala> map(y)
res8: Int = 1 // return the mapping based on hash code and equal semantic

Convert java.util.IdentityHashMap to scala.immutable.Map

What is the simplest way to convert a java.util.IdentityHashMap[A,B] into a subtype of scala.immutable.Map[A,B]? I need to keep keys separate unless they are eq.
Here's what I've tried so far:
scala> case class Example()
scala> val m = new java.util.IdentityHashMap[Example, String]()
scala> m.put(Example(), "first!")
scala> m.put(Example(), "second!")
scala> m.asScala // got a mutable Scala equivalent OK
res14: scala.collection.mutable.Map[Example,String] = Map(Example() -> first!, Example() -> second!)
scala> m.asScala.toMap // doesn't work, since toMap() removes duplicate keys (testing with ==)
res15: scala.collection.immutable.Map[Example,String] = Map(Example() -> second!)
Here's a simple implementation of an identity map in Scala. In usage, it should be similar to the standard immutable Map.
Example usage:
val im = IdentityMap(
  new String("stuff") -> 5,
  new String("stuff") -> 10)
println(im) // IdentityMap(stuff -> 5, stuff -> 10)
Your case:
import scala.collection.JavaConverters._
import java.{util => ju}
val javaIdentityMap: ju.IdentityHashMap[String, Int] = ???
val scalaIdentityMap = IdentityMap.empty[String,Int] ++ javaIdentityMap.asScala
Implementation itself (for performance reasons, there may be some more methods that need to be overridden):
import scala.collection.generic.ImmutableMapFactory
import scala.collection.immutable.MapLike
import IdentityMap.{Wrapper, wrap}
class IdentityMap[A, +B] private(underlying: Map[Wrapper[A], B])
  extends Map[A, B] with MapLike[A, B, IdentityMap[A, B]] {

  def +[B1 >: B](kv: (A, B1)) =
    new IdentityMap(underlying + ((wrap(kv._1), kv._2)))

  def -(key: A) =
    new IdentityMap(underlying - wrap(key))

  def iterator =
    underlying.iterator.map {
      case (kw, v) => (kw.value, v)
    }

  def get(key: A) =
    underlying.get(wrap(key))

  override def size: Int =
    underlying.size

  override def empty =
    new IdentityMap(underlying.empty)

  override def stringPrefix =
    "IdentityMap"
}

object IdentityMap extends ImmutableMapFactory[IdentityMap] {
  def empty[A, B] =
    new IdentityMap(Map.empty)

  private class Wrapper[A](val value: A) {
    override def toString: String =
      value.toString

    override def equals(other: Any) = other match {
      case otherWrapper: Wrapper[_] =>
        value.asInstanceOf[AnyRef] eq otherWrapper.value.asInstanceOf[AnyRef]
      case _ => false
    }

    override def hashCode =
      System.identityHashCode(value)
  }

  private def wrap[A](key: A) =
    new Wrapper(key)
}
One way to handle this would be to change what equality means for the class, e.g.
scala> case class Example() {
         override def equals( that:Any ) = that match {
           case that:AnyRef => this eq that
           case _ => false
         }
       }
defined class Example
scala> val m = new java.util.IdentityHashMap[Example, String]()
m: java.util.IdentityHashMap[Example,String] = {}
scala> m.put(Example(), "first!")
res1: String = null
scala> m.put(Example(), "second!")
res2: String = null
scala> import scala.collection.JavaConverters._
import scala.collection.JavaConverters._
scala> m.asScala
res3: scala.collection.mutable.Map[Example,String] = Map(Example() -> second!, Example() -> first!)
scala> m.asScala.toMap
res4: scala.collection.immutable.Map[Example,String] = Map(Example() -> second!, Example() -> first!)
Or if you don't want to change equality for the class, you could make a wrapper.
Of course, this won't perform as well as a Map that uses eq instead of ==; it might be worth asking for one....
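A rough sketch of such a wrapper (IdKey and toIdentityKeyedMap are hypothetical names, not from this answer): wrap each key in an object whose equals and hashCode use reference identity, then build an ordinary immutable Map over the wrapped keys.
import scala.collection.JavaConverters._

// Sketch only: identity-based key wrapper, so toMap keeps eq-distinct keys apart.
final class IdKey[A <: AnyRef](val value: A) {
  override def equals(other: Any): Boolean = other match {
    case that: IdKey[_] => that.value eq value
    case _ => false
  }
  override def hashCode: Int = System.identityHashCode(value)
  override def toString: String = value.toString
}

def toIdentityKeyedMap[A <: AnyRef, B](m: java.util.IdentityHashMap[A, B]): Map[IdKey[A], B] =
  m.asScala.map { case (k, v) => new IdKey(k) -> v }.toMap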

Extending collection classes with extra fields in Scala

I'm looking to create a class that is basically a collection with an extra field. However, I keep running into problems and am wondering what the best way of implementing this is. I've tried to follow the pattern given in the Scala book. E.g.
import scala.collection.IndexedSeqLike
import scala.collection.mutable.Builder
import scala.collection.generic.CanBuildFrom
import scala.collection.mutable.ArrayBuffer
class FieldSequence[FT,ST](val field: FT, seq: IndexedSeq[ST] = Vector())
  extends IndexedSeq[ST] with IndexedSeqLike[ST,FieldSequence[FT,ST]] {

  def apply(index: Int): ST = return seq(index)
  def length = seq.length
  override def newBuilder: Builder[ST,FieldSequence[FT,ST]]
    = FieldSequence.newBuilder[FT,ST](field)
}

object FieldSequence {
  def fromSeq[FT,ST](field: FT)(buf: IndexedSeq[ST])
    = new FieldSequence(field, buf)

  def newBuilder[FT,ST](field: FT): Builder[ST,FieldSequence[FT,ST]]
    = new ArrayBuffer mapResult(fromSeq(field))

  implicit def canBuildFrom[FT,ST]:
    CanBuildFrom[FieldSequence[FT,ST], ST, FieldSequence[FT,ST]] =
      new CanBuildFrom[FieldSequence[FT,ST], ST, FieldSequence[FT,ST]] {
        def apply(): Builder[ST,FieldSequence[FT,ST]]
          = newBuilder[FT,ST]( _ ) // What goes here?
        def apply(from: FieldSequence[FT,ST]): Builder[ST,FieldSequence[FT,ST]]
          = from.newBuilder
      }
}
The problem is that the implicitly defined CanBuildFrom needs an apply method with no arguments. But in these circumstances this method is meaningless, as a field (of type FT) is needed to construct a FieldSequence. In fact, it should be impossible to construct a FieldSequence from just a sequence of type ST. Is throwing an exception here the best I can do?
Then your class doesn't fulfill the requirements to be a Seq, and methods like flatMap (and hence for-comprehensions) can't work for it.
I'm not sure I agree with Landei about flatMap and map. If you replace apply() with one that throws an exception, like this, most of the operations should work:
def apply(): Builder[ST,FieldSequence[FT,ST]] = sys.error("unsupported")
From what I can see in TraversableLike, map, flatMap and most other operations use the apply(repr) version, so for-comprehensions seemingly work. It also feels like it should follow the monad laws (the field is just carried across).
Given the code you have, you can do this:
scala> val fs = FieldSequence.fromSeq("str")(Vector(1,2))
fs: FieldSequence[java.lang.String,Int] = FieldSequence(1, 2)
scala> fs.map(1 + _)
res3: FieldSequence[java.lang.String,Int] = FieldSequence(2, 3)
scala> val fs2 = FieldSequence.fromSeq("str1")(Vector(10,20))
fs2: FieldSequence[java.lang.String,Int] = FieldSequence(10, 20)
scala> for (x <- fs if x > 0; y <- fs2) yield (x + y)
res5: FieldSequence[java.lang.String,Int] = FieldSequence(11, 21, 12, 22)
What doesn't work is the following:
scala> fs.map(_ + "!")
// does not return a FieldSequence
scala> List(1,2).map(1 + _)(collection.breakOut): FieldSequence[String, Int]
java.lang.RuntimeException: unsupported
// this is where the apply() is used
For breakOut to work you would need to implement the apply() method. I suspect you could generate a builder with some default value for the field: def apply() = newBuilder[FT, ST](getDefault), with some implementation of getDefault that makes sense for your use case.
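A sketch of that idea, assuming the caller can supply an implicit default field value (the default parameter here plays the role of getDefault and is not part of the original code):
// Sketch only: a zero-argument apply() that builds with a default field instead of throwing.
implicit def canBuildFrom[FT, ST](implicit default: FT):
  CanBuildFrom[FieldSequence[FT, ST], ST, FieldSequence[FT, ST]] =
    new CanBuildFrom[FieldSequence[FT, ST], ST, FieldSequence[FT, ST]] {
      def apply(): Builder[ST, FieldSequence[FT, ST]] =
        FieldSequence.newBuilder[FT, ST](default)
      def apply(from: FieldSequence[FT, ST]): Builder[ST, FieldSequence[FT, ST]] =
        from.newBuilder
    }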
For the fact that fs.map(_ + "!") does not preserve the type, you need to modify your signature and implementation so that the compiler can find a CanBuildFrom[FieldSequence[String, Int], String, FieldSequence[String, String]]:
implicit def canBuildFrom[FT,ST_FROM,ST]:
  CanBuildFrom[FieldSequence[FT,ST_FROM], ST, FieldSequence[FT,ST]] =
    new CanBuildFrom[FieldSequence[FT,ST_FROM], ST, FieldSequence[FT,ST]] {
      def apply(): Builder[ST,FieldSequence[FT,ST]]
        = sys.error("unsupported")
      def apply(from: FieldSequence[FT,ST_FROM]): Builder[ST,FieldSequence[FT,ST]]
        = newBuilder[FT, ST](from.field)
    }
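With that CanBuildFrom in scope, mapping to a different element type should stay inside the wrapper, the field simply being carried across (an expected-behaviour sketch rather than captured REPL output):
val fs = FieldSequence.fromSeq("str")(Vector(1, 2))

val strings = fs.map(_ + "!") // expected: FieldSequence[String, String]
strings.field                 // expected: "str"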
In the end, my answer was very similar to that of a previous question. The differences between that question and my original one, and between its answer and the one below, are slight, but the approach basically allows anything that has a sequence to be treated as a sequence.
import scala.collection.SeqLike
import scala.collection.mutable.Builder
import scala.collection.mutable.ArrayBuffer
import scala.collection.generic.CanBuildFrom
trait SeqAdapter[+A, Repr[+X] <: SeqAdapter[X,Repr]]
  extends Seq[A] with SeqLike[A,Repr[A]] {

  val underlyingSeq: Seq[A]
  def create[B](seq: Seq[B]): Repr[B]

  def apply(index: Int) = underlyingSeq(index)
  def length = underlyingSeq.length
  def iterator = underlyingSeq.iterator

  override protected[this] def newBuilder: Builder[A,Repr[A]] = {
    val sac = new SeqAdapterCompanion[Repr] {
      def createDefault[B](seq: Seq[B]) = create(seq)
    }
    sac.newBuilder(create)
  }
}

trait SeqAdapterCompanion[Repr[+X] <: SeqAdapter[X,Repr]] {
  def createDefault[A](seq: Seq[A]): Repr[A]

  def fromSeq[A](creator: (Seq[A]) => Repr[A])(seq: Seq[A]) = creator(seq)

  def newBuilder[A](creator: (Seq[A]) => Repr[A]): Builder[A,Repr[A]] =
    new ArrayBuffer mapResult fromSeq(creator)

  implicit def canBuildFrom[A,B]: CanBuildFrom[Repr[A],B,Repr[B]] =
    new CanBuildFrom[Repr[A],B,Repr[B]] {
      def apply(): Builder[B,Repr[B]] = newBuilder(createDefault)
      def apply(from: Repr[A]) = newBuilder(from.create)
    }
}
This fixes all the problems huynhjl brought up. For my original problem, to have a field and a sequence treated as a sequence, a simple class will now do.
trait Field[FT] {
  val defaultValue: FT

  class FieldSeq[+ST](val field: FT, val underlyingSeq: Seq[ST] = Vector())
    extends SeqAdapter[ST,FieldSeq] {
    def create[B](seq: Seq[B]) = new FieldSeq[B](field, seq)
  }

  object FieldSeq extends SeqAdapterCompanion[FieldSeq] {
    def createDefault[A](seq: Seq[A]): FieldSeq[A] =
      new FieldSeq[A](defaultValue, seq)
    override implicit def canBuildFrom[A,B] = super.canBuildFrom[A,B]
  }
}
This can be tested as so:
val StringField = new Field[String] { val defaultValue = "Default Value" }
StringField: java.lang.Object with Field[String] = $anon$1@57f5de73
val fs = new StringField.FieldSeq[Int]("str", Vector(1,2))
val fsfield = fs.field
fs: StringField.FieldSeq[Int] = (1, 2)
fsfield: String = str
val fm = fs.map(1 + _)
val fmfield = fm.field
fm: StringField.FieldSeq[Int] = (2, 3)
fmfield: String = str
val fs2 = new StringField.FieldSeq[Int]("str1", Vector(10, 20))
val fs2field = fs2.field
fs2: StringField.FieldSeq[Int] = (10, 20)
fs2field: String = str1
val ffor = for (x <- fs if x > 0; y <- fs2) yield (x + y)
val fforfield = ffor.field
ffor: StringField.FieldSeq[Int] = (11, 21, 12, 22)
fforfield: String = str
val smap = fs.map(_ + "!")
val smapfield = smap.field
smap: StringField.FieldSeq[String] = (1!, 2!)
smapfield: String = str
val break = List(1,2).map(1 + _)(collection.breakOut): StringField.FieldSeq[Int]
val breakfield = break.field
break: StringField.FieldSeq[Int] = (2, 3)
breakfield: String = Default Value
val x: StringField.FieldSeq[Any] = fs
val xfield = x.field
x: StringField.FieldSeq[Any] = (1, 2)
xfield: String = str

abstracting over a collection

Recently, I wrote an iterator for a cartesian product of Anys. I started with a List of Lists, but recognized that I could easily switch to the more abstract trait Seq.
I know you like to see the code. :)
class Cartesian (val ll: Seq[Seq[_]]) extends Iterator [Seq[_]] {
  def combicount: Int = (1 /: ll) (_ * _.length)

  val last = combicount
  var iter = 0

  override def hasNext (): Boolean = iter < last

  override def next (): Seq[_] = {
    val res = combination (ll, iter)
    iter += 1
    res
  }

  def combination (xx: Seq [Seq[_]], i: Int): List[_] = xx match {
    case Nil => Nil
    case x :: xs => x (i % x.length) :: combination (xs, i / x.length)
  }
}
And a client of that class:
object Main extends Application {
  val illi = new Cartesian (List ("abc".toList, "xy".toList, "AB".toList))
  // val ivvi = new Cartesian (Vector (Vector (1, 2, 3), Vector (10, 20)))
  val issi = new Cartesian (Seq (Seq (1, 2, 3), Seq (10, 20)))
  // val iaai = new Cartesian (Array (Array (1, 2, 3), Array (10, 20)))

  (0 to 5).foreach (dummy => println (illi.next ()))
  // (0 to 5).foreach (dummy => println (issi.next ()))
}
/*
List(a, x, A)
List(b, x, A)
List(c, x, A)
List(a, y, A)
List(b, y, A)
List(c, y, A)
*/
The code works well for Seqs and Lists (which are Seqs), but of course not for Arrays or Vectors, which can't be deconstructed with the cons pattern '::'.
But the logic could be used for such collections too.
I could try to write implicit conversions to and from Seq for Vector, Array and the like, or try to write my own, similar implementation, or write a wrapper which transforms the collection to a Seq of Seqs, calls 'hasNext' and 'next' on the inner collection, and converts the result back to an Array, Vector or whatever. (I tried to implement such workarounds, but I have to admit: it's not that easy. For a real-world problem I would probably rewrite the iterator independently.)
However, the whole thing gets a bit out of control if I have to deal with Arrays of Lists or Lists of Arrays and other mixed cases.
What would be the most elegant way to write the algorithm in the broadest possible way?
There are two solutions. The first is to not require the containers to be a subclass of some generic super class, but to be convertible to one (by using implicit function arguments). If the container is already a subclass of the required type, there's a predefined identity conversion which only returns it.
import collection.mutable.Builder
import collection.TraversableLike
import collection.generic.CanBuildFrom
import collection.mutable.SeqLike
class Cartesian[T, ST[T], TT[S]](val ll: TT[ST[T]])(
    implicit cbf: CanBuildFrom[Nothing, T, ST[T]],
    seqLike: ST[T] => SeqLike[T, ST[T]],
    traversableLike: TT[ST[T]] => TraversableLike[ST[T], TT[ST[T]]]
) extends Iterator[ST[T]] {

  def combicount (): Int = (1 /: ll) (_ * _.length)

  val last = combicount - 1
  var iter = 0

  override def hasNext (): Boolean = iter < last

  override def next (): ST[T] = {
    val res = combination (ll, iter, cbf())
    iter += 1
    res
  }

  def combination (xx: TT[ST[T]], i: Int, builder: Builder[T, ST[T]]): ST[T] =
    if (xx.isEmpty) builder.result
    else combination (xx.tail, i / xx.head.length, builder += xx.head (i % xx.head.length) )
}
This sort of works:
scala> new Cartesian[String, Vector, Vector](Vector(Vector("a"), Vector("xy"), Vector("AB")))
res0: Cartesian[String,Vector,Vector] = empty iterator
scala> new Cartesian[String, Array, Array](Array(Array("a"), Array("xy"), Array("AB")))
res1: Cartesian[String,Array,Array] = empty iterator
I needed to explicitly pass the types because of bug https://issues.scala-lang.org/browse/SI-3343
One thing to note is that this is better than using existential types, because calling next on the iterator returns the right type, and not Seq[Any].
There are several drawbacks here:
If the container is not a subclass of the required type, it is converted to one, which costs in performance
The algorithm is not completely generic. We need types to be converted to SeqLike or TraversableLike only to use a subset of functionality these types offer. So making a conversion function can be tricky.
What if some capabilities can be interpreted differently in different contexts? For example, a rectangle has two 'length' properties (width and height)
Now for the alternative solution. We note that we don't actually care about the types of collections, just their capabilities:
TT should have foldLeft, get(i: Int) (to get head/tail)
ST should have length, get(i: Int) and a Builder
So we can encode these:
trait HasGet[T, CC[_]] {
  def get(cc: CC[T], i: Int): T
}

object HasGet {
  implicit def seqLikeHasGet[T, CC[X] <: SeqLike[X, _]] = new HasGet[T, CC] {
    def get(cc: CC[T], i: Int): T = cc(i)
  }
  implicit def arrayHasGet[T] = new HasGet[T, Array] {
    def get(cc: Array[T], i: Int): T = cc(i)
  }
}

trait HasLength[CC] {
  def length(cc: CC): Int
}

object HasLength {
  implicit def seqLikeHasLength[CC <: SeqLike[_, _]] = new HasLength[CC] {
    def length(cc: CC) = cc.length
  }
  implicit def arrayHasLength[T] = new HasLength[Array[T]] {
    def length(cc: Array[T]) = cc.length
  }
}

trait HasFold[T, CC[_]] {
  def foldLeft[A](cc: CC[T], zero: A)(op: (A, T) => A): A
}

object HasFold {
  implicit def seqLikeHasFold[T, CC[X] <: SeqLike[X, _]] = new HasFold[T, CC] {
    def foldLeft[A](cc: CC[T], zero: A)(op: (A, T) => A): A = cc.foldLeft(zero)(op)
  }
  implicit def arrayHasFold[T] = new HasFold[T, Array] {
    def foldLeft[A](cc: Array[T], zero: A)(op: (A, T) => A): A = {
      var i = 0
      var result = zero
      while (i < cc.length) {
        result = op(result, cc(i))
        i += 1
      }
      result
    }
  }
}
(Strictly speaking, HasFold is not required, since it can be implemented in terms of length and get, but I added it here so the algorithm translates more cleanly.)
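To illustrate that remark, a fold can be derived from HasLength and HasGet alone, along these lines (foldVia is just a name for this sketch):
// Sketch only: HasFold derived from HasLength and HasGet.
def foldVia[T, CC[_], A](cc: CC[T], zero: A)(op: (A, T) => A)(
    implicit len: HasLength[CC[T]], get: HasGet[T, CC]): A = {
  var i = 0
  var acc = zero
  while (i < len.length(cc)) {
    acc = op(acc, get.get(cc, i))
    i += 1
  }
  acc
}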
Now the algorithm is:
class Cartesian[T, ST[_], TT[Y]](val ll: TT[ST[T]])(
    implicit cbf: CanBuildFrom[Nothing, T, ST[T]],
    stHasLength: HasLength[ST[T]],
    stHasGet: HasGet[T, ST],
    ttHasFold: HasFold[ST[T], TT],
    ttHasGet: HasGet[ST[T], TT],
    ttHasLength: HasLength[TT[ST[T]]]
) extends Iterator[ST[T]] {

  def combicount (): Int = ttHasFold.foldLeft(ll, 1)((a,l) => a * stHasLength.length(l))

  val last = combicount - 1
  var iter = 0

  override def hasNext (): Boolean = iter < last

  override def next (): ST[T] = {
    val res = combination (ll, 0, iter, cbf())
    iter += 1
    res
  }

  def combination (xx: TT[ST[T]], j: Int, i: Int, builder: Builder[T, ST[T]]): ST[T] =
    if (ttHasLength.length(xx) == j) builder.result
    else {
      val head = ttHasGet.get(xx, j)
      val headLength = stHasLength.length(head)
      combination (xx, j + 1, i / headLength, builder += stHasGet.get(head, (i % headLength) ))
    }
}
And use:
scala> new Cartesian[String, Vector, List](List(Vector("a"), Vector("xy"), Vector("AB")))
res6: Cartesian[String,Vector,List] = empty iterator
scala> new Cartesian[String, Array, Array](Array(Array("a"), Array("xy"), Array("AB")))
res7: Cartesian[String,Array,Array] = empty iterator
Scalaz probably has all of this predefined for you; unfortunately, I don't know it well.
(Again, I needed to pass the types explicitly because inference doesn't infer the right kind.)
The benefit is that the algorithm is now completely generic, and there is no need for implicit conversions from Array to WrappedArray in order for it to work.
Exercise: define this for tuples ;-)