Thoughts about a collection method that splits multiple times given a predicate

Thoughts about a collection method that splits multiple times given a predicate - scala

I am looking for a collections method which splits at a given pairwise condition, e.g.
val x = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)
implicit class RichIterableLike[A, CC[~] <: Iterable[~]](it: CC[A]) {
def groupWith(fun: (A, A) => Boolean): Iterator[CC[A]] = new Iterator[CC[A]] {
def hasNext: Boolean = ???
def next(): CC[A] = ???
}
}
assert(x.groupWith(_._1 != _._1).toList ==
List(List("a" -> 1, "a" -> 2), List("b" -> 3), List("c" -> 4, "c" -> 5))
)
So this is sort of a recursive span.
While I'm capable of implementing the ???, I wonder
if something already exists in collections that I'm overseeing
what that method should be called; groupWith doesn't sound right. It should be concise, but somehow reflect that the function argument operates on pairs. groupWhere would be a bit closer I guess, but still not clear.
actually I guess when using groupWith, the predicate logic should be inverted, so I would use x.groupWith(_._1 == _._1)
thoughts about the types. Returning an Iterator[CC[A]] looks reasonable to me. Perhaps it should take a CanBuildFrom and return an Iterator[To]?

You can also write a version that uses tailrec/pattern matching:
def groupWith[A](s: Seq[A])(p: (A, A) => Boolean): Seq[Seq[A]] = {
#tailrec
def rec(xs: Seq[A], acc: Seq[Seq[A]] = Vector.empty): Seq[Seq[A]] = {
(xs.headOption, acc.lastOption) match {
case (None, _) => acc
case (Some(a), None) => rec(xs.tail, acc :+ Vector(a))
case (Some(a), Some(group)) if p(a, group.last) => rec(xs.tail, acc.init :+ (acc.last :+ a))
case (Some(a), Some(_)) => rec(xs.tail, acc :+ Vector(a))
}
}
rec(s)
}

So here is my suggestion. I sticked to groupWith, because spans is not very descriptive in my opinion. It is true that groupBy has very different semantics, however there is grouped(size: Int) which is similar.
I tried to create my iterator purely based on combining existing iterators, but this got messy, so here is the more low level version:
import scala.collection.generic.CanBuildFrom
import scala.annotation.tailrec
import language.higherKinds
object Extensions {
private final class GroupWithIterator[A, CC[~] <: Iterable[~], To](
it: CC[A], p: (A, A) => Boolean)(implicit cbf: CanBuildFrom[CC[A], A, To])
extends Iterator[To] {
private val peer = it.iterator
private var consumed = true
private var elem = null.asInstanceOf[A]
def hasNext: Boolean = !consumed || peer.hasNext
private def pop(): A = {
if (!consumed) return elem
if (!peer.hasNext)
throw new NoSuchElementException("next on empty iterator")
val res = peer.next()
elem = res
consumed = false
res
}
 
def next(): To = {
val b = cbf()
#tailrec def loop(pred: A): Unit = {
b += pred
consumed = true
if (!peer.isEmpty) {
val succ = pop()
if (p(pred, succ)) loop(succ)
}
}
loop(pop())
b.result()
}
}
 
implicit final class RichIterableLike[A, CC[~] <: Iterable[~]](val it: CC[A])
extends AnyVal {
/** Clumps the collection into groups based on a predicate which determines
* if successive elements belong to the same group.
*
* For example:
* {{
* val x = List("a", "a", "b", "a", "b", "b")
* x.groupWith(_ == _).to[Vector]
* }}
*
* produces `Vector(List("a", "a"), List("b"), List("a"), List("b", "b"))`.
*
* #param p a function which is evaluated with successive pairs of
* the input collection. As long as the predicate holds
* (the function returns `true`), elements are lumped together.
* When the predicate becomes `false`, a new group is started.
*
* #param cbf a builder factory for the group type
* #tparam To the group type
* #return an iterator over the groups.
*/
def groupWith[To](p: (A, A) => Boolean)
(implicit cbf: CanBuildFrom[CC[A], A, To]): Iterator[To] =
new GroupWithIterator(it, p)
}
}
That is, the predicate is inverted as opposed to the question.
import Extensions._
val x = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)
x.groupWith(_._1 == _._1).to[Vector]
// -> Vector(List((a,1), (a,2)), List((b,3)), List((c,4), (c,5)))

You could achieve it with a fold too right? Here is an unoptimized version:
def groupWith[A](ls: List[A])(p: (A, A) => Boolean): List[List[A]] =
ls.foldLeft(List[List[A]]()) { (acc, x) =>
if(acc.isEmpty)
List(List(x))
else
if(p(acc.last.head, x))
acc.init ++ List(acc.last ++ List(x))
else
acc ++ List(List(x))
}
val x = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5, "a" -> 4)
println(groupWith(x)(_._1 == _._1))
//List(List((a,1), (a,2)), List((b,3)), List((c,4), (c,5)), List((a,4)))

Related

how to implement takeUntil with Scala lazy collections

I have an expensive function which I want to run as few times as possible with the following requirement:
I have several input values to try
If the function returns a value below a given threshold, I don't want to try other inputs
if no result is below the threshold, I want to take the result with the minimal output
I could not find a nice solution using Iterator's takeWhile/dropWhile, because I want to have the first matching element included. just ended up with the following solution:
val pseudoResult = Map("a" -> 0.6,"b" -> 0.2, "c" -> 1.0)
def expensiveFunc(s:String) : Double = {
pseudoResult(s)
}
val inputsToTry = Seq("a","b","c")
val inputIt = inputsToTry.iterator
val results = mutable.ArrayBuffer.empty[(String, Double)]
val earlyAbort = 0.5 // threshold
breakable {
while (inputIt.hasNext) {
val name = inputIt.next()
val res = expensiveFunc(name)
results += Tuple2(name,res)
if (res<earlyAbort) break()
}
}
println(results) // ArrayBuffer((a,0.6), (b,0.2))
val (name, bestResult) = results.minBy(_._2) // (b, 0.2)
If i set val earlyAbort = 0.1, the result should still be (b, 0.2) without evaluating all the cases again.

You can make use of Stream to achieve what you are looking for, remember Stream is some kind of lazy collection, that evaluate operations on demand.
Here is the scala Stream documentation.
You only need to do this:
val pseudoResult = Map("a" -> 0.6,"b" -> 0.2, "c" -> 1.0)
val earlyAbort = 0.5
def expensiveFunc(s: String): Double = {
println(s"Evaluating for $s")
pseudoResult(s)
}
val inputsToTry = Seq("a","b","c")
val results = inputsToTry.toStream.map(input => input -> expensiveFunc(input))
val finalResult = results.find { case (k, res) => res < earlyAbort }.getOrElse(results.minBy(_._2))
If find does not get any value, you can use the same stream to find the min, and the function is not evaluated again, this is because of memoization:
The Stream class also employs memoization such that previously computed values are converted from Stream elements to concrete values of type A
Consider that this code will fail if the original collection was empty, if you want to support empty collections you should replace minBy with sortBy(_._2).headOption and getOrElse by orElse:
val finalResultOpt = results.find { case (k, res) => res < earlyAbort }.orElse(results.sortBy(_._2).headOption)
And the output for this is:
Evaluating for a
Evaluating for b
finalResult: (String, Double) = (b,0.2)
finalResultOpt: Option[(String, Double)] = Some((b,0.2))

The clearest, simplest, thing to do is fold over the input, passing forward only the current best result.
val inputIt :Iterator[String] = inputsToTry.iterator
val earlyAbort = 0.5 // threshold
inputIt.foldLeft(("",Double.MaxValue)){ case (low,name) =>
if (low._2 < earlyAbort) low
else Seq(low, (name, expensiveFunc(name))).minBy(_._2)
}
//res0: (String, Double) = (b,0.2)
It calls on expensiveFunc() only as many times as is needed, but it does walk through the entire input iterator. If that's still too onerous (lots of input) then I'd go with a tail-recursive method.
val inputIt :Iterator[String] = inputsToTry.iterator
val earlyAbort = 0.5 // threshold
def bestMin(low :(String,Double) = ("",Double.MaxValue)) :(String,Double) = {
if (inputIt.hasNext) {
val name = inputIt.next()
val res = expensiveFunc(name)
if (res < earlyAbort) (name, res)
else if (res < low._2) bestMin((name,res))
else bestMin(low)
} else low
}
bestMin() //res0: (String, Double) = (b,0.2)

Use view in your input list:
try the following:
val pseudoResult = Map("a" -> 0.6, "b" -> 0.2, "c" -> 1.0)
def expensiveFunc(s: String): Double = {
println(s"executed for ${s}")
pseudoResult(s)
}
val inputsToTry = Seq("a", "b", "c")
val earlyAbort = 0.5 // threshold
def doIt(): List[(String, Double)] = {
inputsToTry.foldLeft(List[(String, Double)]()) {
case (n, name) =>
val res = expensiveFunc(name)
if(res < earlyAbort) {
return n++List((name, res))
}
n++List((name, res))
}
}
val (name, bestResult) = doIt().minBy(_._2)
println(name)
println(bestResult)
The output:
executed for a
executed for b
b
0.2
As you can see, only a and b are evaluated, and not c.

This is one of the use-cases for tail-recursion:
import scala.annotation.tailrec
val pseudoResult = Map("a" -> 0.6,"b" -> 0.2, "c" -> 1.0)
def expensiveFunc(s:String) : Double = {
pseudoResult(s)
}
val inputsToTry = Seq("a","b","c")
val earlyAbort = 0.5 // threshold
#tailrec
def f(s: Seq[String], result: Map[String, Double] = Map()): Map[String, Double] = s match {
case Nil => result
case h::t =>
val expensiveCalculation = expensiveFunc(h)
val intermediateResult = result + (h -> expensiveCalculation)
if(expensiveCalculation < earlyAbort) {
intermediateResult
} else {
f(t, intermediateResult)
}
}
val result = f(inputsToTry)
println(result) // Map(a -> 0.6, b -> 0.2)
val (name, bestResult) = f(inputsToTry).minBy(_._2) // ("b", 0.2)

If you implement takeUntil and use it, you'd still have to go through the list once more to get the lowest one if you don't find what you are looking for. Probably a better approach would be to have a function that combines find with reduceOption, returning early if something is found or else returning the result of reducing the collection to a single item (in your case, finding the smallest one).
The result is comparable with what you could achieve using a Stream, as highlighted in a previous reply, but avoids leveraging memoization, which can be cumbersome for very large collections.
A possible implementation could be the following:
import scala.annotation.tailrec
def findOrElse[A](it: Iterator[A])(predicate: A => Boolean,
orElse: (A, A) => A): Option[A] = {
#tailrec
def loop(elseValue: Option[A]): Option[A] = {
if (!it.hasNext) elseValue
else {
val next = it.next()
if (predicate(next)) Some(next)
else loop(Option(elseValue.fold(next)(orElse(_, next))))
}
}
loop(None)
}
Let's add our inputs to test this:
def f1(in: String): Double = {
println("calling f1")
Map("a" -> 0.6, "b" -> 0.2, "c" -> 1.0, "d" -> 0.8)(in)
}
def f2(in: String): Double = {
println("calling f2")
Map("a" -> 0.7, "b" -> 0.6, "c" -> 1.0, "d" -> 0.8)(in)
}
val inputs = Seq("a", "b", "c", "d")
As well as our helpers:
def apply[IN, OUT](in: IN, f: IN => OUT): (IN, OUT) =
in -> f(in)
def threshold[A](a: (A, Double)): Boolean =
a._2 < 0.5
def compare[A](a: (A, Double), b: (A, Double)): (A, Double) =
if (a._2 < b._2) a else b
We can now run this and see how it goes:
val r1 = findOrElse(inputs.iterator.map(apply(_, f1)))(threshold, compare)
val r2 = findOrElse(inputs.iterator.map(apply(_, f2)))(threshold, compare)
val r3 = findOrElse(Map.empty[String, Double].iterator)(threshold, compare)
r1 is Some(b, 0.2), r2 is Some(b, 0.6) and r3 is (reasonably) None. In the first case, since we use a lazy iterator and terminate early, we only invoke f1 twice.
You can have a look at the results and can play with this code here on Scastie.

What is the efficient way to remove subsets from a List[List[String]]?

I have a ListBuffer of List[String], val tList = ListBuffer[TCount] where TCount is case class TCount(l: List[String], c: Long). I want to find those list l from tList which are not the subset of any other element of tlist and their c value is less than their superset c value. The following program works but I have to use two for loop that makes the code inefficient. Is there any better approach I can use to make the code efficient?
val _arr = tList.toArray
for (i <- 0 to (_arr.length - 1)) {
val il = _arr(i).l.toSet
val ic = _arr(i).c
for (j <- 0 to (_arr.length - 1)) {
val jl = _arr(j).toSet
val jc = _arr(j).c
if (i != j && il.subsetOf(jl) && ic >= jc) {
tList.-=(_arr(i))
}
}
}

Inspired by the set-trie comment:
import scala.collection.SortedMap
class SetTrie[A](val flag: Boolean, val children: SortedMap[A, SetTrie[A]])(implicit val ord: Ordering[A]) {
def insert(xs: List[A]): SetTrie[A] = xs match {
case Nil => new SetTrie(true, children)
case a :: rest => {
val current = children.getOrElse(a, new SetTrie[A](false, SortedMap.empty))
val inserted = current.insert(rest)
new SetTrie(flag, children + (a -> inserted))
}
}
def containsSuperset(xs: List[A], strict: Boolean): Boolean = xs match {
case Nil => !children.isEmpty || (!strict && flag)
case a :: rest => {
children.get(a).map(_.containsSuperset(rest, strict)).getOrElse(false) ||
children.takeWhile(x => ord.lt(x._1, a)).exists(_._2.containsSuperset(xs, false))
}
}
}
def removeSubsets[A : Ordering](xss: List[List[A]]): List[List[A]] = {
val sorted = xss.map(_.sorted)
val setTrie = sorted.foldLeft(new SetTrie[A](false, SortedMap.empty)) { case (st, xs) => st.insert(xs) }
sorted.filterNot(xs => setTrie.containsSuperset(xs, true))
}

Here is a method that relies on a data structure somewhat similar to Set-Trie, but which stores more subsets explicitly. It provides worse compression, but is faster during lookup:
def findMaximal(lists: List[List[String]]): List[List[String]] = {
import collection.mutable.HashMap
class Node(
var isSubset: Boolean = false,
val children: HashMap[String, Node] = HashMap.empty
) {
def insert(xs: List[String], isSubs: Boolean): Unit = if (xs.isEmpty) {
isSubset |= isSubs
} else {
var isSubsSubs = false || isSubs
for (h :: t <- xs.tails) {
children.getOrElseUpdate(h, new Node()).insert(t, isSubsSubs)
isSubsSubs = true
}
}
def isMaximal(xs: List[String]): Boolean = xs match {
case Nil => children.isEmpty && !isSubset
case h :: t => children(h).isMaximal(t)
}
override def toString: String = {
if (children.isEmpty) "#"
else children.flatMap{
case (k,v) => {
if (v.children.isEmpty) List(k)
else (k + ":") :: v.toString.split("\n").map(" " + _).toList
}
}.mkString("\n")
}
}
val listsWithSorted = for (x <- lists) yield (x, x.sorted)
val root = new Node()
for ((x, s) <- listsWithSorted) root.insert(s, false)
// println(root)
for ((x, s) <- listsWithSorted; if root.isMaximal(s)) yield x
}
Note that I'm allowed to do any kind of mutable nonsense inside the body of the method, because the mutable trie data structure never escapes the scope of the method, and can therefore not be inadvertently shared with another thread.
Here is an example with sets of characters (converted to lists of strings):
println(findMaximal(List(
"ab", "abc", "ac", "abd",
"ade", "efd", "adf", "bafd",
"abd", "fda", "dba", "dbe"
).map(_.toList.map(_.toString))))
The output is:
List(
List(a, b, c),
List(a, d, e),
List(e, f, d),
List(b, a, f, d),
List(d, b, e)
)
so indeed, the non-maximal elements ab, ac, abd, adf, fda and dba are eliminated.
And here is what my not-quite-set-trie data structure looks like (child nodes are indented):
e:
f
b:
e
d:
e
f
c
f
d:
e:
f
f
a:
e
b:
d:
f
c
f
d:
e
f
c
f
c
f

Not sure if you can avoid the complexity, but, I guess I'd write like this:
val tList = List(List(1, 2, 3), List(3, 2, 1), List(9, 4, 7), List(3, 5, 6), List(1, 5, 6), List(6, 1, 5))
val tSet = tList.map(_.toSet)
def result = tSet.filterNot { sub => tSet.count(_.subsetOf(sub)) > 1 }

Here's one approach:
Create an indexed Map for identifying the original List elements
Turn Map of List-elements into Map of Sets (with index)
Generate combinations of the Map elements and use a custom filter to capture the elements that are subset of others
Remove those subset elements from the Map of Sets and retrieve remaining elements from the Map of Lists via the index
Sample code:
type TupIntSet = Tuple2[Int, Set[Int]]
def subsetFilter(ls: List[TupIntSet]): List[TupIntSet] =
if ( ls.size != 2 ) List.empty[TupIntSet] else
if ( ls(0)._2 subsetOf ls(1)._2 ) List[TupIntSet]((ls(0)._1, ls(0)._2)) else
if ( ls(1)._2 subsetOf ls(0)._2 ) List[TupIntSet]((ls(1)._1, ls(1)._2)) else
List.empty[TupIntSet]
val tList = List(List(1,2), List(1,2,3), List(3,4,5), List(5,4,3), List(2,3,4), List(6,7))
val listMap = (Stream from 1).zip(tList).toMap
val setMap = listMap.map{ case (i, l) => (i, l.toSet) }
val tSubsets = setMap.toList.combinations(2).toSet.flatMap(subsetFilter)
val resultList = (setMap.toSet -- tSubsets).map(_._1).map(listMap.getOrElse(_, ""))
// resultList: scala.collection.immutable.Set[java.io.Serializable] =
// Set(List(5, 4, 3), List(2, 3, 4), List(6, 7), List(1, 2, 3))

Conditionally apply a function in Scala - How to write this function?

I need to conditionally apply a function f1 to the elements in a collection depending on the result of a function f2 that takes each element as an argument and returns a boolean. If f2(e) is true, f1(e) will be applied otherwise 'e' will be returned "as is".
My intent is to write a general-purpose function able to work on any kind of collection.
c: C[E] // My collection
f1 = ( E => E ) // transformation function
f2 = ( E => Boolean ) // conditional function
I cannot come to a solution. Here's my idea, but I'm afraid I'm in high-waters
/* Notice this code doesn't compile ~ partially pseudo-code */
conditionallyApply[E,C[_](c: C[E], f2: E => Boolean, f1: E => E): C[E] = {
#scala.annotation.tailrec
def loop(a: C[E], c: C[E]): C[E] = {
c match {
case Nil => a // Here head / tail just express the idea, but I want to use a generic collection
case head :: tail => go(a ++ (if f2(head) f1(head) else head ), tail)
}
}
loop(??, c) // how to get an empty collection of the same type as the one from the input?
}
Could any of you enlighten me?

This looks like a simple map of a Functor. Using scalaz:
def condMap[F[_],A](fa: F[A])(f: A => A, p: A => Boolean)(implicit F:Functor[F]) =
F.map(fa)(x => if (p(x)) f(x) else x)

Not sure why you would need scalaz for something so pedestrian.
// example collection and functions
val xs = 1 :: 2 :: 3 :: 4 :: Nil
def f1(v: Int) = v + 1
def f2(v: Int) = v % 2 == 0
// just conditionally transform inside a map
val transformed = xs.map(x => if (f2(x)) f1(x) else x)

Without using scalaz, you can use the CanBuildFrom pattern. This is exactly what is used in the standard collections library. Of course, in your specific case, this is probably over-engineered as a simple call to map is enough.
import scala.collection.generic._
def cmap[A, C[A] <: Traversable[A]](col: C[A])(f: A ⇒ A, p: A ⇒ Boolean)(implicit bf: CanBuildFrom[C[A], A, C[A]]): C[A] = {
val b = bf(col)
b.sizeHint(col)
for (x <- col) if(p(x)) b += f(x) else b += x
b.result
}
And now the usage:
scala> def f(i: Int) = 0
f: (i: Int)Int
scala> def p(i: Int) = i % 2 == 0
p: (i: Int)Boolean
scala> cmap(Seq(1, 2, 3, 4))(f, p)
res0: Seq[Int] = List(1, 0, 3, 0)
scala> cmap(List(1, 2, 3, 4))(f, p)
res1: List[Int] = List(1, 0, 3, 0)
scala> cmap(Set(1, 2, 3, 4))(f, p)
res2: scala.collection.immutable.Set[Int] = Set(1, 0, 3)
Observe how the return type is always the same as the one provided.
The function could be nicely encapsulated in an implicit class, using the "pimp my library" pattern.

For something like this you can use an implicit class. They were added just for this reason, to enhance libraries you can't change.
It would work like this:
object ImplicitStuff {
implicit class SeqEnhancer[A](s:Seq[A]) {
def transformIf( cond : A => Boolean)( f : A => A ):Seq[A] =
s.map{ x => if(cond(x)) f(x) else x }
}
def main(a:Array[String]) = {
val s = Seq(1,2,3,4,5,6,7)
println(s.transformIf(_ % 2 ==0){ _ * 2})
// result is (1, 4, 3, 8, 5, 12, 7)
}
}
Basically if you call a method that does not exists in the object you're calling it in (in this case, Seq), it will check if there's an implicit class that implements it, but it looks like a built in method.

Sequence with Streams in Scala

Suppose there is a sequence a[i] = f(a[i-1], a[i-2], ... a[i-k]). How would you code it using streams in Scala?

It will be possible to generalize it for any k, using an array for a and another k parameter, and having, f.i., the function with a rest... parameter.
def next(a1:Any, ..., ak:Any, f: (Any, ..., Any) => Any):Stream[Any] {
val n = f(a1, ..., ak)
Stream.cons(n, next(a2, ..., n, f))
}
val myStream = next(init1, ..., initk)
in order to have the 1000th do next.drop(1000)
An Update to show how this could be done with varargs. Beware that there is no arity check for the passed function:
object Test extends App {
def next(a:Seq[Long], f: (Long*) => Long): Stream[Long] = {
val v = f(a: _*)
Stream.cons(v, next(a.tail ++ Array(v), f))
}
def init(firsts:Seq[Long], rest:Seq[Long], f: (Long*) => Long):Stream[Long] = {
rest match {
case Nil => next(firsts, f)
case x :: xs => Stream.cons(x,init(firsts, xs, f))
}
}
def sum(a:Long*):Long = {
a.sum
}
val myStream = init(Seq[Long](1,1,1), Seq[Long](1,1,1), sum)
myStream.take(12).foreach(println)
}

Is this OK?
(a[i] = f(a[i-k], a[i-k+1], ... a[i-1]) instead of a[i] = f(a[i-1], a[i-2], ... a[i-k]), since I prefer to this way)
/**
Generating a Stream[T] by the given first k items and a function map k items to the next one.
*/
def getStream[T](f : T => Any,a : T*): Stream[T] = {
def invoke[T](fun: T => Any, es: T*): T = {
if(es.size == 1) fun.asInstanceOf[T=>T].apply(es.head)
else invoke(fun(es.head).asInstanceOf[T => Any],es.tail :_*)
}
Stream.iterate(a){ es => es.tail :+ invoke(f,es: _*)}.map{ _.head }
}
For example, the following code to generate Fibonacci sequence.
scala> val fn = (x: Int, y: Int) => x+y
fn: (Int, Int) => Int = <function2>
scala> val fib = getStream(fn.curried,1,1)
fib: Stream[Int] = Stream(1, ?)
scala> fib.take(10).toList
res0: List[Int] = List(1, 1, 2, 3, 5, 8, 13, 21, 34, 55)
The following code can generate a sequence {an} where a1 = 1, a2 = 2, a3 = 3, a(n+3） = a(n) + 2a(n+1) + 3a(n+2).
scala> val gn = (x: Int, y: Int, z: Int) => x + 2*y + 3*z
gn: (Int, Int, Int) => Int = <function3>
scala> val seq = getStream(gn.curried,1,2,3)
seq: Stream[Int] = Stream(1, ?)
scala> seq.take(10).toList
res1: List[Int] = List(1, 2, 3, 14, 50, 181, 657, 2383, 8644, 31355)

The short answer, that you are probably looking for, is a pattern to define your Stream once you have fixed a chosen k for the arity of f (i.e. you have a fixed type for f). The following pattern gives you a Stream which n-th element is the term a[n] of your sequence:
def recStreamK [A](f : A ⇒ A ⇒ ... A) (x1:A) ... (xk:A):Stream[A] =
x1 #:: recStreamK (f) (x2)(x3) ... (xk) (f(x1)(x2) ... (xk))
(credit : it is very close to the answer of andy petrella, except that the initial elements are set up correctly, and consequently the rank in the Stream matches that in the sequence)
If you want to generalize over k, this is possible in a type-safe manner (with arity checking) in Scala, using prioritized overlapping implicits. The code (˜80 lines) is available as a gist here. I'm afraid I got a little carried away, and explained it as an detailed & overlong blog post there.

Unfortunately, we cannot generalize over number and be type safe at the same time. So we’ll have to do it all manually:
def seq2[T, U](initials: Tuple2[T, T]) = new {
def apply(fun: Function2[T, T, T]): Stream[T] = {
initials._1 #::
initials._2 #::
(apply(fun) zip apply(fun).tail).map {
case (a, b) => fun(a, b)
}
}
}
And we get def fibonacci = seq2((1, 1))(_ + _).
def seq3[T, U](initials: Tuple3[T, T, T]) = new {
def apply(fun: Function3[T, T, T, T]): Stream[T] = {
initials._1 #::
initials._2 #::
initials._3 #::
(apply(fun) zip apply(fun).tail zip apply(fun).tail.tail).map {
case ((a, b), c) => fun(a, b, c)
}
}
}
def tribonacci = seq3((1, 1, 1))(_ + _ + _)
… and up to 22.
I hope the pattern is getting clear somehow. (We could of course improve and exchange the initials tuple with separate arguments. This saves us a pair of parentheses later when we use it.) If some day in the future, the Scala macro language arrives, this hopefully will be easier to define.

Lazy Cartesian product of several Seqs in Scala

I implemented a simple method to generate Cartesian product on several Seqs like this:
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq[T](s)
}
class RichSeq[T](s: Seq[T]) {
import RichSeq._
def cartesian(ss: Seq[Seq[T]]): Seq[Seq[T]] = {
ss.toList match {
case Nil => Seq(s)
case s2 :: Nil => {
for (e <- s) yield s2.map(e2 => Seq(e, e2))
}.flatten
case s2 :: tail => {
for (e <- s) yield s2.cartesian(tail).map(seq => e +: seq)
}.flatten
}
}
}
Obviously, this one is really slow, as it calculates the whole product at once. Did anyone implement a lazy solution for this problem in Scala?
UPD
OK, So I implemented a reeeeally stupid, but working version of an iterator over a Cartesian product. Posting here for future enthusiasts:
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq(s)
}
class RichSeq[T](s: Seq[T]) {
def lazyCartesian(ss: Seq[Seq[T]]): Iterator[Seq[T]] = new Iterator[Seq[T]] {
private[this] val seqs = s +: ss
private[this] var indexes = Array.fill(seqs.length)(0)
private[this] val counts = Vector(seqs.map(_.length - 1): _*)
private[this] var current = 0
def next(): Seq[T] = {
val buffer = ArrayBuffer.empty[T]
if (current != 0) {
throw new NoSuchElementException("no more elements to traverse")
}
val newIndexes = ArrayBuffer.empty[Int]
var inside = 0
for ((index, i) <- indexes.zipWithIndex) {
buffer.append(seqs(i)(index))
newIndexes.append(index)
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
inside = inside + 1
}
}
current = inside
if (current < seqs.length) {
for (i <- (0 to current).reverse) {
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
newIndexes(i) = 0
} else if (newIndexes(i) < counts(i)) {
newIndexes(i) = newIndexes(i) + 1
}
}
current = 0
indexes = newIndexes.toArray
}
buffer.result()
}
def hasNext: Boolean = current != seqs.length
}
}

Here's my solution to the given problem. Note that the laziness is simply caused by using .view on the "root collection" of the used for comprehension.
scala> def combine[A](xs: Traversable[Traversable[A]]): Seq[Seq[A]] =
| xs.foldLeft(Seq(Seq.empty[A])){
| (x, y) => for (a <- x.view; b <- y) yield a :+ b }
combine: [A](xs: Traversable[Traversable[A]])Seq[Seq[A]]
scala> combine(Set(Set("a","b","c"), Set("1","2"), Set("S","T"))) foreach (println(_))
List(a, 1, S)
List(a, 1, T)
List(a, 2, S)
List(a, 2, T)
List(b, 1, S)
List(b, 1, T)
List(b, 2, S)
List(b, 2, T)
List(c, 1, S)
List(c, 1, T)
List(c, 2, S)
List(c, 2, T)
To obtain this, I started from the function combine defined in https://stackoverflow.com/a/4515071/53974, passing it the function (a, b) => (a, b). However, that didn't quite work directly, since that code expects a function of type (A, A) => A. So I just adapted the code a bit.

These might be a starting point:
Cartesian product of two lists
Expand a Set[Set[String]] into Cartesian Product in Scala
https://stackoverflow.com/questions/6182126/im-learning-scala-would-it-be-possible-to-get-a-little-code-review-and-mentori

What about:
def cartesian[A](list: List[Seq[A]]): Iterator[Seq[A]] = {
if (list.isEmpty) {
Iterator(Seq())
} else {
list.head.iterator.flatMap { i => cartesian(list.tail).map(i +: _) }
}
}
Simple and lazy ;)

def cartesian[A](list: List[List[A]]): List[List[A]] = {
list match {
case Nil => List(List())
case h :: t => h.flatMap(i => cartesian(t).map(i :: _))
}
}

You can look here: https://stackoverflow.com/a/8318364/312172 how to translate a number into an index of all possible values, without generating every element.
This technique can be used to implement a stream.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Thoughts about a collection method that splits multiple times given a predicate - scala

Related

how to implement takeUntil with Scala lazy collections

What is the efficient way to remove subsets from a List[List[String]]?

Conditionally apply a function in Scala - How to write this function?

Sequence with Streams in Scala

Lazy Cartesian product of several Seqs in Scala

Categories

Resources