I want to group a large Stream[F, A] into a Stream[Stream[F, A]] with at most n elements in each inner stream.
This is what I did: basically, pipe chunks into a Queue[F, Queue[F, Chunk[A]]], and then yield the queue elements as the result stream.
implicit class StreamSyntax[F[_], A](s: Stream[F, A])(
implicit F: Concurrent[F]) {
def groupedPipe(
lastQRef: Ref[F, Queue[F, Option[Chunk[A]]]],
n: Int): Pipe[F, A, Stream[F, A]] = { in =>
val initQs =
Queue.unbounded[F, Option[Queue[F, Option[Chunk[A]]]]].flatMap { qq =>
Queue.bounded[F, Option[Chunk[A]]](1).flatMap { q =>
lastQRef.set(q) *> qq.enqueue1(Some(q)).as(qq -> q)
}
}
Stream.eval(initQs).flatMap {
case (qq, initQ) =>
def newQueue = Queue.bounded[F, Option[Chunk[A]]](1).flatMap { q =>
qq.enqueue1(Some(q)) *> lastQRef.set(q).as(q)
}
val evalStream = {
in.chunks
.evalMapAccumulate((0, initQ)) {
case ((i, q), c) if i + c.size >= n =>
val (l, r) = c.splitAt(n - i)
q.enqueue1(Some(l)) >> q.enqueue1(None) >> q
.enqueue1(None) >> newQueue.flatMap { nq =>
nq.enqueue1(Some(r)).as(((r.size, nq), c))
}
case ((i, q), c) if (i + c.size) < n =>
q.enqueue1(Some(c)).as(((i + c.size, q), c))
}
.attempt ++ Stream.eval {
lastQRef.get.flatMap { last =>
last.enqueue1(None) *> last.enqueue1(None)
} *> qq.enqueue1(None)
}
}
qq.dequeue.unNoneTerminate
.map(
q =>
q.dequeue.unNoneTerminate
.flatMap(Stream.chunk)
.onFinalize(
q.dequeueChunk(Int.MaxValue).unNoneTerminate.compile.drain))
.concurrently(evalStream)
}
}
def grouped(n: Int) = {
Stream.eval {
Queue.unbounded[F, Option[Chunk[A]]].flatMap { empty =>
Ref.of[F, Queue[F, Option[Chunk[A]]]](empty)
}
}.flatMap { ref =>
val p = groupedPipe(ref, n)
s.through(p)
}
}
}
But it is very complicated. Is there any simpler way?
fs2 has the chunkN and chunkLimit methods that can help with grouping:
stream.chunkN(n).map(Stream.chunk)
stream.chunkLimit(n).map(Stream.chunk)
chunkN produces chunks of size n until the end of the stream.
chunkLimit splits existing chunks and can produce chunks of variable size.
scala> Stream(1,2,3).repeat.chunkN(2).take(5).toList
res0: List[Chunk[Int]] = List(Chunk(1, 2), Chunk(3, 1), Chunk(2, 3), Chunk(1, 2), Chunk(3, 1))
scala> (Stream(1) ++ Stream(2, 3) ++ Stream(4, 5, 6)).chunkLimit(2).toList
res0: List[Chunk[Int]] = List(Chunk(1), Chunk(2, 3), Chunk(4, 5), Chunk(6))
In addition to the already mentioned chunkN, also consider using groupWithin (fs2 1.0.1):
def groupWithin[F2[x] >: F[x]](n: Int, d: FiniteDuration)(implicit timer: Timer[F2], F: Concurrent[F2]): Stream[F2, Chunk[O]]
Divides this stream into groups of elements received within a time window, or limited by the number of elements, whichever happens first. Empty groups, which can occur if no elements can be pulled from upstream in a given time window, will not be emitted.
Note: a time window starts each time downstream pulls.
I'm not sure why you'd want this to be nested streams, since the requirement is to have "at most n elements" in one batch - which implies that you're keeping track of a finite number of elements (which is exactly what a Chunk is for). Either way, a Chunk can always be represented as a Stream with Stream.chunk:
val chunks: Stream[F, Chunk[O]] = ???
val streamOfStreams: Stream[F, Stream[F, O]] = chunks.map(Stream.chunk)
Here's a complete example of how to use groupWithin:
import cats.implicits._
import cats.effect.{ExitCode, IO, IOApp}
import fs2._
import scala.concurrent.duration._
object GroupingDemo extends IOApp {
override def run(args: List[String]): IO[ExitCode] = {
Stream('a, 'b, 'c).covary[IO]
.groupWithin(2, 1.second)
.map(_.toList)
.showLinesStdOut
.compile.drain
.as(ExitCode.Success)
}
}
Outputs:
List('a, 'b)
List('c)
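Putting the two pieces of this answer together, the nested-stream shape asked for in the question can be recovered with a small sketch (the names input, n and d below are placeholders, not part of the original code):
import cats.effect.{Concurrent, Timer}
import fs2.Stream
import scala.concurrent.duration.FiniteDuration

// Group by count or time, then re-wrap each Chunk as a Stream.
def nested[F[_]: Concurrent: Timer, A](input: Stream[F, A], n: Int, d: FiniteDuration): Stream[F, Stream[F, A]] =
  input.groupWithin(n, d).map(Stream.chunk)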
Finally, I settled on a more reliable version (using Hotswap to ensure queue termination), like this:
def grouped(
innerSize: Int
)(implicit F: Async[F]): Stream[F, Stream[F, A]] = {
type InnerQueue = Queue[F, Option[Chunk[A]]]
type OuterQueue = Queue[F, Option[InnerQueue]]
def swapperInner(swapper: Hotswap[F, InnerQueue], outer: OuterQueue) = {
val innerRes =
Resource.make(Queue.unbounded[F, Option[Chunk[A]]])(_.offer(None))
swapper.swap(innerRes).flatTap(q => outer.offer(q.some))
}
def loopChunk(
gathered: Int,
curr: Queue[F, Option[Chunk[A]]],
chunk: Chunk[A],
newInnerQueue: F[InnerQueue]
): F[(Int, Queue[F, Option[Chunk[A]]])] = {
if (gathered + chunk.size > innerSize) {
val (left, right) = chunk.splitAt(innerSize - gathered)
curr.offer(left.some) >> newInnerQueue.flatMap { nq =>
loopChunk(0, nq, right, newInnerQueue)
}
} else if (gathered + chunk.size == innerSize) {
curr.offer(chunk.some) >> newInnerQueue.tupleLeft(
0
)
} else {
curr.offer(chunk.some).as(gathered + chunk.size -> curr)
}
}
val prepare = for {
outer <- Resource.eval(Queue.unbounded[F, Option[InnerQueue]])
swapper <- Hotswap.create[F, InnerQueue]
} yield outer -> swapper
Stream.resource(prepare).flatMap {
case (outer, swapper) =>
val newInner = swapperInner(swapper, outer)
val background = Stream.eval(newInner).flatMap { initQueue =>
s.chunks
.filter(_.nonEmpty)
.evalMapAccumulate(0 -> initQueue) { (state, chunk) =>
val (gathered, curr) = state
loopChunk(gathered, curr, chunk, newInner).tupleRight({})
}
.onFinalize(swapper.clear *> outer.offer(None))
}
val foreground = Stream
.fromQueueNoneTerminated(outer)
.map(i => Stream.fromQueueNoneTerminatedChunk(i))
foreground.concurrently(background)
}
}
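For reference, a minimal usage sketch (assuming grouped is exposed as syntax on the wrapped stream s, e.g. via an implicit class as in the question; the names below are illustrative only):
import cats.effect.IO
import fs2.Stream

// Consume each inner stream sequentially and print its elements.
val printGroups: IO[Unit] =
  Stream.range(0, 10).covary[IO]
    .grouped(3)                                                       // Stream[IO, Stream[IO, Int]]
    .evalMap(inner => inner.compile.toList.flatMap(xs => IO.println(xs))) // drain one inner stream at a time
    .compile
    .drain
// should print roughly: List(0, 1, 2), List(3, 4, 5), List(6, 7, 8), List(9)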
Related
I have a stream of unordered measurements that I'd like to group into batches of a fixed size, so that I can persist them efficiently later:
val measurements = for {
id <- Seq("foo", "bar", "baz")
value <- 1 to 5
} yield (id, value)
fs2.Stream.emits(scala.util.Random.shuffle(measurements)).toVector
That is, instead of:
(bar,4)
(foo,5)
(baz,3)
(baz,5)
(baz,4)
(foo,2)
(bar,2)
(foo,4)
(baz,1)
(foo,1)
(foo,3)
(bar,1)
(bar,5)
(bar,3)
(baz,2)
I'd like to have the following structure for a batch size equal to 3:
(bar,[4,2,1])
(foo,[5,2,4])
(baz,[3,5,4])
(baz,[1,2])
(foo,[1,3])
(bar,[5,3])
Is there a simple, idiomatic way to achieve this in FS2? I know there's a groupAdjacentBy function, but this will take into account neighbouring items only.
I'm on 0.10.5 at the moment.
This can be achieved with fs2 Pull:
import cats.data.{NonEmptyList => Nel}
import fs2._
object GroupingByKey {
def groupByKey[F[_], K, V](limit: Int): Pipe[F, (K, V), (K, Nel[V])] = {
require(limit >= 1)
def go(state: Map[K, List[V]]): Stream[F, (K, V)] => Pull[F, (K, Nel[V]), Unit] = _.pull.uncons1.flatMap {
case Some(((key, num), tail)) =>
val prev = state.getOrElse(key, Nil)
if (prev.size == limit - 1) {
val group = Nel.ofInitLast(prev.reverse, num)
Pull.output1(key -> group) >> go(state - key)(tail)
} else {
go(state.updated(key, num :: prev))(tail)
}
case None =>
val chunk = Chunk.vector {
state
.toVector
.collect { case (key, last :: revInit) =>
val group = Nel.ofInitLast(revInit.reverse, last)
key -> group
}
}
Pull.output(chunk) >> Pull.done
}
go(Map.empty)(_).stream
}
}
Usage:
import cats.data.{NonEmptyList => Nel}
import cats.implicits._
import cats.effect.{ExitCode, IO, IOApp}
import fs2._
import GroupingByKey.groupByKey
object Answer extends IOApp {
type Key = String
override def run(args: List[String]): IO[ExitCode] = {
require {
Stream('a -> 1).through(groupByKey(2)).compile.toList ==
List('a -> Nel.one(1))
}
require {
Stream('a -> 1, 'a -> 2).through(groupByKey(2)).compile.toList ==
List('a -> Nel.of(1, 2))
}
require {
Stream('a -> 1, 'a -> 2, 'a -> 3).through(groupByKey(2)).compile.toList ==
List('a -> Nel.of(1, 2), 'a -> Nel.one(3))
}
val infinite = (for {
prng <- Stream.eval(IO { new scala.util.Random() })
keys <- Stream(Vector[Key]("a", "b", "c", "d", "e", "f", "g"))
key = Stream.eval(IO {
val i = prng.nextInt(keys.size)
keys(i)
})
num = Stream.eval(IO { 1 + prng.nextInt(9) })
} yield (key zip num).repeat).flatten
infinite
.through(groupByKey(3))
.showLinesStdOut
.compile
.drain
.as(ExitCode.Success)
}
}
If I were splitting a string, I would be able to do
"123,456,789".split(",")
to get
Seq("123","456","789")
Thinking of a string as a sequence of characters, how could this be generalized to other sequences of objects?
val x = Seq(One(),Two(),Three(),Comma(),Five(),Six(),Comma(),Seven(),Eight(),Nine())
x.split {
  case _: Comma => true
  case _        => false
}
split in this case doesn't exist, but it reminds me of span, partition and groupBy. Only span seems close, but it doesn't handle leading/trailing commas gracefully.
implicit class SplitSeq[T](seq: Seq[T]){
import scala.collection.mutable.ListBuffer
def split(sep: T): Seq[Seq[T]] = {
val buffer = ListBuffer(ListBuffer.empty[T])
seq.foreach {
case `sep` => buffer += ListBuffer.empty
case elem => buffer.last += elem
}; buffer.filter(_.nonEmpty)
}
}
It can be then used like x.split(Comma()).
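For example, with the implicit class above in scope and an Int separator (a quick hypothetical check):
// split on 0 as the separator element
Seq(1, 2, 0, 3, 4, 0, 0, 5).split(0)
// should yield Seq(Seq(1, 2), Seq(3, 4), Seq(5)) (backed by ListBuffers)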
The following is 'a' solution, not the most elegant -
def split[A](x: Seq[A], edge: A => Boolean): Seq[Seq[A]] = {
val init = (Seq[Seq[A]](), Seq[A]())
val (result, last) = x.foldLeft(init) { (cum, n) =>
val (total, prev) = cum
if (edge(n)) {
(total :+ prev, Seq.empty)
} else {
(total, prev :+ n)
}
}
result :+ last
}
Example result -
scala> split(Seq(1,2,3,0,4,5,0,6,7), (_:Int) == 0)
res53: Seq[Seq[Int]] = List(List(1, 2, 3), List(4, 5), List(6, 7))
This is how I've solved it in the past, but I suspect there is a better / more elegant way.
def break[A](xs:Seq[A], p:A => Boolean): (Seq[A], Seq[A]) = {
if (p(xs.head)) {
xs.span(p)
}
else {
xs.span(a => !p(a))
}
}
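For illustration (hypothetical values), note how a leading separator ends up as its own leading group rather than producing an empty one:
break(Seq(0, 1, 2, 0, 3), (_: Int) == 0)  // (List(0), List(1, 2, 0, 3))
break(Seq(1, 2, 0, 3), (_: Int) == 0)     // (List(1, 2), List(0, 3))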
The code below contains various single-threaded implementations of reduceByKeyXXX methods and a few helper methods to create input sets and measure execution times. (Feel free to run the main method.)
The main purpose of reduceByKey (as in Spark) is to reduce key-value-pairs with the same key. Example:
scala> val xs = Seq( "a" -> 2, "b" -> 3, "a" -> 5)
xs: Seq[(String, Int)] = List((a,2), (b,3), (a,5))
scala> ReduceByKeyComparison.reduceByKey(xs, (x:Int, y:Int) ⇒ x+y )
res8: Seq[(String, Int)] = ArrayBuffer((b,3), (a,7))
Code
import java.util.HashMap
object Util {
def measure( body : => Unit ) : Long = {
val now = System.currentTimeMillis
body
val nowAfter = System.currentTimeMillis
nowAfter - now
}
def measureMultiple( body: => Unit, n: Int) : String = {
val executionTimes = (1 to n).toList.map( x => {
print(".")
measure(body)
} )
val avg = executionTimes.sum / executionTimes.size
executionTimes.mkString("", "ms, ", "ms") + s" Average: ${avg}ms."
}
}
object RandomUtil {
val AB = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
val r = new java.util.Random();
def randomString( len: Int ) : String = {
val sb = new StringBuilder( len );
for( i <- 0 to len-1 ) {
sb.append(AB.charAt(r.nextInt(AB.length())));
}
sb.toString();
}
def generateSeq(n: Int) : Seq[(String, Int)] = {
Seq.fill(n)( (randomString(2), r.nextInt(100)) )
}
}
object ReduceByKeyComparison {
def main(args: Array[String]) : Unit = {
implicit def iterableToPairedIterable[K, V](x: Iterable[(K, V)]) = { new PairedIterable(x) }
val runs = 10
val problemSize = 2000000
val ss = RandomUtil.generateSeq(problemSize)
println("ReduceByKey : " + Util.measureMultiple( reduceByKey(ss, (x:Int, y:Int) ⇒ x+y ), runs ))
println("ReduceByKey2: " + Util.measureMultiple( reduceByKey2(ss, (x:Int, y:Int) ⇒ x+y ), runs ))
println("ReduceByKey3: " + Util.measureMultiple( reduceByKey3(ss, (x:Int, y:Int) ⇒ x+y ), runs ))
println("ReduceByKeyPaired: " + Util.measureMultiple( ss.reduceByKey( (x:Int, y:Int) ⇒ x+y ), runs ))
println("ReduceByKeyA: " + Util.measureMultiple( reduceByKeyA( ss, (x:Int, y:Int) ⇒ x+y ), runs ))
}
// =============================================================================
// Different implementations
// =============================================================================
def reduceByKey[A,B]( s: Seq[(A,B)], fnc: (B, B) ⇒ B) : Seq[(A,B)] = {
val t = s.groupBy(x => x._1)
val u = t.map { case (k,v) => (k, v.map(_._2).reduce(fnc))}
u.toSeq
}
def reduceByKey2[A,B]( s: Seq[(A,B)], fnc: (B, B) ⇒ B) : Seq[(A,B)] = {
val r = s.foldLeft( Map[A,B]() ){ (m,a) ⇒
val k = a._1
val v = a._2
m.get(k) match {
case Some(pv) ⇒ m + ((k, fnc(pv, v)))
case None ⇒ m + ((k, v))
}
}
r.toSeq
}
def reduceByKey3[A,B]( s: Seq[(A,B)], fnc: (B, B) ⇒ B) : Seq[(A,B)] = {
var m = scala.collection.mutable.Map[A,B]()
s.foreach{ e ⇒
val k = e._1
val v = e._2
m.get(k) match {
case Some(pv) ⇒ m(k) = fnc(pv, v)
case None ⇒ m(k) = v
}
}
m.toSeq
}
/**
* Method code from [[http://ideone.com/dyrkYM]]
* All rights to Muhammad-Ali A'rabi according to [[https://issues.scala-lang.org/browse/SI-9064]]
*/
def reduceByKeyA[A,B]( s: Seq[(A,B)], fnc: (B, B) ⇒ B): Map[A, B] = {
s.groupBy(_._1).map(l => (l._1, l._2.map(_._2).reduce( fnc )))
}
/**
* Method code from [[http://ideone.com/dyrkYM]]
* All rights to Muhammad-Ali A'rabi according to [[https://issues.scala-lang.org/browse/SI-9064]]
*/
class PairedIterable[K, V](x: Iterable[(K, V)]) {
def reduceByKey(func: (V,V) => V) = {
val map = new HashMap[K, V]
x.foreach { pair =>
val old = map.get(pair._1)
map.put(pair._1, if (old == null) pair._2 else func(old, pair._2))
}
map
}
}
}
yielding the following results on my machine
..........ReduceByKey : 723ms, 782ms, 761ms, 617ms, 640ms, 707ms, 634ms, 611ms, 380ms, 458ms Average: 631ms.
..........ReduceByKey2: 580ms, 458ms, 452ms, 463ms, 462ms, 470ms, 463ms, 465ms, 458ms, 462ms Average: 473ms.
..........ReduceByKey3: 489ms, 466ms, 461ms, 468ms, 555ms, 474ms, 469ms, 457ms, 461ms, 468ms Average: 476ms.
..........ReduceByKeyPaired: 140ms, 124ms, 124ms, 120ms, 122ms, 124ms, 118ms, 126ms, 121ms, 119ms Average: 123ms.
..........ReduceByKeyA: 628ms, 694ms, 666ms, 656ms, 616ms, 660ms, 594ms, 659ms, 445ms, 399ms Average: 601ms.
with ReduceByKeyPaired currently being the fastest.
Question / Task
Is there a faster single-threaded (Scala) implementation?
Rewriting the reduceByKey method of PairedIterable to use recursion gives around a 5-10% performance improvement.
That's all I was able to get.
I've also tried to increase the initial capacity allocation of the HashMap, but it did not show any significant change.
import java.util.HashMap
import scala.annotation.tailrec

class PairedIterable[K, V](x: Iterable[(K, V)]) {
  def reduceByKey(func: (V, V) => V) = {
    val map = new HashMap[K, V]()

    // recurse over a List so the Nil / :: patterns apply
    @tailrec
    def reduce(it: List[(K, V)]): HashMap[K, V] = {
      it match {
        case Nil => map
        case (k, v) :: tail =>
          val old = map.get(k)
          map.put(k, if (old == null) v else func(old, v))
          reduce(tail)
      }
    }

    val r = reduce(x.toList)
    r
  }
}
In general, comparing the provided methods, they fall into two categories.
The first set of reductions relies on sorting (grouping); as we can see, those methods add extra O(n*log(n)) complexity and are not effective for this scenario.
The second set loops linearly across all entries of the Iterable. These methods have extra get/put operations on a temporary map, but those gets/puts are not so time consuming: O(n)*O(c).
Moreover, the need to work with Options in the Scala collections makes them less effective.
I had a simple task: to find the combination which occurs most often when we roll 4 six-sided dice and remove the one with the fewest points.
So, the question is: are there any Scala core classes to generate streams of Cartesian products? If not, what is the simplest and most efficient way to implement it?
Here is the code and a comparison with the naive implementation in Scala:
object D extends App {
def dropLowest(a: List[Int]) = {
a diff List(a.min)
}
def cartesian(to: Int, times: Int): Stream[List[Int]] = {
def stream(x: List[Int]): Stream[List[Int]] = {
if (hasNext(x)) x #:: stream(next(x)) else Stream(x)
}
def hasNext(x: List[Int]) = x.exists(n => n < to)
def next(x: List[Int]) = {
def add(current: List[Int]): List[Int] = {
if (current.head == to) 1 :: add(current.tail) else current.head + 1 :: current.tail // here is a possible bug when we get maximal value, don't reuse this method
}
add(x.reverse).reverse
}
stream(Range(0, times).map(t => 1).toList)
}
def getResult(list: Stream[List[Int]]) = {
list.map(t => dropLowest(t).sum).groupBy(t => t).map(t => (t._1, t._2.size)).toMap
}
val list1 = cartesian(6, 4)
val list = for (i <- Range(1, 7); j <- Range(1,7); k <- Range(1, 7); l <- Range(1, 7)) yield List(i, j, k, l)
println(getResult(list1))
println(getResult(list.toStream) equals getResult(list1))
}
Thanks in advance
I think you can simplify your code by using flatMap:
val stream = (1 to 6).toStream
def cartesian(times: Int): Stream[Seq[Int]] = {
if (times == 0) {
Stream(Seq())
} else {
stream.flatMap { i => cartesian(times - 1).map(i +: _) }
}
}
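A quick laziness check, assuming the definitions above (hypothetical usage):
cartesian(4).take(3).toList
// should give List(List(1, 1, 1, 1), List(1, 1, 1, 2), List(1, 1, 1, 3)) without forcing the full product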
Maybe a little bit more efficient (memory-wise) would be using Iterators instead:
val pool = (1 to 6)
def cartesian(times: Int): Iterator[Seq[Int]] = {
if (times == 0) {
Iterator(Seq())
} else {
pool.iterator.flatMap { i => cartesian(times - 1).map(i +: _) }
}
}
or even more concisely, by replacing the recursive calls with a fold:
def cartesian[A](list: Seq[Seq[A]]): Iterator[Seq[A]] =
list.foldLeft(Iterator(Seq[A]())) {
case (acc, l) => acc.flatMap(i => l.map(_ +: i))
}
and then:
cartesian(Seq.fill(4)(1 to 6)).map(dropLowest).toSeq.groupBy(i => i.sorted).mapValues(_.size).toSeq.sortBy(_._2).foreach(println)
(Note that you cannot use groupBy on Iterators, so Streams or even Lists are the way to go here; the above code is still valid since toSeq on an Iterator actually returns a lazy Stream.)
If you are considering stats on the sums of the dice instead of combinations, you can update the dropLowest function:
def dropLowest(l: Seq[Int]) = l.sum - l.min
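With that variant, the pipeline from above becomes something like this (same assumptions as before; sketch only):
cartesian(Seq.fill(4)(1 to 6))
  .map(dropLowest)        // Iterator[Int]: sum of the best three dice per roll
  .toSeq
  .groupBy(identity)      // count occurrences of each sum
  .mapValues(_.size)
  .toSeq
  .sortBy(_._2)
  .foreach(println)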
I implemented a simple method to generate the Cartesian product of several Seqs like this:
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq[T](s)
}
class RichSeq[T](s: Seq[T]) {
import RichSeq._
def cartesian(ss: Seq[Seq[T]]): Seq[Seq[T]] = {
ss.toList match {
case Nil => Seq(s)
case s2 :: Nil => {
for (e <- s) yield s2.map(e2 => Seq(e, e2))
}.flatten
case s2 :: tail => {
for (e <- s) yield s2.cartesian(tail).map(seq => e +: seq)
}.flatten
}
}
}
Obviously, this one is really slow, as it calculates the whole product at once. Has anyone implemented a lazy solution to this problem in Scala?
UPD
OK, so I implemented a reeeeally stupid, but working, version of an iterator over a Cartesian product. Posting it here for future enthusiasts:
object RichSeq {
implicit def toRichSeq[T](s: Seq[T]) = new RichSeq(s)
}
import scala.collection.mutable.ArrayBuffer

class RichSeq[T](s: Seq[T]) {
def lazyCartesian(ss: Seq[Seq[T]]): Iterator[Seq[T]] = new Iterator[Seq[T]] {
private[this] val seqs = s +: ss
private[this] var indexes = Array.fill(seqs.length)(0)
private[this] val counts = Vector(seqs.map(_.length - 1): _*)
private[this] var current = 0
def next(): Seq[T] = {
val buffer = ArrayBuffer.empty[T]
if (current != 0) {
throw new NoSuchElementException("no more elements to traverse")
}
val newIndexes = ArrayBuffer.empty[Int]
var inside = 0
for ((index, i) <- indexes.zipWithIndex) {
buffer.append(seqs(i)(index))
newIndexes.append(index)
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
inside = inside + 1
}
}
current = inside
if (current < seqs.length) {
for (i <- (0 to current).reverse) {
if ((0 to i).forall(ind => newIndexes(ind) == counts(ind))) {
newIndexes(i) = 0
} else if (newIndexes(i) < counts(i)) {
newIndexes(i) = newIndexes(i) + 1
}
}
current = 0
indexes = newIndexes.toArray
}
buffer.result()
}
def hasNext: Boolean = current != seqs.length
}
}
Here's my solution to the given problem. Note that the laziness is simply caused by using .view on the "root collection" of the for comprehension.
scala> def combine[A](xs: Traversable[Traversable[A]]): Seq[Seq[A]] =
| xs.foldLeft(Seq(Seq.empty[A])){
| (x, y) => for (a <- x.view; b <- y) yield a :+ b }
combine: [A](xs: Traversable[Traversable[A]])Seq[Seq[A]]
scala> combine(Set(Set("a","b","c"), Set("1","2"), Set("S","T"))) foreach (println(_))
List(a, 1, S)
List(a, 1, T)
List(a, 2, S)
List(a, 2, T)
List(b, 1, S)
List(b, 1, T)
List(b, 2, S)
List(b, 2, T)
List(c, 1, S)
List(c, 1, T)
List(c, 2, S)
List(c, 2, T)
To obtain this, I started from the function combine defined in https://stackoverflow.com/a/4515071/53974, passing it the function (a, b) => (a, b). However, that didn't quite work directly, since that code expects a function of type (A, A) => A. So I just adapted the code a bit.
These might be a starting point:
Cartesian product of two lists
Expand a Set[Set[String]] into Cartesian Product in Scala
https://stackoverflow.com/questions/6182126/im-learning-scala-would-it-be-possible-to-get-a-little-code-review-and-mentori
What about:
def cartesian[A](list: List[Seq[A]]): Iterator[Seq[A]] = {
if (list.isEmpty) {
Iterator(Seq())
} else {
list.head.iterator.flatMap { i => cartesian(list.tail).map(i +: _) }
}
}
Simple and lazy ;)
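For example (a hypothetical quick check):
cartesian(List(Seq(1, 2), Seq(3, 4))).toList
// should give List(List(1, 3), List(1, 4), List(2, 3), List(2, 4))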
def cartesian[A](list: List[List[A]]): List[List[A]] = {
list match {
case Nil => List(List())
case h :: t => h.flatMap(i => cartesian(t).map(i :: _))
}
}
You can look here: https://stackoverflow.com/a/8318364/312172 to see how to translate a number into an index of all possible values, without generating every element.
This technique can be used to implement a stream.
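A rough sketch of that idea (the helper names are hypothetical, not from the linked answer): map an index to one combination by repeated division and modulo over the pool sizes, then enumerate indices lazily.
// nthProduct maps index i to the i-th combination without materialising the whole product.
def nthProduct[A](pools: Seq[Seq[A]])(i: Int): Seq[A] =
  pools.foldRight((i, List.empty[A])) { case (pool, (rest, acc)) =>
    (rest / pool.size, pool(rest % pool.size) :: acc)
  }._2

// Lazily enumerate every combination as an Iterator.
def lazyProduct[A](pools: Seq[Seq[A]]): Iterator[Seq[A]] =
  Iterator.range(0, pools.map(_.size).product).map(i => nthProduct(pools)(i))

// e.g. lazyProduct(Seq(Seq(1, 2), Seq(3, 4, 5))).toList
//   == List(List(1, 3), List(1, 4), List(1, 5), List(2, 3), List(2, 4), List(2, 5))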