Combining multiple Lists of arbitrary length - scala

I am looking for an approach to join multiple Lists in the following manner:
ListA a b c
ListB 1 2 3 4
ListC + # * § %
..
..
..
Resulting List: a 1 + b 2 # c 3 * 4 § %
In words: the elements are taken in sequential order, starting with the first list, and combined into the resulting list. There may be an arbitrary number of input lists, and they may vary in length.
I tried several approaches using variants of zip and sliding iterators, but none worked, and in particular none handled varying list lengths. There has to be an elegant way in Scala ;)

val lists = List(ListA, ListB, ListC)
lists.flatMap(_.zipWithIndex).sortBy(_._2).map(_._1)
It's pretty self-explanatory. It just zips each value with its position on its respective list, sorts by index, then pulls the values back out.
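For reference, here is a self-contained sketch of the same idea using the sample data from the question. The object name `InterleaveSketch` and the helper `interleave` are made up for illustration, and the element type is widened to Any only because the sample lists mix strings and integers. The approach relies on `sortBy` being a stable sort, so elements with the same index keep the order of their source lists.

```scala
object InterleaveSketch {
  // Sample inputs from the question (hypothetical names listA/listB/listC).
  val listA: List[Any] = List("a", "b", "c")
  val listB: List[Any] = List(1, 2, 3, 4)
  val listC: List[Any] = List("+", "#", "*", "§", "%")

  def interleave[A](lists: List[List[A]]): List[A] =
    lists
      .flatMap(_.zipWithIndex) // tag each element with its position in its own list
      .sortBy(_._2)            // stable sort: equal indices keep the order of the source lists
      .map(_._1)               // drop the index again

  val result: List[Any] = interleave(List(listA, listB, listC))
}
```

Shorter lists simply run out of indices, so no padding is needed.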

Here's how I would do it:
class ListTests extends FunSuite {
test("The three lists from his example") {
val l1 = List("a", "b", "c")
val l2 = List(1, 2, 3, 4)
val l3 = List("+", "#", "*", "§", "%")
// All lists together
val l = List(l1, l2, l3)
// Max length of a list (to pad the shorter ones)
val maxLen = l.map(_.size).max
// Wrap the elements in Option and pad with None
val padded = l.map { list => list.map(Some(_)) ++ Stream.continually(None).take(maxLen - list.size) }
// Transpose
val trans = padded.transpose
// Flatten the lists then flatten the options
val result = trans.flatten.flatten
// Voilà
assert(List("a", 1, "+", "b", 2, "#", "c", 3, "*", 4, "§", "%") === result)
}
}

Here's an imperative solution if efficiency is paramount:
def combine[T](xss: List[List[T]]): List[T] = {
val b = List.newBuilder[T]
var its = xss.map(_.iterator)
while (!its.isEmpty) {
its = its.filter(_.hasNext)
its.foreach(b += _.next)
}
b.result
}

You can use padTo, transpose, and flatten to good effect here:
lists.map(_.map(Some(_)).padTo(lists.map(_.length).max, None)).transpose.flatten.flatten

Here's a small recursive solution.
def flatList(lists: List[List[Any]]) = {
def loop(output: List[Any], xss: List[List[Any]]): List[Any] = (xss collect { case x :: xs => x }) match {
case Nil => output
case heads => loop(output ::: heads, xss.collect({ case x :: xs => xs }))
}
loop(List[Any](), lists)
}
And here is a simple streams approach which can cope with an arbitrary sequence of sequences, each of potentially infinite length.
def flatSeqs[A](ssa: Seq[Seq[A]]): Stream[A] = {
def seqs(xss: Seq[Seq[A]]): Stream[Seq[A]] = xss collect { case xs if !xs.isEmpty => xs } match {
case Nil => Stream.empty
case heads => heads #:: seqs(xss collect { case xs if !xs.isEmpty => xs.tail })
}
seqs(ssa).flatten
}

Here's something short but not exceedingly efficient:
def heads[A](xss: List[List[A]]) = xss.map(_.splitAt(1)).unzip
def interleave[A](xss: List[List[A]]) = Iterator.
iterate(heads(xss)){ case (_, tails) => heads(tails) }.
map(_._1.flatten).
takeWhile(! _.isEmpty).
flatten.toList

Here's a recursive solution that's O(n). The accepted solution (using sort) is O(nlog(n)). Some testing I've done suggests the second solution using transpose is also O(nlog(n)) due to the implementation of transpose. The use of reverse below looks suspicious (since it's an O(n) operation itself) but convince yourself that it either can't be called too often or on too-large lists.
def intercalate[T](lists: List[List[T]]) : List[T] = {
def intercalateHelper(newLists: List[List[T]], oldLists: List[List[T]], merged: List[T]): List[T] = {
(newLists, oldLists) match {
case (Nil, Nil) => merged
case (Nil, zss) => intercalateHelper(zss.reverse, Nil, merged)
case (Nil::xss, zss) => intercalateHelper(xss, zss, merged)
case ( (y::ys)::xss, zss) => intercalateHelper(xss, ys::zss, y::merged)
}
}
intercalateHelper(lists, List.empty, List.empty).reverse
}

Related

Conditional concatenation of iterator elements - A Scala idiomatic solution

I have an Iterator of Strings and would like to concatenate each element preceding one that matches a predicate, e.g. for an Iterator of
Iterator("a", "b", "c break", "d break", "e")
and a predicate of
!line.endsWith("break")
I would like to print out
(Group: 0): a-b-c break
(Group: 1): d break
(Group: 2): e
(without needing to hold in memory more than a single group at a time)
I know I can achieve this with an iterator like below, but there has to be a more "Scala" way of writing this, right?
import scala.collection.mutable.ListBuffer
object IteratingAndAccumulating extends App {
class AccumulatingIterator(lines: Iterator[String])extends Iterator[ListBuffer[String]] {
override def hasNext: Boolean = lines.hasNext
override def next(): ListBuffer[String] = getNextLine(lines, new ListBuffer[String])
def getNextLine(lines: Iterator[String], accumulator: ListBuffer[String]): ListBuffer[String] = {
val line = lines.next
accumulator += line
if (line.endsWith("break") || !lines.hasNext) accumulator
else getNextLine(lines, accumulator)
}
}
new AccumulatingIterator(Iterator("a", "b", "c break", "d break", "e"))
.map(_.mkString("-")).zipWithIndex.foreach{
case (conc, i) =>
println(s"(Group: $i): $conc")
}
}
many thanks,
Fil
Here is a simple solution if you don't mind loading the entire contents into memory at once:
val lines: List[List[String]] = it.foldLeft(List(List.empty[String])) {
case (head::tail, x) if predicate(x) => Nil :: (x::head) :: tail
case (head::tail, x) => (x::head ) :: tail
}.dropWhile(_.isEmpty).map(_.reverse).reverse
If you would rather iterate through the strings and groups one-by-one, it gets a little bit more involved:
// first "instrument" the iterator, by "demarcating" group boundaries with None:
val instrumented: Iterator[Option[String]] = it.flatMap {
case x if predicate(x) => Seq(Some(x), None)
case x => Seq(Some(x))
}
// And now, wrap it around into another iterator, constructing groups:
val lines: Iterator[Iterator[String]] = Iterator.continually {
instrumented.takeWhile(_.nonEmpty).flatten
}.takeWhile(_.nonEmpty)

Scala for/yield runs but doesn't complete

I'm trying to walk through two arrays of potentially different sizes and compose a new array of randomly selected elements from them (for crossover in a genetic algorithm) (childGeneCount is just the length of the longer array).
In the following code snippet, each gene.toString logs, but my code doesn't seem to execute the last log. What dumb thing am I doing?
val genes = for (i <- 0 to childGeneCount) yield {
val gene = if (Random.nextBoolean()) {
if (i < p1genes.length) {
p1genes(i)
} else {
p2genes(i)
}
} else {
if (i < p2genes.length) {
p2genes(i)
} else {
p1genes(i)
}
}
Logger.debug(gene.toString)
gene
}
Logger.debug("crossover finishing - never gets here??")
New to scala, and would be happy for a slap on the wrist accompanied by a "do it this completely different way instead" if appropriate.
You are right: the problem was that "to" should have been "until". I have changed your code a bit to make it more Scala-like.
val p1genes = "AGTCTC"
val p2genes = "ATG"
val genePair = p1genes.zipAll(p2genes, None, None)
val matchedGene = for (pair <- genePair) yield {
pair match {
case (p1Gene, None) => p1Gene
case (None, p2Gene) => p2Gene
case (p1Gene, p2Gene) => if (Random.nextBoolean()) p1Gene else p2Gene
}
}
println(matchedGene)
The process is:
First zip two dna sequences into one.
Fill the shorter sequence with None.
Now loop over the zipped sequences and populate the new sequence.
Reworked Tawkir's answer, with cleaner None handling:
val p1genes = "AGTCTC"
val p2genes = "ATG"
val genePair = p1genes.map(Some.apply).zipAll(p2genes.map(Some.apply), None, None)
val matchedGene = genePair.map {
case (Some(p1Gene), None) => p1Gene
case (None, Some(p2Gene)) => p2Gene
case (Some(p1Gene), Some(p2Gene)) => if (Random.nextBoolean()) p1Gene else p2Gene
}
println(matchedGene)
If you want to avoid wrapping the sequence with Some, another solution is to use a character known not to appear in the sequence as a "none" marker:
val p1genes = "AGTCTC"
val p2genes = "ATG"
val none = '-'
val genePair = p1genes.zipAll(p2genes, none, none)
val matchedGene = genePair.map {
case (p1Gene, `none`) => p1Gene
case (`none`, p2Gene) => p2Gene
case (p1Gene, p2Gene) => if (Random.nextBoolean()) p1Gene else p2Gene
}
println(matchedGene)
Pretty sure harry0000's answer is correct: I was using "to" like "until", and am so used to exceptions being thrown loudly that I didn't think to look there!
I ended up switching from for/yield to List.tabulate(childGeneCount){ i => {, which fixed the error probably for the same reason.
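To make the "to" versus "until" difference concrete, here is a small sketch. The object name `RangeSketch` and the sample `genes` vector are illustrative assumptions, not part of the original code.

```scala
import scala.util.Try

object RangeSketch {
  val genes = Vector('A', 'G', 'T')

  // `to` is inclusive: it yields one index more than the collection has elements.
  val inclusive: List[Int] = (0 to genes.length).toList
  // `until` is exclusive, which is what indexing needs.
  val exclusive: List[Int] = (0 until genes.length).toList
  // `indices` is equivalent to `0 until length`.
  val viaIndices: List[Int] = genes.indices.toList

  // Accessing the extra index throws; Try captures the exception so it can be inspected.
  val lastInclusiveAccess: Try[Char] = Try(genes(genes.length))
}
```

If the loop runs somewhere that swallows exceptions (for example inside a future), the failure on the last index may never be logged, which would match the symptom described in the question.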
Since you asked for possible style improvements, here are two suggested implementations. The first one is less idiomatic, but more performant. The second one is prettier but does some more work.
import scala.reflect.ClassTag
import scala.util.Random
def crossover[E : ClassTag](a: Array[E], b: Array[E]): Array[E] = {
val (larger, smaller) = if(a.length > b.length) (a, b) else (b, a)
val result = Array.ofDim[E](larger.length)
for(i <- smaller.indices)
result(i) = if(Random.nextBoolean()) larger(i) else smaller(i)
for(i <- smaller.length until larger.length)
result(i) = larger(i)
result
}
def crossoverIdiomatic[E : ClassTag](a: Array[E], b: Array[E]): Array[E] = {
val randomPart = (a zip b).map { case (x,y) => if(Random.nextBoolean()) x else y }
val (larger, smaller) = if(a.length > b.length) (a, b) else (b, a)
randomPart ++ larger.drop(smaller.length)
}
val a = Array("1", "2", "3", "4", "5", "6")
val b = Array("one", "two", "three", "four")
// e.g. output: [one,2,three,4,5,6]
println(crossover(a, b).mkString("[", ",", "]"))
println(crossoverIdiomatic(a, b).mkString("[", ",", "]"))
Note that the E : ClassTag context bounds are only there to make the compiler happy about creating an Array[E]; if you only need a single concrete element type such as Int for your work, you can drop all the fancy generics.
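As a rough illustration of dropping the generics, an Int-only variant might look like the sketch below. The names `IntCrossoverSketch` and `crossoverInts`, and the optional `rng` parameter, are assumptions added here for the example.

```scala
import scala.util.Random

object IntCrossoverSketch {
  // Int-only version: no ClassTag is needed because the element type is concrete.
  def crossoverInts(a: Array[Int], b: Array[Int], rng: Random = new Random()): Array[Int] = {
    val (larger, smaller) = if (a.length > b.length) (a, b) else (b, a)
    // zip stops at the shorter array, so this covers only the overlapping part
    val randomPart = a.zip(b).map { case (x, y) => if (rng.nextBoolean()) x else y }
    // the remainder can only come from the longer array
    randomPart ++ larger.drop(smaller.length)
  }
}
```

Passing the random source as a parameter is a design choice for the sketch: it makes the function easier to test with a seeded generator.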

Parallel Aggregate is not working on lists .length > 8

I'm writing a small exercise app that calculates number of unique letters (incl Unicode) in a seq of strings, and I'm using aggregate for it, as I try to run in parallel
here's my code:
class Frequency(seq: Seq[String]) {
type FreqMap = Map[Char, Int]
def calculate() = {
val freqMap: FreqMap = Map[Char, Int]()
val pattern = "(\\p{L}+)".r
val seqop: (FreqMap, String) => FreqMap = (fm, s) => {
s.toLowerCase().foldLeft(freqMap){(fm, c) =>
c match {
case pattern(char) => fm.get(char) match {
case None => fm+((char, 1))
case Some(i) => fm.updated(char, i+1)
}
case _ => fm
}
}
}
val reduce: (FreqMap, FreqMap) => FreqMap =
(m1, m2) => {
m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }
}
seq.par.aggregate(freqMap)(seqop, reduce)
}
}
and then the code that makes use of that
object Frequency extends App {
val text = List("abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc");
def frequency(seq: Seq[String]):Map[Char, Int] = {
new Frequency(seq).calculate()
}
Console println frequency(seq=text)
}
though I supplied "abc" 9 times, the result is Map(a -> 8, b -> 8, c -> 8), as it is for any number of "abc"'s > 8
I've looked at this, and it seems like I'm using aggregate correctly
Any suggestions to make it work?
You're discarding already collected results (the first fm) in your seqop. You need to add these to the new results you're computing, e.g. like this:
def calculate() = {
val freqMap: FreqMap = Map[Char, Int]()
val pattern = "(\\p{L}+)".r
val reduce: (FreqMap, FreqMap) => FreqMap =
(m1, m2) => {
m1 ++ m2.map { case (k, v) => k -> (v + m1.getOrElse(k, 0)) }
}
val seqop: (FreqMap, String) => FreqMap = (fm, s) => {
val res = s.toLowerCase().foldLeft(freqMap){(fm, c) =>
c match {
case pattern(char) => fm.get(char) match {
case None => fm+((char, 1))
case Some(i) => fm.updated(char, i+1)
}
case _ => fm
}
}
// I'm reusing your existing combinator function here:
reduce(res,fm)
}
seq.par.aggregate(freqMap)(seqop, reduce)
}
Depending on how the parallel collections divide the work, you discard some of it. In your case (9x "abc") it divides the work into 8 parallel seqop operations, which means you discard exactly one result set. This varies with the input size: if you run it with, say, 17x "abc", it runs in 13 parallel operations, discarding 4 result sets (on my machine anyway - I'm not familiar with the underlying code and how it divides the work; this probably depends on the ExecutionContext/thread pool used, and in turn on the number of CPUs/cores and so on).
Generally parallel collections are a drop-in replacement for sequential collections, meaning if you drop .par you should still get the same result, albeit usually slower. If you do this with your original code you get a count of 1 for each letter, which tells you that it's not a parallelization problem. This is a good way to test whether you're doing the right thing when using these.
And last but not least: This was harder to spot than usual for me because you use the same variable name twice and subsequently shadow fm. Not doing that would make the code more readable and mistakes such as this easier to spot.
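A minimal sketch of a seqop without the shadowing might look as follows. The names `FrequencySketch`, `acc`, `counts`, `combop` and `sequentialFrequency` are made up here, and `Char.isLetter` is used as a stand-in for the `\p{L}` regex, which is an assumption about what the original pattern is meant to accept. The sequential `foldLeft` plays the role of the "drop .par and compare" check.

```scala
object FrequencySketch {
  type FreqMap = Map[Char, Int]

  // Fold the characters into the accumulator that was passed in (acc),
  // not into a fresh empty map, so earlier results are kept.
  def seqop(acc: FreqMap, s: String): FreqMap =
    s.toLowerCase.foldLeft(acc) { (counts, c) =>
      if (c.isLetter) counts.updated(c, counts.getOrElse(c, 0) + 1) else counts
    }

  // Merge two partial results by adding the counts per key.
  def combop(m1: FreqMap, m2: FreqMap): FreqMap =
    m2.foldLeft(m1) { case (merged, (k, v)) => merged.updated(k, merged.getOrElse(k, 0) + v) }

  // Sequential reference result; a parallel aggregate should agree with this.
  def sequentialFrequency(seq: Seq[String]): FreqMap =
    seq.foldLeft(Map.empty[Char, Int])(seqop)
}
```

With distinct names for the outer accumulator and the inner one, it is harder to accidentally start the inner fold from the empty map.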

Simple functional way for grouping successive elements? [duplicate]

I'm trying to 'group' a string into segments; I guess this example explains it more succinctly:
scala> val str: String = "aaaabbcddeeeeeeffg"
... (do something)
res0: List("aaaa","bb","c","dd","eeeeee","ff","g")
I can think of a few ways to do this in an imperative style (with vars and stepping through the string to find groups) but I was wondering if any better functional solution could be attained? I've been looking through the Scala API but there doesn't seem to be something that fits my needs.
Any help would be appreciated
You can split the string recursively with span:
def s(x : String) : List[String] = if(x.size == 0) Nil else {
val (l,r) = x.span(_ == x(0))
l :: s(r)
}
Tail recursive:
@annotation.tailrec def s(x : String, y : List[String] = Nil) : List[String] = {
if(x.size == 0) y.reverse
else {
val (l,r) = x.span(_ == x(0))
s(r, l :: y)
}
}
Seems that all other answers are very concentrated on collection operations. But pure string + regex solution is much simpler:
str split """(?<=(\w))(?!\1)""" toList
In this regex I use positive lookbehind and negative lookahead for the captured char
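As a quick check of the regex approach, the sketch below applies it to the string from the question (the object name `RegexRunsSketch` is made up). The split point is every position where the preceding character, captured by the lookbehind, is not followed by the same character. Because the pattern uses `\w`, it only splits after word characters.

```scala
object RegexRunsSketch {
  // Split wherever the previous word character is not repeated by the next one.
  val runs: List[String] =
    "aaaabbcddeeeeeeffg".split("""(?<=(\w))(?!\1)""").toList
}
```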
def group(s: String): List[String] = s match {
case "" => Nil
case s => s.takeWhile(_==s.head) :: group(s.dropWhile(_==s.head))
}
Edit: Tail recursive version:
def group(s: String, result: List[String] = Nil): List[String] = s match {
case "" => result reverse
case s => group(s.dropWhile(_==s.head), s.takeWhile(_==s.head) :: result)
}
It can be used just like the other one because the second parameter has a default value and thus doesn't have to be supplied.
Make it one-liner:
scala> val str = "aaaabbcddddeeeeefff"
str: java.lang.String = aaaabbcddddeeeeefff
scala> str.groupBy(identity).map(_._2)
res: scala.collection.immutable.Iterable[String] = List(eeeee, fff, aaaa, bb, c, dddd)
UPDATE:
As @Paul mentioned the ordering issue, here is an updated version:
scala> str.groupBy(identity).toList.sortBy(_._1).map(_._2)
res: List[String] = List(aaaa, bb, c, dddd, eeeee, fff)
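One caveat worth checking before using groupBy here: it groups all occurrences of a character, not only successive ones. The sketch below (names `GroupByCaveatSketch` and `runs` are illustrative) contrasts it with a span-based split on an input where a character reappears after a gap.

```scala
object GroupByCaveatSketch {
  val input = "aabaa"

  // groupBy collects every 'a' into one group, regardless of position.
  val grouped: Map[Char, String] = input.groupBy(identity)

  // span only takes the leading run, so separate runs stay separate.
  def runs(s: String): List[String] =
    if (s.isEmpty) Nil
    else {
      val (run, rest) = s.span(_ == s.head)
      run :: runs(rest)
    }
}
```

For inputs like the one in the question, where each character occurs in a single run, both approaches produce the same groups once the groupBy result is sorted.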
You could use some helper functions like this:
val str = "aaaabbcddddeeeeefff"
def zame(chars:List[Char]) = chars.span(_==chars.head) // span (not partition) so that only the leading run is taken
def q(chars:List[Char]):List[List[Char]] = chars match {
case Nil => Nil
case rest =>
val (thesame,others) = zame(rest)
thesame :: q(others)
}
q(str.toList) map (_.mkString)
This should do the trick, right? No doubt it can be cleaned up into one-liners even further
A functional* solution using fold:
def group(s : String) : Seq[String] = {
s.tail.foldLeft(Seq(s.head.toString)) { case (carry, elem) =>
if ( carry.last(0) == elem ) {
carry.init :+ (carry.last + elem)
}
else {
carry :+ elem.toString
}
}
}
There is a lot of cost hidden in all those sequence operations performed on strings (via implicit conversion). I guess the real complexity heavily depends on the kind of Seq strings are converted to.
(*) Afaik all/most operations in the collection library depend on iterators, an imho inherently unfunctional concept. But the code looks functional, at least.
Starting Scala 2.13, List is now provided with the unfold builder which can be combined with String::span:
List.unfold("aaaabbaaacdeeffg") {
case "" => None
case rest => Some(rest.span(_ == rest.head))
}
// List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
or alternatively, coupled with Scala 2.13's Option#unless builder:
List.unfold("aaaabbaaacdeeffg") {
rest => Option.unless(rest.isEmpty)(rest.span(_ == rest.head))
}
// List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
Details:
Unfold (def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): List[A]) is based on an internal state (init) which is initialized in our case with "aaaabbaaacdeeffg".
For each iteration, we span (def span(p: (Char) => Boolean): (String, String)) this internal state in order to find the prefix containing the same symbol and produce a (String, String) tuple which contains the prefix and the rest of the string. span is very fortunate in this context as it produces exactly what unfold expects: a tuple containing the next element of the list and the new internal state.
The unfolding stops when the internal state is "" in which case we produce None as expected by unfold to exit.
Edit: Have to read more carefully. Below is no functional code.
Sometimes, a little mutable state helps:
def group(s : String) = {
var tmp = ""
val b = Seq.newBuilder[String]
s.foreach { c =>
if ( tmp != "" && tmp.head != c ) {
b += tmp
tmp = ""
}
tmp += c
}
b += tmp
b.result
}
Runtime O(n) (if segments have at most constant length) and tmp.+= probably creates the most overhead. Use a string builder instead for strict runtime in O(n).
group("aaaabbcddeeeeeeffg")
> Seq[String] = List(aaaa, bb, c, dd, eeeeee, ff, g)
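The StringBuilder suggestion above could be sketched roughly like this. The object name `GroupBuilderSketch` is made up, and returning a List rather than a Seq is a choice made for the example. Unlike the version above, this sketch does not emit an empty string for empty input.

```scala
object GroupBuilderSketch {
  def group(s: String): List[String] = {
    val out = List.newBuilder[String]
    val run = new StringBuilder // current run of equal characters
    s.foreach { c =>
      // a different character closes the current run
      if (run.nonEmpty && run.charAt(0) != c) {
        out += run.toString
        run.clear()
      }
      run += c // appending to a StringBuilder is amortized O(1)
    }
    if (run.nonEmpty) out += run.toString // flush the last run
    out.result()
  }
}
```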
If you want to use scala API you can use the built in function for that:
str.groupBy(c => c).values
Or if you want it sorted and in a list:
str.groupBy(c => c).values.toList.sorted

Abort early in a fold

What's the best way to terminate a fold early? As a simplified example, imagine I want to sum up the numbers in an Iterable, but if I encounter something I'm not expecting (say an odd number) I might want to terminate. This is a first approximation
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
nums.foldLeft (Some(0): Option[Int]) {
case (Some(s), n) if n % 2 == 0 => Some(s + n)
case _ => None
}
}
However, this solution is pretty ugly (as in, if I did a .foreach and a return -- it'd be much cleaner and clearer) and worst of all, it traverses the entire iterable even if it encounters a non-even number.
So what would be the best way to write a fold like this, that terminates early? Should I just go and write this recursively, or is there a more accepted way?
My first choice would usually be to use recursion. It is only moderately less compact, is potentially faster (certainly no slower), and in early termination can make the logic more clear. In this case you need nested defs which is a little awkward:
def sumEvenNumbers(nums: Iterable[Int]) = {
def sumEven(it: Iterator[Int], n: Int): Option[Int] = {
if (it.hasNext) {
val x = it.next
if ((x % 2) == 0) sumEven(it, n+x) else None
}
else Some(n)
}
sumEven(nums.iterator, 0)
}
My second choice would be to use return, as it keeps everything else intact and you only need to wrap the fold in a def so you have something to return from--in this case, you already have a method, so:
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
Some(nums.foldLeft(0){ (n,x) =>
if ((x % 2) != 0) return None
n+x
})
}
which in this particular case is a lot more compact than recursion (though we got especially unlucky with recursion since we had to do an iterable/iterator transformation). The jumpy control flow is something to avoid when all else is equal, but here it's not. No harm in using it in cases where it's valuable.
If I was doing this often and wanted it within the middle of a method somewhere (so I couldn't just use return), I would probably use exception-handling to generate non-local control flow. That is, after all, what it is good at, and error handling is not the only time it's useful. The only trick is to avoid generating a stack trace (which is really slow), and that's easy because the trait NoStackTrace and its child trait ControlThrowable already do that for you. Scala already uses this internally (in fact, that's how it implements the return from inside the fold!). Let's make our own (can't be nested, though one could fix that):
import scala.util.control.ControlThrowable
case class Returned[A](value: A) extends ControlThrowable {}
def shortcut[A](a: => A) = try { a } catch { case Returned(v) => v }
def sumEvenNumbers(nums: Iterable[Int]) = shortcut{
Option(nums.foldLeft(0){ (n,x) =>
if ((x % 2) != 0) throw Returned(None)
n+x
})
}
Here of course using return is better, but note that you could put shortcut anywhere, not just wrapping an entire method.
Next in line for me would be to re-implement fold (either myself or to find a library that does it) so that it could signal early termination. The two natural ways of doing this are to not propagate the value but an Option containing the value, where None signifies termination; or to use a second indicator function that signals completion. The Scalaz lazy fold shown by Kim Stebel already covers the first case, so I'll show the second (with a mutable implementation):
def foldOrFail[A,B](it: Iterable[A])(zero: B)(fail: A => Boolean)(f: (B,A) => B): Option[B] = {
val ii = it.iterator
var b = zero
while (ii.hasNext) {
val x = ii.next
if (fail(x)) return None
b = f(b,x)
}
Some(b)
}
def sumEvenNumbers(nums: Iterable[Int]) = foldOrFail(nums)(0)(_ % 2 != 0)(_ + _)
(Whether you implement the termination by recursion, return, laziness, etc. is up to you.)
I think that covers the main reasonable variants; there are some other options also, but I'm not sure why one would use them in this case. (Iterator itself would work well if it had a findOrPrevious, but it doesn't, and the extra work it takes to do that by hand makes it a silly option to use here.)
The scenario you describe (exit upon some unwanted condition) seems like a good use case for the takeWhile method. It is essentially filter, but should end upon encountering an element that doesn't meet the condition.
For example:
val list = List(2,4,6,8,6,4,2,5,3,2)
list.takeWhile(_ % 2 == 0) //result is List(2,4,6,8,6,4,2)
This will work just fine for Iterators/Iterables too. The solution I suggest for your "sum of even numbers, but break on odd" is:
list.iterator.takeWhile(_ % 2 == 0).foldLeft(...)
And just to prove that it's not wasting your time once it hits an odd number...
scala> val list = List(2,4,5,6,8)
list: List[Int] = List(2, 4, 5, 6, 8)
scala> def condition(i: Int) = {
| println("processing " + i)
| i % 2 == 0
| }
condition: (i: Int)Boolean
scala> list.iterator.takeWhile(condition _).sum
processing 2
processing 4
processing 5
res4: Int = 6
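Filling in the elided fold from the takeWhile suggestion above, a minimal sketch could look like this (the names `TakeWhileSumSketch` and `sumEvenPrefix` are assumptions). Note that it yields the sum of the even prefix rather than None when an odd number is met, so it answers a slightly different question than the Option-returning versions.

```scala
object TakeWhileSumSketch {
  // Sums elements up to (not including) the first odd number.
  // The iterator makes takeWhile lazy, so nothing after the first odd element is visited.
  def sumEvenPrefix(nums: Iterable[Int]): Int =
    nums.iterator.takeWhile(_ % 2 == 0).foldLeft(0)(_ + _)
}
```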
You can do what you want in a functional style using the lazy version of foldRight in scalaz. For a more in depth explanation, see this blog post. While this solution uses a Stream, you can convert an Iterable into a Stream efficiently with iterable.toStream.
import scalaz._
import Scalaz._
val str = Stream(2,1,2,2,2,2,2,2,2)
var i = 0 //only here for testing
val r = str.foldr(Some(0):Option[Int])((n,s) => {
println(i)
i+=1
if (n % 2 == 0) s.map(n+) else None
})
This only prints
0
1
which clearly shows that the anonymous function is only called twice (i.e. until it encounters the odd number). That is due to the definition of foldr, whose signature (in case of Stream) is def foldr[B](b: B)(f: (Int, => B) => B)(implicit r: scalaz.Foldable[Stream]): B. Note that the anonymous function takes a by-name parameter as its second argument, so it need not be evaluated.
Btw, you can still write this with the OP's pattern matching solution, but I find if/else and map more elegant.
Well, Scala does allow non local returns. There are differing opinions on whether or not this is a good style.
scala> def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
| nums.foldLeft (Some(0): Option[Int]) {
| case (None, _) => return None
| case (Some(s), n) if n % 2 == 0 => Some(s + n)
| case (Some(_), _) => None
| }
| }
sumEvenNumbers: (nums: Iterable[Int])Option[Int]
scala> sumEvenNumbers(2 to 10)
res8: Option[Int] = None
scala> sumEvenNumbers(2 to 10 by 2)
res9: Option[Int] = Some(30)
EDIT:
In this particular case, as @Arjan suggested, you can also do:
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
nums.foldLeft (Some(0): Option[Int]) {
case (Some(s), n) if n % 2 == 0 => Some(s + n)
case _ => return None
}
}
You can use foldM from the cats library (as suggested by @Didac), but I suggest using Either instead of Option if you want to get the actual sum out.
bifoldMap is used to extract the result from Either.
import cats.implicits._
def sumEven(nums: Stream[Int]): Either[Int, Int] = {
nums.foldM(0) {
case (acc, n) if n % 2 == 0 => Either.right(acc + n)
case (acc, n) => {
println(s"Stopping on number: $n")
Either.left(acc)
}
}
}
examples:
println("Result: " + sumEven(Stream(2, 2, 3, 11)).bifoldMap(identity, identity))
> Stopping on number: 3
> Result: 4
println("Result: " + sumEven(Stream(2, 7, 2, 3)).bifoldMap(identity, identity))
> Stopping on number: 7
> Result: 2
Cats has a method called foldM which does short-circuiting (for Vector, List, Stream, ...).
It works as follows:
def sumEvenNumbers(nums: Stream[Int]): Option[Long] = {
import cats.implicits._
nums.foldM(0L) {
case (acc, c) if c % 2 == 0 => Some(acc + c)
case _ => None
}
}
If it finds an odd element it returns None without computing the rest; otherwise it returns the sum of the even entries.
If you want to keep the sum accumulated until an odd entry is found, you should use an Either[Long, Long].
@Rex Kerr your answer helped me, but I needed to tweak it to use Either
def foldOrFail[A,B,C,D](map: B => Either[D, C])(merge: (A, C) => A)(initial: A)(it: Iterable[B]): Either[D, A] = {
val ii= it.iterator
var b= initial
while (ii.hasNext) {
val x= ii.next
map(x) match {
case Left(error) => return Left(error)
case Right(d) => b= merge(b, d)
}
}
Right(b)
}
You could try using a temporary var and using takeWhile. Here is a version.
var continue = true
// sample stream of 2's and then a stream of 3's.
val evenSum = (Stream.fill(10)(2) ++ Stream.fill(10)(3)).takeWhile(_ => continue)
.foldLeft(Option[Int](0)){
case (result,i) if i%2 != 0 =>
continue = false;
// return whatever is appropriate either the accumulated sum or None.
result
case (optionSum,i) => optionSum.map( _ + i)
}
The evenSum should be Some(20) in this case.
You can throw a well-chosen exception upon encountering your termination criterion, handling it in the calling code.
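A minimal sketch of that idea, assuming a hypothetical marker exception named `OddNumberFound` (not from any library) and mixing in `NoStackTrace` so that no stack trace is filled in when it is thrown:

```scala
import scala.util.control.NoStackTrace

object ExceptionAbortSketch {
  // Hypothetical marker exception used only to leave the fold early.
  object OddNumberFound extends RuntimeException with NoStackTrace

  def sumEvenNumbers(nums: Iterable[Int]): Option[Int] =
    try Some(nums.foldLeft(0) { (acc, x) =>
      if (x % 2 != 0) throw OddNumberFound // termination criterion: abort the fold
      acc + x
    })
    catch { case OddNumberFound => None } // handled right around the fold
}
```

Here the handler sits directly around the fold; it could equally live further up in the calling code, as long as nothing in between catches the exception first.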
A more beautiful solution would be to use span:
val (l, r) = numbers.span(_ % 2 == 0)
if(r.isEmpty) Some(l.sum)
else None
... but it traverses the list two times if all the numbers are even
Just for "academic" reasons (:
var headers = Source.fromFile(file).getLines().next().split(",")
var closeHeaderIdx = headers.takeWhile { s => !"Close".equals(s) }.foldLeft(0)((i, S) => i+1)
It does about twice the work it should, but it is a nice one-liner.
If "Close" is not found it will return
headers.size
Another (better) is this one:
var headers = Source.fromFile(file).getLines().next().split(",").toList
var closeHeaderIdx = headers.indexOf("Close")