Scala - map over sequence, stopping immediately when element cannot be processed

Scala - map over sequence, stopping immediately when element cannot be processed - scala

I'd like a function that maps a function f over a sequence xs, and if f(x) (where x is an element of xs) produces a Failure then don't process any further elements of xs but immediately return Failure. If f(x) succeeds for all x then return a Success containing a sequence of the results.
So the type signature might be something like
def traverse[A, B](xs: Seq[A])(f: A => Try[B]): Try[Seq[B]]
And some test cases:
def doWork(i: Int): Try[Int] = {
i match {
case 1 => Success(10)
case 2 => Failure(new IllegalArgumentException("computer says no"))
case 3 => Success(30)
}
}
traverse(Seq(1,2,3))(doWork)
res0: scala.util.Try[Seq[Int]] = Failure(java.lang.IllegalArgumentException: computer says no)
traverse(Seq(1,3))(doWork)
scala.util.Try[Seq[Int]] = Success(List(10, 30))
What would be the most elegant way to implement this?

Simple implementation:
def traverse[A, B](xs: Seq[A])(f: A => Try[B]): Try[Seq[B]] =
xs.foldLeft[Try[Seq[B]]](Success(Vector())) { (attempt, elem) => for {
seq <- attempt
next <- f(elem)
} yield seq :+ next
}
Trouble here that while function will not evaluate f after the Failure will occur, function will traverse the sequence to the end , which could be undesirable in case of some complex Stream, so we may use some specialized version:
def traverse1[A, B](xs: Seq[A])(f: A => Try[B]): Try[Seq[B]] = {
val ys = xs map f
ys find (_.isFailure) match {
case None => Success(ys map (_.get))
case Some(Failure(ex)) => Failure(ex)
}
}
which uses intermediate collection, which leads to unnecessary memory overhead in case of strict collection
or we could reimplement fold from scratch:
def traverse[A, B](xs: Seq[A])(f: A => Try[B]): Try[Seq[B]] = {
def loop(xs: Seq[A], acc: Seq[B]): Try[Seq[B]] = xs match {
case Seq() => Success(acc)
case elem +: tail =>
f(elem) match {
case Failure(ex) => Failure(ex)
case Success(next) => loop(tail, acc :+ next)
}
}
loop(xs, Vector())
}
As we could see inner loop will continue iterations while it deals only with Success

One way, but is it the most elegant?
def traverse[A, B](xs: Seq[A])(f: A => Try[B]): Try[Seq[B]] = {
Try(xs.map(f(_).get))
}

Related

Is it possible to pattern match on a by-name parameter without evaluating it?

Was playing with Lazy Structure Stream as below
import Stream._
sealed trait Stream[+A] {
..
def toList: List[A] = this match {
case Empty => Nil
case Cons(h, t) => println(s"${h()}::t().toList"); h()::t().toList
}
def foldRight[B](z: B) (f: ( A, => B) => B) : B = this match {
case Empty => println(s"foldRight of Empty return $z"); z
case Cons(h, t) => println(s"f(${h()}, t().foldRight(z)(f))"); f(h(), t().foldRight(z)(f))
}
..
}
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](h: => A, t: => Stream[A]): Stream[A] = {
lazy val hd = h
lazy val tl = t
Cons[A](() => hd, () => tl)
}
def empty[A]: Stream[A] = Empty
def apply[A](la: A*): Stream[A] = la match {
case list if list.isEmpty => empty[A]
case _ => cons(la.head, apply(la.tail:_*))
}
}
For a function takeWhile via foldRight i initially wrote:
def takeWhileFoldRight_0(p: A => Boolean) : Stream[A] = {
foldRight(empty[A]) {
case (a, b) if p(a) => println(s"takeWhileFoldRight cons($a, b) with p(a) returns: cons($a, b)"); cons(a, b)
case (a, b) if !p(a) => println(s"takeWhileFoldRight cons($a, b) with !p(a) returns: empty[A]"); empty[A]
}
}
Which when called as:
Stream(4,5,6).takeWhileFoldRight_0(_%2 == 0).toList
result in the following trace:
f(4, t().foldRight(z)(f))
f(5, t().foldRight(z)(f))
f(6, t().foldRight(z)(f))
foldRight of Empty return Empty
takeWhileFoldRight cons(6, b) with p(a) returns: cons(6, b)
takeWhileFoldRight cons(5, b) with !p(a) returns: empty[A]
takeWhileFoldRight cons(4, b) with p(a) returns: cons(4, b)
4::t().toList
res2: List[Int] = List(4)
Then questioning and questioning i figured that it might have been the unapply method in the pattern match that evaluate eagerly.
So i changed to
def takeWhileFoldRight(p: A => Boolean) : Stream[A] = {
foldRight(empty[A]) { (a, b) =>
if (p(a)) cons(a, b) else empty[A]
}
}
which when called as
Stream(4,5,6).takeWhileFoldRight(_%2 == 0).toList
result in the following trace:
f(4, t().foldRight(z)(f))
4::t().toList
f(5, t().foldRight(z)(f))
res1: List[Int] = List(4)
Hence my question:
Is there a way to recover the power of pattern match when working with by-name parameter ?
Said differently case i match parameter that are by-name without evaluating them eagerly ?
Or i have to go to a set of ugly nested "if" :p in that kind of scenario

Take a closer look at this fragment:
def toList: List[A] = this match {
case Empty => Nil
case Cons(h, t) => println(s"${h()}::t().toList"); h()::t().toList
}
def foldRight[B](z: B) (f: ( A, => B) => B) : B = this match {
case Empty => println(s"foldRight of Empty return $z"); z
case Cons(h, t) => println(s"f(${h()}, t().foldRight(z)(f))"); f(h(), t().foldRight(z)(f))
}
..
}
Here h and t in Cons aren't evaluated by unapply - after all unapply returns () => X functions without calling them. But you do. Twice for each match - once for printing and once for passing the result on. And you aren't remembering the result, so any future fold, map, etc would evaluate the function anew.
Depending on what behavior you want to have you should either:
Calculate the results once, right after matching them:
case Cons(h, t) =>
val hResult = h()
val tResult = t()
println(s"${hResult}::tail.toList")
hResult :: tResult.toList
or
not use case class because it cannot memoize the result and you might need to memoize it:
class Cons[A](fHead: () => A, fTail: () => Stream[A]) extends Stream[A] {
lazy val head: A = fHead()
lazy val tail: Stream[A] = fTail()
// also override: toString, equals, hashCode, ...
}
object Cons {
def apply[A](head: => A, tail: => Stream[A]): Stream[A] =
new Cons(() => head, () => tail)
def unapply[A](stream: Stream[A]): Option[(A, Stream[A])] = stream match {
case cons: Cons[A] => Some((cons.head, cons.tail)) // matches on type, doesn't use unapply
case _ => None
}
}
If you understand what you're doing you could also create a case class with overridden apply and unapply (like above) but that is almost always a signal that you shouldn't use a case class in the first place (because most likely toString, equals, hashCode, etc would have nonsensical implementation).

Scala cats trampoline

The test("ok") is copied from book "scala with cats" by Noel Welsh and Dave Gurnell pag.254 ("D.4 Safer Folding using Eval
"), the code run fine, it's the trampolined foldRight
import cats.Eval
test("ok") {
val list = (1 to 100000).toList
def foldRightEval[A, B](as: List[A], acc: Eval[B])(fn: (A, Eval[B]) => Eval[B]): Eval[B] =
as match {
case head :: tail =>
Eval.defer(fn(head, foldRightEval(tail, acc)(fn)))
case Nil =>
acc
}
def foldRight[A, B](as: List[A], acc: B)(fn: (A, B) => B): B =
foldRightEval(as, Eval.now(acc)) { (a, b) =>
b.map(fn(a, _))
}.value
val res = foldRight(list, 0L)(_ + _)
assert(res == 5000050000l)
}
The test("ko") returns same values of test("ok") for small list but for long list the value is different. Why?
test("ko") {
val list = (1 to 100000).toList
def foldRightSafer[A, B](as: List[A], acc: B)(fn: (A, B) => B): Eval[B] = as match {
case head :: tail =>
Eval.defer(foldRightSafer(tail, acc)(fn)).map(fn(head, _))
case Nil => Eval.now(acc)
}
val res = foldRightSafer(list, 0)((a, b) => a + b).value
assert(res == 5000050000l)
}

This is #OlegPyzhcov's comment, converted into a community wiki answer
You forgot the L in 0L passed as second argument to foldRightSafer.
Because of that, the inferred generic types of the invocation are
foldRightSafer[Int, Int]((list : List[Int]), (0: Int))((_: Int) + (_: Int))
and so your addition overflows and gives you something smaller than 2000000000 (9 zeroes, Int.MaxValue = 2147483647).

How stream passes incremental?

I am trying to understand how Stream works and have following Stream implementation:
sealed trait Stream[+A] {
def toList: List[A] = {
#annotation.tailrec
def go(s: Stream[A], acc: List[A]): List[A] = s match {
case Cons(h, t) => go(t(), h() :: acc)
case _ => acc
}
go(this, List()).reverse
}
def foldRight[B](z: => B)(f: (A, => B) => B): B =
this match {
case Cons(h, t) => f(h(), t().foldRight(z)(f))
case _ => z
}
def map[B](f: A => B): Stream[B] =
this.foldRight(Stream.empty[B])((x, y) => Stream.cons(f(x), y))
def filter(f: A => Boolean): Stream[A] =
this.foldRight(Stream.empty[A])((h, t) => if (f(h)) Stream.cons(h, t) else t)
}
case object Empty extends Stream[Nothing]
case class Cons[+A](h: () => A, t: () => Stream[A]) extends Stream[A]
object Stream {
def cons[A](hd: => A, t1: => Stream[A]): Stream[A] = {
lazy val head = hd
lazy val tail = t1
Cons(() => head, () => tail)
}
def empty[A]: Stream[A] = Empty
def apply[A](as: A*): Stream[A] =
if (as.isEmpty) empty else cons(as.head, apply(as.tail: _*))
}
and the code that is using Stream:
Stream(1,2,3,4).map((x) => {
println(x)
x + 10
}).filter((x) => {
println(x)
x % 2 == 0
}).toList
as output I've got:
1
11
2
12
3
13
4
14
res4: List[Int] = List(12, 14)
As you can see on the output, there is no intermediate result, the source will be pass one for one, how is that possible?
I can not image, how does it work.

Let's take a look at what the methods you used do on Stream:
map and filter are both implemented with foldRight. To make it clearer, let's inline foldRight inside map (the same can be done with filter), using the referential transparency principle:
def map[B](f: A => B) = this match {
case Cons(h, t) => Stream.cons(f(h()), t().map(f))
case _ => Empty
}
Now, where in this code is f evaluated? Never, since Stream.cons parameters are call-by-name, so we only give the description for the new stream, not its values.
Once you are convinced of this fact, you can easily see that the same will apply for filter, so we can move forward to toList.
It will evaluate each element in the Stream, putting the values in a List that will be reversed at the end.
But evaluating an element of the Stream which has been filtered and mapped is precisely reading the description of the values, so the actual functions are evaluated here. Hence the console output in order: first the map function is called then the filter function, for each element, one at a time (since we are now on the lazily mapped and filtered Stream).

Change list of Eithers to two value lists in scala

How can I change list of Eithers into two list of value Right and Left. When I use partition it returns two lists of Either's not values. What is the simplest way to do it?

foldLeft allows you to easily write your own method:
def separateEithers[T, U](list: List[Either[T, U]]) = {
val (ls, rs) = list.foldLeft(List[T](), List[U]()) {
case ((ls, rs), Left(x)) => (x :: ls, rs)
case ((ls, rs), Right(x)) => (ls, x :: rs)
}
(ls.reverse, rs.reverse)
}

You'll have to map the two resulting lists after partitioning.
val origin: List[Either[A, B]] = ???
val (lefts, rights) = origin.partition(_.isInstanceOf[Left[_]])
val leftValues = lefts.map(_.asInstanceOf[Left[A]].a)
val rightValues = rights.map(_.asInstanceOf[Right[B]].b)
If you are not happy with the casts and isInstanceOf's, you can also do it in two passes:
val leftValues = origin collect {
case Left(a) => a
}
val rightValues = origin collect {
case Right(b) => b
}
And if you are not happy with the two passes, you'll have to do it "by hand":
def myPartition[A, B](origin: List[Either[A, B]]): (List[A], List[B]) = {
val leftBuilder = List.newBuilder[A]
val rightBuilder = List.newBuilder[B]
origin foreach {
case Left(a) => leftBuilder += a
case Right(b) => rightBuilder += b
}
(leftBuilder.result(), rightBuilder.result())
}
Finally, if you don't like mutable state, you can do so:
def myPartition[A, B](origin: List[Either[A, B]]): (List[A], List[B]) = {
#tailrec
def loop(xs: List[Either[A, B]], accLeft: List[A],
accRight: List[B]): (List[A], List[B]) = {
xs match {
case Nil => (accLeft.reverse, accRight.reverse)
case Left(a) :: xr => loop(xr, a :: accLeft, accRight)
case Right(b) :: xr => loop(xr, accLeft, b :: accRight)
}
}
loop(origin, Nil, Nil)
}

If making two passes through the list is okay for you, you can use collect:
type E = Either[String, Int]
val xs: List[E] = List(Left("foo"), Right(1), Left("bar"), Right(2))
val rights = xs.collect { case Right(x) => x}
// rights: List[Int] = List(1, 2)
val lefts = xs.collect { case Left(x) => x}
// lefts: List[String] = List(foo, bar)

Using for comprehensions, like this,
for ( Left(v) <- xs ) yield v
and
for ( Right(v) <- xs ) yield v

How can I make this Scala function (a "flatMap" variant) tail recursive?

I'm having a look at the following code
http://aperiodic.net/phil/scala/s-99/p26.scala
Specifically
def flatMapSublists[A,B](ls: List[A])(f: (List[A]) => List[B]): List[B] =
ls match {
case Nil => Nil
case sublist#(_ :: tail) => f(sublist) ::: flatMapSublists(tail)(f)
}
I'm getting a StackOverflowError for large values presumably because the function is not tail recursive. Is there a way to transform the function to accommodate large numbers?

It is definitely not tail recursive. The f(sublist) ::: is modifying the results of the recursive call, making it a plain-old-stack-blowing recursion instead of a tail recursion.
One way to ensure that your functions are tail recursive is to put the #annotation.tailrec on any function that you expect to be tail recursive. The compiler will report an error if it fails to perform the tail call optimization.
For this, I would add a small helper function that's actually tail recursive:
def flatMapSublistsTR[A,B](ls: List[A])(f: (List[A]) => List[B]): List[B] = {
#annotation.tailrec
def helper(r: List[B], ls: List[A]): List[B] = {
ls match {
case Nil => r
case sublist#(_ :: tail) => helper(r ::: f(sublist), tail)
}
}
helper(Nil, ls)
}
For reasons not immediately obvious to me, the results come out in a different order than the original function. But, it looks like it works :-) Fixed.

Here is another way to implement the function:
scala> def flatMapSublists[A,B](ls: List[A])(f: (List[A]) => List[B]): List[B] =
| List.iterate(ls, ls.size)(_.tail).flatMap(f)
flatMapSublists: [A, B](ls: List[A])(f: List[A] => List[B])List[B]
A simply comparison between dave's flatMapSublistsTR and mine:
scala> def time(count: Int)(call : => Unit):Long = {
| val start = System.currentTimeMillis
| var cnt = count
| while(cnt > 0) {
| cnt -= 1
| call
| }
| System.currentTimeMillis - start
| }
time: (count: Int)(call: => Unit)Long
scala> val xs = List.range(0,100)
scala> val fn = identity[List[Int]] _
fn: List[Int] => List[Int] = <function1>
scala> time(10000){ flatMapSublists(xs)(fn) }
res1: Long = 5732
scala> time(10000){ flatMapSublistsTR(xs)(fn) }
res2: Long = 347232
Where the method flatMapSublistsTR is implemented as:
def flatMapSublistsTR[A,B](ls: List[A])(f: (List[A]) => List[B]): List[B] = {
#annotation.tailrec
def helper(r: List[B], ls: List[A]): List[B] = {
ls match {
case Nil => r
case sublist#(_ :: tail) => helper(r ::: f(sublist), tail)
}
}
helper(Nil, ls)
}

def flatMapSublists2[A,B](ls: List[A], result: List[B] = Nil)(f: (List[A]) => List[B]): List[B] =
ls match {
case Nil => result
case sublist#(_ :: tail) => flatMapSublists2(tail, result ++ f(sublist))(f)
}
You generally just need to add a result result parameter to carry from one iteration to the next, and spit out the result at the end instead of adding the end to the list.
Also that confusting sublist# thing can be simplified to
case _ :: tail => flatMapSublists2(tail, result ++ f(ls))(f)
Off-topic: here's how I solved problem 26, without the need for helper methods like the one above. If you can make this tail-recursive, have a gold star.
def combinations[A](n: Int, lst: List[A]): List[List[A]] = n match {
case 1 => lst.map(List(_))
case _ => lst.flatMap(i => combinations (n - 1, lst.dropWhile(_ != i).tail) map (i :: _))
}

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Scala - map over sequence, stopping immediately when element cannot be processed - scala

One way, but is it the most elegant? def traverse[A, B](xs: Seq[A])(f: A => Try[B]): Try[Seq[B]] = { Try(xs.map(f(_).get)) }

Related

Is it possible to pattern match on a by-name parameter without evaluating it?

Scala cats trampoline

How stream passes incremental?

Change list of Eithers to two value lists in scala

How can I make this Scala function (a "flatMap" variant) tail recursive?

Categories

Resources