Implementing recursive, stack safe function using State Monad

I want to implement a function representing a while loop using the State monad from cats.
I did it like this:
def whileLoopState[S](cond: S => Boolean)(block: S => S): State[S, Unit] = State { state =>
if (cond(state)) {
val nextState = block(state)
whileLoopState(cond)(block).run(nextState).value
} else {
(state, ())
}
}
The problem with this implementation is that it's not stack safe,
because the recursive call is not in tail position, so the following
results in a stack overflow error:
whileLoopState[Int](s => s > 0) { s =>
println(s)
s - 1
}.run(10000).value
Cats provides a tailRecM method for every instance of the Monad trait,
which makes it possible to write stack-safe recursive monadic functions:
type WhileLoopState[A] = State[Unit, A]
def whileLoopStateTailRec[S](cond: S => Boolean)(block: S => S)(initialState: S): WhileLoopState[S] = Monad[WhileLoopState]
.tailRecM(initialState) { newState =>
State { _ =>
if (cond(newState)) {
val nextState = block(newState)
((), Left(nextState))
} else {
((), Right(newState))
}
}
}
Now this works:
whileLoopStateTailRec[Int](s => s > 0) { s =>
println(s)
s - 1
} (10000).run().value
but the implementation of whileLoopStateTailRec seems too convoluted
for a simple case like this and therefore raises my suspicion that I'm not doing things correctly.
Is there a way to simplify it?
Is it possible to use State[A, Unit] instead of State[Unit, A] so that the state is kept in the proper slot?
Is it possible to make a recursive function using the State monad stack safe without using tailRecM?

You can either take advantage of the fact that flatMap on State is stack safe, like this:
def whileLoopState[S](cond: S => Boolean)(block: S => S): State[S, Unit] =
State.get[S].flatMap { s =>
if (cond(s)) State.set(block(s)) >> whileLoopState(cond)(block)
else State.pure(())
}
Or, even better, just reuse existing syntax:
def whileLoopState[S](cond: S => Boolean)(block: S => S): State[S, Unit] =
State.modify(block).whileM_(State.inspect(cond))
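For intuition about why the flatMap-based version does not overflow the stack: cats builds State on top of Eval, which trampolines its flatMap calls. The same idea can be sketched with only the standard library's scala.util.control.TailCalls. This is an illustration of trampolining, not the cats implementation; TrampolinedWhile and whileLoop are names made up for the example.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

object TrampolinedWhile {
  // Each step returns a description of the next step instead of calling it
  // directly, so the loop runs in constant stack space when `result` is forced.
  def whileLoop[S](cond: S => Boolean)(block: S => S)(state: S): TailRec[S] =
    if (cond(state)) tailcall(whileLoop(cond)(block)(block(state)))
    else done(state)
}
```

Calling TrampolinedWhile.whileLoop[Int](_ > 0)(_ - 1)(1000000).result runs the whole loop on the heap rather than on the call stack.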

Related

scala syntax explanation involving higher order functions, type parameter and return type

I am having problems understanding the Scala syntax, please advise. I have two snippets of code.
abstract class Try[T] {
def flatMap[U](f: T => Try[U]): Try[U] = this match {
case Success(x) => try f(x) catch { case NonFatal(ex) => Failure(ex) }
case fail: Failure => fail
}
}
My understanding:
flatMap receives a function f as its parameter. This function f in turn
takes a value of type T and returns a Try[U].
flatMap ultimately returns a Try[U] as well.
Q1 - Is my understanding correct?
Q2 - what is the relation between the return type from f (namely Try[U]) and the return type of flat map Try[U]? Does it have to be the same?
def flatMap[U](f: T => Try[U]): Try[U]
Or can I somehow have something like
def flatMap[U](f: T => Option[U]): Try[U]
In the last snippet of code, I guess that, after I use the function f inside my flatMap, I would need to make the connection between the output of f (namely Option[U]) and the final output demanded by flatMap (I mean Try[U])
EDIT
This code is taken from a Scala course. Here is the full code (some people asked about it). I just want to understand the syntax.
abstract class Try[T] {
def flatMap[U](f: T => Try[U]): Try[U] = this match {
case Success(x) => try f(x) catch { case NonFatal(ex) => Failure(ex) }
case fail: Failure => fail
}
def map[U](f: T => U): Try[U] = this match {
case Success(x) => Try(f(x))
case fail: Failure => fail
}
}

Q1 - Is my understanding correct?
It's hard to comment based on your sample code, which has method implementations in an abstract class while no concrete classes are defined. Let's consider the following toy version of Try, extracted from the Scala API, with the flatMap implementation in its concrete classes:
import scala.util.control.NonFatal
sealed abstract class MyTry[+T] {
def flatMap[U](f: T => MyTry[U]): MyTry[U]
}
object MyTry {
def apply[T](r: => T): MyTry[T] =
try MySuccess(r) catch { case NonFatal(e) => MyFailure(e) }
}
final case class MyFailure[+T](exception: Throwable) extends MyTry[T] {
override def flatMap[U](f: T => MyTry[U]): MyTry[U] =
this.asInstanceOf[MyTry[U]]
}
final case class MySuccess[+T](value: T) extends MyTry[T] {
override def flatMap[U](f: T => MyTry[U]): MyTry[U] =
try f(value) catch { case NonFatal(e) => MyFailure(e) }
}
Testing it out with the following function f: T => MyTry[U] where T = String and U = Int, I hope it helps answer your question:
val f: String => MyTry[Int] = s => s match {
case "bad" => MyFailure(new Exception("oops"))
case s => MySuccess(s.length)
}
MyTry("abcde").flatMap(f)
// res1: MyTry[Int] = MySuccess(5)
MyTry("bad").flatMap(f)
// res2: MyTry[Int] = MyFailure(java.lang.Exception: oops)
Q2 - what is the relation between the return type from f (namely Try[U])
and the return type of flat map Try[U]? Does it have to be the same?
In Scala, flatMap is a common method defined in many of Scala's containers/collections, such as Option[T], List[T], Try[T] and Future[T], with a standard signature:
class Container[T] {
def flatMap[U](f: T => Container[U]): Container[U]
}
If you want to have a special map that takes a T => Container1[U] function and returns a Container2[U], it'd probably be best not to name it flatMap.
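One practical reason the standard shape matters is that for expressions are rewritten by the compiler into flatMap and map calls, so the function passed to flatMap must return the same container. A minimal sketch using scala.util.Try (ForDesugar and parse are names invented for this example):

```scala
import scala.util.Try

object ForDesugar {
  def parse(s: String): Try[Int] = Try(s.toInt)

  // A for expression over Try ...
  val viaFor: Try[Int] =
    for {
      a <- parse("20")
      b <- parse("22")
    } yield a + b

  // ... is equivalent to nested flatMap/map calls with the standard signature.
  val viaFlatMap: Try[Int] =
    parse("20").flatMap(a => parse("22").map(b => a + b))
}
```

Both values hold the same result; a failure in either parse call would short-circuit the rest in both forms.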

Q1 Largely correct, but just to clarify, all of this happens at compile time - T is not known at runtime (see here)
Q2 Of course you can create a method with signature
...[U](f: T => Option[U]): Try[U]
and you're free to call that method flatMap, but it won't be a standard flatMap:
trait T[A] {
def flatMap[B](f: A => T[B]): T[B]
}
There are mathematical reasons for the form of flatMap (which also have implications in Scala's implementation of for expressions). To avoid confusion ...
Rather than altering flatMap's signature, wrap your T => Option[U] with an Option[U] => Try[U] to create a T => Try[U] before passing it to flatMap.
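A minimal sketch of that wrapping, assuming a missing value should become a Failure carrying a NoSuchElementException (that choice, and the names optionToTry, ages and lookup, are assumptions made for the example):

```scala
import scala.util.{Failure, Success, Try}

object OptionIntoTry {
  // The Option[U] => Try[U] adapter: None has to be mapped to some exception.
  def optionToTry[U](o: Option[U]): Try[U] = o match {
    case Some(u) => Success(u)
    case None    => Failure(new NoSuchElementException("empty Option"))
  }

  val ages: Map[String, Int] = Map("alice" -> 30)

  // A function of the "wrong" shape, T => Option[U] ...
  val lookup: String => Option[Int] = name => ages.get(name)

  // ... composed with the adapter to get the T => Try[U] that flatMap expects.
  val viaTry: String => Try[Int] = name => optionToTry(lookup(name))

  val found: Try[Int]   = Try("alice").flatMap(viaTry)
  val missing: Try[Int] = Try("bob").flatMap(viaTry)
}
```

Here found is a Success and missing is a Failure, while flatMap itself keeps its standard signature.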

functional parallelism and laziness in Scala

Background
I have been reading the book Functional Programming in Scala, and have some questions regarding the content in Chapter 7: Purely functional parallelism.
Here is the code for the answers in the book: Par.scala, but I am confused about certain parts of it.
Here is the first part of the code of Par.scala, which stands for Parallelism:
import java.util.concurrent._
object Par {
type Par[A] = ExecutorService => Future[A]
def unit[A](a: A): Par[A] = (es: ExecutorService) => UnitFuture(a)
private case class UnitFuture[A](get: A) extends Future[A] {
def isDone = true
def get(timeout: Long, units: TimeUnit): A = get
def isCancelled = false
def cancel(evenIfRunning: Boolean): Boolean = false
}
def map2[A, B, C](a: Par[A], b: Par[B])(f: (A, B) => C): Par[C] =
(es: ExecutorService) => {
val af = a(es)
val bf = b(es)
UnitFuture(f(af.get, bf.get))
}
def fork[A](a: => Par[A]): Par[A] =
(es: ExecutorService) => es.submit(new Callable[A] {
def call: A = a(es).get
})
def lazyUnit[A](a: => A): Par[A] =
fork(unit(a))
def run[A](es: ExecutorService)(a: Par[A]): Future[A] = a(es)
def asyncF[A, B](f: A => B): A => Par[B] =
a => lazyUnit(f(a))
def map[A, B](pa: Par[A])(f: A => B): Par[B] =
map2(pa, unit(()))((a, _) => f(a))
}
The simplest possible model for Par[A] might be ExecutorService => Future[A], and run simply returns the Future.
unit promotes a constant value to a parallel computation by returning a UnitFuture, which is a simple implementation of Future that just wraps a constant value.
map2 combines the results of two parallel computations with a binary function.
fork marks a computation for concurrent evaluation. The evaluation won't actually occur until forced by run. Shown here is its simplest and most natural implementation. It has its problems, but let's put them aside for now.
lazyUnit wraps its unevaluated argument in a Par and marks it for concurrent evaluation.
run extracts a value from a Par by actually performing the computation.
asyncF converts any function A => B to one that evaluates its result asynchronously.
Questions
The fork function confuses me a lot here, because it takes a lazy (by-name) argument, which is only evaluated later when it is used. My questions are mostly about when to use fork, i.e. when we need lazy evaluation and when we need the value directly.
Here is an exercise from the book:
EXERCISE 7.5
Hard: Write this function, called sequence. No additional primitives are required. Do not call run.
def sequence[A](ps: List[Par[A]]): Par[List[A]]
And here are the answers (offered here).
First
def sequence_simple[A](l: List[Par[A]]): Par[List[A]] =
l.foldRight[Par[List[A]]](unit(List()))((h, t) => map2(h, t)(_ :: _))
What is the difference between the above code and the following:
def sequence_simple[A](l: List[Par[A]]): Par[List[A]] =
l.foldLeft[Par[List[A]]](unit(List()))((t, h) => map2(h, t)(_ :: _))
Additionally
def sequenceRight[A](as: List[Par[A]]): Par[List[A]] =
as match {
case Nil => unit(Nil)
case h :: t => map2(h, fork(sequenceRight(t)))(_ :: _)
}
def sequenceBalanced[A](as: IndexedSeq[Par[A]]): Par[IndexedSeq[A]] = fork {
if (as.isEmpty) unit(Vector())
else if (as.length == 1) map(as.head)(a => Vector(a))
else {
val (l,r) = as.splitAt(as.length/2)
map2(sequenceBalanced(l), sequenceBalanced(r))(_ ++ _)
}
}
In sequenceRight, fork wraps the recursive call directly. In sequenceBalanced, however, fork wraps the whole function body.
What, then, is the difference between the code above and the following (where the positions of fork are switched):
def sequenceRight[A](as: List[Par[A]]): Par[List[A]] = fork {
as match {
case Nil => unit(Nil)
case h :: t => map2(h, sequenceRight(t))(_ :: _)
}
}
def sequenceBalanced[A](as: IndexedSeq[Par[A]]): Par[IndexedSeq[A]] =
if (as.isEmpty) unit(Vector())
else if (as.length == 1) map(as.head)(a => Vector(a))
else {
val (l,r) = as.splitAt(as.length/2)
map2(fork(sequenceBalanced(l)), fork(sequenceBalanced(r)))(_ ++ _)
}
Finally, given the sequence defined above, we have the following function:
def parMap[A,B](ps: List[A])(f: A => B): Par[List[B]] = fork {
val fbs: List[Par[B]] = ps.map(asyncF(f))
sequence(fbs)
}
I would like to know, can I also implement the function in the following way, which is by applying the lazyUnit defined in the beginning? Is this implementation lazyUnit(ps.map(f)) lazy?
def parMapByLazyUnit[A, B](ps: List[A])(f: A => B): Par[List[B]] =
lazyUnit(ps.map(f))

I did not completely understand your doubt. But I see a major problem with the following solution,
def parMapByLazyUnit[A, B](ps: List[A])(f: A => B): Par[List[B]] =
lazyUnit(ps.map(f))
To understand the problem, let's look at the definition of lazyUnit:
def fork[A](a: => Par[A]): Par[A] =
(es: ExecutorService) => es.submit(new Callable[A] {
def call: A = a(es).get
})
def lazyUnit[A](a: => A): Par[A] =
fork(unit(a))
So lazyUnit takes an expression of type => A, submits it to the ExecutorService for evaluation, and returns the wrapped result of that computation as a Par[A].
In parMap, for every element of ps: List[A], we not only have to evaluate the corresponding mapping using the function f: A => B, we also have to do these evaluations in parallel.
But our solution lazyUnit(ps.map(f)) submits the whole ps.map(f) evaluation as a single task to the ExecutorService, which means the elements are not processed in parallel.
What we need to do is make sure that for each element a in ps: List[A], the function f: A => B is executed as a separate task on the ExecutorService.
As we saw from the implementation, we can run an expression exp: => A by using lazyUnit(exp) to get a result of type Par[A].
So we will do exactly that for every a: A in ps: List[A]:
val parMappedTmp = ps.map( a => lazyUnit(f(a) ) )
// or
val parMappedTmp = ps.map( a => asyncF(f)(a) )
// or
val parMappedTmp = ps.map(asyncF(f))
But now our parMappedTmp is a List[Par[B]], whereas we needed a Par[List[B]].
So you will need a function with the following signature to get what you want:
def sequence[A](ps: List[Par[A]]): Par[List[A]]
Once you have it,
val parMapped = sequence(parMappedTmp)
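Putting these pieces together, here is a self-contained sketch of parMap built from one task per element plus sequence. It follows the definitions quoted in the question, except that a small hand-written Done class stands in for the book's UnitFuture; the object name ParSketch is also made up for the example. Because this simple fork blocks a thread while waiting, the sketch is meant to be run on an unbounded pool such as Executors.newCachedThreadPool().

```scala
import java.util.concurrent.{Callable, ExecutorService, Future, TimeUnit}

object ParSketch {
  type Par[A] = ExecutorService => Future[A]

  // An already-completed Future wrapping a constant value.
  private final class Done[A](value: A) extends Future[A] {
    def get(): A = value
    def get(timeout: Long, unit: TimeUnit): A = value
    def isDone: Boolean = true
    def isCancelled: Boolean = false
    def cancel(mayInterruptIfRunning: Boolean): Boolean = false
  }

  def unit[A](a: A): Par[A] = (_: ExecutorService) => new Done(a)

  def map2[A, B, C](a: Par[A], b: Par[B])(f: (A, B) => C): Par[C] =
    (es: ExecutorService) => {
      val af = a(es) // both computations are started before either is awaited
      val bf = b(es)
      new Done(f(af.get(), bf.get()))
    }

  // Submits the (unevaluated) computation as a separate task.
  def fork[A](a: => Par[A]): Par[A] =
    (es: ExecutorService) => es.submit(new Callable[A] {
      def call(): A = a(es).get()
    })

  def lazyUnit[A](a: => A): Par[A] = fork(unit(a))

  def asyncF[A, B](f: A => B): A => Par[B] = a => lazyUnit(f(a))

  def sequence[A](ps: List[Par[A]]): Par[List[A]] =
    ps.foldRight[Par[List[A]]](unit(Nil))((h, t) => map2(h, t)(_ :: _))

  // One task per element (via asyncF), then List[Par[B]] => Par[List[B]].
  def parMap[A, B](ps: List[A])(f: A => B): Par[List[B]] =
    fork(sequence(ps.map(asyncF(f))))
}
```

For example, ParSketch.parMap(List(1, 2, 3))(_ * 2)(es).get() evaluates each multiplication as its own task on the executor es and collects the results in the original order.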

Scala tail recursion from finally block

I want to apply a function f to each element of a List and not stop at the first error but throw the last error (if any) only:
@annotation.tailrec
def tryAll[A](xs: List[A])(f: A => Unit): Unit = {
xs match {
case x :: xt =>
try {
f(x)
} finally {
tryAll(xt)(f)
}
case _ =>
}
}
But, the above code does not compile - it is complaining that this function is not tail recursive. Why not?

This solution iterates over all elements and produces (throws) the last error if any:
def tryAll[A](xs: List[A])(f: A => Unit): Unit = {
val res = xs.foldLeft(Option.empty[Throwable]) {
case (maybeThrowable, a) =>
Try(f(a)) match {
case Success(_) => maybeThrowable
case Failure(e) => Option(e)
}
}
res.foreach(throwable => throw throwable)
}

As mentioned by @HristoIliev, your method cannot be tail recursive because the call inside finally is not a tail call: after the finally block completes, any exception thrown in the try block still has to be rethrown. This means that any method using try in this way will not be tail recursive. See this answer, also.
Calling the method again is a weird way of trying something repeatedly until it succeeds, because at each stage it's throwing an exception you are presumably not handling. Instead, I'd argue for a functional approach with Try, taking failures from a view until the operation succeeds. The only disadvantage to this approach is that it doesn't throw any exceptions for you to handle along the way (which can also be an advantage!).
def tryAll[A](xs: List[A])(f: A => Unit): Unit =
xs.view.map(x => Try(f(x))).takeWhile(_.isFailure).force
scala> val list = List(0, 0, 0, 4, 5, 0)
scala> tryAll(list)(a => println(10 / a))
2
If you really want to handle the exceptions (or just the last exception), you can change the return type of tryAll to List[Try[Unit]] (or simply Try[Unit] if you modify the code to only take the last one). It's better for the return type of the method to describe part of what it's actually doing--potentially returning errors.
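A sketch of the Try[Unit] variant mentioned above, which applies f to every element and returns the last failure, if any. Note that, unlike the takeWhile version, it does not stop at the first success; TryAllSketch is a name made up for the example.

```scala
import scala.util.{Failure, Success, Try}

object TryAllSketch {
  // Applies f to every element; the accumulator keeps the most recent failure.
  def tryAll[A](xs: List[A])(f: A => Unit): Try[Unit] =
    xs.foldLeft[Try[Unit]](Success(())) { (acc, x) =>
      Try(f(x)) match {
        case failure @ Failure(_) => failure // a newer error replaces the older one
        case Success(_)           => acc     // keep whatever was recorded so far
      }
    }
}
```

The caller can then decide what to do with the error, for example TryAllSketch.tryAll(list)(a => println(10 / a)).get to rethrow it.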

I'm not sure about the intention of the method, but you can do something like this:
final def tryAll[A](xs: List[A])(f: A => Unit): Unit = {
xs match {
case x :: xt =>
try {
f(x)
} catch {
case e => tryAll(xt)(f)
}
case _ => //do something else
}
}
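If the goal from the question is kept (visit every element, then throw only the last error), one way to get a genuinely tail-recursive method is to move the recursive call out of the try block and carry the last error along as an accumulator. This is a sketch under that reading of the question; the extra last parameter and the name TailRecTryAll are additions made for the example.

```scala
import scala.annotation.tailrec
import scala.util.control.NonFatal

object TailRecTryAll {
  @tailrec
  def tryAll[A](xs: List[A], last: Option[Throwable] = None)(f: A => Unit): Unit =
    xs match {
      case x :: rest =>
        // The try expression finishes completely before the recursive call,
        // so the call below is in tail position.
        val err: Option[Throwable] =
          try { f(x); last } catch { case NonFatal(e) => Some(e) }
        tryAll(rest, err)(f)
      case Nil =>
        // All elements visited: rethrow the most recent error, if there was one.
        last.foreach(e => throw e)
    }
}
```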

I know this way to use @annotation.tailrec
From this:
def fac(n:Int):Int = if (n<=1) 1 else n*fac(n-1)
You should have this:
@scala.annotation.tailrec
def facIter(f:Int, n:Int):Int = if (n<2) f else facIter(n*f, n-1)
def fac(n:Int) = facIter(1,n)

How to carry on executing Future sequence despite failure?

The traverse method of the Future object stops at the first failure. I want a tolerant/forgiving version of this method that carries on with the rest of the sequence when errors occur.
Currently we have added the following method to our utils:
def traverseFilteringErrors[A, B <: AnyRef]
(seq: Seq[A])
(f: A => Future[B]): Future[Seq[B]] = {
val sentinelValue = null.asInstanceOf[B]
val allResults = Future.traverse(seq) { x =>
f(x) recover { case _ => sentinelValue }
}
val successfulResults = allResults map { result =>
result.filterNot(_ == sentinelValue)
}
successfulResults
}
Is there a better way to do this?

A genuinely useful thing (generally speaking) would be to be able to promote the error of a future into a proper value. In other words, transform a Future[T] into a Future[Try[T]] (the successful return value becomes a Success[T] while the failure case becomes a Failure[T]). Here is how we might implement it:
// Can also be done more concisely (but less efficiently) as:
// f.map(Success(_)).recover{ case t: Throwable => Failure( t ) }
// NOTE: you might also want to move this into an enrichment class
def mapValue[T]( f: Future[T] ): Future[Try[T]] = {
val prom = Promise[Try[T]]()
f onComplete prom.success
prom.future
}
Now, if you do the following:
Future.traverse(seq)( f andThen mapValue )
You'll obtain a successful Future[Seq[Try[B]]], whose eventual value contains a Success instance for each successful future, and a Failure instance for each failed future.
If needed, you can then use collect on this seq to drop the Failure instances and keep only the successful values.
In other words, you can rewrite your helper method as follows:
def traverseFilteringErrors[A, B](seq: Seq[A])(f: A => Future[B]): Future[Seq[B]] = {
Future.traverse( seq )( f andThen mapValue ) map ( _ collect{ case Success( x ) => x } )
}
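On Scala 2.12 and later, the same lifting can be written without a Promise by using Future.transform, which receives the Try of the completed future. The following is a sketch under that version assumption; liftToTry and TolerantTraverse are names chosen for the example.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Success, Try}

object TolerantTraverse {
  // Turns a failed future into a successful one that carries the Failure as a value.
  def liftToTry[T](f: Future[T])(implicit ec: ExecutionContext): Future[Try[T]] =
    f.transform[Try[T]]((t: Try[T]) => Success(t))

  // Runs f for every element and keeps only the results that succeeded.
  def traverseFilteringErrors[A, B](seq: Seq[A])(f: A => Future[B])(
      implicit ec: ExecutionContext): Future[Seq[B]] =
    Future.traverse(seq)(a => liftToTry(f(a))).map(_.collect { case Success(b) => b })
}
```

With an implicit ExecutionContext in scope, traverseFilteringErrors(Seq(1, 0, 2))(x => Future(10 / x)) completes successfully with only the two results whose division did not throw.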

transforming a Seq[Future[X]] into an Enumerator[X]

Is there a way to turn a Seq[Future[X]] into an Enumerator[X]? The use case is that I want to get resources by crawling the web. This is going to return a sequence of Futures, and I'd like to return an Enumerator that pushes the results onto the Iteratee in the order in which the futures complete.
It looks like Viktor Klang's Future select gist could be used to do this, though it looks pretty inefficient.
Note: The Iteratees and Enumerators in question are those given by the Play framework version 2.x, i.e. with the following import: import play.api.libs.iteratee._

Using Viktor Klang's select method:
object FutureUtil {
/**
* "Select" off the first future to be satisfied. Return this as a
* result, with the remainder of the Futures as a sequence.
*
* @param fs a scala.collection.Seq
*/
def select[A](fs: Seq[Future[A]])(implicit ec: ExecutionContext):
Future[(Try[A], Seq[Future[A]])] = {
@scala.annotation.tailrec
def stripe(p: Promise[(Try[A], Seq[Future[A]])],
heads: Seq[Future[A]],
elem: Future[A],
tail: Seq[Future[A]]): Future[(Try[A], Seq[Future[A]])] = {
elem onComplete { res => if (!p.isCompleted) p.trySuccess((res, heads ++ tail)) }
if (tail.isEmpty) p.future
else stripe(p, heads :+ elem, tail.head, tail.tail)
}
if (fs.isEmpty) Future.failed(new IllegalArgumentException("empty future list!"))
else stripe(Promise(), fs.genericBuilder[Future[A]].result, fs.head, fs.tail)
}
}
I can then get what I need with
Enumerator.unfoldM(initialSeqOfFutureAs){ seqOfFutureAs =>
if (seqOfFutureAs.isEmpty) {
Future(None)
} else {
FutureUtil.select(seqOfFutureAs).map {
case (t, seqFuture) => t.toOption.map {
a => (seqFuture, a)
}
}
}
}
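The efficiency concern with select is that every call registers a fresh callback on each remaining future. The completion-order part of the problem can also be sketched independently of Play, in a single pass: allocate one Promise per input and let each future, as it completes, fill the next free slot. This uses only the standard library; CompletionOrder and inCompletionOrder are hypothetical names, and feeding the resulting sequence into an Enumerator is left out.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.Try

object CompletionOrder {
  // Returns futures holding the same results, reordered so that element i
  // completes with the i-th result to become available.
  def inCompletionOrder[A](fs: Seq[Future[A]])(implicit ec: ExecutionContext): Seq[Future[A]] = {
    val slots = Vector.fill(fs.size)(Promise[A]())
    val next  = new AtomicInteger(0)
    // One callback per input future: claim the next free slot and complete it.
    fs.foreach(_.onComplete((result: Try[A]) => slots(next.getAndIncrement()).complete(result)))
    slots.map(_.future)
  }
}
```

Failed futures take a slot as well, so the caller still sees each failure at the position where it occurred in completion order.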

A better, shorter and I think more efficient answer is:
def toEnumerator(seqFutureX: Seq[Future[X]]) = new Enumerator[X] {
def apply[A](i: Iteratee[X, A]): Future[Iteratee[X, A]] = {
Future.sequence(seqFutureX).flatMap { seqX: Seq[X] =>
seqX.foldLeft(Future.successful(i)) {
case (i, x) => i.flatMap(_.feed(Input.El(x)))
}
}
}
}

I do realise that the question is a bit old already, but based on Santhosh's answer and the built-in Enumerator.enumerate() implementation I came up with the following:
def enumerateM[E](traversable: TraversableOnce[Future[E]])(implicit ec: ExecutionContext): Enumerator[E] = {
val it = traversable.toIterator
Enumerator.generateM {
if (it.hasNext) {
val next: Future[E] = it.next()
next map {
e => Some(e)
}
} else {
Future.successful[Option[E]] {
None
}
}
}
}
Note that unlike the first Viktor-select-based-solution this one preserves the order, but you can still start off all computations asynchronously before. So, for example, you can do the following:
// For lack of a better name
def mapEachM[E, NE](eventuallyList: Future[List[E]])(f: E => Future[NE])(implicit ec: ExecutionContext): Enumerator[NE] =
Enumerator.flatten(
eventuallyList map { list =>
enumerateM(list map f)
}
)
This latter method was in fact what I was looking for when I stumbled on this thread. Hope it helps someone! :)

You could construct one using the Java ExecutorCompletionService (JavaDoc). The idea is to create a sequence of new futures, each using ExecutorCompletionService.take() to wait for the next result. Each future starts when the previous future has its result.
But please be aware that this might not be that efficient, because a lot of synchronisation is happening behind the scenes. It might be more efficient to use a parallel map-reduce for the calculation (e.g. using Scala's ParSeq) and let the Enumerator wait for the complete result.

WARNING: Not compiled before answering
What about something like this:
def toEnumerator(seqFutureX: Seq[Future[X]]) = new Enumerator[X] {
def apply[A](i: Iteratee[X, A]): Future[Iteratee[X, A]] =
Future.fold(seqFutureX)(i){ case (i, x) => i.flatMap(_.feed(Input.El(x))) }
}

Here is something I found handy,
def unfold[A,B](xs:Seq[A])(proc:A => Future[B])(implicit errorHandler:Throwable => B):Enumerator[B] = {
Enumerator.unfoldM (xs) { xs =>
if (xs.isEmpty) Future(None)
else proc(xs.head) map (b => Some(xs.tail,b)) recover {
case e => Some((xs.tail,errorHandler(e)))
}
}
}
def unfold[A,B](fxs:Future[Seq[A]])(proc:A => Future[B]) (implicit errorHandler1:Throwable => Seq[A], errorHandler:Throwable => B) :Enumerator[B] = {
(unfold(Seq(fxs))(fxs => fxs)(errorHandler1)).flatMap(unfold(_)(proc)(errorHandler))
}
def unfoldFutures[A,B](xsfxs:Seq[Future[Seq[A]]])(proc:A => Future[B]) (implicit errorHandler1:Throwable => Seq[A], errorHandler:Throwable => B) :Enumerator[B] = {
xsfxs.map(unfold(_)(proc)).reduceLeft((a,b) => a.andThen(b))
}

I would like to propose the use of a Broadcast
def seqToEnumerator[A](futuresA: Seq[Future[A]])(defaultValue: A, errorHandler: Throwable => A): Enumerator[A] ={
val (enumerator, channel) = Concurrent.broadcast[A]
futuresA.foreach(f => f.onComplete({
case Success(Some(a: A)) => channel.push(a)
case Success(None) => channel.push(defaultValue)
case Failure(exception) => channel.push(errorHandler(exception))
}))
enumerator
}
I added errorHandling and defaultValues but you can skip those by using onSuccess or onFailure, instead of onComplete
