Scala Tail recursion future returns - scala

how to implement a tail recursion functions in scala with futures as return values:
Example code
def getInfo(lists: List[Int]): Future[List[Int]] = {
def getStudentIDs(lists: List[Int]): Future[List[Int]] = {
//here a web service call that returns a future WS response
val response=ws.call(someURL)
response.map(r => {
r.status match {
case 200 =>
var newList = lists + (r.json \ ids)
.as[List[Int]] //here add the ws call response json..
getStudentIDs(newList)
case 429 =>Future.sucessful(lists)
case _ => getStudentIDs(lists)
}
})
}
getStudentIDs(List.empty[Int])
}

I think it's an XY-problem. You probably don't want it "tail-recursive" in the sense of "having a #tailrec"-annotation. What you want is stack safety, so that this method does not blow the stack after a few hundred retries to connect to your webservice.
For this, there are libraries, for example Cats.
In Cats, there is a typeclass called Monad, and this typeclass provides a special method for what seems to be exactly what you want:
tailRecM[A, B](a: A)(f: (A) => F[Either[A, B]]): F[B]
Quote from the docs:
Keeps calling f until a scala.util.Right[B] is returned.
Implementations of this method should use constant stack space [...]
There is an implementation of this for Future in FutureInstances available. The implementation seems trivial though, because it mixes in StackSafeMonad.
You could of course look into implementation of StackSafeMonad, and then try to understand why it is sufficient in the case of Future, but you could also just use the library implementation instead and not worry about whether your recursive method can fail with a StackOverflowError.

Here is the problem (code simplified to make it runnable):
import scala.concurrent._
import scala.concurrent.duration._
import annotation.tailrec
import scala.concurrent.ExecutionContext.Implicits.global
def getInfo(lists: List[Int]): Future[List[Int]] = {
#tailrec
def getStudentIDs(lists: List[Int]): Future[List[Int]] = {
Future(List(1, 2, 3)).flatMap(x => getStudentIDs(x ::: lists))
}
getStudentIDs(List.empty[Int])
}
Gives an error:
error: could not optimize #tailrec annotated method getStudentIDs:
it contains a recursive call not in tail position
Future(1).flatMap(x => getStudentIDs(x :: lists))
^
The problem is not only with a Future. The actual problem is that getStudents is not in terminal/tail position - it's called from map. This would be a problem if you use no Futures and use regular map from collections or any other function for that matter. For example:
def getInfo(lists: List[Int]): List[Int] = {
#tailrec
def getStudentIDs(lists: List[Int]): List[Int] = {
List(1).flatMap(x => getStudentIDs(x :: lists))
}
getStudentIDs(List.empty[Int])
}
Gives the same error:
error: could not optimize #tailrec annotated method getStudentIDs:
it contains a recursive call not in tail position
List(1).flatMap(x => getStudentIDs(x :: lists))
^
What makes it difficult here is that you can't just get a result from future directly to use it in getStudents because you don't know if it's completed and it's not a good practice to block on Future and await for result. So you sort of forced to use map. Here is a very bad example of how to make it tail recursive (just for science :)). Don't do it in production code:
def getInfo(lists: List[Int]): Future[List[Int]] = {
#tailrec
def getStudentIDs(lists: List[Int]): Future[List[Int]] = {
val r = Await.ready(Future(List(1, 2, 3)), Duration.Inf).value.get.getOrElse(lists)
getStudentIDs(r ::: lists)
}
getStudentIDs(List.empty[Int])
}
The compiler is happy, but this might lead to many problems - read about Await, blocking and thread pools for more on that.
I think it's probably not a big issue that your function is not tail recursive because you probably don't want to create lots of futures this way anyway. There are other concurrency frameworks you could try if that's really a problem like actors (Akka), etc.

Related

Convert EitherT[Future, A, B] to Option[B]

I have a method
def getInfoFromService(reqParams: Map[String, String]): Future[Either[CustomException, A]]
and I have another function
import cats.implicits._
def processInfoAndModelResponse(input: Future[Either[CustomException, A]]): Option[A] = {
for {
either <- input
resultA <- either
} yield resultA.some
}
So basically, I am trying to convert Future[Either[CustomException, A]]) to Option[A]. The resultA <- either in above code would not work as map is not happy with type.
What you try to do is basically something like
def convert[A](future: Future[Either[CustomException, A]]): Option[A] =
Try(Await.result(future, timeout)).toEither.flatten.toOption
This:
looses information about error, which prevents any debugging (though you could log after .flatten)
blocks async operation which kills any benefit of using Future in the first place
Monads do not work like that. Basically you can handle several monadic calculations with for for the same monad, but Future and Either are different monads.
Try
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
def processInfoAndModelResponse(input: Future[Either[CustomException, A]]): Option[A] = {
Await.result(input, 1.minute).toOption
}
#MateuszKubuszok's answer is different in that it doesn't throw.
You can't sensibly convert a Future[Either[CustomException, A]] to Option[A], as you don't know what the result will be until the Future has completed.
Future[Either[CustomException, A]] might just bet kept that way, or put in IO as IO[Either[CustomException, A]]. If you don't care about CustomException, you can also capture it Future (or IO) error handling mechanisms, or just discard it entirely.
Some options:
absorb the CustomException in Future's error handling:
val fa: Future[A] = fut.transform(_.map(_.toTry))
ignore the CustomException an make the Future return with None
val fa: Future[Option[A]] = fut.map(_.toOption)
if you really want to block a thread and wait for the result,
Await.result(input, Duration.Inf).toOption
but all the caveats of awaiting apply. See the documentaion, reproduced here:
WARNING: It is strongly discouraged to supply lengthy timeouts since the progress of the calling thread will be suspended—blocked—until either the Awaitable has a result or the timeout expires.

How to reason about stack safety in Scala Cats / fs2?

Here is a piece of code from the documentation for fs2. The function go is recursive. The question is how do we know if it is stack safe and how to reason if any function is stack safe?
import fs2._
// import fs2._
def tk[F[_],O](n: Long): Pipe[F,O,O] = {
def go(s: Stream[F,O], n: Long): Pull[F,O,Unit] = {
s.pull.uncons.flatMap {
case Some((hd,tl)) =>
hd.size match {
case m if m <= n => Pull.output(hd) >> go(tl, n - m)
case m => Pull.output(hd.take(n.toInt)) >> Pull.done
}
case None => Pull.done
}
}
in => go(in,n).stream
}
// tk: [F[_], O](n: Long)fs2.Pipe[F,O,O]
Stream(1,2,3,4).through(tk(2)).toList
// res33: List[Int] = List(1, 2)
Would it also be stack safe if we call go from another method?
def tk[F[_],O](n: Long): Pipe[F,O,O] = {
def go(s: Stream[F,O], n: Long): Pull[F,O,Unit] = {
s.pull.uncons.flatMap {
case Some((hd,tl)) =>
hd.size match {
case m if m <= n => otherMethod(...)
case m => Pull.output(hd.take(n.toInt)) >> Pull.done
}
case None => Pull.done
}
}
def otherMethod(...) = {
Pull.output(hd) >> go(tl, n - m)
}
in => go(in,n).stream
}
My previous answer here gives some background information that might be useful. The basic idea is that some effect types have flatMap implementations that support stack-safe recursion directly—you can nest flatMap calls either explicitly or through recursion as deeply as you want and you won't overflow the stack.
For some effect types it's not possible for flatMap to be stack-safe, because of the semantics of the effect. In other cases it may be possible to write a stack-safe flatMap, but the implementers might have decided not to because of performance or other considerations.
Unfortunately there's no standard (or even conventional) way to know whether the flatMap for a given type is stack-safe. Cats does include a tailRecM operation that should provide stack-safe monadic recursion for any lawful monadic effect type, and sometimes looking at a tailRecM implementation that's known to be lawful can provide some hints about whether a flatMap is stack-safe. In the case of Pull it looks like this:
def tailRecM[A, B](a: A)(f: A => Pull[F, O, Either[A, B]]) =
f(a).flatMap {
case Left(a) => tailRecM(a)(f)
case Right(b) => Pull.pure(b)
}
This tailRecM is just recursing through flatMap, and we know that Pull's Monad instance is lawful, which is pretty good evidence that Pull's flatMap is stack-safe. The one complicating factor here is that the instance for Pull has an ApplicativeError constraint on F that Pull's flatMap doesn't, but in this case that doesn't change anything.
So the tk implementation here is stack-safe because flatMap on Pull is stack-safe, and we know that from looking at its tailRecM implementation. (If we dug a little deeper we could figure out that flatMap is stack-safe because Pull is essentially a wrapper for FreeC, which is trampolined.)
It probably wouldn't be terribly hard to rewrite tk in terms of tailRecM, although we'd have to add the otherwise unnecessary ApplicativeError constraint. I'm guessing the authors of the documentation chose not to do that for clarity, and because they knew Pull's flatMap is fine.
Update: here's a fairly mechanical tailRecM translation:
import cats.ApplicativeError
import fs2._
def tk[F[_], O](n: Long)(implicit F: ApplicativeError[F, Throwable]): Pipe[F, O, O] =
in => Pull.syncInstance[F, O].tailRecM((in, n)) {
case (s, n) => s.pull.uncons.flatMap {
case Some((hd, tl)) =>
hd.size match {
case m if m <= n => Pull.output(hd).as(Left((tl, n - m)))
case m => Pull.output(hd.take(n.toInt)).as(Right(()))
}
case None => Pull.pure(Right(()))
}
}.stream
Note that there's no explicit recursion.
The answer to your second question depends on what the other method looks like, but in the case of your specific example, >> will just result in more flatMap layers, so it should be fine.
To address your question more generally, this whole topic is a confusing mess in Scala. You shouldn't have to dig into implementations like we did above just to know whether a type supports stack-safe monadic recursion or not. Better conventions around documentation would be a help here, but unfortunately we're not doing a very good job of that. You could always use tailRecM to be "safe" (which is what you'll want to do when the F[_] is generic, anyway), but even then you're trusting that the Monad implementation is lawful.
To sum up: it's a bad situation all around, and in sensitive situations you should definitely write your own tests to verify that implementations like this are stack-safe.

Making multiple API calls in a functional way

What would it be the best approach to solve this problem in the most functional (algebraic) way by using Scala and Cats (or maybe another library focused on Category Theory and/or functional programming)?
Resources
Provided we have the following methods which perform REST API calls to retrieve single pieces of information?
type FutureApiCallResult[A] = Future[Either[String, Option[A]]]
def getNameApiCall(id: Int): FutureApiCallResult[String]
def getAgeApiCall(id: Int): FutureApiCallResult[Int]
def getEmailApiCall(id: Int): FutureApiCallResult[String]
As you can see they produce asynchronous results. The Either monad is used to return possible errors during API calls and Option is used to return None whenever the resource is not found by the API (this case is not an error but a possible and desired result type).
Method to implement in a functional way
case class Person(name: String, age: Int, email: String)
def getPerson(id: Int): Future[Option[Person]] = ???
This method should used the three API calls methods defined above to asynchronously compose and return a Person or None if either any of the API calls failed or any of the API calls return None (the whole Person entity cannot be composed)
Requirements
For performance reasons all the API calls must be done in a parallel way
My guess
I think the best option would be to use the Cats Semigroupal Validated but I get lost when trying to deal with Future and so many nested Monads :S
Can anyone tell me how would you implement this (even if changing method signature or main concept) or point me to the right resources? Im quite new to Cats and Algebra in coding but I would like to learn how to handle this kind of situations so that I can use it at work.
The key requirement here is that it has to be done in parallel. It means that the obvious solution using a monad is out, because monadic bind is blocking (it needs the result in case it has to branch on it). So the best option is to use applicative.
I'm not a Scala programmer, so I can't show you the code, but the idea is that an applicative functor can lift functions of multiple arguments (a regular functor lifts functions of single argument using map). Here, you would use something like map3 to lift the three-argument constructor of Person to work on three FutureResults. A search for "applicative future in Scala" returns a few hits. There are also applicative instances for Either and Option and, unlike monads, applicatives can be composed together easily. Hope this helps.
You can make use of the cats.Parallel type class. This enables some really neat combinators with EitherT which when run in parallel will accumulate errors. So the easiest and most concise solution would be this:
type FutureResult[A] = EitherT[Future, NonEmptyList[String], Option[A]]
def getPerson(id: Int): FutureResult[Person] =
(getNameApiCall(id), getAgeApiCall(id), getEmailApiCall(id))
.parMapN((name, age, email) => (name, age, email).mapN(Person))
For more information on Parallel visit the cats documentation.
Edit: Here's another way without the inner Option:
type FutureResult[A] = EitherT[Future, NonEmptyList[String], A]
def getPerson(id: Int): FutureResult[Person] =
(getNameApiCall(id), getAgeApiCall(id), getEmailApiCall(id))
.parMapN(Person)
this is the only solution i came across with but still not satisfied because i have the feeling it could be done in a cleaner way
import cats.data.NonEmptyList
import cats.implicits._
import scala.concurrent.Future
case class Person(name: String, age: Int, email: String)
type FutureResult[A] = Future[Either[NonEmptyList[String], Option[A]]]
def getNameApiCall(id: Int): FutureResult[String] = ???
def getAgeApiCall(id: Int): FutureResult[Int] = ???
def getEmailApiCall(id: Int): FutureResult[String] = ???
def getPerson(id: Int): FutureResult[Person] =
(
getNameApiCall(id).map(_.toValidated),
getAgeApiCall(id).map(_.toValidated),
getEmailApiCall(id).map(_.toValidated)
).tupled // combine three futures
.map {
case (nameV, ageV, emailV) =>
(nameV, ageV, emailV).tupled // combine three Validated
.map(_.tupled) // combine three Options
.map(_.map { case (name, age, email) => Person(name, age, email) }) // wrap final result
}.map(_.toEither)
Personally I prefer to collapse all non-success conditions into the Future's failure. That really simplifies the error handling, like:
val futurePerson = for {
name <- getNameApiCall(id)
age <- getAgeApiCall(id)
email <- getEmailApiCall(id)
} yield Person(name, age, email)
futurePerson.recover {
case e: SomeKindOfError => ???
case e: AnotherKindOfError => ???
}
Note that this won't run the requests in parallel, to do so you'd need to move the future's creation outside of the for comprehension, like:
val futureName = getNameApiCall(id)
val futureAge = getAgeApiCall(id)
val futureEmail = getEmailApiCall(id)
val futurePerson = for {
name <- futureName
age <- futureAge
email <- futureEmail
} yield Person(name, age, email)

Combining Scala Futures and collections in for comprehensions

I'm trying to use a for expression to iterate over a list, then do a transformation on each element using a utility that returns a Future. Long story short, it doesn't compile, and I'd like to understand why. I read this question, which is similar, and was a great help, but what I'm trying to do is even simpler, which is all the more confusing as to why it doesn't work. I'm trying to do something like:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val numberList = List(1, 2, 3)
def squareInTheFuture(number: Int): Future[Int] = Future { number * number}
val allTheSquares = for {
number <- numberList
square <- squareInTheFuture(number)
} yield { square }
And what I get is:
error: type mismatch;
found : scala.concurrent.Future[Int]
required: scala.collection.GenTraversableOnce[?]
square <- squareInTheFuture(number)
^
Can someone help me understand why this doesn't work and what the best alternative is?
The Future companion object has a traverse method that does exactly what you want:
val allTheSquares: Future[List[Int]] =
Future.traverse(numberList)(squareInTheFuture)
This will asynchronously start all the computations and return a future that will be completed once all of those futures are completed.
flatMap requires that the type constructors of numberList and squareInTheFuture(number) are the same (modulo whatever implicit conversions the collection library does). That isn't the case here. Instead, this is a traversal:
val allSquaresInTheFuture: Future[List[Int]] =
Future.traverse(numberList)(squareInTheFuture)
#Lee is correct. As an addition, if you are trying to do parallel computation:
val numberList = List(1, 2, 3)
val allTheSquares = numberList.par.map(x => x * x)(breakOut)
If you really want Future:
val allTheSquares: Future[List[Int]] = Future.traverse(numberList)(squareInTheFuture)
Your for comprehension is the same as
val allTheSquares = numberList.flatMap(number => squareInTheFuture(number))
flatMap requires that it's argument function returns a GenTraversableOnce[Int], however yours returns a Future[Int], hence the mismatch.

How to write this three-liner as a one-liner?

I like the way, you can write one-liner-methods in Scala, e.g. with List(1, 2, 3).foreach(..).map(..).
But there is a certain situation, that sometimes comes up when writing Scala code, where things get a bit ugly. Example:
def foo(a: A): Int = {
// do something with 'a' which results in an integer
// e.g. 'val result = a.calculateImportantThings
// clean up object 'a'
// e.g. 'a.cleanUp'
// Return the result of the previous calculation
return result
}
In this situation we have to return a result, but can not return it directly after the calculation is done, because we have to do some clean up before returning.
I always have to write a three-liner. Is there also a possibility to write a one-liner to do this (without changing the class of A, because this may be a external library which can not be changed) ?
There are clearly side-effects involved here (otherwise the order of invocation of calculateImportantThings and cleanUp wouldn't matter) so you would be well advised to reconsider your design.
However, if that's not an option you could try something like,
scala> class A { def cleanUp {} ; def calculateImportantThings = 23 }
defined class A
scala> val a = new A
a: A = A#927eadd
scala> (a.calculateImportantThings, a.cleanUp)._1
res2: Int = 23
The tuple value (a, b) is equivalent to the application Tuple2(a, b) and the Scala specification guarantees that its arguments will be evaluated left to right, which is what you want here.
This is a perfect use-case for try/finally:
try a.calculateImportantThings finally a.cleanUp
This works because try/catch/finally is an expression in scala, meaning it returns a value, and even better, you get the cleanup whether or not the calculation throws an exception.
Example:
scala> val x = try 42 finally println("complete")
complete
x: Int = 42
There is, in fact, a Haskell operator for just such an occasion:
(<*) :: Applicative f => f a -> f b -> f a
For example:
ghci> getLine <* putStrLn "Thanks for the input!"
asdf
Thanks for the input!
"asdf"
All that remains then is to discover the same operator in scalaz, since scalaz usually replicates everything that Haskell has. You can wrap values in Identity, since Scala doesn't have IO to classify effects. The result would look something like this:
import scalaz._
import Scalaz._
def foo(a: A): Int =
(a.calculateImportantThings.pure[Identity] <* a.cleanup.pure[Identity]).value
This is rather obnoxious, though, since we have to explicitly wrap the side-effecting computations in Identity. Well the truth is, scalaz does some magic that implicitly converts to and from the Identity container, so you can just write:
def foo(a: A): Int = Identity(a.calculateImportantThings) <* a.cleanup()
You do need to hint to the compiler somehow that the leftmost thing is in the Identity monad. The above was the shortest way I could think of. Another possibility is to use Identity() *> foo <* bar, which will invoke the effects of foo and bar in that order, and then produce the value of foo.
To return to the ghci example:
scala> import scalaz._; import Scalaz._
import scalaz._
import Scalaz._
scala> val x : String = Identity(readLine) <* println("Thanks for the input!")
<< input asdf and press enter >>
Thanks for the input!
x: String = asdf
Maybe you want to use a kestrel combinator? It is defined as follows:
Kxy = x
So you call it with the value you want to return and some side-effecting operation you want to execute.
You could implement it as follows:
def kestrel[A](x: A)(f: A => Unit): A = { f(x); x }
... and use it in this way:
kestrel(result)(result => a.cleanUp)
More information can be found here: debasish gosh blog.
[UPDATE] As Yaroslav correctly points out, this is not the best application of the kestrel combinator. But it should be no problem to define a similar combinator using a function without arguments, so instead:
f: A => Unit
someone could use:
f: () => Unit
class Test {
def cleanUp() {}
def getResult = 1
}
def autoCleanup[A <: Test, T](a: A)(x: => T) = {
try { x } finally { a.cleanUp }
}
def foo[A <: Test](a:A): Int = autoCleanup(a) { a.getResult }
foo(new Test)
You can take a look at scala-arm project for type class based solution.
Starting Scala 2.13, the chaining operation tap can be used to apply a side effect (in this case the cleanup of A) on any value while returning the original value untouched:
def tap[U](f: (A) => U): A
import util.chaining._
// class A { def cleanUp { println("clean up") } ; def calculateImportantThings = 23 }
// val a = new A
val x = a.calculateImportantThings.tap(_ => a.cleanUp)
// clean up
// x: Int = 23
In this case tap is a bit abused since we don't even use the value it's applied on (a.calculateImportantThings (23)) to perform the side effect (a.cleanUp).