Concurrent for-comprehensions - scala

According to this blog post there's a potential performance issue with for comprehensions. For example:
for {
a <- remoteCallA()
b <- remoteCallB()
} yield {
(a, b)
}
has remoteCallB blocked until remoteCallA is completed. The blog post suggests that we do this instead:
val futureA = remoteCallA()
val futureB = remoteCallB()
for {
a <- futureA
b <- futureB
} yield {
(a, b)
}
which will ensure that the two remote calls can start at the same time.
My question: is the above (and therefore the blog writer) correct?
I've not seen people using this pattern, which has got me wondering whether there are alternative patterns that are generally used instead.
With thanks

The for comprehension
for {
a <- remoteCallA()
b <- remoteCallB()
} yield {
(a, b)
}
Translates to:
remoteCallA().flatMap(a => remoteCallB().map(b => (a, b)))
So, yes, I believe the blogger is correct in that the calls will be sequential, not concurrent, to one another.
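A minimal sketch of the difference, using only the standard library. The `call` helper is a hypothetical stand-in for remoteCallA/remoteCallB (it is not from the blog post): it signals that it has started and then reports whether both calls had been started before it gave up waiting.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ForComprehensionTiming {
  // Hypothetical stand-in for a remote call. It signals that it has started, then
  // waits up to one second and returns whether both calls had been started by then.
  private def call(bothStarted: CountDownLatch): Future[Boolean] = Future {
    bothStarted.countDown()
    blocking(bothStarted.await(1, TimeUnit.SECONDS))
  }

  // Futures created inside the for: the second call is not created
  // until the first one has completed.
  def sequential(): (Boolean, Boolean) = {
    val latch = new CountDownLatch(2)
    val result = for {
      a <- call(latch)
      b <- call(latch)
    } yield (a, b)
    Await.result(result, 10.seconds)
  }

  // Futures created up front: both calls are already running
  // when the for-comprehension combines them.
  def concurrent(): (Boolean, Boolean) = {
    val latch = new CountDownLatch(2)
    val futureA = call(latch)
    val futureB = call(latch)
    val result = for {
      a <- futureA
      b <- futureB
    } yield (a, b)
    Await.result(result, 10.seconds)
  }
}
```

In sequential() the first call should time out without ever seeing the second one start, whereas in concurrent() both calls should observe each other.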

The common pattern to execute several futures simultaneously is to use zip or Future.traverse. Here are a few examples:
for {
(a, b) <- remoteCallA() zip remoteCallB()
} yield f(a, b)
This becomes a bit cumbersome when there are more than 2 futures:
for {
((a, b), c) <- remoteCallA() zip remoteCallB() zip remoteCallC()
} yield (a, b, c)
In those cases you can use Future.sequence:
for {
Seq(a, b, c) <-
Future.sequence(Seq(remoteCallA(), remoteCallB(), remoteCallC()))
} yield (a, b, c)
or Future.traverse, in case you have a sequence of arguments, and want to apply to all of them the same function, which returns a Future.
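A small sketch of the Future.traverse case, assuming a hypothetical remote call `fetchLength` that takes an argument:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object TraverseExample {
  // Hypothetical remote call that takes an argument.
  def fetchLength(word: String): Future[Int] = Future(word.length)

  // Applies the call to every argument and collects the results in input order.
  def lengths(words: List[String]): Future[List[Int]] =
    Future.traverse(words)(w => fetchLength(w))

  // Blocking helper, only for trying the example out.
  def lengthsBlocking(words: List[String]): List[Int] =
    Await.result(lengths(words), 5.seconds)
}
```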
But both approaches have an issue: if one of the Futures fails early, before the others finish, naturally you may want the resulting Future to fail immediately at that moment. But that's not what happens. The result Future is failed only after all the futures have completed. See this question for details: How to implement Future as Applicative in Scala?
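One possible way to get fail-fast behavior is to race the ordinary sequence against the first failure with a Promise. This is a sketch rather than a standard-library feature: `sequenceFailFast` is a made-up name, and whether the built-in combinators already fail fast may depend on the Scala version, so the helper does not rely on that.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

object FailFast {
  // Sketch of a sequence that fails as soon as any input fails.
  def sequenceFailFast[A](futures: List[Future[A]])(implicit ec: ExecutionContext): Future[List[A]] = {
    val result = Promise[List[A]]()
    // The first failure, from any position, fails the result right away.
    futures.foreach(f => f.failed.foreach(t => result.tryFailure(t)))
    // Otherwise the result completes once the ordinary sequence does.
    result.completeWith(Future.sequence(futures))
    result.future
  }

  // Usage: one input never completes, another has already failed.
  def demo(): Try[List[Int]] = {
    import ExecutionContext.Implicits.global
    val never = Promise[Int]().future
    val failed = Future.failed[Int](new IllegalStateException("boom"))
    Try(Await.result(sequenceFailFast(List(never, failed)), 5.seconds))
  }
}
```

In demo() the combined Future should fail with the "boom" exception even though the first input never completes.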

Related

IntelliJ showing `_ <-` error for Scala ZIO operator

The <- operator works fine in a basic ZIO for-comprehension, but with the zip operator IntelliJ shows an error.
Solution to solve the IntelliJ issue
It's not IntelliJ, it's Scala.
This:
for {
(a, b) <- foo
} yield ()
Does NOT translate to
foo.flatMap { case (a, b) =>
()
}
It translates to:
foo.withFilter {
case (a, b) => true
case _ => false
}.map {
case (a, b) =>
()
}
By specification, pattern matching in a for-comprehension (here (a, b) <- sth) is not checked by the compiler for exhaustiveness. Instead, the compiler uses .withFilter to filter out all values that the pattern cannot match, before using .map/.flatMap/.foreach with case to process them.
ZIO does not define .withFilter, which is why the compiler (and IntelliJ) complains: you have uncommented invalid code in 2 of your examples. In fact, many structures don't have it, since it doesn't make sense for them to have some "default" way of filtering, and it is unclear how it would work safely (e.g. Future fails when you filter out its value, giving you an unhelpful exception message; failing manually with Future.failed inside .flatMap would let you give it a meaningful error if you have to).
If you want to change this behavior, use the better-monadic-for compiler plugin, or Scala 3 with the -source:future flag. Alternatively, you can work around this issue with:
for {
_ <- Console.printLine("test")
result <- Random.nextUUID zip Console.readLine
(uuid, name) = result
} yield ()
since = just creates a normal val, so no unchecked match and extraction gets in your way.
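The same workaround can be tried without ZIO. The sketch below uses a hypothetical `Box` type that, like ZIO, defines map and flatMap but no withFilter:

```scala
object NoWithFilterDemo {
  // Hypothetical minimal type with map and flatMap only (no withFilter).
  final case class Box[A](value: A) {
    def map[B](f: A => B): Box[B] = Box(f(value))
    def flatMap[B](f: A => Box[B]): Box[B] = f(value)
  }

  // Bind the whole tuple first, then destructure it with `=`,
  // so the generator itself contains no pattern.
  val greeting: Box[String] = for {
    pair <- Box(("id-1", "Ada"))
    (id, name) = pair
  } yield s"$name has $id"
}
```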

Execute two fs2 tasks concurrently (non-deterministically)

With Scalaz Task I make this with scalaz.Nondeterminism.both:
Nondeterminism[Task]
.both(
Task.now("Hello"),
Task.now("world")
)
or with Nondeterminism[Task].gatherUnordered().
How can I do the same thing with fs2 0.9.x version tasks?
I'm assuming you're on fs2 version 0.9.x.
To execute several Tasks in parallel, you can simply call Task.start.
Here's an example from the docs:
for {
f <- Task.start { expensiveTask1 }
// at this point, `expensiveTask1` is evaluating in background
g <- Task.start { expensiveTask2 }
// now both `expensiveTask2` and `expensiveTask1` are running
result1 <- f
// we have forced `f`, so now only `expensiveTask2` may be running
result2 <- g
// we have forced `g`, so now nothing is running and we have both results
} yield (result1 + result2)
So in your case it would look like this:
for {
ta <- Task.start(Task.now("Hello"))
tb <- Task.start(Task.now("World"))
a <- ta
b <- tb
} yield (a, b)
Note that in the future it might be possible to do something like this with much less boilerplate. There's a PR in the works to add a Parallel type class, which would allow us to write something like this:
(taskA, taskB).parMapN((a, b) => ...)

Is there a concept for 'fold with break' or 'find with accumulator' in functional programming?

Title says it all, really: iterating over a collection while preserving state between iterations, and finishing based on a termination condition in addition to simply running out of elements, may be the most common pattern for accomplishing anything in imperative programming. It seems to me, however, that it's something functional gentleprogrammers agreed not to talk about, or at least I have never encountered an idiom for it or a semi-standardized name, as there is for map, fold, reduce, etc.
I often use the following code in Scala:
implicit class FoldWhile[T](private val items :Iterable[T]) extends AnyVal {
def foldWhile[A](start :A)(until :A=>Boolean)(op :(A, T)=>A) :A = {
if (until(start)) start
else {
var accumulator = start
items.find{ e => accumulator = op(accumulator, e); until(accumulator) }
accumulator
}
}
}
But it's ugly. Whenever I try a more declarative approach, I come up with even longer and almost surely slower code, akin to:
Iterator.iterate((start, items.iterator)){
case (acc, i) if until(acc) => (acc, i)
case (acc, i) if i.hasNext => (op(acc, i.next()), i)
case x => x
}.dropWhile {
case (acc, i) => !until(acc) && i.hasNext
}.next()._1
(A more functional variant would use Lists or Streams, but iterators have arguably lesser overhead than converting items to a Stream, as default implementation for the latter uses an iterator underneath anyway).
My questions are:
1) Does this concept have a name in functional programming, and if so, what is the pattern associated with its implementation?
2) What would be the best (i.e. concise, generic, lazy, and with the least overhead) way to implement it in Scala?
This is frowned upon by Scala purists, but you can use a return statement like this:
def foldWhile[A](zero: A)(until: A => Boolean)(op: (A, T) => A): A = items.foldLeft(zero) {
  case (a, _) if until(a) => return a // non-local return ends the fold early
  case (a, b) => op(a, b)
}
Or, if you are one of those frowning, and would like a purely functional solution without dirty imperative tricks, you can use something lazy, like an iterator or a stream:
items
.toStream // or .iterator - it doesn't really matter much in this case
.scanLeft(zero)(op)
.find(until)
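A runnable sketch of the scanLeft/find approach on an iterator. Note that it yields an Option: if no intermediate result satisfies the condition, the answer is None rather than the last accumulator. The names below are made up for the example.

```scala
object ScanFind {
  // Running sums of 1, 2, 3, ... up to the first sum above the limit.
  // The source iterator is infinite, so this relies on Iterator.scanLeft being lazy.
  def firstSumAbove(limit: Int): Option[Int] =
    Iterator.from(1).scanLeft(0)(_ + _).find(_ > limit)

  // Same idea on a finite collection; None when the limit is never exceeded.
  def firstSumAboveIn(items: List[Int], limit: Int): Option[Int] =
    items.iterator.scanLeft(0)(_ + _).find(_ > limit)
}
```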
The functional way of doing such things is via Tail Recursion:
import scala.annotation.tailrec
implicit class FoldWhile[T](val items: Iterable[T]) extends AnyVal {
  def foldWhile[A](zero: A)(until: A => Boolean)(op: (A, T) => A): A = {
    // stop as soon as the elements run out or the accumulator satisfies `until`
    @tailrec def loop(acc: A, remaining: Iterable[T]): A =
      if (remaining.isEmpty || until(acc)) acc else loop(op(acc, remaining.head), remaining.tail)
    loop(zero, items)
  }
}
Using recursion you can decide at each step whether you want to proceed, without using break and without any overhead, because tail-recursive calls are converted to iterations by the compiler.
Also, pattern matching is often used to decompose sequences. For example, if you had a List you could do:
implicit class FoldWhile[T](val items: List[T]) extends AnyVal {
  def foldWhile[A](zero: A)(until: A => Boolean)(op: (A, T) => A): A = {
    @tailrec def loop(acc: A, remaining: List[T]): A = remaining match {
      case Nil => acc
      case _ if until(acc) => acc // termination condition reached
      case h :: t => loop(op(acc, h), t)
    }
    loop(zero, items)
  }
}
Scala has the @scala.annotation.tailrec annotation, which forces compilation to fail if the function you're annotating is not tail recursive. I suggest you use it as much as you can, because it helps both to avoid errors and to document the code.
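A variant sketch of the same idea that walks an Iterator instead of repeatedly calling tail, which can be costly for some collections. It assumes, as in the question, that until returning true means "stop"; the names FoldWhileSyntax and FoldWhileOps are made up for the example.

```scala
import scala.annotation.tailrec

object FoldWhileSyntax {
  implicit class FoldWhileOps[T](private val items: Iterable[T]) {
    def foldWhile[A](zero: A)(until: A => Boolean)(op: (A, T) => A): A = {
      val it = items.iterator
      // Stops when the accumulator satisfies `until` or the elements run out.
      @tailrec def loop(acc: A): A =
        if (until(acc) || !it.hasNext) acc else loop(op(acc, it.next()))
      loop(zero)
    }
  }

  // Usage: sum elements until the running total exceeds 5.
  def demo: Int = List(1, 2, 3, 4, 5, 6).foldWhile(0)(_ > 5)(_ + _)
}
```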
The functional name for this is an Iteratee.
There are a bunch of references about this, but it's probably better to start from where the design ended up by reading the Pipes Tutorial and only if you're interested working backwards from there to see how it came from an early terminating left fold.
A right fold, when done lazily, can do early termination. In Haskell, for example, you can write the find function (return first element of list that satisfies predicate) with foldr:
find :: (a -> Bool) -> [a] -> Maybe a
find p = foldr (\a r -> if p a then Just a else r) Nothing
-- For reference:
foldr :: (a -> r -> r) -> r -> [a] -> r
foldr _ z [] = z
foldr f z (a:as) = f a (foldr f z as)
What happens when you try, say, find even [1..]? (Note that this is an infinite list!)
find even [1..]
= foldr (\a r -> if even a then Just a else r) Nothing [1..]
= if even 1
then Just 1
else foldr (\a r -> if even a then Just a else r) Nothing ([2..])
= if False
then Just 1
else foldr (\a r -> if even a then Just a else r) Nothing ([2..])
= foldr (\a r -> if even a then Just a else r) Nothing ([2..])
= if even 2
then Just 2
else foldr (\a r -> if even a then Just a else r) Nothing ([3..])
= if True
then Just 2
else foldr (\a r -> if even a then Just a else r) Nothing ([3..])
= Just 2
Laziness means that the function that we fold with (\a r -> if even a then Just a else r) gets to decide whether to force the r argument (the one whose evaluation requires us to recurse down the list) at all. So when even 2 evaluates to True, we pick the branch of the if ... then ... else ... that discards the result computed off the tail of the list, which means we never evaluate it. (It also runs in constant space. While programmers in eager functional languages learn to avoid foldr because of space and termination issues, those concerns don't always apply in lazy languages!)
This of course hinges on the fact that Haskell is lazily evaluated, but it should nevertheless be possible to simulate this in an eager language like Scala—I do know it has a lazy val feature that might be usable for this. It looks like you'd need to write a lazyFold function that does a right fold, but the recursion happens inside a lazy value. You might still have problems with space usage, though.
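One way this could look in Scala is to pass the folded tail as a by-name parameter rather than a lazy val. This is only a sketch: `foldRightLazy` is a made-up name, the recursion is not stack-safe (each skipped element adds a frame), and because it walks a stateful Iterator the combining function must use the tail at most once.

```scala
object LazyFold {
  // Right fold whose combining function receives the folded tail by name,
  // so the recursion into the tail only happens if `f` actually uses it.
  def foldRightLazy[A, B](items: Iterator[A], z: => B)(f: (A, => B) => B): B =
    if (items.hasNext) {
      val head = items.next()
      f(head, foldRightLazy(items, z)(f))
    } else z

  // find written as a lazy right fold, mirroring the Haskell definition.
  def find[A](items: Iterator[A])(p: A => Boolean): Option[A] =
    foldRightLazy[A, Option[A]](items, None)((a, rest) => if (p(a)) Some(a) else rest)
}
```

With this definition, LazyFold.find(Iterator.from(1))(_ % 2 == 0) terminates on an infinite iterator, just like find even [1..] does in Haskell.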

Scala: Ignore Future return value, but chain them

How should I write code when I do not care about the returned value?
Example:
for {
a <- getA // I do not care about a, but I need to wait for the future to finish
b <- getB
} yield (b)
Like this
for {
_ <- getA
b <- getB
} yield (b)
Or, if you are not a fan of for comprehensions, you can write:
getA.flatMap(_ => getB)
But I think most people will vote for the for comprehension.

Async computation with Validation in Scala using Scalaz

While writing a completely async library to access a remote service (using Play 2.0), I'm using Promise and Validation to create non-blocking calls whose type represents the failed and the valid result at once.
Promise comes from Play2-scala, whereas Validation comes from scalaz.
So here are the types of examples of such functions:
f :: A => Promise[Validation[E, B]]
g :: B => Promise[Validation[E, C]]
So far, so good. Now, if I want to compose them, I can simply use the fact that Promise provides a flatMap, so I can do it with a for-comprehension:
for (
x <- f(a);
y <- g(b)
) yield y
OK, I took a shortcut to my problem here because I didn't reuse the Validation results within the for-comprehension. So if I want to reuse x in g, here is how I could do it:
for (
x <- f(a); // x is a Validation
y <- x.fold(
fail => Promise.pure(x),
ok => g(ok)
)
) yield y
Fair enough, but this kind of boilerplate will pollute my code over and over again. The problem here is that I have a kind of two-level monadic structure like M[N[_]].
At this stage, is there any structure in functional programming that makes it easy to work with such a structure by skipping the second level:
for (
x <- f(a); //x is a B
y <- g(b)
) yield y
Now, below is how I achieved something similar.
I created a kind of monadic structure that wraps the two levels in one, let's say ValidationPromised, which pimped the Promise type with two methods:
def /~> [EE >: E, B](f: Validation[E, A] => ValidationPromised[EE, B]): ValidationPromised[EE, B] =
promised flatMap { valid =>
f(valid).promised
}
def /~~>[EE >: E, B](f: A => ValidationPromised[EE, B]): ValidationPromised[EE, B] =
promised flatMap { valid =>
valid.fold (
bad => Promise.pure(KO(bad)),
good => f(good).promised
)
}
This allows me to do such things
endPoint.service /~~> //get the service
(svc => //the service
svc.start /~~> (st => //get the starting elt
svc.create(None) /~~> //svc creates a new elt
(newE => //the created one
newEntry.link(st, newE) /~~> //link start and the new
(lnk => Promise.pure(OK((st, lnk, newE)))) //returns a triple => hackish
)
)
)
As we can see /~~> is pretty similar to flatMap but skips one level. The problem is the verbosity (that's why "for-comprehension" exists in Scala and "do" in Haskell).
Another point: I also have /~>, which acts like a map but works on the second level (instead of the Valid type, the third level).
So my second question is a corollary to the former: am I approaching a sustainable solution with this construction?
sorry to be that long
The concept you are looking for here is monad transformers. In brief, monad transformers compensate for monads not composing by allowing you to "stack" them.
You didn't mention the version of Scalaz you are using, but if you look in the scalaz-seven branch, you'll find ValidationT. This can be used to wrap any F[Validation[E, A]] into a ValidationT[F, E, A], where in your case F = Promise. If you change f and g to return ValidationT, then you can leave your code as
for {
x ← f(a)
y ← g(b)
} yield y
This will give you a ValidationT[Promise, E, B] as a result.
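To illustrate the idea without Scalaz or Play, here is a minimal hand-rolled transformer over the standard library's Future and Either. It is only a sketch of the technique: `FutureEither` is a hypothetical name (not Scalaz's ValidationT), Future stands in for Promise and Either for Validation, and f and g are made-up services.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object TransformerSketch {
  // Hypothetical minimal transformer: wraps Future[Either[E, A]] so that
  // map and flatMap work directly on the inner success value A.
  final case class FutureEither[E, A](run: Future[Either[E, A]]) {
    def map[B](f: A => B)(implicit ec: ExecutionContext): FutureEither[E, B] =
      FutureEither[E, B](run.map(_.map(f)))

    def flatMap[B](f: A => FutureEither[E, B])(implicit ec: ExecutionContext): FutureEither[E, B] =
      FutureEither[E, B](run.flatMap {
        case Right(a) => f(a).run
        case Left(e)  => Future.successful[Either[E, B]](Left(e)) // short-circuit on failure
      })
  }

  // Made-up services standing in for f and g from the question.
  def f(a: Int): FutureEither[String, Int] = {
    val checked: Either[String, Int] = if (a >= 0) Right(a + 1) else Left("negative input")
    FutureEither(Future.successful(checked))
  }

  def g(b: Int): FutureEither[String, String] =
    FutureEither(Future.successful[Either[String, String]](Right(s"value $b")))

  def pipeline(a: Int)(implicit ec: ExecutionContext): Future[Either[String, String]] =
    (for {
      x <- f(a) // x is an Int here, not an Either
      y <- g(x)
    } yield y).run

  // Blocking helper, only for trying the example out.
  def pipelineBlocking(a: Int): Either[String, String] =
    Await.result(pipeline(a)(ExecutionContext.global), 5.seconds)
}
```

The for-comprehension reads exactly like the one the question asks for, and a Left produced by f skips g entirely.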