Alternative to reduce for sending live stats - Scala

I have a silly question, but I couldn't figure out the cause:
import akka.{Done, NotUsed}
import akka.actor.Status.Success
import akka.actor.{ActorRef, ActorSystem}
import akka.stream.scaladsl.{Flow, RunnableGraph, Sink, Source}
import akka.stream.{ActorMaterializer, OverflowStrategy}
import scala.concurrent.Future
object Generic {
  def main(args: Array[String]): Unit = {
    implicit val system = ActorSystem("system")
    implicit val mat = ActorMaterializer()

    val sink: Sink[Any, Future[Done]] = Sink.foreach(x => println("Ans =====> " + x))

    val counts = Flow[String]
      .mapConcat(x => x.split("\\s").toList)
      .filter(!_.isEmpty)
      .groupBy(Int.MaxValue, identity)
      .map(x => x -> 1)
      .reduce((l, r) => (l._1, l._2 + r._2))
      .mergeSubstreams

    val fold: Flow[String, Int, NotUsed] = Flow[String].map(x => 1).fold(0)(_ + _)

    val words: RunnableGraph[ActorRef] = Source.actorRef(Int.MaxValue, OverflowStrategy.fail)
      .via(counts)
      .to(sink)

    val ref = words.run()

    for {
      ln <- scala.io.Source.stdin.getLines.takeWhile(_ != "-1")
    } {
      println("---> Message sent " + ln)
      ref ! ln
    }

    ref ! Success("end")
    Thread.sleep(5000)
    system.terminate()
  }
}
It does something very simple: I type sentences into the application terminal, it extracts the words, and it maintains a running frequency count for each word. This works as expected. The problem is:
The Source is an infinite stream, i.e. the output is only printed once I end the source. Can I refactor the program to always print live stats instead of having to end it? I understand that this behavior is expected because of reduce.
A lame way to do this is to put a print statement inside reduce. But can I do something else, like send live stats after each sentence to another sink (via a broadcast?)

Take a look at the scan combinator. It will give you the aggregating power of fold/reduce but it will emit intermediate results.
// .reduce((l, r) => (l._1, l._2 + r._2))
.scan("" → 0)((l, r) => (l._1, l._2 + r._2))
In addition, if you want to send the outputs to a logging Sink, you can look into alsoTo, which will effectively perform a broadcast to a side Sink of choice.
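For instance, here is a rough sketch combining both suggestions against the counts flow from the question (statsSink and countsWithStats are illustrative names, and the imports are the ones already in the question's snippet):
val statsSink: Sink[(String, Int), Future[Done]] =
  Sink.foreach(p => println("Live =====> " + p))

val countsWithStats: Flow[String, (String, Int), NotUsed] = Flow[String]
  .mapConcat(_.split("\\s").toList)
  .filter(_.nonEmpty)
  .groupBy(Int.MaxValue, identity)
  .map(_ -> 1)
  .scan("" -> 0)((l, r) => (r._1, l._2 + r._2)) // running count per word
  .mergeSubstreams
  .alsoTo(statsSink) // every intermediate count also goes to the side sink
Note that scan also emits its initial ("" -> 0) element once per substream, so you may want to filter those out downstream.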

Related

Concurrent Streams Doesn't Print Anything to Console

I'm trying to build an example of using the Stream.concurrently method in fs2. I'm implementing the producer/consumer pattern, using a Queue as the shared state:
import cats.effect.{ExitCode, IO, IOApp}
import cats.effect.std.{Queue, Random}
import fs2.Stream
import scala.concurrent.duration._

object Fs2Tutorial extends IOApp {

  val random: IO[Random[IO]] = Random.scalaUtilRandom[IO]
  val queue: IO[Queue[IO, Int]] = Queue.bounded[IO, Int](10)

  val producer: IO[Nothing] = for {
    r <- random
    q <- queue
    p <- r.betweenInt(1, 11)
      .flatMap(q.offer)
      .flatTap(_ => IO.sleep(1.second))
      .foreverM
  } yield p

  val consumer: IO[Nothing] = for {
    q <- queue
    c <- q.take.flatMap { n =>
      IO.println(s"Consumed $n")
    }.foreverM
  } yield c

  val concurrently: Stream[IO, Nothing] = Stream.eval(producer).concurrently(Stream.eval(consumer))

  override def run(args: List[String]): IO[ExitCode] = {
    concurrently.compile.drain.as(ExitCode.Success)
  }
}
I expect the program to print some "Consumed n", for some n. However, the program prints nothing to the console.
What's wrong with the above code?
You are not using the same Queue in the consumer and in the producer; rather, each of them creates its own new, independent Queue (the same happens with Random, by the way).
This is a common mistake made by newcomers who don't yet grasp the main principles behind a data type like IO.
When you write val queue: IO[Queue[IO, Int]] = Queue.bounded[IO, Int](10) you are saying that queue is a program which, when evaluated, will produce a value of type Queue[IO, Int]; that is the point of all this.
The program becomes a value, and like any value you can manipulate it in any way to produce new values, for example using flatMap. So when both consumer and producer build a new program by flatMapping queue, they each create a new, independent program / value.
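As a small standalone sketch of that point (the names queueRecipe, independent and shared are illustrative, not from the original code): flatMapping the recipe twice yields two unrelated queues, while sharing requires evaluating it once and passing the resulting Queue around.
import cats.effect.IO
import cats.effect.std.Queue

val queueRecipe: IO[Queue[IO, Int]] = Queue.bounded[IO, Int](10)

// Each <- below evaluates the recipe again, so q1 and q2 are different queues.
val independent: IO[Option[Int]] = for {
  q1 <- queueRecipe
  q2 <- queueRecipe
  _  <- q1.offer(42)
  n  <- q2.tryTake // None: q2 never saw the offer made to q1
} yield n

// Sharing: evaluate the recipe once and pass the resulting Queue around.
val shared: IO[Option[Int]] = queueRecipe.flatMap { q =>
  q.offer(42) >> q.tryTake // Some(42)
}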
You can fix that code like this:
import cats.effect.{IO, IOApp}
import cats.effect.std.{Queue, Random}
import cats.syntax.all._
import fs2.Stream
import scala.concurrent.duration._

object Fs2Tutorial extends IOApp.Simple {
  override final val run: IO[Unit] = {
    val resources =
      (
        Random.scalaUtilRandom[IO],
        Queue.bounded[IO, Int](10)
      ).tupled

    val concurrently =
      Stream.eval(resources).flatMap {
        case (random, queue) =>
          val producer =
            Stream
              .fixedDelay[IO](1.second)
              .evalMap(_ => random.betweenInt(1, 11))
              .evalMap(queue.offer)

          val consumer =
            Stream.fromQueueUnterminated(queue).evalMap(n => IO.println(s"Consumed $n"))

          producer.concurrently(consumer)
      }

    concurrently.interruptAfter(10.seconds).compile.drain >> IO.println("Finished!")
  }
}
(You can see it running here).
PS: I would recommend looking into the "Programs as Values" series by Fabio Labella: https://systemfw.org/archive.html

How can I merge an arbitrary number of sources in Akka stream?

I have n sources that I'd like to merge by priority in Akka streams. I'm basing my implementation on the GraphMergePrioritizedSpec, in which three prioritized sources are merged. I attempted to abstract away the number of Sources with the following:
import akka.NotUsed
import akka.stream.{ClosedShape, Graph, Materializer}
import akka.stream.scaladsl.{GraphDSL, MergePrioritized, RunnableGraph, Sink, Source}
import org.apache.activemq.ActiveMQConnectionFactory

class SourceMerger(
  sources: Seq[Source[java.io.Serializable, NotUsed]],
  priorities: Seq[Int],
  private val sink: Sink[java.io.Serializable, _]
) {

  require(sources.size == priorities.size, "Each source should have a priority")

  import GraphDSL.Implicits._

  private def partial(
    sources: Seq[Source[java.io.Serializable, NotUsed]],
    priorities: Seq[Int],
    sink: Sink[java.io.Serializable, _]
  ): Graph[ClosedShape, NotUsed] = GraphDSL.create() { implicit b =>
    val merge = b.add(MergePrioritized[java.io.Serializable](priorities))
    sources.zipWithIndex.foreach { case (s, i) =>
      s.shape.out ~> merge.in(i)
    }
    merge.out ~> sink
    ClosedShape
  }

  def merge(
    sources: Seq[Source[java.io.Serializable, NotUsed]],
    priorities: Seq[Int],
    sink: Sink[java.io.Serializable, _]
  ): RunnableGraph[NotUsed] = RunnableGraph.fromGraph(partial(sources, priorities, sink))

  def run()(implicit mat: Materializer): NotUsed = merge(sources, priorities, sink).run()(mat)
}
However, I get an error when running the following stub:
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Materializer}
import akka.stream.scaladsl.{Sink, Source}
import org.scalatest.{Matchers, WordSpecLike}
import akka.testkit.TestKit
import scala.collection.immutable.Iterable

class SourceMergerSpec extends TestKit(ActorSystem("SourceMerger")) with WordSpecLike with Matchers {

  implicit val materializer: Materializer = ActorMaterializer()

  "A SourceMerger" should {
    "merge by priority" in {
      val priorities: Seq[Int] = Seq(1, 2, 3)
      val highPriority = Iterable("message1", "message2", "message3")
      val mediumPriority = Iterable("message4", "message5", "message6")
      val lowPriority = Iterable("message7", "message8", "message9")
      val source1 = Source[String](highPriority)
      val source2 = Source[String](mediumPriority)
      val source3 = Source[String](lowPriority)
      val sources = Seq(source1, source2, source3)
      val subscriber = Sink.seq[java.io.Serializable]
      val merger = new SourceMerger(sources, priorities, subscriber)
      merger.run()
      source1.runWith(Sink.foreach(println))
    }
  }
}
The relevant stacktrace is here:
[StatefulMapConcat.out] is already connected
java.lang.IllegalArgumentException: [StatefulMapConcat.out] is already connected
at akka.stream.scaladsl.GraphDSL$Builder.addEdge(Graph.scala:1304)
at akka.stream.scaladsl.GraphDSL$Implicits$CombinerBase$class.$tilde$greater(Graph.scala:1431)
at akka.stream.scaladsl.GraphDSL$Implicits$PortOpsImpl.$tilde$greater(Graph.scala:1521)
at SourceMerger$$anonfun$partial$1$$anonfun$apply$1.apply(SourceMerger.scala:26)
at SourceMerger$$anonfun$partial$1$$anonfun$apply$1.apply(SourceMerger.scala:25)
It seems that the error comes from this:
sources.zipWithIndex.foreach { case (s, i) =>
  s.shape.out ~> merge.in(i)
}
Is it possible to merge an arbitrary number of Sources in the Akka Streams Graph DSL? If so, why isn't my attempt successful?
Primary Problem with Code Example
One big issue with the code snippet provided in the question is that source1 is connected both to the Sink from the merge call and to Sink.foreach(println). The same Source cannot be connected to multiple Sinks without an intermediate fan-out element.
Removing the Sink.foreach(println) may solve your problem outright.
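A minimal standalone sketch of such a fan-out (illustrative only, reusing the test data from the question):
val debugged =
  Source(List("message1", "message2", "message3"))
    .alsoTo(Sink.foreach(println)) // side sink for debugging

val allMessages = debugged.runWith(Sink.seq) // primary sink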
Simplified Design
The merging can be simplified based on the fact that all messages from a particular Source have the same priority. This means that you can sort the sources by their respective priority and then concatenate them all together:
private def partial(sources: Seq[Source[java.io.Serializable, NotUsed]],
                    priorities: Seq[Int],
                    sink: Sink[java.io.Serializable, _]): RunnableGraph[NotUsed] =
  sources.zip(priorities)
    .sortWith(_._2 < _._2)
    .map(_._1)
    .reduceOption(_ ++ _)
    .getOrElse(Source.empty[java.io.Serializable])
    .to(sink)
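A hypothetical usage of this simplified variant (assuming it replaces the original private method and an implicit Materializer is in scope):
// Priorities decide only the concatenation order here.
val graph: RunnableGraph[NotUsed] = partial(sources, priorities, Sink.foreach(println))
graph.run()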
Your code runs without the error if I replace
sources.zipWithIndex.foreach { case (s, i) =>
  s.shape.out ~> merge.in(i)
}
with
sources.zipWithIndex.foreach { case (s, i) =>
  s ~> merge.in(i)
}
I admit I'm not quite sure why! At any rate, s.shape is a StatefulMapConcat and that's the point where it's complaining about the out port already being connected. The problem occurs even if you only pass a single source, so the arbitrary number isn't the problem.
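For what it's worth, the more conventional Graph DSL idiom is to register each Source with the builder via b.add and wire the returned shape, which avoids touching s.shape entirely (a sketch against the original partial method, not tested):
sources.zipWithIndex.foreach { case (s, i) =>
  val src = b.add(s) // imports a copy of the source into this graph
  src.out ~> merge.in(i)
}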

NNTP client using akka streams in scala

I'm trying to implement an NNTP client that streams a list of commands to the server and parses the results back. I'm facing several problems:
the NNTP protocol doesn't have a unique delimiter that could be used to frame results. Some commands return multi-line responses. How do I handle that with streams?
how do I "map" the command issued to the server response, and wait for the end of the server response before sending the next command? (Throttling is not relevant here.)
how do I stop the stream processing on disconnection? (Currently, the program never returns.)
Here is my current implementation:
import akka.stream._
import akka.stream.scaladsl._
import akka.{ NotUsed, Done }
import akka.actor.ActorSystem
import akka.util.ByteString
import scala.concurrent._
import scala.concurrent.duration._
import java.nio.file.Paths
import scala.io.StdIn
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.Success
import scala.util.Failure

object AutomatedClient extends App {

  implicit val system = ActorSystem("NewsClientTest")
  implicit val materializer = ActorMaterializer()

  // MODEL //
  final case class Command(query: String)
  final case class CommandResult(
    resultCode: Int,
    resultStatus: String,
    resultList: Option[List[String]])
  final case class ParseException(message: String) extends RuntimeException

  // COMMAND HANDLING FUN //
  // out ->
  val sendCommand: Command => ByteString = c => ByteString(c.query + "\r\n")
  // in <-
  val parseCommandResultStatus: String => (Int, String) = s =>
    (s.take(3).toInt, s.drop(3).trim)
  val parseCommandResultList: List[String] => List[String] = l =>
    l.foldLeft(List().asInstanceOf[List[String]]) {
      case (acc, ".") => acc
      case (acc, e) => e.trim :: acc
    }.reverse
  val parseCommandResult: ByteString => Future[CommandResult] = b => Future {
    val resultLines = b.decodeString("UTF-8").split("\r\n")
    resultLines.length match {
      case 0 => throw new ParseException("empty result")
      case 1 =>
        val (code, text) = parseCommandResultStatus(resultLines.head)
        new CommandResult(code, text, None)
      case _ =>
        val (code, text) = parseCommandResultStatus(resultLines.head)
        new CommandResult(code, text, Some(parseCommandResultList(resultLines.tail.toList)))
    }
  }

  // STREAMS //
  // Flows
  val outgoing: Flow[Command, ByteString, NotUsed] = Flow fromFunction sendCommand
  val incoming: Flow[ByteString, Future[CommandResult], NotUsed] = Flow fromFunction parseCommandResult
  val protocol = BidiFlow.fromFlows(incoming, outgoing)

  // Sink
  val print: Sink[Future[CommandResult], _] = Sink.foreach(f =>
    f.onComplete {
      case Success(r) => println(r)
      case Failure(r) => println("error decoding command result")
    })

  // Source
  val testSource: Source[Command, NotUsed] = Source(List(
    new Command("help"),
    new Command("list"),
    new Command("quit")
  ))

  val (host, port) = ("localhost", 1119)

  Tcp()
    .outgoingConnection(host, port)
    .join(protocol)
    .runWith(testSource, print)
}
And here is the resulting output:
CommandResult(200,news.localhost NNRP Service Ready - newsmaster#localhost (posting ok),None)
CommandResult(100,Legal Commands,Some(List(article [<messageid>|number], authinfo type value, body [<messageid>|number], date, group newsgroup, head [<messageid>|number], help, last, list [active wildmat|active.times|counts wildmat], list [overview.fmt|newsgroups wildmat], listgroup newsgroup, mode reader, next, post, stat [<messageid>|number], xhdr field [range], xover [range], xpat field range pattern, xfeature useragent <client identifier>, xfeature compress gzip [terminator], xzver [range], xzhdr field [range], quit, 480 Authentication Required*, 205 Goodbye)))
We can see that the second CommandResult contains the results of both the "list" command and the "quit" command.

Akka streams: Reading multiple files

I have a list of files. I want:
To read from all of them as a single Source.
Files should be read sequentially, in-order. (no round-robin)
At no point should any file be required to be entirely in memory.
An error reading from a file should collapse the stream.
It felt like this should work: (Scala, akka-streams v2.4.7)
val sources = Seq("file1", "file2").map(new File(_)).map(f => FileIO.fromPath(f.toPath)
.via(Framing.delimiter(ByteString(System.lineSeparator), 10000, allowTruncation = true))
.map(bs => bs.utf8String)
)
val source = sources.reduce( (a, b) => Source.combine(a, b)(MergePreferred(_)) )
source.map(_ => 1).runWith(Sink.reduce[Int](_ + _)) // counting lines
But that results in a compile error since FileIO has a materialized value associated with it, and Source.combine doesn't support that.
Mapping the materialized value away makes me wonder how file-read errors get handled, but does compile:
val sources = Seq("file1", "file2").map(new File(_)).map(f => FileIO.fromPath(f.toPath)
.via(Framing.delimiter(ByteString(System.lineSeparator), 10000, allowTruncation = true))
.map(bs => bs.utf8String)
.mapMaterializedValue(f => NotUsed.getInstance())
)
val source = sources.reduce( (a, b) => Source.combine(a, b)(MergePreferred(_)) )
source.map(_ => 1).runWith(Sink.reduce[Int](_ + _)) // counting lines
But throws an IllegalArgumentException at runtime:
java.lang.IllegalArgumentException: requirement failed: The inlets [] and outlets [MergePreferred.out] must correspond to the inlets [MergePreferred.preferred] and outlets [MergePreferred.out]
The code below is not as terse as it could be, in order to clearly modularize the different concerns.
// Given a stream of bytestrings delimited by the system line separator we can get lines represented as Strings
val lines = Framing.delimiter(ByteString(System.lineSeparator), 10000, allowTruncation = true).map(bs => bs.utf8String)
// given as stream of Paths we read those files and count the number of lines
val lineCounter = Flow[Path].flatMapConcat(path => FileIO.fromPath(path).via(lines)).fold(0l)((count, line) => count + 1).toMat(Sink.head)(Keep.right)
// Here's our test data source (replace paths with real paths)
val testFiles = Source(List("somePathToFile1", "somePathToFile2").map(new File(_).toPath))
// Runs the line counter over the test files, returns a Future, which contains the number of lines, which we then print out to the console when it completes
testFiles runWith lineCounter foreach println
Update: Oh, I didn't see the accepted answer because I didn't refresh the page >_<. I'll leave this here anyway, since I've also added some notes about error handling.
I believe the following program does what you want:
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, IOResult}
import akka.stream.scaladsl.{FileIO, Flow, Framing, Keep, Sink, Source}
import akka.util.ByteString
import scala.concurrent.{Await, Future}
import scala.util.{Failure, Success}
import scala.util.control.NonFatal
import java.nio.file.Paths
import scala.concurrent.duration._

object TestMain extends App {
  implicit val actorSystem = ActorSystem("test")
  implicit val materializer = ActorMaterializer()
  implicit def ec = actorSystem.dispatcher

  val sources = Vector("build.sbt", ".gitignore")
    .map(Paths.get(_))
    .map(p =>
      FileIO.fromPath(p)
        .viaMat(Framing.delimiter(ByteString(System.lineSeparator()), Int.MaxValue, allowTruncation = true))(Keep.left)
        .mapMaterializedValue { f =>
          f.onComplete {
            case Success(r) if r.wasSuccessful => println(s"Read ${r.count} bytes from $p")
            case Success(r) => println(s"Something went wrong when reading $p: ${r.getError}")
            case Failure(NonFatal(e)) => println(s"Something went wrong when reading $p: $e")
          }
          NotUsed
        }
    )

  val finalSource = Source(sources).flatMapConcat(identity)
  val result = finalSource.map(_ => 1).runWith(Sink.reduce[Int](_ + _))

  result.onComplete {
    case Success(n) => println(s"Read $n lines total")
    case Failure(e) => println(s"Reading failed: $e")
  }

  Await.ready(result, 10.seconds)
  actorSystem.terminate()
}
The key here is the flatMapConcat() method: it transforms each element of a stream into a source and returns a stream of the elements yielded by these sources as they are run sequentially, one after another.
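A tiny standalone illustration of that ordering (assuming an implicit materializer is in scope):
// Each inner Source is drained completely before the next one starts.
Source(List(1, 2, 3))
  .flatMapConcat(n => Source(List(s"$n-a", s"$n-b")))
  .runForeach(println)
// prints 1-a, 1-b, 2-a, 2-b, 3-a, 3-b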
As for handling errors, you can either add a handler to the future in the mapMaterializedValue argument, or you can handle the final error of the running stream by putting a handler on the future materialized by the final Sink. I did both in the example above, and if you test it on, say, a nonexistent file, you'll see that the same error message is printed twice. Unfortunately, flatMapConcat() does not collect materialized values, and frankly I can't see a way it could do so sanely, so you have to handle them separately if necessary.
I do have one answer out of the gate - don't use akka.FileIO. This appears to work fine, for example:
val sources = Seq("sample.txt", "sample2.txt").map(io.Source.fromFile(_).getLines()).reduce(_ ++ _)
val source = Source.fromIterator[String](() => sources)
val lineCount = source.map(_ => 1).runWith(Sink.reduce[Int](_ + _))
I'd still like to know whether there's a better solution.

How to retry failed Unmarshalling of a stream of akka-http requests?

I know it's possible to restart an akka-stream on error with a supervision strategy on the ActorMaterializer:
val decider: Supervision.Decider = {
  case _: ArithmeticException => Supervision.Resume
  case _ => Supervision.Stop
}
implicit val materializer = ActorMaterializer(
  ActorMaterializerSettings(system).withSupervisionStrategy(decider))
val source = Source(0 to 5).map(100 / _)
val result = source.runWith(Sink.fold(0)(_ + _))
// the element causing division by zero will be dropped
// result here will be a Future completed with Success(228)
source: http://doc.akka.io/docs/akka/2.4.2/scala/stream/stream-error.html
I have the following use case.
/***
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  "com.typesafe.akka" %% "akka-http-experimental" % "2.4.2",
  "com.typesafe.akka" %% "akka-http-spray-json-experimental" % "2.4.2"
)
*/
import akka.http.scaladsl.unmarshalling.Unmarshal
import akka.http.scaladsl.marshallers.sprayjson.SprayJsonSupport._
import spray.json._
import akka.http.scaladsl.Http
import akka.http.scaladsl.model._
import Uri.Query
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl._
import scala.util.{Success, Failure}
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import scala.concurrent.Future

object SO extends DefaultJsonProtocol {
  implicit val system = ActorSystem()
  import system.dispatcher
  implicit val materializer = ActorMaterializer()

  val httpFlow = Http().cachedHostConnectionPoolHttps[HttpRequest]("example.org")

  def search(query: Char) = {
    val request = HttpRequest(uri = Uri("https://example.org").withQuery(Query("q" -> query.toString)))
    (request, request)
  }

  case class Hello(name: String)
  implicit val helloFormat = jsonFormat1(Hello)

  val searches =
    Source('a' to 'z').map(search).via(httpFlow).mapAsync(1) {
      case (Success(response), _) => Unmarshal(response).to[Hello]
      case (Failure(e), _) => Future.failed(e)
    }

  def main(): Unit = {
    Await.result(searches.runForeach(_ => println), Duration.Inf)
    ()
  }
}
Sometimes a query will fail to unmarshal. I want to use a retry strategy on that single query
(https://example.org/?q=v) without restarting the whole alphabet.
I think it will be hard (or impossible) to implement it with a supervisor strategy, mostly because you want to retry "n" times (according to the discussion in comments), and I don't think you can track the number of times the element was tried when using supervision.
I think there are two ways to solve this issue. Either handle the risky operation as a separate stream or create a graph, which will do error handling. I will propose two solutions.
Note also that Akka Streams distinguishes between errors and failures, so if you don't handle your failures they will eventually collapse the flow (if no supervision strategy is introduced), so in the example below I convert them to Either, which represents either success or error.
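As a tiny aside (an illustrative sketch, assuming an implicit ExecutionContext and Materializer are in scope), the difference matters because a failed Future inside mapAsync fails the whole stream unless it is first turned into an ordinary value:
// Unhandled failure: the stream fails at the element 0.
Source(List(1, 0, 2))
  .mapAsync(1)(n => Future(10 / n))
  .runWith(Sink.seq)

// Failure turned into a value: the stream keeps going and emits a Left.
Source(List(1, 0, 2))
  .mapAsync[Either[Throwable, Int]](1) { n =>
    Future(10 / n).map(Right(_)).recover { case ex => Left(ex) }
  }
  .runWith(Sink.seq)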
Separate stream
What you can do is to treat each alphabet letter as a separate stream and handle failures for each letter separately with the retry strategy, and some delay.
// this comes after your helloFormat
// note that the method is somewhat simpler because it's
// using implicit dispatcher and scheduler from outside scope,
// you may also want to pass it as implicit arguments
def retry[T](f: => Future[T], delay: FiniteDuration, c: Int): Future[T] =
  f.recoverWith {
    // you may want to only handle certain exceptions here...
    case ex: Exception if c > 0 =>
      println(s"failed - will retry ${c - 1} more times")
      akka.pattern.after(delay, system.scheduler)(retry(f, delay, c - 1))
  }
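Note that f is a by-name parameter, so each retry re-evaluates the whole expression instead of reusing the already-failed Future. A small hypothetical usage (flakyCall is illustrative, assuming scala.concurrent.duration._ is in scope):
def flakyCall(): Future[Int] =
  if (scala.util.Random.nextBoolean()) Future.successful(42)
  else Future.failed(new RuntimeException("boom"))

// flakyCall() is re-run on every attempt, up to 3 retries, 1 second apart.
val eventually: Future[Int] = retry(flakyCall(), 1.second, 3)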
val singleElementFlow = httpFlow.mapAsync[Hello](1) {
  case (Success(response), _) =>
    val f = Unmarshal(response).to[Hello]
    f.recoverWith {
      case ex: Exception =>
        // see https://github.com/akka/akka/issues/20192
        response.entity.dataBytes.runWith(Sink.ignore).flatMap(_ => f)
    }
  case (Failure(e), _) => Future.failed(e)
}

// so the searches can either go ok or not, for each letter, we will retry up to 3 times
val searches =
  Source('a' to 'z').map(search).mapAsync[Either[Throwable, Hello]](1) { elem =>
    println(s"trying $elem")
    retry(
      Source.single(elem).via(singleElementFlow).runWith(Sink.head[Hello]),
      1.seconds, 3
    ).map(ok => Right(ok)).recover { case ex => Left(ex) }
  }
// end
Graph
This method integrates failures into the graph and allows for retries. This example runs all requests in parallel and prefers to retry those which failed, but if you don't want this behaviour and would rather run them one by one, I believe that is also possible.
// this comes after your helloFormat
// you may need to have your own class if you
// want to propagate failures for example, but we will use
// right value to keep track of how many times we have
// tried the request
type ParseResult = Either[(HttpRequest, Int), Hello]

def search(query: Char): (HttpRequest, (HttpRequest, Int)) = {
  val request = HttpRequest(uri = Uri("https://example.org").withQuery(Query("q" -> query.toString)))
  (request, (request, 0)) // let's use this opaque value to count how many times we tried to search
}

val g = GraphDSL.create() { implicit b =>
  import GraphDSL.Implicits._

  val searches = b.add(Flow[Char])

  val tryParse =
    Flow[(Try[HttpResponse], (HttpRequest, Int))].mapAsync[ParseResult](1) {
      case (Success(response), (req, tries)) =>
        println(s"trying parse response to $req for $tries")
        Unmarshal(response).to[Hello].
          map(h => Right(h)).
          recoverWith {
            case ex: Exception =>
              // see https://github.com/akka/akka/issues/20192
              response.entity.dataBytes.runWith(Sink.ignore).map { _ =>
                Left((req, tries + 1))
              }
          }
      case (Failure(e), _) => Future.failed(e)
    }

  val broadcast = b.add(Broadcast[ParseResult](2))

  val nonErrors = b.add(Flow[ParseResult].collect {
    case Right(x) => x
    // you may also want to handle here the Lefts that exceeded the retry count
  })

  val errors = Flow[ParseResult].collect {
    case Left(x) if x._2 < 3 => (x._1, x)
  }

  val merge = b.add(MergePreferred[(HttpRequest, (HttpRequest, Int))](1, eagerComplete = true))

  // #formatter:off
  searches.map(search) ~> merge ~> httpFlow ~> tryParse ~> broadcast ~> nonErrors
                          merge.preferred <~ errors <~ broadcast
  // #formatter:on

  FlowShape(searches.in, nonErrors.out)
}
def main(args: Array[String]): Unit = {
  val source = Source('a' to 'z')
  val sink = Sink.seq[Hello]

  source.via(g).toMat(sink)(Keep.right).run().onComplete {
    case Success(seq) =>
      println(seq)
    case Failure(ex) =>
      println(ex)
  }
}
Basically what happens here is: we run searches through httpFlow and then try to parse the response; we then broadcast the result and split it into errors and non-errors. The non-errors go to the sink, and the errors are sent back into the loop. If the number of retries exceeds the limit, we ignore the element, but you could also do something else with it.
Anyway I hope this gives you some idea.
For the streams solution above, any retries for the last element in the stream won't execute. That's because when the upstream completes after sending the last element, the merge will also complete. After that, the only output can come from the non-retry outlet, but since the element goes to the retry branch, that gets completed too.
If you need all input elements to generate an output, you'll need an extra mechanism to stop the upstream completion from reaching the process-and-retry graph. One possibility is to use a BidiFlow which monitors the input and output of the process-and-retry graph to ensure all the required outputs have been generated (for the observed inputs) before propagating the completion. In the simple case, that could just mean counting input and output elements.