Akka streams pass through flow limiting Parallelism / throughput of processing flow - scala

I have a use case where I want to send a message to an external system, but the flow that sends this message takes and returns a type I can't use downstream. This is a great use case for the pass-through flow. I am using the implementation here. Initially I was worried that this flow wouldn't work if the processingFlow uses mapAsyncUnordered, since the processing flow may reorder messages and the ZipWith may then push out a tuple with an incorrect pair, e.g. in the following example:
val testSource = Source(1 until 50)
val processingFlow: Flow[Int, Int, NotUsed] = Flow[Int].mapAsyncUnordered(10)(x => Future {
  Thread.sleep(Random.nextInt(50))
  x * 10
})
val passThroughFlow = PassThroughFlow(processingFlow, Keep.both)
val future = testSource.via(passThroughFlow).runWith(Sink.seq)
I would expect that the processing flow could reorder its outputs with respect to its inputs, and I would get a result such as:
[(30,1), (40,2),(10,3),(10,4), ...]
with the right side (the passed-through element) always in order, but the left side, which goes through my mapAsyncUnordered, potentially joined with an incorrect element to make a bad tuple.
Instead I actually get:
[(10,1), (20,2),(30,3),(40,4), ...]
every time. Upon further investigation I noticed the code was running slowly, and in fact it's not running in parallel at all despite my mapAsyncUnordered. I tried introducing a buffer before and after as well as an async boundary, but it always seems to run sequentially. This explains why the output is always ordered, but I want my processing flow to have a higher throughput.
I came up with the following workaround:
object PassThroughFlow {
  def keepRight[A, A1](processingFlow: Flow[A, A1, NotUsed]): Flow[A, A, NotUsed] =
    keepBoth[A, A1](processingFlow).map(_._2)
  def keepBoth[A, A1](processingFlow: Flow[A, A1, NotUsed]): Flow[A, (A1, A), NotUsed] =
    Flow.fromGraph(GraphDSL.create() { implicit builder => {
      import GraphDSL.Implicits._
      val broadcast = builder.add(Broadcast[A](2))
      val zip = builder.add(ZipWith[A1, A, (A1, A)]((left, right) => (left, right)))
      broadcast.out(0) ~> processingFlow ~> zip.in0
      broadcast.out(1) ~> zip.in1
      FlowShape(broadcast.in, zip.out)
    }
    })
}
object ParallelPassThroughFlow {
  def keepRight[A, A1](parallelism: Int, processingFlow: Flow[A, A1, NotUsed]): Flow[A, A, NotUsed] =
    keepBoth(parallelism, processingFlow).map(_._2)
  def keepBoth[A, A1](parallelism: Int, processingFlow: Flow[A, A1, NotUsed]): Flow[A, (A1, A), NotUsed] = {
    Flow.fromGraph(GraphDSL.create() { implicit builder =>
      import GraphDSL.Implicits._
      val fanOut = builder.add(Balance[A](outputPorts = parallelism))
      val merger = builder.add(Merge[(A1, A)](inputPorts = parallelism, eagerComplete = false))
      Range(0, parallelism).foreach { n =>
        val passThrough = PassThroughFlow.keepBoth(processingFlow)
        fanOut.out(n) ~> passThrough ~> merger.in(n)
      }
      FlowShape(fanOut.in, merger.out)
    })
  }
}
Two questions:
In the original implementation, why does the zip inside the pass-through flow limit the parallelism of the mapAsyncUnordered?
Is my workaround sound, or could it be improved? I basically fan out my input to multiple stacks of the pass-through flow and merge it all back together. It seems to have the properties that I want (parallel, yet maintains order even if the processing flow reorders), yet something doesn't feel right.

The behavior you're witnessing is a result of how broadcast and zip work: broadcast emits downstream when all of its outputs signal demand; zip waits for all of its inputs before signaling demand (and emitting downstream).
broadcast.out(0) ~> processingFlow ~> zip.in0
broadcast.out(1) ~> zip.in1
Consider the movement of the first element (1) through the above graph. 1 is broadcast to both processingFlow and zip. zip immediately receives one of its inputs (1) and waits for its other input (10), which will take a little longer to arrive. Only when zip gets both 1 and 10 does it pull for more elements from upstream, thus triggering the movement of the second element (2) through the stream. And so on.
As for your ParallelPassThroughFlow, I don't know why "something doesn't feel right" to you.
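If the processing step can be written as a function A => Future[B] (an assumption; the question treats processingFlow as an opaque Flow), one alternative that avoids the Broadcast/Zip lock-step is to pair each result with its input inside the future itself. A minimal sketch, assuming Akka 2.6 (where an implicit ActorSystem supplies the materializer); pairedAsync and slowTimesTen are hypothetical names, not part of Akka:

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.Random

object PairedAsync {
  // Hypothetical helper: runs f with the given parallelism and emits (result, input).
  // The pairing happens inside the Future, so reordering cannot mix up the tuples.
  def pairedAsync[A, B](parallelism: Int)(f: A => Future[B])(
      implicit ec: ExecutionContext): Flow[A, (B, A), NotUsed] =
    Flow[A].mapAsyncUnordered(parallelism)(a => f(a).map(b => (b, a)))

  def run(): Seq[(Int, Int)] = {
    implicit val system: ActorSystem = ActorSystem("paired")
    import system.dispatcher
    try {
      // Stand-in for the slow external call from the question.
      val slowTimesTen: Int => Future[Int] = x =>
        Future { Thread.sleep(Random.nextInt(50)); x * 10 }
      val result = Source(1 until 50).via(pairedAsync(10)(slowTimesTen)).runWith(Sink.seq)
      Await.result(result, 30.seconds)
    } finally system.terminate()
  }
}
```

With this shape, every tuple holds an input and its own result, although the tuples themselves may arrive in a different order than the inputs.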

Related

Fold and reduce show non-deterministic behavior when ran in parallel, why?

I'm trying to count occurrences of items using Akka Streams.
The example below is a simplified version of what I have. I need two pipelines to work concurrently. For some reason, the printed results aren't correct.
Does anyone know why this happens? Am I missing something important regarding substreams?
/**
 * SIMPLE EXAMPLE
 */
object TestingObject {
  import akka.actor.ActorSystem
  import akka.stream._
  import akka.stream.scaladsl._
  import java.nio.file.Paths
  import akka.util.ByteString
  import counting._
  import graph_components._
  // implicit actor system
  implicit val system: ActorSystem = ActorSystem("Sys")
  def main(args: Array[String]): Unit = {
    val customFlow = Flow.fromGraph(GraphDSL.create() { implicit builder =>
      import GraphDSL.Implicits._
      // Components
      val A = builder.add(Balance[(Int, Int)](2, waitForAllDownstreams = true))
      val B1 = builder.add(mergeCountFold.async)
      val B2 = builder.add(mergeCountFold.async)
      val C = builder.add(Merge[(Int, Int)](2))
      val D = builder.add(mergeCountReduce)
      // Graph
      A ~> B1 ~> C ~> D
      A ~> B2 ~> C
      FlowShape(A.in, D.out)
    })
    // Run
    Source(0 to 101)
      .groupBy(10, x => x % 4)
      .map(x => (x % 4, 1))
      .via(customFlow)
      .mergeSubstreams
      .to(Sink.foreach(println)).run()
  }
  def mergeCountReduce = Flow[(Int, Int)].reduce((l, r) => {
    println("REDUCING")
    (l._1, l._2 + r._2)
  })
  def mergeCountFold = Flow[(Int, Int)].fold[(Int,Int)](0,0)((l, r) => {
    println("FOLDING")
    (r._1, l._2 + r._2)
  })
}
Two observations:
mergeCountReduce will emit the first key it saw with the sum of the values seen (and will fail the stream if it didn't see any elements)
mergeCountFold will emit the last key it saw and the sum of the values seen (and will emit a key and value of zero if it didn't see any elements)
(In both cases, though, the key is always the same.)
Neither of those observations is affected by the async boundary.
In the context of the preceding Balance operator, though, async introduces an implicit buffer, which prevents the graph it wraps from backpressuring until that buffer is full. Balance sends stream values to the first output which isn't backpressuring, so if the stage after Balance is not dramatically slower than the upstream, Balance may send values only to one output (B1 in this case).
In that scenario, with reduce, B1 would emit the key and count, while B2 fails, causing the whole stream to fail.
For fold, in that scenario, B1 would emit the key and count, while B2, not having seen any values would emit (0,0). The merge would emit them in the order they emitted (reasonable to assume a 50/50 chance), so the final fold would then either have the key and the count or zero and the count.
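The two observations, and the two possible outcomes after the merge, can be reproduced with plain Scala collections. This is only an analogy: it assumes List.reduce and List.foldLeft stand in for the stream operators (an empty List.reduce throws, much as reduce on an empty stream fails), and the object name and the count of 26 are illustrative.

```scala
import scala.util.Try

object FoldVsReduce {
  type KV = (Int, Int)

  // Same combining function as mergeCountReduce: keeps the first key, sums the values.
  def countReduce(xs: List[KV]): Try[KV] =
    Try(xs.reduce((l: KV, r: KV) => (l._1, l._2 + r._2)))

  // Same combining function as mergeCountFold: keeps the last key, sums the values.
  def countFold(xs: List[KV]): KV =
    xs.foldLeft((0, 0))((l, r) => (r._1, l._2 + r._2))

  // A branch that received all 26 elements for key 1, and a branch that received none.
  val branchWithData: KV = countFold(List.fill(26)((1, 1)))
  val emptyBranch: KV = countFold(Nil)

  // The final reduce keeps the key of whichever branch result arrives first.
  val dataFirst: Try[KV] = countReduce(List(branchWithData, emptyBranch))
  val emptyFirst: Try[KV] = countReduce(List(emptyBranch, branchWithData))
}
```

Here emptyBranch is (0, 0), so the final key depends purely on arrival order, while reducing an empty list fails outright.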

How to explain this Akka Streams graph from official doc?

I have a couple of questions for this sample code hosted officially here:
val topHeadSink = Sink.head[Int]
val bottomHeadSink = Sink.head[Int]
val sharedDoubler = Flow[Int].map(_ * 2)
RunnableGraph.fromGraph(GraphDSL.create(topHeadSink, bottomHeadSink)((_, _)) { implicit builder =>
  (topHS, bottomHS) =>
    import GraphDSL.Implicits._
    val broadcast = builder.add(Broadcast[Int](2))
    Source.single(1) ~> broadcast.in
    broadcast.out(0) ~> sharedDoubler ~> topHS.in
    broadcast.out(1) ~> sharedDoubler ~> bottomHS.in
    ClosedShape
})
When do you pass in a graph through create?
Why are topHeadSink, bottomHeadSink passed in through create, but sharedDoubler is not? What is the difference between them?
When do you need builder.add?
Can I create a broadcast outside the graph without builder.add? If I add a couple of flows inside the graph, should I add the flows via builder.add as well? It is very confusing that sometimes we need builder.add and sometimes we do not.
Update
I feel this is still confusing:
The difference between these approaches is that importing using builder.add(...) ignores the materialized value of the imported graph, while importing via the factory method allows its inclusion.
topHS, bottomHS are imported from create, so they will keep their materialized value. What if I do builder.add(topHS)?
And how do you explain sharedDoubler: does it have a materialized value or not? What if I use builder.add with it?
What does this mean, the ((_,_)) of GraphDSL.create(topHeadSink, bottomHeadSink)((_, _))?
It looks like boilerplate we just need, but I am not sure what it is.
When do you pass in a graph through create?
When you want to obtain the materialized value of the graph that you pass to the create factory method. The type of the RunnableGraph in your question is RunnableGraph[(Future[Int], Future[Int])], meaning that the materialized value of the graph is (Future[Int], Future[Int]):
val g = RunnableGraph.fromGraph(...).run() // (Future[Int], Future[Int])
val topHeadSinkResult = g._1 // Future[Int]
val bottomHeadSinkResult = g._2 // Future[Int]
Now consider the following variant, which defines the sinks "inside" the graph and discards the materialized value:
val g2 = RunnableGraph.fromGraph(GraphDSL.create() { implicit builder =>
  import GraphDSL.Implicits._
  val topHeadSink = Sink.head[Int]
  val bottomHeadSink = Sink.head[Int]
  val broadcast = builder.add(Broadcast[Int](2))
  Source.single(1) ~> broadcast.in
  broadcast.out(0) ~> sharedDoubler ~> topHeadSink
  broadcast.out(1) ~> sharedDoubler ~> bottomHeadSink
  ClosedShape
}).run() // NotUsed
The value of g2 is NotUsed.
When do you need builder.add?
All of the components of a graph must be added to the builder, but there are variants of the ~> operator that add the most commonly used components--such as Source and Flow--to the builder under the covers. However, junction operations that perform a fan-in (such as Merge) or a fan-out (such as Broadcast) must be explicitly passed to builder.add if you're using the Graph DSL.
Note that for simple graphs, you can use junctions without having to use the Graph DSL. Here is an example from the documentation:
val sendRemotely = Sink.actorRef(actorRef, "Done")
val localProcessing = Sink.foreach[Int](_ => /* do something useful */ ())
val sink = Sink.combine(sendRemotely, localProcessing)(Broadcast[Int](_))
Source(List(0, 1, 2)).runWith(sink)
What does this mean? the ((_,_)) of GraphDSL.create(topHeadSink, bottomHeadSink)((_, _))?
It's a curried parameter that specifies which materialized value(s) you want to retain. Using ((_, _)) here is the same as:
val g = RunnableGraph.fromGraph(GraphDSL.create(topHeadSink, bottomHeadSink)((t, b) => (t, b)) {
implicit builder => (topHS, bottomHS) =>
...
}).run() // (Future[Int], Future[Int])
In other words, ((_, _)) in this context is shorthand for ((t, b) => (t, b)), which preserves the respective materialized values of the two sinks that are passed in. If, for example, you want to keep only the materialized value of topHeadSink, you could change the call to the following:
val g = RunnableGraph.fromGraph(GraphDSL.create(topHeadSink, bottomHeadSink)((t, _) => t) {
implicit builder => (topHS, bottomHS) =>
...
}).run() // Future[Int]
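The placeholder shorthand is ordinary Scala and can be tried without Akka. A minimal sketch, where combine is a hypothetical stand-in that only mimics the curried shape of GraphDSL.create(g1, g2)(combineMat) and is not an Akka method:

```scala
object MatCombine {
  // Hypothetical stand-in: takes two values, then a curried function that decides what to keep.
  def combine[M1, M2, Mat](m1: M1, m2: M2)(combineMat: (M1, M2) => Mat): Mat =
    combineMat(m1, m2)

  // (_, _) expands to (t, b) => (t, b): keep both values as a tuple.
  val both = combine("top", 2)((_, _))
  val explicitBoth = combine("top", 2)((t, b) => (t, b))

  // Keep only the first value and discard the second.
  val onlyTop = combine("top", 2)((t, _) => t)
}
```

Both spellings of the two-argument function yield the same tuple, and the last variant returns just the first value.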

Akka-streams mapConcat not working with cycled RunnableGraph

I have a RunnableGraph like the following. When there is a simple map between the broadcast and merge stages, everything is fine. However, with mapConcat, the graph stops working after consuming the first element.
I want to know why it doesn't work.
RunnableGraph.fromGraph(GraphDSL.create() { implicit b =>
  import GraphDSL.Implicits._
  val M = b.add(MergePreferred[Int](1))
  val B = b.add(Broadcast[Int](2))
  val S = Source(List(3))
  S ~> M ~> Flow[Int].map { s => println(s); s } ~> B ~> Sink.ignore
  M.preferred <~ Flow[Int].map(x => List.fill(3)(x-1)).mapConcat(x => {println(x); x}).filter(_ > 0) <~ B
  ClosedShape
})
// run() output:
// 3
// List(2,2,2)
The mapConcat stage blocks the feedback loop, and that is expected. Consider the following chain of events:
the mapConcat function prints List(2,2,2)
the mapConcat stage needs demand to emit the first of the 3 available elements (2, 2, 2)
the demand has to come from the Merge stage, and therefore from the Broadcast stage.
the Broadcast stage backpressures if any of its downstreams backpressures. Its downstreams are a Sink.ignore (which never backpressures) and the mapConcat itself.
the mapConcat backpressures if "there are still remaining elements from the previously calculated collection", as per the docs. This is indeed the case.
In other words, your cycle is unbalanced. You are introducing more elements in the feedback loop than you are removing.
This issue is explained in detail in this documentation page, where a couple of solutions are also presented. For your specific case, because of the filter stage you have, introducing a buffer larger than 13 would print all the elements. However, note that the graph will just hang and not complete afterwards.
S ~> M ~> Flow[Int].map { s => println(s); s } ~> B ~> Sink.ignore
M.preferred <~ Flow[Int].buffer(20, OverflowStrategy.dropHead) <~ Flow[Int].map(x => List.fill(3)(x-1)).mapConcat(x => {println(x); x}).filter(_ > 0) <~ B
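One plausible reading of where the number 13 comes from is simple counting: every element n > 0 that passes the merge is printed once and then replaced by three copies of n - 1, and copies equal to 0 are removed by the filter. The sketch below is plain Scala with a hypothetical helper name; it only tallies elements and does not model demand or buffering.

```scala
object FeedbackCount {
  // Number of elements that pass through the merge stage when the cycle starts from `seed`:
  // the element itself, plus everything produced by its `fanOut` copies of seed - 1.
  def emitted(seed: Int, fanOut: Int = 3): Int =
    if (seed <= 0) 0 else 1 + fanOut * emitted(seed - 1, fanOut)
}
```

For the graph in the question, FeedbackCount.emitted(3) is 1 + 3 + 9 = 13.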

Elegant way of reusing akka-stream flows

I am looking for a way to easily reuse akka-stream flows.
I treat the Flow I intend to reuse as a function, so I would like to keep its signature like:
Flow[Input, Output, NotUsed]
Now when I use this flow I would like to be able to 'call' this flow and keep the result aside for further processing.
So I want to start with a Flow emitting [Input], apply my flow, and proceed with a Flow emitting [(Input, Output)].
example:
val s: Source[Int, NotUsed] = Source(1 to 10)
val stringIfEven = Flow[Int].filter(_ % 2 == 0).map(_.toString)
val via: Source[(Int, String), NotUsed] = ???
Now this is not possible in a straightforward way, because combining the flow with .via() would give me a Flow emitting just [Output]:
val via: Source[String, NotUsed] = s.via(stringIfEven)
The alternative is to make my reusable flow emit [(Input, Output)], but this requires every flow to push its input through all the stages and makes my code look bad.
So I came up with a combiner like this:
def tupledFlow[In,Out](flow: Flow[In, Out, _]): Flow[In, (In,Out), NotUsed] = {
  Flow.fromGraph(GraphDSL.create() { implicit b =>
    import GraphDSL.Implicits._
    val broadcast = b.add(Broadcast[In](2))
    val zip = b.add(Zip[In, Out])
    broadcast.out(0) ~> zip.in0
    broadcast.out(1) ~> flow ~> zip.in1
    FlowShape(broadcast.in, zip.out)
  })
}
This broadcasts the input both to the flow and along a parallel line directly to the Zip stage, where I join the values into a tuple. It can then be applied elegantly:
val tupled: Source[(Int, String), NotUsed] = s.via(tupledFlow(stringIfEven))
Everything works, but when the given flow performs a filter operation, this combiner gets stuck and stops processing further events.
I guess that is due to the Zip behaviour, which requires all subflows to emit the same number of elements. In my case one branch passes the given object through directly, so the other subflow cannot drop that element with .filter(); when it does, the flow stops because Zip is waiting for a push.
Is there a better way to achieve flow composition?
Is there anything I can do in my tupledFlow to get desired behaviour when 'flow' ignores elements with 'filter' ?
Two possible approaches - with debatable elegance - are:
1) Avoid using filtering stages by turning your filter into a Flow[Int, Option[String], NotUsed]. This way you can apply your zipping wrapper around your whole graph, as was your original plan. However, the code looks more tainted, and there is added overhead from passing around Nones.
val stringIfEvenOrNone = Flow[Int].map {
  case x if x % 2 == 0 => Some(x.toString)
  case _ => None
}
val tupled: Source[(Int, String), NotUsed] = s.via(tupledFlow(stringIfEvenOrNone)).collect {
  case (num, Some(str)) => (num, str)
}
2) separate the filtering and transforming stages, and apply the filtering ones before your zipping wrapper. Probably a more lightweight and better compromise.
val filterEven = Flow[Int].filter(_ % 2 == 0)
val toString = Flow[Int].map(_.toString)
val tupled: Source[(Int, String), NotUsed] = s.via(filterEven).via(tupledFlow(toString))
EDIT
3) Posting another solution here for clarity, as per the discussions in the comments.
This flow wrapper emits each element from a given flow, paired with the original input element that generated it. It works for any kind of inner flow (emitting 0, 1 or more elements for each input).
def tupledFlow[In,Out](flow: Flow[In, Out, _]): Flow[In, (In,Out), NotUsed] =
Flow[In].flatMapConcat(in => Source.single(in).via(flow).map( out => in -> out))
I came up with an implementation of TupledFlow that works when wrapped Flow uses filter() or mapAsync() and when wrapped Flow emits 0,1 or N elements for every input:
def tupledFlow[In,Out](flow: Flow[In, Out, _])(implicit materializer: Materializer, executionContext: ExecutionContext): Flow[In, (In,Out), NotUsed] = {
  val v: Flow[In, Seq[(In, Out)], NotUsed] = Flow[In].mapAsync(4) { in: In =>
    val outFuture: Future[Seq[Out]] = Source.single(in).via(flow).runWith(Sink.seq)
    val bothFuture: Future[Seq[(In,Out)]] = outFuture.map( seqOfOut => seqOfOut.map((in,_)) )
    bothFuture
  }
  val onlyDefined: Flow[In, (In, Out), NotUsed] = v.mapConcat[(In, Out)](seq => seq.to[scala.collection.immutable.Iterable])
  onlyDefined
}
The only drawback I see here is that I am instantiating and materializing a flow for a single element, just to get a notion of 'calling a flow as a function'.
I didn't do any performance tests on that. However, since the heavy lifting is done in the wrapped Flow, which is executed in a Future, I believe this will be OK.
This implementation passes all the tests from https://gist.github.com/kretes/8d5f2925de55b2a274148b69f79e55ac#file-tupledflowspec-scala

Is there a nicer way to connect Scan and Broadcast in Akka Stream?

Let's assume I want to create a Flow, which takes Ints and outputs tuples (doubled int, sum). So I fan-out ints, map on one edge and scan on the other. Then I zip them and this is the result:
object Main extends App {
  implicit val system = ActorSystem()
  implicit val materializer = ActorMaterializer()
  val flow = Flow.fromGraph(GraphDSL.create() { implicit b =>
    import GraphDSL.Implicits._
    val broadcast = b.add(Broadcast[Int](2))
    val zip = b.add(Zip[Int, Int])
    val flowMap = b.add(Flow[Int].map(_ * 2))
    val flowScan = b.add(Flow[Int].scan(0)(_ + _))
    broadcast.out(0) ~> flowMap ~> zip.in0
    broadcast.out(1) ~> flowScan ~> zip.in1
    FlowShape(broadcast.in, zip.out)
  })
  Source(1 to 5).via(flow).to(Sink.foreach(println)).run()
}
Unfortunately, this doesn't output anything. I researched it a bit and found out that:
Broadcast emits when all of the outputs stop backpressuring and there is an input element available,
Scan backpressures when downstream backpressures.
This makes the whole flow deadlock and nothing happens. Does somebody know how to achieve the result:
(2,0)
(4,1)
(6,3)
(8,6)
(10,10)
in a nice way? The only solution I have found so far is to use .buffer:
val flowScan = b.add(Flow[Int].buffer(1, OverflowStrategy.backpressure).scan(0)(_ + _))
But I don't really like this solution because it is describing not only the logic, but also some technicalities...
The reason for the deadlock is that scan, upon its first demand, emits the start value (0 in this case) without passing demand upstream. This means that demand only reaches broadcast.out(0), and, as you said, broadcast only emits when there has been demand from all the downstreams.
The buffer might seem like a technicality, but it actually expresses what you want to achieve: you want to zip the two branches, but the scan branch will always be one element ahead of the other. This is very central to how akka-streams works.
So your result is not something that broadcast + zip produces without some additional graph nodes. I think the cleanest way to express what you want is to place the buffer separately before the scan, which makes it clearer that one branch will be ahead of the other:
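// buffer as a separate stage, using the same settings as in the question
val buffer = b.add(Flow[Int].buffer(1, OverflowStrategy.backpressure))
broadcast.out(0) ~> flowMap ~> zip.in0
broadcast.out(1) ~> buffer ~> flowScan ~> zip.in1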
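The "one element ahead" point can be seen with plain Scala collections, used here only as an analogy to the stream operators: scanLeft yields its start value before consuming any input, so it always has one more element than the mapped branch. The object name is illustrative.

```scala
object ScanAhead {
  val inputs: List[Int] = (1 to 5).toList

  // One output per input, like the map branch.
  val doubled: List[Int] = inputs.map(_ * 2)

  // The start value comes first, so this has one more element than `inputs`.
  val running: List[Int] = inputs.scanLeft(0)(_ + _)

  // zip stops at the shorter side, pairing each doubled value with the sum *before* it.
  val zipped: List[(Int, Int)] = doubled.zip(running)
}
```

ScanAhead.zipped contains the pairs the question asks for, from (2,0) to (10,10); the final running sum, 15, has no partner and is dropped.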