Retain materialization type when combining Sources - scala

I want to combine 2 akka streams sources and retain ActorRef of the first source for the actual usage after materialization
val buffer = 100
val apiSource: Source[Data, ActorRef] = Source.actorRef[Data](buffer, OverflowStrategy.backpressure)
.delay(2.second, DelayOverflowStrategy.backpressure)
val kafkaSource: Source[Data, Consumer.Control] = createConsumer(config.kafkaConsumerConfig, "test")
val combinedSource: Source[Data, NotUsed] = Source.combine(kafkaSource, apiSource)(Merge(_))
The problem is that combined method ignores materialization types and I wonder if there is another way of achieving this

You can use Source#mergeMat:
val combinedSource: Source[Data, ActorRef] = kafkaSource.mergeMat(apiSource)(Keep.right)

This seems to be working
def combineAndRetainFirst[T,M1, M2](first: Source[T, M1], second: Source[T, M2]): Source[T, M1] ={
Source.fromGraph(
GraphDSL.create(first, second)((m1, _) => m1){ implicit builder => (g1, g2) =>
import GraphDSL.Implicits._
val merge = builder.add(Merge[T](2))
g1 ~> merge.in(0)
g2 ~> merge.in(1)
SourceShape(merge.out)
}
)
}

Related

Consume a source with two sinks and get the result of one sink

I'd like to consume a Source with two different sinks.
Simplified example:
val source = Source(1 to 20)
val addSink = Sink.fold[Int, Int](0)(_ + _)
val subtractSink = Sink.fold[Int, Int](0)(_ - _)
val graph = GraphDSL.create() { implicit builder =>
import GraphDSL.Implicits._
val bcast = builder.add(Broadcast[Int](2))
source ~> bcast.in
bcast.out(0) ~> addSink
bcast.out(1) ~> subtrackSink
ClosedShape
}
RunnableGraph.fromGraph(graph).run()
val result: Future[Int] = ???
I need to be able to retrieve the result of addSink. RunnableGraph.fromGraph(graph).run() gives me NotUsed,
but I'd like to get an Int (the result of the first fold Sink). Is it possible?
Pass in both sinks to the graph builder's create method, which gives you access to their respective materialized values:
val graph = GraphDSL.create(addSink, subtractSink)((_, _)) { implicit builder =>
(aSink, sSink) =>
import GraphDSL.Implicits._
val bcast = builder.add(Broadcast[Int](2))
source ~> bcast.in
bcast.out(0) ~> aSink
bcast.out(1) ~> sSink
ClosedShape
}
val (addResult, subtractResult): (Future[Int], Future[Int]) =
RunnableGraph.fromGraph(graph).run()
Alternatively, you can forgo the graph DSL and use alsoToMat:
val result: Future[Int] =
Source(1 to 20)
.alsoToMat(addSink)(Keep.right)
.toMat(subtractSink)(Keep.left)
.run()
The above gives you the materialized value of addSink. If you want to get the materialized value of both addSink and subtractSink, use Keep.both:
val (addResult, subtractResult): (Future[Int], Future[Int]) =
Source(1 to 20)
.alsoToMat(addSink)(Keep.right)
.toMat(subtractSink)(Keep.both) // <--
.run()

Custom Akka Streams

I have a set of stream stages (Sources, Flows and Sinks) which I would like to add some MetaData information to.
Therefore rather than Sources producing A -> (A, StreamMetaData). I've managed to do this using custom stream stages whereby on grab(in) the element, I push(out, (elem, StreamMetaData)). In reality it is not 'converting' the existing Source but passing it to a flow to recreate a new source.
Now I'm trying to implement the below MetaStream stage:
Therefore given that the source is producing tuples of (A, StreamMetaData), I want to pass the A to an existing Flow for some computation to take place and then merge the output produced 'B' with the StreamMetaData. These will then be passed to a Sink which accepts (B, StreamMetaData).
How would you suggest I go about it. I've been informed partial graphs are the best bet and would help in completing such a task. UniformFanOut and UniformFanIn using Unzip((A streamMetaData), A, StreamMetaData) and Zip(A,B)
val fanOut = GraphDSL.create() { implicit b =>
val unzip = b.add(Unzip[T, StreamMetaData])
UniformFanOutShape(unzip.in, unzip.out0, unzip.out1)
}
val fanIn = GraphDSL.create() { implicit b =>
val zip = b.add(Zip[T ,StreamMetaData]())
UniformFanInShape(zip)
}
How can I connect the fanIn and fanOut so as to achieve the same behavior as in the picture?
I had something like this in mind;
def metaFlow[T, B, Mat](flow: Flow[T, B, Mat]): Unit = {
val wrappedFlow =
Flow.fromGraph(GraphDSL.create(){ implicit b =>
import GraphDSL.Implicits._
val unzip: FanOutShape2[(T, StreamMetaData), T, StreamMetaData] = b.add(Unzip[T, StreamMetaData])
val existingFlow = b.add(flow)
val zip: FanInShape2[B,StreamMetaData,(B,StreamMetaData)] = b.add(Zip[B, StreamMetaData])
unzip.out0 ~> existingFlow ~> zip.in0
unzip.out1 ~> zip.in1
FlowShape(unzip.in, zip.out)
})
}
Thanks in advance.
This aprox creating a new SourceShape stacking flow graph can work, a little bit different of your flowShape implementation.
def sourceGraph[A, B](f: A => B, source: Source[(A, StreamMetaData), NotUsed]) = Source.fromGraph(GraphDSL.create() { implicit builder: GraphDSL.Builder[NotUsed] =>
import GraphDSL.Implicits._
val unzip = builder.add(Unzip[A, StreamMetaData]())
val zip = builder.add(Zip[B, StreamMetaData]())
val flow0 = builder.add(Flow[A].map { f(_) })
val flow1 = source ~> unzip.in
unzip.out0 ~> flow0 ~> zip.in0
unzip.out1 ~> zip.in1
SourceShape(zip.out)
})
def flowGraph[A, B](f: A => B) = Flow.fromGraph(GraphDSL.create() { implicit builder =>
import GraphDSL.Implicits._
val unzip = builder.add(Unzip[A, StreamMetaData]())
val zip = builder.add(Zip[B, StreamMetaData]())
val flow0 = builder.add(Flow[A].map { f(_) })
unzip.out0 ~> flow0 ~> zip.in0
unzip.out1 ~> zip.in1
FlowShape(unzip.in, zip.out)
})

Akka Stream: Combine `Flow[T, T2, M] + Sink[T2, M2]` to get `Sink[T, M2]`

Given simple source/sink/flows:
val source: Source[Int, NotUsed] = Source(1 to 5)
val sink: Sink[Any, Future[Done]] = Sink.foreach(println)
val intDoublerFlow: Flow[Int, Int, NotUsed] = Flow.fromFunction[Int, Int](i => i * 2)
I can combine a Source[T, M] with Flow[T, T2, M2] and get a Source[T2, M2] with the via method:
val sourceWithFlow: Source[Int, NotUsed] = source.via(intDoublerFlow)
How can I do the analogous operation and combine a Flow[T, T2, M] with Sink[T2, M2] to get Sink[T, M2]
val sinkWithFlow: Sink[Int, Future[Done]] = ???
You're looking to compose your flow and your sink by keeping the materialized value of the latter. As by default the fluent DSL would keep the materialized value of the former, a simple intDoubleFlow.to(sink) would not work. You need to be explicit by using toMat and enforce a Keep.Right.
The resulting code is:
val sinkWithFlow: Sink[Int, Future[Done]] = intDoublerFlow.toMat(sink)(Keep.right)
More info on composing graph stages with regards to materialized values can be found here.

Split Akka Stream Source into two

I have an Akka Streams Source which I want to split into two sources according to a predicate.
E.g. having a source (types are simplified intentionally):
val source: Source[Either[Throwable, String], NotUsed] = ???
And two methods:
def handleSuccess(source: Source[String, NotUsed]): Future[Unit] = ???
def handleFailure(source: Source[Throwable, NotUsed]): Future[Unit] = ???
I would like to be able to split the source according to _.isRight predicate and pass the right part to handleSuccess method and left part to handleFailure method.
I tried using Broadcast splitter but it requires Sinks at the end.
Although you can choose which side of the Source you want to retrieve items from it's not possible to create a Source that that yields two outputs which is what it seems like you would ultimately want.
Given the GraphStage below which essentially splits the left and right values into two outputs...
/**
* Fans out left and right values of an either
* #tparam L left value type
* #tparam R right value type
*/
class EitherFanOut[L, R] extends GraphStage[FanOutShape2[Either[L, R], L, R]] {
import akka.stream.{Attributes, Outlet}
import akka.stream.stage.GraphStageLogic
override val shape: FanOutShape2[Either[L, R], L, R] = new FanOutShape2[Either[L, R], L, R]("EitherFanOut")
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
var out0demand = false
var out1demand = false
setHandler(shape.in, new InHandler {
override def onPush(): Unit = {
if (out0demand && out1demand) {
grab(shape.in) match {
case Left(l) =>
out0demand = false
push(shape.out0, l)
case Right(r) =>
out1demand = false
push(shape.out1, r)
}
}
}
})
setHandler(shape.out0, new OutHandler {
#scala.throws[Exception](classOf[Exception])
override def onPull(): Unit = {
if (!out0demand) {
out0demand = true
}
if (out0demand && out1demand) {
pull(shape.in)
}
}
})
setHandler(shape.out1, new OutHandler {
#scala.throws[Exception](classOf[Exception])
override def onPull(): Unit = {
if (!out1demand) {
out1demand = true
}
if (out0demand && out1demand) {
pull(shape.in)
}
}
})
}
}
.. you can route them to only receive one side:
val sourceRight: Source[String, NotUsed] = Source.fromGraph(GraphDSL.create(source) { implicit b => s =>
import GraphDSL.Implicits._
val eitherFanOut = b.add(new EitherFanOut[Throwable, String])
s ~> eitherFanOut.in
eitherFanOut.out0 ~> Sink.ignore
SourceShape(eitherFanOut.out1)
})
Await.result(sourceRight.runWith(Sink.foreach(println)), Duration.Inf)
... or probably more desirable, route them to two seperate Sinks:
val leftSink = Sink.foreach[Throwable](s => println(s"FAILURE: $s"))
val rightSink = Sink.foreach[String](s => println(s"SUCCESS: $s"))
val flow = RunnableGraph.fromGraph(GraphDSL.create(source, leftSink, rightSink)((_, _, _)) { implicit b => (s, l, r) =>
import GraphDSL.Implicits._
val eitherFanOut = b.add(new EitherFanOut[Throwable, String])
s ~> eitherFanOut.in
eitherFanOut.out0 ~> l.in
eitherFanOut.out1 ~> r.in
ClosedShape
})
val r = flow.run()
Await.result(Future.sequence(List(r._2, r._3)), Duration.Inf)
(Imports and initial setup)
import akka.NotUsed
import akka.stream.scaladsl.{GraphDSL, RunnableGraph, Sink, Source}
import akka.stream.stage.{GraphStage, InHandler, OutHandler}
import akka.stream._
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Await
import scala.concurrent.duration.Duration
val classLoader = getClass.getClassLoader
implicit val system = ActorSystem("QuickStart", ConfigFactory.load(classLoader), classLoader)
implicit val materializer = ActorMaterializer()
val values: List[Either[Throwable, String]] = List(
Right("B"),
Left(new Throwable),
Left(new RuntimeException),
Right("B"),
Right("C"),
Right("G"),
Right("I"),
Right("F"),
Right("T"),
Right("A")
)
val source: Source[Either[Throwable, String], NotUsed] = Source.fromIterator(() => values.toIterator)
Edit: this other answer with divertTo is a better solution than mine, IMO. I'll leave my answer as-is for posterity.
original answer:
This is implemented in akka-stream-contrib as PartitionWith. Add this dependency to SBT to pull it in to your project:
libraryDependencies += "com.typesafe.akka" %% "akka-stream-contrib" % "0.9"```
`PartitionWith` is shaped like a `Broadcast(2)`, but with potentially different types for each of the two outlets. You provide it with a predicate to apply to each element, and depending on the outcome, they get routed to the applicable outlet. You can then attach a `Sink` or `Flow` to each of these outlets independently as appropriate. Building on [cessationoftime's example](https://stackoverflow.com/a/39744355/147806), with the `Broadcast` replaced with a `PartitionWith`:
val eitherSource: Source[Either[Throwable, String], NotUsed] = Source.empty
val leftSink = Sink.foreach[Throwable](s => println(s"FAILURE: $s"))
val rightSink = Sink.foreach[String](s => println(s"SUCCESS: $s"))
val flow = RunnableGraph.fromGraph(GraphDSL.create(eitherSource, leftSink, rightSink)
((_, _, _)) { implicit b => (s, l, r) =>
import GraphDSL.Implicits._
val pw = b.add(
PartitionWith.apply[Either[Throwable, String], Throwable, String](identity)
)
eitherSource ~> pw.in
pw.out0 ~> leftSink
pw.out1 ~> rightSink
ClosedShape
})
val r = flow.run()
Await.result(Future.sequence(List(r._2, r._3)), Duration.Inf)
For this you can use a broadcast, then filter and map the streams within the GraphDSL:
val leftSink = Sink.foreach[Throwable](s => println(s"FAILURE: $s"))
val rightSink = Sink.foreach[String](s => println(s"SUCCESS: $s"))
val flow = RunnableGraph.fromGraph(GraphDSL.create(eitherSource, leftSink, rightSink)((_, _, _)) { implicit b => (s, l, r) =>
import GraphDSL.Implicits._
val broadcast = b.add(Broadcast[Either[Throwable,String]](2))
s ~> broadcast.in
broadcast.out(0).filter(_.isLeft).map(_.left.get) ~> l.in
broadcast.out(1).filter(_.isRight).map(_.right.get) ~> r.in
ClosedShape
})
val r = flow.run()
Await.result(Future.sequence(List(r._2, r._3)), Duration.Inf)
I expect you will be able to run the functions you want from within the map.
In the meantime this has been introduced to standard Akka-Streams:
https://doc.akka.io/api/akka/current/akka/stream/scaladsl/Partition.html.
You can split the input stream with a predicate and then use collect on each outputs to get only the types you are interested in.
You can use divertTo to attach alternative Sink to the flow to handle Lefts: https://doc.akka.io/docs/akka/current/stream/operators/Source-or-Flow/divertTo.html
source
.divertTo(handleFailureSink, _.isLeft)
.map(rightEither => handleSuccess(rightEither.right.get()))

Is there something like a SinkSource[T]?

I am looking for a SinkSource that provides a Sink and a Source. If an element flows into that Sink it should be provided at the corresponding Source. The following code shows what I mean:
object SinkSource {
def apply[T] = new {
def sink: Sink[T] = ???
def source: Source[T] = ???
}
}
val flowgraph = FlowGraph { implicit fgb =>
import FlowGraphImplicits._
val sinksource = SinkSource[Int]
Source(1 to 5) ~> sinksource.sink
sinksource.source ~> Sink.foreach(print)
}
implicit val actorSystem = ActorSystem(name = "System")
implicit val flowMaterializer = FlowMaterializer()
val materializedMap = flowgraph.run()
If executed this should print: 12345
So, does a SinkSource exist (haven't seen it in the API) or does anyone know how to implement it?
I should mention that I need distinct access to Sink and Source so that Flow isn't a solution in this particular form:
Source(1 to 5) ~> Flow[Int] ~> Sink.foreach(println)
As so often, ideas come to mind if question was already asked: It turned out, I don't need a Sink and a Source, JunctionInPort and JunctionOutPort are sufficient.
So here it goes:
object SinkSource {
def apply[T](implicit fgb: FlowGraphBuilder) = new SinkSource[T]
}
class SinkSource[T](implicit fgb: FlowGraphBuilder) {
import FlowGraphImplicits._
private val merge = Merge[T]
private val bcast = Broadcast[T]
Source.empty ~> merge
merge ~> bcast
bcast ~> Sink.ignore
def in: JunctionInPort[T] = merge
def out: JunctionOutPort[T] = bcast
}
val flowgraph = FlowGraph { implicit fgb =>
import FlowGraphImplicits._
val source = Source(1 to 5)
val sink = Sink.foreach(println)
val sinkSource = SinkSource[Int]
source ~> sinkSource.in
sinkSource.out ~> sink
}
implicit val actorSystem = ActorSystem(name = "System")
implicit val flowMaterializer = FlowMaterializer()
val materializedMap = flowgraph.run()