Testing Akka Reactive Streams - scala

I'm testing code which streams messages over an outgoing stream TCP connection obtained via:
(IO(StreamTcp) ? StreamTcp.Connect(settings, address))
.mapTo[StreamTcp.OutgoingTcpConnection]
.map(_.outputStream)
In my tests, I substitute the resulting Subscriber[ByteString] with a dummy subscriber, trigger some outgoing messages, and assert that they have arrived as expected. I use the method below to produce the dummy subscriber and a future holding the streamed result. (So far, so good.)
def testSubscriber[T](settings: FlowMaterializer)(implicit ec: ExecutionContext): (Subscriber[T], Future[Seq[T]]) = {
var sent = Seq.empty[T]
val (subscriber, streamComplete) =
Duct[T].foreach( bs => sent = sent :+ bs)(settings)
(subscriber, streamComplete.map( _ => sent ))
}
My question is this: is there some canonical method for testing that streams output the expected values, something similar to Akka's TestActorRef? And if not, is there some library function similar to the above function?

Testing streams is possible with the akka-stream-testkit module.
Read about it here: http://doc.akka.io/docs/akka/current/scala/stream/stream-testkit.html
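As a minimal sketch of what a testkit-based check can look like: the flow under test (appendNewline) and the object name are hypothetical, and the dependency version in the comment is an assumption, so adjust it to your build. TestSink.probe gives you a probe on which you signal demand explicitly and then assert on the elements that arrive.

```scala
//> using scala "2.13"
//> using dep "com.typesafe.akka::akka-stream-testkit:2.6.20"  // assumed version

import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Source}
import akka.stream.testkit.scaladsl.TestSink
import akka.util.ByteString

object FramingFlowCheck {
  // Hypothetical stage under test: frames each message with a trailing newline.
  val appendNewline: Flow[String, ByteString, NotUsed] =
    Flow[String].map(s => ByteString(s + "\n"))

  def run(): Unit = {
    implicit val system: ActorSystem = ActorSystem("stream-test")
    try {
      Source(List("a", "b"))
        .via(appendNewline)
        .runWith(TestSink.probe[ByteString]) // materializes a TestSubscriber.Probe
        .request(2)                          // signal demand for two elements
        .expectNext(ByteString("a\n"), ByteString("b\n"))
        .expectComplete()
    } finally system.terminate()
  }
}
```

The probe throws an AssertionError if an expectation is not met within its timeout, so calling FramingFlowCheck.run() from any test framework is enough.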

Related

How to backpressure an ActorPublisher

I'm writing a few samples to understand Akka Streams and backpressure. I'm trying to see how a slow consumer backpressures an ActorPublisher.
My code is as follows.
class DataPublisher extends ActorPublisher[Int] {
import akka.stream.actor.ActorPublisherMessage._
var items: List[Int] = List.empty
def receive = {
case s: String =>
println(s"Producer buffer size ${items.size}")
if (totalDemand == 0)
items = items :+ s.toInt
else
onNext(s.toInt)
case Request(demand) =>
if (demand > items.size) {
items foreach (onNext)
items = List.empty
}
else {
val (send, keep) = items.splitAt(demand.toInt)
items = keep
send foreach (onNext)
}
case other =>
println(s"got other $other")
}
}
and
Source.fromPublisher(ActorPublisher[Int](dataPublisherRef)).runWith(sink)
Here the sink is a Subscriber with a sleep to emulate a slow consumer, and the publisher keeps producing data regardless.
--EDIT--
My question: when the demand is 0, the publisher buffers the data programmatically. How can I make use of backpressure to slow down the publisher instead?
Something like
throttledSource().buffer(10, OverflowStrategy.backpressure).runWith(throttledSink())
This does not affect the publisher, and its buffer keeps growing.
Thanks,
Sajith
Don't use ActorPublisher
Firstly, don't use ActorPublisher: it is a very low-level and deprecated API. We decided to deprecate it because users should not be working at such a low level of abstraction in Akka Streams.
One of the tricky things is exactly what you're asking about -- handling backpressure is entirely in the hands of the developer writing the ActorPublisher if they use this API. So you have to receive the Request(n) messages and make sure that you never signal more elements than you got requests for. This behaviour is specified in the Reactive Streams Specification which you then have to implement correctly. Basically, you're exposed to all the complexities of Reactive Streams (which is a full specification, with many edge cases -- disclaimer: I was/am part of developing Reactive Streams as well as Akka Streams).
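To make the "never signal more elements than you got requests for" rule concrete, here is a small pure-Scala model of the bookkeeping involved. It is only a sketch: DemandBuffer and its method names are hypothetical and are not part of Akka or Reactive Streams. Demand accumulates across requests, every element goes through the buffer so ordering is preserved, and each step emits at most min(demand, buffered) elements.

```scala
// A pure-Scala model of demand accounting (no Akka dependency).
// DemandBuffer and its method names are hypothetical, chosen for this sketch.
final case class DemandBuffer[A](demand: Long = 0L, buffer: Vector[A] = Vector.empty[A]) {

  /** The producer has a new element: queue it, then emit whatever demand allows. */
  def offer(elem: A): (DemandBuffer[A], Vector[A]) =
    copy(buffer = buffer :+ elem).drain

  /** The subscriber requested n more elements: demand adds up across requests. */
  def request(n: Long): (DemandBuffer[A], Vector[A]) =
    copy(demand = demand + n).drain

  // Emit min(demand, buffered) elements, oldest first, and reduce demand by that amount.
  private def drain: (DemandBuffer[A], Vector[A]) = {
    val n = math.min(demand, buffer.size.toLong).toInt
    val (send, keep) = buffer.splitAt(n)
    (DemandBuffer(demand - n, keep), send)
  }
}
```

Each call returns the next state together with the elements that may be signalled downstream right now. The model says nothing about what to do when the buffer grows without bound, which is exactly the part that a real publisher has to solve by slowing down its own data source.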
Showing how back-pressure manifests in GraphStage
Secondly, to build custom stages you should be using the API designed for it: GraphStage. Please note that such a stage is also pretty low level. Normally users of Akka Streams don't need to write custom stages; however, it is absolutely expected and fine to write your own stages if they implement some logic that the built-in stages don't provide.
Here's a simplified Filter implementation from the Akka codebase:
case class Filter[T](p: T ⇒ Boolean) extends SimpleLinearGraphStage[T] {
override def initialAttributes: Attributes = DefaultAttributes.filter
override def toString: String = "Filter"
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
new GraphStageLogic(shape) with OutHandler with InHandler {
override def onPush(): Unit = {
val elem = grab(in)
if (p(elem)) push(out, elem)
else pull(in)
}
// this method will NOT be called, if the downstream has not signalled enough demand!
// this method being NOT called is how back-pressure manifests in stages
override def onPull(): Unit = pull(in)
setHandlers(in, out, this)
}
}
As you can see, instead of implementing the entire Reactive Streams logic and rules yourself (which is hard), you get simple callbacks like onPush and onPull. Akka Streams handles the demand management, and it will automatically call onPull if the downstream has signaled demand, and it will NOT call it, if there is no demand -- which would mean the downstream is applying backpressure to this stage.
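If you want to try this yourself, the following is a sketch of the same filter written against the public GraphStage API only (as far as I know, SimpleLinearGraphStage and DefaultAttributes in the snippet above are Akka internals). The names PredicateFilter and PredicateFilterDemo are made up for this example, and the dependency version in the comment is an assumption.

```scala
//> using scala "2.13"
//> using dep "com.typesafe.akka::akka-stream:2.6.20"  // assumed version

import akka.actor.ActorSystem
import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical stage: passes on only the elements that satisfy p.
final class PredicateFilter[T](p: T => Boolean) extends GraphStage[FlowShape[T, T]] {
  val in: Inlet[T] = Inlet("PredicateFilter.in")
  val out: Outlet[T] = Outlet("PredicateFilter.out")
  override val shape: FlowShape[T, T] = FlowShape(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) with InHandler with OutHandler {
      // Called when upstream delivers an element we previously pulled.
      override def onPush(): Unit = {
        val elem = grab(in)
        if (p(elem)) push(out, elem) // forward a matching element
        else pull(in)                // skip it and ask upstream for another
      }
      // Called only when downstream has demand; no call means backpressure.
      override def onPull(): Unit = pull(in)
      setHandlers(in, out, this)
    }
}

object PredicateFilterDemo {
  def evens(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("filter-demo")
    try Await.result(
      Source(1 to 10).via(new PredicateFilter[Int](_ % 2 == 0)).runWith(Sink.seq),
      5.seconds)
    finally system.terminate()
  }
}
```

Swapping Sink.seq for a deliberately slow sink is an easy way to observe that onPull is invoked less often when the consumer falls behind.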
This can be accomplished with an intermediate Flow.buffer:
val flowBuffer = Flow[Int].buffer(10, OverflowStrategy.dropHead)
Source
.fromPublisher(ActorPublisher[Int](dataPublisherRef))
.via(flowBuffer)
.runWith(sink)

How to create an Akka flow with backpressure and Control

I need to create a function with the following Interface:
import akka.kafka.scaladsl.Consumer.Control
object ItemConversionFlow {
  def build(config: StreamConfig): Flow[Item, OtherItem, Control] = {
    // Implementation goes here
  }
}
My problem is that I don't know how to define the flow in a way that it fits the interface above.
When I am doing something like this
val flow = Flow[Item]
  .map(item => doConversion(item))
  .filter(_.isDefined)
  .map(_.get)
the resulting type is Flow[Item, OtherItem, NotUsed]. I haven't found something in the Akka documentation so far. Also the functions on akka.stream.scaladsl.Flow only offer a "NotUsed" instead of Control. Would be great if someone could point me into the right direction.
Some background: I need to set up several pipelines which differ only in the conversion part. These pipelines are substreams of a main stream which might be stopped for some reason (a corresponding message arrives in some Kafka topic). Therefore I need the Control part. The idea would be to create a Graph template where I just insert the mentioned flow as an argument (a factory returning it). For a specific case we have a solution which works. To generalize it I need this kind of flow.
You actually already have backpressure. However, think about what you really need from it: for example, you are not using asynchronous stages to increase your throughput. Backpressure keeps fast producers from overwhelming slower subscribers (see https://doc.akka.io/docs/akka/2.5/stream/stream-rate.html). In your sample you don't need to worry about it: your stream will ask the publisher for new elements depending on how long doConversion takes to complete.
If you want to obtain the result of the stream, use toMat or viaMat. For example, if your stream emits Item and transforms these into OtherItem:
val str = Source.fromIterator(() => List(Item(Some(1))).toIterator)
.map(item => doConversion(item))
.filter(_.isDefined)
.map(_.get)
.toMat(Sink.fold(List[OtherItem]())((a, b) => {
// Examine the result of your stream
b :: a
}))(Keep.right)
.run()
str will be Future[List[OtherItem]]. Try to extrapolate this to your case.
Or use viaMat with KillSwitches.single, which the API documentation describes as follows: "Creates a new [[Graph]] of [[FlowShape]] that materializes to an external switch that allows external completion of that unique materialization. Different materializations result in different, independent switches."
def build(config: StreamConfig): Flow[Item, OtherItem, UniqueKillSwitch] = {
Flow[Item]
.map(item => doConversion(item))
.filter(_.isDefined)
.map(_.get)
.viaMat(KillSwitches.single)(Keep.right)
}
val stream =
Source.fromIterator(() => List(Item(Some(1))).toIterator)
.viaMat(build(StreamConfig(1)))(Keep.right)
.toMat(Sink.ignore)(Keep.both).run
// This stops the stream
stream._1.shutdown()
// When it finishes
stream._2 onComplete(_ => println("Done"))

Akka Stream - Splitting flow into multiple Sources

I have a TCP connection in Akka Stream that ends in a Sink. Right now all messages go into one Sink. I want to split the stream into an unknown number of Sinks given some function.
The use case is as follows: from the TCP connection I get a continuous stream of something like List[DeltaValue], and I want to create an actorSink for each DeltaValue.id so that I can continuously accumulate and implement behaviour for each DeltaValue.id. I find this to be a standard use case in stream processing, but I'm not able to find a good example with Akka Stream.
This is what I have right now:
def connect(): ActorRef = tcpConnection
.//SOMEHOW SPLIT HERE and create a ReceiverActor for each message
.to(Sink.actorRef(system.actorOf(ReceiverActor.props(), ReceiverActor.name), akka.Done))
.run()
Update:
I now have this, not sure what to say about it, it does not feel super stable but it should work:
private def spawnActorOrSendMessage(m: ResponseMessage): Unit = {
implicit val timeout = Timeout(FiniteDuration(1, TimeUnit.SECONDS))
system.actorSelection("user/" + m.id.toString).resolveOne().onComplete {
case Success(actorRef) => actorRef ! m
case Failure(ex) => (system.actorOf(ReceiverActor.props(), m.id.toString)) ! m
}
}
def connect(): ActorRef = tcpConnection
.to(Sink.foreachParallel(10)(spawnActorOrSendMessage))
.run()
The below should be a somewhat improved version of what was updated in the question. The main improvement is that your actors are kept in a data structure to avoid actorSelection resolution for every incoming message.
case class DeltaValue(id: String, value: Double)
val src: Source[DeltaValue, NotUsed] = ???
src.runFold(Map[String, ActorRef]()) {
  case (actors, elem) if actors.contains(elem.id) ⇒
    actors(elem.id) ! elem.value
    actors
  case (actors, elem) ⇒
    // Actor names must be unique under the same parent, so name each actor after its id
    // (this assumes the id only contains characters that are valid in an actor name)
    val newActor = system.actorOf(ReceiverActor.props(), elem.id)
    newActor ! elem.value
    actors.updated(elem.id, newActor)
}
Keep in mind that, when you integrate Akka Streams with bare actors, you lose backpressure support. This is one of the reasons why you should try and implement your logic within the boundaries of Akka Streams whenever possible. And this is not always possible - e.g. when remoting is needed etc.
In your case, you could consider leveraging groupBy and the concept of substream. The example below is folding the elements of each substream by summing them, just to give an idea:
src.groupBy(maxSubstreams = Int.MaxValue, f = _.id)
  .fold("" → 0d) {
    // take the key from the element; the "" in the seed is only a placeholder
    case ((_, acc), delta) ⇒ delta.id → (delta.value + acc)
  }
  .mergeSubstreams
  .runForeach(println)
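For reference, here is a self-contained sketch of the groupBy idea that you can run against a finite source. GroupByDemo, sumsPerId, the substream limit of 16 and the dependency version in the comment are all assumptions made for the example. Note that fold only emits when its substream completes, so for a never-ending TCP stream you would probably want scan or a similar operator that emits running results.

```scala
//> using scala "2.13"
//> using dep "com.typesafe.akka::akka-stream:2.6.20"  // assumed version

import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object GroupByDemo {
  final case class DeltaValue(id: String, value: Double)

  // Sums the values per id, using one substream per distinct id.
  def sumsPerId(deltas: List[DeltaValue]): Map[String, Double] = {
    implicit val system: ActorSystem = ActorSystem("groupby-demo")
    try {
      val result = Source(deltas)
        .groupBy(maxSubstreams = 16, f = _.id) // fails if more than 16 distinct ids show up
        .fold("" -> 0d) { case ((_, acc), delta) => delta.id -> (acc + delta.value) }
        .mergeSubstreams                       // back to a single stream of (id, sum)
        .runWith(Sink.seq)
      // The merge order is not deterministic, so collect into a Map.
      Await.result(result, 5.seconds).toMap
    } finally system.terminate()
  }
}
```

Because everything stays inside the stream, backpressure is preserved end to end, which is the main advantage over handing elements to bare actors.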
EventStream
You can send messages to the ActorSystem's EventStream within a stream sink and separately have the Actors subscribe to the stream.
Split At Stream Level
You can split the stream at the stream level using Broadcast. The documentation has a good example of this.
Split At Actor Level
You could also use Sink.actorRef in combination with a BroadcastPool to broadcast the messages to multiple Actors.

multipart form data in Lagom

I want to have a service which receives an item object. The object contains a name, description, price and picture.
The other attributes are strings which can easily be sent as a JSON object, but what is the best solution for including the picture?
If multipart form data is the best solution, how is it handled in Lagom?
You may want to check the file upload example in the lagom-recipes repository on GitHub.
Basically, the idea is to create an additional Play router. After that, we have to tell Lagom to use it, as noted in the reference documentation (this feature is available since 1.5.0). Here is what the router might look like:
class FileUploadRouter(action: DefaultActionBuilder,
parser: PlayBodyParsers,
implicit val exCtx: ExecutionContext) {
private def fileHandler: FilePartHandler[File] = {
case FileInfo(partName, filename, contentType, _) =>
val tempFile = {
val f = new java.io.File("./target/file-upload-data/uploads", UUID.randomUUID().toString).getAbsoluteFile
f.getParentFile.mkdirs()
f
}
val sink: Sink[ByteString, Future[IOResult]] = FileIO.toPath(tempFile.toPath)
val acc: Accumulator[ByteString, IOResult] = Accumulator(sink)
acc.map {
case akka.stream.IOResult(_, _) =>
FilePart(partName, filename, contentType, tempFile)
}
}
val router = Router.from {
case POST(p"/api/files") =>
action(parser.multipartFormData(fileHandler)) { request =>
val files = request.body.files.map(_.ref.getAbsolutePath)
Results.Ok(files.mkString("Uploaded[", ", ", "]"))
}
}
}
And then, we simply tell Lagom to use it
override lazy val lagomServer =
serverFor[FileUploadService](wire[FileUploadServiceImpl])
.additionalRouter(wire[FileUploadRouter].router)
Alternatively, we can make use of the PlayServiceCall class. Here is a simple sketch on how to do that provided by James Roper from the Lightbend team:
// The type of the service call is NotUsed because we are handling it out of band
def myServiceCall: ServiceCall[NotUsed, Result] = PlayServiceCall { wrapCall =>
// Create a Play action to handle the request
EssentialAction { requestHeader =>
// Now we create the sink for where we want to stream the request to - eg it could
// go to a file, a database, some other service. The way Play gives you a request
// body is that you need to return a sink from EssentialAction, and when it gets
// that sink, it streams the request body into that sink.
val sink: Sink[ByteString, Future[Done]] = ...
// Play wraps sinks in an abstraction called accumulator, which makes it easy to
// work with the result of handling the sink. An accumulator is like a future, but
// rather than just being a value that will be available in future, it is a
// value that will be available once you have passed a stream of data into it.
// We wrap the sink in an accumulator here.
val accumulator: Accumulator[ByteString, Done] = Accumulator.forSink(sink)
// Now we have an accumulator, but we need the accumulator to, when it's done,
// produce an HTTP response. Right now, it's just producing akka.Done (or whatever
// your sink materialized to). So we flatMap it, to handle the result.
accumulator.flatMap { done =>
// At this point we create the ServiceCall, the reason we do that here is it means
// we can access the result of the accumulator (in this example, it's just Done so
// not very interesting, but it could be something else).
val wrappedAction = wrapCall(ServiceCall { notUsed =>
// Here is where we can do any of the actual business logic, and generate the
// result that can be returned to Lagom to be serialized like normal
...
})
// Now we invoke the wrapped action, and run it with no body (since we've already
// handled the request body with our sink/accumulator).
wrappedAction(requestHeader).run()
}
}
}
Generally speaking, it probably isn't a good idea to use Lagom for that purpose. As noted on the GitHub issue on PlayServiceCall documentation:
Many use cases where we fallback to PlayServiceCall are related to presentation or HTTP-specific use (I18N, file upload, ...) which indicate: coupling of the lagom service to the presentation layer or coupling of the lagom service to the transport.
Quoting James Roper again (a few years back):
So currently, multipart/form-data is not supported in Lagom, at least not out of the box. You can drop down to a lower level Play API to handle it, but perhaps it would be better to handle it in a web gateway, where any files handled are uploaded directly to a storage service such as S3, and then a Lagom service might store the meta data associated with it.
You can also check the discussion here, which provides some more insight.

Async operations with Spray, Akka, and actorSelection

I keep running into the same design problem using Spray: how to find the original context of the Spray HTTP request after doing some asynchronous (tell) operations in Akka.
I'm using the Net-a-Porter actor-per-request model. It creates a child actor, which I specify, to handle each request; this is encapsulated by another actor which holds the correct request context.
Let's call my actor ActorA, which has this receive method:
def receive: Receive = {
case v : InputJson =>
val id = createId
val redisList = context.actorOf(Props[RedisListActor])
// At this point, sender is the 'per-request' actor created, which has the HTTP context of the Spray request.
redisList ! ListRequest(id, sender.path.toStringWithoutAddress, v)
This is adding input to a job queue on redis, which is consumed on another server. When this job is completed, the other server adds the result to a Redis PubSub queue which we are subscribed to. When an item comes into this queue, it alerts my ActorA (using context.actorOf).
case kr : KubernetesReply =>
context.system.actorSelection(kr.actorPath) ! TaskResponse("Success", kr.payload, kr.id)
You can see that I am trying to find the original sender by using its actorPath, but upon the KubernetesReply, I find that the path resolves to deadLetters (even though I have not explicitly killed the request actor). I've confirmed it's the correct path (i.e. I can send back the task response from the InputJson handler).
What's the correct pattern for this? How can I find my original actor, and why has it disappeared?
You can put an ActorRef directly in the ListRequest message.
case class ListRequest(id: YourIdType, requestActor: ActorRef, json: InputJson)
def receive: Receive = {
case v : InputJson =>
val id = createId
val redisList = context.actorOf(Props[RedisListActor])
redisList ! ListRequest(id, sender, v)
case kr : KubernetesReply =>
// This assumes the reply carries the requestActor that was passed along in ListRequest
kr.requestActor ! TaskResponse("Success", kr.payload, kr.id)
}