akka-stream pipeline backpressures despite inserted buffer - scala

I have an akka-stream pipeline that fans out events (via BroadcastHub) that are pushed into the stream via a SourceQueueWithComplete.
Despite every downstream consumer having a .buffer() inserted (which I'd expect to keep the upstream buffers of the hub and the queue drained), I still observe backpressure kicking in after the system has been running for a while.
Here's a (simplified) snippet:
import akka.NotUsed
import akka.stream.{Materializer, OverflowStrategy}
import akka.stream.scaladsl.{BroadcastHub, Keep, Sink, Source}
import scala.concurrent.ExecutionContext
import scala.util.Failure

class NotificationHub[Event](
  implicit materializer: Materializer,
  ecForLogging: ExecutionContext
) {

  // a SourceQueue to enqueue events and a BroadcastHub to allow
  // multiple subscribers
  private val (queue, broadCastSource) =
    Source.queue[Event](
      bufferSize = 64,
      // we expect the buffer to never run full and if it does, we want
      // to log that asap, so we use OverflowStrategy.backpressure
      OverflowStrategy.backpressure
    ).toMat(BroadcastHub.sink)(Keep.both).run()

  // This keeps the BroadcastHub drained while there are no subscribers
  // (see https://doc.akka.io/docs/akka/current/stream/stream-dynamic.html ):
  broadCastSource.to(Sink.ignore).run()

  def notificationSource(p: Event => Boolean): Source[Unit, NotUsed] = {
    broadCastSource
      .collect { case event if p(event) => () }
      // this buffer is intended to keep the upstream buffers of
      // queue and hub drained:
      .buffer(
        // if a downstream consumer ever becomes too slow to consume,
        // only the latest two notifications are relevant
        size = 2,
        // it doesn't really matter whether we drop head or tail,
        // as all elements are the same (); it's just important not
        // to backpressure in case of overflow:
        OverflowStrategy.dropHead
      )
  }

  def propagateEvent(event: Event): Unit = {
    queue.offer(event).onComplete {
      case Failure(e) =>
        // unexpected backpressure occurred!
        println(e.getMessage)
        e.printStackTrace()
      case _ => ()
    }
  }
}
Since the doc for buffer() says that with DropHead it never backpressures, I would have expected the upstream buffers to remain drained. Yet I still end up with calls to queue.offer() failing because of backpressure.
Reasons that I could think of:
Evaluation of the predicate p in .collect causes a lot of load and hence backpressure. This seems very unlikely because these are very simple non-blocking ops.
The overall system is totally overloaded. Also rather unlikely.
I have a feeling I am missing something. Do I maybe need to add an async boundary via .async before or after buffer() to fully decouple the "hub" from possible heavy load that may occur somewhere further downstream?

After more reading of the Akka docs and some experiments, I think I found the solution (sorry for maybe asking here too early).
To fully detach my code from any heavy load that may occur somewhere downstream, I need to ensure that any downstream code is not executed by the same actor as the .buffer() (e.g. by inserting .async).
For example, this code would eventually lead to the SourceQueue running full and then backpressuring:
val hub: NotificationHub[Int] = // ...
hub.notificationSource(_ => true)
  .map { x =>
    Thread.sleep(250)
    x
  }
Further inspection showed that this .map() was executed on the same thread (of the underlying actor) as the upstream .collect() (and .buffer()).
When inserting .async as shown below, the .buffer() would drop elements (as I had intended) and the upstream SourceQueue would remain drained:
val hub: NotificationHub[Int] = // ...
hub.notificationSource(_ => true)
  .async
  .map { x =>
    Thread.sleep(250)
    x
  }
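A variation on that fix (a sketch of my own, not from the original post): bake the async boundary into notificationSource itself, so that no subscriber can accidentally fuse its stages with the buffer and the hub:

def notificationSource(p: Event => Boolean): Source[Unit, NotUsed] = {
  broadCastSource
    .collect { case event if p(event) => () }
    .buffer(size = 2, OverflowStrategy.dropHead)
    // async boundary baked in: everything a subscriber attaches runs
    // on a separate actor and can no longer fuse with (and thereby
    // backpressure) the buffer and the hub
    .async
}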

Related

Akka streams RestartSink doesn't seem to be restarting during failures

I am playing around with handling errors in akka streams with restartable sources & sinks.
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.RestartSettings
import akka.stream.scaladsl.{RestartSink, RestartSource, Sink, Source}
import scala.concurrent.duration._

object Main extends App {
  implicit val system: ActorSystem = ActorSystem("akka-streams-system")

  val restartSettings = RestartSettings(1.seconds, 10.seconds, 0.2d)

  val restartableSource = RestartSource.onFailuresWithBackoff(restartSettings) { () =>
    Source(0 to 10)
      .map(n =>
        if (n < 5) n.toString
        else throw new RuntimeException("Boom!"))
  }

  val restartableSink: Sink[String, NotUsed] = RestartSink.withBackoff(restartSettings) { () =>
    Sink.fold("")((_, newVal) => {
      if (newVal == "3") {
        println(newVal + " Exception")
        // TRIGGERING A FAILURE, expecting the stream to restart just the sink
        throw new RuntimeException("Kabooom!!!")
      } else {
        println(newVal + " sink")
      }
      newVal
    })
  }

  restartableSource.runWith(restartableSink)
}
I am breaking the source and the sink separately, with different scenarios. I break the sink first, expecting it to restart and reprocess the newVal == 3 message over and over again.
But it seems like the error in the sink is just thrown away and only the source failure is retried, so the source ends up being restarted and reprocesses the events starting from 0.
I am mimicking a scenario where I want to read from a source (say, a file) and have an HTTP sink that retries failed HTTP requests independently, without restarting the whole stream pipeline.
The output I get with the above-shared code is as follows.
0 sink
1 sink
2 sink
3 Exception
4 sink
[WARN] [01/10/2022 09:13:14.647] [akka-streams-system-akka.actor.default-dispatcher-6] [RestartWithBackoffSource(akka://akka-streams-system)] Restarting stream due to failure [1]: java.lang.RuntimeException: Boom!
java.lang.RuntimeException: Boom!
at Main$.$anonfun$restartableSource$2(Main.scala:18)
at Main$.$anonfun$restartableSource$2$adapted(Main.scala:16)
at akka.stream.impl.fusing.Map$$anon$1.onPush(Ops.scala:52)
at akka.stream.impl.fusing.GraphInterpreter.processPush(GraphInterpreter.scala:542)
at akka.stream.impl.fusing.GraphInterpreter.processEvent(GraphInterpreter.scala:496)
at akka.stream.impl.fusing.GraphInterpreter.execute(GraphInterpreter.scala:390)
at akka.stream.impl.fusing.GraphInterpreterShell.runBatch(ActorGraphInterpreter.scala:650)
at akka.stream.impl.fusing.GraphInterpreterShell$AsyncInput.execute(ActorGraphInterpreter.scala:521)
at akka.stream.impl.fusing.GraphInterpreterShell.processEvent(ActorGraphInterpreter.scala:625)
at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$processEvent(ActorGraphInterpreter.scala:800)
at akka.stream.impl.fusing.ActorGraphInterpreter.akka$stream$impl$fusing$ActorGraphInterpreter$$shortCircuitBatch(ActorGraphInterpreter.scala:787)
at akka.stream.impl.fusing.ActorGraphInterpreter$$anonfun$receive$1.applyOrElse(ActorGraphInterpreter.scala:819)
at akka.actor.Actor.aroundReceive(Actor.scala:537)
at akka.actor.Actor.aroundReceive$(Actor.scala:535)
at akka.stream.impl.fusing.ActorGraphInterpreter.aroundReceive(ActorGraphInterpreter.scala:716)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
at akka.actor.ActorCell.invoke(ActorCell.scala:548)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
at akka.dispatch.Mailbox.run(Mailbox.scala:231)
at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1016)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1665)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1598)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
I would appreciate any help on reasoning about why this is happening and how to restart the sink independently of the source.
Your RestartSink is restarting (and not, in the process, restarting anything else): if it weren't, you would never have gotten 4 sink as output right after 3 Exception. For some reason it's not logging, but that might be due to stream attributes (there have also been some behavioral changes around logging in stream restarts in recent months, so logging may differ depending on which version you're running).
From the docs for RestartSink:
The restart process is inherently lossy, since there is no coordination between cancelling and the sending of messages. When the wrapped Sink does cancel, this Sink will backpressure, however any elements already sent may have been lost.
This is fundamentally because in the general case stream stages are memoryless. In your Sink.fold example, it will restart with clean state (viz. ""). This does, in my experience, make the RestartSink and RestartFlow somewhat less useful than the RestartSource.
For the use-case you describe, I would tend to use a mapAsync stage with akka.pattern.RetrySupport to send HTTP requests via a Future-based API and retry requests on failures:
val restartingSource: Source[Element, _] = ???

restartingSource.mapAsync(1) { elem =>
  import akka.pattern.RetrySupport._
  // will need an implicit ExecutionContext and an implicit Scheduler
  // (both are probably best obtained from the ActorSystem)
  val sendRequest = () => {
    // Future-based HTTP call
    ???
  }
  retry(
    attempt = sendRequest,
    attempts = Int.MaxValue,
    minBackoff = 1.seconds,
    maxBackoff = 10.seconds,
    randomFactor = 0.2
  )
}.runWith(Sink.ignore)

Await for a Sequence of Futures with timeout without failing on TimeoutException

I have a sequence of Scala Futures of the same type.
After some limited time, I want to get a result for the entire sequence: some futures may have succeeded, some may have failed, and some may not have completed yet; the ones that haven't completed should be considered failed.
I don't want to Await each future sequentially.
I did look at this question: Scala waiting for sequence of futures
and try to use the solution from there, namely:
private def lift[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
  futures.map(_.map { Success(_) }.recover { case t => Failure(t) })

def waitAll[T](futures: Seq[Future[T]])(implicit ex: ExecutionContext) =
  Future.sequence(lift(futures))

val futures: Seq[Future[MyObject]] = ...
val segments = Await.result(waitAll(futures), waitTimeoutMillis millis)
but I'm still getting a TimeoutException, I guess because some of the futures haven't completed yet.
and that answer also states,
Now Future.sequence(lifted) will be completed when every future is completed, and will represent successes and failures using Try.
But I want my Future to be completed after the timeout has passed, not when every future in the sequence has completed. What else can I do?
If I were using a raw Future (rather than some IO monad which has this functionality built-in, or some Akka utilities for exactly this), I would hack together a utility like:
// make each separate future time out
object FutureTimeout {

  // separate EC for waiting
  private val timeoutEC: ExecutionContext = ...

  private def timeout[T](delay: Long): Future[T] = Future {
    blocking {
      Thread.sleep(delay)
    }
    throw new Exception("Timeout")
  }(timeoutEC)

  def apply[T](fut: Future[T], delay: Long)(
      implicit ec: ExecutionContext
  ): Future[T] = Future.firstCompletedOf(Seq(
    fut,
    timeout(delay)
  ))
}
and then

Future.sequence(
  futures
    .map(FutureTimeout(_, delay))
    .map(_.map(Success(_)).recover { case e => Failure(e) })
)
Since each future terminates after at most delay, we can collect them into one result right after that.
You have to remember, though, that no matter how you trigger a timeout, you have no guarantee that the timed-out Future stops executing. It could run on and on, on some thread somewhere; it's just that you no longer wait for its result. firstCompletedOf just makes this race more explicit.
Some other utilities (like e.g. Cats Effect IO) allow you to cancel computations (which is used in races like this one), but you still have to remember that the JVM cannot arbitrarily "kill" a running thread, so cancellation happens after one stage of the computation has completed and before the next one starts (e.g. between .maps or .flatMaps).
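For illustration, a minimal Cats Effect sketch (not from the original answer; slowComputation is a hypothetical IO value):

import cats.effect.IO
import scala.concurrent.duration._

val slowComputation: IO[Int] = ??? // hypothetical long-running computation

// .timeout races the computation against a 500 ms sleep and cancels the
// loser; cancellation takes effect between stages (e.g. between flatMaps),
// the JVM never kills a thread mid-step
val guarded: IO[Int] = slowComputation.timeout(500.millis)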
If you aren't afraid of adding external deps, there are other (and more reliable, as Thread.sleep is just a temporary ugly hack) ways of timing out a Future, like Akka utils. See also other questions like this.
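As a sketch of what that could look like (my addition, assuming an implicit ActorSystem is available for its Scheduler), the Thread.sleep hack above can be replaced with akka.pattern.after, which schedules the failure instead of blocking a thread:

import akka.actor.ActorSystem
import akka.pattern.after
import scala.concurrent.{ExecutionContext, Future}
import scala.concurrent.duration._

def withTimeout[T](fut: Future[T], delay: FiniteDuration)(
    implicit system: ActorSystem, ec: ExecutionContext
): Future[T] =
  Future.firstCompletedOf(Seq(
    fut,
    // scheduled failure: no thread is blocked while waiting
    after(delay, system.scheduler)(Future.failed(new Exception("Timeout")))
  ))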
Here is a solution using Monix:
import monix.eval.Task
import monix.execution.Scheduler
import scala.concurrent.duration._
import scala.util.{Failure, Success}

// it's safe to use a single thread here because timeout tasks are very fast
val timeoutScheduler = Scheduler.singleThread("timeout")

def sequenceDiscardTimeouts[T](tasks: Task[T]*): Task[Seq[T]] = {
  Task
    .parSequence(
      tasks
        .map(t =>
          t.map(Success.apply) // map to Success so we can collect the value
            .timeout(500.millis)
            // run timeouts on a dedicated scheduler that won't be blocked
            // by "blocking"/IO work if you have any
            .executeOn(timeoutScheduler)
            .onErrorRecoverWith { case ex =>
              println("timed-out")
              // any error is assumed to be a timeout; it's possible to
              // "catch" just the timeout exception here
              Task.pure(Failure(ex))
            }
        )
    )
    .map { res =>
      res.collect { case Success(r) => r }
    }
}
Testing code:

import scala.util.Random

implicit val mainScheduler = Scheduler.fixedPool(name = "main", poolSize = 10)

def slowTask(msg: String) = {
  // sleep here to emulate a slow task
  Task.sleep(Random.nextLong(1000).millis)
    .map { _ =>
      msg
    }
}

val app = sequenceDiscardTimeouts(
  slowTask("1"),
  slowTask("2"),
  slowTask("3"),
  slowTask("4"),
  slowTask("5"),
  slowTask("6")
)

val started: Long = System.currentTimeMillis()
app.runSyncUnsafe().foreach(println)
println(s"Done in ${System.currentTimeMillis() - started} millis")
The output will differ from run to run, but it should look like the following:
timed-out
timed-out
timed-out
3
4
5
Done in 564 millis
Please note the use of two separate schedulers. This ensures that timeouts fire even if the main scheduler is busy with business logic. You can test this by reducing the poolSize of the main scheduler.

How do I call context.become outside of a Future from Ask messages?

I have a parent Akka actor named buildingCoordinator which creates children named elevator_X. For now I am creating only one elevator. The buildingCoordinator sends a sequence of messages and waits for responses in order to move an elevator. The sequence is: send ? RequestElevatorState -> receive ElevatorState -> send ? MoveRequest -> receive MoveRequestSuccess -> change the state. As you can see, I am using the ask pattern. After the movement succeeds, the buildingCoordinator changes its state using context.become.
The problem I am running into is that the elevator is receiving MoveRequest(1,4) for the same floor twice, sometimes three times. I do remove the floor when I call context.become, but I remove it inside the last map. I think this is because I am using context.become inside a Future when I should use it outside, but I am having trouble implementing that.
case class BuildingCoordinator(actorName: String,
                               numberOfFloors: Int,
                               numberOfElevators: Int,
                               elevatorControlSystem: ElevatorControlSystem)
  extends Actor with ActorLogging {

  import context.dispatcher
  implicit val timeout = Timeout(4 seconds)

  val elevators = createElevators(numberOfElevators)

  override def receive: Receive = operational(Map[Int, Queue[Int]](), Map[Int, Queue[Int]]())

  def operational(stopsRequests: Map[Int, Queue[Int]], pickUpRequests: Map[Int, Queue[Int]]): Receive = {
    case msg @ MoveElevator(elevatorId) =>
      println(s"[BuildingCoordinator] received $msg")
      val elevatorActor: ActorSelection = context.actorSelection(s"/user/$actorName/elevator_$elevatorId")
      val newState = (elevatorActor ? RequestElevatorState(elevatorId))
        .mapTo[ElevatorState]
        .flatMap { state =>
          val nextStop = elevatorControlSystem.findNextStop(stopsRequests.get(elevatorId).get, state.currentFloor, state.direction)
          elevatorActor ? MoveRequest(elevatorId, nextStop)
        }
        .mapTo[MoveRequestSuccess]
        .flatMap(moveRequestSuccess => elevatorActor ? MakeMove(elevatorId, moveRequestSuccess.targetFloor))
        .mapTo[MakeMoveSuccess]
        .map { makeMoveSuccess =>
          println(s"[BuildingCoordinator] Elevator ${makeMoveSuccess.elevatorId} arrived at floor [${makeMoveSuccess.floor}]")
          // removeStopRequest
          val stopsRequestsElevator = stopsRequests.get(elevatorId).getOrElse(Queue[Int]())
          val newStopsRequestsElevator = stopsRequestsElevator.filterNot(_ == makeMoveSuccess.floor)
          val newStopsRequests = stopsRequests + (elevatorId -> newStopsRequestsElevator)

          val pickUpRequestsElevator = pickUpRequests.get(elevatorId).getOrElse(Queue[Int]())
          val newPickUpRequestsElevator = {
            if (pickUpRequestsElevator.contains(makeMoveSuccess.floor)) {
              pickUpRequestsElevator.filterNot(_ == makeMoveSuccess.floor)
            } else {
              pickUpRequestsElevator
            }
          }
          val newPickUpRequests = pickUpRequests + (elevatorId -> newPickUpRequestsElevator)

          // I THINK I SHOULD NOT CALL context.become HERE
          // context.become(operational(newStopsRequests, newPickUpRequests))

          val dropOffFloor = BuildingUtil.generateRandomFloor(numberOfFloors, makeMoveSuccess.floor, makeMoveSuccess.direction)
          context.self ! DropOffRequest(makeMoveSuccess.elevatorId, dropOffFloor)
          (newStopsRequests, newPickUpRequests)
        }

      // I MUST CALL context.become HERE, BUT I DON'T KNOW HOW
      // context.become(operational(newState.flatMap(state => (state._1, state._2))))
  }
}
Another thing that might be nasty here is this big chain of map and flatMap. This was my way of implementing it, but I think there may be a better one.
You can't, and should not, call context.become (or otherwise change actor state) outside of the Receive method and outside of the thread that invokes Receive (which is an Akka dispatcher thread), as in your example. E.g.:
def receive: Receive = {
  // This is a bug: context is not (and is not supposed to be) thread safe.
  case message: Message => Future(context.become(anotherReceive))
}
What you should do is send a message to self once the async operation has finished, and switch the receive state then. If in the meantime you don't want to handle incoming messages, you can stash them. See for more details: https://doc.akka.io/docs/akka/current/typed/stash.html
A high-level example, technical details omitted:
import akka.pattern.pipe

case class OperationFinished(calculations: Map[Any, Any])

class AsyncActor extends Actor with Stash {
  import context.dispatcher

  // some implementation of a heavy async operation
  def operation: Future[Map[Any, Any]] = ...

  def receiveStartAsync(calculations: Map[Any, Any]): Receive = {
    case StartAsyncOperation =>
      // Start the async operation and inform yourself once it is finished
      operation.map(OperationFinished.apply) pipeTo self
      context.become(receiveWaitAsyncOperation)
  }

  def receiveWaitAsyncOperation: Receive = {
    case OperationFinished(calculations) =>
      unstashAll()
      context.become(receiveStartAsync(calculations))
    case _ => stash()
  }
}
I like your response, @Ivan Kurchenko.
But, according to the Akka Stash docs:
When unstashing the buffered messages by calling unstashAll the messages will be processed sequentially in the order they were added and all are processed unless an exception is thrown. The actor is unresponsive to other new messages until unstashAll is completed. That is another reason for keeping the number of stashed messages low. Actors that hog the message processing thread for too long can result in starvation of other actors.
Meaning that under load, for example, an unstashAll operation can starve other actors.
According to the same doc:
That can be mitigated by using the StashBuffer.unstash with numberOfMessages parameter and then send a message to context.self before continuing unstashing more. That means that other new messages may arrive in-between and those must be stashed to keep the original order of messages. It becomes more complicated, so better keep the number of stashed messages low.
Bottom line: you should keep the stashed message count low. Stashing might not be suitable under heavy load.
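To illustrate the mitigation the docs describe, here is a rough Akka Typed sketch (my own, not from the original answer; Msg, Work and Continue are hypothetical message types):

import akka.actor.typed.Behavior
import akka.actor.typed.scaladsl.{Behaviors, StashBuffer}

sealed trait Msg
final case class Work(id: Int) extends Msg
case object Continue extends Msg // self-message that drives the drain

def processing(buffer: StashBuffer[Msg]): Behavior[Msg] =
  Behaviors.receive { (context, message) =>
    message match {
      case Work(id) =>
        // ... handle one unit of work ...
        Behaviors.same
      case Continue if buffer.nonEmpty =>
        // drain at most 10 stashed messages with this behavior; the
        // Continue we send to self starts the next batch, and other
        // actors get the thread in between batches
        context.self ! Continue
        buffer.unstash(processing(buffer), 10, identity)
      case Continue =>
        Behaviors.same
    }
  }

In real code the buffer would come from Behaviors.withStash(capacity) { buffer => ... }, and, as the quoted docs note, new messages arriving between batches would themselves need stashing if the original ordering must be preserved.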

Sending actorRefWithAck inside stream

I'm using the answer from this thread because I need to treat the first element specially. The problem is, I need to send this data to another actor or persist it locally (which is not possible).
So, my stream looks like this:
val flow: Flow[Message, Message, (Future[Done], Promise[Option[Message]])] =
  Flow.fromSinkAndSourceMat(
    Flow[Message].mapAsync[Trade](1) {
      case TextMessage.Strict(text) =>
        Unmarshal(text).to[Trade]
      case streamed: TextMessage.Streamed =>
        streamed.textStream.runFold("")(_ ++ _).flatMap(Unmarshal(_).to[Trade])
    }.groupBy(pairs.size, _.s).prefixAndTail(1).flatMapConcat {
      case (head, tail) =>
        // sending first element here
        val result = Source(head).to(Sink.actorRefWithAck(
          ref = actor,
          onInitMessage = Init,
          ackMessage = Ack,
          onCompleteMessage = "done"
        )).run()
        // some kind of operation on the result
        Source(head).concat(tail)
    }.mergeSubstreams.toMat(sink)(Keep.right),
    Source.maybe[Message])(Keep.both)
Is this good practice? Will it have unintended consequences? Unfortunately, I cannot call persist inside the stream, so I want to send this data to an external system.
Your current approach doesn't use result in any way, so a simpler alternative would be to fire and forget the first Message to the actor:
groupBy(pairs.size, _.s).prefixAndTail(1).flatMapConcat {
  case (head, tail) =>
    // sending first element here
    actor ! head.head
    Source(head).concat(tail)
}
The actor would then not have to worry about handling Init and sending Ack messages and could be solely concerned with persisting Message instances.
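For instance, a minimal sketch of such an actor (PersistActor is a hypothetical name; note that after the mapAsync stage the substream elements are Trade values):

import akka.actor.Actor

class PersistActor extends Actor {
  def receive: Receive = {
    case trade: Trade =>
      // fire-and-forget: no Init/Ack protocol to implement,
      // just persist or forward the element
      // persist(trade) ...
      ()
  }
}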

Why does Source.tick stop after one hundred HttpRequests?

Using Akka Streams and Akka HTTP, I have created a stream which polls an API every 3 seconds, unmarshals the result to a JsValue object and sends this result to an actor, as can be seen in the following code:
// Source which performs an HTTP request every 3 seconds.
val source = Source.tick(0.seconds,
  3.seconds,
  HttpRequest(uri = Uri(path = Path("/posts/1"))))

// Processes the result of the HTTP request
val flow = Http().outgoingConnectionHttps("jsonplaceholder.typicode.com").mapAsync(1) {
  // Able to reach the API.
  case HttpResponse(StatusCodes.OK, _, entity, _) =>
    // Unmarshal the json response.
    Unmarshal(entity).to[JsValue]
  // Failed to reach the API.
  case HttpResponse(code, _, entity, _) =>
    entity.discardBytes()
    Future.successful(code.toString())
}

// Run the stream
source.via(flow).runWith(Sink.actorRef[Any](processJsonActor, akka.actor.Status.Success("Completed stream")))
This works; however, the stream closes after 100 HttpRequests (ticks).
What is the cause of this behaviour?
It is definitely something to do with outgoingConnectionHttps. This is a low-level DSL and there could be some misconfigured setting somewhere which is causing this (although I couldn't figure out which one).
Usage of this DSL is actually discouraged by the docs.
Try using a higher-level DSL like the cached host connection pool:
val flow = Http().cachedHostConnectionPoolHttps[NotUsed]("akka.io").mapAsync(1) {
  // Able to reach the API.
  case (Success(HttpResponse(StatusCodes.OK, _, entity, _)), _) =>
    // Unmarshal the json response.
    Unmarshal(entity).to[String]
  // Failed to reach the API.
  case (Success(HttpResponse(code, _, entity, _)), _) =>
    entity.discardBytes()
    Future.successful(code.toString())
  case (Failure(e), _) =>
    throw e
}

// Run the stream
source.map(_ -> NotUsed).via(flow).runWith(...)
A potential issue is that there is no backpressure signal with Sink.actorRef, so the actor's mailbox could be filling up. If the actor does something that could take a long time whenever it receives a JsValue object, use Sink.actorRefWithAck instead. For example:
val initMessage = "start"
val completeMessage = "done"
val ackMessage = "ack"

source
  .via(flow)
  .runWith(Sink.actorRefWithAck[Any](
    processJsonActor, initMessage, ackMessage, completeMessage))
You would need to change the actor to handle an initMessage and to reply to the stream with an ackMessage for every stream element (with sender ! ackMessage). More information on Sink.actorRefWithAck is found here.
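For instance, a minimal sketch of such an actor (my addition; ProcessJsonActor is a hypothetical name, and JsValue is assumed to be spray-json's, as in the question):

import akka.actor.Actor
import spray.json.JsValue

class ProcessJsonActor extends Actor {
  def receive: Receive = {
    case "start" =>
      sender() ! "ack" // acknowledge stream initialization
    case json: JsValue =>
      // ... potentially slow processing of the element ...
      sender() ! "ack" // signal demand for the next element
    case "done" =>
      context.stop(self) // the stream has completed
  }
}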