Akka streams Source.repeat stops after 100 requests - scala

I am working on the stream processing system below, which grabs frames from one source, processes them, and sends them to another. I'm using a combination of akka-streams and akka-http through their Scala API. The pipeline is very short, but I can't locate where the system decides to stop after precisely 100 requests to the endpoint.
object frameProcessor extends App {
implicit val system: ActorSystem = ActorSystem("VideoStreamProcessor")
val decider: Supervision.Decider = _ => Supervision.Restart
implicit val materializer: ActorMaterializer = ActorMaterializer()
implicit val dispatcher: ExecutionContextExecutor = system.dispatcher
val http = Http(system)
val sourceConnectionFlow: Flow[HttpRequest, HttpResponse, Future[Http.OutgoingConnection]] = http.outgoingConnection(sourceUri)
val byteFlow: Flow[HttpResponse, Future[ByteString], NotUsed] =
Flow[HttpResponse].map(_.entity.dataBytes.runFold(ByteString.empty)(_ ++ _))
Source.repeat(HttpRequest(uri = sourceUri))
.via(sourceConnectionFlow)
.via(byteFlow)
.map(postFrame)
.runWith(Sink.ignore)
.onComplete(_ => system.terminate())
def postFrame(imageBytes: Future[ByteString]): Unit = {
imageBytes.onComplete{
case Success(res) => system.log.info(s"post frame. ${res.length} bytes")
case Failure(_) => system.log.error("failed to post image!")
}
}
}
For reference, I'm using akka-streams version 2.5.19 and akka-http version 10.1.7. No error is thrown, there are no error codes on the source server where the frames come from, and the program exits with exit code 0.
My application.conf is as follows:
logging = "DEBUG"
Always 100 units processed.
Thanks!
Edit
Added logging to the stream like so
.onComplete{
case Success(res) => {
system.log.info(res.toString)
system.terminate()
}
case Failure(res) => {
system.log.error(res.getMessage)
system.terminate()
}
}
I occasionally received a connection reset exception, but not consistently. Otherwise the stream completes with Done.
Edit 2
Using .mapAsync(1)(postFrame) I get the same Success(Done) after precisely 100 requests. Additionally, when I check the nginx server access.log and error.log there are only 200 responses.
I had to modify postFrame as follows to run mapAsync
def postFrame(imageBytes: Future[ByteString]): Future[Unit] = {
imageBytes.onComplete{
case Success(res) => system.log.info(s"post frame. ${res.length} bytes")
case Failure(_) => system.log.error("failed to post image!")
}
Future.successful(()) // return the unit value; Future(Unit) would wrap the Unit companion object
}

I believe I have found the answer in the Akka docs on delayed restarts with a backoff operator. Instead of sourcing directly from an unstable remote connection, I use RestartSource.withBackoff (not RestartSource.onFailuresWithBackoff), so the source is restarted when it completes as well as when it fails. The modified stream looks like this:
val restartSource = RestartSource.withBackoff(
minBackoff = 100.milliseconds,
maxBackoff = 1.seconds,
randomFactor = 0.2
){ () =>
Source.single(HttpRequest(uri = sourceUri))
.via(sourceConnectionFlow)
.via(byteFlow)
.mapAsync(1)(postFrame)
}
restartSource
.runWith(Sink.ignore)
.onComplete{
x => {
println(x)
system.terminate()
}
}
I was not able to find the source of the problem, but this seems to work.

Related

Akka RestartSource does not restart

object TestSource {
implicit val ec = ExecutionContext.global
def main(args: Array[String]): Unit = {
def buildSource = {
println("fresh")
Source(List(() => 1,() => 2,() => 3,() => {
println("crash")
throw new RuntimeException(":(((")
}))
}
val restarting = RestartSource.onFailuresWithBackoff(
minBackoff = Duration(1, SECONDS) ,
maxBackoff = Duration(1, SECONDS),
randomFactor = 0.0,
maxRestarts = 10
)(() => {
buildSource
})
implicit val actorSystem: ActorSystem = ActorSystem()
implicit val executionContext: ExecutionContext = actorSystem.dispatcher
restarting.runWith(Sink.foreach(e => println(e())))
}
}
The code above prints: 1,2,3, crash
Why does my source not restart?
This is pretty much a 1:1 copy of the official documentation.
edit:
I also tried
val rs = RestartSink.withBackoff[() => Int](
Duration(1, SECONDS),
Duration(1, SECONDS),
0.0,
10
)(_)
val rsDone = rs(() => {
println("???")
Sink.foreach(e => println(e()))
})
restarting.runWith(rsDone)
but still get no restarts
This is because the exception is thrown outside the Source built by buildSource: it is thrown in Sink.foreach, when you call the functions emitted by the Source.
Try this:
val restarting = RestartSource.onFailuresWithBackoff(
minBackoff = Duration(1, SECONDS) ,
maxBackoff = Duration(1, SECONDS),
randomFactor = 0.0,
maxRestarts = 10
)(() => {
buildSource
.map(e => e()) //call the functions inside the RestartSource
})
That way your exception will happen inside the inner Source wrapped by RestartSource and the restarting mechanism will kick in.
The source doesn't restart because your source never fails, therefore never needs to restart.
The exception gets thrown when Sink.foreach evaluates the function it received.
As artur noted, if you can move the failing bit into the source, you can wrap everything up to the sink in the RestartSource.
While it won't help for this contrived example (restarting a sink doesn't resend previously sent messages), wrapping the sink in a RestartSink may be useful in real-world cases where this sort of thing can happen. Off the top of my head, a stream from Kafka blowing up because the offset commit in a sink failed (e.g. after a rebalance) should be an example of such a case.
Alternatively, if you want to restart the whole stream when any part fails, and the stream materializes as a Future, you can implement retry-with-backoff on the failed future.
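As a rough illustration of retry-with-backoff on a failed future, here is a minimal sketch that uses only the Scala and JDK standard libraries. The names RetryWithBackoff, retry and op are hypothetical and not part of Akka; op stands in for something like () => graph.run() that materializes the stream and returns its Future. It assumes op signals failure through a failed Future rather than by throwing.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.control.NonFatal

object RetryWithBackoff {
  // Single daemon thread used only to schedule the delayed retries.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "retry-scheduler")
      t.setDaemon(true)
      t
    }
  })

  // Runs op; on failure waits `delay`, then retries with the delay doubled
  // (capped at maxDelay) until no retries are left.
  def retry[T](retriesLeft: Int, delay: FiniteDuration, maxDelay: FiniteDuration)(
      op: () => Future[T])(implicit ec: ExecutionContext): Future[T] =
    op().recoverWith {
      case NonFatal(_) if retriesLeft > 0 =>
        val next = Promise[T]()
        val nextDelay = (delay * 2L).min(maxDelay)
        scheduler.schedule(new Runnable {
          def run(): Unit =
            next.completeWith(retry(retriesLeft - 1, nextDelay, maxDelay)(op))
        }, delay.toMillis, TimeUnit.MILLISECONDS)
        next.future
    }
}
```

With a stream, the call would look something like RetryWithBackoff.retry(10, 1.second, 30.seconds)(() => graph.run()), so every retry materializes a fresh copy of the whole graph.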
The source just never crashes, as already said here.
You are actually crashing your sink, not the source, with this statement: e => e()
This happens when the lambda above is applied to the last element of the source:
java.lang.RuntimeException: :(((
Here's the same stream without unhandled exception in sink:
...
RestartSource.withBackoff(
...
restarting.runWith(
Sink.foreach(e => {
def i: Int = try{ e() } catch {
case t: Throwable =>
println(t)
-1
}
println(i)
})
)
Works perfectly.

Play framework / Akka Streams: Detecting when a WebSocket has closed

When handling WebSockets with Akka Streams directly, I didn't find a proper way to know when the client disconnects (either normally or due to a crash or timeout). I'm using a basic example like the one from the official documentation:
import play.api.mvc._
import akka.stream.scaladsl._
def socket = WebSocket.accept[String, String] { request =>
// Log events to the console
val in = Sink.foreach[String](println)
// Send a single 'Hello!' message and then leave the socket open
val out = Source.single("Hello!").concat(Source.maybe)
Flow.fromSinkAndSource(in, out)
}
I need to know when a client is no longer connected.
Use watchTermination:
def socket = WebSocket.accept[String, String] { request =>
val in = Sink.foreach[String](println)
val out = Source.single("Hello!").concat(Source.maybe)
Flow.fromSinkAndSource(in, out)
.watchTermination() { (_, fut) =>
fut onComplete {
case Success(_) =>
println("Client disconnected")
case Failure(t) =>
println(s"Disconnection failure: ${t.getMessage}")
}
}
}

How to disable the buffering of messages on an Akka WebSocket server?

I have a very simple Akka WebSocket server that pushes lines from a file to a connected client with an interval of 400ms per line. Everything works fine, except for the fact that the web server seems to buffer messages for about a minute before broadcasting them.
So when a client connects, I see at the server end that every 400ms a line is read and pushed to the Sink, but on the client side I get nothing for a minute and then a burst of about 150 messages (corresponding to a minute of messages).
Is there a setting that I'm overlooking?
object WebsocketServer extends App {
implicit val actorSystem = ActorSystem("WebsocketServer")
implicit val materializer = ActorMaterializer()
implicit val executionContext = actorSystem.dispatcher
val file = Paths.get("websocket-server/src/main/resources/EURUSD.txt")
val fileSource =
FileIO.fromPath(file)
.via(Framing.delimiter(ByteString("\n"), Int.MaxValue))
val delayedSource: Source[Strict, Future[IOResult]] =
fileSource
.map { line =>
Thread.sleep(400)
println(line.utf8String)
TextMessage(line.utf8String)
}
def route = path("") {
extractUpgradeToWebSocket { upgrade =>
complete(upgrade.handleMessagesWithSinkSource(
Sink.ignore,
delayedSource)
)
}
}
val bindingFuture = Http().bindAndHandle(route, "localhost", 8080)
bindingFuture.onComplete {
case Success(binding) ⇒
println(s"Server is listening on ws://localhost:8080")
case Failure(e) ⇒
println(s"Binding failed with ${e.getMessage}")
actorSystem.terminate()
}
}
So the approach with Thread.sleep(400) was wrong. I should've used the .throttle mechanic on sources:
val delayedSource: Source[Strict, Future[IOResult]] =
fileSource
.throttle(elements = 1, per = 400.millis)
.map { line =>
println(line.utf8String)
TextMessage(line.utf8String)
}
This fixed the issue.
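For anyone wanting to try throttle in isolation, here is a small self-contained sketch. ThrottleSketch and emitThrottled are made-up names, and the in-memory list stands in for the file source. Thread.sleep inside map blocks a dispatcher thread, whereas throttle paces elements with timers and blocks nothing; whether that alone explains the one-minute bursts above is not something this sketch proves.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object ThrottleSketch {
  // Emits the given lines at most one per `interval` and returns each line
  // together with the elapsed milliseconds at which it passed through.
  def emitThrottled(lines: List[String], interval: FiniteDuration): Seq[(Long, String)] = {
    implicit val system: ActorSystem = ActorSystem("ThrottleSketch")
    // ActorMaterializer matches the Akka 2.5 style used in the question.
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    val started = System.nanoTime()
    try {
      val result = Source(lines)
        .throttle(elements = 1, per = interval) // timer based, no blocked thread
        .map(line => ((System.nanoTime() - started) / 1000000L, line))
        .runWith(Sink.seq)
      Await.result(result, 30.seconds)
    } finally system.terminate()
  }
}
```

Calling ThrottleSketch.emitThrottled(List("a", "b", "c"), 400.millis) should return the three lines in order, with timestamps spaced roughly 400 ms apart.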

Akka-http: connect to websocket on localhost

I am trying to connect to some server through websocket on localhost. When I try to do it in JS by
ws = new WebSocket('ws://localhost:8137');
it succeeds. However, when I use akka-http and akka-streams I get "connection failed" error.
object Transmitter {
implicit val system: ActorSystem = ActorSystem()
implicit val materializer: ActorMaterializer = ActorMaterializer()
import system.dispatcher
object Rec extends Actor {
override def receive: Receive = {
case TextMessage.Strict(msg) =>
Log.info("Received signal " + msg)
}
}
// val host = "ws://echo.websocket.org"
val host = "ws://localhost:8137"
val sink: Sink[Message, NotUsed] = Sink.actorRef[Message](system.actorOf(Props(Rec)), PoisonPill)
val source: Source[Message, NotUsed] = Source(List("test1", "test2") map (TextMessage(_)))
val flow: Flow[Message, Message, Future[WebSocketUpgradeResponse]] =
Http().webSocketClientFlow(WebSocketRequest(host))
val (upgradeResponse, closed) =
source
.viaMat(flow)(Keep.right) // keep the materialized Future[WebSocketUpgradeResponse]
.toMat(sink)(Keep.both) // also keep the Future[Done]
.run()
val connected: Future[Done.type] = upgradeResponse.flatMap { upgrade =>
if (upgrade.response.status == StatusCodes.SwitchingProtocols) {
Future.successful(Done)
} else {
Future.failed(new Exception(s"Connection failed: ${upgrade.response.status}"))
}
}
def test(): Unit = {
connected.onComplete(Log.info)
}
}
It works completely OK with ws://echo.websocket.org.
I don't think attaching my server code is necessary, because it works with the JavaScript client and the problem is only with the connection. However, I can show it if you would like to look at it.
What am I doing wrong?
I have tested your client implementation with a websocket server from akka documentation,
and I did not get any connection error. Your websocket client connects successfully. That is why I am guessing the problem is with your server implementation.
object WebSocketServer extends App {
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
import Directives._
val greeterWebSocketService = Flow[Message].collect {
case tm: TextMessage => TextMessage(Source.single("Hello ") ++ tm.textStream)
}
val route =
get {
handleWebSocketMessages(greeterWebSocketService)
}
val bindingFuture = Http().bindAndHandle(route, "localhost", 8137)
println(s"Server online at http://localhost:8137/\nPress RETURN to stop...")
StdIn.readLine()
import system.dispatcher // for the future transformations
bindingFuture
.flatMap(_.unbind()) // trigger unbinding from the port
.onComplete(_ => system.terminate()) // and shutdown when done
}
By the way, I noticed that your actor's receive method does not cover all possible messages. According to an Akka issue,
every message, even a very small one, can end up as Streamed. If you want to print all text messages, a better implementation of the actor would be:
object Rec extends Actor {
override def receive: Receive = {
case TextMessage.Strict(text) ⇒ println(s"Received signal $text")
case TextMessage.Streamed(textStream) ⇒ textStream.runFold("")(_ + _).foreach(msg => println(s"Received streamed signal: $msg"))
}
}
Please find a working project on my github.
I found the solution: the server I used was running on IPv6 (as ::1), but akka-http treats localhost as 127.0.0.1 and ignores ::1. I had to rewrite the server to force it to use IPv4, and then it worked.
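A quick way to check this kind of mismatch is to print what localhost resolves to on the JVM. The sketch below uses only java.net; LocalhostCheck and describe are illustrative names. The order and set of addresses depend on the OS hosts file and on JVM settings such as java.net.preferIPv4Stack, so the output will differ between machines.

```scala
import java.net.{Inet4Address, Inet6Address, InetAddress}

object LocalhostCheck {
  // Labels an address with its IP family.
  def describe(addr: InetAddress): String = addr match {
    case _: Inet4Address => s"IPv4 ${addr.getHostAddress}"
    case _: Inet6Address => s"IPv6 ${addr.getHostAddress}"
    case _               => s"unknown ${addr.getHostAddress}"
  }

  def main(args: Array[String]): Unit =
    // Lists every address the JVM resolves "localhost" to, in resolution order.
    InetAddress.getAllByName("localhost").foreach(a => println(describe(a)))
}
```

If the server only listens on ::1 and the client ends up connecting to 127.0.0.1, the connection is refused even though both sides say "localhost".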

Akka Streams Error Handling. How to know which row failed?

I read this article on akka streams error handling
http://doc.akka.io/docs/akka/2.5.4/scala/stream/stream-error.html
and wrote this code.
val decider: Supervision.Decider = {
case _: Exception => Supervision.Restart
case _ => Supervision.Stop
}
implicit val actorSystem = ActorSystem()
implicit val actorMaterializer = ActorMaterializer(ActorMaterializerSettings(actorSystem).withSupervisionStrategy(decider))
val source = Source(1 to 10)
val flow = Flow[Int].map{x => if (x != 9) 2 * x else throw new Exception("9!")}
val sink : Sink[Int, Future[Done]] = Sink.foreach[Int](x => println(x))
val graph = RunnableGraph.fromGraph(GraphDSL.create(sink){implicit builder => s =>
import GraphDSL.Implicits._
source ~> flow ~> s.in
ClosedShape
})
val future = graph.run()
future.onComplete{ _ =>
actorSystem.terminate()
}
Await.result(actorSystem.whenTerminated, Duration.Inf)
This works very well, except that I need to scan the output to see which row did not get processed. Is there a way for me to print/log the row which failed, without putting explicit try/catch blocks in each and every flow that I write?
For example, if I were using actors (as opposed to streams), I could have hooked into the actor's life cycle events and logged when the actor restarted, along with the message that was being processed at the time of the restart.
But here I am not using actors explicitly (although they are used internally). Are there life cycle events for a Flow / Source / Sink?
Just a small modification to your code:
val decider: Supervision.Decider = {
case e: Exception =>
println("Exception handled, recovering stream:" + e.getMessage)
Supervision.Restart
case _ => Supervision.Stop
}
If you put meaningful messages in the exceptions thrown in your stream (the failing row, for example), you can print them in the supervision decider.
I used println to keep the answer short, but I strongly recommend using a logging library such as scala-logging.
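One way to get the failing row into the exception without writing try/catch in every flow is a small reusable wrapper around the function passed to map. This is only a sketch in plain Scala: ElementFailed and tagged are hypothetical names, not Akka API.

```scala
import scala.util.control.NonFatal

// Carries the element that was being processed when the original exception was thrown.
final case class ElementFailed(element: Any, cause: Throwable)
  extends RuntimeException(s"Failed on element [$element]: ${cause.getMessage}", cause)

object FailureContext {
  // Wraps f so that any non-fatal exception is rethrown together with its input element.
  def tagged[A, B](f: A => B): A => B = a =>
    try f(a)
    catch { case NonFatal(e) => throw ElementFailed(a, e) }
}
```

In the stream from the question this would be used as Flow[Int].map(FailureContext.tagged[Int, Int](x => if (x != 9) 2 * x else throw new Exception("9!"))), and the decider can then match case ElementFailed(element, cause) to log the row before returning Supervision.Restart.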