Monitoring Akka Streams Sources with ScalaFX

What I'm trying to solve is the following case:
Given an infinitely running Akka Stream, I want to be able to monitor certain points of the stream. The best way I could think of was to send the messages at those points to an Actor which is also a Source. This makes it very flexible for me to then connect either individual Sources, or several merged Sources, to a websocket or whatever other client I want to connect. However, in this specific case I'm trying to connect ScalaFX with an Akka Source, and it is not working as expected.
When I run the code below, both counters start out fine, but after a short while one of them stops and never recovers. I know there are special considerations around threading when using ScalaFX, but I don't know enough to understand what is going on here or how to debug it. Below is a minimal example to run; the issue should be visible after about 5 seconds.
My question is:
How could I change this code to work as expected?
import akka.NotUsed
import scalafx.Includes._
import akka.actor.{ActorRef, ActorSystem}
import akka.stream.{ActorMaterializer, OverflowStrategy, ThrottleMode}
import akka.stream.scaladsl.{Flow, Sink, Source}
import scalafx.application.JFXApp
import scalafx.beans.property.{IntegerProperty, StringProperty}
import scalafx.scene.Scene
import scalafx.scene.layout.BorderPane
import scalafx.scene.text.Text
import scala.concurrent.duration._

/**
 * Created by henke on 2017-06-10.
 */
object MonitorApp extends JFXApp {
  implicit val system = ActorSystem("monitor")
  implicit val mat = ActorMaterializer()

  val value1 = StringProperty("0")
  val value2 = StringProperty("0")

  stage = new JFXApp.PrimaryStage {
    title = "Akka Stream Monitor"
    scene = new Scene(600, 400) {
      root = new BorderPane() {
        left = new Text {
          text <== value1
        }
        right = new Text {
          text <== value2
        }
      }
    }
  }

  override def stopApp() = system.terminate()

  val monitor1 = createMonitor[Int]
  val monitor2 = createMonitor[Int]

  val marketChangeActor1 = monitor1
    .to(Sink.foreach { v =>
      value1() = v.toString
    }).run()

  val marketChangeActor2 = monitor2
    .to(Sink.foreach { v =>
      value2() = v.toString
    }).run()

  val monitorActor = Source[Int](1 to 100)
    .throttle(1, 1.second, 1, ThrottleMode.shaping)
    .via(logToMonitorAndContinue(marketChangeActor1))
    .map(_ * 10)
    .via(logToMonitorAndContinue(marketChangeActor2))
    .to(Sink.ignore).run()

  def createMonitor[T]: Source[T, ActorRef] = Source.actorRef[T](Int.MaxValue, OverflowStrategy.fail)

  def logToMonitorAndContinue[T](monitor: ActorRef): Flow[T, T, NotUsed] = {
    Flow[T].map { e =>
      monitor ! e
      e
    }
  }
}

It seems that you assign values to the properties (and therefore affect the UI) from the actor system's threads. However, all interaction with the UI should happen on the JavaFX GUI thread. Try wrapping value1() = v.toString (and the second assignment likewise) in Platform.runLater calls.
I wasn't able to find a definitive statement about using runLater to interact with JavaFX data outside of the JavaFX-Swing integration document, but this is quite a common requirement in UI libraries; the same is true of Swing with its SwingUtilities.invokeLater method, for example.
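For example, a minimal sketch of the suggested change for the first sink (the second would be changed the same way), assuming scalafx.application.Platform is imported:

import scalafx.application.Platform

val marketChangeActor1 = monitor1
  .to(Sink.foreach { v =>
    // hand the property update off to the JavaFX application thread
    Platform.runLater {
      value1() = v.toString
    }
  }).run()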

Related

How should I test akka-streams RestartingSource usage

I'm working on an application that has a couple of long-running streams going, where it subscribes to data about a certain entity and processes that data. These streams should be up 24/7, so we needed to handle failures (network issues, etc.).
For that purpose, we've wrapped our sources in RestartingSource.
I'm now trying to verify this behaviour, and while it looks like it functions, I'm struggling to create a test where I push in some data, verify that it processes correctly, then send an error, and verify that it reconnects after that and continues processing.
I've boiled that down to this minimal case:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{RestartSource, Sink, Source}
import akka.stream.testkit.TestPublisher
import org.scalatest.concurrent.Eventually
import org.scalatest.{FlatSpec, Matchers}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext

class MinimalSpec extends FlatSpec with Matchers with Eventually {
  "restarting a failed source" should "be testable" in {
    implicit val sys: ActorSystem = ActorSystem("akka-grpc-measurements-for-test")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    implicit val ec: ExecutionContext = sys.dispatcher

    val probe = TestPublisher.probe[Int]()
    val restartingSource = RestartSource
      .onFailuresWithBackoff(1.second, 1.minute, 0d) { () => Source.fromPublisher(probe) }

    var last: Int = 0
    val sink = Sink.foreach { l: Int => last = l }
    restartingSource.runWith(sink)

    probe.sendNext(1)
    eventually {
      last shouldBe 1
    }

    probe.sendNext(2)
    eventually {
      last shouldBe 2
    }

    probe.sendError(new RuntimeException("boom"))
    probe.expectSubscription()

    probe.sendNext(3)
    eventually {
      last shouldBe 3
    }
  }
}
This test consistently fails on the last eventually block with "Last failure message: 2 was not equal to 3". What am I missing here?
Edit: akka version is 2.5.31
I figured it out after having a look at the TestPublisher code. Its subscription is a lazy val, so when RestartSource detects the error and executes the factory method () => Source.fromPublisher(probe) again, it gets a new Source, but the probe's subscription still points at the old Source. Changing the code to create a new TestPublisher (and with it a new Source) on each restart works.
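A minimal sketch of that change (one possible way to keep a handle on the current probe; not the only one):

// keep a mutable reference to the probe most recently created by the
// factory; each restart produces a fresh TestPublisher and Source
var probe: TestPublisher.Probe[Int] = null

val restartingSource = RestartSource
  .onFailuresWithBackoff(1.second, 1.minute, 0d) { () =>
    probe = TestPublisher.probe[Int]()
    Source.fromPublisher(probe)
  }

Because the restart happens asynchronously after the backoff, the test should wait for the fresh probe (for example, by asserting inside eventually that the reference has changed) before calling sendNext(3) on it.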

How to produce a Traversable with cats-effect's IO given an async that will call multiple times

What I'm really trying to do is monitor multiple files, and when any of them is modified, update some state and produce a side effect using this state. I imagine what I want is a scan over a Traversable that produces a Traversable[IO[_]], but I don't see the path there.
As a minimal attempt to produce this, I wrote:
package example

import better.files.{File, FileMonitor}
import cats.implicits._
import com.monovore.decline._
import cats.effect.IO
import java.nio.file.{Files, Path}
import scala.concurrent.ExecutionContext.Implicits.global

object Hello extends CommandApp(
  name = "cats-effects-playground",
  header = "welcome",
  main = {
    val filesOpts = Opts.options[Path]("input", help = "input files")

    filesOpts.map { files =>
      IO.async[File] { cb =>
        val watchers = files.map { path =>
          new FileMonitor(path, recursive = false) {
            override def onModify(file: File, count: Int) = cb(Right(file))
          }
        }
        watchers.toList.foreach(_.start)
      }
      .flatMap(f => IO { println(f) })
      .unsafeRunSync
    }
  }
)
but this has two major flaws. One, it creates a thread for each file I'm watching, which is a little heavy. But more importantly, the program finishes as soon as a single file is modified, even though onModify would be called more times if the program kept running.
I'm not married to using better-files; it just seemed like the path of least resistance. But I do require using Cats IO.
This solution doesn't solve the issue of creating a bunch of threads, and it doesn't strictly produce a Traversable, but it solves the underlying use case. I'm very open to this being critiqued and a better solution provided.
package example

import better.files.{File, FileMonitor}
import cats.implicits._
import com.monovore.decline._
import cats.effect.IO
import java.nio.file.{Files, Path}
import java.util.concurrent.LinkedBlockingQueue
import scala.concurrent.ExecutionContext.Implicits.global

object Hello extends CommandApp(
  name = "cats-effects-playground",
  header = "welcome",
  main = {
    val filesOpts = Opts.options[Path]("input", help = "input files")

    filesOpts.map { files =>
      val bq: LinkedBlockingQueue[IO[File]] = new LinkedBlockingQueue()

      val watchers = files.map { path =>
        new FileMonitor(path, recursive = false) {
          override def onModify(file: File, count: Int) = bq.put(IO(file))
        }
      }

      def ioLoop(): IO[Unit] = bq.take()
        .flatMap(f => IO(println(f)))
        .flatMap(_ => ioLoop())

      watchers.toList.foreach(_.start)
      ioLoop.unsafeRunSync
    }
  }
)
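One caveat with this shape: bq.take() blocks the calling thread eagerly when ioLoop() is invoked, before the resulting IO is even run. A sketch of one possible variant that suspends the dequeue inside IO:

// suspend the blocking take() so it happens when the IO runs; the queue
// holds IO[File] values, so the suspended result is IO[IO[File]] and
// needs flattening (flatten syntax comes from cats.implicits._)
def ioLoop(): IO[Unit] =
  IO(bq.take()).flatten
    .flatMap(f => IO(println(f)))
    .flatMap(_ => ioLoop())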

Fetch size in PGConnection.getNotifications

A function in my postgresql database sends a notification when a table is updated.
I'm polling that PostgreSQL database with scalikejdbc to get all the notifications and then do something with them.
The process is explained here. It's a typical reactive system for SQL table updates.
I get the PGConnection from the java.sql.Connection, and after that I get the notifications this way:
val notifications = Option(pgConnection.getNotifications).getOrElse(Array[PGNotification]())
I'm trying to get the notifications in chunks of 1000 by setting the fetch size to 1000 and disabling auto commit. But the fetch size property is ignored.
Any ideas how I could do that?
I wouldn't want to handle hundreds of thousands of notifications in a single map over my notifications dataset.
pgConnection.getNotifications.size could be huge, and therefore, this code wouldn't scale well.
Thanks!!!
To scale better, consider using postgresql-async and Akka Streams: the former is a library that can obtain PostgreSQL notifications asynchronously, and the latter is a Reactive Streams implementation that provides backpressure (which would obviate the need for paging). For example:
import akka.actor._
import akka.pattern.pipe
import akka.stream._
import akka.stream.scaladsl._
import com.github.mauricio.async.db.postgresql.PostgreSQLConnection
import com.github.mauricio.async.db.postgresql.util.URLParser
import scala.concurrent.duration._
import scala.concurrent.Await

class DbActor(implicit materializer: ActorMaterializer) extends Actor with ActorLogging {
  private implicit val ec = context.system.dispatcher

  // a queue-backed stream: notifications are offered to the queue and
  // flow to the sink with backpressure
  val queue =
    Source.queue[String](Int.MaxValue, OverflowStrategy.backpressure)
      .to(Sink.foreach(println))
      .run()

  val configuration = URLParser.parse("jdbc:postgresql://localhost:5233/my_db?user=dbuser&password=pwd")
  val connection = new PostgreSQLConnection(configuration)

  Await.result(connection.connect, 5.seconds)

  connection.sendQuery("LISTEN my_channel")
  connection.registerNotifyListener { message =>
    val msg = message.payload
    log.debug("Sending the payload: {}", msg)
    self ! msg
  }

  def receive = {
    case payload: String =>
      queue.offer(payload).pipeTo(self)
    case QueueOfferResult.Dropped =>
      log.warning("Dropped a message.")
    case QueueOfferResult.Enqueued =>
      log.debug("Enqueued a message.")
    case QueueOfferResult.Failure(t) =>
      log.error("Stream failed: {}", t.getMessage)
    case QueueOfferResult.QueueClosed =>
      log.debug("Stream closed.")
  }
}
The code above simply prints notifications from PostgreSQL as they occur; you can replace the Sink.foreach(println) with another Sink. To run it:
import akka.actor._
import akka.stream.ActorMaterializer

object Example extends App {
  implicit val system = ActorSystem()
  implicit val materializer = ActorMaterializer()
  system.actorOf(Props(classOf[DbActor], materializer))
}
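Tying this back to the original wish for chunks of 1000: a sketch of replacing the printing sink with a batching stage via groupedWithin (the 1000 and 1.second figures are illustrative):

// batch up to 1000 notifications, or whatever has arrived within 1 second
val queue =
  Source.queue[String](Int.MaxValue, OverflowStrategy.backpressure)
    .groupedWithin(1000, 1.second)
    .to(Sink.foreach(batch => println(s"Processing ${batch.size} notifications")))
    .run()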

AkkaHttp: Process incoming requests in parallel with multiple processes

Using AkkaHttp with Scala, the following code provides an endpoint for /api/endpoint/{DoubleNumber}. Querying this endpoint triggers a heavy computation and then returns the result as application/json.
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model._
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer

object Run {
  def main(args: Array[String]) = {
    implicit val system = ActorSystem("myApi")
    implicit val materializer = ActorMaterializer()
    implicit val executionContext = system.dispatcher

    val e = get {
      path("api" / "endpoint" / DoubleNumber) { myNumberArgument =>
        val result = someHeavyComputation(myNumberArgument)
        complete(HttpEntity(ContentTypes.`application/json`, result.toString))
      }
    }
  }
}
If one sends several concurrent requests from, say, a browser's console, the above code will wait for each request to be completed (and the response returned) before starting to handle the next one.
How can I fix the above code to make it work in parallel, in other words to launch an additional process for each incoming request even if previous requests are still being processed?
Looks like I came to an answer.
If you have the same problem, simply call someHeavyComputation from inside the complete() block, not before:
val e = get {
  path("api" / "endpoint" / DoubleNumber) { myNumberArgument =>
    complete {
      val result = someHeavyComputation(myNumberArgument)
      HttpEntity(ContentTypes.`application/json`, result.toString)
    }
  }
}
New processes will be launched as necessary.
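If you prefer the offloading to be explicit rather than relying on the by-name complete block, a variant (a sketch, reusing the question's someHeavyComputation placeholder) completes with a Future, which Akka HTTP can marshal; the computation then runs on the dispatcher:

import scala.concurrent.Future

val e = get {
  path("api" / "endpoint" / DoubleNumber) { myNumberArgument =>
    complete {
      // runs on the implicit executionContext instead of the routing layer
      Future {
        val result = someHeavyComputation(myNumberArgument)
        HttpEntity(ContentTypes.`application/json`, result.toString)
      }
    }
  }
}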

Akka Flow hangs when making http requests via connection pool

I'm using Akka 2.4.4 and trying to move from Apache HttpAsyncClient (unsuccessfully).
Below is a simplified version of the code I use in my project.
The problem is that it hangs if I send more than 1-3 requests to the flow. So far, after 6 hours of debugging, I couldn't even locate the problem. I see no exceptions, no error logs, no events in the Decider. NOTHING :)
I tried reducing the connection-timeout setting to 1s, thinking that maybe it's waiting for a response from the server, but that didn't help.
What am I doing wrong?
import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.headers.Referer
import akka.http.scaladsl.model.{HttpRequest, HttpResponse}
import akka.http.scaladsl.settings.ConnectionPoolSettings
import akka.stream.Supervision.Decider
import akka.stream.scaladsl.{Sink, Source}
import akka.stream.{ActorAttributes, ActorMaterializer, Supervision}
import com.typesafe.config.ConfigFactory
import scala.collection.immutable.{Seq => imSeq}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.util.Try

object Main {
  implicit val system = ActorSystem("root")
  implicit val materializer = ActorMaterializer() // needed to run the streams
  implicit val executor = system.dispatcher
  val config = ConfigFactory.load()

  private val baseDomain = "www.google.com"
  private val poolClientFlow =
    Http()(system).cachedHostConnectionPool[Any](baseDomain, 80, ConnectionPoolSettings(config))

  private val decider: Decider = {
    case ex =>
      ex.printStackTrace()
      Supervision.Stop
  }

  private def sendMultipleRequests[T](items: Seq[(HttpRequest, T)]): Future[Seq[(Try[HttpResponse], T)]] =
    Source.fromIterator(() => items.toIterator)
      .via(poolClientFlow)
      .log("Logger")(log = myAdapter)
      .recoverWith {
        case ex =>
          println(ex)
          null
      }
      .withAttributes(ActorAttributes.supervisionStrategy(decider))
      .runWith(Sink.seq)
      .map { v =>
        println(s"Got ${v.length} responses in Flow")
        v.asInstanceOf[Seq[(Try[HttpResponse], T)]]
      }

  def main(args: Array[String]) {
    val headers = imSeq(Referer("https://www.google.com/"))
    val reqPair = HttpRequest(uri = "/intl/en/policies/privacy").withHeaders(headers) -> "some req ID"
    val requests = List.fill(10)(reqPair)

    val qwe = sendMultipleRequests(requests).map { responses =>
      println(s"Got ${responses.length} responses")
      system.terminate()
    }

    Await.ready(system.whenTerminated, Duration.Inf)
  }
}
Also, what's up with proxy support? That doesn't seem to work for me either.
You need to consume the body of the response fully so that the connection is made available for subsequent requests. If you don't care about the response entity at all, then you can just drain it to a Sink.ignore, something like this:
resp.entity.dataBytes.runWith(Sink.ignore)
By default, when using a host connection pool, the maximum number of connections is 4. Each pool has its own queue where requests wait until one of the open connections becomes available. If that queue ever goes over 32 (the default; it can be changed, but must be a power of 2), then you will start seeing failures. In your case you only send 10 requests, so you don't hit that limit. But by not consuming the response entity you don't free up the connection, and everything else just queues up behind it, waiting for the connections to free up.
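Applied to the question's sendMultipleRequests, a sketch of where that draining could go (the mapAsync parallelism of 4 is chosen to match the default pool size; adjust as needed):

import scala.util.Success

Source.fromIterator(() => items.toIterator)
  .via(poolClientFlow)
  .mapAsync(4) {
    case (Success(resp), id) =>
      // drain the entity so the pooled connection is released, then
      // pass the response and its correlation id along unchanged
      resp.entity.dataBytes.runWith(Sink.ignore).map(_ => (Success(resp), id))
    case failed =>
      Future.successful(failed)
  }
  .runWith(Sink.seq)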