Akka HTTP. Streaming source from callback - scala

I am trying to connect Akka HTTP to an old Java library. The library has two methods: one accepts a callback function that receives a string, and the other signals the end of the data stream. The callback that receives the data can be called multiple times. Consider this snippet:
oldJavaLib.receiveData((s: String) => {
  println("received:" + s)
})
oldJavaLib.dataEnd(() => {
  println("data transmission is over")
})
I want to stream the data using Akka HTTP as it's being received by the callback function, but I am not sure what the best way to go about that is.
I was thinking of creating a stream and then using it directly in an HTTP route like this:
def fetchUsers(): Source[User, NotUsed] = Source.fromIterator(() => Iterator.fill(1000000) {
  val id = Random.nextInt()
  dummyUser(id.toString)
})
lazy val routes: Route =
  pathPrefix("test") {
    concat(
      pathEnd {
        concat(
          get {
            complete(fetchUsers())
          }
        )
      }
    )
  }
The fetchUsers() function should return a stream that gets its data from a legacy Java API. Maybe there is a better approach.

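I assume that you want to create an Akka stream that emits values from a callback? You can use Source.queue. For the first callback it would be:
import akka.stream.QueueOfferResult.Enqueued
import akka.stream.scaladsl.{Keep, Sink, Source}
// Requires an implicit ActorSystem (or Materializer) in scope.
val queue = Source.queue[String](bufferSize = 1000)
  .toMat(Sink.ignore)(Keep.left)
  .run()
oldJavaLib.receiveData((s: String) => {
  // offer returns immediately with the outcome of the enqueue attempt
  queue.offer(s) match {
    case Enqueued => println("received:" + s)
    case _        => println("failed to enqueue:" + s)
  }
})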
Edit after question clarification
If you want to use the source in an HTTP route, you have to prematerialize it. Referring to my previous code, it would look like this:
val (queue, source) = Source.queue[String](bufferSize = 1000).preMaterialize()
source can then be used in any route.
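The wiring described above can be sketched roughly as follows. This is a minimal sketch, assuming Akka 2.6.11 or later (where `Source.queue(bufferSize)` materializes a `BoundedSourceQueue` whose `offer` returns its result synchronously). `LegacyLib` and its `simulate` method are hypothetical stand-ins for the old Java library, and the `dataEnd` callback is assumed to be the right place to complete the queue.

```scala
import akka.actor.ActorSystem
import akka.stream.QueueOfferResult
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical stand-in for the legacy library: it stores the two callbacks
// and lets us simulate incoming data followed by the end-of-data signal.
class LegacyLib {
  private var onData: String => Unit = _ => ()
  private var onEnd: () => Unit = () => ()
  def receiveData(cb: String => Unit): Unit = { onData = cb }
  def dataEnd(cb: () => Unit): Unit = { onEnd = cb }
  def simulate(items: Seq[String]): Unit = { items.foreach(onData); onEnd() }
}

object CallbackSourceSketch {
  def run(items: Seq[String]): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("callback-source")
    try {
      val lib = new LegacyLib
      // queue is the producer side, source is what a consumer (e.g. a route) uses
      val (queue, source) = Source.queue[String](bufferSize = 1000).preMaterialize()

      lib.receiveData { s =>
        queue.offer(s) match {
          case QueueOfferResult.Enqueued => ()
          case other                     => println(s"failed to enqueue $s: $other")
        }
      }
      // End of data: complete the stream so the consumer terminates too.
      lib.dataEnd(() => queue.complete())

      // Attach the consumer first, then let the library start calling back.
      val collected = source.runWith(Sink.seq)
      lib.simulate(items)
      Await.result(collected, 5.seconds)
    } finally system.terminate()
  }
}
```

In a route, `source` could be handed to `complete` instead of `Sink.seq`, for example wrapped in a chunked `HttpEntity` of `ByteString`s. Because the queue is materialized only once, every request using `source` shares the same underlying stream, so a queue per request may be more appropriate if each client should see all of the data.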

Related

How to use Futures within Kafka Streams

When using the org.apache.kafka.streams.KafkaStreams library in Scala, I have been trying to read from an input stream, pass each value to a method validateAll(infoToValidate) that returns a Future, resolve that Future, and then send the result to an output stream.
Example:
builder.stream[String, Object](REQUEST_TOPIC)
  .mapValues(v => ValidateFormat.from(v.asInstanceOf[GenericRecord]))
  .mapValues(infoToValidate => {
    SuccessFailFormat.to(validateAll(infoToValidate))
  })
Is there any documentation on how to do this? I have looked into filter() and transform() but am still not sure how to deal with Futures in KStreams.
The answer depends on whether you need to preserve the original order of your messages. If so, you will have to block in one way or another. For example:
import scala.concurrent.Await
import scala.concurrent.duration._
val duration = 10.seconds // whatever your timeout should be, or Duration.Inf
sourceStream
  .mapValues(x => Await.result(validate(x), duration))
  .to(outputTopic)
If, however, the order is not important, you can simply use a Kafka producer:
sourceStream
  .mapValues(x => validate(x)) // now you have KStream[.., Future[...]]
  .foreach { (key, future) =>
    // needs an implicit ExecutionContext in scope
    future.foreach { item =>
      val record = new ProducerRecord(outputTopic, key, item)
      producer.send(record) // provided you have the implicit serializer
    }
  }

Scala & Play Websockets: Storing messages exchanged

I started playing around with Scala and came across this boilerplate for a WebSocket chat room.
They use MergeHub.source and BroadcastHub.sink as their Source and Sink for sending messages to all connected clients.
The example works fine for exchanging messages as it is.
private val (chatSink, chatSource) = {
  // Don't log MergeHub$ProducerFailed as error if the client disconnects.
  // recoverWithRetries -1 is essentially "recoverWith"
  val source = MergeHub.source[WSMessage]
    .log("source")
    .recoverWithRetries(-1, { case _: Exception ⇒ Source.empty })
  val sink = BroadcastHub.sink[WSMessage]
  source.toMat(sink)(Keep.both).run()
}
private val userFlow: Flow[WSMessage, WSMessage, _] = {
  Flow.fromSinkAndSource(chatSink, chatSource)
}
def chat(): WebSocket = {
  WebSocket.acceptOrResult[WSMessage, WSMessage] {
    case rh if sameOriginCheck(rh) =>
      Future.successful(userFlow).map { flow =>
        Right(flow)
      }.recover {
        case e: Exception =>
          val msg = "Cannot create websocket"
          logger.error(msg, e)
          val result = InternalServerError(msg)
          Left(result)
      }
    case rejected =>
      logger.error(s"Request ${rejected} failed same origin check")
      Future.successful {
        Left(Forbidden("forbidden"))
      }
  }
}
I want to store the messages that are exchanged in the chatroom in a DB.
I tried adding map and fold functions to the source and sink to get hold of the messages that are sent, but I wasn't able to.
I tried adding a Flow stage between MergeHub and BroadcastHub like below:
val flow = Flow[WSMessage].map(element => println(s"Message: $element"))
source.via(flow).toMat(sink)(Keep.both).run()
But it fails with a compilation error saying that toMat cannot be referenced with such a signature.
Can someone help or point me to how I can get hold of the messages that are sent and store them in a DB?
Link for full template:
https://github.com/playframework/play-scala-chatroom-example
Let's look at your flow:
val flow = Flow[WSMessage].map(element => println(s"Message: $element"))
It takes elements of type WSMessage and emits the result of println, which is Unit. Here it is again with its type spelled out:
val flow: Flow[WSMessage, Unit, NotUsed] = Flow[WSMessage].map(element => println(s"Message: $element"))
This clearly cannot work, as the sink expects WSMessage and not Unit.
Here's how you can fix the above problem:
val flow = Flow[WSMessage].map { element =>
  println(s"Message: $element")
  element
}
Note that for persisting messages in the database, you will most likely want to use an async stage, roughly:
val flow = Flow[WSMessage].mapAsync(parallelism) { element =>
  println(s"Message: $element")
  // assuming DB.write() returns a Future[Unit]
  DB.write(element).map(_ => element)
}
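A self-contained sketch of the async persistence stage might look like the following. `WSMessage` is modelled here as a hypothetical case class, and `InMemoryDb` is a made-up stand-in for a real database client whose `write` returns a `Future[Unit]`; neither comes from the chat room template.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}

import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object PersistingFlowSketch {
  final case class WSMessage(text: String) // hypothetical message type

  // Hypothetical database client with an asynchronous write.
  class InMemoryDb(implicit ec: ExecutionContext) {
    private val stored = new ConcurrentLinkedQueue[WSMessage]()
    def write(m: WSMessage): Future[Unit] = Future { stored.add(m); () }
    def size: Int = stored.size
  }

  def run(messages: List[WSMessage]): (Seq[WSMessage], Int) = {
    implicit val system: ActorSystem = ActorSystem("persisting-flow")
    import system.dispatcher
    try {
      val db = new InMemoryDb
      // Each element is passed downstream only after its write has completed;
      // mapAsync emits results in the order of the incoming elements.
      val persist = Flow[WSMessage].mapAsync(parallelism = 4) { m =>
        db.write(m).map(_ => m)
      }
      val out = Await.result(Source(messages).via(persist).runWith(Sink.seq), 5.seconds)
      (out, db.size)
    } finally system.terminate()
  }
}
```

In the chat room, a flow like `persist` would sit between the MergeHub source and the BroadcastHub sink via `source.via(persist).toMat(sink)(Keep.both).run()`.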

How can I use and return Source queue to caller without materializing it?

I'm trying out the new Akka Streams and wonder how I can use a Source queue and return it to the caller without materializing it in my code.
Imagine we have a library that makes a number of async calls and returns the results via a Source. The function looks like this:
def findArticlesByTitle(text: String): Source[String, SourceQueue[String]] = {
  val source = Source.queue[String](100, backpressure)
  source.mapMaterializedValue { case queue =>
    val url = s"http://.....&term=$text"
    httpclient.get(url).map(httpResponseToSprayJson[SearchResponse]).map { v =>
      v.idlist.foreach { id =>
        queue.offer(id)
      }
      queue.complete()
    }
  }
  source
}
and caller might use it like this
// There is implicit ActorMaterializer somewhere
val stream = plugin.findArticlesByTitle(title)
val results = stream.runFold(List[String]())((result, article) => article :: result)
When I run this, the code within mapMaterializedValue is never executed.
I can't understand why I don't have access to the SourceQueue instance if it should be up to the caller to decide how to materialize the source.
How should I implement this?
In your code example you're returning source instead of the return value of source.mapMaterializedValue (the method call doesn't mutate the Source object).
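A minimal sketch of the corrected shape, assuming Akka 2.6.11 or later and the bounded `Source.queue(bufferSize)` variant rather than the `backpressure` overload used in the question. The ids are passed in directly as a hypothetical replacement for the HTTP search call, and the materialized value is mapped to `NotUsed` on the assumption that the caller does not need the queue.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object MaterializedValueSketch {
  // Hypothetical stand-in for the asynchronous search: the ids are known up front.
  def findIds(ids: List[String]): Source[String, NotUsed] =
    Source
      .queue[String](bufferSize = 100)
      .mapMaterializedValue { queue =>
        // Runs once per materialization, i.e. when the caller runs the stream.
        ids.foreach(queue.offer)
        queue.complete()
        NotUsed
      } // the Source returned by mapMaterializedValue is the method's result

  def run(ids: List[String]): Seq[String] = {
    implicit val system: ActorSystem = ActorSystem("materialized-value")
    try Await.result(findIds(ids).runWith(Sink.seq), 5.seconds)
    finally system.terminate()
  }
}
```

The caller still decides when and how to materialize: nothing is offered to the queue until `runWith` (or `runFold`, as in the question) is called.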

With play and akka-https: How to properly chain multiple requests to the incoming request to create a response?

I tried to get a small Play app communicating with another REST service.
The idea is to receive a request on the Play side, make a request to the REST API, and feed parts of the result to another local actor before displaying the responses from the local actor and the REST service in the browser.
This image shows how
I tried to do it with streams. I got it all working, but I am not at all happy with the part where I talk to my local actor and create a Future[(Future[String], Future[String])] tuple, so I would be grateful if you could point me toward an elegant and clean way to do this.
So here is my code. The input is a CSV file.
My local actor creates an additional graphic that I want to put into the response.
def upload = Action.async(parse.multipartFormData) { request =>
  request.body.file("input").map { inputCsv =>
    // csv to list of strings
    val inputList: List[String] = convertFileToList(inputCsv)
    // http request to rest service
    val responseFuture: Future[HttpResponse] = httpRequest(inputList, "/path", 4321, "0.0.0.0")
    // pattern match response and ask local actor
    val formattedResult = responseFuture.flatMap { response =>
      response.status match {
        case akka.http.scaladsl.model.StatusCodes.OK =>
          val resultTeams = Unmarshal(response.entity).to[CustomResultCaseClass]
          // the part I'd like to improve
          val tupleFuture = resultTeams.map(result =>
            (Future(result.teams.reduce(_ + "," + _)),
              plotter.ask(PlotData(result.eval)).mapTo[ChartPath].flatMap(plotAnswer => Future(plotAnswer.path))))
          tupleFuture.map(tuple => tuple._1.map(teams =>
            tuple._2.map(chartPath => Ok(views.html.upload(teams))(chartPath)))).flatMap(a => a).flatMap(b => b)
      }
    }
    formattedResult
  }.getOrElse(Future(play.api.mvc.Results.BadRequest))
}
For comprehensions are useful for this type of use case. A basic example that demonstrates the refactoring involved:
val teamFut = Future(result.teams.reduce(_ + "," + _))
// The question's code used .flatMap(plotAnswer => Future(plotAnswer.path));
// .map(_.path) gives the same result without the extra Future
val pathFut = plotter.ask(PlotData(result.eval))
  .mapTo[ChartPath]
  .map(_.path)
val okFut =
  for {
    teams <- teamFut
    chartPath <- pathFut
  } yield Ok(views.html.upload(teams))(chartPath)
Note: the initial Futures should be instantiated outside of the for comprehension; otherwise they will not execute in parallel.
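The difference between creating the Futures before the for comprehension and inside it can be seen in a small standard-library sketch. `slow` is a hypothetical stand-in for the two asynchronous steps, and the 300 ms sleep is arbitrary.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelFuturesSketch {
  // Hypothetical slow computation standing in for the team list and the chart path.
  def slow(value: String): Future[String] = Future { Thread.sleep(300); value }

  // Both Futures are created first, so both are already running when the for starts.
  def parallel(): Future[String] = {
    val teamsF = slow("teamA,teamB")
    val pathF = slow("/charts/1.png")
    for { teams <- teamsF; path <- pathF } yield s"$teams -> $path"
  }

  // The second Future is created inside the for, so it starts only after the first completes.
  def sequential(): Future[String] =
    for { teams <- slow("teamA,teamB"); path <- slow("/charts/1.png") } yield s"$teams -> $path"

  // Runs the given Future-producing expression and returns its result and elapsed milliseconds.
  def timed[A](f: => Future[A]): (A, Long) = {
    val start = System.nanoTime()
    val result = Await.result(f, 5.seconds)
    (result, (System.nanoTime() - start) / 1000000)
  }
}
```

Both variants yield the same value; with at least two threads available to the execution context, `timed(parallel())` should take roughly one sleep interval while `timed(sequential())` takes about two.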

How to handle multiple Promises in an (akka) Actor?

I have an Akka actor responsible for handling HTTP calls. I use Scala dispatch to send multiple HTTP requests to an API:
urls.foreach { u =>
  val service = url(u)
  val promise = Http(service OK as.String).either
  for (p <- promise) {
    p match {
      case Left(error) =>
        faultHandler(error)
      case Right(result) =>
        resultHandler(result)
    }
  }
}
In the resultHandler function, I increment an instance variable nbOfResults and compare it to the number of calls I have made.
def resultHandler(result: String): Unit = {
  this.nbOfResults += 1
  ...
  if (nbOfResults == nbOfCalls) {
    // Do something
  }
}
Is it safe? Could the nbOfResults variable be accessed at the same time if two calls return their results simultaneously?
For now, I believe that the actor is more or less equivalent to a thread and therefore the callback functions are not executed concurrently. Is that correct?
Here is a variant of Alexey Romanov's response using only dispatch:
// promises will be of type Array[Promise[Either[Throwable, String]]]
val promises = urls.map { u =>
  val service = url(u)
  Http(service OK as.String).either
}
// Http.promise.all transforms an Iterable[Promise[A]] into a Promise[Iterable[A]]
// So listPromise is now of type Promise[Array[Either[Throwable, String]]]
val listPromise = Http.promise.all(promises)
for (results <- listPromise) {
  // Here results is of type Array[Either[Throwable, String]]
  results foreach { result =>
    result match {
      case Left(error) => // Handle error
      case Right(response) => // Handle response
    }
  }
}
There is a far better way:
val promises = urls.map { u =>
  val service = url(u)
  Http(service OK as.String).either // last expression, so map collects the promises
}
val listPromise = Future.sequence(promises)
listPromise.onComplete { whatever }
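The same idea can be sketched with standard-library Futures only. `fetch` is a hypothetical stand-in for the dispatch call (here, any URL containing "bad" fails), and each result is wrapped in an `Either` so that one failed request does not fail the combined Future.

```scala
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object SequenceSketch {
  // Hypothetical stand-in for an HTTP call; URLs containing "bad" fail.
  def fetch(url: String): Future[String] =
    if (url.contains("bad")) Future.failed(new RuntimeException(s"failed: $url"))
    else Future.successful(s"body of $url")

  // Turns a list of independent calls into one Future holding all outcomes, in input order.
  def fetchAll(urls: List[String]): Future[List[Either[Throwable, String]]] =
    Future.sequence(
      urls.map { u =>
        fetch(u)
          .map[Either[Throwable, String]](Right(_)) // success becomes Right
          .recover { case e => Left(e) }            // failure becomes Left
      }
    )
}
```

A caller would then register a single `onComplete` (or `foreach`) on the result of `fetchAll` instead of counting individual completions.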
I agree with Alexey Romanov's answer. Whichever way you choose to synchronize your HTTP requests, be careful how you process the completion of the promises. Your intuition is correct that concurrent access to the actor's state may occur. A better way to handle this would be something like:
def resultHandler(result: String): Unit = {
  // on completion we send the result, as a message, to the actor that triggered the call
  self ! HttpComplete(result)
}
and in the actor's receive function:
def receive = {
  // PROCESS OTHER MESSAGES HERE
  case HttpComplete(result) => // do something with the result
}
This way, the HTTP results are processed inside the actor's receive loop rather than modifying the actor's state from outside, which is the proper way to do it.
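A compact sketch of this pattern with a classic Akka actor follows. The `fetch` function is a hypothetical stand-in for the dispatch call, the `Fetch` message and the `Promise` used to report the final count are assumptions added so the example can run on its own, and failed requests are simply ignored here.

```scala
import akka.actor.{Actor, ActorSystem, Props}

import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

object ActorStateSketch {
  final case class Fetch(urls: Seq[String])
  final case class HttpComplete(result: String)

  // `fetch` stands in for the HTTP client; `done` reports the final count to the outside.
  class Collector(fetch: String => Future[String], done: Promise[Int]) extends Actor {
    import context.dispatcher
    private var nbOfCalls = 0
    private var nbOfResults = 0

    def receive: Receive = {
      case Fetch(urls) =>
        nbOfCalls = urls.size
        // The callback may run on another thread, so it only sends a message.
        urls.foreach(u => fetch(u).foreach(r => self ! HttpComplete(r)))
      case HttpComplete(_) =>
        // Messages are processed one at a time, so the counter needs no locking.
        nbOfResults += 1
        if (nbOfResults == nbOfCalls) done.success(nbOfResults)
    }
  }

  def run(urls: Seq[String]): Int = {
    val system = ActorSystem("actor-state")
    try {
      val done = Promise[Int]()
      val fetch = (u: String) => Future.successful(s"body of $u")
      system.actorOf(Props(new Collector(fetch, done))) ! Fetch(urls)
      Await.result(done.future, 5.seconds)
    } finally system.terminate()
  }
}
```

The mutable counters are touched only inside `receive`, which is what makes the plain `var`s safe in this sketch.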
val nbOfResults = new java.util.concurrent.atomic.AtomicInteger(nbOfCalls)
// After particular call was ended
if (nbOfResults.decrementAndGet <= 0) {
// Do something
}
[EDIT] Removed old answer with AtomicReference CAS - while(true), compareAndSet, etc