How to bind akka http with akka streams? - scala

I'm trying to use streams instead of pure actors to handle http requests and I came with the following code:
trait ImagesRoute {
val log = LoggerFactory.getLogger(this.getClass)
implicit def actorRefFactory: ActorRefFactory
implicit def materializer: ActorMaterializer
val source =
Source
.actorRef[Image](Int.MaxValue, OverflowStrategy.fail)
.via(Flow[Image].mapAsync(1)(ImageRepository.add))
.toMat(Sink.asPublisher(true))(Keep.both)
val route = {
pathPrefix("images") {
pathEnd {
post {
entity(as[Image]) { image =>
val (ref, publisher) = source.run()
val addFuture = Source.fromPublisher(publisher)
val future = addFuture.runWith(Sink.head[Option[Image]])
ref ! image
onComplete(future.mapTo[Option[Image]]) {
case Success(img) =>
complete(Created, img)
case Failure(e) =>
log.error("Error adding image resource", e)
complete(InternalServerError, e.getMessage)
}
}
}
}
}
}
}
I'm not sure if this is the correct way to do that, or even if this is a good approach or if I should use an actor to interact with the route, using the ask pattern and then inside the actor, stream everything.
Any ideas?

If you're only expecting 1 image from the entity then you don't need to create a Source from an ActorRef and you don't need Sink.asPublisher, you can simply use Source.single:
def imageToComplete(img : Option[Image]) : StandardRoute =
img.map(i => complete(Created, i))
.getOrElse {
log error ("Error adding image resource", e)
complete(InternalServerError, e.getMessage
}
...
entity(as[Image]) { image =>
val future : Future[StandardRoute] =
Source.single(image)
.via(Flow[Image].mapAsync(1)(ImageRepository.add))
.runWith(Sink.head[Option[Image]])
.map(imageToComplete)
onComplete(future)
}
Simplifying your code further, the fact that you are only processing 1 image means that Streams are unnecessary since there is no need for backpressure with just 1 element:
val future : Future[StandardRoute] = ImageRepository.add(image)
.map(imageToComplete)
onComplete(future)
In the comments you indicated
"this is just a simple example, but the stream pipeline should be
bigger doing a lot of things like contacting external resources and
eventually back pressure things"
This would only apply if your entity was a stream of images. If you're only ever processing 1 image per HttpRequest then backpressure never applies, and any stream you create will be a slower version of a Future.
If your entity is in fact a stream of Images, then you could use it as part of stream:
val byteStrToImage : Flow[ByteString, Image, _] = ???
val imageToByteStr : Flow[Image, Source[ByteString], _] = ???
def imageOptToSource(img : Option[Image]) : Source[Image,_] =
Source fromIterator img.toIterator
val route = path("images") {
post {
extractRequestEntity { reqEntity =>
val stream = reqEntity.via(byteStrToImage)
.via(Flow[Image].mapAsync(1)(ImageRepository.add))
.via(Flow.flatMapConcat(imageOptToSource))
.via(Flow.flatMapConcat(imageToByteStr))
complete(HttpResponse(status=Created,entity = stream))
}
}
}

Related

How to integrate akka streams kafka (reactive-kafka) into akka http application?

I have a basic scala akka http CRUD application. See below for the relevant classes.
I'd simply like to write an entity id and some data (as json) to a Kafka topic whenever, for example, an entity is created/updated.
I'm looking at http://doc.akka.io/docs/akka-stream-kafka/current/producer.html, but am new to scala and akka, and unsure of how to integrate it into my application?
For example, from the docs above, this is the example of a producer writing to kafka, so I think I need to something similar, but whereabouts in my application should this go? Can I just add another map call in the create method in my service after I have created the user?
Many thanks!
val done = Source(1 to 100)
.map(_.toString)
.map { elem =>
new ProducerRecord[Array[Byte], String]("topic1", elem)
}
.runWith(Producer.plainSink(producerSettings))
Or do I need to do something like the example here https://github.com/hseeberger/accessus in the bindAndHandle() method in my Server.scala?
WebServer.scala
object System {
implicit val system = ActorSystem()
implicit val dispatcher = system.dispatcher
implicit val actorMaterializer = ActorMaterializer()
}
object WebServer extends App {
import System._
val config = new ApplicationConfig() with ConfigLoader
ConfigurationFactory.setConfigurationFactory(new LoggingConfFileConfigurationFactory(config.loggingConfig))
val injector = Guice.createInjector(new MyAppModule(config))
val routes = injector.getInstance(classOf[Router]).routes
Http().bindAndHandle(routes, config.httpConfig.interface, config.httpConfig.port)
}
Router.scala
def routes(): Route = {
post {
entity(as[User]) { user =>
val createUser = userService.create(user)
onSuccess(createUser) {
case Invalid(y: NonEmptyList[Err]) => {
throw new ValidationException(y)
}
case Valid(u: User) => {
complete(ToResponseMarshallable((StatusCodes.Created, u)))
}
}
}
} ~
// More routes here, left out for example
}
Service.scala
def create(user: User): Future[MaybeValid[User]] = {
for {
validating <- userValidation.validateCreate(user)
result <- validating match {
case Valid(x: User) =>
userRepo.create(x)
.map(dbUser => Valid(UserConverters.fromUserRow(x)))
case y: DefInvalid =>
Future{y}
}
} yield result
}
Repo.scala
def create(user: User): Future[User] = {
mutateDbProvider.db.run(
userTable returning userTable.map(_.userId)
into ((user, id) => user.copy(userId = id)) +=
user.copy(createdDate = Some(Timestamp.valueOf(LocalDateTime.now())))
)
}
Since you have written your Route to unmarshall just 1 User from the Entity I don't think you need Producer.plainSink. Rather, I think Producer.send will work just as well. Also, as a side note, throwing exceptions is not "idiomatic" scala. So I changed the logic for invalid user:
val producer : KafkaProducer = new KafkaProducer(producerSettings)
val routes : Route =
post {
entity(as[User]) { user =>
val createUser = userService.create(user)
onSuccess(createUser) {
case Invalid(y: NonEmptyList[Err]) =>
complete(BadRequest -> "invalid user")
case Valid(u: User) => {
val producerRecord =
new ProducerRecord[Array[Byte], String]("topic1",s"""{"userId" : ${u.userId}, "entity" : "User"}""")
onComplete(producer send producerRecord) { _ =>
complete(ToResponseMarshallable((StatusCodes.Created, u)))
}
}
}
}
}

Scala & Play Websockets: Storing messages exchanged

I started playing around scala and came to this particular boilerplate of web socket chatroom in scala.
They use MessageHub.source() and BroadcastHub.sink() as their Source and Sink for sending the messages to all connected clients.
The example is working fine for exchanging messages as it is.
private val (chatSink, chatSource) = {
// Don't log MergeHub$ProducerFailed as error if the client disconnects.
// recoverWithRetries -1 is essentially "recoverWith"
val source = MergeHub.source[WSMessage]
.log("source")
.recoverWithRetries(-1, { case _: Exception ⇒ Source.empty })
val sink = BroadcastHub.sink[WSMessage]
source.toMat(sink)(Keep.both).run()
}
private val userFlow: Flow[WSMessage, WSMessage, _] = {
Flow.fromSinkAndSource(chatSink, chatSource)
}
def chat(): WebSocket = {
WebSocket.acceptOrResult[WSMessage, WSMessage] {
case rh if sameOriginCheck(rh) =>
Future.successful(userFlow).map { flow =>
Right(flow)
}.recover {
case e: Exception =>
val msg = "Cannot create websocket"
logger.error(msg, e)
val result = InternalServerError(msg)
Left(result)
}
case rejected =>
logger.error(s"Request ${rejected} failed same origin check")
Future.successful {
Left(Forbidden("forbidden"))
}
}
}
I want to store the messages that are exchanged in the chatroom in a DB.
I tried adding map and fold functions to source and sink to get hold of the messages that are sent but I wasn't able to.
I tried adding a Flow stage between MergeHub and BroadcastHub like below
val flow = Flow[WSMessage].map(element => println(s"Message: $element"))
source.via(flow).toMat(sink)(Keep.both).run()
But it throws a compilation error that cannot reference toMat with such signature.
Can someone help or point me how can I get hold of messages that are sent and store them in DB.
Link for full template:
https://github.com/playframework/play-scala-chatroom-example
Let's look at your flow:
val flow = Flow[WSMessage].map(element => println(s"Message: $element"))
It takes elements of type WSMessage, and returns nothing (Unit). Here it is again with the correct type:
val flow: Flow[Unit] = Flow[WSMessage].map(element => println(s"Message: $element"))
This will clearly not work as the sink expects WSMessage and not Unit.
Here's how you can fix the above problem:
val flow = Flow[WSMessage].map { element =>
println(s"Message: $element")
element
}
Not that for persisting messages in the database, you will most likely want to use an async stage, roughly:
val flow = Flow[WSMessage].mapAsync(parallelism) { element =>
println(s"Message: $element")
// assuming DB.write() returns a Future[Unit]
DB.write(element).map(_ => element)
}

Response from Server Scala with Play

I'm doing my first application with Scala and Play! Where I use a WebSocket exactly like they do in the chatroom example, so far I can send messages to my server but I don't seem to understand where I can "handle" this messages that I get from my client, also I wanted to know if I can send Json Arrays from my server to my client:
#Singleton
class HomeController #Inject()(cc: ControllerComponents)
(implicit actorSystem: ActorSystem,
mat: Materializer,
executionContext: ExecutionContext)
extends AbstractController(cc) {
private type WSMessage = String
private val logger = Logger(getClass)
private implicit val logging = Logging(actorSystem.eventStream, logger.underlyingLogger.getName)
// chat room many clients -> merge hub -> broadcasthub -> many clients
private val (chatSink, chatSource) = {
// Don't log MergeHub$ProducerFailed as error if the client disconnects.
// recoverWithRetries -1 is essentially "recoverWith"
val source = MergeHub.source[WSMessage]
.log("source")
.recoverWithRetries(-1, { case _: Exception ⇒ Source.empty })
val sink = BroadcastHub.sink[WSMessage]
source.toMat(sink)(Keep.both).run()
}
private val userFlow: Flow[WSMessage, WSMessage, _] = {
Flow.fromSinkAndSource(chatSink, chatSource)
}
def index: Action[AnyContent] = Action { implicit request: RequestHeader =>
val webSocketUrl = routes.HomeController.chat().webSocketURL()
logger.info(s"index: ")
Ok(views.html.index(webSocketUrl))
}
def chat(): WebSocket = {
WebSocket.acceptOrResult[WSMessage, WSMessage] {
case rh if sameOriginCheck(rh) =>
Future.successful(userFlow).map { flow =>
Right(flow)
}.recover {
case e: Exception =>
val msg = "Cannot create websocket"
logger.error(msg, e)
val result = InternalServerError(msg)
Left(result)
}
case rejected =>
logger.error(s"Request ${rejected} failed same origin check")
Future.successful {
Left(Forbidden("forbidden"))
}
}
}
}
Btw I'm sending the messages from my client through a Jquery function.
Edit The way I want to handle this messages is by passing them as parameters to a function, the function will return an Array of strings or integers which I want to return to the client
All you need to do is add a flow stage to your between your source and sink which can store in DB and return some other message to clients.
Here's how you can fix the above problem:
Do whatever you want with the received message here.
val flow = Flow[WSMessage].map { element =>
println(s"Message: $element")
element
}
Note that for persisting messages in the database, you will most likely want to use an async stage, roughly:
val flow = Flow[WSMessage].mapAsync(parallelism) { element =>
println(s"Message: $element")
// assuming DB.write() returns a Future[Unit]
DB.write(element).map(_ => element)
}
And finally, you need to add the flow stage to the stream pipeline.
source.via(flow).toMat(sink)(Keep.both).run()

How would you "connect" many independent graphs maintaining backpressure between them?

Continuing series of questions about akka-streams I have another problem.
Variables:
Single http client flow with throttling
Multiple other flows that want to use first flow simultaneously
Goal:
Single http flow is flow that makes requests to particular API that limits number of calls to it. Otherwise it bans me. Thus it's very important to maintain rate of request regardless of how many clients in my code use it.
There are number of other flows that want to make requests to mentioned API but I'd like to have backpressure from http flow. Normally you connect whole thing to one graph and it works. But it my case I have multiple graphs.
How would you solve it ?
My attempt to solve it:
I use Source.queue for http flow so that I can queue http requests and have throttling. Problem is that Future from SourceQueue.offer fails if I exceed number of requests. Thus somehow I need to "reoffer" when previously offered event completes. Thus modified Future from SourceQueue would backpressure other graphs (inside their mapAsync) that make http requests.
Here is how I implemented it
object Main {
implicit val system = ActorSystem("root")
implicit val executor = system.dispatcher
implicit val materializer = ActorMaterializer()
private val queueHttp = Source.queue[(String, Promise[String])](2, OverflowStrategy.backpressure)
.throttle(1, FiniteDuration(1000, MILLISECONDS), 1, ThrottleMode.Shaping)
.mapAsync(4) {
case (text, promise) =>
// Simulate delay of http request
val delay = (Random.nextDouble() * 1000 / 2).toLong
Thread.sleep(delay)
Future.successful(text -> promise)
}
.toMat(Sink.foreach({
case (text, p) =>
p.success(text)
}))(Keep.left)
.run
val futureDeque = new ConcurrentLinkedDeque[Future[String]]()
def sendRequest(value: String): Future[String] = {
val p = Promise[String]()
val offerFuture = queueHttp.offer(value -> p)
def addToQueue(future: Future[String]): Future[String] = {
futureDeque.addLast(future)
future.onComplete {
case _ => futureDeque.remove(future)
}
future
}
offerFuture.flatMap {
case QueueOfferResult.Enqueued =>
addToQueue(p.future)
}.recoverWith {
case ex =>
val first = futureDeque.pollFirst()
if (first != null)
addToQueue(first.flatMap(_ => sendRequest(value)))
else
sendRequest(value)
}
}
def main(args: Array[String]) {
val allFutures = for (v <- 0 until 15)
yield {
val res = sendRequest(s"Text $v")
res.onSuccess {
case text =>
println("> " + text)
}
res
}
Future.sequence(allFutures).onComplete {
case Success(text) =>
println(s">>> TOTAL: ${text.length} [in queue: ${futureDeque.size()}]")
system.terminate()
case Failure(ex) =>
ex.printStackTrace()
system.terminate()
}
Await.result(system.whenTerminated, Duration.Inf)
}
}
Disadvantage of this solution is that I have locking on ConcurrentLinkedDeque which is probably not that bad for rate of 1 request per second but still.
How would you solve this task?
We have an open ticket (https://github.com/akka/akka/issues/19478) and some ideas for a "Hub" stage which would allow for dynamically combining streams, but I'm afraid I cannot give you any estimate for when it will be done.
So that is how we, in the Akka team, would solve the task. ;)

Akka-Streams ActorPublisher does not receive any Request messages

I am trying to continuously read the wikipedia IRC channel using this lib: https://github.com/implydata/wikiticker
I created a custom Akka Publisher, which will be used in my system as a Source.
Here are some of my classes:
class IrcPublisher() extends ActorPublisher[String] {
import scala.collection._
var queue: mutable.Queue[String] = mutable.Queue()
override def receive: Actor.Receive = {
case Publish(s) =>
println(s"->MSG, isActive = $isActive, totalDemand = $totalDemand")
queue.enqueue(s)
publishIfNeeded()
case Request(cnt) =>
println("Request: " + cnt)
publishIfNeeded()
case Cancel =>
println("Cancel")
context.stop(self)
case _ =>
println("Hm...")
}
def publishIfNeeded(): Unit = {
while (queue.nonEmpty && isActive && totalDemand > 0) {
println("onNext")
onNext(queue.dequeue())
}
}
}
object IrcPublisher {
case class Publish(data: String)
}
I am creating all this objects like so:
def createSource(wikipedias: Seq[String]) {
val dataPublisherRef = system.actorOf(Props[IrcPublisher])
val dataPublisher = ActorPublisher[String](dataPublisherRef)
val listener = new MessageListener {
override def process(message: Message) = {
dataPublisherRef ! Publish(Jackson.generate(message.toMap))
}
}
val ticker = new IrcTicker(
"irc.wikimedia.org",
"imply",
wikipedias map (x => s"#$x.wikipedia"),
Seq(listener)
)
ticker.start() // if I comment this...
Thread.currentThread().join() //... and this I get Request(...)
Source.fromPublisher(dataPublisher)
}
So the problem I am facing is this Source object. Although this implementation works well with other sources (for example from local file), the ActorPublisher don't receive Request() messages.
If I comment the two marked lines I can see, that my actor has received the Request(count) message from my flow. Otherwise all messages will be pushed into the queue, but not in my flow (so I can see the MSG messages printed).
I think it's something with multithreading/synchronization here.
I am not familiar enough with wikiticker to solve your problem as given. One question I would have is: why is it necessary to join to the current thread?
However, I think you have overcomplicated the usage of Source. It would be easier for you to work with the stream as a whole rather than create a custom ActorPublisher.
You can use Source.actorRef to materialize a stream into an ActorRef and work with that ActorRef. This allows you to utilize akka code to do the enqueing/dequeing onto the buffer while you can focus on the "business logic".
Say, for example, your entire stream is only to filter lines above a certain length and print them to the console. This could be accomplished with:
def dispatchIRCMessages(actorRef : ActorRef) = {
val ticker =
new IrcTicker("irc.wikimedia.org",
"imply",
wikipedias map (x => s"#$x.wikipedia"),
Seq(new MessageListener {
override def process(message: Message) =
actorRef ! Publish(Jackson.generate(message.toMap))
}))
ticker.start()
Thread.currentThread().join()
}
//these variables control the buffer behavior
val bufferSize = 1024
val overFlowStrategy = akka.stream.OverflowStrategy.dropHead
val minMessageSize = 32
//no need for a custom Publisher/Queue
val streamRef =
Source.actorRef[String](bufferSize, overFlowStrategy)
.via(Flow[String].filter(_.size > minMessageSize))
.to(Sink.foreach[String](println))
.run()
dispatchIRCMessages(streamRef)
The dispatchIRCMessages has the added benefit that it will work with any ActorRef so you aren't required to only work with streams/publishers.
Hopefully this solves your underlying problem...
I think the main problem is Thread.currentThread().join(). This line will 'hang' current thread because this thread is waiting for himself to die. Please read https://docs.oracle.com/javase/8/docs/api/java/lang/Thread.html#join-long- .