Akka-http first websocket client only receives the data once from a kafka topic - scala

I am using akka-http websocket to push messages from a kafka topic to websocket clients.
For this purpose, i created a plain kafka consumer (using akka-streams-kafka connector) with offset set to "earliest" so that every new websocket client connecting gets all the data from the beginning.
The problem is that the first connected websocket client gets all the data and other ws clients (connecting after the first client has got all the data) do not get any. The kafka topic has 1million records.
I am using the BroadcastHub from Akka-streams.
Appreciate any suggestions.
lazy private val kafkaPlainSource: Source[String, NotUsed] = {
val consumerSettings = ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
.withBootstrapServers(KAFKA_BROKERS)
.withGroupId(UUID.randomUUID().toString)
.withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
val kafkaSource = Consumer.plainSource(consumerSettings, Subscriptions.topics(KAFKA_TOPIC))
.mapAsync(PARALLELISM) { cr =>
Future {
cr.value
}
}
kafkaSource.toMat(BroadcastHub.sink)(Keep.right).run
}
def logicFlow: Flow[String, String, NotUsed] =
Flow.fromSinkAndSourceCoupled(Sink.ignore, kafkaSource)
val websocketFlow: Flow[Message, Message, Any] = {
Flow[Message]
.map {
case TextMessage.Strict(msg) => msg
case _ => println("ignore streamed message")
}
.via(logicFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
lazy private val streamRoute =
path("stream") {
handleWebSocketMessages {
websocketFlow
.watchTermination() { (_, done) =>
done.onComplete {
case Success(_) =>
log.info("Stream route completed successfully")
case Failure(ex) =>
log.error(s"Stream route completed with failure : $ex")
}
}
}
}
def startServer(): Unit = {
bindingFuture = Http().bindAndHandle(wsRoutes, HOST, PORT)
log.info(s"Server online at http://localhost:9000/")
}
def stopServer(): Unit = {
bindingFuture
.flatMap(_.unbind())
.onComplete{
_ => system.terminate()
log.info("terminated")
}
}

Related

Akka committableOffset store in DB

I am beginner in akka and I have an problem statement to work with. I have an akka flow that's reading Kafka Events from some topic and doing some transformation before creating Commitable offset of the message.
I am not sure the best way to add a akka sink on top of this code to store the transformed events in some DB
def eventTransform : Flow[KafkaMessage,CommittableRecord[Either[Throwable,SomeEvent]],NotUsed]
def processEvents
: Flow[KafkaMessage, ConsumerMessage.CommittableOffset, NotUsed] =
Flow[KafkaMessage]
.via(eventTransform)
.filter({ x =>
x.value match {
case Right(event: SomeEvent) =>
event.status != "running"
case Left(_) => false
}
})
.map(_.message.committableOffset)
This is my akka source calling the akka flow
private val consumerSettings: ConsumerSettings[String, String] = ConsumerSettings(
system,
new StringDeserializer,
new StringDeserializer,
)
.withGroupId(groupId)
.withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
private val committerSettings: CommitterSettings = CommitterSettings(system)
private val control = new AtomicReference[Consumer.Control](Consumer.NoopControl)
private val restartableSource = RestartSource
.withBackoff(restartSettings) { () =>
Consumer
.committableSource(consumerSettings, Subscriptions.topics(topicIn))
.mapMaterializedValue(control.set)
.via(processEvents) // calling the flow here
}
restartableSource
.toMat(Committer.sink(committerSettings))(Keep.both)
.run()
def api(): Behavior[Message] =
Behaviors.receive[Message] { (_, message) =>
message match {
case Stop =>
context.pipeToSelf(control.get().shutdown())(_ => Stopped)
Behaviors.same
case Stopped =>
Behaviors.stopped
}
}
.receiveSignal {
case (_, _ #(PostStop | PreRestart)) =>
control.get().shutdown()
Behaviors.same
}
}

Websocket client does not receive data from Akka streams Source.queue

I am using Akka stream Source.queue as Source for websocket clients.
Reading from kafka topic with 10k records using kaka consumer API and offering it to Source.queue with buffer 100k.
I am using BroardcastHub for fan-out. The websocket client does not get any data but see records from kafka enqueued on offer result.
Appreciate any help.
def kafkaSourceQueue() = {
val sourceQueue = Source.queue[String](100000, OverflowStrategy.dropHead)
val (theQueue, queueSource)= sourceQueue.toMat(BroadcastHub.sink(bufferSize = 256))(Keep.both).run
val consumer = KafkaEventSource.initKafkaConsumer()
try {
while (true) {
val records = consumer.poll(polltimeout.toMillis)
for (record <- records.records(topic)) {
//println(record.value())
theQueue.offer(record.value()).onComplete{
case Success(QueueOfferResult.Enqueued) =>
println("enqueued")
case _ => println("Failed to enqueue")
}
}
}
}
finally {
if (consumer != null) {
println("consumer unsubscribed")
consumer.unsubscribe()
}
}
queueSource
}
private def logicStreamFlow: Flow[String, String, NotUsed] = {
Flow.fromSinkAndSourceCoupled(Sink.ignore, kafkaSourceQueue)
}
def websocketFlow: Flow[Message, Message, NotUsed] = {
Flow[Message]
.map{
case TextMessage.Strict(msg) => msg
case _ => throw new Exception("exception msg")
}
.via(logicStreamFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
lazy private val streamRoute =
path("stream") {
handleWebSocketMessages {
websocketFlow
.watchTermination() { (_, done) =>
done.onComplete {
case Success(_) =>
log.info("Stream route completed successfully")
case Failure(ex) =>
log.error(s"Stream route completed with failure : $ex")
}
}
}
}
def startServer(): Unit = {
bindingFuture = Http().bindAndHandle(wsRoutes, HOST, PORT)
log.info(s"Server online at http://localhost:9000/")
}
def stopServer(): Unit = {
bindingFuture
.flatMap(_.unbind())
.onComplete{
_ => system.terminate()
log.info("terminated")
}
}

How does one measure throughput of Akka WebSocket stream?

I am new to Akka and developed a sample Akka WebSocket server that streams a file's contents to clients using BroadcastHub (based on a sample from the Akka docs).
How can I measure the throughput (messages/second), assuming the clients are consuming as fast as the server?
// file source
val fileSource = FileIO.fromPath(Paths.get(path)
// Akka file source
val theFileSource = fileSource
.toMat(BroadcastHub.sink)(Keep.right)
.run
//Akka kafka file source
lazy val kafkaSourceActorStream = {
val (kafkaSourceActorRef, kafkaSource) = Source.actorRef[String](Int.MaxValue, OverflowStrategy.fail)
.toMat(BroadcastHub.sink)(Keep.both).run()
Consumer.plainSource(consumerSettings, Subscriptions.topics("perf-test-topic"))
.runForeach(record => kafkaSourceActorRef ! record.value().toString)
}
def logicFlow: Flow[String, String, NotUsed] = Flow.fromSinkAndSource(Sink.ignore, theFileSource)
val websocketFlow: Flow[Message, Message, Any] = {
Flow[Message]
.collect {
case TextMessage.Strict(msg) => Future.successful(msg)
case _ => println("ignore streamed message")
}
.mapAsync(parallelism = 2)(identity)
.via(logicFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
val fileRoute =
path("file") {
handleWebSocketMessages(websocketFlow)
}
}
def startServer(): Unit = {
bindingFuture = Http().bindAndHandle(wsRoutes, HOST, PORT)
log.info(s"Server online at http://localhost:9000/")
}
def stopServer(): Unit = {
bindingFuture
.flatMap(_.unbind())
.onComplete{
_ => system.terminate()
log.info("terminated")
}
}
//ws client
def connectToWebSocket(url: String) = {
println("Connecting to websocket: " + url)
val (upgradeResponse, closed) = Http().singleWebSocketRequest(WebSocketRequest(url), websocketFlow)
val connected = upgradeResponse.flatMap{ upgrade =>
if(upgrade.response.status == StatusCodes.SwitchingProtocols )
{
println("Web socket connection success")
Future.successful(Done)
}else {
println("Web socket connection failed with error: {}", upgrade.response.status)
throw new RuntimeException(s"Web socket connection failed: ${upgrade.response.status}")
}
}
connected.onComplete { msg =>
println(msg)
}
}
def websocketFlow: Flow[Message, Message, _] = {
Flow.fromSinkAndSource(printFlowRate, Source.maybe)
}
lazy val printFlowRate =
Flow[Message]
.alsoTo(fileSink("output.txt"))
.via(flowRate(1.seconds))
.to(Sink.foreach(rate => println(s"$rate")))
def flowRate(sampleTime: FiniteDuration) =
Flow[Message]
.conflateWithSeed(_ ⇒ 1){ case (acc, _) ⇒ acc + 1 }
.zip(Source.tick(sampleTime, sampleTime, NotUsed))
.map(_._1.toDouble / sampleTime.toUnit(SECONDS))
def fileSink(file: String): Sink[Message, Future[IOResult]] = {
Flow[Message]
.map{
case TextMessage.Strict(msg) => msg
case TextMessage.Streamed(stream) => stream.runFold("")(_ + _).flatMap(msg => Future.successful(msg))
}
.map(s => ByteString(s + "\n"))
.toMat(FileIO.toFile(new File(file)))(Keep.right)
}
You could attach a throughput-measuring stream to your existing stream. Here is an example, inspired by this answer, that prints the number of integers that are emitted from the upstream source every second:
val rateSink = Flow[Int]
.conflateWithSeed(_ => 0){ case (acc, _) => acc + 1 }
.zip(Source.tick(1.second, 1.second, NotUsed))
.map(_._1)
.toMat(Sink.foreach(i => println(s"$i elements/second")))(Keep.right)
In the following example, we attach the above sink to a source that emits the integers 1 to 10 million. To prevent the rate-measuring stream from interfering with the main stream (which, in this case, simply converts every integer to a string and returns the last string processed as part of the materialized value), we use wireTapMat:
val (rateFut, mainFut) = Source(1 to 10000000)
.wireTapMat(rateSink)(Keep.right)
.map(_.toString)
.toMat(Sink.last[String])(Keep.both)
.run() // (Future[Done], Future[String])
rateFut onComplete {
case Success(x) => println(s"rateFut completed: $x")
case Failure(_) =>
}
mainFut onComplete {
case Success(s) => println(s"mainFut completed: $s")
case Failure(_) =>
}
Running the above sample prints something like the following:
0 elements/second
2597548 elements/second
3279052 elements/second
mainFut completed: 10000000
3516141 elements/second
607254 elements/second
rateFut completed: Done
If you don't need a reference to the materialized value of rateSink, use wireTap instead of wireTapMat. For example, attaching rateSink to your WebSocket flow could look like the following:
val websocketFlow: Flow[Message, Message, Any] = {
Flow[Message]
.wireTap(rateSink) // <---
.collect {
case TextMessage.Strict(msg) => Future.successful(msg)
case _ => println("ignore streamed message")
}
.mapAsync(parallelism = 2)(identity)
.via(logicFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
wireTap is defined on both Source and Flow.
Where I last worked I implemented a performance benchmark of this nature.
Basically, it meant creating a simple client app that consumes messages from the websocket and outputs some metrics. The natural choice was to implement the client using akka-http client-side support for websockets. See:
https://doc.akka.io/docs/akka-http/current/client-side/websocket-support.html#singlewebsocketrequest
Then we used the micrometer library to expose metrics to Prometheus, which was our tool of choice for reporting and charting.
https://github.com/micrometer-metrics
https://micrometer.io/docs/concepts#_meters

Apache Kafka: KafkaProducerActor throws exception ASk timeout.

I am using cake solution Akka client for scala and Kafka. While I am creating a KafkaProducerActor actor and trying to send message using ask pattern and return future and perform some operations, but every time, I am facing ask timeout exception. Below is my code:
class SimpleAkkaProducer (config: Config, system: ActorSystem) {
private val producerConf = KafkaProducer.
Conf(config,
keySerializer = new StringSerializer,
valueSerializer = new StringSerializer)
val actorRef = system.actorOf(KafkaProducerActor.props(producerConf))
def sendMessageWayOne(record: ProducerRecords[String, String]) = {
actorRef ! record
}
def sendMessageWayTwo(record: ProducerRecords[String, String]) = {
implicit val timeout = Timeout(100.seconds)
val future = (actorRef ? record).mapTo[String]
future onComplete {
case Success(data) => println(s" >>>>>>>>>>>> ${data}")
case Failure(ex) => ex.printStackTrace()
}
}
}
object SimpleAkkaProducer {
def main(args: Array[String]): Unit = {
val system = ActorSystem("KafkaProducerActor")
val config = ConfigFactory.defaultApplication()
val simpleAkkaProducer = new SimpleAkkaProducer(config, system)
val topic = config.getString("akka.topic")
val messageOne = ProducerRecords.fromKeyValues[String, String](topic,
Seq((Some("Topics"), "First Message")), None, None)
simpleAkkaProducer.sendMessageWayOne(messageOne)
simpleAkkaProducer.sendMessageWayTwo(messageOne)
}
}
Following is exception :
akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://KafkaProducerActor/user/$a#-1520717141]] after [100000 ms]. Sender[null] sent message of type "cakesolutions.kafka.akka.ProducerRecords".
at akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:604)
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:864)
at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109)
at scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:862)
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
at java.lang.Thread.run(Thread.java:745)
The producer actor only responds to the sender, if you specify the successResponse and failureResponse values in the ProducerRecords to be something other than None. The successResponse value is sent back to the sender when the Kafka write succeeds, and failureResponse value is sent back when the Kafka write fails.
Example:
val record = ProducerRecords.fromKeyValues[String, String](
topic = topic,
keyValues = Seq((Some("Topics"), "First Message")),
successResponse = Some("success"),
failureResponse = Some("failure")
)
val future = (actorRef ? record).mapTo[String]
future onComplete {
case Success("success") => println("Send succeeded!")
case Success("failure") => println("Send failed!")
case Success(data) => println(s"Send result: $data")
case Failure(ex) => ex.printStackTrace()
}

How to listen websocket server close events using akka-http websocket client

I have websocket client connected to akka-http websocket server , how can i listen to connection close events happened on server( i.e server shutdown/ server closed websocket connection) ?
object Client extends App {
implicit val actorSystem = ActorSystem("akka-system")
implicit val flowMaterializer = ActorMaterializer()
val config = actorSystem.settings.config
val interface = config.getString("app.interface")
val port = config.getInt("app.port")
// print each incoming strict text message
val printSink: Sink[Message, Future[Done]] =
Sink.foreach {
case message: TextMessage.Strict =>
println(message.text)
case _ => {
sourceQueue.map(q => {
println(s"offering message on client")
q.offer(TextMessage("received unknown"))
})
println(s"received unknown message format")
}
}
val (source, sourceQueue) = {
val p = Promise[SourceQueue[Message]]
val s = Source.queue[Message](Int.MaxValue, OverflowStrategy.backpressure).mapMaterializedValue(m => {
p.trySuccess(m)
m
})
.keepAlive(FiniteDuration(1, TimeUnit.SECONDS), () => TextMessage.Strict("Heart Beat"))
(s, p.future)
}
val flow =
Flow.fromSinkAndSourceMat(printSink, source)(Keep.right)
val (upgradeResponse, sourceClose) =
Http().singleWebSocketRequest(WebSocketRequest("ws://localhost:8080/ws-echo"), flow)
val connected = upgradeResponse.map { upgrade =>
// just like a regular http request we can get 404 NotFound,
// with a response body, that will be available from upgrade.response
if (upgrade.response.status == StatusCodes.SwitchingProtocols || upgrade.response.status == StatusCodes.SwitchingProtocols) {
Done
} else {
throw new RuntimeException(s"Connection failed: ${upgrade.response.status}")
}
}
connected.onComplete(println)
}
Websocket connection termination is modeled as a regular stream completion, therefore in your case you can use the materialized Future[Done] yielded by Sink.foreach:
val flow = Flow.fromSinkAndSourceMat(printSink, source)(Keep.both)
val (upgradeResponse, (sinkClose, sourceClose)) =
Http().singleWebSocketRequest(..., flow)
sinkClose.onComplete {
case Success(_) => println("Connection closed gracefully")
case Failure(e) => println("Connection closed with an error: $e")
}