Akka committableOffset store in DB - scala

I am beginner in akka and I have an problem statement to work with. I have an akka flow that's reading Kafka Events from some topic and doing some transformation before creating Commitable offset of the message.
I am not sure the best way to add a akka sink on top of this code to store the transformed events in some DB
def eventTransform : Flow[KafkaMessage,CommittableRecord[Either[Throwable,SomeEvent]],NotUsed]
def processEvents
: Flow[KafkaMessage, ConsumerMessage.CommittableOffset, NotUsed] =
Flow[KafkaMessage]
.via(eventTransform)
.filter({ x =>
x.value match {
case Right(event: SomeEvent) =>
event.status != "running"
case Left(_) => false
}
})
.map(_.message.committableOffset)
This is my akka source calling the akka flow
private val consumerSettings: ConsumerSettings[String, String] = ConsumerSettings(
system,
new StringDeserializer,
new StringDeserializer,
)
.withGroupId(groupId)
.withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
private val committerSettings: CommitterSettings = CommitterSettings(system)
private val control = new AtomicReference[Consumer.Control](Consumer.NoopControl)
private val restartableSource = RestartSource
.withBackoff(restartSettings) { () =>
Consumer
.committableSource(consumerSettings, Subscriptions.topics(topicIn))
.mapMaterializedValue(control.set)
.via(processEvents) // calling the flow here
}
restartableSource
.toMat(Committer.sink(committerSettings))(Keep.both)
.run()
def api(): Behavior[Message] =
Behaviors.receive[Message] { (_, message) =>
message match {
case Stop =>
context.pipeToSelf(control.get().shutdown())(_ => Stopped)
Behaviors.same
case Stopped =>
Behaviors.stopped
}
}
.receiveSignal {
case (_, _ #(PostStop | PreRestart)) =>
control.get().shutdown()
Behaviors.same
}
}

Related

Why does the stream never get triggered?

I have the following stream, that never reach the map after flatMapConcat.
private def stream[A](ref: ActorRef[ServerHealthStreamer])(implicit system: ActorSystem[A])
: KillSwitch = {
implicit val materializer = ActorMaterializer()
implicit val dispatcher = materializer.executionContext
system.log.info("=============> Start KafkaDetectorStream <=============")
val addr = system
.settings
.config
.getConfig("kafka")
.getString("servers")
val sink: Sink[ServerHealthEvent, NotUsed] =
ActorSink.actorRefWithAck[ServerHealthEvent, ServerHealthStreamer, Ack](
ref = ref,
onCompleteMessage = Complete,
onFailureMessage = Fail.apply,
messageAdapter = Message.apply,
onInitMessage = Init.apply,
ackMessage = Ack)
Source.tick(1.seconds, 5.seconds, NotUsed)
.flatMapConcat(_ => Source.fromFuture(health(addr)))
.map {
case true =>
KafkaActiveConfirmed
case false =>
KafkaInactiveConfirmed
}
.viaMat(KillSwitches.single)(Keep.right)
.to(sink)
.run()
}
private def health(server: String)(implicit executor: ExecutionContext): Future[Boolean] = {
val props = new Properties
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, server)
props.put(AdminClientConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG, "10000")
props.put(AdminClientConfig.REQUEST_TIMEOUT_MS_CONFIG, "5000")
Future {
AdminClient
.create(props)
.listTopics()
.names()
.get()
}
.map(_ => true)
.recover {
case _: Throwable => false
}
}
What I mean is, that this part:
.map {
case true =>
KafkaActiveConfirmed
case false =>
KafkaInactiveConfirmed
}
never gets executed and I do not know the reason. The method health executes as expected.
Try to add .log between flatMapConcat and map to see emited element. log can else log errors and stream cancelation.
https://doc.akka.io/docs/akka/current/stream/operators/Source-or-Flow/log.html
Note, .log using implicit logger
And your .flatMapConcat(_ => Source.fromFuture(health(addr))) seams triky,
try .mapAsyncUnordered(1)(_ => health(addr))

Websocket client does not receive data from Akka streams Source.queue

I am using Akka stream Source.queue as Source for websocket clients.
Reading from kafka topic with 10k records using kaka consumer API and offering it to Source.queue with buffer 100k.
I am using BroardcastHub for fan-out. The websocket client does not get any data but see records from kafka enqueued on offer result.
Appreciate any help.
def kafkaSourceQueue() = {
val sourceQueue = Source.queue[String](100000, OverflowStrategy.dropHead)
val (theQueue, queueSource)= sourceQueue.toMat(BroadcastHub.sink(bufferSize = 256))(Keep.both).run
val consumer = KafkaEventSource.initKafkaConsumer()
try {
while (true) {
val records = consumer.poll(polltimeout.toMillis)
for (record <- records.records(topic)) {
//println(record.value())
theQueue.offer(record.value()).onComplete{
case Success(QueueOfferResult.Enqueued) =>
println("enqueued")
case _ => println("Failed to enqueue")
}
}
}
}
finally {
if (consumer != null) {
println("consumer unsubscribed")
consumer.unsubscribe()
}
}
queueSource
}
private def logicStreamFlow: Flow[String, String, NotUsed] = {
Flow.fromSinkAndSourceCoupled(Sink.ignore, kafkaSourceQueue)
}
def websocketFlow: Flow[Message, Message, NotUsed] = {
Flow[Message]
.map{
case TextMessage.Strict(msg) => msg
case _ => throw new Exception("exception msg")
}
.via(logicStreamFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
lazy private val streamRoute =
path("stream") {
handleWebSocketMessages {
websocketFlow
.watchTermination() { (_, done) =>
done.onComplete {
case Success(_) =>
log.info("Stream route completed successfully")
case Failure(ex) =>
log.error(s"Stream route completed with failure : $ex")
}
}
}
}
def startServer(): Unit = {
bindingFuture = Http().bindAndHandle(wsRoutes, HOST, PORT)
log.info(s"Server online at http://localhost:9000/")
}
def stopServer(): Unit = {
bindingFuture
.flatMap(_.unbind())
.onComplete{
_ => system.terminate()
log.info("terminated")
}
}

Akka-http first websocket client only receives the data once from a kafka topic

I am using akka-http websocket to push messages from a kafka topic to websocket clients.
For this purpose, i created a plain kafka consumer (using akka-streams-kafka connector) with offset set to "earliest" so that every new websocket client connecting gets all the data from the beginning.
The problem is that the first connected websocket client gets all the data and other ws clients (connecting after the first client has got all the data) do not get any. The kafka topic has 1million records.
I am using the BroadcastHub from Akka-streams.
Appreciate any suggestions.
lazy private val kafkaPlainSource: Source[String, NotUsed] = {
val consumerSettings = ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
.withBootstrapServers(KAFKA_BROKERS)
.withGroupId(UUID.randomUUID().toString)
.withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
val kafkaSource = Consumer.plainSource(consumerSettings, Subscriptions.topics(KAFKA_TOPIC))
.mapAsync(PARALLELISM) { cr =>
Future {
cr.value
}
}
kafkaSource.toMat(BroadcastHub.sink)(Keep.right).run
}
def logicFlow: Flow[String, String, NotUsed] =
Flow.fromSinkAndSourceCoupled(Sink.ignore, kafkaSource)
val websocketFlow: Flow[Message, Message, Any] = {
Flow[Message]
.map {
case TextMessage.Strict(msg) => msg
case _ => println("ignore streamed message")
}
.via(logicFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
lazy private val streamRoute =
path("stream") {
handleWebSocketMessages {
websocketFlow
.watchTermination() { (_, done) =>
done.onComplete {
case Success(_) =>
log.info("Stream route completed successfully")
case Failure(ex) =>
log.error(s"Stream route completed with failure : $ex")
}
}
}
}
def startServer(): Unit = {
bindingFuture = Http().bindAndHandle(wsRoutes, HOST, PORT)
log.info(s"Server online at http://localhost:9000/")
}
def stopServer(): Unit = {
bindingFuture
.flatMap(_.unbind())
.onComplete{
_ => system.terminate()
log.info("terminated")
}
}

How does one measure throughput of Akka WebSocket stream?

I am new to Akka and developed a sample Akka WebSocket server that streams a file's contents to clients using BroadcastHub (based on a sample from the Akka docs).
How can I measure the throughput (messages/second), assuming the clients are consuming as fast as the server?
// file source
val fileSource = FileIO.fromPath(Paths.get(path)
// Akka file source
val theFileSource = fileSource
.toMat(BroadcastHub.sink)(Keep.right)
.run
//Akka kafka file source
lazy val kafkaSourceActorStream = {
val (kafkaSourceActorRef, kafkaSource) = Source.actorRef[String](Int.MaxValue, OverflowStrategy.fail)
.toMat(BroadcastHub.sink)(Keep.both).run()
Consumer.plainSource(consumerSettings, Subscriptions.topics("perf-test-topic"))
.runForeach(record => kafkaSourceActorRef ! record.value().toString)
}
def logicFlow: Flow[String, String, NotUsed] = Flow.fromSinkAndSource(Sink.ignore, theFileSource)
val websocketFlow: Flow[Message, Message, Any] = {
Flow[Message]
.collect {
case TextMessage.Strict(msg) => Future.successful(msg)
case _ => println("ignore streamed message")
}
.mapAsync(parallelism = 2)(identity)
.via(logicFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
val fileRoute =
path("file") {
handleWebSocketMessages(websocketFlow)
}
}
def startServer(): Unit = {
bindingFuture = Http().bindAndHandle(wsRoutes, HOST, PORT)
log.info(s"Server online at http://localhost:9000/")
}
def stopServer(): Unit = {
bindingFuture
.flatMap(_.unbind())
.onComplete{
_ => system.terminate()
log.info("terminated")
}
}
//ws client
def connectToWebSocket(url: String) = {
println("Connecting to websocket: " + url)
val (upgradeResponse, closed) = Http().singleWebSocketRequest(WebSocketRequest(url), websocketFlow)
val connected = upgradeResponse.flatMap{ upgrade =>
if(upgrade.response.status == StatusCodes.SwitchingProtocols )
{
println("Web socket connection success")
Future.successful(Done)
}else {
println("Web socket connection failed with error: {}", upgrade.response.status)
throw new RuntimeException(s"Web socket connection failed: ${upgrade.response.status}")
}
}
connected.onComplete { msg =>
println(msg)
}
}
def websocketFlow: Flow[Message, Message, _] = {
Flow.fromSinkAndSource(printFlowRate, Source.maybe)
}
lazy val printFlowRate =
Flow[Message]
.alsoTo(fileSink("output.txt"))
.via(flowRate(1.seconds))
.to(Sink.foreach(rate => println(s"$rate")))
def flowRate(sampleTime: FiniteDuration) =
Flow[Message]
.conflateWithSeed(_ ⇒ 1){ case (acc, _) ⇒ acc + 1 }
.zip(Source.tick(sampleTime, sampleTime, NotUsed))
.map(_._1.toDouble / sampleTime.toUnit(SECONDS))
def fileSink(file: String): Sink[Message, Future[IOResult]] = {
Flow[Message]
.map{
case TextMessage.Strict(msg) => msg
case TextMessage.Streamed(stream) => stream.runFold("")(_ + _).flatMap(msg => Future.successful(msg))
}
.map(s => ByteString(s + "\n"))
.toMat(FileIO.toFile(new File(file)))(Keep.right)
}
You could attach a throughput-measuring stream to your existing stream. Here is an example, inspired by this answer, that prints the number of integers that are emitted from the upstream source every second:
val rateSink = Flow[Int]
.conflateWithSeed(_ => 0){ case (acc, _) => acc + 1 }
.zip(Source.tick(1.second, 1.second, NotUsed))
.map(_._1)
.toMat(Sink.foreach(i => println(s"$i elements/second")))(Keep.right)
In the following example, we attach the above sink to a source that emits the integers 1 to 10 million. To prevent the rate-measuring stream from interfering with the main stream (which, in this case, simply converts every integer to a string and returns the last string processed as part of the materialized value), we use wireTapMat:
val (rateFut, mainFut) = Source(1 to 10000000)
.wireTapMat(rateSink)(Keep.right)
.map(_.toString)
.toMat(Sink.last[String])(Keep.both)
.run() // (Future[Done], Future[String])
rateFut onComplete {
case Success(x) => println(s"rateFut completed: $x")
case Failure(_) =>
}
mainFut onComplete {
case Success(s) => println(s"mainFut completed: $s")
case Failure(_) =>
}
Running the above sample prints something like the following:
0 elements/second
2597548 elements/second
3279052 elements/second
mainFut completed: 10000000
3516141 elements/second
607254 elements/second
rateFut completed: Done
If you don't need a reference to the materialized value of rateSink, use wireTap instead of wireTapMat. For example, attaching rateSink to your WebSocket flow could look like the following:
val websocketFlow: Flow[Message, Message, Any] = {
Flow[Message]
.wireTap(rateSink) // <---
.collect {
case TextMessage.Strict(msg) => Future.successful(msg)
case _ => println("ignore streamed message")
}
.mapAsync(parallelism = 2)(identity)
.via(logicFlow)
.map { msg: String => TextMessage.Strict(msg) }
}
wireTap is defined on both Source and Flow.
Where I last worked I implemented a performance benchmark of this nature.
Basically, it meant creating a simple client app that consumes messages from the websocket and outputs some metrics. The natural choice was to implement the client using akka-http client-side support for websockets. See:
https://doc.akka.io/docs/akka-http/current/client-side/websocket-support.html#singlewebsocketrequest
Then we used the micrometer library to expose metrics to Prometheus, which was our tool of choice for reporting and charting.
https://github.com/micrometer-metrics
https://micrometer.io/docs/concepts#_meters

Apache Kafka: KafkaProducerActor throws exception ASk timeout.

I am using cake solution Akka client for scala and Kafka. While I am creating a KafkaProducerActor actor and trying to send message using ask pattern and return future and perform some operations, but every time, I am facing ask timeout exception. Below is my code:
class SimpleAkkaProducer (config: Config, system: ActorSystem) {
private val producerConf = KafkaProducer.
Conf(config,
keySerializer = new StringSerializer,
valueSerializer = new StringSerializer)
val actorRef = system.actorOf(KafkaProducerActor.props(producerConf))
def sendMessageWayOne(record: ProducerRecords[String, String]) = {
actorRef ! record
}
def sendMessageWayTwo(record: ProducerRecords[String, String]) = {
implicit val timeout = Timeout(100.seconds)
val future = (actorRef ? record).mapTo[String]
future onComplete {
case Success(data) => println(s" >>>>>>>>>>>> ${data}")
case Failure(ex) => ex.printStackTrace()
}
}
}
object SimpleAkkaProducer {
def main(args: Array[String]): Unit = {
val system = ActorSystem("KafkaProducerActor")
val config = ConfigFactory.defaultApplication()
val simpleAkkaProducer = new SimpleAkkaProducer(config, system)
val topic = config.getString("akka.topic")
val messageOne = ProducerRecords.fromKeyValues[String, String](topic,
Seq((Some("Topics"), "First Message")), None, None)
simpleAkkaProducer.sendMessageWayOne(messageOne)
simpleAkkaProducer.sendMessageWayTwo(messageOne)
}
}
Following is exception :
akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://KafkaProducerActor/user/$a#-1520717141]] after [100000 ms]. Sender[null] sent message of type "cakesolutions.kafka.akka.ProducerRecords".
at akka.pattern.PromiseActorRef$.$anonfun$apply$1(AskSupport.scala:604)
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:126)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:864)
at scala.concurrent.BatchingExecutor.execute(BatchingExecutor.scala:109)
at scala.concurrent.BatchingExecutor.execute$(BatchingExecutor.scala:103)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:862)
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:329)
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:280)
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:284)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:236)
at java.lang.Thread.run(Thread.java:745)
The producer actor only responds to the sender, if you specify the successResponse and failureResponse values in the ProducerRecords to be something other than None. The successResponse value is sent back to the sender when the Kafka write succeeds, and failureResponse value is sent back when the Kafka write fails.
Example:
val record = ProducerRecords.fromKeyValues[String, String](
topic = topic,
keyValues = Seq((Some("Topics"), "First Message")),
successResponse = Some("success"),
failureResponse = Some("failure")
)
val future = (actorRef ? record).mapTo[String]
future onComplete {
case Success("success") => println("Send succeeded!")
case Success("failure") => println("Send failed!")
case Success(data) => println(s"Send result: $data")
case Failure(ex) => ex.printStackTrace()
}