NATS Streaming server subscriber rate limiting and exactly-once delivery - Scala

I am playing a bit with NATS Streaming and I have a problem with subscriber rate limiting. When I set the max in flight to 1 and the ack timeout to 1 second, and I have a consumer that is basically a Thread.sleep(1000), then I receive the same event multiple times. I thought that by limiting the in-flight messages and using manual acks this should not happen. How can I get exactly-once delivery with very slow consumers?
case class EventBus[I, O](inputTopic: String, outputTopic: String, connection: Connection,
                          eventProcessor: StatefulEventProcessor[I, O]) {

  // the event bus could be some abstract class while the `Connection` could be injected using DI
  val subscriptionOptions: SubscriptionOptions = new SubscriptionOptions.Builder()
    .setManualAcks(true)
    .setDurableName("foo")
    .setMaxInFlight(1)
    .setAckWait(1, TimeUnit.SECONDS)
    .build()

  if (!inputTopic.isEmpty) {
    connection.subscribe(inputTopic, new MessageHandler() {
      override def onMessage(m: Message) {
        m.ack()
        try {
          val event = eventProcessor.deserialize(m.getData)
          eventProcessor.onEvent(event)
        } catch {
          case any: Throwable =>
            try {
              val command = new String(m.getData)
              eventProcessor.onCommand(command)
            } catch {
              case any: Throwable => println(s"de-serialization error: $any")
            }
        } finally {
          println("got event")
        }
      }
    }, subscriptionOptions)
  }

  if (!outputTopic.isEmpty) {
    eventProcessor.setBus(e => {
      try {
        connection.publish(outputTopic, eventProcessor.serialize(e))
      } catch {
        case ex: Throwable => println(s"serialization error $ex")
      }
    })
  }
}
abstract class StatefulEventProcessor[I, O] {
  private var bus: Option[O => Unit] = None

  def onEvent(event: I): Unit
  def onCommand(command: String): Unit

  def serialize(o: O): Array[Byte] =
    SerializationUtils.serialize(o.asInstanceOf[java.io.Serializable])

  def deserialize(in: Array[Byte]): I =
    SerializationUtils.deserialize[I](in)

  def setBus(push: O => Unit): Unit = {
    if (bus.isDefined) {
      throw new IllegalStateException("bus already set")
    } else {
      bus = Some(push)
    }
  }

  def push(event: O) =
    bus.get.apply(event)
}
EventBus("out-1", "out-2", sc, new StatefulEventProcessor[String, String] {
override def onEvent(event: String): Unit = {
Thread.sleep(1000)
push("!!!" + event)
}
override def onCommand(command: String): Unit = {}
})
(0 until 100).foreach(i => sc.publish("out-1", SerializationUtils.serialize(s"test-$i")))

First, there is no exactly-once (re)delivery guarantee with NATS Streaming. What MaxInflight gives you is the assurance that the server will not send new messages to the subscriber until the number of unacknowledged messages falls below that number. So with MaxInflight(1), you are asking the server to send the next new message only after it has received the ack for the previously delivered one. However, this does not block redelivery of unacknowledged messages.
The server has no guarantee, and no knowledge, that a message was actually received by a subscriber. That is what the ack is for: to let the server know the message was properly processed by the subscriber. If the server did not honor redelivery (even when MaxInflight is reached), a "lost" message would stall your subscription forever. Keep in mind that the NATS Streaming server and its clients are not directly connected to each other with a TCP connection (they are both connected to a NATS server, aka gnatsd).
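Given that the delivery model is at-least-once, the practical mitigations for a slow consumer are to ack only after processing succeeds, set AckWait well above the worst-case processing time, and make the handler idempotent. A minimal sketch reusing the question's API and names (the 30-second AckWait and the lastSeq dedup variable are assumptions introduced here, not part of the original code):

// Sketch only: NATS Streaming stays at-least-once, so redelivery can still
// happen; deduplicating on m.getSequence keeps the handler idempotent.
var lastSeq = 0L

val options: SubscriptionOptions = new SubscriptionOptions.Builder()
  .setManualAcks(true)
  .setDurableName("foo")
  .setMaxInFlight(1)
  .setAckWait(30, TimeUnit.SECONDS) // assumed: well above worst-case processing time
  .build()

connection.subscribe(inputTopic, new MessageHandler() {
  override def onMessage(m: Message): Unit = {
    if (m.getSequence > lastSeq) { // skip redeliveries we already processed
      val event = eventProcessor.deserialize(m.getData)
      eventProcessor.onEvent(event) // the slow work happens before the ack
      lastSeq = m.getSequence
    }
    m.ack() // ack only after processing, so AckWait covers the whole handler
  }
}, options)

Note how this differs from the question's handler, which acks first and processes afterwards: there, a crash between the ack and the processing loses the event, while here the worst case is a duplicate, which the sequence check absorbs.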

Related

Consuming Server-Sent Events (SSE) in Scala Play Framework with automatic reconnect

How can we consume SSE in the Scala Play framework? Most of the resources I could find show how to create an SSE source; I want to reliably listen to SSE events from other services (with automatic reconnect). The most relevant article was https://doc.akka.io/docs/alpakka/current/sse.html. I implemented this but it does not seem to work (code below). Also the event that I am su
@Singleton
class SseConsumer @Inject()(implicit ec: ExecutionContext) {

  implicit val system = ActorSystem()

  val send: HttpRequest => Future[HttpResponse] = foo

  def foo(x: HttpRequest) = {
    try {
      println("foo")
      val authHeader = Authorization(BasicHttpCredentials("user", "pass"))
      val newHeaders = x.withHeaders(authHeader)
      Http().singleRequest(newHeaders)
    } catch {
      case e: Exception =>
        println("Exception", e.printStackTrace())
        throw e
    }
  }

  val eventSource: Source[ServerSentEvent, NotUsed] =
    EventSource(
      uri = Uri("https://abc/v1/events"),
      send,
      initialLastEventId = Some("2"),
      retryDelay = 1.second
    )

  def orderStatusEventStable() = {
    val events: Future[immutable.Seq[ServerSentEvent]] =
      eventSource
        .throttle(elements = 1, per = 500.milliseconds, maximumBurst = 1, ThrottleMode.Shaping)
        .take(10)
        .runWith(Sink.seq)

    events.map(_.foreach { x =>
      println("456")
      println(x.data)
    })
  }

  Future {
    blocking {
      while (true) {
        try {
          Thread.sleep(2000)
          orderStatusEventStable()
        } catch {
          case e: Exception =>
            println("Exception", e.printStackTrace())
        }
      }
    }
  }
}
This does not give any exceptions and println("456") is never printed.
EDIT:
Future {
  blocking {
    while (true) {
      try {
        Await.result(orderStatusEventStable() recover {
          case e: Exception =>
            println("exception", e)
            throw e
        }, Duration.Inf)
      } catch {
        case e: Exception =>
          println("Exception", e.printStackTrace())
      }
    }
  }
}
I added an Await and it started working; I am able to read 10 messages at a time. But now I am faced with another problem. I have a producer which can at times produce faster than I can consume, and with this code I have two issues:
1. I have to wait until 10 messages are available. How can I take at most 10 messages, but proceed even when fewer (down to 0) are available?
2. When the production rate exceeds the consumption rate, I miss a few events. I am guessing this is due to the throttling. How do I handle this using backpressure?
The issue in your code is that the events Future only completes when the stream (eventSource) completes. I'm not familiar with SSE, but the stream likely never completes in your case, as it is always listening for new events. You can learn more in the Akka Streams documentation.
Depending on what you want to do with the events, you could just map on the stream like:
eventSource
...
.map(/* do something */)
.runWith(...)
Basically, you need to work with the Akka Streams Source as data flows through it, rather than waiting for its completion.
EDIT: I didn't notice the take(10); my answer applies only if the take were not there. Your code should work once 10 events have been sent.
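Addressing the two follow-up issues along these lines, a minimal sketch (reusing the question's eventSource; everything else here is illustrative) is to drop the take(10)/Sink.seq batching and run the stream continuously. runForeach handles each event as it arrives, so there is no waiting for a batch of 10, and ThrottleMode.Shaping backpressures the source by delaying elements rather than dropping them:

// Sketch: consume events as they arrive instead of collecting fixed batches.
val done: Future[Done] =
  eventSource
    .throttle(elements = 1, per = 500.milliseconds, maximumBurst = 1, ThrottleMode.Shaping)
    .runForeach { event =>
      println(event.data) // process each ServerSentEvent here
    }

The returned Future[Done] only completes if the stream fails or the source completes, so there is nothing to Await on. Events published while the consumer is disconnected can still be missed unless the server honors the Last-Event-ID header that EventSource sends on reconnect.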

Apache Spark Receiver Scheduling

I have implemented a receiver that is supposed to connect to a WebSocket stream and get the messages for processing. Here is the implementation that I have done so far:
class WebSocketReader(wsConfig: WebSocketConfig, stringMessageHandler: String => Option[String],
                      storageLevel: StorageLevel) extends Receiver[String](storageLevel) {

  // TODO: avoid using a var
  private var wsClient: WebSocketClient = _

  def sendRequest(isRequest: Boolean, msgCount: Int) = {
    while (isRequest) {
      wsClient.send(msgCount.toString)
      Thread.sleep(1000)
    }
  }

  // TODO: avoid using synchronization...
  private def connect(): Unit = {
    Try {
      wsClient = createWsClient
    } match {
      case Success(_) =>
        wsClient.connect().map {
          case result if result.isSuccess =>
            sendRequest(true, 10)
          case _ =>
            connect()
        }
      case Failure(ex) =>
        // TODO: how to signal a failure so that it is retried the next time...
        ex.printStackTrace()
    }
  }

  def onStart(): Unit = {
    new Thread(getClass.getSimpleName) {
      override def run() { connect() }
    }.start()
  }

  override def onStop(): Unit =
    if (wsClient != null) wsClient.disconnect()

  private def createWsClient = {
    new DefaultHookupClient(new HookupClientConfig(new URI(wsConfig.wsUrl))) {
      override def receive: Receive = {
        case Disconnected(_) =>
          // TODO: use a logging framework, try reconnecting...
          println(s"the web socket is disconnected")
        case TextMessage(message) =>
          stringMessageHandler(message).foreach(store)
        case JsonMessage(jsValue) =>
          stringMessageHandler(jsValue.toString).foreach(store)
      }
    }
  }
}
How is this Receiver run? Does it run on the worker nodes or on the driver node? And is sleeping a thread like this a correct approach?
The reason I want to do this is that the server exposing the WebSocket endpoint needs a count of the messages I want to receive: if I ask the server for 100 messages, it gives me 100 messages, and so on. So I need a way to periodically schedule this request to the server. Currently I'm using the Thread.sleep mechanism. Is this advisable? What could be the alternative?
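For context, Spark Streaming receivers are scheduled onto executors (worker nodes), not the driver. As a hedged sketch of one alternative to the sleep loop, a single-threaded scheduler can issue the periodic "send me N messages" request and be shut down from onStop(); the scheduler field and scheduleRequests name are introduced here purely for illustration:

import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

// Hypothetical sketch: replaces the while/Thread.sleep loop with a scheduled
// task; wsClient and the count-based protocol are the ones from the question.
private val scheduler: ScheduledExecutorService =
  Executors.newSingleThreadScheduledExecutor()

private def scheduleRequests(msgCount: Int): Unit =
  scheduler.scheduleAtFixedRate(
    new Runnable { override def run(): Unit = wsClient.send(msgCount.toString) },
    0L, 1L, TimeUnit.SECONDS)

// onStop() should then also call scheduler.shutdown() so the requests stop
// when Spark stops the receiver.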

Akka Streams TCP socket client side termination

I have the following flow:
val actorSource = Source.actorRef(10000, OverflowStrategy.dropHead)

val targetSink = Flow[ByteString]
  .map(_.utf8String)
  .via(new JsonStage())
  .map { json =>
    MqttMessages.jsonToObject(json)
  }
  .to(Sink.actorRef(self, "Done"))

sourceRef = Some(Flow[ByteString]
  .via(conn.flow)
  .to(targetSink)
  .runWith(actorSource))
within an Actor (which is the Sink.actorRef one). The conn.flow is an incoming TCP connection obtained from Tcp().bind(address, port).
Currently the Sink.actorRef actor keeps running when the TCP connection is closed from the client side. Is there a way to register the client-side termination of the TCP connection, so that the actor can be shut down?
Edit:
I tried handling both cases as suggested:
case "Done" =>
context.stop(self)
case akka.actor.Status.Failure =>
context.stop(self)
But when I test with a socket client and kill it, the actor is not shut down. So neither the "Done" message nor the Failure seems to be registered when the TCP connection is terminated.
Here is the whole code:
private var connection: Option[Tcp.IncomingConnection] = None
private var mqttpubsub: Option[ActorRef] = None
private var sourceRef: Option[ActorRef] = None

private val sdcTopic = "out"

private val actorSource = Source.actorRef(10000, OverflowStrategy.dropHead)

implicit private val system = context.system
implicit private val mat = ActorMaterializer.create(context.system)

override def receive: Receive = {
  case conn: Tcp.IncomingConnection =>
    connection = Some(conn)
    mqttpubsub = Some(context.actorOf(Props(classOf[MqttPubSub], PSConfig(
      brokerUrl = "tcp://127.0.0.1:1883", // all params are optional except brokerUrl
      userName = null,
      password = null,
      // messages received while disconnected will be stashed; messages overdue after stashTimeToLive will be discarded
      stashTimeToLive = 1.minute,
      stashCapacity = 100000, // the stash drops the first half of its elements when it reaches this size
      reconnectDelayMin = 10.millis, // for fine-tuning the re-connection logic
      reconnectDelayMax = 30.seconds
    ))))

    val targetSink = Flow[ByteString]
      .alsoTo(Sink.foreach(println))
      .map(_.utf8String)
      .via(new JsonStage())
      .map { json =>
        MqttMessages.jsonToObject(json)
      }
      .to(Sink.actorRef(self, "Done"))

    sourceRef = Some(Flow[ByteString]
      .via(conn.flow)
      .to(targetSink)
      .runWith(actorSource))

  case msg: MqttMessages.MqttMessage =>
    processMessage(msg)

  case msg: Message =>
    val jsonMsg = JsonParser(msg.payload).asJsObject
    val mqttMsg = MqttMessages.jsonToObject(jsonMsg)
    try {
      sourceRef.foreach(_ ! ByteString(msg.payload))
    } catch {
      case e: Throwable => e.printStackTrace()
    }

  case SubscribeAck(Subscribe(topic, self, qos), fail) =>

  case "Done" =>
    context.stop(self)

  case akka.actor.Status.Failure =>
    context.stop(self)
}
"the Actor keeps running"
Which actor do you mean, the one you registered with Sink.actorRef? If so, then to shut it down when the stream shuts down, you need to handle the "Done" and akka.actor.Status.Failure messages in it and invoke context.stop(self) explicitly. The "Done" message is sent when the stream closes successfully, while Status.Failure is sent if there is an error.
For more information, see the Sink.actorRef API docs; they explain the termination semantics.
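One detail worth double-checking in the edited receive block above (an observation added here, not part of the original answer): akka.actor.Status.Failure is a case class, so the pattern case akka.actor.Status.Failure => matches only the companion object, which the stream never sends. Actual failures arrive as Status.Failure(cause) instances, so the handler would need to match the instance:

case "Done" =>
  context.stop(self) // stream completed normally
case akka.actor.Status.Failure(cause) =>
  println(s"stream failed: $cause") // e.g. the TCP connection was reset
  context.stop(self)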
I ended up creating another stage, which passes elements through but emits an additional message to the next flow when the upstream closes:
class TcpStage extends GraphStage[FlowShape[ByteString, ByteString]] {
  val in = Inlet[ByteString]("TCPStage.in")
  val out = Outlet[ByteString]("TCPStage.out")
  override val shape = FlowShape.of(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
    setHandler(out, new OutHandler {
      override def onPull(): Unit = {
        if (isClosed(in)) emitDone()
        else pull(in)
      }
    })

    setHandler(in, new InHandler {
      override def onPush(): Unit = {
        push(out, grab(in))
      }
      override def onUpstreamFinish(): Unit = {
        emitDone()
        completeStage()
      }
    })

    // `emit` (rather than `push`) is safe even when `out` has not been
    // pulled yet, which can be the case when the upstream finishes.
    private def emitDone(): Unit =
      emit(out, ByteString("{ }".getBytes("utf-8")))
  }
}
Which I then use in my flow:
val targetSink = Flow[ByteString]
  .via(new TcpStage())
  .map(_.utf8String)
  .via(new JsonStage())
  .map { json =>
    MqttMessages.jsonToObject(json)
  }
  .to(Sink.actorRef(self, MqttDone))

sourceRef = Some(Flow[ByteString]
  .via(conn.flow)
  .to(targetSink)
  .runWith(actorSource))

PlayFramework 2.3.x: How to use WebSockets to broadcast both to all clients and to specific clients

UPDATE: This was answered here:
https://groups.google.com/forum/#!topic/play-framework/P-tG6b_SEyg
This is my first Play/Scala app and I am struggling a bit. What I am trying to achieve is collaboration between users on specific channels (WebSocket URLs). I am following the example from the book "Play Framework Essentials" and adapting it to my use case. Depending on the request from the client, I would like either to broadcast the information to all the clients talking on this channel, or to send information back only to the client that sent the initial request. The broadcast part is working, but I am unable to figure out how to send information back to just the requesting client.
In my controller I have
def ws(channelId: String) = WebSocket.tryAccept[JsValue] { implicit request =>
  getEnumerator(channelId).map { out =>
    val in = Iteratee.foreach[JsValue] { jsMsg =>
      val id = (jsMsg \ "id").as[String]
      // This works and broadcasts that the user has connected
      connect(id, request.session.get("email").get)
      // Not sure how to return the result of this just to the client of the request
      retrieve(id)
    }
    Right((in, out))
  }
}

private def getEnumerator(id: String): Future[Enumerator[JsValue]] =
  (myActor ? GetEnumerator(id)).mapTo[Enumerator[JsValue]]

private def connect(id: String, email: String): Unit =
  myActor ! Connect(id, email)

// I have a feeling this is an incorrect return type and I need to return a Future[JsValue]
// and somehow feed that to the enumerator
private def retrieve(id: String): Future[Enumerator[JsValue]] =
  (myActor ? RetrieveInfo(id)).mapTo[Enumerator[JsValue]]
In my Actor
class MyActor extends Actor {
  var communicationChannels = Map.empty[String, CommunicationChannel]

  override def receive: Receive = {
    case GetEnumerator(id) => sender() ! getOrCreateCommunicationChannel(id).enumerator
    case Connect(id, email) => getOrCreateCommunicationChannel(id).connect(email)
    case RetrieveInfo(id) => sender() ! Json.toJson(Info(id, "details for " + id))
  }

  private def getOrCreateCommunicationChannel(id: String): CommunicationChannel = {
    communicationChannels.getOrElse(id, {
      val communicationChannel = new CommunicationChannel
      communicationChannels += id -> communicationChannel
      communicationChannel
    })
  }
}

object MyActor {
  def props: Props = Props[MyActor]

  class CommunicationChannel {
    val (enumerator, channel) = Concurrent.broadcast[JsValue]

    def connect(email: String) = {
      channel.push(Json.toJson(Connected(email)))
    }
  }
}
Connect, Connected, Info, etc. are just case classes with Reads and Writes defined.
Can someone please tell me whether it is possible to push a message to specific user(s) with this approach, rather than broadcasting it to everyone, or whether I need to implement this in another way? Also, is there a way to broadcast to everyone except yourself?
Any help will be greatly appreciated. Thanks!
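One common pattern for this (a sketch of the general idea, not necessarily what the linked thread proposes): give each connection its own channel for private replies and interleave it with the shared broadcast enumerator, so `out` carries both. The sketch assumes retrieve is changed to return Future[JsValue] (the actor already replies with a JsValue); privateOut, privateChannel, and Enumerator.interleave are the pieces introduced here:

// Sketch: `broadcast` is the shared channel enumerator; `privateOut` only
// reaches this connection.
def ws(channelId: String) = WebSocket.tryAccept[JsValue] { implicit request =>
  getEnumerator(channelId).map { broadcast =>
    val (privateOut, privateChannel) = Concurrent.broadcast[JsValue]
    val in = Iteratee.foreach[JsValue] { jsMsg =>
      val id = (jsMsg \ "id").as[String]
      connect(id, request.session.get("email").get) // still broadcast to all
      retrieve(id).foreach(privateChannel.push)     // reply only to this client
    }
    Right((in, Enumerator.interleave(broadcast, privateOut)))
  }
}

Broadcasting to everyone except yourself follows the same shape: register each connection's private channel with the actor under an id, and have the actor push to every registered channel except the sender's.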

Play 2.2.2-WebSocket / Equivalent of in.onClose() in Scala

I use Play 2.2.2 with Scala.
I have this code in my controller:
def wsTest = WebSocket.using[JsValue] { implicit request =>
  val (out, channel) = Concurrent.broadcast[JsValue]
  val in = Iteratee.foreach[JsValue] {
    msg => println(msg)
  }
  userAuthenticatorRequest.tracked match { // detecting whether the user is authenticated
    case Some(u) =>
      mySubscriber.start(u.id, channel)
    case _ =>
      channel push Json.toJson("{error: Sorry, you aren't authenticated yet}")
  }
  (in, out)
}
calling this code:
object MySubscriber {
  def start(userId: String, channel: Concurrent.Channel[JsValue]) {
    ctx.getBean(classOf[ActorSystem]).actorOf(Props(classOf[MySubscriber], Seq("comment"), channel), name = "mySubscriber") ! "start"
    // a simple refresh would involve a duplication of this actor!
  }
}

class MySubscriber(redisChannels: Seq[String], channel: Concurrent.Channel[JsValue]) extends RedisSubscriberActor(new InetSocketAddress("localhost", 6379), redisChannels, Nil) with ActorLogging {
  def onMessage(message: Message) {
    println(s"message received: $message")
    channel.push(Json.parse(message.data))
  }

  override def onPMessage(pmessage: PMessage) {
    // not used
    println(s"message received: $pmessage")
  }
}
The problem is that when the user refreshes the page, a new WebSocket is started, resulting in a duplicate of the actor named mySubscriber.
I noticed that Play's Java API has a way to detect a closed connection, in order to shut down an actor. Example:
// When the socket is closed.
in.onClose(new Callback0() {
  public void invoke() {
    // Shutdown the actor
    defaultRoom.shutdown();
  }
});
How can I handle the same thing with the Scala WebSocket API? I want to stop the actor each time the socket is closed.
As @Mik378 suggested, Iteratee.map serves the role of onClose:
val in = Iteratee.foreach[JsValue] {
  msg => println(msg)
} map { _ =>
  println("Connection has closed")
}
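To actually stop the subscriber when the connection closes (the original goal), the map callback can stop the actor. A sketch, assuming start is changed to return the ActorRef; it also omits the fixed name = "mySubscriber" so that a page refresh cannot collide with an existing actor of the same name (the subscriber value is a name introduced here for illustration):

// Sketch: keep the ref so the close callback can stop exactly this subscriber.
val subscriber: ActorRef =
  ctx.getBean(classOf[ActorSystem])
    .actorOf(Props(classOf[MySubscriber], Seq("comment"), channel)) // no fixed name

val in = Iteratee.foreach[JsValue] {
  msg => println(msg)
} map { _ =>
  // runs when the socket closes: stop this connection's subscriber
  ctx.getBean(classOf[ActorSystem]).stop(subscriber)
}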