Terminate Akka-Http Web Socket connection asynchronously - scala

Web Socket connections in Akka Http are treated as an Akka Streams Flow. This seems like it works great for basic request-reply, but it gets more complex when messages should also be pushed out over the websocket. The core of my server looks kind of like:
lazy val authSuccessMessage = Source.fromFuture(someApiCall)

lazy val messageFlow = requestResponseFlow
  .merge(updateBroadcastEventSource)

lazy val handler = codec
  .atop(authGate(authSuccessMessage))
  .join(messageFlow)

handleWebSocketMessages {
  handler
}
Here, codec is a (de)serialization BidiFlow and authGate is a BidiFlow that processes an authorization message and prevents outflow of any messages until authorization succeeds. Upon success, it sends authSuccessMessage as a reply. requestResponseFlow is the standard request-reply pattern, and updateBroadcastEventSource mixes in async push messages.
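For context, the pieces have roughly these shapes (a sketch; only Message is imposed by akka-http, the event types are my own names):
val codec: BidiFlow[Message, IncomingWebsocketEventOrAuthorize,
                    OutgoingWebsocketEvent, Message, NotUsed] = ???
def authGate(authSuccess: Source[OutgoingWebsocketEvent, _])
    : BidiFlow[IncomingWebsocketEventOrAuthorize, IncomingWebsocketEventOrAuthorize,
               OutgoingWebsocketEvent, OutgoingWebsocketEvent, NotUsed] = ???
val requestResponseFlow: Flow[IncomingWebsocketEventOrAuthorize, OutgoingWebsocketEvent, NotUsed] = ???
val updateBroadcastEventSource: Source[OutgoingWebsocketEvent, NotUsed] = ???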
I want to be able to send an error message and terminate the connection gracefully in certain situations, such as bad authorization, someApiCall failing, or a bad request processed by requestResponseFlow. So it seems like I want to be able to asynchronously complete messageFlow with one final message, even though its other constituent flows are still alive.

Figured out how to do this using a KillSwitch.
Updated version
The old version had the problem that it didn't seem to work when triggered by a BidiFlow stage higher up in the stack (such as my authGate). I'm not sure exactly why, but modeling the shutoff as a BidiFlow itself, placed further up the stack, resolved the issue.
val shutoffPromise = Promise[Option[OutgoingWebsocketEvent]]()

/**
 * Shutoff valve for the connection. It is triggered when `shutoffPromise`
 * completes, and sends a final optional termination message if that
 * promise resolves with one.
 */
val shutoffBidi = {
  val terminationMessageSource = Source
    .maybe[OutgoingWebsocketEvent]
    .mapMaterializedValue(_.completeWith(shutoffPromise.future))

  val terminationMessageBidi = BidiFlow.fromFlows(
    Flow[IncomingWebsocketEventOrAuthorize],
    Flow[OutgoingWebsocketEvent].merge(terminationMessageSource)
  )

  val terminator = BidiFlow
    .fromGraph(KillSwitches.singleBidi[IncomingWebsocketEventOrAuthorize, OutgoingWebsocketEvent])
    .mapMaterializedValue { killSwitch =>
      shutoffPromise.future.foreach { _ =>
        println("Shutting down connection")
        killSwitch.shutdown()
      }
    }

  terminationMessageBidi.atop(terminator)
}
Then I apply it just inside the codec:
val handler = codec
  .atop(shutoffBidi)
  .atop(authGate(authSuccessMessage))
  .join(messageFlow)
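To actually trigger the shutoff, whatever detects the failure (the auth gate, the someApiCall error handler, a bad request in requestResponseFlow) completes the promise. A sketch, with a hypothetical OutgoingWebsocketError event type:
// send a final error message, then terminate the connection
shutoffPromise.trySuccess(Some(OutgoingWebsocketError("authorization failed"))) // hypothetical event type
// or terminate without a final message
shutoffPromise.trySuccess(None)
Using trySuccess rather than success avoids an IllegalStateException if more than one failure path races to complete the promise.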
Old version
val shutoffPromise = Promise[Option[OutgoingWebsocketEvent]]()

/**
 * Shutoff valve for the flow of outgoing messages. It is triggered when
 * `shutoffPromise` completes, and sends a final optional termination
 * message if that promise resolves with one.
 */
val shutoffFlow = {
  val terminationMessageSource = Source
    .maybe[OutgoingWebsocketEvent]
    .mapMaterializedValue(_.completeWith(shutoffPromise.future))

  Flow
    .fromGraph(KillSwitches.single[OutgoingWebsocketEvent])
    .mapMaterializedValue { killSwitch =>
      shutoffPromise.future.foreach(_ => killSwitch.shutdown())
    }
    .merge(terminationMessageSource)
}
Then handler looks like:
val handler = codec
  .atop(authGate(authSuccessMessage))
  .join(messageFlow via shutoffFlow)

Related

Wrapping Pub-Sub Java API in Akka Streams Custom Graph Stage

I am working with a Java API from a data vendor providing real time streams. I would like to process this stream using Akka streams.
The Java API has a pub sub design and roughly works like this:
Subscription sub = createSubscription();
sub.addListener(new Listener() {
    public void eventsReceived(List<Event> events) {
        for (Event e : events)
            buffer.enqueue(e);
    }
});
I have tried to embed the creation of this subscription and accompanying buffer in a custom graph stage without much success. Can anyone guide me on the best way to interface with this API using Akka? Is Akka Streams the best tool here?
To feed a Source, you don't necessarily need to use a custom graph stage. Source.queue will materialize as a buffered queue to which you can add elements which will then propagate through the stream.
There are a couple of tricky things to be aware of. The first is that there's some subtlety around materializing the Source.queue so you can set up the subscription. Something like this:
def bufferSize: Int = ???

Source.fromMaterializer { (mat, att) =>
  val (queue, source) = Source.queue[Event](bufferSize).preMaterialize()(mat)
  val subscription = createSubscription()

  subscription.addListener(
    new Listener() {
      def eventsReceived(events: java.util.List[Event]): Unit = {
        import scala.collection.JavaConverters.iterableAsScalaIterable
        import akka.stream.QueueOfferResult._

        iterableAsScalaIterable(events).foreach { event =>
          queue.offer(event) match {
            case Enqueued    => ()  // do nothing
            case Dropped     => ??? // handle a dropped pubsub element, might well do nothing
            case QueueClosed => ??? // presumably cancel the subscription...
          }
        }
      }
    }
  )

  source.withAttributes(att)
}
Source.fromMaterializer is used to get access at each materialization to the materializer (which is what compiles the stream definition into actors). When we materialize, we use the materializer to preMaterialize the queue source so we have access to the queue. Our subscription adds incoming elements to the queue.
The API for this pubsub doesn't seem to support backpressure if the consumer can't keep up. The queue will drop elements it's been handed if the buffer is full: you'll probably want to do nothing in that case, but I've called it out in the match that you should make an explicit decision here.
Dropping the newest element is the synchronous behavior for this queue (there are other queue implementations available, but those will communicate dropping asynchronously which can be really bad for memory consumption in a burst). If you'd prefer something else, it may make sense to have a very small buffer in the queue and attach the "overall" Source (the one returned by Source.fromMaterializer) to a stage which signals perpetual demand. For example, a buffer(downstreamBufferSize, OverflowStrategy.dropHead) will drop the oldest event not yet processed. Alternatively, it may be possible to combine your Events in some meaningful way, in which case a conflate stage will automatically combine incoming Events if the downstream can't process them quickly.
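For instance, a sketch of those two options, where events stands for the Source[Event, _] built above and mergeEvents is a hypothetical domain-specific combiner:
import akka.stream.OverflowStrategy

// keep a small downstream buffer and drop the oldest unprocessed event when it fills up
val dropOldest = events.buffer(16, OverflowStrategy.dropHead)

// or combine backed-up events instead of dropping them
val conflated = events.conflate((older, newer) => mergeEvents(older, newer))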
Great answer! I built something similar. There are also Kamon metrics to monitor queue size etc.
class AsyncSubscriber(projectId: String, subscriptionId: String, metricsRegistry: CustomMetricsRegistry, pullParallelism: Int)(implicit val ec: Executor) {

  private val logger = LoggerFactory.getLogger(getClass)

  def bufferSize: Int = 1000

  def source(): Source[(PubsubMessage, AckReplyConsumer), Future[NotUsed]] = {
    Source.fromMaterializer { (mat, attr) =>
      val (queue, source) = Source.queue[(PubsubMessage, AckReplyConsumer)](bufferSize).preMaterialize()(mat)

      val receiver: MessageReceiver = { (message: PubsubMessage, consumer: AckReplyConsumer) =>
        metricsRegistry.inputEventQueueSize.update(queue.size())
        queue.offer((message, consumer)) match {
          case QueueOfferResult.Enqueued =>
            metricsRegistry.inputQueueAddEventCounter.increment()
          case QueueOfferResult.Dropped =>
            metricsRegistry.inputQueueDropEventCounter.increment()
            consumer.nack()
            logger.warn("Buffer is full, message nacked. Pubsub should retry; don't panic. If this happens too often, we should tweak the buffer size or the autoscaler.")
          case QueueOfferResult.Failure(ex) =>
            metricsRegistry.inputQueueDropEventCounter.increment()
            consumer.nack()
            logger.error(s"Failed to offer message with id=${message.getMessageId()}", ex)
          case QueueOfferResult.QueueClosed =>
            logger.error("Destination queue closed. Something went terribly wrong. Shutting down the JVM.")
            consumer.nack()
            mat.shutdown()
            sys.exit(1)
        }
      }

      val subscriptionName = ProjectSubscriptionName.of(projectId, subscriptionId)
      val subscriber = Subscriber.newBuilder(subscriptionName, receiver).setParallelPullCount(pullParallelism).build
      subscriber.startAsync().awaitRunning()

      source.withAttributes(attr)
    }
  }
}

Race condition when dynamically adding websocket handlers

I'm writing a websocket server with Netty and I seem to have a race condition in my code.
I have a channel initializer that builds a pipeline consisting of:
ch.pipeline().addLast(new HttpServerCodec())
ch.pipeline().addLast(new HttpObjectAggregator(65536))
ch.pipeline().addLast(new MyServer())
And MyServer works as follows:
if it receives a websocket upgrade request, it tries to authenticate the request
if authentication fails, it returns a bad request response
if authentication succeeds, it tries to:
add the websocket handlers followed by my custom logic handler, finish the handshake, and establish the websocket connection
This is done using the following code:
awareLogger.debug(log"upgrading to websocket")(logContext)

ctx.pipeline()
  .addLast(new WebSocketServerProtocolHandler(route, true))
  .addLast(new WebSocketFrameAggregator(65536))
  .addLast(new MyWebsocketLogic(logContext))

ctx.fireChannelRead(httpRequest)
val _ = awareLogger.debug(log"upgraded to websocket")(logContext)
It's trying to fireChannelRead(httpRequest) in hopes that WebSocketServerProtocolHandler will intercept it and finish the handshake.
My issue is that the httpRequest sometimes seems to be propagated all the way down to the MyWebsocketLogic handler, and the handshake and connection fail to be established.
Am I doing something obviously wrong? It's almost like I have some kind of race condition in the code.
The issue was that:
awareLogger.debug(log"upgrading to websocket")(logContext)

ctx.pipeline()
  .addLast(new WebSocketServerProtocolHandler(route, true))
  .addLast(new WebSocketFrameAggregator(65536))
  .addLast(new MyWebsocketLogic(logContext))

ctx.fireChannelRead(httpRequest)
val _ = awareLogger.debug(log"upgraded to websocket")(logContext)
was called on a different thread than the one assigned to the given ctx.
I was able to fix this by applying the suggestion from Norman above, that is, moving the pipeline modification onto the channel's EventLoop:
ctx.channel().eventLoop().execute { () =>
  val _ = ctx
    .pipeline()
    .addLast(new WebSocketServerProtocolHandler(route, true))
    .addLast(new WebSocketFrameAggregator(65536))
    .addLast(buildWebsocketHandler(logContext, connectionHandler))
  val _ = ctx.fireChannelRead(msg)
}
This seems to work well.

Stop Akka stream Source when web socket connection is closed by the client

I have an akka-http web socket Route with code similar to:
private val wsReader: Route =
  path("v1" / "data" / "ws") {
    log.info("Opening websocket connecting ...")

    val testSource = Source
      .repeat("Hello")
      .throttle(1, 1.seconds)
      .map(x => {
        println(x)
        x
      })
      .map(TextMessage.Strict)
      .limit(1000)

    extractUpgradeToWebSocket { upgrade ⇒
      complete(upgrade.handleMessagesWithSinkSource(Sink.ignore, testSource))
    }
  }
Everything works fine (the client receives one test message every second). The only problem is that I don't understand how to stop/close the Source (testSource) when the client closes the web socket connection.
You can see (from the println) that the source continues to produce elements even when the web socket is down.
How can I detect a client disconnection?
handleMessagesWithSinkSource is implemented as:
/**
 * The high-level interface to create a WebSocket server based on "messages".
 *
 * Returns a response to return in a request handler that will signal the
 * low-level HTTP implementation to upgrade the connection to WebSocket and
 * use the supplied inSink to consume messages received from the client and
 * the supplied outSource to produce message to sent to the client.
 *
 * Optionally, a subprotocol out of the ones requested by the client can be chosen.
 */
def handleMessagesWithSinkSource(
    inSink: Graph[SinkShape[Message], Any],
    outSource: Graph[SourceShape[Message], Any],
    subprotocol: Option[String] = None): HttpResponse =
  handleMessages(Flow.fromSinkAndSource(inSink, outSource), subprotocol)
This means the sink and the source are independent, and indeed the source should keep producing elements even when the client closes the incoming side of the connection. It should stop when the client resets the connection completely, though.
To stop producing outgoing data as soon as the incoming connection is closed, you may use Flow.fromSinkAndSourceCoupled, so:
val socket = upgrade.handleMessages(
  Flow.fromSinkAndSourceCoupled(inSink, outSource),
  subprotocol = None
)
One way is to use KillSwitches to handle testSource shutdown.
private val wsReader: Route =
  path("v1" / "data" / "ws") {
    logger.info("Opening websocket connecting ...")

    val sharedKillSwitch = KillSwitches.shared("my-kill-switch")

    val testSource =
      Source
        .repeat("Hello")
        .throttle(1, 1.seconds)
        .map(x => {
          println(x)
          x
        })
        .map(TextMessage.Strict)
        .limit(1000)
        .via(sharedKillSwitch.flow)

    extractUpgradeToWebSocket { upgrade ⇒
      val inSink = Sink.onComplete(_ => sharedKillSwitch.shutdown())
      val outSource = testSource
      val socket = upgrade.handleMessagesWithSinkSource(inSink, outSource)
      complete(socket)
    }
  }

How to use Flink streaming to process Data stream of Complex Protocols

I'm using Flink streaming to handle data traffic logs in a 3G network (GPRS Tunnelling Protocol), and I'm having trouble synthesizing the information that belongs to a single user session.
For example: how do I match the start and end of one session? Is Flink streaming suited to handling complex protocols like this?
P.S.:
We capture data exchanged between the SGSN and GGSN in a 3G network (the GTP protocol with GTP-C/U messages). A session is started when the SGSN sends a CreateReq(TEID, Seq, IMSI, TEID_dl, TEID_data_dl) message and the GGSN responds with a CreateRsp(TEID_dl, Seq, TEID_ul, TEID_data_ul) message.
After the session is established, other GTP-C messages (e.g. UpdateReq, DeleteReq) sent from the SGSN to the GGSN use TEID_ul and the response messages use TEID_dl; GTP-U messages use TEID_data_ul (SGSN -> GGSN) and TEID_data_dl (GGSN -> SGSN). GTP-U messages contain information such as AppID (facebook, twitter, web), URL, ...
Finally, I want to handle the continuous log data stream and correlate the GTP-C and GTP-U messages of the same user (IMSI) to build a report.
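The message types involved look roughly like this (a sketch; the field types are guesses based on the code below):
case class GtpHeader(teid: String, seqNum: String, time: Long)
case class CreateReq(header: GtpHeader, imsi: String, teid_dl: String, teid_ddl: String, rat: String, apn: String)
case class CreateRsp(header: GtpHeader, teid_upl: String, teid_dupl: String)
case class Session(time: Long, imsi: String, teid_dl: String, teid_ddl: String, teid_upl: String, teid_dupl: String, rat: String, apn: String)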
I've tried this:
val sessions = createReqs.connect(createRsps).flatMap(new CoFlatMapFunction[CreateReq, CreateRsp, Session] {

  // holds CreateReqs indexed by (teid_dl, seq)
  private val createReqs = mutable.HashMap.empty[(String, String), CreateReq]
  // holds CreateRsps indexed by (teid, seq)
  private val createRsps = mutable.HashMap.empty[(String, String), CreateRsp]

  override def flatMap1(req: CreateReq, out: Collector[Session]): Unit = {
    val key = (req.teid_dl, req.header.seqNum)
    val oRsp = createRsps.get(key)
    if (!oRsp.isEmpty) {
      val rsp = oRsp.get
      println("OK")
      out.collect(new Session(rsp.header.time, req.imsi, req.teid_dl, req.teid_ddl, rsp.teid_upl, rsp.teid_dupl, req.rat, req.apn))
      createRsps.remove(key)
    } else {
      createReqs.put(key, req)
    }
  }

  override def flatMap2(rsp: CreateRsp, out: Collector[Session]): Unit = {
    val key = (rsp.header.teid, rsp.header.seqNum)
    val oReq = createReqs.get(key)
    if (!oReq.isEmpty) {
      val req = oReq.get
      out.collect(new Session(rsp.header.time, req.imsi, req.teid_dl, req.teid_ddl, rsp.teid_upl, rsp.teid_dupl, req.rat, req.apn))
      createReqs.remove(key)
    } else {
      createRsps.put(key, rsp)
    }
  }
}).print()
This code always returns an empty result, even though the input stream contains CreateReq and CreateRsp messages of the same session, and they appear very close together (within 1 second). When I debug, oReq.isEmpty == true every time.
What am I doing wrong?
To be honest it is a bit difficult to see through the telco specifics here, but if I understand correctly you have at least 3 streams, the first two being the CreateReq and the CreateRsp streams.
To detect the establishment of a session I would use the ConnectedDataStream abstraction to share state between the two aforementioned streams. Check out this example for usage or the related Flink docs.
Is this what you are trying to achieve?
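A minimal sketch of that idea with keyed state, assuming the CreateReq/CreateRsp/Session types from the question; keying both streams by the correlation key lets Flink manage the pending-request state instead of a plain HashMap:
import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

val sessions = createReqs
  .connect(createRsps)
  // co-locate requests and responses that share the same correlation key
  .keyBy(req => (req.teid_dl, req.header.seqNum), rsp => (rsp.header.teid, rsp.header.seqNum))
  .flatMap(new RichCoFlatMapFunction[CreateReq, CreateRsp, Session] {
    private var pendingReq: ValueState[CreateReq] = _
    private var pendingRsp: ValueState[CreateRsp] = _

    override def open(parameters: Configuration): Unit = {
      pendingReq = getRuntimeContext.getState(new ValueStateDescriptor("pendingReq", classOf[CreateReq]))
      pendingRsp = getRuntimeContext.getState(new ValueStateDescriptor("pendingRsp", classOf[CreateRsp]))
    }

    override def flatMap1(req: CreateReq, out: Collector[Session]): Unit = {
      val rsp = pendingRsp.value()
      if (rsp != null) {
        out.collect(new Session(rsp.header.time, req.imsi, req.teid_dl, req.teid_ddl, rsp.teid_upl, rsp.teid_dupl, req.rat, req.apn))
        pendingRsp.clear()
      } else pendingReq.update(req)
    }

    override def flatMap2(rsp: CreateRsp, out: Collector[Session]): Unit = {
      val req = pendingReq.value()
      if (req != null) {
        out.collect(new Session(rsp.header.time, req.imsi, req.teid_dl, req.teid_ddl, rsp.teid_upl, rsp.teid_dupl, req.rat, req.apn))
        pendingReq.clear()
      } else pendingRsp.update(rsp)
    }
  })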

scalaz-stream how to implement `ask-then-wait-reply` tcp client

I want to implement a client app that first sends a request to the server and then waits for its reply (similar to HTTP).
My client process may be:
val topic = async.topic[ByteVector]
val client = topic.subscribe
Here is the API:
trait Client {
  val incoming = tcp.connect(...)(client)
  val reqBus = topic.publish

  def ask(req: ByteVector): Task[Throwable \/ ByteVector] = {
    (tcp.writes(req).flatMap(_ => tcp.reads(1024))).to(reqBus)
    ???
  }
}
Then, how do I implement the remaining part of ask?
Usually, the implementation is done by publishing the message via a sink and then awaiting some sort of reply on some source, like your topic.
Actually we have a lot of idioms like this in our code:
def reqRply[I, O, O2](src: Process[Task, I], sink: Sink[Task, I], reply: Process[Task, O])(pf: PartialFunction[O, O2]): Process[Task, O2] = {
  merge.mergeN(Process(reply, (src to sink).drain)).collectFirst(pf)
}
Essentially this first hooks into the reply stream to await any resulting O confirming our request was sent. Then we publish the message I and consult pf for any incoming O to be eventually translated to O2, and then terminate.
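Applied to your ask, a hedged sketch (it assumes the first message arriving on client is the reply; a real correlation predicate would go in the partial function):
import scalaz.{-\/, \/, \/-}
import scalaz.concurrent.Task
import scalaz.stream.Process
import scodec.bits.ByteVector

// inside the Client trait from the question, next to reqBus and client
def ask(req: ByteVector): Task[Throwable \/ ByteVector] =
  reqRply(Process.emit(req), reqBus, client) { case rsp => rsp } // take the first reply as-is
    .runLast                                                     // Task[Option[ByteVector]]
    .attempt                                                     // Task[Throwable \/ Option[ByteVector]]
    .map {
      case \/-(Some(rsp)) => \/-(rsp)
      case \/-(None)      => -\/(new Exception("connection closed before any reply"): Throwable)
      case -\/(err)       => -\/(err)
    }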