Akka Streams' ActorPublisher as a Source for web response - how back-pressure works - scala

I use akka-streams' ActorPublisher actor as a streaming per-connection Source of data being sent to an incoming WebSocket or HTTP connection.
ActorPublisher's contract is that downstream regularly requests data by signalling demand: the number of elements it can accept. I am not supposed to send more elements while the demand is 0. I observe that if I buffer elements while the consumer is slow, the buffer size fluctuates between 1 and 60, mostly staying around 40-50.
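For reference, here is a simplified sketch of how my publisher honours that contract (illustrative, not my exact code): it buffers incoming elements and only calls onNext while totalDemand > 0.

import akka.actor.Props
import akka.stream.actor.ActorPublisher
import akka.stream.actor.ActorPublisherMessage.{Cancel, Request}
import scala.collection.mutable

class ChunkPublisher extends ActorPublisher[String] {
  private val buffer = mutable.Queue.empty[String]

  def receive = {
    case msg: String =>
      buffer.enqueue(msg) // hold on to data while there is no demand
      deliver()
    case Request(_) => // downstream signalled additional demand
      deliver()
    case Cancel =>
      context.stop(self)
  }

  // onNext is only legal while totalDemand > 0
  private def deliver(): Unit =
    while (totalDemand > 0 && buffer.nonEmpty) onNext(buffer.dequeue())
}

object ChunkPublisher {
  val props: Props = Props[ChunkPublisher]
}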
To stream, I use akka-http's ability to set WebSocket output and HttpResponse data to a Source of Messages (or ByteStrings).
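The wiring looks roughly like this (simplified; ChunkPublisher stands in for my actual publisher):

import akka.http.scaladsl.model.{ContentTypes, HttpEntity, HttpResponse}
import akka.stream.scaladsl.Source
import akka.util.ByteString

val data = Source.actorPublisher[String](ChunkPublisher.props).map(ByteString(_))
val response = HttpResponse(
  entity = HttpEntity.Chunked.fromData(ContentTypes.`text/plain(UTF-8)`, data)
)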
I wonder how the back-pressure works in this case, when I'm streaming data to a client over the network. How exactly are these demand numbers calculated? Does it take into account what is happening at the network level?

The closest I could find for your question "how the back-pressure works in this case" is from the documentation:
Akka HTTP is streaming all the way through, which means that the back-pressure mechanisms enabled by Akka Streams are exposed through all layers - from the TCP layer, through the HTTP server, all the way up to the user-facing HttpRequest and HttpResponse and their HttpEntity APIs.
As to "how these numbers are calculated", I believe that is specified in the configuration settings.

Related

Kafka Streams and RPC: is calling REST service in map() operator considered an anti-pattern?

The naive approach to implementing this use case - enriching an incoming stream of events stored in Kafka with reference data - is to call, in a map() operator, an external REST service that provides the reference data, for each incoming event.
eventStream.map((key, event) -> new KeyValue<>(key, /* query the external service here, then return the enriched event */));
Another approach is to have a second stream containing the reference data, store it in a KTable, which acts as a lightweight embedded "database", and then join the main event stream against it.
KStream<String, Object> eventStream = builder.stream(..., "event-topic");
KTable<String, Object> referenceDataTable = builder.table(..., "reference-data-topic");
KStream<String, Object> enrichedEventStream = eventStream
    .leftJoin(referenceDataTable, (event, referenceData) -> /* return the enriched event */)
    .map((key, enrichedEvent) -> new KeyValue<>(/* new key */, enrichedEvent));
enrichedEventStream.to("enriched-event-topic", ...);
Can the "naive" approach be considered an anti-pattern? Can the "KTable" approach be recommended as the preferred one?
Kafka can easily manage millions of messages per minute. A service that is called from the map() operator would need to be capable of handling that load too, and would also need to be highly available. These are extra requirements for the service implementation. But if the service satisfies these criteria, can the "naive" approach be used?
Yes, it is ok to do RPC inside Kafka Streams operations such as map(). You just need to be aware of the pros and cons of doing so, see below. Also, you should do any such RPC calls synchronously from within your operations (I won't go into details here why; if needed, I'd suggest creating a new question).
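For concreteness, here is a sketch of such a synchronous call inside an operator. I'm writing it with the Kafka Streams Scala DSL (Kafka 2.x); the service URL and the lookup helper are hypothetical:

import java.net.{HttpURLConnection, URL}
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.Serdes._
import org.apache.kafka.streams.scala.StreamsBuilder

// Hypothetical blocking lookup against an external REST service.
def lookupReferenceData(key: String): String = {
  val conn = new URL(s"http://reference-service.local/lookup/$key")
    .openConnection().asInstanceOf[HttpURLConnection]
  try scala.io.Source.fromInputStream(conn.getInputStream).mkString
  finally conn.disconnect()
}

val builder = new StreamsBuilder
builder
  .stream[String, String]("event-topic")
  // the RPC happens synchronously, inside the operator
  .mapValues((key, event) => s"$event|${lookupReferenceData(key)}")
  .to("enriched-event-topic")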
Pros of doing RPC calls from within Kafka Streams operations:
Your application will fit more easily into an existing architecture, e.g. one where the use of REST APIs and request/response paradigms is commonplace. This means you can make progress more quickly on a first proof-of-concept or MVP.
The approach is, in my experience, easier to understand for many developers (particularly those who are just starting out with Kafka) because they are familiar with doing RPC calls in this manner from their past projects. Think: it helps to move gradually from request-response architectures to event-driven architectures (powered by Kafka).
Nothing prevents you from starting with RPC calls and request-response, and then later migrating to a more Kafka-idiomatic approach.
Cons:
You are coupling the availability, scalability, and latency/throughput of your Kafka Streams powered application to the availability, scalability, and latency/throughput of the RPC service(s) you are calling. This is relevant also for thinking about SLAs.
Related to the previous point, Kafka and Kafka Streams scale very well. If you are running at large scale, your Kafka Streams application might end up DDoS'ing your RPC service(s) because the latter probably can't scale as much as Kafka. You should be able to judge pretty easily whether or not this is a problem for you in practice.
An RPC call (like from within map()) is a side-effect and thus a black box for Kafka Streams. The processing guarantees of Kafka Streams do not extend to such side effects.
Example: Kafka Streams (by default) processes data based on event-time (= based on when an event happened in the real world), so you can easily re-process old data and still get back the same results as when the old data was still new. But the RPC service you are calling during such reprocessing might return a different response than "back then". Ensuring the latter is your responsibility.
Example: In the case of failures, Kafka Streams will retry operations, and it will guarantee exactly-once processing (if enabled) even in such situations. But it can't guarantee, by itself, that an RPC call you are doing from within map() will be idempotent. Ensuring the latter is your responsibility.
Alternatives
In case you are wondering what other alternatives you have: if, for example, you are doing RPC calls for looking up data (e.g. for enriching an incoming stream of events with side/context information), you can address the downsides above by making the lookup data available in Kafka directly. If the lookup data is in MySQL, you can set up a Kafka connector to continuously ingest the MySQL data into a Kafka topic (think: CDC). In Kafka Streams, you can then read the lookup data into a KTable and perform the enrichment of your input stream via a stream-table join.
I suspect most of the advice you hear from the internet is along the lines of, "OMG, if this REST call takes 200ms, how will I ever process 100,000 Kafka messages per second to keep up with my demand?"
Which is technically true: even if you scale up the servers for your REST service, if responses from that app routinely take 200ms - because it talks to a server 70ms away (the speed of light is kinda slow if that server is across the continent from you...) and the calling microservice takes 130ms even if you measure right at the source - then adding hardware won't shrink that per-call latency.
With kstreams the problem may be worse than it appears. Maybe you get 100,000 messages a second coming into your stream pipeline, but some kstream operator flatMaps, and that operation creates 2 messages for every incoming one... so now you really have 200,000 messages a second crashing through your REST server.
BUT maybe you're using Kstreams in an app that has 100 messages a second, or you can partition your data so that you get a message per partition maybe even just once a second. In that case, you might be fine.
Maybe your Kafka data just needs to go somewhere else: i.e. the end of the stream feeds back into a Good Ol' RDBMS. In which case yes, there's some careful balancing to do on the best way to deal with potentially "slow" systems, while making sure you don't DDoS yourself and can work your way out of a backlog.
So is it an anti-pattern? Eh, probably, if your Kafka cluster is LinkedIn-sized. Does it matter for you? That depends on how many messages/second you need to drive, how fast your REST service really is, and how efficiently it can scale (i.e. whether your new kstreams pipeline suddenly delivers 5x the normal traffic to it...).

If websocket messages arrive in order, why is a 'sequence number' needed?

I heard that websocket messages are received in order, because websocket runs over TCP.
Then what is the purpose of 'sequence number'?
This is the explanation of sequence number in websocket.
But I'm wondering why that sequence number is needed if messages are already received in order.
The sequence number allows you to map your requests to responses even if the responses don't come in the order you make them.
HTTP and other relevant protocols support pipelining. Also, there is no need for the responses to be sent back to you in any specific order. Each one may be processed according to its individual cost, or dispatched across a server farm and reassembled in an order that is not predetermined. Either way, if they are out of order you will need a key to map each response back to its request.
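A minimal sketch of that correlation pattern (the types and names are illustrative, not tied to any particular library):

import scala.collection.mutable
import scala.concurrent.{Future, Promise}

final case class Request(seq: Long, payload: String)
final case class Response(seq: Long, payload: String)

// Tag each outgoing request with a fresh sequence number, then use the
// number on each incoming response - whatever order it arrives in - to
// complete the matching Future.
class Correlator(send: Request => Unit) {
  private var nextSeq = 0L
  private val pending = mutable.Map.empty[Long, Promise[String]]

  def call(payload: String): Future[String] = {
    nextSeq += 1
    val p = Promise[String]()
    pending(nextSeq) = p
    send(Request(nextSeq, payload))
    p.future
  }

  def onResponse(r: Response): Unit =
    pending.remove(r.seq).foreach(_.success(r.payload))
}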

Designing a REST service with akka-http and akka-stream

I'm new to akka http & streams and would like to figure out what the most idiomatic implementation is for a REST api. Let's say I need to implement a single endpoint that:
accepts a path parameter and several query string parameters
validates the params
constructs a db query based on the params
executes the query
sends response back to client
From my research I understand this is a good use case for akka-http, and modeling the request -> response flow seems to map well to akka streams with several Flows, but I'd like some clarification:
Does using the akka streams library make sense here?
Is that still true if the database driver making the call does not have an async api?
Do the stream back pressure semantics still hold true if there is a Flow making a blocking call?
How is parallelism handled with the akka streams implementation? Say for example the service experiences 500 concurrent connections, does the streams abstraction simply have a pool of actors under the covers to handle each of these connections?
Edit: This answers most of the questions I had: http://doc.akka.io/docs/akka-http/current/scala/http/handling-blocking-operations-in-akka-http-routes.html
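For anyone else landing here, the gist of that page is to run blocking calls in a Future on a dedicated dispatcher. A minimal sketch (the dispatcher name and queryDb are illustrative):

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import scala.concurrent.Future

object BlockingRouteSketch extends App {
  implicit val system = ActorSystem("rest-demo")
  implicit val materializer = ActorMaterializer()

  // must be configured in application.conf as a thread-pool-executor
  val blockingEc = system.dispatchers.lookup("my-blocking-dispatcher")

  def queryDb(id: String, limit: Int): String =
    s"rows for $id (limit $limit)" // placeholder for a blocking JDBC query

  val route =
    path("items" / Segment) { id =>
      parameters("limit".as[Int]) { limit =>
        // the blocking call runs on the dedicated dispatcher, keeping
        // the default dispatcher free for request handling
        complete(Future(queryDb(id, limit))(blockingEc))
      }
    }

  Http().bindAndHandle(route, "localhost", 8080)
}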

Sending large files with Spray

I know very similar questions have been asked before. But I don't think the solutions I found on google/stackoverflow are suitable for me.
I started to write some web services with Scala/Spray, and it seems the best way to send large files without consuming large amounts of memory is using the stream marshalling. This way Spray will send http chunks. Two questions:
Is it possible to send the file without using HTTP chunks and without reading the entire file into memory?
AFAIK akka.io only processes one write at a time, meaning it can buffer one write until it has been passed on to the O/S kernel in full. Would it be possible to tell Spray, for each HTTP response, the length of the content? Thereafter Spray would ask for new data (through akka messages) until the entire content length is reached. E.g., I indicate my content length is 100 bytes. Spray sends a message asking for data to my actor, and I provide 50 bytes. Once this data is passed on to the O/S, Spray sends another message asking for new data. I provide the remaining 50 bytes... the response is then complete.
Is it possible to send the file without using HTTP chunks [on the wire]
Yes, you need to enable chunkless streaming. See http://spray.io/documentation/1.2.4/spray-routing/advanced-topics/response-streaming/
Chunkless streaming works regardless of whether you use the Stream marshaller or provide the response as MessageChunks yourself. See the example below.
without reading the entire file into memory
Yes, that should work if you supply data as a Stream[Array[Byte]] or Stream[ByteString].
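A sketch of building such a lazy Stream over a file, so that only one chunk at a time is held in memory (the chunk size is illustrative):

import java.nio.file.{Files, Paths}

def fileChunks(path: String, chunkSize: Int = 64 * 1024): Stream[Array[Byte]] = {
  val in = Files.newInputStream(Paths.get(path))
  def next(): Stream[Array[Byte]] = {
    val buf = new Array[Byte](chunkSize)
    val read = in.read(buf)
    if (read < 0) { in.close(); Stream.empty }
    else buf.take(read) #:: next() // tail is evaluated lazily
  }
  next()
}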
[...] Thereafter Spray would ask for new data [...]
That's actually almost how it already works: if you manually provide the chunks, you can request a custom Ack message that will be delivered back to you when the spray-can layer is able to process the next part. See this example for how to stream from a spray route.
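Schematically, the Ack-driven loop looks like this (a sketch against the spray-can API; ChunkSent and nextChunk are illustrative names):

import akka.actor.{Actor, ActorRef}
import spray.http._

class ChunkedResponder(client: ActorRef) extends Actor {
  case object ChunkSent // our custom Ack

  // start the chunked response and request an Ack once it has been
  // handed off to the spray-can layer
  client ! ChunkedResponseStart(HttpResponse(entity = HttpEntity.Empty)).withAck(ChunkSent)

  def receive = {
    case ChunkSent => // previous part processed; send the next one
      nextChunk() match {
        case Some(bytes) => client ! MessageChunk(HttpData(bytes)).withAck(ChunkSent)
        case None        => client ! ChunkedMessageEnd
      }
  }

  def nextChunk(): Option[Array[Byte]] = None // placeholder data source
}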
I indicate my content length is 100 bytes
A note upfront: in HTTP you don't strictly need to specify a content-length for responses, because a response body can be delimited by closing the connection, which is what spray does if chunkless streaming is enabled. However, if you don't want to close the connection (because you would lose this persistent connection), you can now specify an explicit Content-Length header in your ChunkedResponseStart message (see #802), which will prevent the closing of the connection.

Spray chunked request throttle incoming data

I am using Spray 1.3, with incoming-auto-chunking-threshold-size set, to allow streaming of incoming requests.
When a very large request comes in from my client, I want to stream it through the app and out to a backing store in chunks, to limit the memory used by the Spray app.
I am finding that Spray will slurp in the request as fast as it can, creating MessageChunks of the configured size and passing them to my app.
If the backend store is slow, then this results in Spray caching most of the request in local memory, defeating the streaming design.
Is there any way I can get Spray to block or throttle the request stream so that the input data rate matches the output data rate, to cap my app's memory usage?
Relevant spray code:
The HttpMessagePartParser.parseBodyWithAutoChunking method is the one which breaks up the request byte stream into MessageChunk objects. It does so greedily, consuming as many chunks as are immediately available, then returning a NeedMoreData object.
The request pipeline accepts NeedMoreData in the handleParsingResult method of the RawPipelineStage, with the following code:
case Result.NeedMoreData(next) ⇒ parser = next // wait for the next packet
... so it looks to me like there is no "pull" control of the chunking stream in Spray, and the framework will always read in the request as fast as it can manage, pushing it out to the app's Actors as MessageChunks. Once a MessageChunk message is in the queue for my Actor, its memory can't be offloaded to disk.
So there is no way to limit the memory used by Spray for a request?
There is a workaround discussed here: https://github.com/spray/spray/issues/281#issuecomment-40455433
This may be addressed in a future spray release.
EDIT: Spray is now Akka HTTP, which implements "Reactive Streams", giving back-pressure all the way to the TCP stream while still being async: https://groups.google.com/forum/#!msg/akka-dev/PPleJEfI5sM/FbeptEYlicoJ
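In Akka HTTP the request entity is exposed as a back-pressured Source, so a slow sink automatically throttles the TCP reads. A minimal sketch (the slow "backing store" is simulated with a sleep):

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Sink
import akka.util.ByteString

object ThrottledUpload extends App {
  implicit val system = ActorSystem("upload-demo")
  implicit val materializer = ActorMaterializer()
  import system.dispatcher

  // A deliberately slow sink standing in for a slow backing store.
  val slowStore = Sink.foreach[ByteString] { chunk => Thread.sleep(100) }

  val route =
    put {
      extractRequestEntity { entity =>
        complete {
          // demand propagates from the sink back to the TCP connection,
          // so the client is slowed to the store's write rate
          entity.dataBytes.runWith(slowStore).map(_ => "stored")
        }
      }
    }

  Http().bindAndHandle(route, "localhost", 8080)
}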