I am using Spray 1.3, with incoming-auto-chunking-threshold-size set, to allow streaming of incoming requests.
When a very large request comes in from my client, I want to stream it through the app and out to a backing store in chunks, to limit the memory used by the Spray app.
I am finding that Spray will slurp in the request as fast as it can, creating MessageChunks of the configured size and passing them to my app.
If the backend store is slow, then this results in Spray caching most of the request in local memory, defeating the streaming design.
Is there any way I can get Spray to block or throttle the request stream so that the input data rate matches the output data rate, to cap my app's memory usage?
Relevant spray code:
The HttpMessagePartParser.parseBodyWithAutoChunking method is the one which breaks up the request byte stream into MessageChunk objects. It does so greedily, consuming as many chunks as are immediately available, then returning a NeedMoreData object.
The request pipeline accepts NeedMoreData in the handleParsingResult method of the RawPipelineStage, with the following code:
case Result.NeedMoreData(next) ⇒ parser = next // wait for the next packet
... so it looks to me like there is no "pull" control of the chunking stream in Spray, and the framework will always read in the request as fast as it can manage, pushing it out to the app's Actors as MessageChunks. Once a MessageChunk message is sitting in my Actor's mailbox, its memory can't be offloaded to disk.
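To illustrate the push model, here is roughly what the receiving side looks like (a sketch; the handler and store names are mine, only the message types are Spray's):

import akka.actor.Actor
import spray.http._

class UploadHandler extends Actor {
  def receive = {
    case ChunkedRequestStart(request) =>
      // first part of the auto-chunked request; headers are available here
      println(s"upload started: ${request.uri}")
    case MessageChunk(data, _) =>
      // chunks arrive as fast as Spray parses them; a slow write here only
      // backs up this actor's mailbox, it does not throttle the TCP stream
      writeToBackingStore(data.toByteArray)
    case _: ChunkedMessageEnd =>
      println("upload complete")
      context.stop(self)
  }

  def writeToBackingStore(bytes: Array[Byte]): Unit = () // hypothetical slow sink
}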
So it appears there is no way to limit the memory used by Spray for a request.
There is a workaround discussed here: https://github.com/spray/spray/issues/281#issuecomment-40455433
This may be addressed in a future spray release.
EDIT: Spray is now Akka HTTP, which uses "Reactive Streams" to apply back-pressure to the TCP stream while still being async: https://groups.google.com/forum/#!msg/akka-dev/PPleJEfI5sM/FbeptEYlicoJ
Related
I need to implement a microservice that loads a ton of data into memory at startup and makes that data available via HTTP GET.
I have been looking at fs2 as an option to make the data available to the web layer via an fs2.Queue.
My concern is that if I use the synchronous queue from fs2, the performance of serving the data might suffer because of the blocking nature of the synchronous queue (on the enqueue operation).
Is this a valid concern?
Also, which Queue abstractions in fs2 are thread-safe? I.e., can I pass any Queue around to multiple threads so that they can all safely take items out of it, without more than one of them taking the same element?
EDIT:
Use case: 10 million records served by the Stream -> many workers (threads) picking work from the Stream via an HTTP endpoint (GET)
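To make the concern concrete, here is a hedged sketch of what I have in mind, written against the fs2 1.x API (the names and sizes are illustrative):

import cats.effect.{ContextShift, IO}
import fs2.Stream
import fs2.concurrent.Queue

object ServeRecords {
  implicit val cs: ContextShift[IO] =
    IO.contextShift(scala.concurrent.ExecutionContext.global)

  val records: Stream[IO, Int] = Stream.range(0, 10000000).covary[IO]

  // A bounded queue avoids the per-element rendezvous of Queue.synchronous
  // while still capping memory; fs2 queues are concurrency-safe, so each
  // element is handed to exactly one dequeuer.
  val served: Stream[IO, Int] =
    Stream.eval(Queue.bounded[IO, Int](1024)).flatMap { q =>
      // the producer runs ahead of the consumers by at most 1024 elements
      q.dequeue.concurrently(records.through(q.enqueue))
    }
}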
I use akka-streams' ActorPublisher actor as a streaming per-connection Source of data being sent to an incoming WebSocket or HTTP connection.
ActorPublisher's contract is to regularly request data by supplying a demand - the number of elements that can be accepted by downstream. I am not supposed to send more elements if the demand is 0. I observe that if I buffer elements when the consumer is slow, the buffer size fluctuates between 1 and 60, mostly staying near 40-50.
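For reference, a minimal sketch of that contract (the buffering policy shown is illustrative, not my production code):

import akka.actor.Props
import akka.stream.actor.ActorPublisher
import akka.stream.actor.ActorPublisherMessage.{Cancel, Request}

import scala.collection.mutable

class DataPublisher extends ActorPublisher[String] {
  val buffer = mutable.Queue.empty[String]

  def receive = {
    case msg: String =>
      buffer.enqueue(msg)
      deliver()
    case Request(_) => // downstream signalled additional demand
      deliver()
    case Cancel =>
      context.stop(self)
  }

  // emit elements only while totalDemand > 0, per the contract
  def deliver(): Unit =
    while (totalDemand > 0 && buffer.nonEmpty)
      onNext(buffer.dequeue())
}

object DataPublisher {
  def props: Props = Props(new DataPublisher)
}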
To stream I use akka-http's ability to set WebSocket output and HttpResponse data to a Source of Messages (or ByteStrings).
I wonder how the back-pressure works in this case - when I'm streaming data to a client over the network. How exactly are these numbers calculated? Does it check what's happening at the network level?
The closest I could find for your question "how the back-pressure works in this case" is from the documentation:
Akka HTTP is streaming all the way through, which means that the back-pressure mechanisms enabled by Akka Streams are exposed through all layers – from the TCP layer, through the HTTP server, all the way up to the user-facing HttpRequest and HttpResponse and their HttpEntity APIs.
As to "how these numbers are calculated", I believe that is specified in the configuration settings.
I found that Slick 3.0 introduced a new feature called streaming:
http://slick.typesafe.com/doc/3.0.0-RC1/database.html#streaming
I'm not familiar with Akka. Streaming seems to produce a lazy or asynchronous value, but it is not clear to me why it is useful, or when it would be useful.
Does anyone have ideas about this?
So let's imagine the following use case:
A "slow" client wants to get a large dataset from the server. The client sends a request to the server which loads all the data from the database, stores it in memory and then passes it down to the client.
And here we're faced with a problem: the client handles the data more slowly than we'd like => we can't release the memory => this may result in an out-of-memory error.
Reactive streams solve this problem by using backpressure. We can wrap Slick's publisher in an Akka source and then "feed" it to the client via Akka HTTP.
The thing is that this backpressure is propagated through TCP via Akka HTTP down to the publisher that represents the database query.
That means that we only read from the database as fast as the client can consume the data.
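A minimal sketch of that wiring (the database config, table, and route names are illustrative; assumes Slick 3.0 and akka-http):

import akka.actor.ActorSystem
import akka.http.scaladsl.Http
import akka.http.scaladsl.model.{ContentTypes, HttpEntity}
import akka.http.scaladsl.server.Directives._
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import akka.util.ByteString
import slick.driver.H2Driver.api._

object StreamingRecords extends App {
  implicit val system = ActorSystem("streaming")
  implicit val materializer = ActorMaterializer()

  val db = Database.forConfig("mydb") // hypothetical config entry

  val route = path("records") {
    get {
      // db.stream returns a Reactive Streams Publisher; Source.fromPublisher
      // wraps it so that demand from the client's TCP connection back-pressures
      // the database query itself - no row is fetched faster than it is consumed
      val rows = Source.fromPublisher(db.stream(sql"select name from records".as[String]))
      complete(HttpEntity(
        ContentTypes.`text/plain(UTF-8)`,
        rows.map(name => ByteString(name + "\n"))
      ))
    }
  }

  Http().bindAndHandle(route, "localhost", 8080)
}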
P.S. This is just one small area where reactive streams can be applied.
You can find more information here:
http://www.reactive-streams.org/
https://youtu.be/yyz7Keg1w9E
https://youtu.be/9S-4jMM1gqE
Given this example code from Play documentation:
def upload = Action(parse.temporaryFile) { request =>
  request.body.moveTo(new File("/tmp/picture/uploaded"))
  Ok("File uploaded")
}
How will 100 simultaneous slow upload requests be handled (in terms of threads)?
Will the uploaded file be buffered in memory or streamed directly to disk?
How will 100 simultaneous slow upload requests be handled (in terms of threads)?
It depends. The number of actual threads being used isn't really relevant. By default, Play uses a number of threads equal to the number of CPU cores available. But this doesn't mean that if you have 4 cores, you're limited to 4 concurrent processes at once. HTTP requests in Play are processed asynchronously in a special internal ExecutionContext provisioned by Akka. Processes running in an ExecutionContext can share threads, so long as they are non-blocking, which is abstracted away by Akka. All of this can be configured in different ways. See Understanding Play Thread Pools.
The Iteratee that consumes the client data must do some blocking in order to write the file chunks to disk, but done in small (and fast) enough chunks, this shouldn't cause other file uploads to become blocked.
What I would be more worried about is the amount of disk I/O your server can handle. 100 slow uploads might be okay, but you can't really say without benchmarking. At some point you will run into trouble when the client input exceeds the rate that your server can write to disk. This will also not work in a distributed environment. I almost always choose to bypass the Play server entirely and direct uploads to Amazon S3.
Will the uploaded file be buffered in memory or streamed directly to disk?
All temporary files are streamed to disk. Under the hood, all data sent from the client to the server is read using the iteratee library asynchronously. For multipart uploads, it is no different. The client data is consumed by an Iteratee, which streams the file chunks to a temporary file on disk. So when using the parse.temporaryFile BodyParser, request.body is just a handle to a temporary file on disk, and not the file stored in memory.
It is worth noting that while Play can handle those requests in a non-blocking manner, moving the file once complete will block. That is, request.body.moveTo(...) will block the controller function until the move is complete. This means that if several of the 100 uploads complete at about the same time, Play's internal ExecutionContext for handling requests can quickly become overloaded. The underlying API of moveTo is also deprecated in Play 2.3, as it uses FileInputStream and FileOutputStream to copy the TemporaryFile to a permanent location. The docs advise you to use the Java 7 File API instead, as it is much more efficient.
This may be a little crude, but something more like this should do it:
import java.io.File
import java.nio.file.Files

def upload = Action(parse.temporaryFile) { request =>
  // Files.copy goes through NIO, avoiding the manual stream copy that the
  // deprecated moveTo performs
  Files.copy(request.body.file.toPath, new File("/tmp/picture/uploaded").toPath)
  Ok("File uploaded")
}
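If several of those copies can land at the same time, a further hedged refinement (Play 2.3-era API; the dispatcher name contexts.blocking-io is an assumption and must be defined in application.conf) is to run the blocking copy on a dedicated thread pool, as the thread-pool documentation suggests:

import java.io.File
import java.nio.file.Files

import play.api.Play.current
import play.api.libs.concurrent.Akka
import play.api.mvc._

import scala.concurrent.Future

object Upload extends Controller {
  // dedicated pool for blocking I/O, looked up from application.conf
  implicit val blockingIoEc =
    Akka.system.dispatchers.lookup("contexts.blocking-io")

  def upload = Action.async(parse.temporaryFile) { request =>
    Future {
      // the copy still blocks, but on this pool rather than on Play's
      // request-handling context
      Files.copy(request.body.file.toPath, new File("/tmp/picture/uploaded").toPath)
      Ok("File uploaded")
    }
  }
}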
I know very similar questions have been asked before. But I don't think the solutions I found on google/stackoverflow are suitable for me.
I started to write some web services with Scala/Spray, and it seems the best way to send large files without consuming large amounts of memory is using stream marshalling. This way Spray will send HTTP chunks. Two questions:
Is it possible to send the file without using HTTP chunks and without reading the entire file into memory?
AFAIK akka.io only processes one write at a time, meaning it can buffer one write until it has been passed on to the O/S kernel in full. Would it be possible to tell Spray, for each HTTP response, the length of the content? Spray would then ask for new data (through Akka messages) until the entire content length is fulfilled. E.g., I indicate my content length is 100 bytes. Spray sends a message asking for data to my actor, and I provide 50 bytes. Once this data is passed on to the O/S, Spray sends another message asking for new data. I provide the remaining 50 bytes, and the response is then complete.
Is it possible to send the file without using HTTP chunks [on the wire]
Yes, you need to enable chunkless streaming. See http://spray.io/documentation/1.2.4/spray-routing/advanced-topics/response-streaming/
Chunkless streaming works regardless of whether you use the Stream marshaller or provide the response as MessageChunks yourself. See the example below.
without reading the entire file into memory
Yes, that should work if you supply data as a Stream[Array[Byte]] or Stream[ByteString].
[...] Thereafter Spray would ask for new data [...]
That's actually almost how it already works: if you manually provide the chunks, you can request a custom Ack message that will be delivered back to you when the spray-can layer is able to process the next part. See this example for how to stream from a spray route.
I indicate my content length is 100 bytes
A note upfront: In HTTP you don't strictly need to specify a content-length for responses, because a response body can be delimited by closing the connection, which is what spray does if chunkless streaming is enabled. However, if you don't want the connection to be closed (because you would lose this persistent connection), you can now specify an explicit Content-Length header in your ChunkedResponseStart message (see #802), which will prevent the closing of the connection.
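Putting those pieces together, here is a hedged sketch of an actor that streams a response with explicit acks and an explicit Content-Length (the ack message, chunk source, and lengths are illustrative; assumes spray.can.server.chunkless-streaming = on):

import akka.actor.{Actor, ActorRef}
import spray.http._
import spray.http.HttpHeaders.`Content-Length`

case object ChunkSent // custom ack we ask spray-can to deliver after each part

class ResponseStreamer(responder: ActorRef,
                       chunks: Iterator[Array[Byte]],
                       totalLength: Long) extends Actor {

  // start the response; with chunkless streaming the parts are written as raw
  // bytes, and the explicit Content-Length keeps the connection alive
  responder ! ChunkedResponseStart(
    HttpResponse(headers = List(`Content-Length`(totalLength)))
  ).withAck(ChunkSent)

  def receive = {
    case ChunkSent if chunks.hasNext =>
      // the previous part reached the O/S; hand spray-can the next one
      responder ! MessageChunk(chunks.next()).withAck(ChunkSent)
    case ChunkSent =>
      responder ! ChunkedMessageEnd
      context.stop(self)
  }
}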