Problem: limit the download rate of binary files.
def test = {
Logger.info("Call test action")
val file = new File("/home/vidok/1.jpg")
val fileIn = new FileInputStream(file)
response.setHeader("Content-type", "application/force-download")
response.setHeader("Content-Disposition", "attachment; filename=\"1.jpg\"")
response.setHeader("Content-Length", file.length + "")
val bufferSize = 1024 * 1024
val bb = new Array[Byte](bufferSize)
val bis = new java.io.BufferedInputStream(fileIn)
var bytesRead = bis.read(bb, 0, bufferSize)
while (bytesRead > 0) {
bytesRead = bis.read(bb, 0, bufferSize)
//sleep(1000)?
response.writeChunk(bytesRead)
}
}
But it only works for text files. How can I make it work with binary files?
You've got the basic idea right: each time you've read a certain number of bytes (which are stored in your buffer) you need to:
evaluate how fast you've been reading (= X B/ms)
calculate the difference between X and how fast you should have been reading (= Y ms)
use sleep(Y) on the downloading thread if needed to slow the download rate down
There's already a great question about this right here that should have everything you need. I think especially the ThrottledInputStream solution (which is not the accepted answer) is rather elegant.
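The three steps above can be sketched in plain Java as follows. This is only a minimal illustration, not the ThrottledInputStream from the linked question: `ThrottledCopy`, its `copy` method and the 8 KB buffer size are made-up names and values, and the sleep is computed from the average rate since the start of the transfer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies a stream while capping the average transfer rate.
final class ThrottledCopy {

    // Copies everything from in to out, sleeping whenever the transfer is
    // ahead of the allowed rate. Returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, long bytesPerSecond)
            throws IOException, InterruptedException {
        byte[] buffer = new byte[8 * 1024];
        long start = System.nanoTime();
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            // Write exactly the bytes that were read, as raw bytes.
            out.write(buffer, 0, read);
            total += read;
            // How long the transfer should have taken so far at the target rate.
            long expectedNanos = (long) (total * 1e9 / bytesPerSecond);
            long elapsedNanos = System.nanoTime() - start;
            long sleepMillis = (expectedNanos - elapsedNanos) / 1_000_000L;
            if (sleepMillis > 0) {
                Thread.sleep(sleepMillis);
            }
        }
        out.flush();
        return total;
    }
}
```

Note that the buffer is written with an explicit offset and length, so binary content passes through unchanged. A smaller buffer gives a smoother rate at the cost of more sleep calls.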
A couple of points to keep in mind:
Downloading using 1 thread for everything is the simplest way, however it's also the least efficient way if you want to keep serving requests.
Usually, you'll want to at least offload the actual downloading of a file to its own separate thread.
To speed things up: consider downloading files in chunks (using HTTP Content-Range) and Java NIO. However, keep in mind that this will make things a lot more complex.
I wouldn't implement something which any good webserver should be able to do for me. In enterprise systems this kind of thing is normally handled by a web entry server or firewall. But if you have to do this, then the answer by tmbrggmn looks good to me. NIO is a good tip.
Related
I am looking to use Scala to get faster performance when accessing and downloading an Amazon S3 file. The file comes in as an InputStream and it is large (over 30 million rows).
I have tried this in Python (pandas), but it is too slow. I am hoping to increase the speed with Scala.
So far I am doing this, but it is too slow. Have I hit a bottleneck in the stream itself, so that I cannot read data from it any faster than I do with the code below?
val obj = amazonS3Client.getObject(bucket_name, file_name)
val reader = new BufferedReader(new InputStreamReader(obj.getObjectContent()))
while(line != null) {
list_of_lines = list_of_lines ::: List(line)
line = reader.readLine
}
I'm looking for serious speed improvement compared to the above approach. Thanks.
I suspect that your performance bottleneck is appending to a (linked) List in that loop:
list_of_lines = list_of_lines ::: List(line)
With 30 million lines, it should take a few hundred trillion times longer to process all the lines than it takes to process one line. If the first iteration of that loop is 1ns, then this should take several days (roughly 4.5 × 10^5 seconds) to execute.
Switching to prepending to the List and then reversing at the end should improve the speed of your loop by a factor of more than a million:
while(line != null) {
list_of_lines = line :: list_of_lines
line = reader.readLine
}
list_of_lines = list_of_lines.reverse
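To make the difference concrete, here is a rough count of element copies for n = 3 × 10^7 lines. It assumes only that `:::` copies its entire left operand on every call, while `::` is constant time and `reverse` is a single pass:

```latex
\begin{align*}
\text{append with } {:::}\colon \quad & \sum_{k=0}^{n-1} k = \frac{n(n-1)}{2}
    \approx \frac{(3 \times 10^{7})^{2}}{2} = 4.5 \times 10^{14} \\
\text{prepend with } {::} \text{ then reverse}\colon \quad & n + n = 6 \times 10^{7} \\
\text{ratio}\colon \quad & \frac{4.5 \times 10^{14}}{6 \times 10^{7}} = 7.5 \times 10^{6}
\end{align*}
```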
List is also notoriously memory-inefficient, so for this many elements, it may also be worth doing something like this (which is also more idiomatic Scala):
import scala.io.{ Codec, Source }
val obj = amazonS3Client.getObject(bucket_name, file_name)
val source = Source.fromInputStream(obj.getObjectContent())(Codec.defaultCharsetCodec)
val lines = source.getLines().toVector
Vector being more memory-efficient than List should dramatically reduce GC thrashing.
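If you end up doing this from Java instead, the same idea is to collect into an array-backed list, where each append is amortized constant time. A minimal sketch, assuming UTF-8 input; `LineCollector` is a made-up name, and the stream could be anything, including the one returned by `getObjectContent()`:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads every line of a stream into an ArrayList.
final class LineCollector {

    static List<String> readAllLines(InputStream in) throws IOException {
        // UTF-8 is an assumption here; use the charset the file was written in.
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            List<String> lines = new ArrayList<>();
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line); // amortized O(1), unlike appending with :::
            }
            return lines;
        }
    }
}
```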
The best way to achieve better performance is using the TransferManager provided by the AWS Java SDKs. It's a high level file transfer manager that will automatically parallelise downloads. I'd recommend using SDK v2, but the same can be done with SDK v1. Though, be aware, SDK v1 comes with limitations and only multipart files can be downloaded in parallel.
You need the following dependency (assuming you are using sbt with Scala). But note, there's no benefit of using Scala over Java with the TransferManager.
libraryDependencies += "software.amazon.awssdk" % "s3-transfer-manager" % "2.17.243-PREVIEW"
Example (Java):
S3TransferManager transferManager = S3TransferManager.create();
FileDownload download =
transferManager.downloadFile(b -> b.destination(Paths.get("myFile.txt"))
.getObjectRequest(req -> req.bucket("bucket").key("key")));
download.completionFuture().join();
I recommend reading more on the topic here: Introducing Amazon S3 Transfer Manager in the AWS SDK for Java 2.x
I want to be able to send BufferedImages generated by my Java program over the local network in real time, so my second application can show them.
I have been looking through a lot of websites over the last 2 days but I wasn't able to find anything. Only thing I found was this:
Can I use Xuggler to encode video/audio to a byte array?
I tried implementing the URLHandler, but the problem is that MediaWriter still wants a URL, and as soon as I add a VideoStream, it opens the container a second time with the URL and then it crashes.
I hope you can help me and thanks in advance.
Code I have right now:
val clientSocket = serverSocket.accept()
connectedClients.add(clientSocket)
val container = IContainer.make()
val writer = ToolFactory.makeWriter("localhost", container)
container.open(VTURLProtocolHandler(clientSocket.getOutputStream()), IContainer.Type.WRITE, IContainerFormat.make())
writer.addVideoStream(0, 0, ICodec.ID.CODEC_ID_H264, width, height)
I'm using ProtoBuf-Net to serialize and deserialize TCP_Messages.
I've tried all the suggestions I've found here, so I really don't know where the mistake is.
Serialization is done server-side, and deserialization is done in a client-side application.
Serialize code:
public void MssGetCardPersonalInfo(out RCPersonalInfoRecord ssPersonalInfoObject, out bool ssResult) {
ssPersonalInfoObject = new RCPersonalInfoRecord(null);
TCP_Message msg = new TCP_Message(MessageTypes.GetCardPersonalInfo);
MemoryStream ms = new MemoryStream();
ProtoBuf.Serializer.Serialize(ms, msg);
_tcp_Client.Send(ms.ToArray());
_waitToReadCard.Start();
_stopWaitHandle.WaitOne();
And the deserialize:
private void tpcServer_OnDataReceived(Object sender, byte[] data, TCPServer.StateObject clientState)
{
TCP_Message message = new TCP_Message();
MemoryStream ms = new MemoryStream(data);
try
{
//ms.ToArray();
//ms.GetBuffer();
//ms.Position = 0;
ms.Seek(0, SeekOrigin.Begin);
message = Serializer.Deserialize<TCP_Message>(ms);
} catch (Exception ex)
{
EventLog.WriteEntry(_logSource, "Error deserializing: " + ex.Message, EventLogEntryType.Error, 103);
}
As you can see, I've tried a bunch of different approaches, which are now commented out.
I have also tried to deserialize using DeserializeWithLengthPrefix, but it didn't work either.
I'm a bit of a noob at this, so if you could help me I would really appreciate it.
Thanks
The first thing to look at here is: is the data you receive the same as the data you sent? Until you can answer "yes" to that, all other questions are moot. It is very easy to confuse network code and end up reading partial frames, etc. As a crude debugger test, logging this on both sides:
Debug.WriteLine(Convert.ToBase64String(ms.GetBuffer(), 0, (int)ms.Length));
should work. If the two base-64 strings are not identical, then you aren't working with the same data. This can be because of a range of reasons, including packet splitting and combining. You need to keep in mind that in a stream, what you send is not what you get - at least, not down to the fragment level. You might "send" data in the way of:
one bundle of 20 bytes
one bundle of 10 bytes
but at the receiving end, it would be entirely legitimate to read:
1 byte
22 bytes
7 bytes
All that TCP guarantees is the order and accuracy of the bytes. It says nothing about their breakdown in terms of chunks. When writing network code, there are basically 2 approaches:
have one thread that synchronously reads from a stream and local buffer (doesn't scale well)
async code (very scalable), but accept that you're going to have to do a lot of "do I have a complete frame? if not, append to an input buffer; if so, process any available frame data (could be multiple), then shuffle any incomplete data to the start of the buffer"
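The "do I have a complete frame?" bookkeeping can be sketched like this in Java (the same logic carries over to C#). It assumes every message is preceded by a 4-byte big-endian length, which is an illustrative choice and not necessarily the prefix format protobuf-net writes; `FrameDecoder` and `feed` are made-up names.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical decoder: accumulates received chunks and returns complete frames.
final class FrameDecoder {
    private byte[] pending = new byte[0]; // incomplete data left over from earlier reads

    // Feed whatever the socket delivered; get back zero or more complete payloads.
    List<byte[]> feed(byte[] chunk) {
        // Append the new chunk to the leftover bytes.
        byte[] joined = new byte[pending.length + chunk.length];
        System.arraycopy(pending, 0, joined, 0, pending.length);
        System.arraycopy(chunk, 0, joined, pending.length, chunk.length);

        ByteBuffer buf = ByteBuffer.wrap(joined); // ByteBuffer is big-endian by default
        List<byte[]> frames = new ArrayList<>();
        while (buf.remaining() >= 4) {
            buf.mark();
            int size = buf.getInt();
            if (size < 0) {
                throw new IllegalStateException("negative frame length: " + size);
            }
            if (buf.remaining() < size) {
                buf.reset(); // payload not fully received yet; wait for more data
                break;
            }
            byte[] frame = new byte[size];
            buf.get(frame);
            frames.add(frame);
        }
        // Keep the incomplete tail for the next call.
        pending = new byte[buf.remaining()];
        buf.get(pending);
        return frames;
    }
}
```

Each returned array is exactly one message as sent, so it can be handed to the deserializer regardless of how the bytes were split on the wire.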
The orange color is the "OldGen", Green is "Eden Space", and blue is "survivor space". I used YourKit to do this profiling. This is how I wrote my file reading code:
val inputStream = new FileInputStream("E:\\Allen\\DataScience\\train\\train.csv")
val sc = new Scanner(inputStream, "UTF-8")
var counter = 0
while (sc.hasNextLine) {
rowActors(counter % 20) ! Row(sc.nextLine())
counter += 1
}
sc.close()
inputStream.close()
It seems like a big chunk of memory is taken by Scanner. However, my original file is only 5 GB. I wonder if I am mishandling the file reading procedure! If not, how should I read in and process my file? I'm very frustrated with garbage collection right now.
Akka Streams provides a safer way to process files in parallel: https://github.com/typesafehub/activator-akka-stream-scala/blob/master/src/main/scala/sample/stream/GroupLogFile.scala
I'm building an app with ActionScript 3.0 in Flash Builder. This is a follow-up to this question.
I need to upload the ByteArray to my server, but the function I use to convert the BitmapData to a ByteArray is super slow, so slow that it freezes my mobile device. My code is as follows:
var jpgenc:JPEGEncoder = new JPEGEncoder(50);
trace('encode');
//encode the bitmapdata object and keep the encoded ByteArray
var imgByteArray:ByteArray = jpgenc.encode(bitmap);
temp2 = File.applicationStorageDirectory.resolvePath("snapshot.jpg");
var fs:FileStream = new FileStream();
trace('fs');
try{
//open file in write mode
fs.open(temp2,FileMode.WRITE);
//write bytes from the byte array
fs.writeBytes(imgByteArray);
//close the file
fs.close();
}catch(e:Error){
Is there a different way to convert it to a ByteArray? Is there a better way?
Try the blooddy library: http://www.blooddy.by . I haven't tested it on mobile devices, though. Leave a comment if it works for you.
Use BitmapData.encode(); it's faster by orders of magnitude on mobile: http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/display/BitmapData.html#encode%28%29
You should try to find a JPEG encoder that is capable of encoding asynchronously. That way the app can still be used while the image is being compressed. I haven't tried any of the libraries, but this one looks promising:
http://segfaultlabs.com/devlogs/alchemy-asynchronous-jpeg-encoding-2
It uses Alchemy, which should make it faster than the JPEGEncoder from as3corelib (which I guess is the one you're using at the moment.)
A native JPEG encoder is ideal, asynchronous would be good, but possibly still slow (just not blocking). Another option:
var pixels:ByteArray = bitmapData.getPixels(bitmapData.rect);
pixels.compress();
I'm not sure of native performance, and performance definitely depends on what kind of images you have.
The answer from Ilya was what did it for me. I downloaded the library, and it includes an example of how to use it. I have been working on getting the CameraUI in Flash Builder to take a picture, encode and compress it, then send it via a web service to my server (the data was sent as a compressed byte array). I did this:
by.blooddy.crypto.image.JPEGEncoder.encode( bmp, 30 );
Where bmp is my bitmap data. The encode took under 3 seconds and was easily able to fit into my flow of control synchronously. I tried async methods but they ultimately took a really long time and were difficult to track for things like when a user moved from cell service to wifi or from tower to tower while an upload was going on.
Comment here if you need more details.