In a Vert.x verticle I'm implementing, I have a Buffer that was previously loaded into memory, and now I want to write it to disk.
As far as I understand, we should use a Pump to avoid overloading the WriteStream.
But I can't find a way to get a ReadStream instance from a Buffer. Shouldn't there be an easy or standard way to do this?
Regards
Generally, Vert.x does not warn against writing directly to an AsyncFile. The documentation even gives an example of calling AsyncFile.write directly and states that you can use it for this purpose: http://vertx.io/docs/vertx-core/java/#_asynchronous_files
However, if you want the pump with Buffer you need an instance of ReadStream<Buffer> along with an AsyncFile to pump into. You can make use of the implementation by PitchPoint Solutions (Copyright 2016 The Simple File Server Authors):
https://github.com/pitchpoint-solutions/sfs/blob/master/sfs-server/src/main/java/org/sfs/io/BufferReadStream.java
Putting it all together:
CompletableFuture<Void> done = new CompletableFuture<>();
Buffer buffer = Buffer.buffer(new byte[100]);
Vertx.vertx().fileSystem().open("myfile.txt", new OpenOptions(), res -> {
if (res.succeeded()) {
AsyncFile outputFile = res.result();
BufferReadStream reader = new BufferReadStream(buffer);
Pump pump = Pump.pump(reader, outputFile);
// Register the end handler before starting so the end event cannot be missed.
reader.endHandler((r) -> {
pump.stop(); // not sure this is required
done.complete(null);
});
pump.start();
} else {
// Something went wrong!
}
});
// wait elsewhere
done.get();
I'm writing a server application in D, which should be able to manage n connections simultaneously.
To achieve this I am using std.socket.Socket.select. This works fine. But I can't bind session-specific data to the socket, and I don't see any way to do this, because Socket does not allow storing a handle to user-specific data. After
Socket.select(socketSet, null, null);
I'm able to get all affected sockets, but I can't map these sockets to my user-specific session data. What's my mistake? Is it possible to reach my goal this way, or should I choose another approach for my requirements?
My relevant code:
ushort port = 5010;
stoprequest = false;
auto listener = new TcpSocket();
assert(listener.isAlive);
listener.blocking = false;
listener.bind(new InternetAddress(port));
listener.listen(10);
enum MAX_CONNECTIONS = 100;
auto socketSet = new SocketSet(MAX_CONNECTIONS + 1);
Socket[] reads;
Session[] sessions;
while (true)
{
socketSet.add(listener);
foreach (session; sessions)
socketSet.add(session.socket);
Socket.select(socketSet, null, null);
for (size_t i = 0; i < reads.length; i++)
{
if (socketSet.isSet(reads[i]))
{
// Now I should access the session-related data, but how?
char[1024] buf;
auto datLength = reads[i].receive(buf[]);
if (datLength == Socket.ERROR)
writeln("Connection error.");
else if (datLength != 0)
{
writefln("Received %d bytes from %s: \"%s\"", datLength, reads[i].remoteAddress().toString(), buf[0..datLength]);
continue;
}
else { /* Error handling. Shortened, since unimportant for the example. */ }
reads[i].close();
reads = reads.remove(i);
i--;
}
}
if (socketSet.isSet(listener))
{
Socket sn = null;
sn = listener.accept();
if (reads.length < MAX_CONNECTIONS)
{
Session session = new Session();
session.socket = sn;
sessions ~= session;
}
else { /* Error handling for too many connections. Shortened, since unimportant for the example. */ }
}
socketSet.reset();
}
The hint to use poll() was helpful. After reading https://daniel.haxx.se/docs/poll-vs-select.html I think that both variants work, but neither is ideal. For an efficient solution I would be better off with libev. Fortunately, efficiency is not a concern in this particular project. I will therefore use select(), because I found out that accessing Socket.handle gives me a unique number that can be used as the key of my own lookup table. This allows me to associate session data with a socket. So I prefer to stick with the encapsulated functionality of std.socket.Socket rather than work around it.
My concrete question can therefore be answered with:
Use Socket.handle to identify the socket and manage session-related data.
A few other alternatives you can consider:
1) use a subclass of Socket. You can make your own class that inherits from it and adds more stuff.
2) The poll function is found in import core.sys.posix.poll;, and you can pass socket.handle to that as well. But note it will not work on Windows without modification.
3) Or indeed, keep your own lookup table; that works too.
Note that std.socket.Socket is a very thin wrapper around the BSD socket API; internally it just conveniently handles the slight differences between Windows and POSIX. It is still pretty easy to adapt code to use the other APIs with it (or to port C tutorials to D), since it is all basically the same thing, and literally the same functions if you import the core.sys modules.
I want to use the Vert.x routingContext.response().sendFile method to read a file from the internet and send it to some handler.
I have tried routingContext.response().sendFile with files located on my local system, which works fine. But when I use a file located on the internet instead, I get a java.io.FileNotFoundException:
String filename = "http://www.awitness.org/prophecy.zip";
routingContext.response().sendFile(filename, asr->{
if(asr.succeeded()) {
System.out.println("success....");
} else {
System.out.println("Something went wrong " + asr.cause());
}
});
Getting this output:
Something went wrong java.io.FileNotFoundException
That's because sendFile() takes a local file path as its argument.
The best solution would be to download this file and serve it from your application.
A worse solution is to download the file on demand, save it using vertx.fileSystem().createTempFile(), and still serve it locally.
Now, for the sake of argument, let's say that you would like to go down the second path. How would you do that? You could try something like this:
final Vertx vertx = Vertx.vertx();
final Router router = Router.router(vertx);
WebClient c = WebClient.create(vertx);
String temp = vertx.fileSystem().createTempFileBlocking("", "");
c.get("www.awitness.org", "/prophecy.zip").send(r -> {
if (r.succeeded()) {
Buffer buffer = r.result().body();
vertx.fileSystem().writeFileBlocking(temp, buffer);
}
});
router.route("/").produces("application/zip").handler(ctx -> {
ctx.response().sendFile(temp);
});
I'm using the blocking APIs here only for the sake of simplicity; in real code you should use the asynchronous ones.
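The "download, then serve locally" step can also be sketched with nothing but the JDK. This is only an illustration under stated assumptions: the class name RemoteToTemp and the helper fetchToTemp are made up for this example, and a local file: URL stands in for the remote address so the sketch runs offline. The call is blocking, so in a Vert.x application it would have to run off the event loop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Vert.x.
final class RemoteToTemp {
    // Streams the resource behind the URL into a new temporary file
    // and returns the path of that file.
    public static Path fetchToTemp(URL source) throws IOException {
        Path temp = Files.createTempFile("download-", ".tmp");
        try (InputStream in = source.openStream()) {
            // The temp file already exists, so it has to be replaced.
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
        }
        return temp;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a local file: URL replaces the real remote URL here.
        Path original = Files.createTempFile("original-", ".txt");
        Files.write(original, "hello".getBytes());
        Path copy = fetchToTemp(original.toUri().toURL());
        System.out.println(Files.size(copy)); // prints 5
    }
}
```

The returned path is a local file, which is the kind of argument sendFile() expects.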
I am new to Vert.x and I am using the Vert.x file system API to read a large file.
vertx.fileSystem().readFile("target/classes/readme.txt", result -> {
if (result.succeeded()) {
System.out.println(result.result());
} else {
System.err.println("Oh oh ..." + result.cause());
}
});
But all the RAM is consumed while reading, and the memory is not even released after use. The Vert.x file system API documentation also warns:
Do not use this method to read very large files or you risk running out of available RAM.
Is there any alternative to this?
To read a large file you should open an AsyncFile:
OpenOptions options = new OpenOptions();
fileSystem.open("myfile.txt", options, res -> {
if (res.succeeded()) {
AsyncFile file = res.result();
} else {
// Something went wrong!
}
});
An AsyncFile is a ReadStream, so you can use it together with a Pump to copy the bytes to a WriteStream:
Pump.pump(file, output).start();
file.endHandler((r) -> {
System.out.println("Copy done");
});
There are different kinds of WriteStream, such as AsyncFile, net sockets, HTTP server responses, etc.
To read/process a large file in chunks you need to use the open() method which will return an AsyncFile on success. On this AsyncFile you setReadBufferSize() (or not, the default is 8192), and attach a handler() which will be passed a Buffer of at most the size of the read buffer you just set.
In the example below I have also attached an endHandler() to print a final newline to stay in line with the sample code you provided in the question:
vertx.fileSystem().open("target/classes/readme.txt", new OpenOptions().setWrite(false).setCreate(false), result -> {
if (result.succeeded()) {
result.result().setReadBufferSize(READ_BUFFER_SIZE).handler(data -> System.out.print(data.toString()))
.endHandler(v -> System.out.println());
} else {
System.err.println("Oh oh ..." + result.cause());
}
});
You need to define READ_BUFFER_SIZE somewhere of course.
The reason is that readFile internally calls Files.readAllBytes.
What you should do instead is create a stream from your file and process it piece by piece in your handler:
try (InputStream stream = new FileInputStream("target/classes/readme.txt")) {
// Your handling here
}
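A minimal sketch of what such stream handling could look like, using only the JDK. The class name ChunkedRead, the helper countBytes, and the 8192-byte chunk size are illustrative assumptions, not part of Vert.x. The point is that only one chunk is held in memory at a time. This is blocking I/O, so inside a verticle it would need to run off the event loop.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class ChunkedRead {
    // Reads the file in chunks of at most chunkSize bytes and returns the
    // total number of bytes seen. Real code would process each chunk
    // where the counter is updated.
    public static long countBytes(Path path, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        try (InputStream in = Files.newInputStream(path)) {
            int n;
            while ((n = in.read(chunk)) != -1) {
                total += n; // chunk[0..n) holds the data just read
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chunked-", ".bin");
        Files.write(tmp, new byte[20000]);
        System.out.println(countBytes(tmp, 8192)); // prints 20000
        Files.delete(tmp);
    }
}
```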
I am quite an inexperienced spray/Scala developer, and I am trying to use the spray.io LruCache properly. I am trying to achieve something very simple: I have a Kafka consumer, and when it reads something from its topic I want it to put the value it reads into the cache.
Then, in one of the routes, I want to read this value. The value is of type String. What I have at the moment looks as follows:
object MyCache {
val cache: Cache[String] = LruCache(
maxCapacity = 10000,
initialCapacity = 100,
timeToLive = Duration.Inf,
timeToIdle = Duration(24, TimeUnit.HOURS)
)
}
To put something into the cache I use the following code:
def message() = Future { new String(singleMessage.message()) }
MyCache.cache(key, message)
Then, in one of the routes, I am trying to get something from the cache:
val res = MyCache.cache.get(keyHash)
The problem is that the type of res is Option[Future[String]], which makes it quite hard and ugly to access the real value. Could someone please tell me how I can simplify my code to make it better and more readable?
Thanks in advance.
Don't try to get the value out of the Future. Instead call map on the Future to arrange for work to be done on the value when the Future is completed, and then complete the request with that result (which is itself a Future). It should look something like this:
path("foo") {
complete(MyCache.cache.get(keyHash) map (optMsg => ...))
}
Also, if singleMessage.message does not do I/O or otherwise block, then rather than creating the Future like you are
Future { new String(singleMessage.message) }
it would be more efficient to do it like so:
Future.successful(new String(singleMessage.message))
The latter just creates an already completed Future, bypassing the use of an ExecutionContext to evaluate the function.
If singleMessage.message does do I/O, then ideally you would do that I/O with some library (like Spray client, if it's an HTTP request) that returns a Future (rather than using Future { ... } to create another thread which will block).
A friend of mine came to me with a problem: when using the NetworkStream class on the server end of the connection, if the client disconnects, NetworkStream fails to detect it.
Stripped down, his C# code looked like this:
List<TcpClient> connections = new List<TcpClient>();
TcpListener listener = new TcpListener(7777);
listener.Start();
while(true)
{
if (listener.Pending())
{
connections.Add(listener.AcceptTcpClient());
}
TcpClient deadClient = null;
foreach (TcpClient client in connections)
{
if (!client.Connected)
{
deadClient = client;
break;
}
NetworkStream ns = client.GetStream();
if (ns.DataAvailable)
{
BinaryFormatter bf = new BinaryFormatter();
object o = bf.Deserialize(ns);
ReceiveMyObject(o);
}
}
if (deadClient != null)
{
deadClient.Close();
connections.Remove(deadClient);
}
Thread.Sleep(0);
}
The code works, in that clients can successfully connect and the server can read data sent to it. However, if the remote client calls tcpClient.Close(), the server does not detect the disconnection - client.Connected remains true, and ns.DataAvailable is false.
A search of Stack Overflow provided an answer - since Socket.Receive is not being called, the socket is not detecting the disconnection. Fair enough. We can work around that:
foreach (TcpClient client in connections)
{
client.ReceiveTimeout = 0;
if (client.Client.Poll(0, SelectMode.SelectRead))
{
int bytesPeeked = 0;
byte[] buffer = new byte[1];
bytesPeeked = client.Client.Receive(buffer, SocketFlags.Peek);
if (bytesPeeked == 0)
{
deadClient = client;
break;
}
else
{
NetworkStream ns = client.GetStream();
if (ns.DataAvailable)
{
BinaryFormatter bf = new BinaryFormatter();
object o = bf.Deserialize(ns);
ReceiveMyObject(o);
}
}
}
}
(I have left out exception handling code for brevity.)
This code works; however, I would not call this solution "elegant". The other solution I am aware of is to spawn a thread per TcpClient and allow the BinaryFormatter.Deserialize (in effect, NetworkStream.Read) call to block, which would detect the disconnection correctly. This does, however, carry the overhead of creating and maintaining a thread per client.
I get the feeling that I'm missing some secret, awesome answer that would retain the clarity of the original code, but avoid the use of additional threads to perform asynchronous reads. Though, perhaps, the NetworkStream class was never designed for this sort of usage. Can anyone shed some light?
Update: Just want to clarify that I'm interested to see if the .NET framework has a solution that covers this use of NetworkStream (i.e. polling and avoiding blocking) - obviously it can be done; the NetworkStream could easily be wrapped in a supporting class that provides the functionality. It just seemed strange that the framework essentially requires you to use threads to avoid blocking on NetworkStream.Read, or, to peek on the socket itself to check for disconnections - almost like it's a bug. Or a potential lack of a feature. ;)
Is the server expecting to be sent multiple objects over the same connection? If so, I don't see how this code will work, as there is no delimiter being sent that signifies where one object ends and the next begins.
If only one object is being sent and the connection closed after, then the original code would work.
There has to be a network operation initiated in order to find out if the connection is still active or not. What I would do, is that instead of deserializing directly from the network stream, I would instead buffer into a MemoryStream. That would allow me to detect when the connection was lost. I would also use message framing to delimit multiple responses on the stream.
NetworkStream ns = client.GetStream();
BinaryReader br = new BinaryReader(ns);
// Message framing. First, read the number of bytes to expect.
int objectSize = br.ReadInt32();
if (objectSize == 0)
break; // client disconnected
byte[] buffer = new byte[objectSize];
int index = 0;
int read = ns.Read(buffer, index, Math.Min(objectSize, 1024));
while (read > 0)
{
objectSize -= read;
index += read;
read = ns.Read(buffer, index, Math.Min(objectSize, 1024));
}
if (objectSize > 0)
{
// Client aborted the connection in the middle of the stream.
break;
}
else
{
BinaryFormatter bf = new BinaryFormatter();
// Deserialize from the buffered copy, not from the network stream.
using (MemoryStream ms = new MemoryStream(buffer))
{
object o = bf.Deserialize(ms);
ReceiveMyObject(o);
}
}
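The length-prefix framing idea is independent of the language. Below is a minimal, self-contained sketch of it in Java rather than C#, using in-memory streams so it runs without a network. The names Framing, writeFrame, and readFrame are hypothetical and chosen for this illustration only. It distinguishes a stream that ends cleanly between frames from one that ends in the middle of a frame. It does not cover a connection that goes silent without closing; that case would need a read timeout on the socket.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class Framing {
    // Writes one frame: a 4-byte big-endian length followed by the payload.
    public static void writeFrame(OutputStream out, byte[] payload) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(payload.length);
        data.write(payload);
        data.flush();
    }

    // Reads one frame. Returns null if the stream ended before a new frame
    // started; throws EOFException if it ended in the middle of a frame.
    public static byte[] readFrame(InputStream in) throws IOException {
        int first = in.read();
        if (first == -1) {
            return null; // peer closed the stream between frames
        }
        DataInputStream data = new DataInputStream(in);
        byte[] rest = new byte[3];
        data.readFully(rest); // EOFException on a truncated header
        int length = (first << 24) | ((rest[0] & 0xFF) << 16)
                | ((rest[1] & 0xFF) << 8) | (rest[2] & 0xFF);
        if (length < 0) {
            throw new IOException("negative frame length");
        }
        byte[] payload = new byte[length];
        data.readFully(payload); // EOFException on a truncated payload
        return payload;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeFrame(out, "hello".getBytes(StandardCharsets.UTF_8));
        writeFrame(out, "world!".getBytes(StandardCharsets.UTF_8));

        InputStream in = new ByteArrayInputStream(out.toByteArray());
        byte[] frame;
        while ((frame = readFrame(in)) != null) {
            System.out.println(new String(frame, StandardCharsets.UTF_8));
        }
    }
}
```

Because the reader always knows how many bytes belong to the current message, it can buffer exactly that many and only then hand them to a deserializer.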
Yeah but what if you lose a connection before getting the size? i.e. right before the following line:
// message framing. First, read the #bytes to expect.
int objectSize = br.ReadInt32();
ReadInt32() will block the thread indefinitely.