What is the current state of asynchronous file IO in akka streams? - scala

Target:
I would like to see Akka's file IO to be as asynchronous and non-blocking as possible.
My status of knowledge so far:
In older project documentation you can read this:
Note
Since the current version of Akka (2.3.x) needs to support JDK6, the
currently provided File IO implementations are not able to utilise
Asynchronous File IO operations, as these were introduced in JDK7 (and
newer). Once Akka is free to require JDK8 (from 2.4.x) these
implementations will be updated to make use of the new NIO APIs (i.e.
AsynchronousFileChannel).
The current akka version is '2.5.4', and the akka-stream artifact is published for Scala '2.11' and '2.12'. In the current documentation the note from above is missing; it only states explicitly that file IO means blocking operations, which are run on a dispatcher dedicated to IO operations.
In the 'MANIFEST.MF' file inside the akka-streams Jar-File there is a line:
Require-Capability: osgi.ee;filter:="(&(osgi.ee=JavaSE)(version=1.8))"
So I guess it requires Java 8.
There is a question related to Scala asynchronous file IO, but it's from January 2015. One of the answers contains:
Akka IO, while not providing file IO in its core, has a module
developed by Dario Rexin, which allows to use AsynchronousFileChannel
with Akka IO in a very simple manner. Have a look at this library to
make use of it: https://github.com/drexin/akka-io-file
Questions:
How asynchronous is the current state of akka streams file IO?
Does akka streams file IO use 'AsynchronousFileChannel' from Java's NIO?
Do I have to do something to use 'AsynchronousFileChannel' from Java's NIO?

How asynchronous is the current state of akka streams file IO?
Perusing the source code shows that Akka Streams' FileIO uses java.nio.ByteBuffer and java.nio.channels.FileChannel, i.e. the synchronous (blocking) NIO API. As the documentation states, the file IO operations run in isolation on a dedicated dispatcher.
There is an open pull request that attempts to use AsynchronousFileChannel. Based on the benchmark results reported in that PR, the PR might be closed in favor of trying a newer approach with synchronous NIO as captured in another PR.
Does akka streams file IO use 'AsynchronousFileChannel' from Java's NIO?
No.
Do I have to do something to use 'AsynchronousFileChannel' from Java's NIO?
This question is moot, because Akka Streams does not use AsynchronousFileChannel.
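If you nevertheless want asynchronous file reads, you can call AsynchronousFileChannel yourself, outside of Akka Streams' FileIO. The following is a minimal Java sketch of that idea, not anything Akka provides: the class name AsyncFileRead and the helper readFirstBytes are hypothetical, and how the resulting future would be fed into a stream is left open.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class AsyncFileRead {

    // Hypothetical helper: reads up to `capacity` bytes from the start of `path`.
    // The calling thread is not blocked; the future completes from the channel's callback.
    static CompletableFuture<byte[]> readFirstBytes(Path path, int capacity) throws IOException {
        AsynchronousFileChannel channel =
            AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        ByteBuffer buffer = ByteBuffer.allocate(capacity);
        CompletableFuture<byte[]> result = new CompletableFuture<>();

        channel.read(buffer, 0L, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                closeQuietly(channel);
                buffer.flip();                       // switch the buffer from writing to reading
                byte[] data = new byte[buffer.remaining()];
                buffer.get(data);
                result.complete(data);
            }

            @Override
            public void failed(Throwable error, Void attachment) {
                closeQuietly(channel);
                result.completeExceptionally(error);
            }
        });
        return result;
    }

    private static void closeQuietly(AsynchronousFileChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // nothing sensible to do in this sketch
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("async-read", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
        // join() blocks only here, in the demo, to wait for the asynchronous result.
        byte[] data = readFirstBytes(tmp, 16).join();
        System.out.println(new String(data, StandardCharsets.UTF_8));
        Files.delete(tmp);
    }
}
```

Whether this is faster than the blocking FileChannel on a dedicated dispatcher is a separate question; the benchmark results mentioned above suggest it may not be.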

Related

Lua sockets - Asynchronous Events

In the current LuaSocket implementation, I see that we have to install a timer that calls back periodically, so that we can check with a non-blocking API whether we have received anything.
This is all well and good; however, in the UDP case, if the sender is sending a lot of data, do we risk losing data? Say another device sends a 2MB photo via UDP and we check socket receive every 100msec. At 2MBps, the underlying system must store 200Kbits before our call queries the underlying TCP stack.
Is there a way to get an event fired when we receive the data on the particular socket instead of the polling we have to do now?
There are various ways of handling this issue; which one you select depends on how much work you want to do.*
But first, you should clarify (to yourself) whether you are dealing with UDP or TCP; there is no "underlying TCP stack" for UDP sockets. Also, UDP is the wrong protocol to use for sending a complete payload such as a text or a photo; it is an unreliable protocol, so you aren't guaranteed to receive every packet unless you're using a managed socket library (such as ENet).
Lua51/LuaJIT + LuaSocket
Polling is the only method.
Blocking: call socket.select with no time argument and wait for the socket to be readable.
Non-blocking: call socket.select with a timeout argument of 0, and use sock:settimeout(0) on the socket you're reading from.
Then simply call these repeatedly.
I would suggest using a coroutine scheduler for the non-blocking version, to allow other parts of the program to continue executing without causing too much delay.
Lua51/LuaJIT + LuaSocket + Lua Lanes (Recommended)
Same as the above method, but the socket exists in another lane (a lightweight Lua state in another thread) made using Lua Lanes (latest source). This allows you to instantly read the data from the socket and into a buffer. Then, you use a linda to send the data to the main thread for processing.
This is probably the best solution to your problem.
I've made a simple example of this, available here. It relies on Lua Lanes 3.4.0 (GitHub repo) and a patched LuaSocket 2.0.2 (source, patch, blog post re' patch)
The results are promising, though you should definitely refactor my example code if you derive from it.
LuaJIT + OS-specific sockets
If you're a little masochistic, you can try implementing a socket library from scratch. LuaJIT's FFI library makes this possible from pure Lua. Lua Lanes would be useful for this as well.
For Windows, I suggest taking a look at William Adam's blog. He's had some very interesting adventures with LuaJIT and Windows development. As for Linux and the rest, look at tutorials for C or the source of LuaSocket and translate them to LuaJIT FFI operations.
(LuaJIT supports callbacks if the API requires it; however, there is a significant performance cost compared to polling from Lua to C.)
LuaJIT + ENet
ENet is a great library. It provides the perfect mix between TCP and UDP: reliable when desired, unreliable otherwise. It also abstracts operating system specific details, much like LuaSocket does. You can use the Lua API to bind it, or directly access it via LuaJIT's FFI (recommended).
* Pun unintentional.
I use lua-ev https://github.com/brimworks/lua-ev for all IO-multiplexing stuff.
It is very easy to use and fits into Lua (and its functions) like a charm. It is based on select/poll/epoll or kqueue and performs very well too.
local ev = require'ev'
local loop = ev.Loop.default
local udp_sock -- your udp socket instance
udp_sock:settimeout(0) -- make non blocking
local udp_receive_io = ev.IO.new(function(io,loop)
    local chunk,err = udp_sock:receive(4096)
    if chunk and not err then
        -- process data
    end
end,udp_sock:getfd(),ev.READ)
udp_receive_io:start(loop)
loop:loop() -- blocks forever
In my opinion Lua+luasocket+lua-ev is just a dream team for building efficient and robust networking applications (for embedded devices/environments). There are more powerful tools out there! But if your resources are limited, Lua is a good choice!
Lua is inherently single-threaded; there is no such thing as an "event". There is no way to interrupt executing Lua code. So while you could rig something up that looked like an event, you'd only ever get one if you called a function that polled which events were available.
Generally, if you're trying to use Lua for this kind of low-level work, you're using the wrong tool. You should be using C or something to access this sort of data, then pass it along to Lua when it's ready.
You are probably using a non-blocking select() to "poll" sockets for any new data available. LuaSocket doesn't provide any other interface to see if there is new data available (as far as I know), but if you are concerned that it's taking too much time when you are doing this 10 times per second, consider writing a simplified version that only checks the one socket you need and avoids creating and throwing away Lua tables. If that's not an option, consider passing nil to select() instead of {} for the lists you don't need, and passing static tables instead of temporary ones:
local rset = {socket}
... later
...select(rset, nil, 0)
instead of
...select({socket}, {}, 0)

rpc mechanism to use with select driven daemon

I want to add an RPC service to my unix daemon. The daemon is written in C and has an event driven loop implemented using select(). I've looked at a number of RPC implementations but they all seem to involve calling a library routine, or auto generated code, which blocks indefinitely.
Are there any RPC frameworks out there where the library code/autogenerated code doesn't block or start threads. Ideally, I'd like to create the input/output sockets myself and pass them into my select loop.
Regards,
Alex - first time poster! :-)
I'm assuming that you can use C++: Apache Thrift is good, and FAST RPC is also useful.
I evaluated a fair few libraries at the start of 2012 and eventually ended up going with ZeroMQ, as I found it easier to use and a lot more flexible. I did consider using a Google protobuf implementation but ended up using a simpler structured command text approach.
I probably wouldn't consider doing this in C unless I had to, in which case I'd probably start with the standard rpc(3) stuff; for a good introduction, see this overview of Remote Procedure Calls (RPC).

How to run Akka Future using given SecurityManager?

For an open-source multiplayer programming game written in Scala that loads players' bot code via a plug-in system from .jar files, I'd like to prevent the code of the bots from doing harm on the server system by running them under a restrictive SecurityManager implementation.
The current implementation uses a URLClassLoader to extract a control function factory for each bot from its related plug-in .jar file. The factory is then used to instantiate a bot control function instance for each new round of the game. Then, once per simulation cycle, all bot control functions are invoked concurrently to get the bot's responses to their environment. The concurrent invocation is done using Akka's Future.traverse() with an implicitly provided ActorSystem shared by other concurrently operating components (compile service, web server, rendering):
val future = Future.traverse(bots)(bot => Future { bot.respondTo(state) })
val result = Await.result(future, Duration.Inf)
To restrict potentially malicious code contained in the bot plug-ins from running, it appears that following the paths taken in this StackOverflow question and this one I need to have the bot control functions execute in threads running under an appropriately restrictive SecurityManager implementation.
Now the question: how can I get Akka to process the work currently done in Future.traverse() with actors running in threads that have the desired SecurityManager, while the other Actors in the system, e.g. those running the background compile service, continue to run as they do now, i.e. unrestricted?
You can construct an instance of ExecutionContext (e.g. via ExecutionContext.fromExecutorService) that runs all work under the restrictive security manager, and bring it into the implicit scope for Future.apply and Future.traverse.
If the invoked functions do not need to interact with the environment, I don't think you need a separate ActorSystem.
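As a rough Java sketch of the executor side of this idea: the pool below puts its worker threads into a dedicated ThreadGroup, so that a custom SecurityManager could tell bot threads apart from the rest of the system. The names BotExecutor, BOT_GROUP, newBotPool and isBotThread are hypothetical, and the sketch deliberately does not install a SecurityManager. A SecurityManager is set for the whole JVM, so any per-thread policy would have to live inside its permission checks (for example by consulting something like isBotThread), and the class is deprecated for removal in recent JDKs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

public class BotExecutor {

    // Hypothetical marker group: every worker thread of the bot pool belongs to it.
    static final ThreadGroup BOT_GROUP = new ThreadGroup("bot-sandbox");

    // Builds a fixed pool whose threads are all created inside BOT_GROUP.
    static ExecutorService newBotPool(int threads) {
        ThreadFactory factory = runnable -> {
            Thread t = new Thread(BOT_GROUP, runnable);
            t.setDaemon(true); // do not keep the JVM alive for bot threads
            return t;
        };
        return Executors.newFixedThreadPool(threads, factory);
    }

    // A restrictive SecurityManager (not shown) could call this from its checks
    // and deny permissions only when the current thread is a bot thread.
    static boolean isBotThread() {
        return Thread.currentThread().getThreadGroup() == BOT_GROUP;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = newBotPool(2);
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            tasks.add(BotExecutor::isBotThread);
        }
        for (Future<Boolean> f : pool.invokeAll(tasks)) {
            System.out.println("ran on bot thread: " + f.get());
        }
        System.out.println("main is bot thread: " + isBotThread());
        pool.shutdown();
    }
}
```

On the Scala side, such an ExecutorService is what you would hand to ExecutionContext.fromExecutorService, while the other components keep using their existing dispatcher.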

Do the methods in Apache's FileUtils perform synchronous (blocking) i/o?

Do the methods in Apache's FileUtils perform synchronous (blocking) i/o?
I am making a call to FileUtils.copyDirectoryToDirectory. In my next line, I want to delete the directory that I copied.
Example:
FileUtils.copyDirectoryToDirectory(source, destination);
FileUtils.deleteDirectory(source);
Just want to make sure this is "safe" and asynchronous (non-blocking) i/o isn't happening.
Thanks.
Two things:
FileUtils is not part of the standard JDK; it is a class in the Apache Commons IO library.
The operations you mentioned do not use non-blocking IO.
So to answer your question, yes, your overall operation is safe.
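To illustrate why sequential blocking calls are safe here, the sketch below does a copy-then-delete with the JDK's own java.nio.file API as a stand-in for FileUtils (Commons IO is not used, and the helpers copyTree and deleteTree are hypothetical and not identical in behaviour to copyDirectoryToDirectory). Each call returns only after its work is finished, so the delete cannot start before the copy is complete.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CopyThenDelete {

    // Blocking recursive copy: returns only after every file has been copied.
    static void copyTree(Path source, Path target) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(source)) {
            paths = walk.collect(Collectors.toList()); // parents come before children
        }
        for (Path p : paths) {
            Path dest = target.resolve(source.relativize(p).toString());
            if (Files.isDirectory(p)) {
                Files.createDirectories(dest);
            } else {
                Files.copy(p, dest);
            }
        }
    }

    // Blocking recursive delete: children are removed before their parents.
    static void deleteTree(Path root) throws IOException {
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(root)) {
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempDirectory("copy-src");
        Path destination = Files.createTempDirectory("copy-dst").resolve("copy");
        Files.write(source.resolve("a.txt"), "data".getBytes(StandardCharsets.UTF_8));

        copyTree(source, destination); // finished when this line returns
        deleteTree(source);            // therefore safe to run immediately afterwards

        System.out.println("copy exists: " + Files.exists(destination.resolve("a.txt")));
        System.out.println("source exists: " + Files.exists(source));
        deleteTree(destination.getParent());
    }
}
```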

AKKA: Communicating via Messaging Queue

We have a component written in Groovy ( let's call it a "G-Component" ) that needs to communicate with a component written in Scala / AKKA ( let's call it an "A-Component" ).
What fits our needs best is a messaging queue:
"G-COMPONENT" <==> in memory messaging queue <==> "A-COMPONENT"
For the "G-COMPONENT" life is simple:
queue.send( message )
message = queue.receive()
For the AKKA component it seems a bit more involved, since there is an Actor that needs to "handle"/"receive" messages, and be able to "send" messages back.
The problem is the "receive" part, as it now needs to go into a loop of its own to listen for messages from the queue. That effectively disables it as an AKKA Actor, since once it is in that loop, it can't receive any AKKA messages.
Would appreciate any help on the clean solution for this, without implementing an AKKA plugin of "that particular queue implementation" Actor mailbox.
converting a "question edit" to an answer
Found an interesting development in a not yet officially released AKKA API:
"Akka provides a ZeroMQ module which abstracts a ZeroMQ connection and therefore allows interaction between Akka actors to take place over ZeroMQ connections."
Seems that I can have an AKKA way to spawn a ZeroMQ listener:
val listener = actorOf(new Actor {
  def receive: Receive = {
    case message: ZMQMessage => ...
    case _ => ...
  }
}).start
val socket = ZMQ.newSocket(SocketParameters(context, SocketType.Sub, Some(listener)))
socket ! Connect("tcp://127.0.0.1:1234")
socket ! Subscribe(Seq())
Confirmed by Viktor Klang (in the question comments): this is the way to go.
This may be obvious but Akka has excellent camel and amqp integration.
http://akka.io/docs/akka-modules/1.2/modules/camel.html
http://akka.io/docs/akka-modules/1.2/modules/amqp.html
I am not sure what you mean by 'without implementing an AKKA plugin of "that particular queue implementation" Actor mailbox'. Does that mean you don't want to use these components?
AKKA is a library, not a programming language.
Just write the zeromq message listener outside of an actor, and have it send incoming zeromq messages to AKKA actors. I've done this with AMQP using the Java AMQP client library and it works just fine.
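The pattern of a listener running outside of any actor can be sketched in plain Java as follows. This is an illustration under assumptions, not the ZeroMQ or AMQP client API: a java.util.concurrent.BlockingQueue stands in for the message queue, and the names QueueBridge and startListener are hypothetical. The forward callback is where the hand-over to an actor would happen.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class QueueBridge {

    // Starts a dedicated thread that blocks on the queue and forwards each message.
    // With Akka, `forward` could be something like
    //   msg -> actorRef.tell(msg, ActorRef.noSender())
    // so the actor itself never sits in a blocking receive loop.
    static Thread startListener(BlockingQueue<String> queue, Consumer<String> forward) {
        Thread listener = new Thread(() -> {
            try {
                while (true) {
                    forward.accept(queue.take()); // blocks this thread only
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // interrupted: stop listening
            }
        }, "queue-listener");
        listener.setDaemon(true);
        listener.start();
        return listener;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> incoming = new LinkedBlockingQueue<>();
        BlockingQueue<String> received = new LinkedBlockingQueue<>();

        Thread listener = startListener(incoming, received::add);
        incoming.put("hello from the G-Component");

        // Wait briefly for the listener thread to forward the message.
        System.out.println(received.poll(1, TimeUnit.SECONDS));
        listener.interrupt();
    }
}
```

Sending in the other direction needs no special handling: the actor can simply put replies on the outgoing queue from its normal message handler.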
If you want the ZeroMQ listener to be running in an event loop, then it is easy enough to write your own using the select poller: http://api.zeromq.org/2-1:zmq-poll. Have a look at the ConcurrentSocketActor source code in the AKKA zeromq module, because that's what it uses. This would be a good model if you ever need to roll your own concurrent actor for some other type of network communication.
And this is the same problem that people have when they want to add a network accessible management interface to a non-network application in any language.