I am creating a Firefox extension that allows using Standard ML (SML) as a client-side programming language in Firefox. The way it works is the following:
The extension launches a PolyML process (SML compiler with a top-level interactive shell).
A socket communication is then established between the extension and the PolyML process.
SML code is read from the webpage and is sent via the sockets to the PolyML process for evaluation.
That code may then use a library that I provide, for working with the DOM.
Here is how the DOM library is implemented:
Say someone executes an SML function DOM.getElementById
This request is forwarded via the sockets to the extension, where the extension executes the JavaScript function getElementById on the page and sends the result back to the PolyML process via the sockets.
My question is, in theory, what limits in terms of performance should I be expecting to have here, when it comes to the socket communication?
I did some very approximate profiling and it seems that using this interface between the extension and PolyML, I can approximately send 2500 messages/second, of an average size of 70 bytes/message.
To put this in more context, say I want to draw some animations in the browser using the Canvas element. If I want to achieve 20fps, that means I need to draw every frame in 0.05 seconds, which means I can only send around 125 messages per frame. Those messages correspond to JavaScript function calls. For example, the code below draws a path and is making 9 JavaScript function calls, that correspond to 9 messages in the socket communication.
val _ = Canvas.beginPath context;
val _ = Canvas.setFillStyle context fillColor;
val _ = Canvas.setStrokeStyle context fillColor;
val _ = Canvas.setLineWidth context size;
val _ = Canvas.moveTo context posx posy;
val _ = Canvas.lineTo context posx_new posy_new;
val _ = Canvas.stroke context;
val _ = Canvas.arc context posx_new posy_new (size/2.0) 0.0 6.28 true;
val _ = Canvas.fill context;
JavaScript, obviously, has a much better performance, I would imagine that you could make thousands (hundreds) of times of more Canvas/DOM function calls in those 0.05 seconds, for drawing a frame.
So, I guess my question is, do you have any experience with using socket communication for very rapid message interchange. I would like to know whether 2500 small messages / second (in this case, corresponding to 150 kbytes/second) seems about right or might I be doing something very wrong.
For example, one suspicion is that the socket implementation in firefox (in particular using it via the JavaScript interface https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsIServerSocket) is not very good for this kind of rapid interaction. For example, the reading from the socket is done via an event loop mechanism. That is I rely on Firefox.. to notify me about the availability of incoming socket messages and sometimes there is a large (e.g. 250ms) delay between sending the message and receiving it (though that seems to be happening only when firefox is busy with doing other things, and I'm more interested in the ..theoretical.. limits of the socket communication)
Any ideas, any thoughts, any flaws that you see? Do you think using other IPC mechanisms would be better, e.g. pipes, implementing my communication from C++ XPCOM component, rather than from JavaScript, foreign function interface to C (that both JavaScript and PolyML have)?
(The project is located at https://assembla.com/wiki/show/polymlext if anyone's interested)
TCP can be tuned for higher throughput or faster response time. For higher throughput, you need to set the socket buffer to larger value. For getting good response time with smaller chunk of data, you need to set the TCP_NODELAY socket option. TCP on loopback if its fine tuned it should be identical to any IPC mechanism. Newer windows OS does special optimization like increasing MTU size,etc on loopback adapter to make it faster.
Related
I pretty much understand how the CAN protocol works -- when two nodes attempt to use the network at the same time, the lower id can frame gets priority and the other node detects this and halts.
This seems to get abstracted away when using socketcan - we simply write and read like we would any file descriptor. I may be misunderstanding something but I've gone through most of the docs (http://lxr.free-electrons.com/source/Documentation/networking/can.txt) and I don't think it's described unambiguously.
Does write() block until our frame is the lowest id frame, or does socketcan buffer the frame until the network is ready? If so, is the user notified when this occurs or do we use the loopback for this?
write does not block for channel contention. It could block because of the same reasons a TCP socket write would (very unlikely).
The CAN peripheral will receive a frame to be transmitted from the kernel and perform the Medium Access Control Protocol (MAC protocol) to send it over the wire. SocketCAN knows nothing about this layer of the protocol.
Where the frame is buffered is peripheral/driver dependent: the chain kernel-driver-peripheral behaves as 3 chained FIFOs with their own control flow mechanisms, but usually, it is the driver that buffers (if it is needed) the most since the peripheral has less memory available.
It is possible to subscribe for errors in the CAN stack protocol (signaled by the so called "error frames") by providing certain flags using the SocketCAN interface (see 4.1.2 in your link): this is the way to get error information at application layer.
Of course you can check for a correctly transmitted frame by checking the loopback interface, but it is overkill, the error reporting mechanism described above should be used instead and it is easier to use.
I've been using ZMQ in some Python applications for a while, but only very recently I decided to reimplement one of them in Go and I realized that ZMQ sockets are not thread-safe.
The original Python implementation uses an event loop that looks like this:
while running:
socks = dict(poller.poll(TIMEOUT))
if socks.get(router) == zmq.POLLIN:
client_id = router.recv()
_ = router.recv()
data = router.recv()
requests.append((client_id, data))
for req in requests:
rep = handle_request(req)
if rep:
replies.append(rep)
requests.remove(req)
for client_id, data in replies:
router.send(client_id, zmq.SNDMORE)
router.send(b'', zmq.SNDMORE)
router.send(data)
del replies[:]
The problem is that the reply might not be ready on the first pass, so whenever I have pending requests, I have to poll with a very short timeout or the clients will wait for more than they should, and the application ends up using a lot of CPU for polling.
When I decided to reimplement it in Go, I thought it would be as simple as this, avoiding the problem by using infinite timeout on polling:
for {
sockets, _ := poller.Poll(-1)
for _, socket := range sockets {
switch s := socket.Socket; s {
case router:
msg, _ := s.RecvMessage(0)
client_id := msg[0]
data := msg[2]
go handleRequest(router, client_id, data)
}
}
}
But that ideal implementation only works when I have a single client connected, or a light load. Under heavy load I get random assertion errors inside libzmq. I tried the following:
Following the zmq4 docs I tried adding a sync.Mutex and lock/unlock on all socket operations. It fails. I assume it's because ZMQ uses its own threads for flushing.
Creating one goroutine for polling/receiving and one for sending, and use channels in the same way I used the req/rep queues in the Python version. It fails, as I'm still sharing the socket.
Same as 2, but setting GOMAXPROCS=1. It fails, and throughput was very limited because replies were being held back until the Poll() call returned.
Use the req/rep channels as in 2, but use runtime.LockOSThread to keep all socket operations in the same thread as the socket. Has the same problem as above. It doesn't fail, but throughput was very limited.
Same as 4, but using the poll timeout strategy from the Python version. It works, but has the same problem the Python version does.
Share the context instead of the socket and create one socket for sending and one for receiving in separate goroutines, communicating with channels. It works, but I'll have to rewrite the client libs to use two sockets instead of one.
Get rid of zmq and use raw TCP sockets, which are thread-safe. It works perfectly, but I'll also have to rewrite the client libs.
So, it looks like 6 is how ZMQ was really intended to be used, as that's the only way I got it to work seamlessly with goroutines, but I wonder if there's any other way I haven't tried. Any ideas?
Update
With the answers here I realized I can just add an inproc PULL socket to the poller and have a goroutine connect and push a byte to break out of the infinite wait. It's not as versatile as the solutions suggested here, but it works and I can even backport it to the Python version.
I opened an issue a 1.5 years ago to introduce a port of https://github.com/vaughan0/go-zmq/blob/master/channels.go to pebbe/zmq4. Ultimately the author decided against it, but we have used this in production (under VERY heavy workloads) for a long time now.
This is a gist of the file that had to be added to the pebbe/zmq4 package (since it adds methods to the Socket). This could be re-written in such a way that the methods on the Socket receiver instead took a Socket as an argument, but since we vendor our code anyway, this was an easy way forward.
The basic usage is to create your Socket like normal (call it s for example) then you can:
channels := s.Channels()
outBound := channels.Out()
inBound := channels.In()
Now you have two channels of type [][]byte that you can use between goroutines, but a single goroutine - managed within the channels abstraction, is responsible for managing the Poller and communicating with the socket.
The blessed way to do this with pebbe/zmq4 is with a Reactor. Reactors have the ability to listen on Go channels, but you don't want to do that because they do so by polling the channel periodically using a poll timeout, which reintroduces the same exact problem you have in your Python version. Instead you can use zmq inproc sockets, with one end held by the reactor and the other end held by a goroutine that passes data in from a channel. It's complicated, verbose, and unpleasant, but I have used it successfully.
I'm still learning the basics of Haskell and currently working through porting some Java code to Haskell. My current problem is in UDP recvFrom using Network.Socket.ByteString.
The problem is with this method:
public abstract SocketAddress receive(ByteBuffer dst) throws IOException
Receives a datagram via this channel.
If a datagram is immediately available, or if this channel
is in blocking mode and one eventually becomes available,
then the datagram is copied into the given byte buffer and
its source address is returned. If this channel is in
non-blocking mode and a datagram is not immediately available
then this method immediately returns null.
The thing is that when I use Network.Socket.ByteString.recvFrom my code blocks at this point waiting for the packet to come. It doesn't return something like Maybe to indicate if something was received or not (the same way in Java there is a null returned when no data was available)
I found this thread: https://mail.haskell.org/pipermail/haskell-cafe/2010-August/082725.html
At the end of it there are a couple of ways suggested: 1) FFI 2) run recvFrom in its own thread
I'm not sure I'm capable of using any of those approaches at this moment (not enough knowledge). What I want is to get something similar to Java way of non-blocking receive: get needed info if it is available or just nothing if there is no single UDP packet. Anyone can point out what would be a better approach, any code snippets, someone already handled this problem?
You could use socketToHandle together with hGetNonBlocking:
recvNonBlocking :: Socket -> Int -> IO ByteString
recvNonBlocking s n = do
hnd <- socketToHandle s ReadMode
hGetNonBlocking hnd n
However, keep in mind that you cannot use the Socket after a call to socketToHandle, so this is only feasible if you would close the Socket either way afterwards.
In current lua sockets implementation, I see that we have to install a timer that calls back periodically so that we check in a non blocking API to see if we have received anything.
This is all good and well however in UDP case, if the sender has a lot of info being sent, do we risk loosing the data. Say another device sends a 2MB photo via UDP and we check socket receive every 100msec. At 2MBps, the underlying system must store 200Kbits before our call queries the underlying TCP stack.
Is there a way to get an event fired when we receive the data on the particular socket instead of the polling we have to do now?
There are a various ways of handling this issue; which one you will select depends on how much work you want to do.*
But first, you should clarify (to yourself) whether you are dealing with UDP or TCP; there is no "underlying TCP stack" for UDP sockets. Also, UDP is the wrong protocol to use for sending whole data such as a text, or a photo; it is an unreliable protocol so you aren't guaranteed to receive every packet, unless you're using a managed socket library (such as ENet).
Lua51/LuaJIT + LuaSocket
Polling is the only method.
Blocking: call socket.select with no time argument and wait for the socket to be readable.
Non-blocking: call socket.select with a timeout argument of 0, and use sock:settimeout(0) on the socket you're reading from.
Then simply call these repeatedly.
I would suggest using a coroutine scheduler for the non-blocking version, to allow other parts of the program to continue executing without causing too much delay.
Lua51/LuaJIT + LuaSocket + Lua Lanes (Recommended)
Same as the above method, but the socket exists in another lane (a lightweight Lua state in another thread) made using Lua Lanes (latest source). This allows you to instantly read the data from the socket and into a buffer. Then, you use a linda to send the data to the main thread for processing.
This is probably the best solution to your problem.
I've made a simple example of this, available here. It relies on Lua Lanes 3.4.0 (GitHub repo) and a patched LuaSocket 2.0.2 (source, patch, blog post re' patch)
The results are promising, though you should definitely refactor my example code if you derive from it.
LuaJIT + OS-specific sockets
If you're a little masochistic, you can try implementing a socket library from scratch. LuaJIT's FFI library makes this possible from pure Lua. Lua Lanes would be useful for this as well.
For Windows, I suggest taking a look at William Adam's blog. He's had some very interesting adventures with LuaJIT and Windows development. As for Linux and the rest, look at tutorials for C or the source of LuaSocket and translate them to LuaJIT FFI operations.
(LuaJIT supports callbacks if the API requires it; however, there is a signficant performance cost compared to polling from Lua to C.)
LuaJIT + ENet
ENet is a great library. It provides the perfect mix between TCP and UDP: reliable when desired, unreliable otherwise. It also abstracts operating system specific details, much like LuaSocket does. You can use the Lua API to bind it, or directly access it via LuaJIT's FFI (recommended).
* Pun unintentional.
I use lua-ev https://github.com/brimworks/lua-ev for all IO-multiplexing stuff.
It is very easy to use fits into Lua (and its function) like a charm. It is either select/poll/epoll or kqueue based and performs very good too.
local ev = require'ev'
local loop = ev.Loop.default
local udp_sock -- your udp socket instance
udp_sock:settimeout(0) -- make non blocking
local udp_receive_io = ev.IO.new(function(io,loop)
local chunk,err = udp_sock:receive(4096)
if chunk and not err then
-- process data
end
end,udp_sock:getfd(),ev.READ)
udp_receive_io:start(loop)
loop:loop() -- blocks forever
In my opinion Lua+luasocket+lua-ev is just a dream team for building efficient and robust networking applications (for embedded devices/environments). There are more powerful tools out there! But if your resources are limited, Lua is a good choice!
Lua is inherently single-threaded; there is no such thing as an "event". There is no way to interrupt executing Lua code. So while you could rig something up that looked like an event, you'd only ever get one if you called a function that polled which events were available.
Generally, if you're trying to use Lua for this kind of low-level work, you're using the wrong tool. You should be using C or something to access this sort of data, then pass it along to Lua when it's ready.
You are probably using a non-blocking select() to "poll" sockets for any new data available. Luasocket doesn't provide any other interface to see if there is new data available (as far as I know), but if you are concerned that it's taking too much time when you are doing this 10 times per second, consider writing a simplified version that only checks one socket you need and avoids creating and throwing away Lua tables. If that's not an option, consider passing nil to select() instead of {} for those lists you don't need to read and pass static tables instead of temporary ones:
local rset = {socket}
... later
...select(rset, nil, 0)
instead of
...select({socket}, {}, 0)
From this rpg socket tutorial we created a socket client in rpg that calls a java server socket
The problem is that connect()/send() operations blocks and we have a requirement that if the connect/send couldn't be done in a matter of a second per say, we have to just log it and finish.
If I set the socket to non-blocking mode (I think with fnctl), we are not fully understanding how to proceed, and can't find any useful documentation with examples for it.
I think if I do connect to a non-blocking socket I have to do select(..., timeout) which tells us if the connect succeed and/ we are able to send(bytes). But, if we send(bytes) afterwards, as it is now a non-blocking socket (which will immediately return after the call), how do I know that send() did the actual sending of the bytes to the server before closing the socket ?
I can fall back to have the client socket in AS400 as a Java or C procedure, but I really want to just keep it in a simple RPG program.
Would somebody help me understand how to do that please ?
Thanks !
In my opinion, that RPG tutorial you mention has a slight defect. What I believe is causing your confusion is the following section's code:
...
Consequently, we typically call the
send() API like this:
D miscdata S 25A
D rc S 10I 0
C eval miscdata = 'The data to send goes here'
C eval rc = send(s: %addr(miscdata): 25: 0)
c if rc < 25
C* for some reason we weren't able to send all 25 bytes!
C endif
...
If you read the documentation of send() you will see that the return value does not indicate an error if it is greater than -1 yet in the code above it seems as if an error has occurred. In fact, the sum of the return values must equal the size of the buffer assuming that you keep moving the pointer into the buffer to reflect what has been sent. Look here in Beej's Guide to Network Programming. You might also like to look at Richard Stevens' book UNIX Network Programming, Volume 1 for really detailed explanations.
As to the problem of determining if the last send before close() did the actual send ... well the paragraph above explains how to determine what portion of the data was sent. However, calling close() will attempt to send all unsent data unless SO_LINGER is set.
fnctl() is used to control blocking while setsockopt() is used to set SO_LINGER.
The abstraction of network communications being used is BSD sockets. There are some slight differences in implementations across OS's but it is generally quite homogeneous. This means that one can generally use documentation written for other OS's for the broad overview. Most of the time.