I've been using ZMQ in some Python applications for a while, but only recently did I decide to reimplement one of them in Go, and I realized that ZMQ sockets are not thread-safe.
The original Python implementation uses an event loop that looks like this:
while running:
    socks = dict(poller.poll(TIMEOUT))
    if socks.get(router) == zmq.POLLIN:
        client_id = router.recv()
        _ = router.recv()
        data = router.recv()
        requests.append((client_id, data))
    for req in requests:
        rep = handle_request(req)
        if rep:
            replies.append(rep)
            requests.remove(req)
    for client_id, data in replies:
        router.send(client_id, zmq.SNDMORE)
        router.send(b'', zmq.SNDMORE)
        router.send(data)
    del replies[:]
The problem is that the reply might not be ready on the first pass, so whenever I have pending requests I have to poll with a very short timeout; otherwise clients wait longer than they should, and the application ends up using a lot of CPU just for polling.
When I decided to reimplement it in Go, I thought it would be as simple as this, avoiding the problem by using an infinite timeout when polling:
for {
    sockets, _ := poller.Poll(-1)
    for _, socket := range sockets {
        switch s := socket.Socket; s {
        case router:
            msg, _ := s.RecvMessage(0)
            client_id := msg[0]
            data := msg[2]
            go handleRequest(router, client_id, data)
        }
    }
}
But that ideal implementation only works when I have a single client connected, or a light load. Under heavy load I get random assertion errors inside libzmq. I tried the following:
1. Following the zmq4 docs I tried adding a sync.Mutex and lock/unlock on all socket operations. It fails. I assume it's because ZMQ uses its own threads for flushing.
2. Creating one goroutine for polling/receiving and one for sending, and using channels in the same way I used the req/rep queues in the Python version. It fails, as I'm still sharing the socket.
3. Same as 2, but setting GOMAXPROCS=1. It fails, and throughput was very limited because replies were being held back until the Poll() call returned.
4. Use the req/rep channels as in 2, but use runtime.LockOSThread to keep all socket operations in the same thread as the socket. It doesn't fail, but it has the same throughput problem as above: replies were held back until the Poll() call returned.
5. Same as 4, but using the poll timeout strategy from the Python version. It works, but has the same problem the Python version does.
6. Share the context instead of the socket and create one socket for sending and one for receiving in separate goroutines, communicating with channels. It works, but I'll have to rewrite the client libs to use two sockets instead of one.
7. Get rid of ZMQ and use raw TCP sockets, which are thread-safe. It works perfectly, but I'll also have to rewrite the client libs.
So, it looks like option 6 is how ZMQ was really intended to be used, as that's the only way I got it to work seamlessly with goroutines, but I wonder if there's any other way I haven't tried. Any ideas?
Update
With the answers here I realized I can just add an inproc PULL socket to the poller and have a goroutine connect and push a byte to break out of the infinite wait. It's not as versatile as the solutions suggested here, but it works and I can even backport it to the Python version.
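For illustration, here is a minimal sketch of that wake-up trick, assuming pebbe/zmq4; the endpoint name inproc://wake, the reply struct, and handleRequest are my own placeholders, not part of the original code, and messages are assumed to arrive as [identity, empty delimiter, payload] frames as in the question:

package main

import (
    zmq "github.com/pebbe/zmq4"
)

type reply struct {
    clientID string
    data     string
}

func handleRequest(pending chan<- reply, wakeup chan<- struct{}, clientID, data string) {
    // ... do the actual work here, then hand the reply back and wake the poller.
    pending <- reply{clientID: clientID, data: data}
    wakeup <- struct{}{}
}

func main() {
    router, _ := zmq.NewSocket(zmq.ROUTER)
    defer router.Close()
    router.Bind("tcp://*:5555")

    // PULL end of the wake-up pair stays with the polling goroutine.
    wake, _ := zmq.NewSocket(zmq.PULL)
    defer wake.Close()
    wake.Bind("inproc://wake")

    pending := make(chan reply, 64)
    wakeup := make(chan struct{}, 64)

    // A single goroutine owns the PUSH end, so no socket is ever shared.
    go func() {
        push, _ := zmq.NewSocket(zmq.PUSH)
        defer push.Close()
        push.Connect("inproc://wake")
        for range wakeup {
            push.SendBytes([]byte{0}, 0)
        }
    }()

    poller := zmq.NewPoller()
    poller.Add(router, zmq.POLLIN)
    poller.Add(wake, zmq.POLLIN)

    for {
        polled, _ := poller.Poll(-1) // block forever; the wake socket interrupts us
        for _, p := range polled {
            switch p.Socket {
            case router:
                msg, _ := router.RecvMessage(0)
                go handleRequest(pending, wakeup, msg[0], msg[2])
            case wake:
                wake.RecvBytes(0) // discard the wake-up byte
            }
        }
        // Send whatever replies are ready; the ROUTER socket is only ever
        // touched from this goroutine.
    drain:
        for {
            select {
            case r := <-pending:
                router.SendMessage(r.clientID, "", r.data)
            default:
                break drain
            }
        }
    }
}

The buffered wakeup channel just keeps handler goroutines from blocking if several replies finish before the poller wakes up; the same trick backports to the Python version as mentioned above.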
I opened an issue about 1.5 years ago to introduce a port of https://github.com/vaughan0/go-zmq/blob/master/channels.go to pebbe/zmq4. Ultimately the author decided against it, but we have used this in production (under VERY heavy workloads) for a long time now.
This is a gist of the file that had to be added to the pebbe/zmq4 package (since it adds methods to the Socket type). It could be rewritten so that the methods take a Socket as an argument instead of being defined on the Socket receiver, but since we vendor our code anyway, this was an easy way forward.
The basic usage is to create your Socket as normal (call it s, for example), and then you can:
channels := s.Channels()
outBound := channels.Out()
inBound := channels.In()
Now you have two channels of type [][]byte that you can use between goroutines, while a single goroutine (managed within the channels abstraction) is responsible for managing the Poller and communicating with the socket.
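As a hedged illustration only, continuing from the snippet above and assuming just the Channels/In/Out methods quoted there, ROUTER frames of the form [identity, empty delimiter, payload], and a hypothetical handleRequest function:

go func() {
    for msg := range inBound {
        go func(m [][]byte) {
            clientID, data := m[0], m[2]
            rep := handleRequest(data) // hypothetical handler returning the reply payload
            // Sending on the Out() channel is safe from any goroutine; only the
            // abstraction's internal goroutine ever touches the socket itself.
            outBound <- [][]byte{clientID, []byte{}, rep}
        }(msg)
    }
}()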
The blessed way to do this with pebbe/zmq4 is with a Reactor. Reactors have the ability to listen on Go channels, but you don't want to do that, because they do so by polling the channel periodically with a poll timeout, which reintroduces the exact same problem you have in your Python version. Instead, you can use ZMQ inproc sockets, with one end held by the reactor and the other end held by a goroutine that passes data in from a channel. It's complicated, verbose, and unpleasant, but I have used it successfully.
Related
I am currently in the process of converting some of my code from blocking to non-blocking sockets using the socket2 crate, but I am running into issues with connecting the socket. The socket always fails to connect before the timeout is exceeded. Despite my attempts to search for examples, I have yet to find any Rust code showing how a non-blocking TCP stream is created.
To give you an idea of what I am attempting to do, the code I am currently converting looks roughly like this. It gives me no issues and works fine, but it is getting too costly to create a new thread for every socket.
let address = SocketAddr::from(([x, y, z, v], port));
let mut socket = TcpStream::connect_timeout(&address, timeout)?;
At the moment, my code to connect the socket looks like this. Since connect_timeout can only be executed in blocking mode, I use connect instead and regularly poll the socket to check if it is connected. At the moment, I keep getting WouldBlock errors when calling connect, but I do not know what this means. At first I assumed that the connect was proceeding, but returning the result immediately would require blocking so a WouldBlock error was given instead. However, due to the issues getting the socket to connect, I am second guessing those assumptions.
let address = SocketAddr::from(([x, y, z, v], port));
// Create socket equivalent to TcpStream
let socket = Socket::new(Domain::IPV4, Type::STREAM, Some(Protocol::TCP))?;
// Enable non-blocking mode on the socket
socket.set_nonblocking(true)?;

// What response should I expect? Do I need to bind an address first?
match socket.connect(&address.into()) {
    Ok(_) => {}
    Err(e) if e.kind() == ErrorKind::WouldBlock => {
        // I keep getting this error, but I don't know what this means.
        // Is non-blocking connect unavailable?
        // Do I need to keep trying to connect until it succeeds?
    },
    // Are there any other types of errors I should be looking for before failing the connection?
    Err(e) => return Err(e),
}
I am also unsure what the correct approach is to determine if a socket is connected. At the moment, I attempt to read to a zero length buffer and check if I get a NotConnected error. However, I am unsure what WouldBlock means in this context and I have never gotten a positive response from this approach.
let mut buffer = [0u8; 0];
// I also tried self.socket.peer_addr(), but ran into issues where it returned a positive
// response despite not being connected.
match self.socket.read(&mut buffer) {
    Ok(_) => Ok(true),
    // What does WouldBlock mean in this context?
    Err(e) if e.kind() == ErrorKind::WouldBlock => Ok(false),
    Err(e) if e.kind() == ErrorKind::NotConnected => Ok(false),
    Err(e) => Err(e),
}
Each socket is periodically checked until an arbitrary timeout is reached to determine whether it has connected. So far, no socket has passed the connected check before reaching its timeout (20 sec) when connecting to a known-good server. These tests are all performed in a single-threaded application on Windows against a known-good server that has been verified to work with the blocking version of my program.
Edit: Here is a minimum reproducible example for this issue. However, it likely won't work if you run it on Rust playground due to network restrictions. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=a08c22574a971c0032fd9dd37e10fd94
WouldBlock is the expected error when a non-blocking connect() (or other operation) is successfully started in the background. You can then wait up to your desired timeout interval for the operation to finish (use select() or epoll() or another platform-specific notification mechanism to detect this). If the timeout elapses, close the socket and handle the timeout accordingly. Otherwise, check the socket's SO_ERROR option to see whether the operation succeeded or failed, and act accordingly.
To give you an idea of what I am attempting to do, the code I am currently converting looks roughly like this. This gives me no issues and works fine, but it is getting too costly to create a new thread for every socket.
This sounds very much like an XY problem to me.
I think you misunderstand what 'non-blocking' means. It does not mean that you can simply run multiple sockets in parallel without worrying. It does mean that every operation that would block returns an error instead, and you have to retry it at a later time.
Actual non-blocking sockets usually don't get used at the end-user level. They are meant for libraries that depend on them and provide some higher-level interface for asynchrony. Non-blocking sockets are hard to get right. They need to be paired with events, because otherwise you can only implement them with busy loops that burn 100% CPU, which is most likely not what you want.
There's good news, though! Remember the high-level libraries I mentioned that use non-blocking sockets internally? The most famous one right now is called tokio, and it does exactly what you want. It will require you to learn asynchronous programming, but you will grasp it, I'm sure :)
I recommend this read: https://tokio.rs/tokio/tutorial
Problem
I want to run a load test with a high number of requests per second. I have written a socket sender and a receiver in Go. The sender sends a lot of packets to port 7357, each containing the current time in nanoseconds. The receiver listens on port 7357 and parses each message, computing the latency.
The problem is that when reading I get multiple packets in one conn.Read(). I understand this means that I am in fact sending multiple messages per packet: each conn.Write() does not produce its own packet; instead it waits for some time and gets coalesced with the next write (or the next few) before being sent.
Question
How can I make sure that each conn.Write() is sent individually through the socket as a separate packet? Note: I don't want to reinvent TCP, I just want to simulate the load from a number of external entities that send a message each.
Steps Taken
I have searched the documentation but there seems to be no conn.Flush() or similar. I have tried using a buffered writer:
writer := bufio.NewWriter(conn)
...
bytes, err := writer.Write(message)
err = writer.Flush()
No errors, but I still get mixed packets at the receiving end. I have also tried doing a fake conn.Read() of 0 bytes after every conn.Write(), but it didn't work either. Sending a message terminator such as \r\n does not seem to make any difference. Finally, Nagle's algorithm is disabled by default in Go, but I have called tcp.SetNoDelay(true) for good measure.
In Node.js I managed to do the trick with a setImmediate() after each socket.write(): setImmediate() waits for all I/O to finish before continuing. How can I do the same in Go so I get separate packets?
Code Snippets
Send:
func main() {
    conn, _ := net.Dial("tcp", ":7357")
    defer conn.Close()
    buff := make([]byte, 1024)
    for {
        timestamp := strconv.FormatInt(time.Now().UnixNano(), 10)
        conn.Write([]byte(timestamp))
        conn.Read(buff)
    }
}
Receive:
func main() {
    listen, _ := net.Listen("tcp4", ":7357")
    defer listen.Close()
    for {
        conn, _ := listen.Accept()
        go handler(conn)
    }
}
func handler(conn net.Conn) {
    defer conn.Close()
    var buf = make([]byte, 1024)
    for {
        n, _ := conn.Read(buf)
        data := string(buf[:n])
        timestamp, _ := strconv.ParseInt(data, 10, 64)
        elapsed := time.Now().UnixNano() - timestamp
        log.Printf("Elapsed %v", elapsed)
    }
}
Error handling has been removed for legibility, but it is thoroughly checked in the actual code. The code crashes the first time strconv.ParseInt() runs, with a "value out of range" error, because it receives several coalesced timestamps in a single read.
There used to be a rule that before anyone was permitted to write any code that uses TCP, they were required to repeat the following sentence from memory and explain what it means: "TCP is not a message protocol, it is a reliable byte-stream protocol that does not preserve application message boundaries."
Aside from the fact that your suggested solution is simply not possible reliably with TCP, it is not the solution to reducing latency. If the network is overwhelmed, using more packets to send the same data will just make the latency worse.
TCP is a byte stream protocol. The service it provides is a stream of bytes. Period.
It seems that you want a low-latency message protocol that works over TCP. Great. Design one and implement it.
The main trick to getting low latency is to use application-level acknowledgements. The TCP ACK flag will piggy-back onto the acknowledgements, providing low latency.
Do not disable Nagling. That's a hack that's only needed when you can't design a proper protocol intended to work with TCP in the first place. It will make latency worse under non-ideal conditions, for the same reason the solution you suggested, even if it were possible, would be a poor idea.
But you MUST design and implement a message protocol or use an existing one. Your code is expecting TCP, which is not a message protocol, to somehow deliver messages to it. That is just not going to happen, period.
How can I make sure that each conn.Write() is sent individually through the socket as a separate packet? Note: I don't want to reinvent TCP, I just want to simulate the load from a number of external entities that send a message each.
Even if you could, that wouldn't do what you want anyway. Even if the messages were sent in separate packets, that would not guarantee that a read on the other side wouldn't coalesce them. If you want to send and receive messages, you need a message protocol, which TCP is not.
In Node.js I managed to do the trick with a setImmediate() after each socket.write(): setImmediate() waits for all I/O to finish before continuing. How can I do the same in Go so I get separate packets?
You may have changed it from "happens not to work" to "happened to work when I tried it". But for the reasons I've explained, you can never make this work reliably and you are on a fool's errand.
If you want to send and receive messages, you need to precisely defined what a "message" will be and write code to send and receive them. There are no shortcuts that are reliable. TCP is a byte stream protocol, period.
If you care about latency and throughput, design an optimized message protocol to layer over TCP that optimizes these. Do not disable Nagle as Nagle is required to prevent pathological behavior. It should only be disabled when you cannot change the protocol and are stuck with a protocol that was not designed to layer on top of TCP. Disabling Nagle ties one hand behind your back and causes dramatically worse latency and throughput under poor network conditions by increasing the number of packets required to send the data even when that doesn't make any sense.
You probably want/need application-level acknowledgements. This works nicely with TCP because TCP ACKs will piggyback on the application-level acknowledgements.
You can read a predefined number of bytes from the socket on each iteration, which might help, but you really need to create your own protocol to be handled by your application. Without a protocol it is impossible to guarantee that everything will work reliably, because the receiver cannot tell where one message begins and where it ends. A minimal framing sketch follows below.
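For illustration only, here is one common way to frame messages over TCP in Go: prefix every payload with a 4-byte big-endian length header. The WriteFrame/ReadFrame names and the framing package are mine, not from any of the code above.

package framing

import (
    "encoding/binary"
    "io"
)

// WriteFrame sends one application-level message: a 4-byte big-endian
// length header followed by the payload bytes.
func WriteFrame(w io.Writer, payload []byte) error {
    var hdr [4]byte
    binary.BigEndian.PutUint32(hdr[:], uint32(len(payload)))
    if _, err := w.Write(hdr[:]); err != nil {
        return err
    }
    _, err := w.Write(payload)
    return err
}

// ReadFrame reads exactly one message, regardless of how TCP split or
// coalesced the bytes on the wire.
func ReadFrame(r io.Reader) ([]byte, error) {
    var hdr [4]byte
    if _, err := io.ReadFull(r, hdr[:]); err != nil {
        return nil, err
    }
    payload := make([]byte, binary.BigEndian.Uint32(hdr[:]))
    if _, err := io.ReadFull(r, payload); err != nil {
        return nil, err
    }
    return payload, nil
}

The sender from the question would then call WriteFrame(conn, []byte(timestamp)) instead of a bare conn.Write, and the handler would loop over ReadFrame(conn); a bufio.Scanner with a custom split function is another common way to achieve the same thing.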
I'm still learning the basics of Haskell and currently working through porting some Java code to Haskell. My current problem is in UDP recvFrom using Network.Socket.ByteString.
The problem is with this method:
public abstract SocketAddress receive(ByteBuffer dst) throws IOException

Receives a datagram via this channel. If a datagram is immediately available, or if this channel is in blocking mode and one eventually becomes available, then the datagram is copied into the given byte buffer and its source address is returned. If this channel is in non-blocking mode and a datagram is not immediately available then this method immediately returns null.
The thing is that when I use Network.Socket.ByteString.recvFrom, my code blocks at that point waiting for a packet to arrive. It doesn't return something like Maybe to indicate whether anything was received (the way Java returns null when no data is available).
I found this thread: https://mail.haskell.org/pipermail/haskell-cafe/2010-August/082725.html
At the end of it a couple of approaches are suggested: 1) FFI, 2) run recvFrom in its own thread.
I'm not sure I'm capable of using either of those approaches at this moment (not enough knowledge). What I want is something similar to the Java way of non-blocking receive: get the data if it is available, or just nothing if there is no UDP packet waiting. Can anyone point out a better approach, ideally with code snippets, or has anyone already handled this problem?
You could use socketToHandle together with hGetNonBlocking:
import Data.ByteString (ByteString, hGetNonBlocking)
import Network.Socket (Socket, socketToHandle)
import System.IO (IOMode (ReadMode))

recvNonBlocking :: Socket -> Int -> IO ByteString
recvNonBlocking s n = do
  hnd <- socketToHandle s ReadMode
  hGetNonBlocking hnd n
However, keep in mind that you cannot use the Socket after a call to socketToHandle, so this is only feasible if you would close the Socket either way afterwards.
In the current LuaSocket implementation, I see that we have to install a timer that calls back periodically so that we can check, via a non-blocking API, whether we have received anything.
This is all well and good; however, in the UDP case, if the sender has a lot of data to send, do we risk losing it? Say another device sends a 2 MB photo via UDP and we check the socket for received data every 100 msec. At 2 MBps, the underlying system must store 200 Kbits before our call queries the underlying TCP stack.
Is there a way to get an event fired when we receive the data on the particular socket instead of the polling we have to do now?
There are various ways of handling this issue; which one you select depends on how much work you want to do.*
But first, you should clarify (to yourself) whether you are dealing with UDP or TCP; there is no "underlying TCP stack" for UDP sockets. Also, UDP is the wrong protocol for sending a complete piece of data such as a text or a photo; it is an unreliable protocol, so you aren't guaranteed to receive every packet unless you use a managed socket library (such as ENet).
Lua51/LuaJIT + LuaSocket
Polling is the only method.
Blocking: call socket.select with no time argument and wait for the socket to be readable.
Non-blocking: call socket.select with a timeout argument of 0, and use sock:settimeout(0) on the socket you're reading from.
Then simply call these repeatedly.
I would suggest using a coroutine scheduler for the non-blocking version, to allow other parts of the program to continue executing without causing too much delay.
Lua51/LuaJIT + LuaSocket + Lua Lanes (Recommended)
Same as the above method, but the socket exists in another lane (a lightweight Lua state in another thread) made using Lua Lanes (latest source). This allows you to instantly read the data from the socket and into a buffer. Then, you use a linda to send the data to the main thread for processing.
This is probably the best solution to your problem.
I've made a simple example of this, available here. It relies on Lua Lanes 3.4.0 (GitHub repo) and a patched LuaSocket 2.0.2 (source, patch, blog post re' patch)
The results are promising, though you should definitely refactor my example code if you derive from it.
LuaJIT + OS-specific sockets
If you're a little masochistic, you can try implementing a socket library from scratch. LuaJIT's FFI library makes this possible from pure Lua. Lua Lanes would be useful for this as well.
For Windows, I suggest taking a look at William Adam's blog. He's had some very interesting adventures with LuaJIT and Windows development. As for Linux and the rest, look at tutorials for C or the source of LuaSocket and translate them to LuaJIT FFI operations.
(LuaJIT supports callbacks if the API requires them; however, there is a significant performance cost compared to polling from Lua to C.)
LuaJIT + ENet
ENet is a great library. It provides the perfect mix between TCP and UDP: reliable when desired, unreliable otherwise. It also abstracts operating system specific details, much like LuaSocket does. You can use the Lua API to bind it, or directly access it via LuaJIT's FFI (recommended).
* Pun unintentional.
I use lua-ev https://github.com/brimworks/lua-ev for all IO-multiplexing stuff.
It is very easy to use and fits into Lua (and its functions) like a charm. It is either select/poll/epoll or kqueue based and performs very well too.
local ev = require'ev'
local loop = ev.Loop.default
local udp_sock -- your udp socket instance
udp_sock:settimeout(0) -- make non blocking
local udp_receive_io = ev.IO.new(function(io, loop)
    local chunk, err = udp_sock:receive(4096)
    if chunk and not err then
      -- process data
    end
  end, udp_sock:getfd(), ev.READ)
udp_receive_io:start(loop)
loop:loop() -- blocks forever
In my opinion Lua+luasocket+lua-ev is just a dream team for building efficient and robust networking applications (for embedded devices/environments). There are more powerful tools out there! But if your resources are limited, Lua is a good choice!
Lua is inherently single-threaded; there is no such thing as an "event". There is no way to interrupt executing Lua code. So while you could rig something up that looked like an event, you'd only ever get one if you called a function that polled which events were available.
Generally, if you're trying to use Lua for this kind of low-level work, you're using the wrong tool. You should be using C or something to access this sort of data, then pass it along to Lua when it's ready.
You are probably using a non-blocking select() to "poll" sockets for any new data available. Luasocket doesn't provide any other interface to see if there is new data available (as far as I know), but if you are concerned that it's taking too much time when you are doing this 10 times per second, consider writing a simplified version that only checks one socket you need and avoids creating and throwing away Lua tables. If that's not an option, consider passing nil to select() instead of {} for those lists you don't need to read and pass static tables instead of temporary ones:
local rset = {socket}
... later
...select(rset, nil, 0)
instead of
...select({socket}, {}, 0)
I've been a C coder for a while now - neither a newbie nor an expert. I have a certain daemonized application in C on PPC Linux, and I use PHP's socket_connect as a client to connect to this service locally. The server uses epoll for multiplexing connections via a Unix socket. A user-submitted string is parsed for certain characters/words using strstr() and, if they are found, the server spawns 4 joinable threads that contact different websites simultaneously. I use socket, connect, write and read to interact with the said webservers via TCP on their port 80 in each thread.
All connections and writes seem successful. Reads from the webserver sockets fail, however, in one of two ways: (A) three of the threads seem to hang, and only one thread returns -1 with errno set to 104; the responding thread takes something like 10 minutes - an eternity :-(. *I read somewhere that 104 (is it EINTR?) in a network context suggests 'the connection was reset by peer'. Or (B) 0 bytes come back from three threads, and only 1 of the 4 threads actually returns some data. Isn't socket read/write thread-safe? I use thread-safe (and reentrant) libc functions such as strtok_r, gethostbyname_r, etc.
*I doubt that the said webhosts are actually resetting the connection, because when I run a single-threaded standalone version (everything else equal) everything works perfectly - but of course in series, not in parallel.
There's a second problem too (oops): I can't write back to the client that connects to my epoll-ed Unix socket. My daemon application hangs and hogs the CPU at over 100% forever, yet nothing is written to the client's end. I am sure the client (a very typical PHP socket application) hasn't closed the connection when this happens - no error(s) detected either. Any ideas?
I cannot figure out what is wrong, even with Valgrind, GDB, or extensive logging. Kindly help where you can.
Yes, read/write are thread-safe. But beware of gethostbyname() and getservbyname() if you're using them - they return pointers to static data, and may not be thread-safe.
errno 104 is ECONNRESET (not EINTR). Use strerror or perror to get the textual error message (like 'Connection reset by peer') for a particular errno code.
The best way to figure out what's going wrong is often to do very detailed logging - log the results of every operation, plus details like the IP address/port connecting to, the number of bytes read/written, the thread id, and so forth. And, of course, make sure your logging code is thread-safe :-)
Getting an ECONNRESET after 10 minutes sounds like the result of your connection timing out. Either the web server isn't sending the data or your app isn't receiving it.
To test the former, hook up a program like Wireshark to the local loopback device and look for traffic to and from the port you are using.
For the latter, take a look at the epoll() man page. It mentions a scenario where using edge-triggered events can result in a lockup, because there is still data in the buffer but no new data comes in, so no new event is triggered.