How does zmq poller work? - sockets

I am confused about what the poller actually does in zmq. The zguide covers it only minimally and describes it as a way to read from multiple sockets. That answer does not satisfy me, because it does not explain how to give sockets a timeout. I know that "zeromq: how to prevent infinite wait?" explains this for push/pull, but not for req/rep patterns, which is what I want to use.
What I am asking is: how does the poller work, and how does it apply to keeping track of sockets and their requests?

When you need to listen on different sockets in the same thread, use a poller:
ZMQ.Socket subscriber = ctx.socket(ZMQ.SUB);
ZMQ.Socket puller = ctx.socket(ZMQ.PULL);
Register sockets with the poller (POLLIN listens for incoming messages):
ZMQ.Poller poller = ctx.poller(2);
poller.register(subscriber, ZMQ.Poller.POLLIN);
poller.register(puller, ZMQ.Poller.POLLIN);
When polling, use a loop:
while (notInterrupted()) {
    poller.poll();
    // subscriber registered at index 0
    if (poller.pollin(0))
        subscriber.recv(ZMQ.DONTWAIT);
    // puller registered at index 1
    if (poller.pollin(1))
        puller.recv(ZMQ.DONTWAIT);
}
Choose how you want to poll...
poller.poll() blocks until there's data on either socket.
poller.poll(1000) blocks for 1s, then times out.
The poller notifies when there's data (messages) available on the sockets; it's your job to read it.
When reading, do it without blocking: socket.recv(ZMQ.DONTWAIT). Even though poller.pollin(0) checks whether there's data to be read, you want to avoid any blocking calls inside the polling loop; otherwise you could end up stalling the whole loop on one 'stuck' socket.
So, if two separate messages are sent to subscriber, you have to invoke subscriber.recv() twice in order to clear the poller; if you call subscriber.recv() only once, the poller will keep telling you there's another message to be read. In essence, the poller reports whether messages are available on each socket; it does not hand you the messages themselves.
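The "keeps telling you" behaviour can be tried without ZeroMQ. The following is only an analogy, a minimal sketch using Python's standard library rather than any ZeroMQ binding: a Unix datagram socket pair stands in for a message socket, and select.poll stands in for the poller. The function name drain_with_poll is made up for this illustration.

```python
import select
import socket


def drain_with_poll():
    """Send two datagrams, then poll and recv until nothing is left to read."""
    # A Unix datagram socket pair keeps message boundaries, loosely like ZeroMQ messages.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    receiver.setblocking(False)
    poller = select.poll()
    poller.register(receiver, select.POLLIN)

    sender.send(b"first")
    sender.send(b"second")

    received = []
    # poll(0) returns immediately; it keeps reporting POLLIN while data is still queued.
    while poller.poll(0):
        received.append(receiver.recv(1024))

    sender.close()
    receiver.close()
    return received
```

Calling drain_with_poll() needs two passes through the loop, one recv() per message, before the poll call stops reporting the socket as readable.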
You should run through the polling examples and play with the code, it's the best way to learn.
Does that answer your question?

In this answer I list details from the documentation: http://api.zeromq.org/4-1:zmq-poll
I also add some explanations that clear up common confusion for newcomers. If you are in a hurry, you may want to start with the "Important notes" section near the end, where I go into more depth. I still suggest reading the details from the doc reference in the first section carefully.
Doc ref and notes
Listen to multiple sockets and events
The zmq_poll() function provides a mechanism for applications to multiplex input/output events in a level-triggered fashion over a set of sockets. Each member of the array pointed to by the items argument is a zmq_pollitem_t structure. The nitems argument specifies the number of items in the items array. The zmq_pollitem_t structure is defined as follows:
typedef struct
{
    void *socket;
    int fd;
    short events;
    short revents;
} zmq_pollitem_t;
zmq Socket or standard socket through fd
For each zmq_pollitem_t item, zmq_poll() shall examine either the ØMQ socket referenced by socket or the standard socket specified by the file descriptor fd, for the event(s) specified in events. If both socket and fd are set in a single zmq_pollitem_t, the ØMQ socket referenced by socket shall take precedence and the value of fd shall be ignored.
Big note (same context):
All ØMQ sockets passed to the zmq_poll() function must share the same ØMQ context and must belong to the thread calling zmq_poll().
Revents member
For each zmq_pollitem_t item, zmq_poll() shall first clear the revents member, and then indicate any requested events that have occurred by setting the bit corresponding to the event condition in the revents member.
Upon successful completion, the zmq_poll() function shall return the number of zmq_pollitem_t structures with events signaled in revents or 0 if no events have been signaled.
Awaiting for events and blocking
If none of the requested events have occurred on any zmq_pollitem_t item, zmq_poll() shall wait timeout milliseconds for an event to occur on any of the requested items. If the value of timeout is 0, zmq_poll() shall return immediately. If the value of timeout is -1, zmq_poll() shall block indefinitely until a requested event has occurred on at least one zmq_pollitem_t. The resolution of timeout is 1 millisecond.
0 => doesn't wait
-1 => block
+val => block and wait for the timeout amount
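The same three timeout cases exist in the operating system's own poll(), which Python exposes as select.poll with a timeout in milliseconds. As a rough illustration only (standard library, not ZeroMQ; poll_elapsed is a made-up helper name), the sketch below polls a socket that never receives anything and measures how long the call waited. The -1 case is not exercised because it would block forever.

```python
import select
import socket
import time


def poll_elapsed(timeout_ms):
    """Poll an idle socket and return (events, seconds spent waiting)."""
    a, b = socket.socketpair()
    poller = select.poll()
    poller.register(a, select.POLLIN)
    start = time.monotonic()
    events = poller.poll(timeout_ms)  # 0 returns at once, a positive value waits that long
    elapsed = time.monotonic() - start
    a.close()
    b.close()
    return events, elapsed
```

poll_elapsed(0) comes back immediately with an empty event list, while poll_elapsed(100) comes back with an empty list only after roughly a tenth of a second.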
Events
The events and revents members of zmq_pollitem_t are bit masks constructed by OR'ing a combination of the following event flags:
ZMQ_POLLIN
For ØMQ sockets, at least one message may be received from the socket without blocking. For standard sockets this is equivalent to the POLLIN flag of the poll() system call and generally means that at least one byte of data may be read from fd without blocking.
ZMQ_POLLOUT
For ØMQ sockets, at least one message may be sent to the socket without blocking. For standard sockets this is equivalent to the POLLOUT flag of the poll() system call and generally means that at least one byte of data may be written to fd without blocking.
ZMQ_POLLERR
For standard sockets, this flag is passed through zmq_poll() to the underlying poll() system call and generally means that some sort of error condition is present on the socket specified by fd. For ØMQ sockets this flag has no effect if set in events, and shall never be returned in revents by zmq_poll().
Note:
The zmq_poll() function may be implemented or emulated using operating system interfaces other than poll(), and as such may be subject to the limits of those interfaces in ways not defined in this documentation.
Return value
Upon successful completion, the zmq_poll() function shall return the number of zmq_pollitem_t structures with events signaled in revents or 0 if no events have been signaled. Upon failure, zmq_poll() shall return -1 and set errno to one of the values defined below.
Example
Polling indefinitely for input events on both a 0mq socket and a standard socket.
zmq_pollitem_t items [2];
/* First item refers to ØMQ socket 'socket' */
items[0].socket = socket;
items[0].events = ZMQ_POLLIN;
/* Second item refers to standard socket 'fd' */
items[1].socket = NULL;
items[1].fd = fd;
items[1].events = ZMQ_POLLIN;
/* Poll for events indefinitely */
int rc = zmq_poll (items, 2, -1);
assert (rc >= 0);
/* Returned events will be stored in items[].revents */
Important notes
What the poller does, and what about receiving
The poller only checks for events and waits until one occurs.
POLLIN is the event for receiving: it means data is there to be received.
We then read it through recv(). Reading, or doing anything else with the data, is our responsibility; the poller is only there to wait for events. Through the zmq_pollitem_t items we can listen for multiple events at once, and if any of them happens, the poller unblocks. We can then check which event occurred through revents in the corresponding zmq_pollitem_t. Note that polling is level-triggered: the poller does not hand out events one at a time, and as long as a socket still has unread messages in its queue, the next poll call returns immediately and reports that socket as readable again. The messages themselves stay in each socket's queue in the order they arrived.
A note about receiving, and what about a single socket
Take a router. One router can receive multiple requests, even from a single client, and also from multiple clients at once. In a setup where all the clients are of the same nature and are the ones connecting to the router, a question that can cross a newcomer's mind is: do I need a poller to handle this asynchronous traffic? The answer is no. There is no need for a poller, or for listening on different sockets.
The big note is: receive calls (zmq_recv(), or socket.recv() in some language bindings) block, and they are the way to read. When messages arrive, they are queued. The poller has nothing to do with this. The poller only waits for events on different sockets and unblocks when any of them occurs; if the timeout is reached instead, no event occurred. It does no more than that.
The nature of receiving is straightforward. The receive call blocks until a message arrives in the socket's message queue. When several arrive, they are queued, and each subsequent recv() call pulls the next message, or the next frame (depending on the receive method, the API level, and how much the binding library abstracts over the low-level one), because we can also access messages frame by frame, one frame per call.
So it becomes clear: receive calls are what receive. They block until a message enters the queue. Messages arriving in parallel enter the queue as they come, and each call either consumes the next queued message or waits for one.
That is a very important thing to know, and it can confuse newcomers.
The poller is only needed when there are multiple sockets, and these are always sockets that we create in the code of the process in question (to bind them, or to connect to something). Without a poller, how would you receive messages from several sockets? You can't do it well, because you have to prioritize one socket over another: in a loop, one recv() has to go first, and it will block. Even if the other socket gets a message in its queue, the loop is blocked and can't proceed to the next recv(). Hence the poller lets us tackle this and work well with multiple sockets.
Without a poller:
while (true) {
    socket1.recv() // this will block
    socket2.recv() // this has to wait for the first receive, even if messages arrive in its queue
}
With the poller:
while (true) {
    zmq_poll() // blocks until one of the socket events happens (POLLIN here),
               // i.e. until any socket gets a message in its queue
    // then we check which event occurred and on which socket
    if (condition socket 1) {
        // treat socket 1 request
    }
    if (condition socket 2) {
        // treat socket 2 request
    }
    // ...
}
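For a runnable version of the loop above that does not depend on any ZeroMQ binding, here is a sketch with Python's standard selectors module. It is an analogy rather than ZeroMQ code: two Unix datagram socket pairs play the role of the two sockets, and the function name serve_two_sockets and the labels "socket 1" and "socket 2" are invented for the example.

```python
import selectors
import socket


def serve_two_sockets():
    """Handle traffic on two sockets in whatever order it shows up."""
    tx1, rx1 = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    tx2, rx2 = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    sel = selectors.DefaultSelector()
    sel.register(rx1, selectors.EVENT_READ, data="socket 1")
    sel.register(rx2, selectors.EVENT_READ, data="socket 2")

    handled = []
    # Traffic starts on socket 2; a loop that called rx1.recv() first would be stuck here.
    for sender, payload in [(tx2, b"b1"), (tx1, b"a1"), (tx2, b"b2")]:
        sender.send(payload)
        # select() blocks until any registered socket is readable, then says which one.
        for key, _events in sel.select(timeout=1.0):
            handled.append((key.data, key.fileobj.recv(1024)))

    sel.close()
    for s in (tx1, rx1, tx2, rx2):
        s.close()
    return handled
```

The returned list follows the order in which the messages were sent, even though socket 1 was registered first.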
You can see real code in the corresponding section of the doc (scroll far enough to see the code blocks; they are available in many languages).
Know that the poller just notifies you that there are messages waiting, if the event is POLLIN.
Take the example of 10 messages received, 5 on each socket, before the loop gets to them. The sockets are then already readable, so each poll call resolves immediately until the queues are drained. The result tells you which socket is ready (through the poll items; some binding libraries make this simple and pleasant), and the block for that socket makes the receive call, which consumes the next message from that socket's queue.
So you keep looping, consuming the next message each time, in the order the messages arrived on each socket. Which events you are notified about depends on the events you registered for; for receiving, that should be POLLIN.
Each socket has its own message queue, and each receive call pulls from it. The poller only reports readiness, so when the poller returns with POLLIN for a socket, a receive call on that socket is assured to find a message.
Last example: the server-client pattern
Let's take the example of one server (a router) and many clients (dealers) that connect to it, as in the image below.
Question: with many connections to the same router, arriving asynchronously and all at once, do I need a poller on the server (router) side? Many newcomers may think so, or at least wonder whether it's needed.
BIG NO!
Why? Because in the server (router) code we deal with only one socket, the one we bind. The clients connect to it, but on the server end there is only one socket, all recv() calls are on that one socket, and that socket has its own message queue. recv() consumes the messages one after another, no matter how asynchronously they arrive. Again, the poller is only for when there are multiple sockets and messages from all of them have to be handled as they come. Without it, the recv() on one socket has to go first and blocks the other, which is bad.
Note
This answer aims to clear things up, references the doc with the key points highlighted, and shows code using the low-level library (C). The answer by #rafflan shows good code with a binding library (it seems to be C#) and a great explanation. If you haven't read it, you should.

Related

Parallel execution: blocking receive, deferred synchronous

I've asked a question about errors that occurred when mixing parallel sync and async calls, and the answer shed light on some even bigger questions:
Does a blocking receive construct replace .z.ps/.z.pg calls?
If deferred synchronous exists (it is used in mserve.q), does something like deferred asynchronous exist?
My observations are based on the previous question. Case 3 from that question is OK:
q)neg[h]({neg[.z.w] x};42); h[]
42
But what if we want to ensure that our message has been sent:
q)neg[h]({neg[.z.w] x};42); neg[h][]; h[]
42
Seems OK, right? If we go further in the documentation, we find that there is another kind of assurance available: h"" (message processed on the remote). In this case we get an error:
q)neg[h]({neg[.z.w] x};42); neg[h][]; h""; h[]
'type
<hangs>
So the proposition is the following: h[] (sent in the appropriate sequence) somehow changes the behaviour of the sender, and maybe of the receiver process, to prepare them for such communication.
To answer your first question, I don't think "replace" is the correct term. Rather, the incoming message is expected, as it was initiated by the local process, so it's not routed to the .z.ps handler, unlike messages which the process wasn't expecting, where .z.ps can be used to ensure the message isn't unfriendly, or whatever the case may be.
When you issue a blocking receive, the O_NONBLOCK flag is cleared, recvfrom() blocks until a message arrives, and the O_NONBLOCK flag is then restored:
read(0, "h[]\n", 4080) = 4
fcntl(4, F_GETFL) = 0x802 (flags O_RDWR|O_NONBLOCK)
fcntl(4, F_SETFL, O_RDONLY) = 0
recvfrom(4, "\1\0\0\0\25\0\0\0", 8, 0, NULL, NULL) = 8
recvfrom(4, "\n\0\7\0\0\0unblock", 13, 0, NULL, NULL) = 13
fcntl(4, F_GETFL) = 0x2 (flags O_RDWR)
fcntl(4, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
On your second question, I believe deferred synchronous was introduced in kdb+ v2.3 for the scenario where a client process shouldn't block the remote process while it waits for its response. Deferred synchronous allows the server to process other client requests, while your client process blocks until the requested info is received. This is fine when the client can't do anything else until it receives the response.
There are cases where neither process should wait for the other - is this what you're referring to? If so then a use case might be something like a tiered gateway system, where one or more gateways send/receive messages to/from each other, but none block or wait. This is done via async callbacks. In a complex system with multiple processes, each request needs to be tagged with an ID while they are inflight so as to track them. Likewise, you would need to track which request came from which connection so as to return results to the correct client.
Here is a simpler example
////////////// PROC A //////////////
q)\p
1234i
q)remoteFunc:{system"sleep 4";neg[.z.w](`clientCallback;x+y)}
////////////// PROC B //////////////
q)h:hopen 1234
q)clientCallback:{0N!x;}; .z.ts:{-1"Processing continues..";}
q)
q)neg[h](`remoteFunc;45;55);system"t 1000"
q)Processing continues..
Processing continues..
Processing continues..
Processing continues..
Processing continues..
100
// process A sent back its result when it was ready
On your last question
neg[h][] flushes async messages at least as far as tcp/ip. This does not mean the remote has received them.
The chaser h"" flushes any outgoing messages on h, sends its own request and processes all other messages on h until it receives its response.
Chasing async messages is a way to ensure they've been processed on the remote before moving on to the next async message. In your example, the chase followed by a hanging call isn't valid: for one, it will error, and secondly, it's not a task which requires a guarantee that the previous async message was fully processed before commencing.
Jason

How socketcan get send failure status?

As we all know, in the CAN bus communication protocol the sender knows whether the data was successfully sent. I send SocketCAN data as follows:
ret = write (socket, frame, sizeof (struct can_frame));
However, even if the CAN communication cable is disconnected, the return value ret is still 16 (= sizeof (struct can_frame)). I looked into it and found that this is due to the tx_queue of the network stack used by SocketCAN. Once write has been called enough times, the buffer fills up and ret becomes -1.
But this is not the behavior I expect. I want every frame I send to immediately report success or failure.
By
echo 0 > /sys/class/net/can0/tx_queue_len
I tried to disable the tx_queue, but it does not work.
What I want to ask is: is there a way to disable the tx_queue of SocketCAN, or to get the controller's status for each sent frame through an API (such as libsocketcan)?
Thanks.
You cannot use write() itself to discover whether a CAN frame was successfully put on the bus, because all it does is write the frame to the in-kernel socket buffer. The kernel then moves the frame to the transmit queue of the SocketCAN network interface, followed by the driver moving it to the transmit buffer of the CAN controller, which finally puts the frame on the bus. What you want is a direct write which bypasses all those buffers, but that's not possible with SocketCAN, even if you set the transmit queue length to 0.
However, there is another way to get confirmation. If you enable the CAN_RAW_RECV_OWN_MSGS socket option (see sections 4.1.4 and 4.1.7 in the SocketCAN documentation), you will receive frames that were successfully sent. You'll need to use recvmsg() so you get the message flags. msg_flags will have the MSG_CONFIRM bit set for a frame that was successfully sent by the same socket on which it is received. You won't be informed of failures, but you can detect them by using a timeout for the confirmation.
It's not an ideal solution because it mixes the read and write logic in your application. One way to avoid this would be to use two sockets. One for writing and reading MSG_CONFIRM frames, the other for reading all other frames. You could then create a (blocking) write function that does a write() followed by multiple calls to recvmsg() with an appropriate timeout.
Finally, it is useful to enable error frames (through the CAN_RAW_ERR_FILTER socket option). If you send a frame on a socket with a disconnected cable, this will typically result in a bus off state, which will be reported in an error frame.
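The "write, then wait for the MSG_CONFIRM echo with a timeout" idea could look roughly like the sketch below, written in Python, whose socket module exposes the Linux SocketCAN constants. Treat it as an untested outline under assumptions: the helper names build_frame, enable_own_messages and send_confirmed are made up here, the 0.1 s timeout is arbitrary, and matching the echo by comparing the whole frame is a simplification.

```python
import socket
import struct
import time

CAN_FRAME_FMT = "=IB3x8s"  # can_id, can_dlc, padding, data: 16 bytes like struct can_frame
MSG_CONFIRM = getattr(socket, "MSG_CONFIRM", 0x800)  # 0x800 is the Linux value (fallback)


def build_frame(can_id, data):
    """Pack a classic CAN frame; struct pads the data field to 8 bytes."""
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data)


def enable_own_messages(sock):
    """Ask the kernel to deliver this socket's own sent frames back to it (Linux only)."""
    sock.setsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_RECV_OWN_MSGS, 1)


def send_confirmed(sock, frame, timeout=0.1):
    """Send a frame, then wait for its MSG_CONFIRM echo; False means none arrived in time."""
    sock.send(frame)
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        sock.settimeout(remaining)
        try:
            data, _anc, flags, _addr = sock.recvmsg(struct.calcsize(CAN_FRAME_FMT))
        except socket.timeout:
            return False
        if flags & MSG_CONFIRM and data == frame:
            return True
        # Anything else is unrelated traffic; a real program would hand it to the reader.
```

On a real interface the socket would be created with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW), bound to ("can0",), and passed to enable_own_messages() before the first send_confirmed() call.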

Epoll events for connecting sockets

I create an epoll object and register some non-blocking sockets which try to connect to closed ports on localhost. Why does epoll tell me that I can write to such a socket (it returns an event for one of the created sockets with an event mask containing EPOLLOUT)? The socket is not open, and if I try to send something on it I get the error Connection refused.
Another question: what does EPOLLHUP even mean? I thought that this was the event for a refused connection. But how, in this case, can an event have both EPOLLHUP and EPOLLOUT set simultaneously?
Sample code on Python:
import socket
import select
poll = select.epoll()
fd_to_sock = {}
for i in range(1, 3):
    s = socket.socket()
    s.setblocking(0)
    s.connect_ex(('localhost', i))
    poll.register(s, select.EPOLLOUT)
    fd_to_sock[s.fileno()] = s
print(poll.poll(0.1))
# prints '[(4, 28), (5, 28)]'
All that poll guarantees is that your application won't block after calling corresponding function. So you are getting what you've paid for - you can now rest assured writing to this socket won't block - and it didn't block, did it?
Poll never guarantees that corresponding operation will succeed.
poll/select/epoll return when the file descriptor is "ready" but that just means that the operation will not block (not that you will necessarily be able to write to it successfully).
Likewise for EPOLLIN: for example, it will return ready when a socket is closed; in that case, you won't actually be able to read data from it.
EPOLLHUP means that there was a "hang up" on the connection. That would really only occur once you actually had a connection. Also, the documentation (http://linux.die.net/man/2/epoll_ctl) says that you don't need to include it anyway:
EPOLLHUP
Hang up happened on the associated file descriptor. epoll_wait(2) will always wait for this event; it is not necessary to set it in events.
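Since readiness only says the call will not block, the usual way to learn how a non-blocking connect actually ended is to read SO_ERROR once the socket is reported writable. Here is a small sketch of that check; it uses the standard selectors module (which picks epoll on Linux), and the function names closed_local_port and connect_result are made up for the example.

```python
import errno
import selectors
import socket


def closed_local_port():
    """Return a localhost TCP port that most likely has no listener."""
    probe = socket.socket()
    probe.bind(("127.0.0.1", 0))
    port = probe.getsockname()[1]
    probe.close()
    return port


def connect_result(host, port, timeout=2.0):
    """Start a non-blocking connect and return its final errno (0 on success)."""
    sock = socket.socket()
    sock.setblocking(False)
    err = sock.connect_ex((host, port))
    if err not in (0, errno.EINPROGRESS):
        sock.close()
        return err  # the connect failed right away
    sel = selectors.DefaultSelector()
    sel.register(sock, selectors.EVENT_WRITE)
    ready = sel.select(timeout)
    # "Writable" only means the connect attempt has finished; SO_ERROR says how it went.
    if ready:
        err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    else:
        err = errno.ETIMEDOUT
    sel.close()
    sock.close()
    return err
```

For a closed port on localhost the expected result is errno.ECONNREFUSED, even though the socket was reported as writable first.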

Does gen_tcp:recv/3 close the socket if the timeout is reached?

I currently have a server that handles multiple connections from clients, and a client that connects to the server using two connections. My client has two processes that handle sending to and receiving from the server respectively, but not both. The problem I currently have is that when I want to close the socket, my reading process is stuck on the blocking gen_tcp:recv/2 call. If I add a timeout, the socket is closed when the timeout is reached. My question is: is it possible to have a gen_tcp:recv/3 call that doesn't close the socket?
This is how my reading process looks like.
read(Socket, Control) ->
    Control ! ok,
    receive
        read ->
            case gen_tcp:recv(Socket, 0) of
                {ok, Data} ->
                    %% handling for messages;
                Other ->
                    io:format(Other)
            end,
            read(self()), %% this sends "read" to itself with "!"
            read(Socket, Control);
        {error, Reason} ->
            io:format(Reason)
        end;
        close ->
            io:format("Closing Reading Socket.~n"),
            gen_tcp:close(Socket)
    end.
As you can see here, the process will never be able to receive a close if recv/2 doesn't read anything.
Sure, gen_tcp:recv/3 with timeout set to infinity will not close the socket :) See the official documentation.
Edit:
From the documentation:
This function receives a packet from a socket in passive mode.
Check the documentation for setopts/2 to understand the difference between passive and active modes. In particular:
If the value is false (passive mode), the process must explicitly receive incoming data by calling gen_tcp:recv/2,3.
Your process can only do one thing at a time - either listen for the close message from another process or wait for the TCP packet. You could try to use the gen_tcp:controlling_process/2 but I don't know the details. Another solution would be to handle the recv/3 in a separate (third) linked process and kill the process when the close is received.
A better way would be to use an active socket instead; see the Examples section in the official documentation for some guidance on how to do that.
The best way in my opinion would be to use the OTP gen_server to handle both, the close message and incoming TCP packets in the same process. There is an excellent tutorial in the Erlang and OTP in Action book on how to implement that and here is the code example on Github.
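The "one process can only wait on one thing" problem is not specific to Erlang. As a language-neutral illustration (written in Python, not Erlang, and not a substitute for the gen_server approach), a reader loop can use a short receive timeout and check for a close request between attempts; the timeout itself leaves the socket open. The names read_until_closed, stop_event and on_data are invented for this sketch.

```python
import socket
import threading


def read_until_closed(sock, stop_event, on_data, timeout=0.05):
    """Receive with short timeouts, checking stop_event between attempts."""
    sock.settimeout(timeout)
    while not stop_event.is_set():
        try:
            data = sock.recv(4096)
        except socket.timeout:
            continue  # nothing arrived; the socket is still open and usable
        if not data:
            break  # the peer closed the connection
        on_data(data)
    sock.close()  # closed only here, after a stop request or a peer close
```

A caller would run this in its own thread and call stop_event.set() when the connection should be shut down; the loop notices within one timeout interval.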

How to implement Socket.PollAsync in C#

Is it possible to implement the equivalent of Socket.Poll in async/await paradigm (or BeginXXX/EndXXX async pattern)?
A method which would act like NetworkStream.ReadAsync or Socket.BeginReceive, but would:
- leave the data in the socket buffer
- complete after the specified interval of time if no data arrived (leaving the socket in the connected state so that the polling operation can be retried)
I need to implement IMAP IDLE so that the client connects to the mail server and then goes into a waiting state where it receives data from the server. If the server does not send anything within 10 minutes, the code sends a ping to the server (without reconnecting; the connection is never closed) and starts waiting for data again.
In my tests, leaving the data in the buffer seems to be possible if I tell Socket.BeginReceive method to read no more than 0 bytes, e.g.:
sock.BeginReceive(b, 0, 0, SocketFlags.None, null, null)
However, I'm not sure whether it will indeed work in all cases; maybe I'm missing something. For instance, if the remote server closes the connection, it may send a zero-byte packet, and I'm not sure whether Socket.BeginReceive will behave identically to Socket.Poll in that case.
And the main problem is how to stop Socket.BeginReceive without closing the socket.