Get subscriber filters from a ZMQ PUB socket

I noticed in the FAQ, in the Monitoring section, that it's not possible to get a list of connected peers or to be notified when peers connect/disconnect.
Does this imply that it's also not possible to know which topics a PUB/XPUB socket knows it should publish, from its upstream feedback? Or is there some way to access that data?
I know that ZMQ >= 3.0 "supports PUB/SUB filtering at the publisher", but what I really want is to filter at my application code, using the knowledge ZMQ has about which topics are subscribed to.
My use-case is that I want to publish info about the status of a robot. Some topics involve major hardware actions, like switching the select lines on an ADC to read IR values.
I have a publisher thread running on the bot that should only do that "read" to get IR data when there are actually subscribers. However, since I can only feed a string into my pub_sock.send, I always have to do the costly operation, even if ZMQ is about to drop that message when there are no subscribers.
I have an implementation that uses a backchannel REQ/REP socket to send topic information, which my app can check in its publish loop, thereby only collecting data that needs to be collected. This seems very inelegant though, since ZMQ must already have the data I need, as evidenced by its filtering at the publisher.
I noticed that in this mailing list message, the OP seems to be able to see subscribe messages being sent to an XPUB socket.
However, there's no mention of how they did that, and I'm not seeing any such ability in the docs (still looking). Maybe they were just using Wireshark (to see upstream subscribe messages to an XPUB socket).

Using the zmq.XPUB socket type, there is a way to detect new and departing subscribers. The following code sample shows how:
# Publisher side
import zmq

port_nr = 5556  # example port

ctx = zmq.Context.instance()
xpub_socket = ctx.socket(zmq.XPUB)
xpub_socket.bind("tcp://*:%d" % port_nr)

poller = zmq.Poller()
poller.register(xpub_socket, zmq.POLLIN)

events = dict(poller.poll(1000))
if xpub_socket in events:
    # Subscription messages are a single frame: one flag byte followed by the topic
    msg = xpub_socket.recv()
    if msg[0:1] == b'\x01':
        topic = msg[1:]
        print("Topic %r: new subscriber" % topic)
    elif msg[0:1] == b'\x00':
        topic = msg[1:]
        print("Topic %r: subscriber left" % topic)
Note that the zmq.XSUB socket type does not subscribe in the same manner as the "normal" zmq.SUB. Code sample:
# Subscriber side
import zmq

port_nr = 5556  # example port

ctx = zmq.Context.instance()

# Subscribing with a zmq.SUB socket
sub_socket = ctx.socket(zmq.SUB)
sub_socket.setsockopt(zmq.SUBSCRIBE, b"sometopic")  # OK
sub_socket.connect("tcp://localhost:%d" % port_nr)

# Subscribing with a zmq.XSUB socket
xsub_socket = ctx.socket(zmq.XSUB)
xsub_socket.connect("tcp://localhost:%d" % port_nr)
# xsub_socket.setsockopt(zmq.SUBSCRIBE, b"sometopic")  # NOK, raises zmq.error.ZMQError: Invalid argument
# The subscription message is a single frame: flag byte + topic
xsub_socket.send(b'\x01' + b'sometopic')  # OK, triggers the subscribe event on the publisher
I'd also like to point out the zmq.XPUB_VERBOSE socket option. If it is set, all subscription messages are received on the socket; if not, duplicate subscriptions are filtered out. See also the following post: ZMQ: No subscription message on XPUB socket for multiple subscribers (Last Value Caching pattern)
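A minimal sketch of enabling it, reusing the xpub_socket from the sample above:
# Without XPUB_VERBOSE, a second subscriber to the same topic produces no
# subscription message on the XPUB socket, because the duplicate
# subscription is filtered out.
xpub_socket.setsockopt(zmq.XPUB_VERBOSE, 1)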

At least in the XPUB/XSUB socket case you can maintain the subscription state yourself by forwarding and handling the packets manually:
import zmq

context = zmq.Context()

xsub_socket = context.socket(zmq.XSUB)
xsub_socket.bind('tcp://*:10000')

xpub_socket = context.socket(zmq.XPUB)
xpub_socket.bind('tcp://*:10001')

poller = zmq.Poller()
poller.register(xpub_socket, zmq.POLLIN)
poller.register(xsub_socket, zmq.POLLIN)

while True:
    try:
        events = dict(poller.poll(1000))
    except KeyboardInterrupt:
        break

    if xpub_socket in events:
        message = xpub_socket.recv_multipart()
        # HERE goes some subscription handling code which inspects
        # message
        xsub_socket.send_multipart(message)

    if xsub_socket in events:
        message = xsub_socket.recv_multipart()
        xpub_socket.send_multipart(message)
(this is Python code but I guess C/C++ looks quite similar)
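For the subscription-handling part marked HERE, a minimal sketch (my own illustration, not from the original post) that keeps a set of currently subscribed topics could look like this:
subscriptions = set()  # topics with at least one active subscriber

def handle_subscription(message):
    # Subscription messages are a single frame: a flag byte
    # (\x01 subscribe, \x00 unsubscribe) followed by the topic.
    frame = message[0]
    flag, topic = frame[0:1], frame[1:]
    if flag == b'\x01':
        subscriptions.add(topic)
    elif flag == b'\x00':
        subscriptions.discard(topic)
The publish loop can then consult subscriptions and skip expensive data collection for topics nobody listens to, which is exactly the use-case from the question.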
I'm currently working on this topic and I will add more information as soon as possible.

Related

How does the Camel Netty TCP socket consumer decide how to split incoming data into messages (and is it configurable)?

I'm working with a Camel flow that uses a Netty TCP socket consumer to receive messages from a client program (which is outside of my control). The client should be opening a socket, sending us one message, then closing the socket, but we've been seeing cases where instead of one message Camel is "splitting" the text stream into two parts and trying to process them separately.
So I'm trying to figure out, since you can re-use the same socket for multiple Camel messages, but TCP sockets don't have a built-in concept of "frames" or a standard for message delimiters, how does Camel decide that a complete message has been received and is ready to process? I haven't been able to find a documented answer to this in the Netty component docs (https://camel.apache.org/components/3.15.x/netty-component.html), although maybe I'm missing something.
From playing around with a test script, it seems like one answer is "Camel assumes a message is complete and should be processed if it goes more than 1ms without receiving any input on the socket". Is this a correct statement, and if so is this behavior documented anywhere? Is there any way to change or configure this behavior? Really what I would prefer is for Camel to wait for an ETX character (or a much longer timeout) before processing a message, is that possible to set up?
Here's my test setup:
Camel flow:
from("netty:tcp://localhost:3003")
.log("Received: ${body}");
Python snippet:
import math
import socket
import time

# `args` (hostname, port, msg) comes from the script's argument parser (not shown)

DELAY_MS = 3

def send_msg(sock, msg):
    print("Sending message: <{}>".format(msg))
    if sock.sendall(msg.encode()) is not None:
        print("Message failed to send")
    time.sleep(DELAY_MS / 1000.0)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    print("Using DELAY_MS: {}".format(str(DELAY_MS)))
    s.connect((args.hostname, args.port))

    cutoff = int(math.floor(len(args.msg) / 2))
    msg1 = args.msg[:cutoff]
    send_msg(s, msg1)

    msg2 = args.msg[cutoff:]
    send_msg(s, msg2)

    response = s.recv(1024)
except Exception as e:
    print(e)
finally:
    s.close()
I can see that with DELAY_MS=1 Camel logs one single message:
2022-02-21 16:54:40.689 INFO 19429 --- [erExecutorGroup] route1 : Received: a long string sent over the socket
But with DELAY_MS=2 it logs two separate messages:
2022-02-21 16:56:12.899 INFO 19429 --- [erExecutorGroup] route1 : Received: a long string sen
2022-02-21 16:56:12.899 INFO 19429 --- [erExecutorGroup] route1 : Received: t over the socket
After doing some more research, it seems like what I need to do is add a delimiter-based FrameDecoder to the decoders list.
Setting it up like this:
from("netty:tcp://localhost:3003?sync=true"
+ "&decoders=#frameDecoder,#stringDecoder"
+ "&encoders=#stringEncoder")
where frameDecoder is provided by
@Bean
ChannelHandlerFactory frameDecoder() {
    ByteBuf[] ETX_DELIM = new ByteBuf[] { Unpooled.wrappedBuffer(new byte[] { (byte) 3 }) };
    return ChannelHandlerFactories.newDelimiterBasedFrameDecoder(1024, ETX_DELIM,
            false, "tcp");
}
seems to do the trick.
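To exercise the decoder from the client side, a quick sketch (my own, adapting the Python test snippet above; port 3003 matches the route) appends the ETX byte so the frame decoder sees one complete frame:
import socket

ETX = b"\x03"

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(("localhost", 3003))
# The decoder only emits a message once the ETX delimiter arrives,
# no matter how the bytes are split across TCP segments.
s.sendall(b"a long string sent" + b" over the socket" + ETX)
print(s.recv(1024))  # route has sync=true, so a reply comes back
s.close()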
On the flip side though, it seems like this will hang indefinitely (or until lower-level TCP timeouts kick in?) if an ETX frame is not received, and I can't figure out any way to set a timeout on the decoder, so would still be eager for input if anyone knows how to do that.
I think the default "timeout" behavior I was seeing might've just been an artifact of Netty's read loop speed -- How does netty determine when a read is complete?

What is meant by record or data boundaries in the sense of TCP & UDP protocol?

I am learning about sockets and came across the term data (or record) boundaries in connection with the SOCK_SEQPACKET communication protocol. Can anyone explain in simple words what a data boundary is, and how SOCK_SEQPACKET differs from SOCK_STREAM and SOCK_DGRAM?
This answer https://stackoverflow.com/a/9563694/1076479 has a good succinct explanation of message boundaries (a different name for "record boundaries").
Extending that answer to SOCK_SEQPACKET:
SOCK_STREAM provides reliable, sequenced communication of streams of data between two peers. It does not maintain message (record) boundaries, which means the application must manage its own boundaries on top of the stream provided.
SOCK_DGRAM provides unreliable transmission of datagrams. Datagrams are self-contained capsules and their boundaries are maintained. That means if you send a 20 byte buffer on peer A, peer B will receive a 20 byte message. However, they can be dropped, or received out of order, and it's up to the application to figure that out and handle it.
SOCK_SEQPACKET is a newer technology that is not yet widely used, but tries to marry the benefits of both of the above. That is, it provides reliable, sequenced communication that also transmits entire "datagrams" as a unit (and hence maintains message boundaries).
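As a quick illustration (my own sketch; Linux-only, since it uses SOCK_SEQPACKET over an AF_UNIX socket pair), each send() corresponds to exactly one recv() on the peer:
import socket

# Boundaries are preserved: two sends arrive as two distinct messages.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET)
a.send(b'FOO')
a.send(b'BAR')
print(b.recv(128))  # b'FOO'
print(b.recv(128))  # b'BAR'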
It's easiest to demonstrate the concept of message boundaries by showing what happens when they're neglected. Beginners often post client code like this here on SO (using python for convenience):
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('192.168.4.122', 9000))
s.send(b'FOO')       # Send string 1
s.send(b'BAR')       # Send string 2
reply = s.recv(128)  # Receive reply
And server code similar to this:
import socket

lsock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
lsock.bind(('', 9000))
lsock.listen(5)
csock, caddr = lsock.accept()
string1 = csock.recv(128)         # Receive first string
string2 = csock.recv(128)         # Receive second string <== XXXXXXX
csock.send(b'Got your messages')  # Send reply
They don't understand then why the server hangs on the second recv call, while the client is hung on its own recv call. That happens because both strings the client sent (may) get bundled together and received as a single unit in the first recv on the server side. That is, the message boundary between the two logical messages was not preserved, and so string1 will often contain both chunks run together: 'FOOBAR'
(Often there are other timing-related aspects to the code that influence when/whether that actually happens or not.)
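One common fix is to add framing on top of the stream. Here is a minimal sketch (my own, not from the original answer) of length-prefix framing, so the receiver can rebuild the boundaries itself:
import struct

def send_msg(sock, payload):
    # Prefix each logical message with a 4-byte big-endian length.
    sock.sendall(struct.pack('>I', len(payload)) + payload)

def recv_exact(sock, n):
    # Keep reading until exactly n bytes have arrived.
    buf = b''
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError('peer closed the connection')
        buf += chunk
    return buf

def recv_msg(sock):
    (length,) = struct.unpack('>I', recv_exact(sock, 4))
    return recv_exact(sock, length)
With this in place, the server above would call recv_msg twice and reliably get b'FOO' and then b'BAR', regardless of how TCP segments the bytes.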

OPC UA Subscriptions and Notifications

I'm having trouble with OPC UA Subscriptions and Notifications in the ANSI C stack. OPC UA Part 4 (Services) says:
5.13.1 Subscription model
5.13.1.1 Description c) NotificationMessages are sent to the Client in response to Publish requests.
Sent how? I'm really expecting a callback of some sort, but there doesn't seem to be one. It does say these are in response to a 'Publish' request, but a Publish service call acknowledges receipt of a notification, it doesn't seem to request one. Besides, that would be polling and the whole point of Subscriptions and Monitoring is to not poll.
Can anyone supply an example showing monitoring of a data value in ANSI C?
PublishRequests are queued on the server and responses are only returned when notifications are ready or a keep-alive needs to be sent (or a bunch of other stuff, check the state machine description in part 4).
They do include acknowledgements of previously received notifications as well, but the idea is that the response isn't expected immediately and that the client will generally keep pumping PublishRequests out so that the server has a queue of them ready to return notifications whenever a subscription needs to.
Yes, it's polling. It's a bit of a bummer that it's not strictly unsolicited, but that's how it works.
Edit:
It's not really polling. It's batched report-by-exception with a QoS guarantee and a back-pressure mechanism provided by the subsequent PublishRequests.
This is C# code. I hope that it will help you.
private NotificationMessageReceivedEventHandler m_NotificationMessageReceived;

// ...

m_NotificationMessageReceived =
    new NotificationMessageReceivedEventHandler(Subscription_NotificationMessageReceived);
m_subscription.NotificationMessageReceived += Subscription_NotificationMessageReceived;

// ...

private void Subscription_NotificationMessageReceived(
    Subscription subscription, NotificationMessageReceivedEventArgs e)
{
    if (e.NotificationMessage.NotificationData == null ||
        e.NotificationMessage.NotificationData.Count == 0)
    {
        LogMessage("{0:HH:mm:ss.fff}: KeepAlive",
            e.NotificationMessage.PublishTime.ToLocalTime());
    }
}

hornetq message remain in queue after ack

We are using hornetq-core 2.2.21.Final (stand-alone). After reading a non-transactional message, the message still remains in the queue although it was acknowledged.
The session is created using:
sessionFactory.createSession(true, true, 0)
locator setting:
val transConf = new TransportConfiguration(classOf[NettyConnectorFactory].getName, map)
val locator = HornetQClient.createServerLocatorWithoutHA(transConf)
locator.setBlockOnDurableSend(false)
locator.setBlockOnNonDurableSend(false)
locator.setAckBatchSize(0)        // also tried without this setting
locator.setConsumerWindowSize(0)  // also tried without this setting
The message is acknowledged using message.acknowledge().
I think that the problem might be two queues on the same address.
I also tried setting the message expiration, but it didn't help; messages are still piling up in the queue.
Please advise.
It seems you are using the core API. Are you explicitly calling acknowledge on the messages?
If you have two queues on the same address, ack will only acknowledge the messages on the queue you are consuming from. In that case the system is acting normally.

How does zmq poller work?

I am confused as to what the poller actually does in zmq. The zguide covers it only minimally, describing it as a way to read from multiple sockets. This is not a satisfying answer for me because it does not explain how to implement timeouts on sockets. I know zeromq: how to prevent infinite wait? explains this for PUSH/PULL, but not for REQ/REP patterns, which is what I want to know how to use.
What I am attempting to ask is: How does poller work, and how does its function apply to keeping track of sockets and their requests?
When you need to listen on different sockets in the same thread, use a poller:
ZMQ.Socket subscriber = ctx.socket(ZMQ.SUB);
ZMQ.Socket puller = ctx.socket(ZMQ.PULL);
Register sockets with poller (POLLIN listens for incoming messages)
ZMQ.Poller poller = new ZMQ.Poller(2);  // or ctx.poller(2), depending on the binding version
poller.register(subscriber, ZMQ.Poller.POLLIN);
poller.register(puller, ZMQ.Poller.POLLIN);
When polling, use a loop:
while (notInterrupted()) {
    poller.poll();

    // subscriber registered at index '0'
    if (poller.pollin(0))
        subscriber.recv(ZMQ.DONTWAIT);

    // puller registered at index '1'
    if (poller.pollin(1))
        puller.recv(ZMQ.DONTWAIT);
}
Choose how you want to poll...
poller.poll() blocks until there's data on either socket.
poller.poll(1000) blocks for 1s, then times out.
The poller notifies when there's data (messages) available on the sockets; it's your job to read it.
When reading, do it without blocking: socket.recv(ZMQ.DONTWAIT). Even though poller.pollin(0) checks whether there's data to be read, you want to avoid any blocking calls inside the polling loop; otherwise, you could end up blocking the poller due to a 'stuck' socket.
So, if two separate messages are sent to subscriber, you have to invoke subscriber.recv() twice in order to clear the poller, otherwise, if you call subscriber.recv() once, the poller will keep telling you there's another message to be read. So, in essence, the poller tracks the availability and number of messages, not the actual messages.
You should run through the polling examples and play with the code, it's the best way to learn.
Does that answer your question?
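Since the question asks specifically about REQ/REP, here is a minimal sketch of the idea in Python with pyzmq (my own illustration; the endpoint and the 2-second timeout are made up):
import zmq

ctx = zmq.Context.instance()
req = ctx.socket(zmq.REQ)
req.connect("tcp://localhost:5555")

req.send(b"ping")

poller = zmq.Poller()
poller.register(req, zmq.POLLIN)

# Wait up to 2 seconds for the reply instead of blocking forever.
events = dict(poller.poll(2000))
if req in events:
    reply = req.recv()
else:
    # Timed out: the REQ socket is now stuck in its "expect reply" state,
    # so close it and recreate it before retrying (the zguide's Lazy Pirate pattern).
    req.setsockopt(zmq.LINGER, 0)
    req.close()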
In this answer I list details from the documentation (http://api.zeromq.org/4-1:zmq-poll), plus some important explanations that clear up common confusion for newcomers. If you are in a hurry, you may like to start with the "What the poller does, and what about receiving" and "A note about receiving, and what about one socket only" sections near the end, starting from the "Important notes" section, where I clear things up in depth. I still suggest reading the details in the doc reference, and the first section, carefully.
Doc ref and notes
Listen to multiple sockets and events
The zmq_poll() function provides a mechanism for applications to multiplex input/output events in a level-triggered fashion over a set of sockets. Each member of the array pointed to by the items argument is a zmq_pollitem_t structure. The nitems argument specifies the number of items in the items array. The zmq_pollitem_t structure is defined as follows:
typedef struct
{
    void *socket;
    int fd;
    short events;
    short revents;
} zmq_pollitem_t;
zmq Socket or standard socket through fd
For each zmq_pollitem_t item, zmq_poll() shall examine either the ØMQ socket referenced by socket or the standard socket specified by the file descriptor fd, for the event(s) specified in events. If both socket and fd are set in a single zmq_pollitem_t, the ØMQ socket referenced by socket shall take precedence and the value of fd shall be ignored.
Big note (same context):
All ØMQ sockets passed to the zmq_poll() function must share the same ØMQ context and must belong to the thread calling zmq_poll().
Revents member
For each zmq_pollitem_t item, zmq_poll() shall first clear the revents member, and then indicate any requested events that have occurred by setting the bit corresponding to the event condition in the revents member.
Upon successful completion, the zmq_poll() function shall return the number of zmq_pollitem_t structures with events signaled in revents or 0 if no events have been signaled.
Awaiting for events and blocking
If none of the requested events have occurred on any zmq_pollitem_t item, zmq_poll() shall wait timeout milliseconds for an event to occur on any of the requested items. If the value of timeout is 0, zmq_poll() shall return immediately. If the value of timeout is -1, zmq_poll() shall block indefinitely until a requested event has occurred on at least one zmq_pollitem_t. The resolution of timeout is 1 millisecond.
0 => returns immediately
-1 => blocks indefinitely
+val => blocks for at most that many milliseconds
Events
The events and revents members of zmq_pollitem_t are bit masks constructed by OR'ing a combination of the following event flags:
ZMQ_POLLIN
For ØMQ sockets, at least one message may be received from the socket without blocking. For standard sockets this is equivalent to the POLLIN flag of the poll() system call and generally means that at least one byte of data may be read from fd without blocking.
ZMQ_POLLOUT
For ØMQ sockets, at least one message may be sent to the socket without blocking. For standard sockets this is equivalent to the POLLOUT flag of the poll() system call and generally means that at least one byte of data may be written to fd without blocking.
ZMQ_POLLERR
For standard sockets, this flag is passed through zmq_poll() to the underlying poll() system call and generally means that some sort of error condition is present on the socket specified by fd. For ØMQ sockets this flag has no effect if set in events, and shall never be returned in revents by zmq_poll().
Note:
The zmq_poll() function may be implemented or emulated using operating system interfaces other than poll(), and as such may be subject to the limits of those interfaces in ways not defined in this documentation.
Return value
Upon successful completion, the zmq_poll() function shall return the number of zmq_pollitem_t structures with events signaled in revents or 0 if no events have been signaled. Upon failure, zmq_poll() shall return -1 and set errno to one of the values defined below.
Example
Polling indefinitely for input events on both a 0mq socket and a standard socket.
zmq_pollitem_t items [2];
/* First item refers to ØMQ socket 'socket' */
items[0].socket = socket;
items[0].events = ZMQ_POLLIN;
/* Second item refers to standard socket 'fd' */
items[1].socket = NULL;
items[1].fd = fd;
items[1].events = ZMQ_POLLIN;
/* Poll for events indefinitely */
int rc = zmq_poll (items, 2, -1);
assert (rc >= 0); /* Returned events will be stored in items[].revents */
Important notes
What the poller does, and what about receiving
The poller only checks for events and waits until one occurs. POLLIN is the event for receiving: data is there to be received.
We should then read it through recv(); we are responsible for the reading, or for whatever else needs doing. The poller is just there to listen for the events and wait for them, and through zmq_pollitem_t we can listen on multiple sockets. If any event happens, the poller unblocks, and we can then check which socket and event it was (via revents, or the binding's equivalent) and call recv accordingly. Note that zmq_poll() is level-triggered: it does not queue events itself. The messages queue up on each socket in arrival order, and as long as messages remain, each subsequent poll call returns immediately and reports those sockets as readable, so successive recv calls get the messages one after another, in the order they came in.
A note about receiving, and what about one socket only
Take a ROUTER: one router can receive multiple requests, even from a single client, and also from multiple clients at once, in a setup where multiple clients of the same nature are the ones connecting to the router. A question that can cross the mind of a newcomer is: do I need a poller for this async nature? The answer is no. There is no need for a poller, or for listening on different sockets.
The big note is: receiving calls (zmq_recv(), or socket.recv() in some language bindings) block, and they are the way to read. When messages come in, they are queued; the poller has nothing to do with this. The poller only listens for events from different sockets, unblocks if any of them happens, and returns no events if the timeout is reached. It does no more than that.
The nature of receiving is straightforward: the receive call blocks until a message enters the socket's message queue. When multiple messages come in, they get queued, and each next call to recv() pulls the next message, or frame, depending on which receiving method we are using, the API level, and the abstraction the binding library puts over the low-level one.
(We can also access the messages frame by frame, one frame per call.)
But then it becomes clear: receive calls are the thing that receives. They block until a message enters the queue. Multiple parallel messages enter the queue as they come; then, on each call, either the queue has something to consume, or we wait.
That's a very important thing to know, and one that can confuse newcomers.
The poller is only needed when there are multiple sockets, and those are always sockets that we declare in the process code in question (bound, or connected to something). Because otherwise, how would you receive the messages? You can't do it well: you would have to prioritize one socket over another, with one recv() going first in a loop and blocking, so that even if the other socket gets a message in its queue, the loop is stuck and can't proceed to the next recv(). Hence the poller gives us the means to handle multiple sockets cleanly:
while (true) {
    socket1.recv()  // this will block
    socket2.recv()  // this has to wait until the first recv() returns,
                    // even if messages arrive in its own queue
}
With the poller:
while (true) {
    zmq_poll()  // blocks until one of the registered socket events happens;
                // if the event is POLLIN, some socket got a message in its
                // queue, and this unblocks

    // then we check which event type it was, and on which socket
    if (condition socket 1) {
        // handle the socket 1 request
    }
    if (condition socket 2) {
        // handle the socket 2 request
    }
    // ...
}
You can see real code from the doc at this section (scroll enough to see the code blocks; examples are available in all the different languages too).
Know that the poller just notifies you that there are messages in, if the event is POLLIN.
Say several events have already triggered, for example ten messages received, five on each socket. Because polling is level-triggered, each of the following poll calls resolves immediately for as long as any socket still has messages queued, and reports which sockets are readable. The socket in question can be identified through the poller object (binding libraries make this simple and agreeable), and then the right socket block makes its receive call, consuming the next message from that socket's queue.
And so you keep looping, each time consuming the next message, in the order the messages came in on each socket. Which notifications you get is determined by the events you chose to listen for when registering; in the case of receiving, it should be POLLIN.
Each socket has its own message queue, and each receive call pulls from it. When the poller resolves, it is assured that there is a message waiting for the corresponding socket's receive call.
Last example: the server-client pattern
Let's take the example of one server (ROUTER) and many clients (DEALERs) connecting to it (the original post referred to a diagram, not included here).
Question: many connections to the same router, coming in asynchronously, all at once, and so on. On the server side (the ROUTER), do I need a poller? A lot of newcomers may think yes, or at least wonder whether it's needed. Yup, you guessed right:
BIG NO!
Why? Because in the server (ROUTER) code there is only one socket we are dealing with: the one we bind, which the clients then connect to. On that end there is only one socket, all recv() calls are on that one socket, and that socket has its own message queue. The recv() calls consume the messages one after another, no matter how asynchronously they come in. Again, the poller is only for when there are multiple sockets, so that messages coming from multiple sockets can be handled together; without it, one socket's recv() would have to go first and would block the other, which is a bad thing.
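A minimal sketch of that single-socket server (my own illustration in Python with pyzmq; the endpoint is made up):
import zmq

ctx = zmq.Context.instance()
router = ctx.socket(zmq.ROUTER)
router.bind("tcp://*:5555")

while True:
    # One blocking recv on the single ROUTER socket serves every client:
    # messages from all connected DEALERs land in this one socket's queue,
    # prefixed with the sender's identity frame.
    identity, payload = router.recv_multipart()
    router.send_multipart([identity, b"reply to " + payload])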
Note
This answer bring a nice clearing! Plus it mades a reference to the doc with good highlighting! Also show code by the low level lib (c lang)! The answer of #rafflan show a great code with a binding lib (seems c#)! And a great explanation! If you didn't check it you must!