How does the heartbeat feature of AMPS work? - publish-subscribe

I was reading about AMPS. It says you can set a heartbeat interval so that the client keeps checking whether the server is still up or has disconnected. But how does it work? Is the heartbeat checking asynchronous?
What if my current thread is blocked? Will it still receive the heartbeat?
What do the lines below from the AMPS documentation mean? Can someone please explain them with an example?
The AMPS client processes heartbeat messages on the client receive
thread, which is the thread used for asynchronous message processing.
If your application uses asynchronous message processing and occupies
the thread for longer than the heartbeat interval, the client may fail
to respond to heartbeat messages in a timely manner and may be
disconnected by the server.
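To make the quoted paragraph concrete, here is a minimal sketch in plain Python. It does not use the AMPS client at all: `receive_loop`, `slow_handler`, `fast_handler` and the interval value are hypothetical stand-ins. One thread plays the role of the client receive thread, serving heartbeats and running the application's message handler, so a handler that blocks longer than the heartbeat interval delays the heartbeat reply.

```python
import queue
import threading
import time

HEARTBEAT_INTERVAL = 0.2  # seconds; hypothetical value chosen for the demo


def receive_loop(inbox, handler, late_heartbeats):
    """Stand-in for the receive thread: serves heartbeats AND runs the handler."""
    while True:
        kind, sent_at = inbox.get()
        if kind == "stop":
            return
        if kind == "heartbeat":
            # The reply can only go out once this thread is free.
            delay = time.monotonic() - sent_at
            if delay > HEARTBEAT_INTERVAL:
                late_heartbeats.append(delay)
        else:
            handler()  # the application callback runs on this same thread


def run(handler):
    inbox, late = queue.Queue(), []
    t = threading.Thread(target=receive_loop, args=(inbox, handler, late))
    t.start()
    inbox.put(("message", time.monotonic()))
    inbox.put(("heartbeat", time.monotonic()))
    inbox.put(("stop", 0.0))
    t.join()
    return late


def slow_handler():
    time.sleep(HEARTBEAT_INTERVAL * 2)  # occupies the receive thread


work = queue.Queue()


def fast_handler():
    work.put("job")  # hand the work to another thread's queue and return at once


late_when_blocking = run(slow_handler)     # the heartbeat is answered too late
late_when_handing_off = run(fast_handler)  # the heartbeat is answered in time
```

In this model the fix is the one the documentation implies: keep the handler short and move long-running work to a separate thread, so the receive thread stays free to answer heartbeats.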

Related

How would you grab the latest message from multiple connections to a single ZMQ socket?

I am new to ZMQ and am not sure if what I want is even possible or if I should use another technology.
I would like to have a socket that multiple servers can stream to.
It appears that a ZMQ socket can do this based on this documentation: http://api.zeromq.org/4-0:zmq-setsockopt
How would I implement a ZMQ socket on the receiving end that only grabs the latest message sent from each server?
You can do this with ZMQ's PUB / SUB.
The first key thing is that a SUB socket can be connected to multiple PUBlishers. This is covered in Chapter 1 of the guide:
Some points about the publish-subscribe (pub-sub) pattern:
A subscriber can connect to more than one publisher, using one connect call each time. Data will then arrive and be interleaved “fair-queued” so that no single publisher drowns out the others.
If a publisher has no connected subscribers, then it will simply drop all messages.
If you’re using TCP and a subscriber is slow, messages will queue up on the publisher. We’ll look at how to protect publishers against this using the “high-water mark” later.
So, that means that you can have a single SUB socket on your client. This can be connected to several PUB sockets, one for each server from which the client needs to stream messages.
Latest Message
The "latest message" can be partially dealt with (as I suspect you'd started to find) using high water marks. The ZMQ_RCVHWM option allows the number of messages queued for receipt to be limited to 1, though this is an imprecise control.
You also have to consider what it is that is meant by "latest" message; the PUB servers and SUB client will have different views of what this is. For example, when the zmq_send() function on a PUB server returns, the sent message is the one that the PUBlisher would regard as the "latest".
However, over in the client there is no knowledge of this as nothing has yet got down through the PUBlishing server's operating system network stack, nothing has yet touched the Ethernet, etc. So the SUBscribing client's view of the "latest" message at that point in time is whichever message is in ZMQ's internal buffers / queues waiting for the application to read it. This message could be quite old in comparison to the one the PUBlisher has just started sending.
In reality, the "latest" message seen by the client SUBscriber will be dependent on how fast the SUBscriber application runs.
Provided it's fast enough to keep up with all the PUBlishers, then every single message the SUBscriber gets will be as close to the "latest" message as it can get (the message will be only as old as the network propagation delays and the time taken to transit through ZMQ's internal protocols, buffers and queues).
If the SUBscriber isn't fast enough to keep up, then the "latest" messages it sees will be at least as old as the processing time per message multiplied by the number of PUBlishers. If you've set the receive HWM to 1 and the subscriber is not keeping up, the publishers will keep publishing messages, but the subscriber socket will keep rejecting them until the subscriber application has cleared out the old message that is causing the queue congestion, which sits there waiting for zmq_recv() to be called.
If the subscriber can't keep up, the best thing to do in the subscriber is:
Have a receiving thread dedicated to receiving messages, disposing of them until processing becomes available.
Have a separate processing thread that does the processing.
Have the two threads communicate via ZMQ, using a REQ/REP pattern via an inproc connection.
The receiving thread can zmq_poll both the SUB socket connection to the PUBlishing servers and the REP socket connection to the processing thread.
If the receiving thread receives a message on the REP socket, it can reply with the next message read from the SUB socket.
If it receives a message from the SUB socket with no REPly due, it disposes of the message.
The processing thread sends 1-byte messages (the content doesn't matter) to its REQ socket to request the latest message, and receives the latest message from the PUBlishers in reply.
Or something like that. That'll keep the messages flowing from the PUBlishers to the SUBscriber, so the SUBscriber always has a message that is as close as possible to being "the latest" and processes it as and when it can, disposing of the messages it can't deal with.
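The receive-and-discard idea above can be sketched without ZMQ. This is a simplified stand-in, not the REQ/REP-over-inproc design itself: a hypothetical `LatestSlot` class replaces the sockets, the receiving thread calls `offer()` for every incoming message, and the processing thread calls `take()` when it is ready. It assumes a single shared "latest" value rather than one per publisher.

```python
import threading


class LatestSlot:
    """Holds only the newest message; older unread ones are overwritten."""

    def __init__(self):
        self._cond = threading.Condition()
        self._msg = None
        self._has_msg = False
        self.dropped = 0

    def offer(self, msg):
        # Called by the receiving thread for every incoming message.
        with self._cond:
            if self._has_msg:
                self.dropped += 1  # the previous message was never processed
            self._msg, self._has_msg = msg, True
            self._cond.notify()

    def take(self, timeout=None):
        # Called by the processing thread when it is ready for more work.
        with self._cond:
            if not self._cond.wait_for(lambda: self._has_msg, timeout):
                return None  # nothing arrived within the timeout
            self._has_msg = False
            return self._msg


slot = LatestSlot()
for i in range(5):  # five messages arrive while the processor is busy
    slot.offer(f"server-A update {i}")
latest = slot.take()
```

With real sockets, the receiving thread would call `offer()` with whatever it reads from the SUB socket. Keeping one slot per publisher would give the "latest from each server" the question asks for.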

Winsock2 synchronous IO...waiting for WSASend to "complete" using fWait==TRUE in WSAGetOverlappedResult

A coworker and I are having a disagreement on what constitutes "completion" of a WSASend overlapped IO request. He asserts that using fWait as TRUE in the WSAGetOverlappedResult call only waits until the message is queued for sending. He believes waiting for the write/send operation to "complete" only means the message was successfully initiated. In my opinion that is far from a "completed" message to the other side of the socket...that would simply be the beginning of a send and not a completion. If the fWait of TRUE does not block until the bytes have been sent and ACKed (or an error returned), then this is far from synchronous...it would in fact be acting the same as asynchronous IO because it's just fire and forget.
I have been maintaining our company's communication library with my understanding of how to do, and what is, "synchronous" IO for decades so I'll be shocked if I'm indeed wrong in my understanding. But my coworker is a brilliant developer with TONS of TCP/IP experience and is adamant he's right. Says he even posed this question here on stackoverflow and was told he was right. I can't imagine how I could be misunderstanding "completion" of a send to mean anything other than the sending of the bytes requested were indeed sent and ACKed. But I've been wrong before LOL
So...who is right? What EXACTLY does it mean to wait for a WSASend request to be "complete"? Simply waiting until the message is queued for sending in the TCP/IP stack...or waiting for all the packets that constitute the message to be both sent and ACKed??? Or is the truth somewhere in-between?
You are wrong.
A send request is completed when the request is no longer being processed. Once the stack has taken over the operation, the request is no longer being processed and its resources can be used for some other purpose.
If you were right, overlapped I/O would be nearly useless. Who would want to have to wait until the other side ACKed a previous send in its entirety before they could queue any more data to the TCP connection?
How would overlapped I/O be useful if there was no way to determine when the queuing process was complete? That's what the application needs to know so it can send another chunk of data.
You would always have dead time as the send queue would always have to get completely empty before more data could be queued on the sending side. Yuck!
And knowing when the ACK is received is useless anyway. It doesn't mean the application on the other end got the data. So it has no meaning at application layer. Knowing that the send has been queued does have meaning at application layer -- it means you can now queue more data to be sent and be guaranteed that all of the data from the previous send will be passed to the receiver's application layer before any data from the next send. That's what the application needs to know.
A synchronous call to WSASend also completes when the send is queued. An asynchronous operation just means you don't have to wait for whatever you'd wait for in a synchronous operation. So that's another reason your understanding would be quite strange.
Lastly, there is no standard or common way to wait for a remote ACK with synchronous operations. So you would have to think that asynchronous operations default to providing a semantic that is not even really available with synchronous ones. That's also pretty strange.
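The same "complete means handed to the stack" behaviour can be observed with ordinary blocking sockets. The sketch below is an analogy in Python rather than Winsock overlapped I/O, and it uses `socket.socketpair()` as a local stand-in for a TCP connection. The send call returns while the receiving side has not read a single byte yet.

```python
import socket

# A connected pair of stream sockets on the local machine.
sender, receiver = socket.socketpair()

payload = b"x" * 1024
sender.sendall(payload)  # returns once the kernel has buffered the bytes

# At this point nothing has called recv(): the send is "complete" from the
# sender's point of view, yet the peer application has seen no data.

received = b""
while len(received) < len(payload):
    received += receiver.recv(4096)  # only now does the peer read the data

sender.close()
receiver.close()
```

If the application needs to know that the peer actually processed the data, it has to get that from an application-level reply, which is the point made above about ACKs having no meaning at the application layer.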
David is correct.
WSASend will complete when the data is handed off to the TCP/IP stack (buffered at its layer) to be sent whenever the transport allows. If it's a blocking call, it will wait until it's ready to pend; likewise, if it's async, the OVERLAPPED I/O will complete once it pends.
Some may ask why this is even necessary. This behavior is crucial for keeping as much data as possible in flight over a TCP connection. To do that, a sender should call WSASend until it pends (recall that if it's a synchronous WSASend call then that thread will just block until WSASend can complete; if it's asynchronous, the async completion will occur once that WSASend call can pend).
Why would WSASend ever pend, why not just complete immediately? WSASend will only complete once the transport (in kernel) is ready to buffer more data. So looping until WSASend pends will keep that connection flush with enough data to keep the maximum data in-flight.
Hope this makes sense.
My testing seems to show I am indeed wrong. Synchronous behavior does not exist for sending...just receiving. I see how this is a performance help, but I object to all documentation saying sending fWait of TRUE in WSAGetOverlappedResult will wait until the IO request is "complete". This word implies much more than just it being queued up. Well..I'm flabbergasted that my understanding was off...but TCP handles things well enough that it hasn't caused me issues before.
Thanks for the excellent replies and patience with my frustrated replies. I'm incredibly disappointed at how all the documentation is written using words that would imply my understanding was right...when all along waiting for IO to "end", "complete", "finish" absolutely does NOT wait.

How can I ensure that the messages sent are not lost when Kafka is not working?

I've started to use Kafka, and I have a question about it.
If Kafka is not running because of a network problem, a Kafka crash, etc., how can I deal with this problem? And what happens to messages that were sent to Kafka?
If all brokers in the cluster are unavailable your producer will not get any acknowledgements (note that the actual send happens in a background thread, not the thread that calls send - that is an async call).
If you have acks=0 then the message is lost. With acks=1 or acks=all it depends on the retry configuration. By default the producer thread retries pretty much indefinitely, which means that at some point the send buffer will fill up and the async send method will then fail synchronously. If your client fails in the meantime, the messages in the buffer are lost, as the buffer is only in memory.
If you are wondering about behaviour when some but not all brokers are down, I wrote about it here
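The buffer-fills-then-send-fails behaviour described above can be modelled in a few lines. This is a toy sketch, not the Kafka client: `ToyProducer`, `buffer_size` and `max_block_s` are hypothetical names (loosely inspired by the producer's `buffer.memory` and `max.block.ms` settings), and the background sender thread is not modelled, so nothing ever drains the buffer.

```python
import queue


class ToyProducer:
    """Toy model of a producer whose records wait in a bounded in-memory buffer."""

    def __init__(self, buffer_size, max_block_s):
        self._buffer = queue.Queue(maxsize=buffer_size)
        self._max_block_s = max_block_s

    def send(self, record):
        # Returns as soon as the record is buffered. A background sender
        # (not modelled here) would drain the buffer once a broker acks.
        try:
            self._buffer.put(record, timeout=self._max_block_s)
        except queue.Full:
            raise TimeoutError("buffer full: brokers unreachable") from None

    def pending(self):
        return self._buffer.qsize()


producer = ToyProducer(buffer_size=3, max_block_s=0.05)
failed = []
for n in range(5):  # brokers are "down", so the buffer never drains
    try:
        producer.send(f"event-{n}")
    except TimeoutError:
        failed.append(f"event-{n}")  # the caller must store or retry these itself
```

The three buffered records exist only in process memory, so they vanish if the process dies. An application that cannot afford that would need to persist records itself before handing them to the producer.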

How to handle application failure after reading event from source in Spring Cloud Stream with rabbit MQ

I am using Spring Cloud Stream over RabbitMQ for my project. I have a processor that reads from a source, processes the message and publishes it to the sink.
Is my understanding correct that if my application picks up an event from the stream and fails (e.g. app sudden death):
unless I ack the message or
I save the message after reading it from the queue
then my event would be lost? What other option would I have to make sure not to lose the event in such case?
Digging through the RabbitMQ documentation, I found this very useful example page for the different types of queues and message deliveries for RabbitMQ, and most of them can be used with AMQP.
In particular looking at the work queue example for java, I found exactly the answer that I was looking for:
Message acknowledgment
Doing a task can take a few seconds. You may wonder what happens if
one of the consumers starts a long task and dies with it only partly
done. With our current code, once RabbitMQ delivers a message to the
consumer it immediately marks it for deletion. In this case, if you
kill a worker we will lose the message it was just processing. We'll
also lose all the messages that were dispatched to this particular
worker but were not yet handled. But we don't want to lose any tasks.
If a worker dies, we'd like the task to be delivered to another
worker.
In order to make sure a message is never lost, RabbitMQ supports
message acknowledgments. An ack(nowledgement) is sent back by the
consumer to tell RabbitMQ that a particular message has been received,
processed and that RabbitMQ is free to delete it.
If a consumer dies (its channel is closed, connection is closed, or
TCP connection is lost) without sending an ack, RabbitMQ will
understand that a message wasn't processed fully and will re-queue it.
If there are other consumers online at the same time, it will then
quickly redeliver it to another consumer. That way you can be sure
that no message is lost, even if the workers occasionally die.
There aren't any message timeouts; RabbitMQ will redeliver the message
when the consumer dies. It's fine even if processing a message takes a
very, very long time.
Manual message acknowledgments are turned on by default. In previous
examples we explicitly turned them off via the autoAck=true flag. It's
time to set this flag to false and send a proper acknowledgment from
the worker, once we're done with a task.
Thinking about it, using the ACK seems to be the logical thing to do. The reason I didn't think about it before is that I thought of an ACK only from the perspective of the publisher and not of the broker. The piece of documentation above was very useful to me.
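The redelivery rule in the quoted documentation can be illustrated with a toy in-memory model. This is not the RabbitMQ client API: `ToyBroker` and its methods are hypothetical names, used only to show that an unacknowledged message goes back on the queue when its consumer dies.

```python
from collections import deque


class ToyBroker:
    """Toy model of ack-based redelivery; not the RabbitMQ client API."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}  # delivery tag -> message
        self._next_tag = 1

    def deliver(self):
        msg = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = msg  # kept until the consumer acks
        return tag, msg

    def ack(self, tag):
        del self.unacked[tag]  # now the broker may forget the message

    def consumer_died(self):
        # Connection lost without acks: put everything back on the queue.
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))


broker = ToyBroker(["task-1", "task-2"])
tag, msg = broker.deliver()    # the consumer receives task-1 ...
broker.consumer_died()         # ... and crashes before acking
tag2, msg2 = broker.deliver()  # task-1 is delivered again
broker.ack(tag2)               # processed this time, so it can be dropped
```

The practical consequence is that the ack should be sent only after processing has finished, and the processing itself should tolerate seeing the same message twice.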

Paho-MQTT check message queue size

I'm publishing MQTT messages from an Arduino, and subscribing to those from a Raspberry Pi. Sometimes the publishing goes faster than the Raspberry can receive (and process).
I'm looking for a way of checking how many messages are queued on the Raspberry side. I'm using Paho-MQTT. I only see it is possible to set a max queue size, but how can I check the current queue size? (If possible.)
There is no queue in the broker, all messages are delivered as they are published.
The Paho client is single threaded and the message-received callback is handled on the network thread, so messages may back up in the network stack (for QOS 0 messages). QOS 1/2 messages will back up in the broker until the QOS handshake for the current message completes.
The max queued messages setting is about how many QOS 1/2 messages the client will accept for publishing before blocking, not how many incoming messages it will queue up.
If you want to queue messages in a measurable way then have the Message received callback place the messages on to a local queue and have a second thread (or pool of threads if they can be handled in parallel) take messages from the local queue.
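A minimal sketch of that local-queue approach, using only the standard library. The callback follows the `(client, userdata, msg)` shape of Paho's `on_message`, but no broker or Paho client is involved here: the messages are simulated with `SimpleNamespace` objects, and `local_queue`, `worker` and the `None` sentinel are choices made for this demo.

```python
import queue
import threading
from types import SimpleNamespace

local_queue = queue.Queue()
processed = []


def on_message(client, userdata, msg):
    # Runs on the network thread: do no real work here, just enqueue.
    local_queue.put(msg.payload)


def worker():
    while True:
        payload = local_queue.get()
        if payload is None:  # sentinel used by this demo to stop the thread
            break
        processed.append(payload)  # slow processing would go here


# Stand-in for three messages arriving before the worker gets to run.
for i in range(3):
    on_message(None, None, SimpleNamespace(payload=f"reading {i}".encode()))

backlog = local_queue.qsize()  # how many messages are waiting right now

t = threading.Thread(target=worker)
t.start()
local_queue.put(None)
t.join()
```

With a real client, the callback would be attached with `client.on_message = on_message`, and `local_queue.qsize()` then gives an approximate measure of the backlog on the Raspberry Pi side.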