Google Pub/Sub with pull subscriber: design flaw? - publish-subscribe

We are using Google's streaming pull subscriber. The design is as follows:
the FE (frontend) sends a file to the BE (backend)
the BE converts that file to a ByteArray and publishes it to a Pub/Sub topic as a message (so the ByteArray goes as the message)
the topic delivers that message to the subscriber, and the subscriber converts the ByteArray back into a file
the subscriber sends that converted file to a tool
the tool does some cool stuff with the file and notifies the subscriber of the status
that status goes to the BE, and the BE updates the DB and sends the status to the FE
Now, in our subscriber, when we receive a message we acknowledge it immediately and remove the subscriber's listener so that we don't get any more messages,
and when the tool is done with its work, it sends the status to the subscriber (we have an Express server running on the subscriber), and
after receiving the status we re-create the subscriber's listener to receive messages again.
Note
the tool may take an hour or more to do its work
we are using an ordering key to properly distribute messages to the VMs
This code is working fine, but my questions are:
is there any flaw in this (because we are removing the listener and then re-creating it, or anything like that)?
is there a better option or GCP service that best fits this design?
is there any improvement to the code?
EDIT:
Removed code sample

I would say that there are several parts of this design that are sub-optimal. First of all, acking a message before you have finished processing it means you risk message loss. What happens if your tool or subscriber crashes after acking the message, but before processing has completed? When the processes start back up, they will not receive the message again. Are you okay with requests from the frontend possibly never being processed? If not, you'll want to ack after processing is completed or, given that your processing takes so long, persist the request to a database or to some storage and then acknowledge the message. If you are going to have to persist the file somewhere else anyway, you might want to consider taking Pub/Sub out of the picture entirely: write the file to storage such as GCS and have your subscribers read out of GCS directly.
Secondly, stopping the subscriber upon each message being received is an anti-pattern. Your subscriber should be receiving and processing each message as it arrives. If you need to limit the number of messages being processed in parallel, use message flow control.
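For illustration, here is a minimal sketch of both points using the Python client library (the project and subscription names and the process_file function are placeholders): flow control caps how many messages are outstanding to this client, and the message is acked only after the work, or a durable hand-off such as a write to GCS or a database, has succeeded.

    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    subscription_path = subscriber.subscription_path("my-project", "my-subscription")

    # Limit parallelism instead of tearing the listener down and re-creating it.
    flow_control = pubsub_v1.types.FlowControl(max_messages=1)

    def callback(message):
        try:
            process_file(message.data)  # placeholder: persist to GCS/DB or do the work
        except Exception:
            message.nack()              # let Pub/Sub redeliver rather than lose the request
            return
        message.ack()                   # ack only after the hand-off succeeded

    future = subscriber.subscribe(subscription_path, callback=callback,
                                  flow_control=flow_control)
    future.result()  # block while messages stream in

For work that can take an hour, acking right after a quick durable write (and letting a separate worker do the long processing) is safer than holding the message, since there are limits on how long an outstanding message's deadline can be extended.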
Also, ordering keys aren't really a way to "properly distribute messages to VMs." Ordering keys are only a means to ensure ordered delivery. There is no guarantee that messages for the same ordering key will continually go to the same subscriber client. In fact, if you shut down the subscriber client after receiving each message, then another subscriber could receive the next message for the ordering key, since you've acked the earlier message. If all you mean by "properly distribute messages" is that you want the messages delivered in order, then this is the correct way to use ordering keys.
You say you have a subscription per client; whether or not that is the right thing to do depends on what you mean by "client." If client means "user of the frontend," then I imagine you plan to have a different topic per user as well. If so, keep in mind the 10,000-topics-per-project limit. If you mean that each VM has its own subscription, then note that each VM is going to receive every message published to the topic. If you only want one VM to receive each message, you need to use the same subscription across all VMs.
In general, also keep in mind that Cloud Pub/Sub has at-least-once delivery semantics. That means that even an acknowledged message could be redelivered, so you do need to be prepared to handle duplicate message delivery.

Related

How would you grab the latest message from multiple connections to a single ZMQ socket?

I am new to ZMQ and am not sure if what I want is even possible or if I should use another technology.
I would like to have a socket that multiple servers can stream to.
It appears that a ZMQ socket can do this based on this documentation: http://api.zeromq.org/4-0:zmq-setsockopt
How would I implement a ZMQ socket on the receiving end that only grabs the latest message sent from each server?
You can do this with ZMQ's PUB / SUB.
The first key thing is that a SUB socket can be connected to multiple PUBlishers. This is covered in Chapter 1 of the guide:
Some points about the publish-subscribe (pub-sub) pattern:
A subscriber can connect to more than one publisher, using one connect call each time. Data will then arrive and be interleaved “fair-queued” so that no single publisher drowns out the others.
If a publisher has no connected subscribers, then it will simply drop all messages.
If you’re using TCP and a subscriber is slow, messages will queue up on the publisher. We’ll look at how to protect publishers against this using the “high-water mark” later.
So, that means that you can have a single SUB socket on your client. This can be connected to several PUB sockets, one for each server from which the client needs to stream messages.
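For example, a single SUB socket connected to two publishers might look like this in Python with pyzmq (the endpoints are placeholders):

    import zmq

    ctx = zmq.Context()
    sub = ctx.socket(zmq.SUB)
    sub.setsockopt(zmq.SUBSCRIBE, b"")  # empty filter: receive everything

    # One connect call per publishing server; deliveries are fair-queued.
    for endpoint in ("tcp://server-a:5556", "tcp://server-b:5556"):
        sub.connect(endpoint)

    while True:
        msg = sub.recv()
        print("received:", msg)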
Latest Message
The "latest message" can be partially dealt with (as I suspect you'd started to find) using high water marks. The ZMQ_RCVHWM option allows the number to be received to be set to 1, though this is an imprecise control.
You also have to consider what it is that is meant by "latest" message; the PUB servers and SUB client will have different views of what this is. For example, when the zmq_send() function on a PUB server returns, the sent message is the one that the PUBlisher would regard as the "latest".
However, over in the client there is no knowledge of this as nothing has yet got down through the PUBlishing server's operating system network stack, nothing has yet touched the Ethernet, etc. So the SUBscribing client's view of the "latest" message at that point in time is whichever message is in ZMQ's internal buffers / queues waiting for the application to read it. This message could be quite old in comparison to the one the PUBlisher has just started sending.
In reality, the "latest" message seen by the client SUBscriber will be dependent on how fast the SUBscriber application runs.
Provided it's fast enough to keep up with all the PUBlishers, then every single message the SUBscriber gets will be as close to the "latest" message as it can get (the message will be only as old as the network propagation delays and the time taken to transit through ZMQ's internal protocols, buffers and queues).
If the SUBscriber isn't fast enough to keep up, then the "latest" messages it sees will be at least as old as the processing time per message multiplied by the number of PUBlishers. If you've set the receive HWM to 1 and the subscriber is not keeping up, the publishers will keep trying to publish messages, but the subscriber socket will keep rejecting them until the subscribing application has cleared out the old message causing the queue congestion by calling zmq_recv().
If the subscriber can't keep up, the best thing to do in the subscriber is:
have a receiving thread dedicated to receiving messages, disposing of them until processing capacity becomes available
have a separate processing thread that does the processing
have the two threads communicate via ZMQ, using a REQ/REP pattern over an inproc connection.
The receiving thread can zmq_poll both the SUB socket connection to the PUBlishing servers and the REP socket connection to the processing thread.
If the receiving thread receives a message on the REP socket, it can reply with the next message read from the SUB socket.
If it receives a message from the SUB socket with no REPly due, it disposes of the message.
The processing thread sends 1-byte messages (the content doesn't matter) on its REQ socket to request the latest message, and receives the latest message from the PUBlishers in reply.
Or something like that. That'll keep the messages flowing from PUBlishers to the SUBscriber, so the SUBscriber always has a message as close as possible to "the latest" and processes it as and when it can, disposing of messages it can't deal with.
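Here is a rough sketch of that arrangement with pyzmq (the endpoint and the handle() function are placeholders; error handling is omitted):

    import threading
    import zmq

    ctx = zmq.Context()

    def receiver():
        sub = ctx.socket(zmq.SUB)
        sub.setsockopt(zmq.SUBSCRIBE, b"")
        sub.connect("tcp://server-a:5556")   # placeholder publisher endpoint
        rep = ctx.socket(zmq.REP)
        rep.bind("inproc://latest")

        poller = zmq.Poller()
        poller.register(sub, zmq.POLLIN)
        poller.register(rep, zmq.POLLIN)

        latest, reply_due = None, False
        while True:
            events = dict(poller.poll())
            if rep in events:
                rep.recv()                   # 1-byte request from the processor
                reply_due = True
            if sub in events:
                latest = sub.recv()          # overwriting disposes of older messages
            if reply_due and latest is not None:
                rep.send(latest)
                latest, reply_due = None, False

    def processor():
        req = ctx.socket(zmq.REQ)
        req.connect("inproc://latest")
        while True:
            req.send(b"x")                   # ask for the latest message
            handle(req.recv())               # placeholder: the actual processing

    threading.Thread(target=receiver, daemon=True).start()
    processor()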

Is it possible for a single message to be given to multiple instances of the same subscription in gcloud PubSub

I have a publisher that publishes messages to a particular topic (myTopic). On Pub/Sub I created a subscription named myTopicSub to this topic (myTopic). Then I have a VM that runs a service listening on my subscription myTopicSub.
THIS WORKS
MY PROBLEM IS: if there is a need to scale and I add 5 more VMs to handle more messages from my subscription... is it possible for Pub/Sub to send the same message to more than one VM?
Because I only need one VM to process each message once. Please, I need help.
Cloud Pub/Sub offers at-least-once delivery. That means that a message can be delivered multiple times and in some cases, can be delivered to two different subscribers for the same subscription within a short period of time. That particular type of duplicate delivery is rare, but not impossible.
Subscribers have to be able to handle the delivery of duplicates and, depending on their nature, may handle it in different ways. For some, all actions are idempotent, so re-processing the same message has no ill effects. In other cases, the subscribers need to track which messages they have received and processed; if a message is a duplicate, they just immediately ack it instead of processing it.
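A minimal sketch of the tracking approach with the Python client (the unique_id attribute and the in-memory set are stand-ins; real subscribers would use durable storage shared across instances):

    processed = set()  # stand-in for a shared database or cache

    def callback(message):
        # Prefer a publisher-assigned ID carried in the attributes; Pub/Sub's
        # own message_id identifies the published message and works as a fallback.
        msg_id = message.attributes.get("unique_id", message.message_id)
        if msg_id in processed:
            message.ack()        # duplicate: acknowledge without reprocessing
            return
        handle(message.data)     # placeholder for the real work
        processed.add(msg_id)
        message.ack()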

How to prevent sending the same data to different clients in a REST API GET?

I have 15 worker clients and one master connected through the internet. Jobs and data are passed through a REST API in JSON format.
Jobs are not restricted to any particular client. Any worker can query for available jobs at a regular interval (say 30 seconds), process them, and update the status.
In this scenario, how can I prevent the same records from being sent to different clients in the GET request?
The following are my approaches to overcoming this issue:
Take the top 5 unprocessed records from the database, mark them as SENT, and expose them via REST GET.
But the problem is that this creates inconsistency. Sometimes the client doesn't get the data due to a network connectivity issue, yet on the server it is marked as SENT, so no other client can get that data; it remains SENT forever.
Get the list from the server, and reply back to the server with the list of job IDs received. But in this time gap, other clients may also get the same set of jobs.
You've stumbled upon a fundamental problem in distributed systems: there is no way to know if the other side received your message. You can certainly improve the situation with TCP and ack messages. But if you never get the ACK, did the message never arrive, did it arrive but the recipient died before processing it, or did the recipient send the ACK and the ACK got dropped?
That means you need to design your system to handle receiving data more than once.
You offer two partial solutions; if you combine them, your solution starts to look like how SQS works. Mark the item as pending_ack with a timestamp. After the client replies, it is marked SENT. Any pending_ack items past a certain time period are eligible to be resent.
Pick your time period to allow for slow networks and slow clients, and it boils down to only sending duplicates when you really don't know whether the client died or not.
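Sketched against a relational jobs table (the schema, SQLite, and the lease length are assumptions), the combined scheme might look like:

    import sqlite3
    import time

    LEASE_SECONDS = 120  # generous enough for slow networks and slow clients

    def claim_jobs(conn, limit=5):
        # Lease up to `limit` jobs that are unprocessed or whose lease expired.
        now = time.time()
        with conn:  # one transaction; a real deployment would also lock rows
            rows = conn.execute(
                "SELECT id FROM jobs WHERE status = 'unprocessed' "
                "OR (status = 'pending_ack' AND leased_at < ?) LIMIT ?",
                (now - LEASE_SECONDS, limit),
            ).fetchall()
            ids = [r[0] for r in rows]
            conn.executemany(
                "UPDATE jobs SET status = 'pending_ack', leased_at = ? WHERE id = ?",
                [(now, i) for i in ids],
            )
        return ids

    def ack_jobs(conn, ids):
        # The client confirmed receipt: mark the jobs as SENT for good.
        with conn:
            conn.executemany("UPDATE jobs SET status = 'sent' WHERE id = ?",
                             [(i,) for i in ids])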
Maybe you should reconsider the approach to blocking resources. A REST architecture, by definition, is not obliged to save information about the client. Instead, you may want to consider optimistic concurrency control (http://en.wikipedia.org/wiki/Optimistic_concurrency_control).

Routing MSMQ messages from one queue to another

Is there some standard configuration setting, service, or tool that accepts messages from one queue and moves them on to another one, automatically handling the dead-message problem and providing some retry capability? I was thinking this is what "MSMQ Message Routing" does, but I can't seem to find documentation on it (except for Windows Mobile 6, and I don't know if that's relevant).
Context:
I understand that when using MSMQ you should always write to a local queue so that failure is unlikely, and then X should move that message to a remote queue. Is my understanding wrong? Is this where messaging infrastructure like Biztalk comes in? Is it unnecessary to write to a local queue first to absolutely ensure success? Am I supposed to build X myself?
As Hugh points out, you need only one MSMQ Queue to Send messages in one direction from a source to a destination. Source and destination can be on the same server, same network or across the internet, however, both source and destination must have the MSMQ service running.
If you need to do 'message' routing (e.g. a switch which processes messages from several source or destination queues, or routes a message to one or more subscribers based on the type of message, etc.), you would need more than just an MSMQ queue.
Although you certainly can use BizTalk to do message routing, this would be expensive / overkill if you didn't need to use other features of BizTalk. Would recommend you look at open source, or building something custom yourself.
But by "Routing" you might be referring to the queue redirection capability when using HTTP as the transport e.g. over the internet (e.g. here and here).
Re : Failed delivery and retry
I think you have most of the concepts: generally, the message DELIVERY retry functionality is implicit in MSMQ. If MSMQ cannot deliver the message before the defined expiry, it will be returned on the Dead Letter Queue, and the source can then process messages from the DLQ and 'compensate' for them (e.g. reverse the actions of the 'send', indicate failure to the user, etc.).
However, 'processing'-type retries in the destination will need to be performed by the destination application / listener (e.g. if the destination system is down, deadlocks, etc.).
Common ways to do this include:
Using 2-phase commit: under a distributed unit of work, pull the message off MSMQ and process it (e.g. insert data into a database, change the status of some records, etc.); if any failure is encountered, leave the message back on the queue and the DB changes will be rolled back.
Application-level retries: i.e. on the destination system, in the event of 'retryable' errors (timeouts due to load, deadlocks, etc.), sleep for a few seconds and then retry the same transaction (a minimal sketch follows below).
However, in most cases, indefinite processing retries are not desirable and you would ultimately need to admit defeat and implement a mechanism to log the message and the error and remove it from the queue.
But I wouldn't 'retry' business failures (e.g. business rules, validation, etc.); how to handle these should be defined in your requirements (e.g. the account is overdrawn, the message is not in a correct format or not valid, etc.), for example by returning a "NACK"-type message back to the source.
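Here is the minimal sketch of the application-level retry approach mentioned above (handle, TransientError, and dead_letter are placeholders; only 'retryable' errors are retried, and defeat is admitted after a few attempts):

    import time

    MAX_ATTEMPTS = 5

    def process_with_retries(message):
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                handle(message)            # placeholder for the real processing
                return
            except TransientError:         # e.g. timeout under load, deadlock
                time.sleep(2 ** attempt)   # back off, then retry
        # Admit defeat: log the message and the error, remove it from the queue.
        dead_letter(message)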
HTH
MSMQ sends messages from one queue to another queue.
Let's say you have a queue on a remote machine. You want to send a message to that queue.
So you create a sender. A sender is an application that can use the MSMQ transport to send a message. This can be a .NET queue client (System.Messaging), a WCF service consumer (either over netMsmqBinding or msmqIntegrationBinding), BizTalk using the MSMQ adapter, etc.
When you send the message, what actually happens is:
The MSMQ queue manager on the sender machine writes the message to a temporary local queue.
The MSMQ queue manager on the sender machine connects to the MSMQ manager on the receiving machine and transmits the message.
The MSMQ queue manager on the receivers machine puts the message onto the destination queue.
In certain situations MSMQ will encounter messages which for some reason or another cannot be received on the destination queue. In these situations, if you have indicated that a message will use the dead-letter queue then MSMQ will make sure that the message is forwarded to the dead-letter queue.

How can I get QuickFix to process messages that come in from a resend request?

I am writing an acceptor application and using a persistent FIX session. I am trying to write a recovery mode, such that if I go offline or my program restarts, when I reconnect I want to reprocess all the messages sent to me during the day to get back to the current state.
To do this, when I start up I send a resend request for all messages to the server. They fire me back all the relevant messages, and they are marked possdupflag=Y and possresend=Y. Before each message, they send a sequence reset for the repeated message they are about to send.
The problem, though, is that these messages do not seem to be processed by my message cracker. Neither fromAdmin nor fromApp gets these messages. I assume they are being ignored because of the dup flag and/or resend flag. So is there a way for me to tell QuickFIX that I want to see these messages?
On that note- if anyone has any recommendations on better recovery processes I would be open to them.
Thanks.
There's at least a couple of potential problems with this recovery strategy. The first is that it's not very friendly to your trading counterparty. If you only receive a small number of messages during your session then it may not be an issue, but if you receive hundreds of thousands of messages then your counterparty might complain about the massive resends.
The other issue is that message resend is intended for error recovery and is managed by the session protocol layer. In QuickFIX/J (and other FIX engines) the session maintains recovery state in addition to sending the ResendRequest automatically when it detects a sequence number gap. Your approach might work if you reset the next expected incoming sequence number to 1. When the session receives the next message with a higher sequence number it will detect the gap and request the missing messages. If the messages are validated, they will be forwarded to application layer with the PossDup flag set. If you send the ResendRequest message yourself the behavior is undefined since the session state will not have been set up properly.
I recommend using a MessageLog implementation to store your incoming messages in a form you can use for recovery when your application starts. You can look at the implementation of the existing message logs (FileLog, JdbcLog) to get some ideas.
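As a rough illustration with the Python QuickFIX binding (the process() hook is a placeholder, and the other required Application callbacks are omitted for brevity), incoming application messages can be persisted as raw FIX strings and replayed on startup:

    import quickfix as fix

    class RecordingApplication(fix.Application):
        # Persists each incoming application message so state can be rebuilt
        # on restart without asking the counterparty for a mass resend.
        LOG_PATH = "incoming_messages.fix"

        def fromApp(self, message, sessionID):
            with open(self.LOG_PATH, "a") as f:
                f.write(message.toString() + "\n")
            self.process(message)          # placeholder: business handling

        def replay(self):
            # Call on startup, before connecting, to rebuild state.
            try:
                with open(self.LOG_PATH) as f:
                    for line in f:
                        self.process(fix.Message(line.rstrip("\n")))
            except FileNotFoundError:
                pass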
The behaviour occurs because the engine's persistence system tells it that the received messages are resent messages, and so (per the FIX protocol specification) they are discarded. Here we save FIXML strings into our database to provide a recovery ability similar to the one you describe (they are also written to XML files on disk for other reasons). I don't believe there is any way to tell QuickFIX that you want to see duplicate messages, but it is probably better to use a different form of persistence to save on connection overheads. QuickFIX does provide a way of outputting messages to file as they come in, if that helps.
I had the same issue too, and what Frank says is absolutely correct.
Just use the method below to set the next target sequence number to the beginning sequence number of the desired resend request:
getSession()->setNextTargetMsgSeqNum(atoi(seq.c_str()));
The engine internally detects that the incoming sequence number is higher than expected and automatically sends the resend request, and all the messages are then captured in the onMessage callback itself as usual.