Slow Subscriber - publish-subscribe

I am a newbie to ZMQ.
ZMQ Version - 2.2.1
Ubuntu - 10.04
I am using the PUB-SUB pattern for communication between multiple publishers and multiple subscribers. A forwarder subscribes to data from multiple publishers and republishes it to all the subscribers.
Currently, three publishers are running and each one sends 1000 messages per second over the PUB channel. The subscriber receives the data, stores it, and writes it to a database every second.
Because of the database writes, the subscriber cannot keep up with the incoming rate; as a result, memory usage (RAM) grows by 6-7 MB every second. Eventually the subscriber gets killed by the OS due to OOM.
I tried setting the ZMQ_HWM and ZMQ_SWAP options on both sockets of the forwarder, but the issue persists.
Is there any solution for this?

Overall your problem is that your database cannot keep up with your publisher. 0MQ cannot solve this for you. You need an architectural solution based on changing the behavior of your system, presumably the way you do inserts.
You have a few options:
Use a faster database
Use a faster database insert method
Write to a log which is processed asynchronously by another process
Change to a socket pattern that lets the receivers tell the senders that they are backed up, so the senders pause (if that's possible)
I think in your case the spool-to-disk-file option (writing to a log that another process drains asynchronously) is the best.
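As a rough illustration of two of the options above (a faster insert method and letting the receiver push back), here is a minimal in-process sketch in Go. It does not use the 0MQ API at all: the bounded channel only stands in for a spool with a fixed capacity, and `drainInBatches` and the `flush` callback are hypothetical names standing in for the real bulk insert.

```go
package main

import "fmt"

// drainInBatches reads messages from in until it is closed and hands them to
// flush in groups of at most batchSize, so the database sees one bulk insert
// per batch instead of one insert per message.
func drainInBatches(in <-chan string, batchSize int, flush func([]string)) {
	batch := make([]string, 0, batchSize)
	for msg := range in {
		batch = append(batch, msg)
		if len(batch) == batchSize {
			flush(batch)
			batch = make([]string, 0, batchSize)
		}
	}
	if len(batch) > 0 {
		flush(batch) // flush the final partial batch
	}
}

func main() {
	// The bounded buffer caps memory: once it is full, the receiving side
	// blocks instead of growing without limit.
	spool := make(chan string, 1000)
	done := make(chan struct{})

	go func() {
		defer close(done)
		drainInBatches(spool, 500, func(b []string) {
			fmt.Printf("bulk insert of %d rows\n", len(b)) // placeholder for the real insert
		})
	}()

	for i := 0; i < 1200; i++ {
		spool <- fmt.Sprintf("msg-%d", i) // blocks when the spool is full
	}
	close(spool)
	<-done
}
```

Whether blocking the receiver is acceptable depends on the socket pattern; with plain PUB-SUB the publishers are not slowed down by a blocked subscriber, which is why the answer also lists changing the pattern as a separate option.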

Related

How do I make sure that I process one message at a time at most?

I am wondering how to process one message at a time using Google's Pub/Sub functionality in Go. I am using the official library for this, https://pkg.go.dev/cloud.google.com/go/pubsub#section-readme. The event is being consumed by a service that runs with multiple instances, so any in-memory locking mechanism will not work.
I realise that it's an anti-pattern to do this, so let me explain my use-case. Using MongoDB I store an array of objects as an embedded document for each entity. The published event modifies parts of this array and saves it. If I receive more than one event at a time and they start processing at exactly the same time, one of the saves will override the other. So I was thinking a solution for this is to make sure that only one message will be processed at a time, and it would be nice to use any built-in functionality in Cloud Pub/Sub to do so. Otherwise I was thinking of implementing some locking mechanism in the DB, but I'd like to avoid that.
Any help would be appreciated.
You can consider two things:
You can use an ordering key in Pub/Sub. That way, all the messages related to the same object are delivered in order, one by one.
You can use a PUSH subscription to push to Cloud Run or Cloud Functions. With Cloud Run, set the concurrency to 1 (this is the default with Cloud Functions gen1), and also set the max instances to 1. That way you can process only one message at a time; all the other messages will be rejected (HTTP error code 429) and requeued in Pub/Sub. The drawback is that you can't parallelize the processing the way you can with ordering keys.
A similar option, and simpler to implement, is to use Cloud Tasks instead of Pub/Sub. With Cloud Tasks you can set a rate limit on a queue and set maxConcurrentDispatches to 1 (and you don't have to do the same with Cloud Functions max instances or Cloud Run max instances and concurrency).

Synchronising transactions between database and Kafka producer

We have a micro-services architecture, with Kafka used as the communication mechanism between the services. Some of the services have their own databases. Say the user makes a call to Service A, which should result in a record (or set of records) being created in that service’s database. Additionally, this event should be reported to other services, as an item on a Kafka topic. What is the best way of ensuring that the database record(s) are only written if the Kafka topic is successfully updated (essentially creating a distributed transaction around the database update and the Kafka update)?
We are thinking of using spring-kafka (in a Spring Boot WebFlux service), and I can see that it has a KafkaTransactionManager, but from what I understand this is more about Kafka transactions themselves (ensuring consistency across the Kafka producers and consumers), rather than synchronising transactions across two systems (see here: “Kafka doesn't support XA and you have to deal with the possibility that the DB tx might commit while the Kafka tx rolls back.”). Additionally, I think this class relies on Spring’s transaction framework which, at least as far as I currently understand, is thread-bound, and won’t work if using a reactive approach (e.g. WebFlux) where different parts of an operation may execute on different threads. (We are using reactive-pg-client, so are manually handling transactions, rather than using Spring’s framework.)
Some options I can think of:
Don’t write the data to the database: only write it to Kafka. Then use a consumer (in Service A) to update the database. This seems like it might not be the most efficient, and will have problems in that the service which the user called cannot immediately see the database changes it should have just created.
Don’t write directly to Kafka: write to the database only, and use something like Debezium to report the change to Kafka. The problem here is that the changes are based on individual database records, whereas the business significant event to store in Kafka might involve a combination of data from multiple tables.
Write to the database first (if that fails, do nothing and just throw the exception). Then, when writing to Kafka, assume that the write might fail. Use the built-in auto-retry functionality to get it to keep trying for a while. If that eventually completely fails, try to write to a dead letter queue and create some sort of manual mechanism for admins to sort it out. And if writing to the DLQ fails (i.e. Kafka is completely down), just log it some other way (e.g. to the database), and again create some sort of manual mechanism for admins to sort it out.
Anyone got any thoughts or advice on the above, or able to correct any mistakes in my assumptions above?
Thanks in advance!
I'd suggest using a slightly altered variant of approach 2.
Write into your database only, but in addition to the actual table writes, also write "events" into a special table within that same database; these event records would contain the aggregations you need. In the simplest case, you'd just insert another entity, e.g. mapped by JPA, which contains a JSON property with the aggregate payload. Of course this could be automated by some kind of transaction listener / framework component.
Then use Debezium to capture the changes just from that table and stream them into Kafka. That way you have both: eventually consistent state in Kafka (the events in Kafka may trail behind or you might see a few events a second time after a restart, but eventually they'll reflect the database state) without the need for distributed transactions, and the business level event semantics you're after.
(Disclaimer: I'm the lead of Debezium; funnily enough I'm just in the process of writing a blog post discussing this approach in more detail)
Here are the posts:
https://debezium.io/blog/2018/09/20/materializing-aggregate-views-with-hibernate-and-debezium/
https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/
First of all, I have to say that I’m neither a Kafka nor a Spring expert, but I think that it’s more of a conceptual challenge when writing to independent resources, and the solution should be adaptable to your technology stack. Furthermore, I should say that this solution tries to solve the problem without an external component like Debezium, because in my opinion each additional component brings challenges in testing, maintaining and running an application, which is often underestimated when choosing such an option. Also, not every database can be used as a Debezium source.
To make sure that we are talking about the same goals, let’s clarify the situation with a simplified airline example, where customers can buy tickets. After a successful order the customer will receive a message (mail, push notification, …) that is sent by an external messaging system (the system we have to talk with).
In a traditional JMS world with an XA transaction between our database (where we store orders) and the JMS provider, it would look like the following: the client sends the order to our app, where we start a transaction. The app stores the order in its database. Then the message is sent to JMS and you can commit the transaction. Both operations participate in the transaction even though they’re talking to their own resources. As the XA transaction guarantees ACID, we’re fine.
Let’s bring Kafka (or any other resource that is not able to participate in the XA transaction) into the game. As there is no longer a coordinator that syncs both transactions, the main idea of the following is to split processing into two parts with a persistent state.
When you store the order in your database you can also store the message (with aggregated data) that you want to send to Kafka afterwards in the same database (e.g. as JSON in a CLOB column). Same resource – ACID guaranteed, everything fine so far. Now you need a mechanism that polls your “KafkaTasks” table for new tasks that should be sent to a Kafka topic (e.g. with a timer service; maybe the @Scheduled annotation can be used in Spring). After the message has been successfully sent to Kafka you can delete the task entry. This ensures that the message to Kafka is only sent when the order has also been successfully stored in the application database.
Did we achieve the same guarantees as we have when using an XA transaction? Unfortunately, no, as there is still the chance that writing to Kafka works but the deletion of the task fails. In this case the retry mechanism (you would need one, as mentioned in your question) would reprocess the task and send the message twice. If your business case is happy with this “at-least-once” guarantee, you’re done here with an (imho) semi-complex solution that could easily be implemented as framework functionality, so not everyone has to bother with the details.
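The polling step and its at-least-once behaviour can be sketched as follows. This is a minimal Go illustration under stated assumptions, not a Spring or Kafka implementation: `Task`, `Outbox`, `Publisher`, `pollOnce` and the in-memory fakes are all hypothetical names, with the fakes standing in for the “KafkaTasks” table and the Kafka producer.

```go
package main

import (
	"errors"
	"fmt"
)

// Task is one row of the hypothetical "KafkaTasks" table.
type Task struct {
	ID      int
	Payload string
}

// Outbox abstracts the task table; Publisher abstracts the Kafka producer.
type Outbox interface {
	Pending(limit int) ([]Task, error)
	Delete(id int) error
}

type Publisher interface {
	Send(payload string) error
}

// pollOnce sends pending tasks and deletes each one only after its send
// succeeded. It returns how many sends succeeded in this cycle.
func pollOnce(o Outbox, p Publisher, limit int) (int, error) {
	tasks, err := o.Pending(limit)
	if err != nil {
		return 0, err
	}
	sent := 0
	for _, t := range tasks {
		if err := p.Send(t.Payload); err != nil {
			return sent, err // task stays in the table and is retried next cycle
		}
		sent++
		if err := o.Delete(t.ID); err != nil {
			return sent, err // already sent: the next cycle will send it again
		}
	}
	return sent, nil
}

// In-memory fakes used only for the demonstration below.
type memOutbox struct {
	tasks      []Task
	failDelete bool
}

func (m *memOutbox) Pending(limit int) ([]Task, error) {
	if limit > len(m.tasks) {
		limit = len(m.tasks)
	}
	return append([]Task(nil), m.tasks[:limit]...), nil
}

func (m *memOutbox) Delete(id int) error {
	if m.failDelete {
		return errors.New("delete failed")
	}
	for i, t := range m.tasks {
		if t.ID == id {
			m.tasks = append(m.tasks[:i], m.tasks[i+1:]...)
			return nil
		}
	}
	return nil
}

type memPublisher struct{ sent []string }

func (p *memPublisher) Send(payload string) error {
	p.sent = append(p.sent, payload)
	return nil
}

func main() {
	box := &memOutbox{tasks: []Task{{1, "order-1"}}, failDelete: true}
	pub := &memPublisher{}
	pollOnce(box, pub, 10) // send works, delete fails
	box.failDelete = false
	pollOnce(box, pub, 10) // the task is still there, so it is sent again
	fmt.Println(pub.sent)  // prints [order-1 order-1]
}
```

The duplicate in the output is exactly the failure mode described above, so consumers of the topic need to tolerate or deduplicate repeated messages.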
If you need “exactly-once” then you cannot store your state in the application database (in this case “deletion of a task” is the “state”) but instead you must store it in Kafka (assuming that you have ACID guarantees between two Kafka topics). An example: Let’s say you have 100 tasks in the table (IDs 1 to 100) and the task job processes the first 10. You write your Kafka messages to their topic and another message with the ID 10 to “your topic”. All in the same Kafka-transaction. In the next cycle you consume your topic (value is 10) and take this value to get the next 10 tasks (and delete the already processed tasks).
If there are easier (in-application) solutions with the same guarantees, I’m looking forward to hearing from you!
Sorry for the long answer but I hope it helps.
The approaches described above are well-defined patterns for this problem. You can explore them via the links below.
Pattern: Transactional outbox
Publish an event or message as part of a database transaction by saving it in an OUTBOX in the database.
http://microservices.io/patterns/data/transactional-outbox.html
Pattern: Polling publisher
Publish messages by polling the outbox in the database.
http://microservices.io/patterns/data/polling-publisher.html
Pattern: Transaction log tailing
Publish changes made to the database by tailing the transaction log.
http://microservices.io/patterns/data/transaction-log-tailing.html
Debezium is a valid answer but (as I've experienced) it can require some extra overhead of running an extra pod and making sure that pod doesn't fall over. This could just be me griping about a few back-to-back instances where pods hit OOM errors and didn't come back up, networking rule rollouts dropped some messages, WAL access to an AWS Aurora DB started behaving oddly... It seems that everything that could have gone wrong, did. Not saying Debezium is bad, it's fantastically stable, but often for devs, running it becomes a networking skill rather than a coding skill.
A KISS solution using ordinary code that will work 99.99% of the time (and inform you of the other 0.01%) would be:
Start Transaction
Sync save to DB
  -> If fail, then bail out.
Async send message to Kafka.
Block until the topic reports that it has received the message.
  -> If it times out or fails, Abort Transaction.
  -> If it succeeds, Commit Transaction.
I'd suggest using a newer approach, the 2-phase message. With this approach much less code is needed, and you don't need Debezium any more.
https://betterprogramming.pub/an-alternative-to-outbox-pattern-7564562843ae
For this new approach, what you need to do is:
When writing to your database, write an event record to an auxiliary table.
Submit a 2-phase message to DTM.
Write a service to query whether an event is saved in the auxiliary table.
With the help of the DTM SDK, you can accomplish the above 3 steps with 8 lines of Go, much less code than other solutions.
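    // Build a 2-phase message whose follow-up step is the TransIn call.
    msg := dtmcli.NewMsg(DtmServer, gid).
        Add(busi.Busi+"/TransIn", &TransReq{Amount: 30})
    // Run the local DB update and submit the message; QueryPrepared is the back-check URL.
    err := msg.DoAndSubmitDB(busi.Busi+"/QueryPrepared", db, func(tx *sql.Tx) error {
        return AdjustBalance(tx, busi.TransOutUID, -req.Amount)
    })

    // Back-check endpoint: reports whether the local transaction was saved.
    app.GET(BusiAPI+"/QueryPrepared", dtmutil.WrapHandler2(func(c *gin.Context) interface{} {
        return MustBarrierFromGin(c).QueryPrepared(db)
    }))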
Each of your original options has its disadvantage:
The user cannot immediately see the database changes they have just created.
Debezium will capture the log of the database, which may be much larger than the events you want. Also, deploying and maintaining Debezium is not an easy job.
"Built-in auto-retry functionality" is not cheap; it may require a lot of code or maintenance effort.

Biztalk - How to throttle a streaming disassemble pipeline

I need to limit the number of orchestration instances spawned while debatching a large message in a streaming disassemble receive pipeline. Let’s say that I have a large XML file coming in that contains 100 000 separate "Order" messages. The receive pipeline would then debatch it and create 100 000 "ProcessOrder" orchestrations. This is too much and I need to limit that.
Requirements
The debatching needs to be done in a streaming manner so that I only load one "Order" message in memory at a time before sending it to the messagebox;
The debatching needs to be throttled based on the number of current running "ProcessOrder" orchestration instances (say if I already have 100 running instances, the debatching would wait till one is over to send another "Order" message to the messagebox).
Where I'm at
I have the receive pipeline that does the debatching and functional modifications to my messages. It does what it should in a streaming manner and puts individual messages in VirtualStreams;
I have an orchestration and helper methods that can limit the number of “ProcessOrder” orchestration instances.
The problem
I know that I can run a receive pipeline inside an orchestration (and that would solve my problem, since on every "getnext" call to the pipeline I could just hold on if there are too many running orchestration instances) but, digging into the BizTalk DLLs, I noticed that using Microsoft.XLANGs.Pipeline.XLANGPipelineManager still loads all the messages into memory instead of enumerating them like Microsoft.BizTalk.PipelineOM.PipelineManager does. I know they are putting every message in a VirtualStream, but this is still inadequate, memory-wise, for such a large number of messages.
Question
My next step would be to run the receive pipeline directly in the receive port (so it would use Microsoft.BizTalk.PipelineOM.PipelineManager) without having the orchestration that limits the number of “ProcessOrder” instances, but to meet the requirements, I would need to add a delay logic in my pipeline. Is this a viable option? If not, why? and what other alternative do I have?
You should debatch all the messages once in the pipeline and store the individual messages in MSMQ before they are even processed by an orchestration. Use the standard pipeline to debatch the messages, as it handles debatching of large files efficiently. MSMQ is available for free through Turn On Windows Features. Using MSMQ is very easy and does not require any development. Sending to MSMQ will be very fast; 100K messages is not an issue at all.
Then have a receive location read from MSMQ. Depending on your orchestration throughput, you can control the message flow by using BizTalk receive host throttling, by receiving the messages from MSMQ in order, or by a combination of both. Make sure you have separate host instances for the MSMQ receive, the MSMQ send, and your orchestration processing.
This is all done through configuration, without any extra code, which simplifies your design. Make sure your orchestration has a minimum number of persistence points.

MSMQ as a job queue

I am trying to implement a job queue with MSMQ to save myself some time compared with implementing it in SQL. After reading around I realized MSMQ might not offer what I am after. Could you please advise me whether my plan is realistic with MSMQ, or recommend an alternative?
I have a number of processes picking up jobs from a queue (I might need to scale out in the future). Once a job is picked up, processing follows; during this time the job is locked against other processes by its status. If needed, the job is put back on the queue (its status changes again) for further processing, but physically the job stays in the queue until completed.
MSMQ doesn't let me keep the message in the queue while working on it, i.e. I can only peek or read. Read takes the message out of the queue, and peek doesn't allow changing the message (status).
Thank you
Using MSMQ as a datastore is probably bad as it's not designed for storage at all. Unless the queues are transactional the messages may not even get written to disk.
Certainly updating queue items in-situ is not supported for the reasons you state.
If you don't want a full blown relational DB you could use an in-memory cache of some kind, like memcached, or a cheap object db like raven.
Take a look at RabbitMQ, or many of the other message queues. Most offer this functionality out of the box.
For example, RabbitMQ calls what you are describing Work Queues. Multiple consumers can pull from the same queue and not pull the same item. Furthermore, if you use acknowledgements and the processing fails, the item is not removed from the queue.
.NET examples:
https://www.rabbitmq.com/tutorials/tutorial-two-dotnet.html
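The acknowledgement behaviour described above can be illustrated with a toy in-memory queue in Go. This is a sketch of the semantics only and does not use any RabbitMQ client; `WorkQueue`, `Get`, `Ack` and `Nack` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// WorkQueue is a toy in-memory stand-in for a broker queue with manual acks.
type WorkQueue struct {
	mu       sync.Mutex
	ready    []string
	inFlight map[int]string
	nextTag  int
}

func NewWorkQueue(jobs ...string) *WorkQueue {
	return &WorkQueue{
		ready:    append([]string(nil), jobs...),
		inFlight: make(map[int]string),
	}
}

// Get hands the next job to exactly one consumer and tracks it as in flight.
func (q *WorkQueue) Get() (tag int, job string, ok bool) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if len(q.ready) == 0 {
		return 0, "", false
	}
	job, q.ready = q.ready[0], q.ready[1:]
	q.nextTag++
	q.inFlight[q.nextTag] = job
	return q.nextTag, job, true
}

// Ack removes a finished job for good.
func (q *WorkQueue) Ack(tag int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.inFlight, tag)
}

// Nack puts a failed job back so that another consumer can pick it up.
func (q *WorkQueue) Nack(tag int) {
	q.mu.Lock()
	defer q.mu.Unlock()
	if job, found := q.inFlight[tag]; found {
		delete(q.inFlight, tag)
		q.ready = append(q.ready, job)
	}
}

func main() {
	q := NewWorkQueue("job-1", "job-2")
	tag, job, _ := q.Get()
	fmt.Println("processing", job)
	q.Nack(tag) // processing failed: job-1 goes back on the queue
	for {
		tag, job, ok := q.Get()
		if !ok {
			break
		}
		fmt.Println("done", job)
		q.Ack(tag)
	}
}
```

The "locked by status" idea from the question maps onto the in-flight set here: a job that has been handed out is invisible to other consumers until it is acknowledged or returned.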
EDIT: After using MSMQ myself, it would probably work very well for what you are doing, as far as I can tell. The key is to use transactions and multiple queues. For example, each status should have its own queue. It's fairly safe to "move" messages from one queue to another since it occurs within a transaction. This moving of messages is essentially your change of status.
We also use the Message Extension byte array for storing message metadata, like status. This way we don't have to alter the actual message when moving it to another queue.
MSMQ, and queues in general, require a different set of patterns than what most programmers are used to. Keep that in mind.
Perhaps, if you can give more information on why you need to peek for messages that are currently in process, there would be a way to handle that scenario with MSMQ. You could always add a database for additional tracking.

Message bus integration and resync of Bounded Contexts after downtime - Service Bus 1.0

I have just downloaded joliver eventstore and am looking to wire up a service bus with Windows Service Bus 1.0 for an application separated across more than one Bounded Context process.
If a bounded context has been offline whilst events in other bounded contexts have been created (or may even be a new context that has been deployed), I can see the following sequence of events.
For an example ContextA, ContextB and ContextC, all connected using Service Bus 1.0 and each context with their own event store, they all share the same bus messaging backplane.
ContextC goes offline.
When ContextC comes back up, other bounded contexts need to be notified of the events that need to be resent to the context that has just come back online. These events are replayed from each of the event stores.
My questions are:
The above scenario would apply to any event sourcing libraries, so is there any infrastructure code on top of this I can use, or do I have to roll my own?
With Windows Service Bus 1.0, how do I marry sequence numbers in my event store to sequence numbers on the Service Bus?
What is the best practice to detect and handle events that have already been received in a safe manner (protecting against message handlers failing)?
The above scenario would apply to any event sourcing libraries, so is there any infrastructure code on top of this I can use, or do I have to roll my own?
The notion of a Projection mechanism tied to the events is certainly common. Unfortunately, there are many many ways of handling how that might be done, depending on your stack, performance requirements and scale and many other factors.
As a result I'm not aware of a commoditized facility of this nature.
The GetEventStore store has an integrated Projection facility which looks extremely powerful and takes the need to build all this off the table. Before its existence, I'd have argued that one shouldn't even consider looking past the SRPness of the JOES.
You haven't said much about your actual stack other than mentioning Azure.
With Windows Service Bus, how do I marry sequence numbers in my event store to sequence numbers on the Service Bus?
You can use the stream id + the commit sequence number as the MessageId (and use that to ensure duplicates are removed by the bus). You will probably also include properties in the Message metadata.
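As a small illustration of that idea, the following Go sketch derives a message ID from the stream id and commit sequence number and drops IDs it has already seen. It shows the principle locally and is not Service Bus code; `messageID`, `Deduper` and the `:` separator are assumptions made for the example.

```go
package main

import "fmt"

// messageID combines the stream id and the commit sequence number into a
// deterministic identifier, so a replayed commit always gets the same ID.
func messageID(streamID string, commitSeq int) string {
	return fmt.Sprintf("%s:%d", streamID, commitSeq)
}

// Deduper remembers the IDs it has accepted so far.
type Deduper struct{ seen map[string]struct{} }

func NewDeduper() *Deduper { return &Deduper{seen: make(map[string]struct{})} }

// Accept reports whether id is new, and remembers it either way.
func (d *Deduper) Accept(id string) bool {
	if _, dup := d.seen[id]; dup {
		return false
	}
	d.seen[id] = struct{}{}
	return true
}

func main() {
	d := NewDeduper()
	for _, seq := range []int{1, 2, 2, 3} {
		id := messageID("order-42", seq)
		fmt.Println(id, d.Accept(id))
	}
}
```

Because the ID is derived purely from the event store's own coordinates, a replay after downtime produces the same IDs and duplicates can be recognised without extra bookkeeping on the sender.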
What is the best practice to detect and handle events that have already been received in a safe manner (protecting against message handlers failing)?
If you're on Azure and considering ServiceBus then the Topics can be used to ensure at-least-once delivery (and you'll use the sessioning facility). Go watch the two-hour deep-dive ClemensV Subscribe video plus a few other episodes, or you'll spend the same amount of time making mistakes.
To keep broadcast traffic down, if ContextC requests replays from ContextA and ContextB, is there any way for these replay messages to be sent only to ContextC? Or should I not worry about this?
Mu. You started off asking whether this stuff was a good idea but now seem to have baked in an assumption that it's the way to go.
Firstly, this infrastructure is a massive wheel to reinvent. Have you considered simply setting up a topic per BC and having anyone that needs to listen listen?
A key thing to bear in mind is that just because you can think of cases where BCs need to consume each other's events, it doesn't follow that you need a central magic bus that's everywhere and delivers everything everywhere.
EDIT: Answers to your edited versions of questions 2+
With Windows Service Bus 1.0, how do I marry sequence numbers in my event store to sequence numbers on the Service Bus?
Your event store doesn't have a sequence number. It has a commit sequence number per aggregate. You'd typically use a sessioned topic and subscription. Then you need to choose whether you want a global ordering (use a single session id) or per-aggregate ordering (use the stream id as the session id).
Once events are on a topic, they have a MessageSequenceNumber, and the subscription (when sessioned) delivers them in sequence (strictly speaking, the subscriber receives them in sequence).
What is the best practice to detect and handle events that have already been received in a safe manner (protecting against message handlers failing)?
This is built into the Service Bus (or any queueing mechanism). You don't mark the Message completed until it has been successfully processed. Any failure leads to Abandonment (which puts it back on the queue for reprocessing).
The subscriber taking a break, becoming disconnected or work backing up is naturally dealt with by the Topic.