How to publish an event in ActiveMQ on row level changes in DB2? - db2

I don't know if this is possible or not, and when I Googled it I didn't find any concrete solution.
I have a scenario to implement where I need to use the pub/sub model to understand what changes are made to certain DB2 tables. So if a user updates or adds any data in those tables, an event should be published to MQ along with the data and metadata.
Any guidance or suggestions would be helpful on where to start.

You could potentially create a SQL trigger that runs a Java stored procedure to send a message to a JMS topic in ActiveMQ for the tables you're concerned about.
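For illustration only, here is a minimal sketch of such a Java stored procedure, assuming the classic javax.jms-based ActiveMQ client is available on the DB2 server's classpath; the broker URL, topic name, and parameter list are placeholders. The procedure would be registered with CREATE PROCEDURE ... LANGUAGE JAVA and called from the trigger with values taken from the NEW/OLD transition variables.

```java
import javax.jms.Connection;
import javax.jms.MessageProducer;
import javax.jms.Session;
import javax.jms.TextMessage;
import javax.jms.Topic;

import org.apache.activemq.ActiveMQConnectionFactory;

public class RowChangePublisher {

    // Called from a DB2 trigger via CALL; the trigger passes the table name,
    // the operation (INSERT/UPDATE/DELETE), and a payload built from NEW/OLD values.
    public static void publish(String tableName, String operation, String payload) throws Exception {
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://activemq-host:61616"); // assumed broker URL
        Connection connection = factory.createConnection();
        try {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Topic topic = session.createTopic("db2.row.changes"); // assumed topic name
            MessageProducer producer = session.createProducer(topic);

            TextMessage message = session.createTextMessage(payload);
            message.setStringProperty("table", tableName);     // metadata for subscribers
            message.setStringProperty("operation", operation);
            producer.send(message);
        } finally {
            connection.close();
        }
    }
}
```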
You could also potentially change the client that is modifying the database so that it sends a message to a JMS topic in ActiveMQ when it does the database work. Ideally this would be in an XA transaction to ensure the actions are atomic.

Related

Debezium Server for Azure Event hub sink send messages to multiple partition keys

I'm implementing CDC for an Azure PostgreSQL database, and I want the events to be sent to Azure Event Hubs. My current plan is to use Debezium Server with the Event Hubs sink to do this. However, I want to enforce ordering of events by table. From this article I know I can do this by having a single topic with multiple partitions and always sending events from a given table to the same partition.
However, Debezium doesn't seem to provide a nice way to handle this. You can specify the partition key that all events are sent to, but not dynamically per event. The only other options I saw that could solve this are a custom sink implementation or a custom EventHubProducerClient implementation passed into the config.
What are my options for handling this? Is there another way to architect this solution so that I don't have to use partition keys? Or is a custom sink implementation going to be my best bet? Or should I just drop Debezium and write a custom listener/publisher?
Context / requirements
typically, to run Debezium you need a Kafka instance running; if possible I don't want to use Kafka, since I'm already planning on using Event Hubs, it seems redundant, and it is another service that needs to be maintained
FIFO ordering of events by table when read by consumers of the event hub
all logical database changes are turned into events
no Java developers on the team, so custom (Java) implementations will be a stretch for our expertise.
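For reference, even with the "no Java developers" constraint in mind, the custom-sink/custom-producer route boils down to something like the sketch below. It assumes the azure-messaging-eventhubs Java SDK and uses the table name as the partition key, so all events for one table always land on the same partition and keep their order; the class, connection string, and event hub name are placeholders, not Debezium code.

```java
import com.azure.messaging.eventhubs.EventData;
import com.azure.messaging.eventhubs.EventDataBatch;
import com.azure.messaging.eventhubs.EventHubClientBuilder;
import com.azure.messaging.eventhubs.EventHubProducerClient;
import com.azure.messaging.eventhubs.models.CreateBatchOptions;

public class PerTablePublisher {

    private final EventHubProducerClient producer;

    public PerTablePublisher(String connectionString, String eventHubName) {
        this.producer = new EventHubClientBuilder()
                .connectionString(connectionString, eventHubName)
                .buildProducerClient();
    }

    // Route every event for the same table to the same partition by using the
    // table name as the partition key, which preserves per-table ordering.
    public void publish(String tableName, String changeEventJson) {
        CreateBatchOptions options = new CreateBatchOptions().setPartitionKey(tableName);
        EventDataBatch batch = producer.createBatch(options);
        if (!batch.tryAdd(new EventData(changeEventJson))) {
            throw new IllegalArgumentException("Event too large for a single batch");
        }
        producer.send(batch);
    }
}
```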

Is it possible to implement transactional outbox pattern for only RabbitMQ publish fail scenarios

I have a system that uses MongoDB for persistence and RabbitMQ as the message broker. My challenge is that I only want to implement the transactional outbox for RabbitMQ publish-failure scenarios. I'm not sure whether that's possible because I also have consumers using the same MongoDB persistence, so when I write code that covers the transactional outbox for publish-failure scenarios, published messages reach the consumers before the MongoDB commitTransaction, and the consumer can't find the message in MongoDB because of the latency.
My code is something like below;
1- start a session transaction
2- insert the document with the session (so it doesn't persist until I call commit)
3- publish to RabbitMQ
4- if the publish succeeds, commitTransaction
5- if it fails, insert into the outbox document with the session, then commitTransaction
6- if something goes wrong on the MongoDB side, abortTransaction (if the publish succeeded and MongoDB failed, my consumers first check for existence in MongoDB and do nothing if the document isn't there)
So the problem is that messages reach the consumer earlier than the MongoDB persistence; can you advise any solution that covers this?
As far as I can tell the architecture outlined in the picture in https://microservices.io/patterns/data/transactional-outbox.html maps directly to MongoDB change streams:
keep the transaction from your step 1
insert into the outbox table in the transaction
set up a message relay process which requests a change stream on the outbox table and, for every inserted document, publishes a message to the message broker
The publication to message broker can be retried and the change stream reads can also be retried in case of any errors. You need to track resume tokens correctly, see e.g. https://docs.mongodb.com/ruby-driver/master/reference/change-streams/#resuming-a-change-stream.
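Here is a minimal sketch of such a relay process, assuming the MongoDB Java driver and the RabbitMQ Java client; the database, collection, exchange, and routing-key names are placeholders, and durable storage of the resume token is only indicated in a comment.

```java
import java.nio.charset.StandardCharsets;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.model.changestream.ChangeStreamDocument;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.ConnectionFactory;
import org.bson.BsonDocument;
import org.bson.Document;

public class OutboxRelay {

    public static void main(String[] args) throws Exception {
        // Change streams require a replica set (or sharded cluster).
        MongoClient mongo = MongoClients.create("mongodb://localhost/?replicaSet=rs0");
        MongoCollection<Document> outbox = mongo.getDatabase("app").getCollection("outbox");

        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Channel channel = factory.newConnection().createChannel();

        // Watch the outbox collection; for the outbox pattern only inserts matter.
        try (MongoCursor<ChangeStreamDocument<Document>> cursor = outbox.watch().iterator()) {
            while (cursor.hasNext()) {
                ChangeStreamDocument<Document> change = cursor.next();
                Document doc = change.getFullDocument();
                if (doc == null) {
                    continue; // e.g. delete events carry no full document
                }
                channel.basicPublish("app-exchange", "outbox", null,
                        doc.toJson().getBytes(StandardCharsets.UTF_8));

                // Persist this token durably after a successful publish; on restart,
                // pass it to outbox.watch().resumeAfter(resumeToken) to continue here.
                BsonDocument resumeToken = change.getResumeToken();
            }
        }
    }
}
```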
Limitations of this approach:
only one message relay process, no scalability and no redundancy - if it dies you won't get notifications until it comes back
Your proposed solution has a different set of issues, for example by publishing notifications before committing you open yourself up to the possibility of the notification processor not being able to find the document it got out of the message broker as you said.
So I would like to share my solution.
Unfortunately it's not possible to implement the transactional outbox pattern for only the failure scenarios.
What I decided is to design the architecture around high availability, so:
MongoDB as highly available persistence and RabbitMQ as a highly available message broker.
I removed all the session transactions I had coded before and implemented an immediate write and publish.
In worst case scenario:
1- insert into document (success)
2- rabbitmq publish (failed)
3- insert into outbox (failed)
What I will have is unpublished documents in my MongoDB. Even in the worst case I could re-publish messages from MongoDB with another application, but I won't write that application until I actually face that case, because we cannot cover every failure scenario in our code. So our message brokers and persistence must be highly available.
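A minimal sketch of that immediate write-and-publish flow, assuming the MongoDB Java driver and the RabbitMQ Java client; collection and exchange names are placeholders, and no sessions or transactions are involved.

```java
import java.nio.charset.StandardCharsets;

import com.mongodb.client.MongoCollection;
import com.rabbitmq.client.Channel;
import org.bson.Document;

public class ImmediateWritePublisher {

    private final MongoCollection<Document> orders;
    private final MongoCollection<Document> outbox;
    private final Channel channel;

    public ImmediateWritePublisher(MongoCollection<Document> orders,
                                   MongoCollection<Document> outbox,
                                   Channel channel) {
        this.orders = orders;
        this.outbox = outbox;
        this.channel = channel;
    }

    public void handle(Document order) {
        orders.insertOne(order); // 1- insert the document immediately (no session)
        try {
            channel.basicPublish("app-exchange", "orders", null,
                    order.toJson().getBytes(StandardCharsets.UTF_8)); // 2- publish to RabbitMQ
        } catch (Exception publishError) {
            // 3- only on publish failure, fall back to the outbox; if this insert also
            //    fails we are in the "worst case" above and rely on re-publishing later.
            outbox.insertOne(new Document("payload", order));
        }
    }
}
```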

Kafka Microservice Proper Use Cases

In my new work's project, I discovered that instead of directly making POST/PUT API calls from one microservice to another, a microservice would produce a message to Kafka, which is then consumed by a single microservice.
For example, Order microservice would publish a record to "pending-order" topic, which would then be consumed by Inventory microservice (no other consumer). In turn, after consuming the record and done some processing, Inventory microservice would produce a record to "processed-order" which would then be consumed only by Order microservice.
Is this a correct use case? Or is it better to just do API calls between microservices in this case?
There are two strong use cases of Kafka in a microservice based application:
You need to make a state change in multiple microservices as part of a single end-user activity. If you do this by calling all the appropriate microservice APIs sequentially or in parallel, there will be two issues:
Firstly, you lose atomicity, i.e. you cannot guarantee "all or nothing". It's quite possible that the call to microservice A succeeds but the call to service B fails, which would leave the data permanently inconsistent. Secondly, in a cloud environment unpredictable latency and network timeouts are not uncommon, so when you make multiple calls as part of a single request, the probability of one of those calls being delayed or failing is higher, which hurts the user experience.
Hence, the general recommendation is to write the user action atomically to a Kafka topic as an event and have multiple consumer groups, one for each interested microservice, consume the event and make the state change in their own database (see the consumer sketch at the end of this answer). If the action is triggered by the user from a UI, you may also need to provide a "read your own writes" guarantee, where the user wants to see their data immediately after writing. In that case you would write the event first to the local database of the first microservice and then do log-based event sourcing (using an appropriate Kafka connector) to transfer the event data to Kafka; this lets you show the data to the user from the local DB. You may also need to update a cache, a search index, a distributed file system, etc., and all of these can be done by consuming the Kafka events published by the individual microservices.
It is not uncommon that you need to pull data from multiple microservices to perform some activity or to aggregate data and display it to the user. This is generally not recommended because of the latency and timeout issues mentioned above. Instead, it is usually recommended to precompute those aggregates in the microservice's local DB, based on the Kafka events published by the other microservices as they change their own state. This allows you to serve the aggregated data to the user much faster. This is called the materialized view pattern.
The only point to remember here is that writing to the Kafka log or broker and reading from it is asynchronous, and there may be a small time delay.
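As a rough illustration of the "one consumer group per interested microservice" idea above, here is a minimal sketch using the plain Kafka clients; the topic and group names follow the order/inventory example from the question, everything else is assumed.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class InventoryServiceConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        // Each microservice uses its own group.id, so every interested service
        // receives its own copy of each "pending-order" event.
        props.put("group.id", "inventory-service");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("pending-order"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // Update this service's own database, then optionally publish a
                    // follow-up event (e.g. to "processed-order").
                    System.out.printf("order %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```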
A microservice as the consumer seems fishy to me. You probably mean that listeners on that topic consume the message and maybe then call your second microservice, i.e. the Inventory microservice.
Yes, the model is fine, especially when you want asynchronous behaviour and a lot of traffic handled through it.
Imagine a scenario where you have more than one microservice to call from one endpoint. Here you either need an aggregation layer which aggregates your services so you call it once, or you publish several messages to Kafka which then does the job.
Also think about read services: if you need to call a microservice to read some data from somewhere else, then you can't use Kafka.
It all depends on your requirements and design.

Exactly-once semantics in spring Kafka

I need to apply transactions in a system that comprises of below components:
A Kafka producer: an external application which publishes messages to a Kafka topic.
A Kafka consumer: a Spring Boot application where I have configured the Kafka listener; after processing the message, it needs to be saved to a NoSQL database.
I have gone through several blogs like this & this, and all of them talk about transactions in the context of streaming applications, where messages are read, processed, and written back to a Kafka topic.
I don't see any clear example or blog on achieving transactionality in a use case similar to mine, i.e. producing, processing, and writing to a DB in a single atomic transaction. I believe this is a very common scenario and there must be some support for it.
Can someone please guide me on how to achieve this? Any relevant code snippet would be greatly appreciated.
in a single atomic transaction.
There is no way to do it; Kafka doesn't support XA transactions (nor do most NoSQL DBs). You can use Spring's transaction synchronization for best-effort 1PC.
See the documentation.
Spring for Apache Kafka implements normal Spring transaction synchronization.
It provides "best efforts 1PC" - see Distributed transactions in Spring, with and without XA for more understanding and the limitations.
I'm guessing you're trying to solve the scenario where your consumer goes down after writing to the database but before committing the offsets, or other similar problems. Unfortunately this means you have to build your own fault-tolerance.
In the case of the problem I mentioned above, this means you would have to manage the consumer offsets in your end-output database, updating them in the same database transaction that you're writing the output of your consumer application to.
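A minimal sketch of that idea, using the plain Kafka consumer and a relational store over JDBC purely for illustration (the question mentions a NoSQL database; the same pattern applies to any store that supports transactions). The table names, topic, and connection details are assumptions.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class OffsetsInDatabaseConsumer {

    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "db-writer");
        props.put("enable.auto.commit", "false"); // offsets live in the database, not in Kafka
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        TopicPartition partition = new TopicPartition("events", 0);
        try (Connection db = DriverManager.getConnection("jdbc:postgresql://localhost/app", "app", "secret");
             KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {

            db.setAutoCommit(false);
            consumer.assign(List.of(partition));
            consumer.seek(partition, loadOffset(db, partition)); // resume from the stored offset

            while (true) {
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                    try (PreparedStatement insert = db.prepareStatement(
                                 "INSERT INTO processed_events (payload) VALUES (?)");
                         PreparedStatement offset = db.prepareStatement(
                                 "UPDATE consumer_offsets SET next_offset = ? WHERE topic = ? AND part = ?")) {
                        insert.setString(1, record.value());
                        insert.executeUpdate();
                        offset.setLong(1, record.offset() + 1);
                        offset.setString(2, partition.topic());
                        offset.setInt(3, partition.partition());
                        offset.executeUpdate();
                        db.commit(); // output and offset commit or roll back together
                    }
                }
            }
        }
    }

    private static long loadOffset(Connection db, TopicPartition tp) throws Exception {
        try (PreparedStatement ps = db.prepareStatement(
                "SELECT next_offset FROM consumer_offsets WHERE topic = ? AND part = ?")) {
            ps.setString(1, tp.topic());
            ps.setInt(2, tp.partition());
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getLong(1) : 0L;
            }
        }
    }
}
```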

Query message store of mirth connect

Can I use mirth connect to store millions of HL7v2 messages (pipe delimited) and query them programmatically by our third party software application at a later point of time?
What's the best way to do that? Is Mirth's REST API capable of querying its message store efficiently?
Unfortunately, I need a running Mirth Connect instance to browse the REST API documentation, according to the manual at page 368. (If browsing the REST API documentation didn't require a running instance of Mirth, I wouldn't have asked this question. Is there a Mirth Connect instance available on the internet to play with? Or would somebody be so kind as to post the relevant REST API documentation for this question?)
So far, these are the scenarios I have come up with:
Mirth is an integration engine, and its strength is processing messages. Browsing historical messages can be difficult or slow at times, depending on the storage settings for the channel and whether or not you take care to pull additional information out during processing and store it in "custom metadata" fields. The custom metadata fields are not indexed by default, but you can add your own indexes (Mirth supports several back-end databases, including PostgreSQL, MySQL, Oracle, and MSSQL). Searching the message content basically involves doing a full-text search and scanning. Filter options that reduce scan time, apart from the custom metadata you create, are mostly related to the message properties (datetime received, status, etc.) and not the content.
So, I would not recommend it for the use-case you are suggesting.
However, Mirth could definitely be used to convert your messages (batched from files or live) to XML, which could be put into a database designed to handle and query large volumes of XML documents. I assume when you say HL7 you mean the ER7 (pipe-delimited) format of HL7v2. Mirth automatically does the conversion to XML for those message types, as they are handled as XML during processing. You could easily create a new parent node that holds both the converted XML and the original message string as children.
If the database you choose has a JDBC driver, Java SDK, or HTTP/REST API, Mirth can likely insert the converted messages directly for you as it processes them.
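Purely to illustrate the shape such storage could take (this is not Mirth code), here is a minimal JDBC sketch that inserts one message into a hypothetical hl7_messages table holding both the original ER7 string and the XML Mirth produced during processing.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class Hl7MessageWriter {

    // Assumes a table: hl7_messages(message_control_id, raw_er7, converted_xml)
    public static void insert(String jdbcUrl, String user, String password,
                              String controlId, String rawEr7, String convertedXml) throws Exception {
        String sql = "INSERT INTO hl7_messages (message_control_id, raw_er7, converted_xml) VALUES (?, ?, ?)";
        try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
             PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, controlId);
            ps.setString(2, rawEr7);       // original pipe-delimited message
            ps.setString(3, convertedXml); // XML produced during Mirth processing
            ps.executeUpdate();
        }
    }
}
```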
There are two misconceptions here:
An HL7v2 message is triggered by a real-world event, called the trigger event, on the placer (sender) side. It expects some activity to happen on the filler (receiver) side, e.g. acknowledging the message or replying with a query response. That is, HL7v2 supports data flow among systems.
Mirth Connect is an HL7 interface engine aimed at transforming incoming feeds in one format (e.g., HL7v2 in ER7 format) into outgoing feeds in another format (which could be another HL7v2 flavor, XML, a database, etc.). It does not store anything except a configured portion of messages for audit purposes.
Now, to implement the solution you outlined, Mirth Connect (or any other transformation mechanism) has to implement two flows: receive, convert if needed, and store incoming messages; and provide an interface to query those messages.
This can obviously be done with Mirth Connect, but the assumption in your initial question, that Mirth is capable of storing millions of records, is incorrect. In fact, it's recommended to keep as few messages as possible to speed up Mirth processing (each processed message is stored in the Mirth internal database several times, depending on configuration). Thus, all transformed messages should go into external public or private message storage, exactly as shown in your diagrams.