Apache ActiveMQ Artemis: Live server failure detection frequency for backup?

In Apache ActiveMQ Artemis, when the live server goes down the backup server immediately takes over and becomes live. I want to understand the parameters behind this behaviour and whether they can be fine-tuned.
How frequently does the backup server check whether the live server is still available? Is this configurable?
Is there a retry count governing how many times the backup tries to reconnect to the live server before deciding to become live itself?

Related

How does Oracle NoSQL DB Streams handle failures?

I set up several NoSQL streams this morning. Around mid-day there were issues with network connectivity in the environment where Oracle NoSQL Database was deployed, and I saw some anomalies in my applications.
How does Oracle NoSQL DB Streams API handle failures like unstable network connection, master transfer, and rebalance at the server?
Please advise.
Thanks for trying the Oracle NoSQL Database Streams API.
Oracle NoSQL Streams handles failures in different ways depending on the nature of the failure. In the case of unstable network connectivity, the Streams API reconnects to the master node of each shard at the host and port cached in the Streams API; when network connectivity is restored, the Streams API reconnects successfully and resumes streaming from the last streamed operation. If the reconnect attempts fail three times, the Streams API refreshes the topology from the Oracle NoSQL Database and reconnects to the master node found in the latest store topology. This covers the case where the master transfers and the old master is no longer accessible.
Handling store rebalancing is similar to handling master transfer: the Streams API pulls the new topology from the store and locates the new master of each shard to reconnect to. After the rebalancing is complete and the new topology is ready, the Streams API is able to reconnect and resume streaming.
The description above applies to the latest version, Oracle NoSQL Database 21.1. In previous versions there is a limit that caps the number of reconnect attempts in the Streams API, and the Streams API terminates the stream and signals NoSQLSubscriber.onError() once the maximum number of attempts is exhausted. In 21.1 this limit is removed by default, and the Streams API simply keeps reconnecting until the connection is restored. Users can override the default behavior by calling NoSQLSubscriptionConfig.Builder.setMaxReconnect(long maxReconnect).
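As a rough illustration of that last point, here is a minimal sketch of setting the reconnect cap from the Java Streams API. The checkpoint table and subscribed table names are placeholders, and the Builder calls other than setMaxReconnect are written from memory as assumptions, so verify them against the NoSQLSubscriptionConfig Javadoc for your release.

import oracle.kv.pubsub.NoSQLSubscriptionConfig;

public class StreamReconnectConfig {
    public static void main(String[] args) {
        // "streamCheckpoint" and "users" are placeholder table names.
        // In 21.1 reconnects are unbounded by default; setMaxReconnect restores
        // a hard cap, after which NoSQLSubscriber.onError() is signalled.
        NoSQLSubscriptionConfig config =
                new NoSQLSubscriptionConfig.Builder("streamCheckpoint")
                        .setSubscribedTables("users")
                        .setMaxReconnect(1024L)
                        .build();

        // The config would then be handed to a NoSQLSubscriber implementation
        // and subscribed through a NoSQLPublisher (omitted here).
        System.out.println("Subscription config built: " + config);
    }
}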

Kafka Connect instead of Flume Ingestion

I have been looking into the concepts and applications of Kafka Connect, and I even worked on a project based on it during one of my internships. In my current job I am considering replacing the architecture of our real-time data ingestion platform, which is currently based on Flume -> Kafka, with Kafka Connect and Kafka.
The main reasons I am considering the switch are:
With Flume we need to install an agent on each remote machine, which creates a lot of ongoing devops workload, especially where I work, where access to machines is managed rigidly and maintaining utilities on machines belonging to other departments is difficult.
Another reason is that the machines' OS environments vary: if we install Flume on a variety of machines, some with different operating systems and JDKs (I have come across IBM JDKs), Flume simply does not work well, which in the worst case results in zero data ingestion.
It looks like Kafka Connect can be deployed centrally alongside our Kafka cluster, so the devops cost goes down. Besides, we can avoid installing Flume on machines belonging to other teams and avoid the risk of incompatible environments, ensuring stable ingestion of data from every remote machine.
In addition, the main ingestion scenario is simply to ingest log text files written in real time on remote machines (on Linux and Unix file systems) into Kafka topics, and that is it, so I won't need advanced connectors that are not supported in the Apache version of Kafka.
But I am not sure whether I understand the intended usage and scenarios of Kafka Connect correctly. I am also wondering whether Kafka Connect has to be deployed on the same machines as the data sources, or whether it is fine for it to reside on different machines. If it can be different, why does Flume require the agent to run on the same machine as the data source? I hope someone more experienced can shed some light on this.
Is Kafka Connect appropriate for ingesting data to Kafka? yes
Does Kafka Connect run local to the data source? Only if it has to (e.g. reading a local file with the Kafka Connect spooldir plugin, the FilePulse plugin, etc.); see the file-source sketch below.
Should you rip out something that works and replace it with Kafka Connect? not unless it's fixing a problem that you have
If you're not using either yet, should you use Kafka Connect instead of Flume? Quite possibly.
Learn more about Kafka Connect here: https://dev.to/rmoff/crunchconf-2019-from-zero-to-hero-with-kafka-connect-81o
For file ingestion alone there are other tools as well, such as Filebeat.
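To make the centralized-deployment point concrete, here is a minimal sketch that registers a file source connector through the Kafka Connect REST API. It uses the FileStreamSource connector that ships with Apache Kafka purely as an illustration; the worker URL, connector name, file path, and topic are placeholders, and the spooldir or FilePulse connectors mentioned above would be registered the same way with their own properties.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RegisterFileConnector {
    public static void main(String[] args) throws Exception {
        // Placeholder Connect worker URL; the REST API listens on port 8083 by default.
        String connectUrl = "http://connect-worker:8083/connectors";
        // Placeholder connector name, file path and topic.
        String body = """
                {
                  "name": "app-log-source",
                  "config": {
                    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
                    "tasks.max": "1",
                    "file": "/var/log/app/app.log",
                    "topic": "app-logs"
                  }
                }
                """;

        HttpRequest request = HttpRequest.newBuilder(URI.create(connectUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}

Note that this particular connector reads a file local to the Connect worker, which is the "only if it has to" case above: logs sitting on many remote machines still need an agent (e.g. Filebeat) or some other mechanism to get the data off those machines.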

Kafka Connect Confluent JDBC does not control session pool in MSSQL database

I am working with Kafka Connect and the Confluent JDBC connector. I integrated a source connector with MSSQL, and a few days ago the operations team warned us that there is a high number of sessions in the "sleeping" state in the database. I need to control those sessions, but apparently the connector (Confluent JDBC) doesn't expose such properties in its configuration.
Do you have any ideas to correct this problem?
Kafka Connect runs a minimum of one task per connector. Other than sharing a runtime environment, each connector is isolated from the others.
Therefore, if you have 27 connectors sourcing from the same database, you will have a minimum of 27 connections to the database.
If you can't reduce the number of connectors (e.g. by having one connector pull from multiple tables, as sketched below), then the only option I think you have is to speak to your DBA about enforcing some kind of resource management on the RDBMS side. For example, on Oracle the Resource Manager option can be used for this.
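For illustration, this is roughly what a single connector pulling several tables could look like. The connection URL, credentials, table list, and topic prefix are placeholders, and the property names are the Confluent JDBC source connector's as I recall them, so double-check them against its documentation before relying on this.

public class MultiTableJdbcConnectorConfig {
    public static void main(String[] args) {
        // One connector (and, with tasks.max=1, one task) polling several tables,
        // instead of one connector per table. All values below are placeholders.
        String body = """
                {
                  "name": "mssql-sales-source",
                  "config": {
                    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                    "tasks.max": "1",
                    "connection.url": "jdbc:sqlserver://dbhost:1433;databaseName=sales",
                    "connection.user": "connect_user",
                    "connection.password": "********",
                    "table.whitelist": "orders,order_lines,customers",
                    "mode": "timestamp",
                    "timestamp.column.name": "last_modified",
                    "topic.prefix": "sales-"
                  }
                }
                """;
        // This body would be POSTed to the Connect REST API (/connectors),
        // exactly as in the file-source sketch earlier on this page.
        System.out.println(body);
    }
}

With tasks.max set to 1, the connector polls all of the listed tables through a single task and therefore, in general, a single database connection, rather than one connection per table.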

Client Local Queue in Red Hat AMQ

We have a network of Red Hat AMQ 7.2 brokers in a master/slave configuration. The client applications publish and subscribe to topics on the broker cluster.
How do we handle the situation in which network connectivity between a client application and the broker cluster goes down? Does Red Hat AMQ have a native solution, such as a client-local queue and a JMS-to-JMS bridge between the local queue and the remote broker, so that a network connectivity failure does not result in message loss?
It would be possible for you to craft a solution where your clients use a local broker and that local broker bridges messages to the remote broker. The local broker will, of course, never lose network connectivity with the local clients since everything is local. However, if the local broker loses connectivity with the remote broker it will act as a buffer and store messages until connectivity with the remote broker is restored. Once connectivity is restored then the local broker will forward the stored messages to the remote broker. This will allow the producers to keep working as if nothing has actually failed. However, you would need to configure all this manually.
That said, even if you don't implement such a solution there is absolutely no need for any message loss even when clients encounter a loss of network connectivity. If you send durable (i.e. persistent) messages then by default the client will wait for a response from the broker telling the client that the broker successfully received and persisted the message to disk. More complex interactions might require local JMS transactions and even more complex interactions may require XA transactions. In any event, there are ways to eliminate the possibility of message loss without implementing some kind of local broker solution.
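As a minimal sketch of the "no message loss without a local broker" point, here is a persistent send wrapped in a local JMS transaction using the Artemis JMS client. The broker URL, queue name, and message body are placeholders.

import javax.jms.Connection;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;
import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

public class DurableSendExample {
    public static void main(String[] args) throws Exception {
        // Placeholder broker URL and queue name.
        ActiveMQConnectionFactory factory =
                new ActiveMQConnectionFactory("tcp://broker-host:61616");

        try (Connection connection = factory.createConnection()) {
            // A locally transacted session: the send only takes effect on commit.
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            Queue queue = session.createQueue("orders");
            MessageProducer producer = session.createProducer(queue);
            producer.setDeliveryMode(DeliveryMode.PERSISTENT); // durable message

            producer.send(session.createTextMessage("order-123"));
            session.commit(); // blocks until the broker has accepted the work
        }
    }
}

Because the session is transacted, commit() does not return until the broker has accepted the messages, so a connectivity failure shows up as an exception the producer can handle and retry, rather than as a silently lost message.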

Is there a way to connect to multiple databases in multiple hosts using Kafka Connect?

I need to get data from an Informix database using Kafka Connect. The scenario is this: I have 50 Informix databases residing on 50 hosts. From what I have read about Kafka Connect, it seems we need to install Kafka Connect on each host to get the data from the database residing on that host. My question is: is there a way to create the connectors centrally for these 50 hosts, instead of installing Kafka Connect on each of them, and pull data from the databases?
Kafka Connect JDBC does not have to run on the database host, just as other JDBC clients don't, so you can have a Kafka Connect cluster that is larger or smaller than your database pool.
Informix does, however, seem to have something called the "CDC Replication Engine for Kafka", which might be worth looking into, as CDC generally puts less load on the database.
You don't need any additional software installed on the system where the Informix server is running. I am not fully clear on the question or the type of operation you plan to do. If you are planning to set up a real-time replication scenario, then you may have to invoke the CDC API; a one-time setup of the CDC API on the server is needed, and the API can then be invoked through any Informix database driver. If you plan to read existing data from tables and pump it into Kafka topics, then no additional setup is needed on the server side. You could connect to all 50 database servers from a single program (remotely) and pump those records into the Kafka topics. Depending on the programming language you are using, you can choose the appropriate Informix database driver.
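To illustrate the "one central Kafka Connect cluster, 50 remote databases" idea, here is a rough sketch that registers one JDBC source connector per Informix host through the Connect REST API. The host names, Informix JDBC URL format, credentials, table list, and topic prefixes are all placeholders that would need to match your environment, and the connector property names should be checked against the JDBC source connector documentation.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class RegisterInformixConnectors {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Placeholder host names; in practice this would be the 50 Informix hosts.
        List<String> hosts = List.of("ifxhost01", "ifxhost02", "ifxhost03");

        for (String host : hosts) {
            // Placeholder Informix JDBC URL, credentials, tables and topic prefix.
            String body = """
                    {
                      "name": "informix-%s-source",
                      "config": {
                        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
                        "tasks.max": "1",
                        "connection.url": "jdbc:informix-sqli://%s:9088/stores:INFORMIXSERVER=ol_informix",
                        "connection.user": "connect_user",
                        "connection.password": "********",
                        "table.whitelist": "customer,orders",
                        "mode": "timestamp",
                        "timestamp.column.name": "last_modified",
                        "topic.prefix": "%s-"
                      }
                    }
                    """.formatted(host, host, host);

            HttpRequest request = HttpRequest.newBuilder(
                            URI.create("http://connect-worker:8083/connectors"))
                    .header("Content-Type", "application/json")
                    .POST(HttpRequest.BodyPublishers.ofString(body))
                    .build();

            HttpResponse<String> resp =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(host + " -> " + resp.statusCode());
        }
    }
}

The Informix JDBC driver jar would still need to be available to the Connect workers, but nothing extra has to be installed on the 50 database hosts themselves.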