ActiveMQ Artemis clustered env - retained messages are not accurate after the broker restarts

What is the behavior when a new client connects to a cluster with respect to retained messages? Will it get the retained messages received by all the nodes in the cluster?

Related

Consume directly from ActiveMQ Artemis replica

In a cluster scenario using the HA/data-replication feature, is there a way for consumers to consume/fetch data from a slave node instead of always reaching out to the master node (the master of that particular queue)?
If you think about scalability, having all consumers call the single node responsible for being the master of a specific queue means all traffic goes to one node.
Kafka allows consumers to fetch data from the closest node if that node contains a replica of the leader. Is there something similar in ActiveMQ?
In short, no. Consumers can only consume from an active broker, and slave brokers are not active; they are passive.
If you want to increase scalability you can add additional brokers (or HA broker pairs) to the cluster. That said, I would recommend careful benchmarking to confirm that you actually need additional capacity before increasing your cluster size. A single ActiveMQ Artemis broker can handle millions of messages per second depending on the use-case.
As I understand it, Kafka's semantics are quite different from those of a "traditional" message broker like ActiveMQ Artemis, so the comparison isn't particularly apt.
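To make the "consumers only ever talk to the live broker" point concrete, here is a minimal sketch of a JMS consumer against an HA pair using the Artemis JMS client. The host names, queue name, and connector URL parameters are illustrative assumptions; check the Artemis documentation for the exact URL options supported by your version.

    import javax.jms.Connection;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

    public class HaQueueConsumer {
        public static void main(String[] args) throws Exception {
            // Both the master and its backup are listed, but the client only ever talks to
            // whichever broker is currently live; the passive backup serves no traffic.
            ActiveMQConnectionFactory cf = new ActiveMQConnectionFactory(
                    "(tcp://master-host:61616,tcp://backup-host:61616)?ha=true&reconnectAttempts=-1");
            try (Connection connection = cf.createConnection()) {
                connection.start();
                Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
                Queue queue = session.createQueue("exampleQueue");
                MessageConsumer consumer = session.createConsumer(queue);
                TextMessage message = (TextMessage) consumer.receive(5_000);
                System.out.println("Received: " + (message == null ? "<nothing>" : message.getText()));
            }
        }
    }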

During rolling upgrade/restart, how to detect when a Kafka broker is "done"?

I need to automate a rolling restart of a Kafka cluster (3 brokers). I can easily do it manually - restart one after the other, while checking the log to see when it's fine (e.g., when the new process has joined the cluster).
What is a good way to automate this check? How can I ask a broker whether it's up and running, connected to its peers, all topics up-to-date and so on? In my restart script I have access to the metrics, but to be frank, I did not really see one there that gives me a clear picture.
Another way to ask this: what would a good "readiness" probe be that does not simply check some TCP/IP port, but looks at the actual server state?
I would suggest exposing JMX metrics and tracking the following for cluster health:
the controller count (must be exactly 1 across the whole cluster)
under-replicated partitions (should be zero for a healthy cluster)
unclean leader elections (if you haven't disabled them in server.properties, make sure the metric counts show none)
ISR shrinks within a reasonable time window, e.g. 10 minutes (should be none)
Also, Yelp has tooling for rolling restarts implemented in Python. It requires Jolokia JMX agents installed on the brokers and polls the metrics to make sure some of the above conditions hold.
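A minimal sketch of polling two of these metrics over a broker's JMX port (this assumes JMX is enabled on the broker, e.g. via JMX_PORT=9999; the MBean names below are the standard Kafka ones, but verify them against your broker version):

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class BrokerHealthCheck {
        public static void main(String[] args) throws Exception {
            // Placeholder host; point this at each broker's JMX endpoint in turn.
            String url = "service:jmx:rmi:///jndi/rmi://broker-host:9999/jmxrmi";
            try (JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url))) {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();

                // 1 on the current controller, 0 on every other broker;
                // summed across the cluster it must be exactly 1.
                Object controllerCount = mbsc.getAttribute(
                        new ObjectName("kafka.controller:type=KafkaController,name=ActiveControllerCount"),
                        "Value");

                // Should return to 0 after each broker restart before moving on to the next one.
                Object underReplicated = mbsc.getAttribute(
                        new ObjectName("kafka.server:type=ReplicaManager,name=UnderReplicatedPartitions"),
                        "Value");

                System.out.println("ActiveControllerCount=" + controllerCount
                        + ", UnderReplicatedPartitions=" + underReplicated);
            }
        }
    }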
Assuming your cluster was healthy at the beginning of the restart operation, at a minimum, after each broker restart, you should ensure that the under-replicated partition count returns to zero before restarting the next broker.
As the previous responders mentioned, there is existing code out there to automate this. I don’t use Jolokia myself, but my solution (which I’m working on now) also uses JMX metrics.
Kafka-Utils by Yelp is one of the best tools for detecting when a Kafka broker is "done". Specifically, kafka_rolling_restart is the tool that gets broker details from ZooKeeper and URP (under-replicated partition) metrics from each broker. When a broker is restarted, the total URP count across the Kafka cluster is polled periodically, and when it drops to zero, the next broker is restarted. The controller broker is restarted last.
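As an illustration of that "wait until URPs reach zero" loop with a modern Java client (a sketch assuming the org.apache.kafka:kafka-clients AdminClient API; kafka_rolling_restart itself does this in Python via Jolokia):

    import java.util.Properties;
    import java.util.Set;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.ListTopicsOptions;

    public class WaitForNoUnderReplicatedPartitions {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                while (true) {
                    // Describe every topic (including internal ones) and count partitions
                    // whose ISR is smaller than their replica set.
                    Set<String> topics = admin.listTopics(new ListTopicsOptions().listInternal(true))
                            .names().get();
                    long urp = admin.describeTopics(topics).all().get().values().stream()
                            .flatMap(d -> d.partitions().stream())
                            .filter(p -> p.isr().size() < p.replicas().size())
                            .count();
                    System.out.println("Under-replicated partitions: " + urp);
                    if (urp == 0) {
                        break;            // safe to restart the next broker
                    }
                    Thread.sleep(10_000); // poll every 10 seconds
                }
            }
        }
    }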

Does the Kafka client connect to ZooKeeper or is it behind the scenes?

Kafka client code refers directly to a broker's IP and port. If that broker is down, will ZooKeeper direct the client to another broker? Is ZooKeeper always working behind the scenes?
If you provide only one broker address in the client code and that broker goes down (and your client restarts), then your client will also be down. ZooKeeper is not used here because the broker is simply unreachable.
If you give more than one broker address in the client, it is more resilient: the Kafka controller periodically fetches the list of all live brokers in the cluster from ZooKeeper and is responsible for propagating that information back to clients via the brokers they fetch metadata from. ZooKeeper is used indirectly here, but it never communicates with external clients.
If I understood the question correctly, the answer is no.
Kafka clients need a connection only to the Kafka brokers; ZooKeeper isn't involved at all. Clients need to write to/read from the leader partitions on the brokers.
If none of the brokers in the configured broker list are available, the clients can't connect and cannot start to send/receive messages.
Only in the old 0.8.0 version was ZooKeeper involved on the consumer side, where consumers saved their offsets in ZooKeeper. Starting from 0.9.0, consumers save offsets in a Kafka topic, so ZooKeeper isn't needed anymore.
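To make the "brokers only, no ZooKeeper" point concrete, a minimal modern consumer configuration looks roughly like this (a sketch; broker host names, group id, and topic name are placeholders):

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class BrokersOnlyConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Only broker addresses are configured; listing several makes the initial
            // connection resilient if one of them happens to be down.
            // No ZooKeeper address appears anywhere in the client configuration.
            props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092");
            props.put("group.id", "example-group");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("example-topic"));
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Offsets are committed to the internal __consumer_offsets topic, not to ZooKeeper.
                    System.out.printf("%s-%d@%d: %s%n", record.topic(), record.partition(),
                            record.offset(), record.value());
                }
            }
        }
    }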

Cluster in ActiveMQ Artemis

I am new to ActiveMQ Artemis and am trying to understand the symmetric cluster in ActiveMQ Artemis.
Here is the example of it that I am trying to understand.
I am getting the list of topic messages and queue messages that are consumed from a cluster node. How can I get information about which node is returning this information (queue message/topic message)?
Artemis doesn't add any meta-data to the message to indicate which cluster node it's coming from. Typically a cluster is comprised of interchangeable/indistinguishable nodes so it doesn't actually matter where the message comes from.

How to permanently remove a broker from Kafka cluster?

How do I permanently remove a broker from a Kafka cluster?
Scenario:
I have a stable cluster of 3 brokers.
I temporarily added a fourth broker that successfully joined the cluster. The controller returned metadata indicating this broker was part of the cluster.
However, I never rebalanced partitions onto this broker, so this broker #4 was never actually used.
I later decided to remove this unused broker from the cluster. I shut down the broker successfully, and ZooKeeper's /brokers/ids no longer lists broker #4.
However, when our application code connects to any Kafka broker and fetches metadata, we get a broker list that includes this deleted broker.
How do I indicate to the cluster that this broker has been permanently removed from the cluster and not just a transient downtime?
Additionally, what's happening under the covers that causes this?
I'm guessing that when I connect to a broker and ask for metadata, the broker checks its local cache for the controller ID, contacts the controller, and asks it for the list of all brokers. The controller then checks its cached list of brokers and returns all brokers known to have belonged to the cluster at any point in time.
I'm guessing this happens because it's not certain whether the dead broker has been permanently removed or is just experiencing transient downtime. So I'm thinking we just need to indicate to the controller that it should reset its list of known cluster brokers to the live brokers registered in ZooKeeper. But I would not be surprised if something in my mental model is incorrect.
This is for Kafka 0.8.2. I am planning to upgrade to 0.10 soon, so if 0.10 handles this differently, I'd also love to know that.
It looks like this is most likely due to this bug in Kafka 0.8, which was fixed in Kafka 0.9.
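For what it's worth, on a current cluster you can see exactly which brokers the metadata still advertises by using the AdminClient (a sketch assuming the kafka-clients AdminClient API, which did not yet exist in 0.8.2; the broker host name is a placeholder). After a broker has been cleanly removed it should no longer show up in this list:

    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.DescribeClusterResult;
    import org.apache.kafka.common.Node;

    public class ShowAdvertisedBrokers {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker1:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                DescribeClusterResult cluster = admin.describeCluster();
                // The brokers the cluster metadata currently advertises to clients.
                for (Node node : cluster.nodes().get()) {
                    System.out.println("Broker id=" + node.id() + " at " + node.host() + ":" + node.port());
                }
                System.out.println("Controller: broker " + cluster.controller().get().id());
            }
        }
    }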