ActiveMQ Artemis Error - AMQ224088: Timeout (10 seconds) while handshaking has occurred

In ActiveMQ Artemis, I occasionally receive the connection error below. I can't see any obvious impact on the brokers or message queues. Can anyone advise exactly what it means or what impact it could be having?
Currently, when it happens, I either restart the brokers or check that they're still connected to the cluster. Is either of these actions necessary?
Current ActiveMQ Artemis version deployed is v2.7.0.
//error log line received at least once a month
2019-05-02 07:28:14,238 ERROR [org.apache.activemq.artemis.core.server] AMQ224088: Timeout (10 seconds) while handshaking has occurred.

This error indicates that something on the network is connecting to the ActiveMQ Artemis broker, but it's not completing any protocol handshake. This is commonly seen with, for example, load balancers that do a health check by creating a socket connection without sending any real data just to see if the port is open on the target machine.
The timeout is configurable, so you can disable it to stop these ERROR messages from being logged; however, that also disables the clean-up of connections that never complete the handshake, which may or may not be a problem in your use case. You should be able to set handshake-timeout=0 on the relevant acceptor URL in broker.xml, as sketched below.
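For example, a minimal sketch of an acceptor with the handshake timeout disabled; the acceptor name, host, and port are assumptions based on a default broker.xml and should be adapted to your own configuration:

<!-- broker.xml, inside <core>: handshake-timeout=0 disables the check entirely; -->
<!-- a non-zero value (in seconds) just changes how long the broker waits before -->
<!-- logging AMQ224088 and dropping the connection.                              -->
<acceptor name="artemis">tcp://0.0.0.0:61616?handshake-timeout=0</acceptor>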
When you see this message there should be no need to restart the broker.
In the next ActiveMQ Artemis release the IP address of the remote client where the connection originated will be included as part of the message.

Related

Client Local Queue in Red Hat AMQ

We have a network of Red Hat AMQ 7.2 brokers with a master/slave configuration. The client applications publish and subscribe to topics on the broker cluster.
How do we handle the situation where the network connectivity between the client application and the broker cluster goes down? Does Red Hat AMQ have a native solution, such as a client-local queue and a JMS-to-JMS bridge between the local queue and the remote broker, so that a network connectivity failure will not result in message loss?
It would be possible for you to craft a solution where your clients use a local broker and that local broker bridges messages to the remote broker. The local broker will, of course, never lose network connectivity with the local clients since everything is local. However, if the local broker loses connectivity with the remote broker it will act as a buffer and store messages until connectivity with the remote broker is restored. Once connectivity is restored then the local broker will forward the stored messages to the remote broker. This will allow the producers to keep working as if nothing has actually failed. However, you would need to configure all this manually.
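For illustration, a minimal core-bridge sketch for the local broker's broker.xml; the connector URL, queue name, and bridge name are placeholders and would need to match your own addresses and remote broker:

<!-- Connector pointing at the remote broker (placeholder host and port). -->
<connectors>
   <connector name="remote-broker">tcp://remote-host:61616</connector>
</connectors>
<!-- Bridge that drains the local "outbound" queue and forwards its messages to the
     remote broker, retrying forever while connectivity is down. -->
<bridges>
   <bridge name="local-to-remote">
      <queue-name>outbound</queue-name>
      <forwarding-address>outbound</forwarding-address>
      <reconnect-attempts>-1</reconnect-attempts>
      <static-connectors>
         <connector-ref>remote-broker</connector-ref>
      </static-connectors>
   </bridge>
</bridges>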
That said, even if you don't implement such a solution there is absolutely no need for any message loss even when clients encounter a loss of network connectivity. If you send durable (i.e. persistent) messages then by default the client will wait for a response from the broker telling the client that the broker successfully received and persisted the message to disk. More complex interactions might require local JMS transactions and even more complex interactions may require XA transactions. In any event, there are ways to eliminate the possibility of message loss without implementing some kind of local broker solution.
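As a rough sketch of the simplest case, here is a plain JMS send of a persistent message using the Artemis JMS client; the broker URL and queue name are placeholders, and by default the send blocks until the broker confirms it has persisted the message:

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.DeliveryMode;
import javax.jms.MessageProducer;
import javax.jms.Queue;
import javax.jms.Session;

import org.apache.activemq.artemis.jms.client.ActiveMQConnectionFactory;

public class DurableSendExample {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; point this at your broker or cluster.
        ConnectionFactory cf = new ActiveMQConnectionFactory("tcp://broker-host:61616");
        try (Connection connection = cf.createConnection()) {
            Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
            Queue queue = session.createQueue("exampleQueue");
            MessageProducer producer = session.createProducer(queue);
            // PERSISTENT delivery: the broker writes the message to disk, and the
            // client waits for that acknowledgement, so a failure surfaces to the
            // caller instead of the message being silently lost.
            producer.setDeliveryMode(DeliveryMode.PERSISTENT);
            producer.send(session.createTextMessage("important payload"));
        }
    }
}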

How to unlock ActiveMQ Artemis broker

I did something to lock my ActiveMQ Artemis 2.8.1 broker. I needed to run ./artemis data exp to get data on my queue setup. It failed to run, giving an error saying that the broker was locked: /var/lib/[broker]/lock
So I stopped the broker and ran the data exp successfully, but now when I try to start the broker I get the same error, and I don't know how to stop whatever was started by data exp.
Error: There is another process using the server at /var/lib/broker1/lock. Cannot start the process!
So how do I unlock the broker in this situation? I've tried using systemctl to restart Artemis altogether, but that didn't do anything, and the Artemis tab is missing entirely from the console.
You should be able to simply remove the lock file at /var/lib/broker1/lock and then start the broker again.

Synchronization mode of ha replication

Version : ActiveMQ Artemis 2.10.1
When we use ha-policy with replication, is the synchronization mode between the master and the slave full synchronization? Can we choose between full synchronization and asynchronous replication?
I'm not 100% certain what you mean by "full synchronization" so I'll just explain how the brokers behave...
When a master broker receives a durable (i.e. persistent) message it will write the message to disk and send the message to the slave in parallel. The broker then waits both for the local disk write to complete and for a response from the slave confirming it accepted the message before it responds to the client that originally sent the message.
This behavior is not configurable.
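For reference, a minimal sketch of the ha-policy entries that select replication in broker.xml (cluster-connection and connector settings omitted); this only chooses replication as the HA mode, and the write/acknowledge behavior described above applies automatically:

<!-- master (live) broker.xml -->
<ha-policy>
   <replication>
      <master>
         <check-for-live-server>true</check-for-live-server>
      </master>
   </replication>
</ha-policy>

<!-- slave (backup) broker.xml -->
<ha-policy>
   <replication>
      <slave/>
   </replication>
</ha-policy>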

org.apache.kafka.clients.NetworkClient Bootstrap broker bootstrap-servers-ip:9092 disconnected

I am running Apache Kafka on my local system and it runs absolutely fine. But during smoke testing my application is not able to connect to the Kafka cluster. It keeps throwing the following error endlessly:
[2016-11-22T23:04:35,017][WARN ][org.apache.kafka.clients.NetworkClient] Bootstrap broker <host1>:9092 disconnected
[2016-11-22T23:04:35,474][WARN ][org.apache.kafka.clients.NetworkClient] Bootstrap broker <host2>:9092 disconnected
[2016-11-22T23:04:35,951][WARN ][org.apache.kafka.clients.NetworkClient] Bootstrap broker <host1>:9092 disconnected
[2016-11-22T23:04:36,430][WARN ][org.apache.kafka.clients.NetworkClient] Bootstrap broker <host2>:9092 disconnected
I am using the below consumer config to connect:
propsMap.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "<host1>:9092,<host2>:9092");
propsMap.put("zookeeper.connect", "<host1>:2181,<host2>:2181");
propsMap.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
propsMap.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "100");
propsMap.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "15000");
propsMap.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
propsMap.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
propsMap.put(ConsumerConfig.GROUP_ID_CONFIG, "test");
Could it be a network issue on the smoke-test servers that prevents my deployment server from connecting to the Kafka servers? It works fine locally and in two other testing environments.
Could it have something to do with the Kafka version?
Or do I need to add some other config, such as SSL, to connect?
I am new to Kafka, so it would really help if someone could point me in the right direction!
If you are using the Kafka 0.9.x.x client or later (which you are if you are using spring-kafka), you don't need the zookeeper.connect property (but that shouldn't cause your problem).
If the broker is down, you should get something like
WARN o.apache.kafka.clients.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
I suggest you look at the server logs to see if there's anything useful there. You need to talk to your admins to figure out whether you need SSL/SASL/Kerberos, etc. to connect.
This may be because the server has moved to a different address or is simply not available at the moment.
If you still want to go ahead on the assumption that the server will come up later, but don't want the logs to keep printing "Bootstrap broker ... disconnected" in an infinite loop, use this property:
reconnect.backoff.ms
The base amount of time to wait before attempting to reconnect to a given host. This avoids repeatedly connecting to a host in a tight loop. This backoff applies to all connection attempts by the client to a broker.
Type: long
Default: 50
Valid Values: [0,...]
By default the client retries a failed host every 50 milliseconds; this can be increased to, let's say, 5 minutes (300,000 ms). That way your logs won't keep printing the disconnection message endlessly.
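As a rough sketch using the consumer config from the question (the 300,000 ms value is only an example; ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG resolves to "reconnect.backoff.ms"):

// Retry a failed broker every 5 minutes instead of every 50 ms, so the
// "Bootstrap broker ... disconnected" warning is not logged in a tight loop.
propsMap.put(ConsumerConfig.RECONNECT_BACKOFF_MS_CONFIG, "300000");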
[OPTIONAL] Also, if you are using Apache Camel for routing, use the similarly named property in the camel-kafka component bean definition:
reconnectBackoffMs (producer)

Apache zookeeper client times out

We continuously get EndOfStreamException in the ZooKeeper logs:
[2017-04-06 19:15:24,350] WARN EndOfStreamException: Unable to read additional data from client sessionid 0x15b43c712fc03a5, likely client has closed socket (org.apache.zookeeper.server.NIOServerCnxn)
And in the client's (consumer) logs, we get session timeouts:
main-SendThread(localhost:2181) INFO 2017-04-06 21:30:27,823: org.apache.zookeeper.ClientCnxn Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15b43c712fc03a5, negotiated timeout = 6000
Is this normal behavior?
We are actually in the middle of investigating an issue where the consumers are unable to read messages from the queue and the producers are unable to put messages in, so the whole process is jammed.
What do you suggest?
In our case, we were running into ZooKeeper disconnects that lasted just longer than the 6000 ms default timeout, due to a flaky distributed network. Since at that point the node takes itself out of the cluster, this was causing fairly high impact on the production cluster. So we simply increased the timeout to 15 seconds and did not see the problem again.
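A minimal sketch of requesting a longer session timeout when the ZooKeeper client is created directly; the connect string and the 15-second value are assumptions, whatever is requested is still bounded by the server's minSessionTimeout/maxSessionTimeout, and clients such as Kafka expose this through their own session-timeout settings instead:

import java.io.IOException;

import org.apache.zookeeper.ZooKeeper;

public class ZkSessionTimeoutExample {
    public static void main(String[] args) throws IOException {
        // Request a 15-second session timeout instead of the 6 seconds negotiated
        // in the logs above; the server may clamp the value to its configured bounds.
        ZooKeeper zk = new ZooKeeper(
                "localhost:2181",   // connect string (placeholder)
                15_000,             // requested session timeout in ms
                event -> System.out.println("ZooKeeper event: " + event));
    }
}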