Paging mode in ActiveMQ Artemis

As far as I understand, paging is carried out on addresses if they exceed the defined size. Currently we are experiencing paging, but not on any address (queue) we recognize. It seems to be an internal queue from ActiveMQ? Is it possible to understand what kind of address ActiveMQ is paging here?
WARN [org.apache.activemq.artemis.core.server] AMQ222038: Starting paging on address '$.artemis.internal.my-cluster.fec50662-55c7-11eb-91d1-005056903119'; size is currently: 25,238,532 bytes; max-size-bytes: -1; global-size-bytes: 524,357,417
This is important for us, because we have determined that this paging is preventing the messages in our queues from being consumed.

The address named $.artemis.internal.my-cluster.fec50662-55c7-11eb-91d1-005056903119, and the related queue, are used for intra-cluster communication. When messages need to be moved from one node to another they are sent to this address and then forwarded to another broker by the internal cluster bridge.
Given the log message I would surmise that you've hit the global-max-size limit, which is checked against global-size-bytes, i.e. the bytes from all addresses added together. You might consider increasing your global-max-size in broker.xml.
You say that this paging is preventing your consumers from consuming messages. However, it's also worth noting that paging is typically caused by consumers not consuming messages, not the other way around. When consumers slow down or stop, messages build up in the broker and it has no choice but to begin paging. Therefore you would likely see both of these things simultaneously, which could lead to misattribution.
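For reference, here is a minimal sketch of raising that limit on an embedded broker via the Java API (the 1 GiB value and the acceptor are placeholders, and I'm assuming Configuration.setGlobalMaxSize is available in your Artemis version). On a standalone broker the equivalent is the global-max-size element under core in broker.xml, sized to fit comfortably within the JVM heap.

    import org.apache.activemq.artemis.core.config.impl.ConfigurationImpl;
    import org.apache.activemq.artemis.core.server.ActiveMQServer;
    import org.apache.activemq.artemis.core.server.ActiveMQServers;

    public class GlobalMaxSizeExample {
        public static void main(String[] args) throws Exception {
            ConfigurationImpl config = new ConfigurationImpl();
            config.setPersistenceEnabled(true);
            config.setSecurityEnabled(false);
            config.addAcceptorConfiguration("in-vm", "vm://0"); // placeholder acceptor
            // Total memory the broker may use for messages across all addresses
            // before the address-full policy (e.g. PAGE) kicks in. Example value only.
            config.setGlobalMaxSize(1024L * 1024 * 1024); // 1 GiB

            ActiveMQServer server = ActiveMQServers.newActiveMQServer(config);
            server.start();
        }
    }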

Related

Can "observer" nodes in zookeeper respond with stale results?

This question is in reference to https://zookeeper.apache.org/doc/trunk/zookeeperObservers.html
Observers are non-voting members of an ensemble which only hear the
results of votes, not the agreement protocol that leads up to them.
Other than this simple distinction, Observers function exactly the
same as Followers - clients may connect to them and send read and
write requests to them. Observers forward these requests to the Leader
like Followers do, but they then simply wait to hear the result of the
vote. Because of this, we can increase the number of Observers as much
as we like without harming the performance of votes.
Observers have other advantages. Because they do not vote, they are
not a critical part of the ZooKeeper ensemble. Therefore they can
fail, or be disconnected from the cluster, without harming the
availability of the ZooKeeper service. The benefit to the user is that
Observers may connect over less reliable network links than Followers.
In fact, Observers may be used to talk to a ZooKeeper server from
another data center. Clients of the Observer will see fast reads, as
all reads are served locally, and writes result in minimal network
traffic as the number of messages required in the absence of the vote
protocol is smaller.
1) non-voting members of an ensemble - What do the voting members vote on?
2) How does an update request work for observers - When a ZK leader gets an update request, it requires a quorum of nodes to respond. Observer nodes do not seem to count toward that quorum. Does that mean an observer node lags behind the leader node for updates? If that is true, how does ZooKeeper ensure that observer nodes do not respond with stale data during reads?
3) Clients of the Observer will see fast reads, as all reads are served locally, and writes result in minimal network traffic as the number of messages required in the absence of the vote protocol is smaller - Reads from all the other nodes will also be local only because they are in-sync with the leader, no? And I did not get the part about writes.
Answering these questions should help with understanding ZooKeeper and distributed systems in general. I'd appreciate a good, detailed answer. Thanks in advance!
1) non-voting members of an ensemble - What do the voting members vote on?
Typical members of the ensemble (not observers) vote on success/failure of proposed changes coordinated by the leader. There is some further discussion of the details in the paper ZooKeeper: Wait-free coordination for Internet-scale systems.
2) How does an update request work for observers - When a ZK leader gets an update request, it requires a quorum of nodes to respond. Observer nodes do not seem to count toward that quorum. Does that mean an observer node lags behind the leader node for updates? If that is true, how does ZooKeeper ensure that observer nodes do not respond with stale data during reads?
You are correct that observer nodes are not considered necessary participants in the quorum. In general, update lag will be subject to network latency between the observer and the leader. (Whether or not this is noticeable is subject to specific external factors, such as whether or not the observer and leader are in the same data center with a low-latency network link.)
Note that even without use of observers, there is no guarantee that every server in the ensemble is always completely up to date. The Apache ZooKeeper documentation on Consistency Guarantees contains this disclaimer:
Sometimes developers mistakenly assume one other guarantee that ZooKeeper does not in fact make. This is:
Simultaneously Consistent Cross-Client Views ZooKeeper does not
guarantee that at every instance in time, two different clients will
have identical views of ZooKeeper data. Due to factors like network
delays, one client may perform an update before another client gets
notified of the change. Consider the scenario of two clients, A and B.
If client A sets the value of a znode /a from 0 to 1, then tells
client B to read /a, client B may read the old value of 0, depending
on which server it is connected to. If it is important that Client A
and Client B read the same value, Client B should call the
sync() method from the ZooKeeper API before it performs its
read.
However, clients of ZooKeeper will never appear to "go back in time" by reading stale data from a point in time prior to the data they already read. This is accomplished by attaching a monotonically increasing transaction ID (called "zxid") to each ZooKeeper transaction. When the ZooKeeper client interacts with a server, it compares the client's last seen zxid to the current zxid of the server. If the server is behind the client, then it will not allow the client's next read to be processed by that server.
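To make the sync-before-read recommendation concrete, here is a minimal sketch using the standard ZooKeeper Java client; the connection string and the /a path are placeholders, and error handling is omitted.

    import java.util.concurrent.CountDownLatch;
    import org.apache.zookeeper.ZooKeeper;
    import org.apache.zookeeper.data.Stat;

    public class SyncThenRead {
        public static void main(String[] args) throws Exception {
            // Placeholder connection string; point it at your own ensemble/observer.
            ZooKeeper zk = new ZooKeeper("observer-host:2181", 30000, event -> {});

            // sync() is asynchronous: it asks the server we are connected to
            // to catch up with the leader before we issue the read.
            CountDownLatch synced = new CountDownLatch(1);
            zk.sync("/a", (rc, path, ctx) -> synced.countDown(), null);
            synced.await();

            // This read now reflects at least everything the leader had
            // committed at the time of the sync.
            Stat stat = new Stat();
            byte[] data = zk.getData("/a", false, stat);
            System.out.println("value=" + new String(data) + " zxid=" + stat.getMzxid());

            zk.close();
        }
    }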
3) Clients of the Observer will see fast reads, as all reads are served locally, and writes result in minimal network traffic as the number of messages required in the absence of the vote protocol is smaller - Reads from all the other nodes will also be local only because they are in-sync with the leader, no? And I did not get the part about writes.
It's important to note that this statement from the documentation is written in the context of an important use-case for observers: multiple data center deployments with higher network latency between different data centers. In this statement, "served locally" means served from a ZooKeeper server within the same data center as the client, so that it doesn't suffer from the longer latency of connecting to another data center. For full context, here is a copy of the full quote:
In fact, Observers may be used to talk to a ZooKeeper server from another data center. Clients of the Observer will see fast reads, as all reads are served locally, and writes result in minimal network traffic as the number of messages required in the absence of the vote protocol is smaller.

How are distributed queues architected?

What are architectural patterns/solutions that make distributed queues tick?
Please share for both ordered and non-ordered types.
You can think of the backend of a queue as a replicated database. (I am assuming the queues you are talking about are durable: when they accept a message, they guarantee at-least-once delivery.)
As a replicated database, the message queue backend uses a replication protocol to make sure the message is on at least N hosts before acknowledging receipt to the sender. Common replication protocols are 2PC, 3PC, and consensus protocols like Raft, Multi-Paxos, and Chain Replication.
To deliver a message to a receiver, you have to do almost the same replication, together with a message lease. The queue server reserves the message for a certain period of time; it sends the message to the receiver, and if/when the receiver acknowledges receipt of the message the server deletes it. Otherwise, the servers will resend the message to the next available receiver.
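Purely as an illustration of the lease idea, here is a toy, single-node, in-memory sketch (a real queue would replicate each of these state changes to N servers, as described above, before acting on them):

    import java.time.Instant;
    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Optional;
    import java.util.UUID;

    // Toy in-memory queue with message leases (a.k.a. visibility timeouts).
    class LeaseQueue {
        record Message(String id, String body) {}

        private final Deque<Message> ready = new ArrayDeque<>();
        private final Map<String, Instant> leased = new HashMap<>();   // id -> lease expiry
        private final Map<String, Message> inFlight = new HashMap<>(); // id -> message

        // In a durable queue this would be replicated to N hosts before returning.
        void send(String body) {
            ready.addLast(new Message(UUID.randomUUID().toString(), body));
        }

        // Reserve a message for a receiver for `leaseSeconds`; it stays invisible
        // to other receivers until it is acked or the lease expires.
        Optional<Message> receive(long leaseSeconds) {
            expireLeases();
            Message m = ready.pollFirst();
            if (m == null) return Optional.empty();
            leased.put(m.id(), Instant.now().plusSeconds(leaseSeconds));
            inFlight.put(m.id(), m);
            return Optional.of(m);
        }

        // Acknowledge: only now is the message actually deleted.
        void ack(String id) {
            leased.remove(id);
            inFlight.remove(id);
        }

        // Expired leases put the message back at the front for redelivery.
        private void expireLeases() {
            Instant now = Instant.now();
            leased.entrySet().removeIf(e -> {
                if (e.getValue().isBefore(now)) {
                    ready.addFirst(inFlight.remove(e.getKey()));
                    return true;
                }
                return false;
            });
        }
    }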
Some message queues stop there, others add lots of bells and whistles. SQS is one queue implementation that doesn't add many bells and whistles so that it can scale more. This allows it, for example, to shard the queue so that one SQS queue is actually made of many (even thousands) of these queues as described above. As an aside, I once heard one SQS developer ask another "What does 'ordering' mean when you are accepting millions of messages per second?"
That being said, some queues do provide strong ordering guarantees. (I have implemented a couple of these types of systems.) The cost of this is reduced ability to scale. To maintain ordering, the queue's complexity goes way up: the queue has to maintain an ordered log of all the messages and have the same ordering replicated across its servers. This is much, much harder than unordered replication. Ordered queue systems typically elect a master to maintain the ordering, and all messages are routed to the master. They also tend to use the more complex protocols for replication.
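Again only as a toy sketch (no networking or fault tolerance), the heart of such an ordered queue is a single leader that stamps every message with the next sequence number and pushes it to the replicas in that order; the class and names below are hypothetical:

    import java.util.ArrayList;
    import java.util.List;

    // Toy "leader" for an ordered log: a single writer assigns the global order.
    class OrderedLogLeader {
        record Entry(long seq, String payload) {}

        private long nextSeq = 0;
        private final List<List<Entry>> replicaLogs = new ArrayList<>(); // stand-ins for follower nodes

        OrderedLogLeader(int replicaCount) {
            for (int i = 0; i < replicaCount; i++) replicaLogs.add(new ArrayList<>());
        }

        // Every message is routed to the leader, which stamps it with the next
        // sequence number and pushes it to the replicas in that same order.
        // A real system would only acknowledge the append once a quorum of
        // replicas has confirmed it (Raft, Multi-Paxos, chain replication, etc.).
        synchronized long append(String payload) {
            Entry e = new Entry(nextSeq++, payload);
            for (List<Entry> log : replicaLogs) {
                log.add(e); // in reality: an RPC to each follower
            }
            return e.seq();
        }
    }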

How to discard some number of messages in RabbitMQ

I have a use case where I need to get data from a queue on an exchange that I don't have control over.
The use case is that I get messages from this queue constantly. I'm just wondering whether, in RabbitMQ or by using/writing a plugin, I can discard 90% of the messages before saving them to my local datastore. The reason for this is that I'm not capable of storing all the messages, only about 10% of them.
Obviously one way is to do this in my application, but I wonder if there is a way to do it at the RabbitMQ level.
Just wondering if you have any thoughts/solutions on this.
If you don't have control of the exchange, you're pretty much limited to doing it in your app.
You can bulk-reject messages using a nack - here's the help page:
http://www.rabbitmq.com/nack.html
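As a minimal sketch of the application-level approach (assuming the amqp-client 5.x Java API), this consumer keeps roughly every 10th message and acknowledges the rest so the broker discards them; the host, queue name and store() helper are placeholders:

    import java.util.concurrent.atomic.AtomicLong;
    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.DeliverCallback;

    public class SamplingConsumer {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // placeholder broker host
            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();

            AtomicLong counter = new AtomicLong();
            DeliverCallback handler = (consumerTag, delivery) -> {
                long n = counter.incrementAndGet();
                if (n % 10 == 0) {
                    // Keep roughly 10% of the traffic: store it, then ack.
                    store(delivery.getBody());
                }
                // Ack everything so the broker can forget the message either way.
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            };

            // autoAck=false so we control when messages are removed from the queue.
            channel.basicConsume("my-queue", false, handler, consumerTag -> {});
        }

        private static void store(byte[] body) {
            // Placeholder for writing to your local datastore.
        }
    }

Note that the broker still does the work of delivering every message to your consumer; this only avoids storing 90% of them.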
Per the AMQP spec, a RabbitMQ queue passes its messages to the connected consumers using a round-robin algorithm. So if your code is the sole consumer of the RabbitMQ queue and you want your application to ignore about 90% of received messages and process only the remaining 10%, then:
connect to the same queue using 10 different consumers simultaneously (all may be written in the same language or different ones, it does not matter), write your message processing logic in only one or two of them, and abandon the remaining 8 or 9 consumers (these will be used by RabbitMQ, and conceptually by us, to drain off about 90% of the messages).
You can simply consume the messages and do nothing with them. Using rabbitmqadmin is the easiest way to do this, as below:
rabbitmqadmin get queue=queuename requeue=false count=1

Must MSMQ queues be transactional?

I've just recently gotten into using Rebus, and noticed that it always creates transactional MSMQ queues, resulting in heavy traffic to the HDD (0.5-5 MB/sec). Is this intentional, and can something be done to avoid it?
It's correctly observed that Rebus (with its default MsmqMessageQueue transport) always creates transactional MSMQ queues. It will also refuse to work with non-transactional input queues, throwing an error at startup if you have created a non-transactional queue yourself and attempt to use it.
This is because the philosophy in Rebus revolves around the idea that messages are important data, just as important as the data in your SQL Server or whichever database you're using.
And yes, the way MSMQ implements durability is that messages are written to disk when they're sent, so that probably explains the disk activity you're seeing.
If you have a different opinion as to how you'd like your system to treat its messages, there's nothing that prevents you from replacing Rebus' transport with something that can work with non-transactional MSMQ. Keep in mind though, that all of Rebus' delivery guarantees will be void if you do so ;)
We made the very same observation; the annoying aspect is that we see 300-500 KB/sec written to disk even when there are no messages on the queue. It seems that merely polling the queue causes constant writes to disk.
Gian Maria.

HornetQ clustering topologies

I understand that in HornetQ you can do live-backup pairs type of clustering. I also noticed from the documentation that you can do load balancing between two or more nodes in a cluster. Are those the only two possible topologies? How would you implement a clustered queue pattern?
Thanks!
Let me answer this using two terminologies, starting with the core queues from HornetQ:
When you create a cluster connection, you are setting an address used to load balance HornetQ addresses and core queues (including their direct translation into JMS queues and JMS topics), for the addresses that are part of the cluster connection's basic address (usually the address is jms).
When you load balance a core queue, it will be load balanced among different nodes. That is, each node will get one message at a time.
When you have more than one queue on the same address, all the queues on the cluster will receive the messages. In case one of these queues exists on more than one node, the previous rule about each message being load balanced will also apply.
In JMS terms:
Topic subscriptions will receive all the messages sent to the topic. If a topic subscription name/ID is present on more than one node (say, the same clientID and subscriptionName on different nodes), the messages will be load balanced between them.
Queues will be load balanced across all the existing queues.
Notice that there is a forward-when-no-consumers setting, meaning that a node may not get a message if it doesn't have a consumer. You can configure that as well.
How would you implement a clustered queue pattern?
Tips for EAP 6.1 / HornetQ 2.3 to implement a distributed queue/topic:
Read the official doc for your version: e.g. for 2.3 https://docs.jboss.org/hornetq/2.3.0.Final/docs/user-manual/html/clusters.html
Note that the old setting clustered=true is deprecated in 2.3+; defining the cluster connection is enough, and the internal core bridges are created automatically.
Take the full-ha configuration as a baseline, or make sure you have JGroups properly set up. This post goes deeply into the subject: https://developer.jboss.org/thread/253574
Without it, no errors are shown, the core bridge connection is
established... but messages are not being distributed, again no errors
or warnings at all...
Make sure the security domain, security realms, users, passwords, and roles are properly set.
E.g. I confused the domain id ('other') with the realm id
('ApplicationRealm') and got auth errors, but the errors were
generic, so I wasted time checking users, passwords, roles... until I
eventually found out.
Debug by enabling debug logging (logger.org.hornetq.level=DEBUG).