Artemis Bridges/Federation - activemq-artemis

Looking to understand the differences between the various options for moving messages, i.e. diverts, bridges & federation. As I understand it, diverts are for within the same broker and can be mixed with bridges. A bridge, on the other hand, can be used to move messages to a different broker instance (a JMS-compliant one).
Federation, from what I've read, looks similar to bridging, in that messages can be moved/pulled from an upstream. Quick guidance on when to use which feature would be helpful.
Thanks a lot for your help!

Bridges are the most basic way to move messages from one broker to another. However, each bridge can only move messages from one queue to one address, and each bridge must be created manually in broker.xml or programmatically via the management interface. Many messaging use-cases involve dynamically created addresses and queues, so manually creating bridges is not feasible. Furthermore, many messaging use-cases involve lots of addresses and queues, and manually creating corresponding bridges would be undesirable.
Federation uses bridges behind the scenes, but it allows configuring one element in broker.xml to apply to lots of addresses and queues (even those created dynamically). Federation also allows upstream & downstream configurations whereas bridges can only be configured to "push" messages from one broker to another.
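To make the difference concrete, here is a rough sketch of what each looks like in broker.xml. The element layout follows the Artemis documentation, but the connector, queue, and policy names are made up for illustration, so verify against the docs for your broker version:

    <connectors>
       <connector name="remote-broker">tcp://remote-host:61616</connector>
    </connectors>

    <!-- a core bridge: one source queue, one target address, created by hand -->
    <bridges>
       <bridge name="orders-bridge">
          <queue-name>orders.queue</queue-name>
          <forwarding-address>orders</forwarding-address>
          <static-connectors>
             <connector-ref>remote-broker</connector-ref>
          </static-connectors>
       </bridge>
    </bridges>

    <!-- federation: one policy covers every matching queue, even ones created later -->
    <federations>
       <federation name="my-federation">
          <upstream name="upstream-broker">
             <static-connectors>
                <connector-ref>remote-broker</connector-ref>
             </static-connectors>
             <policy ref="all-queues"/>
          </upstream>
          <queue-policy name="all-queues">
             <include queue-match="#" address-match="#"/>
          </queue-policy>
       </federation>
    </federations>

Note how the bridge names one specific queue, while the federation's queue-policy uses wildcard matches that keep applying as queues come and go.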

Related

What are the options of routing HTTP connections to one specific instance out of many instances behind a load balancer?

Assume there is a system that accepts millions of simultaneous WebSocket connections from client applications. I was wondering if there is a way to route WebSocket connections to a specific instance behind a load balancer (or IP/Domain/etc) if clients provide some form of metadata, such as hash key, instance name, etc.
For instance, let's say each WebSocket client of the above system will always belong to a group (e.g. max group size of 100), and it will attempt to communicate with 99 other clients using the above system as a message gateway.
So the system's responsibility is to relay messages sent from clients in a group to the other 99 clients in the same group. Clients won't ever need to communicate with clients who belong to different groups.
Of course, one way to tackle this problem is to use a pub/sub system, such that regardless of which instance clients are connected to, the server can simply publish the message to the pub/sub system with a group identifier, and other clients can subscribe to the messages with that group identifier.
However, the pub/sub system can potentially encounter scaling challenges, excessive resource usage (a single message getting published to thousands of instances), management overhead, increased latency, cost, and so on.
If it is possible to guarantee that the WebSocket clients in a group will all be connected to the same instance behind the LB, we can skip the pub/sub system and make things simpler, lower latency, and so on.
Would this be something that is possible to do, and if it isn't, what would be the best option?
(I am using Kubernetes in one of the cloud service providers if that matters.)
Routing in HTTP is generally based on the hostname and/or URL path, and sometimes, to a lesser degree, on other headers like cookies. But in this case it would mean that each group should have its own unique URL.
That part is easy; what I think you're really asking is "given arbitrary URLs, how can I get consistent routing?", which is much, much more complicated. The base concept is "consistent hashing": you hash the URL and use that to pick which endpoint to talk to. But then how do you deal with adding or removing replicas without scrambling the mapping entirely? That usually means using a hash ring and assigning portions of the hash space to specific replicas. Unfortunately this is the point where off-the-shelf tools aren't enough. These kinds of systems require deep knowledge of your protocol and system specifics, so you'll probably need to rig this up yourself.
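To illustrate the base concept (a minimal sketch, not any particular load balancer's implementation), here is a consistent-hash ring in Java; the virtual-node count, instance names, and CRC32 hash are placeholder choices:

    import java.nio.charset.StandardCharsets;
    import java.util.SortedMap;
    import java.util.TreeMap;
    import java.util.zip.CRC32;

    // A minimal consistent-hash ring: group keys map to backend instances,
    // and adding/removing an instance only remaps a small slice of keys.
    public class HashRing {
        private final TreeMap<Long, String> ring = new TreeMap<>();
        private static final int VNODES = 100; // virtual nodes smooth the distribution

        public void addInstance(String instance) {
            for (int i = 0; i < VNODES; i++) {
                ring.put(hash(instance + "#" + i), instance);
            }
        }

        public void removeInstance(String instance) {
            for (int i = 0; i < VNODES; i++) {
                ring.remove(hash(instance + "#" + i));
            }
        }

        // Pick the first instance clockwise from the key's position on the ring.
        public String route(String groupKey) {
            SortedMap<Long, String> tail = ring.tailMap(hash(groupKey));
            return tail.isEmpty() ? ring.firstEntry().getValue() : tail.get(tail.firstKey());
        }

        private static long hash(String s) {
            CRC32 crc = new CRC32(); // a stronger hash (e.g. murmur3) is better in practice
            crc.update(s.getBytes(StandardCharsets.UTF_8));
            return crc.getValue();
        }
    }

With this, route("group-42") always lands on the same backend, and removing one backend only remaps the groups that were mapped to it, roughly 1/N of the keys.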

architecture pattern for microservices

I have a microservices architecture whose logs have to be sent to a remote Kafka topic.
Next to it, the consumer of this topic will send the logs to an ELK stack (owned by another team).
I want to have a dedicated microservice (fwk-proxy-elasticsearch) whose responsibility is to collect the logs from the other ones and send them to the remote Kafka topic.
What's the best protocol to dispatch all the logs aggregated from my microservices to the fwk-proxy-elasticsearch microservice?
I don't want this pattern to duplicate the security configuration of the remote Kafka topic; I want to centralize it in a single place.
Can I use the Vert.x event bus for that? Or is Kafka better? Or some other tool?
Can I use Vert.x to send messages from JVM to JVM?
Moreover, in a microservice architecture, is it a good pattern to centralize a use case in a dedicated microservice (a remote HTTP connection, for example)?
From my point of view, it allows business microservices to focus on a business issue and not worry about the protocol over which the result has to be sent.
Thanks!
I believe you can use both the Vert.x event bus and Kafka to propagate the logs; there are pros and cons to each approach.
While I understand the reasoning behind this decision, I would still consider a dedicated solution built for this purpose, like Fluentd, which is able to aggregate the logs and push them into multiple sources (including Kafka, via the dedicated plugin). I'm sure there are other similar solutions.
There are a couple of important benefits that I see if you use a dedicated solution, instead of building it yourself:
The level of configurability, which is definitely useful in the future (with a self-built solution, you need to write code each time you want to build something new)
The number of destinations where you can export the logs
Support for a hybrid architecture - with a few config updates, you will be able to grab logs from non-JVM microservices
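For completeness, here is a minimal sketch of the self-built variant the question describes: business services publish log lines to a local Vert.x event bus address, and only the fwk-proxy-elasticsearch service holds the Kafka configuration. The "logs" address, topic name, and bootstrap server are invented for the example:

    import io.vertx.core.AbstractVerticle;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.util.Properties;

    // The proxy service: consumes log lines from the event bus and
    // forwards them to the remote Kafka topic.
    public class LogProxyVerticle extends AbstractVerticle {

        private KafkaProducer<String, String> producer;

        @Override
        public void start() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "remote-kafka:9092"); // assumed address
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            // the Kafka security settings (SSL/SASL) live here and only here
            producer = new KafkaProducer<>(props);

            // business services publish plain strings to the "logs" address
            vertx.eventBus().<String>consumer("logs", message ->
                    producer.send(new ProducerRecord<>("remote-log-topic", message.body())));
        }

        @Override
        public void stop() {
            producer.close();
        }
    }

Note that for JVM-to-JVM delivery the Vert.x event bus must run in clustered mode (e.g. with the Hazelcast cluster manager); within a single JVM the local bus is enough.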

What's the use of ClientQuotaCallback in kafka-clients?

I found this line in its Javadoc: "Quota callback interface for brokers that enables customization of client quota computation". But it doesn't have any child class. Why? And I googled it but can't find an example.
In Kafka, it was decided to expose all of the brokers' pluggable APIs as Java interfaces, even though the server side is actually written in Scala. For that reason, there are a few interfaces in kafka-clients that are not related to the clients.
Anything under org.apache.kafka.server are pluggable APIs for the brokers. These can be used to customize some behaviours on the broker side:
http://kafka.apache.org/20/javadoc/org/apache/kafka/server/policy/package-summary.html
http://kafka.apache.org/20/javadoc/org/apache/kafka/server/quota/package-summary.html
For example, ClientQuotaCallback allows you to customize the way quotas are computed by Kafka brokers: you can build quotas for groups or have quotas scale when topics/partitions are created. KIP-257 details exactly how this all works.
Of course, for these to work you need to build implementations of these interfaces and put them on the classpath of your brokers. They are not something that can be used by clients directly.
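As an illustration of what such an implementation might look like (a toy sketch following the KIP-257 interface, not production code), here is a callback that gives every authenticated principal one shared, fixed quota bucket regardless of client.id; the limit value is made up:

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.security.auth.KafkaPrincipal;
    import org.apache.kafka.server.quota.ClientQuotaCallback;
    import org.apache.kafka.server.quota.ClientQuotaEntity;
    import org.apache.kafka.server.quota.ClientQuotaType;

    // Gives every authenticated principal a fixed quota, ignoring client.id.
    public class FixedPrincipalQuotaCallback implements ClientQuotaCallback {

        private static final double LIMIT = 1_000_000.0; // bytes/sec, made-up number

        @Override
        public Map<String, String> quotaMetricTags(ClientQuotaType quotaType,
                                                   KafkaPrincipal principal, String clientId) {
            // All connections of the same principal share one quota bucket.
            Map<String, String> tags = new HashMap<>();
            tags.put("user", principal.getName());
            return tags;
        }

        @Override
        public Double quotaLimit(ClientQuotaType quotaType, Map<String, String> metricTags) {
            return LIMIT;
        }

        @Override
        public void updateQuota(ClientQuotaType quotaType, ClientQuotaEntity entity,
                                double newValue) { }

        @Override
        public void removeQuota(ClientQuotaType quotaType, ClientQuotaEntity entity) { }

        @Override
        public boolean quotaResetRequired(ClientQuotaType quotaType) {
            return false; // our limits never change at runtime
        }

        @Override
        public boolean updateClusterMetadata(Cluster cluster) {
            return false; // we don't use partition metadata
        }

        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public void close() { }
    }

You would then package this class into a jar, drop it on the brokers' classpath, and point the broker config client.quota.callback.class at it.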

Is Kafka suitable for running a public API?

I have an event stream that I want to publish. It's partitioned into topics, continually updates, will need to scale horizontally (and not having a SPOF is nice), and may require replaying old events in certain circumstances. These are all features that seem to match Kafka's capabilities.
I want to publish this to the world through a public API that anyone can connect to and get events. Is Kafka a suitable technology for exposing as a public API?
I've read the Documentation page, but not gone any deeper yet. ACLs seem to be sensible.
My concerns
Consumers will be anywhere in the world. I can't see that being a problem given Kafka's architecture. The rate of messages probably won't be more than 10 per second.
Is integration with ZooKeeper an issue?
Are there any arguments against letting subscriber clients connect that I don't control?
Are there any arguments against letting subscriber clients connect that I don't control?
One of the issues that I would consider is possible group.id collisions.
Let's say that you have one single topic to be used by the world for consuming your messages.
Now if one of your clients has a multi-node system and wants to avoid reading the same message twice, they would set the same group.id on both nodes, forming a consumer group.
But, what if someone else in the world uses the same group.id? They would affect the first client, causing it to lose messages. There seems to be no security at that level.
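To make that concrete, here is the consumer any stranger could run against your endpoint (the hostname and topic are made up): group.id is just a client-chosen string, so two unrelated parties picking "analytics" would silently form one consumer group and split the partitions between them.

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PublicFeedConsumer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "public-feed.example.com:9092");
            // Nothing stops a stranger from picking the same string here, which
            // would merge their consumers into the same group as yours.
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "analytics");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("public-events"));
                while (true) {
                    consumer.poll(Duration.ofSeconds(1))
                            .forEach(r -> System.out.println(r.value()));
                }
            }
        }
    }

One mitigation is Kafka's ACLs on the group resource, so that each principal is only authorized to use its own group names.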

Implementing a message bus using ZeroMQ

I have to develop a message bus for processes to send and receive messages from each other. Currently we are running on Linux, with a view to porting to other platforms later.
For this, I am using ZeroMQ over TCP. The pattern is PUB-SUB with a forwarder. My bus runs as a separate process, and all clients connect to the SUB port to receive messages and to the PUB port to send messages. Each process subscribes to messages by a unique tag. A send call from a process sends messages to all; a receive call fetches, for that process, the messages marked with its tag. This is working fine.
Now I need to wrap the ZeroMQ stuff. My clients only need to supply a unique tag. I need to maintain a global list of tags vs. ZeroMQ context and socket details. When a client calls initialize_comms("name"), the bus needs to check whether this name is unique and create the ZeroMQ contexts and sockets. Similarly, when a client calls receive("name"), the bus needs to fetch messages with that tag.
To summarize the problems I am facing;
Is there any way to achieve this using the facilities provided by ZeroMQ?
Is ZeroMQ the right tool for this, or should I look for something like nanomsg?
Is PUB-SUB with forwarder the right pattern for this?
Or, am I missing something here?
Answers
Yes, ZeroMQ is capable of serving this need
Yes. ZeroMQ is the right tool (rather a powerful toolbox of low-latency components) for this. While nanomsg has a straight primitive for a bus, the same core distributed logic can be built within the ZeroMQ framework.
Yes & No. PUB-SUB as given above may serve to emulate the "shout-cast"-to-bus, building on the SUB-side effect of using subscription key(s); see the sketch at the end of this answer. The whole rest of the logic has to be re-thought and designed so that the whole scope of the fabrication meets your plans (ref. below). Also kindly bear in mind that initial versions of ZeroMQ performed the PUB/SUB "subscription filtering" of the incoming stream of messages on the receiver side, so massive designs shall be checked against traffic volumes / risk of flooding / processing inefficiency at massive scale...
Yes. ZeroMQ is rather a well-tuned foundation of primitive elements (as far as the architecture is discussed, not the power & performance thereof) for building more clever, more robust & almost-linearly-scaleable Formal Communication Patterns. Do not get stuck on the PUB/SUB or PAIR primitives when sketching the architecture. Any design will remain poor if one forgets where the true power comes from.
A good place to start the next step forward towards a scalable & fault-resilient bus
Thus the best next step one may take is, IMHO, to get a bit more global view, which may sound complicated for the first few things one tries to code with ZeroMQ; at the least, jump to page 265 of Code Connected, Volume 1, if you have not already read it step by step to that point.
The fastest learning curve would be to first take a look at Fig. 60 Republishing Updates and Fig. 62 HA Clone Server for a possible high-availability approach, and then go back to the roots, elements and details.
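To show the subscription-key mechanics concretely, here is a minimal sketch in Java with JeroMQ (the question's bus uses the C-level ZeroMQ API, but the socket semantics are identical); the ports and tags are invented:

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    // Minimal tag-filtered PUB/SUB against a forwarder/proxy in the middle.
    public class BusClient {
        public static void main(String[] args) throws InterruptedException {
            try (ZContext ctx = new ZContext()) {
                // Send side: everything goes to the bus's inbound port.
                ZMQ.Socket pub = ctx.createSocket(SocketType.PUB);
                pub.connect("tcp://localhost:6000");   // assumed bus port

                // Receive side: subscribe only to this process's unique tag.
                ZMQ.Socket sub = ctx.createSocket(SocketType.SUB);
                sub.connect("tcp://localhost:6001");   // assumed bus port
                sub.subscribe("proc-B".getBytes(ZMQ.CHARSET));

                Thread.sleep(500); // crude slow-joiner workaround, demo only

                pub.send("proc-B hello from proc-A");  // tag is the message prefix
                String msg = sub.recvStr();            // only "proc-B ..." arrives here
                System.out.println(msg);
            }
        }
    }

In real code the send must wait until the subscription has propagated (the classic PUB/SUB slow-joiner symptom), otherwise early messages are silently dropped.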
Here is what I ended up designing, if anyone is interested. Thanks everyone for the tips and pointers.
I have a message bus implemented using ZeroMQ (and CZMQ) running as a separate process.
The pattern is PUBLISHER-SUBSCRIBER with a LISTENER. They are connected using a PROXY.
In addition, there is a ROUTER running in a separate, newly forked thread.
These three endpoints run on TCP and are bound to predefined ports which the clients know of.
PUBLISHER accepts all messages from clients.
SUBSCRIBER sends messages with a unique tag to the clients who have subscribed to that tag.
LISTENER listens to all messages passing through; currently, this is for logging and testing purposes.
ROUTER provides a separate comms channel to clients. Messages such as control commands are directed here so that they will not get passed downstream.
Clients connect to,
PUBLISHER to send messages.
SUBSCRIBER to receive messages. Subscription is using unique tags.
ROUTER to send commands (check tag uniqueness etc.)
I am still doing implementation so there may be unseen problems, but right now it works fine. Also, there may be a more elegant way but I didn't want to throw away the PUB-SUB thing I had built.
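For anyone interested in the shape of the bus process itself, here is a minimal sketch of the forwarding core, an XSUB/XPUB proxy with a capture socket feeding the LISTENER. It is shown in Java with JeroMQ rather than the C API used in the actual implementation, and the port numbers are invented:

    import org.zeromq.SocketType;
    import org.zeromq.ZContext;
    import org.zeromq.ZMQ;

    // The bus process: an XSUB/XPUB proxy glues clients' PUB sockets to
    // clients' SUB sockets, and a capture socket feeds the LISTENER.
    public class MessageBus {
        public static void main(String[] args) {
            try (ZContext ctx = new ZContext()) {
                ZMQ.Socket frontend = ctx.createSocket(SocketType.XSUB);
                frontend.bind("tcp://*:6000");   // clients' PUB sockets connect here

                ZMQ.Socket backend = ctx.createSocket(SocketType.XPUB);
                backend.bind("tcp://*:6001");    // clients' SUB sockets connect here

                ZMQ.Socket capture = ctx.createSocket(SocketType.PUB);
                capture.bind("tcp://*:6002");    // the LISTENER taps the traffic here

                // Blocks forever, forwarding messages and subscriptions both
                // ways and copying every message to the capture socket.
                ZMQ.proxy(frontend, backend, capture);
            }
        }
    }

The ROUTER control channel described above would be bound on yet another port and polled from its own thread, separate from this forwarding loop.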