A customer wants to exchange data between their application and ours via ActiveMQ, so we want to create an interface specification document describing the settings and properties that both applications must use in order to communicate. We don't know which programming language or API the customer will use; if the specification is incomplete, they might implicitly use settings that we don't expect.
So I'm wondering which settings must be the same on both sides, and which settings can be decided by each application on its own. This is what I have so far:
Must be specified in document:
connector type (openwire, stomp, ...)
connector settings (host name where broker runs, TCP port, user name, password)
message type (TextMessage, BytesMessage...)
payload details (XML with XSDs, JSON with schema, ...)
message encoding (UTF-8), for text payload
use queues, or topics, or durable topics
queue names
is any kind of request/response protocol being used
use single queue for requests and responses (with selectors being used to get correct messages), or use separate queues for requests and responses
how to transfer correlation ID used for correlating requests and responses
message expiration
Must not be specified in document:
ActiveMQ broker version (all versions are compatible, right?)
message compression (it should be transparent?)
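As an illustration of the request/response items in the first list (a single shared reply queue plus a correlation ID), a consumer would typically use a JMS message selector to pick out only its own responses. A minimal sketch, assuming a hypothetical helper (not part of any specific API) that builds the selector string:

```java
// Sketch: building a JMS message selector so a requester sharing a single
// reply queue only receives its own responses. The helper is hypothetical,
// not from any specific API.
public class SelectorExample {
    // JMS selectors use SQL-92 string literals; single quotes inside the
    // value must be doubled to stay valid.
    public static String responseSelector(String correlationId) {
        return "JMSCorrelationID = '" + correlationId.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(responseSelector("req-42")); // JMSCorrelationID = 'req-42'
    }
}
```

This selector string would then be passed when creating the consumer on the shared reply queue, which is exactly the kind of detail both sides need to agree on in the specification.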
What did I miss? Which things should be stated in such a document to ensure that two applications can communicate via ActiveMQ?
What did I miss?
You missed message headers. These can be broken into two categories:
Built-in (JMS) headers
Custom headers
Examples of the built-in headers are things such as JMSMessageID, JMSXGroupID, etc. In some cases, your interface definition will need to include details of whether and how these values will be set. For example, if messages need to be grouped, then any message producer or consumer using the definition will need to be aware of this.
Similarly, any custom headers (common uses include bespoke ordering, source-system identification, authorization tokens, etc.) attached to the messages need to be part of any interface definition.
In fact, I would argue that the interface definition only needs to include two things:
a schema definition for the message body, and
any headers + if they are required or optional
Everything else you have listed above is either a deployment or a management concern.
For example, whether a consumer or producer should connect to a queue or topic is a management concern, not an interface concern. The address of the queue/topic is a deployment concern, not an interface concern.
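The "schema plus headers" view of the interface definition can be sketched as a simple validation step on the consuming side. A minimal sketch, assuming hypothetical header names (nothing here comes from a specific API):

```java
import java.util.*;

// Sketch: checking an incoming message's headers against an interface
// definition that lists required header names. All header names here are
// hypothetical examples.
public class HeaderContract {
    public static List<String> missingRequired(Map<String, Object> headers,
                                               Set<String> required) {
        List<String> missing = new ArrayList<>();
        for (String name : required) {
            if (!headers.containsKey(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Set<String> required = Set.of("SourceSystem", "AuthToken");
        Map<String, Object> headers = Map.of("SourceSystem", "billing");
        // "AuthToken" is absent, so it is reported as missing.
        System.out.println(missingRequired(headers, required));
    }
}
```

Optional headers would simply be left out of the required set; the interface document's job is to say which is which.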
Related
I'm developing a router (events proxy) application with Spring Cloud Stream over Kafka, using the functional paradigm. The application consumes from a constant input topic, maps and filters the message, and then should send it to some topic according to some input fields (only a single message at a time, not multiple results).
Is the best way to do it by setting the spring.cloud.stream.sendto.destination header for the output message?
and if so, how should I set the bindings for the producer?
You can also use StreamBridge
With regard to the binding configuration. . .
If they are truly dynamic, where you don't know the name of the destination (e.g., it may come in a message header), there is nothing you can do with regard to configuring it.
If they are semi-dynamic, where you do know the name(s) and it's a limited set of names, then you can configure them as any other binding.
For example, let's say you are sending to destination foo; then you can use spring.cloud.stream.bindings.foo.....
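The semi-dynamic case above might look like this in application.properties (a sketch, not a definitive configuration: it assumes a function bean named "route" and a known destination called "foo", with the naming conventions of the Spring Cloud Stream 3.x functional model):

```properties
# Assumed function bean name "route"; its input binding follows the
# <functionName>-in-<index> convention.
spring.cloud.function.definition=route
spring.cloud.stream.bindings.route-in-0.destination=input-topic

# A semi-dynamic target from a known, limited set can be bound like any
# other binding:
spring.cloud.stream.bindings.foo.destination=foo
```

Truly dynamic targets resolved from the spring.cloud.stream.sendto.destination header (or sent via StreamBridge) would have no static binding entry at all.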
We're trying to build a platform using microservices that communicate async over kafka.
It would seem natural, the way I understood it, to have one topic per aggregate type in each microservice. So a microservice implementing user registration would publish user-related events into the topic "users".
Other microservices would listen to events created by the "users" microservice, implement their own logic, and fill their DBs accordingly. The problem is that other microservices might not be interested in all the events generated by the user microservice, but rather in a subset of these events, like UserCreated only (without UsernameChanged, for example).
Using RabbitMQ this is easy, since event handlers are invoked based on message type.
Did you ever implement message based routing/filtering over kafka?
Should we consume all the messages, deserialize them, and have the consumer ignore the unneeded ones? (sounds like overhead)
Should we forward these topics to Storm and redirect the messages to consumer-targeted topics? (sounds like overkill and unscalable)
Using partitions doesn't seem logical as a routing mechanism
Use a different topic for each of the standard object actions: Create, Read, Update, and Delete, with a naming convention like "UserCreated", "UserRead", etc. If you think about it, you will likely have a different schema for the objects in each. Created will require a valid object; Read will require some kind of filter; Update you might want to handle incremental updates (add 10 to a specific field, etc).
If the different actions have different schemas, it makes deserialization difficult. If you're in a loosey-goosey language like JavaScript, OK -- no big deal. But in a strictly typed language like Scala, having different schemas in the same topic is problematic.
It'll also solve your problem -- you can listen for exactly the types of actions you want, no more, no less.
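The naming convention suggested above can be sketched as a small helper (hypothetical, not part of any Kafka API) that derives the topic name from the entity and the action:

```java
// Sketch of the per-action topic naming convention ("UserCreated",
// "UserRead", ...). The enum and helper are hypothetical examples, not
// part of any Kafka API.
public class TopicNames {
    enum Action { CREATED, READ, UPDATED, DELETED }

    public static String topicFor(String entity, Action action) {
        String name = action.name();
        // Convert CREATED -> Created, READ -> Read, etc.
        String suffix = name.charAt(0) + name.substring(1).toLowerCase();
        return entity + suffix;
    }

    public static void main(String[] args) {
        System.out.println(topicFor("User", Action.CREATED)); // UserCreated
        System.out.println(topicFor("User", Action.READ));    // UserRead
    }
}
```

A consumer interested only in creations would then subscribe to "UserCreated" and never see "UsernameChanged"-style events at all.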
The scenario is publisher/subscriber, and I am looking for a solution that can deliver one message generated by ONE producer to MULTIPLE consumers in real time. The more lightweight the solution that handles this scenario, the better!
Among AMQP servers I've only checked out RabbitMQ, and when using a RabbitMQ server for the pub/sub pattern, each consumer must declare an anonymous, private queue and bind it to a fanout exchange. So with a thousand users consuming one message in real time, there would be a thousand or so anonymous queues handled by RabbitMQ.
But I really don't like that approach. It would be ideal if RabbitMQ could handle this pub/sub scenario with one queue, one message, and many consumers listening on that one queue!
What I want to ask is: which AMQP server or other type of solution (anything similar, including XMPP servers, Apache Kafka, ...) handles the pub/sub pattern/scenario better and more efficiently than RabbitMQ, while consuming fewer server resources?
Preferences, in order of interest:
in the case of an AMQP-enabled server, handling the pub/sub scenario with only ONE queue, or as few queues as possible (as explained)
handling thousands of consumers in a lightweight manner, consuming fewer server resources than other pub/sub solutions
clustering, tolerating failure of nodes
many language bindings (Python and Java at least)
easy to use and administer
I know my question may be VERY general, but I'd like to hear ideas and suggestions for the pub/sub case.
Thanks.
In general, for RabbitMQ, if you put the user in the routing key, you should be able to use a single exchange and then a small number of queues (even a single one if you wanted, but you could divide them up by server or similar if that makes sense given your setup).
If you don't need guaranteed order (as one would for, say, guaranteeing that FK constraints wouldn't get hit for a sequence of changes to various SQL database tables), then there's no reason you can't have a bunch of consumers drawing from a single queue.
If you want a broadcast-message type of scenario, then that could perhaps be handled a bit differently. Instead of the single user in the routing key, which you could use for non-broadcast-type messages, have a special user type, say, __broadcast__, that no user could actually have, and have the users to broadcast to stored in the payload of the message along with the message itself.
Your message processing code could then take care of depositing that message in the database (or whatever the end destination is) across all of those users.
Edit in response to comment from OP:
So the routing key might look something like this message.[user] where [user] could be the actual user if it were a point-to-point message, and a special __broadcast__ user (or similar user name that an actual user would not be allowed to register) which would indicate a broadcast style message.
You could then place the users to which the message should be delivered in the payload of the message, and then that message content (which would also be in the payload) could be delivered to each user. The mechanism for doing that would depend on what your end destination is. i.e. do the messages end up getting stored in Postgres, or Mongo DB or similar?
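The routing-key scheme described above can be sketched as follows (a sketch under the answer's assumptions; the helper names and the "__broadcast__" pseudo-user are illustrative, not part of any RabbitMQ API):

```java
// Sketch of the routing-key scheme from the answer: "message.<user>" for
// point-to-point delivery, and a reserved "__broadcast__" pseudo-user
// whose real recipients travel in the message payload. Helper names are
// hypothetical.
public class RoutingKeys {
    static final String BROADCAST = "__broadcast__";

    public static String pointToPoint(String user) {
        return "message." + user;
    }

    public static String broadcast() {
        return "message." + BROADCAST;
    }

    public static boolean isBroadcast(String routingKey) {
        return routingKey.equals("message." + BROADCAST);
    }

    public static void main(String[] args) {
        System.out.println(pointToPoint("alice")); // message.alice
        System.out.println(broadcast());           // message.__broadcast__
    }
}
```

The consuming side would check isBroadcast on the routing key and, in the broadcast case, read the recipient list out of the payload before delivering.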
I understand that in HornetQ you can do live-backup pairs type of clustering. I also noticed from the documentation that you can do load balancing between two or more nodes in a cluster. Are those the only two possible topologies? How would you implement a clustered queue pattern?
Thanks!
Let me answer this using two terminologies. First, in terms of HornetQ core queues:
When you create a cluster connection, you are setting an address used to load balance HornetQ addresses and core queues (including their direct translation into JMS queues and JMS topics), for the addresses that fall under the cluster connection's base address (usually the address is jms).
When you load balance a core queue, it will be load balanced among the different nodes; that is, each node gets one message at a time.
When you have more than one queue on the same address, all the queues on the cluster will receive the messages. If one of these queues exists on more than one node, then the previous rule about each message being load balanced also applies.
In JMS terms:
Topic subscriptions will receive all the messages sent to the topic. If a topic subscription name/ID is present on more than one node (say, the same clientID and subscriptionName on different nodes), those subscriptions will be load balanced.
Queues will be load balanced across all the existing queues.
Notice that there is a forward-when-no-consumers setting, meaning a node may not get a message if it has no consumer; you can configure that behaviour as well.
How would you implement a clustered queue pattern?
Tips for EAP 6.1 / HornetQ 2.3. To implement a distributed queue/topic:
Read the official doc for your version: e.g. for 2.3 https://docs.jboss.org/hornetq/2.3.0.Final/docs/user-manual/html/clusters.html
Note that the old clustered=true setting is deprecated in 2.3+; defining the cluster connection is enough. Check that the internal core bridges are created automatically.
Take the full-ha configuration as a baseline, or make sure you have JGroups properly set up. This post goes deeply into the subject: https://developer.jboss.org/thread/253574
Without it, no errors are shown and the core bridge connection is established... but messages are not distributed; again, no errors or warnings at all.
Make sure the security domain, security realms, users, passwords, and roles are properly set.
E.g. I confused the domain id ('other') with the realm id ('ApplicationRealm') and got auth errors, but the errors were generic, so I wasted time checking users, passwords, roles... until I eventually found out.
Debug by enabling DEBUG logging (logger.org.hornetq.level=DEBUG).
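A minimal cluster-connection fragment along the lines the tips describe might look like this in hornetq-configuration.xml (a sketch for HornetQ 2.3-style configuration; the connector and discovery-group names are placeholders, not values from the original post):

```xml
<!-- Sketch of a minimal cluster connection; connector-ref and
     discovery-group names are placeholders. -->
<cluster-connections>
   <cluster-connection name="my-cluster">
      <address>jms</address>
      <connector-ref>netty</connector-ref>
      <discovery-group-ref discovery-group-name="dg-group1"/>
   </cluster-connection>
</cluster-connections>
```

The address element here is the base address discussed above (usually jms), under which queues and topics are load balanced across the cluster.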
I can't really see a difference between a multicasting-router and a static-recipient-list-router. Why would I use one over the other?
According to the Mule 2.x user guide:
Recipient List
the Recipient list router can be used to send the same event to multiple endpoints over the same endpoint or to implement routing-slip behaviour where the next destination for the event is determined from the event properties or payload. Mule provides an abstract Recipient list implementation, org.mule.routing.outbound.AbstractRecipientList, that provides a thread-safe base for specialised implementations. Mule also provides a Static recipient list that takes a configured list of endpoints from the current event or statically declared on the endpoint.
<outbound>
  <static-recipient-list-router>
    <payload-type-filter expectedType="javax.jms.Message"/>
    <recipients>
      <spring:value>jms://orders.queue</spring:value>
      <spring:value>jms://tracking.queue</spring:value>
    </recipients>
  </static-recipient-list-router>
</outbound>
Multicasting Router
The Multicasting router can be used to send the same event over multiple endpoints. When using this router, care must be taken to configure the correct transformers on the endpoints to handle the event source type.
<outbound>
  <multicasting-router>
    <jms:endpoint queue="test.queue" transformer-refs="StringToJmsMessage"/>
    <http:endpoint host="10.192.111.11" transformer-refs="StringToHttpClientRequest"/>
    <tcp:endpoint host="10.192.111.12" transformer-refs="StringToByteArray"/>
    <payload-type-filter expectedType="java.lang.String"/>
  </multicasting-router>
</outbound>
Remember that care should be taken to ensure that the message being routed is transformed to a format that the endpoint understands.
Straight from the horse's mouth (Mule in Action, by David Dossot and John D'Emic, pp. 98-100):
The static-recipient-list router lets you simultaneously send the same message to multiple endpoints. You'll usually use a static recipient list when each endpoint is using the same transport. This is often the case with VM and JMS endpoints.
Use static recipient lists when sending the same message to endpoints using identical transports
The multicasting router is similar to the static recipient list in that it simultaneously sends the same message across a set of outbound endpoints. The difference is that the multicasting router is used when the endpoint list contains different types of transports.
Use the multicasting router when sending the same message to endpoints using different transports
This is how I understand these:
The static-recipient-list router will send the payload to each recipient in the order in which they are listed. This gives you the ability to modify the payload before proceeding to the next endpoint, and to stop processing in the event of an error.
The multicasting router sends the same payload to all endpoints at the same time. You will not be able to change the payload per endpoint, and you will not be able to stop the other endpoints from processing if one of them fails.
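The two behaviours described above can be contrasted in a small sketch (the helper names are hypothetical and model only the dispatch semantics, not Mule's actual API):

```java
import java.util.*;
import java.util.function.Function;

// Sketch contrasting the two dispatch styles described above. Helper
// names are hypothetical; this models semantics, not Mule's API.
public class RouterSemantics {
    // Recipient-list style: each recipient sees the previous recipient's
    // output, and a thrown exception stops the chain.
    public static String sendSequentially(String payload,
                                          List<Function<String, String>> recipients) {
        String current = payload;
        for (Function<String, String> r : recipients) {
            current = r.apply(current);
        }
        return current;
    }

    // Multicast style: every endpoint receives the same original payload,
    // and one endpoint failing does not stop the others.
    public static List<String> multicast(String payload,
                                         List<Function<String, String>> endpoints) {
        List<String> results = new ArrayList<>();
        for (Function<String, String> e : endpoints) {
            try {
                results.add(e.apply(payload));
            } catch (RuntimeException ex) {
                results.add(null); // record the failure and keep going
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Function<String, String>> steps =
                List.of(s -> s + "-a", s -> s + "-b");
        System.out.println(sendSequentially("msg", steps)); // msg-a-b
        System.out.println(multicast("msg", steps));        // [msg-a, msg-b]
    }
}
```

The sequential variant threads the payload through each step, matching the "modify before the next endpoint" behaviour; the multicast variant hands every endpoint the untouched original.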