ejabberd in a microservice network - xmpp

I'd like to use ejabberd / MongooseIM in a microservice network. XMPP should be our chat protocol alongside a REST API network. I want to send messages arriving at the XMPP server downstream to worker services. Has anybody done this, or could anyone point me in the right direction?
My first thought is to use RabbitMQ to send the new incoming messages to the workers.

There are basically two ways to give your workers access to the messages routed by ejabberd / MongooseIM. I'll focus on MongooseIM, since I know it better (DISCLAIMER: I'm on the dev team).
The first is to scan the message archive in an async / polling fashion. Message Archive Management (XEP-0313) describes an XMPP-level protocol for accessing it, but for your use case the important part is message persistence, so just make sure the relevant module (mod_mam) is enabled in the server config and the messages will hit the database. The databases supported for MAM are PostgreSQL and Riak, though there was also some work on a Cassandra backend (YMMV). This doesn't require tinkering with the server or writing Erlang, as long as a DB driver is available for your language of choice. Since PR#657 it's possible to store the messages in raw XML or even some custom format if you're willing to write the serialization module.
The second option is to use the server mechanism of hooks and handlers (also available in ejabberd), which can trigger a server action on events like "user sent a message", "user logged in", "user logged out", ... This, however, requires a server-side extension written in Erlang. In the simplest case the extension could forward any interesting event (with message content and metadata) via AMQP or just call some external HTTP/REST API. That way the real work is carried out by the workers, giving you freedom with regard to implementation language. This option also doesn't require enabling mod_mam or setting up a database for message persistence (which you could still have with a persistent message queue...).
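To make the polling approach concrete, here is a minimal worker-side sketch in Python. Everything about the storage is an assumption: the table name `archived_messages` and its columns are hypothetical (the real MAM schema depends on the backend and server version), and sqlite3 only stands in for PostgreSQL so the example is self-contained. The idea it shows is just "remember the last id you processed and ask for anything newer".

```python
import sqlite3

def fetch_new_messages(conn, last_seen_id, batch_size=100):
    """Return archived rows newer than last_seen_id, oldest first, plus the new cursor."""
    rows = conn.execute(
        "SELECT id, sender, recipient, body FROM archived_messages "
        "WHERE id > ? ORDER BY id LIMIT ?",
        (last_seen_id, batch_size),
    ).fetchall()
    # Advance the cursor only if something was read.
    new_cursor = rows[-1][0] if rows else last_seen_id
    return rows, new_cursor

# Demo against an in-memory database with a hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE archived_messages "
    "(id INTEGER PRIMARY KEY, sender TEXT, recipient TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO archived_messages (sender, recipient, body) VALUES (?, ?, ?)",
    [
        ("alice@example.com", "bob@example.com", "hello"),
        ("bob@example.com", "alice@example.com", "hi"),
    ],
)

first_batch, cursor = fetch_new_messages(conn, 0)
second_batch, cursor_after = fetch_new_messages(conn, cursor)  # nothing new yet
```

A real worker would persist the cursor between runs and call `fetch_new_messages` on a timer.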
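The real extension would be written in Erlang, so the following is only a conceptual model of the hooks-and-handlers pattern in Python, not the server's API. The `HookRegistry` class is hypothetical, the hook name is used purely for illustration, and a `queue.Queue` stands in for an AMQP exchange. It shows a handler that does nothing but serialize the event and hand it off to the workers.

```python
import json
import queue
from collections import defaultdict

class HookRegistry:
    """Hypothetical registry: handlers per hook name, run in priority order."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def add(self, hook, handler, priority=50):
        self._handlers[hook].append((priority, handler))
        self._handlers[hook].sort(key=lambda pair: pair[0])  # lower runs first

    def run(self, hook, **event):
        for _priority, handler in self._handlers[hook]:
            handler(event)

outbound = queue.Queue()  # stand-in for an AMQP exchange or an HTTP call

def forward_to_workers(event):
    # The handler stays trivial: serialize and forward, workers do the real work.
    outbound.put(json.dumps(event, sort_keys=True))

hooks = HookRegistry()
hooks.add("user_send_packet", forward_to_workers)
hooks.run(
    "user_send_packet",
    sender="alice@example.com",
    recipient="bob@example.com",
    body="hi",
)
```

Keeping the handler this thin is the point of the design: the server-side code rarely changes, and the workers can be written in any language.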
In general, the idea is perfectly feasible.

Generally, the most common XMPP extension used to build messaging systems for machine-to-machine communication, the Internet of Things, microservices, etc. is PubSub, as defined in XEP-0060.
This is a module you can enable in ejabberd. It is API based, so you can even customize the behaviour of that module to your application's specifics.
PubSub basically allows you to decouple senders and receivers and is especially designed for that use case.
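For a feel of what a publisher actually sends, here is a small sketch that builds a XEP-0060 publish request with Python's standard library. The service JID, node name and payload namespace (`urn:example:orders`) are made up for the example; only the `http://jabber.org/protocol/pubsub` namespace and the iq/pubsub/publish/item nesting come from the XEP. In practice an XMPP library would build and send this for you.

```python
import xml.etree.ElementTree as ET

CLIENT_NS = "jabber:client"
PUBSUB_NS = "http://jabber.org/protocol/pubsub"

def build_publish_iq(service_jid, node, payload_xml, stanza_id="publish1"):
    """Build an <iq type='set'> that publishes one item to a pubsub node."""
    iq = ET.Element(
        f"{{{CLIENT_NS}}}iq",
        {"type": "set", "to": service_jid, "id": stanza_id},
    )
    pubsub = ET.SubElement(iq, f"{{{PUBSUB_NS}}}pubsub")
    publish = ET.SubElement(pubsub, f"{{{PUBSUB_NS}}}publish", {"node": node})
    item = ET.SubElement(publish, f"{{{PUBSUB_NS}}}item")
    item.append(ET.fromstring(payload_xml))  # application-defined payload
    return iq

iq = build_publish_iq(
    "pubsub.example.com",  # hypothetical pubsub service
    "orders",              # hypothetical node name
    '<event xmlns="urn:example:orders">order created</event>',
)
stanza = ET.tostring(iq, encoding="unicode")
```

Subscribers to the `orders` node receive the payload without the publisher knowing who they are, which is the decoupling described above.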

Related

How do I create bot user with webhook on server side in MongooseIM?

This is what I want:
* A user (bot) that always shows its status as Online
* When a message comes in for the user, I will hit a webhook associated with the user
* The response from the webhook request will be sent as a reply to the sender
* This user will be able to intercept any message (let's say for profanity moderation)
* This user will be able to send a message to anyone (let's say a broadcast)
* This user will appear in every user's roster by default (like the Skype echo bot)
I can't seem to find any resource on how to achieve this. I've found a way to intercept incoming packets in Openfire, but I don't see any easy way to do this with MongooseIM. I haven't started diving deep into the source code yet; I'm still looking for a way to do this without touching the source code and locking myself into a specific version of MongooseIM.
Disclaimer: I'm on the MongooseIM core team.
There are multiple ways this could be achieved. The easiest way to achieve this depends on your familiarity with Erlang, the programming language MongooseIM is written in.
* You won't need any Erlang to use the event pusher module with its HTTP backend and the default settings, but you'd need some Erlang to control which messages get forwarded to the HTTP service or to build more complex setups. To send messages back, you'd either need to use the MongooseIM REST API or connect to the server as an ordinary XMPP client using one of the many XMPP libs available out there. This is probably the best approach to achieve your goal.
* You can skip the event pusher and just connect your bot as an XMPP client written in any language whatsoever. The bot might contain your business logic or can forward the messages it gets to the HTTP service.
* If you're comfortable working in Erlang, then the mechanism to extend the server is called Hooks and handlers and is described in the official MongooseIM documentation. This requires writing code in Erlang and building from source, but does not necessarily require modifying upstream MongooseIM code.
* You could use the XMPP component protocol, which allows you to extend the functionality of an XMPP server, yet structure it as multiple services. The components may be written in any technology you want, and the most popular XMPP libraries should support the component protocol out of the box.
Depending on your choice from the above list and the language and environment you prefer, you might have to pick an XMPP library to use. There are XMPP libs available for iOS (ObjC and Swift), Android (Java and Kotlin), Python, JavaScript, C, and even some emerging ones for Rust, Dart and possibly more.
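As a rough sketch of the webhook side of the first option, here is a tiny HTTP receiver in Python. It assumes the forwarded request is a form-encoded POST with fields named `author`, `receiver` and `message`; treat those names as an assumption and check the event pusher documentation for the payload your version actually sends. `build_reply` is a hypothetical helper, and sending the reply back (via the REST API or an XMPP client) is deliberately left out.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

def build_reply(form_body):
    """Turn a forwarded message into the reply the bot should send, or None."""
    fields = parse_qs(form_body)
    author = fields.get("author", [""])[0]    # assumed field name
    text = fields.get("message", [""])[0]     # assumed field name
    if not author or not text:
        return None
    return {"to": author, "body": f"You said: {text}"}

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        reply = build_reply(self.rfile.read(length).decode("utf-8"))
        # A real bot would now deliver `reply` through the REST API
        # or through its own XMPP client connection.
        self.send_response(200 if reply else 400)
        self.end_headers()

def serve(port=8080):
    """Start the webhook receiver; not called here so the sketch stays inert."""
    HTTPServer(("127.0.0.1", port), WebhookHandler).serve_forever()

example = build_reply("author=alice&server=localhost&receiver=bot&message=Hi+there")
```

Keeping the reply logic in a plain function makes it easy to test without a running server, and the same function could be reused if the bot is later rewritten as an XMPP client or component.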

Is a message queue like RabbitMQ the ideal solution for this application?

I have been working on a project that is basically an e-commerce platform. It's a multi-tenant application in which every client has its own domain and the website adjusts itself based on the client's configuration.
If the client already has software that manages their inventory, like an ERP, I would need a medium through which, when the e-commerce platform generates an order, external applications like the ERP can be notified that this has happened and take action in response. It would be like raising events across different applications.
I thought about storing these events in a database and having the client make requests at short intervals to fetch the data, but something about polling and using a REST API for this seems hackish.
Then I thought about using WebSockets, but if the client is offline for some reason when the event is generated, delivery cannot be assured.
Then I encountered message queues, RabbitMQ to be specific. With a message queue, modeling the problem in a simplistic manner, the e-commerce platform would produce events on one end and push them to a queue that a client's worker would process as events arrive.
I don't know what the best approach is, to be honest, and would love for some of you experienced developers to give me a hand with this.
I agree with Steve: using a message queue in your situation is ideal. Message queueing allows web servers to respond to requests quickly, instead of being forced to perform resource-heavy procedures on the spot. You can put your events on the queue and let the consumer/worker handle them when it has time.
I recommend CloudAMQP for RabbitMQ; it's easy to try out and you can get started quickly. CloudAMQP is a hosted RabbitMQ service in the cloud. I also recommend this RabbitMQ guide: https://www.cloudamqp.com/blog/2015-05-18-part1-rabbitmq-for-beginners-what-is-rabbitmq.html
Your idea of using a message queue is a good one, better than a database or WebSockets for the reasons you describe. With the message queue approach (RabbitMQ, or another server/broker-based system such as Apache Qpid) you should consider putting a broker in a "DMZ" sort of network location so that your internal e-commerce system can push events out to it, and your external clients can reach into it without risking direct access to your core business systems. You could also run a separate broker per client.
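To illustrate why a queue handles the "client is offline" case better than WebSockets, here is a toy in-memory model in Python. `TenantQueue` is a hypothetical stand-in for one durable broker queue per client, not RabbitMQ's API; a real broker gives you the same publish / deliver / acknowledge cycle with persistence on top.

```python
from collections import deque

class TenantQueue:
    """Toy model of one queue: events wait until a consumer acknowledges them."""

    def __init__(self):
        self._ready = deque()
        self._unacked = {}
        self._next_tag = 1

    def publish(self, event):
        self._ready.append(event)

    def get(self):
        """Deliver the oldest event; it stays tracked until ack() is called."""
        if not self._ready:
            return None
        tag = self._next_tag
        self._next_tag += 1
        event = self._ready.popleft()
        self._unacked[tag] = event
        return tag, event

    def ack(self, tag):
        del self._unacked[tag]

    def requeue_unacked(self):
        """Called when a consumer disconnects: put unacked events back in order."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft(self._unacked.pop(tag))

queues = {"tenant-a": TenantQueue()}           # one queue per client
queues["tenant-a"].publish({"order_id": 1})    # produced while the ERP is offline
queues["tenant-a"].publish({"order_id": 2})

tag, event = queues["tenant-a"].get()          # delivered, but the worker crashes
queues["tenant-a"].requeue_unacked()           # connection dropped before the ack
tag2, event2 = queues["tenant-a"].get()        # the same order is delivered again
queues["tenant-a"].ack(tag2)
```

Because an event can be delivered more than once, the ERP-side worker should treat order events idempotently, for example by keying on the order id.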

Microservices: REST vs Messaging

I heard Amazon uses HTTP for its microservice-based architecture. An alternative is to use a messaging system like RabbitMQ or Solace systems. I personally have experience with a Solace-based microservice architecture, but never with REST.
Any idea what various big-league implementations like Amazon, Netflix, UK Gov etc. use?
Another aspect is that, in microservices, the following things are required (among others):
* Pattern matching
* Async messaging: the receiving system may be down
* Publish subscribe
* Cache load event, i.e. on startup, a service may need to load all data from a couple of other services, and should be notified when the data is completely loaded, so that it can 'know' that it is now ready to service requests
These aspects are naturally handled with messaging rather than REST. Why should anyone use REST (except for a public API)? Thanks.
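For readers unfamiliar with what "pattern matching" plus "publish subscribe" looks like in a broker, here is a small Python sketch of AMQP-style topic matching, where `*` matches exactly one dot-separated word and `#` matches zero or more. The binding patterns and subscriber names are invented for the example, and this is a re-implementation for illustration, not any broker's code.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i, j):
        if i == len(p):
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb any number of the remaining words, including none.
            return any(match(i + 1, j2) for j2 in range(j, len(k) + 1))
        if j == len(k):
            return False
        return (p[i] == "*" or p[i] == k[j]) and match(i + 1, j + 1)

    return match(0, 0)

# Hypothetical bindings: pattern -> subscriber name.
bindings = {
    "orders.*.created": "erp-sync",
    "orders.#": "audit-log",
}

def route(routing_key):
    """Return the subscribers whose binding pattern matches the routing key."""
    return sorted(
        name for pattern, name in bindings.items()
        if topic_matches(pattern, routing_key)
    )
```

With these bindings, `route("orders.eu.created")` reaches both subscribers, while `route("orders.eu.cancelled")` reaches only the audit log.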
A standard that I've followed in the past is to use web services when the key requirement is speed (and data loss isn't critical) and messaging when the key requirement is reliability. Like you've said, if the receiving system is down, a message will sit on a queue until the system comes back up to process it. If it's a REST endpoint and it's down, requests will simply fail.
A REST API presumes the use of HTTP only. It is quite a stone-age technology and does not support async messaging. To plug messaging in there, I would consider WebSocket gateways.
Sorry if these statements are naive.

What's the best practice to collect data from different clients?

Here are the details of my use case:
What's my data?
There would be user experiences, error reports, state info and so on. The data is fragmented and may change in the future, so I plan to use NoSQL, maybe MongoDB, to save the data on the server.
What are the clients?
They are clients written in different languages, like C#, C++, LabVIEW and so on. Some don't even have access to a MongoDB driver, so communicating with the database directly is not an option. A framework like the one below is needed:
Clients -> (Some protocol) -> Broker -> Database.
Since those clients are not web clients, a common web server using HTTP may not suit my case, right? Are there any suggestions for the protocol, broker and database, or even a new framework?
My goal is to let the clients send data as conveniently as possible.
Thank you!
This is not really new, but a message driven application, which is a well understood pattern.
I did this mostly in Java, so I will stick to this language here.
A broker alone would not be enough here. Let us say you use Apache ActiveMQ as your message broker; you would still need to get your data into the database, since MQ is... a message queue. So you need a part which gets the messages out of MQ, processes them according to your business rules and stores them in the (correct) database instance, and the correct collection/bucket/table. Of course you could write this part by hand, but that would be pretty much reinventing the wheel. There is the notion of a "message routing and mediation engine", and the one most commonly suggested here is Apache Camel, which has quite a few components to communicate with databases and other so-called consumers and producers. And that is the key point. In general, if possible, your clients should send their data to the message broker directly. But if they can't, they can simply send text files or make REST calls; there are actually too many options to list here. This incoming data can be preprocessed and normalized to your standard format by a "route" in Apache Camel (a set of a consumer, conversion rules and a producer, in its simplest form) and sent as an AMQP message to MQ. From there, another Camel route can process the AMQP messages, apply your business rules and store the data in the database, or do whatever else may come to your mind (for example sending an email).
So this solution supports a multitude of protocols for incoming and outgoing messages (as long as they are supported by Camel) and you have your business rules in a centralized and well defined location.
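I promised to stick to Java, but the shape of the two routes is easier to show in a few lines of Python. This is only a model of the pattern, not Camel: the function names, the CSV/JSON input formats and the field names `client`, `kind` and `payload` are all invented, a `queue.Queue` stands in for the broker and a dict stands in for the database.

```python
import json
import queue

def normalize(raw, source_format):
    """Conversion rules: map every supported input onto one standard message."""
    if source_format == "json":
        data = json.loads(raw)
        return {"client": data["client"], "kind": data["kind"], "payload": data["payload"]}
    if source_format == "csv":
        client, kind, payload = raw.strip().split(",", 2)
        return {"client": client, "kind": kind, "payload": payload}
    raise ValueError(f"unsupported format: {source_format}")

broker = queue.Queue()   # stand-in for the message broker
database = {}            # stand-in for the database: collection name -> documents

def inbound_route(raw, source_format):
    """Consumer + conversion rules + producer: normalize and put on the broker."""
    broker.put(json.dumps(normalize(raw, source_format)))

def storage_route():
    """Second route: drain the broker and store each message by its kind."""
    while not broker.empty():
        message = json.loads(broker.get())
        database.setdefault(message["kind"], []).append(message)

inbound_route('{"client": "csharp-app", "kind": "error_report", "payload": "NullReference"}', "json")
inbound_route("labview-rig-7,error_report,Sensor timeout", "csv")
inbound_route("cpp-tool,state_info,idle", "csv")
storage_route()
```

Adding a new client type then means adding one conversion rule, while the storage side stays untouched.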
To implement this, I'd strongly suggest using Apache ServiceMix, which is a distribution of ActiveMQ, Camel and a system to manage the components and business rules.
In the end, I think a web server with the HTTP protocol could suit this use case.
What I mostly want is a universal API for different kinds of clients to save data to the cloud. HTTP has the methods GET, POST, PUT and DELETE, so I think a RESTful API is naturally suitable for operating on data.
My final solution is Node.js (Express) + MongoDB (a quite common combination). A RESTful API is provided via the Express web server, and clients can use HTTP to operate on the data conveniently. It is also quite lightweight and easy to get started with.
Here is a tutorial: http://cwbuecheler.com/web/tutorials/2013/node-express-mongo/

What are the pros and cons of HTTP callbacks vs. message passing?

We are looking to develop a number of services, but are not sure which "response" mechanism is the best route to go. The two contenders are:
HTTP callbacks, where the service would update the client application via "pinging" it with update messages sent via HTTP requests
Message Passing, where the service would update the client via publishing messages into a pub-sub queue on a message server
In both cases, both the caller and the services are within our network, we have full control over them, and things we develop are the only users of the services.
What are the pros / cons of each way of providing status updates to the calling application, and what, if any, pros / cons would there be for making the initial request via one method or the other?
Note: The first service we have in mind for this is an email service similar to SendGrid, which we can't use for various reasons, but still need the same functionality.
The main difference would be the quality of service that you get "out of the box" with a messaging server.
If you go with HTTP then your application has to take care of what happens when a message doesn't arrive as expected. To get an idea of the issues you need to consider and the complexities involved in solving them, take a look at WS-ReliableMessaging or HTTPLR.
With messaging, you get a configurable level of reliability out of the box. And there are a lot of good choices these days, such as ActiveMQ, RabbitMQ and 0MQ.
My personal preference is for reliability to be handled at the transport layer (by messaging), but then for a good discussion and dissenting view, check out "Nobody Needs Reliable Messaging."
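To give a sense of what "your application has to take care of it" means for HTTP callbacks, here is a minimal Python sketch of retrying a delivery with exponential backoff. The function name, its parameters and the message shape are hypothetical; `send` is whatever function performs the actual HTTP request, and the generated `id` is there so the receiver can discard duplicates. This is only the simplest piece of the problem (no persistence across restarts, no dead-letter handling).

```python
import time
import uuid

def deliver_with_retry(send, payload, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds; return the number of attempts used."""
    message_id = str(uuid.uuid4())  # same id on every retry, for deduplication
    for attempt in range(1, max_attempts + 1):
        try:
            send({"id": message_id, "payload": payload})
            return attempt
        except OSError:
            if attempt == max_attempts:
                raise  # give up; the caller must decide what happens next
            sleep(base_delay * 2 ** (attempt - 1))  # 0.5s, 1s, 2s, ...

# Demo with a fake sender that fails twice before succeeding.
calls, sleeps = [], []

def flaky_send(message):
    calls.append(message)
    if len(calls) < 3:
        raise ConnectionError("receiver is down")

attempts = deliver_with_retry(flaky_send, {"status": "sent"}, sleep=sleeps.append)
```

Everything beyond this, such as surviving a restart of the sender, is what a messaging server gives you without extra code.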