Filtering the messages in a queue with Windows Azure

Is there any way to filter the messages based on their properties? I mean, so that the Worker role only picks up messages with certain properties?

[I'm assuming that when you talk about filtering, you mean queue-side filtering, and not client-side filtering.]
As far as I know, Azure Queues don't have a message filtering feature. If you use Service Bus topics and subscriptions, you can filter based on message properties. See the subscription filters section of this post for more info on that.
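To make the idea of subscription filters concrete, here is a minimal in-memory sketch in plain Python. It does not use the Service Bus SDK; the `Topic` class, the subscription names and the `priority` property are all hypothetical and only illustrate how a per-subscription rule over message properties decides which receiver gets a message.

```python
from collections import deque
from typing import Any, Callable, Deque, Dict, Optional, Tuple

Properties = Dict[str, Any]
Rule = Callable[[Properties], bool]


class Topic:
    """Toy topic: each subscription has its own queue and a filter rule."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, Tuple[Rule, Deque[Any]]] = {}

    def subscribe(self, name: str, rule: Rule) -> None:
        self._subscriptions[name] = (rule, deque())

    def publish(self, properties: Properties, body: Any) -> None:
        # The rule is evaluated on the topic side, once per subscription,
        # so a receiver never sees messages that its rule rejects.
        for rule, queue in self._subscriptions.values():
            if rule(properties):
                queue.append(body)

    def receive(self, name: str) -> Optional[Any]:
        queue = self._subscriptions[name][1]
        return queue.popleft() if queue else None


topic = Topic()
topic.subscribe("urgent-worker", lambda p: p.get("priority") == "high")
topic.subscribe("audit", lambda p: True)  # no filtering: receives everything
topic.publish({"priority": "high"}, "job 1")
topic.publish({"priority": "low"}, "job 2")
```

With a plain queue there is only one list of messages and no rule, which is why the filtering would have to happen in the worker itself.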

Related

Restrict Kafka consumers based on event headers (metadata)

The book "Building Event-Driven Microservices" recommends using metadata tags (event headers) to place restrictions on Kafka consumers. One of them is the following:
Deprecation:
A way to indicate that a stream is out of date. Marking an event stream as deprecated
allows existing systems to continue using it while
new microservices are blocked from requesting a subscription... the
owner of the deprecated stream of events can be notified when there are
no more registered users of the deprecated stream, at which point it
can be safely deleted.
Could you please point me to how this can be implemented (Java/Spring centric)? Is it possible for Kafka ACLs to make restrictions based on event headers?
Thank you in advance!
Is it possible for Kafka ACL to make restrictions based on event headers?
No, but you can filter messages out after receiving them. ACLs restrict access to a topic as a whole, not to particular records.
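A minimal sketch of filtering after receipt, in plain Python rather than a Kafka client. The `Record` class stands in for a consumed record, and the header name `stream-status` and its values are hypothetical. Headers are modeled as a list of key/bytes pairs, with the last occurrence of a key taken as its value.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Tuple


@dataclass
class Record:
    """Stand-in for a consumed record (not a real client class)."""
    value: bytes
    headers: List[Tuple[str, bytes]] = field(default_factory=list)


def header_value(record: Record, key: str) -> Optional[bytes]:
    # Header keys may repeat, so scan from the end and take the last one.
    for name, value in reversed(record.headers):
        if name == key:
            return value
    return None


def only_with_header(records: Iterable[Record], key: str,
                     expected: bytes) -> Iterator[Record]:
    """Yield only the records whose header `key` equals `expected`."""
    for record in records:
        if header_value(record, key) == expected:
            yield record


records = [
    Record(b"order-1", [("stream-status", b"active")]),
    Record(b"order-2", [("stream-status", b"deprecated")]),
    Record(b"order-3"),  # no headers at all
]
kept = list(only_with_header(records, "stream-status", b"active"))
```

Note that this is a convention the consumer follows voluntarily: every record still reaches the consumer, so it is not an access control mechanism.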
the owner of the deprecated stream of events can be notified when there are no more registered users of the deprecated stream
You need to remember that Kafka is not a pure messaging solution, and it does not have a concept of "registered" consumers: a consumer can read a message at any time, as long as the message has not been removed by the cluster.
You'd need to implement your own "notification" pipeline that signals when no instances are interested in the original topic any more (possibly even with Kafka again).

Kafka P2P Header based routing

I have a requirement to send an event to multiple systems based on their system code. The number of destination systems can grow in the future, and each should be able to subscribe only to the events it is interested in. It's a security mandate, so as the producer we need to enforce this.
We could use a RabbitMQ headers exchange and multiple shovel configurations to the different queues in different vhosts or clusters. But I am looking for a similar pattern with Kafka.
If we maintain a different topic per system and authorise each consumer for its corresponding topic, then as the producer I need to implement the topic routing logic, and the number of topics will grow as systems are added.
The other option is to use AWS SNS and subscribe multiple SQS queues. Based on filter policies the message can be routed.
Could anyone think of a better solution to this problem?
send an event to multiple systems based on their system code
Using the Kafka Streams API, you can use branching to route data to different topics based on Predicate logic.
Once the data is in the respective topics, the "multiple systems" can consume it.
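The branching idea can be sketched in plain Python, independent of the Kafka Streams API. The event field `system_code` and the topic names are hypothetical. The sketch assumes first-match semantics: each event goes to the topic of the first predicate that accepts it, and events that match nothing are dropped.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Event = Dict[str, str]
Branch = Tuple[Callable[[Event], bool], str]  # (predicate, target topic)


def pick_topic(event: Event, branches: Sequence[Branch]) -> Optional[str]:
    """Return the topic of the first matching branch, or None."""
    for predicate, topic in branches:
        if predicate(event):
            return topic
    return None


def route(events: Sequence[Event],
          branches: Sequence[Branch]) -> Dict[str, List[Event]]:
    """Group events by target topic; unmatched events are dropped."""
    routed: Dict[str, List[Event]] = {}
    for event in events:
        topic = pick_topic(event, branches)
        if topic is not None:
            routed.setdefault(topic, []).append(event)
    return routed


branches: List[Branch] = [
    (lambda e: e.get("system_code") == "BILLING", "events.billing"),
    (lambda e: e.get("system_code") == "SHIPPING", "events.shipping"),
]
events = [
    {"system_code": "BILLING", "id": "1"},
    {"system_code": "SHIPPING", "id": "2"},
    {"system_code": "UNKNOWN", "id": "3"},
]
routed = route(events, branches)
```

Adding a destination system then means adding one branch and one topic, and each consumer can be authorised for its own topic only.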

Pub/Sub and consumer aware publishing. Stop producing when nobody is subscribed

I'm trying to find a messaging system that supports the following use case.
1) Producer registers a topic namespace
2) Client subscribes to a topic
3) First client triggers a notification on the producer to start producing
4) New client with a subscription to the same topic receives data (potentially conflated, similar to hot/cold observables in the RX world)
5) When the last client goes away (unsubscribes or crashes), the producer is notified to stop producing to said topic
I am aware that, according to the pub/sub pattern, a producer is defined to be blissfully unaware of the existence of consumers, meaning that my use case simply does not fit the pub/sub paradigm.
So far I have looked into Kafka, Redis, NATS.io and Amazon SQS, but without much success. I've been thinking about a few possible ways to solve this, but haven't found anything that satisfies my needs yet.
One option that springs to mind, for bullet 2), is to model a request/reply pattern, as detailed on the NATS page among others, to have the producer listen to clients. A client would then publish a 'subscribe' message into the system that the producer would pick up on a wildcard subscription. This however leaves one big problem, which is unsubscribing. Assuming the consumer stops as it should, publishing an unsubscribe message just like the subscribe would work. But in the case of a crash or similar, this won't work.
I'd be grateful for any ideas, references or architectural patterns/best practices that satisfy the above.
I've been doing quite a bit of research over the past week but haven't come across any satisfying Q&A or articles. Either I'm approaching it entirely wrong, or there just isn't much out there, which would surprise me, because this seems to be a fairly common scenario that applies to many domains.
thanks in advance
Chris
//edit
An actual simple use-case that I have at hand is stock quote distribution.
Quotes come from an external source
Subscribe to stock A quotes from the external system when the first end-user looks at stock A
Stop receiving quotes for stock A from the external system when no more end-users look at said stock
RabbitMQ has internal events you can use with the Event Exchange Plugin. Events such as consumer.created or consumer.deleted could be used to trigger actions at your server level: for example, checking the remaining number of consumers using the RabbitMQ Management API and taking action, such as closing a topic, based on your use case.
I can't think of any messaging system that offers publishing based on consumer presence. It gets even worse because you'll need some kind of heartbeat mechanism to handle consumer crashes.
So here are my two cents. I'm not sure if you're looking for an out-of-the-box solution, but if not, you could wrap your application around a ZooKeeper cluster to handle all your use cases.
Simply use watchers on ephemeral nodes to detect when you have no more consumers (including crashes), and put a watcher on a 'consumers' path to be notified when consumers arrive.
On the consumer side, each consumer would have to register its zk node whenever it starts.
It's not so complicated to do, and zk is not the only solution for this; you could use other consensus technologies as well.
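The heartbeat and ephemeral-node idea can be sketched in-process, without ZooKeeper or Curator. The `PresenceTracker` class, its callbacks and the `ttl` parameter are hypothetical. Each consumer holds a lease that it renews by heartbeating. A lease that is not renewed expires, which plays the role of an ephemeral node disappearing after a crash.

```python
import time
from typing import Callable, Dict, List, Tuple


class PresenceTracker:
    """Calls on_start for the first consumer of a topic, on_stop for the last."""

    def __init__(self, ttl: float,
                 on_start: Callable[[str], None],
                 on_stop: Callable[[str], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._on_start = on_start
        self._on_stop = on_stop
        self._clock = clock
        self._leases: Dict[str, Dict[str, float]] = {}  # topic -> id -> expiry

    def heartbeat(self, topic: str, consumer_id: str) -> None:
        """Register a consumer or renew its lease."""
        consumers = self._leases.setdefault(topic, {})
        is_first = not consumers
        consumers[consumer_id] = self._clock() + self._ttl
        if is_first:
            self._on_start(topic)

    def leave(self, topic: str, consumer_id: str) -> None:
        """Clean unsubscribe."""
        consumers = self._leases.get(topic, {})
        if consumers.pop(consumer_id, None) is not None:
            self._stop_if_empty(topic)

    def expire(self) -> None:
        """Drop leases that were not renewed in time (crashed consumers)."""
        now = self._clock()
        for topic in list(self._leases):
            consumers = self._leases[topic]
            for consumer_id in [c for c, t in consumers.items() if t <= now]:
                del consumers[consumer_id]
            self._stop_if_empty(topic)

    def _stop_if_empty(self, topic: str) -> None:
        if topic in self._leases and not self._leases[topic]:
            del self._leases[topic]
            self._on_stop(topic)


# Usage with a fake clock so that expiry can be simulated.
now = [0.0]
log: List[Tuple[str, str]] = []
tracker = PresenceTracker(ttl=10.0,
                          on_start=lambda t: log.append(("start", t)),
                          on_stop=lambda t: log.append(("stop", t)),
                          clock=lambda: now[0])
tracker.heartbeat("stock-A", "user-1")   # first consumer: start producing
tracker.heartbeat("stock-A", "user-2")
tracker.leave("stock-A", "user-1")       # clean unsubscribe
now[0] = 11.0                            # user-2 crashed and stopped renewing
tracker.expire()                         # last lease gone: stop producing
```

In a real deployment `expire` would have to run periodically, and the state would need to live somewhere that survives a crash of the producer, which is what a coordination service provides.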
A starting point for ZooKeeper:
https://zookeeper.apache.org/doc/r3.1.2/zookeeperStarted.html
(I strongly advise using the Curator API, which handles a lot of recipes smoothly.)
Yannick
Unfortunately you haven't specified the business use case that you are trying to solve with these requirements. From the sound of it, you don't want a pub/sub system but an orchestration one.
I would recommend checking out Cadence Workflow, which is capable of supporting your listed requirements and many more orchestration use cases.
Here is a strawman design that satisfies your requirements:
Any new subscriber sends an event to a workflow, with the TopicName as the workflowID, to subscribe. If a workflow with the given ID doesn't exist, it is automatically started.
Any subscriber sends another signal to unsubscribe.
When no subscribers are left, the workflow exits.
The publisher sends an event to the workflow to deliver to subscribers.
The workflow delivers the event to the subscribers using an activity.
If no workflow with the given TopicName is running, the publish event sent to it fails.
Cadence offers a lot of other advantages over using queues for task processing.
Built-in exponential retries with unlimited expiration interval
Failure handling. For example, it allows executing a task that notifies another service if both updates couldn't succeed during a configured interval.
Support for long-running heartbeating operations
Ability to implement complex task dependencies. For example, to implement chaining of calls or compensation logic in case of unrecoverable failures (SAGA)
Gives complete visibility into the current state of the update. For example, when using queues, all you know is whether there are some messages in a queue, and you need an additional DB to track the overall progress. With Cadence every event is recorded.
Ability to cancel an update in flight.
Distributed CRON support
See the presentation that goes over Cadence programming model.

Amazon SQS: read messages not in order

I would like to take messages from Amazon SQS in the same order in which they were inserted into SQS (first in, first out model).
Is there any way to implement this?
I am using Zend PHP for programming.
Unordered message delivery is inherent in the design of SQS. You could try to work around it by numbering the messages and storing the out-of-order messages locally until the missing messages arrive, but it's probably not worth the hassle.
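For reference, the numbering workaround described above amounts to a small reorder buffer. This is a minimal sketch in plain Python, not tied to any SQS client. The `Resequencer` class is hypothetical, and it assumes the sender attaches a gap-free, increasing sequence number to every message.

```python
from typing import Any, Dict, List


class Resequencer:
    """Buffers out-of-order messages and releases them in sequence order."""

    def __init__(self, first_seq: int = 0) -> None:
        self._next = first_seq              # next sequence number to deliver
        self._pending: Dict[int, Any] = {}  # early arrivals, keyed by number

    def accept(self, seq: int, body: Any) -> List[Any]:
        """Store one received message; return the bodies now deliverable."""
        if seq < self._next or seq in self._pending:
            return []  # duplicate delivery of a message already seen
        self._pending[seq] = body
        ready: List[Any] = []
        # Release the longest run of consecutive numbers starting at _next.
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready


buffer = Resequencer()
first = buffer.accept(1, "second message")   # arrives early, held back
second = buffer.accept(0, "first message")   # fills the gap, releases both
```

The cost is visible in the sketch: one lost or delayed message holds back everything behind it, so a real implementation would also need a timeout or a retry policy for the gap.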
SQS is really a bit of an odd duck: it does what it says, but what it does isn't what most people are looking for in a message bus. I really wish Amazon would offer an additional queuing solution more like RabbitMQ. SQS is really only suited for distributing tasks that aren't even remotely coupled, and where things like order and latency aren't important. For instance, it would be great for sending completed orders to a shipping center, or perhaps scheduling print jobs.
Their own documentation shows it being used to schedule thumbnail creation, but when I recently used it for this exact purpose I quickly discovered that my users weren't going to be impressed with the latency: which at times is 30-50 seconds.
You can still run RabbitMQ on EC2 nodes, and while not as scalable as SQS it does cluster and should take you pretty far.
You could try IronMQ. It is hosted like SQS, has guaranteed first in first out ordering, no eventual consistency delays, is uber scalable and you can be up and running in minutes.
Here's a PHP library for it: https://github.com/iron-io/iron_mq_php
Disclaimer: I work for Iron.io
The SQS documentation answers this for you (bold is my emphasis to directly answer your question):
Amazon SQS makes a best effort to preserve order in messages, but due
to the distributed nature of the queue, we cannot guarantee you will
receive messages in the exact order you sent them. If your system
requires that order be preserved, we recommend you place sequencing
information in each message so you can reorder the messages upon
receipt.
I have tried to implement FIFO behaviour so that messages are received in the same order they were sent.
For this you can use a message sequence number, which is sent with every message and validated at the receiver end.
This way you can get the desired output in FIFO order.

XMPP: adding bidirectionality to pubsub?

I am not sure if pubsub or multiuserchat is the way to go?
What I think I need is pubsub, but with the added ability for subscribers to broadcast messages to the feed as well. Bidirectional information flow, if you will.
The use case is such that subscribers will be subscribed to on average 1000 different feeds, but each individual feed only broadcasts information on average once per week. So, lots of feeds, but low activity in each one. However, because there are 1000 different active subscriptions, a subscriber might still be notified of 100 messages per day, and they should be able to "reply", aka post content, to any one of those feeds.
It seems like what I need is a pubsub/multiuserchat hybrid. But that doesn't exist, or does it? Any ideas or pointers?
Thanks a bunch!
If a subscriber is publishing data then they are not just a subscriber, they are a publisher. And there is no reason the same entity can't be a publisher and a subscriber at the same time.
As for your more general question about pubsub vs. MUC, that's a question that I find comes up a lot nowadays.
Obviously at first glance MUC and pubsub are very similar, they are both about broadcasting to a group. Many applications could easily use one or the other with no trouble.
To help decide which fits best with your applications, let's go through some of the differences between the two protocols.
MUC:
Is absolutely good for standard chatrooms of online users communicating with each other. If this is what you're doing, use it.
Includes presence, i.e. notifying other occupants about joining, leaving and changing status.
Allows for anonymous private communication between occupants.
Works out of the box with practically any standard XMPP client (for standard chat messages).
Automatic leaving of the room when the user goes offline or disconnects.
Messages with custom payloads are supported, meaning you are not limited to routing standard chat messages.
Pubsub:
One or a few publishers transmitting to many read-only subscribers is core pubsub territory. In contrast to MUC the subscribers are not publishing, and are not receiving information about other subscribers.
Server implementations tend to have much more flexible access control for pubsub.
Custom payloads only, no standard chat messages.
Optionally has full item persistence.
A node can be managed as a list of items (i.e. add/remove with notification) rather than just simple broadcast.
Subscriptions can persist through being offline.
The points above are just a guide. A lot can typically be achieved through server configuration. As an example, the MUC specification allows rooms to withhold presence broadcasts for certain classes of occupants based on configuration. The catch here is in the implementations: since this is an uncommon usage of MUC, you may find it is not supported in many MUC implementations. The point is that, as MUC was designed for chatting and not generic pubsub, you will largely find the implementations and tooling around MUC focus on that kind of usage.
Not sure what the problem is. The subscriber simply needs to be a publisher as well. There is nothing stopping them from publishing as well as subscribing (unless the nodes are configured to disallow it).
This appears to be a very typical pubsub case.