This is an architectural question.
I am currently designing a web application and I am used to a basic setup: frontend, API, database, microservices.
For the sake of saving money and making my architecture a little bit more modern than what I am used to I decided to look into serverless.
The two main parts I am interested in are google cloud functions and firebase. My understanding is that google cloud functions can be fired when a database entry in firebase has been manipulated.
The way I used to communicate between services was through message queues such as RabbitMQ but it seems to me that by using firebase and cloud functions you can build communication through the database without the need for message queues. What I mean by communication in this case, would be that one service would be able to react to the execution of another service by seeing that an entry in the database was changed.
My question therefore is, what are the upsides and downsides of letting all your "communication" between microservices run through firebase instead of message queues, and is this an architecture that is generally used?
AFAIK, Cloud Functions triggers are a beta feature in Firebase, and according to the doc, there are some limitations for Firestore trigger events:
It can take up to 10 seconds for a function to respond to changes in Cloud Firestore.
Ordering is not guaranteed. Rapid changes can trigger function invocations in an unexpected order.
Events are delivered at least once, but a single event may result in multiple function invocations. Avoid depending on exactly-once mechanics, and write idempotent functions.
Cloud Firestore triggers for Cloud Functions is available only for Cloud Firestore in Native mode. It is not available for Cloud Firestore in Datastore mode.
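To make the "at least once" and "ordering is not guaranteed" points concrete, here is a minimal sketch of an idempotent, order-tolerant handler. It is plain Python, not the Cloud Functions API: the Event fields, the per-document version counter and the Handler class are all hypothetical names chosen for illustration, and the sets and dicts stand in for state that would have to live in durable storage in a real function.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: str   # assumed unique per logical event; a redelivery reuses it
    doc_id: str     # document that changed
    version: int    # hypothetical counter the writer increments on each change
    payload: dict


@dataclass
class Handler:
    # In-memory stand-ins; a real function would keep these in a database.
    seen_event_ids: set = field(default_factory=set)
    latest_version: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)

    def handle(self, event: Event) -> bool:
        """Apply the event at most once; return True if it was applied."""
        if event.event_id in self.seen_event_ids:
            return False  # duplicate delivery of an event we already saw
        self.seen_event_ids.add(event.event_id)
        if event.version <= self.latest_version.get(event.doc_id, -1):
            return False  # stale: a newer change was already processed
        self.latest_version[event.doc_id] = event.version
        self.applied.append(event.payload)
        return True
```

With this shape, a redelivered event and an older event arriving late are both ignored, which is the behaviour the limitations above ask you to design for.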
The most concerning limitation here is the first one. 10 seconds for an update is a long time if you need that update to be visible to the user.
Another disadvantage I see is that the system design may get out of control as the complexity increases. You may be tempted to add events for everything, and it may be hard to partition them by category, for example (in message queues, you can use topics for that).
Also, according to the doc, cloud functions are rate-limited to 16 invocations per 100 seconds, which may quickly be reached if your app gets some traffic.
I would use trigger-events for isolated scenarios and use a message queue for the backbone communication between microservices.
Related
I have an application running in multiple regions in AWS; this application reads from global DynamoDb table(s). Updates occur in the background via another process and I wanted to be able to monitor for these updates so the application can invalidate its cache (I'm not using DAX).
I was thinking I could use DynamoDb streams for this; however, I have hit a number of road blocks with Spring Kinesis Streams Binder (e.g. the fact that it requires 2 tables [SpringIntegrationMetadataStore & SpringIntegrationLockRegistry] be created, while my company doesn't allow dynamic creation of tables, so that was fun to hunt down as I couldn't find any mention in the docs; maybe I missed it). Now I think I have found out that only 1 application can listen to a Kinesis stream at a time?
Is that true?
Is there a way for multiple applications, that only read from DynamoDb, to get notified when an update occurs? I was thinking that I could use DynamoDb Streams such that each app would monitor the stream for updates and be able to invalidate their cache. If the above is true, then I need to do something more involved or complex (use a SNS/SQS for updates, elasticache, Redis, Kafka) which just seems like overkill for this scenario.
e.g. the fact that it requires 2 tables [SpringIntegrationMetadataStore & SpringIntegrationLockRegistry]
Well, that's how consumer group management is handled by Spring Cloud Stream Kinesis Binder. Even if you used only the KCL, it would still require an extra table in DynamoDB. Therefore your concern sounds more like a lack of confidence in the cloud services you use.
Now I think I have found out that only 1 application can listen to a Kinesis stream at a time?
That's not true if all your consumer applications are configured for different consumer groups.
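As a rough illustration of the consumer group model, the toy below delivers every record to every group, but to only one member inside each group. It is a simplified stand-in written in Python, not the Kinesis or Spring Cloud Stream API; the Stream class and the round-robin choice are assumptions for the sketch (as far as I understand, a real binder assigns shards to instances rather than alternating records).

```python
from collections import defaultdict


class Stream:
    """Toy stream with consumer groups; names here are hypothetical."""

    def __init__(self):
        self.groups = defaultdict(list)      # group name -> consumer callbacks
        self.next_index = defaultdict(int)   # group name -> round-robin cursor

    def subscribe(self, group, consumer):
        self.groups[group].append(consumer)

    def publish(self, record):
        # Every group sees the record; within a group only one member does.
        for group, consumers in self.groups.items():
            i = self.next_index[group] % len(consumers)
            self.next_index[group] += 1
            consumers[i](record)
```

So two applications that each need every update (for example, to invalidate their own cache) would use two different group names, while several instances of the same application would share one.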
Please, make yourself familiar with Spring Cloud Stream and its model: https://docs.spring.io/spring-cloud-stream/docs/3.1.1/reference/html/spring-cloud-stream.html#_main_concepts
Another option would probably be an AWS Lambda trigger for DynamoDB Streams: https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Streams.Lambda.html
An application that I'm working on uses AFTER CREATE/UPDATE/DELETE triggers to create pg_notify notifications when certain actions occur within the system. Currently, we have a small Node.JS application that LISTENs for the events and then immediately turns around and posts them to an AWS SNS topic, which gets forwarded to our SQS event queue. From that queue, we trigger all sorts of things based on the event (emails, SMSs, lambdas, long running jobs, etc).
This architecture works well, but the Node.JS application that sits in between the PostgreSQL instance and the SNS topic seems a bit fragile. I can't really run two copies in two availability zones, because messages will be duplicated.
I'm looking for a better way to get these Postgres notifications into SQS. Are there any options out there for this? If Postgres Aurora has something, we might consider that.
Use your current strategy of a small application that LISTENs for events. Just introduce a deduplication step between that app and your event subscribers. This will allow you to run several instances of your app.
For example, you could use a FIFO SQS queue. These automatically drop duplicate messages. Since FIFO queues cannot subscribe to a standard SNS topic, you'd need to put messages directly to the queue instead of through SNS.
Alternatively, you could use DynamoDB to store checksums of your recent messages and if your app encounters a duplicate, drop it manually (make sure to use conditional writes to prevent race conditions).
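A minimal sketch of the checksum idea, using an in-memory class in place of DynamoDB. ChecksumStore, put_if_absent and forward_once are hypothetical names; the lock-protected "check then insert" stands in for a conditional write that only succeeds when the key does not exist yet.

```python
import hashlib
import threading


class ChecksumStore:
    """In-memory stand-in for a table that supports 'put if absent'."""

    def __init__(self):
        self._lock = threading.Lock()
        self._seen = set()

    def put_if_absent(self, key: str) -> bool:
        # Check and insert atomically, mimicking a conditional write.
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def forward_once(store, message: bytes, publish) -> bool:
    """Publish the message unless another instance already forwarded it."""
    checksum = hashlib.sha256(message).hexdigest()
    if not store.put_if_absent(checksum):
        return False  # duplicate: some instance got there first
    publish(message)
    return True
```

Note that two genuinely different events with byte-identical payloads would be treated as duplicates here, so the payload should carry something that distinguishes them, such as an event id or timestamp.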
Some options I've found:
Continue with the current method
I could keep the current small application that's redirecting events from my PostgreSQL RDS and dumping them into SNS->SQS. I can deploy it in a 1 region/max 1/min 1 auto-scaling group to make sure there is not more than one copy running at a time.
Ditch my RDS and use a self hosted database
I could ditch RDS and run PostgreSQL on an EC2 instance and then use PL/Python along with the AWS-SDK to make calls to SNS instead of using pg_notify. I don't like this idea, because I lose the ease of use that comes with RDS.
For now, I'll be sticking with the current method, unless someone has some other ideas that I could explore. I'm sure there will be more options in the future (like when Aurora PostgreSQL adds support for calling Lambdas, like the Aurora MySQL has).
We're working to take our software to Azure cloud and looking at Orleans and Service Fabric (SF) as potential frameworks. We need to:
Populate our analysis engines with lots of data (e.g., 100MB to 2GB) per engine instance.
Maintain that state, and if an engine instance goes idle for say 20 minutes or more, we'd like to unload it (i.e., and not pay for the engine instance resource).
Each engine instance will support one to several end users with a specific data set.
Each engine instance can be highly interactive, generating lots of plot data in near real time. We're maintaining state as we don't want to pay the price to populate an engine instance for each engine interaction.
An engine instance action can take a few seconds, a few minutes, or even tens of minutes. We'll want some feedback.
Users may access an engine instance every few seconds (e.g., to steer the engine towards a result based on feedback) and will want live plot data.
Each user will want to talk to a specific engine instance.
As a user expresses interest in running a simulation (i.e., standing up an engine instance), ideally we want him to choose a small/medium/large computing resource to run his engine instance (i.e., based on the problem he's trying to solve he may want more or less computing/memory power).
We're considering Orleans and SF but we're having difficulty specifying an architecture based on the above requirements. We've considered:
Trying to think about an SF partition, or an Orleans silo, as an "engine instance" described above.
Leveraging both Orleans and SF notion of fault tolerance through replication.
Leveraging local (i.e., to partition or silo) storage to store results and maintain state (i.e., for long periods or until idle for 20 minutes).
We've not understood how to:
Limit a silo or a partition to a single engine instance so that we can control resourcing of the engine instance.
Keep a user's engine instance data separate from another user's engine instance data.
Direct a request from a user (e.g., through a web API) to a particular engine instance.
Does this make sense for Orleans, does it make more sense for SF? Any pointers on how to implement the above would be helpful.
When you say SF, I assume you mean SF Actors, right?
You can use them the way you want, but neither looks like the right solution for your problem, because:
Actors are single-threaded: if you plan to share the same instance with multiple clients, each one has to wait for the previous one to finish before its request starts processing. If you need to monitor the status of a running actor, you have to make the actor publish updates to external subscribers.
Actor state is isolated, so you can't access the state of other actors. The way to do it is to provide a method that returns it, but if the actor is running a command you have to wait for its completion, unless you create a separate state service to hold the processed data.
You can't limit the resources available to an actor. In Service Fabric you specify the resources needed for a service, but you can't do that for actors, and you can't limit the resources they use. When they hit the limit, Service Fabric will try to balance the resources for you, but nothing prevents the process from consuming more memory than requested.
Both actor frameworks communicate using the ask approach, so they will "block" the caller waiting for an answer; it is asynchronous, but you still have to keep the caller 'waiting'. (I say block and wait because there is no notion of fire and forget as in Akka, which uses the tell approach: it delivers the message and forgets about it.)
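The difference between ask and tell, and the effect of processing one message at a time, can be sketched with a toy actor. This is plain Python asyncio, not the Orleans, Service Fabric or Akka API; the Actor class and its method names are hypothetical.

```python
import asyncio


class Actor:
    """Toy actor: one mailbox, one message processed at a time."""

    def __init__(self):
        self.mailbox = asyncio.Queue()
        self.log = []

    async def run(self):
        while True:
            message, reply = await self.mailbox.get()
            if message is None:
                break  # shutdown sentinel
            await asyncio.sleep(0)  # stand-in for real work
            self.log.append(message)
            if reply is not None:
                reply.set_result(f"done:{message}")

    def tell(self, message):
        # Fire and forget: enqueue the message and return immediately.
        self.mailbox.put_nowait((message, None))

    async def ask(self, message):
        # Request/response: the caller awaits the actor's reply.
        reply = asyncio.get_running_loop().create_future()
        self.mailbox.put_nowait((message, reply))
        return await reply


async def demo():
    actor = Actor()
    worker = asyncio.create_task(actor.run())
    actor.tell("a")                  # returns at once
    answer = await actor.ask("b")    # waits until "a" and then "b" are done
    actor.tell(None)
    await worker
    return answer, actor.log
```

The caller of ask is not blocking a thread, but it cannot continue until the actor has worked through everything queued ahead of its message, which is the "waiting" described above.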
Based on some of your requirements, I think containers would be a better approach, because:
You can limit the resource consumption for each container
The data is isolated inside the container and not visible to others
But with containers you have to manage the replication and partitioning yourself, so in this case I would recommend the best of both worlds:
Create SF services to host the data sets shared between the users
SF Service+Actor to store only the results of users' simulations.
Containers to run the simulations and send updates to actors
This is just an example, it all will depend on your requirements, architecture and how data will be isolated from each other.
I am designing a WhatsApp-like messenger application for the desktop using WPF and .Net. Now, when a user creates a group I want other members of the group to receive a notification that they were added to a group. My frontend is built in C#.Net, which is connected to a RESTful Webservice (Ruby on Rails). I am using Postgres for the database. I also have a Redis layer to cache my Rails models.
I am considering the following options.
1) Use Postgres's inbuilt NOTIFY/LISTEN mechanism which the clients can subscribe to directly. I foresee two issues here
i) Postgres might not be able to handle 10000's of clients subscribed directly.
ii) There is no guarantee of delivery if the client is disconnected
2) Use Redis' Pub/Sub mechanism to which the clients can subscribe. I am still concerned with no guarantee of delivery here.
3) Use a messaging queue like RabbitMQ. The producer of this queue will be Postgres, which will push in messages through triggers. The consumers, of course, will be the .Net clients.
So far, I am inclined to use the 3rd option.
Does anyone have any suggestions how to design this?
In an application like WhatsApp itself, the client running in your phone is an integral part of a large and complex event-based, distributed system.
Without more context, it would be impossible to point in the right direction. That said:
For option 1: You seem to imply that each client, as in a WhatsApp client, would directly (or through some web service) communicate with Postgres as an event bus, which is not sound and would not scale because you can only have ONE Postgres instance.
For option 2: You have the same problem as in option 1, with worse failure modes.
For option 3: RabbitMQ seems like a reasonable ally here. It is distributed in nature and scales well. As a matter of fact, it runs on Erlang just as most of WhatsApp does. Using triggers inside Postgres to publish messages, however, does not make a lot of sense.
You need a message bus because you would have lots of updates to do in the background, not to directly connect your users to each other. As you said, clients can be offline.
Architecture is more about deferring decisions than taking them.
I suggest that you start simple. Build a small, monolithic, synchronous system first, pushing updates as persisted data to all the involved users. For example; In a group of n users, just write n records to a table. It is already complicated to reliably keep track of who has received and read what.
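The "write n records to a table" suggestion could look roughly like the sketch below. It uses Python with SQLite purely as a stand-in for the Rails/Postgres stack in the question; the notifications table, its columns and the function names are hypothetical.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE notifications ("
        " id INTEGER PRIMARY KEY,"
        " user_id INTEGER NOT NULL,"
        " body TEXT NOT NULL,"
        " delivered INTEGER NOT NULL DEFAULT 0)"
    )


def notify_group(conn, member_ids, body):
    # One row per group member, written in a single transaction.
    with conn:
        conn.executemany(
            "INSERT INTO notifications (user_id, body) VALUES (?, ?)",
            [(uid, body) for uid in member_ids],
        )


def fetch_pending(conn, user_id):
    # Called when a client connects or polls; marks the rows as delivered.
    with conn:
        rows = conn.execute(
            "SELECT id, body FROM notifications"
            " WHERE user_id = ? AND delivered = 0 ORDER BY id",
            (user_id,),
        ).fetchall()
        conn.executemany(
            "UPDATE notifications SET delivered = 1 WHERE id = ?",
            [(row[0],) for row in rows],
        )
    return [body for _, body in rows]
```

Because the rows are persisted, a client that was offline when the group was created still finds its notification the next time it asks, which addresses the delivery concern raised for options 1 and 2.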
These heavy "group" updates can then be moved to long-running processes using RabbitMQ or the like, but a system with several thousand users can very well work without such a thing, especially because a simple message from user A to user B would not need many writes.
I'm in the planning stages of a .NET service which continually processes incoming messages, which involves various transformations, database inserts and updates, etc. As a whole, the service is huge and complicated, but the individual tasks it performs are small, simple, and well-defined.
For this reason, and in order to allow for easy expansion in future, I want to split the service into several smaller services which basically perform part of the processing before passing it onto the next service in the chain.
In order to achieve this, I need some kind of intermediary messaging system that will pass messages from one service to another. I want this to happen in such a way that if a link in the chain crashes or is taken offline briefly, the messages will begin to queue up and get processed once the destination comes back online.
I've always used message queuing for this type of thing, but have recently been made aware of SQL Service Broker which appears to do something similar. Is SQLSB a viable alternative for this scenario and, if so, would I see any performance benefits by using that instead of standard Message Queuing?
Thanks
It sounds to me like you may be after a service bus architecture. This would provide you with the coordination and fault tolerance you are looking for. I'm most familiar and partial to NServiceBus, but there are others including Mass Transit and Rhino Service Bus.
If most of these steps initiate from a database state and end up in a database update, then merging your message storage with your data storage makes a lot of sense:
a single product to backup/restore
consistent state backups
a single high-availability/disaster recoverability solution (DB mirroring, clustering, log shipping etc)
database scale storage (IO capabilities, size and capacity limitations etc as per the database product characteristics, not the limits of message store products).
a single product to tune, troubleshoot, administer
In addition, there are also serious performance considerations, as having your message store be the same as the data store means you are not required to do two-phase commit on every message interaction. Using a separate message store requires you to enroll the message store and the data store in a distributed transaction (even if it is on the same machine), which requires two-phase commit and is much slower than the single-phase commit of database-only transactions.
In addition using a message store in the database as opposed to an external one has advantages like queryability (run SELECT over the message queues).
Now if we translate the abstract terms 'message store in the database' as being Service Broker and 'non-database message store' as being MSMQ, you can see my point why SSB will run circles around MSMQ any time.
My recent experiences with both approaches (starting with SQL Server Service Broker) led me to a situation in which I long to get my messages out of SQL Server. The problem is quasi-political, but you might want to consider it: SQL Server in my organisation is managed by a specialized DBA, while application servers (i.e. messaging like NServiceBus) are managed by developers and the network team. Any change to the database servers requires painful performance analysis from the DBA and is immersed in fear that our queuing engine living in the same space might drag down the standard SQL workload.
SSSB is pretty difficult to manage (not unlike messaging middleware), but the difference is that I have more room to screw something up in the messaging world (the worst that may happen is some pile of messages building up somewhere and logs filling up), while I can't afford any mistakes in the SQL world, where customer transactional data lives and is vital for the business (including data from legacy systems). I really don't want to get those 'unexpected database growth' or 'wait time alert' or 'why is my temp db growing without end' emails anymore.
I've learned that application servers are cheap. Just add message handlers, add machines... easy. Virtually no license costs. With SQL Server it is exactly the opposite. It now appears to me that using Service Broker for messaging is like using an expensive car to plow a potato field. It is much better for other things.