Sharing Aggregates Between Microservices

I posted a question earlier about how notifications, and users seeing those notifications, could be modelled in DDD.
Link Here: Does everything have to be an aggregate? Many-to-Many Link
For a brief summary of this:
We can raise notifications in the system that we want to show users.
(A notification will target certain users, which you define a filter for, e.g. only show admins, only show normal users, only show users for client X.)
When a user sees a notification we want to mark it so they don't see it again.
A suggestion was made on the post to have a notification aggregate and store the reference to it in the User aggregate.
So when a notification is created, the event will be picked up, and a service will add that notification to the users it sees fit.
So we have a notification list in the user.
I think this is a bounded context (a notification bounded context). Certainly if I were modelling it as a microservice, I would handle this notifications stuff in its own microservice.
If we were to use microservices, the user created event would come from another service (a users service).
Question:
The notification creation would go in the notifications microservice. I would also be tempted to put the user marking that they have seen a notification in that service as well.
So at this point, the notifications microservice wouldn't hold a full aggregate of a user; it would only have a partial one, containing the id, a collection of notifications, and any criteria we might want to filter on.
Is it ok to have an aggregate (be it a partial one, as it is only the stuff we want) in a microservice (the notification microservice) that doesn't own it?
So essentially we have an aggregate of the user in the users microservice and the notifications one.
This doesn't sound too bad, as it is going to reduce the footprint of the user by splitting parts up, and it is nice to bundle this functionality into the service that handles it.
However, do we want to keep all the user stuff in one place, even if it puts other functionality in with it? Would the seeing of a notification go in the user microservice? (That feels wrong.)
Thanks

In short, don't share the aggregates, share projections of those aggregates, which are value objects representing aggregates from other bounded contexts.
Is it ok to have an aggregate (be it a partial one, as it is only the stuff we want) in a microservice (the notification microservice) that doesn't own it?
What you call a "partial aggregate" I would call a projection of an aggregate owned by a separate BC. So this projection can be owned by the BC exposed in the importing microservice. In this sense, yes, it is ok.
So essentially we have an aggregate of the user in the users microservice and the notifications one.
No, you don't. You have User in its BC and you have a projection of the User in the Notifications BC. A BC can own the projections of aggregates from other BCs, but not the foreign aggregates themselves. You only want to project what you need from the other BCs, not everything. If it were everything, then you'd have pretty much broken some fundamental DDD principles. And from a physical perspective, you might be tempted to share databases and so on, which defeats some of the hallmarks of good microservice architecture.
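To make "projection" concrete, here is a minimal sketch (TypeScript; the names UserCreated and UserProjection and their fields are assumptions for illustration, not anything from the question) of what the Notifications BC might own:

// A hypothetical event published by the Users BC.
interface UserCreated {
  userId: string;
  role: "admin" | "normal";
  clientId: string;
}

// The projection the Notifications BC owns: only the fields needed
// to filter notifications, not the full User aggregate.
class UserProjection {
  constructor(
    readonly userId: string,
    readonly role: "admin" | "normal",
    readonly clientId: string,
  ) {}

  // Built from the user-created event coming out of the Users BC.
  static fromEvent(e: UserCreated): UserProjection {
    return new UserProjection(e.userId, e.role, e.clientId);
  }
}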
Question: The notification creation would go in the notifications microservice. I would also be tempted to put the user marking that they have seen a notification in that service as well.
I think this would be OK. It sounds like it from the context of your question (I certainly don't know the whole of what you're doing). Perhaps in your Notifications BC you have a NotifiedUser with a list of Notifications, or perhaps it's the other way around: SeenNotifications with a list of Users.
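As a sketch of the first shape (again TypeScript, all names hypothetical), a NotifiedUser could own which notifications a user has been given and which they have already seen:

class NotifiedUser {
  private pending = new Set<string>();
  private seen = new Set<string>();

  constructor(readonly userId: string) {}

  // Called when a new notification's filter matches this user.
  addNotification(notificationId: string): void {
    if (!this.seen.has(notificationId)) this.pending.add(notificationId);
  }

  // Called when the user views it, so it is never shown again.
  markSeen(notificationId: string): void {
    this.pending.delete(notificationId);
    this.seen.add(notificationId);
  }

  get unseen(): string[] {
    return [...this.pending];
  }
}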

Related

Microservices - Storing user data in separate database

I am building a microservice application that has two separate services: a user service and a comments service. The user service stores the user details like email, first/last name, job title, etc., and the comments service stores all comments made by the user.
In the UI, I need to populate the comments (via a REST API) and show the first/last name, email, and job title of the user.
Is it recommended that we store all these user details in the comments database?
If yes, then every time a user changes their details (first/last name or job title), I will have to update those details in all their comments (I don't think this is a good idea).
If no, then if I store just the userId in the comments DB, how am I supposed to get the user details for each comment? Let's say we want to show 20 comments per page in the UI.
First, challenge the architecture. Let's assume that both services in the question are part of a larger ecosystem of microservices that all make use of the user information; otherwise the separation will almost certainly be overengineered. But from the word "comments" we can at least guess that there is at least one other class of objects, namely the things being commented on. So let's assume a "user service" is a meaningful crumb to break out into a microservice, because at least some other crumbs have the necessary weight to justify the microservice breakup.
In that case I suggest the following strategy:
Second, implement an abstraction layer in your comments service right away so that most of the code will not have to care about where the user data comes from (i.e. don't join or $lookup). This is also a great opportunity for local testing, because you can just create a collection with the data you need and run service-level integration tests against it. (A sketch of such a layer follows these steps.)
Third, for integration with the user service, get the data from there via its API (which should support bulk data selection in any case) every time you need it. Because you have the abstraction layer, you can add caching, cache-timeout and displacement strategies, and whatever else you may need below this abstraction, without caring about it in the main portion of the code. Add such things on an as-needed basis. Keep it simple.
Fourth, when things really go heavyweight and you have to deal with tens of thousands of users, tons of comments, and many requests per second, the comments service could, still below the abstraction, implement an upfront replication pattern to get the full user database locally. This would usually be done with an asynchronous message sent by the user service to all subscribers whenever something changes in the user base. When it suits the subscribers (i.e. the comments service), they can trigger full or (from time to time) delta replication of the changes. Suitable collections will already be in place from what you did for caching. And the information you need in the comments service will probably be considerably less than what is stored in the user service (let alone the hashed password, other login options, or accounting information).
Fifth, should you still hit performance challenges, you can break the abstraction for the few cases you need to and do the join or $lookup.
Follow the steps in order, and stop as soon as the overall assembly works fine. Every step adds considerable complexity, and when you don't need it, don't implement it.
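To make steps two and three concrete, here is a minimal sketch (TypeScript; the names UserDetails, UserDirectory, ApiUserDirectory and the bulk endpoint /users?ids=... are assumptions for illustration, not anything this answer prescribes):

// The subset of user data the comments service actually needs.
interface UserDetails {
  id: string;
  firstName: string;
  lastName: string;
  email: string;
  jobTitle: string;
}

// Step two: the abstraction the rest of the comments code depends on.
interface UserDirectory {
  // Bulk lookup, so one page of 20 comments costs one call.
  getUsers(ids: string[]): Promise<Map<string, UserDetails>>;
}

// Step three: fetch from the user service's API on demand, with a
// naive in-memory cache that can later be tuned, or replaced by the
// step-four replication, without touching any calling code.
class ApiUserDirectory implements UserDirectory {
  private cache = new Map<string, UserDetails>();

  constructor(private baseUrl: string) {}

  async getUsers(ids: string[]): Promise<Map<string, UserDetails>> {
    const missing = ids.filter((id) => !this.cache.has(id));
    if (missing.length > 0) {
      // Assumed bulk endpoint: GET /users?ids=a,b,c
      // (global fetch: Node 18+ or browser)
      const res = await fetch(`${this.baseUrl}/users?ids=${missing.join(",")}`);
      const users: UserDetails[] = await res.json();
      for (const u of users) this.cache.set(u.id, u);
    }
    const result = new Map<string, UserDetails>();
    for (const id of ids) {
      const u = this.cache.get(id);
      if (u) result.set(id, u);
    }
    return result;
  }
}

The rest of the comments code only ever depends on UserDirectory, so the caching policy (or step four's replication) can change without touching it.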

Clean architecture and user logins in Flutter - how do I store user information?

I've been trying to use Reso Coder's Flutter adaptation of Uncle Bob's Clean Architecture.
My app connects to an API, and most requests (other than logging in) require an authentication token.
Furthermore, upon logging in, user profile information (like a name and profile picture) is received.
I need a way to save this data upon login, and use it in both future API requests and my app's UI.
As I'm new to Uncle Bob's Clean Architecture, I'm not quite sure where this data belongs. Here are the ideas I've come up with, all involving storing the data in a User object:
Idea 1: Store the User in the repository layer, in an authentication feature directory. Other repository-level methods can pass it to the appropriate datasource methods.
This seems to make the most sense; other repository-level methods that make other API calls can use the stored User easily, passing it to methods in the data source layer.
If this is the way to go, I'm not quite sure how other features (that use the API) would access the User - is it okay to have a repository depend on another, and pass the authentication repository to the new feature repository?
Idea 2: Store the User in the repository layer, in an authentication feature directory. Other (non-login) use cases can depend on both this repository and on one relevant to their own feature, passing the User to their repository methods.
This also breaks the vertical feature barrier, but it may be cleaner than idea 1.
For both these ideas, here's what my repository looks like:
abstract class AuthenticationRepository {
  /// The current user.
  User get currentUser;

  /// True if logged in.
  bool get loggedIn;

  /// Logs in, saving the [User].
  Future<void> login(AuthenticationParams params);

  /// Logs out, disposing of the [User].
  Future<void> logout();

  /// Same as [logout], but logs out of all devices.
  Future<void> logoutAll();

  /// Retrieves stored login credentials.
  Future<AuthenticationParams> retrieveStoredCredentials();
}
Are these ideas "valid", and are there any better ways of doing it?
I see another option to tackle the problem. The solution I want to talk about comes from domain-driven design and is an event-based approach.
In DDD you have the concept of a bounded context. A business object (Uncle Bob's entity) can have different meanings in different bounded contexts. Take a look at your user business object. The data and methods that one use case uses are often different from the data and methods that other use cases use. That's why you have different user objects in different bounded contexts. They are a kind of perspective that each use case has on the same business object.
If a business object is modified in one bounded context, it can emit a business event. Another feature can listen to those events. The event mechanism can either be a simple observer pattern or, if you need to distribute your application features via microservices, a message queue. In case you use a simple observer pattern, the event emitter and event handler can run within the same data source transaction. But they can also run in different ones. It depends on your needs.
So when the sign-up use case registers a new user, it emits a UserSignedUpEvent. Other features can now listen to this event. The event carries the information of the user, like the email, the name, the profile image, and other information that the user provided during sign-up. Other features can now save the pieces of data they need to their own data source. It can be the same one the sign-up use case uses (just other tables or another schema). But it is also possible that it is a completely different data source, maybe another kind of data source like a NoSQL DB. The part I wrote above about transactions is of course more difficult if you have different data sources.
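A minimal sketch of the observer-pattern variant (TypeScript rather than Dart; the event fields, the bus, and the profile cache are invented for illustration):

// The business event emitted by the sign-up use case.
interface UserSignedUpEvent {
  userId: string;
  email: string;
  name: string;
  profileImageUrl: string;
}

type Handler = (e: UserSignedUpEvent) => void;

// A simple in-process observer: features subscribe, sign-up publishes.
class EventBus {
  private handlers: Handler[] = [];
  subscribe(h: Handler): void {
    this.handlers.push(h);
  }
  publish(e: UserSignedUpEvent): void {
    for (const h of this.handlers) h(e);
  }
}

const bus = new EventBus();

// Another feature saves only the subset of user data it needs.
const profiles = new Map<string, { name: string; imageUrl: string }>();
bus.subscribe((e) =>
  profiles.set(e.userId, { name: e.name, imageUrl: e.profileImageUrl }),
);

// The sign-up use case emits the event after registering the user.
bus.publish({
  userId: "42",
  email: "jane@example.com",
  name: "Jane",
  profileImageUrl: "https://example.com/jane.png",
});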
The main point is that each feature has its own data and manages it. It might be a copy of the whole user information, but in a lot of cases it is only a subset.
The event-based approach can give you perfect modularization. But as always when something looks great, it comes at a cost. You have to duplicate some part or even all of the data. When you think of a microservice architecture where some features live in different microservices, the duplication increases the availability of the service: the service can operate even when the main service that manages the data is down, because a local copy exists. But now you have to deal with consistency issues - eventual consistency.
At this point I like to stop and guide you to other sources for details:
Chapter 8: Domain Events, Implementing Domain-Driven Design, Vaughn Vernon
The many meanings of event-driven architectures, Martin Fowler

Does everything have to be an aggregate? Many-to-Many Link

Say I have two entities
Notifications and Users.
I want to mark that a user has seen a specific notification.
This would commonly be done with a many-to-many relationship
e.g. UserNotification
Because there is no invariant around this relationship (we don't care if "all" users have seen the notification), these users shouldn't be on the notification aggregate.
On the opposite side, the users aggregate doesn't need a list of notifications on it.
So that leads to saying that the UserNotification (this relationship) is an aggregate of its own.
However, because we are never going to reference this thing by Id, does it really belong as one? It seems like just adding an aggregate for storing the data.
What should I do here?
Just make an aggregate anyway and ignore the id?
Put these notifications on the user, or users on notifications? (Does it belong on either, and would putting it on one not add weight and cause concurrency issues?)
Just make a CRUD table?
An aggregate without the id, keeping the composite key (is that allowed?)
thanks
Does a Notification have its own lifecycle? Can a Notification exist without a User to be notified?
I could imagine a Notification simply being a Value Object that gets copied to each affected User.
Have you considered modelling User and Notification as aggregates but NOT modelling the association at all?
There is a high probability of not needing to. The only use case I can come up with is retrieving all notifications of a user. This can be exposed in a repository interface via getNotifications(user: User): Iterable[Notifications] (Scala syntax).
On the write side, saveNotification(notification: Notification, users: List[User]) could save the aggregate as well as populate the n:m table.
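Spelled out slightly further (TypeScript rather than Scala; the shapes are assumptions for illustration), the repository could look like:

interface User { id: string }
interface Notification { id: string; message: string }

interface NotificationRepository {
  // Read side: resolve a user's notifications via the n:m table.
  getNotifications(user: User): Promise<Notification[]>;
  // Write side: persist the aggregate and populate the n:m table.
  saveNotification(notification: Notification, users: User[]): Promise<void>;
}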
EDIT: as an afterthought to this - my solution would introduce a source code dependency from notifications to users (at least on the repository), and your intuition might be right - the notification should not know about the user at all.
But there has to be at least the concept of a Recipient, which may perfectly well reside in the notification "module" or "package". Maybe you are crossing bounded contexts here, and the User entity on one side should be translated to a Recipient value object on the other via an Anti-Corruption Layer.
It's up to you and your domain to decide. In this example it would make perfect sense for the notification package to have some knowledge about a "User" - otherwise, what would be notified?

Client Interaction With Event Sourcing

I have been recently looking into event sourcing and have some questions about the interactions with clients.
So event sourcing sounds great: decoupling all your microservices, keeping your information in immutable events, and formulating stored states off of that to fit your needs is really handy. Having events propagate through your system/services, each reacting to them in its own way, is all fine.
The issue I am having lies with understanding the client interaction.
So you want clients to interact with the system, but they now need to do this via events. They can no longer submit a state to mutate your existing one.
So the question is how do clients fire off specific events and interact with not only an event-based system, but a system based on event sourcing?
My understanding is that you no longer use the REST API as resources (which you can get, update, delete, etc., handling them as resources), but instead post to an endpoint as an event.
So how do these endpoints work?
My second question is: how does the user get responses back?
For instance, let's say we have an event to place an order.
You're going to fire off an event and it's going to do its thing. Again, my understanding is that you don't validate the request up front, e.g. checking if the user placing the order has enough money, but instead fire it off to be placed and it will be handled in the system.
e.g. it will now be:
- order placed
- this will be picked up by the pricing service, which will fire either a "money reserved" or "money exceeded" event based on whether the user can afford it.
- the order service will then listen for those and mark the order accordingly, e.g. as denied for not enough credit.
So because this is an async process and the user has fired and forgotten, how do you then show the user whether it has failed or succeeded? Do you show them an order confirmation page with the order status as it is (even if it's pending)?
Or do you poll until it changes (web sockets or something)?
I'm sorry if a lot of this is nonsense; I am still learning about this architecture and am very much in the mindset of a monolith with REST responses.
Any help would be appreciated.
The issue I am having lies with understanding the client interaction.
Some of the issue may be understanding, but I promise you a fair share of the issue is that the literature sucks.
In particular, the word "Event" gets re-used a lot of different ways. If you aren't paying very careful attention to which meaning is being used, you are going to get knotted.
Event Sourcing is really about persistence - how does a microservice store its private copy of state for later re-use? Instead of destructively overwriting our previous state, we write new information that links back to the previous state. If you imagine each microservice storing each change of state as a commit in its own git repository, you are in the right ballpark.
That's a different animal from using Event Messages to communicate information between one microservice and another.
There's some obvious overlap, of course, because the one message that you are likely to share with other microservices is "I just changed state".
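To push the git analogy a little further, here is a toy event store (TypeScript, entirely illustrative) that appends new facts instead of overwriting state and rejects writes made against stale state, much like a rejected git push:

interface OrderEvent {
  type: "OrderPlaced" | "OrderDenied";
  orderId: string;
}

// Append-only log: every change of state is written, never overwritten.
class EventStore {
  private streams = new Map<string, OrderEvent[]>();

  append(streamId: string, event: OrderEvent, expectedVersion: number): void {
    const stream = this.streams.get(streamId) ?? [];
    // Optimistic concurrency: refuse writes based on an outdated version.
    if (stream.length !== expectedVersion) {
      throw new Error("concurrent modification of " + streamId);
    }
    stream.push(event);
    this.streams.set(streamId, stream);
  }

  // Current state is derived by replaying the history.
  load(streamId: string): OrderEvent[] {
    return this.streams.get(streamId) ?? [];
  }
}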
So how do these endpoints work?
The same way that web forms do. I send you a representation of a form, the client displays the form to you. You fill in your data and submit the form, the client processes the contents of the form, and sends back to me an HTTP request with a "FormSubmitted" event in the message body.
You can achieve similar results by sending new representations of the state, but it's a bit error-prone to strip away the semantic intent and then try to guess it again on the server. So you are more likely to instead see task-based user interfaces, or protocols that clearly identify the semantics of the change.
When the outside world is the authority for some piece of data (a shopper's shipping address, for example), you are more likely to see the more traditional "just edit the existing representation" approach.
So because this is an async process and the user has fired and forgotten, how do you then show the user whether it has failed or succeeded?
Fire and forget really doesn't work for a distributed protocol on an unreliable network. In most cases, at-least-once delivery is important, so Fire until verified is the more common option. The initial acknowledgement of the message might be something like 202 Accepted -- "We received your message, we wrote it down, here's our current progress, here are some links you can fetch for progress reports".
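A sketch of that acknowledgement as an HTTP handler (TypeScript with Express; the routes and payloads are invented for illustration):

import express from "express";
import { randomUUID } from "crypto";

const app = express();
app.use(express.json());

// Accept the command, record it, and point the client at a status
// resource it can poll for progress reports.
app.post("/orders", (req, res) => {
  const orderId = randomUUID();
  // ...persist the command / append the event here...
  res
    .status(202)
    .location(`/orders/${orderId}/status`)
    .json({ orderId, status: "pending" });
});

app.get("/orders/:id/status", (req, res) => {
  // ...look up current progress from the read model...
  res.json({ orderId: req.params.id, status: "pending" });
});

app.listen(3000);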
It doesn't seem to me that event sourcing fits with the traditional REST model where you CRUD a resource.
Jim Webber's 2011 talk may help to prune away the noise. A REST API is a disguise that your domain model wears; you exchange messages about manipulating resources, and as a side effect your domain model does useful work.
One way you could do this that would look more "traditional" is to work with representations of the event stream. I do a GET /08ff2ec9-a9ad-4be2-9793-18e232dbe615 and it returns me a representation of a list of events. I append a new event onto the end of that list, and PUT /08ff2ec9-a9ad-4be2-9793-18e232dbe615, and interesting side effects happen. Or perhaps I instead create a patch document that describes my change, and PATCH /08ff2ec9-a9ad-4be2-9793-18e232dbe615.
But more likely, I would do something else -- instead of GET /08ff2ec9-a9ad-4be2-9793-18e232dbe615 to fetch a representation of the list of events, I'd probably GET /08ff2ec9-a9ad-4be2-9793-18e232dbe615 to fetch a representation of available protocols - which is to say, a document filled with hyperlinks. From there, I might GET /08ff2ec9-a9ad-4be2-9793-18e232dbe615/603766ac-92af-47f3-8265-16f003ce5a09 to obtain a representation of the data collection form. I fill in the details of my event, submit the form, and POST /08ff2ec9-a9ad-4be2-9793-18e232dbe615 the form data to the server.
You can, of course, use any spelling you like for the URI.
In the first case, we need something like an HTTP capable document editor; the second case uses something more like a web browser.
If there were lots of different kinds of events, then the second case might well have lots of different form resources, all submitting POST /08ff2ec9-a9ad-4be2-9793-18e232dbe615 requests.
(You don't have to have all of the forms submitting to the same URI, but there are advantages to consider).
In a non-event-sourcing pattern I guess that would first be put into the database, and then the event gets raised.
Even when you aren't event sourcing, there may still be some advantages to committing events to your durable store before emitting them. See Pat Helland: Data on the Outside versus Data on the Inside.
So you want clients to interact with the system, but they now need to do this via events.
Clients don't have to. A client may not even be aware of the underlying event store.
There are a number of trade-offs to consider and decisions to take when implementing an event-sourced system. To start with, you can try to name a few pre-computer-era examples of event-sourced systems and look at their non-functional characteristics.
So the question is how do clients fire off specific events
Clients don't send events. Rather, they should express an intent (a command). It is then the responsibility of the event-sourced system to validate the intent and either reject it, or accept it and store the corresponding event. That means an intent to change the system's state was accepted, and the stored event confirms the change.
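A bare-bones version of that accept-or-reject step (TypeScript; the command and event shapes are made up for illustration):

interface PlaceOrder {
  kind: "PlaceOrder";
  userId: string;
  amount: number;
}

interface OrderPlaced {
  kind: "OrderPlaced";
  userId: string;
  amount: number;
}

// Validate the intent against current state; only an accepted intent
// results in a stored event confirming the change.
function handlePlaceOrder(cmd: PlaceOrder, availableFunds: number): OrderPlaced {
  if (cmd.amount > availableFunds) {
    throw new Error("rejected: insufficient funds");
  }
  return { kind: "OrderPlaced", userId: cmd.userId, amount: cmd.amount };
}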
My understanding is that you no longer use the REST API as resources
REST is one of the options. You just consider different things as resources. A command can be a REST resource. An event-sourced entity can be a resource to which you POST a command. If you like it async, you can later GET the command to check its status. You can GET an entity to know its current state. You can GET events from a class of entities as a means of subscription.
If we are talking about an end user, then most likely they don't deal with the event store directly. There is some third tier in between, which does CQRS. From a user client's perspective it can be provided with REST, GraphQL, SOAP, gRPC or even e-mail - whatever transport solution you find suitable. The command-processing part of CQRS is what is specifically domain-driven. It decides which intents to accept and which to reject.
The event store itself is responsible for data consistency, i.e. it should not allow two concurrent events leading to an invalid state to be published. This is what pre-computer event-sourced systems are good at: you usually have some physical object as an entity, so you lock it for update by just getting hold of it.
Then an end-user client usually reads from some prepared read model. The responsibility of the read (R in CQRS) component is to prepare read-optimised data for clients. This data may come from multiple event-sourced entities of the same or different classes. Again, the client may interact with the read model over whatever transport is suitable.
While an event store is consistent, and consistent immediately, a read model is eventually consistent. But it's up to you to tune this eventuality.
Just try to throw REST out of the architecture for a while. Consider it one of the available transport options - that may help you look at the root of the matter.

Converting resources in a RESTful manner

I'm currently stuck with designing my endpoints in a way that conforms to REST principles but also ensures the integrity of the underlying data.
I have two resources, ShadowUser and RealUser, where the first one only has a first name, last name and an e-mail address.
The second one has many more properties, such as an Id under which the real user can be addressed elsewhere in the system.
My use case is to convert specific ShadowUsers into real users.
In my head the flow seems pretty simple:
get the shadow users: GET /api/ShadowUsers?someProperty=someValue
create new real users with the data fetched: POST /api/RealUsers
delete the shadow users: DELETE /api/ShadowUsers?someProperty=someValue
But what happens when there is a problem between the creation of the new users and the deletion of the shadow ones? The data would then be inconsistent.
The example is even simpler when there is only a single user, but the issue stays the same: something could happen between steps 2 and 3, leaving the user existing as both shadow and real.
So the question is how this can be done in a "transactional" manner, where either everything goes well and is persisted, or something goes wrong and nothing is changed in the underlying data store.
Are there any "best practices" or "design-patterns" which can be used?
Perhaps this is the role of the RESTful API: GETting and POSTing those real users in batch (I asked a question some weeks ago about a related issue: Updating RESTful resources against aggregate roots only).
On the API side, POSTed users wouldn't be handled directly; they would be enqueued in a reliable message queue (for example RabbitMQ). A background process would be subscribed to the queue and would process both the creation of the real users and the removal of the shadow ones.
The point of using a reliable messaging system is that you can implement retry policies. If the operation is interrupted in the middle of its work, you can retry it and detect which changes are still pending to complete the task.
In summary, using this approach you can implement that operation in a transactional way.
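A sketch of that pipeline (TypeScript; the in-memory queue is a stand-in for a broker like RabbitMQ, and the idempotent operations are assumptions for illustration):

interface ConvertUser {
  shadowUserId: string;
  attempts: number;
}

const queue: ConvertUser[] = []; // stand-in for a durable message broker

// The API handler only enqueues; the client gets an immediate ack.
function postRealUser(shadowUserId: string): void {
  queue.push({ shadowUserId, attempts: 0 });
}

// Background worker: each step is idempotent, so retrying after a
// crash between steps still converges on "real user exists, shadow gone".
async function worker(): Promise<void> {
  while (queue.length > 0) {
    const msg = queue.shift()!;
    try {
      await createRealUserIfMissing(msg.shadowUserId);
      await deleteShadowUserIfPresent(msg.shadowUserId);
    } catch {
      // Retry policy: requeue up to five times before giving up.
      if (msg.attempts < 5) queue.push({ ...msg, attempts: msg.attempts + 1 });
    }
  }
}

// Hypothetical idempotent operations against the two resources.
async function createRealUserIfMissing(id: string): Promise<void> { /* ... */ }
async function deleteShadowUserIfPresent(id: string): Promise<void> { /* ... */ }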