We have (roughly) the following architecture:
1. The application service does the infrastructure job: it fetches data from repositories hidden behind interfaces.
2. An object graph is created and passed to the appropriate domain service.
3. The domain service does its thing and raises the appropriate events.
4. The events are handled in different application services, which perform persistence operations (altering repositories, sending e-mails, etc.).
However, the domain service (3) has become so complex that it requires data from different external APIs, but only if particular conditions are satisfied. For example, if Product X is of type Car, we need to know the price of that car model from some external CatalogService (an invented example) hidden behind ICatalogService. This operation is a potentially expensive one (a REST call).
How do we go about this?
Do we (a) pre-fetch all the data in the application service listed as (1), even though we might not need it, or (b) inject the interface ICatalogService into the domain service and fetch the data only when needed? The latter solution might create performance issues if some other client of the domain service calls it repeatedly without knowing there is a REST call hidden inside it.
Or did we simply get the domain model wrong?
This question is related to Domain Driven Design.
How do we go about this?
There are two common patterns.
One is to pass the capability to make the query into the domain model, allowing the model to fetch the information itself when it is needed. What this will usually look like is defining an interface / a contract that will be consumed by the domain model, but implemented in the application/infrastructure layers.
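For illustration, here is a minimal sketch of that first pattern, reusing the question's ICatalogService; the Product shape and all method names are invented for this example:

import java.math.BigDecimal;

// The contract lives in the domain layer; the implementation (the actual
// REST client) lives in infrastructure and is injected from outside.
interface ICatalogService {
    BigDecimal priceFor(String modelId);
}

// Product reduced to the two members this example needs.
record Product(boolean isCar, String modelId) {}

class PricingDomainService {
    private final ICatalogService catalog;

    PricingDomainService(ICatalogService catalog) {
        this.catalog = catalog;
    }

    void evaluate(Product product) {
        if (product.isCar()) {
            // The expensive remote call happens only when the rule demands it.
            BigDecimal price = catalog.priceFor(product.modelId());
            // ... apply domain rules using the price, raise events ...
        }
    }
}

The domain model never learns that a REST call is involved; it only sees the contract.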
The other is to extend the protocol between the domain model and the application, so that we can signal to the application layer what information is needed, and then the application code can decide how to provide it. You end up with something like a state machine for the processes, with the application code coordinating the exchange of information between the external api and the domain model.
If you use a bit of imagination, you've already got a state machine something like this, since your application code is already coordinating the movement of inputs between the repository and the domain model. The difference, of course, is that the existing "state machine" is simple and linear enough that it may not be obvious that there is a state machine present at all.
How exactly would you signal the application layer?
With simple queries; which is to say, the application code pulls the information it needs out of the domain model and uses that information to compute the next action. When the action is completed, the application code pushes information back into the domain model.
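A hedged sketch of that pull/push loop (a fragment, reusing ICatalogService from the sketch above; PricingModel and all of its methods are invented for illustration):

// Application-layer coordination: pull what the model still needs,
// perform the expensive call, push the answer back in.
void process(PricingModel model, ICatalogService catalogService) {
    while (model.needsCatalogPrice()) {
        // The model signals *what* it needs; the application decides *how*.
        String modelId = model.requestedModelId();
        BigDecimal price = catalogService.priceFor(modelId); // the REST call
        model.supplyCatalogPrice(modelId, price);
    }
    model.complete(); // the model raises its events once it has everything
}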
There isn't enough information to give you good, targeted advice. I suspect you need to refactor your domains into further subdomains. It sounds like your domain service has way more than one responsibility. Keep the service simple.
In addition, if you have a long-running task, like a service call that takes a long time, you need to architect it away. The most supple design will not keep the consumer waiting: it will return immediately with some sort of result, even if that is simply a periodic status update.
I am thinking about building a REST API with both websockets and HTTP, where I use websockets to tell the client that new data is available, or to provide the new data to the client directly.
Here are some different ideas of how it could work:
ws = websocket
Idea A:
David gets all users with GET /users
Jacob adds a user with POST /users
A ws message is sent to all clients with the info that a new user exists
David receives the message via ws and calls GET /users
Idea B:
David gets all users with GET /users
David registers to get ws updates when a change is made to /users
Jacob adds a user with POST /users
The new user is sent to David via ws
Idea C:
David gets all users with GET /users
David registers to get ws updates when a change is made to /users
Jacob adds a user with POST /users and it gets the id 4
David receives the id 4 of the new user via ws
David gets the new user with GET /users/4
Idea D:
David gets all users with GET /users
David registers to get ws updates when changes are made to /users
Jacob adds a user with POST /users
David receives a ws message that changes have been made to /users
David gets only the delta by calling GET /users?lastcall='time of step one'
Which alternative is the best and what are the pros and cons?
Is there another, better 'Idea E'?
Do we even need to use REST, or is ws enough for all data?
Edit
To solve problems with data getting out of sync, we could provide the "If-Unmodified-Since" header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-Unmodified-Since) or "ETag" (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag), or both, with PUT requests.
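As a side note, when pairing an ETag with a write, the request header to send is If-Match (If-Unmodified-Since is the timestamp-based equivalent). A minimal sketch of that optimistic-concurrency handshake, shown with Java's built-in HttpClient purely for illustration; the URL and payload are placeholders:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ConditionalPut {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        URI user = URI.create("https://api.example.com/users/4"); // placeholder

        // 1. GET the resource and remember its ETag.
        HttpResponse<String> res = client.send(
                HttpRequest.newBuilder(user).build(),
                HttpResponse.BodyHandlers.ofString());
        String etag = res.headers().firstValue("ETag").orElseThrow();

        // 2. PUT it back with If-Match; the server answers 412 Precondition
        //    Failed if someone else changed the resource in the meantime.
        HttpRequest put = HttpRequest.newBuilder(user)
                .header("If-Match", etag)
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString("{\"name\":\"David\"}"))
                .build();
        HttpResponse<String> putRes = client.send(put, HttpResponse.BodyHandlers.ofString());
        if (putRes.statusCode() == 412) {
            // Out of sync: refetch, merge, and retry.
        }
    }
}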
Idea B is for me the best, because the client specifically subscribes for changes in a resource, and gets the incremental updates from that moment.
Do we even need to use REST, or is ws enough for all data?
Please check: WebSocket/REST: Client connections?
I don't know Java, but I worked with both Ruby and C on these designs...
Funny enough, I think the easiest solution is to use JSON, where the REST API simply adds the method data (i.e. method: "POST") to the JSON and forwards the request to the same handler the Websocket uses.
The underlying API's response (the response from the API handling JSON requests) can be translated to any format you need, such as HTML rendering... though I would consider simply returning JSON for most use cases.
This helps encapsulate the code and keep it DRY while accessing the same API using both REST and Websockets.
As you might infer, this design makes testing easier, since the underlying API that handles the JSON can be tested locally without the need to emulate a server.
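A hedged sketch of that folding, using Jackson; ApiHandler and its envelope fields (method, path) are invented names here, not a real framework API:

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.node.ObjectNode;

class ApiHandler {
    private final ObjectMapper mapper = new ObjectMapper();

    // The single entry point both transports call; the websocket handler
    // passes its parsed frames straight in here.
    JsonNode handle(JsonNode request) {
        String method = request.path("method").asText("GET");
        String path = request.path("path").asText("/");
        // ... route on (method, path) and build a JSON response ...
        ObjectNode response = mapper.createObjectNode();
        response.put("handled", method + " " + path);
        return response;
    }

    // REST adapter: fold the HTTP method into the JSON envelope and
    // forward to the same handler the websocket uses.
    JsonNode handleHttp(String httpMethod, String path, JsonNode body) {
        ObjectNode envelope = body.deepCopy();
        envelope.put("method", httpMethod);
        envelope.put("path", path);
        return handle(envelope);
    }
}

Because the handler only ever sees JSON in and JSON out, it can be unit-tested locally without emulating a server, as noted above.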
Good Luck!
P.S. (Pub/Sub)
As for the Pub/Sub, I find it best to have a "hook" for any update API calls (a callback) and a separate Pub/Sub module that handles these things.
I also find it more resource friendly to write the whole data to the Pub/Sub service (option B) instead of just a reference number (option C) or an "update available" message (options A and D).
In general, I also believe that sending the whole user list isn't effective for larger systems. Unless you have 10-15 users, the database call might be a bust. Consider the Amazon admin calling for a list of all users... Brrr....
Instead, I would consider dividing this to pages, say 10-50 users a page. These tables can be filled using multiple requests (Websocket / REST, doesn't matter) and easily updated using live Pub/Sub messages or reloaded if a connection was lost and reestablished.
EDIT (REST vs. Websockets)
As for REST vs. Websockets... I find the question of need is mostly a subset of the question "who's the client?"...
However, once the logic is separated from the transport layer, supporting both is very easy, and often it makes more sense to support both.
I should note that Websockets often have a slight edge when it comes to authentication (credentials are exchanged once per connection instead of once per request). I don't know if this is a concern.
For the same reason (as well as others), Websockets usually have an edge with regard to performance... how big an edge over REST depends on the REST transport layer (HTTP/1.1, HTTP/2, etc.).
Usually these things are negligible when it comes time to offer a public API access point and I believe implementing both is probably the way to go for now.
To summarize your ideas:
A: Send a message to all clients when a user edits data on the server. All users then request an update of all data.
-This system may make a lot of unnecessary server calls on behalf of clients who are not using the data. I don't recommend producing all of that extra traffic as processing and sending those updates could become costly.
B: After a user pulls data from the server, they then subscribe to updates from the server which sends them information about what has changed.
-This saves a lot of server traffic, but if you ever get out of sync, you're going to be posting incorrect data to your users.
C: Users who subscribe to data updates are sent information about which data has been updated, then fetch it again themselves.
-This is the worst of A and B in that you'll have extra round trips between your users and servers just to notify them that they need to make a request for information which may be out of sync.
D: Users who subscribe to updates are notified when any changes are made and then request the last change made to the server.
-This presents all of the problems with C, but includes the possibility that, once out of sync, you may send data that will be nonsense to your users which might just crash the client side app for all we know.
I think that this option E would be best:
Every time data changes on the server, send the contents of all the data to the clients who have subscribed to it. This limits the traffic between your users and the server while also giving them the least chance of having out of sync data. They might get stale data if their connection drops, but at least you wouldn't be sending them something like Delete entry 4 when you aren't sure whether or not they got the message that entry 5 just moved into slot 4.
Some Considerations:
How often does the data get updated?
How many users need to be updated each time an update occurs?
What are your transmission costs? If you have users on mobile devices with slow connections, that will affect how often and how much you can afford to send to them.
How much data gets updated in a given update?
What happens if a user sees stale data?
What happens if a user gets data out of sync?
Your worst case scenario would be something like this: Lots of users, with slow connections who are frequently updating large amounts of data that should never be stale and, if it gets out of sync, becomes misleading.
I personally have used Idea B in production and am very satisfied with the results. We use http://www.axonframework.org/, so every change or creation of an entity is published as an event throughout the application. These events are then used to update several read models, which are basically simple MySQL tables backing one or more queries. I added some interceptors to the event processors that update these read models, so that they publish the events they just processed after the data is committed to the DB.
Publishing of events is done through STOMP over web sockets. It is made very simple if you use Spring's Web Socket support (https://docs.spring.io/spring/docs/current/spring-framework-reference/html/websocket.html). This is how I wrote it:
@Override
protected void dispatch(Object serializedEvent, String topic, Class eventClass) {
    Map<String, Object> headers = new HashMap<>();
    headers.put("eventType", eventClass.getName());
    messagingTemplate.convertAndSend("/topic" + topic, serializedEvent, headers);
}
I wrote a little configurer that uses Spring's bean factory API so that I can annotate my Axon event handlers like this:
@PublishToTopics({
    @PublishToTopic(value = "/salary-table/{agreementId}/{salaryTableId}", eventClass = SalaryTableChanged.class),
    @PublishToTopic(
        value = "/salary-table-replacement/{agreementId}/{activatedTable}/{deactivatedTable}",
        eventClass = ActiveSalaryTableReplaced.class
    )
})
Of course, that is just one way to do it. Connecting on the client side may look something like this:
var connectedClient = $.Deferred();

function initialize() {
    var basePath = ApplicationContext.cataDirectBaseUrl().replace(/^https/, 'wss');
    var accessToken = ApplicationContext.accessToken();
    var socket = new WebSocket(basePath + '/wss/query-events?access_token=' + accessToken);
    var stompClient = Stomp.over(socket);
    stompClient.connect({}, function () {
        connectedClient.resolve(stompClient);
    });
}

this.subscribe = function (topic, callBack) {
    connectedClient.then(function (stompClient) {
        stompClient.subscribe('/topic' + topic, function (frame) {
            callBack(frame.headers.eventType, JSON.parse(frame.body));
        });
    });
};

initialize();
Another option is to use Firebase Cloud Messaging:
Using FCM, you can notify a client app that new email or other data is available to sync.
How does it work?
An FCM implementation includes two main components for sending and receiving:
A trusted environment such as Cloud Functions for Firebase or an app server on which to build, target and send messages.
An iOS, Android, or Web (JavaScript) client app that receives messages.
The client registers its Firebase key with a server. When updates are available, the server sends a push notification to the Firebase key associated with the client. The client may receive data in the notification structure, or sync with the server after receiving the notification.
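A hedged sketch of the server side using the Firebase Admin SDK for Java (it assumes FirebaseApp.initializeApp has run at startup; the data keys are invented):

import com.google.firebase.messaging.FirebaseMessaging;
import com.google.firebase.messaging.Message;

public class Notifier {
    // registrationToken is the Firebase key the client registered earlier.
    public static void notifyClient(String registrationToken) throws Exception {
        Message message = Message.builder()
                .putData("resource", "/users")  // invented keys: tell the client what changed
                .putData("change", "created")
                .setToken(registrationToken)
                .build();
        // Assumes FirebaseApp.initializeApp(...) was called at startup.
        String messageId = FirebaseMessaging.getInstance().send(message);
        System.out.println("Sent message: " + messageId);
    }
}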
Generally, you might have a look at current "realtime" web frameworks like MeteorJS, which tackle exactly this problem.
Meteor specifically works more or less like your example D, with subscriptions on certain data and deltas being sent out after changes, only to the affected clients. The protocol it uses is called DDP, which additionally sends the deltas not as overhead-prone HTML but as raw data.
If websockets are not available, fallbacks like long polling or server-sent events can be used.
If you plan to implement it yourself, I hope these sources provide some inspiration for how this problem has been approached. As already stated, the specific use case is important.
The answer depends on your use case. For the most part, though, I've found that you can implement everything you need with sockets, as long as you are only trying to access your server with clients that support sockets. Also, scale can be an issue when you're using only sockets. Here are some examples of how you could use just sockets.
Server side:
socket.on('getUsers', () => {
    // Get users from db or data model (save as user_list).
    socket.emit('users', user_list);
})

socket.on('createUser', (user_info) => {
    // Create user in db or data model (save created user as user_data).
    io.sockets.emit('newUser', user_data);
})
Client side:
socket.on('newUser', () => {
    // A new user exists; ask the server for the fresh list.
    socket.emit('getUsers');
})
socket.on('users', (users) => {
// Do something with users
})
This uses socket.io for node. I'm not sure what your exact scenario is but this would work for that case. If you need to include REST endpoints that would be fine too.
With all the great information the great people added before me, I found that eventually there is no right or wrong; it simply comes down to what suits your needs:
Let's take CRUD in this scenario:
WS Only Approach:
Create/Read/Update/Delete information all goes through the websocket.
--> e.g. if you have critical performance considerations and it is not acceptable for the web client to make successive REST requests to fetch information, or if you know that you want the whole data to be visible in the client no matter what the event was, then just send the CRUD events AND the DATA inside the websocket.
WS to send event info + REST to consume the data itself:
Create/Read/Update/Delete event information is sent over the websocket, giving the web client the information necessary to send the proper REST request to fetch exactly the thing the CRUD changed on the server.
e.g. the WS sends UsersListChangedEvent {"ListChangedTrigger": "ItemModified", "IdOfItem": "XXXX#3232", "UserExtraInformation": "Enough info to let the client decide if it is relevant for it to fetch the changed data"}
I found that using WS [only for the event data] and REST [to consume the data itself] is better because:
[1] Separation between the read and write models. Imagine you want to add some runtime information when your data is retrieved from REST; that is now achievable because you are not mixing the write and read models as in the WS-only approach.
[2] Let's say another platform, not necessarily a web client, will consume this data. Then you just change the event trigger from WS to the new mechanism, and keep using REST to consume the data.
[3] The client does not need two code paths to read new/modified data. Usually there is also code that reads the data when the page loads, not through the websocket; that code can now be used twice: once when the page loads, and again when the WS triggers the specific event.
[4] Maybe the client does not want to fetch the new user, because it is currently showing only a view of old data [e.g. users], and the new data changes are not in its interest to fetch.
I prefer A; it allows the client the flexibility to decide whether or not to update the existing data.
Also, with this method, implementation and access control become much easier.
For example, you can simply broadcast the userUpdated event to all users. This saves having to keep a client list for specific broadcasts, and the access controls and authentication applied to your REST route won't have to be reapplied, because the client is going to make a GET request again.
Many things depend on what kind of application you are making.
Let's say we have a User, Wallet REST microservices and an API gateway that glues things together. When Bob registers on our website, our API gateway needs to create a user through the User microservice and a wallet through the Wallet microservice.
Now here are a few scenarios where things could go wrong:
User Bob creation fails: that's OK, we just return an error message to Bob. We're using SQL transactions, so no one ever saw Bob in the system. Everything's good :)
User Bob is created but before our Wallet can be created, our API gateway hard crashes. We now have a User with no wallet (inconsistent data).
User Bob is created and as we are creating the Wallet, the HTTP connection drops. The wallet creation might have succeeded or it might have not.
What solutions are available to prevent this kind of data inconsistency from happening? Are there patterns that allow transactions to span multiple REST requests? I've read the Wikipedia page on Two-phase commit which seems to touch on this issue but I'm not sure how to apply it in practice. This Atomic Distributed Transactions: a RESTful design paper also seems interesting although I haven't read it yet.
Alternatively, I know REST might just not be suited for this use case. Would perhaps the correct way to handle this situation be to drop REST entirely and use a different communication protocol like a message queue system? Or should I enforce consistency in my application code (for example, by having a background job that detects inconsistencies and fixes them, or by having a "state" attribute on my User model with "creating", "created" values, etc.)?
What doesn't make sense:
distributed transactions with REST services. REST services by definition are stateless, so they should not be participants in a transactional boundary that spans more than one service. Your user registration use case scenario makes sense, but the design with REST microservices to create User and Wallet data is not good.
What will give you headaches:
EJBs with distributed transactions. It's one of those things that work in theory but not in practice. Right now I'm trying to make a distributed transaction work for remote EJBs across JBoss EAP 6.3 instances. We've been talking to RedHat support for weeks, and it didn't work yet.
Two-phase commit solutions in general. I think the 2PC protocol is a great algorithm (many years ago I implemented it in C with RPC). It requires comprehensive fail recovery mechanisms, with retries, state repository, etc. All the complexity is hidden within the transaction framework (ex.: JBoss Arjuna). However, 2PC is not fail proof. There are situations the transaction simply can't complete. Then you need to identify and fix database inconsistencies manually. It may happen once in a million transactions if you're lucky, but it may happen once in every 100 transactions depending on your platform and scenario.
Sagas (Compensating transactions). There's the implementation overhead of creating the compensating operations, and the coordination mechanism to activate compensation at the end. But compensation is not fail proof either. You may still end up with inconsistencies (= some headache).
What's probably the best alternative:
Eventual consistency. Neither ACID-like distributed transactions nor compensating transactions are fail proof, and both may lead to inconsistencies. Eventual consistency is often better than "occasional inconsistency". There are different design solutions, such as:
You may create a more robust solution using asynchronous communication. In your scenario, when Bob registers, the API gateway could send a message to a NewUser queue, and reply right away to the user saying "You'll receive an email to confirm the account creation." A queue consumer service could process the message, perform the database changes in a single transaction, and send the email to Bob to notify him of the account creation (a minimal sketch of such a consumer follows this list).
The User microservice creates the user record and a wallet record in the same database. In this case, the wallet store in the User microservice is a replica of the master wallet store only visible to the Wallet microservice. There's a data synchronization mechanism that is trigger-based or kicks in periodically to send data changes (e.g., new wallets) from the replica to the master, and vice-versa.
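Here is the promised sketch of the queue consumer from the first solution, using Spring JMS; the queue name, payload handling, and transaction wiring are assumptions, not a prescribed setup:

import org.springframework.jms.annotation.JmsListener;
import org.springframework.stereotype.Component;
import org.springframework.transaction.annotation.Transactional;

@Component
class NewUserConsumer {

    // Redelivered on failure, so the handler should be idempotent.
    @JmsListener(destination = "NewUser")
    @Transactional // user and wallet rows commit together, in one database
    public void onNewUser(String registrationPayload) {
        // 1. parse the registration payload
        // 2. insert the user record
        // 3. insert the wallet record
        // 4. after the commit, send the confirmation email to Bob
    }
}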
But what if you need synchronous responses?
Remodel the microservices. If the solution with the queue doesn't work because the service consumer needs a response right away, then I'd rather remodel the User and Wallet functionality to be collocated in the same service (or at least in the same VM to avoid distributed transactions). Yes, it's a step farther from microservices and closer to a monolith, but will save you from some headache.
This is a classic question I was asked during a recent interview: how to call multiple web services and still preserve some kind of error handling in the middle of the task. Today, in high-performance computing, we avoid two-phase commits. I read a paper many years ago about what was called the "Starbucks model" for transactions: think about the process of ordering, paying, preparing and receiving the coffee you order at Starbucks... I am oversimplifying things, but a two-phase commit model would suggest that the whole process would be a single wrapping transaction for all the steps involved until you receive your coffee. However, with this model, all employees would wait and stop working until you got your coffee. You see the picture?
Instead, the "Starbuck model" is more productive by following the "best effort" model and compensating for errors in the process. First, they make sure that you pay! Then, there are message queues with your order attached to the cup. If something goes wrong in the process, like you did not get your coffee, it is not what you ordered, etc, we enter into the compensation process and we make sure you get what you want or refund you, This is the most efficient model for increased productivity.
Sometimes Starbucks wastes a coffee, but the overall process is efficient. There are other tricks to think about when you build your web services, like designing them so that they can be called any number of times and still provide the same end result. So, my recommendation is:
Don't be too fine-grained when defining your web services (I am not convinced about the microservice hype happening these days: too many risks of going too far);
Async increases performance, so prefer being async; send notifications by email whenever possible;
Build more intelligent services to make them "recallable" any number of times, processing with a uid or taskid that will follow the order bottom-to-top until the end, validating business rules at each step;
Use message queues (JMS or others) and divert to error-handling processors that will "rollback" by applying opposite operations; by the way, working with async ordering will require some sort of queue to validate the current state of the process, so consider that;
As a last resort (since it may not happen often), put it in a queue for manual processing of errors.
Let's go back to the initial problem that was posted: create an account, create a wallet, and make sure everything was done.
Let's say a web service is called to orchestrate the whole operation.
Pseudo code of the web service would look like this:
1. Call the Account creation microservice, passing it some information and some unique task id.
1.1 The Account creation microservice first checks whether that account was already created. A task id is associated with the account's record. The microservice detects that the account does not exist, so it creates it and stores the task id. NOTE: this service can be called 2000 times and will always produce the same result. The service answers with a "receipt" that contains the minimal information needed to perform an undo operation if required (a sketch of this idempotent create/undo pair follows these steps).
2. Call Wallet creation, giving it the account id and the task id. Let's say a condition is not valid and the wallet creation cannot be performed. The call returns with an error, but nothing was created.
3. The orchestrator is informed of the error. It knows it needs to abort the account creation, but it will not do it itself. It will ask the account service to do it, passing the "minimal undo receipt" received at the end of step 1.
4. The Account service reads the undo receipt and knows how to undo the operation; the undo receipt may even include information about another microservice it could call itself to do part of the job. In this situation, the undo receipt could contain the account id and possibly some extra information required to perform the opposite operation. In our case, to simplify things, let's say it simply deletes the account using its account id.
5. Now, let's say the web service never received the success or failure (in this case) notification that the account creation's undo was performed. It will simply call the account's undo service again. And this service should normally never fail, because its goal is for the account to no longer exist. So it checks whether the account exists, sees there is nothing left to undo, and returns that the operation is a success.
6. The web service returns to the user that the account could not be created.
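Here is the sketch promised in step 1.1: the idempotent create/undo pair, with an in-memory map standing in for the database (Java 16+ for the record syntax; all names are invented):

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class AccountService {

    // taskId -> accountId; stands in for "the task id stored with the record".
    private final Map<String, String> byTaskId = new ConcurrentHashMap<>();

    // Idempotent create: calling this 2000 times with the same taskId
    // always yields the same receipt.
    public Receipt createAccount(String taskId, String accountInfo) {
        String accountId = byTaskId.computeIfAbsent(
                taskId, t -> UUID.randomUUID().toString()); // insert the row here
        return new Receipt(accountId, taskId);
    }

    // Undo is idempotent too: "the account no longer exists" is a success
    // whether or not this particular call actually deleted anything.
    public void undo(Receipt receipt) {
        byTaskId.remove(receipt.taskId()); // delete the row here
    }

    // The minimal undo receipt handed back to the orchestrator.
    public record Receipt(String accountId, String taskId) {}
}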
This is a synchronous example. We could have managed it in a different way and put the case into a message queue targeted to the help desk, if we don't want the system to completely recover from the error. I've seen this done in a company where not enough hooks could be provided to the back-end system to correct situations. The help desk received messages containing what had been performed successfully, and had enough information to fix things, just as our undo receipt could be used in a fully automated way.
I performed a search, and the Microsoft web site has a pattern description for this approach. It is called the compensating transaction pattern:
Compensating transaction pattern
All distributed systems have trouble with transactional consistency. The best way to do this is, as you said, to have a two-phase commit. Have the wallet and the user be created in a pending state. After both are created, make a separate call to activate the user.
This last call should be safely repeatable (in case your connection drops).
This will necessitate that the last call know about both tables (so that it can be done in a single JDBC transaction).
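A hedged sketch of that single-transaction activation with plain JDBC; the table and column names are invented:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class Activator {

    // Flip user and wallet from PENDING to ACTIVE atomically.
    // Safe to repeat: rows that are already ACTIVE simply match again.
    public void activate(Connection conn, long userId) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement users = conn.prepareStatement(
                 "UPDATE users SET status = 'ACTIVE' WHERE id = ?");
             PreparedStatement wallets = conn.prepareStatement(
                 "UPDATE wallets SET status = 'ACTIVE' WHERE user_id = ?")) {
            users.setLong(1, userId);
            users.executeUpdate();
            wallets.setLong(1, userId);
            wallets.executeUpdate();
            conn.commit(); // both rows activate together, or not at all
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}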
Alternatively, you might want to think about why you are so worried about a user without a wallet. Do you believe this will cause a problem? If so, maybe having those as separate REST calls is a bad idea. If a user shouldn't exist without a wallet, then you should probably add the wallet to the user (in the original POST call to create the user).
IMHO one of the key aspects of microservices architecture is that the transaction is confined to the individual microservice (Single responsibility principle).
In the current example, the user creation would be its own transaction. User creation would push a USER_CREATED event onto an event queue. The wallet service would subscribe to the USER_CREATED event and carry out the wallet creation.
If my wallet were just another bunch of records in the same SQL database as the user, then I would probably place the user and wallet creation code in the same service and handle it using the normal database transaction facilities.
It sounds to me like you are asking about what happens when the wallet creation code requires you to touch another system or systems. I'd say it all depends on how complex and/or risky the creation process is.
If it's just a matter of touching another reliable datastore (say, one that can't participate in your SQL transactions), then depending on the overall system parameters I might be willing to risk the vanishingly small chance that the second write won't happen. I might do nothing but raise an exception and deal with the inconsistent data via a compensating transaction, or even some ad-hoc method. As I always tell my developers: "if this sort of thing is happening in the app, it won't go unnoticed".
As the complexity and risk of wallet creation increases you must take steps to ameliorate the risks involved. Let's say some of the steps require calling multiple partner apis.
At this point you might introduce a message queue along with the notion of partially constructed users and/or wallets.
A simple and effective strategy for making sure your entities eventually get constructed properly is to have the jobs retry until they succeed, but a lot depends on the use cases for your application.
I would also think long and hard about why I had a failure-prone step in my provisioning process.
One simple solution is to create the user via the User service and use a messaging bus onto which the User service emits its events; the Wallet service registers on the messaging bus, listens for the User Created event, and creates a wallet for the user. In the meantime, if the user goes to the wallet UI to see his wallet, check whether the user was just created and show that wallet creation is in progress, asking him to check back in some time.
What solutions are available to prevent this kind of data inconsistency from happening?
Traditionally, distributed transaction managers are used. A few years ago in the Java EE world you might have created these services as EJBs which were deployed to different nodes and your API gateway would have made remote calls to those EJBs. The application server (if configured correctly) automatically ensures, using two phase commit, that the transaction is either committed or rolled back on each node, so that consistency is guaranteed. But that requires that all the services be deployed on the same type of application server (so that they are compatible) and in reality only ever worked with services deployed by a single company.
Are there patterns that allow transactions to span multiple REST requests?
For SOAP (OK, not REST), there is the WS-AT specification, but no service that I have ever had to integrate supports it. For REST, JBoss has something in the pipeline. Otherwise, the "pattern" is to either find a product which you can plug into your architecture, or build your own solution (not recommended).
I have published such a product for Java EE: https://github.com/maxant/genericconnector
According to the paper you reference, there is also the Try-Cancel/Confirm pattern and an associated product from Atomikos.
BPEL Engines handle consistency between remotely deployed services using compensation.
Alternatively, I know REST might just not be suited for this use case. Would perhaps the correct way to handle this situation to drop REST entirely and use a different communication protocol like a message queue system?
There are many ways of "binding" non-transactional resources into a transaction:
As you suggest, you could use a transactional message queue, but it will be asynchronous, so if you depend on the response it becomes messy.
You could write the fact that you need to call the back end services into your database, and then call the back end services using a batch. Again, async, so can get messy.
You could use a business process engine as your API gateway to orchestrate the back end microservices.
You could use remote EJB, as mentioned at the start, since that supports distributed transactions out of the box.
Or should I enforce consistency in my application code (for example, by having a background job that detects inconsistencies and fixes them or by having a "state" attribute on my User model with "creating", "created" values, etc.)?
Playing devil's advocate: why build something like that, when there are products which do it for you (see above) and probably do it better than you can, because they are tried and tested?
In the microservices world, communication between services should be either through a REST client or a messaging queue. There are two ways to handle transactions across services, depending on how you communicate between them. I personally prefer a message-driven architecture, so that a long transaction is a non-blocking operation for the user.
Let's take your example to explain it:
Create user BOB with event CREATE USER and push the message to a message bus.
The Wallet service, subscribed to this event, can create a wallet corresponding to the user.
The one thing you have to take care of is to select a robust, reliable message backbone which can persist state in case of failure. You can use Kafka or RabbitMQ as the messaging backbone. There will be a delay in execution because of eventual consistency, but that can easily be surfaced through socket notifications. A notification service/task-manager framework can be a service which updates the state of transactions through an asynchronous mechanism like sockets, and can help the UI show proper progress.
Personally I like the idea of Micro Services, modules defined by the use cases, but as your question mentions, they have adaptation problems for the classical businesses like banks, insurance, telecom, etc...
Distributed transactions, as many mentioned, are not a good choice. People are now going more for eventually consistent systems, but I am not sure this will work for banks, insurance, etc...
I wrote a blog about my proposed solution; maybe this can help you...
https://mehmetsalgar.wordpress.com/2016/11/05/micro-services-fan-out-transaction-problems-and-solutions-with-spring-bootjboss-and-netflix-eureka/
Eventual consistency is the key here.
One of the services is chosen to become the primary handler of the event.
This service will handle the original event with a single commit.
The primary handler takes responsibility for asynchronously communicating the secondary effects to other services.
The primary handler does the orchestration of the calls to the other services.
The commander is in charge of the distributed transaction and takes control. It knows the instructions to be executed and will coordinate executing them. In most scenarios there will be just two instructions, but it can handle multiple instructions.
The commander takes responsibility for guaranteeing the execution of all instructions, and that means retries.
When the commander tries to effect the remote update and doesn't get a response, it has to retry.
This way the system can be configured to be less prone to failure and it heals itself.
Since we have retries, we need idempotence.
Idempotence is the property of being able to do something twice in such a way that the end result is the same as if it had been done only once.
We need idempotence at the remote service or data source so that, in the case where it receives the instruction more than once, it only processes it once.
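A minimal sketch of that receiving-side guard; in a real service the set of processed ids would live in the database, in the same transaction as the update itself:

import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentReceiver {

    // In production this set belongs in the service's own datastore,
    // committed together with the update it guards.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();

    public void apply(String instructionId, Runnable update) {
        // add() returns false if the id was already seen: a retried
        // instruction is acknowledged but not executed twice.
        if (processed.add(instructionId)) {
            update.run();
        }
    }
}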
Eventual consistency
This solves most distributed transaction challenges; however, we need to consider a couple of points here.
Every failed transaction will be followed by a retry; the number of attempted retries depends on the context.
Consistency is eventual, i.e. the system is out of a consistent state while a retry is pending. For example, suppose a customer has ordered a book and made a payment, and the stock quantity is then updated. If the stock update operation fails, and assuming that was the last copy available, the book will still appear available until the retry of the stock update succeeds. After the retry succeeds, your system will be consistent.
Why not use an API Management (APIM) platform that supports scripting/programming? You would be able to build a composite service in the APIM without disturbing the microservices. I have designed this using Apigee.
Server frameworks: Scala, Play 2.2, ReactiveMongo, Heroku
I think I have quite an interesting brain teaser for you:
In my trip-planning application I want to display a weather forecast on a map (similar to this). I'm using a paid REST service to query weather data. To speed up the user experience and reduce costs, I plan to cache the weather data for each location for one hour.
There are a few not-so-obvious things to consider:
It might require querying up to 100 locations for weather to display one weather map
The weather must be queried in parallel, because it would take too long to query it serially, considering network latency
However, launching 100 threads for each user request is not an option either (imagine just 5 users looking at a map at one time)
The solution is to have, let's say, 50 workers that query the weather for user requests
Multiple users might be viewing the same portion of the map
There is a possible race condition where one location is queried multiple times.
However, it should be queried only once and then cached.
The application is running in a clustered environment, meaning there will be several Play instances.
Coming from a Java EE background I can come up with a pretty good solution using the Java EE stack.
However, I wonder how to do this using something more natural to the Scala/Play stack: Akka. There is an example (google "heroku scala akka") for a similar problem, but it doesn't solve one issue: the race condition when multiple users query the same data at once.
How would you implement this?
EDIT: I have decided that the requirement to ensure that weather data is updated only once is not necessary. The situation would happen far too infrequently to be a real problem, and all the proposed solutions would add too much overhead and complexity to the system to be viable.
Thanks everyone for your time and effort. I hope answers to this question will help someone in the future with similar problem.
In Akka you can choose from multiple routing strategies. ConsistentHashingRoutingLogic could serve you well in this situation. Since actors are single-threaded, you can easily maintain a cache in each actor. This routing logic will ensure that two equal messages always hit the same actor.
Each actor can work in the following way:
1. Check the local cache (for example Apache Commons LRUMap)
   - if found, return
2. Check the global cache (distributed memcache or any other key-value store)
   - if found, store the result in the local cache and return
3. Query the REST service
4. Store the result in the global and local caches
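A hedged sketch of the lookup order above in plain Java (the question is Scala/Play, but only the cache discipline matters here; the global cache and REST client are stubbed):

import java.util.LinkedHashMap;
import java.util.Map;

public class WeatherCache {

    // Step 1's local cache: a bounded LRU map (actor state, single-threaded).
    private final Map<String, String> local =
            new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > 500;
                }
            };

    public String weatherFor(String location) {
        String cached = local.get(location);
        if (cached != null) return cached;          // 1. local hit

        String global = globalCacheGet(location);   // 2. memcache / key-value store
        if (global != null) {
            local.put(location, global);
            return global;
        }

        String fresh = queryRestService(location);  // 3. the paid REST call
        globalCachePut(location, fresh);            // 4. populate both caches
        local.put(location, fresh);
        return fresh;
    }

    // Stubs standing in for the distributed cache and the REST client.
    private String globalCacheGet(String location) { return null; }
    private void globalCachePut(String location, String data) {}
    private String queryRestService(String location) { return "sunny"; }
}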
You can have a look at this question, which I based my answer on.
I decided that I'll post my JMS solution as well.
The controller that processes a request for weather does the following:
Query the DB for weather data. If there are NO locations with out-of-date data, reply immediately. Otherwise, continue:
Start listening on a topic (explained later).
For each location: check whether the weather for that location is already being updated.
If not, send a weather-update request message to the queue.
A certain number of workers (50?) listen on that queue.
A worker first marks the location's weather as being updated,
then retrieves the updated weather and updates the DB,
and finally sends a message to a topic with the weather data for that location.
When the controller has received (via the topic) weather updates for all out-of-date locations, it combines them with the up-to-date locations and replies.
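A hedged sketch of the worker side of this flow with Spring JMS; the destination names, payload format, and concurrency setting are assumptions, not part of the original design:

import org.springframework.jms.annotation.JmsListener;
import org.springframework.jms.core.JmsTemplate;
import org.springframework.stereotype.Component;

@Component
class WeatherUpdateWorker {

    private final JmsTemplate jms;

    WeatherUpdateWorker(JmsTemplate jms) {
        this.jms = jms;
    }

    // 50 concurrent consumers on the request queue, as in the answer above.
    @JmsListener(destination = "weather.update.requests", concurrency = "50")
    public void onUpdateRequest(String location) {
        // 1. mark the location as "being updated" (e.g. a DB flag) so that
        //    concurrent requests for the same location are skipped
        // 2. fetch fresh weather from the paid REST service
        // 3. write it to the DB and clear the flag
        String weather = "...";
        // 4. publish to the topic so waiting controllers can assemble replies
        //    (the JmsTemplate must be configured for pub/sub to reach a topic)
        jms.convertAndSend("weather.updates", location + ":" + weather);
    }
}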
I am writing an app for iOS that uses data provided by a web service. I am using Core Data for local storage and persistence of the data, so that some core set of the data is available to the user if the web is not reachable.
In building this app, I've been reading lots of posts about core data. While there seems to be lots out there on the mechanics of doing this, I've seen less on the general principles/patterns for this.
I am wondering if there are some good references out there for a recommended interaction model.
For example, the user will be able to create new objects in the app. Let's say the user creates a new employee object; the user will typically create it, update it, and then save it. I've seen recommendations that send each of these steps to the server: when the user creates it, and when the user changes each field. And if the user cancels at the end, a delete is sent to the server. A different recommendation for the same operation is to keep everything local, and only send the complete update to the server when the user saves.
This example aside, I am curious if there are some general recommendations/patterns on how to handle CRUD operations and ensure they are synced between the web server and Core Data.
Thanks much.
I think the best approach in the case you mention is to store data only locally until the point the user commits the adding of the new record. Sending every field edit to the server is somewhat excessive.
A general idiom of iPhone apps is that there isn't such a thing as "Save". The user generally will expect things to be committed at some sensible point, but it isn't presented to the user as saving per se.
So, for example, imagine you have a UI that lets the user edit some sort of record that will be saved to local Core Data and also sent to the server. At the point the user exits the UI for creating a new record, they will perhaps hit a button called "Done" (N.B. not usually called "Save"). At the point they hit "Done", you'll want to kick off a Core Data write and also start a push to the remote server. The server push won't necessarily hog the UI or make them wait till it completes -- it's nicer to allow them to continue using the app -- but it is happening. If the update push to the server fails, you might want to signal it to the user or do something appropriate.
A good question to ask yourself when planning the granularity of writes to core data and/or a remote server is: what would happen if the app crashed out, or the phone ran out of power, at any particular spots in the app? How much loss of data could possibly occur? Good apps lower the risk of data loss and can re-launch in a very similar state to what they were previously in after being exited for whatever reason.
Be prepared to tear your hair out quite a bit. I've been working on this, and the problem is that the Core Data samples are quite simple. The minute you move to a complex model and you try to use the NSFetchedResultsController and its delegate, you bump into all sorts of problems with using multiple contexts.
I use one context to populate data from the web service in a background "block", and a second for the table view to use; you'll most likely end up using a table view for a master list and a detail view.
Brush up on using blocks in Cocoa if you want to keep your app responsive whilst receiving or sending data to/from a server.
You might want to read about 'transactions' - which is basically the grouping of multiple actions/changes as a single atomic action/change. This helps avoid partial saves that might result in inconsistent data on server.
Ultimately, this is a very big topic - especially if server data is shared across multiple clients. At the simplest, you would want to decide on basic policies. Does last save win? Is there some notion of remotely held locks on objects in server data store? How is conflict resolved, when two clients are, say, editing the same property of the same object?
With respect to how things are done on the iPhone, I would agree with occulus that "Done" provides a natural point for persisting changes to server (in a separate thread).