How to deal with shared state in a micro-service architecture? - deployment

In our company we are transitioning from a huge monolithic application to a microservice architecture. The main technical drivers for this decision were the need to scale services independently and the scalability of development: we've got ten scrum teams working on different projects (or 'microservices').
The transition process has been smooth and we've already started to benefit from the advantages of this new technical and organizational structure. Now, on the other hand, there is one main pain point that we are struggling with: how to manage the 'state' of the dependencies between these microservices.
Let's take an example: one of the microservices deals with users and registrations. This service (let's call it X) is responsible for maintaining identity information and is thus the main provider of user 'ids'. The rest of the microservices have a strong dependency on this one. For example, there are some services responsible for user profile information (A), user permissions (B), user groups (C), etc. that rely on those user ids, and thus there is a need to maintain some data sync between these services (i.e. service A should not hold info for a userId not registered in service X). We currently maintain this sync by notifying changes of state (new registrations, for example) using RabbitMQ.
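For readers unfamiliar with that setup, here is a minimal sketch of what such a notification might look like with the RabbitMQ Java client; the exchange name, host and payload are invented for illustration and are not the asker's actual code.

```java
// Sketch only: service X publishing a "user registered" notification on a
// RabbitMQ fanout exchange so dependent services (A, B, C) can keep their
// copies of user ids in sync. Exchange name, host and payload are assumptions.
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

public class UserRegisteredPublisher {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbitmq.internal");          // assumed broker host
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Durable fanout exchange: every subscribed service gets a copy.
            channel.exchangeDeclare("user.events", "fanout", true);
            String event = "{\"type\":\"USER_REGISTERED\",\"userId\":\"42\"}";
            channel.basicPublish("user.events", "", null,
                    event.getBytes(StandardCharsets.UTF_8));
        }
    }
}
```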
As you can imagine, there are many Xs: many 'main' services and many more complicated dependencies between them.
The main issue comes when managing the different dev/testing environments. Every team (and thus, every service) needs to go through several environments in order to put some code live: continuous integration, team integration, acceptance test and live environments.
Obviously we need all services working in all these environments to check that the system is working as a whole. Now, this means that in order to test dependent services (A, B, C, ...) we must not only rely on service X, but also on its state. Thus, we need somehow to maintain system integrity and store a global & coherent state.
Our current approach for this is taking snapshots of all DBs from the live environment, applying some transformations to shrink the data and protect privacy, and propagating them to all environments before testing in a particular environment. This is obviously a tremendous overhead, both organizationally and in computational resources: we have ten continuous integration environments, ten integration environments and one acceptance test environment that all need to be 'refreshed' with this shared data from live and the latest version of the code frequently.
We are struggling to find a better way to ease this pain. Currently we are evaluating two options:
using docker-like containers for all these services
having two versions of each service (one intended for development of that service and another as a sandbox to be used by the rest of the teams in their development & integration testing)
Neither of these solutions eases the pain of shared data between services. We'd like to know how other companies/developers are addressing this problem, as we think it must be common in a microservices architecture.
How are you guys doing it? Do you also have this problem? Any recommendation?
Sorry for the long explanation and thanks a lot!

This time I've read your question from a different perspective, so here is a 'different opinion'. I know it may be too late, but I hope it helps with further development.
It looks like the shared state is the result of wrong decoupling. In a 'right' microservice architecture, all microservices have to be isolated functionally rather than logically. I mean that user profile information (A), user permissions (B) and user groups (C) all look functionally similar and more or less functionally coherent. They seem to be a single user service with a coherent storage, even if that may not look like a micro-service. I don't see any reason to decouple them here (or at least you haven't mentioned one).
Starting from this point, splitting it into smaller independently deployable units may cause more cost and trouble than benefit. There should be a significant reason for that (sometimes political, sometimes just lack of product knowledge).
So the real problem is related to microservice isolation. Ideally each microservice can live as a complete standalone product and deliver a well-defined business value. When elaborating a system architecture, we break it up into tiny logical units (A, B, C, etc. in your case, or even smaller) and then define functionally coherent subgroups. I can't give you exact rules for how to do that, but here are some hints: complex communication/dependencies between units and many common terms in their ubiquitous languages are signs that such units belong to the same functional group and thus to a single service.
So in your example, since there is essentially a single storage, the only way to manage its consistency is the way you already do.
By the way, I wonder how you actually ended up solving your problem?

Let me try to reformulate the problem:
Actors:
X: UserIds (state of account)
provide service to get ID (based on credentials) and status of account
A: UserProfile
Using X to check status of a user account. Stores name along with link to account
provide service to get/edit name based on ID
B: UserBlogs
Using X in the same way. Stores blog posts along with a link to the account when a user writes one
Using A to search blog post based on user name
provide service get/edit list of blog entries based on ID
provide service to search for blog post based on name (relies on A)
C: MobileApp
wraps features of X, A, B into a mobile app
provide all services above, relying on a well-defined communication contract with all the others (following #neleus' statement)
Requirements:
1) Work of teams X, A, B, C needs to be uncoupled
2) Integration environments for X, A, B, C need to be updated with the latest features (in order to perform integration tests)
3) Integration environments for X, A, B, C need to have a 'sufficient' set of data (in order to perform load tests, and to find edge cases)
Following #eugene's idea: having mocks for each service, provided by the owning team, would allow 1) and 2) (a minimal mock sketch follows this list)
the cost is more development work from the teams
also maintenance of the mocks as well as of the main feature
an impediment is the fact that you have a monolithic system (you do not have a set of clean, well defined/isolated services yet)
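As a rough illustration of the mock idea (not a prescription), here is what a throwaway stand-in for service X's "does this user id exist" lookup could look like, using only the JDK's built-in HTTP server; the endpoint path, port and response format are assumptions.

```java
// Sketch of a mock of service X that a team could run in its own CI
// environment instead of the real service.
import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.Set;

public class MockUserService {
    // Fixed set of "known" user ids standing in for X's real state.
    private static final Set<String> KNOWN_IDS = Set.of("1", "2", "42");

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8089), 0);
        server.createContext("/users/exists", exchange -> {
            String query = exchange.getRequestURI().getQuery();
            String id = query == null ? "" : query.replace("id=", "");
            byte[] body = (KNOWN_IDS.contains(id) ? "true" : "false")
                    .getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
    }
}
```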
Suggested solution:
What about having a shared environment with the set of master data to resolve 3)? Every 'delivered' service (i.e. running in production) would be available there. Each team could choose which services to use from that environment and which ones to use from their own.
One immediate drawback I can see is the shared state and consistency of data.
Let's consider automated tests run against the master data, e.g.:
B changes names (owned by A) in order to work on its blog service
might break A, or C
A changes the status of an account in order to work on some permission scenarios
might break X, B
C changes all of it on same accounts
breaks all others
The master set of data would quickly become inconsistent and lose its value for requirement 3) above.
We could therefore add a 'conventional' layer on top of the shared master data: anyone can read from the full set, but can only modify the objects that they have created?

From my perspective, only the objects that use the services should hold the state. Let's consider your example: service X is responsible for the user id, service A for the profile information, etc. Let's assume a user Y, who has some security token (which may be created, for example, from their user name and password and should be unique), enters the system. The client, which contains the user information, sends the security token to service X. Service X holds the user id linked to that token. In the case of a new user, service X creates a new id and stores the token. Service X then returns the id to the user object. The user object asks service A for the user profile by providing the user id. Service A takes the id and asks service X whether that id exists. If service X sends a positive answer, service A can look up the profile information by user id, or ask the user to provide the information in order to create it. The same logic should work with the B and C services. They have to talk to each other, but they don't need to know about the user's state.
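A minimal sketch of the check just described, assuming a hypothetical HTTP endpoint on service X and using the JDK 11+ HttpClient; the URL and response handling are invented for illustration.

```java
// Sketch: before serving profile data, service A asks service X whether the
// user id exists. Endpoint and response format are assumptions.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ProfileService {
    private final HttpClient http = HttpClient.newHttpClient();

    /** Returns true if service X confirms the user id exists. */
    boolean userExists(String userId) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://service-x/users/exists?id=" + userId)) // assumed endpoint
                .GET()
                .build();
        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() == 200 && Boolean.parseBoolean(response.body());
    }

    String loadProfile(String userId) throws Exception {
        if (!userExists(userId)) {
            throw new IllegalArgumentException("Unknown user id: " + userId);
        }
        // ...look up or create the profile for this id...
        return "profile-for-" + userId;
    }
}
```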
A few words about the environments. I would suggest using Puppet. This is a way to automate the service deployment process. We are using Puppet to deploy the services to the different environments. The Puppet language is rich and allows flexible configuration.

Related

Microservices - Storing user data in separate database

I am building a microservice application that has two separate services: a user service and a comments service. The user service stores the user details like email, first/last name, job title, etc., and the comments service stores all comments made by the user.
In the UI, I need to populate the comments (via a REST API) and show the first/last name, email, and job title of the user.
Is it recommended that we store all these user details in the comments database?
If yes, then every time a user changes their details (first/last name or job title) I will have to update those details in all their comments (I don't think this is a good idea).
If no, then if I store just the userid in the comments DB, how am I supposed to get the user details for each comment? Let's say we want to show 20 comments per page in the UI.
First, challenge the architecture. Let's assume that both services in the question are part of a larger ecosystem of microservices that all make use of the user information. Otherwise the separation will most certainly be over-engineered. But from the word "comments" we can at least guess that there is at least one other class of objects, namely the things being commented on. So let's assume a "user service" is a meaningful crumb to break out into a microservice, because at least some other crumbs get the necessary weight to justify the microservice breakup.
In that case I suggest the following strategy:
Second, implement an abstraction layer into your comments service right away so that most of the code will not have to care about where the user data comes from (i.e. don't join or $lookup); a sketch of such a layer follows these steps. This is also a great opportunity for local testing, because you can just create a collection with the data you need and run service-level integration tests against it.
Third, for integration with the user service, get the data from there via API (which should support bulk data selection in any case) every time you need it. Because you have the abstraction layer, you can add caching, cache timeout and displacement strategies and whatever you may need below this abstraction without caring in the main portion of the code. Add such on an as needed basis. Keep it simple.
Fourth, when things really go heavyweight and you have to deal with tens of thousands of users, tons of comments and many requests per second, the comments service could, still below the abstraction, implement an upfront replication pattern to get the full user database locally. This will usually be done based on an asynchronous message being sent by the user service to all subscribers when something changes in the user base. When it suits the subscribers (i.e. the comment service), they can trigger full or (from time to time) delta replication of the changes. Suitable collections will already be in place from what you did for caching. And the info you need in the comments service will probably be considerably less than what is stored in the user service (let alone the hashed password, other login options or accounting information).
Fifth, should you still hit performance challenges, you can break the abstraction for the few cases you need to and do the join or $lookup.
Follow the steps in order, and stop as soon as the overall assembly works fine. Every step adds considerable complexity, and when you don't need it, don't implement it.
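To make the abstraction-plus-caching idea from steps two and three concrete, here is a hedged sketch; the interface name, the record fields and the bulk-lookup shape are assumptions, not a prescribed API.

```java
// The comments code only sees UserDetailsProvider; one implementation would
// call the user service's (assumed) bulk API, and this decorator adds caching.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

interface UserDetailsProvider {
    /** Bulk lookup so 20 comments cost one call, not 20. */
    Map<String, UserDetails> findByIds(List<String> userIds);
}

record UserDetails(String userId, String firstName, String lastName,
                   String email, String jobTitle) {}

class CachingUserDetailsProvider implements UserDetailsProvider {
    private final UserDetailsProvider delegate;          // e.g. the REST-backed provider
    private final Map<String, UserDetails> cache = new HashMap<>();

    CachingUserDetailsProvider(UserDetailsProvider delegate) {
        this.delegate = delegate;
    }

    @Override
    public Map<String, UserDetails> findByIds(List<String> userIds) {
        List<String> misses = userIds.stream()
                .filter(id -> !cache.containsKey(id)).toList();
        if (!misses.isEmpty()) {
            cache.putAll(delegate.findByIds(misses));     // one bulk call for the misses
        }
        Map<String, UserDetails> result = new HashMap<>();
        userIds.forEach(id -> result.put(id, cache.get(id)));
        return result;
    }
}
```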

Fetching potentially needed data from repository - DDD

We have (roughly) following architecture:
Application service does the infrastructure job - fetches data from repositories which are hidden behind interfaces.
Object graph is created and passed to appropriate domain service.
Domain service does its thing and raises appropriate events.
Events are handled in different application services which perform some persistent operations (altering repositories, sending e-mails etc).
However, the domain service (3) has become so complex that it requires data from different external APIs, but only if particular conditions are satisfied. For example, if Product X is of type Car, we need to know the price of that car model from some external CatalogService (example invented) hidden behind ICatalogService. This is a potentially expensive operation (a REST call).
How do we go about this?
Do we pre-fetch all the data in the Application Service listed as (1), even though we might not need it? Or do we inject the ICatalogService interface into the given Domain Service and fetch data only when needed? The latter solution might create performance issues if some other client of the Domain Service calls it repeatedly without knowing there is a REST call hidden inside it.
Or did we simply get the domain model wrong?
This question is related to Domain Driven Design.
How do we go about this?
There are two common patterns.
One is to pass the capability to make the query into the domain model, allowing the model to fetch the information itself when it is needed. What this will usually look like is defining an interface / a contract that will be consumed by the domain model, but implemented in the application/infrastructure layers.
The other is to extend the protocol between the domain model and the application, so that we can signal to the application layer what information is needed, and then the application code can decide how to provide it. You end up with something like a state machine for the processes, with the application code coordinating the exchange of information between the external api and the domain model.
If you use a bit of imagination, you've already got a state machine something like this; as your application code is already coordinating the movement of inputs to the repository and the domain model. The difference, of course, is that the existing "state machine" is simple and linear enough that it may not be obvious that there is a state machine present at all.
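A sketch of the first pattern, reusing the question's invented ICatalogService; the class names, the Product shape and the pricing logic are illustrative only.

```java
// The domain consumes an interface it defines; the application/infrastructure
// layer supplies the REST-backed implementation. The expensive call happens
// only when the "Car" condition actually holds.
import java.math.BigDecimal;

interface ICatalogService {                 // defined by the domain, implemented outside it
    BigDecimal priceFor(String carModel);
}

record Product(String model, boolean isCar, BigDecimal basePrice) {}

class PricingDomainService {
    private final ICatalogService catalog;

    PricingDomainService(ICatalogService catalog) {
        this.catalog = catalog;
    }

    BigDecimal evaluate(Product product) {
        if (product.isCar()) {
            // Fetched lazily, only when the condition is satisfied.
            return catalog.priceFor(product.model());
        }
        return product.basePrice();
    }
}
```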
How exactly would you signal the application layer?
Simple queries; which is to say, the application code pulls the information it needs out of the domain model and uses that information to compute the next action. When the action is completed, the application code pushes information to the domain model.
There isn't enough information to give you targeted good advice. I suspect you need to refactor your domains into further subdomains. It sounds like your domain service has way more than one responsibility. Keep the service simple.
In addition, if you have a long-running task like a service call that takes a long time, then you need to architect it away. The most supple design will not keep the consumer waiting; it will return immediately with some sort of result to the user, even if it's simply a periodic status update.

Transactions across REST microservices?

Let's say we have a User, Wallet REST microservices and an API gateway that glues things together. When Bob registers on our website, our API gateway needs to create a user through the User microservice and a wallet through the Wallet microservice.
Now here are a few scenarios where things could go wrong:
User Bob creation fails: that's OK, we just return an error message to Bob. We're using SQL transactions so no one ever saw Bob in the system. Everything's good :)
User Bob is created but before our Wallet can be created, our API gateway hard crashes. We now have a User with no wallet (inconsistent data).
User Bob is created and as we are creating the Wallet, the HTTP connection drops. The wallet creation might have succeeded or it might have not.
What solutions are available to prevent this kind of data inconsistency from happening? Are there patterns that allow transactions to span multiple REST requests? I've read the Wikipedia page on Two-phase commit which seems to touch on this issue but I'm not sure how to apply it in practice. This Atomic Distributed Transactions: a RESTful design paper also seems interesting although I haven't read it yet.
Alternatively, I know REST might just not be suited for this use case. Would perhaps the correct way to handle this situation to drop REST entirely and use a different communication protocol like a message queue system? Or should I enforce consistency in my application code (for example, by having a background job that detects inconsistencies and fixes them or by having a "state" attribute on my User model with "creating", "created" values, etc.)?
What doesn't make sense:
distributed transactions with REST services. REST services by definition are stateless, so they should not be participants in a transactional boundary that spans more than one service. Your user registration use case scenario makes sense, but the design with REST microservices to create User and Wallet data is not good.
What will give you headaches:
EJBs with distributed transactions. It's one of those things that work in theory but not in practice. Right now I'm trying to make a distributed transaction work for remote EJBs across JBoss EAP 6.3 instances. We've been talking to RedHat support for weeks, and it didn't work yet.
Two-phase commit solutions in general. I think the 2PC protocol is a great algorithm (many years ago I implemented it in C with RPC). It requires comprehensive failure recovery mechanisms, with retries, a state repository, etc. All the complexity is hidden within the transaction framework (e.g. JBoss Arjuna). However, 2PC is not fail proof. There are situations where the transaction simply can't complete. Then you need to identify and fix database inconsistencies manually. It may happen once in a million transactions if you're lucky, but it may happen once in every 100 transactions depending on your platform and scenario.
Sagas (Compensating transactions). There's the implementation overhead of creating the compensating operations, and the coordination mechanism to activate compensation at the end. But compensation is not fail proof either. You may still end up with inconsistencies (= some headache).
What's probably the best alternative:
Eventual consistency. Neither ACID-like distributed transactions nor compensating transactions are fail proof, and both may lead to inconsistencies. Eventual consistency is often better than "occasional inconsistency". There are different design solutions, such as:
You may create a more robust solution using asynchronous communication. In your scenario, when Bob registers, the API gateway could send a message to a NewUser queue and reply right away to the user saying "You'll receive an email to confirm the account creation." A queue consumer service could process the message, perform the database changes in a single transaction, and send the email to Bob to notify him of the account creation (a consumer sketch follows below).
The User microservice creates the user record and a wallet record in the same database. In this case, the wallet store in the User microservice is a replica of the master wallet store only visible to the Wallet microservice. There's a data synchronization mechanism that is trigger-based or kicks in periodically to send data changes (e.g., new wallets) from the replica to the master, and vice-versa.
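As a rough sketch of the asynchronous option above (not a definitive implementation), a queue consumer might perform both inserts in one local database transaction and only then send the email; the table names, JDBC URL and mail helper are assumptions.

```java
// Sketch of the consumer side: one handler call creates the user and the
// wallet atomically in its own database, then sends the confirmation email.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class NewUserConsumer {

    // Called for each message taken from the NewUser queue.
    public void handle(String userId, String email) throws Exception {
        try (Connection db = DriverManager.getConnection("jdbc:postgresql://db/users")) {
            db.setAutoCommit(false);
            try (PreparedStatement insertUser =
                         db.prepareStatement("INSERT INTO users(id, email) VALUES (?, ?)");
                 PreparedStatement insertWallet =
                         db.prepareStatement("INSERT INTO wallets(user_id, balance) VALUES (?, 0)")) {
                insertUser.setString(1, userId);
                insertUser.setString(2, email);
                insertUser.executeUpdate();
                insertWallet.setString(1, userId);
                insertWallet.executeUpdate();
                db.commit();                       // both rows or neither
            } catch (Exception e) {
                db.rollback();
                throw e;                           // let the messaging layer redeliver for a retry
            }
        }
        sendConfirmationEmail(email);              // hypothetical helper
    }

    private void sendConfirmationEmail(String email) {
        System.out.println("Sending confirmation email to " + email);
    }
}
```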
But what if you need synchronous responses?
Remodel the microservices. If the solution with the queue doesn't work because the service consumer needs a response right away, then I'd rather remodel the User and Wallet functionality to be collocated in the same service (or at least in the same VM to avoid distributed transactions). Yes, it's a step farther from microservices and closer to a monolith, but will save you from some headache.
This is a classic question I was asked during a recent interview: how to call multiple web services and still preserve some kind of error handling in the middle of the task. Today, in high performance computing, we avoid two-phase commits. I read a paper many years ago about what was called the "Starbucks model" for transactions: think about the process of ordering, paying, preparing and receiving the coffee you order at Starbucks... I oversimplify things, but a two-phase commit model would suggest that the whole process would be a single wrapping transaction for all the steps involved until you receive your coffee. However, with this model, all employees would wait and stop working until you get your coffee. You see the picture?
Instead, the "Starbucks model" is more productive by following the "best effort" model and compensating for errors in the process. First, they make sure that you pay! Then, there are message queues with your order attached to the cup. If something goes wrong in the process, like you did not get your coffee, or it is not what you ordered, etc., we enter into the compensation process and make sure you get what you want or refund you. This is the most efficient model for increased productivity.
Sometimes Starbucks wastes a coffee, but the overall process is efficient. There are other tricks to think about when you build your web services, like designing them in such a way that they can be called any number of times and still provide the same end result. So, my recommendation is:
Don't be too fine-grained when defining your web services (I am not convinced about the micro-service hype happening these days: too many risks of going too far);
Async increases performance, so prefer being async and send notifications by email whenever possible;
Build more intelligent services to make them "recallable" any number of times, processing with a UID or task id that will follow the order from bottom to top until the end, validating business rules at each step;
Use message queues (JMS or others) and divert to error-handling processors that will "roll back" by applying the opposite operations; by the way, working with async orders will require some sort of queue to validate the current state of the process, so consider that;
As a last resort (since it may not happen often), put it in a queue for manual processing of errors.
Let's go back to the initial problem that was posted: create an account, create a wallet, and make sure everything was done.
Let's say a web service is called to orchestrate the whole operation.
Pseudo-code of the web service would look like this (a rough Java sketch of this orchestration follows the steps):
1. Call the Account creation microservice, passing it some information and some unique task id.
1.1 The Account creation microservice will first check whether that account was already created. A task id is associated with the account's record. The microservice detects that the account does not exist, so it creates it and stores the task id. NOTE: this service can be called 2000 times; it will always produce the same result. The service answers with a "receipt that contains minimal information to perform an undo operation if required".
2. Call Wallet creation, giving it the account id and task id. Let's say a condition is not valid and the wallet creation cannot be performed. The call returns with an error, but nothing was created.
3. The orchestrator is informed of the error. It knows it needs to abort the Account creation, but it will not do it itself. It will ask the account service to do it by passing the "minimal undo receipt" received at the end of step 1.
4. The Account service reads the undo receipt and knows how to undo the operation; the undo receipt may even include information about another microservice it could itself call to do part of the job. In this situation, the undo receipt could contain the Account ID and possibly some extra information required to perform the opposite operation. In our case, to simplify things, let's say it is simply to delete the account using its account id.
5. Now, let's say the web service never received the success or failure response telling it that the Account creation's undo was performed. It will simply call the Account's undo service again. And this service should normally never fail, because its goal is for the account to no longer exist. So it checks whether the account exists, sees there is nothing left to undo, and returns that the operation is a success.
6. The web service returns to the user that the account could not be created.
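A hedged sketch of that orchestration in Java, with invented interfaces for the two services and the undo receipt (none of these types come from the answer itself):

```java
// Each creation call is idempotent (keyed by taskId) and returns an undo
// receipt; on failure the orchestrator replays the undo, which is also safe
// to repeat.
import java.util.UUID;

interface AccountService {
    /** Idempotent: calling twice with the same taskId creates at most one account. */
    UndoReceipt createAccount(String taskId, String userInfo);
    void undo(UndoReceipt receipt);            // safe to call any number of times
}

interface WalletService {
    void createWallet(String taskId, String accountId);
}

record UndoReceipt(String accountId) {}

class RegistrationOrchestrator {
    private final AccountService accounts;
    private final WalletService wallets;

    RegistrationOrchestrator(AccountService accounts, WalletService wallets) {
        this.accounts = accounts;
        this.wallets = wallets;
    }

    boolean register(String userInfo) {
        String taskId = UUID.randomUUID().toString();
        UndoReceipt receipt = accounts.createAccount(taskId, userInfo);
        try {
            wallets.createWallet(taskId, receipt.accountId());
            return true;
        } catch (Exception walletFailure) {
            accounts.undo(receipt);            // compensate; retry later if this call itself times out
            return false;                      // report to the user that creation failed
        }
    }
}
```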
This is a synchronous example. We could have managed it in a different way and put the case into a message queue targeted at the help desk, if we don't want the system to completely recover from the error. I've seen this done in a company where not enough hooks could be provided to the back-end system to correct situations. The help desk received messages containing what was performed successfully and had enough information to fix things, just as our undo receipt could be used for in a fully automated way.
I have performed a search and the Microsoft web site has a pattern description for this approach. It is called the compensating transaction pattern:
Compensating transaction pattern
All distributed systems have trouble with transactional consistency. The best way to do this is, like you said, a two-phase commit. Have the wallet and the user be created in a pending state. After they are created, make a separate call to activate the user.
This last call should be safely repeatable (in case your connection drops).
This will necessitate that the last call knows about both tables (so that it can be done in a single JDBC transaction).
Alternatively, you might want to think about why you are so worried about a user without a wallet. Do you believe this will cause a problem? If so, maybe having those as separate REST calls is a bad idea. If a user shouldn't exist without a wallet, then you should probably add the wallet to the user (in the original POST call to create the user).
IMHO one of the key aspects of microservices architecture is that the transaction is confined to the individual microservice (Single responsibility principle).
In the current example, the User creation would be an own transaction. User creation would push a USER_CREATED event into an event queue. Wallet service would subscribe to the USER_CREATED event and do the Wallet creation.
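A minimal sketch of that subscription with the RabbitMQ Java client; the queue/exchange names and the payload format are assumptions, and the wallet creation itself is stubbed out.

```java
// The Wallet service listens for USER_CREATED events and creates the wallet
// in its own local transaction, acknowledging the message only afterwards.
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;
import java.nio.charset.StandardCharsets;

public class WalletEventListener {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbitmq.internal");                    // assumed broker host
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();
        channel.exchangeDeclare("user.events", "fanout", true);  // assumed exchange
        channel.queueDeclare("wallet.user-created", true, false, false, null);
        channel.queueBind("wallet.user-created", "user.events", "");

        DeliverCallback onUserCreated = (tag, delivery) -> {
            String userId = new String(delivery.getBody(), StandardCharsets.UTF_8);
            createWalletFor(userId);                             // local transaction in the wallet DB
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };
        channel.basicConsume("wallet.user-created", false, onUserCreated, tag -> { });
    }

    private static void createWalletFor(String userId) {
        System.out.println("Creating wallet for user " + userId);
    }
}
```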
If my wallet was just another bunch of records in the same sql database as the user then I would probably place the user and wallet creation code in the same service and handle that using the normal database transaction facilities.
It sounds to me like you are asking about what happens when the wallet creation code requires you to touch another system or systems. I'd say it all depends on how complex and/or risky the creation process is.
If it's just a matter of touching another reliable datastore (say one that can't participate in your SQL transactions), then depending on the overall system parameters, I might be willing to risk the vanishingly small chance that the second write won't happen. I might do nothing but raise an exception and deal with the inconsistent data via a compensating transaction or even some ad-hoc method. As I always tell my developers: "if this sort of thing is happening in the app, it won't go unnoticed".
As the complexity and risk of wallet creation increases you must take steps to ameliorate the risks involved. Let's say some of the steps require calling multiple partner apis.
At this point you might introduce a message queue along with the notion of partially constructed users and/or wallets.
A simple and effective strategy for making sure your entities eventually get constructed properly is to have the jobs retry until they succeed, but a lot depends on the use cases for your application.
I would also think long and hard about why I had a failure prone step in my provisioning process.
One simple solution is to create the user using the User Service and use a messaging bus on which the User Service emits its events; the Wallet Service registers on the messaging bus, listens for the User Created event and creates a wallet for the user. In the meantime, if the user goes to the Wallet UI to see their wallet, check whether the user was just created and show that wallet creation is in progress and to check back in some time.
What solutions are available to prevent this kind of data inconsistency from happening?
Traditionally, distributed transaction managers are used. A few years ago in the Java EE world you might have created these services as EJBs which were deployed to different nodes and your API gateway would have made remote calls to those EJBs. The application server (if configured correctly) automatically ensures, using two phase commit, that the transaction is either committed or rolled back on each node, so that consistency is guaranteed. But that requires that all the services be deployed on the same type of application server (so that they are compatible) and in reality only ever worked with services deployed by a single company.
Are there patterns that allow transactions to span multiple REST requests?
For SOAP (OK, not REST), there is the WS-AT specification, but no service that I have ever had to integrate has supported it. For REST, JBoss has something in the pipeline. Otherwise, the "pattern" is to either find a product which you can plug into your architecture, or build your own solution (not recommended).
I have published such a product for Java EE: https://github.com/maxant/genericconnector
According to the paper you reference, there is also the Try-Cancel/Confirm pattern and associated Product from Atomikos.
BPEL Engines handle consistency between remotely deployed services using compensation.
Alternatively, I know REST might just not be suited for this use case. Would perhaps the correct way to handle this situation to drop REST entirely and use a different communication protocol like a message queue system?
There are many ways of "binding" non-transactional resources into a transaction:
As you suggest, you could use a transactional message queue, but it will be asynchronous, so if you depend on the response it becomes messy.
You could write the fact that you need to call the back end services into your database, and then call the back end services using a batch. Again, async, so can get messy.
You could use a business process engine as your API gateway to orchestrate the back end microservices.
You could use remote EJB, as mentioned at the start, since that supports distributed transactions out of the box.
Or should I enforce consistency in my application code (for example, by having a background job that detects inconsistencies and fixes them or by having a "state" attribute on my User model with "creating", "created" values, etc.)?
Playing devil's advocate: why build something like that, when there are products which do that for you (see above), and probably do it better than you can, because they are tried and tested?
In the microservices world, communication between services should go either through a REST client or through a messaging queue. There are two ways to handle transactions across services, depending on how you communicate between them. I would personally prefer a message-driven architecture, so that a long transaction is a non-blocking operation for the user.
Let's take your example to explain it:
Create user BOB with event CREATE USER and push the message to a message bus.
Wallet service subscribed to this event can create a wallet corresponding to the user.
The one thing which you have to take care of is to select a robust, reliable message backbone which can persist the state in case of failure. You can use Kafka or RabbitMQ as the messaging backbone. There will be a delay in execution because of eventual consistency, but the UI can easily be updated through socket notifications. A notification service/task manager framework can be a service which updates the state of the transactions through an asynchronous mechanism like sockets and helps the UI show the proper progress.
Personally I like the idea of microservices, modules defined by the use cases, but, as your question mentions, they have adaptation problems for classical businesses like banks, insurance, telecom, etc...
Distributed transactions, as many mentioned, are not a good choice; people are now going more for eventually consistent systems, but I am not sure this will work for banks, insurance, etc....
I wrote a blog post about my proposed solution; maybe this can help you:
https://mehmetsalgar.wordpress.com/2016/11/05/micro-services-fan-out-transaction-problems-and-solutions-with-spring-bootjboss-and-netflix-eureka/
Eventual consistency is the key here.
One of the services is chosen to become primary handler of the event.
This service will handle the original event with single commit.
Primary handler will take responsibility for asynchronously communicating the secondary effects to other services.
The primary handler will do the orchestration of other services calls.
The commander is in charge of the distributed transaction and takes control. It knows the instructions to be executed and will coordinate executing them. In most scenarios there will just be two instructions, but it can handle multiple instructions.
The commander takes responsibility for guaranteeing the execution of all instructions, and that means retries.
When the commander tries to effect the remote update and doesn't get a response, it retries.
This way the system can be configured to be less prone to failure and it heals itself.
As we have retries we have idempotence.
Idempotence is the property of being able to do something twice in such a way that the end result is the same as if it had been done only once.
We need idempotence at the remote service or data source so that, in the case where it receives the instruction more than once, it only processes it once.
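A tiny sketch of that deduplication, assuming the remote service keys each instruction by an id; the set is kept in memory here purely for illustration, whereas a real service would persist the processed ids.

```java
// The remote service records the ids of instructions it has already processed
// and silently ignores repeats, so the commander can retry freely.
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class IdempotentReceiver {
    private final Set<String> processedInstructionIds = ConcurrentHashMap.newKeySet();

    public void handle(String instructionId, Runnable instruction) {
        // add() returns false if the id was already present: a retry we can ignore.
        if (!processedInstructionIds.add(instructionId)) {
            return;
        }
        instruction.run();
    }
}
```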
Eventual consistency
This solves most of the distributed transaction challenges; however, we need to consider a couple of points here.
Every failed transaction will be followed by a retry, the amount of attempted retries depends on the context.
Consistency is eventual, i.e. the system is out of a consistent state while a retry is pending. For example, suppose a customer orders a book, makes a payment, and then the stock quantity is updated. If the stock update operation fails, and assuming that was the last copy available, the book will still appear to be available until the retry of the stock update has succeeded. After the retry is successful, your system will be consistent.
Why not use an API Management (APIM) platform that supports scripting/programming? That way, you will be able to build a composite service in the APIM without disturbing the microservices. I have designed with Apigee for this purpose.

How to develop a SaaS application with limited resources per tenant

I'd like to develop a bunch of SaaS applications in Java and I'm not sure what is the best way to go.
Each application will have a WAR containing the web service and will have at least one worker WAR, which is a thread waiting for new tasks in the DB to come up and then working off those tasks. This worker contains the intelligence of the application and uses a lot of CPU. The web service gives users the possibility to add new tasks and other stuff...
Resource Limitations
The infrastructure must ensure the following:
The web service must always get a certain amount of CPU time to be able to respond to the user, so the hungry worker must not get all the CPU time for its work.
Each tenant has its own worker and they must not interfere with each other, as it must not be possible to block the whole system (and all tenants) with a single task.
Resource Sharing
It would be nice to be able to share the resources but always ensure that in extreme situations every worker and webservice gets the required minimum.
Versioning
As new versions of an application are released, each tenant must have the possibility to initiate an update on their own once they have adapted to the API changes. Furthermore, a tenant must be able to keep more than one application endpoint (let's call them channels) to have a production channel and a beta channel. In the beta channel the tenant can test against new versions, and when they feel comfortable with the new version they can update their production channel.
User-Management
All applications of a tenant must share a user database and have the same way to authenticate.
Environment
I want to use Java EE 7. I would enjoy using Wildfly.
Question
What is the best infrastructure to approach these aims? I want to host this on my own servers.
What I already found
I understand that you cannot limit CPU usage within a JVM, so the workers must have their own JVMs.
I looked at PaaS providers like OpenShift Origin, but it seems that they encourage you to run an application server per tenant, per application, which sounds to me like a resource eater.
Is there no way to have one Wildfly running and limit the amount of CPU usage per tenant and app?
Thank You
Lukas

Occasionally connected CQRS systems - Client and Server Commands - Task based screens

Premise:
It is recommended that CQRS+DDD+ES style applications use task-based screens; these screens guide the user and capture intent.
These task screens can also be referred to as an Inductive User Interface. Some examples of UI design guidelines that can help you create modern, user-friendly apps:
Microsoft Inductive User Interface Guidelines and,
Index of UX guidelines
The way I understand it, the tasks, generally speaking, should line up with Commands or Functions waiting on the Server.
For example, if the User makes a change to the Customer's [first name], generally speaking this should be an isolated task where a pop-up window or the like provides a mechanism for this event, and this event only.
Questions:
Part-1:
In the situation where the User is not just making a change to a Customer's [first name], but actually creating a new Customer. Surely the User will not go from [first name] => to [last name] => to [address] => to [email], etc. -- in a wizard like style, where each wizard screen maps to a Command.
a) How are the screens laid out when it's just not practical to isolate a single task? For example when creating a new Customer or Inventory Item.
b) What does the code and/or logic flow related to the Commands look like on the Client and Server in this situation, keeping in mind the obvious pull to stay consistent with the "normal" task based flow of the rest of the system? After all, these all just translate to Activities or Events in the Event Source.
Part-2:
What if the User is not just making a change to a Customer's [first name], but to their [last name], [address], and [phone number] -- all the while the User is off-line.
I think ultimately, the User should still be able to do real work on multiple tasks in different areas of the application, while off-line, and perform robust conflict resolution when coming back online.
a) What is the code and/or logic flow and/or artifacts related to the Commands on the Client side while the User is off-line, handling these events locally (IndexedDB, queues, etc.)? and
b) What does the connection look like and how does it act when off-line (retries)?
c) What is the code and/or logic flow and/or artifacts related to the Commands on the Client and Server side, when the User comes back on-line?
d) What does the connection look like and how does it act when coming back on-line (reestablish of connection, if it is determined that the Client side ViewModel is stale, WebSockets, etc.)?
Reference diagram:
The way I understand it, the tasks, generally speaking, should line up with Commands or Functions waiting on the Server.
Or sometimes events, but the basic idea is right.
Surely the User will not go from [first name] => to [last name] => to [address] => to [email], etc. -- in a wizard like style, where each wizard screen maps to a Command.
No, we usually want a coarser grain than that. Some tasks do only require a single property, but several properties is a common case.
How are the screens laid out when it's just not practical to isolate a single task?
By grouping together cohesive units; consider the Amazon order workflow -- there are actually several different sets of data collected (the order itself, the selection of payment, specifying new methods of payment, specifying the delivery address, specifying the shipping priority....).
all the while they User is off-line.
See CQRS, not just for server systems; but in broad strokes - treat the data collected from the user as events (FormSubmitted) rather than commands. The offline device is the authority for tracking what the user did while off line; but the unavailable server is still the authority for the consequences of those events. So the server is responsible for the merge when the client reconnects.
The precise details might vary from one domain to another -- for instance, in a warehousing system, where the offline device has been collecting information about inventory, you might handle the inconsistencies that the server observes during the merge by raising exception reports (the device registered this package leaving the warehouse, but we have no record of it entering the warehouse).
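A small sketch of that merge idea, with all names invented for illustration: the server replays the events the offline device collected and records an exception report instead of rejecting contradictory ones.

```java
// The server is the authority for consequences: events that contradict its
// state are not discarded but turned into exception reports for follow-up.
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PackageLeftWarehouse {                      // event recorded by the offline device
    final String packageId;
    PackageLeftWarehouse(String packageId) { this.packageId = packageId; }
}

class OfflineEventMerger {
    private final Set<String> packagesKnownToBeInWarehouse = new HashSet<>();
    private final List<String> exceptionReports = new ArrayList<>();

    void merge(List<PackageLeftWarehouse> deviceEvents) {
        for (PackageLeftWarehouse event : deviceEvents) {
            if (packagesKnownToBeInWarehouse.remove(event.packageId)) {
                // Consistent with server state: record the departure normally.
            } else {
                exceptionReports.add("Package " + event.packageId
                        + " left the warehouse but was never recorded as entering it.");
            }
        }
    }
}
```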