I've got a database which contains an Order table. I'm trying to create an API for our clients so they can manipulate the data within the Order table. I have chosen to go down the route of a WCF Data Service so that clients can perform CRUD operations via URIs.
However, I need to ensure this WCF Data Service is scalable and can handle a large number of requests (1000) made from multiple clients. I've looked at using MSMQ with WCF Data Services, but according to the following link this seems impossible:
Can we use msmq messaging with wcf data service.
My workaround is this: when a client attempts to manipulate the data (anything other than a select), instead of waiting for the changes to be saved to the DB, we add the request to an MSMQ queue. This is done by overriding the SaveChanges() method in the ORM, extracting the changes, serialising them into the queue and then returning to the client. A separate thread then pulls objects from the queue and saves the changes to the DB.
This all seems fine, but how do I notify the client that its request has been processed or has failed? Any ideas?
Any offline processing of a usually synchronous operation will incur this problem.
It is one of the problems faced when implementing the CQRS pattern, which attempts to decouple read and write operations. The resulting behaviour is termed eventual consistency.
There are actually many different techniques for addressing this problem - have a look at these links:
CQRS - Eventual Consistency
GUI recommandations for eventual consistency?
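One commonly used technique is to hand the client a tracking ID as soon as the request is queued and let it ask for the status later. The sketch below illustrates the idea in plain Java rather than WCF; the class QueuedWriteTracker, its methods and the in-memory queue (standing in for MSMQ) are all hypothetical names chosen for illustration, not part of any library.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: the in-memory queue stands in for MSMQ.
class QueuedWriteTracker {
    enum Status { PENDING, COMPLETED, FAILED }

    // A queued change together with the tracking id given to the client.
    static final class Request {
        final String id;
        final String payload;
        Request(String id, String payload) {
            this.id = id;
            this.payload = payload;
        }
    }

    private final BlockingQueue<Request> queue = new LinkedBlockingQueue<>();
    private final Map<String, Status> statuses = new ConcurrentHashMap<>();

    // Request path: enqueue the change and return a tracking id immediately.
    String submit(String payload) {
        String id = UUID.randomUUID().toString();
        statuses.put(id, Status.PENDING);
        queue.add(new Request(id, payload));
        return id;
    }

    // Worker path: process one queued request, if any, and record the outcome.
    boolean processNext() {
        Request request = queue.poll();
        if (request == null) {
            return false;
        }
        try {
            save(request.payload);
            statuses.put(request.id, Status.COMPLETED);
        } catch (RuntimeException e) {
            statuses.put(request.id, Status.FAILED);
        }
        return true;
    }

    // What a status endpoint would return; null means the id is unknown.
    Status statusOf(String id) {
        return statuses.get(id);
    }

    // Stand-in for the real database write; rejects empty payloads.
    private void save(String payload) {
        if (payload.isEmpty()) {
            throw new IllegalArgumentException("empty payload");
        }
    }
}
```

The client would keep the ID returned by submit and poll a status resource backed by statusOf. Pushing a notification (for example a callback URL supplied by the client) is an alternative when polling is not acceptable.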
Currently we are implementing REST APIs using Spring Boot. Since our APIs are growing in number, we are considering a different approach to implementing them.
The approach is as follows:
Expose a single service to receive all the HTTP requests.
We will have the URIs configured in a database table to call the next set of services. These services are configured to listen for particular JMS messages.
The next set of services will receive the JMS messages and process the data.
Below are my questions:
Will the above approach still represent the REST architecture?
What are the downsides of the above approach? We are aware of network latency; is there anything other than network latency?
Which benefits of the REST architecture will we be missing?
Or can we just say that our approach is the REST architecture done differently?
You're making 2 major choices, and each can be decided separately:
1) Having a single HTTP service
2) Using JMS as the communication between this service and the underlying microservices
Regarding #1, if you do this, you can no longer call your services REST, since the whole point of REST is to use HTTP verbs together with your domain objects for a predictable set of endpoints. GET on /objects/ means the object is being fetched, POST on /objects means a new object is being created, etc. Now, this is OK: you can do it this way and it can work, though it will be "non-standard".
In fact, you might want to check out GraphQL https://www.howtographql.com/basics/1-graphql-is-the-better-rest/ as it's pretty close to what you're trying to do.
These days, REST and GraphQL seem to be the two popular approaches.
Another way to do REST, if you're looking to simply expose REST services on your domain objects without having to write a lot of code, is Spring Data REST: https://spring.io/projects/spring-data-rest and if you're comfortable with Spring already, this should be pretty easy to understand.
For #2, consider your choice of communication between your single gateway service and the underlying services. Do most of your calls require synchronous answers, such as a UI asking for data to display in a browser or on a phone? If so, JMS is not a good approach. JMS would be an OK approach if the majority of your services were asynchronous, for example someone submitting a stock trade request. The UI would just need to know the request was submitted; it will actually be processed some time later and the result will be fetched asynchronously.
Without knowing much about your application, I would recommend sticking with HTTP between your services for simplicity's sake unless there is a good reason to switch to JMS.
I know that a messaging system is non-blocking and scalable and should be used in a microservices environment.
The use case that I am questioning is:
Imagine that there's an admin dashboard client responsible for sending an API request to create an Item object. There is a microservice that provides an API endpoint and uses a MySQL database where the Item should be stored. There is another microservice which uses Elasticsearch for text searching purposes.
Should this admin dashboard client :
A. Send 2 API calls: 1 call to the MySQL service and another to the Elasticsearch service
or
B. Send a message to a topic to be consumed by both the MySQL service and the Elasticsearch service?
What are the pros and cons when considering A or B?
I'm thinking that it's a little overkill when only 2 microservices are consuming this topic. Also, the admin creates Item objects very infrequently.
Like many things in software architecture, it depends. Your requirements, SLAs and business needs should make it clearer.
As you noted, a messaging system is non-blocking and much more scalable, but API communication has its pluses as well.
In general, REST APIs are best suited to request/response interactions where the client application sends a request to the API backend over HTTP.
Message streaming is best suited for notifications when new data or events occur that you may want to take action upon.
In your specific case, I would go with a messaging system, which is much more scalable and non-blocking.
Your A approach couples the "routing" logic into your application. Suppose you later need to perform an API call to audit your requests: you will have to change the code and add another call to your application logic. As you said, the approach is synchronous, and unless you provide threading logic your calls will be lined up and won't scale, i.e. call MySQL --> wait for the response, then call Elasticsearch --> wait for the response, ...
You may still prefer this approach if you need immediate consistency, i.e. when the result of one call feeds the next.
The B approach decouples that routing logic, so any other service interested in the event can subscribe to the topic and perform the expected action. It is fully asynchronous and scalable. Here you will have eventual consistency, and you have to handle recovery from any possible failure yourself.
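The decoupling in approach B can be illustrated with a minimal in-process sketch. ItemTopic and its methods are hypothetical names, and delivery here is synchronous for simplicity, whereas a real broker would deliver asynchronously and offer retries or redelivery.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process stand-in for a message topic.
class ItemTopic {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    // Any interested service registers itself; the publisher never changes.
    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    // Delivers the item to every subscriber. A failing subscriber does not
    // stop delivery to the others. Returns the number of failed deliveries,
    // which the caller would have to recover from.
    int publish(String item) {
        int failures = 0;
        for (Consumer<String> subscriber : subscribers) {
            try {
                subscriber.accept(item);
            } catch (RuntimeException e) {
                failures++;
            }
        }
        return failures;
    }
}
```

Adding an audit consumer then becomes one more subscribe call, with no change to the dashboard code that publishes the item.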
We're about to migrate from a monolithic design to a microservice architecture. We are trying to choose the best way to replace JAX-WS with RESTful services and are considering using Spring WebFlux.
We currently have a JAX-WS endpoint deployed at Tomcat EE serving requests from third-party clients. The web service endpoint makes a long-running blocking call to the database and then sends a SOAP response to the client with the data retrieved from the DB (Oracle).
The Oracle DB will soon be replaced with a NoSQL database (possibly MongoDB). Since MongoDB supports asynchronous calls, we're considering replacing the current implementation with a microservice exposing a REST endpoint based on WebFlux.
We have about 2500 req/sec at peaks, so the current endpoint often goes down with an OutOfMemoryError. This was the root cause that pushed us towards migration.
My idea is to create a non-blocking endpoint which will call MongoDB asynchronously and send a REST response to the client. So I have a few questions about the basic features that WebFlux provides:
As far as I understand, WebFlux has built-in backpressure control at the business level (not TCP flow control), and it generally works via Reactive Streams. Since our clients are not reactive, does that mean this kind of backpressure control cannot be implemented here?
Suppose that calls to the new database remain long-running in the new architecture. Since Netty uses an EventLoop to serve incoming requests, could the following happen: the microservice accepts all incoming HTTP connections, invokes an async call to the DB and subscribes the resulting Mono on a scheduler, but because the number of requests keeps growing explosively, the application keeps creating new workers in the scheduler pools until it crashes? Is this a realistic scenario?
Suppose that calls to the database remain synchronous. Is there a way to handle them using WebFlux such that the microservice remains reachable under load?
Which bottlenecks can be found in such a design? Does this solution look adequate?
Does Netty (or Reactor-Netty, or whatever) have a tool to limit the number of requests processed simultaneously? Say I would like to limit the endpoint to serving no more than 100 parallel requests and skip all requests above that limit. Is that possible?
Suppose I create a huge number of threads serving async (or maybe sync) calls to the DB. Where is the breaking point at which the application will crash or stop responding to incoming HTTP requests? What will happen there: will we run out of memory, or..?
Finally, there were no major performance issues during our pilot project. Unfortunately, though, we did not take into account some specific Linux (and also OpenShift) TCP tuning properties.
They can significantly affect overall performance; in our case we were able to serve about 10 times more requests after tuning.
So pay attention to the net.core.somaxconn and other related parameters.
I've summarized our expertise in the article.
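On the earlier question about capping the endpoint at 100 parallel requests: independent of anything Netty or Reactor Netty may offer, one framework-neutral option is a counting semaphore in front of the handler. The sketch below uses only java.util.concurrent; ConcurrencyLimiter and tryHandle are hypothetical names, and it assumes the handler does its work synchronously.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical helper, not part of Netty or WebFlux.
class ConcurrencyLimiter {
    private final Semaphore permits;

    ConcurrencyLimiter(int maxParallel) {
        this.permits = new Semaphore(maxParallel);
    }

    // Runs the handler if a permit is free. Otherwise returns empty at once,
    // without queuing, so the caller can reject the request (for example
    // with an HTTP 429 or 503 response).
    <T> Optional<T> tryHandle(Supplier<T> handler) {
        if (!permits.tryAcquire()) {
            return Optional.empty();
        }
        try {
            return Optional.ofNullable(handler.get());
        } finally {
            // Assumes synchronous work: the permit is returned when get() ends.
            permits.release();
        }
    }
}
```

With a non-blocking handler the permit would have to be released when the asynchronous result completes rather than when the method returns; otherwise the limit counts only the time spent setting the call up.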
Here is the background:
We have a cluster of 3 different services deployed on various containers (such as Tomcat, TomEE and JBoss). Each service does one thing: one service manages a common DB and provides REST services to CRUD the DB; one service puts some data into a JMS queue; another service reads from the queue and updates the DB. There is a client app that makes a REST call to one of the services, which sets off creating a row in the DB, pushing that row into a queue, and so on.
Question: We need to implement the client app so that we know, at any given point in time, where the processing is. How do I implement this in RxJava 2/Java 9?
First, you need to determine what functionality in RxJava 2 will benefit you.
Coordination between asynchronous sources. Since you have a) event-driven requests on one side, and b) network queries on the other, this is a good fit so far.
Managing a stream of data, transforming and combining from one or more sources. You have given no indication that this is required.
Second, you need to determine what RxJava 2 does not provide:
Network connections. This is provided by your existing libraries.
Database management. Again, this is provided in your existing solutions.
Now, you have to decide whether the benefits in the first list add up to something worthwhile, given the up-front cost of learning a new library.
I have a few different RESTful services that are hosted on different servers which use different DBs. I have a few RESTful services which call multiple such services above in what is supposed to be a transactional unit. We end up with data consistency issues if any of these RESTful services fail. Is there a neat architectural way of orchestrating a rollback? Or is having transaction managers the way to go?
As a simplistic example, RESTful service 1 has a POST request which reduces item count of thingamajig by 1.
RESTful service 2 POSTs a payment. If service 2 fails, how can we cleanly implement a rollback on service 1 without having a new RESTful refund service (it is OK if this has to be the way to go)? I am looking for an architectural answer to the above issue that is in keeping with REST principles.
Your answer: https://stackoverflow.com/a/1390393/607033 You cannot use transactions because in REST the client maintains the client state and the server maintains the resource state. So if you want the resource state to be maintained by the client, then it is not REST, because it would violate the stateless constraint. Violating the stateless constraint usually causes poor scalability. In this case it will cause poor horizontal scalability, because you have to sync ongoing transactions between the instances. So please, don't try to build multi-phase commits on top of REST services.
Possible solutions:
You can stick with immediate consistency and use only a single web service instead of two. For resources such as databases and filesystems, multi-phase commit is a necessity. When you break up a bigger REST service and move the usage of these resources into multiple smaller REST services, problems can occur if you split it wrongly. This is because one of the REST services will require a resource which it does not have access to, so it has to use another REST service to access that resource. This forces the multi-phase commit code to move to a higher abstraction level, the level of REST services. You can fix this by merging these 2 REST services and moving the code back to the lower abstraction level where it belongs.
Another workaround is to use REST with eventual consistency, so you can respond with 202 Accepted immediately and process the accepted request later. If you choose this solution, you must keep in mind while developing your application that the REST services are not always in sync. Of course, this approach works only for internal REST services where you can be sure that the client retries if a REST service is not available, i.e. where you write and run the client code yourself.
Another, probably the ugliest, workaround is to store every transaction as a resource, so you could POST commits and rollbacks. I think this possible solution is not viable, because it would violate the uniform interface constraint. We would use POST /transactions/ {resource: "/forums/12/messages/45", method: "PUT", data: "..."} and POST /transactions/1/commit instead of, for example, PUT /forums/12/messages/45
Distributed transactions are complex and require each participating system to support a notion of rollback. In the case of your services, they would each have to support some form of rollback. Co-ordinating something like this in a distributed system may not be practical or advisable in a synchronous way. In a situation like this, you would want to asynchronously roll back and the system would eventually reach consistency at a certain point in the future. There are obviously many other details (timeouts, error handling, retries, etc.).
For more details on eventual consistency, check out the Wikipedia entry here.
Basically the problem is that you need to use transactions in an environment (HTTP) that by default is not transactional - in DB sense (because in HTTP a transaction is a successful request - response cycle).
The content of @leeor's response is fully correct; what I'd like to add is how I'd solve the problem from the design side.
So you need a single endpoint, maybe /transactions. Via the POST method you add a new transaction (with all the necessary details) that is immutable: after creation you can only ask for its data/status via the GET method. Only the server itself can update the transaction status/data.
Under the hood (during transaction creation), a snapshot (that could be reversed later on) should be created for each resource taking part in the transaction. Then execution of all the operations should begin, and in case of any failure all the snapshots should be reversed. You do not mention any technologies, so it's really hard to advise something reasonable. What you'd need for sure is comprehensive logging.
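The snapshot-and-reverse idea can be sketched in a few lines of plain Java. Everything here is hypothetical and heavily simplified: the resources are entries in an in-memory map rather than remote services, an operation is a signed change to a counter, and a failure is modelled as an unknown resource or a counter that would drop below zero.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; real resources would live behind separate services.
class SnapshotTransaction {
    enum Status { COMMITTED, ROLLED_BACK }

    // One step of the transaction: change the counter `key` by `delta`.
    static final class Operation {
        final String key;
        final int delta;
        Operation(String key, int delta) {
            this.key = key;
            this.delta = delta;
        }
    }

    // Applies all operations in order. The first time a resource is touched
    // its value is snapshotted; on any failure every snapshot is restored.
    static Status execute(Map<String, Integer> resources, List<Operation> operations) {
        Map<String, Integer> snapshots = new LinkedHashMap<>();
        try {
            for (Operation op : operations) {
                Integer current = resources.get(op.key);
                if (current == null) {
                    throw new IllegalStateException("unknown resource " + op.key);
                }
                snapshots.putIfAbsent(op.key, current);
                int next = current + op.delta;
                if (next < 0) {
                    throw new IllegalStateException("insufficient " + op.key);
                }
                resources.put(op.key, next);
            }
            return Status.COMMITTED;
        } catch (IllegalStateException e) {
            resources.putAll(snapshots); // reverse every change made so far
            return Status.ROLLED_BACK;
        }
    }
}
```

In a distributed setting the restore step is itself a remote call that can fail, which is where the comprehensive logging and retry handling mentioned above come in.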