Caching in a service-oriented architecture (REST)

In a distributed systems environment, we have a RESTful service that needs to provide high read throughput at low latency. Due to limitations in the database technology, and given that it's a read-heavy system, we decided to use Memcached. Now, in an SOA, there are at least two choices for the location of the cache: either the client looks up the cache before calling the server, or the client always calls the server, which looks up the cache. In both cases, the caching itself is done in a distributed Memcached server.
Option 1: Client -> RESTful Service -> MemCached -> Database
OR
Option 2: Client -> MemCached -> RESTful Service -> Database
I have an opinion, but I'd love to hear arguments for and against either option from SOA experts in the community. Please assume either option is feasible; it's an architecture question. I'd appreciate you sharing your experience.

I have seen
Option 1: Client -> RESTful Service -> Cache Server -> Database
work very well. The pros, IMHO, are that you can operate this layer in a way that takes part of the load off the DB, assuming your end users make a lot of similar requests. The client can still decide what storage to spare for caching on its side, and how often to clear it.

I prefer Option 1 and I am currently using it. This way it is easier to control the load on the DB (just as @ekostatinov mentioned). I have lots of data that is required for every user in the system but never changes (such as system rules, types of items, etc.). Caching it really reduces the DB load. This way you can also control the behavior of the cache (such as when to clear items).

Option 1 is the preferred option, as it makes Memcached an implementation detail of the service. The other option means that if the business changes and some things can no longer be kept in the cache (or other things now can), the clients would have to change. Option 1 hides all that behind the service interface.
Additionally, Option 1 lets you evolve the service as you wish. For example, maybe later you decide you need a new technology, or maybe you solve the performance problem in the DB itself. Again, Option 1 lets you make all these changes without dragging the clients into the mess.
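To make the "implementation detail" point concrete, here is a minimal sketch of Option 1 in Python. Everything in it is hypothetical: `InMemoryCache` is a dictionary-backed stand-in for a Memcached client, and `ItemService`, `get_item`, and the 60-second TTL are made-up names and values, not anything from the original question.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Stand-in for a Memcached client (get/set with a TTL); illustration only."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> Optional[str]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._items[key] = (self._clock() + ttl_seconds, value)


class ItemService:
    """The only thing clients call; the cache stays hidden behind it (Option 1)."""

    def __init__(
        self,
        cache: InMemoryCache,
        load_from_db: Callable[[str], str],
        ttl_seconds: float = 60.0,  # assumed value; the service owns this policy
    ) -> None:
        self._cache = cache
        self._load_from_db = load_from_db
        self._ttl = ttl_seconds
        self.db_reads = 0  # counter kept only to show the effect on DB load

    def get_item(self, item_id: str) -> str:
        key = f"item:{item_id}"
        cached = self._cache.get(key)
        if cached is not None:
            return cached  # cache hit: the database is not touched
        value = self._load_from_db(item_id)
        self.db_reads += 1
        self._cache.set(key, value, self._ttl)
        return value
```

Because clients only ever see `get_item`, the TTL, the key scheme, or the cache technology itself can change without any client noticing, which is the argument made above.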

Is the RESTful API exposed to external consumers? In that case it is up to the consumer to decide whether they want to use a cache and how much stale data they can tolerate.
As far as the RESTful service goes, the service is the container of business logic and the authority on the data, so it decides how much to cache, the cache expiry, when to flush, etc. A client consuming the REST service always assumes that the service is providing it with the latest data. Hence Option 1 is preferred.
Who is the client in this case?
Is it a wrapper for your REST API? Are you providing both the client and the service?

I can share my experience with the Enduro/X middleware implementation. For local XATMI service calls, any client process connects to shared memory (LMDB) and checks for the result there. If a response is saved, it returns the data directly from shared memory. If the data is not there, the client process takes the longer path and performs the IPC call. In the case of REST access, network clients still perform the HTTP invocation, but the HTTP server, acting as an XATMI client, returns the data from shared memory. In real life, this technique greatly boosted a web front-end application that used the middleware via REST calls.

Related

Implementing REST using JDBC Tables

Currently we are implementing REST APIs using Spring Boot. Since our APIs are growing in number, we are considering a different approach to implementing them.
The approach is as follows:
Expose a single service to receive all the HTTP requests.
We will have the URIs configured in a database table to call the next set of services. These services are configured to listen to particular JMS messages.
The next set of services will receive the JMS messages and process the data.
Below are my questions:
Will the above approach still represent the REST architecture?
What are the downsides of the above approach (we are aware of network latency), anything other than network latency?
Which REST architecture benefits will we be missing?
Or can we just say that our approach is the REST architecture done differently?
You're making two major choices, and each can be decided separately:
1) Having a single HTTP service
2) Using JMS as the communication between this service and the underlying microservices
Regarding #1, if you do this, you can no longer call your services REST, since the whole point of REST is to use HTTP verbs together with your domain objects for a predictable set of endpoints. GET on /objects/ means objects are being fetched, POST on /objects means a new object is being created, etc. Now, this is OK; you can do it this way and it can work, though it will be "non-standard".
In fact, you might want to check out GraphQL https://www.howtographql.com/basics/1-graphql-is-the-better-rest/ as it's pretty close to what you're trying to do.
These days, REST and GraphQL really seem to be the two popular approaches.
Another way to do REST, if you're looking to simply expose REST services on your domain objects without having to write a lot of code, is Spring Data REST: https://spring.io/projects/spring-data-rest and if you're comfortable with Spring already, this should be pretty easy to understand.
Regarding #2, your choice of communication between your single gateway service and the underlying services: do most of your calls require synchronous answers, such as a UI asking for data to display in a browser or on a phone? If so, JMS is not a good approach. JMS would be an OK approach if the majority of your services were asynchronous, for example someone submitting a stock trade request. The UI would just need to know the request was submitted; it will actually be processed some time later and the result will be fetched asynchronously.
Without knowing much about your application, I would recommend sticking with HTTP between your services for simplicity's sake unless there is a good reason to switch to JMS.
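The "HTTP verbs plus domain objects" idea from point #1 can be sketched as a small dispatcher. This is a framework-free illustration in Python, not Spring code; `ObjectsApi`, its `handle` method, and the `/objects` resource are hypothetical names chosen to match the example in the answer.

```python
from typing import Dict, List, Optional, Tuple

Response = Tuple[int, Optional[dict]]


class ObjectsApi:
    """Maps (HTTP verb, resource path) pairs onto operations on one resource."""

    def __init__(self) -> None:
        self._objects: Dict[int, dict] = {}
        self._next_id = 1

    def handle(self, method: str, path: str, body: Optional[dict] = None) -> Response:
        parts: List[str] = [p for p in path.split("/") if p]
        if not parts or parts[0] != "objects" or len(parts) > 2:
            return 404, None

        if len(parts) == 1:  # collection: /objects
            if method == "GET":
                return 200, {"items": list(self._objects.values())}
            if method == "POST":
                created = {"id": self._next_id, **(body or {})}
                self._objects[self._next_id] = created
                self._next_id += 1
                return 201, created
            return 405, None  # verb not supported on the collection

        if not parts[1].isdigit():  # single item: /objects/<id>
            return 404, None
        object_id = int(parts[1])
        if method == "GET":
            found = self._objects.get(object_id)
            return (200, found) if found is not None else (404, None)
        if method == "DELETE":
            removed = self._objects.pop(object_id, None)
            return (204, None) if removed is not None else (404, None)
        return 405, None
```

A caller who knows the resource name can guess every endpoint: `handle("POST", "/objects", {...})` creates, `handle("GET", "/objects/1")` fetches. A single catch-all service that routes by database lookup gives up that predictability.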

JavaFX interactivity with Spring MVC Restful

I am building a JavaFX client application that communicates with a Spring MVC RESTful server application (Spring Boot 1.4.1), and it works as expected.
Some features require fast interaction with the server to validate limits and availability before proceeding to the next input. For example, while records are being accumulated (each confirmed record is temporarily stored in a TableView before being sent to the server for storage), I need to check whether an entered member number is valid and whether the member has exceeded the insert limit, before the records are actually saved.
Within the scope of JavaFX and the Spring framework (front end and back end), how can such features be made to look more interactive (or live) than the normal "let-me-wait-for-response" approach?
If the question is not clear, just ask; otherwise I think it is.
It appears that the only interaction between your client (JavaFX) and server (Spring Boot) is through a REST API. This makes short bursts of data (such as validation) take longer.
Switching to another communication mechanism (for example gRPC, or Netty with Msgpack) could help. Note that once you open the door to non-REST calls, it will make you rethink the use of REST in the first place.
Non-REST communication may not be an option depending on your requirements (firewalls, etc.), or may need additional setup to surmount other obstacles. In other words, there's no free lunch.

REST API: Metadata goes to DB, file to storage. To proxy or not to proxy through API end-point?

I'm currently planning a REST-style API. The problem I have is that the client will send one or more files, belonging to the same "document", but while the metadata is to be stored in a DB, the files are going to file storage (probably S3, in my case).
The way I see it, there are two ways of doing it:
1. Send the metadata to the API end-point, which responds with the location for storing the files. And then, in a separate request, store the files directly.
2. Send metadata and files, in the same request, to the API, which acts as a proxy and takes care of sending the various parts to their final destinations.
The good thing about 1. is that the API server will have less to deal with, so can be smaller, and bandwidth is only paid once (client -> storage). Giving a good UX is, on the other hand, likely to be harder, and there will be more state to keep track of.
With 2. it's easy to ensure the transaction is atomic, since the API server is the sole gatekeeper. However, the server will need to be more powerful, and bandwidth may be paid twice (client -> API -> storage).
So, what's the best way of dealing with this situation, and if going with 1., are there any problems to look out for?
Assuming you have external clients, I believe that #2 is the better bet. The way to catch and keep clients is to have the best possible UX, with a simple interface that is easy to learn and use. As you said, you also get to keep atomic transactions, which will save you plenty of headaches. In my experience, server power is relatively cheap, and you can always send a 202 (Accepted) back to the client instead of a 201 (Created) if the files are pushed to storage after the response.
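The 202-instead-of-201 idea can be sketched roughly as follows. This is only an illustration under stated assumptions: the two dictionaries stand in for the metadata DB and the file storage (S3 in the question), and `DocumentApi`, `submit`, `process_pending`, and the `/documents/<id>` status URL are hypothetical names.

```python
import uuid
from collections import deque
from typing import Deque, Dict, Optional, Tuple


class DocumentApi:
    """Proxy-style endpoint: accepts metadata and files in one request."""

    def __init__(self) -> None:
        self.metadata_db: Dict[str, dict] = {}    # stand-in for the database
        self.file_storage: Dict[str, bytes] = {}  # stand-in for S3 or similar
        self._pending: Deque[Tuple[str, Dict[str, bytes]]] = deque()

    def submit(self, metadata: dict, files: Dict[str, bytes]) -> Tuple[int, dict]:
        """Record the document and answer 202: accepted, not yet fully stored."""
        doc_id = str(uuid.uuid4())
        self.metadata_db[doc_id] = {
            **metadata,
            "status": "pending",
            "files": sorted(files),
        }
        self._pending.append((doc_id, files))
        return 202, {"id": doc_id, "status_url": f"/documents/{doc_id}"}

    def process_pending(self) -> int:
        """Background step: push queued files to storage, then mark them stored."""
        processed = 0
        while self._pending:
            doc_id, files = self._pending.popleft()
            for name, content in files.items():
                self.file_storage[f"{doc_id}/{name}"] = content
            self.metadata_db[doc_id]["status"] = "stored"
            processed += 1
        return processed

    def status(self, doc_id: str) -> Tuple[int, Optional[dict]]:
        record = self.metadata_db.get(doc_id)
        if record is None:
            return 404, None
        return 200, {"id": doc_id, "status": record["status"]}
```

The client gets a quick answer plus a URL to poll, while the API server remains the single gatekeeper. A real implementation would also need to handle a failed upload, for example by marking the record as failed so the client can retry.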

Is it a good idea to write a gateway for PostgreSQL in Erlang?

I'm writing a web application in Erlang, and want to store my data to PostgreSQL.
There are two kinds of resources in my application. One kind is very important while the other is not that important.
For the important one, no data loss is allowed.
For the less important one, data loss due to system failure is ok.
I want to gain maximum efficiency and came up with such an idea: write a gateway for PostgreSQL. The gateway is a gen_server, and business logic (BL) parts can talk to the gateway for storing resources.
For storing important resources, BL parts send the resources to be stored to the gateway, and block to receive a message (success or failure), and finally respond to the user with a web page.
For storing less important resources, BL parts only send the resources to the gateway without blocking. After sending the resources, BL parts respond with a web page directly.
What I'm expecting from this idea is fewer seconds per request, since most of the resources are less important ones. But I wonder if this is a good idea, or in other words, can I really get what I'm expecting?
Please answer according to your experience or some reliable "web search results". Thanks. :-)
I can see two problems with your proposal:
All messages sent to the gen_server gateway will be serialized (blocking or non-blocking)
If the gen_server gateway process crashes, you will lose at least the messages in the process mailbox.
What I would do is create a helper module (not a process) that is responsible for the database interactions. This module would use a PostgreSQL library that supports connection pooling (so the calls to the DB can have some parallelism).
If you wish to do non-blocking DB operations for the less important resources, just spawn a process to do the DB interaction and move on.
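The same shape can be sketched outside Erlang. The Python below is an analogy rather than Erlang code: a thread pool plays the role of the pooled connections, and submitting a task without waiting plays the role of spawning a process. `ResourceStore`, `store_important`, `store_best_effort`, and the `write_row` callback are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable


class ResourceStore:
    """Helper that runs DB writes on a small pool instead of one serial gateway."""

    def __init__(self, write_row: Callable[[dict], bool], max_workers: int = 4) -> None:
        self._write_row = write_row
        # Several workers, so writes are not serialized through a single mailbox.
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def store_important(self, resource: dict, timeout: float = 5.0) -> bool:
        """Block until the write finishes; errors and timeouts reach the caller."""
        return self._pool.submit(self._write_row, resource).result(timeout=timeout)

    def store_best_effort(self, resource: dict) -> Future:
        """Hand the write off and return at once; the caller does not wait."""
        return self._pool.submit(self._write_row, resource)

    def close(self) -> None:
        self._pool.shutdown(wait=True)
```

As with the spawned process in the answer, a best-effort write that has not run yet is lost if the system goes down, which matches the stated tolerance for the less important resources.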
Some links to PostgreSQL libraries for Erlang (I haven't used any):
Postgresql connection pooling in Erlang
http://zotonic.com/page/519/epgsql-postgresql-driver

How often does RESTful client pull server data

I have a RESTful web service application that I developed using the NetBeans IDE. The application uses MySQL as its back-end server. What I am wondering now is how often a client application that uses my RESTful application would refresh to reflect data changes on the server.
Are there any default pull intervals that clients get from the RESTful application? Does the framework (JAX-RS) do something about it, or is that my business to take care of?
Thanks in advance
#Abraham
There are no such rules. The only thing you can use to implement this properly is HTTP's caching capabilities. The service must include control information stating how long the representation of a particular resource can be cached, whether it must be revalidated, whether it must never be cached, etc.
On the client side, each client may decide its own way of keeping itself in sync with the service. It can be done by storing data locally and serving the end user from a local cache, etc. The service cannot (and shouldn't) know how clients are implemented; the only thing the service can do is include caching information in response messages, as I already mentioned above.
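As a rough illustration of the caching information a service can attach, the Python sketch below builds a response with `Cache-Control` and `ETag` headers and answers a conditional request with 304 Not Modified. The header names and the 304 status are standard HTTP; the `respond` and `make_etag` functions, the hash-based tag, and the 60-second lifetime are assumptions made for the example.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a validator from the representation (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(
    body: bytes,
    max_age: int,
    if_none_match: Optional[str] = None,
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET on a cacheable resource."""
    etag = make_etag(body)
    headers = {
        "Cache-Control": f"max-age={max_age}",  # how long clients may reuse it
        "ETag": etag,                           # lets clients revalidate cheaply
    }
    if if_none_match == etag:
        return 304, headers, b""  # unchanged: the client keeps its cached copy
    return 200, headers, body


# First fetch, then a revalidation once the client's copy has gone stale.
status, headers, payload = respond(b'{"price": 10}', max_age=60)
revalidated, _, empty = respond(b'{"price": 10}', 60, if_none_match=headers["ETag"])
```

How often a client actually re-requests within or after those 60 seconds is still the client's decision; the service only states how fresh the data can be assumed to be.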
It is your responsibility to schedule the calls to the service so that they run again and again. You can set a timeout interval, but there is no pull interval.