Let's say I have a microservice that just registers a user into the database, and we expose it to our clients. I want to understand the better way of handling the following scenario:
What if the user sends multiple requests in parallel (say 10 requests within one second) with the same request body? Should I keep the requests in a queue, register the user for the very first request, and deny the other 9? Or should I classify the requests, compare their bodies, pick one request per distinct body, and reject the rest? Or what's the best thing I can do to handle this scenario?
One more thing I would like to understand: is it recommended to have rate limiting (say, n requests per minute) at the global API level or at the microservice level?
Thanks in advance!
The best way is to use an idempotent call. Instead of exposing an endpoint like this:
POST /users + payload
expose an endpoint like this:
PUT /users/ID + payload
You let the caller generate the ID, and you require a UUID: with UUIDs, it doesn't matter who generates them. This way, if the caller invokes your endpoint multiple times, the first call creates the user and the following calls just update the user with the same payload, which means they do nothing. At the very least, you won't generate duplicates.
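A minimal sketch of those upsert semantics in Scala, with an in-memory map standing in for the users table (all names are illustrative):

import java.util.UUID
import scala.collection.concurrent.TrieMap

// In-memory stand-in for the users table, keyed by the caller-generated UUID.
object UserStore {
  private val users = TrieMap.empty[UUID, String]

  // PUT /users/:id -- an upsert: the first call creates the row,
  // retries simply overwrite it with the same payload (a no-op).
  def put(id: UUID, payload: String): Unit = users.put(id, payload)
}

object Demo extends App {
  val id = UUID.randomUUID()               // the *caller* generates the id
  UserStore.put(id, """{"name":"John"}""") // first request creates the user
  UserStore.put(id, """{"name":"John"}""") // duplicate requests are harmless no-ops
}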
It's always good practice to protect your services with rate limiting, and you have to set it at the API level. If you define it at the microservice level, you will authorize N times the intended rate when you have N instances, because the load balancer distributes the requests: for example, a limit of 100 requests/minute enforced per instance becomes an effective 500 requests/minute across 5 instances.
I have a complex problem and I can't figure out the best solution for it.
This is the scenario:
I have N servers under a single load balancer and a Database.
All the servers connect to the database
All the servers run the same identical application
I want to implement a cache in order to decrease the response time and reduce the HTTP calls Server -> Database to a minimum.
I implemented it, and it works like a charm on a single server... but I need a mechanism to update all the other caches on the other servers when the data is no longer valid.
Example:
I have server A and server B, each with its own cache.
The first request from the outside, for example "get user information", is answered by server A.
Its cache is empty, so it needs to get the information from the database.
The second request goes to B; server B's cache is also empty, so it needs to get the information from the database.
The third request hits server A again; now the data is in its cache, so it replies immediately without a database request.
The fourth request, on server B, is a write request (for example, change the user's name); server B makes the change on the database and updates its own cache, invalidating the old user.
But server A still has the old, invalid user.
So I need a mechanism for server B to tell server A (or the N other servers) to invalidate/update the data in their caches.
What is the best way to do this in the Scala Play framework?
Also, consider that in the future the servers may be geo-redundant, i.e. in different geographical locations, on different networks, served by different ISPs.
It would also be great to update all the other caches when a user is loaded (one server requests it from the database and updates all the servers' caches), so that every server is ready for future requests.
Hope I have been clear.
Thanks
Since you're using Play, which under the hood, already uses Akka, I suggest using Akka Cluster Sharding. With this, the instances of your Play service would form a cluster (including failure detection, etc.) at startup, and organize between themselves which instance owns a particular user's information.
So proceeding through your requests, the first request to GET /userinfo/:uid hits server A. The request handler hashes uid (e.g. with murmur3: consistent hashing is important) and resolves it to, e.g., shard 27. Since the instances started, this is the first time we've had a request involving a user in shard 27, so shard 27 is created and let's say it gets owned by server A. We send a message (e.g. GetUserInfoFor(uid)) to a new UserInfoActor which loads the required data from the DB, stores it in its state, and replies. The Play API handler receives the reply and generates a response to the HTTP request.
For the second request, it's for the same uid, but hits server B. The handler resolves it to shard 27 and its cluster sharding knows that A owns that shard, so it sends a message to the UserInfoActor on A for that uid which has the data in memory. It replies with the info and the Play API handler generates a response to the HTTP request from the reply.
In this way, all subsequent requests (e.g. the third, the same GET hitting server A) for the user info will not touch the DB, no matter which server they hit.
For the fourth request, which let's say is POST /userinfo/:uid and hits server B, the request handler again hashes the uid to shard 27 but this time, we send, e.g., an UpdateUserInfoFor(uid, newInfo) message to that UserInfoActor on server A. The actor receives the message, updates the DB, updates its in-memory user info and replies (either something simple like Done or the new info). The request handler generates a response from that reply.
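For concreteness, here is a rough sketch of this flow with classic Akka Cluster Sharding; the message types, shard count, and DB stubs are illustrative rather than a drop-in implementation:

import akka.actor.{Actor, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}
import scala.util.hashing.MurmurHash3

final case class GetUserInfoFor(uid: String)
final case class UpdateUserInfoFor(uid: String, newInfo: String)

class UserInfoActor extends Actor {
  private var cached: Option[String] = None // per-uid in-memory state

  def receive: Receive = {
    case GetUserInfoFor(uid) =>
      if (cached.isEmpty) cached = Some(loadFromDb(uid)) // only the first request hits the DB
      sender() ! cached.get
    case UpdateUserInfoFor(uid, newInfo) =>
      writeToDb(uid, newInfo) // update the DB first...
      cached = Some(newInfo)  // ...then the in-memory copy, so reads stay consistent
      sender() ! "Done"
  }

  // stand-ins for real database access
  private def loadFromDb(uid: String): String = s"info-for-$uid"
  private def writeToDb(uid: String, info: String): Unit = ()
}

object UserInfoSharding {
  val numberOfShards = 100 // illustrative

  // every message for the same uid is routed to the same entity actor
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case msg @ GetUserInfoFor(uid)       => (uid, msg)
    case msg @ UpdateUserInfoFor(uid, _) => (uid, msg)
  }

  // murmur3-based consistent hashing of the uid into a shard id
  private def shardOf(uid: String) =
    math.abs(MurmurHash3.stringHash(uid) % numberOfShards).toString
  val extractShardId: ShardRegion.ExtractShardId = {
    case GetUserInfoFor(uid)       => shardOf(uid)
    case UpdateUserInfoFor(uid, _) => shardOf(uid)
  }

  def start(system: ActorSystem) =
    ClusterSharding(system).start(
      typeName = "UserInfo",
      entityProps = Props[UserInfoActor](),
      settings = ClusterShardingSettings(system),
      extractEntityId = extractEntityId,
      extractShardId = extractShardId)
}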
This works really well: I've personally seen systems using cluster sharding keep terabytes in memory and operate with consistent single-digit millisecond latency for streaming analytics with interactive queries. Servers crash, and the actors running on the servers get rebalanced to surviving instances.
It's important to note that anything matching your requirements is a distributed system and you're requiring strong consistency, i.e. you're requiring that it be unavailable under a network partition (if B is unable to communicate an update to A, it has no choice but to fail the request). Once you start talking about geo-redundancy and multiple ISPs, you're going to see partitions pretty regularly. The only way to get availability under a network partition is to relax the consistency demand and accept that sometimes the GET will not incorporate the latest PUT/POST/DELETE.
This is probably not something that you want to build yourself. But there are plenty of distributed caches out there that you can use, such as Ehcache or Infinispan. I suggest you look into one of those two.
I am trying to tackle the following scenario, possibly using Kafka Streams and interactive queries.
Imagine an event e that triggers 500 (or any number of) clients to make a request to a RESTful backend. Of these 500 requests, let's assume half have parameter set X and the other half parameter set Y. The backend needs to compute something only twice (once per distinct parameter set) and return some result to the clients.
My idea is to publish to a topic on the first request that arrives for each parameter set, compute the result, and have all subsequent requests query the local store for it. Would that be feasible? Is there some more efficient way I am not aware of?
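For reference, a rough sketch of how that idea might look in the Kafka Streams Scala DSL; the topic, store, and compute names are made up:

object DedupCompute {
  import org.apache.kafka.streams.scala.ImplicitConversions._
  import org.apache.kafka.streams.scala.Serdes._
  import org.apache.kafka.streams.scala.StreamsBuilder

  // stand-in for the expensive backend computation
  def compute(params: String): String = s"result-for-$params"

  val builder = new StreamsBuilder
  builder
    .stream[String, String]("requests") // key = the parameter set (X or Y)
    .groupByKey
    .aggregate("") { (params, _, acc) =>
      if (acc.isEmpty) compute(params) else acc // compute only once per distinct key
    }
  // naming the store (e.g. Materialized.as("results-store")) is what lets
  // the REST layer answer later requests via interactive queries
}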
I am reading a lot about REST APIs, and I keep stumbling upon the term idempotency. Basically GET, HEAD, PUT, DELETE and OPTIONS are all idempotent, and POST is not.
This statement on http://www.restapitutorial.com/lessons/idempotency.html made me doubt my understanding of idempotency:
From a RESTful service standpoint, for an operation (or service call) to be idempotent, clients can make that same call repeatedly while producing the same result. In other words, making multiple identical requests has the same effect as making a single request. Note that while idempotent operations produce the same result on the server (no side effects), the response itself may not be the same (e.g. a resource's state may change between requests).
So does idempotency actually have something to do with server work, or with the response?
What confuses me is that if I have
GET /users/5
returning
{
"first_name" : "John",
"last_name" : "Doe",
"minutes_active": 10
}
and then make the same request after one minute, I would get
GET /users/5
{
"first_name" : "John",
"last_name" : "Doe",
"minutes_active": 11
}
How is this idempotent?
Furthermore, if the response contains some kind of UUID which is unique for each response, would that break idempotency?
And finally, is idempotency the same server work over and over again, or the same result over and over again for the same/single request?
So does idempotency actually have something to do with server work, or with the response?
It refers to the server's state after subsequent requests of the same type.
So, let's suppose that the client makes a request that changes the server's old state, say S1, to a new state, S2, and then makes the same request again.
If the method is idempotent, then it is guaranteed that the second request will not change the server's state again; it will remain S2.
But if the method is not idempotent, there is no guarantee that the state will remain S2; it may change to whatever state the server wants, for example S3 or back to S1. So in this case the client should not send the command again if a communication error occurs, because the outcome may not be the same as the first time it sent the command.
GET /users/5
How is this idempotent?
You may call this URL with the GET method as many times as you want, and the server will not change its internal state, e.g. the user's last_name; since the state does not change, GET is idempotent.
Furthermore, if the response contains some kind of UUID which is unique for each response, would that break idempotency?
The response has nothing to do with the server's state after subsequent requests of the same type, so the response can be unique for each request and the method is still idempotent. For example, in the GET request from your question, minutes_active grows each minute, and this does not make GET non-idempotent.
Another example of an idempotent method is DELETE. If you delete a user, the user is gone/deleted. Because DELETE is idempotent, after a second attempt/request to delete the same user, the user remains deleted, so the state does not change. Of course, the second response could be a little different, something like "warning, user already deleted", but this has nothing to do with idempotency.
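A tiny sketch of that DELETE behaviour, where the responses differ but the final server state is the same (all names are illustrative):

object Users {
  private val users = scala.collection.mutable.Set("5")

  def delete(id: String): String =
    if (users.remove(id)) "deleted"
    else "warning, user already deleted"
}

object TwoDeletes extends App {
  println(Users.delete("5")) // "deleted" -- state changes: user 5 is gone
  println(Users.delete("5")) // "warning, user already deleted" -- state unchanged
  // after both calls the server state is identical, so DELETE is idempotent
}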
For understanding idempotency in REST, your best starting point is probably the definition included in RFC 7231:
A request method is considered "idempotent" if the intended effect on the server of multiple identical requests with that method is the same as the effect for a single such request.
For "effect", think side effect. When the server is advertising that a particular action is idempotent, it is telling you that the (semantically significant) side effects will happen at most once.
// the guarantee of an idempotent operation
oldState.andThen(PUT(newState)) === oldState.andThen(PUT(newState)).andThen(PUT(newState))
Safe methods are inherently idempotent, because they have no effect on the server.
// the guarantee of a safe operation
oldState === oldState.andThen(GET)
// therefore, the guarantee of an idempotent operation follows trivially
oldState.andThen(GET) === oldState.andThen(GET).andThen(GET)
So does idempotency actually have something to do with server work, or with the response?
Server work. More generally, it's a constraint on the receiver of a command to change state.
Roy Fielding shared this observation in 2002:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property (money, BTW, is considered property for the sake of this definition).
If you substitute PUT/DELETE for GET, and idempotent for safe, I think you get a good picture -- if a loss of property occurs because the server received two copies of an idempotent request, the fault is that the server handled the request improperly, not that the client broadcast the request more than once.
This is important because it allows at-least-once delivery over an unreliable network. From RFC 7231:
Idempotent methods are distinguished because the request can be repeated automatically if a communication failure occurs before the client is able to read the server's response.
Contrast this with POST, which does not promise idempotent handling. Submitting a web form twice may produce two side effects on the server, so client implementations (and intermediary components, like proxies) cannot assume it is safe to repeat a lost request.
Back in the day, browser dialogs asking "Are you sure you want to resubmit this form?" were common for precisely this reason.
And finally, is idempotency the same server work over and over again, or the same result over and over again for the same/single request?
Work on the server. An idempotent change is analogous to SET or REPLACE (aka compare-and-swap).
The responses may, of course, be different. A conditional PUT, for example, will include metadata "indicating a precondition to be tested before applying the method semantics to the target resource."
So the server might change state in response to receiving the first copy of a PUT, sending back 200 OK to indicate that the request was successful; but upon receiving the second copy, the server will find that the now-changed state of the resource no longer matches the provided metadata, and will respond with 412 Precondition Failed.
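A toy sketch of that replay behaviour, with a version counter standing in for the ETag/If-Match metadata (names and logic are illustrative):

final case class Resource(version: Int, body: String)

object ConditionalPut {
  private var current = Resource(version = 1, body = "old")

  // returns an HTTP-ish status: 200 on success, 412 if the
  // If-Match precondition no longer matches the stored version
  def put(ifMatchVersion: Int, newBody: String): Int = synchronized {
    if (current.version != ifMatchVersion) 412 // Precondition Failed
    else {
      current = Resource(current.version + 1, newBody)
      200 // OK
    }
  }
}

object Replay extends App {
  println(ConditionalPut.put(ifMatchVersion = 1, newBody = "new")) // first copy: 200, state changes
  println(ConditionalPut.put(ifMatchVersion = 1, newBody = "new")) // duplicate: 412, state unchanged
}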
I noticed you mentioned "may produce" in "Contrast this with POST, which does not promise idempotent handling. Submitting a web form twice may produce two side effects on the server...". Basically the REST standards declare POST as non-idempotent, but one could actually make POST idempotent; it would just run contrary to those standards...
No, that's not quite right. The HTTP specification does not require that POST support idempotent semantics -- which means that clients and intermediaries are not permitted to assume that it does. The server controls its own implementation; it may provide idempotent handling of the POST request.
But the server has no way to advertise this capability to clients (or intermediaries) so that they could take advantage of it.
I'm submitting multiple POST requests on a REST API using the same input JSON. That is, multiple users (e.g. 10,000) submit the same POST with the same JSON, to measure the performance of the POST request; but I also need to capture the result of completion for each submission using a GET method, and measure the performance of GET as well. This is an asynchronous process, as follows:
POST submit
generates an ID1
wait for processing
in next step another ID2 will be generated
wait for processing
in next step another ID3 will be generated
wait for processing
final step is completion.
So I need to create a JMeter test plan that can process these asynchronous POST submissions by multiple users, wait for them to be processed, and finally capture the completion of each submission. I need to generate a report, in graph and table format, that shows latency and throughput. Sorry for my lengthy question. Thanks, Santana.
Based on your clarification in the comment, it looks to me like you have a fairly straightforward script, which could be expressed like this:
Thread Group
HTTP Sampler 1 (POST)
Post-processor: save ID1 as a variable ${ID1}
Timer: wait for next step to be available
HTTP Sampler 2 (GET, uses ${ID1})
Post-processor: save ID2 as a variable ${ID2}
Timer: wait for next step to be available
HTTP Sampler 3 (GET, uses ${ID1} and ${ID2})
Post-Processor: extract completion status
(Optional) Assertion: check completion status
I cannot say which Timer or which Post-processor specifically to use; they depend on the specific requests you have.
You don't need to worry about multiple users from the JMeter perspective (the variables are always independent between users), but of course you need to make sure that the multiple initial POSTs do not conflict with each other from the application's perspective (i.e. each POST should process independent data).
Latency is part of the standard interface used to save results to the file. But as JMeter's own documentation states, latency measurement is a bit limited in JMeter:
JMeter measures the latency from just before sending the request to just after the first response has been received. Thus the time includes all the processing needed to assemble the request as well as assembling the first part of the response, which in general will be longer than one byte. Protocol analysers (such as Wireshark) measure the time when bytes are actually sent/received over the interface. The JMeter time should be closer to that which is experienced by a browser or other application client.
Throughput is available in some UI listeners, but can also be calculated from the raw data in the file, in the same way JMeter calculates it:
Throughput = (number of requests) / (total time)
If you are planning to run 100-200 users (or for debugging purposes), use the UI listeners; with higher load, use JMeter's non-UI mode and save the results to CSV, which you can analyze later. I'd say get your test to pass in UI mode with 100 users first, and then set up a more robust multi-machine 10K-user test.
One connection sends many requests to the server.
How can the requests be processed concurrently?
Please use a simple example, like the time server or echo server from netty.io, to illustrate the operation.
One way I could find is to create a separate threaded handler that is called in a producer/consumer fashion.
The producer would be your "network" handler, handing messages to the consumers and therefore not waiting for any answer, so it can proceed with the next request.
The consumer would be your "business" handler, one per connection but possibly multi-threaded, consuming the messages with multiple instances and answering through the Netty context of the connection it is attached to.
Another option for the consumer would be to have only one handler, still multi-threaded, with each message carrying its original Netty context, so that it can answer the right client whatever the connection.
But the difficulties will come soon:
How to deal with the answers to several requests on the client side: say the client sends 3 requests A, B and C, and the answers come back, due to the speed of the business handler, as C, A, B... You have to deal with that, and know which request each answer belongs to.
You have to make sure, in every case, that the context given as a parameter is still valid (channel active), if you don't want too many errors.
Perhaps the best way is nevertheless to handle your requests in order (as Netty does) and keep each answer's action as quick as possible; a sketch of the offloading idea is below.
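For illustration, here is one way the offloading could look with Netty's built-in support for running a handler on a separate thread pool; the handler name and echo logic are made up, not a definitive implementation:

object OffloadExample {
  import io.netty.channel.{ChannelHandlerContext, SimpleChannelInboundHandler}
  import io.netty.util.concurrent.DefaultEventExecutorGroup

  // the "business" pool: consumers running apart from the network event loop
  val businessGroup = new DefaultEventExecutorGroup(16) // 16 threads, illustrative

  class BusinessHandler extends SimpleChannelInboundHandler[String] {
    override def channelRead0(ctx: ChannelHandlerContext, request: String): Unit = {
      val response = process(request) // potentially slow work, now off the event loop
      if (ctx.channel.isActive)       // guard: the connection may have closed
        ctx.writeAndFlush(response)
    }
    private def process(request: String): String = s"echo: $request" // stand-in logic
  }

  // when assembling the pipeline:
  // pipeline.addLast(decoder, encoder)                   // fast work on the I/O loop
  // pipeline.addLast(businessGroup, new BusinessHandler) // slow work on the business pool
}

One useful property of this approach: Netty pins each channel to a single thread of the EventExecutorGroup, so the events of one connection are still handled in order, which sidesteps the C, A, B reordering problem for a single connection while keeping slow work off the I/O loop.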