Proper REST Service Structure For A Q&A Service - rest

I'm creating services that issue a set of questions for user authentication (e.g. how old is your pet) and verify the answers, which must be submitted within 2 minutes of the questions being presented. The kicker is that formulating the questions is a long process, so I'd like a separate call to start it asynchronously.
...call #1
POST /users/{user}/challengerequest -- creates the request, takes a few minutes, and returns a request ID
...then polls call #2
GET /users/{user}/challengerequest/{requestID} -- returns the actual questions, the 2 minute timer starts
...call #3
POST /users/{user}/challengeresponse/{requestID} -- takes the response, verifies the answers, and returns whether they are valid or not.
What I don't like about this currently:
- the first time my GET is called, a timer is kicked off and I believe in theory this is like an update which probably shouldn't be done from a GET
- these feel like transient resources and maybe are not good candidates for a RESTful implementation?
I'm interested in hearing your thoughts on implementation. What should I reevaluate? Thanks!

What I don't like about this currently: - the first time my GET is called, a timer is kicked off and I believe in theory this is like an update which probably shouldn't be done from a GET - these feel like transient resources and maybe are not good candidates for a RESTful implementation?
I agree 100%. GET should be safe (free of side effects) and idempotent. In this case it is not, because the first call starts the timer. I think using a POST to start the questions might be preferred. This also tells downstream servers not to cache the response.
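One way to act on this, as a minimal sketch rather than a definitive design: keep GET as a pure status check and move the step that reveals the questions and starts the 2-minute timer to its own POST. The class, the `attempt` sub-resource, the sample question and answer, and the injected clock are all assumptions made for illustration; none of them come from the original question.

```python
import time
import uuid

ANSWER_WINDOW_SECONDS = 120  # the 2-minute limit from the question


class ChallengeService:
    """In-memory stand-in for the calls; every name here is hypothetical."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._requests = {}  # request_id -> dict describing the challenge

    def create_request(self, user):
        # POST /users/{user}/challengerequest
        # A real service would kick off the slow question generation here.
        request_id = str(uuid.uuid4())
        self._requests[request_id] = {
            "user": user,
            "questions": ["How old is your pet?"],
            "answers": ["7"],
            "started_at": None,
        }
        return request_id

    def get_request(self, request_id):
        # GET /users/{user}/challengerequest/{requestID}
        # Safe: reports status only and never starts the timer.
        req = self._requests[request_id]
        return {"status": "ready", "started": req["started_at"] is not None}

    def start_attempt(self, request_id):
        # POST /users/{user}/challengerequest/{requestID}/attempt (assumed URI)
        # Reveals the questions and starts the clock on the first call only.
        req = self._requests[request_id]
        if req["started_at"] is None:
            req["started_at"] = self._clock()
        return req["questions"]

    def submit_response(self, request_id, answers):
        # POST /users/{user}/challengeresponse/{requestID}
        req = self._requests[request_id]
        if req["started_at"] is None:
            return False  # the questions were never presented
        elapsed = self._clock() - req["started_at"]
        return elapsed <= ANSWER_WINDOW_SECONDS and answers == req["answers"]
```

With this split, polling with GET as often as needed has no effect on the timer, and only the explicit POST changes server state.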

Related

Owin- Slow CompatibleWithModel call

I have this line of code within a request of an ApiController of an Azure Mobile App Web API:
var user = t.TrackDependency(() => context.Users.SingleOrDefault(x => x.Email == loginRequest.Id || x.Name == loginRequest.Id), "GetUser");
Here is the result from Application Insights:
We can see that while the line of code took 2613ms, the actual query call to the database took 190ms. While this is an edge case, it happens often enough that users complain about slow performance.
The thing is I have no idea where the difference could come from. Note this is not due to a cold start, the app was warm when this exact call happened.
The second line is the actual call to the database endpoint. Before that it is not database related.
PS: the graph is from Application Insights. They capture the call to the database and I add my own data through the TrackDependency method.
UPDATE
Today I got more data thanks to Application Insights sampling (great tool!).
Here are the results (this is not the exact request call instance but this is the same problem):
It clearly shows that context.Database.CompatibleWithModel(false) is the culprit. It is called by the call to InitializeDatabase of my custom implementation of IDatabaseInitializer. My custom initializer is set at Startup.
I found another unanswered question on Stack Overflow with the same issue.
Why does it take so long?
InitializeDatabase is not always called and I don't know when it is called and why.
I found another culprit:
Now we see that EntityConnection.Open is waiting on something. Are there some locks on the connection? So far the call to the database endpoint has still not been made, so we're still in Entity Framework here.
UPDATE 2
There are two issues in that post:
Why is CompatibleWithModel slow? There are many articles about startup time improvements. This is not addressed in that Stack Overflow question.
Why is EntityConnection.Open blocking? This is not related to Entity Framework but applies generally to getting a connection, which takes up to 3 seconds if one has not been requested within a 5-minute window. I raised that problem in a separate post.
Hence there are no more open questions in this post, and it could be deleted, but it may still be useful as an analysis of tracking down lost time in Web API calls.

REST API: How to deal with processing logic

I read (among others) the following blog about API design: https://www.thoughtworks.com/insights/blog/rest-api-design-resource-modeling. It helped me to better understand a lot of aspects, but I have one question remaining:
How do I deal with functionality that processes some data and gives a response directly? Think of verbs like translate, calculate or enrich. Which noun should they have, and should they be called by GET, PUT or POST?
P.S. If it should be GET, how do I deal with the maximum length of a GET request?
This is really a discussion about naming more so than functionality. It's very much possible to have processed logic in your API, you just need to be careful about naming it.
Imaginary API time. It's got this resource: /v1/probe/{ID} and it responds to GET, POST, and DELETE.
Let's say we want to launch our probes out, and then want the probe to give us back the calculated flux variation of something it's observing (totally made up thing). While it isn't a real thing, let's say that this has to be calculated on the fly. One of my intrepid teammates decides to plunk the calculation at GET /v1/probe/1324/calculateflux.
If we're following real REST-ful practices... Oops. Suddenly we're not dealing with a noun, are we? If we have GET /v1/probe/1324/calculateflux we've broken RESTful practices because we're now asking for a verb - calculateflux.
So, how do we deal with this?
You'll want to reconsider the name calculateflux. That's no good - it doesn't name a resource on the probe. In this case, /v1/probe/1324/fluxvalue is a better name, and /v1/probe/1324/flux works too.
Why?
RESTful APIs almost exclusively use nouns in their URIs - remember that each URI needs to describe a specific thing you can GET, POST, PUT, DELETE, or whatever. That means that any time there is a processed value, we should give the resource the name of the processed (or calculated) value. This way, we remain RESTful by always serving current data (we can re-calculate the flux value any time) and we haven't changed the state of the probe (we didn't save any values using GET).
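As a rough sketch of the naming point, assuming a tiny hand-rolled router (the readings, the `calculate_flux` formula, and the `handle` function are all made up for illustration): the calculation still runs on every request, but the URI names the value being returned rather than the action.

```python
import re

# Hypothetical probe readings; a real service would query the probe itself.
PROBE_READINGS = {"1324": [2.0, 4.0, 9.0]}


def calculate_flux(readings):
    # Made-up calculation, mirroring the made-up "flux variation" above.
    return max(readings) - min(readings)


ROUTES = [
    # The URI names the thing returned (flux), not the action (calculateflux).
    (re.compile(r"^/v1/probe/(?P<probe_id>\d+)/flux$"), "GET"),
]


def handle(method, path):
    """Return a (status code, body) pair for the given request."""
    for pattern, allowed in ROUTES:
        match = pattern.match(path)
        if match is None:
            continue
        if method != allowed:
            return 405, None
        readings = PROBE_READINGS.get(match.group("probe_id"))
        if readings is None:
            return 404, None
        # Computed on every GET; nothing is stored, so the probe is unchanged.
        return 200, {"flux": calculate_flux(readings)}
    return 404, None


print(handle("GET", "/v1/probe/1324/flux"))           # (200, {'flux': 7.0})
print(handle("GET", "/v1/probe/1324/calculateflux"))  # (404, None)
```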
Well, I can tell you what I know about this.
GET // Returns data, and only returns
DELETE // Deletes
POST // Sends information to be processed on the server
PUT // Updates existing information
This scheme is from the Laravel framework. It would be worth reading the link in the reference below.
Ref:
https://rafaell-lycan.com/2015/construindo-restful-api-laravel-parte-1/
You should start with the following process:
Identify the resources (nouns) in your system.
They should all respond to GET.
Let's take your translation example. You could decide that every word in the source language is a resource. This would give:
http://example.com/translations/en-fr/hello
Which might return:
Content-Type: text/plain
Content-Language: fr
bonjour
If your processes are long-running, you should create a request queue that clients can POST to, and provide them with another (new) resource that they can query to see if the process has completed.

How to design a RESTful api for slow-generated resources or job status?

I am trying to design a RESTful api for a service that accepts a bunch of parameters and generates a large result. This is my first RESTful project. One tricky part is that the server needs some time (up to a few minutes) to generate the result. My current thought is to use POST to send in all the parameters. The server response can be a job id.
I can then retrieve the result using GET /result/{job_id}. The problem is that the result is not available for the first few minutes. Maybe I can return "resource unavailable" at the beginning and the result once it is available. But this feels odd and adds some odd logic to the client.
An alternative is to retrieve the job status GET /job_status/{job_id}, where the result might be running/error/done, similar to the http status code, where done status also comes with a result_id. Then I can retrieve it with GET /result/{result_id}.
Both cases conflict somewhat with what I have read about GET. In both cases, the GET result is not fixed and not cacheable at the beginning while the job is still running. On the other hand, I read somewhere that it is OK to do things like GET /currentWeather or GET /currentTime, which are similar to at least my second approach. So my questions are:
Which one is better? Why?
Should I use GET for such situation?
Or neither one is OK? What would you do?
Thank you very much.
Should I use GET?
For long-running operations, one approach is to set the Expires or max-age headers on your response properly. An example is described in "Best practice for implementing long-running searches with REST".
But I recommend The RESTy Long-op Protocol for your case.
Your solution will be more robust and more client friendly.
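To make the job-status variant from the question concrete, here is a minimal in-memory sketch. The `/jobs` submission path, the 202 status code, the response field names, and the `Cache-Control` value are assumptions for illustration, not something prescribed by the question or by the links above; the background worker that would call `finish` is not shown.

```python
import uuid


class JobService:
    """In-memory sketch; paths, field names, and status codes are assumed."""

    def __init__(self):
        self._jobs = {}
        self._results = {}

    def submit(self, params):
        # POST /jobs -> 202 Accepted, with the URI the client should poll.
        job_id = str(uuid.uuid4())
        self._jobs[job_id] = {"status": "running", "params": params}
        return 202, {"Location": f"/job_status/{job_id}"}, job_id

    def finish(self, job_id, result):
        # Called by the background worker (not shown) when the work is done.
        result_id = str(uuid.uuid4())
        self._results[result_id] = result
        self._jobs[job_id].update(status="done", result_id=result_id)

    def get_status(self, job_id):
        # GET /job_status/{job_id} is safe: it only reports the current state.
        job = self._jobs.get(job_id)
        if job is None:
            return 404, {}, None
        body = {"status": job["status"]}
        if job["status"] == "done":
            body["result"] = f"/result/{job['result_id']}"
            return 200, {}, body
        # While running, tell caches to revalidate before reusing this response.
        return 200, {"Cache-Control": "no-cache"}, body

    def get_result(self, result_id):
        # GET /result/{result_id}
        if result_id not in self._results:
            return 404, {}, None
        return 200, {}, self._results[result_id]
```

The status resource changes over time, but each GET only reads it, which is the same situation as GET /currentTime in the question.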

Is it valid GET method in REST, that returns some set of data, but after a while, the dataset can be modified?

I was reading about "idempotent methods", but did not quite get it.
1.1. So the GET method must be idempotent.
1.2. An idempotent HTTP method is a HTTP method that can be called many times without different outcomes. It would not matter if the method is called only once, or ten times over. The result should be the same. (Source: http://restcookbook.com/HTTP%20Methods/idempotency/)
Okay, that was theory. Now specific case:
2.1. I have exposed a GET method, that return all records in DB.
2.2. Somebody called this method and it returned 1000 results.
2.3. The application is running, so in a few minutes I have 1001 records in the DB.
2.4. Somebody (maybe the same caller) called this method again and now it returned 1001 results.
Is my GET method still idempotent, or should it be changed to POST?
Yes, it is still idempotent, because the GET is not changing the resource. That's the distinction.
Consider:
GET /currenttime
Perfectly valid request, idempotent, but you'll get a new answer pretty much every time you call it.
An idempotent HTTP method is a HTTP method that can be called many times without different outcomes. It would not matter if the method is called only once, or ten times over. The result should be the same.
The opening sentence is somewhat unfortunate but the rest explains it pretty clearly.
The key point to note here is that the outcome may not be altered by any number of subsequent calls of the same method. The state of the resource, a representation of which you're GETting, is free to be changed by other means though.
In your example it isn't the GET request that's changing the state of the database. It's an external factor.
Is my GET method still idempotent, or should it be changed to POST?
Yes, the way you describe it, it's both idempotent and safe as it does not modify the state of your resources and it will always yield the same result provided that other parties do not alter the resource state between calls. Calling it does not affect the result of calling it.
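The distinction can be illustrated with a small sketch, assuming a hypothetical `RecordStore` class that stands in for the database: repeating the GET never changes what is stored, while a separate write by someone else does.

```python
class RecordStore:
    """Hypothetical stand-in for the database behind the GET endpoint."""

    def __init__(self, count):
        self._records = list(range(count))

    def get_all(self):
        # GET: returns a copy and changes nothing on the server.
        return list(self._records)

    def add(self, record):
        # A write by some other party: this is what changes the state.
        self._records.append(record)


store = RecordStore(1000)
first = store.get_all()
second = store.get_all()   # repeating the GET changes nothing
store.add(1000)            # an external write, not caused by any GET
third = store.get_all()

print(len(first), len(second), len(third))  # 1000 1000 1001
```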

Why use two step approach to deleting multiple items with REST

We all know the 'standard' way of deleting a single item via REST is to send a single DELETE request to a URI example.com/Items/666. Grand, let's move on to deleting many at once. As we do not require atomic deleting (or true transaction, ie all or nothing) we could just tell the client 'tough luck, make many requests' but that's not very nice is it. So we need a way to allow a client to request many 'Items' be deleted at once.
From my understanding, the 'typical' solution to this problem is a 'two step' approach. First the client POSTs a list of item IDs and is returned a URI such as example.com/Items/Collection/1. Once that collection is created, they call DELETE on it.
Now, I see that this works just fine, except to me, it is a bad solution. Firstly, you are forcing the client to make two requests to accommodate the server. Secondly, 'I thought DELETE was supposed to delete an Item?', shouldn't calling DELETE on this URI effectively cancel the transaction (it's not a true transaction though), how would we even cancel it? Really would be better if there was some form 'EXECUTE' action, but I can't rock the boat that much. It also forces the server to have to consider 'the JSON that was POSTed looks more like a request to modify these Items, but the request was DELETE... so I guess I will delete them'. This approach also starts to impose a sort of state on the client/server, not a true state I will admit, but it is sort of.
In my opinion, a better solution would be to simply call DELETE on example.com/Items (or maybe example.com/Items/Collection to imply this is a multiple delete) and pass JSON data containing a list of IDs that you wish to delete. As far as I can see, this basically solves all the problems the first method had. It is easier to use as a client, reduces the work the server has to do, is truly stateless, is more semantic.
I would really appreciate feedback on this: am I missing something about REST that makes my solution to this problem unrealistic? I would also appreciate links to articles, especially if they compare these two methods; I am aware this is not normally approved of for SO. I need to be able to disprove that only the first method is truly RESTful, and prove that the second approach is a viable solution. Of course, if I am barking up the wrong tree do tell me.
I have spent the last week or so reading a fair bit on REST, and to the best of my understanding, it would be wrong to describe either of these solutions as 'RESTful', rather you should say that 'neither solution goes against what REST means'.
The short answer is simply that REST, as laid out in Roy Fielding's dissertation (see chapter 5), does not cover the topic of how to go about deleting resources, singular or multiple, in a REST manner. That's right, there is no 'correct RESTful way to delete a resource'... well, not quite.
REST itself does not define how to delete a resource, but it does define that whatever protocol you are using (remember that REST is not a protocol) will dictate how to perform these actions. The protocol will usually be HTTP; 'usually' being the key word because, as Fielding will point out, REST is not synonymous with HTTP.
So we look to HTTP to say which method is 'right'. Sadly, as far as HTTP is concerned, both approaches are viable. Yes, 'viable'. HTTP will allow a client to send a POST request with a payload (to create a collection resource), and then call a DELETE method on this new collection to delete the resources; it will also allow you to send the data within the payload of a single DELETE method to delete the list of resources. HTTP is simply the medium by which you send requests to the server; it is up to the server to respond appropriately. To me, the HTTP protocol seems to be rather open to interpretation in places, but it does seem to lay down fairly clear guidelines for what actions mean, how they should be dealt with and what response should be given; it is just a 'you should do this' rather than a 'you must do this', but perhaps I am being a little pedantic on the wording.
Some people would argue that the 'two stage' approach cannot possibly be 'REST' as the server has to store a 'state' for the client to perform the second action. This is simply a misunderstanding. It must be understood that neither the client nor the server is storing any 'state' information about the other between the list being POSTed and then subsequently being DELETEd. Yes, the list must have been created before it can be deleted, but the server does not remember that it was client alpha that made this list (such an approach would allow the client to simply call 'DELETE' as the next request and have the server remember to use that list, which would not be stateless at all). As such, the client must tell the server to DELETE that specific list, the list it was given a specific URI for. If the client attempted to DELETE a collection list that did not already exist it would simply be told 'the resource can not be found' (the classic 404 error most likely). If you wish to claim that this two step approach does maintain a state, you must also claim that to simply GET a URI requires a state, as the URI must first exist. To claim that there is this 'state' persisting is misunderstanding what 'state' means. And as further 'proof' that such a two stage approach is indeed stateless, you could quite happily have client alpha POST the list and later have client beta (without having had any communication with the other client) call DELETE on the list resource.
I think it can stand rather self evident that the second option, of just sending the list in the payload of the DELETE request, is stateless. All the information required to complete the request is stored completely within the one request.
It could be argued though that the DELETE action should only be called on a 'tangible' resource, but in doing so you are blatantly ignoring the REpresentational part of REST; it's in the name! It is the representational aspect that 'permits' URIs such as http://example.com/myService/timeNow, a URI that when 'got' will return, dynamically, the current time, without having to load some file or read from some database. It is a key concept that the URIs are not mapping directly to some 'tangible' piece of data.
There is however one aspect of that stateless nature that must be questioned. As Fielding describes the 'client-stateless-server' in section 5.1.3, he states:
We next add a constraint to the client-server interaction: communication must
be stateless in nature, as in the client-stateless-server (CSS) style of
Section 3.4.3 (Figure 5-3), such that each request from client to server must
contain all of the information necessary to understand the request, and
cannot take advantage of any stored context on the server. Session state is
therefore kept entirely on the client.
The key part here in my eyes is "cannot take advantage of any stored context on the server". Now I will grant you that 'context' is somewhat open for interpretation. But I find it hard to see how storing a list (either in memory or on disk) that is needed to give the later request any useful meaning would not violate this 'rule'. Without this 'list context' the DELETE operation makes no sense. As such, I can only conclude that making use of a two step approach to perform an action such as deleting multiple resources cannot and should not be considered 'RESTful'.
I also begrudge somewhat the effort that has had to be put into finding arguments either way for this. The Internet at large seems to have become swept up with the idea that the two step approach is the 'RESTful' way of doing such actions, with the reasoning 'it is the RESTful way to do it'. If you step back for a moment from what everybody else is doing, you will see that either approach requires sending the same list, so it can be ignored from the argument. Both approaches are 'representational' and 'stateless'. The only real difference is that for some reason one approach has decided to require two requests. These two requests then come with follow up questions, such as 'how long do you keep that data for' and 'how does a client tell a server that it no longer wants this collection, but wishes to keep the actual resources it refers to'.
So I am, to a point, answering my question with the same question, 'Why would you even consider a two step approach?'
IMO:
HTTP DELETE on an existing collection to delete all of its members seems fine. Creating the collection just to delete all of its members sounds odd. As you yourself suggest, just pass the IDs of the items to be deleted using JSON (or any other payload format). I think that the server should try to make multiple deletes an internal transaction though.
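A minimal sketch of that single-request approach, assuming a JSON body of the form {"ids": [...]} and a plain dict as the data store (the function name, payload shape, and status codes are illustrative choices, not anything specified above). It treats the request as all-or-nothing, in the spirit of the internal transaction suggested here. Some servers and intermediaries may ignore or reject a body on DELETE, so that is worth checking in your own stack.

```python
import json


def delete_items(items, body):
    """Handle DELETE /Items with a JSON body such as {"ids": [1, 2]}.

    `items` is a dict standing in for the data store; the payload shape
    and the status codes are assumptions made for this sketch.
    """
    try:
        ids = json.loads(body)["ids"]
    except (ValueError, KeyError, TypeError):
        return 400, {"error": "expected a JSON object with an 'ids' list"}
    if not isinstance(ids, list) or not all(isinstance(i, int) for i in ids):
        return 400, {"error": "'ids' must be a list of integers"}

    missing = [i for i in ids if i not in items]
    if missing:
        # All-or-nothing: refuse the whole request and delete nothing.
        return 404, {"missing": missing}

    for i in ids:
        items.pop(i, None)  # pop tolerates an ID listed twice
    return 204, None


store = {1: "a", 2: "b", 3: "c"}
print(delete_items(store, '{"ids": [1, 9]}'))  # (404, {'missing': [9]})
print(delete_items(store, '{"ids": [1, 2]}'))  # (204, None)
print(store)                                   # {3: 'c'}
```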
I would argue that HTTP already provides a method of deleting multiple items in the form of persistent connections and pipelining. At the HTTP protocol level it is absolutely fine to request idempotent methods like DELETE in a pipelined way - that is, send all the DELETE requests at once on a single connection and wait for all the responses.
This may be problematic for an AJAX client running in a browser since few browsers have pipelining support enabled by default. This is not the fault of HTTP, though, it is the fault of those specific clients.