HTTP GET setting internal data, still considered RESTful? - rest

As I understand it, a HTTP GET request should return the requested data and is considered RESTful if safe (read-only) and idempotent (no side effects).
However, I would like to implement a service to show new items since last visit using an URI of /items/userid/new, can that in any way be RESTful?
When returning the data, the items sent in response to the GET request should be marked as read in order to keep track of what is new. Marking these items will violate both the safe requirement and the idempotent requirement.
Does that mean .../new is never considered RESTful?

Very interesting question. I think the answer would be "depends on the implementation" since there are different solutions that would all meet the business requirement.
If every visit to the URL /items/userid/new by the same user modifies the DB records, then it's not a "safe method" and would not fit the commonly accepted REST pattern.
If you're filtering for the new items on the view, but without any corresponding DB changes with each GET call, then it would certainly be RESTful. For example:
/items/userid/new?lastvisit=2015-12-31
or
/items/userid/new?last_id=12345
or storing the list of viewed items on the client side would certainly qualify.

Normally we decide the restful service and associate protocol methods to it. In case of HTTP it is we use GET,POST,PUT...etc. Normally Get is used get some application state that will rendered with hyper text.
Here /items/userid/new can be new service that returns the list. Any modification on this list will be subsequent request with different method association.

Keeping track of already visited items should not be the server's responsibility. The client should rember already visited items. That leads to better scalability and loosy coupled systems as the server must not handle these parameters for e.g. thousand of clients.
To deliver fresh items you could, as Cahit already touched, provide a feed that includes all items for either a time frame or a fixed number of items. Generally an archived feed is a good choice.
As a result of this a client can crawl your feed on its own to retrieve all new items.

Interesting question indeed.
For a plain url, I would not do it. One reason for example: I remember sending a testmail with a unique link pointing to our own webserver. While passing through the ISP's mailserver, a request was made with that url to our server.. This was done by a spamfilter to check if the url existed.. So, if a mail was sent to the customer saying 'look here for the new items', those would be gone even before the mail was received.
And you certainly wouldn't want that plain url to appear in search results somewhere, or be used for preloading pages.
On the other hand, if the request was made during login (or using a cookie), it could be ok.
Maybe that's what SO does when we view our statistics/responses.

Related

REST endpoint for complex actions

I have a REST API which serves data from the database to the frontend React app and to Android app.
The API have multiple common endpoints for each model:
- GET /model/<id> to retrieve a single object
- POST /model to create
- PATCH /model/<id> to update a single model
- GET /model to list objects
- DELETE /model/<id> to delete an object
Currently I'm developing an Android app and I find such scheme to make me do many extra requests to the API. For example, each Order object has a user_creator entry. So, if I want to delete all the orders created by specified user I need to
1) List all users GET /user
2) Select the one I need
3) List all orders he created GET /order?user=user_id
4) Select the order I want to delete
5) Delete the order DELETE /order/<id>
I'm wondering whether this will be okay to add several endpoints like GET /order/delete?user=user_id. By doing this I can get rid of action 4 and 5. And all the filtering will be done at the backend. However it seems to me as a bad architecture solution because all the APIs I've used before don't have such methods and all the filtering, sorting and other "beautifying" stuff is usually at the API user side, not the backend.
In your answer please offer a solution that is the best in your opinion for this problem and explain your point of view at least in brief, so I can learn from it
Taking your problem is in isolation:
You have an Order collection and a User collection
User 1..* Orders
You want to delete all orders for a given user ID
I would use the following URI:
// delete all orders for a given user
POST /users/:id/orders/delete
Naturally, this shows the relationship between Users & Orders and is self-explanatory that you are only dealing with orders associated with a particular user. Also, given the operation will result in side-effects on the server then you should POST rather than GET (reading a resource should never change the server). The same logic could be used to create an endpoint for pulling only user orders e.g.
// get all orders for a given user
GET /users/:id/orders
The application domain of HTTP is the transfer of documents over a network. Your "REST API" is a facade that acts like a document store, and performs useful work as a side effect of transferring documents. See Jim Webber (2011).
So the basic idioms are that we post a document, or we send a bunch of edits to an existing document, and the server interprets those changes and does something useful.
So a simple protocol, based on the existing remote authoring semantics, might look like
GET /orders?user=user_id
Make local edits to the representation of that list provided by the server
PUT /orders?user=user_id
The semantics of how to do that are something that needs to be understood by both ends of the exchange. Maybe you remove unwanted items from the list? Maybe there is a status entry for each record in the list, and you change the status from active to expired.
On the web, instead of remote authoring semantics we tend to instead use form submissions. You get a blank form from somewhere, you fill it out yourself, you post it to the indicated inbox, and the person responsible for processing that inbox does the work.
So we load a blank form into our browser, and we make our changes to it, and then we post it to the resource listed in the form.
GET /the-blank-form?user=user_id
Make changes in the form...
POST ????
What should the target-uri be? The web browser doesn't care; it is just going to submit the form to whatever target is specified by the representation it received. One answer might be to send it right back where we got it:
POST /the-blank-form?user=user_id
And that works fine (as long as you manage the metadata correctly). Another possibility is to instead send the changes to the resource you expect to reflect those changes:
POST /orders?user=user_id
and it turns out that works fine too. HTTP has interesting cache invalidation semantics built into the specification, so we can make sure the client's stale copy or the orders collection resource is invalidated by using that same resource as the target of the POST call.
Currently my API satisfies the table from the bottom of the REST, so, any extra endpoint will break it. Will it be fatal or not, that's the question.
No, it will be fine -- just add/extend a POST handler on the appropriate resource to handle the new semantics.
Longer answer: the table in wikipedia is a good representation of common practices; but common practices aren't quite on the mark. Part of the problem is that REST includes a uniform interface. Among other things, that means that all resources understand the same message semantics. The notion of "collection resources" vs "member resources" doesn't exist in REST -- the semantics are the same for both.
Another way of saying this is that a general-purpose component never knows if the resource it is talking to is a collection or a member. All unsafe methods (POST/PUT/PATCH/DELETE/etc) imply invalidation of the representations of the target-uri.
Now POST, as it happens, means "do something that hasn't been standardized" -- see Fielding 2009. It's the method that has the fewest semantic constraints.
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. -- RFC 7231
It's perfectly fine for a POST handler to branch based on the contents of the request payload; if you see X, create something, if you see Y delete something else. It's analogous to having two different web forms, with different semantics, that submit to the same target resource.

REST API design for resource modification: catch all POST vs multiple endpoints

I'm trying to figure out best or common practices for API design.
My concern is basically this:
PUT /users/:id
In my view this endpoint could by used for a wide array of functions.
I would use it to change the user name or profile, but what about ex, resetting a password?
From a "model" point of view, that could be flag, a property of the user, so it would "work" to send a modification.
But I would expect more something like
POST /users/:id/reset_password
But that means that almost for each modification I could create a different endpoint according to the meaning of the modification, i.e
POST /users/:id/enable
POST /users/:id/birthday
...
or even
GET /user/:id/birthday
compared to simply
GET /users/:id
So basically I don't understand when to stop using a single POST/GET and creating instead different endpoints.
It looks to me as a simple matter of choice, I just want to know if there is some standard way of doing this or some guideline. After reading and looking at example I'm still not really sure.
Disclaimer: In a lot of cases, people ask about REST when what they really want is an HTTP compliant RPC design with pretty URLs. In what follows, I'm answering about REST.
In my view this endpoint could by used for a wide array of functions. I would use it to change the user name or profile, but what about ex, resetting a password?
Sure, why not?
I don't understand when to stop using a single POST/GET and creating instead different endpoints.
A really good starting point is Jim Webber's talk Domain Driven Design for RESTful systems.
First key idea - your resources are not your domain model entities. Your REST API is really a facade in front of your domain model, which supports the illusion that you are just a website.
So your resources are analogous to documents that represent information. The URI identifies the document.
Second key idea - that URI is used by clients to cache representations of the resource, so that we don't need to send requests back to the server all the time. Instead, we have built into HTTP a bunch of standard ways for communicating caching meta data from the server to the client.
Critical to that is the rule for cache invalidation: a successful unsafe request invalidates previously cached representations of the same resource (ie, the same URI).
So the general rule is, if the client is going to do something that will modify a resource they have already cached, then we want the modification request to go to that same URI.
Your REST API is a facade to make your domain model look like a web site. So if we think about how we might build a web site to do the same thing, it can give us insights to how we arrange our resources.
So to borrow your example, we might have a web page representation of the user. If we were going to allow the client to modify that page, then we might think through a bunch of use cases (enable, change birthday, change name, reset password). For each of these supported cases, we would have a link to a task-specific form. Each of those forms would have fields allowing the client to describe the change, and a url in the form action to decide where the form gets submitted.
Since what the client is trying to achieve is to modify the profile page itself, we would have each of those forms submit back to the profile page URI, so that the client would know to invalidate the previously cached representations if the request were successful.
So your resource identifiers might look like:
/users/:id
/users/:id/forms/enable
/users/:id/forms/changeName
/users/:id/forms/changeBirthday
/users/:id/forms/resetPassword
Where each of the forms submits its information to /users/:id.
That does mean, in your implementation, you are probably going to end up with a lot of different requests routed to the same handler, and so you may need to disambiguate them there.

How to make initial request for nested resource from self describing REST API

Background:
I have a single page application that pulls data from a REST API. The API is designed such that the only URL necessary is the API root, ie https://example.com/api which provides URLs for other resources so that the client doesn't need to have any knowledge of how they are constructed.
API Design
The API has three main classes of data:
Module: Top level container
Category: A sub-container in a specific module
Resource: An item in a category
SPA Design
The app consuming the API has views for listing modules, viewing a particular module's details, and viewing a particular resource. The way the app works is it keeps all loaded data in a store. This store is persistent until the page is closed/refreshed.
The Problem:
My question is, if the user has navigated to a resource's detail view (example.com/resources/1/) and then they refresh the page, how do I load that particular resource without knowing its URL for the API?
Potential Solutions:
Hardcode URLs
Hardcoding the URLs would be fairly straightforward since I control both the API and the client, but I would really prefer to stick to a self describing API where the client doesn't need to know about the URLs.
Recursive Fetch
I could fetch the data recursively. For example, if the user requests a Resource with a particular ID, I could perform the following steps.
Fetch all the modules.
For each module, fetch its categories
Find the category that contains the requested resource and fetch the requested resource's details.
My concern with this is that I would be making a lot of unnecessary requests. If we have 100 modules but the user is only ever going to view 1 of them, we still make 100 requests to get the categories in each module.
Descriptive URLs
If I nested URLs like example.com/modules/123/categories/456/resources/789/, then I could do 3 simple lookups since I could avoid searching through the received data. The issue with this approach is that the URLs quickly become unwieldy, especially if I also wanted to include a slug for each resource. However, since this approach allows me to avoid hardcoding URLs and avoid making unnecessary network requests, it is currently my preferred option.
Notes:
I control both the client application and the API, so I can make changes in either place.
I am open to redesigning the API if necessary
Any ideas for how to address this issue would by greatly appreciated.
Expanding on my comment in an answer.
I think this is a very common problem and one I've struggled with myself. I don't think Nicholas Shanks's answer truly solves this.
This section in particular I take some issues with:
The user reloading example.com/resources/1/ is simply re-affirming the current application state, and the client does not need to do any API traversal to get back here.
Your client application should know the current URL, but that URL is saved on the client machine (in RAM, or disk cache, or a history file, etc.)
The implication I take from this, is that urls on your application are only valid for the life-time of the history file or disk cache, and cannot be shared with other users.
If that is good enough for your use-case, then this is probably the simplest, but I feel that there's a lot of cases where this is not true. The most obvious one indeed being the ability to share urls from the frontend-application.
To solve this, I would sum the issue up as:
You need to be able to statelessly map a url from a frontend to an API
The simplest, but incorrect way might simply be to map a API url such as:
http://api.example.org/resources/1
Directly to url such as:
http://frontend.example.org/resources/1
The issue I have with this, is that there's an implication that /resource/1 is taken from the frontend url and just added on to the api url. This is not something we're supposed to do, because it means we can't really evolve this api. If the server decides to link to a different server for example, the urls break.
Another option is that you generate uris such as:
http://frontend.example.org/http://api.example.org/resources/1
http://frontend.example.org/?uri=http://api.example.org/resources/1
I personally don't think this is too crazy. It does mean that the frontend needs to be able to load that uri and figure out what 'view' to load for the backend uri.
A third possibility is that you add another api that can:
Generate short strings that the frontend can use as unique ids (http://frontend.example.org/[short-string])
This api would return some document to the frontend that informs what view to load and what the (last known) API uri was.
None of these ideas sound super great to me. I want a better solution to this problem, but these are things I came up with as I was contemplating this.
Super curious if there's better ideas out there!
The current URL that the user is viewing, and the steps it took to get to the current place, are both application state (in the HATEOAS sense).
The user reloading example.com/resources/1/ is simply re-affirming the current application state, and the client does not need to do any API traversal to get back here.
Your client application should know the current URL, but that URL is saved on the client machine (in RAM, or disk cache, or a history file, etc.)
The starting point of the API is (well, can be) compiled-in to your client. Commpiled-in URLs are what couple the client to the server, not URLs that the user has visited during use of the client, including the current URL.
Your question, "For example, if the user requests a Resource with a particular ID", indicates that you have not grasped the decoupling that HATEOAS provides.
The user NEVER asks for a resource with such-and-such an ID. The user can click a link to get a query form, and then the server provides a form that generates requests to /collection/{id}. (In HTML, this is only possible for query strings, not path components, but other hypermedia formats don't have this limitation).
When the user submits the form with the ID number in the field, the client can build the request URL from the data supplied by the server+user.

Does it violate the RESTful when I write stuff to the server on a GET call?

I would like to record user actions on my website, not only on POST requests, but on GET requests as well. For example, suppose the user tries to search for a city with the following GET request:
/search_city?name=greenville
This request would return a list of cities with the name "greenville". I'd also like to save this keyword to the server, as the "search history" for a user. I'm planning to just do the save this information during the processing of the GET call.
Is this a violation to RESTful principles? If yes, how do I do this the right way?
I see this kind of audit logging as an invisible side-effect. If the next person to call
/search_city?name=greenville
still gets the same answer then your GET is valid. A similar case would be some kind of cache building, the caller of GET doesn't (need to) know that you're doing some extra work.
Focus on the formal API - send this request get this response.
If there's some resource available in the API where the user search history is available, then it's not OK to do that, since your GET request has a visible side-effect. For instance, a client caching responses is under no obligation to know that any resource changed because he did a GET request to anything else. I think the only way to do this and stay compliant is to explicitly mark the side-effected resource as uncacheable.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
If that's kept only for internal usage, I guess it's fine to do it that way, but I still recommend against it.

When should I use GET or POST method? What's the difference between them?

What's the difference when using GET or POST method? Which one is more secure? What are (dis)advantages of each of them?
(similar question)
It's not a matter of security. The HTTP protocol defines GET-type requests as being idempotent, while POSTs may have side effects. In plain English, that means that GET is used for viewing something, without changing it, while POST is used for changing something. For example, a search page should use GET, while a form that changes your password should use POST.
Also, note that PHP confuses the concepts a bit. A POST request gets input from the query string and through the request body. A GET request just gets input from the query string. So a POST request is a superset of a GET request; you can use $_GET in a POST request, and it may even make sense to have parameters with the same name in $_POST and $_GET that mean different things.
For example, let's say you have a form for editing an article. The article-id may be in the query string (and, so, available through $_GET['id']), but let's say that you want to change the article-id. The new id may then be present in the request body ($_POST['id']). OK, perhaps that's not the best example, but I hope it illustrates the difference between the two.
When the user enters information in a form and clicks Submit , there are two ways the information can be sent from the browser to the server: in the URL, or within the body of the HTTP request.
The GET method, which was used in the example earlier, appends name/value pairs to the URL. Unfortunately, the length of a URL is limited, so this method only works if there are only a few parameters. The URL could be truncated if the form uses a large number of parameters, or if the parameters contain large amounts of data. Also, parameters passed on the URL are visible in the address field of the browser not the best place for a password to be displayed.
The alternative to the GET method is the POST method. This method packages the name/value pairs inside the body of the HTTP request, which makes for a cleaner URL and imposes no size limitations on the forms output. It is also more secure.
The best answer was the first one.
You are using:
GET when you want to retrieve data (GET DATA).
POST when you want to send data (POST DATA).
There are two common "security" implications to using GET. Since data appears in the URL string its possible someone looking over your shoulder at Address Bar/URL may be able to view something they should not be privy to such as a session cookie that could potentially be used to hijack your session. Keep in mind everyone has camera phones.
The other security implication of GET has to do with GET variables being logged to most web servers access log as part of the requesting URL. Depending on the situation, regulatory climate and general sensitivity of the data this can potentially raise concerns.
Some clients/firewalls/IDS systems may frown upon GET requests containing an excessive amount of data and may therefore provide unreliable results.
POST supports advanced functionality such as support for multi-part binary input used for file uploads to web servers.
POST requires a content-length header which may increase the complexity of an application specific client implementation as the size of data submitted must be known in advance preventing a client request from being formed in an exclusively single-pass incremental mode. Perhaps a minor issue for those choosing to abuse HTTP by using it as an RPC (Remote Procedure Call) transport.
Others have already done a good job in covering the semantic differences and the "when" part of this question.
I use GET when I'm retrieving information from a URL and POST when I'm sending information to a URL.
You should use POST if there is a lot of data, or sort-of sensitive information (really sensitive stuff needs a secure connection as well).
Use GET if you want people to be able to bookmark your page, because all the data is included with the bookmark.
Just be careful of people hitting REFRESH with the GET method, because the data will be sent again every time without warning the user (POST sometimes warns the user about resending data).
This W3C document explains the use of HTTP GET and POST.
I think it is an authoritative source.
The summary is (section 1.3 of the document):
Use GET if the interaction is more like a question (i.e., it is a safe operation such as a query, read operation, or lookup).
Use POST if:
The interaction is more like an order, or
The interaction changes the state of the resource in a way that the
user would perceive (e.g., a subscription to a service), or
The user be held accountable for the results of the interaction.
Get and Post methods have nothing to do with the server technology you are using, it works the same in php, asp.net or ruby. GET and POST are part of HTTP protocol.
As mark noted, POST is more secure. POST forms are also not cached by the browser.
POST is also used to transfer large quantities of data.
The reason for using POST when making changes to data:
A web accelerator like Google Web Accelerator will click all (GET) links on a page and cache them. This is very bad if the links make changes to things.
A browser caches GET requests so even if the user clicks the link it may not send a request to the server to execute the change.
To protect your site/application against CSRF you must use POST. To completely secure your app you must then also generate a unique identifier on the server and send that along in the request.
Also, don't put sensitive information in the query string (only option with GET) because it shows up in the address bar, bookmarks and server logs.
Hopefully this explains why people say POST is 'secure'. If you are transmitting sensitive data you must use SSL.
GET and POST are HTTP methods which can achieve similar goals
GET is basically for just getting (retrieving) data, A GET should not have a body, so aside from cookies, the only place to pass info is in the URL and URLs are limited in length , GET is less secure compared to POST because data sent is part of the URL
Never use GET when sending passwords, credit card or other sensitive information!, Data is visible to everyone in the URL, Can be cached data .
GET is harmless when we are reloading or calling back button, it will be book marked, parameters remain in browser history, only ASCII characters allowed.
POST may involve anything, like storing or updating data, or ordering a product, or sending e-mail. POST method has a body.
POST method is secured for passing sensitive and confidential information to server it will not visible in query parameters in URL and parameters are not saved in browser history. There are no restrictions on data length. When we are reloading the browser should alert the user that the data are about to be re-submitted. POST method cannot be bookmarked
All or perhaps most of the answers in this question and in other questions on SO relating to GET and POST are misguided. They are technically correct and they explain the standards correctly, but in practice it's completely different. Let me explain:
GET is considered to be idempotent, but it doesn't have to be. You can pass parameters in a GET to a server script that makes permanent changes to data. Conversely, POST is considered not idempotent, but you can POST to a script that makes no changes to the server. So this is a false dichotomy and irrelevant in practice.
Further, it is a mistake to say that GET cannot harm anything if reloaded - of course it can if the script it calls and the parameters it passes are making a permanent change (like deleting data for example). And so can POST!
Now, we know that POST is (by far) more secure because it doesn't expose the parameters being passed, and it is not cached. Plus you can pass more data with POST and it also gives you a clean, non-confusing URL. And it does everything that GET can do. So it is simply better. At least in production.
So in practice, when should you use GET vs. POST? I use GET during development so I can see and tweak the parameters I am passing. I use it to quickly try different values (to test conditions for example) or even different parameters. I can do that without having to build a form and having to modify it if I need a different set of parameters. I simply edit the URL in my browser as needed.
Once development is done, or at least stable, I switch everything to POST.
If you can think of any technical reason that this is incorrect, I would be very happy to learn.
GET method is use to send the less sensitive data whereas POST method is use to send the sensitive data.
Using the POST method you can send large amount of data compared to GET method.
Data sent by GET method is visible in browser header bar whereas data send by POST method is invisible.
Use GET method if you want to retrieve the resources from URL. You could always see the last page if you hit the back button of your browser, and it could be bookmarked, so it is not as secure as POST method.
Use POST method if you want to 'submit' something to the URL. For example you want to create a google account and you may need to fill in all the detailed information, then you hit 'submit' button (POST method is called here), once you submit successfully, and try to hit back button of your browser, you will get error or a new blank form, instead of last page with filled form.
I find this list pretty helpful
GET
GET requests can be cached
GET requests remain in the browser history
GET requests can be bookmarked
GET requests should (almost) never be used when dealing with sensitive data
GET requests have length restrictions
GET requests should be used only to retrieve data
POST
POST requests are not cached
POST requests do not remain in the browser history
POST requests cannot be bookmarked
POST requests have no restrictions on data length
The GET method:
It is used only for sending 256 character date
When using this method, the information can be seen on the browser
It is the default method used by forms
It is not so secured.
The POST method:
It is used for sending unlimited data.
With this method, the information cannot be seen on the browser
You can explicitly mention the POST method
It is more secured than the GET method
It provides more advanced features