Handling RESTful representation structure difference between POST and GET - rest

I'm designing a REST API and despite trawling a number of best practice guides I can't find much relating to the best practice of handling the disparity between representation structure needed for a POST vs the same representation structure returned from a GET.
GET for a dummy user representation might look like this:
{
"id": 1234,
"created": "2012-04-23T18:25:43.511Z",
"username": "johndoe#example.com",
"name": "John Doe"
}
However, POST for the same dummy user representation cannot specify certain properties (namely the id and created):
{
"username": "johndoe#example.com",
"name": "John Doe"
}
Obviously this is an overly simplified example but given that the user cannot specify certain fields (and it might not always be obvious which ones are pertinent to the applied method) is it best practice to create separate representations for each or to expect the most complete version and handle the data disparity transparently on the server?
Despite the apparent ease of having a single representation and handling the disparity server side I am worried that this would be a bad experience for a user if it wasn't clear which values can be specified (or altered using PUT for example). If the tendency is to create separate representations is there a naming convention to apply to the representation definition?
e.g. i_user for incoming user and o_user for outgoing user. Or user_full and user_min or user and .user etc.
Update: My overly simplified example perhaps didn't properly illustrate the issue. Imagine a representation that has 50 properties (for example a server representation with all its monitoring attributes - cpu, ram, temp, storage_drive_a, storage_drive_b, file_permission etc.) Of these 50 properties, 30 are read only properties and 20 of these are values that can be set.

First of all, the final semantics of the POST method are determined by the targeted resource, not by the HTTP protocol, as with the other methods, so your POST method can do anything you want, as long as you document it properly, and you are not replicating functionality already standardized by other methods.
So, in short, there's nothing wrong with having a different representation for POST and GET method.
However, asking for a best-practice in this case is pointless, because what defines the representation format is the media-type being used, not the method, but most of the so-called REST APIs around the internet use generic media-types for everything and clients rely on URI semantics to know which resource they are dealing with, which is not RESTful at all. Basically, you are asking for the best-practice for a problem that doesn't really exist in REST when things are done properly.
So, to answer your question, you can have different representations with different media-types -- like your complete user representation might have a media-type application/vnd.mycompany.user.full.v1+json, and a simplified user representation might have a media-type application/vnd.mycompany.user.min.v1+json -- or you can have a single representation like application/vnd.mycompany.user.v1+json and your documentation for this media-type might detail how some properties might exist or not, or might have default values if not provided. Your POST method will require one media-type to work, and will respond with 415 Unsupported Media Type if clients send anything else in the Content-Type header. In the same way, a client may choose the representation it wants with the Accept header.
As you can see, what you are asking isn't a problem when you are really doing REST, and not merely using it as a buzzword for an HTTP API.

Related

GET Vs POST in REST

According to Rest, we should use GET if we have to retrieve some data, and use POST if it's creating a resource.
But, if we have to take multiple parameters (more than 7-8), or list of UUIDs let's say, then shouldn't we use POST rather than GET?
To avoid:
The complexity of the URL
Future scope to incorporate any new field
URL length which may limit us in future
If we are not encoding our URL then we may risk of exposing params (least significant point though)
Thanks.
GET Vs POST in REST
Use GET when the semantics of GET fit the use case; in other words, when you are trying to retrieve a copy of the latest representation of some resource.
Use POST when no other standardized method supports the semantics you need.
You can use POST for everything, but you give up the advantages of using more specialized methods -- the ability of general purpose components to do intelligent things because they understand the semantics of the request. For instance, GET has safe semantics, which means that a general purpose component knows that it can pre-fetch a representation of the resource before the user needs it, or automatically repeat a request if it doesn't get a response from the server.
What HTTP gives you is a extendable collections of refinements of the general purpose POST method, adding more specific constraints so that general purpose components can leverage the resulting properties.
The complexity of the URL
Complexity in the URL really isn't a big deal, as far as general purpose components are concerned a URL is just a opaque sequence of bytes that happens to abide by certain production rules. For the most part, the effective target-uri of a web request is treated as an atomic unit, so what might seem "complex to a human being doesn't bother the machines at all (for instance, take a look at the URL used when you submit a search from the google home page).
URL length which may limit us in future
We care a bit about URL length. RFC 3986 doesn't restrict the length of the URI, but some implementations of general purpose components will fail if the length is far outside the norm. So you probably don't want to include a url encoded copy of the unabridged works of Shakespeare in the query part of your request.
Future scope to incorporate any new field
Again, there's not a lot of difference here. Adding new optional elements to a URI template is really no different than adding new optional elements to a message template.
we may risk of exposing params
We also want to be careful about sensitive information - as far as the machines are concerned, the URI is an identifier; there's no particular reason to worry about a specific sequence of bytes. Which means that the URI may be exposed at rest (in a clients history, or list of bookmarks, in the servers access logs). Restricting sensitive information to the body of a message reduces the chance of the data escaping beyond its intended use.
Note that REST, and leveraging the different HTTP methods, isn't the only way to get useful work done. SOAP (and more recently gRPC) decided that a different collection of trade offs was better -- in effect, reducing HTTP to transport, rather than an application in itself.
According to Rest, we should ... use POST if it's creating a resource.
This is an incorrect interpretation of REST. It's a very common interpretation, but incorrect. The semantics of POST are defined by RFC 7231; it means that, and not something else.
The suggestion that POST should only be used for create is a misleading over simplification. The earliest references I've been able to find to it is a blog post by Paul Prescod in 2002; and of course it became very popular with the arrival of Ruby on Rails.
But recall: REST is the architectural style of the world wide web. HTML, the most common hypertext media type in use on the web, has native support for only two HTTP methods; GET, used to fetch resource representations from a server, and POST which does everything else.
You should also use POST if you have sensitive data such as username and or password which are best encoded as form parameters (key value pairs)

Using GET for sending information to server in REST APIs

Until now, I used to think the only difference between GET and POST is the format on data trasmission.
In other examples I have seen that in REST Api, the only command to send information is POST, while for getting information is GET...
But what if I want to use GET also for information sending? I have to send a number of dataset names to be processed (usually a short number...)... I don't understand why GET doesn't fits this operation. Or it's only a "semantic" reason?
EDIT: I don't want answer about the general differences between GET or POST... I need to know if GET should not be used to update info of the server IN ALL CASES, in particular regarding my specific case, or there are some exceptions.
I don't understand why GET doesn't fits this operation. Or it's only a "semantic" reason?
Yes, but semantics are the entire point.
One of the important constraints of the REST architectural style is that of a uniform interface; the idea that all components understand messages the same way. What that gives us is the ability to achieve useful work using general purpose components.
In HTTP (an application built within the constraints of this style), that means that we can use browsers, and caches, and reverse proxies, and server toolkits, and so forth, mixing and matching as we go. It all works, because all of these components understand that a request should be interpreted as described in RFC 7230, and that the request method token is "the primary source of request semantics", and so on.
The semantics of GET have been defined, and we all share in that definition. The wording of the definition has evolved somewhat from its earliest specifications, but the basic semantics have been constant: that GET is effectively read-only, and that the message body (of the request) is of no semantic interest.
If you want to use an HTTP method where the message body is semantically significant, then you need to either relax some of the constraints of GET (for instance, by using POST), choose a different standardized method that better fits your use case (see the IANA HTTP Method Registry), or by defining (and potentially standardizing) your own HTTP method.
The core problem with trying to define a payload for GET - while your bespoke client may know what to do, and your bespoke resource may know what to do, general-purpose intermediates are likely to do the wrong thing (like caching responses without capturing the information in the request body, or discarding the body as unnecessary).
Note that information that is encoded into the target-uri works just fine; HTML forms using the GET method and other variations of URI templates can be used to construct a target-uri from local information that the server can interpret (of course, the defacto limits on target-uri length are much shorter than the defacto limits on payload size; it's not a universal solution).

What HTTP method should i use when i need to pass a lot of data to a single endpoint?

Currently i've asking if the current HTTP method that i'm using on my rest api is the correct one to the occasion, my endpoint need a lot of data to "match" a query, so i've made a POST endpoint where the user can send a json with the parameters, e.g:
{
"product_id": 1
"partner": 99,
"category": [{
"id": 8,
"subcategories": [
{"id": "x"}, {"id": "y"}
]
}]
"brands": ["apple", "samsung"]
}
Note that brands and category are a list.
Reading the mozzila page about http methods i found:
The POST method is used to submit an entity to the specified resource, often causing a change in state or side effects on the server.
My POST endpoint does not take any effect on my server/database so in theory i'm using it wrong(?), but if i use a GET request how can i make it more "readable" and how can i manage lists on this method?
What HTTP method should i use when i need to pass a lot of data to a single endpoint?
From RFC 7230
Various ad hoc limitations on request-line length are found in practice. It is RECOMMENDED that all HTTP senders and recipients support, at a minimum, request-line lengths of 8000 octets.
That effectively limits the amount of information you can encode into the target-uri. Your limit in practice may be lower still if you have to support clients or intermediaries that don't follow this recommendation.
If the server needs the information, and you cannot encode it into the URI, then you are basically stuck with encoding it into the message-body; which in turn means that GET -- however otherwise suitable the semantics might be for your setting -- is out of the picture:
A payload within a GET request message has no defined semantics
So that's that - you are stuck with POST, and you lose safe semantics, idempotent semantics, and caching.
A possible alternative to consider is creating a resource which the client can later GET to retrieve the current representation of the matches. That doesn't make things any better for the first adhoc query, but it does give you nicer semantics for repeat queries.
You might, for example, copy the message-body into an document store, and encode the key to the document store (for example, a hash of the document) into the URI to be used by the client in subsequent GETs.
For cases where the boilerplate of the JSON document is large, and the encoding of the variation small, you might consider a family of resources that encode the variations into the URI. In effect, the variation is extracted from the URI and applied to the server's copy of the template, and then the fully reconstituted JSON document is used to achieve... whatever.
You should be using a POST anyways. With Get you can only "upload" data via URL parameters or HTTP Headers. Both are unsuitable for structured data like yours. Do use POST even though no "change" happens on the server.

What is the best practice to HTTP GET only the list of objects that I require

I am in a situation where in I want to REST GET only the objects that I require by addressing them with one of the parameters that I know about those objects.
E.g., I want to GET all the USERS in my system with ids 111, 222, 333.
And the list can be bigger, so I don't think it is best way to append the URL with what is required, but use payload with json.
But I am skeptical using JSON payload in a GET request.
Please suggest a better practice in the REST world.
I am skeptical using JSON payload in a GET request.
Your skepticism is warranted; here's what the HTTP specification has to say about GET
A payload within a GET request message has no defined semantics
Trying to leverage Undefined Behavior is a Bad Idea.
Please suggest a better practice in the REST world.
The important thing to recognize is that URI are identifiers; the fact that we sometimes use human readable identifiers (aka hackable URI) is a convenience, not a requirement.
So instead of a list of system ids, the URI could just as easily be a hash digest of the list of system ids (which is probably going to be unique).
So your client request would be, perhaps
GET /ea3279f1d71ee1e99249c555f3f8a8a8f50cd2b724bb7c1d04733d43d734755b
Of course, the hash isn't reversible - if there isn't already agreement on what that URI means, then we're stuck. So somewhere in the protocol, we're going to need to make a request to the server that includes the list, so that the server can store it. "Store" is a big hint that we're going to need an unsafe method. The two candidates I would expect to see here are POST or PUT.
A way of thinking about what is going on is that you have a single resource with two different representations - the "query" representation and the "response" representation. With PUT and POST, you are delivering to the server the query representation, with GET you are retrieving the response representation (for an analog, consider HTML forms - we POST application/x-www-form-urlencoded representations to the server, but the representations we GET are usually friendlier).
Allowing the client to calculate a URI on its own and send a message to it is a little bit RPC-ish. What you normally do in a REST API is document a protocol with a known starting place (aka a bookmark) and a sequence of links to follow.
(Note: many things are labeled "REST API" that, well, aren't. If it doesn't feel like a human being navigating a web site using a browser, it probably isn't "REST". Which is fine; not everything has to be.)
But I believe POST or PUT are for some requests that modify the data. Is it a good idea to use query requests with them ?
No, it isn't... but they are perfect for creating new resources. Then you can make safe GET calls to get the current representation of the resource.
REST (and of course HTTP) are optimized for the common case of the web: large grain hypermedia, caching, all that good stuff. Various use cases suffer for that, one of which is the case of a transient message with safe semantics and a payload.
TL;DR: if you don't have one of the use cases that HTTP is designed for, use POST -- and come to terms with the fact that you aren't really leveraging the full power of HTTP. Or use a different application - you don't have to use HTTP if its a bad fit.

REST - What exactly is meant by Uniform Interface?

Wikipedia has:
Uniform interface
The uniform interface constraint is fundamental to the design of any REST service.[14] The uniform interface simplifies and decouples the architecture, which enables each part to evolve independently. The four guiding principles of this interface are:
Identification of resources
Individual resources are identified in requests, for example using URIs in web-based REST systems. The resources themselves are conceptually separate from the representations that are returned to the client. For example, the server may send data from its database as HTML, XML or JSON, none of which are the server's internal representation, and it is the same one resource regardless.
Manipulation of resources through these representations
When a client holds a representation of a resource, including any metadata attached, it has enough information to modify or delete the resource.
Self-descriptive messages
Each message includes enough information to describe how to process the message. For example, which parser to invoke may be specified by an Internet media type (previously known as a MIME type). Responses also explicitly indicate their cacheability.
Hypermedia as the engine of application state (A.K.A. HATEOAS)
Clients make state transitions only through actions that are dynamically identified within hypermedia by the server (e.g., by hyperlinks within hypertext). Except for simple fixed entry points to the application, a client does not assume that any particular action is available for any particular resources beyond those described in representations previously received from the server.
I'm listening to a lecture on the subject and the lecturer has said:
"When someone comes up to our API, if you are able to get a customer object and you know there are order objects, you should be able to get the order objects with the same pattern that you got the customer objects from. These URI's are going to look like each other."
This strikes me as wrong. It's not so much about what the URI's look like or that there is consistency as it is the way in which the URI's are used (identify resources, manipulate the resources through representations, self-descriptive messages, and hateoas).
I don't think that's what Uniform Interface means at all. What exactly does it mean?
Using interfaces to decouple classes from the implementation of their dependencies is a pretty old concept. In REST you use the same concept to decouple the client from the implementation of the REST service. In order to define such an interface (a contract between the client and the service), you have to use standards. This is because if you want an internet size network of REST services, you have to enforce global concepts, like standards to make them understand each other.
Identification of resources - You use the URI (IRI) standard to identify a resource. In this case, a resource is a web document.
Manipulation of resources through these representations - You use the HTTP standard to describe communication. So for example GET means that you want to retrieve data about the URI-identified resource. You can describe an operation with an HTTP method and a URI.
Self-descriptive messages - You use standard MIME types and (standard) RDF vocabs to make messages self-descriptive. So the client can find the data by checking the semantics, and it doesn't have to know the application-specific data structure the service uses.
Hypermedia as the engine of application state (a.k.a. HATEOAS) - You use hyperlinks and possibly URI templates to decouple the client from the application-specific URI structure. You can annotate these hyperlinks with semantics e.g. IANA link relations, so the client will understand what they mean.
The Uniform Interface constraint, that any ReSTful architecture should comply with, actually means that, along with the data, server responses should also announce available actions and resources.
In chapter 5 ("Reprensational State Transfer") of his dissertation, Roy Fielding states that the aim of using uniform interfaces is to:
ease and improve global architecture and the visibility of interactions
In other words, querying resources should allow the client to request other actions and resources without knowing them in advance.
The JSON-API specs (jsonapi.org) offer a good example in the form of a JSON response to an (hypothetical) GET HTTP request on http://example.com/articles :
{
"links": {
"self": "http://example.com/articles",
"next": "http://example.com/articles?page[offset]=2",
"last": "http://example.com/articles?page[offset]=10"
},
"data": [{
"type": "articles",
"id": "1",
"attributes": {
"title": "JSON API paints my bikeshed!"
},
"relationships": {
"author": {
"links": {
"self": "http://example.com/articles/1/relationships/author",
"related": "http://example.com/articles/1/author"
},
},
"comments": {
"links": {
"self": "http://example.com/articles/1/relationships/comments",
"related": "http://example.com/articles/1/comments"
}
}
},
"links": {
"self": "http://example.com/articles/1"
}
}]
}
Just by analysing this single response, a client knows:
What entities were queried ("articles" in this example);
How these entities are structured (articles have fields: id, title, author, comments);
How to retrieve related entities (i.e. the author and the comments);
That there are more entities of type "articles" (10, based on current response length and pagination links).
For those passionate about the topic, I strongly recommend reading Roy Thomas Fielding's dissertation!
Your question is somewhat broad, you seem to be asking for a restatement of the definitions you have. Are you looking for examples or do you not understand somethings specifically stated.
I agree that the line:
These URI's are going to look like each other
is fundamentally wrong. URIs needn't look anything like each other for the Uniform interface constraint to be met. What needs to be present is a uniform way to discover the URIs that identify the resources. This uniform way is unique to each message type, and there must be some agreed upon format. For example in HTML one document resource links to another via a simple tag:
fallback relationship
HTTP servers return html as a text/html resource type which browsers have an agreed upon way of parsing. The anchor tag is the hypermedia control (HATEOAS) that has the unique identifier for the related resource.
The only point that wasn't covered was manipulation. HTML has another awesome example of this, the form tag:
<form action="URI" method="verb">
<input name=""></input>
</form>
again, browser know how to interpret this meta information to define a representation of the resource acted upon at the URI. Unfortunately HTML only lets you GET and POST for verbs...
more commonly in a JOSN based service, when you retrieve a Person resource, it's easy to manipulate that representation and then PUT or PATCH it right back to it's canonical URL. No pre-existing knowledge of the resource is needed to modify it. Now when we write client code we get all wrapped up with the idea that we do in fact need to know the shape before we consume it...but that really is just to make our parsers efficient and easy. We could make parsers that analyze the semantic meaning of each part of a resource and modify it by interpreting the intent of the modification. IE: a command of make the person 10 years older would parse the resource looking for the age, identify the age, and then add 10 years to that value, then send that resource back to the server. Is it easier to have code that expects the age to be at a JSON path of $.age? absolutely...but it's not specifically necessary.
Ok I think I understand what it means.
From Fieldings dissertation:
The central feature that distinguishes the REST architectural style from other network-based styles is its emphasis on a uniform interface between components (Figure 5-6). By applying the software engineering principle of generality to the component interface, the overall system architecture is simplified and the visibility of interactions is improved.
He's saying that the interface between components must be the same. Ie. between client and server and any intermediaries, all of which are components.