PUT on REST collection. Do I provide addresses? - rest

When you do a PUT on a REST collection should you provide the addresses of the members in the collection?
PUT on dogs with address specified
[{"name":"sparky", "id":1}, {"name":"rusty", "id":2}]
or...
PUT on dogs without address specified
[{"name":"sparky"}, {"name":"rusty"}]
and let the server return the locations of the new members in the collection. In my case the row ids in my table.

If you are sending a PUT to replace every resource in the entire collection, you should send the full new / updated entities.
If the id is part of the entity, you should send it within the entity, but for the client that has nothing to do with the URI (a.k.a the "address") of the element, as clients have now knowledge how a URI is build by the server.
If the id is not part of the entity for the client, you should not send it within the entity.
But both cases are possible and your server could accept both, important is just that the the PUT on collection behaves as in the HTTP/1.1 specs described:
The PUT method requests that the enclosed entity be stored under the supplied Request-URI.
So if you PUT on /api/animals/dogs/ the dogs just be found under /dogs/ and not e.g. /cats/
If the Request-URI refers to an already existing resource, the enclosed entity SHOULD be considered as a modified version of the one residing on the origin server
That means that a PUT on any collection, delectes the whole collection and creates a new one with the entities given in the body of the method call.
If the Request-URI does not point to an existing resource [...] the origin server can create the resource with that URI.
That means if /animals/dogs/ is empty you should create a new collection there.

Related

REST API Design: Path variable vs request body on UPDATE (Best practices)

When creating an UPDATE endpoint to change a resource, the ID should be set in the path variable and also in the request body.
Before updating a resource, I check if the resource exists, and if not, I would respond with 404 Not Found.
Now I ask myself which of the two information I should use and if I should check if both values are the same.
For example:
PUT /users/42
// request body
{
"id": 42,
"username": "user42"
}
You should PUT only the properties you can change into the request body and omit read-only properties. So you should check the id in the URI, because it is the only one that should exist in the message.
It is convenient to accept the "id" field in the payload. But you have to be sure it is the same as the path parameter. I solve this problem by setting the id field to the value of the path parameter (be sure to explain that in the Swagger of the API). In pseudo-code :
idParam = request.getPathParam("id");
object = request.getPayload();
object.id = idParam;
So all these calls are equivalent :
PUT /users/42 + {"id":"42", ...}
PUT /users/42 + {"id":"41", ...}
PUT /users/42 + {"id":null, ...}
PUT /users/42 + {...}
Why do you need the id both in URL and in the body? because now you have to validate that they are both the same or ignore one in any case. If it is a requirement for some reason, than pick which one is the one that is definitive and ignore the other one. If you don't have to have this strange duplication, than I'd say pass it in the body only
If you take a closer look at how HTTP works you might notice that the URI used to send a request to is also used as key for caching results. Any non-safe operation performed on that URI, such as POST, PUT, PATCH, will lead to (intermediary) caches automatically invalidating any stored responses for that URI. As such, if you use an other URI than the actual resource URI you are actually bypassing that feature and risk getting served outdated state from caches. As caching is one of the few constraints REST has simply skipping all caching via certain directives isn't ideal in first place.
In regards to including the ID of the resource or domain entity in the URI and/or in the payload: A common mistake in designing so-called REST APIs is that the domain object is mapped in a 1:1 manner onto a resource. We had a customer once who went through a merger and in a result they ended up with the same products being addressed by multiple IDs. In order to reduce the data in their DB they at one point tried to consolidate their data and continue. But they had to support still the old URIs they exposed for their products. In the end they realized that exposing the product ID via the URI wasn't ideal in their situation as it lead to plenty of downstream changes that affected their customers. As such, a recommendation here is to use UUIDs that don't give the target resource any semantic meaning and don't ever change. If the product ID in the back changes it doesn't affect the exposed URI at all. Sure, you might need a further table/collection to map from the product to the actual resource URI but you in the end designed your system with the eventuality of change which it now is more likely to coop with.
I've read so many times that the product ID shouldn't be part of the resource as it is already present in the URI. First, the whole URI is a unique identifier of that resource and not only a part of it. Next as mentioned above, IMO the product ID shouldn't be part of the URI in first place but it should be part of the resources' state. After all, the product ID is part of the products properties and therefore should be included there accordingly. As such, the media type exposed should contain all the necessities that a client is able to identify the product ID off the payload. The media type the resource's state is exchange with should also provide means to include the ID if you want to perform an update. I.e. if you take HTML as example, here you get served a HTML form by the server which basically teaches you where to send the request to, which HTTP operation to use, which media-type to marshal the request with and the actual properties of the resource, including the ones you are not meant to change. HTML does this i.e. via hidden input fields. Other form-based media types, such as HAL forms, JsonForms or Ion, might provide other mechanisms though.
So, to sum my post up:
Don't map the product ID onto URIs. Use a mapping from product ID to UUIDs instead
Use form-based media-types that support clients in creating requests. These media types should allow to include unmodifiable properties, such as hidden input fields and the like

What is the proper RESTful API method to replace an entire collection?

Imagine, We have an Entity School and this entity has a one to many relationship with Student entity. In other words, there is a collection of Students attached to a given School
If we are to replace the entire Student collection via a single API call,
API_URL/school/:school_id/students
which is the best Rest method to go along with. I think PUT is only used on an Entity not on a Collection. Hence, available options would be to use either PATCH or POST
I think PUT is only used on an Entity not on a Collection
No - PUT is used on a resource, not on an entity or collection.
The PUT method requests that the state of the target resource be created or replaced with the state defined by the representation enclosed in the request message payload.
The changes that happen to the entities in your domain are a side effect of the manipulation of REST resources. See Jim Webber's talk REST: DDD in the Large.
If your message body is a replacement representation for the resource, then either POST or PUT is the appropriate method to use
If your message body is a patch document, then you should use POST or PATCH.
If you are concerned that POST would be overloaded, then create a new resource in your design to manage this part of your integration protocol.
Again, heed Jim Webber:
URIs do NOT map onto domain objects - that violates encapsulation. Work (ex: issuing commands to the domain model) is a side effect of managing resources. In other words, the resources are part of the anti-corruption layer. You should expect to have many many more resources in your integration domain than you do business objects in your business domain.
I'm facing same issue. My teammates hate none resource nouns in the path. So in order to pass the API design review and to distinct an operation on whole collection from one on a single resource, I go one level up to the school.
GET /schools/1234
{
"schoolMetadat": "xxxx",
"students": []
}
And PATCH on the students property. An update to a property is always replacement.
For updating the entire resource use PUT, for partial update use PATCH.
PATCH API_URL/school/:school_id {students: [...]}
PUT API_URL/school/:school_id/students [...]
PATCH API_URL/school/:school_id/students {add: [...], remove: [...]}
And don't confuse web services in the presentation layer with ORMs in the data access layer.

Embedding collections (subdocument arrays) with MongoDB violates REST?

Let's say I have a users collection in my Mongo database:
users
_id
emailAddress
firstName
lastName
passwordHash
accessLogs: [ ... ]
createdAt
As you can see, a user document can contain an array of accessLogs. Great.
But let's say I want to update a user record and do a PUT /users/:id request against my RESTful API that uses this database. With a PUT, you're supposed to get back what you put in. So let's say the user has logged in 500 times. In order to avoid violating REST, does this mean my PUT data should include the accessLogs array and all its items?
I suppose the request handler could just update everything except accessLogs.
In the strictest definition PUT should indeed replace the contents of the object. If you want to update an existing object with partial data/instructions instead, you should use the PATCH method. This would allow you to specify that you want to add accessLogs (or otherwise leave them unmodified) and not have to send the entire object -- just instructions for what needs to be updated.
In your case the accessLogs is a generated read-only property and so it does not have to be part of your PUT request. You don't send the _id property as well and if some of the properties have default values, you don't have to send those either.
The PUT method requests that the enclosed entity be stored under the
supplied Request-URI. If the Request-URI refers to an already existing
resource, the enclosed entity SHOULD be considered as a modified
version of the one residing on the origin server.
Hypertext Transfer Protocol - HTTP/1.1
The difference between the PUT and PATCH requests is reflected in the
way the server processes the enclosed entity to modify the resource
identified by the Request-URI. In a PUT request, the enclosed entity
is considered to be a modified version of the resource stored on the
origin server, and the client is requesting that the stored version be
replaced. With PATCH, however, the enclosed entity contains a set of
instructions describing how a resource currently residing on the
origin server should be modified to produce a new version. The PATCH
method affects the resource identified by the Request-URI, and it also
MAY have side effects on other resources; i.e., new resources may be
created, or existing ones modified, by the application of a PATCH.
PATCH Method for HTTP
You can find a similar question here.

REST - How would a PUT request handle auto-incremented resource identifiers

According to the HTTP 1.1. spec:
If the Request-URI does not point to an existing resource, and that
URI is capable of being defined as a new resource by the requesting
user agent, the origin server can create the resource with that URI.
So in other words, PUT can be used to create & update. More specifically, if I do a PUT request e.g.
PUT /users/1
and that user does not exist, I would expect the result of this request to create a user with this ID. However, how would this work if your backend is using an auto-increment key? Would it be a case of simply ignoring it if it's not feasible (e.g. auto-increment is at 6 and I request 10) & creating if it is possible (e.g. request 7)?
From the snippet I have extracted above it does appear to give you this flexibility, just looking for some clarification.
I'd suggest that you use POST, not PUT, for an auto-increment key, or do not use the auto-increment key in the resource ID.
If you use POST, then you'd POST to /users rather than to /users/1. The reply might redirect you to /users/1 or whatever the ID is.
If you use PUT, then you might PUT to /users/10292829 where the number is a unique resource key generated on the client. This key can be time-generated, or it can be a hash of time, session ID, and some other factors to guarantee uniqueness of the value across your client audience. The server can then generate its own auto-incremented index, distinct from 10292829 or whatever.
For more on that, see PUT vs POST in REST
Following up. . .
In the case of allowing PUT to /users/XXXXXXX, for all users, you'd end up with two distinct unique keys that refer to the same resource. (10292829 and 1 might refer to the same user). You'd need to decide how to allow the use of each of these different keys in a REST-style URL. Because of the need to reconcile the use of these two distinct ids, I'd prefer to use the first option, POSTing to /users and getting a unique REST url of the created resource in the response.
I just re-read the relevant section of RFC 2616, and saw a return code specifically designed for this in REST applications:
10.2.2 201 Created
The request has been fulfilled and resulted in a new resource being created. The newly created resource can be referenced by the URI(s) returned in the entity of the response, with the most specific URI for the resource given by a Location header field. The response SHOULD include an entity containing a list of resource characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content-Type header field. The origin server MUST create the resource before returning the 201 status code. If the action cannot be carried out immediately, the server SHOULD respond with 202 (Accepted) response instead.
So, the RESTful way to go is to POST to /users and return a 201 Created, with a Location: header specifying /users/1.
You should be using POST to create resources while the PUT should only be used for updating. Actually REST semantics forces you to do so.

RESTful idempotence

I'm designing a RESTful web service utilizing ROA(Resource oriented architecture).
I'm trying to work out an efficient way to guarantee idempotence for PUT requests that create new resources in cases that the server designates the resource key.
From my understanding, the traditional approach is to create a type of transaction resource such as /CREATE_PERSON. The the client-server interaction for creating a new person resource would be in two parts:
Step 1: Get unique transaction id for creating the new PERSON resource:::
**Client request:**
POST /CREATE_PERSON
**Server response:**
200 OK
transaction-id:"as8yfasiob"
Step 2: Create the new person resource in a request guaranteed to be unique by using the transaction id:::
**Client request**
PUT /CREATE_PERSON/{transaction_id}
first_name="Big bubba"
**Server response**
201 Created // (If the request is a duplicate, it would send this
PersonKey="398u4nsdf" // same response without creating a new resource. It
// would perhaps send an error response if the was used
// on a transaction id non-duplicate request, but I have
// control over the client, so I can guarantee that this
// won't happen)
The problem that I see with this approach is that it requires sending two requests to the server in order to do to single operation of creating a new PERSON resource. This creates a performance issues increasing the chance that the user will be waiting around for the client to complete their request.
I've been trying to hash out ideas for eliminating the first step such as pre-sending transaction-id's with each request, but most of my ideas have other issues or involve sacrificing the statelessness of the application.
Is there a way to do this?
Edit::::::
The solution that we ended up going with was for the client to acquire a UUID and send it along with the request. A UUID is a very large number occupying the space of 16 bytes (2^128). Contrary to what someone with a programming mind might intuitively think, it is accepted practice to randomly generate a UUID and assume that it is a unique value. This is because the number of possible values is so large that the odds of generating two of the same number randomly are low enough to be virtually impossible.
One caveat is that we are having our clients request a UUID from the server (GET uuid/). This is because we cannot guarantee the environment that our client is running in. If there was a problem such as with seeding the random number generator on the client, then there very well could be a UUID collision.
You are using the wrong HTTP verb for your create operation. RFC 2616 specifies the semantic of the operations for POST and PUT.
Paragraph 9.5:
POST method is used to request
that the origin server accept the
entity enclosed in the request as
a new subordinate of the resource
identified by the Request-URI in the Request-Line
Paragraph 9.6
PUT method requests that the
enclosed entity be stored under the
supplied Request-URI.
There are subtle details of that behavior, for example PUT can be used to create new resource at the specified URL, if one does not already exist. However, POST should never put the new entity at the request URL and PUT should always put any new entity at the request URL. This relationship to the request URL defines POST as CREATE and PUT as UPDATE.
As per that semantic, if you want to use PUT to create a new person, it should be created in /CREATE_PERSON/{transaction_id}. In other words, the transaction ID returned by your first request should be the person key used to fetch that record later. You shouldn't make PUT request to a URL that is not going to be the final location of that record.
Better yet, though, you can do this as an atomic operation by using a POST to /CREATE_PERSON. This allows you with a single request to create the new person record and in the response to get the new ID (which should also be referred in the HTTP Location header as well).
Meanwhile, the REST guidelines specify that verbs should not be part of the resource URL. Thus, the URL to create new person should be the same as the location to get the list of all persons - /PERSONS (I prefer the plural form :-)).
Thus, your REST API becomes:
to get all persons - GET /PERSONS
to get single person - GET /PERSONS/{id}
to create new person - POST /PERSONS with the body containing the data for the new record
to update existing person or create new person with well-known id - PUT /PERSONS/{id} with the body containing the data for the updated record.
to delete existing person - DELETE /PERSONS/{id}
Note: I personally prefer not using PUT for creating records for two reasons, unless I need to create a sub record that has the same id as an already existing record from a different data set (also known as 'the poor man's foreign key' :-)).
Update: You are right that POST is not idempotent and that is as per HTTP spec. POST will always return a new resource. In your example above that new resource will be the transaction context.
However, my point is that you want the PUT to be used to create a new resource (a person record) and according to the HTTP spec, that new resource itself should be located at the URL. In particular, where your approach breaks is that the URL you use with the PUT is a representation of the transactional context that was created by the POST, not a representation of the new resource itself. In other words, the person record is a side effect of updating the transaction record, not the immediate result of it (the updated transaction record).
Of course, with this approach the PUT request will be idempotent, since once the person record is created and the transaction is 'finalized', subsequent PUT requests will do nothing. But now you have a different problem - to actually update that person record, you will need to make a PUT request to a different URL - one that represents the person record, not the transaction in which it was created. So now you have two separate URLs your API clients have to know and make requests against to manipulate the same resource.
Or you could have a complete representation of the last resource state copied in the transaction record as well and have person record updates go through the transaction URL for updates as well. But at this point, the transaction URL is for intends and purposes the person record, which means it was created by the POST request in first place.
I just came across this post:
Simple proof that GUID is not unique
Although the question is universally ridiculed, some of the answers go into deeper explanation of GUIDs. It seems that a GUID is a number of 2^128 in size and that the odds of randomly generating two of the same numbers of this size so low as to be impossible for all practical purposes.
Perhaps the client could just generate its own transaction id the size of a GUID instead of querying the server for one. If anyone can discredit this, please let me know.
I'm not sure I have a direct answer to your question, but I see a few issues that may lead to answers.
Your first operation is a GET, but it is not a safe operation as it is "creating" a new transaction Id. I would suggest POST is a more appropriate verb to use.
You mention that you are concerned about performance issues that would be perceived by the user caused by two round trips. Is this because your user is going to create 500 objects at once, or because you are on a network with massive latency problems?
If two round trips are not a reasonable expense for creating an object in response to a user request, then I would suggest HTTP is not the right protocol for your scenario. If however, your user needs to create large amounts of objects at once, then we can probably find a better way of exposing resources to enable that.
Why don't you just use a simple POST, also including the payload on your first call. This way you save on extra call and don't have to spawn a transaction:
POST /persons
first_name=foo
response would be:
HTTP 201 CREATED
...
payload_containing_data_and_auto_generated_id
server-internally an id would be generated. for simplicity i would go for an artifial primary key (e.g. auto-increment id from database).