Best practice for RESTful API design: updating a one/many-to-many relationship? (PostgreSQL)

Suppose I have a recipe page where a recipe can have a number of ingredients associated with it. Users can edit the ingredients list and update/save the recipe. In the database there are these tables: a recipes table, an ingredients table, and an ingredients_recipes join table.

Suppose a recipe has ingredients a, b, c, d, but then the user changes it to a, d, e, f. In the request to the server, do I send only the new ingredients list and have the back end determine which values need to be deleted from or inserted into the database? Or do I explicitly state in the payload which values need to be deleted and which need to be inserted? I'm guessing it's probably the former, but then is the difference worked out before or during the DB query? Do I read from the table first and then write after calculating the differences, or can the query just handle this?
I searched and I'm seeing solutions involving INSERT IGNORE... + DELETE ... NOT IN ... or using the MERGE statement. The project isn't using an ORM -- would I be right to assume that this could be done easily with an ORM?

Can you share what the user interface looks like? It's pretty standard practice to let the user either post a single new ingredient as one action or delete one as another. You could simply have a button next to each ingredient to initiate a DELETE request, and a form beneath for a POST.
Having the user submit the whole list at once creates unnecessary complexity.

A common pattern to use would be to treat this like a remote authoring problem.
The basic idea of remote authoring is that we ask the server for its current representation of a resource. We then make local (to the client) edits to the representation, and then request that the server accept our representation as a replacement.
So we might GET a representation that includes a JSON array of ingredients. In our local copy, we remove the ingredients we no longer want and add the new ones in. Then we would PUT our local copy back to the server.
When the document is very large and the changes are easy to describe, we might, instead of sending the entire document to the server, send a PATCH request with a "patch document" that describes the changes we have made locally.
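For concreteness, here is a rough sketch of what such a patch document could look like for the ingredient example above (the /recipes/42 URL and the /ingredients path are assumptions, not something from the question):

    # RFC 6902 JSON Patch body a client might send with PATCH /recipes/42,
    # assuming the representation exposes its ingredients as a JSON array.
    # Original array: ["a", "b", "c", "d"]; desired: ["a", "d", "e", "f"].
    patch_document = [
        {"op": "remove", "path": "/ingredients/1"},             # drop "b"
        {"op": "remove", "path": "/ingredients/1"},             # drop "c" (indices shift after each remove)
        {"op": "add", "path": "/ingredients/-", "value": "e"},
        {"op": "add", "path": "/ingredients/-", "value": "f"},
    ]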
When the server is just a document store, the implementation on the server is easy -- you can review the changes to decide if they are valid, compute the new representation (if necessary), and then save it into a file, or whatever.
When you are using a relational database, the server implementation needs to figure out how to update itself. An ORM library might save you a bunch of work, but there are no guarantees -- people tend to get tangled up in the "object" end of the "object relational mapper". You may need to fall back to hand-rolling your own SQL.
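A minimal sketch of the hand-rolled route, assuming a join table named ingredients_recipes with columns (recipe_id, ingredient_id) and a unique constraint over both -- the names are guesses, not taken from the question:

    import psycopg2

    def replace_recipe_ingredients(conn, recipe_id, ingredient_ids):
        """Make the join table match the ingredient list the client sent."""
        with conn:                              # commit on success, roll back on error
            with conn.cursor() as cur:
                # Drop associations the client no longer wants.
                cur.execute(
                    "DELETE FROM ingredients_recipes "
                    "WHERE recipe_id = %s AND NOT (ingredient_id = ANY(%s::int[]))",
                    (recipe_id, ingredient_ids),
                )
                # Add the new ones; rows that already exist are left untouched.
                # Requires a unique constraint on (recipe_id, ingredient_id).
                cur.execute(
                    "INSERT INTO ingredients_recipes (recipe_id, ingredient_id) "
                    "SELECT %s, unnest(%s::int[]) "
                    "ON CONFLICT (recipe_id, ingredient_id) DO NOTHING",
                    (recipe_id, ingredient_ids),
                )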
An alternative to remote authoring is to treat the problem like a web site. In that case, you would get some representation of a form that allows the client to describe the change that should be made, and then submit the form, producing a POST request that describes the intended changes.
But you run into the same mapping problem on the server end -- how much work do you have to do to translate the POST request into the correct database transaction?
REST, alas, doesn't tell you anything about how to transform the representation provided in the request into your relational database. After all, that's part of the point -- REST is intended to allow you to replace the server with an alternative implementation without breaking existing clients, and vice versa.
That said, yes - your basic ideas are right; you might just replace the entire existing representation in your database, or you might instead optimize to only issue the necessary changes. An ORM may be able to effectively perform the transformations for you -- optimizations like lazy loading have been known to complicate things significantly.
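If you go the route of reading first and issuing only the necessary changes, the diff itself is simple set arithmetic; a minimal sketch (the surrounding fetch/insert/delete plumbing is whatever data access the project already uses):

    def diff_ingredients(current_ids, desired_ids):
        """Return (ids to delete, ids to insert) for the join table."""
        current, desired = set(current_ids), set(desired_ids)
        return current - desired, desired - current

    # diff_ingredients({"a", "b", "c", "d"}, {"a", "d", "e", "f"})
    # -> ({"b", "c"}, {"e", "f"})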

Related

REST: How to support create-or-update and partial-update? (aka PUT vs PATCH)

We are designing a WebAPI for our software for managing ecommerce product information. We want to provide (among many others) two operations:
The simple one: allow the user to add/modify existing product information:
don't create a new product if it doesn't exist
don't delete any information from the existing product that was not provided in this request
In my opinion the HTTP PATCH method is the proper way to handle this scenario (with json-patch or json-merge-patch) on a URL like this: /products/{ID}
The harder one: allow the user to add/modify an existing product or create one:
create the product if it doesn't exist in the DB
don't delete any information from the existing product that was not provided in this request (same behaviour as in the first case)
I'm struggling with designing the REST endpoint for this second use case. I have a few options, but none of them fits the REST principles perfectly for me:
a) Add a custom HTTP header to the endpoint designed for the first case (PATCH) to let the caller control the "not found" behaviour, e.g. create-entity-when-not-exists: true/false -- but in my opinion PATCH shouldn't be used for creating resources.
b) Design a new endpoint using PUT with a special header "preserve-not-provided-data" -- this, on the other hand, violates PUT semantics for me, because PUT is a create-or-replace method, not create-or-update.
c) Create a PATCH endpoint for the /products URL (without {ID} at the end) -- in this case we are updating the whole collection (resource) of products, so if a product exists we can update it, or create a new one if it doesn't.
For now solution c) looks fine to me, with one exception: if in the future we want to support batch operations (for both use cases 1 and 2) we would want to use the /products URL, and that would conflict with the URL from solution c).
What do you think? Do you have any other ideas?
PUT and PATCH have differing message semantics, but the core context ("remote authoring") is the same. In both cases, the client request is "Please, server, make your representation of this resource match my local copy".
For example, I GET a JSON document from the server. I make local edits to it. Now I want to "save" my changes on the server. If the document is modest in size, I might just send the entire revised document over the network. If the document is very large and my changes are modest, then I might send just the patch instead.
If you imagine using HTTP to publish edits of HTML web pages to a server, then you've got the right frame of reference. There's not a lot of practical difference between "please patch the title of your copy of the document" and "here is a complete new copy of the document, with my edit to the title". The bytes on disk are going to be the same in either case.
Given that, it would be very odd if those two methods for publishing a new revision of the document were to have vastly different side effects.
Your third approach, based on modifying /products, is potentially fine for both your individual and batch cases. The server gets the new representation of /products (or the patch document describing the changes), decides whether to accept the changes, and if so computes what it needs to do to its own database to make things work.
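As a rough illustration (the SKUs and field names below are invented, not from the question), a merge-style body sent to PATCH /products could look like this:

    # Hypothetical body for PATCH /products, read as "make your collection
    # representation include these products".
    patch_body = {
        "products": [
            {"sku": "ABC-1", "price": 19.99},         # exists -> fields merged in
            {"sku": "NEW-7", "name": "New product"},  # missing -> server creates it
        ]
    }
    # Per product, the server decides whether to INSERT or UPDATE; fields not
    # mentioned are left alone, satisfying "don't delete what wasn't provided".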
Note:
A PUT request applied to the target resource can have side effects on other resources.
The HTTP specification is relatively strict about what the message means, but offers the server a lot of leeway in how it behaves in response.

"Why" does Backbone NOT have a save (PUT/POST) method for its collections - is it unRESTful?

I asked a question a while back, i.e. "How do I save an entire Backbone collection?". However, what intrigues me is why a save method is not offered at all. Is it unRESTful to save (PUT/POST) entire collections, or is it just uncommon to do so in REST-land?
GET: /MySite/Collections - allowed by collection.fetch()
POST: /MySite/Collections - for the model(s) in the collection to be POSTed when calling model.save()
PUT: /MySite/Collections/{id} - for the model(s) to be updated individually
GET: /MySite/Collections/{id} - to fetch an individual model through model.fetch()
So why not allow POST/PUT of an entire collection of resources? It is convenient sometimes, and although one can wrap/hack out some code using collection.toJSON, why not include it? I'm just curious about its absence and the rationale behind it. When a framework lacks a capability, it usually implies the practice is considered bad programming/design and was deliberately left out. Is saving an entire collection 'bad practice'?
The Wikipedia article about REST does mention CRUD verbs for collections.
But, in my opinion, a Collection is not a resource; it is not an entity and it has no state. It is, instead, a bunch of resources. If there were an UPDATE command for a Collection, it would be nothing but multiple UPDATE commands over multiple Models. Having the possibility of multiple UPDATE commands in a single request would be helpful, but I think this is not a job for the REST implementation.
There would also be problems of ambiguity: for example, in a Collection that contains some already-saved Models (with an id and so on) and others that are not yet saved, what would a POST command mean? Or an UPDATE command?
Not to mention the increase in complexity on the server side, where, if this Collection REST support were taken as standard, we would have to do twice the work to cover all the cases.
Summarizing: I don't see any case where the need for a Collection REST command can't be met with the existing, simpler, Model-only REST commands, so keeping things as simple as possible is, I think, a good habit.

Is it RESTful to create complex objects in a single POST?

I have a form where users create Person records. Each Person can have several attributes -- height, weight, etc. But they can also have lists of associated data such as interests, favorite movies, etc.
I have a single form where all this data is collected. To me it seems like I should POST all of this data in a single request. But is that RESTful? My reading suggests that the interests, favorite movies and other lists should be added in separate POST requests. But I don't think that makes sense because one of those could fail and then there would be a partial insert of the Person and it may be missing their interests or favorite movies.
I'd say that it depends entirely upon the addressability and uniqueness of the dependent data.
If the user-associated data is dependent upon the user (i.e. a plain attribute, such as a string representing the (unvalidated) name of a movie), then it should be included in the POST that creates the user representation. However, if the data is independent of the user (it can be addressed on its own, e.g. a reference to a movie from a set of movies), then it should be added independently.
The reasoning is that adding references as part of the original POST implies transactionality: if another user deletes the "favorite" movie between when it is chosen on the client and when the POST goes through, the user creation will (by that design) fail. If the "favorite" movie is not a reference but just an attribute, there is nothing to fail on (attributes presumably cannot be invalidated by a third party).
And again, this goes very much to your specific needs, but I fall on the side of allowing the partial inserts and indicating the failures. The proper way to handle this sort of thing if you really want to not allow partial inserts is to just implement transactions on the back end; they're the only way to truly handle a situation where a critical associated resource is removed mid-process.
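A hypothetical payload (field names invented for illustration) that follows the split described above, with plain attributes inline and independently addressable resources sent as references:

    # POST /people body: interests are dependent data (just strings), while
    # favorite movies are independent resources referenced by URI.
    new_person = {
        "name": "Ada",
        "height_cm": 170,
        "interests": ["hiking", "chess"],
        "favorite_movies": ["/movies/42", "/movies/1138"],
    }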
The real restriction in REST is that for a modifiable resource that you GET, you can also turn around and PUT the same representation back to change its state. Or POST. Since it's reasonable (and very common) to GET resources that are big bundles of other things, it's perfectly reasonable to PUT big bundles of things, too.
Think of resources in REST very broadly. They can map one-to-one with database rows, but they don't have to. An addressable resource can embed other addressable resources, or include links to them. As long as you're honoring your representation and the semantics of the underlying protocol's operations (i.e. HTTP GET POST PUT etc.), REST doesn't have anything to say about other design considerations that might make your life easier or harder.
I don't think there is a problem with adding all the data in one request, as long as it's inherently associated with the main resource (i.e. the person in your case). If interests, favorite movies, etc. are resources of their own, they should also be handled as such.

RESTful API design: should unchangeable data in an update (PUT) be optional?

I'm in the middle of implementing a RESTful API, and I am unsure about the 'community accepted' behavior for the presence of data that cannot change. For example, in my API there is a 'file' resource that, when created, contains a number of fields that cannot be modified after creation, such as the file's binary data and some metadata associated with it. Additionally, the 'file' can have a written description and associated tags.
My question concerns doing an update to one of these 'file' resources. A GET of a specific 'file' will return all the metadata, description & tags associated with the file, plus the file's binary data. Should a PUT of a specific 'file' resource include the 'read only' fields? I realize that it can be coded either way: a) include the read only fields in the PUT data and then verify they match the original (or issue an error), or b) ignore the presence of the read only fields in the PUT data because they can't change, never issuing an error if they don't match or are missing because the logic ignores them.
Seems like it could go either way and be acceptable. The second method of ignoring the read only fields can be more compact, because the API client can skip sending that read only data if they want; which seems good for people who know what they are doing...
Personally, I find both ways acceptable... however, if I were you, I'd opt for option A (check the read-only fields to ensure they are not changed, and throw an error otherwise). Depending on the scope of your project, you cannot assume the consumers know your RESTful web service in depth, because most of them don't read the documentation or WADL, even the experienced ones. :)
If you don't provide immediate feedback to consumers that certain fields are read-only, they will falsely assume that your web service has taken care of all the changes they made without double-checking, or, once they discover the "inconsistent" updates, they will complain to others that your web service is buggy.
If the read-only fields don't match the original values, you can approach this in two different ways (sketched below):
Don't process the request. Send a 409 Conflict code and a specific error message.
Process the request, send a 200 OK, and a message stating that changes made to the read-only fields were ignored.
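A framework-neutral sketch of those two options; the field names in READ_ONLY_FIELDS and the dict shapes are assumptions for illustration:

    READ_ONLY_FIELDS = {"binary_data", "created_at", "content_type"}

    def apply_put(stored, incoming, strict=True):
        """Return an (http_status, body) pair for a PUT against a stored resource."""
        changed = {
            f for f in READ_ONLY_FIELDS
            if f in incoming and incoming[f] != stored.get(f)
        }
        if changed and strict:
            # Option 1: refuse the request outright.
            return 409, {"error": "read-only fields modified: %s" % sorted(changed)}
        # Option 2: accept the request but ignore the read-only fields.
        updated = dict(stored)
        updated.update({k: v for k, v in incoming.items() if k not in READ_ONLY_FIELDS})
        body = {"resource": updated}
        if changed:
            body["warning"] = "ignored read-only fields: %s" % sorted(changed)
        return 200, body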
Unless the read-only data makes up a significant portion of the data (to the extreme that transmitting the read-only data has a noticeable impact on network traffic and response times), you should write the service to accept the read only fields in the PUT and check them for changes. It's just simpler to have the same data going in and out.
Look at it this way: You could make inclusion of the read only fields optional in the PUT, but you will still have to / should write the code in the service to check that any read only fields that were received contain the expected values. You have to write the read only checking either way.
Prohibiting the read-only fields in the PUT is a bad idea because it will require the clients to strip away fields they received from you in the GET. This requires that the client get more intimately involved with your data and semantics than they really need to be. The clients will consider this a headache, an unnecessary complication, and downright mean of you to add to their burden. Taking data received from your GET, modifying one field of interest, and sending it back to you with a PUT should be a brain-dead simple round-trip for the client. Don't complicate things when you don't have to.

REST returning an object graph

I am new to the REST architectural design; however, I think I have the basics of it covered.
I have a problem with returning objects from a RESTful call. If I make a request such as http://localhost/{type A}/{id}, I will get back an instance of A from the database with the specified id.
My question is what happens when A contains a collection of B objects? At the moment the XML I generate returns A with a collection of B objects inside of it. As you can imagine if the B type has a collection of C objects then the XML returned will end up being a quite complicated object graph.
I can't be 100% sure, but this feels like it goes against RESTful principles: the XML for A should return the fields etc. for A, as well as a collection of URIs pointing to the B's that it owns.
Sorry if this is a bit confusing, I can try to elaborate more. This seems like a relatively basic question, however I can't decide which approach is "more" RESTful.
Cheers,
Aidos
One essential RESTful principle is that everything has a URI.
You have URIs like this:
/A/ and /A/id/ to get a list of A's and a specific A. The A response includes the IDs of its B's.
/B/ and /B/id/ to get a list of B's and a specific B. The B response includes the IDs of its C's.
/C/ and /C/id/ to get a list of C's and a specific C.
You can, through a series of queries, rebuild the A-B-C structure. You get the A, then get the relevant B's. When getting a B, you get the various C's that are referenced.
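A sketch of that series of queries from the client side, using the requests library; the base URL and the b_ids/c_ids field names are assumptions about what the flat responses contain:

    import requests

    BASE = "http://localhost"

    def fetch_a_with_children(a_id):
        """Rebuild the A-B-C structure by following the IDs in each response."""
        a = requests.get("%s/A/%s/" % (BASE, a_id)).json()
        a["bs"] = []
        for b_id in a.get("b_ids", []):
            b = requests.get("%s/B/%s/" % (BASE, b_id)).json()
            b["cs"] = [requests.get("%s/C/%s/" % (BASE, c_id)).json()
                       for c_id in b.get("c_ids", [])]
            a["bs"].append(b)
        return a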
Edit
Nothing prevents you from returning more.
For example, you might have the following kinds of URIs.
/flat/A/id/, /flat/B/id/ and /flat/C/id/ to return "flat" (i.e., no depth) structures.
/deep/A/id/, /deep/B/id/ and /deep/C/id/ to return structures with complete depth.
The /deep/A/id/ would be the entire structure, in a big, nested XML document. Fine for clients that can handle it. /flat/A/id/ would be just the top level in a flat document. Best for clients that can't handle depth.
There's nothing saying your REST interface can't be relational.
/bookstore/{bookstoreID}
/bookstore/{bookstoreID}/books
/book/{bookID}
Basically, you have a 1:1 correspondence with your DB schema, except for many-to-many relations forming child lists. For example, /bookstore/657/books should return a list of book IDs or URLs. Then, if you want a specific book's data, you can invoke the third URL.
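One possible shape for the responses behind those three URLs (the values are invented for illustration); the nested books collection returns links rather than embedded book documents:

    bookstore_657 = {"id": 657, "name": "Corner Books",
                     "books_url": "/bookstore/657/books"}
    bookstore_657_books = ["/book/1201", "/book/1305", "/book/2210"]
    book_1201 = {"id": 1201, "title": "RESTful Web Services",
                 "stores": ["/bookstore/657"]}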
This is just off the top of my head, please debate the merits.
Make a flat Universe that you expose to the world.
Even when I use SOAP, which can easily handle hierarchical object graphs to whatever depth, I flatten the graph and link everything with simple IDs (you could even use your database IDs, although ideally you don't want to expose your PKs to the world).
Your object universe inside your app is not necessarily the same one you expose to the world. Let A have children and let B have children, but there's no need to reflect that in the REST request URLs.
Why flatten? Because then you can do things like fetch the objects later by ID, or send them in bursts (both the same case, more or less)... And better than all that, the request URIs don't change when the object hierarchy changes (object 37252 is always the same, even when it's been reclassed).
Edit: Well, you asked for it... Here's the architecture I ended up using:
package: server - contains the superclasses that are shared between the front-end server and the back-end server
package: frontEndServer - contains a Server interface which the front-end server must adhere to. The interface is nice because if you decide to change from SOAP to a straight Web client (that uses JSON or whatever, as well), you've got the interface all laid out. It also contains all the implementations for the frontEnd classes that will be tossed to the client, and all the logic for the interaction between classes except how to talk to the client.
package: backEndServer - contains a Server interface which the back-end server will adhere to. An example of a Server implementation would be one that talks to a MySql DB or one that talks to an XML DB, but the Server interface is neutral. This package also contains all the classes that the implementations of the Server interface use to get work done, and all the logic for the backend except for persistence.
Then you have implementation packages for each of these... which include stuff like how to persist for the back end and how to talk to the client for the front end. The front-end implementation package might know, for instance, that a user has logged in, whereas the frontEndServer just knows that it has to implement methods for creating users and logging in.
After beginning to write this up I realize that it would take a while more to describe everything, but here you have the gist of it.