Transactional batch processing with OData - entity-framework

Working with Web API OData, I have $batch processing working, however, the persistence to the database is not Transactional. If I include multiple requests in a Changeset in my request, and one of those items fails, the other still completes, because each separate call to the controller has it's own DbContext.
for example, if I submit a Batch with two change sets:
Batch 1
- ChangeSet 1
- - Patch valid object
- - Patch invalid object
- End Changeset 1
- ChangeSet 2
- - Insert Valid Object
- End ChangeSet 2
End Batch
I would expect that the first valid patch would be rolled back, as the change set could not be completed in its entirety, however, since each call gets its own DbContext, the first Patch is committed, the second is not, and the insert is committed.
Is there a standard way to support transactions through a $batch request with OData?

The theory: let's make sure we're talking about the same thing.
In practice: addressing the problem as far as I can (no definitive answer) could.
In practice, really (update): a canonical way of implementing the backend-specific parts.
Wait, does it solve my problem?: let's not forget that the implementation (3) is bound by the specification (1).
Alternatively: the usual "do you really need it?" (no definitive answer).
The theory
For the record, here is what the OData spec has to say about it (emphasis mine):
All operations in a change set represent a single change unit so a
service MUST successfully process and apply all the requests in the
change set or else apply none of them. It is up to the service
implementation to define rollback semantics to undo any requests
within a change set that may have been applied before another request
in that same change set failed and thereby apply this all-or-nothing
requirement. The service MAY execute the requests within a change set
in any order and MAY return the responses to the individual requests
in any order. (...)
http://docs.oasis-open.org/odata/odata/v4.0/cos01/part1-protocol/odata-v4.0-cos01-part1-protocol.html#_Toc372793753
This is V4, which barely updates V3 regarding Batch Requests, so the same considerations apply for V3 services AFAIK.
To understand this, you need a tiny bit of background:
Batch requests are sets of ordered requests and change sets.
Change Sets themselves are atomic units of work consisting of sets of unordered requests, though these requests can only be Data Modification requests (POST, PUT, PATCH, DELETE, but not GET) or Action Invocation requests.
You might raise an eyebrow at the fact that requests within change sets are unordered, and quite frankly I don't have proper rationale to offer. The examples in the spec clearly show requests referencing each other, which would imply that an order in which to process them must be deduced. In reality, my guess is that change sets must really be thought of as single requests themselves (hence the atomic requirement) that are parsed together and are possibly collapsed into a single backend operation (depending on the backend of course). In most SQL databases, we can reasonably start a transaction and apply each subrequest in a certain order defined by their interdependencies, but for some other backends, it may be required that these subrequests be mangled and made sense of together before sending any change to the platters. This would explain why they aren't required to be applied in order (the very notion might not make sense to some backends).
An implication of that interpretation is that all your change sets must be logically consistent on their own; for example you can't have a PUT and a PATCH that touch the same properties on the same change set. This would be ambiguous. It's thus the responsibility of the client to merge operations as efficiently as possible before sending the requests to the server. This should always be possible.
(I'd love someone to confirm this.) I'm now fairly confident that this is the correct interpretation.
While this may seem like an obvious good practice, it's not generally how people think of batch processing. I stress again that all of this applies to requests within change sets, not requests and change sets within batch requests (which are ordered and work pretty much as you'd expect, minus their non-atomic/non-transactional nature).
In practice
To come back to your question, which is specific to ASP.NET Web API, it seems they claim full support of OData batch requests. More information here. It also seems true that, as you say, a new controller instance is created for each subrequest (well, I take your word for it), which in turn brings a new context and breaks the atomicity requirement. So, who's right?
Well, as you rightly point out too, if you're going to have SaveChanges calls in your handlers, no amount of framework hackery will help much. It looks like you're supposed to handle these subrequests yourself with the considerations I outlined above (looking out for inconsistent change sets). Quite obviously, you'd need to (1) detect that you're processing a subrequest that is part of a changeset (so that you can conditionally commit) and (2) keep state between invocations.
Update: See next section for how to do (2) while keeping controllers oblivious to the functionality (no need for (1)). The next two paragraphs may still be of interest if you want more context on the problems that are solved by the HttpMessageHandler solution.
I don't know if you can detect whether you're in a changeset or not (1) with the current APIs they provide. I don't know if you can force ASP.NET to keep the controller alive for (2) either. What you could do for the latter however (if you can't keep it alive) is to keep a reference to the context elsewhere (for example in some kind of session state Request.Properties) and reuse it conditionally (update: or unconditionally if you manage the transaction at a higher-level, see below). I realize this is probably not as helpful as you might have hoped, but at least now you should have the right questions to direct to the developers/documentation writers of your implementation.
Dangerously rambling: Instead of conditionally calling SaveChanges, you could conditionally create and terminate a TransactionScope for every changeset. This doesn't remove the need for (1) or (2), just another way of doing things. It sort of follows that the framework could technically implement this automatically (as long as the same controller instance can be reused), but without knowing the internals enough I wouldn't revisit my statement that the framework doesn't have enough to go with to do everything itself just yet. After all, the semantics of TransactionScope might be too specific, irrelevant or even undesired for certain backends.
Update: This is indeed what the proper way of doing things look like. The next section shows a sample implementation that uses the Entity Framework explicit transaction API instead of TransactionScope, but this has the same end-result. Although I feel there are ways to make a generic Entity Framework implementation, currently ASP.NET doesn't provide any EF-specific functionality so you're required to implement this yourself. If you ever extract your code to make it reusable, please do share it outside of the ASP.NET project if you can (or convince the ASP.NET team that they should include it in their tree).
In practice, really (update)
See snow_FFFFFF's helpful answer, which references a sample project.
To put it in context of this answer, it shows how to use an HttpMessageHandler to implement requirement #2 which I outlined above (keeping state between invocations of the controllers within a single request). This works by hooking at a higher-level than controllers, and splitting the request in multiple "subrequests", all the while keeping state oblivious to the controllers (the transactions) and even exposing state to the controllers (the Entity Framework context, in this case via HttpRequestMessage.Properties). The controllers happily process each subrequest without knowing if they are normal requests, part of a batch request, or even part of a changeset. All they need to do is use the Entity Framework context in the properties of the request instead of using their own.
Note that you actually have a lot of built-in support to achieve this. This implementation builds on top of the DefaultODataBatchHandler, which builds on top of the ODataBatchHandler code, which builds on top of the HttpBatchHandler code, which is an HttpMessageHandler. The relevant requests are explicitly routed to that handler using Routes.MapODataServiceRoute.
How does this implementation map to the theory? Quite well, actually. You can see that each subrequest is either sent to be processed as-is by the relevant controller if it is an "operation" (normal request), or handled by more specific code if it is a changeset. At this level, they are processed in order, but not atomically.
The changeset handling code however does indeed wrap each of its own subrequests in a transaction (one transaction for each changeset). While the code could at this point try to figure out the order in which to execute statements in the transaction by looking at the Content-ID headers of each subrequest to build a dependency graph, this implementation takes the more straightforward approach of requiring the client to order these subrequests in the right order itself, which is fair enough.
Wait, does it solve my problem?
If you can wrap all your operations in a single changeset, then yes, the request will be transactional. If you can't, then you must modify this implementation so that it wraps the entire batch in a single transaction. While the specification supposedly doesn't preclude this, there are obvious performance considerations to take into account. You could also add a non-standard HTTP header to flag whether you want the batch request to be transactional or not, and have your implementation act accordingly.
In any case, this wouldn't be standard, and you couldn't count on it if you ever wanted to use other OData servers in an interoperable manner. To fix this, you need to argue for optional atomic batch requests to the OData committee at OASIS.
Alternatively
If you can't find a way to branch code when processing a changeset, or you can't convince the developers to provide you with a way to do so, or you can't keep changeset-specific state in any satisfactory way, then it looks like you must [you may alternatively want to] expose a brand new HTTP resource with semantics specific to the operation you need to perform.
You probably know this, and this is likely what you're trying to avoid, but this involves using DTOs (Data Transfer Objects) to populate with the data in the requests. You then interpret these DTOs to manipulate your entities within a single handler controller action and hence with full control over the atomicity of the resulting operations.
Note that some people actually prefer this approach (more process-oriented, less data-oriented), though it can be very difficult to model. There's no right answer, it always depends on the domain and use-cases, and it's easy to fall into traps that would make your API not very RESTful. It's the art of API design. Unrelated: The same remarks can be said about data modeling, which some people actually find harder. YMMV.
Summary
There's a few approaches to explore, some information to retrieve from the developers a canonical implementation technique to use, an opportunity to create a generic Entity Framework implementation, and a non-generic alternative.
It'd be nice if you could update this thread as you gather answers elsewhere (well, if you feel motivated enough) and with what you eventually decide to do, as it seems like something many people would love to have some kind of definitive guidance for.
Good luck ;).

The following link shows the Web API OData implementation that is required to process the changeset in transactions. You are correct that the default batch handler does not do this for you:
http://aspnet.codeplex.com/SourceControl/latest#Samples/WebApi/OData/v3/ODataEFBatchSample/
UPDATE
The original link seems to have gone away - the following link includes similar logic (and for v4) for transactional processing:
https://damienbod.com/2014/08/14/web-api-odata-v4-batching-part-10/

I just learned about the TransactionScope class.
A few points I'd like to make before moving on:
The original question posited that each controller called received
its own DbContext. This isn't true. The DBContext lifetime is scoped to the entire request. Review Dependency lifetime in ASP.NET Core for
further details.
I suspect that the original poster is experiencing issues because each sub-request in the batch is invoking its assigned controller method,
and each method is calling DbContext.SaveChanges() individually - causing that unit of work to be committed.
My understanding of the
question comes from the basis of performing database transactions,
i.e. (pseudo-code for SQL expected):
BEGIN TRAN
DO SOMETHING
DO MORE THINGS
DO EVEN MORE THINGS
IF FAILURES OCCURRED ROLLBACK EVERYTHING. OTHERWISE, COMMIT EVERYTHING.
This is a reasonable request that I would expect OData to be able to perform with a single POST operation to [base URL]/odata/$batch.
Batch Execution Order Concerns
For our purposes, we may or may not necessarily care what order work is done against the DbContext. We definitely care though that the work being performed is being done as part of a batch. We want it to all succeed or all be rolled back in the database(s) being updated.
If you are using old school Web API (in other words, prior to ASP.Net Core), then your batch handler class is likely the DefaultHttpBatchHandler class. According to Microsoft's documentation here Introducing batch support in Web API and Web API OData, batch transactions using the DefaultHttpBatchHandler in OData are sequential by default. It has an ExecutionOrder property that can be set to change this behavior so that operations are performed concurrently.
If you are using ASP.Net Core, it appears we have two options:
If your batch operation is using the "old school" payload format, it
appears that batch operations are performed sequentially by default
(assuming I am interpreting the source code correctly).
ASP.Net Core provides a new option though. A new
DefaultODataBatchHandler has replaced the old DefaultHttpBatchHandler
class. Support for ExecutionOrder has been dropped in favor of adopting a
model where metadata in the payload communicates whether what batch
operations should happen in order and/or can be executed concurrently. To
utilize this feature, the request payload Content-Type is changed to
application/json and the payload itself is in JSON format (see below). Flow
control is established within the payload by adding dependency and group
directives to control execution order so that batch requests can be split
into multiple groups of individual requests that can be executed
asynchronously and in parallel where no dependencies exist, or in order where
dependencies do exist. We can take advantage of this fact and simply create
"Id", "atomicityGroup", and "DependsOn" tags in out payload to ensure
operations are performed in the appropriate order.
Transaction Control
As stated previously, your code is likely using either the DefaultHttpBatchHandler class or the DefaultODataBatchHandler class. In either case, these classes aren't sealed and we can easily derive from them to wrap the work being done in a TransactionScope. By default, if no unhandled exceptions occurred within the scope, the transaction is committed when it is disposed. Otherwise, it is rolled back:
/// <summary>
/// An OData Batch Handler derived from <see cref="DefaultODataBatchHandler"/> that wraps the work being done
/// in a <see cref="TransactionScope"/> so that if any errors occur, the entire unit of work is rolled back.
/// </summary>
public class TransactionedODataBatchHandler : DefaultODataBatchHandler
{
public override async Task ProcessBatchAsync(HttpContext context, RequestDelegate nextHandler)
{
using (TransactionScope scope = new TransactionScope( TransactionScopeAsyncFlowOption.Enabled))
{
await base.ProcessBatchAsync(context, nextHandler);
}
}
}
Just replace your default class with an instance of this one and you are good to go!
routeBuilder.MapODataServiceRoute("ODataRoutes", "odata",
modelBuilder.GetEdmModel(app.ApplicationServices),
new TransactionedODataBatchHandler());
Controlling Execution Order in the ASP.Net Core POST to Batch Payload
The payload for the ASP.Net Core batch handler uses "Id", "atomicityGroup", and "DependsOn" tags to control execution order of the sub-requests. We also gain a benefit in that the boundary parameter on the Content-Type header is not necessary as it was in prior versions:
HEADER
Content-Type: application/json
BODY
{
"requests": [
{
"method": "POST",
"id": "PIG1",
"url": "http://localhost:50548/odata/DoSomeWork",
"headers": {
"content-type": "application/json; odata.metadata=minimal; odata.streaming=true",
"odata-version": "4.0"
},
"body": { "message": "Went to market and had roast beef" }
},
{
"method": "POST",
"id": "PIG2",
"dependsOn": [ "PIG1" ],
"url": "http://localhost:50548/odata/DoSomeWork",
"headers": {
"content-type": "application/json; odata.metadata=minimal; odata.streaming=true",
"odata-version": "4.0"
},
"body": { "message": "Stayed home, stared longingly at the roast beef, and remained famished" }
},
{
"method": "POST",
"id": "PIG3",
"dependsOn": [ "PIG2" ],
"url": "http://localhost:50548/odata/DoSomeWork",
"headers": {
"content-type": "application/json; odata.metadata=minimal; odata.streaming=true",
"odata-version": "4.0"
},
"body": { "message": "Did not play nice with the others and did his own thing" }
},
{
"method": "POST",
"id": "TEnd",
"dependsOn": [ "PIG1", "PIG2", "PIG3" ],
"url": "http://localhost:50548/odata/HuffAndPuff",
"headers": {
"content-type": "application/json; odata.metadata=minimal; odata.streaming=true",
"odata-version": "4.0"
}
}
]
}
And that's pretty much it. With the batch operations being wrapped in a TransactionScope, if anything fails, it all gets rolled back.

There should only be one DbContext for the OData batch request. Both WCF Data Services and HTTP Web API support OData batch scenario and handle it in a transactional manner. You can check this example: http://blogs.msdn.com/b/webdev/archive/2013/11/01/introducing-batch-support-in-web-api-and-web-api-odata.aspx

I used the same from V3 of the Odata Samples, I saw that my transaction.rollback was called but the data did not rollback. something is lacking but I can't work out what. this may be an issue with having each Odata call using save changes and do they actually see the transaction as in scope. we may need a guru from the Entity Framework team to help solve this one.

Although very detailed, Mitselpliks answer didn't work right out of the box for me, as the transaction got rolled back after each request. To commit the transaction and apply it to the database, it is necessary to call scope.Complete() before disposing/leaving the using scope.
The next issue was that, although everything now ran in a transaction, Exceptions/failures of a single request didn't cause the request or transaction to fail, the status code of the batch response was still a 200 and all other changes were still applied.
As there was no direct way to read the status code of individual requests of the batch request directly from HttpContext, I had to also overload the ExecuteRequestMessageAsync Method and check the results there. So my final code, which causes transactions to be applied when everything is successful and else to rollback everyhing, looks like this:
public class TransactionODataBatchHandler : DefaultODataBatchHandler
{
protected bool Failed { get; set; }
public override async Task ProcessBatchAsync(HttpContext context, RequestDelegate nextHandler)
{
using (var scope = new TransactionScope(
TransactionScopeOption.Required,
new TransactionOptions { IsolationLevel = IsolationLevel.ReadCommitted },
TransactionScopeAsyncFlowOption.Enabled))
{
Failed = false;
await base.ProcessBatchAsync(context, nextHandler);
if (!Failed)
{
scope.Complete();
}
}
}
public override async Task<IList<ODataBatchResponseItem>> ExecuteRequestMessagesAsync(IEnumerable<ODataBatchRequestItem> requests, RequestDelegate handler)
{
var responses = await base.ExecuteRequestMessagesAsync(requests, handler);
Failed = responses.Cast<OperationResponseItem>().Any(r => !r.Context.Response.IsSuccessStatusCode());
return responses;
}
}

Related

How do PUT, POST or PATCH request differ ultimately?

The data, being sent over a PUT/PATCH/POST request, ultimately ends up in the database.
Now whether we are inserting a new resource or updating or modifying an existing one - it all depends upon the database operation being carried out.
Even if we send a POST and ultimately perform just an update in the database, it does not impact anywhere at all, isn't it?!
Hence, do they actually differ - apart from a purely conceptual point of view?
Hence, do they actually differ - apart from a purely conceptual point of view?
The semantics differ - what the messages mean, and what general purpose components are allowed to assume is going on.
The meanings are just those described by the references listed in the HTTP method registry. Today, that means that POST and PUT are described by HTTP semantics; PATCH is described by RFC 5789.
Loosely: PUT means that the request content is a proposed replacement for the current representation of some resource -- it's the method we would use to upload or replace a single web page if we were using the HTTP protocol to do that.
PATCH means that the request content is a patch document - which is to say a proposed edit to the current representation of some resource. So instead of sending the entire HTML document with PUT, you might instead just send a fix to the spelling error in the title element.
POST is... well, POST is everything else.
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding 2009
The POST method has the fewest constraints on its semantics (which is why we can use it for anything), but the consequence is that the HTTP application itself has to be very conservative with it.
Webber 2011 includes a good discussion of the implementations of the fact that HTTP is an application protocol.
Now whether we are inserting a new resource or updating or modifying an existing one - it all depends upon the database operation being carried out.
The HTTP method tells us what the request means - it doesn't place any constraints on how your implementation works.
See Fielding, 2002:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property (money, BTW, is considered property for the sake of this definition).
The HTTP methods are part of the "transfer of documents over a network" domain - ie they are part of the facade that allows us to pretend that the bank/book store/cat video archive you are implementing is just another "web site".
It is about the intent of the sender and from my perspective it has a different behaviour on the server side.
in a nutshell:
POST : creates new data entry on the server (especially with REST)
PUT : updates full data entry on the server (REST) or it creates a new data entry (non REST). The difference to a POST request is that the client specifies the target location on the server.
PATCH : the client requests a partial update (Id and partial data of entry are given). The difference to PUT is that the client sends not the full data back to the server this can save bandwidth.
In general you can use any HTTP request to store data (GET, HEAD, DELETE...) but it is common practice to use POST, PUT, and PATCH for specific and standardized scenarios. Because every developer can understand it later
They are slightly different and they bind to different concepts of REST API (which is based on HTTP)
Just imagine that you have some Booking entity. And yo perform the following actions with resources:
POST - creates a new resource. And it is not idempotent - if you sent the same request twice -> two bookings will be stored. The third time - will create the third one. You are updating your DB with every request.
PUT - updates the full representation of a resource. It means - it replaces the booking full object with a new one. And it is idempotent - you could send a request ten times result will be the same (if a resource wasn't changed between your calls)
PATCH - updates some part of the resource. For example, your booking entity has a date property -> you update only this property. For example, replace the existing date with new date which is sent at the request.
However, for either of the above - who is deciding whether it is going to be a new resource creation or updating/modifying an existing one, it's the database operation or something equivalent to that which takes care of persistence
You are mixing different things.
The persistence layer and UI layer are two different things.
The general pattern used at Java - Model View Controller.
REST API has its own concept. And the DB layer has its own purpose. Keep in mind that separating the work of your application into layers is exactly high cohesion - when code is narrow-focused and does one thing and does it well.
Mainly at the answer, I posted some concepts for REST.
The main decision about what the application should do - create the new entity or update is a developer. And this kind of decision is usually done through the service layer. There are many additional factors that could be done, like transactions support, performing filtering of the data from DB, pagination, etc.
Also, it depends on how the DB layer is implemented. If JPA with HIbernate is used or with JDBC template, custom queries execution...

REST collections of collections, promoting and demoting

We have a resource called tasks. With the following endpoints, we can list all tasks, and create tasks:
GET: /tasks
POST: /tasks
The problem is that tasks can contain sub-tasks, and we wish to embed the functionality to both support promoting sub-tasks to their own tasks and demoting tasks to sub-tasks of another task.
The naïve approach should be to delete the sub-task and create it again as a task and vice-versa, but we find this a tad too naive.
The second option we have come up with is to support endpoints such as the following, where {id} is the ID of the task, and {sid} is the ID of the sub-task:
POST: /tasks/{id}/add/{sid}
POST: /tasks/{id}/upgrade/{sid}
The first endpoint should add a task to another task, thereby deleting the first task and adding a copy as a sub-task to the second task. The second endpoint should upgrade a sub-task to a task from another task, thereby deleting the sub-task and adding a copy as a task.
So our question is if this would be considered good practice, or if there are some other options which we have not considered?
It's important to highlight that REST doesn't care about the URI spelling at all. However, once the central piece of the REST architectural style is the resource, it makes sense that the URIs uses nouns instead of verbs.
The REST architectural style is protocol independent, but it's commonly implemented on the top of the HTTP protocol. In this approach, the HTTP method is meant to express the operation that will be performed over the resource identified by a given URI.
Your question doesn't state what your domain looks like. But let's consider you have something like:
+-------------------+
| Task |
+-------------------+
|- id: Long |
|- title: String |
|- parent: Task |
|- children: Task[] |
+-------------------+
Creating a task could be expressed with a POST request:
POST /tasks
Host: example.org
Content-Type: application/json
{
"title": "Prepare dinner"
}
HTTP/1.1 201 Created
Location: /tasks/1
Creating a subtask could expressed with a POST request indicating the parentId in the payload:
POST /tasks
Host: example.org
Content-Type: application/json
{
"parentId": 1
"title": "Order a pizza"
}
HTTP/1.1 201 Created
Location: /tasks/2
Promoting a subtask to a task could be achieved by setting the parentId to null using a PATCH request:
PATCH /tasks/2
Host: example.org
Content-Type: application/json-patch+json
[
{
"op": "replace", "path": "/parentId", "value": null
}
]
HTTP/1.1 204 No Content
Updating a task to become a subtask could be achieved by setting the parentId using a PATCH request:
PATCH /tasks/2
Host: example.org
Content-Type: application/json-patch+json
[
{
"op": "replace", "path": "/parentId", "value": 1
}
]
HTTP/1.1 204 No Content
The above examples use JSON Patch (application/json-patch+json) as payload for the PATCH. Alternatively, you could consider JSON Merge Patch (application/merge-patch+json). I won't go through the differences of such formats once it will make this answer overly long. But you can click the links above and check them by yourself.
For handling errors, refer to the error handling section of the RFC 5789, the document that defines the PATCH method.
I also appreciate that some APIs avoid using PATCH for a number of reasons that are out of the scope of this answer. If that's the case, you could consider PUT instead. In this approach, the state of the resource will be replaced with the state defined in the representation sent in the request payload.
Alternatively you could have an endpoint such as /tasks/{id}/parent, supporting PUT. In this approach, the client just have to send a representation of the parent (such as an id) instead of a full representation of the task.
Pick the approach that suits you best.
REST is an abbrevation for the transfer of a resource' current state in a representation format supported by the client. As REST is a generalization of the World Wide Web, the same concepts you use for the Web also apply to applications following the REST architecture model. So the basic question resolves around: How would you design your system to work on Web pages and apply the same steps to your design.
As Cassio already mentioned that the spelling of URIs is not of importance to clients, as a URI remains a URI and you can't deduce from a URI whether the system is "RESTful" or not. Actually, there is no such thing as "RESTful" or "RESTless" URIs as a URI, as stated above, remains a URI. It is probably better to think of a URI as a key used for caching responses in a local or intermediary cache. Fielding made support of caching even a constraint and not just an option.
As REST is not a protocol but just an architectural style, you are basically not obligated to implement it stringent, though you will certainly miss out on the promised benefits, such as the decoupling of clients from APIs, the freedom to evolve the server side without breaking clients and making clients in general more robust against changes. Fielding even stated that applications that violate the constraints he put on REST shouldn't be termed REST at all, to avoid confusions.
I don't agree with user991710 in one of his comments that REST can't be used to represent processes, I agree however that REST shouldn't attempt to create new verbs either. As mentioned before, REST is about transfering a resources current state in a supported representation format. If a task can be represented as a stream of data then it can be presented to a client as well. The self descriptiveness of messages, i.e. by using a media type that defines the rules on how to process the data payload, guarantees that a client will be able to make sense of the data. I.e. your browser is able to render images, play videos, show text and similar stuff as it knows how to interpret the data stream accordingly. Support for such fields can be added via special addons or plugins, that may be loaded dynamically during runtime without even having to restart the application.
If I had to design tasks for Web pages, I'd initially return a pagable view of existing tasks, probably rendered in a table, with a link to a page that is able to create new tasks or a link in each row to update or delete an existing task. The create-new and update pages may use the same HTML form to enter or update a tasks information. If a task should be assigned as sub-task to an other task, you may be able to either select a parent task from a given set of tasks, i.e. in a drop-down text field, or enter the URI of the parent in a dedicated field. On submitting the task, HTTP method POST would be used that will perform the operation based on the servers own semantic, so wheter a new resource is created or one or multiple ones are updated is up to the server. The quintesence here is, that everything a client may be able to do, is taught by the server. The form to add new or update existing tasks just informs the client which fields are supported (or expected) by the server. There is actually no external, out-of-band knowledge needed by a client in order to perform a request. A server has still the option to reject incomplete payloads or payloads that violate certain constraints and a client will know by receiving an appropriate error message.
As clients shouldn't parse or interpret URIs, they will use certain text describing what the URIs do. In a browser a picture of a dustbin may be used as symbol for deletion, while a pencil may be used as symbol for updating an existing entry (or the like). As humans we quickly realize what these links are intended for whithout having to read the characters of the actual URI.
The last two paragraphs just summed up how the interaction model on a common Web page may look like. The same concepts should be used in a REST architecture as well. The content you exchange may vary more, ideally with standardized representation formats, compared to the big brother, the Web, though still the concepts of links are used to reference from one resource to other resources and server teaches clients what it needs apply here. Compared to HTML, however, you can utilize more HTTP methods than just POST and GET. A deletion of a resource's representation would probably make more sense with a DELETE method than with a POST method. Also, for updates either PUT or PATCH may make more sense, depending on the situation. In contrast of using sensible pictures for hinting users on what this link might be good for, link-relation names should be used that hint clients about the purpose of the link. Such link-relation names should be standardized, or at least express common sense such as expressed though special ontologies, or use absolute URIs as extension mechanism.
You may add dedicated links to show a collection of all tasks based on certain filters, i.e. all tasks or only parent tasks and what not, so a client can iterate through the task it is interested in. On selecting a task you may add links to all sub-tasks a client can invoke to learn what these subtasks are, if interested. Here the URIs of the tasks may remain unchanged, which supports caching by default. Note that I didn't mention anything in regards to how you should store the data in your backend as this is just some implementation details a client is usually not interested in. This is just a perfect case where a domain model does not necessarily have to be similar to a resources state representation.
In regards to which HTTP operation to perform a task promotion or demotion is basically some design choice that also may depend on the representation format the payload is exchanged for. As HTTP only supports POST and GET, such a change could be requested via POST, other media types might support other HTTP methods, such as PUT, which, according to its specification (last paragraph page 27), is allowed to have side-effects, or PATCH, which needs to be applied atomically - either fully or not at all. PATCH actually is similar to patching software, where a set of instructions should be applied on some target code. William Durand summarized this concept in a quite cited blog-post. However, he later on updated his blog post to mention that via application/merge-patch+json a more natural and intuitive way of "updating" resources could be used. In regards to form support, there exist a couple of drafts such as hal-forms, halo-json (halform), Ion or hydra that offer such definition but lack currently wider library support or a final standard. So a bit of work needs yet to put into an accepted standard.
To wrap this post up, you should design your API like you'd design your system to work for Web pages, apply the same concepts you use for interaction on typical Web pages, such as links and forms, and used them in your responses. Which HTTP operation you perform the actual promotion or demotion with, may be dependent on the actual media type and what HTTP operations it supports. POST should work in all cases, PUT is probably more intuitive to typical updates done in Web forms while patching would require your client to actually calculate the steps needed to transform the current resource' representation to the desired one (if you use application/json-patch+json in particular). application/merge-patch+json may be applicable as well, which would simplify the patching notably, as the current data contained in the form could just be sent to the server and default rules would decide whether a field got removed, added or updated.
you can divide your route with :
POST : /tasks => for create or add tasks
PUT/PATCH : /tasks => for upgrading your tasks
or even be more specific about it you can pass :id to params. I think the best approach is whenever you need to change delete update or get a specific object you must use :id in your params
PUT/PATCH/GET : /tasks/:id

REST delete multiple items in the batch

I need to delete multiple items by id in the batch however HTTP DELETE does not support a body payload.
Work around options:
1. #DELETE /path/abc?itemId=1&itemId=2&itemId=3 on the server side it will be parsed as List of ids and DELETE operation will be performed on each item.
2. #POST /path/abc including JSON payload containing all ids. { ids: [1, 2, 3] }
How bad this is and which option is preferable? Any alternatives?
Update: Please note that performance is a key here, it is not an option execute delete operation for each individual id.
Along the years, many people fell in doubt about it, as we can see in the related questions here aside. It seems that the accepted answers ranges from "for sure do it" to "its clearly mistreating the protocol". Since many questions was sent years ago, let's dig into the HTTP 1.1 specification from June 2014 (RFC 7231), for better understanding of what's clearly discouraged or not.
The first proposed workaround:
First, about resources and the URI itself on Section 2:
The target of an HTTP request is called a "resource". HTTP does not limit the nature of a resource; it merely defines an interface that might be used to interact with resources. Each resource is identified by a Uniform Resource Identifier (URI).
Based on it, some may argue that since HTTP does not limite the nature of a resource, a URI containing more than one id would be possible. I personally believe it's a matter of interpretation here.
About your first proposed workaround (DELETE '/path/abc?itemId=1&itemId=2&itemId=3') we can conclude that it's something discouraged if you think about a resource as a single document in your entity collection while being good to go if you think about a resource as the entity collection itself.
The second proposed workaround:
About your second proposed workaround (POST '/path/abc' with body: { ids: [1, 2, 3] }), using POST method for deletion could be misleading. The section Section 4.3.3 says about POST:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics. For example, POST is used for the following functions (among others): Providing a block of data, such as the fields entered into an HTML form, to a data-handling process; Posting a message to a bulletin board, newsgroup, mailing list, blog, or similar group of articles; Creating a new resource that has yet to be identified by the origin server; and Appending data to a resource's existing representation(s).
While there's some space for interpretation about "among others" functions for POST, it clearly conflicts with the fact that we have the method DELETE for resources removal, as we can see in Section 4.1:
The DELETE method removes all current representations of the target resource.
So I personally strongly discourage the use of POST to delete resources.
An alternative workaround:
Inspired on your second workaround, we'd suggest one more:
DELETE '/path/abc' with body: { ids: [1, 2, 3] }
It's almost the same as proposed in the workaround two but instead using the correct HTTP method for deletion. Here, we arrive to the confusion about using an entity body in a DELETE request. There are many people out there stating that it isn't valid, but let's stick with the Section 4.3.5 of the specification:
A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request.
So, we can conclude that the specification doesn't prevent DELETE from having a body payload. Unfortunately some existing implementations could reject the request... But how is this affecting us today?
It's hard to be 100% sure, but a modern request made with fetch just doesn't allow body for GET and HEAD. It's what the Fetch Standard states at Section 5.3 on Item 34:
If either body exists and is non-null or inputBody is non-null, and request’s method is GET or HEAD, then throw a TypeError.
And we can confirm it's implemented in the same way for the fetch pollyfill at line 342.
Final thoughts:
Since the alternative workaround with DELETE and a body payload is let viable by the HTTP specification and is supported by all modern browsers with fetch and since IE10 with the polyfill, I recommend this way to do batch deletes in a valid and full working way.
It's important to understand that the HTTP methods operate in the domain of "transferring documents across a network", and not in your own custom domain.
Your resource model is not your domain model is not your data model.
Alternative spelling: the REST API is a facade to make your domain look like a web site.
Behind the facade, the implementation can do what it likes, subject to the consideration that if the implementation does not comply with the semantics described by the messages, then it (and not the client) are responsible for any damages caused by the discrepancy.
DELETE /path/abc?itemId=1&itemId=2&itemId=3
So that HTTP request says specifically "Apply the delete semantics to the document described by /path/abc?itemId=1&itemId=2&itemId=3". The fact that this document is a composite of three different items in your durable store, that each need to be removed independently, is an implementation details. Part of the point of REST is that clients are insulated from precisely this sort of knowledge.
However, and I feel like this is where many people get lost, the metadata returned by the response to that delete request tells the client nothing about resources with different identifiers.
As far as the client is concerned, /path/abc is a distinct identifier from /path/abc?itemId=1&itemId=2&itemId=3. So if the client did a GET of /path/abc, and received a representation that includes itemIds 1, 2, 3; and then submits the delete you describe, it will still have within its own cache the representation that includes /path/abc after the delete succeeds.
This may, or may not, be what you want. If you are doing REST (via HTTP), it's the sort of thing you ought to be thinking about in your design.
POST /path/abc
some-useful-payload
This method tells the client that we are making some (possibly unsafe) change to /path/abc, and if it succeeds then the previous representation needs to be invalidated. The client should repeat its earlier GET /path/abc request to refresh its prior representation rather than using any earlier invalidated copy.
But as before, it doesn't affect the cached copies of other resources
/path/abc/1
/path/abc/2
/path/abc/3
All of these are still going to be sitting there in the cache, even though they have been "deleted".
To be completely fair, a lot of people don't care, because they aren't thinking about clients caching the data they get from the web server. And you can add metadata to the responses sent by the web server to communicate to the client (and intermediate components) that the representations don't support caching, or that the results can be cached but they must be revalidated with each use.
Again: Your resource model is not your domain model is not your data model. A REST API is a different way of thinking about what's going on, and the REST architectural style is tuned to solve a particular problem, and therefore may not be a good fit for the simpler problem you are trying to solve.
That doesn’t mean that I think everyone should design their own systems according to the REST architectural style. REST is intended for long-lived network-based applications that span multiple organizations. If you don’t see a need for the constraints, then don’t use them. That’s fine with me as long as you don’t call the result a REST API. I have no problem with systems that are true to their own architectural style. -- Fielding, 2008

How to use HTTP request methods correctly in REST APIs?

In my search to understand the HTTP request methods, particularly PUT since I have apparently been using it all wrong, I have stumbled upon a lot of quotes stating “PUT doesn’t belong in REST APIs” and “One should only use GET and POST in modern APIs”.
This just makes me wonder, why would you not use PUT or PATCH or DELETE etc. in REST APIs? Isn’t it what they are there for? To be used because they help with semantics and structure etc.?
Could it have something to do with e.g. the methods receiving the request mainly being methods that directs the data to other methods like the database methods where they are then handled? I’ve used PUT when wanting to update a document, but never has it overwritten a document, even though I only sent part of the data to it.
There is an example of this below, using Express and MongoDB (showing a put method in Express).
app.put('/users/:id', (req, res, next) => {
let id = req.params.id;
userService.getOneUser({_id: ObjectID(id)}).then((user) => {
let updatedData = req.body;
userService.updateUser(user._id, updatedData)
.then((doc) => res.json(doc))
.catch((err) => errorHandler(err, res, next));
}).catch((err) => errorHandler(err, res, next));
})
Basically my question is: In relating to the above statements, how do you use these methods correctly in REST APIs and when do you use them?
EDIT: Example on two references:
PUT doesn't belong in REST API
Only use GET and POST - see third comment on question
I have stumbled upon a lot of quotes stating “PUT doesn’t belong in REST APIs” and “One should only use GET and POST in modern APIs”.
I wouldn't put much stock in those quotes - they imply a lack of understanding about REST, HTTP, and how everything is supposed to fit together.
I'd suggest starting with Jim Webber, who does have a good understanding.
HTTP is an application protocol, whose application domain is the transfer of documents over a network.
PUT, PATCH, DELETE are all perfectly fine methods for describing changes to a document. I GET a document from you, I use my favorite HTTP aware editor/library to make changes to the document, I send you a request describing my changes to the document, you get to figure out what to do on your end.
This just makes me wonder, why would you not use PUT or PATCH or DELETE etc. in REST APIs? Isn’t it what they are there for? To be used because they help with semantics and structure etc.?
One reason that you might not: your media type of choice is HTML -- HTML has native support for links (GET) and forms (GET/POST), but not a lot of direct support in the way of involving the other methods in the flow. You can use code-on-demand, for those clients that support it.
Could it have something to do with e.g. the methods receiving the request mainly being methods that directs the data to other methods like the database methods where they are then handled? I’ve used PUT when wanting to update a document, but never has it overwritten a document, even though I only sent part of the data to it.
An important thing to understand about HTTP methods is that they describe semantics, not implementations. Here's Fielding writing in 2002:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property
In the specific case of PUT, there's an extra hint at the implication of the semantics
A successful PUT of a given representation would suggest that a subsequent GET on that same target resource will result in an equivalent representation being sent in a 200 (OK) response. However, there is no guarantee that such a state change will be observable....
I think Triynko raises a good point:
URIs identified in most modern apps ARE NOT resources to be replaced, updated, etc. They're not documents. They're PROCEDURES being called.
If you are creating a procedure focused API, as opposed to a resource focused API, then there is a decent probability that PUT/PATCH/DELETE aren't actually providing benefits that justify the extra complexity.
One hint that you are procedure focused: how much attention are you paying to cache invalidation? Part of the point of accepting a "uniform interface" constraint is that you want the capabilities that generic components can provide, and in HTTP, caching is a big deal.

How to model a CANCEL action in a RESTful way?

We are currently in the process of wrangling smaller services from our monoliths. Our domain is very similar to a ticketing system. We have decided to start with the cancellation process of the domain.
Our cancel service has as simple endpoint "Cancel" which takes in the id of the ticket. Internally, we retrieve the id, perform some operations related to cancel on it and update the state of the entity in the store. From the store's perspective the only difference between a cancelled ticket and a live ticket are a few properties.
From what I have read, PATCH seems to be the correct verb to be used in this case, as am updating only a simple property in the resource.
PATCH /api/tickets/{id}
Payload {isCancelled: true}
But isCancelled is not an actual property in the entity. Is it fair to send properties in the payload that are not part of the entity or should I think of some other form of modeling this request? I would not want to send the entire entity as part of the payload, since it is large.
I have considered creating a new resource CancelledTickets, but in our domain we would never have the need to a GET on cancelled tickets. Hence stayed away from having to create a new resource.
Exposing the GET interface of a resource is not compulsory.
For example, use
PUT /api/tickets/{id}/actions/cancel
to submit the cancellation request. I choose PUT since there would be no more than one cancellation request in effect.
Hope it be helpful.
Take a look what exactly is RESTful way. No matter if you send PATCH request with isCancelled as payload or even DELETE if you want tickets to disappear. It's still RESTful.
Your move depends on your needs. As you said
I have considered creating a new resource CancelledTickets, but in our
domain we would never have the need to a GET on cancelled tickets.
I would just send DELETE. You don't have to remove it physically. If it's possible to un-cancel, then implement isCancelled mechanism. It's just question of taste.
REST is basically a generalization of the browser based Web. Any concepts you apply for the Web can also be applied to REST.
So, how would you design a cancel activity in a Web page? You'd probably have a table row with certain activities like edit and delete outlined with icons and mouse-over text that on clicking invoke a URI on the server and lead to a followup state. You are not that much interested how the URI of that button might look like or if a PATCH or DELETE command is invoked in the back. You are just interested that the request is processed.
The same holds true if you want to perform the same via REST. Instead of images that hint the user that an edit or cancel activity is performed on an entry, a meaningful link-relation name should be used to hint the client about the possiblities. In your case this might be something like reserve new tickets, edit reservation or cancel reservation. Each link relation name is accompanied by a URL the client can invoke if he wants to perform one of the activities. The exact characters of the URI is not of importance here to the client also. Which method to invoke might either be provided already in the response (as further accompanying field) or via the media type the response was processed for. If neither the media type nor an accompanying field gives a hint on which HTTP operation to use an OPTIONS request may be issued on the URI beforehand. The rule of thumb here is, the server should teach a client on how to achieve something in the end.
By following such a concept you decouple a client from the API and make it robust to changes. Instead of a client generating a URI to invoke the client is fed by the server with possible URIs to invoke. If the server ever changes its iternal URI structure a client using one of the provided URIs will still be able to invoke the service as it simply used one of the URIs provided by the server. Which one to use is determined by analyzing the link relation name that hints the client when to invoke such URI. As mentioned above, such link relation names need to be defined somewhere. But this is exactly what Fielding claimed back in 2008 by:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. (Source)
Which HTTP operation to choose for canceling a ticket/reservation may depend on your desing. While some of the answers recommended DELETE RFC 7231 states that only the association between the URI and the resource is removed and no gurantee is given that the actual resource is also being removed here as well. If you design a system where a cancelation might be undone, then DELETE is not the right choice for you as the mapping of the URI to the resource should not exist further after handling a DELETE request. If you, however, consider a cancelation to also lead to a removal of the reservation then you MAY use DELETE.
If you model your resource in a way that maintains the state as property within the resource, PATCHing the resource might be a valid option. However, simply sending something like state=canceled is probably not enough here as PATCH is a calculation of steps done by the client in order to transform a certain resource (or multipe resources) into a desired target state. JSON Patch might give a clue on how this might be done. A further note needs to be done on the atomicy requirement PATCH has. Either all of the instructions succeed or none at all.
As also PUT was mentioned in one of the other answers. PUT has the semantics of replacing the current representation available at the given URI with the one given in the request' payload. The server is further allowed to either reject the content or transform it to a more suitable representation and also affect other resources as well, i.e. if they mimic a version history of the resource.
If neither of the above mentioned operations really satisfies your needs you should use POST as this is the all-purpose, swiss-army-knife toolkit of HTTP. While this operation is usually used to create new resources, it isn't limited to it. It should be used in any situation where the semantics of the other operations aren't applicable. According to the HTTP specification
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
This is basically the get-free-out-of-jail card. Here you can literally process anything at the server according to your own rules. If you want to cancel or pause something, just do it.
I highly discourage to use GET for creating, altering or canceling/removing something. GET is a safe operation and gurantees any invoking client that it wont alter any state for the invoked resource on the server. Note that it might have minor side-effects, i.e. logging, though the actual state should be unaffected by an invocation. This is a property Web crawler rely on. They will simply invoke any URI via GET and learn the content of the received response. And I assume you don't want Google (or any other crawler) to cancel all of your reservations, or do you?
As mentioned above, which HTTP operation you should use depends on your design. DELETE should only be used if you are also going to remove the representation, eventhough the spec does not necessarily require this, but once the URI mapping to the resource is gone, you basically have no way to invoke this resource further (unless you have created a further URI mapping first, of course). If you designed your resource to keep the state within a property I'd probably go for PATCH but in general I'd basically opt for POST here as here you have all the choices at your hands.
I would suggest having a state resource.
This keeps things RESTful. You have your HTTP method acting as the verb. The state part of the URI is a noun. Then your request body is simple and consistent with the URI.
The only thing I don't like about this is the value of state requires documentation, meaning what states are there? This could be solved via the API by providing possible states on a meta resource or part of the ticket body in meta object.
PUT /api/tickets/:id/state
{state: "canceled"}
GET /api/meta/tickets/state
// returns
[
"canceled",
...
]
GET /api/tickets/:id
{
id: ...
meta: {
states: [
"canceled",
...
]
}
}
For modelling CANCEL Action in Restful way :
Suppose we have to delete a note in DB by providing noteId(Note's ID) and Note is a pojo
1] At controller :
#DeleteMapping(value="/delete/{noteId}")
public ResponseEntity<Note> deleteNote( #PathVariable Long noteId)
{
noteServiceImpl.deleteNote(noteId);
return ResponseEntity.ok().build();
}
2]Service Layer :
#Service
public class NoteServiceImpl {
#Autowired
private NotesRepository notesDao;
public void deleteNote(Long id) {
notesDao.delete(id);
}
}
3] Repository layer :
#Repository
public interface NotesRepository extends CrudRepository<Note, Long> {
}
and in 4] postman : http://localhost:8080/delete/1
So we have deleted note Id 1 from DB by CANCEL Action