Are all CMIS operations required to be synchronous? - specifications

The CMIS protocol specification does not even contain the words "synchronous" or "asynchronous".
I guess that implicitly every server-side operation is synchronous?
Are there counter-examples?
Scenario: I create a document, then immediately I list the directory, and the file does not appear yet.
Is this scenario illegal?
(not a client problem, the client waits for HTTP response before going to the next instruction)

CMIS is synchronous, and doesn't have any semantics for transactions spanning multiple requests. Thus, each successful mutation request (POST, PUT, DELETE in the AtomPub binding) has an immediate effect on subsequent requests.
The scenario you depicted is not illegal, it's simply impossible to achieve under CMIS. If you need asynchronous operations, you need to decouple the client from CMIS and introduce an intermediate layer that e.g.:
caches the pre-execution state of a CMIS mutation
returns the cached value until the mutation is successful.
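A minimal sketch of such an intermediate layer, assuming the repository is reached through a plain callable that lists a folder. The class name, method names, and the in-memory dictionary standing in for the repository are all hypothetical; nothing here is part of CMIS or of any CMIS client library.

```python
from typing import Callable, Dict, List


class DeferredMutationLayer:
    """Hypothetical layer sitting between a client and a CMIS repository."""

    def __init__(self, list_folder: Callable[[str], List[str]]) -> None:
        # list_folder is assumed to be a synchronous read against the repository.
        self._list_folder = list_folder
        self._snapshots: Dict[str, List[str]] = {}

    def begin_mutation(self, folder_id: str) -> None:
        # Cache the pre-execution state before the mutation is sent.
        self._snapshots[folder_id] = list(self._list_folder(folder_id))

    def complete_mutation(self, folder_id: str) -> None:
        # The mutation succeeded, so stop serving the cached state.
        self._snapshots.pop(folder_id, None)

    def children(self, folder_id: str) -> List[str]:
        # Serve the snapshot while a mutation is pending, the live state otherwise.
        if folder_id in self._snapshots:
            return list(self._snapshots[folder_id])
        return list(self._list_folder(folder_id))
```

With this in place the client sees the old listing between begin_mutation and complete_mutation, which is the "asynchronous" behavior the repository itself does not provide.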

In general, CMIS operations are supposed to be synchronous. I don't know of any counterexamples. Asynchronous operation on the server would make it almost impossible to create applications.
Assuming you are using AtomPub under the covers, you might want to check http://bitworking.org/projects/atom/rfc5023.html . Creating a resource returns its URI.
So I guess what you are seeing is either a bug in the implementation, or the client is relying on implementation details not covered by the spec. The Alfresco repository, for example, indexes asynchronously by default (Solr). Hence, I think it should be possible to come up with code that demonstrates the behavior you observe by backing the listing with a search.

Related

Which HTTP method to use to build a REST API to perform following operation?

I am looking for a REST API to do following
Search based on parameters sent, if results found, return the results.
If no results found, create a record based on search parameters sent.
Can this be accomplished by creating one single API or 2 separate APIs are required?
I would expect this to be handled by a single request to a single resource.
Which HTTP method to use
This depends on the semantics of what is going on - we care about what the messages mean, rather than how the message handlers are implemented.
The key idea is the uniform interface constraint in REST; because we have a common understanding of what HTTP methods mean, general purpose connectors in the HTTP application can do useful work (for example, returning cached responses to a request without forwarding them to the origin server).
Thus, when trying to choose which HTTP method is appropriate, we can consider the implications the choice has on general purpose components (like web caches, browsers, crawlers, and so on).
GET announces that the meaning of the request is effectively read only; because of this, general purpose components know that they can dispatch this request at any time (for instance, a user agent might dispatch a GET request before the user decides to follow the link, to make the experience faster).
That's fine when you intend the request to provide the client with a copy of your search results, and the fact that you might end up making changes to server local state is just an implementation detail.
On the other hand, if the client is trying to edit the results of a particular search (but sometimes the server doesn't need to change anything), then GET isn't appropriate, and you should use POST.
A way to think about the difference is to consider what action you want to be taken when an intermediate cache holds a response from an earlier copy of "the same" request. If you want the cache to reuse the response, GET is the best; on the other hand, if you want the cache to throw away the old response (and possibly store the new one), then you should be using POST.
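To make the cache distinction concrete, here is a toy intermediary, offered only as a sketch. ToyCache and the origin callable are hypothetical, and the sketch ignores freshness lifetimes, validators, and response status, all of which a real HTTP cache takes into account.

```python
from typing import Callable, Dict


class ToyCache:
    """Hypothetical cache that keys stored responses by URI only."""

    def __init__(self, origin: Callable[[str, str], str]) -> None:
        self._origin = origin            # forwards (method, uri) to the origin server
        self._stored: Dict[str, str] = {}

    def request(self, method: str, uri: str) -> str:
        if method == "GET":
            # Safe request: reuse the stored response when there is one.
            if uri not in self._stored:
                self._stored[uri] = self._origin(method, uri)
            return self._stored[uri]
        # Unsafe request: always forward it, then drop the stored response.
        response = self._origin(method, uri)
        self._stored.pop(uri, None)
        return response
```

A second GET for the same URI never reaches the origin, while a POST to that URI both reaches the origin and forces the next GET to be forwarded.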

Using GET verb to update in rest api?

I know the use of http verbs is based on standard specification. But my question if I use "GET" for update operations and write a code logic to update, does it create issues in any scenario? Apart from the standard, what else could be the reason to use these verbs for a specific purpose only?
my question if I use "GET" for update operations and write a code logic to update, does it create issues in any scenario?
Yes.
A simple example - suppose the network between the client and the server is unreliable; specifically, for a time, HTTP responses are being lost. A general purpose component (like a web proxy) might time out, and then, noticing that the method token of the request is GET, resend the request a second/third/fourth time, with your server performing its update on every GET request.
Let us further assume that these multiple update operations lead to an undesirable outcome; where do we properly affix blame?
Second example: you send someone a copy of the link to the update operation, so that they can send you a request at the appropriate time. But suppose you send that link to them in an email, and the email client recognizes the uri and (as a performance optimization) pre-fetches the link, triggering your update operation too early. Where do we properly affix the blame?
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property -- Fielding, 2002
In these, and other examples, blame is correctly affixed to your server, because GET has a standardized meaning which includes the constraint that the semantics of the request are safe.
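The lost-response example can be simulated in a few lines. This is a sketch under stated assumptions: UpdatingServer and forward_with_retries are made-up names, the "proxy" retries only GET and HEAD to keep things short, and lost responses are modeled by a simple counter rather than a real network.

```python
class UpdatingServer:
    """Hypothetical server that performs an update on every request to /update."""

    def __init__(self) -> None:
        self.updates = 0

    def handle(self, method: str, uri: str) -> str:
        if uri == "/update":
            self.updates += 1  # state changes no matter which method was used
        return "200 OK"


def forward_with_retries(server: UpdatingServer, method: str, uri: str,
                         responses_lost: int, max_retries: int = 3) -> str:
    """Hypothetical proxy: resends safe requests whose responses were lost."""
    attempts = 0
    while True:
        response = server.handle(method, uri)  # the server did the work...
        attempts += 1
        if attempts > responses_lost:
            return response                    # ...and this response got through
        # The response was lost. Only a safe method is resent automatically.
        if method not in ("GET", "HEAD") or attempts > max_retries:
            return "504 Gateway Timeout"
```

If two responses are lost, a GET to /update runs the update three times, while a POST runs it once and surfaces the timeout to the client.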
That's not to say that you can't have side effects when handling a GET request; "hit counters" are almost as old as the web itself. You have a lot of freedom in your implementation; so long as you respect the uniform interface, there won't be too much trouble.
Experience report: one of our internal tools uses GET requests to trigger scheduling; in our carefully controlled context (which is not web scale), we get away with it, and have for a very long time.
To borrow your language, there are certainly scenarios that would give us problems; but given our controls we manage to avoid them.
I wouldn't like our chances, though, if requests started coming in from outside of our carefully controlled context.
I think it's a decent question. You're asking a hypothetical: is there any value to doing the right thing, other than that we agree to use GET for fetching? That is, is there value beyond the fact that it's "semantically nice"? A similar question in HTML might be: "Is it ok to use a <div> with an onclick instead of a <button>?" (the answer is no).
There certainly is. Clients, servers and intermediates all change their behavior depending on what method is used. Even if your server can process GET for updates, and you build a client that uses this, your browser might still get confused.
If you are interested in this subject, don't ask on a forum; read the spec. The HTTP specification tells you what clients, servers and proxies should do when they encounter certain methods, statuses and headers.
Start at RFC 7231.

RESTful GET that can change system state?

I am building a service that caches short lived data objects. The object creation process is expensive, so this service will cache them and other downstream applications can use them without managing their lifecycle.
The plan is that downstream apps will make a GET call to this service to fetch object. If the object is expired, the service will fetch a new object, cache it, and return it to the caller.
And Here is my dilemma - This way the GET operation changes system state, by fetching new object. I am sure that I am violating REST principles here, or is there a valid justification for this? Should I just change the method to POST?
This way the GET operation changes system state, by fetching new object. I am sure that I am violating REST principles here, or is there a valid justification for this? Should I just change the method to POST?
The short version: this is fine.
Longer version: REST says that our resources have common "uniform" semantics - the meaning of messages doesn't depend on which resource you reference.
In the case of HTTP, the primary discriminator for requests is the method. For the GET method, the semantics are (currently) described by RFC 7231. GET is explicitly identified as being safe:
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource.
If you, the server, need to change a bunch of your private information stores to compute the current representation of the resource, that's an implementation detail hidden behind the HTTP facade. You can do what you like.
Fundamentally, what safe means is that anybody who knows the identifier can ask for the current representation of the resource at any time. This allows browsers to retry requests when the network is flaky, or spiders to crawl around indexing the net, knowing that their requests do no harm (or more precisely, that the fault of any harms inflicted by those requests is properly assigned to the server).
If that's OK, then GET is a perfectly "RESTful" method to use for these requests.
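As a rough illustration of the service described in the question, the refresh can live entirely behind the read path. ExpiringObjectService, its create callback, and the injectable clock are assumptions made for this sketch, not anything taken from the question.

```python
import time
from typing import Callable, Dict, Tuple


class ExpiringObjectService:
    """Hypothetical cache of short-lived objects, refreshed on read."""

    def __init__(self, create: Callable[[str], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._create = create    # the expensive creation process
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, key: str) -> str:
        """What a GET handler would call: return the current object."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or entry[1] <= now:
            # Missing or expired: rebuild and store. The caller never asked
            # for this state change; it is an implementation detail.
            entry = (self._create(key), now + self._ttl)
            self._entries[key] = entry
        return entry[0]
```

From the caller's side every call is just "give me the current representation", so repeating or prefetching the request does no harm.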

http verb to invoke services / methods

what is the best practice in defining web service that represent a non REST command invocation?
For REST, basically we use POST to create new record(s), GET to retrieve record(s), PUT to update record(s) and DELETE to remove record(s). Which http verb should I use if I just want to invoke some other non resource function, for example - to flush a system cache?
Which http verb should I use if I just want to invoke some other non resource function, for example - to flush a system cache?
HTTP request methods should be selected based on their alignment with their defined semantics.
The most important of these is to determine whether or not the semantics are safe:
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource. Likewise, reasonable use of a safe method is not expected to cause any harm, loss of property, or unusual burden on the origin server.
Advertising a safe link invites consumers to pre-fetch a link, or to crawl and index the representation found there.
If having Google and a billion of her closest friends flushing your system cache sounds expensive, then you probably don't want a safe method.
PUT and PATCH are unsafe methods with semantics of manipulating representations. So if you had a schema that described a system cache, a client might PUT a representation of an empty cache in the entity body, and send that to you, whereupon you could flush the cache. You could achieve a similar thing with PATCH, sending a list of the edits needed to make the change.
Both of these rely on the illusion that your resources are just documents. I GET a representation of your resource, I load that into my generic editor, make changes, send my edited representation back to you, and then it's up to you to manifest those changes (or not).
But they aren't required -- if you want to simply document that
PUT /df1645af-f960-4cc4-ad7a-d0ddd29903f8
Content-Length: 0
has the side effect of flushing the system cache, the REST Police aren't going to come after you just because you've introduced a bit of RPC into the mix.
Of course, if you were doing this with HTML, then your only choice would be POST.
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
Which is to say, POST is always an option.
It's easy enough to imagine the flow -- you load up some bookmark, follow a system cache link, find a form with a flush cache button, and submit. The browser would create the request as described in the form elements and submit it.
So that's going to be fine too. And the REST police won't bother you for that, because that protocol is actually RESTful.
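A sketch of how a server might dispatch on the method for such a resource. CacheResource, its handle signature, and the status codes chosen are assumptions for illustration; the URI is the one from the example request above.

```python
from typing import Dict, Tuple

CACHE_URI = "/df1645af-f960-4cc4-ad7a-d0ddd29903f8"


class CacheResource:
    """Hypothetical resource whose unsafe methods flush a system cache."""

    def __init__(self) -> None:
        self.cache: Dict[str, str] = {}

    def handle(self, method: str, uri: str, body: bytes = b"") -> Tuple[int, str]:
        if uri != CACHE_URI:
            return 404, ""
        if method == "GET":
            # Safe: report on the cache, never flush it.
            return 200, f"{len(self.cache)} entries"
        if method in ("PUT", "POST"):
            # Unsafe: the documented side effect is a flush.
            self.cache.clear()
            return 204, ""
        return 405, ""
```

Crawlers and prefetchers only ever send the GET, so they can look at the resource as often as they like without flushing anything.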
If those answers are unsatisfying, or if you are just surveying the space to know what options are available, you can review the HTTP Method Registry. To be honest, I've never found anything there I've wanted to use. But if WebDAV is your jam....

Using Rest to trigger remote procedures

I am working on an application that is primarily concerned with polling data from a third party, then mapping and persisting relevant information to a customer. It does this on a fixed interval, but the requirements also specify a way to manually fire off the process. This manual call will only receive a status on the execution of the process, but is not interested in the data created by the process.
I implemented the call using HTTP, but my implementation has been identified as not RESTful. Having now done the research into what that means, I completely concur, as I was using the URI to define verbs instead of nouns.
Is it possible to make this RESTful? Is it okay to make a resource that is extremely transient, like this?
POST /rpc/{process}?param1=....&....
Or should the process itself be considered permanent, and the triggering of it a temporary update? I doubt this, as this is not an idempotent action.
PUT /rpc/{process}?run=true&param1......
Is there a correct way to handle this and be called RESTful? If it is not, do I use SOAP or continue to use the simple http structure and live with the fact that it is not RESTful?
Use querystrings for filtering
POST/PUT depends on what you want your process to do (PUT for idempotent actions, POST for non idempotent actions).
Keep it simple; I do not see any issues with the way you have resolved this.
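The idempotent versus non-idempotent split can be shown with a small sketch. ProcessTriggers and its two methods are hypothetical stand-ins for the POST and PUT handlers; the process name and parameters are made up.

```python
from typing import Dict, List, Tuple


class ProcessTriggers:
    """Hypothetical handlers contrasting POST and PUT for a process resource."""

    def __init__(self) -> None:
        self.runs: List[Tuple[str, Dict[str, str]]] = []
        self.settings: Dict[str, Dict[str, str]] = {}

    def post(self, process: str, params: Dict[str, str]) -> int:
        # Non-idempotent: every request starts another run.
        self.runs.append((process, dict(params)))
        return len(self.runs)

    def put(self, process: str, settings: Dict[str, str]) -> None:
        # Idempotent: repeating the request leaves the same state behind.
        self.settings[process] = dict(settings)
```

Sending the same POST twice yields two runs, while sending the same PUT twice leaves one stored setting, which is why a manual "fire off the process now" call fits POST.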