RPC vs. RESTful endpoints

I've spent more time than I'd like to admit trying to understand the differences between RPC and RESTful endpoints. In fact, after going down this rabbit hole, I'm not even sure I fully understand what RPC is anymore. And I'm not alone. I see countless messages repeatedly asking for clarification.
Here's my understanding so far:
RPC and RESTful endpoints are largely conceptual in nature; there are no standards.
Both are ways of invoking a function on a remote server.
In the case of RESTful endpoints, we map HTTP methods to CRUD operations on a resource.
In the case of RPC, we're concerned with actions on an object.
However, in both cases, we're addressing remote method invocation.
I've read some folks say REST shouldn't be stateful, and if it is, it's not RESTful. But aren't most RESTful endpoints using state via sessions? I mean, I'm not going to let just anyone delete data without being logged in.
And I've also read that endpoints like domain(.)com/login aren't RESTful because they're actions. This seems to make sense, but since I'm not writing these functions on my server any differently than I would a simple RESTful function, does it really matter what we call them -- RESTful vs. RPC?
I truly just don't understand what I'm missing, and no posts on Stack Overflow so far appear to address this confusion.
Hoping someone can provide some insight using the above as a guide for understanding. Thanks in advance.

This is a great question. There is a huge difference between the REST and RPC approaches; I'll try to explain it.
RPC simply lets you ask a remote computer to do some job. It can be any job, and it can have any inputs and outputs. When you design the API, you just define what to call this job (the operation) and which inputs and outputs it will have.
Step 1. Just RPC
For example, you could have an operation foo with one input and two outputs, like this (I will use JSight API notation because it supports both RPC and REST API types):
JSIGHT 0.3

URL /rpc-api
  Protocol json-rpc-2.0
  Method foo
    Params
      {
        "input-1": 123 // some integer
      }
    Result
      {
        "output-1": 123, // some other integer
        "output-2": "some string"
      }
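To make the call itself concrete, here is a minimal, hedged sketch (not part of the original answer) of what invoking foo could look like from a Java client, assuming the /rpc-api endpoint above lives at a hypothetical host api.example.com; the jsonrpc, method, params and id fields are the standard JSON-RPC 2.0 envelope:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RpcCallSketch {
    public static void main(String[] args) throws Exception {
        // The operation name and its inputs travel in the body; the URL and the
        // HTTP method stay the same for every operation of the RPC API.
        String body = "{\"jsonrpc\": \"2.0\", \"method\": \"foo\", "
                + "\"params\": {\"input-1\": 123}, \"id\": 1}";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/rpc-api")) // hypothetical host
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Expected shape: {"jsonrpc": "2.0", "result": {"output-1": ..., "output-2": ...}, "id": 1}
        System.out.println(response.body());
    }
}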
When you design your RPC API, you can define as many such operations as you want.
So, what's the problem with this? Why do we need REST if we already have RPC?
The problems are the following:
When you look at the definition of the method foo, you do not have the slightest idea what it does.
When you design RPC APIs you have absolute freedom, and freedom can often be a bad thing, because people think in different ways. You might think that your API is easy and elegant, but other people may not.
Step 2. Standardized RPC
If you are an experienced API designer and you use the RPC style, you will try to follow some pattern to avoid the two faults mentioned above. For example, let's imagine that your foo method returns a customer entity. Then you would never call it foo; you would call it getCustomer, like this:
JSIGHT 0.3

URL /rpc-api
  Protocol json-rpc-2.0
  Method getCustomer
    Params
      {
        "input-1": 123 // some integer
      }
    Result
      {
        "output-1": 123, // some other integer
        "output-2": "some string"
      }
OK! Now, just by looking at this description, you can tell that this method returns a customer, that the parameter input-1 is some filter (most likely an ID), and that output-1 and output-2 are some customer properties.
Look: just by using a standard way of naming the method, we made our API much easier to understand.
Still, why do we need REST? What's the problem here?
The problem is that when you have dozens of such RPC methods, your API starts to look messy. For example:
JSIGHT 0.3

URL /rpc-api
  Protocol json-rpc-2.0
  Method getCustomer
  Method updateCustomer
  Method getCustomerFriends
  Method getCustomerFather
  Method askCustomerConfirmation
  # etc...
Why is this bunch of methods not easy to understand? Because that is not how our brain works. Our brain groups actions by entity: you naturally think about an entity first, and then about the actions you can do with it. E.g., what can you do with a cup? Take it, hide it, break it, wash it, turn it over. The brain rarely thinks about actions first and entities second. What can I take? A cup, a ball, an ice cream, a ticket, a wife. Not very natural.
So, what is REST about?
Step 3. REST
REST is about grouping entities and defining the most standard actions on those entities.
In IT we usually create, read, update, or delete entities. That's why we have the POST, GET, PUT, and DELETE methods in HTTP, which is, de facto, the only widespread implementation of the REST style.
When we group our methods by entity, our API becomes much clearer. Look:
JSIGHT 0.3
GET /customer/{id}
PUT /customer/{id}
GET /customer/{id}/friends
GET /customer/{id}/father
POST /customer/{id}/ask-confirmation
It is much easier to take in now than the RPC-style example above.
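As a rough illustration (an assumption-laden sketch, not part of the original answer), entity-oriented routing like this can be expressed with nothing more than the JDK's built-in com.sun.net.httpserver: the entity sits in the path and the action sits in the HTTP method, so one handler per entity is enough. The handler bodies below are placeholders.

import com.sun.net.httpserver.HttpServer;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class CustomerApiSketch {
    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);

        // One entity ("customer"), standard verbs.
        server.createContext("/customer", exchange -> {
            String method = exchange.getRequestMethod();
            String path = exchange.getRequestURI().getPath(); // e.g. /customer/42/friends

            String reply;
            if ("GET".equals(method)) {
                reply = "read " + path;      // read the customer or one of its sub-resources
            } else if ("PUT".equals(method)) {
                reply = "update " + path;    // replace the customer representation
            } else {
                reply = "other action on " + path;
            }

            byte[] bytes = reply.getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, bytes.length);
            exchange.getResponseBody().write(bytes);
            exchange.close();
        });

        server.start();
    }
}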
Conclusion
There are many other issues concerning the differences between RPC and REST, and REST has its faults just as RPC does. But I wanted to explain the main idea:
REST is a convention that helps you organize your API in a user-friendly fashion that is comfortable for the human brain, so any developer can quickly grasp your API structure.
RPC is just a way of calling remote operations.
All the examples above are gathered here: https://editor.jsight.io/r/g6XJwjV/1

I truly just don't understand what I'm missing
Not your fault, the literature sucks.
I think you may find Jim Webber's 2011 guidance enlightening.
Embrace HTTP as an application protocol. HTTP is not a transport protocol which happens to go conveniently through port 80. HTTP is an application protocol whose application domain is the transfer of documents over a network.
HTTP, like remote procedure calls, is a request/response protocol. The client sends a message to the server, the server sends a message back. Big deal.
But in the REST architectural style, we're also following the uniform interface constraint. That means (among other things) that everybody understands requests the same way, and everybody interprets responses the same way.
And that in turn means that general purpose components (browsers, web crawlers, proxies) can understand the request and response messages, and react intelligently to those messages, without needing to know anything at all about the specialization of a particular resource.
In the case of HTTP, we're passing messages about the domain of documents; if I want information from you, I ask for the latest copy of your document. If I want to send information to you, I send you a proposed modification to your document. Maybe you respond to my modification by creating a new document and you tell me to go look at that. And so on.
Useful business activity, then, becomes a side effect of manipulating these documents. Or, expressed another way, the document metaphor is the facade we present to the world.
And I've also read that endpoints like domain(.)com/login aren't RESTful because they're actions.
See above: the literature sucks.
Consider this resource identifier: https://www.merriam-webster.com/dictionary/login
Does it work like every other resource identifier on the web? Yes. The browser knows how to make a request to GET this resource, and when Merriam-Webster answers that the browser should instead look at https://www.merriam-webster.com/dictionary/log%20on then it fetches that resource as well.
The fact that "login"/"log on" are intransitive English verbs does not matter; it's an identifier, and that's all the browser needs to know.
Alternatively, consider this: URL shorteners work.
Resource identifiers are a lot like variable names in a computer program; so long as the identifier satisfies the production rules of your language, you can use anything you want. The machines don't care.
The key idea is that resources are not actions, they are "information that can be named". https://www.merriam-webster.com/dictionary/log%20on is the name of the document that describes Merriam-Webster's definition of the verb "log on".
Since the machines don't care what names we use, we can use the extra freedom that offers us to choose names that make life easier for people we care about.
I've read some folks say REST shouldn't be stateful, and if it is, it's not RESTful. But aren't most RESTful endpoints using state via sessions?
Stateless communication is a REST architectural constraint.
each request from client to server must contain all of the information necessary to understand the request, and cannot take advantage of any stored context on the server.
Cookies are one of the corners where HTTP failed to satisfy the stateless communication constraint.
I'm not going to let just anyone delete data without being logged in.
Not a problem: every HTTP request carries its own Authorization metadata; you aren't supposed to "remember" previous messages in the conversation. Instead, you verify the provided credentials every time.
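A minimal sketch of that idea, assuming a hypothetical /orders/42 resource and a bearer token obtained out of band: the credential travels with every request, so the server never has to remember a previous login.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StatelessDeleteSketch {
    public static void main(String[] args) throws Exception {
        String token = System.getenv("API_TOKEN"); // issued out of band, not read from a session

        // The request is self-contained: the Authorization header carries the credential,
        // so the server can verify it without any stored conversation state.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/orders/42")) // hypothetical resource
                .header("Authorization", "Bearer " + token)
                .DELETE()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}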

REST API: Does validation on identifiers break encapsulation?

I figured I'd post here to get some ideas/feedback on something I've come up against recently. The API I've developed has validation on an identifier that's passed through as a path parameter:
e.g. /resource/resource_identifier
There are some specific business rules as to what makes an identifier valid, and my API has validation which enforces these rules and returns a 400 when they are violated.
Now, the reason I'm writing this is that I've been doing this sort of thing in every REST (ish) API I've ever written. It's kind of ingrained in me now but recently I've been told that this is 'bad' and that it breaks encapsulation. Furthermore, it does this by forcing a consumer to have knowledge about the format of an identifier. I'm told that I should be returning a 404 instead and simply accept anything as an identifier.
We've had some pretty heated debates about this and about what encapsulation actually means in the context of REST. I've found numerous definitions, but they aren't specific. As with any REST contention, it's hard to substantiate an argument for either side.
If Stack Overflow will allow me, I'd like to try to gain a consensus on this, and on why APIs like Spotify, for example, use 400 in this scenario.
I've been doing this sort of thing in every REST (ish) API I've ever written. It's kind of ingrained in me now but recently I've been told that this is 'bad'
In the context of HTTP, it is an "anti-pattern", yes.
I'm told that I should be returning a 404 instead
And that is the right pattern when you want the advantages of responding like a general purpose web server.
Here's the point: if you want general purpose components in the HTTP application to be able to do sensible things with your response messages, then you need to provide them with the appropriate meta data.
In the case of a target resource identifier that satisfies the request-target production rules defined in RFC 9112 but is otherwise unsatisfactory, you can choose any response semantics you want (400? 403? 404? 499? 200?).
But if you choose 404, then general purpose components will know that the error response can be cached and re-used for other requests (under appropriate conditions - see RFC 9111).
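As a hedged illustration of that point (the identifier set and the handler shape below are made up), a server that answers well-formed but unknown identifiers with 404 can also attach explicit freshness information, so general purpose caches are allowed to reuse the answer:

import com.sun.net.httpserver.HttpExchange;
import java.io.IOException;
import java.util.Set;

public class NotFoundSketch {
    // Stand-in for the real lookup; in practice this would hit a database.
    private static final Set<String> KNOWN_IDS = Set.of("abc123", "def456");

    static void handle(HttpExchange exchange, String id) throws IOException {
        if (!KNOWN_IDS.contains(id)) {
            // 404 is a cacheable status code (see RFC 9111), and the explicit
            // Cache-Control lets intermediaries reuse this answer for a while.
            exchange.getResponseHeaders().add("Cache-Control", "max-age=60");
            exchange.sendResponseHeaders(404, -1); // -1: no response body
            exchange.close();
            return;
        }
        exchange.sendResponseHeaders(200, -1); // the real handler would send a representation
        exchange.close();
    }
}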
why APIs like Spotify, for example, use 400 in this scenario.
Remember: engineering is about trade offs.
The benefits of caching may not outweigh more cost-effective request processing, or more efficient incident analysis, or ....
It's also possible that it's just habit - it's done that way because that's the way they have always done it, or because they were taught it as a "best practice", or whatever. One of the engineering trade-offs we need to consider is whether or not to invest in analyzing a trade-off!
An imperfect system that ships earns more market share than a perfect solution that doesn't.
While it may sound natural to expose a resource's internal ID as the ID used in the URI, remember that the whole URI is the identifier of the resource, not just its last segment. Clients are usually also not interested in the characters that form the URI (or at least they shouldn't care about them), but only in the state they receive upon requesting it from the API/server.
Further, if you think long-term (which should be the reason you want to build your design on top of a REST architecture), is there a chance that the internal identifier of a resource could ever change? If so, introducing an indirection can make sense, e.g. by using UUIDs instead of product IDs in the URI and then having a further table/collection that maps each UUID to the domain object ID. Think of a resource that exposes some data about a product. It may sound like a good idea to use the product ID at the end of the URI, since it clearly identifies the product in your domain model. But what happens if your company merges with another company whose products overlap with yours but use different identifiers? I've seen such cases in reality, unfortunately, and almost all of the companies involved wanted to avoid change for their clients and so ended up having to support multiple URIs for the same products.
This is exactly why Mike Amundsen said
... your data model is not your object model is not your resource model ... (Source)
REST is full of such indirection mechanisms that allow systems to avoid coupling. Besides the mechanism mentioned above, you also have link relations, which allow servers to switch URIs when needed while clients can still look up the URI via the exposed relation name, and REST's focus on negotiated media types and representation formats rather than forcing clients to speak an API-specific, RPC-like, plain-JSON slang.
Jim Webber further coined the term domain application protocol to describe the idea that HTTP is an application protocol for exchanging documents, and any business rules we infer are just side effects of the actual document management performed by HTTP. So all we do in "REST" is basically send documents back and forth and trigger some business logic upon receiving certain documents.
In regards to encapsulation: this is in the scope of neither REST nor HTTP. What data you return depends on your business needs and/or on the capabilities of the representation formats exchanged. If a certain media type isn't able to express a certain capability, providing such data to clients might not make much sense.
In general, I'd recommend not using domain-internal IDs as part of URIs, for the reasons mentioned above. Usually that information should be part of the exchanged payload, to give users/customers the option to refer to those resources on other channels such as e-mail or telephone. Of course, that depends on the resource and its purpose. As a user, I'd prefer to refer to myself with my full name rather than some internal user or customer ID.
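To make the indirection mentioned above concrete, here is a minimal sketch with an in-memory map standing in for the mapping table/collection; the UUID and the internal product ID are invented for the example. The URI keeps carrying the stable UUID even if the internal ID is renumbered later.

import java.util.Map;
import java.util.UUID;

public class ProductIdIndirectionSketch {
    // Public UUID (used in the URI) -> internal domain ID (free to change).
    private static final Map<UUID, Long> PUBLIC_TO_INTERNAL = Map.of(
            UUID.fromString("2b1f7d9e-3c44-4b4e-9f2a-8a2d6c1e5f00"), 4711L);

    // /product/{uuid} -> internal product ID, or null if unknown
    static Long resolve(String uriSegment) {
        return PUBLIC_TO_INTERNAL.get(UUID.fromString(uriSegment));
    }

    public static void main(String[] args) {
        System.out.println(resolve("2b1f7d9e-3c44-4b4e-9f2a-8a2d6c1e5f00")); // 4711
    }
}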
edit: sorry, missed the validation aspect ...
If you expect user/client input on the server/API side, you should always validate the data before starting to process it. Usually, though, URIs are provided by the server and should only trigger business activities if the requested URI matches one of your defined rules. In general, most frameworks will respond with a 400 Bad Request when they can't map the URI to a concrete action, giving the client a chance to correct its mistake and reissue the updated request. As URIs shouldn't be generated or altered by clients anyway, validating such parameters might be unnecessary overhead unless they introduce security risks. A better approach here might be to tighten the rules that map URIs to actions and let the framework respond with a 400 when clients use something they aren't supposed to.
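For completeness, a small sketch of that kind of format check, with a made-up rule (identifiers are exactly 8 lowercase hex characters) standing in for the real business rule; the framework or handler would translate the returned status into the HTTP response:

import java.util.regex.Pattern;

public class IdentifierValidationSketch {
    // Made-up business rule: an identifier is exactly 8 lowercase hex characters.
    private static final Pattern VALID_ID = Pattern.compile("[0-9a-f]{8}");

    // Returns the HTTP status to answer with for the given path parameter.
    static int statusFor(String id) {
        if (!VALID_ID.matcher(id).matches()) {
            return 400; // malformed parameter: the client can correct it and retry
        }
        return 200; // well-formed; the real handler would now look the resource up
    }

    public static void main(String[] args) {
        System.out.println(statusFor("deadbeef"));   // 200
        System.out.println(statusFor("DROP TABLE")); // 400
    }
}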
Encapsulation makes sense when we want to hide data and implementation behind an interface. Here we want to expose the structure of the data, because it is for communication, not for storage, and the service certainly needs this communication in order to function. Validation of data is a very basic concept, because it makes the service reliable and protects against hacking attempts. The id here is a parameter, and checking its structure is just parameter validation, which should return 400 if it fails. This is not restricted to the body of the request; the problem can be anywhere in the HTTP message, as you can read below. Another argument against 404 is that the requested resource cannot possibly exist, because we are talking about a malformed id and therefore a malformed URI. It is very important to validate every user input, because a malformed parameter can be used for injections, e.g. SQL injection, if it is not validated.
The HyperText Transfer Protocol (HTTP) 400 Bad Request response status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error (for example, malformed request syntax, invalid request message framing, or deceptive request routing).
vs
The HTTP 404 Not Found response status code indicates that the server cannot find the requested resource. Links that lead to a 404 page are often called broken or dead links and can be subject to link rot.
A 404 status code only indicates that the resource is missing: not whether the absence is temporary or permanent. If a resource is permanently removed, use the 410 (Gone) status instead.
In the case of REST we describe the interface using the HTTP protocol, the URI standard, MIME types, etc., instead of an actual programming language, because they are language-independent standards. As for your specific case, it would be worth checking the uniform interface constraints, including the HATEOAS constraint, because if your service builds its URIs the way it should, then it is clear that a malformed id is something malicious. As for Spotify and other APIs, 99% of them are not REST APIs; maybe REST-ish. Read the Fielding dissertation and the standards instead of trying to figure it out based on SO answers and examples. So this is a classic RTFM situation.
In the context of REST, a very simple example of data hiding is storing a number, something like this:
PUT /x {"value": "111"} "content-type:application/vnd.example.binary+json"
GET /x "accept:application/vnd.example.decimal+json" -> {"value": 7}
Here we don't expose how we store the data; we just send its binary and decimal representations. This is called data hiding. In the case of the id it does not make sense to have an external id and convert it to an internal id, which is why you use the same one in your database, but it is fine to check whether its structure is valid. Normally you validate it and convert it into a DTO.
Implementation hiding is more complicated in this context; it is more about avoiding micromanagement of the service and instead implementing new features when the need comes up frequently. It might involve consumer surveys about what features they need, and checking logs to figure out why certain consumers send far too many messages and how to merge them into a single one. For example, say we have a math service:
PUT /x 7
PUT /y 8
PUT /z 9
PUT /s 0
PATCH /s {"add": "x"}
PATCH /s {"add": "y"}
PATCH /s {"add": "z"}
GET /s -> 24
vs
POST /expression {"sum": [7,8,9]} -> 24
If you want to translate between structured programming, OOP and REST, then it is something like this:
Number countCartTotal(CartId cartId);
<=>
interface iCart {
    Number countTotal();
}
<=>
GET api/cart/{cartid}/total -> {total}
So an endpoint represents an exposed operation, something like verbNoun(details), e.g. countCartTotal(cartId), which you can split into verb=countTotal, noun=cart, details=cartId and build the URI from those parts. The verb must be transformed into an HTTP method; in this case GET makes the most sense, because we need to read data rather than send it. The remainder of the verb must be transformed into a noun, so countTotal -> GET totalCount. Then you can merge the two nouns: totalCount + cart -> cartTotal. Then you build a URI template from the resulting noun and the details: cartTotal + cartId -> cart/{cartid}/total, and you are done with the endpoint design: GET {root}/cart/{cartid}/total. Now you can bind it to countCartTotal(cartId) or to repo.resource(iCart, cartId).countTotal().
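To illustrate that last binding step, here is a rough sketch using Spring MVC annotations; the Cart and CartRepository types are invented stand-ins for repo.resource(iCart, cartId).countTotal():

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RestController;

// Hypothetical domain types for the example.
interface Cart { Number countTotal(); }
interface CartRepository { Cart find(String cartId); }

@RestController
class CartTotalController {
    private final CartRepository carts;

    CartTotalController(CartRepository carts) {
        this.carts = carts;
    }

    // GET {root}/cart/{cartId}/total bound to the countCartTotal(cartId) operation.
    @GetMapping("/cart/{cartId}/total")
    public Number countCartTotal(@PathVariable String cartId) {
        return carts.find(cartId).countTotal();
    }
}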
So I think if the structure of the id does not change, then you can even add it to the API documentation if you want to. Though it is not necessary to do so.
From a security perspective you can return 404 if the only possible reason to send such a request is a hacking attempt; that way the hacker won't know for certain why it failed, and you don't expose details of your protection. In this situation that would be overthinking the problem, but in certain scenarios it makes sense, e.g. where the API can leak data. For example, when you send a password reset link, a web application usually asks for an email address, and most of them show an error message if it is not registered. This can be used to check whether somebody is registered on the site, so it is better to hide this kind of error. I guess in your case the id is not something sensitive, and if you have proper access control, then even if a hacker knows the id, they cannot do much with that information.
Another possible aspect: what if the structure of the id changes? Well, we write different validation code, which allows only the new structure (or maybe both structures), and make a new version of the API with v2/api and v2/docs as the root and documentation URIs.
So I fully support your point of view, and I think the other developer you mentioned does not even understand OOP and encapsulation, not to mention web services and REST APIs.

Single endpoint instead of API - what are the disadvantages?

I have a service which is exposed over HTTP. Most of the traffic gets into it via a single HTTP GET endpoint, in which the payload is serialized and encrypted (RSA). The client systems have common code which ensures that serialization and deserialization will succeed. One of the encoded parameters is the operation type; in my service there is a huge switch (almost 100 cases) that checks which operation is being performed and executes the proper code.
// operationType is decoded from the encrypted payload (the name is illustrative)
switch (operationType) {
    case OPERATION_1: {
        operation = new Operation1Class(basicRequestData, serviceInjected);
        break;
    }
    case OPERATION_2: {
        operation = new Operation2Class(basicRequestData, anotherServiceInjected);
        break;
    }
    // ... almost 100 more cases
}
The operations are of a few types: some of them are typical resource operations (GET_something, UPDATE_something), some are method-based (VALIDATE_something, CHECK_something).
I am thinking about refactoring the API of the service so that it is more RESTful, especially in the resource-based part of the system. To do so I would probably split the endpoint into proper endpoints (e.g. /resource/{id}/subresource) or RPC-like endpoints (/validateSomething). I feel it would be better; however, I cannot come up with any argument for it.
The question is: what are the advantages of the refactored solution, and what follows: what are the disadvantages of the current solution?
The current solution separates the client from the server, it's scalable (adding a new endpoint requires adding a new operation type in the common code) and quite clear, and two clients use it in two different programming languages. I know that the API sits at level 0 of the Richardson maturity model, but I cannot come up with a reason why I should change it to level 3 (or at least level 2 - resources and methods).
Most of traffic input gets into it via single HTTP GET endpoint, in which the payload is serialized and encrypted (RSA)
This is potentially a problem here, because the HTTP specification is quite clear that GET requests with a payload are out of bounds.
A payload within a GET request message has no defined semantics; sending a payload body on a GET request might cause some existing implementations to reject the request.
It's probably worth taking some time to review this, because it seems that your existing implementation works, so what's the problem?
The problem here is interop - can processes controlled by other people communicate successfully with the processes that you control? The HTTP standard gives us shared semantics for our "self descriptive messages"; when you violate that standard, you lose interop with things that you don't directly control.
And that in turn means that you can't freely leverage the wide array of solutions that we already have in support of HTTP, because you've introduced this inconsistency.
The appropriate HTTP method to use for what you are currently doing? POST
REST (aka Richardson Level 3) is the architectural style of the world wide web.
Your "everything is a message to a single resource" approach gives up many of the advantages that made the world wide web catastrophically successful.
The most obvious of these is caching. "Web scale" is possible in part because standardized caching support greatly reduces the number of round trips we need to make. However, the grain of caching in HTTP is the resource -- everything keys off of the target-uri of a request. Thus, by having all information shared via a single target-uri, you lose fine-grained caching control.
You also lose safe request semantics - with every message buried in a single method type, general purpose components can't distinguish between "effectively read only" messages and messages that request that the origin server modify its own resources. This in turn means that you lose pre-fetching, and automatic retry of safe requests when the network is unstable.
In all, you've taken a rather intelligent application protocol and crippled it, leaving yourself with a transport protocol.
That's not necessarily the wrong choice for your circumstances - SOAP is a thing, after all, and again, your service does seem to be working as is, which implies that you don't currently need the capabilities you've given up.
It would make me a little bit suspicious, though: if you don't need these things, why are you using HTTP rather than some messaging protocol?

How to model REST API for designing a calculator service

I have been reading about best practices for designing APIs for REST services that will be exposed to customers. For example, we should use nouns to name the URIs we expose, and the verbs should obey the semantics of the HTTP methods; for example, a GET request should never modify a resource - a PUT request should be used instead. I was asked this question during an interview and couldn't answer it satisfactorily: I am designing a calculator that provides the functions add, multiply, divide, and subtract on two operands. How do I expose these methods to clients following REST principles? What URI should I use for these operations? And I am not sure whether to map an add operation to GET, PUT, or POST. The same goes for the other operations (divide, multiply, etc.). What are the guidelines here?
What are the guidelines here ?
How would you do it with a web site?
That should be your first heuristic any time somebody asks you about REST. REST is an architectural style, designed for "long-lived network-based applications that span multiple organizations". The reference application for this architectural style is the World Wide Web.
I am designing a calculator that provides following functions, add, multiply, divide, subtract on two operands. How do I expose these methods to clients following REST principles.
There are three pieces of information the client has which need to be communicated to the server: the operation and the two operands. On the web, the usual way to collect that sort of information is to provide a form. In this case, it might be a dropdown list of operations and a couple of text controls to accept numbers. Because a pure function is safe, we would probably use GET as the form method. So the HTML processing rules would take the values described by the form and transcribe them as key-value pairs into the query part.
So the url would look something like
/22520c7f-6207-490e-99c9-bd1bb37f4056?op=add&firstArg=6&secondArg=9
The key generalization is to realize that the HTML form is playing the role of a URI Template - the server passes the template to the client, the client fills in the details and uses the result as the target of the request.
What this implies is that if you wanted to replace HTML with some other media type, you would need to specify a description for URI Templates, some mechanism for describing the acceptable ranges and what values might be reasonable defaults.
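As a rough client-side sketch (assuming a hypothetical host calc.example.com and the form-style URI from above), the filled-in form becomes a plain GET whose query part carries the operation and operands; because the request is safe, it can be retried or cached freely:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CalculatorFormSketch {
    public static void main(String[] args) throws Exception {
        // The server supplied the "form" (URI template); the client fills in the values.
        String target = "https://calc.example.com/22520c7f-6207-490e-99c9-bd1bb37f4056"
                + "?op=add&firstArg=6&secondArg=9";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(target))
                .GET() // safe: a pure lookup, no server state is changed
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // some representation of the number 15
    }
}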
What URI to use for these operations?
With REST? It absolutely does not matter. That's part of the point -- the client treats URIs as opaque values.
URI are just identifiers; you can think of them like variable names in a program. The machines just don't care what the spellings are (so long as those spellings are consistent with RFC 3986).
Since the spelling doesn't matter, you can use whatever the local spelling conventions happen to be. A lot of people favor "hackable" URI -- the spelling of the identifier communicates useful information to the human reader.
URI are identifiers of resources; "any information that can be named can be a resource". This motivates some of the "nouns not verbs" noise -- the resources are web pages, or documents, or images, or scripts, or ... the "resource" concept is deliberately vague, so that it can be used flexibly.
I am not sure whether to map an add operation to GET,PUT or POST.
The key is to look at the semantics of the client's request. Anytime what you are doing is a query/lookup, something that is OK for the machine to do automatically for the user, then you are looking at safe semantics, and the method is likely to be GET or HEAD (special circumstances).
If we were asking the server to change the representations of its own resources, then PUT and POST come into play.
In this case, all of these operations are just doing lookups, so GET is appropriate.
(An interesting thing to note is that none of these operations depend on server state; they are pure functions. So it might make sense to serve the client with code on demand -- JavaScript or some reasonable replacement -- and use the client's processor to perform the calculations rather than doing a bunch of round trips across the network.)
I think one issue with the term REST is that it can have a different meaning to different people. It's possible that the interviewer has a different understanding of REST. When something like this comes up, my first inclination is to try to figure out what REST means to them. Do they mean hypermedia? Do they care about the verbs and nouns being used correctly as you mentioned? Or is in their mind REST just some HTTP endpoint that returns JSON. All of these are possible.
I would be inclined to answer the question as follows I think:
GET /add/1/2
GET /multiply/5/6
etc..
Others with much more experience than myself have answered with examples of good ways to do this.
I'd just like to point out what a really excellent question this is. I'm reading a paper by Richard Taylor and others from UC Irvine. Richard Taylor was Roy Fielding's advisor back when Fielding described REST in his PhD thesis. In a discussion comparing REST and SOAP, the paper includes this:
"...the closer the service semantics are to those of content, the more likely the service is to have a rich REST API.
...
the division between REST and SOAP may reflect a lack of design guidance; how can
services that are not content-centric, such as auction bidding
... be cleanly constructed in a REST-consistent manner?"
When they say "closer to content" they mean that the service looks more like some dynamic or static information to be retrieved by the end user, as opposed to a service looks like something that actively performs some function or calculation for the user.
They go on to provide additional guidance on this - but I haven't finished reading the paper yet, so I'll let you look that up for yourself. :-)
Search the web for: From Representations to Computations: The Evolution of Web Architectures, Richard N. Taylor, UCI

Correct HTTP request method for nullipotent action that takes binary data

Consider a web API method that has no side effects, but which takes binary data as a parameter. An example would be a method that tells the user whether or not their image is photoshopped, but does not permanently store the image or the result on its servers.
Should such a method be a GET or a POST?
GET doesn't seem to have a recommended way of sending data outside of URL parameters, but the behavior of the method implies a GET, which according to the HTTP spec is for safe, stateless requests. This becomes particularly constraining under the semantics of REST, which imply that POST methods create a new object on the server.
This becomes particularly constraining under the semantics of REST, which imply that POST methods create a new object on the server.
While a POST request means that the entity sent will be treated "as a new subordinate of the resource identified by the Request-URI", there is no requirement that this result in the creation of a new permanent object or that any such new object be identified by a URI (so no new object as far as the client knows). An object can be transient, representing the results of e.g. "Providing a block of data, such as the result of submitting a form, to a data-handling process" and not persisting after the entity representing that object has been sent.
While this means that a POST can create a new resource, and is certainly the best way to do so when it is the server that will give that new resource its URI (with PUT being the more appropriate method when the client dictates the new URI), it can also be used for cases that delete objects (though again, if it's the deletion of a single* resource identifiable by a URI, then DELETE is far more appropriate), that both create and delete objects, or that change multiple objects. It can even mean that your kitchen light turns on, with the response being the same whether that worked or failed, because the communication from the web server to the kitchen light doesn't allow for feedback about success. It really can do anything at all.
But your instincts are good in wanting this to be a GET. While the looseness of POST means we can make a case for it for just about every request (as done by approaches that use HTTP as an RPC-like protocol, essentially treating HTTP as if it were a transport protocol), this is inelegant in theory, inefficient in practice, and clumsy in definition. You have an idempotent function that depends solely on what the client is interested in, and that maps most obviously to GET in a few ways.
If we could fit everything into a URI then GET would be easy. E.g. we could define a simple integer addition with something like http://example.net/addInts?x=1;y=2 representing the addition of 1 and 2, and hence being a permanent, immutable resource representing the number 3 (the results of GET can vary as a resource changes over time, but this resource never changes), and then use a mechanism like HTML's <form> or JavaScript to allow the server to inform the client how to construct the URIs for other numbers (to maintain the HATEOAS and/or COD constraints). Simples!
Your problem here is that you do not have input that can be represented as concisely as the numbers 1 and 2 above. In theory you could do something like http://example.net/photoshoppedCheck?image=data:image/png;base64,iVBORw0KGgoAAAANSU… and hence create a URI that represents the resource of the check's results. This URI, though, will have 4 characters for every 3 bytes in the image. While there's no absolute limit on URI length, both theory and practice allow this to fail (in theory HTTP allows proxies and servers to set a limit on URI length, and in practice they do).
An argument could be made for using GET and sending a request body the same way as you would with a POST, and some web servers will even allow you to do this. However, GET is defined as returning an entity describing the resource identified in the URI, with headers restricting how that entity does the describing: since the request body is not part of that definition, it must be ignored by your code! If you were tempted to bend this rule then you must consider that:
Some webservers will refuse the request or strip the body, so you may not be able to.
If your webserver does allow it, the fact that its not specified means you can't be sure an upgrade won't "fix" this and so break your code.
Some proxies will refuse or strip the request.
Some client libraries will most certainly refuse to allow developers to send a request body along with a GET.
So it's a no-no in both theory and practice.
About the only other approach we could take, apart from POST, is to have a URI that we consider to represent an image that was not photoshopped. Hence a GET on it returns an entity describing the image (obviously it could be the actual image, though it could also be something else if we stretch the concept of content negotiation), and a PUT would check the image: if it's deemed not to be photoshopped, the server responds with the same image and a 200, or just a 204; if it is deemed to be photoshopped, it responds with a 400, because we've tried to PUT a photoshopped image as a resource that can only be a non-photoshopped image. Because we respond immediately, there's no race condition with simultaneous requests.
Frankly, this would be downright horrible. While I think I've made a case for it by the letter of the specs, it's just nasty: REST is meant to help us design clear APIs, not obtuse APIs we can offer a too-clever-for-its-own-good justification for.
No; all in all, the way to go here is to POST the image to a fixed URI which then returns a simple entity describing the analysis.
It's perfectly justifiable as REST (the POST creates a transient object based on that image, then responds with an entity describing that object, and then that object disappears again). It's straightforward. It's about as efficient as it could be (we can't do any HTTP caching†, but most of the network delay is going to be on the upload rather than the download anyway). It also fits the general use case of "process something" that POST was first invented for. (Remember that first there was HTTP, then REST described why it worked so well, and then HTTP was refined to better play to those strengths.)
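A minimal sketch of that recommendation, assuming a hypothetical /photoshop-check resource that accepts a PNG body and answers with a small JSON entity describing the transient analysis:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;

public class PhotoshopCheckSketch {
    public static void main(String[] args) throws Exception {
        // POST the image to a fixed URI; the response entity describes the transient
        // analysis object, which is never given a URI of its own.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://api.example.com/photoshop-check")) // hypothetical endpoint
                .header("Content-Type", "image/png")
                .POST(HttpRequest.BodyPublishers.ofFile(Path.of("photo.png")))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // e.g. {"photoshopped": false}
    }
}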
In all, while the classic mistake that moves a web application away from REST is to abuse POST into doing absolutely everything when GET, PUT and DELETE (and perhaps the WebDAV methods) would be superior, don't be afraid to use its power when those don't fit the bill, and don't think that the "new subordinate of the resource" has to mean a full, long-lived resource.
*Note that a "single" resource here might be composed of several resources that may have their own URIs, so it can be easy to have a single DELETE that deletes multiple objects. But if deleting X deletes A, B & C, then it had better be obvious that you can't have A, B or C if you don't have X, or your API is not going to be understandable. Generally this comes down to what is being modelled, and how obvious it is that one thing depends on another.
†Strictly speaking we can, as we're allowed to send cache headers indicating that sending an identical entity to the same URI will have the same result, but there's no general-purpose web software that will do this, and your custom client can just "remember" the opinion about a given image itself anyway.
It's a difficult one. As with many other scenarios, there is no absolutely correct way of doing it. You have to try to interpret RESTful principles in terms of the limitations of the semantics of HTTP. (Incidentally, I don't think it's right to think of REST as having semantics; REST is an architectural style which is commonly used with HTTP services, but it can be used for any type of interface.)
I've faced a similar situation in my current project. We chose to use a POST, but with the response code being 200 (OK) rather than the 201 (Created) usually returned by RESTful web APIs.

Law of Demeter vs. REST

The Law of Demeter (which really should be called the Suggestion of Demeter) says that you shouldn't "reach through" an object to get at its child objects. If you, as a client, need to perform some non-trivial operation, most of the time the domain model you're working with should support that operation.
REST is in principle a dumb hierarchy of objects. It is like a filesystem of resources/objects, where each object could have child objects. You almost always reach through with REST. You can optionally build up by-convention composite object types using REST techniques, and as long as the provider and the client agree on higher-level operations, you can avoid the reach-through.
So, how do you balance REST and Demeter? It seems to me that they are not in conflict, because REST is all about super-loose coupling to the point where it is pointless for the provider to try to anticipate all the needs of the clients, whereas Demeter assumes that logic can migrate to its most natural place through refactoring.
However, you could argue that REST is just a stop-gap until you understand your clients better. Is REST just a hack? Is Demeter unrealistic in any server/client scenario?
Is Demeter unrealistic in any server/client scenario?
I think you answered your own question here. How is REST any different from SOAP or XML-RPC in this regard?
The point of REST is not to provide super-loose coupling, but to:
Provide an identifier which accurately describes the resource being requested.
Provide services which behave as expected: GET requests are idempotent, PUT updates records, POST creates, DELETE deletes.
Minimize the state being stored on the server.
Tear down unnecessary complexity.
There are cases where REST isn't the best answer, but REST works remarkably well in general.
I pay that law/suggestion no mind whatsoever. It defeats half the benefit of aggregation and composition, and I won't have it.
A link that is provided by a representation, exposed by a RESTful interface, can be completely opaque without violating any of REST's constraints. Therefore I would suggest that REST is completely consistent with the Law of Demeter. There is no requirement that the link expose the structure of the URL space in its URL.
e.g. in an object oriented scenario, you might replace the call a.b.c with a.bc
In a RESTful representation you could create the following:
<a>
  <link href="bc"/>
</a>
instead of doing
GET a
<a>
  <link href="b"/>
</a>
GET b
<b>
  <link href="c"/>
</b>
GET c
I would have to disagree with altCognito and say that one of the main goals of REST is loose coupling. The uniform interface, standard media types and HATEOAS all combine to produce an extremely loosely coupled interface.
In response to David's comment:
REST is all about super-loose coupling to the point where it is pointless for the provider to try to anticipate all the needs of the clients
Actually, REST is about limiting the client's options by providing only valid links in the representations. Within those constraints the client can attempt to satisfy its own needs. It is by removing from the client the knowledge of when certain requests can be made that you achieve loose coupling. Loose coupling is not achieved by listing a set of resources and saying "OK, you can GET, PUT, POST, DELETE all you want."
I think they're really orthogonal. REST describes a collection of resources addressed by URIs with a set of canonical methods: GET, POST, etc. If a REST routine returns a URI, that's not "reaching through"; it's identifying a different object with the same method names.