I hope this question will be accepted by stack overflow and not be tagged "opinion based". At least I try...
I talked with a supplier that has to develop a web application for us, and the agreement was to use a RESTful interface to access the services provided by my infrastructure.
The REST interface will be developed by my team and their team will develop the UI. The API is in an advanced state of development and is used by the main application my company is using, so I have to extend it to accommodate this new webapp.
He sent me a draft document for interfacing their client applications, telling me to develop services like this:
http://1.2.3.4:8080/getGPSPositions
telling me:
1) the web frontend will use POST to send its requests to the above URL
2) the frontend will send objects serialized as JSON in a format like this (simplified)
{
serviceID: <number>
fromDate: <date>
toDate: <date>
customSQLWhere: <string>
customSQLOrderBy: <string>
}
and response like this:
{
gps_points:
{
#some data object
}
}
This is not REST to me, but JSON+HTTP+RPC with some embedded SQL code that an attacker can use for SQL injection...
I think that the correct resource for the above example (about gpsPositions) is:
http://1.2.3.4:8080/gPSPositions?fromDate=...&toDate=...
using HTTP GET and not HTTP POST (using GET, DELETE, PUT and POST for CRUD operations).
I would like to know the technical implications that this kind of approach can have on my project. The API will be exposed on the Internet and in the future it should be used by more suppliers for different kinds of applications. I also fear that an API developed in two different styles (RPC and REST) will be difficult to read and understand, and that this will be a problem in the future.
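A minimal sketch of how the GET variant could be served without accepting raw SQL from the client. The table name gps_positions, its columns, the whitelist and both function names are assumptions made up for illustration, and SQLite merely stands in for whatever database is actually used. Filter values are bound as parameters and the sort column is picked from a fixed whitelist instead of being concatenated from client input.

```python
import sqlite3

# Hypothetical whitelist: the only columns a client may sort by.
ALLOWED_ORDER_BY = {"recorded_at", "latitude", "longitude"}


def find_positions(conn, from_date, to_date, order_by="recorded_at"):
    """Return GPS rows between two ISO dates without trusting client SQL."""
    if order_by not in ALLOWED_ORDER_BY:
        raise ValueError(f"unsupported sort column: {order_by!r}")
    # Values are bound with placeholders; only the vetted column name is interpolated.
    sql = (
        "SELECT recorded_at, latitude, longitude FROM gps_positions "
        "WHERE recorded_at BETWEEN ? AND ? "
        f"ORDER BY {order_by}"
    )
    return conn.execute(sql, (from_date, to_date)).fetchall()


def demo_connection():
    """Build an in-memory table with made-up sample rows for this sketch."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE gps_positions (recorded_at TEXT, latitude REAL, longitude REAL)"
    )
    conn.executemany(
        "INSERT INTO gps_positions VALUES (?, ?, ?)",
        [
            ("2020-01-01", 45.0, 9.0),
            ("2020-01-02", 45.1, 9.1),
            ("2020-02-01", 46.0, 10.0),
        ],
    )
    return conn
```

A request such as GET /gPSPositions?fromDate=2020-01-01&toDate=2020-01-31 would then map onto find_positions(conn, "2020-01-01", "2020-01-31"), and an orderBy value outside the whitelist is rejected before any SQL is built.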
Thank you for your suggestions.
Actually, neither the OP nor his supplier has a correct understanding of REST, IMO.
REST is an architectural style which basically describes how clients and servers should interact with each other in order to decouple the former from the latter. The decoupling is achieved by using a backing protocol (HTTP in your case) as transport layer and utilizing the available operations to exchange data. HTTP, which at its core is actually just a protocol to manage resources on a remote location, therefore defines the semantics of each operation, not REST. The decoupling is furthermore achieved by adhering to the rules outlined by the backing transport protocol and by generating new (or utilizing existing) media types both parties can use. Instead of clients coupling to the server API, both couple to intermediary media types. This is why Fielding stressed that REST APIs should spend almost all of their descriptive effort in defining the media types used for representing resources.
Media types per se are just textual descriptions of the syntax and the semantics to expect on receiving data annotated to be of that kind. A media type is thus a knowledge base for peers on how to process the data and give it some useful meaning. Clients and servers should further use content negotiation to agree on a common media type both understand. For example, a client could send a server a list of media types it knows of, and the server will either pick the one that is most suitable for the available data or respond with a failure (406 Not Acceptable in HTTP) to indicate that it cannot produce any of the requested media types.
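The negotiation step can be sketched roughly as follows. This is a deliberately simplified illustration, not a complete implementation of HTTP's rules: it honours q-values and the */* wildcard only, ignores type/* ranges and specificity precedence, and the function names and the list of supported types are assumptions.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's original order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(accept_header, supported):
    """Return (status, media_type) for the first acceptable supported type."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*" and supported:
            return 200, supported[0]
        if media_type in supported:
            return 200, media_type
    # Nothing the client listed can be produced by this server.
    return 406, None
```

With supported = ["application/geo+json", "application/hal+json"], a client sending "text/html" only would receive the 406 result, while "text/html, */*;q=0.1" falls back to the server's first supported type.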
It is furthermore important that the server provides clients with all of the information they could need to take further actions. The client shouldn't parse meaning out of provided URIs but use the relation name instead and look up the semantics of that relation name within the exchanged media type, or rely on common standards such as next, prev, first, last and self in a collection. This gives the server the freedom to change URIs (and thus move resources around freely) without breaking clients.
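A small sketch of a client that pages through a collection purely by relation name. The HAL-style _links layout, the items key, the sample pages and the function names are assumptions for illustration; the second URI is intentionally opaque to show that the client never inspects it.

```python
# Made-up pages standing in for server responses; the URIs are arbitrary on purpose.
PAGES = {
    "/positions": {
        "items": [1, 2],
        "_links": {"self": {"href": "/positions"}, "next": {"href": "/x/7f3a"}},
    },
    "/x/7f3a": {
        "items": [3],
        "_links": {"self": {"href": "/x/7f3a"}},
    },
}


def find_link(representation, rel):
    """Return the href the server advertised for a relation name, or None."""
    link = representation.get("_links", {}).get(rel)
    return link["href"] if link else None


def walk_collection(fetch, start_uri):
    """Collect items from every page by following `next` links only."""
    items, uri = [], start_uri
    while uri is not None:
        page = fetch(uri)
        items.extend(page.get("items", []))
        uri = find_link(page, "next")
    return items
```

Calling walk_collection(PAGES.__getitem__, "/positions") visits both pages. If the server later renames /x/7f3a, the client keeps working as long as the next relation is still advertised.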
One common error of so-called RESTful APIs (and clients using those) is that data is exchanged only as application/json, application/hal+json or a similar media type while clients expect resources to be of a certain type (typed resources), which already couples clients to the API. First, application/json or application/xml is a bad media type in general as it does not hint to users what content the document might contain. As the example contains geo-location data, application/geo+json (GeoJSON, RFC 7946; formerly application/vnd.geo+json) would probably be more appropriate in that particular case.
As REST's focus is on resources, it is questionable IMO whether geo-location data make a good resource. I at least consider them more of a property of an entity that I'd express as a resource. As the location seems to change over time this might be some kind of vehicle information (car, bus, truck, plane, ship, ...) or the like. The proposed response is in itself not very self-descriptive. The client therefore has to have some built-in knowledge that something like gps_points in a JSON payload contains GPS data, probably as a time series for something.
If the server decides to return data in a slightly different representation, the client can easily break this way. By relying on common media types this risk is almost eliminated, as long as both parties adhere to the media type's specification.
As your actual post more or less asks whether POST requests are feasible instead of GET requests for retrieving data:
According to HTTP the client expresses its intent by using one of the available operations defined within the spec. POST is actually an optional operation that should be used if none of the other operations fits the needs. It is kind of a Swiss Army knife that can be misused easily. Using it for other operations is not wrong per se. The spec here states:
The POST method requests that the target resource process the representation enclosed in the request according to the resource's own specific semantics.
On retrieving data, GET is the preferable operation as long as the request does not modify any state on the server and does not need to send extensive data via a body (plenty of HTTP clients limit the length of URIs that can be invoked).
The actual implication of accepting data retrieval via POST is that this becomes part of your API and you have to support it from then on. If you later decide to "fix" this issue you risk breaking clients that used this way of communicating with your API. Some legacy systems might force you to use such constructs though. We have some RPC-like Web services which we have to provide to a wide variety of clients. Some of these are SOAP clients in an old SAP version that get misused in order to send data to our services. As there is no built-in support for HTTP in that ABAP version (as far as I'm told) we have to accept such retrieval requests via POST as well. When I once changed our API to accept only GET for data retrieval, I unintentionally broke plenty of legacy clients (even ones that could actually retrieve via GET), as the developers had just used the first sample in the documentation, which was still POST at that time. Long story short, I had to roll back the changes and have supported both kinds of queries ever since.
Related
We are building set of new REST APIs.
Let's say we have a resource /users with the following fields:
{
id: 1
email: "test@user.com"
}
Clients implement this API and can then update this resource by sending a new resource representation to PUT /users/1.
Now let's say we add a new property name to the model like so:
{
id: 1
email: "test@user.com"
name: "test user"
}
If the models the existing clients use to call our API are not updated, then calls to PUT /users/1 will remove the new name property, since PUT is supposed to replace the resource. I know that the clients could work straight with the raw JSON to ensure they always receive any new properties that are added in the API, but that is a lot of extra work, and under normal circumstances clients are going to create their own model representations of the API resources on their side. This means that any time a new property is added, all clients need to update the code/models on their side to make sure they aren't accidentally removing properties. This creates unneeded coupling between systems.
As a way to solve this problem, we are considering not implementing PUT operations at all and switching updates to PATCH, where properties that aren't passed in are simply not changed. That seems technically correct, but might not be in the spirit of REST. I am also slightly concerned about client support for the PATCH verb.
How are others solving this problem? What is the best practice here?
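The difference between the two update styles can be illustrated with a short sketch. The question does not say which patch format would be used; JSON Merge Patch (RFC 7396) is assumed here as one possible choice, and both function names are made up for illustration.

```python
import copy


def put(stored, representation):
    """PUT semantics: the enclosed representation replaces the stored state."""
    return copy.deepcopy(representation)


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396): null removes, objects merge, else replace."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            # A null member in the patch deletes the member from the target.
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result
```

An old client that only knows id and email and sends {"id": 1, "email": "new@user.com"} via put loses the stored name, whereas merge_patch with {"email": "new@user.com"} leaves name untouched. Note that with this patch format a client cannot set a property to null, because null means "remove".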
You are in a situation where you need some form of API versioning. The most appropriate way is probably using a new media-type every time you make a change.
This way you can support older versions and a PUT would be perfectly legal.
If you don't want this and just stick to PATCH, PATCH is supported everywhere except if you use ancient browsers. Not something to worry about.
Switching from PUT to PATCH will not fix your problem, IMO. The root cause, IMO, is that clients already consider the data being returned for a representation to follow a certain type. According to Fielding
A REST API should never have “typed” resources that are significant to the client.
(Source)
Instead of using typed resources clients should use content-type negotiation to exchange data. Here, media-type formats that are generic enough to gain widespread adoption are for sure beneficial, certain domains may however require a more specific representation format.
Think of a car-vendor Web page where you can retrieve the data from your preferred car. You, as a human, can easily identify that the data depicts a typical car. However, the media-type you most likely received the data in (HTML) does not state by its syntax or the semantics of its elements that the data describes a car, unless some semantic annotation attributes or elements are present, though you might be able to update the data or use the data elsewhere.
This is possible as HTML ships with a rich specification of its elements and attributes, such as Web forms that not only describe the supported or expected input parameters but also the URI where to send the data to, the representation format to use upon sending (implicitly given by application/x-www-form-urlencoded; may be overwritten by the enctype attribute though) or the HTTP method to use, which is fixed to either GET or POST in HTML. Through this, a server is able to teach a client on how a request needs to be built. As a consequence the client does not need to know anything else besides having to understand the HTTP, URI and HTML specifications.
As Web pages are usually filled with all kinds of unrelated stuff, such as ads, styling information or scripts, and as the XML(-like) syntax is not everyone's favourite and may increase the size of the actual payload slightly, most so-called "REST" APIs want to exchange JSON-based documents. While plain JSON is not an ideal representation format, as it does not ship with link support at all, it is very popular nonetheless. Certain additions such as JSON Hyper-Schema (application/schema+json hyper-schema) or JSON Hypertext Application Language (HAL) (application/hal+json) add support for links and link relations. These can be used to render data received from the server as-is. However, if you want a response to automatically drive your application state (e.g. to dynamically draw the GUI with the processed data), a more specific representation format is needed that your client can parse and act upon, as it understands what the server wants it to do with it (= affordance). If you want to instruct a client on how to build a request, other media-types such as hal-forms or ion need to be supported. Certain media-types furthermore allow you to use a concept called profiles, which lets you annotate a resource with a semantic type. HAL, for example, supports something like that, where the Content-Type header may contain a value such as application/hal+json;profile=http://schema.org/Car that hints to the media-type processor that the payload follows the definition of the given profile, so that it may apply further validity checks.
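As a rough sketch, a recipient could separate the media type from the profile hint like this, using the header parser from Python's standard library. The function name is an assumption, and the profile value is quoted in the usage example because a URI contains characters that are not allowed in an unquoted parameter value.

```python
from email.message import Message


def parse_content_type(header_value):
    """Split a Content-Type header into its media type and `profile` parameter."""
    message = Message()
    message["Content-Type"] = header_value
    # get_param returns None when the parameter is absent.
    return message.get_content_type(), message.get_param("profile")
```

For the header value application/hal+json; profile="http://schema.org/Car" this yields the media type application/hal+json and the profile URI, which a processor could then use to select additional validation rules.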
As the representation format should be generic enough to gain widespread usage, and URIs themselves shouldn't hint to a client what kind of data to expect either, another mechanism needs to be used. Link relation names are basically annotations for URIs that tell a client about the purpose of a certain link. A pageable collection might return links annotated with first, prev, next and last, and it is pretty obvious what these do. Other links might be annotated with prefetch, which hints to a client that a resource can be loaded right after loading the current resource has finished, as it is very likely that the client will retrieve this resource next. Such link relation names, however, should either be standardized (defined in a proposal or RFC and registered with IANA) or follow the extension scheme proposed by Web Linking (i.e. as used by Dublin Core). A client that just uses the URI given for an invoked link-relation name, instead of attempting to parse some parameters from the URI itself, will still work in case the server changes its URI scheme.
In regards to de/coupling in a distributed system, a certain amount of coupling has to exist, otherwise parties won't be able to communicate at all. The point here is that the coupling should be based on well-defined and standardized formats that plenty of clients may support, instead of exchanging specific representation formats only a very limited number of clients support (in the worst case only your own client). Instead of directly coupling to the API and using an undefined JSON-based syntax (maybe with external documentation of the semantics of the respective fields), the coupling should occur on the media-types the parties can use to exchange data. Here, the question to ask is not which media-type to support but how many you want to support. The more media-types your client or server supports, the more likely it is to be able to interact with other peers in the distributed system. In the grand scheme of things, you want a server to be able to serve a plethora of clients, while a single client should be able to interact with (in the best case) every server without the need for constant adaptations.
So, if you really want to decouple clients from servers, you should take a closer look at how the Web actually works and try to mimic its interaction model onto your application layer. As "Uncle Bob" Robert C. Martin mentioned
An architecture is about intent! (Source)
and the intention behind the REST architecture is the decoupling of clients from servers/services. As such, supporting multiple media-types (or defining your own one that is generic enough to reach widespread adoption), looking up URIs just via their accompanying link-relation names, and relying on content-type negotiation as well as only on the provided data may help you to achieve the degree of decoupling you are looking for.
All nice and well in theory, but so far every rest api I encountered in my career had predefined contracts that changed over time.
The problem here is that almost all of those so-called "REST APIs" are RPC services at heart which should not be termed "REST" to start with, though this is a community issue. Usually such APIs ship with external documentation (i.e. Swagger) that just re-introduces the same problems classical RPC solutions, such as CORBA, RMI or SOAP, suffer from. The documentation may be seen as IDL in that process without the strict need for skeleton classes, though most "frameworks" use some kind of typed data classes that will either ignore a recently introduced field (in the best case) or totally blow up on invocation.
One of the problems REST suffers from is that most people haven't read Fielding's thesis and therefore don't see the big picture REST tries to establish, but claim to know what REST is, mix things up and call their services RESTful, which leads to a situation where REST != REST. The ones pointing out what a REST architecture is and how one might achieve it are called out as dreamers and unworldly, while the ones proclaiming the wrong term (RPC over HTTP = REST) continue to do so, adding to the confusion especially of those just learning the whole matter.
I admit that developing a true REST architecture is really, really hard as it is just too easy to introduce some form of coupling. Hence, a very careful design needs to be done that needs time and also costs money. Money plenty of companies can't or don't want to spend, especially in a domain where new technologies evolve on a regular basis and the ones responsible for developing such solutions often leave the company before the whole process had finished.
Just saying it shouldn’t be ‘typed’ is not really a viable solution
Well, how often did you need to change your browser because it couldn't interact with a Web page? I'm not talking about browser-specific CSS or JS stuff. How often did the Web need to change in the last 2-3 decades? Similar to the Web, the REST architecture is intended for long-lasting applications that will be around for years to come and that support natural evolution by design. For simple frontend-to-backend systems it is for sure overkill. It starts to shine especially in cases where there are multiple peers not under your control that you can interact with.
My understanding of REST is simply that a resource needs some means of self-describing itself. My understanding is that this isn't specifically tied to any one protocol (i.e. HTTP) and that there are theoretically numerous ways of achieving this. This is based on an answer to a SO question here: SOAP vs REST (differences) (and unlike the terrible answer to this question: Are Relay and Graphql RESTful?)
Since a GraphQL API is self-describing via introspection, doesn't that mean that GraphQL is RESTful by default since a client can use introspection to figure out how to query it?
While GraphQL is often mentioned as the replacement for REST, both tackle different problems actually.
REST, to start with, is not a protocol but just a style which, if applied correctly and fully, decouples clients from servers. A server following the REST principles will therefore provide the client with any information needed to take further steps. A client initially starts without any a-priori knowledge and learns on the fly through issuing requests and processing responses. HATEOAS describes the interaction model a REST architecture should be built upon. It thereby states that a link should be used to request new information which drives its internal flow. By utilizing representations similar to Web forms (HTML) a server can teach a client about needed inputs. Through the affordance of the respective elements a client knows, without any need for external documentation, what to do. For example, it might find a couple of options to choose one or several from, enter or update some free text or push some buttons. In HTML, forms usually trigger a POST request and send the entered data as application/x-www-form-urlencoded to the server, though the form element itself may define something different.
While REST is protocol agnostic, meaning it can be built on top of many protocols, HTTP is probably the most prominent one. A common example of a RESTful client is the Web browser we are all too familiar with. It will start by invoking either a bookmarked URI or one entered in the address bar and progress from there on.
HTTP doesn't specify the representation the request or response has to be sent in but leaves that to clients and servers to negotiate. This helps in decoupling, as both clients and servers can rely on the common interface (HTTP) and only bind strongly to the known media types used to exchange data. A peer not being able to process a document in a certain representation (due to the lack of support for the respective MIME type) will indicate to the other peer via a respective HTTP status code that it does not understand, and therefore can't serve, the requested media-type format. The media type, which is just a human-readable documentation of the syntax and the semantics of the data payload, is therefore the most important part in a REST architecture. Even Fielding claimed:
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). [Failure here implies that out-of-band information is driving interaction instead of hypertext.]
A media type teaches a peer how to parse and interpret the received payload and to actually make sense of it, though plenty of people still confuse REST with a JSON-based HTTP API with over-engineered URIs, putting too much effort into giving the URI some kind of logical sense when actually neither client nor server will interpret it anyway, as they will probably use the link relation name given for the URI.
GraphQL on the other hand is basically just a query language which gives the client the power to request the specific fields and elements it wants to retrieve from the server. It is, loosely speaking, some kind of SQL for the Web or, as Fielding termed it, just Remote Data Access (RDA). The client therefore has to have some knowledge of the available data beforehand, which couples clients to the server to some degree. If the server renames some of the fields, the client might no longer be able to retrieve that kind of information, though I'm not a GraphQL expert.
As stated above, REST is often confused for a JSON based HTTP API that allows to perform queries on directly mapped DB entries/entities. Keep in mind that REST doesn't prohibit this, though its focus is on the decoupling of peers not the retrieval aspect of some Web exposed database entries. As Jim Webber pointed out in a great talk back in 2011 in REST you don't simply expose database tables, you create a domain application protocol which clients will follow along like in a text-based computer game or in a typical Webshop system on the internet.
Especially the linked introspection documentation of GraphQL reminds me of reflection in Java, which couples to the actual class model available. If something in the data model changes, how does the GraphQL interaction behave? Is it able to change and adapt? Is a client built for one API able to work with another API out of the box? All these are basically requirements for a truly RESTful client. It basically has to adapt to future changes, as the server is free to evolve anytime. It further shouldn't assume that certain endpoints return certain types, but use content-type negotiation to request a representation it can work with.
These should give you enough insights to determine for yourself whether GraphQL can be RESTful or not. In my opinion it isn't, but my insights into GraphQL are rather limited, TBH.
Because graphql publishes Metadata about its types, it's entirely plausible (I think) to build a graphql client that could consume any graphql endpoint ...
SOAP did the exact same thing, though it was still an RPC protocol. A client could look up the ...?wsdl information at run-time and then generate a request according to the schema defined in the WSDL dynamically, though what usually happened was that some pre-generated stub-classes were generated based on the WSDL data that got compiled into a specific client. A client dynamically generating a request still needed a routine that defines what message-type to create and what data the message required as input.
While SOAP could potentially define multiple endpoints within a WSDL, in most cases only one was defined though. This endpoint usually only operates on POST requests even when later on (SOAP 1.2) GET would have been possible also.
According to Fielding's thesis
REST uses a resource identifier to identify the particular resource involved in an interaction between components.
, what would be the resource identifier in GraphQL? GraphQL's documentation states that
... In contrast, GraphQL's conceptual model is an entity graph. As a result, entities in GraphQL are not identified by URLs. Instead, a GraphQL server operates on a single URL/endpoint, usually /graphql, and all GraphQL requests for a given service should be directed at this endpoint.
Similar to SOAP, all the request are targeted towards a single endpoint. This has some impact if you consider caching, which is a further constraint REST implies. How are responses cacheable if the URI is the key used to store the response in the cache?
While all of the aggregation stuff and the flexibility may be nice from a consumer perspective, they are, probably, not in line with the constraints of REST, though Fielding himself claimed that REST is not applicable in all situations and that designers should select a style that fits their needs as not every style is the "silver bullet" to each problem. Even Mike Amundsen stated that GraphQL violates at least 3 constraints imposed by the REST architecture, even though GraphQL seems to have changed the default retrieval method from POST to GET since.
Usually, if you aim for long-living APIs that should be free to evolve in future and that has to deal with lots of clients, especially ones not under your direct control, this is when REST starts to shine. Fielding admits that most developers have problems when thinking long-term. For a single frontend-to-backend system or for a tailor-made client interacting with the own API, REST is not the architecture one should probably follow.
Last but not least, in a later tweet Fielding stated
There is no such thing as a REST endpoint. There are resources. A countably infinite set of resources bound only by restrictions on URL length. A client can POST to a REST service to create a resource that is a GraphQL query, and then GET that resource with all benefits of REST…
which I interpret as: don't focus too much on justifying whether GraphQL is REST or not, but think about how you can integrate its benefits into the overall design.
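One way to read the tweet is sketched below: the query text is posted once, becomes a resource with its own URI, and is afterwards retrieved with a plain GET that intermediaries can cache by URI. This is only an illustration of the idea. The QueryStore class, the /queries/ URI layout and the execute callback are assumptions and not part of GraphQL or of any particular library.

```python
import hashlib


class QueryStore:
    """In-memory stand-in for a service that turns posted queries into resources."""

    def __init__(self, execute):
        self._execute = execute  # hypothetical function that runs a query
        self._queries = {}       # resource URI -> query text

    def post_query(self, query_text):
        """POST: create (or reuse) a resource for this query and return its URI."""
        digest = hashlib.sha256(query_text.encode("utf-8")).hexdigest()[:16]
        uri = f"/queries/{digest}"
        self._queries[uri] = query_text
        # Simplification: 201 is returned even when the resource already existed.
        return 201, uri

    def get(self, uri):
        """GET: safe and idempotent, so the response can be cached by URI."""
        if uri not in self._queries:
            return 404, None
        return 200, self._execute(self._queries[uri])
```

Because the URI is derived from the query text, posting the same query twice yields the same resource, and every distinct query gets its own cache key instead of all requests sharing a single /graphql URI.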
My understanding of SOAP vs REST:
REST = JSON, simple consistent interface, gives you CRUD access to 'entities' (Abstractions of things which are not necessarily single DB rows), simpler protocol, no formally enforced 'contract' (e.g. the values an endpoint returns could change, though it shouldn't)
SOAP = XML, more complex interface, gives you access to 'services' (specific operations you can apply to entities, rather than allowing you to CRUD entities directly), formally enforced, pre-stated 'contract' (like a WSDL, where e.g. the return types are predefined and formalized)
Is that a broadly correct assessment?
What about a mixture?
If so, what do I call an API that is a mixture?
For example, what if we have what at surface level looks like a REST API (returns JSON, no WSDL or formalized contract defined), but instead of giving you access to the 'entities' that the system manages (User, product, comment, etc.) it gives you specific access to services and complex operations (/sendUserAnUpdate/1111, /makeCommentTextPurple/3333, /getAllCommentsByUserThisYear/2222) without having full coverage?
The 'services' already exist internally, and the team simply publishes access to them on a request by request basis, through what would otherwise look like a REST API.
Question:
What is the 'mixture' typically referred to as (besides, maybe, a bad API). Is there a word for it? or a concept I can refer to that'll make most developers understand what I'm referring to, without having to say the entire paragraph I did above?
Is it just "JSON SOAP API?", "A Service-based REST API?" - what would you call it?
Thanks!
If you take a look at all those so-called REST APIs your observation might seem true, though REST actually is something completely different. It describes an architecture or a philosophy whose intent is to decouple clients from servers, allowing the latter to evolve in future without breaking clients. It is quite similar to the typical Web page interaction in that a server will teach a client what it needs and only reacts to client-triggered requests. One has to be pretty careful and pedantic when designing REST services, as it is too easy to include a coupling that may affect clients when a change is introduced, especially with all the pragmatism around in (commercial) software engineering. Stefan Tilkov gave a great talk on REST back in 2014 that, alongside those by Jim Webber or Asbjørn Ulsberg, can be used as an introductory lecture on what REST is at its core.
The general premise in REST should always be that a server teaches clients what they need and what a server expects and offers choices to the client via links. If the server expects to receive data from the client it will send a form-esque representation to inform the client about the respective fields it supports and based on the affordance of the respective elements contained in the form a client knows whether to select one or multiple options, enter some free text or enter a date value and such. Unfortunately, most of the media-type formats that attempt to mimic HTML's forms are still in draft versions.
If you take a look at HTML forms in particular you might sense what I'm referring to. Each of the elements that may occur inside a form is well defined to avoid ambiguity and improve interoperability. This is de facto the ultimate goal in REST: having one client that is able to interact with a large number of other services without having to be adapted to each single API explicitly.
The beauty of REST is that it isn't limited to a single representation form, i.e. JSON. In fact there is an almost infinite number of possible representation formats that could be exchanged in a REST environment. Plain application/json is a terrible media-type for REST applications IMO, as it doesn't include any definitions in regards to links and forms and doesn't describe the semantics of certain fields that may be shipped in requests and responses. The lack of semantic description usually leads to typed resources, where a recipient expects that receiving data from e.g. /api/users returns some specific user data, which may differ from host to host. If you skim through IANA's media type registry you will find a couple of media-type formats you could have used to transfer user-related data, and any client supporting these representation formats would be able to interact with this endpoint without any issues. Fielding himself claimed that
A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type (and, in most cases, already defined by existing media types). (Source)
Through content-type negotiation client and server will negotiate about a representation format both support and understand. The question therefore shouldn't be which one to support but how many you want to support. The more media-type your API or client is able to exchange payloads for, the more likely it will be to interact with other participants.
Most of those so-called REST APIs are in reality just RPC services exposed via HTTP that may or may not respect and support certain HTTP operations. HTTP thereby is just a transport layer whose domain is the transfer of files or data over the Web. Plenty of people still believe that you shouldn't put verbs in URIs when in reality a script or process usually doesn't (and shouldn't) care whether a URI contains a verb or not. The URI itself is just a pointer a client will follow and invoke when it is interested in receiving the payload. We humans are also not that interested in the URI itself compared to the content it returns once invoked. The same holds true for arbitrary clients. What matters more is what you ship along with that URI. On the Web a link can be annotated with text and/or link relation names that set the link's content in relation to the current page. A relation may hint to a client that certain content can be fetched before the whole response has been parsed, as it is quite likely that the client will need that content as well; preload, for example, is such a link relation name. If certain domain-specific terms exist, one might use an extension scheme as defined by Web Linking or reuse common knowledge or special microformats.
The whole interaction in a REST environment is similar to playing a text-based computer game or following a certain process flow (e.g. ordering and paying for products) defined by an application domain protocol that can be designed as a state machine. The client is therefore guided through the whole process. It basically just follows the orders the server gave it, with some choices to break out of the process (e.g. cancelling the order before paying).
SOAP, on the other hand, is, as you've stated, an XML-based RPC protocol reusing a subset of HTTP to exchange requests and responses. The likelihood that plenty of clients have to be adapted and recompiled when you change something within your WSDL is quite high. SOAP even defines its own security mechanism instead of reusing TLS, which therefore requires explicit support by the clients. As you have a one-to-one communication model due to the state that may be kept in process, scaling SOAP services isn't that easy. In a REST environment this is just a matter of adding a load balancer in front of the server and then mirroring the server n times. The load balancer can send the request to any of the servers due to the stateless constraint.
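The "guided through the process" idea can be sketched in a few lines of Python. This is only a toy model: the STATES table stands in for a server, and every URI and relation name in it (checkout, payment, receipt, cancel) is invented for the example. The point is that the client only knows relation names and takes the URIs from the current state.

```python
# Hypothetical in-memory "server": each state lists the links it offers.
STATES = {
    "/cart": {"links": {"checkout": "/orders/1", "cancel": "/cancelled"}},
    "/orders/1": {"links": {"payment": "/orders/1/payment", "cancel": "/cancelled"}},
    "/orders/1/payment": {"links": {"receipt": "/orders/1/receipt"}},
    "/orders/1/receipt": {"links": {}},
    "/cancelled": {"links": {}},
}


def fetch(uri):
    """Stand-in for an HTTP GET that returns the representation of a state."""
    return STATES[uri]


def follow(start, relations):
    """Follow the given link relations one by one and return the visited URIs."""
    uri = start
    visited = [uri]
    for rel in relations:
        links = fetch(uri)["links"]
        if rel not in links:
            # The server does not offer this transition in the current state.
            raise LookupError(f"state {uri} does not offer relation {rel!r}")
        uri = links[rel]
        visited.append(uri)
    return visited
```

Calling follow("/cart", ["checkout", "payment", "receipt"]) walks the happy path, while asking for cancel after the payment state fails because the server no longer offers that choice there.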
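Why statelessness makes mirroring easy can be shown with a small Python sketch. It assumes a plain round-robin strategy and handlers that need nothing but the request itself; make_replica and RoundRobinBalancer are illustrative names, not part of any real load balancer.

```python
import itertools


def make_replica(name):
    """Create a stateless handler; every replica behaves the same way."""
    def handle(request):
        # Everything needed to answer is in the request; no session is looked up.
        return {"served_by": name, "body": f"hello {request['user']}"}
    return handle


class RoundRobinBalancer:
    """Hands each request to the next replica in turn."""

    def __init__(self, replicas):
        self._cycle = itertools.cycle(replicas)

    def dispatch(self, request):
        return next(self._cycle)(request)
```

Because no replica remembers anything about earlier requests, the body of the response is the same no matter which copy answered, so adding more copies needs no coordination between them.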
What is the 'mixture' typically referred to as (besides, maybe, a bad API)? Is there a word for it, or a concept I can refer to that'll make most developers understand what I'm referring to, without having to say the entire paragraph I did above?
Is it just "JSON SOAP API?", "A Service-based REST API?" - what would you call it?
The general term for an API that communicates on top of HTTP would be Web API or HTTP API IMO. This article also uses this term. It also lists XML-RPC and JSON-RPC besides SOAP. I do agree with Voice though that you'll receive 5 answers on asking 4 people about the right term to use. While it would be convenient to have a term available that everyone agrees upon, reality shows that people are not that interested in a clear separation. Just look here at SO at the questions tagged with rest. There is nothing wrong with not being "RESTful", though one should avoid the term REST for truly RPC services. Though I think we are already in a situation where the term REST can't be rescued from misuse and marketing purposes.
For something that requires external documentation to use and that ships with its own custom, non-standardized representation format, or that just exposes CRUD for domain objects, I'd add -RPC to it, as this is more or less what it is at its heart. So if the API sends JSON and the representation to expect is documented via Swagger or some other external documentation, JSON-RPC would probably be the most fitting name IMO.
To sum up this post, I hope I could shed some light on what REST truly is and how your observation is flawed by all those pragmatic attempts that unfortunately are RPC through and through. If you change something within their implementation, how many clients will break? In addition to that, you can't reuse the client that you've implemented for API A to interact with API B (of a different company or vendor) out of the box and therefore have to either adapt your client or create a new one solely for that API. This is true RPC and therefore should be reflected in the name somehow to hint developers about future expectations. Unfortunately, the battle for naming things properly, especially in regards to REST, seems already lost. There is a fine but tiny group who attempt to spread the true meaning, like Voice, Cassio and some others, though it is like fighting windmills. The best advice here would be to first discuss the naming conventions and what each participant understands by each term, and then agree on a naming scheme to avoid future confusion.
My understanding of SOAP vs REST
...
Is that a broadly correct assessment?
No.
REST is an "architectural style", which is to say a coordinated collection of architectural constraints. The World Wide Web is an example of an application built using the REST architectural style.
SOAP is a transport-agnostic message protocol specification, based on the XML Information Set.
If so, what do I call an API that is a mixture?
I don't think you are going to find an authoritative terminology here. Colloquially, you are likely to hear the broad umbrella term "web api" to describe an HTTP API that isn't "RESTful".
The whole space is rather polluted by semantic diffusion.
I have a Soap service that is running over http. Is this also a REST service? What are the criteria that would make it a REST service? What are the criteria that would definitively exclude it as a REST service? There are posts (e.g. here) that compare REST and Soap but do not seem to answer this question directly. My answer is: Yes, a Soap service at its functional level is an http request that returns an XML payload where state is not maintained by the server and is therefore a REST service.
Fielding stated in his dissertation:
REST provides a set of architectural constraints that, when applied as a whole, emphasizes scalability of component interactions, generality of interfaces, independent deployment of components, and intermediary components to reduce interaction latency, enforce security, and encapsulate legacy systems.
If you compare the above-mentioned properties with Web browsing, you will find plenty of similarities between both, as Fielding just took the concepts which made the Web such a success and applied them to a more general field that should also allow applications to "surf the Web".
In order to rightfully call an architecture REST, it has to support self-descriptiveness, scalability and cacheability while also respecting and adhering to the rules and semantics outlined by the underlying transport protocol and enforcing the usage of well-defined standards, such as media types, link relation names, HTTP operations, URI standards, ...
Self-descriptiveness of a service is achieved via HATEOAS (or hate-us, as I tend to pronounce it, as people like me who see the benefit in REST always have to stress this key term, which therefore also ended up in its own meme). Via HATEOAS the server provides a client with all the available "actions" the client could take from the current "state" it is in. An "action" here is just a link with an accompanying link relation name a client can use to deduce when to best invoke that URI. The media type the response was returned for may define what to do with such links. HTML, for example, states that on clicking a link a GET request is triggered and the content of the link is loaded either in the current pane or in a new tab, depending on the arguments the link has. Other media types may define something similar or something entirely different. The general motto here, though, is: proceeding through exploring. The interaction model in a REST architecture is therefore best designed in terms of affordances and a state machine, while the actual service should follow more of a Web site approach where a server teaches a client, e.g. what a request has to look like and where to send it (similar to HTML forms).
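The "server teaches the client" part, in the spirit of HTML forms, could look roughly like this in Python. The form description is a made-up dictionary layout (method, target, fields) and not an existing media type; build_request is a hypothetical helper that turns such a description plus user input into a request.

```python
from urllib.parse import urlencode


def build_request(form, values):
    """Build (method, uri, body) from a server-provided form description."""
    fields = form["fields"]
    missing = [f["name"] for f in fields if f.get("required") and f["name"] not in values]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    # Only send what the server asked for, in the order the server listed it.
    data = {f["name"]: values[f["name"]] for f in fields if f["name"] in values}
    method = form["method"].upper()
    if method == "GET":
        query = urlencode(data)
        uri = form["target"] + ("?" + query if query else "")
        return method, uri, None
    return method, form["target"], urlencode(data)
```

The client code contains neither the target URI nor the HTTP method; both arrive with the form description, so the server can change them without the client noticing.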
As plenty of Web pages are more or less static and a majority of requests are retrieval focused, the Web heavily relies on caching. The same is generally expected from REST APIs as well, hence the strong requirement for cacheability here, as this could reduce the workload on servers quite notably if proper caching is in place.
Keeping client state away from servers also makes it possible to add new copies of a service on new servers located behind a load balancer or in new regions and thus increase scalability. A client usually does not care where it retrieves the data from, hence a server might just return a URI pointing to a clone instead of itself.
SOAP, on the other hand, is RPC, like Java's remote method invocation (RMI) or CORBA, where you have a dedicated interface definition language (IDL) to generate client-side stub code for you. That stub code contains the actual logic for transforming certain objects into byte streams and sending them over the wire when you invoke certain methods.
Where SOAP clearly violates REST constraints is the lack of caching support as well as the out-of-band knowledge that needs to be available before a client can actually be used. SOAP messages are almost always exchanged as POST operations, which are not cacheable by default. Certain HTTP headers are available to allow intermediary servers to cache the response, though SOAP doesn't make use of them and thus lacks general support for caching.
A client developed for SOAP endpoint A will most likely also not be interoperable with a further SOAP endpoint B run by a different company. While one might argue that a Web client also does not know how to process each of the different media types, browsers usually provide a plugin mechanism to load that kind of knowledge into the client. A media type is, in addition to that, also standardized, or at least it should be, and may therefore be usable with plenty of servers (think of Flash support, for example). A further problem SOAP services have is that once anything is changed in the WSDL definition, clients not aware of the update will most likely stop working with that updated service until the client code is updated to work with the latest version of the generated stub classes.
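The difference in cacheability can be illustrated with a deliberately tiny Python cache. It is a sketch under strong assumptions: real HTTP caches follow Cache-Control, validators and many more rules, whereas here a max_age field on the response stands in for all of that, and TinyCache is an invented name.

```python
import time


class TinyCache:
    """Caches GET responses for max_age seconds; everything else passes through."""

    def __init__(self, origin, clock=time.monotonic):
        self._origin = origin  # callable(method, uri, body) -> response dict
        self._clock = clock
        self._store = {}       # uri -> (expires_at, response)

    def request(self, method, uri, body=None):
        if method != "GET":
            # POST (the usual SOAP case) is not cached by default.
            return self._origin(method, uri, body)
        now = self._clock()
        entry = self._store.get(uri)
        if entry and entry[0] > now:
            return entry[1]  # still fresh, the origin is not contacted
        response = self._origin(method, uri, body)
        max_age = response.get("max_age", 0)
        if max_age > 0:
            self._store[uri] = (now + max_age, response)
        return response
```

Two identical GET requests within the freshness window reach the origin once, while two identical POST requests always reach it twice.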
In regards to the XML format exchanged in SOAP: While technically it is doable for a REST service to return a SOAP XML payload, the format itself lacks support of HATEOAS, which is a necessity and not an option. How should a client make further choices based on the received response simply on the content received without any a-priori knowledge of the API itself?
I hope you can see that SOAP lacks support for caching, may have problems with scalability and leads to a tight coupling of clients to the actual API. The lack of HATEOAS support in the SOAP message envelope/header/body also does not allow clients to explore the API freely and thus adapt to server changes automatically.
Proper REST services follow the architectural guidelines spelled out in chapter five of Roy Fielding's dissertation. Most people erroneously use the term "REST API" when they really mean "HTTP API". Statelessness is a necessary but not sufficient condition for an API to adhere to the REST architectural guidelines.
In the explanation of the differences between web services or (Web) APIs there seems to be agreement that REST results in a less coupled architecture.
For example:
https://datatracker.ietf.org/doc/html/draft-li-sdnrg-design-restapi-02 mentions that REST is suited for lowly coupled systems.
https://www.upwork.com/hiring/development/soap-vs-rest-comparing-two-apis/ states that SOAP is too highly coupled
What are the arguments for considering it less or lightly coupled?
In a system where clients aren't coupled to a specific service API, clients will in general be more failure tolerant and thus more robust, besides being usable with multiple RESTful APIs. They will adapt to changes made on the server side, while a tightly coupled client will fail to process further requests.
In "REST APIs must be hypertext-driven" Fielding explained some of the constraints a RESTful architecture has and what could happen if an API fails to respect these rules.
As clients use links to interact with some remote server, a client has to have some knowledge on what a link is and what actions it can perform on it. This knowledge is in general defined by HTTP (or any other transport protocol) and URI specifications which are often built into the client by relying on certain frameworks or middleware. As links are a major part in REST and clients have to learn respective endpoints somehow Fielding referred to this in his blog post as:
... allow servers to instruct clients on how to construct appropriate URIs, such as is done in HTML forms and URI templates, by defining those instructions within media types and link relations.
However, you see plenty of so-called "REST" services that do not support clients in this, either by not returning URIs at all or by putting the semantics into the URI instead of keeping them in the relation. You will often see URIs like http://some.server.com/api/v1/users/1234, which may give humans a clue of their purpose, though if this "knowledge" is ported to a client, it might break the client easily once the server decides (or is instructed by someone) to change anything in the structure. If the server now moves the resource to e.g. http://some.server.com/api/v1/employees/1234, the client won't be able to retrieve data of the user/employee any longer and thus breaks.
Instead, the server should provide the client with the needed information. It can add some redirect logic which, upon invoking the former URI, informs the client that the resource can now be found at the new location. The response from the server itself should name such a URI so a client can refer to a resource endpoint via the name instead of analyzing the URI. In HTML this can be achieved like this: <a rel="user" href="...">Sam Sample</a>. Instead of analyzing the URI for semantic structures, which also often leads to typed resources, the client simply uses what is given by the server and grasps the sense of the URI via the relation name user in the sample.
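As a rough sketch of what "using what is given by the server" could look like in a client, the following Python snippet uses the standard library's html.parser to look up a link by its relation name. The markup fed to it and the helper names LinkCollector and find_link are assumptions made for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of the first link seen for each relation name."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated relation names
        for rel in (attributes.get("rel") or "").split():
            self.links.setdefault(rel, href)


def find_link(html, rel):
    """Return the URI the server attached to the given relation, or None."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links.get(rel)
```

The client asks for find_link(page, "user") and invokes whatever comes back. Whether the URI contains users or employees makes no difference to it.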
As HTTP (or any other transmission protocol used) allows almost any data to be sent between client and server, media types are used by servers and clients in order to agree upon a data representation format both sides are able to understand and know how to process. The media type is therefore some kind of knowledge base of what to do with certain data. It can describe the syntactical structure of a document, the necessary elements to expect and the semantics each field has.
According to Fielding
a REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types. Any effort spent describing what methods to use on what URIs of interest should be entirely defined within the scope of the processing rules for a media type
If you, however, take a close look at plenty of questions here at StackOverflow most messages are exchanged in plain JSON which does not convey any semantics about the actual data received nor does it hint the client on possible actions that can be performed on this data. HAL and similar media types at least provide some clue on resources and links a client can use in order to process further actions.
As the media type instructs a client or server on how to process certain data, it might contain an indication that a link with a relation name like user references a user resource from which further data of that user can be retrieved. If the URI of the resource is changed, a RESTful client will still be able to perform its task, as it can deduce from the media type that the information for a user can be retrieved via the relation name user. Where this URI is actually pointing isn't of much relevance, as the client will only invoke it to retrieve further data.
As the question also targeted SOAP, it is important to know that a SOAP API is very different from REST by nature. The tight coupling is established via the WSDL contract, which defines the server endpoint, the operations available to invoke, the parameters needed and the response types to expect. If the server adds or (re)moves certain parameters after a client has implemented that contract, that client will fail to send further requests and hence needs to be updated before it can continue to work.
In this very simple scenario of letting the server move around some resources it hopefully becomes clear that a client's knowledge is kept in media types and that it acquires its state through interacting with the services rather than having it implemented in the code itself (like in SOAP or any proprietary Web API client). The client is therefore not coupled to the API itself but to the media types, which can be added dynamically.
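The contract coupling can be mimicked in plain Python, with the caveat that this is only an analogy: a real SOAP stack would answer with a fault or a schema validation error rather than a Python TypeError, and ServiceV1, ServiceV2 and GeneratedStubV1 are invented stand-ins for a service and for stub code generated from its WSDL.

```python
class ServiceV1:
    """The operation as described by the first version of the contract."""

    def get_user(self, user_id):
        return {"id": user_id, "name": "Sam Sample"}


class ServiceV2:
    """The contract changed: a new mandatory parameter was added."""

    def get_user(self, user_id, tenant):
        return {"id": user_id, "tenant": tenant, "name": "Sam Sample"}


class GeneratedStubV1:
    """Stands in for client code generated from the version 1 contract."""

    def __init__(self, service):
        self._service = service

    def get_user(self, user_id):
        # The call shape was frozen when the stub was generated.
        return self._service.get_user(user_id)
```

GeneratedStubV1(ServiceV1()).get_user(1234) works, while the same stub pointed at ServiceV2 fails on the first call and stays broken until the stub is regenerated and the calling code is adapted.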