Workarounds for adding new fields to an existing output data type in SOAP

According to this IBM article about backwards compatibility in SOAP, new fields cannot be added to output types without breaking the contract. The relevant snippet is from the section titled "New, optional fields in an existing data type":
You can add an element to an existing complexType as long as you make it optional (using the minOccurs="0" attribute). But be careful. Adding an optional element is a minor change only if its enclosing complexType is received as input to the new service. The new service cannot return a complexType with new fields. If an old client were to receive the new field, the client deserialization would fail because the client would not know about the new field.
This was written in 2004 for the WSDL 1.1 spec. Is this still true under the current WSDL 1.2 spec? Is there no way to define a default behavior of "ignore" for new unknown fields? The statement also seems implementation-specific; or is it required by the spec?
I am trying to contend with the issue of evolving a SOAP service that returns complex business objects. New fields will be added as consumers find use cases for them. I would like to avoid having to keep N versions of the service around simply to add new fields.

From my personal experience, this is still the case. I think your main concern is the versioning methodology. You can look at http://www.ibm.com/developerworks/webservices/library/ws-version/ or, closer to home, Web Services API Versioning.
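The failure described in the quoted article happens in the client's deserializer, so whether an unknown field is fatal depends on how strictly that client reads the response. Below is a minimal sketch of the difference between a strict reader and a tolerant one. The element names (customer, id, name, loyaltyTier) are hypothetical, namespaces and the SOAP envelope are left out, and it is not a statement about what any particular SOAP toolkit does by default.

```python
import xml.etree.ElementTree as ET

# Fields this (old) client was built against; the names are hypothetical.
KNOWN_FIELDS = ("id", "name")


def read_customer_tolerant(xml_text):
    """Pick out the known child elements and silently skip the rest."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        if child.tag in KNOWN_FIELDS:
            result[child.tag] = child.text
    return result


def read_customer_strict(xml_text):
    """Reject any child element the client does not know about."""
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        if child.tag not in KNOWN_FIELDS:
            raise ValueError("unexpected element: " + child.tag)
        result[child.tag] = child.text
    return result


# Response from a newer service that added an optional <loyaltyTier> field.
NEW_RESPONSE = (
    "<customer><id>42</id><name>Ada</name>"
    "<loyaltyTier>gold</loyaltyTier></customer>"
)
```

With the tolerant reader the old client keeps working against NEW_RESPONSE, while the strict reader raises. Since you usually do not control how existing consumers deserialize, the conservative assumption is that some of them behave like the strict reader.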

Related

How do PUT, POST and PATCH requests ultimately differ?

The data sent in a PUT/PATCH/POST request ultimately ends up in the database.
Now whether we are inserting a new resource or updating or modifying an existing one - it all depends upon the database operation being carried out.
Even if we send a POST and ultimately perform just an update in the database, it does not make any difference at all, does it?
Hence, do they actually differ - apart from a purely conceptual point of view?
"Hence, do they actually differ - apart from a purely conceptual point of view?"
The semantics differ - what the messages mean, and what general purpose components are allowed to assume is going on.
The meanings are just those described by the references listed in the HTTP method registry. Today, that means that POST and PUT are described by HTTP semantics; PATCH is described by RFC 5789.
Loosely: PUT means that the request content is a proposed replacement for the current representation of some resource -- it's the method we would use to upload or replace a single web page if we were using the HTTP protocol to do that.
PATCH means that the request content is a patch document - which is to say a proposed edit to the current representation of some resource. So instead of sending the entire HTML document with PUT, you might instead just send a fix to the spelling error in the title element.
POST is... well, POST is everything else.
POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.” -- Fielding 2009
The POST method has the fewest constraints on its semantics (which is why we can use it for anything), but the consequence is that the HTTP application itself has to be very conservative with it.
Webber 2011 includes a good discussion of the implications of the fact that HTTP is an application protocol.
"Now whether we are inserting a new resource or updating or modifying an existing one - it all depends upon the database operation being carried out."
The HTTP method tells us what the request means - it doesn't place any constraints on how your implementation works.
See Fielding, 2002:
HTTP does not attempt to require the results of a GET to be safe. What it does is require that the semantics of the operation be safe, and therefore it is a fault of the implementation, not the interface or the user of that interface, if anything happens as a result that causes loss of property (money, BTW, is considered property for the sake of this definition).
The HTTP methods are part of the "transfer of documents over a network" domain - i.e. they are part of the facade that allows us to pretend that the bank/book store/cat video archive you are implementing is just another "web site".
It is about the intent of the sender, and from my perspective each method also leads to different behaviour on the server side.
In a nutshell:
POST: creates a new data entry on the server (especially with REST).
PUT: updates a full data entry on the server (REST) or creates a new data entry (non-REST). The difference from a POST request is that the client specifies the target location on the server.
PATCH: the client requests a partial update (the ID and partial data of the entry are given). The difference from PUT is that the client does not send the full data back to the server, which can save bandwidth.
In general, a server could be written to store data in response to any HTTP method (GET, HEAD, DELETE...), but it is common practice to use POST, PUT, and PATCH for these specific, standardized scenarios so that every developer can understand the API later.
They are slightly different, and they map to different concepts of a REST API (which is based on HTTP).
Just imagine that you have some Booking entity, and you perform the following actions on that resource:
POST - creates a new resource. It is not idempotent: if you send the same request twice, two bookings will be stored, and a third request will create a third one. You are updating your DB with every request.
PUT - updates the full representation of a resource. That is, it replaces the full booking object with a new one. It is idempotent: you could send the request ten times and the result will be the same (if the resource wasn't changed between your calls).
PATCH - updates some part of the resource. For example, your booking entity has a date property and you update only this property, replacing the existing date with the new date sent in the request.
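The Booking behaviour described above can be sketched with an in-memory store. This is only one conventional way a server might implement the three methods; the class name, field names, and the choice of sequential IDs are assumptions made for illustration.

```python
import itertools


class BookingStore:
    """In-memory stand-in for the Booking resource collection (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.bookings = {}

    def post(self, data):
        # POST: the server picks the identifier, so every call adds an entry.
        booking_id = next(self._ids)
        self.bookings[booking_id] = dict(data)
        return booking_id

    def put(self, booking_id, data):
        # PUT: replace the whole representation at a client-chosen location.
        self.bookings[booking_id] = dict(data)

    def patch(self, booking_id, changes):
        # PATCH: change only the supplied fields of an existing booking.
        self.bookings[booking_id].update(changes)
```

Calling post twice with the same data leaves two bookings in the store, whereas calling put twice with the same ID and data leaves the store exactly as it was after the first call.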
"However, for either of the above - who is deciding whether it is going to be a new resource creation or updating/modifying an existing one, it's the database operation or something equivalent to that which takes care of persistence"
You are mixing different things.
The persistence layer and the UI layer are two different things.
The general pattern used in Java is Model View Controller.
A REST API has its own concepts, and the DB layer has its own purpose. Keep in mind that separating the work of your application into layers is exactly what high cohesion means: code that is narrowly focused, does one thing, and does it well.
In my answer I mainly described concepts for REST.
The main decision about what the application should do, create a new entity or update an existing one, is made by the developer. This kind of decision is usually implemented in the service layer, which can also handle additional concerns such as transaction support, filtering of data from the DB, pagination, etc.
It also depends on how the DB layer is implemented: JPA with Hibernate, a JDBC template, custom query execution...

Versioning related media types individually or in lockstep in a RESTful API

I'm developing a REST API around an ecommerce site and one of my resources is an Order which contains information like when it was made, the ID, the status, when it will be shipped, etc.
I have defined a media type for my Order resource like so:
application/vnd.myapp.order.v1+json
I also have defined another resource which is the status of an order, like so:
application/vnd.myapp.order-status.v1+json
My question is about the versioning of these media types. Seeing as they're related, would it make sense to version them in lockstep? For example, if the representation of the order resource changes and I create an application/vnd.myapp.order.v2+json, would it be wise to also bump the version of the order-status media type to v2? I'm also wondering whether there is a RESTful option with regard to the guidelines. I did have a look around online and couldn't really find anything talking about the best practice here, so any advice/opinions are appreciated.
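To make the trade-off concrete, here is a small sketch of what versioning the two media types individually implies on the server: the version has to be parsed and checked per resource rather than once for the whole API. The SUPPORTED table and the function names are assumptions for illustration; only the media type naming scheme comes from the question.

```python
import re

# Pattern for the vendor media types used in the question:
# application/vnd.myapp.<resource>.v<N>+json
MEDIA_TYPE = re.compile(
    r"^application/vnd\.myapp\.(?P<resource>[a-z-]+)\.v(?P<version>\d+)\+json$"
)

# Hypothetical state after bumping only the order media type to v2.
SUPPORTED = {"order": {1, 2}, "order-status": {1}}


def parse_media_type(value):
    """Return (resource, version), or None if the value does not match."""
    match = MEDIA_TYPE.match(value)
    if match is None:
        return None
    return match.group("resource"), int(match.group("version"))


def is_supported(value):
    """Check a media type against the per-resource version table."""
    parsed = parse_media_type(value)
    if parsed is None:
        return False
    resource, version = parsed
    return version in SUPPORTED.get(resource, set())
```

With individual versioning, a client asking for application/vnd.myapp.order-status.v2+json is refused here even though order v2 exists. With lockstep versioning the table would collapse to a single set of versions shared by every resource.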
I don't think mixing version and media type is a good idea.
In my opinion, you should separate them according to the 'separation of concerns' and 'single responsibility' principles.
https://en.wikipedia.org/wiki/Separation_of_concerns
https://en.wikipedia.org/wiki/Single-responsibility_principle
Many teams use a header or the URL for versioning:
For example:
/api/v1/
/api/v2/
header {version:'v1'}
header {version:'v2'}
Then we can easily map the request with our need:
@RequestMapping(value="api/v1/books", consumes="application/json")
@RequestMapping(value="api/v2/books", consumes="application/json")
or
@RequestMapping(value="api/books", headers="version=v1", consumes="application/json")
@RequestMapping(value="api/books", headers="version=v2", consumes="application/json")
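The two mapping styles above can also be sketched without a framework, which shows what the routing layer has to extract from each request. The handler table, the function names, and the rule of preferring the path over the header are assumptions made for this example.

```python
import re

# Hypothetical handlers, one per API version.
HANDLERS = {
    "v1": lambda: "books (v1 shape)",
    "v2": lambda: "books (v2 shape)",
}


def version_from_path(path):
    """Extract the version from a path such as /api/v2/books."""
    match = re.match(r"^/api/(v\d+)/", path)
    return match.group(1) if match else None


def version_from_header(headers):
    """Read the version from a custom 'version' header (case-insensitive)."""
    for name, value in headers.items():
        if name.lower() == "version":
            return value
    return None


def dispatch(path, headers):
    # Prefer the path segment; fall back to the header.
    version = version_from_path(path) or version_from_header(headers)
    handler = HANDLERS.get(version)
    if handler is None:
        raise LookupError("unsupported or missing API version")
    return handler()
```

For instance, dispatch("/api/v1/books", {}) and dispatch("/api/books", {"Version": "v2"}) reach different handlers, while a request with neither a versioned path nor the header is rejected.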
Although it seems useful, it would be a violation of SoC and cause additional issues down the line as your API evolves.
Versioning your URLs is the better choice here, as a URL is a good way to signal what type of resource is being dealt with and its relation to the data it handles.
(To me, introducing a custom header sounds much better design-wise than a custom versioned media type.)
Custom media types are generally supposed to tell a consumer about the type of the data and its encoding scheme (e.g. xml vs json vs plain text and so on) and not how your fields are arranged from version to version while the encoding scheme is literally unchanged.
By choosing this path you would:
Force consumers of your API to couple tightly to that specific “representation”, which creates maintenance issues on both sides.
Introduce ambiguity whenever multiple “versions” of your API co-exist: when bodiless HTTP methods such as DELETE or HEAD are used, the request information would simply be insufficient to route the request correctly, let alone handle it completely in the backend code.
Render rels (Link Relation Types) less usable in their normal form (if you’d ever want to introduce them).

REST: How to support create-or-update and partial update? (aka PUT vs PATCH)

We are designing a Web API for our software for managing ecommerce product information. We want to provide (among many others) two operations:
Simple one: allow the user to add/modify existing product information:
don't create a new product if it does not exist
don't delete any information from the existing product that was not provided in this request
In my opinion the HTTP PATCH method is the proper way to handle this scenario (with json-patch or json-merge-patch), with a URL like this: /products/{ID}
Harder one: allow the user to add/modify an existing product or create one:
create the product if it does not exist in the DB
don't delete any information from the existing product that was not provided in this request (same behaviour as in the first case)
I'm struggling with designing a REST endpoint for this second use case. I have a few options, but none of them fits the REST principles perfectly for me:
a) Add a custom HTTP header to the endpoint designed for the first case (PATCH) to let the caller control the "not found" behaviour, e.g. create-entity-when-not-exists: true/false - but in my opinion PATCH shouldn't be used for creating resources.
b) Design a new endpoint using PUT with a special header "preserve-not-provided-data" - this, on the other hand, violates the PUT principles for me, because PUT is a create-or-replace method, not a create-or-update method.
c) Create PATCH for the /products URL (without {ID} at the end) - in this case we are updating the whole collection (resource) of products, so if a product exists we can update it, or create a new one if it does not exist.
For now solution c) looks fine to me, with one exception: if in the future we would like to support batch operations (for both use cases 1 and 2), we would like to use the /products URL, and it will conflict with the URL from solution c).
What do you think? Do you have any other ideas?
PUT and PATCH have differing message semantics, but the core context ("remote authoring") is the same. In both cases, the client request is "Please, server, make your representation of this resource match my local copy".
For example, I GET a JSON document from the server. I make local edits to it. Now I want to "save" my changes on the server. If the document is modest in size, I might just send the entire revised document over the network. If the document is very large, and my changes are modest, then I might send a patch instead.
If you imagine using HTTP to publish edits of HTML web pages to a server, then you've got the right frame of reference. There's not a lot of practical difference between "please patch the title of your copy of the document" and "here is a complete new copy of the document, with my edit to the title". The bytes on disk are going to be the same in either case.
Given that, it would be very odd if those two methods for publishing a new revision of the document were to have vastly different side effects.
Your third approach, based on modifying /products, is potentially fine for both your individual and batch use cases. The server gets the new representation of /products (or the patch document describing the changes), decides whether to accept the changes, and if so computes what it needs to do to its own database to make things work.
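As a sketch of how a patch against /products could cover both "update without dropping unprovided fields" and "create if missing", the example below applies a JSON Merge Patch (RFC 7396) to a collection keyed by product ID. Modelling the collection as one JSON object keyed by ID, and the product fields themselves, are assumptions made for illustration; a real server would translate the result into database operations.

```python
import copy


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched document."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Server-side view of the /products collection, keyed by product ID.
products = {"A1": {"name": "Kettle", "price": 30, "color": "red"}}

# Patch document sent with PATCH /products.
patch_document = {
    "A1": {"price": 25},        # existing product: only the price changes
    "B2": {"name": "Toaster"},  # unknown product: gets created
}

updated = merge_patch(products, patch_document)
```

After the patch, A1 still has its name and color and B2 exists with only the supplied name. A batch request is the same patch document with more keys, which is why the individual and batch cases need not conflict on this URL.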
Note:
A PUT request applied to the target resource can have side effects on other resources.
The HTTP specification is relatively strict about what the message means, but offers the server a lot of leeway in how it behaves in response.

How to specify data security constraints in REST APIs?

I'm designing a REST API and I'm a big defender of keeping my URL simple, avoiding more than two nested resources.
However, I've been having second thoughts because of data security restrictions that apply to my APIs, that have been trying to force me to nest more resources. I'll try to provide examples to be more specific, as I don't know the correct naming for this situation.
Consider a simple example where I want to get a given contact restriction for a customer, such as the period during which my customer agrees to be contacted by phone.
So, I believe it's simpler to have this:
- GET /customers/12345
- GET /customers/12345/contacts
- GET /contacts/9999
- GET /contacts/9999/restrictions
- GET /restrictions/1
than this:
- GET /customers/12345
- GET /customers/12345/contacts
- GET /customers/12345/contacts/9999
- GET /customers/12345/contacts/9999/restrictions
- GET /customers/12345/contacts/9999/restrictions/1
Note: If there are more related resources, who knows where this will go...
The first case is my favourite because, since all resources MUST have a unique identifier, as soon as I have that identifier I should be able to get the resource instance directly: GET /restrictions/1
The data security restriction in place in my company states that not everyone can see every customer's info (e.g. only some managers can access private equity customers). So, to guarantee that, the architects are telling me I should use /customers/12345/contacts/9999/restrictions/1 or /customers/12345/contact-restrictions/1 so that our data access validator in our platform has the customerId to check if the caller has access to it.
I understand the requirement and I see its value. However, I think that this kind of custom security information, because that's what I believe it to be, should be in a custom header.
So, I believe I should stick to GET /restrictions/1 with a custom header "customerId" with the value 12345.
This custom header would only be needed for the apis that have this requirement.
Besides the simpler URL, another advantage of the header is that if an API didn't start with that security requirement and suddenly needs to comply with it, we could simply require the header to be passed, instead of redefining paths.
I hope I made it clear for you and I'll be looking to learn more about great API design techniques.
Thank you all that reached the end of my post :)
TL;DR: you are fighting over URI design, and REST doesn't actually offer guidance there.
REST, and REST clients, don't distinguish between your "simpler" design and the nested version. A URI is just an opaque sequence of bytes with a little domain-agnostic semantics.
/4290c3b2-134e-4647-867a-214d0c866f29
is a perfectly "RESTful" URI. See Stefan Tilkov, REST: I don't Think it Means What You Think it Does.
Fundamentally, REST servers are document stores. You provide a key (the URI) and the server provides the document. Or you provide a key, and the server modifies the document.
How this is implemented is completely at the discretion of the server. It could be that /4290c3b2-134e-4647-867a-214d0c866f29 is used to look up the tuple (12345, 9999, 1), and then the server checks to see if the credentials described in the request header have permission to access that information, and if so the appropriate representation of the resource corresponding to that tuple is returned.
From the client's perspective, it's all the same thing: I provide an opaque identifier in a standard way, and credentials in a standard way, and I get access to the resource or I don't.
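The lookup-then-authorize flow described above might look like the following sketch. The index table, the token values, and the use of a bare token in the Authorization header are all simplifying assumptions; the point is only that the server can recover the customer ID from an opaque URI without it appearing in the path.

```python
# Hypothetical lookup table:
# opaque path -> (customer_id, contact_id, restriction_id).
RESOURCE_INDEX = {
    "/4290c3b2-134e-4647-867a-214d0c866f29": (12345, 9999, 1),
}

# Hypothetical access rules: which callers may see which customers.
ALLOWED_CUSTOMERS = {
    "manager-token": {12345},
    "clerk-token": set(),
}


def handle_get(path, headers):
    """Return (status, body) for a GET on an opaque URI."""
    key = RESOURCE_INDEX.get(path)
    if key is None:
        return 404, None
    customer_id, contact_id, restriction_id = key
    token = headers.get("Authorization", "")
    if customer_id not in ALLOWED_CUSTOMERS.get(token, set()):
        return 403, None
    body = {
        "customer": customer_id,
        "contact": contact_id,
        "restriction": restriction_id,
    }
    return 200, body
```

The client sends the same kind of request in every case: an identifier and credentials. Whether the customer ID is recovered from a table like this or read straight out of a nested path is a server-side cost decision.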
"the architects are telling me I should use /customers/12345/contacts/9999/restrictions/1 or /customers/12345/contact-restrictions/1 so that our data access validator in our platform has the customerId to check if the caller has access to it."
"I understand the requirement and I see its value. However, I think that this kind of custom security information, because that's what I believe to be, should be in a custom header."
There's nothing in REST to back you up. In fact, the notion of introducing a custom header is something of a down check, because your custom header is not something that a generic component is going to know about.
When you need a new header, the "REST" way to go about it is to introduce a new standard. See RFC 5988 for an example.
Fielding, writing in 2008:
Every protocol, every media type definition, every URI scheme, and every link relationship type constitutes prior knowledge that the client must know (or learn) in order to make use of that knowledge. REST doesn’t eliminate the need for a clue. What REST does is concentrate that need for prior knowledge into readily standardizable forms.
The architects have a good point: encoding into the URI the hints that make it easier, cheaper, or more reliable to use your data access validator is exactly the sort of thing that allowing servers to control their own URI namespace is supposed to afford.
The reason that this works, in REST, is that clients don't depend on URI for semantics; instead, they rely on the definitions of the relations that are encoded into the links (or otherwise expressed by the definition of the media type itself).

RESTful DELETE with reason

For a resource that will be deleted, ultimately with a soft delete (isDeleted flag), I am looking to provide a reason to store along with the resource for audit purposes.
The options I have encountered don't feel correct.
Custom HTTP Header
DELETE with Body
I have also considered instead using a PUT, but the content I would be putting is different from what makes up the resource on a typical update.
Which method makes the most sense from a RESTful perspective?
DELETE with a body is wrong, in that it doesn't respect the semantics of the uniform interface defined in the HTTP specification.
A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request.
Note that the spelling used here is the same as that of a payload for a GET request.
Semantically, DELETE is the right choice; soft vs hard delete is "beyond the scope of the specification", which is to say it is an implementation choice.
But communicating the "reason" gives you two problems to solve. One is where to put that reason, and the answer is, of course, to use a header.
New header fields can be defined such that, when they are understood by a recipient, they might override or enhance the interpretation of previously defined header fields, define preconditions on request evaluation, or refine the meaning of responses.
You can look through the message-headers registry to see if there is a close match to your requirements, but failing that you would define one of your own.
The second problem is figuring out how to communicate with the client so that it knows to use the header field. The most common approach today is to just write the header into the description of the API, but that's not quite REST.
The REST answer is that your hypermedia specification describes how the server might communicate to the client which headers are important, and what data should be put there. Imagine an HTML form with a "field-value" input control, and you've got the right idea.
Not many APIs bother to do it that way.
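A minimal sketch of the header approach follows, assuming a made-up header name (Deletion-Reason is not a registered header field) and an in-memory resource table. It covers only the server side; how the client learns about the header is the second problem discussed above.

```python
# Hypothetical header name; it is not a registered HTTP header field.
REASON_HEADER = "Deletion-Reason"

# In-memory stand-ins for the resource table and the audit trail.
resources = {"/orders/7": {"total": 99, "isDeleted": False}}
audit_log = []


def handle_delete(path, headers):
    """Soft-delete the resource and record the reason, if one was sent."""
    resource = resources.get(path)
    if resource is None or resource["isDeleted"]:
        return 404
    resource["isDeleted"] = True
    # Header lookup is case-sensitive here to keep the sketch short.
    audit_log.append({"path": path, "reason": headers.get(REASON_HEADER)})
    return 204
```

The request is still a plain DELETE; the soft delete and the audit entry are implementation details behind it.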
PUT is an intriguing choice; there's nothing in the rules that says that a resource can have only one content type, or that an endpoint must accept only one content type.
For instance, RFC 7807 defines application/problem+json, a simple representation for reporting issues from the server. But there's no reason that you couldn't PUT an application/problem+json representation to a resource to induce a soft delete.
This specification gives you both a title and a detail member to play with, so the client has room to work.
Of course, it doesn't have to be application/problem+json -- you can specify a more suitable media type of your own design.
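The PUT variant could be sketched as below, assuming the server treats an application/problem+json body as a request to soft-delete. The resource table, the deleteReason field, and the choice to answer 415 for any other content type are assumptions made for this example; a real endpoint would also accept its ordinary update representation.

```python
import json

# In-memory stand-in for the resource table.
documents = {"/orders/7": {"total": 99, "isDeleted": False, "deleteReason": None}}


def handle_put(path, content_type, body):
    """Treat a problem+json body as a request to soft-delete the resource."""
    if path not in documents:
        return 404
    if content_type != "application/problem+json":
        return 415  # this sketch only handles the soft-delete representation
    problem = json.loads(body)
    documents[path]["isDeleted"] = True
    # Prefer the longer "detail" text, falling back to the "title".
    documents[path]["deleteReason"] = problem.get("detail") or problem.get("title")
    return 204
```

A client would send something like {"title": "Duplicate order", "detail": "Created twice by a retry"} with the problem+json content type.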
Again, you have problems similar to those of using DELETE with a custom header: how does the client discover that your resources support deletion via PUT?