Multiple entity replacement in a RESTful interface - rest

I have a service with some entities that I would like to expose in a RESTful way. Due to some of the requirements I have some trouble finding a way I find good.
These are the 'normal' operations I intend to support:
GET /rest/entity[?filter=<query>] # Return (matching) entities. The filter is optional and just a convenience for us CLI curl-users :)
GET /rest/entity/<id> # Return specific entity
POST /rest/entity # Creates one or more new entities
PUT /rest/entity/<id> # Updates specific entity
PUT /rest/entity # Updates many entities (json-dict or multipart. Haven't decided yet)
DELETE /rest/entity/<id> # Deletes specific entity
DELETE /rest/entity # Deletes all entities (dangerous but very useful to us :)
Now, the additional requirements:
We need to be able to replace the entire set of entities with a completely new set of entities (merging can occur internally as an optimization).
I thought of using POST /rest/entity for that, but that would remove the ability to create single entities unless I move that functionality. I've seen /rest/entity/new-style paths in other places, but it always seemed a bit odd to reuse the id path segment for that as there might or might not be a collision in IDs (not in my case, but mixing namespaces like that gives me an itch :)
Are there any common practices for this type of operation? I've also considered /rest/import/entity as a separate path for similar non-restful operations for other entity types we might have, but I don't like moving it outside of the entity home path.
We need to be able to perform most operations in a "dry-run"-mode for validation purposes.
Query strings are usually considered anathema, but I'm already a sinner for the filter one. For the validation mode, would adding a ?validate or ?dryrun flag be ok? Have anyone done anything similar? What are the drawbacks? This is meant as an aid for user-facing interfaces to implement validation easily.
We don't expect to have to use any caching mechanism as this is a tiny configuration service rarely touched, so optimization for caching is not strictly necessary

We need to be able to replace the entire set of entities with a
completely new set of entitiescompletely new set of entities
That's what this does, no?
PUT /rest/entity
PUT has replace semantics. Maybe you could use the PATCH verb to support doing partial updates.
Personally, I would change the resource name to "EntityList" or "EntityCollection", but that's just because it is clearer for me.

Related

Structuring nested rest API

I'm writing an API with spring boot, trying to keep it restful but the structure is quite nested. So say I have:
/api/examboard/{ebid}/qualification/{qid}/subject/{sid}/module/{mid}/
I have a controller for every noun that will take in all Id's, the problem with this is that I don't really need an ebid or a qid for modules, they only really need to be concerned with subjects most of the time. The mapping between them all is quite simple. An examboard will have many qualifications, a qualification will have many subjects etc....
Now the problem is say I go for a simpler API design where I only need the parent Id so the Subject controller will also have:
api/subject/{sid}/module
then I need to include multiple services in my controller based on the way JPA works. As I need to include SubjectEntity based calls and ModuleEntity based calls. However I want to maintain a one to one relationship between my controllers/services and services/repositories. This is why I opted for the longer url as I've mentioned above, but it does seem like overkill. Does anyone have any advice on how I should structure an API like this, most examples are quite small and don't really fit.
Without knowing more about your models and the relations between them, this answer will have to stay a bit diffuse.
First of all - "it depends". I know, but it really does. The way you should design an API depends heavily on your use cases that will define required access patterns. Do you often need all modules for a subject? Then introduce /subjects/{sid}/modules, if you need the details for a module of a subject in a qualification in an examboard - by all means have a /examboards/{ebid}/qualifications/{qid}/subjects/{sid}/modules/{mid}
As you say there are many relations between your entities. That is fine, but it does not mean that you need your API to capture each of these relations in a dedicated endpoint. You should distiguish between retrieving and modifying entities here. Find below examples for certain operations you might want to have (not knowing your models, this may not apply - let's consider this an illustration)
Retrieve qualifications for an examboard
GET /examboards/{ebid}/qualifications plain and simple
GET /qualifications?ebid={ebid} if you feel you might need sophisticated filtering later on
or create a new qualitication for an examboard
POST /examboards/{ebid}/qualifications with the details submitted in the body
POST /qualifications with the details submitted in the body and making the associated examboard ebid part of the submitted data
or update an existing qualification
PUT /qualifications/{qid} (if this operation is idempotent)
POST /qualifications/{qid} (if it should not be considered idempotent)
or delete qualifications
DELETE /qualifications/{qid} deletes entities, cascade-deletes associations
DELETE /examboards/{ebid}/qualifications clears all qualifications from an examboard, without actually deleting the qualification entities
There are certainly more ways to let an API do all these things, but this should demonstrate that you need to think of your use cases first and design your API around them.
Please note the pluralisation of collection resources in the previous examples. This comes down to personal preference, but I tends to follow the argumentation of Sam Ruby in RESTful Web Services (available as PDF) that collections should be first-class citizens in an API
Usually, there should not be a reason to have 1:1:1 relationships between controllers, services and repositories. Usually, this is not even possible. Now, I don't know the reason why you might want to do this, but following through with this will force you to put a lot of logic into your database queries and models. While this (depending on your setup and skills) may or may not be easily testable, it certainly shifts the required test types from unit (simpler, usually faster, more fine-grained) to integration tests (require more setup, more complex, usually slower), when instead of having the bulk of your business logic in your services you put them into many joins and subselects in your repositories.
I will only address your REST API structure question.
As you already pointed out
The problem with this is that I don't really need an ebid or a qid for modules, they only really need to be concerned with subjects most of the time
You need to think of your entities as resources if your entity can stand for itself give it its own top level resource. If instead your entity exists only as a part of another entity build a subresource below its parent. This should correspond with the association type aggregation and composition in your object model design.
Otherwise every entity that is part of a many relationship should also be accessible via a subresource on the other side of the relationship.
As I understood you you have a OneToMany relationship between examboard and qualification so we get:
api/examboards/{eid}/qualifications
api/qualifications/{qid}/examboard
Yo could also remove the examboard subresource and include it in the qualification response.
For ManyToMany realtionships you need two subresources:
api/foos/{fid}/bars
api/bars/{bid}/foos
And another resource to manipulate the relationship itself.
api/foosToBars/{fid}+{bid}
Or likewise.

Can EF 6 Data Annotations be different for POST than PUT or GET?

We are building a RESTful web service where there are sometimes different required fields for a POST than a PUT. For example, a field like CustomerSinceDate is allowed to be set on an insert, but not on an update. Is there is a way to set that up with Data Annotations?
EntityFramework does not (and should not) know anything about your web service. It deals only with what rules exist in the persistence layer.
What you are looking for is validation.
So in your REST service, you should check whether CustomerSinceData has been changed, and the entity is being updated. If so, you should throw an Exception with an appropriate message to the consumer.
Here is an article on writing your own DataAnnotations, if you prefer using those:
http://msdn.microsoft.com/en-us/data/jj819164#attributes
Otherwise, take a look at this article on how to write your own custom validation: http://msdn.microsoft.com/en-us/data/gg193959.aspx
(in particular, the section on IValidatableObject).
Your rule could be formulated as (pseudo code)
//if object exists in db AND CustomerSinceData has changed
DataAnnotations will get you a long way, but can be tedious to write if you are writing business logic that will never be reused anywhere else.

On observing an execution tree of interdependent models in MVC

I've developed on the Yii Framework for a while now (4 months), and so far I have encountered some issues with MVC that I want to share with experienced developers out there. I'll present these issues by listing their levels of complexity.
[Level 1] CR(create update) form. First off, we have a lot of forms. Each form itself is a model, so each has some validation rules, some attributes, and some operations to perform on the attributes. In a lot of cases, each of these forms does both updating and creating records in the db using a single active record object.
-> So at this level of complexity, a form has to
when opened,
be able to display the db-friendly data from the db in a human-friendly way
be able to display all the form fields with the attributes of the active record object. Adding, removing, altering columns from the db table has to affect the display of the form.
when saves, be able to format the human-friendly data to db-friendly data before getting the data
when validates, be able to perform basic validations enforced by the active record object, it also has to perform other validations to fulfill some business rules.
when validating fails, be able to roll back changes made to the attribute as well as changes made to the db, and present the user with their originally entered data.
[Level 2] Extended CR form. A form that can perform creation/update of records from different tables at once. Not just that, whether a form would create/update of one of its records can sometimes depend on other conditions (more business rules), so a form can sometimes update records at table A,B but not D, and sometimes update records at A,D but not B
-> So at this level of complexity, we see a form has to:
be able to satisfy [Level 1]
be able to conditionally create/update of certain records, conditionally create/update of certain columns of certain records.
[Level 3] The Tree of Models. The role of a form in an application is, in many ways, a port that let user's interact with your application. To satisfy requests, this port will interact with many other objects which, in turn, interact with many more objects. Some of these objects can be seen as models. Active Record is a model, but a Mailer can also be a model, so is a RobotArm. These models use one another to satisfy a user's request. Each model can perform their own operation and the whole tree has to be able to roll back any changes made in the case of error/failure.
Has anyone out there come across or been able to solve these problems?
I've come up with many stuffs like encapsulating model attributes in ModelAttribute objects to tackle their existence throughout tiers of client, server, and db.
I've also thought we should give the tree of models an Observer to observe and notify the observed models to rollback changes when errors occur. But what if multiple observers can exist, what if a node use its parent's observer but give its children another observers.
Engineers, developers, Rails, Yii, Zend, ASP, JavaEE, any MVC guys, please join this discussion for the sake of science.
--Update to teresko's response:---
#teresko I actually intended to incorporate the services into the execution inside a unit of work and have the Unit of work not worry about new/updated/deleted. Each object inside the unit of work will be responsible for its state and be required to implement their own commit() and rollback(). Once an error occur, the unit of work will rollback all changes from the newest registered object to the oldest registered object, since we're not only dealing with database, we can have mailers, publishers, etc. If otherwise, the tree executes successfully, we call commit() from the oldest registered object to the newest registered object. This way the mailer can save the mail and send it on commit.
Using data mapper is a great idea, but We still have to make sure columns in the db matches data mapper and domain object. Moreover, an extended CR form or a model that has its attributes depending on other models has to match their attributes in terms of validation and datatype. So maybe an attribute can be an object and shipped from model to model? An attribute can also tell if it's been modified, what validation should be performed on it, and how it can be human-friendly, application-friendly, and db-friendly. Any update to the db schema will affect this attribute, and, thereby throwing exceptions that requires developers to make changes to the system to satisfy this change.
The cause
The root of your problem is misuse of active record pattern. AR is meant for simple domain entities with only basic CRUD operations. When you start adding large amount of validation logic and relations between multiple tables, the pattern starts to break apart.
Active record, at its best, is a minor SRP violation, for the sake of simplicity. When you start piling on responsibilities, you start to incur severe penalties.
Solution(s)
Level 1:
The best option is the separate the business and storage logic. Most often it is done by using domain object and data mappers:
Domain objects (in other materials also known as business object or domain model objects) deal with validation and specific business rules and are completely unaware of, how (or even "if") data in them was stored and retrieved. They also let you have object that are not directly bound to a storage structures (like DB tables).
For example: you might have a LiveReport domain object, which represents current sales data. But it might have no specific table in DB. Instead it can be serviced by several mappers, that pool data from Memcache, SQL database and some external SOAP. And the LiveReport instance's logic is completely unrelated to storage.
Data mappers know where to put the information from domain objects, but they do not any validation or data integrity checks. Thought they can be able to handle exceptions that cone from low level storage abstractions, like violation of UNIQUE constraint.
Data mappers can also perform transaction, but, if a single transaction needs to be performed for multiple domain object, you should be looking to add Unit of Work (more about it lower).
In more advanced/complicated cases data mappers can interact and utilize DAOs and query builders. But this more for situation, when you aim to create an ORM-like functionality.
Each domain object can have multiple mappers, but each mapper should work only with specific class of domain objects (or a subclass of one, if your code adheres to LSP). You also should recognize that domain object and a collection of domain object are two separate things and should have separate mappers.
Also, each domain object can contain other domain objects, just like each data mapper can contain other mappers. But in case of mappers it is much more a matter of preference (I dislike it vehemently).
Another improvement, that could alleviate your current mess, would be to prevent application logic from leaking in the presentation layer (most often - controller). Instead you would largely benefit from using services, that contain the interaction between mappers and domain objects, thus creating a public-ish API for your model layer.
Basically, services you encapsulate complete segments of your model, that can (in real world - with minor effort and adjustments) be reused in different applications. For example: Recognition, Mailer or DocumentLibrary would all services.
Also, I think I should not, that not all services have to contain domain object and mappers. A quite good example would be the previously mentioned Mailer, which could be used either directly by controller, or (what's more likely) by another service.
Level 2:
If you stop using the active record pattern, this become quite simple problem: you need to make sure, that you save only data from those domain objects, which have actually changed since last save.
As I see it, there are two way to approach this:
Quick'n'Dirty
If something changed, just update it all ...
The way, that I prefer is to introduce a checksum variable in the domain object, which holds a hash from all the domain object's variables (of course, with the exception of checksum it self).
Each time the mapper is asked to save a domain object, it calls a method isDirty() on this domain object, which checks, if data has changed. Then mapper can act accordingly. This also, with some adjustments, can be used for object graphs (if they are not to extensive, in which case you might need to refactor anyway).
Also, if your domain object actually gets mapped to several tables (or even different forms of storage), it might be reasonable to have several checksums, for each set of variables. Since mapper are already written for specific classes of domain object, it would not strengthen the existing coupling.
For PHP you will find some code examples in this ansewer.
Note: if your implementation is using DAOs to isolate domain objects from data mappers, then the logic of checksum based verification, would be moved to the DAO.
Unit of Work
This is the "industry standard" for your problem and there is a whole chapter (11th) dealing with it in PoEAA book.
The basic idea is this, you create an instance, that acts like controller (in classical, not in MVC sense of the word) between you domain objects and data mappers.
Each time you alter or remove a domain object, you inform the Unit of Work about it. Each time you load data in a domain object, you ask Unit of Work to perform that task.
There are two ways to tell Unit of Work about the changes:
caller registration: object that performs the change also informs the Unit of Work
object registration: the changed object (usually from setter) informs the Unit of Work, that it was altered
When all the interaction with domain object has been completed, you call commit() method on the Unit of Work. It then finds the necessary mappers and store stores all the altered domain objects.
Level 3:
At this stage of complexity the only viable implementation is to use Unit of Work. It also would be responsible for initiating and committing the SQL transactions (if you are using SQL database), with the appropriate rollback clauses.
P.S.
Read the "Patterns of Enterprise Application Architecture" book. It's what you desperately need. It also would correct the misconception about MVC and MVC-inspired design patters, that you have acquired by using Rails-like frameworks.

How do I add relationships at runtime using DBIx::Class and Catalyst?

In the application I am building, users can specify relationships between tables.
Since I only determine this at runtime, I can't specify has_many or belongs_to relationships in the schema modules for startup.
So given two tables; system and place, I would like to add the relationship to join records between them.
I have part of the solution below:
$rs = $c->model('DB::system')->result_source;
$rs->add_relationship('locations','DB::place',{'foreign.fk0' => 'self.id'});
So the column fk0 would be the foreign key mapping to the location primary key id.
I know there must be a re-registration to allow future access to the relationship but I can't figure it out.
I don't believe you can re-define these relationships after an application is already running. At least not without discarding any existing DBIC objects, and re-creating them all from scratch. At that point, it would be easier to just re-start your application, I suspect.
If you're content defining these things dynamically at compile time, that is possible... we do something similar in one of our applications.
If that would be useful to you, I can provide some sample code.
The DBIx::Class::ResultSet::View module might provide a rough approximation of what you're looking for, by letting you execute arbitrary code, but retrieving the results as DBIx objects.
My general opinion on things like this, is that any abstraction layer (and an ORM is an abstraction layer), is intended to make life easier. When it gets in the way of making your application do what it wants, it's no longer making life easier, and ought to be discarded (for that specific use--not necessarily for every use). For this reason, I would suggest using DBI, as you suggested in one of your comments. I suspect it will make your life much easier in this case.
I've done this by calling the appropriate methods on the relevant result sources, e.g. $resultset->result_source-><relationship method>. It does work even in an active application.

Entity Framework and Encapsulation

I would like to experimentally apply an aspect of encapsulation that I read about once, where an entity object includes domains for its attributes, e.g. for its CostCentre property, it contains the list of valid cost centres. This way, when I open an edit form for an Extension, I only need pass the form one Extension object, where I normally access a CostCentre object when initialising the form.
This also applies where I have a list of Extensions bound to a grid (telerik RadGrid), and I handle an edit command on the grid. I want to create an edit form and pass it an Extension object, where now I pass the edit form an ExtensionID and create my object in the form.
What I'm actually asking here is for pointers to guidance on doing this this way, or the 'proper' way of achieving something similar to what I have described here.
It would depend on your data source. If you are retrieving the list of Cost Centers from a database, that would be one approach. If it's a short list of predetermined values (like Yes/No/Maybe So) then property attributes might do the trick. If it needs to be more configurable per-environment, then IoC or the Provider pattern would be the best choice.
I think your problem is similar to a custom ad-hoc search page we did on a previous project. We decorated our entity classes and properties with attributes that contained some predetermined 'pointers' to the lookup value methods, and their relationships. Then we created a single custom UI control (like your edit page described in your post) which used these attributes to generate the drop down and auto-completion text box lists by dynamically generating a LINQ expression, then executing it at run-time based on whatever the user was doing.
This was accomplished with basically three moving parts: A) the attributes on the data access objects B) the 'attribute facade' methods at the middle-tier compiling and generation dynamic LINQ expressions and C) the custom UI control that called our middle-tier service methods.
Sometimes plans like these backfire, but in our case it worked great. Decorating our objects with attributes, then creating a single path of logic gave us just enough power to do what we needed to do while minimizing the amount of code required, and completely eliminated any boilerplate. However, this approach was not very configurable. By compiling these attributes into the code, we tightly coupled our application to the datasource. On this particular project it wasn't a big deal because it was a clients internal system and it fit the project timeline. However, on a "real product" implementing the logic with the Provider pattern or using something like the Castle Projects IoC would have allowed us the same power with a great deal more configurability. The downside of this is there is more to manage, and more that can go wrong with deployments, etc.