I've developed on the Yii Framework for a while now (4 months), and so far I have encountered some issues with MVC that I want to share with experienced developers out there. I'll present these issues by listing their levels of complexity.
[Level 1] CR(create update) form. First off, we have a lot of forms. Each form itself is a model, so each has some validation rules, some attributes, and some operations to perform on the attributes. In a lot of cases, each of these forms does both updating and creating records in the db using a single active record object.
-> So at this level of complexity, a form has to
when opened,
be able to display the db-friendly data from the db in a human-friendly way
be able to display all the form fields with the attributes of the active record object. Adding, removing, altering columns from the db table has to affect the display of the form.
when saves, be able to format the human-friendly data to db-friendly data before getting the data
when validates, be able to perform basic validations enforced by the active record object, it also has to perform other validations to fulfill some business rules.
when validating fails, be able to roll back changes made to the attribute as well as changes made to the db, and present the user with their originally entered data.
[Level 2] Extended CR form. A form that can perform creation/update of records from different tables at once. Not just that, whether a form would create/update of one of its records can sometimes depend on other conditions (more business rules), so a form can sometimes update records at table A,B but not D, and sometimes update records at A,D but not B
-> So at this level of complexity, we see a form has to:
be able to satisfy [Level 1]
be able to conditionally create/update of certain records, conditionally create/update of certain columns of certain records.
[Level 3] The Tree of Models. The role of a form in an application is, in many ways, a port that let user's interact with your application. To satisfy requests, this port will interact with many other objects which, in turn, interact with many more objects. Some of these objects can be seen as models. Active Record is a model, but a Mailer can also be a model, so is a RobotArm. These models use one another to satisfy a user's request. Each model can perform their own operation and the whole tree has to be able to roll back any changes made in the case of error/failure.
Has anyone out there come across or been able to solve these problems?
I've come up with many stuffs like encapsulating model attributes in ModelAttribute objects to tackle their existence throughout tiers of client, server, and db.
I've also thought we should give the tree of models an Observer to observe and notify the observed models to rollback changes when errors occur. But what if multiple observers can exist, what if a node use its parent's observer but give its children another observers.
Engineers, developers, Rails, Yii, Zend, ASP, JavaEE, any MVC guys, please join this discussion for the sake of science.
--Update to teresko's response:---
#teresko I actually intended to incorporate the services into the execution inside a unit of work and have the Unit of work not worry about new/updated/deleted. Each object inside the unit of work will be responsible for its state and be required to implement their own commit() and rollback(). Once an error occur, the unit of work will rollback all changes from the newest registered object to the oldest registered object, since we're not only dealing with database, we can have mailers, publishers, etc. If otherwise, the tree executes successfully, we call commit() from the oldest registered object to the newest registered object. This way the mailer can save the mail and send it on commit.
Using data mapper is a great idea, but We still have to make sure columns in the db matches data mapper and domain object. Moreover, an extended CR form or a model that has its attributes depending on other models has to match their attributes in terms of validation and datatype. So maybe an attribute can be an object and shipped from model to model? An attribute can also tell if it's been modified, what validation should be performed on it, and how it can be human-friendly, application-friendly, and db-friendly. Any update to the db schema will affect this attribute, and, thereby throwing exceptions that requires developers to make changes to the system to satisfy this change.
The cause
The root of your problem is misuse of active record pattern. AR is meant for simple domain entities with only basic CRUD operations. When you start adding large amount of validation logic and relations between multiple tables, the pattern starts to break apart.
Active record, at its best, is a minor SRP violation, for the sake of simplicity. When you start piling on responsibilities, you start to incur severe penalties.
Solution(s)
Level 1:
The best option is the separate the business and storage logic. Most often it is done by using domain object and data mappers:
Domain objects (in other materials also known as business object or domain model objects) deal with validation and specific business rules and are completely unaware of, how (or even "if") data in them was stored and retrieved. They also let you have object that are not directly bound to a storage structures (like DB tables).
For example: you might have a LiveReport domain object, which represents current sales data. But it might have no specific table in DB. Instead it can be serviced by several mappers, that pool data from Memcache, SQL database and some external SOAP. And the LiveReport instance's logic is completely unrelated to storage.
Data mappers know where to put the information from domain objects, but they do not any validation or data integrity checks. Thought they can be able to handle exceptions that cone from low level storage abstractions, like violation of UNIQUE constraint.
Data mappers can also perform transaction, but, if a single transaction needs to be performed for multiple domain object, you should be looking to add Unit of Work (more about it lower).
In more advanced/complicated cases data mappers can interact and utilize DAOs and query builders. But this more for situation, when you aim to create an ORM-like functionality.
Each domain object can have multiple mappers, but each mapper should work only with specific class of domain objects (or a subclass of one, if your code adheres to LSP). You also should recognize that domain object and a collection of domain object are two separate things and should have separate mappers.
Also, each domain object can contain other domain objects, just like each data mapper can contain other mappers. But in case of mappers it is much more a matter of preference (I dislike it vehemently).
Another improvement, that could alleviate your current mess, would be to prevent application logic from leaking in the presentation layer (most often - controller). Instead you would largely benefit from using services, that contain the interaction between mappers and domain objects, thus creating a public-ish API for your model layer.
Basically, services you encapsulate complete segments of your model, that can (in real world - with minor effort and adjustments) be reused in different applications. For example: Recognition, Mailer or DocumentLibrary would all services.
Also, I think I should not, that not all services have to contain domain object and mappers. A quite good example would be the previously mentioned Mailer, which could be used either directly by controller, or (what's more likely) by another service.
Level 2:
If you stop using the active record pattern, this become quite simple problem: you need to make sure, that you save only data from those domain objects, which have actually changed since last save.
As I see it, there are two way to approach this:
Quick'n'Dirty
If something changed, just update it all ...
The way, that I prefer is to introduce a checksum variable in the domain object, which holds a hash from all the domain object's variables (of course, with the exception of checksum it self).
Each time the mapper is asked to save a domain object, it calls a method isDirty() on this domain object, which checks, if data has changed. Then mapper can act accordingly. This also, with some adjustments, can be used for object graphs (if they are not to extensive, in which case you might need to refactor anyway).
Also, if your domain object actually gets mapped to several tables (or even different forms of storage), it might be reasonable to have several checksums, for each set of variables. Since mapper are already written for specific classes of domain object, it would not strengthen the existing coupling.
For PHP you will find some code examples in this ansewer.
Note: if your implementation is using DAOs to isolate domain objects from data mappers, then the logic of checksum based verification, would be moved to the DAO.
Unit of Work
This is the "industry standard" for your problem and there is a whole chapter (11th) dealing with it in PoEAA book.
The basic idea is this, you create an instance, that acts like controller (in classical, not in MVC sense of the word) between you domain objects and data mappers.
Each time you alter or remove a domain object, you inform the Unit of Work about it. Each time you load data in a domain object, you ask Unit of Work to perform that task.
There are two ways to tell Unit of Work about the changes:
caller registration: object that performs the change also informs the Unit of Work
object registration: the changed object (usually from setter) informs the Unit of Work, that it was altered
When all the interaction with domain object has been completed, you call commit() method on the Unit of Work. It then finds the necessary mappers and store stores all the altered domain objects.
Level 3:
At this stage of complexity the only viable implementation is to use Unit of Work. It also would be responsible for initiating and committing the SQL transactions (if you are using SQL database), with the appropriate rollback clauses.
P.S.
Read the "Patterns of Enterprise Application Architecture" book. It's what you desperately need. It also would correct the misconception about MVC and MVC-inspired design patters, that you have acquired by using Rails-like frameworks.
Consider the following scenario
We have a simple database that involves two entities: user and category.
For our hypothesis let's say that a user can have only a type of category and a category can be associated with n users.
Now, consider a web page where a user - say ROLE_ADMINISTRATOR - could edit user table and associate them to a different one category.
As far I know (and I'm still new to symfony in general) if I use Doctrine and symfony2 in tandem, with - let's say - annotation method, i'll have two entity (php classes).
Embedded form
I will create a form that will show the user and, for show - and persist, of course! - also his category I "choose" to follow the "embedded form" strategy.
Having said that the entity has been created, i'll have to create a form for category (suppose that into formBuilder I'll add only id attribute of the category).
After that I have to add to formBuilder of UserType class the previous form and, with "some kind of magic" the form will render (after the appropriate operations) as magic and just as magically, when i'll post it (and bind, and so on) back all the informations will be persists onto database
Data Transformers
A.K.A. trasnform an input of a form into an object and vice versa.
In that way I'll have to define a - let's say - CategorySelectorType that into his builder, will add a Class (service ?) that will do those transformations.
Now we define the data transformer itself that will implement the DataTransofmerInterface (with his method and so on...)
The next step will be register that entity into services and add into UserType the form that will use this service.
So I don't understand any "strong" difference between those two metodology but "reusability" of the service. Somebody can offer me a different point of view and explain me the differences, if any?
A data transformer does not replace a embedded form, it rather enhances forms and wraps data transformation nicely.
The first sentence on the cookbook page about Data Transformers sums it up nicely:
You'll often find the need to transform the data the user entered in a
form into something else for use in your program.
In your example above, you could add a drop-down list of categories so the admin can select one for the given user. This would be done using a embedded form. As the category field is the id of an existing category, there is no need to transform the data.
For some reason you now want the admin to be able to enter a free text of the category. Now you need to do some tranformation of the text into the object in question. Maybe you want him to be able to either add a new or select a current category with this text field. Both is possible by using a data transformer which takes the text and searches for the category. Depending on your needs, not existing categories can be created and returned.
Another use case is when a user enters data which needs to be modified in some way before storing them. Let's say the user enters a street, house number and city but you want to store the coordinates instead.
In both cases it doesn't matter if you embed the form into another!
Could you do that in your controller? Of course. Is it a good idea to do such things in the controller? Probably not, as you have a hard time testing (while doing it in the transformer let you unit test the transformation nicely) or reusing it.
UPDATE
Of course it is possible to place the transformation code someplace else. The user object itself is not a good place, as the model should not know about the entity manager, which is needed to do the transformation.
The user type would be possible, but this means that it gets tied to the entity manager.
It all adds up to the very powerfull concept of Separation of concerns, which states that one class should only do one thing to make it maintainable, resuable, testable and so on. If you follow this concept, than it should be clear that data transformation is a thing on its own and should be threated as such. If you don't care, you may not need the transformation functionality.
I'm using Entity Framework and a Model class, DonationForm, wrapped by a view model class "CreateDonationForm".
In keeping with the DRY principle, I have added the validation annotations on the Model class (not the just the view model), so that they will be reusable. However, not all of the properties on the class will always be used (some are in fact mutually exclusive). For example when a particular phone number property is relevant, I want to make it conform to a Regex annotation and be Required. But in a different situation I want to be able to get away with submitting (and persiting to the database) a null value.
Following this post: How do I use IValidatableObject?
I made the model class implement IValidatableObject and implemented custom validation to selectiely remove validation errors from the ValidationResult object (when the field was part of a group of fields that were not relevant given the user's other choice(s) on the form). It worked and I was able to get back List of ValidationResults where those errors had been expunged.
However, when I called SaveChanges() I get validation errors which prevent the save. The validation is still happening at the database/model class level. (The database was generated from the Model class using EF 4.1 Code First.)
How is it possible to acheive conditional formatting rules and still use Annotations on the Model class? This is essentially saying - apply these validation rules IF otherwise do not apply these validation rules. Please coach me here. I'm new-ish to MVC and I'm trying to do things the proper way. It seems like putting the validation on the view model and then mapping values onto the underlying model class might work; however, it doesn't feel right. I see big value in having the validation attributes on the Model class itself and lots of wasted, repetivitive work in placing the same annotations on both the create and update view models. It feels like I'm fighting MVC Fw here on something that ought to be easier. Any insights that you can provide would be greatly appreciated.
There is a lot of cool principles but sometimes small violation makes your life easier. Get rid of data annotations from your EF model and place them on your view model where they belong. You can still use IValidatableObject in view model and compose the validation from multiple reusable helper methods used by multiple view models (so you can still achieve DRY principle).
If you stubborn and really want to have validation in EF model turn off validation in EF and handle it in upper layer as you already do:
dbContext.Configuration.ValidateOnSaveEnabled = false;
EF level validation is for simple scenarios where your validation rules do not change among operations.
I am using automapper to map my models to viewmodel classes to pass to my view.
My question really is where should the validation go? I was planning on using the MetaData decorations - a feature of mvc 2.
But either in the model or viewmodel? Or in both places?
Validation should be done minimum at the View Model because this is what you receive as action argument and contains the user input. You could also have validation at the model.
My answer would be ViewModel because Model can change (for example from using Linq2SQL to EF). This way when you plug another Model you still have your validation intact.
I personally have my validation 2 places using DataAnnotations. My model is not passed up to my view in full. I have separate models for my views and translate the data from the view model into the model. This way, I can put whatever I want in my view model and leave out the pieces I don't want to deal with.
My reasoning, however, is that I have a windows application and an web application using the same model. This way, the same set of validation rules govern the Model for all apps, and my view model can have slightly different rules if need be. Of course, this creates a "duplication of logic" - well, validation logic.
This way I don't have to rebuild the data that wasn't used on the page every trip back to the server or put it in hidden fields and inflate the size of my pages.
You should put validation that is specific to the UI in the ViewModel, and anything that is related to business process or database validation in the Model. These might overlap.
The model should implement the validation it needs to ensure that its state cannot become invalid; that validation most definitely belongs on the model.
for example, a book class must guarantee that its title must be between 1 and 50 characters, its id must be >= 0 etc.
business rules belong elsewhere (in your controllers if you only have the model view and controller layers). this might be something like a user cannot add more than 3 books if their email isnt verified.
validation in the view should be restricted to parsing user input for invalid data: anti xss, sql injection, out of range. etc
I would like to experimentally apply an aspect of encapsulation that I read about once, where an entity object includes domains for its attributes, e.g. for its CostCentre property, it contains the list of valid cost centres. This way, when I open an edit form for an Extension, I only need pass the form one Extension object, where I normally access a CostCentre object when initialising the form.
This also applies where I have a list of Extensions bound to a grid (telerik RadGrid), and I handle an edit command on the grid. I want to create an edit form and pass it an Extension object, where now I pass the edit form an ExtensionID and create my object in the form.
What I'm actually asking here is for pointers to guidance on doing this this way, or the 'proper' way of achieving something similar to what I have described here.
It would depend on your data source. If you are retrieving the list of Cost Centers from a database, that would be one approach. If it's a short list of predetermined values (like Yes/No/Maybe So) then property attributes might do the trick. If it needs to be more configurable per-environment, then IoC or the Provider pattern would be the best choice.
I think your problem is similar to a custom ad-hoc search page we did on a previous project. We decorated our entity classes and properties with attributes that contained some predetermined 'pointers' to the lookup value methods, and their relationships. Then we created a single custom UI control (like your edit page described in your post) which used these attributes to generate the drop down and auto-completion text box lists by dynamically generating a LINQ expression, then executing it at run-time based on whatever the user was doing.
This was accomplished with basically three moving parts: A) the attributes on the data access objects B) the 'attribute facade' methods at the middle-tier compiling and generation dynamic LINQ expressions and C) the custom UI control that called our middle-tier service methods.
Sometimes plans like these backfire, but in our case it worked great. Decorating our objects with attributes, then creating a single path of logic gave us just enough power to do what we needed to do while minimizing the amount of code required, and completely eliminated any boilerplate. However, this approach was not very configurable. By compiling these attributes into the code, we tightly coupled our application to the datasource. On this particular project it wasn't a big deal because it was a clients internal system and it fit the project timeline. However, on a "real product" implementing the logic with the Provider pattern or using something like the Castle Projects IoC would have allowed us the same power with a great deal more configurability. The downside of this is there is more to manage, and more that can go wrong with deployments, etc.