I've been studying Spring State Machine and the State design pattern because I have to develop a Spring Boot microservice with persisted objects whose states are a confused mess that needs cleaning up, and I am looking for a clean design to get out of it. I have looked at DDD, the State pattern, and state machine concepts to see which ones to apply.
I am not sure how to implement some of these concepts, or how to connect them. I would like to understand:
- Can Spring State Machine operate on entities during transitions, or only at a global application-state level?
- To manage multiple entities, each in its own state, does the state machine have to be created as a prototype-scoped component?
- Does it integrate easily with the State pattern, or should the two not be used together?
- Should I inject the state or the state machine into the entities to manage this (feasible, I know, but I do not like the idea of using @Configurable and the AspectJ weaving configuration it requires)? I share the impression others have voiced that this could add complexity, and I would probably also need @Scope("prototype").
- Or is it possible instead to have a Domain Service delegate to a state machine on a per-entity basis (i.e. another Domain Service) to change the state of individual entities? Or is that the anemic domain model antipattern, and if so, how well do state machines integrate with DDD at all?
- Is there any example of how Spring State Machine would let me do what I want, and any indication of how lightweight it is and how much time and memory it would cost?
I got that:
- DDD wants Domain Objects that carry more behaviour than simple Data Objects, as described in this fairly complete but slightly dated article on DDD
- The State pattern encapsulates how the Context should behave in a particular State
- A state machine encapsulates the management of the transition from one State to another
- If they work together, the State should not dictate which State comes next for a given command; instead it should generate an Event for the state machine, and the state machine chooses the new State (or blocks the transition if a Guard fails)
- Somehow the new State then has to be set on the Context by the state machine
Usually the Context object delegates directly to the State object. But since the state machine dictates the change of the object's State, shouldn't the Context in this case delegate to a sort of Proxy State? And should the state machine be injected into the Entity or into that Proxy?
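For what it's worth, here is a minimal sketch of the "Domain Service delegates to a state machine per entity" option I'm considering (Java/Spring; `Order`, `OrderState`, `OrderEvent` and the service name are all hypothetical, and rehydrating the machine from the persisted state is only hinted at in a comment, since the exact calls differ between Spring State Machine versions):

```java
import org.springframework.statemachine.StateMachine;
import org.springframework.statemachine.config.StateMachineFactory;
import org.springframework.stereotype.Service;

@Service
public class OrderStateService {

    // One machine instance is obtained per entity, so no prototype-scoped beans
    // and no @Configurable entities are needed.
    private final StateMachineFactory<OrderState, OrderEvent> factory;

    public OrderStateService(StateMachineFactory<OrderState, OrderEvent> factory) {
        this.factory = factory;
    }

    public OrderState fire(Order order, OrderEvent event) {
        StateMachine<OrderState, OrderEvent> machine = factory.getStateMachine(order.getId());
        // ...rehydrate the machine to order.getState() here (via StateMachineAccessor), omitted...
        machine.start();

        boolean accepted = machine.sendEvent(event);     // guards may reject the transition
        if (accepted) {
            order.setState(machine.getState().getId());  // the machine, not the State object, picks the target
        }
        machine.stop();
        return order.getState();
    }
}
```

Whether that counts as an anemic domain model probably depends on how much behaviour is left on the entity itself, which is part of what I'm asking.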
Any thoughts, suggestions, even partial answers on some of these questions would be really appreciated.
DDD is all about protecting the business logic and making sure it is not affected by, or coupled to, infrastructure. When you look at an entity, you should directly understand what it's capable of and which part of the domain it takes care of.
To me, state machines are not part of the domain. They are a facilitator and part of the application layer, which controls what should happen in the domain. Then again, it's really hard to answer on an abstract level, since DDD is all about modeling on a very "real" level.
If you could explain with a simple example what you are trying to accomplish it would also make it easier to write a more specific answer.
Related
I'm building a WPF application using the MVVM pattern, and Caliburn.Micro is the framework of choice to speed up development.
Different from a conventional MVVM-based application, I add a business layer (BL) below the ViewModel (VM) layer to handle logic for specific business cases. The VM is left with data binding and simple conversion/presentation logic. Below the BL is an extra Data Access Layer (DAL) that encapsulates the underlying Data Model (DM) built with Entity Framework.
I'm pretty new to both WPF and MVVM and, of course, know almost nothing about Caliburn. I have read plenty of questions and answers about Caliburn usage and am now trying to use what I've learnt so far in my application.
My questions are:
Does it sound okay with the above layered architecture?
In the application bootstrapper, is it correct that we can register all services that will later be used (like the EventAggregator (EA), the WindowManager, or extra security and validation services), and also all the VMs concerned? These should be injected into VM instances via constructors or similar (assuming I'll be using SimpleContainer). So from any VM that is properly designed and instantiated, we have these services ready to be used. If I understand correctly, Caliburn and its IoC container maintain a kind of global state so that different VMs can use and share it.
Navigation: I know this topic has been discussed many times, but just to be sure I'm doing it the right way: there'd be a ShellViewModel acting as the main window for the whole application, with different VMs (or screens) loaded dynamically. Each VM can inherit from either Screen, ViewModelBase, or NotifyChangedBase. When I'm in, let's say, VM A and want to switch to VM B, I'd send a message (using the EA) from inside VM A to the ShellViewModel, saying that I want to change to B. The ShellViewModel receives the message and reloads its CurrentViewModel property. What would be a proper data structure to maintain the list of VMs to be loaded? How do things like Conductor or WindowManager come into play?
Can/should Caliburn in one way or another support access to the database (via EF), or should this access be exposed to the VM and/or BL only?
Thanks a lot!
Different from a conventional MVVM-based application, I add a business layer (BL) below the ViewModel (VM) layer
That's the standard case. ViewModels can't/shouldn't contain business logic, which is considered part of the Model (the Model in MVVM is a layer, not an object or data structure). The ViewModel is for presentation logic only.
Yes, as long as your Business (Domain) Layer has no dependency on the DAL (no reference to its assembly). Repository interfaces should be defined in the Business Layer, their implementations in the Data Access Layer.
Yes, the Bootstrapper is where you build your object graph (configuring the IoC container).
Registering ViewModels: depends on the IoC framework. Some frameworks let you resolve unregistered types as long as they are not abstract or interfaces (e.g. Unity). Not sure about Caliburn, I haven't used it. If the IoC container supports it, you don't need to register them.
One possible way to do it. I prefer navigation services though; they work better for passing parameters around to views and windows that are not yet instantiated, and you always know there is exactly one object handling the navigation.
With messages, there could be 0, 1 or many objects listening to them. Up to you.
What do you mean by supporting access to the database? You can use Caliburn's container to inject your repositories and/or services into your ViewModels; other than that, there isn't much DB-related stuff to it.
I am creating the high-level design for a new service. The complexity of the service warrants using DDD (I think). So I did the conventional thing and created domain services, aggregates, repositories, etc. My repositories encapsulate the data source. So a query can look for an object in the cache, failing that look in the db, and failing that make a REST call to an external service to fetch the required information. This is fairly standard.

Now the argument put forward by my colleagues is that abstracting the data source this way is dangerous, because the developer using the repository will not be aware of the time required to execute the API and consequently will not be able to estimate the execution time of any API he writes on top of it. Maybe he would want to set up his component's behaviour differently if he knew that his call would result in a REST call. They are suggesting I move the REST call outside of the repository, and maybe even the caching strategy along with it.

I can see their point, but the whole idea behind the repository pattern is precisely to hide this kind of information and not have each component deal with caching strategies and data access. My question is: is there a pattern or model which addresses this concern?
They are suggesting I move the REST call outside of the repository
Then you won't have a repository. The repository means we don't know the persistence details, not that we don't know there is persistence. Every time we use a repository, regardless of its implementation (from an in-memory list to a REST call), we expect 'slowness', because it's common knowledge that persistence usually is the bottleneck.
Someone who uses a certain repository implementation (like a REST-based one) will know it has to deal with latency and transient errors. A service having just an IRepository dependency still knows it deals with persistence.
About caching strategies: you can have some service-level (more generic) caching and repository-level (persistence-specific) caching. These should probably be implementation details.
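For example, repository-level caching can stay an implementation detail behind a plain decorator. A rough sketch (Java here purely for illustration; `Customer`, `CustomerId` and the interface are made-up names):

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// What the rest of the application sees: just a repository.
interface CustomerRepository {
    Optional<Customer> findById(CustomerId id);
    void save(Customer customer);
}

// Caching decorator wrapping e.g. a REST-backed implementation; callers never learn
// whether a lookup hit the cache, the database or the remote service.
final class CachingCustomerRepository implements CustomerRepository {

    private final CustomerRepository delegate;
    private final Map<CustomerId, Customer> cache = new ConcurrentHashMap<>();

    CachingCustomerRepository(CustomerRepository delegate) {
        this.delegate = delegate;
    }

    @Override
    public Optional<Customer> findById(CustomerId id) {
        Customer cached = cache.get(id);
        if (cached != null) {
            return Optional.of(cached);
        }
        Optional<Customer> loaded = delegate.findById(id);
        loaded.ifPresent(customer -> cache.put(id, customer));
        return loaded;
    }

    @Override
    public void save(Customer customer) {
        delegate.save(customer);
        cache.put(customer.getId(), customer); // keep the cache in sync after writes
    }
}
```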
Now the argument put forward by my colleagues is that abstracting the data source this way is dangerous, because the developer using the repository will not be aware of the time required to execute the API and consequently will not be able to estimate the execution time of any API he writes on top of it. Maybe he would want to set up his component's behaviour differently if he knew that his call would result in a REST call.
This is wasting time trying to complicate your life. The whole point of an abstraction is to hide the dirty details. What they suggest is basically: let's make the user aware of some implementation detail, so that the user can couple their code to it.
The point is, a developer should be aware of the API they're using. If a component is using an external service (db, web service), this should be known. Once you know there's data to be fetched, you know you'll have to wait for it.
If you go the DDD route, then you have bounded contexts (BC). Making a model dependent on another BC is a very bad idea. Each BC should publish domain events, and each interested BC should subscribe and keep its very own model based on those events. This means the queries will be 'local', but you'll still be hitting a db.
The Repository pattern aims to reduce coupling with the persistence layer. In my opinion, I wouldn't risk making a repository carry that much responsibility.
You could use an Anti-Corruption Layer to guard against changes in the external service, and a Proxy to hide the caching-related concerns.
Then, in the application layer, I would code the fallback strategy.
I think it all depends on where you think the fetching/fallback strategy belongs: in the Service layer or in the Infrastructure layer (the latter sounds more legitimate to me).
It could also be a mix of the two -- the Service is passed an ordered series of Repositories to use one after the other in case of failure. Construction of the series of Repos could be placed in the Infrastructure layer or somewhere else. Fallback logic in one place, fallback configuration in another.
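A rough sketch of that "ordered series of repositories" idea (Java, illustrative only; `Customer`, `CustomerId` and the `CustomerLookup` interface are invented, and the error handling is deliberately naive):

```java
import java.util.List;
import java.util.Optional;

// Hypothetical read-side repository contract.
interface CustomerLookup {
    Optional<Customer> findById(CustomerId id);
}

// Tries each source in order: e.g. in-memory cache first, then the database,
// then the external REST service. Callers only ever see a CustomerLookup.
final class FallbackCustomerLookup implements CustomerLookup {

    private final List<CustomerLookup> chain;

    FallbackCustomerLookup(List<CustomerLookup> chain) {
        this.chain = List.copyOf(chain);
    }

    @Override
    public Optional<Customer> findById(CustomerId id) {
        for (CustomerLookup source : chain) {
            try {
                Optional<Customer> found = source.findById(id);
                if (found.isPresent()) {
                    return found;
                }
            } catch (RuntimeException sourceUnavailable) {
                // this source failed; fall through to the next one in the chain
            }
        }
        return Optional.empty();
    }
}
```

Wiring the concrete chain (cache, db, REST) together would then live in the infrastructure/composition root, which keeps the fallback configuration separate from the fallback logic.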
As a side note, asynchrony seems like a good way to signal the users that something is potentially slow and would be blocking if you waited for it. Better than hiding everything behind a vanilla, inconspicuous Repository name and better than adding some big threatening "this could be slow" prefix to your type, IMO.
(Note: My question has very similar concerns as the person who asked this question three months ago, but it was never answered.)
I recently started working with MVC3 + Entity Framework and I keep reading that the best practice is to use the repository pattern to centralize access to the DAL. This is also accompanied with explanations that you want to keep the DAL separate from the domain and especially the view layer. But in the examples I've seen the repository is (or appears to be) simply returning DAL entities, i.e. in my case the repository would return EF entities.
So my question is, what good is the repository if it only returns DAL entities? Doesn't this add a layer of complexity that doesn't eliminate the problem of passing DAL entities around between layers? If the repository pattern creates a "single point of entry into the DAL", how is that different from the context object? If the repository provides a mechanism to retrieve and persist DAL objects, how is that different from the context object?
Also, I read in at least one place that the Unit of Work pattern centralizes repository access in order to manage the data context object(s), but I don't grok why this is important either.
I'm 98.8% sure I'm missing something here, but from my readings I didn't see it. Of course I may just not be reading the right sources... :\
I think the term "repository" is commonly thought of in the way the "repository pattern" is described by the book Patterns of Enterprise Application Architecture by Martin Fowler.
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes.
On the surface, Entity Framework accomplishes all of this, and can be used as a simple form of a repository. However, there can be more to a repository than simply a data layer abstraction.
According to the book Domain Driven Design by Eric Evans, a repository has these advantages:
They present clients with a simple model for obtaining persistent objects and managing their life cycle
They decouple application and domain design from persistence technology, multiple database strategies, or even multiple data sources
They communicate design decisions about object access
They allow easy substitution of a dummy implementation, for unit testing (typically using an in-memory collection).
The first point roughly equates to the paragraph above, and it's easy to see that Entity Framework itself easily accomplishes it.
Some would argue that EF accomplishes the second point as well. But commonly EF is used simply to turn each database table into an EF entity and pass it through to the UI. It may be abstracting the mechanism of data access, but it's hardly abstracting away the relational data structure behind the scenes.
In simpler applications that are mostly data-oriented, this might not seem an important point. But as an application's domain rules / business logic become more complex, you may want to be more object-oriented. It's not uncommon for the relational structure of the data to contain idiosyncrasies that aren't important to the business domain but are side effects of the data storage. In such cases you need to abstract not only the persistence mechanism but also the nature of the data structure itself. EF alone generally won't help you do that, but a repository layer will.
As for the third advantage, EF will do nothing (from a DDD perspective) to help. Typically DDD uses the repository not just to abstract the mechanism of data persistence, but also to provide constraints around how certain data can be accessed:
We also need no query access for persistent objects that are more convenient to find by traversal. For example, the address of a person could be requested from the Person object. And most important, any object internal to an AGGREGATE is prohibited from access except by traversal from the root.
In other words, you would not have an 'AddressRepository' just because you have an Address table in your database. If your design chooses to manage how the Address objects are accessed in this way, the PersonRepository is where you would define and enforce the design choice.
Also, a DDD repository would typically be where certain business concepts relating to sets of domain data are encapsulated. An OrderRepository may have a method called OutstandingOrdersForAccount which returns a specific subset of Orders. Or a Customer repository may contain a PreferredCustomerByPostalCode method.
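In code, those design decisions might look roughly like this (Java; `Person`, `Order` and the other types are invented names, the point being what the repositories do and do not expose):

```java
import java.util.List;
import java.util.Optional;

// There is deliberately no AddressRepository: an Address is reached only by traversal
// from its aggregate root, e.g. person.getAddress().
interface PersonRepository {
    Optional<Person> findById(PersonId id);
}

// Business-meaningful queries over the aggregate live on its repository.
interface OrderRepository {
    List<Order> outstandingOrdersForAccount(AccountId accountId);
    void add(Order order);
}
```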
Entity Framework's DataContext classes don't lend themselves well to such functionality without the added repository abstraction layer. They do work well for what DDD calls Specifications, which can be simple boolean expressions sent in to a simple method that will evaluate the data against the expression and return a match.
As for the fourth advantage, while I'm sure there are certain strategies that might let one substitute for the datacontext, wrapping it in a repository makes it dead simple.
Regarding 'Unit of Work', here's what the DDD book has to say:
Leave transaction control to the client. Although the REPOSITORY will insert into and delete from the database, it will ordinarily not commit anything. It is tempting to commit after saving, for example, but the client presumably has the context to correctly initiate and commit units of work. Transaction management will be simpler if the REPOSITORY keeps its hands off.
Entity Framework's DbContext basically resembles a Repository (and a Unit of Work as well). You don't necessarily have to abstract it away in simple scenarios.
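To make "leave transaction control to the client" concrete, here is one common shape (a Java/Spring sketch with invented names, only meant to show where the unit-of-work boundary sits): the repository adds and removes, while the application service decides when the unit of work commits.

```java
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class PlaceOrderService {

    private final OrderRepository orders;   // hypothetical repository interface

    public PlaceOrderService(OrderRepository orders) {
        this.orders = orders;
    }

    // The whole use case commits or rolls back as one unit of work;
    // the repository itself never calls commit.
    @Transactional
    public void placeOrder(Order order) {
        orders.add(order);
        // ...any other changes that belong to the same unit of work...
    }
}
```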
The main advantage of the repository is that your domain can be ignorant and independent of the persistence mechanism. In a layer based architecture, the dependencies point from the UI layer down through the domain (or usually called business logic layer) to the data access layer. This means the UI depends on the BLL, which itself depends on the DAL.
In a more modern architecture (as propagated by domain-driven design and other object-oriented approaches) the domain should have no outward-pointing dependencies. This means the UI, the persistence mechanism and everything else should depend on the domain, and not the other way around.
A repository will then be represented through its interface inside the domain but have its concrete implementation outside the domain, in the persistence module. This way the domain depends only on the abstract interface, not the concrete implementation.
That basically is object-orientation versus procedural programming on an architectural level.
See also the Ports and Adapters a.k.a. Hexagonal Architecture.
Another advantage of the repository is that you can create similar access mechanisms to various data sources. Not only to databases but to cloud-based stores, external APIs, third-party applications, etc.
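A bare-bones illustration of that dependency direction (Java, with invented names; in a real project the two types would live in separate modules):

```java
// Lives in the domain module: no reference to JPA, SQL or any other persistence technology.
interface ProductRepository {
    Product findById(ProductId id);
    void save(Product product);
}

// Lives in the persistence module, which depends on the domain -- never the other way around.
class JpaProductRepository implements ProductRepository {

    @Override
    public Product findById(ProductId id) {
        // JPA/Hibernate lookup would go here; omitted in this sketch
        throw new UnsupportedOperationException("sketch only");
    }

    @Override
    public void save(Product product) {
        // JPA/Hibernate persist would go here; omitted in this sketch
        throw new UnsupportedOperationException("sketch only");
    }
}
```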
You're right, in those simple cases the repository is just another name for a DAO, and it brings only one benefit: the fact that you can swap EF for another data access technique. Today you're using MSSQL, tomorrow you'll want cloud storage, or a micro-ORM instead of EF, or to switch from MSSQL to MySQL.
In all those cases it's good that you use a repository, as the rest of the app won't care about what storage you're using now.
There's also the limited case where you get information from multiple sources (db + file system); a repo will act as the facade, but it's still just another name for a DAO.
A 'real' repository is valid only when you're dealing with domain/business objects; for data-centric apps which won't change storage, the ORM alone is enough.
It would be useful in situations where you have multiple data sources, and want to access them using a consistent coding strategy.
For example, you may have multiple EF data models, and some data accessed using traditional ADO.NET with stored procs, and some data accessed using a 3rd party API, and some accessed from an Access database living on a Windows NT4 server sitting under a blanket of dust in your broom closet.
You may not want your business or front-end layers to care about where the data is coming from, so you build a generic repository pattern to access "data", rather than to access "Entity Framework data".
In this scenario, your actual repository implementations will be different from each other, but the code that calls them wouldn't know the difference.
Given your scenario, I would simply opt for a set of interfaces that represent what data structures (your Domain Models) need to be returned from your data layer. Your implementation can then be a mixture of EF, Raw ADO.Net or any other type of Data Store/Provider. The key strategy here is that the implementation is abstracted away from the immediate consumer - your Domain layer. This is useful when you want to unit test your domain objects and, in less common situations - change your data provider / database platform altogether.
You should, if you haven't already, consider using an IoC container, as they make loose coupling of your solution very easy by way of Dependency Injection. There are many available; personally I prefer Ninject.
The domain layer should encapsulate all of your business logic - the rules and requirements of the problem domain, and can be consumed directly by your MVC3 web application. In certain situations it makes sense to introduce a services layer that sits above the domain layer, but this is not always necessary, and can be overkill for straightforward web applications.
Another thing to consider is that even when you know you will be working with a single data store, it can still make sense to create a repository abstraction. The reason is that there might be a function your application needs that your ORM du jour either does badly (performance-wise), doesn't support at all, or that you simply don't know how to bend the ORM to.
If you are wrapping your ORM behind a well thought out repository interface, you can easily switch between different technologies as you see fit. It's not uncommon in my repositories to see some methods use EF for their work and others to use something like PetaPoco, or (gasp) ADO.net code. The repository abstraction enables you to use exactly the right tool for the job at hand without leaking these complexities into the client code.
I think there is a big misunderstanding of what many articles call "repository." And that's why there are doubts about what real value those abstractions bring.
In my opinion the repository in its pure form is IEnumerable, while you and many articles are talking about a "data access service."
I've blogged about it here.
If you were to have a REST layer on top of your DDD app for CRUD, would you let the REST layer spit out the domain model (in terms of data), say for a GET?
Generally, you'd want to be able to change your domain objects (for instance when you learn something new about the domain), without having to change a public interface/API to your system. Same thing the other way around: if a change is required to a public interface, you don't want to have to change your domain model.
So from this perspective I'd never expose my domain objects as-is over a public interface. Instead I'd create data transfer objects (DTO) that are part of the public interface. This way, changes to my domain and public api can change independently.
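A tiny sketch of that separation (Java; `Order`, `OrderDto` and the mapping are invented names): the GET endpoint returns the DTO, never the domain object itself, so either side can change independently.

```java
import java.math.BigDecimal;

// Public contract of the API: flat, stable, and free of domain behaviour.
public record OrderDto(String orderId, String status, BigDecimal total) {

    // Mapping from the rich domain object to the transfer object.
    static OrderDto from(Order order) {
        return new OrderDto(
                order.getId().value(),
                order.getStatus().name(),
                order.getTotal());
    }
}

// In a Spring MVC style controller the GET handler would then look something like:
//
//   @GetMapping("/orders/{id}")
//   public OrderDto get(@PathVariable String id) {
//       return OrderDto.from(orders.findById(new OrderId(id)));
//   }
```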
You should not expose the DDD model. This is absolutely correct, because a SOA frontend should not expose implementation details to clients. Your users should depend on a business function, not an implementation detail… But this assumes a nice design of several, maybe heterogeneous, applications united into a SOA bus.
I would like to add to the answer, because the mention of a CRUD interface makes me think that this could be a case of SOA abuse, where SOA principles are used to glue together the layers of an application instead of a network of applications. SOA is meant as a way for the enterprise to connect its systems; it is not a way to implement MVC! So simple, yet so misunderstood. For example, just because your front-end GUI uses services to access the backend, you do not have a "SOA application"… whatever that means.
If this is a case of SOA used to glue layers, please revise your design and use an appropriate architecture for that level of abstraction. Otherwise you will misinterpret the recommendations found here about not exposing the DDD model and not using CRUD, and you will surely end up creating a separate domain model for the services interface, which you will then have to map to the DDD model, which is so complicated that you will need Dozer and the like to map the same thing with different names, and so forth, until you end up with a bloated, unmaintainable mess…
.. just be careful.
-Alex
Redzedi is so right that we need a clarification....
Like everything, this is much more complicated to do than to say. Serializing a complex domain model can be so difficult that you end up either not putting any logic in the domain, i.e. the anemic domain model antipattern (http://martinfowler.com/bliki/AnemicDomainModel.html), or having a separate anemic model for persistence, i.e. DTOs.
I don't know which is worse, but both options are bad. You should put the logic that belongs in the model in the model, and you should be able to serialize it directly everywhere.
In my experience of using the domain model for many years, I believe the best approach is a point in the middle. Yes, as Fowler and Evans state, business objects should carry logic, but not all of it (http://codebetter.com/gregyoung/2009/07/15/the-anemic-domain-model-pattern/); a little anemia with a nice service layer is best.
For example, an invoice should know about its items and have a procedure to calculate its total, which depends on the items. But an invoice's item does not need to know about invoicing. So what happens when an item's cost changes: should the item hold a pointer back to the parent invoice as a circular reference and call the invoice's calculate-total procedure?
I believe not. I think that's a task for the service layer, which should receive the event first and then orchestrate the procedure, without having to couple all the business objects together for implementation purposes and violating the business interaction rules, which is what a domain model is for.
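Roughly what that orchestration could look like (plain Java, all names illustrative): the item stays ignorant of the invoice, the service receives the event, and the calculation itself still lives on the invoice.

```java
public class InvoicePricingService {

    private final InvoiceRepository invoices;   // hypothetical repository

    public InvoicePricingService(InvoiceRepository invoices) {
        this.invoices = invoices;
    }

    // Reacts to the cost change instead of the item holding a back-pointer to its invoice.
    public void onItemCostChanged(ItemCostChanged event) {
        Invoice invoice = invoices.findByItemId(event.itemId());
        invoice.recalculateTotal();             // the domain logic stays on the invoice
        invoices.save(invoice);
    }
}
```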
-Alex
I have an architectural question about EF and WCF.
We are developing a three-tier application using Entity Framework (with an Oracle database), and a GUI based on WPF. The GUI communicates with the server through WCF.
Our data model is quite complex (more than a hundred tables), with lots of relations. We are currently using the default EF code generation template, and we are having a lot of trouble with tracking the state of our entities.
The user interfaces on the client are also fairly complex; sometimes an object graph of more than 50 objects is sent down to a single user interface, with several layers of aggregation between the entities. It is an important goal to be able to easily decide in the BLL which objects have been modified on the client and which have been newly created.
What would be the cleanest approach to managing entities and entity states between the two layers? Self-tracking entities (STEs)? What are the most common pitfalls in this scenario?
Could those who have used STEs in a real production environment share their experiences?
STEs are supposed to solve this scenario, but they are not a silver bullet. I have never used them in a real project (I don't like them), but I spent some time playing with them. The main pitfalls I found are:
- Coupling your data layer with your client application: you must share the entity assembly between the projects (it also means it is a .NET-only solution, but that should not be a problem in your case)
- Large data transfers: you pass 50 entities to the client, the client changes a single entity, and you pass 50 entities back. It will take some fighting with STEs to avoid passing unnecessary data
- Unnecessary updates to the database: normally, when EF works with attached entities, it tracks changes at the property level, but STEs track changes at the entity level. So if the user modifies a single property in an entity with 100 properties, the generated update will set all of them. Avoiding this requires modifying the template and adding property-level change tracking
- The client application should use STEs directly (binding STEs to the UI) to get the most out of their self-tracking ability. Otherwise you will have to write code that moves data from the UI back into the self-tracking entity and sets its state
- They are not proxied, so they don't support lazy loading (in the case of a WCF service that is good behaviour)
I described a way to solve this without STEs today. There is also a related question about change tracking over web services (check @Richard's answer and the links provided).
We have developed a layered application with STEs: a user interface layer with ASP.NET and Model-View-Presenter, a business layer, a WCF service layer, and the data layer with Entity Framework.
When I first read about STEs, the documentation said that they are easier than using custom DTOs. They were supposed to be the 'quick and easy way', and only on really big projects should you use hand-written DTOs.
But we ran into a lot of problems using STEs. One of the main problems is that if your entities come from multiple service calls (for example in a master-detail view), and thus from different contexts, you will run into problems when composing the graphs on the server and trying to save them. So our server functions still have to check manually which data has changed and then recompose the object graph on the server. A lot has been written about this topic, but it's still not easy to fix.
Another problem we ran into was that the STEs wouldn't work without WCF. The change tracking is activated when the entities are serialized. We had originally designed an architecture where WCF could be disabled and the service calls would just be in-process (this was a requirement for our unit tests, which would run a lot faster without WCF and be easier to set up). It turned out that STEs are not the right choice for this.
I've also noticed that developers sometimes included a lot of data in their query and just sent it to the client, instead of really thinking about which data they needed.
After this project we decided that in the next project we would use custom DTOs mapped with AutoMapper from server to client, and use the POCO template in our data layer.
So since you already state that your project is big, I would opt for custom DTOs and service functions that are specifically created for one goal, instead of 'Update(Person person)' functions that send a lot of data.
Hope this helps :)