DTOs: best practices

DTOs: best practices - dto

I am considering to use DTOs instead of passing around my domain objects. I have read several posts here as well as elsewhere, and i understand there are several approaches to getting this done.
If i only have about 10 domain classes in all, and considering that i want to use DTOs rather than domain objects for consumption in my Views (WPF front ends), what is the recommended approach.
I think using tools like automapper etc maybe an overkill for my situation. So i am thinking of writing my custom mapper class that will have methods for converting a domain type to a DTO type.
What is the best way to do this, are there any sample to get me started to do this?
Second question: When writing those methods that will create DTOs, how do i deal with setting up all the data, especially when the domain type has references to other domain objects? Do i write equivalent properties in the DTO for mapping to those refernece types in the domain class?
Please ask if i have not put my second question in proper words. But i think you understand what i am trying to ask.
Thrid question: When writing DTOs, should i write multiple DTOs, each containing partial data for a given domain model, so that each of it can be used to cater to a specific View's requirement, or should the DTO have all the data that are there in the corresponding model class.

I've been reading a few posts here regarding DTO's and it seems to me that a lot of people equate them to what I would consider a ViewModel. A DTO is just that, Data Transfer Object - it's what gets passed down the wire. So I've got a website and services, only the services will have access to real domain/entity objects, and return DTO's. These may map 1:1, but consider that the DTO's may be populated from another service call, a database query, reading a config - whatever.
After that, the website then can take those DTO and either add them to a ViewModel, or convert into one. That ViewModel may contain many different types of DTO's. A simple example would be a task manager - the ViewModel contains both the task object you are editing, as well as a group of Dto.User objects that the task can be assigned to.
Keep in mind that the services returning DTO's maybe used by both a website, and maybe a tablet or phone application. These applications would have different views to take advantage of their displays and so the ViewModels would differ, but the DTO's would remain the same.
At any rate, I love these types of discussions, so anyone please let me know what you think.
Matt

I'm kind of using DTOs in a project. I tend to make the DTOs only to show the data I need for an specified view. I fetch all the data shown in the view in my data access class. For example, I may have an Order object which references a Client object:
public class Client{
public int Id{get;set;}
public string Name{get;set;}
}
public class Order{
public int OrderID{get;set;}
public Client client{get;set;}
public double Total{get;set;}
public IEnumerable<OrderLine> lines {get;set;}
}
Then in my OrderListDTO I may have something like:
public class OrderListDTO{
public int OrderId{get;set;}
public string ClientName{get;set;}
...
}
Which are the fields I want to show in my view. I fetch all these fields in my Database access code so I don't have to bother with entity asociations in my view or controller code.

Best Way to develop DTOs
The way to start developing DTOs is to understand that their sole purpose is to transfer subset of data of your business entities to different clients(could be UI, or an external service). Given this understanding you could create seperate packages for each client...and write your DTO classes. For mapping you could write your own mapper defining interfaces to be passed to a factory creating DTO objects based on which data from the entity for which the DTO is being created would be extracted. You could also define annotations to be placed on your entity fields but personally given the number of annotations used I would prefer the interface way. The main thing to note about DTOs is that they are also classes and data among the DTOs should be reused, in other words while it may seem tempting to create DTOs for each use case try to reuse existing DTOs to minimize this.
Getting started
Regarding getting started as stated above the sole purpose of the DTO is to give the client the data it needs....so you keeping in mind you could just set data into the dto using setters...or define a factory which creates a DTO from an Entity based on an interface.....
Regarding your third question, do as is required by your client :)

I come to project with spring-jdbc and there are used DAO layer. Some times existing entities doesn't cover all possible data from DB. So I start using DTO.
By applying '70 structure programming rule I put all DTOs into separate package:
package com.evil.dao; // DAO interfaces for IOC.
package com.evil.dao.impl; // DAO implementation classes.
package com.evil.dao.dto; // DTOs
Now I rethink and decide to put all DTO as inner classes on DAO interfaces for result-sets which have no reuse. So DAO interface look like:
interface StatisticDao {
class StatisticDto {
int count;
double amount;
String type;
public static void extract(ResultSet rs, StatisticDto dto) { ... }
}
List<StatisticDto> getStatistic(Criteria criteria);
}
class StatisticDaoImpl implements StatisticDao {
List<StatisticDto> getStatistic(Criteria criteria) {
...
RowCallbackHandler callback = new RowCallbackHandler() {
#Override
public void processRow(ResultSet rs) throws SQLException {
StatisticDao.StatisticDto.extract(rs, dto);
// make action on dto
}
}
namedTemplate.query(query, queryParams, callback);
}
}
I think that holding related data together (custom DTO with DAO interface) make code better for PageUp/PageDown.

Question 1: If the DTO's you need to transfer are just a simple subset of your domain object, you can use a modelmapper to avoid filling your codebase with logic-less mapping. But if you need to apply some logic/conversion to your mapping then do it yourself.
Question 2: You can and probably should create a DTO for each domain object you have on your main DTO. A DTO can have multiple DTO's inside of it, one for each domain object you need to map. And to map those you could do it yourself or even use some modelmapper.
Question 3: Don't expose all your domain if your view does not require it to. Also you don't need to create a DTO for each view, try to create DTO's that expose what need to be exposed and may be reused to avoid having multiples DTO's that share a lot of information. But it mainly depend's on your application needs.
If you need clarification, just ask.

I'm going to assume that your domain model objects have a primary key ID that may correspond to the ID's from the database or store they came from.
If the above is true, then your DTO will overcome type referecning to other DTO's like your domain objects do, in the form of a foreign key ID. So an OrderLine.OrderHeader relationship on the domain object, will be OrderLine.OrderHeaderId cin the DTO.
Hope that helps.
Can I ask why you have chosen to use DTO's instead of your rich domain objects in the view?

We all know what Dtos are (probably).
But the important thing is to overuse DTOs or not.
Transfering data using Dtos between "local" services is a good practice but have a huge overhead on your developer team.
There is some facts:
Clients should not see or interact with Entities (Daos). So you
always need Dtos for transferig data to/from remote (out of the process).
Using Dtos to pass data between services is optional. If you don't plan to split up your project to microservices there is no need to do that. It will be just an overhead for you.
And this is my comment: If you plan to distribute your project to
microservices in long future. or don't plan to do that, then
DON'T OVERUSE DTOs
You need to read this article https://martinfowler.com/bliki/LocalDTO.html

Related

DDD, JPA and non-default constructors

DDD says that you domain models needs to be expressive and say what they need. With that in mind I created the following class:
#Entity
public class User{
#Id
#GeneratedValue
private Long id;
private String userName;
private String password;
public User(String userName, String password) {
/**set the values*/
}
/** Validations, getters and business methods*/
With this constructor I tell that the minimun needed to build a new User object.
The problem with that approach is that JPA needs a non-argument default constructor to instatiate the class by reflections. In this case do I have to forget about that DDD sugestion or is there a workaround?

In your specific situation you can make the non-argument constructor private. But soon another JPA requirement will force you to make another workaround;
voiceofunreason gave you the right answer. The right way is to have an adapter in the infrastructure/persistence layer that receives your domain object and then convert it to a JPA anemic model and vice versa.
Some other options that fits better without too much conversions:
1 - Spring Data JDBC is trying to absorb Aggregates concept
https://www.youtube.com/watch?v=GOSW911Ox6s&t=3s;
https://docs.spring.io/spring-data/jdbc/docs/1.1.7.RELEASE/reference/html/#reference
2 - Persist your aggregate as a json in a relational database. Vaughn Vernon proposed an implementation -> https://kalele.io/the-ideal-domain-driven-design-aggregate-store/
3 - Document oriented database such as https://www.mongodb.com/ among others...

In this case do I have to forget about that DDD sugestion or is there a workaround?
There isn't really a good answer here. There's a tension at work when we try to mix objects with mutable, encapsulated state, and persistence (ie: storing the authoritative copy of state in some anemic durable store).
The "right" answer is that we don't use an O/RM to load domain entities from the durable store; the O/RM loads a DTO, and we copy information from the DTO into the domain object. Similarly, to save, you extract information from the domain entity and copy it into the DTO, then save the DTO.
In other words, the O/RM loads an in memory representation of an anemic data structure.
In practice, it's a lot more common to couple the entity state to the O/RM, implementing within the domain entity whatever interfaces the O/RM needs to do its work.
The trick is to make sure that you can decouple your implementation from the O/RM when the O/RM assumptions start to get in the way. For instance, you should be able to easily change a specific aggregate from using a relational data store to using a document data store without requiring a rewrite of your entire system.

Entity Framework and DDD - Load required related data before passing entity to business layer

Let's say you have a domain object:
class ArgumentEntity
{
public int Id { get; set; }
public List<AnotherEntity> AnotherEntities { get; set; }
}
And you have ASP.NET Web API controller to deal with it:
[HttpPost("{id}")]
public IActionResult DoSomethingWithArgumentEntity(int id)
{
ArgumentEntity entity = this.Repository.GetById(id);
this.DomainService.DoDomething(entity);
...
}
It receives entity identifier, load entity by id and execute some business logic on it with domain service.
The problem:
The problem here is with related data. ArgumentEntity has AnotherEntities collection that will be loaded by EF only if you explicitly ask to do so via Include/Load methods.
DomainService is a part of business layer and should know nothing about persistence, related data and other EF concepts.
DoDomething service method expects to receive ArgumentEntity instance with loaded AnotherEntities collection.
You would say - it's easy, just Include required data in Repository.GetById and load whole object with related collection.
Now lets come back from simplified example to reality of the large application:
ArgumentEntity is much more complex. It contains multiple related collections and that related entities have their related data too.
You have multiple methods of DomainService. Each method requires different combinations of related data to be loaded.
I could imagine possible solutions, but all of them are far from ideal:
Always load the whole entity -> but it is inefficient and often impossible.
Add several repository methods: GetByIdOnlyHeader, GetByIdWithAnotherEntities, GetByIdFullData to load specific data subsets in controller -> but controller become aware of which data to load and pass to each service method.
Add several repository methods: GetByIdOnlyHeader, GetByIdWithAnotherEntities, GetByIdFullData to load specific data subsets in each service method -> it is inefficient, sql query for each service method call. What if you call 10 service methods for one controller action?
Each domain method call repository method to load additional required data ( e.g: EnsureAnotherEntitiesLoaded) -> it is ugly because my business logic become aware of EF concept of related data.
The question:
How would you solve the problem of loading required related data for the entity before passing it to business layer?

In your example I can see method DoSomethingWithArgumentEntity which obviously belongs to Application Layer. This method has call to Repository which belongs to Data Access Layer. I think this situation does not conform to classic Layered Architecture - you should not call DAL directly from Application Layer.
So your code can be rewritten in another manner:
[HttpPost("{id}")]
public IActionResult DoSomethingWithArgumentEntity(int id)
{
this.DomainService.DoDomething(id);
...
}
In DomainService implementation you can read from repo whatever it needs for this specific operation. This avoids your troubles in Application Layer. In Business Layer you will have more freedom to implement reading: with serveral repository methods reads half-full entity, or with EnsureXXX methods, or something else. Knowledge about what you need to read for operation will be placed into operation's code and you don't need this knowledge in app-layer any more.
Every time situation like this emerged it is a strong signal about your entity is not preperly designed. As krzys said the entity has not cohesive parts. In other words if you often need parts of an entity separately you should split this entity.

Nice question :)
I would argue that "related data" in itself is not a strict EF concept. Related data is a valid concept with NHibernate, with Dapper, or even if you use files for storage.
I agree with the other points mostly, though. So here's what I usually do: I have one repository method, in your case GetById, which has two parameters: the id and a params Expression<Func<T,object>>[]. And then, inside the repository I do the includes. This way you don't have any dependency on EF in your business logic (the expressions can be parsed manually for another type of data storage framework if necessary), and each BLL method can decide for themselves what related data they actually need.
public async Task<ArgumentEntity> GetByIdAsync(int id, params Expression<Func<ArgumentEntity,object>>[] includes)
{
var baseQuery = ctx.ArgumentEntities; // ctx is a reference to your context
foreach (var inlcude in inlcudes)
{
baseQuery = baseQuery.Include(include);
}
return await baseQuery.SingleAsync(a=>a.Id==id);
}

Speaking in context of DDD, It seems that you had missed some modeling aspects in your project that led you to this issue. The Entity you wrote about looked not to be highly cohesive. If different related data is needed for different processes (service methods) it seems like you didn't find proper Aggregates yet. Consider splitting your Entity into several Aggregates with high cohesion. Then all processes correlated with particular Aggregate will need all or most of all data that this Aggregate contains.
So I don't know the answer for your question, but if you can afford to make few steps back and refactor your model, I believe you will not encounter such problems.

Refactoring application: Direct database access -> access through REST

we have a huge database application, which must get refactored (there are so many reasons for this. biggest one: security).
What we already have:
MySQL Database
JPA2 (Eclipselink) classes for over 100 tables
Client application that accesses the database directly
What needs to be there:
REST interface
Login/Logout with roles via database
What I've done so far:
Set up Spring MVC 3.2.1 with Spring Security 3.1.1
Using a custom UserDetailsService (contains just static data for testing atm)
Created a few Controllers for testing (simply receiving/providing data)
Design Problems:
We have maaaaany #OneToMany and #ManyToMany relations in our database
1.: (important)
If I'd send the whole object tree with all child objects as a response, I could probably send the whole database at once.
So I need a way to request for example 'all Articles'. But it should omit all the child objects. I've tried this yesterday and the objects I received were tons of megabytes:
#PersistenceContext
private EntityManager em;
#RequestMapping(method=RequestMethod.GET)
public #ResponseBody List<Article> index() {
List<Article> a = em.createQuery("SELECT a FROM Article a", Article.class).getResultList();
return a;
}
2.: (important)
If the client receives an Article, at the moment we can simply call article.getAuthor() and JPA will do a SELECT a FROM Author a JOIN Article ar WHERE ar.author_id = ?.
With REST we could make a request to /authors/{id}. But: This way we can't use our old JPA models on the client side, because the model contains Author author and not Long author_id.
Do we have to rewrite every model or is there a simpler approach?
3.: (less important)
Authentication: Make it stateless or not? I've never worked with stateless auth so far, but Spring seems to have some kind of support for it. When I look at some sample implementations on the web I have security concerns: With every request they send username and password. This can't be the right way.
If someone knows a nice solution for that, please tell me. Else I'd just go with standard HTTP Sessions.
4.:
What's the best way to design the client side model?
public class Book {
int id;
List<Author> authors; //option1
List<Integer> authorIds; //option2
Map<Integer, Author> idAuthorMap; //option3
}
(This is a Book which has multiple authors). All three options have different pros and cons:
I could directly access the corresponding Author model, but if I request a Book model via REST, I maybe don't want the model now, but later. So option 2 would be better:
I could request a Book model directly via REST. And use the authorIds to afterwards fetch the corresponding author(s). But now I can't simply use myBook.getAuthors().
This is a mixture of 1. and 2.: If I just request the Books with only the Author ids included, I could do something like: idAuthorMap.put(authorId, null).
But maybe there's a Java library that handles all the stuff for me?!
That's it for now. Thank you guys :)
The maybe solution(s):
Problem: Select only the data I need. This means more or less to ignore every #ManyToMany, #OneToMany, #ManyToOne relations.
Solution: Use #JsonIgnore and/or #JsonIgnoreProperties.
Problem: Every ignored relation should get fetched easily without modifying the data model.
Solution: Example models:
class Book {
int bId;
Author author; // has #ManyToOne
}
class Author {
int aId;
List<Book> books; // has #OneToMany
}
Now I can fetch a book via REST: GET /books/4 and the result will look like that ('cause I ignore all relations via #JsonIgnore): {"bId":4}
Then I have to create another route to receive the related author: GET /books/4/author. Will return: {"aId":6}.
Backwards: GET /authors/6/books -> [{"bId":4},{"bId":42}].
There will be a route for every #ManyToMany, #OneToMany, #ManyToOne, but nothing more. So this will not exist: GET /authors/6/books/42. The client should use GET /books/42.

First, you will want to control how the JPA layer handles your relationships. What I mean is using Lazy Loading vs. Eager loading. This can easily be controller via the "fetch" option on the annotation like thus:
#OneToMany(fetch=FetchType.Lazy)
What this tells JPA is that, for this related object, only load it when some code requests it. Behind the scenes, what is happening is that a dynamic "proxy" object is being made/created. When you try to access this proxy, it's smart enough to go out and do another SQL to gather that needed bit. In the case of Collection, its even smart enough to grab the underlying objects in batches are you iterate over the items in the Collection. But, be warned: access to these proxies has to happen all within the same general Session. The underlying ORM framework (don't know how Eclipselink works...I am a Hybernate user) will not know how to associate the sub-requests with the proper domain object. This has a bigger effect when you use transportation frameworks like Flex BlazeDS, which tries to marshal objects using bytecode instead of the interface, and usually gets tripped up when it sees these proxy objects.
You may also want to set your cascade policy, which can be done via the "cascade" option like
#OneToMany(cascade=CascadeType.ALL)
Or you can give it a list like:
#OneToMany(cascade={CascadeType.MERGE, CascadeType.REMOVE})
Once you control what is getting pulled from your database, then you need to look at how you are marshalling your domain objects. Are you sending this via JSON, XML, a mixture depending on the request? What frameworks are you using (Jackson, FlexJSON, XStream, something else)? The problem is, even if you set the fetch type to Lazy, these frameworks will still go after the related objects, thus negating all the work you did telling it to lazily load. This is where things get more specific to the mashalling/serializing scheme: you will need to figure out how to tell your framework what to marshal and what not to marshal. Again, this will be highly dependent on whatever framework is in use.

Is it really necessary to implement a Repository when using Entity Framework?

I'm studying MVC and EF at the moment and I see quite often in the articles on those subjects that controller manipulates data via a Repository object instead of directly using LINQ to Entities.
I created a small web application to test some ideas. It declares the following interface as an abstraction of the data storage
public interface ITinyShopDataService
{
IQueryable<Category> Categories { get; }
IQueryable<Product> Products { get; }
}
Then I have a class that inherits from the generated class TinyShopDataContext (inherited from ObjectContext) and implements ITinyShopDataService.
public class TinyShopDataService : TinyShopDataContext, ITinyShopDataService
{
public new IQueryable<Product> Products
{
get { return base.Products; }
}
public new IQueryable<Category> Categories
{
get { return base.Categories; }
}
}
Finally, a controller uses an implementation of ITinyShopDataService to get the data, e.g. for displaying products of a specific category.
public class HomeController : Controller
{
private ITinyShopDataService _dataService;
public HomeController(ITinyShopDataService dataService)
{
_dataService = dataService;
}
public ViewResult ProductList(int categoryId)
{
var category = _dataService.Categories.First(c => c.Id == categoryId);
var products = category.Products.ToList();
return View(products);
}
}
I think the controller above possesses some positive properties.
It is easily testable because of data service being injected.
It uses LINQ statements that query an abstract data storage in an implementation-neutral way.
So, I don't see any benefit of adding a Repository between the controller and the ObjectContext-derived class. What do I miss?
(Sorry, it was a bit too long for the first post)

You can test your controllers just fine with what you propose.
But how do you test your queries inside the data service? Imagine you have very complicated query behavior, with multiple, complex queries to return a result (check security, then query a web service, then query the EF, etc.).
You could put that in the controller, but that's clearly wrong.
It should go in your service layer, of course. You might mock/fake out the service layer to test the controller, but now you need to test the service layer.
It would be nice if you could do this without connecting to a DB (or web service). And that's where Repository comes in.
The data service can use a "dumb" Repository as a source of data on which it can perform complex queries. You can then test the data service by replacing the repository with a fake implementation which uses a List<T> as its data store.
The only thing which makes this confusing is the large amount of examples that show controllers talking directly to repositories. (S#arp Architecture, I'm looking at you.) If you're looking at those examples, and you already have a service layer, then I agree that it makes it difficult to see what the benefit of the repository is. But if you just consider the testing of the service layer itself, and don't plan on having controllers talking directly to the repository, then it starts to make more sense.

Ayende of the NHibernate project and other fame agrees with you.
Some Domain Driven Design people would suggest that you should have a service (not same as web service) layer that is called from your controller, but DDD is usually applied to complex domains. In simple applications, what you're doing is fine and testable.

What you're missing is the ability to update, create and delete objects. If that's not an issue, than the interface is probably good enough as it is (although I would let it derive from IDisposable to make sure you can always dispose the context (and test for this))

The repository pattern has nothing to do with Entity Framework, although Entity Framework implements a "repository" pattern.
The repository interface (lets say IRepositoryProducts) lives in the domain layer. It understands domain objects, or even Entities if you don't want to use domain driven design.
But its implementation (lets say RepositoryProducts) is the actual repository pattern implementation and does not live in the domain layer but in the persistance layer.
This implementation can use Entity or any ORM ..or not to persist the information in the database.
So the answer is: It's not really necessary, but it's recommended to use Repository pattern despite using Entity ORM as a good practice to keep separation between domain layer and persistance layer. Because that's the purpose of the Repository Pattern: separation of concerns between the domain logic and the way the information is persisted. When using the repository pattern from the domain level you just abstract yourself and think "I want to store this information, I don't care how it's done", or "I want a list of these things, I don't care where do you get them from or how do you retrieve them. Just give 'em to me".
Entity Framework has nothing to do with domain layer, it's only a nice way to persist objects, but should live in the repository implementation (persistance layer)

Should i use partial classes as business layer when using entity framework?

I am working on a project using entity framework. Is it okay to use partial classes of the EF generated classes as the business layer. I am begining to think that this is how EF is intended to be used.
I have attempted to use a DTO pattern and soon realized that i am just creating a bunch of mapping classes that is duplicating my effort and also a cause for more maintenance work and an additional layer.
I want to use self-tracking-entities and pass the EF entities to all the layers. Please share your thoughts and ideas. Thanks

I had a look at using partial classes and found that exposing the database model up towards the UI layer would be restrictive.
For a few reasons:
The entity model created includes a deep relational object model which, depending on your schema, would get exposed to the UI layer (say the presenter of MVP or the ViewModel in MVVM).
The Business logic layer typically exposes operations that you can code against. If you see a save method on the BLL and look at the parameters needed to do the save and see a model that require the construction of other entities (cause of the relational nature the entity model) just to do the save, it is not keeping the operation simple.
If you have a bunch of web services then the extra data will need to be sent across for no apparent gain.
You can create more immutable DTO's for your operations parameters rather than encountering side effects cause the same instance was modified in some other part of the application.
If you do TDD and follow YAGNI then you will tend to have a structure specifically designed for the operation you are writing, which would be easier to construct tests against (not requiring to create other objects not realated to the test just because they are on the model). In this case you might have...
public class Order
{ ...
public Guid CustomerID { get; set; }
... }
Instead of using the Entity model generated by the EF which have references exposed...
public class Order
{ ...
public Customer Customer { get; set; }
... }
This way the id of the customer is only needed for an operation that takes an order. Why would you need to construct a Customer (and potentially other objects as well) for an operation that is concerned with taking orders?
If you are worried about the duplication and mapping, then have a look at Automapper

I would not do that, for the following reasons:
You loose the clear distinction between the data layer and the business layer
It makes the business layer more difficult to test
However, if you have some data model specific code, place that is a partial class to avoid it being lost when you regenerate the model.

I think partial class will be a good idea. If the model is regenerated then you will not loose the business logic in the partial classes.
As an alternative you can also look into EF4 Code only so that you don't need to generate your model from the database.

I would use partial classes. There is no such thing as data layer in DDD-ish code. There is a data tier and it resides on SQL Server. The application code should only contain business layer and some mappings which allow persisting business objects in the mentioned data tier.
Entity Framework is you data access code so you shouldn't built your own. In most cases the database schema would be modified because the model have changed, not the opposite.
That being said, I would discourage you to share your entities in all the layers. I value separation of UI and domain layer. I would use DTO to transfer data in and out of the domain. If I have the necessary freedom, I would even use CQRS pattern to get rid of mapping entities to DTO -- I would simply create a second EF data access project meant only for reading data for the UI. It would be built on top of the same database. You read data through read (anemic -- without business logic) model, but you modify it by issuing commands that are executed against real model implemented using EF and partial methods.
Does this answer your question?

I wouldn't do that. Try too keep the layers independent as possible. So a tiny change in your database schema will not affect all your layers.
Entities can be used for data layer but they should not.
If at all, provide interfaces to be used and let your entities implement them (on the partial file) the BL should not know the entities but the interfaces.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse