OData / WCF Data Services / EDM - Mapping to disparate data - entity-framework

I'm researching OData as a RESTful interface for a database. The data is structured in a very unusual way and normal tables and rows do not apply, in fact, some stuff just exists in in-memory collections and objects.
Can I build my own arbitrary mapping system between the entities that make up 'feeds' and the sources behind, this might mean aggregating from sources and building the entities on the fly?
I'm just looking for yes/no (why not) and maybe some pointers to relevant reading material.
Many thanks
Luke

Yes and no.
You can build an OData feed of anything. In a WCF Data Service implementation of same, you can implement IDataServiceMetadataProvider.
However, the easiest way to define an EF data service is:
public class MyOData : DataService<MyObjectContext>
...and that won't work if you need to return non-entity objects. Such services are limited to entities and simple types only.
So yes, you can do it, but it's quite a bit more work than the one-liner above!

Related

Exposed domain model in Java microservice architecture

I'm aware that copying entity classes and properties into DTOs is considered anti-pattern, so by Exposed domain model pattern the same #Entity can be used as both database entity class, and DTO for service and MVC layer. (see here https://codereview.stackexchange.com/questions/93511/data-transfer-objects-vs-entities-in-java-rest-server-application)
But suppose we have microservice architecture where the same set of properties is used as entity in one project with persistence, and as DTO in another project which uses the first one as a service. What's the proposed pattern in such a situation?
Because the second project doesn't need #Entity related functionality, and if we put that class in shared library, it will be tied unnecessary to JPA specific APIs and libraries. And the alternative is to again use separate DTO classes anti-pattern.
When your requirements for a DTO model exactly match your entity model you are either in a very early stage of the project or very lucky that you just have a simple model. If your model is very simple, then DTOs won't give you many immediate benefits.
At some point, the requirements for the DTO model and the entity model will diverge though. Imagine you add some audit aspects, statistics or denormalization to your entity/persistence model. That kind of data is usually never exposed via DTOs directly, so you will need to split the models. It is also often the case that the main driver for DTOs is the fact that you don't need all the data all the time. If you display objects in e.g. a dropdown you only need a label and the object id, so why would you load the whole entity state for such a use case?
The fact that you have annotations on your DTO models shouldn't bother you that much, what is the alternative? An XML-like mapping? Manual object wiring?
If your model is used by third parties directly, you could use a subclassing i.e. keep the main model free of annotations and have annotated subclasses in your project that extend the main model.
Since implementing a DTO approach correctly, I created Blaze-Persistence Entity Views which will not only simplify the way you define DTOs, but it will also improve the performance of your queries.
If you are interested, I even have an example for an external model that uses entity view subclasses to keep the main model clean.
Thank you for the answers, but emphasize in the question is on microservice (MS) architecture and reusing defined entity POJOs from one MS in another as POJOs. From what I've read on microservices it's closely related to another question - should MSs share any common functionality and classes at all, or be completely independent? It seems there is no definite agreement on it, and also no definite answer, or widely accepted pattern, to this.
From my recent experience here is what I adopted, and it works well so far.
Have common functionality across MSs - yes, in form of a commons project added as dependency to all MSs, with its dependencies set as optional. Share entity classes (expose them in commons) - no.
The main reason is that entity classes are closely related to data store for particular MS. And as the established rule is that MSs shouldn't share data stores, then it makes sense not to share entity classes for those data stores. It helps MSs to be more independent, and freedom to manage their data store in their own way. It means some more typing to add additional DTO classes and conversion between them, but it's a trade-off worth taking to retain MS independence. Reasons Christian Beikov and Maksim Gumerov mentioned apply as well.
What we do share (put in commons) are some common functionality and helper classes (for cloud, discovery, error handling, rest and json configuration...), and pure DTOs, where T is transfer between MSs (rest entities or message payloads).

Persistence ignorance and DDD reality

I'm trying to implement fully valid persistence ignorance with little effort. I have many questions though:
The simplest option
It's really straightforward - is it okay to have Entities annotated with Spring Data annotations just like in SOA (but make them really do the logic)? What are the consequences other than having to use persistance annotation in the Entities, which doesn't really follow PI principle? I mean is it really the case with Spring Data - it provides nice repositories which do what repositories in DDD should do. The problem is with Entities themself then...
The harder option
In order to make an Entity unaware of where the data it operates on came from it is natural to inject that data as an interface through constructor. Another advantage is that we always could perform lazy loading - which we have by default in Neo4j graph database for instance. The drawback is that Aggregates (which compose of Entities) will be totally aware of all data even if they don't use them - possibly it could led to debugging difficulties as data is totally exposed (DAO's would be hierarchical just like Aggregates). This would also force us to use some adapters for the repositories as they doesn't store real Entities anymore... And any translation is ugly... Another thing is that we cannot instantiate an Entity without such DAO - though there could be in-memory implementations in domain... again, more layers. Some say that injecting DAOs does break PI too.
The hardest option
The Entity could be wrapped around a lazy-loader which decides where data should come from. It could be both in-memory and in-database, and it could handle any operations which need transactions and so on. Complex layer though, but might be generic to some extent perhaps...? Have a read about it here
Do you know any other solution? Or maybe I'm missing something in mentioned ones. Please share your thoughts!
I achieve persistence ignorance (almost) for free, as a side effect of proper domain modeling.
In particular:
if you correctly define each context's boundary, you will obtain small entities without any need for lazy loading (that, actually becomes an antipattern/code smell in a DDD project)
if you can't simply use SQL into your repository, map a set of DTO to your db schema, and use them into factories to initialize entity classes.
In DDD projects, persistence ignorance is relevant for the domain model itself, not for repositories, factories and other applicative code. Indeed you are very unlikely to change the ORM and/or the DB in the future.
The only (but very strong) rational behind persistence ignorance of the domain model is separation of concerns: in the domain model you should express business invariants only! Persistence is an infrastructural concern!
For example without persistence ignorance (and with lazy loading) the domain model should handle possible exceptions from the db, it's complexity grows and business rules are buried under technological details.
Personally I find it near impossible to achieve a clean domain model when trying to use the same entities as the ORM.
My solution is to model my domain entities as I see fit and ensure that any ORM entities don't leak outside of the repositories. This means that my repositories accept and return domain entities.
This means you lose "most of your ORM goodness" and end up "using your ORM for simple CRUD operations".
Both of these trade-offs are fine for me, I would rather have a clean domain model that I can use, rather than one polluted with artefacts from my DB or ORM. It also cuts down the amount of time I spend "wrestling with my ORM" to zero.
As a side-note, I find document databases a much better fit for DDD.
Once you will provide persistence mapping in you domain model:
your code depends on framework. If you decided to change this framework, you want to change persistence layer and model layer source code - more work, more changes, more merging of code etc.
your domain model jar file depends on spring/nhibernate jars etc.
your classes become larger and larger how business code and persistence related code grows
I've to admit that I dont understand harder and hardest option.
We used separated interfaces and implementations for domain entities. Provide separated mapping files using Hibernate along with repositories.
Entities are created using factory (or repository later), identifier is generated within persistence layer, entity does not need it until it's being persisted.
Lazy loading is provided by special implementation of List once:
mapping of an entity contains it
entity/aggregate is fetched from persistence layer
The only issue is related to transaction as when you use lazy-loaded collection out of transaction scope, it fails.
I would follow the simplest option unless I ran into a stone wall. There are also pitfalls such as this when you adopt pi principle.
Somtimes some compromises are acceptable.
public class Order {
private String status;//my orm does not support enum
public Status status() {
return Status.of(this.status);
}
public is(Status status) {
return status() == status;//use status() instead of getStatus() in domain model
}
}

Replacements to hand-rolled ADO.NET POCO mapping?

I have written a wrapper around ADO.NET's DbProviderFactory that I use extensively throughout my applications. I also have written a lot of code that maps IDataReader rows to POCOs. However, as I have tons of classes the whole thing is getting to be a pain in the ass to maintain.
I have been looking at replacing the whole she-bang with a micro-orm like Petapoco. I have a few queries though:
I have lots of POCOs that contain other POCOs in them as properties. How well does the Petapoco support this?
Should I use a ORM like Massive or Simple.Data that returns a dynamic object and map that to a POCO?
Are there any approaches I can take to the whole mapping of rows to POCOs? I can't really use convention-based tools as my database isn't particularly consistent in how it is designed.
How about using a text templating/code generator to build out a lightweight persistence layer? I have a battle-hardened open source project called TextMetal to generate the necessary persistence layer based on tried and true architectural decisions. The only lacking thing is object to object relations but it does support query expressions and works well with poorly designed data schemas.
You can see a real world project that uses the above tool call Can Do It For.
Feel free to ask me about any design decisions once you take a look-sse.
Simple.Data automagically casts its dynamic type to static types. It will map nested properties as long as they have been eager-loaded using the .With method. So for example
Customer customer = db.Customer.WithOrders().Get(42);
would populate the Orders property of the customer object.
Could you use QueryFirst, or modify it? It takes your sql and wraps it in vanilla ADO code, generated at design time. You get fresh POCOs from your result schema every time you save your file. Additionally, you can choose to test all queries and regenerate all wrappers via the option in the tools menu. It's dependent on Sql Server and SqlClient, so unless you do some modification, you'll lose DbProviderFactory.

How do I use entity framework with hierarchical data?

I'm working with a large hierarchical data set in sql server - modelled using the standard "EntityID, ParentID" kind of approach. There are about 25,000 nodes in the whole tree.
I often need to access subtrees of the tree, and then access related data that hangs off the nodes of the subtree. I built a data access layer a few years ago based on table-valued functions, using recursive queries to fetch an arbitrary subtree, given the root node of the subtree.
I'm thinking of using Entity Framework, but I can't see how to query hierarchical data like
this. AFAIK there is no recursive querying in Linq, and I can't expose a TVF in my entity data model.
Is the only solution to keep using stored procs? Has anyone else solved this?
Clarification: By 25,000 nodes in the tree I'm referring to the size of the hierarchical dataset, not to anything to do with objects or the Entity Framework.
It may the best to use a pattern called "Nested Set", which allows you to get an arbitrary subtree within one query. This is especially useful if the nodes aren't manipulated very often: Managing hierarchical data in MySQL.
In a perfect world the entity framework would provide possibilities to save and query data using this data pattern.
Everything IS possible with Entity Framework but you have to hack and slash your way in to it. The database I am currently working against has too many "holder tables" since Points for instance is shared with both teams and users. Both users and teams can also have a blog.
When you say 25 000 nodes do you mean navigational properties? If so I think it could be tricky to get the data access in place. It's not hard to navigate, search etc with entity framework but I tend to model on paper then create the database based on how I want to navigate while using entity framework. Sounds like you don't have that option.
Thanks for these suggestions.
I'm beginning to realise that the answer is to remodel the data in the database - either along the lines of nested sets as Georg suggests, or maybe a transitive closure table, which I've just come across.
That way, I'm hoping to get two key benefits:
a) faster querying aginst arbitrary subtrees
b) a data model which no longer requires recursive querying - so perhaps bringing it within easy reach of the Entity Framework!
It's always amazing how so often the right answer to a difficult problem is not to answer it, but to do something else instead!

Linq to SQL, Entity Framework, Repository Pattern, and Dependency Injection

Stephan Walters video on MVC and Models is a very good and light discussion of the various topics listed in this questions title. The one question listed in the notes unanswered was:
If you create an Interface / Repository pattern for Linq2SQL, does Linq2SQLs classes still cause a dependency on Linq, even though you pass the classes as toList?
It is probably an easy answer YES, however, what standard mechanic would you use to represent the data?
Lets say you have a Product entity that is made up of three tables (Prices, Text, and Photos) (you could have sets of price for different regions, different text for localization, and different photos). (Sounds like a builder pattern) Would you create a slice of these tables grabbing the right prices, text, and photos in to a single List? Since Lists may be proprietary, would you use a Dictionary object?
I thank you for your answers. I am very interested in the "standard and proper" way to do it rather than 101 possibilities.
Another quick question: is Entity Framework ready for a complicated database yet? There are a lot of constructs that Linq2SQL likes that EF does not. EF seems to require identity fields as primary keys (HAHA), but it seems like every demo does this. I want to use EF, but I constantly fail to make it work, falling back to Linq2SQL.
If you keep the L2S on the other side of the Repository facade (remember, that's all a Repository is - a facade) then you decouple the rest of your application from L2S. This means that the job of the code behind your repository is to turn the L2S into "domain" objects, custom classes, and then the Repository returns those.
In this sense, the Repository is returning fully formed "Product" objects with all their related Price, Text, and Photo data. This is called an Aggregate Root.
There shouldn't be a problem with Lists, since they are CLR objects.
As far as EF for advanced scenarios, my advice would be not yet, for the reasons you note.
The standard mechanism I'd use to represent the data is a Data Transfer Object. I would never return a LINQ to SQL or Entity Framework object across a service boundary, and I would hesitate to return it across a layer boundary of any kind. This is because these objects will serialize implementation-dependant data.