Entity Framework: optimize Include, provide hints to Include

Is it possible to provide hints to Include LINQ queries so that the joins to the child tables can be optimized?
I have a very generic data model, so there are multiple referential constraints between tables. I am working with a legacy system, so changing that would be very difficult.
I have a query like the following, which generates very complicated SQL.
var links = A.B.CreateSourceQuery()
.Include("B1")
.Include("B1.C1")
.Include("B1.D1")
.ToArray();
Is there a way to provide hints to the above query on how to join the respective child entities, so that the generated SQL is more efficient and the data can still be eager loaded?

See my post at http://www.thinqlinq.com/Post.aspx/Title/LINQ-to-Database-Performance-hints and in particular the point about breaking up complex queries. When you're fetching two sets of grandchildren, your performance may well suffer. You may want to consider a custom projection instead of multiple includes.
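To make the "breaking up complex queries" point concrete, here is a minimal sketch in Python with the standard-library sqlite3 module rather than Entity Framework. The table names b1, c1 and d1 echo the Include paths in the question, but the schema, columns and row counts are assumptions made up for illustration. It shows why joining two sibling child collections in one statement can hurt: every c1 row gets paired with every d1 row.

```python
import sqlite3

# Hypothetical schema: one parent (b1) with two independent child tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE b1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE c1 (id INTEGER PRIMARY KEY, b1_id INTEGER, label TEXT);
    CREATE TABLE d1 (id INTEGER PRIMARY KEY, b1_id INTEGER, label TEXT);
""")
conn.execute("INSERT INTO b1 (id, name) VALUES (1, 'parent')")
conn.executemany("INSERT INTO c1 (b1_id, label) VALUES (1, ?)",
                 [(f"c{i}",) for i in range(20)])
conn.executemany("INSERT INTO d1 (b1_id, label) VALUES (1, ?)",
                 [(f"d{i}",) for i in range(30)])

# One wide query joining both child tables: 20 x 30 combinations come back.
wide_rows = conn.execute("""
    SELECT b1.id, c1.label, d1.label
    FROM b1
    LEFT JOIN c1 ON c1.b1_id = b1.id
    LEFT JOIN d1 ON d1.b1_id = b1.id
""").fetchall()

# Broken-up version: one narrow query per table, stitched together in code.
parents = conn.execute("SELECT id, name FROM b1").fetchall()
c_rows = conn.execute("SELECT b1_id, label FROM c1").fetchall()
d_rows = conn.execute("SELECT b1_id, label FROM d1").fetchall()
split_total = len(parents) + len(c_rows) + len(d_rows)

print(len(wide_rows))  # 600
print(split_total)     # 51
```

The split version costs three round trips instead of one, so whether it wins depends on how large the child sets are; it is worth measuring against the real data.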

Related

What is the benefit of adding .HasIndex() in your mappings, on a DBFirst scenario?

I have been searching the EF Core documentation to find out whether adding .HasIndex() to your entity mappings brings any benefit in a DbFirst scenario, and I couldn't find anything.
I have a 20-year-old DB that already has all the necessary tables and indexes, and I am mapping some tables to query them using EF Core. What could be the benefit of mapping the indexes in a DbFirst scenario where you would never update the table schema via code? Does it affect the way EF generates the SQL queries?
None. HasIndex would only apply to creating indexes for code-first/migrations. You don't need to map indexes for EF to generate or optimize the query.
After introducing EF to a project, I do recommend recording and reporting on the most commonly executed queries to determine whether new indexes or adjustments to existing indexes (e.g. included columns) might benefit your application's performance.
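The reason the mapping does not matter is that index selection happens inside the database engine, not in the client that generates the SQL. As a rough illustration, here is a sketch using Python's standard-library sqlite3 as a stand-in for the legacy database; the orders table and the idx_orders_customer index are hypothetical names. The client only sends a plain SELECT and knows nothing about the index, yet the plan still uses it.

```python
import sqlite3

# The index exists only in the database; the client code never declares it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    CREATE INDEX idx_orders_customer ON orders (customer_id);
""")

# A query as an ORM might generate it: no hint, no mention of the index.
sql = "SELECT id, total FROM orders WHERE customer_id = ?"

# Ask the engine how it would run the query; the last column is the plan detail.
plan = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
plan_text = " ".join(row[-1] for row in plan)
uses_index = "idx_orders_customer" in plan_text

print(plan_text)
print(uses_index)
```

The same kind of check (an execution plan for the most common queries) is a reasonable way to find the missing or under-used indexes mentioned above.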

Denormalization vs Parent Referencing vs MapReduce

I have a highly normalized data model with me. Currently I'm using manual referencing by storing the _id and running sequential queries to fetch details from the deepest collection.
The referencing is one-way and the flow has around 5-6 collections. For one particular use case, I'm having to query down to the deepest collection by querying subsequent "_id" from the higher level collections. So technically I'm hitting the database every time I run a
db.collection_name.find({_id: ****}).
My prime goal is to optimize the read without hugely affecting the atomicity of the other collections. I have read about de-normalization and it does not make sense to me because I want to keep an option for changing the cardinality down the line and hence want to maintain a separate collection altogether.
I was initially thinking of using MapReduce to do an aggregation from the back and have a collection primarily for the particular use-case. But well even that does not sound that good.
In a relational db, I would be breaking the query in sub-queries and performing a join to get the data sets that intersect from the initial results. Since mongodb does not support joins, I'm having a tough time figuring anything out.
Please help if you have faced anything like this earlier or have any idea how to resolve it.
Denormalize your data.
MongoDB does not do SQL-style JOINs.
Neither find() nor MapReduce can read from more than one collection in a single operation. The one exception is the aggregation pipeline's $lookup stage (available since MongoDB 3.2), which behaves like a left outer join; apart from that, when you need to piece your data together from more than one collection, you have to do it in the application layer. For that reason you should organize your data so that any common, performance-relevant query can be resolved by querying just a single collection.
In order to do that you might have to create redundancies and transitive dependencies. This is normal in MongoDB.
If this feels "dirty" to you, you should either accept that your performance will be sub-optimal or use a different kind of database, like a classic relational database or a graph database.
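A small in-memory sketch of the trade-off, in plain Python with no MongoDB driver involved. The collection names, fields and the find_one helper are all hypothetical; find_one merely stands in for a find({_id: ...}) call and counts one round trip per call.

```python
# Hypothetical three-level reference chain: order -> customer -> region.
collections = {
    "orders": {1: {"_id": 1, "customer_id": 10, "item": "book"}},
    "customers": {10: {"_id": 10, "region_id": 100, "name": "Ada"}},
    "regions": {100: {"_id": 100, "name": "EMEA"}},
}
stats = {"round_trips": 0}


def find_one(collection, _id):
    """Stand-in for a lookup by _id; each call counts as one database round trip."""
    stats["round_trips"] += 1
    return collections[collection][_id]


# Normalized read: follow the references one collection at a time.
order = find_one("orders", 1)
customer = find_one("customers", order["customer_id"])
region = find_one("regions", customer["region_id"])
normalized_trips = stats["round_trips"]

# Denormalized read model: the fields this use case needs are copied into one document.
collections["order_views"] = {
    1: {"_id": 1, "item": "book", "customer_name": "Ada", "region_name": "EMEA"}
}
stats["round_trips"] = 0
view = find_one("order_views", 1)
denormalized_trips = stats["round_trips"]

print(normalized_trips)    # 3
print(denormalized_trips)  # 1
```

The price is the redundancy mentioned above: when a customer or region is renamed, every copied value in order_views has to be updated as well.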

How to do multi-table aggregates using Spring Data repositories?

What's the best approach for doing multi-table aggregates, or non-aggregate multi-table results, in my Spring Data repositories? I don't care about mapping back to entities; I just need a list of objects returned that I can massage into a JSON response.
If you don't care about entities, repositories are not the tool for the job. Repositories are defined to simulate collections of aggregates (which are special kinds of entities usually).
So to answer the question from your headline (which surprisingly seems to be the opposite of what you're asking in the description): just do it. Define your entity types including the relations that form the aggregate, create a repository for them and query it, define query methods etc.
If you don't care about types (which is perfectly fine, too), have a look at jOOQ which is centered around SQL to efficiently query relational databases, but wrapped into a nice API.

Port From Entity Framework to MongoDB

I'm planning to port from Entity Framework 4.0 to MongoDB. What are the best practices that can minimize the impact? The project has social networking functionality and therefore maintains a complex relational database, so performance is a concern if we stay on a relational database.
We have used a domain layer (using POCOs), the repository pattern and DTO mapping in the project.
Also, what are the advantages and disadvantages of this decision? And how would it affect my domain layer implementation?
If you want to 'minimize impact' you'll want to create a database in MongoDB that mirrors the one you have in SQL. Since there are no joins in the database you'll need to do multiple reads to complete your query. In itself that's not too bad because MongoDB is really fast, but obviously it has other issues (concurrency, etc.).
If, however, you want to move over fully to the NoSQL way of doing things, you'll likely not be able to 'minimize impact'. You'll need to make substantial changes to the way you store content, the way you access it and the way you update it.
Storage: You'll likely create documents in your database that are denormalized and much closer to 'ViewModels' than 'Models'. You might for example store a count of child records in a parent record so that you can display it without having to load them or count them.
Access: You might end up using Map-Reduce for some queries to your database which is a very different mind-set from a traditional query.
Updates: In all likelihood your approach to updating will be different in order to take advantage of the many fine-grained MongoDB update features like $inc. Instead of posting back some large view model and then applying it to your model and then updating the database you might instead provide a much finer-grained Ajax call back that updates a single value. Take a look at CQRS for more ideas on how to think about models for updates vs queries.
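As a rough sketch of the Storage and Updates points together: the parent document carries a stored child count, and a new comment sends only a small update document instead of the whole view model. The update document below uses MongoDB's real $inc operator shape, but apply_update is a hypothetical in-memory stand-in that handles only $inc and $set; it is not a driver call.

```python
def apply_update(doc, update):
    """Apply a tiny subset of update operators ($inc, $set) to a dict in place."""
    for field, amount in update.get("$inc", {}).items():
        # A missing field is treated as 0, so $inc creates it with the given amount.
        doc[field] = doc.get(field, 0) + amount
    for field, value in update.get("$set", {}).items():
        doc[field] = value
    return doc


# Parent document closer to a view model: it stores the comment count directly.
post = {"_id": 1, "title": "Hello", "comment_count": 2}

# Fine-grained update sent when a comment is added: only the counter changes.
new_comment_update = {"$inc": {"comment_count": 1}}
apply_update(post, new_comment_update)

print(post["comment_count"])  # 3
print(post["title"])          # Hello
```

Because the update names only the field it changes, two users commenting at the same time do not overwrite each other's copy of the rest of the document.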

Equivalent of ERD for MongoDB?

What would be the equivalent of ERD for a NoSQL database such as MongoDB?
It looks like you asked a similar question on Quora.
As mentioned there, the ERD is simply a mapping of the data you intend to store and the relations amongst that data.
You can still make an ERD with MongoDB as you still want to track the data and the relations. The big difference is that MongoDB has no joins, so when you translate the ERD into an actual schema you'll have to make some specific decisions about how to implement the relationships.
In particular, you'll need to make the "embed vs. reference" decision when deciding how this data will actually be stored. Relations are still allowed, just not enforced. Many of the wrappers for MongoDB actually provide lookups across collections to abstract some of this complexity.
Even though MongoDB does not enforce a schema, it's not recommended to proceed completely at random. Modeling the data you expect to have in the system is still a really good idea and that's what the ERD provides you.
So I guess the equivalent to the ERD is the ERD?
You could just use a UML class diagram instead too.
Moon Modeler supports schema design for MongoDB. It allows users to define diagrams with nested structures.
I know of no standard means of diagramming document-oriented "schema".
I'm sure you could use an ERD to map out your schemata but since document databases do not truly support--or more importantly enforce--relationships between data, it would only be as useful as your code was disciplined to internally enforce such relationships.
I have been thinking about the same issue for quite some time.
And I came to the following conclusion:
If NoSQL databases are generally schemaless, you don't actually have a 'schema' to illustrate in a diagram.
Thus, I think you should take a "by example" approach.
You could draw some mindmaps exemplifying what your data would look like when stored in a NoSQL DB such as MongoDB.
And since these databases are very dynamic you could also create some derived mindmaps to show how the data from today could evolve in time.
Take a look at this topic too: Confusion about NoSQL Design
MongoDB does support 'joins', just not in the SQL sense of INNER JOIN (the default SQL join). While the concept of 'join' is typically associated with SQL, MongoDB does have the aggregation framework with its data processing pipeline stages. The $lookup pipeline stage is used to create the equivalent of a LEFT JOIN in SQL. That is, all documents on the left of a relationship will pass through the pipeline, together with any related documents on the right side of the relationship. The documents are modified to include the relationship as part of the new documents.
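To show what "LEFT JOIN" means here, the following is a simplified in-memory emulation in Python. The stage dictionary uses the real $lookup field names (from, localField, foreignField, as), while the lookup function and the orders and customers data are hypothetical and only mimic equality matching on scalar fields.

```python
# The stage as it would appear in an aggregation pipeline on the orders collection.
stage = {
    "$lookup": {
        "from": "customers",
        "localField": "customer_id",
        "foreignField": "_id",
        "as": "customer",
    }
}


def lookup(left_docs, right_docs, local_field, foreign_field, as_field):
    """Simplified emulation: every left document passes through, with its matches attached."""
    joined = []
    for doc in left_docs:
        matches = [r for r in right_docs
                   if r.get(foreign_field) == doc.get(local_field)]
        # The left document is kept even when matches is empty (left outer join).
        joined.append({**doc, as_field: matches})
    return joined


orders = [{"_id": 1, "customer_id": 10}, {"_id": 2, "customer_id": 99}]
customers = [{"_id": 10, "name": "Ada"}]

spec = stage["$lookup"]
result = lookup(orders, customers,
                spec["localField"], spec["foreignField"], spec["as"])

for doc in result:
    print(doc)
```

Order 2 has no matching customer, yet it still appears in the result with an empty customer array, which is the left-join behaviour described above.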
Consequently, I postulate that Entity Relationship Diagrams do have a role in MongoDB. Documents are certainly related to each other in the db, and we should have a visualization of these relationships, including the cardinality relationship, e.g. full participation, partial participation, weak/strong entities, etc.
Of course, MongoDB also introduces the concept of embedded documents and referenced documents, and so I argue it adds additional flavor to the model of the ERD. And I certainly would want to see embedded and referenced relationships mapped out in a visual diagram.
The remaining question is: what is out there? What is out there for Mongoose for NodeJS? Mongoid for Ruby? If you check the respective repositories for their corresponding ORMs (Object Relational Mappers), you will see there are ERD tools for them. In terms of completeness they may leave a lot to be desired, and the open source community is welcome to make contributions.
https://www.npmjs.com/package/mongoose-erd
https://rubygems.org/gems/railroady