Let's assume I have the following object:
public class MyOwnList {
    @DatabaseField(id = true)
    protected int id;
    @ForeignCollectionField(eager = false)
    protected Collection<Item> items;
}
As items is marked as lazy it won't be loaded if I load the list object from the database.
That's exactly what I want!!
The problem is that every time I access items, ORMLite makes a SQL query to get the collection. I only discovered this after activating ORMLite's logging...
Why does it do that? Any good reason for that?
Is there any way that I can lazy load the collection, but only once, not every time I access the collection? So something between eager and lazy?
"The problem is that every time I access items, ORMLite makes a SQL query to get the collection."
So initially I didn't understand this. What you are asking for is for ORMLite to cache the collection of items after it lazy loads it the first time. The problem with this as the default is that ORMLite has no idea how big your collection of items is. One of the reasons why lazy collections are used is to handle large collections. If ORMLite kept all lazy collections around in memory, it could fill the memory.
I will add to the TODO list something like lazyCached = true which does a hybrid between lazy and eager. Good suggestion.
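Until something like that exists, the "load once, then reuse" behaviour can be approximated in application code. The following is only a minimal sketch: LazyCached, fakeQuery and queryCount are hypothetical names invented for illustration and are not part of ORMLite. The supplier stands in for whatever query actually loads the items.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Supplier;

class LazyCachedDemo {

    /** Hypothetical wrapper: runs the loader on first access and reuses the result afterwards. */
    static final class LazyCached<T> {
        private final Supplier<Collection<T>> loader;
        private Collection<T> cached; // null until the first get()

        LazyCached(Supplier<Collection<T>> loader) {
            this.loader = loader;
        }

        synchronized Collection<T> get() {
            if (cached == null) {
                cached = loader.get(); // the only place where the "query" runs
            }
            return cached;
        }

        /** Drops the cached copy so memory can be reclaimed; the next get() reloads. */
        synchronized void invalidate() {
            cached = null;
        }
    }

    static int queryCount = 0;

    // Stand-in for a real DAO query (an assumption); it only counts its invocations.
    static Collection<String> fakeQuery() {
        queryCount++;
        List<String> rows = new ArrayList<>();
        rows.add("item-1");
        rows.add("item-2");
        return rows;
    }

    public static void main(String[] args) {
        LazyCached<String> items = new LazyCached<>(LazyCachedDemo::fakeQuery);
        items.get();
        items.get();
        System.out.println("queries after two reads: " + queryCount);
        items.invalidate();
        items.get();
        System.out.println("queries after invalidate and one more read: " + queryCount);
    }
}
```

The trade-off is the one described above: the cached copy stays in memory until invalidate() is called, so this only makes sense when the collection is known to be small enough.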
I have read about lazy loading from this web site.
Enable or disable LazyLoading
"If we request a list of Students with LazyLoading enabled, the data provider will get all of our students from the DB but each StudentAddress property won’t be loaded until the property will be explicitly accessed."
This statement says that when I set Lazy Loading Enabled = true the related data won't be loaded. However
List<Students> stdList = Datacontext.Students.ToList();
If I set lazy loading enabled = true, the above code returns all students with their teachers and addresses. What is the point that I am missing here? Could someone please explain it?
No matter what setting you have, if you use .ToList() it will "enumerate the enumerable". This is very significant, and this phrase should become common knowledge to you.
When .ToList() is used, many things occur. Enumerating the enumerable means that the stored description of how to enumerate the set is now actually used to iterate through the set and populate the data. In other words, the query (which was stored internally as an Expression Tree) is now sent from Entity Framework to your SQL provider factory, which converts the Expression Tree into SQL and executes the query on the server, returning the data and populating your list.
Without ToList(), you would keep the IQueryable and iterate it yourself, loading each element of the set, or only part of the set, as you go.
Once the list of elements has been returned, lazy loading only comes into play if there are navigation properties. For example, suppose you have an Invoice and you want the related Customer information from the customer table. The relation will not be returned at first, only the invoices. To get the customer data you could then (while the context is still open, i.e. not disposed) access the .Customer reference on your object, and it would load. Conversely, to load all the customers during the original enumeration, you could use .Include() on your queryable, which tells the SQL provider factory to use a join when issuing the query.
In your specific example,
List<Students> stdList = Datacontext.Students.ToList();
This will actually not load all of the Teachers and Addresses, regardless of whether lazy loading is enabled. It will only load the students. If you want to lazy load a Teacher, while the Datacontext is still not disposed, you could then use
var firstStudent = stdList.First();
var teacher = firstStudent.Teacher;
//and at this point lazy loading will fetch the teacher
//by issuing **another** query (round trip) to the database
That would only be possible if lazy loading were enabled.
The alternative to this is to eager load, which would include the teachers and addresses. That would look like this
List<Students> stdList = Datacontext.Students
.Include( s => s.Teacher )
.Include( s => s.Address ).ToList();
And then later on if you were to try to access a teacher the context could be disposed and access would still be possible because the data was already loaded.
var firstStudent = stdList.First();
var teacher = firstStudent.Teacher;
//and at this point the teacher was already
//loaded and as a result no additional round trip is required
How did you notice that the property was loaded? With the debugger? If so, you already have your answer: the debugger accesses that property too, which also triggers lazy loading.
How does this work?
If your entities meet these requirements, EF will create a proxy class for each of your entities that supports change tracking or lazy loading. This way, related entities are loaded only when they are accessed. As explained above, the debugger will also trigger lazy loading.
Be careful with lazy loading, though: once your context has been disposed, you will get an exception when you try to access one of the related properties. So I would suggest using eager loading in that case.
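The proxy behaviour described above (load on first access, exception once the context is gone) can be illustrated with an analogy. EF generates its proxies as .NET classes; the sketch below is not EF code, only a minimal Java imitation built on java.lang.reflect.Proxy. FakeContext, Student and lazyStudent are hypothetical names made up for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class LazyProxyDemo {

    interface Student {
        String getName();
        String getTeacher(); // plays the role of a navigation property
    }

    /** Hypothetical stand-in for a database context that can be disposed. */
    static final class FakeContext implements AutoCloseable {
        boolean open = true;
        int queries = 0;

        String loadTeacherFor(String studentName) {
            if (!open) {
                throw new IllegalStateException("context is disposed");
            }
            queries++; // one simulated round trip
            return "Teacher of " + studentName;
        }

        @Override
        public void close() {
            open = false;
        }
    }

    /** Builds a proxy whose getTeacher() goes to the context on first access only. */
    static Student lazyStudent(String name, FakeContext ctx) {
        InvocationHandler handler = new InvocationHandler() {
            private String teacher; // not loaded until getTeacher() is called

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                switch (method.getName()) {
                    case "getName":
                        return name;
                    case "getTeacher":
                        if (teacher == null) {
                            teacher = ctx.loadTeacherFor(name);
                        }
                        return teacher;
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            }
        };
        return (Student) Proxy.newProxyInstance(
                Student.class.getClassLoader(), new Class<?>[] {Student.class}, handler);
    }

    public static void main(String[] args) {
        FakeContext ctx = new FakeContext();
        Student ann = lazyStudent("Ann", ctx);
        System.out.println(ann.getName() + ", queries so far: " + ctx.queries);
        System.out.println(ann.getTeacher() + ", queries so far: " + ctx.queries);

        Student bob = lazyStudent("Bob", ctx);
        ctx.close(); // Bob's teacher was never accessed while the context was open
        try {
            bob.getTeacher();
        } catch (IllegalStateException e) {
            System.out.println("after dispose: " + e.getMessage());
        }
    }
}
```

Anything that reads getTeacher(), including a debugger evaluating the property, goes through the same handler, which is why inspecting the object can itself trigger the load.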
I need to generate a CSV file containing a database export. Since the data can be quite big, I want to use lazy loading with a specific batch size, so that when I iterate through the collection returned by the DAO/Repository, I will only have a batch loaded at one point. I want this to be done automatically by the collection (e.g. otherwise I could just load page after page, using Pageable as a parameter).
Here is some code to hopefully make things clearer. My controller looks something like this:
public ModelAndView generateCsv(Status status) {
//can return a large number of items.
Collection<Item> items = itemRepository.findByStatus(status);
return new ModelAndView("csv", "items", items);
}
As you can see, I'm passing that collection to the view (through the ModelAndView object), and the view will just iterate through it and generate the CSV.
I want the collection to know how to load the next batch internally, which is what a lazy loaded collection should generally do.
Is there a way to do this with Spring-data or just plain JPA?
I know of ScrollableResults from Hibernate, but I don't like it for two reasons: it's not JPA (I'd have to make my code depend on Hibernate), and it doesn't use the collections API, so I'd have to make my view know about ScrollableResults. If it at least implemented Iterable, that would be nicer, but it doesn't.
So what I'm looking for is a way to specify that a collection is to be lazy loaded, using a specific batch size. Maybe something like:
@Query("SELECT o FROM Item o WHERE o.status = ?1")
@Fetch(type = FetchType.LAZY, size = 100)
Page<Item> findByStatus(Item.Status status);
If something like this is not possible using Spring Data, do you know if it can be done with QueryDsl? The fact that QueryDsl repositories return Iterator objects makes me think it might lazy load those, though I can't find documentation on that.
Thanks,
Stef.
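Independently of which persistence API ends up supplying the pages, the "collection that loads the next batch internally" can be sketched as a plain Iterable. This is a minimal sketch under stated assumptions: BatchedIterable, fakePage and TOTAL_ROWS are hypothetical names, and the page loader is a stand-in for a real paged query (for example one built on Query.setFirstResult/setMaxResults or a Pageable). Offset-based paging also assumes a stable sort order in the underlying query.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BiFunction;

class BatchedIterableDemo {

    /**
     * Iterable that pulls rows in fixed-size batches. The page loader receives
     * (offset, limit) and returns at most `limit` rows; a short or empty page
     * marks the end of the data.
     */
    static final class BatchedIterable<T> implements Iterable<T> {
        private final BiFunction<Integer, Integer, List<T>> pageLoader;
        private final int batchSize;

        BatchedIterable(BiFunction<Integer, Integer, List<T>> pageLoader, int batchSize) {
            if (batchSize <= 0) {
                throw new IllegalArgumentException("batchSize must be positive");
            }
            this.pageLoader = pageLoader;
            this.batchSize = batchSize;
        }

        @Override
        public Iterator<T> iterator() {
            return new Iterator<T>() {
                private List<T> batch = Collections.emptyList();
                private int indexInBatch = 0;
                private int offset = 0;
                private boolean exhausted = false;

                @Override
                public boolean hasNext() {
                    if (indexInBatch < batch.size()) {
                        return true; // still rows left in the current batch
                    }
                    if (exhausted) {
                        return false;
                    }
                    batch = pageLoader.apply(offset, batchSize); // load the next batch
                    offset += batch.size();
                    indexInBatch = 0;
                    if (batch.size() < batchSize) {
                        exhausted = true; // short page: nothing more to fetch
                    }
                    return !batch.isEmpty();
                }

                @Override
                public T next() {
                    if (!hasNext()) {
                        throw new NoSuchElementException();
                    }
                    return batch.get(indexInBatch++);
                }
            };
        }
    }

    static final int TOTAL_ROWS = 250;
    static int loads = 0;

    // Stand-in for a paged repository query (an assumption); rows are just integers.
    static List<Integer> fakePage(int offset, int limit) {
        loads++;
        List<Integer> page = new ArrayList<>();
        for (int i = offset; i < Math.min(offset + limit, TOTAL_ROWS); i++) {
            page.add(i);
        }
        return page;
    }

    public static void main(String[] args) {
        int count = 0;
        for (int row : new BatchedIterable<Integer>(BatchedIterableDemo::fakePage, 100)) {
            count++;
        }
        System.out.println("rows seen: " + count + ", page loads: " + loads);
    }
}
```

Because only one batch is referenced at a time, a view can iterate the whole result with an ordinary for-each loop while memory use stays bounded by the batch size.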
I have a Question object which has a List of Comment objects with a @OneToMany mapping. The Question object has a fetchComments(int offset, int pageSize) method to fetch comments for a given question.
I want to paginate the comments by fetching a limited amount of them at a time.
If I write a Query object then I can set the record offset and the maximum number of records to fetch with Query.setFirstResult(int offset) and Query.setMaxResults(int numberOfResults). But my question is how (if possible) I can achieve the same result without having to write a Query, i.e. with a simple annotation or property. More precisely, I need to know if there is something like
@OneToMany(cascade = CascadeType.ALL)
@Paginate(offset = x, maxresult = y) // is this kind of annotation available?
private List<Comment> comments;
I have read that @Basic(fetch = FetchType.LAZY) only loads the records needed at runtime, but I wouldn't have control over the number of records fetched there.
I'm new to JPA. So please consider if I've missed something really simple.
No, there is no such functionality in JPA. The concept itself is also a bit confusing. In your example, offset (and maxresult as well) is a compile-time constant, which does not serve pagination well. Also, JPA annotations on entities generally define structure, not context-dependent results (that is what queries are for).
If it is enough to fetch entities as they are accessed in the list, and you are using Hibernate, then the closest you can get is an extra-lazy collection via @LazyCollection:
@org.hibernate.annotations.LazyCollection(LazyCollectionOption.EXTRA)
I am using EF 4.1, with POCOs which are lazy loaded.
Some sample queries that I run:
var discountsCount = product.Discounts.Count();
var currentDiscountsCount = product.Discounts.Count(d=>d.IsCurrent);
var expiredDiscountsCount = product.Discounts.Count(d=>d.IsExpired);
What I'd like to know, is whether my queries make sense, or are poorly performant:
Am I hitting the database each time, or will the results come from cached data in the DbContext?
Is it okay to access the navigation properties "from scratch" each time, as above, or should I be caching them and then performing more queries on them, for example:
var discounts = product.Discounts;
var current = discounts.Count(d=>d.IsCurrent);
var expired = discounts.Count(d=>d.IsExpired);
What about a complicated case like below, does it pull the whole collection and then perform local operations on it, or does it construct a specialised SQL query which means that I cannot reuse the results to avoid hitting the database again:
var chained = discounts.OrderBy(d=>d.CreationDate).Where(d=>d.CreationDate < DateTime.Now).Count();
Thanks for the advice!
EDIT based on comments below
So once I call a navigation property (which is a collection), it will load the entire object graph. But what if I filtered that collection using .Count(d=>d...) or Select(d=>d...) or Min(d=>d...), etc. Does it load the entire graph as well, or only the final data?
product.Discounts (or any other navigation collection) isn't an IQueryable but only an IEnumerable. LINQ operations you perform on product.Discounts will never issue a query to the database - with the only exception that in case of lazy loading product.Discounts will be loaded once from the database into memory. It will be loaded completely - no matter which LINQ operation or filter you perform.
If you want to perform filters or any queries on navigation collections without loading the collection completely into memory you must not access the navigation collection but create a query through the context, for instance in your example:
var chained = context.Entry(product).Collection(p => p.Discounts).Query()
.Where(d => d.CreationDate < DateTime.Now).Count();
This would not load the Discounts collection of the product into memory but perform a query in the database and then return a single number as result. The next query of this kind would go to the database again.
In your examples above, the Discounts collection should be populated by EF the first time you access it. The subsequent LINQ queries on the Discounts collection should then be performed in memory. This includes even the last complex expression.
You can also use the Include method to make sure you get the associated collection back the first time, for example .Include("Discounts");
If you're worried about performance, I would recommend using SQL Profiler to look at what SQL is being executed.
I've been using Entity Framework Profiler to test my data access in an MVC project and have come across several pages where I'm making far more db queries than I need to because of N+1 problems.
Here is a simple example to show my problem:
var club = this.ActiveClub; // ActiveClub uses code similar to context.Clubs.First()
var members = club.Members.ToList();
return View("MembersWithAddress", members);
The view loops through Members and then follows a navigation property on each member to also show their address. Each of the address requests results in an extra db query.
One way to solve this would be to use Include to make sure the extra tables I need are queried up front. However, I only seem to be able to do this on the ObjectSet of Clubs attached directly to the context. In this case the ActiveClub property is shared by lots of controllers and I don't always want to query the Member and address table up front.
I'd like to be able to use something like:
var members = club.Members.Include("Address").ToList();
But, Members is an EntityCollection and that doesn't have the Include method on it.
Is there a way to force a load on the Members EntityCollection and ask EF to also load their Addresses?
Or, is using EntityCollection navigation properties on an entity in this way, just a really bad idea; and you should know what you're loading when you get it from the context?
If your entities inherit from EntityObject, try this:
var members = club.Members.CreateSourceQuery()
.Include("Address")
.ToList();
If you use POCOs with lazy loading proxies, try this (the cast must name the element type of the collection, assumed here to be Member):
var members = ((EntityCollection<Member>)club.Members).CreateSourceQuery()
.Include("Address")
.ToList();
Obviously the second version is not very nice, because POCOs are used to remove the dependency on EF, but now you need to cast the collection to an EF class. Another problem is that the query will be executed twice: lazy loading will trigger for Members once when you access the property, and a second query will be executed when you call ToList. This can be solved by turning off lazy loading before running the query.
When you say ActiveClub is shared, I believe you mean something like a property in a base controller that is used in derived controllers. In that case you can still use different code in different controllers to fill the property.
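To make the cost difference in this thread concrete, here is a minimal sketch that only counts round trips. It is not EF code: FakeDb and its methods are hypothetical stand-ins, where loadAddress plays the role of following the Address navigation property per member, and loadMembersWithAddresses plays the role of a single query that uses Include.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class NPlusOneDemo {

    /** Hypothetical in-memory "database" that counts simulated round trips. */
    static final class FakeDb {
        int roundTrips = 0;
        private final Map<Integer, String> addressByMemberId = new HashMap<>();

        FakeDb(int memberCount) {
            for (int id = 1; id <= memberCount; id++) {
                addressByMemberId.put(id, id + " Main Street");
            }
        }

        // One query for the members of the club.
        List<Integer> loadMemberIds() {
            roundTrips++;
            return new ArrayList<>(addressByMemberId.keySet());
        }

        // One query per member: what lazy loading does when the view touches the address.
        String loadAddress(int memberId) {
            roundTrips++;
            return addressByMemberId.get(memberId);
        }

        // One query for members and addresses together: the eager-loading case.
        Map<Integer, String> loadMembersWithAddresses() {
            roundTrips++;
            return new HashMap<>(addressByMemberId);
        }
    }

    /** Loads members, then each address separately; returns the number of round trips. */
    static int renderLazily(FakeDb db) {
        for (int memberId : db.loadMemberIds()) {
            db.loadAddress(memberId);
        }
        return db.roundTrips;
    }

    /** Loads members and addresses in one go; returns the number of round trips. */
    static int renderEagerly(FakeDb db) {
        db.loadMembersWithAddresses();
        return db.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("lazy round trips:  " + renderLazily(new FakeDb(10)));
        System.out.println("eager round trips: " + renderEagerly(new FakeDb(10)));
    }
}
```

With N members the lazy path costs N + 1 round trips and the eager path costs one, which is the reason for pushing the Include into the query that loads Members.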