Loading a context - entity-framework

Have I understood this correctly?
When you are running a web application to view pages and you create an instance of the context, does that instance load all of the database data into it?
If it does, doesn't that take up a lot of memory? A blog with five years of entries could have 1,500 to 2,000 (or more) posts in it; with all the comments, tags, etc., that would be a great deal of data.
So what does happen when you create an instance of a context?

A context only loads the records that you request, so when you first instantiate one it will be empty and won't perform any queries against the database until you tell it to. Any entities you load through it will (usually) be cached within the context, though, so they use more and more memory every time you run a query and can become very large over time.
For that reason, and because contexts are relatively cheap to instantiate, it's a good idea to only keep them alive while you actually need them, and dispose of them as soon as you're done. This is part of the "unit of work" pattern -- basically using a new context for each set of operations that go together as one unit or transaction.
Edited to add:
If you're performing read-only queries (i.e. you just want to display data, you don't need to make changes and save them back to the database), you might check out non-tracking queries (e.g. the .AsNoTracking() method if you're using a DbContext/DbSet, or setting the MergeOption property to MergeOption.NoTracking if you're using an ObjectContext/ObjectSet) -- that avoids caching the results in the context, which improves performance and reduces memory use.
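To make the behaviour described above concrete, here is a minimal sketch in Python rather than C#. `ToyContext` is a hypothetical stand-in, not Entity Framework's API: it starts empty, caches whatever is loaded through it, skips the cache for non-tracking reads, and releases everything when it is disposed.

```python
class ToyContext:
    """Toy stand-in for an ORM context (hypothetical, not the EF API)."""

    def __init__(self, table):
        self._table = table    # pretend database: {id: row dict}
        self._tracked = {}     # identity map, empty until a query runs

    def find(self, key, tracking=True):
        if tracking and key in self._tracked:
            return self._tracked[key]     # served from the context's cache
        entity = dict(self._table[key])   # "query" the pretend database
        if tracking:
            self._tracked[key] = entity   # kept so changes could be saved later
        return entity

    @property
    def tracked_count(self):
        return len(self._tracked)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        self._tracked.clear()   # disposing releases everything it cached
        return False


# A blog with 2,000 posts; none of them are loaded just by creating a context.
posts = {i: {"id": i, "title": f"Post {i}"} for i in range(1, 2001)}

with ToyContext(posts) as ctx:          # one short-lived unit of work
    empty_at_start = ctx.tracked_count  # nothing loaded yet
    ctx.find(1)
    ctx.find(2)
    ctx.find(3, tracking=False)         # read-only query, not cached
    tracked_after_queries = ctx.tracked_count

print(empty_at_start, tracked_after_queries)
```

The `with` block plays the role of a `using` block around a real context: the cache only lives as long as the unit of work does.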

Related

EF - multiple includes to eager load hierarchical data. Bad practice?

I need to eager load a hierarchy structure so that I can iterate through it recursively. The eager loading is necessary to prevent multiple db queries while traversing the tree. It seems the consensus is that you can't eager load infinite levels of the tree, so I did something like
var item = db.ItemHierarchies
    .Include("Children.Children.Children.Children.Children")
    .Where(x => x.condition == condition);
to load 5 levels of children. This seems to get the job done. I'm wondering what the drawback is to doing this? If there is none then theoretically could I add 50 levels of includes here without slowing things down?
I recommend taking a look at the SQL that is generated as you add eager loading to your query.
var item = db.ItemHierarchies
    .Include("Children")
    .Include("Children.Children")
    .Include("Children.Children.Children")
    .Include("Children.Children.Children.Children")
    .Include("Children.Children.Children.Children.Children");
// Assumes db is an ObjectContext; this cast does not work on a DbContext query.
var sql = ((System.Data.Objects.ObjectQuery) item).ToTraceString();
// http://visualstudiomagazine.com/blogs/tool-tracker/2011/11/seeing-the-sql.aspx
You'll see that the SQL quickly gets very big and complicated and can potentially have serious performance implications. You'd do well to limit your eager loading to data that you are certain you will need and to consider using explicit loading for some of the related entities - especially if you're working with connected entities in which case you can explicitly load collection properties when they're needed.
Also note that you may not need multiple separate Includes. For example, the following Includes need to be separate because they address separate properties (Widgets and Spanners) of the root.
var item = db.ItemHierarchies
    .Include("Widgets")
    .Include("Spanners.Flanges");
But the first Include in the following isn't necessary:
var item = db.ItemHierarchies
    .Include("Widgets") // This isn't necessary.
    .Include("Widgets.Flanges"); // This loads both Widgets and Flanges.
Well, honestly, it's an extremely bad practice.
Let's assume you had 50 objects in your root and 50 per level.
You may end up retrieving 312500000 "capsules" of information.
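As a rough check of that figure, here is a small Python sketch. It assumes the roots count as level one and that every object has exactly 50 children; the function names are made up for illustration.

```python
def rows_at_level(fanout, level):
    """Objects on one level when every object has `fanout` children."""
    return fanout ** level


def rows_up_to_level(fanout, levels):
    """Objects on all levels from 1 to `levels` combined."""
    return sum(rows_at_level(fanout, level) for level in range(1, levels + 1))


deepest = rows_at_level(50, 5)    # 50 * 50 * 50 * 50 * 50
total = rows_up_to_level(50, 5)   # every level added together

print(deepest)  # 312500000
print(total)    # 318877550
```

Each extra level multiplies the deepest count by another 50, which is why 50 levels of Include is out of the question for anything but a very sparse tree.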
Now, one might ask: "So what is wrong with that?!",
I mean, if that is what is required, then why not do that?
Rule #1: we develop software that should be used by human beings.
And the fact is that no human is capable of glancing at 312500000 items of information at once and learning or concluding something beneficial from it (except that looking at it does not help).
Rule #2: UI should be based on what is needed and not what is possible.
And since we have already established that showing 312500000 capsules of data is not needed, there is no reason to bring all of that in at once.
And now you might come forward and say: but I don't care about the UI, really! All I need is to iterate over that data in order to process some information!
In that case you would probably want to save your results somewhere for future reference, but that means it's a batch job, so why not apply batch job rules to it? For example, process it item by item, which may also give you the benefit of splitting the work across more machines if needed.
So you see, no matter which path you choose, there should be no reason to do it.
(= the definition of a bad practice.)
Update:
After reading interesting concerns in the comments, I would like to update this answer with more analysis:
Deciding what is a bad practice must always be done in reference to what is to be achieved, or to the role of each part in the system. In the current situation (after reading the comments) it has been stated or implied that the data storage is actually a persistence medium for objects, as opposed to a different concept where the data is the 'heart' of the application.
We can define two kinds of data storage:
1) A data center, which is used in data-centric applications such as banks, CRM, ERP, websites or other service-based solutions.
VS.
2) A data-persistence medium, which holds data saved for when the application is not active, for example any simple app save file or game save file.
The main difference is that a data-persistence medium is accessed by only a single instance of the app at any point in time, meaning the data is not designed to be shared by many instances. If the data is to be shared, we are dealing with a data-center application.
If your app just needs a data-persistence medium, loading all the information cannot be considered a bad practice, but you still need to make sure you are not exhausting memory. And in that framework, SQL Server might not be what you need or the best tool to use.
In the other case, a data-centric application, my original answer stands: it is a bad practice to bring in all the information for each instance of the application.

iOS: using GCD with Core Data

At the heart of it, my app will ask the user for a bunch of numbers, store them via Core Data, and then show the user the average of all these numbers.
So what I figure I should do is that after the user inputs a new number, I could fire up a new thread, fetch all the objects with an NSFetchRequest executed on my NSManagedObjectContext, do the proper calculations, and then update the UI on the main thread.
I'm aware that the rule for concurrency in Core Data is one thread per NSManagedObjectContext instance, so what I want to know is: do you think I can do what I just described without having my app explode 5 months down the line? I just don't think it's necessary to instantiate a whole new context just to do some measly calculations...
Based on what you have described, why not just store the numbers as they are entered into a CoreData model and also into an NSMutableArray? It seems as though you are storing these for future retrieval in case someone needs to look at (and maybe modify) a previous calculation. Under that scenario, there is no need to do a fetch after a current set of numbers is entered. Just use a mutable array and populate it with all the numbers for the current calculation. As a number is entered, save it to the model AND to the array. When the user is ready to see the average, do the math on the numbers in the already populated array. If the user wants to modify a previous calculation, retrieve those numbers into an array and work from there.
Bottom line is that you shouldn't need to work with multiple threads and merging Contexts unless you are populating a model from a large data set (like initial seeding of a phonebook, etc). Modifying a Context and calling save on that context is a very fast thing for such a small change as you are describing.
I would say you may want to do some testing, especially in regard to the size of the data set. If it is pretty small, the SQLite calls are pretty fast, so you may get away with doing it on the main queue. But if it is going to take some time, then it would be wise to get it off the main thread.
Apple introduced the concept of parent and child managed object contexts in 2011 to make using managed object contexts on different threads easier. You may want to check out the WWDC videos on Core Data.
You can use NSExpression with your fetch to get really high-performance functions like min, max, average, etc. Here is a good link. There are also examples on SO.
http://useyourloaf.com/blog/2012/01/19/core-data-queries-using-expressions.html
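The idea behind the expression-based fetch is to let the store compute the aggregate instead of loading every object and averaging in application code. The sketch below shows that idea in Python against an in-memory SQLite database. It is not Core Data code, and the `entry` table and its `value` column are made up for the example.

```python
import sqlite3

# In-memory database standing in for the app's persistent store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (value REAL NOT NULL)")
conn.executemany(
    "INSERT INTO entry (value) VALUES (?)",
    [(4.0,), (8.0,), (15.0,), (16.0,), (23.0,), (42.0,)],
)

# One query returns the aggregates; no individual rows reach the application.
average, smallest, largest = conn.execute(
    "SELECT AVG(value), MIN(value), MAX(value) FROM entry"
).fetchone()
conn.close()

print(average, smallest, largest)  # 18.0 4.0 42.0
```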
Good luck!

How do I stop my app from accessing the database numerous times for each cell?

I have a UITableViewCell, which receives a pointer to an object in the database. That object has its own objects in one-to-one relationships. I want to know how I can access various bits of information from these objects without it accessing the database each time. In other words: am I able to load the data in one go so it's all ready to use as part of the pointer, rather than something that needs to be loaded from the database?
You can use NSCache to cache the data. Before going to DB, check if the data is in the cache. If it is not, read the DB and put the results in the cache; next time the access is going to be almost instantaneous.
You can also use Core Data to access the database: it caches results for you.
Andrew,
You can set your fetch request to not load your items as faults, i.e. make all the data available on the objects instead of only in the row cache, by setting returnsObjectsAsFaults to NO. Also, you can prefetch the objects in relationships by setting relationshipKeyPathsForPrefetching. In other words, you can make sure that you fetch everything in "one go."
Andrew

Data Base Design Dilemma

I am creating a simple DB application for reports. According to DB design theory, you should never store the same information twice. This makes sense for most DB applications, but I need something where you can simply select a generic topic and get an instance copy of it. You could then leave the instance copy untouched or change its information. Modifying the instance copy should not modify the generic topic, but the relationship between the original topic and its instance copy needs to be tracked.
Confusing, I know. Here is a diagram that may help:
I need the report to be immutable or mutable based off of the situation.
A quick example would be you select a customer, then you finish your report. A month later the customer's phone number changes so you update the customer portion of the DB, but you do not want to pull up a finished report and have the new information update into the already completed report.
What would be the most elegant solution to this scenario?
This may work:
But by utilizing this approach I would find myself using looping statements and if statements to identify the relationships between Generic, Checked Off and Report.
for (NSManagedObject *managedObject in checkedOffTaskObjects) {
    if ([[reportObject valueForKeyPath:@"tasks"] containsObject:managedObject]) {
        if ([[managedObject valueForKeyPath:@"tasks"] containsObject:genericTaskObjectAtIndexPath]) {
            cell.backgroundView = [[[UIImageView alloc] initWithImage:[UIImage imageNamed:@"cellbackground.png"]] autorelease];
        }
    }
}
I know a better solution exists, but I cannot see it.
Thank you for your time.
It's tricky to be very precise without knowing much about what exactly you're modelling, but here goes...
As you've noted, there are at least two strategies to get the "mutable instance copies of a prototype" functionality you want:
1) When creating an instance based on a prototype, completely copy the instance data from the prototype. No link between them thereafter.
PRO: faster access to the instance data with less logic involved.
CON 1: Any update to your prototype will not make it into the instances. e.g. if you have the address of a company wrong in the prototype.
CON 2: you're duplicating database data -- to a certain extent -- wasteful if you have huge records.
2) When creating an instance based on a prototype, store a reference to the 'parent' record, i.e. the prototype, and then only store updated fields in the actual instance.
PRO 1: Updates to prototype get reflected in all instances.
PRO 2: More efficient use of storage space (less duplication of data)
CON: more logic around pulling an instance from the database.
In summary: there's not any magical solution I can think of that gets you the best of both of these worlds. They're both valid strategies, depending on your exact problem and constraints (runtime speed versus storage size, for example).
If you go for 2), I certainly don't think it's a disaster -- particularly if you model things well and find the most efficient way to structure things in Core Data.
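The two strategies can be sketched side by side. This is a minimal Python illustration, not Core Data code; the class names (`Prototype`, `SnapshotInstance`, `LinkedInstance`) and the sample customer fields are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Prototype:
    """The generic record, e.g. a customer or a generic topic."""
    fields: dict


@dataclass
class SnapshotInstance:
    """Strategy 1: a full copy with no link back to the prototype."""
    fields: dict

    @classmethod
    def from_prototype(cls, proto):
        return cls(dict(proto.fields))   # copy, so later edits don't leak in


@dataclass
class LinkedInstance:
    """Strategy 2: keep a reference to the parent and store only overrides."""
    parent: Prototype
    overrides: dict = field(default_factory=dict)

    def get(self, name):
        if name in self.overrides:
            return self.overrides[name]   # field changed on this instance
        return self.parent.fields[name]   # otherwise fall back to the parent


customer = Prototype({"name": "Acme", "phone": "555-0100"})
finished_report = SnapshotInstance.from_prototype(customer)
draft_report = LinkedInstance(customer, {"name": "Acme Ltd"})

customer.fields["phone"] = "555-0199"     # the customer's number changes later

print(finished_report.fields["phone"])    # 555-0100 (frozen at copy time)
print(draft_report.get("phone"))          # 555-0199 (follows the prototype)
print(draft_report.get("name"))           # Acme Ltd (instance override)
```

For the finished-report case in the question, the snapshot behaviour is the one that keeps old reports unchanged. The linked form suits reports that should keep tracking the prototype until they are finalised.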

What are some of the advantage/disadvantages of using SQLDataReader?

SqlDataReader is a faster way to process the results of a stored procedure. What are some of the advantages/disadvantages of using SqlDataReader?
I assume you mean "instead of loading the results into a DataTable"?
Advantages: you're in control of how the data is loaded. You can ask for specific data types, and you don't end up loading the whole set of data into memory all at the same time unless you want to. Basically, if you want the data but don't need a data table (e.g. you're going to populate your own kind of collection) you don't get the overhead of the intermediate step.
Disadvantages: you're in control of how the data is loaded, which means it's easier to make a mistake and there's more work to do.
What's your use case here? Do you have a good reason to believe that the overhead of using a normal (or strongly typed) data table is significantly hurting performance? I'd only use SqlDataReader directly if I had a good reason to do so.
The key advantage is obviously speed - that's the main reason you'd choose a SqlDataReader.
One potential disadvantage not already mentioned is that the SqlDataReader is forward-only, so you can only go through the records once, in sequence - that's one of the things that allows it to be so fast. In many cases that's fine, but if you need to iterate over the records more than once or add/edit/delete data, you'll need to use one of the alternatives.
It also remains connected until you've worked through all the records and close the reader (of course, you can opt to close it earlier, but then you can't access any of the remaining records). If you're going to perform any lengthy processing on the records as you iterate over them, you may find that you impact other connections to the database.
It depends on what you need to do. If you get back a page of results from the database (say 20 records), it would be better to use a data adapter to fill a DataSet, and bind that to something in the UI.
But if you need to process many records, one at a time, use SqlDataReader.
Advantages: Faster, less memory.
Disadvantages: Must remain connected, must remember to close the reader.
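The trade-off between a forward-only reader and a fully loaded table can be illustrated with Python's sqlite3 module. This is an analogy rather than ADO.NET code: iterating a cursor plays the role of SqlDataReader, and `fetchall()` plays the role of filling a DataTable. The `reading` table is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO reading (amount) VALUES (?)", [(n,) for n in range(1, 1001)]
)

# Reader style: rows are consumed one at a time, forward only, while the
# connection stays open; nothing is kept once a row has been processed.
running_total = 0
cursor = conn.execute("SELECT amount FROM reading ORDER BY id")
for (amount,) in cursor:
    running_total += amount
cursor.close()   # must remember to close the reader

# Table style: every row is loaded into memory first, so the data can be
# revisited as often as needed after the connection is closed.
rows = conn.execute("SELECT amount FROM reading ORDER BY id").fetchall()
conn.close()
table_total = sum(amount for (amount,) in rows)
second_pass_count = len(rows)   # a second pass over the same rows is fine

print(running_total, table_total, second_pass_count)  # 500500 500500 1000
```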