Bulk Insert Optimization in .NET / EF Core

Bulk Insert Optimization in .NET / EF Core - entity-framework

I have some .NET Core code that does some bulk loading of the DB with random sample data.
I'm getting 20 inserts/second on localhost, and looking to improve my performance. I'm doing the basic stuff like calling _dbContext.SaveChanges() only once, etc.
A number of posts like this indicate gains can be had by manipulating properties on the DbContext's Configuration, such as Configuration.AutoDetectChangesEnabled and Configuration.ValidateOnSaveEnabled.
My .NET Core MVC app's DbContext is a subclass of IdentityDbContext, which does not expose the Configuration.
Not sure what approach I should be using - can I / should I be messing with those configuration properties of a IdentityDbContext subclass?
Or, should I use a separate DbContext for this? (Some early research indicated the typical pattern is a single DbContext for a webapp).

There is no need for creating separated DbContext class and you can turn change tracking off:
context.ChangeTracker.AutoDetectChangesEnabled = false;
or you can turn it off globaly:
public class MyContext : IdentityDbContext
{
public MyContext()
{
ChangeTracker.AutoDetectChangesEnabled = false;
}
}

Related

Is DBContext added automatically with EF 6 or must it be added manuall?

New to EF. Following along with DBContext by Lerman/Miller.
When I start a new project, adding EF6 (Database First), the DBContext seems to be added as a default (ie I don't have to add the DBContext separately with T4). Also, for Lazy Loading, the "virtual" needed in the class definitions also seems to be there by default (I don't have to add it like in the book). Is this what is expected?

When you use Database First approach and use EF x DbContext Generator it creates the DbContext for you automatically and set the navigation properties, virtual. If you want to disable lazy loading you can simply use following code
public class MyContext : DbContext
{
public MyContext()
{
this.Configuration.LazyLoadingEnabled = false;
}
}

Most probably the books you are reading are for code first development. If you use database first(especially with the designer) you don't need to make changes.

Entity Framework 5 : Using Lazy Loading or Eager Loading

I'm sorry if my question is normal. But I meet this problem when I design my ASP.NET MVC 4.0 Application using Entity Framework 5.
If I choose Eager Loading, I just simplify using :
public Problem getProblemById(int id) {
using(DBEntity ctx = new DBEntity ())
{
return (Problem) ctx.Posts.Find(id);
}
}
But if I use Eager Loading, I will meet problem: when I want to navigate through all its attributes such as comments (of problem), User (of Problem) ... I must manually use Include to include those properties. and At sometimes, If I don't use those properties, I will lost performance, and maybe I lost the strength of Entity Framework.
If I use Lazy Loading. There are two ways to use DBContext object. First way is using DBContext object locally :
public Problem getProblemById(int id) {
DBEntity ctx = new DBEntity ();
return (Problem) ctx.Posts.Find(id);
}
Using this, I think will meet memory leak, because ctx will never dispose again.
Second way is make DBContext object static and use it globally :
static DBEntity ctx = new DBEntity ();
public Problem getProblemById(int id) {
return (Problem) ctx.Posts.Find(id);
}
I read some blog, they say that, if I use this way, I must control concurrency access (because multi request sends to server) by myself, OMG. For example this link :
Entity Framework DBContext Usage
So, how can design my app, please help me figure out.
Thanks :)

Don't use a static DBContext object. See c# working with Entity Framework in a multi threaded server
A simple rule for ASP.Net MVC: use a DBContext instance per user request.
As for using lazy loading or not, I would say it depends, but personally I would deactivate lazy-loading. IMO it's a broken feature because there are fundamental issues with it:
just too hard to handle exceptions, because a SQL request can fail at any place in your code (not just in the DAL because one developer can access to a navigation property in any piece of code)
poor performances if not well used
too easy to write broken code that produces thousands of SQL requests

Entity Framework 6 Code First - Is Repository Implementation a Good One?

I am about to implement an Entity Framework 6 design with a repository and unit of work.
There are so many articles around and I'm not sure what the best advice is: For example I realy like the pattern implemented here: for the reasons suggested in the article here
However, Tom Dykstra (Senior Programming Writer on Microsoft's Web Platform & Tools Content Team) suggests it should be done in another article: here
I subscribe to Pluralsight, and it is implemented in a slightly different way pretty much every time it is used in a course so choosing a design is difficult.
Some people seem to suggest that unit of work is already implemented by DbContext as in this post, so we shouldn't need to implement it at all.
I realise that this type of question has been asked before and this may be subjective but my question is direct:
I like the approach in the first (Code Fizzle) article and wanted to know if it is perhaps more maintainable and as easily testable as other approaches and safe to go ahead with?
Any other views are more than welcome.

#Chris Hardie is correct, EF implements UoW out of the box. However many people overlook the fact that EF also implements a generic repository pattern out of the box too:
var repos1 = _dbContext.Set<Widget1>();
var repos2 = _dbContext.Set<Widget2>();
var reposN = _dbContext.Set<WidgetN>();
...and this is a pretty good generic repository implementation that is built into the tool itself.
Why go through the trouble of creating a ton of other interfaces and properties, when DbContext gives you everything you need? If you want to abstract the DbContext behind application-level interfaces, and you want to apply command query segregation, you could do something as simple as this:
public interface IReadEntities
{
IQueryable<TEntity> Query<TEntity>();
}
public interface IWriteEntities : IReadEntities, IUnitOfWork
{
IQueryable<TEntity> Load<TEntity>();
void Create<TEntity>(TEntity entity);
void Update<TEntity>(TEntity entity);
void Delete<TEntity>(TEntity entity);
}
public interface IUnitOfWork
{
int SaveChanges();
}
You could use these 3 interfaces for all of your entity access, and not have to worry about injecting 3 or more different repositories into business code that works with 3 or more entity sets. Of course you would still use IoC to ensure that there is only 1 DbContext instance per web request, but all 3 of your interfaces are implemented by the same class, which makes it easier.
public class MyDbContext : DbContext, IWriteEntities
{
public IQueryable<TEntity> Query<TEntity>()
{
return Set<TEntity>().AsNoTracking(); // detach results from context
}
public IQueryable<TEntity> Load<TEntity>()
{
return Set<TEntity>();
}
public void Create<TEntity>(TEntity entity)
{
if (Entry(entity).State == EntityState.Detached)
Set<TEntity>().Add(entity);
}
...etc
}
You now only need to inject a single interface into your dependency, regardless of how many different entities it needs to work with:
// NOTE: In reality I would never inject IWriteEntities into an MVC Controller.
// Instead I would inject my CQRS business layer, which consumes IWriteEntities.
// See #MikeSW's answer for more info as to why you shouldn't consume a
// generic repository like this directly by your web application layer.
// See http://www.cuttingedge.it/blogs/steven/pivot/entry.php?id=91 and
// http://www.cuttingedge.it/blogs/steven/pivot/entry.php?id=92 for more info
// on what a CQRS business layer that consumes IWriteEntities / IReadEntities
// (and is consumed by an MVC Controller) might look like.
public class RecipeController : Controller
{
private readonly IWriteEntities _entities;
//Using Dependency Injection
public RecipeController(IWriteEntities entities)
{
_entities = entities;
}
[HttpPost]
public ActionResult Create(CreateEditRecipeViewModel model)
{
Mapper.CreateMap<CreateEditRecipeViewModel, Recipe>()
.ForMember(r => r.IngredientAmounts, opt => opt.Ignore());
Recipe recipe = Mapper.Map<CreateEditRecipeViewModel, Recipe>(model);
_entities.Create(recipe);
foreach(Tag t in model.Tags) {
_entities.Create(tag);
}
_entities.SaveChanges();
return RedirectToAction("CreateRecipeSuccess");
}
}
One of my favorite things about this design is that it minimizes the entity storage dependencies on the consumer. In this example the RecipeController is the consumer, but in a real application the consumer would be a command handler. (For a query handler, you would typically consume IReadEntities only because you just want to return data, not mutate any state.) But for this example, let's just use RecipeController as the consumer to examine the dependency implications:
Say you have a set of unit tests written for the above action. In each of these unit tests, you new up the Controller, passing a mock into the constructor. Then, say your customer decides they want to allow people to create a new Cookbook or add to an existing one when creating a new recipe.
With a repository-per-entity or repository-per-aggregate interface pattern, you would have to inject a new repository instance IRepository<Cookbook> into your controller constructor (or using #Chris Hardie's answer, write code to attach yet another repository to the UoW instance). This would immediately make all of your other unit tests break, and you would have to go back to modify the construction code in all of them, passing yet another mock instance, and widening your dependency array. However with the above, all of your other unit tests will still at least compile. All you have to do is write additional test(s) to cover the new cookbook functionality.

I'm (not) sorry to say that the codefizzle, Dyksta's article and the previous answers are wrong. For the simple fact that they use the EF entities as domain (business) objects, which is a big WTF.
Update: For a less technical explanation (in plain words) read Repository Pattern for Dummies
In a nutshell, ANY repository interface should not be coupled to ANY persistence (ORM) detail. The repo interface deals ONLY with objects that makes sense for the rest of the app (domain, maybe UI as in presentation). A LOT of people (with MS leading the pack, with intent I suspect) make the mistake of believing that they can reuse their EF entities or that can be business object on top of them.
While it can happen, it's quite rare. In practice, you'll have a lot of domain objects 'designed' after database rules i.e bad modelling. The repository purpose is to decouple the rest of the app (mainly the business layer) from its persistence form.
How do you decouple it when your repo deals with EF entities (persistence detail) or its methods return IQueryable, a leaking abstraction with wrong semantics for this purpose (IQueryable allows you to build a query, thus implying that you need to know persistence details thus negating the repository's purpose and functionality)?
A domin object should never know about persistence, EF, joins etc. It shouldn't know what db engine you're using or if you're using one. Same with the rest of the app, if you want it to be decoupled from the persistence details.
The repository interface know only about what the higher layer know. This means, that a generic domain repository interface looks like this
public interface IStore<TDomainObject> //where TDomainObject != Ef (ORM) entity
{
void Save(TDomainObject entity);
TDomainObject Get(Guid id);
void Delete(Guid id);
}
The implementation will reside in the DAL and will use EF to work with the db. However the implementation looks like this
public class UsersRepository:IStore<User>
{
public UsersRepository(DbContext db) {}
public void Save(User entity)
{
//map entity to one or more ORM entities
//use EF to save it
}
//.. other methods implementation ...
}
You don't really have a concrete generic repository. The only usage of a concrete generic repository is when ANY domain object is stored in serialized form in a key-value like table. It isn't the case with an ORM.
What about querying?
public interface IQueryUsers
{
PagedResult<UserData> GetAll(int skip, int take);
//or
PagedResult<UserData> Get(CriteriaObject criteria,int skip, int take);
}
The UserData is the read/view model fit for the query context usage.
You can use directly EF for querying in a query handler if you don't mind that your DAL knows about view models and in that case you won't be needing any query repo.
Conclusion
Your business object shouldn't know about EF entities.
The repository will use an ORM, but it never exposes the ORM to the rest of the app, so the repo interface will use only domain objects or view models (or any other app context object that isn't a persistence detail)
You do not tell the repo how to do its work i.e NEVER use IQueryable with a repo interface
If you just want to use the db in a easier/cool way and you're dealing with a simple CRUD app where you don't need (be sure about it) to maintain separation of concerns then skip the repository all together, use directly EF for everything data. The app will be tightly coupled to EF but at least you'll cut the middle man and it will be on purpose not by mistake.
Note that using the repository in the wrong way, will invalidate its use and your app will still be tightly coupled to the persistence (ORM).
In case you believe the ORM is there to magically store your domain objects, it's not. The ORM purpose is to simulate an OOP storage on top of relational tables. It has everything to do with persistence and nothing to do with domain, so don't use the ORM outside persistence.

DbContext is indeed built with the Unit of Work pattern. It allows all of its entities to share the same context as we work with them. This implementation is internal to the DbContext.
However, it should be noted that if you instantiate two DbContext objects, neither of them will see the other's entities that they are each tracking. They are insulated from one another, which can be problematic.
When I build an MVC application, I want to ensure that during the course of the request, all my data access code works off of a single DbContext. To achieve that, I apply the Unit of Work as a pattern external to DbContext.
Here is my Unit of Work object from a barbecue recipe app I'm building:
public class UnitOfWork : IUnitOfWork
{
private BarbecurianContext _context = new BarbecurianContext();
private IRepository<Recipe> _recipeRepository;
private IRepository<Category> _categoryRepository;
private IRepository<Tag> _tagRepository;
public IRepository<Recipe> RecipeRepository
{
get
{
if (_recipeRepository == null)
{
_recipeRepository = new RecipeRepository(_context);
}
return _recipeRepository;
}
}
public void Save()
{
_context.SaveChanges();
}
**SNIP**
I attach all my repositories, which are all injected with the same DbContext, to my Unit of Work object. So long as any repositories are requested from the Unit of Work object, we can be assured that all our data access code will be managed with the same DbContext - awesome sauce!
If I were to use this in an MVC app, I would ensure the Unit of Work is used throughout the request by instantiating it in the controller, and using it throughout its actions:
public class RecipeController : Controller
{
private IUnitOfWork _unitOfWork;
private IRepository<Recipe> _recipeService;
private IRepository<Category> _categoryService;
private IRepository<Tag> _tagService;
//Using Dependency Injection
public RecipeController(IUnitOfWork unitOfWork)
{
_unitOfWork = unitOfWork;
_categoryRepository = _unitOfWork.CategoryRepository;
_recipeRepository = _unitOfWork.RecipeRepository;
_tagRepository = _unitOfWork.TagRepository;
}
Now in our action, we can be assured that all our data access code will use the same DbContext:
[HttpPost]
public ActionResult Create(CreateEditRecipeViewModel model)
{
Mapper.CreateMap<CreateEditRecipeViewModel, Recipe>().ForMember(r => r.IngredientAmounts, opt => opt.Ignore());
Recipe recipe = Mapper.Map<CreateEditRecipeViewModel, Recipe>(model);
_recipeRepository.Create(recipe);
foreach(Tag t in model.Tags){
_tagRepository.Create(tag); //I'm using the same DbContext as the recipe repo!
}
_unitOfWork.Save();

Searching around the internet I found this http://www.thereformedprogrammer.net/is-the-repository-pattern-useful-with-entity-framework/ it's a 2 part article about the usefulness of the repository pattern by Jon Smith.
The second part focuses on a solution. Hope it helps!

Repository with unit of work pattern implementation is a bad one to answer your question.
The DbContext of the entity framework is implemented by Microsoft according to the unit of work pattern. That means the context.SaveChanges is transactionally saving your changes in one go.
The DbSet is also an implementation of the Repository pattern. Do not build repositories that you can just do:
void Add(Customer c)
{
_context.Customers.Add(c);
}
Create a one-liner method for what you can do inside the service anyway ???
There is no benefit and nobody is changing EF ORM to another ORM nowadays...
You do not need that freedom...
Chris Hardie is argumenting that there could be instantiated multiple context objects but already doing this you do it wrong...
Just use an IOC tool you like and setup the MyContext per Http Request and your are fine.
Take ninject for example:
kernel.Bind<ITeststepService>().To<TeststepService>().InRequestScope().WithConstructorArgument("context", c => new ITMSContext());
The service running the business logic gets the context injected.
Just keep it simple stupid :-)

You should consider "command/query objects" as an alternative, you can find a bunch of interesting articles around this area, but here is a good one:
https://rob.conery.io/2014/03/03/repositories-and-unitofwork-are-not-a-good-idea/
When you need a transaction over multiple DB objects, use one command object per command to avoid the complexity of the UOW pattern.
A query object per query is likely unnecessary for most projects. Instead you might choose to start with a 'FooQueries' object
...by which I mean you can start with a Repository pattern for READS but name it as "Queries" to be explicit that it does not and should not do any inserts/updates.
Later, you might find splitting out individual query objects worthwhile if you want to add things like authorization and logging, you could feed a query object into a pipeline.

I always use UoW with EF code first. I find it more performant and easier tot manage your contexts, to prevent memory leaking and such. You can find an example of my workaround on my github: http://www.github.com/stefchri in the RADAR project.
If you have any questions about it feel free to ask them.

EF: In search of a design strategy for DatabaseFirst DbContext of a Modular Application

I'm looking for suggestions on how to approach using an ORM (in this case, EF5) in the design of modular Non-Monolithic applications, with a Core part and 3rd party Modules, where the Core has no direct Reference to the 3rd party Modules, and Modules only have a reference to Core/Common tables and classes.
For arguments sake, a close enough analogy would be DNN.
CodeFirst:
With CodeFirst, the approach I used was to build up the model of the Db was via reflection: in the Core's DbContext's DbInitialation phase, I used Reflection to find any class in any dll (eg Core or various Modules) decorated with IDbInitializer (a custom contract containing an Execute() method) to define just the dll's structure. Each dll added to the DbModel what it knew about itself.
Any subsequent Seeding was also handled in the same wa (searching for a specific IDbSeeder contract, and executing it).
Pro:
* the approach works for now.
* The same core DbContext can be used across all respositories, as long as each repo uses dbContext.GetSet(), rather than expecting it to be a property of the dbContext. No biggie.
Cons:
* it only works at startup (ie, adding new modules would require an AppPool refresh).
* CodeFirst is great for a POC. But in EF5, it's not mature enough for Enterprise work yet (and I can't wait for EF6 for StoredProcs and other features to be added).
* My DBA hates CodeFirst, at least for the Core, wanting to optimize that part with Stored Procs as much as as possible...We're a team, so I have to try to find a way to please him, if I can find a way...
Database-first:
The DbModel phase appears to be happening prior to the DbContext's constructor (reading from embedded *.edmx resource file). DbInitialization is never invoked (as model is deemed complete), so I can't add more tables than what the Core knows about.
If I can't add elements to the Model, dynamically, as one can with CodeFirst, it means that
* either the Core DbContext's Model has to have knowledge of every table in the Db -- Core AND every 3rd party module. Making the application Monolithic and highly coupled, defeating the very thing I am trying to achieve.
* Or each 3rd party has to create their own DbContext, importing Core tables, leading to
* versioning issues (module not updating their *.edmx's when Core's *.edmx is updated, etc.)
* duplication everywhere, in different memory contexts = hard to track down concurrency issues.
At this point, it seems to me that the CodeFirst approach is the only way that Modular software can be achieved with EF. But hopefully someone else know's how to make DatabaseFirst shine -- is there any way of 'appending' DbSet's to the model created from the embedded *.edmx file?
Or any other ideas?

I would consider using a sort of plugin architecture, since that's your overall design for the application itself.
You can accomplish the basics of this by doing something like the following (note that this example uses StructureMap - here is a link to the StructureMap Documentation):
Create an interface from which your DbContext objects can derive.
public interface IPluginContext {
IDictionary<String, DbSet> DataSets { get; }
}
In your Dependency Injection set-up (using StructureMap) - do something like the following:
Scan(s => {
s.AssembliesFromApplicationBaseDirectory();
s.AddAllTypesOf<IPluginContext>();
s.WithDefaultConventions();
});
For<IEnumerable<IPluginContext>>().Use(x =>
x.GetAllInstances<IPluginContext>()
);
For each of your plugins, either alter the {plugin}.Context.tt file - or add a partial class file which causes the DbContext being generated to derive from IPluginContext.
public partial class FooContext : IPluginContext { }
Alter the {plugin}.Context.tt file for each plugin to expose something like:
public IDictionary<String, DbSet> DataSets {
get {
// Here is where you would have the .tt file output a reference
// to each property, keyed on its property name as the Key -
// in the form of an IDictionary.
}
}
You can now do something like the following:
// This could be inside a service class, your main Data Context, or wherever
// else it becomes convenient to call.
public DbSet DataSet(String name) {
var plugins = ObjectFactory.GetInstance<IEnumerable<IPluginContext>>();
var dataSet = plugins.FirstOrDefault(p =>
p.DataSets.Any(ds => ds.Key.Equals(name))
);
return dataSet;
}
Forgive me if the syntax isn't perfect - I'm doing this within the post, not within the compiler.
The end result gives you the flexibility to do something like:
// Inside an MVC controller...
public JsonResult GetPluginByTypeName(String typeName) {
var dataSet = container.DataSet(typeName);
if (dataSet != null) {
return Json(dataSet.Select());
} else {
return Json("Unable to locate that object type.");
}
}
Clearly, in the long-run - you would want the control to be inverted, where the plugin is the one actually tying into the architecture, rather than the server expecting a type. You can accomplish the same kind of thing using this sort of lazy-loading, however - where the main application exposes an endpoint that all of the Plugins tie to.
That would be something like:
public interface IPlugin : IDisposable {
void EnsureDatabase();
void Initialize();
}
You now can expose this interface to any application developers who are going to create plugins for your architecture (DNN style) - and your StructureMap configuration works something like:
Scan(s => {
s.AssembliesFromApplicationBaseDirectory(); // Upload all plugin DLLs here
// NOTE: Remember that this gives people access to your system!!!
// Given what you are developing, though, I am assuming you
// already get that.
s.AddAllTypesOf<IPlugin>();
s.WithDefaultConventions();
});
For<IEnumerable<IPlugin>>().Use(x => x.GetAllInstances<IPlugin>());
Now, when you initialize your application, you can do something like:
// Global.asax
public static IEnumerable<IPlugin> plugins =
ObjectFactory.GetInstance<IEnumerable<IPlugin>>();
public void Application_Start() {
foreach(IPlugin plugin in plugins) {
plugin.EnsureDatabase();
plugin.Initialize();
}
}
Each of your IPlugin objects can now contain its own database context, manage the process of installing (if necessary) its own database instance / tables, and dispose of itself gracefully.
Clearly this isn't a complete solution - but I hope it starts you off in a useful direction. :) If I can help clarify anything herein, please let me know.

While it's a CodeFirst approach, and not the cleanest solution, what about forcing initialization to run even after start up, as in the answer posted here: Forcing code-first to always initialize a non-existent database? (I know precisely nothing about CodeFirst, but seeing as this is a month old with no posts it's worth a shot)

Binding the interface with class in mvc3 EF

Suppose I have some Interface like :
public interface IIconComponent
{
// statements ...
}
then I am implementing this interface within my class as below
public class IconComponent : IIconcomponent
{
// implementing the interface statements ..
}
and creating a Table in mvc3 like:
public class IconDBContext : DbContext
{
public DbSet<IIconComponent> Icon {get; set;} //Is this statement possible
}
That is making the set of objects of interface type for storing the class IconComponent objects in the table. How to do this in MVC3 ?
Does I have to implement some model-binder for this ? or, there exists some other method ?
Thanking you,

EF doesn't support interfaces. DbSet must be defined with the real implementation. Once you change it to use implementation your actions will most probably use it as well because there will be no reason to work with abstraction.

Why would you use entity framework is you're creating abstraction layer on top of it, it's as you're not using entity framework at all and because of that entity framework is not able to work with interfaces.

If you really need to, you can let your Entity Framework classes implement interfaces. With POCO's it's straightforward, with edmx you can make partial classes that contain the derivation from the interface. However, as said by Ladislav, something like DbSet<IIconComponent> is not possible.
I can imagine scenarios where you would want to use this, e.g. dealing with other application components that only accept specific interfaces, but that you want to populate with your EF classes. (The other day, I did exactly that with a legacy UI layer).

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse