Is there a way to manage part of an Entity Framework object using EF and the remainder using ADO? - entity-framework

I have to maintain an application that creates and manages quotations. (A quotation is a single class containing all information needed for pricing).
Most of the time, creating a quotation means adding a couple of lines to a couple of tables, which is pretty fast. Sometimes, however, the user attaches a large history of claims to the quotation and tens of thousands of lines must be created in the database. Using EF, this takes forever.
So I've tried to use SqlBulkCopy to bulk insert the claims while using EF to manage the remainder of the quotation, but the way I figured out how to achieve this is really, really cumbersome: I had to clone the quotation, detach the histories, delete the claims from the database, save the quotation, get the new foreign keys, bulk create the claims, attach the histories back to the quotation, etc.
Is there another way to achieve this?
Note: I could separate the claim history from the Quotation class and manage the former using ADO and the latter using EF, but a lot of existing processes need the actual class design (not to mention that the user can actually attach many claim histories, which, of course, are sub-collections of sub-collections of sub-collections buried deep in the object tree...).
Many thanks in advance,
Sylvain.

I found a simple way to accomplish this:
// We will need a quotation Id (see below). Make sure we have one
SaveQuotation( myQuotation );
// Read the large claim history
var claimCollection = ImportClaimsFromExcelFile( fileName );
// Save the claim history without using EF. The quotation Id is needed to
// link the history to the quotation.
SaveClaimCollectionUsingSqlBulkCopy( claimCollection, myQuotation.Id );
// Now, ask EF to reload the quotation.
LoadQuotation( myQuotation.Id );
With a history of 60,000 claims, this code runs in 10 seconds. Using myObjectContext.SaveChanges(), even 10 minutes was not enough...
Thanks for your suggestions!
Note: Here is the code I used to bulk insert the claims:
using (var connection = new SqlConnection(constring))
{
    connection.Open();
    using (var copy = new SqlBulkCopy(connection))
    {
        copy.DestinationTableName = "ImportedLoss";
        copy.ColumnMappings.Add("ImporterId", "ImporterId");
        copy.ColumnMappings.Add("Loss", "Loss");
        copy.ColumnMappings.Add("YearOfLoss", "YearOfLoss");
        copy.BatchSize = 1000;
        // dt is a DataTable filled from the imported claim collection.
        copy.WriteToServer(dt);
    }
}
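The same idea — buffer the rows client-side, then hand the whole batch to the database in one call instead of one round-trip per row — can be sketched in Python with sqlite3's executemany. The simplified ImportedLoss table and the sample rows are assumptions for illustration, not the real schema:

```python
import sqlite3

# Minimal stand-in for the ImportedLoss table from the snippet above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ImportedLoss (ImporterId INTEGER, Loss REAL, YearOfLoss INTEGER)"
)

# Hypothetical claim rows standing in for the imported history.
claims = [(1, 1000.0, 2001), (1, 2500.0, 2003), (1, 400.0, 2005)]

with conn:  # one transaction for the whole batch
    # executemany sends every row in a single call, analogous to
    # SqlBulkCopy.WriteToServer(dt) receiving the whole DataTable.
    conn.executemany(
        "INSERT INTO ImportedLoss (ImporterId, Loss, YearOfLoss) VALUES (?, ?, ?)",
        claims,
    )

count = conn.execute("SELECT COUNT(*) FROM ImportedLoss").fetchone()[0]
print(count)  # 3
```

The speedup reported above comes from exactly this: one bulk operation replaces tens of thousands of individual INSERTs with change tracking.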

Because of the many round-trips made to the database in order to persist an entity, EF is not the best choice for bulk operations. That said, it looks like the EF team is looking into improving this: http://entityframework.codeplex.com/workitem/53
Also, have a look here: http://elegantcode.com/2012/01/26/sqlbulkcopy-for-generic-listt-useful-for-entity-framework-nhibernate/.
Another possible solution could be to add all your inserts within a single ExecuteSqlCommand, but then you would lose all the advantages of using an ORM.

Related

How to handle creating schema/tables on the fly for a multi-tenant web app

Problem
I'm building a web app, where each user needs to have segregated data (due to confidentiality), but with exactly the same data structures/tables.
Looking around, I think this concept is called multi-tenancy, and it seems as though a good solution is one schema per tenant.
I think sqlalchemy 1.1 implemented some support for this with
session.connection(execution_options={
    "schema_translate_map": {"per_user": "account_one"}})
However this seems to assume the schema and tables are already created.
I'm not sure how many tenants I'm going to have, so I need to create the schema, and the tables within them, on the fly, when the user's account is created.
Solution
What I've come up with feels like a bit of a hack, which is why I'm posting here to see if there's a better solution.
To create schemas on the fly I'm using
if not engine.dialect.has_schema(engine, user.name):
    engine.execute(sqlalchemy.schema.CreateSchema(user.name))
And then directly afterwards I'm creating the tables using
table = TableModel()
table.__table__.schema = user.name
table.__table__.create(db.session.bind)
With TableModel defined as
class TableModel(Base):
    __tablename__ = 'users'
    __table_args__ = {'schema': 'public'}

    id = db.Column(
        db.Integer,
        primary_key=True
    )
    ...
I'm not too sure whether to inherit from Base or db.Model; db.Model seems to automatically create the table in public, which I want to avoid.
Bonus question
Once the schemas are created, if, down the line, I need to add tables to all the schemas, what's the best way to manage that? Does Flask-Migrate handle that natively?
Thanks!
If anyone sees this in the future: this solution broadly works, but I've recently run into a problem.
This line
table.__table__.schema = user.name
seems to create some odd behaviour where the value of user.name persists in other areas of the app, so if you switch users, the previous user's table is incorrectly queried.
I'm not totally sure why this happens, and I'm still investigating how to fix it.
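One likely cause is that `table.__table__.schema = user.name` mutates the single Table object that every request shares, so the last-assigned schema leaks into other sessions. A sketch of a safer approach is to copy the table into a fresh MetaData per tenant instead of mutating the shared one; this assumes SQLAlchemy 1.4+, where the method is named `to_metadata` (older releases call it `tometadata`):

```python
from sqlalchemy import Column, Integer, MetaData, Table

# Shared, module-level metadata. Mutating tables registered here leaks
# state across requests, which matches the bug described above.
metadata = MetaData()
users = Table("users", metadata, Column("id", Integer, primary_key=True))

def table_for_tenant(tenant: str) -> Table:
    """Return a per-tenant copy of the table rather than mutating the
    shared definition. `tenant` is a hypothetical schema name."""
    tenant_md = MetaData(schema=tenant)
    return users.to_metadata(tenant_md, schema=tenant)

t_a = table_for_tenant("tenant_a")
t_b = table_for_tenant("tenant_b")
print(t_a.schema, t_b.schema, users.schema)  # tenant_a tenant_b None
# Each copy could then be created with t_a.create(engine, checkfirst=True).
```

Because the shared `users` table is never touched, switching tenants cannot bleed into other parts of the app.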

Best practice for RESTful API design updating 1/many-to-many relationship?

Suppose I have a recipe page where the recipe can have a number of ingredients associated with it. Users can edit the ingredients list and update/save the recipe. In the database there are these tables: recipes table, ingredients table, ingredients_recipes_table. Suppose a recipe has ingredients a, b, c, d but then the user changes it to a, d, e, f. With the request to the server, do I just send only the new ingredients list and have the back end determine what values need to be deleted/inserted into the database? Or do I explicitly state in the payload what values need to be deleted and what values need to be inserted? I'm guessing it's probably the former, but then is this handled before or during the db query? Do I read from the table first then write after calculating the differences? Or does the query just handle this?
I searched and I'm seeing solutions involving INSERT IGNORE... + DELETE ... NOT IN ... or using the MERGE statement. The project isn't using an ORM -- would I be right to assume that this could be done easily with an ORM?
Can you share what the user interface looks like? It would be pretty standard practice that you can either post a single new ingredient as an action or delete one as an action. You can simply have a button next to the ingredients to initiate a DELETE request, and have a form beneath for a POST.
Having the users input a list creates unnecessary complexity.
A common pattern to use would be to treat this like a remote authoring problem.
The basic idea of remote authoring is that we ask the server for its current representation of a resource. We then make local (to the client) edits to the representation, and then request that the server accept our representation as a replacement.
So we might GET a representation that includes a JSON array of ingredients. In our local copy, we remove the ingredients we no longer want and add the new ones in. Then we would PUT our local copy back to the server.
When the documents are very large, with changes that are easily described, we might instead send a PATCH request, with a "patch document" that describes the changes we have made locally.
When the server is just a document store, the implementation on the server is easy -- you can review the changes to decide if they are valid, compute the new representation (if necessary), and then save it into a file, or whatever.
When you are using a relational database, the server implementation needs to figure out how to update itself. An ORM library might save you a bunch of work, but there are no guarantees; people tend to get tangled up in the "object" end of the "object relational mapper". You may need to fall back to hand-rolling your own SQL.
An alternative to remote authoring is to treat the problem like a web site. In that case, you would get some representation of a form that allows the client to describe the change that should be made, and then submit the form, producing a POST request that describes the intended changes.
But you run into the same mapping problem on the server end -- how much work do you have to do to translate the POST request into the correct database transaction?
REST, alas, doesn't tell you anything about how to transform the representation provided in the request into your relational database. After all, that's part of the point -- REST is intended to allow you to replace the server with an alternative implementation without breaking existing clients, and vice versa.
That said, yes - your basic ideas are right; you might just replace the entire existing representation in your database, or you might instead optimize to only issue the necessary changes. An ORM may be able to effectively perform the transformations for you -- optimizations like lazy loading have been known to complicate things significantly.
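The "calculate the differences" idea from the question can be sketched in plain Python: read the current ingredient ids for the recipe, diff them against the submitted representation, and issue only the needed deletes and inserts. The table and column names here are assumptions modelled on the question, and sqlite3 stands in for the real database:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ingredients_recipes (recipe_id INTEGER, ingredient_id TEXT)"
)

def replace_ingredients(conn, recipe_id, submitted):
    """Make the join table match `submitted` by diffing against the
    current rows, instead of deleting and re-inserting everything."""
    current = {row[0] for row in conn.execute(
        "SELECT ingredient_id FROM ingredients_recipes WHERE recipe_id = ?",
        (recipe_id,))}
    submitted = set(submitted)
    with conn:  # one transaction, so readers never see a half-updated list
        for ing in current - submitted:
            conn.execute(
                "DELETE FROM ingredients_recipes"
                " WHERE recipe_id = ? AND ingredient_id = ?",
                (recipe_id, ing))
        conn.executemany(
            "INSERT INTO ingredients_recipes (recipe_id, ingredient_id)"
            " VALUES (?, ?)",
            [(recipe_id, ing) for ing in submitted - current])

replace_ingredients(conn, 1, ["a", "b", "c", "d"])
replace_ingredients(conn, 1, ["a", "d", "e", "f"])  # the edit from the question
rows = sorted(r[0] for r in conn.execute(
    "SELECT ingredient_id FROM ingredients_recipes WHERE recipe_id = 1"))
print(rows)  # ['a', 'd', 'e', 'f']
```

This is the "read first, then write the difference" variant; the DELETE ... NOT IN ... approach mentioned in the question pushes the same diff into the database instead.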

Get int value from database

How can I get an int value from the database?
The table has 4 columns:
Id, Author, Like, Dislike.
I want to get the Dislike amount and add 1.
I tried:
var db = new memyContext();
var amountLike = db.Memy.Where(s => s.IdMema == id).select(like);
memy.like=amountLike+1;
I know that this is a bad way.
Please help.
I'm not entirely sure what your question is here, but there's a few things that might help.
First, if you're retrieving something that reasonably has only one match, or in a scenario where you want just one thing, then you should use SingleOrDefault or FirstOrDefault, respectively, not Where. Where is reserved for scenarios where you expect multiple things to match, i.e. the result will be a list of objects, not a single object. Since you're querying by an id, it's fairly obvious that you expect just one match. Therefore:
var memy = db.Memy.SingleOrDefault(s => s.IdMema == id);
Second, if you just need to read the value of Like, then you can use Select, but there are two problems with that. First, Select can only be used on enumerables; as already discussed, you need a single object, not a list of objects. In truth, you can sidestep this in a somewhat convoluted way:
var amountLike = db.Memy.Where(x => x.IdMema == id).Select(x => x.Like).SingleOrDefault();
However, this is still flawed, because you not only need to read this value, but also write back to it, which then needs the context of the object it belongs to. As such, your code should actually look like:
var memy = db.Memy.SingleOrDefault(s => s.IdMema == id);
memy.Like++;
In other words, you pull out the instance you want to modify, and then modify the value in place on that instance. I also took the liberty of using the increment operator here, since it makes far more sense that way.
That then only solves part of your problem, as you need to persist this value back to the database as well, of course. That also brings up the side issue of how you're getting your context. Since this is an EF context, it implements IDisposable and should therefore be disposed when you're done with it. That can be achieved simply by calling db.Dispose(), but it's far better to use using instead:
using (var db = new memyContext())
{
    // do stuff with db
}
And while we're here, based on the tags of your question, you're using ASP.NET Core, which means that even this is sub-optimal. ASP.NET Core uses DI (dependency injection) heavily, and encourages you to do likewise. An EF context is generally registered as a scoped service, and should therefore be injected where it's needed. I don't have the context of where this code exists, but for illustration purposes, we'll assume it's in a controller:
public class MemyController : Controller
{
    private readonly memyContext _db;

    public MemyController(memyContext db)
    {
        _db = db;
    }

    ...
}
With that, ASP.NET Core will automatically pass in an instance of your context to the constructor, and you do not need to worry about creating the context or disposing of it. It's all handled for you.
Finally, you need to do the actual persistence, but that's where things start to get trickier, as you now most likely need to deal with the concept of concurrency. This code could be being run simultaneously on multiple different threads, each one querying the database at its current state, incrementing this value, and then attempting to save it back. If you do nothing, one thread will inevitably overwrite the changes of the other. For example, let's say we receive three simultaneous "likes" on this object. They all query the object from the database, and let's say that the current like count is 0. They then each increment that value, making it 1, and then they each save the result back to the database. The end result is the value will be 1, but that's not correct: there were three likes just added.
As such, you'll need to implement a semaphore to essentially gate this logic, allowing only one like operation through at a time for this particular object. That's a bit beyond the scope here, but there's plenty of stuff online about how to achieve that.
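An alternative to gating with a semaphore is to push the increment into a single atomic UPDATE, so the database applies each like in place and concurrent writers cannot overwrite one another (in EF Core this kind of raw statement could be issued via ExecuteSqlInterpolated). A minimal sketch using sqlite3, with the table and column names taken from the question rather than real code:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Like" must be quoted because LIKE is an SQL keyword.
conn.execute('CREATE TABLE Memy (IdMema INTEGER PRIMARY KEY, "Like" INTEGER)')
conn.execute('INSERT INTO Memy (IdMema, "Like") VALUES (1, 0)')

def add_like(conn, meme_id):
    # The read-modify-write race disappears: the database performs the
    # increment inside one statement instead of the client reading the
    # value, adding 1, and writing it back.
    with conn:
        conn.execute(
            'UPDATE Memy SET "Like" = "Like" + 1 WHERE IdMema = ?',
            (meme_id,))

for _ in range(3):  # the three "simultaneous" likes from the example
    add_like(conn, 1)

likes = conn.execute('SELECT "Like" FROM Memy WHERE IdMema = 1').fetchone()[0]
print(likes)  # 3
```

With the read-modify-write pattern from the example, three racing threads could each read 0 and write back 1; the single-statement form always lands on 3.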

Breeze column based security

I have a "web forms", database-first Entity Framework project using Breeze. I have a "People" table that includes sensitive data (e.g. SSN#). At the moment I have an IQueryable web api for GetPeople.
The current page I'm working on is a "Manage people" screen, but it is not meant for editing or viewing of SSN#'s. I think I know how to use the BeforeSaveEntity to make sure that the user won't be able to save SSN changes, but is there any way to not pass the SSN#s to the client?
Note: I'd prefer to use only one EDMX file. Right now the only way I can see to accomplish this is to have a "View" in the database for each set of data I want to pass to the client that is not an exact match of the table.
You can also use JSON.NET serialization attributes to suppress serialization of the SSN from the server to the client. See the JSON.NET documentation on serialization attributes.
Separate your tables. (For now, this is the only solution that comes to mind.)
Put your SSN data in another table with a related key (1 to 1 relation) and the problem will be solved. (Just handle your save in case you need it.)
If you are using Breeze this will work, because you have almost no control over the Breeze API interaction after the user logs in, so it is safer to separate your data. (Breeze is usually great, but in this case it's harmful.)

Does EF caching work differently for SQL Server CE 3.5?

I have been developing some single-user desktop apps using Entity Framework and SQL Server CE 3.5. I thought I had read somewhere that once records are in an EF cache for one context, if they are deleted using a different context, they are not removed from the cache of the first context, even when a new query is executed. Hence, I've been writing really inefficient and obfuscatory code so I could dispose the context and instantiate a new one whenever another method modified the database using its own context.
I recently discovered some code where I had not re-instantiated the first context under these conditions, but it worked anyway. I wrote a simple test method to see what was going on:
using (UnitsDefinitionEntities context1 = new UnitsDefinitionEntities())
{
    List<RealmDef> rdl1 = (from RealmDef rd in context1.RealmDefs
                           select rd).ToList();
    RealmDef rd1 = RealmDef.CreateRealmDef(100, "TestRealm1", MeasurementSystem.Unknown, 0);
    context1.RealmDefs.AddObject(rd1);
    context1.SaveChanges();
    int rd1ID = rd1.RealmID;
    using (UnitsDefinitionEntities context2 = new UnitsDefinitionEntities())
    {
        RealmDef rd2 = (from RealmDef r in context2.RealmDefs
                        where r.RealmID == rd1ID
                        select r).Single();
        context2.RealmDefs.DeleteObject(rd2);
        context2.SaveChanges();
        rd2 = null;
    }
    rdl1 = (from RealmDef rd in context1.RealmDefs select rd).ToList();
}
Setting a breakpoint at the last line I was amazed to find that the added and deleted entity was in fact not returned by the second query on the first context!
I see several possible explanations:

I am totally mistaken in my understanding that the cached records are not removed upon requerying.
EF is capricious in its caching and it's a matter of luck.
Caching has changed in EF 4.1.
The issue does not arise when the two contexts are instantiated in the same process.
Caching works differently for SQL CE 3.5 than for other versions of SQL Server.

I suspect the answer may be one of the last two options. I would really rather not have to deal with all the hassle of constantly re-instantiating contexts for single-user desktop apps if I don't have to.
Can I rely on this discovered behavior for single-user desktop apps using SQL CE (3.5 and 4)?
When you run the 2nd query on the ObjectSet, it requeries the database, which is why it reflects the change made by your 2nd context. Before we go too far into this, are you sure you want to have 2 contexts like you're explaining? Contexts should be short-lived, so it might be better if you cache your list in memory or do something else of that nature.
That being said, you can access the local store by calling ObjectStateManager.GetObjectStateEntries and viewing what is in the store there. However, what you're probably looking for is the .Local storage that's provided by DbSets in EF 4.2 and beyond. See this blog post for more information about that.
Judging by your class names, it looks like you're using an edmx, so you'll need to make some changes to your file to have your context expose DbSets instead of ObjectSets. This post can show you how.
Apparently explanation #1 was closest to the truth. Inserting the following statement at the end of the example:
var cached = context1.ObjectStateManager.GetObjectStateEntries(System.Data.EntityState.Unchanged);
revealed that the record was in fact still in the cache. Mark Oreta was essentially correct in that the database is actually re-queried in the above example.
However, navigational properties apparently behave differently, e.g.:
RealmDef distance = (from RealmDef rd in context1.RealmDefs
                     where rd.Name == "Distance"
                     select rd).Single();
SystemDef metric = (from SystemDef sd in context1.SystemDefs
                    where sd.Name == "Metric"
                    select sd).Single();
RealmSystem rs1 = (from RealmSystem rs in distance.RealmSystems
                   where rs.SystemID == metric.SystemID
                   select rs).Single();
UnitDef ud1 = UnitDef.CreateUnitDef(distance.RealmID, metric.SystemID, 100, "testunit");
rs1.UnitDefs.Add(ud1);
context1.SaveChanges();
using (UnitsDefinitionEntities context2 = new UnitsDefinitionEntities())
{
    UnitDef ud2 = (from UnitDef ud in context2.UnitDefs
                   where ud.Name == "testunit"
                   select ud).Single();
    context2.UnitDefs.DeleteObject(ud2);
    context2.SaveChanges();
}
udList = (from UnitDef ud in rs1.UnitDefs select ud).ToList();
In this case, breaking after the last statement reveals that the last query returns the deleted entry from the cache. This was my source of confusion.
I think I now have a better understanding of what Julia Lerman meant by "Query the model, not the database." As I understand it, in the previous example I was querying the database. In this case I am querying the model. Querying the database in the previous situation happened to do what I wanted, whereas in the latter situation querying the model would not have the desired effect. (This is clearly a problem with my understanding, not with Julia's advice.)