Self Tracking Entities versus timestamp column in database - entity-framework

In an optimistic concurrency scenario for a web app, I am considering giving each table a timestamp column (SQL Server), comparable to a GUID. LINQ to Entities will then generate SQL update queries like WHERE id = @p0 AND timestamp = @p1 when one decorates the timestamp column with a certain attribute in Entity Framework. When the number of updated records returned is 0, we have detected a concurrency conflict.
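For reference, here is a minimal sketch of that attribute-based setup (EF Code First data annotations; the entity and property names are illustrative):

using System.ComponentModel.DataAnnotations;

public class Invoice
{
    public int Id { get; set; }
    public string Status { get; set; }

    // Maps to a SQL Server rowversion/timestamp column. EF includes it in
    // the WHERE clause of generated UPDATE/DELETE statements and reports a
    // concurrency violation when zero rows are affected.
    [Timestamp]
    public byte[] RowVersion { get; set; }
}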
In a lot of posts I am reading about Self-Tracking Entities, which may be an alternative or better solution, but I don't see any advantage over the "simple" timestamp method described above, apart from the scenario where the database schema is fixed and doesn't offer a timestamp column.
Which solution is better and why?
EDIT
Yury Tarabanko correctly states that STE is another concept.
However, zeeshanhirani's answer demonstrates that concurrency checking is one of the main motives for tracking changes.
Let's rephrase the question: why would anybody use the STE concept for concurrency checking when the "timestamp column" method looks so much easier?

You are mixing two concepts here. STE is not about concurrency at all.
Self-tracking entities just know how to do their own change tracking, regardless of how those changes were made. So you always know the current state of the entity object graph, and you don't need to invoke additional change detection.
See: What is STE.
EDIT:
"concurrency check is one main motive to track changes"
AFAIK, STEs and POCOs share the same approach to concurrency checking, which simply results in additional WHERE condition(s) in the UPDATE statement sent to the DB, equivalent to this:
UPDATE [schema].[table]
SET [prop1] = value1, ...
WHERE [key] = key_value
AND [concurrency_prop_1] = concurrency_prop1_old_value
AND [concurrency_prop_2] = concurrency_prop2_old_value
So "the main motive" to track changes is, well, to track changes in an N-tier app.
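Either way, a conflict detected through those WHERE conditions surfaces as an exception when saving; a minimal sketch of handling it (ObjectContext API; the entity variable is illustrative):

using System.Data;          // OptimisticConcurrencyException
using System.Data.Objects;  // RefreshMode

try
{
    context.SaveChanges();
}
catch (OptimisticConcurrencyException)
{
    // Zero rows matched the UPDATE: someone else changed the record.
    // Reload the database values for the entity, then retry or report.
    context.Refresh(RefreshMode.StoreWins, entity);
    context.SaveChanges();
}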

Self-Tracking Entities actually work with the concept you described. An STE basically tracks changes to the object when the context is not around. However, when it sends its changes back to the server using a WCF service, it sends the current values of the properties, the new state of the entity, and also the original values of the primary key columns, independent association values, and the original values of any columns that are marked as Concurrency=Fixed in the entity data model.

Related

Lagom persistent read side and model evolution

To learn Lagom I created a simple application with some simple persistent entities and a persistent read side (as per the official documentation, using Cassandra).
The official doc contains a section about model evolution, describing how to change the model. However, there is no mention of evolution when it comes to the read side.
Assuming I have an entity called Item, with an ID and a name, and the read side creates a table like CREATE TABLE IF NOT EXISTS items (id TEXT, name TEXT, PRIMARY KEY (id))
I now want to change the item to include a description. This is trivial for the persistent entity, but the read side has to be changed as well.
I can see several approaches to (maybe) achieve that:
use a model evolution tool like liquibase or play evolutions to change the read side tables.
somehow include update table statements in createTables that migrate the model
create additional tables containing the additional information, and keep the old tables without modifications
Which approach would be the most fitting? Is there something better?
Creating a new table and dropping the old table is an option too IMHO.
It is as simple as modifying your "create table" command ("create table mytable_v2 ..." and "drop table mytable ..."), changing the offset name, and modifying your event handlers.
override def buildHandler(): ReadSideProcessor.ReadSideHandler[MyEvent] = {
  readSide.builder[MyEvent]("myOffset") // change it to "myOffset_v2"
  ...
}
This results in all events being replayed and your read side table being reconstructed from scratch. This may not be an option if the current table is really huge, as the reconstruction may take a very long time.
Regarding what @erip says, I see it as perfectly normal to add a new column to your read side table. Suppose there are lots of records in this table listing all entities, and you want to retrieve a list of entities based on some criteria, so you need some columns to be included in the WHERE clause. Retrieving the list of all entities and asking each of them whether it complies with the criteria is not an option at all: it could be very inefficient, as it needs more time, memory and network usage.
The point of a read side is to materialize views from the entity state changes in your service's event stream. In this respect, you as the owner of the service can decide what is important for your subscribers to know about. This is handled by creating read sides with an anti-corruption layer (ACL).
Typically your subscribers will subscribe to API events which should experience no evolution. Your internal events (or impl events) will likely need to evolve; because of this, there should be a transformation from the impl to the API.
This is why it's very important to consider your domain very carefully before design: you really need to nail down what subscribers will need to know about. In the case of a description, it strikes me as unlikely that subscribers will need (or want!) to know about that.

Copy records between two databases using EF

I need to copy data from one database to another with EF. E.g. I have the following table relations: Forms -> FormVersions -> FormLayouts... We have different forms in both databases and we want to collect them into one DB. Basically I want to load a Form object recursively from one DB and save it to another DB with all its references. I also need to change the IDs of the object and related objects if objects with the same ID already exist in the second database.
Until now I have the following code:
Form form = null;
using (var context = new FormEntities())
{
    form = (from f in context.Forms
            join fv in context.FormVersions on f.ID equals fv.FormID
            where f.ID == 56
            select f).First();
}
var context1 = new FormEntities("name=FormEntities1");
context1.AddObject("Forms", form);
context1.SaveChanges();
I'm receiving the error: "The EntityKey property can only be set when the current value of the property is null."
Can you help with implementation?
The simplest solution would be to create a copy of your Form (a new object) and add that new object. Otherwise you can try the following, as sketched below:
Call context.Detach(form)
Set form's EntityKey to null
Call context1.AddObject(form)
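A minimal sketch of those three steps (ObjectContext API; this assumes the same Form query as in the question):

Form form;
using (var source = new FormEntities())
{
    form = source.Forms.First(f => f.ID == 56);
    source.Detach(form);              // 1. detach from the source context
}
form.EntityKey = null;                // 2. clear the key so the entity can be re-added
using (var target = new FormEntities("name=FormEntities1"))
{
    target.AddObject("Forms", form);  // 3. add to the target context
    target.SaveChanges();
}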
I would first second E.J.'s answer. Assuming, though, that you are going to use Entity Framework, one of the main problem areas you will face is relationship management. Your code should use the Include method to ensure that related objects are included in the results of a select operation. The join that you have will not have this effect.
http://msdn.microsoft.com/en-us/library/bb738708.aspx
Further, detaching an object will not automatically detach the related objects. You can detach them in the same way; however, the problem here is that as each object is detached, the relationships it held to other objects within the context are broken.
Manually restoring the relationships may be an option for you however it may be worthwhile looking at EntityGraph. This framework allows you to define object graphs and then perform operations such as detach upon them. The entire graph is detached in a single operation with its relationships intact.
My experience with this framework has been in relation to RIA Services and Silverlight however I believe that these operations are also supported in .Net.
http://riaservicescontrib.codeplex.com/wikipage?title=EntityGraphs
Edit 1: I just checked the EntityGraph docs and see that DetachEntityGraph is in the RIA-specific layer, which unfortunately rules it out as an option for you.
Edit 2: Alex James's answer to the following question is a solution to your problem: don't load the objects into the context to begin with - use the NoTracking option. That way you don't need to detach them, which is what causes the problem.
Entity Framework - Detach and keep related object graph
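A sketch of that no-tracking variant (assuming an ObjectSet exposing MergeOption; the Include path is illustrative):

using (var source = new FormEntities())
{
    // NoTracking returns detached entities, so no Detach call is needed.
    source.Forms.MergeOption = MergeOption.NoTracking;
    var form = source.Forms
                     .Include("FormVersions")
                     .First(f => f.ID == 56);

    using (var target = new FormEntities("name=FormEntities1"))
    {
        target.Forms.AddObject(form);
        target.SaveChanges();
    }
}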
If you are only doing a few records, Ladislav's suggestion will probably work, but if you are moving lots of data, you should/could consider doing this move in a stored procedure. The entire operation can be done at the server, with no need to move objects from the DB server to your front end and then back again. A single SP call would do it all.
The performance will be a lot better, which may or may not matter in your case.

Change tracking with EF over time?

I'd like to know the best practice for tracking and/or persisting changes over time if I use EF. I'd like to get started with EF for a new project. What I need is a kind of change history.
That's how I did it before: if a record was created, it was saved with an ID and with the same ID as its InvariantID. If the record was updated, I marked it as deleted and created a new record with the new values and a new ID, but the same InvariantID. This way I always had my current record, but also a history of changes.
This works perfectly fine for my scenarios. The amount of historical records is not an issue because I use this usually only for data that's not changing very often.
Is this built into EF somehow, or what's the best way to get this behavior with EF?
No, it is not built into EF, and it will not work this way. I don't even think it is a good approach at the database level, because it makes referential integrity very complex.
With EF this will work only if you use the following approach:
You will use conditional mapping for your entity - the condition will be IsDeleted = 0. It ensures that only non-deleted entities are used in queries.
You will have a mapped stored procedure for the delete operation that correctly sets IsDeleted = 1 instead of really deleting the record.
You will have to manually call DeleteObject to delete your record and after that insert the new record (see the sketch after this list) - the reason is that EF is not able to deal with a scenario where an entity changes its PK value during an update.
Your entities will not be able to participate in relations unless you manually rebuild referential integrity with some other stored procedure.
You will need a stored procedure to query historical (deleted) records.
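To illustrate the DeleteObject-then-insert point, a rough sketch of what an "update" then looks like in application code (entity and property names are illustrative; the conditional mapping and delete stored procedure live in the model, not in this snippet):

// Mark the current record as deleted (via the mapped delete SP) and
// insert a replacement that keeps the InvariantID but gets a new PK.
var current = context.Records.First(r => r.InvariantID == invariantId);
context.DeleteObject(current);

var replacement = new Record
{
    InvariantID = invariantId,
    Name = newName
};
context.Records.AddObject(replacement);
context.SaveChanges();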

I don't need/want a key!

I have some views that I want to query with EF 4.1. These are specific optimized views that will not have keys to speak of; there will be no deletions, no updates, just good ol' SELECT.
But EF wants a key set on the model. Is there a way to tell EF to move on, there's nothing to worry about?
More Details
The main purpose of this is to query against a set of views that have been optimized by size, query parameters and joins. The underlying tables have their PKs, FKs and so on. It's indexed, statisticized (is that a word?) and optimized.
I'd like to have a class like (this is a much smaller and simpler version of what I have...):
public class MyObject // this is a view
{
    public string Name { get; set; }
    public int Age { get; set; }
    public int TotalPimples { get; set; }
}
and a repository, built on EF 4.1 Code First, where I can just
public List<MyObject> GetPimply(int numberOfPimples)
{
    return db.MyObjects.Where(d => d.TotalPimples > numberOfPimples).ToList();
}
I could expose a key, but what's the real purpose of displaying a 2 or 3 column natural key that will never be used?
Current Solution
Seeing as there will be no EF CF solution, I have added a composite key to the model and I am exposing it in the model. While this goes "with the grain" of what one expects a "well designed" DB model to look like, in this case, IMHO, it added nothing but more logic to the model builder, more bytes over the wire, and extra properties on a class. These will never be used.
There is no way. EF demands unique identification of the record - an entity key. That doesn't mean that you must expose any additional column. You can mark all your current properties (or any subset) as the key - that is exactly what the EDMX designer does when you add a database view to the model: it goes through the columns and marks all non-nullable and non-computed columns as the primary key.
You must be aware of one problem - EF internally uses an identity map, and the entity key is the unique identification in this map (each entity key can be associated with only a single entity instance). This means that if you cannot choose a unique identification of the record and you load multiple records with the same identification (your defined key), they will all be represented by a single entity instance. I'm not sure if this will cause you any issues if you don't plan to modify these records.
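For example, a minimal EF 4.1 Code First sketch that maps the view and marks all of its columns as a composite key (the view name is illustrative):

protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Entity<MyObject>()
                .HasKey(v => new { v.Name, v.Age, v.TotalPimples })
                .ToTable("MyObjectView"); // read-only view mapped like a table
}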
EF is looking for a unique way to identify records. I am not sure if you can force it to go counter to its nature of desiring something unique about objects.
But, this is an answer to the "show me how to solve my problem the way I want to solve it" question and not actually tackling your core business requirement.
If this is an "I don't want to show the user the key" issue, then don't bind the key when you bind the data to your form (web or Windows). If this is an "I need to share these items, but don't want to give them the keys" issue, then map or surrogate the objects into an external domain model. It adds a bit of weight to the solution, but allows you to still do the heavy lifting with a drag-and-drop surface (EF).
The question is what is the business requirement that is pushing you to create a bunch of objects without a unique identifier (key).
One way to do this would be not to use views at all.
Just add the tables to your EF model and let EF create the SQL that you are currently writing by hand.
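A sketch of what that might look like, with the view's logic expressed as a LINQ projection over the underlying tables (entity and property names are illustrative):

public List<MyObject> GetPimply(int numberOfPimples)
{
    // EF generates the joins/aggregation itself; MyObject stays key-less
    // because it is just a projection, not a mapped entity.
    return db.People
             .Where(p => p.Pimples.Count() > numberOfPimples)
             .Select(p => new MyObject
             {
                 Name = p.Name,
                 Age = p.Age,
                 TotalPimples = p.Pimples.Count()
             })
             .ToList();
}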

Create new or update existing entity at one go with JPA

I have a JPA entity that has a timestamp field and is distinguished by a complex identifier field. What I need is to update the timestamp in an entity that has already been stored, and otherwise create and store a new entity with the current timestamp.
As it turns out, the task is not as simple as it seems at first sight. The problem is that in a concurrent environment I get a nasty "Unique index or primary key violation" exception. Here's my code:
// Load existing entity, if any.
Entity e = entityManager.find(Entity.class, id);
if (e == null) {
    // Could not find entity with the specified id in the database, so create a new one.
    e = entityManager.merge(new Entity(id));
}
// Set current time...
e.setTimestamp(new Date());
// ...and finally save the entity.
entityManager.flush();
Please note that in this example entity identifier is not generated on insert, it is known in advance.
When two or more threads run this block of code in parallel, they may simultaneously get null from the entityManager.find(Entity.class, id) call, so they will attempt to save two or more entities at the same time with the same identifier, resulting in an error.
I think there are a few solutions to the problem:
Sure, I could synchronize this code block with a global lock to prevent concurrent access to the database, but would that be the most efficient way?
Some databases support a very handy MERGE statement that updates an existing row or creates a new one if none exists. But I doubt that OpenJPA (the JPA implementation of my choice) supports it.
Even if JPA does not support SQL MERGE, I can always fall back to plain old JDBC and do whatever I want with the database. But I don't want to leave the comfortable API and mess with a hairy JDBC+SQL combination.
There is a magic trick to fix it using the standard JPA API only, but I don't know it yet.
Please help.
You are referring to the transaction isolation of JPA transactions, i.e. the behaviour of transactions when they access other transactions' resources.
According to this article:
READ_COMMITTED is the expected default Transaction Isolation level for using [..] EJB3 JPA
This means that - yes, you will have problems with the above code.
But JPA doesn't support custom isolation levels.
This thread discusses the topic more extensively. Depending on whether you use Spring or EJB, I think you can make use of the proper transaction strategy.