In my app I need to retrieve a large graph of entities, make various changes while detached (add entities, make changes, delete entities), then persist the graph back to the database. I've tried STEs but it's starting to over-complicate some aspects of my client tier, so I was hoping to keep things simple (at least on the client side) by using POCOs instead.
When it comes to persisting the changes, I was thinking I could retrieve the graph from the database again, and walk both this graph and the graph from the client tier, looking for differences:-
A deletion is actioned where an entity exists in the database graph but not the client graph. Presumably I can just .Remove() these from the database graph.
New entities are those in the client graph with an ID of 0. Presumably I can just .Add() these to the database graph.
I'm not sure the best way to deal with updates. I don't want to implement an "IsDirty" flag on my entities, and would prefer a more automatic solution. So a) is there a way to compare an entity in the client graph with its database counterpart to see if it has changed, and b) what's the best way of applying/merging the client entity into its database counterpart?
Once all this is done I presumably just call SaveChanges() on the database graph. I would also have to pass this graph back to the client, to ensure it has the latest version (database-generated values such as IDs, timestamps).
Is my solution too simplistic? If it's feasible, how can I deal with updates as outlined above?
Your solution can work, but it is not a simple thing to implement, especially if you try to make it generic (check the answers in this question; somebody offered a code base which should have this already implemented). It also causes a lot of additional traffic to the database, so it is definitely not a good approach when you expect heavy load. The simplest solution is what @Gert Arnold mentioned.
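The comparison described in the question can be sketched in a language-neutral way. The following is a minimal Python illustration, not Entity Framework code: the `Entity` class, its fields, and the helper names are all hypothetical, and it assumes a flat list of entities where `id == 0` marks a client-side addition. It classifies entities as added, deleted, or updated, and detects updates by comparing field values, so no "IsDirty" flag is needed.

```python
from dataclasses import dataclass, fields


@dataclass
class Entity:
    # Hypothetical flat entity; id == 0 marks an entity created on the client.
    id: int
    name: str
    price: float


def changed_fields(client, stored):
    """Return the names of non-key fields whose values differ."""
    return [
        f.name
        for f in fields(client)
        if f.name != "id" and getattr(client, f.name) != getattr(stored, f.name)
    ]


def diff_graphs(client_entities, db_entities):
    """Classify client entities as added, deleted, or updated against the database copy."""
    db_by_id = {e.id: e for e in db_entities}
    client_ids = {e.id for e in client_entities if e.id != 0}

    added = [e for e in client_entities if e.id == 0]
    deleted = [e for e in db_entities if e.id not in client_ids]
    updated = {}
    for e in client_entities:
        stored = db_by_id.get(e.id)
        if e.id != 0 and stored is not None:
            names = changed_fields(e, stored)
            if names:
                updated[e.id] = names
    return added, deleted, updated


def apply_update(client, stored, names):
    """Copy only the changed values onto the database-side entity."""
    for name in names:
        setattr(stored, name, getattr(client, name))


# Made-up sample data: one price change, one deletion, one addition.
db = [Entity(1, "Widget", 9.99), Entity(2, "Gadget", 19.99), Entity(3, "Gizmo", 4.50)]
client = [Entity(1, "Widget", 12.50), Entity(2, "Gadget", 19.99), Entity(0, "Doohickey", 1.00)]
added, deleted, updated = diff_graphs(client, db)
```

In Entity Framework's DbContext API, `context.Entry(stored).CurrentValues.SetValues(client)` plays roughly the role of `apply_update` for scalar properties, and the context's own change tracking then works out which columns were modified. A real object graph would also need the same walk applied recursively to child collections.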
Currently, I'm working on a Java EE project with some non-trivial requirements regarding persistence management. Changes to entities by users first need to be applied to some working copy before being validated, after which they are applied to the "live data". Any changes on that live data also need to have some record of them, to allow auditing.
The entities are managed via JPA, and Hibernate will be used as provider. That is a given, so we don't shy away from Hibernate-specific stuff. For the first requirement, two persistence units are used. One maps the entities to the "live data" tables, the other to the "working copy" tables. For the second requirement, we're going to use Hibernate Envers, a good fit for our use-case.
So far so good. Now, when users view the data on the (web-based) front-end, it would be very useful to be able to indicate which fields were changed in the working copy compared to the live data. A different colour would suffice. For this, we need some way of knowing which properties were altered. My question is, what would be a good way to go about this?
Using the JavaBeans API, a PropertyChangeListener could suffice to be notified of any changes in an entity of the working copy and keep a set of them. But the set would also need to be persisted, since the application could be restarted and changes can be long-lived before they're validated and applied to the live data. And applying the changes on the live data to obtain the working copy every time it is needed isn't feasible (hence the two persistence units).
We could also compare the working copy to the live data and find fields that are different. Some introspection and reflection code would suffice, but again that seems rather processing-intensive, not to mention the live data would need to be fetched.
Maybe I'm missing something simple, or someone knows of a wonderful JPA/Hibernate feature I can use. Even if I can't avoid making one or more separate database tables for storing such information until it is applied to the live data, some best practices or real-life experience with this scenario could be very useful.
I realize it's a semi-open question but surely other people must have encountered a requirement like this. Any good suggestion is appreciated, and any pointer to a ready-made solution would be a good candidate as accepted answer.
Maybe you can use the Hibernate flush entity event listener. The dirty properties are calculated before the flush. You can store them somewhere in your database.
Sample code that uses the dirty properties feature of Hibernate may give you an idea.
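The idea behind dirty properties can be illustrated without Hibernate. The sketch below is plain Python and does not use or imitate any Hibernate API: the `DirtyTracker` class, the `Person` entity, and the shape of the persisted record are all assumptions made for illustration. It keeps a snapshot of an entity's state as loaded, compares it with the current state at "flush" time, and serialises the changed property names so they could be stored in a side table and survive a restart.

```python
import json


class DirtyTracker:
    """Hypothetical helper: remembers loaded state and reports what changed since."""

    def __init__(self):
        self._snapshots = {}

    def snapshot(self, key, entity):
        # Store a copy of the state as it was loaded.
        self._snapshots[key] = dict(vars(entity))

    def dirty_properties(self, key, entity):
        # Compare current attribute values with the snapshot taken at load time.
        loaded = self._snapshots[key]
        current = vars(entity)
        return sorted(
            name
            for name in current
            if name not in loaded or current[name] != loaded[name]
        )


class Person:
    def __init__(self, name, email, city):
        self.name = name
        self.email = email
        self.city = city


tracker = DirtyTracker()
person = Person("Ada", "ada@example.org", "London")
tracker.snapshot(("Person", 1), person)

# Edits made to the working copy.
person.email = "ada@example.com"
person.city = "Paris"

dirty = tracker.dirty_properties(("Person", 1), person)
# Serialised form that could be written to a side table, keyed by entity type and id.
record = json.dumps({"entity": "Person", "id": 1, "changed": dirty})
```

The front end could then read the stored record and highlight exactly those fields, without fetching the live data for comparison.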
A little backstory
I have to develop a web application for college. This web application has to do with managing different locations using Google Maps, like pinning new locations, adding custom descriptions and so on. The login part is done using Facebook (login with Facebook). The more interesting part is that the client-server queries have to be done using REST.
The part that I'm trying to understand
If I use a database to store my users' unique IDs and their online status (online/offline), and somehow (I haven't actually settled on the idea) keep a JSON on the server that contains each user's pinned locations, would all this actually be OK with the REST paradigm?
I find mixed answers on the internet and I don't know how to think about the statelessness of the application correctly. A session would not be created, but the credentials from the database would be necessary for the users to communicate with each other.
The other side of the question
Supposing that I'm mistaken and I shouldn't use the database to store the credentials and locations like that, how am I supposed to keep all that data? I'm thinking of something like JSON cached client-side, but what if my client changes computers? Wouldn't this mean that he loses all his data? (Also, wouldn't this handicap MVC by not having a model?) How do I really keep track of everything?
You're making this way too hard on yourself. Try to keep it simple, since you probably have a deadline. REST is a way of using APIs with HTTP verbs like GET, POST, PUT, and DELETE. It says nothing about how to store the data behind your APIs.
As for storing the data, a database should be fine. Storing it as JSON in the DB could work, but in the end you'll have to parse the JSON every time you want to use it, so I would suggest that you store it in a DB in such a way that it can be read easily.
For a beginner (especially if you're doing this for a school project), I would definitely suggest that you set up a relational database like Microsoft SQL Server (Microsoft stack) or MySQL/PostgreSQL (I think this is what they'd use on Linux). If you want to skip the relational DB approach (because it might not be all that "easy" to get going), you can always try a NoSQL database like MongoDB.
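As a rough illustration of storing the pinned locations in a readable relational form rather than as one JSON blob, here is a minimal Python sketch using the standard-library `sqlite3` module. The table layout, function names, and URL shapes in the docstrings are assumptions for illustration, not part of the question. Each function corresponds to one HTTP verb and receives everything it needs (including the user ID) as arguments, so no server-side session is involved.

```python
import sqlite3


def open_store():
    # In-memory database for the sketch; a real deployment would use a file or a server.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE locations ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " user_id TEXT NOT NULL,"
        " lat REAL NOT NULL,"
        " lng REAL NOT NULL,"
        " description TEXT NOT NULL)"
    )
    return conn


def create_location(conn, user_id, lat, lng, description):
    """POST /users/{user_id}/locations"""
    cur = conn.execute(
        "INSERT INTO locations (user_id, lat, lng, description) VALUES (?, ?, ?, ?)",
        (user_id, lat, lng, description),
    )
    conn.commit()
    return cur.lastrowid


def list_locations(conn, user_id):
    """GET /users/{user_id}/locations"""
    rows = conn.execute(
        "SELECT id, lat, lng, description FROM locations WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    return [{"id": r[0], "lat": r[1], "lng": r[2], "description": r[3]} for r in rows]


def update_description(conn, user_id, location_id, description):
    """PUT /users/{user_id}/locations/{location_id}"""
    cur = conn.execute(
        "UPDATE locations SET description = ? WHERE id = ? AND user_id = ?",
        (description, location_id, user_id),
    )
    conn.commit()
    return cur.rowcount == 1


def delete_location(conn, user_id, location_id):
    """DELETE /users/{user_id}/locations/{location_id}"""
    cur = conn.execute(
        "DELETE FROM locations WHERE id = ? AND user_id = ?", (location_id, user_id)
    )
    conn.commit()
    return cur.rowcount == 1
```

Keeping data in a database does not conflict with REST statelessness: the constraint is about not relying on per-client session state between requests, not about avoiding stored resources.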
Relevant links to help:
http://rest.elkstein.org/ (REST explained)
http://www.restapitutorial.com/lessons/httpmethods.html (REST verbs)
http://en.wikipedia.org/wiki/Relational_database (what is a relational db)
http://en.wikipedia.org/wiki/Database_normalization (kind of the goal of a relational DB, but note that you can go too far: http://lemire.me/blog/archives/2010/12/02/over-normalization-is-bad-for-you/)
http://www.mongodb.com/nosql-explained (NoSQL explanation)
OK. I know that Entity Framework is an ORM. We use it for mapping data from the database to an object model, and from objects to relational data. But where does it fit in the context of a persistence layer? Can we say that the persistence layer is also Entity Framework?
I would say no! There are a lot of articles about this topic. But in general you don't want your object-relational mapper to be your persistence layer. In fact, exactly the opposite: by keeping it persistence-ignorant, you can use your data classes with different types of data providers, such as relational databases, web services, XML files and so on.
To handle data persistence you can take advantage of design patterns like Repository and Unit of Work, so you can really decouple your business layer from your data layer.
OK, to make myself clear, since it's very difficult through comments, here's an update on what I wanted to explain. Please keep in mind that this is just my interpretation and the way I'm using EF. I've been using it in different projects (desktop and web), and while it's not universal, it still covers a lot of the most common scenarios.
Since I'm a big fan of Code First, I'll write from this perspective. The Database Model is where your entities live. Later on, EF will generate your database based on those entities. So what is important at this stage of development is that you want your database normalized and all navigation properties set correctly. These tasks are not as trivial as they may seem, but that's it: at this point you only care about how efficient your database will be.
Now comes the tricky part: somehow you have to deliver your data to the business layer. It's true that, as long as we are talking only about data from a database, the value of a repository is very arguable. Even then, however, one advantage of having a Repository between the data and the business logic is that you don't have to keep the business needs in mind while creating the data model, and this doesn't make the data any harder to use from inside the business layer, even though you may not know exactly what your front end will look like at the time you create the database model.
So at this point let's consider again the example case where your Database Model has two entities, Customers and Orders. When a user logs in to your application and wants to see his orders, you need to join two tables to give the front end the information it needs.
Option 1: you don't have a Repository and you use the DbContext directly from the method that returns the data. That means two things. First, you have to write the same code everywhere you need this specific piece of information. Second, if the business requirements change, and the view that until now showed a customer and his orders must also show some additional info taken from, let's say, a third table, you have to go to each place where you use this view and change the way you retrieve the data.
Option 2: you have a Repository. All your methods for accessing data are stored there, the Business Layer is completely ignorant of how it gets the data, and the Database Model is ignorant of the needs of the business model, which leads to loose coupling and only one place to change if you have to. In the scenario above, suppose your repository has a method called GetUserOrders() that makes the database call, the joins and so on, and all the Business Layer needs to do to get the data in the proper form is call this method. When the requirements change and you have to include one more table, you don't have to look for all the places where you use this data; you just modify one method and that's all.
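To make the second option concrete, here is a minimal Python sketch of the same idea, using the standard-library `sqlite3` module in place of EF and a DbContext. The class name `OrderRepository`, the table layout, and the sample rows are hypothetical; only the method name mirrors the GetUserOrders() example above. The business-layer function knows nothing about tables or joins, so adding a third table later would change only the repository method.

```python
import sqlite3


class OrderRepository:
    """Hypothetical repository: the only place that knows how orders are queried."""

    def __init__(self, conn):
        self._conn = conn

    def get_user_orders(self, customer_id):
        # The join lives here; callers never see table or column names.
        rows = self._conn.execute(
            "SELECT c.name, o.id, o.total"
            " FROM customers c JOIN orders o ON o.customer_id = c.id"
            " WHERE c.id = ? ORDER BY o.id",
            (customer_id,),
        ).fetchall()
        return [{"customer": r[0], "order_id": r[1], "total": r[2]} for r in rows]


def order_summary(repository, customer_id):
    """Business-layer function: depends on the repository, not on SQL."""
    orders = repository.get_user_orders(customer_id)
    return {"count": len(orders), "total": sum(o["total"] for o in orders)}


# Made-up schema and rows for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
    """
)
summary = order_summary(OrderRepository(conn), 1)
```

Because `order_summary` only needs an object with a `get_user_orders` method, the repository could equally be backed by a web service or a test double.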
It's pretty much the same logic on the way back. When some complex data comes back from your front end and you want to save it or update the old data with it, you can again do that from the business layer, but it leads to the same problem as when fetching data. Instead, you pass the complex data to another Repository method that knows how to deal with it (maybe some of the data should be saved directly to the database and some should be fed to a web service, or whatever scenario comes to mind). Here again, when something changes, say you want to rely more heavily on web services or, on the contrary, migrate to a more database-centric design, all you have to do is change the method that handles the data affected by the change, and nothing more.
So even though, as I'm writing this, I can see that DbContext can very well act as a repository, and in this regard also as a data persistence layer, there are still some valid reasons not to let this happen. Especially right now, when web services are more and more popular, WebAPI2 is out and RESTful services are frequently used, I think that leaving EF as persistence-ignorant as possible is the way to go.
But yet again, this is my opinion. There are a lot of articles on this topic, so I urge you to google and read about it, since I think this is a very important part of the architecture of every application.
P.S
In response to your comment which was written while I was writing my edited answer:
"If I change data source I need to make changes in DAL anyway, or in my example in the repository." The answer is yes. There is no way to change the data source without changing the DAL. The question is how easy it will be to do that. I think that with what I've already written you can decide for yourself which way is better, but because I really think this is one of the few truly strong arguments for leaving EF persistence-ignorant, I'll write it again. When you have a Repository with methods that take care of data manipulation, any change in the way the data is fetched affects only those methods and nothing else. If you use the context freely in your business layer, even a small change may cause you a lot of trouble, because it is always possible to miss something: you have to go through the entire code to make sure you have fixed all the places, and that's just not as efficient as having everything in one place.
This is a question on best practice. I understand that there are a lot of different options for doing this, but I would like your opinions on how you would approach solving this problem. Please assume that performance is critical in this system; in other words, it must be scalable.
I have recently discovered the wonders of graph databases, so I came up with a theoretical situation where a company wants to manage its customer relationships, and to do so it is going to use neo4j. That allows for really good management of the customers, the different staff members and their relationships. However, the company now wants to create a web-based interface which will need authentication, and anyone in the neo4j database should be able to log in to the system to see how they are related to other people in the company's database, so each user must have a password/email/id associated with their name.
So my question is: in this scenario, is it best to store the password_hash/password_salt/id/email in a MySQL database and then look it up there based on the node? Or is it better to store the password_hash/password_salt/id/email in the hash tables inside the nodes?
Also, each store has thousands of products. They can be stored in the graph database, or I can store the products in the MySQL database and then look them up and make changes there. The products are not related to each other, so there seems to be no point in storing them in the graph database. Should they be kept out of it to improve performance?
So my question boils down to this: is it best for large projects to use a graph database along with a more common RDBMS such as MySQL? If not, then at what point do you start to use these two database systems together?
Apologies in advance for my lack of knowledge regarding database terminology.
A graph DB is mainly used for maintaining relations. If an app has a graph DB, that does not mean the app needs to store everything in it.
Every node request on the graph is served from memory, so unnecessary properties bloat the node, which may make things slower and use more memory. I usually decide what goes in the graph and what goes in the DB by a very simple rule.
High-level properties (those that define the relation, and other important properties that define the node) go in the graph, whereas additional information goes in the RDBMS.
For example, at Facebook maybe the FBID and Name go in the graph, as they define the relationship of one node with another. But when a user clicks on someone's Facebook ID, he/she gets to see that user's DOB, age and college. All of these can go in the RDBMS.
PS: An RDBMS has another advantage: it can be used for quick analytics. I know you can do that with a graph too, but I am not sure if it's as scalable and easy as with an RDBMS.
The downside to this approach is that you need to maintain two DBs.
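The split described in this answer can be sketched as follows. This is a toy Python illustration only: plain dictionaries stand in for the graph database (no neo4j driver is used), the standard-library `sqlite3` module stands in for the RDBMS, and all names and sample values are made up. The graph side holds just the identity and relationships, while bulkier profile attributes live in a relational table keyed by the same ID.

```python
import sqlite3

# Graph side (stand-in for a graph database): only identity and relationships.
graph_nodes = {1: {"name": "Alice"}, 2: {"name": "Bob"}, 3: {"name": "Carol"}}
graph_edges = {1: {2, 3}, 2: {1}, 3: {1}}

# Relational side: bulkier profile attributes, keyed by the same id.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE profiles (id INTEGER PRIMARY KEY, dob TEXT, college TEXT)")
db.executemany(
    "INSERT INTO profiles VALUES (?, ?, ?)",
    [(1, "1990-01-15", "MIT"), (2, "1988-07-02", "Stanford"), (3, "1992-11-30", "Oxford")],
)


def friends_of(user_id):
    """Traversal touches only the slim graph nodes."""
    return sorted(graph_nodes[f]["name"] for f in graph_edges.get(user_id, ()))


def profile_of(user_id):
    """Detail lookup happens only when someone opens a profile."""
    row = db.execute(
        "SELECT dob, college FROM profiles WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    return {"name": graph_nodes[user_id]["name"], "dob": row[0], "college": row[1]}
```

The cost noted above shows up even in this sketch: the two stores share an ID, and the application has to keep them consistent itself.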
Unless you have a proven case for a two-DB solution, I'd say fewer moving parts would keep you more agile and more able to change things quickly. If later you find a use case that is difficult, then weigh up the cost/benefit of introducing a second store. A two-DB architecture is not unheard of, but it comes with an overhead.
Specific to security, there is no reason why Neo4j or any other reasonable NoSQL solution couldn't do that: http://spring.neo4j.org/docs#tutorial_security
You should use both in case there is data where it does not make much sense to store it in a graph DB such as neo4j/orientDB (and some data would be better off in a graph DB as opposed to a relational DB). Forcing data on one platform may cause issues with performance/scalability down the line.
I like working with the Entity Framework for many reasons: the ease of use of the entity designer, the power of LINQ, and the ease of binding.
Occasionally I want to build a simple app that doesn't need a database but still needs to work with data and display it on screen, in grids etc. I'd like to just create a quick EF model and use it for this, but it doesn't seem to work very well with local data only.
My question is: is there a correct way to use EF for working with local data, and perhaps then just serialize/deserialize the whole context to a file? Or is this just too much effort to make work properly? I used to use Datasets in this way, along with LINQ to DataSet, and it worked well. So perhaps those are still the better way to go for this scenario?
Yes, you can use Entity Framework locally and also access the data that is currently in memory. See the link below for details:
http://msdn.microsoft.com/en-us/data/jj592872.aspx
I don't know what you mean by "local data" exactly (sounds like it's not a database), but I think the Datasets vs. EF portion of your post is (for me) the real question.
EF is great when you need to model robust business logic, are implementing a Domain Model pattern, using Domain Driven Design, etc: basically any scenario where a Table Module or Active Record pattern is inappropriate.
When you just need to display some grids of data, and the business logic is very simple, Datasets are definitely the way to go (in my experience).