I am starting a new project that needs to be portable and in some scenarios will have hundreds of millions of entities.
Now that Azure is getting Hadoop, that of course got my attention for the big data scenario. But I also have the small data scenario, with under 1 million rows.
Entity Framework Code First is the way I see designing this, but needing Hadoop in the mix may complicate things (Entity Framework being used to give simpler storage providers for the smaller data sets).
Now the question is does anyone have experience in this mix?
Can anyone recommend whether this is a good approach or not? If not, is there a better way?
I am working on a reasonably large system based on Entity Framework Code First. Some caveats: I have been working with EF4 and cannot upgrade to 5, your mileage may vary, and the outcome will be strongly affected by what you intend to do. My experience has been that EF doesn't handle large amounts of data tremendously well and that performance isn't amazing. It is also quite inflexible, so if you need to change its standard behaviour in some way, there is a good chance you will end up hacking together some nasty workarounds. If you want to do things that aren't exactly what EF expects you to do, you can run into walls.
If I were designing a relatively simple, small-scale ASP.NET MVC setup, I think EF would be a really good choice. For a larger-scale operation where you need more flexibility, or where you are planning to go beyond basic operations, you might find that something like NHibernate works better. I don't have experience with that, but colleagues who have worked with both tend to prefer NHibernate. (Brief article on the comparison - a bit old, so EF has addressed some, but not all, of the points there. It is also different by design, of course.)
It may be that for more high-traffic or unusual stuff you need to roll your own data access anyway, just to achieve the right performance or to be able to find the right data. Unquestionably, if you are planning to work with EF, I strongly recommend some serious prototyping to ensure it can do what you need.
I've spent the last couple of months, on and off, with CFWheels.
It has a lot going for it in terms of simplicity, but I always come up against an issue or two that I have a hard time troubleshooting. The biggest of them is getting a many-to-many relationship to work properly.
I'm switching between two environments, Railo/MySQL and CF/MSSQL, so it would be nice if it could work on both.
I'm trying to roll out a web application in a limited amount of time, and I've already spent too much time on CFWheels.
Can anyone recommend a framework that makes creating many-to-many relationships and the related CRUD easy, and that has a big community?
Some of the ones I've seen mentioned frequently are Mach-II, Fusebox, Model-Glue and ColdBox.
Most of the frameworks you list do not have a built-in ORM like CFWheels does. This means you'll be using either straight SQL queries or the CF9 (Hibernate) ORM. I think it is only fair to mention that both of those options are also available for CFWheels.
I've written a fairly large app in CFWheels. In my app there were several instances of many-to-many relationships, and I was able to make it work without too much pain. That being said, I have felt your frustration with the CFWheels ORM. It can be clunky once you get to complex relationships. In those cases, I've had to make a judgement call as to whether it was worth it to try to build a query using the ORM, or just build a custom SQL query and store it in the CFC for my model. In fact, for 99% of my report queries for this app, I just resorted to writing the SQL in the model. But for CRUD operations, this wasn't really a limiting factor.
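To make the "custom SQL query stored in the model" option concrete, here is a minimal sketch of a many-to-many relationship handled through a join table with hand-written SQL. It is written in Python with the standard sqlite3 module rather than CFML, and the table names (authors, books, authors_books) and function names are hypothetical, chosen only for illustration; the same idea would carry over to a query kept in a model CFC.

```python
import sqlite3


def create_schema(conn):
    # Two entity tables plus a join table that carries the many-to-many link.
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE authors_books (
            author_id INTEGER NOT NULL REFERENCES authors(id),
            book_id INTEGER NOT NULL REFERENCES books(id),
            PRIMARY KEY (author_id, book_id)
        );
        """
    )


def link(conn, author_id, book_id):
    # The composite primary key makes linking the same pair twice a no-op.
    conn.execute(
        "INSERT OR IGNORE INTO authors_books (author_id, book_id) VALUES (?, ?)",
        (author_id, book_id),
    )


def books_for_author(conn, author_id):
    # Hand-written join: the kind of query that can live in the model
    # when expressing it through the ORM becomes too clunky.
    rows = conn.execute(
        """
        SELECT b.title
        FROM books AS b
        JOIN authors_books AS ab ON ab.book_id = b.id
        WHERE ab.author_id = ?
        ORDER BY b.title
        """,
        (author_id,),
    )
    return [title for (title,) in rows]
```

Keeping the join behind a single function means callers do not care whether the result came from an ORM or from raw SQL, which makes it easier to switch later.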
I'm curious what specific problems you're experiencing with Wheels - care to post an example?
Yes, the ORM in CFWheels can be buggy at times. If you encounter a bug, or even what you THINK might be a bug, we want to know. Please take the time to file a bug report so we can investigate. That all being said, I'm very surprised that the CF community hasn't taken notice of Don Humphrey's ORM called CFRel. It's probably one of the biggest things to happen to CFML since Fusebox.
Oh... and there is even a CFWheels plugin for it.
Can anyone tell me which is best suited for performance-oriented applications?
All of the above. Or none of the above. No way to tell without measuring performance and seeing which one does or does not work for you.
I would agree with the existing answers here: understand what performance really means to your application before going off half-cocked on something (most of us have been there). If you're looking for something super-performant that still has some "ORMish" behavior and takes some monkey coding out of the ADO.NET equation, take a look at the various .NET micro-ORMs out there, such as:
Dapper (with extensions)
ServiceStack's OrmLite
Insight.Database
Massive
There are several others out there, some of which are referenced from the Dapper site.
If you really are stuck with those three choices, it definitely does depend on a lot of factors and how much time you spend tuning. That being said, I've used all three quite a bit, especially NHib 2-3 and EF 4-6. I think if you are doing just quick-and-dirty coding without spending a lot of time on optimizing, LightSpeed is a really good choice and I've personally found it to outperform the other two very handily when it comes to most basic CRUD operations and LINQ queries.
The big downside of LightSpeed is that you have to inherit from their base classes. This is somewhat mitigated by partial class support, and you can also insert your own base classes in between. There's also no true "CodeFirst" support, although you can hand-code the classes and skip the designer if you like. They all work well if tuned properly. Just pick the right tool for the job.
Whichever you choose, use your SQL Profiler / Mini Profiler / NHProf / EFProf, etc.
When Linq-to-Sql was first released, I used it quite a lot for small and medium sized projects where a true multi-tier architecture wasn't required.
NHibernate, Small Middleware, Overkill
Where I work, we now almost exclusively use NHibernate for true Domain Driven development.
I'm working on a small temporary (a lifetime of probably a year, maybe less) middleware component where NHibernate feels slightly overkill in terms of configuration and keeping the entities up to date. Especially because I haven't got any control over the DB, it sometimes changes, and it's a little bit "legacy".
Some changes were recently made to the DB, and the NHibernate mappings are not very complete.
Linq-to-Sql? Or EF?
I thought it might be easier just to rip out the IRepository implementation I have and replace it with a Linq-to-Sql implementation. Then I can just use lambdas for my simple queries, and just drag and drop the tables in.
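The idea of ripping out one repository implementation and dropping in another can be sketched as follows. This is an illustration in Python rather than C#, and every name in it (Customer, CustomerRepository, InMemoryCustomerRepository, names_starting_with) is hypothetical; the in-memory class merely stands in for whichever backing technology (NHibernate, Linq-to-Sql, EF) sits behind the interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Customer:
    id: int
    name: str


class CustomerRepository(ABC):
    """The contract the rest of the middleware depends on."""

    @abstractmethod
    def add(self, customer):
        ...

    @abstractmethod
    def find(self, predicate):
        ...


class InMemoryCustomerRepository(CustomerRepository):
    """Stand-in implementation; a database-backed one would expose the same methods."""

    def __init__(self):
        self._rows = {}

    def add(self, customer):
        self._rows[customer.id] = customer

    def find(self, predicate):
        # Simple queries are expressed as lambdas, as in the post above.
        return [c for c in self._rows.values() if predicate(c)]


def names_starting_with(repo, prefix):
    # Calling code only sees the abstract repository, so the implementation
    # behind it can be replaced without touching this function.
    return sorted(c.name for c in repo.find(lambda c: c.name.startswith(prefix)))
```

As long as callers depend only on the abstract class, replacing the implementation is a local change, which is what makes the swap described above cheap to try.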
RAD But Dead?
In this scenario the RAD elements of Linq-to-Sql make sense. But it's essentially old technology. Should I not use it? I've never used the Entity Framework. Should I use that instead? Is it as easy and quick to use?
cheers
Is it still OK to use Linq-to-sql?
Yes. It is OK to use any technology which allows you to deliver the product on time, with the required functionality and quality. You can still find projects using ASP, ADO and VB6. One reason why Microsoft technologies have a very hard time in many international corporations is that their products have a very short lifetime. Linq-to-sql was on the market for less than 2 years before it was deprecated by Microsoft, but companies and the community argued against that, and Microsoft changed its strategy a little bit. Linq-to-sql doesn't get new features, but it is still a supported and fully functional technology.
Will Linq-to-sql or EF solve your problems?
It depends. Perhaps yes and perhaps no. Don't believe marketing announcements about RAD. Sometimes I feel that people think RAD is about the designer. No. Tools supporting RAD are about a well-defined API which is easy to understand, easy to use and doesn't contain unexpected behavior (principle of least surprise) - you will use the API to quickly prototype the application, but it still requires understanding and practice. NHibernate's mapping is still prototyping when you compare it with doing the whole data access manually. We can even follow the basic rule of a good framework: easy things are easy to do and hard things are possible. That is something that NHibernate accomplishes much better than EF or Linq-to-sql.
If you know NHibernate but don't have any real-world experience with EF or Linq-to-sql, you can be sure that neither Linq-to-sql nor EF will increase your productivity in the first one or two projects where you use it. If you don't have much experience with NHibernate, changing to EF or Linq-to-sql will probably not cause a temporary loss of productivity.
I also don't think that EF or Linq-to-sql will generally help you in situations where the database changes. As I remember, the Linq-to-sql designer doesn't have any update-mapping functionality at all, and because of that it is very often used completely without the designer, so you must still modify the mapping manually. EF's ability to update the model from the database can be helpful here, but it is not a silver bullet. Some updates can require manual modification of the EDMX file (huge XML).
Finally, be aware that NHibernate's mapping features are much more powerful, especially when working with legacy databases. Linq-to-sql's mapping features are very limited: it is mostly 1:1 mapping of tables to classes, with some exceptions (basic TPH inheritance). EF offers more complex mapping features, but it somehow expects a correct design of the database.
If you (and your coworkers) are comfortable with NHibernate, then it should be just as RAD as Linq to SQL. There's no reason not to use L2S as long as you understand it's not going to get much in the way of updates and improvements from Microsoft. But in my experience, if you already know how to use both frameworks, there is no need to redo the work just because L2S might be a little more RAD.
Some good discussion on NHibernate vs L2S/EF
Entity Framework vs LINQ to SQL
MS Entity Framework VS NHibernate and its derived contribs (FluentNHibernate, Linq for NHibernate)
In your specific case, you should go with the technology you are most comfortable with. Though Linq2Sql is relatively straightforward - and built right into the language - it does have a slight learning curve and its own set of gotchas!
I was surprised to find a public letter proposing a vote of no confidence in the Entity Framework (see http://efvote.wufoo.com/forms/ado-net-entity-framework-vote-of-no-confidence/).
Would the reasons stated in the letter keep you from using the current version of the entity framework? Would you rather wait for v4.0? Or rather use another ORM?
The current version of EF is definitely not perfect, and has lots of gotchas and drawbacks. I probably wouldn't use it right now - but the upgrade path to EF v2 (or is it EF4?) sure looks pretty rosy!
complete persistence ignorance - you can use your straight up POCO classes
deferred loading configurable as an option
much improved designer with support for pluralization/singularization (even in multiple languages!)
ability to do "domain first" design and create database from your model
ability to have self-tracking entities across multiple layers that allow you to send data to the client and get back changes and apply them to your entity context
All in all, EF v2 looks very promising and I'm very eager to give it a serious spin. If it really keeps all the promises out there right now, it's definitely a winner!
Check out the ADO.NET team blog for a flurry of recent blog posts on EF v2.
Marc
Another ORM.
Don't get me wrong, I expect to get flamed for this, but currently only NHibernate is functionally complete.
I'm a TDD fan, so I want an easily testable POCO ORM solution. If that's your bag, then EF 3.5 is out. EF 4.0 is introducing it (http://blogs.msdn.com/adonet/archive/2009/05/21/poco-in-the-entity-framework-part-1-the-experience.aspx), but it still has at least one big drawback: it doesn't support inheritance.
NHibernate is more complete, but EF could be easier to use. As ever, best tool for the job... but if it's an enterprise-scale app developed with TDD, go NHibernate.
Also -> there's a profiler that makes nHibernate dev much easier -> http://www.nhprof.com/
I tried using it for my current project, which basically involves rewriting our current mess of a data layer.
It just doesn't work.
First, if you're trying to base an entity off of a view, the designer tries to force every NOT NULL property to be an entity key... which is pretty much never what I wanted. To work around that, you have to edit the XML in at least two places, and do it every time you add an object, because it refreshes and re-adds the EntityKey properties. (See: Must specify mapping for all key properties in Entity Framework?)
Second, when you are creating associations you MUST use every entity key. (See: How can you make an association without using all entity keys in entity framework?)
Those two things held me up for 3 days, then I went back to Linq to SQL and had it done in a couple hours. (Well, at least the part of the system I was struggling with... ) I don't know if those are in the Vote of No Confidence, but it's just not ready in my opinion.
Also with the lack of answers I got here on every EF question I've asked, I have to assume current usage is so low that getting help and support is going to be difficult... which is possibly the BIGGEST reason not to use something.
Let's hope the next version is better...
EDIT: Our current plan is to stick with Linq 2 SQL (I have to finish a project by Friday) and then evaluate all the other ORMs to see if anything else is better. The other developer hates L2S, for the record, but I've never had any major problems using it...
EF has some rich design-time support, but I have to agree that NHibernate is the way to go, despite the learning curve. If you need to make something fast and don't care about TDD or serialization (which is a large weakness of all of MS's ORM offerings), go EF.
Well, my experience of version 1 was interesting. I wanted to use POCOs, but it didn't support them. After reading around, I came across some code from a bod at Microsoft that did this.
It was a bit messy to generate the code but on the whole this part of the process was not so bad.
A really nasty part that I came across was the lack of built-in concurrency checking for N-tier development. You have to manage this yourself, which, after looking at the problem, was not so bad, especially if you want to hand the versioning back to the client for user intervention.
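One common way to manage concurrency by hand is optimistic concurrency with a version column: the client receives the version along with the data, and the update only succeeds if the version is unchanged. The sketch below shows that general technique in Python with sqlite3; it is not the author's actual code, and the items table, the StaleDataError exception and the function names are hypothetical.

```python
import sqlite3


class StaleDataError(Exception):
    """Raised when the row was changed by someone else since it was read."""


def create_schema(conn):
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL, version INTEGER NOT NULL)"
    )


def update_name(conn, item_id, new_name, expected_version):
    # The WHERE clause only matches if nobody has bumped the version
    # since the client read the row.
    cur = conn.execute(
        "UPDATE items SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, item_id, expected_version),
    )
    if cur.rowcount == 0:
        # Zero rows touched: the row is gone or the version moved on.
        # The caller can hand this back to the user to resolve.
        raise StaleDataError(f"item {item_id} is not at version {expected_version}")
    return expected_version + 1
```

Because the version travels with the data, the check works across tiers without keeping any server-side state between the read and the write.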
The second nasty and absolutely stupid thing missing was the IN keyword for LINQ queries. It is not supported and so needs to be worked around. I found a solution, but it was a real mess, bringing in some other code that quickly patched up the omissions.
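The post does not say which workaround was used. One general shape such a workaround can take is to expand the list of values into a chain of OR-ed equality comparisons. The sketch below illustrates that idea with plain Python predicates over dictionaries; it is not LINQ and not the author's solution, and the helper names (equals, any_of, field_in) are hypothetical.

```python
from functools import reduce


def equals(field, value):
    # One equality comparison, the building block that is always available.
    return lambda row: row[field] == value


def any_of(predicates):
    # Fold the predicates into a single "p1 OR p2 OR ..." predicate.
    # An empty list yields a predicate that matches nothing.
    return reduce(
        lambda left, right: (lambda row: left(row) or right(row)),
        predicates,
        lambda row: False,
    )


def field_in(field, values):
    # Emulates "field IN (values)" using only equality and OR.
    return any_of([equals(field, v) for v in values])
```

A filter such as `[r for r in rows if field_in("id", [1, 3])(r)]` then behaves like an IN clause, at the cost of a predicate that grows with the number of values.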
Would I use EF 4.0 (2.0)? Yes, absolutely, why not? In fact, on stage 2 I will be using it. It looks like it supports POCOs, and it looks like my concurrency model will move straight across with no problems (basically delta copy stuff). It's all good so far, and I hope that this time round the big guys at Microsoft have seen the error of their ways and provided a solution that works.
If you're buying into entity development and the whole conceptual-model-first thing, then it's the only way to go for a complete Microsoft solution. Although the stuff being done on the M language might eclipse the idea and move the whole modelling thing back to the database.
If you're not buying into the entity stuff, then I would strongly recommend Enterprise Library. It's a proven technology that works every time, built on a solid code foundation and a database-centric paradigm. I would also go this route if you think that stored procedures are the bee's knees and like what they bring to the table.
If you're feeling really exotic and a bit frisky, I would go with a NoSQL approach such as CouchDB. This, however, does take some getting used to. It's damn weird and feels really, really wrong. But things get developed in super quick time, and the solutions seem to be robust and faster than expected. I would not go for this type of solution, though, if you're big into normalization and think that it can be applied to a NoSQL approach. The whole model needs to be turned on its head, and the application will need to be modelled in a way that is driven by the technology applied.
I find the CouchDB way a bit dirty and very very wrong. But it has so many compelling reasons to use it, that I think it will seep into the psyche of every programmer, and it will definitely go mainstream in the next couple of years.
My biggest gripe with the whole entity thing, even in the new version 4, is that there really has not been much thought put into N-tier environments. It still feels like a 2-tier solution, with a lot of boilerplate code still needing to be written by the end user (developer) to get it working in a robust and dependable N-tier way.
We have been having some discussions at work recently on approaches to using the Entity Framework. We have a fairly large and complex n-tier web-based application, which is due for a major overhaul.
The question is: if we were to start using the Entity Framework, would it be better to create one big model, or a set of smaller functional/activity-based models?
I have my own opinions on this, but would be interested to hear what some other people think.
Update (17th November 2008):
I have been creating one model, wiping it out and re-creating, etc for small projects at home. Although I haven't tried, I suspect that this approach will be a bit more challenging when there are a large number of entity types involved.
Also, does anyone have any experience of using EF with a large team using TFS or similar?
In my experience with it, I would just make one big model of the database. Otherwise, it might be hard to track what tables changed where. When I make changes to the database, I just delete all the tables in the model and regenerate it.
Of course, I also didn't customize my model by adding "entity" functionality to it (not sure how that works exactly).
So I'm no expert in it, but I usually end up using the LINQ-To-SQL models/objects instead of the Entity Framework - it's worked better for me so far.