Entity Framework with a large number of tables

Our database has about 500 tables we'd like to use in our EF model. Of those, I'd be happy to start with 50 or fewer just to get our feet wet after working in plain ADO.NET for years.
The problem is that our SQL Server contains many thousands of other tables created through the years, many of them dynamically generated. Believe it or not:
```sql
select count(*) from INFORMATION_SCHEMA.TABLES
-- 73261
```
So that's a lot of tables. I have found that pretty much every tool I've tried to design, build or template EF models or entities either hangs or does not return a list of tables. Even SQL Server Object Explorer in VS2012 won't list the tables and instead shows the Tables folder with a little "x" over the icon. So I can't even select a subset of tables.
What options do I have for using EF? Is there a template where I can explicitly define the tables that I want to use entities for? Even with 50 tables, I don't want to hand code each one in an empty EDMX.

Using a Database/Code First approach and avoiding connecting Visual Studio to the database at all (i.e. don't create an EDMX or connect with Server Explorer) would let you do this easily. It does not give you any of the Model First advantages, but it sounds like your project would be better served by a Database/Code First approach anyway, as:
You have an existing Model, and are not looking to push changes from your EDMX to the DB
You are looking to implement this on a subset of your database
This link has a good summation (Code-first vs Model/Database-first), with the caveat that in your case a Database/Code First approach does not have you pushing changes from code to the database, so the last two bullets under Code First apply less; yours is a Database/Code First hybrid.
With 70k tables I think any GUI is going to struggle. When I say Database/Code First, I am trying to convey that you are not using the code to create, define, or update your database. Someone may be able to put this more succinctly or accurately.
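A minimal sketch of what that looks like, assuming EF 4.1+ Code First against the existing database (the Customer class, table name, and connection string name here are placeholders):

```csharp
using System.Data.Entity;

// Hypothetical entity matching one existing table.
public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public class SubsetContext : DbContext
{
    static SubsetContext()
    {
        // Never let EF try to create or alter the existing database.
        Database.SetInitializer<SubsetContext>(null);
    }

    public SubsetContext() : base("name=MyDbConnection") { }

    // Only entities listed here enter the model; the other
    // 73k tables in the database are never inspected.
    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Explicit mapping to the existing table name and schema.
        modelBuilder.Entity<Customer>().ToTable("Customer", "dbo");
    }
}
```

Because the model is built only from the DbSets you declare, no tool ever has to enumerate the full table list.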

I know this is an old question, but for those who land here from a Google search: the only tool I have found that actually works with thousands of tables is The Sharp Factory.
It is an ORM and pretty simple to use. So if you are looking for an ORM that can work with a large number of tables and does not require you to write POCOs, mappings, or SQL, then this is the tool.
You can find it here: The Sharp Factory

Related

Multiple databases in EF6

We are involved in a fairly new development in which we are rebuilding our current web shop platform.
In the current platform we do not use EF6 or any other ORM; we access the database through stored procedures. In the new platform, however, that is what we do.
We have a question regarding the database design of the new platform. In the current platform we use several different databases depending on their content.
For example, we have dedicated databases to store product catalog information and another dedicated database for handling orders.
Currently all data access is done through stored procedures, so we have no problem with the links between the different databases.
The problem appeared when we started to use EF6. Each database is associated with its own context, and data from one context cannot be reached from another
unless we implement these relationships directly in source code using the various contexts. It seems this means we will lose much of the power of EF6.
The questions we have are:
Is it bad design to maintain different databases for the same application when using EF6?
If this is poor design and we choose a single database instead, will performance still be good even with hundreds of tables (almost 1,000) holding several terabytes of data?
On the other hand, if we opt for the design with several databases (which would be much better in our case), what is the best way to handle them in EF6?
Thank you very much for your help!
First of all, EF is not written to be cross-database. You can't write cross-database (cross-context) queries, lazy loading does not work across contexts, and so on.
This is a big limitation in your case.
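To illustrate the limitation, the only general workaround is to query each context separately and combine the results in memory. A hedged sketch with invented contexts and entities:

```csharp
using System.Data.Entity;
using System.Linq;

// Hypothetical contexts, one per database.
public class CatalogContext : DbContext
{
    public CatalogContext() : base("name=CatalogDb") { }
    public DbSet<Product> Products { get; set; }
}

public class OrdersContext : DbContext
{
    public OrdersContext() : base("name=OrdersDb") { }
    public DbSet<OrderLine> OrderLines { get; set; }
}

public class Product { public int ProductId { get; set; } public string Name { get; set; } }
public class OrderLine { public int OrderLineId { get; set; } public int ProductId { get; set; } public int Quantity { get; set; } }

public static class CrossContextExample
{
    public static void Run()
    {
        using (var catalog = new CatalogContext())
        using (var orders = new OrdersContext())
        {
            // A single LINQ query spanning both contexts would throw,
            // so each side is materialized first and joined in memory,
            // losing server-side joins, paging, and lazy loading.
            var products = catalog.Products.ToList();
            var lines = orders.OrderLines.ToList();

            var report = from p in products
                         join l in lines on p.ProductId equals l.ProductId
                         select new { p.Name, l.Quantity };
        }
    }
}
```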
EF can work with several schemas (I don't actually use this and I don't like it, but that is just my opinion).
You can use your stored procedures with EF, but as I understand it you are planning to stop using them.
In my experience I have written several applications with more than one database, but the use of the different databases was very limited. In those cases I used cross-database views (e.g. one database per company, plus some common tables exposed through views in each company database that select from the common tables). In your case, if the tables are spread everywhere, I don't think this is an approach you can choose.
So, in my opinion you could change the approach.
If you have backup problems you could shard the huge tables (I'm thinking of fact tables and tables with pictures) into a separate database and create cross-database views. By the way, cross-database referential integrity is not supported in SQL Server, so you would need to write triggers to enforce it.
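If you do go the cross-database-view route, EF can map an entity straight onto the view. A small sketch (view, entity, and context names are made up), keeping in mind the view is effectively read-only unless you add INSTEAD OF triggers:

```csharp
using System.Data.Entity;

public class PictureRow
{
    public int PictureId { get; set; }
    public byte[] Data { get; set; }
}

public class ShardAwareContext : DbContext
{
    public DbSet<PictureRow> Pictures { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // vw_Picture is a view in this database that selects from the
        // table sharded off to another database; EF treats it as a table.
        modelBuilder.Entity<PictureRow>().ToTable("vw_Picture", "dbo");
    }
}
```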
If you need to split different application functions (i.e. WMS, CRM and so on) you can use namespaces without bothering about how tables are stored in the DB.

EDMX or not EDMX any more?

I'm a bit confused: with all the evolutions of EF, I'm not sure where I stand now.
* Is EDMX a choice of the past that should no longer be used?
* If so, what is the best choice?
* I hate EDMX; can I upgrade to Code First?
It is not clear to me what all these EF versions mean.
Thanks
Jonathan
For a lot of apps you can start using Code First if you want to. The one big thing Code First doesn't support yet is mapping to stored procedures. (You can still call stored procedures, but you can't map entity CRUD operations to them.)
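Calling a stored procedure from Code First looks roughly like this; the procedure name, context, and result type below are invented for illustration, but `Database.SqlQuery` is the actual EF 4.1+ API:

```csharp
using System.Data.Entity;
using System.Data.SqlClient;
using System.Linq;

public class ShopContext : DbContext { } // placeholder context

// Placeholder result type; columns are matched to properties by name.
public class CustomerDto
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public static class SprocExample
{
    public static void Run()
    {
        using (var ctx = new ShopContext())
        {
            // SqlQuery runs the procedure and materializes the rows;
            // it is a read-side call, not a CUD mapping.
            var top = ctx.Database
                .SqlQuery<CustomerDto>("EXEC dbo.GetTopCustomers @Count",
                                       new SqlParameter("@Count", 10))
                .ToList();
        }
    }
}
```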
That being said, doing Database First with an EDMX is still absolutely supported and a fine choice, especially if you like using the EF designer.
EF 4.1 and above fully support both Code First and Database First.
Personally, I would almost always choose Code First, even with an existing database, because I'm a code-centric person and would rather keep all my mappings in code where I can easily refactor, manage in source control, split into multiple files, etc. For me, it's much easier and nicer to deal with code artifacts than monolithic XML documents.
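As one sketch of what "mappings in code, split into multiple files" can look like, using EF's `EntityTypeConfiguration` classes (the entity and table names are invented):

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

public class Order
{
    public int OrderId { get; set; }
    public string Number { get; set; }
}

// OrderMap.cs: one mapping class per file, easy to diff, refactor,
// and review in source control.
public class OrderMap : EntityTypeConfiguration<Order>
{
    public OrderMap()
    {
        ToTable("Order", "sales");
        HasKey(o => o.OrderId);
        Property(o => o.Number).HasMaxLength(20).IsRequired();
    }
}

public class StoreContext : DbContext
{
    public DbSet<Order> Orders { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Each mapping file registers itself here.
        modelBuilder.Configurations.Add(new OrderMap());
    }
}
```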
This is how you should evaluate your Entity Framework usage:
1) EDMX is a totally valid option, specifically if you have an existing database and want to generate your entities based on your database schema. One of my favorite benefits of this is rapid data-layer development with low risk. Also, mapping stored procedure results to classes is always nice when you have complex existing stored procedures to work with.
OR
2) Code First is a totally valid option, specifically if you want to create your database based on an object-oriented data model. With Code First it's easy to make big refactors that you don't always think of until implementation time. Source control works more naturally with code, and shelving/rolling back are beautiful features.
TL;DR version:
They are both totally viable options. Neither is outdated, nor shall they be any time soon.
We had performance considerations around EF Code First warm-up: Code First took some minutes to start because we have thousands of entities. This bottleneck forced us to use an EDMX, so we used interactive pre-generation to create the EDMX from the Code First model on the first run after the entity model changed; on subsequent first runs the warm-up time was considerably lower.
But the story did not end there. We then found that during development the entity model changes frequently, so the EDMX file had to be recreated (updated) very often. So we decided to create the EDMX programmatically and optimize that creation for our entity models.
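For reference, a minimal sketch of producing an EDMX from a Code First model programmatically with EF's `EdmxWriter` (the context name is a placeholder for the real, large context):

```csharp
using System.Data.Entity;
using System.Data.Entity.Infrastructure;
using System.Xml;

public class MyLargeContext : DbContext { } // placeholder for the real context

public static class EdmxExport
{
    public static void WriteModel()
    {
        using (var context = new MyLargeContext())
        using (var writer = XmlWriter.Create("Model.edmx"))
        {
            // Serializes the context's runtime Code First model as an
            // EDMX document, which can be cached between model changes.
            EdmxWriter.WriteEdmx(context, writer);
        }
    }
}
```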

Relation between ER Modelling and Database normalization

How is database normalization related to ER modelling?
Which comes first?
Or should both be done at the same time?
I feel modeling should come first in a highly normalized database design.
Creating the model allows you to think through how the tables will relate to one another and also allows you to envision what tables you'll need to use when writing your join queries.
A tool such as MySQL Workbench or Toad Data Modeler, depending on your target database vendor, can even generate the SQL commands to build the tables, constraints, and indexes directly from the model. This is useful because it ensures the tables are created exactly as you designed them.
Also, when making changes to the model, some tools like those mentioned above will even allow you to "update" your schema by issuing the necessary statements required to do so.
So in short, for a project with more than one table, I'd always model it first. It also makes it easier for developers to understand how the tables function and relate at a glance rather than having to read through DDL to understand it.
Modeling can even be fun!
[Image: an example model created with MySQL Workbench]
Hope this helps!

How to model my database when using entity framework 4?

Trying to wrap my head around the best approach to modelling a database when we are using Entity Framework 4 as the ORM layer. We are going to use ASP.NET MVC 2 for the application.
Is it worth trying to model using the class diagram modeller that comes with Visual Studio 2010, where you graphically configure your models into the EDMX file and then generate the database structure from it?
I have run into a bunch of non-trivial issues, and for complex many-to-many mappings or entities with composite primary keys the answer is not obvious even after poking around with the tools for a while.
I figure it's easy at this point to give up and start modelling the DB using real, working DB modelling tools, and then generate the EDMX from the database, rather than trying the model-first approach.
It's really a matter of preference. If you are comfortable in SQL Server, that's probably the best place to start. But if you are more of a C# programmer, it's sometimes easier to start in the EDMX designer, build the model, and then ask it to figure out what the database should look like.
Of course, if you do go model first you'll still need to go into SSMS to add indexes and maybe rename some FKs and tables more to your liking. Then you can bring the model back up to date with Update Model from Database.
Modelling inheritance is also something you'll need to decide on; again you can start from either SSMS or the EDMX designer. For inheritance I mostly prefer SQL first, because it forces an explicit decision about which form of inheritance you want: table per hierarchy, table per type (class), or table per concrete type.
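For a rough feel of those three forms, here is a sketch using the EF 4.1+ Code First fluent API rather than the designer (entity names are invented; the designer expresses the same choices graphically):

```csharp
using System.Data.Entity;
using System.Data.Entity.ModelConfiguration;

public abstract class Payment
{
    public int PaymentId { get; set; }
    public decimal Amount { get; set; }
}
public class CardPayment : Payment { public string CardNumber { get; set; } }
public class CashPayment : Payment { }

public class PaymentsContext : DbContext
{
    public DbSet<Payment> Payments { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Table per hierarchy is the default: one Payments table
        // with a discriminator column, no configuration needed.

        // Table per type: each subclass gets its own table joined
        // to the base table.
        modelBuilder.Entity<CardPayment>().ToTable("CardPayment");
        modelBuilder.Entity<CashPayment>().ToTable("CashPayment");

        // Table per concrete type would instead use, per subclass:
        // modelBuilder.Entity<CardPayment>().Map(m =>
        // {
        //     m.MapInheritedProperties();
        //     m.ToTable("CardPayment");
        // });
    }
}
```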

Data Warehousing Postgres

We're considering using SSIS to maintain a PostgreSQL data warehouse. I've used it before between SQL Server instances with no problems, but am having a lot of difficulty getting it to play nicely with Postgres. I'm using the evaluation version of the PGNP OLE DB data provider (http://www.postgresql.org/about/news.1004).
I wanted to start with something simple like UPSERT on the fact table (10k-15k rows are updated/inserted daily), but this is proving very difficult (not to mention I’ll want to use surrogate keys in the future).
I've attempted (Link) and (http://consultingblogs.emc.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx), which are effectively the same (except I don't really understand the UNION ALL at the end when I'm trying to upsert). But I run into the same problem with parameters when doing the update using an OLE DB Command, which I tried to overcome using (http://technet.microsoft.com/en-us/library/ms141773.aspx), but that just doesn't seem to work; I get a validation error:
The external columns for complent.... are out of sync with the datasource columns... external column “Param_2” needs to be removed from the external columns.
(This error is repeated for the first two parameters as well. I never came across it using the SQL Server connection, as that supports named parameters.)
Has anyone come across this?
AND:
The fact that this simple task is apparently so difficult to do in SSIS suggests I'm using the wrong tool for the job. Is there a better (and still flexible) way of doing this, or would another ETL package be better for use between two Postgres databases? Other options include any listed at (http://en.wikipedia.org/wiki/Extract,_transform,_load#Open-source_ETL_frameworks). I could just go and write a load of SQL to do this for me, but I wanted a neat and easily maintainable solution.
I have used the Slowly Changing Dimension wizard for this with good success. It may give you what you are looking for, especially the wizard:
http://msdn.microsoft.com/en-us/library/ms141715.aspx
On the "external columns out of sync" error: SSIS is case sensitive. I encountered this issue multiple times and it made me want to pull my hair out.
This simple task is going to take some work either way. SSIS is by no means an enterprise class ETL product yet, but it does give you some quick and easy functionality, and is sufficient for most ETL work. I guess it is also about your level of comfort with it as well.
SCD is way too slow for what I want; I need to use set-based SQL.
It turned out that a lot of my problems were with bugs in the provider.
I opened a forum topic (http://www.pgoledb.com/forum/viewtopic.php?f=4&t=49) and had a useful discussion with the moderator/support/developer person.
Also, Postgres doesn't let you do cross-database queries, so I solved the problem this way (a sketch of step 2 follows below):
1. Data flow from the production DB into a temp table in the archive DB
2. Run a set-based query between the temp table and the archive table
3. Truncate the temp table
Note that the temp table is not actually a temp table, but a copy of the archive table's schema used to temporarily store the data.
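As a hedged sketch of what the set-based step could look like when driven from C# with Npgsql (table, column, and connection details are invented; this uses the classic update-then-insert pattern, since INSERT ... ON CONFLICT only arrived in PostgreSQL 9.5):

```csharp
using Npgsql;

public static class FactUpsert
{
    // Invented table/column names for illustration.
    const string UpsertSql = @"
UPDATE fact f
   SET amount = s.amount
  FROM staging s
 WHERE f.fact_key = s.fact_key;

INSERT INTO fact (fact_key, amount)
SELECT s.fact_key, s.amount
  FROM staging s
 WHERE NOT EXISTS (SELECT 1 FROM fact WHERE fact_key = s.fact_key);

TRUNCATE staging;";

    public static void Run(string connectionString)
    {
        using (var conn = new NpgsqlConnection(connectionString))
        {
            conn.Open();
            using (var cmd = new NpgsqlCommand(UpsertSql, conn))
            {
                // One batch: set-based update, insert of missing rows,
                // then truncate of the staging table.
                cmd.ExecuteNonQuery();
            }
        }
    }
}
```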
Took a while, but I got there in the end.
Regarding "SSIS is by no means an enterprise class ETL product yet": what enterprise ETL solution would you suggest?