I am looking at Entity Framework and trying to learn more about it, so I have created a simple project to play with.
I found out that I can't add a table if it does not have a primary key. Reading some posts on here and in other places, I think that is correct; it is apparently required so that EF can do deletions, updates, etc. If I have a project where there will be no deletions or updates, just SELECT queries, I'm guessing it doesn't matter which column I make the primary key? I understand most tables should have a primary key; this is just a question out of curiosity.
Also, can EF handle a primary key on multiple columns? I assume so.
Although your application does not require deletions or updates, sooner or later you will need a primary key. If you choose a good primary key (here you have a good guide for this), the task of programming will be much easier. And yes, EF can handle a primary key on multiple columns.
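For what it's worth, at the database level a multi-column key is just a composite PRIMARY KEY constraint, and EF can map it (for example via the fluent API's HasKey or key attributes with a column order). A sketch with made-up table and column names:

-- Hypothetical order-lines table whose key spans two columns.
CREATE TABLE OrderLines (
    OrderId     INT          NOT NULL,
    LineNumber  INT          NOT NULL,
    ProductName VARCHAR(100) NOT NULL,
    Quantity    INT          NOT NULL,
    CONSTRAINT PK_OrderLines PRIMARY KEY (OrderId, LineNumber)  -- composite key
);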
I'm taking a course on PostgreSQL, coming from a MySQL background, and I stumbled upon the USING clause in table expressions. I know it is a shorthand to, well, shorten the ON conditions for JOINs, but I have some questions.
https://www.postgresql.org/docs/13/queries-table-expressions.html
Are they actually used?
I think that having, say, a "customerid" PRIMARY KEY on some "customers" table just to be able to use USING is actually more convenient than the plain "id" PRIMARY KEY I've always used; is it bad practice?
USING clauses are used quite often. It is rather a design choice for the tables in a database. Sometimes customers.id is used in the primary table and sometimes customers.customer_id.
Usually you'll see customer_id as foreign keys in other tables.
If you plan to do a lot of simple joins between foreign and primary keys in your queries, structuring the tables so that you can use the USING clause might be worth it, since it can simplify many queries.
I would say neither of the two options could be considered bad practice.
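For illustration, here is the same join written both ways, assuming hypothetical customers and orders tables that share a customer_id column:

-- Classic ON form: the column names on each side may differ.
SELECT c.name, o.order_date
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id;

-- USING form: works only because both tables use the same column name,
-- and the join column appears only once in the result.
SELECT c.name, o.order_date
FROM customers c
JOIN orders o USING (customer_id);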
In order to cut down on "stupid" tables (ones that are identical for several related parent entities), we made a few generic tables.
Here is an example:
tbl_settings
id
owner_type (e.g. "account", "user" etc.)
owner_id (actual ID of a foreign table record)
setting_name
setting_value
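For concreteness, a sketch of what such a table might look like in PostgreSQL (column types are assumptions):

-- Generic settings table: owner_type + owner_id point at a row in some other
-- table (tbl_account, tbl_user, ...), so no real FOREIGN KEY is possible here.
CREATE TABLE tbl_settings (
    id            bigserial PRIMARY KEY,
    owner_type    text   NOT NULL,   -- e.g. 'account', 'user'
    owner_id      bigint NOT NULL,   -- id of the row in the owner's table
    setting_name  text   NOT NULL,
    setting_value text
);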
Now the problem is the deletes: it is quite easy to forget to delete, e.g., a user's settings when the user is deleted.
What is the right way to handle deletes for this kind of a table in PostgreSQL?
Do it in the application (e.g. when deleting a user, do a manual delete of the related settings)?
Do it in a database trigger on the tbl_user (and in all other parent tables)?
Something else?
If a table's relationships have meaning only in the application and upwards - i.e. they have no bearing on the referential integrity of the data in the database - you can do this in the application layer.
If "orphaned" records violate data (as opposed to business logic) relationships, then do this in the database: the safest way is probably via a trigger, though that has its disadvantages too (e.g. the likelihood of obfuscating DML errors is higher if there is a trigger action involved).
My impression from your question is that these tables are mainly there because of some business logic, in which case I would handle the deletes outside the database, in an ORM layer, for example.
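If you do go the trigger route, a minimal PostgreSQL sketch could look like this (the table and column names are assumed from the example above, and a similar trigger would be needed on every other "owner" table):

-- Clean up a user's settings when the user row is deleted.
CREATE FUNCTION trg_delete_user_settings() RETURNS trigger AS $$
BEGIN
    DELETE FROM tbl_settings
    WHERE owner_type = 'user'
      AND owner_id   = OLD.id;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER user_settings_cleanup
AFTER DELETE ON tbl_user
FOR EACH ROW
EXECUTE FUNCTION trg_delete_user_settings();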
I have a set of tables that are partitioned on a partition scheme A, and now I would like to change them to use partition scheme B instead. I've read online articles (such as this) about doing that, but it appears to be quite bothersome.
But it seems like the general way of doing this is to:
1. Drop all foreign keys
2. Drop the clustered primary key
3. Re-add the primary key with the new scheme
4. Re-add all the foreign keys
However, not being too fluent with T-SQL, I am not too sure how to do this dynamically. I'd like to write a script that stores all the key settings and re-applies them later without human supervision, and that appears to be beyond me at the moment. MSDN is also not very helpful, because it's just data overload when I read the documentation.
Is there any resource I can read that would provide more insight on how to do this? Perhaps there is some clever built-in procedure in SQL Server that I don't know about yet?
Any input would be appreciated. Thanks
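For what it's worth, the core of steps 1-4 for a single table looks roughly like the sketch below (all object names are placeholders, and for simplicity it assumes the table is partitioned on its key column Id and that scheme psB already exists). Doing it dynamically for every table means reading catalog views such as sys.foreign_keys, sys.foreign_key_columns and sys.indexes first and generating these statements from them.

-- 1. Drop the foreign keys that reference dbo.MyTable.
ALTER TABLE dbo.ChildTable DROP CONSTRAINT FK_ChildTable_MyTable;

-- 2. Drop the clustered primary key (this drops the clustered index).
ALTER TABLE dbo.MyTable DROP CONSTRAINT PK_MyTable;

-- 3. Re-create the primary key on the new partition scheme.
ALTER TABLE dbo.MyTable
    ADD CONSTRAINT PK_MyTable PRIMARY KEY CLUSTERED (Id)
    ON psB (Id);

-- 4. Re-create the foreign keys.
ALTER TABLE dbo.ChildTable
    ADD CONSTRAINT FK_ChildTable_MyTable
    FOREIGN KEY (MyTableId) REFERENCES dbo.MyTable (Id);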
Instead of having a composite primary key (this table maintains the relationship between two tables, each of which represents an entity), the proposed design has an identity column as the primary key, with a unique constraint enforced over the two columns that hold the primary keys of the related entities.
To me, having an identity column on every relationship table breaks the normalisation rules.
What is the industry standard?
What should be considered before making this design decision?
Which approach is right?
There are lots of tables where you may want to have an identity column as a primary key. However, in the case of an M:M relationship table like the one you describe, best practice is NOT to use a new identity column for the primary key.
RThomas's link in his comment provides excellent reasons why the best practice is NOT to add an identity column. Here's that link.
The cons will outweigh the pros in pretty much every case, but since you asked for pros and cons I put a couple of unlikely pros in as well.
Cons
Adds complexity
Can lead to duplicate relationships unless you enforce uniqueness on the relationship (which a primary key would do by default).
Likely slower: db must maintain two indexes rather than one.
Pros
All the pros are pretty sketchy
If you had a situation where you needed to use the primary key of the relationship table as a join to a separate table (e.g. an audit table?) the join would likely be faster. (As noted though--adding and removing records will likely be slower. Further, if your relationship table is a relationship between tables that themselves use unique IDs, the speed increase from using one identity column in the join vs two will be minimal.)
The application, for simplicity, may assume that every table it works with has a unique ID as its primary key. (That's poor design in the app but you may not have control over it.) You could imagine a scenario where it is better to introduce some extra complexity in the DB than the extra complexity into such an app.
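To make the two options concrete, here is a sketch of both designs for a hypothetical student/course relationship table (T-SQL flavoured; the surrogate variant is given a different name only so the two can coexist in one script):

CREATE TABLE student (id INT PRIMARY KEY);
CREATE TABLE course  (id INT PRIMARY KEY);

-- Option A: composite primary key (the usual recommendation for a pure M:M table).
CREATE TABLE student_course (
    student_id INT NOT NULL REFERENCES student (id),
    course_id  INT NOT NULL REFERENCES course (id),
    PRIMARY KEY (student_id, course_id)            -- one index, uniqueness for free
);

-- Option B: surrogate identity key plus a unique constraint.
CREATE TABLE student_course_surrogate (
    id         INT IDENTITY(1,1) PRIMARY KEY,      -- extra column, extra index
    student_id INT NOT NULL REFERENCES student (id),
    course_id  INT NOT NULL REFERENCES course (id),
    UNIQUE (student_id, course_id)                 -- still needed to prevent duplicates
);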
Cons:
Composite primary keys have to be imported in all referencing tables. That means larger indexes, and more code to write (e.g. the joins, the updates). If you are systematic about using composite primary keys, it can become very cumbersome.
You can't update a part of the primary key. E.g. if you use university_id, student_id as the primary key in a table of university students, and one student changes university, you have to delete and recreate the record.
Pros:
Composite primary keys allow you to enforce a common kind of constraint in a powerful and seamless way. Suppose you have a table UNIVERSITY, a table STUDENT, a table COURSE, and a table STUDENT_COURSE (which student follows which course). If there is a constraint that you always have to be a student of university A in order to follow a course of university A, then that constraint will be automatically validated if university_id is part of the composite keys of both STUDENT and COURSE.
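A sketch of that last point (names follow the example above; the foreign keys from STUDENT_COURSE carry university_id, so a mismatch is impossible):

CREATE TABLE UNIVERSITY (
    university_id INT PRIMARY KEY
);

CREATE TABLE STUDENT (
    university_id INT NOT NULL REFERENCES UNIVERSITY (university_id),
    student_id    INT NOT NULL,
    PRIMARY KEY (university_id, student_id)
);

CREATE TABLE COURSE (
    university_id INT NOT NULL REFERENCES UNIVERSITY (university_id),
    course_id     INT NOT NULL,
    PRIMARY KEY (university_id, course_id)
);

-- university_id appears only once per row here and feeds both foreign keys,
-- so a student can never follow a course of a different university.
CREATE TABLE STUDENT_COURSE (
    university_id INT NOT NULL,
    student_id    INT NOT NULL,
    course_id     INT NOT NULL,
    PRIMARY KEY (university_id, student_id, course_id),
    FOREIGN KEY (university_id, student_id) REFERENCES STUDENT (university_id, student_id),
    FOREIGN KEY (university_id, course_id)  REFERENCES COURSE  (university_id, course_id)
);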
You have to create all of the key columns in every table where they are used as a foreign key. This is the biggest disadvantage.
I am building an Entity Framework model for a subset of the Pubs database from Microsoft. I am only interested in publishers and books, not publishers and employees, but there is a foreign key constraint between the publishers and employees tables. When I remove the employees entity from my model, the model won't validate because of the foreign key constraint.
How do I create a model for a subset of a database when that subset links to other tables with foreign key constraints?
Because this is for a demo, I deleted the offending tables and constraints from the database, but this won't work in production.
The correct way to do this is by exposing the foreign key columns as scalar properties. There is a complete explanation, and downloadable sample code, in this blog post. You might find the rest of the post interesting, as well.
You could create views of the pertinent data and bind your model to that. I am not a database expert, but a DBA that I formerly worked with recommended this approach because she said that the view is less intensive on the database server to begin with.
Prior to the release of 3.5 SP1, we built a DAL on top of LINQ to SQL (without DBML mappings, but that is another story) that mapped all of the domain objects to either stored procedures or views. That way, the DBA was happy about the calls following a more set execution plan, as well as being able to encapsulate the database logic outside of the codebase.
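As a sketch of the view approach against the pubs schema (the exact column lists may differ from your copy), something like this exposes only the publisher and title data, and because views carry no foreign key metadata the employee table never enters the model:

-- Only the columns the model needs; no FK metadata is exposed to EF.
CREATE VIEW dbo.vw_Publishers AS
SELECT pub_id, pub_name, city, state, country
FROM dbo.publishers;
GO

CREATE VIEW dbo.vw_Titles AS
SELECT title_id, title, type, pub_id, price, pubdate
FROM dbo.titles;
GO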