JPA - Compound key vs generated id in Many to many table - jpa

I am creating a kind of social network and I have users that can follow other users. So I have an entity like:
@Entity
public class FollowedUser {
    @ManyToOne
    private User user;
    @ManyToOne
    private User followedUser;
    // more fields
    ...
}
I cannot have a ManyToMany relationship as I have more fields in my FollowedUser entity. Now, the questions I have are:
Should I use a compound key or a generated id (surrogate key)? I have read the following links (1, 2, 3) about the topic, where a surrogate key is suggested, but I don't know if they apply to my concrete case (where my compound key would be composed of two surrogate foreign keys). Also, here (4) it says "Composite primary keys typically arise when mapping from legacy databases", so I suppose they are discouraged.
In case I should use a compound key, I don't know if I should use @IdClass (as recommended here 5) or @EmbeddedId (as recommended here 6) or any other option. Although I suppose it doesn't matter.
In case I should use a surrogate key, I don't know how to still make it impossible for the compound candidate key to be repeated. I have read here (7) about unique indexes, but I don't know if that is the correct workaround for the problem.

1. I recommend using surrogate keys. I find it helpful to separate the database identity of a record from its business identity. If the two concepts are mixed, it may be cumbersome to model them right and to remodel them later. You reference some good answers, so you are probably aware of the major upsides and downsides; there is no need to reiterate them here. One additional point is that you can rely on a surrogate key such as a UUID to implement equals and hashCode properly. Implementing those for a composite key so that they play nicely with both collections and the DB can be tricky.
As to your use case, a connection between users can be viewed as an entity of its own and have an autogenerated surrogate PK. You can enforce the uniqueness of the business key attributes in the DB, see pt. 3.
2. AFAIK, deciding between EmbeddedId and IdClass is mostly a matter of taste. I prefer IdClass, since it avoids having to add navigation when querying id attributes:
... WHERE a.id.attribute = :att with EmbeddedId vs.
... WHERE a.attribute = :att with IdClass
I do not find the argument you link (6) convincing. Composite keys tend to consist of the most characteristic attributes of the entity. Hiding them away in a different class because they happen to be used as the DB key seems awkward to me.
3. Unique indexes look like a good way to guarantee uniqueness of a combination of attributes. You may want to read this answer, there is a small example.
If you are not yet working with JPA 2.1, you might want to use unique constraints, as explained here.
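The remark in point 1 about relying on a surrogate key such as a UUID for equals and hashCode can be sketched roughly as follows. This is only an illustration under assumptions: FollowedUserLink is a hypothetical, JPA-free stand-in for the FollowedUser entity (the annotations and the User associations are left out so that it runs without a persistence provider), and the two String fields merely stand in for the user and followedUser references.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for the FollowedUser entity; in a real mapping
// the id field would be the @Id attribute.
final class FollowedUserLink {
    // Surrogate key assigned at construction, so it is never null.
    private final UUID id = UUID.randomUUID();
    private final String userName;          // stands in for `user`
    private final String followedUserName;  // stands in for `followedUser`

    FollowedUserLink(String userName, String followedUserName) {
        this.userName = userName;
        this.followedUserName = followedUserName;
    }

    UUID getId() { return id; }
    String getUserName() { return userName; }
    String getFollowedUserName() { return followedUserName; }

    // Identity is defined by the surrogate key only.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FollowedUserLink)) return false;
        return id.equals(((FollowedUserLink) o).id);
    }

    @Override
    public int hashCode() { return id.hashCode(); }
}

class FollowedUserLinkDemo {
    public static void main(String[] args) {
        Set<FollowedUserLink> links = new HashSet<>();
        FollowedUserLink link = new FollowedUserLink("alice", "bob");
        links.add(link);

        // The hash code never changes, so lookups keep working.
        System.out.println(links.contains(link)); // true

        // Same business attributes, different surrogate key: not equal.
        System.out.println(link.equals(new FollowedUserLink("alice", "bob"))); // false
    }
}
```

The second line of output is the reason point 3 still matters: with identity based on the surrogate key, two links with the same pair of users are different objects, so the uniqueness of the pair has to be enforced separately in the DB.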

Related

Which JPA key generator option to choose?

I am new to JPA and databases in general. I was trying to generate entities from tables using JPA tools in Eclipse. There are a number of tables and I am trying to generate entities for all of them at the same time. The JPA tool gives me the following options for Key-generator.
I looked around on Google a bit but could not find much that addresses all the options. What do the options mean?
The JPA specification document provides answers in section 11.1.20, on pages 449 and 450:
The GeneratedValue annotation provides for the specification of
generation strategies for the values of primary keys. The
GeneratedValue annotation may be applied to a primary key property or
field of an entity or mapped superclass in conjunction with the Id
annotation.
The use of the GeneratedValue annotation is only required to be
supported for simple primary keys.
In case you are not familiar with the Id annotation, here is a quick explanation by Vlad Mihalcea from his blog post:
The @Id annotation is mandatory for entities, and it must be mapped to
a table column that has a unique constraint. Most often, the @Id
annotation is mapped to the Primary Key table column.
The types of primary key generation are defined by the GenerationType enum:
TABLE, SEQUENCE, IDENTITY, AUTO
The JPA spec gives details on those types as follows:
The TABLE generator type value indicates that the persistence provider
must assign primary keys for the entity using an underlying database
table to ensure uniqueness.
The SEQUENCE and IDENTITY values specify the use of a database
sequence or identity column, respectively. The further specification
of table generators and sequence generators is described in sections
11.1.48 and 11.1.51.
The AUTO value indicates that the persistence provider should pick an
appropriate strategy for the particular database. The AUTO
generation strategy may expect a database resource to exist, or it may
attempt to create one. A vendor may provide documentation on how to
create such resources in the event that it does not support schema
generation or cannot create the schema resource at runtime.
A well-established and recommended strategy is to choose the SEQUENCE strategy, if it is supported by the database management system.
Note that, strictly speaking, there is no NONE strategy defined in the JPA spec. The corresponding option in the select-one menu, depicted in the screenshot, simply expresses that "none" of the four regular types shall be set. This seems to be a fallback to indicate that you have not chosen a strategy yet. Still, you should pick one of the regular ones.

JPA/Hibernate and composite keys

I have come across some SO discussions and other posts (e.g. here, here and here) where using composite primary keys with JPA is described either as something to be avoided if possible, or as a necessity due to legacy databases, or as having "hairy" corner cases. Since we are designing a new database from scratch and don't have any legacy issues to consider, is it recommended or, let's say, safer to avoid composite primary keys with JPA (with either Hibernate or EclipseLink)?
My own feeling is that since JPA engines are complex enough and certainly, like all software, not without bugs, it may be best to suffer non-normalized tables than to endure the horror of running against a bug related to composite primary keys (the rationale being that numeric single-column primary keys and foreign keys are the simplest use case for JPA engines to support and so it should be as bug-free as possible).
I've tried both methods, and personally I prefer avoiding composite primary keys for several reasons:
You can make a superclass containing the id field, so you don't have to bother with it in all your entities.
Entity creation becomes much easier
JPA plays nicer in general
Referencing an entity becomes easier. For example, storing a bunch of IDs in a set, or specifying a single id in the query string of a web page, is largely simplified by only having to use a single number.
You can use a single equals method specified in the superclass that works for all entities.
If you use JSF you can make a generic converter
Easier to specify objects when working with your DB client
But it brings some bad parts as well:
Small amount of denormalization
Working with unpersisted objects (if you use auto-generated IDs, which you should) can mean trouble in some cases, since equality methods and the like need an ID to work correctly
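The last drawback can be made concrete with a small, JPA-free sketch. The assumptions here are that Item is a hypothetical entity whose equals and hashCode are based on a generated id, and that calling setId simulates the provider assigning the key when the object is persisted.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose equality depends on a database-generated id.
class Item {
    private Long id; // null until the object has been "persisted"

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Item)) return false;
        Item other = (Item) o;
        // Two unpersisted items (id == null) are never considered equal.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null
    }
}

class GeneratedIdPitfallDemo {
    public static void main(String[] args) {
        Set<Item> items = new HashSet<>();
        Item item = new Item();
        items.add(item);   // stored under the hash code for a null id

        item.setId(42L);   // simulates the key being generated on persist

        // The hash code changed after insertion, so the lookup misses.
        System.out.println(items.contains(item)); // false
        System.out.println(items.size());         // 1
    }
}
```

The object is still in the set, but it can no longer be found there, which is the kind of trouble meant above when unpersisted objects are put into hash-based collections.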

I don't need/want a key!

I have some views that I want to use EF 4.1 to query. These are specific optimized views that will not have keys to speak of; there will be no deletions, updates, just good ol'e select.
But EF wants a key set on the model. Is there a way to tell EF to move on, there's nothing to worry about?
More Details
The main purpose of this is to query against a set of views that have been optimized by size, query parameters and joins. The underlying tables have their PKs, FKs and so on. It's indexed, statiscized (that a word?) and optimized.
I'd like to have a class like (this is a much smaller and simpler version of what I have...):
public class MyObject // this is a view
{
    public string Name { get; set; }
    public int Age { get; set; }
    public int TotalPimples { get; set; }
}
and a repository, built off of EF 4.1 CF where I can just
public List<MyObject> GetPimply(int numberOfPimples)
{
return db.MyObjects.Where(d=> d.TotalPimples > numberOfPimples).ToList();
}
I could expose a key, but what's the real purpose of displaying a 2 or 3 column natural key that will never be used?
Current Solution
Seeing as there will be no EF CF solution, I have added a complex key to the model and I am exposing it in the model. While this goes "with the grain" of what one expects a "well designed" db model to look like, in this case, IMHO, it added nothing but more logic to the model builder, more bytes over the wire, and extra properties on a class. These will never be used.
There is no way. EF demands unique identification of the record: the entity key. That doesn't mean that you must expose any additional column. You can mark all your current properties (or any subset) as a key. That is exactly how EDMX does it when you add a database view to the model: it goes through the columns and marks all non-nullable and non-computed columns as the primary key.
You must be aware of one problem - EF internally uses identity map and entity key is unique identification in this map (each entity key can be associated only with single entity instance). It means that if you are not able to choose unique identification of the record and you load multiple records with the same identification (your defined key) they will all be represented by a single entity instance. Not sure if this can cause you any issues if you don't plan to modify these records.
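The effect of an identity map described above can be illustrated with a conceptual sketch. It is written in Java rather than C#, it is not EF's actual implementation, and all names in it (ViewRow, IdentityMapSketch, track) are hypothetical. It also assumes that the instance loaded first is the one that is kept.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical row read from a view; `name` has been chosen as the key
// even though it does not uniquely identify a row.
final class ViewRow {
    final String name;
    final int age;

    ViewRow(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

// Minimal identity map: at most one tracked instance per key.
final class IdentityMapSketch {
    private final Map<String, ViewRow> byKey = new HashMap<>();

    // Returns the instance already tracked for this key,
    // or starts tracking the given one.
    ViewRow track(ViewRow row) {
        ViewRow existing = byKey.putIfAbsent(row.name, row);
        return existing != null ? existing : row;
    }
}

class IdentityMapDemo {
    public static void main(String[] args) {
        IdentityMapSketch map = new IdentityMapSketch();

        // Two different records that share the same (non-unique) key.
        ViewRow first = map.track(new ViewRow("Sam", 30));
        ViewRow second = map.track(new ViewRow("Sam", 45));

        System.out.println(first == second); // true: one instance for both
        System.out.println(second.age);      // 30: the second record's data is lost
    }
}
```

So even for read-only queries, a key that is not really unique can make distinct records show up as repeated copies of the same object.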
EF is looking for a unique way to identify records. I am not sure if you can force it to go counter to its nature of desiring something unique about objects.
But, this is an answer to the "show me how to solve my problem the way I want to solve it" question and not actually tackling your core business requirement.
If this is a "I don't want to show the user the key", then don't bind it when you bind the data to your form (web or windows). If this is a "I need to share these items, but don't want to give them the keys" issue, then map or surrogate the objects into an external domain model. Adds a bit of weight to the solution, but allows you to still do the heavy lifting with a drag and drop surface (EF).
The question is what is the business requirement that is pushing you to create a bunch of objects without a unique identifier (key).
One way to do this would be not to use views at all.
Just add the tables to your EF model and let EF create the SQL that you are currently writing by hand.

Composite DB keys with Entity Framework 4.0

The re-design for a large database at our company makes extensive use of composite primary keys on the database.
Forgetting performance impacts, will this cause any difficulties when working with this db in Entity Framework 4.0? The database structure is unlikely to change and I'm not looking for "philosophical" debate but what are the practical impacts?
According to Jeremy Miller, "Composite key make any kind of Object/Relational mapping and persistance in general harder." but he doesn't really say why. Is this relevant to how Entity Framework 4.0 handles keys?
No, EF4 supports composite keys just fine.
The problem is a table with a surrogate key and composite keys. You can only set a single key on each model; that key can have multiple fields, but you can only have one from the designer standpoint. Not sure about manually editing xml or code only mapping.
You can set a field as an Identity and not a key if you need a composite and surrogate key on the same table. The Identity ( Id ) field won't be used by the ObjectContext or ObjectStateTracker but will increment and be queryable just fine though.
I have had problems with EF4 and composite keys. It doesn't support columns being used as components in more than one key in a join table.
See my previous question Mapping composite foreign keys in a many-many relationship in Entity Framework for more details. The nuts of it is that when you have a join table (describing a many-many relationship) where both of the relationships use a common key, you'll get an error like
Error 3021: Problem in mapping
fragments...: Each of the following
columns in table PageView is mapped to
multiple conceptual side properties:
PageView.Version is mapped to
(PageView_Association.View.Version,
PageView_Association.Page.Version)
The only way around it was to duplicate the column which defeats the purpose of having it there at all.
Good luck!

How can I add constraints to an ADO.NET Entity?

I know how to mark a group of fields as the primary key in ADO.NET entities, but I haven't found a way to declare unique constraints or check constraints.
Is this feature missing on the designer or on the framework?
Support for unique keys/constraints does not exist in ADO.NET Entities in v4.0, see the answer to "one-to-one association on a foreign key with unique constraint", where Diego B Vega says:
I know for sure we haven't added
support for unique keys other than
primary keys in 4.0.
He does, however, provide a possible workaround/hack (which comes with all the normal caveats):
As you are probably aware of, it is
often possible to “lie” to Entity
Framework and tell it in the SSDL, for
instance, that some unique key is the
primary key. I reckon this would work
very well if the actual primary key is
an surrogate key (i.e. an IDENTITY
column that was added for this
purpose) and you don’t even have to
map it in the model.