Does a weak entity need a partial key? - entity-framework

Does a weak entity need a partial key, or can you just use its parent's key as its primary key?
For example, take Order and OrderItem: Order has a PK OrderPK, whilst OrderItem has no partial key.
Is this considered bad practice?

The OrderItem table should have an OrderID field that makes a FK reference back to the Orders table. This ensures each item belongs to a valid order.
Then there is usually another field which distinguishes each item and which is used together with the OrderID field to form the primary key for the item.
This could be an intrinsic value or values that is unique for each item within an order. SKU or PartNum might be just such a value, assuming that multiple occurrences of the same item would be merged into one entry. To find this value, just ask yourself what minimum amount of data would you need to uniquely identify one item from another within the same order. However, it may not be possible. A disadvantage of this scheme is that you could be using dynamic data for a key field. The SKU of a particular item could well change some time in the future.
Or there could be a sequential value (1, 2, 3,...) for each item in an order. A disadvantage with this scheme is the sequential values cannot be system generated. Each sequence is unique for each order and this must be generated by trigger or application code.
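As a rough sketch of this per-order sequence option, here is one way it might look using Python's built-in sqlite3 module. The table and column names (orders, order_item, line_no, sku) and the add_item helper are made up for illustration, and the "next number" is computed in application code as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY
    );
    CREATE TABLE order_item (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        line_no  INTEGER NOT NULL,   -- partial key: unique only within one order
        sku      TEXT    NOT NULL,
        PRIMARY KEY (order_id, line_no)
    );
""")


def add_item(conn, order_id, sku):
    """Insert an item, assigning the next line_no for this order in application code."""
    (next_no,) = conn.execute(
        "SELECT COALESCE(MAX(line_no), 0) + 1 FROM order_item WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO order_item (order_id, line_no, sku) VALUES (?, ?, ?)",
        (order_id, next_no, sku),
    )
    return next_no


conn.execute("INSERT INTO orders (order_id) VALUES (1), (2)")
add_item(conn, 1, "SKU-A")
add_item(conn, 1, "SKU-B")
add_item(conn, 2, "SKU-A")
rows = conn.execute(
    "SELECT order_id, line_no, sku FROM order_item ORDER BY order_id, line_no"
).fetchall()
```

Note that the MAX + 1 lookup is only safe with a single writer. With concurrent writers you would need a lock or a retry on primary key conflicts.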
Or there could be a system-generated sequential value unique to all the items for all the orders and this field could be the lone primary key. Per-order sequential values could still be generated by row_number functions in queries, but this means a particular item could have different values in different queries. That may or may not be a problem.
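A minimal sketch of the surrogate-key option, again with Python's sqlite3 and hypothetical names (order_item, item_id, numbered_items). It assumes an SQLite build with window function support (3.25 or later). The per-order number is computed at query time rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_item (
        item_id  INTEGER PRIMARY KEY AUTOINCREMENT,  -- system-generated, unique across all orders
        order_id INTEGER NOT NULL,
        sku      TEXT    NOT NULL
    );
    INSERT INTO order_item (order_id, sku) VALUES
        (1, 'SKU-A'), (2, 'SKU-C'), (1, 'SKU-B');
""")


def numbered_items(conn):
    """Return (order_id, line_no, sku), numbering items within each order at query time."""
    return conn.execute("""
        SELECT order_id,
               ROW_NUMBER() OVER (PARTITION BY order_id ORDER BY item_id) AS line_no,
               sku
        FROM order_item
        ORDER BY order_id, item_id
    """).fetchall()


before = numbered_items(conn)
conn.execute("DELETE FROM order_item WHERE sku = 'SKU-A'")
after = numbered_items(conn)  # SKU-B is now line 1 of order 1
```

This shows the caveat mentioned above: once SKU-A is deleted, SKU-B moves from line 2 to line 1, so the computed number is not a stable identifier.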
At this point, only you know enough about your system to choose the best option. But it is generally necessary for users to be able to select one specific item of one specific order, so some sort of key definition for each item is usually necessary.

Related

mapping generalization constraints to sql (STI approach)

I'm trying to model the following relationships between entities, mainly consisting of a partial, disjoint generalization.
[Figure: original EERD]
[Figure: 'mapped' to relational]
Since I didn't need the subclasses to have any particular attributes I decided to use the "single table inheritance" approach, added the "type" field and moved the relationships towards the parent.
After that I had two choices to make:
1- the type to use for the "business type" attribute
2- a way to constrain participation to at most one of the 4 relationships based on the type attribute
For the sake of portability and extensibility I decided to implement no.1 as a lookup table (rather than enum or a hardcoded check).
About no.2 I figured the best way to enforce participation and exclusivity constraints on the four relationships would be a trigger.
The problem is that now I'm not really sure how to write a trigger function; for instance it would have to reference values inserted into business type, so I'd have to make sure those can't be changed or deleted in the future.
I feel like I'm doing something wrong so I wanted to ask for feedback before going further; is this a suitable approach in your opinion?
I found an interesting article describing a possible solution: it feels a bit like a 'hack' but it should work
(it's intended for SQL Server, but it can be easily applied in postgres too).
EDIT:
It consists of adding a type field to the parent table, and then having every child table reference that field along with the parent's id through a foreign key constraint (a UNIQUE constraint on this pair of fields has to be added to the parent beforehand, since a FK must reference a unique set of columns).
Now in order to force the type field to match the table it belongs to, one adds a check constraint/always generated value ensuring that the type column always has the same value
(eg: CHECK(Business_type_id = 1) in the Husbandry table, where 1 represents 'husbandry' in the type lookup table).
The only issue is that it requires a whole column in every subclass table, each containing the same generated value repeated over and over (waste of space?), and it may fall apart as soon as the IDs in the lookup table are modified.
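To make the idea concrete, here is a small sketch of that pattern using Python's sqlite3 (the same DDL shape should carry over to Postgres). The parent table name business, the second type 'retail', and the try_insert helper are assumptions for illustration; only the Husbandry table and the Business_type_id = 1 check come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled
conn.executescript("""
    CREATE TABLE business_type (
        business_type_id INTEGER PRIMARY KEY,
        name             TEXT NOT NULL UNIQUE
    );
    INSERT INTO business_type VALUES (1, 'husbandry'), (2, 'retail');  -- 'retail' is hypothetical

    CREATE TABLE business (
        business_id      INTEGER PRIMARY KEY,
        business_type_id INTEGER NOT NULL REFERENCES business_type(business_type_id),
        UNIQUE (business_id, business_type_id)  -- target of the composite FK below
    );

    CREATE TABLE husbandry (
        business_id      INTEGER PRIMARY KEY,
        -- always 1, so the composite FK can only match a parent row of type 'husbandry'
        business_type_id INTEGER NOT NULL DEFAULT 1 CHECK (business_type_id = 1),
        FOREIGN KEY (business_id, business_type_id)
            REFERENCES business (business_id, business_type_id)
    );
""")
conn.execute("INSERT INTO business VALUES (10, 1), (20, 2)")
conn.execute("INSERT INTO husbandry (business_id) VALUES (10)")  # parent 10 is type 1: accepted


def try_insert(sql):
    """Run an INSERT and report whether the constraints accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Parent 20 is type 2, so there is no (20, 1) row to reference: rejected by the FK.
wrong_parent = try_insert("INSERT INTO husbandry (business_id) VALUES (20)")
# Spoofing the type to match the parent is rejected by the CHECK.
wrong_type = try_insert(
    "INSERT INTO husbandry (business_id, business_type_id) VALUES (20, 2)"
)
```

Because the primary key on husbandry.business_id allows at most one row per business, and each child table pins a different type value, a business can appear in at most one child table.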

Deciding primary key for DynamoDB

I have 3 fields to store in DynamoDB: identity-1, identity-2, score.
identity-1 and identity-2 are always unique in the table, i.e. no two entries can have same identity-1 or identity-2.
We want to allow entries to either have one of identity-1 or identity-2 or have both. Example:
identity-1  identity-2  score
a1          b1          s1
a2          (none)      s2
(none)      b3          s3
Access patterns are as follows:
Query identity-2 from identity-1
Query score from identity-1
Query score from identity-2
How do I define primary key in such case?
This is a "many:1" problem and there are a few ways to tackle it with DynamoDB. The simple answer here is to leverage Global Secondary Indexes (GSIs). For every "identity" you want to do a direct lookup from, you'd create a GSI.
GSI-1 would include Identity-1 as the hash key and you'd include Identity-2 and any other identities as a non-key attribute to include. You'd create a GSI for each identity you wanted to query directly on. You could also include the score as a non-key attribute if you wanted to directly look up score from any identity without having to resolve to the primary key (which we'll talk about).
The thing to consider with GSIs, though, is that they consume extra storage and throughput. If you create a GSI which includes all your attributes for every identity, you'd be paying for an additional copy of your table for each identity.
The other issue, so far, is that you haven't chosen a Primary Key for your table. You'll need a field to be your primary key and if none of your identities is non-nullable, you'll need a field which will be. It's often convenient to just call it what it is, so we'll call it pk.
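As a sketch only, the table and GSI layout described so far could be written as the keyword arguments that boto3's DynamoDB create_table call accepts. The table name, index names, string attribute types and on-demand billing mode are all assumptions, not something from the question.

```python
# Hypothetical table definition: "pk" is the table's hash key, and each
# identity gets its own GSI that projects the attributes needed for lookups.
table_definition = {
    "TableName": "scores",  # assumed name
    "BillingMode": "PAY_PER_REQUEST",  # assumed; avoids per-index throughput settings
    "AttributeDefinitions": [
        {"AttributeName": "pk", "AttributeType": "S"},
        {"AttributeName": "identity-1", "AttributeType": "S"},
        {"AttributeName": "identity-2", "AttributeType": "S"},
    ],
    "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
    "GlobalSecondaryIndexes": [
        {
            # identity-1 -> identity-2 and score
            "IndexName": "gsi-identity-1",
            "KeySchema": [{"AttributeName": "identity-1", "KeyType": "HASH"}],
            "Projection": {
                "ProjectionType": "INCLUDE",
                "NonKeyAttributes": ["identity-2", "score"],
            },
        },
        {
            # identity-2 -> score
            "IndexName": "gsi-identity-2",
            "KeySchema": [{"AttributeName": "identity-2", "KeyType": "HASH"}],
            "Projection": {
                "ProjectionType": "INCLUDE",
                "NonKeyAttributes": ["score"],
            },
        },
    ],
}

# Every attribute used as a key must also be declared in AttributeDefinitions.
declared = {a["AttributeName"] for a in table_definition["AttributeDefinitions"]}
index_keys = {
    k["AttributeName"]
    for gsi in table_definition["GlobalSecondaryIndexes"]
    for k in gsi["KeySchema"]
}
```

With boto3 installed and credentials configured, this could be passed as boto3.client("dynamodb").create_table(**table_definition). Items that lack one of the identities simply do not appear in that identity's index.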
You've got a few choices for pk here. One is to define pk as a composite of your identities. For example: item.pk = item["identity-1"] || item["identity-2"]. Then you could query the table for identity == pk and, if you don't find anything, look up the index for the given identity. This works fine for your simple example, but if you want to do more complex things (such as many different identity types), you might find it to be a bit of a headache.
From past experience, however, my recommendation would be to adjust your approach slightly and have a "users" table and a "scores" table. "users" would have a pk of a GUID unique to every user and all their identities (call it "user_id"); you could then create a GSI on that table for every identity, mapping back to user_id. "scores" would then use "user_id" as its pk as well, with no need for an index. Your application would always resolve to a "user_id" when a user logs in or is otherwise identified. You can then look up a score without needing to track identity, and you can look up all the associated identities or other user information without needing to create a very "fat" index of every identity to every other identity.
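To illustrate the lookup flow only, here is an in-memory stand-in written with plain Python dicts. It is not DynamoDB code: the dicts play the role of the "users" table, the "scores" table and one index per identity, and the function names are made up.

```python
import uuid

users = {}   # user_id -> identities, standing in for the "users" table
scores = {}  # user_id -> score, standing in for the "scores" table
# One mapping per identity type, standing in for the GSIs on "users".
index_by_identity = {"identity-1": {}, "identity-2": {}}


def add_user(identities, score):
    """Create a user with a fresh GUID and register each identity it has."""
    user_id = str(uuid.uuid4())
    users[user_id] = dict(identities)
    scores[user_id] = score
    for kind, value in identities.items():
        index_by_identity[kind][value] = user_id
    return user_id


def score_for(kind, value):
    """Resolve an identity to its user_id first, then read the score by user_id."""
    user_id = index_by_identity[kind].get(value)
    return None if user_id is None else scores[user_id]


def identities_for(kind, value):
    """Resolve an identity to its user_id, then return all identities of that user."""
    user_id = index_by_identity[kind].get(value)
    return None if user_id is None else users[user_id]


add_user({"identity-1": "a1", "identity-2": "b1"}, "s1")
add_user({"identity-1": "a2"}, "s2")
add_user({"identity-2": "b3"}, "s3")
```

All three access patterns from the question go through user_id: score_for covers the two score lookups, and identities_for("identity-1", "a1")["identity-2"] covers identity-2 from identity-1.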

Self References

For an assessment task I'm doing, an entity album has the attribute also_bought, which is a self-referential attribute. However, this one attribute has multiple entries for any one album - as the also_bought recommendations are rarely only one recommendation - and thus, is a bit of a question mark when it comes to normalisation. I'm not sure whether it passes 1NF or not.
To be clear, the entire entity's set is
Album(album_id, title, playtime, genre, release_date, price, also_bought)
"Also bought" items should be stored in a separate table, something like.
AlsoBought (table)
album_id
also_bought_album_id
Then configure foreign keys from both columns to reference Album.album_id.
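A minimal sketch of that layout with Python's sqlite3 follows. The Album columns are trimmed to album_id and title for brevity, the composite primary key on the pair is an assumption (it just prevents duplicate recommendations), and also_bought_titles is a made-up helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is enabled
conn.executescript("""
    CREATE TABLE album (
        album_id INTEGER PRIMARY KEY,
        title    TEXT NOT NULL
    );
    CREATE TABLE also_bought (
        album_id             INTEGER NOT NULL REFERENCES album(album_id),
        also_bought_album_id INTEGER NOT NULL REFERENCES album(album_id),
        PRIMARY KEY (album_id, also_bought_album_id)  -- one row per recommendation pair
    );
    INSERT INTO album VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
    INSERT INTO also_bought VALUES (1, 2), (1, 3), (2, 1);
""")


def also_bought_titles(album_id):
    """Return the titles recommended alongside the given album."""
    rows = conn.execute(
        """
        SELECT a.title
        FROM also_bought AS ab
        JOIN album AS a ON a.album_id = ab.also_bought_album_id
        WHERE ab.album_id = ?
        ORDER BY a.album_id
        """,
        (album_id,),
    ).fetchall()
    return [title for (title,) in rows]
```

Each recommendation is now a single row holding two atomic album ids, so the multi-valued also_bought attribute disappears from Album.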
You mean that Album is a "self-referencing table" because it has a FK (foreign key) from one column list to another in the same table? (A FK constraint holds when subrow values for a column list must appear elsewhere.) If you mean that the type of also_bought is a list of album_ids, there is no FK from the former to the latter, because values for the former (lists of ids) are not values for the latter (ids). There's a constraint that is reminding you of a FK.
Anyway, normalization is done to one table, and doesn't depend on FKs.
But any time you are "normalizing to 1NF" eliminating "non-atomic columns" you have to start by deciding what your "table" "columns" contain. If you decide a cell for a column in a row contains "many values" then you don't have a relational table and you have to come up with one. The easiest way is to assume a set-valued column to get a relation and then follow the standard rules for elimination of too-complex column types.

Attribute creation from one single table

I have one table.
I have to make attributes only from the fields on that table.
I have to use these attributes on one report.
All the attributes I have made are shown as keys. Is this fine? If not, how do I resolve this issue?
The keys are like primary and foreign keys in an RDBMS. They define the joins.
So long as you do not have other tables involved in the design, this is fine.
Ideally, attributes are made only for dimensions.
For example, you could make an attribute called Issue with forms (Issue id, Issue desc, Issue date), with Issue id as the ID form that drives the join with the other tables.
Not all attributes should be keys. Every key indicates that the tool is interpreting that attribute as a primary key. Set a proper (parent-child) relationship between the attributes and you will see keys only for the child attribute(s).

Using a string vs an id as a foreign key in Mongodb

I have a collection users whose documents will belong to a company (and each company can have many users). Because I set a unique index on the company name, can I use the name as the foreign key inside the user document, or is it recommended to use the id instead?
If name is unique and is guaranteed to never change, then you can use it, no problem. Although there were cases in my practice when names turned out to be not-so-unique and not-so-immutable (damn requirement changes). So, just to be extra safe, use the id.