The difference between referenced relation and foreign key? - orientdb

I think that referenced relation is one record has a property which value is the record id of the other record, at the same time, foreign key is that one record has the primary key of the other record. Why the doc2.1.x emphasize that the referenced relation avoid the costly join operation?

OrientDB manages relations as physical links to records, assigned only once when the edge is created. OrientDB not use JOIN. Instead, use the links that has a relationship managed by storing the RID target in the record source. It 'very similar to store a pointer between two objects in memory. An edge connects two vertices and must have: a unique identifier, links vertex incoming, outgoing link vertex and label that defines the type of connection.
This is a little example:
Hope it helps

Related

When to use an owned entity types vs just creating a foreign key or adding the columns directly to the table?

I was reading about owned entity types here https://learn.microsoft.com/en-us/ef/core/modeling/owned-entities#feedback and I was wondering when I would use that. Especially when using .ToTable(); although I am not sure if ToTable creates a relationship with keys.
I read the entire article so I understand that it essentially forces you to access the data via nav properties and prevents the owned table from being treated as an entity. They also say Include() is not needed and the data comes down with every query for the parent table so its not like you are reducing the amount of data that comes back.
So whats the point exactly? Also whats the point of "table splitting"?
It takes the place of Complex types with the option to set it up like a 1-1 relationship /w ToTable while automatically eager-loaded. This would use the same PK in both tables, same as 1-1.
The point Table-splitting would be that you want an object model that is normalized, where the table structure is not. This would fit scenarios where you have an existing table structure and want to split off related pieces of that data into sub-entities associated with the main entity. With the ToTable option, it would be similar to a 1-1 relationship, but automatically eager-loaded. However when considering the reasons to use a 1-1 relationship I would consider this option a bad choice.
The common reasons for using it in normal 1-1 relationships would include:
Splitting off expensive to load, rarely used data. (images, binary, memo)
Encapsulating data particular to a single application off of a common entity. i.e. if I have a "Customer" which is used by a billing system vs. a CRM I might have "CustomerBillingData" and "CustomerCRMData" owned by "Customer" rather than an inherited BillingCustomer / CRMCustomer. As there is a "single" customer that may serve one or both systems. Billing doesn't care about CRM data, CRM doesn't care about Billing. If all data is in "Customer" then both systems potentially need to be updated, and I cannot rely on constraints when the data is optional to the other system. By using composition I can enforce required data for a particular system.
In neither of these cases would I want to use table-splitting or anything that automatically eager-loads, so Owned Types /w ToTable would not replace 1-1 relationships by any stretch. It's essentially a more strict version of complex types, I'd say it's strictly used for entity organization. Not something I'd admit to wanting to use very often.

How to stop EF Core from indexing all foreign keys

As documented in questions like Entity Framework Indexing ALL foreign key columns, EF Core seems to automatically generate an index for every foreign key. This is a sound default for me (let's not get into an opinion war here...), but there are cases where it is just a waste of space and slowing down inserts and updates. How do I prevent it on a case-by-case basis?
I don't want to wholly turn it off, as it does more good than harm; I don't want to have to manually configure it for all those indices I do want. I just want to prevent it on specific FKs.
Related side question: is the fact that these index are automatically created mentioned anywhere in the EF documentation? I can't find it anywhere, which is probably why I can't find how to disable it?
Someone is bound to question why I would want to do this... so in the interest of saving time, the OPer of the linked question gave a great example in a comment:
We have a People table and an Addresses table, for example. The
People.AddressID FK was Indexed by EF but I only ever start from a
People row and search for the Addresses record; I never find an
Addresses row and then search the People.AddressID column for a
matching record.
EF Core has a configuration option to replace one of its services.
I found replacing IConventionSetBuilder to custom one would be a much cleaner approach.
https://giridharprakash.me/2020/02/12/entity-framework-core-override-conventions/
If it is really necessary to avoid the usage of some foreign keys indices - as far as I know (currently) - in .Net Core, it is necessary to remove code that will set the indices in generated migration code file.
Another approach would be to implement a custom migration generator in combination with an attribute or maybe an extension method that will avoid the index creation. You could find more information in this answer for EF6: EF6 preventing not to create Index on Foreign Key. But I'm not sure if it will work in .Net Core too. The approach seems to be bit different, here is a MS doc article that should help.
But, I strongly advise against doing this! I'm against doing this, because you have to modify generated migration files and not because of not using indices for FKs. Like you mentioned in question's comments, in real world scenarios some cases need such approach.
For other people they are not really sure if they have to avoid the usage of indices on FKs and therefor they have to modify migration files:
Before you go that way, I would suggest to implement the application with indices on FKs and would check the performance and space usage. Therefor I would produce a lot test data.
If it really results in performance and space usage issues on a test or QA stage, it's still possible to remove indices in migration files.
Because we already chat about EnsureCreated vs migrations here for completeness further information about EnsureCreated and migrations (even if you don't need it :-)):
MS doc about EnsureCreated() (It will not update your database if you have some model changes - migrations would do it)
interesting too (even if for EF7) EF7 EnsureCreated vs. Migrate Methods
Entity Framework core 2.0 (the latest version available when the question was asked) doesn't have such a mechanism, but EF Core 2.2 just might - in the form of Owned Entity Types.
Namely, since you said:
" I only ever start from a People row and search for the Addresses record; I never find an Addresses row"
Then you may want to make the Address an Owned Entity Type (and especially the variant with 'Storing owned types in separate tables', to match your choice of storing the address information in a separate Addresses table).
The docs of the feature seem to say a matching:
"Owned entities are essentially a part of the owner and cannot exist without it"
By the way, now that the feature is in EF, this may justify why EF always creates the indexes for HasMany/HasOne. It's likely because the Has* relations are meant to be used towards other entities (as opposed to 'value objects') and these, since they have their own identity, are meant to be queried independently and allow accessing other entities they relate to using navigational properties. For such a use case, it would be simply dangerous use such navigation properties without indexes (a few queries could make the database slow down hugely).
There are few caveats here though:
Turning an entity into an owned one doesn't instruct EF only about the index, but rather it instructs to map the model to database in a way that is a bit different (more on this below) but the end effect is in fact free of that extra index on People.
But chances are, this actually might be the better solution for you: this way you also say that no one should query the Address (by not allowing to create a DbSet<T> of that type), minimizing the chance of someone using it to reach the other entities with these costly indexless queries.
As to what the difference is, you'll note that if you make the Address owned by Person, EF will create a PersonId column in the Address table, which is different to your AddressId in the People table (in a sense, lack of the foreign key is a bit of a cheat: an index for querying Person from Address is there, it's just that it's the primary key index of the People table, which was there anyways). But take note that this design is actually rather good - it not only needs one column less (no AddressId in People), but it also guarantees that there's no way to make orphaned Address record that your code will never be able to access.
If you would still like to keep the AddressId column in the Addresses, then there's still one option:
Just choose a name of AddressId for the foreign key in the Addresses table and just "pretend" you don't know that it happens to have the same values as the PersonId :)
If that option isn't funny (e.g. because you can't change your database schema), then you're somewhat out of luck. But do take note that among the Current shortcomings of EF they still list "Instances of owned entity types cannot be shared by multiple owners", while some shortcomings of the previous versions are already listed as addressed. Might be worth watching that space as, it seems to me, resolving that one will probably involve introducing the ability to have your AddressId in the People, because in such a model, for the owned objects to be shared among many entities the foreign keys would need to be sitting with the owning entities to create an association to the same value for each.
in the OnModelCreating override
AFTER the call to
base.OnModelCreating(modelBuilder);
add:
var indexForRemoval = modelBuilder.Entity<You_Table_Entity>().HasIndex(x => x.Column_Index_Is_On).Metadata;
modelBuilder.Entity<You_Table_Entity>().Metadata.RemoveIndex(indexForRemoval);
'''

How to avoid duplicate records with user edits in a referential system?

PostgreSQL 10.1
I am curious to see how the SO Community deals with this common database problem.
The problem is this. I have textual descriptions of various problems being entered at the desktop into the table ICD_Descriptions (the middle box below). Generally speaking there will also be one code associated with each description. However, over time (i.e., years), the codes for a specific text phrase/description will change. Hence, a general many-to-many relationship will exist for some codes to descriptions. So a third table, dx_log, is being used to allow for a many-to-many relationship between the code and the descriptions. Lastly, other "children" tables that need to see a specific combination of 'code-description' will be given a reference to the primary key (the recid) of the dx_log. I believe this arrangement to be fairly standard database management.
O.K., now for the problem. I wish the codes in the icd_code table to be unique to that table. I also wish the descriptions in the icd_description table to be unique to their table.
The problem. This being a referential system, changes to either the "data" part of the code table, (the code), or the description (in the descriptions table) will be seen in the child tables.
But, how to correctly manage user edits to a code or description that would conflict with the "unique" rule of the respective table?
As an example, lets say the initial text of a description is "this is mispelled", whereas another description text has "this is misspelled". At one point it time, both phrases coexist and are unique. However, at a later point in time, the wrong record is corrected (i.e., mispelled --->misspelled). When the edits are attempted to be saved to the file, it will be detected that the correct record already exists.
So if the dx_log table is also unique on (icd_description_recid, icd_code_recid), then simply replacing the reference to the already existing correctly spelled record in the dx_log will lead to a violation of uniqueness for the dx_log table. Therefore, I can think of only three solutions:
The table which the children tables actually reference CAN NOT HAVE A UNIQUES constraint on its reference pointers,
Leave the uniqueness constraint on the dx_log, but when a conflict would violate uniqueness, then use a migrate procedure to move the child table references to the existing record (in postgresql this would be heavy use of the catalogs) to the "new record", then delete the existing record before adding the new record.
Add an additional self-reference pointer in the dx_log such that when a record would conflict with an already existing record in dx_log, then don't change it but rather place a pointer to the already existing "correct" record in the dx_log.
I hope I've explained my question well enough. What is the recommended approach?
Thanks for any comments.
I would say the 2. is the correct solution.
Yes, that requires that all dependent records in dx_log have to be updated when two entries are consolidated, but you have an index on dx_log(icd_description_recid) anyway, right?
The other solutions compromise data consistency and will make all queries on the system more complicated and probably slower.
Separate the idea of a natural unique constraint from the idea of a key. Use a system generated surrogate key for referential integrity, and a UNIQUE INDEX for the natural key. Then you don't have to reparent anything.

Null values in relational database

I'm new in PostgreSQL(still learning)
I'm trying to create a relational database for a venue.
In my table(still in UNF) I have attribute to store the client's name, phone, email.
The problem is that the client will give maybe 2 or 1 info on him. So I will always have null values.
Sometimes I can get all the client's values(for the 3 attribute)
How am i supposed to deal with this in the normalization process?
Do I need to separate the tables in other relation. If so 3 relations is not too much?
For every attribute that should be there once, use a column in the main table. "Should" indicates it might be missing / unknown, too. That's a NULL value then. If the attribute must be there, define the column NOT NULL.
Attributes where there can be multiple distinct instances, especially if the maximum number is uncertain, create a separate table in a one-to-many relationship.
Store (non-trivial) attributes that can be used in many rows of the main table, in a separate table in a many-to-one relationship.
And attributes that can be linked multiple times on either side are best implemented in a many-to-many relationship.
Referential integrity is enforced with foreign key constraints.
It's not nearly as complex as reality, but the point is to establish a logically valid model that can keep up with reality.
Read basics about database normalization.
Detailed code example with explanation and links for n:m relationship:
How to implement a many-to-many relationship in PostgreSQL?

I don't need/want a key!

I have some views that I want to use EF 4.1 to query. These are specific optimized views that will not have keys to speak of; there will be no deletions, updates, just good ol'e select.
But EF wants a key set on the model. Is there a way to tell EF to move on, there's nothing to worry about?
More Details
The main purpose of this is to query against a set of views that have been optimized by size, query parameters and joins. The underlying tables have their PKs, FKs and so on. It's indexed, statiscized (that a word?) and optimized.
I'd like to have a class like (this is a much smaller and simpler version of what I have...):
public MyObject //this is a view
{
Name{get;set}
Age{get;set;}
TotalPimples{get;set;}
}
and a repository, built off of EF 4.1 CF where I can just
public List<MyObject> GetPimply(int numberOfPimples)
{
return db.MyObjects.Where(d=> d.TotalPimples > numberOfPimples).ToList();
}
I could expose a key, but whats the real purpose of dislaying a 2 or 3 column natural key? That will never be used?
Current Solution
Seeming as their will be no EF CF solution, I have added a complex key to the model and I am exposing it in the model. While this goes "with the grain" on what one expects a "well designed" db model to look like, in this case, IMHO, it added nothing but more logic to the model builder, more bytes over the wire, and extra properties on a class. These will never be used.
There is no way. EF demands unique identification of the record - entity key. That doesn't mean that you must expose any additional column. You can mark all your current properties (or any subset) as a key - that is exactly how EDMX does it when you add database view to the model - it goes through columns and marks all non-nullable and non-computed columns as primary key.
You must be aware of one problem - EF internally uses identity map and entity key is unique identification in this map (each entity key can be associated only with single entity instance). It means that if you are not able to choose unique identification of the record and you load multiple records with the same identification (your defined key) they will all be represented by a single entity instance. Not sure if this can cause you any issues if you don't plan to modify these records.
EF is looking for a unique way to identify records. I am not sure if you can force it to go counter to its nature of desiring something unique about objects.
But, this is an answer to the "show me how to solve my problem the way I want to solve it" question and not actually tackling your core business requirement.
If this is a "I don't want to show the user the key", then don't bind it when you bind the data to your form (web or windows). If this is a "I need to share these items, but don't want to give them the keys" issue, then map or surrogate the objects into an external domain model. Adds a bit of weight to the solution, but allows you to still do the heavy lifting with a drag and drop surface (EF).
The question is what is the business requirement that is pushing you to create a bunch of objects without a unique identifier (key).
One way to do this would be not to use views at all.
Just add the tables to your EF model and let EF create the SQL that you are currently writing by hand.