Convert Relational Database model to AGE database

I have made an ER diagram of a course management system based on the relational database model.
Here is the picture:
[Course Management System] (https://i.stack.imgur.com/RBNfg.png)
How can I map these entities to Apache AGE?
What will be the vertices and what will be the edges in this case?
I have an ER diagram of a course management system based on the relational database model, and I want guidance on converting it to the Apache AGE data model.

I can give you an idea to a certain extent.
Create a graph with some name (e.g. Registration_Details).
The nodes/vertices will be persons.
Since there are two types of person, you can use vertex labels to distinguish between them (labels Student and Teacher).
The properties of a vertex with label Student will be {'Student_ID': , 'Address': , 'CGPA': }, and similarly for a vertex with label Teacher.
Now for the registration details, you'll create an edge between a Student and a Teacher for each registration. Every edge you create carries that registration's properties as well.
Example:
StudentA (vertex {StudID: , ...}) ------- edge1 ({'RegId': 123, 'CourseId': 789, 'Marks': 80}) ------ TeacherA (vertex)
I have not included StudentID and TeacherID in edge1 because in Apache AGE an edge already stores the IDs of the vertices it connects.
Hope this gives you a head start.
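To make the idea above a bit more concrete, here is a minimal Python sketch that only builds the SQL strings you would send to PostgreSQL with the AGE extension loaded (AGE normally needs LOAD 'age' and ag_catalog on the search_path first). create_graph and cypher(...) are AGE's own functions; the REGISTERED edge label, the Teacher_ID property and all sample values are my assumptions, not something taken from your diagram.

```python
def cypher_value(value):
    """Render a Python value as a Cypher literal (strings are single-quoted)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "\\'") + "'"
    return str(value)


def cypher_map(props):
    """Render a dict as a Cypher property map, e.g. {RegId: 123}."""
    return "{" + ", ".join(f"{k}: {cypher_value(v)}" for k, v in props.items()) + "}"


def age_query(graph, cypher):
    """Wrap a Cypher statement in the SQL call that AGE expects."""
    return f"SELECT * FROM cypher('{graph}', $$ {cypher} $$) AS (result agtype);"


def create_vertex(label, props):
    return f"CREATE (:{label} {cypher_map(props)})"


def create_edge(src, dst, label, props):
    """src and dst are (vertex_label, key_properties) pairs used to find the endpoints."""
    a = f"(a:{src[0]} {cypher_map(src[1])})"
    b = f"(b:{dst[0]} {cypher_map(dst[1])})"
    return f"MATCH {a}, {b} CREATE (a)-[:{label} {cypher_map(props)}]->(b)"


GRAPH = "Registration_Details"

statements = [
    f"SELECT create_graph('{GRAPH}');",
    age_query(GRAPH, create_vertex("Student", {"Student_ID": 1, "CGPA": 3.5})),
    age_query(GRAPH, create_vertex("Teacher", {"Teacher_ID": 7})),  # hypothetical key
    age_query(GRAPH, create_edge(
        ("Student", {"Student_ID": 1}),
        ("Teacher", {"Teacher_ID": 7}),
        "REGISTERED",  # hypothetical edge label
        {"RegId": 123, "CourseId": 789, "Marks": 80},
    )),
]

for statement in statements:
    print(statement)
```

The sketch does no real escaping beyond single quotes, so treat it as an illustration of the shape of the statements rather than something to run against untrusted input.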

As a rule of thumb, table names in relational databases become label names for vertices, table columns become the properties of those vertices, and the relationships between tables (defined by foreign keys in relational databases) become the edges between vertices.
However, while you could technically map relational databases 1:1 onto graph databases, this may not be the best approach because graphs can offer us more powerful design choices. Check out this blog post for more info:
https://maxdemarzi.com/2015/08/26/modeling-airline-flights-in-neo4j/

You should first write down the relationships between the tables. Each table becomes a vertex whose properties are the table's columns, and each vertex (table) is connected to another vertex (table) through an edge (relationship). The edge can also have properties, depending on how you want to use the data.

You need to create a vertex label for each entity (table), and then define the edges that connect the vertices based on the relationships in the ER diagram. You will also need to define properties for the edges, as well as for the vertices, as needed.

Your ERD already defines the tables, so you just have to convert those tables into vertices; the relationships between the tables become the edges between those vertices.
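The mapping these answers describe (tables to vertex labels, columns to properties, foreign keys to edges) can be sketched in plain Python without any database. The Teacher and Course tables, the TAUGHT_BY label and the rows below are made up for illustration, since the actual ER diagram is only available as an image.

```python
def tables_to_graph(tables, foreign_keys):
    """Turn relational rows into graph vertices and edges.

    tables:       {table_name: [row dicts, each with an 'id' column]}
    foreign_keys: [(child_table, fk_column, parent_table, edge_label)]
    """
    vertices = {}
    for table, rows in tables.items():
        # Foreign-key columns become edges, so they are left out of the properties.
        fk_columns = {col for child, col, _, _ in foreign_keys if child == table}
        for row in rows:
            props = {k: v for k, v in row.items() if k not in fk_columns}
            vertices[(table, row["id"])] = {"label": table, "properties": props}

    edges = []
    for child, col, parent, label in foreign_keys:
        for row in tables[child]:
            if row[col] is not None:
                edges.append(((child, row["id"]), label, (parent, row[col])))
    return vertices, edges


# Hypothetical sample data.
tables = {
    "Teacher": [{"id": 7, "name": "Ada"}],
    "Course": [{"id": 789, "title": "Databases", "teacher_id": 7}],
}
foreign_keys = [("Course", "teacher_id", "Teacher", "TAUGHT_BY")]

vertices, edges = tables_to_graph(tables, foreign_keys)
print(vertices)
print(edges)
```

This is the 1:1 mapping only; as noted above, a graph model often benefits from going beyond it.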

Related

Inheriting Parent Table with identifier (Postgres)

Sorry if this is a relatively easy problem to solve; I read the docs on inheritance and I'm still confused on how I would do this.
Let's say I have the parent table car_model, which has the name of the car and some of its features as columns (e.g. car_name, car_description, car_year, etc). Basically a list of cars.
I have the child table car_user, which has the column user_id.
Basically, I want to link a car to the car_user, so when I call
SELECT car_name FROM car_user WHERE user_id = 'name', I could retrieve the car_name. I would need a linking component that links car_user to the car.
How would I do this?
I was thinking of doing something like having car_name column in car_user, so when I create a new data row in car_user, it could link the 2 together.
What's the best way to solve this problem?

Inheritance is something completely different. You should read about foreign keys and joins.
If one user drives only one car, but many users can drive the same car, you need to build a one-to-many relation. Add car_name to your user table and JOIN using that field.
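A minimal sketch of that foreign key plus JOIN, using Python's built-in sqlite3 module so it runs without a server (the question is about Postgres, where the SQL is essentially the same). The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car_model (
        car_name        TEXT PRIMARY KEY,
        car_description TEXT,
        car_year        INTEGER
    );
    -- Many users may reference the same car: a one-to-many relation.
    CREATE TABLE car_user (
        user_id  TEXT PRIMARY KEY,
        car_name TEXT REFERENCES car_model(car_name)
    );
    INSERT INTO car_model VALUES ('Civic', 'Compact sedan', 2020);
    INSERT INTO car_user VALUES ('alice', 'Civic');
    INSERT INTO car_user VALUES ('bob', 'Civic');
""")

# Look up the car details for one user by joining on the shared column.
row = conn.execute(
    """
    SELECT m.car_name, m.car_year
    FROM car_user AS u
    JOIN car_model AS m ON m.car_name = u.car_name
    WHERE u.user_id = ?
    """,
    ("alice",),
).fetchone()
print(row)
```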

Entity Framework: Doing JOINs without having to creating Entities

Just starting out with Entity Framework (Code First) and I have to say I am having a lot of problems with it when loading SQL data that is fairly complex. For example, let's say I have the following tables, which store which animals belong to which regions of the world; the animals are also categorized.
Table: Region
  Id: integer
  Name: string
Table: AnimalCategory
  Id: integer
  Name: string
  RegionId: integer -- Refers back to Region
Table: Animal
  Id: integer
  AnimalCategoryId: integer -- Refers back to AnimalCategory
Let's say I want to create a query with Entity Framework that would load all Animals for a specific region. The easiest thing to do is to create 3 Entities Region, AnimalCategory, and Animal and use LINQ to load the data.
But let's say I am not interested in loading any AnimalCategory information, and I don't want to define an Entity class just to represent AnimalCategory so that I can do the JOIN. How can I do this with Entity Framework? Even with its many mapping functions I still don't think this is possible.
In non Entity Framework solutions this is easy to accomplish by using INNER JOINs in SPs or inline SQL. So what are my options in Entity Framework? Shall I pollute my data model with these useless tables just so I can do a JOIN?

It's a matter of choice, I guess. EF chose to support many-to-many associations with transparent junction tables, i.e. where junction tables only have two foreign keys to the associated entities. They simply didn't choose to support this far less common "skipping one-to-many-to-many" scenario in a similar manner.
And I can imagine why.
To start with, in a many-to-many association, the junction table is nothing but that: a junction, an association. However, in a chain of one-to-many (or many-to-one) associations it would be exceptional for any of the involved tables to be just an association. In your example...
Animal → AnimalCategory → Region
...AnimalCategory would only have a primary key (Id) and a foreign key (RegionId). That would be useless though: Animal might just as well have a RegionId itself. There's no reason to support a data model that doesn't make sense.
What you're after though, is a model in which the table in the middle does carry information (AnimalCategory.Name), but where you'd like to map it as a transparent junction table, because a particular class model doesn't need this information.
Your focus seems to be on reading data. But EF has to support all CRUD actions. The problem here would be: how to deal with inserts? Suppose Name is a required field. There would be no way to supply its value.
Another problem would be that a statement like...
region.Animals.Add(animal);
...could mean two things:
- Add an Animal and a new AnimalCategory, the latter referring to the Region.
- Add an Animal referring to an existing AnimalCategory, without being able to choose which one.
EF wouldn't want to choose for some default behavior. You'd have to make the choice yourself, so you can't do without access to AnimalCategory.
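The asymmetry between reading and inserting can be seen in plain SQL as well. Here is a small sketch using Python's sqlite3 module with the three tables from the question (sample rows are made up): reading animals per region only needs a join through AnimalCategory, but an insert still has to name a category.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Region (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE AnimalCategory (
        Id       INTEGER PRIMARY KEY,
        Name     TEXT NOT NULL,
        RegionId INTEGER NOT NULL REFERENCES Region(Id)
    );
    CREATE TABLE Animal (
        Id               INTEGER PRIMARY KEY,
        AnimalCategoryId INTEGER NOT NULL REFERENCES AnimalCategory(Id)
    );
    INSERT INTO Region VALUES (1, 'Africa');
    INSERT INTO AnimalCategory VALUES (10, 'Mammals', 1), (11, 'Birds', 1);
    INSERT INTO Animal VALUES (100, 10), (101, 11);
""")


def animals_in_region(region_id):
    """Read side: join through AnimalCategory without returning any of its columns."""
    rows = conn.execute(
        """
        SELECT a.Id
        FROM Animal AS a
        JOIN AnimalCategory AS c ON c.Id = a.AnimalCategoryId
        WHERE c.RegionId = ?
        ORDER BY a.Id
        """,
        (region_id,),
    )
    return [r[0] for r in rows]


# Write side: there is no shortcut, the category has to be chosen explicitly.
conn.execute("INSERT INTO Animal (Id, AnimalCategoryId) VALUES (?, ?)", (102, 11))
print(animals_in_region(1))
```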

How to expose extra column data for each pairing on a many to many relationship?

I have a Cars table, a BodyPaints table, and a Cars_BodyPaints association table that defines what paint jobs are possible for various cars. The association between Cars and BodyPaints is many to many.
I'd like to create an extra piece of data for each (Car, Paint) mapping to define the cost of a specific paint job on a specific car. I have added an extra column to Cars_BodyPaints to record the price of each pairing, but when I update the model classes from the DB I don't see it exposed in any of the entity classes. This leads me to believe that maybe my approach is wrong here. I was expecting some method to be generated so I could execute code like:
Car civic = (from c in context.Cars where c.Name == "Civic" select c).Single();
Paint red = (from p in civic.Paints where p.Name == "Red" select p).Single();
var price = civic.GetPrice(red);
Am I off base here? How would you accomplish this?
Thanks

When modeling many-to-many relationships with attributes, you will need to model the relation as an entity itself. You would create a new entity for the Cars_BodyPaints table and it would contain your extra price column as a property. This new intermediary entity must be traversed when considering the relation between Cars and BodyPaints and can also be directly queried.
Cars <-- Cars_BodyPaints --> BodyPaints
In this new model, your query would look more like:
(from bp in context.CarBodyPaints
 where bp.Car.Name == "Civic" && bp.Paint.Name == "Red"
 select bp.Price).Single()
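At the table level, the same idea is a link table that carries the extra column and is queried like any other table. A rough sketch with Python's sqlite3 module follows; the column names CarId, PaintId and Price and the prices are assumptions, since the question does not show the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Cars (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE BodyPaints (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    -- The association table is an entity in its own right: it owns Price.
    CREATE TABLE Cars_BodyPaints (
        CarId   INTEGER NOT NULL REFERENCES Cars(Id),
        PaintId INTEGER NOT NULL REFERENCES BodyPaints(Id),
        Price   REAL NOT NULL,
        PRIMARY KEY (CarId, PaintId)
    );
    INSERT INTO Cars VALUES (1, 'Civic');
    INSERT INTO BodyPaints VALUES (1, 'Red'), (2, 'Blue');
    INSERT INTO Cars_BodyPaints VALUES (1, 1, 450.0), (1, 2, 500.0);
""")


def paint_price(car_name, paint_name):
    """Return the price of one (car, paint) pairing, or None if it is not offered."""
    row = conn.execute(
        """
        SELECT cbp.Price
        FROM Cars_BodyPaints AS cbp
        JOIN Cars AS c ON c.Id = cbp.CarId
        JOIN BodyPaints AS p ON p.Id = cbp.PaintId
        WHERE c.Name = ? AND p.Name = ?
        """,
        (car_name, paint_name),
    ).fetchone()
    return None if row is None else row[0]


print(paint_price("Civic", "Red"))
```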

Entity Framework many-to-many question

Please help an EF n00b design his database.
I have several companies that produce several products, so there's a many-to-many relationship between companies and products. I have an intermediate table, Company_Product, that relates them.
Each company/product combination has a unique SKU. For example Acme widgets have SKU 123, but Omega widgets have SKU 456. I added the SKU as a field in the Company_Product intermediate table.
EF generated a model with a 1:* relationship between the company and Company_Product tables, and a 1:* relationship between the product and Company_Product tables. I really want a *:* relationship between company and product. But, most importantly, there's no way to access the SKU directly from the model.
Do I need to put the SKU in its own table and write a join, or is there a better way?

I just tested this in a new VS2010 project (EFv4) to be sure, and here's what I found:
When your associative table in the middle (Company_Product) has ONLY the 2 foreign keys to the other tables (CompanyID and ProductID), then adding all 3 tables to the designer ends up modeling the many to many relationship. It doesn't even generate a class for the Company_Product table. Each Company has a Products collection, and each Product has a Companies collection.
However, if your associative table (Company_Product) has other fields (such as SKU, its own primary key, or other descriptive fields like dates, descriptions, etc), then the EF modeler will create a separate class, and it does what you've already seen.
Having the class in the middle with 1:* relationships out to Company and Product is not a bad thing, and you can still get the data you want with some easy queries.
// Get all products for Company with ID = 1
var q =
from compProd in context.Company_Product
where compProd.CompanyID == 1
select compProd.Product;
True, it's not as easy to just navigate the relationships of the model, when you already have your entity objects loaded, for instance, but that's what a data layer is for. Encapsulate the queries that get the data you want. If you really want to get rid of that middle Company_Product class, and have the many-to-many directly represented in the class model, then you'll have to strip down the Company_Product table to contain only the 2 foreign keys, and get rid of the SKU.
Actually, I shouldn't say you HAVE to do that...you might be able to do some edits in the designer and set it up this way anyway. I'll give it a try and report back.
UPDATE
Keeping the SKU in the Company_Product table (meaning my EF model had 3 classes, not 2; it created the Company_Product class, with a 1:* to the other 2 tables), I tried to add an association directly between Company and Product. The steps I followed were:
1. Right click on the Company class in the designer
2. Add > Association
3. Set "End" on the left to be Company (it should be already)
4. Set "End" on the right to Product
5. Change both multiplicities to "* (Many)"
6. The navigation properties should be named "Products" and "Companies"
7. Hit OK.
8. Right click on the association in the model > click "Table Mapping"
9. Under "Add a table or view" select "Company_Product"
10. Map Company -> ID (on left) to CompanyID (on right)
11. Map Product -> ID (on left) to ProductID (on right)
But, it doesn't work. It gives this error:
Error 3025: Problem in mapping fragments starting at line 175:Must specify mapping for all key properties (Company_Product.SKU) of table Company_Product.
So that particular association is invalid, because it uses Company_Product as the table, but doesn't map the SKU field to anything.
Also, while I was researching this, I came across this "Best Practice" tidbit from the book Entity Framework 4.0 Recipes (note that for an association table with extra fields besides the 2 FKs, they refer to the extra fields as the "payload". In your case, SKU is the payload in Company_Product).
Best Practice
Unfortunately, a project that starts out with several, payload-free, many-to-many relationships often ends up with several, payload-rich, many-to-many relationships. Refactoring a model, especially late in the development cycle, to accommodate payloads in the many-to-many relationships can be tedious. Not only are additional entities introduced, but the queries and navigation patterns through the relationships change as well. Some developers argue that every many-to-many relationship should start off with some payload, typically a synthetic key, so the inevitable addition of more payload has significantly less impact on the project.
So here's the best practice. If you have a payload-free, many-to-many relationship and you think there is some chance that it may change over time to include a payload, start with an extra identity column in the link table. When you import the tables into your model, you will get two one-to-many relationships, which means the code you write and the model you have will be ready for any number of additional payload columns that come along as the project matures. The cost of an additional integer identity column is usually a pretty small price to pay to keep the model more flexible.
(From Chapter 2. Entity Data Modeling Fundamentals, 2.4. Modeling a Many-to-Many Relationship with a Payload)
Sounds like good advice. Especially since you already have a payload (SKU).

I would just like to add the following to Samuel's answer:
If you want to directly query from one side of a many-to-many relationship (with payload) to the other, you can use the following code (using the same example):
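Company c = context.Companies.First();
// Select over the loaded navigation collection yields IEnumerable<Product>, not IQueryable<Product>.
IEnumerable<Product> products = c.Company_Products.Select(cp => cp.Product);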
The products variable would then be all Product records associated with the Company c record. If you would like to include the SKU for each of the products, you could use an anonymous class like so:
var productsWithSKU = c.Company_Products.Select(cp => new {
ProductID = cp.Product.ID,
Name = cp.Product.Name,
Price = cp.Product.Price,
SKU = cp.SKU
});
foreach (var
You can encapsulate the first query in a read-only property for simplicity like so:
public partial class Company
{
    // Read-only convenience property; "property" is not a C# keyword, so it is omitted.
    public IEnumerable<Product> Products
    {
        get { return Company_Products.Select(cp => cp.Product); }
    }
}
You can't do that with the query that includes the SKU because you can't return anonymous types. You would have to have a definite class, which would typically be done by either adding a non-mapped property to the Product class or creating another class that inherits from Product that would add an SKU property. If you use an inherited class though, you will not be able to make changes to it and have it managed by EF - it would only be useful for display purposes.
Cheers. :)
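The shape of that advice (a read-only shortcut across the association class, plus a named class instead of an anonymous type when the SKU has to travel with the product) can be sketched language-neutrally. Below is a small Python illustration with dataclasses; it is not EF code, and the ProductWithSku class name is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    id: int
    name: str


@dataclass
class CompanyProduct:
    """The association entity: it owns the payload (the SKU)."""
    product: Product
    sku: str


@dataclass
class ProductWithSku:
    """Named result type, standing in for the anonymous type that cannot be returned."""
    product_id: int
    name: str
    sku: str


@dataclass
class Company:
    name: str
    company_products: list = field(default_factory=list)

    @property
    def products(self):
        # Read-only shortcut that hides the association entity.
        return [cp.product for cp in self.company_products]

    def products_with_sku(self):
        return [
            ProductWithSku(cp.product.id, cp.product.name, cp.sku)
            for cp in self.company_products
        ]


widget = Product(1, "Widget")
acme = Company("Acme", [CompanyProduct(widget, "123")])
print(acme.products)
print(acme.products_with_sku())
```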

No-sql relations question

I'm willing to give MongoDB and CouchDB a serious try. So far I've worked a bit with Mongo, but I'm also intrigued by Couch's RESTful approach.
Having worked for years with relational DBs, I still don't get what the best way is to get some things done with non-relational databases.
For example, if I have 1000 car shops and 1000 car types, I want to specify what kind of cars each shop sells. Each car has 100 features. Within a relational database I'd make a middle table to link each car shop with the car types it sells via IDs. What is the NoSQL approach? If every car shop sells 50 car types, it means replicating a huge amount of data if I have to store, within the car shop, all the features of all the car types it sells!
Any help appreciated.

I can only speak to CouchDB.
The best way to stick your data in the db is to not normalize it at all beyond converting it to JSON. If that data is "cars" then stick all the data about every car in the database.
You then use map/reduce to create a normalized index of the data. So, if you want an index of every car, sorted first by shop, then by car-type you would emit each car with an index of [shop, car-type].
Map reduce seems a little scary at first, but you don't need to understand all the complicated stuff or even btrees, all you need to understand is how the key sorting works.
http://wiki.apache.org/couchdb/View_collation
With that alone you can create amazing normalized indexes over differing documents with the map reduce system in CouchDB.
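To show what "emit each car with an index of [shop, car-type]" buys you, here is a plain-Python imitation of a view: a map function that yields compound keys, and a sorted list standing in for the view index. It does not talk to CouchDB, the documents are made up, and Python's list ordering is only an approximation of CouchDB's collation rules.

```python
cars = [
    {"_id": "c1", "shop": "shop-b", "car_type": "sedan"},
    {"_id": "c2", "shop": "shop-a", "car_type": "suv"},
    {"_id": "c3", "shop": "shop-a", "car_type": "coupe"},
]


def map_car(doc):
    """Plays the role of a view's map function: yield (key, value) pairs."""
    yield [doc["shop"], doc["car_type"]], doc["_id"]


# A view keeps its rows sorted by key; sorting the emitted rows imitates that.
index = sorted(row for doc in cars for row in map_car(doc))


def cars_for_shop(shop):
    """All rows whose compound key starts with the given shop, in key order."""
    return [value for key, value in index if key[0] == shop]


for key, value in index:
    print(key, value)
print(cars_for_shop("shop-a"))
```

Because the rows are ordered by shop first and car type second, everything for one shop sits in a contiguous key range, which is what makes range queries on such a view cheap.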

In MongoDB an often-used approach is to store a list of _ids of car types in each car shop. So there is no separate join table, but you are still basically doing a client-side join.
Embedded documents become more relevant for cases that aren't many-to-many like this.
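A rough picture of that layout, using plain Python dicts in place of the two collections (no driver involved; the field name car_type_ids and the documents are made up). With a real database the second step would be a query on the car types collection by the list of _ids.

```python
# Stand-in for a car_types collection, keyed by _id.
car_types = {
    "t1": {"_id": "t1", "name": "Civic", "features": ["abs", "airbag"]},
    "t2": {"_id": "t2", "name": "Model S", "features": ["autopilot"]},
}

# Stand-in for a shops collection: each shop stores only the ids of what it sells.
shops = [
    {"_id": "s1", "name": "Downtown Cars", "car_type_ids": ["t1", "t2"]},
    {"_id": "s2", "name": "Uptown Cars", "car_type_ids": ["t1"]},
]


def car_types_for(shop):
    """Client-side join: resolve the stored ids into full car type documents."""
    return [car_types[type_id] for type_id in shop["car_type_ids"]]


for shop in shops:
    print(shop["name"], [t["name"] for t in car_types_for(shop)])
```

The 100 features of each car type are stored once, and a shop document stays small no matter how detailed the car types are.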

Coming from a HBase/BigTable point of view, typically you would completely denormalize your data, and use a "list" field, or multidimensional map column (see this link for a better description).
The word "column" is another loaded word like "table" and "base" which carries the emotional baggage of years of RDBMS experience.
Instead, I find it easier to think about this like a multidimensional map: a map of maps, if you will.
For your example of a many-to-many relationship, you can still create two tables, and use your multidimensional map column to hold the relationship between the tables.
See the FAQ question 20 in the Hadoop/HBase FAQ:
Q [Michael Dagaev]: How would you design an HBase table for a many-to-many association between two entities, for example Student and Course? I would define two tables:
Student: student id, student data (name, address, ...), courses (use course ids as column qualifiers here)
Course: course id, course data (name, syllabus, ...), students (use student ids as column qualifiers here)
Does it make sense?
A [Jonathan Gray]: Your design does make sense. As you said, you'd probably have two column-families in each of the Student and Course tables. One for the data, another with a column per student or course. For example, a student row might look like:
Student: id/row/key = 1001
data:name = Student Name
data:address = 123 ABC St
courses:2001 = (If you need more information about this association, for example, if they are on the waiting list)
courses:2002 = ...
This schema gives you fast access to the queries, show all classes for a student (student table, courses family), or all students for a class (courses table, students family).
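The "map of maps" view of that Student/Course design can be written down directly as nested Python dicts: row key, then column family, then column qualifier. This is only a mental model, not HBase client code, and the course name and "waitlisted" values are made up.

```python
# row key -> column family -> column qualifier -> value
student_table = {
    "1001": {
        "data": {"name": "Student Name", "address": "123 ABC St"},
        # Course ids are the column qualifiers; the value can hold extra info.
        "courses": {"2001": "waitlisted", "2002": ""},
    },
}

course_table = {
    "2001": {
        "data": {"name": "Databases"},
        "students": {"1001": "waitlisted"},
    },
    "2002": {
        "data": {"name": "Algorithms"},
        "students": {"1001": ""},
    },
}


def courses_for(student_id):
    """All classes for a student: one row read, 'courses' family."""
    return sorted(student_table[student_id]["courses"])


def students_for(course_id):
    """All students for a class: one row read, 'students' family."""
    return sorted(course_table[course_id]["students"])


print(courses_for("1001"))
print(students_for("2001"))
```

The relationship is stored twice, once from each side, which is the denormalization trade-off described above.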

In a relational database, the concept is very clear: one table for cars with columns like "car_id, car_type, car_name, car_price", and another table for shops with columns "shop_id, car_id, shop_name, sale_count". The "car_id" column links the two tables together for data operations. All the columns must be well defined when the database is created.
NoSQL database systems do not require you to predefine these columns and tables. You just construct your records in a certain format, say JSON, like:
{"car": {"id": 1, "type": "auto", "name": "ford"}, "shop": {"id": 100, "name": "some_shop"}}
{"car": {"id": 2, "type": "auto", "name": "benz"}, "shop": {"id": 105, "name": "my_shop"}}
.....
After your system is online, you may find flaws in the design of your database structure; say you want to add an "employee" field to "shop" for future records. Then your new records look like:
{"car": {"id": 3, "type": "auto", "name": "RR"}, "shop": {"id": 108, "name": "other_shop", "employee": "Bill"}}
NoSQL systems allow you to do this without touching the existing records, whereas a relational database requires a schema change (for example, adding a column) first.
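In code, the consequence of this flexibility is that readers have to cope with records of different shapes. A tiny Python sketch, with the records above written as valid JSON and no particular database assumed:

```python
import json

raw_records = [
    '{"car": {"id": 1, "type": "auto", "name": "ford"}, "shop": {"id": 100, "name": "some_shop"}}',
    '{"car": {"id": 3, "type": "auto", "name": "RR"}, "shop": {"id": 108, "name": "other_shop", "employee": "Bill"}}',
]

records = [json.loads(raw) for raw in raw_records]

# Older records simply lack the new field, so read it with a default.
employees = [record["shop"].get("employee") for record in records]
print(employees)
```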
