It is easy to deal with 1-1 (via refs) or 1-N (via populate virtuals) relations in MongoDB,
but how do I deal with N-M relations?
Suppose I have two entities, teacher and classroom:
many teachers can access many classrooms, and
many classrooms can be accessed by many teachers.
teacher.schema
{
name:String;
//classrooms:Array;
}
classrooms.schema
{
name:String;
//teachers:Array
}
Is there a direct way (similar to populate virtuals) to keep this N-M relation in sync, so that when a teacher is removed, the teachers list in each classroom is updated automatically?
Should I use a third 'bridge' schema like TeacherToClassroom to record the relations?
I am thinking of something like this, a kind of computed value:
teacher.schema
{
name:String;
classrooms:(row)=>{
return db.classrooms.find({teachers: row._id}) // classrooms whose teachers array contains this teacher
}
}
classrooms.schema
{
name:String;
teachers:[ObjectId] // array of teacher _ids
}
That way I only manage the teacher ids in classrooms, and the classrooms property in the teacher schema is computed automatically.
The literature describes a few methods for implementing an M-N relationship in MongoDB.
The first method is two-way embedding. Looking at an example using authors and books:
{
_id: 1,
name: "Peter Griffin",
books: [1, 2]
}
{
_id: 2,
name: "Luke Skywalker",
books: [2]
}
{
_id: 1,
title: "War of the oceans",
categories: ["drama"],
authors: [1, 2]
}
{
_id: 2,
title: "Into the loop",
categories: ["scifi"],
authors: [1]
}
The second option is one-way embedding. This means you store the references on only one side of the relationship. Like so (a book with a category):
{
_id: 1,
name: "drama"
}
{
_id: 1,
title: "War of the oceans",
categories: [1],
authors: [1, 2]
}
When the data you are embedding becomes larger you could use something like the bucketing pattern to split it up: https://www.mongodb.com/blog/post/building-with-patterns-the-bucket-pattern
As you can see in the example above, by embedding on one side only you need to modify the data in just one location. You do not need any intermediate tables to do that.
In some cases you might even be able to omit an entire document when it has no meaning as a stand-alone object: Absorbing N in a M:N relationship
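Applied to the teacher/classroom case from the question, one-way referencing could look roughly like the sketch below. It uses plain in-memory arrays as stand-ins for the two collections (the sample names and ids are made up for illustration), with the MongoDB shell queries I would expect to use noted in comments. It is only a sketch of the idea, not a drop-in implementation.

```javascript
// In-memory stand-ins for the two collections (hypothetical sample data).
const teachers = [
  { _id: 1, name: "Ann" },
  { _id: 2, name: "Bob" },
];
const classrooms = [
  { _id: 10, name: "Room A", teachers: [1, 2] },
  { _id: 11, name: "Room B", teachers: [2] },
];

// Computed side of the relation: the classrooms a teacher can access.
// MongoDB shell equivalent: db.classrooms.find({ teachers: teacherId })
function classroomsOf(teacherId) {
  return classrooms.filter((c) => c.teachers.includes(teacherId));
}

// Removing a teacher also pulls the id out of every classroom.
// MongoDB shell equivalent:
//   db.classrooms.updateMany({ teachers: teacherId }, { $pull: { teachers: teacherId } })
//   db.teachers.deleteOne({ _id: teacherId })
function removeTeacher(teacherId) {
  for (const c of classrooms) {
    c.teachers = c.teachers.filter((id) => id !== teacherId);
  }
  const index = teachers.findIndex((t) => t._id === teacherId);
  if (index !== -1) teachers.splice(index, 1);
}
```

Because the ids live only in classrooms, the cleanup on removal is a single update against one collection; nothing has to be kept in sync on the teacher side.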
Related
I need store and build fast query for next structure:
class Model {
id: number;
alias: string;
schema: Record<string, any>;
}
where schema can be:
{
someField: '$model_alias',
otherField: {
nestedField: '$other_model_alias'
}
}
Example data:
{ id: 1, alias: "model_one", schema: { field1: "test", field2: "demo" } }
{ id: 2, alias: "model_second", schema: { someField: "$model_one", otherField: { nestedField: 5 } } }
{ id: 3, alias: "model_third", schema: { field5: "$model_second", field6: "$model_one", field7: "$model_fourth" } }
{ id: 4, alias: "model_fourth", schema: { field8: "$model_second" } }
As you can see, the JSON field schema contains fields which may refer to other models with their own schemas. Thus there can be a lot of nesting, and the relationships can be many-to-many.
Is it possible to achieve such a structure with Postgres, or should some alternative be used? I need to be able to manage the structure easily and run very fast queries (get tree children or get tree parents).
Thanks.
Choosing the right db type is tricky at the best of times. Can you provide more information about what sorts of queries you'd be doing? And how big is your dataset?
If your requirement is to exclusively get the parents and children of a model, a relational db such as postgres would do it. If the relations are many-to-many, you'll have a bridging table (https://dzone.com/articles/how-to-handle-a-many-to-many-relationship-in-datab), and will be able to do efficient queries on that.
If you're doing significant, multi-hop traversals between the relationships, you might indeed want to look at a graph database to avoid expensive joins. Postgres even has a plugin that allows this: https://www.postgresql.org/about/news/announcing-age-a-multi-model-graph-database-extension-for-postgresql-2050/
I wouldn't recommend a document store for data that's heavily relational like this, because managing relationships between documents has to be handled manually by the user, and that's normally more trouble than it's worth.
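To make the bridging-table idea concrete, here is a rough sketch that derives the (parent, child) rows from the example data in the question and answers the parent/child queries from them. It runs in memory rather than against Postgres, and it assumes that "parent" means the model doing the referencing and "child" the model being referenced via "$alias"; the function names are made up for illustration.

```javascript
// Example data from the question.
const models = [
  { id: 1, alias: "model_one", schema: { field1: "test", field2: "demo" } },
  { id: 2, alias: "model_second", schema: { someField: "$model_one", otherField: { nestedField: 5 } } },
  { id: 3, alias: "model_third", schema: { field5: "$model_second", field6: "$model_one", field7: "$model_fourth" } },
  { id: 4, alias: "model_fourth", schema: { field8: "$model_second" } },
];

// Walk a schema value and collect every "$alias" string it contains.
function collectRefs(value, out = []) {
  if (typeof value === "string" && value.startsWith("$")) {
    out.push(value.slice(1));
  } else if (value !== null && typeof value === "object") {
    for (const nested of Object.values(value)) collectRefs(nested, out);
  }
  return out;
}

// Rows of the (hypothetical) bridging table: one row per parent/child pair.
const edges = models.flatMap((m) =>
  collectRefs(m.schema).map((child) => ({ parent: m.alias, child }))
);

const childrenOf = (alias) => edges.filter((e) => e.parent === alias).map((e) => e.child);
const parentsOf = (alias) => edges.filter((e) => e.child === alias).map((e) => e.parent);

// Multi-hop traversal: every model reachable from `alias`.
// The `seen` set guards against cycles in the references.
function descendantsOf(alias) {
  const seen = new Set();
  const stack = [...childrenOf(alias)];
  while (stack.length > 0) {
    const next = stack.pop();
    if (seen.has(next)) continue;
    seen.add(next);
    stack.push(...childrenOf(next));
  }
  return [...seen].sort();
}
```

In a relational database the edges array would be the bridging table, with an index on each column, and the traversal would become a recursive query.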
I am currently designing the MongoDB schema for an event management system. The ER diagram is as follows:
The concept is fairly simple:
A company can create 1 or more events (estimating x500s of companies)
A client can attend 1 or more events from a multitude of companies (estimating x200 per client..also estimate x1000s of clients)
This is the classic many-to-many relationship, right?
Now I come from an RDBMS background, so my instincts on structuring a MongoDB schema might be incorrect. However I like MongoDB's flexible document nature and so I tried to come up with the following model structure:
Company model
{
_id: <CompanyID1>,
name: "Foo Bar",
events: [<EventID1>, <EventID2>, ...]
}
Event model
{ _id: <EventID1>,
name: "Rockestra",
location: LocationSchema, // (model below)
eventDate: "01/01/2019",
clients: [<ClientID1>, <ClientID2>, ...]
}
Client model
{ _id: <ClientID1>,
name: "Joe Borg"
}
Location model
{ _id: <LocationID1>,
name: "London, UK"
}
My typical query scenarios would probably be:
List all events organised by a specific company (including location details)
List all registered clients for a particular event
Would this design and approach be a sensible one to use given the cardinality I stated above? I guess one of the pitfalls of this design is that I could not get the company details if I just query the events model.
I would do
Company model
{
_id: <CompanyID1>,
name: "Foo Bar"
}
Event model
{ _id: <EventID1>,
name: "Rockestra",
location: LocationSchema, // embedded, not a reference
eventDate: "01/01/2019",
company: <CompanyID1> // indexed reference.
}
Client model
{ _id: <ClientID1>,
name: "Joe Borg",
events: [<EventID1>, <EventID2>, ...] // with index on events
}
List all events organised by a specific company (including location details):
db.events.find({company:<CompanyID1>})
List all registered clients for a particular event:
db.clients.find({events:<EventID1>})
It's not many-to-many unless a single event can be created by many companies. It looks like you are describing one-to-many.
This is the way I'd approach it.
Company model
{
_id:
name:
}
Client model
{
_id:
name:
}
ClientEvents model
{
_id
clientId
eventId
}
Event model
{
_id:
companyId:
name:
locationId:
eventDate:
}
Location model
{
_id:
name: "London, UK"
}
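A rough in-memory sketch of how the ClientEvents link collection would answer the two queries from the question. The arrays stand in for collections, and "Jazz Night", "Jane Roe" and the short ids are made-up sample data; in MongoDB the two lookups would be a find on ClientEvents by eventId followed by a find on clients with $in.

```javascript
// In-memory stand-ins for the collections (hypothetical sample data).
const events = [
  { _id: "e1", companyId: "c1", name: "Rockestra" },
  { _id: "e2", companyId: "c1", name: "Jazz Night" },
];
const clients = [
  { _id: "u1", name: "Joe Borg" },
  { _id: "u2", name: "Jane Roe" },
];
// One document per (client, event) registration.
const clientEvents = [
  { _id: 1, clientId: "u1", eventId: "e1" },
  { _id: 2, clientId: "u2", eventId: "e1" },
  { _id: 3, clientId: "u1", eventId: "e2" },
];

// List all events organised by a specific company.
function eventsForCompany(companyId) {
  return events.filter((e) => e.companyId === companyId);
}

// List all registered clients for a particular event:
// first collect the client ids from the link documents, then load the clients.
function clientsForEvent(eventId) {
  const ids = new Set(
    clientEvents.filter((link) => link.eventId === eventId).map((link) => link.clientId)
  );
  return clients.filter((c) => ids.has(c._id));
}
```

The cost of this layout is the second lookup; the benefit is that neither the client nor the event document grows as registrations are added.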
I would like to create an eCommerce type of database where I have products and categories for the products using Mongodb and Mongoose. I am thinking of having two collections, one for products and one for categories. After digging online, I think the category should be as such:
var categorySchema = {
_id: { type: String },
parent: {
type: String,
ref: 'Category'
},
ancestors: [{
type: String,
ref: 'Category'
}]
};
I would like to be able to find all the products by category. For example "find all phones." However, the categories may be renamed, updated, etc. What is the best way to implement the product collection? In SQL, a product would contain a foreign key to a category.
A code sample of inserting and finding a document would be much appreciated!
Why not keep it simple and do something like the following?
var product_Schema = {
phones:[{
price:Number,
Name:String,
}],
TV:[{
price:Number,
Name:String
}]
};
Then using projections you could easily return the products for a given key. For example:
db.collection.find({},{TV:1,_id:0},function(err,data){
if (!err) {console.log(data)}
})
Of course the correct schema design will be dependent on how you plan on querying/inserting/updating data, but with mongo keeping things simple usually pays off.
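For comparison, here is a rough sketch of the other route, which keeps the categorySchema from the question (string _id, parent, ancestors) and gives each product a single category reference, much like the SQL foreign key. It uses in-memory arrays with made-up sample data, and the product field names are assumptions; in MongoDB the two steps would be db.categories.find({ $or: [{ _id: id }, { ancestors: id }] }) followed by db.products.find({ category: { $in: ids } }).

```javascript
// In-memory stand-ins for the two collections (hypothetical sample data).
const categories = [
  { _id: "Electronics", parent: null, ancestors: [] },
  { _id: "Phones", parent: "Electronics", ancestors: ["Electronics"] },
  { _id: "Smartphones", parent: "Phones", ancestors: ["Electronics", "Phones"] },
  { _id: "TV", parent: "Electronics", ancestors: ["Electronics"] },
];
const products = [
  { name: "Basic phone", price: 30, category: "Phones" },
  { name: "Smart phone", price: 300, category: "Smartphones" },
  { name: "Flat TV", price: 500, category: "TV" },
];

// "Find all phones": products in the category itself or in any category below it.
function productsInCategory(categoryId) {
  // Step 1: the category plus every category that lists it as an ancestor.
  const ids = new Set(
    categories
      .filter((c) => c._id === categoryId || c.ancestors.includes(categoryId))
      .map((c) => c._id)
  );
  // Step 2: products whose category reference is in that set.
  return products.filter((p) => ids.has(p.category));
}
```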
I'm coming from the SQL world, so naturally mongo / noSQL has been an adventure.
I'm building a page to add/edit categories, that "posts" will later be assigned to.
What I've basically created is this:
{
_id: "asdf234ljsf",
title: "CategoryOne",
sortorder: 1,
active: true,
children: [
{
title: "ChildOne",
sortorder: 1,
active: true
},
{
title: "ChildTwo",
sortorder: 2,
active: true
}
]
}
So later, when creating a "post", I would assign that post to one or more parent categories, as well as optionally one or more child categories within the selected parent categories. If visitors to the site click on a parent category, it would show all posts within that parent category; if they select a child category, it would show only posts within that child category.
The logic is obvious and simple, but in SQL I would have created tables like this:
table_Category ( CategoryID, Title, Sort, Active )
table_Category_Children ( ChildID, ParentID, Title, Sort, Active )
I've been reading the Discover Meteor book and it mentions that Meteor gives us many tools that work a lot better when operating at the collection level, as well as how the DDP operates at the top level of a document, meaning if something small changed down in a sub collection or array, potentially unneeded data will be sent back to all connected/subscribed clients.
So, this makes me think I should be organizing the categories like this:
Collection for parent categories
{
_id: "someid",
title: "CategoryOne",
sortorder: 1,
active: true
},
{
_id: "someid",
title: "CategoryTwo",
sortorder: 1,
active: true
}
Collection for Child Categories
{
_id: "someid",
parent: "idofparent",
title: "ChildOne",
sortorder: 1,
active: true
},
{
_id: "someid",
parent: "idofparent",
title: "ChildTwo",
sortorder: 1,
active: true
}
Or, perhaps it's better like this:
Collection for parent categories
{
_id: "someid",
title: "CategoryOne",
sortorder: 1,
active: true,
children: [ { id: "childid" }, ... ]
}
I think understanding a best practice/method for Meteor and Mongo in this scenario will help me greatly across the board.
So conclusion: I have an admin page where I add/edit these categories. When clients create a post, they'll select the parent and child categories suitable for their post and make sure that I organize it properly from the beginning. Changing my thinking process from a traditional RDBMS to NoSQL is a big jump.
Thank you!
MongoDB stores all data in documents. This is a fundamental difference from relational databases such as SQL databases.
Imagine you have 100 parent categories and 1000 child categories: once you update a parent category, it affects the "idofparent" of every linked child category, in a reactive way. In short, it's not sustainable.
Try to think of a way to avoid the equivalent of a SQL JOIN in MongoDB.
Restructure your data, perhaps like this:
One big collection for all categories:
{
_id: id,
title: title,
sortorder: 1,
active: 1,
class: "parent > child" // make this as a field
...
}
// class can be "parent1", "parent2", "parent1 > child1" ... you get the idea
That way each document is completely self-contained.
Or, if you absolutely need a JOIN-style relational data structure, I don't think MongoDB is the right choice for you.
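A rough sketch of how the single-collection layout with a class path could be queried: "show everything in a parent category" becomes a prefix match on class. The arrays are an in-memory stand-in with sample titles taken from the question, and the helper names are made up. The MongoDB shell equivalent I would expect is db.categories.find({ class: { $regex: "^CategoryOne( > |$)" } }).

```javascript
// In-memory stand-in for the single categories collection.
const categoryDocs = [
  { _id: 1, title: "CategoryOne", class: "CategoryOne" },
  { _id: 2, title: "ChildOne", class: "CategoryOne > ChildOne" },
  { _id: 3, title: "ChildTwo", class: "CategoryOne > ChildTwo" },
  { _id: 4, title: "CategoryTwo", class: "CategoryTwo" },
];

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// All documents at or below the given path.
// The path must be followed by " > " or the end of the string, so
// "CategoryOne" does not accidentally match "CategoryOneExtra".
function inBranch(path) {
  const pattern = new RegExp("^" + escapeRegExp(path) + "( > |$)");
  return categoryDocs.filter((doc) => pattern.test(doc.class));
}
```

Note that renaming a parent still means rewriting the class field of every document in that branch, so this trades join-free reads for more expensive renames.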
I'm using MongoDB. I know that MongoDB isn't relational but information sometimes is. So what's the most efficient way to reference these kinds of relationships to lessen database load and maximize query speed?
Example:
* Tinder-style "matches" *
There are many users in a Users collection. They get matched to each other.
So I'm thinking:
Document 1:
{
_id: "d3fg45wr4f343",
firstName: "Bob",
lastName: "Lee",
matches: [
"ferh823u9WURF",
"8Y283DUFH3FI2",
"KJSDH298U2F8",
"shdfy2988U2Ywf"
]
}
Document 2:
{
_id: "ferh823u9WURF",
firstName: "Cindy",
lastName: "Doe",
matches: [
"d3fg45wr4f343"
]
}
Would this work OK if there were, say, 10,000 users and you were on Bob's profile page and you wanted to display the firstName of all of his matches?
Any alternative structures that would work better?
* Online Forum *
I suppose you could have the following collections:
Users
Topics
Users Collection:
{
_id: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
}
{
_id: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
}
Topics Collection Version 1
One example document in the Topics Collection:
{
title: "A spider just popped out of the AC",
dateTimeSubmitted: 201408201200,
category: 5,
posts: [
{
message: "I'm going to use a gun.",
dateTimeSubmitted: 201408201200,
author: "d3fg45wr4f343"
},
{
message: "I don't think this would work.",
dateTimeSubmitted: 201408201201,
author: "23qdf3a3fq3fq3"
},
{
message: "It will totally work.",
dateTimeSubmitted: 201408201202,
author: "d3fg45wr4f343"
},
{
message: "ur dumb",
dateTimeSubmitted: 201408201203,
author: "23qdf3a3fq3fq3"
}
]
}
Topics Collection Version 2
One example document in the Topics Collection. The author's avatar and userName are now embedded in the document. I know that:
This is not DRY.
If the author changes their avatar or userName, these changes would need to be applied in the Topics Collection, in every post document that embeds them.
BUT it saves the system from querying for all the avatars and userNames via the authors ID every single time this thread is viewed on the client.
{
title: "A spider just popped out of the AC",
dateTimeSubmitted: 201408201200,
category: 5,
posts: [
{
message: "I'm going to use a gun.",
dateTimeSubmitted: 201408201200,
author: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
},
{
message: "I don't think this would work.",
dateTimeSubmitted: 201408201201,
author: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
},
{
message: "It will totally work.",
dateTimeSubmitted: 201408201202,
author: "d3fg45wr4f343",
userName: "aircon",
avatar: "234232.jpg"
},
{
message: "ur dumb",
dateTimeSubmitted: 201408201203,
author: "23qdf3a3fq3fq3",
userName: "spider",
avatar: "986754.jpg"
}
]
}
So yeah, I'm not sure which is best...
If the data is really many-to-many, i.e. one user can have many matches and can be matched by many others as in your first example, it is usually best to go with relations (references).
The main arguments against relations stem from MongoDB not being a relational database, so there are no such things as foreign key constraints or join statements.
The trade-off you have to consider in those many-to-many cases (many being much more than two) is to either enforce the key constraints yourself or manage the possible data inconsistencies across the multiple documents (your last example). In most cases the relational approach is much more practical than the embedding approach.
Exceptions could be read-often, write-seldom cases. For a (very constructed) example, suppose the matches in your first example were recalculated once a day or so by wiping all previous matches and calculating a list of new ones. In that case the data inconsistencies you would introduce could be acceptable, and the read time you save by embedding the first names of the matches could be an advantage.
But usually for many-to-many relations it is best to use a relational approach and make use of the array query features, such as {_id: {$in: matches}}.
In the end it all comes down to how many inconsistencies you can live with and how fast you really need to access the data (is it OK for some topics to show the old avatar for a few days if that saves half a second of page load time?).
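As a rough illustration of the referenced approach for the matches example, the sketch below resolves the first names of a user's matches with one lookup by id list. It runs on an in-memory array with shortened, made-up ids ("Dana" is invented sample data); against MongoDB the lookup would be a single query along the lines of db.users.find({ _id: { $in: bob.matches } }, { firstName: 1 }).

```javascript
// In-memory stand-in for the Users collection (hypothetical sample data).
const users = [
  { _id: "u1", firstName: "Bob", matches: ["u2", "u3"] },
  { _id: "u2", firstName: "Cindy", matches: ["u1"] },
  { _id: "u3", firstName: "Dana", matches: ["u1"] },
];

// First names of everyone the given user is matched with.
function matchNames(userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return [];
  // Equivalent of { _id: { $in: user.matches } }: keep users whose id is in the list.
  const wanted = new Set(user.matches);
  return users.filter((u) => wanted.has(u._id)).map((u) => u.firstName);
}
```

Since _id is always indexed, a lookup like this stays cheap for a profile page even with many users, and the names are never stale because they are stored in one place only.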
Edit
The schema design series on the mongodb blog might be a good read for you: part1, part2 and part3