Handling a one-to-many relationship MongoDB schema design - mongodb

I am currently designing the MongoDB schema for an event management system. The ER diagram is as follows:
The concept is fairly simple:
A company can create one or more events (I estimate on the order of 500 companies).
A client can attend one or more events from multiple companies (I estimate about 200 events per client, and thousands of clients).
This is the classic many-to-many relationship, right?
Now, I come from an RDBMS background, so my instincts on structuring a MongoDB schema might be incorrect. However, I like MongoDB's flexible document nature, so I came up with the following model structure:
Company model
{
_id: <CompanyID1>,
name: "Foo Bar",
events: [<EventID1>, <EventID2>, ...]
}
Event model
{ _id: <EventID1>,
name: "Rockestra",
location: LocationSchema, // (model below)
eventDate: "01/01/2019",
clients: [<ClientID1>, <ClientID2>, ...]
}
Client model
{ _id: <ClientID1>,
name: "Joe Borg"
}
Location model
{ _id: <LocationID1>,
name: "London, UK"
}
My typical query scenarios would probably be:
List all events organised by a specific company (including location details)
List all registered clients for a particular event
Would this design and approach be a sensible one to use given the cardinality I stated above? I guess one of the pitfalls of this design is that I could not get the company details if I just query the events model.

I would do
Company model
{
_id: <CompanyID1>,
name: "Foo Bar"
}
Event model
{ _id: <EventID1>,
name: "Rockestra",
location: LocationSchema, // embedded, not a reference
eventDate: "01/01/2019",
company: <CompanyID1> // indexed reference.
}
Client model
{ _id: <ClientID1>,
name: "Joe Borg",
events: [<EventID1>, <EventID2>, ...] // with index on events
}
List all events organised by a specific company (including location details):
db.events.find({company:<CompanyID1>})
List all registered clients for a particular event:
db.clients.find({events:<EventID1>})
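As a rough illustration of what these two lookups match, here is a minimal in-memory sketch that runs in plain Node.js without a database. The sample ids (`c1`, `e1`, `u1`, ...), the extra sample documents, and the helper names `eventsByCompany` and `clientsForEvent` are assumptions made up for the example. In MongoDB, the same filtering is done by the two `find` calls above, where an equality filter on an array field matches any element of the array.

```javascript
// Sample documents shaped like the models above (ids and extra rows are made up).
const events = [
  { _id: "e1", name: "Rockestra", location: { name: "London, UK" }, eventDate: "01/01/2019", company: "c1" },
  { _id: "e2", name: "Jazz Night", location: { name: "Paris, FR" }, eventDate: "02/02/2019", company: "c2" },
];
const clients = [
  { _id: "u1", name: "Joe Borg", events: ["e1", "e2"] },
  { _id: "u2", name: "Ann Lee", events: ["e2"] },
];

// In-memory equivalent of db.events.find({ company: companyId }).
// The location is embedded, so no second lookup is needed.
function eventsByCompany(companyId) {
  return events.filter((e) => e.company === companyId);
}

// In-memory equivalent of db.clients.find({ events: eventId }):
// a client matches when its events array contains eventId.
function clientsForEvent(eventId) {
  return clients.filter((c) => c.events.includes(eventId));
}

console.log(eventsByCompany("c1").map((e) => `${e.name} @ ${e.location.name}`));
console.log(clientsForEvent("e2").map((c) => c.name));
```

On a real deployment, the two indexes mentioned in the model comments (on `company` in events and on `events` in clients) are what keep these lookups from scanning the whole collection.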

It's not many-to-many unless a single event can be created by many companies. It looks like you are describing one-to-many.
This is the way I'd approach it.
Company model
{
_id:
name:
}
Client model
{
_id:
name:
}
ClientEvents model
{
_id:
clientId:
eventId:
}
Event model
{
_id:
companyId:
name:
locationId:
eventDate:
}
Location model
{
_id:
name: "London, UK"
}
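With a separate ClientEvents collection, listing the clients of an event takes two steps: read the link documents, then fetch the clients they point to. The sketch below shows those two steps in memory with plain Node.js. The sample ids, the second client, and the helper name `registeredClients` are assumptions for illustration only.

```javascript
// Illustrative documents shaped like the models above (ids are made up).
const clientDocs = [
  { _id: "u1", name: "Joe Borg" },
  { _id: "u2", name: "Ann Lee" },
];
const clientEventDocs = [
  { _id: 1, clientId: "u1", eventId: "e1" },
  { _id: 2, clientId: "u2", eventId: "e1" },
  { _id: 3, clientId: "u1", eventId: "e2" },
];

function registeredClients(eventId) {
  // Step 1: like db.clientEvents.find({ eventId: eventId })
  const ids = new Set(
    clientEventDocs.filter((link) => link.eventId === eventId).map((link) => link.clientId)
  );
  // Step 2: like db.clients.find({ _id: { $in: [...ids] } })
  return clientDocs.filter((c) => ids.has(c._id));
}

console.log(registeredClients("e1").map((c) => c.name));
console.log(registeredClients("e2").map((c) => c.name));
```

The trade-off compared with keeping an array of event ids on the client is an extra round trip per lookup, in exchange for link documents that never grow.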

Related

How to deal with a many-to-many relationship in MongoDB?

It is easy to deal with 1-1 (via refs) or 1-N (via populate virtuals) relations in MongoDB,
but how do I deal with N-M relations?
Suppose I have two entities, teacher and classroom:
many teachers can access many classrooms, and
many classrooms can be accessed by many teachers.
teacher.schema
{
name: String,
// classrooms: Array
}
classrooms.schema
{
name: String,
// teachers: Array
}
Is there a direct way (similar to populate virtuals) to maintain this N-M relation, so that when a teacher is removed, the teachers listed in each classroom are updated automatically too?
Should I use a third "bridge" schema, such as TeacherToClassroom, to record their relations?
I am thinking of something like this, with a computed value:
teacher.schema
{
name: String,
// computed, not stored: classrooms whose teachers array contains this teacher's _id
classrooms: (row) => {
return db.classrooms.find({ teachers: row._id })
}
}
classrooms.schema
{
name: String,
teachers: [ObjectId]
}
That way I only manage the teacher ids in classrooms, and the classrooms property in the teacher schema is computed automatically.
The literature describes a few methods for implementing an m-n relationship in MongoDB.
The first method is two-way embedding. Here is an example using authors and books:
{
_id: 1,
name: "Peter Griffin",
books: [1, 2]
}
{
_id: 2,
name: "Luke Skywalker",
books: [2]
}
{
_id: 1,
title: "War of the oceans",
categories: ["drama"],
authors: [1, 2]
}
{
_id: 2,
title: "Into the loop",
categories: ["scifi"],
authors: [1]
}
The second option is to use one-way embedding. This means you only embed one of the documents into the other. Like so (a book with a category):
{
_id: 1,
name: "drama"
}
{
_id: 1,
title: "War of the oceans",
categories: [1],
authors: [1, 2]
}
When the data you are embedding becomes larger, you could use something like the bucketing pattern to split it up: https://www.mongodb.com/blog/post/building-with-patterns-the-bucket-pattern
As you can see in the example above, by embedding the documents you still only need to modify the data in one location. You do not need any intermediate tables to do that.
In some cases you might even be able to omit an entire document when it has no meaning as a stand-alone object: Absorbing N in a M:N relationship
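Applied to the teacher and classroom question, one-way embedding could look like the in-memory sketch below (plain Node.js, no database). The sample documents and the helper names `classroomsOf` and `removeTeacher` are assumptions for illustration. The comments name the MongoDB operations that would play the same role; note that nothing here happens automatically, the removal has to pull the id explicitly.

```javascript
// Only classrooms store the relation; teachers do not list their classrooms.
const teachers = [
  { _id: "t1", name: "Ada" },
  { _id: "t2", name: "Bob" },
];
const classrooms = [
  { _id: "r1", name: "Room 1", teachers: ["t1", "t2"] },
  { _id: "r2", name: "Room 2", teachers: ["t1"] },
];

// Computed on demand, like db.classrooms.find({ teachers: teacherId }).
function classroomsOf(teacherId) {
  return classrooms.filter((room) => room.teachers.includes(teacherId));
}

// Removing a teacher touches the relation in one place only.
function removeTeacher(teacherId) {
  // Like db.teachers.deleteOne({ _id: teacherId })
  const index = teachers.findIndex((t) => t._id === teacherId);
  if (index !== -1) teachers.splice(index, 1);
  // Like db.classrooms.updateMany({ teachers: teacherId }, { $pull: { teachers: teacherId } })
  for (const room of classrooms) {
    room.teachers = room.teachers.filter((id) => id !== teacherId);
  }
}

console.log(classroomsOf("t1").map((room) => room.name));
removeTeacher("t1");
console.log(classrooms.map((room) => room.teachers));
```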

Historical data structure on MongoDB

Based on a certain time interval I need to implement pre-aggregated statistical data based on the following model:
I have a Product entity and a ProductGroup entity that acts as a container for Products. I can have 0..N Products and 0..N ProductGroups, with a MANY_2_MANY relationship between Products and ProductGroups.
Based on my own business logic, I can calculate the order of every Product in every ProductGroup.
I will run this calculation repeatedly at some interval, let's say via a Cron job.
I would also like to store the history of every calculation (versions) in order to be able to analyze shifts in Product positions.
I have created a simple picture with this structure:
Right now I use MongoDB and I am really interested in implementing this structure on MongoDB without introducing new technologies.
My functional requirements: I need to be able to quickly get the position (and position offset) of a certain Product in a certain ProductGroup. Let's say the position and offset of P2 in ProductGroup1. The output should be:
position: 1
offset : +2
Also, I'd like to plot a graph showing the historical changes of position for a certain Product within a certain ProductGroup. For example, for Product P2 in ProductGroup1 the output should be:
1(+2), 3(-3), 0
Is it possible to implement with MongoDB and if so, could you please describe the MongoDB collection(s) structure in order to support this?
Since the only limitation is to "quickly query the data as I described at my question", the simplest way is to have a collection of snapshots with an array of products:
db.snapshots.insert({
group: "group 1",
products:[
{id:"P2", position:0, offset:0},
{id:"P4", position:1, offset:0},
{id:"P5", position:2, offset:0},
{id:"P6", position:3, offset:0}
], ver:0
});
db.snapshots.insert({
group: "group 1",
products:[
{id:"P3", position:0, offset:0},
{id:"P5", position:1, offset:1},
{id:"P1", position:2, offset:0},
{id:"P2", position:3, offset:-3},
{id:"P4", position:4, offset:0}
], ver:1
});
The index would be
db.snapshots.createIndex(
{ group: 1, ver: -1, "products.id": 1 },
{ unique: true, partialFilterExpression: { "products.id": { $exists: true } } }
);
And the query to fetch current position of a product in the group ("P4" in "group 1" in the example):
db.snapshots.find(
{ group: "group 1" },
{ _id: 0, products: { $elemMatch: { id: "P4" } } }
).sort( { ver:-1 } ).limit(1)
A query to fetch historical data is almost the same:
db.snapshots.find(
{ group: "group 1" },
{ _id: 0, products: { $elemMatch: {id: "P4" } }, ver: 1 }
).sort({ver:-1})
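To make the shape of the results concrete, the following in-memory sketch reproduces what the history query returns for the two snapshots inserted above (plain Node.js, no database). The helper name `positionHistory` is an assumption. As with the `$elemMatch` projection, a snapshot that does not contain the product is still returned, just without a product entry.

```javascript
// The two snapshots inserted above.
const snapshots = [
  { group: "group 1", ver: 0, products: [
    { id: "P2", position: 0, offset: 0 },
    { id: "P4", position: 1, offset: 0 },
    { id: "P5", position: 2, offset: 0 },
    { id: "P6", position: 3, offset: 0 },
  ] },
  { group: "group 1", ver: 1, products: [
    { id: "P3", position: 0, offset: 0 },
    { id: "P5", position: 1, offset: 1 },
    { id: "P1", position: 2, offset: 0 },
    { id: "P2", position: 3, offset: -3 },
    { id: "P4", position: 4, offset: 0 },
  ] },
];

// Newest version first, one entry per snapshot of the group.
function positionHistory(group, productId) {
  return snapshots
    .filter((s) => s.group === group)   // the query filter
    .sort((a, b) => b.ver - a.ver)      // .sort({ ver: -1 })
    .map((s) => {
      // The $elemMatch projection keeps only the first matching array element.
      const p = s.products.find((item) => item.id === productId);
      return p ? { ver: s.ver, position: p.position, offset: p.offset } : { ver: s.ver };
    });
}

// The current position is simply the first entry, like .limit(1).
console.log(positionHistory("group 1", "P4")[0]); // { ver: 1, position: 4, offset: 0 }
console.log(positionHistory("group 1", "P2"));
```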

Atomic consistency

My lecturer in the database course I'm taking said an advantage of NoSQL databases is that they "support atomic consistency of a single aggregate". I have no idea what this means, can someone please explain it to me?
It means that by using aggregates you can prevent your database from saving inconsistent data when a transaction fails.
In Domain-Driven Design, an aggregate is a collection of related objects that are treated as a unit.
For example, let's say you have a restaurant and you want to save the orders of each customer.
You could save your data with two aggregates like below:
var customerIdGenerated = newGuid();
var customer = { id: customerIdGenerated , name: 'Mateus Forgiarini'};
var orders = {
id: 1,
customerId: customerIdGenerated ,
orderedFoods: [{
name: 'Sushi',
price: 50
},
{
name: 'Tacos',
price: 12
}]
};
Or you could treat orders and customers as a single aggregate:
var customerIdGenerated = newGuid();
var customerAndOrders = {
customerId: customerIdGenerated ,
name: 'Mateus Forgiarini',
orderId: 1,
orderedFoods: [{
name: 'Sushi',
price: 50
},
{
name: 'Tacos',
price: 12
}]
};
By modelling your orders and customer as a single aggregate you avoid a transaction error. In the NoSQL world, a transaction error can occur when you have to write related data to many nodes (a node is where you store your data; NoSQL databases that run on clusters can have many nodes).
So if you treat orders and customers as two aggregates, an error can occur while you are saving the customer even though your orders are still saved. You would then have inconsistent data, because you would have orders with no customer.
However, by using a single aggregate you can avoid that: if an error occurs, you won't end up with inconsistent data, since you are saving your related data together.
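The difference can be illustrated with a toy in-memory store (plain Node.js). Everything in this sketch is an assumption made up for the example: `makeStore`, the `failOn` switch that simulates a failing write, and the two save helpers. It only models the one property that matters here, namely that a write of a single document either happens completely or not at all.

```javascript
// Toy store: each insert is all-or-nothing for ONE document.
// failOn names a collection whose writes are simulated to fail.
function makeStore(failOn) {
  const data = { customers: [], orders: [], customerAndOrders: [] };
  return {
    data,
    insert(collection, doc) {
      if (collection === failOn) throw new Error(`write to ${collection} failed`);
      data[collection].push(doc);
    },
  };
}

const foods = [{ name: "Sushi", price: 50 }, { name: "Tacos", price: 12 }];

// Two aggregates: two independent writes, so the second one can fail
// after the first one has already been stored.
function saveAsTwoAggregates(store) {
  try {
    store.insert("orders", { id: 1, customerId: "c-1", orderedFoods: foods });
    store.insert("customers", { id: "c-1", name: "Mateus Forgiarini" });
  } catch (err) {
    // The order stays in the store: an order with no customer.
  }
}

// One aggregate: a single write, so there is no half-saved state.
function saveAsOneAggregate(store) {
  try {
    store.insert("customerAndOrders", {
      customerId: "c-1", name: "Mateus Forgiarini", orderId: 1, orderedFoods: foods,
    });
  } catch (err) {
    // Nothing was stored at all.
  }
}

const a = makeStore("customers");
saveAsTwoAggregates(a);
console.log(a.data.orders.length, a.data.customers.length); // 1 0

const b = makeStore("customerAndOrders");
saveAsOneAggregate(b);
console.log(b.data.customerAndOrders.length); // 0
```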

Mongoose product category design?

I would like to create an eCommerce type of database where I have products and categories for the products, using MongoDB and Mongoose. I am thinking of having two collections, one for products and one for categories. After digging online, I think the category schema should look like this:
var categorySchema = {
_id: { type: String },
parent: {
type: String,
ref: 'Category'
},
ancestors: [{
type: String,
ref: 'Category'
}]
};
I would like to be able to find all the products by category. For example "find all phones." However, the categories may be renamed, updated, etc. What is the best way to implement the product collection? In SQL, a product would contain a foreign key to a category.
A code sample of inserting and finding a document would be much appreciated!
Why not keep it simple and do something like the following?
var product_Schema = {
phones:[{
price:Number,
Name:String,
}],
TV:[{
price:Number,
Name:String
}]
};
Then using projections you could easily return the products for a given key. For example:
db.collection.find({},{TV:1,_id:0},function(err,data){
if (!err) {console.log(data)}
})
Of course, the correct schema design will depend on how you plan on querying, inserting, and updating data, but with MongoDB, keeping things simple usually pays off.

MongoDB Data-Modelling: a pattern for text search in referenced documents

I'm working on a project that uses MongoDB, and I would like to hear your opinion about a feature I'd like to implement.
In particular, there are "Users" that reside in "Cities" where they offer "Services".
I have created three collections representing the three entities mentioned above:
the User collection has a one-to-one reference to City and a one-to-many reference to Service.
I would like to write a search function that looks for a given string in the User collection and in the referenced collections.
Therefore, given the following two users, two cities and three services ...
User
{
_id:"u1",
name:"Jhon",
City: ObjectId("c1"),
Services: [
ObjectId("s1"),
ObjectId("s2")
]
}
{
_id:"u2",
name:"Jack",
City: ObjectId("c2"),
Services: [
ObjectId("s2"),
ObjectId("s3")
]
}
City
{
_id:"c1",
name: "Rome"
}
{
_id:"c2",
name: "London"
}
Services
{
_id:"s1",
name: "Repair"
}
{
_id:"s2",
name: "Sell"
}
{
_id:"s3",
name: "Buy"
}
...and searching for the word "R", the result should be the u1 user (due to the R in "Rome" and "Repair").
Given that I cannot do joins, I was thinking of writing a mongo shell script that adds an additional field to the User collection containing all the searchable referenced strings,
as in the following example:
{
_id:"u1",
name:"Jhon",
City: ObjectId("c1"),
Services: [
ObjectId("s1"),
ObjectId("s2")
],
"idx":{
city: "Rome",
services:["Repair","Sell"]
}
}
Finally, the question(s)...
Do you think this is a good choice? Can you propose an alternative solution (or share a link about it; I didn't find anything useful)?
And how would you keep that field up to date? For instance, what happens if the referenced city name or the services offered by a user change?
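For reference, the idea described above can be sketched in memory as follows (plain Node.js, no database). The helper names `rebuildIdx`, `search`, and `renameCity` are assumptions, plain string ids stand in for the ObjectId references, and the substring match is case-sensitive. The sketch is not a recommendation, only an illustration of the denormalized `idx` field and of one way to refresh it when a referenced document changes.

```javascript
// Referenced collections, keyed by id for brevity.
const cities = { c1: { name: "Rome" }, c2: { name: "London" } };
const services = { s1: { name: "Repair" }, s2: { name: "Sell" }, s3: { name: "Buy" } };
const users = [
  { _id: "u1", name: "Jhon", City: "c1", Services: ["s1", "s2"] },
  { _id: "u2", name: "Jack", City: "c2", Services: ["s2", "s3"] },
];

// Copy the searchable strings of the referenced documents into user.idx.
function rebuildIdx(user) {
  user.idx = {
    city: cities[user.City].name,
    services: user.Services.map((id) => services[id].name),
  };
}

// Substring search over the user's own name plus the denormalized strings.
function search(term) {
  return users.filter((u) =>
    [u.name, u.idx.city, ...u.idx.services].some((text) => text.includes(term))
  );
}

// Keeping idx current: after renaming a city, refresh every user that references it.
function renameCity(cityId, newName) {
  cities[cityId].name = newName;
  users.filter((u) => u.City === cityId).forEach(rebuildIdx);
}

users.forEach(rebuildIdx);
console.log(search("R").map((u) => u._id)); // [ 'u1' ]

renameCity("c2", "Reading");
console.log(search("R").map((u) => u._id)); // [ 'u1', 'u2' ]
```

The cost of this layout is visible in `renameCity`: every change to a city or service has to fan out to all users that reference it.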