Using MongoDB for storage, if I wanted to represent a tree structure of nodes, where child nodes under a single parent always have unique node-names, I believe the standard approach would be to use collections and to manage the node name uniqueness on the app level:
Approach 1: Collection Based Approach for Tree Data
{ "node_name": "home", "title": "Home", "children": [
{ "node_name": "products", "title": "Products", "children": [
{ "node_name": "electronics", "title": "Electronics", "children": [ ] },
{ "node_name": "toys", "title": "Toys", "children": [ ] } ] },
{ "node_name": "services", "title": "Services", "children": [
{ "node_name": "repair", "title": "Repair", "children": [ ] },
{ "node_name": "training", "title": "Training"", "children": [ ] } ] } ] }
I have however thought of the following alternate approach, where node-names become "Object Map" field names, and we do without collections altogether:
Approach 2: Object-Map Based Approach (without Collections)
// NOTE: We don't have the equivalent of "none_name":"home" at the root, but that's not an issue in this case
{ "title": "Home", "children": {
"products": { "title": "Products", children": {
"electronics": { "title": "Electronics", "children": { } },
"toys": { "title": "Toys", "children": { } } } },
"services": { "title": "Services", children": {
"repair": { "title": "Repair", "children": { } },
"training": { "title": "Training", "children": { } } } } } }
The question is:
Strictly from a MongoDB perspective (considering querying, performance, data maintainability and data-size and server scaling), are there any major issues with Approach #2 (over #1)?
EDIT: After getting to know MongoDB a bit better (and thanks to Neil's comments below), I realized that both options of this question are generally the wrong way to go, because they assume that it makes sense to store multiple nodes in a single MongoDB document. Ultimately, each "node" should be a separate document and (as Neil Lunn stated in the comments) there are various ways to implement a hierarchy tree, as seen here: Model Tree Structures in MongoDB
I think this use-case is not good for Mongo DB, because:
there's(MongoDB 2.6) no compress algorithm (your documents will be too large)
Mongo DB use database-level locks (when you want one large document, all DB operations will be blocked)
it will be hard to index
I think better solution will be some relational DB for this use-case.
Related
Let's imagine a mongo collection of - let's say magazines. For some reason, we've ended up storing each issue of the magazine as a separate document. Each article is a subdocument inside an Articles-array, and the authors of each article is represented as a subdocument inside the Writers-array on the Article-subdocument. Only the name and email of the author is stored inside the article, but there is an Writers-array on the magazine level containing more information about each author.
{
"Title": "The Magazine",
"Articles": [
{
"Title": "Mongo Queries 101",
"Summary": ".....",
"Writers": [
{
"Name": "tom",
"Email": "tom#example.com"
},
{
"Name": "anna",
"Email": "anna#example.com"
}
]
},
{
"Title": "Why not SQL instead?",
"Summary": ".....",
"Writers": [
{
"Name": "mike",
"Email": "mike#example.com"
},
{
"Name": "anna",
"Email": "anna#example.com"
}
]
}
],
"Writers": [
{
"Name": "tom",
"Email": "tom#example.com",
"Web": "tom.example.com"
},
{
"Name": "mike",
"Email": "mike#example.com",
"Web": "mike.example.com"
},
{
"Name": "anna",
"Email": "anna#example.com",
"Web": "anna.example.com"
}
]
}
How can one author be completely removed from a magazines?
Finding magazines where the unwanted author exist is quite easy. The problem is pulling the author out of all the sub documents.
MongoDB 3.6 introduces some new placeholder operators, $[] and $[<identity>], and I suspect these could be used with either $pull or $pullAll, but so far, I haven't had any success.
Is it possible to do this in one go? Or at least no more than two? One query for removing the author from all the articles, and one for removing the biography from the magazine?
You can try below query.
db.col.update(
{},
{"$pull":{
"Articles.$[].Writers":{"Name": "tom","Email": "tom#example.com"},
"Writers":{"Name": "tom","Email": "tom#example.com"}
}},
{"multi":true}
);
I am new to MongoDB, and I am trying to understand what document structure for nested documents is the best practice.
For instance, what structure will be the best in my case -
1)
"nonAssociatedFindings": {
"finding": [
{
"name": "Non-mass enhancement (NME)",
"parameters": [
{
"name": "Distribution",
"value": "Focal"
},
{
"name": "Internal enhancement patterns",
"value": "Heterogeneous"
}
]
}
]
or
2)
"nonAssociatedFindings": [
{
"name": "Non-mass enhancement (NME)",
"parameters": [
{
"name": "Distribution",
"value": "Focal"
},
{
"name": "Internal enhancement patterns",
"value": "Heterogeneous"
}
]
}
]
Also, if you can suggest any good materials to learn essentials, besides documentation, I would appreciate it.
Thanks in advance!
I am pretty new to Mongo db and coming from T-SQL background, I am finding little hard to understand how joins work in Mongo.
I have a very simple case where i have a "User Table.. err.. Collections" and "User Audit Collections"..
My User Collection looks something like this.
{
"_id": LUUID("d991e92a-766c-054e-9ad8-1c902acc6efc"),
"System": {
"VisitCount": 1
},
"UserData": {
"Uid": "46831",
"UserName": "abc.",
"FirstName": "abv",
"LastName": "test",
"EmailId": "abc#gmail.com",
"Region": "Georgia",
"Postal": "10000",
"Country": "United States",
"Phone": "800-000-1734",
}
}
and a User Audit Table :
{
"_id": LUUID("9561a583-0afe-e844-a090-43ffdab46ed2"),
"UserId": LUUID("914ed252-3fc7-d84c-9731-f382e7cf400b"),
"StartDateTime": ISODate("2016-05-12T04:07:37.299Z"),
"EndDateTime": ISODate("2016-05-12T04:07:42.715Z"),
"SaveDateTime": ISODate("2016-05-12T04:28:23.186Z"),
"Browser": {
"BrowserVersion": "50.0",
"BrowserMajorName": "Chrome",
"BrowserMinorName": "50.0"
},
"Pages": [
{
"DateTime": ISODate("2016-05-12T04:07:37.365Z"),
"Duration": 5416,
"Item": {
"_id": LUUID("f293157a-f22d-fe49-a7b0-f66f412408fe"),
"Language": "en",
"Version": 1
}"Url": {
"Path": "/"
},
"VisitPageIndex": 1
},
{
"DateTime": ISODate("2016-05-12T04:07:42.781Z"),
"Duration": 0,
"Item": {
"Version": 0
},
"SitecoreDevice": {
"_id": LUUID("df7f5dfe-c089-994d-9aa3-b5fbd009c9f3"),
"Name": "Default"
},
"MvTest": {
"ValueAtExposure": 0
},
"Url": {
"Path": "/Sample Page1"
},
"VisitPageIndex": 2
}
]
}
I need a Flat view where each row will hold all the user User information and the pages the user visited.
The Audit information can be grouped by user or repeated per user.. My main idea is to combine the User details with Page visited history.
I am looking for something like a Left outer join equivalent
something like
Select * from usertable, useraudittable
on usertable.id = userAuditTable.UserId
group by userID.
Mongo is a simple object storage database and does not offer a lot of relational operations like joins. Normally you have to do it programmatically doing multiple queries and processing the data using your application code and logic.
In Mongo 3.2 they introduced the lookup operation to the aggregation pipeline and fortunately it kinda does what you are looking for. You can use something like this (using mongo shell javascript syntax as example)
db.user.aggregate([{
$lookup: {
from: "audit",
localField: "_id",
foreignField: "UserId",
as: "VisitedPages"
}
}]);
If you are using the last version of mongo you can play with this approach otherwise you'll need to go with multiple queries on your application.
Take a look at the documentation
I have a collection with data like this:
{
"Name": "Steven",
"Children": [
{
"Name": "Liv",
"Children": [
{
"Name": "Milo"
}
]
},
{
"Name": "Mia"
},
{
"Name": "Chelsea"
}
]
},
{
"Name": "Ozzy",
"Children": [
{
"Name": "Jack",
"Children": [
{
"Name": "Pearl"
}
]
},
{
"Name": "Kelly"
}
]
}
Two questions
Can MongoDB flatten the arrays to a structure like this [Steven, Liv, Milo, Mia,Chelsea, Ozzy, Jack, Pearl,Kelly]
How can I find the a document where name is jack, no matter where in the structure it is placed
In general, MongoDB does not perform recursive or arbitrary-depth operations on nested fields. To accomplish objectives 1 and 2 I would reconsider the structure of the data as an arbitrarily nested document is not a good way to model a tree in MongoDB. The MongoDB docs have a good section of modelin tree structures that present several options with examples. Pick the one that best suits your entire use case - they will all make 1 and 2 very easy.
I have some documents in the "company" collection structured this way :
[
{
"company_name": "Company 1",
"contacts": {
"main": {
"email": "main#company1.com",
"name": "Mainuser"
},
"store1": {
"email": "store1#company1.com",
"name": "Store1 user"
},
"store2": {
"email": "store2#company1.com",
"name": "Store2 user"
}
}
},
{
"company_name": "Company 2",
"contacts": {
"main": {
"email": "main#company2.com",
"name": "Mainuser"
},
"store1": {
"email": "store1#company2.com",
"name": "Store1 user"
},
"store2": {
"email": "store2#company2.com",
"name": "Store2 user"
}
}
}
]
I'm trying to retrieve the doc that have store1#company2.com as a contact but cannot find how to query a specific value of a specific propertie of an "indexed" list of objects.
My feeling is that the contacts lists should not not be indexed resulting in the following structure :
{
"company_name": "Company 1",
"contacts": [
{
"email": "main#company1.com",
"name": "Mainuser",
"label": "main"
},
{
"email": "store1#company1.com",
"name": "Store1 user",
"label": "store1"
},
{
"email": "store2#company1.com",
"name": "Store2 user",
"label": "store2"
}
]
}
This way I can retrieve matching documents through the following request :
db.company.find({"contacts.email":"main#company1.com"})
But is there anyway to do a similar request on document using the previous structure ?
Thanks a lot for your answers!
P.S. : same question for documents structured this way :
{
"company_name": "Company 1",
"contacts": {
"0": {
"email": "main#company1.com",
"name": "Mainuser"
},
"4": {
"email": "store1#company1.com",
"name": "Store1 user"
},
"1": {
"email": "store2#company1.com",
"name": "Store2 user"
}
}
}
Short answer: yes, they can be queried but it's probably not what you want and it's not going to be really efficient.
The document structure in the first and third block is basically the same - you have an embedded document. The only difference between are the name of the keys in the contacts object.
To query document with that kind of structure you will have to do a query like this:
db.company.find({ $or : [
{"contacts.main.email":"main#company1.com"},
{"contacts.store1.email":"main#company1.com"},
{"contacts.store2.email":"main#company1.com"}
]});
This query will not be efficient, especially if you have a lot of keys in the contacts object. Also, creating a query will be unnecessarily difficult and error prone.
The second document structure, with an array of embedded objects, is optimal. You can create a multikey index on the contacts array which will make your query faster. The bonus is that you can use a short and simple query.
I think the easiest is really to shape your document using the structure describe in your 2nd example : (I have not fixed the JSON)
{
"company_name": "Company 1",
"contacts":{[
{"email":"main#company1.com","name":"Mainuser", "label": "main", ...}
{"email":"store1#company1.com","name":"Store1 user", "label": "store1",...}
{"email":"store2#company1.com","name":"Store2 user", "label": "store2",...}
]}
}
like that you can easily query on email independently of the "label".
So if you really want to use the other structure, (but you need to fix the JSON too) you will have to write more complex code/aggregation pipeline, since we do not know the name and number of attributes when querying the system. Theses structures are also probably hard to use by the developers independently of MongoDB queries.
Since it was not clear let me show what I have in mind
db.company.save(
{
"company_name": "Company 1",
"contacts":[
{"email":"main#company1.com","name":"Mainuser", "label": "main"},
{"email":"store1#company1.com","name":"Store1 user", "label": "store1"},
{"email":"store2#company1.com","name":"Store2 user", "label": "store2"}
]
}
);
db.company.save(
{
"company_name": "Company 2",
"contacts":[
{"email":"main#company2.com","name":"Mainuser", "label": "main"},
{"email":"store1#company2.com","name":"Store1 user", "label": "store1"},
{"email":"store2#company2.com","name":"Store2 user", "label": "store2"}
]
}
);
db.company.ensureIndex( { "contacts.email" : 1 } );
db.company.find( { "contacts.email" : "store1#company2.com" } );
This allows you to store many emails, and query with an index.