mongodb:
#Authors collection
db.authors.insert({name: 'Kobo', birthday: '1860', country: 'jp', tags: ['japan', 'avant-garde', 'screen', 'Akutagawa Prize']})
db.authors.insert({name: 'Sartr', birthday: '1905', country: 'fr', tags: [...]})
db.authors.insert({name: 'Braun', birthday: '1913', country: 'us', tags: [...]})
...
#Books collection
db.books.insert({title: 'book1', author: 'Kobo', year :''});
db.books.insert({title: 'book2', author: 'Sartr', year :''});
db.books.insert({title: 'book3', author: 'author', year :''});
...
Author tags are regularly added to the authors collection.
Tags on books are out of scope for this question.
I need to find all the books whose author has a certain tag, such as 'avant-garde'.
What is the most efficient way to do this?
How can it be done in pymongo?
MongoDB does not support joins in an ordinary find() query (newer versions offer the $lookup aggregation stage, but that is a different approach). So, with the data model you have, you'll have to break it down into 2 queries:
1- Query the authors collection to get the list of all authors that have a certain tag.
2- Query the books collection using the results from Step 1 to get the list of books.
If your data were modeled using embedded documents, i.e., either embedding authors inside books or the other way around, you could get the results in one query.
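The two steps above can be sketched in pymongo roughly as follows. This is a minimal sketch, assuming `db` is a pymongo database handle that holds the `authors` and `books` collections from the question and that books refer to authors by the `name` string; the function name `books_by_author_tag` is made up for illustration.

```python
def books_by_author_tag(db, tag):
    """Return all books whose author carries the given tag (two queries)."""
    # Step 1: names of every author whose tags array contains the tag.
    # Matching a scalar against an array field matches any element.
    author_names = [a["name"] for a in db.authors.find({"tags": tag}, {"name": 1})]
    if not author_names:
        return []
    # Step 2: books written by any of those authors.
    return list(db.books.find({"author": {"$in": author_names}}))
```

It would be called as `books_by_author_tag(db, "avant-garde")`. Indexes on `authors.tags` and `books.author` (for example via `create_index`) would likely help both steps as the collections grow.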
I've been working with MongoDB through Mongoose lately and ran into a problem. I want to update and read back a nested item without specifically targeting it. Let me explain.
I have this schema:
id: {
required: true,
type: String
},
information: {
number: String,
Identification: String,
title: String,
address: String
},
products: {
}
Of course I won't hardcode every product into the schema, because there are a lot of products. What I eventually want to do is something like doc.updateOne({'products.productIDHere.review': newReviewData}, { new: true, upsert: true, setDefaultsOnInsert: true })
So whenever a client changes their review or rating, the document is updated accordingly.
Here are my questions:
1- How do I insert the products individually without overwriting everything within products:{}.
2- How do I update the review or rating value within a certain product.
3- How do I get information about that product because I cannot do something like doc.products.product.id.review, product.id is the only information I have about the product.
4- Do I need to change something about the schema?
Please try to answer with Mongoose, as some things are done differently in MongoDB than in Mongoose. No problem if you would rather answer in plain MongoDB terms, though.
This is a time-honored data design: products and reviews. A good, simple, scalable way to approach it is with two collections: product and reviews. The product collection contains all details about a product and carries the product ID (pid):
{pid: "ABC123", name: "TV", manu: "Sony", ...}
{pid: "G765", name: "Fridge", manu: "Whirlpool", ...}
The reviews collection is an ever-growing list of pid, timestamp, and review information.
{pid: "G765", ts: ISODate("2020-03-04"), author: "A1", review: "Great", rating: 4}
{pid: "G765", ts: ISODate("2020-03-05"), author: "A2", review: "Good", rating: 3}
{pid: "G765", ts: ISODate("2020-03-06"), author: "A3", review: "Awesome", rating: 5}
If you're thinking this sounds very relational, that's because it is and it is a good design pattern.
It answers the OP questions easily:
1- How do I insert the products individually without overwriting everything within products:{}. ANSWER: You simply add a new product doc with a new pid to the product collection.
2- How do I update the review or rating value within a certain product. ANSWER: Not sure you want to do that; you probably want to accumulate reviews over time. But since each review is a separate doc (with a separate _id) you can easily do this:
db.reviews.updateOne({_id: targetID}, {$set: {review: "new text"}});
3- How do I get information about that product because I cannot do something like doc.products.product.id.review, product.id is the only information I have about the product. ANSWER: Easy:
db.product.find({pid:"ABC123"})
or
db.product.find({name:"TV"})
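For readers using pymongo, the two-collection design can be read back along these lines. This is only a sketch, assuming `db` is a pymongo database handle with the `product` and `reviews` collections shown above; the helper name `product_with_reviews` and the shape of the returned dictionary are illustrative choices.

```python
def product_with_reviews(db, pid):
    """Fetch one product plus all of its reviews and the average rating."""
    product = db.product.find_one({"pid": pid})
    if product is None:
        return None
    # Reviews live in their own collection and are linked by pid.
    reviews = list(db.reviews.find({"pid": pid}))
    ratings = [r["rating"] for r in reviews]
    average = sum(ratings) / len(ratings) if ratings else None
    return {"product": product, "reviews": reviews, "average_rating": average}
```

Because each review is its own document, adding or editing one never touches the product document or the other reviews.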
Consider this simple example
(ObjectIds are shortened to make it easier to read).
Tag documents:
{
_id: ObjectId('0001'),
name: 'JavaScript',
// other data
},
{
_id: ObjectId('0002'),
name: 'MongoDB',
// other data
},
...
Assume that we need a separate tag collection, e.g. because we need to store some information about each tag.
If books reference tags by ID:
// a book document
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: [ObjectId('0001'), ObjectId('0002'), ...]
}
If books reference tags by name:
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: ['JavaScript', 'MongoDB', ...]
}
It's known that "reference by ID" is feasible.
I'm thinking that with "reference by name", a query for a book's info only needs to look in the book collection: we would know the tags' names without a join ($lookup) operation, which should be faster.
If the app validates tags before creating or modifying a book, this should also be feasible, and faster.
I'm still not sure about two things:
Is there any hidden drawback to "reference by name"?
Will "reference by name" be slower when "finding all books with a given tag"? Maybe ObjectId is somehow special?
Thanks.
I would say it depends on what your use case is for tags. As you say, it will be more expensive to do a $lookup to retrieve tag names if you reference by ID. On the other hand, if you expect that tag names may change frequently, all documents in the book collection containing that tag will need to be updated on every change.
The ObjectId is simply a 12-byte value, which is autogenerated by the driver if no _id is present in inserted documents. See the MongoDB docs for more info. The only "special behavior" would be the fact that _id has an index by default. An index will speed up lookups in general, but indexes can be created on any field, not just _id.
In fact, the _id does not need to be an ObjectID. It is perfectly legal to have documents with integer _id values for instance:
{
_id: 1,
name: 'Javascript'
},
{
_id: 2,
name: 'MongoDB'
},
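To make the renaming cost of "reference by name" concrete, here is a rough pymongo-style sketch. It assumes `db` is a pymongo database handle with `tags` and `books` collections shaped like the examples in the question; those collection names and the helper `rename_tag` are assumptions for illustration.

```python
def rename_tag(db, old_name, new_name):
    """Rename a tag when books store tag names instead of tag IDs."""
    # Rename the tag document itself (one small write).
    db.tags.update_one({"name": old_name}, {"$set": {"name": new_name}})
    # Rewrite the copied name in every book that carries the tag.
    # The positional operator `$` targets the first array element matched
    # by the filter, so this assumes a tag appears at most once per book.
    result = db.books.update_many(
        {"tags": old_name},
        {"$set": {"tags.$": new_name}},
    )
    return result.modified_count
```

With "reference by ID" only the first write would be needed. An index on the `tags` field of the book collection should keep both "books with a given tag" lookups and this rename reasonably fast, whichever kind of value the array holds.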
I'm making a game; players form leagues and make competing predictions. A league looks like this:
{ leagueName: "Premier League",
players:[
{name: "Goodie", secretPrediction: "abc"},
{name: "Baddie", secretPrediction: "def"}
] }
For each player, I need to publish to the client the names of all the players in the league, but only their own secret prediction. So from above, if Goodie is logged in, the document on mini-mongo should be:
{ leagueName: "Premier League",
players:[
{name: "Goodie", secretPrediction: "abc"},
{name: "Baddie"}
] }
To do this, I have two publications - one to get the whole League document but excluding ALL secret predictions, and one to get the current player's subdocument in the players array including her secret prediction. My publications are:
// Publish whole players array excluding secretPrediction
Leagues.find({"players.name": "Goodie"}, {fields: {"players.secretPrediction": 0}})
// Publish the whole Goodie item in the players array and nothing else
Leagues.find({"players.name": "Goodie"}, {fields: {players: {$elemMatch: {name: "Goodie"}}}})
The problem is that when I subscribe to both the above publications, I don't get the document I want - the secret prediction is excluded even with the second publication. (On their own, the publications behave as expected, it's only when I subscribe to both.)
Now, I understand from this answer that the two publications should be "merged" on the client
Down to the level of top level fields, Meteor takes care to perform a set union among documents, such that subscriptions can overlap - publish functions that ship different top level fields to the client work side by side and on the client, the document in the collection will be the union of the two sets of fields.
So I have two main questions (and well done / thanks for making it this far!):
Is the union of documents not happening because I'm not dealing with top level fields? Is there a way around this?
Am I going about this completely the wrong way? Is there a better way to get the results I want?
Yes, Meteor merges multiple subscriptions only at the level of top-level fields; this is mentioned in the Meteor docs under Meteor.subscribe.
I cannot say that you are heading in the wrong direction; it really depends on your situation and on what features you want to support. Speaking only for myself, I would split the above collection into separate collections. Players may join many leagues and leagues may have many players, so the relation is many-to-many (n-n). For this kind of relation, we should split the data into two collections and use an associative collection (the equivalent of an associative table) to reflect the relation.
So in your case, I would have:
League collection:
[{
_id: 'league1',
name: 'League 1',
// ...
}]
Player collection:
[{
_id: 'player1',
name: 'Player 1',
// ...
}]
League2Player collection:
[{
_id: 'league1player1',
playerId: 'player1',
leagueId: 'league1',
secretPrediction: 'abc',
// ...
}]
Could you instead rearrange the document so that you can use a single query? For example:
{ leagueName: "Premier League",
players:[
{name: "Goodie"},
{name: "Baddie"}
],
playerPredictions:[
{name: "Goodie", secretPrediction: "abc"},
{name: "Baddie", secretPrediction: "def"}
]
}
That way it would be possible in a single query to return all the players and only the playerPredictions entry for the given person.
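A single query against the rearranged document might be built like this. It is a sketch in plain MongoDB query terms (usable with pymongo's `find_one`, for instance), not Meteor-specific code; the function name `league_query_for` is hypothetical.

```python
def league_query_for(player_name):
    """Build the filter and projection for one player's view of a league."""
    # Match leagues in which the player takes part.
    query = {"players.name": player_name}
    projection = {
        "leagueName": 1,
        # Everyone's name is public, so the whole array is included.
        "players": 1,
        # Only the first playerPredictions element matching this player.
        "playerPredictions": {"$elemMatch": {"name": player_name}},
    }
    return query, projection
```

With a pymongo handle `db` and a `leagues` collection (both assumed here), the result would be fetched with `db.leagues.find_one(*league_query_for("Goodie"))`.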
I'm a new user to MongoDB.
When I do a find() on db.users, I get back an object like this:
{"_id" : ObjectId("5373c8779c82e0955aadcddc"), "username": "example"}
How do I link this document to another document? I'm using the command line mongo shell.
For example, I want to associate a person in db.person with an attribute owner in a car object in db.car.
So it sounds like you're trying to do a join, which Mongo does not support. What it does support is embedding. So, based on what you're trying to do, you could embed a list of the cars that a person owns. For example:
{
id: (whatever),
username: "phil",
cars: [
{make: "honda", model: "civic", mileage: 44000},
{make: "ford", model: "focus", mileage: 56000}
]
}
Or, you could link to a list of IDs in your car collection:
{
id: (whatever),
username: "phil",
cars: [
123,
456
]
}
However, this is less efficient, since you'll have to do extra finds to get each car's info, which is why embedding rocks!
This is described in detail here: MongoDB relationships: embed or reference?
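If you do go with the list of IDs, the extra lookup can be done in one query with `$in` rather than one find per car. Below is a minimal pymongo-style sketch, assuming `db` is a pymongo database handle, the `person` and `car` collections from the question, and that each person document stores car `_id` values in a `cars` array; the helper name `cars_for_person` is made up.

```python
def cars_for_person(db, username):
    """Resolve a person's referenced cars with two queries in total."""
    person = db.person.find_one({"username": username})
    if person is None:
        return []
    # One query fetches every referenced car at once.
    car_ids = person.get("cars", [])
    return list(db.car.find({"_id": {"$in": car_ids}}))
```

This keeps car data in one place at the cost of a second round trip, which is the usual trade-off against embedding.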
I have a document that has an id of another document from a different collection embedded in it.
My desired result is to return (I'm using python and pymongo) all the fields of the first collection, and all of the friends from the document that was embedded.
I understand mongo doesn't do joins and I understand I'll need to make two queries. I also don't want to duplicate my data.
My question is how to piece the two queries together in python/pymongo so I have one result with all the fields from both documents in it.
Here is what my data looks like:
db.employees
{_id: ObjectId("4d85c7039ab0fd70a117d733"), name: 'Joe Smith', title: 'junior',
manager: ObjectId("4d85c7039ab0fd70a117d730") }
db.managers
{_id: ObjectId("4d85c7039ab0fd70a117d730"), name: 'Jane Doe', title: 'senior manager'}
desired result
x = {_id: ObjectId("4d85c7039ab0fd70a117d733"), name: 'Joe Smith', title: 'junior',
manager: 'Jane Doe' }
You're basically doing something that Mongo does not support out of the box and that would actually be more painful than using the two records separately.
Basically (in pseudo/JS code since I am not a Python programmer):
var emp = db.employees.findOne({ name: 'Joe Smith' });
var mang = db.managers.findOne({ _id: emp.manager });
And there you have it: you have the two records separately. You cannot chain as #slownage shows and get a merged result, or any result in fact, since MongoDB will return a whole document from one query (a dictionary), not the value of the field, even when you specify only one field to return.
So this is really the only solution: get the two separately and then work on them.
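In Python, the two lookups described above and the merge the question asks for could be sketched like this. It assumes `db` is a pymongo database handle with the `employees` and `managers` collections from the question; the function name `employee_with_manager` is hypothetical.

```python
def employee_with_manager(db, employee_id):
    """Return the employee document with `manager` replaced by the manager's name."""
    employee = db.employees.find_one({"_id": employee_id})
    if employee is None:
        return None
    # Second query: follow the stored reference into the managers collection.
    manager = db.managers.find_one({"_id": employee["manager"]}, {"name": 1})
    # Copy the employee so the merge does not mutate the fetched document.
    merged = dict(employee)
    merged["manager"] = manager["name"] if manager else None
    return merged
```

Nothing is duplicated in the database; the merged dictionary exists only in the application.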
Edit
You could use a DBRef here, but it is pretty much the same as doing it this way; the only difference is that it is a helper that puts the referenced document into your own model instead of you doing it yourself.
If it works it should be something like:
db.managers.find({
  _id: db.employees.findOne(
    { _id: ObjectId("4d85c7039ab0fd70a117d733") },
    { manager: 1 }
  ).manager
})