Join like query in mongodb


I have a requirement where I need to get the friends of a user. I have made two collections named User and Friends.
The code that I use to access the data from Friends and User is:
var friend = Friends.find({acceptor:req.currentUser.id,status:'0'},function(err, friends) {
console.log('----------------friends-----------------'+friends.length);
});
console.log gives me the desired results for friends. Now, if I use friend to access the User data as shown here, I do not get the result that I need.
var user = User.find({_id:friend.requestor},function(err, users) {
console.log('----------------user-----------------'+users.length);
});
How can I join the two queries to get the desired result? Please help.

I'd suggest you try to denormalize the data instead of going down the SQL path:
User {
"FirstName" : "Jack",
"LastName" : "Doe",
// ...
// no friend info here
}
Put denormalized information in the list of friends. Don't use an embedded array, because you probably don't need to fetch all friend ids every time you fetch a user. The details of the data structure depend on the relations you want to support (directed vs. undirected, etc.), but it would roughly look like this:
FriendList {
OwnerUserId : ObjectId("..."),
FriendUserId : ObjectId("..."),
FriendName: "Jack Doe"
// add more denormalized information
}
Now, to display the list of friends of a user:
var friends = db.FriendList.find({"OwnerUserId" : currentUserId});
The downside is that, if a friend changes her name, you'll have to update all references to that name. On the other hand, that logic is trivial, and the (typically much more common) query "fetch all friends" is super fast, easy to code, and easy to page.
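The rename cost described above can be sketched as a single multi-document update. This is only a minimal sketch: it assumes a collection handle that exposes updateMany (as the Node.js driver and mongosh do) and the FriendList field names from the example. propagateFriendName is a hypothetical helper name.

```javascript
// Hypothetical helper: push a user's new display name into every
// FriendList entry that points at that user.
// `friendList` is assumed to be a collection handle exposing updateMany().
function propagateFriendName(friendList, friendUserId, newName) {
  return friendList.updateMany(
    { FriendUserId: friendUserId },   // every entry that names this friend
    { $set: { FriendName: newName } } // overwrite the denormalized copy
  );
}
```

An index on FriendUserId would presumably keep this update from scanning the whole collection.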

Related

How correctly aggregate and lookup mongo data to models

Consider two collections:
users (can be both, organizer and participant in other meetings)
meetings
For the sake of simplicity, I show only the basic data here; in my code I have emails, passwords, etc.
User (easier part)
{ "_id": "ObjectId('user0_id')", "username": "Paul" }
and model:
type User struct {
Id primitive.ObjectID `bson:"_id" json:"id,omitempty"`
Username string `json:"username"`
}
Meeting:
{
"_id": "ObjectId('meeting0_id')",
"organizer": "ObjectId('user0_id')",
"participants": [ "ObjectId('user1_id')", "ObjectId('user2_id')", "ObjectId('user3_id')"]
}
and model:
type Meeting struct {
Id primitive.ObjectID `bson:"_id" json:"id,omitempty"`
Organizer primitive.ObjectID `json:"organizer"`
Participants []primitive.ObjectID `json:"participants,omitempty"`
}
Everything works if I extract only the basic data from mongo, but it seems inefficient. If I want to present this data in a readable form, ObjectIDs alone are not enough, so I would like to use mongo's "$lookup" to get more data about the users right away.
Another problem: in some cases I need a different dataset. To show a list, I need only the names of the users assigned to the meeting.
However, for data administration or sending notifications, I need more (all?) User data.
What is the correct way (what are the best practices) to store data like this in Go models?
Create one super-struct "meeting" with all possible data? E.g. with Participants []User instead of IDs?
But what next? Get a complete set of data from the database each time, then filter it on the code side? Or filter on the mongo side, in which case almost all fields in the struct will be empty most of the time (e.g. LastPasswordChangeDate in a simple list of meeting participants)? Especially since there may be more "lookup" data, e.g. meeting place, invitations, etc.
And finally, how do I save this super-struct to two collections?
P.S. Creating different models for different "views" of meetings seems super stupid...
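For reference, the "$lookup" the question has in mind might look roughly like the pipeline below. It is a sketch under assumptions, not a recommended design: the collection names users and meetings come from the question, while meetingPipeline, userFields and the output field participantDocs are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: build an aggregation pipeline that loads one meeting
// and resolves its participant ids into user documents.
// `userFields` lists which user fields a given view needs.
function meetingPipeline(meetingId, userFields) {
  const project = { organizer: 1 };
  for (const field of userFields) {
    // Keep only the requested fields of each joined user document.
    project["participantDocs." + field] = 1;
  }
  return [
    { $match: { _id: meetingId } },
    {
      $lookup: {
        from: "users",              // collection to join with
        localField: "participants", // array of user ids on the meeting
        foreignField: "_id",
        as: "participantDocs"       // assumed name for the joined array
      }
    },
    { $project: project }
  ];
}
```

A list view could then pass ["username"] and an administration view a longer field list, for example db.meetings.aggregate(meetingPipeline(id, ["username"])), so the filtering happens on the mongo side.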

How can I effectively design and fetch associated documents in MongoDB?

Currently I have two models in my application - for users and comments. The simplified structure is as follows:
User
{
id : "01",
username : "john"
}
Comment
{
id : "001",
body : "this is the comment"
}
Now I would like to associate users with their comments. Coming from SQL world, the first thing coming to my mind is simply adding user_id field in comment document and then use JOIN, but I guess it's not an optimal solution in terms of efficiency.
The other solution could be to embed comments in user's document:
{
id : "01",
username : "john",
comments : [
{
id : "001",
body : "this is the comment"
}
]
}
But I'm going to query for comments very often, e.g. when showing all comments from the past 24 or 48 hours. And alongside each comment, I want to display the username.
I could of course add username field to the comment document. But then I have username stored in two places - in users collection and comments collection.
What is the best approach here?

If you have a very large number of comments per user, embedding those comments as sub-documents of the user document is not a good option; it will not improve efficiency. The best option is to put the user_id in the comment collection and create an index on that field.
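A rough sketch of how the referenced layout could be used: fetch the comments, fetch the users they point at, and combine the two in application code. The user_id field follows the answer; attachUsernames is a hypothetical helper, and the sample data below only mirrors the documents from the question.

```javascript
// In mongosh, the index the answer recommends could be created with:
//   db.comments.createIndex({ user_id: 1 })

// Hypothetical helper: attach usernames to comments that were fetched
// separately from the users they reference.
function attachUsernames(comments, users) {
  const nameById = new Map(users.map((u) => [u.id, u.username]));
  return comments.map((c) => ({ ...c, username: nameById.get(c.user_id) }));
}

const users = [{ id: "01", username: "john" }];
const comments = [{ id: "001", body: "this is the comment", user_id: "01" }];
console.log(attachUsernames(comments, users));
```

The username is stored in one place only, at the cost of a second query for the users referenced by the comments being shown.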

MongoDB document model size/performance limits? A collection with an object that possibly houses 100k+ names?

I'm trying to build an event website that will host videos and such. I've set up a collection with the event name, event description, and an object with some friendly info of people "attending". If things go well there might be 100-200k people attending, and those people should have access to whoever else is in the event. (clicking on the friendly name will find the user's id and subsequently their full profile) Is that asking too much of mongo? Or is there a better way to go about doing something like that? It seems like that could get rather large rather quick.
{
_id : ...., // event Id,
'name' : // event name
'description' : //event description
'attendees' :{
{'username': user's friendly name, 'avatarlink': avatar url},
{'username': user's friendly name, 'avatarlink': avatar url},
{'username': user's friendly name, 'avatarlink': avatar url},
{'username': user's friendly name, 'avatarlink': avatar url}
}
}
Thanks for the suggestions!

In MongoDB many-to-many (or one-to-many) modeling in general, you should take a different approach depending on whether the "many" are few (usually up to a few dozen) or really many, as in your case.
It will be better for you not to use embedding in your case, and to normalize instead. If you embed users in your events collection, adding attendees to a certain event will increase the array size. Since documents are updated in place, a document that no longer fits its allocated space on disk has to be moved, a very expensive operation which also causes fragmentation. There are a few techniques to deal with moves, but none is ideal.
Having an array of ObjectIds as attendees will be better in that documents will grow much less dramatically, but it still raises a few problems. How will you find all events a user has participated in? You can have a multi-key index on attendees, but once a certain document moves, the index will have to be updated for each user entry (the index contains a pointer to the document's place on disk). In your case, where you plan to have up to 200K users, it will be very, very painful.
Embedding is a very cool feature of MongoDB or any other document-oriented database, but it's naive to think it always comes without a price.
I think you should really rethink your schema: having an events collection, a users collection and a user_event collection with a structure similar to this:
{
_id : ObjectId(),
user_id : ObjectId(),
event_id : ObjectId()
}
Normalization is not a dirty word
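To illustrate how the link collection answers both questions ("who attends this event?" and "which events has this user attended?"), here is a small in-memory sketch. The sample ids and the helper names attendeeIds and eventIdsFor are hypothetical; against MongoDB, the same lookups would presumably be a find on event_id or on user_id, each backed by its own index.

```javascript
// Hypothetical link documents, one per (user, event) pair.
const userEvent = [
  { user_id: "u1", event_id: "e1" },
  { user_id: "u2", event_id: "e1" },
  { user_id: "u1", event_id: "e2" }
];

// Ids of all users attending one event.
function attendeeIds(links, eventId) {
  return links.filter((l) => l.event_id === eventId).map((l) => l.user_id);
}

// Ids of all events one user attends.
function eventIdsFor(links, userId) {
  return links.filter((l) => l.user_id === userId).map((l) => l.event_id);
}
```

Adding an attendee is then an insert of one small document rather than an update that grows the event document.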

Perhaps you should consider modeling your data in two collections and your attendees field in an event document would be an array of user ids.
Here's a sample of the schema:
db.events
{
_id : ...., // event Id,
'name' : // event name
'description' : //event description
'attendees' :[ObjectId('userId1'), ObjectId('userId2') ...]
}
db.users
{
_id : ObjectId('userId1'),
username: 'user friendly name',
avatarLink: 'url to avatar'
}
Then you could do 2 separate queries
db.events.find({_id: ObjectId('eventId')});
db.users.find({_id: {$in: [ObjectId('userId1'), ObjectId('userId2')]}});

mongodb - add column to one collection find based on value in another collection

I have a posts collection which stores posts related info and author information. This is a nested tree.
Then I have a postrating collection which stores which user has rated a particular post up or down.
When a request is made to get a nested tree for a particular post, I also need to return whether the current user has voted and, if so, whether up or down, for each of the posts being returned.
In SQL this would be something like "posts.*, postrating.vote from posts join postrating on postID and postrating.memberID=currentUser".
I know MongoDB does not support joins. What are my options with MongoDB?
use map reduce - performance for a simple query?
in the post document store the ratings - BSON size limit?
Get list of all required posts. Get list of all votes by current user. Loop on posts and if user has voted add that to output?
Is there any other way? Can this be done using aggregation?
NOTE: I started on MongoDB last week.

In MongoDB, the simplest way is probably to handle this with application-side logic and not to try this in a single query. There are many ways to structure your data, but here's one possibility:
user_document = {
name : "User1",
postsIhaveLiked : [ "post1", "post2" ... ]
}
post_document = {
postID : "post1",
content : "my awesome blog post"
}
With this structure, you would first query for the user's user_document. Then, for each post returned, you could check if the post's postID is in that user's "postsIhaveLiked" list.
The main idea with this is that you get your data in two steps, not one. This is different from a join, but based on the same underlying idea of using one key (in this case, the postID) to relate two different pieces of data.
In general, try to avoid using map-reduce for performance reasons. And for this simple use case, aggregation is not what you want.
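The two-step approach can be sketched as follows, using the document shapes from the example above. annotatePosts and the likedByMe output field are hypothetical names. The structure only tracks likes, so up and down votes would need a second list or a map from postID to vote, which is not shown here.

```javascript
// Hypothetical helper: mark each post with whether the current user liked it.
// `posts` is the result of the post query, `userDocument` the user's document.
function annotatePosts(posts, userDocument) {
  // A Set makes each membership check cheap, even for long lists.
  const liked = new Set(userDocument.postsIhaveLiked);
  return posts.map((post) => ({ ...post, likedByMe: liked.has(post.postID) }));
}
```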

How should I insert a bunch of data in Meteor Mongodb?

It's taking a long time to save 500+ Facebook friends in MongoDB and I think I'm doing it so wrong. I'll paste how I'm doing the insertion:
models.js:
Friends = new Meteor.Collection('friends');
Friend = {
set : function(owner, friend) {
var user_id = get_user_by_uid(friend['uid']);
return Friends.update({uid: friend['uid'], owner: owner}, {$set:{
name : friend['name'],
pic_square : 'https://graph.facebook.com/'+friend['uid']+'/picture?width=150&height=150',
pic_cover : friend['pic_cover'],
uid : friend['uid'],
likes_count : friend['likes_count'],
friend_count : friend['friend_count'],
wall_count : friend['wall_count'],
age : get_age(friend['birthday_date']),
mutual_friend_count : friend['mutual_friend_count'],
owner : owner,
user_id : user_id ? user_id['_id'] : undefined
}}, {upsert: true});
}
}
server.js:
// First get facebook list of friends
friends = friends['data']['data'];
_.each(friends, function(friend){
Friend.set(user_id, friend);
});
The load goes high with 2+ users and it takes ages to insert into the database. What should I change here?

The performance is bad for two reasons I think.
First, you are experiencing minimongo performance, not mongodb performance, on the client. minimongo can't index, so upsert is expensive: it is O(n^2) in the database size. Simply add an if (Meteor.isSimulation) return; statement to your model just before the database update.
Take a look at some sample code to see how to organize your code a bit, because Friend.set(user_id, friend) should be occurring in a method call, conventionally, defined in your model.js. Then, it should escape if it is detected as a client simulating the call, as opposed to the server executing it.
Second, you are using uid and owner like a key without making them a key. On your server startup code, add Friends._ensureIndex({uid:1, owner:1}).
If none of this works, then your queries to Facebook might be rate-limited in some way.
Check out https://stackoverflow.com/a/8805436/1757994 for some discussion about the error message you'd receive if you are being rate limited.
They almost certainly do not want you copying the graph the way you are doing. You might want to consider not copying the graph at all and only getting the data on a use-basis, because it very rapidly becomes stale anyway.
