Quick question: I have a list of articles in MongoDB and I want users to be able to upvote or downvote an article.
My first idea is to give each document in the article collection two fields called upvote and downvote, holding numbers like
upvote: 360
downvote: 102
Then I would order the articles by the difference
upvote - downvote, which gives the net score of each article.
My question: in MongoDB, is this the best way to do it, or am I better off with a single "vote" field and ordering by that?
Thank you
If you do it that way, you don't track which users have already voted, so users can vote multiple times. That is surely not what you want.
For that reason I would add a "votes" array to each article, containing one object per vote that uniquely identifies the user who cast it:
votes: [
{ voter:"name or ID or IP address or some other unique identifier for the person who voted",
vote:-1 },
{ voter:"someone else",
vote:1 },
{ voter:"and someone entirely different",
vote:-1 }
]
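A unique index over the article ID and votes.voter will not enforce this on its own: MongoDB applies a unique index across documents, not between the elements of an array inside a single document. To make sure nobody can vote twice for an article, make the update conditional, so that a vote is only pushed when no element of votes already has that voter.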
If you use -1 for a downvote and 1 for an upvote, you can calculate the total score of an article with the $sum aggregation operator. This also makes it easy to introduce weighted votes later, if you feel like it.
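As a rough illustration of the idea above, here is a minimal in-memory sketch in plain JavaScript. It does not talk to MongoDB; the helper names castVote and score and the voter names are made up for the example. It only shows the two rules: one vote per voter, and the score being the sum of the +1 / -1 values.

```javascript
// In-memory stand-in for one article document; field names follow the answer above.
const article = { _id: 1, votes: [] };

// Record a vote only if this voter has not voted on the article yet.
// Returns true when the vote was stored, false when it was a duplicate.
function castVote(doc, voter, vote) {
  if (doc.votes.some((v) => v.voter === voter)) {
    return false;
  }
  doc.votes.push({ voter, vote });
  return true;
}

// Total score: the sum of all +1 / -1 values (what $sum would compute server-side).
function score(doc) {
  return doc.votes.reduce((total, v) => total + v.vote, 0);
}

castVote(article, "alice", 1);
castVote(article, "bob", -1);
castVote(article, "carol", 1);
const duplicateAccepted = castVote(article, "alice", 1); // alice already voted

console.log(score(article), duplicateAccepted);
```

Weighted votes would only change the values stored in vote; the sum stays the same code.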
Related
Let's say I have 2 collections in which each document may look like this:
Collection 1:
target:
_id,
comments:
[
{ _id,
message,
full_name
},
...
]
Collection 2:
user:
_id,
full_name,
username
I am paging through comments via $slice; let's say I take the first 25 entries.
For these entries I need the corresponding usernames, which I get from the second collection. What I want is to get the comments sorted by their referenced username. The problem is that I can't add the username to the comments, because usernames may change often, and then I would need to update every target document that contains the old username.
I can only imagine one way to solve this: read out all of the full_names and query them in the user collection. The result would be sortable, but it is not paged, so it takes a lot of resources with large documents.
Is there anything I am missing with this problem?
Thanks in advance
If comments are an embedded array, you will have to do work on the client side to sort the comments array unless you store it in sorted order. Your application requirements for username force you to either read out all of the usernames of the users who commented to do the sort, or to store the username in the comments and have (much) more difficult and expensive updates.
Sorting and pagination don't work unless you can return the documents in sorted order. You should consider a different schema where comments form a separate collection so that you can return them in sorted order and paginate them. Store the username in each comment to facilitate the sort on the MongoDB side. Depending on your application's usage pattern this might work better for you.
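To make the suggested schema concrete, here is a small in-memory sketch in plain JavaScript. The comment documents, the targetId field and the pageOfComments helper are all hypothetical; against a real comments collection the same steps would be a find on the target, a sort on username, and skip/limit for the page.

```javascript
// Hypothetical comment documents in their own collection,
// each carrying a copy of the author's username.
const comments = [
  { _id: 1, targetId: 10, username: "mia", message: "first" },
  { _id: 2, targetId: 10, username: "adam", message: "second" },
  { _id: 3, targetId: 10, username: "zoe", message: "third" },
  { _id: 4, targetId: 11, username: "carl", message: "elsewhere" },
  { _id: 5, targetId: 10, username: "ben", message: "fourth" },
];

// Return one page of a target's comments, sorted by username.
// pageNumber is zero-based.
function pageOfComments(all, targetId, pageNumber, pageSize) {
  return all
    .filter((c) => c.targetId === targetId) // filter() copies, so the source array is not reordered
    .sort((a, b) => (a.username < b.username ? -1 : a.username > b.username ? 1 : 0))
    .slice(pageNumber * pageSize, (pageNumber + 1) * pageSize);
}

const firstPage = pageOfComments(comments, 10, 0, 2).map((c) => c.username);
const secondPage = pageOfComments(comments, 10, 1, 2).map((c) => c.username);
console.log(firstPage, secondPage);
```

The trade-off stays the one described above: the sort is cheap because the username is stored with the comment, and a username change has to touch every comment by that user.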
It also seems strange to sort on usernames and expect/allow usernames to change frequently. If you could drop these requirements it'd make your life easier :D
I'm new to MongoDB. Please suggest how to design the schema correctly for a situation like the one below:
I have a User collection and a Product collection. A Product contains info like id, title, description, price... A User can bookmark or like a Product. Currently, in the User collection, I store one array for liked products and one array for bookmarked products. So when I need to view info about a user, I have to read out these two arrays, then search the Product collection to get the titles of the liked and bookmarked products.
//User collection
{
_id : 12345,
name: "John",
liked: [123, 456, 789],
bkmark: [123, 125]
}
//Product collection
{
_id : 123,
title: "computer",
desc: "awesome computer",
price: 12
}
Now I think I can speed up this process by embedding both the product id and the title in the User collection, so that I don't have to search the Product collection and can just read it out and display it. But if I choose this way, whenever a Product's title gets updated, I have to search and update the User collection too. I can't estimate the update cost of the second way, so I don't know which way is correct. Please help me choose between them.
Thanks & Regards.
You should consider what happens more often: A product gets renamed or the information of a user is requested.
You should also consider what's a bigger problem: Some time lag in which users see an outdated product name (we are talking about seconds, maybe minutes when you have a really large number of users) or always a longer response time when requesting a user profile.
Without knowing your actual usage patterns and requirements, I would guess that it's the latter in both cases, so you should rather optimize for this situation.
In general it is not recommended to normalize a MongoDB database as radically as you would normalize a relational database. The reason is that MongoDB cannot perform JOINs in ordinary queries (newer versions add a $lookup aggregation stage, but each lookup still costs extra work at read time). So it's usually not such a bad idea to duplicate some relevant information in multiple documents, while accepting a higher cost for updates and a potential risk of inconsistencies.
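To get a feel for the update cost being discussed, here is a minimal in-memory sketch in plain JavaScript of the denormalized variant. The second user, the second product and the renameProduct helper are invented for the illustration; the point is only that a rename has to visit every user document holding a copy of the title, while reading a profile needs no second lookup.

```javascript
// Product collection (source of truth for titles).
const products = [
  { _id: 123, title: "computer" },
  { _id: 456, title: "phone" },
];

// User collection with id AND title embedded, as proposed in the question.
const users = [
  {
    _id: 12345,
    name: "John",
    liked: [{ _id: 123, title: "computer" }, { _id: 456, title: "phone" }],
    bkmark: [{ _id: 123, title: "computer" }],
  },
  { _id: 67890, name: "Jane", liked: [{ _id: 456, title: "phone" }], bkmark: [] },
];

// Rename a product and propagate the new title to every embedded copy.
// Returns how many user documents had to be modified.
function renameProduct(productId, newTitle) {
  const product = products.find((p) => p._id === productId);
  if (!product) return 0;
  product.title = newTitle;

  let touched = 0;
  for (const user of users) {
    let changed = false;
    for (const entry of [...user.liked, ...user.bkmark]) {
      if (entry._id === productId) {
        entry.title = newTitle;
        changed = true;
      }
    }
    if (changed) touched += 1;
  }
  return touched;
}

const touchedUsers = renameProduct(123, "laptop");
console.log(touchedUsers, users[0].liked[0].title);
```

If renames are rare and profile views are frequent, the extra work in renameProduct is paid seldom and the saving on every read is paid back constantly.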
I'm building a simple web app where users can vote.
What is the fastest way to check whether a user has already voted? I'm interested in both relational databases and document-based databases (MongoDB, ...).
I have a few ideas, but I am sure they can be improved:
Relational databases
Create a separate table for voting:
|userid|articleid|
Before incrementing an article's vote count, check whether there is a row containing both userid and articleid. That makes two queries. Is it possible to improve this with triggers? For example:
|useridarticleid| unique column
Before a vote, generate useridarticleid on the application side and try to insert it. A trigger will fire if the value is new, and it will increment the vote column in the article.
Document based
This is a bit trickier. Say we have a document structured like so:
{
"id": "123",
"content": "something",
"num_votes": 2,
"votes" : [
"userid1",
"userid2"
]
}
First "query" - check if userid is in votes array. Second "query" - Increment num_votes if not.
Again two queries. So I thought we could change this, but I don't really know if it will improve performance:
Insert the userid into the votes array. When a user wants to view an article, count the votes in the array. But I think performance may drop, because under high traffic counting the votes of every article is a bit of a waste. Imagine Reddit here.
Actually, it's a lot simpler in a document database. Your document structure is perfect for it.
{
"id": "123",
"content": "something",
"num_votes": 2,
"votes" : [
"userid1",
"userid2"
]
}
db.collection.update(
{id:"123", votes:{$ne:"userid"}},
{$push:{"votes":"userid"},$inc:{"num_votes":1}}
);
This will atomically update record id=123 adding userid to list of voters and incrementing votes by one only if userid is not already in the list of votes on this document.
So there is only one query and one update - and they are actually the same operation.
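To see why the combined filter-and-update cannot double count, here is a small in-memory imitation in plain JavaScript. The voteOnce helper is hypothetical and runs on a single object rather than a collection; it mirrors the filter (id matches and the user is not yet in votes) and the two modifications ($push and $inc) of the update above.

```javascript
// Same document shape as in the answer above.
const articleDoc = {
  id: "123",
  content: "something",
  num_votes: 2,
  votes: ["userid1", "userid2"],
};

// Apply the vote only when the filter matches.
// Returns the number of modified documents (0 or 1).
function voteOnce(doc, id, userId) {
  const matches = doc.id === id && !doc.votes.includes(userId);
  if (!matches) return 0;
  doc.votes.push(userId); // $push
  doc.num_votes += 1;     // $inc
  return 1;
}

const firstAttempt = voteOnce(articleDoc, "123", "userid3");
const secondAttempt = voteOnce(articleDoc, "123", "userid3"); // same user again
console.log(firstAttempt, secondAttempt, articleDoc.num_votes);
```

In the real update the check and the modification happen as one atomic operation on the document, which is what the sketch cannot show and what makes the single-statement form safe under concurrent requests.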
In a relational database |userid|articleid| would be the best approach, using both fields together as a composite primary key.
In the document case you can also consider whether to put the votes in the user document or in the article document.
Anyway, I'd suggest you really focus on creating a design where changing all these decisions later is easy.
The different ways of designing this favor things like "a lot of users voting on the same article at the same time" or "a lot of users on different articles", etc. Until you can see the real usage, you won't have enough information to decide which approach will work best and fastest. So create something that you can easily adapt to whatever you learn later.
BTW: You might also consider not counting the votes synchronously. I remember an article (which I can't find) which mentioned that YouTube's vote numbers weren't actually "accurate": they showed an estimate of the current votes and calculated the real number in a background worker thread.
Let's say we have a user and a post collection. In the post collection, vote stores the user names as keys.
db.user.insert({name:'a', age:12});
db.user.insert({name:'b', age:12});
db.user.insert({name:'c', age:22});
db.user.insert({name:'d', age:22});
db.post.insert({Title:'Title1', vote:['a']});
db.post.insert({Title:'Title2', vote:['a','b']});
db.post.insert({Title:'Title3', vote:['a','b','c']});
db.post.insert({Title:'Title4', vote:['a','b','c','d']});
We would like to group by post.Title and count the votes per user age.
> {_id:'Title1', value:{ ages:[{age:12, Count:1},{age:22, Count:0}]} }
> {_id:'Title2', value:{ ages:[{age:12, Count:2},{age:22, Count:0}]} }
> {_id:'Title3', value:{ ages:[{age:12, Count:2},{age:22, Count:1}]} }
> {_id:'Title4', value:{ ages:[{age:12, Count:2},{age:22, Count:2}]} }
I have searched and didn't find a way to access two collections in a MongoDB map-reduce.
Could this be achieved in a re-reduce?
I know it is much simpler to embed the user document in the post, but that is not a nice way to do it, as the real user document has many properties. If we embed a simplified version of the user document, it will limit the dimensions of analysis.
{Title:'Title1', vote:[{name:'a', age:12}]}
MongoDB does not have a multi-collection Map / Reduce. MongoDB does not have any JOIN syntax and may not be very good for ad-hoc joins. You will need to denormalize this data in some way.
You have a few options:
Option #1: Embed the age with the vote.
{Title:'Title1', vote:[{name:'a', age:12}]}
Option #2: Keep a counter of the ages
{Title:'Title2', vote:['a', 'b'], age: { "12" : 2, "22" : 0 }}
Option #3: Do a "manual" join
Your last option is to write script/code that does a for loop over both collections and merges the data correctly.
So you would loop over post and output a collection with the title and the list of votes. Then you would loop through the new collection and update the ages by looking up each user.
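A minimal sketch of such a manual join, in plain JavaScript over in-memory copies of the question's data (the variable and function names are made up). It builds a name-to-age lookup from the users once, then counts votes per age for each post and produces the output shape the question asks for.

```javascript
// In-memory copies of the two collections from the question.
const userDocs = [
  { name: "a", age: 12 },
  { name: "b", age: 12 },
  { name: "c", age: 22 },
  { name: "d", age: 22 },
];
const postDocs = [
  { Title: "Title1", vote: ["a"] },
  { Title: "Title2", vote: ["a", "b"] },
  { Title: "Title3", vote: ["a", "b", "c"] },
  { Title: "Title4", vote: ["a", "b", "c", "d"] },
];

// Pass 1: build the lookup from the user collection.
const ageByName = new Map(userDocs.map((u) => [u.name, u.age]));
const allAges = [...new Set(userDocs.map((u) => u.age))].sort((x, y) => x - y);

// Pass 2: for each post, count its votes per age (ages with no votes get 0).
function votesByAge(post) {
  const counts = new Map(allAges.map((age) => [age, 0]));
  for (const name of post.vote) {
    const age = ageByName.get(name);
    if (age !== undefined) counts.set(age, counts.get(age) + 1);
  }
  return {
    _id: post.Title,
    value: { ages: allAges.map((age) => ({ age, Count: counts.get(age) })) },
  };
}

const joined = postDocs.map(votesByAge);
console.log(JSON.stringify(joined[2]));
```

With real collections the two arrays would come from reading user and post, and the lookup map only works while the user collection fits comfortably in memory.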
My suggestion
Go with #1 or #2.
Instead of
{name:'a', age:12}
It is easier to add a new field to the user document and maintain it on each vote update. Of course, you can then run map-reduce over it to analyze your data.
{name:'a', age:12, voteTitle:["Title1","Title2","Title3","Title4"]}
I'm quite new to MongoDB and trying to build a nested comment system with it.
On the net you can find various document structures for that, but I'm looking for proposals that would let me easily do the following things with the comments:
Mark comments as spam/approved and retrieve comments by these attributes
Retrieve comments by user
Retrieve comment count for an object/user
Besides, of course, displaying the comments as is normally done. If you have any suggestions on how to handle these things with MongoDB, or want to tell me to look for an alternative, it'd be much appreciated!
Have you considered storing the comments in all documents that need a reference to them? If you have a document for the user, store all of that user's comments in it. If you have a separate document for objects, store all comments there also. It feels sort of wrong after coming from a relational world where you try to have exactly one copy of a given piece of data, and then reference it by ID, but even with relational databases you have to start duplicating data if you want queries to run quickly.
With this design, each document that you load would be "complete". It would have all the data you need, and indexes on that collection would keep reads fast. The price would be slightly slower writes, and more of a headache when you need to update the comment text, since you need to update more than one document.
Because you need to retrieve comments by some attributes, by user, etc., you can't embed the comments in each object that users can comment on (even though embedding is usually faster for document databases). So you need to create a separate collection for the comments. I suggest the following structure:
comment
{
_id : ObjectId,
status: int (spam =1, approved =2),
userId: ObjectId,
commentedObjectId: ObjectId,
commentedObjectType: int(for example question =1, answer =2, user =3),
commentText
}
With the above structure you can easily do the things you want:
//Mark comments as spam/approved and retrieve comments by these attributes
//mark a specific comment as spam; $set changes only the status field
db.comments.update( { _id: someCommentId }, { $set: { status: 1 } });
db.comments.find({status : 1});// get all comments marked as spam
//Retrieve comments by user
db.comments.find({'userId' : someUserId});
//Retrieve comment count for an object/user
db.comments.find({'commentedObjectId' : someId,'commentedObjectType' : 1 })
.count();
Also, I suppose that for comment counting it is better to create an extra field in each object and increment it when a comment is added and decrement it when one is deleted.
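The counter idea from the last sentence can be sketched like this, in plain JavaScript with in-memory stand-ins. The commentCount field, the question object and the addComment / deleteComment helpers are assumptions for the example; in MongoDB the counter change would be an $inc of +1 or -1 on the commented object.

```javascript
// In-memory stand-ins: one commented object and the comments collection.
const question = { _id: "q1", title: "How to design this?", commentCount: 0 };
const commentStore = [];
let nextCommentId = 1;

// Insert a comment and bump the counter on the commented object.
function addComment(target, userId, commentText) {
  const comment = {
    _id: nextCommentId++,
    status: 2, // approved, following the status codes in the structure above
    userId,
    commentedObjectId: target._id,
    commentText,
  };
  commentStore.push(comment);
  target.commentCount += 1;
  return comment._id;
}

// Delete a comment; the counter only drops when a comment was really removed.
function deleteComment(target, commentId) {
  const index = commentStore.findIndex(
    (c) => c._id === commentId && c.commentedObjectId === target._id
  );
  if (index === -1) return false;
  commentStore.splice(index, 1);
  target.commentCount -= 1;
  return true;
}

const firstId = addComment(question, "u1", "nice");
addComment(question, "u2", "agreed");
deleteComment(question, firstId);
const removedTwice = deleteComment(question, firstId); // already gone
console.log(question.commentCount, removedTwice);
```

Reading the count is then a single field access instead of a count query, at the price of keeping the counter in step with every add and delete.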