I have a document which looks like this
{'name':'abc',
'location': 'xyz',
'social_links' : { 'facebook' : 'links',
'stackoverflow': 'links',
'quora' : 'links' ... }
}
I want to count, for each key in social_links, how many documents in my collection have a link of that kind.
Currently my code looks like this:
db.main_candidate.aggregate( [ { '$match': {'social_links.quora': {'$exists': true}}}, {'$group': { '_id' :'quora', 'count': {'$sum':1 }}}])
While this is correctly returning the counts for the specific social_link, I want to write a query which will be able to count for all the social_links in a single query instead of having to write for each specific name.
On MongoDB versions before 3.4.4 I think there is no way to group the way you want without hardcoding the specific names, short of MapReduce. Newer versions offer $objectToArray, which can turn the keys of a sub-document into array elements.
Either way, I would store social_links as an array instead of as a sub-document, which makes more sense to me. Something like:
{'name':'abc',
'location': 'xyz',
'social_links' : [ { 'name':'facebook', 'link' : 'links'},
{ 'name':'quora', 'link' : 'links'},
{ 'name':'stackoverflow', 'link' : 'links'}]
}
Then you could do the following query:
db.col.aggregate([
{
$unwind: "$social_links"
},
{
$group:
{
_id: "$social_links.name",
count: { $sum: 1 }
}
}
])
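If changing the schema is not an option, the original document shape can probably be handled too. The following is a minimal sketch, assuming a server that supports $objectToArray (MongoDB 3.4.4 or later); the names countLinksPipeline and countLinks are only illustrative, and the plain-JavaScript function exists solely to show the result shape one would expect from the pipeline.

```javascript
// Pipeline for the ORIGINAL shape, where social_links is a sub-document.
// Assumption: the server supports $objectToArray (MongoDB 3.4.4+).
const countLinksPipeline = [
  // Skip documents without a social_links sub-document.
  { $match: { social_links: { $type: 'object' } } },
  // { facebook: 'x', quora: 'y' } -> [ { k: 'facebook', v: 'x' }, { k: 'quora', v: 'y' } ]
  { $project: { links: { $objectToArray: '$social_links' } } },
  { $unwind: '$links' },
  // One group per key name, counting the documents that carry it.
  { $group: { _id: '$links.k', count: { $sum: 1 } } }
];

// Plain-JavaScript equivalent, only to illustrate the expected counts.
function countLinks(docs) {
  const counts = {};
  for (const doc of docs) {
    for (const name of Object.keys(doc.social_links || {})) {
      counts[name] = (counts[name] || 0) + 1;
    }
  }
  return counts;
}

const sample = [
  { name: 'abc', social_links: { facebook: 'links', quora: 'links' } },
  { name: 'def', social_links: { quora: 'links' } }
];
console.log(countLinks(sample)); // { facebook: 1, quora: 2 }
```

In the shell this would be run as db.main_candidate.aggregate(countLinksPipeline).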
This has been extensively covered here, but none of the solutions seems to be working for me. I'm attempting to remove an object from an array using that object's id. Currently, my Schema is:
const scheduleSchema = new Schema({
//unrelated
_id: ObjectId,
shifts: [
{
_id: Types.ObjectId,
name: String,
shift_start: Date,
shift_end: Date,
},
],
});
I've tried almost every variation of something like this:
.findOneAndUpdate(
{ _id: req.params.id },
{
$pull: {
shifts: { _id: new Types.ObjectId(req.params.id) },
},
}
);
Within these variations, the usual response I've gotten has been either an empty array or null.
I was able to find a partial workaround and accomplish the deletion by using the main _id of the Schema (instead of only the nested one):
.findOneAndUpdate(
{ _id: <main _id> },
{ $pull: { shifts: { _id: new Types.ObjectId(<nested _id>) } } },
{ new: true }
);
But I was hoping to figure out a way to do this by just using the nested _id. Any suggestions?
The problem is that you are using the same id (req.params.id) both to find the parent document and to match the nested shift.
In mongo, the update method takes three objects: query, update and options.
query selects the document in the collection that will be updated.
update describes the action to apply to that document (add a field, change a value...).
options holds any additional options.
Then, assuming you have this collection:
[
{
"_id": 1,
"shifts": [
{
"_id": 2
},
{
"_id": 3
}
]
}
]
If you look for a document whose _id is 2, the result will obviously be empty.
And if no document is found, no document is updated.
What happens if we look for a document using shifts._id: 2?
This tells mongo "find a document whose shifts field contains an object with _id equal to 2". This query works, but be careful: it returns the WHOLE document, not only the array element that matches the _id.
It does not return this:
[
{
"_id": 1,
"shifts": [
{
"_id": 2
}
]
}
]
With this query mongo returns the ENTIRE document that has a shifts field containing an object whose _id is 2, including the whole array.
So now you know why the find works. Adding this filter to an update gives the following queries.
This one removes the shifts element whose _id is equal to 2, whichever document contains it:
db.collection.update({
"shifts._id": 2
},
{
$pull: {
shifts: {
_id: 2
}
}
})
Or this one, which removes the shift only if the parent _id is equal to 1:
db.collection.update({
"_id": 1
},
{
$pull: {
shifts: {
_id: 2
}
}
})
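To tie this back to the question (deleting a shift when only the nested _id is known), here is a small sketch. The helper names buildPullShift and applyPull are hypothetical; applyPull is just an in-memory illustration of what $pull does to the matched document, not how the server implements it.

```javascript
// Hypothetical helper: builds the query, update and options objects for
// removing one shift when only the nested shift _id is known.
function buildPullShift(shiftId) {
  return {
    filter: { 'shifts._id': shiftId },               // find the parent via the nested id
    update: { $pull: { shifts: { _id: shiftId } } }, // drop the matching array element
    options: { new: true }                           // Mongoose option: return the updated document
  };
}

// In-memory illustration of the effect of $pull on one document.
function applyPull(doc, shiftId) {
  return { ...doc, shifts: doc.shifts.filter(s => s._id !== shiftId) };
}

const doc = { _id: 1, shifts: [{ _id: 2 }, { _id: 3 }] };
const args = buildPullShift(2);
console.log(JSON.stringify(args.filter));       // {"shifts._id":2}
console.log(JSON.stringify(applyPull(doc, 2))); // {"_id":1,"shifts":[{"_id":3}]}
```

With Mongoose this would presumably be called as Schedule.findOneAndUpdate(args.filter, args.update, args.options), where Schedule is an assumed model name and shiftId has already been converted to an ObjectId.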
I am trying to implement a search feature to MongoDB and this is the aggregate pipeline I am using:
[
{
'$search': {
'text': {
'query': 'albus',
'path': [
'first_name', 'email', 'last_name'
]
}
}
}, {
'$project': {
'_id': 1,
'first_name': 1,
'last_name': 1
}
}, {
'$limit': 5
}
]
The command returns only documents that contain exactly albus or Albus, but returns nothing for queries like alb, albu, etc. In the demo video I watched here: https://www.youtube.com/watch?time_continue=8&v=kZ77X67GUfk, the instructor was able to search based on a substring.
The search index I am currently using is the default dynamic one.
How would I need to change my command?
You need to use the autocomplete feature, so your query will look like this:
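{
$search: {
// autocomplete takes a single field as path, so use one clause per field
"compound": {
"should": [
{ "autocomplete": { "query": "albus", "path": "first_name" } },
{ "autocomplete": { "query": "albus", "path": "email" } },
{ "autocomplete": { "query": "albus", "path": "last_name" } }
]
}
}
}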
Mind you, first_name, email and last_name all need to be mapped as the autocomplete type in the search index, so that a name like albus is indexed as a, al, alb, albu, albus. Obviously this will vastly increase your index size.
Another thing to consider is tweaking the maxGrams and tokenization parameters. A larger maxGrams lets very long names still work as expected, and nGram tokenization allows substring matches such as lbu matching albus.
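To make the indexing behaviour concrete, here is a rough sketch of how prefix (edgeGram) and substring (nGram) tokens could be generated. The functions edgeGrams and nGrams are illustrative only; they are not Atlas Search APIs, and the real analyzers also apply normalisation that is ignored here. The indexDefinition object is likewise an assumed example of an autocomplete mapping, with the gram sizes picked arbitrarily.

```javascript
// Prefix tokens, roughly what an edgeGram-style tokenizer would index.
function edgeGrams(word, minGrams, maxGrams) {
  const grams = [];
  const longest = Math.min(word.length, maxGrams);
  for (let len = minGrams; len <= longest; len++) {
    grams.push(word.slice(0, len));
  }
  return grams;
}

// Every substring within the length bounds, roughly what nGram would index.
function nGrams(word, minGrams, maxGrams) {
  const grams = new Set();
  for (let start = 0; start < word.length; start++) {
    for (let len = minGrams; len <= maxGrams && start + len <= word.length; len++) {
      grams.add(word.slice(start, start + len));
    }
  }
  return [...grams];
}

console.log(edgeGrams('albus', 1, 15));                 // [ 'a', 'al', 'alb', 'albu', 'albus' ]
console.log(edgeGrams('albus', 1, 15).includes('lbu')); // false
console.log(nGrams('albus', 3, 4).includes('lbu'));     // true

// Assumed index definition sketch: each searched field mapped as autocomplete.
const indexDefinition = {
  mappings: {
    dynamic: false,
    fields: {
      first_name: { type: 'autocomplete', tokenization: 'edgeGram', minGrams: 2, maxGrams: 15 },
      last_name: { type: 'autocomplete', tokenization: 'edgeGram', minGrams: 2, maxGrams: 15 },
      email: { type: 'autocomplete', tokenization: 'edgeGram', minGrams: 2, maxGrams: 15 }
    }
  }
};
```

The nGram variant produces many more tokens per word than edgeGram, which is why substring matching costs noticeably more index space.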
I have a collection of tweets and I am trying to output the retweets at the root level (and similarly the quoted tweets) to a new collection, so that I can merge them later with the original collection (using dump and restore).
The retweeted status is a subdocument in the tweet document, and there may be multiple tweets retweeting the same tweet.
How can I move the retweet to the root level and add an array called 'retweeted_by' that contains the ids of all tweets that retweeted it?
Keep in mind that I am using the tweet id as the primary index (_id) to avoid creating duplicates when combining (mongorestore) collections.
My collection has the form:
{
"_id" : "123456",
"other_fields1" : "values1",
"retweeted_status" : {
"retweet_id": "159753",
"other_fields2" : "values2",
}
}
The ideal output is expected to look like:
{
"_id" : "159753",
"other_fields2" : "values2",
"retweeted_by" : [ "123456", "974631", "121212"]
}
edit for clarification:
The fields in the subdocument (other_fields2) are multiple fields (~28) that are not all present in other tweets
OK, so I finally reached a solution to my question, though I am not sure it is the best way to do it:
db.tweets.aggregate([
{
$match: { retweeted_status: {$exists: true}}
},
{
$addFields: { 'retweeted_status.retweeted_by' : '$_id', 'retweeted_status._id' : '$retweeted_status.id_str'}
},
{
$replaceRoot: { newRoot: '$retweeted_status'}
},
{
$group: { _id: '$_id', doc: { '$first': '$$ROOT' }, retweeted_by: {$addToSet: '$retweeted_by'}}
},
{
$addFields: { 'doc.retweeted_by' : '$retweeted_by'}
},
{
$replaceRoot: { newRoot: '$doc'}
},
{
$project: { id: 0 , id_str: 0 }
},
{
$out: 'retweets'
}
], {allowDiskUse: true})
Initially each document (tweet) has the form:
{parent, {subdocument}}
First I match on the existence of a retweeted_status (subdocument). Then, before grouping by the retweeted_status id, I add a field holding the id of the parent tweet:
{parent, {subdocument , parent_id}}
Then replaced the root with the modified subdocument:
{subdocument, parent_id}
Then I grouped by the new root's _id, took the first document of each group, and collected the parent ids into a set (retweeted_by) with $addToSet rather than $push, because the Twitter API sometimes sends duplicates.
At this point the root document contains _id, the retweeted document embedded in the field 'doc', and an array that contains the parents:
{doc{subdocument, parent_id}, [parent_ids]}
Next, I added the parents array as a field inside doc (overwriting the previously added retweeted_by field):
{doc{subdocument, [parent_ids]}, [parent_ids]}
Then I replaced the parent (root) with the new doc, and finally excluded the fields (id, id_str) that hold the same number as _id:
{subdocument, [parent_ids]}
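The same transformation can be sketched in plain JavaScript, which may help when checking the pipeline against a handful of sample tweets. The function hoistRetweets and the sample field text are made up for illustration; only _id, retweeted_status, id, id_str and retweeted_by come from the pipeline above.

```javascript
// In-memory sketch of the pipeline: hoist each retweeted_status to the
// root and collect the ids of the tweets that retweeted it.
function hoistRetweets(tweets) {
  const byId = new Map();
  for (const tweet of tweets) {
    const rt = tweet.retweeted_status;
    if (!rt) continue;                    // $match: keep retweets only
    const { id, id_str, ...rest } = rt;   // final $project: drop id and id_str
    if (!byId.has(id_str)) {
      // $group with $first: the first subdocument seen represents the group
      byId.set(id_str, { _id: id_str, ...rest, retweeted_by: [] });
    }
    const entry = byId.get(id_str);
    if (!entry.retweeted_by.includes(tweet._id)) {
      entry.retweeted_by.push(tweet._id); // $addToSet: no duplicate parents
    }
  }
  return [...byId.values()];
}

const tweets = [
  { _id: '123456', retweeted_status: { id: 159753, id_str: '159753', text: 'hello' } },
  { _id: '974631', retweeted_status: { id: 159753, id_str: '159753', text: 'hello' } },
  { _id: '123456', retweeted_status: { id: 159753, id_str: '159753', text: 'hello' } },
  { _id: '555555', text: 'not a retweet' }
];
console.log(JSON.stringify(hoistRetweets(tweets)));
// [{"_id":"159753","text":"hello","retweeted_by":["123456","974631"]}]
```

Unlike this sketch, $addToSet on the server does not guarantee any particular order of the ids in retweeted_by.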
In my MongoDB database (using Mongoose), I have a story collection whose documents contain a comments array of subdocuments, and I want to query the subdocuments by client id, like this:
Story.find({ 'comments.client': id }, { title: 1, 'comments.$': 1 }, function (err, stories) {
...
})
The query works except that it only returns the first matched subdocument, but I want it to return all matching subdocuments. Did I miss an option?
EDIT:
On Blakes Seven's tip, I tried the answers from Retrieve only the queried element in an object array in MongoDB collection, but I couldn't make it work.
First tried this:
Story.find({'comments.client': id}, { title: 1, comments: {$elemMatch: { client: id } } }, function (err, stories) {
})
It also returns the first match only.
Then, I tried the accepted answer there:
Story.aggregate({$match: {'comments.client': id} }, {$unwind: '$comments'}, {$match : {'comments.client': id} }, function (err, stories) {
})
but this returns nothing. What is wrong here?
UPDATE:
My data structure looks like this:
{
"_id" : ObjectId("55e2185288fee5a433ceabf5"),
"title" : "test",
"comments" : [
{
"_id" : ObjectId("55e2184e88fee5a433ceaaf5"),
"client" : ObjectId("55e218446033de4e7db3f2a4"),
"time" : ISODate("2015-08-29T20:16:00.000Z")
}
]
}
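One possible direction, offered as a sketch rather than a confirmed fix: Mongoose does not cast values inside aggregation pipelines, so a string id compared against an ObjectId field matches nothing there, which would explain the empty result. Assuming a server with $filter (MongoDB 3.2 or later), all matching comments can be kept per story. The helper buildCommentsPipeline is a hypothetical name, and clientId is expected to already be an ObjectId when used against real data.

```javascript
// Hypothetical helper: builds a pipeline that keeps, for each matching story,
// only the comments written by the given client.
// Assumption: clientId is already an ObjectId, because aggregate() pipelines
// are not cast against the schema the way find() filters are.
function buildCommentsPipeline(clientId) {
  return [
    // Keep stories that have at least one comment by this client.
    { $match: { 'comments.client': clientId } },
    {
      $project: {
        title: 1,
        // $filter returns every matching array element, not just the first.
        comments: {
          $filter: {
            input: '$comments',
            as: 'comment',
            cond: { $eq: ['$$comment.client', clientId] }
          }
        }
      }
    }
  ];
}

const pipeline = buildCommentsPipeline('client-id-placeholder');
console.log(pipeline.length); // 2
```

With Mongoose this would presumably be used as Story.aggregate(buildCommentsPipeline(new mongoose.Types.ObjectId(id))).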
I have an Item schema that stores item details together with the restaurant each item belongs to. I have to find all items of a particular restaurant and group them by 'type' and 'category' (both are fields in the Item schema). I am able to group the items as I want, but I can't get the complete item objects.
My query:
db.items.aggregate([{
'$match': {
'restaurant': ObjectId("551111450712235c81620a57")
}
}, {
'$group': {
id: {
'$push': '$_id'
}
, _id: {
type: '$type'
, category: '$category'
}
}
}, {
$project: {
id: '$id'
}
}])
I have seen one method that adds each field value to the group and then projects it. As I have many fields in my Item schema, I don't feel this is a good solution for me. Can I get the complete objects instead of only the ids?
Well, you can always use $$ROOT, provided that your server is MongoDB 2.6 or greater:
db.items.aggregate([
{ '$match': {'restaurant': ObjectId("551111450712235c81620a57")}},
{ '$group':{
_id : {
type : '$type',
category : '$category'
},
id: { '$push': '$$ROOT' },
}}
])
Which is going to place every whole object into the members of the array.
You need to be careful when doing this: each group is returned as a single document, so with larger results you are likely to exceed the 16MB BSON document size limit.
I would suggest that you are trying to construct some kind of "search results", with "facet counts" or similar. For that you are better off running a separate query for the "aggregation" part and one for the actual document results.
That is a much safer and more flexible approach than trying to group everything together.
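A rough sketch of that two-query approach follows. The helper buildFacetQueries and its parameters are hypothetical; the field names restaurant, type and category are taken from the question.

```javascript
// Hypothetical helper: one pipeline for the facet counts and one plain
// filter for fetching the complete item documents of a chosen facet.
function buildFacetQueries(restaurantId, type, category) {
  return {
    // Small result: one document per (type, category) pair with a count.
    countsPipeline: [
      { $match: { restaurant: restaurantId } },
      { $group: { _id: { type: '$type', category: '$category' }, count: { $sum: 1 } } }
    ],
    // Full documents, fetched separately (and pageable with skip/limit).
    itemsFilter: { restaurant: restaurantId, type: type, category: category }
  };
}

const queries = buildFacetQueries('restaurant-id-placeholder', 'veg', 'starter');
console.log(JSON.stringify(queries.itemsFilter));
// {"restaurant":"restaurant-id-placeholder","type":"veg","category":"starter"}
```

In the shell these would be run as db.items.aggregate(queries.countsPipeline) and db.items.find(queries.itemsFilter), with restaurantId being the restaurant's ObjectId.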