Aggregate not behaving in Meteor as in Mongo - mongodb

This is the query that I'm trying to run. If I run it on the Mongo console I get
meteor:PRIMARY> db.keywords_o2m.aggregate({$match:{keyword:{$in:['sql']}}},{$unwind:'$synonym'},{$group:{_id:0,kw:{$addToSet:'$synonym'}}});
{ "_id" : 0, "kw" : [ "database" ] }
However, if I copy and paste it and try to run it in Meteor by calling Meteor.call('getAllKeywordSynonyms',kw,function(err,data){...}); with this code
if(Meteor.isServer){
Meteor.methods({
'getAllKeywordSynonyms':function(keyword){
console.log("keywordO2M aggregate");
console.log(keywordO2M.aggregate({$match:{keyword:{$in:['sql']}}},{$unwind:'$synonym'},{$group:{_id:0,kw:{$addToSet:'$synonym'}}}));
}
});
}
I get
I20151220-12:49:38.197(-8)? keywordO2M aggregate
I20151220-12:49:38.197(-8)? [ { _id: 5676fe5a17aeddb799dc4ef8,
I20151220-12:49:38.197(-8)? keyword: 'sql',
I20151220-12:49:38.197(-8)? synonym: 'database' } ]
It looks like it ran the $match and ignored the $unwind and $group.
I've tried using meteorhacks:aggregate and monbro:mongod-mapreduce-aggregation, but no difference.
What am I doing wrong?

Meteor is not as forgiving with notation.
With the aggregation function, the pipeline stages need to all be passed as a single parameter in an array, so the correct syntax would be
console.log(keywordO2M.aggregate([{$match:{keyword:{$in:['sql']}}},{$unwind:'$synonym'},{$group:{_id:0,kw:{$addToSet:'$synonym'}}}]));
Note the square brackets so that only one parameter gets passed to aggregate.
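To see why the extra stages seem to vanish, here is a minimal in-memory sketch. The aggregate and runStage functions below are hypothetical stand-ins, not Meteor's or the driver's code, and they assume a driver-style signature of (pipeline, options), which is presumably what the Meteor wrapper follows. They only understand the three stages from the question.

```javascript
// Hypothetical stand-in for one pipeline stage, applied to in-memory documents.
function runStage(docs, stage) {
  if (stage.$match) {
    // Supports only { field: { $in: [...] } } conditions.
    return docs.filter((doc) =>
      Object.entries(stage.$match).every(([field, cond]) =>
        cond.$in.includes(doc[field])
      )
    );
  }
  if (stage.$unwind) {
    const field = stage.$unwind.slice(1); // drop the leading "$"
    return docs.flatMap((doc) =>
      [].concat(doc[field]).map((value) => ({ ...doc, [field]: value }))
    );
  }
  if (stage.$group) {
    // Supports only a constant _id and $addToSet accumulators.
    const out = { _id: stage.$group._id };
    for (const [name, acc] of Object.entries(stage.$group)) {
      if (name === "_id") continue;
      const source = acc.$addToSet.slice(1);
      out[name] = [...new Set(docs.map((doc) => doc[source]))];
    }
    return [out];
  }
  throw new Error("unsupported stage");
}

// Assumed signature: the pipeline is ONE argument. Whatever follows it is
// taken as options (ignored here), never as another stage.
function aggregate(docs, pipeline, options) {
  const stages = Array.isArray(pipeline) ? pipeline : [pipeline];
  return stages.reduce(runStage, docs);
}

const keywords = [
  { _id: 1, keyword: "sql", synonym: "database" },
  { _id: 2, keyword: "nosql", synonym: "document store" },
];
const match = { $match: { keyword: { $in: ["sql"] } } };
const unwind = { $unwind: "$synonym" };
const group = { $group: { _id: 0, kw: { $addToSet: "$synonym" } } };

// Stages as separate arguments: only $match is treated as the pipeline.
const separate = aggregate(keywords, match, unwind, group);
// Stages wrapped in one array: all three run.
const asArray = aggregate(keywords, [match, unwind, group]);

console.log(separate);
console.log(asArray);
```

The first call returns the matched document untouched, which is the symptom in the question. The second returns the grouped result, as the mongo shell does.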

Related

runCommand vs aggregate method to do aggregation

To run aggregation query it is possible to use either of these:
db.collectionName.aggregate(query1);
OR
db.runCommand(query2)
But I noticed something bizarre this morning. While this:
db.runCommand(
{
"aggregate":"collectionName",
allowDiskUse: true,
"pipeline":[
{
"$match":{
"field":param
}
}
]
});
fails with error:
{
"ok" : 0.0,
"errmsg" : "aggregation result exceeds maximum document size (16MB)",
"code" : 16389,
"codeName" : "Location16389"
}
This:
db.collectionName.aggregate([
{
$match: {
field: param
}
}
])
is working (gives the expected aggregation result).
How is this possible?
Well, the difference is of course that the .aggregate() method returns a "cursor", whereas the options you are providing to runCommand() do not request one. This was actually the legacy form, which returned the response as a single BSON document with all its limitations. Cursors, on the other hand, do not have that limitation.
Of course you can use the runCommand() method to "make your own cursor" with the shell, since after all that is exactly what the .aggregate() method is doing "under the covers". The same goes for all drivers, which essentially invoke the database command for everything.
With the shell, you can transform your request like this:
var cmdRes = db.runReadCommand({
"aggregate": "collectionName",
"allowDiskUse": true,
"pipeline":[
{
"$match":{
"field":param
}
}
],
"cursor": { "batchSize": 25 }
});
var cursor = new DBCommandCursor(db, cmdRes);
cursor.next(); // will actually iterate the cursor
If you really want to dig into it, type db.collectionName.aggregate without the parentheses () so that the shell prints the function definition. This will show you some other function calls, and you can dig further into them and eventually see what is effectively the lines shown above, amongst a lot of other stuff.
But the way you ran it, it's a "single BSON Document" response. Run it the way shown here, and you get the same "cursor" response.
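As a rough illustration of why the cursor form gets around the size limit, here is a small plain JavaScript model. Everything in it is hypothetical (singleReply, batchedCursor, and the 4000-character stand-in for the 16MB limit), and it measures JSON text rather than BSON, so treat it as a sketch of the idea only.

```javascript
// Hypothetical model: one reply may carry at most maxBytes of serialized results.
function singleReply(docs, maxBytes) {
  const size = JSON.stringify(docs).length;
  if (size > maxBytes) {
    throw new Error("aggregation result exceeds maximum document size");
  }
  return { ok: 1, result: docs };
}

// A cursor hands the same results back in batches, so no single reply
// has to hold the whole result set.
function* batchedCursor(docs, batchSize) {
  for (let i = 0; i < docs.length; i += batchSize) {
    yield docs.slice(i, i + batchSize);
  }
}

const docs = Array.from({ length: 100 }, (_, i) => ({
  _id: i,
  field: "x".repeat(50),
}));
const limit = 4000; // stand-in for the real 16MB limit

// All results in one reply: too big.
let singleReplyFailed = false;
try {
  singleReply(docs, limit);
} catch (err) {
  singleReplyFailed = true;
}

// The same results in batches of 25: every batch fits on its own.
let seen = 0;
for (const batch of batchedCursor(docs, 25)) {
  singleReply(batch, limit);
  seen += batch.length;
}

console.log(singleReplyFailed, seen);
```

The batch size of 25 mirrors the "cursor": { "batchSize": 25 } option in the command above.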

Bad Value needs an array in mongoDB

I am trying to use $in inside $match condition of aggregate. I have SegmentList as literal array and Segment as a literal value. I am trying to match with $in condition as in query below.
db.collection.aggregate([
{'$match': {
'MaterialNumber': '867522'
}},
{'$project':{
'SegmentList': {'$literal':['A']},
'Segment':{'$literal':'A'},
'value':{'$literal':'B'},
}
},
{'$match': {
'Segment':{'$in':'$SegmentList'}
}
}
])
But I am getting an error.
assert: command failed: {
"ok" : 0,
"errmsg" : "bad query: BadValue: $in needs an array",
"code" : 16810
} : aggregate failed
I am not able to understand what the problem is.
I'm not sure what you intend to achieve here, since all you are doing is checking whether 'A' is in an array whose only element is 'A', which is always true.
$match in its regular form only takes a value, not a field reference.
There is no need to use $literal; you can pass [] directly.
You can use one of the following queries, depending on your MongoDB version.
3.4 and below:
{'$match': {'Segment':{'$in':['A']}}}
3.6 (not needed for your use case):
{'$match': {'$expr':{'$in':['$Segment', ['A']]}}}
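The difference between the two forms can be sketched in plain JavaScript. The helpers queryIn, resolve and exprIn are hypothetical and heavily simplified; they only model how the operand is interpreted, not how MongoDB evaluates queries.

```javascript
// Query-language $in: the operand is taken literally, so "$SegmentList"
// is just a string and not a field reference.
function queryIn(doc, field, operand) {
  if (!Array.isArray(operand)) {
    throw new Error("$in needs an array");
  }
  return operand.includes(doc[field]);
}

// Aggregation expressions: a "$name" string is resolved as a field path.
function resolve(doc, expr) {
  return typeof expr === "string" && expr.startsWith("$")
    ? doc[expr.slice(1)]
    : expr;
}

// Expression-style $in, as used inside $expr: [needle, haystack].
function exprIn(doc, [needle, haystack]) {
  return resolve(doc, haystack).includes(resolve(doc, needle));
}

const doc = { Segment: "A", SegmentList: ["A"], value: "B" };

let queryError = null;
try {
  queryIn(doc, "Segment", "$SegmentList");
} catch (err) {
  queryError = err.message;
}

const literalMatch = queryIn(doc, "Segment", ["A"]);
const exprMatch = exprIn(doc, ["$Segment", "$SegmentList"]);

console.log(queryError, literalMatch, exprMatch);
```

Only the expression form looks the field up, which is why the field-to-field comparison needs $expr.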

Mongodb aggregation in mongo command prompt

I have the following code based upon this question
How to efficiently perform "distinct" with multiple keys?:
collection = db.products;
result = collection.aggregate(
[
{"$group": { "_id": { "P1 Connection": "$p1c", "P1 Size": "$p1s" } } },
{"$match" : {"parentGUID":ObjectId("5509b246c519ce4b900138a3")}}
]
)
printjson(result);
The printjson statement only prints a bunch of code, and not an object. I also tried result() but that got the following error:
> result()
2015-10-29T10:31:14.892-0400 TypeError: Property 'result' of object #<Object> is not a function
How do I get the results of this aggregation? It looks like it may be possible to do this if I put my code in a file and run that, but I am having a hard time believing that there is no quick and dirty way to run this query in the mongodb command prompt.
Move the $match pipeline step to the very beginning. This filters the documents that enter the pipeline, so the $group stage then runs on the correct documents. Since MongoDB 2.6, the aggregate() method returns a cursor, so you need to iterate over the cursor using the forEach() method to access the documents, as in the following example:
var pipeline = [
{"$match" : {"parentGUID":ObjectId("5509b246c519ce4b900138a3")}},
{"$group": { "_id": { "P1 Connection": "$p1c", "P1 Size": "$p1s" } } }
];
var results = db.products.aggregate( pipeline );
results.forEach(printjson);
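The reason the order matters is that the documents coming out of $group contain only _id (plus any accumulators), so a later $match on parentGUID has nothing to match. Here is a small in-memory sketch of that; the sample data and the two helper functions are made up for illustration.

```javascript
// Made-up sample documents; the field values are placeholders.
const products = [
  { parentGUID: "a", p1c: "NPT", p1s: "1/2" },
  { parentGUID: "a", p1c: "NPT", p1s: "1/2" },
  { parentGUID: "b", p1c: "BSP", p1s: "3/4" },
];

// Mimics the $group stage: output documents keep only the compound _id.
function groupByConnectionAndSize(docs) {
  const seen = new Map();
  for (const doc of docs) {
    const id = { "P1 Connection": doc.p1c, "P1 Size": doc.p1s };
    seen.set(JSON.stringify(id), { _id: id });
  }
  return [...seen.values()];
}

// Mimics the $match stage on parentGUID.
function matchParent(docs, guid) {
  return docs.filter((doc) => doc.parentGUID === guid);
}

// Original order: parentGUID is already gone when $match runs.
const groupThenMatch = matchParent(groupByConnectionAndSize(products), "a");
// Corrected order: filter first, then group.
const matchThenGroup = groupByConnectionAndSize(matchParent(products, "a"));

console.log(groupThenMatch);
console.log(matchThenGroup);
```

With the original order the result is empty; with $match first you get one distinct connection/size pair for that parent.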

How can I get all the doc ids in MongoDB?

How can I get an array of all the doc ids in MongoDB? I only need a set of ids but not the doc contents.
You can do this in the Mongo shell by calling map on the cursor like this:
var a = db.c.find({}, {_id:1}).map(function(item){ return item._id; })
The result is that a is an array of just the _id values.
The way it works in Node is similar.
(This is MongoDB Node driver v2.2, and Node v6.7.0)
db.collection('...')
.find(...)
.project( {_id: 1} )
.map(x => x._id)
.toArray();
Remember to put map before toArray, as this map is NOT the JavaScript Array map function. It is the cursor method provided by the MongoDB driver, which transforms each document as the cursor is iterated, so toArray() resolves to the mapped values.
One way is to simply use the runCommand API.
db.runCommand ( { distinct: "distinct", key: "_id" } )
which gives you something like this:
{
"values" : [
ObjectId("54cfcf93e2b8994c25077924"),
ObjectId("54d672d819f899c704b21ef4"),
ObjectId("54d6732319f899c704b21ef5"),
ObjectId("54d6732319f899c704b21ef6"),
ObjectId("54d6732319f899c704b21ef7"),
ObjectId("54d6732319f899c704b21ef8"),
ObjectId("54d6732319f899c704b21ef9")
],
"stats" : {
"n" : 7,
"nscanned" : 7,
"nscannedObjects" : 0,
"timems" : 2,
"cursor" : "DistinctCursor"
},
"ok" : 1
}
However, there's an even nicer way using the actual distinct API:
var ids = db.distinct.distinct('_id', {}, {});
which just gives you an array of ids:
[
ObjectId("54cfcf93e2b8994c25077924"),
ObjectId("54d672d819f899c704b21ef4"),
ObjectId("54d6732319f899c704b21ef5"),
ObjectId("54d6732319f899c704b21ef6"),
ObjectId("54d6732319f899c704b21ef7"),
ObjectId("54d6732319f899c704b21ef8"),
ObjectId("54d6732319f899c704b21ef9")
]
Not sure about the first version, but the latter is definitely supported in the Node.js driver (which I saw you mention you wanted to use). That would look something like this:
db.collection('c').distinct('_id', {}, {}, function (err, result) {
// result is your array of ids
})
I also was wondering how to do this with the MongoDB Node.JS driver, like @user2793120. Someone else said he should iterate through the results with .each which seemed highly inefficient to me. I used MongoDB's aggregation instead:
myCollection.aggregate([
{$match: {ANY SEARCHING CRITERIA FOLLOWING $match'S RULES} },
{$sort: {ANY SORTING CRITERIA, FOLLOWING $sort'S RULES}},
{$group: {_id:null, ids: {$addToSet: "$_id"}}}
]).exec()
The sorting phase is optional. The match one as well if you want all the collection's _ids. If you console.log the result, you'd see something like:
[ { _id: null, ids: [ '56e05a832f3caaf218b57a90', '56e05a832f3caaf218b57a91', '56e05a832f3caaf218b57a92' ] } ]
Then just use the contents of result[0].ids somewhere else.
The key part here is the $group section. You must define a value of null for _id (otherwise, the aggregation will crash), and create a new array field with all the _ids. If you don't mind having duplicated ids (according to your search criteria used in the $match phase, and assuming you are grouping a field other than _id which also has another document _id), you can use $push instead of $addToSet.
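In plain JavaScript terms, the difference between the two accumulators is roughly the difference between an array and a set. This is only an analogy with made-up values, and note that MongoDB does not guarantee the order of elements produced by $addToSet.

```javascript
// Made-up values collected from the grouped documents.
const values = ["a", "b", "a"];

// $push keeps every value, duplicates included.
const pushed = values.slice();

// $addToSet keeps each distinct value once.
const addedToSet = [...new Set(values)];

console.log(pushed);
console.log(addedToSet);
```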
Another way to do this on mongo console could be:
var arr=[]
db.c.find({},{_id:1}).forEach(function(doc){arr.push(doc._id)})
printjson(arr)
Hope that helps!!!
Thanks!!!
I struggled with this for a long time, and I'm answering this because I've got an important hint. It seemed obvious that:
db.c.find({},{_id:1});
would be the answer.
It worked, sort of. It would find the first 101 documents and then the application would pause. I didn't let it keep going. This was both in Java using MongoOperations and also on the Mongo command line.
I looked at the mongo logs and saw it's doing a colscan, on a big collection of big documents. I thought, crazy, I'm projecting the _id which is always indexed so why would it attempt a colscan?
I have no idea why it would do that, but the solution is simple:
db.c.find({},{_id:1}).hint({_id:1});
or in Java:
query.withHint("{_id:1}");
Then it was able to proceed along as normal, using stream style:
createStreamFromIterator(mongoOperations.stream(query, MortgageDocument.class)).
map(MortgageDocument::getId).forEach(transformer);
Mongo can do some good things and it can also get stuck in really confusing ways. At least that's my experience so far.
Try with an aggregation pipeline, like this:
db.collection.aggregate([
{ $match: { deletedAt: null }},
{ $group: { _id: "$_id"}}
])
This is going to return an array of documents with this structure:
_id: ObjectId("5fc98977fda32e3458c97edd")
I had a similar requirement to get the ids for a collection with 50+ million rows. I tried many ways. The fastest way to get the ids turned out to be running mongoexport with just the ids.
One of the above examples worked for me, with a minor tweak. I left out the second object, as I tried using with my Mongoose schema.
// Await the query directly; combining await with a callback can run the query twice.
const idArray = await Model.distinct('_id', {});
// idArray is your array of ids

is it possible to use "$where" in mongodb aggregation functions

I need to get the length of a string value in MongoDB using aggregation functions.
It works in
db.collection_name.find({"$where":"this.app_name.length===12"})
but when moved into
db.collection_name.aggregate({$match:
{"$where":"this.app_name.length===12"}
},
{
$group :
{
_id : 1,
app_downloads : {$sum: "$app_downloads"}
}
}
);
I got this result:
failed: exception: $where is not allowed inside of a $match aggregation expression
The question is: is it possible to use $where in aggregation functions?
or is there any way of getting the length of a string value in aggregation function?
Thanks in advance
Eric
MongoDB doesn't support $where in the aggregation pipeline, and I hope it never will, because JavaScript slows things down. Nevertheless, you still have options:
1) Maintain an additional field (e.g. app_name_len) that stores the app_name length, and query it when needed.
2) You can try the extremely slow MapReduce framework, where you are allowed to write aggregations in JavaScript.
Today I had the same problem.
MongoDB doesn't support this.app_name.length there, but you can express this condition with $regex. It is not very quick, but it still works.
{"app_name": { $regex: /^.{12}$/ }}
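The same pattern can be tried out in plain JavaScript before putting it into a query. The app names below are made up, and this only checks the pattern itself, not MongoDB's behaviour.

```javascript
// Matches strings that are exactly 12 characters long (no line breaks).
const exactlyTwelve = /^.{12}$/;

// Made-up app names with 12, 4 and 17 characters.
const names = ["photo-editor", "maps", "weather-radar-pro"];

const matches = names.filter((name) => exactlyTwelve.test(name));

console.log(matches);
```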
A simple way to achieve the behaviour the OP expects is to combine $expr with $strLenCP:
db.collection.find({
$expr: {
$eq: [
12,
{
$strLenCP: "$app_name"
}
]
}
})
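One thing to keep in mind is that $strLenCP counts Unicode code points, while the JavaScript length used in the $where version counts UTF-16 code units, so the two can disagree for characters outside the basic plane. A small plain JavaScript sketch of the difference (the strLenCP helper and the sample strings are made up for illustration):

```javascript
// Counts Unicode code points, which is what $strLenCP measures.
function strLenCP(value) {
  return [...value].length;
}

const plain = "photo-editor";
// Ten ASCII characters followed by one emoji (U+1F600).
const withEmoji = "photo-edit\u{1F600}";

const plainUnits = plain.length;          // UTF-16 code units
const plainPoints = strLenCP(plain);      // code points
const emojiUnits = withEmoji.length;      // the emoji takes two code units
const emojiPoints = strLenCP(withEmoji);  // but counts as one code point

console.log(plainUnits, plainPoints, emojiUnits, emojiPoints);
```

For plain ASCII names like the ones in the question, both counts agree.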