Mongo complex sorting? - mongodb

I know how to sort queries in MongoDB by multiple fields, e.g., db.coll.find().sort({a:1,b:-1}).
Can I sort with a user-defined function; e.g., supposing a and b are integers, by the difference between a and b (a-b)?
Thanks!

UPDATE: This answer appears to be out of date; it seems that custom sorting can be more or less achieved by using the $project stage of the aggregation pipeline to transform the input documents prior to sorting. See also @Ari's answer.
I don't think this is possible directly; the sort documentation certainly doesn't mention any way to provide a custom compare function.
You're probably best off doing the sort in the client, but if you're really determined to do it on the server you might be able to use db.eval() to run the sort there (if your client supports it). Note that db.eval() was deprecated in MongoDB 3.0 and removed in 4.2.
Server-side sort:
db.eval(function() {
  return db.scratch.find().toArray().sort(function(doc1, doc2) {
    return doc1.a - doc2.a;
  });
});
Versus the equivalent client-side sort:
db.scratch.find().toArray().sort(function(doc1, doc2) {
  return doc1.a - doc2.a;
});
Note that it's also possible to sort via an aggregation pipeline or with the $orderby operator (i.e., in addition to .sort()); however, neither of these lets you provide a custom sort function either.
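For the ordering asked about in the question (the difference a - b), a client-side comparator might look like the sketch below. It is plain JavaScript over an in-memory array; the sample documents and the name byDifference are made up for illustration, and in the shell the array would come from something like db.coll.find().toArray().

```javascript
// Hypothetical sample documents; in the shell these would come from
// db.coll.find().toArray().
const docs = [
  { _id: 1, a: 10, b: 3 },  // a - b = 7
  { _id: 2, a: 4, b: 9 },   // a - b = -5
  { _id: 3, a: 6, b: 5 }    // a - b = 1
];

// Comparator that orders documents by ascending (a - b).
function byDifference(doc1, doc2) {
  return (doc1.a - doc1.b) - (doc2.a - doc2.b);
}

// Copy before sorting so the original array is left untouched.
const sorted = docs.slice().sort(byDifference);
console.log(sorted.map(function (d) { return d._id; }));
```

Swapping the two operands in the comparator gives descending order instead.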

Ran into this and this is what I came up with:
db.collection.aggregate([
  {
    $project: {
      difference: { $subtract: ["$a", "$b"] }
      // Add other keys in here as necessary
    }
  },
  {
    $sort: { difference: -1 }
  }
])
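If the other fields should be kept without listing them one by one in $project, one possible variation is $addFields, assuming MongoDB 3.4 or later. The pipeline is built as a plain array here so it can be inspected; db.collection is the same placeholder as above.

```javascript
// Sketch only: $addFields (MongoDB 3.4+) keeps the existing fields and
// adds the computed one, so nothing has to be listed by hand.
const pipeline = [
  { $addFields: { difference: { $subtract: ["$a", "$b"] } } },
  { $sort: { difference: -1 } }  // largest difference first
];

// In the shell this would be run as: db.collection.aggregate(pipeline)
console.log(JSON.stringify(pipeline));
```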

Why not create a field that stores the result of this operation and sort on it?

Related

Find out all types used in a MongoDB collection

Let's say you want to use mongoexport/import to update a collection (for reasons explained here). You should make sure the types in the collection are JSON-safe.
How can one determine all the types used in all documents of a collection, including within array elements, using the aggregation framework?
You can use $objectToArray in combination with $map and $type.
I think something like this should get you started:
db.collection.aggregate([
  { $project: {
      types: {
        $map: {
          input: { $objectToArray: "$$CURRENT" },
          in: { $type: [ "$$this.v" ] }
        }
      }
    }
  }
])
Note that it is not recursive and does not descend into array values, since I am not sure how many levels deep you want to go or what the desired output is. Hopefully it is a good start for you.
You can see this aggregation working on sample input with various types here.
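If a recursive walk is needed, one option is to do it on the client after fetching the documents. The sketch below is plain JavaScript, not an aggregation: collectTypes and the sample document are hypothetical, and the names it reports are JavaScript-level types that only approximate BSON types (driver-specific values such as ObjectId would simply show up as "object").

```javascript
// Rough client-side sketch: walks a document recursively and records a
// JavaScript-level type name for every value, including array elements.
// These names only approximate BSON types (e.g. all numbers are "number").
function collectTypes(value, found) {
  found = found || new Set();
  if (value === null) {
    found.add("null");
  } else if (Array.isArray(value)) {
    found.add("array");
    value.forEach(function (item) { collectTypes(item, found); });
  } else if (value instanceof Date) {
    found.add("date");
  } else if (typeof value === "object") {
    found.add("object");
    Object.keys(value).forEach(function (key) {
      collectTypes(value[key], found);
    });
  } else {
    found.add(typeof value);
  }
  return found;
}

// Hypothetical sample document.
const sample = { name: "x", tags: ["a", 1, { at: new Date(0) }], extra: null };
console.log(Array.from(collectTypes(sample)).sort());
```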

Mongodb query to select less than and equal based on custom comparator

I am using a MongoDB database and need to run a less-than-or-equal filter based on a custom comparator. Details follow.
The "profile" collection has a "level" field stored as a string:
{"name":"Test1", "level":"intermediate"}
The values of level, in ascending order of weight, are:
novice
intermediate
experienced
advance
I want to write a query like the one below that returns all profile documents whose level is less than or equal to "experienced" (i.e., it includes results for "novice", "intermediate" and "experienced"):
db.profile.find( { level: { $lte: "experienced" } } )
I understand that I need to provide a custom comparator, but how can I do that?
You can't use custom comparators in a MongoDB Query. The ones available are: $eq, $gt, $gte, $lt, $lte, $ne, $in, $nin.
You can, however, use $in to get what you want:
db.profile.find( { level: { $in: [ "experienced", "intermediate", "novice" ] } } );
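To avoid writing the $in list by hand for every threshold, the list could be derived from the ordered levels. This is only a sketch: LEVELS and levelsUpTo are hypothetical names, and the order is taken from the list in the question.

```javascript
// Levels in ascending order, as listed in the question.
const LEVELS = ["novice", "intermediate", "experienced", "advance"];

// Returns every level up to and including maxLevel (hypothetical helper).
function levelsUpTo(maxLevel) {
  const index = LEVELS.indexOf(maxLevel);
  if (index === -1) {
    throw new Error("Unknown level: " + maxLevel);
  }
  return LEVELS.slice(0, index + 1);
}

// Builds the filter for "level <= experienced".
const filter = { level: { $in: levelsUpTo("experienced") } };
console.log(JSON.stringify(filter));
// In the shell: db.profile.find(filter)
```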

What is the purpose of $eq

I noticed a new $eq operator released with MongoDB 3.0 and I don't understand the purpose of it. For instance these two queries are exactly the same:
db.users.find({age:21})
and
db.users.find({age:{$eq:21}})
Does anyone know why this was necessary?
The problem was that you'd have to handle equality differently from comparison when you had some kind of query builder, so it's
{ a : { $gt : 3 } }
{ a : { $lt : 3 } }
but
{ a : 3 }
for equality, which looks completely different. The same applies for composition of $not, as JohnnyHK already pointed out. Also, comparing with $eq saves you from having to $-escape user provided strings. Therefore, people asked for alternatives that are syntactically closer and it was implemented. The Jira ticket contains a longer discussion which mentions all these points.
The clearer syntax of an $eq operator might also make sense in the aggregation framework to compare two fields, should such a feature be implemented.
Also, the feature has apparently been around since 2.5 but was added to the documentation relatively late.
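The query-builder point can be illustrated with a tiny helper. This is a sketch, not part of any driver: the function name condition is made up, and it only shows that with $eq, equality can go through the same code path as the other comparison operators.

```javascript
// Hypothetical query-builder helper: with $eq available, equality goes
// through the same code path as every other comparison operator.
function condition(field, operator, value) {
  const clause = {};
  clause[field] = {};
  clause[field][operator] = value;
  return clause;
}

console.log(JSON.stringify(condition("a", "$gt", 3)));
console.log(JSON.stringify(condition("a", "$eq", 3)));
```

Without $eq, the builder would need a special case that emits { a : 3 } for equality.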
One specific application I can see for $eq is with cases like the $not operator which requires that its value is an operator-expression.
This allows you to construct a query like:
db.zips.find({state: {$not: {$eq: 'NY'}}})
Before, the closest you could get to this semantically was:
db.zips.find({state: {$not: {$regex: /^NY$/}}})
I realize there are other ways to represent the functionality of that query, but if you need to use the $not operator there for other reasons, this would now allow it.
In the $filter part of an aggregation query, if you need to check whether a field equals a value, you cannot use the shorthand field: value syntax; you need $eq:
db.sales.aggregate([
  {
    $project: {
      items: {
        $filter: {
          input: "$items",
          as: "item",
          cond: { $eq: [ "$$item.price", 100 ] }
        }
      }
    }
  }
])

sorting documents in mongodb

Let's say I have four documents in my collection:
{u'a': {u'time': 3}}
{u'a': {u'time': 5}}
{u'b': {u'time': 4}}
{u'b': {u'time': 2}}
Is it possible to sort them by the field 'time' which is common in both 'a' and 'b' documents?
Thank you
No, you should put your data into a common format so you can sort it on a common field. It can still be nested if you want but it would need to have the same path.
You can use aggregation; the following code has been tested.
db.test.aggregate([
  {
    $project: {
      time: {
        "$cond": [
          { "$gt": ["$a.time", null] },
          "$a.time",
          "$b.time"
        ]
      }
    }
  },
  {
    $sort: { time: -1 }
  }
]);
Or if you also want the original fields returned back: gist
Alternatively, you can sort once you get the result back, using a customized compare function (not tested, for illustration purposes only):
db.eval(function() {
  return db.mycollection.find().toArray().sort(function(doc1, doc2) {
    var time1 = doc1.a ? doc1.a.time : doc1.b.time,
        time2 = doc2.a ? doc2.a.time : doc2.b.time;
    return time1 - time2;
  });
});
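The same comparator can also run entirely on the client, without db.eval(). The sketch below uses an in-memory array shaped like the documents in the question; the helper name timeOf is made up, and in practice the array would come from a find().toArray() call.

```javascript
// Hypothetical sample documents mirroring the question.
const docs = [
  { a: { time: 3 } },
  { a: { time: 5 } },
  { b: { time: 4 } },
  { b: { time: 2 } }
];

// Picks the time from whichever sub-document is present.
function timeOf(doc) {
  return doc.a ? doc.a.time : doc.b.time;
}

// Ascending by time; swap the operands for descending order.
const sorted = docs.slice().sort(function (doc1, doc2) {
  return timeOf(doc1) - timeOf(doc2);
});
console.log(sorted.map(timeOf));
```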
You can, using the aggregation framework.
The trick here is to $project a common field to all the documents so that the $sort stage can use the value in that field to sort the documents.
The $ifNull operator can be used to check whether a.time exists: if it does, the record is sorted by that value; otherwise, by b.time.
code:
db.t.aggregate([
  {$project: {"a": 1, "b": 1,
              "sortBy": {$ifNull: ["$a.time", "$b.time"]}}},
  {$sort: {"sortBy": -1}},
  {$project: {"a": 1, "b": 1}}
])
Consequences of this approach:
The sort in this pipeline cannot use any index you create, since it runs on a computed field.
The performance will be very poor for very large data sets.
What you could ideally do is to ask the source system that is sending you the data to standardize its format, something like:
{"a":1,"time":5}
{"b":1,"time":4}
That way your query can make use of the index if you create one on the time field.
db.t.ensureIndex({"time":-1});
code:
db.t.find({}).sort({"time":-1});

sort by string length in Mongodb/pymongo

I was wondering if anyone knows how to sort a mongodb find() result by string length.
I have tried something like db.foo.find().sort({item.lenght:-1}), but obviously it doesn't work. Can somebody help me and also suggest a way to do the same thing in pymongo?
There are a lot of things (and basic API) I would personally love to see in the aggregation framework, such as:
Math functions: log (as in logarithm), ceil, floor
Array: sum
String: length
Just to name a few.
And that is without resorting to obscure usages of the $mod operator or other means in such cases as "ceil" and "floor". But I digress.
Your "string length" falls into this category. Raise a JIRA issue about it. But for now you can use mapReduce and the existing JavaScript functionality:
db.collection.mapReduce(
  function() {
    emit( this.item.length, this.item );
  },
  function(key,values) {
    return values;
  },
  { "out": { "inline": 1 } }
)
So while this does have the funky "mapReduce" style of returning a reshaped document, with everything of the same length grouped in an array, it takes advantage of the nature of "mapReduce" (not restricted to MongoDB): the emitted "key" values come back sorted in the response.
There is now a solution for this in MongoDB v3.4+ via the aggregation framework, using $strLenBytes. Given the following document:
{_id: 0, name: "Bob"}
We can use
db.mycollection.aggregate([{
  $project: {
    byteLength: {$strLenBytes: "$name"}
  }
}])
Which will return 3 for the number of bytes.
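To actually sort by that length, a $sort stage can follow the computed field. The sketch below is one possible way to write it, assuming MongoDB 3.4+; the use of $addFields and the $ifNull fallback are additions for illustration, and db.mycollection is the same placeholder as above.

```javascript
// Sketch only: compute the length, then sort on it (MongoDB 3.4+).
// $ifNull substitutes "" because $strLenBytes rejects null or missing input.
const pipeline = [
  { $addFields: { byteLength: { $strLenBytes: { $ifNull: ["$name", ""] } } } },
  { $sort: { byteLength: -1 } }  // longest strings first
];

// In the shell: db.mycollection.aggregate(pipeline)
console.log(JSON.stringify(pipeline));
```

For character counts rather than byte counts, $strLenCP is the counterpart operator.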
No, it is actually not possible.
I was dealing with a similar problem; what I did was store the string length of every object as a property of the object itself. This bypassed the problem.
If you think this should be implemented (I do), I recommend you upvote the issue in JIRA, which for some reason does not have many votes:
https://jira.mongodb.org/browse/SERVER-5319