How does MongoDB support sorting over multiple keys?

Sorting on Key1 first and then Key2 isn't the same as sorting on Key2 first and then Key1. MongoDB just receives a sort order object like {"key1":-1, "key2":1}.
How does it guarantee it does what the programmer wants?
There are bindings to the MongoDB driver for many programming languages, and many of those languages have some kind of hashmap that is unlikely to preserve key order. If one uses such a hashmap to talk to the MongoDB driver, how is the sort key precedence, which depends on the order in which keys were inserted into the hashmap, guaranteed?

As far as I know, there's no way to force MongoDB's sort to use a specific comparator expression. Sorting happens as follows:
When comparing values of different BSON types, MongoDB uses the following comparison order, from lowest to highest:
MinKey (internal type)
Null
Numbers (ints, longs, doubles)
Symbol, String
Object
Array
BinData
ObjectId
Boolean
Date
Timestamp
Regular Expression
MaxKey (internal type)
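To make the ordering above concrete, here is a rough sketch in plain JavaScript. It only models the cross-type order from the list; it does not compare values of the same type, it ignores MongoDB's special handling of arrays when sorting, and the names bsonTypeRank and compareAcrossTypes are made up for illustration, not part of any driver.

```javascript
// Rank of each value's type, following the position in the list above.
// Only types that have a plain-JavaScript counterpart are covered.
function bsonTypeRank(v) {
  if (v === null) return 2;               // Null
  if (typeof v === "number") return 3;    // Numbers
  if (typeof v === "string") return 4;    // Symbol, String
  if (Array.isArray(v)) return 6;         // Array
  if (typeof v === "boolean") return 9;   // Boolean
  if (v instanceof Date) return 10;       // Date
  if (v instanceof RegExp) return 12;     // Regular Expression
  if (typeof v === "object") return 5;    // Object
  throw new TypeError("type not covered by this sketch");
}

// Comparator that looks at the type only, never at the value.
function compareAcrossTypes(a, b) {
  return bsonTypeRank(a) - bsonTypeRank(b);
}

const mixed = [new Date(0), true, "abc", 42, null, [1], { a: 1 }];
const sortedMixed = mixed.slice().sort(compareAcrossTypes);

const describe = v =>
  v === null ? "null" :
  Array.isArray(v) ? "array" :
  v instanceof Date ? "date" : typeof v;

const typeNames = sortedMixed.map(describe);
console.log(typeNames);
// [ 'null', 'number', 'string', 'object', 'array', 'boolean', 'date' ]
```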

Related

Order of results for `sort` using mongoose

If I have two equal values for a field, what would be the order of results for a sort on that field? Random, or ordered by insertion date?
If two documents have equal values for the field you're sorting on, MongoDB returns them in the order they are found on disk (i.e. natural order).
From the MongoDB documentation:
natural order: The order in which the database refers to documents on disk. This is the default sort order. See $natural and Return in Natural Order.
This may coincide with insertion order in some cases, but not all of the time (especially when you perform insertions and deletions on your collection), so you should treat the ordering of equal values as arbitrary.
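If you need a stable order for equal values, a common approach is to add a second sort key that is unique, such as _id. The snippet below is a plain-JavaScript stand-in for that idea rather than a real query; the documents and the field name score are invented for the example.

```javascript
// Sort on one field and fall back to _id when the field values are equal.
function byScoreThenId(a, b) {
  if (a.score !== b.score) return a.score - b.score; // primary key, ascending
  return a._id - b._id;                              // tie-breaker, ascending
}

const docs = [
  { _id: 3, score: 10 },
  { _id: 1, score: 10 },
  { _id: 2, score: 5 },
];

// Two different starting ("on-disk") orders give the same result
// once ties are broken by a unique key.
const fromOrderA = docs.slice().sort(byScoreThenId).map(d => d._id);
const fromOrderB = docs.slice().reverse().sort(byScoreThenId).map(d => d._id);

console.log(fromOrderA); // [ 2, 1, 3 ]
console.log(fromOrderB); // [ 2, 1, 3 ]
```

In a query this corresponds to a sort specification along the lines of { score: 1, _id: 1 }.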

In Algolia, how do you construct records to allow for alphabetical sorting of query results?

As far as I know, you can only sort on numeric fields in Algolia, so how do you efficiently set up your records to allow for results to be returned alphabetically based on a specific string field?
For example, let's say in each record in an index you have a field called "title" that contains an arbitrary string value. How would you create a sibling field called "title_sort" that contains a number that allows for the the results to be sorted such that the records come out in alphabetical order by "title"? Is there a particularly well-accepted algorithm for creating such a number from the string in "title"?
If you have a static dataset, you can just sort your data and store each record's position in the sorted order as a numeric field. This works as long as you re-sort the data every time you update your indices.
I'm also thinking that if you can deal with a partial sorting, meaning that you can accept orc < orb but you still need or < os, then you could derive a number from the first few characters of the string, treating them as digits in base64. You can then sort to as many characters as you have precision for. It's only a partial sorting, but it might be acceptable for your use case. You just need to define your base64 -> base10 mapping so that it respects alphabetical order.
Additionally, if you don't care about the difference between capital and lowercase letters, then you can do base26 -> base10. The more I think about this, the more limited it is, but it might work for your use case.
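Here is a minimal sketch of the case-insensitive variant. It is an assumption-laden illustration, not an Algolia API: titleSortKey and its precision parameter are hypothetical names, and it uses base 27 rather than base 26 so that one digit (0) is left over for "no letter here", which makes a short string sort before any longer string that starts with it.

```javascript
// Turn the first `precision` characters of a title into a number whose
// numeric order matches case-insensitive alphabetical order of that prefix.
// Letters a-z map to digits 1-26; anything else (or a missing character) is 0.
function titleSortKey(title, precision = 8) {
  const s = title.toLowerCase();
  let key = 0;
  for (let i = 0; i < precision; i++) {
    const code = i < s.length ? s.charCodeAt(i) - 96 : 0;
    const digit = code >= 1 && code <= 26 ? code : 0;
    key = key * 27 + digit; // shift previous digits left by one base-27 place
  }
  return key;
}

console.log(titleSortKey("or") < titleSortKey("orb"));   // true
console.log(titleSortKey("orb") < titleSortKey("orc"));  // true
console.log(titleSortKey("orc") < titleSortKey("os"));   // true

// Only the first 8 characters count, so these two titles tie.
console.log(titleSortKey("alphabetical") === titleSortKey("alphabetize")); // true
```

With base 27 the key stays within JavaScript's safe integer range for a precision of up to 11 characters; beyond that the partial-sorting caveat above applies even more strongly.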

Mongodb can't find object with too long _id

I have a slightly strange situation.
I persist objects in the collection "refs", explicitly setting _id.
So I have objects with very big ids.
db.refs.find().sort({_id: -1});
// {_id: 9200000000165761625}
// ...
But when I try to find the object with the biggest id in the mongo shell, it returns nothing:
db.refs.find({_id: 9200000000165761625}); // nothing
But!
db.refs.find({_id: 9200000000165761625}).count(); // return 1
How could this happen?
I could not reproduce your problem. I was able to successfully query on the _id value you specified.
Ensure that when you are querying you are passing the correct collection name.
JavaScript currently only has a single numeric type Number, which represents all values as 64-bit floating point values. The maximum safe integer representation in JavaScript's native Number type is 2^53 - 1 or 9007199254740991 (as returned by the constant Number.MAX_SAFE_INTEGER).
Any integer values beyond the safe range cannot be represented distinctly, so two or more mathematical values will map to the same JavaScript Number.
You can see this effect in the mongo shell with values adjacent to your provided _id (which is larger than the safe integer size):
> 9200000000165761624
9200000000165762000
> 9200000000165761625
9200000000165762000
> 9200000000165761626
9200000000165762000
However, these driver/client limitations are distinct from the underlying data types used in MongoDB's BSON format for documents. BSON has a 64-bit integer type which represents the full range of values: up to 2^63 - 1 for 64-bit integers.
Your example _id is within the 64-bit integer range, so you should be able to insert or update it using a driver with support for 64-bit integers, but you would not be able to safely query or manipulate such long values as plain numbers in the mongo shell or other JavaScript environments. To avoid unexpected outcomes you may want to use a different data type for these long _id values.
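The collapse of adjacent values can also be checked directly in any JavaScript runtime. The snippet below is only a sketch of the effect described above; it assumes a runtime with BigInt support, and the NumberLong line is shown as a comment because it only exists in the mongo shell.

```javascript
// Adjacent integers beyond Number.MAX_SAFE_INTEGER map to the same Number.
const a = 9200000000165761624;
const b = 9200000000165761625;
console.log(a === b);                 // true
console.log(Number.isSafeInteger(b)); // false

// Keeping the digits in a string (or a BigInt built from one) preserves them.
const exactId = BigInt("9200000000165761625");
console.log(exactId.toString());      // 9200000000165761625

// In the mongo shell, the analogous wrapper is NumberLong with a string argument:
//   db.refs.find({ _id: NumberLong("9200000000165761625") })
```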

Distinguish array from single value in a document

I have two types of documents in a MongoDB collection:
one where the key sessions has a simple value:
{"sessions": NumberLong("10000000000001")}
one where the key sessions has an array of values:
{"sessions": [NumberLong("10000000000001")]}
Is there any way to retrieve all documents from the second category, i.e. only documents whose value is an array and not a simple value?
You can use this kind of query for that:
db.collectionName.find( { $where : "Array.isArray(this.sessions)" } );
but you'd better convert all the records to one type to keep things consistent.
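Since the $where string is ordinary JavaScript, the predicate can be tried outside the database first. This is just a sketch with invented sample documents; plain numbers stand in for NumberLong so that it runs in any JavaScript runtime.

```javascript
// Stand-in documents mirroring the two shapes from the question,
// plus an empty array to show that it also counts as an array.
const sampleDocs = [
  { _id: 1, sessions: 10000000000001 },
  { _id: 2, sessions: [10000000000001] },
  { _id: 3, sessions: [] },
];

// Same test as the $where string, with `this` replaced by a parameter.
const hasArraySessions = doc => Array.isArray(doc.sessions);

const matchedIds = sampleDocs.filter(hasArraySessions).map(doc => doc._id);
console.log(matchedIds); // [ 2, 3 ]
```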
The query can be as simple as this:
db.c.find({sessions:{$gte:[]}});
Explanation:
Because you only want to retrieve documents whose sessions value is an array, you can rely on a property of $gte: if the two operands have different data types, it returns false (Double, Integer32 and Integer64 are considered the same data type). Giving an empty array as the other operand therefore retrieves exactly the required documents.
Also, the $gt, $lt and $lte query operators behave the same way (note that the operators with the same names in aggregation pipeline expressions behave differently). I verified this in practice on MongoDB V2.4.8 and V2.6.4.

MongoDB sort order on timestamp / ISODate fields

If there is a MongoDB collection that contains documents with field foo with both integer timestamps and ISODate objects, what will the resulting order of a sorted query be?
Will one of the objects come before the other, or will they be compared and interleaved?
The reason I ask is that the two kinds of values are compared by magnitude in JavaScript (see below), but I'm wondering what will happen in MongoDB's underlying implementation.
> new Date(400) <= 401
true
> new Date(401) <= 400
false
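MongoDB does type checking and conversion for certain comparisons but not for all of them. I would suggest looking further in the documentation at http://docs.mongodb.org/manual/reference/method/cursor.sort/#behaviors to see how sort orders values of different types. In the documented cross-type comparison order, numbers come before dates, so in an ascending sort the integer timestamps should all come before the ISODate values rather than being interleaved with them.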
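The difference between the two behaviours can be sketched in plain JavaScript. This is an illustration only, assuming the documented cross-type order in which numbers rank below dates; the sample values and the helper names are made up, and it is not how the server is implemented.

```javascript
// Hypothetical values for field foo: integer timestamps mixed with Date objects.
const values = [new Date(400), 401, new Date(100), 50];

// JavaScript-style comparison coerces Dates to numbers, so the values interleave.
const coerced = values.slice().sort((x, y) => Number(x) - Number(y));

// Type-bracketed comparison: every number sorts before every Date,
// and magnitudes are compared only within the same type.
const typeRank = v => (v instanceof Date ? 1 : 0);
const bracketed = values.slice().sort(
  (x, y) => typeRank(x) - typeRank(y) || Number(x) - Number(y)
);

const show = v => (v instanceof Date ? `Date(${v.getTime()})` : String(v));
console.log(coerced.map(show));   // [ '50', 'Date(100)', 'Date(400)', '401' ]
console.log(bracketed.map(show)); // [ '50', '401', 'Date(100)', 'Date(400)' ]
```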