MongoDB Compass: can't filter a specific ID - mongodb

I have a MongoDB collection built like this:
When I try to filter by ID, I get inconsistent results. For example, when I try to get the first entry by typing this filter query:
{_id:209383449381830657}
I get no result.
But if I type, for example, the third one, it works correctly.
{_id:191312485326913536}
I checked whether it was due to the integer being too large, but no: all _id values are Int64, and the same code generated all the entries.
I really don't know why I get this result, and that's why I am asking here.
EDIT:
All entries have the same type.
No limit is set in the query.
If I type {_id:{"$gte":209383449381830657}} it finds the entry, but not if I type {_id:{"$eq":209383449381830657}}.

MongoDB Compass uses MongoDB's Node.js driver.
Both 209383449381830657 and 191312485326913536 exceed JavaScript's maximum safe integer of 2^53-1.
JavaScript does not handle numbers larger than that in a consistent manner.
Note that in your documents those numbers are reported as $numberLong, indicating that they are not using JavaScript's default floating-point numeric representation.
To query these numbers consistently, use the NumberLong constructor, like:
{_id:NumberLong("209383449381830657")}
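For illustration, here is how the precision loss shows up in plain JavaScript (Node.js or the mongo shell); the two _ids are taken from the question, and the rounding behaviour matches what the question describes:
// Both _ids exceed Number.MAX_SAFE_INTEGER (2^53-1 = 9007199254740991):
Number.isSafeInteger(209383449381830657)   // false
Number.isSafeInteger(191312485326913536)   // false
// The first literal is silently rounded to the nearest representable double,
// so the value the driver sends no longer equals the stored Int64:
209383449381830657 === 209383449381830656  // true
// The second happens to be exactly representable as a double (it is a
// multiple of 32, the gap between doubles at this magnitude), which would
// explain why filtering on it still works.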

It depends on the version of MongoDB Compass you have (I assume it's more or less the latest).
As I can see, you are using NumberLong for the _id field.
You should use NumberLong("...") to handle such fields in MongoDB.
So it's better to search like this:
{_id: NumberLong("209383449381830657")}
I can't say for sure how MongoDB Compass handles the number you are trying to pass here.
It should automatically cast values to int32 or int64, but that can be a bit tricky, so it's better to specify the type yourself to avoid unexpected results.
Some time ago I read an article saying that it tries to cast to int32, but if the value is more than int32 can handle, it uses this function
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/isSafeInteger
and tries to cast it to int64. My speculation is that the problem lives here, but I can't say for sure.
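A rough sketch of the casting order being speculated about here (hypothetical pseudo-logic, not Compass's actual source; NumberInt and NumberLong are the shell/driver wrapper types):
// Hypothetical illustration of the speculated casting order.
function castNumeric(n) {
    if (n >= -2147483648 && n <= 2147483647) {
        return NumberInt(n);   // value fits in int32
    }
    if (Number.isSafeInteger(n)) {
        return NumberLong(n);  // still exact as a double, safe to widen
    }
    return n;                  // beyond 2^53-1: precision already lost
}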

In my case the _id contains hex digits.
The filter {_id: ObjectId("62470af3036b9f058552032e")} works.

Related

Convert a string to a number in Shell Mongodb query

I have a MongoDB collection in which a decimal amount is stored as a string.
I would like to be able to query treating that field as a number.
Is that possible without elaborate code? I would expect to be able to do something like tonumber(field): {$gt: 100},
but I can't find anything.
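One approach that would fit here, assuming MongoDB 4.0+ where the $expr query operator and the $toDouble aggregation operator are available (a sketch; "amount" is a stand-in for the field name):
// Compare the string field numerically by converting it per document.
// Note this expression cannot use an index on "amount".
db.collection.find({
    $expr: {$gt: [{$toDouble: "$amount"}, 100]}
})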

MongoDB can't find object with too long _id

I have a somewhat strange situation.
I persist objects in the collection "refs", explicitly setting _id.
So I have objects with very big ids.
db.refs.find().sort({_id: -1});
// {_id: 9200000000165761625}
// ...
But when I try to find the object with the biggest id in the mongo shell, it returns nothing:
db.refs.find({_id: 9200000000165761625}); // nothing
But!
db.refs.find({_id: 9200000000165761625}).count(); // returns 1
How could this happen?
I could not reproduce your problem; I was able to successfully query on the _id value you specified.
Ensure that you are passing the correct collection name when querying.
JavaScript currently has only a single numeric type, Number, which represents all values as 64-bit floating-point values. The maximum safe integer representation in JavaScript's native Number type is 2^53-1, or 9007199254740991 (as returned by the constant Number.MAX_SAFE_INTEGER).
Any integer values beyond the safe range cannot be represented distinctly, so two or more mathematical values will map to the same JavaScript Number.
You can see this effect in the mongo shell with values adjacent to your provided _id (which is larger than the safe integer size):
> 9200000000165761624
9200000000165762000
> 9200000000165761625
9200000000165762000
> 9200000000165761626
9200000000165762000
However, these driver/client limitations are distinct from the underlying data types used in MongoDB's BSON format for documents. BSON has a 64-bit integer type which represents the full range of values: up to 2^63-1 for signed 64-bit integers.
Your example _id is within the 64-bit integer range, so you should be able to insert or update this document using a driver with support for 64-bit integers, but you would not be able to safely query or manipulate long values in the mongo shell or other JavaScript environments. To avoid unexpected outcomes you may want to use a different data type for these long _id values.
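As with the Compass question above, wrapping the value in NumberLong should preserve all 64 bits and make the equality match work (a sketch in the mongo shell, using the collection from the question):
// NumberLong takes the value as a string so it is never parsed as a double:
db.refs.find({_id: NumberLong("9200000000165761625")})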

mongodb index strategy for range query with different fields

Almost all my documents include two fields, a start timestamp and a final timestamp, and in each query I need to retrieve elements that fall within a selected period of time: start should be after the selected value and final should be before the selected timestamp.
The query looks like:
db.collection.find({start:{$gt:DateTime(...)}, final:{$lt:DateTime(...)}})
So what is the best indexing strategy for that scenario?
By the way, which is better for performance: storing dates as datetimes, or as unix timestamps (which is a long value itself)?
To add a little more to baloo's answer.
On the timestamp vs. long issue: generally the MongoDB server will not see a difference, since the BSON encoding length is the same (64 bits). You may see a performance difference on the client side depending on the driver's encoding. As an example, on the Java side, using the 10gen driver, a timestamp is rendered as a Date, which is a lot heavier than a Long. There are drivers that try to avoid that overhead.
The other issue is that you will see a performance improvement if you close the range for the first field of the index. So if you use the index suggested by baloo:
db.collection.ensureIndex({start: 1, final: 1})
The query will perform (potentially much) better if it is:
db.collection.find({start: {$gt: DateTime(...), $lt: DateTime(...)},
                    final: {$lt: DateTime(...)}})
Conceptually, if you think of the index as a tree, the closed range limits both sides of the tree instead of just one side. Without the closed range, the server has to "check" all of the entries with a start greater than the provided timestamp, since it does not know the relation between start and final.
You may even find that the query performance is no better using a single-field index like:
db.collection.ensureIndex({start: 1})
Most of the savings comes from pruning on the first field. The exception is when the query is covered by the index, or when the ordering/sort for the results can be derived from the index.
You can use a compound index to index multiple fields:
db.collection.ensureIndex({start: 1, final: 1})
Compare different queries and indexes using explain() to get the most out of your database.
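For example, in the mongo shell (the date literals here are placeholders):
// explain() reports which index was used and how many entries were scanned:
db.collection.find({
    start: {$gt: ISODate("2012-01-01"), $lt: ISODate("2012-02-01")},
    final: {$lt: ISODate("2012-02-01")}
}).explain()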

MongoDB datatype handling

Is there a way to handle numbers that are stored as strings like integers in MongoDB?
e.g.
{
    "someKey": "45646764646"
}
I would like to perform $gte or $lte operations on that value, but that's not possible as long as the value for "someKey" is a string. I would also like to avoid using a DBCursor and making the comparisons in Java.
If your field someKey contains numbers, consider saving it as NumberLong(). Your driver should allow you to represent that field in your application as a string; most of the 10gen drivers support this.
I would recommend, if possible, converting the strings to ints when you store them in the database. This could prevent confusion in the future, and would save a lot of space due to the storage differences.
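A sketch of a one-off migration in the mongo shell, assuming every someKey value fits in a 64-bit integer:
// $type 2 is the BSON string type; convert each matching value in place.
db.collection.find({someKey: {$type: 2}}).forEach(function(doc) {
    db.collection.update(
        {_id: doc._id},
        {$set: {someKey: NumberLong(doc.someKey)}}
    );
});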

MongoDB - most efficient way of getting the last version of a document

I'm using MongoDB to hold a collection of documents.
Each document has an _id (version) which is an ObjectId. Each document has a documentId which is shared across the different versions. This too is an ObjectId, assigned when the first document was created.
What's the most efficient way of finding the most up-to-date version of a document given the documentId?
I.e. I want to get the record where _id = max(_id) and documentId = x
Do I need to use MapReduce?
Thanks in advance,
Sam
Add an index containing both fields, (documentId, _id), and don't use max (what for?). Query with documentId = x, order descending by _id, and limit(1) the results to get the latest. Remember the proper sort order for the index (also descending).
Something like this:
db.collection.find({documentId : "x"}).sort({_id : -1}).limit(1)
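The matching index, with _id descending as described above, might look like:
db.collection.ensureIndex({documentId: 1, _id: -1})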
Another, more denormalized, approach would be to use another collection with documents like:
{
    documentId: "x",
    latestVersionId: ...
}
Using atomic operations would allow you to safely update this collection, and adding a proper index would make queries lightning fast.
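A sketch of what that could look like in the mongo shell; the collection name "latestVersions" and the newVersionId variable are hypothetical:
// Atomically upsert the pointer document for this documentId.
db.latestVersions.update(
    {documentId: "x"},
    {$set: {latestVersionId: newVersionId}},
    true  // upsert: create the pointer if it doesn't exist yet
);
// A unique index keeps lookups fast and guarantees one pointer per document.
db.latestVersions.ensureIndex({documentId: 1}, {unique: true});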
There is one thing to take into account - i'm not sure whether ObjectID can always be safely used to order by for latest version. Using timestamp may be more certain approach.
I was typing the same as Daimon's first answer, using sort and limit. Because of the way the _id is generated, though, this is probably not recommended, especially with some drivers (which use random numbers instead of increments for the least significant portion). The ObjectId has second [as opposed to something smaller, like millisecond] resolution as its most significant portion, but the last portion can be a random number. So if a user saved twice in a second (probably not likely, but worth noting), you might end up with a slightly out-of-order latest document.
See http://www.mongodb.org/display/DOCS/Object+IDs#ObjectIDs-BSONObjectIDSpecification for more details on the structure of the ObjectID.
I would recommend adding an explicit versionNum field to your documents, so you can query in a similar fashion using that field, like so:
db.coll.find({documentId: <id>}).sort({versionNum: -1}).limit(1);
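A sketch of assigning versionNum when saving a new version; docId and newBody are hypothetical placeholders, and the read-then-insert is not atomic, so the unique index helps guard against races:
// Look up the current highest version for this document, then insert the next.
var latest = db.coll.find({documentId: docId})
                    .sort({versionNum: -1}).limit(1).toArray();
var nextVersion = latest.length ? latest[0].versionNum + 1 : 1;
db.coll.insert({documentId: docId, versionNum: nextVersion, body: newBody});
// Unique compound index: also serves the "latest version" query above.
db.coll.ensureIndex({documentId: 1, versionNum: -1}, {unique: true});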
Edit, to answer a question in the comments:
You can store a regular DateTime directly in MongoDB, but MongoDB's "DateTime" format will only store millisecond precision. If that's good enough, it's simpler to do:
BsonDocument doc = new BsonDocument("dt", DateTime.UtcNow);
coll.Insert(doc);
doc = coll.FindOne();
// see, it doesn't have tick precision...
Console.WriteLine(doc.GetValue("dt").AsUniversalTime.Ticks);
If you want .NET DateTime (ticks)/Timestamp precision, you can do a bunch of casts to get it to work, like:
BsonDocument doc = new BsonDocument("dt", new BsonTimestamp(DateTime.UtcNow.Ticks));
coll.Insert(doc);
doc = coll.FindOne();
// see, it does have tick precision
Console.WriteLine(new DateTime(doc.GetValue("dt").AsBsonTimestamp.Value).Ticks);
Update again!
It looks like the real use for BsonTimestamp is to generate unique timestamps within one-second resolution. So you're not really supposed to abuse them as I have in the last few lines of code, and doing so will probably screw up the ordering of results. If you need to store the DateTime at tick (100-nanosecond) resolution, you should probably just store the 64-bit int "ticks", which will be sortable in MongoDB, and then wrap it in a DateTime after you pull it out of the database again, like so:
BsonDocument doc = new BsonDocument("dt", DateTime.UtcNow.Ticks);
coll.Insert(doc);
doc = coll.FindOne();
DateTime dt = new DateTime(doc.GetValue("dt").AsInt64);
// see, it does have tick precision
Console.WriteLine(dt.Ticks);