I am trying to retrieve one element from a mongo collection, the one with the greatest _id field. I know this can be done by querying:
db.collection.find().sort({_id: -1}).limit(1)
But it seems somewhat inelegant, and I was wondering whether there is a way to get that specific element using findOne().
Note: I want to do this because, from what I've read about ObjectId, the first bytes correspond to the seconds since the Epoch and thus the last element inserted will have the greatest _id. Is there any other way to retrieve the last element inserted in a collection?
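As a rough illustration of the ObjectId layout referred to above: the leading 4 bytes of an ObjectId hold a Unix timestamp in seconds, so the creation time can be decoded from the 24-character hex form with the standard library alone. The helper name objectid_timestamp is hypothetical and the sample ids are borrowed from output shown elsewhere on this page; this is a sketch, not driver code.

```python
from datetime import datetime, timezone

def objectid_timestamp(hex_id: str) -> datetime:
    """Decode the creation time embedded in a 24-character ObjectId hex string."""
    if len(hex_id) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    # The first 4 bytes (8 hex characters) are a big-endian count of
    # seconds since the Unix epoch.
    seconds = int(hex_id[:8], 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

older = objectid_timestamp("54cfcf93e2b8994c25077924")
newer = objectid_timestamp("54d672d819f899c704b21ef4")
print(older.isoformat(), newer.isoformat())
```

Since the embedded timestamp has one-second resolution, ids generated within the same second are ordered by their remaining bytes, so "greatest _id" should be read as a close approximation of "last inserted" rather than a strict guarantee.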
You should use find, like you already are, and not aggregation which will be slower since it needs to scan all the values of _id fields to figure out the max.
As comments pointed out there is no difference between using find() and findOne() - functionally or elegance-wise. In fact, findOne in the shell (and in the drivers which implement it) is defined in terms of find (with limit -1 and with pretty print in the shell).
If you really want to do the equivalent of
db.collection.find().sort({_id:-1}).limit(1).pretty()
as findOne you can do it with this syntax:
db.collection.findOne({$query:{},$orderby:{_id:-1}})
You can get the max _id using MongoDB's aggregation framework. Find and sort may be overkill.
db.myCollection.aggregate({
$group: {
_id: '',
last: {
$max: "$_id"
}
}
});
With the PHP driver (mongodb), using findOne():
$filter = [];
$options = ['sort' => ['_id' => -1]]; // -1 is for DESC
$result = $collection->findOne($filter, $options);
$maxId = $result['_id'];
import pymongo
tonystark = pymongo.MongoClient("mongodb://localhost:27017/")
mydb = tonystark["tonystark_db"]
savings = mydb["customers"]
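# Sort by _id descending and keep only the first document, i.e. the greatest _id
x = savings.find().sort("_id", pymongo.DESCENDING).limit(1)
for s in x:
    print(s)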
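$maxId = 0;
$cursor = $collection->find();
foreach ($cursor as $document) {
    // Keep the largest numeric id seen so far (this scans the whole collection
    // and assumes non-negative numeric ids)
    $maxId = max($maxId, $document['id']);
}
print_r($maxId + 1);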
Is there a way to push all the documents of a given collection into an array?
I did this, but is there a quicker way?
var ops = [];
db.getCollection('stock').find({}).forEach(function (stock) {
ops.push(stock);
})
PS: I use Mongo 3.4
You can just use the toArray function on the cursor that's returned from find, like this:
var ops = db.getCollection('stock').find({}).toArray();
Note: As with your original solution, this might suffer with performance if the stock collection contains millions of documents.
As an aside, you can use db.stock directly to shorten the query a little bit:
var ops = db.stock.find({}).toArray();
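Try using the lean query option. Note that lean() is a Mongoose query method, so this only works through Mongoose, not on a plain mongo shell cursor. In your case: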
db.getCollection('stock').find({}).lean()
You could also use $facet, which will allow you to create the array on the server side, provided the resulting document is no bigger than 16MB; if it is bigger, you'll get an exception:
db.stock.aggregate({
$facet: {
ops: [ { $match: {} } ]
}
})
In order to reduce the amount of data returned you could limit the number of returned fields in the above pipeline (instead of an empty $match stage - which is a hack anyway - you would then use $project).
I wish to return just the document ids from mongo that match a find() query.
I know I can pass an object to exclude or include in the result set, however I cannot find a way to just return the _id field.
My thought process is returning just this bit of information is going to be way more efficient (my use case requires no other document data just the ObjectId).
An example query that I expected to work was:
collection.find({}, { _id: 1 }).toArray(function(err, docs) {
...
});
However this returns the entire document and not just the _id field.
You just need to use a projection to find what you want.
collection.find({filter criteria here}, {foo: 0, bar: 0, _id: 1});
Since I don't know what your documents look like, this is all I can do for you. foo: 0, for example, excludes that property.
I found that using the cursor object directly I can specify the required projection. The mongodb package on npm when calling toArray() is returning the entire document regardless of the projection specified in the initial find(). Fixed working example below that satisfies my requirements of just getting the _id field.
Example document:
{
_id: new ObjectId(...),
test1: "hello",
test2: "world!"
}
Working Projection
var cursor = collection.find({});
cursor.project({
test1: 0,
test2: 0
});
cursor.toArray(function(err, docs) {
// Importantly the docs objects here only
// have the field _id
});
Because _id is by definition unique, you can use distinct to get an array of the _id values of all documents as:
collection.distinct('_id', function(err, ids) {
...
});
You can do it like this:
collection.find({},'_id').toArray(function(err, docs) {
...
});
I have a mongoDB document that has the following structure:
{
user:user_name,
streams:[
{user:user_a, name:name_a},
{user:user_b, name:name_b},
{user:user_c, name:name_c}
]
}
I want to use $pullAll to remove from the streams array, passing it an array of streams (the size of the array varies from 1 to N):
var streamsA = [{user:"user_a", name:"name_a"},{user:"user_b", name:"name_b"}]
var streamsB = [{name:"name_a", user:"user_a"},{name:"name_b", user:"user_b"}]
I use the following mongoDB command to perform the update operation:
db.streams.update({user:"user_name"}, {"$pullAll":{streams:streamsA}})
db.streams.update({user:"user_name"}, {"$pullAll":{streams:streamsB}})
Removing streamsA succeeds, whereas removing streamsB fails. After digging through the mongoDB manuals, I saw that the order of fields in streamsA and streamsB records has to match the order of fields in the database. For streamsB the order does not match, that's why it was not removed.
I can reorder the streams to the database document order prior to performing an update operation, but is there an easier and cleaner way to do this? Is there some flag that can be set to update and/or pullAll to ignore the order?
Thank You,
Gary
The $pullAll operator is really a "special case" that was mostly intended for single "scalar" array elements and not for sub-documents in the way you are using it.
Instead use $pull which will inspect each element and use an $or condition for the document lists:
db.streams.update(
{ "user": "user_name" },
{ "$pull": { "streams": { "$or": streamsB } }}
)
That way it does not matter which order the fields are in, and no "exact match" of the whole sub-document is required, which is what the current $pullAll operation is actually doing.
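The difference between the two matching styles can be modelled in a few lines of plain Python. This is only an in-memory sketch of the semantics described above, not MongoDB or driver code; the function names exact_match, field_match, pull_all and pull_or are hypothetical.

```python
def exact_match(element: dict, candidate: dict) -> bool:
    """Model of whole-document equality: same fields, same values, same order."""
    return list(element.items()) == list(candidate.items())

def field_match(element: dict, condition: dict) -> bool:
    """Model of a query condition: every listed field must match, in any order."""
    return all(k in element and element[k] == v for k, v in condition.items())

def pull_all(array, candidates):
    """Drop elements that are exactly equal to one of the candidates."""
    return [e for e in array if not any(exact_match(e, c) for c in candidates)]

def pull_or(array, conditions):
    """Drop elements that satisfy at least one of the conditions."""
    return [e for e in array if not any(field_match(e, c) for c in conditions)]

streams = [
    {"user": "user_a", "name": "name_a"},
    {"user": "user_b", "name": "name_b"},
    {"user": "user_c", "name": "name_c"},
]
streams_b = [
    {"name": "name_a", "user": "user_a"},
    {"name": "name_b", "user": "user_b"},
]

print(pull_all(streams, streams_b))  # nothing is removed: the field order differs
print(pull_or(streams, streams_b))   # only the user_c element remains
```

The order-sensitive comparison is spelled out with list(element.items()) because Python dict equality ignores key order, whereas the exact match modelled here does not.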
How can I get an array of all the doc ids in MongoDB? I only need a set of ids but not the doc contents.
You can do this in the Mongo shell by calling map on the cursor like this:
var a = db.c.find({}, {_id:1}).map(function(item){ return item._id; })
The result is that a is an array of just the _id values.
The way it works in Node is similar.
(This is MongoDB Node driver v2.2, and Node v6.7.0)
db.collection('...')
.find(...)
.project( {_id: 1} )
.map(x => x._id)
.toArray();
Remember to put map before toArray, as this map is NOT the JavaScript array map function; it is the one provided by the MongoDB driver's cursor, and it is applied to each document as the cursor is iterated.
One way is to simply use the runCommand API (in this example the collection itself happens to be named distinct).
db.runCommand ( { distinct: "distinct", key: "_id" } )
which gives you something like this:
{
"values" : [
ObjectId("54cfcf93e2b8994c25077924"),
ObjectId("54d672d819f899c704b21ef4"),
ObjectId("54d6732319f899c704b21ef5"),
ObjectId("54d6732319f899c704b21ef6"),
ObjectId("54d6732319f899c704b21ef7"),
ObjectId("54d6732319f899c704b21ef8"),
ObjectId("54d6732319f899c704b21ef9")
],
"stats" : {
"n" : 7,
"nscanned" : 7,
"nscannedObjects" : 0,
"timems" : 2,
"cursor" : "DistinctCursor"
},
"ok" : 1
}
However, there's an even nicer way using the actual distinct API:
var ids = db.distinct.distinct('_id', {}, {});
which just gives you an array of ids:
[
ObjectId("54cfcf93e2b8994c25077924"),
ObjectId("54d672d819f899c704b21ef4"),
ObjectId("54d6732319f899c704b21ef5"),
ObjectId("54d6732319f899c704b21ef6"),
ObjectId("54d6732319f899c704b21ef7"),
ObjectId("54d6732319f899c704b21ef8"),
ObjectId("54d6732319f899c704b21ef9")
]
Not sure about the first version, but the latter is definitely supported in the Node.js driver (which I saw you mention you wanted to use). That would look something like this:
db.collection('c').distinct('_id', {}, {}, function (err, result) {
// result is your array of ids
})
I also was wondering how to do this with the MongoDB Node.JS driver, like #user2793120. Someone else said he should iterate through the results with .each which seemed highly inefficient to me. I used MongoDB's aggregation instead:
myCollection.aggregate([
{$match: {ANY SEARCHING CRITERIA FOLLOWING $match'S RULES} },
{$sort: {ANY SORTING CRITERIA, FOLLOWING $sort'S RULES}},
{$group: {_id:null, ids: {$addToSet: "$_id"}}}
]).exec()
The sorting phase is optional. The match one as well if you want all the collection's _ids. If you console.log the result, you'd see something like:
[ { _id: null, ids: [ '56e05a832f3caaf218b57a90', '56e05a832f3caaf218b57a91', '56e05a832f3caaf218b57a92' ] } ]
Then just use the contents of result[0].ids somewhere else.
The key part here is the $group section. You must define a value of null for _id (otherwise, the aggregation will crash), and create a new array field with all the _ids. If you don't mind having duplicated ids (according to your search criteria used in the $match phase, and assuming you are grouping a field other than _id which also has another document _id), you can use $push instead of $addToSet.
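To make the $addToSet versus $push distinction concrete, here is a small in-memory model in plain Python. It only imitates the accumulator behaviour described above; group_push and group_add_to_set are hypothetical names, and the sample documents are invented for illustration.

```python
def group_push(docs, field):
    """Model of $push: collect every value, duplicates included."""
    return [doc[field] for doc in docs]

def group_add_to_set(docs, field):
    """Model of $addToSet: collect each distinct value once.

    MongoDB does not guarantee the order of the resulting array; this model
    happens to keep first-seen order.
    """
    seen = []
    for doc in docs:
        if doc[field] not in seen:
            seen.append(doc[field])
    return seen

# Invented sample data: "owner" repeats, "_id" is unique.
docs = [
    {"_id": 1, "owner": "a"},
    {"_id": 2, "owner": "b"},
    {"_id": 3, "owner": "a"},
]

print(group_push(docs, "owner"))
print(group_add_to_set(docs, "owner"))
# For a unique field such as _id both accumulators collect the same values.
print(group_push(docs, "_id") == group_add_to_set(docs, "_id"))
```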
Another way to do this on mongo console could be:
var arr=[]
db.c.find({},{_id:1}).forEach(function(doc){arr.push(doc._id)})
printjson(arr)
Hope that helps!!!
Thanks!!!
I struggled with this for a long time, and I'm answering this because I've got an important hint. It seemed obvious that:
db.c.find({},{_id:1});
would be the answer.
It worked, sort of. It would find the first 101 documents and then the application would pause. I didn't let it keep going. This was both in Java using MongoOperations and also on the Mongo command line.
I looked at the mongo logs and saw it's doing a colscan, on a big collection of big documents. I thought, crazy, I'm projecting the _id which is always indexed so why would it attempt a colscan?
I have no idea why it would do that, but the solution is simple:
db.c.find({},{_id:1}).hint({_id:1});
or in Java:
query.withHint("{_id:1}");
Then it was able to proceed along as normal, using stream style:
createStreamFromIterator(mongoOperations.stream(query, MortgageDocument.class)).
map(MortgageDocument::getId).forEach(transformer);
Mongo can do some good things and it can also get stuck in really confusing ways. At least that's my experience so far.
Try with an aggregation pipeline, like this:
db.collection.aggregate([
{ $match: { deletedAt: null }},
{ $group: { _id: "$_id"}}
])
This will return an array of documents with this structure:
_id: ObjectId("5fc98977fda32e3458c97edd")
I had a similar requirement to get the ids for a collection with 50+ million rows. I tried many ways. The fastest way to get the ids turned out to be running mongoexport with just the ids.
One of the above examples worked for me, with a minor tweak. I left out the second object, as I tried using with my Mongoose schema.
const idArray = await Model.distinct('_id', {}, function (err, result) {
// result is your array of ids
return result;
});
I would like to speed up a query on my MongoDB which uses $where to compare two fields in the document, which seems to be really slow.
My query looks like this:
db.mycollection.find({ $where : "this.lastCheckDate < this.modificationDate" })
What I would like to do is add a field to my document, i.e. isCheckDateLowerThenModDate, on which I could execute a probably much faster query:
db.mycollection.find({"isCheckDateLowerThenModDate":true})
I am quite new to MongoDB and have no idea how to do this. I would appreciate it if someone could give me some hints or examples on
How to initialize such a field on an existing collection
How to maintain this field. Which means how to update this field when lastCheckDate or modificationDate changes.
Thanks in advance for your help!
You are thinking in the right way!
1. How to initialize such a field on an existing collection.
The simplest way is to load each document (from your language), calculate this field, update and save.
Or you could perform an update via mongo shell:
db.mycollection.find().forEach(function(doc) {
    if (doc.lastCheckDate < doc.modificationDate) {
        doc.isCheckDateLowerThenModDate = true;
    } else {
        doc.isCheckDateLowerThenModDate = false;
    }
    db.mycollection.save(doc);
});
2. How to maintain this field. Which means how to update this field when lastCheckDate or modificationDate changes.
You have to do it yourself from your client code. Make some wrapper for update, save operations and recalculate this value each time there. To be absolutely sure that this update works -- write unit tests.
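One possible shape for such a wrapper, sketched in plain Python with a dict standing in for the collection. The names with_flag, update_document and store are hypothetical, and a real implementation would call the driver's update method instead of assigning into a dict.

```python
from datetime import datetime

FLAG = "isCheckDateLowerThenModDate"

def with_flag(doc: dict) -> dict:
    """Return a copy of doc with the derived flag recomputed from the two dates."""
    updated = dict(doc)
    updated[FLAG] = updated["lastCheckDate"] < updated["modificationDate"]
    return updated

def update_document(store: dict, doc_id, changes: dict) -> dict:
    """Hypothetical wrapper: apply the changes, then refresh the flag before saving."""
    store[doc_id] = with_flag({**store[doc_id], **changes})
    return store[doc_id]

# In-memory stand-in for the collection, keyed by document id.
store = {
    1: with_flag({
        "lastCheckDate": datetime(2012, 1, 1),
        "modificationDate": datetime(2012, 2, 1),
    })
}

# Every write goes through the wrapper, so the flag cannot drift out of date.
update_document(store, 1, {"lastCheckDate": datetime(2012, 3, 1)})
print(store[1][FLAG])
```

Keeping the recomputation in a single function also gives the unit tests mentioned above one obvious place to target.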
The $where clause is slow because it is evaluating each document using the JavaScript interpreter.
There are a few alternatives:
1) Assuming your use case is "look for records that need updating", take advantage of a sparse index:
add a boolean field like needsChecking and $set this whenever the modificationDate is updated
in your "check" procedure, find the documents that have this field set (should be fast due to the sparse index)
db.mycollection.find({'needsChecking':true});
after you've done whatever check is needed, $unset the needsChecking field.
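The set / find / unset cycle from the steps above can be modelled in memory as follows. This is a plain-Python sketch of the idea, not MongoDB code: mark_modified, mark_checked and sparse_index are hypothetical names, and the integer timestamps are placeholders.

```python
def mark_modified(doc: dict, when) -> None:
    """Model of $set: record the new modificationDate and flag the document."""
    doc["modificationDate"] = when
    doc["needsChecking"] = True

def mark_checked(doc: dict, when) -> None:
    """Model of $unset: record the check and remove the flag entirely."""
    doc["lastCheckDate"] = when
    doc.pop("needsChecking", None)

def sparse_index(docs):
    """Model of a sparse index: only documents containing the field get an entry."""
    return [doc["_id"] for doc in docs if "needsChecking" in doc]

docs = [{"_id": i} for i in range(5)]
mark_modified(docs[1], 10)
mark_modified(docs[3], 11)
print(sparse_index(docs))  # ids of the two flagged documents

mark_checked(docs[1], 12)
print(sparse_index(docs))  # the checked document has dropped out
```

Because the flag is removed rather than set to false, documents that need no work carry no entry at all, which is what keeps the sparse index small.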
2) A new (and faster) feature in MongoDB 2.2 is the Aggregation Framework.
Here is an example of adding a "isUpdated" field based on the date comparison, and then filtering the matching documents:
db.mycollection.aggregate(
{ $project: {
_id: 1,
name: 1,
type: 1,
modificationDate: 1,
lastCheckDate: 1,
isUpdated: { $gt:["$modificationDate","$lastCheckDate"] }
}},
{ $match : {
isUpdated : true,
}}
)
Some current caveats of using the Aggregation Framework are:
you have to specify fields to include aside from _id
the result is limited to the current maximum BSON document size (16MB in MongoDB 2.2)