Get next item in MongoDB collection without ObjectId - mongodb

I have access to a MongoDB collection, but each entry's _id is a string (a URL, to be exact).
I want to retrieve the next document in a collection based on the previous _id. I looked around and I've seen it's possible using the ObjectID: Finding The Next Document in MongoDb.
My problem is that the database doesn't use ObjectIDs for _id, and there is no other field that could serve as an alternative to ObjectID (e.g. a timestamp). So how would I retrieve the next document?
Edit: Added collection example
{ "_id": "random.com",
"name": "Random"
},
{ "_id": "example.com",
"name": "Example"
},
{ "_id": "stack.com",
"name": "Stack"
}
If I have the _id "random.com", how do I retrieve the next document, in this case the one with _id "example.com"? I'm using pymongo.

cursor = db.coll.find({"_id": {"$gt": "the_url"}}).sort("_id").limit(1)
for doc in cursor:
    print(doc["_id"])
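See sort for how to define the order.
The _id index is ordered alphabetically, so this returns the next _id value in string sort order.
To follow storage order instead, which generally reflects insertion order but is not guaranteed to, sort by "$natural". Note that the $gt filter still applies, so this returns the earliest-stored document among those with a greater _id.
cursor = db.coll.find({"_id": {"$gt": "the_url"}}).sort("$natural").limit(1)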
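For pymongo, the first query above can be wrapped in a small helper. This is a minimal sketch: `next_id_query` and `find_next_by_id` are hypothetical names, and `coll` is assumed to be a pymongo `Collection` (or any object exposing a compatible `find_one`). Note that the result is the next _id in string sort order, so for the sample data the successor of "random.com" is "stack.com", not "example.com".

```python
def next_id_query(current_id):
    """Build the filter and sort spec for 'first _id greater than current_id'."""
    query_filter = {"_id": {"$gt": current_id}}
    sort_spec = [("_id", 1)]  # 1 means ascending (same value as pymongo.ASCENDING)
    return query_filter, sort_spec


def find_next_by_id(coll, current_id):
    """Return the document with the smallest _id above current_id, or None."""
    query_filter, sort_spec = next_id_query(current_id)
    # find_one forwards sort= to find(), so this behaves like
    # find(...).sort(...).limit(1) and returns a single document or None.
    return coll.find_one(query_filter, sort=sort_spec)
```

With a live connection this might be called as `find_next_by_id(db.coll, "random.com")`.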

Related

Get count of a value of a subdocument inside an array with mongoose

I have a collection of documents with id and contact. Contact is an array which contains subdocuments.
I am trying to get the count of contacts where isActive = Y. I also need to query the collection based on the id. The entire query would be something like
Select Count(contact.isActive=Y) where _id = '601ad0227b25254647823713'
I am using mongo and mongoose for the first time. Please edit the question if I was not able to explain it properly.
You can use an aggregation pipeline like this:
First, $match to get only the documents with the desired _id.
Then $unwind to get one document per element of the contact array.
$match again to keep only the elements whose isActive value is Y.
Finally, $group, adding one for each remaining document (i.e. counting the contacts with isActive = Y). The count is stored in the field total.
db.collection.aggregate([
  {
    "$match": {"id": 1}
  },
  {
    "$unwind": "$contact"
  },
  {
    "$match": {"contact.isActive": "Y"}
  },
  {
    "$group": {
      "_id": "$id",
      "total": {"$sum": 1}
    }
  }
])
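Since the counting logic is easy to get wrong, here is a hedged Python sketch of the same pipeline as a plain list of stage dictionaries, plus a pure-Python reference count for checking the result on a single document. The names `count_active_pipeline` and `count_active_locally` are hypothetical, and the sketch assumes the documents are matched on `_id` (as in the question) rather than on the `id` field used above.

```python
def count_active_pipeline(doc_id, flag="Y"):
    """Return aggregation stages counting contacts whose isActive equals flag."""
    return [
        {"$match": {"_id": doc_id}},             # keep only the wanted document
        {"$unwind": "$contact"},                 # one output document per contact
        {"$match": {"contact.isActive": flag}},  # keep matching contacts only
        {"$group": {"_id": "$_id", "total": {"$sum": 1}}},  # count what is left
    ]


def count_active_locally(doc, flag="Y"):
    """Pure-Python equivalent for one already-fetched document."""
    return sum(1 for c in doc.get("contact", []) if c.get("isActive") == flag)


sample = {
    "_id": 1,
    "contact": [{"isActive": "Y"}, {"isActive": "N"}, {"isActive": "Y"}],
}
print(count_active_locally(sample))  # 2
```

With pymongo, the stages would be passed as `coll.aggregate(count_active_pipeline(some_id))`. If `_id` is stored as an ObjectId, `some_id` has to be an ObjectId too, not its string form. When no contact matches, the pipeline returns no documents at all rather than a total of 0.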

Mongo queries to search all the collections of a database (Mongo/PyMongo)

I am stuck on how to query a database in which every document has the following common structure:
{
"_id": {
"$oid": "5e0983863bcf0dab51f2872b"
},
"word": "never", // get the `word` value for each of below queries
"wordset_id": "a42b50e85e",
"meanings": [{
"id": "1f1bca9d9f",
"def": "not ever",
"speech_part": "adverb",
"synonyms": ["ne'er"]
}, {
"id": "d35f973ed0",
"def": "not at all",
"speech_part": "adverb"
}]
}
1) A query to get every word with speech_part: "adverb" (e.g. never, ...)
2) A query to get every word with a word length of 6 and speech_part: "adverb"
I have learnt from SO that to search across all collections I first have to retrieve all the collections in the database, but how to write the query is where I am stuck.
db.collection.find({"meanings.speech_part":"adverb"},{"_id":0, "word":1})
The query above returns the word of every document that has a specific speech_part.
The first part of the query is the filter predicate, in your scenario matching speech_part. If the field you are matching were not inside another object, or an object inside an array, you could just write {column_name: "something"}.
As speech_part is inside an object that is inside an array, you have to write {"parentColumn.key": "something"}, in your case {"meanings.speech_part": "adverb"}.
The second part of the query is the projection, where you define which fields you want in the result. To get only the word values you use {word: 1}; to get more fields you use {word: 1, etc: 1}. MongoDB projects _id by default, so to remove _id from the result you have to explicitly set {_id: 0}.
db.collection.find({
"meanings.speech_part":"adverb",
"$expr": { "$gt": [ { "$strLenCP": "$word" }, 6 ] }
},{"_id":0, "word":1})
This returns the word of every document with a specific speech_part and a length greater than 6. This query is a bit more complex; you can look up the $expr documentation. Inside $expr you can apply an operator to a field and match on the result. Here $strLenCP calculates the length of the word value, and the $gt comparison operator checks whether it is greater than 6. For a length of exactly 6, use $eq in place of $gt.
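A pymongo version of the two queries above could look like the following sketch. `word_query` is a hypothetical helper; it only builds the filter and the projection, which would then be passed to `find`. The `exact` flag is an assumption about the original request, since the question asks for a word length of 6 while the query above matches lengths greater than 6.

```python
def word_query(speech_part, length=None, exact=True):
    """Build (filter, projection) for words with a given speech_part.

    If length is given, add a $expr clause on the word's length in code points.
    """
    query_filter = {"meanings.speech_part": speech_part}
    if length is not None:
        op = "$eq" if exact else "$gt"
        query_filter["$expr"] = {op: [{"$strLenCP": "$word"}, length]}
    projection = {"_id": 0, "word": 1}  # return only the word field
    return query_filter, projection
```

Usage might look like `for doc in db.collection.find(*word_query("adverb", 6)): print(doc["word"])`, where `db.collection` stands for whichever collection is being searched.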
You can try the query below to get the matching documents. The same filter and projection can then be used from pymongo.
db.getCollection('test-collection').find(
{
'meanings.speech_part': 'adverb'
},
{
_id: 0,
word: 1
}
);
Read about the projections in mongodb here:
https://docs.mongodb.com/manual/tutorial/project-fields-from-query-results

How to make a cloudant query to find documents which two fields are equal

I need to get all documents whose "_id" field, for example, is equal to another field of the document, e.g. "appId":
{
"_id": "xxxx-xxxx-xxxx-xxxx",
"_rev": "xxxx-xxxx-xxxx-xxxx",
"header": {
"appId": "xxxx-xxxx-xxxx-xxxx"
So what would be the query?
"selector": {
"_id": {
"$eq": header.appId
}
},
You can't do "sub queries" with Mango.
From what I see, you're trying to get all the documents listed by appId.
This could be done by using a view.
Your map function would be the following:
function (doc) {
  if (doc.header && doc.header.appId) {
    emit(doc.header.appId, { _id: doc.header.appId });
  }
}
The result would be a list of documents mapped by doc.header.appId.
If you query the view with ?include_docs=true, the documents referenced by the emitted _id values are joined to the response, since this amounts to a many-to-one join.
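If the goal is literally to keep the documents whose _id equals header.appId, and the candidate documents have already been fetched (for example through the view with include_docs=true), the comparison can also be done client-side. The following Python sketch assumes the documents are plain dictionaries; `ids_match_app_id` is a hypothetical helper, not part of any Cloudant client library.

```python
def ids_match_app_id(docs):
    """Return the documents whose _id equals their header.appId."""
    return [
        doc for doc in docs
        if "_id" in doc and doc.get("header", {}).get("appId") == doc["_id"]
    ]


docs = [
    {"_id": "a-1", "header": {"appId": "a-1"}},
    {"_id": "b-2", "header": {"appId": "a-1"}},
    {"_id": "c-3"},  # no header at all, so it never matches
]
print([d["_id"] for d in ids_match_app_id(docs)])  # ['a-1']
```

This only makes sense for result sets small enough to transfer; for large databases, moving the same equality test into the map function would avoid fetching non-matching documents.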

Collection.findOne({_id: "stringidsdfdsfdsfds"}) returns undefined

I'm not new to Meteor, but I've been away for a couple of weeks working on other projects.
I'm now working on a Meteor project using React.
When I do this Collection.find({}).fetch() it returns this:
[
{
"_id": { "_str": "59d3b91d80f4f5eeb0162634" },
"title": "My first Post",
"content": "This is the body of the pst"
}
]
The only strange thing is the _id field.
But, when I do Collection.findOne({_id: "59d3b91d80f4f5eeb0162634" }), it returns undefined.
How can I do a .findOne() using the _id string as query parameter?
What you're seeing as the value of _id is not a plain JSON object but a representation of Mongo's ObjectID type, which is why your .findOne() with a string fails to find it.
You should search it like this:
const _id = new Meteor.Collection.ObjectID('59d3b91d80f4f5eeb0162634');
Collection.findOne({ _id }); // same as { _id: _id }
By default, Meteor uses the STRING method of _id generation, so it seems that this particular document was inserted into the collection some other way.

MongoDB Compound Index to Optimize Update with Key and Range Condition

I have read this doc, which states that an index can optimize update operations. So I am adding an index to my collection to optimize the update operation I am using.
Records in the collection have an object as _id, and a timestamp:
{_id: {userId: "sample"}, firstTimestamp: 123, otherField: "abc"}
What I want to do is run an update using the query below:
db.userFirstTimestamp.update(
{_id: {userId: "sample"}, firstTimestamp: {$gt: 100}},
{_id: {userId: "sample"}, firstTimestamp: 100, otherField2: "efg"})
I want to store the 'first document' based on 'firstTimestamp'. The fields of the old and new documents can differ, hence it cannot be a $set query; it should rewrite the document instead. In the sample above, "otherField" should no longer exist; "otherField2" should exist instead.
Based on my understanding of the MongoDB doc and this article, I created the index below:
db.sample.createIndex({_id:1, timestamp:1})
Then I benchmarked the query on an isolated experimental node with the spec below:
MongoDB 3.0.4
Machine is otherwise idle, running only mongo
RAM ~30 GB
Disk is RAID 0 striped
Collection has 60 million records
Average object size 1001 bytes
Index size 5.34 GB
When I check the log, many update queries take more than 100 ms, and in mongotop the top entry is the write, which takes ~1000 ms. That seems slow for a single query.
In mongostat, throughput is only 400-500 queries per second.
Then I ran explain on the equivalent find query (since update does not support explain).
Without a projection, it uses the default {_id:1} index.
With a projection on _id and timestamp only, it uses the {_id:1, timestamp:1} index.
My questions are:
Does the index I created help this update query?
If it does not help, what should the index look like?
Is there any other way to optimize this update query?
Somewhat. But not optimally.
It should really be this, an index on the "element" of the object in the _id key, together with the firstTimestamp field that the query filters on:
db.sample.createIndex({ "_id.userId": 1, "firstTimestamp": 1 })
Use the $set operator and stop overwriting your documents:
db.sample.update(
{
"_id.userId": "sample",
"firstTimestamp": { "$gt": 100 }
},
{
"$set": { "otherfield": "cfg" }
}
)
But really your data "should" look like this:
{
"_id": "sample",
"firstTimestamp": 200,
"otherfield2": "sam"
}
And update like:
db.sample.update(
  {
    "_id": "sample",
    "firstTimestamp": { "$gt": 100 }
  },
  {
    "$set": {
      "firstTimestamp": 100,
      "otherfield2": "efg"
    }
  }
)
Or if you insist that fields other than "_id" and "firstTimestamp" are going to change a lot, then rather do this:
{
"_id": "sample",
"firstTimestamp": 200,
"data": {
"otherfield2": "sam"
}
}
Then if you just want to replace data, do:
db.sample.update(
  {
    "_id": "sample",
    "firstTimestamp": { "$gt": 100 }
  },
  {
    "$set": {
      "firstTimestamp": 100,
      "data": {
        "overwritingField": "efg"
      }
    }
  }
)
"data" can be replaced as an entire object if you wish, as above, or you can update just a single key:
db.sample.update(
  {
    "_id": "sample",
    "firstTimestamp": { "$gt": 100 }
  },
  {
    "$set": {
      "firstTimestamp": 100,
      "data.newfield": "efg"
    }
  }
)
In all cases, try to use the operators rather than replacing the whole object as it typically works out as more traffic and more load to the server.
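As a rough pymongo-oriented sketch of the updates above, the filter and the `$set` document can be built by a small helper and passed to `update_one`. The helper name `first_timestamp_update` is hypothetical, and the flat `_id` layout is an assumption taken from the restructured document shown earlier, not from the original collection.

```python
def first_timestamp_update(user_id, new_ts, data_fields):
    """Build (filter, update) that lowers firstTimestamp and sets data.* keys."""
    query_filter = {
        "_id": user_id,                     # exact match on the primary key
        "firstTimestamp": {"$gt": new_ts},  # only if the stored value is later
    }
    set_doc = {"firstTimestamp": new_ts}
    for key, value in data_fields.items():
        set_doc["data." + key] = value      # dotted path updates a single key
    return query_filter, {"$set": set_doc}
```

A call might look like `flt, upd = first_timestamp_update("sample", 100, {"newfield": "efg"})` followed by `db.sample.update_one(flt, upd)`.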
But overall, what makes sense here is that the "userId" part "should" be the portion of the index that narrows down the results the most. So it definitely goes before the timestamp, of which there should be a lot more possible values.
Compound primary keys are fine, but make sure you actually use them. A singular value would not make any sense and could just be assigned to _id. If you can query on just one field of the key, as you are here, then you probably don't need a compound object as the primary key.
Your _id in the update suggests that you are getting exact matches for the _id, therefore it is not a compound field with other keys. That being the case, it should just be a value in the _id itself.
Also, a "range" is okay, but again consider that you are trying to match a single document (well, you don't mention "multi" anywhere), so again question why it is needed, and then either go for an exact match or at "least" an upper limit.
The $set will "only" update the fields that you specify. I think you made a mistake in typing your question though, as the syntax for the "update" portion would not be valid. But use update operators anyway, as they send less traffic by sending a single field, or just the fields you intend to update.
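To make the difference concrete, here is a small in-memory Python emulation of the two update styles. It only illustrates the semantics on plain dictionaries (top-level and dotted keys) and is not how the server implements them; `apply_replace` and `apply_set` are hypothetical names.

```python
import copy


def apply_replace(doc, replacement):
    """Emulate a replacement update: everything except _id is overwritten."""
    new_doc = copy.deepcopy(replacement)
    new_doc["_id"] = doc["_id"]
    return new_doc


def apply_set(doc, fields):
    """Emulate $set: only the listed (possibly dotted) paths are changed."""
    new_doc = copy.deepcopy(doc)
    for path, value in fields.items():
        target = new_doc
        *parents, leaf = path.split(".")
        for key in parents:
            target = target.setdefault(key, {})  # create missing sub-documents
        target[leaf] = value
    return new_doc


doc = {"_id": "sample", "firstTimestamp": 200, "otherField": "abc"}
print(apply_replace(doc, {"firstTimestamp": 100, "otherField2": "efg"}))
# {'firstTimestamp': 100, 'otherField2': 'efg', '_id': 'sample'}
print(apply_set(doc, {"firstTimestamp": 100, "otherField2": "efg"}))
# {'_id': 'sample', 'firstTimestamp': 100, 'otherField': 'abc', 'otherField2': 'efg'}
```

The replacement drops "otherField", which is the behaviour the question asks for, while `$set` leaves it in place. Removing a field without replacing the whole document would need `$unset` alongside `$set`.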