Is it required to specify Sort for skip limit in Mongodb - mongodb

I am using Skip and Limit with the MongoDB C# driver to fetch tickets in batches, like below:
var data = db.collectionName.Find({}).Skip(1000).Limit(500).ToList()
The data is fetched as expected. I need confirmation on whether Sort() is mandatory when using Skip and Limit, as in the query below, or whether MongoDB applies a sort order on its own if none is specified:
var data = db.collectionName.Find({}).Sort("{_id:1}").Skip(1000).Limit(500).ToList()
I have removed Sort from the query to reduce the time taken by the fetch operation.

No, Sort() is not mandatory for Skip() and Limit(). The query runs without it, as in your example:
var data = db.collectionName.Find({}).Skip(1000).Limit(500).ToList()
Without Sort(), however, MongoDB does not guarantee any particular order, so batches fetched with Skip and Limit are not guaranteed to be consistent from one query to the next. For more on the default sort order, see the link below:
https://stackoverflow.com/questions/11599069/how-does-mongodb-sort-records-when-no-sort-order-is-specified
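As an illustration of batch fetching with a stable order, here is a minimal sketch in Python using pymongo-style calls (find, sort, limit) rather than the C# driver. It swaps Skip for a range query on _id, which is a different technique from the one in the question. The function name iter_batches is hypothetical, and collection is assumed to be a pymongo Collection.

```python
def iter_batches(collection, batch_size=500):
    """Yield lists of documents in ascending _id order, batch_size at a time."""
    last_id = None
    while True:
        # Resume strictly after the last _id seen instead of using skip().
        query = {} if last_id is None else {"_id": {"$gt": last_id}}
        batch = list(collection.find(query).sort("_id", 1).limit(batch_size))
        if not batch:
            return
        yield batch
        last_id = batch[-1]["_id"]
```

Because every batch is sorted on the unique _id field, the batches cannot overlap, and the sort can be served by the default _id index.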

Related

Spring boot mongo template remove with limit query

I am trying to delete documents from a collection whose _id is less than 10, but I want to remove them in sets of 3. I tried using limit, but remove ignores it and still deletes all the matching documents.
Query query = new Query();
query.addCriteria(Criteria.where("_id").lt(id)).limit(3);
mongoTemplate.remove(query,TestCollection.class);
When I run mongoTemplate.find(query,TestCollection.class); the limit works fine and returns 3 elements at a time, but with remove it doesn't work.
Is there any other way to do this delete in a single query?
To achieve this, do it in two passes:
1. Find the 3 ids to delete, as you are doing currently.
2. Do a collection.remove with Criteria.where("_id").in(id1, id2, id3).
I would also add a sort criterion before applying the limit. Otherwise, which documents get deleted may depend on the index used.
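The two passes can be sketched as follows. This is a rough illustration in Python with pymongo-style calls (find, sort, limit, delete_many), not the Spring MongoTemplate API from the question. The function name delete_batch is hypothetical, and collection is assumed to be a pymongo Collection.

```python
def delete_batch(collection, max_id, batch_size=3):
    """Delete up to batch_size documents with _id < max_id; return the number deleted."""
    # Pass 1: pick the ids to delete, in a deterministic (sorted) order.
    cursor = (
        collection.find({"_id": {"$lt": max_id}}, {"_id": 1})
        .sort("_id", 1)
        .limit(batch_size)
    )
    ids = [doc["_id"] for doc in cursor]
    if not ids:
        return 0
    # Pass 2: delete exactly those ids.
    result = collection.delete_many({"_id": {"$in": ids}})
    return result.deleted_count
```

Calling delete_batch repeatedly until it returns 0 removes the matching documents three at a time.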

Aggregate do not return records in order as they are inserted

I am facing an issue with an aggregate query.
When I retrieve records using aggregate ($match), I do not receive them in the same order in which they were inserted.
But when I query using find, I get the data in the same order in which it was inserted.
In MongoDB, the default internal sort order is an unspecified implementation detail.
If you require a certain order, you should use the $sort stage. Otherwise no sorting is done, since it would be overhead for the storage engine.
Without an explicit sort, results are returned by the storage engine in their "natural order", or in layman's terms the order in which they were found. However, we don't really know what that order is and we shouldn't rely on it.
So the simple option is to add a $sort stage, on _id for example. If for some reason you don't have a field you can sort on, you can use $natural, which returns documents in natural order. That usually, but not necessarily, matches insertion order.
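As a small sketch of the $sort suggestion, the Python helper below builds a pipeline with an explicit $sort after $match. The helper name match_then_sort and the example filter on a "status" field are hypothetical. With pymongo the resulting list would be passed to collection.aggregate.

```python
def match_then_sort(match_filter, sort_field="_id", ascending=True):
    """Build an aggregation pipeline: $match followed by an explicit $sort."""
    return [
        {"$match": match_filter},
        {"$sort": {sort_field: 1 if ascending else -1}},
    ]

# Hypothetical filter, for illustration only.
pipeline = match_then_sort({"status": "open"})
```

With default ObjectId values, sorting on _id gives roughly creation order, since an ObjectId begins with a timestamp. It is not an exact record of insertion order.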

How can I perform a bulkWrite in mongodb using rust mongodb driver?

I am implementing a process in rust where I read a large number of documents from a mongodb collection, perform some calculations on the values of each document and then have to update the documents in mongodb.
In my initial implementation, after the calculations are performed, I go through each of the documents and call db.collection.replace_one.
let rec_document = bson::to_document(&item).unwrap();
let filter = doc! { "_id": item.id.as_ref().unwrap() };
let result = my_collection.replace_one(filter, rec_document, None).await?;
Since this is quite time consuming for large record sets, I want to implement it using db.collection.bulkWrite. In version 1.1.1 of the official rust mongodb driver, bulkWrite does not seem to be supported, so I want to use db.run_command. However, I am not sure how to call db.collection.bulkWrite(...) using run_command as I cannot figure out how to pass the command name as well as the set of documents to replace the values in mongodb.
What I have attempted is to create a String representing the command document, with all the document records to be updated joined into it as strings. To create a bson::Document from that string, I convert the string to bytes and then attempt to create the document using Document::from_reader, but that doesn't work, nor is it a good solution.
Is there a proper or better way to call bulkWrite using version 1.1.1 of the mongodb crate?

Get all documents from an index - elasticsearch

How can I get all documents from an index in elasticsearch without determining the size in the query like
GET http://localhost:8090/my_index/_search?size=1000&scroll=1m&pretty=true'-d '{"size": 0,"query":{"query_string":{ "match_all" : {}}}}
Thanks
According to the ES scan query documentation, size parameter is not just the number of results:
The size parameter allows you to configure the maximum number of hits
to be returned with each batch of results. Each call to the scroll API
returns the next batch of results until there are no more results left
to return, ie the hits array is empty.
To retrieve all the results, you need to make subsequent calls to the API in the manner described in the aforementioned documentation, or use a ready-made implementation such as the scan helper in the Python client. Here is an example script that dumps the resulting JSON documents to stdout:
import elasticsearch
from elasticsearch.helpers import scan
import json
es = elasticsearch.Elasticsearch('https://localhost:8090')
es_response = scan(
    es,
    index='my_index',
    doc_type='my_doc_type',
    query={"query": { "match_all" : {}}}
)
for item in es_response:
    print(json.dumps(item))
As per the latest documentation, you have to use the search_after parameter to retrieve more than 10,000 records from an index. Take a look here: https://www.elastic.co/guide/en/elasticsearch/reference/current/paginate-search-results.html#search-after
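A rough sketch of paging with search_after is shown below. It assumes a hypothetical callable search_page that takes a request body (a dict) and returns the parsed JSON response of a _search call, for example a thin wrapper around the client's search method. The function name iter_all_hits is also hypothetical. The request must include a sort, ideally ending in a unique tiebreaker field, because search_after resumes from the sort values of the last hit.

```python
def iter_all_hits(search_page, sort, page_size=1000):
    """Yield every hit of a match_all query by paging with search_after."""
    body = {"size": page_size, "query": {"match_all": {}}, "sort": sort}
    while True:
        hits = search_page(body)["hits"]["hits"]
        if not hits:
            return
        yield from hits
        # Each hit carries its sort values; pass the last ones to resume after it.
        body = dict(body, search_after=hits[-1]["sort"])
```

Each request returns at most page_size hits, and the loop stops when a page comes back empty.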

pymongo sort grouped results

I need to group some documents stored in MongoDB and sort them by date_published, using pymongo.
The group part went just fine :) but when I add .sort() to the query it keeps failing no matter what I try :(
Here is my query:
db.activities.group(keyf_code,cond,{},reduce_code)
I want to sort by a field called "published" (a timestamp).
I tried
db.activities.group(keyf_code,cond,{},reduce_code).sort({"published": -1})
and many more variations, without any success.
Any ideas?
You can't currently sort with group in MongoDB. You can use MapReduce instead, which does support a sort option. There is also an enhancement request to support group with sort.
Although MongoDB doesn't do what you want, you can always use Python to do the sorting:
from operator import itemgetter
# Sort the grouped results in Python, newest first
result = db.activities.group(keyf_code,cond,{},reduce_code)
result = sorted(result, key=itemgetter("published"), reverse=True)