Mongo find all documents from a particular month - mongodb

I have a collection with documents containing a date.
Is there a way to find all documents from a particular month (all years), using db.myCollection.find()?
I managed to achieve approximately what I need by using the aggregation pipeline and the $month operator, but I couldn't find anything similar among the 'query and projection' operators.
I do not need to group the documents (I only need to filter them).
I am modifying some code that creates filters dynamically based on the user selection and the code uses find with the generated filters.
I would like to be able to give a user the ability to see only entries from particular months (say, only entries for the summer season).
Finally, is there a performance difference between using a filter with find and using the equivalent filter in a $match stage of the aggregation pipeline?
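One way to stay within find() is the $expr query operator (available since MongoDB 3.6), which allows aggregation expressions such as $month inside an ordinary filter. The following is a minimal sketch, assuming the date is stored as a BSON date in a field named date. The field name and the helper monthFilter are illustrative and do not come from the question.

```javascript
// Hypothetical helper: builds a find() filter matching documents whose
// date field falls in any of the given months (1 = January ... 12 = December),
// regardless of year. The default field name "date" is an assumption.
function monthFilter(months, field = "date") {
  return { $expr: { $in: [{ $month: "$" + field }, months] } };
}

// Summer season (June, July, August):
const summer = monthFilter([6, 7, 8]);

// In the mongo shell the filter could be passed straight to find():
//   db.myCollection.find(summer)
// and the same document works unchanged in a $match stage:
//   db.myCollection.aggregate([{ $match: summer }])
```

Because the month is computed per document, a filter like this generally cannot use an index on the date field the way a plain range query can.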

Related

How to improve the performance of this MongoDB query

I am trying to take an extract from a huge MongoDB collection.
In particular, the collection contains 2.65 TB of data uncompressed (600 GB compressed). Each document has a deep hierarchy and a couple of arrays, and I want to extract some parts of them. The collection holds multiple documents for each customer id. Since I want to export the most active document for each customer, I need to group by customer, take the record with the maximum timestamp field, and perform some further processing on it. I need some help in forming the query for the export. I have tried to sort the documents per customer id, but this could not be achieved in an acceptable time when combined with a 'match' construct (which is needed because the collection is huge and we try to create the export in parts). Currently the query looks like this:
db.getCollection('CEM').aggregate([
  {'$match': {'LiveFeed.customer.profile.id': 'TCAYT2RY2PF93R93JVSUGU7D3'}},
  {'$project': {'LiveFeed.customer.profile.id': 1, 'LiveFeed.customer.profile.products.air.flights': 1, 'LiveFeed.context.timestamp': 1}},
  {'$sort': {'LiveFeed.customer.profile.id': 1, 'LiveFeed.context.timestamp': 1}},
  {'$group': {'_id': '$LiveFeed.customer.profile.id',
              'products': {'$last': '$LiveFeed.customer.profile.products.air.flights'}}},
  {'$unwind': '$products'},
  {'$unwind': '$products.sources'},
  {'$project': {'_id': 0,
                'ceid': '$_id',
                'coupon_no': {'$ifNull': ['$products.couponId.couponNumber', '']},
                'ticket_no': {'$ifNull': ['$products.couponId.ticketId.number', '']},
                'pnr_id': '$products.sources.id',
                'departure_date': '$products.segment.departure.at',
                'departure_airport': '$products.segment.departure.code',
                'arrival_airport': '$products.segment.arrival.code',
                'created_date': '$products.createdAt'}}
])
Any ideas or suggestions on how to improve this query would be very helpful. Thanks in advance!
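It is difficult to answer this without knowing the indexes on your collection. One place to look is stage 3. $group does not guarantee the order of its output documents, so the $sort has no effect on the order of the final results (see "$group does not preserve order"). It does, however, decide which document $last picks in stage 4, so it should be trimmed rather than removed: only the order within each group matters to $last, so sorting on LiveFeed.context.timestamp alone is enough and the sort on LiveFeed.customer.profile.id can be dropped.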

How to sort data using MongoDB Compass

I'm currently trying to use MongoDB Compass to query my collection. However, I seem to be only able to filter the data.
Is there any way for me to sort the data as well? I would like to sort my data in ascending order using one of my data fields.
If MongoDB Compass isn't the best way to order a collection, what other GUI could I use?
Using MongoDB Compass 1.7 or newer, you can sort (and project, skip, or limit) results by choosing the Documents tab and expanding the Options.
To sort in ascending order by a field myField, use { myField:1 }. Any of the usual cursor sort() options can be provided, including ordering results by multiple fields.
Note: options like sort and skip are not available in the default Schema tab because this view uses sampling to find a random set of documents, as opposed to the Documents view which displays a specific query result.
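For illustration, the Sort option accepts the same kind of document that cursor sort() takes in the shell. The field names below are placeholders, not fields from the question.

```javascript
// Sort documents as they could be entered in the Compass "Sort" option.
// myField and otherField are placeholder field names.
const ascending = { myField: 1 };    // ascending by myField
const descending = { myField: -1 };  // descending by myField
// myField ascending, ties broken by otherField descending:
const compound = { myField: 1, otherField: -1 };

// Equivalent mongo shell call for the compound case:
//   db.myCollection.find({}).sort({ myField: 1, otherField: -1 })
```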

MongoDB: Filtering the records and then updating vs. updating with filters

If I want to update multiple documents based on multiple filter criteria, which is the better approach?
Filter to get the documents that need to be updated (only the _id field), then pass the array of _id values to UpdateManyAsync (with $in) and update. (see below 1)
Update the documents by supplying the filter criteria directly. (see below 2)
Reason for this doubt:
1. MongoDB searches only for _id matches and updates them.
2. MongoDB checks the supplied criteria (multiple fields) against each document and updates the matches.
What is the performance difference between these two approaches, given that the first splits the update into two steps?
I am interested in performance with respect to timeouts, locks, and document availability after the update.
Please share your suggestions and views.
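The two approaches can be sketched as follows, using mongo shell style filter documents. The criteria (status, region), the update, and the helper idFilter are hypothetical, since the question does not show its actual filters.

```javascript
// Hypothetical criteria and update, for illustration only.
const criteria = { status: "pending", region: "EU" };
const update = { $set: { status: "processed" } };

// Approach 1: first fetch only the _id of the matching documents,
// then update by _id with $in (two round trips to the server).
//   const ids = db.myCollection.find(criteria, { _id: 1 }).toArray().map(d => d._id);
//   db.myCollection.updateMany(idFilter(ids), update);
function idFilter(ids) {
  return { _id: { $in: ids } };
}

// Approach 2: let the server apply the criteria and the update in one call.
//   db.myCollection.updateMany(criteria, update);
```

Note that in approach 1 the criteria are still evaluated once by the initial find, and documents can change between the two steps.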

Solr: Query for documents whose from-to date range contains the user input

I would like to store and query documents that contain a from-to date range, where the range represents an interval when the document has been valid.
Typical use cases in lucene/solr documentation address the opposite problem: Querying for documents that contain a single timestamp and this timestamp is contained in a date range provided as query parameter. (createdate:[1976-03-06T23:59:59.999Z TO *])
I want to use the edismax parser.
I have found the ms() function, which seems to me to be designed for boosting score only, not to eliminate non-matching results entirely.
I have found the article Spatial Search Tricks for People Who Don't Have Spatial Data, where the problem described by me is said to be Easy... (Find People Alive On May 25, 1977).
Is there any simpler way to express something like
date_from_query:[valid_from_field TO valid_to_field] than using the spatial approach?
The most direct approach is to create the bounds yourself:
valid_from_field:[* TO date_from_query] AND valid_to_field:[date_from_query TO *]
.. which would give you documents where valid_from_field is on or before the date you're querying and valid_to_field is on or after it; in effect, the documents whose interval from valid_from_field to valid_to_field contains that date. This assumes that neither field is multivalued.
I'd probably add it as a filter query, since you don't need any scoring from it, and you probably want to allow other search queries at the same time.
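As a rough sketch, the request parameters for such a filter query could be assembled like this. The helper name validAtParams, the match-all main query, and the sample date are assumptions for illustration.

```javascript
// Builds Solr request parameters that keep only documents whose
// [valid_from_field, valid_to_field] interval contains the given instant.
// The default main query "*:*" is a placeholder for the user's search.
function validAtParams(isoDate, userQuery = "*:*") {
  const params = new URLSearchParams();
  params.set("defType", "edismax");
  params.set("q", userQuery);
  // Filter query: restricts the result set without affecting scoring.
  params.set(
    "fq",
    `valid_from_field:[* TO ${isoDate}] AND valid_to_field:[${isoDate} TO *]`
  );
  return params;
}

const params = validAtParams("1977-05-25T00:00:00Z");
// params.toString() yields a URL-encoded query string for a select request.
```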

mongodb computed field based on another query

I have a mongodb query, and I want to add a computed field. The computed field is based on whether or not the item is in the results of another query. So my query returns the columns a, b, c, d, and then column e should be based on whether or not the current row would be matched by another query.
Is there an efficient way to do this in mongo? I'm not really sure how to do this one...
There is no way currently to execute a function as you describe within the database when returning a document via standard functions such as find. It's been requested by the community, but the general request is to operate only on a single document.
There are calculated fields using $project in the aggregation framework. But, they only operate on the current document in the pipeline. So, they can't summarize other queries.
You'll likely need to build your e value as part of your data access layer.
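A minimal sketch of that approach in application code is shown below. It assumes both queries have already been run and their results loaded into arrays; the function name addComputedField and the sample data are made up for illustration.

```javascript
// Sketch: compute field e in the data access layer.
// mainResults and otherResults stand in for the results of the two queries
// (for example, the arrays returned by find(...).toArray()).
function addComputedField(mainResults, otherResults) {
  // Collect the ids matched by the second query for fast lookups.
  // String() makes object-valued ids compare by value, not by identity.
  const matchedIds = new Set(otherResults.map((doc) => String(doc._id)));
  return mainResults.map((doc) => ({
    ...doc,
    e: matchedIds.has(String(doc._id)),
  }));
}

// Made-up sample data:
const mainResults = [
  { _id: 1, a: "x" },
  { _id: 2, a: "y" },
];
const otherResults = [{ _id: 2 }];
const withE = addComputedField(mainResults, otherResults);
```

If the second query matches many documents, projecting only _id in that query keeps the amount of transferred data small.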