how to use $lookup in mongodb change stream? - mongodb

Based on the MongoDB Change Stream docs, I can use only these pipeline stages to transform the output of a change stream:
$addFields
$match
$project
$replaceRoot
$replaceWith (Available starting in MongoDB 4.2)
$redact
$set (Available starting in MongoDB 4.2)
$unset (Available starting in MongoDB 4.2)
But I want to use the $lookup stage. :(
Is there any way to achieve this?

The allowed stages are transformations applied to the documents produced by the change stream. What you are asking about is joining other collections.
If you want to join other collections, you need to issue those queries separately from the change stream.
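For example (a sketch only; the collection and field names are hypothetical): you can still filter the change stream with the allowed stages, then perform the "join" yourself with a separate query for each event:

```javascript
// Pipeline of allowed change stream stages: filter to insert events only.
const changeStreamPipeline = [
  { $match: { operationType: "insert" } },
  { $project: { fullDocument: 1 } }
];

// In the shell or a driver, you would then run the $lookup equivalent
// manually for every event, e.g.:
//
//   const cursor = db.orders.watch(changeStreamPipeline);
//   while (cursor.hasNext()) {
//     const event = cursor.next();
//     // Separate query against the other collection instead of $lookup:
//     const customer = db.customers.findOne({ _id: event.fullDocument.customerId });
//   }
```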

Related

$set vs $addField in mongoDb Aggregation Framework

With MongoDB 4.2, we have a new aggregation stage, $set. Per the docs,
the $set stage is an alias for $addFields (available since MongoDB 3.4).
But nothing is mentioned about why there need to be two stages with the same functionality.
Can someone help me understand this? (Possibly because versions < 4.0 will be deprecated soon?)
MongoDB 4.2 introduced update commands that can accept an aggregation pipeline.
$set has long been an update operator, which does the same thing in updates as the $addFields stage does in aggregation.
Permitting either name to be used in an aggregation pipeline eases the adoption of the new update command.
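To illustrate the equivalence (a sketch; the collection and field names are hypothetical), these two pipelines are interchangeable in MongoDB 4.2+:

```javascript
// $set is just an alias for $addFields in aggregation pipelines.
const withAddFields = [{ $addFields: { total: { $add: ["$price", "$tax"] } } }];
const withSet       = [{ $set:       { total: { $add: ["$price", "$tax"] } } }];

// db.items.aggregate(withAddFields) and db.items.aggregate(withSet)
// produce identical results on 4.2+.
```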

How can I perform facet aggregation in ReactiveMongo with Play Framework?

I am trying to run a $facet aggregation in order to get the documents matching a $match stage (with $skip), together with the count of matching documents.
I found there is no $facet aggregation in ReactiveMongo. How can I do this in my Play Framework 2.7 app?
Any aggregation stage can be written using PipelineOperator (if no convenient function is provided).
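For reference, this is the raw $facet stage you would express through PipelineOperator, shown here in shell syntax (a sketch; the field names and skip/limit values are hypothetical):

```javascript
// Match, then split into two facets: a page of documents and a total count.
const facetPipeline = [
  { $match: { status: "active" } },
  { $facet: {
      documents:  [{ $skip: 20 }, { $limit: 10 }],
      totalCount: [{ $count: "count" }]
  } }
];

// db.things.aggregate(facetPipeline) returns one document with both
// the page of results and the matching-document count.
```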

MongoDB, indexing query in inner object and grouping? [duplicate]

I'm trying to use the aggregation framework with $match and $group stages. Does the $group stage use index data? I'm using the latest available MongoDB version - 2.5.4.
$group does not use index data.
From the MongoDB docs:
The $match and $sort pipeline operators can take advantage of an index when they occur at the beginning of the pipeline.
The $geoNear pipeline operator takes advantage of a geospatial index.
When using $geoNear, the $geoNear pipeline operation must appear as
the first stage in an aggregation pipeline.
@ArthurTacca, as of Mongo 4.0, a $sort preceding a $group will speed things up significantly. See https://stackoverflow.com/a/56427875/92049.
As 4J41's answer says, $group does not (directly) use an index, although $sort does if it is the first stage in the pipeline. However, it seems possible that $group could, in principle, have an optimised implementation if it immediately follows a $sort, in which case you could effectively make it use an index by putting a $sort beforehand.
There does not seem to be a straight answer either way in the docs about whether $group has this optimisation (although I bet there would be if it did, so this suggests it doesn't). The answer is in MongoDB bug 4507: currently $group does NOT have this implementation, so the top line of 4J41's answer is right after all. If you really need efficiency, depending on the application it may be quickest to use a regular query and do the grouping in your client code.
Edit: As sebastian's answer says, it seems that in practice using $sort (that can take advantage of an index) before a $group can make a very large speed improvement. The bug above is still open so it seems that it is not making the absolute best possible advantage of the index (that is, starting to group items as items are loaded, rather than loading them all in memory first). But it is still certainly worth doing.
Per Mongo's 4.2 $group documentation, there is a special optimization for $first:
Optimization to Return the First Document of Each Group
If a pipeline sorts and groups by the same field and the $group stage only uses the $first accumulator operator, consider adding an index on the grouped field which matches the sort order. In some cases, the $group stage can use the index to quickly find the first document of each group.
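A sketch of the pattern the 4.2 docs describe (the collection, index, and field names are hypothetical): sort and group on the same field, use only $first, and back it with a matching index:

```javascript
// Assumed index matching the sort order:
//   db.sales.createIndex({ item: 1, date: 1 });

const firstPerGroup = [
  { $sort:  { item: 1, date: 1 } },
  { $group: { _id: "$item", firstSale: { $first: "$$ROOT" } } }
];

// On 4.2+, db.sales.aggregate(firstPerGroup) can jump to the first
// index entry of each group instead of scanning every document.
```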
It makes sense, since only the first entry in an ordered index should be needed for each bin in the $group stage. Unfortunately, in my 3.6 testing, I haven't been able to get nearly the performance I would expect if the index were really being used. I've posted about that problem in detail in another question.
EDIT 2020-04-23
I confirmed with Atlas's MongoDB Support that this $first optimization was added in Mongo 4.2, hence my trouble getting it to work with 3.6. There is also a bug preventing it from working with a composite $group _id at the moment. Further details are available in the post that I linked above.
Changed in version 3.2: Starting in MongoDB 3.2, indexes can cover an aggregation pipeline. In MongoDB 2.6 and 3.0, indexes could not cover an aggregation pipeline since even when the pipeline uses an index, aggregation still requires access to the actual documents.
https://docs.mongodb.com/master/core/aggregation-pipeline/#pipeline-operators-and-indexes

Is it possible to get the textScore on mongodb MapReduce?

If you have created a text index in MongoDB 2.6, then when you use find or the aggregation pipeline framework, you can get the textScore for a given query with the projection:
{ $meta: "textScore" }
This allows operating on the textScore in further operations.
Is it possible to access this value during a map-reduce operation?
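For context, this is how the textScore is obtained in find() (shell syntax; the collection and text index are hypothetical):

```javascript
// Assumed text index:
//   db.articles.createIndex({ body: "text" });

const query      = { $text: { $search: "mongodb" } };
const projection = { score: { $meta: "textScore" } };

// db.articles.find(query, projection)
//            .sort({ score: { $meta: "textScore" } });
// The question is whether the same value is reachable inside mapReduce.
```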

How to aggregate and merge the result into a collection?

I want to aggregate and insert the results into an existing collection, without deleting that collection. The documentation seems to suggest that this isn't directly possible. I find that hard to believe.
The map-reduce functionality has 'output modes', including 'merge', which does what I want. I'm looking for the equivalent for aggregation.
The new $out aggregation stage supports inserting into a collection, but it replaces the collection rather than updating it. If I did this I would (I think) have to run another map-reduce to merge this into another collection, which seems inefficient.
Am I missing something or is the functionality just missing from the aggregation feature?
I used the output of the aggregation to insert/merge into the collection:
db.coll2.insert(
db.coll1.aggregate([]).toArray()
)
Reading the documentation answers this question quite precisely. At the moment, Mongo is not able to do what you want:
The $out operation creates a new collection in the current database if one does not already exist. The collection is not visible until the aggregation completes. If the aggregation fails, MongoDB does not create the collection.
If the collection specified by the $out operation already exists, then upon completion of aggregation the $out stage atomically replaces the existing collection with the new results collection. The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the previous collection.
For anyone coming to this more recently: starting in version 4.2, you are able to do this using the $merge stage in an aggregation pipeline. It needs to be the last stage in the pipeline.
{ $merge: { into: "myOutput", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } }
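A fuller sketch of such a pipeline (the collection and field names are hypothetical), with $merge as the required final stage:

```javascript
// Aggregate, then upsert the results into an existing collection
// without replacing it (MongoDB 4.2+).
const mergePipeline = [
  { $group: { _id: "$category", total: { $sum: "$amount" } } },
  { $merge: {
      into: "myOutput",
      on: "_id",
      whenMatched: "replace",
      whenNotMatched: "insert"
  } }
];

// db.sales.aggregate(mergePipeline);
// Matched _ids are replaced in myOutput; new _ids are inserted.
```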
If you're not stuck on using the aggregation operators, you could do an incremental map-reduce on the collection. This approach allows you to merge results into an existing collection.
See documentation below:
http://docs.mongodb.org/manual/tutorial/perform-incremental-map-reduce/