Aggregate $unwind in MongoDB

Hi, I'm trying to do a query using aggregation (MongoDB version 2.2) to return a list of products in a certain store owned by a certain user. I think I have the first two steps correct, but I'm stuck on the next ones.
Data:
Here's the incomplete query I have:
collection.aggregate(
  [
    { '$match': { "username": test } },
    { '$unwind': '$stores' }
  ],
  function(err, results) {
    assert.equal(err, null);
    console.log(results);
    callback(results);
  }
);
Can anyone advise on how to go about the next few steps?
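A minimal sketch of what the remaining stages could look like, assuming (hypothetically, since the data sample is missing) that each entry in stores has a name field and an embedded products array:

collection.aggregate(
  [
    { '$match': { "username": test } },
    { '$unwind': '$stores' },
    { '$match': { 'stores.name': storeName } },                   // hypothetical: keep only the requested store
    { '$project': { '_id': 0, 'products': '$stores.products' } }  // hypothetical: return just its products
  ],
  function(err, results) {
    assert.equal(err, null);
    callback(results);
  }
);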

Related

findOneAndUpdate - document query > 1000

So I have a collection that has over 1 million documents. Clearly, I don't need to search 1,000 or more documents, as it should only be updating the latest document.
const nowplayingData = {
  "type": "S",
  "station": req.params.stationname,
  "song": data[1],
  "artist": data[0],
  "timeplay": npdate
};
LNowPlaying.findOneAndUpdate(
  nowplayingData,
  { $addToSet: { history: [uuid] } },
  { upsert: true, sort: { _id: -1 } },
  function(err) {
    if (err) {
      console.log('ERROR when submitting round');
      console.log(err);
    }
  }
);
I have added a sort so that it gets the latest document first, but if the document is not there yet and is being added for the first time, the way the script is written it will query all documents looking for a match.
Really it only needs to look at, say, the last 100 documents, or better still, only documents whose timeplay is within the last 5 minutes.
You can do something like:
const nowplayingData = {
  "type": "S",
  "station": req.params.stationname,
  "song": data[1],
  "artist": data[0],
  "timeplay": { $gte: beforeFiveMin }
};
But this will create a new document almost every time...so you will need to maintain it...
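As a minimal sketch (assuming the variables from the question), beforeFiveMin is just a Date five minutes in the past:

// Hypothetical helper for the filter above: a Date five minutes ago.
const beforeFiveMin = new Date(Date.now() - 5 * 60 * 1000);

The rest of the findOneAndUpdate call stays the same; with the upsert, a miss inside that window will still insert a fresh document, which is the maintenance trade-off mentioned above.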

How would I update a field in MongoDB which totals up the values of a child document?

I have a document which is structured like this:
{
  'item_id': '12345',
  'total_score': 100,
  'user_scores': {
    'ABC': 40,
    'DEF': 60
  }
}
I'm using PyMongo, but MongoDB documentation translates easily across the different drivers. With PyMongo, I could update user scores with:
collection.update_one(
    { 'item_id': '12345' },
    { '$set': { 'user_scores.GHI': 20 } },
    upsert=True
)
Which results in this:
{
  'item_id': '12345',
  'total_score': 100,
  'user_scores': {
    'ABC': 40,
    'DEF': 60,
    'GHI': 20
  }
}
The issue, of course, is that total_score is now incorrect. I want that total score to update so that in a future query I can quickly ascertain the score of each result, and even sort by score.
One solution could be to find an existing document using find_one({'item_id': '12345'}) (creating it if it doesn't exist), then update it with the new scores and update the total score. The problem there is that I want to run thousands of these at the same time, and it's far more efficient to call bulk_write on a series of requests.
So, a better solution would be to do two sequential update requests:
request1 = UpdateOne(
    { 'item_id': '12345' },
    { '$set': { 'user_scores.GHI': 20 } },
    upsert=True
)
request2 = UpdateOne(
    { 'item_id': '12345' },
    { '$set': { 'total_score': { '$sum': { '$values': 'user_scores' } } } },
    upsert=True
)
The first request updates the user scores, same as before. In the second request, there are two concepts going on. The syntax for this isn't correct, but here's what I'm trying to do:
I need to get the values from the user_scores dictionary. { '$values': 'user_scores' } is how I've tried to convey this.
That gives me an array of values. I know these are all numeric, so I now need to sum those, conveyed with { '$sum': { '$values': 'user_scores' } }.
I can run these batch updates consecutively, so there's no risk of summing the wrong thing. The danger with having a total_score field will always be that it isn't updated and thus doesn't contain the correct number. I'd imagine this is a common case with document-based models?
If you're using Mongo version 4.2+, you can use a new feature: pipelined updates. Meaning, you can now do what you want in one go:
db.collection.updateOne(
  { 'item_id': '12345' },
  [
    { '$set': { 'user_scores.GHI': 20 } },
    { '$set': { 'total_score': { '$sum': [ "$user_scores.ABC", "$user_scores.DEF", "$user_scores.GHI" ] } } }
  ]
);
Unfortunately this is not possible in earlier Mongo versions, so if that is the case you'll have to keep using your solution of splitting this into two operations.
EDIT:
For a dynamic update we can use $map and $objectToArray, like so:
db.collection.updateOne(
  { 'item_id': '12345' },
  [
    { '$set': { 'user_scores.GHI': 20 } },
    {
      '$set': {
        'total_score': {
          '$sum': {
            '$map': {
              'input': { '$objectToArray': '$user_scores' },
              'as': 'score',
              'in': '$$score.v'
            }
          }
        }
      }
    }
  ]
);

How to save / update multiple documents in mongoose

I am reading all documents of a specific schema with Mongoose. Over time my program makes some modifications to the results I got from Mongoose. Something like this:
var model = mongoose.model("Doc", docSchema);
model.find(function(err, result){
// for each result do some modifications
});
How can I send all the results back to the database to be saved? Currently I am iterating over the documents and calling save() on every one. I think there must be a better way, but so far I only find information on updating documents IN the database without returning them, or bulk updates which apply the SAME update to each document.
You can use an update query with multi: true, which updates all matching documents in your db.
Please find reference code below:
model.update({ "_id": id }, { $set: { "Key": "Value" } }, { multi: true }, function (err, records) {
  if (err || !records) {
    return res.json({ status: 500, message: "Unable to update documents." });
  } else {
    return res.json({ status: 200, message: "success" });
  }
});
If you are trying to make the same change to each document in the results, you could do something like this:
model.update({ _id: { $in: results.map(doc=>doc._id) }}, { yourField: 'new value' }, { multi: true })
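If, instead, each document needs its own modification (as in the original question), a hedged sketch using Model.bulkWrite (available in Mongoose 4.9+; computeNewValue and someField are hypothetical) would send all the changes back in one round trip:

model.find(function(err, results) {
  // Build one updateOne operation per modified document.
  var ops = results.map(function(doc) {
    return {
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { someField: computeNewValue(doc) } } // hypothetical field and helper
      }
    };
  });
  model.bulkWrite(ops, function(err, res) {
    if (err) { return console.log(err); }
    console.log('Modified:', res.modifiedCount);
  });
});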

Mongo aggregation and MongoError: exception: BufBuilder attempted to grow() to 134217728 bytes, past the 64MB limit

I'm trying to aggregate data from my Mongo collection to produce some statistics for FreeCodeCamp by making a large json file of the data to use later.
I'm running into the error in the title. There doesn't seem to be a lot of information about this, and the other posts here on SO don't have an answer. I'm using the latest version of MongoDB and drivers.
I suspect there is probably a better way to run this aggregation, but it runs fine on a subset of my collection. My full collection is ~7GB.
I'm running the script via node aggScript.js > ~/Desktop/output.json
Here is the relevant code:
MongoClient.connect(secrets.db, function(err, database) {
  if (err) {
    throw err;
  }
  database.collection('user').aggregate([
    {
      $match: {
        'completedChallenges': {
          $exists: true
        }
      }
    },
    {
      $match: {
        'completedChallenges': {
          $ne: ''
        }
      }
    },
    {
      $match: {
        'completedChallenges': {
          $ne: null
        }
      }
    },
    {
      $group: {
        '_id': 1,
        'completedChallenges': {
          $addToSet: '$completedChallenges'
        }
      }
    }
  ], {
    allowDiskUse: true
  }, function(err, results) {
    if (err) { throw err; }
    var aggData = results.map(function(camper) {
      return _.flatten(camper.completedChallenges.map(function(challenges) {
        return challenges.map(function(challenge) {
          return {
            name: challenge.name,
            completedDate: challenge.completedDate,
            solution: challenge.solution
          };
        });
      }), true);
    });
    console.log(JSON.stringify(aggData));
    process.exit(0);
  });
});
Aggregate returns a single document containing all the result data, which limits how much data can be returned to the maximum BSON document size.
Assuming that you do actually want all this data, there are two options:
Use an aggregation cursor instead of the plain aggregate call. This returns a cursor rather than a single document, which you can then iterate over.
Add a $out stage as the last stage of your pipeline. This tells MongoDB to write your aggregation output to the specified collection; the aggregate command itself returns no data, and you then query that collection as you would any other, for example:
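A minimal sketch of the $out variant (the output collection name here is hypothetical):

database.collection('user').aggregate([
  { $match: { 'completedChallenges': { $exists: true } } },
  // ... the rest of your pipeline ...
  { $out: 'aggregatedChallenges' }
], { allowDiskUse: true }, function(err) {
  if (err) { throw err; }
  // The data now lives in 'aggregatedChallenges'; read it back with a normal cursor:
  database.collection('aggregatedChallenges').find().forEach(function(doc) {
    // process each result document here
  }, function(err) {
    process.exit(0);
  });
});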
It just means that the result object you are building became too large. This kind of issue should not be impacted by the version. The fix implemented for 2.5.0 only prevents the crash from occurring.
You need to filter ($match) properly so that only the data you need ends up in the result, and group on the proper fields. The results are put into a 64MB buffer, so reduce your data: $project only the fields you require in the result, not whole documents (see the sketch after the $match block below).
You can also combine your three $match stages into a single one to shorten the pipeline:
{
  $match: {
    'completedChallenges': {
      $exists: true,
      $nin: [null, ""]
    }
  }
}
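As a hedged illustration of the $project advice above (keeping only the field the pipeline actually groups on), a stage like this could follow the combined $match:

{
  $project: {
    _id: 0,
    completedChallenges: 1
  }
}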
I had this issue and I couldn't debug the problem, so I ended up abandoning the aggregation approach. Instead I just iterated through each entry and created a new collection. Here's a stripped-down shell script which might help you see what I mean:
db.new_collection.ensureIndex({ my_key: 1 }); // for performance, not a necessity
db.old_collection.find({}).noCursorTimeout().forEach(function(doc) {
  db.new_collection.update(
    { my_key: doc.my_key },
    {
      $push: { stuff: doc.stuff, other_stuff: doc.other_stuff },
      $inc: { thing: doc.thing }
    },
    { upsert: true }
  );
});
I don't imagine that this approach would suit everyone, but hopefully that helps anyone who was in my particular situation.

Limiting results in MongoDB but still getting the full count?

For speed, I'd like to limit a query to 10 results
db.collection.find( ... ).limit(10)
However, I'd also like to know the total count, so I can say "there were 124 but I only have 10". Is there a good, efficient way to do this?
By default, count() ignores limit() and counts the results in the entire query.
So when you, for example, do this: var a = db.collection.find(...).limit(10);
running a.count() will give you the total count of your query.
Doing count(1) includes limit and skip.
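For illustration, a minimal shell sketch (query stands for whatever filter you are using):

var a = db.collection.find(query).limit(10);
a.count();     // total number of matching documents, ignoring the limit
a.count(true); // applies skip and limit, so at most 10 here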
The accepted answer by @johnnycrab is for the mongo CLI.
If you have to write the same code in Node.js and Express.js, you will have to use it like this to be able to use the "count" function along with toArray's "result".
var curFind = db.collection('tasks').find({query});
Then you can run two functions after it like this (one nested in the other)
curFind.count(function (e, count) {
  // Use count here
  curFind.skip(0).limit(10).toArray(function(err, result) {
    // Use result here and count here
  });
});
cursor.count() should ignore cursor.skip() and cursor.limit() by default.
Source: http://docs.mongodb.org/manual/reference/method/cursor.count/#cursor.count
You can use a $facet stage which processes multiple aggregation pipelines within a single stage on the same set of input documents:
// { item: "a" }
// { item: "b" }
// { item: "c" }
db.collection.aggregate([
  { $facet: {
    limit: [{ $limit: 2 }],
    total: [{ $count: "count" }]
  }},
{ $set: { total: { $first: "$total.count" } } }
])
// { limit: [{ item: "a" }, { item: "b" }], total: 3 }
This way, within the same query, you can get both some documents (limit: [{ $limit: 2 }]) and the total count of documents ({ $count: "count" }).
The final $set stage is an optional clean-up step, just there to project the result of the $count stage, such that "total" : [ { "count" : 3 } ] becomes total: 3.
There is a solution using push and slice: https://stackoverflow.com/a/39784851/4752635
I prefer a solution with two queries:
The first filters and then groups by ID to get the number of filtered elements. Do not sort or paginate here; it is unnecessary.
The second query filters, sorts and paginates.
The solution pushing $$ROOT and using $slice runs into the 16MB document memory limitation for large collections. Also, for large collections, two queries together seem to run faster than the one pushing $$ROOT. You can run them in parallel as well, so you are limited only by the slower of the two queries (probably the one which sorts).
I have settled on this solution using two queries and the aggregation framework (note: I use Node.js in this example, but the idea is the same):
var aggregation = [
  {
    // If you can match fields at the beginning, match as many as possible as early as possible.
    $match: {...}
  },
  {
    // Projection.
    $project: {...}
  },
  {
    // Some things you can match only after projection or grouping, so do it now.
    $match: {...}
  }
];
// Copy the filtering stages from the pipeline - this is the same for both counting the number of filtered elements and for the pagination query.
var aggregationPaginated = aggregation.slice(0);
// Count filtered elements.
aggregation.push(
  {
    $group: {
      _id: null,
      count: { $sum: 1 }
    }
  }
);
// Sort in pagination query.
aggregationPaginated.push(
  {
    $sort: sorting
  }
);
// Paginate.
aggregationPaginated.push(
  {
    $limit: skip + length
  },
  {
    $skip: skip
  }
);
// I use mongoose.
// Get total count.
model.count(function(errCount, totalCount) {
  // Count filtered.
  model.aggregate(aggregation)
    .allowDiskUse(true)
    .exec(function(errFind, documents) {
      if (errFind) {
        // Errors.
        res.status(503);
        return res.json({
          'success': false,
          'response': 'err_counting'
        });
      }
      else {
        // Number of filtered elements.
        var numFiltered = documents[0].count;
        // Filter, sort and paginate.
        model.request.aggregate(aggregationPaginated)
          .allowDiskUse(true)
          .exec(function(errFindP, documentsP) {
            if (errFindP) {
              // Errors.
              res.status(503);
              return res.json({
                'success': false,
                'response': 'err_pagination'
              });
            }
            else {
              return res.json({
                'success': true,
                'recordsTotal': totalCount,
                'recordsFiltered': numFiltered,
                'response': documentsP
              });
            }
          });
      }
    });
});
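As mentioned above, the two aggregations can also run in parallel; a hedged sketch using promises (Mongoose's .exec() returns a promise when called without a callback):

Promise.all([
  model.aggregate(aggregation).allowDiskUse(true).exec(),
  model.aggregate(aggregationPaginated).allowDiskUse(true).exec()
]).then(function(results) {
  // First result is the counting pipeline, second is the paginated one.
  var numFiltered = results[0].length ? results[0][0].count : 0;
  var documentsP = results[1];
  // Respond with totalCount, numFiltered and documentsP as in the callback version above.
}).catch(function(err) {
  res.status(503);
  res.json({ 'success': false, 'response': 'err_aggregation' });
});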