Project object existence boolean in MongoDB


I have a document structure that looks like this (two example docs below).
{
"A": "value"
},
{
"A": "value",
"B": {
"a": "value",
"b": "value"
}
}
I want to aggregate such that the value of field A is projected while a true/false value is returned depending on whether the object B exists. The result of the query would be:
{
"A": "value",
"B": false
},
{
"A": "value",
"B": true
}

Even a shorter solution:
db.collection.aggregate({
$project: {
A: 1,
B: { $cond: ["$B", true, false] }
}
})
or
db.collection.aggregate({
$project: {
A: 1,
B: { $ifNull: [{ $toBool: "$B" }, false] }
}
})
However, the following documents will yield a different result from the other answers. Check whether such documents occur in your application.
{
'A': 'value5',
'B': false
},
{
'A': 'value5',
'B': []
}
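To see why such documents matter, here is a plain JavaScript sketch (not MongoDB code) that compares a truthiness test in the style of $cond: ["$B", true, false] with a presence test that only asks whether B is stored. The helper names truthyLikeMongo and fieldExists are made up for illustration, and the truthiness rule is an assumption: false, null, undefined and 0 count as false, everything else (arrays included) counts as true.

```javascript
// Plain-JS illustration only; these helpers are hypothetical, not MongoDB APIs.
// Assumed rule for aggregation booleans: false, null, undefined and 0 are false;
// every other value (including arrays and objects) is true.
function truthyLikeMongo(value) {
  return !(value === false || value === null || value === undefined || value === 0);
}

// Presence test: true whenever the key is stored, whatever its value.
function fieldExists(doc, key) {
  return Object.prototype.hasOwnProperty.call(doc, key);
}

const docs = [
  { A: "value" },
  { A: "value", B: { a: "value", b: "value" } },
  { A: "value5", B: false },
  { A: "value5", B: [] },
];

const comparison = docs.map((doc) => ({
  byTruthiness: truthyLikeMongo(doc.B),
  byPresence: fieldExists(doc, "B"),
}));
console.log(comparison);
```

In this sketch only the document with B: false disagrees between the two tests. How $toBool treats an array is not modelled here.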

You can use the aggregation below:
db.collection.aggregate([
{ "$addFields": {
"B": {
"$cond": [
{ "$eq": ["$B", undefined] },
false,
true
]
}
}}
])

You may use the $type operator:
If the argument is a field that is missing in the input document, $type returns the string "missing".
db.collection.aggregate([
{
$project: {
A: 1,
B: {
$ne: [
{
$type: "$B"
},
"missing"
]
}
}
}
])
MongoPlayground

Related

MongoDB group by values and count the number of occurrences

I am trying to count how many times a particular value occurs in a collection.
{
_id:1,
field1: value,
field2: A,
}
{
_id:2,
field1: value,
field2: A,
}
{
_id:3,
field1: value,
field2: C,
}
{
_id:4,
field1: value,
field2: B,
}
What I want is to count how many times A, B and C each occur and return the counts.
The output I want:
{
A: 2,
B: 1,
C: 1,
}

You can use $facet in an aggregation pipeline like this:
$facet creates three sub-pipelines, each of which filters the documents by the desired value (A, B or C).
Then, in a $project stage, you can get the $size of each array of matched documents.
db.collection.aggregate([
{
"$facet": {
"first": [
{
"$match": {
"field2": "A"
}
}
],
"second": [
{
"$match": {
"field2": "B"
}
}
],
"third": [
{
"$match": {
"field2": "C"
}
}
]
}
},
{
"$project": {
"A": {
"$size": "$first"
},
"B": {
"$size": "$second"
},
"C": {
"$size": "$third"
}
}
}
])
Example here

This is a typical use case for the $group stage of the aggregation pipeline. You can do it like this:
$group - groups all the documents by field2
$sum - counts the number of documents for each value of field2
db.collection.aggregate([
{
"$group": {
"_id": "$field2",
"count": {
"$sum": 1
}
}
}
])
Working example
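For readers who want to see what the $group stage computes, the following plain JavaScript sketch (not MongoDB code) does the same grouping client-side on the four sample documents. The quoted string values are an assumption, since the question shows them unquoted.

```javascript
// Plain-JS illustration of grouping by field2 and counting; not MongoDB code.
const docs = [
  { _id: 1, field1: "value", field2: "A" },
  { _id: 2, field1: "value", field2: "A" },
  { _id: 3, field1: "value", field2: "C" },
  { _id: 4, field1: "value", field2: "B" },
];

// One counter per distinct field2 value, incremented by 1 per document
// (the role played by { $sum: 1 } in the pipeline above).
const counts = new Map();
for (const doc of docs) {
  counts.set(doc.field2, (counts.get(doc.field2) || 0) + 1);
}

// Shape the result like the $group output: one { _id, count } per group.
const grouped = [...counts].map(([key, count]) => ({ _id: key, count }));
console.log(grouped);
```

Note that this yields one result per group rather than the single { A: 2, B: 1, C: 1 } document asked for, and that $group itself does not guarantee any particular order of groups.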

Leverage the $arrayToObject operator and a final $replaceWith pipeline to get the desired result. You would need to run the following aggregate pipeline:
db.collection.aggregate([
{ $group: {
_id: { $toUpper: '$field2' },
count: { $sum: 1 }
} },
{ $group: {
_id: null,
counts: {
$push: { k: '$_id', v: '$count' }
}
} },
{ $replaceWith: { $arrayToObject: '$counts' } }
])
Mongo Playground
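The last stage is the part that turns per-group results into one document. As a rough plain JavaScript analogy (not MongoDB code), converting an array of { k, v } pairs into an object looks like this; the counts array is a hypothetical stand-in for what the two $group stages would produce on the sample data.

```javascript
// Hypothetical output of the two $group stages: an array of { k, v } pairs.
const counts = [
  { k: "A", v: 2 },
  { k: "B", v: 1 },
  { k: "C", v: 1 },
];

// Each pair becomes one property of the result, keyed by k with value v.
const result = Object.fromEntries(counts.map(({ k, v }) => [k, v]));
console.log(result); // { A: 2, B: 1, C: 1 }
```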

MongoDB aggregation filter array of subdocuments by missing field

Given the following collection:
db.test.insertOne(
{
test: [
{a: "1", b: "2"},
{a: "3"}
]
}
)
How can I filter out the second subdocument (the one without field b) from the test array inside an aggregation?
The aggregation
db.test.aggregate([
{
$project: {
test: {
$filter: {
input: "$test",
cond: {
$ne: ["$$this.b", null]
}
}
}
}
}
])
still returns the subdocument with the nonexistent field b:
{
"_id": {"$oid": "xyz"},
"test": [
{
"a": "1",
"b": "2"
},
{
"a": "3"
}
]
}
I'd expect the following result:
{
"_id": {"$oid": "xyz"},
"test": [
{
"a": "1",
"b": "2"
}
]
}
Obviously my pipeline is more complex than that, but the problem boils down to this small example.

Null and missing are different types in BSON. To test for missing, compare with the $type:
cond: {
$ne: [{$type:"$$this.b"}, "missing"]
}
Playground
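A loose plain JavaScript analogy (not MongoDB code) may help: a missing property reads as undefined, which is not the same value as null, so a comparison against null keeps the element, while a presence test drops it.

```javascript
// Plain-JS analogy only; JavaScript's undefined stands in for a missing field.
const test = [{ a: "1", b: "2" }, { a: "3" }];

// Comparing against null: the missing b reads as undefined, and
// undefined !== null, so the second element is kept.
const keptByNullCheck = test.filter((el) => el.b !== null);

// Testing for presence instead: the second element has no b key and is dropped.
const keptByPresence = test.filter((el) => "b" in el);

console.log(keptByNullCheck.length); // 2
console.log(keptByPresence);         // [ { a: '1', b: '2' } ]
```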

How to retrieve all elements of an array on which a specific field does not exist?

Consider the following data:
{
"foo": "foo",
"baz": [
{
"D": 77
}
]
},
{
"foo": "bar",
"baz": [
{
"A": 5,
"B": 15
},
{
"A": 13,
"B": 34,
"C": 68
},
{
"A": 192,
"B": 168,
"C": 1,
"D": 27
}
]
},
{
"foo": "baz",
"baz": [
{
"A": 5,
"B": 10,
"C": 15
},
{
"A": 13,
"D": 37
}
]
}
I tried finding all and only projecting the result, but due to the nature of $elemMatch this only returns the first element without the field D.
db.collection.find({}, {"baz": {$elemMatch: {"D": {$exists: false}}}})
How could I retrieve all documents, which have at least one element in baz, which has no field D and project it, so only the entries without the field D are shown?

Use $anyElementTrue for your filtering criteria and $filter to remove those baz elements that contain D:
db.collection.aggregate([
{
$match: {
$expr: {
$anyElementTrue: {
$map: {
input: "$baz",
in: { $eq: [ "$$this.D", undefined ] }
}
}
}
}
},
{
$addFields: {
baz: {
$filter: {
input: "$baz",
cond: { $eq: [ "$$this.D", undefined ] }
}
}
}
}
])
Mongo Playground

I found a way to achieve my goal:
db.collection.aggregate([
{
$unwind: "$baz"
},
{
$match: {
"baz.D": {$exists: false}
}
},
{
$group: {
_id: "$_id",
foo: {$first: "$foo"},
baz: {$push: "$baz"}
}
}
])

You can do this as:
db.collection.aggregate([
{
$project: {
foo: 1,
baz: {
$filter: {
input: "$baz",
as: "item",
cond: {
$lt: [
"$$item.D",
null
]
}
}
}
}
},
{
$match: {
baz: {
$ne: []
}
}
}
])
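The intended result of this two-step approach can be checked with a plain JavaScript sketch (not MongoDB code) run over the sample data. It assumes the first sample document's baz is [ { "D": 77 } ], which is a guess at what the question meant.

```javascript
// Plain-JS illustration of "filter the array, then drop empty results".
const docs = [
  { foo: "foo", baz: [{ D: 77 }] },
  { foo: "bar", baz: [{ A: 5, B: 15 }, { A: 13, B: 34, C: 68 }, { A: 192, B: 168, C: 1, D: 27 }] },
  { foo: "baz", baz: [{ A: 5, B: 10, C: 15 }, { A: 13, D: 37 }] },
];

const result = docs
  // Keep only the baz elements that have no D key (the $filter step).
  .map((doc) => ({ foo: doc.foo, baz: doc.baz.filter((item) => !("D" in item)) }))
  // Drop documents whose filtered array ended up empty (the $match step).
  .filter((doc) => doc.baz.length > 0);

console.log(JSON.stringify(result));
```

The "foo" document disappears because its only element has D, while "bar" keeps two elements and "baz" keeps one.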

Mongodb array $push and $pull

I was trying to pull a value from an array and push a new one to it at the same time.
userSchema.statics.experience = function (id,xper,delet,callback) {
var update = {
$pull:{
'profile.experience' : delet
},
$push: {
'profile.experience': xper
}
};
this.findByIdAndUpdate(id,update,{ 'new': true},function(err,doc) {
if (err) {
callback(err);
} else if(doc){
callback(null,doc);
}
});
};
I was getting an error like:
MongoError: exception: Cannot update 'profile.experience' and 'profile.experience' at the same time

I found this explanation:
The issue is that MongoDB doesn’t allow multiple operations on the
same property in the same update call. This means that the two
operations must happen in two individually atomic operations.
You can also read these posts:
Pull and addtoset at the same time with mongo
multiple mongo update operator in a single statement?
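A minimal sketch of the "two separate operations" idea could look like the following. The helper name replaceExperience is made up, and collection is assumed to be any object exposing updateOne(filter, update) that returns a promise, as the Node.js driver's collections and Mongoose models do.

```javascript
// Sketch only: `collection` is assumed to expose updateOne(filter, update)
// returning a promise. replaceExperience is a hypothetical helper name.
async function replaceExperience(collection, id, toRemove, toAdd) {
  // First update: remove the old entry.
  await collection.updateOne({ _id: id }, { $pull: { "profile.experience": toRemove } });
  // Second update: append the new entry. Each call is atomic on its own,
  // but another writer could modify the document between the two.
  await collection.updateOne({ _id: id }, { $push: { "profile.experience": toAdd } });
}
```

Because the pair is not atomic as a whole, this only fits cases where a brief intermediate state is acceptable.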

If you need to replace one array value with another, you can use arrayFilters in the update
(present at least in MongoDB 4.2.1; arrayFilters was introduced in 3.6).
db.your_collection.update(
{ "_id": ObjectId("your_24_byte_length_id") },
{ "$set": { "profile.experience.$[elem]": "new_value" } },
{ "arrayFilters": [ { "elem": { "$eq": "old_value" } } ], "multi": true }
)
This will replace all "old_value" array elements with "new_value".

Starting from MongoDB 4.2, you can update the array using an aggregation pipeline.
this.updateOne(
{ _id: id },
[
{
$set: {
"profile.experience": {
$concatArrays: [
{
$filter: {
input: "$profile.experience",
cond: { $ne: ["$$this", delet] },
},
},
[xper],
],
},
},
},
]
);
The following Mongo Playground demonstrates it:
https://mongoplayground.net/p/m1C1LnHc0Ge
Note: this is not possible with a regular (non-pipeline) update query.
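As a rough illustration of what the $filter plus $concatArrays expression computes, here is the same transformation in plain JavaScript (not MongoDB code). The helper name pullThenPush and the sample values are hypothetical, and the sketch assumes the array holds primitive values, since JavaScript compares objects by reference.

```javascript
// Plain-JS equivalent of the $filter + $concatArrays expression; not MongoDB code.
// pullThenPush is a hypothetical helper name.
function pullThenPush(experience, delet, xper) {
  // Drop every element equal to delet, then append xper at the end.
  return experience.filter((item) => item !== delet).concat([xper]);
}

console.log(pullThenPush(["intern", "junior", "intern"], "intern", "senior"));
// [ 'junior', 'senior' ]
```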

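Since MongoDB 4.2, findAndModify accepts an aggregation pipeline as the update, which allows atomically moving elements between arrays within the same document. findAndModify can also return the modified document (necessary to see which array elements were actually moved). Note that the array form of $first used in Examples 2.a and 2.b requires MongoDB 4.4 or later; on 4.2, { $arrayElemAt: [ "$A", 0 ] } is an equivalent.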
The following includes examples of:
moving all elements from one array onto the end of a different array
"pop" one element of an array and "push" it to another array
To run the examples, you will need the following data:
db.test.insertMany( [
{
"_id": ObjectId("6d792d6a756963792d696441"),
"A": [ "8", "9" ],
"B": [ "7" ]
},
{
"_id": ObjectId("6d792d6a756963792d696442"),
"A": [ "1", "2", "3", "4" ],
"B": [ ]
}
]);
Example 1 - Empty array A by moving it into array B:
db.test.findAndModify({
query: { _id: ObjectId("6d792d6a756963792d696441") },
update: [
{ $set: { "B": { $concatArrays: [ { $ifNull: [ "$B", [] ] }, "$A" ] } } },
{ $set: { "A": [] } }
],
new: true
});
Resulting in:
{
"_id": {
"$oid": "6d792d6a756963792d696441"
},
"A": [],
"B": [
"7",
"8",
"9"
]
}
Example 2.a - Pop element from array A and push it onto array B
db.test.findAndModify({
query: { _id: ObjectId("6d792d6a756963792d696442"),
"A": {$exists: true, $type: "array", $ne: [] }},
update: [
{ $set: { "B": { $concatArrays: [ { $ifNull: [ "$B", [] ] }, [ { $first: "$A" } ] ] } } },
{ $set: { "A": { $slice: ["$A", 1, {$max: [{$subtract: [{ $size: "$A"}, 1]}, 1]}] } }}
],
new: true
});
Resulting in:
{
"_id": {
"$oid": "6d792d6a756963792d696442"
},
"A": [
"2",
"3",
"4"
],
"B": [
"1"
]
}
Example 2.b - Pop element from array A and push it onto array B but in two steps with a temporary placeholder:
db.test.findAndModify({
query: { _id: ObjectId("6d792d6a756963792d696442"),
"temp": { $exists: false } },
update: [
{ $set: { "temp": { $first: "$A" } } },
{ $set: { "A": { $slice: ["$A", 1, {$max: [{$subtract: [{ $size: "$A"}, 1]}, 1]}] } }}
],
new: true
});
// do what you need to do with "temp"
db.test.findAndModify({
query: { _id: ObjectId("6d792d6a756963792d696442"),
"temp": { $exists: true } },
update: [
{ $set: { "B": { $concatArrays: [ { $ifNull: [ "$B", [] ] }, [ "$temp" ] ] } } },
{ $unset: "temp" }
],
new: true
});

How to select only not null values when aggregating with first or last in mongodb?

My data represents a dictionary that receives a bunch of updates and potentially new fields (metadata being added to a post). So something like:
> db.collection.find()
{ _id: ..., 'A': 'apple', 'B': 'banana' },
{ _id: ..., 'A': 'artichoke' },
{ _id: ..., 'B': 'blueberry' },
{ _id: ..., 'C': 'cranberry' }
The challenge - I want to find the first (or last) value for each key ignoring blank values (i.e. I want some kind of conditional group by that works at a field not document level). (Equivalent to the starting or ending version of the metadata after updates).
The problem is that:
db.collection.aggregate([
{ $group: {
_id: null,
A: { $last: '$A' },
B: { $last: '$B' },
C: { $last: '$C' }
}}
])
fills in the blanks with nulls (rather than skipping them in the result), so I get:
{ '_id': ..., 'A': null, 'B': null, 'C': 'cranberry' }
when I want:
{ '_id': ..., 'A': 'artichoke', 'B': 'blueberry', 'C': 'cranberry' }

I don't think this is what you really want, but it does solve the problem you are asking. The aggregation framework cannot really do this, as you are asking for "last results" of different columns from different documents. There is really only one way to do this and it is pretty insane:
db.collection.aggregate([
{ "$group": {
"_id": null,
"A": { "$push": "$A" },
"B": { "$push": "$B" },
"C": { "$push": "$C" }
}},
{ "$unwind": "$A" },
{ "$group": {
"_id": null,
"A": { "$last": "$A" },
"B": { "$last": "$B" },
"C": { "$last": "$C" }
}},
{ "$unwind": "$B" },
{ "$group": {
"_id": null,
"A": { "$last": "$A" },
"B": { "$last": "$B" },
"C": { "$last": "$C" }
}},
{ "$unwind": "$C" },
{ "$group": {
"_id": null,
"A": { "$last": "$A" },
"B": { "$last": "$B" },
"C": { "$last": "$C" }
}},
])
Essentially you compact down the documents pushing all of the found elements into arrays. Then each array is unwound and the $last element is taken from there. You need to do this for each field in order to get the last element of each array, which was the last match for that field.
Not really good, and certain to exceed the 16MB BSON limit on any meaningful collection.
So what you are really after is looking for a "last seen" value for each field. You could brute force this by iterating the collection and keeping values that are not null. You can even do this on the server like this with mapReduce:
db.collection.mapReduce(
function () {
if (start == 0)
emit( 1, "A" );
start++;
current = this;
Object.keys(store).forEach(function(key) {
if ( current.hasOwnProperty(key) )
store[key] = current[key];
});
},
function(){},
{
"scope": { "start": 0, "store": { "A": null, "B": null, "C": null } },
"finalize": function(){ return store },
"out": { "inline": 1 }
}
)
That will work as well, but iterating the whole collection is nearly as bad as mashing everything together with aggregate.
What you really want in this case is three queries, ideally in parallel, to just get the discrete value last seen for each property:
> db.collection.find({ "A": { "$exists": true } }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e7"), "A" : "artichoke" }
> db.collection.find({ "B": { "$exists": true } }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e8"), "B" : "blueberry" }
> db.collection.find({ "C": { "$exists": true } }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e9"), "C" : "cranberry" }
Actually, even better is to create a sparse index on each property and query via $gt and a blank string. This makes sure an index is used, and as a sparse index it will only contain documents where the property is present. You'll need to .hint() this, but you still want $natural ordering for the sort:
db.collection.createIndex({ "A": -1 },{ "sparse": true })
db.collection.createIndex({ "B": -1 },{ "sparse": true })
db.collection.createIndex({ "C": -1 },{ "sparse": true })
> db.collection.find({ "A": { "$gt": "" } }).hint({ "A": -1 }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e7"), "A" : "artichoke" }
> db.collection.find({ "B": { "$gt": "" } }).hint({ "B": -1 }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e8"), "B" : "blueberry" }
> db.collection.find({ "C": { "$gt": "" } }).hint({ "C": -1 }).sort({ "$natural": -1 }).limit(1)
{ "_id" : ObjectId("54b319cd6997a054ce4d71e9"), "C" : "cranberry" }
That's the best way to solve what you are saying here. But as I said, this is how you think you need to solve it. Your real problem likely has another way to approach both storing and querying.

Starting with MongoDB 3.6, for those using $first or $last as a way to get one value from grouped records (not necessarily the actual first or last), $group's $mergeObjects can be used as a way to find a non-null value from grouped items:
// { "A" : "apple", "B" : "banana" }
// { "A" : "artichoke" }
// { "B" : "blueberry" }
// { "C" : "cranberry" }
db.collection.aggregate([
{ $group: {
_id: null,
A: { $mergeObjects: { a: "$A" } },
B: { $mergeObjects: { b: "$B" } },
C: { $mergeObjects: { c: "$C" } }
}}
])
// { _id: null, A: { a: "artichoke" }, B: { b: "blueberry" }, C: { c: "cranberry" } }
$mergeObjects accumulates a single object from the grouped records, with values from later records overwriting earlier ones. The thing to note is that a record in which the field is missing contributes nothing for that key, so a value already accumulated is not replaced by a blank. This requires wrapping the accumulated field in an object, hence the "awkward" { a: "$A" }.
If the output format isn't exactly what you expect, one can always use an additional $project stage.
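The merging behaviour can be imitated in plain JavaScript with Object.assign (a client-side analogy, not MongoDB code). The helper name lastPresent is made up, and the sketch assumes that a document lacking the field contributes an empty object to the merge.

```javascript
// Plain-JS analogy for merging one wrapped field across grouped documents.
const docs = [
  { A: "apple", B: "banana" },
  { A: "artichoke" },
  { B: "blueberry" },
  { C: "cranberry" },
];

// For one field, build { key: value } when the field is present and {} when
// it is missing, then merge left to right so later documents win.
function lastPresent(documents, field) {
  const pieces = documents.map((doc) => (field in doc ? { [field]: doc[field] } : {}));
  return Object.assign({}, ...pieces)[field];
}

const result = { A: lastPresent(docs, "A"), B: lastPresent(docs, "B"), C: lastPresent(docs, "C") };
console.log(result); // { A: 'artichoke', B: 'blueberry', C: 'cranberry' }
```

Unwrapping with [field] at the end plays the role of the additional $project stage mentioned above.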

So I've just thought about how to answer this, but would be interested to hear people's opinions on how right/wrong this is. Based on the reply from @NeilLunn I guess I'll hit the BSON limit, making his version better for pulling the data, but it's important to my app that I can run this query in one go. (Perhaps my real problem is the data design).
The problem we have is that in the "group by" we pull in a version of A, B, C for every document. So my solution is to tell the aggregation what fields it should pull in by changing (slightly) the original data structure to tell the engine which keys are in each document:
> db.collection.find()
{ _id: ..., 'A': 'apple', 'B': 'banana', 'Keys': ['A', 'B']},
{ _id: ..., 'A': 'artichoke', 'Keys': ['A']},
{ _id: ..., 'B': 'blueberry', 'Keys': ['B']},
{ _id: ..., 'C': 'cranberry', 'Keys': ['C']}
Now we can $unwind on 'Keys' and then group with 'Keys' as '_id'. Thus:
db.collection.aggregate([
{'$unwind': '$Keys'},
{'$group':
{'_id': '$Keys',
'A': {'$last': '$A'},
'B': {'$last': '$B'},
'C': {'$last': '$C'}
}
}
])
I get back a series of documents with _id equal to the key:
{_id: 'A', 'A': 'artichoke', 'B': null, 'C': null},
{_id: 'B', 'A': null, 'B': 'blueberry', 'C': null},
{_id: 'C', 'A': null, 'B': null, 'C': 'cranberry'}
You can then pull the results you want, knowing that the value for key X is only valid for the result where _id is X.
(Of course the next question is how to reduce this series of documents to one, taking the appropriate field each time)
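One simple way to do that last reduction is client-side, after the aggregation returns. The plain JavaScript sketch below (not MongoDB code) takes the three result documents shown above and keeps, from each, only the field named by its own _id.

```javascript
// Client-side reduction of the per-key results into one object.
const grouped = [
  { _id: "A", A: "artichoke", B: null, C: null },
  { _id: "B", A: null, B: "blueberry", C: null },
  { _id: "C", A: null, B: null, C: "cranberry" },
];

// From each result keep only the field named by its own _id.
const merged = {};
for (const doc of grouped) {
  merged[doc._id] = doc[doc._id];
}
console.log(merged); // { A: 'artichoke', B: 'blueberry', C: 'cranberry' }
```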