MongoDB insert document "or" increment field if exists in array - mongodb

What I'm trying to do is fairly simple. I have an array inside a document:
"tags": [
{
"t" : "architecture",
"n" : 12
},
{
"t" : "contemporary",
"n" : 2
},
{
"t" : "creative",
"n" : 1
},
{
"t" : "concrete",
"n" : 3
}
]
I want to push an array of items to this array, like
["architecture","blabladontexist"]
If an item exists, I want to increment that object's n value (in this case, architecture's),
and if it doesn't, add it as a new item (with n = 0): { "t": "blabladontexist", "n": 0 }
I have tried $addToSet, $set, $inc and upsert: true in many combinations and couldn't get it to work.
How can we do this in MongoDB?

With MongoDB 4.2 and newer, the update method can take either an update document or an aggregation pipeline, in which the following stages can be used:
$addFields and its alias $set
$project and its alias $unset
$replaceRoot and its alias $replaceWith.
Armed with the above, your update can use an aggregation pipeline that overwrites the tags field by concatenating two arrays: the existing tags filtered to drop the entries named in the input list, and the input list mapped to tag documents, with a lookup of each tag's current count inside the map.
To start with, the aggregate expression that filters the tags array uses $filter as follows:
const myTags = ["architecture", "blabladontexist"];
{
"$filter": {
"input": "$tags",
"cond": {
"$not": [
{ "$in": ["$$this.t", myTags] }
]
}
}
}
which produces the filtered array of documents
[
{ "t" : "contemporary", "n" : 2 },
{ "t" : "creative", "n" : 1 },
{ "t" : "concrete", "n" : 3 }
]
Now the second part will be to derive the other array that will be concatenated to the above. This array requires a $map over the myTags input array as
{
"$map": {
"input": myTags,
"in": {
"$cond": {
"if": { "$in": ["$$this", "$tags.t"] },
"then": {
"t": "$$this",
"n": {
"$sum": [
{
"$arrayElemAt": [
"$tags.n",
{ "$indexOfArray": [ "$tags.t", "$$this" ] }
]
},
1
]
}
},
"else": { "t": "$$this", "n": 0 }
}
}
}
}
The above $map loops over the input array and checks whether each element is in the tags array by comparing it against the t property. If it exists, the n field of the resulting subdocument becomes its current n value plus 1, where the current value is expressed with
{
"$arrayElemAt": [
"$tags.n",
{ "$indexOfArray": [ "$tags.t", "$$this" ] }
]
}
Otherwise the default document with an n value of 0 is added.
Your final update operation becomes:
const myTags = ["architecture", "blabladontexist"];
db.getCollection('coll').update(
{ "_id": "1234" },
[
{ "$set": {
"tags": {
"$concatArrays": [
{ "$filter": {
"input": "$tags",
"cond": { "$not": [ { "$in": ["$$this.t", myTags] } ] }
} },
{ "$map": {
"input": myTags,
"in": {
"$cond": [
{ "$in": ["$$this", "$tags.t"] },
{ "t": "$$this", "n": {
"$sum": [
{ "$arrayElemAt": [
"$tags.n",
{ "$indexOfArray": [ "$tags.t", "$$this" ] }
] },
1
]
} },
{ "t": "$$this", "n": 0 }
]
}
} }
]
}
} }
],
{ "upsert": true }
);
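To sanity-check the logic before running the update, the same filter, map and concatenate steps can be mirrored in plain JavaScript. This is only a sketch of what the pipeline computes for one document, not MongoDB code, and mergeTags is a hypothetical helper name.

```javascript
// Plain-JavaScript mirror of the pipeline above (mergeTags is a hypothetical name).
function mergeTags(tags, myTags) {
  // $filter: keep existing tags that are NOT named in the input list.
  const untouched = tags.filter((tag) => !myTags.includes(tag.t));

  // $map: one document per input tag, incremented if present, else n = 0.
  const updated = myTags.map((name) => {
    const index = tags.findIndex((tag) => tag.t === name); // $indexOfArray
    return index >= 0
      ? { t: name, n: tags[index].n + 1 } // $arrayElemAt + $sum with 1
      : { t: name, n: 0 };
  });

  // $concatArrays: untouched tags first, then the touched ones.
  return untouched.concat(updated);
}

const tags = [
  { t: "architecture", n: 12 },
  { t: "contemporary", n: 2 },
  { t: "creative", n: 1 },
  { t: "concrete", n: 3 },
];
console.log(mergeTags(tags, ["architecture", "blabladontexist"]));
```

Note that touched tags move to the end of the array, just as they do in the pipeline, so the original element order is not preserved.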

I don't believe this is possible to do in a single command.
MongoDB doesn't allow a $set (or $setOnInsert) and $inc to affect the same field in a single command.
You'll have to do one update command to attempt to $inc the field, and if that doesn't change any documents (the write result reports 0 modified), do a second update to $set the field to its default value.
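A sketch of that two-step approach for a single tag, assuming mongosh (where updateOne returns a result carrying modifiedCount) and the tags layout from the question. incrementOrAddTag is a hypothetical helper, and the second step uses $push rather than $set because the target here is an array element.

```javascript
// Two-step "increment or add" for one tag (hypothetical helper name).
// `coll` is anything exposing updateOne(filter, update), e.g. db.coll in mongosh.
function incrementOrAddTag(coll, id, tagName) {
  // Step 1: try to increment n on the array element whose t matches.
  const incremented = coll.updateOne(
    { _id: id, "tags.t": tagName },
    { $inc: { "tags.$.n": 1 } }
  );
  if (incremented.modifiedCount > 0) return "incremented";

  // Step 2: nothing was incremented, so push the default element.
  // The $ne guard avoids a duplicate if another writer added the tag meanwhile.
  const pushed = coll.updateOne(
    { _id: id, "tags.t": { $ne: tagName } },
    { $push: { tags: { t: tagName, n: 0 } } }
  );
  return pushed.modifiedCount > 0 ? "added" : "unchanged";
}

// Example call in mongosh (collection name and _id taken from the other answer):
// ["architecture", "blabladontexist"].forEach((t) => incrementOrAddTag(db.coll, "1234", t));
```

The two commands are not atomic as a pair, so this trades the single-statement guarantee of the pipeline update for compatibility with servers older than 4.2.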

Related

Match Document Key Nearest to Search Value

I have the following collection:
{
"_id" : "Stats1",
"minutes" : {
"0" : [
{
"0" : {
"f" : 1,
"t" : 0,
"v" : "0"
}
}
],
"22" : [
{
"2" : "1"
}
],
"29" : [
{
"32" : "2"
}
],
"38" : [
{
"40" : "3"
}
]
}
}
When I try:
db.stats.aggregate()
.project({"_id":"1", "minArray": {"$objectToArray": "$minutes"}})
I get the error message:
"$objectToArray requires a document input, found: array"
And when I try:
db.stats.aggregate()
.project({"_id":"1", "minArray": {"$arrayToObject": "$minutes"}})
I get the error message:
"$arrayToObject requires an array input, found: object"
I would like to get the closest value for a minute equal to or lower than 30:
{ "minute" : "29", "value" : [{ "32" : "2"}] }
So the errors are because without a $match your pipeline is attempting to access other documents which don't have the expected structure. That's really something separate to sort out though.
To actually answer your question in terms of its end objective, you want a pipeline like this:
var _id = "Stats1";
var target = 30;
db.stats.aggregate([
{ "$match": { "_id" : _id } },
{ "$replaceRoot": {
"newRoot": {
"$let": {
"vars": {
"working": {
"$map": {
"input": { "$objectToArray": "$minutes" },
"in": {
"k": { "$toInt": "$$this.k" },
"v": "$$this.v",
"diff": { "$abs": { "$subtract": [ target, { "$toInt": "$$this.k" }] } }
}
}
}
},
"in": {
"$arrayToObject": {
"$map": {
"input": {
"$filter": {
"input": {
"$objectToArray": {
"$arrayElemAt": [
"$$working",
{ "$indexOfArray": [ "$$working.diff", { "$min": "$$working.diff" } ] }
]
}
},
"cond": { "$ne": [ "$$this.k", "diff" ] }
}
},
"in": {
"k": { "$cond": [{ "$eq": [ "$$this.k", "k"] }, "minute", "value" ] },
"v": { "$cond": [{ "$eq": [ "$$this.k", "k"] }, { "$toString": "$$this.v" }, "$$this.v" ] }
}
}
}
}
}
}
}}
])
Which of course returns the wanted output:
{ "minute" : "29", "value" : [ { "32" : "2" } ] }
In sequence you do the $objectToArray as you initially attempted, but then you need that key or "k" value to actually be converted to numeric for comparison. You also need to calculate the difference of that from the value you are searching for, in this case 30. That gives you a "working" copy of the data in array form, which is important for the next input stages.
The next section is best read inwards from the deepest level of indentation to understand the order.
First you want to extract the element from that working array where the difference (using $abs so positive and negative are treated the same) is the minimal value, found with $min. $indexOfArray gives the position of the first match, and $arrayElemAt uses that position to return that single selected element from the working array.
We don't want all the fields in that object, so $objectToArray converts that single object into "k" and "v" paired objects, and $filter then removes the entry whose key is the difference field.
Next you want to rename the fields and change some data formats, so the $map iterates the remaining array (just two entries), assigning readable names and setting the string format for the "minute".
Finally, $arrayToObject turns this back into an object as the final output. Since we want to refer to that "working" array several times, we declare it in $let, which allows us to do that. And since all of that is an expression that outputs the document you want, you wrap it in $replaceRoot, whose single expected argument, newRoot, is exactly such an expression.
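The same sequence can be followed step by step in plain JavaScript. This is a sketch of what the pipeline computes for one document, not MongoDB code; closestMinute is a hypothetical name. Note that the pipeline picks the nearest key in either direction, so the lowerOrEqualOnly flag is an added assumption for the "exact or lower" wording of the question.

```javascript
// Plain-JavaScript mirror of the pipeline (closestMinute is a hypothetical name).
function closestMinute(minutes, target, lowerOrEqualOnly = false) {
  // $objectToArray + $map: numeric key, original value, distance to target.
  let working = Object.entries(minutes).map(([k, v]) => ({
    k: Number.parseInt(k, 10),
    v,
    diff: Math.abs(target - Number.parseInt(k, 10)),
  }));

  // Assumption: optional extra step that is NOT part of the pipeline above.
  if (lowerOrEqualOnly) working = working.filter((entry) => entry.k <= target);
  if (working.length === 0) return null;

  // $min + $indexOfArray + $arrayElemAt: first element with the smallest diff.
  const smallest = Math.min(...working.map((entry) => entry.diff));
  const best = working.find((entry) => entry.diff === smallest);

  // Rename k -> minute (as a string) and v -> value; diff is dropped.
  return { minute: String(best.k), value: best.v };
}

const minutes = {
  "0": [{ "0": { f: 1, t: 0, v: "0" } }],
  "22": [{ "2": "1" }],
  "29": [{ "32": "2" }],
  "38": [{ "40": "3" }],
};
console.log(JSON.stringify(closestMinute(minutes, 30)));
// {"minute":"29","value":[{"32":"2"}]}
```

With a target of 35 the nearest key is "38"; passing true as the third argument restricts the search to keys at or below the target and returns "29" instead.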

Retrieve specific element of a nested document

Just cannot figure this out. This is the document format from a MongoDB of jobs, which is derived from an XML file the layout of which I have no control over:
{
"reference" : [ "93417" ],
"Title" : [ "RN - Pediatric Director of Nursing" ],
"Description" : [ "...a paragraph or two..." ],
"Classifications" : [
{
"Classification" : [
{
"_" : "Nurse / Midwife",
"name" : [ "Category" ]
},
{
"_" : "FL - Jacksonville",
"name" : [ "Location" ],
},
{
"_" : "Permanent / Full Time",
"name" : [ "Work Type" ],
},
{
"_" : "Some Health Care Org",
"name" : [ "Company Name" ],
}
]
}
],
"Apply" : [
{
"EmailTo" : [ "jess#recruiting.co" ]
}
]
}
The intention is to pull a list of jobs from the DB, to include 'Location', which is buried down there as the second document at 'Classifications.Classification._'.
I've tried various 'aggregate' permutations of $project, $unwind, $match, $filter, $group… but I don't seem to be getting anywhere. Experimenting with just retrieving the company name, I was expecting this to work:
db.collection(JOBS_COLLECTION).aggregate([
{ "$project" : { "meta": "$Classifications.Classification" } },
{ "$project" : { "meta": 1, _id: 0 } },
{ "$unwind" : "$meta" },
{ "$match": { "meta.name" : "Company Name" } },
{ "$project" : { "Company" : "$meta._" } },
])
But that pulled everything for every record, thus:
[{
"Company":[
"Nurse / Midwife",
"TX - San Antonio",
"Permanent / Full Time",
"Some Health Care Org"
]
}, { etc etc }]
What am I missing, or misusing?
Ideally with MongoDB 3.4 available you would simply $project and use the array operators $map, $filter and $reduce: the latter to "compact" the arrays and the former two to extract the relevant element and detail. $arrayElemAt then takes just the single "element" from the resulting array:
db.collection(JOBS_COLLECTION).aggregate([
{ "$match": { "Classifications.Classification.name": "Location" } },
{ "$project": {
"_id": 0,
"output": {
"$arrayElemAt": [
{ "$map": {
"input": {
"$filter": {
"input": {
"$reduce": {
"input": "$Classifications.Classification",
"initialValue": [],
"in": {
"$concatArrays": [ "$$value", "$$this" ]
}
}
},
"as": "c",
"cond": { "$eq": [ "$$c.name", ["Location"] ] }
}
},
"as": "c",
"in": "$$c._"
}},
0
]
}
}}
])
Or even skip the $reduce which is merely applying the $concatArrays to "merge" and simply grab the "first" array index ( since there is only one ) using $arrayElemAt:
db.collection(JOBS_COLLECTION).aggregate([
{ "$match": { "Classifications.Classification.name": "Location" } },
{ "$project": {
"_id": 0,
"output": {
"$arrayElemAt": [
{ "$map": {
"input": {
"$filter": {
"input": { "$arrayElemAt": [ "$Classifications.Classification", 0 ] },
"as": "c",
"cond": { "$eq": [ "$$c.name", ["Location"] ] }
}
},
"as": "c",
"in": "$$c._"
}},
0
]
}
}}
])
That makes the operation compatible with MongoDB 3.2, which you "should" be running at least.
Which in turn allows you to consider alternate syntax for MongoDB 3.4 using $indexOfArray based on the initial input variable of the "first" array index using $let to somewhat shorten the syntax:
db.collection(JOBS_COLLECTION).aggregate([
{ "$match": { "Classifications.Classification.name": "Location" } },
{ "$project": {
"_id": 0,
"output": {
"$let": {
"vars": {
"meta": {
"$arrayElemAt": [
"$Classifications.Classification",
0
]
}
},
"in": {
"$arrayElemAt": [
"$$meta._",
{ "$indexOfArray": [
"$$meta.name", [ "Location" ]
]}
]
}
}
}
}}
])
If indeed you consider that to be "shorter", that is.
In the other sense though, much like above there is an "array inside an array", so in order to process it you $unwind twice, which is effectively what the $concatArrays inside $reduce is countering in the ideal case:
db.collection(JOBS_COLLECTION).aggregate([
{ "$match": { "Classifications.Classification.name": "Location" } },
{ "$unwind": "$Classifications" },
{ "$unwind": "$Classifications.Classification" },
{ "$match": { "Classifications.Classification.name": "Location" } },
{ "$project": { "_id": 0, "output": "$Classifications.Classification._" } }
])
All statements actually produce:
{
"output" : "FL - Jacksonville"
}
Which is the matching value of "_" in the inner array element for the "Location" as selected by your original intent.
Keeping in mind of course that all statements really should be preceded with the relevant $match statement as shown:
{ "$match": { "Classifications.Classification.name": "Location" } },
Since without that you would be possibly processing documents unnecessarily, which did not actually contain an array element matching that condition. Of course this may not be the case due to the nature of the documents, but it's generally good practice to make sure the "initial" selection always matches the conditions of details you later intend to "extract".
All of that said, even if this is the result of a direct import from XML, the structure should be changed since it does not efficiently present itself for queries. MongoDB documents do not work how XPATH does in terms of issuing queries. Therefore anything "XML Like" is not going to be a good structure, and if the "import" process cannot be changed to a more accommodating format, then there should at least be a "post process" to manipulate this into a separate storage in a more usable form.
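As an illustration of such a "post process" step, here is a minimal sketch in plain JavaScript that flattens one imported document. The function name flattenJob and the output field names (title, description, classifications) are hypothetical choices, not something the import format prescribes.

```javascript
// Hypothetical post-processing step: flatten the XML-derived layout.
function flattenJob(doc) {
  // Every value arrives wrapped in a one-element array; unwrap defensively.
  const first = (value) => (Array.isArray(value) ? value[0] : value);

  // Merge the "array inside an array" into one flat list of classifications.
  const entries = (doc.Classifications || []).flatMap(
    (group) => group.Classification || []
  );

  // Turn [{ _: "FL - Jacksonville", name: ["Location"] }, ...] into a lookup.
  const classifications = {};
  for (const entry of entries) {
    classifications[first(entry.name)] = entry._;
  }

  return {
    reference: first(doc.reference),
    title: first(doc.Title),
    description: first(doc.Description),
    classifications,
  };
}

const job = {
  reference: ["93417"],
  Title: ["RN - Pediatric Director of Nursing"],
  Description: ["...a paragraph or two..."],
  Classifications: [
    {
      Classification: [
        { _: "Nurse / Midwife", name: ["Category"] },
        { _: "FL - Jacksonville", name: ["Location"] },
        { _: "Permanent / Full Time", name: ["Work Type"] },
        { _: "Some Health Care Org", name: ["Company Name"] },
      ],
    },
  ],
};
console.log(flattenJob(job).classifications.Location); // FL - Jacksonville
```

Stored in this shape, the location becomes an ordinary dot-notation path ("classifications.Location") that needs no array operators to read.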

Difference between two value in embedded document

This is my collection:
{
"Id" : "001",
"Data":[{
"updatedTime" : 1483209005,
"value" : 35
},
{
"updatedTime" : 1483209005,
"value" : 20
}
]
}
This is what I tried:
db.A.aggregate([
{ "$group": {
"_id": "$Id",
"Difference": {
"$sum": {
"$cond": [
{ "$eq": [ "Data.$.value", 35.0 ] },
"$updatedTime",
{ "$cond": [
{ "$eq": [ "Data.$.value", 20.0 ] },
{ "$subtract": [ 0, "$updatedTime" ] },
0
]}
]
}
}
}}
])
But I get output like this:
{
"_id" : "001",
"Difference" : 0.0
}
I need to find the difference between the two updatedTime fields in the Data array. How do I do that?
You need to assign each element of your array to a variable using the $let operator so that you can access the subdocument field with "dot notation". To get the first and second elements, simply use the $arrayElemAt operator.
In the "in" expression you $subtract the two values and return the absolute value using the $abs operator.
db.collection.aggregate([
{ "$group": {
"_id": "$Id",
"Difference": {
"$sum": {
"$let": {
"vars": {
"first": { "$arrayElemAt": [ "$Data", 0 ] },
"second": { "$arrayElemAt": [ "$Data", 1 ] }
},
"in": {
"$abs": {
"$subtract": [
"$$first.updatedTime",
"$$second.updatedTime"
]
}
}
}
}
}
}}
])
Since, as you said, it will always have two elements, you can do something like this:
db.getCollection('test').aggregate([{$project: {diff: {$abs: {$subtract: [{$arrayElemAt: ['$Data.value', 0]}, {$arrayElemAt: ['$Data.value', 1]}]}}}}])
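For reference, the arithmetic both answers perform on a single document can be written out in plain JavaScript. This is a sketch, not MongoDB code; updatedTimeDifference is a hypothetical name, and the second document below uses an altered timestamp purely for illustration.

```javascript
// Plain-JavaScript mirror of the $arrayElemAt / $subtract / $abs chain,
// applied to updatedTime (updatedTimeDifference is a hypothetical name).
function updatedTimeDifference(doc) {
  const [first, second] = doc.Data || [];
  // Assumption: report null when fewer than two elements are present.
  if (!first || !second) return null;
  return Math.abs(first.updatedTime - second.updatedTime);
}

// The document from the question: both timestamps are identical.
const original = {
  Id: "001",
  Data: [
    { updatedTime: 1483209005, value: 35 },
    { updatedTime: 1483209005, value: 20 },
  ],
};
// Illustrative variant with the second timestamp moved 60 seconds later.
const shifted = {
  Id: "002",
  Data: [
    { updatedTime: 1483209005, value: 35 },
    { updatedTime: 1483209065, value: 20 },
  ],
};
console.log(updatedTimeDifference(original)); // 0
console.log(updatedTimeDifference(shifted)); // 60
```

With the sample data from the question the result is 0 because both updatedTime values are the same, which is worth keeping in mind when checking the pipeline output.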

Count how many documents contain a field

I have these three MongoDB documents:
{
"_id" : ObjectId("571094afc2bcfe430ddd0815"),
"name" : "Barry",
"surname" : "Allen",
"address" : [
{
"street" : "Red",
"number" : NumberInt(66),
"city" : "Central City"
},
{
"street" : "Yellow",
"number" : NumberInt(7),
"city" : "Gotham City"
}
]
}
{
"_id" : ObjectId("57109504c2bcfe430ddd0816"),
"name" : "Oliver",
"surname" : "Queen",
"address" : {
"street" : "Green",
"number" : NumberInt(66),
"city" : "Star City"
}
}
{
"_id" : ObjectId("5710953ac2bcfe430ddd0817"),
"name" : "Tudof",
"surname" : "Unknown",
"address" : "homeless"
}
The address field is an Array of Objects in the first document, an Object in the second and a String in the third.
My goal is to find how many documents in my collection contain the field address.street. In this case the right count is 1, but with my query I get two:
db.coll.find({"address.street":{"$exists":1}}).count()
I also tried map/reduce. It works but it is slower; so if it is possible, I would avoid it.
The distinction here is that the .count() operation is actually "correct" in returning the "document" count where the field is present. So the general considerations break down to:
If you just want to exclude the documents with the array field
The most effective way of excluding those documents where "street" is a property of an "address" that is an "array" is to use the dot-notation form of testing that the 0 index does not exist:
db.coll.find({
"address.street": { "$exists": true },
"address.0": { "$exists": false }
}).count()
As a natively coded operator, $exists does the job correctly and efficiently in both tests.
If you intended to count field occurrences
This applies if what you are actually asking for is the "field count", where some "documents" contain array entries in which that "field" may be present several times.
For that you need the aggregation framework or mapReduce like you mention. MapReduce uses JavaScript based processing and is therefore going to be considerably slower than the .count() operation. The aggregation framework also needs to calculate and "will" be slower than .count(), but not by as much as mapReduce.
In MongoDB 3.2 you get some help here from the expanded ability of $sum to work on an array of values as well as being a grouping accumulator. The other helper here is $isArray, which allows a different processing method via $map when the data is in fact "an array":
db.coll.aggregate([
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$sum": {
"$cond": {
"if": { "$isArray": "$address" },
"then": {
"$map": {
"input": "$address",
"as": "el",
"in": {
"$cond": {
"if": { "$ifNull": [ "$$el.street", false ] },
"then": 1,
"else": 0
}
}
}
},
"else": {
"$cond": {
"if": { "$ifNull": [ "$address.street", false ] },
"then": 1,
"else": 0
}
}
}
}
}
}
}}
])
Earlier versions hinge on a bit more conditional processing in order to treat the array and non-array data differently, and generally require $unwind to process array entries.
Either transposing the array via $map with MongoDB 2.6:
db.coll.aggregate([
{ "$project": {
"address": {
"$cond": {
"if": { "$ifNull": [ "$address.0", false ] },
"then": "$address",
"else": {
"$map": {
"input": ["A"],
"as": "el",
"in": "$address"
}
}
}
}
}},
{ "$unwind": "$address" },
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$cond": {
"if": { "$ifNull": [ "$address.street", false ] },
"then": 1,
"else": 0
}
}
}
}}
])
Or providing conditional selection with MongoDB 2.2 or 2.4:
db.coll.aggregate([
{ "$group": {
"_id": "$_id",
"address": {
"$first": {
"$cond": [
{ "$ifNull": [ "$address.0", false ] },
"$address",
{ "$const": [null] }
]
}
},
"other": {
"$push": {
"$cond": [
{ "$ifNull": [ "$address.0", false ] },
null,
"$address"
]
}
},
"has": {
"$first": {
"$cond": [
{ "$ifNull": [ "$address.0", false ] },
1,
0
]
}
}
}},
{ "$unwind": "$address" },
{ "$unwind": "$other" },
{ "$group": {
"_id": null,
"count": {
"$sum": {
"$cond": [
{ "$eq": [ "$has", 1 ] },
{ "$cond": [
{ "$ifNull": [ "$address.street", false ] },
1,
0
]},
{ "$cond": [
{ "$ifNull": [ "$other.street", false ] },
1,
0
]}
]
}
}
}}
])
So the latter form "should" perform a bit better than mapReduce, but probably not by much.
In all cases the logic falls to using $ifNull as the "logical" form of $exists for the aggregation framework. Paired with $cond, a "truthy" result is obtained when the property actually exists, and a false value is returned when it does not. This determines whether 1 or 0 respectively is returned to the overall accumulation via $sum.
Ideally you have the modern version that can do this in a single $group pipeline stage, but otherwise you need the longer path.
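To make the two different counts concrete, here is a plain-JavaScript sketch run against the three documents from the question (with _id omitted). It only mirrors the logic of the queries above; the helper names are hypothetical.

```javascript
// Plain-JavaScript mirror of both counts (helper names are hypothetical).
const hasStreet = (value) =>
  value !== null && typeof value === "object" && value.street != null;

// Documents where address is a single object with a street
// (address.street exists AND address.0 does not).
function countDocsWithObjectStreet(docs) {
  return docs.filter(
    (doc) => !Array.isArray(doc.address) && hasStreet(doc.address)
  ).length;
}

// Total occurrences of street, counting every array element separately.
function countStreetOccurrences(docs) {
  return docs.reduce((total, doc) => {
    if (Array.isArray(doc.address)) {
      return total + doc.address.filter(hasStreet).length; // $map + inner $sum
    }
    return total + (hasStreet(doc.address) ? 1 : 0); // non-array branch
  }, 0);
}

const people = [
  {
    name: "Barry",
    address: [
      { street: "Red", number: 66, city: "Central City" },
      { street: "Yellow", number: 7, city: "Gotham City" },
    ],
  },
  { name: "Oliver", address: { street: "Green", number: 66, city: "Star City" } },
  { name: "Tudof", address: "homeless" },
];
console.log(countDocsWithObjectStreet(people)); // 1
console.log(countStreetOccurrences(people)); // 3
```

The first count matches the "right count" of 1 from the question, while the second shows the per-occurrence total that the aggregation forms compute.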
Can you try this:
db.getCollection('collection_name').find({
"address.street":{"$exists":1},
"$where": "Array.isArray(this.address) == false && typeof this.address === 'object'"
});
In the $where clause, we exclude documents where address is an array and include those where its type is object.

How to find document and single subdocument matching given criterias in MongoDB collection

I have collection of products. Each product contains array of items.
> db.products.find().pretty()
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "01.02.2100",
"purchasePrice" : 1,
"sellingPrice" : 10,
"count" : 15
},
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
So, can you please advise how I can query MongoDB to retrieve all products, each with only the single item whose date equals the date I pass to the query as a parameter?
The result for "31.08.2014" must be:
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
What you are looking for is the positional $ operator and "projection". For a single field you need to match the required array element using "dot notation", for more than one field use $elemMatch:
db.products.find(
{ "items.date": "31.08.2014" },
{ "shop": 1, "name":1, "items.$": 1 }
)
Or the $elemMatch for more than one matching field:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ "shop": 1, "name":1, "items.$": 1 }
)
These work for a single array element only though and only one will be returned. If you want more than one array element to be returned from your conditions then you need more advanced handling with the aggregation framework.
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$unwind": "$items" },
{ "$match": { "items.date": "31.08.2014" } },
{ "$group": {
"_id": "$_id",
"shop": { "$first": "$shop" },
"name": { "$first": "$name" },
"items": { "$push": "$items" }
}}
])
Or possibly in shorter/faster form since MongoDB 2.6 where your array of items contains unique entries:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$project": {
"shop": 1,
"name": 1,
"items": {
"$setDifference": [
{ "$map": {
"input": "$items",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.date", "31.08.2014" ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
Or possibly with $redact, but a little contrived:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$redact": {
"$cond": [
{ "$eq": [ { "$ifNull": [ "$date", "31.08.2014" ] }, "31.08.2014" ] },
"$$DESCEND",
"$$PRUNE"
]
}}
])
In more modern versions, you would use $filter:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$addFields": {
"items": {
"$filter": {
"input": "$items",
"cond": { "$eq": [ "$$this.date", "31.08.2014" ] }
}
}
}}
])
And with multiple conditions, use $elemMatch in the $match and $and within the $filter:
db.products.aggregate([
{ "$match": {
"items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}
}},
{ "$addFields": {
"items": {
"$filter": {
"input": "$items",
"cond": {
"$and": [
{ "$eq": [ "$$this.date", "31.08.2014" ] },
{ "$eq": [ "$$this.purchasePrice", 1 ] }
]
}
}
}
}}
])
So which approach is better just depends on whether you always expect a single element to match or multiple elements. Where possible the .find() method will generally be faster since it lacks the overhead of the other operations, although the last two forms do not lag that far behind at all.
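The difference between the single-match and multiple-match approaches can be seen in a plain-JavaScript sketch. This is not MongoDB code; the helper names are hypothetical, and the sample product has a second item with the same date added purely for illustration.

```javascript
// Plain-JavaScript contrast of the two behaviours (helper names are hypothetical).
const matches = (item, criteria) =>
  Object.entries(criteria).every(([key, value]) => item[key] === value);

// Like the positional $ projection: at most one element, the first match.
// (find() would not return a non-matching product at all; here the list is just empty.)
function firstMatchingItem(product, criteria) {
  const hit = product.items.find((item) => matches(item, criteria));
  return { ...product, items: hit ? [hit] : [] };
}

// Like $filter: every element that satisfies the condition.
function allMatchingItems(product, criteria) {
  return {
    ...product,
    items: product.items.filter((item) => matches(item, criteria)),
  };
}

const product = {
  shop: "shop1",
  name: "product1",
  items: [
    { date: "01.02.2100", purchasePrice: 1, sellingPrice: 10, count: 15 },
    { date: "31.08.2014", purchasePrice: 10, sellingPrice: 1, count: 5 },
    // Illustrative extra entry sharing the same date:
    { date: "31.08.2014", purchasePrice: 1, sellingPrice: 2, count: 7 },
  ],
};
console.log(firstMatchingItem(product, { date: "31.08.2014" }).items.length); // 1
console.log(allMatchingItems(product, { date: "31.08.2014" }).items.length); // 2
```

Passing { date: "31.08.2014", purchasePrice: 1 } as the criteria corresponds to the $elemMatch and $and forms, where every condition must hold on the same array element.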
As a side note, your "dates" are represented as strings which is not a very good idea going forward. Consider changing these to proper Date object types, which will greatly help you in the future.
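A minimal sketch of such a conversion, assuming the strings are always in DD.MM.YYYY form as in the sample data and that midnight UTC is an acceptable time component. parseDotDate is a hypothetical helper name.

```javascript
// Hypothetical helper: convert "DD.MM.YYYY" strings into Date objects (UTC midnight).
function parseDotDate(text) {
  const match = /^(\d{2})\.(\d{2})\.(\d{4})$/.exec(text);
  if (!match) throw new Error(`Unexpected date format: ${text}`);
  const [, day, month, year] = match.map(Number);

  const date = new Date(Date.UTC(year, month - 1, day));
  // Reject impossible dates such as 31.02.2014, which Date would silently roll over.
  if (date.getUTCDate() !== day || date.getUTCMonth() !== month - 1) {
    throw new Error(`Invalid calendar date: ${text}`);
  }
  return date;
}

console.log(parseDotDate("31.08.2014").toISOString()); // 2014-08-31T00:00:00.000Z
```

Once the values are stored as Date objects, range conditions with $gte and $lt sort and compare chronologically, which the string form cannot do.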
Based on Neil Lunn's code I work with this solution, it includes automatically all first level keys (but you could also exclude keys if you want):
db.products.find(
{ "items.date": "31.08.2014" },
{ items: { $elemMatch: { date: "31.08.2014" } } }
)
With multiple requirements:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ items: { $elemMatch: { "date": "31.08.2014", "purchasePrice": 1 } } },
)
Mongo supports dot notation for sub-queries.
See: http://docs.mongodb.org/manual/reference/glossary/#term-dot-notation
Depending on your driver, you want something like:
db.products.find({"items.date":"31.08.2014"});
Note that the attribute is in quotes for dot notation, even if usually your driver doesn't require this.