I am trying to count word usage using MongoDB. My collection currently looks like this:
{'_id':###, 'username':'Foo', words:[{'word':'foo', 'count':1}, {'word':'bar', 'count':1}]}
When a new post is made, I extract all the new words into an array, but I'm trying to figure out how to upsert into the words array and increment the count if the word already exists.
For instance, if the user "Foo" posted "lorem ipsum foo", I'd add "lorem" and "ipsum" to the user's words array but increment the count for "foo".
Is this possible in one query? Currently I am using addToSet:
'$addToSet':{'words':{'$each':word_array}}
But that doesn't seem to offer any way of increasing the words count.
Would very much appreciate some help :)
If you're willing to switch from a list to hash (object), you can atomically do this.
From the docs: "$inc ... increments field by the number value if field is present in the object, otherwise sets field to the number value."
{ $inc : { field : value } }
So, if you could refactor your container and object:
words: [
{
'word': 'foo',
'count': 1
},
...
]
to:
words: {
'foo': 1,
'other_word': 2,
...
}
you could use the operation update with:
{ $inc: { 'words.foo': 1 } }
which would create { 'foo': 1 } if 'foo' doesn't exist, else increment foo.
E.g.:
$ db.bar.insert({ id: 1, words: {} });
$ db.bar.find({ id: 1 })
[
{ ..., "words" : { }, "id" : 1 }
]
$ db.bar.update({ id: 1 }, { $inc: { 'words.foo': 1 } });
$ db.bar.find({ id: 1 })
[
{ ..., "id" : 1, "words" : { "foo" : 1 } }
]
$ db.bar.update({ id: 1 }, { $inc: { 'words.foo': 1 } });
$ db.bar.find({ id: 1 })
[
{ ..., "id" : 1, "words" : { "foo" : 2 } }
]
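To cover the original case of several words per post, the $inc document can be built from the word array in one go. This is only a sketch under the refactored schema above; buildWordIncrement is a made-up helper name, and it assumes the words contain no "." and do not start with "$", since those characters are special in field paths.

```javascript
// Build a single $inc update from the words of a new post.
// Assumes the refactored schema: "words" is an object keyed by word.
// buildWordIncrement is a hypothetical helper, not part of MongoDB.
function buildWordIncrement(wordArray) {
  const inc = {};
  for (const word of wordArray) {
    const path = "words." + word; // e.g. "words.foo"
    inc[path] = (inc[path] || 0) + 1; // repeated words in one post add up
  }
  return { $inc: inc };
}

const wordUpdate = buildWordIncrement(["lorem", "ipsum", "foo"]);
console.log(JSON.stringify(wordUpdate));

// Intended use in the shell (not run here):
// db.bar.update({ id: 1 }, wordUpdate);
```

Because everything ends up in one update on one document, new words are created and existing ones incremented atomically.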
Unfortunately, it is not possible to do this in a single update with your schema. Your schema is a bit questionable and should probably be converted to use a dedicated collection of word counters, e.g.:
db.users {_id:###, username:'Foo'}
db.words.counters {_id:###, word:'Word', userId: ###, count: 1}
That will avoid quite a few issues, such as:
Running into maximum document size limits
Forcing mongo to keep moving around your documents as you increase their size
Both scenarios require two updates to do what you want, which introduces atomicity issues. Updating per word by looping through word_array is better and safer (and is possible with both solutions).
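A rough sketch of the per-word approach with the dedicated counters collection: one upserting updateOne per distinct word, collected so they can be sent together. buildCounterOps is a hypothetical helper, and the field names (userId, word, count) are taken from the example documents above.

```javascript
// Turn the words of one post into per-word upsert operations.
// buildCounterOps is a hypothetical helper; field names follow the
// db.words.counters example above.
function buildCounterOps(userId, wordArray) {
  const counts = new Map();
  for (const word of wordArray) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return Array.from(counts, ([word, n]) => ({
    updateOne: {
      filter: { userId: userId, word: word },
      update: { $inc: { count: n } }, // creates count on insert, increments otherwise
      upsert: true
    }
  }));
}

const counterOps = buildCounterOps(42, ["lorem", "ipsum", "foo", "foo"]);
console.log(JSON.stringify(counterOps, null, 1));

// Intended use in the shell (not run here):
// db.words.counters.bulkWrite(counterOps);
```

Each operation is atomic for its own counter document; the batch as a whole is not.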
I have an old list of products that store the descriptions in an array at index [0]. The model is set up as a string. So my plan is to extract that value and add it to a temporary field. Step 2 would be to take that value and copy it to the original field.
This is the 'wrong' product I want to fix.
{_id : 1 , pDescription : ['great product']},
{_id : 2 , pDescription : ['another product']}
All I want is to change the array to a string like this:
{_id : 1 , pDescription : 'great product'},
{_id : 2 , pDescription : 'another product'}
I have tried this to create the temporary description:
Products.aggregate([
{
$match: {
pDescription: {
$type: "array"
}
}
},
{
$set: {
pDescTemp: {
$first: "$pDescription"
}
}
}
]).exec((err, r) => {
// do stuff here
});
The command works fine without the $first command.
The error reads: MongoError: Unrecognized expression '$first'
Any tips on how to fix this are appreciated!
Thanks!
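The $first array operator only exists from MongoDB 4.4 onward, which is why older servers report it as unrecognized; $arrayElemAt does the same job there. I believe this is what you need to update your pDescription field to be equal to the first element of the array already stored as pDescription (updates that take a pipeline need MongoDB 4.2 or later; the filter skips documents that are already strings, since $arrayElemAt fails on non-arrays):
db.Products.updateMany({ pDescription: { $type: "array" } },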
[
{
$set: {
pDescription: {
$arrayElemAt: [
"$pDescription",
0
]
}
}
}
])
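To sanity-check the effect before touching real data, the same transformation can be mimicked in plain JavaScript on the sample documents. This is just an illustration: firstOfDescription is a made-up helper, and the third document is a hypothetical one that is already a string.

```javascript
// Plain-JS stand-in for { $arrayElemAt: ["$pDescription", 0] },
// applied only when pDescription is an array.
// firstOfDescription is a hypothetical helper for illustration.
function firstOfDescription(doc) {
  if (!Array.isArray(doc.pDescription)) {
    return doc; // already a string: leave untouched
  }
  return { ...doc, pDescription: doc.pDescription[0] };
}

const products = [
  { _id: 1, pDescription: ["great product"] },
  { _id: 2, pDescription: ["another product"] },
  { _id: 3, pDescription: "already a string" } // hypothetical extra document
];

const fixedProducts = products.map(firstOfDescription);
console.log(fixedProducts);
```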
The document structure in the cities collection looks like this:
cities
{
_id: ObjectId("5e78ec62bb5b406776e92fac"),
city_name: "Mumbai",
...
...
subscriptions: [
{
_id: 1,
category: "Print Magazine",
subscribers: 183476,
options: [
{
name: "Time",
subscribers: 56445
},
{
name: "The Gentlewoman",
subscribers: 9454
},
{
name: "Gourmand",
subscribers: 15564
}
...
...
]
},
{
_id: 2,
category: "RSS Feed",
subscribers: 2645873,
options: [
{
name: "Finance",
subscribers: 168465
},
{
name: "Politics",
subscribers: 56945
},
{
name: "Entrepreneurship",
subscribers: 56945
},
...
...
]
}
]
}
Now when a user subscribes like below
{
cityId: "5e78ec62bb5b406776e92fac",
selections: [
{
categoryId: 1,
options : ["Time", "Gourmand"]
},
{
categoryId: 2,
options: ["Politics", "Entrepreneurship"]
}
]
}
I want to update the following in the cities document
Increment subscribers for "Print Magazine" by 1
Increment subscribers for "Time" by 1
Increment subscribers for "Gourmand" by 1
Increment subscribers for "RSS Feed" by 1
Increment subscribers for "Politics" by 1
Increment subscribers for "Entrepreneurship" by 1
So when an item is subscribed to, its subscriber count is incremented by 1, and so is the subscriber count of the category it falls into.
I want to achieve this in a single update query. Any tips on how I can do this?
Use case details
Each user's subscription details are stored in the user_subscription_details collection (not listed here). The subscriptions property in cities holds just the subscription summary for each city.
So I was able to do it with the following query
db.cities.updateOne(
{
_id : ObjectId("5e78ec62bb5b406776e92fac")
},
{
$inc: {
"subscriptions.$[category].subscribers" : 1,
"subscriptions.$[category].options.$[option].subscribers" : 1
}
},
{
arrayFilters: [
{ "category._id": {$in: [1, 2]} },
{ "option.name": {$in: ["Time", "Gourmand", "Politics", "Entrepreneurship"]} }
]
}
)
Brief Explanation
First the document is matched with _id.
In update block we will declare the fields to be updated
"subscriptions.$[?].subscribers" : 1,
"subscriptions.$[?].options.$[?].subscribers" : 1
I have used ? here to show that we don't yet know which array elements these updates apply to. We declare that in the next block by filtering the array elements that need to be updated.
In the filter block, we filter array elements on some condition
{ "category._id": {$in: [1, 2]} }
{ "option.name": {$in: ["Time", "Gourmand", "Politics", "Entrepreneurship"]} }
First, we filter the elements in the outer array by _id, i.e. only subscription categories whose _id is either 1 or 2.
Next, we filter the elements in the inner options array on the name field. Elements that pass both filters get updated.
Note: category in category._id and option in option.name can be any identifier, but the same identifier must be used in the field paths in the update block.
For a Spring Boot MongoOperations translation of this query, look at this answer
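The update document and the filters can be derived from the incoming request rather than hard-coded. A minimal sketch, assuming every selection carries a categoryId and an options array (the sample request in the question is not fully consistent on that); buildSubscriptionUpdate is a hypothetical helper name.

```javascript
// Derive the $inc update and arrayFilters from a subscription request.
// buildSubscriptionUpdate is a hypothetical helper; it assumes each
// selection looks like { categoryId: <number>, options: [<name>, ...] }.
function buildSubscriptionUpdate(selections) {
  const categoryIds = selections.map((s) => s.categoryId);
  const optionNames = selections.flatMap((s) => s.options);
  return {
    update: {
      $inc: {
        "subscriptions.$[category].subscribers": 1,
        "subscriptions.$[category].options.$[option].subscribers": 1
      }
    },
    options: {
      arrayFilters: [
        { "category._id": { $in: categoryIds } },
        { "option.name": { $in: optionNames } }
      ]
    }
  };
}

const subscription = buildSubscriptionUpdate([
  { categoryId: 1, options: ["Time", "Gourmand"] },
  { categoryId: 2, options: ["Politics", "Entrepreneurship"] }
]);
console.log(JSON.stringify(subscription.options));

// Intended use in the shell (not run here):
// db.cities.updateOne({ _id: cityId }, subscription.update, subscription.options);
```

One caveat with a single pair of filters: the option filter is applied inside every matched category, so an option name that exists in two selected categories would be incremented in both.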
I have a collection with documents having the following format
{
name: "A",
details : {
matchA: {
comment: "Hello",
score: 5
},
matchI: {
score: 10
},
lastMatch:{
score: 5
}
}
},
{
name: "B",
details : {
match2: {
score: 5
},
match7: {
score: 10
},
firstMatch:{
score: 5
}
}
}
I don't immediately know the names of the keys that are children of details; they don't follow a known format, there can be different numbers of them, etc.
I would like to write a query which will update the children in such a manner that any subdocument with a score less than 5, gets a new field added (say lowScore: true).
I've looked around a bit and I found $ and $elemMatch, but those only work on arrays. Is there an equivalent for subdocuments? Is there some way of doing it using the aggregation pipeline?
I don't think you can do that using a normal update(). There is a way through the aggregation framework which itself, however, cannot alter any persisted data. So you will need to loop through the results and update your documents individually like e.g. here: Aggregation with update in mongoDB
This is the required query to transform your data into what you need for the subsequent update:
collection.aggregate([{
$addFields: {
"details": {
$objectToArray: "$details" // transform "details" into a uniform array of key-value pairs
}
}
}, {
$unwind: "$details" // flatten the array created above
}, {
$match: {
"details.v.score": {
$lt: 10 // filter out anything that's not relevant to us
// (note: I used "less than 10" rather than the "less than 5" you asked for, to get some results from your sample data)
},
"details.v.lowScore": { // not strictly required, but it makes sense to skip subdocuments that already have the field in case you run the query repeatedly
$exists: false
}
}
}, {
$project: {
"fieldsToUpdate": "$details.k" // keep only the key name of each matching subdocument
}
}])
Running this query returns:
/* 1 */
{
"_id" : ObjectId("59cc0b6afab2f8c9e1404641"),
"fieldsToUpdate" : "matchA"
}
/* 2 */
{
"_id" : ObjectId("59cc0b6afab2f8c9e1404641"),
"fieldsToUpdate" : "lastMatch"
}
/* 3 */
{
"_id" : ObjectId("59cc0b6afab2f8c9e1404643"),
"fieldsToUpdate" : "match2"
}
/* 4 */
{
"_id" : ObjectId("59cc0b6afab2f8c9e1404643"),
"fieldsToUpdate" : "firstMatch"
}
You could then $set your new field "lowScore" using a cursor as described in the linked answer above.
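As a rough idea of that second step, each result row can be turned into one $set on the dotted path of the matching subdocument. buildLowScoreOps is a hypothetical helper, and plain strings stand in for the ObjectId values so the snippet runs outside the shell.

```javascript
// Turn aggregation results ({ _id, fieldsToUpdate }) into update operations
// that add lowScore: true to the matching subdocument.
// buildLowScoreOps is a hypothetical helper for illustration.
function buildLowScoreOps(results) {
  return results.map((r) => ({
    updateOne: {
      filter: { _id: r._id },
      update: { $set: { ["details." + r.fieldsToUpdate + ".lowScore"]: true } }
    }
  }));
}

// Strings stand in for ObjectId values here.
const sampleResults = [
  { _id: "59cc0b6afab2f8c9e1404641", fieldsToUpdate: "matchA" },
  { _id: "59cc0b6afab2f8c9e1404641", fieldsToUpdate: "lastMatch" },
  { _id: "59cc0b6afab2f8c9e1404643", fieldsToUpdate: "match2" }
];

const lowScoreOps = buildLowScoreOps(sampleResults);
console.log(JSON.stringify(lowScoreOps[0]));

// Intended use in the shell (not run here):
// collection.bulkWrite(lowScoreOps);
```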
I have a mongo collection "test" which contains elements like so (the nodes array is a set whose order is meaningful):
"test" : {
"superiorID" : 1,
"nodes" : [
{
"subID" : 2
},
{
"subID" : 1
},
{
"subID" : 3
}
]
}
or
"test" : {
"superiorID" : 4,
"nodes" : [
{
"subID" : 2
},
{
"subID" : 1
},
{
"subID" : 3
}
]
}
I am using spring Criteria to try and build a mongo query which will return to me all elements where the 'subID' equals a user input id 'inputID' AND the 'superiorID' position is NOT before the 'inputID' (if the superior id is even in the sub ids which is not required).
So, for example, if my user input was 3, I would NOT want to pull the first document but I WOULD want to pull the second document (the first has a superior that exists in the nodes BEFORE the user input node; the second's superior ID is not equal to the user input).
I know that the $indexOfArray function exists but I don't know how to translate this to Criteria.
You can get the result you are looking for through the aggregation framework. I've made a pseudo query to show what you should be looking for. It returns showMe = false for doc1 and showMe = true for doc2, which you could obviously match on. You do not need two $project stages for this query; I only did that to make a working query that is also easy-ish to read. This will not be a very fast query. If you want fast queries, you might want to rethink your data structure.
db.getCollection('test').aggregate([
{ "$project":
{
"superiorIndex": {"$indexOfArray" : [ "$nodes.subID","$superiorID" ]},
"inputIndex": {"$indexOfArray" : [ "$nodes.subID",3 ]},
}
},
{ "$project":
{
"showMe" :
{
$cond:
{
if: { $eq: [ "$superiorIndex", -1 ] },
then: true,
else: {$gt:[ "$superiorIndex","$inputIndex"]}
}
}
}
}
])
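The same decision can be written in plain JavaScript, which may help when checking the logic against sample documents. This only mirrors the pipeline above; the showMe function is an illustration, not something Spring or MongoDB provides. Array.prototype.indexOf returns -1 when the value is missing, just like $indexOfArray.

```javascript
// Mirror of the two $project stages: compare the position of superiorID
// and of the user input within nodes.subID.
function showMe(doc, inputID) {
  const subIDs = doc.nodes.map((n) => n.subID);
  const superiorIndex = subIDs.indexOf(doc.superiorID);
  const inputIndex = subIDs.indexOf(inputID);
  if (superiorIndex === -1) {
    return true; // superior is not among the nodes at all
  }
  return superiorIndex > inputIndex; // superior must come after the input
}

const doc1 = { superiorID: 1, nodes: [{ subID: 2 }, { subID: 1 }, { subID: 3 }] };
const doc2 = { superiorID: 4, nodes: [{ subID: 2 }, { subID: 1 }, { subID: 3 }] };

console.log(showMe(doc1, 3), showMe(doc2, 3)); // false true
```

Like the pipeline, this does not check that the input ID is actually present in nodes; that condition would need its own match.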
db.collection.find({"nodes.2.subID": 2}) looks up the subID of the element at index 2 (the third element) of the nodes field and matches documents where it equals 2.
The documents in my collection have the following format:
{ word: 'apple', number: 5 }
I want to increment the value of number from a JavaScript function. Yes, I know you can do that with a simple upsert, but I'm planning to do this for arrays and more complicated decisions that can't be expressed by operators in a single upsert. This is just a simplified example.
What I've tried so far:
db.a.find({word:'test'})
{ "_id" : ObjectId("53c98ff18b95662af1148ad7"), "word" : "test", "number" : 5 }
db.a.find({word:'test'}).forEach(function(entry) {
inc: entry.number, 5;
print("test", entry.number);
})
test 5
(but it should be 10)
Could you please tell me what I am doing wrong?
db.runCommand({
findAndModify: "people",
query: { name: "Andy" },
sort: { rating: 1 },
update: { $inc: { score: 1 } },
upsert: true
})
As per my comment above, this should work as a bulk operation, as opposed to an update on each forEach iteration.
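On why the loop in the question prints 5: JavaScript parses inc: entry.number, 5; as a labeled statement followed by a comma expression, so nothing is modified and nothing is sent to the server. If the new value really has to be computed in JavaScript, one option is to compute it per document and collect the writes. A minimal sketch, where buildIncrementOps is a made-up helper and a string stands in for the ObjectId:

```javascript
// Compute the new value in JavaScript and emit one write per document.
// buildIncrementOps is a hypothetical helper for illustration.
function buildIncrementOps(entries, delta) {
  return entries.map((entry) => ({
    updateOne: {
      filter: { _id: entry._id },
      update: { $set: { number: entry.number + delta } }
    }
  }));
}

// A string stands in for the ObjectId from the question.
const foundEntries = [{ _id: "53c98ff18b95662af1148ad7", word: "test", number: 5 }];

const incrementOps = buildIncrementOps(foundEntries, 5);
console.log(incrementOps[0].updateOne.update.$set.number); // 10

// Intended use in the shell (not run here):
// db.a.bulkWrite(buildIncrementOps(db.a.find({ word: "test" }).toArray(), 5));
```

Note that reading a value and writing it back with $set is not atomic if other writers touch the same document; where an operator such as $inc can express the change, it remains the safer choice.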