Remove an array element from an array of sub documents - mongodb

{
"_id" : ObjectId("5488303649f2012be0901e97"),
"user_id":3,
"my_shopping_list" : {
"books" : [ ]
},
"my_library" : {
"books" : [
{
"date_added" : ISODate("2014-12-10T12:03:04.062Z"),
"tag_text" : [
"english"
],
"bdata_product_identifier" : "a1",
"tag_id" : [
"fa7ec571-4903-4aed-892a-011a8a411471"
]
},
{
"date_added" : ISODate("2014-12-10T12:03:08.708Z"),
"tag_text" : [
"english",
"hindi"
],
"bdata_product_identifier" : "a2",
"tag_id" : [
"fa7ec571-4903-4aed-892a-011a8a411471",
"60733993-6b54-420c-8bc6-e876c0e196d6"
]
}
]
},
"my_wishlist" : {
"books" : [ ]
},
}
I would like to remove only "english" from every tag_text array in my_library, using only user_id and tag_text. This document belongs to user_id: 3. The queries I have tried so far delete the entire book sub-document. Thank you.

Since you are using pymongo, and MongoDB doesn't provide a nice way of doing this (the positional $ operator only pulls english from the first matching sub-document), why not write a script that removes english from every tag_text array and then updates your document?
Demo:
>>> doc = yourcollection.find_one(
...     {'user_id': 3, 'my_library.books': {'$exists': True}},
...     {'_id': 0, 'user_id': 0})
>>> books = doc['my_library']['books']  # books field in your doc
>>> new_books = []
>>> for book in books:
...     book = dict(book)  # copy, so each book keeps all of its fields
...     # drop 'english' from this book's tag_text list
...     book['tag_text'] = [t for t in book.get('tag_text', []) if t != 'english']
...     new_books.append(book)
...
>>> new_books
[{'date_added': datetime.datetime(2014, 12, 10, 12, 3, 4, 62000), 'tag_text': [], 'bdata_product_identifier': 'a1', 'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471']}, {'date_added': datetime.datetime(2014, 12, 10, 12, 3, 8, 708000), 'tag_text': ['hindi'], 'bdata_product_identifier': 'a2', 'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471', '60733993-6b54-420c-8bc6-e876c0e196d6']}]
>>> # update_one is the PyMongo 3+ spelling; older versions used update()
>>> yourcollection.update_one({'user_id': 3}, {'$set': {'my_library.books': new_books}})
Check that everything worked:
>>> yourcollection.find_one({'user_id': 3})
{'_id': ObjectId('5488303649f2012be0901e97'), 'user_id': 3.0, 'my_shopping_list': {'books': []}, 'my_library': {'books': [{'date_added': datetime.datetime(2014, 12, 10, 12, 3, 4, 62000), 'tag_text': [], 'bdata_product_identifier': 'a1', 'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471']}, {'date_added': datetime.datetime(2014, 12, 10, 12, 3, 8, 708000), 'tag_text': ['hindi'], 'bdata_product_identifier': 'a2', 'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471', '60733993-6b54-420c-8bc6-e876c0e196d6']}]}, 'my_wishlist': {'books': []}}

One possible solution could be to repeat
db.collection.update({user_id: 3, "my_library.books.tag_text": "english"}, {$pull: {"my_library.books.$.tag_text": "english"}})
until MongoDB can no longer match a document to update.
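To see why the update has to be repeated, here is a minimal pure-Python sketch that only mimics (it does not call) the positional $ behaviour on a trimmed copy of the sample document: each pass pulls "english" from the first matching book only. The helper name pull_from_first_match is hypothetical, and the book dicts keep just two of the original fields.

```python
def pull_from_first_match(books, tag):
    """Mimic one positional-$ $pull: touch only the first book containing tag.

    Returns True if a book was modified, False if nothing matched.
    """
    for book in books:
        if tag in book["tag_text"]:
            book["tag_text"] = [t for t in book["tag_text"] if t != tag]
            return True
    return False


# Trimmed copy of my_library.books from the question
books = [
    {"bdata_product_identifier": "a1", "tag_text": ["english"]},
    {"bdata_product_identifier": "a2", "tag_text": ["english", "hindi"]},
]

# Repeat the "update" until it no longer matches anything
passes = 0
while pull_from_first_match(books, "english"):
    passes += 1

print(passes, [b["tag_text"] for b in books])
```

One pass is needed per book that still contains the tag, which is why a single update is not enough here.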

Related

Mongodb multiple subdocument

I need a collection with structure like this:
{
"_id" : ObjectId("5ffc3e2df14de59d7347564d"),
"name" : "MyName",
"pays" : "de",
"actif" : 1,
"details" : {
"pt" : {
"title" : "MongoTime PT",
"availability_message" : "In stock",
"price" : 23,
"stock" : 1,
"delivery_location" : "Portugal",
"price_shipping" : 0,
"updated_date" : ISODate("2022-03-01T20:07:20.119Z"),
"priority" : false,
"missing" : 1,
},
"fr" : {
"title" : "MongoTime FR",
"availability_message" : "En stock",
"price" : 33,
"stock" : 1,
"delivery_location" : "France",
"price_shipping" : 0,
"updated_date" : ISODate("2022-03-01T20:07:20.119Z"),
"priority" : false,
"missing" : 1,
}
}
}
How can I create an index for each sub-document in 'details'?
Or would it be better to use an array?
A query like the one below currently takes a very long time (1 hour). What can I do?
query = {"details.pt.missing": {"$in": [0, 1, 2, 3]}, "pays": 'de'}
db.find(query, {"_id": False, "name": True}, sort=[("details.pt.updated_date", 1)], limit=300)
An array type would be better, as there are advantages.
(1) You can include a new field which has values like pt, fr, xy, ab, etc. For example:
details: [
{ type: "pt", title : "MongoTime PT", missing: 1, other_fields: ... },
{ type: "fr", title : "MongoTime FR", missing: 1, other_fields: ... },
{ type: "xy", title : "MongoTime XY", missing: 2, other_fields: ... },
// ...
]
Note the introduction of the new field type (this can be any name representing the field data).
(2) You can also index on the array sub-document fields, which can improve query performance. Indexes on array fields are referred to as Multikey Indexes.
The index can be on a field used in a query filter, for example "details.missing". This key can also be part of a Compound Index. This can help a query filter like the one below, where $elemMatch makes sure both conditions are met by the same array element:
{ pays: "de", details: { $elemMatch: { type: "pt", missing: { $in: [ 0, 1, 2, 3 ] } } } }
NOTE: You can verify the usage of an index in a query by generating a Query Plan, applying the explain method on the find.
(3) Also, see Embedded Document Pattern as explained in the Model One-to-Many Relationships with Embedded Documents.
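As a rough illustration of the restructuring in (1), the following Python sketch turns the keyed details sub-documents into an array whose elements carry a type field. It works on plain dicts only; the function name details_to_array is hypothetical, and the sample document is trimmed to a few of the fields from the question.

```python
def details_to_array(doc):
    """Return a copy of doc whose 'details' mapping becomes a list of
    sub-documents, each tagged with a 'type' field holding its old key."""
    converted = dict(doc)
    converted["details"] = [
        {"type": key, **fields} for key, fields in doc["details"].items()
    ]
    return converted


# Trimmed version of the document from the question
doc = {
    "name": "MyName",
    "pays": "de",
    "details": {
        "pt": {"title": "MongoTime PT", "missing": 1},
        "fr": {"title": "MongoTime FR", "missing": 1},
    },
}

new_doc = details_to_array(doc)
print(new_doc["details"])
```

The original document is left untouched, so the converted copy can be inspected before anything is written back.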

$elemMatch vs $and giving different output for the same query in Mongo shell. Why?

I am writing a mongo query to find the restaurants that achieved a score which is more than 80 but less than 100.
The following are the queries I wrote to achieve this.
db.restaurants.find({grades: {$elemMatch: {"score": {$gt: 80, $lt: 100}}}})
db.restaurants.find({"grades.score": {$gt: 80, $lt: 100}})
db.restaurants.find({$and: [{"grades.score": {$gt: 80, $lt: 100}}]})
The first query returns 3 outputs but the bottom two return 4 outputs. What am I doing wrong?
I simplified the results to make a better explanation:
{
"_id" : ObjectId("5d4ac3f61effed1f90d366ac"),
"grades" : [
{
"score" : [11, 131, 11, 25, 11, 13]
}
],
"name" : "Murals On 54/Randolphs'S",
"restaurant_id" : "40372466"
},
{
"_id" : ObjectId("5d4ac3f61effed1f90d3674d"),
"grades" : [
{
"score" : [5, 8, 12, 2, 9, 92, 41]
}
],
"name" : "Gandhi",
"restaurant_id" : "40381295"
},
{
"_id" : ObjectId("5d4ac3f61effed1f90d368b0"),
"grades" : [
{
"score" : [31, 98, 32, 21, 11]
}
],
"name" : "Bella Napoli",
"restaurant_id" : "40393488"
},
{
"_id" : ObjectId("5d4ac3f61effed1f90d3711c"),
"grades" : [
{
"score" : [89, 6, 13]
}
],
"name" : "West 79Th Street Boat Basin Cafe",
"restaurant_id" : "40756344"
}
The restaurant_id values and score elements that match each condition separately:
$gt: 80: 40372466 (131), 40381295 (92), 40393488 (98), 40756344 (89)
$lt: 100: 40372466 (11, 11, 25, 11, 13), 40381295 (5, 8, 12, 2, 9, 92, 41), 40393488 (31, 98, 32, 21, 11), 40756344 (89, 6, 13)
The bottom two queries test the array as a whole: each condition only needs to be satisfied by some element, not necessarily the same one, so all 4 records fall into both the $gt and the $lt group.
The first query, with $elemMatch, applies {$gt: 80, $lt: 100} to each element of the array individually. The record with "restaurant_id" : "40372466" has no element that matches the query: none of [11, 131, 11, 25, 11, 13] is between 80 and 100.
The $elemMatch operator matches documents that contain an array field
with at least one element that matches all the specified query
criteria.
That's why the $elemMatch operator should be used on arrays when a single element has to match multiple conditions.
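The difference can be reproduced with a small pure-Python sketch over the simplified score arrays above. The two helper functions are hypothetical stand-ins for the two matching behaviours; they do not call MongoDB.

```python
# restaurant_id -> simplified score array from the results above
restaurants = {
    "40372466": [11, 131, 11, 25, 11, 13],
    "40381295": [5, 8, 12, 2, 9, 92, 41],
    "40393488": [31, 98, 32, 21, 11],
    "40756344": [89, 6, 13],
}


def matches_separately(scores):
    # Dotted-field style: each condition may be met by a different element
    return any(s > 80 for s in scores) and any(s < 100 for s in scores)


def matches_same_element(scores):
    # $elemMatch style: a single element must meet both conditions
    return any(80 < s < 100 for s in scores)


separately = [rid for rid, s in restaurants.items() if matches_separately(s)]
same_element = [rid for rid, s in restaurants.items() if matches_same_element(s)]

print(len(separately), len(same_element))
```

Restaurant 40372466 is the one that appears only in the first list: 131 satisfies the lower bound and the small scores satisfy the upper bound, but no single score does both.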
Definition of $elemMatch from MongoDB official documentation:
The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.
For example, with only two records in the scores collection:
> db.scores.find()
{ "_id" : 2, "results" : [ 75, 88, 89 ] }
{ "_id" : 1, "results" : [ 82, 85, 88 ] }
>
The following query matches only those documents where the results array contains at least one element that is both greater than or equal to 80 and is less than 85.
db.scores.find(
{ results: { $elemMatch: { $gte: 80, $lt: 85 } } }
)
Consider the first record, [ 82, 85, 88 ]:
82 => true, since 82 is at least 80 and less than 85.
85 => false, since the operator is $lt, not $lte.
88 => false, since it is not less than 85.
So at least one value in the array meets the query criteria, and this record is selected.
For the second record, [ 75, 88, 89 ], let's look for at least one element that satisfies the criteria $gte: 80, $lt: 85:
75 => false, since it is not greater than or equal to 80.
88 => false, since it is not less than 85.
89 => false, since it is not less than 85.
So none of the elements of the results array satisfies the criteria, and the record is rejected.
Final output would be:
{ "_id" : 1, "results" : [ 82, 85, 88 ] }
Let's insert a new record, "results": [ 1, 2, 3 ] with _id:3, into the same collection. The records become:
> db.scores.find()
{ "_id" : 2, "results" : [ 75, 88, 89 ] }
{ "_id" : 1, "results" : [ 82, 85, 88 ] }
{ "_id" : 3, "results" : [ 1, 2, 3 ] }
Now run the $and query as below:
db.scores.find({$and: [{"results": {$gte: 80, $lt: 85 } }] } )
What would be the output?
The record with _id:2 satisfies both conditions, since 88 and 89 are greater than 80 and 75 is less than 85, so it is selected.
In the record with _id:1, all the results values are greater than 80 and 82 is less than 85, so it is also selected.
In the third record, with _id:3, no element is greater than or equal to 80, so it is rejected.
So the final output is:
{ "_id" : 2, "results" : [ 75, 88, 89 ] }
{ "_id" : 1, "results" : [ 82, 85, 88 ] }
One more thing: insert another record, results: [ 86, 92, 93 ] with _id:4. What would the result of the same $and query be? Think it over and you will figure it out yourself.

MongoDB, how to use document as the smallest unit to search the document in array?

Sorry for the title; I really do not know how to state this clearly, but I can show you.
I have inserted two documents:
> db.test.find().pretty()
{
"_id" : ObjectId("557faa461ec825d473b21422"),
"c" : [
{
"a" : 3,
"b" : 7
}
]
}
{
"_id" : ObjectId("557faa4c1ec825d473b21423"),
"c" : [
{
"a" : 1,
"b" : 3
},
{
"a" : 5,
"b" : 9
}
]
}
>
I only want to select the first document, using a value such as 4 that is greater than 'a' and smaller than 'b' within the same array element.
But when I search, I do not get the result I want:
> db.test.find({'c.a': {$lte: 4}, 'c.b': {$gte: 4}})
{ "_id" : ObjectId("557faa461ec825d473b21422"), "c" : [ { "a" : 3, "b" : 7 } ] }
{ "_id" : ObjectId("557faa4c1ec825d473b21423"), "c" : [ { "a" : 1, "b" : 3 }, { "a" : 5, "b" : 9 } ] }
>
Because 4 is greater than "a" : 1 and smaller than "b" : 9 in the second document, it is selected too, even though those two values are not in the same element of the array.
But I only want the first one selected.
I found this http://docs.mongodb.org/manual/reference/operator/query/elemMatch/#op._S_elemMatch, but the example does not seem to fit my situation.
You would want:
db.test.findOne({ c: {$elemMatch: {a: {$lte: 4}, b: {$gte: 4} } } })
With your query, you are searching for documents whose 'c' array has some object with a key 'a' whose value is <= 4 and some object, not necessarily the same one, with a key 'b' whose value is >= 4.
The second record is returned because c[0].a is <= 4 and c[1].b is >= 4.
Since you specified that you wanted to select only the first document, you would want to do a findOne() instead of a find().
Use $elemMatch as below:
db.test.find({"c":{"$elemMatch":{"a":{"$lte":4},"b":{"$gte":4}}}})
Or
db.test.find({"c":{"$elemMatch":{"a":{"$lte":4},"b":{"$gte":4}}}},{"c.$":1})
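The effect of $elemMatch on the two sample documents can be checked with a short pure-Python sketch. The _id values 1 and 2 are stand-ins for the ObjectIds in the question, and both helper functions are hypothetical imitations of the matching rules, not driver calls.

```python
# The two sample documents, with short stand-in _id values
docs = [
    {"_id": 1, "c": [{"a": 3, "b": 7}]},
    {"_id": 2, "c": [{"a": 1, "b": 3}, {"a": 5, "b": 9}]},
]
value = 4


def dotted_match(doc):
    # {'c.a': {$lte: value}, 'c.b': {$gte: value}}:
    # the two conditions may be met by different array elements
    elems = doc["c"]
    return any(e["a"] <= value for e in elems) and any(e["b"] >= value for e in elems)


def elem_match(doc):
    # {c: {$elemMatch: {a: {$lte: value}, b: {$gte: value}}}}:
    # one array element must meet both conditions
    return any(e["a"] <= value <= e["b"] for e in doc["c"])


dotted_ids = [d["_id"] for d in docs if dotted_match(d)]
elem_ids = [d["_id"] for d in docs if elem_match(d)]

print(dotted_ids, elem_ids)
```

Only the first document has a single element whose a and b enclose 4, so it is the only one kept by the $elemMatch-style check.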

Is there any way to project the size of intersection between two numerical ranges using mongo language?

My sample data:
db.test.insert([{range:[1, 8]},
{range:[4, 8]},
{range:[1,9]},
{range:[3, 5]}])
And I have a variable:
query = [2, 5]
I want to do something like this:
db.test.aggregate([
{$project:{overlap: {$IntersectionOfRanges:["$range", query]} }},
...
So that it projects 3 for the 1st doc, 1 for the 2nd, 3 for the 3rd and 2 for the 4th. Of course, this "$IntersectionOfRanges" function is completely made up. The only solution in mongo I can think of is to store the whole sequence of integers in an array (e.g., [1, 5] turns into [1, 2, 3, 4, 5]) and then use $setIntersection. Unfortunately, some of the ranges are much longer than those in the sample, and I cannot afford to keep arrays of 100 or so numbers in the database. Is this even possible?
To solve that issue, you basically have to implement interval trichotomy using the aggregation framework. This is not as hard as it sounds. But due to some limitations in the MongoDB expression syntax, using an array as you first suggested would make things really hard.
But, as you explained in a comment, restructuring your schema is an option. So I would go in that direction instead:
db.test.insert([
{range:{from:1, to:8}},
{range:{from:4, to:8}},
{range:{from:1, to:9}},
{range:{from:3, to:5}},
])
With that new model, you can find the range intersection using this simple aggregation pipeline:
query = [2, 5]
db.test.aggregate([
{$project: {
from: {$cond: [{$gt: ["$range.from", query[0]]}, "$range.from", query[0]]},
to: {$cond: [{$lt: ["$range.to", query[1]]}, "$range.to", query[1]]},
}}
])
Now, from is the maximum of the document's range.from field and the start of the target range, and to is the minimum of the document's range.to field and the end of the target range. So, at this point:
When to is lower than from, there was no intersection between the two ranges [1];
Otherwise, your intersection is the range between from and to (possibly limited to a single value).
Given the data set at the top of this answer, the above aggregation pipeline (with an extra step to add the "width" of the range) will produce:
> query = [2, 5]
> db.test.aggregate([
{$project: {
from: {$cond: [{$gt: ["$range.from", query[0]]}, "$range.from", query[0]]},
to: {$cond: [{$lt: ["$range.to", query[1]]}, "$range.to", query[1]]},
}},
{$project: { width: { $subtract: [ "$to", "$from" ]},
from: 1,
to: 1,
}}
])
{ "_id" : ObjectId("..."), "from" : 2, "to" : 5, "width" : 3 }
{ "_id" : ObjectId("..."), "from" : 4, "to" : 5, "width" : 1 }
{ "_id" : ObjectId("..."), "from" : 2, "to" : 5, "width" : 3 }
{ "_id" : ObjectId("..."), "from" : 3, "to" : 5, "width" : 2 }
And, using a different range:
> query = [8, 10]
> db.test.aggregate([
... {$project: {
... from: {$cond: [{$gt: ["$range.from", query[0]]}, "$range.from", query[0]]},
... to: {$cond: [{$lt: ["$range.to", query[1]]}, "$range.to", query[1]]},
... }},
... {$project: { width: { $subtract: [ "$to", "$from" ]},
... from: 1,
... to: 1,
... }}
... ])
{ "_id" : ObjectId("..."), "from" : 8, "to" : 8, "width" : 0 } // single point
{ "_id" : ObjectId("..."), "from" : 8, "to" : 8, "width" : 0 } // single point
{ "_id" : ObjectId("..."), "from" : 8, "to" : 9, "width" : 1 } // range intersection
{ "_id" : ObjectId("..."), "from" : 8, "to" : 5, "width" : -3 } // NO intersection
[1] You can go even further in your analysis: if from > to and from = query[0] (resp. to = query[1]), you know that the document range was below (resp. above) the target range.
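The clipping logic of the pipeline can be sanity-checked outside MongoDB with a few lines of Python. This is only a sketch of the same arithmetic; the function name clip_range is hypothetical, and ranges are passed as (from, to) tuples.

```python
def clip_range(doc_range, query):
    """Clip doc_range to query: take the larger start and the smaller end."""
    start = max(doc_range[0], query[0])
    end = min(doc_range[1], query[1])
    # A negative width means the two ranges do not intersect
    return start, end, end - start


# (from, to) pairs matching the sample data set
ranges = [(1, 8), (4, 8), (1, 9), (3, 5)]

widths_a = [clip_range(r, (2, 5))[2] for r in ranges]
widths_b = [clip_range(r, (8, 10))[2] for r in ranges]

print(widths_a)
print(widths_b)
```

The two lists line up with the width columns of the two shell outputs above.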

Updating array with push and slice

I have just started to play with MongoDB and have some questions about how I update my documents in the database. I insert two documents in my db with
db.userscores.insert({name: 'John Doe', email: 'john.doe#mail.com', levels : [{level: 1, hiscores: [90, 40, 25], achivements: ['capture the flag', 'it can only be one', 'apple collector', 'level complete']}, {level: 2, hiscores: [30, 25], achivements: ['level complete']}, {level: 3, hiscores: [], achivements: []}]});
db.userscores.insert({name: 'Jane Doe', email: 'jane.doe#mail.com', levels : [{level: 1, hiscores: [150, 90], achivements: ['Master of the universe', 'capture the flag', 'it can only be one', 'apple collector', 'level complete']}]});
I check if my inserting worked with the find() command and it looks ok.
db.userscores.find().pretty();
{
"_id" : ObjectId("5358b47ab826096525d0ec98"),
"name" : "John Doe",
"email" : "john.doe#mail.com",
"levels" : [
{
"level" : 1,
"hiscores" : [
90,
40,
25
],
"achivements" : [
"capture the flag",
"it can only be one",
"apple collector",
"level complete"
]
},
{
"level" : 2,
"hiscores" : [
30,
25
],
"achivements" : [
"level complete"
]
},
{
"level" : 3,
"hiscores" : [ ],
"achivements" : [ ]
}
]
}
{
"_id" : ObjectId("5358b47ab826096525d0ec99"),
"name" : "Jane Doe",
"email" : "jane.doe#mail.com",
"levels" : [
{
"level" : 1,
"hiscores" : [
150,
90
],
"achivements" : [
"Master of the universe",
"capture the flag",
"it can only be one",
"apple collector",
"level complete"
]
}
]
}
How can I add/update data in my userscores? Let's say I want to add a hiscore to user John Doe on level 1. How do I insert the hiscore 75 and still keep the hiscores array sorted? Can I limit the number of hiscores so the array only contains 3 elements? I have tried with
db.userscores.aggregate(
// Initial document match (uses name, if a suitable one is available)
{ $match: {
name : 'John Doe'
}},
// Expand the levels array into a stream of documents
{ $unwind: '$levels' },
// Filter to 'level 1' scores
{ $match: {
'levels.level': 1
}},
// Add score 75 with cap/limit of 3 elements
{ $push: {
'levels.hiscore':{$each [75], $slice:-3}
}}
);
but it won't work; the error I get is "SyntaxError: Unexpected token [".
Also, how do I get the 10 highest scores from all users on level 1, for example? Is my document schema OK, or is there a better schema for storing users' hiscores and achievements on different levels of my game? Are there any downsides for querying or performance with the schema above?
You can add the score with this statement:
db.userscores.update(
{ "name": "John Doe", "levels.level": 1 },
{ "$push": { "levels.$.hiscores": 75 } } )
This will not sort the array, since sorting during a $push is only supported when the array elements are documents.
From MongoDB 2.6 on, you can also sort arrays of non-document values:
db.userscores.update(
{ "name": "John Doe", "levels.level": 1 },
{ "$push": { "levels.$.hiscores": { $each: [ 75 ], $sort: -1, $slice: 3 } } } )
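What the $each / $sort / $slice combination does to John Doe's level 1 hiscores can be imitated in plain Python. This is a sketch of the resulting array only, assuming the modifiers are applied in the order append, sort, slice; push_sorted_capped is a hypothetical helper, not part of any driver.

```python
def push_sorted_capped(scores, new_scores, limit):
    """Append new_scores, sort descending, keep the first `limit` entries."""
    return sorted(scores + new_scores, reverse=True)[:limit]


hiscores = [90, 40, 25]  # John Doe, level 1
updated = push_sorted_capped(hiscores, [75], 3)

print(updated)
```

The lowest score drops out, so the array stays sorted and capped at three entries.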