Combine array from embedded document based on condition in MongoDB

I have a collection of student details like below:
{
"Student_id": 1,
"StudentName": "ABC",
"TestDetails": [{
"SubtestName":"Reading", "TestSeq":1, "SubTestDetails":1,
"Scores":[{"ScoreType":"YY","ScoreValue":"100"},{"ScoreType":"XX","ScoreValue":"100"},
{"ScoreType": "ZZ","ScoreValue":"100"}]}]
,
"TestDetails": [{
"SubtestName":"Writing", "TestSeq":1, "SubTestDetails":2,
"Scores":[{"ScoreType":"YY","ScoreValue":"200"},{"ScoreType":"XX","ScoreValue":"200"},
{"ScoreType": "ZZ","ScoreValue":"200"}]}]
,
"TestDetails": [{
"SubtestName":"Listning", "TestSeq":2, "SubTestDetails":3,
"Scores":[{"ScoreType":"YY","ScoreValue":"300"},{"ScoreType":"XX","ScoreValue":"300"},
{"ScoreType": "ZZ","ScoreValue":"300"}]}]
,
"TestDetails": [{
"SubtestName":"Speaking", "TestSeq":2, "SubTestDetails":4,
"Scores":[{"ScoreType":"YY","ScoreValue":"400"},{"ScoreType":"XX","ScoreValue":"400"},
{"ScoreType": "ZZ","ScoreValue":"400"}]}]
,
"TestDetails": [{
"SubtestName":"Smartness", "TestSeq":3, "SubTestDetails":5,
"Scores":[{"ScoreType":"YY","ScoreValue":"500"},{"ScoreType":"XX","ScoreValue":"500"},
{"ScoreType": "ZZ","ScoreValue":"500"}]}]
},
{
"Student_id": 2,
"StudentName": "XYZ",
"TestDetails": [{
"SubtestName":"Smartness", "TestSeq":1, "SubTestDetails":1,
"Scores":[{"ScoreType":"YY","ScoreValue":"100"},{"ScoreType":"XX","ScoreValue":"100"},
{"ScoreType": "ZZ","ScoreValue":"100"}]}]
,
"TestDetails": [{
"SubtestName":"Writing", "TestSeq":1, "SubTestDetails":2,
"Scores":[{"ScoreType":"YY","ScoreValue":"200"},{"ScoreType":"XX","ScoreValue":"200"},
{"ScoreType": "ZZ","ScoreValue":"200"}]}]
,
"TestDetails": [{
"SubtestName":"Listning", "TestSeq":2, "SubTestDetails":3,
"Scores":[{"ScoreType":"YY","ScoreValue":"300"},{"ScoreType":"XX","ScoreValue":"300"},
{"ScoreType": "ZZ","ScoreValue":"300"}]}]
,
"TestDetails": [{
"SubtestName":"Speaking", "TestSeq":2, "SubTestDetails":4,
"Scores":[{"ScoreType":"YY","ScoreValue":"400"},{"ScoreType":"XX","ScoreValue":"400"},
{"ScoreType": "ZZ","ScoreValue":"400"}]}]
,
"TestDetails": [{
"SubtestName":"Reading", "TestSeq":3, "SubTestDetails":5,
"Scores":[{"ScoreType":"YY","ScoreValue":"100"},{"ScoreType":"XX","ScoreValue":"100"},
{"ScoreType": "ZZ","ScoreValue":"1000"}]}]
},
.
.
.
)
How can I write an aggregate query that generates documents like the ones below?
{Student:1, "TestSeq" : 1, [{Subtest_name: Reading},{Subtest_name: Writing}]},
{Student:1,"TestSeq" : 2, [{Subtest_name: Listning},{Subtest_name: Speaking}]},
{Student:1, "TestSeq" : 3, [{Subtest_name: Smartness}]},
{Student:2, "TestSeq" : 1, [{Subtest_name: Smartness},{Subtest_name: Writing}]},
{Student:2, "TestSeq" : 2, [{Subtest_name: Listning},{Subtest_name: Speaking}]},
{Student:2, "TestSeq" : 3, [{Subtest_name: Reading}]},
{Student:3, "TestSeq" : 1, [{Subtest_name: Subtest1},{Subtest_name: Subtest2}]},
{Student:3, "TestSeq" : 2, [{Subtest_name: Subtest3},{Subtest_name: Subtest4}]},
{Student:3, "TestSeq" : 3, [{Subtest_name: Subtest5}]}
The logic is to group the subtest names by TestSeq value for each student: the subtest names with TestSeq = 1 go in the first row, those with TestSeq = 2 in the second row, and those with TestSeq = 3 in the last row.
How can I implement that?
This is what I have tried:
db.students.aggregate([
{$unwind: "$SubtestAttribs"},
{ $project: { student_name: 1, student_id : 1,
print_ready : "$SubtestAttribs.TestSeq",
Subtest_names : "$SubtestAttribs.SubtestName" } } ])
But I am unable to build an array based on a condition. The snippet above returns one row per subtest with its test sequence. How can I combine the subtest names that share a test sequence?

Note: I'm making a couple of assumptions because your question has some illegal JSON in it. Let me know if I guessed wrong. Also, I'm not on a computer with Mongo right now, so there may be some syntax issues.
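db.students.aggregate([
  // Emit one document per TestDetails entry
  { $unwind: "$TestDetails" },
  {
    // Collect the subtest names for each (student, test sequence) pair
    $group: {
      _id: { Student: "$Student_id", TestSeq: "$TestDetails.TestSeq" },
      Subtest_names: { $addToSet: "$TestDetails.SubtestName" }
    }
  },
  {
    // Lift the grouping keys out of _id
    $project: {
      _id: 0,
      Student: "$_id.Student",
      TestSeq: "$_id.TestSeq",
      Subtest_names: "$Subtest_names"
    }
  }
])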
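To see what the $unwind and $group stages are meant to produce, here is a minimal pure-Python sketch of the same grouping. It does not talk to MongoDB. It assumes, as the answer does, that each student document holds a single TestDetails array with one entry per subtest; the sample data is trimmed from the question, and group_subtests is a hypothetical helper name.

```python
from collections import defaultdict

# Assumed shape: ONE TestDetails array per student (the legal-JSON reading).
students = [
    {"Student_id": 1, "TestDetails": [
        {"SubtestName": "Reading", "TestSeq": 1},
        {"SubtestName": "Writing", "TestSeq": 1},
        {"SubtestName": "Listning", "TestSeq": 2},
        {"SubtestName": "Speaking", "TestSeq": 2},
        {"SubtestName": "Smartness", "TestSeq": 3},
    ]},
]


def group_subtests(docs):
    """Mimic $unwind on TestDetails followed by $group on (Student, TestSeq)."""
    grouped = defaultdict(list)
    for doc in docs:
        for test in doc["TestDetails"]:  # the $unwind step
            key = (doc["Student_id"], test["TestSeq"])
            # $addToSet semantics: skip names that are already present
            if test["SubtestName"] not in grouped[key]:
                grouped[key].append(test["SubtestName"])
    return [
        {"Student": student, "TestSeq": seq, "Subtest_names": names}
        for (student, seq), names in sorted(grouped.items())
    ]


result = group_subtests(students)
```

Unlike this sketch, MongoDB does not guarantee the order of elements produced by $addToSet, nor the order of the groups, so add a $sort stage if the row order matters.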

Related

Mongodb multiple subdocument

I need a collection with structure like this:
{
"_id" : ObjectId("5ffc3e2df14de59d7347564d"),
"name" : "MyName",
"pays" : "de",
"actif" : 1,
"details" : {
"pt" : {
"title" : "MongoTime PT",
"availability_message" : "In stock",
"price" : 23,
"stock" : 1,
"delivery_location" : "Portugal",
"price_shipping" : 0,
"updated_date" : ISODate("2022-03-01T20:07:20.119Z"),
"priority" : false,
"missing" : 1,
},
"fr" : {
"title" : "MongoTime FR",
"availability_message" : "En stock",
"price" : 33,
"stock" : 1,
"delivery_location" : "France",
"price_shipping" : 0,
"updated_date" : ISODate("2022-03-01T20:07:20.119Z"),
"priority" : false,
"missing" : 1,
}
}
}
How can I create an index for each subdocument in 'details'?
Or would it be better to use an array?
A query like this currently takes very long (1 hour). How can I speed it up?
query = {"details.pt.missing": {"$in": [0, 1, 2, 3]}, "pays": 'de'}
db.find(query, {"_id": false, "name": true}, sort=[("details.pt.updated_date", 1)], limit=300)
An array would be the better choice here, for the following reasons.
(1) You can include a new field which has values like pt, fr, xy, ab, etc. For example:
details: [
{ type: "pt", title : "MongoTime PT", missing: 1, other_fields: ... },
{ type: "fr", title : "MongoTime FR", missing: 1, other_fields: ... },
{ type: "xy", title : "MongoTime XY", missing: 2, other_fields: ... },
// ...
]
Note the new field, type (any name that describes the data will do).
(2) You can also index the fields of the array's sub-documents, which can improve query performance. Indexes on array fields are referred to as Multikey Indexes.
The index can be on a field used in a query filter, for example "details.missing". This key can also be part of a Compound Index. This can help a query filter like the one below:
{ pays: "de", "details.type": "pt", "details.missing": { $in: [ 0, 1, 2, 3 ] } }
NOTE: You can verify that a query uses an index by generating a Query Plan with the explain method on the find.
(3) Also, see the Embedded Document Pattern, as explained in Model One-to-Many Relationships with Embedded Documents.
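One caveat about the filter in point (2): with dotted paths on an array, each condition can be satisfied by a different array element, so a document whose "pt" entry has missing: 9 still matches if its "fr" entry has missing: 1. To require both conditions on the same element, the filter would use $elemMatch, for example { pays: "de", details: { $elemMatch: { type: "pt", missing: { $in: [0, 1, 2, 3] } } } }. The pure-Python sketch below only imitates the two matching rules to show the difference; matches_dotted and matches_elem are hypothetical helper names, and the sample document is made up for illustration.

```python
ALLOWED = {0, 1, 2, 3}


def matches_dotted(doc):
    """Dotted-path filter: each condition may be met by a different element."""
    details = doc["details"]
    return (doc["pays"] == "de"
            and any(d["type"] == "pt" for d in details)
            and any(d["missing"] in ALLOWED for d in details))


def matches_elem(doc):
    """$elemMatch-style filter: a single element must meet both conditions."""
    return (doc["pays"] == "de"
            and any(d["type"] == "pt" and d["missing"] in ALLOWED
                    for d in doc["details"]))


# Illustrative document: the "pt" entry is out of range, the "fr" entry is not.
doc = {"pays": "de", "details": [
    {"type": "pt", "missing": 9},
    {"type": "fr", "missing": 1},
]}

dotted_hit = matches_dotted(doc)   # the two conditions hit different elements
elem_hit = matches_elem(doc)       # no single element satisfies both
```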

Complete a MongoDB query to group and sum fields

I'm building an app that simulates a hotel check-in, where the user registers a lot of client data. Among this data, I have this:
{
"client_name" : "John Doe",
"client_id" : "xxxxxxxx"
...
"company" : "CocaCola",
"hotel_hq" : "New York",
"lodging_days" : 5,
...
}
One of the app's functions should be to show the list of hotel headquarters (HQs) that a company uses, along with the number of days the company has spent at each HQ.
So I need a query that returns something like this:
{"company" : "CocaCola", "hotel_hq" : ["New York", "California", "Orlando"], "lodging_days" : [5, 10, 8]}
With blood, sweat and tears, I came up with this query:
db.clients.aggregate(
{
$group: {
_id: '$company',
hotel_hq : {$push:'$hotel_hq'},
lodging_days : {$push:'$lodging_days' }
}
})
That is the closest I've come, because it returns this:
{"_id" : "CocaCola", "hotel_hq": ["New York", "New York", "California", "Orlando", "Orlando", "Orlando", "Orlando"], "lodging_days" : [5, 8, 10, 8, 9, 2, 3]}
The hotel HQs are sometimes repeated because different clients of the same company stayed in the same HQ, or the same client stayed there more than once.
Obviously, I can change $push to $addToSet, but then the result is:
{"_id" : "CocaCola", "hotel_hq": ["New York", "California", "Orlando"], "lodging_days" : [5, 8, 10, 9, 2, 3]}
That works for hotel_hq but not for lodging_days. I tried $sum, but I don't know how to tell Mongo to sum only the lodging_days of a repeated hotel_hq.
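Hope this works:
db.clients.aggregate(
  [
    {
      // First sum the days for each (company, hotel_hq) pair
      $group: { _id: { company: "$company", hotel_hq: "$hotel_hq" },
                Days: { $sum: "$lodging_days" } }
    },
    {
      // Then regroup by company, pushing each HQ and its total in the same order
      $group: { _id: "$_id.company", hotel_hq: { $push: "$_id.hotel_hq" },
                Days: { $push: "$Days" } }
    }
  ]
)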
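The two-stage grouping can be checked by hand with a small pure-Python sketch. It does not call MongoDB; it only reproduces the arithmetic on the stays implied by the question's $push output (the two arrays there are parallel, so "New York" pairs with 5 and 8, and so on). summarize is a hypothetical helper name.

```python
from collections import defaultdict

# (company, hotel_hq, lodging_days) rows read off the question's $push output.
stays = [
    ("CocaCola", "New York", 5), ("CocaCola", "New York", 8),
    ("CocaCola", "California", 10), ("CocaCola", "Orlando", 8),
    ("CocaCola", "Orlando", 9), ("CocaCola", "Orlando", 2),
    ("CocaCola", "Orlando", 3),
]


def summarize(rows):
    # Stage 1: sum the lodging days per (company, hotel_hq) pair.
    per_hq = defaultdict(int)
    for company, hq, days in rows:
        per_hq[(company, hq)] += days
    # Stage 2: regroup by company, appending HQ names and totals in parallel.
    out = {}
    for (company, hq), total in per_hq.items():
        entry = out.setdefault(company, {"hotel_hq": [], "Days": []})
        entry["hotel_hq"].append(hq)
        entry["Days"].append(total)
    return out


summary = summarize(stays)
```

Because both arrays are filled in the same pass of the second stage, the total at a given index always belongs to the HQ at the same index, which is the property $addToSet could not give.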

MongoDb Aggregation (SQL UNION style)

I need some help/advice on how to replicate some SQL behaviour in MongoDB.
Specifically, given this collection:
{
"_id" : ObjectId("577ebc0660084921141a7857"),
"tournament" : "Wimbledon",
"player1" : "Agassi",
"player2" : "Lendl",
"sets" : [{
"score1" : 6,
"score2" : 4,
"tiebreak" : false
}, {
"score1" : 7,
"score2" : 6,
"tiebreak" : true
}, {
"score1" : 7,
"score2" : 6,
"tiebreak" : true
}]
}
{
"_id" : ObjectId("577ebc3560084921141a7858"),
"tournament" : "Wimbledon",
"player1" : "Ivanisevic",
"player2" : "McEnroe",
"sets" : [{
"score1" : 4,
"score2" : 6,
"tiebreak" : false
}, {
"score1" : 3,
"score2" : 6,
"tiebreak" : false
}, {
"score1" : 6,
"score2" : 4,
"tiebreak" : false
}]
}
{
"_id" : ObjectId("577ebc7560084921141a7859"),
"tournament" : "Roland Garros",
"player1" : "Navratilova",
"player2" : "Graf",
"sets" : [{
"score1" : 5,
"score2" : 7,
"tiebreak" : false
}, {
"score1" : 6,
"score2" : 3,
"tiebreak" : false
}, {
"score1" : 7,
"score2" : 7,
"tiebreak" : true
}, {
"score1" : 7,
"score2" : 5,
"tiebreak" : false
}]
}
And these two distinct aggregations:
1) Aggregation ALFA: this aggregation is purposely strange, in the sense that it is designed to find all matches where at least 1 tiebreak is true but only show sets where tiebreak is false. Please don't consider the logic of it, it is crafted to allow full freedom to the user.
{
$match: {
"tournament": "Wimbledon",
"sets.tiebreak": true
}
},
{
$project: {
"tournament": 1,
"player1": 1,
"sets": {
$filter: {
input: "$sets",
as: "set",
cond: {
$eq: ["$$set.tiebreak", false]
}
}
}
}
}
2) Aggregation BETA: this aggregation is purposely strange, in the sense that it is designed to find all matches where at least 1 tiebreak is false but only show sets where tiebreak is true. Please don't consider the logic of it, it is crafted to allow full freedom to the user. Please note that player1 is hidden from the results.
{
$match: {
"tournament": "Roland Garros",
"sets.tiebreak": false
}
},
{
$project: {
"tournament": 1,
"sets": {
$filter: {
input: "$sets",
as: "set",
cond: {
$eq: ["$$set.tiebreak", true]
}
}
}
}
}
Now suppose that the purpose of these two aggregations is to delimit which part of the database a user can see, in the sense that the two queries define all the documents (and details) that are visible to the user. This is similar to two SQL views that the user has rights to access.
I would like to rewrite the two distinct aggregations above as a single one. Can this be achieved?
It is mandatory to keep all the restrictions set in Aggregations ALFA and BETA, without losing any control over the data and without leaking any data that was not available through either aggregation.
Specifically, matches in Wimbledon can only be seen if they had at least one set which ended with a tiebreak. The player1 field CAN be seen. Single sets must be seen if they did not end with a tiebreak and hidden otherwise. If needed, it is acceptable, but not desirable, to not see player1 at all.
Conversely, matches in Roland Garros can be seen only if they had at least one set which ended without a tiebreak. The player1 field MUST be hidden. Single sets must be seen if they ended with a tiebreak and hidden otherwise.
Again, the purpose is to UNION the two aggregations while keeping the limits imposed by the two aggregations.
MongoDB is version 3.5, can be upgraded to unstable releases if needed.
Here are my two cents on the issue:
If you wish to avoid empty sets arrays when
a "Wimbledon" doc has all-true tiebreaks,
or a "Roland Garros" doc has all-false tiebreaks,
you can reshape the match query:
...
{
$and: [{
"sets.tiebreak": true,
}, {
"sets.tiebreak": false
}],
$or: [{
"tournament": "Wimbledon"
}, {
"tournament": "Roland Garros"
}]
}
...
and use it in either:
an aggregate pipeline: http://pastebin.com/cM6mNsuC
mapReduce (if performance is not a big issue): http://pastebin.com/MShihSQL
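As a way to reason about what the combined query must return, here is a pure-Python sketch of the visibility rules from aggregations ALFA and BETA in the question. It is not the pipeline behind the pastebin links and does not call MongoDB; RULES and visible are hypothetical names, and the sample matches keep only the tiebreak flag of each set from the question's data.

```python
# Per-tournament rules read off aggregations ALFA and BETA:
#   need    -> tiebreak value at least one set must have for the match to be seen
#   show    -> tiebreak value of the sets that may be exposed
#   player1 -> whether the player1 field may be exposed
RULES = {
    "Wimbledon": {"need": True, "show": False, "player1": True},
    "Roland Garros": {"need": False, "show": True, "player1": False},
}


def visible(matches):
    out = []
    for m in matches:
        rule = RULES.get(m["tournament"])
        flags = [s["tiebreak"] for s in m["sets"]]
        # $match part: drop unknown tournaments and matches that lack the
        # required tiebreak value.
        if rule is None or rule["need"] not in flags:
            continue
        # $project part: keep only the sets this tournament may expose.
        doc = {"tournament": m["tournament"],
               "sets": [s for s in m["sets"] if s["tiebreak"] == rule["show"]]}
        if rule["player1"]:
            doc["player1"] = m["player1"]
        out.append(doc)
    return out


matches = [
    {"tournament": "Wimbledon", "player1": "Agassi",
     "sets": [{"tiebreak": False}, {"tiebreak": True}, {"tiebreak": True}]},
    {"tournament": "Wimbledon", "player1": "Ivanisevic",
     "sets": [{"tiebreak": False}, {"tiebreak": False}, {"tiebreak": False}]},
    {"tournament": "Roland Garros", "player1": "Navratilova",
     "sets": [{"tiebreak": False}, {"tiebreak": False},
              {"tiebreak": True}, {"tiebreak": False}]},
]
result = visible(matches)
```

Keeping the rules in one table per tournament makes it harder to leak a field by accident: anything a rule does not explicitly allow is never copied into the output document.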

Remove an array element from an array of sub documents

{
"_id" : ObjectId("5488303649f2012be0901e97"),
"user_id":3,
"my_shopping_list" : {
"books" : [ ]
},
"my_library" : {
"books" : [
{
"date_added" : ISODate("2014-12-10T12:03:04.062Z"),
"tag_text" : [
"english"
],
"bdata_product_identifier" : "a1",
"tag_id" : [
"fa7ec571-4903-4aed-892a-011a8a411471"
]
},
{
"date_added" : ISODate("2014-12-10T12:03:08.708Z"),
"tag_text" : [
"english",
"hindi"
],
"bdata_product_identifier" : "a2",
"tag_id" : [
"fa7ec571-4903-4aed-892a-011a8a411471",
"60733993-6b54-420c-8bc6-e876c0e196d6"
]
}
]
},
"my_wishlist" : {
"books" : [ ]
},
}
Here I would like to remove only "english" from every tag_text array in my_library, using only user_id and tag_text. This document belongs to user_id: 3. I have tried some queries, but they delete an entire book sub-document. Thank you.
Since you are using pymongo, and MongoDB doesn't provide a nice way of doing this (the positional $ operator only pulls "english" from the first matching subdocument), why not write a script that removes "english" from every tag_text and then updates your document?
Demo:
>>> doc = yourcollection.find_one(
...     {'user_id': 3, 'my_library.books': {'$exists': True}},
...     {'_id': 0, 'user_id': 0})
>>> books = doc['my_library']['books']  # books field in your doc
>>> new_books = []
>>> for k in books:
...     for x, y in k.items():
...         if x == 'tag_text' and 'english' in y:
...             y.remove('english')
...         new_books.append({x: y})
...
>>> new_books
[{'tag_text': []}, {'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471']}, {'bdata_product_identifier': 'a1'}, {'date_added': datetime.datetime(2014, 12, 10, 12, 3, 4, 62000)}, {'tag_text': ['hindi']}, {'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471', '60733993-6b54-420c-8bc6-e876c0e196d6']}, {'bdata_product_identifier': 'a2'}, {'date_added': datetime.datetime(2014, 12, 10, 12, 3, 8, 708000)}]
>>> yourcollection.update_one({'user_id': 3}, {'$set': {'my_library.books': new_books}})
Check that everything works:
>>> yourcollection.find_one({'user_id' : 3})
{'user_id': 3.0, '_id': ObjectId('5488303649f2012be0901e97'), 'my_library': {'books': [{'tag_text': []}, {'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471']}, {'bdata_product_identifier': 'a1'}, {'date_added': datetime.datetime(2014, 12, 10, 12, 3, 4, 62000)}, {'tag_text': ['hindi']}, {'tag_id': ['fa7ec571-4903-4aed-892a-011a8a411471', '60733993-6b54-420c-8bc6-e876c0e196d6']}, {'bdata_product_identifier': 'a2'}, {'date_added': datetime.datetime(2014, 12, 10, 12, 3, 8, 708000)}]}, 'my_shopping_list': {'books': []}, 'my_wishlist': {'books': []}}
Another possible solution is to repeat
db.collection.update({user_id: 3, "my_library.books.tag_text": "english"}, {$pull: {"my_library.books.$.tag_text": "english"}})
until MongoDB can no longer match a document to update.
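Note that the Python demo earlier in this thread appends every key of every book as its own single-key dict, so the saved books array no longer contains whole book documents (this is visible in its output). A minimal sketch of a client-side variant that keeps each book intact is shown below; strip_tag is a hypothetical helper, the sample books are trimmed from the question, and the result would still have to be written back with a $set on my_library.books.

```python
import copy


def strip_tag(books, tag):
    """Return a copy of books with `tag` removed from every tag_text list."""
    cleaned = copy.deepcopy(books)  # leave the fetched document untouched
    for book in cleaned:
        if "tag_text" in book:
            book["tag_text"] = [t for t in book["tag_text"] if t != tag]
    return cleaned


# Trimmed version of my_library.books from the question.
books = [
    {"bdata_product_identifier": "a1", "tag_text": ["english"]},
    {"bdata_product_identifier": "a2", "tag_text": ["english", "hindi"]},
]
new_books = strip_tag(books, "english")
```

On MongoDB 3.6 or later, the all-positional operator $[] may let a single $pull on "my_library.books.$[].tag_text" do the same work on the server; treat that as an assumption to check against your server version.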

Updating array with push and slice

I have just started to play with MongoDB and have some questions about how to update documents in the database. I insert two documents into my db with
db.userscores.insert({name: 'John Doe', email: 'john.doe#mail.com', levels : [{level: 1, hiscores: [90, 40, 25], achivements: ['capture the flag', 'it can only be one', 'apple collector', 'level complete']}, {level: 2, hiscores: [30, 25], achivements: ['level complete']}, {level: 3, hiscores: [], achivements: []}]});
db.userscores.insert({name: 'Jane Doe', email: 'jane.doe#mail.com', levels : [{level: 1, hiscores: [150, 90], achivements: ['Master of the universe', 'capture the flag', 'it can only be one', 'apple collector', 'level complete']}]});
I checked that the inserts worked with the find() command, and the result looks OK.
db.userscores.find().pretty();
{
"_id" : ObjectId("5358b47ab826096525d0ec98"),
"name" : "John Doe",
"email" : "john.doe#mail.com",
"levels" : [
{
"level" : 1,
"hiscores" : [
90,
40,
25
],
"achivements" : [
"capture the flag",
"it can only be one",
"apple collector",
"level complete"
]
},
{
"level" : 2,
"hiscores" : [
30,
25
],
"achivements" : [
"level complete"
]
},
{
"level" : 3,
"hiscores" : [ ],
"achivements" : [ ]
}
]
}
{
"_id" : ObjectId("5358b47ab826096525d0ec99"),
"name" : "Jane Doe",
"email" : "jane.doe#mail.com",
"levels" : [
{
"level" : 1,
"hiscores" : [
150,
90
],
"achivements" : [
"Master of the universe",
"capture the flag",
"it can only be one",
"apple collector",
"level complete"
]
}
]
}
How can I add/update data in my userscores? Let's say I want to add a hiscore for user John Doe on level 1. How do I insert the hiscore 75 and still keep the hiscores array sorted? Can I limit the number of hiscores so the array only contains 3 elements? I have tried with
db.userscores.aggregate(
// Initial document match (uses name, if a suitable one is available)
{ $match: {
name : 'John Doe'
}},
// Expand the levels array into a stream of documents
{ $unwind: '$levels' },
// Filter to 'level 1' scores
{ $match: {
'levels.level': 1
}},
// Add score 75 with cap/limit of 3 elements
{ $push: {
'levels.hiscore':{$each [75], $slice:-3}
}}
);
but it won't work; the error I get is "SyntaxError: Unexpected token [".
Also, how do I get the 10 highest scores from all users on level 1, for example? Is my document schema OK, or is there a better schema for storing users' hiscores and achievements on different levels of my game? Are there any downsides for querying or performance with the schema above?
You can add the score with this statement:
db.userscores.update(
{ "name": "John Doe", "levels.level": 1 },
{ "$push": { "levels.$.hiscores": 75 } } )
This will not sort the array, because sorting is only supported when the array elements are documents.
From MongoDB 2.6 on, you can also sort arrays of non-document values:
db.userscores.update(
{ "name": "John Doe", "levels.level": 1 },
{ "$push": { "levels.$.hiscores": { $each: [ 75 ], $sort: -1, $slice: 3 } } } )
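To make the effect of the $each, $sort and $slice modifiers concrete, here is a pure-Python sketch that imitates them on John Doe's level 1 scores. It does not call MongoDB; push_sorted_capped is a hypothetical helper name and only handles a positive slice limit.

```python
def push_sorted_capped(scores, new_scores, limit):
    """Imitate $push with $each, $sort: -1 and a positive $slice on a list."""
    merged = scores + list(new_scores)  # $each appends every new value
    merged.sort(reverse=True)           # $sort: -1 orders descending
    return merged[:limit]               # $slice: limit keeps the first `limit`


# John Doe's level 1 hiscores from the question, plus the new score 75.
hiscores = push_sorted_capped([90, 40, 25], [75], 3)
```

With a descending sort, a positive $slice keeps the highest scores. The $slice: -3 from the question's attempt would keep the last three elements instead, which after a descending sort are the lowest three.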