I am trying to retrieve, as JSON from a MongoDB database, how many attempts a user takes to solve a particular problem. If there are multiple attempts on the same problem, I would like to pull out only the last entry. For instance, right now db.proficiencies.find() returns entries A, B, C, and D, but I would like only entries B and D (the latest entries for the problems maze and circle, respectively).
Is there an easy way to do so?
Entry A
{
"problem": "maze",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-22T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 1
}
Entry B
{
"problem": "maze",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-27T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 1
}
Entry C
{
"problem": "circle",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-22T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 2
}
Entry D
{
"problem": "circle",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-27T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 4
}
var ProficiencySchema = new Schema({
problem: String
, numberOfAttemptsBeforeSolved: {type: Number, default: 0}
//refers to which lesson, e.g. elementary_one, 2 refers to lesson 2 of elementary_one
, courseLesson: String
, date: {type: Date, default: Date.now}
, studentId: Schema.Types.ObjectId
, studentName: String
})
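One way to do this is to sort the results in descending date-time order (so the latest response comes first) and then limit the result set to one. Note that this returns a single document, so to get the latest entry for a given problem, the query has to filter on that problem. This would look something like: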
db.proficiencies.find(YOUR QUERY).sort({'date': -1}).limit(1)
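If the goal is the latest entry for every problem in one round trip (entries B and D in the question), one possible approach is an aggregation that sorts by date and keeps the first document of each group. The following is only a sketch under assumptions: latestPerProblemPipeline and latestPerProblem are hypothetical helper names, the field names are taken from the schema above, and the plain-JavaScript function merely mimics the pipeline's logic on in-memory data.

```javascript
// Hypothetical helper: builds a pipeline that keeps the newest entry per problem.
// Assumes the field names from the schema above (studentId, problem, date).
function latestPerProblemPipeline(studentId) {
  return [
    { $match: { studentId: studentId } },
    { $sort: { date: -1 } }, // newest first
    // After the sort, $first picks the newest document in each problem group.
    { $group: { _id: "$problem", latest: { $first: "$$ROOT" } } }
  ];
}

// Plain-JavaScript illustration of the same logic on an in-memory array.
function latestPerProblem(entries) {
  const newest = new Map();
  for (const entry of entries) {
    const seen = newest.get(entry.problem);
    if (!seen || new Date(entry.date) > new Date(seen.date)) {
      newest.set(entry.problem, entry);
    }
  }
  return Array.from(newest.values());
}
```

In the shell this would be used roughly as db.proficiencies.aggregate(latestPerProblemPipeline(someStudentId)), where someStudentId is an ObjectId value, since that is how the schema stores studentId.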
Related
So I have a data structure in a Mongo collection (v. 4.0.18) that looks something like this…
{
"_id": ObjectId("242kl4j2lk23423"),
"name": "Doug",
"kids": [
{
"name": "Alice",
"age": 15,
},
{
"name": "James",
"age": 13,
},
{
"name": "Michael",
"age": 10,
},
{
"name": "Sharon",
"age": 8,
}
]
}
In Mongo, how would I get back a projection of this object with only the first two kids? I want the output to look like this:
{
"_id": ObjectId("242kl4j2lk23423"),
"name": "Doug",
"kids": [
{
"name": "Alice",
"age": 15,
},
{
"name": "James",
"age": 13,
}
]
}
It seems like I should easily be able to get them by index, but I'm not seeing anything in the docs about how to do that. The real-world problem I'm trying to solve has nothing to do with kids, and the array could be quite lengthy. I'm trying to break it up and process it in batches without having to load the whole thing into memory in my application.
EDIT (non-sequential indexes):
I noticed that since I asked about item 1 & 2 that $slice would suffice…however, what if I wanted items 1 & 3? Is there a way I can specify specific array indexes to return?
Any ideas or pointers for how to accomplish that?
Thanks!
You are looking for the $slice projection operator, which works when the desired elements are adjacent to each other.
https://docs.mongodb.com/manual/reference/operator/projection/slice/
This would return the first 2
client.db.collection.find({"name":"Doug"}, { "kids": { "$slice": 2 } })
returns
{'_id': ObjectId('5f85f682a45e15af3a907f51'), 'name': 'Doug', 'kids': [{'name': 'Alice', 'age': 15}, {'name': 'James', 'age': 13}]}
this would skip the first kid and return the next two (second and third)
client.db.collection.find({"name":"Doug"}, { "kids": { "$slice": [1, 2] } })
returns
{'_id': ObjectId('5f85f682a45e15af3a907f51'), 'name': 'Doug', 'kids': [{'name': 'James', 'age': 13}, {'name': 'Michael', 'age': 10}]}
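For the batch-processing use case mentioned in the question, the [skip, limit] form of $slice can be stepped through the array one window at a time. The sketch below only builds the projection documents. sliceBatches is a hypothetical helper name, and it assumes the array length is known in advance (for example, stored on the document or computed separately).

```javascript
// Hypothetical helper: returns one $slice projection per batch of an array field.
function sliceBatches(field, totalLength, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  const projections = [];
  for (let skip = 0; skip < totalLength; skip += batchSize) {
    // [skip, limit]: skip the first `skip` elements, then return up to `batchSize`.
    projections.push({ [field]: { $slice: [skip, batchSize] } });
  }
  return projections;
}
```

Each returned object can be passed as the projection argument of a find call, so only one window of the array is loaded at a time.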
Edit:
Arbitrary selections such as elements 1 and 3 probably need to go through an aggregation pipeline rather than a simple query. Performance shouldn't differ much, assuming you have an index on the $match field.
The steps of the pipeline should be fairly straightforward, and you should be able to take it from here.
I hate to point to RTFM, but it will be very helpful here to be at least acquainted with the pipeline operations.
https://docs.mongodb.com/manual/reference/operator/aggregation/
Your pipeline should:
$match on your desired query
$set a new field, kid_selection, to element 1 (the second element) and element 3 (the fourth element), since counting starts at 0. Notice the $ prefix on the "kids" key name in the kid_selection setter: when referencing a key in the document you're working on, you need to prefix it with $. Note that the $set stage was added in MongoDB 4.2; on the 4.0.18 mentioned in the question, use its equivalent, $addFields.
$project the whole document, minus the original kids field that we've selected from
client.db.collection.aggregate([
{"$match":{"name":"Doug"}},
{"$set": {"kid_selection": [
{ "$arrayElemAt": [ "$kids", 1 ] },
{ "$arrayElemAt": [ "$kids", 3 ] }
]}},
{ "$project": { "kids": 0 } }
])
returns
{
'_id': ObjectId('5f86038635649a988cdd2ade'),
'name': 'Doug',
'kid_selection': [
{'name': 'James', 'age': 13},
{'name': 'Sharon', 'age': 8}
]
}
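To select an arbitrary list of indexes without writing each $arrayElemAt by hand, the stage can be generated. This is a sketch, not a definitive implementation: pickIndexesStage is a hypothetical helper name, and it emits $addFields rather than $set so that it also applies to servers older than 4.2.

```javascript
// Hypothetical helper: builds an $addFields stage that copies the elements
// at the given zero-based indexes of sourceField into targetField.
function pickIndexesStage(sourceField, targetField, indexes) {
  return {
    $addFields: {
      [targetField]: indexes.map((i) => ({
        $arrayElemAt: ["$" + sourceField, i] // "$kids" refers to the kids field
      }))
    }
  };
}
```

For example, pickIndexesStage("kids", "kid_selection", [1, 3]) produces a stage with the same two $arrayElemAt expressions as the pipeline above.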
I'm trying to search for any value that matches a "name" field, inside any object at any nesting level in a MongoDB collection.
My BSON looks like this:
{
"name": "a",
"sub": {
"name": "b",
"sub": {
"name": "c",
"sub": [{
"name": "d"
},{
"name": "e",
"sub": {
"name": "f"
}
}]
}
}
}
I've created an index with db.collection.createIndex({"name": "text"}); and it seems to work, because it has created more than one.
{
"numIndexesBefore" : 1,
"numIndexesAfter" : 6,
"note" : "all indexes already exist",
"ok" : 1
}
But when I use db.collection.find({$text: {$search : "b"}}); to search, it does not work: it only searches the first level.
I cannot query an exact path, because the dimensions of the objects/arrays are dynamic and can grow or shrink at any time.
I appreciate your answers.
MongoDB cannot build an index on arbitrarily-nested objects. An index covers only the field path you specify. In your case, the $text search will check only the top-level name field, not the name field of any of the nested sub-documents. This is an inherent limitation of indexing.
To my knowledge, MongoDB has no support for handling these kinds of deeply-nested data structures. You really need to break your data out into separate documents in order to handle it correctly. For example, you could break it out into the following:
[
{
"_id": 0,
"name": "a",
"root_id": null,
"parent_id": null
},
{
"_id": 1,
"name": "b",
"root_id": 0,
"parent_id": 0
},
{
"_id": 2,
"name": "c",
"root_id": 0,
"parent_id": 1
},
{
"_id": 3,
"name": "d",
"root_id": 0,
"parent_id": 2
},
{
"_id": 4,
"name": "e",
"root_id": 0,
"parent_id": 2
},
{
"_id": 5,
"name": "f",
"root_id": 0,
"parent_id": 4
}
]
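A nested document like the one in the question could be broken out into this flat form with a small recursive walk. The sketch below is illustrative only: flatten is a hypothetical helper name, and it assumes every node has a name and an optional sub that is either a single object or an array, as in the question's sample.

```javascript
// Hypothetical helper: flattens a nested {name, sub} tree into documents
// carrying _id, name, root_id and parent_id, numbered in visiting order.
function flatten(root) {
  const docs = [];
  let nextId = 0;

  function visit(node, parentId, rootId) {
    const id = nextId++;
    docs.push({ _id: id, name: node.name, root_id: rootId, parent_id: parentId });
    // Children of the root point at the root's own _id; deeper nodes inherit it.
    const childRootId = rootId === null ? id : rootId;
    const children =
      node.sub === undefined ? [] : Array.isArray(node.sub) ? node.sub : [node.sub];
    for (const child of children) {
      visit(child, id, childRootId);
    }
  }

  visit(root, null, null);
  return docs;
}
```

In a real collection, the _id values would more likely be generated by MongoDB than by a counter.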
In the above structure, our original query db.collection.find({$text: {$search : "b"}}); will now return the following document:
{
"_id": 1,
"name": "b",
"root_id": 0,
"parent_id": 0
}
From here we can retrieve all related documents by retrieving the root_id value and finding all documents with an _id or root_id matching this value:
db.collection.find({
$or: [
{_id: 0},
{root_id: 0}
]
});
Finding all root-level documents is a simple matter of matching on root_id: null.
The drawback, of course, is that now you need to assemble these documents manually after retrieval by matching a document's parent_id with another document's _id because the hierarchical information has been abstracted away. Using a $graphLookup could help alleviate this somewhat by matching each subdocument with a list of ancestors, but you would still need to determine the nesting order manually.
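The manual assembly step could look roughly like the following once the related documents have been fetched. This is a sketch under assumptions: buildTree is a hypothetical helper name, and it always represents children as an array under sub, unlike the original documents, where a single child was stored as a plain object.

```javascript
// Hypothetical helper: rebuilds the hierarchy from flat documents by matching
// each document's parent_id against another document's _id.
function buildTree(docs) {
  const byId = new Map();
  for (const doc of docs) {
    byId.set(doc._id, { name: doc.name, sub: [] });
  }
  let root = null;
  for (const doc of docs) {
    const node = byId.get(doc._id);
    if (doc.parent_id === null) {
      root = node; // the root-level document has no parent
    } else {
      const parent = byId.get(doc.parent_id);
      if (parent) {
        parent.sub.push(node); // documents whose parent was not fetched are skipped
      }
    }
  }
  return root;
}
```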
Regardless of how you choose to structure your documents moving forward, this sort of restructure is going to be needed if you're going to query on arbitrarily-nested content. I would encourage you to consider different possibilities and determine which is most suited for your specific application needs.
I upgraded Wekan from 0.48 to 0.95. It looks like what happened in Mongo is that it took the checklists collection, which contained a nested list of items, and split the items out into a new checklistItems collection. It appears to have copied the data correctly, except that instead of copying each item's title, it copied the checklist title to each item.
I started with this in wekan.checklists:
{
"_id": "z329QEDfjsuQcxz7E",
"cardId": "TBgz6gMGCcn9XNPSW",
"title": "A list",
"sort": 0,
"createdAt": {
"$date": "2018-05-09T22:20:50.537Z"
},
"items": [
{
"_id": "z329QEDfjsuQcxz7E0",
"title": "Do some stuff",
"isFinished": false,
"sort": 0
},
{
"_id": "z329QEDfjsuQcxz7E1",
"title": "Do some other stuff",
"isFinished": false,
"sort": 1
}
],
"userId": "YndMrPQ5XhZTTKD2S"
}
and wound up with the following in wekan.checklistItems:
{
"_id": "RADPEu4nhr9PgwPHH",
"title": "A list",
"sort": 0,
"isFinished": false,
"checklistId": "z329QEDfjsuQcxz7E",
"cardId": "TBgz6gMGCcn9XNPSW"
}
{
"_id": "Guy3aaJL4WLJQjzRX",
"title": "A list",
"sort": 1,
"isFinished": false,
"checklistId": "z329QEDfjsuQcxz7E",
"cardId": "TBgz6gMGCcn9XNPSW"
}
and this in wekan.checklists:
{ "_id" : "z329QEDfjsuQcxz7E", "cardId" : "TBgz6gMGCcn9XNPSW", "title" : "MVP", "sort" : 0, "createdAt" : ISODate("2018-05-09T22:20:50.537Z"), "userId" : "YndMrPQ5XhZTTKD2S" }
Is there a quick query to go back through my original wekan.checklists and update the titles in wekan.checklistItems? I note that the checklistIds stayed the same but the card ids are different. I can of course load the old wekan.checklists collection into my current (upgraded) db to query against.
Fix: load your old db.checklists into db.checklistsOld (I used mongoimport -d wekan -c checklistsOld ~/checklistsOld.bson, where checklistsOld.bson held my backup from before the upgrade). Then use the following script in Robo3T:
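// Walk every old checklist and copy each item's title back onto the
// matching document in the new checklistItems collection.
db.checklistsOld.find({}, {"_id": 1, "items.title": 1, "items.sort": 1}).forEach((list) => {
  var checklistId = list._id;
  // Guard against old checklists that had no items array.
  (list.items || []).forEach((item) => {
    // Items are matched by parent checklist and sort position.
    db.checklistItems.update(
      {"checklistId": checklistId, "sort": item.sort},
      {$set: {"title": item.title}}
    );
  });
});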
Depending on how many items you have, you may need to adjust "shellTimeoutSec" in Robo3T (https://github.com/Studio3T/robomongo/wiki/Robomongo-Config-File-Guide)
I have this document structure in MongoDB.
I need to find and update a song for a given artist, album id
When someone likes a song, I need to increase the likes count and add the user to the votes list.
Sample document:
artists = [{
"_id": 1,
"name": "Bob",
"albums": [{
"_id": 3,
"numSongs" : 2,
"songs": [{
"_id": 4,
"title": "Song 1",
"numLikes": 2,
"votes" : [{"usr":"John"}, {"usr": "Steve"}]
},
{
"_id": 5,
"title": "Song 2",
"numLikes": 3,
"votes" : [{"usr": "Mark"}, {"usr": "Ken"}, {"usr": "Luke"}]
}]
}]
}]
This is what I have but it is not working.
//Note artists id, album id and song id are passed correctly
var query = Query.And( Query.EQ("_id", artistId), Query.EQ("albums._id", albumId), Query.EQ("albums.$.songs._id", songId));
var update = Update.Push("albums.$.songs.votes", newVote.ToBsonDocument()).Inc("albums.$.songs.numLikes", 1);
WriteConcernResult result = Artists.Update(query, update);
numDocsUpdated = result.DocumentsAffected;//This is always 0
The document is like below.
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-05,
"score": 10,
},
{
"date": 2013-06-06,
"score": 21,
},
]
}
The daily active score is intended to increase each time the book is opened by a reader. The first solution that comes to mind is to use "$" to find whether the target date has a score or not, and handle each case.
err = bookCollection.Update(
{"title":"Book1", "dailyactivescore.date": 2013-06-06},
{"$inc":{"dailyactivescore.$.score": 1}})
if err == ErrNotFound {
bookCollection.Update({"title":"Book1"}, {"$push":...})
}
But I cannot help wondering: is there any way to return the index of an item inside the array? If so, I could use one query to do the job rather than two. Like this.
index = bookCollection.Find(
{"title":"Book1", "dailyactivescore.date": 2013-06-06}).Select({"$index"})
if index != -1 {
incTarget = FormatString("dailyactivescore.%d.score", index)
bookCollection.Update(..., {"$inc": {incTarget: 1}})
} else {
//push here
}
Incrementing a field that's not present isn't the issue as doing $inc:1 on it will just create it and set it to 1 post-increment. The issue is when you don't have an array item corresponding to the date you want to increment.
There are several possible solutions here (that don't involve multiple steps to increment).
One is to pre-create all the dates in the array elements with scores:0 like so:
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-01,
"score": 0,
},
{
"date": 2013-06-02,
"score": 0,
},
{
"date": 2013-06-03,
"score": 0,
},
{
"date": 2013-06-04,
"score": 0,
},
{
"date": 2013-06-05,
"score": 0,
},
{
"date": 2013-06-06,
"score": 0
}, { etc ... }
]
}
But how far into the future to go? So one option here is to "bucket" - for example, have an activities document "per month" and before the start of a month have a job that creates the new documents for next month. Slightly yucky. But it'll work.
Other options involve slight changes in schema.
You can use a collection with book, date, activity_scores. Then you can use a simple upsert to increment a score:
db.books.update({title:"Book1", date:"2013-06-02"}, {$inc:{score:1}}, {upsert:true})
This will increment the score or insert new record with score:1 for this book and date and your collection will look like this:
{
"title": "Book1",
"date": 2013-06-01,
"score": 10,
},
{
"title": "Book1",
"date": 2013-06-02,
"score": 1,
}, ...
Depending on how much you simplified your example from your real use case, this might work well.
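To make the upsert behaviour concrete, here is a small in-memory model of what an update with $inc and upsert:true does to such a collection. It is an illustration only and does not talk to MongoDB: incrementScore is a hypothetical name, and the collection is just a JavaScript array.

```javascript
// In-memory model of "increment the score, or insert it with score 1":
// mirrors an update with {$inc: {score: 1}} and {upsert: true}.
function incrementScore(collection, title, date) {
  let doc = collection.find((d) => d.title === title && d.date === date);
  if (!doc) {
    // No document for this book and date yet: the upsert creates one.
    doc = { title: title, date: date, score: 0 };
    collection.push(doc);
  }
  doc.score += 1; // $inc on a missing field starts from 0
  return doc;
}
```

Against a real server the single update call does both steps atomically. A unique index on the title and date pair is a common safeguard against duplicate documents when concurrent upserts race.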
Another option is to keep the scores embedded in the book document but switch to using the date string as a key that you increment:
Schema:
{
"title": "Book1",
"dailyactiviescores":{
"2013-06-01":10,
"2013-06-02":8
}
}
Note it's now a subdocument and not an array and you can do:
db.books.update({title:"Book1"}, {$inc:{"dailyactiviescores.2013-06-03":1}})
and it will add a new date into the subdocument and increment it resulting in:
{
"title": "Book1",
"dailyactiviescores":{
"2013-06-01":10,
"2013-06-02":8,
"2013-06-03":1
}
}
Note it's now harder to "add-up" the scores for the book so you can atomically also update a "subtotal" in the same update statement whether it's for all time or just for the month.
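Because $inc accepts several fields at once, the per-day key and a running total can be bumped in the same update. The sketch below only builds the update document. dailyScoreUpdate is a hypothetical helper name, total is a hypothetical subtotal field, and the dailyactiviescores key follows the spelling used in the sample documents.

```javascript
// Hypothetical helper: builds one $inc that updates the per-day score and a
// running total together, so both change in the same atomic update.
function dailyScoreUpdate(date, amount) {
  return {
    $inc: {
      ["dailyactiviescores." + date]: amount, // e.g. "dailyactiviescores.2013-06-03"
      total: amount // hypothetical subtotal field, not in the original schema
    }
  };
}
```

It would be passed as the second argument of the update, for example db.books.update({title:"Book1"}, dailyScoreUpdate("2013-06-03", 1)).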
But here it's once again problematic to keep adding days to this subdocument - what happens when you're still around in a few years and these book documents grow hugely?
I suspect that unless you will only be keeping activity scores for the last N days (which you can do with capped array feature in 2.4) it will be simpler to have a separate collection for book-activity-score tracking where each book-day is a separate document than to embed the scores for each day into the book in a collection of books.
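For reference, the capped-array feature mentioned above is the $each and $slice modifiers of $push. A minimal sketch of building such an update follows. cappedPush is a hypothetical helper name, and it covers only appending a new day's entry, not incrementing an existing one.

```javascript
// Hypothetical helper: builds a $push that appends one entry and then trims
// the array to its last `keepLast` elements (a negative $slice keeps the tail).
function cappedPush(field, entry, keepLast) {
  if (!Number.isInteger(keepLast) || keepLast <= 0) {
    throw new Error("keepLast must be a positive integer");
  }
  return {
    $push: {
      [field]: { $each: [entry], $slice: -keepLast }
    }
  };
}
```

For example, cappedPush("dailyactiviescores", {date: "2013-06-07", score: 1}, 30) would keep only the 30 most recently pushed entries.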
According to the docs:
The $inc operator increments a value of a field by a specified amount.
If the field does not exist, $inc sets the field to the specified
amount.
So, if the matching array item has no score field, $inc will set it to 1 in your case, like this:
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-05,
"score": 10,
},
{
"date": 2013-06-06,
},
]
}
bookCollection.Update(
{"title":"Book1", "dailyactivescore.date": 2013-06-06},
{"$inc":{"dailyactivescore.$.score": 1}})
will result in:
{
"title": "Book1",
"dailyactiviescores":[
{
"date": 2013-06-05,
"score": 10,
},
{
"date": 2013-06-06,
"score": 1
},
]
}
Hope that helps.