How to mass update in MongoDB after corruption

I upgraded Wekan from 0.48 to 0.95. It looks like the upgrade took the checklists collection, which contained a nested list of items, and split the items out into a new checklistItems collection. The data appears to have been copied correctly, except that instead of copying each item's title, it copied the checklist title to each item.
I started with this in wekan.checklists:
{
"_id": "z329QEDfjsuQcxz7E",
"cardId": "TBgz6gMGCcn9XNPSW",
"title": "A list",
"sort": 0,
"createdAt": {
"$date": "2018-05-09T22:20:50.537Z"
},
"items": [
{
"_id": "z329QEDfjsuQcxz7E0",
"title": "Do some stuff",
"isFinished": false,
"sort": 0
},
{
"_id": "z329QEDfjsuQcxz7E1",
"title": "Do some other stuff",
"isFinished": false,
"sort": 1
}
],
"userId": "YndMrPQ5XhZTTKD2S"
}
and wound up with the following in wekan.checklistItems:
{
"_id": "RADPEu4nhr9PgwPHH",
"title": "A list",
"sort": 0,
"isFinished": false,
"checklistId": "z329QEDfjsuQcxz7E",
"cardId": "TBgz6gMGCcn9XNPSW"
}
{
"_id": "Guy3aaJL4WLJQjzRX",
"title": "A list",
"sort": 1,
"isFinished": false,
"checklistId": "z329QEDfjsuQcxz7E",
"cardId": "TBgz6gMGCcn9XNPSW"
}
and this in wekan.checklists:
{ "_id" : "z329QEDfjsuQcxz7E", "cardId" : "TBgz6gMGCcn9XNPSW", "title" : "MVP", "sort" : 0, "createdAt" : ISODate("2018-05-09T22:20:50.537Z"), "userId" : "YndMrPQ5XhZTTKD2S" }
Is there a quick query to go back through my original wekan.checklists and update the titles in wekan.checklistItems? I note that the checklist IDs stayed the same but the card IDs are different. I can of course load the old wekan.checklists collection into my current (upgraded) db to query against.

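Fix: load your old db.checklists into db.checklistsOld. (I used mongoimport -d wekan -c checklistsOld ~/checklistsOld.bson, where checklistsOld.bson held my backup from before the upgrade. Note that mongoimport reads JSON, CSV, or TSV, so a BSON dump made with mongodump would need mongorestore instead.) Then use the following script in Robo3T: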
db.checklistsOld.find({}, {"_id": 1, "items.title": 1, "items.sort": 1}).forEach((list) => {
  var checklistId = list._id;
  // Skip old checklists that have no items array.
  (list.items || []).forEach((item) => {
    // Match each new item by its parent checklist and its sort position.
    db.checklistItems.update({"checklistId": checklistId, "sort": item.sort}, {$set: {"title": item.title}});
  });
});
Depending on how many items you have, you may need to adjust "shellTimeoutSec" in Robo3T (https://github.com/Studio3T/robomongo/wiki/Robomongo-Config-File-Guide)
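Before running the update against live data, it may help to preview which filter/update pairs the script would generate. Here is a minimal sketch in plain JavaScript (no database needed), assuming the old documents have the shape shown in the question. planTitleUpdates is a hypothetical helper name, not part of Wekan or MongoDB.

```javascript
// Hypothetical helper: builds the list of filter/update pairs that the
// script above would send to checklistItems, without touching a database.
function planTitleUpdates(oldChecklists) {
  const plan = [];
  for (const list of oldChecklists) {
    // Checklists without an items array contribute nothing to the plan.
    for (const item of list.items || []) {
      plan.push({
        filter: { checklistId: list._id, sort: item.sort },
        update: { $set: { title: item.title } },
      });
    }
  }
  return plan;
}

// Sample data copied from the question (only the fields the script reads).
const oldChecklists = [
  {
    _id: "z329QEDfjsuQcxz7E",
    items: [
      { title: "Do some stuff", sort: 0 },
      { title: "Do some other stuff", sort: 1 },
    ],
  },
];

const plan = planTitleUpdates(oldChecklists);
console.log(JSON.stringify(plan, null, 2));
```

If two items in the same checklist ever shared a sort value, the plan would contain two entries with the same filter, which is a quick way to spot ambiguous matches before updating.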

Related

How to search for child objects inside parent objects in MongoDB?

I'm trying to search any value that match with a "name" param, inside any object with any level in a MongoDB collection.
My BSON looks like this:
{
"name": "a",
"sub": {
"name": "b",
"sub": {
"name": "c",
"sub": [{
"name": "d"
},{
"name": "e",
"sub": {
"name": "f"
}
}]
}
}
}
I've created an index with db.collection.createIndex({"name": "text"}); and it seems to work, because it has created more than one.
{
"numIndexesBefore" : 1,
"numIndexesAfter" : 6,
"note" : "all indexes already exist",
"ok" : 1
}
But, when I use this db.collection.find({$text: {$search : "b"}}); to search, it does not work. It just searches at the first level.
I cannot search a precise path, because the dimensions of the objects/arrays are dynamic and can grow or shrink at any time.
I appreciate your answers.

MongoDB cannot build an index on arbitrarily-nested objects. An index covers only the field paths specified when it is created. In your case, the $text search will only check the top-level name field, not the name field of any nested sub-document. This is an inherent limitation of indexing.
To my knowledge, MongoDB has no support for handling these kinds of deeply-nested data structures. You really need to break your data out into separate documents in order to handle it correctly. For example, you could break it out into the following:
[
{
"_id": 0,
"name": "a",
"root_id": null,
"parent_id": null
},
{
"_id": 1,
"name": "b",
"root_id": 0,
"parent_id": 0
},
{
"_id": 2,
"name": "c",
"root_id": 0,
"parent_id": 1
},
{
"_id": 3,
"name": "d",
"root_id": 0,
"parent_id": 2
},
{
"_id": 4,
"name": "e",
"root_id": 0,
"parent_id": 2
},
{
"_id": 5,
"name": "f",
"root_id": 0,
"parent_id": 4
}
]
In the above structure, our original query db.collection.find({$text: {$search : "b"}}); will now return the following document:
{
"_id": 1,
"name": "b",
"root_id": 0,
"parent_id": 0
}
From here we can retrieve all related documents by retrieving the root_id value and finding all documents with an _id or root_id matching this value:
db.collection.find({
$or: [
{_id: 0},
{root_id: 0}
]
});
Finding all root-level documents is a simple matter of matching on root_id: null.
The drawback, of course, is that now you need to assemble these documents manually after retrieval by matching a document's parent_id with another document's _id because the hierarchical information has been abstracted away. Using a $graphLookup could help alleviate this somewhat by matching each subdocument with a list of ancestors, but you would still need to determine the nesting order manually.
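The manual assembly step can be sketched in plain JavaScript. This is only an illustration, assuming the flat documents carry _id and parent_id as in the structure above. buildTrees is a hypothetical helper name, and the sub array mirrors the field name from the question.

```javascript
// Hypothetical helper: rebuilds nested trees from flat documents that
// carry _id and parent_id, as in the restructured collection above.
function buildTrees(docs) {
  const nodes = new Map();
  for (const doc of docs) {
    nodes.set(doc._id, { _id: doc._id, name: doc.name, sub: [] });
  }
  const roots = [];
  for (const doc of docs) {
    const node = nodes.get(doc._id);
    if (doc.parent_id === null) {
      roots.push(node);
    } else {
      const parent = nodes.get(doc.parent_id);
      // Skip orphans whose parent was not part of the result set.
      if (parent) parent.sub.push(node);
    }
  }
  return roots;
}

// The six documents from the example above.
const flatDocs = [
  { _id: 0, name: "a", root_id: null, parent_id: null },
  { _id: 1, name: "b", root_id: 0, parent_id: 0 },
  { _id: 2, name: "c", root_id: 0, parent_id: 1 },
  { _id: 3, name: "d", root_id: 0, parent_id: 2 },
  { _id: 4, name: "e", root_id: 0, parent_id: 2 },
  { _id: 5, name: "f", root_id: 0, parent_id: 4 },
];

const trees = buildTrees(flatDocs);
console.log(JSON.stringify(trees, null, 2));
```

Two passes keep the logic independent of the order in which the documents come back from the query.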
Regardless of how you choose to structure your documents moving forward, this sort of restructure is going to be needed if you're going to query on arbitrarily-nested content. I would encourage you to consider different possibilities and determine which is most suited for your specific application needs.

MongoDB Project - return data only if $elemMatch Exist

Hello Good Developers,
I am facing a situation in MongoDB where I have JSON data like this:
[{
"id": "GLOBAL_EDUCATION",
"general_name": "GLOBAL_EDUCATION",
"display_name": "GLOBAL_EDUCATION",
"profile_section_id": 0,
"translated": [
{
"con_lang": "US-EN",
"country_code": "US",
"language_code": "EN",
"text": "What is the highest level of education you have completed?",
"hint": null
},
{
"con_lang": "US-ES",
"country_code": "US",
"language_code": "ES",
"text": "\u00bfCu\u00e1l es su nivel de educaci\u00f3n?",
"hint": null
}...
{
....
}
]
I am projecting result using the following query :
db.collection.find({
},
{
_id: 0,
id: 1,
general_name: 1,
translated: {
$elemMatch: {
con_lang: "US-EN"
}
}
})
here's a fiddle for the same: https://mongoplayground.net/p/I99ZXBfXIut
I want records that don't match $elemMatch not to be returned at all.
In the fiddle output, you can see that the second item doesn't have the translated attribute. In this case, I don't want the second item to be returned at all.
I am using Laravel as the backend and could filter out those records in PHP, but a lot of records are returned, and I think filtering in PHP is not the best option.

You need to use $elemMatch in the first parameter (the query filter), so that documents without a matching element are not returned:
db.collection.find({
translated: {
$elemMatch: {
con_lang: "IT-EN"
}
}
})
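Note that a filter on its own returns the whole translated array of each matching document. To also keep only the matching element, the same $elemMatch condition can be passed in both the filter and the projection. The sketch below imitates that combination in plain JavaScript on in-memory data, so the intended result is easy to check without a server. findWithTranslation is a hypothetical helper (not a MongoDB API), and the second sample record is made up for illustration.

```javascript
// In-memory imitation of:
//   db.collection.find(
//     { translated: { $elemMatch: { con_lang: lang } } },
//     { _id: 0, id: 1, general_name: 1,
//       translated: { $elemMatch: { con_lang: lang } } })
// findWithTranslation is a hypothetical helper, not a MongoDB API.
function findWithTranslation(docs, lang) {
  return docs
    // Filter step: drop documents with no matching array element.
    .filter((doc) => (doc.translated || []).some((t) => t.con_lang === lang))
    // Projection step: keep only the first matching element.
    .map((doc) => ({
      id: doc.id,
      general_name: doc.general_name,
      translated: [doc.translated.find((t) => t.con_lang === lang)],
    }));
}

const questions = [
  {
    id: "GLOBAL_EDUCATION",
    general_name: "GLOBAL_EDUCATION",
    translated: [
      { con_lang: "US-EN", text: "What is the highest level of education you have completed?" },
      { con_lang: "US-ES", text: "¿Cuál es su nivel de educación?" },
    ],
  },
  // Hypothetical second record with no US-EN translation.
  {
    id: "GLOBAL_AGE",
    general_name: "GLOBAL_AGE",
    translated: [{ con_lang: "IT-EN", text: "How old are you?" }],
  },
];

const matches = findWithTranslation(questions, "US-EN");
console.log(JSON.stringify(matches, null, 2));
```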

MongoDB - Get Names of All Keys Matching Criteria in a Collection

As the title says, I need to retrieve the names of all the keys in my MongoDB collection, BUT I need them split up based on a key/value pair that each document has. Here's my clunky analogy: If you imagine the original collection is a zoo, I need a new collection that contains all the keys Zebras have, all the keys Lions have, and all the keys Giraffes have. The different animal types share many of the same keys, but those keys are meant to be specific to each type of animal (because the user needs to be able to (for example) search for Zebras taller than 3ft and giraffes shorter than 10ft).
Here's a bit of example code that I ran which worked well - it grabbed all the unique keys in my entire collection and threw them into their own collection:
db.runCommand({
"mapreduce" : "MyZoo",
"map" : function() {
for (var key in this) { emit(key, null); }
},
"reduce" : function(key, stuff) { return null; },
"out": "MyZoo" + "_keys"
})
I'd like a version of this command that would look through the MyZoo collection for animals with "type":"zebra", find all the unique keys, and place them in a new collection (MyZoo_keys) - then do the same thing for "type":"lion" & "type":"giraffe", giving each "type" its own array of keys.
Here's the collection I'm starting with:
{
"name": "Zebra1",
"height": "300",
"weight": "900",
"type": "zebra",
"zebraSpecific1": "somevalue"
},
{
"name": "Lion1",
"height": "325",
"weight": "1200",
"type": "lion"
},
{
"name": "Zebra2",
"height": "500",
"weight": "2100",
"type": "zebra",
"zebraSpecific2": "somevalue"
},
{
"name": "Giraffe",
"height": "4800",
"weight": "2400",
"type": "giraffe",
"giraffeSpecific1": "somevalue",
"giraffeSpecific2": "someothervalue"
}
And here's what I'd like the MyZoo_keys collection to look like:
{
"zebra": [
{
"name": null,
"height": null,
"weight": null,
"type": null,
"zebraSpecific1": null,
"zebraSpecific2": null
}
],
"lion": [
{
"name": null,
"height": null,
"weight": null,
"type": null
}
],
"giraffe": [
{
"name": null,
"height": null,
"weight": null,
"type": null,
"giraffeSpecific1": null,
"giraffeSpecific2": null
}
]
}
That's probably imperfect JSON, but you get the idea...
Thanks!

You can modify your code to dump the results in a more readable and organized format.
The map function:
Emit the type of animal as the key, and an array of keys for
each animal (document). Leave out the _id field.
Code:
var map = function(){
var keys = [];
Object.keys(this).forEach(function(k){
if(k != "_id"){
keys.push(k);
}
})
emit(this.type,{"keys":keys});
}
The reduce function:
For each type of animal, consolidate and return the unique keys.
Use an object (uniqueKeys) to check for duplicates; this reduces the running
time at the cost of some memory, since each lookup is O(1).
Code:
var reduce = function(key,values){
var uniqueKeys = {};
var result = [];
values.forEach(function(value){
value.keys.forEach(function(k){
if(!uniqueKeys[k]){
uniqueKeys[k] = 1;
result.push(k);
}
})
})
return {"keys":result};
}
Invoking Map-Reduce:
db.collection.mapReduce(map,reduce,{out:"t1"});
Aggregating the result:
db.t1.aggregate([
{$project:{"_id":0,"animal":"$_id","keys":"$value.keys"}}
])
Sample output:
{
"animal" : "lion",
"keys" : [
"name",
"height",
"weight",
"type"
]
}
{
"animal" : "zebra",
"keys" : [
"name",
"height",
"weight",
"type",
"zebraSpecific1",
"zebraSpecific2"
]
}
{
"animal" : "giraffe",
"keys" : [
"name",
"height",
"weight",
"type",
"giraffeSpecific1",
"giraffeSpecific2"
]
}
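To see what the map and reduce steps compute without a running server, the same grouping can be imitated in plain JavaScript. This is a sketch assuming the documents from the question. keysByType is a hypothetical name, and a Set stands in for the uniqueKeys object used in the reduce function.

```javascript
// Hypothetical helper: collects the unique field names per animal type,
// which is what the map and reduce functions above produce together.
function keysByType(docs) {
  const seen = new Map(); // type -> Set of field names, in insertion order
  for (const doc of docs) {
    if (!seen.has(doc.type)) seen.set(doc.type, new Set());
    for (const key of Object.keys(doc)) {
      // Leave out _id, as the map function does.
      if (key !== "_id") seen.get(doc.type).add(key);
    }
  }
  const result = {};
  for (const [type, fieldNames] of seen) result[type] = [...fieldNames];
  return result;
}

// The collection from the question.
const zoo = [
  { name: "Zebra1", height: "300", weight: "900", type: "zebra", zebraSpecific1: "somevalue" },
  { name: "Lion1", height: "325", weight: "1200", type: "lion" },
  { name: "Zebra2", height: "500", weight: "2100", type: "zebra", zebraSpecific2: "somevalue" },
  { name: "Giraffe", height: "4800", weight: "2400", type: "giraffe",
    giraffeSpecific1: "somevalue", giraffeSpecific2: "someothervalue" },
];

const zooKeys = keysByType(zoo);
console.log(JSON.stringify(zooKeys, null, 2));
```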

MongoDB update all fields of array error

I'm trying to set items.qty to 0 for every item of a document obtained by an _id query.
db.warehouses.update(
// query
{
_id:ObjectId('5322f07e139cdd7e31178b78')
},
// update
{
$set:{"items.$.qty":0}
},
// options
{
"multi" : true, // update every matching document, not just the first
"upsert" : true // insert a new document, if no existing document match the query
}
);
Return:
Cannot apply the positional operator without a corresponding query field containing an array.
This is the document in which I want to set every items.qty to 0:
{
"_id": { "$oid" : "5322f07e139cdd7e31178b78" },
"items": [
{
"_id": { "$oid" : "531ed4cae604d3d30df8e2ca" },
"brand": "BJFE",
"color": "GDRNCCD",
"hand": 1,
"model": 0,
"price": 500,
"qty": 0,
"type": 0
},
{
"brand": "BJFE",
"color": "GDRNCCD",
"hand": 1,
"id": "23",
"model": 0,
"price": 500,
"qty": 4,
"type": 0
},
{
"brand": "BJFE",
"color": "GDRNCCD",
"hand": 1,
"id": "3344",
"model": 0,
"price": 500,
"qty": 6,
"type": 0
}
],
"name": "a"
}

EDIT
The detail missing from the question was that the required field to update was actually in a sub-document. This changes the answer considerably:
This is a constraint on what you can do when updating array elements, and it is explained in the documentation, mostly in this paragraph:
The positional $ operator acts as a placeholder for the first element that matches the query document
So here is the thing: trying to update all of the array elements in a single statement like this will not work. In order to do this you must do the following:
db.warehouses.find({ "items.qty": { "$gt": 0 } }).forEach(function(doc) {
doc.items.forEach(function(item) {
item.qty = 0;
});
db.warehouses.update({ "_id": doc._id }, doc );
})
Which is basically the way to update every array element.
The multi setting in .update() means across multiple "documents". It cannot be applied to multiple elements of an array. So presently the best option is to replace the whole thing. Or in this case we may just as well replace the whole document since we need to do that anyway.
For real bulk data, use db.eval() (deprecated in MongoDB 3.0 and removed in 4.2, so this only applies to older servers). But please read the documentation first:
db.eval(function() {
db.warehouses.find({ "items.qty": { "$gt": 0 } }).forEach(function(doc) {
doc.items.forEach(function(item) {
item.qty = 0;
});
db.warehouses.update({ "_id": doc._id }, doc );
});
})
Updating all the elements in an array across the whole collection is not simple.
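On MongoDB 3.6 and later, the all-positional operator $[] should make this possible in one statement, along the lines of { $set: { "items.$[].qty": 0 } }; treat that as something to verify against your server version. For the read-modify-replace approach shown above, the per-document transformation can be sketched and checked in plain JavaScript. zeroAllQty is a hypothetical helper name, and the sample document is trimmed from the question.

```javascript
// Hypothetical helper mirroring the inner loop above: returns a copy of the
// document with every items[i].qty set to 0, leaving the input untouched.
function zeroAllQty(doc) {
  return {
    ...doc,
    items: (doc.items || []).map((item) => ({ ...item, qty: 0 })),
  };
}

// Trimmed version of the document from the question.
const warehouse = {
  name: "a",
  items: [
    { brand: "BJFE", qty: 0 },
    { brand: "BJFE", id: "23", qty: 4 },
    { brand: "BJFE", id: "3344", qty: 6 },
  ],
};

const updated = zeroAllQty(warehouse);
console.log(updated.items.map((item) => item.qty));
```

Returning a copy instead of mutating in place makes it easy to compare the before and after states before writing anything back.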
Original
Pretty much exactly what the error says. In order to use a positional operator you need to match something first. As in:
db.warehouses.update(
// query
{
_id:ObjectId('5322f07e139cdd7e31178b78'),
"items.qty": { "$gt": 0 }
},
// update
{
$set:{"items.$.qty":0}
},
// options
{
"multi" : true,
"upsert" : true
}
);
So when the match condition finds the position of the first item with a qty greater than 0, that index is passed to the positional operator.
P.S.: When multi is true, it updates every matching document. Leave it false, which is the default, if you only mean one.

You can use the $ positional operator only when you specify an array in the first argument (i.e., the query part used to identify the document you want to update).
The positional $ operator identifies an element in an array field to update without explicitly specifying the position of the element in the array.

Pulling out latest (multiple) entries from MongoDB

I am trying to retrieve information on how many attempts a user takes to solve a particular problem as a JSON from a mongodb database. If there are multiple attempts on the same problem, I would only like to pull out the last entry - for instance, right now, if I do a db.proficiencies.find() - I will pull out entries A, B, C, and D but I would like to only pull out entries B and D (latest entries for the problems maze and circle respectively).
Is there an easy way to do so?
Entry A
{
"problem": "maze",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-22T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 1
}
Entry B
{
"problem": "maze",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-27T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 1
}
Entry C
{
"problem": "circle",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-22T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 2
}
Entry D
{
"problem": "circle",
"courseLesson": "elementary_one, 1",
"studentId": "51ed51d0fcb4cc3696000001",
"studentName": "Sarah",
"_id": "51ed51defcb4cc3696000011",
"__v": 0,
"date": "2013-07-27T15:38:06.259Z",
"numberOfAttemptsBeforeSolved": 4
}
var ProficiencySchema = new Schema({
problem: String
, numberOfAttemptsBeforeSolved: {type: Number, default: 0}
//refers to which lesson, e.g. elementary_one, 2 refers to lesson 2 of elementary_one
, courseLesson: String
, date: {type: Date, default: Date.now}
, studentId: Schema.Types.ObjectId
, studentName: String
})

The best way to do this would be to sort the results in descending date-time order (so the latest response is first) and then limit the result set to one document. With a query that selects a single problem, this would look something like:
db.proficiencies.find(YOUR QUERY).sort({'date': -1}).limit(1)
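The query above returns a single document, so it would have to be run once per problem. To get the latest entry for every problem in one pass, one option is an aggregation that sorts by date and then groups by problem. The sketch below imitates that logic in plain JavaScript on the entries from the question; latestPerProblem is a hypothetical helper name, and the pipeline in the comment is an untested suggestion.

```javascript
// In-memory imitation of a pipeline along these lines (an assumption,
// not taken from the original answer):
//   db.proficiencies.aggregate([
//     { $sort: { date: -1 } },
//     { $group: { _id: "$problem", latest: { $first: "$$ROOT" } } }
//   ])
// latestPerProblem is a hypothetical helper name.
function latestPerProblem(entries) {
  const latest = new Map(); // problem -> most recent entry seen so far
  for (const entry of entries) {
    const current = latest.get(entry.problem);
    if (!current || new Date(entry.date) > new Date(current.date)) {
      latest.set(entry.problem, entry);
    }
  }
  return [...latest.values()];
}

// Entries A to D from the question, trimmed to the relevant fields.
const attempts = [
  { problem: "maze", date: "2013-07-22T15:38:06.259Z", numberOfAttemptsBeforeSolved: 1 },
  { problem: "maze", date: "2013-07-27T15:38:06.259Z", numberOfAttemptsBeforeSolved: 1 },
  { problem: "circle", date: "2013-07-22T15:38:06.259Z", numberOfAttemptsBeforeSolved: 2 },
  { problem: "circle", date: "2013-07-27T15:38:06.259Z", numberOfAttemptsBeforeSolved: 4 },
];

const latestEntries = latestPerProblem(attempts);
console.log(JSON.stringify(latestEntries, null, 2));
```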