MongoDB: changing document structure query - mongodb

We have made a change in our data model and we would like to apply it to all documents in one of our collections:
{
"id":60,
"measurement":{
"steps":1274.0
},
"date":"2012-05-15T00:00:00Z"
}
to:
{
"id":60,
"measurement":{
"distance":{
"steps":1274.0}
},
"date":"2012-05-15T00:00:00Z"
}
Essentially, we want to further nest the field steps, placing it under the distance field.
As with measurement.steps, we would also like to convert measurement.miles to measurement.distance.miles and measurement.minutes to measurement.time.minutes.
Any thoughts and/or suggestions would be appreciated.

Assuming you're asking how to script the schema change, which wasn't quite clear in the question: I would do something like this, unless you have more cases for the document structure, or mixed cases:
// find all the measurement documents with steps
db.coll.find({"measurement.steps":{$exists:true}}).forEach(function(doc) {
// create a new distance subdoc with the steps
doc.measurement.distance = {steps:doc.measurement.steps};
// delete the old steps subdoc
delete doc.measurement.steps;
// save the document
db.coll.save(doc);
});
// find all the measurement documents with miles
db.coll.find({"measurement.miles":{$exists:true}}).forEach(function(doc) {
// add the miles to the distance subdoc, keeping any steps already moved there
doc.measurement.distance = doc.measurement.distance || {};
doc.measurement.distance.miles = doc.measurement.miles;
delete doc.measurement.miles;
db.coll.save(doc);
});
// find all the measurement documents with minutes
db.coll.find({"measurement.minutes":{$exists:true}}).forEach(function(doc) {
// create a new time subdoc with the minutes
doc.measurement.time = {minutes:doc.measurement.minutes};
delete doc.measurement.minutes;
db.coll.save(doc);
});
You could pretty easily do the equivalent in the language/driver of your choice to ensure types, but it is probably faster to do in the shell. Hope it helps.
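For the mixed case (a document that has steps, miles and minutes at once), the three loops can be folded into one transformation per document. The following is only a sketch: restructureMeasurement and its moves table are hypothetical names, not part of any MongoDB API, and it assumes an ES2015-capable JavaScript environment.

```javascript
// Hypothetical helper (not a MongoDB API): returns a restructured copy of one document.
function restructureMeasurement(doc) {
  // Work on a shallow copy so the input document is left untouched.
  var m = Object.assign({}, doc.measurement);
  // Old field name -> [new parent subdocument, new field name].
  var moves = {
    steps: ["distance", "steps"],
    miles: ["distance", "miles"],
    minutes: ["time", "minutes"]
  };
  Object.keys(moves).forEach(function (oldName) {
    if (!(oldName in m)) return; // nothing to move for this field
    var parent = moves[oldName][0];
    var newName = moves[oldName][1];
    // Reuse an existing subdocument so steps and miles can coexist.
    m[parent] = Object.assign({}, m[parent]);
    m[parent][newName] = m[oldName];
    delete m[oldName];
  });
  return Object.assign({}, doc, { measurement: m });
}
```

In the shell this could be applied with something like db.coll.find().forEach(function (doc) { db.coll.replaceOne({_id: doc._id}, restructureMeasurement(doc)); }); where coll is the placeholder collection name used above. Documents that were already migrated pass through unchanged.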

Related

Comparing value in document to value in previous document for each document

I have a collection where I'm storing a point of arrival and a point of departure, and for each document I need to check whether its point of departure is the same as the point of arrival of the previous document.
{
departure:A
arrival:B
}
{
departure:B
arrival:C
}
{
departure:H
arrival:J
}
In this collection I should only be getting the second document since it's the only one where the departure(B) equals the arrival of the previous document(B).
In SQL it would be as simple as TABLE.DEPARTURE = TABLE.ARRIVAL+1,
is there any way of doing something like that in Mongo?
I think you need to read this document https://docs.mongodb.com/manual/tutorial/iterate-a-cursor/
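Iterating the cursor comes down to remembering the previous document while walking the results in order. A minimal sketch, assuming the documents arrive in the order you care about (the helper name departuresMatchingPreviousArrival and the sample array are illustrative only):

```javascript
// Hypothetical helper: docs must already be sorted in the intended order.
function departuresMatchingPreviousArrival(docs) {
  var matches = [];
  var previous = null;
  docs.forEach(function (doc) {
    // Keep the document when its departure equals the previous arrival.
    if (previous !== null && doc.departure === previous.arrival) {
      matches.push(doc);
    }
    previous = doc;
  });
  return matches;
}

// Sample data taken from the question.
var legs = [
  { departure: "A", arrival: "B" },
  { departure: "B", arrival: "C" },
  { departure: "H", arrival: "J" }
];
var matchingLegs = departuresMatchingPreviousArrival(legs);
```

In the shell the input could come from a sorted cursor, for example db.coll.find().sort({_id: 1}).toArray(); the collection name and sort key here are assumptions, since the question does not say what defines "previous".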

Build a reactive publication with additional fields in each document

I want to make a publication with several additional fields, but I don't want to use Collection.aggregate, because I would lose publication updates when the collection changes (so I can't just use self.added inside it either).
I plan to use Cursor.observeChanges in order to achieve that. I have two major constraints:
I don't want to publish all the documents fields
I want to use some of the unpublished fields to create new ones. For example, I have a field item where I store an array of item _ids. I don't want to publish it, but I want to publish an item_count field with the length of that array.
Here comes the approach:
I plan to chain find queries. I have never done that, so I wonder whether it is possible. The general (simplified) query structure would be like this: http://jsfiddle.net/Billybobbonnet/1cgrqouj/ (I can't get the code to display properly here)
Based on the count example in Meteor documentation, I store my query in a variable handle in order to stop the changes notification if a client unsubscribes:
self.onStop(function () {
handle.stop();
});
I add a flag initializing = true; before my query and I set it to false just before calling self.ready();. I use this flag to change my itemCount variable only once the publication is initialized. So basically, I change my switch like this:
switch (field) {
case "item":
if (!initializing)
itemCount = raw_document.item.length;
break;
default:
}
I wanted to check that this approach is sound and possible before committing to big changes in my code. Can someone confirm whether this is the right way to go?
It's relatively easy to keep fields private even if they are part of the database query. The last argument to self.added is the object being passed to the client, so you can strip/modify/delete fields you are sending to the client.
Here's a modified version of your fiddle. This should do what you are asking for. (To be honest I'm not sure why you had anything chained after the observeChanges function in your fiddle, so maybe I'm misunderstanding you, but looking at the rest of your question this should be it. Sorry if I got it wrong.)
var self = this;
// Modify the document we are sending to the client.
function filter(doc) {
var length = doc.item.length;
// White list the fields you want to publish.
var docToPublish = _.pick(doc, [
'someOtherField'
]);
// Add your custom fields.
docToPublish.itemLength = length;
return docToPublish;
}
var handle = myCollection.find({}, {fields: {item:1, someOtherField:1}})
// Use observe since it gives us the old and new document when something is changing.
// If this becomes a performance issue then consider using observeChanges,
// but it's usually a lot simpler to use observe in cases like this.
.observe({
added: function(doc) {
self.added("myCollection", doc._id, filter(doc));
},
changed: function(newDocument, oldDocument) {
// When the item count is changing, send update to client.
if (newDocument.item.length !== oldDocument.item.length)
self.changed("myCollection", newDocument._id, filter(newDocument));
},
removed: function(doc) {
self.removed("myCollection", doc._id);
}
});
self.ready();
self.onStop(function () {
handle.stop();
});
To solve your first problem, you need to tell MongoDB what fields it should return in the cursor. Leave out the fields you don't want:
MyCollection.find({}, {fields: {'a_field':1}});
Solving your second problem is also pretty easy. I would suggest using the collection helpers package. You could accomplish this like so:
// Add calculated fields to MyCollection.
MyCollection.helpers({
item_count: function() {
return this.item.length;
}
});
This will be run before an object is added to a cursor, and will create properties on the returned objects that are calculated dynamically, not stored in MongoDB.

Mongo -Select parent document with maximum child documents count, Faster way?

I'm quite new to mongo and am trying to get the following query to work. It works, but it takes longer than I expected, so I think I'm doing something wrong.
The parents collection holds about 6000 documents. Each parent has a certain number of children (childs is another collection, with 40000 documents in it). Parents and children are associated with each other by an attribute in the child document called parent_id. Please see the following code, which takes approximately 1 minute to run. I don't think mongo should take that much time.
function getChildMaxDocCount(){
var maxLen = 0;
var bigSizeParent = null;
db.parents.find().forEach(function (parent){
var currcount = db.childs.count({parent_id:parent._id});
if(currcount > maxLen){
maxLen = currcount;
bigSizeParent = parent._id;
}
});
printjson({"maxLen":maxLen, "bigSizeParent":bigSizeParent });
}
Is there any feasible/optimal way to achieve this?
If I got you right, you want to have the parent with the most childs. This is easy to accomplish using the aggregation framework. When each child only can have one parent, the aggregation query would look like this
db.childs.aggregate([
{ $group: { _id:"$parent_id", children:{$sum:1} } },
{ $sort: { "children":-1 } },
{ $limit : 1 }
]);
Which should return a document like:
{ _id:"SomeParentId", children:15}
If a child can have more than one parent, what the query looks like depends heavily on how the data is modeled.
Have a look at the aggregation framework documentation for details.
Edit: Some explanation
The aggregation pipeline passes the documents through a series of steps: all documents are first processed by the first step, and the resulting documents are fed into the next step.
Step 1: Grouping
We group all documents into new documents (virtual ones, if you want) and tell mongod to increment the field children by one for each document which has the same parent_id. Since we are referring to a field of the current document, we need to add a $ sign.
Step 2: Sorting
Now that we have a bunch of documents which hold the parent_id and the number of children this parent has, we sort it by the children field in descending (-1) order.
Step 3: Limiting
Since we are only interested in the parent_id which has the most children, we only let mongod return the first document after sorting.
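To make the three steps concrete, here is roughly what they compute, written as plain JavaScript over an in-memory array. This is an illustration only: parentWithMostChildren is a hypothetical name, the real pipeline runs inside mongod, and this sketch turns parent_id into a string key, which the server does not do.

```javascript
// Hypothetical illustration of $group, $sort and $limit on an in-memory array.
function parentWithMostChildren(children) {
  // Step 1: group by parent_id and add 1 per child, like {$sum: 1}.
  var counts = {};
  children.forEach(function (child) {
    var key = String(child.parent_id);
    counts[key] = (counts[key] || 0) + 1;
  });
  var groups = Object.keys(counts).map(function (key) {
    return { _id: key, children: counts[key] };
  });
  // Step 2: sort by the children field in descending order.
  groups.sort(function (a, b) {
    return b.children - a.children;
  });
  // Step 3: keep only the first group, or null when there are no children.
  return groups.length > 0 ? groups[0] : null;
}
```

Unlike the loop in the question, this makes a single pass over the children instead of one count query per parent, which is where the time saving comes from.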

mongo: multiple queries or not?

I'm wondering about the best way to query mongo db for many objects, where each one has an array of _id's that are attached to it. I want to grab the referenced objects as well. The objects' schemas look like this:
var headlineSchema = new Schema({
title : String,
source : String,
edits : Array // list of edits, stored as an array of _id's
...
});
...and the one that's referenced, if needed:
var messageSchema = new Schema({
message : String,
user : String,
headlineID : ObjectId // also contains a ref. back to headline it's incl. in
...
});
One part of the problem (well, depending if I want to keep going this route) is that pushing the message id's is not working (edits remains an empty array [] afterwards) :
db.headline.update({_id : headlineid}, {$push: {edits : messageid} }, true);
When I do my query, I need to grab about 30 'headlines' at a time, and each one could contain references to up to 20 or 30 'messages'. My question is, what is the best way to fetch all of these things? I know mongo isn't a relational db, so what I'm intending is to first grab the headlines that I need, and then loop through all 30 of them to grab any attached messages.
db.headline.find({'date': {$gte: start, $lt: end} }, function (err, docs) {
if(err) { console.log(err.message); }
if(docs) {
docs.forEach(function(doc){
doc.edits.forEach(function(ed){
db.messages.find({_id:ed}, function (err, msg) {
// save stuff
});
});
});
}
});
This just seems wrong, but I'm unsure how else to proceed. Should I even bother with keeping an array of attached messages? I'm not married to the way I've set up my schema, either. If there is a better way to track relationships between them, or a better query to achieve this, please let me know.
Thanks
Does each message belong to only one headline? If so, you can store the headline id as part of each message. Then for each headline, do:
// currentHeadlineId is a placeholder for the _id of the headline being processed
db.messages.find({headlineID: currentHeadlineId})
You could try using the $in operator for selecting a list of ObjectIds
http://www.mongodb.org/display/DOCS/Advanced+Queries#AdvancedQueries-%24in
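With $in, the nested loops in the question collapse into one messages query for the whole batch of headlines. A sketch under assumptions: buildMessagesFilter and groupByHeadline are hypothetical helper names, and the field names edits and headlineID are taken from the schemas in the question.

```javascript
// Hypothetical helper: collect every referenced message _id into one $in filter.
function buildMessagesFilter(headlines) {
  var ids = [];
  headlines.forEach(function (headline) {
    (headline.edits || []).forEach(function (id) {
      ids.push(id);
    });
  });
  return { _id: { $in: ids } };
}

// Hypothetical helper: regroup the fetched messages under their headline.
function groupByHeadline(messages) {
  var byHeadline = {};
  messages.forEach(function (msg) {
    var key = String(msg.headlineID);
    byHeadline[key] = byHeadline[key] || [];
    byHeadline[key].push(msg);
  });
  return byHeadline;
}
```

The filter returned by buildMessagesFilter(docs) can be passed to a single db.messages.find call, and groupByHeadline then reattaches the results, so 30 headlines cost two queries rather than several hundred.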

How to "(WHERE) column = column" in Mongo?

I like Mongo for simple things so I was hoping to use it for something more advanced. And that worked fine until I needed this:
UPDATE tbl SET a = b WHERE c <> 0
The a = b part is what I can't figure out. I tried mongodb.org, but I can't find it there. I also looked for WHERE a = b but I can't find that either.
An alternative is to fetch all rows and then update them individually, but I don't like that. It has to be simpler.
Thanks.
You want to check the documentation for updating.
http://www.mongodb.org/display/DOCS/Updating
Your code might look like:
db.tbl.update( { c:{$ne:0}}, { $set: { a : b } } );
If you need to brush up on advanced queries (e.g. using $ne), then check here:
http://www.mongodb.org/display/DOCS/Advanced+Queries
EDIT:
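Apparently you can't update with data from the same document this way: the b in the $set above is evaluated as a shell variable, not as the document's b field. See: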
MongoDB: Updating documents using data from the same document
EDIT 2 (solution with map reduce):
var c = new Mongo();
var db = c.getDB('db')
var s = db.getCollection('s')
s.drop();
s.save({z:1,q:5});
s.save({z:11,q:55});
db.runCommand({
mapreduce:'s',
map:function(){
var i = this._id; //we will emit with a unique key. _id in this case
this._id=undefined; //strange things happen with merge if you leave the id in
//update your document with access to all fields!
this.z=this.q;
emit(i,this);
},
query:{z:1}, //apply to only certain documents
out:{merge:'s'} //results get merged (overwrite themselves in collection)
});
//now take a look
s.find();
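Another option is the fetch-and-update approach mentioned in the question, batched so that the writes go out in one call. The sketch below only builds the list of operations. buildCopyOps is a hypothetical name, and the a, b, c field names come from the SQL in the question. On MongoDB 4.2 and later an update can also take an aggregation pipeline, so db.tbl.updateMany({c: {$ne: 0}}, [{$set: {a: "$b"}}]) should do the same thing on the server side.

```javascript
// Hypothetical helper: one updateOne operation per document whose c is not 0.
function buildCopyOps(docs) {
  return docs
    .filter(function (doc) {
      // Mirrors {c: {$ne: 0}}: documents without a c field are included too.
      return doc.c !== 0;
    })
    .map(function (doc) {
      return {
        updateOne: {
          filter: { _id: doc._id },
          // Copy the value of b into a for this specific document.
          update: { $set: { a: doc.b } }
        }
      };
    });
}
```

The resulting array has the shape expected by bulkWrite, for example db.tbl.bulkWrite(buildCopyOps(db.tbl.find({c: {$ne: 0}}).toArray())), assuming the matching documents fit comfortably in memory.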