MongoDB: projection on specific fields of all objects in an array - mongodb

I have a MongoDB database with following data structure for one Part:
Part Data structure
Now I want to have a projection on the element Description of every object in the array SensorData. Using the MongoDB API in .NET with C#, my code would look like the following:
var projection = Builders<Part>.Projection
.Include(part => part.SensorData[0].Description)
.Include(part => part.SensorData[1].Description)
.Include(part => part.SensorData[2].Description)
//...
;
The problem is that the number of objects in SensorData is dynamic and may range from 0 to about 20, so including every Description field individually is not possible. The projection has to be done server-side, as SensorData -> Values can be huge.
Is there any Syntax or method to do this kind of projection?

Just found a solution after asking a colleague: It isn't possible to do this with the Projection Builder, but it is with the aggregation framework:
var project = new BsonDocument
{
    {
        "$project",
        new BsonDocument
        {
            // keep only the Description of every element in the SensorData array
            { "SensorData.Description", 1 }
        }
    }
};
var pipeline = new[] { project };
var aggregationCursor = parts.Aggregate<Part>(pipeline);
var query = aggregationCursor
    .ToEnumerable() // needed before calling AsQueryable
    .AsQueryable(); // query the projected data elements
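To illustrate what a $project stage on "SensorData.Description" returns, here is a small in-memory sketch in plain JavaScript. It does not talk to MongoDB: projectArrayField is a hypothetical helper, and the sample document (including the Name field and the Values arrays) is an assumption based on the description in the question.

```javascript
// Hypothetical helper: emulates { $project: { "<arrayField>.<subField>": 1 } }
// for one document held in memory. Not part of any MongoDB driver.
// Assumes doc[arrayField] is an array of sub-documents.
function projectArrayField(doc, arrayField, subField) {
  return {
    _id: doc._id, // $project keeps _id unless it is excluded explicitly
    [arrayField]: doc[arrayField].map((item) =>
      subField in item ? { [subField]: item[subField] } : {}
    ),
  };
}

// Assumed sample document, modelled on the question's description.
const part = {
  _id: 1,
  Name: "part-1",
  SensorData: [
    { Description: "temperature", Values: [20.1, 20.4, 20.9] },
    { Description: "pressure", Values: [1.01, 1.02] },
  ],
};

const projected = projectArrayField(part, "SensorData", "Description");
console.log(JSON.stringify(projected));
```

The point is that the path "SensorData.Description" applies to every element of the array, so no per-index Include is needed and the large Values arrays never leave the server.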

Related

MongoDB project nested element in _id field

I'm stuck on something very stupid, but I can't get out of it on my own.
MongoDB v4.2 and I have a collection with documents like this:
{"_id":{"A":"***","B":0}}, "some other fields"...
I'm working on top of mongo-c driver and I want to query only the "_id.B" field but I don't know how I can do it. I have tried:
"projection":{"_id.B":1}: It returns me the whole _id object. _id.A & _id.B.
"projection":{"_id.A":0,"All other fields except _id.B":0}: Returns same as above.
"projection":{"_id.A":0,"_id.B":1}: Returns nothing.
How can I get only some elements of an object when that object is inside the _id field? The first option works for me with objects that aren't inside the _id field, but not here.
Best regards, thanks for your time.
Héctor
You can use MongoDB's $project stage in an aggregation to do this. Alternatively, use $addFields to copy _id.B into a new field alongside all the other fields of the document, and finally project _id: 0.
Code:
var coll = localDb.GetCollection<BsonDocument>("yourCollectionName");
var project = new BsonDocument
{
    {
        "$project",
        new BsonDocument
        {
            // name and value are separated by a comma here, not a colon
            { "_id.B", 1 }
        }
    }
};
var pipeline = new[] { project };
var result = coll.Aggregate<BsonDocument>(pipeline);
Test : MongoDB-Playground
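For the $addFields alternative mentioned in this answer, a minimal sketch of the pipeline could look like the following, written as a plain JavaScript value. The target field name B is an assumption (any unused top-level name would do), and emulatePipeline is a hypothetical in-memory helper that only illustrates the intended result for one document; it is not driver code.

```javascript
// Pipeline sketch: copy _id.B to a top-level field, then drop _id entirely.
const pipeline = [
  { $addFields: { B: "$_id.B" } }, // "B" is an assumed name for the new field
  { $project: { _id: 0 } },
];

// Hypothetical in-memory illustration of what the two stages do to one document.
function emulatePipeline(doc) {
  const { _id, ...rest } = doc;
  return { ...rest, B: _id.B };
}

// Sample document shaped like the one in the question; "other" is made up.
const sample = { _id: { A: "***", B: 0 }, other: "some other field" };
const reshaped = emulatePipeline(sample);
console.log(JSON.stringify(pipeline));
console.log(JSON.stringify(reshaped));
```

Unlike the plain $project variant, this keeps every other field of the document, which may or may not be what you want if those fields are large.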

Query generated mongoose-version collection without model

I want to use mongoose-version to track and keep changes in mongodb.
I created this example schema.
var mongoose = require('mongoose');
var Schema = mongoose.Schema;
var version = require('mongoose-version');
var PageSchema = new Schema({
title : { type : String, required : true},
tags : [String],
});
PageSchema.plugin(version, { collection: 'Page__versions' });
const PageModel = mongoose.model('PageModel', PageSchema)
So all versions are stored in the collection Page__versions. How can I query this collection, given that I don't have a model for it?
To get the collection you can use the mongoose database object that is returned from the createConnection function. So when you start the application you store this variable like this
let db = mongoose.createConnection(url, params);
And then you can use this object to get the collection you want, in this case
let collection = db.collection("Page__versions");
At this point you can use the standard methods to do CRUD operations on that collection. For example, to find all documents in that collection that have a specific property, you can do something like this (inside an async function):
let docs = await collection.find({ myProperty: value }).toArray();
This gives you all documents in that collection that match the criteria.
If you don't know the shape of the documents, you can always fetch one item from the collection and inspect the result:
let doc = await collection.findOne({}); // gets the first document in the collection

How to compare all documents in two collections with millions of doc and write the diff in a third collection in MongoDB

I have two collections (coll_1, coll_2) with a million documents each.
These two collections are created by running two versions of the same code against the same data source, so both collections have the same number of documents. A document in one collection may, however, have one or more fields or sub-documents missing, or hold different values, compared with its counterpart. Documents in both collections share the same primary_key_id, which is indexed.
I have this javascript function saved on the db to get the diff
db.system.js.save({
_id: "diffJSON", value:
function(obj1, obj2) {
var result = {};
for (key in obj1) {
if (obj2[key] != obj1[key]) result[key] = obj2[key];
if (typeof obj2[key] == 'array' && typeof obj1[key] == 'array')
result[key] = arguments.callee(obj1[key], obj2[key]);
if (typeof obj2[key] == 'object' && typeof obj1[key] == 'object')
result[key] = arguments.callee(obj1[key], obj2[key]);
}
return result;
}
});
Which runs fine like this
diffJSON(testObj1, testObj2);
Question: How to run diffJSON on coll1 and coll2, and output diffJSON result into coll3 along with primary_key_id.
I am new to MongoDB, and I understand that joins don't work the way they do in an RDBMS, so I wonder whether I have to copy the two documents being compared into a single collection and then run the diffJSON function.
Also, most of the time (say 90%) the documents in the two collections will be identical; I only need to know about the 10% of docs that differ.
Here is a simple example document:
(but real doc is around 15k in size, just so you know the scale)
var testObj1 = { test:"1",test1: "2", tt:["td","ax"], tr:["Positive"] ,tft:{test:["a"]}};
var testObj2 = { test:"1",test1: "2", tt:["td","ax"], tr:["Negative"] };
If you know a better way to diff the documents, please feel free to suggest.
You can use a simple shell script to achieve this. First create a file named script.js and paste this code into it:
// load previously saved diffJSON() function
db.loadServerScripts();
// get all the documents from collection coll1
var cursor = db.coll1.find();
if (cursor != null && cursor.hasNext()) {
    // iterate over the cursor
    while (cursor.hasNext()) {
        var doc1 = cursor.next();
        // get the doc with the same _id from coll2
        var id = doc1._id;
        var doc2 = db.coll2.findOne({_id: id});
        // compute the diff
        var diff = diffJSON(doc2, doc1);
        // if there is a difference between the two objects
        if (Object.keys(diff).length > 0) {
            diff._id = id;
            // insert the diff in coll3 with the same _id
            db.coll3.insert(diff);
        }
    }
}
In this script I assume that your primary key is the _id field.
Then execute it from your shell like this:
mongo --host hostName --port portNumber databaseName < script.js
where databaseName is the name of the database containing the collections coll1 and coll2.
For these sample documents (I just added an _id field to your docs):
var testObj1 = { _id: 1, test:"1",test1: "2", tt:["td","ax"], tr:["Positive"] ,tft:{test:["a"]}};
var testObj2 = { _id: 1, test:"1",test1: "2", tt:["td","ax"], tr:["Negative"] };
the script will save the following doc in coll3 :
{ "_id" : 1, "tt" : { }, "tr" : { "0" : "Positive" } }
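Note that the saved document contains "tt" : { } although the two tt arrays are equal, and that tft is not reported at all. As far as I can tell, both effects come from the diffJSON function in the question: it assigns a nested result for every array or object key, and it only iterates over the keys of its first argument. A minimal sketch of a stricter diff, in plain JavaScript, could look like this. diffValues and isObject are hypothetical names, null as the marker for "missing in the second document" is an assumption, and the sketch only handles plain JSON-like values (Dates, ObjectIds and similar types would need their own comparison).

```javascript
// Hypothetical replacement for diffJSON. Returns undefined when the two
// values are equal, otherwise the differing part taken from `b`.
// Assumes plain JSON-like values only (no undefined, Date, ObjectId, ...).
function isObject(v) {
  return v !== null && typeof v === "object";
}

function diffValues(a, b) {
  if (!isObject(a) || !isObject(b)) {
    return a === b ? undefined : b;
  }
  const result = {};
  // visit keys from both sides so that fields missing on either side show up
  const keys = new Set([...Object.keys(a), ...Object.keys(b)]);
  for (const key of keys) {
    if (!(key in b)) {
      result[key] = null; // assumption: null marks "missing in b"
    } else if (!(key in a)) {
      result[key] = b[key];
    } else {
      const nested = diffValues(a[key], b[key]);
      if (nested !== undefined) result[key] = nested;
    }
  }
  return Object.keys(result).length > 0 ? result : undefined;
}

// The sample documents from the question.
const testObj1 = { test: "1", test1: "2", tt: ["td", "ax"], tr: ["Positive"], tft: { test: ["a"] } };
const testObj2 = { test: "1", test1: "2", tt: ["td", "ax"], tr: ["Negative"] };

const sampleDiff = diffValues(testObj2, testObj1);
console.log(JSON.stringify(sampleDiff));
```

With a function like this, identical documents produce undefined, so the check before inserting into coll3 would become a test for undefined rather than for an empty key list.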
This solution builds upon the one proposed by felix (I don't have the necessary reputation to comment on his). I made a few small changes to his script that bring important performance improvements:
// load previously saved diffJSON() function
db.loadServerScripts();
// get all the documents from collections coll1 and coll2
var cursor1 = db.coll1.find().sort({'_id': 1});
var cursor2 = db.coll2.find().sort({'_id': 1});
if (cursor1 != null && cursor1.hasNext() && cursor2 != null && cursor2.hasNext()) {
    // iterate over both cursors in lockstep
    while (cursor1.hasNext() && cursor2.hasNext()) {
        var doc1 = cursor1.next();
        var doc2 = cursor2.next();
        var pk = doc1._id;
        // compute the diff
        var diff = diffJSON(doc2, doc1);
        // if there is a difference between the two objects
        if (Object.keys(diff).length > 0) {
            diff._id = pk;
            // insert the diff in coll3 with the same _id
            db.coll3.insert(diff);
        }
    }
}
Two cursors are used to fetch all the entries from both collections, sorted by the primary key. This is the most important change and brings most of the performance improvement. Because the documents are retrieved sorted by primary key, and because the two collections hold the same set of keys, the n-th document of one cursor corresponds to the n-th document of the other.
This way we avoid making a call to coll2 for each document in coll1. It might seem insignificant, but we're talking about 1 million calls, which puts a lot of stress on the database.
Another important assumption is that the primary key field is _id. If that's not the case, it is crucial to have a unique index on the primary key field. Otherwise, the script might mismatch documents with the same primary key.
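The lockstep iteration only stays aligned while both collections contain exactly the same keys. If that cannot be guaranteed, the same two-cursor idea can be written as a merge join that compares keys before pairing. The following is a minimal in-memory sketch in plain JavaScript: mergeJoinById is a hypothetical helper, it works on arrays instead of cursors, and it assumes the keys can be compared with === and < (numbers or strings; ObjectIds would need a different comparison).

```javascript
// Hypothetical helper: walks two arrays that are already sorted by _id,
// pairs documents with equal keys and reports keys found on one side only.
function mergeJoinById(docs1, docs2, onPair, onUnmatched) {
  let i = 0;
  let j = 0;
  while (i < docs1.length && j < docs2.length) {
    const a = docs1[i];
    const b = docs2[j];
    if (a._id === b._id) {
      onPair(a, b);
      i++;
      j++;
    } else if (a._id < b._id) {
      onUnmatched("coll1", a); // key exists only in the first input
      i++;
    } else {
      onUnmatched("coll2", b); // key exists only in the second input
      j++;
    }
  }
  // whatever is left on either side has no partner
  for (; i < docs1.length; i++) onUnmatched("coll1", docs1[i]);
  for (; j < docs2.length; j++) onUnmatched("coll2", docs2[j]);
}

// Made-up sample data: key 2 is missing on one side, key 3 on the other.
const sorted1 = [{ _id: 1 }, { _id: 2 }, { _id: 4 }];
const sorted2 = [{ _id: 1 }, { _id: 3 }, { _id: 4 }];
const pairedIds = [];
const unmatched = [];
mergeJoinById(
  sorted1,
  sorted2,
  (a, b) => pairedIds.push(a._id),
  (side, doc) => unmatched.push(side + ":" + doc._id)
);
console.log(pairedIds, unmatched);
```

In the shell script, onPair would be the place to call diffJSON and insert into coll3, while onUnmatched could record documents that exist in only one of the collections.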

How to query $in with objectid in meteors existing collection

I am using an existing MongoDB database in Meteor. I don't know how to query with $in when the _id values are ObjectId()s:
Users = new Mongo.Collection('users', {idGeneration: 'MONGO'});
var ids = ['55549158f046be124e3fdee7',
'5539d937f046be0e2502aefc',
'55548e10f046bee14c3fdeed',
'55549938f046be99493fdef8' ];
Users.find({_id:{$in: ids}}).fetch(); //returns empty array
You could first cast the array ids to an array of ObjectIds using the map() method:
var Users = new Mongo.Collection('users', {idGeneration: 'MONGO'}),
ids = [
'55549158f046be124e3fdee7',
'5539d937f046be0e2502aefc',
'55548e10f046bee14c3fdeed',
'55549938f046be99493fdef8'
],
mids = ids.map(function(id) { return new Mongo.ObjectID(id); });
Users.find({"_id":{"$in": mids}}).fetch();
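The conversion assumes that every string in ids is a well-formed ObjectId hex string, that is, 24 hexadecimal characters. If the ids come from user input or another system, it may be worth filtering them first. A minimal sketch in plain JavaScript follows; splitValidIds and OBJECT_ID_HEX are hypothetical names and not part of Meteor.

```javascript
// Hypothetical helper: separates strings that look like a 24-character
// hexadecimal ObjectId from everything else, before any conversion happens.
const OBJECT_ID_HEX = /^[0-9a-fA-F]{24}$/;

function splitValidIds(ids) {
  const valid = [];
  const invalid = [];
  for (const id of ids) {
    const ok = typeof id === "string" && OBJECT_ID_HEX.test(id);
    (ok ? valid : invalid).push(id);
  }
  return { valid, invalid };
}

// Two ids from the question plus one made-up malformed value.
const checked = splitValidIds([
  "55549158f046be124e3fdee7",
  "5539d937f046be0e2502aefc",
  "not-an-object-id",
]);
console.log(checked.valid.length, checked.invalid);
```

Only checked.valid would then be passed through the map() call that builds the ObjectIDs.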

Query or command to find a Document, given an ObjectID but NOT a collection

So I have a document that has references to foreign ObjectIDs that may point to other documents or collections.
For example this is the pseudo-structure of the document
{
_id: ObjectID(xxxxxxxx),
....
reference: ObjectID(yyyyyyyy)
}
I can't find anything that doesn't require specifying the collection. Since I don't know for sure which collection to search, I am wondering whether there is a way to search the entire database for the document and find out which collection ObjectID(yyyyyyyy) belongs to.
The only possible way to do this is by listing every collection in the database and performing a db.collection.find() on each one.
E.g. in the Mongo shell I would do something like
var result = [];
var collections = db.getCollectionNames();
for (var i = 0; i < collections.length; i++) {
    var found = db.getCollection(collections[i]).findOne({ "_id" : ObjectId("yyyyyyyy") });
    if (found) {
        // remember which collection the document came from
        result.push({ collection: collections[i], document: found });
    }
}
printjson(result);
You need to run your query on all collections in your database.
db.getCollectionNames().forEach(function(collection) {
    db[collection].find({ $or : [
        { _id : ObjectId("535372b537e6210c53005ee5") },
        { reference : ObjectId("535372b537e6210c53005ee5") }
    ]}).forEach(printjson);
});