Aggregation on an attribute which could be null [duplicate] - mongodb

I am trying to aggregate attributes from two collections. One of them contains a field that may or may not be present in a document, and when that field is missing the query returns no document at all. I need a kind of null check: if the attribute is missing, ignore it; otherwise include it. Here is my query:
db.collection(collectionName).aggregate(
[{
$match: selector
}, {
$lookup: {
from: 'status',
localField: 'candidateId',
foreignField: 'candidateId',
as: 'profile'
}
}, {
$project: {
'_id': 0,
'currentStatus': '$profile.currentStatus',
'lastContacted': '$profile.lastContacted',
'lastWorkingDay': '$profile.lastWorkingDay',
'remarks': '$profile.remarks'
}
},{
$unwind: '$lastWorkingDay'
}]
)
In this case, a missing lastWorkingDay makes the whole query return nothing. Any pointers would be helpful.

I believe something else is wrong with your query.
It is hard to analyse without sample data, so I made up my own and ran it on my local box just now. The lookup and projection behave the way you'd expect.
A projection does not remove documents from the result. Here is my example:
Collection c1:
/* 1 */
{
"_id" : ObjectId("5c780eea79e5bed2bd00f85e"),
"candidateId" : "id1",
"currentStatus" : "a",
"lastContacted" : "b"
}
/* 2 */
{
"_id" : ObjectId("5c780efb79e5bed2bd00f863"),
"candidateId" : "id2",
"currentStatus" : "a",
"lastContacted" : "b",
"lastWorkingDay" : "yesterday"
}
Collection c2:
/* 1 */
{
"_id" : ObjectId("5c780f0a79e5bed2bd00f874"),
"candidateId" : "id1"
}
/* 2 */
{
"_id" : ObjectId("5c780f2879e5bed2bd00f87b"),
"candidateId" : "id2"
}
Aggregation:
db.getCollection('c2').aggregate( [
{$match: {}},
{ $lookup: {
from: "c1",
localField: "candidateId",
foreignField: "candidateId",
as : "profile"
} },
{$project: {
_id: 0,
"currentStatus" : "$profile.currentStatus",
"lastWorkingDay" : "$profile.lastWorkingDay"
} }
] )
Results:
/* 1 */
{
"currentStatus" : [
"a"
],
"lastWorkingDay" : []
}
/* 2 */
{
"currentStatus" : [
"a"
],
"lastWorkingDay" : [
"yesterday"
]
}
As you can see, lastWorkingDay is projected for both documents in my aggregation: an empty array where the field is missing and ["yesterday"] where it is present.
Note that $lookup creates an array for profile, since a lookup can match multiple documents. You may need to $unwind it if you want the individual values.
I hope this helps.
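For what it's worth, the stage that can drop documents in the question's pipeline is the trailing $unwind, which the example above leaves out: by default $unwind emits nothing for a document whose array is empty, missing or null. The sketch below is a plain-JavaScript model of that behaviour (the `unwind` helper is made up for illustration and is not MongoDB code), followed by the document form of the stage with `preserveNullAndEmptyArrays`, which is available in MongoDB 3.2 and later.

```javascript
// Illustrative model of $unwind on a top-level field; not MongoDB's implementation.
function unwind(docs, field, preserveNullAndEmptyArrays = false) {
  const out = [];
  for (const doc of docs) {
    const value = doc[field];
    if (Array.isArray(value) && value.length > 0) {
      // One output document per array element.
      for (const item of value) out.push({ ...doc, [field]: item });
    } else if (value === undefined || value === null || Array.isArray(value)) {
      // Missing, null or empty array: dropped unless the option is set.
      if (preserveNullAndEmptyArrays) {
        const copy = { ...doc };
        if (Array.isArray(value)) delete copy[field]; // an empty array is omitted
        out.push(copy);
      }
    } else {
      // A non-array value is treated as a single-element array.
      out.push({ ...doc });
    }
  }
  return out;
}

// The two result documents from the example above.
const projected = [
  { currentStatus: ['a'], lastWorkingDay: [] },
  { currentStatus: ['a'], lastWorkingDay: ['yesterday'] }
];

const dropped = unwind(projected, 'lastWorkingDay');     // default behaviour
const kept = unwind(projected, 'lastWorkingDay', true);  // option enabled

// Stage to try in place of { $unwind: '$lastWorkingDay' } in the question's pipeline.
const unwindStage = {
  $unwind: { path: '$lastWorkingDay', preserveNullAndEmptyArrays: true }
};
```

With the default behaviour only the "yesterday" document survives; with the option set, both documents come through and the one without a last working day simply lacks the field.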

Related

MongoDB $divide on aggregate output

Is it possible to perform a mathematical operation on fields that have already been computed by an aggregation?
I have something like this:
([
{
"$unwind" : {
"path" : "$users"
}
},
{
"$match" : {
"users.r" : {
"$exists" : true
}
}
},
{
"$group" : {
"_id" : "$users.r",
"count" : {
"$sum" : 1
}
}
},
])
Which gives an output as:
{ "_id" : "A", "count" : 7 }
{ "_id" : "B", "count" : 49 }
Now I want to divide 7 by 49 or vice versa.
Is there a possibility to do that? I tried $project and $divide but had no luck.
Any help would be really appreciated.
Thank you,
From your question, it looks like you expect exactly two result rows, so I assume users.r can take only two values (apart from null).
The simplest approach is to do the arithmetic in JavaScript (if you are working in the mongo console) or, if you are calling MongoDB from a program, in whatever language that program is written in, e.g.
var results = db.collection.aggregate([theAggregatePipelineQuery]).toArray();
print(results[0].count/results[1].count);
EDIT: Here is an alternative to the approach above, since the OP commented that JavaScript code cannot be used and that this has to be done in the query itself:
([
// ... your existing aggregation stages, which yield the two rows with a count field ...
{ $group: { "_id": 1, firstCount: { $first: "$count" }, lastCount: { $last: "$count" } } },
{ $project: { finalResult: { $divide: ['$firstCount', '$lastCount'] } } }
])
// The returned document has your answer in the `finalResult` field
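Two caveats on the stages above: $first and $last depend on the order in which the rows reach the $group stage, and $divide raises an error when the divisor is 0. A hedged variant adds a $sort and a $cond guard. The `ratioStages` name and the plain-JavaScript `ratioOfCounts` mirror are illustrative only, and sorting on _id is just one assumption about which row should come first.

```javascript
// Stages to append after the $group that produces { _id, count } rows.
const ratioStages = [
  { $sort: { _id: 1 } }, // fix the order so $first/$last are deterministic
  { $group: {
      _id: null,
      firstCount: { $first: '$count' },
      lastCount: { $last: '$count' }
  } },
  { $project: {
      _id: 0,
      finalResult: {
        // Return null instead of failing when the divisor is 0.
        $cond: [
          { $eq: ['$lastCount', 0] },
          null,
          { $divide: ['$firstCount', '$lastCount'] }
        ]
      }
  } }
];

// Plain-JavaScript mirror of the same logic, for checking the arithmetic.
function ratioOfCounts(rows) {
  const sorted = [...rows].sort((a, b) => (a._id < b._id ? -1 : a._id > b._id ? 1 : 0));
  const first = sorted[0].count;
  const last = sorted[sorted.length - 1].count;
  return last === 0 ? null : first / last;
}

// The two rows from the question, deliberately out of order.
const ratio = ratioOfCounts([{ _id: 'B', count: 49 }, { _id: 'A', count: 7 }]);
```

Swap the operands of $divide (or sort descending) to get the inverse ratio.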

MongoDB $lookup with <collection to join> coming from the input document

New to mongodb, so maybe this is a dumb question. I am using MongoDB lookup aggregation, but the 'from' collection is a field in the input document. How do I indicate that field in the 'from' rather than a string literal?
A simplified version of the collection I am starting with ("Groups") has documents that look like this:
{
_id: "<ObjectId>",
collectionName: "MyCollectionA",
list: ["<Foreign ObjectId>", "<Foreign ObjectId>", "<Foreign ObjectId>"]
}
I am joining to another collection. In this case, "MyCollectionA".
My lookup is working and looks like this:
{
$lookup: {
from: "MyCollectionA",
localField: "list",
foreignField: "_id",
as: "myJoinedItems"
}
}
However, I want to be able to use the field 'collectionName' rather than hardcoding 'MyCollectionA' in the lookup. How can I do that? I've tried '$collectionName' and { $literal: '$collectionName' }, but no luck.
The following query can do the trick. We are iterating over each record of Groups collection and performing the find operation on the collection specified in each document for the list of object IDs. Since there is a default index on _id, the search operation would be fast.
db.Groups.find().forEach(doc=>{
var myJoinedItems = [];
doc["list"].forEach(id=>{
myJoinedItems.push(
db.getCollection(doc["collectionName"]).find({"_id":id})[0]
);
});
doc["myJoinedItems"]=myJoinedItems;
print(tojson(doc));
});
Output:
{
"_id" : ObjectId("5d6cac366bc2ad3b23f7de74"),
"collectionName" : "MyCollectionA",
"list" : [
ObjectId("5d6cabd16bc2ad3b23f7de72")
],
"myJoinedItems" : [
{
"_id" : ObjectId("5d6cabd16bc2ad3b23f7de72"),
"collectionDetails" : {
"name" : "MyCollectionA",
"info" : "Cool"
}
}
]
}
{
"_id" : ObjectId("5d6cb82a6bc2ad3b23f7de76"),
"collectionName" : "MyCollectionB",
"list" : [
ObjectId("5d6cb7fd6bc2ad3b23f7de75")
],
"myJoinedItems" : [
{
"_id" : ObjectId("5d6cb7fd6bc2ad3b23f7de75"),
"collectionDetails" : {
"name" : "MyCollectionB",
"info" : "Super-Cool"
}
}
]
}
Data set:
Collection: Groups
{
"_id" : ObjectId("5d6cac366bc2ad3b23f7de74"),
"collectionName" : "MyCollectionA",
"list" : [
ObjectId("5d6cabd16bc2ad3b23f7de72")
]
}
{
"_id" : ObjectId("5d6cb82a6bc2ad3b23f7de76"),
"collectionName" : "MyCollectionB",
"list" : [
ObjectId("5d6cb7fd6bc2ad3b23f7de75")
]
}
Collection: MyCollectionA
{
"_id" : ObjectId("5d6cabd16bc2ad3b23f7de72"),
"collectionDetails":{
"name":"MyCollectionA",
"info":"Cool"
}
}
Collection: MyCollectionB
{
"_id" : ObjectId("5d6cb7fd6bc2ad3b23f7de75"),
"collectionDetails":{
"name":"MyCollectionB",
"info":"Super-Cool"
}
}
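The forEach loop in this answer issues one find() per id in list. A possible variation, sketched below, fetches each group's linked items with a single $in query instead. `joinGroups` is a made-up helper name, and `db` is passed in as a parameter so the function can be tried outside the shell. Note that $in does not promise to return the items in the order of list.

```javascript
// Hypothetical helper: db is the shell's database object (or anything with the
// same find()/getCollection()/toArray() methods).
function joinGroups(db) {
  return db.Groups.find().toArray().map(function (doc) {
    // One query per group document instead of one query per id.
    const myJoinedItems = db
      .getCollection(doc.collectionName)
      .find({ _id: { $in: doc.list } })
      .toArray();
    return Object.assign({}, doc, { myJoinedItems: myJoinedItems });
  });
}

// In the mongo shell: joinGroups(db).forEach(printjson);
```

This still runs one query per Groups document, so it is a client-side join like the original, just with fewer round trips.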
Here's a variation that uses $lookup directly.
db.foo.drop();
db.foo1.drop();
db.foo2.drop();
db.foo.insert(
[
{_id:0, collectionName:"foo1", list: ['A','B','C'] },
{_id:1, collectionName:"foo2", list: ['D','E'] }
]);
db.foo1.insert([
{_id:0, key:"A", foo1data:"goodbye"},
{_id:1, key:"B", foo1data:"goodbye"},
{_id:2, key:"C", foo1data:"goodbye"}
]);
db.foo2.insert([
{_id:0, key:"C", foo2data:"goodbye"},
{_id:1, key:"D", foo2data:"goodbye"}
]);
// Pass 1: Collect the unique collection names.
var collNames = db.foo.distinct("collectionName");
// Pass 2: Run one $lookup per target collection:
collNames.forEach(function(collname) {
var cursor = db.foo.aggregate([
{$match: {"collectionName": collname}}
,{$lookup: {"from": collname,
// Clever twist: if localField is an array, then the lookup
// matches each element, like an in-list:
localField: "list",
foreignField: "key",
as: "X" }}
]);
// Print the joined documents for this collection.
cursor.forEach(printjson);
});

Query to retrieve all the lines involved in a documents

I have a structure with documents and lines. A line has a reference to its document. However, some lines can also have a reference to another line.
I want a query that retrieves all the lines involved in a document (meaning the lines directly linked to it, plus the lines they reference).
Example
{_id:1, doc:1 },
{_id:3, doc:1, linkedLine:4},
{_id:4, doc:2 },
{_id:5, doc:2 },
I would like to obtain
linesOfDoc(1) = {_id:1, doc:1},{_id:3, doc:1, linkedLine:4},{_id:4, doc:2 }
It could be done by first getting the lines with doc=1, then looping over them and fetching the linked lines if present.
But is it possible to do this in one MongoDB query?
Regards
You cannot do joins in MongoDB exactly as you would in SQL, but you can get close with the aggregation pipeline.
You get all the data in one query, but you need to flatten it further to get the exact result you specified.
MONGO> db.playground.find()
{ "_id" : 1, "doc" : 1 }
{ "_id" : 3, "doc" : 1, "linkedLine" : 4 }
{ "_id" : 4, "doc" : 2 }
MONGO> db.playground.aggregate([
{ $lookup: { from: "playground", localField: "linkedLine", foreignField: "_id", as: "embeddedLinkedLine" } },
{ $match: { doc: <id of the document you're looking for> } }
])
{ "_id" : 1, "doc" : 1, "embeddedLinkedLine" : [ ] }
{ "_id" : 3, "doc" : 1, "linkedLine" : 4, "embeddedLinkedLine" : [ { "_id" : 4, "doc" : 2 } ] }
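To get from that output to the flat list asked for in the question, the embedded lines still have to be pulled out on the client. A minimal sketch, assuming the aggregation results have been read into an array (`flattenLines` is an illustrative name, not part of any driver):

```javascript
// Flatten aggregation results into a de-duplicated list of lines:
// each direct line first, followed by any line it links to.
function flattenLines(results) {
  const byId = new Map();
  for (const { embeddedLinkedLine = [], ...line } of results) {
    byId.set(line._id, line);
    for (const linked of embeddedLinkedLine) {
      if (!byId.has(linked._id)) byId.set(linked._id, linked);
    }
  }
  return [...byId.values()];
}

// The two result documents shown above, for doc 1.
const linesOfDoc1 = flattenLines([
  { _id: 1, doc: 1, embeddedLinkedLine: [] },
  { _id: 3, doc: 1, linkedLine: 4, embeddedLinkedLine: [{ _id: 4, doc: 2 }] }
]);
```

Keying on _id means a line that is both directly linked and referenced by another line appears only once.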

Project data set into new objects

I have a really simple question which has troubled me for some time. I have a list of objects containing an array of Measurements, where each of these contains a time and multiple values like below:
{
"_id" : ObjectId("5710ed8129c7f31530a537bc"),
"Measurements" : [
{
"_t" : "Measurement",
"_time" : ISODate("2016-04-14T12:31:52.584Z"),
"Measurement1" : 1,
"Measurement2" : 2,
"Measurement3" : 3
},
{
"_t" : "DataType",
"_time" : ISODate("2016-04-14T12:31:52.584Z"),
"Measurement1" : 4,
"Measurement2" : 5,
"Measurement3" : 6
},
{
"_t" : "DataType",
"_time" : ISODate("2016-04-14T12:31:52.584Z"),
"Measurement1" : 7,
"Measurement2" : 8,
"Measurement3" : 9
} ]
},
{
"_id" : ObjectId("5710ed8129c7f31530a537cc"),
"Measurements" : [
{
"_t" : "Measurement",
"_time" : ISODate("2016-04-14T12:31:52.584Z"),
"Measurement1" : 0
....
I want to create a query which projects the data set above into the one below. For example, query for Measurement1 and create an array of objects containing the time and value of Measurement1 (see below) via the mongo aggregation framework.
{ "Measurement": [
{
"Time": ISODate("2016-04-14T12:31:52.584Z"),
"Value": 1
},
{
"Time": ISODate("2016-04-14T12:31:52.584Z"),
"Value": 4
},
{
"Time": ISODate("2016-04-14T12:31:52.584Z"),
"Value": 7
}
]}
Seems like a pretty standard operation, so I hope you guys can shed some light on this.
You can do this by first unwinding the Measurements array for each doc and then projecting the fields you need and then grouping them back together:
db.test.aggregate([
// Duplicate each doc, once per Measurements array element
{$unwind: '$Measurements'},
// Include and rename the desired fields
{$project: {
'Measurements.Time': '$Measurements._time',
'Measurements.Value': '$Measurements.Measurement1'
}},
// Group the docs back together to reassemble the Measurements array field
{$group: {
_id: '$_id',
Measurements: {$push: '$Measurements'}
}}
])
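An alternative that avoids the $unwind/$group round trip is to reshape the array in place with $map inside a single $project. This is a sketch only: `buildMeasurementPipeline` is a hypothetical helper, it assumes the field name passed in is trusted, and it names the output field Measurement as in the question.

```javascript
// Build a pipeline that maps each Measurements element to { Time, Value }
// for the requested measurement field (e.g. 'Measurement1').
function buildMeasurementPipeline(fieldName) {
  return [
    { $project: {
        _id: 0,
        Measurement: {
          $map: {
            input: '$Measurements',
            as: 'm',
            in: { Time: '$$m._time', Value: '$$m.' + fieldName }
          }
        }
    } }
  ];
}

const pipeline = buildMeasurementPipeline('Measurement1');
// In the mongo shell: db.test.aggregate(pipeline)
```

Because nothing is unwound, each input document yields exactly one output document and the array order is kept as stored.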

Mongodb aggregate query count records for large dataset

I'm attempting to query all data from the errorlog collection, and in the same query grab a count of relevant irs_documents for each errorlog entry.
The problem is that there are too many records in the irs_documents collection to perform a $lookup.
Is there a performant method of doing this in one MongoDB query?
Failed attempt
db.getCollection('errorlog').aggregate(
[
{
$lookup: {
from: "irs_documents",
localField: "document.ssn",
foreignField: "ssn",
as: "irs_documents"
}
},
{
$group: {
_id: { document: "$document", error: "$error" },
logged_documents: { $sum : 1 }
}
}
]
)
Error
Total size of documents in $lookup exceeds maximum document size
Clearly this solution won't work. MongoDB is literally attempting to gather whole documents with $lookup, where I just want a count.
"errorlog" collection sample data:
/* 1 */
{
"_id" : ObjectId("56d73955ce09a5a32399f022"),
"document" : {
"ssn" : 1
},
"error" : "Error 1"
}
/* 2 */
{
"_id" : ObjectId("56d73967ce09a5a32399f023"),
"document" : {
"ssn" : 2
},
"error" : "Error 1"
}
/* 3 */
{
"_id" : ObjectId("56d73979ce09a5a32399f024"),
"document" : {
"ssn" : 3
},
"error" : "Error 429"
}
/* 4 */
{
"_id" : ObjectId("56d73985ce09a5a32399f025"),
"document" : {
"ssn" : 9
},
"error" : "Error 1"
}
/* 5 */
{
"_id" : ObjectId("56d73990ce09a5a32399f026"),
"document" : {
"ssn" : 1
},
"error" : "Error 8"
}
"irs_documents" collection sample data
/* 1 */
{
"_id" : ObjectId("56d73905ce09a5a32399f01e"),
"ssn" : 1,
"name" : "Sally"
}
/* 2 */
{
"_id" : ObjectId("56d7390fce09a5a32399f01f"),
"ssn" : 2,
"name" : "Bob"
}
/* 3 */
{
"_id" : ObjectId("56d7391ace09a5a32399f020"),
"ssn" : 3,
"name" : "Kelly"
}
/* 4 */
{
"_id" : ObjectId("56d7393ace09a5a32399f021"),
"ssn" : 9,
"name" : "Pippinpaddle-Oppsokopolis"
}
The error is self-explanatory. $lookup embeds every matching irs_documents entry into the errorlog document as an array, so the combined document runs into MongoDB's 16 MB BSON document size limit.
Ask yourself whether it is absolutely necessary to perform both actions in one operation. If it is not, do it the way you had to in versions of MongoDB that do not support $lookup.
That is, perform two queries and merge the results in your client.
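A rough sketch of that client-side merge, assuming the first query is a $group on irs_documents that returns { _id: <ssn>, count: <n> } rows and the second is a plain find() on errorlog. `mergeCounts` and `irs_document_count` are names made up here.

```javascript
// Attach the number of irs_documents to each errorlog entry.
// ssnCounts: rows shaped like { _id: <ssn>, count: <n> }
function mergeCounts(errorlogDocs, ssnCounts) {
  const countBySsn = new Map(ssnCounts.map(c => [c._id, c.count]));
  return errorlogDocs.map(e => Object.assign({}, e, {
    irs_document_count: countBySsn.get(e.document.ssn) || 0
  }));
}

// In the mongo shell the inputs could come from:
//   var ssnCounts = db.irs_documents.aggregate(
//       [{ $group: { _id: "$ssn", count: { $sum: 1 } } }]).toArray();
//   var errorlogDocs = db.errorlog.find().toArray();

// Sample rows in the shape of the question's data (ssn 7 has no irs_documents).
const merged = mergeCounts(
  [
    { document: { ssn: 1 }, error: 'Error 1' },
    { document: { ssn: 9 }, error: 'Error 1' },
    { document: { ssn: 7 }, error: 'Error 8' }
  ],
  [{ _id: 1, count: 1 }, { _id: 9, count: 1 }]
);
```

Only the small per-ssn counts cross the wire, never the irs_documents themselves, which is what keeps this clear of the document size limit.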
OPTION #1: You can aggregate on irs_documents and write the computed result to another collection with $out. Since each output document holds only a couple of fields, I don't think you'll hit the size limit. You may, however, hit the aggregation memory limit and be forced to let the aggregation framework use disk (the allowDiskUse option). Try the following solution and see if it works.
db.irs_documents.aggregate([
{
$group:{_id:"$ssn", count:{$sum:1}}
},
{
$out:"irs_documents_group"
}]);
db.errorlog.aggregate([
{
$lookup: {
from: "irs_documents_group",
localField: "document.ssn",
foreignField: "_id", // $group above stores the ssn in _id
as: "irs_documents"
}
},
{
$group: {
_id: { document: "$document", error: "$error" },
logged_documents: { $sum : 1 }
}
}
])
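Keep in mind that $group stores the grouping key in _id, so the documents written to irs_documents_group look like { _id: <ssn>, count: <n> }, and that the final $group above counts errorlog entries, not irs_documents. A sketch of a pipeline that joins on _id and reads the precomputed count instead (`countPipeline`, `irs` and `irs_document_count` are illustrative names):

```javascript
const countPipeline = [
  { $lookup: {
      from: 'irs_documents_group',
      localField: 'document.ssn',
      foreignField: '_id',          // the ssn written by $group
      as: 'irs'
  } },
  { $project: {
      document: 1,
      error: 1,
      // First (and only) matching count, or 0 when no irs_documents exist.
      irs_document_count: { $ifNull: [{ $arrayElemAt: ['$irs.count', 0] }, 0] }
  } }
];
// In the mongo shell: db.errorlog.aggregate(countPipeline)
```

Each errorlog entry then joins against at most one tiny document, so the joined array stays far below the size limit.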
OPTION #2: If the above solution does not work, you can always use map-reduce. It is not an elegant solution, but it will work.