New to MongoDB, so maybe this is a dumb question. I am using a $lookup aggregation stage, but the name of the 'from' collection is stored in a field of the input document. How do I reference that field in 'from' instead of a string literal?
A simplified version of the collection I am starting with ("Groups") has documents that look like this:
{
_id: "<ObjectId>",
collectionName: "MyCollectionA",
list: ["<Foreign ObjectId>", "<Foreign ObjectId>", "<Foreign ObjectId>"]
}
I am joining to another collection. In this case, "MyCollectionA".
My lookup is working and looks like this:
{
$lookup: {
from: "MyCollectionA",
localField: "list",
foreignField: "_id",
as: "myJoinedItems"
}
}
However, I want to be able to use the field 'collectionName' rather than hardcoding 'MyCollectionA' in the lookup. How can I do that? I've tried '$collectionName' and { $literal: '$collectionName' }, but no luck.
The following query can do the trick. We are iterating over each record of Groups collection and performing the find operation on the collection specified in each document for the list of object IDs. Since there is a default index on _id, the search operation would be fast.
db.Groups.find().forEach(doc=>{
var myJoinedItems = [];
doc["list"].forEach(id=>{
myJoinedItems.push(
db.getCollection(doc["collectionName"]).find({"_id":id})[0]
);
});
doc["myJoinedItems"]=myJoinedItems;
print(tojson(doc));
});
Output:
{
"_id" : ObjectId("5d6cac366bc2ad3b23f7de74"),
"collectionName" : "MyCollectionA",
"list" : [
ObjectId("5d6cabd16bc2ad3b23f7de72")
],
"myJoinedItems" : [
{
"_id" : ObjectId("5d6cabd16bc2ad3b23f7de72"),
"collectionDetails" : {
"name" : "MyCollectionA",
"info" : "Cool"
}
}
]
}
{
"_id" : ObjectId("5d6cb82a6bc2ad3b23f7de76"),
"collectionName" : "MyCollectionB",
"list" : [
ObjectId("5d6cb7fd6bc2ad3b23f7de75")
],
"myJoinedItems" : [
{
"_id" : ObjectId("5d6cb7fd6bc2ad3b23f7de75"),
"collectionDetails" : {
"name" : "MyCollectionB",
"info" : "Super-Cool"
}
}
]
}
Data set:
Collection: Groups
{
"_id" : ObjectId("5d6cac366bc2ad3b23f7de74"),
"collectionName" : "MyCollectionA",
"list" : [
ObjectId("5d6cabd16bc2ad3b23f7de72")
]
}
{
"_id" : ObjectId("5d6cb82a6bc2ad3b23f7de76"),
"collectionName" : "MyCollectionB",
"list" : [
ObjectId("5d6cb7fd6bc2ad3b23f7de75")
]
}
Collection: MyCollectionA
{
"_id" : ObjectId("5d6cabd16bc2ad3b23f7de72"),
"collectionDetails":{
"name":"MyCollectionA",
"info":"Cool"
}
}
Collection: MyCollectionB
{
"_id" : ObjectId("5d6cb7fd6bc2ad3b23f7de75"),
"collectionDetails":{
"name":"MyCollectionB",
"info":"Super-Cool"
}
}
Here's a variation that uses $lookup directly.
db.foo.drop();
db.foo1.drop();
db.foo2.drop();
db.foo.insert(
[
{_id:0, collectionName:"foo1", list: ['A','B','C'] },
{_id:1, collectionName:"foo2", list: ['D','E'] }
]);
db.foo1.insert([
{_id:0, key:"A", foo1data:"goodbye"},
{_id:1, key:"B", foo1data:"goodbye"},
{_id:2, key:"C", foo1data:"goodbye"}
]);
db.foo2.insert([
{_id:0, key:"C", foo2data:"goodbye"},
{_id:1, key:"D", foo2data:"goodbye"}
]);
// Pass 1: Collect the unique collection names.
var c = db.foo.distinct("collectionName");
// Pass 2: Run one $lookup per collection name.
c.forEach(function(collname) {
var c2 = db.foo.aggregate([
{$match: {"collectionName": collname}}
,{$lookup: {"from": collname,
// Clever twist: if localField is an array, the lookup
// matches each element, like an in-list:
localField: "list",
foreignField: "key",
as: "X" }}
]);
// Print each joined document.
c2.forEach(function(doc) { printjson(doc); });
});
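The reason for the two passes is that $lookup expects 'from' to be a plain string when the pipeline is built; it cannot be a field path such as '$collectionName'. As a minimal sketch of the same idea, the pipeline construction can be pulled into a small helper. The name buildLookupPipelines is hypothetical, and the field names (list, key, X) are simply the ones from the example above.

```javascript
// Build one aggregation pipeline per distinct collection name.
// The collection name is substituted here, in application code,
// because $lookup cannot read `from` out of the input document.
function buildLookupPipelines(collectionNames) {
    const pipelines = {};
    for (const name of collectionNames) {
        pipelines[name] = [
            { $match: { collectionName: name } },
            { $lookup: { from: name, localField: "list", foreignField: "key", as: "X" } }
        ];
    }
    return pipelines;
}

// Hypothetical usage in the shell:
//   const names = db.foo.distinct("collectionName");
//   const built = buildLookupPipelines(names);
//   names.forEach(n => db.foo.aggregate(built[n]).forEach(printjson));
const pipelines = buildLookupPipelines(["foo1", "foo2"]);
console.log(JSON.stringify(pipelines.foo2[1]));
```

Each pipeline still touches only the documents that point at its own collection, so the number of aggregations equals the number of distinct collection names, not the number of documents.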
I need to join two MongoDB collections with $lookup. The main field to join on in the first collection has to match the other collection through either Field A or Field B.
MainField is an array with this structure: [Doc1.FieldA, Doc2.FieldA, Doc3.FieldB,...].
FieldA has a unique index.
FieldB has a non-unique index; it groups documents that share the same FieldB value.
The problem is that I need to keep the order of the MainField array.
I would like to do something like this (pseudocode):
db.getCollection("collection1").aggregate([
$lookup: {
from: "collection2",
localField: "mainField",
foreignField: $or:["fieldA","FieldB"]
as: "mainFieldInfo"
}]
Is it possible to do this lookup, or do I need a different approach?
Collection examples (the documents are simplified; each real document has more fields):
Collection Machines (1 example) :
{
"_id" : ObjectId("5c793a188021710636865c33"),
"MachineName" : "CER3A",
"NextJobs" : [ //--> MainField
"ST105862", // match with FIELD B - Flags.STS
"OFT083520", // match with FIELD A - Lote
"OFT083365",
"ST105946"
]
}
Collection Works (2 examples: 1 that matches on Field A, 1 that matches on Field B):
Field A example:
FieldB (Flags.STS) is empty
{
"_id" : ObjectId("5c1b89d0b6e97d001816595e"),
"Lote" : "OFT083520", //--> FIELD A
"Flags" : {
"ShipsFinished" : true,
"PlanFinished" : true,
"Finished" : true,
"IdDefecto" : false,
"EstadoOF" : 4,
"GCT" : "GCT018929",
"PedidoVenta" : "",
"STS" : "", //--> FIELD B
}
}
Field B example (2 docs):
FieldA (Lote) is different in each document; FieldB (Flags.STS) is the same
{
"_id" : ObjectId("5dcd78e2a2061070185400e2"),
"Lote" : "OFT083671", //--> FIELD A
"Flags" : {
"B2" : 1,
"EstadoOF" : 4,
"Finished" : false,
"GCT" : "GCT024270",
"LaSI" : 0,
"PedidoVenta" : "P056048",
"SPO" : "PO23579",
"STS" : "ST105862", //--> FIELD B
"Inks" : "true",
}
}
{
"_id" : ObjectId("5dcd78e2a2061070185401f0"),
"Lote" : "OFT083672", //--> FIELD A
"Flags" : {
"B2" : 1,
"EstadoOF" : 4,
"Finished" : false,
"STS" : "ST105862", //--> FIELD B
"ShipsFinished" : false,
"TipoOF" : 1,
"EstatIQC" : 1,
}
}
You have to use the other form of the $lookup stage, which takes a pipeline and so allows multiple conditions for the match.
Here's the query you have to run:
db.machines.aggregate([
{
$lookup: {
from: "works",
let: {
"nj": "$NextJobs"
},
pipeline: [
{
$match: {
$expr: {
$or: [
{
$in: [
"$Lote",
"$$nj"
]
},
{
$in: [
"$Flags.STS",
"$$nj"
]
}
]
}
}
}
],
as: "linkedWorks"
}
}
])
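The question also asks to keep the order of the NextJobs array, and as far as I know the pipeline form of $lookup does not sort linkedWorks by position in that array. One option is to reorder on the client after the aggregation returns. The following is only a sketch: orderWorksByJobs is a hypothetical helper, and the sample documents are trimmed versions of the ones in the question.

```javascript
// Reorder the joined works so they follow the order of NextJobs.
// A work matches a job code if either its Lote or its Flags.STS equals it.
function orderWorksByJobs(nextJobs, linkedWorks) {
    return nextJobs.map(function (job) {
        return {
            job: job,
            works: linkedWorks.filter(function (w) {
                return w.Lote === job || (w.Flags && w.Flags.STS === job);
            })
        };
    });
}

// Trimmed sample data, shaped like one result document of the aggregation.
const machine = {
    NextJobs: ["ST105862", "OFT083520"],
    linkedWorks: [
        { Lote: "OFT083520", Flags: { STS: "" } },
        { Lote: "OFT083671", Flags: { STS: "ST105862" } },
        { Lote: "OFT083672", Flags: { STS: "ST105862" } }
    ]
};

const ordered = orderWorksByJobs(machine.NextJobs, machine.linkedWorks);
console.log(ordered.map(e => e.job + ": " + e.works.map(w => w.Lote).join(",")));
```

Because FieldB is not unique, each job code maps to a list of works rather than a single one.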
I am trying to aggregate attributes from two collections, one of which contains a field that may or may not be present in a document. When the field is missing, the query returns no document at all. So I need a kind of null check: if the field is not there, ignore it; otherwise include it. Below is my query:
db.collection(collectionName).aggregate(
[{
$match: selector
}, {
$lookup: {
from: 'status',
localField: 'candidateId',
foreignField: 'candidateId',
as: 'profile'
}
}, {
$project: {
'_id': 0,
'currentStatus': '$profile.currentStatus',
'lastContacted': '$profile.lastContacted',
'lastWorkingDay': '$profile.lastWorkingDay',
'remarks': '$profile.remarks'
}
},{
$unwind: '$lastWorkingDay'
}]);
In this case, if lastWorkingDay is not present, the whole query returns nothing. Any pointers would be helpful.
I believe something else is wrong with your query.
This is a bit hard to analyse without any input data, so I made up my own.
I tried this on my local box just now, and it executes the way you'd expect.
A projection shouldn't remove any results. Here is my example:
Collection c1:
/* 1 */
{
"_id" : ObjectId("5c780eea79e5bed2bd00f85e"),
"candidateId" : "id1",
"currentStatus" : "a",
"lastContacted" : "b"
}
/* 2 */
{
"_id" : ObjectId("5c780efb79e5bed2bd00f863"),
"candidateId" : "id2",
"currentStatus" : "a",
"lastContacted" : "b",
"lastWorkingDay" : "yesterday"
}
Collection C2:
/* 1 */
{
"_id" : ObjectId("5c780f0a79e5bed2bd00f874"),
"candidateId" : "id1"
}
/* 2 */
{
"_id" : ObjectId("5c780f2879e5bed2bd00f87b"),
"candidateId" : "id2"
}
Aggregation:
db.getCollection('c2').aggregate( [
{$match: {}},
{ $lookup: {
from: "c1",
localField: "candidateId",
foreignField: "candidateId",
as : "profile"
} },
{$project: {
_id: 0,
"currentStatus" : "$profile.currentStatus",
"lastWorkingDay" : "$profile.lastWorkingDay"
} }
] )
Results:
/* 1 */
{
"currentStatus" : [
"a"
],
"lastWorkingDay" : []
}
/* 2 */
{
"currentStatus" : [
"a"
],
"lastWorkingDay" : [
"yesterday"
]
}
As you can see, lastWorkingDay is projected correctly for both documents in my aggregation.
Note that the lookup creates an array for profile, since there could be multiple results for the lookup. You may need to unwind it if you need it in more detail.
I hope this helps.
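The example above leaves out the $unwind stage from the question. By default, $unwind drops any document whose field is missing, null, or an empty array, which would explain the empty result when lastWorkingDay is absent. The document form of $unwind has a preserveNullAndEmptyArrays option that keeps such documents. A sketch of the question's pipeline with that option follows; the $match on selector is omitted because selector is not shown, and whether this matches the real data is an assumption.

```javascript
// The question's pipeline, with $unwind in its document form.
const pipeline = [
    { $lookup: {
        from: "status",
        localField: "candidateId",
        foreignField: "candidateId",
        as: "profile"
    } },
    { $project: {
        _id: 0,
        currentStatus: "$profile.currentStatus",
        lastContacted: "$profile.lastContacted",
        lastWorkingDay: "$profile.lastWorkingDay",
        remarks: "$profile.remarks"
    } },
    // Keep documents even when lastWorkingDay is missing, null, or [].
    { $unwind: { path: "$lastWorkingDay", preserveNullAndEmptyArrays: true } }
];

// Hypothetical usage: db.collection(collectionName).aggregate(pipeline)
console.log(JSON.stringify(pipeline[2]));
```

With the option set, a document with an empty lastWorkingDay array comes out with the field absent instead of disappearing from the result.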
Here is one of my documents in JSON format:
{
"_id" : ObjectId("5bfdb412a80939b6ed682090"),
"accounts" : [
{
"_id" : ObjectId("5bf106eee639bd0df4bd8e05"),
"accountType" : "DDA",
"productName" : "DDA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df8"),
"accountType" : "VSA",
"productName" : "VSA1"
},
{
"_id" : ObjectId("5bf106eee639bd0df4bd8df9"),
"accountType" : "VSA",
"productName" : "VSA2"
}
]
}
I want to write a query that gets every productName (without duplicates) where accountType = VSA.
I wrote this mongo query:
db.Collection.distinct("accounts.productName", {"accounts.accountType": "VSA" })
I expect: ['VSA1', 'VSA2']
I get: ['DDA1', 'VSA1', 'VSA2']
Does anybody know why the query filter doesn't work in distinct?
The second parameter of the distinct method represents:
A query that specifies the documents from which to retrieve the distinct values.
The query selects whole documents, not individual array elements. Your document has at least one element with accountType "VSA", so the entire document matches the condition "accounts.accountType": "VSA", and distinct then collects productName from every element of its accounts array, including the DDA one.
To fix that, use the Aggregation Framework: $unwind the nested array before you apply the filter, then use $group with $addToSet to get the unique values. Try:
db.col.aggregate([
{
$unwind: "$accounts"
},
{
$match: {
"accounts.accountType": "VSA"
}
},
{
$group: {
_id: null,
uniqueProductNames: { $addToSet: "$accounts.productName" }
}
}
])
which prints:
{ "_id" : null, "uniqueProductNames" : [ "VSA2", "VSA1" ] }
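To see the difference between the two approaches without a database, here is a plain JavaScript model of what each one does to the sample document. It only imitates the behaviour described above (document-level filtering versus element-level filtering) and is not how the server implements either operation.

```javascript
// The sample document from the question, without the _id fields.
const docs = [{
    accounts: [
        { accountType: "DDA", productName: "DDA1" },
        { accountType: "VSA", productName: "VSA1" },
        { accountType: "VSA", productName: "VSA2" }
    ]
}];

// distinct with a query: pick whole documents that have any VSA element,
// then collect productName from every element of those documents.
const distinctLike = [...new Set(
    docs.filter(d => d.accounts.some(a => a.accountType === "VSA"))
        .flatMap(d => d.accounts.map(a => a.productName))
)];

// $unwind + $match + $addToSet: filter the individual elements first.
const unwindLike = [...new Set(
    docs.flatMap(d => d.accounts)
        .filter(a => a.accountType === "VSA")
        .map(a => a.productName)
)];

console.log(distinctLike, unwindLike);
```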
I have the 3 documents below. Each represents a contact for a user:
{
"_id" : ObjectId("57f9f9f3b91d070315273d0d"),
"profileId" : "test",
"displayName" : "duplicateTest",
"email" : [
{
"emailId" : "a#a.com"
},
{
"emailId" : "b#b.com"
},
{
"emailId" : "c#c.com"
}
]
}
{
"_id" : ObjectId("57f9fab2b91d070315273d11"),
"profileId" : "test",
"displayName" : "duplicateTest2",
"email" : [
{
"emailId" : "a#a.com"
}
]
}
{
"_id" : ObjectId("57f9fcefb91d070315273d15"),
"profileId" : "test",
"displayName" : "duplicateTest2",
"email" : [
{
"emailId" : "b#b.com"
}
]
}
I need to aggregate/group them by array elements so that I can identify duplicate contacts (based on email id). Since there is a common email id between docs 1 and 2 and between docs 1 and 3, these 3 represent one contact and should be merged into one.
I tried doing this using $unwind and $group in Java as below:
List<DBObject> aggList = new ArrayList<DBObject>();
BasicDBObject dbo = new BasicDBObject("$match", new BasicDBObject("profileId", "0fb72dcf-292b-4343-a0e7-1d613a803b1e"));
aggList.add(dbo);
BasicDBObject dboUnwind = new BasicDBObject("$unwind", "$email");
aggList.add(dboUnwind);
BasicDBObject dboGroup = new BasicDBObject("$group",
new BasicDBObject().append("_id", new BasicDBObject("name", "$email.emailId"))
.append("uniqueIds", new BasicDBObject("$addToSet", "$_id"))
.append("count", new BasicDBObject("$sum", 1)));
aggList.add(dboGroup);
BasicDBObject dboCount = new BasicDBObject("$match", new BasicDBObject("count", new BasicDBObject("$gte", 2)));
aggList.add(dboCount);
BasicDBObject dboSort = new BasicDBObject("$sort", new BasicDBObject("count",-1));
aggList.add(dboSort);
BasicDBObject dboLimit = new BasicDBObject("$limit", 10);
aggList.add(dboLimit);
AggregationOutput output = collection.aggregate(aggList);
System.out.println(output.results());
This groups docs by email id (and rightly so) but doesn't serve the purpose.
Any help would be highly appreciated.
I need to implement a feature where the user is prompted about possible duplicate contacts in their repository. I need the aggregation result to be something like:
[
{
"_id":{
"name":[
{
"emailId" : "a#a.com"
},
{
"emailId" : "b#b.com"
},
{
"emailId" : "c#c.com"
}
]
},
"uniqueIds":[
{
"$oid":"57f9fcefb91d070315273d15"
},
{
"$oid":"57f9fcefb91d070315273d11"
},
{
"$oid":"57f9fcefb91d070315273d15"
}
],
"count":3
},
So basically, I need the _ids of all possible duplicate contacts (there could be another group of duplicates with a list of _ids like the one above) so that I can show them to the user, who can then merge them at will.
Hope it's clearer now. Thanks!
Well, your question differs a bit from the result you are seeking. Your initial question pointed me to the following aggregation:
db.table.aggregate(
[
{
$unwind: "$email"
},
{
$group: {
_id : "$email.emailId",
duplicates : { $addToSet : "$_id"}
}
}
]
);
This results in:
{
"_id" : "c#c.com",
"duplicates" : [
ObjectId("57f9f9f3b91d070315273d0d")
]
}
{
"_id" : "b#b.com",
"duplicates" : [
ObjectId("57f9fcefb91d070315273d15"),
ObjectId("57f9f9f3b91d070315273d0d")
]
}
{
"_id" : "a#a.com",
"duplicates" : [
ObjectId("57f9fab2b91d070315273d11"),
ObjectId("57f9f9f3b91d070315273d0d")
]
}
That is, grouped by email.
But the sample output you added to your question suggests this aggregation instead:
db.table.aggregate(
[
{
$unwind: "$email"
},
{
$group: {
_id : "$profileId",
emails : { $addToSet : "$email.emailId"},
duplicates : { $addToSet : "$_id"}
}
}
]
);
Which results in:
{
"_id" : "test",
"emails" : [
"c#c.com",
"b#b.com",
"a#a.com"
],
"duplicates" : [
ObjectId("57f9fcefb91d070315273d15"),
ObjectId("57f9fab2b91d070315273d11"),
ObjectId("57f9f9f3b91d070315273d0d")
]
}
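Neither aggregation links contacts transitively: the first groups per email, and the second groups everything under one profileId whether or not the contacts share an email. If the goal is clusters of contacts connected through any chain of shared emails, one option is to post-process the per-email groups on the client with a union-find structure. This is a sketch under that assumption; clusterDuplicates is a hypothetical helper, and ids are assumed to have been converted to strings first.

```javascript
// Cluster contact ids linked, directly or transitively, by a shared email.
// Input rows are shaped like the per-email aggregation output.
function clusterDuplicates(rows) {
    const parent = new Map();
    function find(x) {
        if (!parent.has(x)) parent.set(x, x);
        while (parent.get(x) !== x) {
            parent.set(x, parent.get(parent.get(x))); // path halving
            x = parent.get(x);
        }
        return x;
    }
    function union(a, b) {
        const ra = find(a), rb = find(b);
        if (ra !== rb) parent.set(ra, rb);
    }
    // Every id that shares an email is joined with the first id of that row.
    for (const row of rows) {
        row.duplicates.forEach(id => union(row.duplicates[0], id));
    }
    // Collect ids by root, and keep only groups with more than one contact.
    const clusters = new Map();
    for (const id of parent.keys()) {
        const root = find(id);
        if (!clusters.has(root)) clusters.set(root, []);
        clusters.get(root).push(id);
    }
    return [...clusters.values()].filter(c => c.length > 1).map(c => c.sort());
}

const rows = [
    { _id: "c#c.com", duplicates: ["57f9f9f3b91d070315273d0d"] },
    { _id: "b#b.com", duplicates: ["57f9fcefb91d070315273d15", "57f9f9f3b91d070315273d0d"] },
    { _id: "a#a.com", duplicates: ["57f9fab2b91d070315273d11", "57f9f9f3b91d070315273d0d"] }
];
const clusters = clusterDuplicates(rows);
console.log(clusters);
```

For the three sample contacts this yields a single cluster containing all three ids, since docs 2 and 3 are both linked to doc 1.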
I have a collection of records as follows:
{
"_id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"p":{
"type":"1",
"txt":"test message"
},
"users":[
{
"uid":"52872ed59542f",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"uid":"524eb460986e4",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"uid":"524179060781e",
"pt":ISODate("2013-11-27T12:48:35Z")
}
],
},
{
"_id":418,
"ptime":ISODate("2013-11-25T11:18:42.961Z"),
"p":{
"type":"1",
"txt":"test message 2"
},
"users":[
{
"uid":"524eb460986e4",
"pt":ISODate("2013-11-23T11:18:42.961Z")
},
{
"uid":"52872ed59542f",
"pt":ISODate("2013-11-24T11:18:42.961Z")
},
{
"uid":"524179060781e",
"pt":ISODate("2013-11-22T12:48:35Z")
}
],
}
How can I sort the above records in descending order of ptime and pt, where the user's uid = "52872ed59542f"?
If you want to do such a sort, you probably want to store your data in a different way. MongoDB is generally not nearly as good at manipulating nested documents as it is at top-level fields. In your case, I would recommend splitting out ptime, pt and uid into their own collection:
messages
{
"_id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"type":"1",
"txt":"test message"
},
users
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"52872ed59542f",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"524eb460986e4",
"pt":ISODate("2013-11-26T11:18:42.961Z")
},
{
"id":417,
"ptime":ISODate("2013-11-26T11:18:42.961Z"),
"uid":"524179060781e",
"pt":ISODate("2013-11-27T12:48:35Z")
}
You can then set an index on the users collection for uid, ptime and pt.
You will need to do two queries to also get the text messages themselves though.
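A sketch of how the index and the two queries could fit together, assuming the split-out collections are named users and messages as above. The shell calls are shown only in comments; attachMessages is a hypothetical helper for the second step, and the sample rows are trimmed.

```javascript
// Index and query shapes for the split-out `users` collection.
// Hypothetical usage in the shell:
//   db.users.createIndex(indexSpec);
//   const rows = db.users.find(filter).sort(sortSpec).toArray();
//   const msgs = db.messages.find({ _id: { $in: rows.map(r => r.id) } }).toArray();
const indexSpec = { uid: 1, ptime: -1, pt: -1 };
const filter = { uid: "52872ed59542f" };
const sortSpec = { ptime: -1, pt: -1 };

// Second step: attach each message to its row while keeping the sorted
// order, since an $in query returns documents in no particular order.
function attachMessages(rows, msgs) {
    const byId = new Map(msgs.map(m => [m._id, m]));
    return rows.map(r => Object.assign({}, r, { message: byId.get(r.id) || null }));
}

// Trimmed sample data: rows already sorted, messages deliberately shuffled.
const rows = [{ id: 417 }, { id: 418 }];
const msgs = [
    { _id: 418, txt: "test message 2" },
    { _id: 417, txt: "test message" }
];
const joined = attachMessages(rows, msgs);
console.log(joined.map(r => r.message.txt));
```

The index starts with uid so the equality filter and the descending sort on ptime and pt can be served by the same index.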
You can use the Aggregation Framework to sort first by ptime and then by the users.pt field, as follows.
db.users.aggregate([
// Flatten the users array and keep only the entries for this user.
{$unwind : "$users"},
{$match: {"users.uid" : "52872ed59542f"}},
// Rebuild one document per message, holding only the matched user entries.
{$group : {_id : "$_id", "ptime" : {$first : "$ptime"}, "p" : {$first : "$p"}, users : {$push : "$users"}}},
// Sort last, because $group does not guarantee output order; -1 means descending.
{$sort : {"ptime" : -1, "users.pt" : -1}}
]);
db.yourcollection.find(
{
users:{
$elemMatch:{uid:"52872ed59542f"}
}
}).sort({ptime:-1})
But you will have problems ordering by the pt field. For that, use the Aggregation Framework to project the data, or use Derick's approach.