$lookup on Embedded Documents in MongoDB: Does order of values matter?

I have two similar collections within the same database which I am trying to merge using $lookup and the aggregate pipeline. Their _ids, which I'm using as the matching field, contain the same values, but in a different order:
Collection1:
{ "_id" : { "State" : "Vermont", "Race" : "Black American or African American" }, "Population" : 6456 }
Collection2:
{ "_id" : { "Race" : "Multiracial", "State" : "Arkansas" }, "Population" : 48996 }
I tried running the aggregate pipeline as follows:
db.Collection1.aggregate([{$lookup: {from: "Collection2", localField: "_id", foreignField: "_id", as: "Population"}}])
However, when I do that, I get:
{ "_id" : { "Race" : "Multiracial", "State" : "Arkansas" }, "Population" : [ ] }
I'd like to get the values for population within the array. I'm fairly new to MongoDB. Is there something wrong with my syntax for the aggregate command, or is it failing because 'Race' and 'State' are listed in a different order within the embedded document _id? Does the order of the values matter for matching on embedded documents?
Thank you so much for your time, and I appreciate any suggestions.
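On the order question: as far as I know, MongoDB compares embedded documents field by field in stored order, so an equality match (including the localField/foreignField form of $lookup) treats { State, Race } and { Race, State } as different values. The sketch below is plain Python with no database connection. `bson_equal` is a hypothetical helper that only mimics that order-sensitive comparison for flat documents, and the pipeline at the end is one possible workaround, assuming MongoDB 3.6 or later, that compares the subfields individually. The output field name `matches` is also an assumption, chosen so the existing Population value is not overwritten.

```python
def bson_equal(a, b):
    """Mimic order-sensitive equality for flat embedded documents:
    compare the (key, value) pairs in their stored order."""
    return list(a.items()) == list(b.items())

id_1 = {"State": "Vermont", "Race": "Multiracial"}
id_2 = {"Race": "Multiracial", "State": "Vermont"}

print(id_1 == id_2)            # True: plain Python dicts ignore key order
print(bson_equal(id_1, id_2))  # False: same fields, different order

# Possible workaround (assumes MongoDB 3.6+): compare each subfield separately
# so that the field order inside _id no longer matters.
pipeline = [
    {"$lookup": {
        "from": "Collection2",
        "let": {"state": "$_id.State", "race": "$_id.Race"},
        "pipeline": [
            {"$match": {"$expr": {"$and": [
                {"$eq": ["$_id.State", "$$state"]},
                {"$eq": ["$_id.Race", "$$race"]},
            ]}}},
        ],
        "as": "matches",  # hypothetical name; keeps the original Population field
    }}
]
```

With PyMongo this would be run as `db.Collection1.aggregate(pipeline)`, where `db` is an assumed database handle.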

Related

How to find MongoDB documents in one collection not referenced by documents in another collection?

I am looking for an efficient way in MongoDB to determine which documents in one collection are not referenced by documents in another collection.
The database comprises two collections, inventory and tags, where some (not all) documents in inventory reference one of the tags documents:
{
"_id" : ObjectId("5e8df3c02e197074f39f61ea"),
"tag" : ObjectId("5e89a1af96d5d8b30aead768"),
"ean" : "5707196199178",
"location" : "shelf 1"
},
{
"_id" : ObjectId("5e8df211727079cdc24e20e1"),
"ean" : "5707196199178",
"location" : "shelf 1"
}
The 'tags' documents are without any reference to documents in inventory:
{
"_id" : ObjectId("5e7d174fc63ce5b0ca80b89a"),
"nfc" : { "id" : "04:5f:ae:f2:c2:66:81" },
"barcode" : { "code" : "29300310", "type" : "EAN8" }
},
{
"_id" : ObjectId("5e89a1af96d5d8b30aead768"),
"nfc" : { "id" : "04:48:af:f2:c2:66:80" },
"barcode" : { "code" : "29300716", "type" : "EAN8" }
},
{
"_id" : ObjectId("5e7d1756c63ce5b0ca80b89c"),
"nfc" : { "id" : "04:02:ae:f2:c2:66:81" },
"barcode" : { "code" : "29300648", "type" : "EAN8" }
}
Since not all documents in tags are used in inventory documents, I cannot simply have them as sub-documents.
Now I need to determine which of the tags documents are not referenced by any inventory document. I would prefer not to maintain back references from tags to inventory, to avoid the risk of inconsistencies (unless MongoDB can do this automatically?).
I'm very new to MongoDB, and from what I've learned so far I'm under the impression that a view is probably what I need. But I seem to lack the proper search terms to find examples that help me understand enough to proceed. Maybe I need something different; I'm hoping for your input to point me in the right direction.
You need an aggregation with the $lookup stage, which joins two collections.
For a tags document that is not referenced by any inventory document, the joined field will be an empty array.
The next stage keeps only the documents with an empty array, using the $size operator.
Try the query below:
db.tags.aggregate([
{
$lookup: {
from: "inventory",
localField: "_id",
foreignField: "tag",
as: "join"
}
},
{
$match: {
"join": {
$size: 0
}
}
},
{
$project: {
join: 0
}
}
])
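To make the logic of the pipeline above concrete, here is a small in-memory sketch in plain Python of what the $lookup, $match and $project stages compute. The short ids (`t1`, `i1`, and so on) and the function name `unreferenced_tags` are made up for illustration; real documents would carry ObjectId values.

```python
# Hypothetical sample data; strings stand in for ObjectId values.
tags = [{"_id": "t1"}, {"_id": "t2"}, {"_id": "t3"}]
inventory = [
    {"_id": "i1", "tag": "t2", "location": "shelf 1"},
    {"_id": "i2", "location": "shelf 1"},  # no tag reference
]

def unreferenced_tags(tags, inventory):
    """Return the tags that no inventory document points at."""
    result = []
    for tag in tags:
        # $lookup: collect inventory documents whose "tag" equals this tag's _id
        join = [item for item in inventory if item.get("tag") == tag["_id"]]
        # $match with $size: 0 keeps only tags whose join array is empty
        if len(join) == 0:
            # $project join: 0 means the join array is left out of the output
            result.append(tag)
    return result

print([tag["_id"] for tag in unreferenced_tags(tags, inventory)])  # ['t1', 't3']
```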

Mongo Query: how to $lookup with DBRef

I'm having trouble using $lookup with a DBRef, and I couldn't find a solution for the scenario below anywhere. Can someone please help?
Suppose the Collection A is
{
"_id" : ObjectId("582abcd85d2dfa67f44127e0"),
"status" : NumberInt(1),
"seq" : NumberInt(0) }
and Collection B:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"Name" : "from B Collection",
"bid" : DBRef("B", ObjectId("582abcd85d2dfa67f44127e0")) }
I have spent a lot of time trying to aggregate the two collections above. I am looking for output like the following:
{
"_id" : ObjectId("582abcd85d2dfa67f44127e0"),
"status" : NumberInt(1),
"seq" : NumberInt(0),
B: [
{
"_id" : ObjectId("582abcd85d2dfa67f44127e1"),
"Name" : "from B Collection"
}
]}
Please help me with the Mongo query to retrieve the result in the above format. Thanks in advance
Ideally you would be able to change the DBRef to a plain objectId or just string type. As noted in this post, it can be convoluted to use a DBRef in a lookup. The key is an $addFields stage with {$objectToArray: "$$ROOT.bid"} to get the DBRef value into a usable format.
You'll need to start the aggregation from collection B, since that is where the reference lives, and the DBRef needs massaging before the lookup. Given that, the target output shape may need to change; still, here is an aggregation that produces the output you asked for:
db.getCollection('B').aggregate([
{$addFields: {fk: {$objectToArray: "$$ROOT.bid"}}},
{$lookup: {
from: 'A',
localField: 'fk.1.v',
foreignField: '_id',
as: 'A'
}},
// the below is transforming data into the format in the example
{$addFields: {'A.B': {_id: '$_id', Name: '$Name'}}},
{$unwind: '$A'},
{$replaceRoot: {newRoot: '$A'}}
])
If several B documents reference the same A document, you might need an additional $group stage to collect them into a single array.
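The `fk.1.v` path in the pipeline above may look opaque. A DBRef is stored as a small document with `$ref` followed by `$id`, and $objectToArray turns a document into an array of k/v pairs, so index 1 holds the referenced _id. The plain-Python sketch below mimics that transformation; `object_to_array` is a hypothetical stand-in for the server-side operator, and a string stands in for the ObjectId.

```python
def object_to_array(doc):
    """Mimic $objectToArray: one {"k": ..., "v": ...} entry per field, in field order."""
    return [{"k": key, "v": value} for key, value in doc.items()]

# Assumed stored shape of the DBRef in the "bid" field (string instead of ObjectId).
bid = {"$ref": "B", "$id": "582abcd85d2dfa67f44127e0"}

fk = object_to_array(bid)
print(fk[0])       # {'k': '$ref', 'v': 'B'}
print(fk[1]["v"])  # 582abcd85d2dfa67f44127e0, the value that 'fk.1.v' points at
```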

From two collections how to filter un matching data

In the DB I have some sample data as follows:
items(Collection name)
//Object 1
{
"_id" : 1234,
"itemCode" : 3001,// (Number)
"category" : "Biscuts"
}
//Object 2
{
"_id" : 1235,
"itemCode" : 3002,// (Number)
"category" : "Health products"
}
The above is the sample data in the items collection. There are many such objects, each with a unique item code.
orders(Collection name)
{
"_id" : 1456,
"customer" : "ram",
"address" : "india",
"type" : "order",
"date" : "2018/08/20",
"orderId" : "999",
"itemcode" : "3001"//('string')
}
The above is the orders sample data. This collection also has many objects, with repeating item codes and order ids.
In the application, we have a tab called "items not billed". In this tab, we want to see the items which were not used even once in an order. From the above data, how can I show the items which were not used?
For example: from the above data the resulting itemcode should be 3002, because that item has never been used in an order. How can I get this output with one DB query?
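You can use the aggregation below in MongoDB 4.0 or later. $toString converts the numeric itemCode to a string so it can be compared with the string itemcode in orders.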
db.items.aggregate([
{ $addFields: {
itemCodeStr: {$toString: "$itemCode"}
}},
{
$lookup: {
from: "orders",
localField: "itemCodeStr",
foreignField: "itemcode",
as: "matched-orders"
}
},
{
$match: {
"matched-orders": []
}
}
])
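The conversion step matters because a number and a string never compare equal in a join. A small in-memory sketch in plain Python, using the sample documents from the question (trimmed to the relevant fields), walks through the same three steps; `items_not_ordered` is a hypothetical helper, not part of any MongoDB driver.

```python
items = [
    {"_id": 1234, "itemCode": 3001, "category": "Biscuts"},
    {"_id": 1235, "itemCode": 3002, "category": "Health products"},
]
orders = [
    {"_id": 1456, "customer": "ram", "orderId": "999", "itemcode": "3001"},
]

def items_not_ordered(items, orders):
    """Return the items whose code never appears in any order."""
    ordered_codes = {order["itemcode"] for order in orders}  # codes stored as strings
    result = []
    for item in items:
        code_str = str(item["itemCode"])   # plays the role of $toString
        if code_str not in ordered_codes:  # same as an empty "matched-orders" array
            result.append(item)
    return result

print(3001 == "3001")  # False: without conversion nothing would ever match
print([item["itemCode"] for item in items_not_ordered(items, orders)])  # [3002]
```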

Mongo Db : $lookup (aggregation) / join with two different data types

I am trying to use the $lookup stage in MongoDB 3.4:
db.orders.aggregate([
{
$lookup:
{
from: "inventory",
localField: "item",
foreignField: "sku",
as: "inventory_docs"
}
}
])
orders:
{ "_id" : 1, "itemid" : "1234", "price" : 12, "quantity" : 2 }
inventory:
{ "_id" : 1, "skuid" : 123, description: "product 1", "instock" : 120 }
The problem here is that the fields to be joined have different types: one is a string and the other an integer. How can I make this lookup work in MongoDB?
It's not possible to change the datatype inside the $lookup step of the aggregation pipeline. The topic has already been discussed here:
Change type of field inside mongoDB aggregation and does $lookup utilises index on fields or not?
how to convert string to numerical values in an aggregate query in MongoDB.
In both threads the final solution was the same: convert the data type programmatically before running the aggregation.
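For readers on a newer server: assuming MongoDB 4.0 or later, where type conversion operators such as $toInt are available, the conversion can be done in a stage before $lookup. The sketch below builds such a pipeline as a Python list for PyMongo. The field names `itemid` and `skuid` are taken from the sample documents (the question's pipeline uses `item` and `sku`), and `itemid_int` is a made-up temporary field.

```python
pipeline = [
    # Convert the string itemid to an integer in a temporary field (MongoDB 4.0+).
    {"$addFields": {"itemid_int": {"$toInt": "$itemid"}}},
    # Join on the converted value so both sides have the same type.
    {"$lookup": {
        "from": "inventory",
        "localField": "itemid_int",
        "foreignField": "skuid",
        "as": "inventory_docs",
    }},
    # Drop the temporary field from the output.
    {"$project": {"itemid_int": 0}},
]

# With a PyMongo database handle `db` (assumed), this would run as:
# results = list(db.orders.aggregate(pipeline))
```

Note that $toInt raises an error for strings that are not valid integers, so this only suits data where every itemid is numeric.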

MongoDB - simple sub query example

Given the data:
> db.parameters.find({})
{ "_id" : ObjectId("56cac0cd0b5a1ffab1bd6c12"), "name" : "Speed", "groups" : [ "123", "234" ] }
> db.groups.find({})
{ "_id" : "123", "name" : "Group01" }
{ "_id" : "234", "name" : "Group02" }
{ "_id" : "567", "name" : "Group03" }
I would like to supply a parameter _id and have a query return all groups that are listed in the groups array of the given document in the parameters collection.
The straightforward solution seems to make several DB calls in PyMongo:
1. Get the parameter from the parameters table based on the supplied _id
2. For each element of the groups array, select a document from the groups collection
But this will have so much unnecessary overhead. I feel there must be a better, faster way to do this within MongoDB (without running custom JS in the DB). Or should I re-structure my data by normalising it a little bit (like a table of relationships), neglecting the document-based approach?
Again, please help me find a solution that would work from PyMongo DB interface
You can do this within a single query using the aggregation framework. In particular you'd need to run an aggregation pipeline that uses the $lookup operator to do a left join from the parameters collection to the groups collection.
Consider running the following pipeline:
db.parameters.aggregate([
{ "$unwind": "$groups" },
{
"$lookup": {
"from": "groups",
"localField": "groups",
"foreignField": "_id",
"as": "grp"
}
},
{ "$unwind": "$grp" }
])
Sample Output
/* 1 */
{
"_id" : ObjectId("56cac0cd0b5a1ffab1bd6c12"),
"name" : "Speed",
"groups" : "123",
"grp" : {
"_id" : "123",
"name" : "Group01"
}
}
/* 2 */
{
"_id" : ObjectId("56cac0cd0b5a1ffab1bd6c12"),
"name" : "Speed",
"groups" : "234",
"grp" : {
"_id" : "234",
"name" : "Group02"
}
}
If your MongoDB server version does not support the $lookup pipeline operator, you'd need to execute two queries as follows:
from bson.objectid import ObjectId
# get the group ids
ids = db.parameters.find_one({ "_id": ObjectId("56cac0cd0b5a1ffab1bd6c12") })["groups"]
# query the groups collection with the ids from the previous query
groups = list(db.groups.find({ "_id": { "$in": ids } }))
EDIT: matched the field name in the aggregation query to the field name in example dataset (within the question)
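Since the question asks for PyMongo specifically, the pipeline can also be built in Python. The sketch below adds a $match stage so that only the requested parameter document is joined, and a final $replaceRoot (which assumes MongoDB 3.4 or later) so that each result is a plain group document. `build_groups_pipeline` is a hypothetical helper name, and the `db` handle in the usage comment is assumed to be a PyMongo database.

```python
def build_groups_pipeline(parameter_id):
    """Build an aggregation pipeline returning the groups of one parameter document."""
    return [
        # Restrict the join to the requested parameter document.
        {"$match": {"_id": parameter_id}},
        # One document per entry of the groups array.
        {"$unwind": "$groups"},
        # Left join each group id against the groups collection.
        {"$lookup": {
            "from": "groups",
            "localField": "groups",
            "foreignField": "_id",
            "as": "grp",
        }},
        # Flatten the single-element join result.
        {"$unwind": "$grp"},
        # Return just the group document.
        {"$replaceRoot": {"newRoot": "$grp"}},
    ]

# Usage with PyMongo (assumed database handle `db`):
# from bson.objectid import ObjectId
# pipeline = build_groups_pipeline(ObjectId("56cac0cd0b5a1ffab1bd6c12"))
# groups = list(db.parameters.aggregate(pipeline))
```

For the sample data this should yield the Group01 and Group02 documents, in one round trip instead of two.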