I have many collections whose documents all have the same shape as the JSON below. I want to combine the documents from these collections by _id into a single collection. How can I do it?
A.json
{
"_id" : ObjectId("58455d2d506c1cab1c82152c"),
"value" : 515835.0
}
{
"_id" : ObjectId("58455d2d506c1cab1c82153c"),
"value" : 6621696.0
}
B.json
{
"_id" : ObjectId("58455d2d506c1cab1c82152c"),
"value" : 2118.0
}
{
"_id" : ObjectId("58455d2d506c1cab1c82153c"),
"value" : 1190.0
}
{
"_id" : ObjectId("423232d2d506c1cab1c1232c"),
"value" : 10.0
}
If an _id in collection A also exists in collection B, the two values should be added together.
If an _id exists in only one collection, only that value should be shown.
The resulting collection should contain one document per _id with the combined value.
Result.json
{
"_id" : ObjectId("58455d2d506c1cab1c82152c"),
"value" : 517953.0 // A.value + B.value
}
{
"_id" : ObjectId("58455d2d506c1cab1c82153c"),
"value" : 6622886.0 // A.value + B.value
}
{
"_id" : ObjectId("423232d2d506c1cab1c1232c"),
"value" : 10.0 // B.value (A.value : null)
}
I want this for multiple collections.
For the one-to-zero-or-one relation given in the example, you can use $lookup as follows:
db.B.aggregate([
{$lookup: {
from: "A",
localField: "_id",
foreignField: "_id",
as: "a"
}},
{$unwind: {path: "$a", preserveNullAndEmptyArrays: true}},
{$project: {
value: {$add: ["$value", {$ifNull: ["$a.value", 0 ]}]}
}}
]);
Note that this ignores any documents in A that have no corresponding document in B, i.e. the result has the same number of documents as collection B.
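If documents that exist only in A must also appear in the result, or if more than two collections are involved, one option on MongoDB 4.4 or later is to union the collections and group by _id. The following is a minimal sketch, not part of the answer above; the helper name buildSumPipeline and the output collection name Result are assumptions made for illustration.

```javascript
// Hypothetical helper: builds a pipeline that unions the given collections
// into the one being aggregated and sums "value" per _id.
// Assumes MongoDB 4.4+ ($unionWith) and a numeric "value" field.
function buildSumPipeline(otherCollections, outputCollection) {
  const stages = otherCollections.map((name) => ({ $unionWith: name }));
  // One output document per _id; $sum ignores missing or non-numeric values.
  stages.push({ $group: { _id: "$_id", value: { $sum: "$value" } } });
  // Write the result to a separate collection (replaces it if it exists).
  stages.push({ $out: outputCollection });
  return stages;
}

// Usage in the mongo shell (collection names taken from the question):
// db.A.aggregate(buildSumPipeline(["B"], "Result"));
```

Because every collection is unioned before grouping, an _id that appears in only one collection still yields a document with just that value.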
Thanks to the help of the people here, I successfully managed to $lookup two IDs in my document and resolve them to their respective documents in another collection. The next step is to look up a further "nested" ID (referring to a document in yet another collection).
I tried simply adding another $lookup stage, but that only partly worked.
An "empty" bloodline array ended up in the chieftain field, and all other attributes of chieftain were removed.
See my current aggregate:
db.getCollection('village').aggregate([
{
"$match": { _id: "111" }
},
{
"$lookup": {
from: "character",
localField: "chieftainId",
foreignField: "_id",
as: "chieftain"
}
},
{
"$lookup": {
from: "character",
localField: "villagerIds",
foreignField: "_id",
as: "villagers"
}
},
{
"$lookup": {
from: "bloodline",
localField: "chieftain.bloodline",
foreignField: "_id",
as: "chieftain.bloodline"
}
},
{ "$project" : { "villagerIds" : 0, "chieftainId" : 0}},
{ "$unwind" : "$chieftain" }
])
The result of that is the following:
{
"_id" : "111",
"name" : "MyVillage",
"reputation" : 0,
"chieftain" : {
"bloodline" : []
},
"villagers" : [
{
"_id" : "333",
"name" : "Bortan",
"age" : 21,
"bloodlineId" : "7f02191f-90af-406e-87ff-41d5b4387999",
"villageId" : "foovillage",
"professionId" : "02cbb10a-6c0f-4249-a932-3f40e12d32c5"
},
{
"_id" : "444",
"name" : "Blendi",
"age" : 21,
"bloodlineId" : "b3a8ffeb-27aa-4e2e-a8e6-b382554f326a",
"villageId" : "foovillage",
"professionId" : "45dc9350-c84a-491d-a49a-524834dd5773"
}
]
}
I expected the chieftain part to look like this (this is what the chieftain document looks like without the last $lookup I added):
"chieftain" : {
"_id" : "222",
"name" : "Bolzan",
"age" : 21,
"bloodlineId" : "7c2926f9-2f20-4ccf-846a-c9966970fa9b", // this should be resolved/lookedup
"villageId" : "foovillage",
},
At the point of the bloodline lookup, chieftain is an array, so setting chieftain.bloodline replaces the array with an object containing only the bloodline field.
Move the { "$unwind" : "$chieftain" } stage to before the bloodline lookup stage so that the lookup is dealing with an object.
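Applied to the pipeline from the question, the reordering could look like the sketch below. It also assumes that the reference field is chieftain.bloodlineId (as in the sample chieftain document) rather than chieftain.bloodline; that field name is an assumption and should be checked against the real data.

```javascript
// Sketch of the reordered pipeline; collection and field names come from the
// question, except localField "chieftain.bloodlineId", which is an assumption
// based on the sample chieftain document.
const villagePipeline = [
  { $match: { _id: "111" } },
  { $lookup: { from: "character", localField: "chieftainId", foreignField: "_id", as: "chieftain" } },
  // Unwind first so that "chieftain" is a single object, not an array.
  { $unwind: "$chieftain" },
  { $lookup: { from: "character", localField: "villagerIds", foreignField: "_id", as: "villagers" } },
  // The joined bloodline documents are added next to the other chieftain fields.
  { $lookup: { from: "bloodline", localField: "chieftain.bloodlineId", foreignField: "_id", as: "chieftain.bloodline" } },
  { $project: { villagerIds: 0, chieftainId: 0 } }
];

// Usage in the mongo shell:
// db.getCollection("village").aggregate(villagePipeline);
```

Note that chieftain.bloodline is still an array of matched documents after this stage; a further $unwind on that path would turn it into a single object.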
collection A document 1
{
"_id" : ObjectId("5c2ee03224acf45a663d8f09"),
"_class" : "document.domain.DDD",
"generated" : false,
"linkIds" : [],
"types" : [],
"metadata" : {
"templateId" : "ABDC",
"userId" : "Master"
}
"versions" : [
{
"revision" : "fb4fb8ec-edfe-4a3e-a1a9-c8c4b2bce678",
"fileId" : "5c2ee03224acf45a663d8f08"
}
]
}
collection B document 1
{
"_id" : ObjectId("5c2ee03224acf45a663d8f08"),
"_class" : "document.domain.RDF",
"extension" : ".pdf",
"rootPath" : "D",
"size" : 152754
}
The fileId field in document 1 of collection A is stored as a string, but its value corresponds to the ObjectId _id of document 1 in collection B.
How can I look up this string in collection B, where it appears as an ObjectId?
You have to use the MongoDB aggregation framework here.
You can read more about it at
https://docs.mongodb.com/manual/core/aggregation-pipeline/
You also need the $lookup stage, which is documented at
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
You can achieve your task like this:
db.collectionA.aggregate([
{
$match: {} // matches all documents
},
{
$lookup:
{
from: "collectionB", // collection to join
localField: "versions.fileId", // field from the input documents (must have the same type as foreignField to match)
foreignField: "_id", // field from the documents of the "from" collection
as: "versions.file" // output array field
}
}])
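Since $lookup only matches values of the same type, a string fileId will not match an ObjectId _id as it stands. One possible approach, assuming MongoDB 4.0 or later, is to convert the string with $toObjectId before the join. The sketch below is not from the answer above; the collection names collectionA and collectionB and the field names versions.fileObjectId and versions.file are placeholders chosen for illustration.

```javascript
// Sketch assuming MongoDB 4.0+ ($toObjectId) and the placeholder collection
// names "collectionA" / "collectionB"; "versions.fileObjectId" and
// "versions.file" are illustrative field names, not part of the original data.
const fileLookupPipeline = [
  // One document per entry of the versions array.
  { $unwind: "$versions" },
  // Convert the string fileId into an ObjectId so the types match.
  { $addFields: { "versions.fileObjectId": { $toObjectId: "$versions.fileId" } } },
  {
    $lookup: {
      from: "collectionB",
      localField: "versions.fileObjectId",
      foreignField: "_id",
      as: "versions.file"
    }
  }
];

// Usage in the mongo shell:
// db.collectionA.aggregate(fileLookupPipeline);
```

$toObjectId raises an error for strings that are not valid 24-character hexadecimal ObjectIds, so this sketch assumes every fileId is well formed.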
How do I join collections in MongoDB?
Example:
a collection
{
ObjectId : ObjectId("ABCDE"),
userName : "Jason",
level : 30,
money : 200
}
b collection
{
Id : ObjectId("AAACC"),
userId : ObjectId("ABCDE"),
item : "sword"
}
b.aggregate....
The result I want is:
id : ObjectId("AAACC"), userName : "Jason", item : "sword"
You should use the aggregation pipeline to join two collections and select the required data. I assume here that you have proper identity fields named _id instead of ObjectId and Id as in your sample:
db.items.aggregate([
{
$lookup:
{
from: "users",
localField: "userId",
foreignField: "_id", // ObjectId in your sample
as: "user"
}
},
{ $unwind: "$user" },
{
$project:
{
"item": 1,
"userName": "$user.userName"
// Id: 1 if you will use your names, because only _id is selected by default
}
}
])
The first stage is $lookup, which joins the items and users collections where the userId field of an item equals the _id field of a user.
Then you should $unwind the results, because $lookup puts all matched users into the user field as an array of user documents.
The last stage is $project, which shapes the result documents into the desired format.
Now an example. If you have the following documents in the items collection:
{
"_id" : ObjectId("5c18df3e5d85eb27052a599c"),
"item" : "sword",
"userId" : ObjectId("5c18ded45d85eb27052a5988")
},
{
"_id" : ObjectId("5c18df4f5d85eb27052a599e"),
"item" : "helmet",
"userId" : ObjectId("5c18ded45d85eb27052a5988")
},
{
"_id" : ObjectId("5c18e2da5d85eb27052a59ee"),
"item" : "helmet"
}
And you have two users:
{
"_id" : ObjectId("5c18ded45d85eb27052a5988"),
"userName" : "Jason",
"level" : 30,
"money" : 200
},
{
"_id" : ObjectId("5c18dee35d85eb27052a598a"),
"userName" : "Bob",
"level" : 70,
"money" : 500
}
Then the query above will produce
{
"_id" : ObjectId("5c18df3e5d85eb27052a599c"),
"item" : "sword",
"userName" : "Jason"
},
{
"_id" : ObjectId("5c18df4f5d85eb27052a599e"),
"item" : "helmet",
"userName" : "Jason"
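}
The third item, which has no userId, is not returned: $lookup leaves its user array empty, and $unwind drops documents whose array is empty unless preserveNullAndEmptyArrays: true is set.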
NOTE: User names should usually be unique. Consider using them as the identity of the users collection. The items collection would then already contain the user name, giving you the desired result without any joins.
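To see how $lookup and $unwind treat an item that has no matching user, the two stages can be emulated in plain JavaScript on in-memory arrays. This is only an illustration of the semantics for array fields, not how MongoDB implements them; the function names lookup and unwind and the short string ids are made up for the example.

```javascript
// Plain JavaScript emulation of the $lookup + $unwind steps on in-memory
// arrays. This only illustrates the semantics; it is not how MongoDB
// implements them, and the function names are hypothetical.
function lookup(docs, foreignDocs, localField, foreignField, as) {
  return docs.map((doc) => ({
    ...doc,
    // A missing local field is compared as null, like in $lookup.
    [as]: foreignDocs.filter((f) => (f[foreignField] ?? null) === (doc[localField] ?? null))
  }));
}

function unwind(docs, field, preserveNullAndEmptyArrays = false) {
  return docs.flatMap((doc) => {
    const values = doc[field];
    if (!Array.isArray(values) || values.length === 0) {
      // Without the option, documents with an empty array are dropped.
      if (!preserveNullAndEmptyArrays) return [];
      const { [field]: _dropped, ...rest } = doc;
      return [rest];
    }
    // One output document per array element.
    return values.map((value) => ({ ...doc, [field]: value }));
  });
}

const users = [{ _id: "u1", userName: "Jason" }, { _id: "u2", userName: "Bob" }];
const items = [
  { _id: "i1", item: "sword", userId: "u1" },
  { _id: "i2", item: "helmet", userId: "u1" },
  { _id: "i3", item: "helmet" } // no userId
];

const joined = lookup(items, users, "userId", "_id", "user");
const strict = unwind(joined, "user");        // drops i3
const lenient = unwind(joined, "user", true); // keeps i3 without a user field
```

In a real pipeline the lenient behaviour corresponds to { $unwind: { path: "$user", preserveNullAndEmptyArrays: true } }.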
You can use the $lookup aggregation stage to join both collections, and then project only the fields from collection a that you are interested in:
db.b.aggregate([
{
$lookup: {
from: "a",
localField: "userId",
foreignField: "ObjectId",
as: "user"
}
},
{
$unwind: "$user"
},
{
$project: {
Id: 1,
userName: "$user.userName",
item: 1
}
}
]);
I assume a.ObjectId should in fact be called a._id and b.Id should be b._id? Either way the same principle applies.
EDIT: I had forgotten the $unwind stage. You need it because $lookup returns the joined field as an array (albeit with one element), so $unwind is required to get rid of the square brackets.
Problem 1
I have a collection named recipe in which all docs have an array field ingredients. I want to count those array items and write the count into a new field ingredient_count.
Problem 2
There is also a collection named ingredient. The docs have a count field which is the total number of uses in all recipes.
My Current Approach
My solution right now is a script that aggregates over the collection and updates all documents one by one:
// PROBLEM 1: update recipe documents
db.recipe.aggregate(
[
{
$project: {
numberOfIngredients: { $size: "$ingredients" }
}
}
]
).forEach(function(recipe) {
db.recipe.updateOne(
{ _id: recipe._id },
{ $set: { ingredient_count: recipe.numberOfIngredients } }
)
});
// PROBLEM 2: update ingredient documents
db.ingredient.find().snapshot().forEach(function(ingredient) {
db.ingredient.updateOne(
{ _id: ingredient._id },
{ $set: { count: db.recipe.count({ ingredients: { $in: [ingredient.name] } }) } }
)
});
This is terribly slow. Any idea how to do this more efficiently?
For both problems, it is possible to use only aggregations that write their output to new collections, which can then replace the existing ones:
Problem1
The aggregation contains one $project that counts the ingredients and lists the fields to keep:
db.recipe.aggregate([{
$project: {
ingredients: 1,
numberOfIngredients: { $size: "$ingredients" }
}
}, {
$out: "recipeNew"
}])
That gives you:
{ "_id" : ObjectId("58155bc09c924e717c5c4240"), "ingredients" : [......], "numberOfIngredients" : 5 }
{ "_id" : ObjectId("58155bc19c924e717c5c4241"), "ingredients" : [......], "numberOfIngredients" : 3 }
The result of the aggregation is written to a new collection recipeNew, which can replace the existing recipe collection.
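As an alternative to writing a new collection, MongoDB 4.2 and later accept an aggregation pipeline as the update document, so the count can be set in place with a single updateMany. This is a sketch under that version assumption, and it also assumes that ingredients, when present, is always an array.

```javascript
// Sketch assuming MongoDB 4.2+, where update commands accept an aggregation
// pipeline. The field name ingredient_count comes from the question.
const setIngredientCount = [
  {
    $set: {
      // $ifNull guards against recipes without an ingredients field.
      ingredient_count: { $size: { $ifNull: ["$ingredients", []] } }
    }
  }
];

// Runs only where a "db" handle exists (mongo shell / mongosh).
if (typeof db !== "undefined") {
  db.recipe.updateMany({}, setIngredientCount);
}
```

This keeps all other recipe fields untouched, so there is no need to list them in a $project.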
Problem2
The aggregation contains:
1 $unwind to flatten the ingredients array
1 $group to count the occurrences of each ingredient, grouped by ingredient _id
1 $lookup that joins the ingredients collection to retrieve the fields of each ingredient
1 $unwind to flatten the array of joined ingredient documents
1 $project to select the fields to keep
1 $out to write the result to a new collection
The query is:
db.recipe.aggregate([{
$unwind: "$ingredients"
}, {
$group: { _id: "$ingredients", IngredientsNumber: { $sum: 1 } }
}, {
$lookup: {
from: "ingredients",
localField: "_id",
foreignField: "_id",
as: "ingredientsDB"
}
}, {
$unwind: { path: "$ingredientsDB", preserveNullAndEmptyArrays: true }
}, {
$project: {
ingredientsNumber: "$IngredientsNumber",
name: "$ingredientsDB.name"
}
}, {
$out: "ingredientsTemp"
}])
That gives:
{ "_id" : ObjectId("5812caaeb4829937f4599b54"), "ingredientsNumber" : 2, "name" : "ingredients5" }
{ "_id" : ObjectId("5812caaeb4829937f4599b53"), "ingredientsNumber" : 1, "name" : "ingredients4" }
{ "_id" : ObjectId("5812caaeb4829937f4599b52"), "ingredientsNumber" : 2, "name" : "ingredients3" }
{ "_id" : ObjectId("5812caaeb4829937f4599b51"), "ingredientsNumber" : 1, "name" : "ingredients2" }
{ "_id" : ObjectId("5812caaeb4829937f4599b50"), "ingredientsNumber" : 2, "name" : "ingredients1" }
The cons of this solution:
It uses $project, so you need to specify the fields to keep.
You get a new ingredientsTemp collection containing only the ingredients that are actually used in recipes, so one additional aggregation with a $lookup is necessary to join the existing ingredients collection with it.
The following joins the existing ingredients collection with the one we have created:
db.ingredients.aggregate([{
$lookup: {
from: "ingredientsTemp",
localField: "_id",
foreignField: "_id",
as: "ingredientsDB"
}
}, {
$unwind: { path: "$ingredientsDB", preserveNullAndEmptyArrays: true }
}, {
$project: {
name: "$name",
ingredientsNumber: "$ingredientsDB.ingredientsNumber"
}
}])
Then you would have :
{ "_id" : ObjectId("5812caaeb4829937f4599b50"), "name" : "ingredients1", "ingredientsNumber" : 2 }
{ "_id" : ObjectId("5812caaeb4829937f4599b51"), "name" : "ingredients2", "ingredientsNumber" : 1 }
{ "_id" : ObjectId("5812caaeb4829937f4599b52"), "name" : "ingredients3", "ingredientsNumber" : 2 }
{ "_id" : ObjectId("5812caaeb4829937f4599b53"), "name" : "ingredients4", "ingredientsNumber" : 1 }
{ "_id" : ObjectId("5812caaeb4829937f4599b54"), "name" : "ingredients5", "ingredientsNumber" : 2 }
{ "_id" : ObjectId("5812caaeb4829937f4599b57"), "name" : "ingredients6" }
The pros:
It uses only aggregation, so it should be quicker.
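On MongoDB 4.2 or later, the temporary collection and the second join could possibly be avoided with a $merge stage that writes the counts straight into the existing ingredient documents. The sketch below makes the same assumption as the queries above, namely that recipe.ingredients holds the _id values of documents in a collection called ingredients; the output field name count is taken from the question.

```javascript
// Sketch assuming MongoDB 4.2+ ($merge) and, like the answer above, that the
// recipe.ingredients array holds the _id values of the ingredient documents.
// The target collection name "ingredients" follows the answer's queries.
const countIntoIngredients = [
  { $unwind: "$ingredients" },
  { $group: { _id: "$ingredients", count: { $sum: 1 } } },
  {
    $merge: {
      into: "ingredients",
      on: "_id",
      whenMatched: "merge",     // adds or overwrites the count field only
      whenNotMatched: "discard" // ignore ids that have no ingredient document
    }
  }
];

// Usage in the mongo shell:
// db.recipe.aggregate(countIntoIngredients);
```

Ingredients that are not used in any recipe are not touched by this pipeline, so a stale count on such documents would have to be reset separately.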
I have two collections: one is items and the second one is user_item_history. I want to fetch items with their status. The status of each item is stored in user_item_history, and the other details of the item are in the items collection. We have to filter the data for a particular user and item category; user_id and category are stored in the user_item_history collection.
user_item_history:
{
"_id" : NumberLong(25424),
"_class" : "com.samepinch.domain.registration.UserItemHistory",
"user_id" : NumberLong(25416),
"item_id" : NumberLong(26220),
"catagoryPreference" : "BOTH",
"preference" : 0.6546536707079772,
"catagory" : "FOOD",
"status" : 1,
"createdDate" : ISODate("2015-09-02T07:50:36.760Z"),
"updatedDate" : ISODate("2015-09-02T07:55:24.105Z")
}
items:
{
"_id" : NumberLong(26220),
"_class" : "com.samepinch.domain.item.Item",
"itemName" : "Shoes",
"categoryName" : "SHOPPING",
"attributes" : [
"WESTERN",
"CASUAL",
"ELEGANT",
"LATEST"
],
"isAccessed" : false,
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"catagoryPreference" : "FEMALE",
"startDate" : ISODate("2015-11-26T18:30:00Z"),
"endDate" : ISODate("2015-11-27T18:30:00Z"),
"location" : {
"coordinates" : [
77.24149558372778,
28.56973445677584
],
"type" : "Point",
"radius" : 2000
},
"createdDate" : ISODate("2015-11-16T10:49:11.858Z"),
"updatedDate" : ISODate("2015-11-16T10:49:11.858Z")
}
As the final result, I would like to have documents of this format:
{
item_id:26220,
status:1,
imageUrl: "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg"
}
Upgrade to MongoDB 3.2 or later and you'll be able to use the $lookup aggregation stage, which works similarly to a SQL left outer join.
One-to-many relationship
If there are many corresponding user_item_history documents for each items document, you can get a list of item statuses as an array.
Query
db.items.aggregate([
{
$lookup:
{
from: "user_item_history",
localField: "_id",
foreignField: "item_id",
as: "item_history"
}
},
{
$project:
{
item_id: 1,
status: "$item_history.status",
imageUrl: 1
}
}])
Example Output
{
"_id" : NumberLong(26220),
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"status" : [ 1 ]
},
{
"_id" : NumberLong(26233),
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"status" : [ 1, 2 ]
}
One-to-one relationship
If there's only one corresponding history document for every item, you can use the following approach to get the exact format you requested:
Query
db.items.aggregate([
{
$lookup:
{
from: "user_item_history",
localField: "_id",
foreignField: "item_id",
as: "item_history"
}
},
{
$unwind: "$item_history"
},
{
$project:
{
item_id: 1,
status: "$item_history.status",
imageUrl: 1
}
}])
Example Output
{
"_id" : NumberLong(26220),
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"status" : 1
}
Please bear in mind that every additional aggregation pipeline stage adds processing cost. So you may prefer the one-to-many query even if you have a one-to-one relationship.
Applying filtering
In your edit, you added this:
we have to filter data for particular user and category of item. so user_id and category is in user_item_history collection
To filter your results, you should add a $match step to your query:
db.items.aggregate([
{
$lookup:
{
from: "user_item_history",
localField: "_id",
foreignField: "item_id",
as: "item_history"
}
},
{
$unwind: "$item_history"
},
{
$match:
{
"item_history.user_id": NumberLong(25416),
"item_history.catagory": "FOOD"
}
},
{
$project:
{
item_id: 1,
status: "$item_history.status",
imageUrl: 1
}
}])
Please note that "category" is misspelled as "catagory" in your example data, so I also had to misspell it in the query above.
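Because the filter fields live in user_item_history, another option is to start the aggregation from that collection, filter first, and only then join the items. This is a sketch rather than the approach shown above; the helper name buildItemStatusPipeline is hypothetical, and the field names (including the misspelled catagory) are taken from the sample documents.

```javascript
// Hypothetical helper: filters the history first, then joins the items.
// Field names (including the misspelled "catagory") come from the sample data.
function buildItemStatusPipeline(userId, category) {
  return [
    // Reduce the input before the join; this stage can use an index.
    { $match: { user_id: userId, catagory: category } },
    { $lookup: { from: "items", localField: "item_id", foreignField: "_id", as: "item" } },
    // Each history document refers to one item, so unwind to a single object.
    { $unwind: "$item" },
    { $project: { _id: 0, item_id: 1, status: 1, imageUrl: "$item.imageUrl" } }
  ];
}

// Usage in the mongo shell:
// db.user_item_history.aggregate(buildItemStatusPipeline(NumberLong(25416), "FOOD"));
```

Putting $match first means only the history documents of one user and category take part in the join, and the $project produces the item_id, status and imageUrl format requested in the question.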