Mongodb:Right way to collect data from two collections? - mongodb

I have two collections: one is items and the second one is user_item_history. I want to fetch items with their status. Status of each item is stored in user_item_history, and other details of the item are in the items collection. we have to filter data for particular user and category of item. so user_id and category is in user_item_history collection.
user_item_history:
{
"_id" : NumberLong(25424),
"_class" : "com.samepinch.domain.registration.UserItemHistory",
"user_id" : NumberLong(25416),
"item_id" : NumberLong(26220),
"catagoryPreference" : "BOTH",
"preference" : 0.6546536707079772,
"catagory" : "FOOD",
"status" : 1,
"createdDate" : ISODate("2015-09-02T07:50:36.760Z"),
"updatedDate" : ISODate("2015-09-02T07:55:24.105Z")
}
items:
{
"_id" : NumberLong(26220),
"_class" : "com.samepinch.domain.item.Item",
"itemName" : "Shoes",
"categoryName" : "SHOPPING",
"attributes" : [
"WESTERN",
"CASUAL",
"ELEGANT",
"LATEST"
],
"isAccessed" : false,
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"catagoryPreference" : "FEMALE",
"startDate" : ISODate("2015-11-26T18:30:00Z"),
"endDate" : ISODate("2015-11-27T18:30:00Z"),
"location" : {
"coordinates" : [
77.24149558372778,
28.56973445677584
],
"type" : "Point",
"radius" : 2000
},
"createdDate" : ISODate("2015-11-16T10:49:11.858Z"),
"updatedDate" : ISODate("2015-11-16T10:49:11.858Z")
}
As the final result, I would like to have documents of this format:
{
item_id:26220,
status:1,
imageUrl: "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg"
}

Update to MongoDB 3.2 and you'll be able to use the $lookup aggregation stage, which works similarly to SQL joins.
One-to-many relationship
If there are many corresponding user_item_history documents for each items document, you can get a list of item statuses as an array.
Query
db.items.aggregate([
{
$lookup:
{
from: "user_item_history",
localField: "_id",
foreignField: "item_id",
as: "item_history"
}
},
{
$project:
{
item_id: 1,
status: "$item_history.status",
imageUrl: 1
}
}])
Example Output
{
"_id" : NumberLong(26220),
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"status" : [ 1 ]
},
{
"_id" : NumberLong(26233),
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"status" : [ 1, 2 ]
}
One-to-one relationship
If there's only one corresponding history document for every item, you can use the following approach to get the exact format you requested:
Query
db.items.aggregate([
{
$lookup:
{
from: "user_item_history",
localField: "_id",
foreignField: "item_id",
as: "item_history"
}
},
{
$unwind: "$item_history"
},
{
$project:
{
item_id: 1,
status: "$item_history.status",
imageUrl: 1
}
}])
Example Output
{
"_id" : NumberLong(26220),
"imageUrl" : "0bd2838e-9349-432a-a200-6e6b659e853eitemcompressed.jpg",
"status" : 1
}
Please bear in mind that with every additional aggregation pipeline stage you add, the performance deteriorates. So you may prefer the one-to-many query even if you have a one-to-one relationship.
Applying filtering
In your edit, you added this:
we have to filter data for particular user and category of item. so user_id and category is in user_item_history collection
To filter your results, you should add a $match step to your query:
db.items.aggregate([
{
$lookup:
{
from: "user_item_history",
localField: "_id",
foreignField: "item_id",
as: "item_history"
}
},
{
$unwind: "$item_history"
},
{
$match:
{
"item_history.user_id": NumberLong(25416),
"item_history.catagory": "FOOD"
}
},
{
$project:
{
item_id: 1,
status: "$item_history.status",
imageUrl: 1
}
}])
Please note that "category" is misspelled as "catagory" in your example data, so I also had to misspell it in the query above.

Related

Unable to aggregate two collections using lookup in MongoDB Atlas

I have an orders collection that looks like this:
{
"_id" : "wJNEiSYwBd5ozGtLX",
"orderId" : 52713,
"createdAt" : ISODate("2020-01-31T04:34:13.790Z"),
"status" : "closed",
"orders" : [
{
"_id" : "ziPzwLuZrz9MNkaRT",
"productId" : 10290,
"quantity" : 2
}
]
}
I have an products collection that looks like this
{
"_id" : "238cwwLkZa6gKNN86",
"productId" : 10290,
"title" : "Product Title",
"price" : 9.9
}
I am trying to merge the price information into the orders information.
Something like:
{
"_id" : "wJNEiSYwBd5ozGtLX",
"orderId" : 52713,
"createdAt" : ISODate("2020-01-31T04:34:13.790Z"),
"status" : "closed",
"orders" : [
{
"_id" : "ziPzwLuZrz9MNkaRT",
"productId" : 10290,
"quantity" : 2,
"price": 9.9
}
]
}
If I try a $lookup command on MongoDB Atlas Dashboard like this:
{
from: 'products',
localField: 'orders.productId',
foreignField: 'productId',
as: 'priceInfo'
}
The aggregated output is (not what I wanted):
{
"_id" : "wJNEiSYwBd5ozGtLX",
"orderId" : 52713,
"createdAt" : ISODate("2020-01-31T04:34:13.790Z"),
"status" : "closed",
"orders" : [
{
"_id" : "ziPzwLuZrz9MNkaRT",
"productId" : 10290,
}
],
"priceInfo": [
{
"_id" : "238cwwLkZa6gKNN86",
"productId" : 10290,
"title" : "Product Title",
"price" : 9.9
}
]
}
I do not need a separate priceInfo array. It will be best if I have the product details information merged into the "orders" array. What should be the aggregation lookup syntax to achieve the desired output?
Demo - https://mongoplayground.net/p/bLqcN7tauWU
Read - $lookup $unwind $first $set $push $group
db.orders.aggregate([
{ $unwind: "$orders" }, // break array of orders into individual documents
{
$lookup: { // join
"from": "products",
"localField": "orders.productId",
"foreignField": "productId",
"as": "products"
}
},
{
$set: {
"orders.price": { "$arrayElemAt": [ "$products.price", 0 ] } // set the price
}
},
{
$group: { // group records back
_id: "$_id",
createdAt: { $first: "$createdAt" },
status: { $first: "$status" },
orderId: { $first: "$orderId" },
orders: { $push: "$orders" }
}
}
])

How can I find the following output using a mongo query

How can I find the following output using a mongo query
{
"_id" : ObjectId("5ec387342971ea13815924e0"),
"fullName" : "pqr",
"pendingRequest" : [],
},
{
"_id" : ObjectId("5ec387552971ea13815924e1"),
"fullName" : "xyz",
"pendingRequest" : [],
},
{
"_id" : ObjectId("5ec3876f2971ea13815924e2"),
"fullName" : "abc",
"pendingRequest" : [
{
"_id" : "5ec387342971ea13815924e0"
},
{
"_id" : "5ec387552971ea13815924e1"
}
],
}
This is my data looks like.I want to retrieve "fullName" whose "_id" is in the "pendingRequest" array of "abc" user.I have tried using "$match", "$lookup" stages but didn't get expected output.
To make use of lookup you need to store the "_id" in pending request as ObjectId's.
Try following code it will work.
db.collectionName.aggregate([
{
$match: {fullName: "abc"}
},
{
$unwind: "$pendingRequest"
},
{
$lookup: {from: "so", localField: "pendingRequest._id", foreignField: "_id", as: "a"}
}
])
Also, one more suggestion, it would be better to use pendingRequest as array instead of object array.
pendingRequest: [ObjectId("5ec387342971ea13815924e0"),ObjectId("5ec387552971ea13815924e1")]

How to use join in finding data in MongoDB database?

how to join schema in mongodb?
Example :
a collection
{
ObjectId : ObjectId("ABCDE")
userName : "Jason",
level : 30,
money : 200
}
b collection
{
Id : ObjectId("AAACC"),
userId : ObjectId("ABCDE"),
item : "sword"
}
b.aggregate....
i want result is
id : ObjectId("AAACC"), userName : "Jason", item : "sword"
You should use the aggregation pipeline to join two collections and select the required data. I assume here that you have proper identity fields named _id instead of ObjectId and Id as in your sample:
db.items.aggregate([
{
$lookup:
{
from: "users",
localField: "userId",
foreignField: "_id", // ObjectId in your sample
as: "user"
}
},
{ $unwind: "$user" },
{
$project:
{
"item": 1,
"userName": "$user.userName"
// Id: 1 if you will use your names, because only _id is selected by default
}
}
])
The first step is lookup which joins items and users collections on userId field equals _id field in users collection.
Then you should unwind results, because lookup puts all matched users into user field as an array of user documents.
And last step - project result documents to the desired format.
Now sample. If you have following documents in items collection:
{
"_id" : ObjectId("5c18df3e5d85eb27052a599c"),
"item" : "sword",
"userId" : ObjectId("5c18ded45d85eb27052a5988")
},
{
"_id" : ObjectId("5c18df4f5d85eb27052a599e"),
"item" : "helmet",
"userId" : ObjectId("5c18ded45d85eb27052a5988")
},
{
"_id" : ObjectId("5c18e2da5d85eb27052a59ee"),
"item" : "helmet"
}
And you have two users:
{
"_id" : ObjectId("5c18ded45d85eb27052a5988"),
"userName" : "Jason",
"level" : 30,
"money" : 200
},
{
"_id" : ObjectId("5c18dee35d85eb27052a598a"),
"userName" : "Bob",
"level" : 70,
"money" : 500
}
Then the query above will produce
{
"_id" : ObjectId("5c18df3e5d85eb27052a599c"),
"item" : "sword",
"userName" : "Jason"
},
{
"_id" : ObjectId("5c18df4f5d85eb27052a599e"),
"item" : "helmet",
"userName" : "Jason"
},
{
"_id" : ObjectId("5c18e2da5d85eb27052a59ee"),
"item" : "helmet"
}
NOTE: Usually user names should be unique. Consider to use them as identity for users collection. That will also give you desired result in items collection without any joins.
You can use the lookup aggregation operator to join both collections, and then project only the field from collection a that you are interested in:
db.b.aggregate([
{
$lookup: {
from: "a",
localField: "userId",
foreignField: "ObjectId",
as: "user"
}
},
{
$unwind: "$user"
},
{
$project: {
Id: 1
userName: "$user.userName",
item: 1
}
}
]);
I assume a.ObjectId should in fact be called a._id and b.Id should be b._id? Either way the same principle applies.
EDIT: had forgotten the unwind stage. You need this since your lookup will return the new joined field as an array (albeit with one element), so you need this to get rid of the square brackets.

How to Lookup Data matching an Object property in an Array [duplicate]

What's the syntax for doing a $lookup on a field that is an array of ObjectIds rather than just a single ObjectId?
Example Order Document:
{
_id: ObjectId("..."),
products: [
ObjectId("..<Car ObjectId>.."),
ObjectId("..<Bike ObjectId>..")
]
}
Not Working Query:
db.orders.aggregate([
{
$lookup:
{
from: "products",
localField: "products",
foreignField: "_id",
as: "productObjects"
}
}
])
Desired Result
{
_id: ObjectId("..."),
products: [
ObjectId("..<Car ObjectId>.."),
ObjectId("..<Bike ObjectId>..")
],
productObjects: [
{<Car Object>},
{<Bike Object>}
],
}
2017 update
$lookup can now directly use an array as the local field. $unwind is no longer needed.
Old answer
The $lookup aggregation pipeline stage will not work directly with an array. The main intent of the design is for a "left join" as a "one to many" type of join ( or really a "lookup" ) on the possible related data. But the value is intended to be singular and not an array.
Therefore you must "de-normalise" the content first prior to performing the $lookup operation in order for this to work. And that means using $unwind:
db.orders.aggregate([
// Unwind the source
{ "$unwind": "$products" },
// Do the lookup matching
{ "$lookup": {
"from": "products",
"localField": "products",
"foreignField": "_id",
"as": "productObjects"
}},
// Unwind the result arrays ( likely one or none )
{ "$unwind": "$productObjects" },
// Group back to arrays
{ "$group": {
"_id": "$_id",
"products": { "$push": "$products" },
"productObjects": { "$push": "$productObjects" }
}}
])
After $lookup matches each array member the result is an array itself, so you $unwind again and $group to $push new arrays for the final result.
Note that any "left join" matches that are not found will create an empty array for the "productObjects" on the given product and thus negate the document for the "product" element when the second $unwind is called.
Though a direct application to an array would be nice, it's just how this currently works by matching a singular value to a possible many.
As $lookup is basically very new, it currently works as would be familiar to those who are familiar with mongoose as a "poor mans version" of the .populate() method offered there. The difference being that $lookup offers "server side" processing of the "join" as opposed to on the client and that some of the "maturity" in $lookup is currently lacking from what .populate() offers ( such as interpolating the lookup directly on an array ).
This is actually an assigned issue for improvement SERVER-22881, so with some luck this would hit the next release or one soon after.
As a design principle, your current structure is neither good or bad, but just subject to overheads when creating any "join". As such, the basic standing principle of MongoDB in inception applies, where if you "can" live with the data "pre-joined" in the one collection, then it is best to do so.
The one other thing that can be said of $lookup as a general principle, is that the intent of the "join" here is to work the other way around than shown here. So rather than keeping the "related ids" of the other documents within the "parent" document, the general principle that works best is where the "related documents" contain a reference to the "parent".
So $lookup can be said to "work best" with a "relation design" that is the reverse of how something like mongoose .populate() performs it's client side joins. By idendifying the "one" within each "many" instead, then you just pull in the related items without needing to $unwind the array first.
Starting with MongoDB v3.4 (released in 2016), the $lookup aggregation pipeline stage can also work directly with an array. There is no need for $unwind any more.
This was tracked in SERVER-22881.
You can also use the pipeline stage to perform checks on a sub-docunment array
Here's the example using python (sorry I'm snake people).
db.products.aggregate([
{ '$lookup': {
'from': 'products',
'let': { 'pid': '$products' },
'pipeline': [
{ '$match': { '$expr': { '$in': ['$_id', '$$pid'] } } }
// Add additional stages here
],
'as':'productObjects'
}
])
The catch here is to match all objects in the ObjectId array (foreign _id that is in local field/prop products).
You can also clean up or project the foreign records with additional stages, as indicated by the comment above.
use $unwind you will get the first object instead of array of objects
query:
db.getCollection('vehicles').aggregate([
{
$match: {
status: "AVAILABLE",
vehicleTypeId: {
$in: Array.from(newSet(d.vehicleTypeIds))
}
}
},
{
$lookup: {
from: "servicelocations",
localField: "locationId",
foreignField: "serviceLocationId",
as: "locations"
}
},
{
$unwind: "$locations"
}
]);
result:
{
"_id" : ObjectId("59c3983a647101ec58ddcf90"),
"vehicleId" : "45680",
"regionId" : 1.0,
"vehicleTypeId" : "10TONBOX",
"locationId" : "100",
"description" : "Isuzu/2003-10 Ton/Box",
"deviceId" : "",
"earliestStart" : 36000.0,
"latestArrival" : 54000.0,
"status" : "AVAILABLE",
"accountId" : 1.0,
"locations" : {
"_id" : ObjectId("59c3afeab7799c90ebb3291f"),
"serviceLocationId" : "100",
"regionId" : 1.0,
"zoneId" : "DXBZONE1",
"description" : "Masafi Park Al Quoz",
"locationPriority" : 1.0,
"accountTypeId" : 0.0,
"locationType" : "DEPOT",
"location" : {
"makani" : "",
"lat" : 25.123091,
"lng" : 55.21082
},
"deliveryDays" : "MTWRFSU",
"timeWindow" : {
"timeWindowTypeId" : "1"
},
"address1" : "",
"address2" : "",
"phone" : "",
"city" : "",
"county" : "",
"state" : "",
"country" : "",
"zipcode" : "",
"imageUrl" : "",
"contact" : {
"name" : "",
"email" : ""
},
"status" : "",
"createdBy" : "",
"updatedBy" : "",
"updateDate" : "",
"accountId" : 1.0,
"serviceTimeTypeId" : "1"
}
}
{
"_id" : ObjectId("59c3983a647101ec58ddcf91"),
"vehicleId" : "81765",
"regionId" : 1.0,
"vehicleTypeId" : "10TONBOX",
"locationId" : "100",
"description" : "Hino/2004-10 Ton/Box",
"deviceId" : "",
"earliestStart" : 36000.0,
"latestArrival" : 54000.0,
"status" : "AVAILABLE",
"accountId" : 1.0,
"locations" : {
"_id" : ObjectId("59c3afeab7799c90ebb3291f"),
"serviceLocationId" : "100",
"regionId" : 1.0,
"zoneId" : "DXBZONE1",
"description" : "Masafi Park Al Quoz",
"locationPriority" : 1.0,
"accountTypeId" : 0.0,
"locationType" : "DEPOT",
"location" : {
"makani" : "",
"lat" : 25.123091,
"lng" : 55.21082
},
"deliveryDays" : "MTWRFSU",
"timeWindow" : {
"timeWindowTypeId" : "1"
},
"address1" : "",
"address2" : "",
"phone" : "",
"city" : "",
"county" : "",
"state" : "",
"country" : "",
"zipcode" : "",
"imageUrl" : "",
"contact" : {
"name" : "",
"email" : ""
},
"status" : "",
"createdBy" : "",
"updatedBy" : "",
"updateDate" : "",
"accountId" : 1.0,
"serviceTimeTypeId" : "1"
}
}
I have to disagree, we can make $lookup work with IDs array if we preface it with $match stage.
// replace IDs array with lookup results
db.products.aggregate([
{ $match: { products : { $exists: true } } },
{
$lookup: {
from: "products",
localField: "products",
foreignField: "_id",
as: "productObjects"
}
}
])
It becomes more complicated if we want to pass the lookup result to a pipeline. But then again there's a way to do so (already suggested by #user12164):
// replace IDs array with lookup results passed to pipeline
db.products.aggregate([
{ $match: { products : { $exists: true } } },
{
$lookup: {
from: "products",
let: { products: "$products"},
pipeline: [
{ $match: { $expr: {$in: ["$_id", "$$products"] } } },
{ $project: {_id: 0} } // suppress _id
],
as: "productObjects"
}
}
])
Aggregating with $lookup and subsequent $group is pretty cumbersome, so if (and that's a medium if) you're using node & Mongoose or a supporting library with some hints in the schema, you could use a .populate() to fetch those documents:
var mongoose = require("mongoose"),
Schema = mongoose.Schema;
var productSchema = Schema({ ... });
var orderSchema = Schema({
_id : Number,
products: [ { type: Schema.Types.ObjectId, ref: "Product" } ]
});
var Product = mongoose.model("Product", productSchema);
var Order = mongoose.model("Order", orderSchema);
...
Order
.find(...)
.populate("products")
...
Lookup is basically left join operation in relational database terms.
So left document(table- in relation database terms) is the document where we are working currently and right document is the document from where we have to extract information.
Lookup->
{
$lookup:
{
from: <right document>,
localField: < field in left document which holds the info for right document >,
foreignField: <field in the right document which is referenced in left document >,
as: <alias for array list where the result is to be stored.>
}
}
So the right answer to above question is ->
{
"$lookup":
{
"from": "products",
"localField": "products",
"foreignField": "_id",
"as": "productObject",
}
}
But it wasn't working for me , I debugged and found the below stated issue: So, In my Case what I did wrong while making insert in left document was :
Wrong insert ->
product document -->
{
"_id": ObjectId(something),
"products":[ "some_id_1", "some_id_2"]
}
Right insert would be ->
Product Document
{
"_id": ObjectId(something),
"products" : [ ObjectId("some_id_1"), ObjectId("some_id_2")]
}
For me , I was storing the object id as string , not as in ObjectId object in the foreign key in left document.
Pleasure ensure the right format while making an insert .
And in the end we all should learn from some one's mistake.

Mongodb Aggregate calculate average and add it to the document

I have websites which contains 2 documents:
{
"_id" : ObjectId("58503934034b512b419a6eab"),
"website" : "https://www.stackoverflow.com",
"name" : "Stack Exchange",
"keywords" : [
"helping",
"C#",
"PYTHON"
]
}
{
"_id" : ObjectId("58503934034b512b419a6eab"),
"website" : "https://www.google.com.com",
"name" : "Stack Exchange",
"keywords" : [
"search",
"engine",
]
}
I also have another seo_tracking which contains:
{
"_id" : ObjectId("587373d6f6325811c8a0b3ad"),
"position" : "2",
"real_url" : "https://www.stackoverflow.com",
"created_at" : ISODate("2017-01-09T11:28:22.104Z"),
"keyword" : "helping"
},
{
"_id" : ObjectId("587373d6f6325811c8a0b3ad"),
"position" : "4",
"real_url" : "https://www.stackoverflow.com",
"created_at" : ISODate("2017-01-09T11:28:22.104Z"),
"keyword" : "C#"
}
etc.. This contains around 100+ documents
What I want to do is is aggregate the seo_tracking with website on the specific URL (www.stackexchange (in websites) would match www.stackoverflow.com in (seo_tracking)) which I can do fine. However, I would like to return for each of the websites the following:
{
"_id" : ObjectId("587373d6f6325811c8a0b3ad"),
"website":"https://www.stackoverflow.com",
"avg_position" : "2"
}
Then for Google etc.. Even if the avg_position is 0 .. I have tried the following:
db.seo_tracking.aggregate([
{
$lookup:
{
from: "websites",
localField: "real_url",
foreignField: "website",
as: "post_websites"
},
},
{
"$group": {
_id:null,
avg_position:{$avg:"$position"}
}
}
])
However, this just produces:
{
"_id" : null,
"avg_position" : 2.0
}
What I need to do is have website and ideally also need the ID
Any ideas to where I'm going wrong here?
You can try something like this. You'll need to $unwind to access the fields from joined collection and change your grouping key to use the _id from joined collection to get average for each website:
db.seo_tracking.aggregate([{
$lookup: {
from: "website",
localField: "real_url",
foreignField: "website",
as: "post_websites"
},
}, {
$unwind: "$post_websites"
}, {
"$group": {
_id: "$post_websites._id",
avg_position: {
$avg: "$position"
},
website: {
$first: "$real_url"
}
}
}])