Does $lookup use indexes on the foreignField key? - mongodb

In the example below, if the collection inventory has an index on the sku field, will it be used in this $lookup operation?
db.orders.insertMany( [
{ "_id" : 1, "item" : "almonds", "price" : 12, "quantity" : 2 },
{ "_id" : 2, "item" : "pecans", "price" : 20, "quantity" : 1 },
{ "_id" : 3 }
] )
db.inventory.insertMany( [
{ "_id" : 1, "sku" : "almonds", "description": "product 1", "instock" : 120 },
{ "_id" : 2, "sku" : "bread", "description": "product 2", "instock" : 80 },
{ "_id" : 3, "sku" : "cashews", "description": "product 3", "instock" : 60 },
{ "_id" : 4, "sku" : "pecans", "description": "product 4", "instock" : 70 },
{ "_id" : 5, "sku": null, "description": "Incomplete" },
{ "_id" : 6 }
] )
db.orders.aggregate( [
{
$lookup:
{
from: "inventory",
localField: "item",
foreignField: "sku",
as: "inventory_docs"
}
}
] )
EDITED:
It does not. Why not?
{
"explainVersion" : "1",
"stages" : [
{
"$cursor" : {
"queryPlanner" : {
"namespace" : "6303c64faf8ef53d8ba2062f_y22_test2.orders",
"indexFilterSet" : false,
"parsedQuery" : {
},
"queryHash" : "8B3D4AB8",
"planCacheKey" : "D542626C",
"maxIndexedOrSolutionsReached" : false,
"maxIndexedAndSolutionsReached" : false,
"maxScansToExplodeReached" : false,
"winningPlan" : {
"stage" : "COLLSCAN",
"direction" : "forward"
},
"rejectedPlans" : [
]
}
}
},
{
"$lookup" : {
"from" : "inventory",
"as" : "inventory_docs",
"localField" : "item",
"foreignField" : "sku"
}
}
],

In the case of simple lookups (e.g., when specifying a localField + foreignField), an index on the foreignField will be used.
Things are sadly more complicated when using $lookup with a pipeline. For $expr comparisons in the pipeline's $match stage, the following limitations apply:
Multikey indexes are not used.
Indexes are not used for comparisons where the operand is an array or the operand type is undefined.
Indexes are not used for comparisons with more than one field path operand.
https://www.mongodb.com/docs/manual/reference/operator/aggregation/lookup/
It is really annoying that the explain() call doesn't provide any information on index usage of lookup stages. The best way I've found to determine whether an index was used is to (separately) run the $indexStats aggregation on the collection being looked up, in the above case:
db.inventory.aggregate([{$indexStats: {}}])
Then find the index you think is being used and watch the accesses.ops field.
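To make "watch the accesses.ops field" a bit more concrete, here is a minimal sketch that compares two $indexStats snapshots taken before and after running the $lookup. The helper name indexOpsDelta and the index name sku_1 are my own assumptions for illustration, not something from the question; the snapshots are assumed to be the arrays returned by the $indexStats aggregation.

```javascript
// Hypothetical helper (not part of MongoDB): given two $indexStats snapshots,
// return how many operations each index served between them.
function indexOpsDelta(before, after) {
  // accesses.ops may arrive as a 64-bit integer wrapper in the shell,
  // so it is coerced with Number() here (an assumption for this sketch).
  const previous = new Map(
    before.map((stat) => [stat.name, Number(stat.accesses.ops)])
  );
  const delta = {};
  for (const stat of after) {
    delta[stat.name] = Number(stat.accesses.ops) - (previous.get(stat.name) ?? 0);
  }
  return delta;
}

// Rough usage in the shell (assumed, shown as comments so the sketch runs anywhere):
//   const before = db.inventory.aggregate([{ $indexStats: {} }]).toArray();
//   db.orders.aggregate([ /* the $lookup pipeline from the question */ ]).toArray();
//   const after = db.inventory.aggregate([{ $indexStats: {} }]).toArray();
//   printjson(indexOpsDelta(before, after));

// Stand-in data with the same shape as $indexStats output.
const sampleBefore = [
  { name: "_id_", accesses: { ops: 3 } },
  { name: "sku_1", accesses: { ops: 10 } },
];
const sampleAfter = [
  { name: "_id_", accesses: { ops: 3 } },
  { name: "sku_1", accesses: { ops: 12 } },
];
console.log(indexOpsDelta(sampleBefore, sampleAfter));
```

A non-zero delta for the index on sku suggests the lookup used it, provided nothing else queried the collection in between.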

Related

MongoDB unable to lookup docs based on variable parent document property

I want to find products and, for each product, attach deals to it. A deal is a product from the same collection that shares some common properties.
So the pipeline should return documents and, for each document, find other products that are not the current one but have an equal detail.duration. Even though I have many docs with the same duration, deals is always []. Could you please help me figure out the issue with my pipeline?
Following is the aggregation pipeline I'm running:
I've added the _id $in filter just for clarity, based on the documents shown below. It isn't part of the real pipeline's $match query.
db.products
.aggregate([
{
$match: {
_id: {
$in: [
ObjectId("6210fa8746bee3fcbd0ad062"),
ObjectId("6210fa7c46bee3fcbd0acc21"),
],
},
"detail.duration": { $gt: 0 },
},
},
{
$lookup: {
from: "products",
let: { id: "$_id", duration: "$detail.duration" },
as: "deals",
pipeline: [
{
$match: {
_id: { $ne: "$id" },
"detail.duration": "$duration",
},
},
{ $project: { detail: 1 } },
{ $limit: 1 },
],
},
},
{ $limit: 2 },
{ $project: { deals: 1 } },
])
.pretty();
This was the result:
{ "_id" : ObjectId("6210fa7c46bee3fcbd0acc21"), "deals" : [ ] }
{ "_id" : ObjectId("6210fa8746bee3fcbd0ad062"), "deals" : [ ] }
Following are two example documents in the collection:
{
"_id" : ObjectId("6210fa8746bee3fcbd0ad062"),
"book" : "https://wegotrip.com/en/paris-d3/muse-d-orsay-and-musee-de-l-orangerie-combined-tour-ticket-p1117/?SUB_ID=336264",
"address" : "Rue de Lille, 62bis",
"countryName" : "France",
"cityName" : "Paris",
"location" : {
"lang" : 48.859886,
"lat" : 2.3254821,
"country" : ObjectId("6210fa7746bee3fcbd0aca20"),
"city" : ObjectId("6210fa7746bee3fcbd0aca1c"),
"location" : "Rue de Lille, 62bis",
"_id" : ObjectId("6210fa8746bee3fcbd0ad063")
},
"includes" : [
{
"value" : "Skip-the-line ticket to Orsay Museum",
"included" : true
},
{
"value" : "Skip-the-line ticket to the Musée de l'Orangerie",
"included" : true
},
{
"value" : "Detailed description of the Nymphéas from Claude Monet",
"included" : true
},
{
"value" : "Interesting stories of many great artists and their lives",
"included" : true
},
{
"value" : "An easy walkthrough of the Musée d'Orsay and the Musée de l'Orangerie and their great collection",
"included" : true
},
{
"value" : "Headphones — you should bring your own",
"included" : false
}
],
"price" : {
"priceConcession" : null,
"priceChild" : null,
"price" : 57,
"currency" : ObjectId("6210fa7746bee3fcbd0aca2f"),
"_id" : ObjectId("6210fa8746bee3fcbd0ad064")
},
"detail" : {
"isPass" : false,
"features" : [
{
"key" : "audio_guide",
"value" : "Audio Guide"
}
],
"highlights" : [
"Admire the masterpieces by Monet, Renoir, Degas, Cézanne, and many more",
"Discover one of the finest collections of Impressionist art in the world",
"Visit the Nymphéas by Monet, one of the greatest pieces of Impressionism",
"Explore the Guillaume and Walter collection and find out what makes it unique"
],
"details" : [ ],
"images" : [
{
"id" : 7270,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/dsc04800/01d0770dcc0cac4c6de0f6eae70742f6.jpg",
"full" : "https://app.wegotrip.com/media/store/1117/dsc04800.jpg"
},
{
"id" : 7269,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/nympheasannees30salle1parisiennephotorogerviolet/e1270aef1c01391290df71d1f83c8abc.jpg",
"full" : "https://app.wegotrip.com/media/store/1117/nympheasannees30salle1parisiennephotorogerviolet.jpg"
},
{
"id" : 7268,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/ob1f7c80dsc02414-large/7712cb29e133ee3acb4b2bffbc2ac654.jpg",
"full" : "https://app.wegotrip.com/media/store/1117/ob1f7c80dsc02414-large.jpg"
},
{
"id" : 7267,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/tuileriesgardensb16dsc00678talrg/47430ab8a257e3ccd2337d7a0d750c57.jpg",
"full" : "https://app.wegotrip.com/media/store/1117/tuileriesgardensb16dsc00678talrg.jpg"
},
{
"id" : 7266,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/009/54223ef27aac5cd94fe5c20893abf2de.jpg",
"full" : "https://app.wegotrip.com/media/store/1117/009.jpg"
},
{
"id" : 7264,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/monet-morning-with-weeping-willow/09bf842cc9a9d7eade8d0739f704699f.jpg",
"full" : "https://app.wegotrip.com/media/store/1117/monet-morning-with-weeping-willow.jpg"
}
],
"duration" : 2,
"_id" : ObjectId("6210fa8746bee3fcbd0ad065")
},
"availability" : null,
"subcategory" : [
{
"id" : 6,
"title" : "Sightseeing Tickets & Passes",
"slug" : "sightseeing-tickets-passes"
}
],
"category" : [
{
"id" : 6,
"title" : "Sightseeing Tickets & Passes",
"slug" : "sightseeing-tickets-passes"
}
],
"type" : "Audio Guide",
"description" : "Visit the famous Musee d'Orsay and Musée de l'Orangerie in Paris with this combined self-guided tour! \r\n\r\nNavigate through the maze of exhibition rooms with mobile app and see a collection of works by the Impressionists and Expressionists – Seurat, Cezanne, Gaugin, Monet, Renoir, Manet, Van Gogh, Degas; sculptors like Rodin, Pompon and others. Check out a mini-version of the Statue of Liberty! \r\n\r\nExplore the Nymphéas paintings by Claude Monet, that is called \"the Sistine chapel of Impressionism\". Admire the great works of Picasso, Soutine, Rousseau, Matisse and many others part of the Paul Guillaume and Jean Walter collection. Learn about the style and private life of the artists.\r\n\r\nThe audio-guide will provide you with all the information on the cultural significance of these paintings. Walking through rooms you will understand how revolutionary for those times Manet’s, Cezanne’s and Degas’ creation really was casting doubts on conservative, academic conceptions of 'true art' and offering new techniques and ideas.",
"thumbnail" : "https://app.wegotrip.com/media/CACHE/images/store/1117/013/c0b8cce52cb61ab1f30872e6e93385b4.jpg",
"name" : "Musée d'Orsay/Musée de l'Orangerie Combined Admission Ticket & Audio Tour",
"attractionDescription" : "",
"attractionName" : "Musée d'Orsay & Musée de l'Orangerie",
"attraction" : ObjectId("6210fa8746bee3fcbd0ad056"),
"provider" : {
"rating" : {
"count" : 0,
"average" : null,
"_id" : ObjectId("6210fa8746bee3fcbd0ad067")
},
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/1117/013/c0b8cce52cb61ab1f30872e6e93385b4.jpg",
"slug" : "muse-d-orsay-and-musee-de-l-orangerie-combined-tour-ticket",
"id" : "1117",
"key" : "1",
"_id" : ObjectId("6210fa8746bee3fcbd0ad066")
},
"__v" : 0
}
{
"_id" : ObjectId("6210fa7c46bee3fcbd0acc21"),
"book" : "https://wegotrip.com/en/barcelona-d1/the-dali-museum-in-figueres-p3/?SUB_ID=336264",
"address" : "Pujada del Castell, 43",
"countryName" : "Spain",
"cityName" : "Barcelona",
"location" : {
"lang" : 42.26829425831263,
"lat" : 2.95884132385254,
"country" : ObjectId("6210fa7746bee3fcbd0aca3e"),
"city" : ObjectId("6210fa7746bee3fcbd0aca3a"),
"location" : "Pujada del Castell, 43",
"_id" : ObjectId("6210fa7c46bee3fcbd0acc22")
},
"includes" : [
{
"value" : "Recommendations of places to visit to understand the life of Dali better",
"included" : true
},
{
"value" : "Skip-the-line ticket to Dali Theatre-Museum",
"included" : true
},
{
"value" : "Headphones — you should bring your own",
"included" : false
}
],
"price" : {
"priceConcession" : null,
"priceChild" : null,
"price" : 33,
"currency" : ObjectId("6210fa7746bee3fcbd0aca2f"),
"_id" : ObjectId("6210fa7c46bee3fcbd0acc23")
},
"detail" : {
"isPass" : false,
"features" : [
{
"key" : "audio_guide",
"value" : "Audio Guide"
}
],
"highlights" : [
"Discover Dali's surrealism starting with the building of the museum — it's definitely one of a kind",
"Inside the museum you'll find the most famous and controversial works of the artist",
"Our tour will provide you with insights and exiting facts about Dali's works"
],
"details" : [ ],
"images" : [
{
"id" : 6916,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/figueres-oleguer2/032b55c27bb2cd119bdc7fe6c4b86491.jpeg",
"full" : "https://app.wegotrip.com/media/store/3/figueres-oleguer2.jpeg"
},
{
"id" : 6915,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/sky-monument-statue-golden-museum-yellow-1156442-pxherecom/28c645449a9f45ec1e8ede7b7ffbe30f.jpg",
"full" : "https://app.wegotrip.com/media/store/3/sky-monument-statue-golden-museum-yellow-1156442-pxherecom.jpg"
},
{
"id" : 6914,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/architecture-window-museum-landmark-surrealism-catalonia-800928-pxherecom/43691ba6aecc2ee084c300c150e32a03.jpg",
"full" : "https://app.wegotrip.com/media/store/3/architecture-window-museum-landmark-surrealism-catalonia-800928-pxherecom.jpg"
},
{
"id" : 831,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/figueres-oleguers3k6yoz/b9c3093c79cf50e621e022706af59ad6.jpg",
"full" : "https://app.wegotrip.com/media/store/3/figueres-oleguers3k6yoz.jpg"
},
{
"id" : 832,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/shutterstock82210018/2a2450d4f75edf4549d36f2286b6f19b.jpg",
"full" : "https://app.wegotrip.com/media/store/3/shutterstock82210018.jpg"
},
{
"id" : 833,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/dali-museum-8983261920/aa0d93e475c7b7388bee88ff14f8d795.jpg",
"full" : "https://app.wegotrip.com/media/store/3/dali-museum-8983261920.jpg"
},
{
"id" : 834,
"description" : "",
"cover" : false,
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/3/shutterstock196896461/74fc427d0a27f0aa199ed24f4c51bcc5.jpg",
"full" : "https://app.wegotrip.com/media/store/3/shutterstock196896461.jpg"
}
],
"duration" : 2,
"_id" : ObjectId("6210fa7c46bee3fcbd0acc24")
},
"availability" : null,
"subcategory" : [
{
"id" : 3,
"title" : "Theme Tours",
"slug" : "theme-tours"
},
{
"id" : 1,
"title" : "Culture & History",
"slug" : "culture-and-history"
},
{
"id" : 6,
"title" : "Sightseeing Tickets & Passes",
"slug" : "sightseeing-tickets-passes"
}
],
"category" : [
{
"id" : 3,
"title" : "Theme Tours",
"slug" : "theme-tours"
},
{
"id" : 1,
"title" : "Culture & History",
"slug" : "culture-and-history"
},
{
"id" : 6,
"title" : "Sightseeing Tickets & Passes",
"slug" : "sightseeing-tickets-passes"
}
],
"type" : "Audio Guide",
"description" : "The Dalí Theatre and Museum is a museum of the artist Salvador Dalí in his home town of Figueres, in Catalonia, Spain. Dalí is buried there in a crypt below the stage. \r\n\r\nImmerse yourself in an exciting journey through the world of the genius of surrealism. Reveal the meaning of his ambiguous creations and learn the history of the artist's life. Enjoy the unique world of Dali in this excursion.",
"thumbnail" : "https://app.wegotrip.com/media/CACHE/images/store/001_Ispaniya_Figeras_Teatr-01/783c3a10c34eb40c29f14f704cd9c8d1.jpeg",
"name" : "The Dali Theatre-Museum: Skip-the-Line & Audio Tour",
"attractionDescription" : "",
"attractionName" : "Dali Theatre and Museum",
"attraction" : ObjectId("6210fa7c46bee3fcbd0acc15"),
"provider" : {
"rating" : {
"count" : 0,
"average" : null,
"_id" : ObjectId("6210fa7c46bee3fcbd0acc26")
},
"preview" : "https://app.wegotrip.com/media/CACHE/images/store/001_Ispaniya_Figeras_Teatr-01/783c3a10c34eb40c29f14f704cd9c8d1.jpeg",
"slug" : "the-dali-museum-in-figueres",
"id" : "3",
"key" : "1",
"_id" : ObjectId("6210fa7c46bee3fcbd0acc25")
},
"__v" : 0
}
Both of the above have detail.duration set to 2, so as per the query each should be found as a deal of the other in the result docs, but the query returns deals: [], an empty array. I'm unable to figure out the problem.
From $match (Restrictions)
The $match query syntax is identical to the read operation query syntax; i.e. $match does not accept raw aggregation expressions. To include aggregation expression in $match, use a $expr query expression.
And you need to use $$ to get the variable value.
let
To reference variables in pipeline stages, use the "$$" syntax.
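Change the $match stage in the pipeline as follows. Both comparisons go inside $expr, because outside $expr "$$id" is treated as a literal string rather than as the variable:
{
$match: {
$expr: {
$and: [
{ $ne: ["$_id", "$$id"] },
{ $eq: ["$detail.duration", "$$duration"] }
]
}
}
}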
Sample Mongo Playground

MongoDB: Remove field in array with $lookup localField

I am a beginner with MongoDB. I use $lookup in an aggregation, with localField, to get the referenced documents.
db.orders.insert([
{ "_id" : 1, "item" : ['almonds','pecans','bread'], "price" : 12, "quantity" : 2 },
{ "_id" : 2, "item" : ['cashews','catty'], "price" : 20, "quantity" : 1 }
])
I tried to use $lookup with localField in the aggregation, but I can't find a way to remove the _id and description fields.
db.inventory.insert([
{ "_id" : 1, "sku" : "almonds", description: "product 1", "instock" : 120 },
{ "_id" : 2, "sku" : "bread", description: "product 2", "instock" : 80 },
{ "_id" : 3, "sku" : "cashews", description: "product 3", "instock" : 60 },
{ "_id" : 4, "sku" : "pecans", description: "product 4", "instock" : 70 },
{ "_id" : 5, "sku": "catty", description: "Incomplete", "instock" : 100 },
{ "_id" : 6 }
])
Expected results:
[
{
"_id" : 1,
"item" : [
{ "sku" : "almonds", "instock" : 120 },
{ "sku" : "pecans", "instock" : 70 },
{ "sku" : "bread", "instock" : 80 }
],
"price" : 12,
"quantity" : 2
},
{
"_id" : 2,
"item" : [
{ "sku" : "cashews", "instock" : 60 },
{ "sku" : "catty", "instock" : 100 }
],
"price" : 20,
"quantity" : 1
}
]
You can try $lookup with an aggregation pipeline:
$lookup to join with the inventory collection
$match to check whether the inventory sku is in the item array
$project to display the required fields
db.orders.aggregate([
{
$lookup: {
from: "inventory",
as: "item",
let: { i: "$item" },
pipeline: [
{ $match: { $expr: { $in: ["$sku", "$$i"] } } },
{
$project: {
_id: 0,
sku: 1,
instock: 1
}
}
]
}
}
])
Playground

Aggregate distinct values in MongoDB

I have a MongoDB collection with 18625 documents. Each document has the following keys:
"_id" : ObjectId("5aab14d2fc08b46adb79d99c"),
"game_id" : NumberInt(4),
"score_phrase" : "Great",
"title" : "NHL 13",
"url" : "/games/nhl-13/ps3-128181",
"platform" : "PlayStation 3",
"score" : 8.5,
"genre" : "Sports",
"editors_choice" : "N",
"release_year" : NumberInt(2012),
"release_month" : NumberInt(9),
"release_day" : NumberInt(11)
Now, I wish to create another dimension/collection with only genres.
If I use the following query:
db.ign.aggregate([ {$project: {"genre":1}}, { $out: "dimen_genre" } ]);
It generates 18625 documents, even though there are only 113 distinct genres.
How can I apply distinct here and get a collection of genres with only the 113 distinct values?
I googled, but it showed that aggregate and distinct don't work together in Mongo.
I also tried: db.dimen_genre.distinct('genre').length
This showed that in dimen_genre there are 113 distinct genres.
Precisely, how do I make a collection from an existing one with only distinct values?
I am really new to NoSQL.
You can use $addToSet to group unique values in one document and then $unwind to get back multiple docs:
db.ign.aggregate([
{
$group: {
_id: null,
genre: { $addToSet: "$genre" }
}
},
{
$unwind: "$genre"
},
{
$project: {
_id: 0
}
},
{ $out: "dimen_genre" }
]);
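If it helps to see what that pipeline computes, here is a rough plain-JavaScript sketch of the same idea, run over an in-memory array instead of the collection. The function name distinctGenreDocs is made up for illustration, and a Set stands in for $addToSet (note that $addToSet does not guarantee any particular order, while a Set keeps insertion order).

```javascript
// Hypothetical in-memory equivalent of the $group/$addToSet + $unwind + $project stages.
function distinctGenreDocs(docs) {
  // $group with _id: null and $addToSet: collect each genre once.
  const genres = new Set(docs.map((doc) => doc.genre));
  // $unwind then $project { _id: 0 }: one output document per unique genre.
  return [...genres].map((genre) => ({ genre }));
}

// Stand-in documents with only the field that matters here.
const sampleGames = [
  { title: "NHL 13", genre: "Sports" },
  { title: "Another game", genre: "Sports" },
  { title: "Third game", genre: "Platformer" },
];
console.log(distinctGenreDocs(sampleGames));
```

The real pipeline does this server-side and writes the result with $out, so nothing has to be pulled into the client.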
You can try
db.names.aggregate(
[
{ $group : { _id : "$genre", books: { $push: "$$ROOT" } } }
]
)
I have tried with Test and Sports as genre
It gives you output something like this
{
"_id" : "Test",
"books" : [
{
"_id" : ObjectId("5aaea6150cc1403ee9a02e0c"),
"game_id" : 4,
"score_phrase" : "Great",
"title" : "NHL 13",
"url" : "/games/nhl-13/ps3-128181",
"platform" : "PlayStation 3",
"score" : 8.5,
"genre" : "Test",
"editors_choice" : "N",
"release_year" : 2012,
"release_month" : 9,
"release_day" : 11
}
]
}
/* 2 */
{
"_id" : "Sports",
"books" : [
{
"_id" : ObjectId("5aaea3be0cc1403ee9a02d97"),
"game_id" : 4,
"score_phrase" : "Great",
"title" : "NHL 13",
"url" : "/games/nhl-13/ps3-128181",
"platform" : "PlayStation 3",
"score" : 8.5,
"genre" : "Sports",
"editors_choice" : "N",
"release_year" : 2012,
"release_month" : 9,
"release_day" : 11
},
{
"_id" : ObjectId("5aaea3c80cc1403ee9a02d9b"),
"game_id" : 4,
"score_phrase" : "Great",
"title" : "NHL 13",
"url" : "/games/nhl-13/ps3-128181",
"platform" : "PlayStation 3",
"score" : 8.5,
"genre" : "Sports",
"editors_choice" : "N",
"release_year" : 2012,
"release_month" : 9,
"release_day" : 11
},
{
"_id" : ObjectId("5aaea3cf0cc1403ee9a02d9f"),
"game_id" : 4,
"score_phrase" : "Great",
"title" : "NHL 13",
"url" : "/games/nhl-13/ps3-128181",
"platform" : "PlayStation 3",
"score" : 8.5,
"genre" : "Sports",
"editors_choice" : "N",
"release_year" : 2012,
"release_month" : 9,
"release_day" : 11
}
]
}

Group multi-dimensional array after unwinding elements

Again with mongoDB. I really like aggregation, but still can't "get it".
So here is my array:
{
"_id" : ObjectId("55951b2bf41edfc80b00002a"),
"orders" : [
{
"id" : "55929142f41edfdc0f00002f",
"name" : "XYZ",
"id_basket" : 1,
"card" : [
{
"id" : "250",
"serial" : "B",
"type" : "9cf4161002b9eda349bb9c5ae64b9f4a",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
}
]
},
{
"id" : "250",
"serial" : "B",
"type" : "9cf4161002b9eda349bb9c5ae64b9f4a",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
}
]
}
],
"full_amount" : "40",
},
{
"id" : "55929142f41edfdc0f00002f",
"name" : "XYZ",
"id_basket" : 1,
"card" : [
{
"id" : "250",
"serial" : "B",
"type" : "9cf4161002b9eda349bb9c5ae64b9f4a",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
}
]
},
{
"id" : "250",
"serial" : "B",
"type" : "9cf4161002b9eda349bb9c5ae64b9f4a",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : {
"name" : "Normal",
"price" : "10",
"price_disp" : "10 €",
}
}
]
}
],
"full_amount" : "40",
},
],
"rate" : "0.23",
"date" : "2015-07-02 13:04:34",
"id_user" : 97,
}
I want to output something like this:
{
"_id" : ObjectId("55951b2bf41edfc80b00002a"),
"orders" : [
{
"id" : "55929142f41edfdc0f00002f",
"name" : "XYZ",
"card" : [
{
"id" : "250",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
}
]
},
{
"id" : "250",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
}
]
}
],
"full_amount" : "40",
},
{
"id" : "55929142f41edfdc0f00002f",
"name" : "XYZ",
"card" : [
{
"id" : "250",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
}
]
},
{
"id" : "250",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000030",
"name" : "ZZZ",
"price" : "10 €"
}
]
}
],
"full_amount" : "40",
},
],
"rate" : "0.23",
"date" : "2015-07-02 13:04:34",
}
I've tried many combinations with unwinding, projecting and grouping and failed to get what I want. Can someone help me with this?
You probably shouldn't be using the aggregation framework for tasks like this that do not actually "aggregate" anything between documents. This really is a "projection" task since all you are asking is to "alter" the structure of a document, and that is a task probably better suited to coding in the client after the document is retrieved.
A very good reason for this is that operations like $unwind are very costly in terms of performance. What $unwind does is produce a "copy" of the document content for each array member present, which results in a lot more documents to process.
Think of that like a "SQL Join" with a "one to many" relationship, the only difference being the data is self contained in one document. Processing $unwind simulates the "join" results in that the "master" (one) document contents are reproduced for every "child" (many) document.
To avoid the need for such operations, MongoDB 2.6 introduced the $map operator, which processes array elements within the document itself.
So instead of doing multiple ( or any ) $unwind actions, you can instead just process the arrays within the document itself using $map in a $project stage:
db.collection.aggregate([
{ "$project": {
"orders": { "$map": {
"input": "$orders",
"as": "o",
"in": {
"id": "$$o.id",
"name": "$$o.name",
"card": { "$map": {
"input": "$$o.card",
"as": "c",
"in": {
"id": "$$c.id",
"serial": "$$c.serial",
"name": "$$c.name",
"ticket": { "$map": {
"input": "$$c.ticket",
"as": "t",
"in": {
"id": "$$t.id",
"name": "$$t.name",
"price": "$$t.price.price_disp"
}
}}
}
}},
"full_amount": "$$o.full_amount"
}
}},
"rate": 1,
"date": 1
}}
])
The operations are fairly simple there, as each "array" is assigned its own variable name, and for a simple projection operation such as this all that is really left is selecting which fields you want.
In earlier versions, processing using $unwind is much more difficult:
db.collection.aggregate([
{ "$unwind": "$orders" },
{ "$unwind": "$orders.card" },
{ "$unwind": "$orders.card.ticket" },
{ "$group": {
"_id": {
"_id": "$_id",
"orders": {
"id": "$orders.id",
"name": "$orders.name",
"card": {
"id": "$orders.card.id",
"serial": "$orders.card.serial",
"name": "$orders.card.name"
},
"full_amount": "$orders.full_amount"
},
"rate": "$rate",
"date": "$date"
},
"ticket": {
"$push": {
"id": "$orders.card.ticket.id",
"name": "$orders.card.ticket.name",
"price": "$orders.card.ticket.price.price_disp"
}
}
}},
{ "$group": {
"_id": {
"_id": "$_id._id",
"orders": {
"id": "$_id.orders.id",
"name": "$_id.orders.name",
"full_amount": "$_id.orders.full_amount"
},
"rate": "$_id.rate",
"date": "$_id.date"
},
"card": {
"$push": {
"id": "$_id.orders.card.id",
"serial": "$_id.orders.card.serial",
"name": "$_id.orders.card.name",
"ticket": "$ticket"
}
}
}},
{ "$group": {
"_id": "$_id._id",
"orders": {
"$push": {
"id": "$_id.orders.id",
"name": "$_id.orders.name",
"card": "$card",
"full_amount": "$_id.orders.full_amount"
}
},
"rate": { "$first": "$_id.rate" },
"date": { "$first": "$_id.date" }
}}
])
So following through that carefully, you should see that since you $unwind three times it is necessary to $group "three times" as well, while carefully grouping all the distinct values at each "level" and re-constructing the arrays via $push.
This really is not advised at all as was mentioned earlier:
You "are not grouping/aggregating anything" and each sub-document "must" contain a "unique" identifier because of the "grouping" operations required to re-construct arrays. ( See: NOTE )
The $unwind operation here is very costly. All of the document information is re-produced by a factor of "n" array X "n" array elements and so on. So there is much more data in the aggregation pipeline than your collection or query selection actually contains in itself.
Therefore in conclusion, for the general processing of "reformatting your data" you should instead be processing each document in your code rather than be "throwing it" at the aggregation pipeline to do.
If your document data requires "sufficient" manipulation that makes a "substantial difference" to the returned result size that you deem to be more efficient than pulling the whole document and manipulating in the client, then and "only" then should you be using the $project form as shown with the $map operations.
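As a rough idea of what "processing each document in your code" could look like, here is a minimal plain-JavaScript sketch that mirrors the $map projection shown earlier. The function name reshapeOrders is my own, and it assumes every order has card and ticket arrays shaped like the sample; the same nesting of map calls carries over to other client languages.

```javascript
// Hypothetical client-side reshaping of one retrieved document.
// Field names follow the sample document in the question.
function reshapeOrders(doc) {
  return {
    _id: doc._id,
    orders: doc.orders.map((order) => ({
      id: order.id,
      name: order.name,
      card: order.card.map((card) => ({
        id: card.id,
        serial: card.serial,
        name: card.name,
        ticket: card.ticket.map((ticket) => ({
          id: ticket.id,
          name: ticket.name,
          // Keep only the display price, as in the desired output.
          price: ticket.price.price_disp,
        })),
      })),
      full_amount: order.full_amount,
    })),
    rate: doc.rate,
    date: doc.date,
  };
}

// Stand-in document, trimmed to one order, one card and one ticket.
const sampleDoc = {
  _id: "55951b2bf41edfc80b00002a",
  orders: [
    {
      id: "55929142f41edfdc0f00002f",
      name: "XYZ",
      id_basket: 1,
      card: [
        {
          id: "250",
          serial: "B",
          type: "9cf4161002b9eda349bb9c5ae64b9f4a",
          name: "Eco",
          ticket: [
            {
              id: "55927d41f41edfd00f000030",
              name: "ZZZ",
              price: { name: "Normal", price: "10", price_disp: "10 €" },
            },
          ],
        },
      ],
      full_amount: "40",
    },
  ],
  rate: "0.23",
  date: "2015-07-02 13:04:34",
  id_user: 97,
};
console.log(JSON.stringify(reshapeOrders(sampleDoc), null, 2));
```

No $unwind or $group is involved, so sub-documents do not need unique identifiers for this to work.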
Sidebar
Your original "tag" here mentions "PHP".
All MongoDB queries including the aggregation have nothing language specific about them and are just "data structures" and are represented as such mostly in the "native form" for those languages (PHP,JavaScript,python,etc), and with "builder methods" for those languages without "native" expressive formats for free structures ( C,C#,Java ).
In all cases, there are simple parsers available for JSON, which is a common "lingua franca" here as the MongoDB shell itself is JavaScript based and understands JSON structure ( as actual JavaScript Objects ) natively.
So when working with such examples use tools like:
json_decode: to get more of an insight into how your native data structure is constructed.
json_encode: in order to check your native data structure against any JSON represented sample.
All content here is just simple "key/value" array() notation, though nested. But it is probably good practice to be aware of the tools and use them regularly.
NOTE:
The data sample you give looks very much like you have "cut and paste" data in order to create multiple items, as various "sub-items" all share the same "id" values.
Your "real" data should not do this! So I hope it does not, but if so then fix it.
In order to make the second example workable ( first is perfectly fine as is ) the data needs to be altered to include "unique" "id" values for each sub-element.
As I used here:
{
"_id" : ObjectId("55951b2bf41edfc80b00002a"),
"orders" : [
{
"id" : "55929142f41edfdc0f00002a",
"name" : "XYZ",
"card" : [
{
"id" : "250",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000031",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000032",
"name" : "ZZZ",
"price" : "10 €"
}
]
},
{
"id" : "251",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000033",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000034",
"name" : "ZZZ",
"price" : "10 €"
}
]
}
],
"full_amount" : "40",
},
{
"id" : "55929142f41edfdc0f00002b",
"name" : "XYZ",
"card" : [
{
"id" : "252",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000035",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000036",
"name" : "ZZZ",
"price" : "10 €"
}
]
},
{
"id" : "253",
"serial" : "B",
"name" : "Eco",
"ticket" : [
{
"id" : "55927d41f41edfd00f000037",
"name" : "ZZZ",
"price" : "10 €"
},
{
"id" : "55927d41f41edfd00f000038",
"name" : "ZZZ",
"price" : "10 €"
}
]
}
],
"full_amount" : "40",
}
],
"rate" : "0.23",
"date" : "2015-07-02 13:04:34",
}

Get document based on multiple criteria of embedded collection

I have the following document, and I need to search for multiple items from the embedded collection "items".
Here's an example of a single SKU
db.sku.findOne()
{
"_id" : NumberLong(1192),
"description" : "Uploaded via CSV",
"items" : [
{
"_id" : NumberLong(2),
"category" : DBRef("category", NumberLong(1)),
"description" : "840 tag visual",
"name" : "840 Visual Mini Round",
"version" : NumberLong(0)
},
{
"_id" : NumberLong(7),
"category" : DBRef("category", NumberLong(2)),
"description" : "Maxi",
"name" : "Maxi",
"version" : NumberLong(0)
},
{
"_id" : NumberLong(11),
"category" : DBRef("category", NumberLong(3)),
"description" : "Button",
"name" : "Button",
"version" : NumberLong(0)
},
{
"_id" : NumberLong(16),
"category" : DBRef("category", NumberLong(4)),
"customizationFields" : [
{
"_class" : "CustomizationField",
"_id" : NumberLong(1),
"displayText" : "Custom Print 1",
"fieldName" : "customPrint1",
"listOrder" : 1,
"maxInputLength" : 12,
"required" : false,
"version" : NumberLong(0)
},
{
"_class" : "CustomizationField",
"_id" : NumberLong(2),
"displayText" : "Custom Print 2",
"fieldName" : "customPrint2",
"listOrder" : 2,
"maxInputLength" : 17,
"required" : false,
"version" : NumberLong(0)
}
],
"description" : "2 custom lines of farm print",
"name" : "Custom 2",
"version" : NumberLong(2)
},
{
"_id" : NumberLong(20),
"category" : DBRef("category", NumberLong(5)),
"description" : "Color Red",
"name" : "Red",
"version" : NumberLong(0)
}
],
"skuCode" : "NF-USDA-XC2/SM-BC-R",
"version" : 0,
"webCowOptions" : "840miniwithcust2"
}
There are repeat items.id throughout the embedded collection. Each Sku is made up of multiple items, all combinations are unique, but one item will be part of many Skus.
I'm struggling with the query structure to get what I'm looking for.
Here are a few things I have tried:
db.sku.find({'items._id':2},{'items._id':7})
That one only returns items with the id of 7
db.sku.find({items:{$all:[{_id:5}]}})
That one doesn't return anything, but it came up when looking for solutions. I found out about it in the MongoDB manual.
Here's an example of an expected result:
sku:{ "_id" : NumberLong(1013),
"items" : [ { "_id" : NumberLong(5) },
{ "_id" : NumberLong(7) },
{ "_id" : NumberLong(12) },
{ "_id" : NumberLong(16) },
{ "_id" :NumberLong(2) } ] },
sku:
{ "_id" : NumberLong(1014),
"items" : [ { "_id" : NumberLong(5) },
{ "_id" : NumberLong(7) },
{ "_id" : NumberLong(2) },
{ "_id" : NumberLong(16) },
{ "_id" :NumberLong(24) } ] },
sku:
{ "_id" : NumberLong(1015),
"items" : [ { "_id" : NumberLong(5) },
{ "_id" : NumberLong(7) },
{ "_id" : NumberLong(12) },
{ "_id" : NumberLong(2) },
{ "_id" :NumberLong(5) } ] }
Each Sku that comes back has both an item with id:7 and an item with id:2, along with any other items it has.
To further clarify, my purpose is to determine how many remaining combinations exist after entering the first couple of items.
Basically a customer will start specifying items, and we'll weed it down to the remaining valid combinations. So Sku.items[0].id=5 can only be combined with items[1].id=7 or items[1].id=10 …. Then items[1].id=7 can only be combined with items[2].id=20 … and so forth
The goal was to simplify my rules for purchase, and drive it all from the Sku codes. I don't know if I dug a deeper hole instead.
Thank you,
On the part of extracting the sku with item IDs 2 and 7, if I recall correctly, you have to use $elemMatch:
db.sku.find({'items' :{ '$all' :[{ '$elemMatch':{ '_id' : 2 }},{'$elemMatch': { '_id' : 7 }}]}} )
which selects every sku that contains both an item with _id 2 and an item with _id 7.
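To spell out what that $all / $elemMatch condition means, here is a small plain-JavaScript sketch of the same check over an in-memory array of skus. The function name remainingSkus is hypothetical, and the ids are assumed to be plain numbers (values read through a driver may be 64-bit wrappers that need converting first).

```javascript
// Hypothetical in-memory version of the $all + $elemMatch condition:
// keep the skus whose items array contains every chosen item id.
function remainingSkus(skus, chosenItemIds) {
  return skus.filter((sku) =>
    chosenItemIds.every((id) => sku.items.some((item) => item._id === id))
  );
}

// Stand-in data shaped like the expected result in the question.
const sampleSkus = [
  { _id: 1013, items: [{ _id: 5 }, { _id: 7 }, { _id: 12 }, { _id: 16 }, { _id: 2 }] },
  { _id: 1014, items: [{ _id: 5 }, { _id: 7 }, { _id: 2 }, { _id: 16 }, { _id: 24 }] },
  { _id: 1192, items: [{ _id: 2 }, { _id: 11 }, { _id: 20 }] },
];
console.log(remainingSkus(sampleSkus, [2, 7]).map((sku) => sku._id));
```

Calling it again with a longer list of chosen ids narrows the remaining combinations, which is the weeding-down step described in the question.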
You can use aggregation pipelines
db.sku.aggregate([
{"$unwind": "$items"},
{"$group": {"_id": "$_id", "items": {"$addToSet":{"_id": "$items._id"}}}},
{"$match": {"items._id": {$all:[2,7]}}}
])