MongoDB flatten embedded array - mongodb

I'd like to create a report from a collection. Its schema is:
(I simplified the schema to focus on the problem.)
Mongoose Schema
var MobilHomeSchema = new Schema({
id: Schema.Types.ObjectId,
region: String,
equipments:[
{ id: ObjectId, label: String }
]
});
It contains many mobile homes. Each mobile home is in a campsite, in a region (I chose this grouping; it could be country, ...). Each mobile home has some equipment, not always the same.
I'd like to create a spreadsheet with these columns, counting how many of each piece of equipment there are per region (it's just an example).
Expected generic result format
region | equipments.label 1 | equipments.label 2 | equipments.label 3 | ....
Example with "real" values :
region|terrace|pergola|shower
Spain | 30 | 15 |150
France| 55 | 32 |540
...
In JSON format, it could be:
EDIT
[{
region: "Spain",
terrace: 30,
pergola: 15,
shower: 150
},
{
region: "France",
terrace: 55,
pergola: 32,
shower: 540
}]
/EDIT
How can I do this?
(Map-reduce? A Business Intelligence tool?)
Many thanks!

Don't use map/reduce. Use aggregation. In the mongo shell,
> db.mobile.aggregate([
{ "$unwind" : "$equipments" },
{ "$group" : { "_id" : { "region" : "$region", "label" : "$equipments.label" }, "count" : { "$sum" : 1 } } }
])
On the documents
{ "region" : "France", "equipments" : [ { "_id" : 0, "label" : "terrace" }, { "_id" : 1, "label" : "pergola" } ] },
{ "region" : "France", "equipments" : [ { "_id" : 0, "label" : "shower" }, { "_id" : 1, "label" : "pergola" } ] },
{ "region" : "Spain", "equipments" : [ { "_id" : 0, "label" : "terrace" }, { "_id" : 1, "label" : "shower" } ] },
{ "region" : "Spain", "equipments" : [ { "_id" : 0, "label" : "veranda" }, { "_id" : 1, "label" : "pergola" } ] }
the result is
{ "_id" : { "region" : "Spain", "label" : "veranda" }, "count" : 1 }
{ "_id" : { "region" : "Spain", "label" : "terrace" }, "count" : 1 }
{ "_id" : { "region" : "Spain", "label" : "shower" }, "count" : 1 }
{ "_id" : { "region" : "France", "label" : "shower" }, "count" : 1 }
{ "_id" : { "region" : "France", "label" : "pergola" }, "count" : 2 }
{ "_id" : { "region" : "Spain", "label" : "pergola" }, "count" : 1 }
{ "_id" : { "region" : "France", "label" : "terrace" }, "count" : 1 }
Since you're using an array, presumably you don't know all the possible types of equipment ahead of time, which makes shoving the above results back into one object per region in the aggregation an unwieldy thing to attempt. Better to work with these results in the client.
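As a rough illustration of the client-side step, here is a minimal sketch in plain JavaScript that turns the aggregation output above into one object per region. The helper name pivotByRegion is hypothetical, and filling missing labels with 0 is an assumption about what the spreadsheet should show.

```javascript
// Rows copied from the sample aggregation result above.
const rows = [
  { _id: { region: "Spain", label: "veranda" }, count: 1 },
  { _id: { region: "Spain", label: "terrace" }, count: 1 },
  { _id: { region: "Spain", label: "shower" }, count: 1 },
  { _id: { region: "France", label: "shower" }, count: 1 },
  { _id: { region: "France", label: "pergola" }, count: 2 },
  { _id: { region: "Spain", label: "pergola" }, count: 1 },
  { _id: { region: "France", label: "terrace" }, count: 1 },
];

// Hypothetical helper: build one object per region, one key per label.
function pivotByRegion(rows) {
  // Collect every label seen, so each region gets the same columns.
  const labels = [...new Set(rows.map((r) => r._id.label))];
  const byRegion = new Map();
  for (const { _id, count } of rows) {
    if (!byRegion.has(_id.region)) {
      const row = { region: _id.region };
      for (const label of labels) row[label] = 0; // assumed default for missing labels
      byRegion.set(_id.region, row);
    }
    byRegion.get(_id.region)[_id.label] = count;
  }
  return [...byRegion.values()];
}

const report = pivotByRegion(rows);
console.log(report);
```

Because the labels are discovered from the data, a new type of equipment shows up as a new column without any change to the aggregation.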

Related

MongoDB - how to optimise find query with regex search, with sort

I need to execute the following query:
db.S12_RU.find({"venue.raw":a,"title":/b|c|d|e/}).sort({"year":-1}).skip(X).limit(Y);
where X and Y are numbers.
The number of documents in my collection is:
208915369
Currently, this sort of query takes about 6 minutes to execute.
I have the following indexes:
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_"
},
{
"v" : 2,
"key" : {
"venue.raw" : 1
},
"name" : "venue.raw_1"
},
{
"v" : 2,
"key" : {
"venue.raw" : 1,
"title" : 1,
"year" : -1
},
"name" : "venue.raw_1_title_1_year_-1"
}
]
A standard document looks like this:
{ "_id" : ObjectId("5fc25fc091e3146fb10484af"), "id" : "1967181478", "title" : "Quality of Life of Swedish Women with Fibromyalgia Syndrome, Rheumatoid Arthritis or Systemic Lupus Erythematosus", "authors" : [ { "name" : "Carol S. Burckhardt", "id" : "2052326732" }, { "name" : "Birgitha Archenholtz", "id" : "2800742121" }, { "name" : "Kaisa Mannerkorpi", "id" : "240289002" }, { "name" : "Anders Bjelle", "id" : "2419758571" } ], "venue" : { "raw" : "Journal of Musculoskeletal Pain", "id" : "49327845" }, "year" : 1993, "n_citation" : 31, "page_start" : "199", "page_end" : "207", "doc_type" : "Journal", "publisher" : "Taylor & Francis", "volume" : "1", "issue" : "", "doi" : "10.1300/J094v01n03_20" }
Is there any way to make this query execute in a few seconds?

Mongo db query for multiple conditions

I have a MongoDB document that looks like this:
{
"_id" : "5b862ebecebe455a1744",
"userId" : "111",
"courses" : [
{
"stateName" : "statge 1",
"courseId" : "1453",
"courseName" : "Program Training 1",
"duration" : 1,
"lag" : 0,
"courseType" : "1",
"transitionType" : "onComplete",
"scheduledStartDate" : ISODate("2018-07-27T16:23:14.000+05:30"),
"scheduledEndDate" : ISODate("2018-07-27T16:23:14.000+05:30"),
"courseProgress" : 0,
"ASD" : ISODate("2018-09-17T23:18:30.636+05:30"),
"score" : 0
},
{
"stateName" : "stage 2",
"courseId" : "1454",
"courseName" : "Program Assessment 1",
"duration" : 1,
"lag" : 0,
"courseType" : "2",
"transitionType" : "onComplete",
"scheduledStartDate" : ISODate("2018-07-28T16:23:14.000+05:30"),
"scheduledEndDate" : ISODate("2018-07-28T16:23:14.000+05:30"),
"courseProgress" : 0,
"score" : 0
},
{
"stateName" : "stage 3",
"courseId" : "911",
"courseName" : "Program Training 3",
"duration" : 1,
"lag" : 0,
"courseType" : "1",
"transitionType" : "onComplete",
"scheduledStartDate" : ISODate("2018-07-29T16:23:14.000+05:30"),
"scheduledEndDate" : ISODate("2018-07-29T16:23:14.000+05:30"),
"courseProgress" : 0,
"score" : 0
}
],
"userStatus" : 1,
"modified" : ISODate("2018-09-12T11:49:47.400+05:30"),
"created" : ISODate("2018-09-12T11:49:47.400+05:30"),
"completionStatus" : "IP",
"currentState" : {
"courseProgress" : 0,
"stateName" : "statge 1",
"courseId" : "1453",
"courseName" : "Program Training 1"
}
}
I want a query with the following conditions. Please help, as I am new to MongoDB.
courses.transitionType = onComplete
(PROGRESS < 100 || (PROGRESS == 100 && ASD exists false))
And print a result like this, containing the data below:
{
"_id" : "5b862ebecebe455a1744",
"courseData" : {
"userId" : "4688",
"courseId" : "1476",
"courseProgress" : 0
}
}
You will have to use an aggregation with a $match stage and a $project stage to format your result.
The tricky part of your request is that you want one result per course, but each document in your collection contains many courses. So first, use the $unwind stage to separate the courses:
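// Replace CollectionName with the name of your collection
db.CollectionName.aggregate([
// Emit one document per element of the courses array
{
$unwind : '$courses'
},
// Keep only the courses that satisfy the conditions
{
$match: {
'courses.transitionType': 'onComplete',
$or: [
{
'courses.courseProgress': { $lt: 100 }
},
{
'courses.courseProgress': 100,
'courses.ASD': { $exists: false }
}
]
}
},
// _id is included by default; userId is a top-level field of the document
{
$project: {
courseData: {
userId: '$userId',
courseId: '$courses.courseId',
courseProgress: '$courses.courseProgress'
}
}
}
])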
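To sanity-check the condition from the question without a database, the same logic can be sketched in plain JavaScript. This is only an in-memory mirror of the unwind, match and project steps; the helper name pendingCourses is hypothetical, and the sample courseProgress values (and the ASD date on course 911) are made up for illustration.

```javascript
// Hypothetical helper mirroring $unwind + $match + $project in memory.
function pendingCourses(doc) {
  return doc.courses
    // courses.transitionType = onComplete
    .filter((c) => c.transitionType === "onComplete")
    // PROGRESS < 100 || (PROGRESS == 100 && ASD does not exist)
    .filter((c) => c.courseProgress < 100 || (c.courseProgress === 100 && !("ASD" in c)))
    // One result per course; userId comes from the parent document.
    .map((c) => ({
      _id: doc._id,
      courseData: { userId: doc.userId, courseId: c.courseId, courseProgress: c.courseProgress },
    }));
}

// Trimmed version of the sample document; progress values are illustrative only.
const doc = {
  _id: "5b862ebecebe455a1744",
  userId: "111",
  courses: [
    { courseId: "1453", transitionType: "onComplete", courseProgress: 0, ASD: new Date("2018-09-17T23:18:30.636+05:30") },
    { courseId: "1454", transitionType: "onComplete", courseProgress: 100 },
    { courseId: "911", transitionType: "onComplete", courseProgress: 100, ASD: new Date("2018-09-18T00:00:00Z") },
  ],
};

const pending = pendingCourses(doc);
console.log(pending.map((p) => p.courseData.courseId));
```

Course 1453 passes because its progress is below 100, 1454 passes because it is complete but has no ASD, and 911 is dropped because it is complete and has an ASD.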

Find MongoDB docs where all sub-docs match criteria

I have some Product documents that each contain a list of ProductVariation sub-documents. I need to find all the Product docs where ALL their child ProductVariation docs have zero quantity.
Schemas look like this:
var Product = new mongoose.Schema({
name: String,
variations: [ProductVariation]
});
var ProductVariation = new mongoose.Schema({
type: String,
quantity: Number,
price: Number
});
I am a little new to MongoDB, so I'm not even sure where to start here.
Try using $not wrapped around { "$gt" : 0 }:
> db.products.find()
{ "_id" : ObjectId("5b7cae558ff28edda6ba4a67"), "name" : "widget", "variations" : [ { "type" : "color", "quantity" : 0, "price" : 10 }, { "type" : "size", "quantity" : 0, "price" : 5 } ] }
{ "_id" : ObjectId("5b7cae678ff28edda6ba4a68"), "name" : "foo", "variations" : [ { "type" : "color", "quantity" : 2, "price" : 15 }, { "type" : "size", "quantity" : 0, "price" : 5 } ] }
{ "_id" : ObjectId("5b7cae7f8ff28edda6ba4a69"), "name" : "bar", "variations" : [ { "type" : "color", "quantity" : 0, "price" : 15 }, { "type" : "size", "quantity" : 1, "price" : 5 } ] }
> db.products.find({"variations.quantity": { "$not" : { "$gt" : 0 } } })
{ "_id" : ObjectId("5b7cae558ff28edda6ba4a67"), "name" : "widget", "variations" : [ { "type" : "color", "quantity" : 0, "price" : 10 }, { "type" : "size", "quantity" : 0, "price" : 5 } ] }
It can also take advantage of an index on { "variations.quantity" : 1 }.
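To see why this works: applied to an array field, { "$not" : { "$gt" : 0 } } matches a document only when no element of the array satisfies $gt 0. A minimal plain-JavaScript sketch of that logic follows; the helper name allOutOfStock is hypothetical, and the documents are trimmed copies of the ones above.

```javascript
// Hypothetical helper: true when no variation has quantity > 0.
function allOutOfStock(product) {
  return !product.variations.some((v) => v.quantity > 0);
}

// Trimmed copies of the three sample documents.
const products = [
  { name: "widget", variations: [{ type: "color", quantity: 0 }, { type: "size", quantity: 0 }] },
  { name: "foo", variations: [{ type: "color", quantity: 2 }, { type: "size", quantity: 0 }] },
  { name: "bar", variations: [{ type: "color", quantity: 0 }, { type: "size", quantity: 1 }] },
];

const outOfStock = products.filter(allOutOfStock).map((p) => p.name);
console.log(outOfStock);
```

Note that, like the $not query, this sketch also accepts a product with an empty variations array or with variations that have no quantity, so an extra condition may be needed if those should be excluded.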

Aggregation in mongo

Below is a document from my database:
{
"_id" : ObjectId("58635ac32c9592064471cf5b"),
"agency_code" : "v5global",
"client_code" : "whirlpool",
"project_code" : "whirlpool",
"date" : {
"datetime" : 1464739200000.0,
"date" : 1464739200000.0,
"datejs" : ISODate("2016-06-01T00:00:00.000+0000"),
"datetimejs" : ISODate("2016-06-01T00:00:00.000+0000"),
"month" : NumberInt(5),
"year" : NumberInt(2016),
"day" : NumberInt(1)
},
"user" : {
"promoter_id" : NumberInt(19),
"promoter_name" : "Hira Singh Pawar",
"empcode" : "519230"
},
"counter" : {
"store_id" : NumberInt(4),
"store_name" : "Maya Sales ",
"chain_type" : "BS",
"address" : "6 Filamingo Market , Hissar",
"city" : "Hissar",
"state" : "Faridabad",
"region" : "North",
"sap_code" : "N_Far_91103948_1",
"unique_tp_code" : "91103948",
"location" : "6"
},
"insertedon" : {
"date" : 1464739200000.0,
"datejs" : ISODate("2016-06-01T00:00:00.000+0000"),
"datetimejs" : ISODate("2016-06-01T00:00:00.000+0000")
},
"insertedby" : "akshay",
"manager" : {
"manager_id" : NumberInt(5943),
"manager_name" : "Sonu Singh"
},
"type" : "display",
"data" : {
"brand" : "whirlpool",
"sku" : "60",
"model_name" : "Icemagic Fresh",
"sub_cat_name" : "DC",
"cat_name" : "Refrigerator",
"value" : NumberInt(1)
},
"IsDeleted" : false
}
I want to apply an aggregation that groups by city, state and region, and if a counter has sold refrigerators I need those details in my result, e.g., if a counter has sold 2 refrigerators of the Whirlpool brand then I want that to be reflected in my result.
A counter can also sell other things like washing machines etc. So if they have sold 2 washing machines I want a result with { washingMachine: 2 }.
I have tried everything and nothing seems to be working here:
db.display_mop.aggregate(
// Pipeline
[
// Stage 1
{ $match: { "project_code":"whirlpool" } },
// Stage 2
{
$group: {
_id: {
"userid": "$user.promoter_id",
"userName": "$user.promoter_name",
"usercode": "$user.empcode",
"storename": "$counter.store_name",
"address": "$counter.address",
"city": "$counter.city",
"state": "$counter.state",
"region": "$counter.region"
}
}
},
],
// Options
{ allowDiskUse: true }
);
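For reference, the result described in the question could look something like the following in-memory sketch in plain JavaScript. It is not an aggregation pipeline, just an illustration of the intended grouping. It assumes that the quantity sold is the sum of data.value and that categories are identified by data.cat_name; the helper name countByCategory and all sample documents except the first counter's fields are hypothetical.

```javascript
// Hypothetical helper: group by city/state/region and sum data.value per category.
function countByCategory(docs) {
  const groups = new Map();
  for (const d of docs) {
    const { city, state, region } = d.counter;
    const key = JSON.stringify([city, state, region]);
    if (!groups.has(key)) groups.set(key, { city, state, region, counts: {} });
    const counts = groups.get(key).counts;
    const category = d.data.cat_name;
    // Assumption: data.value holds the number of units for this record.
    counts[category] = (counts[category] || 0) + d.data.value;
  }
  return [...groups.values()];
}

// Illustrative documents, trimmed to the fields used above.
const docs = [
  { counter: { city: "Hissar", state: "Faridabad", region: "North" }, data: { cat_name: "Refrigerator", value: 1 } },
  { counter: { city: "Hissar", state: "Faridabad", region: "North" }, data: { cat_name: "Refrigerator", value: 1 } },
  { counter: { city: "Hissar", state: "Faridabad", region: "North" }, data: { cat_name: "Washing Machine", value: 2 } },
  { counter: { city: "Rohtak", state: "Haryana", region: "North" }, data: { cat_name: "Refrigerator", value: 1 } },
];

const summary = countByCategory(docs);
console.log(JSON.stringify(summary, null, 2));
```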

MongoDB Aggregation - return default value for documents that don't match query

I'm having trouble figuring out the right aggregation pipe operations to return the results I need.
I have a collection similar to the following :-
{
"_id" : "writer1",
"Name" : "writer1",
"Website" : "website1",
"Reviews" : [
{
"Film" : {
"Name" : "Jurassic Park",
"Genre" : "Action"
},
"Score" : 4
},
{
"Technology" : {
"Name" : "Mad Max",
"Genre" : "Action"
},
"Score" : 5
}
]
}
{
"_id" : "writer2",
"Name" : "writer2",
"Website" : "website1",
"Reviews" : [
{
"Technology" : {
"Name" : "Mad Max",
"Genre" : "Action"
},
"Score" : 5
}
]
}
And this is my aggregation so far : -
db.writers.aggregate([
{ "$unwind" : "$Reviews" },
{ "$match" : { "Reviews.Film.Name" : "Jurassic Park" } },
{ "$group" : { "_id" : "$Website" , "score" : { "$avg" : "$Reviews.Score" },
writers :{ $push: { name:"$Name", score:"$Reviews.Score" } }
}}
])
This returns only writers who have reviewed the matching film, and only websites that have at least one writer who has reviewed the film.
However, I need to return all websites, each with a list of all its writers, with a score of 0 for writers who haven't reviewed the specified film.
So, I am currently getting:
{ "_id" : "website1", "score" : 4, "writers" : [ { "name" : "writer1", "score" : 4 } ] }
When I actually need : -
{ "_id" : "website1", "score" : 2, "writers" : [ { "name" : "writer1", "score" : 4 },{ "name" :"writer2", "score" : 0 } ] }
Can anyone point me in the right direction?
Cheers
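To make the expected output concrete, here is a minimal in-memory sketch in plain JavaScript of the computation described above. It is not an aggregation pipeline. The helper name scoresByWebsite is hypothetical, and it assumes each writer has at most one review of the given film (only the first match is used).

```javascript
// Hypothetical helper: every writer is listed, with score 0 when no matching review exists.
function scoresByWebsite(writers, filmName) {
  const sites = new Map();
  for (const w of writers) {
    // Assumption: at most one review per film per writer; take the first match.
    const review = w.Reviews.find((r) => r.Film && r.Film.Name === filmName);
    const score = review ? review.Score : 0;
    if (!sites.has(w.Website)) sites.set(w.Website, { _id: w.Website, writers: [] });
    sites.get(w.Website).writers.push({ name: w.Name, score });
  }
  // Average over all writers of the website, including the defaulted zeros.
  return [...sites.values()].map((s) => ({
    _id: s._id,
    score: s.writers.reduce((sum, x) => sum + x.score, 0) / s.writers.length,
    writers: s.writers,
  }));
}

// The two sample documents from the question.
const writers = [
  { _id: "writer1", Name: "writer1", Website: "website1", Reviews: [
    { Film: { Name: "Jurassic Park", Genre: "Action" }, Score: 4 },
    { Technology: { Name: "Mad Max", Genre: "Action" }, Score: 5 },
  ] },
  { _id: "writer2", Name: "writer2", Website: "website1", Reviews: [
    { Technology: { Name: "Mad Max", Genre: "Action" }, Score: 5 },
  ] },
];

const result = scoresByWebsite(writers, "Jurassic Park");
console.log(JSON.stringify(result));
```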