How do I remove the "value" in mapReduce? - mongodb

I was wondering if it is possible to remove the "value" key in the map-reduce so that the final result just contains the values directly rather than being inside the "value" key. I am looking to do it with just the commands (so no Javascript variables and such)
For example, the map-reduce output is typically
[
{
"_id" : 0,
"value" : {
"name" : "Apple",
"sold" : 1234
}
},
{
"_id" : 1,
"value" : {
"name" : "Amazon",
"sold" : 5678
}
}
]
I would like it to end up as
[
{
"_id" : 0,
"name" : "Apple",
"sold" : 1234
},
{
"_id" : 1,
"name" : "Amazon",
"sold" : 5678
}
]
I am thinking it can be done with the findAndModify command but I am not exactly sure how.

It does not seems to be possible for now. There is a JIRA ticket for that reported in Mongo.

Related

How to remove documents from embedded where name is.. and start is greater than...?

I'm trying to remove some embedded documents from history. I'm using mongodb 3.2
There are two conditions:
"name" must be for example sa
"history" "start" must be greater some date
{
"name" : "sa",
"history" : [
{
"start" : ISODate("2015-11-11T12:46:32.000Z"),
"value" : "color1"
},
{
"start" : ISODate("2015-11-12T11:54:20.000Z"),
"value" : "color2"
}]
}
{
"name" : "sa",
"history" : [
{
"start" : ISODate("2015-11-11T12:46:32.000Z"),
"value" : "color1"
},
]
"start" : ISODate("2015-11-12T11:54:20.000Z"),
"value" : "color2"
}]
}
{
"name" : "so",
"history" : [
{
"start" : ISODate("2015-11-11T12:46:32.000Z"),
"value" : "color1"
},
{
"start" : ISODate("2015-11-12T11:54:20.000Z"),
"value" : "color2"
}]
}
I couldn't do it directly. I download the collection and then do the operations that I need, remove the old collection and then insert the new collection with the data that I need.

Get results in nested objects

This is my collection structure and I want to filter all the results for a defined reference:
{
"_id" : "5xFusfnvRobfMhRKE",
"book" : "Lorem",
"publisher" : "Lorem",
"author" : "Lorem",
"edition" : [
{
"edition" : "Lorem",
"year" : 2015,
"section" : [
{
"pageNumbers" : "12",
"reference" : "4NoHjACkjHJ8mavv9"
}
]
}
]
}
My attempt was Collection.find({'edition.section.reference': '4NoHjACkjHJ8mavv9'}), but that doesn't work. I would expect this matches the above example.
I think this query can help you-
db.collection.find({"edition.section.reference":"4NoHjACkjHJ8mavv9"},{}).pretty()
You have to use 'db' before the collection_name ie. db.collection.find() and for your problem you actually did it right but just missed out on 'db'.
db.sys_test.insert({"-_id":10,"edition":[{"section":[{"reference":"101"}]}]})
db.sys_test.find({"edition.section.reference":"101"}).pretty()
{
"_id" : ObjectId("55faba519d0ff7079e6f9817"),
"-_id" : 10,
"edition" : [
{
"section" : [
{
"reference" : "101"
}
]
}
]
}

Get document based on multiple criteria of embedded collection

I have the following document, I need to search for multiple items from the embedded collection"items".
Here's an example of a single SKU
db.sku.findOne()
{
"_id" : NumberLong(1192),
"description" : "Uploaded via CSV",
"items" : [
{
"_id" : NumberLong(2),
"category" : DBRef("category", NumberLong(1)),
"description" : "840 tag visual",
"name" : "840 Visual Mini Round",
"version" : NumberLong(0)
},
{
"_id" : NumberLong(7),
"category" : DBRef("category", NumberLong(2)),
"description" : "Maxi",
"name" : "Maxi",
"version" : NumberLong(0)
},
{
"_id" : NumberLong(11),
"category" : DBRef("category", NumberLong(3)),
"description" : "Button",
"name" : "Button",
"version" : NumberLong(0)
},
{
"_id" : NumberLong(16),
"category" : DBRef("category", NumberLong(4)),
"customizationFields" : [
{
"_class" : "CustomizationField",
"_id" : NumberLong(1),
"displayText" : "Custom Print 1",
"fieldName" : "customPrint1",
"listOrder" : 1,
"maxInputLength" : 12,
"required" : false,
"version" : NumberLong(0)
},
{
"_class" : "CustomizationField",
"_id" : NumberLong(2),
"displayText" : "Custom Print 2",
"fieldName" : "customPrint2",
"listOrder" : 2,
"maxInputLength" : 17,
"required" : false,
"version" : NumberLong(0)
}
],
"description" : "2 custom lines of farm print",
"name" : "Custom 2",
"version" : NumberLong(2)
},
{
"_id" : NumberLong(20),
"category" : DBRef("category", NumberLong(5)),
"description" : "Color Red",
"name" : "Red",
"version" : NumberLong(0)
}
],
"skuCode" : "NF-USDA-XC2/SM-BC-R",
"version" : 0,
"webCowOptions" : "840miniwithcust2"
}
There are repeat items.id throughout the embedded collection. Each Sku is made up of multiple items, all combinations are unique, but one item will be part of many Skus.
I'm struggling with the query structure to get what I'm looking for.
Here are a few things I have tried:
db.sku.find({'items._id':2},{'items._id':7})
That one only returns items with the id of 7
db.sku.find({items:{$all:[{_id:5}]}})
That one doesn't return anything, but it came up when looking for solutions. I found about it in the MongoDB manual
Here's an example of a expected result:
sku:{ "_id" : NumberLong(1013),
"items" : [ { "_id" : NumberLong(5) },
{ "_id" : NumberLong(7) },
{ "_id" : NumberLong(12) },
{ "_id" : NumberLong(16) },
{ "_id" :NumberLong(2) } ] },
sku:
{ "_id" : NumberLong(1014),
"items" : [ { "_id" : NumberLong(5) },
{ "_id" : NumberLong(7) },
{ "_id" : NumberLong(2) },
{ "_id" : NumberLong(16) },
{ "_id" :NumberLong(24) } ] },
sku:
{ "_id" : NumberLong(1015),
"items" : [ { "_id" : NumberLong(5) },
{ "_id" : NumberLong(7) },
{ "_id" : NumberLong(12) },
{ "_id" : NumberLong(2) },
{ "_id" :NumberLong(5) } ] }
Each Sku that comes back has both a item of id:7, and id:2, with any other items they have.
To further clarify, my purpose is to determine how many remaining combinations exist after entering the first couple of items.
Basically a customer will start specifying items, and we'll weed it down to the remaining valid combinations. So Sku.items[0].id=5 can only be combined with items[1].id=7 or items[1].id=10 …. Then items[1].id=7 can only be combined with items[2].id=20 … and so forth
The goal was to simplify my rules for purchase, and drive it all from the Sku codes. I don't know if I dug a deeper hole instead.
Thank you,
On the part of extracting the sku with item IDs 2 and 7, when I recall correctly, you have to use $elemMatch:
db.sku.find({'items' :{ '$all' :[{ '$elemMatch':{ '_id' : 2 }},{'$elemMatch': { '_id' : 7 }}]}} )
which selects all sku where there is each an item with _id 2 and 7.
You can use aggregation pipelines
db.sku.aggregate([
{"$unwind": "$sku.items"},
{"$group": {"_id": "$_id", "items": {"$addToSet":{"_id": "$items._id"}}}},
{"$match": {"items._id": {$all:[2,7]}}}
])

MongoDB MapReduce--is there an Aggregation alternative?

I've got a collection with documents using a schema something like this (some members redacted):
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : [
2,
3,
5
],
"activity" : [
4,
4,
3
],
},
"media" : [
ObjectId("537ea185df872bb71e4df270"),
ObjectId("537ea185df872bb71e4df275"),
ObjectId("537ea185df872bb71e4df272")
]
}
In this schema, the first, second, and third positivity ratings correspond to the first, second, and third entries in the media array, respectively. The same is true for the activity ratings. I need to calculate statistics for the positivity and activity ratings with respect to their associated media objects across all documents in the collection. Right now, I'm doing this with MapReduce. I'd like to, however, accomplish this with the Aggregation Pipeline.
Ideally, I'd like to $unwind the media, answers.ratings.positivity, and answers.ratings.activity arrays simultaneously so that I end up with, for example, the following three documents based on the previous example:
[
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : 2,
"activity" : 4
}
},
"media" : ObjectId("537ea185df872bb71e4df270")
},
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : 3
"activity" : 4
}
},
"media" : ObjectId("537ea185df872bb71e4df275")
},
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : 5
"activity" : 3
}
},
"media" : ObjectId("537ea185df872bb71e4df272")
}
]
Is there some way to accomplish this?
The current aggregation framework does not allow you to do this. Being able to unwind multiple arrays that are know to be the same size and creating a document for the ith value of each would be a good feature.
If you want to use the aggregation framework you will need to change your schema a little. For example take the following document schema:
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : [
{k:1, v:2},
{k:2, v:3},
{k:3, v:5}
],
"activity" : [
{k:1, v:4},
{k:2, v:4},
{k:3, v:3}
],
}},
"media" : [
{k:1, v:ObjectId("537ea185df872bb71e4df270")},
{k:2, v:ObjectId("537ea185df872bb71e4df275")},
{k:3, v:ObjectId("537ea185df872bb71e4df272")}
]
}
By doing this you are essentially adding the index to the object inside the array. After this it's just a matter of unwinding all the arrays and matching on the key.
db.test.aggregate([{$unwind:"$media"},
{$unwind:"$answers.ratings.positivity"},
{$unwind:"$answers.ratings.activity"},
{$project:{"media":1, "answers.ratings.positivity":1,"answers.ratings.activity":1,
include:{$and:[
{$eq:["$media.k", "$answers.ratings.positivity.k"]},
{$eq:["$media.k", "$answers.ratings.activity.k"]}
]}}
},
{$match:{include:true}}])
And the output is:
[
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : {
"k" : 1,
"v" : 2
},
"activity" : {
"k" : 1,
"v" : 4
}
}
},
"media" : {
"k" : 1,
"v" : ObjectId("537ea185df872bb71e4df270")
},
"include" : true
},
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : {
"k" : 2,
"v" : 3
},
"activity" : {
"k" : 2,
"v" : 4
}
}
},
"media" : {
"k" : 2,
"v" : ObjectId("537ea185df872bb71e4df275")
},
"include" : true
},
{
"_id" : ObjectId("539f41a95d1887b57ab78bea"),
"answers" : {
"ratings" : {
"positivity" : {
"k" : 3,
"v" : 5
},
"activity" : {
"k" : 3,
"v" : 3
}
}
},
"media" : {
"k" : 3,
"v" : ObjectId("537ea185df872bb71e4df272")
},
"include" : true
}
]
Doing this creates a lot of extra document overhead and may be slower than your current MapReduce implementation. You would need to run tests to check this. The computations required for this will grow in a cubic way based on the size of those three arrays. This should also be kept in mind.

how to use aggregation of mongodb

From the data as given below, I want to sum all Values fields.
Please let me know how can I do it using aggregation functionality of mongodb.
{"MetricRecord":
{ "SchemaVersion" : "0.12",
"Product": {
"ProductName" : "abc",
"ProductVersion": "7.5.0.1" ,
"ProductId" : "1234567890ABDFGH12345",
"InstanceId" : "12345BA32",
"InstanceName" : "1234SS123",
"SystemId" : "somehost.com"
},
"Tenant" : {
"CustomerId" : "222-555-124",
"ServiceCode": "xyzxyzxyz12345yyy"
},
"Metrics" : [
{
"ReportType" :[
{ "report" : "billing" },
],
"LogTime" : "2013-12-08T12:34:56:01Z" ,
"Type" : "AuthorizedUsers",
"SubType" : "registered",
"Value" : "125",
"UnitOfMeasure": "USD",
"Period" : {
"StartTime" : "2013-12-07T00:00:00:01Z",
"EndTime" : "2013-12-08T00:00:00:01Z"
}
},
{
"ReportType" :[
{ "report" : "billing" }
],
"LogTime" : "2013-12-08T12:34:56:01Z" ,
"Type" : "NumberOfTickets",
"SubType" : "resolved",
"Value" : "430",
"UnitOfMeasure": "USD",
"Period" : {
"StartTime" : "2013-12-07T00:00:00:01Z",
"EndTime" : "2013-12-08T00:00:00:01Z"
}
}
]
}
}
So, results which I expect from summation of values is 430+125 i.e. 555
Your document contains string value for MetricRecord.Metrics[index].Value field and i am not sure why are you trying to sum up the string values. if it is a typo and your document contains numerical values for MetricRecord.Metrics[index].Value field then you can try the following query
db.metrics.aggregate([
{$unwind:"$MetricRecord.Metrics"},
{$group:{_id:"$_id",sum:{$sum:"$MetricRecord.Metrics.Value"}}}
])
In the above document posted, if your value field is like
MetricRecord.Metrics[0].Value is 125(not "125")
MetricRecord.Metrics[1].Value is 430(not "430")
you will get the following output
{
"result" : [
{
"_id" : ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx"),
"sum" : 555
}
],
"ok" : 1
}
The above sample query is composed assuming you have the default mongodb "_id" field and you are using a metrics collection. You have to manipulate the query as per you requirements.