TL;DR
My MapReduce doesn't collect the _ids into a single array per key; it produces nested arrays instead. Any help?
Full story
I have a collection filled with tweets, including the entities. The portion of data that I'm interested in looks something like this:
{
"_id": ObjectId("h98342jdhs99191"),
"text": "tweet text",
"screen_name":"twittername",
"entities":{
media:[
{
"type":"photo",
"media_url":"http://wwww.twitpic.com/HzKd99.jpg"
},
{
"type":"photo",
"media_url":"http://wwww.twitpic.com/HDK43.jpg"
}
]
}
}
The key of the output should be the media_url. Because a URL can be tweeted by more than one person, I want the value to be an array containing the ids of the tweeps. Something like this:
{
"_id": "http://www.foto.com/kdh34a.jpg",
"value":{
{ id:ObjectId("854737272343f8928") },
{ id:ObjectId("23137272378uie8928") },
{ id:ObjectId("85473727fdsd4x77665") },
{ id:ObjectId("8547372723dsd411zzc") }
}
}
I've created the following MapReduce functions:
map = function(){
if(!this.entities.media){
return;
}
for(index in this.entities.media){
emit(this.entities.media[index].media_url, {ids: [this._id]});
}
}
reduce = function(key, values){
var result = {};
for(id in values){
if(!values.indexOf(values[id])){
Array.prototype.push.apply(result, values);
}
}
return result;
}
db.tweets.mapReduce(map, reduce, {out: "media"});
When the media_url is unique the result is as follows:
{
"_id" : "http://wwww.twitpic.com/HzKd99.jpg",
"value" : {
"ids" : [
ObjectId("528748b423421150010021fd")
]
}
}
When it's not unique the results get weird:
{
"_id" : "http://wwww.twitpic.com/HzKd99.jpg",
"value" : {
"0" : {
"0" : {
"ids" : [
ObjectId("528733ac234211500100004f")
]
},
"1" : {
"ids" : [
ObjectId("52873c772342115001000d8d")
]
},
"2" : {
"ids" : [
ObjectId("52873e142342115001001017")
]
},
"3" : {
"ids" : [
ObjectId("5287545a2342115001004fd3")
]
},
"length" : 4
},
"1" : {
"ids" : [
ObjectId("5287c43b2342115001010e53")
]
},
"length" : 2
}
}
What is causing this and how do I get one nice list of values?
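One likely cause, offered as a sketch rather than a verified fix: reduce starts from a plain object instead of an array, so Array.prototype.push.apply writes "0", "1", ... and "length" keys into it, and the returned shape differs from the {ids: [...]} shape that map emits. MongoDB may call reduce again with an earlier reduce result among the values, which then nests the output. A minimal version that keeps the emitted shape could look like this (reduceIds and the string ids are illustrative names, not part of the original code):

```javascript
// Sketch of a reduce function whose output has the same shape as the
// values emitted by map: { ids: [...] }.
var reduceIds = function (key, values) {
  var result = { ids: [] };
  values.forEach(function (value) {
    // Each value is either a map emit or an earlier reduce result;
    // both carry an ids array, so concatenation handles either case.
    result.ids = result.ids.concat(value.ids);
  });
  return result;
};

// Plain JavaScript check, with strings standing in for ObjectIds.
var firstPass = reduceIds("url", [{ ids: ["a"] }, { ids: ["b"] }]);
var secondPass = reduceIds("url", [firstPass, { ids: ["c"] }]);
console.log(JSON.stringify(secondPass));
```

In the shell it would be passed in place of the original reduce, for example db.tweets.mapReduce(map, reduceIds, {out: "media"}).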
I have two JSON documents in a MongoDB collection and would like to write a bson.M filter that fetches only the first document below.
I tried the filter below to get the first document but got no result.
When only the first document is in the collection, the filter returns it,
but when both documents are present, I do not get a result. I need help.
filter := bson.M{"type": "FPF", "status": "REGISTERED","fpfInfo.fpfInfoList.ssai.st": 1, "fpfInfo.fpfInfoList.infoList.dn": "sim"}
{
"_id" : "47f6ad68-d431-4b69-9899-f33d828f8f9c",
"type" : "FPF",
"status" : "REGISTERED",
"fpfInfo" : {
"fpfInfoList" : [
{
"ssai" : {
"st" : 1
},
"infoList" : [
{
"dn" : "sim"
}
]
}
]
}
},
{
"_id" : "347c8ed2-d9d1-4f1a-9672-7e8a232d2bf8",
"type" : "FPF",
"status" : "REGISTERED",
"fpfInfo" : {
"fpfInfoList" : [
{
"ssai" : {
"st" : 1,
"ds" : "000004"
},
"infoList" : [
{
"dn" : "sim"
}
]
}
]
}
}
db.collection.aggregate([
{
"$unwind": "$fpfInfo.fpfInfoList"
},
{
"$match": {
"fpfInfo.fpfInfoList.ssai.ds": {
"$exists": false
},
"fpfInfo.fpfInfoList.infoList.dn": "sim",
"fpfInfo.fpfInfoList.ssai.st": 1
}
}
])
I have a document like this.
{ "_id" : ObjectId("5c6cc08a568f4cdf7870b3a7"),
"phone" : {
"cell" : [
"854-6574-545",
"545-6456-545"
],
"home" : [
"5474-647-574",
"455-6878-758"
]
}
}
I want to display output like this:
{
"_id" : ObjectId("5c6cc08a568f4cdf7870b3a7"),
"phone" : {
"cell" : [
"854-6574-545"
]
}
}
Please advise.
Use $slice to project only the first number from the array.
Query:
db.collection.find({},
{
"phone.cell": {
$slice: 1
},
"phone.home": 0
})
Result:
{
"_id": ObjectId("5c6cc08a568f4cdf7870b3a7"),
"phone": {
"cell": [
"854-6574-545"
]
}
}
Query 2:
db.collection.find({},
{
"_id": 0,
"phone.cell": {
$slice: 1
},
"phone.home": 0
})
Result 2:
{
"phone": {
"cell": [
"854-6574-545"
]
}
}
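Final query, using aggregate. Note that $slice: ['$phone.cell', 1, 1] skips one element and returns the second number; use $slice: ['$phone.cell', 1] to return the first one instead:
db.collection.aggregate([{'$match':{'phone.cell':{'$exists':true}}},
{'$project':{'_id':1,'phone.cell':{$slice:['$phone.cell',1,1]}}}])
Output: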
{
"_id" : ObjectId("5c6cc08a568f4cdf7870b3a7"),
"phone" : {
"cell" : [
"545-6456-545"
]
}
}
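The two forms of $slice used in these answers select different elements: the find projection $slice: 1 keeps the first element, while the aggregation form [array, position, n] with position 1 skips one element first. As a rough analogy only (this is not how the server evaluates the operators), the same selections written with plain JavaScript Array.prototype.slice:

```javascript
var cell = ["854-6574-545", "545-6456-545"];

// Analogue of the projection { $slice: 1 }: keep the first element.
var firstOnly = cell.slice(0, 1);

// Analogue of { $slice: ["$phone.cell", 1, 1] }: start at index 1, take 1.
var secondOnly = cell.slice(1, 1 + 1);

console.log(firstOnly, secondOnly);
```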
I have a MongoDB document that has a few array properties:
{
"_id" : "123456789",
"distance" : [
{
"inner_distance" : 2
},
{
"inner_distance" : 4
},
{
"inner_distance" : -1
}
],
"name" : [
{
"inner_name" : "MyName"
}
],
"entries" : [
{ ... },
{ ... },
],
"property1" : "myproperty1",
"property2" : "myproperty2",
"property3" : "myproperty3"
}
I am trying to figure out how to apply a transformation to the distance array in order to "flatten" it to a scalar according to a transformation function (I want to obtain the absolute value of each inner_distance element in distance, then take the minimum of all those values).
For example, in the document above, the distance array is [{"inner_distance" : 2}, {"inner_distance" : 4}, {"inner_distance" : -1}], and I need to figure out how to apply a transformation that yields distance: 1 (or, if it's easier, a new property such as distance_new: 1).
I would like to do this inline (is that the correct terminology?) so that I can perform the operation and end up with the stored record:
{
"_id" : "123456789",
"distance" : 1,
"name" : [
{
"inner_name" : "MyName"
}
],
"entries" : [
{ ... },
{ ... },
],
"property1" : "myproperty1",
"property2" : "myproperty2",
"property3" : "myproperty3"
}
Has anyone had any experience with something like this? I have been trying to figure out how to create a map-reduce command to run this but have had no luck.
What you want can be handled efficiently in MongoDB 3.2.
Use the $abs operator to return the absolute value of each "inner_distance", and $min to return the minimum value in an array. The $map operator in the $project stage builds the array of absolute "inner_distance" values that $min is applied to.
You will then need to loop over the aggregation result and use the .bulkWrite() method to update your documents.
var operations = [];
db.collection.aggregate([
{ "$project": {
"distance": {
"$min": {
"$map": {
"input": "$distance",
"as": "d",
"in": { "$abs": "$$d.inner_distance" }
}
}
}
}}
]).forEach(function(doc) {
var operation = { 'updateOne': {
'filter': { '_id': doc._id },
'update': {
'$set': { 'distance': doc.distance }
}
}};
operations.push(operation);
});
// Options go in the second argument, not in the operations array.
db.collection.bulkWrite(operations, {
ordered: true,
writeConcern: { w: "majority", wtimeout: 5000 }
});
mapReduce Solution
var map = function() {
var distance = this.distance.map(function(element) {
return Math.abs(element.inner_distance);
} );
emit(this._id, Math.min(...distance));
};
var results = db.collection.mapReduce(map,
function(key, values) { return;},
{ 'out': { 'inline': 1 } }
);
Which returns this:
{
"results" : [
{
"_id" : "123456789",
"value" : 1
},
{
"_id" : "143456789",
"value" : 1
}
],
"timeMillis" : 31,
"counts" : {
"input" : 2,
"emit" : 2,
"reduce" : 0,
"output" : 2
},
"ok" : 1
}
You can then use the "bulk" operations to update your documents.
var bulk = db.collection.initializeOrderedBulkOp();
var count = 0;
results['results'].forEach(function(element) {
bulk.find( { '_id': element._id } ).updateOne( {
'$set': { 'distance': element.value }
});
count++;
if (count % 200 === 0) {
bulk.execute();
bulk = db.collection.initializeOrderedBulkOp();
}
})
if (count > 0 ) bulk.execute();
Note:
In the mapReduce example, Math.min(...distance) uses the spread operator, which is new in ES6; you can also use Math.min.apply(Math, distance).
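As a small illustration of that note, the two call styles can be compared in plain JavaScript using the distance array from the question (the variable names here are only for the example):

```javascript
var distance = [
  { inner_distance: 2 },
  { inner_distance: 4 },
  { inner_distance: -1 }
];

// Absolute value of every inner_distance, as in the map function above.
var absolute = distance.map(function (element) {
  return Math.abs(element.inner_distance);
});

var withSpread = Math.min(...absolute);         // ES6 spread syntax
var withApply = Math.min.apply(Math, absolute); // pre-ES6 equivalent

// With no arguments Math.min returns Infinity, so an empty distance
// array would produce Infinity rather than a usable number.
var emptyCase = Math.min.apply(Math, []);

console.log(withSpread, withApply, emptyCase);
```

If some documents may have an empty distance array, it is probably worth filtering them out before writing the value back.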
I have collection entries like this:
[
{
shape : [{id:1,status:true},{id:2,status:false}]
},
{
shape : [{id:1,status:true}]
}
]
I want to fetch only the documents whose array matches exactly, meaning it contains all the given elements and nothing else.
For example, where shape.id = [1,2] or [ {id: [1,2] } ] (either form is fine),
it should return only
[
{
shape : [{id:1,status:true},{id:2,status:false}]
}
]
Is there a native MongoDB query for this?
Thanks
--ND
Here is a much simpler query:
db.shapes.find({'shape.id':{$all:[1,2]},shape:{$size:2}});
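To make the intent of combining $all with $size concrete, here is a plain JavaScript sketch of the same condition applied to the two documents from the question. It only mimics the matching rule for illustration; matchesExactly is a hypothetical helper, not a MongoDB API:

```javascript
var docs = [
  { shape: [{ id: 1, status: true }, { id: 2, status: false }] },
  { shape: [{ id: 1, status: true }] }
];

// Hypothetical helper: true when every wanted id occurs in shape
// (the $all part) and shape has exactly wanted.length elements
// (the $size part).
function matchesExactly(doc, wanted) {
  var ids = doc.shape.map(function (s) { return s.id; });
  var hasAll = wanted.every(function (id) { return ids.indexOf(id) !== -1; });
  return hasAll && doc.shape.length === wanted.length;
}

var matched = docs.filter(function (doc) {
  return matchesExactly(doc, [1, 2]);
});
console.log(JSON.stringify(matched));
```

Only the first document passes both conditions; the second contains id 1 but not id 2, so the $all part already rejects it.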
If your documents look like this:
{
"_id" : ObjectId("54eeb68c8716ec70106ee33b"),
"shapeSize" : [
{
"shape" : [
{
"id" : 1,
"status" : true
},
{
"id" : 2,
"status" : false
}
]
},
{
"shape" : [
{
"id" : 1,
"status" : true
}
]
}
]
}
then use the aggregation below to match the criteria (the stages are passed as a single pipeline array):
db.collectionName.aggregate([{
"$unwind": "$shapeSize"
}, {
"$match": {
"$and": [{
"shapeSize.shape.id": 2
}, {
"shapeSize.shape.id": 1
}]
}
}, {
"$project": {
"_id": 0,
"shape": "$shapeSize.shape"
}
}])
I'm trying to remove the element of the teacher array that contains a specific subject, such as "ok baby":
{
"_id" : "billy",
"password" : "$2a$10$MKZFNtMhts6rMbnIoqXB9.Q8NHAizQAGhX5S6g.8zeRt7TpRpuQea",
"teacher" : [
{
"subject" : "ok baby",
"students" : [
"billy"
]
},
{
"subject" : "adsfqewr",
"students" : [
"billy"
]
}
]
}
This is what I tried:
users.update( { 'teacher.subject':title, '_id':username},
{ $pull: { 'teacher.subject':title } },
{ multi: true }
)
The query should look like this: $pull takes the array field (teacher) and a condition on its elements (subject equal to title), rather than a dotted path to the subject:
users.update( { 'teacher.subject':title, '_id':username},
{ $pull: { 'teacher':{'subject':title}} },
{ multi: true }
);
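The effect of that $pull can be pictured in plain JavaScript as filtering the teacher array by subject. This is only a sketch of the resulting document, under the assumption that title is "ok baby"; pullBySubject is a hypothetical helper and does not talk to the database:

```javascript
var user = {
  _id: "billy",
  teacher: [
    { subject: "ok baby", students: ["billy"] },
    { subject: "adsfqewr", students: ["billy"] }
  ]
};

// Hypothetical helper: returns a copy of the document in which every
// teacher element whose subject equals title has been removed.
function pullBySubject(doc, title) {
  return Object.assign({}, doc, {
    teacher: doc.teacher.filter(function (t) {
      return t.subject !== title;
    })
  });
}

var updated = pullBySubject(user, "ok baby");
console.log(JSON.stringify(updated.teacher));
```

The whole matching element is removed, including its students array, which is why the condition is given on teacher rather than on teacher.subject.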