The related question is "Efficiently convert rows to columns in sql server", but its answer is specific to SQL.
I want the same result in MongoDB, i.e. pivot rows to columns without aggregating anything (for now).
The collection looks something as below. These are statistics of facebook page properties:
timestamp | propName | propValue
--------------------------------
1371798000000 | page_fans | 100
--------------------------------
1371798000000 | page_posts | 50
--------------------------------
1371798000000 | page_stories | 25
--------------------------------
I need answer like:
timestamp | page_fans | page_posts | page_stories
--------------------------------
1371798000000 | 100 | 50 | 25
--------------------------------
The column names are predetermined; they don't have to be generated dynamically. The question is how to achieve this in MongoDB.
I believe aggregation is of no use for this purpose. Do I need to use MapReduce? In that case I would have nothing to reduce, I guess. Another option would be to fetch these values in code and do the manipulation in a programming language, e.g. Java.
Any insights would be helpful. Thanks in advance :)!!!
EDIT (Based on input from Schaliasos):
Input JSON:
{
"_id" : ObjectId("51cd366644aeac654ecf8f75"),
"name" : "page_storytellers",
"pageId" : "512f993a44ae78b14a9adb85",
"timestamp" : NumberLong("1371798000000"),
"value" : NumberLong(30871),
"provider" : "Facebook"
}
{
"_id" : ObjectId("51cd366644aeac654ecf8f76"),
"name" : "page_fans",
"pageId" : "512f993a44ae78b14a9adb85",
"timestamp" : NumberLong("1371798000000"),
"value" : NumberLong(1291509),
"provider" : "Facebook"
}
{
"_id" : ObjectId("51cd366644aeac654ecf8f77"),
"name" : "page_fan_adds",
"pageId" : "512f993a44ae78b14a9adb85",
"timestamp" : NumberLong("1371798000000"),
"value" : NumberLong(2829),
"provider" : "Facebook"
}
Expected Output JSON:
{
"timestamp" : NumberLong("1371798000000"),
"provider" : "Facebook",
"page_storytellers" : NumberLong(30871),
"page_fans" : NumberLong(1291509),
"page_fan_adds" : NumberLong(2829)
}
You can now use the aggregation operator $arrayToObject to pivot MongoDB keys. This operator is available in MongoDB v3.4.4+.
For example, given the following sample data:
db.foo.insert({ provider: "Facebook", timestamp: '1371798000000', name: 'page_storytellers', value: 20871})
db.foo.insert({ provider: "Facebook", timestamp: '1371798000000', name: 'page_fans', value: 1291509})
db.foo.insert({ provider: "Facebook", timestamp: '1371798000000', name: 'page_fan_adds', value: 2829})
db.foo.insert({ provider: "Google", timestamp: '1371798000000', name: 'page_fan_adds', value: 1000})
You can use the aggregation pipeline below:
db.foo.aggregate([
{$group:
{_id:{provider:"$provider", timestamp:"$timestamp"},
items:{$addToSet:{name:"$name",value:"$value"}}}
},
{$project:
{tmp:{$arrayToObject:
{$zip:{inputs:["$items.name", "$items.value"]}}}}
},
{$addFields:
{"tmp.provider":"$_id.provider",
"tmp.timestamp":"$_id.timestamp"}
},
{$replaceRoot:{newRoot:"$tmp"}
}
]);
The output would be:
{
"page_fan_adds": 1000,
"provider": "Google",
"timestamp": "1371798000000"
},
{
"page_fan_adds": 2829,
"page_fans": 1291509,
"page_storytellers": 20871,
"provider": "Facebook",
"timestamp": "1371798000000"
}
See also $group, $project, $addFields, $zip, and $replaceRoot.
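If the server is older than v3.4.4, or you would rather do the manipulation in code as the question suggests, the same pivot can be sketched client-side. This is only an illustration in plain JavaScript: pivotByProviderAndTimestamp is a made-up helper name, and the sample array simply mirrors the documents inserted above.

```javascript
// Client-side sketch of the pivot: one output object per (provider, timestamp).
// pivotByProviderAndTimestamp is a hypothetical helper, not a MongoDB API.
function pivotByProviderAndTimestamp(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // The composite key plays the role of the $group _id.
    const key = JSON.stringify([doc.provider, doc.timestamp]);
    if (!groups.has(key)) {
      groups.set(key, { provider: doc.provider, timestamp: doc.timestamp });
    }
    // Each name becomes a field; a repeated name overwrites the earlier value.
    groups.get(key)[doc.name] = doc.value;
  }
  return Array.from(groups.values());
}

const sample = [
  { provider: "Facebook", timestamp: "1371798000000", name: "page_storytellers", value: 20871 },
  { provider: "Facebook", timestamp: "1371798000000", name: "page_fans", value: 1291509 },
  { provider: "Facebook", timestamp: "1371798000000", name: "page_fan_adds", value: 2829 },
  { provider: "Google", timestamp: "1371798000000", name: "page_fan_adds", value: 1000 },
];

const pivoted = pivotByProviderAndTimestamp(sample);
console.log(pivoted);
```

Unlike the pipeline, this pulls every document to the client, so it is probably only reasonable for small result sets.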
I have done something like this using aggregation. Could this help ?
db.foo.insert({ timestamp: '1371798000000', propName: 'page_fans', propValue: 100})
db.foo.insert({ timestamp: '1371798000000', propName: 'page_posts', propValue: 50})
db.foo.insert({ timestamp: '1371798000000', propName: 'page_stories', propValue: 25})
db.foo.aggregate({ $group: { _id: '$timestamp', result: { $push: { 'propName': '$propName', 'propValue': '$propValue' } }}})
{
"result" : [
{
"_id" : "1371798000000",
"result" : [
{
"propName" : "page_fans",
"propValue" : 100
},
{
"propName" : "page_posts",
"propValue" : 50
},
{
"propName" : "page_stories",
"propValue" : 25
}
]
}
],
"ok" : 1
}
You may want to use the $sum operator along the way.
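The grouped document still holds the properties as an array, so one more step is needed to get the flat shape the question asks for. A minimal sketch of that step in plain JavaScript, assuming the grouped document looks like the inner entry of the output above (flattenGroup is a hypothetical helper name):

```javascript
// flattenGroup is a hypothetical helper: it turns the pushed
// { propName, propValue } pairs into fields of a single object.
function flattenGroup(group) {
  const row = { timestamp: group._id };
  for (const item of group.result) {
    row[item.propName] = item.propValue;
  }
  return row;
}

// Shape copied from the $group output shown above.
const grouped = {
  _id: "1371798000000",
  result: [
    { propName: "page_fans", propValue: 100 },
    { propName: "page_posts", propValue: 50 },
    { propName: "page_stories", propValue: 25 },
  ],
};

const flatRow = flattenGroup(grouped);
console.log(flatRow);
```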
Thank you in advance for the help. I have recently started using MongoDB for a personal project and I'm interested in finding a better way to query my data.
My question is: I have the following collection:
{
"_id" : ObjectId("5dbd77f7a204d21119cfc758"),
"Toyota" : {
"Founder" : "Kiichiro Toyoda",
"Founded" : "28 August 1937",
"Subsidiaries" : [
"Lexus",
"Daihatsu",
"Subaru",
"Hino"
]
}
}
{
"_id" : ObjectId("5dbd78d3a204d21119cfc759"),
"Volkswagen" : {
"Founder" : "German Labour Front",
"Founded" : "28 May 1937",
"Subsidiaries" : [
"Audi",
"Volkswagen",
"Skoda",
"SEAT"
]
}
}
I want to get the object names; for example, here I want to return
[Toyota, Volkswagen]
I have used this method:
var names = {}
db.cars.find().forEach(function(doc){Object.keys(doc).forEach(function(key){names[key]=1})});
names;
which gave me the following result:
{ "_id" : 1, "Toyota" : 1, "Volkswagen" : 1 }
However, is there a better way to get the same result, and to return just the names of the objects? Thank you.
I would suggest changing the schema design to something like:
{
_id: ...,
company: {
name: 'Volkswagen',
founder: ...,
subsidiaries: ...,
...<other fields>...
}
}
You can then use the aggregation framework to achieve a similar result:
> db.test.find()
{ "_id" : 0, "company" : { "name" : "Volkswagen", "founder" : "German Labour Front" } }
{ "_id" : 1, "company" : { "name" : "Toyota", "founder" : "Kiichiro Toyoda" } }
> db.test.aggregate([ {$group: {_id: null, companies: {$push: '$company.name'}}} ])
{ "_id" : null, "companies" : [ "Volkswagen", "Toyota" ] }
For more details, see:
Aggregation framework
$group
Accumulator operators
As a bonus, you can create an index on the company.name field, whereas you cannot create an index on varying field names like in your example.
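If changing the schema is not an option right away, the loop from the question can be tidied up so that it skips _id and returns an array instead of a map. This is just a sketch in plain JavaScript; topLevelNames is a made-up helper, and the cars array stands in for what db.cars.find().toArray() would return in the shell.

```javascript
// topLevelNames is a hypothetical helper: it collects every top-level
// field name except _id, without duplicates, in first-seen order.
function topLevelNames(docs) {
  const names = new Set();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) {
      if (key !== "_id") names.add(key);
    }
  }
  return Array.from(names);
}

// Stand-in data, trimmed from the documents in the question.
const cars = [
  { _id: 1, Toyota: { Founder: "Kiichiro Toyoda" } },
  { _id: 2, Volkswagen: { Founder: "German Labour Front" } },
];

const carNames = topLevelNames(cars);
console.log(carNames);
```

This still scans every document, which is one more reason the schema change above is the better long-term fix.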
This is my Mongo record. Here roles is an array of objects. I want the short code of each role in a separate row.
{
"_id" : ObjectId("111111111111111111111111"),
"roles" : [
{
"name" : "Computer Programme Manager",
"shortCode" : "COMP"
},
{
"name" : "Technical Manager",
"shortCode" : "TEMR"
},
{
"name" : "Technical-Civil",
"shortCode" : "TEMR"
}
],
"deptDbValue" : "i_a",
"deptDisplayValue" : "IA",
"deptShortCode" : "gic"
}
I want all the roles row-wise. I tried this query:
db.departments.distinct("roles.shortCode");
which gives each role in a separate row, which is correct, but how can I get other properties like deptShortCode, deptDbValue etc.?
For example, I want something like this:
id | role_name | role_shortcode
ObjectId("111..") | Computer Prog | COMP
ObjectId("111..") | Technical Manager | TEMR
ObjectId("111..") | Technical-Civil | TEMR
Any suggestions?
The output format of a MongoDB query is typically JSON, so the tabular output format you are suggesting is not possible.
But the data you intended to have in the output could be present in the response of your query. For example:
{ "_id" : ObjectId("111111111111111111111111"), "name" : "Computer Programme Manager", "shortCode" : "COMP" }
{ "_id" : ObjectId("111111111111111111111111"), "name" : "Technical Manager", "shortCode" : "TEMR" }
{ "_id" : ObjectId("111111111111111111111111"), "name" : "Technical-Civil", "shortCode" : "TEMR" }
This is quite close to the output you want to have and correct JSON.
You can achieve this output with $unwind (causes the rows to multiply) and then $project (used to transform each single row) in an aggregate query like this one:
db.dummy.aggregate(
[
{
$unwind: {
"path": "$roles"
}
},
{
$project: {
"name": "$roles.name",
"shortCode": "$roles.shortCode",
}
}
]
)
Here is some information on the unwind command:
https://docs.mongodb.com/manual/reference/operator/aggregation/unwind/
and on the project command:
https://docs.mongodb.com/manual/reference/operator/aggregation/project/
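To make the effect of $unwind concrete, here is a rough plain-JavaScript equivalent of "one output row per array element" that also carries deptShortCode along, since the question asked for the other properties too. unwindRoles is a hypothetical helper and the _id is shown as a plain string for simplicity. In the pipeline itself, adding "deptShortCode": 1 to the $project stage should have the same effect.

```javascript
// unwindRoles is a hypothetical helper that mimics $unwind + $project:
// every element of roles becomes its own row, and the parent fields
// are copied onto each row.
function unwindRoles(dept) {
  return dept.roles.map((role) => ({
    _id: dept._id,
    deptShortCode: dept.deptShortCode,
    name: role.name,
    shortCode: role.shortCode,
  }));
}

// Stand-in for the department record from the question.
const department = {
  _id: "111111111111111111111111",
  roles: [
    { name: "Computer Programme Manager", shortCode: "COMP" },
    { name: "Technical Manager", shortCode: "TEMR" },
    { name: "Technical-Civil", shortCode: "TEMR" },
  ],
  deptDbValue: "i_a",
  deptDisplayValue: "IA",
  deptShortCode: "gic",
};

const roleRows = unwindRoles(department);
console.log(roleRows);
```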
The following are my MongoDB entries.
my-mongo-set:PRIMARY> db.stat_collection.find({name : /s/})
{ "_id" : ObjectId("5aabf231a167b3808302b138"), "name" : "shankarmr", "email" : "abc#xyz", "rating" : 9901 }
{ "_id" : ObjectId("5aabf23da167b3808302b139"), "name" : "shankar", "email" : "abc1#xyz1", "rating" : 10011 }
{ "_id" : ObjectId("5aabf2b5a167b3808302b13a"), "name" : "shankar1", "email" : "abc2#xyz2", "rating" : 10 }
{ "_id" : ObjectId("5aabf2c2a167b3808302b13b"), "name" : "shankar2", "email" : "abc3#xyz3", "rating" : 100 }
Now I want to find an entry based on name but update a field only if a certain condition holds.
I tried the following statement, but it gives me an error at the second reference to $rating.
db.stat_collection.findOneAndUpdate({name: "shankar"}, {$set : {rating : {$cond : [ {$lt : [ "$rating", 100]}, 100, $rating]}}, $setOnInsert: fullObject}, {upsert : true} )
So in my case, it should not update rating for the 2nd document, as its rating is not less than 100. But for the third document, rating should be updated to 100.
How do I get it to work?
$max is the operator you're looking for, try:
db.stat_collection.findOneAndUpdate( { name: "shankar1"}, { $max: { rating: 100 } }, { returnNewDocument: true } )
Either the document keeps its old value (if that is already 100 or greater) or it is modified and rating is set to 100.
According to the documentation:
The $max operator updates the value of the field to a specified value if the specified value is greater than the current value of the field. The $max operator can compare values of different types, using the BSON comparison order.
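As a rough mental model, the rule can be written out in plain JavaScript. applyMax is a made-up helper, and it only handles numbers, so it ignores the cross-type BSON comparison order mentioned in the documentation; treat it as a sketch of the semantics, not of the server's implementation.

```javascript
// applyMax is a hypothetical helper that imitates { $max: { field: value } }
// for numeric fields only: the field is raised to value when it is missing
// or smaller, and left alone otherwise.
function applyMax(doc, field, value) {
  if (!(field in doc) || doc[field] < value) {
    return { ...doc, [field]: value };
  }
  return doc;
}

const raised = applyMax({ name: "shankar1", rating: 10 }, "rating", 100);
const untouched = applyMax({ name: "shankar", rating: 10011 }, "rating", 100);
const created = applyMax({ name: "newuser" }, "rating", 100);

console.log(raised.rating, untouched.rating, created.rating);
```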
You should put all conditions in the query part of the update:
db.stat_collection.findOneAndUpdate(
{ name: "shankar", rating: { $lt: 100 } },
{ $set: { rating: 100 } }
);
The above reads as: "If the name is shankar and the rating is less than 100, then set the rating to 100."
First of all, the status codes ("200", "404" or others) and times ("1000", "2000", ...) are not known in advance.
I want to calculate the total count (5, 6, ...) for each status code.
For example: {"200" : 11}, {"404" : 11} or {"total" : 22}
Data Structure :
"_id" : "xxxxx"
"domain" : "www.test.com"
"status" : [
{"200" : [ {"1000" : 5}, {"2000": 6} ...]},
{"404" : [ {"1000" : 5}, {"2000": 6} ...]}
....
]
Any fantastic methods in MongoDB ?
Thank you for your help
Don't use data, like dates, as keys. Data belongs in values. The HTTP status codes are enumerated - you know all the possibilities - so you can use those as keys if you want to. From the look of the documents, you are storing information about requests to a page in a page document with the requests in an array. It's not a great idea to have an unbounded, constantly growing array in a document. I'd suggest refactoring the data to be request documents with the address denormalized into each:
{
"_id" : ObjectId(...),
"status" : 404,
"date" : ISODate("2014-10-30T18:23:09.471Z"),
"domain" : "www.test.com"
}
and then you can get the number of requests per status code (including the 404s) for www.test.com with the aggregation
db.requests.aggregate([
{ "$match" : { "domain" : "www.test.com" } },
{ "$group" : { "_id" : "$status", "count" : { "$sum" : 1 } } }
])
Index on domain to make it fast.
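For intuition, the $match plus $group stages amount to the following filter-and-count, sketched in plain JavaScript. countByStatus and the sample request documents are made up for illustration; they follow the refactored shape suggested above.

```javascript
// countByStatus is a hypothetical helper that mirrors the pipeline:
// keep requests for one domain ($match), then count them per status ($group).
function countByStatus(requests, domain) {
  const counts = {};
  for (const req of requests) {
    if (req.domain !== domain) continue;
    counts[req.status] = (counts[req.status] || 0) + 1;
  }
  return counts;
}

// Invented sample documents in the refactored "one document per request" shape.
const requests = [
  { status: 200, domain: "www.test.com" },
  { status: 404, domain: "www.test.com" },
  { status: 404, domain: "www.test.com" },
  { status: 404, domain: "www.other.com" },
];

const statusCounts = countByStatus(requests, "www.test.com");
console.log(statusCounts);
```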
I think you can use the aggregation framework to pull something like that.
Check this:
db.errors.aggregate([{$unwind: "$status"}, {$group: {_id: "$status", total:{$sum:1}}}])
It will render a result like this:
...
"result" : [
{
"_id" : {
"500" : [
{
"1000" : 5
},
{
"2000" : 6
}
]
},
"total" : 1
},
...
The "total" field has the count that you're looking for.
Hope this helps.
Regards!
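If the per-status sums from the question ({"200" : 11}, {"404" : 11}, {"total" : 22}) are what is needed and the nested layout has to stay, one option is to add the numbers up client-side after fetching the document. A sketch in plain JavaScript, assuming the structure shown in the question; sumPerStatus is a hypothetical helper name:

```javascript
// sumPerStatus is a hypothetical helper: it walks the nested
// status -> [ { time: count } ] structure and adds up the counts
// for each status code, plus a grand total.
function sumPerStatus(doc) {
  const totals = { total: 0 };
  for (const entry of doc.status) {
    for (const [code, buckets] of Object.entries(entry)) {
      let sum = 0;
      for (const bucket of buckets) {
        for (const n of Object.values(bucket)) sum += n;
      }
      totals[code] = (totals[code] || 0) + sum;
      totals.total += sum;
    }
  }
  return totals;
}

// Document in the shape given in the question.
const nestedDoc = {
  _id: "xxxxx",
  domain: "www.test.com",
  status: [
    { "200": [{ "1000": 5 }, { "2000": 6 }] },
    { "404": [{ "1000": 5 }, { "2000": 6 }] },
  ],
};

const statusTotals = sumPerStatus(nestedDoc);
console.log(statusTotals);
```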
I would like to retrieve a list of values from the oldest document currently signed, but I failed to select a document based on the date. Thanks.
Here is the JSON:
"ad" : "noc3",
"createdDate" : ISODate(),
"list" : [
{
"id" : "p45",
"value" : 21,
},
{
"id" : "p6",
"value" : 20,
},
{
"id" : "4578",
"value" : 319
}
]
and here is my aggregate request:
db.friends.aggregate({$match:{advertiser:"noc3", {$sort:{timestamps:-1},{$limit:1} }},{$unwind:"$list"},{$project:{_id: "$list.id", value:{$add:[0]}}});
Your aggregate query is incorrect. You added the sort and limit to the match, but that's not how you do that. Use separate pipeline stages instead:
db.friends.aggregate( [
{ $match: { advertiser: "noc3" } },
{ $sort: { createdDate: -1 } },
{ $limit: 1 }
] );
Your other pipeline operators are a bit strange too, and your document and query mismatch on timestamps vs createdDate (and on ad vs advertiser). If you add the expected output, I can update the answer to include the last bits of the query too.