How to return raw JSON directly from a mongodb query? - mongodb

In MongoDB (using mongosh or command-line mongo cli), you can query documents, for example using db.mycollection.find({"something":true}) and get the following result:
{
"someDate": ISODate("2022-10-24T17:21:44.980Z"),
"something": true,
"hello": "world"
}
This result, however, is not valid JSON (Due to ISODate). How can I change the query above to make MongoDB return canonical (valid) JSON?
I'm looking for a recursive and generalized way to do this, even for deeply nested documents.

There are a number of existing answers; I'll clarify a few:
Use aggregate to produce the output in JSON format: Playground
db.collection.aggregate([
{
$match: {
something: true
}
},
{
$project: {
_id: 1,
someDate: {
$dateToString: {
format: "%Y-%m-%dT%H:%M:%S.%LZ",
date: "$someDate"
}
},
something: 1,
hello: 1
}
}
])
Loop through the query results in your application (example in Node.js):
db.mycollection.find({"something":true}).forEach(function(doc) {
doc.someDate = doc.someDate.toISOString() // or even .toJSON()
})
// Or with await
const records = await db.mycollection.find({"something":true}).map(doc => {
doc.someDate = doc.someDate.toISOString()
return doc
}).toArray()
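The question asks for a recursive, generalized approach, so here is a minimal sketch of that idea in plain JavaScript. It assumes the documents have already been fetched by the driver and that only Date values need converting. The helper name toPlainJson is hypothetical, and other BSON types (ObjectId, Decimal128 and so on) are deliberately not handled here.

```javascript
// Hypothetical helper: recursively replace Date values with ISO 8601 strings.
// Assumption: the input contains only plain objects, arrays, primitives and Dates.
function toPlainJson(value) {
  if (value instanceof Date) {
    return value.toISOString();
  }
  if (Array.isArray(value)) {
    return value.map((item) => toPlainJson(item));
  }
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = toPlainJson(inner);
    }
    return out;
  }
  return value; // strings, numbers, booleans, null
}

const doc = {
  someDate: new Date("2022-10-24T17:21:44.980Z"),
  something: true,
  nested: { history: [{ at: new Date("2022-10-25T00:00:00.000Z") }] }
};
const plain = toPlainJson(doc);
console.log(JSON.stringify(plain));
```

The same function could be passed to .map() on a cursor instead of converting each field by hand.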

The details of where you are running this command are very important. Can you please share those?
I am guessing that you are probably running this via the (older) mongo utility (as opposed to the newer mongosh). But confirmation of that, and the database version that you are using, would both be helpful. I will retain this assumption for the purposes of this answer.
This result, however, is not valid JSON (Due to ISODate).
The database itself doesn't return some text that has ISODate. In fact, it doesn't return or "speak" JSON at all. Rather, the database communicates via "Binary JSON" or BSON for short. In fact, the "Does MongoDB use BSON or JSON?" section of this page specifically mentions the following:
Firstly, BSON documents may contain Date or Binary objects that are not natively representable in pure JSON.
So when you see things like ISODate() that is the client application wrapping and representing the rich BSON document in a more limited (and text-based) JSON-like form. Importantly, this is often for readability purposes. You should be able to natively pass around and operate on information (documents) returned by the database directly in the application without doing any sort of transformation and without losing rich type information. Additional reading material about BSON is here.
Getting back to the original question, if you want to have the shell print out a valid JSON document, then you can do that via additional helpers. In the older mongo utility, a reproduction of the situation described in the question is:
> db.mycollection.findOne({"something": true})
{
"_id" : 1,
"someDate" : ISODate("2022-11-30T14:38:37.711Z"),
"something" : true,
"hello" : "world"
}
The shell itself can understand and operate on ISODate() (and other functions of that nature). If you did want to remove things like ISODate() for some reason, then you can leverage the JSON.stringify() functionality (reformatted with line indents for readability):
> JSON.stringify( db.mycollection.findOne({"something": true}) )
{
"_id":1,
"someDate":"2022-11-30T14:38:37.711Z",
"something":true,
"hello":"world"
}
The newer mongosh shell offers even more utility here:
> EJSON.serialize( db.mycollection.findOne() )
{
_id: 1,
someDate: { '$date': '2022-11-30T14:38:37.711Z' },
something: true,
hello: 'world'
}
Via these EJSON functions, mongosh is attempting to preserve type information when it prints the data in this format. Notice that in the earlier example the date was just represented as a string, but here the shell is using Extended JSON to capture the fact that the type of someDate is a Date.
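The difference between the two outputs can be imitated in plain JavaScript. The sketch below only illustrates the idea and is not how mongosh implements EJSON: the replacer function dateAsExtendedJson is hypothetical and handles Date values only.

```javascript
const doc = {
  _id: 1,
  someDate: new Date("2022-11-30T14:38:37.711Z"),
  something: true,
  hello: "world"
};

// Plain JSON.stringify calls Date.prototype.toJSON, so the date becomes a bare string.
const plainJson = JSON.stringify(doc);

// Hypothetical replacer: wrap dates as { "$date": ... } so the type information survives.
// It must be a regular function: JSON.stringify passes the holder object as `this`,
// and `value` has already been turned into a string by toJSON at this point.
function dateAsExtendedJson(key, value) {
  const raw = this[key];
  return raw instanceof Date ? { $date: raw.toISOString() } : value;
}
const typedJson = JSON.stringify(doc, dateAsExtendedJson);

console.log(plainJson);
console.log(typedJson);
```

A consumer of the second form can tell that someDate was a date, whereas the first form is indistinguishable from an ordinary string field.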

Related

Mongoose mongodb modify data before returning with pagination

I am fetching data with Mongoose and I would like to modify the data, for example by applying some date formats. Currently I have:
const count = await UserModel.countDocuments();
const rows = await UserModel.find({ name:{$regex: search, $options: 'i'}, status:10 })
.sort([["updated_at", -1]])
.skip(page * perPage)
.limit(perPage)
.exec();
res.json({ count, rows });
The above UserModel is a Mongoose model.
I would like to modify some of the objects, for example by applying date formats, before the data is returned, while still paginating as above.
Currently I have added the following, which works, but I have to loop through all rows, which will be a performance nightmare for large data.
res.json({ count, rows:rows.map(el=>({...el,created_at:'format date here'})) });
Is there a better option?
As far as I understand your question, if you need to apply a date format before showing the data on the frontend, you just need to pass the retrieved date to a date-formatting library before displaying it. In JS, for example:
const d = new Date("2015-03-25T12:00:00Z");
However, if you want to get the date in a formatted form, then you must format it before storing it. I hope that answers your question.
I think the warning from @Fabian Strathaus in the comments is an important consideration. I would strongly recommend making sure that the approach you choose sets you up for success overall, as opposed to introducing new pain points elsewhere in your project.
Assuming that you want to do this, an alternative approach is to ask the database to do this directly. More specifically, the $dateToString operator sounds like it could be of use here. This playground example demonstrates the basic behavior by adding a formatted date field which will be returned directly from the database. It takes the following document:
{
_id: 1,
created_at: ISODate("2022-01-15T08:15:39.736Z")
}
We then execute this sample aggregation:
db.collection.aggregate([
{
"$addFields": {
created_at_formatted: {
$dateToString: {
format: "%m/%d/%Y",
date: "$created_at"
}
}
}
}
])
The document that gets returned is:
{
"_id": 1,
"created_at": ISODate("2022-01-15T08:15:39.736Z"),
"created_at_formatted": "01/15/2022"
}
You could make use of this in a variety of ways, such as by creating and querying a view which will automatically create and return this formatted field.
I also want to comment on this statement that you made:
Currently I have added the following, which works, but I have to loop through all rows, which will be a performance nightmare for large data.
It's good to hear that you're thinking about performance upfront. That said, your query includes a query predicate of name:{$regex: search, $options: 'i'}. Unanchored and/or case insensitive regex filters cannot use indexes efficiently. So if your status predicate is not selective, then you may need to take a look at alternative approaches for filtering on name to make sure that the query is performant.
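One such alternative, if a prefix match is acceptable for your search box, is a case-sensitive regex anchored at the start of the string, which can generally make better use of an index on name. Note that this changes the semantics of the search (prefix only, case-sensitive). A sketch of building such a filter; the helper names escapeRegex and buildNamePrefixFilter are hypothetical:

```javascript
// Escape characters that have a special meaning in a regular expression,
// so that user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a filter matching names that start with `search`.
function buildNamePrefixFilter(search, status) {
  return { name: { $regex: "^" + escapeRegex(search) }, status: status };
}

const filter = buildNamePrefixFilter("jo.hn", 10);
console.log(filter);
```

The resulting object could be passed to UserModel.find() in place of the original filter.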

sort by string length in Mongodb/pymongo

I was wondering if anyone knows how to sort a mongodb find() result by string length.
I have tried something like db.foo.find().sort({item.length:-1}), but that obviously doesn't work. Can somebody help me, and also suggest a way to do the same thing in pymongo?
There are a lot of things (and basic API) I would personally love to see in the aggregation framework, such as:
Math functions: log (as in logarithm), ceil, floor
Array: sum
String: length
Just to name a few.
And that is without resorting to obscure usages of the $mod operator or other means in such cases as "ceil" and "floor". But I digress.
Your "string length" falls into this category. Raise a JIRA issue about it. But for now you can use mapReduce and the existing JavaScript functionality:
db.collection.mapReduce(
function() {
emit( this.item.length, this.item );
},
function(key,values) {
return values;
},
{ "out": { "inline": 1 } }
)
This has the usual mapReduce quirk of returning a reshaped document, with all items of the same length collected into an array. What it does do is take advantage of the nature of mapReduce (not restricted to MongoDB): the emitted "key" values are sorted in the response.
There is now a solution for this in MongoDB v3.4+: the aggregation framework's $strLenBytes operator. Given the following document:
{_id: 0, name: "Bob"}
We can use
db.mycollection.aggregate([{
$project: {
byteLength: {$strLenBytes: "$name"}
}
}])
Which will return 3 for the number of bytes.
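The snippet above only computes the length; to actually sort by it, a $sort stage can follow. Below is a sketch of such a pipeline, written as a plain JavaScript array that could be passed to aggregate(). The field name nameLength is an assumption. It uses $strLenCP (also available from v3.4), which counts code points rather than bytes; the two local helper functions exist only to illustrate that difference for non-ASCII text.

```javascript
// Pipeline sketch: compute the length of `name`, then sort by it, longest first.
const pipeline = [
  { $addFields: { nameLength: { $strLenCP: "$name" } } },
  { $sort: { nameLength: -1 } }
];

// Local illustration of what $strLenBytes and $strLenCP count, respectively.
const utf8ByteLength = (s) => new TextEncoder().encode(s).length;
const codePointLength = (s) => Array.from(s).length;

console.log(utf8ByteLength("Bob"), codePointLength("Bob"));           // 3 3
console.log(utf8ByteLength("Zo\u00eb"), codePointLength("Zo\u00eb")); // 4 3
```

For pure ASCII data the two operators give the same number, so either would work for sorting.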
No, actually it is not possible.
I was dealing with a similar problem; what I did was store the string length of every object as a property of the object itself. This bypassed the problem.
If you think this should be implemented (I do), I recommend that you upvote the issue in JIRA, which for some reason does not have many votes:
https://jira.mongodb.org/browse/SERVER-5319

Mongoose: Saving as associative array of subdocuments vs array of subdocuments

I have a set of documents I need to maintain persistence for. Due to the way MongoDB handles multi-document operations, I need to embed this set of documents inside a container document in order to ensure atomicity of my operations.
The data lends itself heavily to key-value pairing. Is there any way instead of doing this:
var container = new mongoose.Schema({
// meta information here
subdocs: [{key: String, value: String}]
})
I can instead have subdocs be an associative array (i.e. an object) that applies the subdoc validations? So a container instance would look something like:
{
// meta information
subdocs: {
<key1>: <value1>,
<key2>: <value2>,
...
<keyN>: <valueN>,
}
}
Thanks
Using Mongoose, I don't believe that there is a way to do what you are describing. To explain, let's take an example where your keys are dates and the values are high temperatures, to form pairs like { "2012-05-31" : 88 }.
Let's look at the structure you're proposing:
{
// meta information
subdocs: {
"2012-05-30" : 80,
"2012-05-31" : 88,
...
"2012-06-15": 94,
}
}
Because you must pre-define the schema in Mongoose, you must know your key names ahead of time. In this use case, we would probably not know ahead of time which dates we would collect data for, so this is not a good option.
If you don't use Mongoose, you can do this without any problem at all. MongoDB by itself excels at inserting values with new key names into an existing document:
> db.coll.insert({ type : "temperatures", subdocuments : {} })
> db.coll.update( { type : "temperatures" }, { $set : { 'subdocuments.2012-05-30' : 80 } } )
> db.coll.update( { type : "temperatures" }, { $set : { 'subdocuments.2012-05-31' : 88 } } )
{
"_id" : ObjectId("5238c3ca8686cd9f0acda0cd"),
"subdocuments" : {
"2012-05-30" : 80,
"2012-05-31" : 88
},
"type" : "temperatures"
}
In this case, adding Mongoose on top of MongoDB takes away some of MongoDB's native flexibility. If your use case is well suited by this feature of MongoDB, then using Mongoose might not be the best choice.
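When the key is only known at runtime, the dotted field path used in the updates above can be assembled with a computed property name. A sketch in plain JavaScript: buildSubdocSet is a hypothetical helper, and it assumes the key contains no dots and does not start with $, since those characters carry special meaning in field paths.

```javascript
// Hypothetical helper: build the update document for one key/value pair.
function buildSubdocSet(key, value) {
  if (key.includes(".") || key.startsWith("$")) {
    throw new Error("Unsupported key: " + key);
  }
  // Computed property name produces e.g. { $set: { "subdocuments.2012-05-30": 80 } }
  return { $set: { ["subdocuments." + key]: value } };
}

const update = buildSubdocSet("2012-05-30", 80);
console.log(JSON.stringify(update)); // {"$set":{"subdocuments.2012-05-30":80}}
```

The returned object has the same shape as the second argument of the db.coll.update() calls shown above.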
You can achieve this behavior by using {strict: false} in your Mongoose schema, although you should check the implications for Mongoose's validation and casting mechanisms.
var flexibleSchema = new Schema( {},{strict: false})
Another way is to use the schema.add method, but I do not think this is the right solution.
The last solution I see is to send the whole array to the client side and use underscore.js or whatever library you have. But it depends on your app, the size of the docs, communication steps, etc.

querying on internals in mongo

I have document structure like:
{ "_id": { "$oid" : "51711cd87023380037000001" },
"dayData": "{ "daysdata":{"date":"02-12-2013","week_day":"","month":"","date_day":"","year":"2013"}}"
}
I want to extract the documents having date = "02-12-2013" in the above. Here I am trying to query on a value which is itself JSON.
Please let me know how to use the MongoDB Java driver to do this.
Not an answer (Stack Overflow won't let me comment as I don't have enough points!)
Your string containing JSON has a syntax error.
There is a single " after "year":"2013"
You may have to fix that first.

Ways to implement data versioning in MongoDB

Can you share your thoughts on how you would implement data versioning in MongoDB? (I've asked a similar question regarding Cassandra. If you have any thoughts on which database is better for that, please share.)
Suppose that I need to version records in a simple address book. (Address book records are stored as flat JSON objects.) I expect that the history:
will be used infrequently
will be used all at once to present it in a "time machine" fashion
won't contain more than a few hundred versions per record
won't expire
I'm considering the following approaches:
Create a new object collection to store the history of records or changes to the records. It would store one object per version with a reference to the address book entry. Such records would look as follows:
{
'_id': 'new id',
'user': user_id,
'timestamp': timestamp,
'address_book_id': 'id of the address book record',
'old_record': {'first_name': 'Jon', 'last_name':'Doe' ...}
}
This approach can be modified to store an array of versions per document, but this seems to be a slower approach without any advantages.
Store versions as serialized (JSON) objects attached to address book entries. I'm not sure how to attach such objects to MongoDB documents. Perhaps as an array of strings.
(Modelled after Simple Document Versioning with CouchDB)
The first big question when diving in to this is "how do you want to store changesets"?
Diffs?
Whole record copies?
My personal approach would be to store diffs. Because the display of these diffs is really a special action, I would put the diffs in a different "history" collection.
I would use the different collection to save memory space. You generally don't want a full history for a simple query. So by keeping the history out of the object you can also keep it out of the commonly accessed memory when that data is queried.
To make my life easy, I would make a history document contain a dictionary of time-stamped diffs. Something like this:
{
_id : "id of address book record",
changes : {
1234567 : { "city" : "Omaha", "state" : "Nebraska" },
1234568 : { "city" : "Kansas City", "state" : "Missouri" }
}
}
To make my life really easy, I would make this part of my DataObjects (EntityWrapper, whatever) that I use to access my data. Generally these objects have some form of history, so that you can easily override the save() method to make this change at the same time.
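To make the idea concrete, here is a minimal in-memory sketch of computing a diff between two flat records and filing it under a timestamp, in the shape of the history document above. The helper names diffRecords and recordChange are hypothetical, and the sketch assumes flat records with primitive values; removed fields are not tracked.

```javascript
// Hypothetical helper: return only the fields whose values changed (flat records).
function diffRecords(oldRecord, newRecord) {
  const changes = {};
  for (const key of Object.keys(newRecord)) {
    if (oldRecord[key] !== newRecord[key]) {
      changes[key] = newRecord[key];
    }
  }
  return changes;
}

// Append a time-stamped diff to an in-memory history document, skipping empty diffs.
function recordChange(historyDoc, timestamp, oldRecord, newRecord) {
  const diff = diffRecords(oldRecord, newRecord);
  if (Object.keys(diff).length > 0) {
    historyDoc.changes[timestamp] = diff;
  }
  return historyDoc;
}

const historyDoc = { _id: "id of address book record", changes: {} };
const v1 = { name: "Jon", city: "Omaha", state: "Nebraska" };
const v2 = { name: "Jon", city: "Kansas City", state: "Missouri" };
recordChange(historyDoc, 1234568, v1, v2);
console.log(historyDoc.changes);
```

In a real save() override, the same diff would be written to the history collection alongside the update of the main record.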
UPDATE: 2015-10
It looks like there is now a spec for handling JSON diffs. This seems like a more robust way to store the diffs / changes.
There is a versioning scheme called "Vermongo" which addresses some aspects which haven't been dealt with in the other replies.
One of these issues is concurrent updates, another one is deleting documents.
Vermongo stores complete document copies in a shadow collection. For some use cases this might cause too much overhead, but I think it also simplifies many things.
https://github.com/thiloplanz/v7files/wiki/Vermongo
Here's another solution using a single document for the current version and all old versions:
{
_id: ObjectId("..."),
data: [
{ vid: 1, content: "foo" },
{ vid: 2, content: "bar" }
]
}
data contains all versions. The data array is ordered, new versions will only get $pushed to the end of the array. data.vid is the version id, which is an incrementing number.
Get the most recent version:
find(
{ "_id":ObjectId("...") },
{ "data":{ $slice:-1 } }
)
Get a specific version by vid:
find(
{ "_id":ObjectId("...") },
{ "data":{ $elemMatch:{ "vid":1 } } }
)
Return only specified fields:
find(
{ "_id":ObjectId("...") },
{ "data":{ $elemMatch:{ "vid":1 } }, "data.content":1 }
)
Insert new version: (and prevent concurrent insert/update)
update(
{
"_id":ObjectId("..."),
$and:[
{ "data.vid":{ $not:{ $gt:2 } } },
{ "data.vid":2 }
]
},
{ $push:{ "data":{ "vid":3, "content":"baz" } } }
)
2 is the vid of the current most recent version and 3 is the new version being inserted. Because you need the most recent version's vid anyway, it's easy to get the next version's vid: nextVID = oldVID + 1.
The $and condition ensures that 2 is the latest vid.
This way there's no need for a unique index, but the application logic has to take care of incrementing the vid on insert.
Remove a specific version:
update(
{ "_id":ObjectId("...") },
{ $pull:{ "data":{ "vid":2 } } }
)
That's it!
(remember the 16MB per document limit)
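The "insert new version" step can be wrapped in a small function that derives the filter and the update from the vid the application last read. This is a sketch only: buildPushVersion is a hypothetical helper, and the id value is a placeholder string rather than a real ObjectId.

```javascript
// Hypothetical helper: build the filter and update for appending the next version.
function buildPushVersion(id, currentVid, content) {
  const filter = {
    _id: id,
    $and: [
      { "data.vid": { $not: { $gt: currentVid } } }, // no newer version exists
      { "data.vid": currentVid }                      // the expected version exists
    ]
  };
  const update = {
    $push: { data: { vid: currentVid + 1, content: content } }
  };
  return { filter, update };
}

const { filter, update } = buildPushVersion("some-id", 2, "baz");
console.log(JSON.stringify(filter));
console.log(JSON.stringify(update));
```

If another writer has already pushed version 3, the filter matches nothing and no document is modified, so the application would need to check the update result, re-read the document, and retry with the new vid.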
If you're looking for a ready-to-roll solution -
Mongoid has built in simple versioning
http://mongoid.org/en/mongoid/docs/extras.html#versioning
mongoid-history is a Ruby plugin that provides a significantly more complicated solution with auditing, undo and redo
https://github.com/aq1018/mongoid-history
I worked through this solution, which accommodates published, draft and historical versions of the data:
{
published: {},
draft: {},
history: {
"1" : {
metadata: <value>,
document: {}
},
...
}
}
I explain the model further here: http://software.danielwatrous.com/representing-revision-data-in-mongodb/
For those that may implement something like this in Java, here's an example:
http://software.danielwatrous.com/using-java-to-work-with-versioned-data/
Including all the code that you can fork, if you like
https://github.com/dwatrous/mongodb-revision-objects
If you are using mongoose, I have found the following plugin to be a useful implementation of the JSON Patch format
mongoose-patch-history
Another option is to use mongoose-history plugin.
let mongoose = require('mongoose');
let mongooseHistory = require('mongoose-history');
let Schema = mongoose.Schema;
let MySchema = new Schema({
title: String,
status: Boolean
});
MySchema.plugin(mongooseHistory);
// The plugin will automatically create a new collection with the schema name + "_history".
// In this case, collection with name "my_schema_history" will be created.
I have used the package below for a Meteor/MongoDB project, and it works well. The main advantage is that it stores the history/revisions in an array within the same document, so there is no need for additional publications or middleware to access the change history. It can keep a limited number of previous versions (e.g. the last ten), and it also supports change concatenation (so all changes that happened within a specific period are covered by one revision).
nicklozon/meteor-collection-revisions
Another sound option is to use Meteor Vermongo (here)