I have been trying to use reactivemongo to insert some documents into a mongodb collection with a few BSON types.
I am using the Play JSON library to parse and manipulate some documents in extended JSON, here is one example:
{
"_id" : {"$oid": "5f3403dc7e562db8e0aced6b"},
"some_datetime" : {
"$date" : {"$date": 1597841586927}
}
}
I'm using reactivemongo-play-json, and so I have to import the following so my JsObject is automatically cast to a reactivemongo BSONDocument when passing it to collection.insert.one
import reactivemongo.play.json.compat._
import json2bson._
Unfortunately, once I open my mongo shell and look at the document I just inserted, this is the result:
{
"_id" : ObjectId("5f3403dc7e562db8e0aced6b"),
"some_datetime" : {
"$date" : NumberLong("1597244282116")
},
}
Only the _id has been understood as a BSON type described using extended JSON, and I'd expect the some_datetime field to be something like a ISODate(), same as I'd expect to see UUID()-type values instead of their extended JSON description which looks like this:
{'$binary': 'oKQrIfWuTI6JpPbPlYGYEQ==', '$type': '04'}
How can I make sure this extended JSON is actually converted to proper BSON types?
Turns out the problem is that what I thought to be extended JSON is actually not; my datetime should be formatted as:
{"$date": {"$numberLong": "1597841586927"}}
instead of
{"$date": 1597841586927}
The wrong format was introduced by my data source - a kafka connect mongo source connector not serializing documents to proper extended JSON by default (see this stackoverflow post).
Related
In MongoDB (using mongosh or command-line mongo cli), you can query documents, for example using db.mycollection.find({"something":true}) and get the following result:
{
"someDate": ISODate("2022-10-24T17:21:44.980Z"),
"something": true,
"hello": "world"
}
This result, however, is not valid JSON (Due to ISODate). How can I change the query above to make MongoDB return canonical (valid) JSON?
I'm looking for a recursive and generalized way to do this, even for deeply nested documents.
There are a number of existing answers, I'll clarify a few:
Use aggregate to produce the output in JSON format: Playground
db.collection.aggregate([
{
$match: {
something: true
}
},
{
$project: {
_id: 1,
someDate: {
$dateToString: {
format: "%Y-%m-%dT%H:%M:%S:%LZ",
date: "$someDate"
}
},
something: 1,
hello: 1
}
}
])
Loop through your query in your application: (example, node.js)
db.mycollection.find({"something":true}).forEach(function(doc) {
doc.someDate = doc.someDate.toISOString() // or even .toJSON()
})
// Or with await
const records = await db.mycollection.find({"something":true}).map(doc => {
doc.someDate = doc.someDate.toISOString()
return doc
}).toArray()
The details of where you are running this command are very important, can you please share those?
I am guessing that you are probably running this via the (older) mongo utility (as opposed to the newer mongosh). But confirmation of that, and the database version that you are using, would both be helpful. I will retain this assumption for the purposes of this answer.
This result, however, is not valid JSON (Due to ISODate).
The database itself doesn't return some text that has ISODate. In fact, it doesn't return or "speak" JSON at all. Rather, the database communicates via "Binary JSON" or BSON for short. In fact, the "Does MongoDB use BSON or JSON?" section of this page specifically mentions the following:
Firstly, BSON documents may contain Date or Binary objects that are not natively representable in pure JSON.
So when you see things like ISODate() that is the client application wrapping and representing the rich BSON document in a more limited (and text-based) JSON-like form. Importantly, this is often for readability purposes. You should be able to natively pass around and operate on information (documents) returned by the database directly in the application without doing any sort of transformation and without losing rich type information. Additional reading material about BSON is here.
Getting back to the original question, if you want to have the shell print out a valid JSON document than you can do that via additional helpers. In the older mongo utility, a reproduction of the situation described in the question is:
> db.mycollection.findOne({"something": true})
{
"_id" : 1,
"someDate" : ISODate("2022-11-30T14:38:37.711Z"),
"something" : true,
"hello" : "world"
}
The shell itself can understand and operate on ISODate() (and other functions of that nature). If you did want to remove things like ISODate() for some reason, then you can leverage the JSON.stringify() functionality (reformatted with line indents for readability):
> JSON.stringify( db.mycollection.findOne({"something": true}) )
{
"_id":1,
"someDate":"2022-11-30T14:38:37.711Z",
"something":true,
"hello":"world"
}
The newer mongosh shell offers even more utility here:
> EJSON.serialize( db.mycollection.findOne() )
{
_id: 1,
someDate: { '$date': '2022-11-30T14:38:37.711Z' },
something: true,
hello: 'world'
}
Via these EJSON functions, mongosh is attempting to preserve type information when it prints the data in this format. Notice that in the earlier example the date was just represented as a string, but here the shell is using Extended JSON to capture the fact that the type of someDate is a Date.
Well, I have used MongoDB in a while but I don't know how to handle this situation.
The scenario is: I have data inserted in MongoDB. I have exported the data in JSON format, and it is something like:
[
{
"_id": { "$oid": "60ff324f41c4d5b96054390d" },
"field": {
"due_date": { "$date": "2021-11-03T00:09:18.271Z" }
}
}
]
You can see that :
_id is { "$oid": "60ff324f41c4d5b96054390d" } and
date is { "$date": "2021-11-03T00:09:18.271Z" }.
So, the problem is trying to insert in another DB using Mongoose. I want to test some funcionality isolated so I wan't these values in other environment, so I have used insertMany() with the JSON previously exported.
(Maybe there is a more elegant way to import a JSON file but is only for testing purposes)
await model.insertMany(JSON.parse(fs.readFileSync('./data.json').toString()))
But the problem is here: It throws an error because my schema says that _id is an ObjectId and due_date is a Date object but they are actually read as objects: {$oid: ""} and {$date: ""}
ValidationError: model validation failed: _id: Cast to ObjectId failed for value "{ '$oid': '60ff324f41c4d5b96054390d' }" (type Object) at path "_id", field.due_date: Cast to date failed for value "{ '$date': '2021-11-03T00:09:18.271Z' }" (type Object) at path "field.due_date"
So the question is: Is there any way to cast the $oid and $date objects using mongoose while inserting?
Also, I have managed to insert the values using a very ugly script iterating over each value and checking if the object is $oid or $date and modifying the values... but it works.
So the question is not: "Is there a workaround to do this?" but "Is there a way to do it directly with mongoose functions as insertMany and casted automatically?"
Thanks.
The {"$oid": ...} and {"$date":...} constructs are MongoDB extended JSON notation, since JSON does not have any type for Date or ObjectId.
In order to insert those values, you will need to use a JSON parser that knows about this extended format.
One possibility is the json_util that is included with bson.
Or you can use mongoimport to read the JSON file and do the inserts for you.
I have used mongo import to import data into mongodb from csv files. I am trying to retrieve data from an Mongodb realm service. The returned data for the entry is as follows:
{
"_id": "6124edd04543fb222e",
"Field1": "some string",
"Field2": {
"$numberDouble": "145.81"
},
"Field3": {
"$numberInt": "0"
},
"Field4": {
"$numberInt": "15"
},
"Field5": {
"$numberInt": "0"
}
How do I convert this into normal JSON by removing $numberInt and $numberDouble like :
{
"_id": "6124edd04543fb222e",
"Field1": "some string",
"Field2": 145.8,
"Field3": 0,
"Field4": 15,
"Field5": 0
}
The fields are also different for different documents so cannot use Mongoose directly. Are there any solutions to this?
Also would help to know why the numbers are being stored as $numberInt:"".
Edit:
For anyone with the same problem this is how I solved it.
The array of documents is in EJSON format instead of JSON like said in the upvoted answer. To covert it back into normal JSON, I used JSON.stringify to first convert each document I got from map function into string and then parsed it using EJSON.parse with
{strict:false} (this option is important)
option to convert it into normal JSON.
{restaurants.map((restaurant) => {
restaurant=EJSON.parse(JSON.stringify(restaurant),{strict:false});
}
EJSON.parse documentation here. The module to be installed and imported is mongodb-extjson.
The format with $numberInt etc. is called (MongoDB) Extended JSON.
You are getting it on the output side either because this is how you inserted your data (meaning your inserted data was incorrect, you need to fix the ingestion side) or because you requested extended JSON serialization.
If the data in the database is correct, and you want non-extended JSON output, you generally need to write your own serializers to JSON since there are multiple possibilities of how to format the data. MongoDB's JSON output format is the Extended JSON you're seeing in your first quote.
I built a REST service and I found out that a JSON String generated from an ObjectId by using Gson will be in a different format than that is produced by spring-boot. and if I send an ObjectId of an existing Document's _id field in GSON format to my REST service and save it to the collection by using mongorepository's save function, a new Document with duplicated _id will still be inserted even if a unique index is set on such field. But if I send ObjectId in a format that is produced by spring-boot everything works perfectly. I'm wondering what caused such a problem?
"timestamp": 1558461711,
"machineIdentifier": 5077764,
"processIdentifier": 21816,
"counter": 13546695,
"date": "2019-05-21T18:01:51.000+0000",
"time": 1558461711000,
"timeSecond": 1558461711(generated by spring-boot)
"counter": 13546695,
"randomValue1": 9256029,
"randomValue2": 856,
"timestamp": 1558461711(by GSON)
If you are working with mongodb it is better to use org.bson.Document (which is provided by mongodb dependency) or some other mongodb class to convert document to json rather then GSON.
Document document = new Document();
document.put("_id", new ObjectId());
String json = document.toJson()
document.toJson() should stringify ObjectId in a right way.
Actually the output of the code above will be:
{ "_id" : { "$oid" : "5ce51fb47dda11a8507087eb" } }
Which is a valid format for mongodb, not sure about how SpringBoot will react on it.
Anyway, hope it'll help.
I am inserting json file into Mongodb(with Scala/Play framework) and the same getting/downloading it into my view page for some other requirement, but this time it is coming with one "_id" parameter in the json file.
But I need only my actual json file only that is not having any any "_id" parameter. I have read the Mongodb tutorial, that by default storing it with one _id for any collection document.
Please let me know that how can I get or is there any chance to get my actual json file without any _id in MongoDB.
this is the json result which is storing in database(I don't need that "_id" parameter)
{
"testjson": [{
"key01": "value1",
"key02": "value02",
"key03": "value03"
}],
"_id": 1
}
If you have a look at ReactiveMongo dev guide and to its API, you can see it support projection in a similar way as the MongoDB shell.
Then you can understand that you can do
collection.find(selector = BSONDocument(), projection = BSONDocument("_id" -> 0))
Or, as you are using JSON serialization:
collection.find(selector = Json.obj(), projection = Json.obj("_id" -> 0))
You can use this query in the shell:
db.testtable.find({},{"_id" : false})
Here we are telling mongoDB not to return _id from the collection.
You can also use 0 instead of false, like this:
db.testtable.find({},{"_id" : 0})
for scala you need to convert it in as per the driver syntax.