I'm trying to design a schema paradigm in MongoDB which would support multilingual values for variable attributes in documents.
For example, I would have a product catalog where each product may require storing its name, title or any other attribute in various languages.
This same paradigm should probably hold for other locale-specific properties, such as price/currency variations.
I've been considering a key-value approach where key is the language code and value is the corresponding value:
{
sku: "1011",
name: { "en": "cheese", "de": "Käse", "es": "queso", etc... },
price: { "usd": 30.95, "eur": 20, "aud": 40, etc... }
}
The problem is that I believe this would prevent me from using indexes on the multilingual fields.
Ultimately, I'd like a generic yet intuitive design that can be indexed.
Any suggestion would be appreciated, thanks.
Wholesale recommendations on your schema design may be too broad a topic for discussion here. I can, however, suggest that you consider putting the elements you are showing into an array of sub-documents, rather than a single sub-document with a field for each item.
{
sku: "1011",
name: [{ "en": "cheese" }, {"de": "Käse"}, {"es": "queso"}, etc... ],
price: [{ "usd": 30.95 }, { "eur": 20 }, { "aud": 40 }, etc... ]
}
The main reason for this is the access paths to your elements, which should be easier to query this way. I went through this in some detail here, which may be worth reading.
It could also be a possibility to expand on this for something like your name field:
name: [
{ "lang": "en", "value": "cheese" },
{ "lang": "de", "value": "Käse" },
{ "lang": "es", "value": "queso" },
etc...
]
It all depends on your indexing and access requirements and on what exactly your application needs. The beauty of MongoDB is that it allows you to structure your documents to fit those needs.
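To make the "lang"/"value" layout a bit more concrete, here is a minimal sketch in plain JavaScript. The pickLocalized helper and its fall-back-to-English behaviour are assumptions made for illustration, not something MongoDB provides, and the collection name "products" in the comment is hypothetical too.

```javascript
// Hypothetical product document using the { lang, value } layout from above.
const product = {
  sku: "1011",
  name: [
    { lang: "en", value: "cheese" },
    { lang: "de", value: "Käse" },
    { lang: "es", value: "queso" }
  ]
};

// With this layout a single multikey index can cover every language,
// e.g. in the mongo shell (collection name "products" is an assumption):
//   db.products.createIndex({ "name.lang": 1, "name.value": 1 })
//   db.products.find({ name: { $elemMatch: { lang: "de", value: "Käse" } } })

// Pick the value for a language, falling back to another language if missing.
function pickLocalized(entries, lang, fallback = "en") {
  const hit =
    entries.find((e) => e.lang === lang) ||
    entries.find((e) => e.lang === fallback);
  return hit ? hit.value : undefined;
}
```

The point of the design is that the language code becomes data rather than a field name, so you do not need one index per language.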
P.S. For anything where you are storing money values, I suggest you do some reading, starting perhaps with this post:
MongoDB - What about Decimal type of value?
I'm playing around with MongoDB and was wondering what best practices are for how a SQL-ish schema may correspond to MongoDB. Here are the tables/data I have so far:
user
id
email
name
answer
user_id (FK user.id)
tag
upvotes
repo
id
owner
name
description
stars
repo_tag
repo_id (FK to repo.id)
tag
is_language
percentage
repo_contrib
repo_id (FK to repo.id)
user_id (FK to user.id)
lines_of_code
The structure goes something like this:
user
answer (left outer)
repo_contrib (left outer)
repo
repo_tag
Note: All users will have at least one answer or one repo, but do not necessarily have both.
How might I put this into a mongo schema? Would this be one 'collection'? Or would this be two collections: one for user, and one for repo; or more?
My queries will be something like: "Grab all users with an Answer with tag [Python] with more than 2 upvotes or a repo with the [Python] tag with more than two stars."
Let me divide this into a couple of steps:
STEP 1 - MONGODB and MONGOOSE
MongoDB is a document based database. Each record in a collection is a document, and every document should be self-contained (it should contain all information that you need inside it).
Since MongoDB is a non-relational database, you cannot create relations between collections, but you can store a reference to a document from one collection as a property of a document in another collection. To help you manage all of this, there is a great package called Mongoose, which allows you to create a Model for each Collection. After you define Models, Mongoose will allow you to easily query the database.
STEP 2 - DEFINING MODELS
As we said, documents should be self-contained, so they should have all the information that you need inside them. There are 3 possible approaches based on your example:
APPROACH 1:
Create one collection for each table that you have in your relational database. This is the best practice when you have documents with a lot of data, because it is scalable.
APPROACH 2:
Create 3 Collections - USERS, ANSWERS and REPOS. Because repo_contrib does not have a lot of data, you can store all user's contributions in a USERS document. That way, when you fetch a User document, you will have everything that you need in one place. The same goes for repo_tag - we can store all repo's tags in a REPOS document.
APPROACH 3:
Create 2 Collections - USERS and REPOS. The same as APPROACH 2, but you can also add all user's answers to the USERS document.
RECOMMENDATION:
I would go with APPROACH 2 in this case, since repo_contrib and repo_tag do not store much data and can easily be stored in USERS and REPOS documents. This approach also makes querying the database a lot easier. The reason I didn't choose APPROACH 3 is that a user can theoretically have thousands or tens of thousands of answers, and embedding them would not scale well.
STEP 3 - IMPLEMENTATION
NOTE: MongoDB will automatically assign _id to each document, so you don't have to define id property when implementing Models.
Tables from your relational database example can be mapped to collections like this (This implementation is for APPROACH 2):
USERS Collection:
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
var schema = new Schema({
email: { type: String, required: true, unique: true },
name: { type: String, required: true, unique: false },
contributions: [{
repo_id: { type: mongoose.Schema.Types.ObjectId, ref: 'REPOS' },
lines_of_code: { type: Number }
}]
});
const Users = mongoose.model('USERS', schema);
module.exports = Users;
ANSWERS Collection:
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
var schema = new Schema({
user_id: { type: mongoose.Schema.Types.ObjectId, ref: 'USERS', required: true },
tag: { type: String, required: true, unique: false },
upvotes:{ type: Number, default: 0, unique: false }
});
const Answers = mongoose.model('ANSWERS', schema);
module.exports = Answers;
REPOS Collection:
const mongoose = require('mongoose');
const Schema = mongoose.Schema;
var schema = new Schema({
owner: { type: mongoose.Schema.Types.ObjectId, ref: 'USERS', required: true },
name: { type: String, required: true, unique: false },
description: { type: String, required: false, unique: false },
stars:{ type: Number, default: 0, unique: false },
tags: [{
name: { type: String, required: true, unique: false },
is_language: {type: Boolean, required: true, unique: false},
percentage:{ type: Number, default: 0, unique: false }
}]
});
const Repos = mongoose.model('REPOS', schema);
module.exports = Repos ;
STEP 4 - POPULATION AND DATABASE QUERIES
One of the best features of Mongoose is called population. If you store a reference to a document from one collection as a property of a document in another collection, Mongoose can replace the references with the actual documents when you query the database.
Example 1:
Let us first take as an example the first query that you suggested: Find all users with an Answer with tag [Python] with more than 2 upvotes. Since we stored user_id in ANSWERS Collection as a reference to the document from the USERS collection, that means that we can just query the ANSWERS Collection, and when returning the final result Mongoose will go to USERS collection and replace the references with the actual User documents. The database query that will perform this looks like this:
const ANSWERS = require('../models/answers');
ANSWERS.find({
"tag": "Python",
"upvotes": {
"$gt": 2
}
}).populate('user_id');
Example 2:
The second query that you suggested is: Find all repos with the [Python] tag with more than two stars. Since we are storing all of a repo's tags in one array, we just need to check that the array contains an item whose name field is equal to Python, and that the repo's stars field is greater than 2. The database query that will perform this looks like this:
const REPOS = require('../models/repos');
REPOS.find({
"tags.name": "Python",
"stars": {
"$gt": 2
}
})
Here is also the working example: https://mongoplayground.net/p/rgBtVVDgPzG
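The original question asks for users matching either condition, while the two examples above return answers and repos separately. One way to combine them is to merge the ids in application code. This is only a sketch under assumptions: collectUserIds is a hypothetical helper, it expects the queries to be run without populate (so user_id and owner are raw ids), and it treats the repo owner as the user of interest.

```javascript
// Merge the results of the two queries into one de-duplicated list of user ids.
// answerDocs: results of the ANSWERS query (each has user_id)
// repoDocs:   results of the REPOS query (each has owner)
function collectUserIds(answerDocs, repoDocs) {
  const ids = new Set();
  for (const answer of answerDocs) ids.add(String(answer.user_id));
  for (const repo of repoDocs) ids.add(String(repo.owner));
  // A Set keeps insertion order, so answer authors come first.
  return [...ids];
}
```

The resulting ids could then be passed to a single query on the USERS collection with $in.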
Designing a database model is very complex most of the time, and I guess you are looking for best practices from a reputable source. I think this is the missing point in the other answers, even if @NenadMilosavljevic got close to it.
Brief introduction on NoSQL modeling
You are probably used to modeling SQL databases; NoSQL modeling is quite different. These are some of the differences:
SQL modeling
This type of modeling is "data-oriented" in the sense that it is designed to be used and shared by many applications. Data is normalized, generally using normal forms, to avoid duplication and to make future changes easier and with the lowest downtime possible.
You start from requirements analysis, then the conceptual design, and in the end the physical design.
NoSQL modeling
NoSQL modeling is "application-oriented" because it should be built from the requirements of a single application, in order to reach the maximum level of optimization.
If you want to optimize your application, you need to start from the app itself and from the operations needed: this is the so-called workload. After that come the conceptual and physical design, of course.
I want to focus on the workload a little more because it is very important. Since you come from a SQL-based application, you can describe the workload starting from various scenarios, production logs and statistics. For each query you need, these parameters are essential:
Size of data requested
Frequency of the query
The complexity of the operations involved
Returning to the original question: "My queries will be something like..." is not enough for me to help you on building a NoSQL model. There are many solutions for your problem but, unless you provide a lot more info on the queries you need to do, they are all correct.
@NenadMilosavljevic gave you multiple approaches, but I can't say whether the second one is the correct one, for the reasons given above. For example, he suggests keeping the user and the user's contributions together so that a single query retrieves them, instead of doing JOINs or something even more expensive.
This is certainly clever but suppose, and it is probably not your case, that you have to update the user contributions very often, then in this case keeping them in a separate collection might be better.
What I am saying is that too many assumptions are missing and the solution we give you could be good but not optimal.
Honestly, it is not clear to me whether you need a straightforward conversion from a SQL model to a NoSQL model, or whether you are trying to apply NoSQL principles. I have no idea about the size of your database, but if performance is not a problem, just go with the solution you find most appropriate. Researching how to better model your data would be a waste of time in that case.
Instead, if you really need to design a NoSQL database, not a SQL-like NoSQL database, then my advice is to follow this course. Actually, you could finish it in less than 5 hours and many lessons are unnecessary in your situation, but it is worth taking a look at it. No one here talked about patterns and how to deal with one-to-zillion relationships for instance. It is very important to know their existence unless you like to redesign your database when it is too late.
Here is my suggestion: 3 collections, namely user, repo and answer. Below are the schemas for reference.
user collection
id: String
email: String
name: String
repo collection
id: String
owner: String
name: String
description: String
tag: [String] // Array of string
contributors: [String] // Array of user ids
I suggest having another collection named answer, because a user could provide a lot of answers. Having them in a separate collection will be easier to query compared to having them inside subdocuments of the user collection.
answer collection
answer_id
user_id
tag
upvotes
I hope it is helpful.
Mongo Schema design 101: https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1.
If in SQL you think about data in an object-oriented way (your models represent business entities
and you build functions around them), in Mongo you should think in a functional way: what data you have as
input and what you need as output. In other words, your schema should be based on the queries you need
to run, not on the data you have.
It is a bit more tricky as there is no best way - any schema will be better for some queries than for others.
You will need to choose which queries should be prioritized. To make it even more interesting, you will need
to anticipate which queries you may have in the future.
And of course it's all about data size. If it fits on a single server, you have the luxury of aggregation lookups ($lookup) to "join" collections.
Otherwise sharding will significantly restrict your choices.
On the other hand, embedding should be used with care: document size cannot exceed 16MB, and modifying
embedded documents is not that straightforward.
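To get a feeling for how far embedding can go before the document size limit matters, a rough estimate like the following may help. It is only a sketch: it measures the JSON encoding, not the real BSON encoding, so the numbers are an order-of-magnitude guide at best, and both helper names are made up for this example.

```javascript
// MongoDB's maximum BSON document size: 16 MiB.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// Rough size estimate based on the JSON encoding. BSON sizes differ,
// so treat this as an order-of-magnitude check only.
function approxJsonBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Estimate how many embedded items of a given shape fit into one document.
function approxMaxEmbedded(sampleItem, baseDoc = {}) {
  const perItem = approxJsonBytes(sampleItem) + 1; // +1 for the separating comma
  const budget = MAX_DOC_BYTES - approxJsonBytes(baseDoc);
  return Math.floor(budget / perItem);
}
```

For small items such as { tag, upvotes } the limit is far away, but documents that size are expensive to read and update long before they hit it.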
Last but not least, consider indexes. Your schema should allow efficient indexes for your queries. Here you need to consider not only data size but also its quality: selectivity/cardinality.
In light of the foregoing, the best schema to "grab all users with an Answer with tag [Python] with more than 2 upvotes or a repo with the [Python] tag with more than two stars" will be 2 collections:
user:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"id": {
"bsonType": "objectId"
},
"email": {
"bsonType": "string"
},
"name": {
"bsonType": "string"
},
"answers": {
"bsonType": "array",
"items": [
{
"bsonType": "object",
"properties": {
"tag": {
"bsonType": "string"
},
"upvotes": {
"bsonType": "int"
}
},
"required": [
"tag",
"upvotes"
]
}
]
}
},
"required": [
"id",
"email",
"name",
"answers"
]
}
repo:
{
"$schema": "http://json-schema.org/draft-04/schema#",
"type": "object",
"properties": {
"id": {
"bsonType": "objectId"
},
"owner": {
"bsonType": "string"
},
"name": {
"bsonType": "string"
},
"description": {
"bsonType": "string"
},
"stars": {
"bsonType": "int"
},
"tags": {
"bsonType": "array",
"items": [
{
"bsonType": "object",
"properties": {
"tag": {
"bsonType": "string"
},
"is_language": {
"bsonType": "bool"
},
"percentage": {
"bsonType": "double"
}
},
"required": [
"tag",
"is_language",
"percentage"
]
}
]
},
"contributors": {
"bsonType": "array",
"items": [
{
"bsonType": "object",
"properties": {
"user_id": {
"bsonType": "objectId"
},
"lines_of_code": {
"bsonType": "int"
}
},
"required": [
"user_id",
"lines_of_code"
]
}
]
}
},
"required": [
"id",
"owner",
"name",
"description",
"stars"
]
}
with the queries:
db.user.find({answers: {$elemMatch:{tag:"Python", upvotes:{$gt:2}}}})
db.repo.find({"tags.tag":"Python", stars:{$gt:2}})
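The first query uses $elemMatch on purpose. The following plain JavaScript sketch mimics the difference between $elemMatch and two separate dotted conditions on the same array. The sample user and the two function names are made up for illustration; they imitate the matching rules rather than call MongoDB.

```javascript
// Sample user: the Python answer has 1 upvote, the Java answer has 5.
const user = {
  name: "example",
  answers: [
    { tag: "Python", upvotes: 1 },
    { tag: "Java", upvotes: 5 }
  ]
};

// Mimics { answers: { $elemMatch: { tag: "Python", upvotes: { $gt: 2 } } } }:
// a single array element must satisfy both conditions.
function matchesElemMatch(doc) {
  return doc.answers.some((a) => a.tag === "Python" && a.upvotes > 2);
}

// Mimics { "answers.tag": "Python", "answers.upvotes": { $gt: 2 } }:
// each condition may be satisfied by a different element.
function matchesDotted(doc) {
  return (
    doc.answers.some((a) => a.tag === "Python") &&
    doc.answers.some((a) => a.upvotes > 2)
  );
}
```

Here the sample user should not be returned, and only the $elemMatch form excludes it.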
In the comments you mentioned something like "to get all the repos for a given user". I assume that's about contributors; otherwise you don't need this array at all.
The query will be:
db.repo.find({"contributors.user_id": ObjectId("12313212313232")})
I'm pretty new to MongoDB and while preparing data to be consumed I got into Aggregation... what a powerful little thing this database has! I got really excited and started to test some things :)
I'm saving time entries for a companyId and employeeId. Each can have many entries, which are normally sorted by date, but one date can have several entries (multiple registrations on the same day).
I'm trying to come up with a good schema so I can easily get my data exactly how I need it. As a newbie, I would rather ask for guidance and check whether I'm on the right path.
My output should look like this:
[{
"company": "474A5D39-C87F-440C-BE99-D441371BF88C",
"employee": "BA75621E-5D46-4487-8C9F-C0CE0B2A7DE2",
"name": "Bruno Alexandre",
"registrations": [{
"id": 1448364,
"spanned": false,
"spannedDay": 0,
"date": "2019-01-17",
"timeStart": "09:00:00",
"timeEnd": "12:00:00",
"amount": {
"days": 0.4,
"hours": 2,
"km": null,
"unit": "days and hours",
"normHours": 5
},
"dateDetails": {
"week": 3,
"weekDay": 4,
"weekDayEnglish": "Thursday",
"holiday": false
},
"jobCode": {
"id": null,
"isPayroll": true,
"isFlex": false
},
"payroll": {
"guid": null
},
"type": "Sick",
"subType": "Sick",
"status": "APP",
"reason": "IS",
"group": "LeaveAndAbsence",
"note": null,
"createdTimeStamp": "2019-01-17T15:53:55.423Z"
}, /* more date entries */ ]
}, /* other employees */ ]
What is the best way to add the data into a collection?
Is it more efficient if I create a document per company/employee and add all registration entries inside that document (it could get really big as time passes)... or is it better to have one document per company/employee/date and add all daily events in that document instead?
Regarding aggregation, I'm still new to all this, but I'm imagining I could simply call
RegistrationsModel.aggregate([
{
$match: {
date: { $gte: new Date('2019-01-01'), $lte: new Date('2019-01-31') },
company: '474A5D39-C87F-440C-BE99-D441371BF88C'
}
},
{
$group: {
_id: '$employee',
name: { '$first': '$name' }
}
},
{
// ... get all registrations as an Array ...
},
{
$sort: {
'registrations.date': -1
}
}
]);
P.S. I'm taking the Aggregation course to get familiar with all of it.
Is it more efficient if I create a document per company/employee and
add all registration entries inside that document (it could get really
big as time passes)... or is it better to have one document per
company/employee/date and add all daily events in that document
instead?
From what I understand of document oriented databases, I would say the aim is to have all the data you need, in a specific context, grouped inside one document.
So what you need to do is identify what data you're going to need (staying close to the features you want to implement) and build your data structure accordingly. Be sure to anticipate future features: the better your data structure is prepared for them, the easier it will be to scale your database to your needs.
Your aggregation query looks OK!
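One way the placeholder stage in the question could be filled in is with a $push accumulator inside $group. This is a sketch under assumptions: it presumes one document per registration entry with top-level company, employee, name and date fields, that date is stored as a Date, and buildRegistrationsPipeline is a made-up helper name. Sorting before grouping is a choice made here so that entries are pushed in date order.

```javascript
// Builds the pipeline as plain data; the result can be passed to
// RegistrationsModel.aggregate(...).
function buildRegistrationsPipeline(company, from, to) {
  return [
    { $match: { company: company, date: { $gte: from, $lte: to } } },
    // Sort before grouping so entries are pushed newest-first.
    { $sort: { date: -1 } },
    {
      $group: {
        _id: "$employee",
        name: { $first: "$name" },
        // Collect every matching entry of the employee into one array.
        registrations: { $push: "$$ROOT" }
      }
    }
  ];
}
```

With one document per registration, no single document grows without bound, and the grouping into the desired output shape happens at query time.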
I'm confused about whether to use a selector, views, or both to get a result in the following scenario:
I need to do a wildcard search for a book and return the matching books plus the price and the store branch name.
So I tried using a selector to do the wildcard search with a regex:
"selector": {
"_id": {
"$gt": null
},
"type":"product",
"product_name": {
"$regex":"(?i)"+search
}
},
"fields": [
"_id",
"_rev",
"product_name"
]
I am able to get the result. The idea after getting the result is to take all the _id's from the result set and query a view to get more details, like the price and store branch name from other documents. That feels kind of odd, and I'm not certain it is the correct way to do it.
Below is just the idea: once I get the _id's from the result, I insert them as the "productId" variable.
var input = {
method : 'GET',
returnedContentType : 'json',
path : 'test/_design/app/_view/find_price'+"?keys=[\""+productId+"\"]",
};
return WL.Server.invokeHttp(input);
So I'm asking for input from an expert regarding this.
Another question is how to get the store_branch_name? Can it be done in a single view where we can get the product detail, prices and store branch name? Or do I need to have several views to achieve this?
expected result
product_name (from book document) : Book 1
branch_name (from branch array in Store document) : store 1 branch one
price ( from relationship document) : 79.9
References:
Book
"_id": "book1",
"_rev": "1...b",
"product_name": "Book 1",
"type": "book"
"_id": "book2",
"_rev": "1...b",
"product_name": "Book 2 etc",
"type": "book"
relationship
"_id": "c...5",
"_rev": "3...",
"type": "relationship",
"product_id": "book1",
"store_branch_id": "Store1_branch1",
"price": "79.9"
Store
{
"_id": "store1",
"_rev": "1...2",
"store_name": "Store 1 Name",
"type": "stores",
"branch": [
{
"branch_id": "store1_branch1",
"branch_name": "store 1 branch one",
"address": {
"street": "some address",
"postalcode": "33490",
"type": "addresses"
},
"geolocation": {
"coordinates": [
42.34493,
-71.093232
],
"type": "point"
},
"type": "storebranch"
},
{
"branch_id": "store1_branch2",
"branch_name":
**details ommit...**
}
]
}
In Cloudant Query, you can specify two different kinds of indexes, and it's important to know the differences between the two.
For the first part of your question, if you're using Cloudant Query's $regex operator for wildcard searches like that, you might be better off creating a Cloudant Query index of type "text" instead of type "json". It's in the Cloudant docs, but see the intro blog post for details: https://cloudant.com/blog/cloudant-query-grows-up-to-handle-ad-hoc-queries/ There's also a more advanced post that covers the tradeoffs between the two types of indexes: https://cloudant.com/blog/mango-json-vs-text-indexes/
It's harder to address the second part of your question without understanding how your application interacts with your data, but there are a couple pieces of advice.
1) Consider denormalizing some of this information so you're not doing the JOINs to begin with.
2) Inject more logic into your document keys, and use the traditional MapReduce View indexing system to emit a compound key (an array), that you can use to emulate a JOIN by taking advantage of the CouchDB/Cloudant index sorting rules.
That second one's a mouthful, but check out this example on YouTube: https://youtu.be/0al1KnCKjlA?t=23m39s
Here's a preview (example map function) of what I'm talking about:
'map' : function(doc)
{
if (doc.type==="user") {
emit( [doc._id], null );
}
else if (doc.type==="edge:follower") {
emit( [doc.user, doc.follows], {"_id":doc.follows} );
}
}
The resulting secondary index here would take advantage of the rules outlined in http://wiki.apache.org/couchdb/View_collation -- that strings sort before arrays, and arrays sort before objects. You could then issue range queries to emulate the results you'd get with a JOIN.
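As a sketch of such a range query against the view above: the start key is the one-element array for the user, and the end key appends an empty object, which collates after any string. buildFollowerRange is a hypothetical helper, and whether you want include_docs depends on your application.

```javascript
// Build the query string for a view range that returns a user's own row
// followed by that user's "edge:follower" rows.
function buildFollowerRange(userId) {
  const params = new URLSearchParams({
    startkey: JSON.stringify([userId]),
    // {} collates after any string, so [userId, {}] closes the range.
    endkey: JSON.stringify([userId, {}]),
    // With a value of {"_id": ...}, include_docs fetches the linked document.
    include_docs: "true"
  });
  return params.toString();
}
```

The returned string would be appended after "?" to the view URL, in the same way the keys parameter is appended in the question.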
I think that's as much detail as is appropriate here. Hope it helps!
I plan to create a database for price history.
The history database should store prices defined 90 days in advance, for each day of the year.
That means: 90 days x 365 days/year = 32850 database items
Is there any way to design the schema to improve query performance?
My first idea was to store the values hierarchically, like:
{
"Address": "xxxxx",
"City": "xxxxx",
"Country": "Deutschland",
"Currency": "EUR",
"Item_Name": "xxxxxx",
"Location": [
log, lat
],
"Postal_code": "xxxx",
"Price_History": [
2014 : [
"January" : {
"CW_1" : { 1: [ price1 .. price90 ], 2: [ price1 .. price90 ], },
"CW_2" : {},
"CW_3" : {},
} ,
"February" : {},
"March" : {},
]
]
}
Thank you in advance!
It all depends on which queries you are planning to run against this data. It seems to me that if you are interested in keeping a history of actions, then your queries will almost always contain a date parameter.
The Price_History array might be better formatted as a sub-document. Each of these documents would have a varied (but limited) range of values: the year and the month. It might be a good idea to add an index on that attribute. This way, whenever you query by a certain date range, your indexes will help mongo find the relevant dataset relatively quickly.
Another option would be to store each price as a document in itself. The item connected to the price could be a sub-document, perhaps not containing all of the item data, but enough to make the calculations and fetch the other relevant data once your dataset is small enough. For this usage, I would recommend creating a single attribute for the date range to be indexed, and also an index on the item._id attribute. You can still keep the individual date components if you need to query them individually. Something like this:
{
"ind_attr": "2014_January_CW1",
"date": {
"year": 2014,
"month": "January"
},
"CW": 1,
"price": [ price1... price90 ],
"item": {
"name": ...,
"_id": ...,
// minimal data about the actual item
}
}
With this document structure, you could easily add an index on the ind_attr attribute. The document.item._id attribute can be used to retrieve more detailed data on the actual item if needed.
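A small sketch of how such documents could be built so that ind_attr always has the same format. The helper names and the collection name "prices" in the comments are assumptions for illustration only.

```javascript
// Build the single indexed attribute, e.g. "2014_January_CW1".
function makeIndAttr(year, month, calendarWeek) {
  return `${year}_${month}_CW${calendarWeek}`;
}

// Hypothetical price document following the structure above.
function makePriceDoc(item, year, month, calendarWeek, prices) {
  return {
    ind_attr: makeIndAttr(year, month, calendarWeek),
    date: { year: year, month: month },
    CW: calendarWeek,
    price: prices,
    // Only minimal data about the actual item is copied.
    item: { _id: item._id, name: item.name }
  };
}

// Indexes suggested in the text (collection name "prices" is an assumption):
//   db.prices.createIndex({ ind_attr: 1 })
//   db.prices.createIndex({ "item._id": 1 })
```

Generating the key in one place keeps the indexed values consistent, which matters because queries on ind_attr are exact string matches.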
I have a site that I'm using Mongo on. So far everything is going well. I've got several fields that are static option data, for example a field for animal breeds and another field for animal registrars.
Breeds
Arabian
Quarter Horse
Saddlebred
Registrars
AQHA
American Arabians
There are maybe 5 or 6 different collections like this that range from 5-15 elements.
What is the best way to put these in Mongo? Right now, I've got a separate collection for each group. That is a breeds collection, a registrars collection etc.
Is that the best way, or would it make more sense to have a single static data collection with a "type" field specifying the option type?
Or something else completely different?
Since this data is static then it's better to just embed the data in documents. This way you don't have to do manual joins.
And also store it in a separate collection (one or several, doesn't matter, choose what's easier) to facilitate presentation (render combo-boxes, etc.)
I believe creating multiple collections has collection size implications? (Something about MongoDB creating a collection file on disk as twice the size of the previous file: db.0 = 64MB, db.1 = 128MB and so on.)
Here's what I can think of:
1. Storing as single collection
The benefits here are:
You only need one call to Mongo to fetch, and if you can cache the call you quickly have the data.
You avoid duplication: create a single schema that deals with all your options. You can just nest suboptions if there are any.
Of course, you also avoid duplication in statics/methods to modify options.
I have something similar on a project that I'm working on. I have categories and subcategories all stored in one collection. In all the data where I need to store my 'options' (station categories in my case) I simply use the _id.
Here's a JSON/BSON dump as an example:
{
"status": {
"code": 200
},
"response": {
"categories": [
{
"cat": "Taxi",
"_id": "50b92b585cf34cbc0f000004",
"subcat": []
},
{
"cat": "Bus",
"_id": "50b92b585cf34cbc0f000005",
"subcat": [
{
"cat": "Bus Rapid Transit",
"_id": "50b92b585cf34cbc0f00000b"
},
{
"cat": "Express Bus Service",
"_id": "50b92b585cf34cbc0f00000a"
},
{
"cat": "Public Transport Bus",
"_id": "50b92b585cf34cbc0f000009"
},
{
"cat": "Tour Bus",
"_id": "50b92b585cf34cbc0f000008"
},
{
"cat": "Shuttle Bus",
"_id": "50b92b585cf34cbc0f000007"
},
{
"cat": "Intercity Bus",
"_id": "50b92b585cf34cbc0f000006"
}
]
},
{
"cat": "Rail",
"_id": "50b92b585cf34cbc0f00000c",
"subcat": [
{
"cat": "Intercity Train",
"_id": "50b92b585cf34cbc0f000012"
},
{
"cat": "Rapid Transit/Subway",
"_id": "50b92b585cf34cbc0f000011"
},
{
"cat": "High-speed Rail",
"_id": "50b92b585cf34cbc0f000010"
},
{
"cat": "Express Train Service",
"_id": "50b92b585cf34cbc0f00000f"
},
{
"cat": "Passenger Train",
"_id": "50b92b585cf34cbc0f00000e"
},
{
"cat": "Tram",
"_id": "50b92b585cf34cbc0f00000d"
}
]
}
]
}
}
I have a call to my API that gets me that document (app.ly/api/v1/stationcategories). I find this much easier to code with.
In your case you could have something like:
{
"option": "Breeds",
"_id": "xxx",
"suboption": [
{
"option": "Arabian",
"_id": "xxx"
},
{
"option": "Quarter Horse",
"_id": "xxx"
},
{
"option": "Saddlebred",
"_id": "xxx"
}
]
},
{
"option": "Registrars",
"_id": "xxx",
"suboption": [
{
"option": "AQHA",
"_id": "xxx"
},
{
"option": "American Arabians",
"_id": "xxx"
}
]
}
Whenever you need them, either loop through them, or pull specific options from your collection.
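For the "loop through them" case, a lookup table keyed by _id is usually enough. This is a minimal sketch assuming the option/suboption shape shown above; indexOptions is a made-up helper name.

```javascript
// Flatten options and suboptions into a Map keyed by _id, so that other
// documents can store only the _id and resolve the display name later.
function indexOptions(options) {
  const byId = new Map();
  for (const opt of options) {
    byId.set(opt._id, opt.option);
    for (const sub of opt.suboption || []) {
      byId.set(sub._id, sub.option);
    }
  }
  return byId;
}
```

Because other documents only hold the _id, renaming an option means changing it in one place.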
2. Storing as a static JSON document
This, as @Sergio mentioned, is a viable and simpler approach. You can then either have separate docs for separate options, or put them in one document.
You do lose some flexibility here because you can't reference options by Id (which I prefer because changing an option name doesn't affect all your other data).
It is prone to typos (though if you know what you're doing this shouldn't be a problem).
For Node.js users: this might leave you with a headache from require('../../../options.json'), similar to PHP.
The reader will note that I'm being negative about this approach. It works, but it is rather inflexible.
Though we're discouraged from using joins unnecessarily on MongoDB, referencing by ObjectId is sometimes useful and extensible.
An example: suppose your website becomes popular in one region of the world, and people from Poland start accounting for, say, 50% of your site visits, so you decide to add Polish translations. With approach 2 you would need to go back to all your documents and add the Polish names (if they exist) to your options. With approach 1, it's as easy as adding a Polish name to your options and then plucking the Polish name from your options collection at runtime.
I could only think of these 2 options other than storing each option as a collection.
UPDATE: If someone has positives or negatives for either approach, please add them. My bias might be unhelpful to some people, as there are benefits to storing static JSON files.
MongoDB is schemaless and does not support JOINs. So you have to move away from RDBMS-style normalization, given that this is a fundamentally different kind of database.
Here are a few rules you can apply as guidelines while designing. Of course, you have the choice of keeping data in a separate collection when needed.
Static Master/Reference Data:
You should generally embed them in your documents wherever required. Since the data is not going to change, it is not at all a bad idea to keep it in the same collection. If the data is too large, group it and store it in one separate collection instead of creating multiple collections for this master data.
NOTE: When embedding sub-documents, always make sure that you are never going to exceed the 16MB limit. That is the limit (at this point) on the size of each document in MongoDB.
Dynamic Master/Reference Data
Try to keep them in a separate collection, as this master data tends to change often.
Always remember, NO join support, so keep them in a way that you can easily access it without querying the database too many times.
So there is NO suggested best way, it always changes based on your need and the design can go either way.