How to create a collection in a document in MongoDB? - mongodb

I am using MongoDB for the first time, and I have some experience with NoSQL databases.
I am attempting to replicate behaviour that I have managed to achieve on Google's Cloud Firestore:
I want to create a collection within a document. I have not been able to replicate this behaviour using MongoDB as I cannot find code in the documentation. Is this behaviour even possible please?
Thanks in advance.
Edit:
Here is a screenshot of a sample document in biometric_data :

MongoDB has embedded documents, which can be used to store the same data. You can try creating an array of sub-documents (each with a name and a data property):
{
  name: "",
  email: "",
  ...otherFields,
  biometric_data: [
    {
      name: "glucose",
      data: {
        preferred_unit: "mg/dL"
        // Add new properties as required
      }
    },
    {
      name: "weight",
      data: {
        preferred_unit: "KG"
      }
    }
  ],
  ...templateData
}
However, a document in MongoDB cannot exceed 16 MB. If the number of entries in biometric_data is limited, you can use sub-documents; otherwise you may have to create another collection and store them there as documents (generally preferred for chat apps, or wherever the number of sub-documents can grow very large).
Sub-collections (in Firestore) allow you to structure data hierarchically, making data easier to access. For example, users and posts collections can be structured in either of the ways below:
With sub-collection
users -> {userId} -> posts -> {postId}
Root level collections
users -> {userId}
posts -> {postId}
If you use root-level collections, though, you must add a userId to each post document to identify the owner of the post.
If you use the nested-documents approach in MongoDB, you are likely to hit the 16 MB document limit if any user decides to add many posts. Similarly, if the biometric_data array can hold many documents, it is best to create another collection.
Firestore's sub-collections and their documents do not count towards the 1 MB maximum size of the parent document, but nested documents in MongoDB do count towards the parent's 16 MB limit.
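As a rough sketch of the "another collection" option, the plain JavaScript below splits the embedded document from the example above into one user document plus one document per biometric entry, each carrying a userId back-reference. The helper name splitBiometricData, the placeholder _id value, and the userId field name are assumptions made for illustration; nothing here talks to a database.

```javascript
// Hypothetical helper: turn one user document with an embedded biometric_data
// array into a user document plus separate documents for a second collection.
function splitBiometricData(user) {
  const { biometric_data = [], ...userDoc } = user;
  const biometricDocs = biometric_data.map((entry) => ({
    userId: user._id, // back-reference to the owning user (assumed field name)
    name: entry.name,
    data: entry.data,
  }));
  return { userDoc, biometricDocs };
}

const embedded = {
  _id: "user-1", // placeholder; MongoDB would normally generate an ObjectId
  name: "",
  email: "",
  biometric_data: [
    { name: "glucose", data: { preferred_unit: "mg/dL" } },
    { name: "weight", data: { preferred_unit: "KG" } },
  ],
};

const { userDoc, biometricDocs } = splitBiometricData(embedded);

// Filter that could be passed to find() on the separate collection
// to load every biometric entry belonging to this user.
const filter = { userId: userDoc._id };

console.log(userDoc);
console.log(biometricDocs);
console.log(filter);
```

With this layout the user document stays small no matter how many entries are added, at the cost of a second query (or a lookup) to fetch the entries.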
Also check out:
Firestore - proper NoSQL structure for user-specific data
Is mongodb sub documents equivalent to Firestore subcollections?

Related

Optimizing mongo queries - _id or traverse whole collection

I'm using MongoDB for a project and need to know which would be the better implementation for queries.
Say I have to find 10 documents out of a total of 1000 based on a condition (not the id).
Would it be better to query by document _id (after storing the required ids in another collection beforehand, by checking the condition whenever a document is inserted)
OR
would it be better to traverse all the documents and pick out the required ones using the condition?
The main aim is to split documents into categories and display the documents belonging to a particular category. So: store the ids of the documents in each category, or find the documents in a category by traversing all the documents?
I have heard that MongoDB uses hashed indexing (so I feel option 1 would be faster), but I couldn't find anything about that. A short description of document storage and queries would also be helpful.
The optimum way to query for the cuisine type example would be to store what the restaurant serves in an array of strings or objects, and index that field.
For example:
{
  name: "International House",
  cuisine: [
    { name: "Chinese", subtype: "Kowloon" },
    { name: "Japanese", subtype: "Yakitori" },
    { name: "American", subtype: "TexMex" }
  ]
}
Then create an index on { "cuisine.name": 1 }.
When you need to find all restaurants that serve Chinese food, the query:
db.collection.find({"cuisine.name": "Chinese"})
will use that index, and only scan the documents that match.
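To make the "only scan the documents that match" point concrete, here is a toy in-memory analogue of an index over the cuisine.name values of an array field. It is only a sketch of the idea: MongoDB's real indexes are B-tree structures maintained by the server, and the second restaurant and the helper names buildCuisineIndex and findByCuisine are made up for the example.

```javascript
const restaurants = [
  {
    _id: 1,
    name: "International House",
    cuisine: [
      { name: "Chinese", subtype: "Kowloon" },
      { name: "Japanese", subtype: "Yakitori" },
      { name: "American", subtype: "TexMex" },
    ],
  },
  // Hypothetical second document, added so the lookups return different results.
  { _id: 2, name: "Hypothetical Diner", cuisine: [{ name: "American", subtype: "TexMex" }] },
];

// One index entry per array element: cuisine name -> set of document ids.
function buildCuisineIndex(docs) {
  const index = new Map();
  for (const doc of docs) {
    for (const entry of doc.cuisine) {
      if (!index.has(entry.name)) index.set(entry.name, new Set());
      index.get(entry.name).add(doc._id);
    }
  }
  return index;
}

// Look up matching ids in the index, then fetch only those documents by id.
function findByCuisine(index, docsById, cuisineName) {
  const ids = index.get(cuisineName) || new Set();
  return [...ids].map((id) => docsById.get(id));
}

const cuisineIndex = buildCuisineIndex(restaurants);
const docsById = new Map(restaurants.map((doc) => [doc._id, doc]));

console.log(findByCuisine(cuisineIndex, docsById, "Chinese").map((d) => d.name));
console.log(findByCuisine(cuisineIndex, docsById, "American").map((d) => d.name));
```

The lookup touches only the index entry for the requested cuisine and the documents it points to, which is the behaviour the answer describes for the indexed query.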

Querying MongoDB collection with heterogeneous schema efficiently

I'm developing a web application with NodeJS, MongoDB and Mongoose. It is intended to act as an interface between the user and a big data environment. The idea is that users can execute big data processes in a separate cluster, and the results are stored in a MongoDB collection, Results. This collection may store more than 1 million documents per user.
The document schema of this collection can be completely different between users. For instance, take user1 and user2. Example documents in the Results collection for user1 and user2:
{
  user: ObjectId(user1), // reference to user1 in the Users collection
  inputFields: { variable1: 3, ... },
  outputFields: { result1: 504.75, ... }
}
{
  user: ObjectId(user2),
  inputFields: { country: "US", ... },
  outputFields: { cost: 14354.45, ... }
}
I'm implementing a search engine in the web application so that each user can filter on the fields of their own document schema (for example, user1 must be able to filter by inputFields.variable1, and user2 by outputFields.cost). Of course I know that I must use indexes, otherwise the queries are too slow.
My first attempt was to create an index for each different field in the Results collection, but that is quite inefficient: the database server becomes unstable because of the size of the indexes. So my second attempt was to reduce the number of indexes by using partial indexes, creating each index with the user id in the partialFilterExpression option.
The problem is that if another user has the same schema in the Results collection as any other user and I try to create the indexes for this user, MongoDB throws this exception:
Index with pattern: { inputFields.country: 1 } already exists with different options
It happens because two partial indexes cannot index the same fields, even when their partialFilterExpression is different.
So my questions are: How could I allow users to query their results efficiently in this environment? Is MongoDB really suitable for this use case?
Thanks

Query MongoDB from a Redis list

Suppose I keep lists of user posts in Redis: a user has, say, 1000 posts, the post documents are stored in MongoDB, but the link between the user and the posts is stored in Redis. I can retrieve the array containing all the ids of a user's posts from Redis, but what is an efficient way to retrieve the posts themselves from MongoDB?
Do I pass the array of ids to MongoDB as a parameter, and MongoDB will fetch those documents for me?
I can't seem to find any documentation on this. Any help would be appreciated!
Thanks in advance!
To retrieve a number of documents by id, you can use the $in operator to build the MongoDB query. See the following section from the documentation:
http://docs.mongodb.org/manual/reference/operator/query/in/#op._S_in
For instance you can build a query such as:
db.mycollection.find( { _id : { $in: [ id1, id2, id3, .... ] } } )
Depending on how many ids Redis returns, you may have to group them into batches of n items (n=100, for instance) and run several MongoDB queries. IMO, it is bad practice to build a single query containing more than a few thousand ids. It is better to issue smaller queries and accept the cost of the extra roundtrips.
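The batching described above could look roughly like the following sketch. It only builds the filter objects; each one would then be passed to find() on the collection. The helper names chunk and buildInFilters and the post-N id strings are assumptions for illustration, and the batch size of 100 is just the example value from the answer.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Build one { _id: { $in: [...] } } filter per batch of ids.
function buildInFilters(ids, batchSize = 100) {
  return chunk(ids, batchSize).map((batch) => ({ _id: { $in: batch } }));
}

// Placeholder ids standing in for the list fetched from Redis.
const ids = Array.from({ length: 250 }, (_, i) => `post-${i}`);
const filters = buildInFilters(ids, 100);

// Each filter would be run as its own query, e.g. db.mycollection.find(filters[0]).
console.log(filters.map((f) => f._id.$in.length)); // [ 100, 100, 50 ]
```

Note that $in does not return documents in the order of the id list, so if the Redis ordering matters the results need to be re-sorted on the client.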

ElasticSearch indexing and references to other documents

I have an ElasticSearch instance indexing a MongoDB database using the river by richardwilly98
There are two types of documents that are indexed:
documents referencing users
documents representing users
When these objects are added to MongoDB, richardwilly98's river generates something like the following:
document = {'user': {"$id": "5159a004c87126641f4f9530"}}
user_document = {'_id': "5159a004c87126641f4f9530", 'username': 'bob'}
If I perform a search for 'bob', I'd like any documents that reference the bob document to be returned. At the moment this doesn't happen, because the username field is not related to the referencing documents in any way.
Is it possible to do this? Does ElasticSearch have object references?
Thanks - let me know if I haven't been clear.
If each document belongs to no more than one user, you can index documents as children of users. Then you can use the has_parent filter to perform the search. However, if a single document can belong to more than one user, you will have to perform the search in two steps: first find the user, then issue a second search to find the documents that reference it.
Elasticsearch supports a parent field [1]. The MongoDB river supports custom mappings [2], so _parent can now be used.
[1] http://www.elasticsearch.org/guide/reference/mapping/parent-field/
[2] https://github.com/richardwilly98/elasticsearch-river-mongodb/issues/64

Design MongoDb Schema For My Social

I'm new to MongoDB and just want to create a simple project to test its performance.
The project is like a simple CMS:
it has users, blogs and comments, and users can have friends.
So I designed my database like this:
user
{
  _ID:
  name:
  birth_day:
  sex:
  friends:[id_1,Id_2]
}
blogs
{
  title:
  owner:
  tags_fiends:
  comments:
  [
    {"_id":"","content":"","date_created":""},
    {"_id":"","content":"","date_created":""},
  ],
  "like"={"_id","_id"}
}
How many collections are needed for this database? Can I use one collection for both users and blogs? Thanks in advance.
Because MongoDB is a schemaless (schema-free) database, you can build any kind of structure within a document. The following are supported:
individual elements
nested arrays
nested documents
There are a couple of things to consider during schema design that make it useful to keep users and blogs in separate collections. For example, if you store something in a nested array you can index it to speed up searches within that array (a multikey index), but a compound index can include at most one array-valued field per document. So if you store friends, blogs, posts, and tags all in arrays, you cannot cover two of those arrays with a single compound index, although each array can have its own index.
It is also important to know that there is a size limit for each document, currently 16 MB.
In your scenario, I would make Users a collection and reference it by _id from the blog collection.
In practise, you could make the Blogs an attribute of User, the only constraint being the max doc size of 16MB - but that's a lot of blogs (text).
To get round that (assuming you need to), a separate Blog collection referencing the user _id would be fine. You may need to denormalise the user name too if that's not your _id. This would mean you can get all the blogs for a user in a single query.
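A minimal sketch of that layout in plain JavaScript, assuming a hypothetical owner_name field for the denormalised user name and placeholder string ids in place of ObjectIds:

```javascript
// Build a blog document that references its owner by _id and carries a
// denormalised copy of the user's name (owner_name is an assumed field name).
function makeBlog(user, title) {
  return {
    title,
    owner: user._id, // reference to the document in the Users collection
    owner_name: user.name, // copy kept so listing blogs needs no second query
    comments: [],
  };
}

const user = { _id: "user-1", name: "bob" }; // sample values for illustration
const blogs = [makeBlog(user, "First post"), makeBlog(user, "Second post")];

// Filter that could be passed to find() on the Blogs collection
// to get all of a user's blogs in a single query.
const blogsByOwner = { owner: user._id };

console.log(blogs);
console.log(blogsByOwner);
```

The trade-off is that a change to the user's name has to be propagated to every blog document that carries the copy.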