I am currently evaluating mongodb for a project I have started but I can't find any information on what the equivalent of an SQL view in mongodb would be. What I need, that an SQL view provides, is to lump together data from different tables (collections) into a single collection.
I want nothing more than to clump some documents together and label them as a single document. Here's an example:
I have the following documents:
cc_address
us_address
billing_address
shipping_address
But in my application, I'd like to see all of my addresses and be able to manage them in a single document.
In other cases, I may just want a couple of fields from collections:
I have the following documents:
fb_contact
twitter_contact
google_contact
reddit_contact
Each of these documents has fields that align, like firstname, lastname and email, but they also have fields that don't align. I'd like to be able to compile them into a single document that only contains the fields that align.
This can be accomplished with views in SQL, correct? Can I accomplish this kind of functionality in MongoDB?
The question is quite old already. However, since mongodb v3.2 you can use $lookup in order to join data of different collections together as long as the collections are unsharded.
Since mongodb v3.4 you can also create read-only views.
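As a rough sketch of how a read-only view could cover the contacts example from the question: the pipeline below keeps only the aligned fields and appends the other collections with $unionWith. Note that $unionWith requires MongoDB 4.4 or later, which is newer than the versions mentioned above. The view name all_contacts is an assumption made up for illustration; the collection and field names come from the question.

```javascript
// Fields shared by every contact collection (taken from the question).
const sharedFields = { _id: 0, firstname: 1, lastname: 1, email: 1 };

// Collections to append to the base collection (names from the question).
const otherSources = ["twitter_contact", "google_contact", "reddit_contact"];

// Project the base collection, then union in the others,
// each projected to the same shape.
function buildContactsPipeline(fields, sources) {
  const unions = sources.map((coll) => ({
    $unionWith: { coll, pipeline: [{ $project: fields }] },
  }));
  return [{ $project: fields }, ...unions];
}

const contactsPipeline = buildContactsPipeline(sharedFields, otherSources);

// Only runs inside the mongo shell, where a global `db` exists.
// "all_contacts" is a hypothetical view name.
if (typeof db !== "undefined") {
  db.createView("all_contacts", "fb_contact", contactsPipeline);
}
```

Once created, the view can be queried like a collection, for example db.all_contacts.find({ lastname: "Smith" }).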
There are no "joins" in MongoDB. As said by JonnyHK, you can either denormalize your data, use embedded documents, or perform multiple queries.
However, you could also use Map-Reduce.
Or, if you're prepared to use the development branch, you could test the new aggregation framework, though maybe it's too much? This new framework will be in the soon-to-be-released 2.2, which is production-ready, unlike 2.1.x.
Here's the SQL-Mongo chart also, which may be of some help in your learning.
Update: Based on your re-edit, you don't need Map-Reduce or the Aggregation Framework because you're just querying.
You're essentially doing joins, querying multiple documents and merging the results. The place to do this is within your application on the client-side.
MongoDB queries never span more than a single collection as there is no support for joins. So if you have related data you need available in the results of a query you must either add that related data to the collection you're querying (i.e. denormalize your data), or make a separate query for it from another collection.
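A minimal sketch of the client-side merge described above, assuming the application has already run one query per collection and holds the results in memory. The helper names (pickShared, mergeContacts) and the sample documents are made up for illustration; the shared field names come from the question.

```javascript
// Keep only the fields that every source shares (names from the question).
const SHARED_FIELDS = ["firstname", "lastname", "email"];

// Copy the shared fields of one document and record where it came from.
function pickShared(doc, source) {
  const out = { source };
  for (const field of SHARED_FIELDS) {
    if (field in doc) out[field] = doc[field];
  }
  return out;
}

// `resultsBySource` maps a collection name to the documents fetched from it,
// one query per collection.
function mergeContacts(resultsBySource) {
  return Object.entries(resultsBySource).flatMap(([source, docs]) =>
    docs.map((doc) => pickShared(doc, source))
  );
}

// Sample data standing in for the results of two separate find() calls.
const merged = mergeContacts({
  fb_contact: [
    { firstname: "Ann", lastname: "Lee", email: "ann@example.com", fbId: 7 },
  ],
  twitter_contact: [
    { firstname: "Bob", lastname: "Ray", email: "bob@example.com", handle: "@bob" },
  ],
});
console.log(merged);
```

Fields that do not align (fbId and handle in the sample data) are dropped, and the source field keeps track of which collection each contact came from.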
"I am currently evaluating mongodb for a project I have started but I can't find any information on what the equivalent of an SQL view in mongodb would be"
In addition to this answer, mongodb now has on-demand materialized views. In a nutshell, this feature lets you use aggregate with $merge (available since 4.2) to create or update a quick-view collection that you can query faster. The strategy is to update the quick-view collection whenever a record in the main collection changes. Unlike an SQL view, this has the side effect of increasing your data storage size. But the benefits can be huge depending on your querying needs.
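A small sketch of what such a refresh could look like, assuming MongoDB 4.2 or later. The target collection name all_contacts_mv is hypothetical, and the source collection and field names are borrowed from the question above purely for illustration.

```javascript
// Hypothetical target collection that holds the precomputed "quick view".
const TARGET = "all_contacts_mv";

// Project the shared fields, then upsert the results into the target.
// $merge matches existing documents on _id by default.
function buildRefreshPipeline(target) {
  return [
    { $project: { firstname: 1, lastname: 1, email: 1 } },
    { $merge: { into: target, whenMatched: "replace", whenNotMatched: "insert" } },
  ];
}

const refreshPipeline = buildRefreshPipeline(TARGET);

// Only runs inside the mongo shell, where a global `db` exists.
if (typeof db !== "undefined") {
  db.fb_contact.aggregate(refreshPipeline);
}
```

Re-running the same aggregate after the source collection changes refreshes the target, which is where the extra storage cost mentioned above comes from.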
Related
I'm new to MongoDB, and I would like to know if there is a way to create a dynamic view inside MongoDB.
Let me be more precise:
In Mongo I have a collection with financial data, and a GUI which displays the data.
But each user can reduce the data by adding or removing columns in the grid, and by filtering the grid: a classic use case.
What I would like to do is create a collection for each user that listens to the master table, according to the user's filters, something like this:
mongo.createView(masterCollection, filters, mapReduce)
In this scenario, Mongo updates each view each time a modification is made to the master collection (update, delete, insert).
I could do something like this manually: create a table for a user, use a tailable cursor on my master collection with the user's filters and mapReduce, and then add, remove, or update the document in the user collection.
But I have up to 100 simultaneous users, and it would open and keep alive 100 tailable cursors on the primary collection. I don't know Mongo well enough, but I think it's not a best practice to do something like this.
Currently I have a thread for each user that gets data from the collection, according to the user's filters, every 5 seconds.
Could you please let me know if there is a native Mongo way to handle this scenario, or a way to create a user view in Mongo?
Thank you
Starting with MongoDB v3.4 (community and enterprise edition), there is support for creating read-only views from existing collections or other views. This feature utilises the MongoDB aggregation pipeline.
To create a view from the mongo shell, you could do:
db.createView(<view>, <source>, <pipeline>, <collation> )
Using your scenario above, you could create:
db.createView(<"user_view_name">, <"source_collection_name">, <"aggregation pipeline operators">)
See all the available Aggregation Pipeline Operators, i.e. not only can you filter, you can also modify the document structure.
See MongoDB Views behaviour for more information.
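To make the placeholders above more concrete, here is a sketch of a per-user view built from a filter and a column list. Everything specific in it is an assumption for illustration: the userSettings shape, the currency, ticker and price fields, and the view name user_42_view. Only masterCollection comes from the question.

```javascript
// Hypothetical per-user settings: which rows to keep and which columns to show.
const userSettings = {
  filter: { currency: "EUR" },
  columns: ["ticker", "price"],
};

// Turn the settings into a $match (row filter) and a $project (column list).
function buildUserViewPipeline(settings) {
  const projection = Object.fromEntries(settings.columns.map((c) => [c, 1]));
  return [{ $match: settings.filter }, { $project: projection }];
}

const userPipeline = buildUserViewPipeline(userSettings);

// Only runs inside the mongo shell, where a global `db` exists.
// "user_42_view" is a made-up view name.
if (typeof db !== "undefined") {
  db.createView("user_42_view", "masterCollection", userPipeline);
}
```

Because a view is computed from its source when it is read, changes to the master collection show up the next time the view is queried, without tailable cursors or polling threads.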
MongoDB Enterprise has a feature called 'Single View' which implements database views. It's more for aggregating data from multiple tables (e.g. parent/child relationships), and may be more than what you're looking for but is worth checking out. The downside, it's only available in the pricey Enterprise edition.
Check out the description here
I have two collections called Company_Details and Company_Ranks...Comp_ID is common in two collections. How do I merge these two collections to get complete details of a company.
Please help me
Thanks
Satyam
To make a long story short, you either do that on the client side or consider the benefits of embedding those documents.
MongoDB does not support joins, as opposed to relational databases. This is both a pro and a con. It has helped MongoDB's developers to focus on scalability which is much harder to implement when you have joins and transactions.
You can follow the DBRef specification. Lots of drivers support DBRef and do the composition seamlessly for you. You can even do that manually. But most importantly, you can take advantage of embedding documents.
Embedding documents is an ability MongoDB has over relational databases: you can create one collection consisting of compound documents. You'll enjoy atomicity, as there is no "partial success", and data locality: spinning disks are better at accessing data in sequence.
If querying is your motive and you don't want to change your schema, try Apache Drill, which allows you to query with SQL. You can then perform a full join, inner join, or whatever you want. Check the documentation for using Drill with MongoDB.
With MongoDB version 3.2 and higher we now have the $lookup aggregation stage, which is the "same" as a join in an RDBMS (more precisely, a left outer join).
With that you can easily query across your two collections and get the information you want.
For further details, check out the documentation:
https://docs.mongodb.com/manual/reference/operator/aggregation/lookup/
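For the two collections in the question, a $lookup pipeline might look roughly like this. The output field name ranks is an assumption; the collection names and Comp_ID come from the question.

```javascript
// Join each Company_Details document with its matching Company_Ranks
// documents on the shared Comp_ID field.
const companyJoinPipeline = [
  {
    $lookup: {
      from: "Company_Ranks",   // collection to join with
      localField: "Comp_ID",   // field in Company_Details
      foreignField: "Comp_ID", // field in Company_Ranks
      as: "ranks",             // hypothetical name for the joined array
    },
  },
];

// Only runs inside the mongo shell, where a global `db` exists.
if (typeof db !== "undefined") {
  printjson(db.Company_Details.aggregate(companyJoinPipeline).toArray());
}
```

Each result is a Company_Details document with an extra ranks array holding the matching Company_Ranks documents; the array is empty when there is no match.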
I have a highly normalized data model with me. Currently I'm using manual referencing by storing the _id and running sequential queries to fetch details from the deepest collection.
The referencing is one-way and the flow has around 5-6 collections. For one particular use case, I'm having to query down to the deepest collection by querying subsequent "_id" from the higher level collections. So technically I'm hitting the database every time I run a
db.collection_name.find({_id: ****}).
My prime goal is to optimize the read without hugely affecting the atomicity of the other collections. I have read about de-normalization and it does not make sense to me because I want to keep an option for changing the cardinality down the line and hence want to maintain a separate collection altogether.
I was initially thinking of using MapReduce to do an aggregation from the back and have a collection primarily for the particular use-case. But well even that does not sound that good.
In a relational db, I would be breaking the query in sub-queries and performing a join to get the data sets that intersect from the initial results. Since mongodb does not support joins, I'm having a tough time figuring anything out.
Please help if you have faced anything like this earlier or have any idea how to resolve it.
Denormalize your data.
MongoDB does not do JOINs - period.
There is no operation on the database which gets data from more than one collection. Not find(), not aggregate() and not MapReduce. When you need to puzzle your data together from more than one collection, there is no other way than doing it on the application layer. For that reason you should organize your data in a way that any common and performance-relevant query can be resolved by querying just a single collection.
In order to do that you might have to create redundancies and transitive dependencies. This is normal in MongoDB.
When this feels "dirty" to you, then you should either accept the fact that your performance will be sub-optimal or use a different kind of database, like a classic relational database or a graph database.
In an SQL database, if I wanted to access some sort of nested data, such as a list of tags or categories for each item in a table, I'd have to use some obscure form of joining in order to send the SQL query once and then only loop through the result cursor.
My question is, in a NoSQL database such as MongoDB, is it OK to query the database repeatedly such that I can do the previous task as follows:
cursor = query for all items
for each item in cursor do
tags = query for item's tags
I know that I can store the tags in an array in the item's document, but I'm assuming that it is somehow not possible to store everything inside the same document. If that is the case, would it be expensive to requery the database repeatedly or is it designed to be used that way?
No, neither in Mongo nor in any other database should you query the database in a loop. One good reason for this is performance: in most web apps the database is the bottleneck, and devs try to make as few db calls as possible, whereas here you are trying to make as many as possible.
In Mongo you can do what you want in many ways. Some of them are:
putting your tags inside the document {itemName : 'item', tags : [1, 2, 3]}
knowing the list of elements, you do not need a loop to find information about them. You can fetch all results in one query with $in : db.tags.find({ field: { $in: [<value1>, <value2>, ... <valueN> ] }})
You should always try to fulfill a request with as few queries as possible. Keep in mind that each query, even when the database can answer it entirely from cache, requires a network roundtrip between application server, database and back.
Even when you assume that both servers are in the same datacenter and only have a latency of microseconds, these latency times will add up when you query for a large number of documents.
Relational databases solve this issue with the JOIN command. But unfortunately MongoDB has no support for joins. For that reason you should try to build your documents in a way that the most common queries can be answered by a single document. That means that you should denormalize your data. When you have a 1:n relation, you should consider embedding the referencing documents as an array in the main document. Having redundancies in your database is usually not as unacceptable in MongoDB as it is in relational databases.
When you still have good reasons to keep the child-documents as separate documents, you should use a query with the $in operator to query them all at once, as Salvador Dali suggested in his answer.
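The $in approach can be sketched as follows: build one filter for all item ids, run a single query, and group the results in the application. The tags collection layout (an itemId and a name field per tag document) and the sample data are assumptions made for illustration.

```javascript
// Build one $in filter for all item ids instead of one query per item.
function buildTagsFilter(items) {
  return { itemId: { $in: items.map((item) => item._id) } };
}

// Group the fetched tag documents by the item they belong to.
function groupTagsByItem(tagDocs) {
  const byItem = new Map();
  for (const tag of tagDocs) {
    if (!byItem.has(tag.itemId)) byItem.set(tag.itemId, []);
    byItem.get(tag.itemId).push(tag.name);
  }
  return byItem;
}

// Sample data standing in for the results of the two queries.
const items = [
  { _id: 1, itemName: "pen" },
  { _id: 2, itemName: "ink" },
];
const tagDocs = [
  { itemId: 1, name: "office" },
  { itemId: 2, name: "refill" },
  { itemId: 1, name: "blue" },
];

const tagsFilter = buildTagsFilter(items); // pass to db.tags.find(tagsFilter)
const tagsByItem = groupTagsByItem(tagDocs);
```

This keeps the number of round trips at two (one for the items, one for all their tags) regardless of how many items the first query returns.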
I'm trying to use mongoDB with Morphia but still I have a problem with deleting documents. Is there any additional plugin or wrapper which works with Mongo and provides something like transactions in DBMS?
No, there are no (multi document) transactions. There are two possible solutions:
You can restructure your data into a single document instead of spreading it over multiple tables. Thus MongoDB's single document transactions (if you call them that) are enough for you. You can solve many problems with embedded entities or arrays. You might want to start a question related to "schema" design, if you're unsure how to approach this.
Your problem absolutely needs transactions across multiple documents / tables. Then MongoDB is simply not the right tool and you should use a relational database.
Don't fight the tool, pick the right one...
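As a small illustration of the first option (keeping related data in one document), the sketch below changes a status field and appends to an embedded array in a single update, which MongoDB applies atomically because only one document is involved. The orders collection and all field names are hypothetical, and the example uses the mongo shell rather than Morphia.

```javascript
// Match one order document, and only if it is still open.
const orderFilter = { _id: 1001, status: "open" };

// Both changes are applied together, to the same document.
const orderUpdate = {
  $set: { status: "paid" },
  $push: { payments: { amount: 25, method: "card" } },
};

// Only runs inside the mongo shell, where a global `db` exists.
if (typeof db !== "undefined") {
  db.orders.updateOne(orderFilter, orderUpdate);
}
```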