Sharing a document with users - mongodb

I have to choose a database for implementing a sharing system.
My system will have users and documents. I have to share a document with a few users.
Example:
There are 2 users, and there is one document.
So if I have to share that one document with both the users, I could do these possible solutions:
The current method I'm using is with MySQL (I don't want to use this):
Relational Databases (MySQL)
Users Table = user1, user2
Docs Table = doc1
Docs-User Relation Table = doc1, user1
doc1, user2
And I would like to use something like this:
NoSQL Document Stores (MongoDB)
Users Documents:
{
_id: user1,
docs_i_have_access_to: {doc1}
}
{
_id: user2,
docs_i_have_access_to: {doc1}
}
Document's Document:
{
_id: doc1
members_of_this_doc: {user1, user2}
}
And I don't yet know how I would implement in a key-value store like Redis.
So I just wanted to know, would the MongoDB way I have given above, the best solution?
And is there any other way I could implement this? Maybe with another database solution?
Should I try to implement it with Redis or not?
Which database and which method should I choose and will be the best to share the data and why?
Note: I want something highly scalable and persistent. :D
Thanks. :D

Actually, you need to represent a many-to-many relationship. One user can have several documents. One document can be shared among several users.
See my previous answer to this question: how to have relations many to many in redis
With Redis, representing relationship with the set datatype is a pretty common pattern. You can expect to get better performance than with MongoDB for this kind of data model. And as a bonus, you can easily and efficiently find which users have a given list of documents in common, or which documents are shared by a given set of users.

Considering only this simple example (you just need to keep who owns what) SQL seems to be the most appropriate, as it will give additional options for free, such as reporting who has how many docs, the most popular documents, most active user etc with almost zero cost + the data will be more consistent (no duplication, possibly foreign keys). This is valid unless you have millions of documents of course.
If I chose between document-oriented and relational DB, I'd make a decision based mostly on the structure of the document itself. Whether they're all uniform or may have different fields for different types, do you nested sub-documents or arrays with the ability to search by their contents.

Related

MongoDB and one-to-many relation

I am trying to come up with a rough design for an application we're working on. What I'd like to know is, if there is a way to directly map a one to many relation in mongo.
My schema is like this:
There are a bunch of Devices.
Each device is known by it's name/ID uniquely.
Each device, can have multiple interfaces.
These interfaces can be added by a user in the front end at any given
time.
An interface is known uniquely by it's ID, and can be associated with
only one Device.
A device can contain at least an order of 100 interfaces.
I was going through MongoDB documentation wherein they mention things relating to Embedded document vs. multiple collections. By no means am I having a detailed clarity over this as I've just started with Mongo and meteor.
Question is, what could seemingly be a better approach? Having multiple small collections or having one big embedded collection. I know this question is somewhat subjective, I just need some clarity from folks who have more expertise in this field.
Another question is, suppose I go with the embedded model, is there a way to update only a part of the document (specific to the interface alone) so that as and when itf is added, it can be inserted into the same device document?
It depends on the purpose of the application.
Big document
A good example on where you'd want a big embedded collection would be if you are not going to modify (normally) the data but you're going to query them a lot. In my application I use this for storing pre-processed trips with all the information. Therefore when someone wants to consult this trip, all the information is located in a single document. However if your query is based on a value that is embedded in a trip, inside a list this would be very slow. If that's the case I'd recommend creating another collection with a relation between both collections. Also for updating part of a document it would be slow since it would require you to fetch the whole document and then update it.
Small documents with relations
If you plan on modify the data a lot, I'd recommend you to stick to a reference to another collection. With small documents, this will allow you to update any collection quicker. If you want to model a unique relation you may consider using a unique index in mongo. This can be done using: db.members.createIndex( { "user_id": 1 }, { unique: true } ).
Therefore:
Big object: Great for querying data but slow for complex queries.
Small related collections: Great for updating but requires several queries on distinct collections.

How to model mongodb for custom user data

I'm developing a cms using MongoDb and am trying to get some modelling advice. It's multi-tenant and each tenant can create their own schema and choose what custom fields they want searchable/indexed. The only thing I'm waffling on is how to model my collections. It seems to me like it would be ideal for each tenant to have their own collection due to indexing, but I am not very experienced with MongoDb and would love to hear if that's even a valid statement or not.
I'm thinking about separating each tenant's schema definitions from their data - perhaps a customSchema and customData collection for each tenant. Maybe something like customSchema_5543e1191a85d8946f0ee6fc and customData_5543e1191a85d8946f0ee6fc? The major question here being how many collections are feasible in MongoDb. I'm not clear if there's a cap with the new WiredTiger or not. If not, would such a large number of collections have any downsides?
Or, is it better to have just two collections with all tenant's data in them, along with all of their individual indexes? What are the pros and cons of this approach?
Any thoughts or suggestions are welcome, particularly if anyone has had experience doing something like this before.
Update:
My use case is a cms where tenants can specify their own data, like in Sharepoint or Expression Engine, or most other content apis, like contentful or CloudCMS. A user can say, "I want to store Products, and each product has a Name, Description, Quantity, and a price". Another user could say, "I want to store bands, and each band has a Name, a HomeCity, and a whatever." The users would then want to retrieve and display that data on their pages however they like. It's a basic cms scenario where tenants can create their own schema, then create, edit, and retrieve entries of those schemas. Tenants would need to be able to denote which fields they can search on, so this highly customizable indexing per tenant is the primary area of focus and concern in the modelling strategy.
I'm waffling between two big collections to store schemas and data, shared by all tenants, and a pair of those collections for every tenant. I just don't know the pros and cons of each of those solutions in MongoDb. I'm also open to any ideas I haven't thought of yet :)

mongodb scheme database design

I have users table like:
{ _id: kshjfhsf098767, email: email#something name: John joshua }
{ _id: dleoireofd9888, email: email#hhh name: Terry Holdman }
And I have other collection "game"
{_id: gsgrfsdgf8898, home_user_id: kshjfhsf098767, guest_user_id: dleoireofd9888, result: "0:1"}
Then what I want is to join (like it was in mysql), game two times with users with because I know home_user_id and guest_user_id and take name email etc.
I could place all of that in table game but that will be duplicated content. and if they change name or email I need to update whole game table....
Any help on design and query to call that game with two users that are playing game would be great...Tnx
There are two ways to manage this, manually or using a DBRef. From the preceeding documentation link:
MongoDB does not support joins. In MongoDB some data is “denormalized,” or stored with related data in documents to remove the need for joins. However, in some cases it makes sense to store related information in separate documents, typically in different collections or databases.
So it is a case of mange the link yourself or use the built-in DBRef. For the DBRef case see How to query mongodb with DBRef
Alternatively, it may be easier to manage with a different schema design. For example the game collection could just store the result and game_id and instead add the game_id reference to each of the relevant users. Of course you will still need to query both collections and the linked SO question has an example of how to do this.
MongoDB has no JOINs (NoSQL).
Just do a lazy join here where by you query your user row and then query all games that user is a part of. It will be ultra fast with the right indexes and since the commands would be two small ones MongoDB would barely notice them.
I would not recommend embedding here. Taking the reason you state, for example, that will make the data a pain to update across the 100's of users that could be in a single game "room". In this case it is better to do a single atomic update even if it means you have to put a little overhead on querying another collection.

Storing secondary documents in MongoDB doc

Say User2 has these objects of data, they are stored in User2.objects as an array of objects.
User2:{objects:[{object1},{object2},{object3}]}
If anyone other then User2 needs to query this data, like User1 needs any of those objects that pertain to them. Then it should be broken out into it's own collections of in MongoDB db right?
Because say a User1 wants to find every object like that are part of. They would need to have a reference to all the User2s they created data with then look through every object in each user, and return the one they need.
I should break those objects out into their own collections? Then I can just index user ids and each user can just query once for their own id.
Sorry if this Q is confusing I'm a little lost.
It appears as though there may be some confusion between Mongo Document structure and using authentication with MongoDB.
The documentation on how to set up user authentication for a Mongo Database is here:
http://www.mongodb.org/display/DOCS/Security+and+Authentication
If User2 needs to run a query on a collection that was created by User1, then User2 must have an account with the Database where that collection resides, and must be properly authenticated.
The example document provided is also a little confusing. It is a better idea to use key names that will be the same across all documents. For example:
{userName:"user1", name:"Marc"},
{userName:"user2", name:"Jeff"},
{userName:"user3", name:"Steve"}
is preferable to
{user1:"Marc"},
{user2:"Jeff"},
{user3:"Steve"}
In the second example, the username (user1, user2, etc) will have to be known in order to find out the name of the user. MongoDB does not support wildcards in queries.
The following document structure would be preferable:
{
user: "User2",
objects:[object1,object2,object3]
},
{
user: "User1",
objects:[object1,object2,object3]
}
All of the objects created by user1 could be retrieved with the following query:
> db.<your collection name>.find({user: "User1"}, {objects:1})
For more information on the MongoDB document structure, I recommend reading the following:
http://www.mongodb.org/display/DOCS/Schema+Design - A great introduction to the way data is stored in MongoDB, including example documents, best practices, and an introduction to indexing.
Hopefully the above will put you on the right track in terms of deciding on a schema for your collection and creating users and setting permissions. Authentication is one of MongoDB's more advanced features, so I would begin by focusing on building an efficient schema and organizing your data correctly before worrying about authentication.
If you have any additional questions about these topics, or anything else MongoDB-related, the Community is here to help! Good Luck!

How would you architect a blog using a document store (such as CouchDB, Redis, MongoDB, Riak, etc)

I'm slightly embarrassed to admit it, but I'm having trouble conceptualizing how to architect data in a non-relational world. Especially given that most document/KV stores have slightly different features.
I'd like to learn from a concrete example, but I haven't been able to find anyone discussing how you would architect, for example, a blog using CouchDB/Redis/MongoDB/Riak/etc.
There are a number of questions which I think are important:
Which bits of data should be denormalised (e.g. tags probably live with the document, but what about users)
How do you link between documents?
What's the best way to create aggregate views, especially ones which require sorting (such as a blog index)
First of all I think you would want to remove redis from the list as it is a key-value store instead of a document store. Riak is also a key-value store, but you it can be a document store with library like Ripple.
In brief, to model an application with document store is to figure out:
What data you would store in its own document and have another document relate to it. If that document is going to be used by many other documents, then it would make sense to model it in its own document. You also must consider about querying the documents. If you are going to query it often, it might be a good idea to store it in its own document as you would find it hard to query over embedded document.
For example, assuming you have multiple Blog instance, a Blog and Article should be in its own document eventhough an Article may be embedded inside Blog document.
Another example is User and Role. It makes make sense to have a separate document for these. In my case I often query over user and it would be easier if it is separated as its own document.
What data you would want to store (embed) inside another document. If that document only solely belongs to one document, then it 'might' be a good option to store it inside another document.
Comments sometimes would make more sense to be embedded inside another document
{ article : { comments : [{ content: 'yada yada', timestamp: '20/11/2010' }] } }
Another caveat you would want to consider is how big the size of the embedded document will be because in mongodb, the maximum size of embedded document is 5MB.
What data should be a plain Array. e.g:
Tags would make sense to be stored as an array. { article: { tags: ['news','bar'] } }
Or if you want to store multiple ids, i.e User with multiple roles { user: { role_ids: [1,2,3]}}
This is a brief overview about modelling with document store. Good luck.
Deciding which objects should be independent and which should be embedded as part of other objects is mostly a matter of balancing read/write performance/effort - If a child object is independent, updating it means changing only one document but when reading the parent object you have only ids and need additional queries to get the data. If the child object is embedded, all the data is right there when you read the parent document, but making a change requires finding all the documents that use that object.
Linking between documents isn't much different from SQL - you store an ID which is used to find the appropriate record. The key difference is that instead of filtering the child table to find records by parent id, you have a list of child ids in the parent document. For many-many relationships you would have a list of ids on both sides rather than a table in the middle.
Query capabilities vary a lot between platforms so there isn't a clear answer for how to approach this. However as a general rule you will usually be setting up views/indexes when the document is written rather than just storing the document and running ad-hoc queries later as you would with SQL.
Ryan Bates made a screencast a couple of weeks ago about mongoid and he uses the example of a blog application: http://railscasts.com/episodes/238-mongoid this might be a good place for you to get started.