How to store a user record on another collection in Meteor - mongodb

I'm trying to find how to associate a user with another collection document in Meteor and am unsure about how to do this.
My objective is to find the most efficient and future-proof method for storing this information, although I am aware this is somewhat subjective.
In my example, I am using a "Message" collection and am storing user ids on this document as both "sender" and "recipients", recipients being an array of user ids.
When I want to display information about the sender/recipients of this message, should I use helpers to output certain data? Or add things like senderName and senderAvatar onto the document itself when it gets created? Or am I missing another way of associating a user with another object that is perhaps more efficient?
Here's a JSON example:
Option 1 - Simply storing user ids on the other object (Message)
{
"_id": "boDNs36xzLw7eLLhx",
"sender": "8jpS96b4T65g5ARug",
"recipient": "4Pa5i5vQ2gDtYQBDP",
"message": "A new message.",
"createdAt": "2015-10-22T21:18:18.291Z"
}
Option 2 - Storing more information on the document itself
{
"_id": "boDNs36xzLw7eLLhx",
"sender": "8jpS96b4T65g5ARug",
"senderName": "Joe Bloggs",
"senderAvatar": "http://myimg.com",
"recipient": "4Pa5i5vQ2gDtYQBDP",
"recipientName": "Bill Bloggs",
"recipientAvatar": "http://myotherimg.com",
"message": "A new message.",
"createdAt": "2015-10-22T21:18:18.291Z"
}

In my opinion, you should stick with the first option and fetch user data from the users collection by user id.
Two main arguments:
The second option duplicates data. If the users exchange 100 messages, the database holds 101 identical copies of strings such as "Joe Bloggs" and "Bill Bloggs".
If one of the users changes their name, it either stays stale in the messages, or you have to update every message sent or received by this user, which means a large and unnecessary database load.
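A minimal sketch of the first option in Python, with plain dicts standing in for the users and Message collections. The helper name display_fields and the name/avatar field layout are assumptions made for illustration; they are not Meteor APIs.

```python
# In-memory stand-ins for the users and messages collections (illustrative data).
users = {
    "8jpS96b4T65g5ARug": {"name": "Joe Bloggs", "avatar": "http://myimg.com"},
    "4Pa5i5vQ2gDtYQBDP": {"name": "Bill Bloggs", "avatar": "http://myotherimg.com"},
}

message = {
    "_id": "boDNs36xzLw7eLLhx",
    "sender": "8jpS96b4T65g5ARug",
    "recipient": "4Pa5i5vQ2gDtYQBDP",
    "message": "A new message.",
}


def display_fields(message, users):
    """Resolve the stored user ids to names at display time (hypothetical helper)."""
    sender = users.get(message["sender"], {})
    recipient = users.get(message["recipient"], {})
    return {
        "text": message["message"],
        "senderName": sender.get("name"),
        "recipientName": recipient.get("name"),
    }


# Renaming a user touches one record; every message picks up the change.
users["8jpS96b4T65g5ARug"]["name"] = "Joseph Bloggs"
view = display_fields(message, users)
```

Because the message only stores ids, the rename above needs no update to any message document.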

Related

What is the best way to avoid duplication in my music Library database modeling using mongodb

As a newbie in MongoDB, I tried to model a music library database. From what I have done so far, I suspect some duplication, especially with the artist entity. Suggestions on how to avoid such duplication or improve the database model will be appreciated.
{
"track_id": "1",
"Duration": " 5.00",
"title": "Andersen",
"date released": ISODate("1896-01-25"),
"Artist":
{
"Name": "Lee Jones",
"Gender": "Male"
},
"Album":
{
"Name": "star wars",
"date released": ISODate("1896-01-25"),
"Artist":
{
"Name": "Lee Jones",
"Gender": "Male"
}
}
}
In the code above I used the embedded document pattern, considering the following:
A track is made by an artist
An artist can make zero to many tracks
A track can be associated with zero or one album
An album can have one or many tracks
An album belongs to an artist
An artist can own zero to many albums
With MongoDB, how you model your data depends – entirely – on your
particular application’s data access patterns. You want to structure
your data to match the ways that your application queries and updates
it.
Of course you can reference (link) the Artist field to an Artist document and avoid data duplication, but this defeats the main purpose of a document database (ease of development and fast responses). You should favor embedding unless there is a compelling reason not to.
More info: 6 Rules of Thumb for MongoDB Schema Design

MongoDB - how to reference children/nested document _id's within parent on insert

I am very new to MongoDB, but the project I was just brought in on uses it to store message threads like this:
{
"_id": ObjectId("messageThreadId"),
"messages": [
{
"_id": ObjectId("messageId"),
"body": "Lorem ipsum..."
}, etc...],
"users": [
{
"_id": ObjectId("userId"),
"unreadMessages": ['messageId', 'messageId', etc...]
}
]
}
I need to use pymongo to insert brand new messageThreads which should (initially) contain a single message. However, I am not clear on how to construct the users.unreadMessages lists of messageIds (which should contain just the newly-created initial message). Is there a way of referencing the initial message's _id before/as it's created, from within the same document? Also worth noting that unreadMessages is a list of strings, not ObjectId()s.
Do I need to create the messageThread with the unreadMessages list empty, then go back and retrieve the initial message's _id that was just created, then update every unreadMessages in the list of users? It feels wrong to require multiple transactions for an insert, but this whole schema feels wrong to me.
As DaveStSomeWhere said, I ended up pre-generating the ObjectId and then using it in the document before insertion. This is what PyMongo does anyway when it inserts a document that has no _id; see the relevant code in pymongo.collection.insert_one(). Thanks Dave.
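A rough sketch of that approach, kept free of dependencies so it runs without a database. The helper names new_id and build_thread are hypothetical, and secrets.token_hex(12) is only a stand-in with the same 24-hex-character shape as str(ObjectId()); with PyMongo you would presumably generate the id with bson.ObjectId() and pass the finished document to insert_one().

```python
import secrets


def new_id():
    """Stand-in for str(bson.ObjectId()): 12 random bytes as 24 hex characters."""
    return secrets.token_hex(12)


def build_thread(body, user_ids):
    """Build the complete thread document client-side, ready for a single insert."""
    message_id = new_id()  # generated before the insert, so it can be referenced
    return {
        "_id": new_id(),
        "messages": [{"_id": message_id, "body": body}],
        "users": [
            # unreadMessages holds the id as a string, as the schema requires
            {"_id": user_id, "unreadMessages": [message_id]}
            for user_id in user_ids
        ],
    }


thread = build_thread("Lorem ipsum...", ["userA", "userB"])
```

Since the message id exists before the document is written, the thread, its first message, and every unreadMessages list go in with one insert instead of an insert followed by updates.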

Designing mongo 'schema' for RESTful application

I'm trying to teach myself mongo through writing an application, and I'm struggling with the best way to design the mongo 'schema' (I know it's schemaless, but that is probably the core issue with my understanding in that I'm coming from a relational background)
Anyway, the application is a Gift List manager, where a user can create a Gift List and can add Gifts they would like to receive to their list. Other users can subscribe to the list, and can mark a Gift from the Gift List as claimed/purchased. (So as to prevent the problem of getting duplicate gifts at Christmas!)
At the moment my GiftLists collection is not 'relational' and is simply a collection of GiftList documents with sub documents for the Gifts, like this:
{
"GiftLists": [
{
"_id": {
"$oid": "55e9924848c4ffd723890b48"
},
"description": "Xmas List for Some User",
"gifts": [{
"description": "Mongo book",
"claimed": false
},
{
"description": "New socks",
"claimed": false
},
{
"description": "New socks",
"claimed": false
}],
"owner": "some.user",
"subscribers": ["some.other.user", "my.friend"]
}
]
}
The idea is that some.user is the owner of the Gift List and has added 3 items he would like to receive. some.other.user has subscribed to the list and can see the Gift List and its Gifts. He may choose to buy one of the gifts, so he needs to mark it as claimed so that my.friend does not also buy it.
At the moment, each Gift in the gifts array is a sub-document without its own id, and I think this is where I'm getting stuck in my understanding/thinking.
I'm trying to provide the app functionality with a RESTful interface.
To POST a new Gift List the url is /giftList/add where the request body is the new Gift List
To GET an individual Gift List including the child Gifts the url is /giftList/<listId> - eg: /giftList/55e9924848c4ffd723890b48
With the above in mind, my natural next step is to be able to mark a Gift as claimed, perhaps with:
PUT to the url /gift/claim/<giftId>
But I don't have any ids on the Gift sub documents
So maybe my url should be:
/giftList/<listId>/claim/<giftId>
But again, I don't have an id on the Gift sub document
Or maybe I try to use the description of the item
/gift/claim/<gift description> eg: /gift/claim/Mongo+book
But what if more than one person had a Gift List containing 'Mongo book', and URL encoding the characters of the description could be messy
Or maybe I reference the Gift List
/giftList/<listId>/claim/<gift description> eg: /giftList/55e9924848c4ffd723890b48/claim/New+socks
But which instance of 'New socks' am I claiming? (after all, everyone needs lots of new socks for Christmas!)
Or maybe I reference the index of the Gift
/giftList/<listId>/claim/<gift index> eg: /giftList/55e9924848c4ffd723890b48/claim/2
But this feels fragile (as it implies that the list must always be presented in the same sequence)
To me what it really feels like is that I need another collection, just for the Gifts, where each Gift document has its own id, which I can then reference in my RESTful url. And either the Gift has a reference to its parent GiftList, or the GiftList has an array of references to the Gifts.
But this is all a very 'relational' way of thinking ... isn't it ?
What's the best way of doing this? Or, if there is no 'best' way, what are my options?
You could solve this with a new collection, or you could add a unique identifier field to each list entry. The MongoDB solution for unique identifiers is to generate an ObjectId, just like those used for the _id field of documents. Most MongoDB database drivers should expose functionality for generating ObjectIds. For details, consult the documentation of your database driver.
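A minimal sketch of the second suggestion, using a plain dict in place of a GiftList document. The random hex string is only a stand-in for a driver-generated ObjectId, and claim_gift is a hypothetical helper that a handler for PUT /giftList/<listId>/claim/<giftId> might call.

```python
import secrets

gift_list = {
    "_id": "55e9924848c4ffd723890b48",
    "gifts": [
        {"description": "Mongo book", "claimed": False},
        {"description": "New socks", "claimed": False},
        {"description": "New socks", "claimed": False},
    ],
}

# Give every sub-document its own identifier (random hex stands in for an ObjectId).
for gift in gift_list["gifts"]:
    gift["_id"] = secrets.token_hex(12)


def claim_gift(gift_list, gift_id):
    """Mark the gift with this id as claimed; False if missing or already claimed."""
    for gift in gift_list["gifts"]:
        if gift["_id"] == gift_id:
            if gift["claimed"]:
                return False
            gift["claimed"] = True
            return True
    return False


# The two "New socks" entries are now distinguishable by id, not by position.
second_socks_id = gift_list["gifts"][2]["_id"]
claimed = claim_gift(gift_list, second_socks_id)
```

The id stays valid even if the list is reordered or sorted differently for display, which is what made the index-based URL feel fragile.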

Emulating LEFT JOIN on MongoDB using MapReduce/Aggregation

I have a mongo database with few collections such as a user in the system (id, name, email) and list of projects (id, name, list of users who have access)
User
{
"_id": 1,
"name": "John",
"email": "john#domain.com"
}
{
"_id": 2,
"name": "Sam",
"email": "sam#domain.com"
}
Project
{
"_id": 1,
"name": "My Project1",
"users": [1,2]
}
{
"_id": 2,
"name": "My Project2",
"users": [2]
}
In my dashboard, I display a list of projects and the names of its users. To support names - I've changed the "users" field to now also include the name:
{
"_id": 2,
"name": "My Project2",
"users": [{"_id":2,"name":"Sam"}]
}
But on several pages, I now need to also print their email address and later on - maybe also display their image.
Since I don't want to start and embed the entire User document in each project, I'm looking for a way to do a LEFT JOIN and pick the values I need from the User collection.
Performance is NOT that important on those pages, and I would rather have an easy way to manage my data. So basically I'm looking for a way to query for a list of all projects and their associated users, with different fields taken from the original User document.
I've read about the map-reduce and aggregation option of mongo and to be honest - I'm not sure which to use and how to achieve what I'm looking for.
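When this was written, MongoDB did not support joins in any form, not even with MapReduce or the Aggregation Framework (MongoDB 3.2 later added the $lookup aggregation stage, which performs a left outer join). Without $lookup, the only way to join collections is in your application code, so just implement the LEFT JOIN logic there.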
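An application-side LEFT JOIN over the two collections from the question could look roughly like this. It is a sketch over in-memory lists; left_join_users is a hypothetical helper, and with a real driver the users list would presumably come from one query for all referenced ids (for example with the $in operator).

```python
users = [
    {"_id": 1, "name": "John", "email": "john#domain.com"},
    {"_id": 2, "name": "Sam", "email": "sam#domain.com"},
]

projects = [
    {"_id": 1, "name": "My Project1", "users": [1, 2]},
    {"_id": 2, "name": "My Project2", "users": [2]},
]


def left_join_users(projects, users, fields):
    """Replace each project's user ids with the requested user fields."""
    users_by_id = {user["_id"]: user for user in users}
    joined = []
    for project in projects:
        members = []
        for user_id in project["users"]:
            # LEFT JOIN semantics: keep the id even when no user document matches.
            member = {"_id": user_id}
            user = users_by_id.get(user_id)
            if user is not None:
                member.update({field: user.get(field) for field in fields})
            members.append(member)
        # Copy the project so the stored document keeps its plain id list.
        joined.append({**project, "users": members})
    return joined


dashboard = left_join_users(projects, users, ["name"])
detail_page = left_join_users(projects, users, ["name", "email"])
```

Each page picks the fields it needs at read time, so the Project documents can go back to storing only user ids.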

Logging file access with MongoDB

I am designing my first MongoDB (and first NoSQL) database and would like to store information about files in a collection. As part of each file document, I would like to store a log of file accesses (both reads and writes).
I was considering creating an array of log messages as part of the document:
{
"filename": "some_file_name",
"logs" : [
{ "timestamp": "2012-08-27 11:40:45", "user": "joe", "access": "read" },
{ "timestamp": "2012-08-27 11:41:01", "user": "mary", "access": "write" },
{ "timestamp": "2012-08-27 11:43:23", "user": "joe", "access": "read" }
]
}
Each log message will contain a timestamp, the type of access, and the username of the person accessing the file. I figured that this would allow very quick access to the logs for a particular file, probably the most common operation that will be performed with the logs.
I know that MongoDB has a 16Mbyte document size limit. I imagine that files that are accessed very frequently could push against this limit.
Is there a better way to design the NoSQL schema for this type of logging?
Let's first try to calculate the average size of one log record:
timestamp field name = 18, timestamp value = 8, user field name = 8, user value = 20 (I guess 10 characters is the maximum, or at least the average), access field name = 12, access value = 10. The total is 76 bytes, so a 16 MB document can hold roughly 220,000 log records.
Half of that space is used by field names. If you rename the fields to timestamp = t, user = u, access = a, you will be able to store roughly 440,000 log records.
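The arithmetic behind those two figures, worked through in Python. The per-field byte counts are the answer's own estimates, not measured BSON sizes, and the short-name case treats one-letter field names as costing nothing, which is the same approximation the answer makes.

```python
LIMIT_BYTES = 16 * 1024 * 1024  # 16 MB document size limit

# Per-record byte estimates taken from the answer above.
field_names = {"timestamp": 18, "user": 8, "access": 12}
field_values = {"timestamp": 8, "user": 20, "access": 10}

name_bytes = sum(field_names.values())    # 38 bytes spent on field names
value_bytes = sum(field_values.values())  # 38 bytes spent on values
record_bytes = name_bytes + value_bytes   # 76 bytes per log record

records_long_names = LIMIT_BYTES // record_bytes
# Approximation: one-letter names (t, u, a) are treated as free.
records_short_names = LIMIT_BYTES // value_bytes

print(records_long_names, records_short_names)  # 220752 441505
```

Both results ignore the rest of the file document and any per-element array overhead, so the real ceiling is somewhat lower.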
So I think that is enough for most systems. In my projects I always try to embed rather than create a separate collection, because that is a way to achieve good performance with MongoDB.
In the future you can move your log records into a separate collection. Also, for performance, you can keep something like the last 30 log records in the file document (simply denormalize them) for fast retrieval, in addition to the logs collection.
Also, if you go with one collection, make sure you do not load the logs when you do not need them (you can include or exclude fields in MongoDB queries). Also use $slice for paging.
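The "last 30 records in the file document, full history in a logs collection" idea can be sketched in memory as follows. The names record_access, recent_logs, and RECENT_LIMIT are hypothetical; in MongoDB itself the capped array could presumably be maintained with a $push update using the $each and $slice modifiers.

```python
from collections import deque

RECENT_LIMIT = 30  # how many recent entries stay embedded in the file document

file_doc = {
    "filename": "some_file_name",
    "recent_logs": deque(maxlen=RECENT_LIMIT),  # oldest entries fall off automatically
}
logs_collection = []  # stand-in for the separate logs collection


def record_access(file_doc, logs_collection, timestamp, user, access):
    """Write the full entry to the logs collection and mirror it in the file document."""
    entry = {"timestamp": timestamp, "user": user, "access": access}
    logs_collection.append({"filename": file_doc["filename"], **entry})
    file_doc["recent_logs"].append(entry)


for minute in range(35):
    record_access(file_doc, logs_collection, f"2012-08-27 11:{minute:02d}:00", "joe", "read")
```

Reading the file document then returns the latest entries without touching the logs collection, while the embedded array can never grow toward the document size limit.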
And one last thing: Enjoy mongo!
If you think the document size limit will become an issue, there are a few alternatives.
The obvious one is to simply create a new document for each log entry.
So you will have a collection "logs" with this schema:
{
"filename": "some_file_name",
"timestamp": "2012-08-27 11:40:45",
"user": "joe",
"access": "read"
}
A query to find which files "joe" read would be something like:
db.logs.find({user: "joe", access: "read"})