I've got a question about how I structure my data in RavenDB. Like most people, I'm coming from a relational database background, and it feels slightly like I'm having to re-program my brain :).
Anyway, I have a utility which looks like this:
{
  "Name": "Gas",
  "Calendars": [
    {
      "Name": "EFA"
    },
    {
      "Name": "Calendar"
    }
  ]
}
And I have a contract. Whilst creating the contract I need to first pick a utility type. Then based upon that I need to pick a Calendar type.
For example, I would pick Gas and then I would pick EFA. My question is: how should I store this information against the contract object? It almost feels like each of my calendars should have an id, but I'm guessing this is wrong? Or should I just be storing the text values?
Any advice on the correct way to do this would be appreciated.
You can have internal objects with ids in RavenDB, but those ids are application-managed, not managed by RavenDB.
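Since a calendar only exists inside its utility, storing the chosen names directly on the contract (alongside a reference to the utility document) is usually enough. A minimal sketch of that idea in plain JavaScript; the contract shape and the helper name are my own assumptions, not from the question:

```javascript
// Hypothetical contract document builder: we keep a reference to the
// utility document and denormalize the chosen names for display, so
// rendering the contract needs no extra lookups.
function buildContract(utilityId, utility, calendarName) {
  // Validate that the chosen calendar actually exists on the utility.
  const calendar = utility.Calendars.find(c => c.Name === calendarName);
  if (!calendar) {
    throw new Error("Unknown calendar: " + calendarName);
  }
  return {
    UtilityId: utilityId,        // reference back to the utility doc
    UtilityName: utility.Name,   // denormalized text value
    CalendarName: calendar.Name  // denormalized text value
  };
}

const gas = { Name: "Gas", Calendars: [{ Name: "EFA" }, { Name: "Calendar" }] };
const contract = buildContract("utilities/1", gas, "EFA");
```

If calendar names can change, an application-managed id per calendar (as the answer notes) would replace the name here.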
I am more used to MySQL, but I decided to go with MongoDB for this project.
Basically it's a social network.
I have a posts collection where documents currently look like this:
{
  "text": "Some post...",
  "user": "3j219dj21h18skd2" // User's "_id"
}
I am looking to implement a replies system. Will it be better to simply embed an array of replies, like so:
{
  "text": "Some post...",
  "user": "3j219dj21h18skd2", // User's "_id"
  "replies": [
    {
      "user": "3j219dj200928smd81",
      "text": "Nice one!"
    },
    {
      "user": "3j219dj2321md81zb3",
      "text": "Wow, this is amazing!"
    }
  ]
}
Or will it be better to have a whole separate "replies" collection with a unique ID for each reply, and then "link" to it by ID in the posts collection?
I am not sure, but feels like the 1st way is more "NoSQL-like", and the 2nd way is the way I would go for MySQL.
Any inputs are welcome.
This is a typical data modeling question in MongoDB. Since you are planning to store just the _id of each replying user, the answer is definitely to embed the replies, because they are part of the post object.
If those replies can number in the hundreds or thousands and you are not going to show them by default (for example, you are going to have the users click to load those comments) then it would make more sense to store the replies in a separate collection.
Finally, if you need to store more than the user's _id (such as the name), you have to think about maintaining the name in two places (here and in the user maintenance page), as you are duplicating data. This can be manageable or too much work; you have to decide.
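If you do duplicate the name into each embedded reply, the maintenance step is roughly the following. This is an in-memory sketch of the update logic only (with the real driver it would be a single update over the posts collection); the function and field names are made up:

```javascript
// Propagate a username change into every embedded reply that
// references the user. With MongoDB itself this would be one
// updateMany over the posts collection rather than a loop.
function renameUserInReplies(posts, userId, newName) {
  for (const post of posts) {
    for (const reply of post.replies || []) {
      if (reply.user === userId) {
        reply.name = newName; // the duplicated field being maintained
      }
    }
  }
  return posts;
}

const posts = [{
  text: "Some post...",
  user: "u1",
  replies: [
    { user: "u2", name: "alice", text: "Nice one!" },
    { user: "u3", name: "bob", text: "Wow, this is amazing!" }
  ]
}];
renameUserInReplies(posts, "u2", "alicia");
```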
I'm new to Meteor and Mongo. Really digging both, but I want to get feedback on something. I am porting an app I made with Django over to Meteor, and I want to handle certain kinds of relations in a way that makes sense in Meteor. That said, I am more used to thinking about things in a Postgres way. So here goes.
Let's say I have three related collections: Locations, Beverages and Inventories. For this question though, I will only focus on the Locations and the Inventories. Here are the models as I've currently defined them:
Location:
  _id: "someID"
  beverages:
    _id: "someID"
    fillTo: "87"
    name: "Beer"
    orderWhen: "87"
    startUnits: "87"
  name: "Second"
  number: "102"
  organization: "The Second One"
Inventories:
  _id: "someID"
  beverages:
    0: Object
      name: "Diet Coke"
      units: "88"
  location: "someID"
  timestamp: 1397622495615
  user_id: "someID"
But here is my dilemma: I often need to retrieve one or many Inventories documents and render the "fillTo", "orderWhen" and "startUnits" per beverage. Doing things the MongoDB way, it looks like I should actually be embedding these properties as I store each Inventory. But that feels really non-DRY (and dirty).
On the other hand, it seems like a lot of effort & querying to render a table for each Inventory taken. I would need to go get each Inventory, then lookup "fillTo", "orderWhen" and "startUnits" per beverage per location then render these in a table (I'm not even sure how I'd do that well).
TIA for the feedback!
If you only need this for rendering purposes (i.e. no further queries), then you can use the transform hook like this:
var myAwesomeCursor = Inventories.find({/* selector */}, {
  transform: function (doc) {
    _.each(doc.beverages, function (bev) {
      // use whatever method you want to receive these data,
      // possibly from some cache or even another collection
      // bev.fillTo = ...
      // bev.orderWhen = ...
      // bev.startUnits = ...
    });
    return doc;
  }
});
Now the myAwesomeCursor can be passed to each helper, and you're done.
In your case you might find denormalizing the inventories so they are a property of locations could be the best option, especially since they are a one-to-many relationship. In MongoDB and several other document databases, denormalizing is often preferred because it requires fewer queries and updates. As you've noticed, joins are not supported and must be done manually. As apendua mentions, Meteor's transform callback is probably the best place for the joins to happen.
However, the inventories may contain many beverage records and could cause the location records to grow too large over time. I highly recommend reading this page in the MongoDB docs (and the rest of the docs, of course). Essentially, this is a complex decision that could eventually have important performance implications for your application. Both normalized and denormalized data models are valid options in MongoDB, and both have their pros and cons.
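If you keep the collections normalized, the manual join is mostly a matter of building a lookup table once per render. A rough sketch in plain JavaScript, following the models shown in the question (the function name is made up):

```javascript
// Build a map of beverage settings from a location document, then
// merge those settings into each inventory's beverage entries.
function joinInventoriesToLocation(location, inventories) {
  const settingsByName = {};
  for (const bev of location.beverages) {
    settingsByName[bev.name] = {
      fillTo: bev.fillTo,
      orderWhen: bev.orderWhen,
      startUnits: bev.startUnits
    };
  }
  return inventories.map(inv => ({
    ...inv,
    beverages: inv.beverages.map(bev => ({
      ...bev,
      ...(settingsByName[bev.name] || {}) // attach settings when known
    }))
  }));
}

const location = {
  _id: "loc1",
  beverages: [{ name: "Beer", fillTo: "87", orderWhen: "87", startUnits: "87" }]
};
const inventories = [
  { _id: "inv1", location: "loc1", beverages: [{ name: "Beer", units: "12" }] }
];
const joined = joinInventoriesToLocation(location, inventories);
```

The same merge could live inside the transform callback from the earlier answer.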
I'm approaching the noSQL world.
I studied a little bit around the web (not the best way to study!) and I read the MongoDB documentation.
Around the web I wasn't able to find a real-world case example (only fancy flights over big architectures that are not well explained, or examples too basic to be real-world).
So I still have some huge holes in my understanding of NoSQL and MongoDB.
I'll try to summarise one of them, the worst one actually, below:
Let's imagine the data structure for a post of a simple blog structure:
{
  "_id": ObjectId(),
  "title": "Title here",
  "body": "text of the post here",
  "date": ISODate("2010-09-24"),
  "author": "author_of_the_post_name",
  "comments": [
    {
      "author": "comment_author_name",
      "text": "comment text",
      "date": ISODate("date")
    },
    {
      "author": "comment_author_name2",
      "text": "comment text",
      "date": ISODate("date")
    },
    ...
  ]
}
So far so good.
All works fine if the author_of_the_post does not change his name (not considering profile picture and description).
The same for all comment_authors.
So if I want to consider this situation I have to use relationships:
"authorID": <author_of_the_post_id>,
for post's author and
"authorID": <comment_author_id>,
for comments authors.
But MongoDB does not allow joins when querying. So there will be a different query for each authorID.
So what happens if I have 100 comments on my blog post?
1 query for the post
1 query to retrieve the post author's information
100 queries to retrieve the comment authors' information
**total of 102 queries!!!**
Am I right?
Where is the advantage of using a noSQL here?
In my understanding that's 102 queries vs. 1 bigger query using joins.
Or am I missing something and there is a different way to model this situation?
Thanks for your contribution!
Have you seen this?
http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
It sounds like what you are doing is NOT a good use case for NoSQL. Use a relational database for the basic data storage backing your application; use NoSQL for caching and the like.
NoSQL databases are well suited to storing non-sensitive data, for instance posts and comments.
You are able to retrieve all the data with one query. For example: don't worry about outdated fields such as author_name or profile_picture_url, because it's just a post, and in the future this post will not be as visible as newer ones. But if you do want those fields kept up to date, you have two options:
The first option is to use some kind of worker service. If a user changes his username or profile picture, you send a signal to your service to traverse all posts and comments and update those fields with his new username.
The second option is to use authorId instead of the author name; then instead of 2 queries you will make N+2 queries to fetch each comment author's profile. But use pagination: instead of querying for 100 comments, take 10 and show a "load more" button/link, so you will make 12 queries.
Hope this helps.
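It's also worth noting that those N author lookups don't have to be N separate round trips: collecting the distinct authorIDs from the embedded comments and fetching them with a single $in query brings the total down to a handful of queries. An in-memory sketch of that batching step (the helper name is made up):

```javascript
// Collect the unique author ids from a post and its embedded comments
// so they can be fetched in one query, e.g. { _id: { $in: ids } }.
function collectAuthorIds(post) {
  const ids = new Set([post.authorID]);
  for (const comment of post.comments) {
    ids.add(comment.authorID);
  }
  return [...ids];
}

const post = {
  authorID: "a1",
  comments: [
    { authorID: "a2", text: "first" },
    { authorID: "a3", text: "second" },
    { authorID: "a2", text: "third" } // duplicate author, fetched once
  ]
};
const ids = collectAuthorIds(post);
```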
I have a Share collection which stores a document for every time a user has shared a Link in my application. The schema looks like this:
{
  "userId": String,
  "linkId": String,
  "dateCreated": Date
}
In my application I am making requests for these documents, but my application requires that the information referenced by the userId and linkId properties is fully resolved/populated/joined (not sure on the terminology) in order to display the information as needed. Thus, every request for a Share document results in a lookup for the corresponding User and Link documents. Furthermore, each Link has a parent Feed document which must also be looked up. This means I have some spaghetti-like code to perform each find operation in series (3 in total). Yet, the application only needs some of the data found in these calls (one or two properties). That said, the application does need the entire Link document.
This is very slow, and I am wondering whether I should just be replicating the data in the Share document itself. In my head, this is fine because most of the data will not change, but some of it might (i.e. a User's username). This suggests a Share schema design like so:
{
  "userId": String,
  "user": {
    "username": String,
    "name": String
  },
  "linkId": String,
  "link": {}, // all of the `Link` data
  "feed": {
    "title": String
  },
  "dateCreated": Date
}
What is the consensus on optimising data for the application with regards to this? Do you recommend that I replicate the data and write some glue code to ensure the replicated username gets updated if it changes (for example), or can you recommend a better solution (with details on why)? My other worry about replicating data in this manner is, what if I needed more data in the Share document further down the line?
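For what it's worth, the "glue code" can be fairly small if username is the only mutable field being copied. An in-memory sketch of the propagation step (collection and field names follow the schema above; the function name is made up, and with MongoDB this would be one updateMany with a $set):

```javascript
// When a user renames, rewrite the denormalized copy in every Share
// document that references them.
function propagateUsername(shares, userId, newUsername) {
  let updated = 0;
  for (const share of shares) {
    if (share.userId === userId) {
      share.user.username = newUsername;
      updated++;
    }
  }
  return updated; // number of Share documents touched
}

const shares = [
  { userId: "u1", user: { username: "old", name: "Ann" } },
  { userId: "u2", user: { username: "bob", name: "Bob" } }
];
const touched = propagateUsername(shares, "u1", "new");
```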
As a Mongo/NoSQL newbie with an RDBMS background, I wondered what's the best way to proceed.
Currently I've got a large set of documents containing, in some fields, what I consider "reference data".
My need is to display in a search interface a summary of the possible values of those "reference fields", to further filter my document set.
Let's take a very simple and stupid example about nourishment.
Here is an extract of some mongo documents:
{ "_id": 1, "name": "apple", "category": "fruit"}
{ "_id": 2, "name": "orange", "category": "fruit"}
{ "_id": 3, "name": "cucumber", "category": "vegetable"}
In the application I'd like to have a selectbox displaying all the possible values for "category". Here it would display "fruit" and "vegetable".
What's the best way to proceed?
extract the data from the existing documents?
create some reference documents listing the unique possible values (as I would do in an RDBMS)?
store the reference data in an RDBMS and programmatically link Mongo and the RDBMS?
something else?
The first option is the easiest to implement and should be efficient if you have indexes properly set (see distinct command), so I would go with this.
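In the mongo shell the first option is essentially a one-liner, e.g. db.foods.distinct("category") (the collection name here is assumed). The equivalent logic, sketched in plain JavaScript over the sample documents above:

```javascript
// Distinct category values, as db.foods.distinct("category") would
// return them (MongoDB does not guarantee order; sorted here).
function distinctCategories(docs) {
  return [...new Set(docs.map(d => d.category))].sort();
}

const docs = [
  { _id: 1, name: "apple", category: "fruit" },
  { _id: 2, name: "orange", category: "fruit" },
  { _id: 3, name: "cucumber", category: "vegetable" }
];
const categories = distinctCategories(docs);
```

With an index on "category", the server-side distinct can be answered from the index alone, which is why it stays efficient.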
You could also choose the second option (linking to a reference collection - RDBMS way) which trades performance (you will need more queries for fetching data) for space (you will need less space). Also, this option is preferred if the category is used in other collections as well.
I would advise against using a mixed system (NoSQL + RDBMS) in this case as the other options are better.
You could also store category values directly in application code - depends on your use case. Sometimes it makes sense, although any RDBMS fanatic would burst into tears (or worse) if you tell him that. YMMV. ;)