I come from an SQL background, where grasping the possible relationships between different models and schemas seems to be quite straightforward to me.
How can I carry the same thinking over to the MEAN world? For example, let's assume I have a basic blog engine with a posts table and a comments table, where posts have many comments and each comment belongs to a post. While coding this is easy in, say, Rails, I'm stuck here and can't find good tutorials.
Also, I'm not sure whether adding authors to the party makes it any more complicated. Let's say posts and comments each have an author, and an author has many comments and many posts (once I get this, I think highlighting "OP" comments is just a matter of a query).
Can you give me a guideline regarding the differences between what I've been used to in Rails and the approach I need now?
You are used to thinking in terms of normalization. NoSQL databases let you model your data as structured documents, meaning you can denormalize it. This has advantages like data locality and atomicity, but it can suffer from redundancy and inconsistency.
An example would be embedding the comments inside each post. Thus, you don't have several collections / tables, and can access your data swiftly.
I advise you to read the book MongoDB Applied Design Patterns to better understand the benefits you would gain.
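As a rough illustration of the embedding idea, here is a minimal sketch in plain Python dicts rather than a real MongoDB driver. The field names (`author`, `comments`, `body`) and the helper `op_comments` are assumptions made up for this example, not anything the question or a library prescribes.

```python
# Hypothetical blog post document with its comments embedded.
# All field names here are assumptions chosen for illustration.
post = {
    "_id": 1,
    "title": "Hello MEAN",
    "author": {"_id": 10, "name": "alice"},
    "comments": [
        {"author": {"_id": 11, "name": "bob"}, "body": "Nice post!"},
        {"author": {"_id": 10, "name": "alice"}, "body": "Thanks, Bob."},
    ],
}


def op_comments(post):
    """Return the comments written by the post's own author (the "OP")."""
    op_id = post["author"]["_id"]
    return [c for c in post["comments"] if c["author"]["_id"] == op_id]


op_bodies = [c["body"] for c in op_comments(post)]
print(op_bodies)
```

Everything needed to render the post lives in one document, which is the data locality mentioned above. The cost is the redundancy: the author's name is copied into every post and comment, so renaming an author means updating every copy.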
I am planning the REST architecture of my current project and I want to know which design is better. My data consists of users, topics and posts.
So if my use case requires getting all posts for a specific topic, how should I design it?
I know two approaches:
http://restip/api/topics/:id/posts
http://restip/api/posts/?topicId
So which one is recommended and are there better solutions?
The first approach is better. This link explains all the conventions (based on Ruby on Rails conventions):
http://microformats.org/wiki/rest/urls
It is easier to read: the topic with the given id has posts.
That's really just a matter of taste. Just be consistent everywhere in your API.
In my opinion, though, your second approach is good. It is better to keep it simple by preferring "shallow" URLs (with fewer levels) when possible:
/topics/:topicId
/posts/:postId
/posts/?topic=:topicId
You can add fancy URL rewrite rules later, if needed, so that for example:
/topic/:topicId/posts maps to /posts/?topic=:topicId
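To make the rewrite idea concrete, here is a minimal sketch of such a mapping using only Python's standard `re` module. The function name `rewrite` and the pattern `NESTED` are made up for this example; in practice this kind of rule would usually live in the web server or framework router, and a real version would also need to URL-encode the id.

```python
import re

# Pattern for the nested form from the answer: /topic/:topicId/posts
NESTED = re.compile(r"^/topic/(?P<topic_id>[^/]+)/posts/?$")


def rewrite(path):
    """Map a nested URL onto the shallow one; leave other paths unchanged."""
    match = NESTED.match(path)
    if match is None:
        return path
    return "/posts/?topic=" + match.group("topic_id")


print(rewrite("/topic/42/posts"))
print(rewrite("/posts/7"))
```

The shallow URLs stay the canonical ones, and the nested form is only an alias layered on top.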
This might sound like a dumb question, but I have recently been learning about BigTable.
Would someone please tell me the advantage of using BigTable over NoSQL databases? As far as I can see, both are semi-structured data stores. Some people mention that BigTable has a much simpler interface than a NoSQL database, but I don't quite understand how. Also, is there a way I can try out BigTable's APIs?
Also, does BigTable have web interfaces? If yes, can I get links to them as well?
BigTable is Google's system for storing large amounts of structured data. It doesn't create relations between records, since that doesn't benefit the architecture of Google's applications. This philosophy of "unrelated" data instances is the basic idea of NoSQL. So, long story short, BigTable is a NoSQL database, and NoSQL is the general concept (just like RDBMS is the general concept behind MySQL, MSSQL and others).
An open-source implementation modeled on BigTable, HBase, was built on top of Hadoop and is widely used across many industries. Another related project is Storm, which aims to work faster when serving real-time data.
Regarding NoSQL databases, you should take a good look at HBase and Cassandra, and if you are coming from the RDBMS world, MongoDB would be the best choice to start getting a feel for NoSQL.
Be sure to take a good look at Google's notes on BigTable.
Cheers!
I'm new to MongoDB. Can anyone explain how it could be used efficiently in enterprise applications, so as to give good performance (using joins, indexing etc.)?
And perhaps also point me to any MongoDB production applications on the web.
For a good introduction to MongoDB, check out The Little MongoDB Book. Here's a list of sites currently using MongoDB in production.
You talk about joins and indexes. It seems your head is still in the RDBMS world. NoSQL and Mongo are not just different relational databases; they are a completely different way of managing and thinking about data. You need to think of your data schema in terms of structured objects rather than rows.
Sounds like you need to start from the beginning. MongoDB.org has a lot of the info you're asking for already available. Specifically, read their page on use cases, and the page on production deployments.
A more specific question would receive more comprehensive answers and fewer downvotes.
I have read a lot about MongoDB.
I like all the features it provides, but I wonder if it's possible to have it as the only database for my application, including storing sensitive information.
I know that it compromises the durability part of ACID, but as a solution I will have 1 master and 2 slaves in different locations.
If I do that, is it possible to use it as the primary database, storing everything?
UPDATE:
Let's put it this way.
I really need a document store rather than a traditional DBMS to be able to create my flexible application. But is MongoDB reliable enough to store sensitive customer information if I have multiple database replicas and a master-slave setup? Because as far as I know, one major downside is that it compromises the D in ACID. So I solve it with multiple databases.
With that setup, are there no major problems left, such as data loss?
And someone told me that with MongoDB a customer could be billed twice. Could someone shed light on this?
Yes, MongoDB can work as an application's only data store. Use replication and make sure you know about safe writes and the w (write concern) option.
"And someone told me that with MongoDB a customer could be billed twice. Could someone shed light on this?"
I'm guessing whoever told you that was talking about eventual consistency. Some NoSQL databases are eventually consistent, meaning that you could have conflicting writes. MongoDB is not, though; it is strongly consistent (just like a relational database).
Whether your application is flexible has absolutely nothing to do with whether you use "nosql", a "document db" or a proper RDBMS. Nobody using your application will care either way.
If you need flexibility while coding, you should look into frameworks, like ActiveRecord for Ruby, which can make interfacing with the DB much simpler, more generic and more powerful. At that level, you can gain a lot more than by just changing the DB, and you can even become DB-agnostic, meaning you can change the DB without changing any code. Indeed, I have found that ActiveRecord boosts my productivity many times over by relieving me of tedious and error-prone "code intermixed with SQL".
Indeed, if you feel you need a schemaless database for critical data, you are doing something wrong. You are putting your own convenience over the critical needs of the project, or ignorantly assuming you won't run into problems later. Sooner or later, lack of consistency will bite your ass, hard!
I feel you are hesitant towards RDBMSs because you are not really that comfortable with all the jargon, syntax and sound CS principles.
Believe me, if you're going to create an application of any value, you are a hundred times better off learning SQL, ACID and good database principles in the first place. Just read up on the basics, and build your knowledge from wherever you are now. It's the same for each and every one of us: it takes time, but you learn to do things right from the start.
Low-level tools like MongoDB and its equivalents just provide you with infinitely more ammunition to shoot yourself in the foot with. They make it seem easy. In reality, however, they leave the hard work for you, the coder, to deal with, while an RDBMS will handle more of the cruft for you once you grok the basics.
Why use computers at all? If you want more work, you can just go back to paper. Design will be a breeze, and implementation can be super-quick. Done. Except it won't be right, of course.
In the real world, we can't afford to ignore consistency, database design and many more sound CS principles. Which is why it's a great idea to study them in the first place, and keep learning more and more.
Don't buy into the hype. You ask a question about MongoDB here, and add that you really need its features. With 25 years of computer experience, I simply don't buy it. If you think creatively, an RDBMS can be made to do whatever you want it to, or a framework can be used to save you from errors and wasted time.
Crafting ACID properties onto MongoDB seems like more work to me and, from experience, sounds like an exercise in futility compared with using what is already designed for such purposes.
Is it possible? Sure. Why not? It's possible to store everything as XML in text files if you wanted to.
Is it the best idea? It depends on what you're trying to do, the rest of your system architecture, the scale your site is intended to achieve, and so on.
The queries that I create frequently have 7-8 joins to retrieve data. Is this many joins normal in a real database application, or is my database design poor? I am curious because if the database has to do so much work on each request, won't it die if a few thousand clients connect?
In my opinion it's inevitable in some cases, the key is to have the correct indexes for the queries you're doing. With a deep object graph in ORM, or perhaps one with joined subclasses, it'd be easy to go over the 7-8 joins you talk of. I'm keen to hear what everyone else has to say about it :)
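To show what "the correct indexes" can look like in practice, here is a small sketch using Python's built-in `sqlite3` module. The two-table schema and the index name `idx_comments_post_id` are assumptions made up for this example; the point is only that the join column carries an index and that the query plan can be inspected to confirm the index is used.

```python
import sqlite3

# In-memory database with a hypothetical two-table schema.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
    -- Index on the column used in the join condition.
    CREATE INDEX idx_comments_post_id ON comments (post_id);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second');
    INSERT INTO comments VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
    """
)

query = """
    SELECT p.title, c.body
    FROM posts AS p
    JOIN comments AS c ON c.post_id = p.id
    WHERE p.id = ?
"""

rows = conn.execute(query + " ORDER BY c.id", (1,)).fetchall()

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
plan = [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + query, (1,))]
uses_index = any("idx_comments_post_id" in step for step in plan)

print(rows)
print(plan)
print(uses_index)
```

The exact plan text varies between SQLite versions, and other database engines have their own equivalents of `EXPLAIN`, but the habit carries over: for each join in a 7-8 join query, check that the plan searches an index rather than scanning the whole table.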
It's not possible to draw a conclusion here without details of the application logic.
If your application logic leads to unavoidable joins in order to maintain integrity, that's not a problem, and your database platform must handle it.
That is a lot of joins. It's hard to say without seeing your schema, but I have seen cases where people have gone nuts making a schema overcomplicated. I remember one application I worked on where every address and phone number in the system was treated as an entity, and queries frequently involved joins of over a dozen tables. You should be careful when making a schema to distinguish between those things you care about individually tracking and everything else, otherwise you can end up with needless complexity.