Is MongoDB reliable? [closed] - mongodb

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I am developing a dating website and I am thinking of using a NoSQL database to store the profiles etc. I am currently looking into MongoDB and so far I am very pleased. My only worry is that I have read on various websites that MongoDB is unreliable and not good.
I looked into the NoSQL alternatives and found none that fully meets my specific criteria:
Easy to learn and use.
Fully compatible with PHP out of the box.
Fast and well documented.
What do you think, am I doing the right thing to go with MongoDB or is it a waste of time?
Thankful for all input in the matter!

I researched MongoDB for my social service startup and it is definitely worth considering. MongoDB has a powerful feature set that makes it a realistic and strong alternative to RDBMS solutions.
Amongst them:
Document Database: Most of your data is embedded in a document, so in order to get the data about a person, you don't have to join several tables. Thus, better performance for many use cases.
Strong Query Language: Despite not being an RDBMS, MongoDB has a very strong query language that allows you to get something very specific or very general from a document or documents. The DB is queried using JavaScript, so you can do many more things besides querying (e.g. functions, calculations).
Sharding & Replication: Sharding allows your application to scale horizontally rather than vertically. In other words, more small servers instead of one huge server. And replication gives you fail-over safety in several configurations (e.g. master/slave).
Powerful Indexing: I originally got interested in MongoDB because it allows geo-spatial indexing out of the box but it has many other indexing configurations as well.
Cross-Platform: MongoDB has many drivers.
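To make the document-model point concrete, here is a minimal sketch in plain Python, with no MongoDB driver involved. The profile fields and the `get_path` and `matches` helpers are hypothetical and chosen only for illustration. `matches` imitates the flavor of a dotted-path equality filter; it is not MongoDB's actual query engine.

```python
# A dating-site profile stored as one document: everything needed to
# render the profile lives together, so no joins are required.
# All field names here are made up for illustration.
profile = {
    "username": "alice",
    "age": 29,
    "location": {"city": "Stockholm", "country": "SE"},
    "interests": ["hiking", "jazz"],
    "photos": [{"url": "/img/alice-1.jpg", "primary": True}],
}


def get_path(document, dotted_path):
    """Follow a dotted path such as 'location.city' into nested dicts."""
    value = document
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def matches(document, criteria):
    """Return True if every dotted path in criteria equals the given value.

    For list fields, a scalar criterion matches if it is one of the elements.
    """
    for path, expected in criteria.items():
        actual = get_path(document, path)
        if isinstance(actual, list) and not isinstance(expected, list):
            if expected not in actual:
                return False
        elif actual != expected:
            return False
    return True


profiles = [
    profile,
    {"username": "bob", "age": 35,
     "location": {"city": "Oslo", "country": "NO"},
     "interests": ["jazz"], "photos": []},
]

in_stockholm = [p["username"] for p in profiles
                if matches(p, {"location.city": "Stockholm"})]
jazz_fans = [p["username"] for p in profiles
             if matches(p, {"interests": "jazz"})]
```

Both lookups read a single document per profile. In a normalized relational schema the same data would typically be spread over profile, location, interest and photo tables.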
As for the documentation, there is no deluge of it, but that is because this project only started in 2009; soon there will be plenty more. However, there is enough to get started with your project. In addition to that, you can check out Kyle Banker's MongoDB in Action, a great resource.
Lastly, I had experience only with RDBMS prior to MongoDB, didn't know JavaScript or JSON, and still found it to be very simple and elegant.

Consider this related question on MongoDB and CouchDB - Fit for Production?
MongoDB has a showcase of Production Deployments as well. Be sure to analyze the uses of MongoDB rather than the size of the company.

Any software can be reliable or unreliable. MongoDB has replica sets, which give you hardware failover capabilities. You can take backups on a regular basis, which gives you a recovery interval, and you get sharding which can give you some modicum of redundancy, especially when combined with replica sets.
The issue isn't whether or not the technology is reliable, the issue is whether or not you have a well-defined backup and recovery plan that suits your platform of choice.
If MongoDB suits your needs, you're making the right choice. Just make sure to investigate what you can do to increase your reliability.

If it's good enough for Foursquare, it's most likely good enough for you.

I come from an RDBMS background (12 years) and have spent the last 6 months looking at NoSQL options. For your scenario, MongoDB sounds like a good choice. What I am hearing from those who have worked with MongoDB in production for some time is that you should follow these best practices:
Keep key size small
Evaluate (and possibly add) indexes to speed up queries
Pay attention to schema (I know this seems strange for a 'schemaless' database, but I have heard it several times)
Use replica sets
Here's a video of a best-practices talk from the MongoDB LA User Group that I find useful
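The "keep key size small" advice follows from field names being stored inside every document rather than once per table. The rough sketch below estimates the effect using compact JSON as a stand-in for the on-disk format, which is an assumption: real BSON sizes differ, but the per-document cost of long names shows up the same way. The field names are hypothetical.

```python
import json


def total_size(documents):
    """Total size in bytes of the documents when serialized as compact JSON."""
    return sum(
        len(json.dumps(d, separators=(",", ":")).encode("utf-8"))
        for d in documents
    )


# The same data twice: once with descriptive field names, once abbreviated.
long_keys = [
    {"user_identifier": i, "last_login_timestamp": 1280000000 + i}
    for i in range(1000)
]
short_keys = [{"uid": i, "ll": 1280000000 + i} for i in range(1000)]

saved = total_size(long_keys) - total_size(short_keys)
saved_per_document = saved / len(long_keys)  # 30 bytes of key names per document
```

The saving grows linearly with the number of documents, so it only matters for large collections. Short names also cost readability, which is a trade-off to weigh rather than a rule.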

10gen, the company behind MongoDB, provides an official PHP driver.
As Jeremiah says, they implemented replica sets in the latest version (1.6.0) and have already debugged them (1.6.1, with the next version, 1.6.2, due in a few weeks).
Moreover, the free support from the company and the community is very fast and efficient (by 'free' I mean questions on the Google group: http://groups.google.com/group/mongodb-user?pli=1)

Well, a few more points about reliability:
The community reacts extremely fast if you hit a critical issue.
You need to be clear about what you expect from "reliability": do you need a guarantee that your data is stored safely and never gets corrupted?
In that case, you will have to compare the cost of buying reliable hardware with the cost of deploying MongoDB replica sets.
Or do you mean having a highly available service?
MongoDB has some teething problems, I can't deny that. But it is definitely NOT a waste of time, and it may well be a long-term solution.

That depends on what you need reliability for. Mongo is very reliable for reading: it has strong availability and sharding features.
OTOH, Mongo writes are not reliable. While most go through, there is no guarantee that a given update succeeded, and you have to query the database manually to check whether it did.
Thus, Mongo is best used when you have many more reads than writes that absolutely need to succeed.
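The "write, then query to check" pattern described above can be sketched without any database at all. `FlakyStore` below is a hypothetical in-memory stand-in for a store whose writes give no success signal; it is not MongoDB driver code, and the retry policy is only one possible choice.

```python
class FlakyStore:
    """Hypothetical in-memory store whose writes return nothing and may be lost.

    The first `drop_first` writes are silently dropped to simulate failures.
    """

    def __init__(self, drop_first=0):
        self._data = {}
        self._writes_to_drop = drop_first

    def write(self, key, value):
        # Fire-and-forget: the caller gets no success or failure signal.
        if self._writes_to_drop > 0:
            self._writes_to_drop -= 1
            return
        self._data[key] = value

    def read(self, key):
        return self._data.get(key)


def verified_write(store, key, value, max_attempts=3):
    """Write, then read back to confirm; retry up to max_attempts times.

    Returns the number of attempts used, or raises if the value never lands.
    """
    for attempt in range(1, max_attempts + 1):
        store.write(key, value)
        if store.read(key) == value:
            return attempt
    raise RuntimeError(
        f"write of {key!r} not confirmed after {max_attempts} attempts"
    )


store = FlakyStore(drop_first=2)
attempts_used = verified_write(store, "profile:42", {"name": "alice"})
```

The read-back doubles the number of round trips per write, which is why this kind of check is usually reserved for the writes that really must not be lost.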

MongoDB would be a good choice. We evaluated MongoDB and started using it for our business use cases. MongoDB is giving us better performance than Oracle, and it is also easy to scale horizontally.

Related

Using PostgreSQL or PostgreSQL + MongoDB?

I'm currently planning a social-media application - especially the backend.
Basically I have all the social aspects, for which I want to use SQL (PostgreSQL, I guess), but I also have geolocations organized in lists (so many-to-one), which will probably make up the biggest amount of data. I know that PostgreSQL has modules for GIS capabilities, and my initial thought was to just use PostgreSQL for everything, for the sake of simplicity and because performance of geolocation searches should be about the same for both systems, if not even in favor of PostgreSQL. I can also use the JSON type in PostgreSQL, so it basically has the most obvious advantages of MongoDB covered.
On the other hand, I'm afraid of scalability problems, as the geolocations are going to be the biggest chunk of data and the tables are probably going to have heaps of rows.
So my thought now is to implement geolocations in MongoDB, with its easy scalability and easy-to-use geolocation search, and to embed e.g. comments/likes for a geolocation directly into the document. That would make geolocation reads/searches much easier, but then I would have to combine this data with social data from SQL, e.g. fetch all users that commented on a geolocation, get their profile info from PostgreSQL and 'manually' combine it. Parts of this could be done on the frontend, though, saving me a lot of resources.
I'm not sure how good this idea performs and if I'm really doing myself a favor there.
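To show what the 'manual' combination would involve, here is a small sketch in plain Python. The document shape, the field names and the sample rows are all hypothetical. The two lists stand in for results that would come from MongoDB and PostgreSQL respectively.

```python
# Geolocation documents as they might come back from a document store,
# with comments embedded. All field names are hypothetical.
geolocations = [
    {"_id": "geo1", "name": "Harbour cafe", "coords": [18.07, 59.33],
     "comments": [{"user_id": 1, "text": "Great view"},
                  {"user_id": 3, "text": "Too crowded"}]},
    {"_id": "geo2", "name": "City park", "coords": [18.05, 59.34],
     "comments": [{"user_id": 1, "text": "Nice for running"}]},
]

# User rows as they might come back from a SQL query such as
# SELECT id, name FROM users WHERE id IN (...).
user_rows = [(1, "alice"), (2, "bob"), (3, "carol")]


def commenter_ids(documents):
    """Collect the distinct user ids that appear in embedded comments."""
    return sorted({c["user_id"] for d in documents for c in d["comments"]})


def attach_profiles(documents, rows):
    """Return copies of the documents with each comment's author name filled in."""
    names = dict(rows)
    enriched = []
    for doc in documents:
        comments = [{**c, "user_name": names.get(c["user_id"])}
                    for c in doc["comments"]]
        enriched.append({**doc, "comments": comments})
    return enriched


ids_to_fetch = commenter_ids(geolocations)  # ids to pass to the SQL query
combined = attach_profiles(geolocations, user_rows)
```

Every read that needs profile data now costs two queries plus this merge step, and nothing keeps the user ids in the documents consistent with the SQL table. That is the price of splitting the data across two systems.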
tldr: Use PostgreSQL.
Long answer:
You are trying to pre-optimize for a problem you don't even know if you will have. You don't know how many geolocations you will have, what the usage behaviors will be of your users and you probably don't even have any users yet.
I've used MongoDB before and migrated to PostgreSQL. There are many, many features and benefits to using a 'real' database for storing highly structured data. I suggest googling around for 'PostgreSQL vs X' articles, but the overall consensus that I've found is that PGSQL is extremely mature, reliable, performant and supported.
From my personal experience using Mongo then switching to PGSQL, I will never use Mongo again unless PGSQL (or another full-fledged SQL database) is completely falling over and I've spent months fixing it. Even then I'd take a hard look at other NoSQL databases too. PGSQL has so many amazing features and powerful tools that make it a joy to use.
For the seemingly few things you think you need Mongo for, PGSQL can do, and do just as well or better. It has native JSON types with indexes, geo support, full text indexing, etc. PGSQL has been around longer and has more support (useful for debugging, performance tuning, etc).
Regardless of which technologies you are thinking of using, you can't make any sort of informed decision if you don't:
Test with large data sets
and
Know your usage patterns and data volumes
So at this point I'd pick the more mature and powerful tool and set up monitoring for it. Watch the usage and performance of PGSQL, see how it holds up. Research best practices for PGSQL. Get to know it, learn it, dive in deep. When it comes to scaling individual services, each one is somewhat unique and will not fit a simple "Should I use X or Y?" question.
Good luck!

Best suited NoSQL database for Content Recommender

I am currently working on a project which includes migrating a content recommender from MySQL to a NoSQL database for performance reasons. Our team has been evaluating some alternatives like MongoDB, CouchDB, HBase and Cassandra. The idea is to choose a database that is capable of running on a single server or in a cluster.
So far we have discarded HBase due to its dependency on a distributed environment. Even though we plan to scale horizontally, we need to run the DB on a single server for a little while in production. MongoDB was also discarded because it does not support map/reduce features.
We still have two alternatives and no solid background on which to decide. Any guidance or help is appreciated.
NOTE: I do not intend to start a religious-style discussion with unfounded arguments. This is a strictly technical question, to be discussed in the context of the problem.
Graph databases are usually considered best suited for recommendation engines, since a lot of recommendation algorithms are actually graph based. I recommend looking into Neo4j: it can handle billions of nodes/edges on a single machine, and it supports a so-called high availability mode, which is a master-slave setup with automatic master selection.
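As a rough illustration of why recommendation maps naturally onto a graph, the sketch below walks a tiny "user likes item" graph held in plain Python sets. It is a simplified stand-in for a graph traversal. It uses no Neo4j API, and the scoring rule and sample data are assumptions made for the example.

```python
from collections import Counter

# Bipartite "user likes item" graph as adjacency sets. Sample data is made up.
likes = {
    "alice": {"article_a", "article_b"},
    "bob": {"article_a", "article_c"},
    "carol": {"article_b", "article_c", "article_d"},
}


def recommend(user, likes, limit=3):
    """Rank items liked by users who share at least one liked item with `user`.

    The score of an item is the number of paths
    user -> shared item -> other user -> item that reach it.
    """
    own_items = likes[user]
    scores = Counter()
    for other, other_items in likes.items():
        if other == user:
            continue
        shared = len(own_items & other_items)
        if shared == 0:
            continue
        for item in other_items - own_items:
            scores[item] += shared
    # Sort by score descending, then by name for a stable order.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:limit]]


suggestions = recommend("alice", likes)
```

A graph database expresses the same walk as a traversal over stored edges, which avoids scanning every user the way this loop does.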

Something about MongoDB

I'm new to MongoDB. Can anyone explain how it could be used efficiently in enterprise applications, so as to give good performance (using joins, indexing etc.)?
And perhaps also point me to any MongoDB production applications on the web.
For a good introduction to MongoDB, check out The Little MongoDB Book. Here's a list of sites currently using MongoDB in production.
You talk about joins and indexes. It seems your head is still in the RDBMS world. NoSQL and Mongo are not just different relational databases; they're a completely different way of managing and thinking about data. You need to think of your data schema in terms of structured objects rather than rows.
Sounds like you need to start from the beginning. MongoDB.org has a lot of the info you're asking for already available. Specifically, read their page on use cases, and the page on production deployments.
A more specific question would receive more comprehensive answers and fewer downvotes.

When to use NoSql, and which one? [closed]

Closed 11 years ago.
I've been programming with PHP and MySQL for a while now and recently decided that I wanted to give NoSQL a try. I would really appreciate it if some of you with experience could tell me:
When is it a good time to switch, how do I know nosql is for me?
Which nosql software would you recommend?
Thanks
When is it a good time to switch?
It really depends on the particular project. But in general I find that I can use NoSQL for 95% of web applications. I will still use good old SQL for systems which must guarantee ACID (for example, systems that work with 'real' money).
How do I know nosql is for me?
It's for you, for you, believe me. ;)
You just need to try something from the nosql world, read some existing articles and you will see all of the benefits and problems.
Which nosql software would you recommend?
I would personally recommend that you start with MongoDB, because it is really simple. Becoming an expert in SQL takes years; becoming an expert in MongoDB takes a month or so.
I suggest that you spend an hour reading "The Little MongoDB Book" and try to write your first test application starting tomorrow.
No one here will say that you need to use this, or this database. What database to use depends on project and requirements.
It depends on your application's needs. There are a lot of options.
You can use a document-oriented database like MongoDB, an "extended" key-value store like Redis, or maybe a graph-oriented database like Neo4j.
This article is very useful http://highscalability.com/blog/2010/12/6/what-the-heck-are-you-actually-using-nosql-for.html
This recent blog post in High Scalability pretty much answers your question in regards to when to use NoSQL.
I myself always go with MySQL until it fails me and then choose the right tool for the job. Some of the non-relational databases I have worked with are:
Riak: a Dynamo clone which is useful when you need to access records quickly but you have too many records to keep on one machine. For instance, in a recommender system for users of a web application, you want to access the data in a few milliseconds but you have 200M users.
MongoDB: a document-based database, which I used when I needed speed for a write-intensive application (read/write ratio close to 1:1). The data was highly transient, so I didn't care about the durability issues.
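For the "too many records to keep on one machine" case, Dynamo-style stores spread keys over nodes with consistent hashing. The following is a minimal generic sketch of that idea, not Riak's actual implementation. The node names, the `HashRing` class and the number of virtual nodes are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer using MD5 (stable across runs)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._points = []  # sorted list of (hash, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        """Place `vnodes` points for this node on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._points, (_hash(f"{node}#{i}"), node))

    def node_for(self, key):
        """Return the node owning `key`: the first ring point at or after its hash."""
        index = bisect.bisect_left(self._points, (_hash(key), ""))
        if index == len(self._points):
            index = 0  # wrap around the ring
        return self._points[index][1]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"user:{n}" for n in range(1000)]
before = {k: ring.node_for(k) for k in keys}

# Growing the cluster: only keys claimed by the new node change owner.
ring.add_node("node-d")
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

Compared with a plain `hash(key) % number_of_nodes` scheme, adding a node here relocates only the keys that the new node takes over instead of reshuffling almost everything.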
The best time to switch is when you:
Start working on a new project and you make your first architectural decisions. Porting an existing application to a different database can cause a lot of headaches.
Hit a brick wall... or better before you see one coming :) The main reason is usually lack of performance or scalability.
Need a missing feature (e.g. complex hierarchical structures, graph-like traversals, etc.).
I would recommend a lot of them, but each of them has their own sweet-spots where they shine and other parts where they lack features. The only way to pick the right tool(s) is to get familiar with a couple of them.
Web developers usually learn key-value stores (memcached, redis) first as they can fix a lot of performance problems (but also add some complexity to your app...).
There are document (schema-less) databases like MongoDB or CouchDB which can significantly enhance your productivity if your data model often changes.
For graph traversals there is Neo4j.
For "big data" there is Hadoop and its related projects.
And the list goes on and on...

NoSQL best practices [closed]

Closed 10 years ago.
What are the best practices for NoSQL Databases, OODBs or whatever other acronyms may exist for them?
For example, I've often seen a field "type" being used for deciding how the DB document (in couchDB/mongoDB terms) should be interpreted by the client, the application.
Where applicable, use PHP as a reference language. Read: I'm also interested in how such data can be best handled on the client side, not only strictly the DB structure. This means practically that I'm also looking for patterns like "ORM"s for SQL DBs (active record, data mapper, etc).
Don't hesitate making statements about how such a DB and the new features of PHP 5.3 could best work together.
I think that currently, the whole idea of NoSQL data stores and the concept of document databases is so new and different from the established ideas which drive relational storage that there are currently very few (if any) best practices.
We know at this point that the rules for storing your data within say CouchDB (or any other document database) are rather different to those for a relational one. For example, it is pretty much a fact that normalisation and aiming for 3NF is not something one should strive for. One of the common examples would be that of a simple blog.
In a relational store, you'd have a table each for "Posts", "Comments" and "Authors". Each Author would have many Posts, and each Post would have many Comments. This is a model which works well enough, and maps fine over any relational DB. However, storing the same data within a docDB would most likely be rather different. You'd probably have something like a collection of Post documents, each of which would have its own Author and collection of Comments embedded right in. Of course that's probably not the only way you could do it, and it is somewhat a compromise (now querying for a single post is fast - you only do one operation and get everything back), but you have no way of maintaining the relationship between authors and posts (since it all becomes part of the post document).
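The blog example can be sketched in a few lines of plain Python. The rows, the field names and the `to_post_documents` helper are hypothetical. The point is only to show the shape of the embedded documents and the duplication that comes with them.

```python
# Relational-style rows (hypothetical sample data).
authors = {1: {"name": "Ann"}, 2: {"name": "Ben"}}
posts = [
    {"id": 10, "author_id": 1, "title": "Hello NoSQL"},
    {"id": 11, "author_id": 1, "title": "Second post"},
]
comments = [
    {"post_id": 10, "author_id": 2, "text": "Nice intro"},
    {"post_id": 10, "author_id": 1, "text": "Thanks!"},
]


def to_post_documents(posts, authors, comments):
    """Build one self-contained document per post, embedding author and comments."""
    documents = []
    for post in posts:
        documents.append({
            "_id": post["id"],
            "title": post["title"],
            # The author is copied into every post, so it is duplicated.
            "author": dict(authors[post["author_id"]]),
            "comments": [
                {"author": authors[c["author_id"]]["name"], "text": c["text"]}
                for c in comments if c["post_id"] == post["id"]
            ],
        })
    return documents


post_documents = to_post_documents(posts, authors, comments)

# Reading one post is a single lookup with everything included...
first = post_documents[0]
# ...but renaming an author now means touching every post that embeds them.
posts_to_update = [d["_id"] for d in post_documents
                   if d["author"]["name"] == "Ann"]
```

Whether that duplication is acceptable depends on how often author data changes compared with how often posts are read.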
I too have seen examples making use of a "type" attribute (in a CouchDB example). Sure, that sounds like a viable approach. Is it the best one? I haven't got a clue. Certainly in MongoDB you'd use separate collections within a database, making the type attribute total nonsense. In CouchDB though... perhaps that is best. The other alternatives? Separate databases for each type of document? This seems a bit loopy, so I'd lean towards the "type" solution myself. But that's just me. Perhaps there's something better.
I realise I've rambled on quite a bit here and said very little, most likely nothing you didn't already know. My point is this though: I think it's up to us to experiment with the tools we've got and the data we're working with, and over time the good ideas will spread and become the best practices. I just think you're asking a little too early in the game.
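One way the "type" attribute could be used on the client side is as a key into a registry of classes, in the spirit of a data mapper. The sketch below is in Python rather than PHP. The `Post` and `Comment` classes, the registry and the `hydrate` helper are hypothetical names, and the same pattern should carry over to PHP classes.

```python
from dataclasses import dataclass, fields


@dataclass
class Post:
    title: str
    body: str


@dataclass
class Comment:
    post_id: str
    text: str


# Registry from the document's "type" value to the class that models it.
DOCUMENT_TYPES = {"post": Post, "comment": Comment}


def hydrate(document):
    """Turn a raw document (a dict) into a typed object based on its "type" field."""
    doc_type = document.get("type")
    if doc_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document type: {doc_type!r}")
    cls = DOCUMENT_TYPES[doc_type]
    # Pass through only the fields the class declares; ignore ids, revisions, etc.
    values = {f.name: document[f.name] for f in fields(cls)}
    return cls(**values)


raw_documents = [
    {"_id": "p1", "type": "post", "title": "Hello", "body": "First post"},
    {"_id": "c1", "type": "comment", "post_id": "p1", "text": "Welcome"},
]
objects = [hydrate(d) for d in raw_documents]
```

With separate collections per kind of document, the collection name can play the role of the registry key and the "type" field becomes unnecessary.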
"NoSQL" should be more about building the datastore to follow your application requirements, not about building the app to follow a certain structure -- that's more like a traditional SQL approach.
Don't abandon a relational database "just because"; only do it if your app really needs to.