Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Is it cheaper to build your own infrastructure on AWS and run MongoDB on it, or to use AWS's managed DynamoDB offering, which guarantees certain characteristics?
Depends on your use case:
DynamoDB: relatively cheap if your read/write volume is not massive. Storage is really cheap ($0.25 per GB per month). You also gain the ability to scale without having to maintain backups, replicas, etc., as you would with your own self-managed MongoDB.
Regarding features, and this is where your data model matters, take into consideration that the search capabilities are not as powerful as MongoDB's; however, if you are imaginative, you will be able to adapt your model to them. Also, in a couple of weeks a new feature (Global Secondary Indexes) will be added to DynamoDB, so it will be possible to search by fields other than the hash key and the range key, and without projections (something that is currently available only with Secondary Indexes).
MongoDB: it depends. You have to remember that you will have to maintain the infrastructure (servers), backups, etc. Also, the cheap AWS instances do not have much memory or SSD storage, so you will have to move to bigger instances, which have a high monthly price. There is another possibility: use MongoDB provided by a SaaS such as MongoHQ, but again, that will be expensive.
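As a rough illustration of the trade-off, here is a back-of-envelope cost sketch. Only the $0.25/GB-month DynamoDB storage figure comes from the answer above; the throughput prices, instance price, and replica count are invented for illustration and are not real AWS prices.

```python
# Toy monthly-cost comparison. All prices except the $0.25/GB-month
# DynamoDB storage figure are illustrative assumptions, not AWS quotes.

def dynamodb_monthly_cost(storage_gb, read_units, write_units,
                          gb_price=0.25, read_unit_price=0.09,
                          write_unit_price=0.47):
    """Managed service: pay for storage plus provisioned throughput."""
    return (storage_gb * gb_price
            + read_units * read_unit_price
            + write_units * write_unit_price)

def self_hosted_mongodb_cost(instance_monthly_price, replica_count,
                             storage_gb, ebs_gb_price=0.10):
    """Self-hosted: pay per instance (times replicas) plus disk."""
    return replica_count * (instance_monthly_price + storage_gb * ebs_gb_price)

# With light traffic the managed service is cheaper; a replica set of
# mid-size instances costs a fixed amount whether or not you use it.
light = dynamodb_monthly_cost(storage_gb=10, read_units=25, write_units=25)
hosted = self_hosted_mongodb_cost(instance_monthly_price=70,
                                  replica_count=3, storage_gb=10)
```

The crossover point moves as traffic grows: heavy provisioned throughput can make DynamoDB the more expensive side, which is why the answer above says it depends on your use case.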
Closed 5 years ago.
I have to decide on a NoSQL database for a web app that should keep track of user input and update the corresponding record as frequently as possible. To give an idea of the frequency: let's say a blank record is generated on start, and it should be updated on every key-up event coming from the user.
The methods I have seen for this kind of work are:
Write-ahead logging/journaling for the user data (not the internal data-consistency mechanisms such as MongoDB's journaling or CouchDB's write-ahead log): I don't know whether this is even implemented for user data, or whether the existing mechanisms can be used for this purpose.
Versioning in MongoDB, or a less implicit cell-versioning way of doing it in Cassandra.
I tended toward Cassandra at the beginning, but I want to know which methods best fit this kind of scenario.
In Cassandra, frequent updates to a cell can (but need not) lead to problems with compaction (more specifically, when updated data is flushed from memtables to SSTables because of too many concurrent updates).
If you do not need this data persisted, an in-memory solution (or one used in addition to a database) could help; I have used Hazelcast for this.
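The append-only logging idea asked about above can be sketched in plain Python: each key-up event is appended as an immutable entry, the current record is reconstructed by replay, and the log is periodically compacted into a snapshot, much as a database checkpoints its journal. The class and field names are made up for illustration.

```python
import time

class UserInputLog:
    """Append-only log of user edits (a WAL-like pattern, in memory).

    Each key-up event is appended as an immutable entry; the current
    record is rebuilt by replaying the log, and compact() collapses the
    log into a snapshot so replay stays cheap.
    """

    def __init__(self):
        self.entries = []

    def append(self, field, value):
        # never overwrite in place: always append a new entry
        self.entries.append({"ts": time.time(), "field": field, "value": value})

    def current_state(self):
        # replay the log in order; later entries win
        state = {}
        for e in self.entries:
            state[e["field"]] = e["value"]
        return state

    def compact(self):
        # checkpoint: keep only one entry per field
        snapshot = self.current_state()
        self.entries = [{"ts": time.time(), "field": k, "value": v}
                        for k, v in snapshot.items()]
        return snapshot
```

A real implementation would append to durable storage (or to a store like Hazelcast, as suggested above) rather than a Python list, but the read-by-replay and compaction mechanics are the same.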
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
Which database is a suitable choice to store an address book with billions of rows (name, email address, phone number, etc. )?
The application will be very read-intensive (>99%) and needs to be highly available and consistent, with servers distributed worldwide.
The query will be on either email address or phone number.
I am currently considering HBase, Cassandra or MongoDB.
Since MongoDB has features like replication (geographically redundant, too) that make it highly available, MongoDB would be a good alternative. It also provides the ability to configure read preferences on the data replicas. Please refer to the following link to decide which DB to use based on your business requirements.
https://lh5.googleusercontent.com/c_vcKz-Jo3XmIHutpOtJxBoysMt_Ny_PL-0cB4Czh4FvIbTEpe9lObaA6sTwsdHJdrtMXqOBNCNoRxYQYnIlu9MxuYIMWcl5dgUSCADFAfOXWuyWRgKWFk99Pg
Cassandra might be a good choice for that. It has support for multiple data centers so for worldwide support you can set up a few DC's around the world to reduce latency by having clients access the nearest data center.
For fast lookups based on email address and phone number you'd probably store the data denormalized in two tables, with one table using email as the primary key and another table using phone number as the primary key.
You should be able to get the read performance you want by adding more nodes, since read performance would scale with the number of nodes you had in each data center.
Now if you want to do ad hoc queries of this data based on fields other than the primary key, then Cassandra would not be a good choice.
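The denormalized two-table pattern described above can be sketched with plain Python dicts standing in for the two Cassandra tables; the contact names and values are made up for illustration.

```python
# Two "tables" holding the same rows, each keyed for one query pattern.
contacts_by_email = {}   # table with email address as the primary key
contacts_by_phone = {}   # same rows, keyed by phone number

def upsert_contact(name, email, phone):
    row = {"name": name, "email": email, "phone": phone}
    # every write goes to both tables: storage is traded for O(1) lookups
    contacts_by_email[email] = row
    contacts_by_phone[phone] = row

upsert_contact("Ada", "ada@example.com", "+1-555-0100")
```

In Cassandra this would be two CQL tables written together on each insert; the application (or a batch statement) keeps them in sync, which is exactly the redundancy the answer above describes.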
I think you should go with MongoDB. It is a document database and supports replication and sharding.
Closed 8 years ago.
I already have a .Net Web project running on MongoDB where I store some news/feed data.
After a while I needed a faster way to track "who shared what" and how to find relationships based on that information.
Then I came up with an idea to use graphDB to track related feeds and users.
Since the system is already running on MongoDB, I am thinking of leaving the data in Mongo and creating the graph representation in Neo4J for applying a graph search.
I do not want to migrate all my data to Neo4j, because many people tell me MongoDB's I/O performance is way better than Neo4j's, and they also point out MongoDB's sharding feature.
What would you suggest in this situation?
And If I follow my idea, will it be a good practice?
Personally, I think there is no single answer or best practice here. Polyglot persistence systems are common practice.
That said, everything depends on your context, and there are points we simply can't answer for you:
How much time do you have? (Learning a new technology is not a matter of days before you can use it in production and sleep well.)
How much money can you invest in the project? Sharding is, AFAIK, a Neo4j Enterprise feature, and licenses have a cost unless your project is open source. There are also hosting costs for running Neo4j in cluster mode.
How much data? As long as your graph fits in memory, you will not run into I/O issues.
Now, these points aside: yes, as a first step you can try to map Neo4j on top of MongoDB.
Maybe do incremental migrations, and at the end of the process ask yourself the following question: why do you need MongoDB to handle graph structures?
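The polyglot split discussed above can be sketched in memory: feed documents live in a document store (MongoDB in the question), while only ids and "shared" relationships are mirrored into a graph (Neo4j in the question). All names and the dict/set stand-ins are illustrative.

```python
# Stand-ins for the two stores: full documents in one, relationships in the other.
documents = {}   # plays the role of the MongoDB collection (full payloads)
shared = {}      # plays the role of the graph: user -> set of shared feed ids

def save_feed(feed_id, payload):
    # heavy document body stays in the document store only
    documents[feed_id] = payload

def record_share(user, feed_id):
    # only the lightweight relationship is mirrored into the graph side
    shared.setdefault(user, set()).add(feed_id)

def users_who_shared(feed_id):
    # graph-side traversal; full documents are fetched back by id if needed
    return sorted(u for u, feeds in shared.items() if feed_id in feeds)

save_feed("f1", {"title": "news"})
record_share("alice", "f1")
record_share("bob", "f1")
```

The design choice this illustrates: the graph holds only ids and edges, so it stays small enough to fit in memory, while MongoDB keeps doing what it is already good at, serving the documents.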
Closed 8 years ago.
I am going to start learning NoSQL databases in practice (I have already done my research and understood the concepts and modeling approaches).
I am working with time-series data, and both Cassandra and MongoDB are recommended for this use case.
I would like to know which one takes less time to learn? (Unfortunately, I don't have much time to spend on learning)
PS: I noticed that there are more tutorials and documentation for MongoDB (am I correct?)
Thank you!
Having used them both extensively, I can say that the learning curve isn't as bad as you might think. But as different people learn and absorb information at different rates, it is difficult to say which you will find easier or how quickly you will pick them up. I will mention that MongoDB has a reputation of being very developer-friendly, due to the fact that you can write code and store data without having to define a schema.
Cassandra has a slightly steeper learning curve (IMO). However, that has been lessened by the CQL table-based column families in recent versions, which help bridge the understanding gap between Cassandra and a relational database. Since tutorials for MongoDB have been mentioned, I will post a link to DataStax's Academy, which offers free online courses you can take to introduce yourself to Cassandra. Specifically, the course DS220 deals with modeling your data.
With both, a key concept to understand is that you must break yourself of the relational-database habit of building your tables/collections to store your data most efficiently. In non-relational data modeling you model your tables/collections to reflect how you want to query your data, and this may mean storing redundant data in some places.
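The query-first modeling idea above can be made concrete with a toy example: the same events are written to two "tables", one per query the application needs, instead of one normalized table queried with joins. The event names and keys are invented for illustration.

```python
# One copy of the data per query path, instead of one normalized table.
events_by_user = {}   # answers: "what did this user do?"
events_by_day = {}    # answers: "what happened on this day?"

def record(user, day, event):
    # the same fact is written twice, on purpose: redundancy buys fast reads
    events_by_user.setdefault(user, []).append((day, event))
    events_by_day.setdefault(day, []).append((user, event))

def events_for_user(user):
    return events_by_user.get(user, [])

def events_on(day):
    return events_by_day.get(day, [])

record("alice", "2015-06-01", "login")
record("bob", "2015-06-01", "signup")
```

In Cassandra these would be two tables with different partition keys; in MongoDB, two collections or embedded duplicates. Either way, the schema is derived from the queries, not the other way around.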
qMongoFront is a Qt-based MongoDB GUI application for Linux. It's free and open source. If you want to learn MongoDB, qMongoFront is a good choice.
http://sourceforge.net/projects/qmongofront/
Closed 8 years ago.
I am trying to learn NoSQL, and am implementing it in a project I am working on now as a way to pick it up. I understand there are no hard rules around it, but I'd be happy to read about some of the following:
Guidelines on how to structure a NoSQL document.
Moving from RDBMS thinking to NoSQL thinking.
Differences between storing data in NoSQL and in an RDBMS.
Thanks!
I do have previous experience in RDBMS, and have been working with them for years.
Every concept will require learning a new way of thinking. Your question is too general for a specific answer.
You will structure and work with CouchDB documents differently than with MongoDB documents. In CouchDB you do queries with MapReduce; in MongoDB you have a flexible query interface similar to an RDBMS's.
A key-value store requires a completely new way of thinking. You have to know your query patterns before you can structure your content the right way. You have no indexes, so you have to build your own structures.
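The "build your own structures" point above can be sketched with a dict standing in for the key-value store: since the store only answers get-by-key, the application maintains its own secondary index and keeps it in sync on every write. The field names are made up for illustration.

```python
# The bare KV store answers only get-by-key; the app owns the index.
kv = {}                 # the key-value store itself: user_id -> record
index_by_city = {}      # hand-maintained secondary index: city -> set of user_ids

def put_user(user_id, record):
    # on overwrite, first remove the key from the old index entry
    old = kv.get(user_id)
    if old is not None:
        index_by_city[old["city"]].discard(user_id)
    kv[user_id] = record
    index_by_city.setdefault(record["city"], set()).add(user_id)

def users_in(city):
    # the query the raw KV store cannot answer on its own
    return sorted(index_by_city.get(city, set()))

put_user("u1", {"name": "Ada", "city": "London"})
put_user("u2", {"name": "Bob", "city": "Paris"})
```

In a real key-value store the index would itself be stored under keys (e.g. one entry per city), and keeping it consistent with the primary records across failures is exactly the hard part the answer alludes to.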
One blog that offers a lot of NoSQL information is http://nosql.mypopescu.com
Update
The Riak people pose some interesting questions too:
Will my access pattern be read-heavy, write-heavy, or balanced?
Which datasets churn the most? Which ones require more sophisticated conflict resolution?
How will I find this particular type of data? Which method is most efficient?
How independent/interrelated is this type of data with this other type of data? Do they belong together?
How much will I need to do online queries on this data? How quickly do I need them to return results?