Cassandra vs MongoDB running costs?

We are planning to create a public website and are in the process of choosing a suitable database for it. After some discussion, it was suggested that we go with a NoSQL database, as it would be easier to scale in the future.
We expect regular writes and a lot of reads on the website, and it seems that either Cassandra or MongoDB would suit it best.
Please suggest which of Cassandra and MongoDB would be easier to host and maintain, and cheaper in hosting charges.
Also, please suggest some providers that offer good, low-cost hosting for both Cassandra and MongoDB.

Related

Which backend stack for developing an e-commerce mobile app?

I am learning Flutter, which is a really good technology for building mobile apps. My project is an e-commerce app, but I am confused about which route to take for the back end. I read about Firebase and how good it is, but I also read that its Realtime Database is limited to 200 concurrent users, so it would not be a good option. My research led me to Express.js (Node.js) + SQL or MongoDB, or Laravel + SQL. Keep in mind that I have good experience with the MEN stack (Express with MongoDB), but I hear that MongoDB is too expensive to host. I also need to know whether a free MongoDB cluster would be good enough for my e-commerce app.
If you have good experience with Express, you should probably go with it. If you are not comfortable with MongoDB, especially a paid version, then go with SQL. It will be a bit more complex than MongoDB, though, and it will take extra time to manually manage the CRM database for your e-commerce application.
On the other hand MySQL has no limit on the number of databases. The underlying file system may have a limit on the number of directories.
MySQL has no limit on the number of tables. The underlying file system may have a limit on the number of files that represent tables. Individual storage engines may impose engine-specific constraints. InnoDB permits up to 4 billion tables. This will give SQL a plus point at this stage.
Laravel is also a good technology, but I am not very familiar with it, so I can't recommend it.

Horizontally Scaling Database Guide

We want to horizontally scale our existing MongoDB database, which is running on one server. Because of our growing user base, we can't scale it vertically anymore, so we need to scale it horizontally through sharding.
MongoDB provides a good tutorial on setting up sharding, but we need to do it quickly and we are not experts in this.
There seem to be multiple options available, such as Google Cloud and Amazon RDS. All we want is to keep using our database but have the sharding handled by another service.
So my questions are:
1. Is it possible to build a fail-safe cluster architecture in less than a week using MongoDB sharding, with a team that has no prior experience in this?
2. If not, do services like Google Cloud SQL and Amazon RDS provide a mechanism to use our database with their sharding service?
Can anyone with expertise in this point me in the right direction?
I tried MongoDB Atlas and it looks pretty good: https://www.mongodb.com/cloud/atlas
It creates a cluster for you by default.
Maybe you can give it a try:
MongoDB Atlas delivers the world’s leading database for modern
applications as a fully automated cloud service engineered and run by
the same team that builds the database. Proven operational and
security practices are built in, automating time-consuming
administration tasks such as infrastructure provisioning, database
setup, ensuring availability, global distribution, backups, and more.
The easy-to-use UI and API let you spend more time building your
applications and less time managing your database.
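For anyone new to the idea, sharding comes down to routing each document to one of several servers based on a shard key. The snippet below is only a plain-Python sketch of hash-based routing to show the principle; it is not how MongoDB or Atlas implements it, and the names `shard_for` and `NUM_SHARDS`, as well as the made-up user IDs, are hypothetical.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 3  # hypothetical cluster size, chosen only for illustration


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index by hashing it.

    Hashing spreads keys roughly evenly, so no single server
    ends up holding most of the data.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Route some made-up user IDs and count how many land on each shard.
user_ids = [f"user-{i}" for i in range(9000)]
counts = Counter(shard_for(uid) for uid in user_ids)

for shard in sorted(counts):
    print(f"shard {shard}: {counts[shard]} documents")
```

The same key always maps to the same shard, which is what lets a router send a query straight to the right server. A managed service handles this routing, plus rebalancing and failover, for you.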

Cosmos or MongoDB On Azure

I am about to start a new .NET Core + NoSQL project. I am free to choose between MongoDB and Cosmos DB on Azure for a database of at most 10 GB.
If I use Cosmos DB, I will have no maintenance issues, but accessing it with a MongoDB driver seems likely to cause problems. I also have no experience with Cosmos DB, whereas I have worked with MongoDB before. On the other hand, if I set up MongoDB on a Windows or Linux server, I have to take care of the server itself, monitor the disk space, fix potential issues, and so on.
In terms of maintenance and reliability, which one do you suggest?
As a rule of thumb, always choose the most managed service unless you have a reason not to. You have probably answered your own question: in terms of maintenance and reliability, you should choose the Database-as-a-Service option (Cosmos DB), which not only offers a 99.999% high-availability SLA but also lets you grow and distribute globally.
There is a MongoDB API for Cosmos DB; I would give it a try and implement a PoC.

What are the pros and cons of DynamoDB with respect to other NoSQL databases?

We use a MongoDB database add-on on Heroku for our SaaS product. Now that Amazon has launched DynamoDB, a cloud database service, I was wondering how that changes the landscape of NoSQL offerings.
Specifically for cloud-based services or SaaS vendors, how will using DynamoDB be better or worse compared to, say, MongoDB? Are there any benefits in cost, performance, scalability, reliability, drivers, community, etc. of using one versus the other?
For starters, it will be fully managed by Amazon's expert team, so you can bet that it will scale very well with virtually no input from the end user (developer).
Also, since it's built and managed by Amazon, you can assume that they have designed it to work very well with their infrastructure, so performance should be top notch. In addition to building it specifically for their infrastructure, they have chosen to use SSDs for storage, so right from the start disk throughput will be significantly higher than that of other data stores on AWS that are HDD-backed.
I haven't seen any drivers yet and I think it's too early to tell how the community will react to this, but I suspect that Amazon will have drivers for all of the most popular languages and that the community will likely receive this well and, in turn, create additional drivers and tools.
Using MongoDB through an add-on for Heroku effectively turns MongoDB into a SaaS product as well.
In reality, one would be comparing whatever service a chosen provider offers against what Amazon can offer, rather than comparing one persistence solution to another.
This is very hard to do. Each provider will have varying levels of service at different price points, and being able to run the database on your own hardware locally for development purposes could be a welcome option.
I think the key difference to consider is that MongoDB is software that you can install anywhere (at AWS, at another cloud service, or in-house), whereas DynamoDB is a SaaS available exclusively as a hosted service from Amazon (AWS). If you want to retain the option of hosting your application in-house, DynamoDB is not an option. If hosting outside of AWS is not a consideration, then DynamoDB should be your default choice unless very specific features are a higher priority.
There's a table in the following link that summarizes the attributes of DynamoDB and Cassandra:
http://www.datastax.com/dev/blog/amazon-dynamodb
Something that needs improvement on DynamoDB in order to become more usable is the possibility to index columns other than the primary key.
UPDATE 1 (06/04/2013)
On 04/18/2013, Amazon announced support for Local Secondary Indexes, which made DynamoDB f***ing great:
http://aws.amazon.com/about-aws/whats-new/2013/04/18/amazon-dynamodb-announces-local-secondary-indexes/
I have to be honest: I was very excited when I heard about the new DynamoDB, and I attended the webinar yesterday. However, it's difficult to make a decision right now, as everything they said was still very vague; I have no idea which functions will be allowed or usable through their service.
The one thing I do know is that scaling is automatically handled; which is pretty awesome, yet there are still so many unknowns that it's tough to really make a great analysis until all the facts are in and we can start using it.
Thus far I still see mongo as working much better for me (personally) in the project undertaking that I've been working on.
Like most DB decisions, it's really going to come down to a project by project decision of what's best for your need.
I anxiously await more information on the product, as for now though it is in beta and I wouldn't jump ship to adopt the latest and greatest only to be a tester :)
I think one of the key differences between DynamoDB and other NoSQL offerings is the provisioned throughput: you pay for a specific throughput level on a table, and provided you keep your data well partitioned you can always expect that throughput to be met. So as your application load grows, you can scale up and keep your performance more or less constant.
Amazon DynamoDB seems like a pretty decent NoSQL solution. It is fast, and it is pretty easy to use. Other than having an AWS account, there really isn't any setup or maintenance required. The feature set and API are fairly small right now compared to MongoDB/CouchDB/Cassandra, but I would expect them to grow over time as feedback from the developer community is received. Right now, all of the official AWS SDKs include a DynamoDB client.
Pros
Lightning fast (uses SSDs internally).
Really (really) reliable (the chance of write failures is lower).
Seamless scaling (no need for manual sharding).
Works as a web service (no server, no configuration, no installation).
Easily integrated with other AWS features (you can store the whole table in S3, use EMR, etc.).
Replication is managed internally, so the chance of accidental data loss is negligible.
Cons
Very (very) limited querying.
Scanning is painful (I remember a scan through Java once running for 6 hours).
Predefined throughput, which means a sudden increase beyond the set throughput will be throttled.
Throughput is partitioned, as the table is sharded internally. This means that if you have a throughput of 1000 and the table is partitioned in two, and you are reading only the latest data (from one partition), then your read throughput is only 500.
No joins, and limited indexing (basically 2 indexes).
No views, triggers, scripts, or stored procedures.
It's really good as an alternative for session storage in a scalable application. Another good use would be logging/auditing in an extensive system. It is NOT preferable for a feature-rich application with frequent enhancements or changes.
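To put numbers on the partitioned-throughput point, here is a small back-of-the-envelope sketch. It assumes the provisioned throughput is split evenly across partitions, as described above, which is a simplification. The function names are made up for illustration and are not part of any AWS SDK.

```python
def effective_throughput(provisioned: float, partitions: int, partitions_hit: int) -> float:
    """Throughput actually available when traffic reaches only some partitions.

    Assumes provisioned capacity is divided evenly between partitions.
    """
    if partitions <= 0:
        raise ValueError("partitions must be positive")
    if not 0 <= partitions_hit <= partitions:
        raise ValueError("partitions_hit must be between 0 and partitions")
    per_partition = provisioned / partitions
    return per_partition * partitions_hit


def throttled_requests(requested_per_second: float, available: float) -> float:
    """Requests per second beyond the available throughput (these get throttled)."""
    return max(0.0, requested_per_second - available)


# Table provisioned for 1000, split into two partitions.
hot = effective_throughput(1000, partitions=2, partitions_hit=1)
even = effective_throughput(1000, partitions=2, partitions_hit=2)

print(hot)                           # 500.0 -> reads hitting only one partition
print(even)                          # 1000.0 -> reads spread over both partitions
print(throttled_requests(800, hot))  # 300.0 -> 800 req/s against the hot partition
```

So even though you pay for 1000, an access pattern that only touches the newest data can start being throttled at 500.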

Which database back-end shall I use?

I am writing an iPhone app that requires cloud back-end DB storage. I have a couple of options in mind and was wondering which one is the better fit.
What I need:
be able to perform CRUD in the cloud from the iPhone app
the DB needs to scale (speed-wise) without much or any management
schema free
all I need is to store maybe 1 million records
Google App Engine:
Uses Bigtable, scales, and is schema free, but I need to write a RESTful interface
CouchDB:
Recently released iOS support, RESTful built-in, but I worry about scaling when syncing with remote server
SimpleDB: (seems to be my best pick)
Has an iOS SDK, so I can do CRUD directly; auto-scales (I probably won't run into the 10GB limit); schema free
MongoDB:
Don't know much about it. From what I hear, it's faster than SimpleDB and easy to set up, but again I need to do the admin work
Cassandra:
Too much work for what I need.
Any insight, feedback, or correction is greatly appreciated.
Regards,
Johnny
If you're looking for zero management on your end, then you've already answered your own question: SimpleDB or GAE are probably your best options.
SimpleDB is probably better in your case, because it'll save you from having to write a simple RESTful interface on top of GAE.
Note that neither of them is great in terms of speed. I have worked with both and there is visible query latency. Unfortunately there's no way for you to tune that; you're completely in the hands of Amazon/Google. That's the price you pay for not managing the datastore yourself, so I guess you'll have to decide if you're willing to pay that price.
I recommend that you try SimpleDB, which is simple enough, first. If latency is a problem then you can move to hosting and tuning your own Mongo or some other option.
SQL Azure Services. Meets your requirements above.
http://en.wikipedia.org/wiki/SQL_Azure