How do we compare the cost of running MongoDB in GCE vs using Google Cloud Datastore?

I know that MongoDB and Google Cloud Datastore are NoSQL database systems. I'm new to deploying a database. These are some of my confusions:
How do we compare the cost of running MongoDB in Google Compute Engine vs. using Google Cloud Datastore? Can you quote a small example estimate? There are many related articles, but I couldn't find one that addresses this specific question.
I came to know about the Click to Deploy option in Compute Engine. What is the difference (in cost/performance) between manually configuring Compute Engine for MongoDB and the Click to Deploy option?
Within the Click to Deploy option there are two options (this and this) we can choose from for MongoDB. Can you spot the difference between them? There is also a significant price difference between them.
Is it worth starting development on Google Cloud Datastore and leaving my MongoDB skills behind?
I would prefer to know the answers in terms of cost, development overhead and performance.

Related

What is the difference between MongoDB Atlas and MongoDB Atlas for AWS

While investigating databases suitable for storing IoT data I looked into MongoDB, and the pricing is a little bit confusing.
Just wondering: what is the difference between MongoDB Atlas and MongoDB Atlas for AWS, as they both run on AWS?
And what is the right way to run MongoDB Atlas on AWS?
As far as I can see, they should both be mostly similar:
MongoDB Atlas:
You can go directly to the MongoDB Atlas portal and create a MongoDB cluster (a cluster is usually a 3-node replica set on which a DB is hosted) on any of the cloud providers (AWS/Google/Azure). This way all database updates/maintenance are usually handled by the vendor. Quite easy and simple, which is what most people are opting for these days (SaaS / DB hosted in the cloud). You can also opt for a free cluster, which should be suitable for basic needs such as learning MongoDB. While creating the cluster you can check the pricing, which is set per cluster tier (M0 to M700), and you can upgrade your cluster whenever you wish. When I created one I noticed that you may pay upfront for a certain number of years (likely 3); whether you use that credit or not you won't get anything back, and if you've paid less you might be charged over the course of usage. You'll pay your bills through MongoDB Atlas.
MongoDB Atlas for AWS:
This one comes from the aws-marketplace. When you see the word marketplace (a place where multiple companies/people collaborate to sell products), it basically means the two companies have collaborated to provide MongoDB as SaaS. With this you come in from the AWS side rather than from Atlas itself. When it comes to pricing, AWS seems to provide some credits; it would be better to consult AWS and Atlas about their pricing and other terms if you really want to use it for enterprise purposes. You might end up owning an AWS account just to pay the bills for this usage (which is hectic if you don't use AWS for other use cases). Additionally, if you check further down the MongoDB Atlas for AWS page, it seems that only the starting point is on the AWS side; the entire setup is done in Atlas.
You're charged for your purchase on your AWS bill. After you purchase
a contract, you're directed to the vendor's site to complete setup and
begin using this software.
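Whichever purchase path you take (Atlas directly or via the AWS Marketplace), the cluster itself is managed by Atlas and your application connects to it the same way. Here is a minimal sketch with the Python driver; the connection string, database and collection names below are placeholders I've made up, not anything from the marketplace listing:

```python
from pymongo import MongoClient  # pip install "pymongo[srv]"

# Placeholder connection string; copy the real one from the Atlas "Connect" dialog.
client = MongoClient(
    "mongodb+srv://appuser:<password>@cluster0.example.mongodb.net/?retryWrites=true&w=majority"
)

db = client["iot"]                  # hypothetical database name
readings = db["sensor_readings"]    # hypothetical collection

readings.insert_one({"device": "dev-001", "temp_c": 21.5})
print(readings.count_documents({"device": "dev-001"}))
```

The billing relationship (Atlas invoice vs. AWS bill) is the main difference; the driver-level code is unaffected.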

Which kind of Google Cloud Platform mobile backend client is appropriate?

THE PROBLEM
I'm writing a mobile app which will allow a user to log in, save some preferences that must be stored in a database, and display congressional bills to the user.
I've only written simple RESTful services with PHP and MySQL in the past. I'd like to take advantage of newer technologies, and am a little lost on general direction.
The bill data (formatted as JSON) can be gathered by running the scrapers found here. Using Docker, I managed to set up a working directory and download the files on my local machine.
I've designed a MySQL database for holding the relevant bill and user data.
I started to mess around in Google Cloud Platform, and read the doc that describes the different computing models. I'm thinking of a few different ideas, but I'm not familiar with GCP or what I can actually accomplish.
QUESTIONS
1) What are App Engine, Compute Engine, and Container Engine each for? I get the gist that Container Engine holds different instances of stuff you load up with docker, and that Compute Engine sets up a VM, but I don't really understand the relationships. How should I think of them?
2) When I run those scrapers from the shell, where are the files being stored, and how can I check on them? On my computer, I set a working directory, but how do directories work in GCP? Is it just a directory in the currently selected VM, or is this what Buckets are for?
IDEAS
1) Since my bill data already comes as JSON, should I skip the entire process of building a database for the bills and insert them into Firebase somehow? Is this even possible? If so, am I stuck using Firebase's NoSQL, or can I still set up a relational database?
2) I could schedule the scrapers to run periodically, detect new files, and run a script to parse the JSON and insert new bill data into a database (PostgreSQL?/MySQL?). Then I would write an API.
3) Download the JSON files to a bucket, and write an API that reads from them. Not sure how the performance would compare to using a DB.
I'm open to other suggestions as well.
For your use case (a stateless web application), App Engine is probably your best choice. The Google documentation has several comparisons of your computing options.
You can use App Engine with PHP and cloud-hosted MySQL if you want, which could be a good way to get your toes wet without going in over your head.
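If you do go with cloud-hosted MySQL, idea 2 from the question (parse the scraped JSON files and load bill rows into the database) could look roughly like the sketch below. The table layout, file paths, JSON field names and the mysql-connector-python driver are my assumptions, not part of the original setup:

```python
import json
import pathlib

import mysql.connector  # pip install mysql-connector-python

# Placeholder Cloud SQL connection details.
conn = mysql.connector.connect(
    host="<cloud-sql-ip>", user="app", password="<password>", database="bills"
)
cursor = conn.cursor()

# Hypothetical directory where the scrapers dumped their JSON output.
for path in pathlib.Path("data/bills").glob("*.json"):
    bill = json.loads(path.read_text())
    cursor.execute(
        "INSERT INTO bills (bill_id, title, status) VALUES (%s, %s, %s) "
        "ON DUPLICATE KEY UPDATE title = VALUES(title), status = VALUES(status)",
        (bill["bill_id"], bill["official_title"], bill["status"]),
    )

conn.commit()
conn.close()
```

The same loop could run on a schedule (cron on a Compute Engine VM, or App Engine cron) to cover the "detect new files periodically" part of that idea.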

How do I populate a google big table instance with data using an external url?

I have a Google Bigtable instance that needs to be populated with data that lives in a Postgres database. My product team gave me a URL that allows me to replicate the database. Put simply, I need to duplicate the Postgres database into the Google instance, and the way my product team gave me is this URL. How can I do this? Is there any tutorial that can help me?
If you are already running PostgreSQL and would like to have a mirror of it on Google Cloud Platform, the best and simplest approach may be to run your own PostgreSQL instance on a Google Compute Engine virtual machine which can be done via several approaches, e.g.,
tutorial for launching PostgreSQL, or
click-to-deploy solution for PostgreSQL by Bitnami
Then, you would want to continuously mirror data from your local instance to the PostgreSQL instance running in Google Cloud to be able to query it. Another SO answer suggests that there are two major approaches to this:
Master/Master replication (Bucardo)
Master/Slave replication (Slony)
Based on your use case, where you want to keep your local PostgreSQL instance as the canonical one and just replicate to Google Cloud for the purpose of querying it, you want Master/Slave replication with the Google Cloud PostgreSQL instance as the read-only replica, so you probably want to use the Slony approach.
For a more in-depth look at PostgreSQL solutions for high availability, load balancing, and replication, see the comparison in the manual.
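To make the intended topology concrete: the application keeps writing to the on-premises master, Slony replays those changes on the Compute Engine instance, and only read/reporting queries go to the cloud replica. A rough psycopg2 sketch, with both connection strings as placeholders:

```python
import psycopg2  # pip install psycopg2-binary

# Canonical master stays on-premises; the GCE instance is the read-only replica.
master = psycopg2.connect("host=onprem-db.internal dbname=app user=app password=<pw>")
replica = psycopg2.connect("host=<gce-external-ip> dbname=app user=report password=<pw>")

# Writes go to the master only; Slony propagates them to the replica.
with master, master.cursor() as cur:
    cur.execute("INSERT INTO events (kind) VALUES (%s)", ("signup",))

# Reporting queries hit the replica running in Google Cloud.
with replica, replica.cursor() as cur:
    cur.execute("SELECT kind, count(*) FROM events GROUP BY kind")
    print(cur.fetchall())
```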

Google Cloud SQL Read replica's in other regions

We are currently investigating the options for a partial switch to Google Cloud SQL. What we are looking for is a setup in which data is available for reading in multiple regions, to increase the speed of the web application. Writing from multiple regions would of course be great, but that's not really something MySQL does when you also want to have speed on your side :-)
What we would like to setup is a master-slave setup through which the Master would be in Europe and slaves (for reading) would be available in the US and Asia. This way we can provide information to our customers from a VM + SQL instance in Asia without having to connect to a database in Europe.
As far as I am aware it is not possible to currently add a read-instance outside of the region of the master. Is that correct?
Or, would it be possible to create our own MySQL read-only instance and let it replicate from a Google Cloud SQL instance? This would not be preferable (database administration, server administration) but is of course an option.
You can do cross-region replication in Cloud SQL, although it is not straightforward and the performance will not be great. You have to create a master in Cloud SQL, then create a replica with an external master pointing at the master you created: https://cloud.google.com/sql/docs/replication#external-master
You can go in the other direction as well: https://cloud.google.com/sql/docs/replication#replication-external
These features are only supported for the first generation of Cloud SQL.
Cloud Spanner is a relational database that supports transactional consistency on a global scale. It is an SQL database and works great in a multi-region environment, so it can be a good choice for your case. For more info, please check https://cloud.google.com/spanner/
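If you do evaluate Cloud Spanner, reads are served with strong consistency regardless of which region the client is in, without you managing replication yourself. A minimal sketch with the Python client; the instance, database and table names are placeholders:

```python
from google.cloud import spanner  # pip install google-cloud-spanner

client = spanner.Client()
instance = client.instance("multi-region-instance")  # e.g. created with a "nam3"/"eur3" config
database = instance.database("appdb")

# A strong read is globally consistent; Spanner handles the cross-region replication.
with database.snapshot() as snapshot:
    rows = snapshot.execute_sql("SELECT customer_id, name FROM customers LIMIT 10")
    for row in rows:
        print(row)
```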

Azure Table Vs MongoDB on Azure

I want to use a NoSQL database on Windows Azure and the data volume will be very large. Would Azure Table storage or a MongoDB database running in a Worker role offer better performance and scalability? Has anyone used MongoDB on Azure in a Worker role? Please share your thoughts on using MongoDB on Azure versus Azure Table storage.
Table Storage is a core Windows Azure storage feature, designed to be scalable (up to 500TB per account), durable (triple-replicated in the data center, optionally georeplicated to another data center), and schemaless (each row may contain any properties you want). A row is located by partition key + row key, providing very fast lookup. All Table Storage access is via a well-defined REST API usable through any language (with SDKs, built on top of the REST APIs, already in place for .NET, PHP, Java, Python & Ruby).
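To illustrate that partition key + row key access pattern, here is a minimal sketch using the current azure-data-tables Python SDK; the connection string, table name and entity values are placeholders:

```python
from azure.data.tables import TableServiceClient  # pip install azure-data-tables

# Placeholder connection string; use your storage account's real one.
service = TableServiceClient.from_connection_string("<storage-connection-string>")
table = service.create_table_if_not_exists("customers")

# Each entity is schemaless apart from the mandatory PartitionKey/RowKey pair.
table.create_entity({
    "PartitionKey": "europe",
    "RowKey": "cust-001",
    "Name": "Contoso Ltd",
    "Active": True,
})

# Point lookups by PartitionKey + RowKey are the fast path in Table Storage.
entity = table.get_entity(partition_key="europe", row_key="cust-001")
print(entity["Name"])
```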
MongoDB is a document-oriented database. To run it in Azure, you need to install MongoDB onto a web/worker role or Virtual Machine, point it to a cloud drive (thereby providing a drive letter) or attached disk (for Windows/Linux Virtual Machines), optionally turn on journaling (which I'd recommend), and optionally define an external endpoint for your use (or access it via virtual network). The Cloud Drive / attached disk, by the way, is actually stored in an Azure Blob, giving you the same durability and georeplication as Azure Tables.
When comparing the two, remember that Table Storage is Storage-as-a-Service: you simply access a well-known REST endpoint. With MongoDB, you're responsible for maintaining the database (e.g. whenever MongoDB Inc (formerly 10gen) pushes out a new version of MongoDB, you'll need to update your server accordingly).
Regarding MongoDB Inc's alpha version pointed to by jtoberon: If you take a close look at it, you'll see a few key things:
The setup is for a Standalone mongodb instance, without replica-sets or shards. Regarding replica-sets, you still get several benefits using the Standalone version, due to the way Blob storage works.
To provide high-availability, you can run with multiple instances. In this case, only one instance serves the database, and one is a 'warm-standby' that launches the mongod process as soon as the other instance fails (for maintenance reboot, hardware failure, etc.).
While 10gen's Windows Azure wrapper is still considered 'alpha,' mongod.exe is not. You can launch the mongod exe just like you'd launch any other Windows exe. It's just the management code around the launching, and that's what the alpha implementation is demonstrating.
EDIT 2011-12-8: This is no longer in an alpha state. You can download the latest MongoDB+Windows Azure project here, which provides replica-set support.
For performance, I think you'll need to do some benchmarking. Having said that, consider the following:
When accessing either Table Storage or MongoDB from, say, a Web Role, you're still reaching out to the Windows Azure Storage system.
MongoDB uses lots of memory for its own cache. For this reason, lots of high-scale MongoDB systems are deployed to larger instance sizes. For Table Storage access, you won't have the same memory-size consideration.
EDIT April 7, 2015
If you want to use a document-based database as-a-service, Azure now offers DocumentDB.
I have used both.
Azure Tables: dead simple, fast, really hard to write even simple queries.
Mongo: runs nicely, lots of querying capabilities, requires several instances to be reliable.
In a nutshell,
if your queries are really simple (key->value), you must run a cost comparison (mainly the number of transactions against the storage versus the cost of hosting Mongo on Azure). I would rather go with Table Storage for that one.
If you need more elaborate queries and don't want to go to SQL Azure, Mongo is likely your best bet.
I realize that this question is dated. I'd like to add the following info for those who may come upon this question in their searches.
Note that now, MongoDB is offered as a fully managed service on Azure. (officially in Beta as of Apr '15)
See:
http://www.mongodb.com/partners/cloud/microsoft
or
https://azure.microsoft.com/en-us/blog/announcing-new-mongodb-instances-on-microsoft-azure/
See (including pricing):
https://azure.microsoft.com/en-us/marketplace/partners/mongolab/mongolab/
My first choice is Azure Tables because of the SaaS model, the low cost and the 99.99% SLA: http://alexandrebrisebois.wordpress.com/2013/07/09/what-if-20000-windows-azure-storage-transactions-per-second-isnt-enough/
Some limits:
http://msdn.microsoft.com/en-us/library/windowsazure/jj553018.aspx
http://www.windowsazure.com/en-us/pricing/calculator/?scenario=data-management
or Azure SQL for small business
DocumentDB
http://azure.microsoft.com/en-us/documentation/services/documentdb/
http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/
Second choice: many cloud providers, including Amazon, offer S3
or Google tables https://developers.google.com/bigquery/pricing
nth choice: manage the whole show myself with MongoDB and get no sleep; well, I will look again at the first two SaaS options
My choice: if I am running in the "CLOUD" I will go for the SaaS model as much as possible ("RENT IT")...
The question is what my app needs: Azure Tables, DocumentDB or Azure SQL
DocumentDB documentation
http://azure.microsoft.com/en-us/documentation/services/documentdb/
How Azure pricing works
http://azure.microsoft.com/en-us/pricing/details/documentdb/
this is fun
http://www.documentdb.com/sql/demo
At Build 2016 it was announced that DocumentDB would support all MongoDB drivers. This solves some of the lack of tooling issues with DocDB and also makes it easier to migrate Mongo apps.
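In practice that means existing MongoDB client code can be pointed at DocumentDB just by swapping the connection string. A minimal pymongo sketch; the account name, key and database/collection names below are placeholders:

```python
from pymongo import MongoClient  # pip install pymongo

# Placeholder DocumentDB (MongoDB API) connection string from the Azure portal.
uri = (
    "mongodb://<account>:<primary-key>@<account>.documents.azure.com:10255/"
    "?ssl=true&replicaSet=globaldb"
)
client = MongoClient(uri)

db = client["appdb"]
db["users"].insert_one({"_id": "alice", "plan": "free"})
print(db["users"].find_one({"_id": "alice"}))
```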
Above answers are all good - but the real answer depends on what your requirements are. You need to understand what size of data you are processing, what types of operations you want to perform on the data and then select the solution that meets your needs.
One thing to remember is that Azure Table Storage doesn't support complex data types. Every property of an entity must be a string, number, boolean, date, etc.
You can't store a nested object against a key, which I feel is a must for a NoSQL DB.
https://learn.microsoft.com/en-us/rest/api/storageservices/fileservices/understanding-the-table-service-data-model (scroll to Property Types)
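A common workaround for that limitation is to JSON-serialize the nested object into a string property yourself, as in the sketch below (azure-data-tables Python SDK; names and values are placeholders):

```python
import json

from azure.data.tables import TableServiceClient  # pip install azure-data-tables

service = TableServiceClient.from_connection_string("<storage-connection-string>")
table = service.create_table_if_not_exists("profiles")

preferences = {"theme": "dark", "alerts": ["email", "sms"]}  # nested object

# Table Storage properties must be scalar types, so store the object as a JSON string.
table.upsert_entity({
    "PartitionKey": "user",
    "RowKey": "alice",
    "Preferences": json.dumps(preferences),
})

entity = table.get_entity(partition_key="user", row_key="alice")
restored = json.loads(entity["Preferences"])  # back to a Python dict
print(restored["theme"])
```

The obvious trade-off is that you can no longer filter queries on the fields inside the serialized blob.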