Graph DBs vs. Azure Table storage for a social networking application - nosql

I'm starting some architecture work for a .NET-based social networking application to be hosted on the Azure cloud. We are going to use ASP.NET MVC on the front end.
I would like to consider the options for storage. Given the scalability needs and the interconnected nature of the application, SQL Azure has been ruled out.
What would be the main considerations in choosing a graph DB such as Sones GraphDB or Neo4j, which have features specific to social networking applications, over using Windows Azure Table storage to meet these needs?
I'm mostly concerned about development time, cost, the ability to leverage existing skills like .NET, the reliability of the graph DB platforms, and ease of setup and administration.

Graph databases are designed for applications such as social networks. For ease of development, it may be best to start with something like GraphDB. A key advantage over a key-value database is powerful query and traversal capabilities. It would be easy to, for instance, find all occurrences of friends of friends using the GraphDB query syntax.
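To make the friends-of-friends example concrete, here is a minimal sketch of what such a traversal computes. It uses a plain in-memory adjacency map in Python rather than the GraphDB query syntax, and the user names are hypothetical sample data.

```python
from typing import Dict, Set


def friends_of_friends(graph: Dict[str, Set[str]], user: str) -> Set[str]:
    """Return users exactly two hops away who are not already direct friends."""
    direct = graph.get(user, set())
    result: Set[str] = set()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and anyone who is already a friend.
            if candidate != user and candidate not in direct:
                result.add(candidate)
    return result


# Hypothetical sample data: friendships are stored in both directions.
graph = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}

print(sorted(friends_of_friends(graph, "alice")))  # ['dave', 'erin']
```

A graph database runs this kind of two-hop traversal for you from a single query, which is the main development-time saving.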
The benefits of a key-value database service like Azure Table are low cost, minimal administrative overhead, and scalability. You can store 500TB of data per Azure storage account and set up accounts in multiple regions. There is no server setup or database administration overhead, and the Visual Studio SDK is easy to use. The downside is that graph-like query support is not built in, and you must index off the Partition Key / Row Key pair. For additional Azure Table design patterns, please see https://azure.microsoft.com/en-us/documentation/articles/storage-table-design-guide/
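As a rough illustration of what indexing off the key pair means for a social graph, the sketch below models a table as an in-memory dictionary keyed by (PartitionKey, RowKey). This is an assumption-laden stand-in, not the Azure SDK: the function names and the "kind" property are hypothetical. Each friendship is written twice so that a user's friends all sit in that user's partition.

```python
from typing import Dict, List, Tuple

# In-memory stand-in for a table: (PartitionKey, RowKey) -> entity properties.
Table = Dict[Tuple[str, str], dict]


def add_friendship(table: Table, a: str, b: str) -> None:
    """Store the edge in both directions, one row per partition."""
    table[(a, b)] = {"kind": "friend"}
    table[(b, a)] = {"kind": "friend"}


def friends(table: Table, user: str) -> List[str]:
    """Equivalent of a partition scan: all rows whose PartitionKey is the user."""
    return sorted(rk for (pk, rk) in table if pk == user)


def friends_of_friends(table: Table, user: str) -> List[str]:
    """Two-hop lookup done by hand: one partition query per direct friend."""
    direct = set(friends(table, user))
    found = set()
    for friend in direct:
        found.update(friends(table, friend))
    return sorted(found - direct - {user})


table: Table = {}
add_friendship(table, "alice", "bob")
add_friendship(table, "bob", "dave")
print(friends_of_friends(table, "alice"))  # ['dave']
```

The point of the sketch is the cost shape: the traversal logic and the extra round trips live in your application code instead of in the database.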

Related

Choosing the correct approach to build a multi-tenant architecture with Azure Cosmos DB (MongoDB)

I am a little confused about choosing a suitable approach to creating databases/collections for a multi-tenant system using the MongoDB API in Cosmos DB.
I would have 500 tenants for my application, where each tenant's data may grow up to 3-5GB, and initially each tenant may need only the minimum throughput (400 RU/s).
For this use case I have a few options to choose from:
1. PartitionKey (per tenant)
2. Container w/ shared throughput (per tenant)
3. Container w/ dedicated throughput (per tenant)
4. Database Account (per tenant)
Considering performance isolation, cost, availability, and security, which option would be suitable for this use case?
Please let me know your thoughts, as I have little exposure to NoSQL and Cosmos DB.
The answer is potentially multiple options and it depends on your specific tenant use cases.
Tenant/Partition is the least expensive with a marginal cost per tenant of zero. This is a great option for providing a "free-tier" in your app but you can scale this up to a paid tier for your customers too. Max storage size is 20GB. With this scheme you will need to implement your own resource governance. You will need to ensure customers are not "running hot" and consuming throughput and storage that is drastically out of line from other users. However if you're building a multi-tenant app, resource governance is something you should already be doing.
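Since Cosmos DB reports a request charge in RUs for each operation, one simple way to approach the resource governance mentioned above is to keep a per-tenant tally and reject work once a tenant exceeds its budget for the current window. The following Python sketch is only an illustration under that assumption: the class name, the budget value, and the windowing scheme are hypothetical, and nothing here calls the Cosmos DB SDK.

```python
from collections import defaultdict


class TenantThroughputGovernor:
    """Tracks RUs consumed per tenant within one time window."""

    def __init__(self, ru_budget_per_window: float) -> None:
        self.ru_budget_per_window = ru_budget_per_window
        self._consumed = defaultdict(float)

    def try_consume(self, tenant_id: str, request_charge: float) -> bool:
        """Record the charge and return True, or return False if over budget."""
        if self._consumed[tenant_id] + request_charge > self.ru_budget_per_window:
            return False
        self._consumed[tenant_id] += request_charge
        return True

    def consumed(self, tenant_id: str) -> float:
        """RUs consumed by the tenant in the current window."""
        return self._consumed[tenant_id]

    def reset_window(self) -> None:
        """Call this on a timer (for example once per second) to start a new window."""
        self._consumed.clear()


governor = TenantThroughputGovernor(ru_budget_per_window=10.0)  # hypothetical budget
governor.try_consume("tenant-a", 6.0)   # accepted
governor.try_consume("tenant-a", 5.0)   # rejected: would exceed the budget
governor.try_consume("tenant-b", 5.0)   # accepted: other tenants are unaffected
```

The same tally can also drive the decision to migrate a consistently hot tenant to its own container.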
Tenant/Container is more expensive: the minimum throughput for a container is 400 RU/s ($25/month). This is ideal when you have tenants that are very large and require isolation from others in the previous tier.
Tenant/Account has the same marginal cost as Tenant/Container. This is useful if you have customers with GDPR requirements that prevent or require replication into specific Azure regions.
Note that I DO NOT recommend Tenant/Container using shared Database throughput. With this scheme all containers share the same throughput, which is what you already get with Tenant/Partition, but performance is not predictable with shared Database throughput, so it is not a good choice. Additionally, you are limited to 25 containers per database, which makes it an even poorer choice.
Finally, for your app you will need to implement a mechanism to migrate customers from one tier to another. You will also of course require some sort of auth-n/auth-z mechanism. For Cosmos DB you can optionally use our native users and permissions and use resource tokens to secure access to data.
We did a presentation on this last year at BUILD with a customer of ours, Citrix, who built their own cloud offering on top of Azure using Cosmos DB as their user metadata store. It is definitely worth checking out and will give you more details and insights: Mission-critical multi-tenant apps with Cosmos DB
PS: if you are building a new service on Cosmos DB I recommend using our Core (SQL) API rather than MongoDB. This is our native service and you will get the best performance and features. Our MongoDB API is the best choice for customers who are looking to migrate and want a fully managed MongoDB experience.
Hope this is helpful.

How to monitor queries in Azure mobile services?

The following video demonstrates how to monitor SQL queries made by EF in an MVC application:
http://channel9.msdn.com/Series/Implementing-Entity-Framework-with-MVC/01
They recommend Glimpse for MVC 5:
https://www.nuget.org/packages/Glimpse.Mvc5/
I cannot find Glimpse for Azure Mobile Services, so what is the best way to monitor SQL queries in Azure Mobile Services?
Open the Azure Management Portal for your database, log in, and open the Performance tab.
There you get some basic monitoring data about your queries.
You can also query the sys.dm_exec_query_stats view in your database to create custom reports about your queries.
If you don't mind third-party services, look into CloudMonix (http://cloudmonix.com); it has special support for SQL Azure and a free plan.
When monitoring SQL Azure databases, it will show you the top 10 queries, connections, logs, and performance characteristics, and allow you to create your own SQL-based metrics.
Disclaimer: I'm affiliated with the product

Cloud storage options with iOS

I'm trying to create a back-end in which I can have many users communicate with each other amongst an iPhone app I'm creating. I've tried working with Core Data, Google App Engine, Google Cloud Storage, and Amazon Web Services (RDS & Elastic Beanstalk). Unfortunately, after weeks of trying to get any of this working, none of it will!
I've been trying to get in touch with someone who would know how startups (when they were little) like Instagram, Path, and Pinterest have managed to do this. But everyone out there seems to despise this stuff as much as I'm growing to...
I would love for someone to simply map out EXACTLY how I need to create a back-end database that I can save and query data to and from that many users can see. That means that just SQLite, Core Data, or Parse by itself isn't going to work here!
A tutorial of some kind would be incredible.
First off, technologies like CoreData and sqlite are typically local device storage. Local device storage is not going to get you shared cloud storage.
Parse.com is a fast way for devices to access cloud storage and get going fast. Especially useful for games and other mobile apps to access cloud data via an app id and app key. It's simple storage to avoid creating your own backend if it fills all your needs and requirements.
When you get to a multi-tenant cloud backend where you roll your own services and multiple devices access your cloud application, you need to look into exposing your own web API. Exposing a RESTful API over HTTP is great for devices and web clients. Exposing the data as JSON is especially convenient for the web and easily consumed by devices.
Those web service endpoints in the cloud access some sort of backend storage that is optimized for concurrent access by multiple clients. This is typically a SQL backend like MySQL, SQL Server, etc., or a NoSQL solution like MongoDB, CouchDB, etc.
Some front end web api technologies to look into:
ASP.net web api
Ruby on Rails
Node.js
etc...
Some back end storage technologies to look into:
SQL: MySQL, SQLServer/Azure SQL, Oracle
NoSQL: MongoDb, CouchDb, Amazon S3 simple storage, etc...
If the data is used by a very large number of multi-tenant clients, the backends can be scaled up (larger and larger) or sharded. Sharding is where the data for multiple users is split across many databases or datastores, with some sort of lookup algorithm for requests to find where a given user's data is stored. The front-end web API servers abstract the backend storage.
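One simple form of the lookup algorithm described above is to hash the user ID and take it modulo the number of shards. The Python sketch below assumes that scheme; the shard names are hypothetical, and real systems often use a lookup table or consistent hashing instead.

```python
import hashlib
from typing import List


def shard_for_user(user_id: str, shards: List[str]) -> str:
    """Map a user ID to one shard, the same way on every call."""
    # Use a stable hash; Python's built-in hash() is randomized per process.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# Hypothetical shard names standing in for database connection strings.
shards = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]

print(shard_for_user("user-12345", shards))
```

The trade-off with plain modulo hashing is that changing the number of shards remaps most users, so adding capacity means moving a lot of data.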
Finally, you'll end up needing some sort of caching/fast lookup technology (if you're successful :):
Redis: fast in memory storage over sockets
memcached: used by Facebook; simple key-value in-memory caching across many front-end servers.
Your question is open-ended and broad, so start by googling many of these terms and technologies.
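A common way to use either of these is the cache-aside pattern: check the cache first, fall back to the backing store on a miss, then populate the cache. Here is a minimal Python sketch in which a dictionary stands in for Redis or memcached; the class name and the loader callback are hypothetical.

```python
from typing import Callable, Dict, Optional


class CacheAside:
    """Read-through helper: a dict stands in for Redis or memcached."""

    def __init__(self, load_from_store: Callable[[str], Optional[dict]]) -> None:
        self._load = load_from_store
        self._cache: Dict[str, dict] = {}
        self.store_reads = 0  # counts how often the slow backing store is hit

    def get(self, key: str) -> Optional[dict]:
        if key in self._cache:
            return self._cache[key]      # cache hit
        self.store_reads += 1            # cache miss: go to the backing store
        value = self._load(key)
        if value is not None:
            self._cache[key] = value
        return value

    def invalidate(self, key: str) -> None:
        """Call after writing to the store so stale data is not served."""
        self._cache.pop(key, None)


# Hypothetical backing store.
database = {"user:1": {"name": "Alice"}}
cache = CacheAside(database.get)

cache.get("user:1")  # miss: reads the store
cache.get("user:1")  # hit: served from the cache
print(cache.store_reads)  # 1
```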
Each of these links will have resources and tutorials. Get a cloud VM, play with each and decide which fits your needs best. There is no one size fits all solution.

Azure Table Vs MongoDB on Azure

I want to use a NoSQL database on Windows Azure, and the data volume will be very large. Which offers better performance and scalability: Azure Table storage or a MongoDB database running in a Worker role? Has anyone used MongoDB on Azure in a Worker role? Please share your thoughts on using MongoDB on Azure over Azure Table storage.
Table Storage is a core Windows Azure storage feature, designed to be scalable (500TB per account), durable (triple-replicated in the data center, optionally georeplicated to another data center), and schemaless (each row may contain any properties you want). A row is located by partition key + row key, providing very fast lookup. All Table Storage access is via a well-defined REST API usable through any language (with SDKs, built on top of the REST APIs, already in place for .NET, PHP, Java, Python & Ruby).
MongoDB is a document-oriented database. To run it in Azure, you need to install MongoDB onto a web/worker role or Virtual Machine, point it to a cloud drive (thereby providing a drive letter) or attached disk (for Windows/Linux Virtual Machines), optionally turn on journaling (which I'd recommend), and optionally define an external endpoint for your use (or access it via virtual network). The Cloud Drive / attached disk, by the way, is actually stored in an Azure Blob, giving you the same durability and georeplication as Azure Tables.
When comparing the two, remember that Table Storage is Storage-as-a-Service: you simply access a well-known REST endpoint. With MongoDB, you're responsible for maintaining the database (e.g. whenever MongoDB Inc (formerly 10gen) pushes out a new version of MongoDB, you'll need to update your server accordingly).
Regarding MongoDB Inc's alpha version pointed to by jtoberon: If you take a close look at it, you'll see a few key things:
The setup is for a Standalone mongodb instance, without replica-sets or shards. Regarding replica-sets, you still get several benefits using the Standalone version, due to the way Blob storage works.
To provide high-availability, you can run with multiple instances. In this case, only one instance serves the database, and one is a 'warm-standby' that launches the mongod process as soon as the other instance fails (for maintenance reboot, hardware failure, etc.).
While 10gen's Windows Azure wrapper is still considered 'alpha,' mongod.exe is not. You can launch the mongod exe just like you'd launch any other Windows exe. It's just the management code around the launching, and that's what the alpha implementation is demonstrating.
EDIT 2011-12-8: This is no longer in an alpha state. You can download the latest MongoDB+Windows Azure project here, which provides replica-set support.
For performance, I think you'll need to do some benchmarking. Having said that, consider the following:
When accessing either Table Storage or MongoDB from, say, a Web Role, you're still reaching out to the Windows Azure Storage system.
MongoDB uses lots of memory for its own cache. For this reason, lots of high-scale MongoDB systems are deployed to larger instance sizes. For Table Storage access, you won't have the same memory-size consideration.
EDIT April 7, 2015
If you want to use a document-based database as-a-service, Azure now offers DocumentDB.
I have used both.
Azure Tables : dead simple, fast, really hard to write even simple queries.
Mongo : runs nicely, lots of querying capabilities, requires several instances to be reliable.
In a nutshell,
if your queries are really simple (key->value), you must run a cost comparison (mainly number of transactions against the storage versus cost of hosting Mongo on Azure). I would rather go to table storage for that one.
If you need more elaborate queries and don't want to go to SQL Azure, Mongo is likely your best bet.
I realize that this question is dated. I'd like to add the following info for those who may come upon this question in their searches.
Note that now, MongoDB is offered as a fully managed service on Azure. (officially in Beta as of Apr '15)
See:
http://www.mongodb.com/partners/cloud/microsoft
or
https://azure.microsoft.com/en-us/blog/announcing-new-mongodb-instances-on-microsoft-azure/
See (including pricing):
https://azure.microsoft.com/en-us/marketplace/partners/mongolab/mongolab/
My first choice is Azure Tables because of the SaaS model, the low cost, and the 99.99% SLA: http://alexandrebrisebois.wordpress.com/2013/07/09/what-if-20000-windows-azure-storage-transactions-per-second-isnt-enough/
some limits..
http://msdn.microsoft.com/en-us/library/windowsazure/jj553018.aspx
http://www.windowsazure.com/en-us/pricing/calculator/?scenario=data-management
or AzureSQL for small business
DocumentDB
http://azure.microsoft.com/en-us/documentation/services/documentdb/
http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/
My second choice: many cloud providers, including Amazon, offer S3,
or Google tables https://developers.google.com/bigquery/pricing
Nth choice: manage the whole show myself and get no sleep (MongoDB). I would look again at the first two SaaS options.
If I am running in the cloud, I will go for the SaaS model as much as possible: rent it.
The question is what my app needs: Azure Tables, DocumentDB, or Azure SQL.
DocumentDB documentation
http://azure.microsoft.com/en-us/documentation/services/documentdb/
How Azure pricing works
http://azure.microsoft.com/en-us/pricing/details/documentdb/
this is fun
http://www.documentdb.com/sql/demo
At Build 2016 it was announced that DocumentDB would support all MongoDB drivers. This solves some of the lack of tooling issues with DocDB and also makes it easier to migrate Mongo apps.
Above answers are all good - but the real answer depends on what your requirements are. You need to understand what size of data you are processing, what types of operations you want to perform on the data and then select the solution that meets your needs.
One thing to remember is that Azure Table Storage doesn't support complex data types. Every property in an entity must be a string, number, boolean, date, etc.
You can't store an object against a key, which I feel is a must for a NoSQL DB.
https://learn.microsoft.com/en-us/rest/api/storageservices/fileservices/understanding-the-table-service-data-model scroll to Property Types
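A common workaround for the primitive-only property types is to serialize the nested object into a single string property. The Python sketch below assumes that approach; the ProfileJson property name and the sample profile are hypothetical, and the table's property and entity size limits still apply to the serialized string.

```python
import json


def to_entity(partition_key: str, row_key: str, profile: dict) -> dict:
    """Flatten a nested object into one string property of a table entity."""
    return {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "ProfileJson": json.dumps(profile, sort_keys=True),  # hypothetical property name
    }


def from_entity(entity: dict) -> dict:
    """Rebuild the nested object from the stored string property."""
    return json.loads(entity["ProfileJson"])


# Hypothetical sample object with nested data.
profile = {"name": "Alice", "address": {"city": "Seattle"}, "tags": ["admin", "beta"]}
entity = to_entity("users", "alice", profile)

print(isinstance(entity["ProfileJson"], str))  # True
print(from_entity(entity) == profile)          # True
```

The catch is that the serialized contents are opaque to the table service, so you cannot filter on fields inside the string.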

Windows Azure TDS emulation on a production non-Azure IIS server

I am developing a C# web application that will be hosted in Windows Azure and use Table Data Storage (TDS).
I want to architect my application such that I can also (as an option) deploy the application to a traditional IIS server with some other NoSql back-end. Basically, I want to give my customers the option to either pay me in the software as a service model, OR purchase a license of my application that they can install on a (non-azure) production server of their own.
How can I best architect my data layer and middle tier to achieve both goals?
I will likely need a Windows Azure Worker Role and an Azure Queue. How complicated is it to replicate these? Can I substitute a custom Windows Service and some other queuing technology?
How can the entities in my data model be written such that I can deploy to Azure TDS or some other storage when not deploying to Azure? Would MongoDB or similar be useful for this?
Surely there is a way to architect for Azure without being married to it.
I will likely need a Windows Azure Worker Role and an Azure Queue. How complicated is it to replicate these? Can I substitute a custom Windows Service and some other queuing technology?
Yes - a Windows service with some other queuing technology would fit this reasonably well - and worker roles have a main/Run loop which is easy to use within a Windows Service.
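To illustrate the idea, here is a minimal Python sketch of a polling run loop written against a small queue interface, so the same loop could sit in a worker role or a Windows Service with a different queue behind it. All names are hypothetical, the in-memory queue is only a stand-in for an Azure Queue or another queuing technology, and a real worker would sleep between polls and run until it is stopped.

```python
import queue
from abc import ABC, abstractmethod
from typing import Callable, Optional


class MessageQueue(ABC):
    """The only queue surface the worker depends on."""

    @abstractmethod
    def put(self, message: str) -> None: ...

    @abstractmethod
    def get(self) -> Optional[str]:
        """Return the next message, or None when the queue is empty."""


class InMemoryQueue(MessageQueue):
    """Stand-in implementation; a cloud or on-premises queue would replace it."""

    def __init__(self) -> None:
        self._q: "queue.Queue[str]" = queue.Queue()

    def put(self, message: str) -> None:
        self._q.put(message)

    def get(self) -> Optional[str]:
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None


def run_worker(q: MessageQueue, handle: Callable[[str], None], max_idle_polls: int = 1) -> int:
    """Poll the queue and handle messages; stop after max_idle_polls empty polls in a row."""
    processed = 0
    idle = 0
    while idle < max_idle_polls:
        message = q.get()
        if message is None:
            idle += 1
            continue
        idle = 0
        handle(message)
        processed += 1
    return processed


q = InMemoryQueue()
for text in ["resize-image-1", "send-email-2"]:
    q.put(text)

handled = []
print(run_worker(q, handled.append))  # 2
```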
How can the entities in my data model be written such that I can deploy to Azure TDS or some other storage when not deploying to Azure? Would MongoDB or similar be useful for this?
NoSQL is a general term encapsulating lots of different technologies. I think Azure TDS currently belongs to the key-value store family of NoSQL, while MongoDB is more of a document database offering much richer functionality than TDS - see http://en.wikipedia.org/wiki/NoSQL_(concept). For mimicking Azure TDS I think a variant of something like Redis might work (although I believe Redis itself has wider functionality than TDS currently).
In general, it depends on the shape of your data, but I suspect if you can fit it in Azure TDS, then you'll be able to fit it into your choice of other storage too.
Surely there is a way to architect for Azure without being married to it.
Yes - as you've suggested in your question, you can architect your app so it can work on other technologies instead. In fact, this is quite a similar challenge to the traditional SQL data abstraction methods. However, I think there are a few places where you'll find TDS pushing you in certain directions which won't fit well with other stores - e.g. Azure pushes you much more towards data replication; has very specific rules on keys; offers high performance using very specific mechanisms; and offers limited transaction integrity in very specific situations. These factors may mean that you do indeed have to change some middle-tier layers as well as some data layers in order to get the most out of your app in both its Azure and non-Azure variations.
One other thought - it might be easier to offer your clients a multi-tenant SaaS version on Azure, and a single-tenant version hosted on Azure - but this does depend on the clients!
I found a viable solution. I found that I can use EF Code First with SQL Server or SQL CE if I design my entities with the same PartitionKey & RowKey compound key structure that Azure Table Storage requires.
With a little help from Lokad Cloud (http://code.google.com/p/lokad-cloud/) to perform the interaction with Azure Table Storage, I was able to craft a common DataContext that provides CRUD operations against either EF's DbContext or Lokad's TableStorageProvider.
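The general shape of such an abstraction can be sketched as one interface keyed by PartitionKey and RowKey with two interchangeable backends. The Python sketch below is an illustration only, not the EF or Lokad code described here: the class names are hypothetical, a dictionary stands in for Table Storage, and SQLite stands in for SQL Server or SQL CE with the same compound key as the primary key.

```python
import json
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Optional, Tuple


class EntityStore(ABC):
    """Common CRUD surface addressed by the PartitionKey/RowKey pair."""

    @abstractmethod
    def upsert(self, partition_key: str, row_key: str, data: dict) -> None: ...

    @abstractmethod
    def get(self, partition_key: str, row_key: str) -> Optional[dict]: ...


class TableLikeStore(EntityStore):
    """Dictionary stand-in for a key-value table service."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], dict] = {}

    def upsert(self, partition_key: str, row_key: str, data: dict) -> None:
        self._rows[(partition_key, row_key)] = dict(data)

    def get(self, partition_key: str, row_key: str) -> Optional[dict]:
        row = self._rows.get((partition_key, row_key))
        return dict(row) if row is not None else None


class SqlStore(EntityStore):
    """Relational backend using the same compound key as its primary key."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS entities ("
            "partition_key TEXT NOT NULL, row_key TEXT NOT NULL, "
            "data TEXT NOT NULL, PRIMARY KEY (partition_key, row_key))"
        )

    def upsert(self, partition_key: str, row_key: str, data: dict) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO entities VALUES (?, ?, ?)",
            (partition_key, row_key, json.dumps(data)),
        )
        self._conn.commit()

    def get(self, partition_key: str, row_key: str) -> Optional[dict]:
        row = self._conn.execute(
            "SELECT data FROM entities WHERE partition_key = ? AND row_key = ?",
            (partition_key, row_key),
        ).fetchone()
        return json.loads(row[0]) if row else None


# The calling code is identical for both backends.
for store in (TableLikeStore(), SqlStore(sqlite3.connect(":memory:"))):
    store.upsert("customers", "42", {"name": "Contoso"})
    print(store.get("customers", "42"))
```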
I even found a nice way to manage relationships between entities and lazy-load them properly.
The solution is a bit complex and needs more testing. I will blog about it and post the link here when ready.