OrientDB in Azure - orientdb

We would like to use OrientDB Graph in an Azure environment. Does anybody has experience using it? We also would like to know if high availability from OrientDB is required under Azure cloud? Azure already offers high availability for Azure storage, Azure Drive and SQL. I understand that they have replications and load balancing built in.
This is super important because we prefer not to get into the business of replications and infrastructure management.
Thanks

So you can spin up 2 or more machines and install OrientDB on them, then configure them together as a distributed cluster. However I haven't been able to find any way that is simpler, easier to do. I am interested in this topic too.

Azure does have features such as geo-replication, which is protects your data against a major data-center incident but doesn't provide any performance benefit and will not make it highly available.
Although pretty reliable, occasionally Microsoft will reboot servers for updates, so to protect against downtime you can use affinity groups so that, of your 2 or more servers, one will always be online. This however does need to be used in conjunction with database replication and ideally load balancing.
It's also worth noting that OrientDB recommends clusters have an odd number of servers as this can prevent conflicts when synchronising data after a communication issue between the servers.

I am using it in amazon and I had to create a java project to monitor http requests inserts and queries. The queries are very fast but takes longer inserting data .
I recommend this type of graph database mode to decrease the time of the queries. Also if you have empty fields OrientDB manages very well compared to other databases .
If you need help with the java project can response to this post and I´ll help u.
I hope it helps. Good luck.

Related

Google Cloud SQL Read replica's in other regions

We are currently investigating the options to make a partly switch to Google Cloud SQL. What we are searching for is a setup by which data is available for reading in multiple regions to increase the speed of the web-application. Writing from multiple regions would off course be great, but that's not really something MySQL does when you also want to have speed on your side :-)
What we would like to setup is a master-slave setup through which the Master would be in Europe and slaves (for reading) would be available in the US and Asia. This way we can provide information to our customers from a VM + SQL instance in Asia without having to connect to a database in Europe.
As far as I am aware it is not possible to currently add a read-instance outside of the region of the master. Is that correct?
Or, would it be possible to create our own MySQL read-only instance and let it replicate from a Google Cloud SQL instance? This would not be preferable (database administration, server administration) but is off course an option.
You can do cross-region replication in Cloud SQL, although it is not straight forward because the performance will not be great. You have to create a master in Cloud SQL, then create a replica with external master pointing at the master you created: https://cloud.google.com/sql/docs/replication#external-master
You can go in the other direction as well: https://cloud.google.com/sql/docs/replication#replication-external
These features are only supported for first generation of Cloud SQL.
Cloud Spanner is a relational database that supports transactional consistency on a global scale. It is an SQL Database and works great in a Multi-region environment. Therefore, It can be a good choice for your case. For more info, please check https://cloud.google.com/spanner/

Multiple server instance load and database balancing

Please correct me if I am wrong but I guess handling more requests and load by adding more machines or balancing the load between multiple servers is horizontal scaling. So, if I add more servers, how do I distribute the database? Do I create only one database to hold the user records with multiple servers? Or do I split the database too? Then what about database integrity? How to synchronize it? I am a newbie and really confused but eager to learn. I would like to use postgresql for my project and would like to know some basic things before I start. What I want to do is, create two servers for the application load balancing (please correct me if its not what I have to do). How will I manage the database across these without database integrity? Do I have to replicate the data within the two servers? How do you guyz have multiple instance and manage database? Do I need to go through sharding for this? What would be the best approach to have many instance without database integrity in accordance to postgresql. I would really appreciate if anyone could explain it to me. Thank you!
Not sure if you are looking just for a service, which could bring you what you need so you don't have to spend time on it, or if on the other hand, you would like to implement that on your end, which I guess could be quite complex.
If you are looking for your own solution, maybe you should take a look at Postgres-XC, which is a group that provides a database cluster based upon PostgreSQL database.
On the other hand, if you are just interested in the development process and don't want to spend time on this when you can have it on the cloud, maybe you would like to take a look at EnterpriseDB which provides PostgreSQL on the cloud.
For your application, you can also use a cloud service in which you can even auto-scale your app depending on some parameters as it is explained here.

Sharing object between Node.js server with memcached / couchcase cluster

I was looking for a way to share object in several nodes cluster, and after a bit of research I thought it would be best using redis pub/sub. Then, I saw that redis doesn't support cluster yet, which means that a system based on redis will have single point of failure. Since high availability is a key feature for me, this solution is not applicable.
At the moment, I am looking into 2 other solutions for this issue:
Memcached
Couchbase
I have 2 questions:
On top of which solution it would be more efficient to simulate pub/sub?
which is better when keeping clusters in mind?
I was hoping that someone out there faced similar issues and share his experience.
I think it's a bad idea to use memcached and couchbase for pub/sub. Both solutions don't provide built-in pub/sub functions and implementing pub/sub on app side can cause a lot of ops/sec to memcache/couchbase server and as a result you'll get slow performance.
Couchbase stores data into disk, so for temporary storage it's better to use memcaced. It will be faster and will not load your disk.
If you can avoid that "pub/sub" and use memcached/couchbase just as simple HA shared key-value storage - do it. It will be much better then pub/sub.
When you install Couchbase server it provides 2 types of buckets: couchbase (with disk persistance, ability to create views, etc.) and memcached (only in-memory key-value storage). Both types of buckets act in the same way in clusters. Also couchbase support memcache api calls, so you don't need to change code to test both variants.
I've tried to use memcached provider for socket.io "pub/sub" sharing, but as I mentioned before it's ugly. And in my case there were few node.js servers with socket.io, so instead of sharing I've implemented something like "p2p messaging" between servers on top of sockets.
UPD: If you have such big amount of data may be it will be better not to have one shared storage, but use something like sharding with "predictible" data location.

Azure Table Vs MongoDB on Azure

I want to use a NoSQL database on Windows Azure and the data volume will be very large. Whether a Azure Table storage or a MongoDB database running using a Worker role can offer better performance and scalability? Has anyone used MongoDB on Azure using a Worker role? Please share your thoughts on using MongoDB on Azure over the Azure table storage.
Table Storage is a core Windows Azure storage feature, designed to be scalable (100TB 200TB 500TB per account), durable (triple-replicated in the data center, optionally georeplicated to another data center), and schemaless (each row may contain any properties you want). A row is located by partition key + row key, providing very fast lookup. All Table Storage access is via a well-defined REST API usable through any language (with SDKs, built on top of the REST APIs, already in place for .NET, PHP, Java, Python & Ruby).
MongoDB is a document-oriented database. To run it in Azure, you need to install MongoDB onto a web/worker roles or Virtual Machine, point it to a cloud drive (thereby providing a drive letter) or attached disk (for Windows/Linux Virtual Machines), optionally turn on journaling (which I'd recommend), and optionally define an external endpoint for your use (or access it via virtual network). The Cloud Drive / attached disk, by the way, is actually stored in an Azure Blob, giving you the same durability and georeplication as Azure Tables.
When comparing the two, remember that Table Storage is Storage-as-a-Service: you simply access a well-known REST endpoint. With MongoDB, you're responsible for maintaining the database (e.g. whenever MongoDB Inc (formerly 10gen) pushes out a new version of MongoDB, you'll need to update your server accordingly).
Regarding MongoDB Inc's alpha version pointed to by jtoberon: If you take a close look at it, you'll see a few key things:
The setup is for a Standalone mongodb instance, without replica-sets or shards. Regarding replica-sets, you still get several benefits using the Standalone version, due to the way Blob storage works.
To provide high-availability, you can run with multiple instances. In this case, only one instance serves the database, and one is a 'warm-standby' that launches the mongod process as soon as the other instance fails (for maintenance reboot, hardware failure, etc.).
While 10gen's Windows Azure wrapper is still considered 'alpha,' mongod.exe is not. You can launch the mongod exe just like you'd launch any other Windows exe. It's just the management code around the launching, and that's what the alpa implementation is demonstrating.
EDIT 2011-12-8: This is no longer in an alpha state. You can download the latest MongoDB+Windows Azure project here, which provides replica-set support.
For performance, I think you'll need to do some benchmarking. Having said that, consider the following:
When accessing either Table Storage or MongoDB from, say, a Web Role, you're still reaching out to the Windows Azure Storage system.
MongoDB uses lots of memory for its own cache. For this reason, lots of high-scale MongoDB systems are deployed to larger instance sizes. For Table Storage access, you won't have the same memory-size consideration.
EDIT April 7, 2015
If you want to use a document-based database as-a-service, Azure now offers DocumentDB.
I have used both.
Azure Tables : dead simple, fast, really hard to write even simple queries.
Mongo : runs nicely, lots of querying capabilities, requires several instances to be reliable.
In a nutshell,
if your queries are really simple (key->value), you must run a cost comparison (mainly number of transactions against the storage versus cost of hosting Mongo on Azure). I would rather go to table storage for that one.
If you need more elaborate queries and don't want to go to SQL Azure, Mongo is likely your best bet.
I realize that this question is dated. I'd like to add the following info for those who may come upon this question in their searches.
Note that now, MongoDB is offered as a fully managed service on Azure. (officially in Beta as of Apr '15)
See:
http://www.mongodb.com/partners/cloud/microsoft
or
https://azure.microsoft.com/en-us/blog/announcing-new-mongodb-instances-on-microsoft-azure/
See (including pricing):
https://azure.microsoft.com/en-us/marketplace/partners/mongolab/mongolab/
My first choice is AzureTables because SAAS model and low cost and SLA 99.99% http://alexandrebrisebois.wordpress.com/2013/07/09/what-if-20000-windows-azure-storage-transactions-per-second-isnt-enough/
some limits..
http://msdn.microsoft.com/en-us/library/windowsazure/jj553018.aspx
http://www.windowsazure.com/en-us/pricing/calculator/?scenario=data-management
or AzureSQL for small business
DocumentDB
http://azure.microsoft.com/en-us/documentation/services/documentdb/
http://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/
second choice is many cloud providers including Amazon offer S3
or Google tables https://developers.google.com/bigquery/pricing
nTH choice manage the SHOW all by myself have no sleep MongoDB well I will look again the first two SAAS
My choice if I am running "CLOUD" I will go for SAAS model as much as possible "RENT-IT"...
The question is what my app needs is it AzureTables or DocumentDB or AzureSQL
DocumentDB documentation
http://azure.microsoft.com/en-us/documentation/services/documentdb/
How Azure pricing works
http://azure.microsoft.com/en-us/pricing/details/documentdb/
this is fun
http://www.documentdb.com/sql/demo
At Build 2016 it was announced that DocumentDB would support all MongoDB drivers. This solves some of the lack of tooling issues with DocDB and also makes it easier to migrate Mongo apps.
Above answers are all good - but the real answer depends on what your requirements are. You need to understand what size of data you are processing, what types of operations you want to perform on the data and then select the solution that meets your needs.
One thing to remember is Azure Table Storage doesn't support complex data types.It supports every property in entity to be a String or number or boolean or date etc.
One can't store an object against a key,which i feel is must for NoSql DB.
https://learn.microsoft.com/en-us/rest/api/storageservices/fileservices/understanding-the-table-service-data-model scroll to Property Types

Windows Azure TDS emulation on a production non-Azure IIS server

I am developing a c# web application that will be hosted in Windows Azure and use Table Data Storage (TDS).
I want to architect my application such that I can also (as an option) deploy the application to a traditional IIS server with some other NoSql back-end. Basically, I want to give my customers the option to either pay me in the software as a service model, OR purchase a license of my application that they can install on a (non-azure) production server of their own.
How can I best architect my data layer and middle tier to achieve both goals?
I will likely need a Windows Azure Worker Role and an Azure Queue. How complicated is to replicate these? Can I substitue a custom Windows Service and some other queuing technology?
How I can the entities in my data model be written such that I can deploy to Azure TDS or some other storage when not deploying to Azure? Would MongoDB or similar be useful for this?
Surely there is a way to architect for Azure without being married to it.
I will likely need a Windows Azure Worker Role and an Azure Queue. How complicated is to replicate these? Can I substitue a custom Windows Service and some other queuing technology?
Yes - a Windows service with some other queuing technology would fit this reasonably well - and worker roles have a main/Run loop which is easy to use within a Windows Service.
How I can the entities in my data model be written such that I can deploy to Azure TDS or some other storage when not deploying to Azure? Would MongoDB or similar be useful for this?
NoSql is a general term encapsulating lots of different technologies. I think Azure TDS currently belongs to the Key-Value store family of NoSql, while MongoDB is more of a document database offering much richer functionality than TDS - see http://en.wikipedia.org/wiki/NoSQL_(concept). For mimmicking Azure TDS I think maybe a variant of something like Redis might work (although I believe Redis itself has wider functionality then TDS currently)
In general, it depends on the shape of your data, but I suspect if you can fit it in Azure TDS, then you'll be able to fit it into your choice of other storage too.
Surely there is a way to architect for Azure without being married to it.
Yes - as you've suggested in your question, you can architect your app so it can work on other technologies instead. In fact, this is quite a similar challenge to the traditional SQL data abstraction methods. However, I think there are a few places where you'll find TDS pushing you in certain
directions which won't fit well with other stores - e.g. Azure pushes you much more towards data replication; has very specific rules on keys; offers high performance using very specific mechanisms; and offers limited transaction integrity in very specific situations. These factors may mean that you do have to indeed change some middle tier layers as well as some data layers in order to get the most out of your app in both its Azure and non-Azure variations.
One other thought - It might be easier to offer your clients a multitenant SaaS version on Azure, and a singletenant version hosted on Azure - but this does depend on the clients!
I found a viable solution. I found that I can use EF Code First with SQL Server or SQL CE if I design my entities with the same PartitionKey & RowKey compound key structure that Azure Table Storage requires.
With a little help from Lokad Cloud (http://code.google.com/p/lokad-cloud/) to perform the interaction with Azure Table Storage, I was able to craft a common DataContext that provides crud operations against either EF's DbContext OR Lokad's TableStorageProvider.
I even found a nice way to manage relationships between entities and lazy-load them properly.
The solution is a bit complex and needs more testing. I will blog about it and post the link here when ready.