How good is BDR for production Postgres sync?

I have a system where multiple satellites create financial transactions that need to sync up with a core server. The satellites are remote servers that run Rails apps with a local Postgres database. The core is another Rails app with its own Postgres database. The satellites and the core have pretty much the same schema (but not identical), and everything is containerized (apps and databases). Very rarely, the core server updates some data that all satellites need.

Currently I have one satellite, but this number will grow (I don't think to more than 100, and only in the distant future). There is no problem of sequencing or contention between the core and the satellites: the core will never update the same transaction as any satellite, and no satellite will update the same transaction as any other satellite. Even better, the financial transactions have a UUID as the primary key.
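For reference, here is a simplified sketch of the shared table (names are illustrative, not our real schema; gen_random_uuid() comes from the pgcrypto extension on Postgres 9.x):

    -- Simplified, illustrative sketch: UUID primary keys let each satellite
    -- insert rows independently with no risk of key collisions.
    CREATE EXTENSION IF NOT EXISTS pgcrypto;  -- provides gen_random_uuid()

    CREATE TABLE financial_transactions (
        id         uuid PRIMARY KEY DEFAULT gen_random_uuid(),
        satellite  text NOT NULL,                     -- originating node
        amount     numeric(12,2) NOT NULL,
        created_at timestamptz NOT NULL DEFAULT now()
    );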
Since this is a multi-master sync problem, I naturally came across BDR. I have the following questions:
Is BDR production ready and stable? I’m reading about several competing technologies (like Bucardo and Londiste). Will it really be part of Postgres 9.6?
Can BDR handle a disconnected model? I don’t think this will be very often, but my satellites could be disconnected for hours.
Can BDR do selective syncs? For example, I'd only want certain tables to be synced.
Could BDR handle 100 satellites?

Is BDR production ready and stable?
Yes, BDR 1.0 for BDR-Postgres 9.4 is production-ready and stable. But then I would say that since I work for 2ndQuadrant, who develop BDR.
It is not a drop-in replacement for standalone PostgreSQL that you can use without application changes though. See the overview section of the manual.
I’m reading about several competing technologies (like Bucardo and Londiste).
They're all different. Different trade-offs. There's some discussion of them in the BDR manual, but of course, take that with a grain of salt since we can hardly claim to be unbiased.
Will it really be part of Postgres 9.6?
No, definitely not. Where have you seen that claim?
An extension that adds BDR on top of PostgreSQL 9.6 will be released in the future (it is not available yet), when it's ready. But it won't be part of PostgreSQL 9.6 itself; it'll be something you install on top.
Can BDR handle a disconnected model? I don’t think this will be very often, but my satellites could be disconnected for hours.
Yes, it handles temporary network partitions and outages well, with some caveats around global sequences. See the manual for details.
Can BDR do selective syncs?
Yes. See the manual for replication sets.
Table structure is always replicated, and at the moment initial table contents are too. Ongoing changes, however, can be replicated selectively, table by table.
For example, I’d only want certain tables be sync-ed.
Sure.
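Roughly, assigning a table to a replication set looks like this (a sketch based on the BDR 0.9/1.x function names; check the manual for the exact signatures in your version):

    -- Put the table into a named replication set (sketch; verify the exact
    -- function signatures against the BDR manual for your version).
    SELECT bdr.table_set_replication_sets('public.financial_transactions',
                                          ARRAY['to_core']);

    -- Check which sets the table currently belongs to.
    SELECT bdr.table_get_replication_sets('public.financial_transactions');

Which nodes actually receive a given set is configured on the node connections; again, see the replication sets chapter of the manual.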
Could BDR handle 100 satellites?
Not well. It's a mesh topology that would expect every satellite to talk to every other satellite. Also, with 100 nodes each node has 99 peers, so you'd have 198 extra backends (99 walsenders + 99 apply workers) per node. Not pretty.
You really want a hub-and-spoke (star) model where each satellite talks only to the hub. That's not supported in BDR 1.0, nor is it targeted for support in BDR 2.0.
I think this is a better use case for pglogical or Londiste.
I can't really go into more detail here, since it overlaps with commercial consulting services I am involved in. The team I work with designs things like this for customers as a professional service.

Related

Sitecore 8.1 update 2 MongoDB backup

I am using a replica set (2 mongod nodes, 1 arbiter) for my Sitecore CD servers.
Assuming all MongoDB data gets flushed to the reporting SQL DB, do we need to take backups of the MongoDB database on production CD?
If yes, what is the best approach and frequency, considering my application makes moderate use of analytics features (personalization, campaigns, etc.)?
Unfortunately, your assumption is bad - MongoDB is the definitive source of analytics data, not the reporting DB. The reporting DB contains (mostly) only the aggregate info needed for generating reports. In fact, if (when) something goes wrong with the SQL DB, the idea is that it is rebuilt from the source MongoDB. Remember: you can't un-add two numbers after you've added them!
Backup vs Replication
A backup is a point-in-time view of the database, whereas replication maintains multiple active copies of the current database. I would advocate replication over backup for this type of data. Why? Glad you asked!
Currency - under what circumstances would you want to restore a 50GB MongoDB? What if it was a week old? What if it was a month old? Really, the only useful data is current data, and websites are volatile places - log data backups are out of date within an hour. If you personalise on stale data, is that providing a good user experience?
Cost - backing up large datasets is costly in terms of time, storage capacity and compute requirements; they are also a pain to restore and the bigger they are the more likely there's a corruption somewhere
Run of business
In a production MongoDB environment you really should have 2-3 replicas. That's going to save your arse if one of the boxes dies, which they sometimes do - MongoDB works the disks very hard.
These replicas are self-healing and (pretty much) always current, so they are much better than taking backups. The chances that you lose all your replicas at once are really low, except for one particular edge case... upgrades. So a backup is really only protection against hardware failure or data corruption, which, in a multi-instance replica set, is already very effectively handled. Unless you're paranoid, you're never going to use that backup, and it'll cost you plenty to have it.
Sitecore Upgrades
This is the killer edge-case - always make backups (see Back Up and Restore with MongoDB Tools) before running an upgrade because you can corrupt all of your replicas in one motion and you'll want to be able to roll back.
Data Trimming (side-note)
You didn't ask this, but at some point you'll be thinking "how the heck can I back up this 170GB monster db every day? this is ridiculous" - and you'll be right.
There are various schools of thought around how long this data should be persisted for - that's a question only you or your client can answer. I suggest keeping it until there's too much, then make a decision on how much you have to get rid of. Keep as much as you can tolerate.

no single point of failure with traditional RDBMS

I am working on a trading application that depends on an Oracle DB.
The DB has crashed twice, and the business owner wants a solution in which the application keeps working even when the DB is down.
My team leader suggested Cassandra (NoSQL) as a solution, as it has no single point of failure, but this option would move us from the traditional relational model to the NoSQL model, which I consider a drawback.
My question: is there a way to avoid a single point of DB failure with a traditional relational DBMS like MySQL, PostgreSQL, etc.?
Sounds like you just need a cluster of Oracle database instances rather than a single instance - for example, Oracle RAC.
If your solution for the Oracle server being offline is to use Cassandra, what happens if the Cassandra cluster goes down? And are you really in the situation where it makes sense to rewrite and re-architect your entire application to use a different type of data store, just to avoid downtime from Oracle? I would suspect this only makes sense for applications with huge usage and load numbers, where any downtime is going to cost serious money (and not just embarrass the business folks in front of their bosses).
Is there a way to avoid a single point of DB failure with traditional relational DBMS
No, that's not possible with a single node, simply because when that one node dies, it is gone.
Any fault-tolerant system will use several nodes that replicate each other. You can still use traditional RDBMS, but you will need to configure mirroring in order for the system to tolerate a node failure.
NoSQL isn't the only possible solution. You can set up replication with MySQL:
http://dev.mysql.com/doc/refman/5.0/en/replication-solutions.html
and
http://mysql-mmm.org/
and concerning failover discussions:
http://serverfault.com/questions/274094/automated-failover-strategy-for-master-slave-mysql-replication-why-would-this
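To make that concrete, here is a minimal master/replica sketch in MySQL 5.x syntax (host, user, password, and binlog coordinates are placeholders):

    -- On the master: create a user the replica will connect as.
    CREATE USER 'repl'@'%' IDENTIFIED BY 'secret';
    GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

    -- On the replica: point it at the master and start replicating.
    CHANGE MASTER TO
        MASTER_HOST     = 'master.example.com',
        MASTER_USER     = 'repl',
        MASTER_PASSWORD = 'secret',
        MASTER_LOG_FILE = 'mysql-bin.000001',  -- from SHOW MASTER STATUS
        MASTER_LOG_POS  = 4;
    START SLAVE;

Combine this with a failover mechanism (MMM, or a manual promotion procedure) and the database is no longer a single point of failure, though automating master failover needs care - see the serverfault discussion above.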

Considerations for a RDBMS-agnostic transaction replication subsystem

I am working on an RDBMS-agnostic transaction replicator (primarily via ODBC to start, though my personal preferred RDBMS is PostgreSQL) for guaranteeing that data in two databases stays consistent.
This would be in a similar vein to TIBCO Rendezvous, but not targeted at Oracle, and (likely) non-commercial.
I have considered alternatives such as using a simple message queue, but if users/processes in two locales update the same object at the same time (or before a transaction can replicate), you are still left with the issue of authority and "who's right".
What are primary considerations to keep in mind, especially concerning the high potential for conflicts in the environment?
There are some solutions out there, but I have no idea how big the gap between reality and the marketing actually is.
http://symmetricds.codehaus.org/
http://www.continuent.com/solutions/tungsten-replicator
(update: 2015-03-13: does not seem to support Postgres any longer)
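Whichever tool you evaluate, the "who's right" question usually comes down to capturing enough metadata per change for a deterministic conflict rule. A rough, engine-neutral sketch (all names hypothetical):

    -- Hypothetical change-capture table: every write is logged with its
    -- origin and timestamp so the replicator can resolve conflicts with a
    -- deterministic rule (e.g. last-writer-wins, or fixed origin priority).
    CREATE TABLE change_log (
        change_id   BIGINT       NOT NULL PRIMARY KEY,  -- generation is engine-specific
        table_name  VARCHAR(128) NOT NULL,
        row_key     VARCHAR(128) NOT NULL,   -- primary key of the changed row
        origin_node VARCHAR(64)  NOT NULL,   -- which locale made the change
        changed_at  TIMESTAMP    NOT NULL,
        operation   CHAR(1)      NOT NULL    -- 'I', 'U' or 'D'
    );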

Which NoSQL DB is best fitted for OLTP financial systems?

We're designing an OLTP financial system. It should be able to support 10,000 transactions per second and have reporting features.
So we have come to the idea of using:
a NoSQL DB as our main storage
a MySQL DB (Percona Server, actually) running ETL jobs from the NoSQL DB to store reporting data
We're considering MongoDB and Riak for the NoSQL job. We have read that Riak scales more smoothly than MongoDB, and we would like to hear your opinion.
Which NoSQL DB would you use for an OLTP financial system?
How has your experience been scaling MongoDB/Riak?
There is no conceivable circumstance where I would use a NoSQL database for anything to do with finance. You simply don't have the data integrity or the internal controls you need. Dow Jones uses SQL Server for its transactions, and if they can properly design a high-performance, high-transaction relational database, so can you. You will have to invest in some people who know what they are doing, though.
One has to think about the problem differently. The notion of transaction consistency stems from the UD (update) in CRUD (Create, Read, Update, Delete). noSQL DBs are CRAP (Create, Replicate, Append, Process) oriented, working by accretion of time-stamped data. With the right domain model, there is no reason that auditability and the equivalent of referential integrity can't be achieved.
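To illustrate the accretion idea in relational terms (a hedged sketch; the same shape maps onto time-stamped NoSQL documents): balances are never updated in place, only derived from an immutable ledger.

    -- Append-only ledger: rows are inserted, never updated or deleted, so
    -- concurrent writers cannot overwrite each other and history stays auditable.
    CREATE TABLE ledger_entries (
        entry_id   bigserial PRIMARY KEY,
        account_id uuid          NOT NULL,
        amount     numeric(12,2) NOT NULL,   -- positive = credit, negative = debit
        created_at timestamptz   NOT NULL DEFAULT now()
    );

    -- The current balance is derived, not stored:
    SELECT account_id, sum(amount) AS balance
    FROM ledger_entries
    GROUP BY account_id;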
The global-storage-based NoSQL databases - Caché from InterSystems and GT.M from FIS - are used extensively in financial services and have been for many years. Caché in particular is used both for the core database and for OLTP.
I can answer regarding my experience with scaling Riak.
Riak scales smoothly to the extreme. Scaling is as easy as adding nodes to the cluster, which is a very simple operation in itself. You can achieve near linear scalability by simply adding nodes. Our experience with Riak as far as scaling is concerned has been amazing.
The flip side is that it is lacking in many respects. Some examples:
You can't do something like count(*) or list keys on a production cluster. That would require a workaround if you want to do ETL from Riak into MySQL - otherwise, how would you know what to (E)xtract?
(One possible workaround would be to maintain a bucket with a known key sequence that maps to values containing the keys you inserted into your other buckets.)
The free version of Riak comes with no management console to let you know what's going on, and the one included in the Enterprise version isn't much of an improvement.
You'll need the Enterprise version if you're looking to replicate your data over the WAN (e.g. for DR / high availability). That's alright if you don't mind paying, but keep in mind that Basho's pricing is very high.
I work with Starcounter (so I'm biased), but I think I can safely say that for a system processing financial transactions you have to worry about transaction consistency. Unfortunately, this is what the engines used for Facebook and Twitter had to give up to allow their scale-out strategy to offer performance. This is not because engines such as MongoDB or Cassandra are poorly designed; rather, it follows naturally from the CAP theorem (http://en.wikipedia.org/wiki/CAP_theorem). Simply put, changes you make in your database will overwrite other changes if they occur close together in time. That's OK for status updates and new tweets, but disastrous if you deal with money or other quantities: the amounts will simply end up wrong when many reads and writes are done in parallel. So for the throughput you need, a memory-centric NoSQL database with ACID support is probably the way to go.
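To make the lost-update point concrete in SQL terms (the accounts table here is hypothetical): a read-modify-write done by two parallel clients silently loses one change, whereas an atomic transactional update does not.

    -- Lost update: two clients both read balance = 100, both compute
    -- 100 - 30 = 70 and write 70 back; one withdrawal simply vanishes.
    --   SELECT balance FROM accounts WHERE id = 1;      -- both see 100
    --   UPDATE accounts SET balance = 70 WHERE id = 1;  -- applied twice

    -- Safe under an ACID engine: the update is atomic and serialized.
    BEGIN;
    UPDATE accounts SET balance = balance - 30 WHERE id = 1;
    COMMIT;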
You can use some NoSQL databases (Cassandra, EventStore) as storage for a financial service if you implement your app using event sourcing and concepts from DDD. I recommend reading this minibook: http://www.oreilly.com/programming/free/reactive-microservices-architecture.html
OLTP can be achieved using NoSQL with a custom implementation. There are two things to address:
1. How you are going to achieve the ACID properties that an RDBMS gives you.
2. How to provide a custom blocking or non-blocking concurrency and transaction handling mechanism.
To take you closer to a solution, look at Apache Phoenix, Apache Trafodion, or Splice Machine.
Trafodion has full ACID support over HBase; you should take a look.
Cassandra can be used for both OLTP and OLAP. Good replication and eventual consistency put the choice in your hands, but you need to design the system properly. And after all, it's free of cost, though not free of developer effort. Give it a try.

Does any RDBMS do auto scaling, sharding, re-balancing?

I think one of the advantages of NoSQL databases such as MongoDB is that they can scale horizontally automatically: just add a cheap machine and the data can "spread over" to the new machine.
How about for RDBMS, does any RDBMS do that automatically too?
The answer here is "kind of". MySQL doesn't really have anything native or "for free". Big RDBMS technologies like MSSQL and Oracle do have very good support for scaling out. However, both technologies are expensive, and there's no way to throw a thousand servers at MS SQL and say "have at it".
Of course, even with millions of dollars in servers and tech, you still won't have ready access to joins. I mean, how can you reliably join data across 500 servers?
Honestly, I think your question is probably best answered by the very existence of technologies like MongoDB and CouchDB. These technologies exist because developers need a way to reliably "horizontalize". RDBMSs, by their nature, are not good at horizontalizing. Again, how can you scale a join?
I have only worked with MySQL and MySQL supports partitioning. However, partitioning is restricted to a single database server, which means that horizontal-scaling (sharding the database to multiple machines) is not something that the database engine manages. This has to be managed at the application level.
MySQL Partitioning is said to work very well in write-heavy use-cases.
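For illustration, MySQL range partitioning looks roughly like this (table and ranges are hypothetical); note that every partition still lives on the same server:

    -- MySQL range partitioning (single server - this is not sharding).
    -- The partitioning column must be part of the primary key.
    CREATE TABLE events (
        id      BIGINT NOT NULL,
        created DATE   NOT NULL,
        payload VARCHAR(255),
        PRIMARY KEY (id, created)
    )
    PARTITION BY RANGE ( YEAR(created) ) (
        PARTITION p2013 VALUES LESS THAN (2014),
        PARTITION p2014 VALUES LESS THAN (2015),
        PARTITION pmax  VALUES LESS THAN MAXVALUE
    );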
To give you further direction:
Scaling mysql writes through partitioning at Yahoo
Database sharding at Netlog