Wait for transactional replication in ADO.NET or TSQL

My web app uses ADO.NET against SQL Server 2008. Database writes happen against a primary (publisher) database, but reads are load balanced across the primary and a secondary (subscriber) database. We use SQL Server's built-in transactional replication to keep the secondary up-to-date. Most of the time, the couple of seconds of latency is not a problem.
However, I do have a case where I'd like to block until the transaction is committed at the secondary site. Blocking for a few seconds is OK, but returning a stale page to the user is not. Is there any way in ADO.NET or TSQL to specify that I want to wait for the replication to complete? Or can I, from the publisher, check the replication status of the transaction without manually connecting to the secondary server?
[edit]
99.9% of the time, the data in the subscriber is "fresh enough". But there is one operation that invalidates it. I can't read from the publisher every time on the off chance that it's become invalid. If I can't solve this problem under transactional replication, can you suggest an alternative architecture?

There's no such solution for SQL Server, but here's how I've worked around it in other environments.
Use three separate connection strings in your application, and choose the right one based on the needs of your query:
Realtime - Points directly at the one master server. All writes go to this connection string, and only the most mission-critical reads go here.
Near-Realtime - Points at a load balanced pool of subscribers. No writes go here, only reads. Used for the vast majority of OLTP reads.
Delayed Reporting - In your environment right now, it's going to point to the same load-balanced pool of subscribers, but down the road you can use a technology like log shipping to have a pool of servers 8-24 hours behind. These scale out really well, but the data's far behind. It's great for reporting, search, long-term history, and other non-realtime needs.
If you design your app to use those 3 connection strings from the start, scaling is a lot easier, especially in the case you're experiencing.
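As a rough illustration of the three-connection-string idea, here is a minimal Python sketch. The server names, the `Freshness` enum, and `connection_string_for` are hypothetical names introduced for this example; the only rules taken from the answer are that all writes go to the master and that reads are routed by how fresh the data must be.

```python
from enum import Enum


class Freshness(Enum):
    REALTIME = "realtime"            # must see the latest committed write
    NEAR_REALTIME = "near_realtime"  # a few seconds of lag is acceptable
    DELAYED = "delayed"              # hours of lag is acceptable


# Hypothetical connection strings; host and database names are placeholders.
# DELAYED points at the same subscriber pool for now, as the answer suggests.
CONNECTION_STRINGS = {
    Freshness.REALTIME: "Server=publisher01;Database=App;Integrated Security=true",
    Freshness.NEAR_REALTIME: "Server=subscriber-pool;Database=App;Integrated Security=true",
    Freshness.DELAYED: "Server=subscriber-pool;Database=App;Integrated Security=true",
}


def connection_string_for(freshness, is_write=False):
    """Pick a connection string; every write goes to the publisher."""
    if is_write:
        return CONNECTION_STRINGS[Freshness.REALTIME]
    return CONNECTION_STRINGS[freshness]
```

With this shape, the one operation that cannot tolerate stale data asks for `Freshness.REALTIME`, and everything else keeps using the subscriber pool.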

You are describing a synchronous mirroring situation. Replication cannot, by definition, support your requirement. Replication must wait for a transaction to commit before reading it from the log and delivering it to the distributor and from there to the subscriber, which means replication by definition has a window of opportunity for data to be out of sync.
If you have a requirement for an operation to read the authoritative copy of the data, then you should make that decision in the client and ensure you read from the publisher in that case.
While you can, in theory, validate whether a certain transaction was distributed to the subscriber or not, you should not base your design on it. Transactional replication makes no latency guarantee, by design, so you cannot rely on a 'perfect day' operation mode.
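If you do experiment with checking the subscriber, one way to respect the "no latency guarantee" warning is to bound the wait and fall back to the publisher. The sketch below assumes a hypothetical callable, `subscriber_has_change`, that reports whether the change is visible on the subscriber (for example by comparing a version or marker row); how that check is implemented is outside this sketch.

```python
import time


def wait_until(check, timeout_s=5.0, interval_s=0.1,
               clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def choose_read_target(subscriber_has_change, timeout_s=5.0, interval_s=0.1):
    """Return 'subscriber' if the change arrived in time, else 'publisher'.

    subscriber_has_change is a hypothetical callable supplied by the caller.
    """
    if wait_until(subscriber_has_change, timeout_s, interval_s):
        return "subscriber"
    return "publisher"
```

The fallback matters more than the wait: on a bad day the code still returns fresh data, just from the publisher.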

Related

Synchronous vs asynchronous streaming replication for Postgres with PgPool

After reading the documentation of PgPool I was left confused which option would suit my use case best. I need a main database instance which would serve the queries and 1 or more replicas (standbys) of the main one which would be used for disaster recovery scenarios.
What is very important for me is that all transactions committed successfully to the master node are guaranteed to be replicated eventually to the replicas such that when a failover occurs, the replica database instance has all transactions up to and including the latest one applied to it.
In terms of asynchronous replication, I have not seen any mention whether that is the case in the PgPool documentation, however, it does indeed mention some potential data loss occurring which is a bit too vague for me to draw any conclusions.
To combat this data loss, the documentation suggests using synchronous streaming replication, which, before committing a transaction on the main node, ensures that all replicas have applied that change as well. This method is therefore slower than the asynchronous one, but if it prevents data loss, it could be viable.
Is synchronous replication the only method that allows me to achieve my use-case or would the asynchronous replication also do the trick? Also, what constitutes the potential data loss in the asynchronous replication?
Asynchronous replication means that the primary server does not wait for a standby server before reporting a successful COMMIT to the client. As a consequence, if the primary server fails, it is possible that the client believes that a certain transaction is committed, but none of the standby servers has yet received the WAL information. In a high availability setup, where you promote a standby in case of loss of the primary server, that means that you could potentially lose committed transactions, although it typically takes only a split second for the information to reach the standby.
With synchronous replication, the primary waits until the first available synchronous standby server has reported that it has received and persisted the WAL information before reporting a successful COMMIT to the client (the details of this, like which standby server is chosen, how many of them have to report back and when exactly WAL counts as received by the standby are configurable). So no transaction that has been reported committed to the client can get lost, even if the primary server is gone for good.
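The difference between the two modes can be shown with a toy model. This is a minimal simulation, not PostgreSQL code: `Primary`, `Standby`, and `lost_after_failover` are names invented for the example, and "receiving" WAL is reduced to appending to a list.

```python
class Standby:
    def __init__(self):
        self.wal = []          # WAL records received and persisted

    def receive(self, record):
        self.wal.append(record)


class Primary:
    def __init__(self, standby, synchronous):
        self.standby = standby
        self.synchronous = synchronous
        self.wal = []
        self.unsent = []       # committed locally but not yet shipped

    def commit(self, record):
        """Commit locally and return once the client may be told 'COMMIT'."""
        self.wal.append(record)
        if self.synchronous:
            self.standby.receive(record)   # wait for the standby before acking
        else:
            self.unsent.append(record)     # ack now, ship later
        return "COMMIT"

    def ship_pending(self):
        """Background WAL shipping used in asynchronous mode."""
        while self.unsent:
            self.standby.receive(self.unsent.pop(0))


def lost_after_failover(synchronous):
    """Commit two transactions, lose the primary before shipping, promote the standby."""
    standby = Standby()
    primary = Primary(standby, synchronous)
    acknowledged = []
    for record in ("tx1", "tx2"):
        if primary.commit(record) == "COMMIT":
            acknowledged.append(record)
    # The primary is lost here, before ship_pending() ever runs.
    return [r for r in acknowledged if r not in standby.wal]
```

In the asynchronous run, both acknowledged transactions are missing on the promoted standby; in the synchronous run, none are. In practice the asynchronous window is usually a fraction of a second rather than "everything", as noted above.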
While it is technically simple to configure synchronous replication, it poses an architectural and design problem, so that asynchronous replication is often the better choice:
Synchronous replication slows down all data modification drastically. To work reasonably well, the network latency between primary and standby has to be very low. You usually cannot reasonably use synchronous replication between different data centers.
Synchronous replication reduces the availability of the whole system, because failure of the standby server prevents any data modification from succeeding. For that reason, you need to have at least two synchronous standby servers, one that is kept synchronized and one as a stand-in.
Even with synchronous replication, it is not guaranteed that reading from the standby after writing to the primary will give you the new data, because by default the primary doesn't wait for WAL to be replayed on the standby. If you want that, you need to set synchronous_commit = remote_apply on the primary server. But since queries on the standby can conflict with WAL replay, you will either have to deal with replication (and commit) delay or canceled queries on the standby. So using synchronous replication as a tool for horizontal scaling is reasonably possible only if you can deal with data modifications not being immediately visible on the standby.
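To make the last point concrete, the following sketch models which stage the primary waits for under each `synchronous_commit` value, assuming a synchronous standby is configured. The stage names and helper functions are invented for this example; only the setting values are PostgreSQL's own.

```python
# Stages a commit record passes through, in order (names are illustrative).
STAGES = ["local_flush", "standby_write", "standby_flush", "standby_apply"]

# Last stage the primary waits for before COMMIT returns, per
# synchronous_commit value, assuming a synchronous standby is configured.
# None means the primary does not wait at all.
WAITS_FOR = {
    "off": None,
    "local": "local_flush",
    "remote_write": "standby_write",
    "on": "standby_flush",
    "remote_apply": "standby_apply",
}


def guaranteed_stages(level):
    """Stages known to be complete when COMMIT returns to the client."""
    last = WAITS_FOR[level]
    if last is None:
        return []
    return STAGES[: STAGES.index(last) + 1]


def read_your_writes_on_standby(level):
    """True only if the change has been replayed on the standby by COMMIT time."""
    return "standby_apply" in guaranteed_stages(level)
```

Only `remote_apply` makes `read_your_writes_on_standby` return `True`, which is why the default setting (`on`) protects against data loss but not against stale reads on the standby.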

How to read/write to secondary member of a MongoDB replica-set?

I am currently planning some server infrastructure. I have two servers in different locations. My apps (APIs and stuff) are running on both of them. The client connects to the nearest (best connection). In case of failure of one server, the other can process the requests. I want to use MongoDB for my projects. The first idea is to use a replica set, so that I can ensure the data is consistent. If one server fails, the data is still accessible and the secondary switches to primary. When the app on the primary server wants to use the data, it is fine, but the other server must connect to the primary server in order to handle data (that would solve the failover, but not the "best connection" problem). In MongoDB there is an option to read data from secondary servers, but then I have to ensure that the inserts (only possible on primary) are consistent on every secondary. There is also an option for this, "writeConcern". Is it possible to somehow specify “writeConcern on specific secondary”? Because if I add a second secondary without the apps on it, "writeConcern" on every secondary would not be necessary. And if I specify a specific value, I don't really know on which secondary the data is available, right?
Summary: I want to reduce the connections between the servers when the api is called.
Please share some thoughts or ideas to fix my problem.
Writes can only be done on primaries.
To control which secondary the reads are directed to, you can use max staleness as well as tags.
"that the inserts (only possible on primary) are consistent on every secondary."
I don't understand what you mean by this phrase.
If you have two geographically separated datacenters, A and B, it is physically impossible to write data in A and instantly see it in B. You must either wait for the write to propagate or wait for the read to fetch data from the remote node.
To pay the cost at write time, set your write concern to the number of nodes in the deployment (2, in your proposal). To pay the cost at read time, use primary reads.
Note that merely setting write concern equal to the number of nodes doesn't make all nodes have the same data at all times - it just makes your application only consider the write successful when all nodes have received it. The primary can still be ahead of a particular secondary in terms of operations committed.
And, as noted in comments, a two-node replica set will not accept writes unless both members are operational, which is why it is generally not a useful configuration to employ.
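The arithmetic behind these two points can be sketched in a few lines. This is a simplified model, not driver code: the function names are invented for the example, every member is assumed to be a voting member, and timeouts are ignored.

```python
def majority(voting_members):
    """Smallest number of members that forms a strict majority."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable):
    """A replica set needs a majority of voting members to have a primary."""
    return reachable >= majority(voting_members)


def write_acknowledged(w, nodes_with_write):
    """A write with numeric write concern w is acknowledged once w nodes have it."""
    return nodes_with_write >= w
```

For a two-node set, `majority(2)` is 2, so losing either member leaves no primary and therefore no writes. With three members, one can fail and the set still has a primary, although a write concern of 3 would still wait for the missing node.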
"Summary: I want to reduce the connections between the servers when the api is called."
This has nothing to do with the rest of the question, and if you really mean this it's a premature optimization.
If what you want is faster network I/O I suggest looking into setting up better connectivity between your application and your database (for example, I imagine AWS would offer pretty good connectivity between their various regions).

PostgreSQL synchronous replication consistency

If we compare multiple types of replication (single-leader, multi-leader or leaderless), single-leader replication has the possibility to be linearizable. In my understanding, linearizability means that once a write completes, all later reads should return the value of that write or of a later write. In other words, the system should give the impression that there is only one copy of the data. So, I guess, no stale reads.
PostgreSQL, in its streaming replication, has the ability to make all its replicas synchronous using synchronous_standby_names, and it can be fine-tuned with the synchronous_commit option, which can be set to remote_apply so that the leader waits until the transaction is replayed on the standby (making it visible to queries). In the documentation, in the paragraph about the remote_apply option, it states that this allows load-balancing in simple cases with causal consistency.
A few pages back, it says this:
"Some solutions are synchronous, meaning that a data-modifying transaction is not considered committed until all servers have committed the transaction. This guarantees that a failover will not lose any data and that all load-balanced servers will return consistent results no matter which server is queried."
So I'm struggling to understand what can be guaranteed, and what anomalies can happen if we load-balance read queries to the read replicas. Can there still be stale reads? Can querying different replicas return different results even when no further write has happened on the leader? My impression is yes, but I'm not really sure.
If not, how does PostgreSQL prevent stale reads? I did not find anything with more details on how it fully works under the hood. Does it use two-phase commit, or some modification of it, or does it use some other algorithm to prevent stale reads?
If it does not provide an option for no stale reads, is there a way to accomplish that? I saw that PgPool has the option to load-balance to the replicas that are behind by no more than a defined threshold, but I did not understand whether it could be configured to load-balance only to replicas that are fully caught up with the leader.
It's really confusing to me whether anomalies can happen in a fully synchronous replication setup in PostgreSQL.
I understand that a setup like this has problems with availability, but that is not a concern right now.
If you use synchronous replication with synchronous_commit = remote_apply, you can be certain that you will see the modified data on the standby as soon as the modifying transaction has committed.
Synchronous replication does not use two-phase commit; the primary server first commits locally and then simply waits for the feedback of the synchronous standby server before COMMIT returns. So the following is possible:
An observer will see the modified data on the primary before COMMIT returns, and before the data are propagated to the standby.
An observer will see the modified data on the standby before the COMMIT on the primary returns.
If the committing transaction is interrupted on the primary at the proper moment before COMMIT returns, the transaction will already be committed only on the primary. There is always a certain time window between the time the commit happens on the server and the time it is reported to the client, but that window increases considerably with streaming replication.
Eventually, though, the data modification will always make it to the standby.

Where would a scaled relational DB fall in the CAP theorem?

If you have scaled SQL Server with one DB for writes and multiple DBs for reads, wouldn't there be a delay for data to be replicated from the write DB to the other read databases? In which case, isn't the data inconsistent?
So where would a scaled relational DB fall in the CAP theorem?
Update:
In relational DBs, consistency means there won't be partial updates. For example, if someone transfers money from one account to another and the whole thing is part of one transaction, it won't happen that money is taken out of one account but doesn't show up in the other.
In the CAP theorem, consistency means all the components see the same data. That consistency is different from consistency in ACID.
From what I know, relational DBs like SQL Server are supposed to be CA (consistent and available). This would make sense if there is just one database, because everyone would see the same data. But what if SQL Server is scaled with multiple databases? In that case, would all databases still see the same data? If not, would it be consistent (in the CAP theorem sense)?
My feeling is a scaled relational DB is AP (Available and partition tolerant) and not CA (Consistent and available).
I've read different definitions of consistency in regards to the CAP theorem.
Some definitions of consistency say that once some data is persisted in a system, all reads will read the most recently written data. In this definition, a replicated database (you call this "scaled" but I wouldn't use that term) has a risk of returning inconsistent data, if the replication is asynchronous.
To mitigate this risk, some systems make sure replication is synchronous, or as close to synchronous as they can implement. Galera, for example, sends transaction write sets to its replicas synchronously. If you try to read from the replica, and it detects that there are write sets pending but not yet applied, it can block your read until it has caught up with the pending write sets (this behavior is configurable). So you'll never read data that is out of date.
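The blocking-read behavior described here can be modeled in a few lines. This is a single-threaded toy, not Galera code: `Replica`, `receive`, and the `wait_for_sync` flag are names invented for the example, and "blocking until caught up" is simulated by applying the pending write sets before answering.

```python
class Replica:
    def __init__(self):
        self.data = {}
        self.pending = []      # write sets received but not yet applied

    def receive(self, write_set):
        """Accept a write set from the writer node; apply it later."""
        self.pending.append(write_set)

    def apply_pending(self):
        """Apply all received write sets in arrival order."""
        while self.pending:
            self.data.update(self.pending.pop(0))

    def read(self, key, wait_for_sync=True):
        """Read a key; optionally catch up first so the result is never stale."""
        if wait_for_sync:
            self.apply_pending()   # stands in for blocking until caught up
        return self.data.get(key)
```

A read with `wait_for_sync=False` may return the old value while a write set is still pending; with `wait_for_sync=True` the reader pays the catch-up cost and sees the new value.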
The cost of maintaining perfectly consistent reads over distributed systems in this manner is usually more expensive than users want. It will become a performance bottleneck in a system that has a high rate of updates. So for practical reasons, most projects accept that "replication lag" is a necessary compromise.
Other definitions of consistency are closer to atomicity, i.e. transactions will not be persisted in a partially-complete state. So all constraints will be satisfied when you read the data, whether you read the data before or after the transaction is applied. In this definition, it's quite easy to imagine the replica database instance remaining consistent, if it applies updates using the same transaction semantics used on the master. If you read data from the replica, you might read data that hasn't yet had the latest updates applied, but it will never be in an inconsistent state with respect to constraints.
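This second meaning of consistency can be illustrated with the money-transfer example. The sketch below is a toy model with invented names: the replica applies each transaction as a whole, so every state a reader can observe keeps the total balance intact, even if it lags behind the master.

```python
def apply_transaction(balances, transfers):
    """Return a new state with every transfer of one transaction applied."""
    updated = dict(balances)
    for source, target, amount in transfers:
        updated[source] -= amount
        updated[target] += amount
    return updated


def replica_states(initial, transactions):
    """List every state a reader of the replica could observe, in order."""
    states = [dict(initial)]
    for transfers in transactions:
        states.append(apply_transaction(states[-1], transfers))
    return states


initial = {"alice": 100, "bob": 50}
transactions = [[("alice", "bob", 30)], [("bob", "alice", 5)]]
states = replica_states(initial, transactions)
```

A reader that hits the replica early sees `states[0]` or `states[1]`, which is stale relative to the master, but no observable state has money missing from one account without appearing in the other.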
There is nothing called a scaled RDBMS. We do have "RDBMS clusters with shared storage": here you can keep adding nodes to achieve high availability of the RDBMS.
In other words:
If you meant a "Distributed RDBMS" by mentioning "Scaled RDBMS", it doesn't exist. You can have an RDBMS on only one node. If you add another node, then that will be "another" RDBMS, and it would NOT coalesce with the first one to give you a single view (unlike a typical NoSQL database). However, you can happily keep adding storage nodes behind the RDBMS.

How safe is MongoDB's safe mode on inserts?

I am working on a project which has some important data in it. This means we cannot afford to lose any of it if the power or the server goes down. We are using MongoDB for the database. I'd like to be sure that my data is in the database after the insert, and roll back the whole batch if one element was not inserted. I know it is the philosophy behind Mongo that we do not need transactions, but how can I make sure that my data is really safely stored after insert rather than sent to some "black hole"?
Should I make a search?
Should I use some specific mongoDB commands?
Should I use sharding, even if one server is enough to satisfy our speed requirements and sharding doesn't guarantee anything anyway if the power goes out?
What is the best solution?
Your best bet is to use Write Concerns - these allow you to tell MongoDB how important a piece of data is. The quickest Write Concern is also the least safe - the data is not flushed to disk until the next scheduled flush. The safest will confirm that the data has been written to disk on a number of machines before returning.
The write concern you are looking for is FSYNC_SAFE (at least that is what it is called from the point of view of the Java driver) or REPLICAS_SAFE which confirms that your data has been replicated.
Bear in mind that MongoDB does not have transactions in the traditional sense - your rollback will have to be rolled by hand as you can't tell the Mongo database to do this for you.
The other thing you need to do is either use the relatively new --journal option (which uses a Write Ahead Log), or use replica sets to share your data across many machines in order to maximise data integrity in the event of a crash/power loss.
Sharding is not so much a protection against hardware failure as a method for sharing the load when dealing with particularly large datasets - sharding shouldn't be confused with replica sets which is a way of writing data to more than one disk on more than one machine.
Therefore, if your data is valuable enough, you should definitely be using replica sets, perhaps even siting slaves in other data centres/availability zones/racks/etc in order to provide the resilience you require.
There is/will be (can't remember offhand whether this has been implemented yet) a way to specify the priority of individual nodes in a replica set such that if the master goes down the new master that is elected is one in the same data centre if such a machine is available (ie to stop a slave on the other side of the country from becoming master unless it really is the only other option).
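The trade-off in this answer can be sketched as a small model. `write_concern` builds a document using MongoDB's write concern fields `w`, `j`, and `wtimeout`; `survives` is a deliberately simplified, hypothetical model of which single-server failures an acknowledged write outlives, not a statement of exact server behavior.

```python
def write_concern(w=1, journal=False, timeout_ms=None):
    """Build a write concern document using the w, j and wtimeout fields."""
    concern = {"w": w, "j": journal}
    if timeout_ms is not None:
        concern["wtimeout"] = timeout_ms
    return concern


def survives(concern, failure):
    """Simplified model: does an acknowledged write survive the given failure?

    failure is 'power_loss' (one server restarts, disk intact) or
    'disk_loss' (one server's storage is gone). w must be an integer here.
    """
    replicated = concern["w"] >= 2
    if failure == "power_loss":
        return concern["j"] or replicated
    if failure == "disk_loss":
        return replicated
    raise ValueError(f"unknown failure: {failure}")
```

In this model the fastest setting (`w=1`, no journal) survives neither failure, journaling alone covers a power loss, and only replication to a second node covers the loss of a disk.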
I received a really nice answer from a person called GVP on Google Groups. I will quote it (basically, it adds to Rich's answer):
"I'd like to be sure that my data is in the database after the insert and rollback the whole batch if one element was not inserted."
This is a complex topic and there are several trade-offs you have to consider here.
"Should I use sharding?"
Sharding is for scaling writes. For data safety, you want to look at replica sets.
"Should I use some specific mongoDB commands?"
First thing to consider is "safe" mode or "getLastError()" as indicated by Andreas. If you issue a "safe" write, you know that the database has received the insert and applied the write. However, MongoDB only flushes to disk every 60 seconds, so the server can fail without the data on disk.
Second thing to consider is "journaling" (v1.8+). With journaling turned on, data is flushed to the journal every 100ms. So you have a smaller window of time before failure. The drivers have an "fsync" option (check that name) that goes one step further than "safe": it waits for acknowledgement that the data has been flushed to the disk (i.e. the journal file). However, this only covers one server. What happens if the hard drive on the server just dies? Well, you need a second copy.
Third thing to consider is replication. The drivers support a "W" parameter that says "replicate this data to N nodes" before returning. If the write does not reach "N" nodes before a certain timeout, then the write fails (exception is thrown). However, you have to configure "W" correctly based on the number of nodes in your replica set. Again, because a hard drive could fail, even with journaling, you'll want to look at replication. Then there's replication across data centers, which is too long to get into here.
The last thing to consider is your requirement to "roll back". From my understanding, MongoDB does not have this "roll back" capacity. If you're doing a batch insert, the best you'll get is an indication of which elements failed.
Here's a link to the PHP driver on this one: http://it.php.net/manual/en/mongocollection.batchinsert.php You'll have to check the details on replication and the W parameter. I believe the same limitations apply here.
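Since the rollback has to be done by hand, one possible shape for it is a compensating delete. The sketch below is an assumption-laden illustration: `insert_batch` and the two in-memory helpers are hypothetical stand-ins for real driver calls, and the approach is not atomic (other readers can see a partial batch, and a crash during cleanup leaves partial data behind).

```python
def insert_batch(collection, documents, insert_one, delete_one):
    """Insert documents one by one; undo earlier inserts if any insert fails.

    insert_one(collection, doc) returns an id or raises;
    delete_one(collection, doc_id) removes that document.
    Both are hypothetical callables standing in for driver calls.
    """
    inserted_ids = []
    try:
        for doc in documents:
            inserted_ids.append(insert_one(collection, doc))
    except Exception:
        # Compensate: remove what was already inserted, newest first.
        for doc_id in reversed(inserted_ids):
            delete_one(collection, doc_id)
        return False
    return True


def in_memory_insert(collection, doc):
    """Toy insert into a dict; rejects documents flagged as bad."""
    if doc.get("bad"):
        raise ValueError("insert rejected")
    collection[doc["_id"]] = doc
    return doc["_id"]


def in_memory_delete(collection, doc_id):
    """Toy delete from a dict."""
    collection.pop(doc_id, None)
```

Called with a batch in which one document fails, `insert_batch` returns `False` and leaves the collection as it was before the batch started, at least in this in-memory model.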