Transaction Replication on AlwaysOn

Question

In our AlwayOn environment I have two SQL servers 2014 Ent.edition (SERVER A, and SERVER B). I setup a transaction replication for [database A ], pointing to listener which shouldn’t cause any issue if a failover happened. Tested, worked just fine after a failover, but when I shut off one of the servers in AOAG ,in test environment, replication stops working. There is no error on the replication monitor. For testing purpose,I have inserted few records on the database it works fine on AOAG, but on the Subscriber the data modification is not replicated until I bring up the other server. Once the server is up, then the replication start to work. I’m pretty sure, I might be missing something, this can’t be this way by design.

Any Idea/recommendation ?

Here is what I have for Replication on AOAG.

Distribution Server A ( contains Distribution database ) – SQLSERVER
1. This on a remote server.
two subscribers – Sv1 and sv2 – SQLSERVER 2008R2 STANDARD
Pbulishers ( SERVER A and B – from AOAG ) – SQLSERVER 2014

asked 2017-04-17 by reshad

Answer

The Log Reader Agent will only read as far as the last transaction committed to all other Availability Group replicas. This means that if one AG replica is offline, Replication will stop reading transactions until that AG replica is back online. This is done to be ultra-safe and ensure that only fully committed transactions are replicated.

In some cases, this may be too conservative of a rule for your setup. In that case, you can add Trace Flag 1448 to your startup parameters, which will disable this behavior for async replicas. If a synchronous replica is offline, it will still prevent the log reader agent from reading unsent transactions for Replication.

Why does Microsoft impose this restriction on synchronous replicas? If the LSN has not been hardened on the synchronous replica, then the transaction isn’t fully committed. Replication is very conservative in what it considers a fully-committed transaction that is safe for replication. Say for example, that the AG isn’t in sync due to a network issue, or some other problem (ie, it is online, but the Primary node just doesn’t know that). If you force a failover of the AG, and allow data loss, then your replication secondary would have data that was lost during failover– potentially breaking replication.

This TF will affect the server that is currently hosting the primary database replica, and thus must be set on every server that might host the primary replica for the AG.

From the Microsoft documentation on TF 1448:

Enables the replication log reader to move forward even if the async
secondaries have not acknowledged the reception of a change. Even with
this trace flag enabled the log reader always waits for the sync
secondaries. The log reader will not go beyond the min ack of the sync
secondaries. This trace flag applies to the instance of SQL Server,
not just an availability group, an availability database, or a log
reader instance. Takes effect immediately without a restart. This
trace flag can be activated ahead of time or when an async secondary
fails.

answered 2017-04-18 by Andy Mallon