AG secondary is out of sync for hours after index maintenance
Question
We have a 3TB OLTP database running Siemens Opcenter MES software at one of our manufacturing plants in China. This is setup in an Always-On Availability group with one secondary node. As this is a manufacturing plant, we are only provided 11 1/2 hours every two weeks to run index maintenance when the plant is not running production. We use Hallengren scripts to run this index maintenance.
Currently this index maintenance takes around 11 hours to run. We have the secondary node setup for read-only intent to help offload some of the select queries to the secondary node.
We have been having an issue with the secondary node being out of sync with the primary node for 2 to 3 hours after index maintenance is complete on the primary. For the most recent maintenance window, we disabled all logins and procedures during index maintenance to ensure no transactions are ran against primary. The transactions not syncing are the ones ran against primary after the maintenance window when the 1st shift starts on Monday morning. We are trying to figure out why this is happening.
I speculate that the secondary is copying the index changes from primary to secondary during this time creating a lock on the tables preventing/delaying the sync, but I cannot find any info to corroborate this. Our DBA team has pushed back to get the plant to allow us more frequent maintenance windows but as the index maintenance prevent users from using the system due to table locks, we have not been successful on getting them to align with weekly maintenance.
Is this due to the index updates syncing or any other thoughts?
asked 2023-02-21 by Zack Soderquist
Answer
To really know what is going on, you’ll need to look into where the slowness is on the AG synchronization. Unfortunately, there is limited information available to you from SQL Server’s DMVs, but I think there is enough for you to figure out what’s going on.
In my open source DBA database, there is a stored procedure called dbo.Check_AGLatency. You can install that by itself, or install the entire DBA database.
What info is available to look at?
AG synchronization is done in 2 steps.
- Step 1 is known as "Send" — This step is the portion of AG synchronization that is controlled by the Sync/Async settings on an AG. This step writes changes to disk on the transaction log of the AG secondary. When there is a lot of change data to send to secondary, you may notice the send queue becomes large. This indicates that the current bottleneck on synchronization is on sending/writing the transaction log to the secondary.
- Step 2 is known as "Redo" — This step is ALWAYS asynchronous to the transaction on Primary. This step processes the transaction log on secondary to redo all of the changes, and ultimately persist those changes to the data file. When there is a lot of change data in the secondary transaction log, you may notice the redo queue becomes large. This indicates that the bottleneck is redoing the log, or that the send queue just completed sending a large change, and the redo queue is being processed normally.
- It’s important to note that the redo queue always processes the transaction log in order. Transactions must be redone on secondary in exactly the same order that they are processed on primary to ensure both replicas are exactly the same, with the same data written to the same internal pages on all replicas.
- It’s also important to note that because redo is always asynchronous, here is no guarantee that an uncommitted transaction will be redone on secondary while it is still in process (uncommitted) on primary.
To get a good idea of what is happening (and what is normal), you can ideally monitor BOTH of these queues to see how fast changes are sent & redone. In your case, you’ll want to watch these while your index maintenance is running, and see if one queue or both queues are causing the synchronization delays.
Check the AG Latency.
Assuming you installed that procedure in a database called DBA (though you can install it in any database you want), you would call the procedure something like this, to see all databases/AGs that are experiencing more than 10 minutes of latency:
EXEC DBA.dbo.Check_OpenTransactions @DurationThreshold = 10;
There are several columns to pay closest attention to:
- MinutesBehind – This will show the number of minutes that each database on each secondary replica is behind from Primary. This number includes the end-to-end time, both Send & redo. The
@DurationThreshold
will limit the results based on this column. - AgHealth — This simply shows what SQL Server is reporting for the synchronization health of each database being sent to each replica. If a database is not synching, it will report an unhealthy status.
- UnsentLogKB — This will show the size of the transaction log that is awaiting being sent to the secondary. This is the size of the send queue. (see step 1 above)
- LastHardenedTime — This is the time, from primary, of the most recent transaction that has been hardened (written to disk) on the secondary replica for that database. This can be used to compute the "minutes behind" that correspond to the send queue.
- LastRedoneTime — This is the time, from primary, of the most recent transaction that has been redone (from the transaction log) on the secondary replica for that database. The difference between the
LastRedoneTime
andLastHardenedTime
represents the "minutes behind" that corresponds to the redo queue. - RedoEstSecCompletion — This is a calculation of the estimated amount of time remaining before the secondary replica’s redo queue is fully processed. This is calculated by the current rate of redo (which is constantly fluctuating based on the work being redone), and the size of the redo queue. This is an estimate only, and its accuracy will vary based on the workload being redone.
What’s going on with your server?
Most likely, as you do your index maintenance, you are performing index rebuilds. Each REBUILD
command is implicitly done in it’s own transaction. This means that within Ola’s maintenance script, a command like this will run: ALTER INDEX dbo.ReallyBigTable.ReallyBigIndex REBUILD WITH(ONLINE=ON);
. Even if your AG is in synchronous commit mode, the secondary replica will process the REBUILD
command asynchronously after it is completed on primary.
I think what you’ll see is that as your index maintenance rebuilds indexes on primary, you’ll see the send queue being very "spikey" as indexes are rebuilt, and the redo queue will be similarly spikey, rising first, then catching up either after the Primary is finished, or after the largest indexes are completed.
How do you fix it?
The overall process & queue growth/completion is an expected pattern, though you may be able to improve overall throughput to "shorten" the peaks, depending on where they are. Send throughput is generally limited by network between replicas and write speed on the secondary transaction log. Redo throughput is generally limited by read speed on the secondary transaction log, and by CPU speed/CPU saturation for processing the redo queue.
Before you change network or server hardware…
Do you need to rebuild your indexes at all? Think about why you are rebuilding your indexes. Ola’s index maintenance focuses on physical fragmentation in determining thresholds for rebuilds. On modern hardware, most index rebuilds are unnecessary. It’s generally much more important to have updated statistics, rather than rebuild. REBUILD
will implicitly update stats as well, sometimes causing a false sense that rebuilding indexes fixes performance issues, when it was the "hidden" stats update that actually solved the problem.
Rebuilding is generally only necessary if your problem is with low page fill (sometimes called "low density," caused by page splits making the table/index contain many half-full pages). Unfortunately, Ola’s index maintenance code does not allow for using this metric as a trigger for rebuild. You can see the data density by looking at the avg_page_space_used_in_percent
column in the results of the sys.dm_db_index_physical_stats system function.
Stack Overflow does not do any scheduled index rebuilds or reorganizations. We have scheduled jobs that ONLY UPDATE STATISTICS
using Ola’s index maintenance code. Index rebuilds are done rarely, as an exception to planned maintenance, and only when we need to address low page fill (low data density).
If you do need to worry about physical fragmentation, I suggest looking at using REORGANIZE
rather than REBUILD
. Doing a REORGANIZE
will take longer, and use more transaction log space than a REBUILD
, but it will perform the work in many small transactions, where each time a page is reorganized, it is a separate transaction. By eliminating the "huge" REBUILD
transactions, it will flatten out the send/redo processes, and allow for both queues to be constantly processed and eliminate the queue size spikes. Note that REORGANIZE
does not implicitly update stats, so you will need to also ensure stats are updated. Switching to REORGANIZE
will make your index maintenance jobs take longer & use more space for transaction log backups. I recommend also being more targeted & only REORGANIZE
tables & indexes where it is necessary.
As an example, you may want to have your index maintenance rebuild just a couple of indexes, then do stats updates for everything. That will look a bit more like this:
EXEC DBA.dbo.IndexOptimize @Databases='MyBigDatabase, MyHugeDatabase', @Indexes = '%.dbo.Posts, MyHugeDatabase.dbo.Users', @FragmentationLow = NULL, --We don't want to rebuild/reorg indexes @FragmentationMedium = 'INDEX_REORGANIZE', --REORG ONLY @FragmentationHigh = 'INDEX_REORGANIZE', --No Rebuilds ... @LogToTable = 'Y', @Execute = 'Y'; EXEC DBA.dbo.IndexOptimize @Databases='USER_DATABASES', @FragmentationLow = NULL, --We don't want to rebuild/reorg indexes @FragmentationMedium = NULL, --We don't want to rebuild/reorg indexes @FragmentationHigh = NULL, --We don't want to rebuild/reorg indexes @UpdateStatistics = 'ALL', --Just updating stats @OnlyModifiedStatistics = 'Y', @StatisticsSample = 100, --WITH FULLSCAN @DatabaseOrder = 'DATABASE_SIZE_DESC', @LogToTable = 'Y', @Execute = 'Y';
answered 2023-02-21 by Andy Mallon