Background
At work, we make very extensive use of transactional replication. We have dozens of publications. Most publications have two or three subscribers. We replicate across domains, across subnets, across geographies. We use both push & pull replication (we’ve found pull works best when crossing the globe).
Next year, we’ll be upgrading to new hardware, and upgrading to SQL 2014. When we do that, we want to avoid having to re-snapshot our multiple terabytes of replicated data. Unfortunately, SQL Server does not provide a nice way to update the publisher of a publication–and that’s exactly what we want to do.
We came up with a solution we thought would work–and tried it out on a smaller database with a small publication. With a little trial and error, it worked (I even surprised myself). This post steps through what we did.
<disclaimer>
Our solution does involve updating the distribution database directly. I realize I’m going beyond the “unsupported” signs, and right past the “not recommended” warnings. I’m OK with that–but if you’re considering doing this yourself, take note that you’re headed down the path where there is no support. Have backups, and be prepared to rebuild replication from scratch if you hopelessly break everything.
</disclaimer>
Step 1) Moving to a new server
So the first step is to move to the new hardware. We use mirroring for our HA, and use the -PublisherFailoverPartner
parameter on the Log Reader & Snapshot agents. Moving to the server is simple–reconfigure mirroring to mirror to the new primary hardware, failover to the new primary hardware, then reconfigure mirroring to mirror to the new mirror hardware. It’s a well-known dance that’s often discussed & documented, so I won’t belabor the steps.
Step 2) Updating the distribution database
Part A) Pause replication
Stop & disable all the Snapshot, Log Reader, and Distribution agents related to this server. The next script will update all publications on the server–so if you have multiple publications, don’t forget any. We don’t want anything running while we’re mucking with tables in the distribution database. We have a dedicated distributor server, so I stop & disable all of the jobs directly in SQL Agent to be sure I don’t miss anything.
Part B) Update the distribution database
This is where we’ll update the distribution database. By reviewing sys.columns
, I came up with the below list of updates. The goal is to update every single reference to the original server, and replace it with the new server name. Run this script on the distribution database.
DECLARE @old_publisher_name sysname = 'DBServer01' DECLARE @new_publisher_name sysname = 'DBServer02' DECLARE @old_publisher_id int = (SELECT server_id FROM sys.servers WHERE is_linked = 0 AND name = @old_publisher_name) DECLARE @new_publisher_id int = (SELECT server_id FROM sys.servers WHERE is_linked = 0 AND name = @new_publisher_name) UPDATE MSarticles SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSdistribution_agents SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSlogreader_agents SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSmerge_agents SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSmerge_identity_range_allocations SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSmerge_subscriptions SET publisher_id = @new_publisher_id, publisher = @new_publisher_name WHERE publisher_id = @old_publisher_id AND publisher = @old_publisher_name; UPDATE MSpublications SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSpublisher_databases SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSrepl_identity_range SET publisher = @new_publisher_name WHERE publisher = @old_publisher_name; UPDATE MSreplication_monitordata SET publisher_srvid = @new_publisher_id, publisher = @new_publisher_name WHERE publisher_srvid = @old_publisher_id AND publisher = @old_publisher_name; UPDATE MSsnapshot_agents SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSsubscriber_info SET publisher = @new_publisher_name WHERE publisher = @old_publisher_name; UPDATE MSsubscriber_schedule SET publisher = @new_publisher_name WHERE publisher = @old_publisher_name; UPDATE MSsubscriptions SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id; UPDATE MSsync_states SET publisher_id = @new_publisher_id WHERE publisher_id = @old_publisher_id;
Part C) Update & start agents
Update the SQL Agent jobs for all of the agents. Specifically, we’re updating the -Publisher
and -PublisherFailoverPartner
parameters to reflect the new server names. After the agent is updated, enable & start it.
The problem
At this point, replication seemed to work swimmingly. Transactions were replicating, data was flowing. When the Replication Monitor showed data was caught up, I tried to insert a tracer token, and was thwarted. I got an error message that my publication didn’t exist. (Obviously it did. I just watched it replicate transactions.) Switching to T-SQL, I tried sp_posttracertoken
, and got a slightly more informative error:
Msg 21200, Level 16, State 1, Procedure sp_MSadd_tracer_token, Line 75
Publication ‘PublicationName’ does not exist.
Msg 21486, Level 16, State 1, Procedure sp_MSrepl_posttracertoken, Line 128
An error occurred while logging the tracer token history information. The tracer token could not be posted.
Now that’s curious. My publication does not exist, and the tracer token history could not be logged. Since my publication does exist, that error message isn’t totally accurate. I’m guessing I probably just missed something.
The applicable call stack looks like this:
sp_posttracertoken
sp_MSrepl_posttracertoken
sp_MSadd_tracer_token
- Throws error 21200
- Throws 21486 error specifically because
sp_MSadd_tracer_token
failed
Of note, most of these stored procedures live in the resource database–just to make life a touch more interesting/confusing. So, my next step was to step through the code and figure out why sp_MSadd_tracer_token
was failing–and figure out if I can solve it.
After tracing through the source code for these system stored procedures, I was able to track down this line in sys.sp_MSrepl_posttracertoken
:
SET @publisher = publishingservername()
Turns out, this is the key! Per the documentation, publishingservername() “returns the name of the originating Publisher for a published database participating in a database mirroring session.” This line runs on the publisher database, and is still returning the original server name. This results in a bad value being passed into sp_MSadd_tracer_token
, resulting in the error. Specifically, it ends up checking the distributor expecting the publication to still exist for the old server–but since we updated the distributor already, that fails.
A quick look at system tables, and I saw dbo.sysreplservers
–which contains one row listing the old server name. We have to update this table in the publisher database so that it matches the distribution database.
Step 3) Updating the publisher database
In addition to the updates to the distribution database, I also have to make one update per published database. This is the last step in making the published database know this server is its permanent home.
When doing this in production, I ran this step at the same time as Step 2B above–that is, while all of the replication agents were disabled.
UPDATE repl SET srvname = s.name FROM dbo.sysreplservers repl JOIN sys.servers s ON s.server_id = repl.srvid WHERE repl.srvname <> s.name;
Optional — Updating job names
The names of the SQL Agent jobs on the Distributor will have names that include the name of the server that originally served as publisher. There’s no functional reason to change those. But I worry that it could be confusing. I won’t be maintaining this forever, and I’m not the only DBA on my team–so I’d like to keep the confusion to a minimum for the next guy.
This involves updating both the job name as well as some tables in the distribution database.
- MSdistribution_agents
- MSlogreader_agents
- MSmerge_agents
- MSsnapshot_agents
- MSreplication_monitordata
I’ll add a script to do this work shortly–Just need to make sure it’s tested before I publish it. Check back soon.
I lost the script file for this in The Great Google Drive Disaster of 2015. But the meat of it is posted above.
Summary
Updating system tables and making changes that are unsupported come with risk. In our case, rebuilding replication actually comes with a substantial cost as well–we would spend many hours rebuilding & resnapshotting our complex & busy replication topology. For us, the risk of hacking replication to change publishers is worth the substantial reward in saved time.
I think this hack is likely only valuable to you if you have complex, busy, or a very large replication topology. If you do find it valuable, or if you run into any troubles, please let me know.
Just wanted to say thanks for this post. I was dealing with some strange replication issues after we failed mirroring over. This pointed me in the right direction. Really like your posts. Glad to see someone else leaving notes to their future selves in job descriptions!
Glad it was helpful!
Do you happen to post the script please ?
Ah, I said I would post the script and never did. Let me scrounge it up, and I’ll get that posted somewhere.
Hello,
amazing work, I’m trying to do something similar, I’ve to detach a database which is a published database, moving the files to the new server, attaching it, resuming replication. I can’t detach the database since it’s published, the instance from where I’m detaching the database won’t be decommissioned so I can’t mess it up.
Any suggestion?
I’d recommend that you not use detach/attach to move the database. Instead, use backup/restore. This is a much safer option, and also comes with some added flexibility that will help you make the actual cutover (and thus downtime) faster & easier.