Configuring the distribution database in an Availability Group

April 8, 2019 Andy Availability Groups, replication 5

Starting with SQL Server 2017 CU6 and SQL Server 2016 SP2-CU3, you can now configure your distribution database as part of an Availability Group (AG). Up until now, your distribution database could only be part of a standalone instance, or failover cluster instance (FCI). In this first round, there’s a pretty big list of limitations & restrictions. In fact, you’ll notice that the “supported scenarios” list is shorter than the “limitations or exclusions.”

But I ran into a problem

Obviously I had a problem. All my blog posts come about because I had a problem. Except in this case, everything was working great… I just didn’t understand why.

The tooling/GUIs around Replication aren’t always the greatest. This means I end up digging into the distribution database to query it directly every once in a while. Sometimes, I’ll run a query like this to grab the list of publishers:

SELECT  ServerID            = s.server_id, 
        ServerName          = s.name,
        p.*
FROM dbo.MSpublications AS p
JOIN sys.servers AS s ON s.server_id = p.publisher_id;

This has always worked super, and has been a go-to query for me for years. But when looking at a SQL Server 2017 distributor with the distribution database in an Availability Group, that wasn’t working. All the publications had a publisher_id of 1, but in sys.servers, server_id 1 was some random linked server, and definitely not the publisher. But replication was working great. Maybe replication was set up on the other AG replica, and server_id 1 came from there.

Nope. On the other replica, it was the same story. Server_id 1 was a random linked server, and nothing to do with replication at all, let alone the publisher. But replication was working perfectly. A teammate fooling around with it in dev confirmed that if he updated the publisher_id to match the server_id we thought it should join to, replication stopped working. So, that publisher_id of 1 was correct. Or special. But also definitely different than what I’ve seen in prior versions of SQL Server.

It makes sense, really

I thought about it, and it makes sense that the distribution database, when in an AG, can’t just join directly to sys.servers based on an ID value. Different replicas could have pre-existing linked servers, and keeping server_id in sync across all the distributor replicas would be impossible. It would make a lot more sense if it were just to rely on server name. But… That’s not how it worked. And dbo.MSpublications doesn’t include the server name, it’s just got that darned ID value. So, maybe there’s a table in the distribution database that it uses instead of sys.servers. Sure enough, there it was, hiding in plain sight: dbo.MSreplservers. I just have to add a join to this table in between dbo.MSpublications and sys.servers.

SELECT  master_ServerID     = s.server_id, 
        master_ServerName   = s.name,
        dist_ServerID       = rs.srvid,
        dist_ServerName     = rs.srvname,
        p.*
FROM dbo.MSpublications AS p
JOIN dbo.MSreplservers AS rs ON rs.srvid = p.publisher_id
JOIN sys.servers AS s ON s.name = rs.srvname;

In hindsight, this actually makes a lot more sense than the code I’ve written in the past. It’s a bit confusing that dbo.MSreplservers has a srvid that is different from the server_id in sys.servers for the same server. But… c’est la vie. AGs make everything just a little bit more complicated.

Nauman Hameed
May 12, 2019 at 3:52 am

Hi Andy

I faced similar issue recently where my SQL script for publication detection stops working when using SQL Server 2016 SP2-CU3 or higher. Reason is same, i.e. the join s.server_id = p.publisher_id no longer works. After reading your post, I now understand the reason. However, I could not find any information related to dbo.MSreplservers table in Microsoft documentation. I wonder if you managed to find anything in Microsoft documentation for this table?

Regards
Nauman
Raju Kadiyala
February 15, 2020 at 11:44 pm

Hello Andy,

All my exisiting replication servers are in SQL 2014. Distribution servers and Publication servers and subscriber servers are different. Now I am planning to create new distribituion servers in SQL 2017 to put distributer databases in AG.But I need to keep my pub and sub servers in same sql 2014 only. is it possible to add the NEW SQL 2017 distributer databases to SQL 2014 Publishers and subscribers?

is there any complications? Could you please suggest on this?

Thanks,
Naga K.
- Andy
  February 16, 2020 at 10:45 am
  You should checked out the documentation for Replication Backward Compatibility. All the information for supported versions is well-documented.
  The general rules are:
  
  A Distributor can be any version as long as it is greater than or equal to the Publisher version (in many cases the Distributor is the same instance as the Publisher).
  
  A Publisher can be any version as long as it less than or equal to the Distributor version.
  
  Subscriber version depends on the type of publication:
  
  A Subscriber to a transactional publication can be any version within two versions of the Publisher version. For example: a SQL Server 2012 (11.x) Publisher can have SQL Server 2014 (12.x) and SQL Server 2016 (13.x) Subscribers; and a SQL Server 2016 (13.x) Publisher can have SQL Server 2014 (12.x) and SQL Server 2012 (11.x) Subscribers.
  
  A Subscriber to a merge publication can be all versions equal to or lower than the Publisher version which are supported as per the versions life cycle support cycle.
  - Raju kadiyala
    February 17, 2020 at 5:14 pm
    
    Thanks for your reply Andy. COnfiured the distribution db in sql server 2017 and put it on alwayson and created the listner as well. On my test sql 2014 environment i have created the publication and scbscriber with sql 2017 distributor db using with listner name. but when I am trying to generating the scripts for created publication I am getting below error. any help on this please. anything wrong on my configuration?
    
    could not connect to server ” ” because “is not defined as a remote login at the server. Verify that you have specified the correct login name” (Microsoft SQL server , Error:18483)