Starting with SQL Server 2017 CU6 and SQL Server 2016 SP2-CU3, you can now configure your distribution database as part of an Availability Group (AG). Up until now, your distribution database could only be part of a standalone instance, or failover cluster instance (FCI). In this first round, there’s a pretty big list of limitations & restrictions. In fact, you’ll notice that the “supported scenarios” list is shorter than the “limitations or exclusions.”
But I ran into a problem
Obviously I had a problem. All my blog posts come about because I had a problem. Except in this case, everything was working great… I just didn’t understand why.
The tooling/GUIs around Replication aren’t always the greatest. This means I end up digging into the distribution database to query it directly every once in a while. Sometimes, I’ll run a query like this to grab the list of publishers:
SELECT ServerID = s.server_id, ServerName = s.name, p.* FROM dbo.MSpublications AS p JOIN sys.servers AS s ON s.server_id = p.publisher_id;
This has always worked super, and has been a go-to query for me for years. But when looking at a SQL Server 2017 distributor with the distribution database in an Availability Group, that wasn’t working. All the publications had a
publisher_id of 1, but in
server_id 1 was some random linked server, and definitely not the publisher. But replication was working great. Maybe replication was set up on the other AG replica, and
server_id 1 came from there.
Nope. On the other replica, it was the same story.
Server_id 1 was a random linked server, and nothing to do with replication at all, let alone the publisher. But replication was working perfectly. A teammate fooling around with it in dev confirmed that if he updated the
publisher_id to match the
server_id we thought it should join to, replication stopped working. So, that
publisher_id of 1 was correct. Or special. But also definitely different than what I’ve seen in prior versions of SQL Server.
It makes sense, really
I thought about it, and it makes sense that the distribution database, when in an AG, can’t just join directly to
sys.servers based on an ID value. Different replicas could have pre-existing linked servers, and keeping server_id in sync across all the distributor replicas would be impossible. It would make a lot more sense if it were just to rely on server name. But… That’s not how it worked. And
dbo.MSpublications doesn’t include the server name, it’s just got that darned ID value. So, maybe there’s a table in the distribution database that it uses instead of
sys.servers. Sure enough, there it was, hiding in plain sight:
dbo.MSreplservers. I just have to add a join to this table in between
SELECT master_ServerID = s.server_id, master_ServerName = s.name, dist_ServerID = rs.srvid, dist_ServerName = rs.srvname, p.* FROM dbo.MSpublications AS p JOIN dbo.MSreplservers AS rs ON rs.srvid = p.publisher_id JOIN sys.servers AS s ON s.name = rs.srvname;
In hindsight, this actually makes a lot more sense than the code I’ve written in the past. It’s a bit confusing that
dbo.MSreplservers has a
srvid that is different from the
sys.servers for the same server. But… c’est la vie. AGs make everything just a little bit more complicated.