Fastest way to shrink LOB data in SQL Server

clip art of 
 a double-quote character

Question

We have a database of total size of 3886 GB. Out of which 778 GB is free. So total 3108 GB worth of data is present.

Out of which 2458 GB is “LOB Data” and 400 GB is worth “Regular”.

Since the database is hosted on AWS, so in order to save cost, we are trying to shrink data file in order to free up space. Son in order to accomodate data, we already have two drives of 2 TB each and both are almost FULL and my manager do not want to create another drive and put new files over there.

But when we tried to shrink 2 GB at a time, it was taking lot of time (saw dbcc lobcompact running in sp_who2) and we are sometimes seeing blocking as well. After googling, I came across below blog post from Paul S. Randal and he also states LOB data makes shrinking go slow.

https://www.sqlskills.com/blogs/paul/why-lob-data-makes-shrink-run-slooooowly-t-sql-tuesday-006/

My concern is:

Is there any way to make shrink go faster in this case or there is no way at all?
What other people are doing in such scenarios when it comes to cloud and they want to save some cost by reducing storage size?
Is my understanding correct that shrinking files can some way save costs to organization or am I wasting my time and worsening things up? If yes, then how can I convince him?

asked 2020-01-16 by sachin-SQLServernewbiee


Answer

Shrinking a database is not fast. In fact, it tends to be incredibly slow, and there’s nothing you can do to speed it up, other than to have less data to shrink.

How does shrinking work?

When you shrink a data file, SQL Server takes the 8k pages from the end of the file, and moves them to a (random) spot at the beginning of the file. It moves the pages one page at a time. In a way, the page-by-page movement of data is similar to the way a REORGANIZE works, except in reverse. Because the pages are moved individually, to a (random) new spot, it will “disorganize” pages by making previously-contiguous pages non-contiguous, creating fragmentation.

The page-by-page movement is inherently slow, which is why it is faster to REBUILD highly-fragmented indexes than to REORGANIZE them.

REBUILD instead

Normally, I’d suggest that you just REBUILD into a new filegroup, then remove or shrink the old (empty) filegroup. The REBUILD into the new filegroup will be faster than shrinking, because REBUILD if faster then REORGANIZE (and “backwards REORGANIZE” aka shrink). Then, with the old filegroup empty, it will be quick to remove or shrink when it is empty or almost empty, because there are no/few pages to remove.

And because shrinking creates fragmentation, you are going to need to do a REORGANIZE after you shrink anyway.

LOBing a curveball

In your scenario, LOB data represents 2458GB out of 3108GB total data (80%). When you do a REBUILD it will move the B-Tree pages, but it will not move the LOB data. So if you rebuild into a new filegroup, your old filegroup will still be 60+% full.

For the table(s) that have a significant amount of LOB data, you would need to take a different approach. You could create a new table in the new filegroup, migrate data to that new table, then swap table names and drop the old table. You could think of this process as a “manual rebuild”–except doing it this way would allow you to copy LOB data to the new filegroup. I describe this method in a blog post on changing data types, but the exact same method would work in this case–except you would be changing filegroups, rather than changing data types.

Shrinking (nearly) empty filegroups is fast

Shrinking is slow because it has to move allocated pages in order to create contiguous free space at the end of the data file. If the filegroup is empty, the end of the file can be truncated without moving pages and will be very fast.

If the filegroup you are shrinking is PRIMARY then you won’t be able to get it completely empty–PRIMARY contains all the system tables & metadata. By moving all your user tables out of PRIMARY, the filegroup will be almost empty, and there will be a very small number of pages to move. Having just a few pages to move will also be pretty fast, particularly compared to moving 2TB of pages.

answered 2020-01-27 by Andy Mallon