Site icon Andy M Mallon – AM²

DROP TABLE Bug on Filestream Garbage Collection

FILESTREAM is not a particularly popular feature of SQL Server–if you’re using it, you are in the minority. If you have a database with filestream in an Availability Group, then you’re REALLY in the minority. At my 9-5, this is being done, and it’s apparent that this combination of features isn’t a robust combination.

The bug

I ran into a situation where garbage collection process for filestream didn’t appear to be working properly. I initially noticed this when a significant amount of data was migrated out of the database, but no additional space was reclaimed on disk.

I tracked the problem back to this specific scenario: When you have a database that is part of an Availability Group, and you drop a table that contains filestream data, the filestream garbage collection does not clean up the data container subdirectory that corresponds to that table. Garbage collection will continue to clean up other items (eg, deleted rows), but the dropped table never gets cleaned up.

I’ve logged this as a UserVoice item for your up-voting pleasure.

The repro steps

Environment: I am testing this on SQL Server 2014 SP1-CU3 (12.0.4427.24), using an Availability Group with two replicas. Both servers run Windows 2012 R2.

And here’s a script I used. You’ll want to replace AM2_test with the name of your database that you’re using for testing. Also, I do the log backup to NUL: … That’s like sending your log backup to the shredder. Don’t do that if you actually care about your database.

--Create table
CREATE TABLE dbo.FilestreamTestData(
 ID bigint IDENTITY(1,1) NOT NULL,
 FilestreamData varbinary(max) FILESTREAM NULL,
 FilestreamGUID uniqueidentifier ROWGUIDCOL NOT NULL UNIQUE DEFAULT (newsequentialid()),
 CONSTRAINT PK_FilestreamTestData PRIMARY KEY CLUSTERED (ID)
);
GO

--Put some random data into the table
--I'll create 100 rows with 100 filestream files
INSERT INTO dbo.FilestreamTestData(FilestreamData)
SELECT TOP 100 CAST( 'Random Data' AS varbinary(max))
FROM sys.columns;
GO 

--drop the table
DROP TABLE dbo.FilestreamTestData;
GO

--force garbage collection to run multiple times
--take log backups in between to make sure things are moving along.
EXEC sp_filestream_force_garbage_collection @dbname = 'AM2_test';
BACKUP LOG AM2_test TO DISK = 'NUL:';
GO 10

 

The workaround

I haven’t figured out how to get garbage collection to collect the DROP TABLE and still keep the database in the Availability Group. In order to force the garbage collection to work, I use the following process:

 

If you have another workaround–particularly one that doesn’t involve removing the database from the AG–let me know in the comments below.

 

UPDATE (2015-12-21):

In doing another round of testing, it appears that the TRUNCATE TABLE command does not get cleaned up by filestream garbage collection either. Both the DROP TABLE and TRUNCATE TABLE will remain unprocessed by garbage collection as long as the database is part of the Availability Group. Instead, use DELETE to delete the contents of the table.

Exit mobile version