TSQL Tuesday #104: dbo.Check_FileSize

It’s T-SQL Tuesday, the blog party that SQL Server expert Adam Machanic (blog|twitter) started. This month’s episode is hosted by Bert Wagner (blog | vlog| twitter). The topic: Code You Would Hate To Live Without

It’s been a little bit since the last T-SQL Tuesday I joined, when I blogged about religion & politics (aka, Tabs vs Spaces). When I saw Bert’s excellent topic this month, I decided I’d try to get back on the horse again and do join the blog party again this month.

I’ve written before about some of the scripts in my open-source DBA Database. The script that I use the most is probably the one to find the leading blocker. But today, I’m going to talk about the FIRST script I wrote for my DBA Database. It’s served me so well, I haven’t looked at (let alone updated) the code in a couple years. I’ll try to avoid transcribing my cringing as I look at code that Jr DBA Andy wrote.

dbo.Check_FileSize

Even when you’ve got awesome monitoring software, there are still scenarios when I want to look at database file size from T-SQL. Let’s figure out what some of those scenarios are, since they’ll help us figure out what our functional specs are.

Quick look at one database size: This is totally the most common scenario I’ve come across. I just want to take a peek at how big some database is, without switching apps, without waiting for SSMS to open the database properties (it’s so sloooowwwwwww). This sounds pretty easy. We’ll throw a @DatabaseName parameter into the proc. Done.
Quick look at files on one drive: When I was a younger, more naive DBA, I used this more frequently. Drives would fill up, and I’d look for files to shrink. This still happens occasionally–usually in non-production or neglected environments. But nowadays, I’ve stopped asking myself “Should I shrink my database?” We’ll need another parameter for this one.
Just log files: Usually this comes up when I’m cleaning up from a problem. Some long-running transaction or broken backups blew up a transaction log, or I’m on-boarding a database that has a million VLFs and needs fixing. Regardless, I sometimes want to check log file size quickly & easily. This calls for a bit parameter as to whether to include log files.
Just data files: I mean, if I’m going to code for a log file parameter, a companion for just data files only makes sense.
Just the biggest files or emptiest files: On servers with many databases (as was the case when I wrote this), I was often only interested in the big databases, or the empty databases (remember my misguided shrinking exploits?). This requirement turned into a parameter for a custom ORDER BY clause.

Name that DMV!

This is every DBA’s favorite game. Figuring out what DMV contains the data you want. It turns out there are two places that database file size info is maintained. Each database has sys.database_files which has information for that database. The master database also has sys.master_files, which contains information for every database.

Using sys.master_files seems like it would be the obvious choice: everything in one view in master is going to be easier to query than hitting a view in a bunch of different databases. Alas, there’s a minor snag. For tempdb, sys.master_files has the initial file size, but not the current file size. This is valuable information, but doesn’t answer the use cases we set out above. I want to know the current file sizes. Thankfully, sys.database_files has the correct current file sizes for tempdb, so we can use that.

Using sys.database_files seems like it’s going to be the right move for us then. Alas, this isn’t quite perfect either. With Log Shipping, and Availability Group secondaries, if you’ve moved data files to a different location, sys.database_files will contain the location of the files on the primary database. Thankfully, sys.master_files has the correct local file locations for user databases, so we can use that.

Ugh, so it looks like the answer is “both”… we’ll need to use sys.database_files for tempdb, and sys.master_files for everything else.

How’s it work?

The script loops through and collects all data on all data files and throws it into a temp table. This uses sp_foreachdb and not sp_MSforeachdb. This was originally written by Aaron Bertrand (blog|twitter), and is now part of the First Responder Kit. :

EXEC sp_foreachdb @suppress_quotename = 1, @state_desc = 'ONLINE', @command = 'USE [?] 
    INSERT #FileSizeInfo (ServerName, DbName, FileSizeMB, SpaceUsedMB, GrowthAmount, LogicalFileName, PhysicalFileName, FileType, FreeSpaceMB, FreeSpacePct) 
    SELECT @@servername as ServerName,   ''?'' AS DatabaseName,   
    CAST(f.size/128.0 AS decimal(20,2)) AS FileSize, 
    CASE
        WHEN mf.type_desc = ''FILESTREAM'' THEN CAST(f.size/128.0 AS decimal(20,2))
        ELSE CAST(FILEPROPERTY(mf.name, ''SpaceUsed'')/128.0 as decimal (20,2)) 
    END AS ''SpaceUsed'', 
    CASE 
        WHEN mf.type_desc = ''FILESTREAM'' THEN NULL
        WHEN mf.is_percent_growth = 0 
            THEN convert(varchar,ceiling((mf.growth * 8192.0)/(1024.0*1024.0)))  + '' MB'' 
        ELSE convert (varchar, mf.growth) + '' Percent'' 
    END AS FileGrowth, mf.name AS LogicalFileName, 
    mf.physical_name AS PhysicalFileName, mf.type_desc AS FileType,
    CAST(f.size/128.0 - CAST(FILEPROPERTY(mf.name, ''SpaceUsed'' ) AS int)/128.0 AS int) AS FreeSpaceMB,   
    CAST(100 * (CAST (((f.size/128.0 -CAST(FILEPROPERTY(mf.name,   
        ''SpaceUsed'' ) AS int)/128.0)/(f.size/128.0))   AS decimal(4,2))) AS varchar(8)) + ''%'' AS FreeSpacePct 
    FROM sys.master_files mf 
    JOIN [?].sys.database_files f ON f.file_id = mf.file_id AND mf.database_id = db_id(''?'')
    ' ;

Then, I select out of the temp table by building some dynamic SQL to apply the filters we defined above. Notice that it uses a LIKE on database name. That’s pretty cool if you’re working in an a large environment where you have naming conventions and want to filter on that convention–which is exactly when I added that feature.

SET @sql = N'SELECT * FROM  #FileSizeInfo WHERE 1=1';

--Include optional filters
IF @IncludeDataFiles = 0
    SET @sql = @sql + N' AND FileType <> ''ROWS''';
IF @IncludeLogFiles = 0
    SET @sql = @sql + N' AND FileType <> ''LOG''';
IF @Drive IS NOT NULL
    SET @sql = @sql + N' AND PhysicalFileName LIKE ''' + @Drive + N'%''';
If @DbName IS NOT NULL
    SET @sql = @sql + N' AND DbName	LIKE ''' + @DbName + N'''';

--include order by
SET @sql = @sql + N' ORDER BY ' + @OrderBy;

What should I change about this code?

I promised at the top that I would try to avoid cringing at the code I first wrote as a Junior DBA. As I review it today, I see two problems that I really ought to fix sometime (or maybe you want to fix it and submit a pull request).

Protect against SQL injection: The way I implemented the dynamic SQL, there’s some fantastic opportunity for SQL injection. Since I only make my DBA database accessible to DBAs (who are already sysadmin), there’s not really any end user or GUI to protect against. Still, I want my code to be safe–future use/access might change, and since I’ve put it on the internet, I don’t want to be setting a bad example. To be a better role model, I should go back & fix that code.
Implement the @DbName filter differently: On servers with a large number of databases & data files, the script is going to pull all that file info into the temp table, even if the data isn’t requested. We can use the @database_list parameter on sp_foreachdb to do some of that filtering up front and make this perform better.

Try it for yourself

You can download & install my complete DBA Database from GitHub. If you want to download just the one script, that’s on GitHub, too. If you grab the stored procedure without the rest of the database, remember that you’ll also need to grab sp_foreachdb from the First Responder Kit.