It’s T-SQL Tuesday, the blog party that SQL Server expert Adam Machanic (blog|twitter) started. This month’s episode is hosted by Aaron Bertrand (blog | twitter). The topic: Dealer’s Choice (Door Number 2: Bad Habits)
When I was a kid, Choose Your Own Adventure books were overwhelming because I wanted to explore every possible option–some things never change. This month for T-SQL Tuesday, Aaron offers two blogging options and I’ve chosen to do both. “Door Number 1” is a challenge to write about what you do in your free time, inspired by Drew Furgiuele’s #sqlibrium post. That blog post is over here. This post is for the “Door Number 2” option, which is to talk about his Bad Habits series–I’ve opted to bring up a topic Aaron hasn’t blogged about: tabs vs spaces.
Tabs vs spaces: A matter of style?
The great tabs vs spaces debate is often framed as a matter of opinion or coding style. “As long as you’re consistent, it doesn’t matter.” Keep in mind that unlike your choice in footwear, coding style can’t be personal. Everyone at your organization must observe the same consistency. Even if everyone at your organization observes the same style, are tabs vs spaces just a matter of style?
I’m going to draw a line in the sand and say that tabs, like Crocs and mullets, are bad style.
I used to be a Tabber, but with experience I’ve gained wisdom and fully converted to being a Spacer. I always use 4 spaces instead of 1 tab.
Coding style, not keystrokes
Just to be clear, when I talk “tabs vs spaces,” I’m referring to the actual text that gets saved in your source code. I’m not talking about whether you should hit the spacebar key or the tab key on your keyboard. What I am saying is that when you hit the Tab key on your keyboard, it should not insert a horizontal tab ASCII character (code point U+0009). Instead, the Tab key should enter 4 spaces (code point U+0020).
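If you want to check what is actually saved in a file, a minimal PowerShell sketch along these lines could count the lines whose indentation contains a tab character. The function name `Get-TabIndentedLineCount` and the sample file are my own illustrations, not part of any tool mentioned here.

```powershell
# Hypothetical helper: count lines whose indentation contains a tab (U+0009).
function Get-TabIndentedLineCount {
    param([Parameter(Mandatory)][string]$Path)

    # '^ *\t' matches optional leading spaces followed by a tab character.
    $hits = @(Get-Content -LiteralPath $Path | Where-Object { $_ -match '^ *\t' })
    return $hits.Count
}

# Illustrative sample file: one line indented with a tab, one with four spaces.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'tab-check-sample.sql'
Set-Content -LiteralPath $sample -Value @("SELECT 1", "`tFROM a", "    WHERE b = 1")

Get-TabIndentedLineCount -Path $sample   # 1
```

A result of zero means the file is indented with spaces only, whatever key was pressed to produce it.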
Tabs: The bad habit SSMS promotes
Before I justify my last statement, I’m going to take a minute to kvetch that SQL Server Management Studio gets it wrong with its default settings. By default, when you hit the Tab key, SSMS puts in a tab character. Everyone should change this setting to “Insert spaces” instead of “Keep tabs”.
Thankfully, Microsoft has seen the error of their ways: SQL Operations Studio uses the proper tab settings and replaces a Tab keystroke with four spaces.
The cases for spaces
Source code size difference is nominal
A lot of people comment on the four spaces taking up four bytes instead of one byte for a tab. The Tabbers claim that these extra bytes will make their source code far too big, and that tabs make more sense. To test this, I grabbed Ola Hallengren’s open source SQL Server Maintenance Solution. I saved one copy to a folder named “tabs” and then an identical copy to a folder named “spaces”. Then, I used some PowerShell to do some text replacement to ensure they were formatted consistently with tabs and spaces respectively (note: Ola uses spaces!):
```powershell
## If there are four spaces, replace them with tabs
Get-ChildItem "C:\temp\Tabs-vs-Spaces\tabs" *.sql -recurse | ForEach {
    (Get-Content $_.FullName | ForEach {$_ -replace "    ", "`t"}) | Set-Content $_.FullName
}
## If there are tabs, replace them with four spaces
Get-ChildItem "C:\temp\Tabs-vs-Spaces\spaces" *.sql -recurse | ForEach {
    (Get-Content $_.FullName | ForEach {$_ -replace "`t", "    "}) | Set-Content $_.FullName
}
## Check the size of the folders, in bytes
$tabsTotalSize = Get-ChildItem "C:\temp\Tabs-vs-Spaces\tabs" *.sql -recurse | Measure-Object -property length -sum
$spacesTotalSize = Get-ChildItem "C:\temp\Tabs-vs-Spaces\spaces" *.sql -recurse | Measure-Object -property length -sum
Write-Host "*******************************************************"
Write-Host "Total size of TABS folder: $($tabsTotalSize.Sum)"
Write-Host "Total size of SPACES folder: $($spacesTotalSize.Sum)"
Write-Host "Spaces are $(($spacesTotalSize.Sum/$tabsTotalSize.Sum).ToString("P")) the size of tabs"
```
```
*******************************************************
Total size of TABS folder: 252761
Total size of SPACES folder: 269930
Spaces are 106.79% the size of tabs
```
This particular project is only about 7% larger with spaces than with tabs.
Repeating the same exercise with The First Responder Kit shows it jumps from 1109 KB to 1438 KB. OK, that’s nearly a 30% increase, but it’s still small enough to fit on a 3.5″ HD floppy diskette.
But seriously… how large is your database source code? It’s probably megabytes in size.
How large is your database? It’s probably terabytes in size. If your code were to double in size, it would still be a drop in the bucket.
Do you worry that you need to trim your fingernails before getting on the scale to check your weight? Your fingernails aren’t going to make or break your weigh-in.
We code in fixed width fonts
For me, this is the big reason that eventually convinced me to be a Spacer, and not a Tabber.
Tabs aren’t fixed width. Different text editors will render tabs at different widths.
Most of the time, I’m editing my .sql files in SSMS (and now with SQL Ops Studio, too). But sometimes I’m on a server that doesn’t have SSMS or any other IDE or advanced text editor–in those cases, I’m stuck with using Notepad as an editor. Let’s use a very simple code sample based on AdventureWorks, and see how it looks in SSMS vs Notepad. First, let’s see how tabs look.
Someone painstakingly aligned the code in SSMS, but it looks quite heinous in Notepad. The column list is thrown off a little bit, but that subquery is completely thrown off. And this is just a very simple code sample! Imagine if this were a more complicated query–it could very quickly become unreadable.
OK, but what about spaces? Spaces are always the same width. There is no room for interpretation by different text editors. If I made my code look pretty in my editor, it will look equally pretty in your editor.
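To make the width problem concrete, here is a rough sketch of how an editor might expand tabs to tab stops. It is only an illustration under my own assumptions: `Expand-Tab` is a hypothetical helper, and the two sample lines are made up rather than taken from the AdventureWorks example.

```powershell
# Hypothetical helper: render a line the way an editor with the given tab width would.
function Expand-Tab {
    param(
        [string]$Line,
        [int]$TabWidth = 4
    )

    $sb = New-Object System.Text.StringBuilder
    foreach ($ch in $Line.ToCharArray()) {
        if ($ch -eq "`t") {
            # A tab advances to the next tab stop, so its width depends on the current column.
            $pad = $TabWidth - ($sb.Length % $TabWidth)
            [void]$sb.Append(' ' * $pad)
        } else {
            [void]$sb.Append($ch)
        }
    }
    return $sb.ToString()
}

# Two lines whose trailing comments were aligned with tabs in a 4-wide editor.
$a = "ID,`t`t`t-- id"
$b = "ProductID,`t-- key"

# Column where the comment starts at tab width 4: both lines agree.
(Expand-Tab -Line $a -TabWidth 4).IndexOf('--')   # 12
(Expand-Tab -Line $b -TabWidth 4).IndexOf('--')   # 12

# Same text at tab width 8: the alignment falls apart.
(Expand-Tab -Line $a -TabWidth 8).IndexOf('--')   # 24
(Expand-Tab -Line $b -TabWidth 8).IndexOf('--')   # 16
```

Had the padding been saved as spaces, the comment column would be the same number no matter which editor opened the file.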
Tab size matters whether you’re using tabs or spaces
One argument I hear is that Spacers don’t have a consistent standard for how many spaces equal one tab. So if I write code with 4 spaces per tab, and someone else edits it with 8 spaces per tab, then the combined code isn’t pretty. If we look at the tabbed SSMS vs Notepad code again, we can see that this same argument could be applied to tabs. If the code was originally authored and formatted in SSMS with tabs, and I edit it in Notepad, my edits won’t have consistent formatting with the rest of the code.
Notepad won’t let me edit how wide a tab character displays. Notepad won’t let me automatically insert spaces when I hit the Tab key. I could hit the spacebar multiple times (which might be annoying, but doable), but if I’m “stuck” in Notepad (or any editor with a different tab width than the original editor) and trying to edit with a few tabs, I’m nearly guaranteed to screw up formatting.
These days, it’s become pretty widely accepted that tab size should be 4 spaces. Tab size is important no matter what–so make sure you’re using 4 spaces per tab.
Performance?
There’s an argument that compilation will be faster with tabs because it’s a smaller query. (Here’s that size thing again!) I’ve never actually seen anyone prove that tabs are faster than spaces–at least not for SQL Server. I suppose I could try to prove it, but I’m going to punt on this, and suggest that someone else give it a shot. I am quite dubious that it would have any significant impact on real-world performance. At most, I would expect the impact to be on the order of milliseconds. Very few workloads depend on millisecond-level performance. Most people would realize a bigger performance improvement by doing some old-fashioned query tuning compared to using tabs over spaces.
Aaron Bertrand wrote about the impact of comments on performance, and I would guess that the numbers on a tabs vs spaces experiment would bear out similar results. I’d love to have someone do some controlled testing and put some real numbers on the tabs vs spaces performance question. (Since this is Aaron’s T-SQL Tuesday, I know he’s going to read this… hopefully he’ll take the bait!)
UPDATE: Aaron took the bait. Check out his performance analysis on tabs vs spaces.
“Nobody cares, I’ll just reformat it” is a terrible excuse
This makes me pout, because it’s a complete non-solution. If everyone working on the project re-formats the code into their own personal style, then maintaining the code becomes a nightmare. Comparing different versions of the code that span re-formatting can be a huge pain. Depending on how aggressive the re-format is, this could either be an annoyance, or it could render any diff practically impossible.
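As a small illustration of the diff noise, the sketch below compares a tab-indented query with the same query re-indented with spaces, using `Compare-Object`. The three-line query is a made-up sample, and a real diff tool may present the change differently.

```powershell
# Made-up sample: the same query, indented with a tab...
$original = @(
    "SELECT p.ProductID,",
    "`tp.Name",
    "FROM Production.Product AS p;"
)
# ...and "reformatted" by someone who prefers four spaces.
$reformatted = $original | ForEach-Object { $_ -replace "`t", "    " }

# The re-indented line is reported as one removal plus one addition,
# even though the query itself has not changed.
$noise = @(Compare-Object -ReferenceObject $original -DifferenceObject $reformatted -CaseSensitive)
$noise.Count   # 2
```

Scale that up to a stored procedure with hundreds of indented lines and the one real change in a commit gets buried.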
Coding standards on a project are important, and that includes formatting style as much as anything else. It’s important for the project to pick a standard and stick with it. You can’t have different people coding on different formatting styles.
You don’t have to agree with me
I’ll quote Aaron from his Bad Habits series:
I’m not trying to correct you, or make you feel like you are “doing it wrong.”
I’m trying to help you be consistent, avoid issues, and set better examples.
Hopefully, I’ve successfully explained why I think spaces are better than tabs. If I haven’t convinced you to use spaces, that’s fine, too. I just hope I don’t have to edit your code.