Data Compression: Costs & benefits

SQL Server supports two kinds of data compression on rowstore data. Columnstore compression is for a different day. You can compress heaps, clustered indexes (on both tables and views), and nonclustered indexes (on both tables and views).

Compression requirements

Data compression is available in SQL Server 2008 & newer. It is an Enterprise Edition feature–if you’re using Standard Edition or Express, then you’re out of luck.

Compression benefits

The obvious benefit is that compressed data takes up less space on disk. Since you probably keep multiple copies of your database (multiple environments, DR, backups, etc), this space savings can really add up. High-performance enterprise-class storage is expensive. Compressing your data to reduce footprint can have a very real benefit to your budget. I once worked on an SAP ERP database that was 12TB uncompressed, and was reduced to just under 4TB after we implemented compression.

The second benefit is that by packing more data onto each data page, you read more rows with each read. This means you can get all the rows you need for your query by performing fewer I/O operations. Reducing I/O will give you a potentially significant performance boost, if I/O is one of your performance bottlenecks. If you’re not using flash storage, there’s a really good chance your storage is a bottleneck in at least some scenarios.

The third big benefit is reduced memory usage. Data is compressed both on disk and in memory–which means your data uses less memory. By corollary, this means that you can keep more data in memory. Having more data in memory means you’re less likely to need to perform a physical read from disk. So compressed data can actually reduce both memory and disk bottlenecks!

Compression costs

Compression is an Enterprise Edition feature. If you have a production (or DR!) instance that runs Standard or Express Editions, you won’t be able to use these features. In particular, a database with compressed objects can not be restored to Standard Edition or Express. This is an important thing to keep in mind, since it impacts database portability.

MOSTLY-OFF-TOPIC:
Speaking of DR–When you pay for your primary production instance, you get a FREE DR instance included with your production licensing. You only pay for your DR instance if you fail over to it (or do reporting off it, or otherwise actually use it).

If you’re running Enterprise in production, and running Standard in DR to save a few pennies, you’re doing it wrong.

There’s also CPU overhead related to compressing & decompressing data. If you are already seeing CPU bottlenecks on your server, then proceed carefully with implementing compression. The overhead is relatively small, but it’s there–make sure you test.

Summary

There are other costs and benefits, but these are some of the biggest ones. In my personal experience, the CPU overhead isn’t significant enough to cause issues. Most servers run with sufficiently low CPU that the overhead isn’t significant enough to cause problems. I usually find that the performance benefits outweigh the performance costs for many scenarios. Most people will find that compression is the right fit somewhere in their architecture. We’ll tackle the question of finding the right scenario another day.

3 Trackbacks / Pingbacks

  1. Data Compression – Curated SQL
  2. Data Compression: How row compression works - Impatient DBA
  3. Data Compression: How page compression works - Impatient DBA

Comments are closed.