Nonclustered index storage on clustered columnstore
Question
In SQL Server, a non-unique nonclustered index on a rowstore table incorporates the base object’s bookmark (RID or clustering key) at all levels of the nonclustered index structure. The bookmark is stored as part of the nonclustered index key at all index levels.
On the other hand, if the nonclustered index is unique, the bookmark is present only at the leaf level of the index – not as part of the key (the bookmark is present as one or more included columns, in effect).
In SQL Server 2016, it is possible to build a nonclustered b-tree index on a column-oriented table (one that has a clustered columnstore index).
- What is the ‘bookmark’ used for a nonclustered b-tree index on a clustered columnstore table?
- Do the differences between unique and non-unique nonclustered indexes described above still apply?
asked 2017-01-06 by Paul White
Answer
- The “bookmark” is the columnstore index original locator (per “Pro SQL Server Internals” by Dmitri Korotkevitch). This is an 8-byte value, with the columnstore index’s
row_group_id
in the first 4-bytes and an offset in the second 4-bytes. - If you use
DBCC PAGE
to look at the non-clustered index, the 8-byte columnstore index original locator appears in the “uniquifier” column of theDBCC PAGE
output. This shows that a unique non-clustered index does not need to include the columnstore row locator, whereas a non-unique non-clustered index does.
The following code creates a columnstore-organized table with a unique and non-unique b-tree nonclustered index on the same column:
CREATE TABLE dbo.Heapish ( c1 bigint NOT NULL, c2 bigint NOT NULL, INDEX CCI_dbo_Heapish CLUSTERED COLUMNSTORE ); GO INSERT dbo.Heapish WITH (TABLOCKX) (c1, c2) SELECT TOP (1024 * 1024 * 8) c1 = ROW_NUMBER() OVER (ORDER BY C1.[object_id], C1.column_id), c2 = ROW_NUMBER() OVER (ORDER BY C1.[object_id], C1.column_id) FROM master.sys.columns AS C1 CROSS JOIN master.sys.columns AS C2 ORDER BY c1 OPTION (MAXDOP 1); GO CREATE UNIQUE NONCLUSTERED INDEX UNIQUE_c2 ON dbo.Heapish (c2) WITH (MAXDOP = 1); CREATE NONCLUSTERED INDEX NONUNIQUE_c2 ON dbo.Heapish (c2) WITH (MAXDOP = 1);
We can see the size of the index row at different levels of the b-tree using sys.dm_db_index_physical_stats
:
SELECT DDIPS.index_level, DDIPS.page_count, DDIPS.record_count, DDIPS.min_record_size_in_bytes, DDIPS.max_record_size_in_bytes FROM sys.dm_db_index_physical_stats ( DB_ID(), OBJECT_ID(N'dbo.Heapish', N'U'), INDEXPROPERTY(OBJECT_ID(N'dbo.Heapish', N'U'), N'UNIQUE_c2', 'IndexID'), NULL, 'DETAILED' ) AS DDIPS; SELECT DDIPS.index_level, DDIPS.page_count, DDIPS.record_count, DDIPS.min_record_size_in_bytes, DDIPS.max_record_size_in_bytes FROM sys.dm_db_index_physical_stats ( DB_ID(), OBJECT_ID(N'dbo.Heapish', N'U'), INDEXPROPERTY(OBJECT_ID(N'dbo.Heapish', N'U'), N'NONUNIQUE_c2', 'IndexID'), NULL, 'DETAILED' ) AS DDIPS;
The output is:
Both structures have the same row size at the leaf level, but the nonunique nonclustered index is 12 bytes larger than the unique nonclustered index at the non-leaf levels due to the 8-byte columnstore locator, plus 4 bytes of overhead for the first variable-length column in a row (uniquifier is variable length).
answered 2017-01-06 by Andy Mallon