Enforce a specific constraint to be checked first?

clip art of 
 a double-quote character

Question

I have the following SQL Server table :

create table Account(
    -- column names with their data types and properties
    Id int not null identity primary key,
    User varchar(50) not null,
    Job varchar(200) not null,
    
    -- table limitations
    constraint UniqueUserPerAccount unique (User),
    constraint UniqueJobPerAccount unique (Job),
)

For the two constraints I listed the more important constraint first hoping that if both constraints are violated, the first one would be triggered and displayed to the user. However, it appears the constraints are declared alphabetically by column name and therefore if both constraints fail, the UniqueJobPerAccount error, and only this error, is stated to have been violated.

I have tested this in two ways:

  1. When trying to load to the Account table with a duplicate User and Job, the UniqueJobPerAccount error, and only this error, is displayed in the error message.
  2. When using SQL Server Management and right-clicking on the Account table to ‘Script Table as’ -> ‘CREATE to’, it provides the constraints in the alphabetical order of the columns as seen below and not as I have provided in the –table limitations above:
 CREATE TABLE [dbo].[Account]
 (
   [Id] [int] IDENTITY(1,1) NOT NULL,
   [User] [varchar](50) NOT NULL,
   [Job] [varchar](200) NOT NULL,
   PRIMARY KEY CLUSTERED 
   (
         [Id] ASC
   ),
    CONSTRAINT [UniqueJobPerAccount] UNIQUE NONCLUSTERED 
   (
         [Job] ASC
   ),
    CONSTRAINT [UniqueUserPerAccount] UNIQUE NONCLUSTERED 
   (
         [User] ASC
   )
 );

asked 2023-10-06 by Kyle Furman


Answer

As pointed out in the comments by Aaron, SQL Server doesn’t use the definition order to determine execution order for constraints (well, sorta…Aaron’s also kinda wrong…). Since the INSERT needs to either fully succeed or fully fail, the database engine will potentially check in any order that seems optimal to SQL Server.

Let’s use your table definition, and seed the table with two rows:

As an important detail for readers, unique constraints are enforced by unique indexes. They have different metadata, but otherwise unique indexes & unique constraints can be considered to be the same.

CREATE TABLE dbo.Account(
    -- column names with their data types and properties
    Id int not null IDENTITY PRIMARY KEY,
    [User] varchar(50) not null,
    Job varchar(200) not null,
    
    -- table limitations
    CONSTRAINT UniqueUserPerAccount UNIQUE ([User]),
    CONSTRAINT UniqueJobPerAccount UNIQUE (Job),
);


INSERT INTO dbo.Account([User],[Job])
VALUES ('Andy','MyJob'),
       ('Aaron','HisJob');

Let’s cause the error:

This insert tries to insert 3 rows–the first violates BOTH unique constraints, while the other two would not:

INSERT INTO dbo.Account([User],[Job])
VALUES ('Aaron','MyJob'),
       ('Billy','ThirdJob'),
       ('Bob','FourthJob');

As expected, we get an error for only ONE of the unique constraint violations–the one that SQL Server tried to do first. Like you, I get a violation on the `UniqueJobPerAccount` constraint:

Msg 2627, Level 14, State 1, Line 33
Violation of UNIQUE KEY constraint ‘UniqueJobPerAccount’. Cannot insert duplicate key in object ‘dbo.Account’. The duplicate key value is (MyJob).

So what’s happening?

If we look at the graphical estimated execution plan for the insert, it doesn’t show us much of anything helpful:

However, if we look at the execution plan XML (by right clicking on the graphical plan & selecting “Show Execution Plan XML”, we can find the XML that corresponds to the “Clustered Index Insert”:

<RelOp AvgRowSize="9" EstimateCPU="3E-06" EstimateIO="0.03" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Insert" NodeId="0" Parallel="false" PhysicalOp="Clustered Index Insert" EstimatedTotalSubtreeCost="0.0300044">
	<OutputList />
	<Update DMLRequestSort="false">
		<Object Database="[/dev/null/]" Schema="[dbo]" Table="[Account]" Index="[PK__Account__3214EC076797E3BE]" IndexKind="Clustered" Storage="RowStore" />
		<Object Database="[/dev/null/]" Schema="[dbo]" Table="[Account]" Index="[UniqueJobPerAccount]" IndexKind="NonClustered" Storage="RowStore" />
		<Object Database="[/dev/null/]" Schema="[dbo]" Table="[Account]" Index="[UniqueUserPerAccount]" IndexKind="NonClustered" Storage="RowStore" />
	<Snip ✂ />
	<Snip ✂ />
	<Snip ✂ />
	</Update>      
</RelOp>

The <RelOp> XML node correlates to the “Clustered Index Insert” operator, and the <Object> XML nodes underneath it define the physical index structures that will be modified by that insert. Sure enough, we find them listed in the order:

  • PK__Account__3214EC076797E3BE
  • UniqueJobPerAccount
  • UniqueUserPerAccount

In practice, since the UniqueJobPerAccount comes first in this section of the execution plan, it’s going to be the unique index that finds a duplicate first, and this will be the violation reported by the error message on failure.

But why do the indexes get listed in that order? Because of the index_id. When you create the table and constraint in a single atomic CREATE TABLE statement, then SQL Server’s only goal is to accomplish what you asked, and it conveniently creates the indexes in whatever order it feels like.

You can try to play games with your creation order to force index_ids to be in the order you want–but depending on that is not something I would suggest. Indexes can be dropped & recreated, and it can be impossible to guarantee the index_ids are always ordered the way your application code expects.

For the sake of example:

If we drop & re-created the table, first with no unique constraints, then add them in a specific order as separate statements, we can influence the index_ids of the two unique indexes, which influences the execution plan, and ultimately the error message raised by the violation.

DROP TABLE IF EXISTS dbo.Account;

CREATE TABLE dbo.Account(
    -- column names with their data types and properties
    Id int not null IDENTITY PRIMARY KEY,
    [User] varchar(50) not null,
    Job varchar(200) not null
 );  
    -- table limitations
ALTER TABLE dbo.Account
	ADD CONSTRAINT UniqueUserPerAccount UNIQUE ([User]);
ALTER TABLE dbo.Account
	ADD CONSTRAINT UniqueJobPerAccount UNIQUE (Job);

INSERT INTO dbo.Account([User],[Job])
VALUES ('Andy','MyJob'),
       ('Aaron','HisJob');


INSERT INTO dbo.Account([User],[Job])
VALUES ('Aaron','MyJob'),
       ('Billy','ThirdJob'),
       ('Bob','FourthJob');

Which returns this error:

Msg 2627, Level 14, State 1, Line 29
Violation of UNIQUE KEY constraint ‘UniqueUserPerAccount’. Cannot insert duplicate key in object ‘dbo.Account’. The duplicate key value is (Aaron).

And we can see the execution plan XML has those <Update> nodes in a different order:

<RelOp AvgRowSize="9" EstimateCPU="3E-06" EstimateIO="0.03" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="1" LogicalOp="Insert" NodeId="0" Parallel="false" PhysicalOp="Clustered Index Insert" EstimatedTotalSubtreeCost="0.0300044">
	<OutputList />
	<Update DMLRequestSort="false">
		<Object Database="[/dev/null/]" Schema="[dbo]" Table="[Account]" Index="[PK__Account__3214EC076797E3BE]" IndexKind="Clustered" Storage="RowStore" />
        <Object Database="[/dev/null/]" Schema="[dbo]" Table="[Account]" Index="[UniqueUserPerAccount]" IndexKind="NonClustered" Storage="RowStore" />
		<Object Database="[/dev/null/]" Schema="[dbo]" Table="[Account]" Index="[UniqueJobPerAccount]" IndexKind="NonClustered" Storage="RowStore" />
	<Snip ✂ />
	<Snip ✂ />
	<Snip ✂ />
	</Update>      
</RelOp>

But what behavior do you *REALLY* want?

In my example, I’m attempting to insert 3 rows, and all three fail because of the unique constraint violation.

Also, I said that you shouldn’t rely on the internal index_id values for real life code. So what should you do instead?

If you’re doing a single-row insert, you can do something like this, which uses transaction & locking semantics to eliminate race conditions, and provides the opportunity for you to throw specific errors in an arbitrary order to have consistent behavior regardless of internal index_id values:

DECLARE @User varchar(50) = 'Aaron';
DECLARE @Job varchar(200) = 'MyJob';

BEGIN TRANSACTION
IF EXISTS (SELECT 1 FROM dbo.Account (HOLDLOCK) WHERE [User] = @User)
	BEGIN
		ROLLBACK;
		THROW 60000, 'Duplicate value for the [User] value when inserting into dbo.Account',1;
	END;
IF EXISTS (SELECT 1 FROM dbo.Account (HOLDLOCK) WHERE Job = @Job)
	BEGIN
	ROLLBACK;
		THROW 60001, 'Duplicate value for the [Job] value when inserting into dbo.Account',1;
	END;

INSERT INTO dbo.Account([User],[Job])
SELECT @User, @Job

COMMIT;

If you’re doing set-based inserts, things can get a little tricky, and you’ll need to consider what behavior you really want.

  • If you’re inserting 3 rows, and only 1 violates the unique constraint, should the other 2 still be inserted?
  • If you’re inserting 3 rows, and 2 of them conflict with each other (but not data that is already in the table), do you want the other 1 row inserted?
  • If you’re inserting 3 rows, and 2 of them conflict with each other (but not data that is already in the table), do you want to “uniquify” the insert and select one of the two conflicting rows to insert?
  • If you are inserting a large set, some rows may conflict with other rows in the insert, some rows may conflict with existing data, some rows may have duplicates for User, and other rows may have duplicates for Job.

Handling all of these cases can get complicated, particularly with custom error reporting. So I’m going to stop here and not go down the rabbit hole of figuring out all the possible permutations, since this is probably quite a bit beyond the scope of your question–but worth mentioning & thinking about.

answered 2023-10-08 by Andy Mallon