A logical flaw with function-based CHECK constraints in SQL Server

ANSI/ISO SQL defines CHECK constraints very simply as:

<check constraint definition> ::= CHECK <left paren> <search condition> <right paren>

where <search condition> is a boolean expression, which may include query expressions.

In SQL Server, Oracle and many other DBMSs, queries aren't permitted in constraints. CHECK constraints can only reference the table against which they are defined and they can only refer to column names as if they are scalar values - i.e. the values for a single row at a time. This isn't good enough. Often we want to implement business rules that refer to more than one table or to selections or aggregations of rows from the table in question. For example, the rule "Ensure that sum of orders cannot exceed the customer's credit limit" or "Ensure that total X = total Y".

In the past, triggers have been one common solution for these types of requirement. But when SQL Server 2000 introduced user-defined functions, some people tried another way to get around the limitations of CHECK constraints: put a query in a scalar function and then reference the function in a CHECK constraint.

This solution of using functions in CHECK constraints is flawed: it doesn’t always work as expected because the constraint checking mechanism doesn’t respect the atomic nature of set-based updates. SQL Server's CHECK constraints are designed to be evaluated a single row at a time rather than in set-based fashion. This can create big problems if your CHECK constraint relies on a query and if more than one row is affected by any update. What happens is that the constraint gets checked once for each row before the statement completes. That means statement atomicity is broken and the data will be exposed to the CHECK constraint logic in an inconsistent, incomplete state. The results are unpredictable and inaccurate.

Here's an example. The goal of this example is to define a constraint that guarantees the Accounts total will always balance to zero:

CREATE FUNCTION dbo.Accounts_total ()
RETURNS INT
AS
BEGIN
DECLARE @r INT;
SET @r =
(SELECT SUM(Amount) FROM dbo.Accounts);
RETURN @r;
END

GO

CREATE TABLE dbo.Accounts
(AccountNo INT PRIMARY KEY, Amount NUMERIC(10,2) NOT NULL,
CONSTRAINT ck_Accounts CHECK (dbo.Accounts_total()=0) /* balance must equal zero */ );

Having created the table, now try an INSERT:

INSERT INTO dbo.Accounts (AccountNo,Amount)
SELECT 1,-50 UNION ALL
SELECT 2,50 ;

Result:

Msg 547, Level 16, State 0, Line 1
The INSERT statement conflicted with the CHECK constraint "ck_Accounts". The
conflict occurred in database "MYTEST", table "dbo.Accounts".
The statement has been terminated.

The INSERT failed even though the constraint should not have been violated because two counterbalancing amounts were inserted simultaneously.

Now disable the constraint, INSERT the rows and then re-enable the constraint:

ALTER TABLE dbo.Accounts NOCHECK CONSTRAINT ck_Accounts;

INSERT INTO dbo.Accounts (AccountNo,Amount)
SELECT 1,-50 UNION ALL
SELECT 2,50 ;

ALTER TABLE dbo.Accounts WITH CHECK CHECK CONSTRAINT ck_Accounts;

Result: The constraint is (correctly) not violated because the Accounts table balances. The only way to insert or update the Amount column is by disabling the constraint!

I have tested this under SQL Server 2005, 2008 and 2012 with the same behaviour throughout. It is unaffected by the transaction isolation level because the flaw occurs at row level rather than statement level.

To some extent it is possible to write queries that are safe in functions within CHECK constraints but the potential restrictions are fairly severe. In purely relational terms this is obviously a very serious flaw even if it's not actually a bug from Microsoft's point of view.