Tuesday, March 30, 2010

Database schema changes unit tested using TSQLUnit

How does one test the existence of a primary key (PK) constraint belonging to a table in a database? Simple, right? Just write a test that intentionally violates that constraint. One's initial impulse would be to write a test inspecting the data contained within the PK's table. This approach is sensible and can be viewed as the conventional state-based style of testing.

Let's try it. The following test using TSQLUnit attempts to insert in a table a new row with the same value for its primary key as another existing row's PK:
create procedure dbo.ut_SubProduct_PkConstraint
as
begin
    declare @failMessage        varchar(200),
            @err                int,
            @primaryKeyError    int

    set @primaryKeyError = 2627
           
    /* RUN TEST */
-- grab row already in table and try to re-insert it.
    insert into dbo.SubProduct    
        select top 1 sp.*
        from dbo.SubProduct sp
   
    set @err = @@error       

    /* ASSERT TEST */
    if  @err <> @primaryKeyError
        begin
            set @failMessage = 'PK constraint not defined.'
            exec tsu_failure @failMessage
        end
end
The above code will not work as a good test because error messages regarding the failure to insert a row with a duplicate PK will occur that can not be suppressed in TSQLUnit's test runner (as viewed in the 'Results' window in Sql Server Management Studio). (Even with some not-too-in-depth research, I could not find a means to hide or suppress these error messages using TSQL.)

Instead, it can be tested more cleanly using SQL Server's info schema views:
create procedure dbo.ut_Product_PrimaryKey
as
begin
    declare @failMessage        varchar(200),
            @tableName          varchar(40),
            @primaryKeyColumn   varchar(40)

    set @tableName           = 'Product'
    set @primaryKeyColumn    = 'ProductID'
           
    if not exists
    (
        select      c.table_name, c.column_name, c.data_type
        from        information_schema.columns c
            inner join information_schema.key_column_usage kcu
                    on c.table_name = kcu.table_name and c.column_name =
                       kcu.column_name
            inner join information_schema.table_constraints tc
                    on tc.table_name = kcu.table_name and tc.constraint_name =
                       kcu.constraint_name and tc.constraint_type = 'PRIMARY KEY'
        where       c.table_name = @tableName and kcu.column_name = @primaryKeyColumn
    )
    begin
        set @failMessage = 'PK constraint not defined for ''' + @tableName + '.' + @primaryKeyColumn + ''''
        exec tsu_failure @failMessage
    end
end
That's it. You now have a test that handles primary key constraints.

Of course, this is not the last and only time where primary keys will require testing. While the above sproc does a satisfactory job, it is not reusable for any other tables. Let's refactor it into a more generic "Assert" sproc in an xUnit style:
create procedure dbo.tsux_AssertPrimaryKeyExists
/* This is an extension to the TSQLUnit framework. */
(  
    @tableName            varchar(40),
    @keyColumn            varchar(40),   
    @failMessage          varchar(255)=null   
)
as
begin
    if not exists
    (
        select      c.table_name, c.column_name, c.data_type
        from        information_schema.columns c
            inner join information_schema.key_column_usage kcu
                    on c.table_name = kcu.table_name and c.column_name =
                       kcu.column_name
            inner join information_schema.table_constraints tc
                    on tc.table_name = kcu.table_name and tc.constraint_name =
                       kcu.constraint_name and tc.constraint_type = 'PRIMARY KEY'
        where       c.table_name = @tableName and kcu.column_name = @keyColumn
    )
    begin
        if @failMessage is null
            set @failMessage = ' PK constraint not defined for ''' + @tableName + '.' + @keyColumn + ''''
        exec tsu_failure @failMessage
    end
end
Your actual test becomes more compact and easier to understand:
create proc dbo.ut_Product_PrimaryKey
as 
begin
    exec dbo.tsux_AssertPrimaryKeyExists
        @tableName                  = 'Product',
        @primaryKeyColumn           = 'ProductID',
end
Any verification of pure schema changes such as the creation of new tables, columns, constraints, etc. via unit tests is better served using SQL Server's system tables and views to query the necessary meta information. This has proven to be more preferable than performing data centric state based tests (This technique reminds me a little of using reflection in .NET to do testing)

The earlier test for primary keys can be also applied to foreign key constraints as well. However, it requires some additional pieces to validate including the table and column being referenced. After (once again) finding the suitable tsql needed via an online search, here is the test:
create procedure dbo.tsux_AssertForeignKeyExists
/* This is an extension to the TSQLUnit framework. */
(
    @tableName            varchar(40),
    @foreignKeyColumn     varchar(40),
    @referenceTable       varchar(40),
    @referenceColumn      varchar(40),   
    @failMessage          varchar(255)=null   
)
as
begin
    if not exists
    (
        /*
            This is modified version of the tsql query used to retrieve foreign key info
            courtesy of:
                http://www.experts-exchange.com/Microsoft/Development/MS-SQL-Server/Q_22952666.html
        */

        select
            object_name(fkeyid)     as TableName,
            a.name                  as FKColumn,
            object_name(constid)    as FKConstraint,
            object_name(rkeyid)     as ReferenceTable,
            b.name                  as ReferencedColumn
        from sysforeignkeys f
            inner join syscolumns a on a.id = f.fkeyid and a.colid = f.fkey
            inner join syscolumns b on b.id = f.rkeyid and b.colid = f.rkey
        where
            fkeyid                        = object_id( @tableName )
            and a.name                    = @foreignKeyColumn
            and object_name(rkeyid)       = @referenceTable
            and b.name                    = @referenceColumn
    )
        begin
            set @failMessage = 'FOREIGN KEY does not exist for ''' + @tableName + '.' + @foreignKeyColumn  +  ''''
            exec tsu_failure @failMessage
        end
end
Now the new unit test would plainly look like this:
create procedure dbo.ut_Product_ForeignKeys
as
begin
    exec dbo.tsux_AssertForeignKeyExists
        @tableName            ='Product',
        @foreignKeyColumn     ='CategoryID',
        @referenceTable       ='Category',
        @referenceColumn      ='CategoryID'   

end
Database Testing: How to Regression Test a Relational Database, which details areas of any database that should be tested, had been an influence on developing these types of data definition language (DDL) tests. As an example, referential integrity is mentioned as being important area to test. On the surface, it seems a bit overkill to set up tests for PKs and FKs. However, even superfically minor data errors can be costly.

I was once tasked with expanding the size of a core primary key column that had multiple dependencies to other tables (including views and sprocs). As expected, to make the change required temporarily dropping the PK and FK constraints on those other tables and then add them back after applying the change.

Herein lies the risk. What if the "adding back" part was accidentally forgotten and not included in the change script? What if it was temporarily commented out with the intention to uncomment it later but overlooked? Allowing a deficient, regression-inducing script to rollout into a live production environment would be poor software development.

What automated unit tests provide in this situation is the insurance and safety net that the constraints are less likely to be missed or forgotten. They enforce the existence of those key constraints and firmly establish them as requirements for the database. It further instills greater confidence to make these sorts of changes by providing immediate feedback during development (and not much later) if the constraints are not set.

Another situation where unit testing using meta data came in handy was increasing the data type length of a column. Initially, when the size of varchar for a few columns needed to increase, I had some data heavy tests performing row insertions. These tests were fragile since they'd break the TSQLUnit test runner itself if a test inserted data larger than the expected size. Instead, I created a generic assert stored procedure that used the system sproc, 'COL_LENGTH':
create procedure dbo.tsux_AssertColumnLength
/* This is an extension to the TSQLUnit framework. */
(
    @tableName            varchar(40),
    @columnName           varchar(40),
    @expectedLength       smallint,
    @failMessage          varchar(200)=null
)
as
begin
    declare @actualLength        smallint

    select @actualLength = COL_LENGTH(@tableName, @columnName)

    if @expectedLength != @actualLength
        begin           
            set @failMessage = 'Column length for ''' + @tableName + '.' + @columnName  +  ''' does not match expected value.'
            exec tsu_failure @failMessage
            print 'Expected: '    + cast(@expectedLength as varchar(3))
            print 'Actual: '    + cast(@actualLength as varchar(3))
        end
end

Again, reusable and reliable code which avoids relying on sql compiler errors to indicate test failure.

Other examples of generic asserts for columns aside from length:

* tsux_AssertColumnExists
* tsux_AssertColumnDataType

Also, some other areas using test assertions:

* permissions on an object (quite important and often overlooked until too late)
* stored procedure parameter lengths
* existence of tables, views, and other similar database objects
* schemabinding on views

What's more is that these types of tests can be easily code generated. For example, if database changes included adding new columns, then unit tests can be generated by extracting the columns' meta data (e.g. name, datatype, length, etc.) as defined in a sql script (or even from an xml file or spreadsheet). The same can be done with existing data objects and structures requiring DDL changes but the meta data can be pulled from the database's system tables/views. In this situation, you gain some automatic test coverage for your current schema before attacking it with alterations.

Not sure if any of the techniques detailed earlier could be applied to check constraints (e.g. inserted/updated datetime value should not be greater than today's date, etc.). Perhaps, check constratins can be sufficiently managed with simple, direct unit tests using data rather than using meta data. (Although it could be argued that the logic for most check constraints should exist in the application code and not in the database. However, in practise, this is not always the case.) For now, without an immediate need, it will remain speculative.

This is likely my last post on TSQLUnit. In the future, my data access tests will probably be created in and executed from the application code rather than on the SQL Server side. However, if I do find myself in a scenario where database-only unit tests are needed, I'd probably try out T.S.T. the T-SQL Test Tool since it has built-in assert functions.