Sunday, July 27, 2008

Snake bitten by Python (R.I.P. NAnt)

After first experimenting last year with IronPython, the .NET port of Python, I decided to take the full plunge into Python itself leading to another major milestone in my programming career. Why? Because I now have an incredibly handy language, Python, that can superbly manage rudimentary but necessary development tasks. Furthermore, Python has enlightened me to yet another way of thinking about how code can be written.

As a dynamic language, Python can be extremely powerful. It can be used for "glue" tasks like scripting, but its potential is even greater. Because it is an interpreted language, pieces of your code can be written and tested practically at the same time via its interpreter console (imagine the Immediate Window in Visual Studio as the means to see how your code works while you are writing it; no "compilation tax"). In addition, Python supports both OOP (classes, et al.) and functional programming (treating functions like data in lists, and supporting lambdas as Lisp does). Finally, with its leaner and less verbose syntax, less code needs to be written than in most static languages.
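To make the OOP-plus-functional point concrete, here is a small sketch of my own (written for Python 3; the names Greeter, shout, and transforms are made up for illustration). It defines a class, then keeps a named function, a lambda, and a built-in side by side in a list and calls each one in turn:

```python
class Greeter:
    """A tiny class, to show the object-oriented side."""

    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


def shout(text):
    """An ordinary named function."""
    return text.upper()


# Functions are plain values: a named function, a lambda, and a built-in
# can all sit in the same list and be called later.
transforms = [shout, lambda text: text + "!", len]

greeting = Greeter("NAnt").greet()
results = [transform(greeting) for transform in transforms]
print(results)
```

Pasting this into the interpreter console one piece at a time is a fair taste of the write-and-test loop described above.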

After a few days of ramping up and getting acquainted with the language, I immediately started to apply Python to a few things. NAnt build scripts and Windows bat files were the main targets for Python conversion. I also intend to rewrite in Python a C# .NET console tool that merges the content of multiple files into a single one. It seemed more natural and sensible to use Python for these types of development tasks.

I favor replacing NAnt with Python because NAnt is an XML-based DSL that might be doing a little too much. The problem is not that it is a DSL, which is generally a good thing, particularly when done right (as the Ant/NAnt folks succeeded quite well in doing, to their credit), but that it is "XML-based". Who really wants to program all day in XML? Apache Ant, NAnt's counterpart in the Java world, was a victim of the exploding popularity of XML at the height of the dot-com era. XML should be left to do what it does best and what it was originally intended for: basic structured data storage and configuration. Ant (and, subsequently, NAnt) should not have mixed the following two: (1) formatting/organizing data and (2) build flow logic. That is not a far cry from violating the principle of "separation of concerns".

If I can avoid it, I am finished with NAnt (or any other equivalent build framework that relies heavily on XML for its flow logic). If given the choice in a development environment, I would probably not opt to use NAnt to handle build scripts. Not that I have anything against NAnt itself, just that better, more programmer-friendly alternatives exist. NAnt was (and still is) a great option compared with, say, the inferior Windows bat files or the dev shops that manually build their projects via Visual Studio. Instead, a dynamic language like Python (or BOO or Ruby or whatever else) is preferable for managing this type of work. (A few build automation frameworks written in Python do exist, but I would like to take a look at the promising, .NET-born BOO build system.)

NAnt documentation mentions that it has the advantage over native OS shell commands because it is "cross-platform". That might be true, but Python has that area easily covered specifically with its 'os' and 'shutil' modules. Portability is one of Python's key features.
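As a quick illustration of that portability, here is a minimal sketch (Python 3; the file and folder names are invented, and everything happens in a throwaway temp directory). The same calls build paths, create folders, and copy files without any OS-specific shell commands:

```python
import os
import shutil
import tempfile

# Work inside a temporary directory so the sketch leaves nothing behind.
work_dir = tempfile.mkdtemp()

# os.path.join picks the right path separator for the current platform.
source_path = os.path.join(work_dir, "CreateSchema.sql")
build_dir = os.path.join(work_dir, "build", "Schema")

with open(source_path, "w") as handle:
    handle.write("-- schema goes here\n")

os.makedirs(build_dir)               # creates intermediate folders too
shutil.copy(source_path, build_dir)  # copies the file into the folder

copied_ok = os.path.isfile(os.path.join(build_dir, "CreateSchema.sql"))
print("copied:", copied_ok)

shutil.rmtree(work_dir)              # clean up the whole tree
```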

Generating code in other programming languages is another area where I have started to use Python. Database change scripts written in TSQL that are repetitive and voluminous have benefited significantly from it (as one example, creating structurally similar 'drop column' statements for 100+ columns). In the future, for other kinds of code generation (e.g. NHibernate mapping files), I will definitely consider Python as a substitute for heavier code-gen tools such as MyGeneration.
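The 'drop column' case can be sketched in a handful of lines. This is only an illustration in Python 3, not the script I actually used: the table name, the column names, and the helper drop_column_statements are all hypothetical.

```python
def drop_column_statements(table, columns):
    """Return one T-SQL 'drop column' statement per column name."""
    return [
        "ALTER TABLE [{0}] DROP COLUMN [{1}];".format(table, column)
        for column in columns
    ]


# Hypothetical column names; in practice they might come from a text file
# or from a query against the database's metadata.
columns = ["LegacyCode{0:03d}".format(n) for n in range(1, 4)]

script = "\n".join(drop_column_statements("Customer", columns))
print(script)
```

Scaling this from 3 columns to 100+ is a matter of changing where the list of names comes from.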

The more I use Python, the more I am convinced that it will be employed as my general, all-purpose utility programming language. I intend to use it as a vital supporting player handling the grunt work in my development processes. It does not matter what the primary language happens to be whether C#, TSQL, etc. By and large, I just like how quick and dirty a script can be whipped up to perform some auxiliary task without having to endure the overhead of compiling, creating, and running some executable file. (Who knows? Maybe one day I can work on a major project where Python is the star of the show.) All in all, it is just such a nice clean, readable language making it far more enjoyable to work with as compared with something like NAnt.

To give an idea of how visually different it is to use Python over NAnt, below is the code for two equivalent build scripts, one written in each language. This is the first script I ported over to Python. The script runs the SQL Server Database Publishing Wizard to generate a file that contains the SQL to create the schema of a baseline database required at the start of each development cycle. The following high-level tasks are executed by the script:
  • Create Schema Script - Generate the raw initial TSQL schema script from the target database using the DB pub wiz.
  • Convert Script File Encoding - Convert the file from Unicode to ASCII.
  • Replace Script Values - Read from an external csv file containing pairs of strings to replace in the script.
  • Checkout File From Source Control - Check out from Perforce the existing schema file that will be replaced.
  • Copy File To Build Location - Move the SQL script file to the build directory.
  • Build Database And Run Unit Tests - Run another, separate script (currently written in NAnt) that builds the db and runs the unit tests using TSQLUnit.
NAnt Version
<?xml version="1.0"?>
<project name="Generic Database Build" default="BaselineDatabaseCreation">
  <property name="base.dir" value=".\" overwrite="false" readonly="false" />
  <property name="sourceDB" value="" overwrite="false"/>
  <property name="sourceServer" value=".\sqlDev2005" overwrite="false"/>
  <property name="dbmsVersion" value="2000" overwrite="false"/>
  <property name="connectionString" value="Server=${sourceServer};Database=${sourceDB};Trusted_Connection=True;"/>
  <property name="dbBuild.dir" value="" overwrite="false"/>
  <property name="targetDB" value="" overwrite="false"/>
  <property name="targetServer" value=".\sqlDev2005" overwrite="false"/>
  <property name="sqlScriptingTool.dir" value="C:\Program Files\Microsoft SQL Server\90\Tools\Publishing\" overwrite="false"/>
  <property name="sqlScript.fileName" value="CreateSchema.sql" overwrite="false"/>
  <property name="sqlScript.filePath" value="${path::combine(base.dir, sqlScript.fileName)}" overwrite="false"/>
  <property name="sourceControl.filePath" value="${dbBuild.dir}Schema\${sqlScript.fileName}" overwrite="false"/>
  <!-- replace values list variable -->
  <property name="temp.fileName" value="temp.txt"/>
  <property name="temp.filePath" value="${path::combine(base.dir, temp.fileName)}"/>
  <property name="replaceValuesList.fileName" value="" overwrite="false"/>
  <property name="replaceValuesList.filePath" value="${path::combine(base.dir, replaceValuesList.fileName)}"/>

  <target name="BaselineDatabaseCreation" description="Creates baseline database tsql script end-to-end.">
    <call target="CreateSchemaScript"/>
    <call target="ConvertScriptFileEncoding"/>
    <call target="ReplaceScriptValues" unless="${replaceValuesList.fileName==''}"/>
    <call target="CheckoutFileFromSourceControl"/>
    <call target="CopyNewScriptFileToBuildLocation"/>
    <call target="GetSeedTablesData"/>
    <call target="BuildDatabaseAndRunUnitTests" unless="${targetDB==''}"/>
  </target>

  <target name="CreateSchemaScript" description="Generates the raw initial tsql schema script from target database">
    <delete file="${sqlScript.filePath}" if="${file::exists(sqlScript.filePath)}" />
    <exec program="${sqlScriptingTool.dir}sqlpubwiz">
      <arg value="script" />
      <arg line="-C ${connectionString}" />
      <arg value="${sqlScript.filePath}" />
      <arg value="-schemaonly" />
      <arg line="-targetserver ${dbmsVersion}" />
      <!-- '-f' means overwrite existing files is true -->
      <arg value="-f" />
    </exec>
    <fail message="${sqlScript.filePath} was not created."
          unless="${file::exists(sqlScript.filePath)}" />
  </target>

  <target name="ConvertScriptFileEncoding" description="Convert file from Unicode to ASCII">
    <copy file="${sqlScript.filePath}" tofile="${temp.filePath}" outputencoding="ASCII" overwrite="true" />
    <move file="${temp.filePath}" tofile="${sqlScript.filePath}" overwrite="true"
          unless="${file::exists(replaceValuesList.filePath)}" />
  </target>

  <target name="ReplaceScriptValues"
          description="Read from external csv file containing pairs of strings to replace.">
    <fail message="${replaceValuesList.filePath} does not exist."
          unless="${file::exists(replaceValuesList.filePath)}" />
    <foreach item="Line" in="${replaceValuesList.filePath}" delim="," property="x,y">
      <echo message="Replacing '${x}' with '${y}'..." />
      <copy file="${temp.filePath}" tofile="${sqlScript.filePath}" overwrite="true">
        <filterchain>
          <replacestring from="${x}" to="${y}" />
        </filterchain>
      </copy>
      <copy file="${sqlScript.filePath}" tofile="${temp.filePath}" overwrite="true"/>
    </foreach>
    <move file="${temp.filePath}" tofile="${sqlScript.filePath}" overwrite="true"/>
  </target>

  <target name="CheckoutFileFromSourceControl" description="Checkout from source control the schema file that will be replaced.">
    <fail message="${sourceControl.filePath} does not exist."
          unless="${file::exists(sourceControl.filePath)}" />
    <p4edit view="${sourceControl.filePath}">
      <arg line="-t"/>
      <arg line="text+k"/>
    </p4edit>
  </target>

  <target name="CopyNewScriptFileToBuildLocation">
    <copy file="${sqlScript.filePath}" tofile="${sourceControl.filePath}" overwrite="true" />
  </target>

  <target name="GetSeedTablesData">
    <!--TODO: Create a separate NAnt build script for this-->
  </target>

  <target name="BuildDatabaseAndRunUnitTests">
    <nant buildfile="" inheritall="false">
      <properties>
        <property name="base.dir" value="${dbBuild.dir}"/>
        <property name="server" value="${targetServer}" />
        <property name="database" value="${targetDB}"/>
        <property name="includeUnitTesting" value="true" />
      </properties>
    </nant>
  </target>
</project>

Python Version
import os
import csv
import shutil

# Environment-specific settings. Their definitions are missing from this listing,
# so the values below are placeholders (assumptions) that mirror the defaults of
# the NAnt properties above. Adjust them for your own environment.
base_dir = '.'
source_server = r'.\sqlDev2005'
source_db = ''
target_server = r'.\sqlDev2005'
target_db = ''
db_build_dir = ''
sqlscript_dir = base_dir
replace_values_dir = base_dir

sqlscripting_tool = r'C:\Program Files\Microsoft SQL Server\90\Tools\Publishing\sqlpubwiz.exe'
dbms_version = '2000'
connection_string = 'Server=' + source_server + ';Database=' + source_db + ';Trusted_Connection=True;'

sqlscript_filename = 'CreateSchema.sql'
sqlscript_filepath = os.path.join(sqlscript_dir, sqlscript_filename)
source_control_filepath = os.path.join(db_build_dir, sqlscript_filename)

replace_values_filepath = os.path.join(replace_values_dir, 'ReplaceList.csv')
error_found_message = 'Error found!'

def run_script():
    """Run all tasks"""

    tasks = [create_schema_script, convert_scriptfile_encoding, replace_script_values,
             checkout_file_from_source_control, copy_file_to_build_location, build_database_and_run_unit_tests]
    is_successful = True
    for task in tasks:
        print('Executing \'' + task.__name__ + '\'... ')
        is_successful = task()
        if not is_successful:
            print('Script Failure!')
            break  # stop at the first failed task

    if is_successful:
        print('Script Success!')

def create_schema_script():
    """Generates the raw initial tsql schema script from target database"""

    # remove any script left over from a previous run
    if os.path.isfile(sqlscript_filepath):
        os.remove(sqlscript_filepath)

    args = ['sqlpubwiz', 'script', '-C ' + connection_string, '"' + sqlscript_filepath + '"',
            '-schemaonly', '-targetserver ' + dbms_version, '-f']
    os.spawnv(os.P_WAIT, sqlscripting_tool, args)

    if not os.path.isfile(sqlscript_filepath):
        print(error_found_message)
        print("File '" + sqlscript_filepath + "' was not created.")
        return False

    return True

def convert_scriptfile_encoding():
    """Convert file from Unicode to ASCII"""

    cmd1 = 'type "' + sqlscript_filepath + '" > temp.txt'
    cmd2 = 'move temp.txt "' + sqlscript_filepath + '"'
    cmds = [cmd1, cmd2]
    for cmd in cmds:
        dos = os.popen(cmd)
        dos.close()  # wait for the command to finish before running the next one

    return True

def replace_script_values():
    """ Read from external csv file containing pairs of strings to replace values in sql script. """

    # if 'replace values' list not provided then assume not needed (not a failure)
    if replace_values_filepath == '':
        return True

    # check for 'replace values' file existence
    if not os.path.isfile(replace_values_filepath):
        print(error_found_message)
        print("Replace values list file '" + replace_values_filepath + "' does not exist.")
        return False

    # modify file content with new values
    f = open(sqlscript_filepath, 'r')
    text = f.read()
    f.close()
    replace_values_file = open(replace_values_filepath, 'r')
    for row in csv.reader(replace_values_file):
        find_text = row[0]
        replace_with_text = row[1]
        text = text.replace(find_text, replace_with_text)
    replace_values_file.close()

    # write to script file with new values
    f = open(sqlscript_filepath, 'w')
    f.write(text)
    f.close()

    return True

def checkout_file_from_source_control():
    """ Checkout from source control the schema file that will be replaced. """

    # look for source control file
    if not os.path.isfile(source_control_filepath):
        print(error_found_message)
        print("Source control file '" + source_control_filepath + "' does not exist.")
        return False

    # checkout file (note: could use PyPerforce API framework instead)
    cmd = 'p4 edit -t text+k ' + source_control_filepath
    p4 = os.popen(cmd)
    p4.close()  # wait for the checkout to finish

    return True

def copy_file_to_build_location():
    """ Move sql script file to build directory """

    shutil.copy(sqlscript_filepath, source_control_filepath)

    return True

def build_database_and_run_unit_tests():
    """ Build database and validate schema by running unit tests """

    nant_tool = os.path.join(base_dir, 'Tools\\NAnt\\bin\\', 'NAnt.exe')
    # note: the build file name is left blank in this listing
    build_script_filepath = os.path.join(base_dir, 'Projects\\Libs\\Utils\\NAntScripts\\DatabaseBuilds\\', '')
    build_dir = os.path.split(os.path.normpath(db_build_dir))[0] # hack: need to remove 'Schema' folder; todo: need to remove this from generic db build script

    # todo: replace NAnt script with Python script
    args = ['NAnt', '-buildfile:' + build_script_filepath, '-D:base.dir=' + build_dir, '-D:server=' + target_server,
            '-D:database=' + target_db, '-D:installUnitTesting=' + 'true']
    os.spawnv(os.P_WAIT, nant_tool, args)

    return True

if __name__ == '__main__':
    run_script()


Monday, July 21, 2008

Data Validation, Business Rules, and the Notification Pattern

On a previous project, I encountered some unnecessarily long 'Save' methods in various ASP.NET web pages that contained numerous validations of each input value from the UI page. The body of each method would run through all of those validations (for example, checking that the length of the user's first name is less than 20) before finally reaching the decision as to whether or not to commit changes to the database. In general, the methods ended up being a bit hard to follow, especially if you needed to make a change to them.

Another developer who I used to work with had mentioned to me data validation shouldn't even be in the Presenter Class of a traditional MVP/MVC implementation. He also mentioned his approach at the time (which if I recall correctly was something like exception guards?) as well as Jimmy Nilsson's approach towards data validation as described in his book, Applying Domain-Driven Design and Patterns: With Examples in C# and .NET. In the meantime, I had been recently researching how to do MVP with the ASP.NET custom validators since we are using these on our project at work and was trying to find a "better" way to handle rudimentary validation.

After some "blood, sweat, and tears" I think I was able to successfully apply Fowler's Notification Pattern to solve this "issue". The notification pattern manages the capturing of error messages from data validation that are specific to domain objects and are generally output to the end user. (An example is an email address that is required on a submit form. If the user skips over it, then on 'submit' a message is displayed such as "An email address is required...blah blah".) It all started while re-reading one of Jeremy Miller's posts on validation as part of his CAB series. His post led to both Fowler's Notification Pattern and two posts from Jean-Paul S. Boodhoo's blog (Part I and Part II). Those served as the blueprints for my implementation.

Essentially, I went through each of their slightly differing approaches to see what I could use. The core of what I ended up with borrows heavily from Fowler, with most of my changes just renaming things to suit my liking. Fowler always writes with such clarity and without the cruft, and his code examples are so easy to follow, that his version was the main driver for what I wanted to do. Miller and JP took the pattern to another level, but it was too much for what I wanted. My goal was to keep it simple, of course, and let it evolve on its own (BUFD bad!). I initially developed it on a separate test project. Once that worked, I implemented it in our project at work somewhat seamlessly.

I first created the initial base classes that are the foundation of this pattern and that can be re-used on any project. Below are their interfaces:

/// Specific business rule error that provides a specific message about the broken business rules.
public interface IBusinessRuleError
{
    /// Gets or sets the name of the property that causes the error.
    string PropertyName { get; set; }

    /// Gets or sets the specific error message.
    string Message { get; set; }
}

/// Set of Business Rules used by Domain Objects that captures and stores errors.
public interface IBusinessRules
{
    /// Gets or sets the business rule errors.
    IList Errors { get; set; }

    /// Gets a value indicating whether this instance has any business rule errors.
    bool HasErrors { get; }

    /// Determines whether the specified set of business rules contains error.
    bool ContainsError(IBusinessRuleError ruleError);
}

Basically, 'BusinessRules' manages a collection of individual 'BusinessRuleError' items. Each error contains the error message as well as the name of the specific property that caused it, which is used later when mapping the error back to a specific UI control.

Next I added BusinessRules to the abstract DomainObject class and exposed it as a property. Initially it was hard-coded into my domain object, but at work I decided to pull it into its own class that could be instantiated internally and, if need be, injected as a dependency into the domain object base class (as you know, ideal for mock testing it!). Here is what I call the "validator":

public interface IDomainObjectValidator
{
    /// Runs the validation of each business rule.
    /// Each derived class can override this method to define its own
    /// set of validation rules.
    void RunValidation();

    /// Gets the business rules.
    IBusinessRules Rules { get; }

    /// Gets a value indicating whether this instance is valid based on whether any business rules failed.
    bool IsValid { get; }

    /// Determines whether the specified item to test is null or blank.
    bool IsNullOrBlank(string itemToTest);

    /// Fails if condition to test is true.
    void FailIf(bool conditionToTest, IBusinessRuleError error);

    /// Fails if the item to test is null or blank.
    void FailIfNullOrBlank(string itemToTest, IBusinessRuleError error);
}

This interface has the RunValidation method, whose purpose is to cycle through each business rule that the derived class is responsible for implementing itself. In addition, the interface has some basic, re-usable validation helpers such as IsNullOrBlank, FailIf, etc. (courtesy of Fowler). (NOTE: What struck me very quickly was the similarity between these generic methods and the Asserts of NUnit. It dawned on me when I started to implement a new one that checked a date range, such as IsBetween(string startDate, string endDate). Mmm...looks a lot like NUnit's Is Constraint model. In fact, 'FailIf' looks like a special case of Assert.That. I'm wondering whether some framework already exists out there for me to use instead of trying to create and maintain my own.)
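For anyone who wants to play with the idea outside of C#, here is a rough Python 3 rendering of the error collection plus the FailIf-style helpers. It is a sketch under my own assumptions, not the project code: the snake_case names, the equality check on errors, and the choice to re-run validation inside is_valid are all illustrative.

```python
class BusinessRuleError:
    """One broken rule: which property failed and the message to show."""

    def __init__(self, property_name, message):
        self.property_name = property_name
        self.message = message

    def __eq__(self, other):
        # Assumption: two errors are "the same" when property and message match.
        return (isinstance(other, BusinessRuleError)
                and self.property_name == other.property_name
                and self.message == other.message)


class Question:
    """Domain object that collects errors instead of throwing exceptions."""

    def __init__(self, description=None, max_point_value=0):
        self.description = description
        self.max_point_value = max_point_value
        self.errors = []

    def fail_if(self, condition, error):
        if condition:
            self.errors.append(error)

    def fail_if_null_or_blank(self, item, error):
        self.fail_if(item is None or item.strip() == "", error)

    def run_validation(self):
        self.errors = []
        self.fail_if_null_or_blank(
            self.description,
            BusinessRuleError("Description", "Question description must contain a value."))
        self.fail_if(
            self.max_point_value > 100,
            BusinessRuleError("MaxPointValue", "Maximum Point Value can not exceed 100."))

    @property
    def is_valid(self):
        self.run_validation()
        return not self.errors


question = Question(description=None, max_point_value=101)
print(question.is_valid)                         # False
print([e.property_name for e in question.errors])
```

The point of the pattern shows up in the last two lines: the caller asks one question (is it valid?) and gets back a list of everything that went wrong, rather than the first exception.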

In turn, the validator's members are delegated and exposed as members of the domain class itself:

// domain object abstract class
private readonly IDomainObjectValidator _validator;

public DomainObject()
{
    _validator = new DomainObjectValidator();
}

public DomainObject(IDomainObjectValidator validator)
{
    _validator = validator;
}

public bool IsValid
{
    get { return _validator.IsValid; }
}

public IBusinessRules Rules
{
    get { return _validator.Rules; }
}

public virtual void RunValidation()
{
    _validator.RunValidation();
}

public bool IsNullOrBlank(string itemToTest)
{
    return _validator.IsNullOrBlank(itemToTest);
}

public void FailIf(bool conditionToTest, IBusinessRuleError error)
{
    _validator.FailIf(conditionToTest, error);
}

public void FailIfNullOrBlank(string itemToTest, IBusinessRuleError error)
{
    _validator.FailIfNullOrBlank(itemToTest, error);
}

Once that was done then it was time to actually use it for a specific domain object. So I have a domain object named 'Question' that makes up a 'Quiz':

public interface IQuestion
{
    /// The text of the question itself.
    /// For example, "How old are you?"
    string Description { get; set; }

    /// Point value of the question if quiz taker gets it correct.
    int MaxPointValue { get; set; }

    /// Sequence # of the question within a quiz.
    int SequenceNumber { get; set; }

    // Bunch of other members...
}


So in the actual Question class, I override and implement the 'RunValidation' method with "rules/errors" specific to 'Question':

// Question class
public override void RunValidation()
{
    // validation # 1
    FailIfNullOrBlank(_description, new BusinessRuleError("Description", "Question description must contain a value."));

    // validation # 2
    if (_description != null)
    {
        FailIf(_description.Length > 10,
            new BusinessRuleError("Description", "Question description can not be longer than 10 characters."));
    }

    // validation # 3
    FailIf(_maxPointValue > 100, new BusinessRuleError("MaxPointValue", "Maximum Point Value can not exceed 100."));

    // ....
    // validation # 100...
}

So this is where it all happens. Basically, this is where all the business rules that require validation for 'Question' are kept and maintained. Not in the UI, not in the presenter, not in the database, nor anywhere else. Right where they should be. What's great is how nice it is to itemize and view all of your business rules in one place. The best part is unit testing this (which you really can't do well at all if it's in the presenter). Here is one of the tests:

// Question test fixture

[Test][Category("Data Validation")]
public void DoesContainBrokenRuleWhenDescriptionIsNull()
{
    Question question = new Question();
    question.Description = null;

    IBusinessRuleError descriptionError = new BusinessRuleError("Description", "The description for this 'Question' must contain a value.");
    Assert.That(question.Rules.ContainsError(descriptionError), "Does not contain Description error.");
    Assert.That(question.IsValid, Is.False, "Question is valid.");
}

How cool is that? I especially like the clarity of this code line:


Here are a few more tests:

[Test][Category("Data Validation")]
public void DoesContainBrokenRuleWhenDescriptionLengthGreaterThan10()
{
    Question question = new Question();

    question.Description = "1234567891011";

    BusinessRuleError descriptionError = new BusinessRuleError("Description", "The description for this 'Question' can not be longer than 10 characters.");
    Assert.That(question.Rules.ContainsError(descriptionError), "Does not contain Description error.");
    Assert.That(question.IsValid, Is.False, "Question is valid.");
}

[Test][Category("Data Validation")]
public void DoesContainBrokenRuleWhenMaxPointValueExceeds100()
{
    Question question = new Question();

    question.MaxPointValue = 101;

    BusinessRuleError maxPointValueError = new BusinessRuleError("MaxPointValue", "Maximum Point Value for this 'Question' can not exceed 100.");
    Assert.That(question.Rules.ContainsError(maxPointValueError), "Does not contain MaxPointValue error.");
    Assert.That(question.IsValid, Is.False, "Question is valid.");
}

By implementing this at work, the app's Domain model is now slightly less anemic. However, the auto-gen of partial classes presented an issue that I was not too happy with. The MyGeneration template is currently set up to read the column constraints from the database and hard-code them directly into the property setters (which includes throwing exceptions). This forces the trapping of the validation error to occur OUTSIDE of the domain object, which goes against this implementation of the pattern. So unless I modify the template to remove this from the setters (or at least move it into some private method), I have to circumvent updating via the setters and use some methods like so:

question.Description = "My Description";
question.MaxPointValue = 101;

becomes using overloads

question.UpdateDescriptionUsingValidation("1234567891011");
question.UpdateMaxPointValueUsingValidation(101);

or

question.UpdateUsingValidation("1234567891011", 101);

Not really what I wanted, but it works for now until I can resolve that auto-gen issue (another reason why auto-gen can sometimes be an anti-pattern).

So let's see the entity 'Question' actually used in a Controller/Presenter context:

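// Presenter class
public void SaveChanges()
{
    IQuestion question = new Question();
    question.Description = _view.Description;
    question.MaxPointValue = _view.MaxValuePoint;
    question.SequenceNumber = _view.SequenceNumber;

    if (question.IsValid)
    {
        // (the persistence call is not shown in this listing)
        _view.DisplaySuccess("The question has now been saved.");
    }
    else
    {
        // hand the collected errors to the view
        _view.DisplayErrors(question.Rules.Errors);
    }
}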


Now how does that compare with one of the original LONG save methods? The intent, readability, and therefore maintainability are light years better. (NOTE: I had to use NHibernate's ISession.Evict() to prevent an invalid entity from being persisted to the db.)

OK, finally the UI/View/Code-Behind

// View class
public void DisplayErrors(IList errors)
{
    foreach (IBusinessRuleError error in errors)
    {
        if (error.PropertyName.Equals("Description"))
        {
            _ctlDescriptionValidator.ErrorMessage = error.Message;
            _ctlDescriptionValidator.IsValid = false;
        }

        if (error.PropertyName.Equals("MaxPointValue"))
        {
            _ctlMaxPointValidator.ErrorMessage = error.Message;
            _ctlMaxPointValidator.IsValid = false;
        }
    }
}

The controls '_ctlDescriptionValidator' and '_ctlMaxPointValidator' are ASP.NET custom validators that are now really dumbed down. I also used the 'ValidationSummary' control on the web page with hardly any wiring up. Here is some of the related HTML:

<form id="form1" runat="server">
    <asp:ValidationSummary ID="_ctlValidationSummary" runat="server" />

    <asp:Label ID="_lblSuccessMessage" runat="server"></asp:Label>
    <div>
        Description
        <asp:TextBox ID="_txtDescription" runat="server"></asp:TextBox>
        <asp:CustomValidator ID="_ctlDescriptionValidator" runat="server" ControlToValidate="_txtDescription"
            ErrorMessage="" OnServerValidate="_ctlDescriptionValidator_ServerValidate">*</asp:CustomValidator><br />
        Max Point Value
        <asp:TextBox ID="_txtMaxPointValue" runat="server"></asp:TextBox>
        <asp:CustomValidator ID="_ctlMaxPointValidator" runat="server" ControlToValidate="_txtMaxPointValue"
            ErrorMessage="">*</asp:CustomValidator>
    </div>
</form>

All in all it does not matter if I use the validators, my own custom message controls, or whatever. The data validation is not tightly coupled with the UI by using the deadly combo of MVP and the notification pattern!!!
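To show how little the view needs to know, here is a small Python 3 sketch of the same routing idea using a lookup table instead of one 'if' per property. Everything in it is hypothetical (FakeValidator stands in for whatever control displays the message, and display_errors is my own name); it is not how the ASP.NET page is actually wired.

```python
class FakeValidator:
    """Stand-in for a UI validator control (hypothetical, for illustration)."""

    def __init__(self):
        self.error_message = ""
        self.is_valid = True


def display_errors(errors, validators_by_property):
    """Route each (property name, message) pair to its matching control."""
    for property_name, message in errors:
        validator = validators_by_property.get(property_name)
        if validator is None:
            continue  # no control registered for this property
        validator.error_message = message
        validator.is_valid = False


validators = {
    "Description": FakeValidator(),
    "MaxPointValue": FakeValidator(),
}

display_errors(
    [("Description", "Question description must contain a value.")],
    validators,
)
print(validators["Description"].error_message)
```

Swapping the fake controls for real validators, custom labels, or anything else only changes what goes into the lookup table, which is the decoupling described above.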

I'm certain that aspects of my implementation can be improved and/or extended in some fashion. There are some things I debated as to which is the best approach, but I can go into more detail later. (For example, I mulled over a couple of other ways to pass the messages to the View but settled on the one above. Another was possibly using reflection to set the property names in the error messages...but like I said, I wanted to keep it simple for now.)

Catching up

Now that I have a blog, I thought I would "reprint" every once in a while some things I have written from my "pre-blog" years that might still be relevant or even mildly interesting to read again.

print 'Hello World!'

or is that

blog 'Hello World!'

Anyway, I have finally made the plunge into the world of blogging. I no longer have to bombard my friends' emails with my long and often rambling thoughts and opinions on software development and programming. Instead it will just be swallowed up in the sea of the other mostly interchangeable blogs out there.