Sunday, July 27, 2008

Snake bitten by Python (R.I.P. NAnt)

After first experimenting last year with IronPython, the .NET port of Python, I decided to take the full plunge into Python itself leading to another major milestone in my programming career. Why? Because I now have an incredibly handy language, Python, that can superbly manage rudimentary but necessary development tasks. Furthermore, Python has enlightened me to yet another way of thinking about how code can be written.

As a dynamic language, Python can be extremely powerful. It can be used for "glue" tasks like scripting but its potential is even greater. By being an interpreted language, pieces of your code can both be written and tested practically at the same time via its interpreter console (as if the Immediate Window in Visual Studio were the means to simultaneously see how your code works while you are writing it. No "compilation tax".) In addition, Python can do both OOP (e.g. classes, et al) as well as functional programming (e.g. treating functions like data in lists and supporting lambdas like Lisp). Finally, with its leaner and less verbose syntax, less code is written as compared with other static languages.

After a few days to ramp up and get acquainted with the language, I immediately started to implement Python on a few things. NAnt build scripts and Windows bat files were the main targets for Python conversion. I also intend on rewriting in Python a C# .NET console tool that merges the content of multiple files into a single one. It seemed more natural and sensible to use Python for these types of development tasks.

Replacing NAnt with Python is favored since NAnt is an XML-based DSL that might be doing a little too much. The problem is not that it is a DSL, generally a good thing particularly when done right (as the Ant/NAnt folks succeeded quite well in doing to their credit), but the part of it being "XML-based". Who really wants to program all day in XML? Apache Ant, NAnt in the Java world, was a victim of the exploding popularity of XML during the height of the dot com era web applications. XML should be left to do what it does best and what it was originally intended for: basic structured data storage and configuration. Ant (and, subsequently, NAnt) should not have mixed the following two: (1) formatting/organizing data and (2) build flow logic. Not a far cry from violating the principle "separation of concerns".

If I can avoid it, I am finished with NAnt (or any other equivalent build frameworks that rely heavily on XML for its flow logic). If given the choice in a development environment, I probably not opt to use NAnt to handle build scripts. Not that I have anything against NAnt itself, just that better, more programmer friendly alternatives exist. NAnt was (and still is) a great option as compared, say, with the inferior Windows bat files or with the dev shops that manually build their projects via the Visual Studio. Instead, a dynamic language like Python (or BOO or Ruby or whatever else) is preferable to manage this type of work. (A few build automation frameworks do exist written in Python, but I will like to take a look at the promising, .NET born BOO build system.)

NAnt documentation mentions that it has the advantage over native OS shell commands because it is "cross-platform". That might be true, but Python has that area easily covered specifically with its 'os' and 'shutil' modules. Portability is one of Python's key features.

Code generation of other programming languages is another area where I also started to use Python. Database change scripts written in TSQL that are repetitive and voluminous have benefited significantly from using Python (as one example, creating structurally similar 'drop column' statements for 100+ columns). In the future, for other kinds of code generation (e.g. NHibernate mapping files is one example), I will definitely consider Python as a substitute for heavier code-gen tools such as MyGeneration.

The more I use Python, the more I am convinced that it will be employed as my general, all-purpose utility programming language. I intend to use it as a vital supporting player handling the grunt work in my development processes. It does not matter what the primary language happens to be whether C#, TSQL, etc. By and large, I just like how quick and dirty a script can be whipped up to perform some auxiliary task without having to endure the overhead of compiling, creating, and running some executable file. (Who knows? Maybe one day I can work on a major project where Python is the star of the show.) All in all, it is just such a nice clean, readable language making it far more enjoyable to work with as compared with something like NAnt.

To give an idea on how visually different it is to use Python over NAnt, below are the code of two identical build scripts written in each language. This the first script I ported over to Python. The script runs the SQL Server Database Publishing Wizard to generate a file that contains the sql to create the schema of a baseline database required at the start of each development cycle. The following high level tasks are executed by the script:
  • Create Schema Script- Generates the raw initial tsql schema script from target database using the DB pub wiz
  • Convert Script File Encoding- Convert file from Unicode to ASCII
  • Replace Script Values- Read from external csv file containing pairs of strings to replace in script.
  • Checkout File From Source Control - Checkout from Perforce the existing schema file that will be replaced.
  • Copy File To Build Location- Move sql script file to build directory
  • Build Database And Run Unit Tests- Run another separate script (currently written in NAnt) that builds the db and runs the unit tests using TSQLUnit
NAnt Version
<?xml version="1.0"?>
<project name="Generic Database Build" default="BaselineDatabaseCreation"
xmlns="http://nant.sf.net/release/0.85/nant.xsd">
<property name="base.dir" value=".\" overwrite="false" readonly ="false" />
<property name="sourceDB" value="" overwrite="false"/>
<property name="sourceServer" value=".\sqlDev2005" overwrite="false"/>
<property name="dbmsVersion" value="2000" overwrite="false"/>
<property name="connectionString" value="Server=${sourceServer};Database=${sourceDB};Trusted_Connection=True;"/>
<property name="dbBuild.dir" value="" overwrite="false"/>
<property name="targetDB" value="" overwrite="false"/>
<property name="targetServer" value=".\sqlDev2005" overwrite="false"/>
<property name="sqlScriptingTool.dir" value="C:\Program Files\Microsoft SQL Server\90\Tools\Publishing\" overwrite="false"/>
<property name="sqlScript.fileName" value="CreateSchema.sql" overwrite="false"/>
<property name="sqlScript.filePath" value="${path::combine(base.dir, sqlScript.fileName)}" overwrite="false"/>
<property name="sourceControl.filePath" value="${dbBuild.dir}Schema\${sqlScript.fileName}" overwrite="false"/>
<!-- replace values list variable -->
<property name="temp.fileName" value="temp.txt"/>
<property name="temp.filePath" value="${path::combine(base.dir, temp.fileName)}"/>
<property name="replaceValuesList.fileName" value="" overwrite="false"/>
<property name="replaceValuesList.filePath" value="${path::combine(base.dir, replaceValuesList.fileName)}"/>

<target name="BaselineDatabaseCreation" description="Creates baseline database tsql script end-to-end.">
<call target="CreateSchemaScript"/>
<call target="ConvertScriptFileEncoding"/>
<call target="ReplaceScriptValues" unless="${replaceValuesList.fileName==''}"/>
<call target="CheckoutFileFromSourceControl"/>
<call target="CopyNewScriptFileToBuildLocation"/>
<call target="GetSeedTablesData"/>
<call target="BuildDatabaseAndRunUnitTests" unless="${targetDB==''}"/>
</target>
<target name="CreateSchemaScript" description="Generates the raw initial tsql schema script from target database">
<delete file="${sqlScript.filePath}" if="${file::exists(sqlScript.filePath)}" />
<exec program="${sqlScriptingTool.dir}sqlpubwiz">
<arg value="script" />
<arg line="-C ${connectionString}" />
<arg value="${sqlScript.filePath}" />
<arg value="-schemaonly" />
<arg line="-targetserver ${dbmsVersion}" />
<!-- '-f' means overwrite existing files is true -->
<arg value="-f" />
</exec>
<fail message="${sqlScript.filePath} was not created."
unless="${file::exists(sqlScript.filePath)}" />
</target>
<target name="ConvertScriptFileEncoding" description="Convert file from Unicode to ASCII">
<copy file="${sqlScript.filePath}" tofile="${temp.filePath}" outputencoding="ASCII" overwrite="true" />
<move file="${temp.filePath}" tofile="${sqlScript.filePath}" overwrite="true"
unless="${file::exists(replaceValuesList.filePath)}" />
</target>
<target name="ReplaceScriptValues"
description="Read from external csv file containing pairs of strings to replace.">
<fail message="${replaceValuesList.filePath} does not exist."
unless="${file::exists(replaceValuesList.filePath)}" />
<foreach item="Line" in="${replaceValuesList.filePath}" delim="," property="x,y">
<echo message="Replacing '${x}' with '${y}'..." />
<copy file="${temp.filePath}" tofile="${sqlScript.filePath}" overwrite="true">
<filterchain>
<replacestring from="${x}" to="${y}" />
</filterchain>
</copy>
<copy file="${sqlScript.filePath}" tofile="${temp.filePath}" overwrite="true"/>
</foreach>
<move file="${temp.filePath}" tofile="${sqlScript.filePath}" overwrite="true"/>
</target>
<target name="CheckoutFileFromSourceControl" description="Checkout from source control the schema file that will be replaced.">
<fail message="${sourceControl.filePath} does not exist."
unless="${file::exists(sourceControl.filePath)}" />
<p4edit view="${sourceControl.filePath}">
<arg line="-t"/>
<arg line="text+k"/>
</p4edit>
</target>
<target name="CopyNewScriptFileToBuildLocation">
<copy file="${sqlScript.filePath}" tofile="${sourceControl.filePath}" overwrite="true" />
</target>
<target name="GetSeedTablesData">
<!--TODO: Create a separate NAnt build script for this-->
</target>
<target name="BuildDatabaseAndRunUnitTests">
<nant buildfile="GenericDatabase.build" inheritall="false" >
<properties>
<property name="base.dir" value="${dbBuild.dir}"/>
<property name="server" value="${targetServer}" />
<property name="database" value="${targetDB}"/>
<property name="includeUnitTesting" value="true" />
</properties>
</nant>
</target>
</project>

Python Version
import os
import csv
import shutil

sqlscripting_tool=r'C:\Program Files\Microsoft SQL Server\90\Tools\Publishing\sqlpubwiz.exe'
dbms_version = '2000'
connection_string = 'Server=' + source_server + ';Database=' + source_db + ';Trusted_Connection=True;'

sqlscript_filename = 'CreateSchema.sql'
sqlscript_filepath = os.path.join(sqlscript_dir, sqlscript_filename)
source_control_filepath = os.path.join(db_build_dir, sqlscript_filename)

replace_values_filepath = os.path.join(replace_values_dir, 'ReplaceList.csv')
error_found_message = 'Error found!'

def run_script():
"""Run all tasks"""

tasks = [create_schema_script, convert_scriptfile_encoding, replace_script_values,
checkout_file_from_source_control, copy_file_to_build_location, build_database_and_run_unit_tests]
for task in tasks:
print 'Executing \'' + task.func_name + '\'... '
is_successful = task()
if not is_successful:
print 'Script Failure!'
break

if is_successful:
print 'Script Success!'

def create_schema_script():
"""Generates the raw initial tsql schema script from target database"""

if os.path.isfile(sqlscript_filepath):
os.remove(sqlscript_filepath)

args = ['sqlpubwiz', 'script', '-C ' + connection_string, '"' + sqlscript_filepath + '"',
'-schemaonly', '-targetserver ' + dbms_version, '-f']
os.spawnv(os.P_WAIT, sqlscripting_tool, args)

if os.path.isfile(sqlscript_filepath) == False:
print error_found_message
print "File '" + sqlscript_filepath + "' was not created."
return False

return True

def convert_scriptfile_encoding():
"""Convert file from Unicode to ASCII"""

cmd1 = 'type "' + sqlscript_filepath + '" > temp.txt'
cmd2 = 'move temp.txt "' + sqlscript_filepath + '"'
cmds = [cmd1, cmd2]
for cmd in cmds:
dos = os.popen(cmd)
dos.read()
dos.close()

return True

def replace_script_values():
""" Read from external csv file containing pairs of strings to replace values in sql script. """

# if 'replace values' list not provided then assume not needed
if replace_values_filepath == '':
return False

# check for 'replace values' file existence
if os.path.isfile(replace_values_filepath) == False:
print error_found_message
print "Replace values list file '" + replace_values_filepath + "' does not exist."
return False

# modify file content with new values
f = open(sqlscript_filepath, 'r')
text = f.read()
replace_values = csv.reader(open(replace_values_filepath, 'r'))
for row in replace_values:
find_text = row[0]
replace_with_text = row[1]
text = text.replace(find_text, replace_with_text)
f.close()

# write to script file with new values
f = open(sqlscript_filepath, 'w')
f.write(text)
f.close()

return True

def checkout_file_from_source_control():
""" Checkout from source control the schema file that will be replaced. """

# look for source control file
if os.path.isfile(source_control_filepath) == False:
print error_found_message
print "Source control file '" + source_control_filepath + "' does not exist."
return False

# checkout file (note: could use PyPerforce API framework instead)
cmd = 'p4 edit -t text+k ' + source_control_filepath + ''
p4 = os.popen(cmd)
p4.read()
p4.close()

return True

def copy_file_to_build_location():
""" Move sql script file to build directory """

shutil.copy(sqlscript_filename, source_control_filepath)

return True

def build_database_and_run_unit_tests():
""" Build database and validate schema by running unit tests """

nant_tool = os.path.join(base_dir, 'Tools\\NAnt\\bin\\', 'NAnt.exe')
build_script_filepath = os.path.join(base_dir, 'Projects\\Libs\\Utils\\NAntScripts\\DatabaseBuilds\\', 'GenericDatabase.build')
build_dir = os.path.split(os.path.normpath(db_build_dir))[0] # hack: need to remove 'Schema' folder; todo: need to remove this from generic db build script

# todo: replace NAnt script with Python script
args = ['NAnt', '-buildfile:' + build_script_filepath, '-D:base.dir=' + build_dir,'-D:server=' + target_server,
'-D:database=' + target_db, '-D:installUnitTesting=' + 'true', ]
os.spawnv(os.P_WAIT, nant_tool, args)

return True

run_script()

2 comments:

Brian said...

I've been half interested in picking up Python myself. What would you recommed as far as tools, materials, etc. to transfer what you know in C# about programming applications to Python? What are some things you've picked up on that are not a good fit for Python?

Ray Vega said...

@woonboo

First all..

If you are serious about learning Python go download and install it RIGHT NOW! Do not wait! Seriously, the hardest part of learning something new is getting started so once you get past that initial hurdle it should hopefully only get easier. FYI, I used the 'Python 2.5.2 Windows installer' since I currently only work on Windows.

Materials...

I started with the tutorial that is part of the standard Python documentation. I found it really straight forward to follow and was able to write working code almost immediately. That is due to the very nature of Python unlike a lot of other non dynamic, non intepreted languages.

For other sources I have read a few introductory tech books on Python since then but the one that really stands out is the Mark Lutz's 'Learning Python'. Most of the other books I looked at were somewhat lacking as compared to it. I highly recommend picking it up and reading it.

Tools...

As for tools, Python has its own standard IDE named 'IDLE' which is part of its install. It includes the Python's interpreter which allows to you interact with your code. I have not tried any other IDE out there for Python but if you are new to Python it is the way to go.

And finally...

>> What are some things you've picked up on that are not a good fit for Python?

Python is extremely versitile in that it lends itself very well to not just applications but to shell scripting. So, therefore, since my need was for a new means to do shell scripting, I was able to start writing code that handle copying files, manipulating text in files, running other apps, etc. I still primarily program applications in C# so I have not made a "full plunge" in terms of doing development on actual Python applications. Therefore, I haven't really encountered at this point what is "not a good fit for Python". (The first chapter of 'Learning Python' explains really well the pros and cons of Python and how it can be used and where it might not be a "good fit".)