Wednesday, February 08, 2012

sql-baseline: a bootstrapper for RoundhousE

RoundhousE is a great tool for putting the database into source control. When RoundhousE is used on a project, changes to the schema become an afterthought; every check-in updates the schema on the build server; every check-out updates the schema on all the developers' PCs. When a deployment is run, the schema changes are applied automatically.

But the barrier to entry for controlling an existing database is high. RoundhousE needs all the stored procedures, views, etc. as files in a folder structure, and each script must be able to create or alter the entity. For teams supporting databases with thousands of procedures and views, this work will stop them evaluating RoundhousE. This is a shame, as they are missing out on a superb tool. So I have created something to help get you started.

What does it do?

My current client falls into the category of having thousands of procedures and views which had to be scripted. So I created a tool to extract them from SQL Server.

https://github.com/chucknorris/powerup/

sql-baseline is run with three options: the server name, the database name and the location for the output files:

.\sqlbaseline.exe -s:"(local)" -d:AdventureWorks -o:C:\Db\Adventure

It will generate a script for each of the stored procedures and views and put them in the default RoundhousE folders:
\db
  \sprocs
  \views
Each script is created with the Create if not Exists / Alter template, which first checks for the entity and creates it if it is missing. The files are then ready to be run by RoundhousE.
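
To give an idea of what the generated files look like, here is a sketch of the kind of script the Create if not Exists / Alter template produces for a stored procedure. The GetStudents procedure and its body are hypothetical; the real files contain whatever sql-baseline extracted from the database.

IF NOT EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[GetStudents]') AND type = N'P')
    EXEC ('CREATE PROCEDURE [dbo].[GetStudents] AS SELECT 1')
GO

ALTER PROCEDURE [dbo].[GetStudents]
AS
    SELECT Id, Name FROM dbo.Student
GO

Because the CREATE only ever makes a stub and the ALTER always follows, the script can be run safely whether or not the procedure already exists.
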
I have a short list of updates which are coming soon:
  1. Export functions and indexes
  2. Create all of the default Roundhouse folders
  3. New option to specify which entities to create
  4. Accept a set of custom folders

Conclusion

When I decided to trial RoundhousE, my first hurdle was extracting the procedures and other entities into the correct format. Using sql-baseline has enabled me to put three of my clients' databases into source control. I hope it helps others to start using RoundhousE and controlling changes to the database.

Tuesday, February 07, 2012

Tools for migrating the database

In my last post, controlling the database, I showed that putting the database schema in source control brings an easier work-flow for developers and more reliable deployments. In this post I will look at the tools available to automate the process. They fall into two types: schema diff tools and migration tools.

Schema diff tools

Schema diff tools work by comparing each object in the source database to those in a target database. From this comparison they produce a script that will update the target to match the source, including adding and dropping columns in tables. Two examples of schema diff tools are Red Gate SQL Compare and Microsoft’s Visual Studio Database project.
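
To give a feel for that output, the generated script is simply a batch of DDL statements which reshape the target. A small sketch, using a hypothetical Customer table:

ALTER TABLE Customer ADD MiddleName varchar(50) NULL
GO
ALTER TABLE Customer DROP COLUMN Fax
GO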

Visual Studio Database project

The Visual Studio Database project adds the database to Visual Studio, so the database schema lives in the same solution as the application code. Visual Studio has a schema diff tool which can import an existing database; when it has finished there will be a file in the project for every object in the schema. When a change is made to one of those objects, the project can be deployed to a target database to bring it up to date. Visual Studio creates an in-memory representation of the database, so it can support refactoring of database objects. This is clever and powerful, and it provides developers with a lot of support and confidence when making changes to the schema.

I have worked on a project which used the Visual Studio Database project. The database was new and it worked well overall. The project made sure that all developers had a current copy of the schema, and deployments were simpler. There are some downsides though.

Merge conflicts

As every table, key and index is represented by an individual file, each one also needs an entry in the project file. This caused merge conflicts in the project file which were either very hard or impossible to resolve. The team’s solution was to put an exclusive lock on the project file so only one person could update it at a time.

Failed migrations

The second problem is due to the nature of a schema diff tool. It compares two schemas and produces a script to bring one in line with the other. As this script is machine generated, it does not consider how to move the existing data between those schemas. For example, if a field changes from NULL to NOT NULL, what happens to the rows where that field is empty? When the Visual Studio Database project encounters this, it stops the migration. The onus is then on each developer to find a workaround. The Visual Studio Database project has a mechanism to avoid this: a series of scripts that are always run before or after a deployment. As these scripts are run every time, they can become very large and difficult to manage.
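
As an illustration, the usual escape hatch is a pre-deployment script which back-fills the data before the NOT NULL constraint is applied. A sketch, using a hypothetical Customer.Email column:

-- give the existing rows a value so the NOT NULL change can succeed
UPDATE Customer
SET Email = ''
WHERE Email IS NULL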

Cost

My final comment about the Visual Studio Database project is the cost. It is only available in the Premium and Ultimate editions. At the time of writing Amazon is selling Premium for £2,112 and Ultimate for £10,792.

Migration tools

Schema migration tools use a series of scripts to manage the changes to a database. When the migration is run against the target database it applies all the changes which have been created since the last run. These tools do not extract changes from the database; instead the developer must write a script to migrate the schema. Examples include RoundhousE, Ruby on Rails migrations and Tarrantino.

RoundhousE

RoundhousE is an open source migration tool. It does not integrate with Visual Studio or provide the refactoring abilities of the Visual Studio Database project, but it takes a different approach which removes the need for the heavyweight tooling. How does it do this? RoundhousE is based on SQL scripts kept in a set of directories. The directories are split into two main groups: any time scripts and one time scripts.

Any time scripts are procedures, views, functions and indexes. If the entity is missing from the database, RoundhousE will run the script to create it. If the entity has changed, it will run the script to update it.

One time scripts are what make RoundhousE a migration tool. They contain the SQL to change the schema for a particular iteration. Each file is prefixed with a number and they are run sequentially.
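
Putting the two groups together, a typical layout looks something like the sketch below. The "sprocs" and "views" folders are the defaults mentioned earlier and "up" is the default home for the one time scripts; the exact set of folders can be extended with functions, indexes and so on.

\db
  \up       (one time scripts, run once in numerical order)
  \sprocs   (any time scripts)
  \views    (any time scripts)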

Migrations

To demonstrate this I have a simple example application to track students. The first task is to make a screen which captures the student’s name. As part of this task I create a table, but instead of creating the table on my local instance of SQL Server, I add this script to the RoundhousE “up” folder, which is the default location for the one time scripts:

File called 0001_Make_student_table.sql

CREATE TABLE Student
(
    Id int NOT NULL,
    Name varchar(50)
)


The file name is important as RoundhousE will run the scripts in numerical order. Running RoundhousE against the local database executes the script and creates the table.
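
For reference, that local run can be kicked off from the command line with the RoundhousE console. The flags below are the usual server, database and scripts-folder options; the StudentTracker database name and the .\db path are assumptions for this example, and the exact flag names may vary between RoundhousE versions.

rh.exe /s=(local) /d=StudentTracker /f=.\db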

The task is complete, so the changes are committed to source control, triggering a new build. The build server then runs RoundhousE against its own instance of SQL Server before running the unit tests. RoundhousE runs through the scripts in the “up” folder and applies all those added since the last time it was run.

The next feature is to also capture the student’s email address. To meet the specification the student table needs a new column, so I add the following migration script:

File called 0002_Add_email_to_student_table.sql 


ALTER TABLE Student ADD email varchar(100) NULL

I check in and the same process runs, so the build server has also updated the Student table in its local database.

When another developer pulls the latest changes from source control, all they have to do is run RoundhousE; it will run the two new scripts and their schema is also up to date.

Deploying the changes follows the same process as the build server. The call to RoundhousE is added to the deploy script which updates the target schema. RoundhousE will also read a version number from a file and write this to the database. This is very useful, as it is possible to see that UAT is running version 1.0.44.1 while production is running 1.0.33.2.

Conclusion

Adding a tool to manage database changes makes development easier and deployments safer. Without it, the developer has to remember to run any schema changes on the build server before committing the change, otherwise the tests will fail. Without a tool, the person responsible for the deployment has to compile a large SQL script. As the various environments are managed by hand, it may be that UAT is out of step with Production, which means the deployment script will have to be tested on a copy of the target environment before it is run. These complications lead to day-long deployments. With a tool, the changes to the database are all contained in the release package, ready to be run, and the tool decides what needs to be updated since the last release. When a team works with this method for a while, releases become more frequent because they are less work and more reliable.