Friday, May 23, 2014

CQS talk at Brighton Alt.Net

In March I gave a talk at the Brighton Alt.Net meeting about applying the Command and Query Separation pattern to application design. This is a technique that I have been using for some time to help me break up systems with bloated controllers or manager classes that are doing too much.


CQS Talk Brighton Alt.Net from Keith Bloom on Vimeo.
In the talk I mention a few resources:
The code from the talk is available on GitHub.

Monday, December 02, 2013

F# in Finance conference

On Monday 25th November I attended the F# in Finance conference at the Microsoft offices in London. I was drawn to this single-day conference as I have been learning about functional programming for some time now. I am also interested in the finance sector as it seems paradoxical to me. On the one hand it appears to have ageing IT systems and an ardent attachment to Excel. This seems like a bizarre way to run any business, let alone a financial institution. On the other hand it can be at the forefront of innovation in software development. Indeed it is arguably the biggest commercial adopter of functional programming so far. So, I was keen to hear how this industry was changing and to learn if there were any lessons I could use in my own programming.

The day consisted of 10 talks, an ambitious goal for a single day. The most interesting theme that I picked up on was how productive many of the speakers felt when writing F# compared to C#. Jon Harrop and Phil Trelford both talked about how modelling complex domains was vastly simpler in a functional language than in an object oriented one. Phil explained how the energy trading system he maintains has a domain model which is just a single, two hundred line file. If this were to be implemented in an object oriented language the model would span hundreds of classes.

From the discussion about domain modelling it appears that functional languages are better at separating the data from the behaviour. This is still abstract in my mind so I have much to learn. What is more concrete for me are the language features that help productivity. When asked in a panel session, the speakers said that the absence of null values, immutability and the built-in actor model are the main benefits when using F#. The absence of null values and immutability seem like obvious gains. Null reference errors are very common in most systems. Mutated state is also a source of pernicious bugs. A rogue branch of code can wreak havoc on a well tested system if it alters some piece of state. The actor model is a higher level construct also aimed at limiting state changes in a system and in F# it is called the MailboxProcessor.

F# in Finance was a fantastic day of very focused presentations from some superb presenters. Functional programming is a clear fit for the finance sector where the domain can often be modelled in algebraic terms. Given that this is a sector where any competitive edge means vast profits I am sure the uptake of functional programming will only increase. It is good to see F# and, consequently, the CLR gaining a foothold. Thanks to the presenters I now have a clearer understanding of the advantages of functional programming and will be investigating further to see how I can improve my programming skills.

Wednesday, November 27, 2013

Functional JavaScript book review

I was very excited to receive my copy of Functional JavaScript by Michael Fogus as I am interested in, and have views on, both functional programming and JavaScript. My view of the functional programming community is that it is full of very clever people who are focused on creating software which is robust and malleable. This is probably because the concepts behind functional programming are hard to understand. It is also because it has a closer relationship to various branches of mathematics. My opinion of JavaScript is that it is the most ubiquitous programming language we have ever known. It is a language with some good features, but it has to be handled with care. The need for care is even greater when using it to program the DOM as this is a very complex API.

The use of functional programming in JavaScript is not a new idea; indeed the language has many influences from Lisp and Scheme. But it is very good to see someone write a book exploring the topic. The style of the book is very conversational and each chapter moves up through the complex layers of functional programming.

The book starts with higher-order functions (functions taking in other functions as parameters) and works all the way up to flow-based programming and a brief overview of monadic programming. This structure demonstrates very well how functions can be composed together to create bigger programs. Functions written in each chapter re-appear in later ones to be part of a bigger whole.

I have read this book once and I am working my way through it again. It is rich with ideas for any JavaScript programmer. The concepts of functional programming certainly stretched my imperative programmer's mind. Stretched as it was, I enjoyed seeing Michael Fogus take an imperative process and re-implement it as a series of functions composed together. Functional JavaScript is a very enjoyable read and I would recommend you pick up a copy.

Thursday, May 16, 2013

Investigating ASP.Net MVC: Extending validation with IValidatableObject

Introduction

Frameworks are an essential part of programming. They help developers achieve complex tasks by presenting them with a simplified API over a more complex system. In my experience, it is possible to use a framework and be productive without giving too much thought to how it works.

However, I like to understand how things work. I am interested in the choices made by the framework designers. I feel that by knowing how they are built my ability to code improves and I can work with the framework more efficiently.

In this blog post I begin my investigation of the ASP.Net MVC framework. I will start by examining one part of the framework, the model binding process: how it works and how it can be extended. I will look at how the choices made by the framework designers influence the code I write and my understanding of the framework.

How flexible is the framework?

The framework designer has a tricky balancing act. A good framework is simple to understand, hides the system it is abstracting and allows for easy extension. The extension points are the API and, to create them, the framework designers have several tools to choose from. The most common are composition, inheritance and events. The choice they make will have a big influence on the code I end up writing.

The ASP.Net MVC framework is an abstraction over HTTP requests and responses. It includes all three types of extension mechanism. It has been designed to create HTML applications where the server is responsible for creating the markup which will be sent to the client. This is different from frameworks where the browser creates markup using a set of web services. The generation of HTML on the server was a guiding principle of the original design and has had the most influence on the API.

Model binding, deep within the framework

I am focusing on the model binding process which takes raw HTTP requests and creates real types which can be passed to controller actions. To understand its purpose I must first understand what ASP.Net MVC does when it handles a request:
  • When an HTTP request is made the routing engine picks it up and loads the relevant controller
  • The controller examines the request and decides which action will handle it
  • When the action has been identified the controller will delegate to the model binder to create the parameters for the action method from the request data
  • When the model binder has created the objects for the action method, it checks they are valid. If they are not, any validation errors are added to the controller's ModelState object
Now that I understand the flow of data through the framework, I can use it in my dummy application. This application allows people to tell me their favourite food so that I can keep some statistics on the favourite foods of the world. Unfortunately, now and then, someone types in "House" to try to skew the results. My task then is to add validation to the application to prevent this.

So far my application consists of a form, a view model object which will represent the input and a controller to handle the request.



My controller action checks the validity of the input and will either update the statistics or return the form where MVC will display the errors for me. My FoodViewModel class will never fail validation though, as the framework has no knowledge of what I consider an invalid request. To achieve that I have to implement some form of validation. One solution is to add the validation logic to the controller action.

My controller now checks the form data to see if anyone has entered "House" as their favourite food. If so, I add my error to the ModelState collection, which also sets the validity of the ModelState to false. My controller will now detect invalid requests.
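As a rough illustration of validation living in the controller action, here is a minimal sketch. It is written in Python rather than the application's C#, and everything in it is an assumption made for illustration: the `ModelState` class is a simplified stand-in for the MVC object of the same name, `submit_food` is a hypothetical action, the error message is invented, and the match on "house" is case-insensitive by choice.

```python
from dataclasses import dataclass, field


@dataclass
class FoodViewModel:
    """Hypothetical view model holding the single form field."""
    favourite_food: str = ""


@dataclass
class ModelState:
    """Simplified stand-in for MVC's ModelState: errors keyed by field name."""
    errors: dict = field(default_factory=dict)

    def add_model_error(self, key: str, message: str) -> None:
        self.errors.setdefault(key, []).append(message)

    @property
    def is_valid(self) -> bool:
        # Adding any error makes the state invalid.
        return not self.errors


def submit_food(model: FoodViewModel, model_state: ModelState) -> str:
    """Hypothetical controller action that does the validation itself."""
    if "house" in model.favourite_food.lower():
        model_state.add_model_error("favourite_food", "A house is not a food.")
    if not model_state.is_valid:
        return "form"    # redisplay the form along with the errors
    return "thanks"      # the statistics would be updated here


state = ModelState()
print(submit_food(FoodViewModel("House"), state), state.errors)
```

Note how the action has to know both the rule and how to record the failure, which is the extra responsibility the rest of this post moves out of the controller.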

The controller code above demonstrates a common mistake I see in MVC applications. Here the controller is doing too much work and the code is failing to use the extensions available in the framework. Instead, the FoodViewModel can be extended to work with the model binding process to handle the validation in a more elegant and focused manner.

Extending the validation process

There are two ways that I can augment my FoodViewModel with validation rules. Simple validation can be achieved by decorating properties with attributes like [Required] or [StringLength]. The model binder will detect these and assert the rules accordingly.

For more complex validation the framework designers chose composition as a way for my code to participate in validation and created the IValidatableObject interface.

This has a method called Validate which accepts a ValidationContext and returns an enumerable of ValidationResults. To show how this works I have updated FoodViewModel to implement the interface.

It implements the interface by defining the Validate method so that when the model binder runs it can ask my object to validate itself. If the FavouriteFood property contains the word "House" it returns an error message.
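The shape of a self-validating view model can be sketched as follows. This is Python, not the C# of the real class, and it only mirrors the idea of `Validate` returning an enumerable of results: the `ValidationResult` tuple, the error message and the case-insensitive match are assumptions for illustration, and the validation context argument is left out.

```python
from dataclasses import dataclass
from typing import Iterator, NamedTuple


class ValidationResult(NamedTuple):
    """Hypothetical result type: a message plus the member it refers to."""
    message: str
    member: str


@dataclass
class FoodViewModel:
    favourite_food: str = ""

    def validate(self) -> Iterator[ValidationResult]:
        # Yield one result per broken rule; yielding nothing means "valid".
        if "house" in self.favourite_food.lower():
            yield ValidationResult("A house is not a food.", "favourite_food")


print(list(FoodViewModel("House").validate()))
print(list(FoodViewModel("Pizza").validate()))
```

The rule now lives next to the data it protects, so the caller only needs to ask the object for its results.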

Coding to a contract

The IValidatableObject interface is a contract between the model binder and my view model which allows them to work together. The FoodViewModel is declaring that it can behave as an IValidatableObject. This allows the model binder to ask if it is valid.

For the model binder this is a powerful tool. By defining this interface the model binder achieves two things: it can open itself up to the outside world and it can delegate the job of validation to someone else. This code demonstrates how the model binder can implement this.

To mimic the process used by the model binder I use reflection to create an instance of the FoodViewModel and then cast it to an instance of IValidatableObject. If the cast succeeds I call the Validate method (to keep the example simple I pass in null for the validation context). Any errors that are returned I store in my error collection. Finally, I output all the messages to the console.
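The steps described above (create the instance from its type, check that it honours the contract, call it, collect the messages) can be sketched as below. This is a Python approximation, not the C# listing and not the framework's own code: `Validatable` is a hypothetical protocol standing in for IValidatableObject, `isinstance` plays the role of the cast, and `PlainModel` is added only to show the case where the cast would fail.

```python
from typing import Iterable, Protocol, runtime_checkable


@runtime_checkable
class Validatable(Protocol):
    """Hypothetical contract standing in for IValidatableObject."""

    def validate(self, context: object) -> Iterable[str]: ...


class FoodViewModel:
    def __init__(self) -> None:
        self.favourite_food = "House"

    def validate(self, context: object) -> Iterable[str]:
        if "house" in self.favourite_food.lower():
            yield "A house is not a food."


class PlainModel:
    """Offers no validate method, so the binder leaves it alone."""


def collect_errors(model_type: type) -> list[str]:
    model = model_type()                  # akin to creating the instance by reflection
    errors: list[str] = []
    if isinstance(model, Validatable):    # akin to the cast succeeding
        errors.extend(model.validate(None))   # no validation context, as in the text
    return errors


for message in collect_errors(FoodViewModel):
    print(message)
```

`collect_errors` knows nothing about food; it relies only on the contract, which is the point the next paragraph makes about composition.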

This code shows the power and simplicity of composition. The example code is focused on managing the process of collecting errors from other objects. It does not have any knowledge of how to validate an object but it uses a known contract to collect the results. The process of validation has been extracted and put in the IValidatableObject interface. This allows other code to extend the process by supplying their own implementations for the validation process. When this happens the two processes create a single process which does more than they could independently. This is the goal of composition, combining many simple objects to create a more complex one.

Conclusion

I feel that too often developers fail to think about the way a framework is intended to be used or what decisions have been made to abstract the lower level system. A typical indication of a lack of thinking is an application which recreates existing parts of the framework. Exploring the code and the API of a framework helps me to avoid this. I also expand my knowledge of how to use it efficiently and how to design my own code.

Examining the model binder process has given me a greater knowledge of how ASP.Net MVC takes an HTTP request and generates an object for a controller action. Understanding this complex process allows me to work with the framework so that I can extend my code in the simplest way possible to achieve the goal of validation.

I also gain knowledge by studying how composition is used in a complex process. I am now able to apply this powerful design pattern to my own code. I feel that studying existing code is an excellent way to expand my knowledge and, to be honest, I find it fun to learn how things work.

Sunday, September 23, 2012

SQL Baseline has joined the ChuckNorris Framework

I am very pleased to say that Rob and Dru have added my SQL Baseline tool to the Chuck Norris Framework. As part of SQL Baseline’s inauguration it has been renamed as PoweruP to fit alongside the likes of RoundhousE, DropkicK and WarmuP. The project has been moved over and can be found here.

I created PoweruP to help me configure RoundhousE to manage a number of existing databases. This is not an easy task and can be a barrier which stops people trying out RoundhousE as is shown by this conversation


This is a shame because once RoundhousE is set up it greatly increases development speed, is simple to maintain and brings database development in line with application coding. What can stop people using it is the need to extract all the stored procedures, views, functions, etc. from the database. With one command PoweruP will scaffold a new RoundhousE project from an existing database. It will also create the scripts and put them in the default RoundhousE folder structure. For a more detailed explanation see this post.

I am very pleased for PoweruP to be part of the Chuck Norris framework. I hope it will help more development teams to get started using RoundhousE because it is the best tool I have found for managing changes to the database schema.

Tuesday, September 18, 2012

Using 0MQ to communicate between threads

In this post I show how 0MQ can help with concurrency in a multithreaded program. To do this, I explore what concurrency means and why it is important. I then focus on in-process concurrency and threaded programming, a topic which is notoriously tricky to do well due to the need to share some kind of state between threads. I explore why this is and how it is typically tackled. I will then show how communication between threads can be achieved without sharing any state using 0MQ. Finally I propose that constructing our multithreaded applications using the 0MQ model leads us to simpler and more succinct code.

All code can be found in this github project

What is a concurrent program?

The word concurrent means more than one thing working together to achieve a common goal. In computing this usually means doing one of two things: something which is computationally expensive, like encoding a video file, or something that requires some sort of IO, like retrieving the size of a number of web pages.

The opportunity to employ concurrency has exploded with the arrival of multicore processors and the rise of hosted processing platforms like Amazon EC2 and Windows Azure. These two changes represent the two ends of the concurrency spectrum. To achieve concurrency on a multicore processor we create threads within our application and manage how they will share state. By contrast, achieving concurrency using something like EC2 is network based and requires the use of a communication channel like TCP. When communicating over the network, state is handled by passing messages.

0MQ recognises that the best way to create a concurrent program is to pass messages and not to share state. Whether it is two threads running within a process or thousands of processes running across the internet, 0MQ uses the same model of sockets and messaging to create very stable and scalable applications.

Multiple threads, shared state and locks

In .Net any program that must do more than one task at a time must create a thread. Threads are a way for Windows to abstract the management of many different streams of execution. Each thread gets its own stack and set of registers. The OS will then decide which thread is to be executed at any one time.

The problem with threads is that when they have to communicate with each other the typical way is to share some value in memory. This can cause data corruption as more than one thread could be accessing the data at one time, so the application has to manage access to the shared data. This is done by locking the shared data, ensuring that only one thread can manipulate it at any one time. This mechanism adds complexity to an application as it must include the locking logic. It also has an effect on performance.
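A minimal sketch of the locking approach described above, written in Python rather than .Net and using only the standard `threading` module. The `Counter` class and `run` helper are hypothetical names chosen for this illustration; the point is only that every update to the shared value has to pass through the lock.

```python
import threading


class Counter:
    """A value shared between threads, guarded by a lock."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self.value += 1


def run(thread_count: int, increments: int) -> int:
    counter = Counter()

    def work() -> None:
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


print(run(thread_count=4, increments=10_000))  # prints 40000
```

Even in this tiny case the locking logic is part of the application code, and every increment pays for acquiring and releasing the lock.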

0MQ: multiple threads and no shared state

0MQ makes threaded programming simpler by swapping shared state for messaging. To demonstrate this I have created a simple program which calculates the size of a directory by adding up the size of each file it has.

As we are using 0MQ we have to understand some of the concepts it uses. The first concept is static and dynamic components. Static components are pieces of infrastructure that we can always expect to be there. They usually own an endpoint, which they bind. Dynamic components come and go and generally connect to those endpoints. The next concept is the types of sockets provided by 0MQ. The implementation we’ll be looking at uses two types of sockets, PUSH and PULL. The PUSH socket is designed to distribute the work fairly to all connected clients, whilst the PULL socket collects results evenly from the workers. Using these socket types prevents one thread from being flooded with tasks or left idle waiting for its result to be taken.

Finally the 0MQ guide has a number of patterns for composing an application depending on the type of work being done. The example below calculates the size of a directory by getting the size of each file and adding them together. To achieve this task in 0MQ, a good choice is the task ventilator pattern.

 

In the diagram each box is a component in our application and components communicate with each other using 0MQ sockets. There are two static components in this application, the Ventilator and the Sink. There will only be one instance of each in the application and they will run on the same thread. There is one dynamic component, the Worker. There can be any number of workers and each one runs on its own thread.

To calculate the size of the directory, the Ventilator is given a list of files from the directory. It sends the name of each one out on its message queue.

When the Sink is started, it is told how many file sizes to expect. In this instance we pass in the length of the array that we passed to the Ventilator. The Sink then pulls in the results from each of the workers and adds each one to the running total for the size of the directory. When it has finished it returns the total size of the files found.

The Worker connects to the Ventilator and Sink end points and sits in an endless loop.

When a message arrives from the Ventilator it triggers an event which causes the Worker to read the file from the disk to find its size. When the operation completes the Worker publishes the size to the Sink’s end point.

All the components are brought together in the controlling program. We create a 0MQ context which will be shared with all the components. This is an important point when using 0MQ with threads: there must be a single context and it must be shared amongst all the threads. We then create instances of the Ventilator and Sink, passing in the context.

Next we create five workers each on their own thread, again passing in the 0MQ context.

We do the work by building an array of files from our directory and passing this to the Ventilator. We tell the Sink how many results to expect and wait for the result to be returned.

When we have the final number we print it on the console. At no point in the process did any thread have to update a shared value.
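The overall shape of the program can be sketched with the Python standard library. This is not the 0MQ code from the project: `queue.Queue` is swapped in for the PUSH and PULL sockets, so there is no context object, and the `STOP` sentinel and all function names are assumptions for illustration. `queue.Queue` does its own locking internally, but the application code never reads or updates a shared value; it only sends and receives messages.

```python
import os
import queue
import tempfile
import threading

STOP = None  # hypothetical sentinel message telling a worker to exit


def ventilator(paths, tasks):
    # Send the name of each file out as a separate message.
    for path in paths:
        tasks.put(path)


def worker(tasks, results):
    # Loop until told to stop: take a file name, reply with its size.
    while True:
        path = tasks.get()
        if path is STOP:
            break
        results.put(os.path.getsize(path))


def sink(expected, results):
    # Collect exactly `expected` results and keep a running total.
    total = 0
    for _ in range(expected):
        total += results.get()
    return total


def directory_size(directory, worker_count=5):
    tasks, results = queue.Queue(), queue.Queue()
    paths = [entry.path for entry in os.scandir(directory) if entry.is_file()]
    workers = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    ventilator(paths, tasks)            # ventilator and sink share this thread
    total = sink(len(paths), results)
    for _ in workers:
        tasks.put(STOP)                 # one stop message per worker
    for thread in workers:
        thread.join()
    return total


with tempfile.TemporaryDirectory() as directory:
    for name, content in (("a.txt", b"abc"), ("b.txt", b"hello")):
        with open(os.path.join(directory, name), "wb") as handle:
            handle.write(content)
    print(directory_size(directory))  # prints 8
```

Only the worker knows how to measure a file, so that step could be replaced without touching the ventilator or the sink.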

Conclusion

In this post I investigated the programming challenges faced when dealing with concurrency, focusing on those specific to threaded concurrency. I have shown how 0MQ approaches this problem with the view that concurrency should never involve sharing state and communication is best handled by passing messages between processes. To demonstrate how this works I created a simple program to calculate the size of a directory and used the 0MQ task ventilator pattern to structure the program. By following this pattern the software is broken down into very specific parts to perform a job. All knowledge of how to read the size of a file is held in the worker. If we discover a better way to read the size of the file this component can be changed without any impact on the rest of the program. This isolation is a consequence of only allowing communication between the key components over a message channel. Therefore the code is simpler as each component does only one job.

All code can be found in this github project

Sunday, August 19, 2012

0MQ Introduction

What is 0MQ?

0MQ is a very simple library that is used for managing the communication between different processes. It is a way of using enterprise messaging patterns without the need for an enterprise messaging server. By removing the server and using the socket API, a level of complexity is removed which leads to a simpler model good for concurrent programming.

History

0MQ has its roots firmly in the world of financial services. Originally, there were two vendors, TIBCO and IBM, which each had their own protocols for enterprise messaging. This made it hard for banks to intercommunicate. In 2003 the London office of JP Morgan created the first draft of the Advanced Message Queuing Protocol (AMQP), which was an attempt to create a standard communication protocol for messaging systems.

In 2005 iMatix were contracted by JP Morgan to create a message broker based on the new specification and they produced OpenAMQ. The new standard was received well by others in financial services and new members were added to the working group. However, the complexity of AMQP grew and led to iMatix leaving the working group. In 2008 Pieter Hintjens of iMatix wrote What is wrong with AMQP and how to fix it. Here Hintjens applauds early versions of the specification for being concise and simple to implement but then criticises later versions for their complexity. He argues that any specification that is too complex will fail. It is also clear that the experience iMatix had developing OpenAMQ gave them good insight into a new way of supporting high speed messaging. This experience led them to conclude that the way to simplify messaging was to remove the server that hosted the queues for the clients. This led to the development of 0MQ.

Not a message bus

In traditional enterprise messaging there is a server which hosts the queues and routes the messages. If you are using IBM this may be WebSphere, a Microsoft shop would use MSMQ, whilst others may use RabbitMQ. All of these solutions involve some software being installed on a server. Clients then bind to the queues they host to process messages.

0MQ is different in that it does not have a central server component; it is just a software library. For network communications you write the server and client components using the 0MQ API. Internally 0MQ uses TCP sockets to create the connection. For a lot of scenarios this removes a redundant step in the process. Take the example of a time server on the network whose job it is to respond to requests for the current time. With an enterprise service bus my time server would bind to a queue on the central exchange. Any client that wanted to know the time would send a request to that queue and wait for a response. In this operation the central server is not adding much to the task. Using 0MQ this same server can be created very easily.

The client that requests data from this service is

Applications that 0MQ is good for

By combining messaging patterns with socket based communication 0MQ is very good for concurrent programming. Concurrent programming can be across a network, within a machine or within a process. 0MQ uses the same patterns for all of these.

In the example above we created a server by binding to a TCP port, timeServer.Bind("tcp://*:5555");, for a service hosted on a network. To host this in process or on a machine we just change the binding type:
  • A way to connect processes on any machine and pass messages between them:
    Bind("tcp://*:5555");
  • A way to connect processes within a machine and pass messages between them:
    Bind("ipc://5555");
  • A way to connect threads within a process and pass messages between them:
    Bind("inproc://myservice");

Conclusion

0MQ is a very easy library to start using as it only involves including a couple of DLLs in your project and does not need any other infrastructure to support it. It has good abstractions and it is easy to create a variety of messaging queues.

Where I think 0MQ is really powerful is when it is applied to multithreaded programming. This is because 0MQ uses the same model for threaded programming as that used for concurrent programming across networks. Both pass messages to communicate instead of sharing state, and this avoidance of shared state between threads leads to simpler and more reliable programs. I shall explain this fully in my next blog post.