TIL – How To Check If a Substitute Was Called Zero Times

Setup

During this past week, I’ve been working on a new feature and during development, I ended up with code that looked like this:

public class PermissionChecker
{
  public void CheckPermissions(IModuleDisabler moduleDisabler, User user)
  {
      if (user.IsAdmin) return;
      else if (user.HasFullRights) ConfigureFullRights(moduleDisabler);
      else if (user.HasPartialRights) ConfigurePartialRights(moduleDisabler);
  }

  private void ConfigureFullRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
  }

  private void ConfigurePartialRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
      disabler.DisableReportModule();
      disabler.DisableUserManagementModule();
  }
}

The code is pretty straightforward: I have a PermissionChecker whose job is to use the IModuleDisabler to turn off certain modules depending upon the user’s permissions.

Now that the solution is fleshed out, it’s time to write some tests around this. When it comes to testing classes that have dependencies on non-trivial classes, I use NSubstitute, a mocking tool, to create mock versions of those dependencies. In this case, NSubstitute allows me to test how the IModuleDisabler is being used by the PermissionChecker.

For example, let’s say that I wanted to test how the PermissionChecker interacts with the IModuleDisabler when the user has partial access. I’d write a test that looks like the following:

[Test]
public void And_the_user_has_partial_access_then_the_disabler_disables_the_report_module()
{
     // Arrange
     var permissionChecker = new PermissionChecker();
     var mockDisabler = Substitute.For<IModuleDisabler>();
     var user = new User {HasPartialRights = true};

     // Act
     permissionChecker.CheckPermissions(mockDisabler, user);

     // Assert
     mockDisabler.Received(1).DisableReportModule();
}

In the above test, our assertion step is to check whether the mockDisabler received a single call to DisableReportModule. If it didn’t receive exactly one call, then the test fails. We can write similar tests for the other modules that should be disabled for the partial rights permission and follow a similar pattern for the full rights permission.

However, things get a bit more interesting when we’re testing what happens if the user is an admin. If we follow the same pattern, we’d end up with a test that looks like this:

[Test]
public void And_the_user_has_admin_permissions_then_the_disabler_is_not_used()
{
     // Arrange
     var permissionChecker = new PermissionChecker();
     var mockDisabler = Substitute.For<IModuleDisabler>();
     var user = new User {IsAdmin = true};

     // Act
     permissionChecker.CheckPermissions(mockDisabler, user);

     // Assert
     mockDisabler.DidNotReceive().DisableSystemAdminModule();
     mockDisabler.DidNotReceive().DisableReportModule();
     mockDisabler.DidNotReceive().DisableUserManagementModule();
}

This solution works for now. However, there is a major maintainability issue. Can you spot it?

Problem

The issue arises when we add a new module to be disabled, which forces the IModuleDisabler to gain a new method. In that case, you need to remember to update this test to also check that the new method wasn’t called. If you forget, this test would still pass, but it’d pass for the wrong reason.

To help illustrate, let’s say that another method, DisableImportModule, has been added to the IModuleDisabler interface. In addition, we also need to make sure that this is called when users have partial access, but should not be called for users who are admins or users who have full access.

To fulfill those requirements, we modify the PermissionChecker like so:

public class PermissionChecker
{
  public void CheckPermissions(IModuleDisabler moduleDisabler, User user)
  {
      if (user.IsAdmin) return;
      else if (user.HasFullRights) ConfigureFullRights(moduleDisabler);
      else if (user.HasPartialRights) ConfigurePartialRights(moduleDisabler);
  }

  private void ConfigureFullRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
  }

  private void ConfigurePartialRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
      disabler.DisableReportModule();
      disabler.DisableUserManagementModule();
      disabler.DisableImportModule();
  }
}

At this point, we’d write another test: when a user has partial access, the import module should be disabled. However, it’s very unlikely that we’d remember to update the test for the admin. Remember, for the admin, we’re checking that the disabler received no calls to any disable methods, and the way we’re doing that is by checking each method individually.

[Test]
public void And_the_user_has_admin_permissions_then_the_disabler_is_not_used()
{
  // Arrange
  var permissionChecker = new PermissionChecker();
  var mockDisabler = Substitute.For<IModuleDisabler>();
  var user = new User {IsAdmin = true};

  // Act
  permissionChecker.CheckPermissions(mockDisabler, user);

  // Assert
  mockDisabler.DidNotReceive().DisableSystemAdminModule();
  mockDisabler.DidNotReceive().DisableReportModule();
  mockDisabler.DidNotReceive().DisableUserManagementModule();
  // Need to add check for DidNotReceive().DisableImportModule();
}

Solution

There’s got to be a better way. After some digging around, I found that any NSubstitute mock has a ReceivedCalls method that returns all the calls the mock received. With this new knowledge, we can refactor the previous test into the following:

[Test]
public void And_the_user_has_admin_permissions_then_the_disabler_is_not_used()
{
  // Arrange
  var permissionChecker = new PermissionChecker();
  var mockDisabler = Substitute.For<IModuleDisabler>();
  var user = new User {IsAdmin = true};

  // Act
  permissionChecker.CheckPermissions(mockDisabler, user);

  // Assert
  CollectionAssert.IsEmpty(mockDisabler.ReceivedCalls());
}

This solution is much better because if we add more modules, this test is still checking to make sure that admin users do not have any modules disabled.
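To see why asserting on the whole call list holds up as the interface grows, here is a minimal sketch that does not depend on NSubstitute at all. It swaps the mock for a hand-written recording fake: RecordingDisabler, its Calls list, and the Demo helper are hypothetical names of mine, and IModuleDisabler, User, and PermissionChecker are trimmed down to what this post shows.

```csharp
using System;
using System.Collections.Generic;

public interface IModuleDisabler
{
    void DisableSystemAdminModule();
    void DisableReportModule();
    void DisableUserManagementModule();
    void DisableImportModule();
}

public class User
{
    public bool IsAdmin { get; set; }
    public bool HasFullRights { get; set; }
    public bool HasPartialRights { get; set; }
}

public class PermissionChecker
{
    public void CheckPermissions(IModuleDisabler disabler, User user)
    {
        if (user.IsAdmin) return;

        if (user.HasFullRights)
        {
            disabler.DisableSystemAdminModule();
        }
        else if (user.HasPartialRights)
        {
            disabler.DisableSystemAdminModule();
            disabler.DisableReportModule();
            disabler.DisableUserManagementModule();
            disabler.DisableImportModule();
        }
    }
}

// Hypothetical hand-rolled fake: records the name of every method called on it.
public class RecordingDisabler : IModuleDisabler
{
    public List<string> Calls { get; } = new List<string>();

    public void DisableSystemAdminModule() => Calls.Add(nameof(DisableSystemAdminModule));
    public void DisableReportModule() => Calls.Add(nameof(DisableReportModule));
    public void DisableUserManagementModule() => Calls.Add(nameof(DisableUserManagementModule));
    public void DisableImportModule() => Calls.Add(nameof(DisableImportModule));
}

public static class Demo
{
    // Runs the checker against a fresh fake and returns every call it received.
    public static List<string> CallsFor(User user)
    {
        var fake = new RecordingDisabler();
        new PermissionChecker().CheckPermissions(fake, user);
        return fake.Calls;
    }
}
```

The admin assertion is then a single check that Demo.CallsFor(new User { IsAdmin = true }) is empty, and it stays valid no matter how many Disable methods the interface gains later.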

Summary

When you’re using an NSubstitute mock and need to make sure that it received no calls to any methods or properties, you can use NSubstitute’s ReceivedCalls in conjunction with CollectionAssert.IsEmpty to ensure that the substitute was not called.

Using F# To Solve a Constraints Problem

In this post, I’m going to solve a logic puzzle using C# and F#. First, I’ll define the problem being solved and what our restrictions are. Next, I’ll show how I’d break down the problem and write an easy-to-read, extendable solution in idiomatic C#. Afterwards, I’ll solve the same problem with an easy-to-read, extendable solution written in idiomatic F#. Finally, we’ll compare the two solutions and see why the F# solution is the better one.

The Problem

For this problem, I’m going to write a constraint solver (thanks to Geoff Mazeroff for the inspiration).

If you’re not familiar with the concept, a constraint is simply some rule that must be followed (such as all numbers must start with a 4). So a constraint solver is something that takes all the constraints and a source of inputs and returns all values that fit all the constraints.

With that being said, our source will be a list of positive integers and our constraints are the following:

  • 4 digits long (so 1000 – 9999)
  • Must be even (so 1000, 1002, 1004, etc…)
  • The first digit must match the last digit (2002, 2012, 2022, etc…)

To further restrict solutions, all code will be production ready. This includes handling error conditions (like input being null), being maintainable (easily adding more constraints) and easy to read.

To quickly summarize, we need a robust, maintainable, and readable solution that finds all 4-digit numbers that are even and whose first and last digits are equal.

Implementing a Solution in C#

For the C# solution, I’m going to need a class for every constraint, a class to execute all constraints against a source (positive integers) and a runner that ties all the pieces together.

Starting with the smaller blocks and building up, I’m going to start with the constraint classes. Each constraint is going to take a single number and will return true if the number follows the constraint, false otherwise.

With that being said, I’d first implement the constraint that all numbers must be 4 digits long

class MustBeFourDigitsLongConstraint
{
    public bool FollowsConstraint(int value)
    {
        return value.ToString().Length == 4;
    }
}

Second, I’d write the constraint that all numbers must be even

class MustBeEvenConstraint
{
    public bool FollowsConstraint(int value)
    {
        return value % 2 == 0;
    }
}

Third, I’d implement the constraint that all numbers must have the same first digit and the last digit

class FirstDigitMustEqualLastDigitConstraint
{
    public bool FollowsConstraint(int value)
    {
        var valueString = value.ToString();
        return valueString[0] == valueString[valueString.Length-1];
    }
}

At this point, I have the constraints written, but I need them to follow a general contract so that the Constraint Solver (about to be defined) can take a list of these constraints. I’ll introduce an interface, IConstraint and update my classes to implement that interface.

public interface IConstraint
{
    bool FollowsConstraint(int value);
}
class MustBeFourDigitsLongConstraint : IConstraint {/* Implementation Details Omitted */}

class MustBeEvenConstraint : IConstraint {/* Implementation Details Omitted */}

class FirstDigitMustEqualLastDigitConstraint : IConstraint {/* Implementation Details Omitted */}

Now that the constraints are defined and implement a uniform interface, I can create the constraint solver. This class is responsible for taking the list of numbers and the list of constraints and then returning a list of numbers that follow all constraints.

class ConstraintSolver
{
    public List<int> FindValues(List<IConstraint> constraints, List<int> values)
    {
        if (constraints == null) throw new ArgumentNullException("constraints");
        if (values == null) throw new ArgumentNullException("values");

        var result = values;
        foreach (var constraint in constraints)
        {
            result = result.Where(x => constraint.FollowsConstraint(x)).ToList();
        }
        return result;
    }
}

Finally, I can put all the pieces together using LINQPad (Full C# solution can be found here).

void Main()
{
    var numbers = Enumerable.Range(0, 10000).ToList();
    var constraints = new List<IConstraint>{new MustBeFourDigitsLongConstraint(), new MustBeEvenConstraint(), 
             new FirstDigitMustEqualLastDigitConstraint()};

    var constraintSolver = new ConstraintSolver();
    var result = constraintSolver.FindValues(constraints, numbers.ToList());

    result.Dump();
}

This solution is easily extendable because if we need to add another constraint, we just add another class that implements the IConstraint interface and change the Main method to add an instance of the new constraint to the list of constraints.
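As a rough illustration of that extension step, here is a sketch of one extra constraint. SecondDigitMustBeOddConstraint is a hypothetical class I made up for this example; IConstraint is repeated only so the snippet stands on its own.

```csharp
using System;

public interface IConstraint
{
    bool FollowsConstraint(int value);
}

// Hypothetical extra constraint: the second digit from the left must be odd.
public class SecondDigitMustBeOddConstraint : IConstraint
{
    public bool FollowsConstraint(int value)
    {
        // Widen to long before taking the absolute value so int.MinValue cannot overflow.
        var digits = Math.Abs((long)value).ToString();

        // A single-digit number has no second digit to inspect.
        if (digits.Length < 2) return false;

        return (digits[1] - '0') % 2 == 1;
    }
}
```

Wiring it in would then be a matter of adding new SecondDigitMustBeOddConstraint() to the constraints list in Main; the solver itself would not need to change.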

Implementing a Solution in F#

Now that we have the C# solution, let’s take a look at how I would solve the problem using F#.

Similar to the C# solution, I’m going to create a function for every constraint, a function to execute all constraints against a source (positive integers) and a runner that ties all the pieces together.

Also similar to the C# solution, I’m going to start with creating the constraints and then work on the constraint solver function.

First, I’d implement the constraint that the number must be four digits long.

let mustBeFourDigit number = 
    number.ToString().Length = 4

Next, the constraint that the number must be even.

let mustBeEven number =
    number % 2 = 0

Lastly, the constraint that the first digit is the same as the last digit.

let firstDigitMustBeEqualLast number =
    let numberString = number.ToString().ToCharArray()
    let firstDigit = numberString.GetValue(0)
    let lastDigit = numberString.GetValue(numberString.Length-1)
    firstDigit = lastDigit

At this stage in the C# solution, I had to create an interface, IConstraint, so that the constraint solver could take a list of constraints. What’s cool with F# is that I don’t have to define the interface. F#’s type inference works out that each of these functions takes a compatible input (a generic 'a, or int for mustBeEven) and returns a bool, so I can add all of them to the same list. This is pretty convenient since I don’t have to worry about this piece of plumbing.

Now that the different constraints are defined, I’d go ahead and write the last function, which takes a list of numbers and a list of constraints and returns the numbers that satisfy every constraint. (Confused by this function? Take a look at Implementing Your Own Version of F#’s List.Filter.)

let rec findValidNumbers numbers constraints = 
    match constraints with
    | [] -> numbers
    | firstConstraint::remainingConstraints ->
        let validNumbers = numbers |> List.filter firstConstraint
        findValidNumbers validNumbers remainingConstraints

Finally, all the pieces are in place, so I can now put all the pieces together using LINQPad.

let numbers = [1000 .. 9999]
let constraints = [mustBeFourDigit; mustBeEven; firstDigitMustBeEqualLast]

let result = findValidNumbers numbers constraints

printf "%A" result

Comparing Both Solutions

Now that we have both solutions written up, let’s compare and see which solution is better.

First, the same design was used for both solutions. I decided to use this design for both because it’s flexible enough that we could add new constraints if needed (such as, the 2nd digit must be odd). As an example, for the C# solution, I’d create a new class that implemented IConstraint and then update the runner to use the new class. For the F# solution, I’d create a new function and update the runner. So, I’d think it’s safe to say that both solutions score about the same from a maintainability and extendability point of view.

From an implementation perspective, both solutions are production ready since the code handles possible error conditions (C# with null checks in the ConstraintSolver class; F# needs none because F# lists and functions can’t normally be null). In addition to being robust, both solutions are readable, with ample whitespace and clearly named variables, classes, and interfaces.

With that being said, this is where the similarities end. When we look at how much code was written to solve the problem, we have a stark difference. For the C# solution, I ended up with 48 lines of code (omitting blank lines); for the F# solution, it only took 19. Now you could argue that I could have written the C# solution in fewer lines of code by removing curly braces around one-line statements or ignoring null inputs. However, this would make the code less robust.

Another difference between the F# solution and the C# one is that I was able to focus on the solution without having to wire up an interface. You’ll often hear developers talk about how little plumbing you need for F# to “just work”, and this small example demonstrates that point.

Another difference (albeit subtle) is that with the F# solution, I can use the findValidNumbers function with any generic list of values and any list of functions that take that generic type and return true/false.

In other words, if I had another constraint problem using strings, I’d still implement the different constraints, but I could use the same findValidNumbers function. At that point, however, I’d probably rename it to findValidValues to signify the generic solution.

What’s awesome about this is that I didn’t have to do any more work to have a generic solution, F# did that for me. To be fair, the C# solution can easily be made generic, but that would have to be a conscious design decision and I think that’s a downside.
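For comparison, here is a sketch of what that conscious design decision might look like in C#. GenericConstraintSolver and FindValidValues are hypothetical names, and I’ve swapped the IConstraint interface for plain Func<T, bool> delegates, so this is one possible shape rather than a drop-in replacement for the ConstraintSolver above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GenericConstraintSolver
{
    // Keeps only the values that satisfy every constraint, for any element type T.
    public static List<T> FindValidValues<T>(
        IEnumerable<T> values,
        IEnumerable<Func<T, bool>> constraints)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));
        if (constraints == null) throw new ArgumentNullException(nameof(constraints));

        // Materialize the constraints once so they are not re-enumerated per value.
        var constraintList = constraints.ToList();

        return values.Where(v => constraintList.All(c => c(v))).ToList();
    }
}
```

The same method would then accept the integer constraints from this post (as lambdas) or constraints over strings, which is the kind of reuse F# gave me without asking.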

TL;DR

In this post, we took a look at solving a number constraint problem by using idiomatic C# and F#. Even though both solutions are easy to read and easy to extend, the F# version was less than 1/2 the size of the C# solution. In addition, I didn’t have to do any plumbing for the F# version, but had to do some for the C# solution, and to top it off, the F# solution is generically solved, whereas the C# solution is not.

Implementing Your Own Version of F#’s List.Filter

As I’ve been thinking more about F#, I began to wonder how certain methods in the F# stack work, so I decided to implement F#’s List.filter method.

For those of you who aren’t familiar, List.filter takes a function that returns true or false and a list of values. The result of the call is all values for which the function returns true.

For example, if we wanted to keep just the even numbers in our list, then the following would accomplish that goal.

let values = [1;2;3;4]
let isItEven x = x % 2 = 0


let evenValues = List.filter isItEven values
// val evenValues : int list = [2; 4]

Now that we know the problem, how would we begin to implement? First, we need to define a function called filter:

let filter () =

However, to match the signature for List.filter, it needs to take a function that maps integers to bools and the list of values to work on

let filter (func:int->bool) (values:int List) =

Now that we have the signature, let’s add some logic to match on the list of values. When working with lists, there are two possibilities, an empty list and a non-empty list. Let’s first explore the empty list option.

In the case of an empty list of values, then it doesn’t matter what the func parameter does, there are no possible results, so we should return an empty list for the result.

let filter (func:int->bool) (values:int List) =
    match values with
    | [] -> []

Now that we’ve handled the empty list, let’s explore the non-empty list scenario. In this branch, the list must have a head and a tail, so we can deconstruct the list to follow that pattern.

let filter (func:int->bool) (values:int List) =
    match values with
    | [] -> []
    | head::tail -> // what goes here?

Now that we’ve deconstructed the list, we can now use the func parameter with the head element. If the value satisfies the func parameter, then we want to add the head element to the list of results and continue processing the rest of the list. To do that, we can use recursion to call back into filter with the same func parameter and the rest of the list:

let rec filter (func:int->bool) (values:int List) =
    match values with
    | [] -> []
    | head::tail -> 
         if func head then head :: filter func tail

At this point, we need to handle the case where the head element does not satisfy the func parameter. In this case, we should not add the element to the list of results and we should let filter continue the work

let rec filter (func:int->bool) (values:int List) =
    match values with
    | [] -> []
    | head::tail -> 
         if func head then head :: filter func tail
         else filter func tail

By handling the base case first (an empty list), filter can focus on the current element in the list (head) and then recurse to process the rest of the list. This solution works, but we can make this better by removing the type annotations. Interestingly enough, we don’t care if we’re working with integers, strings, or whatever. Just as long as the function takes some type and returns bool and the list of values matches the same type as the func parameter, it works. So then we end up with the following:

let rec filter func values =
    match values with
    | [] -> []
    | head::tail -> if func head then head :: filter func tail else filter func tail

In general, when working with lists, I tend to start by matching the list with either an empty list or non-empty. From there, I’ve got my base case, so I can focus on the implementation for the first element. After performing the work for the first element, I can then recurse to the next element.

Today I Learned – The Chain of Responsibility Design Pattern

There is nothing new in the world except the history you do not know. – Harry S. Truman

The more experience I gain problem solving, the more this holds true. For this post, I’m going to first discuss the problem that I was trying to solve. Next, I’ll show what my first solution was, followed by the shortcomings of this solution. Thirdly, we’ll iterate toward a better solution to the problem. This, in turn, will provide the motivation for what the Chain of Responsibility is and how to implement it. Finally, I’ll wrap up with the benefits of using this design.

Problem I was trying to solve

As part of the process of installing our software, there are scripts that will update the database from its current version to the latest version. As it stands, the installer needs to be able to upgrade a database from any version to the current version.

Previous Solution

The first thing that comes to me is that I need to apply database scripts in a sequential way. For example, if the database’s current version is 1.0 and the latest version is 3.0, it would need to apply the script to upgrade the database from 1.0 to 2.0 and then apply the script to upgrade the database from 2.0 to 3.0.

For the first implementation, there were only two versions, 1.0 and 2.0. Since I didn’t want to build in a lot of functionality if it wasn’t needed yet, I created a helper method that returns the correct updater for a given version. In the code below, if the version does not exist, I assume the database does not exist and return the class that will create the database. Otherwise, if the version is 1.0, I return a class that is responsible for upgrading a database from 1.0 to 2.0. If the version is 2.0, I return a class that doesn’t do anything (i.e. there are no upgrades to be done).

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "1.0")
    return new Database100To200Updater();
  if (version == "2.0")
    return new CurrentVersionUpdater();
  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

Problem with solution

This solution worked well when there were only three possible actions (create a new database, apply the single script, or do nothing). However, we are now going to be shipping version 3.0, and there will need to be a new class that is responsible for upgrading the database from 2.0 to 3.0. In order to add this functionality, I’d have to do the following:

  1. Create the Database200To300Updater class that contained the logic for updating the database from 2.0 to 3.0.
  2. Modify the Database100To200Updater class to also use the Database200To300Updater in order to perform the next part of the upgrade.
  3. Add additional logic to the above method so that if the database is 2.0 to return the Database200To300Updater class.

After making the modifications, the method now looks like:

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "1.0")
    return new Database100To200Updater(new Database200To300Updater());
  if (version == "2.0")
    return new Database200To300Updater();
  if (version == "3.0")
    return new CurrentVersionUpdater();

  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

So far, so good: we now have the logic to apply scripts in order. However, now that we’ve added version 3.0, I start to wonder what I would do if we added more versions. After some thought, it would look identical to the previous steps (see below for what would happen if we added version 4.0).

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "1.0")
    return new Database100To200Updater(new Database200To300Updater(new Database300To400Updater()));
  if (version == "2.0")
    return new Database200To300Updater(new Database300To400Updater());
  if (version == "3.0")
    return new Database300To400Updater();
  if (version == "4.0")
    return new CurrentVersionUpdater();
  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

If we create some variables to hold onto these classes, and reorder the if statements, we can write this helper method as:

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "4.0")
    return new CurrentVersionUpdater();
  var database300Updater = new Database300To400Updater();
  var database200Updater = new Database200To300Updater(database300Updater);
  var database100Updater = new Database100To200Updater(database200Updater);

  if (version == "1.0")
    return database100Updater;
  if (version == "2.0")
    return database200Updater;
  if (version == "3.0")
    return database300Updater;

  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

Motivation for the Chain of Responsibility

What I find interesting in this design is that I’ve now chained these updater classes together so that if the version 1.0 is returned, it will also use the 2.0 updater, which in turn calls the 3.0 updater. It was at this point, that I remembered a design pattern that followed this structure.

In this design pattern, you essentially have Handlers (in my case updaters) that check to see if they can handle the request. If so, they do and that stops the chain. However, if they can’t handle the request, they pass it to their Successor (which was also a Handler) to handle the request. The design pattern I was thinking about is the Chain of Responsibility pattern.

In order to implement this pattern, you need to have an IHandler interface that exposes a Handle method and either a method or property to set the Successor. The method is the action to take (in our case Update) and the Successor represents the next Handler in the chain if the request could not be handled. The second component is referred to as ConcreteHandlers and they are just the implementors of the interface. One way to implement this is like the following:

public interface IHandler
{
  IHandler Successor { get; set; }
  void Update(int version);
}

public class ConcreteHandlerA : IHandler
{
  public IHandler Successor { get; set; }

  public void Update(int version)
  {
    if (CanTheRequestBeHandled) {
      // handle the request
    }
    else {
      Successor.Update(version);
    }
  }
}

The main difference between the pattern and what I need is that instead of doing if (canHandle)/else call Successor, what I’m really looking for is to run the upgrade script if the version we’re upgrading to is higher than our current version and then always call the successor. Given this change, here’s what that new implementation looks like:

public class ConcreteHandlerA : IHandler
{
  public IHandler Successor { get; set; }
  public void Update(int version)
  {
    if (CanTheRequestBeHandled) {
      // handle the request
    }
    Successor.Update(version);
  }
}

Implementing the Chain of Responsibility

Now that I know the pattern to use and how it works, I need to update the IDatabaseUpdater interface to follow the IHandler interface. Next, I will need to modify the concrete handlers to use the new interface correctly.

Implementing the Handler

First, we will update our IDatabaseUpdater interface to follow the IHandler look:

Before
public interface IDatabaseUpdater
{
  void Update();
}
After
public interface IDatabaseUpdateHandler
{
  void Update(int version);
  IDatabaseUpdateHandler Successor { get; set; }
}

Implementing the Concrete Handler

Second, we will need to update our concrete handlers to implement the interface correctly and to update their Update method to follow the design. In my case, the concrete handlers perform similar logic, so one of the classes is used as an example.

Before
public class Database100To200Updater : IDatabaseUpdater
{
  private Database200To300Updater _successor;
  public Database100To200Updater(Database200To300Updater successor)
  {
    if (successor == null)
      throw new ArgumentNullException("successor");
    _successor = successor;
  }

  public void Update()
  {
    Console.WriteLine("Updating the database to version 2.0");
    _successor.Update();
  }
}
After

Thanks to the public property, I was able to remove the private member and that in turn allowed me to remove the constructor.

public class Database100To200Updater : IDatabaseUpdateHandler
{
  public void Update(int version)
  {
    if (version >= 2)
      Console.WriteLine("Updating the database to version 2.0");
    if (Successor != null)
      Successor.Update(version);
  }

  public IDatabaseUpdateHandler Successor { get; set;}
}

Updating the Helper Method

Now that we’ve updated the interface and implementors, it’s time to update the helper method to take advantage of the new design.

public IDatabaseUpdateHandler GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();

  var database300To400 = new Database300To400Updater();
  var database200To300 = new Database200To300Updater();
  var database100To200 = new Database100To200Updater();

  database100To200.Successor = database200To300;
  database200To300.Successor = database300To400;

  return database100To200;
}
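To make the flow through the chain easier to follow, here is a small self-contained sketch. It is not the production code: VersionStepUpdater and ChainDemo are hypothetical names, a list of strings stands in for running real scripts, and I’m assuming that the version passed to Update is the database’s current major version.

```csharp
using System;
using System.Collections.Generic;

public interface IDatabaseUpdateHandler
{
    IDatabaseUpdateHandler Successor { get; set; }
    void Update(int version);
}

// Hypothetical handler: upgrades the database from fromVersion to fromVersion + 1.
public class VersionStepUpdater : IDatabaseUpdateHandler
{
    private readonly int _fromVersion;
    private readonly List<string> _log;

    public VersionStepUpdater(int fromVersion, List<string> log)
    {
        _fromVersion = fromVersion;
        _log = log;
    }

    public IDatabaseUpdateHandler Successor { get; set; }

    public void Update(int version)
    {
        // Assumption: 'version' is the database's current major version, so this
        // step is needed only when the database is at or below _fromVersion.
        if (version <= _fromVersion)
            _log.Add(_fromVersion + ".0 -> " + (_fromVersion + 1) + ".0");

        // Unlike the classic pattern, always hand the request to the successor.
        if (Successor != null)
            Successor.Update(version);
    }
}

public static class ChainDemo
{
    // Builds a 1.0 -> 2.0 -> 3.0 -> 4.0 chain and returns the steps that ran.
    public static List<string> Run(int currentVersion)
    {
        var log = new List<string>();
        var step1 = new VersionStepUpdater(1, log);
        var step2 = new VersionStepUpdater(2, log);
        var step3 = new VersionStepUpdater(3, log);

        step1.Successor = step2;
        step2.Successor = step3;

        step1.Update(currentVersion);
        return log;
    }
}
```

Under those assumptions, ChainDemo.Run(2) skips the first step and records only the 2.0 to 3.0 and 3.0 to 4.0 upgrades, which is the behaviour the helper method relies on by always returning the head of the chain.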

Chain of Responsibility is great, here’s why

What I really like about the chain of responsibility pattern is that I was able to connect my upgrade classes together in a consistent fashion. Another reason why I like this pattern is that it forces me to have the logic to determine whether I should run the update or not inside the individual classes instead of the helper method. This produces more readable code which then lends itself to easier maintainability.

Today I Learned – How to Break Down A Massive Method

During this past week, I was working with our intern and showing him some cool debugging tricks when I came across a massive method. I gave him 30 seconds to look at it and tell me what he thought the method was doing. After a while, he was able to figure it out, but the fact that it wasn’t easily discernible was enough to give me pause.

The lesson here is that if you can’t determine what the method is doing easily, then it’s probably doing way too much (violating the Single Responsibility Principle) and needs to be broken into more easily readable pieces.

To demonstrate what I mean, I wrote a program that inserts Messages into a database. A Message contains a description, the number (for identification when troubleshooting) and the module. We would have issues where different messages would have the same number which would cause confusion when troubleshooting errors.

In the program I wrote, the user provides the message and what module the message belongs to and the program automatically generates the message number and inserts the message into the database.

For brevity’s sake, shown below is the logic for determining what the next message number should be.

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");
  if (_connection == null)
    _connection = CreateConnection();

  var results = new List<int>();

  _connection.Open();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split( );

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split( );

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

The method itself isn’t complex: it calls some stored procedures, parses the output, and adds each number to a list. However, it’s not abundantly clear what the purpose of calling the stored procedures is.

First, it looks like we’re reading the alert numbers from a stored procedure call, so why don’t we extract that logic out to a helper method and have the public method call the helper?

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");

  if (_connection == null)
    _connection = CreateConnection();

  var results = new List<int>();

  _connection.Open();
  results.AddRange(ReadAlerts(module.ToUpper()));

  var cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split( );

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

private List<int> ReadAlerts(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split( );
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

By doing this, we fix two issues at once. First, we’ve given a name to the process of reading the alerts, which in turn allows us to quickly understand what the public method should be doing (i.e. improved readability).

Second, it allows for easier debugging because we now have smaller components. For example, let’s say that we were getting the wrong value. In the first implementation, we would have to put breakpoints in different areas trying to determine which part was broken. However, in the new form, we can check to see if ReadAlerts is behaving correctly. If it isn’t, we now know the bug has to be in that method; otherwise, it’s in the rest.

For the next step, you may have noticed that we can repeat the same refactoring trick again, except this time, we can extract the logic for reading the errors into a helper method.

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");
  if (_connection == null)
    _connection = CreateConnection();

  _connection.Open();

  var results = new List<int>();
  results.AddRange(ReadAlerts(module.ToUpper()));
  results.AddRange(ReadErrors(module.ToUpper()));

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

private List<int> ReadAlerts(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split( );
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

private List<int> ReadErrors(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split( );
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

After the changes, anyone who looks at the public API can easily see that it’s reading from both Alerts and Errors. This is really powerful because now you can communicate with non-technical people about requirements and what the code is doing.

Let’s say that in the future, the requirements change and this conversation plays out:

Alice (QA, finding an error) – Hey Bob, I was running one of our test plans and it looks like we’re getting the wrong message number if we’re trying to add a new message and there are warning messages in the database. We are including the warning table when figuring that out, right?

Bob (Engineer, finding the root cause) – Hey you’re right, it looks like we’re only using the alerts and error tables when calculating the next number. Why don’t we write up a ticket for that and get a fix in?

The key point is that no matter how large a method is, there always have to be steps being performed in some order (by definition of an algorithm) and this is the key to refactoring larger methods into more maintainable pieces of code. The trick is determining what those steps are and making decisions on whether to make helper methods or helper classes.

If those steps become complicated, then they should be broken out into helper methods. As time progresses and those helper methods start to become more complicated, then those helper methods should in turn become classes of their own.
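The same "name each step" idea can be sketched in Ruby, the language used by the later posts on this page. This is a minimal sketch rather than a port of the C# above: the in-memory row arrays are hypothetical stand-ins for the stored procedure results, and the method names are made up for illustration.

```ruby
# Hypothetical in-memory rows standing in for the stored procedure results.
ALERT_ROWS = ["SALES 101", "SALES 104", "HR 7"].freeze
ERROR_ROWS = ["SALES 102", "HR 12"].freeze

# Step 1: validate the input.
def validate_module!(mod)
  raise ArgumentError, "module cannot be nil or empty" if mod.nil? || mod.empty?
end

# Step 2: pull the numbers for one module out of a set of rows.
def read_numbers(rows, mod)
  rows.select { |row| row.include?(mod) }
      .map { |row| Integer(row.split[1]) }
end

# Step 3: the public method now reads as a list of named steps.
def next_message_number(mod)
  validate_module!(mod)
  numbers = read_numbers(ALERT_ROWS, mod) + read_numbers(ERROR_ROWS, mod)
  numbers.max + 1
end

puts next_message_number("SALES") # prints 105
```

Because both reads share one helper in this sketch, adding a third source (such as the warnings from the conversation above) would be one more `read_numbers` call in the public method.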

Today I Learned: The Law of Demeter

Don’t pass in more information than you need. It sounds simple, but when working with messy legacy code, it’s easy to forget.

The impetus for this post came from a peer code review. During the review, I found this method:

public IStrategy GetStrategy(Project project, bool isAffected)
{
  var type = project.Type;
  if (type == ProjectType.A && isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

At first glance, it looks pretty good. Logic was sound and it seemed to be returning a class implementing an interface similar to what we would expect for the Factory pattern. However, there’s a slight problem, can you spot it?

The issue is with the first parameter, Project. The method takes a Project, however, we’re only really depending on the Project’s Type property.

So why don’t we get rid of the dependency on the Project and instead replace it with the dependency on the ProjectType instead?

public IStrategy GetStrategy(ProjectType type, bool isAffected)
{
  if (type == ProjectType.A && isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

Instinctively, I knew this was the right call, but I couldn’t remember why I knew it was a good choice. After some digging, I remembered that this is a Law of Demeter violation, also known as a Principle of Least Knowledge violation.

In general, this principle states that a method should have the least amount of information it needs to do its job. Another classic violation of this principle is reaching through a class’s internals into their internals. For example,

SomeClassA.SomePropertyB.WithSomeMethodC()

One of the reasons that I really like the Law of Demeter is that if you follow it, you create easier to test methods. Don’t believe me? Which is easier to create, the Project class (which may have many dependencies that would need to be stubbed) or the ProjectType enum (which by definition has zero dependencies)?

Another reason that following the Law of Demeter is good practice is that it forces your code to be explicit about what dependencies are required. For example, in the first implementation, the caller knew that the method needed a Project, but had no clue how much of the Project it needed (does it need all of the properties set? Does it need further setup besides basic instantiation?). However, with the refactored version, it’s much clearer that the method has a looser dependency: not on the Project, but just on the ProjectType.
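To make the testing argument concrete, here is a small Ruby sketch of the refactored method. It is an illustration under assumptions, not the reviewed code: the two strategy classes are empty hypothetical stand-ins, and a symbol plays the role of the ProjectType enum.

```ruby
# Hypothetical stand-ins for the strategies in the C# example.
ProjectAIsAffectedStrategy = Class.new
ProjectBStrategy = Class.new

# Depends only on the type (a symbol here), not on a whole Project.
def get_strategy(type, is_affected)
  return ProjectAIsAffectedStrategy.new if type == :a && is_affected
  return ProjectBStrategy.new if type == :b
  nil # other cases omitted in this sketch
end

# No Project needs to be built or stubbed to exercise the method.
puts get_strategy(:a, true).class  # prints ProjectAIsAffectedStrategy
puts get_strategy(:b, false).class # prints ProjectBStrategy
```

Every call site in the sketch supplies exactly the two values the method reads, so there is nothing left over to set up.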

Beginner Basics: Establishing a SOLID Foundation – The Dependency Inversion Principle

Welcome to the final installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the fifth part of SOLID, the Dependency Inversion Principle.

What is the Dependency Inversion Principle?

When working with object-oriented languages, we take large problems and break them down into smaller pieces. These smaller pieces in turn are broken down into even smaller, more manageable pieces to work on. As part of the breaking down process, we inherently have to introduce dependencies between the larger pieces and the smaller pieces.

How we weave these dependencies together is the difference between easily changing behavior and spending the next week pulling your hair out.

When working with classes, dependencies are usually introduced by constructing them in the class that they’re used in. For example, let’s say that we’ve been asked to write a utility that emulates a calculator but also keeps a transaction log for record keeping purposes.

class Logger
  def log (content)
    File.open("C:\\temp\\results.txt", 'a') {|f| f.write(content)}
  end
end

class Calculator
  def initialize
      @logger = Logger.new()
  end
  def add (a, b)
      log(a, b, "+")
      return a + b
  end
  def sub (a, b)
      log(a, b, "-")
      return a - b
  end
  def mult (a, b)
      log(a,b,"*")
      return a * b
  end
  def div (a, b)
      log(a,b,"/")
      return a.to_f / b
  end

  def log(a, b, sym)
      text = a.to_s + " " + sym + " " + b.to_s + " = "
      if sym == "+"
            text += (a + b).to_s
      elsif sym == "-"
            text += (a-b).to_s
      elsif sym == "*"
            text += (a*b).to_s
      else
            text += (a.to_f/b).to_s
      end
      text += "\n"
      @logger.log(text)
  end
end

# Usage
calc = Calculator.new()
puts calc.add(4,3)
puts calc.sub(2,1)
puts calc.mult(100,2)
puts calc.div(5,2)

So far so good, we have two classes (Logger and Calculator) that are responsible for the logging and the calculations. Even though this is a small amount of code (~50 lines for both classes), there are three dependencies in the code. The first dependency that I notice is that Calculator depends on Logger. We can see this by looking at the initialize method for Calculator (as hinted above):

class Calculator
  def initialize
    @logger = Logger.new()
  end
end

The second dependency is a bit trickier to find, but in the Logger class’ log method, we use a hard coded file path.

class Logger
  def log (content)
      File.open("C:\\temp\\results.txt", 'a') {|f| f.write(content)}
  end
end

The third dependency is probably the hardest to find, but in the Logger class’ log method, we are also depending on the file system for the machine by using Ruby’s File class. But wait a minute, I hear you say, why is the File class considered a dependency when it’s a part of the Ruby language? I agree with you, but it’s still a dependency in the code and something that we should keep in mind.

From these three dependencies, we’re going to focus on resolving the first two. We could resolve the dependency on the file system, but it would take so much effort for so little gain.

Why don’t we resolve the file dependency issue?

One thing that I keep in mind when identifying which dependencies to invert is to focus on inverting dependencies outside of the framework. In this case, we’re going to ignore the file system dependency because I, the programmer, depend on Ruby to behave correctly. If Ruby stops working, then a broken file system is the least of my worries. Therefore, it’s not worth the effort to resolve.

Making the Calculator and Logger more flexible

In order to resolve these DIP violations, we need to expose ways to drop in these dependencies. There are two ways of doing this. We can either:

  • Expose the dependency via the constructor
  • Expose the dependency via a public property

Using the Calculator example, we can expose the logger dependency via the constructor and we can expose the filePath dependency by exposing it as a property.

For the Calculator class, I’m going to first change the initialize method so that it takes a logger instead of constructing its own.

class Calculator
  def initialize(logger)
    @logger = logger
  end
end

Next, I will construct the logger in the usage and pass the logger as part of the constructor.

# Usage
logger = Logger.new()
calc = Calculator.new(logger)

A quick run of our program tells us that everything is still working correctly.

Now that we’ve finished up exposing the logger dependency, it’s time to expose the file path. First, I’m going to add a public property on the Logger class called filePath and use that property in the log method.

class Logger
  def log (content)
      File.open(filePath, 'a') {|f| f.write(content)}
  end
  attr_accessor :filePath
end

Now that we’ve introduced a seam for the file path dependency, we use that seam in the program and assign the property:

# Usage
logger = Logger.new()
logger.filePath = "C:\\temp\\results.txt"
calc = Calculator.new(logger)

Multiple ways of solving the issue, which one is best?

When using the constructor method, it’s very clear to see what dependencies the class has, just look at the constructor. However, adding a new dependency to the constructor may cause other code to fail because the signature of the constructor has changed. This in turn can lead to cascading changes where multiple places of code need to be updated to pass in the dependency.

On the other hand, using the property method, the change is less invasive because the property can be set independently of when the object was constructed. However, it’s harder to see the dependencies for the class because now the properties are containing the dependencies. Also, it’s very easy to forget to set a property before using the object.
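The "forgot to set the property" failure can be made easier to diagnose with a guard. The sketch below is an assumption of mine rather than something from the example program: the class is named FileLogger (hypothetical, to avoid clashing with Ruby’s standard-library Logger) and the guard raises a descriptive error when filePath was never assigned.

```ruby
class FileLogger
  attr_accessor :filePath

  def log(content)
    # Fail with a descriptive message if the caller never set the property.
    raise ArgumentError, "filePath must be set before logging" if filePath.nil?
    File.open(filePath, "a") { |f| f.write(content) }
  end
end

logger = FileLogger.new
begin
  logger.log("4 + 3 = 7\n") # filePath was never assigned
rescue ArgumentError => e
  puts e.message # prints "filePath must be set before logging"
end
```

The guard does not remove the weakness of property injection; it only turns a confusing error from File.open into one that names the missing dependency.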

Both of these methods are valid, but when I’m working with DIP, I prefer to expose the dependencies via the constructor because if my class starts to gain more and more dependencies, then it’s a sign that my class is doing too much and violating the Single Responsibility Principle (SRP). You can say that DIP is the canary in the coal mine for SRP.
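One payoff of constructor injection is that a different logger can be dropped in without touching Calculator. The sketch below assumes a hypothetical InMemoryLogger and a trimmed-down Calculator that keeps only add; neither is part of the original program.

```ruby
# Hypothetical test double: records messages in memory instead of a file.
class InMemoryLogger
  attr_reader :entries

  def initialize
    @entries = []
  end

  def log(content)
    @entries << content
  end
end

# Trimmed-down Calculator that keeps only the add method.
class Calculator
  def initialize(logger)
    @logger = logger
  end

  def add(a, b)
    @logger.log("#{a} + #{b} = #{a + b}\n")
    a + b
  end
end

logger = InMemoryLogger.new
calc = Calculator.new(logger)
puts calc.add(4, 3)       # prints 7
puts logger.entries.first # prints "4 + 3 = 7"
```

Calculator only calls log on whatever it was given, so any object with that one method will do.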

TL;DR

In summary, the Dependency Inversion Principle (DIP) tells us that we should have the outside world pass in our dependencies. If we don’t follow this rule, then we will not be able to swap those dependencies out easily. In order to resolve violations, we need to determine what dependencies we have and modify our constructor to accept those dependencies. If our constructor becomes too large, then our class might be violating the Single Responsibility Principle. By following the DIP, we expose the dependencies our classes require and allow for greater decoupling. As always, don’t forget to refactor as you go along.

Establishing a SOLID Foundation Series

Beginner Basics: Establishing a SOLID Foundation – The Interface Segregation Principle

Welcome to the fourth installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the fourth part of SOLID, the Interface Segregation Principle and how by following this principle, you will write more robust code.

What is the Interface Segregation Principle?

The Interface Segregation Principle (ISP) tells us that clients should not be forced to use an interface that defines methods that they do not use. But what does this mean? Let’s say that we have the following ContactManager class and a ContactFinder class.

using System;
using System.Collections.Generic;

public class ContactManager
{
     private List<string> _names;

     public ContactManager()
     {
          _names = new List<string>();
          _names.Add("Cameron");
          _names.Add("Geoff");
          _names.Add("Phillip");
     }

     public void PrintNames()
     {
          foreach (var name in _names)
               Console.WriteLine(name);
     }

     public void SetNames(List<string> names)
     {
          foreach (var name in names)
               _names.Add(name);
     }

     public bool DoesNameExist(string name)
     {
          var results = _names.IndexOf(name);
          if (results != -1)
               return true;

          return false;
     }
}

public class ContactFinder
{
     private ContactManager _manager;

     public ContactFinder(ContactManager manager)
     {
         _manager = manager;
     }

     public void FindContacts(List<string> names)
     {
          foreach (var name in names)
          {
               if (_manager.DoesNameExist(name))
                    Console.WriteLine("Found " + name);
               else
                    Console.WriteLine("Couldn't find " + name);
          }
      }
}

So far, so good, the ContactManager is responsible for holding onto the list of names and some basic methods and the ContactFinder is responsible for determining which contacts are in our list.

However, there is a problem with this example code. We have this ContactManager class that has a lot of methods defined, but the only method that ContactFinder requires is the DoesNameExist method.

Another concept ISP teaches in a roundabout fashion is that objects should depend on interfaces, not concrete classes. Coding to the interface makes it much easier to switch out dependencies.

What’s the big deal, I don’t see what the issue is

The big issue that comes up is that it’s hard to figure out which methods the ContactFinder actually needs. We say that it needs a ContactManager, which would lead us to assume that it needs all of those methods. However, that’s not the case. So by violating ISP, it’s easy to make the wrong design decision.

Another issue arises when it comes to creating the interface for the dependency. If we assume that all methods of the ContactManager are required, then any class that implements that interface has to also implement those unneeded methods. How many times have you seen an object implement an interface with a lot of methods that did nothing or just threw exceptions?

public interface BloatedInterface
{
     void SetContent();
     bool IsContentSet();
     void RemoveContent(string content);
     void AddContent(string content);
     void ImportantMethod();
}

public class BloatedObject : BloatedInterface
{
     public void SetContent()
     {
     }
     public bool IsContentSet()
     {
          throw new NotImplementedException();
     }
     public void RemoveContent(string content)
     {
          throw new NotImplementedException();
     }
     public void AddContent(string content)
     {
          throw new NotImplementedException();
     }
     public void ImportantMethod()
     {
     }
}

Fixing the issue

Alright, alright, I hear you say, you’ve convinced me, how do I fix this problem? The steps are simple:

  1. First, you need to identify which methods the client needs
  2. Next, you need to create an interface that contains the methods that the client uses
  3. After creating the interface, have the dependency implement the interface
  4. Finally, change the client’s signature so that it uses the interface instead of the concrete class

Using our code base, first, we look at the ContactFinder class and see which methods from ContactManager it uses.

So far, it looks like it only needs the DoesNameExist method. So let’s create an interface, called IContactSearcher, that contains that single method.

public interface IContactSearcher
{
     bool DoesNameExist(string name);
}

Now that we’ve extracted the interface, it’s time to have the ContactManager class implement the interface:

public class ContactManager : IContactSearcher
{
     private List<string> _names;

     public ContactManager()
     {
          _names = new List<string>();
          _names.Add("Cameron");
          _names.Add("Geoff");
          _names.Add("Phillip");
     }

     public void PrintNames()
     {
          foreach (var name in _names)
               Console.WriteLine(name);
     }

     public void SetNames(List<string> names)
     {
          foreach (var name in names)
               _names.Add(name);
     }

     public bool DoesNameExist(string name)
     {
          var results = _names.IndexOf(name);
          if (results != -1)
               return true;

          return false;
     }
}

Finally, we update the references in ContactFinder to use the IContactSearcher interface instead of the concrete class ContactManager.

public class ContactFinder
{
     private IContactSearcher _searcher;

     public ContactFinder(IContactSearcher searcher)
     {
         _searcher = searcher;
     }

     public void FindContacts(List<string> names)
     {
          foreach (var name in names)
          {
               if (_searcher.DoesNameExist(name))
                    Console.WriteLine("Found " + name);
               else
                    Console.WriteLine("Couldn't find " + name);
          }
     }
}

With this last step, we’ve now resolved the ISP violation.
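For comparison with the Ruby posts in this series, here is a rough sketch of the same idea in Ruby. Ruby has no interface keyword, so the "interface" is simply the one method the client calls. All names here (FixedListSearcher, name_exists?, find_contacts) are hypothetical ports, and find_contacts returns lines instead of printing them so it is easier to check.

```ruby
# The client depends only on an object that answers name_exists?.
class ContactFinder
  def initialize(searcher)
    @searcher = searcher
  end

  # Returns one line per name instead of printing directly.
  def find_contacts(names)
    names.map do |name|
      @searcher.name_exists?(name) ? "Found #{name}" : "Couldn't find #{name}"
    end
  end
end

# A stand-in that implements only the method the client needs.
class FixedListSearcher
  def initialize(names)
    @names = names
  end

  def name_exists?(name)
    @names.include?(name)
  end
end

finder = ContactFinder.new(FixedListSearcher.new(["Cameron", "Geoff"]))
puts finder.find_contacts(["Cameron", "Zed"])
# prints:
# Found Cameron
# Couldn't find Zed
```

The stand-in never had to provide anything like PrintNames or SetNames, which is the practical benefit of depending on the smallest interface.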

TL;DR

In summary, the Interface Segregation Principle (ISP) tells us that we should use interfaces instead of concrete classes for our dependencies and that we should use the smallest interface our client needs to work. If we don’t follow these rules, then it’s easy to create bloated interfaces that hurt readability. In order to resolve violations, we need to determine which methods our client requires and code an interface that contains those methods. Finally, we have our class implement that interface and pass it to the client. By following the ISP, we reduce the complexity required by our code and reduce the coupling between client and dependency. As always, don’t forget to refactor as you go along.

Beginner Basics: Establishing a SOLID Foundation – The Liskov Substitution Principle

Welcome to the third installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the third part of SOLID, the Liskov Substitution Principle and how following this principle can lead to loose coupling of your code.

So what is the Liskov Substitution Principle?

Before diving into the Liskov Substitution Principle (LSP), let’s look at a code example demonstrating the violation.

Let’s say that we have a Rectangle class:

# A Rectangle can have height and width
# set to any value
class Rectangle
  def height=(height)
      @height = height
  end
  def width=(width)
      @width = width
  end
  def height
      @height
  end
  def width
      @width
  end
  def area
      height * width
  end
end

If we run the following implementation, it’s pretty clear that it works like we would expect:

rect = Rectangle.new
rect.height = 5
rect.width = 6
puts rect.area # => 30

Seems pretty simple, we have a Rectangle class with two public properties, height and width and the class behaves the way we would expect.

Now let’s add a new class, called Square. Since all Squares are also Rectangles, it makes sense that the Square class should inherit from the Rectangle class. However, since Squares have to maintain the same height and width, we need to add some additional logic for that:

# A Square must maintain the same height and width
class Square < Rectangle
  def height=(height)
      @height = height
      @width = height
  end
  def width=(width)
      @width = width
      @height = width
  end
end

Using a Square instead of a Rectangle and running the same input, we get the following output:

square = Square.new
square.height = 10
square.width = 5
puts square.area # => 25

Hold up, why is the area 25? If I read this code, then the height should be 10 and the width should be 5, giving an area of 50. This is no longer the case because of the domain constraint caused by the Square class. By using a Square where the code expected a Rectangle, we get different behavior than we would expect. This is the heart of the Liskov Substitution Principle.

In short, the Liskov Substitution Principle states that if we have an object (Rectangle) in our code and it works correctly, then we should be able to use any sub-type (Square) without the results being modified.
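One way to see the violation is to write a small piece of client code against Rectangle and then hand it a Square. The sketch below restates both classes compactly (using attr_accessor in place of the hand-written accessors) so that it runs on its own; resize_and_measure is a hypothetical helper, not part of the original example.

```ruby
class Rectangle
  attr_accessor :height, :width

  def area
    height * width
  end
end

class Square < Rectangle
  def height=(value)
    @height = value
    @width = value
  end

  def width=(value)
    @width = value
    @height = value
  end
end

# Client code written against Rectangle: after setting both sides,
# it expects the area to be their product.
def resize_and_measure(shape, height, width)
  shape.height = height
  shape.width = width
  shape.area
end

puts resize_and_measure(Rectangle.new, 10, 5) # prints 50
puts resize_and_measure(Square.new, 10, 5)    # prints 25, not the 50 the caller expects
```

The helper is correct for every Rectangle, yet swapping in the sub-type changes its result, which is what LSP forbids.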

The most common LSP violations occur when the “is-a” phrase from Object-Oriented Design breaks down. In the Rectangle-Square example, we say that a Square “is-a” Rectangle, which is true. However, when we convert that relationship to code and use inheritance, the relationship does not hold up.

I don’t know, this sounds confusing, what’s the point?

To me, the Liskov Substitution Principle is the hardest part of SOLID to understand. It’s heavy on the theoretical and it’s not blatantly obvious when a violation has occurred until testing.

However, there are plenty of benefits of following LSP.

First, following LSP reduces the tight coupling involved in your code. Let’s look back at our Recipes class from the Open/Closed Principle post and examine the MakeOrder method:

class Recipes
     def initialize
          @recipes = {}
          @recipes[RecipeNames::ChickenWithBroccoli] = ChickenWithBroccoli.new()
          @recipes[RecipeNames::SteakWithPotatoes] = SteakWithPotatoes.new()
          @recipes[RecipeNames::PorkWithApples] = PorkWithApples.new()
     end
     def MakeOrder(order)
          recipe = @recipes[order]
          if recipe == nil
               puts "Can't cook " + order
          else
               recipe.Cook()
          end
     end
end

In this class, you see that we load different recipes and when one’s requested, we call the Cook method. We don’t have to do any set-up, special handling, or other logic, we just trust that the Cook method for whatever recipe we choose works as expected. By following this design, code will be easier to read and to maintain.

Going back to our Square/Rectangle example, if we wanted a method that would return a new Square or Rectangle, it would have to look something like this:

def CreateShape(classType, height, width)
     shape = nil
     if classType == "Rectangle"
          shape = Rectangle.new
          shape.height = height
          shape.width = width
     else
          shape = Square.new
          shape.height = height
     end
     return shape
end

This code works, but there is one major problem. When someone is looking at this code, they’re going to get confused about why the Rectangle and Square are set up differently.

For example, when I see that the Square’s height is being set, but not the width, my first thought is that this is a bug. Then, I’d have to look into the Square’s class definition and then I’d see the logic of where setting the height also sets the width.

Long story short, by identifying and resolving LSP violations, we can make the code easier to read and maintain.

So it looks like LSP is pretty useful, but how do I fix violations?

Now that we’ve talked about spotting LSP violations and why it’s important to follow LSP, let’s discuss how to fix the violations.

To be honest, fixing an LSP violation is not easy. Since the problem is caused by a broken abstraction, discarding the abstraction is the best option. However, if you absolutely need to use the abstraction, then one solution is to remove the method that causes the violation.

In the Square/Rectangle example, we would remove the setters for height and width from our Rectangle class because that is how the violation can occur. After removing the setters, we need to modify the initialize method of Square to only take one parameter, size, and send that twice to the Rectangle class. Now, our classes look something like this:

# A Rectangle can have height and width
# set to any value
class Rectangle
  def initialize(height, width)
      @height = height
      @width = width
  end
  def height
      @height
  end
  def width
      @width
  end
  def area
      height * width
  end
end

# A Square must maintain the same height and width
class Square < Rectangle
  def initialize(size)
      super(size, size)
  end
end

With a sample implementation and output:

rect = Rectangle.new(10, 5)
puts rect.area # => 50

square = Square.new(5)
puts square.area # => 25
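To check that the setter-free design really does substitute cleanly, the sketch below passes both shapes through the same read-only client code. The classes are restated with attr_reader so the block runs on its own, and describe is a hypothetical helper added for illustration.

```ruby
class Rectangle
  attr_reader :height, :width

  def initialize(height, width)
    @height = height
    @width = width
  end

  def area
    height * width
  end
end

class Square < Rectangle
  def initialize(size)
    super(size, size)
  end
end

# Client code that only reads from a Rectangle.
def describe(shape)
  "#{shape.height} x #{shape.width} = #{shape.area}"
end

puts describe(Rectangle.new(10, 5)) # prints "10 x 5 = 50"
puts describe(Square.new(5))        # prints "5 x 5 = 25"
```

With no setters left, there is no operation a caller can perform on a Rectangle that a Square answers differently.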

TL;DR

In short, the Liskov Substitution Principle (LSP) enforces the idea that if a class has a sub-type (through inheritance or interfaces), then by passing the sub-type, the program should still produce the same results. If you run across a class that violates LSP, then you know that your abstraction is not complete and you can either:

  • Remove the offending methods/properties or
  • Abandon the abstraction

As always, don’t forget to refactor and reorganize your code as needed.

Beginner Basics: Establishing a SOLID Foundation - The Open/Closed Principle

Welcome to the second installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the second part of SOLID, the Open/Closed Principle and how following this principle can lead to great design choices.

So what is the Open/Closed Principle?

In order to set the context for discussion, let’s revisit our last example of the Chef class:

class Chef
  def CookFood(order, tableNumber)
    if order == "chicken with broccoli"
      CookChickenWithBroccoli()
    end
  end

  def CookChickenWithBroccoli
    puts "Cooked chicken with broccoli"
  end
end

So it looks like this Chef is pretty simple, it only has one public method of CookFood and he can only cook ChickenWithBroccoli. However, a Chef that can only cook one thing isn’t very useful. So how about we add some more menu items?

class Chef
  def CookFood (order)
    if order == "chicken with broccoli"
      CookChickenWithBroccoli()
    elsif order == "steak with potatoes"
      CookSteakWithPotatoes()
    elsif order == "pork with apples"
      CookPorkWithApples()
    else
      puts "Don't know how to cook " + order
    end
  end

  def CookChickenWithBroccoli
    puts "Cooked chicken with broccoli"
  end

  def CookSteakWithPotatoes
    puts "Cooked steak with potatoes"
  end

  def CookPorkWithApples
    puts "Cooked pork with apples"
  end
end

So our new Chef can cook more food, but the code base expanded quite a bit. In order to add more menu choices, we need to add an additional if check in CookFood and to define a new method for CookFood to call. This might not sound like a lot of work because each of these Cook* methods just print something to the screen. However, what if the steps to create each food item were much more complex?

def CookChickenWithBroccoli
  CookChicken()
  CookBroccoli()
end

def CookChicken
  print "Cooked chicken "
end

def CookBroccoli
  puts "with broccoli"
end

Also, what if we modified how the CookChickenWithBroccoli method worked? We would need to modify the Chef class, but that doesn’t make sense. In the real world, we would modify the recipe and the Chef would then follow the new recipe. This concept that we would have to modify an unrelated object in order to add new functionality is the inspiration for the Open/Closed Principle.

In short, the Open/Closed Principle means that an object should be Open for extension, but Closed for modification. This principle relies on the idea that new functionality is added by creating new classes, not by modifying pre-existing classes’ behavior. By doing this, we’re decreasing the chances of breaking current functionality and increasing code stability.

This sounds good, but is it worth the additional design time?

Now that we’ve discussed the Open/Closed Principle, you might be wondering what some of the benefits are of cleaning up this design.

First, classes that follow the Open/Closed Principle tend to be small and focused, which plays off the idea of the Single Responsibility Principle. Looking back at our Chef class, it’s very clear that as we keep adding new functionality, Chef is going to be handling way too many things.

Next, by following OCP, there won’t be multiple classes modified just to add new functionality. There’s nothing like a change set containing tons of modified files to make even the most experienced developer shudder in fear.

By definition of OCP, we won’t be modifying a lot of files (ideally only one file should be modified) and we’re adding new classes. Since we’re adding in these new classes, we inherently have the opportunity to bake in testing.

Alright, I get it, OCP is awesome, but how do I refactor the Chef class?

In order to fix the violation, we’re going to turn each menu item into its own class:

class ChickenWithBroccoli
  def initialize
    @name = "Chicken with Broccoli"
  end

  def Cook
    CookChicken()
    CookBroccoli()
  end

  def CookChicken
    print "Cooked chicken "
  end

  def CookBroccoli
    puts "with broccoli"
  end
end

class SteakWithPotatoes
  def initialize
    @name = "Steak with Potatoes"
  end

  def Cook
    CookSteak()
    CookPotatoes()
  end

  def CookSteak
    print "Cooked steak "
  end

  def CookPotatoes
    puts "with potatoes"
  end
end

class PorkWithApples
  def initialize
    @name = "Pork with Apples"
  end

  def Cook
    CookPork()
    CookApples()
  end

  def CookPork
    print "Cooked pork "
  end

  def CookApples
    puts "with apples"
  end
end
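
Earlier we said that adding new classes gives us the opportunity to bake in testing. As a rough sketch of what that could look like, the snippet below exercises ChickenWithBroccoli on its own by capturing what it prints. The capture_output helper is a hypothetical name introduced for this illustration (it is not part of the original code), and a real project would more likely use a test framework.

```ruby
require "stringio"

class ChickenWithBroccoli
  def initialize
    @name = "Chicken with Broccoli"
  end

  def Cook
    CookChicken()
    CookBroccoli()
  end

  def CookChicken
    print "Cooked chicken "
  end

  def CookBroccoli
    puts "with broccoli"
  end
end

# Hypothetical helper: temporarily swaps $stdout for a StringIO
# so we can inspect what a block printed.
def capture_output
  original = $stdout
  $stdout = StringIO.new
  yield
  $stdout.string
ensure
  $stdout = original
end

output = capture_output { ChickenWithBroccoli.new.Cook() }
raise "unexpected output: #{output.inspect}" unless output == "Cooked chicken with broccoli\n"
puts "ChickenWithBroccoli cooks as expected"
```

Because the menu item no longer lives inside the Chef, the check doesn't need a Chef at all.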

Now that we have these different classes, we need to come up with some way for our Chef to interact with them. So why don’t we organize these menu items into a Recipes class?

class Recipes
  def initialize
      @recipes = {}
      @recipes[:chicken] = ChickenWithBroccoli.new()
      @recipes[:steak] = SteakWithPotatoes.new()
      @recipes[:pork] = PorkWithApples.new()
  end

  def MakeOrder(order)
      recipe = nil
      if order == "chicken with broccoli"
          recipe = @recipes[:chicken]
      elsif order == "steak with potatoes"
          recipe = @recipes[:steak]
      elsif order == "pork with apples"
          recipe = @recipes[:pork]
      end
      if recipe == nil
          puts "Can't cook " + order
      else
          recipe.Cook()
      end
  end
end

Now we have a Recipes class that contains all of the menu items for our Chef to use. To add a new menu item to Recipes, we add the class in the initialize method and an additional if check in the MakeOrder method. “But hold on,” I hear you say, “this is the same as what we had with the Chef at the beginning. Why is this design better?” Before, we had to modify the Chef in order to add more menu items, which doesn’t really make sense. Now that logic lives in Recipes, and it makes sense that Recipes needs to change when a new menu item is added.

On the topic of our Chef, after cleaning up to use the Recipes class, our Chef is simpler and relies on Recipes for the menu items, not itself:

class Chef
  def initialize
    @recipes = Recipes.new()
  end

  def CookFood(order)
    @recipes.MakeOrder(order)
  end
end

Now that we’ve fixed the violation, let’s go ahead and refactor some more. Looking at the menu choices, it’s pretty clear that we can abstract the shared behavior into a base class called MenuItem (Note: because the base Cook raises an exception, any child class that doesn’t provide its own implementation fails as soon as Cook is called).

class MenuItem
  def initialize(name)
      @name = name
  end

  def Cook
      raise "This should be overridden in child class"
  end
end
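
To see what the note about raising an exception means in practice, here is a minimal sketch. MysteryDish is a hypothetical class made up for this illustration; it extends MenuItem but forgets to define Cook, so calling Cook falls through to the base class and raises.

```ruby
class MenuItem
  def initialize(name)
    @name = name
  end

  def Cook
    raise "This should be overridden in child class"
  end
end

# Hypothetical menu item that forgets to provide its own Cook.
class MysteryDish < MenuItem
  def initialize
    super("Mystery Dish")
  end
end

begin
  MysteryDish.new.Cook()
rescue RuntimeError => e
  # raise with a plain string produces a RuntimeError
  puts "Caught: #{e.message}"
end
# prints "Caught: This should be overridden in child class"
```

Note that the mistake only shows up when Cook is actually called, not when the class is defined.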

Also, as part of this refactoring, we’re going to move some of the strings into constants as part of the RecipeNames module so that the Chef and Recipes can communicate with one another:

module RecipeNames
  ChickenWithBroccoli = "Chicken with Broccoli"
  SteakWithPotatoes = "Steak with Potatoes"
  PorkWithApples = "Pork with Apples"
end

With these additions, let’s update the menu choices to use the module and the MenuItem base class:

class ChickenWithBroccoli < MenuItem
  def initialize
      super(RecipeNames::ChickenWithBroccoli)
  end

  def Cook
      CookChicken()
      CookBroccoli()
  end

  def CookChicken
      print "Cooked chicken "
  end

  def CookBroccoli
      puts "with broccoli"
  end
end

class SteakWithPotatoes < MenuItem
  def initialize
      super(RecipeNames::SteakWithPotatoes)
  end

  def Cook
      CookSteak()
      CookPotatoes()
  end

  def CookSteak
      print "Cooked steak "
  end

  def CookPotatoes
      puts "with potatoes"
  end
end

class PorkWithApples < MenuItem
  def initialize
      super(RecipeNames::PorkWithApples)
  end

  def Cook
      CookPork()
      CookApples()
  end

  def CookPork
      print "Cooked pork "
  end

  def CookApples
      puts "with apples"
  end
end

With these changes, we need to update the Recipes class to use the RecipeNames module:

class Recipes
  def initialize
      @recipes = {}
      @recipes[RecipeNames::ChickenWithBroccoli] = ChickenWithBroccoli.new()
      @recipes[RecipeNames::SteakWithPotatoes] = SteakWithPotatoes.new()
      @recipes[RecipeNames::PorkWithApples] = PorkWithApples.new()
  end

  def MakeOrder(order)
      recipe = @recipes[order]
      if recipe == nil
          puts "Can't cook " + order
      else
          recipe.Cook()
      end
  end
end
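
One thing worth noticing about the hash lookup: the order string now has to match the RecipeNames constant exactly, including capitalization. The trimmed-down sketch below keeps only one menu item, and its Cook is simplified to a single puts for brevity. It shows that the lowercase strings from the earlier if/elsif version would no longer be found.

```ruby
module RecipeNames
  ChickenWithBroccoli = "Chicken with Broccoli"
end

# Simplified stand-in for the real menu item (no MenuItem base class here).
class ChickenWithBroccoli
  def Cook
    puts "Cooked chicken with broccoli"
  end
end

class Recipes
  def initialize
    @recipes = {}
    @recipes[RecipeNames::ChickenWithBroccoli] = ChickenWithBroccoli.new()
  end

  def MakeOrder(order)
    recipe = @recipes[order]
    if recipe == nil
      puts "Can't cook " + order
    else
      recipe.Cook()
    end
  end
end

recipes = Recipes.new
recipes.MakeOrder(RecipeNames::ChickenWithBroccoli) # prints "Cooked chicken with broccoli"
recipes.MakeOrder("chicken with broccoli")          # prints "Can't cook chicken with broccoli"
```

Ordering through the RecipeNames constants, rather than typing the strings by hand, avoids this kind of mismatch.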

With this current layout, if we needed to add another menu item (let’s say Fish and Chips), we would need to:

  1. Create a new class that extends MenuItem called FishAndChips
  2. Add another string constant to RecipeNames
  3. Add another line in the Recipes initialize method to add it to the hash
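
The three steps above could be sketched roughly as follows. The FishAndChips class body (CookFish, CookChips and what they print) is my own guess at what such a menu item might look like, and the existing menu items are left out to keep the example short.

```ruby
class MenuItem
  def initialize(name)
    @name = name
  end

  def Cook
    raise "This should be overridden in child class"
  end
end

module RecipeNames
  # ...existing constants omitted for brevity...
  FishAndChips = "Fish and Chips" # step 2: new string constant
end

# Step 1: new class that extends MenuItem (method bodies are assumed)
class FishAndChips < MenuItem
  def initialize
    super(RecipeNames::FishAndChips)
  end

  def Cook
    CookFish()
    CookChips()
  end

  def CookFish
    print "Cooked fish "
  end

  def CookChips
    puts "with chips"
  end
end

class Recipes
  def initialize
    @recipes = {}
    # ...existing menu items omitted for brevity...
    @recipes[RecipeNames::FishAndChips] = FishAndChips.new() # step 3
  end

  def MakeOrder(order)
    recipe = @recipes[order]
    if recipe == nil
      puts "Can't cook " + order
    else
      recipe.Cook()
    end
  end
end

# Chef is unchanged from the version shown earlier.
class Chef
  def initialize
    @recipes = Recipes.new()
  end

  def CookFood(order)
    @recipes.MakeOrder(order)
  end
end

Chef.new.CookFood(RecipeNames::FishAndChips) # prints "Cooked fish with chips"
```

The Chef class and the existing menu item classes are not touched at all; only RecipeNames and Recipes gain a line each.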

TL;DR

In short, the Open/Closed Principle (OCP) reinforces the idea that every class should be open for extension and closed for modification. By following this principle, you’re much more likely to create decoupled code that lets you add functionality while decreasing the odds of breaking what already works. If you run across a class that is doing way too much, use the Single Responsibility Principle to separate the classes and then use a new object that serves as the middle man. In our case, the Recipes class was the middle man between the Chef and the different menu items. As always, don’t forget to refactor and reorganize your code as needed.

Establishing a SOLID Foundation Series