Skip to content

Index

Today I Learned – How to Break Down A Massive Method

During this past week, I was working with our intern and showing him some cool debugging tricks when I came across a massive method. I gave him 30 seconds to look at it and tell me what he thought the method was doing. After a while, he was able to figure it out, but the fact that it wasn’t easy discernible was enough to give pause.

The lesson here is that if you can’t determine what the method is doing easily, then it’s probably doing way too much (violating the Single Responsibility Principle) and needs to be broken into more easily readable pieces.

To demonstrate what I mean, I wrote a program that inserts Messages into a database. A Message contains a description, the number (for identification when troubleshooting) and the module. We would have issues where different messages would have the same number which would cause confusion when troubleshooting errors.

In the program I wrote, the user provides the message and what module the message belongs to and the program automatically generates the message number and inserts the message into the database.

For brevity’s sake, shown below is the logic for determining what the next message number should be.

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");
  if (_connection == null)
    _connection = CreateConnection();

  var results = new List<int>();

  _connection.Open();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split( );

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  cmd = new SqlCommand("dbo.GetErrorData";, _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split( );

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

The method itself isn’t complex, just calling some stored procedures, parsing the output and adding the number to a list. However, it’s not abundantly clear what the purpose of the calling the stored procedures.

First, it looks like we’re reading the alerts error numbers from a stored procedure call, why don’t we extract that logic out to a helper method and have the public method call the helper?

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException(&amp;amp;quot;module cannot be null or empty&amp;amp;quot;);

  if (_connection == null)
    _connection = CreateConnection();

  var results = new List<int>();

  _connection.Open();
  results.AddRange(ReadAlerts(module.ToUpper()));

  var cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER&"].ToString().Split( );

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
  }

  private List<int> ReadAlerts(string module)
  {
    var results = new List<int>();
    var cmd = new SqlCommand("dbo.GetAlertData", _connection);
    cmd.CommandType = CommandType.StoredProcedure;

    var reader = cmd.ExecuteReader();
    while (reader.Read())
    {
      if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
        continue;

      var pieces = reader["ALERT_ID_NUMBER"].ToString().Split( );
      results.Add(Int32.Parse(pieces[1]));
    }
    if (reader != null)
    reader.Close();

    return results;
}

By doing this, we fix two issues at once. First, we’ve given a name to the process of reading the alerts which in turns allows us to quickly understand what the public method should be doing (i.e. improved readability).

Second, it allows us for easier debugging because we now have smaller components. For example, let’s say that we were getting the wrong value. In the first implementation, we would have to put breakpoints in different areas trying to determine which part was broken. However, in the new form, we can check to see if ReadAlerts is behaving correctly. If it isn’t, we now know the bug has to be in that method, otherwise, it’s in the rest.

For the next step, you may have noticed that we can repeat the same refactoring trick again, except this time, we can extract the logic for reading the errors into a helper method.

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");
  if (_connection == null)
    _connection = CreateConnection();

  _connection.Open();

  var results = new List<int>();
  results.AddRange(ReadAlerts(module.ToUpper()));
  results.AddRange(ReadErrors(module.ToUpper()));

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

private List<int> ReadAlerts(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split( );
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

private List<int> ReadErrors(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split( );
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

After the changes, anyone who looks at the public API can easily see that it’s reading from both Alerts and Errors. This is really powerful because now you can communicate with non-technical people about requirements and what the code is doing.

Let’s say that in the future, the requirements change and this conversation plays out:

Alice (QA, finding an error) – Hey Bob, I was running one of our test plans and it looks like that we’re getting the wrong message number if we’re trying to add a new message and there are warning messages in the database. We are including the warning table when figuring that out, right?

Bob (Engineer, finding the root cause) – Hey you’re right, it looks like we’re only using the alerts and error tables when calculating the next number. Why don’t we write up a ticket for that and get a fix in?

The key point is that no matter how large a method is, there always have to be steps being performed in some order (by definition of an algorithm) and this is the key to refactoring larger methods into more maintainable pieces of code. The trick is determining what those steps are and making decisions on whether to make helper methods or helper classes.

If those steps become complicated, then they should be broken out into helper methods. As time progresses and those helper methods start to become more complicated, then those helper methods should in turn become classes of their own.

Today I Learned: The Law of Demeter

Don’t pass in more information that you need. It sounds simple, but when working with messy legacy code, it’s easy to forget.

The impetus for this post came from a peer code review. During the review, I found this method:

1
2
3
4
5
6
7
8
9
public IStrategy GetStrategy(Project project, bool isAffected)
{
  var type = project.Type;
  if (type == ProjectType.A && isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

At first glance, it looks pretty good. Logic was sound and it seemed to be returning a class implementing an interface similar to what we would expect for the Factory pattern. However, there’s a slight problem, can you spot it?

The issue is with the first parameter, Project. The method takes a Project, however, we’re only really depending on the Project’s Type property.

1
2
3
4
5
6
7
8
9
public IStrategy GetStrategy(Project project, bool isAffected)
{
  var type = project.Type;
  if (type == ProjectType.A && isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

So why don’t we get rid of the dependency on the Project and instead replace it with the dependency on the ProjectType instead?

1
2
3
4
5
6
7
8
public IStrategy GetStrategy(ProjectType type, bool isAffected)
{
  if (type == ProjectType.A &amp;&amp; isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

Instinctual, I knew this was the right call, but I couldn’t remember why I knew it was a good choice. After some digging, I remembered that this is a Law of Demeter violation, or better known as the Principle of Least Knowledge violation.

In general, this principle states that a method should have the least amount of information it needs to do it’s job. Other classic violations of this principle is when you use a class’s internals internals. For example,

SomeClassA.SomePropertyB.WithSomeMethodC()

One of the reasons that I really like the Law of Demeter is that if you follow it, you create easier to test methods. Don’t believe me? Which is easier to create, the Project class (which may have many dependencies that would need to be stubbed) or the ProjectType enum (which by definition has zero dependencies)?

Another reason that following the Law of Demeter is good practice is that it forces your code to be explicit about what dependencies are required. For example, in the first implementation, the caller knew that the method needed a Project, but had no clue on how much of the Project it needed (does it need all of the properties set? Does it need further setup besides basic instantiation?). However, with the refactored version, now it’s much clearer that the method has a looser dependency on not Project, but just the ProjectType.

Beginner Basics: Establishing a SOLID Foundation – The Dependency Inversion Principle

Welcome to the final installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the fifth part of SOLID, the Dependency Inversion Principle.

What is the Dependency Inversion Principle?

When working with object-oriented languages, we take large problems and break them down into smaller pieces. These smaller pieces in turn are broken down into even smaller, more manageable pieces to work on. As part of the breaking down process, we inherently have to introduce dependencies between the larger pieces and the smaller pieces.

How we weave these dependencies together is the difference between easily changing behavior and spending the next week pulling your hair out.

When working with classes, dependencies are usually introduced by constructing them in the class that they’re used in. For example, let’s say that we’ve been asked to write an utility that emulates an calculator but it also keeps a transaction log for record keeping purposes.

class Logger
  def log (content)
    File.open("C:\\temp\\results.txt", 'a') {|f| f.write(content)}
  end
end

class Calculator
  def initialize
      @logger = Logger.new()
  end
  def add (a, b)
      log(a, b, "+")
      return a + b
  end
  def sub (a, b)
      log(a, b, "-")
      return a - b
  end
  def mult (a, b)
      log(a,b,"*")
      return a * b
  end
  def div (a, b)
      log(a,b,"/")
      return a.to_f / b
  end

  def log(a, b, sym)
      text = a.to_s + " " + sym + " " + b.to_s + " = "
      if sym == "+"
            text += (a + b).to_s
      elsif sym == "-"
            text += (a-b).to_s
      elsif sym == "*"
            text += (a*b).to_s
      else
            text += (a.to_f/b).to_s
      end
      text += "\n"
      @logger.log(text)
  end
end

# Usage
calc = Calculator.new()
puts calc.add(4,3)
puts calc.sub(2,1)
puts calc.mult(100,2)
puts calc.div(5,2)

So far so good, we have two classes (Logger and Calculator) that is responsible for logging and the calculations. Even though this is a small amount of code (~50 lines for both classes), there are three dependencies in the code. The first dependency that I notice is that Calculator depends on Logger. We can see this by looking at the initialize method for Calculator (as hinted above):

1
2
3
4
5
class Calculator
  def initialize
    @logger = Logger.new("C:\\temp\\results.txt")
  end
end

The second dependency is a bit trickier to find, but in the Logger class’ log method, we use a hard coded file path.

1
2
3
4
5
class Logger
  def log (content)
      File.open("C:\\temp\\results.txt", 'a') {|f| f.write(content)}
  end
end

The third dependency is probably the hardest to find, but in the the Logger class’ log method, we are also depending on the file system for the machine by using Ruby’s File class. But wait a minute, I hear you say, why is the File class considered a dependency, that’s a part of the Ruby language? I agree with you, but it’s still a dependency in the code and something that we should keep in mind.

From these three dependencies, we’re going to focus on resolving the first two. We could make resolve the dependency on the file system, but it would take so much effort for so little gain.

Why don’t we resolve the file dependency issue?

One thing that I keep in mind when identifying which dependencies to invert is to focus on inverting dependencies outside of the framework. In this case, we’re going to ignore the file system dependency because I, the programmer, depend on Ruby to behave correctly. If Ruby stops working, then a broken file system is the least of my worries. Therefore, it’s not worth the effort to resolve.

Making the Calculator and Logger more flexible

In order to resolve these DIP violations, we need to expose ways to drop in these dependencies. There are two ways of doing this. We can either:

  • Expose the dependency via the constructor
  • Expose the dependency via a public property

Using the Calculator example, we can expose the logger dependency via the constructor and we can expose the filePath dependency by exposing it as a property.

For the Calculator class, I’m going to first change the initialize method so that it takes a logger instead of constructing it’s own.

1
2
3
4
5
class Calculator
  def initialize(logger)
    @logger = logger
  end
end

Next, I will construct the logger in the usage and pass the logger as part of the constructor.

1
2
3
# Usage
logger = Logger.new()
calc = Calculator.new(logger)

A quick run of our program tells us that everything is still working correctly.

Now that we’ve finished up exposing the logger dependency, it’s time to expose the file path. First, I’m going to add a public property on the Logger class call filePath and use that property in the log method

1
2
3
4
5
6
class Logger
  def log (content)
      File.open(filePath, 'a') {|f| f.write(content)}
  end
  attr_accessor :filePath
end

Now that we’ve introduced a seam for the file path dependency, we use that seam in the program and assign the property```ruby

1
2
3
4
# Usage
logger = Logger.new()
logger.filePath = "C:\\temp\\results.txt"
calc = Calculator.new(logger)

Multiple ways of solving the issue, which one is best?

When using the constructor method, it’s very clear to see what dependencies the class has, just look at the constructor. However, adding a new dependency to the constructor may cause other code to fail because the signature of the constructor has changed. This in turn can lead to cascading changes where multiple places of code need to be updated to pass in the dependency.

On the other hand, using the property method, the change is less invasive because the property can be set independently of the when the object was constructed. However, it’s harder to see the dependencies for the class because now the properties are containing the dependencies. Also, it’s very easy to forget to set a property before using the object.

Both of these methods are valid, but when I’m working with DIP, I prefer to expose the dependencies via the constructor because if my class starts to gain more and more dependencies, then it’s a sign that my class is doing too much and violating the Single Responsibility Principle (SRP). You can say that DIP is the canary in the coal mine for SRP.

TL;DR

In summary, the Dependency Inversion Principle (DIP) tells us that we should we should have the outside world pass in our dependencies. If we don’t follow this rule, then we will not. In order to resolve violations, we need to determine what dependencies we have and modify our constructor to accept those dependencies. If our constructor becomes too large, then our class might be violating the Single Responsibility Principle. By following the DIP, we expose the dependencies our classes require and allows for greater decoupling. As always, don’t forget to refactor as you go along.

Establishing a SOLID Foundation Series

Beginner Basics: Establishing a SOLID Foundation – The Interface Segregation Principle

Welcome to the fourth installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the fourth part of SOLID, the Interface Segregation Principle and how by following this principle, you will write more robust code.

What is the Interface Segregation Principle?

The Interface Segregation Principle (ISP) tell us that clients should not be forced to use an interface that defines methods that it does not use. But what does this mean? Let’s say that we have the following ContactManager class and a ContactFinder class.

public class ContactManager
{
     private List _names;

     public ContactManager()
     {
          _names = new List();
          _names.Add("Cameron");
          _names.Add("Geoff");
          _names.Add("Phillip");
     }

     public void PrintNames()
     {
          foreach (var name in _names)
              Console.WriteLine(name);
     }

     public void SetNames(List names)
     {
          foreach (var name in names)
               _names.Add(name);
     }

    public bool DoesNameExist(string name)
    {
         var results = _names.IndexOf(name);
         if (results != -1)
              return true;

         return false;
     }
}

public class ContactFinder
{
     private ContactManager _manager;

     public ContactFinder(ContactManager manager)
     {
         _manager = manager;
     }

     public void FindContacts(List names)
     {
          foreach (var name in names)
          {
               if (_manager.DoesNameExist(name))
                    Console.WriteLine("Found " + name);
               else
                    Console.WriteLine("Couldn't find " + name);
          }
      }
}

So far, so good, the ContactManager is responsible for holding onto the list of names and some basic methods and the ContactFinder is responsible for determining which contacts are in our list.

However, there is a problem with this example code. We have this ContactManager class that has a lot methods defined, but the only method that’s required is the DoesNameExist method.

Another concept ISP tells in a roundabout fashion is that objects should depend on interfaces, not concrete classes. This allows us to switch out dependencies much easier when we code to the interface.

What’s the big deal, I don’t see what the issue is

The big issue that comes up is that it’s hard to figure out what methods that the ContactFinder actually needs. We say that it needs a ContactManager which would lead us to assume that it needs all of those methods. However, that’s not the case. So by violating ISP, it’s easy to make the wrong design decision.

Another issue arises when it comes to creating the interface for the dependency. If we assume that all methods for the ContactManager is required, then any class that implements that interface has to also implement those unneeded methods. How many times have you seen an object implement an interface with a lot of methods that did nothing or just threw exceptions?

public interface BloatedInterface
{
     void SetContent();
     bool IsContentSet();
     void RemoveContent(string content);
     void AddContent(string content);
     void ImportantMethod();
}

public class BloatedObject : BloatedInterface
{
     public void SetContent()
     {
     }
     public bool IsContentSet()
     {
          throw new NotImplementedException();
     }
     public void RemoveContent(string content)
     {
          throw new NotImplementedException();
     }
     public void AddContent(string content)
     {
          throw new NotImplementedException();
     }
     public void ImportantMethod()
     {
     }
}

Fixing the issue

Alright, alright, I hear you say, you’ve convinced me, how do I fix this problem? The steps are simple:

  1. First, you need to identify which methods the client needs
  2. Next, you need to create an interface that contains the methods that the client uses
  3. After creating the interface, have the dependency implement the interface
  4. Finally, change the signature client so that it uses the interface instead of the concrete class

Using our code base, first, we look at the ContactFinder class and see what methods from ContactManager that it uses.

So far, it looks like it only needs the DoesNameExist method. So let’s create an interface, called IContactSearcher that contains the single method.

1
2
3
4
public interface IContactSearcher
{
     bool DoesNameExist(string name);
}

Now that we’ve extracted the interface, it’s time to have the ContactManager class implement the interface:

public class ContactManager : IContactSearcher
{
     private List _names;

     public ContactManager()
     {
          _names = new List();
          _names.Add("Cameron");
          _names.Add("Geoff");
          _names.Add("Phillip");
     }

     public void PrintNames()
     {
          foreach (var name in _names)
          Console.WriteLine(name);
     }

     public void SetNames(List names)
     {
          foreach (var name in names)
               _names.Add(name);
     }

     public bool DoesNameExist(string name)
     {
          var results = _names.IndexOf(name);
          if (results != -1)
               return true;

          return false;
     }
}

Finally, we update the references in ContactFinder to use the IContactSearcher interface instead of the concrete class ContactManager.

public class ContactFinder
{
     private IContactSearcher _searcher;

     public ContactFinder(IContactSearcher searcher)
     {
         _searcher = searcher;
     }

     public void FindContacts(List names)
     {
          foreach (var name in names)
          {
               if (_searcher.DoesNameExist(name))
                    Console.WriteLine("Found " + name);
               else
                    Console.WriteLine("Couldn't find " + name);
          }
     }
}

With this last step, we’ve now resolved the ISP violation.

TL;DR

In summary, the Interface Segregation Principle (ISP) tells us that we should interfaces instead of concrete classes for our dependencies and that we should use the smallest interface for our client to work. If we don’t follow these rules, then it’s easy to create bloated interfaces that clutter up readability. In order to resolve violations, we need to determine what methods our client require and code an interface to contains those methods. Finally, we have our class implement those methods and pass it to the client. By following the ISP, we reduce the complexity required by our code and reduce the coupling between client and dependency. As always, don’t forget to refactor as you go along.

Establishing a SOLID Foundation Series

Beginner Basics: Establishing a SOLID Foundation – The Liskov Substitution Principle

Welcome to the third installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the third part of SOLID, the Liskov Substitution Principle and how following this principle can lead to loose coupling of your code.

So what is the Liskov Substitution Principle?

Before diving into the Liskov Substitution Principle (LSP), let’s look at a code example demonstrating the violation.

Let’s say that we have a Rectangle class:

# A Rectangle can have height and width
# set to any value
class Rectangle
  def height=(height)
      @height = height
  end
  def width=(width)
      @width = width
  end
  def height
      @height
  end
  def width
      @width
  end
  def area
      height * width
  end
end

If we run the following implementation, it’s pretty clear that it works like we would expect:

1
2
3
4
rect = Rectangle.new
rect.height = 5
rect.height = 6
puts rect.area # => 30

Seems pretty simple, we have a Rectangle class with two public properties, height and width and the class behaves the way we would expect.

Now let’s add a new class, called Square. Since all Squares are also Rectangles, it makes sense that the Square class should inherit from the Rectangle class. However, since Squares have to maintain the same height and width, we need to add some additional logic for that:

# A Square must maintain the same height and width
class Square < Rectangle
  def height=(height)
      @height = height
      @width = height
  end
  def width=(width)
      @width = width
      @height = width
  end
end

Using a Square instead of a Rectangle and running the same input, we get the following output:

1
2
3
4
square = Square.new
square.height = 10
square.width = 5
puts square.area # => 25

Hold up, why is the area 25? If I read this code, then the height should be 10 and the width should be 5, giving an area of 50. This is no longer the case because of the domain constraint caused by the Square class. By using a Square where the code expected a Rectangle, we get different behavior then we would expect. This is the heart of the Liskov Substitution Principle.

In short, the Liskov Substitution Principle states that if we have an object (Rectangle) in our code and it works correctly, then we should be able to use any sub-type (Square) without the results being modified.

The most common example of LSP violations are when the “is-a” phrase from Object-Oriented Design break down. In the Rectangle-Square example, we say that a Square “is-a” Rectangle which is true. However, when we covert that relationship to code and use inheritance, the relationship does not hold up.

I don’t know, this sounds confusing, what’s the point? To me, the Liskov Substitution Principle is the hardest part of SOLID to understand. It’s heavy on the theoretical and it’s not blatantly obvious when a violation has occurred until testing.

However, there are plenty of benefits of following LSP.

First, following LSP reduces the tight coupling involved in your code. Let’s look back at our Recipes class from the Open/Closed Principle post and examine the MakeOrder method:

class Recipes
     def initialize
          @recipes = {}
          @recipes[RecipeNames::ChickenWithBroccoli] = ChickenWithBroccoli.new()
          @recipes[RecipeNames::SteakWithPotatoes] = SteakWithPotatoes.new()
          @recipes[RecipeNames::PorkWithApples] = PorkWithApples.new()
     end
     def MakeOrder(order)
          recipe = @recipes[order]
          if recipe == nil
               puts "Can't cook " + order
          else
               recipe.Cook()
          end
     end
end

In this class, you see that we load different recipes and when one’s requested, we call the Cook method. We don’t have to do any set-up, special handling, or other logic, we just trust that the Cook method for whatever recipe we choose works as expected. By following this design, code will be easier to read and to maintain.

Going back to our Square/Rectangle example, if we wanted a method that would return a new Square or Rectangle, it would have to look something like this:

def CreateShape(classType, height, width)
     shape = nil
     if classType == "Rectangle"
          shape = Rectangle.new
          shape.height = height
          shape.width = width
     else
          shape = Square.new
          shape.height = height
     end
     return shape
end

This code works, but there is one major problem. When someone is looking at this code, they’re going to get confused of why the Rectangle and Square are setup differently.

For example, when I see that the Square’s height is being set, but not the width, my first thought is that this is a bug. Then, I’d have to look into the Square’s class definition and then I’d see the logic of where setting the height also sets the width.

Long story short, by identifying and resolving LSP violations, we can make the code easier to read and maintain.

So it looks like LSP is pretty useful, but how do I fix violations?

Now that we’ve talked about spotting LSP violations and why it’s important to follow LSP, let’s discuss how to fix the violations.

To be honest, fixing a LSP violation is not easy. Since the nature of the problem is caused by a broken abstraction, discarding the abstraction is the best option. However, if you absolutely need to use the abstraction, then one solution is to remove the method that causes the violation.

In the Square/Rectangle example, we would remove the setters for height and width from our Rectangle class because that is how the violation can occur. After removing the setters, we need to modify the initialize method of Square to only take one parameter, size, and send that twice to the Rectangle class. Now, our classes look something like this:

# A Rectangle can have height and width
# set to any value
class Rectangle
  def initialize(height, width)
      @height = height
      @width = width
  end
  def height
      @height
  end
  def width
      @width
  end
  def area
      height * width
  end
end

# A Square must maintain the same height and width
class Square < Rectangle
  def initialize(size)
      super(size, size)
  end
end

With sample implementation and output

1
2
3
4
5
rect = Rectangle.new(10, 5)
puts rect.area # => 50

square = Square.new(5)
puts square.area # => 25

TL;DR

In short, the Liskov Substitution Principle (LSP) enforces the idea that if a class has a sub-type (through inheritance or interfaces), then by passing the sub-type, the program should still produce the same results. If you run across a class that violates LSP, then you know that your abstraction is not complete and you can either

  • Remove the offending methods/properties or
  • Abandon the abstraction

As always, don’t forget to refactor and reorganize your code as needed.

Establishing a SOLID Foundation Series

Beginner Basics: Establishing a SOLID Foundation - The Open/Closed Principle

Welcome to the second installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the second part of SOLID, the Open/Closed Principle and how following this principle can lead to great design choices.

So what is the Open/Closed Principle? In order to set the context for discussion, let’s revisit our last example of the Chef class:

class Chef
  def CookFood(order, tableNumber)
    if order == "chicken with broccoli"
      CookChickenWithBroccoli()
    end
  end

  def CookChickenWithBroccoli
    puts "Cooked chicken with broccoli"
  end
end

So it looks like this Chef is pretty simple, it only has one public method of CookFood and he can only cook ChickenWithBroccoli. However, a Chef that can only cook one thing isn’t very useful. So how about we add some more menu items?

class Chef
  def CookFood (order)
    if order == "chicken with broccoli"
      CookChickenWithBroccoli()
    elsif order == "steak with potatoes"
      CookSteakWithPotatoes()
    elsif order == "pork with apples"
      CookPorkWithApples()
    else
      puts "Don't know how to cook " + order
    end
  end

  def CookChickenWithBroccoli
    puts "Cooked chicken with broccoli"
  end

  def CookSteakWithPotatoes
    puts "Cooked steak with potatoes"
  end

  def CookPorkWithApples
    puts "Cooked pork with apples"
  end
end

So our new Chef can cook more food, but the code base expanded quite a bit. In order to add more menu choices, we need to add an additional if check in CookFood and to define a new method for CookFood to call. This might not sound like a lot of work because each of these Cook* methods just print something to the screen. However, what if the steps to create each food item were much more complex?

def CookChickenWithBroccoli
  CookChicken()
  CookBroccoli()
end

def CookChicken
  print "Cooked chicken "
end

def CookBroccoli
  puts "with broccoli"
end

Also, what if we modified how the CookChickenWithBroccoli method worked? We would need to modify the Chef class, but that doesn’t make sense. In the real world, we would modify the recipe and the Chef would then follow the new recipe. This concept that we would have to modify an unrelated object in order to add new functionality is the inspiration for the Open/Closed Principle.

In short, the Open/Closed Principle means that an object should be Open for extension, but Closed for modification. This principle relies on the idea that new functionality is added by creating new classes, not by modifying pre-existing classes’ behavior. By doing this, we’re decreasing the chances of breaking current functionality and increasing code stability.

This sounds good, but is it worth the additional design time?

Now that we’ve discussed the Open/Closed Principle, you might be wondering what some of the benefits are of cleaning up this design.

First, classes that follow the Open/Closed Principle are small and focused in nature playing off the idea of the Single Responsibility Principle. Looking back at our Chef class, it’s very clear that by adding new functionality, Chef is going to be handling way too many things.

Next, by following OCP, there won’t be multiple classes modified just to add new functionality. There’s nothing like a change set containing tons of modified files to make even the most experienced developer shudder in fear.

By definition of OCP, we won’t be modifying a lot of files (ideally only one file should be modified) and we’re adding new classes. Since we’re adding in these new classes, we inherently have the opportunity to bake in testing.

Alright, I get it OCP is awesome, but how do I refactor the Chef class?

In order to fix the violation, we’re going to take each menu item and make them into their own class

class ChickenWithBroccoli
  def initialize
    @name = "Chicken with Broccoli"
  end

  def Cook
    CookChicken()
    CookBroccoli()
  end

  def CookChicken
    print "Cooked chicken "
  end

  def CookBroccoli
    puts "with broccoli"
  end
end

class SteakWithPotatoes
  def initialize
    @name = "Steak with Potatoes"
  end

  def Cook
    CookSteak()
    CookPotatoes()
  end

  def CookSteak
    print "Cooked steak "
  end

  def CookPotatoes
    puts "with potatoes"
  end
end

class PorkWithApples
  def initialize
    @name = "Pork with Apples"
  end

  def Cook
    CookPork()
    CookApples()
  end

  def CookPork
    print "Cooked pork "
  end

  def CookApples
    puts "with apples"
  end
 end

Now that we have these different classes, we need to come up with some way for our Chef to interact with them. So why don’t we organize these menu items into a Recipes class?

class Recipes
  def initialize
      @recipes = {}
      @recipes[:chicken] = ChickenWithBroccoli.new()
      @recipes[:steak] = SteakWithPotatoes.new()
      @recipes[:pork] = PorkWithApples.new()
  end

  def MakeOrder(order)
      recipe = nil
      if order == "chicken with broccoli"
          recipe = @recipes[:chicken]
      elsif order == "steak with potatoes"
          recipe = @recipes[:steak]
      elsif order == "pork with apples"
          recipe = @recipes[:pork]
      end
      if recipe == nil
          puts "Can't cook " + order
      else
          recipe.Cook()
      end
  end
end

Now we have this Recipes class that contains all of the menu items for our Chef to use. When adding new menu items to Recipes, all we have to add is the class in the initialize method and add an additional if check in the MakeOrder method. But hold on, I hear you say, This is the same as what we had with the Chef at the beginning, why is this design better? Before, we would have to modify the Chef in order to add more menu items which doesn’t really make sense, now we’ve moved that logic to Recipes which makes sense that it needs to be modified if a new menu item is added.

On the topic of our Chef, after cleaning up to use the Recipes class, our Chef is simpler and relies on Recipes for the menu items, not itself:

1
2
3
4
5
6
7
8
9
class Chef
  def initialize
    @recipes = Recipes.new()
  end

  def CookFood (order)
    @recipes.MakeOrder(order)
  end
end

Now that we’ve fixed the violation, let’s go ahead and refactor some. Looking at the menu choices, it’s pretty clear that we can abstract the behavior to a base class called MenuItem for them all to share (Note: By defining Cook by raising an exception, I’m forcing all classes to provide their own implementation).

1
2
3
4
5
6
7
8
9
class MenuItem
  def initialize(name)
      @name = name
  end

  def Cook
      raise "This should be overridden in child class"
  end
end

Also, as part of this refactoring, we’re going to move some of the strings into constants as part of the RecipeNames module so that the Chef and Recipes can communicate with one another:

1
2
3
4
5
module RecipeNames
  ChickenWithBroccoli = "Chicken with Broccoli"
  SteakWithPotatoes = "Steak with Potatoes"
  PorkWithApples = "Pork with Apples"
end

With these additions, let’s update the menu choices to use the module and the MenuItem base class:

class ChickenWithBroccoli < MenuItem
  def initialize
      super(RecipeNames::ChickenWithBroccoli)
  end

  def Cook
      CookChicken()
      CookBroccoli()
  end

  def CookChicken
      print "Cooked chicken "
  end

  def CookBroccoli
      puts "with broccoli"
  end
end

class SteakWithPotatoes < MenuItem
  def initialize
      super(RecipeNames::SteakWithPotatoes)
  end

  def Cook
      CookSteak()
      CookPotatoes()
  end

  def CookSteak
      print "Cooked steak "
  end

  def CookPotatoes
      puts "with potatoes"
  end
end

class PorkWithApples < MenuItem
  def initialize
      super(RecipeNames::PorkWithApples)
  end

  def Cook
      CookPork()
      CookApples()
  end

  def CookPork
      print "Cooked pork "
  end

  def CookApples
      puts "with apples"
  end
 end

With these changes, we need to update the Recipes class to use the RecipeNames module:

class Recipes
  def initialize
      @recipes = {}
      @recipes[RecipeNames::ChickenWithBroccoli] = ChickenWithBroccoli.new()
      @recipes[RecipeNames::SteakWithPotatoes] = SteakWithPotatoes.new()
      @recipes[RecipeNames::PorkWithApples] = PorkWithApples.new()
  end
  def MakeOrder(order)
      recipe = @recipes[order]
      if recipe == nil
          puts "Can't cook " + order
      else
          recipe.Cook()
      end
  end
end

With this current layout, if we needed to add another menu item (let’s say Fish and Chips), we would need to:

  1. Create a new class that extends MenuItem called FishAndChips
  2. Add another string constant to RecipeNames
  3. Add another line in the Recipes initialize method to add it to the array

TL;DR

In short, the Open/Closed Principle (OCP) reinforces the idea that every class should be open for extension and closed to modifications. By following this principle, you’re much more likely to create separated code that allows you to increase functionality and decrease the odds of breaking current functionality. If you run across a class that is doing way too much, use the Single Responsibility Principle to separate the classes and then use a new object that serves as the middle man. In our case, the Recipes class was the middle man between the Chef and the different menu items. As always, don’t forget to refactor and reorganize your code as needed.

Establishing a SOLID Foundation Series

Beginner Basics: Establishing a SOLID Foundation – The Single Responsibility Principle

Welcome to the first installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the first part of SOLID, the Single Responsibility Principle and how following this principle can lead to great design choices.

So what is the Single Responsibility Principle?

Before diving into code, let’s take a look at a real life example. Let’s say that we open a new restaurant. Clearly, we need to hire a fantastic head chef to prepare the food. Between these two candidates, which one seems to be the better fit?

  • Chef A – spends their time on creating new dishes and preparing the best food possible
  • Chef B – spends their time on taking orders, preparing food, and busing tables

Well that’s easy, you say, we should hire Chef A because he’s focusing on cooking. We can hire other people to take orders and clean tables. It’s pretty obvious that Chef A is the better choice because of the focusing on a single job. So if this is what happens in real life, why is that we see code that is doing way too many things?

For example here’s an example of what the Chef B(efore) class might look like

class Chef
  def initialize
    @position = 0
    @order = nil
    @orderReady = false
  end

  def CookFood(order, tableNumber)
    if order == "chicken with broccoli"
      CookChickenWithBroccoli()
      DeliverFood(order, tableNumber)
    end
  end

  def CookChickenWithBroccoli
    @orderReady = true
  end

  def DeliverFood(order, tableNumber)
    GoToTable(tableNumber)
    GiveFood(order)
  end

  def GoToTable(tableNumber)
    @position = tableNumber
  end

  def GiveFood(order)
    puts "Food delivered"
  end

  def BusTables(tableNumber)
    GoToTable(tableNumber);
    CleanTable(tableNumber)
  end

  def CleanTable(tableNumber)
    puts "Table # " + tableNumber.to_s + " cleaned"
  end

  def TakeOrders(tableNumber)
    GoToTable(tableNumber)
    order = AskForOrder()
    return order
  end

  def AskForOrder()
    return "chicken with broccoli"
  end
end

with an example implementation usage:

1
2
3
4
5
tableNumber = 3
chef = Chef.new()
order = chef.TakeOrders(tableNumber)
chef.CookFood(order, tableNumber)
chef.BusTables(tableNumber)
As you can tell in the Chef example, there are a lot of methods that need to be defined in order to get the different pieces of main functionality working. If any of these main pieces needed to be changed, we would have to modify the Chef class.

Because of the dependencies, another way to describe the SRP is that the class should only have one reason to change. In this case, the Chef class has three reasons for changing. (Fun fact: when dealing with classes that have a lot of reasons to change, it can be a sign that the class is following the God-Object anti-pattern.)

I don’t know, what’s in it for me?

Now that you have a good understanding of the SRP, you might be asking what are some of the benefits of cleaning up your design.

First, classes that only do one job have less dependencies to worry about. Looking back at our code example, it’s clear that the Chef class would have to change if we needed to change the business rules for taking orders, cleaning tables, or for cooking food.

Next, by following the SRP, it’s easier for a team to solve issues. For example. let’s say that you had to fix an error in cleaning the tables and someone else on your team was assigned to update the taking an order scenario. Using the Chef B class definition, the two of you would have to make different changes to the same class.

Finally, by following the SRP, we’re more closely following the idea behind object-oriented design. By definition, we should take complex problems and break them down into their individual actors. Since we define that each class can only have one responsibility, we are ensuring that the problem is being broken down to its smallest pieces.

Ok, ok, you’ve convinced me, how do I take a busy class and make it simple?

Fortunately, if you have a class that has is doing too many things, there’s a really simple fix. Just create more objects that contain the different pieces.

Using our Chef example, I’m going to separate the responsibilities into two new classes. First, I’m going to move all the methods involved in taking orders to a new Waiter class.

class Waiter
  def initialize
    @position = 0
  end

  def DeliverFood(order, tableNumber)
    GoToTable(tableNumber)
    GiveFood(order)
  end

  def GoToTable(tableNumber)
    @position = tableNumber
  end

  def GiveFood(order)
    puts "Food delivered"
  end

  def TakeOrders(tableNumber)
    GoToTable(tableNumber)
    order = AskForOrder()
    return order
  end

  def AskForOrder()
    return "chicken with broccoli"
  end
end

Next, I’m going to extract every method needed to bus tables into our new Busboy class:

class Busboy
  def initialize
    @position = 0
  end

  def BusTables(tableNumber)
    GoToTable(tableNumber);
    CleanTable(tableNumber)
  end

  def GoToTable(tableNumber)
    @position = tableNumber
  end

  def CleanTable(tableNumber)
    puts "Table # " + tableNumber.to_s + " cleaned"
  end
end

Now that we’ve broken up the responsibilities, the next step is to look at some common functionality that classes might share. For example, it looks like the Waiter and Busboy class both use @position and the GoToTable method, so why don’t we create a new class called BaseService

1
2
3
4
5
6
7
8
class BaseService
  def initialize
    @position = 0
  end
  def GoToTable(tableNumber)
    @position = tableNumber
  end
end

and allow both the Waiter and Busboy classes to inherit?

class Waiter < BaseService
  def initialize
    super
  end

  def DeliverFood(order, tableNumber)
    GoToTable(tableNumber)
    GiveFood(order)
  end

  def GiveFood(order)
    puts "Food delivered"
  end

  def TakeOrders(tableNumber)
    GoToTable(tableNumber)
    order = AskForOrder()
    return order
  end

  def AskForOrder()
    return "chicken with broccoli"
  end
end

class Busboy < BaseService
  def initialize
    super
  end

  def BusTables(tableNumber)
    GoToTable(tableNumber);
    CleanTable(tableNumber)
  end

  def CleanTable(tableNumber)
    puts "Table # " + tableNumber.to_s + " cleaned"
  end
end

Great, we’ve finished refactoring the overly-obsessive Chef from having control on everything in the restaurant to just keep his attention on cooking great food.

class Chef
  def initialize
    @orderReady = false
  end

  def CookFood(order, tableNumber)
    if order == "chicken with broccoli"
      CookChickenWithBroccoli()
    end
  end

  def CookChickenWithBroccoli
    @orderReady = true
  end
end

Wow, after separating the concerns, our chef class has been massively condensed down to focus on just cooking food, but how do all of these individual classes interact with one another? Does it look like the classes are now focusing on performing one job well?

1
2
3
4
5
6
7
8
tableNumber = 3
chef = Chef.new()
waiter = Waiter.new()
busboy = Busboy.new()
order = waiter.TakeOrders(tableNumber)
chef.CookFood(order, tableNumber)
waiter.DeliverFood(order, tableNumber)
busboy.BusTables(tableNumber)

Something to keep in mind when separating responsibilities is that a single responsibility does not equal a single method. If the methods are all related to performing the same task, then it’s not a violation of the SRP.

TL;DR

In short, the Single Responsibility Principle (SRP) reinforces the idea that every class should have one job and should do that job well. By following this principle, you’re much more likely to create more readable and maintainable code. When you run across classes that are doing too much, the best solution is to extract the extra functionality into another class. After extraction, don’t forget to refactor and reorganize as needed.

Establishing a SOLID Foundation Series

Establish solid intro

If you’ve been working at any dev shop worth its salt, it’s a safe bet that you’ve heard someone mention writing SOLID code or that something isn’t SOLID. Well, what exactly do they mean by SOLID?

As part of this Beginner Basics series, we’re going to first look at what does SOLID mean and why is it so important. For the next five weeks, we’ll explore a different aspect of SOLID by in terms of which principle does each letter represent, some code samples that follow the principle and code samples that break the principle. . As this series progresses, I’ll be adding the code samples to my public Bitbucket account for you to clone (You do know how to use Mercurial and Bitbucket, right?)

What is SOLID?

In a nutshell, SOLID is a mnemonic created by Michael Feathers to help developers remember the five principles of great code construction introduced by Robert C. Martin (“Uncle Bob”). By following these principles, it’s much more likely that the code designed will be easier to maintain and to extend. As such, SOLID code follow these principles:

(S)ingle Responsibly Principle – Every class should do just one thing and do it well (O)pen/Closed Principle – Code should be open for extension, but closed for modifications (L)iskov Substitution Principle – Make sure that two classes that are interchangeable have the same behavior (I)nterface Segregation Principle – Better to use multiple specific interfaces then to use a single general interface (D)ependency Inversion Principle – Code should depend on abstractions, not on concrete classes

What’s so important about being SOLID?

When building software, its paramount that you have a solid foundation. A common analogy for building software is that it’s like building a house. If the foundation is weak, then code starts becoming stiff, changes become much harder to make and the next thing you know the shower is draining into the kitchen. If your code isn’t SOLID, then it’s going to be harder to add new features/modifications, more difficult to support, and easier to introduce features bugs. As a developer, you should strive to write code that is bug free, easily maintainable, and simple to expand. By following SOLID principles, we can carry out these goals and our code can be SOLID.

Next Steps

Stay tuned for the next part of the series when we begin exploring the first part of SOLID, the Single Responsibility Principle. Until then, stay sharp, keep learning, and code on.

Establishing a SOLID Foundation Series

The Basics: Source Control

As the first part of the Basics of Software Development series, we’re going to talk about source control is, why you should be using it and commonly used source control management (SCM) tools.

Source Control, what’s that?

To put it simply, SCM is a set of tools that allow users to keep track of source code by using a repository. The most common workflow is to get a copy of the source code from SCM, make changes to some of the files (refactoring for example) and committing those changes back.

Benefits of Source Control

  • Allows for users to undo uncommitted changes
  • Promotes developers to refactor code
  • Keeps track of what changes have been made over time
  • Great for seeing what changes have been made over time (for example between versions)
  • By definition, it’s a backup of the source code
  • Perfect if your workstation crashes and you don’t have a backup
  • Makes it easier for other developers to get the latest changes
  • Instead of one developer making changes and “pushing” them to the team, source control allows the team to “pull” changes when ready.

Without source control, anytime that code changes were made, the developer who made changes would have to push the code to the other developers. This may not sound terrible, but what if you were in the middle of rewriting a file? You would have to figure out what files have been changed, hope that the files that were changed aren’t files that you’re currently working on. After ensuring all of that, you need to merge your changes to the latest version and hope that nothing broke. As you can see, this is a horrible workflow, prone for errors. However, using a SCM solves these issues.

Choosing Source Control

All SCM tools work either as centralized or distributed. Before you can choose a source control, you need to figure what type of source control type works best for your situation (tech requirements, ease of setup, company policies, etc..)

In a centralized source control implementation, there is a main server (central repository) that holds the source code and history of changes. When a developer checks out source code, they are only getting the source code (no revision history). After the developer makes changes and commits the changes, the files that have been modified are sent back to the central repository. Since the repository has the source code and history of changes, the repository is known to have the “blessed” or the “single source of truth” copy of the source code.

Since the centralized implementation utilizes the central server, it’s very easy to see what’s the latest and greatest. However, due to the centralization, if the server ever goes down, the team cannot commit changes.

In a distributed source control implementation, the big difference is that there is not a central repository. This means that every developer has both the latest source code and the revision history.

The main advantage to this implementation is that a developer can commit changes and continue working if there is no network connection. However, due to the lack of a central repository, there is no such thing as “blessed” version.

Common Source Control Implementations

Team Foundation Server (TFS) – This is a centralized source control that is most likely working with .NET code and Visual Studio. Even though It is possible to use TFS for other languages and IDE’s, you have to use Team Explorer Everywhere and is currently supported with Eclipse.

Subversion (SVN) – This is a centralized source control platform that is designed to be used for any programming language and IDE.

Git – This is one of the most widely used distributed source control platform. This platform is most commonly used with GitHub.

Mercurial – Another common distributed source control platform. This is most commonly used with Bitbucket.

Visual SourceSafe – Very old solution from Microsoft, if your company is using this for source control, you need to migrate to another solution. This was the precursor for TFS and should no longer be used for source control.