Today I Learned

During this past week, I was working with our intern and showing him some cool debugging tricks when I came across a massive method. I gave him 30 seconds to look at it and tell me what he thought the method was doing. After a while, he was able to figure it out, but the fact that it wasn’t easy discernible was enough to give pause.

The lesson here is that if you can’t determine what the method is doing easily, then it’s probably doing way too much (violating the Single Responsibility Principle) and needs to be broken into more easily readable pieces.

To demonstrate what I mean, I wrote a program that inserts Messages into a database. A Message contains a description, the number (for identification when troubleshooting) and the module. We would have issues where different messages would have the same number which would cause confusion when troubleshooting errors.

In the program I wrote, the user provides the message and what module the message belongs to and the program automatically generates the message number and inserts the message into the database.

For brevity’s sake, shown below is the logic for determining what the next message number should be.

[code language=”csharp”] public int GetNextAlertAndErrorModuleNumber(string module)
{
if (String.IsNullOrEmpty(module))
throw new ArgumentException("module cannot be null or empty");

if (_connection == null)
_connection = CreateConnection();

var results = new List<int>();

_connection.Open();
var cmd = new SqlCommand("dbo.GetAlertData", _connection);
cmd.CommandType = CommandType.StoredProcedure;

var reader = cmd.ExecuteReader();
while (reader.Read())
{
if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
continue;

var pieces = reader["ALERT_ID_NUMBER"].ToString().Split(‘ ‘);

results.Add(Int32.Parse(pieces[1]));
}
if (reader != null)
reader.Close();

cmd = new SqlCommand("dbo.GetErrorData";, _connection);
cmd.CommandType = CommandType.StoredProcedure;

reader = cmd.ExecuteReader();
while (reader.Read())
{
if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
continue;

var pieces = reader["ERROR_ID_NUMBER"].ToString().Split(‘ ‘);

results.Add(Int32.Parse(pieces[1]));
}
if (reader != null)
reader.Close();

if (_connection != null)
_connection.Close();

return results.Max() + 1;
}
[/code] The method itself isn’t complex, just calling some stored procedures, parsing the output and adding the number to a list. However, it’s not abundantly clear what the purpose of the calling the stored procedures.

First, it looks like we’re reading the alerts error numbers from a stored procedure call, why don’t we extract that logic out to a helper method and have the public method call the helper?

[code language=”csharp” highlight=”12, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55″]

<pre>
public int GetNextAlertAndErrorModuleNumber(string module)
{
if (String.IsNullOrEmpty(module))
throw new ArgumentException(&amp;amp;quot;module cannot be null or empty&amp;amp;quot;);

if (_connection == null)
_connection = CreateConnection();

var results = new List<int>();

_connection.Open();
results.AddRange(ReadAlerts(module.ToUpper()));

var cmd = new SqlCommand("dbo.GetErrorData", _connection);
cmd.CommandType = CommandType.StoredProcedure;

var reader = cmd.ExecuteReader();
while (reader.Read())
{
if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
continue;

var pieces = reader["ERROR_ID_NUMBER&"].ToString().Split(‘ ‘);

results.Add(Int32.Parse(pieces[1]));
}
if (reader != null)
reader.Close();

if (_connection != null)
_connection.Close();

return results.Max() + 1;
}

private List&amp;amp;lt;int&amp;amp;gt; ReadAlerts(string module)
{
var results = new List<int>();
var cmd = new SqlCommand("dbo.GetAlertData", _connection);
cmd.CommandType = CommandType.StoredProcedure;

var reader = cmd.ExecuteReader();
while (reader.Read())
{
if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
continue;

var pieces = reader["ALERT_ID_NUMBER"].ToString().Split(‘ ‘);
results.Add(Int32.Parse(pieces[1]));
}
if (reader != null)
reader.Close();

return results;
}
[/code]

By doing this, we fix two issues at once. First, we’ve given a name to the process of reading the alerts which in turns allows us to quickly understand what the public method should be doing (i.e. improved readability).

Second, it allows us for easier debugging because we now have smaller components. For example, let’s say that we were getting the wrong value. In the first implementation, we would have to put breakpoints in different areas trying to determine which part was broken. However, in the new form, we can check to see if ReadAlerts is behaving correctly. If it isn’t, we now know the bug has to be in that method, otherwise, it’s in the rest.

For the next step, you may have noticed that we can repeat the same refactoring trick again, except this time, we can extract the logic for reading the errors into a helper method.

[code language=”csharp” highlight=”11,12″] public int GetNextAlertAndErrorModuleNumber(string module)
{
if (String.IsNullOrEmpty(module))
throw new ArgumentException("module cannot be null or empty");

if (_connection == null)
_connection = CreateConnection();
_connection.Open();

var results = new List<int>();
results.AddRange(ReadAlerts(module.ToUpper()));
results.AddRange(ReadErrors(module.ToUpper()));

if (_connection != null)
_connection.Close();

return results.Max() + 1;
}

private List&amp;amp;lt;int&amp;amp;gt; ReadAlerts(string module)
{
var results = new List<int>();
var cmd = new SqlCommand("dbo.GetAlertData", _connection);
cmd.CommandType = CommandType.StoredProcedure;

var reader = cmd.ExecuteReader();
while (reader.Read())
{
if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
continue;

var pieces = reader["ALERT_ID_NUMBER"].ToString().Split(‘ ‘);
results.Add(Int32.Parse(pieces[1]));
}
if (reader != null)
reader.Close();

return results;
}

private List&amp;amp;lt;int&amp;amp;gt; ReadErrors(string module)
{
var results = new List<int>();
var cmd = new SqlCommand("dbo.GetErrorData", _connection);
cmd.CommandType = CommandType.StoredProcedure;

var reader = cmd.ExecuteReader();
while (reader.Read())
{
if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
continue;

var pieces = reader["ERROR_ID_NUMBER"].ToString().Split(‘ ‘);
results.Add(Int32.Parse(pieces[1]));
}
if (reader != null)
reader.Close();

return results;
}
[/code]

After the changes, anyone who looks at the public API can easily see that it’s reading from both Alerts and Errors. This is really powerful because now you can communicate with non-technical people about requirements and what the code is doing.

Let’s say that in the future, the requirements change and this conversation plays out:

Alice (QA, finding an error) – Hey Bob, I was running one of our test plans and it looks like that we’re getting the wrong message number if we’re trying to add a new message and there are warning messages in the database. We are including the warning table when figuring that out, right?

Bob (Engineer, finding the root cause) – Hey you’re right, it looks like we’re only using the alerts and error tables when calculating the next number. Why don’t we write up a ticket for that and get a fix in?

The key point is that no matter how large a method is, there always have to be steps being performed in some order (by definition of an algorithm) and this is the key to refactoring larger methods into more maintainable pieces of code. The trick is determining what those steps are and making decisions on whether to make helper methods or helper classes.

If those steps become complicated, then they should be broken out into helper methods. As time progresses and those helper methods start to become more complicated, then those helper methods should in turn become classes of their own.

Basics

Welcome to the third installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the third part of SOLID, the Liskov Substitution Principle and how following this principle can lead to loose coupling of your code.

So what is the Liskov Substitution Principle?

Before diving into the Liskov Substitution Principle (LSP), let’s look at a code example demonstrating the violation.

Let’s say that we have a Rectangle class:

If we run the following implementation, it’s pretty clear that it works like we would expect:

Seems pretty simple, we have a Rectangle class with two public properties, height and width and the class behaves the way we would expect.

Now let’s add a new class, called Square. Since all Squares are also Rectangles, it makes sense that the Square class should inherit from the Rectangle class. However, since Squares have to maintain the same height and width, we need to add some additional logic for that:

Using a Square instead of a Rectangle and running the same input, we get the following output:

Hold up, why is the area 25? If I read this code, then the height should be 10 and the width should be 5, giving an area of 50. This is no longer the case because of the domain constraint caused by the Square class. By using a Square where the code expected a Rectangle, we get different behavior then we would expect. This is the heart of the Liskov Substitution Principle.

In short, the Liskov Substituion Principle states that if we have an object (Rectangle) in our code and it works correctly, then we should be able to use any sub-type (Square) without the results being modified.

The most common example of LSP violations are when the “is-a” phrase from Object-Oriented Design break down. In the Rectangle-Square example, we say that a Square “is-a” Rectangle which is true. However, when we covert that relationship to code and use inheritance, the relationship does not hold up.

I don’t know, this sounds confusing, what’s the point?

To me, the Liskov Substitution Principle is the hardest part of SOLID to understand. It’s very heavy on the theoretical and it’s not blatantly obvious when a violation has occurred until testing.

However, there are plenty of benefits of following LSP.

First, following LSP reduces the tight coupling involved in your code. Let’s look back at our Recipes class from the Open/Closed Principle post and examine the MakeOrder method:

In this class, you see that we load different recipes and when one’s requested, we call the Cook method. We don’t have to do any set-up, special handling, or other logic, we just trust that the Cook method for whatever recipe we choose works as expected. By following this design, code will be easier to read and to maintain.

Going back to our Square/Rectangle example, if we wanted a method that would return a new Square or Rectangle, it would have to look something like this:

This code works, but there is one major problem. When someone is looking at this code, they’re going to get confused of why the Rectangle and Square are setup differently.

For example, when I see that the Square’s height is being set, but not the width, my first thought is that this is a bug. Then, I’d have to look into the Square’s class definition and then I’d see the logic of where setting the height also sets the width.

Long story short, by identifying and resolving LSP violations, we can make the code easier to read and maintain.

So it looks like LSP is pretty useful, but how do I fix violations?

Now that we’ve talked about spotting LSP violations and why it’s important to follow LSP, let’s discuss how to fix the violations.

To be honest, fixing a LSP violation is not easy. Since the nature of the problem is caused by a broken abstraction, discarding the abstraction is the best option. However, if you absolutely need to use the abstraction, then one solution is to remove the method that causes the violation.

In the Square/Rectangle example, we would remove the setters for height and width from our Rectangle class because that is how the violation can occur. After removing the setters, we need to modify the initialize method of Square to only take one parameter, size, and send that twice to the Rectangle class. Now, our classes look something like this:

With sample implementation and output

TL;DR

In short, the Liskov Substitution Principle (LSP) enforces the idea that if a class has a sub-type (through inheritance or interfaces), then by passing the sub-type, the program should still produce the same results. If you run across a class that violates LSP, then you know that your abstraction is not complete and you can either

  • Remove the offending methods/properties or
  • Abandon the abstraction

As always, don’t forget to refactor and reorganize your code as needed.

All code shown/used can be found here on TheSoftwareMentor’s repository at Bitbucket

Establishing a SOLID Foundation Series

Basics

Welcome to the second installment of Establishing a SOLID Foundation series. In this post, we’ll be exploring the second part of SOLID, the Open/Closed Principle and how following this principle can lead to great design choices.

So what is the Open/Closed Principle?

In order to set the context for discussion, let’s revisit our last example of the Chef class:

So it looks like this Chef is pretty simple, it only has one public method of CookFood and he can only cook ChickenWithBroccoli. However, a Chef that can only cook one thing isn’t very useful. So how about we add some more menu items?

So our new Chef can cook more food, but the code base expanded quite a bit. In order to add more menu choices, we need to add an additional if check in CookFood and to define a new method for CookFood to call. This might not sound like a lot of work because each of these Cook* methods just print something to the screen. However, what if the steps to create each food item were much more complex?

Also, what if we modified how the CookChickenWithBroccoli method worked? We would need to modify the Chef class, but that doesn’t make sense. In the real world, we would modify the recipe and the Chef would then follow the new recipe. This concept that we would have to modify an unrelated object in order to add new functionality is the inspiration for the Open/Closed Principle.

In short, the Open/Closed Principle means that an object should be Open for extension, but Closed for modification. This principle relies on the idea that new functionality is added by creating new classes, not by modifying pre-existing classes’ behavior. By doing this, we’re decreasing the chances of breaking current functionality and increasing code stability.

This sounds good, but is it worth the additional design time?

Now that we’ve discussed the Open/Closed Principle, you might be wondering what some of the benefits are of cleaning up this design.

First, classes that follow the Open/Closed Principle are small and focused in nature playing off the idea of the Single Responsibility Principle. Looking back at our Chef class, it’s very clear that by adding new functionality, Chef is going to be handling way too many things.

Next, by following OCP, there won’t be multiple classes modified just to add new functionality. There’s nothing like a change set containing tons of modified files to make even the most experienced developer shudder in fear.

By definition of OCP, we won’t be modifying a lot of files (ideally only one file should be modified) and we’re adding new classes. Since we’re adding in these new classes, we inherently have the opportunity to bake in testing.

Alright, I get it OCP is awesome, but how do I refactor the Chef class?

In order to fix the violation, we’re going to take each menu item and make them into their own class

Now that we have these different classes, we need to come up with some way for our Chef to interact with them. So why don’t we organize these menu items into a Recipes class?

Now we have this Recipes class that contains all of the menu items for our Chef to use. When adding new menu items to Recipes, all we have to add is the class in the initialize method and add an additional if check in the MakeOrder method. But hold on, I hear you say, This is the same as what we had with the Chef at the beginning, why is this design better? Before, we would have to modify the Chef in order to add more menu items which doesn’t really make sense, now we’ve moved that logic to Recipes which makes sense that it needs to be modified if a new menu item is added.

On the topic of our Chef, after cleaning up to use the Recipes class, our Chef is now very simple and relies on Recipes for the menu items, not itself:

Now that we’ve fixed the violation, let’s go ahead and refactor some. Looking at the menu choices, it’s pretty clear that we can abstract the behavior to a base class called MenuItem for them all to share (Note: By defining Cook by raising an exception, I’m forcing all classes to provide their own implementation).

Also, as part of this refactoring, we’re going to move some of the strings into constants as part of the RecipeNames module so that the Chef and Recipes can communicate with one another:

With these additions, let’s update the menu choices to use the module and the MenuItem base class:

With these changes, we need to update the Recipes class to use the RecipeNames module:

with this current layout, if we needed to add another menu item (let’s say Fish and Chips), we would need to:

  1. Create a new class that extends MenuItem called FishAndChips
  2. Add another string constant to RecipeNames
  3. Add another line in the Recipes initialize method to add it to the array

TL;DR

In short, the Open/Closed Principle (OCP) reinforces the idea that every class should be open for extension and closed to modifications. By following this principle, you’re much more likely to create separated code that allows you to increase functionality and decrease the odds of breaking current functionality. If you run across a class that is doing way too much, use the Single Responsibility Principle to separate the classes and then use a new object that serves as the middle man. In our case, the Recipes class was the middle man between the Chef and the different menu items. As always, don’t forget to refactor and reorganize your code as needed.

All code shown/used can be found here on TheSoftwareMentor’s repository at Bitbucket

Establishing a SOLID Foundation Series