Today I Learned: LibYear

When writing software, it's difficult (if not impossible) to write everything from scratch. We're either using a framework or various third-party libraries to make our code work.

On one hand, this is a major win for the community because we're not all having to solve the same problems over and over again. Could you imagine having to implement your own authentication framework? (On second thought...)

However, this power comes at a cost. These libraries aren't free as in beer, but more like free puppies. So if we're going to take in a library, then we need to make sure that our dependencies stay up-to-date with the latest and greatest. There are new features, bug fixes, and security patches coming out all the time, and the longer we let a library drift, the more painful the upgrade can be.

If the library is leveraging semantic versioning, then we can take a guess at the likelihood of a breaking change based on which number (Major.Minor.Maintenance) has changed. For example, going from 2.4.1 to 3.0.0 signals a breaking change, while going from 2.4.1 to 2.4.2 should be safe.

  • Major - We've made a breaking change that's not backward compatible. Your code may not work anymore.
  • Minor - We've added new features that you might be interested in or made other changes that are backwards compatible.
  • Maintenance - We've fixed some bugs, sorry about that!

Keeping Up With Dependencies

For libraries that have known vulnerabilities, you can leverage tools like GitHub's Dependabot to auto-create pull requests that will upgrade those dependencies for you. Even though the tool might be "noisy", this is a great way to take an active role in keeping libraries up to date.

However, this approach only works for known vulnerabilities; what about libraries that are simply out-of-date? There's a cost/benefit to upgrading: the longer you go between upgrades, the riskier each upgrade becomes.

In the JavaScript world, we know that dependencies are listed in the package.json file with minimum versions and the package-lock.json file states the exact versions to use.
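
For illustration, here's what a single (hypothetical) dependency might look like in each file, simplified from what npm actually writes:

// package.json - a range: any compatible 4.x release satisfies this
"dependencies": {
  "lodash": "^4.17.0"
}

// package-lock.json - the exact version that was resolved and installed
"lodash": {
  "version": "4.17.21"
}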

Using LibYear

I was working with one of my colleagues the other day and he referred me to a library called LibYear that will check your package.json and lock file to determine how much "drift" you have in your dependencies.

Under the hood, it's combining the npm outdated and npm view <package> commands to determine the drift.
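
If you want to try it on a project, the package can be run without a permanent install (this reflects my understanding of the typical usage; check the project's README for the current flags):

npx libyear

It reports the drift per dependency along with a total, and that total is the number you'd track over time.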

What I like about this tool is that you can use it as part of a "software health" report for your codebase.

As engineers, we face pressure to ship features and hit delivery dates, but it's our responsibility to make sure that our codebase is in good shape (however we define that term). I think this library is a good way to capture a data point about software health, which then allows the team to make a decision on whether we should update our libraries now or defer.

The nice thing about the LibYear package is that it lends itself to being run in a pipeline, and you could take those results and post them somewhere. For example, maybe you could write your own automation bot that could post the stats in your Slack or Teams chat.
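
As a rough sketch of that idea (this is not LibYear's API; the webhook URL, environment variable, and summary string are stand-ins you'd wire up yourself), a bot step could be as small as posting the parsed results to a Slack incoming webhook:

// post-drift.ts - hypothetical pipeline step
// Assumes SLACK_WEBHOOK_URL is configured as a pipeline secret and that
// `summary` was parsed from LibYear's output in an earlier step.
const webhookUrl = process.env.SLACK_WEBHOOK_URL;
if (!webhookUrl) throw new Error("SLACK_WEBHOOK_URL is not set");

const summary = "Dependency drift this week: 4.2 libyears across 31 packages";

// Slack incoming webhooks accept a simple { text } JSON payload
await fetch(webhookUrl, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ text: summary }),
});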

It looks like there's already a GitHub Action for running this tool today, so you could start there as a basis.

Today I Learned - Iterating Through Union Types

In a previous post, we covered how using union types in TypeScript is a great approach for domain modeling because it limits the possible values that a type can have.

For example, let's say that we're modeling a card game with a standard deck of playing cards. We could model the domain as such.

type Rank = "Ace" | "Two" | "Three" | "Four" | "Five" | "Six" | "Seven"
           | "Eight" | "Nine" | "Ten" | "Jack" | "Queen" | "King"
type Suite = "Hearts" | "Clubs" | "Spades" | "Diamonds"

type Card = {rank: Rank; suite: Suite}

With this modeling, there's no way to create a Card such that it has an invalid Rank or Suite.
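
For example, the compiler now rejects values outside those unions. A quick sketch:

// Error: Type '"One"' is not assignable to type 'Rank'.
const bad: Card = { rank: "One", suite: "Hearts" };

// Compiles fine.
const good: Card = { rank: "Ace", suite: "Hearts" };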

With this definition, let's create a function to build the deck.

function createDeck(): Card[] {
  const ranks: Rank[] = ["Ace", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten", "Jack", "Queen", "King"];
  const suites: Suite[] = ["Hearts", "Clubs", "Spades", "Diamonds"];

  const deck:Card[] = [];
  for (const rank of ranks) {
    for (const suite of suites) {
      deck.push({rank, suite});
    }
  }
  return deck;
}

This code works; however, I don't like the fact that I had to explicitly list the options for both Rank and Suite, as this means I have two different representations for each, which implies that if we needed to add a new Rank or Suite, we'd need to add it in two places (a violation of DRY).

Doing some digging, I found this StackOverflow post that gave a different way of defining our Rank and Suite types. Let's try that new definition.

const ranks = ["Ace", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine", "Ten", "Jack", "Queen", "King"] as const;
type Rank = typeof ranks[number];
const suites = ["Hearts", "Clubs", "Spades", "Diamonds"] as const;
type Suite = typeof suites[number];

In the above code, the as const assertion tells TypeScript that ranks cannot change (either by reassignment or by operations like push) and that its entries are the literal strings themselves. With that definition, typeof ranks[number] says that Rank is the type of any entry in the ranks array. The same approach applies to our suites array and Suite type.

I prefer this approach much more because we have our ranks and suites defined in one place, and our code reads cleaner: it says "Here are the possible ranks, and Rank can only be one of those choices."
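
As a bonus, createDeck can now iterate over those same arrays, so there's truly one source of truth. Here's the earlier function reworked against the as const arrays (same behavior, just without the duplicated literals):

function createDeck(): Card[] {
  const deck: Card[] = [];
  // ranks and suites are the readonly arrays from the as const definitions above
  for (const rank of ranks) {
    for (const suite of suites) {
      deck.push({rank, suite});
    }
  }
  return deck;
}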

Limitations

The main limitation is that it only works for "enum"-style unions. Let's change examples and say that we want to model a series of shapes with the following.

type Circle = {radius: number};
type Square = {length: number};
type Rectangle = {height: number, width: number};

type Shape = Circle | Square | Rectangle;

To use the same trick, we would need to have an array of constant values. However, we can't have a constant value for any of the Shapes because there are an infinite number of valid Circles, Squares, and Rectangles.

Today I Learned – The Difference Between Bubble and Capture for Events

I've recently been spending some time learning about Svelte and have been going through the tutorials.

When I made it to the event modifiers section, I saw that there's a modifier for capture where it mentions firing the handler during the capture phase instead of the bubbling phase.

I'm not an expert on front-end development, and I wasn't familiar with either of these concepts. Thankfully, the Svelte docs refer out to MDN for a better explanation of the two.

What is Event Bubbling?

Long story short, by default, when an event happens, the element that was interacted with fires its handlers first, and then each ancestor receives the event in turn.

So if we have the following HTML structure where there's a body that has a div that has a button:

<body>
  <div id="container">
    <button>Click me!</button>
  </div>
  <pre id="output"></pre>
</body>

With an event listener at each level:

// Setting up adding string to the pre element
const output = document.querySelector("#output");
const handleClick = (e)=> output.textContent += `You clicked on a ${e.currentTarget.tagName} element\n`;

const container = document.querySelector("#container");
const button = document.querySelector("button");

// Wiring up the event listeners
document.body.addEventListener("click", handleClick);
container.addEventListener("click", handleClick);
button.addEventListener("click", handleClick);

When we click the button, our <pre> element will have:

You clicked on a BUTTON element
You clicked on a DIV element
You clicked on a BODY element

What is Event Capturing?

Event capturing is the opposite of event bubbling: the outermost ancestor receives the event first, then each inner ancestor in turn, until the event finally reaches the element where it started.

Let's see what happens with our example when we use the capture approach.

// Wiring up the event listeners
document.body.addEventListener("click", handleClick, {capture:true});
container.addEventListener("click", handleClick, {capture:true});
button.addEventListener("click", handleClick, {capture:true});

After clicking the button, we'll see the following messages:

You clicked on a BODY element
You clicked on a DIV element
You clicked on a BUTTON element

Why Would You Use Capture?

By default, events will work in a bubbling fashion and this intuitively makes sense to me since the element that was interacted with is most likely the right element to handle the event.

One case that comes to mind is if you find yourself attaching the same event listener to every child element. Instead, we could move that listener up to a common parent.

For example, let's say that we had the following layout

<div>
  <ul style="list-style-type: none; padding: 0px; margin:0px; float:left">
    <li><a id="one">Click on 1</a></li>
    <li><a id="two">Click on 2</a></li>
    <li><a id="three">Click on 3</a></li>
  </ul>
</div>

With the following CSS:

li {
  list-style-type: none;
  padding: 0px;
  margin: 0px;
  float: left
}

li a {
  color: black;
  background: #eee;
  border: 1px solid #ccc;
  padding: 10px 15px;
  display: block
}
This gives us a horizontal row of three link-style buttons.

With this layout, let's say that we need to run some business rules whenever any of those links are clicked. If we wired up each element individually, we would have the following code:

// Stand-in for real business rules
function handleClick(e) {
  console.log(`You clicked on ${e.target.id}`);
}
// Get all the a elements
const elements = document.querySelectorAll("a");
// Wire up the click handler
for (const e of elements) {
  e.addEventListener("click", handleClick);
}

This isn't a big deal with three elements, but let's pretend that you had a list with tens or hundreds of items. You may run into a performance hit because of the overhead of wiring up that many event listeners.

Instead, we can use one event listener, bound to the common parent. This accomplishes the same effect without as much complexity.

Let's revisit our JavaScript and make the necessary changes.

// Stand-in for real business rules
function handleClick(e) {
  // NEW: clicks on the ul itself (e.g., its padding) aren't clicks on a link,
  // so we return early when the currentTarget is also the original target
  if (e.currentTarget === e.target) {
    return;
  }
  console.log(`You clicked on ${e.target.id}`);
}
// NEW: getting the common parent
const parent = document.querySelector("ul");
// NEW: setting the event listener to be capture-based
parent.addEventListener("click", handleClick, {capture:true});

With this change, we're now only wiring up a single listener instead of having multiple wirings.

Wrapping Up

In this post, we looked at two different event propagation models, bubble and capture, the differences between the two, and when you might want to use capture.

TIL – How To Check If a Substitute Was Called Zero Times

Setup

During this past week, I’ve been working on a new feature and during development, I ended up with code that looked like this:

public class PermissionChecker
{
  public void CheckPermissions(IModuleDisabler moduleDisabler, User user)
  {
      if (user.IsAdmin) return;
      if (user.HasFullRights) ConfigureFullRights(moduleDisabler);
      else if (user.HasPartialRights) ConfigurePartialRights(moduleDisabler);
  }

  private void ConfigureFullRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
  }

  private void ConfigurePartialRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
      disabler.DisableReportModule();
      disabler.DisableUserManagementModule();
  }
}

So the code is pretty straightforward: the PermissionChecker’s job is to use the IModuleDisabler to turn off certain modules depending upon the user’s permissions.

Now that the solution is fleshed out, it’s time to write some tests around this. When it comes to testing classes that depend on non-trivial classes, I use NSubstitute, a mocking tool, to create mock versions of those dependencies. In this case, NSubstitute allows me to test how the IModuleDisabler is being used by the PermissionChecker.

For example, let’s say that I wanted to test how the PermissionChecker interacts with the IModuleDisabler when the user has partial access. I’d write a test that looks like the following:

[Test]
public void And_the_user_has_partial_access_then_the_disabler_disables_the_report_module()
{
     // Arrange
     var permissionChecker = new PermissionChecker();
     var mockDisabler = Substitute.For<IModuleDisabler>();
     var user = new User {HasPartialRights = true};

     // Act
     permissionChecker.CheckPermissions(mockDisabler, user);

     // Assert
     mockDisabler.Received(1).DisableReportModule();
}

In the above test, our assertion step is to check if the mockDisabler received a single call to the DisableReportModule. If it didn’t receive a call, then the test fails. We can write similar tests for the different modules that should be disabled for the partial rights permission and follow a similar pattern for the full rights permission.

However, things get a bit more interesting when we’re testing what happens if the user is an admin. If we follow the same pattern, we’d end up with a test that looks like this:

[Test]
public void And_the_user_has_admin_permissions_then_the_disabler_is_not_used()
{
     // Arrange
     var permissionChecker = new PermissionChecker();
     var mockDisabler = Substitute.For<IModuleDisabler>();
     var user = new User {IsAdmin = true};

     // Act
     permissionChecker.CheckPermissions(mockDisabler, user);

     // Assert
     mockDisabler.DidNotReceive().DisableSystemAdminModule();
     mockDisabler.DidNotReceive().DisableReportModule();
     mockDisabler.DidNotReceive().DisableUserManagementModule();
}

This solution works for now; however, there is a major maintainability issue. Can you spot it?

Problem

The issue arises when we add a new module to be disabled, which forces the IModuleDisabler to expose a new method. In that case, you need to remember to update this test to also check that the new method wasn’t called. If you forget, this test will still pass, but it’d pass for the wrong reason.

To help illustrate, let’s say that another method, DisableImportModule, has been added to the IModuleDisabler interface. In addition, we also need to make sure that this is called when users have partial access, but should not be called for users who are admins or users who have full access.

To fulfill those requirements, we modify the PermissionChecker like so:

public class PermissionChecker
{
  public void CheckPermissions(IModuleDisabler moduleDisabler, User user)
  {
      if (user.IsAdmin) return;
      if (user.HasFullRights) ConfigureFullRights(moduleDisabler);
      else if (user.HasPartialRights) ConfigurePartialRights(moduleDisabler);
  }

  private void ConfigureFullRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
  }

  private void ConfigurePartialRights(IModuleDisabler disabler)
  {
      disabler.DisableSystemAdminModule();
      disabler.DisableReportModule();
      disabler.DisableUserManagementModule();
      disabler.DisableImportModule();
  }
}

At this point, we’d write another test that when a user has partial access, the import module should be disabled. However, it’s very unlikely that we’d remember to update the test for the admin. Remember, for the admin, we’re checking that it received no calls to any disable methods, and the way we’re doing that is by checking each method individually.

[Test]
public void And_the_user_has_admin_permissions_then_the_disabler_is_not_used()
{
  // Arrange
  var permissionChecker = new PermissionChecker();
  var mockDisabler = Substitute.For<IModuleDisabler>();
  var user = new User {IsAdmin = true};

  // Act
  permissionChecker.CheckPermissions(mockDisabler, user);

  // Assert
  mockDisabler.DidNotReceive().DisableSystemAdminModule();
  mockDisabler.DidNotReceive().DisableReportModule();
  mockDisabler.DidNotReceive().DisableUserManagementModule();
  // Need to add check for DidNotReceive().DisableImportModule();
}

Solution

There’s got to be a better way. After some digging around, I found that any NSubstitute mock has a ReceivedCalls method that returns all of the calls that the mock received. With this new knowledge, we can refactor the previous test into the following:

[Test]
public void And_the_user_has_admin_permissions_then_the_disabler_is_not_used()
{
  // Arrange
  var permissionChecker = new PermissionChecker();
  var mockDisabler = Substitute.For<IModuleDisabler>();
  var user = new User {IsAdmin = true};

  // Act
  permissionChecker.CheckPermissions(mockDisabler, user);

  // Assert
  CollectionAssert.IsEmpty(mockDisabler.ReceivedCalls());
}

This solution is much better because if we add more modules, this test is still checking to make sure that admin users do not have any modules disabled.

Summary

When using an NSubstitute mock and you need to make sure that it received no calls to any methods or properties, you can use NSubstitute’s ReceivedCalls in conjunction with CollectionAssert.IsEmpty to ensure that the substitute was not called.

Today I Learned – The Chain of Responsibility Design Pattern

There is nothing new in the world except the history you do not know. – Harry S. Truman

The more experience I gain problem solving, the more this holds true. For this post, I’m going to first discuss the problem that I was trying to solve. Next, I’ll show what my first solution was, followed by the shortcomings of this solution. Thirdly, we’ll iterate over a better solution to the problem. This, in turn, will provide the motivation for what the Chain of Responsibility is and how to implement it. Finally, I’ll wrap up with the benefits of using this design.

Problem I was trying to solve

As part of the process of installing our software, there are scripts that will update the database from its current version to the latest version. As it stands, the process needs to be able to upgrade a database from any version to the current version.

Previous Solution

The first thing that came to mind is that I need to apply database scripts in a sequential way. For example, if the database’s current version is 1.0 and the latest version is 3.0, we would need to apply the script to upgrade the database from 1.0 to 2.0 and then apply the script to upgrade it from 2.0 to 3.0.

For the first implementation, there were only two versions, 1.0 and 2.0. Since I didn’t want to build in a lot of functionality if it wasn’t needed yet, I created a helper method that returns the correct updater for a given version. In the below code, if the version does not exist, I assume the database does not exist and return the class that will create the database. If the version is 1.0, I return a class that is responsible for upgrading a database from 1.0 to 2.0. If the version is 2.0, I return a class that doesn’t do anything (i.e. there are no upgrades to be done).

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "1.0")
    return new Database100To200Updater();
  if (version == "2.0")
    return new CurrentVersionUpdater();
  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

Problem with solution

This solution worked well when there were only three possible actions (create a new database, apply the single script, or do nothing). However, we are now going to be shipping version 3.0, and there will need to be a new class that is responsible for upgrading from 2.0 to 3.0. In order to add this functionality, I’d have to do the following:

  1. Create the Database200To300Updater class that contained the logic for updating the database from 2.0 to 3.0.
  2. Modify the Database100To200Updater class to also use the Database200To300Updater in order to perform the next part of the upgrade.
  3. Add additional logic to the above method so that if the database is 2.0 to return the Database200To300Updater class.

After making the modifications, the method now looks like:

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "1.0")
    return new Database100To200Updater(new Database200To300Updater());
  if (version == "2.0")
    return new Database200To300Updater();
  if (version == "3.0")
    return new CurrentVersionUpdater();

  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

So far, so good. We now have the logic to apply scripts in order; however, now that we’ve added version 3.0, I started to wonder what I would do if we added more versions. After some thought, it would look identical to the previous steps (see below for what would happen if we added version 4.0).

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "1.0")
    return new Database100To200Updater(new Database200To300Updater(new Database300To400Updater()));
  if (version == "2.0")
    return new Database200To300Updater(new Database300To400Updater());
  if (version == "3.0")
    return new Database300To400Updater();
  if (version == "4.0")
    return new CurrentVersionUpdater();
  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

If we create some variables to hold onto these classes, and reorder the if statements, we can write this helper method as:

public IDatabaseUpdater GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();
  if (version == "4.0")
    return new CurrentVersionUpdater();

  var database300To400Updater = new Database300To400Updater();
  var database200To300Updater = new Database200To300Updater(database300To400Updater);
  var database100To200Updater = new Database100To200Updater(database200To300Updater);

  if (version == "1.0")
    return database100To200Updater;
  if (version == "2.0")
    return database200To300Updater;
  if (version == "3.0")
    return database300To400Updater;

  throw new ArgumentException("The version " + version + " is not supported for database upgrades.");
}

Motivation for the Chain of Responsibility

What I find interesting in this design is that I’ve now chained these updater classes together so that if the 1.0 updater is returned, it will also use the 2.0 updater, which in turn calls the 3.0 updater. It was at this point that I remembered a design pattern that followed this structure.

In this design pattern, you essentially have Handlers (in my case, updaters) that check to see if they can handle the request. If they can, they do, and that stops the chain. However, if they can’t handle the request, they pass it to their Successor (which is also a Handler). The design pattern I was thinking of is the Chain of Responsibility pattern.

In order to implement this pattern, you need an IHandler interface that exposes a Handle method and either a method or property to set the Successor. The method is the action to take (in our case, Update) and the Successor represents the next Handler in the chain if the request could not be handled. The second component is the ConcreteHandlers, which are just the implementors of the interface. One way to implement this is the following:

public interface IHandler
{
  IHandler Successor { get; set; }
  void Update(int version);
}

public class ConcreteHandlerA : IHandler
{
  public IHandler Successor { get; set; }

  public void Update(int version)
  {
    if (CanTheRequestBeHandled) {
      // handle the request
    }
    else {
      Successor.Update(version);
    }
  }
}

The main difference between the pattern and what I need is that instead of doing if (canHandle)/else call Successor, what I’m really looking for is to run the upgrade script if the version we’re upgrading to is higher than our current version and then always call the successor. Given this change, here’s what that new implementation looks like:

public class ConcreteHandlerA : IHandler
{
  public IHandler Successor { get; set; }
  public void Update(int version)
  {
    if (CanTheRequestBeHandled) {
      // handle the request
    }
    Successor.Update(version);
  }
}

Implementing the Chain of Responsibility

Now that I know the pattern to use and how it works, I need to update the IDatabaseUpdater interface to follow the IHandler interface. Next, I will need to modify the concrete handlers to use the new interface correctly.

Implementing the Handler

First, we will update our IDatabaseUpdater interface to mirror the IHandler interface:

Before

public interface IDatabaseUpdater
{
  void Update();
}

After

public interface IDatabaseUpdateHandler
{
  void Update(int version);
  IDatabaseUpdateHandler Successor { get; set; }
}

Implementing the Concrete Handler

Second, we will need to update our concrete handlers to implement the interface correctly and to update their Update method to follow the design. In my case, the concrete handlers perform similar logic, so one of the classes is shown as an example.

Before
public class Database100To200Updater : IDatabaseUpdater
{
  private Database200To300Updater _successor;
  public Database100To200Updater(Database200To300Updater successor)
  {
    if (successor == null)
      throw new ArgumentNullException("successor");
    _successor = successor;
  }

  public void Update()
  {
    Console.WriteLine("Updating the database to version 2.0");
    _successor.Update();
  }
}
After

Thanks to the public property, I was able to remove the private member and that in turn allowed me to remove the constructor.

public class Database100To200Updater : IDatabaseUpdateHandler
{
  public void Update(int version)
  {
    if (version >= 2)
      Console.WriteLine("Updating the database to version 2.0");
    if (Successor != null)
      Successor.Update(version);
  }

  public IDatabaseUpdateHandler Successor { get; set; }
}

Updating the Helper Method

Now that we’ve updated the interface and implementors, it’s time to update the helper method to take advantage of the new design.

public IDatabaseUpdateHandler GetDatabaseUpdater(string version)
{
  if (string.IsNullOrWhiteSpace(version))
    return new DatabaseCreator();

  var database300To400 = new Database300To400Updater();
  var database200To300 = new Database200To300Updater();
  var database100To200 = new Database100To200Updater();

  database100To200.Successor = database200To300;
  database200To300.Successor = database300To400;

  return database100To200;
}

Chain of Responsibility is great, here’s why

What I really like about the chain of responsibility pattern is that I was able to connect my upgrade classes together in a consistent fashion. Another reason why I like this pattern is that it forces me to have the logic to determine whether I should run the update or not inside the individual classes instead of the helper method. This produces more readable code which then lends itself to easier maintainability.

Today I Learned – How to Break Down A Massive Method

During this past week, I was working with our intern and showing him some cool debugging tricks when I came across a massive method. I gave him 30 seconds to look at it and tell me what he thought the method was doing. After a while, he was able to figure it out, but the fact that it wasn’t easily discernible was enough to give pause.

The lesson here is that if you can’t determine what the method is doing easily, then it’s probably doing way too much (violating the Single Responsibility Principle) and needs to be broken into more easily readable pieces.

To demonstrate what I mean, I wrote a program that inserts Messages into a database. A Message contains a description, a number (for identification when troubleshooting), and the module it belongs to. We would have issues where different messages would have the same number, which would cause confusion when troubleshooting errors.

In the program I wrote, the user provides the message and what module the message belongs to and the program automatically generates the message number and inserts the message into the database.

For brevity’s sake, shown below is the logic for determining what the next message number should be.

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");
  if (_connection == null)
    _connection = CreateConnection();

  var results = new List<int>();

  _connection.Open();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split(' ');

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split(' ');

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

The method itself isn’t complex, just calling some stored procedures, parsing the output, and adding the numbers to a list. However, it’s not abundantly clear what the purpose of calling the stored procedures is.

First, it looks like we’re reading the alert numbers from a stored procedure call, so why don’t we extract that logic out to a helper method and have the public method call the helper?

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");

  if (_connection == null)
    _connection = CreateConnection();

  var results = new List<int>();

  _connection.Open();
  results.AddRange(ReadAlerts(module.ToUpper()));

  var cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split(' ');

    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

private List<int> ReadAlerts(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split(' ');
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

By doing this, we fix two issues at once. First, we’ve given a name to the process of reading the alerts, which in turn allows us to quickly understand what the public method is doing (i.e. improved readability).

Second, it allows for easier debugging because we now have smaller components. For example, let’s say that we were getting the wrong value. In the first implementation, we would have to put breakpoints in different areas, trying to determine which part was broken. However, in the new form, we can check whether ReadAlerts is behaving correctly. If it isn’t, we know the bug has to be in that method; otherwise, it’s in the rest.

For the next step, you may have noticed that we can repeat the same refactoring trick again, except this time, we can extract the logic for reading the errors into a helper method.

public int GetNextAlertAndErrorModuleNumber(string module)
{
  if (String.IsNullOrEmpty(module))
    throw new ArgumentException("module cannot be null or empty");
  if (_connection == null)
    _connection = CreateConnection();

  _connection.Open();

  var results = new List<int>();
  results.AddRange(ReadAlerts(module.ToUpper()));
  results.AddRange(ReadErrors(module.ToUpper()));

  if (_connection != null)
    _connection.Close();

  return results.Max() + 1;
}

private List<int> ReadAlerts(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetAlertData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ALERT_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ALERT_ID_NUMBER"].ToString().Split(' ');
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

private List<int> ReadErrors(string module)
{
  var results = new List<int>();
  var cmd = new SqlCommand("dbo.GetErrorData", _connection);
  cmd.CommandType = CommandType.StoredProcedure;

  var reader = cmd.ExecuteReader();
  while (reader.Read())
  {
    if (!reader["ERROR_ID_NUMBER"].ToString().Contains(module))
      continue;

    var pieces = reader["ERROR_ID_NUMBER"].ToString().Split(' ');
    results.Add(Int32.Parse(pieces[1]));
  }
  if (reader != null)
    reader.Close();

  return results;
}

After the changes, anyone who looks at the public API can easily see that it’s reading from both Alerts and Errors. This is really powerful because now you can communicate with non-technical people about requirements and what the code is doing.

Let’s say that in the future, the requirements change and this conversation plays out:

Alice (QA, finding an error) – Hey Bob, I was running one of our test plans and it looks like that we’re getting the wrong message number if we’re trying to add a new message and there are warning messages in the database. We are including the warning table when figuring that out, right?

Bob (Engineer, finding the root cause) – Hey you’re right, it looks like we’re only using the alerts and error tables when calculating the next number. Why don’t we write up a ticket for that and get a fix in?

The key point is that no matter how large a method is, there are always steps being performed in some order (by definition of an algorithm), and identifying those steps is the key to refactoring larger methods into more maintainable pieces of code. The trick is determining what those steps are and deciding whether they should become helper methods or helper classes.

If those steps become complicated, then they should be broken out into helper methods. As time progresses and those helper methods start to become more complicated, then those helper methods should in turn become classes of their own.

Today I Learned: The Law of Demeter

Don’t pass in more information than you need. It sounds simple, but when working with messy legacy code, it’s easy to forget.

The impetus for this post came from a peer code review. During the review, I found this method:

public IStrategy GetStrategy(Project project, bool isAffected)
{
  var type = project.Type;
  if (type == ProjectType.A && isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

At first glance, it looks pretty good. The logic was sound and it seemed to be returning a class implementing an interface, similar to what we would expect for the Factory pattern. However, there’s a slight problem; can you spot it?

The issue is with the first parameter, Project. The method takes a Project; however, we’re only really depending on the Project’s Type property.


So why don’t we get rid of the dependency on the Project and replace it with a dependency on the ProjectType instead?

public IStrategy GetStrategy(ProjectType type, bool isAffected)
{
  if (type == ProjectType.A && isAffected)
    return new ProjectAIsAffectedStrategy();
  if (type == ProjectType.B)
    return new ProjectBStrategy();
  // Similar if statements
}

Instinctively, I knew this was the right call, but I couldn’t remember why. After some digging, I remembered that this is a Law of Demeter violation, better known as a violation of the Principle of Least Knowledge.

In general, this principle states that a method should be given only the information it needs to do its job. Another classic violation of this principle is when you reach through a class’s internals to use its internals’ internals. For example,

SomeClassA.SomePropertyB.WithSomeMethodC()

One of the reasons that I really like the Law of Demeter is that if you follow it, you create easier-to-test methods. Don’t believe me? Which is easier to create: the Project class (which may have many dependencies that would need to be stubbed) or the ProjectType enum (which by definition has zero dependencies)?

Another reason that following the Law of Demeter is good practice is that it forces your code to be explicit about what dependencies are required. For example, in the first implementation, the caller knew that the method needed a Project, but had no clue how much of the Project it needed (does it need all of the properties set? Does it need further setup besides basic instantiation?). However, with the refactored version, it’s much clearer that the method depends not on a whole Project, but just on a ProjectType.