Better Domain Modeling with Discriminated Unions
When I think about software, I like designing it so that doing the right thing is easy and doing the wrong thing is impossible (or at least very hard). This approach is typically called falling into the pit of success.
Having a well-defined domain model can prevent many mistakes simply because the code won't let them happen (either through a compilation error or other mechanisms).
I'm a proponent of functional programming as it allows us to model software in a better way that can reduce the number of errors we make.
Let's look at one of my favorite techniques: discriminated unions.
In the GitHub API, there's an endpoint that allows you to get the events that have occurred for a pull request.
Let's take a look at the example response in the docs.
Based on the name of the docs, we'd expect to get back an array of events, each of which we'll call a TimelineEvent.
Let's go ahead and define the TimelineEvent type. One approach is to start copying the fields from the events in the array. By doing this, we would get the following.
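As a rough sketch of what that flat definition might look like, here is a trimmed-down version. The field subset, the Label and Rename helper types, and the sample values are assumptions for illustration; the real API payload carries many more fields.

```typescript
// Hypothetical helper shapes for the nested objects.
interface Label {
  name: string;
  color: string;
}

interface Rename {
  from: string;
  to: string;
}

// One flat type that tries to cover every event shape at once.
interface TimelineEvent {
  id: number;
  event: string; // e.g. "locked", "labeled", "renamed"
  created_at: string;
  // Only present on some events, so each one has to be optional.
  lock_reason?: string | null;
  label?: Label;
  rename?: Rename;
}

// Made-up sample data, one value per event kind.
const events: TimelineEvent[] = [
  { id: 1, event: "locked", created_at: "2020-01-01T00:00:00Z", lock_reason: "resolved" },
  { id: 2, event: "labeled", created_at: "2020-01-02T00:00:00Z", label: { name: "bug", color: "d73a4a" } },
  { id: 3, event: "renamed", created_at: "2020-01-03T00:00:00Z", rename: { from: "Old title", to: "New title" } },
];
```

Nothing in this type ties the value of event to which optional fields are actually filled in; that knowledge lives only in the programmer's head.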
This definition will work, as it will cover all the data. However, the problem with this approach is that label, lock_reason, and rename had to be defined as nullable because they are only specified some of the time (for example, lock_reason isn't specified for a labeled event).
Let's say that we wanted to write a function that prints data about a TimelineEvent. We would have to write something like the following:
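A minimal sketch of such a function against the flat type, assuming the same trimmed-down fields as before. The names describeEvent and printEvent are hypothetical, and the message formats are made up.

```typescript
// Flat model: every event-specific field is optional.
interface TimelineEvent {
  id: number;
  event: string;
  lock_reason?: string | null;
  label?: { name: string; color: string };
  rename?: { from: string; to: string };
}

// Builds a one-line description of an event.
function describeEvent(e: TimelineEvent): string {
  if (e.event === "locked") {
    // `!` tells the compiler "trust me, this is set"; nothing verifies it.
    return `locked (reason: ${e.lock_reason!})`;
  } else if (e.event === "labeled") {
    return `labeled with ${e.label!.name}`;
  } else if (e.event === "renamed") {
    return `renamed from ${e.rename!.from} to ${e.rename!.to}`;
  }
  return `unknown event: ${e.event}`;
}

// Prints the description to the console.
function printEvent(e: TimelineEvent): void {
  console.log(describeEvent(e));
}
```

The compiler accepts every one of those ! assertions, including wrong ones. If a labeled event arrives without a label, the code still type-checks and then fails at runtime when it reads name from undefined.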
The main problem is that we have to remember that the labeled event has a label property, but not the lock_reason property. It might not be a big deal right now, but given that the GitHub API has over 40 event types, keeping track of which properties belong where quickly becomes challenging.
The pattern here is that we have a type, TimelineEvent, that can have different, separate shapes, and we need a type that can represent all the shapes.
One of the cool things about TypeScript is that it has a union operator (|) that allows you to define a type as one of several other types.
Let's refactor our TimelineEvent model to use the union operator.
First, we need to define the different events as their own types.
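One way this could look, again with an assumed, trimmed-down field set. Typing event as a string literal on each interface is my assumption; it is what later lets the compiler tell the shapes apart.

```typescript
// Each event gets its own type, carrying only the fields it really has.
interface LockedEvent {
  id: number;
  event: "locked";
  lock_reason: string | null; // assumed nullable when no reason is given
}

interface LabeledEvent {
  id: number;
  event: "labeled";
  label: { name: string; color: string };
}

interface RenamedEvent {
  id: number;
  event: "renamed";
  rename: { from: string; to: string };
}

// Made-up sample values.
const locked: LockedEvent = { id: 1, event: "locked", lock_reason: "resolved" };
const labeled: LabeledEvent = { id: 2, event: "labeled", label: { name: "bug", color: "d73a4a" } };
const renamed: RenamedEvent = { id: 3, event: "renamed", rename: { from: "Old title", to: "New title" } };

// Adding `label: {...}` to the `locked` literal above would be rejected
// by the compiler as an excess property.
```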
At this point, we have three types, one for each specific event. A LockedEvent has no knowledge of a label property and a RenamedEvent has no knowledge of a lock_reason property.
Next, we can update our definition of TimelineEvent to use the union operator like so.
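A sketch of the union definition. The three event types are repeated here in compact form, with the same assumed fields, so the snippet stands on its own.

```typescript
type LockedEvent = { id: number; event: "locked"; lock_reason: string | null };
type LabeledEvent = { id: number; event: "labeled"; label: { name: string; color: string } };
type RenamedEvent = { id: number; event: "renamed"; rename: { from: string; to: string } };

// A TimelineEvent is exactly one of the three shapes.
type TimelineEvent = LockedEvent | LabeledEvent | RenamedEvent;

// A mixed array still type-checks, because each element matches one member.
const events: TimelineEvent[] = [
  { id: 1, event: "locked", lock_reason: null },
  { id: 2, event: "labeled", label: { name: "bug", color: "d73a4a" } },
  { id: 3, event: "renamed", rename: { from: "Old title", to: "New title" } },
];
```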
This would be read as: a TimelineEvent can be either a LockedEvent, a LabeledEvent, or a RenamedEvent.
With this new definition, let's rewrite the function that prints data about a TimelineEvent.
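Here is a sketch of how the rewritten function could look. The describeEvent and printEvent names and message formats are hypothetical, and the never check in the default branch is an optional extra that makes the compiler complain if a new event type is added but not handled.

```typescript
type LockedEvent = { id: number; event: "locked"; lock_reason: string | null };
type LabeledEvent = { id: number; event: "labeled"; label: { name: string; color: string } };
type RenamedEvent = { id: number; event: "renamed"; rename: { from: string; to: string } };
type TimelineEvent = LockedEvent | LabeledEvent | RenamedEvent;

function describeEvent(e: TimelineEvent): string {
  // Switching on `event` narrows `e` to the matching member of the union.
  switch (e.event) {
    case "locked":
      return `locked (reason: ${e.lock_reason ?? "none given"})`;
    case "labeled":
      // `e` is a LabeledEvent here, so `label` is known to exist.
      return `labeled with ${e.label.name}`;
    case "renamed":
      return `renamed from ${e.rename.from} to ${e.rename.to}`;
    default: {
      // Only type-checks while every member of the union is handled above.
      const unreachable: never = e;
      return unreachable;
    }
  }
}

function printEvent(e: TimelineEvent): void {
  console.log(describeEvent(e));
}
```

Inside the "labeled" case, writing e.lock_reason is a compile error rather than a runtime surprise.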
Not only do we not have to use the ! operator to ignore type safety, but we also have better autocomplete (note that lock_reason and rename don't appear when working with a labeled event).
At a general level, what we've modeled is a sum type and it's great for when you have a type that can take on a finite number of differing shapes.
Sum types are implemented as either tagged unions or untagged unions. TypeScript has untagged unions; other languages, like Haskell and F#, use tagged unions. Let's see what the same implementation in F# would have looked like.
A tagged union is one where each shape has a specific constructor. So in the F# version, Locked is the tag for the LockedEvent, Labeled is the tag for the LabeledEvent, and so on. In the TypeScript example, we worked around this because the event property is on every TimelineEvent and has a different value for each shape.
If that weren't true, then we would have had to add a field to TimelineEvent (typically called tag) that would help us differentiate between the various shapes.
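To make that concrete, here is a sketch of the same model with an explicit tag field, assuming the payload had no shared event property. The constructor helpers Locked, Labeled, and Renamed are hypothetical names chosen to mirror the tags; they are not part of any library.

```typescript
interface Label {
  name: string;
  color: string;
}

// No shared `event` field here, so `tag` is added purely to tell shapes apart.
type TimelineEvent =
  | { tag: "Locked"; lock_reason: string | null }
  | { tag: "Labeled"; label: Label }
  | { tag: "Renamed"; from: string; to: string };

// Small constructor helpers so callers never write the tag by hand.
const Locked = (lock_reason: string | null): TimelineEvent => ({ tag: "Locked", lock_reason });
const Labeled = (label: Label): TimelineEvent => ({ tag: "Labeled", label });
const Renamed = (from: string, to: string): TimelineEvent => ({ tag: "Renamed", from, to });

function describe(e: TimelineEvent): string {
  // Narrowing works the same way, just keyed on `tag` instead of `event`.
  switch (e.tag) {
    case "Locked":
      return `locked (reason: ${e.lock_reason ?? "none given"})`;
    case "Labeled":
      return `labeled with ${e.label.name}`;
    case "Renamed":
      return `renamed from ${e.from} to ${e.to}`;
    default: {
      const unreachable: never = e;
      return unreachable;
    }
  }
}
```

The helpers play roughly the role that constructors play in a language with built-in tagged unions: each one produces exactly one shape and stamps the right tag on it.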
When defining domain models where the model can have different shapes, you can use a sum type to define the model.