Scaling Effectiveness with Docs - Finding Stale Docs
In a previous post, I argued that to help your team be effective, you need to have up-to-date docs, and to have this happen, you need some way of flagging stale documentation.
In this series, I show you how you can automate this process by creating a seed script, a check script, and then automating the check script. In today's post, let's develop the check script.
Breaking Down the Check Script
At a high level, our script will need to perform the following steps:
- Specify the location to search.
- Find all the markdown files in the directory.
- Get the "Last Reviewed" line of text.
- Check if the date is more than 90 days in the past.
- If so, print the file to the screen.
Specifying Location
Our script is going to search over our repository; however, I don't want it to be responsible for cloning and cleaning up those files. Since the long-term plan is for our script to run through GitHub Actions, we can have the pipeline be responsible for cloning the repo.
This means that our script has to be told where to search. Since it can't take manual input, we're going to use an environment variable to tell it.
First, let's create a .env file that will store the path of the repository.
From there, we can start working on our script to have it use this environment variable.
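A minimal sketch of how index.ts might read that variable. The variable name REPO_DIRECTORY and the getRepoDirectory helper are hypothetical, and the sketch assumes the .env file has already been loaded into the process environment (for example, by a dotenv loader). The environment is passed in as a parameter so the function can be tried out without touching the real one.

```typescript
// index.ts (sketch)
// Assumption: .env contains a line such as
//   REPO_DIRECTORY=/path/to/cloned/repo
// and has already been loaded into the environment.

// Anything with a get(key) method works here, including Deno.env.
type EnvReader = { get(key: string): string | undefined };

// Hypothetical helper: returns the configured path or fails loudly.
function getRepoDirectory(env: EnvReader): string {
  const directory = env.get("REPO_DIRECTORY");
  if (!directory) {
    throw new Error("REPO_DIRECTORY is not set");
  }
  return directory;
}
```

In the script itself, calling console.log(getRepoDirectory(Deno.env)) would log the configured path, which is why the run command needs the --allow-env flag.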
If we run our Deno script with the command deno run --allow-read --allow-env ./index.ts, we should see the environment variable logged.
Finding all the Markdown Files
Now that we have a directory, we need a way to get all the markdown files from that location.
Doing some digging, I didn't find a built-in library for doing this, but building our own isn't too terrible.
By using Deno.readDir (or its synchronous counterpart, Deno.readDirSync), we can get all the entries in the specified directory.
From here, we can then recurse into the other folders and get their markdown files as well.
Let's create a new file, utility.ts, and add a new function, getMarkdownFilesFromDirectory.
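A minimal sketch of what such a function might look like, assuming the synchronous Deno.readDirSync API and a simple ".md" extension check. The DirEntryLike and ReadDir types and the readDir parameter are illustrative additions so the recursion can be tried against an in-memory tree; paths are joined with a plain "/" for brevity.

```typescript
// utility.ts (sketch)

// The subset of Deno.DirEntry that the function relies on.
type DirEntryLike = { name: string; isFile: boolean; isDirectory: boolean };
type ReadDir = (path: string) => Iterable<DirEntryLike>;

// Recursively collects the paths of all markdown files under `directory`.
// `readDir` defaults to Deno.readDirSync and can be swapped out for testing.
function getMarkdownFilesFromDirectory(
  directory: string,
  readDir: ReadDir = Deno.readDirSync,
): string[] {
  const files: string[] = [];
  for (const entry of readDir(directory)) {
    const fullPath = `${directory}/${entry.name}`;
    if (entry.isDirectory) {
      // Recurse into subfolders and keep whatever they contain.
      files.push(...getMarkdownFilesFromDirectory(fullPath, readDir));
    } else if (entry.isFile && entry.name.toLowerCase().endsWith(".md")) {
      files.push(fullPath);
    }
  }
  return files;
}
```

Because the default reader touches the file system, calling the function with only a directory requires the --allow-read flag.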
With this function in place, we can update our index.ts script to call it and print each file it finds.
Running the script with deno run --allow-read --allow-env ./index.ts should print a list of all the markdown files to the screen.
Getting the Last Reviewed Text
Now that we have each file, we need a way to get its Last Reviewed line.
Using Deno.readTextFile (or Deno.readTextFileSync), we can get the file contents. From there, we can split them into lines and then find the last occurrence of Last Reviewed.
Let's add a new function, getLastReviewedLine, to the utility.ts file.
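A minimal sketch of such a function, assuming the synchronous Deno.readTextFileSync API and that a review line starts with the literal text "Last Reviewed On". The findLastReviewedLine helper and the readText parameter are illustrative additions that keep the line-matching logic separate from file access.

```typescript
// utility.ts (sketch)

type ReadText = (path: string) => string;

// Returns the last line that starts with "Last Reviewed On",
// or undefined when the text has no such line.
function findLastReviewedLine(contents: string): string | undefined {
  const lines = contents.split(/\r?\n/);
  // Walk backwards so the latest occurrence wins.
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = lines[i].trim();
    if (line.startsWith("Last Reviewed On")) {
      return line;
    }
  }
  return undefined;
}

// Reads the file and delegates to the pure helper above.
// `readText` defaults to Deno.readTextFileSync.
function getLastReviewedLine(
  path: string,
  readText: ReadText = Deno.readTextFileSync,
): string | undefined {
  return findLastReviewedLine(readText(path));
}
```

A file for which getLastReviewedLine returns undefined has no review line at all, which is the case worth surfacing first.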
Let's try this function out by modifying our index.ts file to display files that don't have a Last Reviewed On line.
Determining If A Page Is Stale
At this point, we can get the "Last Reviewed On" line from a file, but we've got some more business rules to implement.
- If there's a Last Reviewed On line, but there's no date, then the file needs to be reviewed.
- If there's a Last Reviewed On line, but the date is invalid, then the file needs to be reviewed.
- If there's a Last Reviewed On line, and the date is more than 90 days old, then the file needs to be reviewed.
- Otherwise, the file doesn't need review.
We know from our filter logic that we're only going to be looking at lines that start with "Last Reviewed On", so now we need to extract the date.
Since we assume each of these lines starts with the literal text Last Reviewed On, we can use substring to drop that prefix and keep the rest of the line. We're also going to assume that the date will be in YYYY/MM/DD format.
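The rules above can be sketched as follows. The function names parseReviewDate and needsReview are hypothetical, as is the assumption that an optional colon or whitespace separates the prefix from the date. Dates are parsed by hand and compared in UTC so that an impossible date such as 2024/02/30 counts as invalid instead of silently rolling over.

```typescript
// utility.ts (sketch)

const PREFIX = "Last Reviewed On";
const STALE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Extracts the YYYY/MM/DD date that follows the prefix.
// Returns null when the date is missing or invalid.
function parseReviewDate(line: string): Date | null {
  // Drop the prefix, then any ":" or whitespace separator (an assumption).
  const rest = line.trim().substring(PREFIX.length).replace(/^[:\s]+/, "").trim();
  const match = /^(\d{4})\/(\d{2})\/(\d{2})$/.exec(rest);
  if (!match) return null;
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const date = new Date(Date.UTC(year, month - 1, day));
  // Reject dates that JavaScript would roll over into another month or year.
  const valid = date.getUTCFullYear() === year &&
    date.getUTCMonth() === month - 1 &&
    date.getUTCDate() === day;
  return valid ? date : null;
}

// True when the line has no usable date or the date is more than 90 days old.
function needsReview(line: string, today: Date = new Date()): boolean {
  const reviewedOn = parseReviewDate(line);
  if (reviewedOn === null) return true;
  const ageInDays = (today.getTime() - reviewedOn.getTime()) / MS_PER_DAY;
  return ageInDays > STALE_AFTER_DAYS;
}
```

Passing today as a parameter keeps the check deterministic when trying it out; in the script it can be left to default to the current date.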
Let's update our index.ts file to use the new function.
And just like that, we're able to print stale docs to the screen. At this point, you could create a scheduled batch job and start using this script.
However, if you want to share this with others (and have it run somewhere other than your own machine), then stay tuned for the final post in this series, where we put this into a GitHub Action and post a message to Slack!