August 10, 2019

Automated Blog Post Tag Generation


This blog is created with Jekyll using Github Pages. Jekyll has support for tagging blog posts, but Github Pages doesn’t whitelist those plugins, so if you want to use tags with GH Pages, you’ll have to do it yourself. I followed the guidelines on this blog post to get my tags up and running, which worked well, but I also wanted to automate the generation of the tag files in tags/ so I wouldn’t forget to do it.

Here’s the generation script I wrote:

from os import walk

def filenames_in_folder(folder_name):
  names = []
  for (dirpath, dirnames, filenames) in walk(folder_name):
    names.extend(filenames)
    break
  return names

def tags_from_post(f):
  tags = set()
  for line in f:
      line = line.rstrip()
      if line.startswith("tags: "):
          file_tags = line.split(" ")[1:]
          tags.update(file_tags)
  return tags

files = filenames_in_folder("_posts/")

# Get all tags from all posts
tags = set()
for file in files:
  with open(u"_posts/{}".format(file)) as f:
      tags.update(tags_from_post(f))

existing_tags = set([f[:-3] for f in filenames_in_folder("tags/")])
new_tags = tags - existing_tags

def file_text(tag_name):
  return u"""---
layout: tagpage
title: "Tag: {}"
tag: {}
---
""".format(tag_name, tag_name)

# For each new tag, create a file in tags/ with the correct boilerplate
for tag in new_tags:
    with open(u"tags/{}.md".format(tag), "w+") as f:
      f.write(file_text(tag))

And I added a pre-commit hook to .git/:

#!/bin/sh
python3 generate_tags.py
git add .
exit 0

Now, whenever I commit, if I added new tags, generate_tags.py will run and the new tag files will automatically get added to the commit. It’s a small piece of automation, but now I won’t have to worry about forgetting to run it and breaking my tag system.