Blog Generator Rewrite

I've always used a custom static HTML generator for this site. It started off as a Python program, and later I rewrote it in Go. Part of my motivation for the Go rewrite was to make it run faster, but I was also using Go a lot at work, and I thought porting my code to Go would be good exercise for my Go programming skills. Well, now I've rewritten it again, this time in C++. My motivation this time wasn't so much about making it run faster (although it does run much faster now), but rather that I now do most of my coding, at work and at home, in C++, and once again I thought it would be a good programming exercise.

The design of the new codebase is very similar to that of the Go version. I still write all of my blog posts as Markdown files, with a simple ad-hoc format at the top of each file to indicate the date of the post and other special metadata (for instance, a few of my posts use JavaScript, so I have a way to indicate if a JS file is needed for the page). I didn't have to update any of my existing Markdown files for the C++ conversion.
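
As an illustration only, a header like this could be parsed along the following lines. This sketch assumes a hypothetical format of `key: value` lines terminated by the first blank line; the `Post` struct and the `parse_post` and `trim` functions are names invented for the example, not the generator's actual code.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical in-memory form of one post.
struct Post {
    std::map<std::string, std::string> meta;  // header fields, e.g. "date"
    std::string body;                         // Markdown after the header
};

// Strip leading and trailing whitespace.
static std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t\r");
    if (first == std::string::npos) return "";
    const auto last = s.find_last_not_of(" \t\r");
    return s.substr(first, last - first + 1);
}

// Assumed format: "key: value" lines, then a blank line, then Markdown.
Post parse_post(std::istream& in) {
    Post post;
    std::string line;
    bool in_header = true;
    while (std::getline(in, line)) {
        if (in_header) {
            if (trim(line).empty()) {  // a blank line ends the header
                in_header = false;
                continue;
            }
            const auto colon = line.find(':');
            if (colon != std::string::npos) {
                post.meta[trim(line.substr(0, colon))] =
                    trim(line.substr(colon + 1));
            }
            continue;  // header lines without a colon are ignored
        }
        post.body += line;
        post.body += '\n';
    }
    return post;
}
```

Feeding this a `std::istringstream` holding `date: 2018-03-04`, `js: demo.js`, a blank line, and then the Markdown would give a `meta` map with `date` and `js` entries and leave everything after the blank line in `body`.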

I parse the Markdown files using md4c, which was the best Markdown parser/generator I could find for C or C++ (and I tried a few). In some cases it generates slightly different HTML than the old library I was using (russross/blackfriday), particularly for footnotes, but nothing major has changed; as far as I can tell all the old pages still render fine. To emit the final HTML for each page I need to add some header/footer content, which I do using Jinja2Cpp, a C++ port of the Jinja template engine. I had to rewrite my HTML templates, since the Go version used the html/template package, which has a different syntax than Jinja. This wasn't much work, though, as I only have a few templates and they're quite simple. I generate the static gzip content using zopfli. I feel like this part of the code improved a lot: in the Go version I had to shell out to the zopfli command-line program to do the compression (to avoid using cgo), whereas now I can just call the library directly.

As in the Go version, I generate the static HTML in parallel using all my CPU cores. I'm using CTPL (with the boost::lockfree::queue backend) to manage the worker thread pool. This was really easy to do, and isn't really any more work than using goroutines. One new feature I added is a file watcher mode: the HTML generator binary uses inotify watches to detect local filesystem changes and re-generates any changed content on the fly. I'm using the inotify_add_watch system call directly, rather than going through some high-level library. That's one of the things I like about writing C++: I can just use regular C libraries (like md4c) or C system calls without any fuss.

Overall this project was a lot of fun because I got to use a number of new libraries and see how a real-world port of a Go program to C++ stacks up. The two versions have about the same number of lines of code, and I didn't run into any particular difficulties in the C++ version. The C++ version runs a lot faster, although it's not an apples-to-apples comparison: the Go version was doing things like HTML escaping, which the C++ version does not, and I think md4c supports a smaller subset of Markdown.

I've been writing C++ full time at work for about a year and a half now, and I have a lot to say about my experiences with C++ generally. The language has a few serious warts, but overall I'm very happy writing C++ and I think outsiders have a lot of serious misconceptions about the language. Hopefully I'll get back to updating this blog more actively and write up my thoughts on this topic soon.