GitHub Private Repos

I've been a paying customer of GitHub pretty much since the site launched, when I signed up in March 2008. Back then I was working on a number of projects that I wanted to keep private, including a few where I wanted to collaborate with other people. That has changed over the years. When I recently reviewed which repos I still kept private, I found that the only ones I still cared about were a handful of things like my blog and my dot files that I was developing on my own (i.e. without collaborating with others).

For a non-corporate entity like myself, I found that GitHub is really only providing a few value adds: a collaboration workflow, hosting for the repos, and backups of my data.

I'm not really collaborating with others on the private repos I want to keep private; and if I were, I've found that I dislike the collaboration workflow that GitHub implements anyway. Hosting repos with git literally just requires having ssh access to the remote server and git itself installed, which I already have on the server running eklitzke.org. Doing secure backups to the cloud is trivial nowadays with S3 and GCP. So I decided to cancel my GitHub plan and just host my remaining private repos myself.

If you've never hosted your own git repo before, here's what you do. On the server where you want to host the repo, run:

mkdir ~/myproject.git
cd ~/myproject.git
git init --bare

Then in a checkout of your project (i.e. on the client) you do something like:

git remote add upstream myserver:myproject
git push upstream master

That's it. Now you can push/pull as usual, and to do a checkout you'd just do:

git clone myserver:myproject
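
If you want to try the whole round trip before touching a real server, here's a rough sketch that runs everything on one machine. It assumes a scratch directory from mktemp stands in for both the server and the client, so the remote is a local path rather than myserver:myproject; the directory names, the README file, and the commit identity are made up for illustration.

```shell
#!/bin/bash
set -eu

# Scratch directory standing in for both "server" and "client" (hypothetical layout).
WORK="$(mktemp -d)"

# "Server" side: create the bare repository.
mkdir "$WORK/myproject.git"
git init --quiet --bare "$WORK/myproject.git"

# "Client" side: a small project with a single commit.
mkdir "$WORK/checkout"
cd "$WORK/checkout"
git init --quiet
echo "hello" > README
git add README
git -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"

# Push the current branch (master or main, depending on your git configuration).
BRANCH="$(git rev-parse --abbrev-ref HEAD)"
git remote add upstream "$WORK/myproject.git"
git push --quiet upstream "$BRANCH"

# A fresh clone from the bare repository should contain the commit.
git clone --quiet "$WORK/myproject.git" "$WORK/clone"
```

With a real server, the only difference should be that the remote is an ssh location such as myserver:myproject instead of a local path.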

Backups are also super easy. I'm using Google Cloud Storage, but you could just as easily use S3 or any other cloud storage service. Here's the actual script I'm using to back up each of the git projects I'm hosting in this way:

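#!/bin/bash
# Archive a bare git repo and upload the archive to cloud storage.
# Usage: run from the directory containing NAME.git, passing NAME.

set -eu
TARGET="$1.git"

if [ ! -d "$TARGET" ]; then
    echo "Cowardly refusing when $TARGET does not exist." >&2
    exit 1
fi

BACKUP="${TARGET}.tar.xz"

# Remove the local tarball on exit, whether or not the upload succeeded.
function cleanup {
    rm -f "${BACKUP}"
}
trap cleanup EXIT

# Create an xz-compressed tarball, then copy it to the bucket.
tar -cvJf "${BACKUP}" "${TARGET}"
gsutil cp "${BACKUP}" gs://MY-PRIVATE-BUCKET/git/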

I have a cron job that runs this script daily for each of my repos. You could easily GPG-encrypt the backups by piping the output of the tar command to gpg -e, but I don't actually have any sensitive content in these repos so I don't bother.
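
For anyone who does want encryption, here's a minimal sketch of what the encrypted variant and the cron entry might look like. The recipient address, the backup_encrypted function name, and the backup-git.sh file name are all placeholders, and it assumes the recipient's public key is already in your gpg keyring.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical recipient; replace with the ID of a key in your own keyring.
RECIPIENT="you@example.com"

# Archive the bare repo named by $1 and encrypt the stream with gpg,
# so an unencrypted tarball is never written to disk.
backup_encrypted() {
    local target="$1.git"
    local backup="${target}.tar.xz.gpg"
    tar -cJf - "${target}" \
        | gpg --encrypt --recipient "${RECIPIENT}" --output "${backup}"
    gsutil cp "${backup}" gs://MY-PRIVATE-BUCKET/git/
    rm -f "${backup}"
}

# Example crontab entry for a daily 03:00 run of the unencrypted script
# (assumes it is saved as ~/backup-git.sh and the repos live in $HOME):
#   0 3 * * * cd "$HOME" && ./backup-git.sh myproject
```

The function is only defined here, not called; you'd invoke it as backup_encrypted myproject from the directory that contains myproject.git.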

One of the nice things about git is that even if my server goes down, restoring a git repo from a backup made in this fashion is trivial. Unpacking the backup tarball creates an actual git repo in the "bare" format, and converting from the bare format to the regular format is just as easy (I would do it by passing the path to the bare repo to a regular git clone invocation).
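
As a rough sketch of that restore path: the first half below fabricates a stand-in backup (a tiny bare repo archived the same way as the backup script does), and the second half is the actual restore. The scratch directory, the seed checkout, and the restore directory are hypothetical names, and the sketch assumes xz is installed for tar's -J flag.

```shell
#!/bin/bash
set -eu

WORK="$(mktemp -d)"
cd "$WORK"

# --- Stand-in for a real backup: a bare repo with one commit, archived. ---
git init --quiet --bare myproject.git
git init --quiet seed
echo "hello" > seed/README
git -C seed add README
git -C seed -c user.name="Example" -c user.email="example@example.com" \
    commit --quiet -m "Initial commit"
git -C seed push --quiet "$WORK/myproject.git" HEAD
tar -cJf myproject.git.tar.xz myproject.git
rm -rf myproject.git seed

# --- The restore itself. ---
# Unpacking the tarball recreates the bare repo.
mkdir restore
tar -xJf myproject.git.tar.xz -C restore

# Cloning the bare repo yields a regular working checkout.
git clone --quiet restore/myproject.git myproject
```

After this runs, restore/myproject.git is the bare repo from the tarball and myproject is an ordinary checkout whose origin remote points at it.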

A few years ago I would have been happy to continue being a paying customer for GitHub just to support them as an organization, even if the value they were adding was small. Recently my thoughts on this have changed. Today I'm skeptical that GitHub is providing a positive impact on the open source/free software movements, and I'm worried about the near-monopoly status they have on new open source projects. This is a complex topic, so perhaps I'll flesh out my ideas on it more in a future post.