How To Host Your Own Private Git Repositories

Hosting your own private git repositories is really easy. If you already have a dedicated web server (e.g. to host your blog), you have everything you need to host your own repos. Doing this is cheaper than paying GitHub, and it will give you the satisfaction of being a True Hacker.

Git can operate over multiple protocols. I strongly recommend using the SSH protocol. It's very easy to set up: you almost certainly already have SSH set up on your server. When you use git over SSH, there's no daemon process to run or manage. All you need is the regular git command line tools installed on your server. I recommend creating a dedicated git user, rather than using your regular login. Creating a dedicated user makes it possible to share access to git repos on your server with another person, and it will also keep your home directory cleaner. Create a user called git, and set the login shell to git-shell. On a Debian or Ubuntu system this would be done with a command like:

# Create a new UNIX account for the git user.
sudo adduser --shell $(which git-shell) git

Setting the shell to git-shell limits the account to only running git commands. This isn't strictly necessary, but it increases security. This is especially helpful if you want to give other people access to your repositories, since it prevents them from getting shell access on your server.

To create a repository named foo you simply create a directory called foo.git in the git user's home directory, and run git init --bare in that directory. The .git suffix is just a convention for bare repositories. A bare repository is one that just has the git metadata in it, and doesn't have an actual working copy checked out:

# Create a new git repository called "foo".
sudo -u git mkdir ~git/foo.git
sudo -u git git -C ~git/foo.git init --bare
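
If you create repositories often, the two commands above can be wrapped in a small helper. This is only a sketch: the function name `new_repo` and its arguments are made up for illustration, and it assumes it is run by a user who can write to the target directory (normally the git user).

```shell
#!/bin/bash
# new_repo NAME [BASEDIR] (hypothetical helper, not part of git)
# Create BASEDIR/NAME.git as a bare repository. BASEDIR defaults to $HOME.
new_repo() {
    local name=${1%.git} base=${2:-$HOME}
    local dir="$base/$name.git"
    # Never touch a repository that is already there.
    if [ -e "$dir" ]; then
        echo "refusing to overwrite existing $dir" >&2
        return 1
    fi
    mkdir -p "$dir"
    git -C "$dir" init --bare --quiet
}
```

Passing the base directory explicitly, as in `new_repo foo ~git`, avoids depending on what `$HOME` happens to be under sudo.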

Add your SSH key to ~git/.ssh/authorized_keys, and then you should be able to clone the repo locally using a command like:

# Clone the repo; you can also use git@example.com:foo.git
git clone git@example.com:foo
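
Because the git user's shell is git-shell, the usual trick of logging in as that user to install a key is not available, so the key has to be put in place from another account on the server. Here is a minimal sketch of that step. The function name `install_key` and every path in the usage note are assumptions for illustration; 700 and 600 are the conventional restrictive permissions for `.ssh` and `authorized_keys`.

```shell
#!/bin/bash
# install_key PUBKEY_FILE SSH_DIR (hypothetical helper)
# Append the key in PUBKEY_FILE to SSH_DIR/authorized_keys, creating the
# directory and file with restrictive permissions if they do not exist.
install_key() {
    local pubkey=$1 sshdir=$2
    local auth="$sshdir/authorized_keys"
    mkdir -p "$sshdir"
    chmod 700 "$sshdir"
    touch "$auth"
    chmod 600 "$auth"
    # Append only if this exact key line is not already present.
    if ! grep -qxFf "$pubkey" "$auth"; then
        cat "$pubkey" >>"$auth"
    fi
}
```

One way to use it is to save the function in a file that ends with `install_key "$@"`, copy your public key to the server, and run something like `sudo -u git bash /tmp/install-key.sh /tmp/id_ed25519.pub ~git/.ssh`, so that the files end up owned by the git user.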

You'll also want to regularly back up your git repositories, in case Something Bad happens. To this end, I've written a small shell script that creates a tar archive for each git directory, and uploads it to a private Google Cloud Storage bucket if the SHA-1 of the archive has changed since the last time the script ran. I run this via a cron job, set to run once an hour. Here's what my backup script looks like:

#!/bin/bash
set -eux
GITDIR=/home/git
BUCKET=gs://my-google-cloud-bucket/git/

# Renice ourselves.
renice -n 10 $$

# Clean things up when we're done.
trap "rm -f /tmp/*.git.tar.xz" EXIT

cd /tmp

# Create a tar archive of each git repo.
for proj in "${GITDIR}"/*.git; do
    base=$(basename "$proj")
    tar -C "$GITDIR" -cJf "${base}.tar.xz" "$base"
done

upload() {
    if [ $# -gt 0 ]; then
        gsutil -q cp "$@" $BUCKET
    fi
}

# Upload every archive whose hash is not already recorded in hashes.txt. That
# covers changed repos, repos created since the last run, and the first run,
# when hashes.txt does not exist yet.
touch hashes.txt
for archive in *.git.tar.xz; do
    if ! grep -qxF -- "$(sha1sum "$archive")" hashes.txt; then
        upload "$archive"
    fi
done

# Write out the new file hashes for the next run.
sha1sum *.git.tar.xz >hashes.txt
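
The change detection in the script boils down to comparing each archive's current SHA-1 against a list saved by the previous run. Here is a small, self-contained sketch of that idea that you can try in a scratch directory. The helper name `changed_files` and all file names are made up for illustration. It looks up each file's `sha1sum` line in the saved list, which is one way to do the comparison and also flags files the list has never seen.

```shell
#!/bin/bash
# changed_files HASHFILE FILE... (hypothetical helper)
# Print the names of files whose SHA-1 is missing from, or different in,
# the saved hash list.
changed_files() {
    local hashfile=$1 f
    shift
    for f in "$@"; do
        # sha1sum prints "<hash>  <name>"; an exact fixed-string line match
        # means the file is unchanged since the list was written.
        if ! grep -qxF -- "$(sha1sum "$f")" "$hashfile" 2>/dev/null; then
            printf '%s\n' "$f"
        fi
    done
}

# Demo in a scratch directory.
demo=$(mktemp -d)
cd "$demo"
echo one >a.tar.xz
echo two >b.tar.xz
sha1sum a.tar.xz b.tar.xz >hashes.txt
echo changed >b.tar.xz   # modify a file that is in the list
echo three >c.tar.xz     # add a file the list has never seen
changed_files hashes.txt a.tar.xz b.tar.xz c.tar.xz
```

The demo prints `b.tar.xz` and `c.tar.xz`, the modified file and the new one, while the untouched `a.tar.xz` is skipped.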

You could just as easily use S3 (or something else) to store the backups. I find Google Cloud Storage convenient because I already use GCE to host my blog, and there's a flag you can set to give GCE instances magic read/write access to Cloud Storage without having to deal with credentials files. Google currently has a free tier that lets you store up to 5GB, which is far more than I need for my backups. Right now I am literally creating and storing these backups at no cost, but even if that weren't the case, storage would cost me less than a penny a month.
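
For the hourly cron job mentioned earlier, a crontab entry along these lines should do. The script path is a placeholder, since where you keep the script is up to you, and the entry is assumed to live in the crontab of a user that can read /home/git and run gsutil.

```
# m h dom mon dow  command
0 * * * * /path/to/git-backup.sh
```

Because the script runs with `set -x`, it writes a trace of every command to stderr, so you may want to redirect its output to a log file rather than have cron mail it to you every hour.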