Demystifying The Selfish Mining Bet

Recently Craig Wright and Peter Rizun made a small bet, 1 Bitcoin (currently about $2700), regarding the solution to following problem:

fallacy

In this post I’m going to explain who win’s the bet, and why. Since this post is more interesting if you think about the answer first, I’m going to clarify some points about the problem. This will let you think it through before reading on to the solution. Feel free to skip the background if you already have a solid grasp of how Bitcoin mining works. Either way, try to solve the problem yourself before reading the solution: it’s much more fun that way!

Requisite Background

In Bitcoin new blocks are “mined” by solving a hard cryptographic puzzle. If you solve the puzzle, you are rewarded with newly issued Bitcoin. There are many people around the world constantly racing each other to mine the next block. When a new block is mined, it gets broadcast to the rest of the network (usually within a few seconds), and then the puzzle resets and everyone starts working from scratch on a slightly different variation of the puzzle.

The difficulty of mining a block is self-adjusting, so that new blocks are found approximately once every ten minutes. If more miners join the network, then the difficulty will self-adjust to become slightly harder; and if miners leave the network, the puzzle will self-adjust to get slightly easier. The result is that the expected time for a new block is always ten minutes.

The puzzle being solved during mining involves finding a block whose cryptographic hash has a certain property. There’s effectively an infinite number of possible next blocks, and miners act independently. The amount of blocks you are expected to mine is proportional to your total hashing power compared to the rest of the network. If your hashing power is a proportion $\alpha$ such that $0 < \alpha \leq 1$, then the expected time in minutes time to a new block is:

$$ T_{expected} = \frac{10}{\alpha} $$

In the puzzle, the selfish miner has $\alpha = \frac{1}{3}$. This gives the selfish miner an expected time of 30 minutes per new block, and the rest of the network an expected time of 15 minutes per new block. The selfish miner problem comes up in the context of a type of attack miners can launch against the network (see the postscript for details). However, for the purpose of understanding this bet, you can ignore the question of why a selfish miner might exist.

The Solution

The first step to solving this problem is observing that the actions of the selfish miner and the actions of the honest miners are totally independent. Therefore, we should ignore the selfish miner completely.

The honest miners have $\frac{2}{3}$ of the hashing power, so the expected time for them to mine a new block is 15 minutes. The honest miners start working on mining a new block, and after 10 minutes they still haven’t found a block. Is the expected time for their next block now 5 minutes, or is it still 15 minutes?

This problem is well known, and is called the gambler’s fallacy. You may have heard of it before. Here’s another way of posing the problem, that you may be more familiar with. You’re flipping a fair coin, and want it to land heads up. You flip the coin ten times, and due to extreme poor luck it lands tails up each time. On the eleventh flip, what’s the probability of flipping a heads? Since the coin flips are independent events, the chance of flipping a heads on the eleventh try is still $\frac{1}{2}$.

Let’s look a slightly more complicated problem: drawing cards from a deck until an ace is drawn. In the independent version of the problem, after each card is drawn it will be put back into the deck and the deck will be reshuffled. In this version, each time we draw there’s a $\frac{1}{13}$ chance of drawing an ace. Therefore, the probability that it will take $N$ draws to get the first ace is:

$$ P_{independent} = 1 - \prod_{n=1}^N \frac{12}{13} = 1 - \left(\frac{12}{13}\right)^N $$

You can write a short loop in your favorite scripting language to show that $P = \frac{1}{2}$ when $8 < N < 9$. More precisely, if you take the logarithm of both sides and solve for $N$ you’ll find that when $P = \frac{1}{2}$ the real-valued solution is:

$$ N = \log_{\frac{12}{13}} \frac{1}{2} = \frac{\log 2}{\log 13 - 2 \log 2 - \log 3} \approx 8.6597 $$

The dependent version of the problem would be where we cards are discarded after being drawn. So on the first draw, the deck has 52 cards and 4 aces, and thus there’s a $\frac{1}{13}$ chance of drawing an ace. On the second draw there are 51 cards in the deck, so the second draw has a chance of $\frac{4}{47}$ of being an ace. We can generalize this, so that the probability it takes $N$ draws to get the first ace is:

$$ P_{dependent} = 1 - \prod_{n=1}^{N} \frac{48-n+1}{52-n+1} $$

This product series can’t be solved analytically the same was as the dependent case. However, note that this equation approaches the solution to the dependent problem if we add more decks of cards. If we use a shoe of $D$ decks, then the series becomes:

$$ P_{dependent} = 1 - \prod_{n=1}^{N} \frac{48D-n+1}{52D-n+1} $$

When $D$ is large, this approaches the independent case. In the limit we can reduce the fraction as:

$$ \lim_{D \rightarrow \infty} \prod_{n=1}^{N} \frac{48D-n+1}{52D-n+1} = \left(\frac{12}{13}\right)^N $$

So what does this have to do with Bitcoin mining? Bitcoin mining is a lot like flipping a coin, or independently drawing from a deck of cards. There’s effectively an infinite number of new blocks that can be mined, because the miners control many parameters that go into the cryptographic hashing function: the actual transactions in the block, the order of those transactions, a 32-bit nonce value, a timestamp (which is allowed to have some skew), and a 256-bit public key which encodes an address that pays the miner. Thus mining approximates independent trials of a random process, to a very high degree.

To be completely pedantic: mining is not truly independent. If you try to mine a particular block, and that block fails, you’ll update the nonce (or some other field in the block) and never retry that exact block again. But if we actually tried to take this into account, the terms in the numerator and denominator of the fraction would be so large that it would be inconsequential to try to compute the dependence effect.The cryptographic hash used in the Bitcoin proof-of-work function is 256 bits wide, and $2^{256} \approx 10^{77}$. It’s as if you’re solving the dependent card drawing problem using a shoe with an astronomically large $D$ value.

Back to our problem: after ten minutes, the honest miners still haven’t found a block. Because mining is a random process, they haven’t actually made any progress! The expected time for the next block is the same as it was when they started: 15 minutes. If you think that they’ve somehow “used up” 10 of the 15 minutes, you’re falling for the gambler’s fallacy. Here’s another way of looking at it, which hopefully makes it even more intuitive. Let’s say that even though the expected time for a new block is 15 minutes, somehow the miners get unlucky, and they’ve been mining for 20 minutes without finding a new block. What’s the expected time until they find the next block? The logic that would tell you that Craig Wright wins the bet is the same logic that would tell you their next block should be mined in negative 5 minutes, even though the solution must be a positive number!

Peter Rizun wins the bet. Bitcoin mining is effectively an independent, random process. Miners who control $\frac{2}{3}$ of the hashing power always have an expected time of 15 minutes until they find a new block, regardless of how long they’ve already been working on that block.

Postscript

This problem originally arose in the context the paper Majority is not Enough: Bitcoin Mining is Vulnerable which describes an attack where a selfish miner can bet more than their expected share of mined blocks. The paper is well researched, and has a lot of math justifying their results. The problem that Craig Wright and Peter Rizun bet on here is kind of a Fisher-Price version of the more complex selfish mining problem.

The actual selfish mining attack works like this. Let’s say that again the selfish miner has $\alpha = \frac{1}{3}$. At time $t = 0$ block $N$ is mined, and everyone starts working on mining block $N + 1$. By pure luck, the selfish miner happens to mine the next block very quickly. Suppose they mine it at $t = 1$, where $t$ is measured in minutes. Normally a miner would immediately announce the new block to the network, and collect the mining reward. The selfish miner could gamble however, and not immediately announce the $N + 1$ block. Instead, they keep it secret, and start building off it to mine block $N + 2$. The idea is that they’ll wait until some future time, say $t = 5$, and then announce the $N + 1$ block.

If the gamble pays off, the rest of the network will have wasted the last 4 minutes working on block $N + 1$ which had already been mined by the selfish miner. Effectively, the selfish miner has a 4 minute head start on mining block $N + 2$. The gamble doesn’t always pay off: if the selfish miner waits too long another miner will find another $N + 1$ block before the selfish miner announces their block. In this case, the selfish miner has lost out on their mining reward.

The paper goes on to show that there’s a strategy that allows miners to improve their mining rewards. The strategy is proven both analytically, and via a Monte Carlo simulation. Craig Wright recently put up a rebuttal to the paper, titled The Fallacy of Selfish Mining in Bitcoin: A Mathematical Critique. I don’t have access to the paper, the smart money is on Craig being wrong.

Craig Wright has previously claimed to be Satoshi Nakamoto. There are about zero people who still believe this claim, but in case you had any doubts, you can lay them to rest now. Craig may be very smart, but he’s not the math savant he claims to be.