Calculating Bitcoin Hash rate - KanoczTomas/bitcoin-learn GitHub Wiki

How to estimate Bitcoin Hashrate

I was trying to understand how the hash rate equation on the Bitcoin wiki works, and it took me some time to grasp the math behind it. This write-up follows my journey to cracking that math. I hope it explains how the hash rate is estimated and helps others understand Bitcoin better.

Let's dive in!
Well, not so fast. First we have to explain some terms ...

The Target, Diff1, Probability of finding a Diff1 block

I assume the reader knows how Bitcoin mining works; if not, a very nice demo can be found at https://anders.com/blockchain/. A block is valid only if its hash is smaller than the Target. The Target is a 256-bit number. The largest target (often called the Diff 1 Target), which is the easiest one to meet, is equal to: 0x00000000FFFF0000000000000000000000000000000000000000000000000000. It was arbitrarily picked by Satoshi.

It means the range of valid block hashes is from (*) 0x00000000FFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF to 0x0000000000000000000000000000000000000000000000000000000000000000

Let's find the probability that a hash falls in the range shown above. The total number of possible SHA-256 hashes is 2^256. If we look closer at the valid range, we see that its values share a common part: 32 bits of zeros at the beginning, or 8 hexadecimal digits. So valid hashes have 32 fewer bits that can vary, leaving us with 224 bits to "play with". The number of valid hashes is thus slightly less than 2^224 (exactly 0xFFFF * 2^208 = 2^224 - 2^208). See that E hiding in the first range (marked with *)? That is where the small shortfall comes from. It amounts to only about 1 part in 2^16, so we can safely omit it, and I will refer to the number of valid hashes lower than the Diff 1 Target as 2^224 to make the math easier. We finally get to the probability, which is 2^224/2^256. It leaves us with 1/2^32.
Let's write it like this:

p(diff1) = 1/2^32
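
To see how little the rounding to 2^224 matters, here is a minimal Python sketch that compares the exact probability with 1/2^32. It assumes, as the text does, that a valid hash is strictly smaller than the Diff 1 Target; the variable names are my own.

```python
from fractions import Fraction

# Diff 1 Target: 0x00000000FFFF followed by 52 hexadecimal zeros.
DIFF1_TARGET = 0xFFFF << 208

# Hashes strictly below the target are 0 .. DIFF1_TARGET - 1,
# so there are exactly DIFF1_TARGET of them.
valid_hashes = DIFF1_TARGET
total_hashes = 2**256

p_exact = Fraction(valid_hashes, total_hashes)
p_approx = Fraction(1, 2**32)

# How far off is the rounded value, relative to itself?
relative_error = (p_approx - p_exact) / p_approx

print("exact  p(diff1):", float(p_exact))
print("approx p(diff1):", float(p_approx))
print("relative error :", float(relative_error))
```

The relative error comes out as 1/2^16, which is why treating the count as 2^224 is harmless for an estimate.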

The Difficulty

If we look at the Target, it is a huge number, something our mind has trouble working with. To make it easy to compare whether finding a block is harder or easier, the Difficulty was defined. We know that if we lower the Target, we make the interval of valid hashes smaller, so the problem of finding a block becomes harder. If we define Difficulty as how many times harder it is to find a valid block, then we can calculate it by comparing the new (harder) Target to the Diff1 Target (the easiest possible). For a block with target Target we get:

Diff = Diff1 Target/Target

By definition we see that the smallest Difficulty is 1, when the Target is equal to Diff1 Target.

Can we calculate the probability of finding a block with a different target than Diff1? Of course we can! We know that the Difficulty means it is n times harder to find a valid block, so the probability p(diff1) has to be divided (made smaller by a factor of) n. We get the probability of finding a block with Difficulty diffX:

p(diffX) = p(diff1)/diffX 
p(diffX) = (1/2^32)/diffX = 1/(2^32*diffX)
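
The two formulas above can be sketched in Python as follows. The function names and the example target are hypothetical, chosen only for illustration, and p(diff1) is taken as the rounded 1/2^32.

```python
# Diff 1 Target: 0x00000000FFFF followed by 52 hexadecimal zeros.
DIFF1_TARGET = 0xFFFF << 208


def difficulty(target: int) -> float:
    """Diff = Diff1 Target / Target."""
    return DIFF1_TARGET / target


def block_probability(diff: float) -> float:
    """p(diffX) = 1 / (2^32 * diffX): chance that one hash is a valid block."""
    return 1 / (2**32 * diff)


# Hypothetical target, a quarter of the Diff 1 Target (four times harder).
example_target = DIFF1_TARGET // 4

diff = difficulty(example_target)
print("difficulty :", diff)
print("p(diffX)   :", block_probability(diff))
```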

Retarget interval and Difficulty

We know that the Target is recomputed every 2016 blocks. The goal of the Bitcoin network is to create blocks every 10 minutes on average. When we look at the previous 2016 blocks and the time it took to generate them, we get an idea of how well we are keeping to the goal of 1 block per 10 minutes. When the network finds blocks more rapidly than every 10 minutes, we need to make finding a block harder. In the case of slower blocks, we need to make finding a block easier. When we increase the Target, we decrease the Difficulty, and vice versa. Here is how the Bitcoin network finds the new target, to keep blocks coming at 1 block per 10 minutes (source):

New Target = Old Target * (Actual Time of Last 2016 Blocks / 20160 minutes)
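
As a rough illustration, the retarget formula can be written in Python like this. It is only a sketch of the formula as stated here: real nodes apply further consensus rules (for example, bounding how far a single adjustment can move the Target), which are left out. The function name and the example target are hypothetical.

```python
EXPECTED_MINUTES = 2016 * 10  # 20160 minutes, i.e. two weeks


def new_target(old_target: int, actual_minutes: int) -> int:
    """New Target = Old Target * (actual time of last 2016 blocks / 20160 minutes).

    Multiplying before dividing keeps the arithmetic in exact integers.
    """
    return old_target * actual_minutes // EXPECTED_MINUTES


# Hypothetical old target, used only to show the direction of the change.
old = 0xFFFF << 200

faster = new_target(old, 10080)   # blocks took one week: target halves
slower = new_target(old, 40320)   # blocks took four weeks: target doubles

print(hex(faster))
print(hex(slower))
```

Blocks arriving too quickly shrink the Target (harder), and blocks arriving too slowly grow it (easier).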

Finding the Hash rate

I was not able to get the formula directly, so here is how I started solving this problem. Let's create an easier problem. There is a person flipping coins. He told us that he flipped 300 tails in an hour (he did not want to give us the full answer just like that ...). We want to get the frequency (flips/second), that is, how quickly he is flipping those coins. The probability of the coin landing on tails is 50% (0.5), and we know there were 300 tails. We can get the number of flips (N) as follows:

N = 300 * 1/0.5 = 300 * 2 = 600

We have 600 coin flips in an hour, and we want the number of coins flipped per second on average. An hour has 3600 seconds, so we divide the number of coin flips by the time it took (in seconds) to get the frequency f.

f = 600/3600 = 0.1666... flips per second

With me?
Ok, how about something a little bit more difficult. We are in a strange lottery, where we are picking from 4 letters, ABCD. All letters are equally probable, so the probability of picking letter A for example is 1/4. We know that this round of the lottery had 100 D letters picked in 10 minutes. We want to find the frequency.
First let's find the number of total letters picked on average. On average every 4th letter is D, we have 100 of them, so the total number of letters must be 4 times as many on average. Total number of draws N is:

N = 100 * 1/(1/4) = 100*4 = 400

10 minutes are 600 seconds, the frequency is thus:

f = N / 600 = 400/600 = 0.666...

Do you see the pattern there? These problems are all similar: the coin sides, the letters, and the block hashes in Bitcoin are all equally probable outcomes. In that case the general formula for the frequency (coins/second, letters/second, hashes/second) f is:

f = (A * (1/p))/time in seconds

where A is the number of samples observed (the number of tails for the coins, the number of Ds for the letters) and p is the probability of picking such a sample.
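
The general formula is short enough to check against both examples in Python. This is a minimal sketch; the function and parameter names are my own.

```python
def frequency(samples: float, probability: float, seconds: float) -> float:
    """f = (A * (1/p)) / time in seconds."""
    return (samples * (1 / probability)) / seconds


# Coin example: 300 tails, p = 0.5, one hour.
coin_rate = frequency(300, 0.5, 3600)

# Letter example: 100 Ds, p = 1/4, ten minutes.
letter_rate = frequency(100, 1 / 4, 600)

print("flips per second  :", coin_rate)
print("letters per second:", letter_rate)
```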

Putting it all together

In our case for finding the hash rate of the Bitcoin network, knowing we find 1 block in 10 minutes on average, our A is 1 (we find 1 block in 10 minutes). The probability of finding a block with the current difficulty is p(diffX) (look above how to compute it). We know that the difficulty gets adjusted every 2016 blocks, which in case of 10 minute blocks is exactly 2 weeks. Of course we know that a lot can change in 2 weeks, miners come and go, hash rate increases and decreases. The difficulty is a good base, but it tells us only what happened in the last re-target interval. We will get back to this problem, but first let's compute the Hash rate in an ideal world, where the network finds exactly 1 block in 10 minutes. The frequency f (or Hash rate) is:

f = (1/p(diffX))/10 minutes in seconds
f = (1/p(diffX))/600
f = (2^32*diffX)/600

For Diff1 the average hash rate is 7.158 million hashes per second, or 7.158 MH/s.

The Difficulty tells us what happened in the previous re-target period. To approximate the hash rate after a re-target, the rate of block creation over the past day or three is usually used, as it gives a smoother approximation. We can calculate the "correction factor" CR from the last day as follows:
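
The Diff1 figure can be reproduced with a few lines of Python. This is a sketch of the ideal-world formula only, with a function name of my own choosing.

```python
def ideal_hash_rate(diff: float, block_seconds: float = 600) -> float:
    """f = (2^32 * diffX) / 600, assuming exactly one block per 10 minutes."""
    return (2**32 * diff) / block_seconds


rate_diff1 = ideal_hash_rate(1)

# Convert hashes per second to millions of hashes per second.
print(f"{rate_diff1 / 1e6:.3f} MH/s")
```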

CR = number of blocks found in last 24 hours/144

With 1 block found every 10 minutes we get 6 blocks per hour and 144 in 24 hours.

Using the "correction factor" CR (you can use more days, or a different time window) we get the final formula for estimating the hash rate (f) at the current difficulty (diffX):

f = (CR * 2^32 * diffX)/600
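
Putting the correction factor and the ideal formula together, a minimal Python sketch could look like this. The function name and the input numbers are hypothetical, not real network data.

```python
EXPECTED_BLOCKS_PER_DAY = 144  # 6 blocks per hour * 24 hours


def estimated_hash_rate(diff: float, blocks_last_24h: int) -> float:
    """f = (CR * 2^32 * diffX) / 600, with CR = blocks in last 24 hours / 144."""
    cr = blocks_last_24h / EXPECTED_BLOCKS_PER_DAY
    return (cr * 2**32 * diff) / 600


# Hypothetical inputs: difficulty 1,000,000 and 150 blocks found in the last day.
estimate = estimated_hash_rate(1_000_000, 150)
print(f"{estimate / 1e12:.3f} TH/s")
```

When exactly 144 blocks were found, CR is 1 and the estimate falls back to the ideal formula.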

Q.E.D