Blockchain, Bitcoin, and Ethereum - 180D-FW-2023/Knowledge-Base-Wiki GitHub Wiki

Nick Brandis

Blockchain, Bitcoin, and Ethereum

In this article, we will begin by developing an understanding of the blockchain data structure, then introduce two blockchain-based applications, Bitcoin and Ethereum.

Blockchain Overview

Put simply, a blockchain is a data structure that consists of a list of blocks. It is like a linked list except the nodes are immutable, meaning you can't change the contents of a node without creating an entirely new blockchain.

How does it ensure the nodes are immutable? It does this by having each block contain the hash of the block before it. We will talk about hashes later, but for now just know that in this context, it is a value that depends on the contents of the block. So if the data is changed in some node n, the hash of node n would change. Since node n+1 contains the hash of node n, and node n+2 contains the hash of node n+1 (and so on), this new hash would require recalculating the hashes of all nodes after, and including, node n, which would create a whole new blockchain.
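The chaining described above can be sketched in a few lines of Python. This is a minimal illustration, not how any real blockchain serializes its blocks: the block layout (a dict with `data` and `prev_hash`), the JSON encoding, and the helper names are all assumptions made for the example.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's contents, which include the previous block's hash."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def build_chain(ledgers: list) -> list:
    """Create one block per ledger entry, each storing the previous block's hash."""
    chain = []
    prev = "0" * 64  # placeholder value for the first block, which has no predecessor
    for data in ledgers:
        block = {"data": data, "prev_hash": prev}
        chain.append(block)
        prev = block_hash(block)
    return chain


def is_consistent(chain: list) -> bool:
    """Check that every block stores the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = build_chain(["A pays B 5", "B pays C 2", "C pays A 1"])
print(is_consistent(chain))        # True
chain[0]["data"] = "A pays B 500"  # tamper with an early block
print(is_consistent(chain))        # False
```

Changing block 0 changes its hash, so the value stored in block 1 no longer matches. Repairing that would change block 1's hash in turn, and so on down the chain.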

Each block contains data that represents a ledger, or a list of transactions, and the entire blockchain is decentralized, i.e. distributed across a network of computers. We will get into why this is necessary later.

alt1

There are other components within a block such as the nonce and Merkle root hash, but I will get into those later. At a high level, the goal is for the blockchain to show a decentralized, immutable history of all valid transactions that have occurred, so that if a transaction is on the blockchain, we can be confident that it actually took place.

The blockchain may seem like an overcomplicated way of representing transaction history. For instance, it may seem unnecessary to have a list of blocks rather than one large block. But multiple blocks are important for organization, and help limit the computation required for the consensus mechanism, which I will get to later.

The use of a list of blocks is also important for expressing an agreed-upon ordering, since a transactional history is meaningless without a sense of time and order. In addition, because each block contains the hash of the previous block, altering one block on the blockchain means every block after it must also be altered. This makes it difficult to change data on the blockchain without anyone noticing.

So how can we be confident that all transactions on the blockchain are trustworthy? Well, the first reason has to do with digital signatures. Say Alice wants to send a transaction to Bob. In order for Bob to know that Alice was the true creator of the transaction, Alice must “sign” a transaction with her private cryptographic key. Then Bob can use Alice’s public key to verify the transaction.
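To make the sign-then-verify idea concrete, here is a toy sketch using textbook RSA with tiny primes. It is only an illustration of the private-key/public-key relationship: the key values and function names are assumptions for this example, the scheme is completely insecure at this size, and Bitcoin itself uses a different signature algorithm (ECDSA over the secp256k1 curve).

```python
import hashlib

# Toy RSA key pair built from the tiny primes 61 and 53 (insecure by design).
N = 61 * 53  # modulus, part of both keys
E = 17       # Alice's public exponent
D = 2753     # Alice's private exponent, chosen so that (E * D) % 3120 == 1


def digest(message: str) -> int:
    """Hash the message and reduce it into the range the toy key can handle."""
    h = hashlib.sha256(message.encode()).digest()
    return int.from_bytes(h, "big") % N


def sign(message: str) -> int:
    """Alice signs with her private exponent."""
    return pow(digest(message), D, N)


def verify(message: str, signature: int) -> bool:
    """Anyone can check the signature with Alice's public exponent."""
    return pow(signature, E, N) == digest(message)


tx = "Alice pays Bob 1 coin"
sig = sign(tx)
print(verify(tx, sig))            # True
print(verify(tx, (sig + 1) % N))  # False: a forged signature does not verify
```

Only the holder of `D` can produce a value that passes `verify`, while checking it needs nothing but the public values `N` and `E`.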

Since the blockchain is run on a network of computers, any node can use Alice’s public key to verify that her transaction was intentional. So every transaction that isn’t rejected by the network is legitimate, right?

Actually, there is another issue. As an example, say Alice tries to send the same piece of currency to Bob and Charlie simultaneously. The nodes will verify her digital signature, but Alice has gotten away with sending the same piece of currency twice. This is called the double spending problem.

To address this, nodes must do additional verification on transactions before including them in blocks on the blockchain. Nodes must verify that the sender meant to create the transaction, that the sender has sufficient funds, and that the funds haven’t already been spent. Therefore, if Alice double spends, only one transaction will be deemed valid.
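The three checks can be sketched as follows. This is a simplified model with assumed names: each coin has an ID, `owner_of` maps coins to their current owners, and the `signed_by` field stands in for a real digital signature check.

```python
def is_valid(tx: dict, owner_of: dict, spent: set) -> bool:
    """Apply the three checks a node performs on a single transaction."""
    if tx["signed_by"] != tx["sender"]:           # stand-in for verifying the signature
        return False
    if owner_of.get(tx["coin"]) != tx["sender"]:  # sender must actually own the coin
        return False
    if tx["coin"] in spent:                       # coin must not have been spent already
        return False
    return True


def accept_transactions(txs: list, owner_of: dict) -> list:
    """Process transactions in order, rejecting any that fail a check."""
    spent = set()
    accepted = []
    for tx in txs:
        if is_valid(tx, owner_of, spent):
            spent.add(tx["coin"])
            accepted.append(tx)
    return accepted


owner_of = {"coin1": "Alice"}
to_bob = {"sender": "Alice", "recipient": "Bob", "coin": "coin1", "signed_by": "Alice"}
to_charlie = {"sender": "Alice", "recipient": "Charlie", "coin": "coin1", "signed_by": "Alice"}

accepted = accept_transactions([to_bob, to_charlie], owner_of)
print([tx["recipient"] for tx in accepted])  # ['Bob']
```

Which of the two transactions survives depends purely on the order in which this node saw them.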

There is one flaw with this reasoning, and that is that some nodes may deem the Alice-Bob transaction as valid, while others may deem the Alice-Charlie transaction as valid, when only one is truly valid. This is resolved with something called the Proof of Work mechanism.

Proof of Work and Bitcoin

The Proof of Work consensus mechanism determines when blocks are added to the blockchain. The thinking is that the majority of work that is being done to validate transactions is honest (i.e. not approving fake transactions), and so whoever controls the majority of computational resources shall control the blockchain.

Accordingly, Proof of Work ensures that it is computationally expensive to add blocks to the blockchain. (If it were computationally cheap to validate transactions, it would be easier for a hacker to gain a majority of computational resources in the network.) We create a subset of nodes called miners that are responsible for attempting to add blocks to the blockchain. Any node can become a miner, but in general not all nodes are miners. Additionally, a reward is created for successfully adding a block so that miners are encouraged to help validate transactions despite it being computationally expensive.

This is where the nonce, hash, and Merkle root come into play. A nonce is an arbitrary number that miners are free to vary; in Bitcoin it is a 32-bit integer. A hash is the output of a hash function like SHA-256, with the property that it is computationally infeasible to find an input to the hash function that outputs some specific hash value. A Merkle root is the root of a tree-like structure designed to make it more efficient to verify each transaction in a block. Each transaction gets hashed and paired with the hash of another transaction, each pair gets hashed together, and the resulting hashes are paired and hashed again, all the way up to the root hash. See figure 3 below for an example.
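The pairing-and-hashing procedure for the Merkle root can be sketched like this. It is a simplified version under stated assumptions: transactions are plain strings, hashes are combined as hex text, and an odd level is padded by repeating its last hash (Bitcoin follows the same padding convention but hashes raw bytes with double SHA-256).

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions: list) -> str:
    """Hash every transaction, then hash pairs level by level up to a single root."""
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [sha256_hex(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            sha256_hex((level[i] + level[i + 1]).encode())
            for i in range(0, len(level), 2)
        ]
    return level[0]


txs = ["Alice->Bob:1", "Bob->Charlie:2", "Charlie->Dave:3", "Dave->Alice:4"]
root = merkle_root(txs)
tampered = merkle_root(["Alice->Bob:100"] + txs[1:])
print(root != tampered)  # True: changing any transaction changes the root
```

Because the root depends on every transaction, a block only needs to store this one value to commit to its entire transaction list.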

To create a new block, miners try to guess different values of the nonce. The correct value will yield the “right hash” for the new block when hashed along with the transaction contents of the block and the hash of the previous block.

alt3

The “right hash” is determined by the blockchain rules, but one simple rule is that the hash must begin with a certain number of leading zeros. This number of leading zeros can be adjusted to increase or decrease the difficulty of the cryptographic puzzle: each additional zero makes a valid hash rarer. In this example, the miner finds that a nonce value of 2 yields a hash with 18 leading zeros.
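A brute-force search for such a nonce might look like the sketch below. The way the inputs are joined into one string, the function name `mine`, and counting zeros in hexadecimal digits are assumptions made for this example, not Bitcoin's actual block format.

```python
import hashlib


def mine(prev_hash: str, transactions: str, zeros: int, max_nonce: int = 2**32):
    """Try nonces in order until the block hash starts with `zeros` hex zeros."""
    prefix = "0" * zeros
    for nonce in range(max_nonce):
        payload = f"{prev_hash}{transactions}{nonce}".encode()
        block_hash = hashlib.sha256(payload).hexdigest()
        if block_hash.startswith(prefix):
            return nonce, block_hash
    return None  # no nonce in the range worked


result = mine(prev_hash="0" * 64, transactions="Alice->Bob:1", zeros=4)
if result is not None:
    nonce, block_hash = result
    print(nonce, block_hash)
```

Each extra hex zero multiplies the expected number of attempts by 16, which is how the difficulty of the puzzle is tuned. Checking a claimed solution, by contrast, takes a single hash.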

alt4

In Bitcoin, the Proof of Work requirement is similar, but miners must instead find a hash value that is less than or equal to a threshold called the difficulty target. The smaller the target, the more leading zeros a valid hash has and the harder the puzzle.
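The two formulations are closely related: requiring a number of leading zero bits is the same as comparing the hash, read as an integer, against a particular threshold. The sketch below shows this with a single SHA-256 over arbitrary bytes; the helper names are assumptions, and real Bitcoin applies SHA-256 twice to an 80-byte block header.

```python
import hashlib


def meets_target(block_bytes: bytes, target: int) -> bool:
    """Interpret the 256-bit hash as an integer and compare it with the target."""
    value = int.from_bytes(hashlib.sha256(block_bytes).digest(), "big")
    return value <= target


def target_for_zero_bits(k: int) -> int:
    """Largest 256-bit value whose first k bits are all zero."""
    return (1 << (256 - k)) - 1


# A hash with 4 leading zero bits is exactly one whose first hex digit is 0.
target = target_for_zero_bits(4)
for i in range(5):
    data = f"block-{i}".encode()
    print(hashlib.sha256(data).hexdigest()[:8], meets_target(data, target))
```

Using a numeric threshold rather than a count of zeros lets the difficulty be adjusted in much finer steps.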

It may not seem very computationally expensive to add a block if there are only 2^32 possible values for the nonce. However, the nonce is not the only input that can change: the contents of the block include a timestamp, and the miner can also change which transactions are included. If no nonce value works, the miner updates these fields and searches the nonce range again. So there is some luck involved in finding an input that results in a hash that meets the difficulty target requirement.

How do miners decide which transactions to include in blocks, though? Users involved in a transaction pay a small transaction fee that is rewarded to the miner who is able to solve the Proof of Work puzzle and add the block to the blockchain. Miners thus have an incentive to pick transactions with higher fees, which makes it more costly for attackers to try to get their fake transactions approved.
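One plausible selection strategy is to fill the block greedily with the transactions that pay the most per unit of space. This is only a sketch of that idea: the `fee` and `size` fields, the size limit, and the function name are assumptions for illustration, and real miners may use other strategies.

```python
def select_transactions(pending: list, max_size: int) -> list:
    """Greedily pick the transactions with the highest fee per unit of size."""
    chosen = []
    used = 0
    by_fee_rate = sorted(pending, key=lambda tx: tx["fee"] / tx["size"], reverse=True)
    for tx in by_fee_rate:
        if used + tx["size"] <= max_size:
            chosen.append(tx)
            used += tx["size"]
    return chosen


pending = [
    {"id": "t1", "fee": 10, "size": 100},
    {"id": "t2", "fee": 50, "size": 200},
    {"id": "t3", "fee": 5, "size": 100},
]
block = select_transactions(pending, max_size=300)
print([tx["id"] for tx in block])  # ['t2', 't1']
```

The low-fee transaction `t3` is left waiting for a later block, which is why users who want fast confirmation offer higher fees.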

Revisiting the double spending problem, if the Alice-Bob and Alice-Charlie transactions are broadcast to miners at roughly the same time, which one will miners include in their blocks? There is no guarantee that all miners will include the same transaction; some may choose Alice-Bob and some Alice-Charlie, depending on the miners’ acceptance strategies (first come first served, randomness, etc).

Say a majority of computational power consists of miners who choose the Alice-Bob transaction. Then that transaction is considered the valid one by the network, and it is more likely that a miner in the Alice-Bob camp solves the mathematical puzzle first.

However, say a miner in the Alice-Charlie camp solves the puzzle first. The block is then broadcast to all nodes, which accept a block only if all of its transactions are valid and not already spent as determined from their copy of the blockchain. Since neither transaction is on the blockchain yet, the Alice-Charlie block passes this check, and once it is accepted the Alice-Bob transaction spends funds that are already spent, so nodes discard it. If a block containing the Alice-Bob transaction is mined at roughly the same time, the blockchain briefly forks, and in the long run the fork backed by the majority of computational power wins. Either way, only one of the two transactions earns a spot on the blockchain.

One last scenario is if two blocks get added to the blockchain at roughly the same time, not necessarily with conflicting transaction information.

alt5a

Then it is no longer clear in what order transactions have occurred. This is handled by a simple rule that miners will only extend the longer fork. For example, if some miners receive Block A first and others receive Block B first, we will have two groups of miners working on different paths of the blockchain. If the Block B group is first to mine the next block, they likely have the majority of computational power in the network, and so miners in the Block A group will switch to working on the other fork.
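The fork-choice rule can be sketched in one function. Here each fork is modeled as a list of block names, which is an assumption for the example; Bitcoin in practice compares the total work behind each fork rather than the raw block count, though the two usually agree.

```python
def choose_fork(forks: list) -> list:
    """Return the longest fork; on a tie, keep the one that was seen first."""
    return max(forks, key=len)


fork_a = ["genesis", "A"]
fork_b = ["genesis", "B", "B2"]  # the Block B group mined the next block first

print(choose_fork([fork_a, fork_b]))  # ['genesis', 'B', 'B2']
```

While both forks have the same length, a miner keeps working on whichever one it received first. As soon as one fork pulls ahead, everyone converges on it and the transactions in the abandoned block return to the pool of pending transactions.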

alt5b

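An attacker attempting to alter the path of the blockchain would need to outpace the honest miners’ combined computational power, which means controlling more than half of the computational power in the network. For a large network, this is prohibitively expensive.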

We now have a good understanding of Bitcoin and the underlying blockchain technology, and can finally proceed to talk about another blockchain-based technology: Ethereum.

Ethereum Basics and Applications

Ethereum is a decentralized blockchain network serving as a foundation for smart contract applications. We now understand what the decentralized blockchain network portion means, and can proceed to talk about smart contracts.

A smart contract is essentially a piece of code that runs on a blockchain. It is similar to real-world contracts in that it is a grouping of logic and rules that two parties agree to. But the key difference is that smart contracts remove the trusted third party needed to mediate real-world contracts.

alt6

In this example, Bob and John agree to a smart contract that John will pay Bob for his house. The smart contract replaces a trusted third party to verify that John sends a sufficient payment and Bob transfers ownership of his house before either side gets their end of the deal. The smart contract handles receiving and distributing the assets according to the logic of the code.
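The logic of such a contract can be sketched as a small Python class. This is a plain simulation of the escrow rules, not Solidity and not code that runs on Ethereum; the class name, method names, and the way ownership is tracked are assumptions made for the example.

```python
class HouseSaleEscrow:
    """Holds both sides of the deal and swaps them only when both have arrived."""

    def __init__(self, seller: str, buyer: str, price: int):
        self.seller = seller
        self.buyer = buyer
        self.price = price
        self.paid = 0
        self.deed_deposited = False
        self.settled = False
        self.house_owner = seller
        self.seller_balance = 0

    def deposit_payment(self, sender: str, amount: int) -> bool:
        if sender != self.buyer:
            raise PermissionError("only the buyer can pay")
        self.paid += amount
        return self._try_settle()

    def deposit_deed(self, sender: str) -> bool:
        if sender != self.seller:
            raise PermissionError("only the seller can deposit the deed")
        self.deed_deposited = True
        return self._try_settle()

    def _try_settle(self) -> bool:
        """Release the assets only if the deed is in and the price is covered."""
        if not self.settled and self.deed_deposited and self.paid >= self.price:
            self.settled = True
            self.house_owner = self.buyer
            self.seller_balance = self.paid
        return self.settled


deal = HouseSaleEscrow(seller="Bob", buyer="John", price=100)
print(deal.deposit_payment("John", 100))  # False: the deed has not been deposited yet
print(deal.deposit_deed("Bob"))           # True: both sides delivered, so the swap happens
print(deal.house_owner)                   # John
```

Neither party has to trust the other, only the code, which every node can inspect and run.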

Full nodes in the Ethereum network keep a local copy of the blockchain, which includes all transactions as well as the code behind smart contracts. This means that smart contract code cannot be changed once deployed, and that each node can execute smart contracts independently and check that its results agree with the rest of the network.

But how does the smart contract code actually run? To allow contracts to run consistently on any system, every node runs the Ethereum Virtual Machine (EVM), on which smart contracts execute. Smart contracts are commonly written in an object-oriented programming language called Solidity, which gets compiled into bytecode that can run on the EVM.

alt7

One problem you might be thinking of is if a smart contract with a bug is stored on the blockchain. This is entirely possible, and is a major issue that Ethereum faces. Since smart contract code is immutable once on the blockchain, it can be difficult to prevent all execution of this code. There are some mechanisms like emergency stop functions and hard forks that address this problem, but I won't get into the details in this article.

That being said, some problems like infinite loops and inefficient code can be contained using something called "gas". To prevent abuse of the Ethereum network, every transaction comes with a limited amount of gas, which gets consumed as smart contracts execute operations. If the gas runs out before the smart contract finishes execution, it triggers an error and any changes that were made are reversed. Users are required to pay for this gas to keep the allocation of computational resources efficient (more computationally expensive tasks cost more). This is similar to the concept of transaction fees in Bitcoin, and discourages inefficient or malicious code from clogging the network since it costs the user.
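The metering and rollback behavior can be sketched with a tiny interpreter. The operation format (a gas cost paired with a function), the flat costs, and the function name are assumptions for this example; the EVM assigns a specific cost to each of its opcodes.

```python
def run_with_gas(operations: list, state: dict, gas_limit: int):
    """Run (cost, function) pairs; return (state, gas_used, success)."""
    working = dict(state)  # work on a copy so changes can be thrown away
    gas_left = gas_limit
    for cost, op in operations:
        if cost > gas_left:
            # Out of gas: discard every change, but the gas is still consumed.
            return dict(state), gas_limit, False
        gas_left -= cost
        op(working)
    return working, gas_limit - gas_left, True


def increment(s: dict) -> None:
    s["counter"] += 1


ops = [(5, increment)] * 3  # three operations costing 5 gas each
print(run_with_gas(ops, {"counter": 0}, gas_limit=15))  # ({'counter': 3}, 15, True)
print(run_with_gas(ops, {"counter": 0}, gas_limit=12))  # ({'counter': 0}, 12, False)
```

In the second call the first two increments did run, but the caller never sees them. An infinite loop behaves the same way: it runs until the gas is gone, the changes are reverted, and the sender still pays for the gas used.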

Other than smart contracts, there are a couple of other differences between Bitcoin and Ethereum. For one, Ethereum is more than just a digital currency; it is a programmable financial system, and the currency in the Ethereum system is known as Ether (ETH). Ether serves as a store of value (like Bitcoin), but it is also used to pay gas fees, to reward validators, and as the stake in Proof of Stake (see below).

Second, the block generation time is much faster on Ethereum. Mining Bitcoin is computationally expensive and the difficulty target is adjusted so the block generation time averages around 10 minutes. In contrast, Ethereum’s PoW puzzle was adjusted so block generation occurred about every 15 seconds. The downside of faster blocks is that forking and the double spending problem are more prevalent, but there is research going into methods to combat this.
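For Bitcoin, the adjustment that keeps the average near 10 minutes happens once every 2016 blocks: the target is scaled by how long those blocks actually took compared with the expected two weeks, with the change limited to a factor of 4 in either direction. A simplified sketch (the function and constant names are assumptions, and details such as the maximum allowed target are left out):

```python
EXPECTED_TIMESPAN = 2016 * 10 * 60  # seconds for 2016 blocks at 10 minutes each


def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by actual/expected time, limited to a factor of 4."""
    clamped = min(max(actual_timespan, EXPECTED_TIMESPAN // 4), EXPECTED_TIMESPAN * 4)
    return old_target * clamped // EXPECTED_TIMESPAN


# Blocks arrived twice as fast as intended, so the target is halved,
# which makes the puzzle twice as hard.
print(retarget(1000, EXPECTED_TIMESPAN // 2))  # 500
```

A lower target means fewer hashes qualify, so block generation slows back toward the intended rate. If miners leave and blocks arrive too slowly, the target rises instead.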

Lastly, Ethereum has moved from the PoW consensus mechanism to Proof of Stake (PoS); the switch, known as the Merge, was completed in September 2022. One downside of PoW is scalability, since it has a slow block generation time and requires significant computational power (the latter of which has a non-negligible effect on greenhouse gas emissions). PoS aims to address this while maintaining the security and decentralization of PoW.

The core idea of PoS is that the majority of ownership of the network’s token is trustworthy. Thus block validators are chosen according to how much stake they have in the network’s token, which is Ether in Ethereum’s case.
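The idea of choosing validators in proportion to stake can be sketched as a weighted random draw. This is a generic illustration and not Ethereum's actual selection procedure; the validator names, stake amounts, and function name are made up for the example.

```python
import random


def choose_validator(stakes: dict, rng: random.Random) -> str:
    """Pick one validator at random, weighted by the size of its stake."""
    names = list(stakes)
    weights = [stakes[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


stakes = {"validator_a": 90, "validator_b": 10}  # hypothetical stake amounts
rng = random.Random(2023)  # fixed seed so runs are repeatable
picks = [choose_validator(stakes, rng) for _ in range(10_000)]
print(picks.count("validator_a") / len(picks))  # close to 0.9
```

Where PoW gives influence in proportion to computational power, PoS gives it in proportion to the tokens a participant has locked up.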

If a validator engages in fraudulent behavior (e.g. producing or approving fake blocks), part or all of their stake is destroyed, which is known as slashing. There are a lot of specifics on conditions for when a stake should be slashed, but I will not get into them in this article. The key is that the more stake someone has in the Ethereum network, the more they stand to lose by misbehaving, and so the more reason they have to act honestly.

In conclusion, Ethereum is a decentralized blockchain-based system that provides a basis for smart contract applications. It is more scalable than Bitcoin due to its faster block generation time, but suffers from more security and forking issues. It has also replaced the PoW mechanism with PoS.

Sources:

"Bitcoin: A Peer-to-Peer Electronic Cash System" by Satoshi Nakamoto

"Research and Application of Smart Contract Based on Ethereum Blockchain" by Yuxin Huang et al., 2021

https://link.springer.com/article/10.1007/s12599-017-0467-3

https://vitalflux.com/blockchain-linked-list-like-data-structure/

https://en.wikipedia.org/wiki/Digital_signature

https://vitalflux.com/bitcoin-blockchain-proof-work/

https://www.asynclabs.co/blog/blockchain-development/proof-of-work-what-it-is-and-how-does-it-work/

https://michaelnielsen.org/ddi/how-the-bitcoin-protocol-actually-works/

https://www.researchgate.net/publication/366170214_Fundraising_Tracking_System_Using_Blockchain