Site icon R-bloggers

Building your own blockchain in R

[This article was first published on R – Big Data Doctor, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Everybody is freaking out about the rise of the Bitcoin and the potential of the Blockchain technologies. The advent of cryptocurrencies, game changing use cases, disruption of established business models by disintermediation, etc..

By the time I’m writing this article, there are more than 1300 crypto-currencies listed in coinmarketcap.. And a lot more coming with the next ICOs (Internet Coin Offering).

Most certainly, the main enabler of Bitcoin and of many other currencies (although not all of them) is the Blockchain technology.

Although the original paper from Satoshi explains very well the ground idea behind this technology, nothing like creating your own blockchain to fully understand how it works, its limitations and potential improvements (aka “If you can code it, you certainly understand it”).

In this post I’d like to share a gentle coding exercise in R (#rstats). Why R? Just because it’s my favorite language… I wouldn’t choose R for a productive, full-fledge block chain implementation, but again, this is not the purpose of this post. This is just a learning exercise hacked quite quickly without any aspiration of ever running this code in a productive environment, and should be understood as such.

For convenience, I stored the code in a git repository, for others to improve, re-use, try, etc.

First things first: what is what in a Blockchain


A blockchain is an immutable chain of sequential records or blocks that are “chained” together using hashes. Blocks can be understood as containers, typically of transactions, but it can be extended to documents, etc. We can think of a blockchain as a database where new data is stored in blocks, that are added to an immutable chain based on all other existing blocks.
Blockchain is often referred as a digital ledger of transactions performed in a cryptocurrency of choice (Bitcoin or whatever). A transaction requires a sender address, a recipient address and a given amount and needs to be assigned to a block.
The json below shows a typical block with a particular index, an integer representing the proof, a hashed representation of the previous block, which allows for consistency check (more about proof and hashing later) and a timestamp.

{
	"index": 2,
	"timestamp": 1514108190.2831,
	"transactions": [{
		"sender": "d4ee26eee15148ee92c6cd394edd964",
		"recipient": "23448ee92cd4ee26eee6cd394edd964",
		"amount": 15
	}, {
		"sender": "6eee15148ee92c6cd394edd974d4ee2",
		"recipient": "15148ee92cd4ee26eee6cd394edd964",
		"amount": 225
	}],
	"proof": 211,
	"previousHash": "afb49032c6c086445a1d420dbaf88e4925681dec0a5b660d528fe399e557bf68"
}

Once we have understood the concept behind the blockchain, let’s build one and make it work
We are going to need three different files:

  • The class definition file, where we create the Blockchain class with its components and methods.
  • The API definition file, where we instantiate the class, register a node and expose the blockchain methods as GET and POST calls for the peers to interact with.
  • The plumber launch file, to start the server and expose the API methods

Building a Blockchain

To represent the Blockchain, we need a list of blocks (the chain), a list of currentTransactions (waiting to be mined) and a list of mining nodes.

Our Blockchain class implements a method to register a new transaction addTransaction. Transactions are appended to the list of currentTransactions until a new block is created, which takes care of the newly added transactions that have not been added to any block yet.
The creation of a new block takes place after calling nextBlock. To maintain the consistency, a new block can only be added knowing the hashed value of the previous one as part of a Proof of Work procedure (PoW). Basically, it is just finding a number that satisfies a condition. This number should be easy enough to be verified by anyone in the chain, but difficult enough so that a brute force attack to find it wouldn’t be feasible (too long, too expensive).
In our example, proofOfWork relies on finding a number called proof such that if we append this proof to the previous proof and we hash it, the last 2 characters of the resulting string are exactly two zeroes. The way it works in the real Bitcoin chain is quite different, but based on the same concept.

In Bitcoin, the Proof of Work algorithm is called Hashcash. Solving this problem requires computational resources (the longer the strings and the more characters to be found within it, the higher the complexity), and miners are rewarded by receiving a coin—in a transaction.

The Blockchain class provides a method to check the consistency over all blocks validChain, which iterates over the chain checking if both each block is properly linked to the previous one and whether the PoW is preserved.
The method registerNode adds the URL of a mining node to the central nodes register.

Finally, as we need to think of a Blockchain as a distributed system, there is a chance that one node has a different chain to another node. The handleConflicts resolves this conflict by declaring the longest chain the proper one… the one that all nodes take.

list.of.packages <- c("digest", "httr","jsonlite")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
require(digest)
require(jsonlite)
require(httr)
Blockchain <- function ()
{
  bc = list (
    chain = list(),
    currentTransactions  = list(),
    nodes = list()
  )
  #' Create a new Block in the Blockchain
  #'
  #' @param proof <int> The proof given by the Proof of Work algorithm
  #' @param previousHash <str> Hash of previous Block
  #' @return new block generated given the \code{proof} and the \code{previousHash}
  #' @examples  
  #' blockchain = Blockchain()
  #' blockchain$nextBlock(previousHash=1, proof=100) # genesis block
  bc$nextBlock = function (proof, previousHash=NULL){
    previousHash <- ifelse (is.null(previousHash), bc$hashBlock(bc$chain[length(bc$chain)]), previousHash)
    block = list('block' = list('index' = length (bc$chain) + 1, 'timestamp' = as.numeric(Sys.time()) , 'transactions' =  bc$currentTransactions, 'proof' = proof, 'previousHash' = previousHash))
    bc$currentTransactions = NULL
    bc$chain <- append(bc$chain, block)
    return (block)
  }
  #' Returns the last block in the Blockchain
  #'
  #' @examples  
  #' blockchain$lastBlock()
  bc$lastBlock = function () {
    bc$chain[length(bc$chain)]
  }
  #' Register a new transaction in the Blockchain
  #'
  #' @param sender <str> address of the sender
  #' @param recipient <str> address of the recipient
  #' @param amount <int> transaction amount
  #' @return  <int> Index of the Block that will hold this transaction
  bc$addTransaction = function (sender, recipient, amount) 
  {
    txn <-  list('transaction'= list('sender'=sender,'recipient'=recipient,'amount'=amount))
    bc$currentTransactions <- append(bc$currentTransactions, txn)
    last.block <- bc$lastBlock()
    return(last.block$block$index + 1)
  }
  #' Hash a block using SHA256
  #'
  #' @param block <block> 
  #' @return  <str> SHA256 hashed value for \code(block)
  #' @examples  
  bc$hashBlock = function (block) {
    require(digest)
    digest(block,algo="sha256")
  }
  
  #' Find a number p' such that hash(pp') contains leading 4 zeroes, where p is the previous p'
  #' p is the previous proof and p' is the new proof
  #' @param last_proof <block> 
  #' @return  <str> SHA256 hashed value for \code(block)
  bc$proofOfWork <- function (last_proof)
  {
    proof <- 0
    while (!bc$validProof(last_proof, proof))
    {
      proof <- proof + 1
    }
    return (proof)
  }

  #' Find a number p' such that hash(pp') ends with two zeroes, where p is the previous p'
  #' p is the previous proof and p' is the new proof
  #' @param last_proof <int> previous proof 
  #' @param proof <int> proof
  #' @return  <bool> TRUE if correct, FALSE if not
  bc$validProof <- function (last_proof, proof) 
  {
    guess = paste0(last_proof,proof)
    guess_hash = digest(guess, algo = 'sha256')
    return (gsub('.*(.{2}$)', '\\1',guess_hash) == "00")
  }
  #' Checks whether a given blockchain is valid
  #'
  #' @return  <bool> TRUE if the chain is valid, FALSE otherwise
  bc$validChain <- function (chain)
  {
    lastBlock <- chain[0]
    currentIndex <- 1
    while (currentIndex < length(chain))
    {
      block = chain[currentIndex]
      # checking for valid linking
      if (block$block$previousHash != bc$hashBlock(lastBlock)) {
        return(FALSE)
      }
      # checking for proof validity
      if(!bc$validProof(lastBlock$block$proof, block$block$proof))
      {
        return (FALSE)
      }
      lastBlock <- block
      currentIndex <- currentIndex +1
    }
    return(TRUE)
  }
  #' Add a new node to the list of existing nodes
  #' 
  #' @param address <str> full URL of the node  
  #' @examples  
  #' blockchain = Blockchain()
  #' blockchain$registerNode('http://192.168.0.5:5000')
  bc$registerNode <- function(address)
  {
    parsed_url = address
    bc$nodes<- append(bc$nodes, parsed_url)
  }
  #' Resolve conflicts by replacing the current chain by the longest chain in the network
  #'
  #' @return  <bool> TRUE if the chain was replaced, FALSE otherwise
  bc$handleConflicts <- function()
  {
    neighbours <- bc$nodes 
    new_chain <- NULL
    max_length = length(bc$chain)
    for (i in 1:length(neighbours))
    {
      chain.node <- GET(paste0(neighbours[i],'/chain'))
      node.chain.length <- jsonlite::fromJSON(chain.node)$length
      node.chain.chain <- jsonlite::fromJSON(chain.node)$chain if (node.chain.length > max_length)
      {
        new_chain = node.chain.chain
        max_length<-node.chain.length
      }
    }
    if (!is.null(new_chain))
    {
      bc$chain <- new_chain 
    }
  }
  # Adding bc to the environment
  bc <- list2env(bc)
  class(bc) <- "BlockchainClass"
  return(bc)
}

Defining the API Methods

After defining the Blockchain class, we need to create an instance of it running on a node. First we generate a valid identifier for the node (using Universally Unique IDentifier). Then, we add the genesis block or first block in the chain, using some default parameters (previousHash=1 and proof=100).
Everything else takes place when users invoke the “transaction/new” method to create new transactions and miners call the “/mine” function to trigger the creation of new blocks according to the PoW schema and process the newly created transactions.

Apart from these core methods, we also enable the registration of new nodes “/nodes/register”, the consensus method to handle potential conflicts “/nodes/resolve”, the retrieval “/chain” and html display of the chain “/chain/show”.

To host these methods, we use rplumber, which is a wonderful package to expose R functions as REST and RESTFULL services. It’s really versatile and easy to use (but single-threaded, which requires more advanced setups).
Apart from the aforementioned methods, we define a filter to log all requests with the server (logger).

list.of.packages <- c("uuid")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)

require(uuid)
# make sure you put the path of your blockchain.R file
source('blockchain.R')

# Generate a globally unique address for this node
node_identifier = gsub('-','',UUIDgenerate())
# Instantiate the Blockchain
blockchain = Blockchain()
# genesis block
blockchain$nextBlock(previousHash=1, proof=100)
#* @serializer custom_json
#* @post /transactions/new
function(req)
{
  #eg req_json <- '{"sender": "my address", "recipient": "someone else address", "amount": 5}'
  #values <- jsonlite::fromJSON(req_json)
  values <- jsonlite::fromJSON(req$postBody)

  # Check that the required fields are in the POST'ed data
  required = c('sender','recipient', 'amount')
  if (!all(required %in% names(values))) {
    return ('Missing Values - sender, recipient and amount are required')
  }
  index = blockchain$addTransaction(values$sender, values$recipient, values$amount)
  
  list('message' = paste('Transaction will be added to Block', index))
}

#* @serializer custom_json
#* @get /chain
function(req)
{
  list('chain'=blockchain$chain, 'length'=length(blockchain$chain))
}
#* @serializer custom_json
#* @get /mine
function(req)
{
  # We run the proof of work algorithm to get the next proof
  lastBlock = blockchain$lastBlock()
  lastProof = lastBlock$block$proof
  proof = blockchain$proofOfWork(lastProof)
  
  # We must receive a reward for finding the proof.
  # The sender is "0" to signify that this node has mined a new coin.
  blockchain$addTransaction(sender="0",recipient = node_identifier, amount=1)

  # Forge the new block by adding it to the chain
  previousHash = blockchain$hashBlock(lastBlock)
  block = blockchain$nextBlock(proof, previousHash)
  list('message'='New block forged', 'index'= block$block$index, 'transactions'= block$block$transactions, 'proof'=block$block$proof,'previousHash'=block$block$previousHash)
#  list('message'='New block forged', c('index'= block$block$index, 'transactions'= block$block$transactions, 'proof'=block$block$proof,'previousHash'=block$block$previousHash))
}
#* @serializer custom_json
#* @post /nodes/register
function (req)
{
#  req_json <- '{"sender": "my address", "recipient": "someone else address", "amount": 5}'
  values <- jsonlite::fromJSON(req$postBody)
  nodes <-  values$nodes
  if (is.null(nodes))
  {
    return("Error: the list of nodes is not valid")
  }
  for (i in 1:length(nodes))
  {
    blockchain$registerNode(nodes[i])
  }
  TRUE
}
#* @serializer custom_json
#* @get /nodes/resolve
function (req)
{
  replaced = blockchain$handleConflicts()
  if (replaced)
  {
    list('message'='Replaced', 'chain'=blockchain$chain)
  } else  {
    list('message'='Authoritative block chain - not replaceable ', 'chain'=blockchain$chain)
  }
}
#* Log some information about the incoming request
#* @filter logger
function(req){
  cat(as.character(Sys.time()), "-", 
      req$REQUEST_METHOD, req$PATH_INFO, "-", 
      req$HTTP_USER_AGENT, "@", req$REMOTE_ADDR, "\n")
  plumber::forward()
}
#* @get /chain/show
#* @html
function(req)
{
  render.html <- ""
  
  paste0(render.html, '<br>')
  render.html <- paste0(render.html, 'Current transactions:<br>')
  for (i in 1:length(blockchain$currentTransactions))
  {
    render.html <- paste0(render.html, 'Transaction' , i ,'<br>')
    render.html <- paste0(render.html, 'sender:', blockchain$currentTransactions[i]$transaction$sender)
    render.html <- paste0(render.html, '<br>')
    render.html <- paste0(render.html, 'recipient:', blockchain$currentTransactions[i]$transaction$recipient)
    render.html <- paste0(render.html, '<br>')
    render.html <- paste0(render.html, 'amount:', blockchain$currentTransactions[i]$transaction$amount)
    render.html <- paste0(render.html, '<br>')
  }
  render.html <- paste0(render.html, '<br>')
  render.html <- paste0(render.html, 'Current transactions:')
  render.html <- paste0(render.html, '<div>')
  for (i in 1:blockchain$lastBlock()$block$index)
  {
    render.html <- paste0(render.html, '<br>')
    render.html <- paste0(render.html, '<b>Block nr:</b>', blockchain$chain[i]$block$index)
    render.html <- paste0(render.html, '<br>')
    render.html <- paste0(render.html, '<b>Transactions</b>')
    render.html <- paste0(render.html, '<br>')
    render.html <- paste0(render.html, blockchain$chain[i]$block$transactions)
    render.html <- paste0(render.html, '<br')
    render.html <- paste0(render.html, '<b>Proof</b>')
    render.html <- paste0(render.html, '<br>')
    render.html <- paste0(render.html,blockchain$chain[i]$block$proof)
    render.html <- paste0(render.html, '<br>')
  }
  render.html <- paste0(render.html, '</div>')
  render.html
}

We stored the code above in a file called “blockchain-node-server.R” to “plumb” it with the script below. First of all, we define a custom Serializer to handle a json boxing issue, as shown here.

list.of.packages <- c("plumber","jsonlite")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)

require(plumber)
require(jsonlite)

custom_json <- function(){
  function(val, req, res, errorHandler){
    tryCatch({
      json <- jsonlite::toJSON(val,auto_unbox=TRUE)

      res$setHeader("Content-Type", "application/json")
      res$body <- json

      return(res$toResponse())
    }, error=function(e){
      errorHandler(req, res, e)
    })
  }
}

addSerializer("custom_json",custom_json)
# Make sure you put the path to your blockchain-node-server.R script
r <- plumb('blockchain-node-server.R')
r$run(port=8000)

Rplumber implements the swagger UI, which can be reached under http://127.0.0.1:8000/__swagger__/. The picture below shows our methods as we declared them in the blockchain-node-server.R script:

Interacting with our Blockchain

Once we have our Blockchain API running in the url we specified in the plumb command, we can try it with a simple client:

library(jsonlite)
library(httr)
# to register a node
req <- POST("http://127.0.0.1:8000/nodes/register", 
            body = '{"nodes": "http://127.0.0.1:8000"}')
cat(jsonlite::fromJSON(content(req, "text")))
# create a new transaction
req <- POST("http://127.0.0.1:8000/transactions/new", 
            body = '{"sender": "d4ee26eee15148ee92c6cd394edd964", 
            "recipient": "23448ee92cd4ee26eee6cd394edd964", "amount": 15}')
object <- jsonlite::fromJSON(content(req, "text"));object$message
# create a new transaction
req <- POST("http://127.0.0.1:8000/transactions/new", 
            body = '{"sender": "6eee15148ee92c6cd394edd974d4ee2", 
            "recipient": "15148ee92cd4ee26eee6cd394edd964", "amount": 225}')
object <- jsonlite::fromJSON(content(req, "text"));object$message
# start mining
req <- GET("http://127.0.0.1:8000/mine")
object <- jsonlite::fromJSON(content(req, "text"));object$message

# create a new transaction
req <- POST("http://127.0.0.1:8000/transactions/new", 
            body = '{"sender": "334e15148ee92c6cd394edd974d4ee2", 
            "recipient": "8ee98ee92cd4ee26eee6cd3334e1514", "amount": 887}')
object <- jsonlite::fromJSON(content(req, "text"));object$message

# mine again
req <- GET("http://127.0.0.1:8000/mine")
object <- jsonlite::fromJSON(content(req, "text"));object$message

To try it locally, I opened 2 instances of RStudio: the first one for the server, where rplumber executes the server part, and the second one for the client, from where I fired all the sample requests from the previous script.
You can check the status of the chain in the browser, as shown in the picture below

But you can also interact with the chain programmatically from the client:

# retrieve the chain
req <- GET("http://127.0.0.1:8000/chain")
chain <- jsonlite::fromJSON(content(req, "text"))
# check the amount of the first transaction in the first block of the chain
chain$chain$block.1$transactions$transaction$amount

Credits

This coding has been inspired by different articles and tutorials (mostly in Python) showing how to build a blockchain from scratch. For example, the ones from Gerald Nash, Eric Alcaide and
Lauri Hartikka (this one in JS).
Special mention to the one created by Daniel van Flymen, very inspiring and easy to follow.

To leave a comment for the author, please follow the link and comment on their blog: R – Big Data Doctor.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.