These are a few reasons:

  1. It is reinventing the Internet.
  2. It is building the digital economy.
  3. It is challenging.
  4. The demand for people to fill open positions is gigantic.

Some industries could drastically change:

  1. Personal identification.
  2. Medicine: patient information.
  3. Finance: send money across the world.
  4. Supply Chain.
  5. Government.

It brings us new ways of sharing data that are more secure, cheaper and more transparent, without large companies storing our data.

The blockchain is a shared database that contains a list of transactions made by its users. The database belongs to the network rather than to a single owner.

Bitcoin as an example

Open to anyone, it can establish the trust needed for a transaction.

We often give personal information to companies in exchange for the services they provide. We can easily lose control of our data, which sits in one single location that is “easily” hackable.

The blockchain data is shared by the people in the network, it is protected cryptographically, and it is nearly impossible to tamper with.

Today, gathering all the data about a single car can be extremely messy. Using the blockchain, we could share the details of the car from the very beginning. In the blockchain:

  1. The information is very often anonymous.
  2. The information belongs to its owners, the participants of the chain.
  3. The information is stored securely, encrypted.
  4. The transaction history is trustworthy and precise.
  5. A large enough group of transactions forms a block, which is added to the chain. Each block contains a hash that depends on the previous block, which makes it practically impossible to change any block unnoticed. The hash identifies the data within the block.

Potential flaws in financial transactions today

  1. Many steps make the whole process slow.
  2. Many transactions need to be secured.
  3. Different protocols need to be defined (interfaces between people and banks, between banks…).
  4. The whole process depends on a single node, the bank.
  5. No easy way to see if transactions have been tampered with.
  6. Transaction time depends on how quickly the banks validate the transactions.

If we create a shared ledger, we do not need to trust a single bank that may or may not give away our personal information. In addition, with the blockchain there are no financial intermediaries (like PayPal) that can charge a fee on each transaction every time they communicate with the banks.

Bitcoin

Bitcoin is an implementation of the blockchain: a digital currency that facilitates transactions. A study by Haber and Stornetta had already defined the idea of a chain of timestamped blocks. Later, a very famous paper by Satoshi Nakamoto defined Bitcoin as a peer-to-peer electronic cash system. Here is an extract:

A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power. As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they’ll generate the longest chain and outpace attackers. The network itself requires minimal structure. Messages are broadcast on a best effort basis, and nodes can leave and rejoin the network at will, accepting the longest proof-of-work chain as proof of what happened while they were gone.

Hashing

A hash is a digital fingerprint for information: a practically unique string of letters and numbers that represents a set of data. One function that generates hashes is SHA-256, which Bitcoin uses, for example, to identify a block within the blockchain. Here is an example written in JavaScript:

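// Requires the crypto-js package (npm install crypto-js).
const sha256 = require('crypto-js/sha256');

const stringToBeHashed = "Blockchain Rock!";

const objectToBeHashed = {
    id: 1,
    body: "With Object Works too",
    // UNIX timestamp in seconds (milliseconds with the last three digits dropped)
    time: new Date().getTime().toString().slice(0,-3)
};

// Serialize the value to JSON and return its SHA-256 hash.
function generateHash(obj) {
    return sha256(JSON.stringify(obj));
}

// The template literal converts the returned hash object to a hex string.
console.log(`SHA256 Hash: ${generateHash(stringToBeHashed)}`);
console.log(`SHA256 Hash: ${generateHash(objectToBeHashed)}`);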

Blocks

A block holds a list of transactions to be added to the blockchain.

A block also has a header that contains:

  1. Previous block's hash.
  2. Time: The block's timestamp, part of the solution to the double-spending problem.
  3. Merkle root: A hash representing every transaction in the block.
  4. Nonce: An arbitrary number that can only be used once. The block's data combined with the nonce has to produce a hash that meets some conditions (proof of work, PoW). The computer has to guess the nonce in order to generate a block. The condition is a number of zeros on the left side of the hash (the more zeros, the more difficult it is to calculate). Another condition is the maximum size of the block, determined by the developer.
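The header fields above can be sketched in a few lines of JavaScript. This is only an illustration: the field names, the `buildHeader` helper and the JSON serialization are assumptions made for the example, and the Merkle root is simplified (Bitcoin works on binary data and hashes twice).

```javascript
const crypto = require('crypto');

// Hex-encoded SHA-256 of a string.
function sha256(text) {
    return crypto.createHash('sha256').update(text).digest('hex');
}

// Simplified Merkle root: hash every transaction, then hash the results in
// pairs until a single hash remains. An unpaired hash is paired with itself.
function merkleRoot(transactions) {
    if (transactions.length === 0) return sha256('');
    let level = transactions.map((tx) => sha256(JSON.stringify(tx)));
    while (level.length > 1) {
        const next = [];
        for (let i = 0; i < level.length; i += 2) {
            const right = level[i + 1] || level[i];
            next.push(sha256(level[i] + right));
        }
        level = next;
    }
    return level[0];
}

// Hypothetical header layout with the four fields listed above.
function buildHeader(previousHash, transactions, nonce = 0) {
    return {
        previousHash,
        time: Math.floor(Date.now() / 1000), // UNIX timestamp in seconds
        merkleRoot: merkleRoot(transactions),
        nonce
    };
}

const transactions = [{ from: 'A', to: 'B', amount: 1 }, { from: 'B', to: 'C', amount: 2 }];
const header = buildHeader('0'.repeat(64), transactions);
console.log(header);
console.log(`Block hash: ${sha256(JSON.stringify(header))}`);
```

Changing any transaction changes the Merkle root, and therefore the hash of the header.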

Blockchain

  1. It is just a part of the blockchain framework (where the information is stored).
  2. The information is permanent and cannot be updated. The data is immutable.
  3. The information is divided into blocks that are linked using hash values.
  4. Each block has an associated number that refers to the position of the block in the chain.
  5. Any change in a block triggers a change in its hash value and therefore a change in all the blocks that follow it.
  6. The first block in a chain is called “Genesis block”.
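To see why a change in one block affects the rest, here is a minimal sketch of a chain of hash-linked blocks. The block layout (`height`, `previousHash`, `data`, `hash`) and the helper names are assumptions made for this example, not the format of any real blockchain.

```javascript
const crypto = require('crypto');

// Hash the contents of a block (everything except its own hash).
function hashBlock(block) {
    const { height, previousHash, data } = block;
    return crypto.createHash('sha256')
        .update(JSON.stringify({ height, previousHash, data }))
        .digest('hex');
}

// Create a block linked to the previous one. Passing null creates the genesis block.
function createBlock(previousBlock, data) {
    const block = {
        height: previousBlock ? previousBlock.height + 1 : 0,
        previousHash: previousBlock ? previousBlock.hash : '0'.repeat(64),
        data
    };
    block.hash = hashBlock(block);
    return block;
}

// A chain is valid if every stored hash matches the block contents
// and every block points to the hash of the block before it.
function isChainValid(chain) {
    return chain.every((block, i) =>
        block.hash === hashBlock(block) &&
        (i === 0 || block.previousHash === chain[i - 1].hash));
}

const genesis = createBlock(null, 'Genesis block');
const second = createBlock(genesis, 'A pays B 1 coin');
const third = createBlock(second, 'B pays C 1 coin');
const chain = [genesis, second, third];

console.log(isChainValid(chain)); // true

second.data = 'A pays B 100 coins'; // tamper with a block in the middle
console.log(isChainValid(chain)); // false
```

Even if the attacker recomputes the hash of the tampered block, the next block still points to the old hash, so every later block would have to be rebuilt as well.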

Distributed peer-to-peer network

The blockchain is based on a distributed peer-to-peer network. The users (nodes) can send information to each other directly. The information belongs to the members of the network. There is no central node: everyone in the network has equal access to the information, in contrast to a centralized network (one center) or a decentralized one (multiple centers).

Memory Pool / Mempool

A place where transactions wait to be included in the blockchain (a backlog of information). It is the waiting area for unconfirmed transactions, held in the RAM of the nodes.

The miners have to validate the transactions before they are added, so there is a queue of unprocessed transactions.

There are some reasons for a transaction to leave the queue:

  1. The transaction has expired.
  2. The maximum size of the mempool is reached and a new transaction with a higher fee (a tip for the miner) arrives, so the transaction with the lowest fee leaves the mempool.
  3. The transaction is included in a block.
  4. The transaction conflicts with another transaction in the block.
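The four exit conditions above can be sketched with a small in-memory class. Everything here is hypothetical and chosen for illustration: the `Mempool` class, the transaction shape (`id`, `fee`, `input`) and the size and expiry parameters are assumptions, not how any particular node implements its mempool.

```javascript
class Mempool {
    // maxSize is assumed to be at least 1; ttlMs is the time a transaction may wait.
    constructor(maxSize, ttlMs) {
        this.maxSize = maxSize;
        this.ttlMs = ttlMs;
        this.transactions = new Map(); // id -> { id, fee, input, receivedAt }
    }

    // Reason 2: when the pool is full, a higher-fee transaction evicts the lowest-fee one.
    add(tx, now = Date.now()) {
        if (this.transactions.size >= this.maxSize) {
            const lowest = [...this.transactions.values()]
                .reduce((a, b) => (b.fee < a.fee ? b : a));
            if (tx.fee <= lowest.fee) return false; // the new transaction is rejected
            this.transactions.delete(lowest.id);
        }
        this.transactions.set(tx.id, { ...tx, receivedAt: now });
        return true;
    }

    // Reason 1: drop transactions that have waited longer than ttlMs.
    removeExpired(now = Date.now()) {
        for (const [id, tx] of this.transactions) {
            if (now - tx.receivedAt > this.ttlMs) this.transactions.delete(id);
        }
    }

    // Reasons 3 and 4: drop transactions included in the block, and those
    // that spend the same input as a transaction in the block.
    removeConfirmed(block) {
        const confirmedIds = new Set(block.transactions.map((tx) => tx.id));
        const spentInputs = new Set(block.transactions.map((tx) => tx.input));
        for (const [id, tx] of this.transactions) {
            if (confirmedIds.has(id) || spentInputs.has(tx.input)) {
                this.transactions.delete(id);
            }
        }
    }
}

const mempool = new Mempool(2, 60000);
mempool.add({ id: 'tx1', fee: 1, input: 'coin1' });
mempool.add({ id: 'tx2', fee: 5, input: 'coin2' });
mempool.add({ id: 'tx3', fee: 3, input: 'coin3' }); // evicts tx1, the lowest fee
console.log([...mempool.transactions.keys()]);
```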

Consensus

Which transactions are valid?
How does the network reach agreement?

See the Byzantine Generals' Problem.

The blockchain needs to establish trust between nodes when they cannot communicate reliably. The Bitcoin whitepaper gives one solution to this problem, proof of work (a way of reaching consensus without a central authority), but there are several others.

Proof of work

Whoever puts in the most work to contribute to the system is the most trustworthy. The work of a node demands a lot of resources, but it is easy for the other nodes to validate it. A hash with more leading zeros is more specific, so much more time is needed to find it. The miners search for the nonce that generates such a hash. The number of zeros on the left is known as the “block difficulty”. The difficulty can be changed if the blocks are being built too quickly or too slowly. The idea is to need about 10 minutes per block (a balance between speed and security). If the blocks are generated too quickly, hackers could find ways to validate false transactions. If the blocks are generated too slowly, there would be too many transactions waiting in the mempool.

There are two main concerns with this algorithm:

  1. High energy consumption.
  2. A monopoly of trustworthy miners could lead to centralization (pools that share resources and gains centralize the network).

Bitcoin uses proof of work.
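The nonce search described in this section can be sketched as follows. This is a simplified illustration, assuming the difficulty is a number of leading zero characters in the hex hash. Bitcoin actually compares the hash against a numeric target, and the `mine` and `verify` names are made up for the example.

```javascript
const crypto = require('crypto');

// Hash the block data together with a candidate nonce.
function hashWithNonce(data, nonce) {
    return crypto.createHash('sha256').update(data + nonce).digest('hex');
}

// Try nonces one by one until the hash starts with `difficulty` zeros.
// This is the expensive part: each extra zero multiplies the expected work by 16.
function mine(data, difficulty) {
    const prefix = '0'.repeat(difficulty);
    let nonce = 0;
    while (!hashWithNonce(data, nonce).startsWith(prefix)) {
        nonce += 1;
    }
    return { nonce, hash: hashWithNonce(data, nonce) };
}

// Checking the work takes a single hash, so any node can do it cheaply.
function verify(data, nonce, difficulty) {
    return hashWithNonce(data, nonce).startsWith('0'.repeat(difficulty));
}

const result = mine('block data', 4);
console.log(result);
console.log(verify('block data', result.nonce, 4)); // true
```

Raising the difficulty argument by one makes `mine` take roughly 16 times longer on average, while `verify` stays equally fast.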

Proof of stake

It focuses on giving votes to members, depending on how much stake they have in the success of the chain.

There are no miners in this algorithm. All the coins exist from the beginning; there are only “validators”. The validators bet their coins to be able to validate a block and to earn more in proportion to the bet. If the block is not validated, the validator loses the money. The validator with more coins has a greater probability of being chosen to validate the next block.

Some advantages are security, a reduced risk of centralization and energy efficiency.

An important issue is:

  1. Nothing at stake: Validators bet on several blocks at a time so that they never lose, and they always stay at the top of the validators list, from where they could try to modify the transactions.

A solution:

  1. Slasher: Validators who place bets on several blocks at the same time are penalized.
  2. Punisher: It penalizes validators who add blocks on forks.

Proof of stake, or a variant of it, is used by Casper, Dash or Lisk.
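The idea that a validator with more coins is more likely to be chosen can be sketched as a stake-weighted lottery. This is a toy model under stated assumptions: the `pickValidator` function and the validator objects are hypothetical, and real protocols use a verifiable source of randomness rather than `Math.random()`.

```javascript
// Pick a validator with probability proportional to its stake.
// `random` is a number in [0, 1); it is a parameter so the choice can be reproduced.
function pickValidator(validators, random = Math.random()) {
    const totalStake = validators.reduce((sum, v) => sum + v.stake, 0);
    let remaining = random * totalStake;
    for (const validator of validators) {
        remaining -= validator.stake;
        if (remaining < 0) return validator;
    }
    return validators[validators.length - 1]; // guard against rounding errors
}

const validators = [
    { name: 'alice', stake: 60 },
    { name: 'bob', stake: 30 },
    { name: 'carol', stake: 10 }
];

// Count how often each validator wins over many draws.
const wins = { alice: 0, bob: 0, carol: 0 };
for (let i = 0; i < 10000; i++) {
    wins[pickValidator(validators).name] += 1;
}
console.log(wins); // roughly proportional to 60 / 30 / 10
```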

Delegated Byzantine Fault Tolerance (dBFT)

It assigns roles to the nodes to help coordinate consensus. There are no miners; instead there are specialized nodes (regular nodes and consensus nodes). A consensus node, the speaker, is chosen randomly from the pool of consensus nodes. The speaker creates a new block and proposes it to the rest of the consensus nodes (the delegates, voted in by the regular nodes), who must approve it. Otherwise a new speaker is chosen and the process starts again.

Advantages:

  1. There is not a need to solve cryptographic problems.
  2. It is resistant to forking because there is only one block being added at a given moment.

Issues:

Dishonest speaker: The system depends on honest delegates to decide whether or not to validate a proposal, on the delegates to choose honest speakers, and on the users to vote for honest delegates.

Dishonest delegate: If there are several dishonest delegates, they could technically validate a block that is not valid. If most of the delegates are honest, the block will be rejected and a new speaker will be chosen.

This method is used by Neo.
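The speaker and delegate roles described in this section can be sketched as a voting round. This is a toy model, not Neo's implementation: the node objects, the `runRound` and `reachConsensus` helpers, the two-thirds approval threshold and the round-robin choice of speaker (instead of a random one, so the run is reproducible) are all assumptions made for the example.

```javascript
// An honest node proposes the block unchanged and approves only valid proposals.
const honest = (name) => ({
    name,
    propose: (block) => block,
    approves: (proposal) => proposal.valid
});

// A dishonest node proposes an invalid block and approves anything.
const dishonest = (name) => ({
    name,
    propose: (block) => ({ ...block, valid: false }),
    approves: () => true
});

// One round: the speaker proposes, the other consensus nodes (delegates) vote.
function runRound(consensusNodes, block, speakerIndex) {
    const speaker = consensusNodes[speakerIndex];
    const proposal = speaker.propose(block);
    const delegates = consensusNodes.filter((_, i) => i !== speakerIndex);
    const approvals = delegates.filter((d) => d.approves(proposal)).length;
    // Assumption: the block is accepted when at least two thirds of the delegates approve.
    const accepted = approvals * 3 >= delegates.length * 2;
    return { speaker: speaker.name, approvals, accepted };
}

// Keep choosing a new speaker until a proposal is accepted.
function reachConsensus(consensusNodes, block) {
    for (let i = 0; i < consensusNodes.length; i++) {
        const round = runRound(consensusNodes, block, i);
        if (round.accepted) return round;
    }
    return null; // no speaker got a block approved
}

const nodes = [dishonest('n0'), honest('n1'), honest('n2'), honest('n3')];
const outcome = reachConsensus(nodes, { id: 1, valid: true });
console.log(outcome); // { speaker: 'n1', approvals: 3, accepted: true }
```

The proposal of the dishonest speaker `n0` is rejected by the honest delegates, so `n1` becomes the speaker in the next round.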

Blockchain transaction

Identity

  1. Only someone who owns the money has access to spend it.
  2. The transactions cannot be traced back to a real identity.
  3. It should be possible to share the identity so others can make transactions with me.

We get that using a wallet:

  1. Private key: it allows you to spend the money in your wallet.
  2. Public key: a publicly shareable key that allows people to send you money.
  3. Wallet address: a unique identifier you can share with others that allows you to make transactions.

The public key is generated from the private key. For instance, Bitcoin uses elliptic-curve cryptography (the same family as its ECDSA signature algorithm) to generate public keys, and reversing the process to recover the private key is considered infeasible today.

The wallet address helps keep the transactions from being traced to an identity. Bitcoin uses two hash algorithms to get the wallet address:

public key -> sha256 -> 256-bit number -> ripemd160 -> 160-bit number (wallet address)

Then we take the number and make it more readable:

160-bit number -> base58check -> base58 string.
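The pipeline above can be sketched with Node's built-in `crypto` module. The helper names (`hash160`, `base58Check`, `toAddress`) are made up for the example, and the version byte `0x00` corresponds to a classic Bitcoin mainnet address. Note one assumption: `ripemd160` is only available if the OpenSSL build bundled with your Node.js version provides it.

```javascript
const crypto = require('crypto');

const ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz';

function sha256(buffer) {
    return crypto.createHash('sha256').update(buffer).digest();
}

// public key -> sha256 -> ripemd160 (a 160-bit number).
// Assumption: 'ripemd160' is supported by this Node.js / OpenSSL build.
function hash160(publicKey) {
    return crypto.createHash('ripemd160').update(sha256(publicKey)).digest();
}

// Encode bytes in base58; every leading zero byte becomes a '1'.
function base58Encode(buffer) {
    let value = BigInt('0x' + (buffer.toString('hex') || '0'));
    let encoded = '';
    while (value > 0n) {
        encoded = ALPHABET[Number(value % 58n)] + encoded;
        value /= 58n;
    }
    for (const byte of buffer) {
        if (byte !== 0) break;
        encoded = '1' + encoded;
    }
    return encoded;
}

// base58check: version byte + payload + first 4 bytes of a double SHA-256 checksum.
function base58Check(version, payload) {
    const data = Buffer.concat([Buffer.from([version]), payload]);
    const checksum = sha256(sha256(data)).subarray(0, 4);
    return base58Encode(Buffer.concat([data, checksum]));
}

// Full pipeline from a public key (as a Buffer) to a readable address.
function toAddress(publicKey) {
    return base58Check(0x00, hash160(publicKey));
}

// Encoding a 160-bit number made only of zeros gives a well-known address.
console.log(base58Check(0x00, Buffer.alloc(20)));
```

The checksum is what lets a wallet detect a mistyped address before any money is sent.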

Wallet types

Deterministic

  • Sequential: the keys are derived sequentially from a single seed and can be traced back to that seed. The private key is sha256(seed + n), where the seed is 128 random bits and n is a sequence number. With the seed, the private keys can be regenerated.
  • Hierarchical: the keys are derived in a tree structure. The private key is sha256(address(publicKey(seed)+n)). It allows us to have a lot of keys, for example for a company with many departments.
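The sequential scheme can be sketched directly from the formula sha256(seed + n). This is a minimal illustration: the `derivePrivateKey` name is hypothetical, and concatenating the seed and the counter as text is an assumption, since the formula does not say how the two are combined.

```javascript
const crypto = require('crypto');

// Private key number n, derived as sha256(seed + n) and returned as hex.
function derivePrivateKey(seed, n) {
    return crypto.createHash('sha256').update(`${seed}${n}`).digest('hex');
}

// The seed is the only secret that needs a backup: 128 random bits, hex encoded.
const seed = crypto.randomBytes(16).toString('hex');

// The same seed always regenerates the same sequence of keys.
const keys = [0, 1, 2].map((n) => derivePrivateKey(seed, n));
console.log(keys);
```

Anyone who learns the seed can regenerate every key, which is why the seed has to be protected as carefully as the keys themselves.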

Non-deterministic

  • Randomly generated private keys, each a number between 1 and 2^256. For privacy it is better to generate a new key for each transaction and back it up in the wallet, so no one can track links between the addresses.

Private keys

A private key is a 256-bit random number between 1 and 2^256, which can be represented in different formats such as hex, WIF (Wallet Import Format, base58check) or WIF-compressed. To obtain unique keys, a random, non-repeatable generation method is needed, so a source of entropy is also needed. There are several ways to generate keys, for example using software such as Electrum.
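As a rough sketch, a private key in hex format can be generated from the operating system's entropy source with `crypto.randomBytes`. The `generatePrivateKey` name is hypothetical. The upper bound used here is an addition to the text: for the curve Bitcoin uses (secp256k1), a valid key must be smaller than the curve order, which is slightly below 2^256.

```javascript
const crypto = require('crypto');

// Order of the secp256k1 curve: valid private keys are 1 .. CURVE_ORDER - 1.
const CURVE_ORDER = BigInt('0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141');

// Draw 256 random bits and retry in the (extremely unlikely) case
// that the number falls outside the valid range.
function generatePrivateKey() {
    while (true) {
        const candidate = crypto.randomBytes(32);
        const value = BigInt('0x' + candidate.toString('hex'));
        if (value >= 1n && value < CURVE_ORDER) {
            return candidate.toString('hex'); // 64 hex characters
        }
    }
}

console.log(generatePrivateKey());
```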

Sign a transaction

Signing establishes a proof of ownership for each transaction on the blockchain. Before a transaction is submitted to the network, a signature is needed to establish the ownership of the money (Bitcoin example). The signature is linked to a wallet address (similar to a bank account number), which in turn is linked to a private key. A transaction spends UTXOs (unspent transaction outputs): each transaction input is converted into a transaction output, with the proof of ownership provided using the private key. To create a transaction output, the sum of the input transactions must be equal to or greater than the value you are sending. When it is greater, the difference can be sent back to us as change.
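The rule that the inputs must add up to at least the amount sent, with the rest coming back as change, can be sketched like this. The `buildTransaction` function and the UTXO shape are hypothetical, amounts are whole numbers of the smallest unit (satoshis) to avoid rounding problems, and signing is left out.

```javascript
// Select unspent outputs until they cover amount + fee, then create
// one output for the recipient and, if needed, one for the change.
function buildTransaction(utxos, recipient, amount, fee, changeAddress) {
    const inputs = [];
    let total = 0;
    for (const utxo of utxos) {
        if (total >= amount + fee) break;
        inputs.push(utxo);
        total += utxo.value;
    }
    if (total < amount + fee) {
        throw new Error('Insufficient funds');
    }

    const outputs = [{ address: recipient, value: amount }];
    const change = total - amount - fee;
    if (change > 0) {
        outputs.push({ address: changeAddress, value: change });
    }
    return { inputs, outputs, fee };
}

// A owns two unspent outputs of 0.6 and 0.7 bitcoin and sends 1 bitcoin to B.
const unspent = [
    { id: 'tx1:0', value: 60000000 },
    { id: 'tx2:1', value: 70000000 }
];
const tx = buildTransaction(unspent, 'address-of-B', 100000000, 10000, 'address-of-A');
console.log(tx.outputs);
// [ { address: 'address-of-B', value: 100000000 },
//   { address: 'address-of-A', value: 29990000 } ]
```

Whatever is not assigned to an output is the fee that the miner collects.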

Lifecycle

  1. A wants to send 1 bitcoin to B.
  2. A gets the wallet address of B.
  3. A creates a new transaction of 1 bitcoin plus, optionally, transaction fees for the miners.
  4. A verifies the information and sends the transaction.
  5. The wallet starts the transaction signing algorithm, which signs the transaction using A's private key.
  6. The transaction is broadcast to the memory pool in the network (the public key, the signature and the message).
  7. The receiving node then uses the verification algorithm to check that the message has been signed by the sender.
    • Authentication (message sent by a known sender).
    • Integrity (message not altered).
    • Non-repudiation (The sender cannot deny sending the message).
  8. The transaction is eventually accepted by the miners, who include it in a block and create a hash value for the block.
  9. The block is added to the blockchain.
  10. The transaction is accepted as a valid transaction in the blockchain.
  11. B gets the money.
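Steps 5 to 7 of the lifecycle (sign with the private key, broadcast, verify with the public key) can be sketched with Node's built-in `crypto` module. This is an illustration only: the transaction is a plain JSON object, whereas Bitcoin uses its own binary format and hashes it twice before signing.

```javascript
const crypto = require('crypto');

// A's key pair on the secp256k1 curve, the curve Bitcoin uses.
const { privateKey, publicKey } = crypto.generateKeyPairSync('ec', {
    namedCurve: 'secp256k1'
});

// Step 5: the wallet signs the transaction with A's private key (ECDSA over SHA-256).
const message = Buffer.from(JSON.stringify({ from: 'A', to: 'B', amount: 1 }));
const signature = crypto.sign('sha256', message, privateKey);

// Step 7: a receiving node checks the signature using only the public key.
const isValid = crypto.verify('sha256', message, publicKey, signature);
console.log(isValid); // true

// Integrity: if the message is altered, the same signature no longer verifies.
const altered = Buffer.from(JSON.stringify({ from: 'A', to: 'B', amount: 100 }));
const isAlteredValid = crypto.verify('sha256', altered, publicKey, signature);
console.log(isAlteredValid); // false
```

Only the holder of the private key could have produced the signature, which gives authentication and non-repudiation, and any change to the message invalidates it, which gives integrity.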