strangelights'

Ten out of ten for style, but minus several million for good thinking

Bitcoin: A Detailed Yet Easy to Understand Explanation

I’ve had an unhealthy fascination with Bitcoin and crypto currencies for a while now. I’m writing this article because there seems to be a gap in the market for an explanation of how Bitcoin works, along with some thinking on the consequences of that. When the popular media covers Bitcoin it inevitably skips over important details, while technical bloggers tend to get bogged down in the details of the cryptography. This is an attempt to provide something in between: an explanation that anyone with a reasonable grasp of computing will understand.

Crypto Currencies

Bitcoin is a crypto currency. The objective of crypto currencies is to allow users to send each other payments over the internet via a peer-to-peer network, with no central authority overseeing the transaction. Users can remain anonymous, as with crypto currencies there’s no need for any kind of sign up: you just install the client software and then create an “address” which you can use to send and receive payments. Of course at this point you won’t actually have any crypto currency, so you’ll need to persuade someone to give or sell you some of theirs before you can start making payments of your own. You could also try “mining” to get hold of some crypto currency, but this is a tricky business and unlikely to yield much currency without considerable investment.

You could be forgiven for thinking Bitcoin was the only crypto currency, but it is not. Bitcoin was the first and is the most well established, but at the time of writing there were about 60 crypto currencies, with Wikipedia categorizing 12 of them as “major”. Wikipedia doesn’t really explain why it classes them as major, but I guess this is based on the number of users and on infrastructure like exchanges, where the crypto currency can be traded for other currencies. All the major crypto currencies, with one exception, Ripple, are based on the ideas of Bitcoin. As Bitcoin is entirely open source, any programmer can create their own crypto currency by forking Bitcoin and making a few minor adjustments to the code base. However, forking the client software is the easy bit; you then have to persuade people to use your crypto currency instead of the one they use currently. This means offering advantages over Bitcoin, or other crypto currencies, and to do this you’d need to make more significant changes to the Bitcoin code base. Each major crypto currency offers what its creators see as advantages over Bitcoin.

This article is about how Bitcoin works, but as the vast majority of crypto currencies are based not only on the ideas of Bitcoin but also on its code base, if you understand Bitcoin you’ll be a long way towards understanding all crypto currencies.

The mining misnomer

Those involved with Bitcoin have popularized the idea that bitcoins are mined. I can see why they chose this metaphor, but I think it’s flawed and unrepresentative of how Bitcoin works. To see why I say this, let’s dig into how Bitcoin works. Bitcoin draws on several ideas from modern cryptography to ensure that payments made with Bitcoin are secure. Bitcoin didn’t invent any new cryptographic techniques; its designers just found a way of putting together existing techniques to allow secure payments over a peer-to-peer network.

At its heart Bitcoin is a giant ledger of which “addresses” own which Bitcoins. Each Bitcoin client downloads its own copy of this ledger so everyone can agree on which addresses own which Bitcoins. Bitcoin protects this ledger from tampering via a combination of cryptographic hashing and proof of work. Let’s dig into these ideas one by one.

Hashing is a technique used in many areas of programming. The idea of hashing is to take data of an arbitrary length and produce a “signature” that’s of a fixed size, usually shorter than the original data, but representative of it. For example, imagine you’re trying to download a large file from me and there’s a danger it might get corrupted during the download. To check whether corruption occurred you can use a checksum, a simple example of a hash. To calculate a checksum you simply add up all the bytes in the file, and whenever the total goes past some arbitrary limit, say two digits, you wrap round past zero and carry on adding. So if I had a running total of 89 and the next value to be added was 22, I’d wrap round to 11 and then add the next value. You now have two digits that are representative of the file as a whole, and most changes to the file would alter the checksum. If I give you my pre-calculated checksum you can check your checksum against mine, and if they match there’s a high probability that the file wasn’t corrupted. Hashes can also be used to protect data from tampering: if I know the hash of a file I can tell whether it has been modified, since an alteration would change the hash. This is essentially how Bitcoin protects its ledger. Bitcoin doesn’t use a checksum, which is a non-cryptographic hash, but the popular cryptographic hash SHA-256; the principle of protection from modification is just the same. Bitcoin stores each entry in the ledger along with a hash, so that everyone can check it hasn’t been modified. But if I wanted to modify the ledger, why wouldn’t I just recompute the hash? This is where Bitcoin uses proof of work.
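To make the checksum idea concrete, here is a minimal Python sketch of the two digit checksum described above, alongside SHA-256 from the standard library for comparison. The function name, the sample data and the choice of 100 as the wrap-round limit are my own assumptions for illustration; this is not how any real download tool works.

```python
import hashlib


def checksum(data: bytes, limit: int = 100) -> int:
    """Add up all the bytes, wrapping round whenever the total reaches `limit`."""
    total = 0
    for byte in data:
        total = (total + byte) % limit
    return total


# The wrap-round step from the text: a running total of 89 plus 22 gives 11.
print(checksum(bytes([89, 22])))

original = b"a large file I am sending you"
corrupted = b"a large file I am sending yoU"  # one byte changed in transit

# The two checksums differ, so the corruption is detected.
print(checksum(original), checksum(corrupted))

# SHA-256 always produces 32 bytes (64 hex characters), whatever the input size.
print(hashlib.sha256(original).hexdigest())
print(hashlib.sha256(corrupted).hexdigest())
```

With only 100 possible checksum values it’s easy to find two different files with the same checksum, which is why a checksum only guards against accidents, while a cryptographic hash like SHA-256 is needed to guard against deliberate tampering.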

Proof of work systems were first employed as a way of blocking spam and denial of service attacks. Proof of work is quite a simple idea: imagine that I host a service and you want to access it. I want you to use my service, but I don’t want you to abuse it through overuse. To prevent overuse, each time you want to use the service I send you a puzzle. The puzzle must be hard for you to solve, but easy for me to verify that you solved it correctly; that way solving the puzzle becomes a limiting factor on the number of requests you can make to my service. The difficulty is finding a puzzle that will provide a large amount of work for you to solve, yet still be easy for me to check. A popular way of setting a puzzle is to use hashing functions, the very same hashing functions we met in the previous paragraph. For example, the SHA-256 hashing algorithm generates a hash that is a 256 bit number, which written in decimal is a huge number of up to 78 digits. For most input data SHA-256 generates a number that uses nearly all of those digits, but some hashes are small and so use far fewer. (These hashes are padded with leading zeros to make them a fixed width.) These small hashes are rare and occur randomly, but have a fairly even distribution. This means I can set you a puzzle by saying: here is an arbitrary piece of data, find me one of these rare small hashes by modifying these few bytes within the data. The puzzle is hard for you, as your only choice is to use brute force and search through the values of the bytes I’ve allowed you to modify (even if you’re only allowed to modify four bytes, that’s still over four billion values to check), but it’s easy for me, as I can check your answer by computing just one hash. 
As smaller hashes are progressively rarer I can calibrate the amount of work you need to do by modifying the size of the number I require: a SHA-256 hash using 70 digits will be fairly easy to find, one using only 60 digits will be harder, one using only 50 digits harder still, and so on. Bitcoin demands that the SHA-256 hashes used to protect its ledger are small and so hard to find. This means anyone wanting to alter the history will have a large amount of work to do recomputing the hash of the entry they wish to modify. Not only that, entries are chained together by including the hash of the previous entry in the next entry. This means that to alter the history you not only have to recompute the hash of the entry you wish to modify, you then need to recompute the hash of every entry after it. The amount of computing power required to alter Bitcoin’s history therefore quickly becomes infeasibly large. The ledger is protected from fraud by economics: the cost of the computing power needed to modify the ledger would be far greater than the amount you could earn from modifying it.
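The puzzle and the chaining described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not Bitcoin’s real block format: the entry layout, the names `find_nonce`, `build_chain` and `is_valid`, and the very easy target (a hash whose top 16 bits are zero) are all invented for illustration.

```python
import hashlib

# Toy target: the hash, read as a 256 bit number, must be below this value.
# Requiring the top 16 bits to be zero means roughly 1 hash in 65,536 qualifies.
TARGET = 1 << (256 - 16)
GENESIS = "0" * 64  # stand-in for "no previous entry"


def hash_entry(previous_hash: str, payload: str, nonce: int) -> int:
    """SHA-256 of an entry, returned as a 256 bit integer."""
    data = f"{previous_hash}|{payload}|{nonce}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def find_nonce(previous_hash: str, payload: str) -> int:
    """The hard part: brute-force nonces until the hash is small enough."""
    nonce = 0
    while hash_entry(previous_hash, payload, nonce) >= TARGET:
        nonce += 1
    return nonce


def build_chain(payloads):
    """Each entry includes the hash of the entry before it."""
    chain, previous_hash = [], GENESIS
    for payload in payloads:
        nonce = find_nonce(previous_hash, payload)
        entry_hash = format(hash_entry(previous_hash, payload, nonce), "064x")
        chain.append({"prev": previous_hash, "payload": payload,
                      "nonce": nonce, "hash": entry_hash})
        previous_hash = entry_hash
    return chain


def is_valid(chain) -> bool:
    """The easy part: one hash per entry is enough to check all the work."""
    previous_hash = GENESIS
    for entry in chain:
        value = hash_entry(entry["prev"], entry["payload"], entry["nonce"])
        if (entry["prev"] != previous_hash or value >= TARGET
                or format(value, "064x") != entry["hash"]):
            return False
        previous_hash = entry["hash"]
    return True


chain = build_chain(["alice pays bob 5", "bob pays carol 2", "carol pays dave 1"])
print(is_valid(chain))  # True

chain[0]["payload"] = "alice pays mallory 5"  # try to rewrite history
print(is_valid(chain))  # False
```

To make the tampered ledger pass the check again, the attacker would have to redo the nonce search for the first entry and then for every entry after it, since each later entry embeds the hash of the one before.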

Now that we understand how the Bitcoin ledger works, let’s look at what our miners actually do and how new Bitcoins are manufactured, although the idea that they are manufactured is another misnomer in my opinion. Requests for new transactions, that is requests to move Bitcoins from one address to another, are broadcast over the Bitcoin peer-to-peer network by the owners of Bitcoins. It’s the job of a Bitcoin miner to gather all these transactions, verify them and, once verified, enter them into the ledger. To do this they need to calculate a hash, and because Bitcoin is a proof of work system this hash is hard to find. Miners are racing against each other to be the first one to find the hash and update the ledger. They race because the one who finds the hash is allowed to add an extra transaction to the ledger, sending 50 Bitcoins to any address they choose. Normally this will be an address they own, as it is their reward for doing the work required to update the ledger. Once they find a hash they broadcast the updated ledger to all their peers, who then verify that the work was done correctly. The payment of 50 Bitcoins per block of transactions processed is entirely arbitrary; the only reason miners cannot pay themselves more is that their peers would reject the transaction block and their hard work would be wasted. Miners do not mine or manufacture anything: a Bitcoin is nothing more than an integer associated with an address which controls that amount. Miners are more like ledger clerks who are paid for each block of transactions they process and must compete against a number of peers for the privilege of being paid. The important thing to note is that miners are not simply looking for or creating new Bitcoins; their work is vital to keeping Bitcoin running, and so they are paid a reward for doing this vital work.

Bitcoin also has built-in calibration: it was designed to keep a constant rate of transaction processing, with a new block entered into the ledger approximately every 10 minutes. In Bitcoin this calibration value is called the difficulty (the hardness of the puzzle), and it determines how small the SHA-256 hash that needs to be found must be. A new difficulty is agreed by the peer-to-peer network approximately every two weeks. As interest in Bitcoin has increased and more people have tried their hand at mining, the difficulty has been progressively increasing too.
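A rough sketch of how such a calibration can work is below. The 10 minute block interval and the two week period come from the text; the 2016 block period (two weeks’ worth of 10 minute blocks) and the factor of four limit on each adjustment are how I understand the reference client to behave, so treat them as assumptions. The function name and the representation of the difficulty as a “target” (the largest hash value that counts as a solution, so a smaller target means a harder puzzle) are simplifications of mine.

```python
BLOCK_INTERVAL_SECONDS = 10 * 60
BLOCKS_PER_PERIOD = 2016  # assumed: two weeks' worth of 10 minute blocks
EXPECTED_SECONDS = BLOCK_INTERVAL_SECONDS * BLOCKS_PER_PERIOD


def retarget(old_target: int, actual_seconds: int) -> int:
    """Scale the target by how long the last period really took.

    Blocks found too quickly -> smaller target -> harder puzzle.
    Blocks found too slowly  -> larger target  -> easier puzzle.
    """
    # Assumed safety limit: never adjust by more than a factor of four at once.
    clamped = max(EXPECTED_SECONDS // 4, min(actual_seconds, EXPECTED_SECONDS * 4))
    return old_target * clamped // EXPECTED_SECONDS


old_target = 1 << 224

# Miners doubled their hashing power, so the period took one week, not two:
# the target halves, making each block twice as hard to find.
new_target = retarget(old_target, EXPECTED_SECONDS // 2)
print(new_target == old_target // 2)  # True
```

The effect is that however much hashing power joins or leaves the network, the puzzle is re-tuned so that blocks drift back towards one every 10 minutes.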

It’s interesting to ask why the metaphor of mining Bitcoins was chosen, when it seems an inaccurate description of what miners actually do. Clearly the term “miner” will stroke the masculine egos that are so common in the tech industry in a way that “ledger clerk” wouldn’t. The term “miner” has associations with the frontier and the gold rush that are also likely to appeal to techies. Aside from appealing to techies, the term miner is clearly an attempt to emphasize the fact that Bitcoins are backed by real world commodities. This is undeniably true: to be paid the reward for processing a block takes a significant amount of electricity, and these days also requires investment in specialist hardware called “ASICs”. I think it’s interesting to note that the term miner seeks to emphasize this. In my opinion it also seeks to disassociate the process of Bitcoin creation from transaction processing. Why? It’s been widely publicized that the supply of Bitcoins will be limited to 21 million, but if you understand that Bitcoins are not simply created but are paid as a reward for doing the hard work involved in transaction processing, then you’re likely to ask the question: how does this limiting process work? Bitcoin creation will be limited because the reward miners are paid halves every 4 years (it started out at 50 Bitcoins, it’s down to 25 these days). Plus this reward is only paid until the limit of 21 million is reached, after which no reward will be paid. This is awkward for Bitcoin as it makes you pose the question: who will be willing to spend vast amounts of money processing Bitcoin transactions when they are not being paid for it? This question doesn’t have a good answer in my opinion. Currently Bitcoin users can pay an optional fee to miners; I’ve not looked into the subject, but I’m guessing most people don’t. In the future miners could choose to impose fees on transactions, but that would be difficult unless miners could reach a common agreement on fee levels. 
To see why this would be odd, imagine that one miner imposed a fee and another didn’t. If I send my transaction without a fee then the miner charging the fee will not process it, but the one who doesn’t charge will. For my transaction to be processed I have to wait until the miner not charging a fee processes a block of transactions, rather than just waiting for any miner to process a block.
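The 21 million limit mentioned above falls straight out of the halving schedule, and a few lines of Python show it. The starting reward of 50 Bitcoins and the halving every 4 years are from the text; expressing “every 4 years” as 210,000 blocks (4 years of 10 minute blocks, near enough) and counting in units of one hundred-millionth of a Bitcoin are details I’m assuming from my understanding of the protocol.

```python
UNITS_PER_BITCOIN = 100_000_000   # assumed smallest unit of account
BLOCKS_PER_HALVING = 210_000      # assumed: roughly 4 years of 10 minute blocks


def total_supply(initial_reward_btc: int = 50) -> float:
    """Sum the block rewards over every halving period until the reward hits zero."""
    reward = initial_reward_btc * UNITS_PER_BITCOIN
    total = 0
    while reward > 0:
        total += reward * BLOCKS_PER_HALVING
        reward //= 2  # the reward halves, rounding down to a whole unit
    return total / UNITS_PER_BITCOIN


print(total_supply())
```

The series 50 + 25 + 12.5 + ... adds up to just under 100, and 100 times 210,000 blocks is 21 million, so the total paid out creeps up on 21 million without ever quite reaching it.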

Addresses and the Joy and Pain of Decentralization

Hopefully you now feel well versed in the way transactions are processed in Bitcoin, but there’s still a missing part of the process: Bitcoin “addresses” and how they control the Bitcoins associated with them. Addresses use another part of modern cryptography we’ve not met yet, public/private key cryptography, so let’s dive in.

The idea of encryption is easy to understand and predates computers by a very long time. The idea is this: I have a message I want to send, but I don’t want just anyone to read it, so I encrypt it. Now to the outside world it looks like random characters, but those who can decrypt the message are able to read it. Encryption is broken into two parts: the encryption algorithm and the key. The encryption algorithm is the general rules of how the data will be encoded, while the key is a parameter of the algorithm that determines the specifics of how the data will be encoded. Keys are important, as without them simply knowing the algorithm would be enough to decode the data; with keys many people can use the same algorithm but only the person with a valid key can decode the data. Many algorithms use the same key for encoding and decoding data; these are known as symmetric encryption algorithms. While these algorithms are perfectly secure, the key often poses a problem as it must be shared securely between users, and this presents a sort of chicken and egg problem. How do you securely distribute a key? It needs to be encrypted, but what key should I use to encrypt the key?

To solve this problem public/private key, or asymmetric, cryptography was invented. In public/private key cryptography there are two keys: a public key which is shared with the world and a private one which must be protected and kept secret. Each person has their own public and private key, so there’s never any need to share the private keys. So if I have a public and private key and you want to send me a message, you take my public key, which I’ve published on the internet, and encrypt the message with it. Now only the person that holds the private key, me, can decrypt it.

It was also noticed that the process works in reverse: if I encrypt a message with my private key, it can only be decrypted with my public key. This is useful as proof of ownership. If I’ve encrypted a message with my private key, and you decrypt the message with my public key and verify the contents, you know that message could only have been generated by my private key, so you know it must have come from me. This process of proof of ownership is known as signing, and it is what Bitcoin uses to ensure only the owner of an address may spend the Bitcoins associated with it.
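Here is a toy illustration of signing in Python, using textbook RSA with absurdly small primes so the arithmetic is visible. Be warned that this is only a sketch of the “encrypt with the private key, check with the public key” idea: the primes, the key values and the function names are mine, keys this small offer no security at all, and as far as I know Bitcoin itself uses a different scheme based on elliptic curves (ECDSA) rather than RSA.

```python
import hashlib

# Toy key pair. Real keys use primes hundreds of digits long.
P, Q = 61, 53
N = P * Q                           # part of both keys
E = 17                              # public exponent: (N, E) is the public key
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    """Squash the message down to a number smaller than N by hashing it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


message = b"send 5 Bitcoins from my address to yours"
signature = sign(message)

print(verify(message, signature))            # True
print(verify(message, (signature + 1) % N))  # False: a forged signature fails
```

Verification needs only the public key, which is why every miner on the network can check that a spend was authorized without ever seeing anyone’s private key.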

In Bitcoin an address is something generated for you by the client software, it’s quick to generate addresses, and you may generate as many as you like. An address has a public and private key associated with it. Once an address has been sent some Bitcoins it may spend them by using the private key associated with the address. The miners will then use the public key associated with the address to verify that the transaction came from the owner of the private key associated with the address and will only process the transaction if everything checks out.

The obvious drawback of this system is that if you lose the private key associated with your address you cannot spend your Bitcoins. There’s simply no way round this, and there are many ways a private key could be lost. If your only copy of your private key is on your laptop and you leave your laptop in the pub, the Bitcoins are lost; if your hard drive crashes and can’t be recovered, the Bitcoins are lost. There’s no central authority you can complain to or ask for help: this is a decentralized world and you are the only person responsible for the care of your private key. The answer to the problem of private key loss, as with all data protection issues, is to create backups, preferably “off-site”. By “off-site” I mean it’s better to back up your data to a different physical location. If you back up your laptop to a removable hard disk which you store at home, then there’s a danger a thief may steal both your laptop and your backup drive, or your house could burn down. The easiest option for off-site backup is to use a network, but here you open yourself to another problem: Bitcoin private keys are not just vulnerable to loss but also to theft. If you store your Bitcoin private keys on a “cloud drive”, such as Dropbox or copy.com, then someone could hack your account and steal your keys (I guess a hacker could also hack your home network and steal your keys directly from your laptop, but a cloud drive is probably a slightly softer target). You are strongly encouraged to use the client software to encrypt your private keys, but that only adds another layer of protection. If a hacker manages to access your laptop or online backup then there’s a good chance they’ll be able to crack the password on your Bitcoin private key storage, unless you are very careful about how you encrypt those private keys. Once a hacker has obtained your private key they’ll transfer your Bitcoins to an address for which they own the private key. 
You’ll be able to see where your Bitcoins have gone, but you won’t be able to access them. Bitcoin transactions are completely irreversible; again, there’s no central authority to arbitrate a reversal.

The Anonymity of Bitcoin

Much has been made of how Bitcoin is completely anonymous. In some ways this is true: all you need to know to send Bitcoins to someone is their address. However, any address that owns Bitcoins is by definition stored in the Bitcoin ledger. You may not know who owns what address, but you can tell how many Bitcoins an address has, which addresses it received them from and which addresses it sent them to. If some careless users make public the fact that they own a Bitcoin address, you can start to look at who has been sending them Bitcoins and who they’ve sent Bitcoins to. Making your address public isn’t such an unlikely scenario, since you need to tell people your address for them to be able to send you money. You are advised to use a different address for each transaction to help keep your anonymity, but not everybody does. It starts to look feasible to track dirty Bitcoins as they flow through the Bitcoin network; at some point they’ll pop up at an address whose owner is known, and we could then start to trace back through the network by asking them how they came by those Bitcoins. It’s true the graph of who spent what Bitcoins is huge, but computers are getting better and better at handling large graphs of data. It may not be an easy task, but it certainly seems easier than working out what happened to stolen cash.

There isn’t a huge amount you can spend Bitcoins on these days, so at some point you’re probably going to want to cash out your Bitcoins for a traditional currency. This is mostly done via exchanges, though there is an “over the counter” option where you can exchange Bitcoins directly with other Bitcoin users. It would seem these are ideal starting points for finding out who owns which Bitcoin address. Rather than trying to ban deposits of Bitcoin, it would seem wiser to try to regulate the exchange of Bitcoins for other currencies; this would give investigations into stolen Bitcoins an excellent point to work back from.

Price of a Bitcoin

You’ve probably seen the headlines about Bitcoins hitting $1000 per coin and thought, wow, that’s overvalued. However, if you look at the economics of Bitcoin you’ll see that it’s actually undervalued (in some ways). The idea is that Bitcoin is a commodity, as it takes electricity to produce it. The price of a Bitcoin should be roughly the price of the electricity that it took to produce it, perhaps a little more, as the miners are doing vital work for the Bitcoin network and so need some kind of incentive to do it. We know exactly how much Bitcoin miners are earning: blockchain.info is kind enough to publish these stats for us, though if they didn’t we could have worked it out ourselves from the Bitcoin ledger. At the time of writing, Bitcoin miners had earned $3,925,863 over the past 24 hours. That sounds like a lot, but they needed to make 11,388,578 giga-hashes per second to do this; that too is a big number, and it’s going to cost a lot in electricity. It’s difficult to estimate the exact costs to the miners, but we can do a quick back of the envelope calculation to get a rough idea. Let’s assume most people are mining with something roughly as efficient as a Radeon 5870 video card. This card is given as one of the most profitable graphics cards on this site. A Radeon 5870 achieves a hash rate of 402 mega-hashes per second, and a machine running two cards has an operating cost of $1.20 per day in a place in the US where electricity is reasonably cheap (again, info from this site). To achieve the required hash rate you’d need to run 14,164,898 machines at a running cost of $16,997,877 per day, which is a loss of $13,072,014 per day for the miners.
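The back of the envelope calculation above is easy to reproduce. All the input figures in this Python sketch are the ones quoted in the paragraph; only the function and variable names are mine.

```python
def mining_estimate(network_ghs, card_mhs, cards_per_machine,
                    cost_per_machine_usd, daily_revenue_usd):
    """Estimate machines needed, daily running cost and daily loss, in that order."""
    network_mhs = network_ghs * 1000                 # giga-hashes -> mega-hashes
    machines = network_mhs / (card_mhs * cards_per_machine)
    daily_cost = machines * cost_per_machine_usd
    daily_loss = daily_cost - daily_revenue_usd
    return machines, daily_cost, daily_loss


machines, daily_cost, daily_loss = mining_estimate(
    network_ghs=11_388_578,       # total network hash rate, GH/s
    card_mhs=402,                 # one Radeon 5870, MH/s
    cards_per_machine=2,
    cost_per_machine_usd=1.20,    # running cost per machine per day
    daily_revenue_usd=3_925_863,  # what miners earned in 24 hours
)

print(f"machines needed: {machines:,.0f}")
print(f"daily cost:      ${daily_cost:,.0f}")
print(f"daily loss:      ${daily_loss:,.0f}")
```

Changing `card_mhs` and `cost_per_machine_usd` to the figures for a more efficient device shows how much better the hardware would have to be before the loss turns into a profit.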

It’s true that most miners don’t use graphics cards these days, but specially constructed “ASIC” machines. How much better a hash rate per watt do these machines give you? They’d have to do a hell of a lot better for mining to become viable again.

It’s interesting to note that until recently blockchain.info also published an estimate of the profit/loss made by miners. They seem to have stopped doing so. One might say it was silly of them to publish such a vague estimate as part of their stats, and that’s why they removed it, but one has to wonder if they removed it because it was putting miners off.

From the point of view of a miner starting out today Bitcoin is seriously underpriced. Today’s miners will be unwilling to sell their Bitcoins cheaply as they have huge electricity costs to cover (not to mention the cost of their specialist hardware). This is only going to get worse, unless there’s a mass exit from mining, as the number of coins produced from each block is scheduled to halve every 4 years. On the flip side, vast hoards of Bitcoins were created very cheaply in Bitcoin’s early days, when there were far fewer miners, so the difficulty (hardness) was much lower and therefore fewer hashes, and less electricity, were required to create Bitcoins. This creates something of a conflict in the Bitcoin sellers’ market, where people with older Bitcoins can afford to sell their coins much more cheaply than those who have recently started mining.

A Little About Mining Pools and Peer-to-Peer Systems

Bitcoin is a peer-to-peer protocol. This means it works because all peers in the network agree on the protocol, and it can change only if the majority of nodes on the network agree to the change. Majority agreement should protect users from changes to the protocol that would adversely affect them. However, thanks to the way the Bitcoin protocol works, some nodes on the peer-to-peer network are more important than others. Let’s take a look at why.

We’ve already seen that Bitcoin mining takes vast resources, more than most individuals could muster. To get round this Bitcoin miners organize themselves into pools. A pool has a central server which handles the implementation of the Bitcoin protocol, gathering and verifying the transactions, but does not perform any of the really resource intensive work of searching for a hash to enter the transactions into the ledger. Pools farm out the hashing work to anyone willing to lend a hand. To work with a pool you typically go through a short registration process, then you’ll be given details of how to connect whatever device you want to mine with to the pool. Once your device is connected the pool will send it hashing work; your device will do the hashing and send back the results. You’ll then be given a share of any coins that are mined, and the pool will keep some for itself to cover running costs. There’s no fixed formula for how the reward is distributed; each pool is different. Because of the huge number of hashes that now need to be performed, working together in this way is the only way miners can hope to find the hashes they need to update the ledger.

Pools vary greatly in popularity. The Bitcoin wiki contains a comparison of pools, which lists about 30 pools, though there may be more which aren’t listed. The wiki also gives an estimate of the number of hashes each pool performs, which is a measure of the pool’s popularity and of what percentage of transactions it is likely to enter into the ledger. At the time of writing the top 5 pools in terms of hash rate were (TH/s is tera-hashes per second):

  • BTC Guild: 1550 TH/s
  • GHash.IO: 1500 TH/s
  • Eligius: 700 TH/s
  • BitMinter: 370 TH/s
  • Slush’s pool (mining.bitcoin.cz): 360 TH/s

Between them they will write 92% of transactions into the Bitcoin ledger (the top 3 will write 77%). This means that if the owners of these 5 pools agree on a change to the protocol, no-one else’s opinion really counts. Changing the Bitcoin protocol without 100% agreement between all users would have somewhat difficult to predict effects. I think that if the biggest pools decided to make a change to the protocol without bothering to ask everyone’s consent, the most likely outcome is that the ledger would split into an old and a new version. Since the new version would be powered by those doing the most updates, it would quickly leave behind those still running the old protocol, which would effectively force everyone else to upgrade.
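As a quick sanity check on those percentages, here is the arithmetic in Python. The hash rates and the 92% figure are the ones quoted above; the total network hash rate isn’t stated, so the code infers it from the 92% figure, and that inferred total is an assumption of this sketch rather than a published number.

```python
pools_ths = {
    "BTC Guild": 1550,
    "GHash.IO": 1500,
    "Eligius": 700,
    "BitMinter": 370,
    "Slush's pool": 360,
}
TOP5_SHARE = 0.92  # share of the ledger written by the top 5, as quoted above

top5_ths = sum(pools_ths.values())
implied_network_ths = top5_ths / TOP5_SHARE  # inferred, not a published figure

top3_ths = sum(sorted(pools_ths.values(), reverse=True)[:3])
top3_share = top3_ths / implied_network_ths

print(f"top 5 combined: {top5_ths} TH/s")
print(f"top 3 share:    {top3_share:.0%}")  # 77%, matching the figure in the text

# Share of the whole network for each pool, assuming the inferred total.
for name, ths in pools_ths.items():
    print(f"{name}: {ths / implied_network_ths:.0%}")
```

On these numbers the two biggest pools alone account for well over half of the hashing power, which is what makes the question of who controls protocol changes more than academic.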

So what might change in the protocol? Well, there’ll probably be some fairly benign changes at some point to change field sizes that are too small, and probably no-one will object to that. More fundamental things could in theory be changed too. One controversial change that the bigger mining pools might try to push through is changing the amount miners get paid for finding a block. Remember, the amount miners are paid is merely a convention; the only reason they can’t pay themselves more is that their peers would reject the transaction block. But what if the bigger pools decided on a change to push up the reward for finding a block? Could they steamroller this through over everyone else?

Conclusion

Crypto currencies are a fascinating set of technologies. I do not think they are either good or bad; they are just a tool for sending money over the internet that can be used legitimately or abused for illegal activities. The problem with Bitcoin specifically is that the game seems to have been very heavily tilted in favour of early adopters. The problem with crypto currencies generally is that they are only as secure as the private keys that control spending the coins at an address. I would be willing to use a crypto currency to send small amounts of money round the internet, but I would be too scared to use one to store large amounts of money or make large transactions, as I don’t think I have the necessary skills to keep the private key safe and secure enough. The fact that all transactions are recorded in a giant ledger may or may not be a problem, depending on your point of view and how much you care about privacy. Whether or not you decide you’d like to use a crypto currency, I hope this article has left you with a more informed point of view about it.

Trying Out Linux for a While

So, I had a good time at BuildStuff.lt, but my laptop died in quite an odd way. I’ll write more about the laptop death in another post. I’ll also write more about BuildStuff.lt in another post. This post is about the laptop’s resurrection.

I tried to repair the laptop using a bootable Windows USB rescue disk kindly provided by Jemery and Ronan, but despite taking a long time this just made it worse.

So when I got home I decided to zap everything and install Linux. I’ve been a Windows user for a long, long time now. I’ve made various attempts to use Linux instead, but I’ve always either hosted Linux in a VM or dual booted, and this means I always drift back to Windows in the end, more for the software than the OS. So this time, instead of missing the Windows software, I’m going to attempt to do things the “Linux way”, meaning that instead of using specialist software for each task (i.e. Visual Studio for coding, Word for writing, Live Writer for blogging etc.), I’m going to try to learn one text editor well and use it for doing all those things in plain text plus markup/markdown.

This new blog is the first visible sign of this. It’s Octopress based and hosted on GitHub. If everything works out I’ll eventually migrate my domain and old blog posts here. If it doesn’t, it’ll be quietly dropped.

The first 72 hours have had their ups and downs. Here’s a summary:

The good

The Linux terminal is great, light years ahead of either cmd or PowerShell in Windows. If I do go back to Windows, I think I’ll just use Cygwin as my terminal and do more stuff through that.

Having repository based app installation is great, especially as if a command is missing the shell will often tell you what package it’s in.

The bad

The pre-installed version of LibreOffice wouldn’t open (I didn’t want to use it for writing, but I need to be able to read a Word doc). Doing apt-get install libreoffice fixed this, but still, if it comes pre-installed it would be nice if it worked out of the box, right?

The latest version of Skype wouldn’t install, so I’m stuck with one from apt-get which has no web cam support.

The cloud disky type thing I’m using, copy.com, has a Linux client, but it comes with no installer. You install it where you like, which had me scratching my head: where should I install this?

Octopress depends on version 1.9.3 of Ruby, which isn’t on apt-get. The solution seems to be building from source. Getting the right source and building was pretty easy. The problem is that running ./configure cleverly doesn’t compile the bits that rely on libraries that aren’t there. This is great if you don’t need that bit, but if you do it sends you down a cycle of finding the library, installing it, then recompiling. This is what was missing for me:

  • You need to change the default yaml parser to install a gem that Octopress relies on, following the instructions from this post
  • Ensure zlib is installed, the apt-get package name is zlibc on zlib
  • Ensure OpenSSL is installed, the apt-get packages that need to be installed can be found here

On the one hand, this is all noob stuff; on the other hand, it makes me feel I’m climbing a hill that didn’t ought to be there.