This is a guest post by Ray Patterson
Before Bitcoin: Hashcash & Moderately Hard, Memory-bound Functions
The Proof-of-work concept appeared for the first time in the paper "Pricing via Processing or Combatting Junk Mail" in 1993. Despite the fact that authors never used this notion in the article itself (it is 6 years before it appears), we are going to name it this way (or PoW).
So, what was the Idea proposed by Cynthia Dwork and Moni Naor in their paper?
The main idea is to require a user to compute a moderately hard, but not intractable, function in order to gain access to the resource, thus preventing frivolous use.
We can use the following parallel: the Ministry of Fishery makes Jeremy Wade do some work in order to access a local pond, e.g. plant a couple of trees and send the photos of plantlets. The important point is that Jeremy is able to perform this work in a reasonable time: before all the fish is poisoned with chemicals streamed into said pond. On the other hand, this work should not be so simple that bunches of fishers could empty the pond.
Finding an example of such work in the digital world is not a trivial task. For example, a server can offer a user to multiply two random numbers and check the result. There is an obvious problem, however: the check is not faster than the work itself, so the server would not be able to handle a large number of clients.
Another attempt: the server generates two large prime numbers, multiplies them and shows the result to the user. The latter should factorize this number. It is a more complex task. There are some nuances, but generally, the scheme became better: 'work time/check time' ratio has grown considerably. It is the main requirement formulated in the article:
Calculating a function with given input should be much more complicated than checking the result.
Moreover, the function should not be amendable to amortization, i.e. time for completing n tasks should be proportional to the amount of them (if Jeremy wants to get a double catch quota, he should do twice as much work: yesterday's or someone else's photos will not do).
Among other examples of such functions was extracting square roots modulo a prime P: here, we have log(P) multiplications for the solution and only one multiplication for the check. The input value can be generated by a server (generate x, show x2), or selected autonomously: select x to extract sqrt(hash(x)). The last case would require some tweaking: not every element in Z_p is a square. But it is not a big deal.
The following next years saw other works with keywords proof-of-work. We would like to point out two of them:
- The Hashcash project by Adam Back dedicated to the same task of spam-protection. The problem was as follows: "Find such x that hash SHA(x) would contain N high-order null bits". The work then, actually, turned out to be hashing the same data (letter) with a searchable part. As an irreversible hash-function is used, there is nothing better than exhaustive search moreover, the average scope of work depends on N. A simple solution, indeed.
- "Moderately hard, memory-bound functions". The key notion here is memory-bound. By the beginning of the New Millennia, the difference between high-end and low-end CPUs in solving PoW-problems became too obvious to paint them with the same brush. The algorithm proposed in the article was the first to utilize relatively large amounts of data (megabytes). The bottleneck was not the CPU operation speed but the delay in data exchange with RAM or the required RAM volume itself; it varied from PC to PC in a smaller range. One of such algorithms was scrypt, but we will cover it later.
Unicellulars: First PoW-functions
It will be incorrect to say that Satoshi was the first to think about using PoW for the digital currency. The idea of "reusable proof-of-work" had long been hovering among cryptoanarchists (and was implemented at some point, partially though), but it never got popular in its original concept.
The RPoW concept by Hal Finney proposed the "RPOW-tokens" to be inherent values (coins), while Bitcoin uses PoW only as a way to reach some distributed consensus (on the subject of which blockchain version should be correct), i.e. approved by the majority. This was the principal idea of Bitcoin that worked out.
Hashcash concept was selected as a proof-of-work scheme, and the SHA-256 was chosen as a computed function. It was the most popular hash function at that moment (2008). Most probably, the key factors were the simplicity of Hashcash and the 'commonality' of SHA-256. Satoshi added the 'difficulty factor' to the Hashcash (increasing or decreasing N - the required amount of zeros - depending on the number of participants) and, it seemed, secured clear decentralized future for everyone. Still, what happened later happened: GPU, FPGA, ASIC and even decentralized cloud mining.
There is no consensus on notion that "Satoshi (could) had foreseen" all this bedlam or not, but the fact remains: in September 2011, the first cryptocurrency that utilized a principally different PoW function appeared. Tenebrix used scrypt instead of SHA-256
Mutation: ASIC-mining and the Scrypt Algorithm
Well, why change such critical element as the consensus algorithm instead of changing such useful and important values in Bitcoin's code as, for example, the total amount of coins? The motive is obvious, as by 2011 GPU farms were already operating, FPGAs were being pre-ordered, and ASICs appeared on the horizon. Combined, these facts led to practically nullifying the contribution of a common CPU-user. But what was the reason?
Naturally, we could try to explain it by a common grudge or by a wish to 'make a fast buck': there were forks, pre-mines and pump'n'dumps before Tenebrix. But the arguments provided by supporters of the "ASIC-resistant" functions are rational:
- Promotion As every computer (theoretically) provides approximately similar hashrate, more people will join the mining, even in a background. Hundreds of millions of office PCs by slightly increasing their load will contribute to the protection of the network, getting coins for that, and, later, spending them.
- Decentralization There are two points here: firstly, the concern about hardware manufacturers. The majority of ASICs are made on a few factories in China, and very few people possess the knowledge required to produce them. Common PC is a very different matter. Secondly, the concern of geographical location of the miners. If we do not consider pools, the versatile CPU-friendly algorithm is wider spread because everyone can afford to launch a miner.
- High cost of 51% attack It is possible, of course, to design a specialized device that would solve a specific task more efficiently (time- or powerwise) than your average PC. The question is, how much would R&D cost, at what production scale the whole project would be profitable etc.? It is considered to be more costly than under existing conditions with Bitcoin in any case. We are not talking about absolute cost of an attack but about the cost in relation with 'maturity' of a network. For example, even attacking a network with super-ASIC-proof algorithm will be considerably cheap in case it is only running with twenty PCs mining.
One can counter any point here, but all I wanted to show was that when selecting a PoW-function it is good to have a thought about its ASIC-resistance. Actually, the very first article gave a hint about it! The function for amortization non-amenability might, by stretching a point, be considered ASIC-resistance.
Now it is time to give a few technical details on operating principles of the scrypt algorithm that became the basis for Tenebrix and, later, Litecoin (which, as you probably know, became the second popular cryptocurrency).
Strictly speaking, scrypt is not a cryptographic hash-function, but a key derivation function (KDF). Such functions are used exactly for the purposes you are thinking about: for deriving a secret key from some data (password, random bits etc.). We can name PBKDF2 and bcrypt as other examples.
The main purpose of KDF is complicating the generation of the final key, not too much to be useful for some applied tasks, but to the level that is sufficient for preventing malicious usage (for example, mass password search). This wording reminds you of something, right? It is not surprising that they have met, PoW and KDF.
Scrypt operating principle
Layer 1:
- Input data is processed through PBKDF2
- It is divided into p blocks, each processed by SMix function
- The resultant data is pieced together and processed through PBKDF2 again
Please note that any of the p blocks can be processed individually. It means that scrypt does allow for paralleling, in theory (by default p=1).
Layer 2:
- Input block is enhanced to an array of N blocks by sequential processing by BlockMix pseudo-random function
- Block X resultant from the last component is interpreted as an integer j and component Vj from the array is read
- X = BlockMix(X xor Vj)
- Repeat steps 2-3 N times
- The result is the current value of X block.
Here a fascinating part happens. The larger the value N we select, the more memory is required by the algorithm. Although we only need a single component from an array, we cannot unallocate memory until step 5 as this component is selected pseudo-randomly, depending on the last result! Therefore, if the BlockMix (see below) can be calculated swiftly, latencies become the bottleneck.
Layer 3:
- Input is divided into 2r blocks
- Each block is xor-ed with the previous one and hashed by H function
- The result is displayed in displaced order
Salsa20/8 is used as H function. Yes, it is not a hash-function, it is stream cipher. But for the scrypt purposes, collision-resistance is not a necessity, unlike fast and pseudo-random result, so Salsa was a good choice. Scrypt offers r=8 by default, however, for the cryptocurrency purposes r=1 was selected (most probably to enhance performance). We will describe the consequences later.
For further details on selecting different parameters in the algorithm, refer to scrypt description.
Scrypt parameters for Tenebrix were selected in such way that the total memory required was 128 kB, so it could be processed in CPU L2 cache (By default it used 16MB!). The experiment turned out to be a great success: CPU-miners, almost left alone on the margins of the progress, joined the system. Most new forks were born from Litecoin, inheriting scrypt instead of SHA256. Curiously enough, the fact that GPUs were quite able to deal with 128 kB parameters never bemused the users. Optimized GPU-miner appeared only a few months later and provided 10 times the efficiency. "At least it is not 100 times, as for Bitcoin", they said. "At least, not 1000 times!" The increase in GPU/CPU powers ratio (these were pushed out of Bitcoin network by new ASICs) was not dramatic, possibly due to compensation with forks-boom: almost every week a new scrypt-based cryptocurrency appeared and drew away new powers.
Still, understanding that scrypt was not an ideal PoW function dawned even earlier than the first scrypt-ASIC was announced. People were not heady with success anymore, Satoshi-cult was becoming less popular and ideas for the new PoW-candidates started mushrooming.
Part 2 coming soon. In the next episode:
"Mate and breed: X11 algorithms and Co!" "Atavisms vs vestiges: SHA3 or scrypt-jane?" "Historical paradoxes: Momentum and birthday problem" "Where does the pinnacle of evolution lie: in theoretical or practical IT? CuckooCycle and CryptoNight"